开发者社区> 问答> 正文

MySQL如何更快的删除重复记录?

怎么样才能更快的删除重复数据,以下我是已经采取过的方案。
5W多条记录 删除重复记录,保留ID最小的行。
`delete from table

where peopleId in
    (select peopleId from table group by peopleId having count(peopleId) > 1)   
and id not in
    (select min(id) from table group by peopleId having count(peopleId)>1)`

出错:
MariaDB>1093 - You can't specify target table 'table' for update in FROM claus
不能先select出同一表中的某些值,再update这个表(在同一语句中),
于是,使用子集。
`delete from table

where peopleId in
    (select peopleId from (select tmp.* from table tmp) a 

group by peopleId having count(a.peopleId) > 1)

and id not in
    (select min(id) from (select tmp.* from table tmp) b

group by peopleId having count(b.peopleId)>1)`
运行几十秒后都没结果,没等结果出来,开始优化
子集只需要需要的列
`delete from table

where peopleId in
    (select peopleId from (select tmp.id, from table tmp) a

group by peopleId having count(a.peopleId) > 1)

and id not in
    (select min(id) from (select tmp.id,tmp.peopleId from table tmp) b

group by peopleId having count(b.peopleId)>1)`
30分钟无结果
求教各位大神如何做,是速度加快呢?

展开
收起
落地花开啦 2016-02-13 14:28:05 2612 0
1 条回答
写回答
取消 提交回答
  • 喜欢技术,喜欢努力的人

    我练习过一个类似的例子。直接说下几步关键的,你可以用这种方法对你的数据试验下(根据people表的pid判重,保留多条重复记录中pid最小的记录):
    A、创建临时表存储有重复的记录的pid
    `CREATE TEMPORARY TABLE SELECT pid FROM people
    GROUP BY pid HAVING COUNT(pid) > 1;`
    B、创建一个临时表存储重复列组中值_rowid最小的列的_rowid:
    `CREATE TEMPORARY TABLE t2 SELECT MIN(_rowid) AS rowid FROM people
    GROUP BY pid HAVING COUNT(pid) > 1;`
    C、删除重复记录
    DELETE FROM people WHERE pid IN (SELECT pid FROM t1) AN

    2019-07-17 18:42:00
    赞同 展开评论 打赏
问答排行榜
最热
最新

相关电子书

更多
搭建电商项目架构连接MySQL 立即下载
搭建4层电商项目架构,实战连接MySQL 立即下载
PolarDB MySQL引擎重磅功能及产品能力盛大发布 立即下载

相关镜像