I built a database of about ten thousand rows with my own web crawler, and I want to delete the duplicate rows.
    DELETE FROM newtable WHERE id IN (
        SELECT * FROM (
            SELECT id FROM newtable
            WHERE (job_name, company_href, issuedate) IN (
                SELECT job_name, company_href, issuedate
                FROM newtable
                GROUP BY job_name, company_href, issuedate
                HAVING COUNT(1) > 1
            )
            AND id NOT IN (
                SELECT MIN(id)
                FROM newtable
                GROUP BY job_name, company_href, issuedate
                HAVING COUNT(1) > 1
            )
        ) AS newtable_copy
    );
Running the statement above times out.
    SELECT * FROM (
        SELECT id FROM newtable
        WHERE (job_name, company_href, issuedate) IN (
            SELECT job_name, company_href, issuedate
            FROM newtable
            GROUP BY job_name, company_href, issuedate
            HAVING COUNT(1) > 1
        )
        AND id NOT IN (
            SELECT MIN(id)
            FROM newtable
            GROUP BY job_name, company_href, issuedate
            HAVING COUNT(1) > 1
        )
    ) AS newtable_copy
Running just this SELECT took 15 minutes. What is the cause, and why does it take so long?
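For comparison, here is a sketch of one direction that might be worth trying. It rests on assumptions that are not confirmed anywhere above: that the server is MySQL, that `id` is the primary key, and that there is no index on `(job_name, company_href, issuedate)`. If that is the case, the nested `IN` / `NOT IN` subqueries may be forcing repeated scans of the table (in some MySQL versions they are evaluated once per outer row), and `EXPLAIN` should show whether that is happening. The index name `idx_dedup` is made up for illustration.

```sql
-- Assumption: MySQL, newtable.id is the primary key.
-- idx_dedup is a hypothetical index name. If company_href is a TEXT or
-- very long VARCHAR column, a prefix length such as company_href(191)
-- may be required for the index to be created.
CREATE INDEX idx_dedup ON newtable (job_name, company_href, issuedate);

-- Inspect the plan before deleting anything. Each returned id belongs to
-- a row for which another row with the same three values and a smaller
-- id exists (an id can appear more than once if a group has 3+ rows).
EXPLAIN
SELECT a.id
FROM newtable AS a
JOIN newtable AS b
  ON  a.job_name     = b.job_name
  AND a.company_href = b.company_href
  AND a.issuedate    = b.issuedate
  AND a.id > b.id;

-- Multi-table DELETE with a self-join: removes every row that has a
-- duplicate with a smaller id, so the row with MIN(id) in each group
-- is kept. Rows with NULL in any of the three columns never match,
-- which is also how the tuple IN in the original query behaves.
DELETE a
FROM newtable AS a
JOIN newtable AS b
  ON  a.job_name     = b.job_name
  AND a.company_href = b.company_href
  AND a.issuedate    = b.issuedate
  AND a.id > b.id;
```

The self-join form avoids the `SELECT * FROM (...) AS newtable_copy` wrapper, which is only needed because MySQL refuses a subquery on the table being deleted from. It is probably safest to back up the table, or run the statements on a copy, before trying the DELETE.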