I built a database of about 10,000 rows with a web crawler and now want to delete the duplicate rows.
DELETE FROM newtable WHERE id IN (
SELECT * FROM (
SELECT id FROM newtable WHERE (job_name,company_href,issuedate)
IN (
SELECT job_name,company_href,issuedate FROM newtable GROUP BY job_name,company_href,issuedate HAVING COUNT(1) > 1
) AND id NOT IN (
SELECT MIN(id) FROM newtable GROUP BY job_name,company_href,issuedate HAVING COUNT(1) > 1
)
) AS newtable_copy
);
Running the statement above times out.
SELECT * FROM (
SELECT id FROM newtable WHERE (job_name,company_href,issuedate)
IN (
SELECT job_name,company_href,issuedate FROM newtable GROUP BY job_name,company_href,issuedate HAVING COUNT(1) > 1
) AND id NOT IN (
SELECT MIN(id) FROM newtable GROUP BY job_name,company_href,issuedate HAVING COUNT(1) > 1
)
) AS newtable_copy
Running this query on its own took 15 minutes. What is the cause, and why does it take so long?
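One possible explanation, offered as a guess rather than a diagnosis: if newtable has no index covering (job_name, company_href, issuedate), each of the two grouped subqueries may be re-evaluated with a full scan for every outer row, which gets expensive even at 10,000 rows. The sketch below assumes MySQL (the derived-table wrapper in the original suggests it) and assumes no such index exists yet. The index name idx_dedup and the aliases dup and keep are made up for illustration. Running EXPLAIN on the original query is the way to confirm whether this is what actually happens.

```sql
-- Assumption: MySQL, with no existing index on the three duplicate-key columns.
-- idx_dedup is a hypothetical name. If job_name or company_href are very long
-- VARCHAR/TEXT columns, the index may need prefix lengths.
ALTER TABLE newtable
    ADD INDEX idx_dedup (job_name, company_href, issuedate);

-- Inspect the plan before deleting anything.
EXPLAIN
SELECT dup.id
FROM newtable AS dup
JOIN newtable AS keep
  ON  keep.job_name     = dup.job_name
  AND keep.company_href = dup.company_href
  AND keep.issuedate    = dup.issuedate
  AND keep.id           < dup.id;

-- Delete every row for which a row with the same key and a smaller id exists,
-- so only the MIN(id) row of each duplicate group remains.
DELETE dup
FROM newtable AS dup
JOIN newtable AS keep
  ON  keep.job_name     = dup.job_name
  AND keep.company_href = dup.company_href
  AND keep.issuedate    = dup.issuedate
  AND keep.id           < dup.id;
```

Like the original IN comparison, the equality join does not treat NULL values as equal, so rows with a NULL in any of the three columns are left untouched. Taking a backup or testing on a copy of the table first is advisable.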