FishC Forum


Problem reading text files in a nested loop

Posted 2018-2-27 15:23:48

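I have two CSV tables (the forum does not allow attachments in this format, so I have pasted sample contents below), one large and one small. The large one has nearly 70,000 rows and the small one a little over 4,000. I need to find the rows in the large file, ic.csv, whose 6th column contains a value from the 2nd column of the small file, ga.csv.
I tried writing the following code myself: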
    file0 = 'ic.csv'
    file1 = 'ga.csv'

    ic = open(file0)
    ga = open(file1)
    count_ga = 0
    for ga_line in ga:
        count_ic = 0
        # Get the value in column 2 of GA
        ga_str = ga_line.split(',')[1]
        for ic_line in ic:
            # Get the value in column 6 of IC
            ic_str = ic_line.split(',')[5]
            # If column 6 of IC contains the value from column 2 of GA, print both row numbers
            if ga_str in ic_str:
                print('GA row %d matches IC row %d' % (count_ga, count_ic))
                continue
            count_ic += 1
        count_ga += 1

The result was this (only this one line), and I don't know what went wrong:
    GA row 0 matches IC row 0
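A likely explanation for the single line of output, offered as a sketch rather than a confirmed diagnosis: a file object returned by open() is a one-shot iterator. The inner loop over ic consumes the whole file during the first pass of the outer loop, so for every later GA row the inner loop has nothing left to read. The snippet below illustrates this with io.StringIO standing in for open('ic.csv') (an assumption made so the example runs without the real files).

```python
import io

# Stand-in for open('ic.csv'); StringIO behaves like a text file object.
ic = io.StringIO("header\nrow1\nrow2\n")

first_pass = [line.strip() for line in ic]   # consumes the whole file
second_pass = [line.strip() for line in ic]  # nothing left to read

ic.seek(0)                                   # rewind to the start of the file
third_pass = [line.strip() for line in ic]

print(first_pass)   # ['header', 'row1', 'row2']
print(second_pass)  # []
print(third_pass)   # ['header', 'row1', 'row2']
```

Calling ic.seek(0) before each inner loop, or reading the IC rows into a list once before the outer loop, are two common ways to make every pass start from the first row.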



Contents of ga.csv:
    sdfsk,网站域名,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf
    sdfsk,bmyxiao.com;,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf
    sdfsk,xpjoyu.com;,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf
    sdfsk,,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf
    sdfsk,bwp.com;,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf
    sdfsk,cscai.com;,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf
    sdfsk,,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf
    sdfsk,jodu.com;,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf
    sdfsk,csd168.com;,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf
    sdfsk,hnpx.com;,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf,skdf



Contents of ic.csv:
    序号,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,网站域名
    1,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,yazu.com;
    2,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,benyz.com;benkz.com;
    3,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,tahucy.cn;taochy.com;
    4,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,zgtzs.com;
    5,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,csjt.com;
    6,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,hoa.cc;
    7,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,cslzc.com;
    8,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,dykm.com;sxdw.com;
    9,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,bmyxiao.com;
    10,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,xpjoyu.com;
    ,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,
    11,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,bwp.com;
    12,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,cscai.com;
    ,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,
    13,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,jodu.com;
    14,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,csd168.com;
    15,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,hnpx.com;
    16,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,stkj.com;
    17,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,hnpjx.com;
    18,skdlsldjfl,skdfjavsdk,sfadfrt,dsflskdjf,zhanjy.com;

Original poster | Posted 2018-2-27 15:40:13
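After changing continue to break in the code, the result is as follows:
    GA row 0 matches IC row 0
    GA row 1 matches IC row 8
    GA row 2 matches IC row 0
    GA row 3 matches IC row 0
    GA row 4 matches IC row 0
    GA row 5 matches IC row 0
    GA row 6 matches IC row 0
    GA row 7 matches IC row 0
    GA row 8 matches IC row 0
    GA row 9 matches IC row 0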
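The output with break looks consistent with two effects. First, the inner loop resumes wherever the ic file iterator was left by the previous break while count_ic restarts at 0, so the printed IC row numbers are relative rather than absolute. Second, the GA rows with an empty 2nd column match any IC row, because an empty string is contained in every string. The following is a minimal sketch of one way to avoid both, not the only possible fix: it reads the IC column into a list once, skips empty GA values, and uses enumerate for absolute 0-based row numbers. The function name find_matches and its parameters ga_col and ic_col are hypothetical, and io.StringIO with a few rows taken from the samples above stands in for the real files.

```python
import csv
import io

def find_matches(ga_file, ic_file, ga_col=1, ic_col=5):
    """Return (ga_index, ic_index) pairs where the IC cell contains the GA cell."""
    # Read the IC column into memory once so it can be scanned repeatedly.
    ic_cells = [row[ic_col] if len(row) > ic_col else ""
                for row in csv.reader(ic_file)]
    matches = []
    for ga_index, ga_row in enumerate(csv.reader(ga_file)):
        ga_value = ga_row[ga_col].strip() if len(ga_row) > ga_col else ""
        if not ga_value:
            continue  # an empty string is "in" every string, so skip it
        for ic_index, ic_value in enumerate(ic_cells):
            if ga_value in ic_value:  # substring test, as in the original code
                matches.append((ga_index, ic_index))
    return matches

# Shortened sample data (assumption: fewer columns and rows than the real files).
ga_sample = "\n".join([
    "sdfsk,网站域名,skdf",
    "sdfsk,bmyxiao.com;,skdf",
    "sdfsk,,skdf",
    "sdfsk,bwp.com;,skdf",
]) + "\n"
ic_sample = "\n".join([
    "序号,a,b,c,d,网站域名",
    "1,a,b,c,d,yazu.com;",
    "2,a,b,c,d,bmyxiao.com;",
    ",a,b,c,d,",
    "3,a,b,c,d,bwp.com;",
]) + "\n"

matches = find_matches(io.StringIO(ga_sample), io.StringIO(ic_sample))
print(matches)  # [(0, 0), (1, 2), (3, 4)]
```

With the real files, the same function could be called on objects opened with open('ga.csv', newline='') and open('ic.csv', newline=''); the appropriate encoding argument depends on how the files were saved. Holding roughly 70,000 short strings in memory should not be a problem on an ordinary machine, although the nested scan still does about 4,000 × 70,000 substring checks.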