[Solved] Newbie question: what does 'ignore' mean in decode(encode, 'ignore')?

746780487 · Posted 2018-5-13 15:23:42

Last edited by 746780487 on 2018-5-13 15:24

import urllib.request as ur
import chardet

def main():
    with open('urls.txt', 'r') as f:
        urls = f.read().splitlines()  # one URL per line

    for i, each_url in enumerate(urls, start=1):
        response = ur.urlopen(each_url)
        html = response.read()

        # Guess the encoding; chardet may report None, so fall back to UTF-8
        encode = chardet.detect(html)['encoding'] or 'utf-8'
        if encode == 'GB2312':
            encode = 'GBK'  # GBK is a superset of GB2312

        filename = 'url_%d.txt' % i

        with open(filename, 'w', encoding=encode) as each_file:
            each_file.write(html.decode(encode, 'ignore'))
            # I understand html.decode(encode), but what does the trailing 'ignore' mean?

if __name__ == '__main__':
    main()

Best answer

alltolove

2018-5-13 15:39:45

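Sometimes the data has bytes from some other encoding mixed in, and 'ignore' just skips them. It is the errors argument of decode(): bytes that are not valid in the given encoding get dropped silently instead of raising UnicodeDecodeError.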
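
To make the answer above concrete, here is a minimal sketch comparing the error handlers of bytes.decode(). The sample bytes are made up for illustration (they are not from the original poster's pages): two valid GBK strings with a single stray 0xFF byte between them.

```python
# Valid GBK bytes for "你好" and "世界", with one stray byte (0xFF) in between.
# 0xFF is not a valid GBK lead byte, so it cannot be decoded.
data = '你好'.encode('gbk') + b'\xff' + '世界'.encode('gbk')

# The default is errors='strict': an undecodable byte raises UnicodeDecodeError.
try:
    data.decode('gbk')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors='ignore': the undecodable byte is silently dropped.
ignored = data.decode('gbk', 'ignore')

# errors='replace': the undecodable byte becomes U+FFFD, so the damage stays visible.
replaced = data.decode('gbk', 'replace')

print(strict_failed)
print(ignored)
print(replaced)
```

Note that 'ignore' hides the problem rather than fixing it. If most of a page comes out as garbage or goes missing, the detected encoding is probably wrong, and 'replace' makes that easier to spot.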
