Error reading a txt file in Python, Python Discussion, Programming Languages, 鱼C论坛 (FishC Forum)

MarkYoung posted on 2021-10-22 09:43:37

Error reading a txt file in Python

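After calling open(), read() always fails with:
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    f.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 11: illegal multibyte sequence

What is causing this? It behaves differently from what the video shows, and I cannot continue. Any guidance would be appreciated, thanks.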

逃兵 posted on 2021-10-22 10:05:41

The 'gbk' codec cannot decode the file.
Try passing the encoding explicitly to open():
f = open(xxx, encoding='utf-8')
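
To illustrate the suggestion above: when no encoding is passed, open() uses the system default, which is 'gbk' in the traceback from the question. The sketch below reproduces the same kind of failure and the fix. The file name sample.txt and its contents are made up for the demonstration, and it assumes the file was saved as UTF-8.

```python
import os
import tempfile

# Hypothetical sample file: three Chinese characters saved as UTF-8 bytes.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "wb") as f:
    f.write("中文字\n".encode("utf-8"))

# Reading the UTF-8 bytes with the gbk codec raises UnicodeDecodeError,
# because the byte sequence is not valid GBK.
try:
    with open(path, encoding="gbk") as f:
        f.read()
    gbk_failed = False
except UnicodeDecodeError as exc:
    gbk_failed = True
    print("gbk failed:", exc)

# Reading with the encoding the file was actually saved in works.
with open(path, encoding="utf-8") as f:
    text = f.read()
print("utf-8 read OK:", len(text.strip()), "characters")
```

If the file is not actually UTF-8, this call fails in the same way with a 'utf-8' codec error, so the encoding passed to open() has to match how the file was saved.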

ba21 posted on 2021-10-22 12:30:06

Use cchardet to detect the character encoding (high accuracy).
cchardet is more accurate and faster than chardet.
Detecting a file's encoding

import cchardet as chardet

# First detect the file's encoding from its raw bytes
with open("test.txt", "rb") as f:
    msg = f.read()

# detect() returns a dict with the guessed encoding and a confidence score,
# e.g. {'encoding': 'UTF-8', 'confidence': 0.9900000095367432}
enc = chardet.detect(msg)
print(enc)
enc = enc['encoding']

# Then open the file with the detected encoding
with open("test.txt", "r", encoding=enc) as f:
    print(f.read())
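
Detection only gives a guess with a confidence score, so a fallback can be useful when the detector is not installed or is unsure. The following is a minimal standard-library sketch, not part of the post above: the helper name read_text_with_fallback and the candidate list are assumptions chosen for files that are expected to be either UTF-8 or a Chinese legacy encoding.

```python
import os
import tempfile


def read_text_with_fallback(path, encodings=("utf-8-sig", "gb18030")):
    """Try each candidate encoding in order; return (text, encoding_used).

    "utf-8-sig" also accepts UTF-8 files that start with a BOM, and
    "gb18030" is a superset of GBK.
    """
    with open(path, "rb") as f:
        raw = f.read()
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue  # not this encoding, try the next candidate
    raise ValueError("none of the candidate encodings could decode " + path)


# Usage with two hypothetical files holding the same text in different encodings.
tmp = tempfile.mkdtemp()
utf8_path = os.path.join(tmp, "utf8.txt")
gbk_path = os.path.join(tmp, "gbk.txt")
with open(utf8_path, "wb") as f:
    f.write("中文字\n".encode("utf-8"))
with open(gbk_path, "wb") as f:
    f.write("中文字\n".encode("gbk"))

utf8_text, utf8_enc = read_text_with_fallback(utf8_path)
gbk_text, gbk_enc = read_text_with_fallback(gbk_path)
print(utf8_enc, gbk_enc)
```

UTF-8 is tried first because it is strict: text in a legacy encoding rarely happens to be valid UTF-8, while the reverse mistake often decodes silently into garbage.
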
Detecting a web page's encoding

import requests
import cchardet

res = requests.get('http://www.baidu.com/')
rawdata = res.content
enc = cchardet.detect(rawdata)
enc = enc['encoding']
print(enc)

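MarkYoung posted on 2021-10-22 23:20:53

逃兵 posted on 2021-10-22 10:05
The 'gbk' codec cannot decode the file.
Try passing the encoding explicitly to open():
f = open(xxx, encoding='utf-8')

Thanks for the guidance. After searching on Baidu I found that a txt file can be saved in different encodings, so I re-saved mine as ANSI.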
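
A likely explanation for why re-saving as ANSI worked: on a Simplified Chinese Windows system, Notepad's "ANSI" option normally means the system code page (cp936, i.e. GBK), which is also the encoding open() uses when none is given. The sketch below prints that default and simulates an "ANSI" save by writing with gbk. The file name and sample text are made up for illustration.

```python
import locale
import os
import tempfile

# The encoding open() uses when no encoding argument is passed.
# On Simplified Chinese Windows this is typically 'cp936' (GBK).
default_enc = locale.getpreferredencoding(False)
print("default text encoding:", default_enc)

# Simulate a file saved as "ANSI" on such a system by writing it as gbk.
path = os.path.join(tempfile.mkdtemp(), "ansi_sample.txt")
with open(path, "w", encoding="gbk") as f:
    f.write("中文字\n")

# Reading it back as gbk succeeds, which is what open() does implicitly
# when the default encoding is gbk.
with open(path, encoding="gbk") as f:
    roundtrip = f.read()
print("read back", len(roundtrip.strip()), "characters")
```

Keeping the file as UTF-8 and passing encoding='utf-8' to open() gives the same result without depending on the system default, so the script behaves the same on other machines.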
