Solve all your encoding detection problems in one go, Python discussion, Programming Languages section, 鱼C论坛

liuganzhihui posted on 2017-8-8 22:21:23

This extension is really useful.

SuperBoy007 posted on 2017-9-12 22:01:09

Last edited by SuperBoy007 on 2017-9-12 22:02

Why does Inspect Element show the page's encoding as utf-8, while the chardet module reports GB2312?
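
One way to look into a mismatch like this is to read the charset the page declares about itself from the raw bytes, then compare that with whatever a detector guesses. A minimal sketch using only the standard library: the helper name `declared_charset`, the `scan_limit` parameter, and the sample bytes are made up for illustration, and the regular expression only covers the common `<meta charset=...>` and `content="...; charset=..."` forms.

```python
import re

# Hypothetical helper, not part of chardet: find a charset declaration
# near the top of the raw page bytes.
_CHARSET = re.compile(rb"""charset\s*=\s*["']?\s*([A-Za-z0-9_\-]+)""", re.IGNORECASE)


def declared_charset(html_bytes, scan_limit=4096):
    """Return the charset named in the page's own markup, or None if absent."""
    match = _CHARSET.search(html_bytes[:scan_limit])
    return match.group(1).decode("ascii").lower() if match else None


# Made-up sample page, only to show the call.
sample = b'<html><head><meta charset="utf-8"><title>t</title></head></html>'
print(declared_charset(sample))
```

If the declared value and the detected value disagree, it may be worth checking that both were taken from the same response bytes before trusting either one.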

boll112233 posted on 2017-9-14 11:05:18

This is seriously awesome! Impressive!

驻火蚁 posted on 2017-10-8 19:15:10

gaomengsuijia posted on 2016-7-20 14:02
>>> file = rq.urlopen("http://www.sina.com")
>>> html = file.read()
>>> chardet.detect(html)

>>> import urllib.request as rq
>>> file = rq.urlopen("http://www.sina.com")
>>> import chardet
>>> html = file.read()
>>> chardet.detect(html)['encoding']
'utf-8'
>>> chardet.detect(html)
{'confidence': 0.99, 'encoding': 'utf-8', 'language': ''}

I tried it and it works fine.

bbdzx123 posted on 2017-10-13 00:30:17

I can't get it installed with Python 3.2. What should I do?

YINXINGSHU posted on 2017-10-20 13:46:46

Kudos to 小甲鱼!

Coolsize posted on 2017-10-31 21:16:02

I've discovered a whole new world~

payton24 posted on 2017-12-14 02:33:08

Nice, nice. I've finally made it to this lesson.

孔亚辉 posted on 2017-12-20 11:03:52

But the chardet module has no detect function = =

若兮一 posted on 2018-1-4 16:15:12

Showing my support. This really is nice~

瑨mememao posted on 2018-1-24 16:57:17

This is frustrating. Do all of you have the ez_setup.py file?

mongoole posted on 2018-2-27 18:02:12

Awesome.

学学看看 posted on 2018-7-26 15:03:52

盾盾 posted on 2018-8-16 21:37:30

So awesome~~~

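lbf4325 posted on 2018-8-28 21:05:00

Your site's encoding has changed, so beginners don't get to practice on it anymore, but it's still great to have seen this.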

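rockeben posted on 2018-11-6 14:43:27

Nice. With requests, run the detection on r.content:
import chardet
import requests

r = requests.get(url)  # url: the page to fetch
detected = chardet.detect(r.content)["encoding"]  # detect once, reuse the result
if detected == "GB2312":
    r.encoding = "GBK"  # GBK is a superset of GB2312
else:
    r.encoding = detected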
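
The GB2312-to-GBK swap above can also be pulled out into a small helper so it works with any detector result. This is just a sketch: `normalize_detected` and its `default` parameter are names invented here, and falling back to utf-8 when nothing was detected is an assumption, not something the post specifies.

```python
def normalize_detected(encoding, default="utf-8"):
    """Map a detector's guess to the codec name to decode with.

    Hypothetical helper: `default` is an assumed fallback for the case
    where the detector could not name any encoding.
    """
    if not encoding:
        return default
    if encoding.upper() == "GB2312":
        # GBK is a superset of GB2312, so it decodes everything GB2312 can.
        return "GBK"
    return encoding


print(normalize_detected("GB2312"))  # GBK
```

With requests this would be used as `r.encoding = normalize_detected(chardet.detect(r.content)["encoding"])`.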

yang930808 posted on 2018-12-19 15:52:12

Why do I get this for fishc??
{'encoding': 'Windows-1254', 'confidence': 0.4510049011289909, 'language': 'Turkish'}
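
A confidence of about 0.45 means the detector is mostly guessing, so one option is to ignore low-confidence results and use a fallback instead. A rough sketch that takes the result dictionary as input: the function name `pick_encoding`, the 0.8 threshold, and the utf-8 fallback are all assumptions chosen for illustration, not values recommended by chardet.

```python
def pick_encoding(result, min_confidence=0.8, fallback="utf-8"):
    """Trust a detection result only if its confidence is high enough.

    `result` is a dict with 'encoding' and 'confidence' keys, like the
    ones shown in this thread. Threshold and fallback are assumptions.
    """
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0
    if encoding and confidence >= min_confidence:
        return encoding
    return fallback


# The two results quoted in this thread.
low = {'encoding': 'Windows-1254', 'confidence': 0.4510049011289909, 'language': 'Turkish'}
high = {'confidence': 0.99, 'encoding': 'utf-8', 'language': ''}
print(pick_encoding(low))   # falls back, confidence is below the threshold
print(pick_encoding(high))  # keeps the detected encoding
```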

haitao_matt posted on 2019-4-15 14:31:49

Impressive.

Artilleryyy posted on 2019-4-17 07:38:50

Wait, wait, but I decoded with GB2312, so why didn't I get an error...?!
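
One possible explanation: decoding as GB2312 only fails when the bytes contain a character that GBK has and GB2312 lacks. If the page happens to use only characters inside GB2312, both codecs give the same result. A small sketch with the standard library codecs; the helper `decodes_as` and the sample strings are made up for the demonstration.

```python
def decodes_as(data, codec):
    """Return True if `data` decodes cleanly with `codec` (hypothetical helper)."""
    try:
        data.decode(codec)
        return True
    except UnicodeDecodeError:
        return False


simple = "你好".encode("gbk")  # both characters exist in GB2312
wider = "龍".encode("gbk")     # traditional form, in GBK but not in GB2312

print(decodes_as(simple, "gb2312"))
print(decodes_as(wider, "gb2312"))
print(decodes_as(wider, "gbk"))
```

So a page can decode without complaint as GB2312 right up until it contains a less common character, which is why switching to GBK is the safer habit.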

denny1984 posted on 2019-6-12 16:21:25

Learned something.

