byzh168 posted on 2020-5-29 21:45:42

Could someone take a look? I'm scraping a website and hit a UTF-8 decoding error.

import urllib.request

url = "http://www.offcn.com/"

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362"
}
req = urllib.request.Request(url=url, headers=headers)
response = urllib.request.urlopen(req)
# Decoding as UTF-8 fails if the page is served in a different encoding
print(response.read().decode("utf-8"))


After running it, I get the error shown in the screenshot.

永恒的蓝色梦想 posted on 2020-5-29 21:47:05

Use requests.

Twilight6 posted on 2020-5-29 21:52:17

Before scraping, it's a good idea to check the page's encoding first.

Just change the encoding on this line:
print(response.read().decode('gb2312'))
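
If you would rather not hard-code the charset, here is a rough sketch of reading it from the Content-Type header or the page's <meta> tag before decoding. The helper names guess_charset and fetch_text are made up for illustration, and the "utf-8" fallback and errors="replace" are my own assumptions, not something the site guarantees.

```python
import re
import urllib.request


def guess_charset(content_type, body, default="utf-8"):
    """Return the charset named in the Content-Type header or an HTML <meta> tag."""
    # 1. HTTP header, e.g. "text/html; charset=gb2312"
    match = re.search(r"charset=([\w-]+)", content_type or "", re.IGNORECASE)
    if match:
        return match.group(1).lower()
    # 2. <meta charset="..."> or <meta http-equiv=... content="...; charset=...">
    #    Only the first few KB are searched, since the tag normally sits in <head>.
    match = re.search(rb"charset=[\"']?([\w-]+)", body[:4096], re.IGNORECASE)
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. Nothing declared: fall back to the default (an assumption)
    return default


def fetch_text(url, headers=None):
    """Download a page and decode it with the charset it declares (hypothetical helper)."""
    req = urllib.request.Request(url=url, headers=headers or {})
    with urllib.request.urlopen(req) as response:
        body = response.read()
        content_type = response.headers.get("Content-Type")
    charset = guess_charset(content_type, body)
    # errors="replace" keeps going if a few bytes fall outside the declared charset
    return body.decode(charset, errors="replace")
```

With this, print(fetch_text(url, headers)) would take the place of the hard-coded decode call, as long as the page really declares its encoding somewhere.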

xiaosi4081 posted on 2020-5-30 07:54:21

Use requests:
import requests

url = "http://www.offcn.com/"

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362"
}
req = requests.get(url, headers=headers)
# .text is the response body decoded to str by requests
response = req.text

print(response)
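
One caveat with .text: when the server sends a text Content-Type without a charset, requests falls back to ISO-8859-1, so Chinese pages can come out garbled. Below is a small sketch of working around that with apparent_encoding. Whether this site actually omits the charset is an assumption on my part, and pick_encoding and fetch_text are hypothetical names.

```python
def pick_encoding(declared, apparent):
    """Choose between the header-declared encoding and the one detected from the body."""
    # requests reports "ISO-8859-1" when a text/* response has no charset in its header,
    # so treat that value (or no value at all) as "not really declared".
    if not declared or declared.lower() == "iso-8859-1":
        return apparent
    return declared


def fetch_text(url, headers=None):
    """Fetch a page with requests and decode it with a sensible encoding."""
    # Imported here so pick_encoding can be used without requests installed;
    # requests is third-party (pip install requests).
    import requests

    resp = requests.get(url, headers=headers, timeout=10)
    resp.raise_for_status()
    # apparent_encoding is detected from the response bytes themselves
    resp.encoding = pick_encoding(resp.encoding, resp.apparent_encoding)
    return resp.text
```

Detection from the bytes is a guess, so if you already know the page is gb2312 you can simply set the encoding attribute on the response to "gb2312" before reading .text.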

byzh168 posted on 2020-6-1 09:50:39

永恒的蓝色梦想 posted on 2020-5-29 21:47
Use requests.

Thanks.

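byzh168 posted on 2020-6-1 09:56:12

Twilight6 posted on 2020-5-29 21:52
Before scraping, it's a good idea to check the page's encoding first.

Just change the encoding on this line:

Thanks.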