Could someone take a look? I'm scraping a website and get an error when decoding the response as utf-8.
import urllib.request
url = "http://www.offcn.com/"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362"
}
req = urllib.request.Request(url=url,headers=headers)
response = urllib.request.urlopen(req)
print(response.read().decode("utf-8"))
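After running it, I get the error shown in the screenshot.

Use requests

Before scraping, it's a good idea to check the page's encoding first:
Just change the encoding here:
print(response.read().decode('gb2312'))

Using requests: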
import requests
url = "http://www.offcn.com/"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362"
}
req = requests.get(url, headers=headers)
response = req.text
print(response)

永恒的蓝色梦想 posted on 2020-5-29 21:47
Use requests
Thanks

Twilight6 posted on 2020-5-29 21:52
Before scraping, it's a good idea to check the page's encoding first:
Just change the encoding here
Thanks
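
For anyone who hits the same error on another site: instead of hard-coding 'gb2312', one option is to try whatever charset the page declares and only then fall back to a few guesses. The following is just a rough sketch, not something posted in this thread. The helper name decode_html and the fallback list ("utf-8", then "gb18030", which is a superset of gb2312) are my own assumptions.

```python
import re

# Matches charset=... inside a <meta> tag, e.g. <meta charset="gb2312">.
META_CHARSET = re.compile(rb'<meta[^>]+charset=["\']?([\w-]+)', re.IGNORECASE)


def decode_html(raw, header_charset=None, fallbacks=("utf-8", "gb18030")):
    """Decode raw page bytes (hypothetical helper, illustration only).

    Tries the charset from the HTTP header first, then the one declared
    in a <meta> tag, then each fallback in order.
    """
    candidates = []
    if header_charset:
        candidates.append(header_charset)
    match = META_CHARSET.search(raw[:2048])  # the declaration sits near the top
    if match:
        candidates.append(match.group(1).decode("ascii"))
    candidates.extend(fallbacks)

    for name in candidates:
        try:
            return raw.decode(name)
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown codec name: try the next candidate.
            continue
    # Last resort: decode anyway and replace the bytes that do not fit.
    return raw.decode("utf-8", errors="replace")
```

With the urllib code from the first post, this could be called as decode_html(response.read(), response.headers.get_content_charset()). With requests, a comparable approach is to set req.encoding = req.apparent_encoding before reading req.text.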