Crawler: fetching the Baidu page source

875038534 · posted on 2018-3-26 14:23:02

import urllib.request

url="http://www.baidu.com"
head= {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36'}
req=urllib.request.Request(url,head)
a=urllib.request.urlopen(req)
html=a.read().decode("utf-8")
print(html)

Running this reports:
TypeError: can't concat str to bytes

875038534 · posted on 2018-3-26 14:30:21

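After a closer look, the cause turned out to be that my second positional argument was being taken as the data parameter of urllib.request.Request, which is why the error was raised. The signature is Request(url, data=None, headers={}, ...), so the headers dict has to be passed by keyword (or as the third positional argument).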
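The mix-up can be seen without sending anything over the network, by inspecting the Request objects directly. This is only a small sketch; the shortened User-Agent string is a placeholder, not the one from the post.

```python
import urllib.request

# Placeholder header value, shortened for the example.
head = {"User-Agent": "Mozilla/5.0"}

# Second positional argument is `data`, so the dict becomes the request body.
wrong = urllib.request.Request("http://www.baidu.com", head)
print(wrong.data)          # {'User-Agent': 'Mozilla/5.0'}
print(wrong.get_method())  # POST
print(wrong.headers)       # {}

# Passing the dict by keyword puts it where it belongs.
right = urllib.request.Request("http://www.baidu.com", headers=head)
print(right.data)          # None
print(right.get_method())  # GET
print(right.headers)       # {'User-agent': 'Mozilla/5.0'}
```

With the dict stored as the body, urlopen later tries to send it as POST data, which is where the bytes/str TypeError comes from.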

Here is the corrected code:
import urllib.request

url = "http://www.baidu.com"
head = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36'}
# Pass the headers by keyword so they are not mistaken for the POST body (data).
req = urllib.request.Request(url, data=None, headers=head)
a = urllib.request.urlopen(req)
html = a.read().decode("utf-8")
print(html)
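As a possible follow-up, the corrected code could be wrapped in a small helper that closes the response and reads the charset from the response headers instead of hard-coding UTF-8. This is a sketch under assumptions: the names build_request, fetch_html, and DEFAULT_HEADERS are hypothetical, and the 10-second timeout is an arbitrary choice.

```python
import urllib.request

# Hypothetical default; any realistic browser User-Agent string would do.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0"}


def build_request(url, headers=None):
    """Build a GET request, passing headers by keyword so they never land in `data`."""
    return urllib.request.Request(url, headers=headers or DEFAULT_HEADERS)


def fetch_html(url, headers=None, timeout=10):
    """Fetch a URL and decode the body with the declared charset (UTF-8 as fallback)."""
    req = build_request(url, headers)
    # The with-block closes the response even if reading or decoding fails.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling print(fetch_html("http://www.baidu.com")) would then print the page source, as in the post above.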
