[Solved] Help needed

第一浩男 · posted on 2019-5-29 11:04:12


What is causing this error?
Error:

Traceback (most recent call last):
  File "F:\Py\课堂练习.py", line 12, in <module>
    title_list = re.findall(r'<a\s+herf.*?>.*?</a>',response)
  File "D:\安装目标\Python\lib\re.py", line 223, in findall
    return _compile(pattern, flags).findall(string)
TypeError: expected string or bytes-like object


Program:

import urllib.request
import re
url = 'https://www.baidu.com/s?wd=%E4%BB%A3%E7%90%86ip%E5%85%8D%E8%B4%B9&rsv_spt=1&rsv_iqid=0x98ba7a740015a39f&issp=1&f=3&rsv_bp=1&rsv_idx=2&ie=utf-8&tn=sitehao123_15&rsv_enter=1&rsv_sug3=1&rsv_sug1=1&rsv_sug7=001&rsv_sug2=1&rsp=0&rsv_sug9=es_2_1&rsv_sug4=2795&rsv_sug=3'
headers ={'user-agent':'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.204 Mobile Safari/537.36'}
req = urllib.request.Request(url,headers = headers)
response =urllib.request.urlopen(req)
title_list = re.findall(r'<a\s+herf.*?>.*?</a>',response)
print(title_list)


Best answer


kaohsing

2019-5-29 11:55:07

import urllib.request
import urllib.parse
import re

url = 'https://www.baidu.com/'

headers = {'user-agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N)'}
data = bytes(urllib.parse.urlencode({'word': '代理ip免费'}), encoding='utf-8')
req = urllib.request.Request(url, headers=headers, data=data)

response = urllib.request.urlopen(req)
# Written this way, urlopen() returns a response object, not the page text
print(response)
# <http.client.HTTPResponse object at 0x01B1A230>
print(dir(response))
# ['__abstractmethods__', '__class__', '__del__', '__delattr__',
# (many attributes omitted here)
# 'read', 'read1', 'readable', 'readinto', 'readinto1', 'readline', 'readlines', 'reason', 'seek', 'seekable', 'status', 'tell',
# 'truncate', 'url', 'version', 'will_close', 'writable', 'write', 'writelines']

# Among these attributes, read() returns the page content as bytes; decode() turns it into a str
response = urllib.request.urlopen(req).read().decode('utf-8')
print(response)
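
To connect this back to the original program: once the body has been read and decoded, re.findall receives a str and the TypeError should no longer occur. The following is only a minimal sketch under stated assumptions. The raw_body value is a made-up HTML fragment standing in for urllib.request.urlopen(req).read() so that the example runs without network access, and the pattern spells the attribute as href, on the assumption that herf in the question was a typo.

```python
import re

# Hypothetical stand-in for urllib.request.urlopen(req).read();
# a real body would come from the network and look very different.
raw_body = '<a href="/one">第一</a><p>text</p><a  href="/two">second</a>'.encode('utf-8')

# read() gives bytes; decode to str before applying a str pattern.
html = raw_body.decode('utf-8')

# Same pattern as in the question, with 'herf' changed to 'href' (assumed typo).
# The non-greedy .*? makes each match stop at the first closing </a>.
link_pattern = r'<a\s+href.*?>.*?</a>'
title_list = re.findall(link_pattern, html)
print(title_list)
```

With a real page, the only change would be to take raw_body from response.read() instead of the hard-coded fragment. Note that . does not match newlines by default, so links that span several lines would need the re.S flag.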


wp231957 · posted on 2019-5-29 11:09:52

The expression is probably invalid.
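
A quick way to check this guess is to compile the pattern on its own and then pass a non-string object to re.findall. The sketch below suggests the pattern is syntactically valid and that the TypeError comes from the type of the second argument. FakeResponse is a hypothetical class invented for this example; it only imitates the read() method of the real response object and is not part of urllib or http.client.

```python
import re

# Pattern copied exactly from the question, including the 'herf' spelling.
pattern = r'<a\s+herf.*?>.*?</a>'

# Compiling succeeds, so the expression itself is syntactically valid.
compiled = re.compile(pattern)


class FakeResponse:
    """Hypothetical stand-in for the object returned by urlopen()."""

    def read(self):
        return b'<a href="/x">x</a>'


# Passing the object itself (not its text) reproduces the TypeError.
try:
    re.findall(pattern, FakeResponse())
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Passing decoded text avoids the TypeError, but 'herf' never appears
# in the HTML, so the pattern as written finds nothing.
matches = re.findall(pattern, FakeResponse().read().decode('utf-8'))
print(matches)
```

So there appear to be two separate issues: the argument type, which raises the exception, and the herf spelling, which would silently produce an empty list once the type is fixed.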

