A requests problem
import requests
import bs4
r=requests.get('https://movie.douban.com/top250')
soup=bs4.BeautifulSoup(r.text,'lxml')
title=soup.find_all('div',class_='hd')
for i in title:
    print(i.a.span.text)
The code looks correct to me, so why does it print nothing at all? Any help is appreciated.
Last edited by xiaosi4081 on 2020-7-23 16:59
python -m pip install -U urllib3 -i https://pypi.tuna.tsinghua.edu.cn/simple
Also, change the code to this:
import requests
import bs4
headers = {"user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36"}
r=requests.get('https://movie.douban.com/top250',headers=headers)
soup=bs4.BeautifulSoup(r.text,'lxml')
title=soup.find_all('div',class_='hd')
for i in title:
    print(i.a.span.text)
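headers holds the HTTP request headers. Without them the site may refuse the request, so there is nothing to scrape. If you don't believe it, print the status code: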
import requests
import bs4
r=requests.get('https://movie.douban.com/top250')
print(r.status_code)
soup=bs4.BeautifulSoup(r.text,'html.parser')
title=soup.find_all('div',class_='hd')
for i in title:
    print(i.a.span.text)
Result:
418
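In other words, the request was blocked by the site's anti-scraping check, so the response contains no 'hd' divs and the loop never runs.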
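To see why the headers matter, here is a minimal sketch that builds the request without sending it and shows which User-Agent requests would put on the wire. The helper name user_agent_that_would_be_sent is made up for this illustration, and the idea that the site rejects the default value is an assumption based on the 418 above, not something the site documents.

```python
import requests

URL = "https://movie.douban.com/top250"
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36"
)


def user_agent_that_would_be_sent(url, headers=None):
    """Hypothetical helper: prepare a GET request (no network traffic)
    and return the User-Agent header it would carry."""
    session = requests.Session()
    # prepare_request merges the session's default headers with ours.
    prepared = session.prepare_request(
        requests.Request("GET", url, headers=headers)
    )
    return prepared.headers["User-Agent"]


# Without custom headers, requests identifies itself as a script.
default_ua = user_agent_that_would_be_sent(URL)
# With the headers dict from the reply above, it looks like a browser.
custom_ua = user_agent_that_would_be_sent(URL, {"User-Agent": BROWSER_UA})

print(default_ua)  # begins with "python-requests/"
print(custom_ua)
```

The first value is what the original code was sending, which is probably what the server's check keys on.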
Please mark this as the best answer.
python -m pip install -U chardet -i https://pypi.tuna.tsinghua.edu.cn/simple
You don't have any anti-anti-scraping measures at all?
import requests
import bs4
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0'}
r=requests.get('https://movie.douban.com/top250', headers = headers)
soup=bs4.BeautifulSoup(r.text,'lxml')
title=soup.find_all('div',class_='hd')
for i in title:
    print(i.a.span.text)
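The snippets in this thread fail silently when the request is blocked. A slightly more defensive sketch, assuming the same page structure, splits fetching from parsing so a 418 raises an error instead of printing nothing. The names fetch_html and extract_titles, the 10-second timeout, and SAMPLE_HTML are all invented for this illustration. It uses the built-in 'html.parser' so that lxml is not required.

```python
import bs4
import requests

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) "
                  "Gecko/20100101 Firefox/78.0"
}


def fetch_html(url, headers=HEADERS, timeout=10):
    """Download a page; raise requests.HTTPError on 4xx/5xx (e.g. 418)."""
    r = requests.get(url, headers=headers, timeout=timeout)
    r.raise_for_status()
    return r.text


def extract_titles(html):
    """Return the text of the first <span> inside each <div class="hd"> link."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    titles = []
    for div in soup.find_all("div", class_="hd"):
        # Skip entries that lack the expected <a><span> nesting.
        span = div.a.span if div.a else None
        if span is not None:
            titles.append(span.get_text(strip=True))
    return titles


# Made-up HTML that mimics the structure the thread's code expects.
SAMPLE_HTML = (
    '<div class="hd"><a href="#">'
    '<span class="title">Example Movie</span></a></div>'
)
sample_titles = extract_titles(SAMPLE_HTML)
print(sample_titles)
```

Against the real site this would be called as extract_titles(fetch_html('https://movie.douban.com/top250')). Whether the request succeeds still depends on the site accepting the User-Agent.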