import requests
import bs4
import time

with open('./豆瓣.txt', mode='w+', encoding='utf-8') as f:
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'
    }
    q = input("How many pages to crawl? ")
    for i in range(int(q)):
        url = "https://movie.douban.com/top250"
        params = {q : (i * 25)}
        res = requests.get(url=url, headers=headers, params=params)
        soup = bs4.BeautifulSoup(res.text, "html.parser")
        targets = soup.find_all("div", class_="hd")
        for each in targets:
            f.write(each.a.text)
        time.sleep(1)
Why can't this crawl the next page? It keeps looping over the first page.
import requests
import bs4
import time

with open('./豆瓣.txt', mode='w+', encoding='utf-8') as f:
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'
    }
    q = input("How many pages to crawl? ")
    for i in range(int(q)):
        # Put the offset directly in the URL: start=0, 25, 50, ...
        url = f"https://movie.douban.com/top250?start={i * 25}&filter="
        res = requests.get(url=url, headers=headers)
        soup = bs4.BeautifulSoup(res.text, "html.parser")
        targets = soup.find_all("div", class_="hd")
        for each in targets:
            f.write(each.a.text)
        # Pause between pages to avoid hammering the site
        time.sleep(1)
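As for why the first version never leaves page 1: `params = {q : (i * 25)}` uses the variable `q` (whatever was typed at the prompt, e.g. "3") as the parameter name instead of the literal string "start". The request therefore goes out as `?3=25` rather than `?start=25`, and the site apparently ignores the unknown parameter and serves the first page again. Below is a small offline sketch of the difference. The helper names `buggy_query` and `fixed_query` are made up for illustration, and it uses the standard library's `urlencode` as a stand-in for the query string that `requests` would build, so it runs without touching the network.

```python
from urllib.parse import urlencode

BASE_URL = "https://movie.douban.com/top250"
PAGE_SIZE = 25  # offset step taken from the i * 25 in the original code


def buggy_query(q, page):
    # Original code: the key is the user's input (e.g. "3"), not "start".
    return urlencode({q: page * PAGE_SIZE})


def fixed_query(page):
    # The key has to be the literal string "start".
    return urlencode({"start": page * PAGE_SIZE})


if __name__ == "__main__":
    for page in range(3):
        print(f"{BASE_URL}?{buggy_query('3', page)}")
        print(f"{BASE_URL}?{fixed_query(page)}")
```

So if you would rather keep `params` than build the URL by hand, changing the line to `params = {"start": i * 25}` and leaving `requests.get(url=url, headers=headers, params=params)` as it was should have the same effect as the f-string version above.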