Python web scraper, Python Discussion, Programming Languages Section, 鱼C论坛 (FishC Forum)

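cjy520 posted on 2021-6-15 21:26:52

Python web scraper

Python scraper: search for the keyword "新冠" (COVID-19), download the news articles behind the links on the first 20 pages of results, and save them to an Excel file.
Looking for source code that does this.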

路神 posted on 2021-6-15 23:35:02

981048327

Py与C。。。 posted on 2021-6-16 10:13:46

Which website should the search be run on?

xiaosi4081 posted on 2021-6-16 20:33:00

This post was last edited by xiaosi4081 on 2021-6-16 20:34

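import requests
from bs4 import BeautifulSoup

news = []  # headline texts collected across all result pages

def getcode(keyword, page):
    # Fetch one page of Baidu News results and collect the headline texts.
    headers = {"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36"}
    url = "https://www.baidu.com/s?tn=news&rtt=4&bsst=1&cl=2&wd={}&pn={}".format(keyword, page)
    print(url)
    res = requests.get(url, headers=headers)
    soup = BeautifulSoup(res.text, "html.parser")
    # Headline links are matched by the CSS class given in the original post.
    each = soup.find_all("a", class_="news-title-font_1xS-F")
    for i in each:
        news.append(i.text)

if __name__ == "__main__":
    keyword = input("Keyword: ")
    # pn is the result offset: 0, 10, ..., 190 covers the first 20 pages.
    for i in range(0, 200, 10):
        getcode(keyword, i)
    # Print each headline once (the original printed the whole list every time).
    for i in news:
        print(i)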
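
The page offsets used in the loop above can also be turned into URLs with the standard library, which percent-encodes a non-ASCII keyword such as "新冠" explicitly. This is only a rough sketch: the helper name build_search_urls and its pages/per_page parameters are made up for illustration, and the query parameters are simply copied from the URL in the post.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.baidu.com/s"

def build_search_urls(keyword, pages=20, per_page=10):
    """Return one search URL per result page, with the keyword percent-encoded."""
    urls = []
    # Offsets 0, 10, ..., 190 match range(0, 200, 10) in the post.
    for offset in range(0, pages * per_page, per_page):
        params = {"tn": "news", "rtt": 4, "bsst": 1, "cl": 2,
                  "wd": keyword, "pn": offset}
        urls.append(BASE_URL + "?" + urlencode(params))
    return urls

urls = build_search_urls("新冠")
print(len(urls))
print(urls[0])
```

Each URL can then be passed to requests.get in place of the formatted string.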

This scrapes the results from Baidu News (百度资讯).
If this helps, please mark it as the [Best Answer].
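
The code above only prints the headlines, while the original question also asked for saving to Excel. One possible way, sketched here under assumptions, is to write a CSV file that Excel can open rather than a native .xlsx workbook. The function name save_news, the file name news.csv, and the sample rows are all hypothetical; real rows would come from the scraper, for example (a.text, a.get("href")) for each matched link.

```python
import csv

def save_news(rows, path="news.csv"):
    """Write (title, link) pairs to a CSV file that Excel can open."""
    # utf-8-sig writes a byte-order mark so Excel recognises the file as UTF-8
    # and displays Chinese text correctly.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "link"])
        writer.writerows(rows)

# Placeholder rows for illustration only, not real search results.
sample = [
    ("Example headline 1", "https://example.com/1"),
    ("示例标题 2", "https://example.com/2"),
]
save_news(sample)
```

If a real .xlsx file is required, a third-party library would be needed instead of the csv module.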
