This post was last edited by 庸人忧天下 on 2020-4-19 15:08
import time

import requests
from bs4 import BeautifulSoup

# Send a browser-like user agent with every request.
headers = {'user-agent': 'Mozilla/5.0 (iPad; CPU OS 11_0 like Mac OS X) AppleWebKit/604.1.34 (KHTML, like Gecko) Version/11.0 Mobile/15A5341f Safari/604.1'}


def get_info(url):
    """Fetch one ranking page and print each song entry found on it."""
    wb_data = requests.get(url, headers=headers, timeout=10)
    soup = BeautifulSoup(wb_data.text, 'html.parser')
    # Select every <li>; 'li:nth-child(1)' would match only the first entry.
    # '> strong' is dropped from the rank selector on the assumption that
    # not every entry wraps its rank number in <strong>.
    ranks = soup.select('#rankWrap > div.pc_temp_songlist > ul > li > span.pc_temp_num')
    titles = soup.select('#rankWrap > div.pc_temp_songlist > ul > li > a')
    times = soup.select('#rankWrap > div.pc_temp_songlist > ul > li > span.pc_temp_tips_r > span')
    # 'duration' instead of 'time' so the loop variable does not shadow the time module.
    for rank, title, duration in zip(ranks, titles, times):
        data = {
            'rank': rank.get_text().strip(),  # strip() must be called, not just referenced
            'title': title.get_text().split('-')[0].strip(),
            'time': duration.get_text().strip(),
        }
        print(data)


if __name__ == '__main__':
    # Ranking pages 1 to 23.
    urls = ['https://www.kugou.com/yy/rank/home/{}-8888.html?from=rank'.format(i) for i in range(1, 24)]
    for url in urls:
        get_info(url)
        time.sleep(1)  # pause between requests to avoid hammering the site
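The `title` field in the code keeps only the text before the first hyphen. If the link text on the page has the form `artist - song` (an assumption about the page, not something the post confirms), a small helper could keep both parts. `split_entry` is a hypothetical name used only for this sketch:

```python
def split_entry(text):
    """Split 'artist - song' into (artist, song). Hypothetical helper."""
    # partition() splits on the first hyphen only, so hyphens inside
    # the song name are preserved.
    artist, sep, song = text.strip().partition('-')
    if not sep:
        # No hyphen found: treat the whole text as the song name.
        return '', artist.strip()
    return artist.strip(), song.strip()


print(split_entry(' Some Artist - Some Song '))
# prints ('Some Artist', 'Some Song')
```

In `get_info` the result could be stored under two keys instead of the single `'title'` key.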