Asking the experts for help:
(Source code)
import requests
import bs4
def song_message(url):
    song_name=[]
    headers={
        'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36',
        'referer':'https://music.163.com/song?id=4466775'
    }
    res=requests.get(url,headers=headers)
    soup = bs4.BeautifulSoup(res.text,'html.parser')
    targets = soup.find_all('div',class_='tit')  # I suspect the problem starts here
    for each in targets:
        print(each.em.text)  # the traceback points at this line
def main():
    url='https://music.163.com/#/song?id=1357785909'
    song_message(url)
if __name__=='__main__':
    main()
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Running it always fails with this error:
File "C:\Users\17659\Desktop\test.py", line 15, in song_message
print(each.em.text)
AttributeError: 'NoneType' object has no attribute 'text'
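I checked the page and your approach is sound, so this looks like one of NetEase Cloud Music's anti-scraping measures. I looked through other forum threads about scraping NetEase Cloud Music and found a fix that I have confirmed works: remove the #/ from your URL. (Everything after # is a URL fragment, which requests never sends to the server, so the original URL was asking for the site's root page rather than the song page.) With this change, targets still picks up one div that has no em tag, so each.em is None for it; this is probably a leftover quirk of the page. I skip it with exception handling. The source code is below; I hope it helps: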
import requests
import bs4
def song_message(url):
    song_name=[]
    headers={
        'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36',
        'referer':'https://music.163.com/song?id=4466775'
    }
    res=requests.get(url,headers=headers)
    soup = bs4.BeautifulSoup(res.text,'html.parser')
    targets = soup.find_all('div',class_='tit')
    for each in targets:
        # The try sits inside the loop so that one div without an <em>
        # is skipped instead of ending the whole loop early.
        try:
            print(each.em.text)
        except AttributeError:
            pass
def main():
    # Same song URL as before, with the '#/' removed
    url='https://music.163.com/song?id=1357785909'
    song_message(url)
if __name__=='__main__':
    main()
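To see why removing the #/ matters, it may help to look at how the URL is split, which needs no network access. The sketch below uses only the standard library; the helper name strip_hash_route is made up for this illustration and is not part of requests or of any NetEase API. It assumes the part after # has the form /path?query, as in the URL from the question.

```python
from urllib.parse import urlsplit, urlunsplit

def strip_hash_route(url):
    """Turn 'https://host/#/song?id=1' into 'https://host/song?id=1'.

    Hypothetical helper: URLs without a '#/' route are returned unchanged.
    """
    parts = urlsplit(url)
    if not parts.fragment.startswith('/'):
        return url
    # The fragment itself looks like '/song?id=1'; split it into path and query.
    route = urlsplit(parts.fragment)
    return urlunsplit((parts.scheme, parts.netloc, route.path, route.query, ''))

original = 'https://music.163.com/#/song?id=1357785909'
print(urlsplit(original).path)      # the path that is actually requested
print(urlsplit(original).fragment)  # the part that never leaves the client
print(strip_hash_route(original))   # the URL to pass to requests.get
```

With the original URL the requested path is just the site root, and the song route only exists in the fragment, which is why the parsed page did not contain the expected song title.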
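As an alternative to catching AttributeError, the div without an em tag can be skipped with an explicit None check. The sketch below runs on a small inline HTML sample rather than the real page; SAMPLE_HTML and extract_titles are made up for this illustration, and the real NetEase markup may well differ.

```python
import bs4

# Invented sample: the second div has no <em>, like the stray element on the real page.
SAMPLE_HTML = """
<div class="tit"><em>Song A</em></div>
<div class="tit"><span>no em tag here</span></div>
<div class="tit"><em> Song B </em></div>
"""

def extract_titles(html):
    """Return the text of the <em> inside each div.tit, skipping divs without one."""
    soup = bs4.BeautifulSoup(html, 'html.parser')
    titles = []
    for each in soup.find_all('div', class_='tit'):
        if each.em is None:  # this div has no <em> child, so there is nothing to read
            continue
        titles.append(each.em.get_text(strip=True))
    return titles

print(extract_titles(SAMPLE_HTML))
```

The explicit check only skips the case that is expected, whereas a broad except AttributeError would also hide an unrelated mistake such as a misspelled attribute name.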