I'm trying to scrape street-snap (街拍) images from Toutiao (今日头条). Could someone help me figure out how to fix this?
import requests
from requests.exceptions import RequestException
from urllib.parse import urlencode
import json


def get_page_index(page_num):
    # Query parameters for the search endpoint
    data = {
        'keyword': '街拍',
        'pd': 'atlas',
        'source': 'search_subtab_switch',
        'dvpf': 'pc',
        'aid': '4916',
        'page_num': page_num,  # pass the argument, not the literal string 'page_num'
        'rawJSON': '1',
        'search_id': '202110031811450101501322040B5D277C'
    }
    url = 'https://so.toutiao.com/search?' + urlencode(data)
    try:
        # The request has to sit inside the try block for RequestException to be caught
        response = requests.get(url)
        if response.status_code == 200:
            return response.text
        return None
    except RequestException:
        print('Failed to fetch the index page')
        return None


def parse_page_index(html):
    # This is the line that raises the AttributeError shown below
    images = html.get('rawData').get('data')
    for image in images:
        link = image.get('img_url')
        yield link


def main(page_num):
    html = get_page_index(page_num)
    for url in parse_page_index(html):
        print(url)


if __name__ == '__main__':
    for i in range(2):
        main(i)
Here is the error message:
Traceback (most recent call last):
  File "E:\QMDownload\今日头条\main.py", line 42, in <module>
    main(i)
  File "E:\QMDownload\今日头条\main.py", line 37, in main
    for url in parse_page_index(html):
  File "E:\QMDownload\今日头条\main.py", line 28, in parse_page_index
    images = html.get('rawData').get('data')
AttributeError: 'str' object has no attribute 'get'
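A suggestion: get_page_index returns response.text, which is a str, so calling .get on it fails. Convert the JSON data to a dict first (for example with json.loads) and then parse it.
Also, if you want the images, what you need to look for is the link that corresponds to each image.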
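To make the suggestion concrete, here is a minimal sketch of parse_page_index that decodes the JSON text into a dict before reading it. The key names rawData, data and img_url are taken from the code in the question. Whether the real response actually uses that layout is an assumption, and the sample payload and example.com links are made up purely for illustration.

```python
import json


def parse_page_index(text):
    """Yield image links from the raw JSON text of a search response."""
    if not text:
        # get_page_index may return None when the request fails
        return
    try:
        payload = json.loads(text)  # str -> Python object
    except json.JSONDecodeError:
        # The body was not valid JSON (for example, an HTML error page)
        return
    if not isinstance(payload, dict):
        return
    # Assumed layout, copied from the question: {"rawData": {"data": [{"img_url": ...}]}}
    items = (payload.get('rawData') or {}).get('data') or []
    for item in items:
        if isinstance(item, dict) and item.get('img_url'):
            yield item['img_url']


# Hypothetical sample payload, used only to exercise the parser offline
sample = json.dumps({
    'rawData': {
        'data': [
            {'img_url': 'https://example.com/a.jpg'},
            {'title': 'entry without an image link'},
            {'img_url': 'https://example.com/b.jpg'},
        ]
    }
})

links = list(parse_page_index(sample))
for link in links:
    print(link)
```

With requests, calling response.json() instead of reading response.text performs the same decoding step. If the loop prints nothing for a real response, dump the decoded dict and check where the image links actually live.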