Original poster | Posted on 2021-1-22 17:30:40
import re
import requests

# Send a browser-like User-Agent with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko'
}
# Baidu image search acjson URL; word= is the URL-encoded search term
url = 'https://image.baidu.com/search/acjson?tn=resultjson_com&logid=10175432787291229734&ipn=rj&ct=201326592&is=&fp=result&queryWord=%E4%BA%8C%E6%AC%A1%E5%85%83%E5%9B%BE%E7%89%87&cl=2&lm=-1&ie=utf-8&oe=utf-8&adpicid=&st=&z=&ic=&hd=&latest=&copyright=&word=%E4%BA%8C%E6%AC%A1%E5%85%83%E5%9B%BE%E7%89%87&s=&se=&tab=&width=&height=&face=&istype=&qc=&nc=&fr=&expermode=&force=&pn=30'
response = requests.get(url=url, headers=headers)
response.encoding = response.apparent_encoding
html = response.text
# Pull every "middleURL" value (the image address) out of the response text
regular = re.compile(r'"middleURL":"(.*?)"')
m = regular.findall(html)
for each in m:
    print(each)
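This program should get the addresses of the first 30 images. Anyone interested in taking it further can look at the approach and explanation from the expert whose reply was marked as the best answer.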
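For anyone who wants to take it further, here is a rough sketch of one possible direction, not the best-answer author's method. It wraps the same "middleURL" regex in a function and adds a hypothetical helper, `with_offset`, that rewrites the `pn` parameter of the URL. I am assuming `pn` is a result offset that can be stepped to request more pages; that is not confirmed anywhere in this thread, so check it against real responses. Both function names are made up for illustration.

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Same pattern as in the post: capture the value of every "middleURL" field
MIDDLE_URL = re.compile(r'"middleURL":"(.*?)"')


def extract_middle_urls(text):
    """Return every non-empty "middleURL" value found in the response text."""
    return [u for u in MIDDLE_URL.findall(text) if u]


def with_offset(url, pn):
    """Return url with its pn query parameter replaced by pn.

    Hypothetical helper; assumes pn is a result offset (unverified).
    """
    parts = urlsplit(url)
    # keep_blank_values=True preserves empty parameters such as "is="
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != 'pn']
    query.append(('pn', str(pn)))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == '__main__':
    # Made-up sample text standing in for response.text
    sample = '{"data":[{"middleURL":"https://example.com/a.jpg"},{"middleURL":""}]}'
    for address in extract_middle_urls(sample):
        print(address)
```

To use it with the program above, you could call `requests.get(with_offset(url, pn), headers=headers)` for a few values of `pn` and pass each `response.text` to `extract_middle_urls`. Whether the server actually returns a different page for each value is something you would need to verify yourself.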