Posted on 2021-5-23 22:33:41
This reply was marked as the best answer.
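Try the code below and see whether it can fetch the URL. If it can, then either your own code has a bug, or your multithreaded requests are hitting the site too frequently and it is rate-limiting you.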
    import requests
    from lxml import etree

    url = 'https://sc.chinaz.com/jianli/free.html'
    headers = {
        'User-Agent': 'Mozilla/5.0'
    }

    # Fetch the listing page and collect the link to each template's detail page.
    res = requests.get(url, headers=headers)
    sel = etree.HTML(res.text)
    urls = sel.xpath('//div[@id="main"]/div/div/a/@href')

    for url_ in urls:
        # The hrefs are treated as protocol-relative (//host/path), so prepend the scheme.
        res1 = requests.get('https:' + url_, headers=headers)
        sel1 = etree.HTML(res1.text)
        # Take the first download link on the detail page.
        down_url = sel1.xpath('//div[@id="down"]/div[2]/ul/li[1]/a/@href')[0]
        print(down_url)
        break  # Only test the first entry.
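If it does turn out to be the rate-limiting case, one common workaround is to retry with a growing delay instead of firing requests back to back. Here is a rough sketch of that idea, not something taken from your code: `fetch_with_backoff`, `fetch`, `retries`, `base_delay` and `sleep` are all names I made up for illustration, and it assumes your own `fetch` function returns `None` when a request fails or gets blocked.

```python
import time
from typing import Callable, Optional


def fetch_with_backoff(
    fetch: Callable[[str], Optional[str]],
    url: str,
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch(url) up to `retries` times, doubling the wait after each failure.

    `fetch` is a hypothetical callable supplied by the caller; it is assumed
    to return the page text on success and None when the request fails.
    """
    for attempt in range(retries):
        result = fetch(url)
        if result is not None:
            return result
        # Do not wait after the final attempt; there is nothing left to retry.
        if attempt < retries - 1:
            sleep(base_delay * 2 ** attempt)
    return None
```

In your scraper, `fetch` could be a small wrapper around `requests.get` that returns `res.text` when `res.status_code == 200` and `None` otherwise. Lowering the number of worker threads at the same time would probably help as well.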