from lxml import etree
from urllib import request

# Send browser-like headers so the site returns the normal page.
header = {"User-Agent":"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
"Referer":"https://www.qiushibaike.com/"}
url = "https://www.qiushibaike.com/imgrank/"
req = request.Request(url, data=None, headers=header)
ope = request.urlopen(req)
pic = ope.read()
# Parse the raw HTML bytes and select the <img> elements inside each post.
epic = etree.HTML(pic)
pic_list = epic.xpath("//div[@class='article block untagged mb15']/div[@class='thumb']/a/img")
print(pic_list)
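Scraped this way, the result is an empty list.
If I change url = "https://www.qiushibaike.com/imgrank/" to url = "https://www.qiushibaike.com/pic/", I do get results. The two pages have exactly the same structure and the XPath query is the same, so why does it fail to match on the first one?
I have tried many other pages too, and whenever the XPath goes through several nested divs the result is always empty.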
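In the page source I see, the attribute is class="article block untagged mb15 typs_hot", so an exact match on @class='article block untagged mb15' does not hit it.
Try div[contains(@class, "article block untagged mb15")]; I think that is how it is written.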
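To illustrate the suggestion above, here is a minimal offline sketch. The HTML string, the example.invalid image paths, and the variable names are made up for the demonstration and are not taken from the real site; only the class values come from this thread. It compares an exact @class test with contains(), and also shows a token-based test, which is a common XPath 1.0 idiom that does not depend on the order of the class names.

```python
from lxml import etree

# Hypothetical markup modelled on the class values quoted in the thread.
SAMPLE_HTML = """
<html><body>
<div class="article block untagged mb15 typs_hot">
  <div class="thumb"><a href="/article/1"><img src="//example.invalid/hot.jpg"/></a></div>
</div>
<div class="article block untagged mb15">
  <div class="thumb"><a href="/article/2"><img src="//example.invalid/plain.jpg"/></a></div>
</div>
</body></html>
"""

tree = etree.HTML(SAMPLE_HTML)
tail = "/div[@class='thumb']/a/img/@src"

# Exact comparison: the whole attribute string must be identical,
# so the div with the extra "typs_hot" class is skipped.
exact = tree.xpath("//div[@class='article block untagged mb15']" + tail)

# Substring comparison: matches any class attribute containing this text.
loose = tree.xpath("//div[contains(@class, 'article block untagged mb15')]" + tail)

# Token comparison: matches any div that has "article" as one of its classes,
# regardless of the order or number of other class names.
by_token = tree.xpath(
    "//div[contains(concat(' ', normalize-space(@class), ' '), ' article ')]" + tail
)

print(exact)
print(loose)
print(by_token)
```

If the real page still returns an empty list with contains(), it may be worth printing part of the downloaded HTML to check that the response actually contains those divs, since the markup served to a script can differ from what the browser shows.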