lqq123 posted on 2021-5-31 16:09:31

Scraping Zhihu column titles with XPath returns an empty list

The output is an empty list. I tried both the XPath copied from the browser and one I wrote myself, and both return an empty list.

import requests
from lxml import etree

url = 'https://zhuanlan.zhihu.com/pypcfx'
headers = {'user-agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36'}
html = requests.get(url, headers=headers)
data = etree.HTML(html.text)
title = data.xpath('//div[@class="Card css-ny4o71"]/div[@class="css-8txec3"]//h2/a/text()')
print(title)


wp231957 posted on 2021-5-31 16:42:20

This content is loaded via AJAX, so it is not in the HTML that requests receives. Try this:

import requests

# Request the JSON endpoint that the page loads over AJAX, not the HTML page itself
url = 'https://www.zhihu.com/api/v4/columns/pypcfx/pinned-items'
headers = {'user-agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36'}
html = requests.get(url, headers=headers).json()["data"]["content"]
print(html)
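
If the JSON is nested differently from what the snippet above expects, here is a rough sketch that walks the whole response and collects anything stored under a "title" key. The key name "title", the helper names extract_titles and fetch_titles, and the timeout are my own assumptions, not something confirmed from the API, so check the actual response in the browser's Network tab first.

```python
API_URL = "https://www.zhihu.com/api/v4/columns/pypcfx/pinned-items"
HEADERS = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36"
}


def extract_titles(payload):
    """Collect every string stored under a "title" key, at any depth.

    Hypothetical helper: the key name "title" is an assumption about the
    response, not a documented field.
    """
    titles = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            if key == "title" and isinstance(value, str):
                titles.append(value)
            else:
                # Keep descending into nested dicts and lists
                titles.extend(extract_titles(value))
    elif isinstance(payload, list):
        for item in payload:
            titles.extend(extract_titles(item))
    return titles


def fetch_titles(url=API_URL):
    """Download the JSON from the API endpoint and return the titles found."""
    # Imported here so extract_titles can be used on saved JSON without requests
    import requests

    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return extract_titles(response.json())
```

Call print(fetch_titles()) to try it. If it also prints an empty list, the titles are probably under a different key or the endpoint wants extra headers or cookies.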