Confused by the results returned from an XPath query
import requests
from lxml import etree
headers = {
'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36',
#'Host':'www.xbiquge.la'
}
url = 'https://www.xbiquge.la'
# Send the request, passing the headers defined above
html_url = requests.get(url, headers=headers)
# Set the response encoding before reading the text
#html_url.encoding = html_url.apparent_encoding
html_url.encoding = 'utf-8'
html_code = html_url.text
html_content = etree.HTML(html_code)
html_word = html_content.xpath('//div[@id="main"]/div/div/div/dl/dt/a')
print(html_word)
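Unlike BeautifulSoup, lxml does not show a node's source when you print an XPath result; xpath() returns a list of Element objects, so print() only shows their object representations. To print the source, serialize each element in the list:
html_word = html_content.xpath('//div[@id="main"]/div/div/div/dl/dt/a')
for node in html_word:
    # etree.tostring expects a single element, not the whole list
    pt = etree.tostring(node, encoding='unicode')
    print(pt)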
But if you only want the text, this is enough:
html_word = html_content.xpath('//div[@id="main"]/div/div/div/dl/dt/a/text()')
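A small sketch of the text() approach, again on made-up sample markup rather than the real page. The `@href` query is an addition for illustration, assuming the link targets are also of interest; both queries return plain lists of strings, so they can be printed directly.

```python
from lxml import etree

# Hypothetical sample markup, not the actual page source
SAMPLE_HTML = '''
<div id="main"><div><div><div><dl>
  <dt><a href="/1/">Book One</a></dt>
  <dt><a href="/2/">Book Two</a></dt>
</dl></div></div></div></div>
'''

tree = etree.HTML(SAMPLE_HTML)

# text() selects the text nodes inside each <a>
titles = tree.xpath('//div[@id="main"]/div/div/div/dl/dt/a/text()')
# @href selects the attribute value of each <a>
links = tree.xpath('//div[@id="main"]/div/div/div/dl/dt/a/@href')

print(titles)
print(links)
```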
Many thanks to 兔子 for the guidance.