Help with BeautifulSoup content matching, Python Discussion, Programming Languages, FishC Forum

小甲鱼的铁粉 posted on 2021-2-4 10:07:35

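Last edited by 小甲鱼的铁粉 on 2021-2-4 10:15

I want to extract the entire div that holds the page's comment section. But even a bare find_all("div"), with no filters at all, does not return the divs I expect. Printing soup looks normal; things only go wrong after running divs = soup.find_all('div'). Could fellow FishC members please take a look?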

import re
import os
import requests
from bs4 import BeautifulSoup

def get_div():
    # Video playback page on mgtv.com
    url = "https://www.mgtv.com/b/350683/11017269.html?fpa=76&fpos=3&lastp=ch_home"
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0"}
    response = requests.get(url, headers=headers)

    # Parse the HTML exactly as the server returned it (no JavaScript is executed)
    soup = BeautifulSoup(response.text, 'lxml')
    # print(soup)
    divs = soup.find_all('div')
    print(divs)

if __name__ == "__main__":
    get_div()

Output:

<div data-server-rendered="true" id="__nuxt"><div id="__layout"><div><noscript>
请启用 JavaScript
</noscript>   <div class="m-video-error-infomessage" style="display:none;"><div class="video-error-infomessage"><h4>您将了解到本次错误原因：</h4><p>错误码：<em></em></p><p>错误详情：<em></em></p><p class="video-error-infomessage-closed"><a href="javascript:;">关闭</a></p></div></div> </div></div></div>

小甲鱼的铁粉 posted on 2021-2-4 10:16:30

Could it be that anti-scraping kicked in at the find_all step?
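For what it's worth, find_all only walks the HTML text that has already been downloaded, so the site cannot block that call. A minimal offline sketch, assuming the output posted above is essentially all the markup the server sent (the `html` string is shortened from that output, and `html.parser` is used here only so the sketch does not depend on lxml):

```python
from bs4 import BeautifulSoup

# Static HTML shortened from the output in the first post.
html = (
    '<div data-server-rendered="true" id="__nuxt"><div id="__layout"><div>'
    '<noscript>\n请启用 JavaScript\n</noscript>'
    '<div class="m-video-error-infomessage" style="display:none;">'
    '<div class="video-error-infomessage"><h4>...</h4></div></div>'
    '</div></div></div>'
)

soup = BeautifulSoup(html, "html.parser")

# find_all returns every <div> in document order, nested ones included.
divs = soup.find_all("div")
div_count = len(divs)

# Collect id/class pairs to see which divs are actually present.
ids_and_classes = [(d.get("id"), d.get("class")) for d in divs]
print(div_count)
for item in ids_and_classes:
    print(item)
```

If this reading is right, find_all is returning every div that exists in the response; the comment section simply is not among them.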

qq1151985918 posted on 2021-2-4 10:37:31

It's because you are scraping the video playback URL, not the comments URL.
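A rough sketch of what requesting the comments separately might look like. The thread never gives the real comments address, so `COMMENTS_API_URL` and the `videoId` parameter below are hypothetical placeholders; the actual request would have to be read from the browser's developer tools (Network tab) while the comments load. Treating the two numbers in the page URL as the IDs such a request needs is also only an assumption.

```python
import re
import requests

# Hypothetical placeholder: NOT a real mgtv endpoint. Replace it with the
# request URL observed in the browser's Network tab.
COMMENTS_API_URL = "https://example.invalid/comments"

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) "
                  "Gecko/20100101 Firefox/84.0"
}

def ids_from_page_url(page_url):
    """Return the two numeric path segments of a /b/<n>/<n>.html URL, or None."""
    match = re.search(r"/b/(\d+)/(\d+)\.html", page_url)
    return match.groups() if match else None

def fetch_json(api_url, params, session=None):
    """GET api_url with params and return the decoded JSON body."""
    session = session or requests.Session()
    resp = session.get(api_url, params=params, headers=HEADERS, timeout=10)
    resp.raise_for_status()
    return resp.json()

# Usage (not run here; the parameter name is an assumption):
# first_id, second_id = ids_from_page_url(
#     "https://www.mgtv.com/b/350683/11017269.html?fpa=76&fpos=3&lastp=ch_home")
# data = fetch_json(COMMENTS_API_URL, {"videoId": second_id})
```

Passing a session in keeps the function easy to try against a stub before pointing it at a live site.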

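小甲鱼的铁粉 posted on 2021-2-4 10:50:30

qq1151985918 posted on 2021-2-4 10:37
It's because you are scraping the video playback URL, not the comments URL.

It turned out to be JavaScript-based anti-scraping; the page has JS anti-scraping in it.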
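Judging from the posted output (a `__nuxt` container and a noscript notice asking to enable JavaScript), the likely mechanism is that the browser fills in the page with JavaScript after loading, which requests never runs. A rough heuristic for spotting such a shell page is sketched below; the function name `looks_client_rendered` and the 200-character threshold are my own assumptions, not anything from bs4 or the site.

```python
from bs4 import BeautifulSoup

def looks_client_rendered(html, min_text_chars=200):
    """Guess whether html is only a shell that JavaScript fills in later.

    Heuristic: the page has a <noscript> fallback and very little visible
    text once script/style/noscript elements are removed.
    """
    soup = BeautifulSoup(html, "html.parser")
    has_noscript = soup.find("noscript") is not None

    # Drop elements whose contents are not normally visible page text.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()

    visible_text = soup.get_text(" ", strip=True)
    return has_noscript and len(visible_text) < min_text_chars

# Usage with the response from the first post (not run here):
# if looks_client_rendered(response.text):
#     print("Content is probably loaded by JavaScript; look for the data request instead.")
```

When this returns True, parsing response.text further is unlikely to help; the data has to come from whatever request the page's scripts make, or from a tool that actually executes JavaScript.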
