BeautifulSoup: find() returns a tag but its text is empty (FishC Forum, Python discussion)

niceyes posted on 2021-8-21 15:16:49

BeautifulSoup: find() returns a tag but its text is empty

import requests
from bs4 import BeautifulSoup

response=requests.get("http://ncov.dxy.cn/ncovh5/view/pneumonia")
home_page=response.content.decode()
# print(home_page)
soup=BeautifulSoup(home_page,"lxml")
script=soup.find(id="getListByCountryTypeService2true")
text=script.text
print(len(text))

Why can't I get any content? len(text) comes out as 0.

suchocolate posted on 2021-8-21 15:24:37

import requests
from bs4 import BeautifulSoup

headers = {'user-agent': 'Mozilla'}
r = requests.get("http://ncov.dxy.cn/ncovh5/view/pneumonia", headers=headers)
soup = BeautifulSoup(r.text, "lxml")
script = soup.find(id="getListByCountryTypeService2true")
print(script)

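niceyes posted on 2021-8-21 17:00:08

Quoting suchocolate, 2021-8-21 15:24

First of all, thanks for the answer. However, your code is the same as mine: print(script) works in both, so there is no difference. What I need is print(script.text), that is, only the text inside the tag, without the tag itself.

Here is my understanding; I'm not sure it is correct:
print(script) includes the opening and closing tags
print(script.text) does not include the opening and closing tags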
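
For reference, a minimal sketch of that distinction. The HTML snippet and the id "demo" are made up for illustration and are not taken from the real page; an ordinary <p> tag is used because its .text behaves the same way across Beautiful Soup releases.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup, not the page from the thread.
html = '<p id="demo">hello <b>world</b></p>'
soup = BeautifulSoup(html, "html.parser")
tag = soup.find(id="demo")

# str(tag) is what print(tag) shows: the markup, tags included.
with_tags = str(tag)

# tag.text joins the text of all descendants and drops the markup.
text_only = tag.text

print(with_tags)
print(text_only)
```

So the understanding above is right for ordinary tags; the open question in this thread is why the same call gives an empty string on a <script> tag.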

niceyes posted on 2021-8-21 17:01:58

print(script.text) prints an empty string for me, and I don't understand what I'm doing wrong.

niceyes posted on 2021-8-21 17:10:43

script=soup.find(id="getListByCountryTypeService2true")
script is a <class 'bs4.element.Tag'> object rather than a string, so I can't run a regular expression on it, e.g. re.findall(r'\[.+\]',script)
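
A minimal sketch of one way around this, assuming the tag holds a single block of JavaScript: take the tag's .string (a str subclass) and run the regular expression on that instead of on the Tag. The markup, the id "demo", and the variable name window.data are hypothetical stand-ins, not the real page content.

```python
import json
import re

from bs4 import BeautifulSoup

# Hypothetical stand-in for a script tag that embeds a JSON array.
html = '<script id="demo">try { window.data = [{"name": "A"}, {"name": "B"}] } catch (e) {}</script>'
soup = BeautifulSoup(html, "html.parser")
script = soup.find(id="demo")

# re.findall(pattern, script) raises TypeError because script is a Tag.
# script.string is the single string child of the tag, which re accepts.
raw = script.string

# Greedy match from the first "[" to the last "]" on the line.
matches = re.findall(r"\[.+\]", raw)

# The matched text is JSON here, so it can be parsed into Python objects.
records = json.loads(matches[0])
print(records)
```

str(script) would also give a plain string, but it includes the surrounding <script> tags, so .string is usually the cleaner input for the pattern.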

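suchocolate posted on 2021-8-21 17:12:21

Last edited by suchocolate on 2021-8-21 17:15

Quoting niceyes, 2021-8-21 17:00
First of all, thanks for the answer. However, your code is the same as mine: print(script) works in both, so there is no difference. What I need is...

The output is normal on my machine.

Your code also returns text for me; it is not empty.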

niceyes posted on 2021-8-21 17:20:10

Quoting suchocolate, 2021-8-21 17:12
The output is normal on my machine.

Strange. Why does it return 0 for me?

niceyes posted on 2021-8-21 17:23:45

Quoting suchocolate, 2021-8-21 17:12
The output is normal on my machine.

print(script.attrs) only contains {'id': 'getListByCountryTypeService2true'}; there is no text in it.
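
A small sketch of why the text never shows up there: .attrs holds only the HTML attributes of the tag, while the body is stored as a child node. The markup and the id "demo" below are invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup, not the page from the thread.
html = '<script id="demo" type="text/javascript">var x = 1;</script>'
soup = BeautifulSoup(html, "html.parser")
script = soup.find(id="demo")

# Attributes written inside the opening tag, as a dict.
attrs = script.attrs

# Child nodes of the tag; the script body is the single string child.
children = script.contents

print(attrs)
print(children)
```

An attrs dict with only an id is therefore expected and says nothing about whether the tag has content.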

niceyes posted on 2021-8-21 17:28:43

But print(script) works fine; it is only print(script.text) that has no content.

niceyes posted on 2021-8-21 17:30:22

Also, running a regular expression directly on script raises an error.

白two posted on 2021-8-21 21:51:42

Try the string attribute instead. text doesn't work for me either, but string does. I'll keep looking into the reason.

import requests
from bs4 import BeautifulSoup

response = requests.get("http://ncov.dxy.cn/ncovh5/view/pneumonia")
home_page = response.content.decode()
# print(home_page)
soup = BeautifulSoup(home_page, "lxml")
script = soup.find(id="getListByCountryTypeService2true")
text = script.string
print(len(text))

Output:
190421

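niceyes posted on 2021-8-22 11:03:33

Quoting 白two, 2021-8-21 21:51
Try the string attribute instead. text doesn't work for me either, but string does. I'll keep looking into the reason.

Thanks for the effort. If you find the reason, don't forget to post it.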
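
A likely explanation, which nobody in this thread confirmed: the result of .text on a <script> tag appears to depend on the installed Beautiful Soup release. As far as I recall from the Beautiful Soup changelog, 4.9.0 started storing embedded JavaScript as a separate Script string type that get_text() and .text skip, and 4.11.0 changed this again so that calling .text directly on a <script> tag returns its body. That would explain why the same code printed text on one machine and 0 on another. The sketch below, using made-up markup and the id "demo", prints the installed version next to both lengths so the difference can be checked locally.

```python
import bs4
from bs4 import BeautifulSoup

# Hypothetical stand-in markup, not the page from the thread.
html = '<script id="demo">var data = [1, 2, 3];</script>'
soup = BeautifulSoup(html, "html.parser")
script = soup.find(id="demo")

# Which release is installed; the behaviour of .text on <script> varies with it.
print("bs4 version:", bs4.__version__)

# May be 0 on some releases (reportedly 4.9.0 through 4.10.x).
print("len(script.text):  ", len(script.text))

# .string returns the tag's single string child regardless of its type.
body = script.string
print("len(script.string):", len(body))
```

Under that assumption, .string is the safer choice for pulling data out of a <script> tag, and upgrading the package would be another way to make .text behave as expected.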
