Beginner learning web scraping: running the code raises AttributeError: 'NoneType' object has no attribute 't... (Python交流, 编程语言专区, 鱼C论坛)

wscshuai posted on 2020-12-25 14:00:29

Last edited by wscshuai on 2020-12-25 14:10

from docx import Document
import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36'}
url = 'https://splcgk.court.gov.cn/gzfwww//qwallist'

# Fetch list pages 1 and 2
for i in range(1,3):
    param = {
        'fbdw': '最高人民法院',
        'bt':'',
        'lx': 'lzdx',
        'pageNum': i
    }
    res = requests.post(url = url, headers = headers, params = param)
    case_json = res.json()
    case_lists = case_json['list']
    for case_list in case_lists:
        # Build the detail-page URL from the case id
        case_url = 'https://splcgk.court.gov.cn/gzfwww//qwal/qwalDetails?id='+case_list['cBh']

        res_case = requests.get(url = case_url, headers = headers)
        case = BeautifulSoup(res_case.text, 'html.parser')
        # Extract the title and the body text
        title = case.find('div',class_="fd-fix").find('h2').text
        content = case.find('p', style="text-align:left;").text

        # Save each case as its own .docx file
        document = Document()
        document.add_heading(title,0)
        p = document.add_paragraph(content)
        # document.add_page_break()
        document.save('{}.docx'.format(title))

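I'm a beginner. When I run this code, it fails with:

Traceback (most recent call last):
  File "C:\Users\Administrator\Desktop\指导案例1.py", line 33, in <module>
    content = case.find('p', style="text-align:left;").text
AttributeError: 'NoneType' object has no attribute 'text'

Guiding Case No. 111 and Guiding Case No. 112 from the first page were downloaded successfully, but none of the other cases were. What is going on? Is the problem with the URL, or is my tag selector wrong?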

suchocolate posted on 2020-12-26 12:33:22

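When the loop reaches the page https://splcgk.court.gov.cn/gzfwww//qwal/qwalDetails?id=ff808081635e1e1901650e5015f60e1a, the page structure is different.
On that page case.find('p', style="text-align:left;") matches nothing and returns None, so calling .text on it raises the error. You will need to inspect how that page is laid out.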
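
As a rough illustration of the point above, here is a minimal sketch of checking the result of find() before reading .text, so that one differently structured page does not stop the whole loop. The function name extract_case, the three HTML snippets, and the fallback selector are all made up for this example; the real page has to be inspected to learn which selector it actually needs.

```python
from bs4 import BeautifulSoup


def extract_case(html):
    """Return (title, content); either value is None if its tag is missing."""
    soup = BeautifulSoup(html, 'html.parser')

    # find() returns None when nothing matches, so check before using the result
    header = soup.find('div', class_='fd-fix')
    h2 = header.find('h2') if header is not None else None
    title = h2.get_text(strip=True) if h2 is not None else None

    p = soup.find('p', style='text-align:left;')
    if p is None:
        # Hypothetical fallback: accept any <p> whose style mentions text-align.
        # This is an assumption; the real page may need a different selector.
        p = soup.find('p', style=lambda s: s is not None and 'text-align' in s)
    content = p.get_text(strip=True) if p is not None else None
    return title, content


# Made-up snippets standing in for pages with different layouts
page_a = '<div class="fd-fix"><h2>Case A</h2></div><p style="text-align:left;">Body A</p>'
page_b = '<div class="fd-fix"><h2>Case B</h2></div><p style="text-align: justify;">Body B</p>'
page_c = '<div class="fd-fix"><h2>Case C</h2></div><span>Body C</span>'

for html in (page_a, page_b, page_c):
    title, content = extract_case(html)
    if content is None:
        # Skip the page and report it instead of crashing
        print('skipped (no matching <p>):', title)
    else:
        print(title, '->', content)
```

In the original loop, the same check could wrap the two find() calls: when content comes back as None, print case_url and continue to the next case, then open the printed URLs in a browser to see how their markup differs.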
