Original poster | Posted on 2018-12-13 13:37:55
- import pandas as pd  # not used yet; the script is unfinished
- import requests as res
- from bs4 import BeautifulSoup as bs
-
- url = 'https://news.sina.com.cn/'
- head = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.80 Safari/537.36'}
- res1 = res.get(url, headers=head)
- res1.encoding = res1.apparent_encoding
- soup = bs(res1.text, 'lxml')
- cont = soup.find('div', class_='ct_t_01')
- # print(cont)
- l = []
- # Looping over cont directly visits every child node, including bare text
- # nodes (plain strings), and any child without a matching h1 gives None,
- # so search for the h1 tags with find_all instead.
- if cont is not None:
-     for i in cont.find_all('h1', attrs={'data-client': 'headline'}):
-         tit = i.text
-         # content = cont.find_all('h1', attrs={'data-client': 'throw'})
-         l.append(tit)
- print(l)
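I haven't finished this code yet. I only wrote the first part and ran it to see whether there were any problems, and this is what happened...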
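For anyone trying to reproduce this without hitting the live site, here is a minimal sketch of the likely cause. It assumes the failure comes from looping over `cont` itself, which visits text nodes as well as tags. `SAMPLE_HTML`, `extract_headlines` and `child_types` are made-up names for illustration, the markup is only a guess at the page structure, and the sketch uses the built-in `html.parser` instead of `lxml` so it has one dependency fewer.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the Sina front page; the real markup may differ.
SAMPLE_HTML = """
<div class="ct_t_01">
  <h1 data-client="headline"><a href="#">First headline</a></h1>
  <p data-client="throw"><a href="#">Related link</a></p>
  <h1 data-client="headline"><a href="#">Second headline</a></h1>
</div>
"""


def extract_headlines(html):
    """Return the text of every headline h1 inside the ct_t_01 container."""
    soup = BeautifulSoup(html, "html.parser")
    cont = soup.find("div", class_="ct_t_01")
    if cont is None:  # container missing, e.g. the page layout changed
        return []
    # find_all only returns matching descendant tags, never text nodes.
    matches = cont.find_all("h1", attrs={"data-client": "headline"})
    return [h1.get_text(strip=True) for h1 in matches]


def child_types(html):
    """Show what a plain `for i in cont` loop actually iterates over."""
    soup = BeautifulSoup(html, "html.parser")
    cont = soup.find("div", class_="ct_t_01")
    return [type(child).__name__ for child in cont]


headlines = extract_headlines(SAMPLE_HTML)
print(headlines)
print(child_types(SAMPLE_HTML))
```

The second print shows that the children of the container are a mix of `Tag` and `NavigableString` objects. The whitespace between tags counts as a text node, which is why calling a tag-style `find(...)` on every child is fragile.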