Does anyone know what's wrong with the code below? It returns an empty list.
import requests
import bs4
url = 'https://www.ewi-psy.fu-berlin.de/en/mitarbeiterliste/index.html?show=profs'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36'}
response = requests.get(url, headers=headers)
soup = bs4.BeautifulSoup(response.text, 'html.parser')
url_targets = soup.find_all('div', class_=('box-staff-list-table-name', 'col-s-12', 'col-l-4'))
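The real data is in the AJAX response. The page loads the staff list with a separate request, so the HTML returned by the first requests.get does not contain those divs.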
import requests
from bs4 import BeautifulSoup

def main():
    # Basic information
    basic_url = 'https://www.ewi-psy.fu-berlin.de'
    url = 'https://www.ewi-psy.fu-berlin.de/en/mitarbeiterliste/index.html?show=profs'
    headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}
    ajax_headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
                    'X-Requested-With': 'XMLHttpRequest',
                    'Referer': 'https://www.ewi-psy.fu-berlin.de/en/mitarbeiterliste/index.html?show=profs'}
    # Get the AJAX URL from the placeholder div on the main page
    r = requests.get(url, headers=headers)
    soup = BeautifulSoup(r.text, 'html.parser')
    ajax_url = soup.find_all('div', class_="cms-box-ajax-content")[1]['data-ajax-url']
    # Build the full URL
    url = f'{basic_url}{ajax_url}'

    # Find the divs in the AJAX response
    r = requests.get(url, headers=ajax_headers)
    soup = BeautifulSoup(r.text, 'html.parser')
    url_targets = soup.find_all('div', class_=('box-staff-list-table-name', 'col-s-12', 'col-l-4'))
    print(url_targets)

if __name__ == "__main__":
    main()
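The code above picks the AJAX placeholder by position ([1]), which could break if the page layout changes. Here is a rough sketch of how you might list every data-ajax-url the page declares and turn each into an absolute URL, so you can check which one you actually need. It uses only the standard library so it can be tried offline. The names AjaxUrlCollector, find_ajax_urls and SAMPLE_HTML are made up for illustration, and the sample markup and its paths are hypothetical, not copied from the real site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AjaxUrlCollector(HTMLParser):
    """Collect the data-ajax-url attribute of every div that carries one."""

    def __init__(self):
        super().__init__()
        self.ajax_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        value = dict(attrs).get("data-ajax-url")
        if value:
            self.ajax_urls.append(value)


def find_ajax_urls(page_url, html):
    """Return the absolute AJAX URLs declared in html, in document order."""
    collector = AjaxUrlCollector()
    collector.feed(html)
    # urljoin resolves both root-relative and relative paths against the page URL
    return [urljoin(page_url, path) for path in collector.ajax_urls]


# Hypothetical markup: the real page's structure and paths may differ.
SAMPLE_HTML = """
<div class="cms-box-ajax-content" data-ajax-url="/en/example/navigation.html"></div>
<div class="cms-box-ajax-content" data-ajax-url="/en/example/staff-list.html"></div>
<div class="box-other">no ajax here</div>
"""

if __name__ == "__main__":
    page = "https://www.ewi-psy.fu-berlin.de/en/mitarbeiterliste/index.html?show=profs"
    for url in find_ajax_urls(page, SAMPLE_HTML):
        print(url)
```

With the real page you would pass r.text from the first request instead of SAMPLE_HTML, then request whichever URL turns out to hold the staff list, using the same AJAX headers as in the code above.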