import requests
from bs4 import BeautifulSoup
import pandas as pd

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 Safari/537.36"}

# Fetch the player list page and collect each All-Star player's URL and name
allstar_url = 'http://www.stat-nba.com/playerList.php'
allstar_html = requests.get(allstar_url, headers=headers)
soup = BeautifulSoup(allstar_html.text, 'lxml')

R = []            # player page URLs
Player_name = []  # player names
for i in soup.find_all('a', class_="allstarplayer"):
    # hrefs are relative ('./...'), so drop the leading dot before joining
    url = 'http://www.stat-nba.com/' + i['href'].lstrip('.')
    R.append(url)
    Player_name.append(i.span.string.strip().split('/')[0])

DATA = []
# Only the first two players for now, to keep the test run small
for page_url, name in zip(R[0:2], Player_name):
    html = requests.get(page_url, headers=headers)
    html.encoding = 'utf-8'
    soup = BeautifulSoup(html.text, 'lxml')
    table_t = []  # one header list per table
    table_d = []  # one list of rows per table
    for row in soup.find('div', class_="stat_box").find_all('table', class_="stat_box"):
        title = [t for t in row.thead.strings if t.strip() != '']
        table_t.append(title)
        data = [sort.text.strip('\n').split('\n') for sort in row.find_all('tr', class_="sort")]
        table_d.append(data)
    DATA += table_d
A question for everyone! I'm trying to scrape the popular players' data from http://www.stat-nba.com/playerList.php. Each player has five stat tables, and the data itself is scraped successfully. The problem now: how do I save each player's tables organized under that player's name?
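One possible approach (a sketch only, with made-up sample data standing in for the `table_t`/`table_d` lists built in the loop above): pair each header list with its rows via `zip`, build one `pandas.DataFrame` per table, and write each player's tables to CSV files named after the player, so the association between player and tables is never lost.

```python
import pandas as pd

# Hypothetical sample data mimicking table_t (headers) and table_d (rows)
# for one player with two tables; the real values come from the scraper above.
table_t = [["赛季", "出场", "得分"], ["赛季", "篮板"]]
table_d = [
    [["2018-19", "69", "27.3"], ["2019-20", "65", "20.8"]],
    [["2018-19", "5.3"]],
]

def tables_to_frames(table_t, table_d):
    """Pair each header list with its row list and build one DataFrame per table."""
    return [pd.DataFrame(rows, columns=header)
            for header, rows in zip(table_t, table_d)]

def save_player(name, frames):
    """Write each of a player's tables to its own CSV, keyed by the player's name.

    utf-8-sig keeps the Chinese headers readable when opened in Excel.
    """
    for i, df in enumerate(frames):
        df.to_csv(f"{name}_table{i}.csv", index=False, encoding="utf-8-sig")

frames = tables_to_frames(table_t, table_d)
save_player("Curry", frames)  # "Curry" is a placeholder for Player_name entries
```

In the main loop, this would mean calling `save_player(name, tables_to_frames(table_t, table_d))` at the end of each player's iteration instead of flattening everything into `DATA`.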