from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
import time
import csv
import requests
starturl = "http://www.tianqihoubao.com/aqi/hangzhou.html"
html = urlopen(starturl)
bsObj = BeautifulSoup(html, "lxml")
Sites = []
for link in bsObj.findAll(href=re.compile("^(/aqi/hangzhou-)")):
    site = "http://www.tianqihoubao.com"+link.attrs['href']
    print(site)
    time.sleep(2)
    Sites.append(site)
Sites.reverse()
for url in Sites[1:]:
    html = urlopen(url)
    bsObj = BeautifulSoup(html, "lxml")
    table = bsObj.findAll("table",{"cellpadding":"1"})[0]
    rows = table.findAll("tr")
    csvFile = open("AQI.csv", 'w', newline='', encoding='gb18030')
    writer = csv.writer(csvFile)
    try:
        for row in rows:
            csvRow = []
            for cell in row.findAll(['tr', 'td']):
                csvRow.append(cell.get_text())
            writer.writerow(csvRow)
    finally:
        csvFile.close()
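However, the resulting file only contains the data for the most recent month. How should I modify the code to get all of the data? Any help would be appreciated.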
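One likely cause, assuming the pasted code calls open("AQI.csv", 'w', ...) once per URL inside the for url in Sites[1:] loop (or only writes after the loop has finished): mode 'w' truncates the file every time it is opened, so each page overwrites the previous one and only the last page survives. A minimal sketch of the usual fix is to open the file once, before the loop, and write every page's rows into it. The function name save_pages and the sample rows below are hypothetical stand-ins for the scraped tables, so the example runs without network access.

```python
import csv
import os
import tempfile


def save_pages(pages, path, encoding="gb18030"):
    """Write the rows of every page into a single CSV file.

    `pages` is an iterable of pages; each page is a list of rows,
    and each row is a list of cell strings.
    """
    # Open the file ONCE, before looping over the pages.
    # Mode 'w' truncates the file on every open, so opening it
    # inside the loop would keep only the last page.
    with open(path, "w", newline="", encoding=encoding) as csv_file:
        writer = csv.writer(csv_file)
        for rows in pages:
            for row in rows:
                writer.writerow(row)


# Hypothetical sample data standing in for two scraped monthly tables.
sample_pages = [
    [["2018-01-01", "85"], ["2018-01-02", "90"]],
    [["2018-02-01", "70"]],
]

out_path = os.path.join(tempfile.mkdtemp(), "AQI.csv")
save_pages(sample_pages, out_path)

# Read the file back to check that rows from both pages were kept.
with open(out_path, newline="", encoding="gb18030") as f:
    saved_rows = list(csv.reader(f))
print(len(saved_rows))
```

In the script above, this would mean moving the open(...) and csv.writer(...) lines above for url in Sites[1:]: and leaving only the row-writing loop inside it. Opening the file in 'a' (append) mode inside the loop is an alternative, although it keeps adding to the file across repeated runs. Each monthly table probably carries its own header row, so the combined file may contain repeated headers that need to be skipped.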