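I noticed a few problems in your code. First, you import the `re` module but never use it, so the `import re` line can be deleted. Second, when creating the workbook and worksheet, call `Workbook()` (that is, `openpyxl.Workbook()`) to create the workbook and use `wb.active` to get the worksheet, rather than `Workbook` and `wb.activesheet`. Third, you pass no data to `sheet.append()`; pass the province code and name to it as a list. The revised code example is below: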
import requests
from bs4 import BeautifulSoup
from openpyxl import Workbook
from urllib.parse import urljoin

url = "https://www.stats.gov.cn/sj/tjbz/tjyqhdmhcxhfdm/2023/index.html"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2",
    "Accept-Encoding": "gzip, deflate, br",
    "Connection": "keep-alive",
    "Cookie": "wzws_sessionid=gWRmZGNjZaBmFfeqgjdlZDJkMIA2MC4xNjAuMTU2LjUw"
}
res = requests.get(url, headers=headers)
res.encoding = res.apparent_encoding
text = res.text

# Parse the page content with BeautifulSoup
soup = BeautifulSoup(text, 'lxml')

# Create the workbook and worksheet
wb = Workbook()
sheet = wb.active
sheet.title = '行政区划'

# Province-level codes
province_Num = []
province_list = soup.select('tr.provincetr a')
for province in province_list:
    href = province.get("href")
    # Drop the ".html" suffix (if present) so only the code remains
    province_code = href.replace('.html', '')
    print('province_code:', province_code)
    province_Num.append(province_code)
    province_name = province.text
    # Pass the province code and name to append() as one row (a list)
    sheet.append([province_code, province_name])
    wb.save('E:/Temp/行政区划1.xlsx')
print('Province-level divisions saved')

# City-level codes
city_Num = []
for num in province_Num:
    # Resolve the province page relative to the index URL; plain concatenation
    # (url + num + '.html') would be appended after ".../index.html"
    url1 = urljoin(url, num + '.html')
    # Send the same headers as the first request
    soup1 = BeautifulSoup(requests.get(url1, headers=headers).content, 'lxml')
    city_list = soup1.select('tr.citytr a')
    for city in city_list:
        href = city.get("href")
        city_code = href
        print('city_code:', city_code)
        city_Num.append(city_code)
        city_name = city.text
        print('city_name:', city_name)
        # Pass the city code and name to append() as one row (a list)
        sheet.append([city_code, city_name])
    # Save after each province so finished rows are already on disk
    wb.save('E:/Temp/行政区划1.xlsx')
print('City-level divisions saved')
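The city loop above builds each province page's URL from the index URL and a province code. As a minimal sketch of that step, assuming the province links are relative hrefs such as `11.html` (an assumed sample value, not read from the live page), `urllib.parse.urljoin` from the standard library can resolve them against the index URL. The helper name `province_page_url` is hypothetical.

```python
from urllib.parse import urljoin

INDEX_URL = "https://www.stats.gov.cn/sj/tjbz/tjyqhdmhcxhfdm/2023/index.html"


def province_page_url(index_url: str, href: str) -> str:
    """Resolve a relative href against the index page URL (hypothetical helper)."""
    return urljoin(index_url, href)


# "11.html" is an assumed sample href, used only for illustration.
sample_href = "11.html"
resolved = province_page_url(INDEX_URL, sample_href)
print(resolved)

# Plain concatenation keeps "index.html" in the path, so the result
# does not point at the province page.
concatenated = INDEX_URL + sample_href
print(concatenated)
```

Because `urljoin` replaces the last path segment of the base URL, the same call would also handle nested hrefs such as `11/1101.html`, should the lower-level pages use them.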
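With these changes, the province-level and city-level division codes and names are saved to the Excel file correctly. Also note that the workbook is saved inside the loops, so the data written so far is saved as the script runs. Hope this helps! If you have other questions, feel free to ask.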
The reply above comes from the FishC AI assistant. If it did not answer your question correctly, please keep asking.