Symptom:
Source code:
from selenium.webdriver import Chrome, ChromeOptions
from bs4 import BeautifulSoup
from pandas import DataFrame
from time import sleep
options = ChromeOptions()
options.add_argument("headless")
driver = Chrome(options=options)
url = 'https://movie.douban.com/explore'
driver.get(url)
sleep(2)
source = driver.page_source
soup = BeautifulSoup(source, 'lxml')
names = soup.select('.drc-subject-info-title-text')
names = [name.text for name in names]
types = soup.select('.drc-subject-info-subtitle')
types = [movieType.text for movieType in types]
data = []
for i, name in enumerate(names):
    data.append([name, types[i]])
df = DataFrame(data)
df.to_csv('1.csv', encoding='utf-8')
driver.quit()
(I have run it many times and the problem is still not solved.)
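This problem is probably caused by the encoding used when writing the CSV file. A file saved as plain UTF-8 has no byte order mark (BOM), so a program such as Excel may guess a different encoding when opening it and show the Chinese text as garbled characters. Try saving the CSV file with the UTF-8-sig encoding instead, that is, change encoding='utf-8' to encoding='utf-8-sig'.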
The modified code is as follows:
from selenium.webdriver import Chrome, ChromeOptions
from bs4 import BeautifulSoup
from pandas import DataFrame
from time import sleep
options = ChromeOptions()
options.add_argument("headless")
driver = Chrome(options=options)
url = 'https://movie.douban.com/explore'
driver.get(url)
sleep(2)
source = driver.page_source
soup = BeautifulSoup(source, 'lxml')
names = soup.select('.drc-subject-info-title-text')
names = [name.text for name in names]
types = soup.select('.drc-subject-info-subtitle')
types = [movieType.text for movieType in types]
data = []
for i, name in enumerate(names):
    data.append([name, types[i]])
df = DataFrame(data)
# utf-8-sig writes a BOM at the start of the file
df.to_csv('1.csv', encoding='utf-8-sig')
driver.quit()
With this change the CSV file should be saved correctly.
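To see what the encoding change actually does, here is a minimal sketch using only the standard library, so it runs without Selenium or pandas. The helper names write_csv and starts_with_bom, the file names, and the sample row are made up for illustration and are not part of the original script. It writes the same rows twice and checks whether each file begins with the UTF-8 BOM.

```python
import codecs
import csv
import os
import tempfile


def write_csv(path, rows, encoding):
    """Write rows to a CSV file using the given text encoding."""
    with open(path, "w", encoding=encoding, newline="") as f:
        csv.writer(f).writerows(rows)


def starts_with_bom(path):
    """Return True if the file begins with the 3-byte UTF-8 BOM."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


# Illustrative sample data (not scraped from the site).
rows = [["name", "type"], ["霸王别姬", "剧情"]]

tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, "plain.csv")
sig_path = os.path.join(tmp_dir, "sig.csv")

write_csv(plain_path, rows, "utf-8")      # no BOM written
write_csv(sig_path, rows, "utf-8-sig")    # BOM written first

print(starts_with_bom(plain_path))  # False
print(starts_with_bom(sig_path))    # True
```

The text content of the two files is identical. Only the three leading bytes differ, and those bytes are what lets a spreadsheet program recognize the file as UTF-8.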