Posted on 2020-4-1 08:07:25
From FishC Mobile
As soon as I run this, it opens the browser. Could someone help?
import requests
from lxml import etree

# Browser-like User-Agent sent with every request
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'
}

def get_chapter_urls(url):
    # Fetch the novel's index page and return absolute chapter URLs
    response = requests.get(url, headers=headers)
    response.encoding = response.apparent_encoding
    html = etree.HTML(response.text)
    hrefs = html.xpath('//div[@class="pt-chapter-cont-detail"]/a/@href')
    urls = []
    for href in hrefs:
        urls.append("http://www.cwwx.cc" + href)
    return urls

def get_chapter_text(url):
    # Fetch one chapter page and return its paragraphs joined by newlines
    response = requests.get(url, headers=headers)
    response.encoding = response.apparent_encoding
    html = etree.HTML(response.text)
    # The title is extracted but not written to the output file
    chapter_title = html.xpath('//div[@class="pt-read-title"]/h1/a/text()')[0]
    paragraphs = html.xpath('//div[@class="size16 color5 pt-read-text"]/p/text()')
    chapter_text = []
    for text in paragraphs:
        chapter_text.append(text.strip())
    return '\n'.join(chapter_text)

def save(filename, url):
    # Write every chapter, in index order, to <filename>.txt
    links = get_chapter_urls(url)
    with open(filename + '.txt', mode='w', encoding='utf-8') as f:
        for link in links:
            f.write(get_chapter_text(link))

bookname = input("Book title: ")
download_link = input("Novel URL: ")
save(bookname, download_link)
18274508008 posted on 2020-4-1 10:00
Nothing happens
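Set a breakpoint and step through the statements one at a time; I don't have any better idea.
See which statement triggers the browser opening.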
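Besides stepping through with breakpoints, one way to find the statement is to log a stack trace whenever a browser launch is attempted. The sketch below is only a guess at the cause: it assumes the browser is being opened through Python's standard webbrowser module, which nothing in this thread confirms. The names traced_open and calls are made up for illustration. If the browser is launched some other way, this hook never fires and breakpoints are still the way to go.

```python
import traceback
import webbrowser

calls = []  # hypothetical log of every attempted browser launch


def traced_open(url, new=0, autoraise=True):
    """Stand-in for webbrowser.open: record the caller instead of opening a browser."""
    # Drop the last frame so the trace ends at the statement that called us
    stack = traceback.extract_stack()[:-1]
    calls.append((url, stack))
    print(f"webbrowser.open called with {url!r} from:")
    print("".join(traceback.format_list(stack)))
    return False  # report that no browser was opened


# Assumption: the unwanted launch goes through webbrowser.open
webbrowser.open = traced_open
```

Put this at the very top of the script, before the other imports, then run it as usual. If anything calls webbrowser.open, the printed trace shows the file and line responsible.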