Crawler opens the browser as soon as it runs, Python Discussion, Programming Languages, FishC Forum

18274508008 posted on 2020-4-1 08:07:25

Crawler opens the browser as soon as it runs

The browser opens as soon as I run this script. Could an expert please help?

import requests
from lxml import etree

# Browser-like User-Agent sent with every request
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'
}


def get_chapter_urls(url):
    # Fetch the book's index page and return absolute chapter URLs
    response = requests.get(url, headers=headers)
    response.encoding = response.apparent_encoding
    html = etree.HTML(response.text)
    hrefs = html.xpath('//div[@class="pt-chapter-cont-detail"]/a/@href')
    urls = []
    for href in hrefs:
        urls.append("http://www.cwwx.cc" + href)
    return urls


def get_chapter_text(url):
    # Fetch one chapter page and return its body text, one paragraph per line
    response = requests.get(url, headers=headers)
    response.encoding = response.apparent_encoding
    html = etree.HTML(response.text)
    # The title is extracted here but is not written to the file
    chapter_title = html.xpath('//div[@class="pt-read-title"]/h1/a/text()')
    chapter_text_ = html.xpath('//div[@class="size16 color5 pt-read-text"]/p/text()')
    chapter_text = []
    for text in chapter_text_:
        chapter_text.append(text.strip())
    return '\n'.join(chapter_text)


def save(filename, url):
    # Download every chapter and write them into <filename>.txt
    links = get_chapter_urls(url)
    with open(filename + '.txt', mode='w', encoding='utf-8') as f:
        for link in links:
            f.write(get_chapter_text(link))


bookname = input("Book title: ")
download_link = input("Novel URL: ")
save(bookname, download_link)

18274508008 posted on 2020-4-1 08:21:55

Any experts out there? Please help.

_2_ posted on 2020-4-1 08:54:21

18274508008 posted on 2020-4-1 08:21
Any experts out there? Please help.

Close it.

18274508008 posted on 2020-4-1 09:52:19

_2_ posted on 2020-4-1 08:54
Close it.

Do you mean close the browser?

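_2_ posted on 2020-4-1 09:52:54

18274508008 posted on 2020-4-1 09:52
Do you mean close the browser?

Yes, then check whether the program is still running.
If that doesn't work, just minimize it.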

18274508008 posted on 2020-4-1 09:54:46

_2_ posted on 2020-4-1 09:52
Yes, then check whether the program is still running.
If that doesn't work, just minimize it.

That didn't help. It still opens the browser right away.

_2_ posted on 2020-4-1 09:55:25

18274508008 posted on 2020-4-1 09:54
That didn't help. It still opens the browser right away.

...Try minimizing it.

18274508008 posted on 2020-4-1 09:56:00

_2_ posted on 2020-4-1 09:55
...Try minimizing it.

I tried that.

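_2_ posted on 2020-4-1 09:56:39

18274508008 posted on 2020-4-1 09:54
That didn't help. It still opens the browser right away.

Maybe your program finished and the result was opened directly in the browser.
Try waiting a while.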

18274508008 posted on 2020-4-1 10:00:41

_2_ posted on 2020-4-1 09:56
Maybe your program finished and the result was opened directly in the browser.
Try waiting a while.

Nothing happens.

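_2_ posted on 2020-4-1 10:02:21

18274508008 posted on 2020-4-1 10:00
Nothing happens.

Set a breakpoint and step through the script line by line; I don't have a better idea.
See which statement triggers the browser to open.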
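
A rough sketch of one way to narrow this down without stepping through every line. It assumes (the thread does not confirm this) that the browser is being launched through Python's standard `webbrowser` module. It swaps `webbrowser.open` for a wrapper that prints the call stack instead of launching anything. The names `traced_open` and `calls` are made up for this illustration.

```python
import traceback
import webbrowser

# Records (url, stack) for every attempted browser launch.
calls = []


def traced_open(url, new=0, autoraise=True):
    """Stand-in for webbrowser.open: log who called it, launch nothing."""
    # Drop the last frame so the trace ends at the caller, not at this wrapper.
    stack = "".join(traceback.format_stack()[:-1])
    calls.append((url, stack))
    print(f"webbrowser.open({url!r}) was called from:\n{stack}")
    return True


# Install the wrapper before the rest of the script imports or runs anything.
webbrowser.open = traced_open
```

Put this at the very top of the script, before the other imports, and run it again. If a stack trace is printed, it points at the statement responsible. If nothing is printed and the browser still opens, the launch is probably coming from somewhere other than `webbrowser` (for example, the way the script itself is being started), and stepping through with a breakpoint is still the way to go.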

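18274508008 posted on 2020-4-1 10:03:59

_2_ posted on 2020-4-1 10:02
Set a breakpoint and step through the script line by line; I don't have a better idea.
See which statement triggers the browser to open.

OK, thanks.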
