新手求助

flag2020 · 发表于 2020-5-13 21:57:02

马上注册，结交更多好友，享用更多功能^_^

您需要登录才可以下载或查看，没有账号？立即注册

x

网上照抄了个爬虫代码，IDLE运行没有报错，却找不到输出文件在哪里，各位大神帮忙解答下怎么在windows下寻找文件存储路径或者帮忙指出哪里出问题了。

系统windows 10

代码如下：

import requests
import os
import time
import threading
from bs4 import BeautifulSoup

def download_page (url):
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; x64;rv:61.0) Gecko/20100101 Firefox/61.0"}
r = requests.get(url,headers=headers)
r.encoding="gb2312"
return r.text

def get_pic_list(html):
soup = BeautifulSoup(html,"html.parser")
pic_list = soup.find_all("li",class_="wp-item")
for i in pic_list:
      a_tag=i.find("h3",class_="tit").find("a")
      link=a_tag.get("href")
      text=a_tag.get_text()
      get_pic(link,text)

def get_pic(link,text):
html = download_page(link)
soup = BeautifulSoup(html,"html.parser")
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; x64;rv:61.0) Gecko/20100101 Firefox/61.0"}
create_dir("pic/{}".format(text))
for i in pic_list:
      pic_link=i.get("src")
      r = requests.get(url,headers=headers)
      with open ("pic/{}/{}".format(text,link.split('/')[-1]),"wb")as f:
         f.write(r.content)
         time.sleep(1)

def creat_dir(name):
if not os.path.exists(name):
      os.makedirs(name)

def execute(url):
page_html=download_page(url)
get_pic_list(page_html)

def main():
create_dir("pic")
queue=[i for i in range(1,72)]
thread=[]
while len(queue)>0:
      for thread in threads:
         if not thread.is_alive():
            threads.remove(thread)
         while len(threads)<5 and len(queue)>0:
            cur_page = queue.pop(0)
            url="http://meizitu.com/a/more_{}.html".format(cur_page)
            thread=threading.Thread(target=execue,args=(url,))
            thread,setDaemon(True)
            thread.start()
            print("{}正在下载{}".format(threading.current_thread().name,cur_page))
            threads.append(thread)

if __name__=="___main__":
main()

zltzlt · 发表于 2020-5-14 08:00:07

文件应该在程序所在目录中的 pic 文件夹下

_荟桐_ · 发表于 2020-5-14 23:58:19

文件开头加上
import os
print((os.getcwd())

flag2020 · 发表于 2020-5-15 11:11:52

_荟桐_ 发表于 2020-5-14 23:58
文件开头加上
import os
print((os.getcwd())

不成，没毛用

flag2020 · 发表于 2020-5-15 11:12:33

zltzlt 发表于 2020-5-14 08:00
文件应该在程序所在目录中的 pic 文件夹下

试了，没用~

_荟桐_ · 发表于 2020-5-15 18:33:59

flag2020 发表于 2020-5-15 11:11
不成，没毛用

你看看打印的路径是什么
文件应该就存在那里

Twilight6 · 发表于 2020-5-15 19:51:34

如何正确地发代码、上传图片和附件？
https://fishc.com.cn/thread-52272-1-1.html
(出处: 鱼C论坛)
下次这样发代码吧~~~

账号		自动登录	找回密码
密码			立即注册