FishC Forum


A question about an exception when using a thread pool

Posted on 2020-5-30 18:46:15

import requests
import parsel

# As posted, concurrent.futures is never imported in this version;
# that is what raises the NameError in the error output below.

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3756.400 QQBrowser/10.5.4043.400'}


def send_request(url):
    '''Request the data.'''
    response = requests.get(url=url, headers=headers, verify=False)
    return response


def parse_data(data):
    '''Parse the data.'''
    selector = parsel.Selector(data)
    result_list = selector.xpath('//a[@class="col-xs-6 col-sm-3"]')
    for result in result_list:
        img_url = result.xpath('./img/@data-original').extract_first()
        img_title = result.xpath('./img/@alt').extract_first()

        # Build the file name, taking the extension from the image URL
        all_title = img_title + '.' + img_url.split('.')[-1]
        yield all_title, img_url


def save_data(file_name, data):
    '''Save the data.'''
    with open('img\\' + file_name, mode='wb') as f:
        f.write(data)
        print('保存完成:', file_name)  # "Saved: <file name>"


def main(pages):
    '''Walk through the result pages.'''
    for page in range(1, pages + 1):
        # Banner: "Crawling page N"
        print('============正在爬取第{}页数据============'.format(page))
        thread_pool = concurrent.futures.ThreadPoolExecutor(max_workers=3)
        res = send_request('https://www.doutula.com/photo/list/?page={}'.format(page))
        src_url = parse_data(res.text)
        for file, url in src_url:
            image_response = send_request(url)
            thread_pool.submit(save_data, file, image_response.content)
        thread_pool.shutdown()


if __name__ == '__main__':
    main(10)


Error output:


C:\Users\Administrator\AppData\Local\Programs\Python\Python36\python.exe E:/python_fruit/表情包/表情包_ronot-多线程.py
Traceback (most recent call last):
  File "E:/python_fruit/表情包/表情包_ronot-多线程.py", line 49, in <module>
    main(10)
  File "E:/python_fruit/表情包/表情包_ronot-多线程.py", line 37, in main
    thread_pool = concurrent.futures.ThreadPoolExecutor(max_workers=3)
NameError: name 'concurrent' is not defined
============正在爬取第1页数据============

Process finished with exit code 1

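I have had some free time lately and have been learning about thread pools, but the bugs keep coming. I would appreciate it if fellow FishC members could point me to some documentation or other material on threading.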
Xiaojiayu's latest course -> https://ilovefishc.com
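One point that tends to matter when debugging a thread pool: an exception raised inside a function passed to submit() is not printed on its own. It is stored on the returned Future and re-raised only when result() is called. The sketch below illustrates this with the standard library alone; `risky` is a made-up stand-in task, not part of the crawler above.

```python
import concurrent.futures


def risky(n):
    """Hypothetical task that fails for one particular input."""
    if n == 2:
        raise ValueError("task {} failed".format(n))
    return n * 10


results, errors = [], []
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as pool:
    # Map each Future back to the argument it was submitted with.
    futures = {pool.submit(risky, n): n for n in range(4)}
    for future in concurrent.futures.as_completed(futures):
        n = futures[future]
        try:
            # result() re-raises whatever the task raised.
            results.append(future.result())
        except ValueError as exc:
            errors.append((n, str(exc)))

print(sorted(results))
print(errors)
```

If result() is never called, a failing task (for example, a save that cannot open its target file) passes silently, so it is usually worth collecting the futures and checking them.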

Original poster | Posted on 2020-5-31 07:08:58
I figured out why I got the error: I had left out import concurrent.futures.
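A minimal sketch of that fix, assuming nothing beyond the standard library: with the import in place the name `concurrent` resolves, and using the executor as a context manager means shutdown() is called automatically when the block ends. `square` is a made-up placeholder for the real save function.

```python
import concurrent.futures  # without this line the name 'concurrent' is undefined


def square(n):
    """Placeholder task; a real crawler would write a file here."""
    return n * n


# Leaving the with-block calls shutdown(), which waits for queued tasks.
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(square, n) for n in range(5)]

# All tasks have finished by now, so result() returns immediately.
results = [f.result() for f in futures]
print(results)
```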

Original poster | Posted on 2020-5-31 09:06:14
Can any fellow FishC member tell me what is going on with this "Zaun" (trash-talking) exception?
C:\Users\Administrator\AppData\Local\Programs\Python\Python36\python.exe E:/python_fruit/表情包/表情包_ronot-多线程.py
============正在爬取第1页数据============
C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\urllib3\connectionpool.py:986: InsecureRequestWarning: Unverified HTTPS request is being made to host 'www.doutula.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/e ... e.html#ssl-warnings
  InsecureRequestWarning,
Traceback (most recent call last):
  File "E:/python_fruit/表情包/表情包_ronot-多线程.py", line 58, in <module>
    main(10)
  File "E:/python_fruit/表情包/表情包_ronot-多线程.py", line 48, in main
    image_response = send_request(url)
  File "E:/python_fruit/表情包/表情包_ronot-多线程.py", line 16, in send_request
    response = requests.get(url = url, headers = headers, verify = False)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 516, in request
    prep = self.prepare_request(req)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 459, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\models.py", line 314, in prepare
    self.prepare_url(url, params)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\models.py", line 388, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL '一群渣渣': No schema supplied. Perhaps you meant http://一群渣渣?
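requests raises MissingSchema when the string passed as the URL has no scheme such as http:// or https://. Here the value that reached requests.get is '一群渣渣', which is plain text and not an address. A rough way to catch this before making the request, using only the standard library (`has_http_scheme` is a made-up helper name, not something from requests):

```python
from urllib.parse import urlsplit


def has_http_scheme(value):
    """Return True when value has an http(s) scheme and a host part."""
    parts = urlsplit(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


page_url_ok = has_http_scheme("https://www.doutula.com/photo/list/?page=1")
caption_ok = has_http_scheme("一群渣渣")
print(page_url_ok)
print(caption_ok)
```

Printing or checking each value just before it is passed to send_request is a quick way to find out which variable holds the wrong thing.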

Original poster | Posted on 2020-5-31 12:45:16
The "Zaun" exception I am currently hitting:
C:\Users\Administrator\AppData\Local\Programs\Python\Python36\python.exe E:/python_fruit/表情包/表情包_ronot-多线程.py
============正在爬取第1页数据============
C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\urllib3\connectionpool.py:986: InsecureRequestWarning: Unverified HTTPS request is being made to host 'www.doutula.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/e ... e.html#ssl-warnings
  InsecureRequestWarning,
Traceback (most recent call last):
  File "E:/python_fruit/表情包/表情包_ronot-多线程.py", line 58, in <module>
    main(10)
  File "E:/python_fruit/表情包/表情包_ronot-多线程.py", line 48, in main
    image_response = send_request(url)
  File "E:/python_fruit/表情包/表情包_ronot-多线程.py", line 16, in send_request
    response = requests.get(url = url, headers = headers, verify = False)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 516, in request
    prep = self.prepare_request(req)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 459, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\models.py", line 314, in prepare
    self.prepare_url(url, params)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\models.py", line 388, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL '一群渣渣': No schema supplied. Perhaps you meant http://一群渣渣?

Process finished with exit code 1

Original poster | Posted on 2020-5-31 15:08:21
Here is the latest version of my code:
import requests
import parsel
import re
import concurrent.futures


headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3756.400 QQBrowser/10.5.4043.400'}


def send_request(url):
    '''Request the data.'''
    response = requests.get(url=url, headers=headers, verify=False)
    return response


def parse_data(data):
    '''Parse the data.'''
    selector = parsel.Selector(data)
    result_list = selector.xpath('//a[@class="col-xs-6 col-sm-3"]')
    for result in result_list:
        # As posted: title is read from @data-original and src_url from @alt
        title = result.xpath('./img/@data-original').extract_first()
        src_url = result.xpath('./img/@alt').extract_first()

        # Build the file name with an extension
        all_title = title + '.' + src_url.split('.')[-1]
        yield all_title, src_url


def save_data(file_name, data):
    '''Save the data.'''
    with open('img\\' + file_name, mode='wb') as f:
        f.write(data)
        print('保存完成:', file_name)  # "Saved: <file name>"


def main(pages):
    '''Walk through the result pages.'''
    for page in range(1, pages + 1):
        # Banner: "Crawling page N"
        print('============正在爬取第{}页数据============'.format(page))
        thread_pool = concurrent.futures.ThreadPoolExecutor(max_workers=3)
        res = send_request('https://www.doutula.com/photo/list/?page={}'.format(page))
        src_url = parse_data(res.text)
        for file, url in src_url:
            image_response = send_request(url)
            thread_pool.submit(save_data, file, image_response.content)

        thread_pool.shutdown()


if __name__ == '__main__':
    main(10)



Feel free to try it yourselves.
There is one exception left, the "Zaun" exception:
C:\Users\Administrator\AppData\Local\Programs\Python\Python36\python.exe E:/python_fruit/表情包/表情包_ronot-多线程.py
============正在爬取第1页数据============
C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\urllib3\connectionpool.py:986: InsecureRequestWarning: Unverified HTTPS request is being made to host 'www.doutula.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/e ... e.html#ssl-warnings
  InsecureRequestWarning,
Traceback (most recent call last):
  File "E:/python_fruit/表情包/表情包_ronot-多线程.py", line 58, in <module>
    main(10)
  File "E:/python_fruit/表情包/表情包_ronot-多线程.py", line 48, in main
    image_response = send_request(url)
  File "E:/python_fruit/表情包/表情包_ronot-多线程.py", line 16, in send_request
    response = requests.get(url = url, headers = headers, verify = False)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 516, in request
    prep = self.prepare_request(req)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 459, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\models.py", line 314, in prepare
    self.prepare_url(url, params)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\models.py", line 388, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL '一群渣渣': No schema supplied. Perhaps you meant http://一群渣渣?

Process finished with exit code 1


I hope fellow FishC members can tell me why this exception occurs.
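A likely cause, judging only from the listing and the traceback: in parse_data, title is read from ./img/@data-original and src_url from ./img/@alt, which looks reversed. An alt attribute normally holds the caption, and the string in the error message, '一群渣渣', reads like a caption rather than an address, so the caption ends up being passed to send_request. The sketch below shows the mapping the other way round. To stay runnable without the site or parsel, it takes a plain dict in place of the xpath lookups; `build_item` and the sample URL are made up for illustration.

```python
def build_item(img_attrs):
    """Turn one <img> tag's attributes into (file_name, image_url).

    img_attrs is a plain dict standing in for the parsel lookups:
    'alt' corresponds to ./img/@alt,
    'data-original' corresponds to ./img/@data-original.
    """
    title = img_attrs["alt"]              # caption text, used as the file title
    src_url = img_attrs["data-original"]  # address of the image file
    suffix = src_url.split(".")[-1]       # extension taken from the URL
    return title + "." + suffix, src_url


# Hypothetical attribute values, invented for this example.
sample = {"alt": "一群渣渣", "data-original": "https://example.com/images/abc123.jpg"}
file_name, src_url = build_item(sample)
print(file_name)
print(src_url)
```

In the posted code this would amount to swapping the two xpath expressions, so that src_url comes from @data-original and title from @alt. Whether @data-original always holds a full URL on that site is an assumption worth checking by printing a few values.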