一只魈咸鱼 posted on 2021-8-11 09:58:48

requests.exceptions.ConnectionError raised in a crawler

The error is as follows:
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='www.netbian.comhttps', port=80): Max retries exceeded with url: //pic.netbian.com/ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000023052EC3220>: Failed to establish a new connection: getaddrinfo failed'))


The first get returns without any warning; the error only shows up at the second get. Adding verify=False did not fix it.


The code is as follows:
import requests
from bs4 import BeautifulSoup
import lxml
if __name__ == '__main__':
    # Fetch the list page and parse it
    url0 = 'http://www.netbian.com/dongman/'
    res0 = requests.get(url0)
    res0.encoding = 'gbk'
    text0 = res0.text
    soup0 = BeautifulSoup(text0, 'lxml')
    soup01 = soup0.find('div', class_='list')
    img_url_list1 = soup01.find_all('li')
    for each0 in img_url_list1:
        # Build the detail-page URL from the href of each list item
        middle_url = 'http://www.netbian.com' + \
                     each0.find('a').get('href')
        end_res = requests.get(middle_url, verify=False)
        end_res.encoding = 'gbk'
        end_soup = BeautifulSoup(end_res.text, 'lxml')
        print(end_soup.find('div', class_='pic'))
        name_url = end_soup.find('div', class_='pic').find('img')
        print(name_url)
        img_name = name_url.get('alt')
        img_url = name_url.get('src')
        print(img_name, img_url)
        # Download the image and save it under its alt text
        content = requests.get(img_url).content
        with open('彼岸桌面壁纸下载/' + img_name + '.jpg', 'wb') as f:
            f.write(content)
            print('Saved one!')

1q23w31 posted on 2021-8-11 10:25:10

Take a closer look at the HTML: the URL of the third image is not correct.
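
To see why that entry breaks, it may help to reproduce the URL the loop ends up requesting. The sketch below assumes the third entry's href is the absolute URL 'https://pic.netbian.com/'; that value is inferred from the traceback (host='www.netbian.comhttps', url //pic.netbian.com/) and is not taken from the page itself. The error is "getaddrinfo failed", i.e. a DNS lookup failure, which would also explain why verify=False has no effect: that option only controls TLS certificate verification.

```python
from urllib.parse import urlsplit

prefix = 'http://www.netbian.com'
# Assumed href of the third <li>, inferred from the traceback
href = 'https://pic.netbian.com/'

# Same string concatenation as in the posted loop
bad_url = prefix + href
parts = urlsplit(bad_url)

print(bad_url)         # the URL that requests actually tries to fetch
print(parts.hostname)  # the host name that fails to resolve
print(parts.path)      # the remainder, treated as the path
```

The host name and path printed here should line up with the host and url fields in the ConnectionError message.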

一只魈咸鱼 posted on 2021-8-11 15:24:47

That was indeed the problem!
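
One possible way to guard against this kind of malformed link, offered only as a sketch: build each URL with urllib.parse.urljoin, which leaves an already-absolute href unchanged, and skip links that point to another host. The helper names resolve_link and same_site and the sample href values are hypothetical; in the real crawler the hrefs would come from each0.find('a').get('href').

```python
from urllib.parse import urljoin, urlsplit

BASE_URL = 'http://www.netbian.com/dongman/'  # list page from the original post


def resolve_link(href, base=BASE_URL):
    """Return an absolute URL; an already-absolute href is left unchanged."""
    return urljoin(base, href)


def same_site(url, base=BASE_URL):
    """True if url points at the same host as base."""
    return urlsplit(url).hostname == urlsplit(base).hostname


# Hypothetical href values standing in for the ones scraped from the list page
hrefs = ['/desk/12345.htm', 'https://pic.netbian.com/']
links = [resolve_link(h) for h in hrefs]
detail_pages = [u for u in links if same_site(u)]
print(detail_pages)
```

Filtering by host is only one option; checking that end_soup.find('div', class_='pic') is not None before calling .find('img') would be another way to skip entries that are not wallpaper pages.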