- import requests
- from bs4 import BeautifulSoup
- import time
- url = "https://sc.chinaz.com/tupian/weimeiyijingtupian.html"
- resp = requests.get(url)
- resp.encoding = 'utf-8'
- obj = BeautifulSoup(resp.text, 'html.parser')
- # Collect every link inside the thumbnail list container.
- alist = obj.find('div', attrs={'class': 'text_left text_leftbq'}).find_all('a')
- for a in alist:
-     # Follow each link to the image's detail page.
-     href = 'https:' + a.get('href')
-     child_resp = requests.get(href)
-     child_resp.encoding = 'utf-8'
-     child_obj = BeautifulSoup(child_resp.text, 'html.parser')
-     # Locate the full-size image on the detail page and download it.
-     imga = child_obj.find('div', attrs={'class': 'imga'}).find('a').find('img')
-     img = 'https:' + imga.get('src')
-     img_resp = requests.get(img)
-     img_name = img.split('/')[-1]
-     with open('ibook/' + img_name, 'wb') as f:
-         f.write(img_resp.content)
-     print('开始下载:', img_name)  # "Starting download:"
-     time.sleep(2)
- print('下载完成!!!')  # "Download complete!!!"
This program uses bs4 to scrape images from a website,
but the output shows something unexpected: every image is downloaded twice.
- 开始下载: apic32712.jpg
- 开始下载: apic32712.jpg
- 开始下载: apic31743.jpg
- 开始下载: apic31743.jpg
- 开始下载: apic31609.jpg
- 开始下载: apic31609.jpg
Could someone take a look at this? Thanks!
- import requests
- from bs4 import BeautifulSoup
- import time
- url = "https://sc.chinaz.com/tupian/weimeiyijingtupian.html"
- resp = requests.get(url)
- resp.encoding = 'utf-8'
- obj = BeautifulSoup(resp.text, 'html.parser')
- alist2 = obj.find('div', attrs={'class': 'text_left text_leftbq'}).find_all('a')
- # Drop the last link, which this fix treats as not belonging to an image.
- del alist2[-1]
- # Keep every other link, since each image appears to be linked twice in a row.
- alist = []
- for i in range(0, len(alist2), 2):
-     alist.append(alist2[i])
- for a in alist:
-     # Follow each remaining link to the image's detail page.
-     href = 'https:' + a.get('href')
-     child_resp = requests.get(href)
-     child_resp.encoding = 'utf-8'
-     child_obj = BeautifulSoup(child_resp.text, 'html.parser')
-     # Locate the full-size image on the detail page and download it.
-     imga = child_obj.find('div', attrs={'class': 'imga'}).find('a').find('img')
-     img = 'https:' + imga.get('src')
-     img_resp = requests.get(img)
-     img_name = img.split('/')[-1]
-     with open('ibook/' + img_name, 'wb') as f:
-         f.write(img_resp.content)
-     print('开始下载:', img_name)  # "Starting download:"
-     time.sleep(2)
- print('下载完成!!!')  # "Download complete!!!"
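The fix above depends on the position of the links: it assumes each image contributes exactly two consecutive `a` tags and that the last link is not an image. A less position-dependent alternative might be to deduplicate the `href` values themselves while keeping their order. The sketch below is only an illustration under that assumption: `unique_hrefs` is a hypothetical helper and the sample paths are made up, not taken from the real page.

```python
def unique_hrefs(hrefs):
    """Return hrefs without duplicates or empty values, in first-seen order."""
    # dict.fromkeys keeps insertion order (Python 3.7+), so the first
    # occurrence of each href survives and later repeats are dropped.
    return list(dict.fromkeys(h for h in hrefs if h))


# Made-up sample mimicking a page where every image is linked twice.
sample = [
    '//sc.chinaz.com/tupian/a.htm',
    '//sc.chinaz.com/tupian/a.htm',
    '//sc.chinaz.com/tupian/b.htm',
    '//sc.chinaz.com/tupian/b.htm',
    None,  # an <a> tag without an href attribute
]

pages = ['https:' + h for h in unique_hrefs(sample)]
print(pages)
```

In the script above, the input could be built as `[a.get('href') for a in alist2]`. Any trailing link that is not an image page would still need to be filtered out separately, for example by checking that the detail page actually contains the `imga` div.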