import requests
from bs4 import BeautifulSoup
import time

url = "https://sc.chinaz.com/tupian/weimeiyijingtupian.html"
resp = requests.get(url)
resp.encoding = 'utf-8'
obj = BeautifulSoup(resp.text, 'html.parser')
# Collect every link inside the listing container.
alist = obj.find('div', attrs={'class': 'text_left text_leftbq'}).find_all('a')
for a in alist:
    # The hrefs are protocol-relative, so prepend the scheme.
    href = 'https:' + a.get('href')
    child_resp = requests.get(href)
    child_resp.encoding = 'utf-8'
    child_obj = BeautifulSoup(child_resp.text, 'html.parser')
    # Locate the full-size image on the detail page.
    imga = child_obj.find('div', attrs={'class': 'imga'}).find('a').find('img')
    img = 'https:' + imga.get('src')
    img_resp = requests.get(img)
    img_name = img.split('/')[-1]
    # The ibook/ directory must already exist.
    with open('ibook/' + img_name, 'wb') as f:
        f.write(img_resp.content)
    print('开始下载:', img_name)  # "Starting download:"
    time.sleep(2)  # pause between requests
print('下载完成!!!')  # "Download complete!!!"
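This program uses bs4 to scrape images from a website,
but running it produced an unexpected result: every image is downloaded twice ("开始下载" is the script's "Starting download" message):
开始下载: apic32712.jpg
开始下载: apic32712.jpg
开始下载: apic31743.jpg
开始下载: apic31743.jpg
开始下载: apic31609.jpg
开始下载: apic31609.jpg
Could someone take a look? Thanks!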
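import requests
from bs4 import BeautifulSoup
import time

url = "https://sc.chinaz.com/tupian/weimeiyijingtupian.html"
resp = requests.get(url)
resp.encoding = 'utf-8'
obj = BeautifulSoup(resp.text, 'html.parser')
# Collect every link inside the listing container.
alist2 = obj.find('div', attrs={'class': 'text_left text_leftbq'}).find_all('a')
# Drop the trailing link, then keep every other one. This relies on the
# links arriving in identical pairs, as the duplicated output suggests.
del alist2[-1]
alist = []
for i in range(0, len(alist2), 2):
    alist.append(alist2[i])
for a in alist:
    # The hrefs are protocol-relative, so prepend the scheme.
    href = 'https:' + a.get('href')
    child_resp = requests.get(href)
    child_resp.encoding = 'utf-8'
    child_obj = BeautifulSoup(child_resp.text, 'html.parser')
    # Locate the full-size image on the detail page.
    imga = child_obj.find('div', attrs={'class': 'imga'}).find('a').find('img')
    img = 'https:' + imga.get('src')
    img_resp = requests.get(img)
    img_name = img.split('/')[-1]
    # The ibook/ directory must already exist.
    with open('ibook/' + img_name, 'wb') as f:
        f.write(img_resp.content)
    print('开始下载:', img_name)  # "Starting download:"
    time.sleep(2)  # pause between requests
print('下载完成!!!')  # "Download complete!!!"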
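Taking every other link works only while the page keeps emitting the links in exact pairs. A possible alternative, sketched here under the assumption that the duplicate links carry identical href values, is to deduplicate the hrefs themselves while keeping their order. The helper name unique_in_order and the sample URLs are hypothetical and not part of the original post.

```python
def unique_in_order(items):
    """Return items with duplicates and empty values removed, keeping first-seen order."""
    # dict preserves insertion order (Python 3.7+), so fromkeys acts like an ordered set.
    return list(dict.fromkeys(item for item in items if item))


# Hypothetical hrefs shaped like a listing where each detail page is linked twice.
sample_hrefs = [
    "//sc.chinaz.com/tupian/a.html",
    "//sc.chinaz.com/tupian/a.html",
    "//sc.chinaz.com/tupian/b.html",
    "//sc.chinaz.com/tupian/b.html",
    None,  # a.get('href') returns None for an <a> tag without an href
]
print(unique_in_order(sample_hrefs))
```

In the script above this would presumably be applied as unique_in_order(a.get('href') for a in alist2), looping over the resulting href strings instead of the tags; that removes the need for del alist2[-1] and the step-2 range, though any unrelated trailing link would still have to be filtered out separately.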