Crawler part:
This part actually went without any real trouble:
import requests
from lxml import etree

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3947.100 Safari/537.36"}
pages = 0  # running count of downloaded images
for page in range(1, 779):
    url = "https://www.enterdesk.com/zhuomianbizhi/%s.html" % page
    response = requests.get(url=url, headers=headers)
    response.encoding = "utf-8"
    tree = etree.HTML(response.text)
    # Collect the src attribute of every <img> element on the listing page
    data = tree.xpath("//img/@src")
    for x in data:
        pages += 1
        imgcontent = requests.get(x)
        # Save each image under the last path segment of its URL
        with open(r'/pic/enterdesk/' + x.split("/")[-1], 'wb') as file:
            file.write(imgcontent.content)
print("Downloaded %d images in total" % pages)
'''
e:\>python ex20.py
Downloaded 12448 images in total
'''
It just cost me quite a lot of time: more than ten thousand images were downloaded successfully.
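Much of that time probably went to fetching every image one after another, although I did not measure it. Below is a rough sketch of how the download step could be run concurrently. It is not what I actually ran: the helper names (`filename_from_url`, `fetch_bytes`, `download_all`), the 10-second timeout, and the worker count of 8 are all assumptions chosen for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


def filename_from_url(img_url):
    """Use the last path segment as the file name, ignoring any query string."""
    return os.path.basename(urlsplit(img_url).path)


def fetch_bytes(img_url):
    """Hypothetical default fetcher built on requests (assumed, not from the original script)."""
    import requests  # imported lazily so the other helpers work without it

    # The timeout value is an assumption; it keeps one stalled request from hanging the run.
    resp = requests.get(img_url, timeout=10)
    resp.raise_for_status()
    return resp.content


def download_all(img_urls, out_dir, fetch=fetch_bytes, workers=8):
    """Download images in a thread pool and return how many were saved."""
    os.makedirs(out_dir, exist_ok=True)

    def save(img_url):
        try:
            data = fetch(img_url)
        except Exception:
            return 0  # skip a failed download instead of aborting the whole run
        with open(os.path.join(out_dir, filename_from_url(img_url)), "wb") as f:
            f.write(data)
        return 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(save, img_urls))
```

In the script above, the inner `for x in data:` loop could then be replaced by something like `pages += download_all(data, r'/pic/enterdesk/')`. The count would then reflect only the images that were actually written to disk.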