Original poster | Posted on 2020-2-26 20:07:43
Crawler part:
This part actually went without any real trouble:
- import requests
- from lxml import etree
- # Browser-like User-Agent so the site serves the normal page
- headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3947.100 Safari/537.36"}
- pages = 0  # counts downloaded images, not listing pages
- for page in range(1, 779):
-     url = "https://www.enterdesk.com/zhuomianbizhi/%s.html" % page
-     response = requests.get(url=url, headers=headers)
-     response.encoding = "utf-8"
-     tree = etree.HTML(response.text)
-     # Collect the src attribute of every <img> on the listing page
-     data = tree.xpath("//img/@src")
-     for x in data:
-         pages += 1
-         imgcontent = requests.get(x, headers=headers)
-         # Save under the last path component of the image URL
-         with open(r'/pic/enterdesk/' + x.split("/")[-1], 'wb') as file:
-             file.write(imgcontent.content)
- print("Downloaded %d images in total" % pages)
- '''
- e:\>python ex20.py
- Downloaded 12448 images in total
- '''
It just cost me quite a lot of time: more than ten thousand images were downloaded successfully.
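Most of that time probably goes into fetching the images one after another. Below is a rough sketch of how the inner download loop could be run on a thread pool instead. This is not what I actually ran: `image_filename`, `download_all`, the `fetch` callable and the `workers` default are all made-up names for illustration, and failed downloads are simply skipped.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


def image_filename(src):
    """Return the last path component of an image URL, ignoring any query string."""
    return os.path.basename(urlsplit(src).path)


def download_all(srcs, fetch, save_dir, workers=8):
    """Save every URL in srcs into save_dir using a thread pool.

    fetch is a hypothetical callable: it takes a URL and returns the image bytes.
    Returns the number of files actually written.
    """
    os.makedirs(save_dir, exist_ok=True)

    def save_one(src):
        try:
            content = fetch(src)
        except Exception:
            # Assumption: a failed image is skipped rather than aborting the run
            return 0
        with open(os.path.join(save_dir, image_filename(src)), "wb") as f:
            f.write(content)
        return 1

    # pool.map runs save_one concurrently and yields results in input order
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(save_one, srcs))
```

With requests, `fetch` could be something like `lambda u: requests.get(u, headers=headers, timeout=10).content`, and `srcs` would be the `data` list from the XPath query. Whether it really helps depends on how much the site throttles, so keep the worker count modest.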