Original poster | Posted on 2019-4-8 11:12:38

    # coding: utf-8
    import os
    import re
    from urllib.request import urlopen

    import requests

    # Target directory for the downloaded images (must end with a path separator).
    path = "e:\\reeoo.pic\\"
    if not os.path.exists(path):
        os.makedirs(path)
    else:
        print(path + ' directory already exists')

    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36'}
    url = "http://reeoo.com"
    res = requests.get(url, headers=headers, timeout=20)
    content = res.text
    # Collect every image URL of the form https://media.langtze.com/<name>.png!page
    ls2 = re.findall(r"https://media\.langtze\.com/[^.]+\.png!page", content)
    for x in ls2:
        # Drop the 26-character URL prefix and the trailing "!page" to get the file name.
        filename = path + x[26:-5]
        try:
            # Only spaces are escaped here; all other characters are passed through unchanged.
            url2 = x.replace(" ", "%20")
            response = urlopen(url2)
            with open(filename, 'wb') as fp:
                fp.write(response.read())
        except Exception:
            # Print every URL that failed to download.
            print(x)

The following files still fail to download:
https://media.langtze.com/Nam Insik’s Portfolio Site.png!page
https://media.langtze.com/和-水都饌菓.png!page
https://media.langtze.com/このラジオがヤバい.png!page
https://media.langtze.com/村山人形店.png!page
https://media.langtze.com/Hinderer & Wolff.png!page
https://media.langtze.com/66° Nord.png!page
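
The failing names all contain something other than plain ASCII letters and spaces: a curly apostrophe, CJK characters, `&`, or `°`. The script only escapes spaces, and `urlopen` cannot send non-ASCII characters in a request line, so that is a likely cause. Below is a minimal sketch of one possible fix. The helper name `encode_image_url` is made up for illustration, and it assumes that the server expects UTF-8 percent-encoding and that `&` may show up as `&amp;` in the page source; neither has been verified against the site.

```python
import html
from urllib.parse import quote, urlsplit, urlunsplit


def encode_image_url(raw_url):
    """Return raw_url with its path percent-encoded as UTF-8 (hypothetical helper)."""
    # Undo HTML entities such as &amp; (assumption: the page source may contain them).
    parts = urlsplit(html.unescape(raw_url))
    # quote() escapes spaces and non-ASCII characters; "/" and "!" are left untouched.
    safe_path = quote(parts.path, safe="/!")
    return urlunsplit((parts.scheme, parts.netloc, safe_path, parts.query, parts.fragment))
```

To try it, replace the space-only substitution in the loop with `url2 = encode_image_url(x)`. The file name is still taken from the raw match `x`, so if the page really does contain `&amp;`, it may be worth applying `html.unescape` to `x` before slicing as well.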