Catgu
Posted on 2020-5-28 13:43:18
Newbie here to learn
laolao程序猿
Posted on 2020-5-29 23:01:01
Love it, love it
fforwardd
Posted on 2020-5-30 08:57:22
good
七彩兔
Posted on 2020-5-30 20:54:01
{:7_132:}
sansansa
Posted on 2020-5-31 10:42:50
So impressive
1058704332
Posted on 2020-5-31 21:35:37
1
罗罗罗罗
Posted on 2020-6-2 20:27:42
666
cosyeria
Posted on 2020-6-2 21:12:13
1
bilibiliyo
Posted on 2020-6-3 00:26:04
Just taking a look
xspython
Posted on 2020-6-4 09:38:01
{:5_95:}
小魔芋
Posted on 2020-6-4 11:11:28
Here we go, beep beep beep~~~
吴半仙
Posted on 2020-6-4 12:25:57
{:5_110:}
nj2010
Posted on 2020-6-4 18:06:35
{:5_110:}{:5_110:}
BC菜鸡
Posted on 2020-6-5 22:37:39
111111
SeanNate
Posted on 2020-6-8 13:32:59
Thanks for sharing
xjk39809337
Posted on 2020-6-14 17:07:54
????/
WxxThu
Posted on 2020-6-14 20:02:02
That really suits the Master Roshi avatar
东北第一好汉
Posted on 2020-6-14 20:34:25
2020.6.14: put one together as a quick exercise
import os
import urllib.request
import urllib.parse
import bs4

def open_url(url):
    # Send a browser-like User-Agent with the request
    req = urllib.request.Request(url)
    req.add_header("User-Agent","Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36")
    response = urllib.request.urlopen(req)
    html = response.read()
    return html

# Find the page that follows the current one
def begin(url):
    html = open_url(url)
    html = html.decode("utf-8")
    soup = bs4.BeautifulSoup(html, "html.parser")
    # find() returns a single tag; find_all() returns a list, which has no .attrs
    temp = soup.find(class_="previous-comment-page")  # class_="current-comment-page"
    print("begin")
    if temp is None:
        return None  # no further page link was found
    # Resolve the href against the current page, so "//host/path" links get a scheme
    return urllib.parse.urljoin(url, temp.attrs["href"])

# Collect the download URLs of all images on the given page
def find_imgs(url_page):
    html = open_url(url_page).decode("utf-8")
    soup = bs4.BeautifulSoup(html, "html.parser")
    temp = soup.find_all("img", referrerpolicy="no-referrer")
    addrs = []
    for i in temp:
        print(i.attrs["src"])
        addrs.append(urllib.parse.urljoin(url_page, i.attrs["src"]))
    print("find")
    return addrs

# Save the images
def save_imgs(img_addrs):
    for url in img_addrs:
        html = open_url(url)
        name = url.split("/")[-1]
        with open(name, "wb") as f:
            f.write(html)
    print("save")

# Starting page
url = "http://jandan.net/ooxx"

def download(url, page=30):
    path = os.path.join(os.getcwd(), "aaa")
    os.makedirs(path, exist_ok=True)  # do not fail if the folder already exists
    os.chdir(path)
    save_imgs(find_imgs(url))  # download the images on the first page
    # Then loop over the following pages
    for i in range(page - 1):
        url_page = begin(url)
        if url_page is None:
            break
        save_imgs(find_imgs(url_page))
        url = url_page

if __name__ == "__main__":
    download(url)
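
The links this script scrapes are expected to be protocol-relative (starting with //), so they need a scheme before they can be fetched. A minimal sketch of one way to handle that with the standard library's urllib.parse.urljoin; the helper name absolutize and the example.com links are made up for illustration and are not taken from the site:

```python
from urllib.parse import urljoin


def absolutize(page_url, link):
    """Resolve a link found on page_url into an absolute URL (hypothetical helper)."""
    return urljoin(page_url, link)


page = "http://jandan.net/ooxx"

# Protocol-relative link: inherits the scheme of the page it was found on
print(absolutize(page, "//example.com/large/abc.jpg"))
# Root-relative link: inherits scheme and host
print(absolutize(page, "/ooxx/page-2"))
# Already absolute link: returned unchanged
print(absolutize(page, "http://example.com/x.jpg"))
```

Compared with prepending "http:" by hand, this also copes with links that are already absolute or that are relative to the site root.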
zhanghaofeng1
Posted on 2020-6-15 11:40:30
6
xiaoywy
Posted on 2020-6-15 15:21:46
ddsds