|
import urllib.request
import os
def get_page(url):
    req=urllib.request.Request(url)
    req.add_header('User-Agent','Mozilla/5.0 (Windows NT 10.0; WOW64; rv:61.0) Gecko/20100101 Firefox/61.0')
    request=urllib.request.urlopen(req)
    html=response.read().decode('utf-8')
    a=html.find('current-comment-page')+23
    b=html.find(']',a)
    print(html[a:b])
def find_imgs(url):

    pass
def save_imgs(folder,img_addrs):
    pass
def download_mm(folder='ooxx',pages=10):
    os.mkdir(folder)
    os.chdir(folder)
    url='http://jandan.net/ooxx'
    page_num=int(get_page(url))
    for i in range(pages):
        page_num-=i
        page_url=url+'page-'+str(page_num)+'#comments'
        img_addrs=find_imgs(page_url)
        save_imgs(folder,img_addrs)
if __name__=='__main__':
    download__mm
Then it shows:
Traceback (most recent call last):
File "D:/爬图.py", line 31, in <module>
download__mm
NameError: name 'download__mm' is not defined
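I know there will be errors, but why isn't the page number displayed?
One more question: in get_page(url) at the top, isn't url supposed to be a parameter? Why can the URL that is assigned inside another function also be used in this function? Shouldn't it be defined at the very beginning? I have watched all the earlier beginner-from-scratch videos; did I miss something? This clearly isn't inheritance either... |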
|
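A minimal sketch of how the argument gets from one function to the other, using only the names already in the post (the returned string is made up for illustration, and no network request is made). The `url` in `def get_page(url)` is just a parameter name. It takes its value from whatever the caller passes, so the `url` assigned inside `download_mm` reaches it through the call `get_page(url)`. The sketch also writes the final call with a single underscore and parentheses, since the traceback points at the bare name `download__mm`.

```python
def get_page(url):
    # `url` is only a parameter name; it receives whatever the caller passes in.
    return 'get_page received: ' + url


def download_mm():
    # This `url` is a separate local variable that belongs to download_mm.
    url = 'http://jandan.net/ooxx'
    # Passing it as an argument binds get_page's parameter to the same string.
    return get_page(url)


if __name__ == '__main__':
    # Single underscore in the name, and parentheses so the function is actually called.
    print(download_mm())
```

In the posted code, `get_page` prints the page number but does not return it, so `int(get_page(url))` would receive `None`; returning the value, as in the sketch, is probably what was intended. The result of `urlopen` is also stored in `request` while the next line reads from `response`, so that line would fail once the function is actually called.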