Original poster |
Posted on 2018-12-31 23:25:54
This post was last edited by jasmorning on 2018-12-31 23:32
import requests
import urllib.request
import re

def get_html(url):
    res = requests.get(url)
    res.encoding = res.apparent_encoding
    return res.text

def get_list():
    page = 1
    while True:
        if page > 2:
            break
        url = 'http://www.tu11.com/meituisiwatupian/list_2_' + str(page) + '.html'
        html = get_html(url)
        postfind = r'<a href="([^"]+.html)" t'
        postlistx = re.findall(postfind, html)
        for postlist in postlistx:
            postlist = r'http://www.tu11.com' + str(postlist)
            html = get_html(url)
            get_url(postlist)
            print(postlist)
        page += 1

### Open each entry in postlist, match the next page, match the jpg, loop.
def get_url(postlist):
    for imgx in postlist:
        print(imgx)

get_list()
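Sorry, sorry!
The first print(postlist) prints the expected result, but once the value goes into the for loop of the second function, the output looks completely different..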
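A minimal sketch of what appears to be going on, runnable without any network access: get_url is handed a single URL string, and a for loop over a str yields one character at a time, so print(imgx) would print the URL letter by letter. The sample URL and the helper names iter_chars and iter_urls are made up for illustration and are not part of the script above.

```python
def iter_chars(postlist):
    # Mirrors the loop in get_url: postlist is ONE string,
    # so iterating over it yields single characters.
    return [imgx for imgx in postlist]


def iter_urls(postlists):
    # Hypothetical variant: the argument is a list of URL strings,
    # so iterating over it yields whole URLs.
    return [imgx for imgx in postlists]


# Illustrative URL only, not taken from the real listing page.
sample = 'http://www.tu11.com/example.html'

print(iter_chars(sample)[:4])  # the first items are single characters
print(iter_urls([sample]))     # a single item: the whole URL
```

If that is indeed the cause, one option would be to pass the whole list (postlistx with the site prefix added to each entry) to get_url, or to treat the argument inside get_url as one URL and fetch it there instead of looping over it.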