Help needed: what is the correct regular expression?
import re
import requests

url = 'https://www.mnmulu.com/star/nvshen/'
headers = {'user-agent': 'firefox'}
rep = requests.get(url).text
print(rep)
# Intended to capture the album links that start with /photos
reb = re.compile(r'a href="((?<=/photos).*?(?=(\d/)))"')
res = re.findall(reb, rep)
print(res)
num = 67
for i in res:
    if i.startswith("/photos"):
        i1 = "https://www.mnmulu.com" + i
        red = requests.get(i1).text
        # Image addresses found on the album page
        rea = re.compile(r'src="(.+?\.jpg)"')
        rez = re.findall(rea, red)
        print(rez)
        for img in rez:
            b = requests.get(img, headers=headers)
            with open(r'E:\编程文件\项目一\我的图集\我的美图%s.jpg' % num, 'wb') as file:
                file.write(b.content)
            print('Image %s downloaded' % num)
            num += 1

First reply here, but I don't know how to do this either.
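A note on why the pattern in the question finds nothing: the lookbehind (?<=/photos) sits directly after the literal a href=", so at that position the preceding text always ends in href=" and can never end in /photos. One possible fix is to move the /photos prefix inside the capture group. Below is a minimal sketch on a made-up HTML fragment; the markup and paths are assumptions for illustration, not the site's real HTML.

```python
import re

# Hypothetical markup; the real page's HTML may differ.
sample_html = (
    '<a href="/photos/12345/">set one</a>'
    '<a href="/star/nvshen/">index</a>'
    '<a href="/photos/678/">set two</a>'
)

# Pattern from the question: the lookbehind contradicts the literal before it,
# so it cannot match anywhere.
broken = re.compile(r'a href="((?<=/photos).*?(?=(\d/)))"')
broken_links = broken.findall(sample_html)

# Alternative: keep the /photos prefix inside the single capture group.
fixed = re.compile(r'<a href="(/photos/[^"]*)"')
fixed_links = fixed.findall(sample_html)

print(broken_links)
print(fixed_links)
```

With a single capture group, re.findall returns plain strings, so a later i.startswith("/photos") check works on each item. With two groups, as in the original pattern, it would return tuples instead.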
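What are you trying to match?

Twilight6 posted on 2020-6-21 09:06
What are you trying to match?
The image link addresses. I can't find the image link addresses.

liminghu posted on 2020-6-21 16:08
The image link addresses. I can't find the image link addresses.
Wouldn't r'<img.*?href="(.+?)".*?>' work directly?

Last edited by suchocolate on 2020-6-22 18:31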
The image URLs on this site come from an AJAX request:

import requests

num = int(input('Number of pages to fetch: '))
base_url = 'https://www.mnmulu.com/star/nvshen/?ajax=1&v=20&page='
headers = {'user-agent': 'firefox', 'X-Requested-With': 'XMLHttpRequest'}
pics_list = []
for item in range(1, num + 1):
    url = base_url + str(item)
    r = requests.get(url, headers=headers)
    # Each page is JSON; the image entries are under msg -> list
    data = r.json()
    result = data['msg']['list']
    for x in result:
        print(x['imgUrl'])
        pics_list.append(x['imgUrl'])
print(pics_list)
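To go from the collected URLs to files on disk, something like the following might work. It is only a sketch: the helper names (extract_image_urls, fetch_bytes, save_images) and the 1.jpg, 2.jpg naming are hypothetical. It assumes the JSON has the msg -> list -> imgUrl shape used in the code above and that each imgUrl is a complete URL. It uses urllib from the standard library instead of requests so it runs without extra packages.

```python
import os
import urllib.request

HEADERS = {'User-Agent': 'firefox'}  # same user agent string as in the post above


def extract_image_urls(payload):
    """Collect imgUrl values from one decoded JSON page, skipping entries without one."""
    items = payload.get('msg', {}).get('list', [])
    return [x['imgUrl'] for x in items if x.get('imgUrl')]


def fetch_bytes(url):
    """Download one URL and return the raw bytes (standard library only)."""
    req = urllib.request.Request(url, headers=HEADERS)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read()


def save_images(urls, folder, fetch=fetch_bytes):
    """Write each image into folder as 1.jpg, 2.jpg, ... and return the file paths."""
    os.makedirs(folder, exist_ok=True)
    paths = []
    for n, url in enumerate(urls, start=1):
        path = os.path.join(folder, '%d.jpg' % n)
        with open(path, 'wb') as f:
            f.write(fetch(url))
        paths.append(path)
    return paths
```

For example, save_images(pics_list, 'pics') would write the files into a pics folder. The fetch parameter lets you pass your own download function, such as one built on requests.get(url, headers=headers).content.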