Scraping all Honor of Kings heroes and downloading their avatars

Pythonnewers · Posted on 2020-5-8 12:58:26

import os
import re
from urllib.request import urlretrieve

import requests

print("=============================")
Url = "https://pvp.qq.com/web201605/herolist.shtml"
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.14 Safari/537.36 Edg/83.0.478.13"
}
cwd = os.getcwd()


def here_name():
    # Create the output folder ("王者头像" means "Honor of Kings avatars");
    # exist_ok avoids an error when the folder already exists.
    save_dir = os.path.join(cwd, "王者头像")
    os.makedirs(save_dir, exist_ok=True)

    # Fetch the hero list page and decode it as GBK.
    html_ = requests.get(Url, headers=headers).content.decode("gbk")

    # Hero names are taken from the alt attribute of each 91px avatar <img>.
    heros = re.findall('height="91" alt="(.*?)">', html_, re.S)
    heros.pop()  # the last match is discarded
    with open(os.path.join(save_dir, "hero"), "w", encoding="utf-8") as f:
        for hero in heros:
            f.write(hero + "\n")
    print("=============================")
    print("Total heroes: %d" % len(heros))

    # Download every protocol-relative image ("//host/path") as 1.jpg, 2.jpg, ...
    pictures = re.findall('<img.*?src="(.*?)"', html_, re.S)
    x = 1
    for picture in pictures:
        if picture[:2] == "//":
            urlretrieve("http:" + picture, os.path.join(save_dir, str(x) + ".jpg"))
            x += 1


if __name__ == "__main__":
    here_name()

"""
<li><a href="herodetail/105.shtml" target="_blank"><img
src="//game.gtimg.cn/images/yxzj/img201606/heroimg/105/105.jpg" width="91"
height="91" alt="廉颇">廉颇</a></li>"""


The Lian Po (廉颇) snippet at the end is just something I was too lazy to delete.
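
The leftover Lian Po markup is handy for checking the two regular expressions without touching the network. The sketch below is only an illustration: the names `SAMPLE` and `extract` are made up here and are not part of the posted script, and it assumes the live page uses the same markup as that one sample entry.

```python
import re

# Sample markup copied from the string left at the end of the posted script.
SAMPLE = '''<li><a href="herodetail/105.shtml" target="_blank"><img
src="//game.gtimg.cn/images/yxzj/img201606/heroimg/105/105.jpg" width="91"
height="91" alt="廉颇">廉颇</a></li>'''


def extract(html):
    """Return (hero names, absolute image URLs) using the script's two patterns.

    Hypothetical helper for illustration; the posted script does this inline.
    """
    names = re.findall('height="91" alt="(.*?)">', html, re.S)
    srcs = re.findall('<img.*?src="(.*?)"', html, re.S)
    # Keep only protocol-relative sources ("//host/path"), as the script does,
    # and prefix them with "http:" to get a downloadable URL.
    urls = ["http:" + s for s in srcs if s[:2] == "//"]
    return names, urls


names, urls = extract(SAMPLE)
print(names)
print(urls)
```

On this one entry, each pattern yields a single match: the hero name from the `alt` attribute and the avatar URL from `src`. The real page may contain other `<img>` tags, so the two lists are not guaranteed to line up there.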

老八秘制 · Posted on 2020-5-8 12:59:45

I don't play Honor of Kings, but I came to take a look anyway.

Pythonnewers · Posted on 2020-5-8 13:02:46

老八秘制 posted on 2020-5-8 12:59
I don't play Honor of Kings, but I came to take a look anyway.

I posted this less than ten seconds ago...

