Scraping home-renovation photos from 房天下 (Fang.com) and saving them locally

wcq15759797758 · Posted on 2021-7-28 22:32:57


Thanks to all the experts here for stopping by.

from time import sleep
import os

import requests
from lxml import etree


def processing(strs):
    s = ''  # string that accumulates the result
    for n in strs:
        n = ''.join(n.split())  # strip all whitespace characters
        s = s + n  # append to the result
    return s  # return the joined string


def run(url):
    try:
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36',
                   'HOST': 'home.fang.com'}
        response = requests.get(url=url, headers=headers)
        if response.status_code == 200:
            html = etree.HTML(response.text)
            # each <li> is one album on the listing page
            div_list = html.xpath('//div[@class="photo_list"]/ul/li')
            for div in div_list:
                item = {}
                names = div.xpath('./ol/p/a/text()')
                name = processing(names)
                item['name'] = name
                hrefs = div.xpath('./ol/p/a/@href')
                href = 'https://home.fang.com' + processing(hrefs)
                item['href'] = href
                TP(img_url=href, headers=headers)
                sleep(5)  # pause between albums
    except requests.RequestException as e:
        # report network errors instead of silently swallowing every exception
        print('Request failed: ' + str(e))
        return None


def TP(img_url, headers):
    print('Downloading renovation photos from: ' + img_url)
    img_response = requests.get(url=img_url, headers=headers)
    img_html = etree.HTML(img_response.text)
    li_list = img_html.xpath('//div[@id="BoxUl"]/ul//li')
    for li in li_list:
        # image URL, taken from the src or src2 attribute
        img_pngs = li.xpath('./span/img/@src|./span/img/@src2')
        img_png = processing(img_pngs)
        img_names = li.xpath('./input/@value')
        img_name = processing(img_names) + '.jpg'
        try:
            headers1 = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36'}
            # check first, so files that already exist are not downloaded again
            if not os.path.exists(img_name):
                response_img = requests.get(url=img_png, headers=headers1)
                with open(img_name, 'wb') as f:
                    f.write(response_img.content)
                sleep(0.5)  # short pause between images
        except requests.ConnectionError:
            print('Save failed!')


if __name__ == '__main__':
    for page in range(1, 2):  # only page 1; widen the range for more pages
        url = f'https://home.fang.com/album/s24/{page}/'
        print('Analyzing: ' + url)
        run(url=url)
        sleep(5)


wcq15759797758 · Posted on 2021-7-28 22:34:31

When I ran the crawl myself, I added proxy IPs.
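
The posted script does not include the proxy part. As a rough sketch only, this is one common way to route `requests` calls through a proxy picked at random from a pool. The addresses in `PROXY_POOL` and the helpers `pick_proxies` and `fetch` are hypothetical placeholders, not what the author actually used.

```python
import random

# Hypothetical proxy endpoints; replace them with proxies you are allowed to use.
PROXY_POOL = [
    "http://127.0.0.1:8001",
    "http://127.0.0.1:8002",
]


def pick_proxies(pool):
    """Return a requests-style proxies mapping for one randomly chosen proxy."""
    proxy = random.choice(pool)
    return {"http": proxy, "https": proxy}


def fetch(url, headers, pool=PROXY_POOL, timeout=10):
    """Fetch url through a randomly chosen proxy from pool."""
    # Imported here so pick_proxies can be used without requests installed.
    import requests
    return requests.get(url, headers=headers,
                        proxies=pick_proxies(pool), timeout=timeout)
```

In the original script, calls such as `requests.get(url=url, headers=headers)` could be swapped for `fetch(url, headers)` under these assumptions.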

超级玛尼哄 · Posted on 2021-7-28 22:47:02

Here to learn, and to pick up a coin while I'm at it.

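wp231957 · Posted on 2021-7-29 06:45:11

Scraping static pages is much the same everywhere. When you have time, try scraping dynamic pages and getting around anti-scraping measures.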

鸬鹚鸟 · Posted on 2021-7-29 07:26:38

6666

菜鸡10086 · Posted on 2021-7-29 11:16:06

Go, go, go!

hornwong · Posted on 2021-7-29 11:46:46

Thanks for sharing!

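wcq15759797758 · Posted on 2021-7-29 12:11:31

wp231957 posted on 2021-7-29 06:45
Scraping static pages is much the same everywhere. When you have time, try scraping dynamic pages and getting around anti-scraping measures.

Will do.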

OD_lrean · Posted on 2021-7-29 14:35:16

You can earn coins just by studying??

枫叶向上_ · Posted on 2021-7-29 15:07:12

Learning!

懒狗李 · Posted on 2021-7-29 16:07:49

fxj2002 · Posted on 2021-7-29 19:53:03

sxztry · Posted on 2021-7-30 10:39:03

I'm still learning how to get past anti-scraping measures.

犬来猫荒 · Posted on 2021-7-30 10:39:08

aironeng · Posted on 2021-8-3 10:05:16

Just here to learn.
