Web Scraping

Posted on 2021-11-2 23:32:17

Urgent! Urgent! Urgent!
Task: visit https://www.suning.com
Use selenium to type "iphone" into the search box and jump to the iphone product results page.
Scrape the name, price, and review data for every product and save it to a JSON file.

Does anyone know how to do this one? I've already scraped the page content, but I don't know how to extract the fields or save them as a JSON file.
My code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from time import sleep
import json

driver = webdriver.Chrome()
driver.get('https://www.suning.com')
driver.maximize_window()
wait = WebDriverWait(driver, 10)

# Type "iphone" into the search box and submit with Enter
_input = wait.until(EC.presence_of_element_located((By.ID, 'searchKeywords')))
_input.clear()
_input.send_keys('iphone')
_input.send_keys(Keys.ENTER)

# Scroll to the bottom so lazy-loaded products render
js = 'window.scrollTo(0,document.body.scrollHeight)'
driver.execute_script(js)

p = driver.page_source  # full page HTML (not used yet)
# '#product-list' matches the results container, so t is a one-element list
t = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '#product-list')))
sleep(10)
print(t)
for item in t:
    print(item.text)
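
For the part the question is stuck on, here is a minimal sketch that stays inside Selenium (no lxml): walk the product cards, pull out the three fields, and dump the list of dicts with json.dump. The selectors li.item-wrap, span.def-price, a[aria-label] and a[tabindex="-1"] i are assumptions lifted from the reply below and would need to be checked against the live page; products.json is just a placeholder filename.

import json
from selenium.webdriver.common.by import By

# NOTE: the CSS selectors below are assumptions and may need adjusting to the live page
cards = driver.find_elements(By.CSS_SELECTOR, 'li.item-wrap')  # one element per product card
result = []
for card in cards:
    try:
        name = card.find_element(By.CSS_SELECTOR, 'a[aria-label]').get_attribute('aria-label')
        price = card.find_element(By.CSS_SELECTOR, 'span.def-price').text
        reviews = card.find_element(By.CSS_SELECTOR, 'a[tabindex="-1"] i').text
    except Exception:
        continue  # skip cards that are missing any of the three fields
    result.append({'name': name, 'price': price, 'reviews': reviews})

with open('products.json', 'w', encoding='utf-8') as f:
    json.dump(result, f, ensure_ascii=False, indent=2)  # ensure_ascii=False keeps Chinese text readable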
Reply posted on 2021-11-3 11:38:54
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
import json
import time
from lxml import etree

driver = webdriver.Chrome()
driver.get('https://www.suning.com')
driver.maximize_window()
wait = WebDriverWait(driver, 10)
_input = wait.until(EC.presence_of_element_located((By.ID, 'searchKeywords')))
_input.clear()
_input.send_keys('iphone')
_input.send_keys(Keys.ENTER)
js = 'window.scrollTo(0,document.body.scrollHeight)'
driver.execute_script(js)
time.sleep(3)
wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '#product-list')))
# Parse the rendered page with lxml instead of locating each element through Selenium
html = etree.HTML(driver.page_source)
lis = html.xpath('//li[contains(@class,"item-wrap")]')  # one <li> per product card
result = []
for li in lis:
    price = li.xpath('.//span[@class="def-price"]/text()')[-1]        # price text (last text node)
    description = li.xpath('.//a[@tabindex="0"]/@aria-label')[0]      # product name/description
    comment = li.xpath('.//a[@tabindex="-1"]/i/text()')[0]            # review count
    data = {'price': price, 'description': description, 'comment': comment}
    result.append(data)
with open('test.json', 'w', encoding='utf-8') as f:
    f.write(json.dumps(result, ensure_ascii=False))
    # f.write(json.dumps(result, indent=2, ensure_ascii=False))  # use this instead if you want indented output
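
A side note on the last two lines: json.dump writes straight to the file handle, so the intermediate string from json.dumps is not needed, and json.load reads the file back for a quick sanity check. This sketch continues the code above (same result list and test.json filename).

# Equivalent to f.write(json.dumps(result, ...)), but streams directly to the file
with open('test.json', 'w', encoding='utf-8') as f:
    json.dump(result, f, ensure_ascii=False, indent=2)

# Read the file back and count the records as a quick check
with open('test.json', encoding='utf-8') as f:
    print(len(json.load(f)), 'products saved')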