Last edited by zhangsonghong00 on 2021-12-15 19:04
Here is the code:
from selenium import webdriver
import time

def serch_product():
    driver.find_element_by_xpath('//*[@id="q"]').send_keys('卫衣女')
    driver.find_element_by_xpath('//*[@id="J_TSearchForm"]/div[1]/button').click()  # click the search button
    driver.find_element_by_xpath('//*[@id="login"]/div[1]/i').click()  # switch from password login to QR-code login
    time.sleep(10)
    driver.find_element_by_xpath('//*[@id="J_relative"]/div[1]/div/ul/li[2]/a').click()  # click "sort by sales"

def drop_down():
    for x in range(1,11,2):
        time.sleep(0.5)
        j=x/10
        js='document.documentElement.scrollTop=document.documentElement.scrollHeight * %f'% j
        driver.execute_script(js)  # scroll down so that all product info gets loaded

def get_product():
    divs=driver.find_elements_by_xpath('//*div[@class="items"]/div[@class="item J_MouserOnverReq item-ad "]')  # scrape the product data
    print(divs)
    for div in divs:
        image=div.find_element_by_xpath('.//div[@class="pic"]a/img').get_attribute('stc')
        price=div.find_element_by_xpath('.//div[@id="J_Itemlist_TLink_577012675835"]').get_attribute('trace-price')+'元'
        deal=div.find_element_by_xpath('.//div[@class="deal-cnt""]').text
        info=div.find_element_by_xpath('.//div[@class="row row-2 title"]').text
        position=div.find_element_by_xpath('.//div[@class="row row-3 g-clearfix"]/div[@class="location]').text
        name=div.find_element_by_xpath('.//div[@class="shop"]/a/span[2]').text
        product={'标题':info,'价格':price,'订单量':deal,'图片':image,'店铺名字':name,'地理位置':position}
        print(product)

if __name__=='__main__':
    driver=webdriver.Chrome()
    driver.get('https://www.taobao.com/')
    serch_product()
    drop_down()
    get_product()
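End of code.
Could someone more experienced help me with the following:
1. This is a Selenium scraper for Taobao product listings. Over several test runs, search and login worked, and scrolling down the page (to load the product data) worked.
But why do the clicks for login and for sorting by sales still get flagged in red with this message: DeprecationWarning: find_element_by_* commands are deprecated. Please use find_element() instead
2. When I print the parsed data after scraping, nothing shows up. The error output contains Stacktrace: and Backtrace:
What does this error mean, and why can't I print the scraped data?
Here is the error output: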
Warning (from warnings module):
File "C:\Users\Administrator\Desktop\编程\淘宝爬虫测试.py", line 23
divs=driver.find_elements_by_xpath('//*div[@class="items"]/div[@class="item J_MouserOnverReq item-ad "]')
DeprecationWarning: find_elements_by_* commands are deprecated. Please use find_elements() instead
Traceback (most recent call last):
File "C:\Users\Administrator\Desktop\编程\淘宝爬虫测试.py", line 42, in <module>
get_product()
File "C:\Users\Administrator\Desktop\编程\淘宝爬虫测试.py", line 23, in get_product
divs=driver.find_elements_by_xpath('//*div[@class="items"]/div[@class="item J_MouserOnverReq item-ad "]')
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 543, in find_elements_by_xpath
return self.find_elements(by=By.XPATH, value=xpath)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 1279, in find_elements
return self.execute(Command.FIND_ELEMENTS, {
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 424, in execute
self.error_handler.check_response(response)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 247, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidSelectorException: Message: invalid selector: Unable to locate an element with the xpath expression //*div[@class="items"]/div[@class="item J_MouserOnverReq item-ad "] because of the following error:
SyntaxError: Failed to execute 'evaluate' on 'Document': The string '//*div[@class="items"]/div[@class="item J_MouserOnverReq item-ad "]' is not a valid XPath expression.
(Session info: chrome=91.0.4472.77)
Stacktrace:
Backtrace:
Ordinal0 [0x00BA2DB3+2502067]
Ordinal0 [0x00B3C5B1+2082225]
Ordinal0 [0x00A42498+1057944]
Ordinal0 [0x00A44B5A+1067866]
Ordinal0 [0x00A44A1E+1067550]
Ordinal0 [0x00A44C40+1068096]
Ordinal0 [0x00A6CAC8+1231560]
Ordinal0 [0x00A967CC+1402828]
Ordinal0 [0x00A85D5A+1334618]
Ordinal0 [0x00A94B7B+1395579]
Ordinal0 [0x00A85BEB+1334251]
Ordinal0 [0x00A62174+1188212]
Ordinal0 [0x00A63009+1191945]
GetHandleVerifier [0x00D1EC5C+1511084]
GetHandleVerifier [0x00DC8522+2205554]
GetHandleVerifier [0x00C23393+480739]
GetHandleVerifier [0x00C22579+477129]
Ordinal0 [0x00B41E5D+2104925]
Ordinal0 [0x00B463F8+2122744]
Ordinal0 [0x00B46537+2123063]
Ordinal0 [0x00B4EE53+2158163]
BaseThreadInitThunk [0x75E5FA29+25]
RtlGetAppContainerNamedObjectPath [0x778875F4+228]
RtlGetAppContainerNamedObjectPath [0x778875C4+180]
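The InvalidSelectorException in the traceback above comes from the XPath string itself, not from the page. `//*div[...]` is not valid XPath, because `*` and `div` are both node tests and only one of them can follow `//`. Chrome rejects the expression before any lookup happens, so get_product() stops at its first line and nothing is printed. The Stacktrace: / Backtrace: lines are only chromedriver's native call stack and add no further information. Below is a rough sketch of get_product() with the selector syntax repaired. Several parts are assumptions, not facts about Taobao's page: that 'stc' was meant to be 'src', that the item-specific id J_Itemlist_TLink_577012675835 can be generalised with starts-with(), that matching on the J_MouserOnverReq class also covers non-ad items, and the English dictionary keys. The class names are taken from the post and have not been checked against the live page.

```python
# By.XPATH from selenium.webdriver.common.by is the plain string "xpath";
# it is spelled out here so the sketch has no import-time dependencies.
XPATH = "xpath"

# '//*div[...]' is invalid XPath; '//div[...]' is the valid form.
# contains() is an assumption: it matches items with or without "item-ad".
ITEMS = '//div[@class="items"]/div[contains(@class, "J_MouserOnverReq")]'


def get_product(driver):
    """Collect one dict per product card found on the current page."""
    products = []
    for item in driver.find_elements(XPATH, ITEMS):
        # Added the missing '/' before 'a'; 'src' assumes 'stc' was a typo.
        image = item.find_element(
            XPATH, './/div[@class="pic"]/a/img').get_attribute("src")
        # Assumption: an element whose id starts with this prefix carries
        # the trace-price attribute (the post hard-coded a single item id).
        price = item.find_element(
            XPATH, './/*[starts-with(@id, "J_Itemlist_TLink_")]'
        ).get_attribute("trace-price")
        # Removed the stray extra quote after "deal-cnt".
        deal = item.find_element(XPATH, './/div[@class="deal-cnt"]').text
        info = item.find_element(
            XPATH, './/div[@class="row row-2 title"]').text
        # Added the missing closing quote after "location".
        position = item.find_element(
            XPATH,
            './/div[@class="row row-3 g-clearfix"]/div[@class="location"]',
        ).text
        name = item.find_element(
            XPATH, './/div[@class="shop"]/a/span[2]').text
        products.append({
            "title": info,
            "price": f"{price}元" if price is not None else None,
            "orders": deal,
            "image": image,
            "shop": name,
            "location": position,
        })
    return products
```

With a real browser session this would be called after scrolling, for example `for p in get_product(driver): print(p)`. If a card lacks one of the elements, find_element raises NoSuchElementException, so the individual selectors may still need adjusting against the actual page source.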
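As for the DeprecationWarning: it is a warning, not an error, and the script keeps running after it is printed, which is why search, login and sorting still work. Selenium 4 deprecated the find_element_by_* and find_elements_by_* helpers in favour of find_element(by, value) and find_elements(by, value), as the message itself says. A minimal sketch of serch_product() migrated to the new call style follows. Passing driver, keyword and login_wait as parameters, and the XPATH constant, are choices made for this sketch only. In a real script you would normally import By from selenium.webdriver.common.by and pass By.XPATH, which has the same value.

```python
import time

# Same value as By.XPATH in selenium.webdriver.common.by.
XPATH = "xpath"


def serch_product(driver, keyword, login_wait=10):
    """Search, switch to QR-code login, wait, then sort by sales."""
    # Old style: driver.find_element_by_xpath('//*[@id="q"]')
    # New style: driver.find_element(XPATH, '//*[@id="q"]')
    driver.find_element(XPATH, '//*[@id="q"]').send_keys(keyword)
    driver.find_element(
        XPATH, '//*[@id="J_TSearchForm"]/div[1]/button').click()
    driver.find_element(XPATH, '//*[@id="login"]/div[1]/i').click()
    time.sleep(login_wait)  # time to scan the QR code by hand
    driver.find_element(
        XPATH, '//*[@id="J_relative"]/div[1]/div/ul/li[2]/a').click()
```

Called as `serch_product(driver, '卫衣女')` with a webdriver.Chrome() instance, this should behave like the original function without triggering the warning.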