from selenium import webdriver
from time import sleep
from selenium.webdriver.common.by import By
from lxml import html
etree = html.etree
# Create the browser object
bro = webdriver.Chrome(r'D:/技能/chromedriver.exe')
bro.get('https://xiaoyuan.zhaopin.com/')
sleep(2)
bro.find_element(By.XPATH,'//*[@id="root"]/div/div[1]/div/div[1]/div[2]/a[1]').click()
sleep(5)
bro.find_element(By.XPATH,'//*[@id="root"]/div[2]/div/div/div[3]/span[2]').click()
sleep(2)
s = bro.find_element(By.XPATH,'//*[@id="search_input_one"]').send_keys('java')
sleep(1)
d = bro.find_element(By.XPATH,'//*[@id="root"]/div/div[2]/div/div/div[2]/div/span[2]').click()
sleep(1)
response = bro.page_source
print(response)
bro.quit()
Hi all, why can't I get the page source after logging in by scanning the QR code?
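Selenium can indeed get the rendered page source. I just looked into it and found that Zhaopin opens a new window after the search, so you have to switch to that window before reading the source.
Here is your code with that change:
from selenium import webdriver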
from time import sleep
from selenium.webdriver.common.by import By
from lxml import html
etree = html.etree
# Create the browser object
bro = webdriver.Chrome(r'D:/技能/chromedriver.exe')
bro.get('https://xiaoyuan.zhaopin.com/')
sleep(2)
bro.find_element(By.XPATH,'//*[@id="root"]/div/div[1]/div/div[1]/div[2]/a[1]').click()
sleep(5)
bro.find_element(By.XPATH,'//*[@id="root"]/div[2]/div/div/div[3]/span[2]').click()
sleep(2)
s = bro.find_element(By.XPATH,'//*[@id="search_input_one"]').send_keys('java')
sleep(1)
d = bro.find_element(By.XPATH,'//*[@id="root"]/div/div[2]/div/div/div[2]/div/span[2]').click()
sleep(1)
bro.switch_to.window(bro.window_handles[-1]) # add this line to switch to the new window
response = bro.page_source
print(response)
print('Java工程师-苏州' in response)
bro.quit()
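One caveat with the fix above: with a fixed sleep(1), the new window may not be open yet when window_handles is read, and the last handle is not guaranteed to be the newest one. Below is a rough sketch of a more defensive version. The helper name switch_to_newest_window and its parameters are my own invention, not part of Selenium; it only relies on driver.window_handles and driver.switch_to.window, which do exist.

```python
import time


def switch_to_newest_window(driver, known_handles, timeout=10.0, poll=0.2):
    """Wait for a window handle that is not in known_handles, then switch to it.

    Hypothetical helper (not a Selenium API). Returns the new handle, or
    raises TimeoutError if no new window appears within `timeout` seconds.
    """
    known = set(known_handles)
    deadline = time.monotonic() + timeout
    while True:
        # window_handles lists every window/tab open in this session
        new_handles = [h for h in driver.window_handles if h not in known]
        if new_handles:
            driver.switch_to.window(new_handles[-1])
            return new_handles[-1]
        if time.monotonic() >= deadline:
            raise TimeoutError("no new window opened within %.1f s" % timeout)
        time.sleep(poll)
```

In the script above you would save before = bro.window_handles just before the click that runs the search, then call switch_to_newest_window(bro, before) in place of the bro.switch_to.window(bro.window_handles[-1]) line.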