    from selenium import webdriver
    from time import sleep
    from selenium.webdriver.common.by import By
    from lxml import html
    etree = html.etree
    # Create the browser object
    bro = webdriver.Chrome(r'D:/技能/chromedriver.exe')
    bro.get('https://xiaoyuan.zhaopin.com/')
    sleep(2)
    bro.find_element(By.XPATH,'//*[@id="root"]/div/div[1]/div/div[1]/div[2]/a[1]').click()
    sleep(5)
    bro.find_element(By.XPATH,'//*[@id="root"]/div[2]/div/div/div[3]/span[2]').click()
    sleep(2)
    s = bro.find_element(By.XPATH,'//*[@id="search_input_one"]').send_keys('java')
    sleep(1)
    d = bro.find_element(By.XPATH,'//*[@id="root"]/div/div[2]/div/div/div[2]/div/span[2]').click()
    sleep(1)
    reponse = bro.page_source
    print(reponse)
    bro.quit()
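Folks, why can't I get the page source after logging in by scanning the QR code?
Selenium can indeed retrieve the rendered page source. I just looked into it and found that Zhaopin (智联招聘) opens a new window after the search, so you have to switch to that window first; otherwise bro.page_source still returns the source of the original window.
Here is your code with that change: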
    from selenium import webdriver
    from time import sleep
    from selenium.webdriver.common.by import By
    from lxml import html
    etree = html.etree
    # Create the browser object
    bro = webdriver.Chrome(r'D:/技能/chromedriver.exe')
    bro.get('https://xiaoyuan.zhaopin.com/')
    sleep(2)
    bro.find_element(By.XPATH,'//*[@id="root"]/div/div[1]/div/div[1]/div[2]/a[1]').click()
    sleep(5)
    bro.find_element(By.XPATH,'//*[@id="root"]/div[2]/div/div/div[3]/span[2]').click()
    sleep(2)
    s = bro.find_element(By.XPATH,'//*[@id="search_input_one"]').send_keys('java')
    sleep(1)
    d = bro.find_element(By.XPATH,'//*[@id="root"]/div/div[2]/div/div/div[2]/div/span[2]').click()
    sleep(1)
    bro.switch_to.window(bro.window_handles[-1])  # Add this line to switch to the new window
    reponse = bro.page_source
    print(reponse)
    # Quick check: does the source contain a string expected on the results page?
    print('Java工程师-苏州' in reponse)
    bro.quit()
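
One caveat with `window_handles[-1]`: it assumes the new window has already opened after the `sleep(1)` and that it comes last in the list, and as far as I know the order of the handles is not guaranteed. Below is a rough sketch of a more defensive version that polls until a handle shows up that was not there before the click. The helper name `switch_to_new_window` and its parameters `known_handles`, `timeout` and `poll` are my own and not part of Selenium; the only driver calls it relies on are `window_handles` and `switch_to.window`.

```python
import time


def switch_to_new_window(driver, known_handles, timeout=10.0, poll=0.2):
    """Wait for a window handle not in known_handles, switch to it, return it.

    Hypothetical helper (not a Selenium API). Raises TimeoutError if no
    new window appears within `timeout` seconds.
    """
    known = set(known_handles)
    deadline = time.monotonic() + timeout
    while True:
        # Handles that did not exist before the action that opens the window
        new_handles = [h for h in driver.window_handles if h not in known]
        if new_handles:
            driver.switch_to.window(new_handles[-1])
            return new_handles[-1]
        if time.monotonic() >= deadline:
            raise TimeoutError("no new window appeared within %.1f s" % timeout)
        time.sleep(poll)
```

To use it, you would save `before = bro.window_handles` just before the click that triggers the search, and then call `switch_to_new_window(bro, before)` in place of the `sleep(1)` plus `bro.switch_to.window(bro.window_handles[-1])` lines.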