Why can't I get the web page parameter?, Python Discussion, Programming Languages, FishC Forum (鱼C论坛)

python羊 posted on 2021-7-9 14:14:53

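Why can't I get the web page parameter?

URL to crawl: http://www.qybz.org.cn/standardProduct/toAdvancedResult.do

I want to extract the search results from this search page (as shown in the screenshot): company name, standard name, publication date, and status.

The site ultimately submits the search as a POST form.
Three of the form parameters, geetest_challenge, geetest_validate and geetest_seccode, are dynamic values.

I'm currently stuck on getting the geetest_challenge parameter.
The address in the code below should return the geetest_challenge value, but the request keeps failing and I can't get a return value.

What should I do to get the geetest_challenge value?

Source code: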

import requests
import json
import time

header = {
    'Referer': 'http://www.qybz.org.cn/standardProduct/toAdvancedResult.do',
    'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Mobile Safari/537.36',
}

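# Step 1: fetch the initial challenge and gt values from the site.
# t looks like a millisecond timestamp, so send the current time
# instead of a hard-coded value.
url_t = 'http://www.qybz.org.cn/gc/geetest/query'

response = requests.get(url_t, headers=header, params={'t': int(time.time() * 1000)})
response.raise_for_status()

t_dict = json.loads(response.text)
print(t_dict['challenge'], t_dict['gt'])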

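# Step 2: try to fetch the geetest_challenge parameter from the geetest API.

# JSONP callback name: 'geetest_' plus the current time in milliseconds.
call_time = 'geetest_' + str(int(time.time() * 1000))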

params = {
    'is_next': 'true',
    'type': 'slide3',
    'gt': t_dict['gt'],
    'challenge': t_dict['challenge'],
    'lang': 'zh-cn',
    'https': 'false',
    'protocol': 'http://',
    'offline': 'false',
    'product': 'embed',
    'api_server': 'api.geetest.com',
    'isPC': 'true',
    'autoReset': 'true',
    'width': '100%',
    'callback': call_time,
}

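url_chan = 'http://api.geetest.com/get.php'
response_chan = requests.get(url_chan, headers=header, params=params)

text = response_chan.text
print(text)

# Because a callback name is sent, the reply is expected to be JSONP,
# i.e. geetest_<timestamp>({...}), which json.loads cannot parse directly.
# Strip the wrapper first if it is present.
if text.startswith(call_time + '('):
    text = text[len(call_time) + 1:text.rindex(')')]

chan_dict = json.loads(text)
print(chan_dict['challenge'])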

wp231957 posted on 2021-7-9 20:19:06

Doesn't this site have a slider CAPTCHA? The examples I've seen online all use selenium.

python羊 posted on 2021-7-10 09:03:40

wp231957 posted on 2021-7-9 20:19
Doesn't this site have a slider CAPTCHA? The examples I've seen online all use selenium.

The geetest_challenge parameter comes before the slider.

python羊 posted on 2021-7-12 09:26:54

Bumping my own thread.

python羊 posted on 2021-7-12 09:28:02

Don't let this sink.

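YunGuo posted on 2021-7-12 21:13:28

The geetest_challenge parameter is only available after you pass the slider verification. Before a query you go through two slider verifications: one when opening the site and one when searching. The endpoint http://www.qybz.org.cn/gc/geetest/query? does give you a challenge value, but it is not the latest one, and a request made with it is invalid. The second (latest) challenge is the one geetest returns after you enter a keyword, click search and pass the slider verification. That value is the geetest_challenge parameter, and only a request that carries it gets a normal result. Unless you can crack the geetest CAPTCHA, you're better off just using selenium.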
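
For anyone who wants to try the selenium route, here is a minimal sketch, not a tested solution for this site. It opens the page in Chrome, leaves the keyword entry and the slider to you, and polls until the three values show up. It assumes the geetest widget writes its results into hidden inputs named geetest_challenge, geetest_validate and geetest_seccode; I have not confirmed that for this page, so check it in the browser's developer tools first. The helper names (fields_complete, read_geetest_fields, wait_for_geetest, collect_tokens) are made up for this example.

```python
import time

# The three dynamic form parameters named in the question.
GEETEST_FIELDS = ("geetest_challenge", "geetest_validate", "geetest_seccode")


def fields_complete(values):
    """Return True once all three geetest values are present and non-empty."""
    return all(values.get(name) for name in GEETEST_FIELDS)


def read_geetest_fields(driver):
    """Read the hidden inputs the widget is ASSUMED to fill in after a solve.

    Assumption: the page holds the results in <input name="geetest_...">
    elements. If it keeps them elsewhere, this function needs changing.
    """
    from selenium.webdriver.common.by import By  # imported lazily

    values = {}
    for name in GEETEST_FIELDS:
        elements = driver.find_elements(By.NAME, name)
        values[name] = elements[0].get_attribute("value") if elements else ""
    return values


def wait_for_geetest(driver, timeout=120, poll=1.0):
    """Poll until the slider has been solved by hand or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        values = read_geetest_fields(driver)
        if fields_complete(values):
            return values
        time.sleep(poll)
    raise TimeoutError("the slider was not solved in time")


def collect_tokens(url):
    """Open the page in Chrome, wait for a manual solve, return the values."""
    from selenium import webdriver  # needs selenium and a local Chrome

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Now type the keyword, click search and drag the slider in the
        # browser window; the loop below only watches for the result.
        return wait_for_geetest(driver)
    finally:
        driver.quit()
```

Call it as collect_tokens('http://www.qybz.org.cn/standardProduct/toAdvancedResult.do'). Nothing here solves the slider automatically, it only reads the values after a person has passed the check. If the values never appear, the page probably stores them somewhere other than hidden inputs, and it may be simpler to let selenium submit the search and read the result table from the page directly.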
