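No content returned after POSTing the form
Last edited by python羊 on 2021-7-9 08:05
Target URL: http://www.qybz.org.cn/standardProduct/toAdvancedResult.do
Goal: extract the search results from this search page, as shown in the screenshot: company name, standard name, publication date, and status.
The final request to the site is submitted as a POST form.
Three of the form parameters are dynamic: geetest_challenge, geetest_validate, and geetest_seccode.
I am currently stuck on obtaining the geetest_challenge parameter.
That URL should return the geetest_challenge value, but the request keeps failing with an error and I cannot read the return value.
What should I do to obtain the geetest_challenge value?
Source code: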
import requests
import json
import time
header = {
    'Referer': 'http://www.qybz.org.cn/standardProduct/toAdvancedResult.do',
    'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Mobile Safari/537.36',
}
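# Get the challenge and gt parameters
url_t = 'http://www.qybz.org.cn/gc/geetest/query'
# t looks like a millisecond timestamp, so generate it per request instead of hard-coding it
response = requests.get(url_t, headers=header, params={'t': int(time.time() * 1000)})
t_dict = json.loads(response.text)
print(t_dict['challenge'], t_dict['gt'])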
# Get the geetest_challenge parameter
call_time = 'geetest_' + str(int(round(time.time() * 1000)))
params = {
    'is_next': 'true',
    'type': 'slide3',
    'gt': t_dict['gt'],
    'challenge': t_dict['challenge'],
    'lang': 'zh-cn',
    'https': 'false',
    'protocol': 'http://',
    'offline': 'false',
    'product': 'embed',
    'api_server': 'api.geetest.com',
    'isPC': 'true',
    'autoReset': 'true',
    'width': '100%',
    'callback': call_time,
}
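url_chan = 'http://api.geetest.com/get.php'
response_chan = requests.get(url_chan, headers=header, params=params)
print(response_chan.text)
# With a callback parameter the body is likely JSONP, e.g. geetest_123({...}),
# which json.loads rejects, so strip the wrapper first
chan_text = response_chan.text.strip()
if not chan_text.startswith('{'):
    chan_text = chan_text[chan_text.index('(') + 1:chan_text.rindex(')')]
chan_dict = json.loads(chan_text)
print(chan_dict['challenge'])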
# Get the geetest_validate parameter
call_time_va = 'geetest_'+str(int(round(time.time() * 1000)))
params_va={
'gt': t_dict['gt'],
'challenge': chan_dict['challenge'],
'lang': 'zh-cn',
'pt': '0',
'client_type': 'web',
'w': 'LZl52UUP41jslURsQGopJkBLPz7JEjf3hWLCpxdNWyum)h2W)LZcTNiC6flLp5ivHU1DRd6Cu(R0qaMet9HXk0VcJK8HLR6N3tC8riwalap0HqQsxkzyMftO6LFy52vXctvYP5YpvDRmuT0szgcVuqGq(9asOy87CrstJY)5MjGVnaONXAg)t)LUzcRuhK3wBJ6Nx)YuJTrrZi)h7cFBuHui1LwpChQvdNrTMzCZpx(w6xhmL7WOY)N9c2Jo81HCVB(zuGtDD(lLliLqfIvhxC)Yx(KClkTr1m6U1N96bYC0oNJUnn1Q6wDOu1dnUDIuVi34UQczG6Vb(cOyV9IIJNQvt(iitcpXmmQzvk56KdUeXP7iMxkZoWBD2b81njKitu6Wbine9nx)ib)yqfX2Ogo5fJwH1PtzxSjzUoxE8wPPW)ljj0aKp1)A6suHyLme28RihhnXR3fwjp4uScxxzF9ldV9yZ(TmGisleuE5zXurtZ3)Y4PFofwR5x)drRney2DoQW4Ji0aizt(LvJrrehYlf(i5L7FwVVMOnzCBIB1n6gZPig4P7fTKlFp8yP)pluwtHXY9FegQbBbsu(lR9iqrSymrdezTy8h5Yi(X3(9UeIdZcSiv9R0h(7gwmhVhSOomCfHqfQH8QGXCy)ynDax(fYX1Ldwm3A2lTIkgoSw6AJBn4WiR0fWcZLw8A4PxfDwggF315XhvSNWDxRWmoIFfsywu1JVl1phm7zHBQ58CKj4FIG9T5JWdSgluPPOtMufp)QdZFOxFYODF28Gb7ygLT(hxVomnDFoLcXWHiVVAZkQJ0zzkj)p6O(4gBji2BDQg8ZdZT5jQ0lpPc2mtwoT(6g8lN4ynzpWX2YP9QENKIqp(WMBxakCSDqCrip098o0LANGCJEeC2BY5BisGVgrHbDyYN(X6bVx2FLMkayE.6864bd8da2ca59fe689b9a03365d72668b2a61dc00c34a44c68b846694f2270fde7b67f763db0809fdb13adf6f78572d5ee48733c64fcd1e2b1cab73efd2e785b19c068d9b92210125fcbb3e6624a454b7b3b06718ba5f1a81317397ccd69d6554fb439de9603ead81ac45461287e319f60d2554c90eea8b7a513a5fea7374be',
'callback': call_time_va}
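url_va = 'http://api.geetest.com/ajax.php'
# Query the ajax.php endpoint (url_va, not url_chan); strip any JSONP callback wrapper before parsing
response_va = requests.get(url_va, headers=header, params=params_va)
va_text = response_va.text.strip()
if not va_text.startswith('{'):
    va_text = va_text[va_text.index('(') + 1:va_text.rindex(')')]
va_dict = json.loads(va_text)
print(va_dict['validate'])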
# Fetch the search results
Num = 'Q/XSHSW021-2021'
url_post = 'http://www.qybz.org.cn/standardProduct/anvancedQueryPaged.do'
data = {
    'standardName': '',
    'enterpriseName': '',
    'standardCode': Num,
    'standardStatus': '',
    'xzqh': '',
    'orgCode': '',
    'pageNum': '1',
    'geetest_challenge': chan_dict['challenge'],
    'geetest_validate': va_dict['validate'],
    'geetest_seccode': va_dict['validate'] + '|jordan',
}
post_response = requests.post(url=url_post, headers=header, data=data)
print(post_response.text)
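One likely reason the parsing step fails: when a request carries a callback parameter, endpoints of this kind usually answer with JSONP (the JSON wrapped in a function call such as geetest_1625710079856({...})), and json.loads raises an error on that text. Below is a minimal sketch of a helper that accepts either form. The name parse_jsonp and the sample string are made up for illustration and are not taken from the site or from Geetest.

```python
import json
import re


def parse_jsonp(text):
    """Parse a response body that is either plain JSON or JSONP.

    Hypothetical helper for illustration: JSONP is assumed to look like
    callback_name({...}) with an optional trailing semicolon.
    """
    text = text.strip()
    # Capture everything between the first '(' after the callback name
    # and the final ')' of the body.
    match = re.fullmatch(r'[\w$.]+\s*\((.*)\)\s*;?', text, flags=re.DOTALL)
    if match:
        text = match.group(1)
    return json.loads(text)


# Made-up sample body, shaped like a JSONP reply
sample = 'geetest_1625710079856({"challenge": "abc123", "gt": "xyz"})'
print(parse_jsonp(sample)["challenge"])  # prints abc123
```

Printing response.text before parsing, as the script already does for the get.php call, shows which of the two forms the server actually sent.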
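Some of the values in there are probably dynamic, so copying them directly will not work.
'geetest_challenge':' 6c7d00a042cb55c51c1ce6464abfeb6fkj',
'geetest_validate':' 00cc191bba44aa24170ff84972f25beb',
'geetest_seccode':' 00cc191bba44aa24170ff84972f25beb|jordan'
These three parameters are dynamic.
Did you actually pass the slider puzzle verification?
阿奇_o posted on 2021-7-7 20:52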
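Did you actually pass the slider puzzle verification?
I want to bypass it.
Does that work?
Kayko posted on 2021-7-8 08:42
Does that work?
No, it doesn't.
python羊 posted on 2021-7-8 09:58
No, it doesn't.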
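That is not how you get around a slider puzzle verification, kid. You have to look at what the server actually responds with first, then work out an approach based on that response.
西瓜味的苹果 posted on 2021-7-8 14:44
That is not how you get around a slider puzzle verification, kid. You have to look at what the server actually responds with first, then work out an approach based on that response.
Can't I just force my way past it?
So you can't earn the money standing up?
西瓜味的苹果 posted on 2021-7-8 15:37
Fine, I'm on my knees. Please take a look for me, thanks.
xfmiao posted on 2021-7-7 17:38
'geetest_challenge':' 6c7d00a042cb55c51c1ce6464abfeb6fkj',
'geetest_validate':' 00cc191bba44aa24170f ...
I have revised the post. Please take a look, thanks.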