[Solved] How do I get the correct captcha image from this site with requests?

非凡 · posted 2021-11-2 18:13:55

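I need to scrape the site below to batch-query how a list of phone numbers has been flagged:
url = 'http://opene164.org.cn/mark/index.html'

The captcha on this site behaves as follows. When the page has just been opened or refreshed, the src of the captcha image looks like this:

[Screenshot: 微信截图_20211102180304.png]

img_url = 'http://opene164.org.cn/mark/query/captcha.html'

After I click the captcha to load a new image, the src changes to this:

[Screenshot: 微信截图_20211102180555.png]

I assume this second form is the proper src for the current captcha:
img_url = 'http://opene164.org.cn/mark/query/captcha.html?1635847506232'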
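
The numeric suffix in the second URL looks like a Unix timestamp in milliseconds (1635847506232 falls on 2021-11-02 at about 18:05 Beijing time, which matches the screenshot). Pages commonly append such a value so the browser does not reuse a cached image. Assuming that is its only purpose here, which the thread does not confirm, both URLs would reach the same endpoint, and a script could build the suffixed form as sketched below. The helper name `captcha_url` is made up for illustration.

```python
import time
from typing import Optional

CAPTCHA_URL = 'http://opene164.org.cn/mark/query/captcha.html'


def captcha_url(base: str = CAPTCHA_URL, now_ms: Optional[int] = None) -> str:
    """Append a millisecond timestamp as a bare query string.

    Assumption: the server ignores the value and it only defeats caching.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return f'{base}?{now_ms}'
```

Calling `captcha_url()` with no arguments uses the current time, so each call yields a different URL.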

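Questions:
1. Right after opening or refreshing the page, how do I get the correct captcha src?

2. Suppose I can fetch the captcha. If I download the page's captcha with a GET request,

res = requests.get(url=url, headers=header)

and then submit the captcha with a POST request, will the captcha the page currently holds have changed in the meantime?
url_data = 'http://opene164.org.cn/mark/data.do'

Could someone help clear this up?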

Best answer

suchocolate · posted 2021-11-3 12:33:59

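requests does not behave like a browser: it fetches only the URL you ask for and does not load the other resources a page references, and separate requests.get() calls do not share cookies. So if the site ties the captcha to a session, you need requests.session and should let the session object maintain that state for you.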

import requests


def main():
    headers = {'user-agent': 'firefox'}
    s = requests.session()  # one session, so cookies carry over between requests
    # Initial visit
    url = 'http://opene164.org.cn/mark/index.html'
    s.get(url, headers=headers)
    # Download the captcha image within the same session
    url = 'http://opene164.org.cn/mark/query/captcha.html'
    r = s.get(url, headers=headers)
    with open('test.gif', 'wb') as f:
        f.write(r.content)


if __name__ == '__main__':
    main()
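
The code above writes whatever comes back to test.gif. If the session is not set up as the server expects, the response could be an HTML page rather than an image. A minimal sketch for checking the leading bytes before saving follows; `sniff_image_ext` and `save_captcha` are hypothetical helpers, and the thread does not say which image format this site actually serves.

```python
from typing import Optional


def sniff_image_ext(data: bytes) -> Optional[str]:
    """Return a file extension based on well-known magic bytes, or None."""
    if data.startswith((b'GIF87a', b'GIF89a')):
        return 'gif'
    if data.startswith(b'\x89PNG\r\n\x1a\n'):
        return 'png'
    if data.startswith(b'\xff\xd8\xff'):
        return 'jpg'
    return None


def save_captcha(data: bytes, stem: str = 'captcha') -> str:
    """Write the image to '<stem>.<ext>' and return the path used."""
    ext = sniff_image_ext(data)
    if ext is None:
        raise ValueError('response does not look like a GIF, PNG or JPEG image')
    path = f'{stem}.{ext}'
    with open(path, 'wb') as f:
        f.write(data)
    return path
```

With the answer's code this would be called as `save_captcha(r.content)` in place of the `with open(...)` block.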

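非凡 · posted 2021-11-4 00:19:04

suchocolate posted 2021-11-3 12:33
requests does not load a page's resources the way a browser does, so if the page ties anything to a session, you need ...

Thanks for the pointer, it works now.
At first I assumed the URL for the initial visit was the one I had opened in the browser, http://opene164.org.cn/mark/index.html

Later I found that the query box in the middle of the page is actually served from a different URL, http://opene164.org.cn/mark/query/index.html
[Screenshot: 捕获.PNG]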

import requests

url = 'http://opene164.org.cn/mark/query/index.html'
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36 Edg/95.0.1020.40'}
session = requests.session()
# Visit the query page first so the session is established
res = session.get(url=url, headers=header)
# Fetch the captcha in the same session and save it for viewing
img_url = 'http://opene164.org.cn/mark/query/captcha.html'
img = session.get(img_url, headers=header)
with open('img-1.gif', 'wb') as f:
    f.write(img.content)
# Submit the phone number together with the captcha typed in by hand
port_html = 'http://opene164.org.cn/mark/data.do'
data = {'phone': '07586543210'}
captcha_in = input('captcha_IN')
data['captcha'] = captcha_in
find = session.post(url=port_html, data=data, headers=header)

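find.text
Out[5]: '{"data":"{\"msg\":\"您号码开户的运营商未接入平台，今日查询量已达到试用上限，请明日查询\",\"status\":500}","msg":"成功","status":200}'

(The inner msg reads: "The carrier your number is registered with has not joined the platform, and today's queries have reached the trial limit; please query again tomorrow." The outer msg, 成功, means "success".)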
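
The response body shown here is JSON whose `data` field is itself a JSON-encoded string, so it takes two decoding passes to reach the inner status and message. The sketch below is based only on this one response; the function name `parse_mark_response` and the assumption that every reply has this shape are mine, not something the thread confirms.

```python
import json
from typing import Any, Dict


def parse_mark_response(text: str) -> Dict[str, Any]:
    """Decode the outer JSON, then the JSON string stored under 'data'."""
    outer = json.loads(text)
    inner = outer.get('data')
    if isinstance(inner, str):
        try:
            inner = json.loads(inner)
        except json.JSONDecodeError:
            pass  # leave it as plain text if it is not JSON after all
    return {'status': outer.get('status'), 'msg': outer.get('msg'), 'data': inner}


if __name__ == '__main__':
    # The reply from the post above, as a raw string so the backslashes survive.
    sample = (r'{"data":"{\"msg\":\"您号码开户的运营商未接入平台，今日查询量已达到试用上限，'
              r'请明日查询\",\"status\":500}","msg":"成功","status":200}')
    result = parse_mark_response(sample)
    print(result['status'], result['data']['status'])  # prints: 200 500
```

In this reply the outer status is 200 while the inner one is 500, so a script would probably need to check the inner status to tell whether a query really succeeded.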

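非凡 · posted 2021-11-4 00:20:23

suchocolate posted 2021-11-3 12:33
requests does not load a page's resources the way a browser does, so if the page ties anything to a session, you need ...

Many thanks for the guidance, I got it working.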
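
Since the original goal was to check a batch of numbers, the working single-number flow could be wrapped in a loop that reuses one session. The sketch below assumes a fresh captcha has to be fetched and solved for every query, which the thread does not confirm. `query_numbers`, `ask_user` and the `solve` callback are hypothetical names; the URLs and the form fields `phone` and `captcha` are taken from the post above. The session is passed in as a parameter, so the function works with `requests.session()` or any object that offers the same `get` and `post` methods.

```python
from typing import Callable, Dict, Iterable

INDEX_URL = 'http://opene164.org.cn/mark/query/index.html'
CAPTCHA_URL = 'http://opene164.org.cn/mark/query/captcha.html'
DATA_URL = 'http://opene164.org.cn/mark/data.do'


def query_numbers(session, phones: Iterable[str],
                  solve: Callable[[bytes], str],
                  headers: Dict[str, str]) -> Dict[str, str]:
    """Query each number in turn and return the raw response text per number."""
    session.get(INDEX_URL, headers=headers)  # establish the session once
    results = {}
    for phone in phones:
        # Assumption: each query needs its own freshly issued captcha.
        image = session.get(CAPTCHA_URL, headers=headers).content
        code = solve(image)
        reply = session.post(DATA_URL, headers=headers,
                             data={'phone': phone, 'captcha': code})
        results[phone] = reply.text
    return results


def ask_user(image: bytes) -> str:
    """Save the captcha to disk and ask a human to type it in."""
    with open('captcha.gif', 'wb') as f:
        f.write(image)
    return input('captcha: ')
```

With the variables from the post above, a call would look like `query_numbers(session, ['07586543210'], ask_user, header)`. The reply shown earlier mentions a daily trial query limit, so a large batch may be cut short.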
