FishC Forum (鱼C论坛)

[Solved] How do I scrape data from this China Meteorological Administration API?

Posted on 2022-12-24 17:35:40
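I found a few APIs in blog posts online, but none of them are being updated any more.

I did find one link that opens fine in the browser, but my scraper cannot fetch it:
http://t.weather.sojson.com/api/weather/city/101010100

The code and the error are below:
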
    import requests
    header = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
        'Cache-Control': 'max-age=0',
        'Host': 't.weather.sojson.com',
        'Proxy-Connection': 'keep-alive',
        'Upgrade-Insecure-Requests': '1',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36 Edg/108.0.1462.54',
    }

    res = requests.get('http://t.weather.sojson.com/api/weather/city/101010100', headers=header)

    Traceback (most recent call last):
      File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connection.py", line 174, in _new_conn
        conn = connection.create_connection(
      File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\util\connection.py", line 72, in create_connection
        for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
      File "C:\ProgramData\Anaconda3\lib\socket.py", line 954, in getaddrinfo
        for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    socket.gaierror: [Errno 11001] getaddrinfo failed

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
        httplib_response = self._make_request(
      File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 398, in _make_request
        conn.request(method, url, **httplib_request_kw)
      File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connection.py", line 239, in request
        super(HTTPConnection, self).request(method, url, body=body, headers=headers)
      File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1285, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1331, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1280, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1040, in _send_output
        self.send(msg)
      File "C:\ProgramData\Anaconda3\lib\http\client.py", line 980, in send
        self.connect()
      File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connection.py", line 205, in connect
        conn = self._new_conn()
      File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connection.py", line 186, in _new_conn
        raise NewConnectionError(
    urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x0000021C5AD86D60>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\ProgramData\Anaconda3\lib\site-packages\requests\adapters.py", line 440, in send
        resp = conn.urlopen(
      File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 785, in urlopen
        retries = retries.increment(
      File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\util\retry.py", line 592, in increment
        raise MaxRetryError(_pool, url, error or ResponseError(cause))
    urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='t.weather.sojson.com', port=80): Max retries exceeded with url: /api/weather/city/101010100 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000021C5AD86D60>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "c:\Users\wldcmzy\Desktop\asdf.txt.py", line 14, in <module>
        res = requests.get('http://t.weather.sojson.com/api/weather/city/101010100', headers=header)
      File "C:\ProgramData\Anaconda3\lib\site-packages\requests\api.py", line 75, in get
        return request('get', url, params=params, **kwargs)
      File "C:\ProgramData\Anaconda3\lib\site-packages\requests\api.py", line 61, in request
        return session.request(method=method, url=url, **kwargs)
      File "C:\ProgramData\Anaconda3\lib\site-packages\requests\sessions.py", line 529, in request
        resp = self.send(prep, **send_kwargs)
      File "C:\ProgramData\Anaconda3\lib\site-packages\requests\sessions.py", line 645, in send
        r = adapter.send(request, **kwargs)
      File "C:\ProgramData\Anaconda3\lib\site-packages\requests\adapters.py", line 519, in send
        raise ConnectionError(e, request=request)
    requests.exceptions.ConnectionError: HTTPConnectionPool(host='t.weather.sojson.com', port=80): Max retries exceeded with url: /api/weather/city/101010100 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000021C5AD86D60>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))

Posted on 2022-12-24 17:52:13
I tried it and it scrapes fine. I used your code as-is.

Original poster | Posted on 2022-12-24 18:21:42
gywjj wrote on 2022-12-24 17:52:
I tried it and it scrapes fine. I used your code as-is.

Huh, now I'm even more confused 0.0

Posted on 2022-12-24 19:17:31
It runs without any errors on Python 3.11.1.

Posted on 2022-12-24 19:24:32 | Best answer
This looks like a network problem on your end. Can you open the link in your browser?
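
A quick way to check this, as a rough sketch: the first exception in the traceback is socket.gaierror raised from getaddrinfo, which means the hostname lookup failed before any HTTP request went out. The helper below repeats only that lookup with the standard library. The name can_resolve is made up for illustration.

```python
import socket


def can_resolve(host, port=80):
    """Return True if the OS resolver can turn host into at least one address."""
    try:
        # The same lookup that fails in the traceback (urllib3 -> socket.getaddrinfo).
        results = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Raised when name resolution fails, e.g. [Errno 11001] on Windows.
        return False
    return len(results) > 0
```

If can_resolve('t.weather.sojson.com') returns False while the browser still loads the page, the browser and the script are probably not resolving names the same way, and the machine's DNS settings are worth a look.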

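Original poster | Posted on 2022-12-24 19:48:21
isdkz wrote on 2022-12-24 19:24:
This looks like a network problem on your end. Can you open the link in your browser?

It opens fine in the browser, no problem at all.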

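Original poster | Posted on 2022-12-24 19:51:13
isdkz wrote on 2022-12-24 19:24:
This looks like a network problem on your end. Can you open the link in your browser?

I figured it out. A while back I was using a campus network cable that could not obtain an IP address automatically, so I set the DNS server by hand and then forgot to switch it back to automatic. The browser should not have been able to reach anything either, but I had a proxy running, which masked the problem and made me think my network was fine. It works now 0.0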
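
In case it helps someone else, here is a small sketch of how to see what the script side has to work with. It lists the proxy settings that Python's standard library detects through urllib.request.getproxies(). The function name proxy_report is hypothetical.

```python
import urllib.request


def proxy_report():
    """Return the proxy settings detected by the standard library as sorted (scheme, url) pairs."""
    # getproxies() reads environment variables such as HTTP_PROXY and,
    # on Windows and macOS, the system proxy configuration.
    return sorted(urllib.request.getproxies().items())


if __name__ == "__main__":
    report = proxy_report()
    if report:
        for scheme, url in report:
            print(f"{scheme}: {url}")
    else:
        print("No proxy detected; name lookups rely on the machine's own DNS settings.")
```

An empty result on a machine where the browser goes through a proxy (for example via an extension) would suggest that the browser and the script take different network paths. An HTTP proxy typically resolves hostnames itself, so a broken local DNS setting can go unnoticed in the browser.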