This post was last edited by fineconey on 2019-6-29 09:06
I have a question about the following script:
# -*- coding: utf-8 -*-
import re
import requests
sn=0
headers ={
    "Accept":" text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Cache-Control": "max-age=0",
    "Connection":" keep-alive",
    "Cookie": "tp=MGM0NzFFYTgwZWM3YjFjZDc3MTg5MmQ5MDIwZDIzNjFkMjdiMmQxNA%3D%3D",
    "DNT": "1",
    "Host": "tp.m-team.cc",
    "Referer": "https://tp.m-team.cc/",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent":" Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3704.400 QQBrowser/10.4.3588.400"
}
url='https://tp.m-team.cc/movie.php?inclbookmarked=0&incldead=1&spstate=0&page=1'
r=requests.get(url,headers=headers)
print(r)
I want to scrape the titles from a PT (private tracker) site, but it requires a login. From searching online, I learned that cookies can be passed to requests.get. With this site, though, the request always fails and I can't work out why. The same approach works fine on another site.
Any help would be appreciated. Thanks.
The following error is shown:
"F:\【Mr.Zhang's python files】\venv\Scripts\python.exe" "F:/【Mr.Zhang's python files】/mteam.py"
Traceback (most recent call last):
  File "F:/【Mr.Zhang's python files】/mteam.py", line 20, in <module>
    r=requests.get(url,headers=headers)
  File "C:\Users\Mr.Zhang\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\Mr.Zhang\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\Mr.Zhang\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\sessions.py", line 519, in request
    prep = self.prepare_request(req)
  File "C:\Users\Mr.Zhang\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\sessions.py", line 462, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "C:\Users\Mr.Zhang\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\models.py", line 314, in prepare
    self.prepare_headers(headers)
  File "C:\Users\Mr.Zhang\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\models.py", line 448, in prepare_headers
    check_header_validity(header)
  File "C:\Users\Mr.Zhang\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\utils.py", line 942, in check_header_validity
    raise InvalidHeader("Invalid return character or leading space in header: %s" % name)
requests.exceptions.InvalidHeader: Invalid return character or leading space in header: User-Agent
Process finished with exit code 1
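Several header values have an extra leading space. The traceback names only User-Agent, but the Accept and Connection values start with a space as well. With those spaces removed, the script reads: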
# -*- coding: utf-8 -*-
import re
import requests
sn=0
# Header values must not begin with a space, or requests raises InvalidHeader.
headers ={
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Cache-Control": "max-age=0",
    "Connection": "keep-alive",
    "Cookie": "tp=MGM0NzFFYTgwZWM3YjFjZDc3MTg5MmQ5MDIwZDIzNjFkMjdiMmQxNA%3D%3D",
    "DNT": "1",
    "Host": "tp.m-team.cc",
    "Referer": "https://tp.m-team.cc/",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3704.400 QQBrowser/10.4.3588.400"
}
url='https://tp.m-team.cc/movie.php?inclbookmarked=0&incldead=1&spstate=0&page=1'
r=requests.get(url,headers=headers)
print(r)
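If the headers are pasted from the browser's developer tools, stray spaces like these are easy to miss. One way to guard against them is to strip every key and value before making the request. The snippet below is only a minimal sketch: `clean_headers` is a hypothetical helper name (it is not part of requests), the sample values are shortened for illustration, and it assumes every key and value is a plain string.

```python
def clean_headers(raw_headers):
    """Return a copy of raw_headers with surrounding whitespace removed.

    Hypothetical helper (not part of requests); assumes every key and
    value is a plain str.
    """
    return {name.strip(): value.strip() for name, value in raw_headers.items()}


# Shortened sample values that keep the stray leading space.
pasted = {
    "Accept": " text/html,application/xhtml+xml",
    "Connection": " keep-alive",
    "User-Agent": " Mozilla/5.0 (Windows NT 10.0; WOW64)",
}

headers = clean_headers(pasted)

# Confirm that no value starts with whitespace any more.
assert not any(value[:1].isspace() for value in headers.values())
```

The cleaned dict can then be passed on as usual, for example `requests.get(url, headers=headers)`.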