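[Help] Using a crawler to fetch a resource as follows:
Open https://msdn.itellyou.cn/, enter ASP.NET in the search box, and download the resource that the search returns.
First get the token and cookie from the home page, then get the results through the search endpoint.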
import re

import requests

# url[0] is the home page, url[1] is the search endpoint.
url = ['https://msdn.itellyou.cn/', 'https://msdn.itellyou.cn/Index/Search']
USER_AGENT = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36')


def get_index():
    # Get the token and cookie from the home page.
    headers = {'user-agent': USER_AGENT}
    response = requests.get(url[0], headers=headers)
    if response.status_code != 200:
        # Return a pair so the caller can always unpack the result.
        return None, None
    tokens = re.findall('data-token=(.*?)>', response.content.decode())
    if not tokens:
        return None, None
    # Strip surrounding quotes in case the attribute value is quoted in the HTML.
    token = tokens[0].strip('"\'')
    cookie = response.headers.get('set-cookie')
    return token, cookie


def get_search(token, cookie, keyword):
    # Get the search results.
    headers = {
        'user-agent': USER_AGENT,
        'x-csrf-token': token,
        'cookie': cookie
    }
    form_data = {
        'keyword': keyword,
        'filter': 'true'
    }
    response = requests.post(url[1], headers=headers, data=form_data).json()
    # Take the first product of the first result group.
    data = response['result']['list'][0]['product'][0]
    name = data['name']
    ed2k = data['url']
    print('File name:', name)
    print(ed2k)


if __name__ == '__main__':
    word = input('Enter the exact keyword: ')
    token, cookie = get_index()
    if token is None:
        raise SystemExit('Could not get a token from the home page.')
    get_search(token, cookie, word)
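
One possible variation on the two steps above, as a rough sketch rather than a confirmed fix: let a `requests.Session` carry the cookie between the home page request and the search request, instead of copying the raw `set-cookie` header by hand. The helper names `extract_token`, `first_product` and `search` are made up for this example, and the token pattern and JSON layout are only assumed from the snippet above; the real page may differ.

```python
import re
from typing import Any, Optional

BASE_URL = 'https://msdn.itellyou.cn/'
SEARCH_URL = 'https://msdn.itellyou.cn/Index/Search'
USER_AGENT = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36')


def extract_token(html: str) -> Optional[str]:
    """Return the data-token attribute value, quoted or unquoted, or None."""
    # Assumption: the token sits in a data-token attribute, as in the post.
    match = re.search(r'data-token=["\']?([^"\'\s>]+)', html)
    return match.group(1) if match else None


def first_product(payload: Any) -> Optional[dict]:
    """Walk result -> list[0] -> product[0]; return None if any level is missing."""
    # Assumption: the JSON layout matches the one indexed in the post.
    try:
        return payload['result']['list'][0]['product'][0]
    except (KeyError, IndexError, TypeError):
        return None


def search(keyword: str) -> Optional[dict]:
    """Fetch the home page, then post the search, sharing cookies via a Session."""
    # Imported here so the two pure helpers above work without the dependency.
    import requests

    with requests.Session() as session:
        session.headers['user-agent'] = USER_AGENT
        index = session.get(BASE_URL, timeout=10)
        index.raise_for_status()
        token = extract_token(index.text)
        if token is None:
            return None
        # The Session resends any cookies set by the first response.
        reply = session.post(
            SEARCH_URL,
            headers={'x-csrf-token': token},
            data={'keyword': keyword, 'filter': 'true'},
            timeout=10,
        )
        reply.raise_for_status()
        return first_product(reply.json())
```

Calling `search('ASP.NET')` would then return the first product as a dict, or `None` when the token or the expected JSON path is missing, so the caller can print `name` and `url` only when something was found.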