H.E.G. posted on 2021-4-3 16:59:55

A question about a web scraper

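Another month idled away on fishc,
and here I am with another question.
I was just watching one of 小甲鱼's videos and tried writing the code myself, but the dictionary I got back is the error response shown under the code.
Code:
# Follows 小甲鱼's 零基础入门学习Python course (P56)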
from urllib import request
from urllib import parse
import json

temp = input('Enter you want to translation content:')
url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule&smartresult=ugc&sessionFrom=http://www.youdao.com/'

head = {}  # request headers
head['User-Agent'] = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36 Edg/89.0.774.63'

data = {}  # form fields sent in the POST body
data['type']='AUTO'
data['i']=temp
data['doctype']='json'
data['xmlVersion']='1.6'
data['keyfrom']='fanyi.web'
data['ue']='UTF-8'
data['typoResult']='true'
data = parse.urlencode(data).encode('utf-8')

req = request.Request(url, data, head)  # supplying data makes this a POST request
response = request.urlopen(req)
html = response.read().decode('utf-8')

target = json.loads(html)
print(target)
res = target['translateResult'][0][0]['tgt']  # translateResult is a list of lists of dicts
print(res)
Result:
Enter you want to translation content:你好
{'type': 'UNSUPPORTED', 'errorCode': 30, 'elapsedTime': 2, 'translateResult': [[{'src': '您的请求来源非法,商业用途使用请关注有道翻译API官方网站“有道智云”: http://ai.youdao.com', 'tgt': '您的请求来源非法,商业用途使用请关注有道翻译API官方网站“有道智云”: http://ai.youdao.com'}]]}
您的请求来源非法,商业用途使用请关注有道翻译API官方网站“有道智云”: http://ai.youdao.com

yayc_zcyd posted on 2021-4-3 17:24:41

Sites like this really dislike scrapers.

suchocolate posted on 2021-4-3 21:33:22

For reference:
from urllib import request, parse
import json


def main():
    trans = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0"}
    dic = {"doctype": "json"}
    wd = input('Enter the text to translate: ')
    dic['i'] = wd
    data = bytes(parse.urlencode(dic), encoding='utf-8')
    q = request.Request(url=trans, data=data, headers=headers, method='POST')
    r = request.urlopen(q)
    result = json.loads(r.read().decode('utf-8'))
    # translateResult is a list of lists of dicts, so index into it before reading 'tgt'
    print(result['translateResult'][0][0]['tgt'])


if __name__ == '__main__':
    main()
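
The response printed in the question shows that 'translateResult' is a list of lists of dicts, and that a rejected request still carries a 'tgt' field holding the error message. A minimal sketch of a helper that handles both, assuming errorCode 0 means success (this thread only shows errorCode 30, so that is a guess) and using a hypothetical function name and sample dict:

```python
def extract_translation(result):
    """Return the translated text from a decoded response dict.

    Assumes the shape printed in the question: 'translateResult' is a list
    of paragraphs, each a list of {'src': ..., 'tgt': ...} dicts.
    Assumes errorCode 0 means success (not confirmed by this thread).
    """
    code = result.get('errorCode')
    if code != 0:
        # Refuse to return the server's error message as if it were a translation.
        raise ValueError(f'translation request rejected, errorCode={code}')
    paragraphs = result.get('translateResult', [])
    # Join the segments of each paragraph, then join paragraphs with newlines.
    return '\n'.join(
        ''.join(segment['tgt'] for segment in paragraph)
        for paragraph in paragraphs
    )


# Hypothetical sample in the same shape as the response shown in the question.
sample = {
    'errorCode': 0,
    'translateResult': [[{'src': '你好', 'tgt': 'hello'}]],
}
print(extract_translation(sample))
```

Calling extract_translation(result) in place of the direct indexing in main() would make a rejected request fail loudly instead of printing the error text as the translation.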

龙舞九天 posted on 2021-5-31 15:16:13
