[Solved] Web crawler

月魔同学 · posted 2020-8-6 10:49:20


import urllib.request
import urllib.parse
import json
import time

while True:
    # Prompt text: "Enter the text to translate (enter q to quit):"
    content=input('请输入要翻译的内容（如果结束请输入q）：')
    if content=='q':
        break
    head={}
    head['User-Agent']='Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3706.400 SLBrowser/10.0.4040.400'
    url='http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
    data={'i':content,
          'from':'AUTO',
          'to':'AUTO',
          'smartresult':'dict',
          'client':'fanyideskweb',
          'salt':'15966771704905',
          'sign': '8634c91db8413ef4cfbe684fd030d518',
          'ts': '1596677170490',
          'bv': 'd16528ec6ead722121051f646932f6ab',
          'doctype': 'json',
          'version': '2.1',
          'keyfrom': 'fanyi.web',
          'action':'FY_BY_REALTlME'}
    data=urllib.parse.urlencode(data).encode('utf-8')
    # This is the call that fails in the traceback
    response=urllib.request.urlopen(url,data,head)
    html=response.read().decode('utf-8')
    target=json.loads(html)
    target=target['translateResult'][0][0]['tgt']
    print(target)
    time.sleep(5)


Result:
======================= RESTART: E:\python\程序\54_有道翻译.py =======================
请输入要翻译的内容（如果结束请输入q）我爱你
Traceback (most recent call last):
  File "E:\python\程序\54_有道翻译.py", line 31, in <module>
    response=urllib.request.urlopen(url,data,head)
  File "D:\python\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "D:\python\lib\urllib\request.py", line 525, in open
    response = self._open(req, data)
  File "D:\python\lib\urllib\request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "D:\python\lib\urllib\request.py", line 502, in _call_chain
    result = func(*args)
  File "D:\python\lib\urllib\request.py", line 1348, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "D:\python\lib\urllib\request.py", line 1319, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "D:\python\lib\http\client.py", line 1230, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "D:\python\lib\http\client.py", line 1276, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "D:\python\lib\http\client.py", line 1225, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "D:\python\lib\http\client.py", line 1004, in _send_output
    self.send(msg)
  File "D:\python\lib\http\client.py", line 944, in send
    self.connect()
  File "D:\python\lib\http\client.py", line 915, in connect
    self.sock = self._create_connection(
  File "D:\python\lib\socket.py", line 793, in create_connection
    sock.settimeout(timeout)
TypeError: an integer is required (got type dict)

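This program is from the 小甲鱼 video tutorial, and I don't understand why it raises a type error. The data dict was copied straight from Form Data in the browser.
The error says an integer is required but a dict was received.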

Best answer

Twilight6 · posted 2020-8-6 10:54:33

Last edited by Twilight6 on 2020-8-6 10:55

In reply to 月魔同学, posted 2020-8-6 10:52:

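urlopen does not accept headers directly. Its third positional parameter is timeout, so urlopen(url,data,head) passes the head dict as the timeout, which is why sock.settimeout raises the TypeError. Build a Request with the headers first and pass that to urlopen.

Also remove the _o from the URL; otherwise the server returns bogus content with error:50 instead of a translation.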
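The first point can be checked without touching the network. The sketch below inspects the parameter order of urlopen and then builds a Request the way the answer describes. The helper name build_post_request, the example.invalid URL, and the form values are made up for illustration and are not part of the original program.

```python
import inspect
import urllib.parse
import urllib.request

# The first three parameters of urlopen are url, data and timeout, so a
# third positional argument is treated as the timeout, not as headers.
positional = list(inspect.signature(urllib.request.urlopen).parameters)[:3]
print(positional)


def build_post_request(url, form, user_agent):
    """Hypothetical helper: urlencode the form and attach a User-Agent header."""
    body = urllib.parse.urlencode(form).encode('utf-8')
    return urllib.request.Request(url, data=body, headers={'User-Agent': user_agent})


# Placeholder URL and form values, only to show the shape of the object.
req = build_post_request('http://example.invalid/translate', {'i': 'hello'}, 'Mozilla/5.0')
print(req.get_method())              # a Request with data defaults to POST
print(req.get_header('User-agent'))  # Request stores header names capitalized
print(req.data)
```

Passing req to urllib.request.urlopen(req) would then send the headers along with the form body, and a timeout, if wanted, can be given by keyword.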

import urllib.request
import urllib.parse
import json
import time

while True:
    # Prompt text: "Enter the text to translate (enter q to quit):"
    content=input('请输入要翻译的内容（如果结束请输入q）：')
    if content=='q':
        break
    head={}
    head['User-Agent']='Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3706.400 SLBrowser/10.0.4040.400'
    # translate, not translate_o
    url='http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
    data={'i':content,
          'from':'AUTO',
          'to':'AUTO',
          'smartresult':'dict',
          'client':'fanyideskweb',
          'salt':'15966771704905',
          'sign': '8634c91db8413ef4cfbe684fd030d518',
          'ts': '1596677170490',
          'bv': 'd16528ec6ead722121051f646932f6ab',
          'doctype': 'json',
          'version': '2.1',
          'keyfrom': 'fanyi.web',
          'action':'FY_BY_REALTlME'}
    data=urllib.parse.urlencode(data).encode('utf-8')
    # Headers go on the Request object, which is then passed to urlopen
    req = urllib.request.Request(url,data=data,headers=head)
    response=urllib.request.urlopen(req)
    html=response.read().decode('utf-8')
    target=json.loads(html)
    target=target['translateResult'][0][0]['tgt']
    print(target)
    time.sleep(5)
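Since the wrong URL yields a reply without a translation, indexing target['translateResult'] directly would fail with a KeyError in that case. A minimal sketch of a more defensive parse follows. The function name extract_translation is hypothetical, and both sample replies are hand-written stand-ins: the exact shape of the failure reply (an errorCode field) is an assumption based on the error:50 mentioned above, not captured server output.

```python
import json


def extract_translation(raw_json):
    """Return the translated text, or raise ValueError if the reply has none."""
    payload = json.loads(raw_json)
    results = payload.get('translateResult')
    if not results:
        # Assumed failure shape: a JSON object with no translateResult key.
        raise ValueError('no translation in reply: {!r}'.format(payload))
    return results[0][0]['tgt']


# Hand-written sample replies, for illustration only.
ok_reply = '{"translateResult": [[{"src": "hello", "tgt": "你好"}]]}'
bad_reply = '{"errorCode": 50}'

print(extract_translation(ok_reply))
try:
    extract_translation(bad_reply)
except ValueError as exc:
    print('failed:', exc)
```

In the loop above, extract_translation(html) could stand in for the json.loads call and the indexing line, so that a rejected request produces a readable message rather than a KeyError.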

zltzlt · posted 2020-8-6 10:49:57

The code came out garbled.

月魔同学 · posted 2020-8-6 10:52:03

import urllib.request
import urllib.parse
import json
import time

while True:
    content=input('请输入要翻译的内容（如果结束请输入q）')
    if content=='q':
        break
    head={}
    head['User-Agent']='Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3706.400 SLBrowser/10.0.4040.400'
    url='http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
    data={'i':content,
          'from':'AUTO',
          'to':'AUTO',
          'smartresult':'dict',
          'client':'fanyideskweb',
          'salt':'15966771704905',
          'sign': '8634c91db8413ef4cfbe684fd030d518',
          'ts': '1596677170490',
          'bv': 'd16528ec6ead722121051f646932f6ab',
          'doctype': 'json',
          'version': '2.1',
          'keyfrom': 'fanyi.web',
          'action':'FY_BY_REALTlME'}
    data=urllib.parse.urlencode(data).encode('utf-8')
    response=urllib.request.urlopen(url,data,head)
    html=response.read().decode('utf-8')
    target=json.loads(html)
    target=target['translateResult'][0][0]['tgt']
    print(target)
    time.sleep(5)


