This post was last edited by charst89 on 2019-6-15 08:24
I'm using Python 3.7.3.
Code:
    import urllib.request
    import urllib.parse
    import re
    from bs4 import BeautifulSoup

    def main():
        keyword = input("请输入关键词:")  # prompt text: "Enter a keyword:"
        keyword = urllib.parse.urlencode({"word": keyword})  # percent-encodes the keyword
        response = urllib.request.urlopen("http://baike.baidu.com/search/word?%s" % keyword)
        html = response.read()
        soup = BeautifulSoup(html, "html.parser")

        for each in soup.find_all(href=re.compile("view")):  # links whose href contains "view"
            content = ''.join([each.text])
            url2 = ''.join(["http://baike.baidu.com", each["href"]])
            response2 = urllib.request.urlopen(url2)  # the traceback points at this call
            html2 = response2.read()
            soup2 = BeautifulSoup(html2, "html.parser")
            if soup2.h2:
                content = ''.join([content, soup2.h2.text])  # append the subtitle, if any
            content = ''.join([content, "->", url2])
            print(content)
    if __name__ == "__main__":
        main()
Error:
    RESTART: C:/Users/cgt19/AppData/Local/Programs/Python/Python37-32/p14_92.py
    请输入关键词:牛
    Traceback (most recent call last):
      File "C:/Users/cgt19/AppData/Local/Programs/Python/Python37-32/p14_92.py", line 24, in <module>
        main()
      File "C:/Users/cgt19/AppData/Local/Programs/Python/Python37-32/p14_92.py", line 16, in main
        response2 = urllib.request.urlopen(url2)
      File "C:\Users\cgt19\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 222, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Users\cgt19\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 525, in open
        response = self._open(req, data)
      File "C:\Users\cgt19\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 543, in _open
        '_open', req)
      File "C:\Users\cgt19\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 503, in _call_chain
        result = func(*args)
      File "C:\Users\cgt19\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 1345, in http_open
        return self.do_open(http.client.HTTPConnection, req)
      File "C:\Users\cgt19\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 1317, in do_open
        encode_chunked=req.has_header('Transfer-encoding'))
      File "C:\Users\cgt19\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 1229, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "C:\Users\cgt19\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 1240, in _send_request
        self.putrequest(method, url, **skips)
      File "C:\Users\cgt19\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 1107, in putrequest
        self._output(request.encode('ascii'))
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 36-39: ordinal not in range(128)
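A likely cause, going by the last frames of the traceback: `http.client` encodes the request line as ASCII (`request.encode('ascii')`), and `url2` is built from a raw `href` that probably contains Chinese characters. The first request works because `urllib.parse.urlencode` already escapes the keyword. Below is a minimal sketch of one possible fix, not a confirmed solution. `to_ascii_url` is a hypothetical helper name, the sample `href` is made up for illustration, and the sketch assumes the host part of the URL is already ASCII.

```python
import string
import urllib.parse


def to_ascii_url(url):
    """Percent-encode every character that is not an ASCII letter, digit or punctuation mark.

    Hypothetical helper: ASCII punctuation (including '%') is declared safe,
    so URL delimiters and existing escapes such as %E7 are left unchanged.
    Assumes the host name is already ASCII; it does not do IDNA encoding.
    """
    return urllib.parse.quote(url, safe=string.punctuation)


# Made-up href for illustration; the real hrefs come from the search page.
sample = "http://baike.baidu.com" + "/item/牛"
print(to_ascii_url(sample))  # http://baike.baidu.com/item/%E7%89%9B
```

In the loop above, the call would then become `urllib.request.urlopen(to_ascii_url(url2))`.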