Posted on 2017-6-12 00:49:01
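In early May this year I bought 小甲鱼's book 《零基础学习Python》 on Taobao (a genuine copy, mind you), and so far I have self-studied up to 西瓜皮.
Yesterday I watched the OOXX video and modified 小甲鱼's code a bit, adding some code to catch URL and HTTP errors. I picked a girls' photo site on my own to crawl, and I print every address as it is fetched. Errors like the one below still show up often. These addresses really do not exist, but the site automatically redirects them to its home page. Most of the time this case is silently ignored, but sometimes it triggers the error below.
I searched around, and the explanation I found is that it is caused by the browser closing too early during testing, but I don't know how to fix it.
I wanted to start my own thread, but found I don't have permission. Awkward......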
Summoning @小甲鱼老师...
Traceback (most recent call last):
  File "D:\python_xiaojiayu\Crawler\download_mm.py", line 70, in <module>
    download_mm()
  File "D:\python_xiaojiayu\Crawler\download_mm.py", line 66, in download_mm
    img_addrs = find_imgs(page_url)
  File "D:\python_xiaojiayu\Crawler\download_mm.py", line 15, in find_imgs
    html = url_open(url).decode('utf-8')
  File "D:\python_xiaojiayu\Crawler\download_mm.py", line 10, in url_open
    html = urllib.request.urlopen(req).read()
  File "D:\Python\lib\urllib\request.py", line 162, in urlopen
    return opener.open(url, data, timeout)
  File "D:\Python\lib\urllib\request.py", line 465, in open
    response = self._open(req, data)
  File "D:\Python\lib\urllib\request.py", line 483, in _open
    '_open', req)
  File "D:\Python\lib\urllib\request.py", line 443, in _call_chain
    result = func(*args)
  File "D:\Python\lib\urllib\request.py", line 1283, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "D:\Python\lib\urllib\request.py", line 1243, in do_open
    r = h.getresponse()
  File "D:\Python\lib\http\client.py", line 1174, in getresponse
    response.begin()
  File "D:\Python\lib\http\client.py", line 282, in begin
    version, status, reason = self._read_status()
  File "D:\Python\lib\http\client.py", line 251, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
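For what it's worth, the last line of the traceback says the server closed the connection before sending any response. http.client.RemoteDisconnected is raised by http.client and is not a subclass of urllib.error.URLError, so an `except` clause that only names URLError and HTTPError lets it through. Below is a minimal sketch of a url_open that also catches it and retries. The `retries`, `delay` and `opener` parameters, the User-Agent value and the 10-second timeout are my own assumptions for illustration; they are not from 小甲鱼's code.

```python
import http.client
import time
import urllib.error
import urllib.request


def url_open(url, retries=3, delay=1.0, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if every attempt fails.

    retries, delay and opener are hypothetical parameters added for this
    sketch; opener exists only so the function can be tried without a network.
    """
    req = urllib.request.Request(url)
    # Assumption: a browser-like User-Agent; the value here is a placeholder.
    req.add_header('User-Agent', 'Mozilla/5.0')

    for attempt in range(1, retries + 1):
        try:
            with opener(req, timeout=10) as response:
                return response.read()
        except urllib.error.HTTPError as e:
            # The server did answer, just with an error status: give up.
            # HTTPError must be caught before URLError (it is a subclass).
            print('HTTP error', e.code, 'for', url)
            return None
        except (urllib.error.URLError, http.client.HTTPException,
                ConnectionError) as e:
            # RemoteDisconnected is both an HTTPException and a
            # ConnectionResetError, so it is handled here.
            print('attempt', attempt, 'failed for', url, ':', repr(e))
            if attempt < retries:
                time.sleep(delay)  # short pause before the next try
    return None
```

With this version, find_imgs would need to check for None before calling .decode('utf-8'), for example by skipping that page when url_open returns None. Whether retrying actually helps depends on why the site drops the connection, so treat this as one thing to try rather than a guaranteed fix.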