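I'm using Python 3.4.4. I saw that Scrapy 1.1 supports Python 3, but I don't know whether that has anything to do with this error. Could anyone point me in the right direction?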
C:\tutorial>scrapy crawl dmoz
2016-06-28 14:20:56 [scrapy] INFO: Scrapy 1.1.0 started (bot: tutorial)
2016-06-28 14:20:56 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'tutorial.spiders', 'ROBOTSTXT_OBEY': True, 'BOT_NAME': 'tutorial', 'SPIDER_MODULES': ['tutorial.spiders']}
2016-06-28 14:20:56 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats',
'scrapy.extensions.corestats.CoreStats']
2016-06-28 14:20:56 [scrapy] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2016-06-28 14:20:56 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2016-06-28 14:20:56 [scrapy] INFO: Enabled item pipelines:
[]
2016-06-28 14:20:56 [scrapy] INFO: Spider opened
2016-06-28 14:20:56 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-06-28 14:20:56 [scrapy] ERROR: Error downloading <GET http://www.dmoz.org/robots.txt>: name 'x' is not defined
Traceback (most recent call last):
  File "c:\python34\lib\site-packages\twisted\internet\defer.py", line 1126, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "c:\python34\lib\site-packages\twisted\python\failure.py", line 389, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "c:\python34\lib\site-packages\scrapy\core\downloader\middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
NameError: name 'x' is not defined
2016-06-28 14:20:56 [scrapy] ERROR: Error downloading <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Books/>
Traceback (most recent call last):
  File "c:\python34\lib\site-packages\twisted\internet\defer.py", line 1126, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "c:\python34\lib\site-packages\twisted\python\failure.py", line 389, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "c:\python34\lib\site-packages\scrapy\core\downloader\middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
NameError: name 'x' is not defined
2016-06-28 14:20:56 [scrapy] ERROR: Error downloading <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/>
Traceback (most recent call last):
  File "c:\python34\lib\site-packages\twisted\internet\defer.py", line 1126, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "c:\python34\lib\site-packages\twisted\python\failure.py", line 389, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "c:\python34\lib\site-packages\scrapy\core\downloader\middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
NameError: name 'x' is not defined
2016-06-28 14:20:56 [scrapy] INFO: Closing spider (finished)
2016-06-28 14:20:56 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 3,
'downloader/exception_type_count/builtins.NameError': 3,
'downloader/request_bytes': 734,
'downloader/request_count': 3,
'downloader/request_method_count/GET': 3,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2016, 6, 28, 6, 20, 56, 976625),
'log_count/ERROR': 3,
'log_count/INFO': 7,
'scheduler/dequeued': 2,
'scheduler/dequeued/memory': 2,
'scheduler/enqueued': 2,
'scheduler/enqueued/memory': 2,
'start_time': datetime.datetime(2016, 6, 28, 6, 20, 56, 742250)}
2016-06-28 14:20:56 [scrapy] INFO: Spider closed (finished)
C:\tutorial>
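
Since the question is whether the Python/Scrapy version combination matters, it may help to post the exact versions of everything involved alongside the log. Here is a minimal sketch for collecting them. The helper name `collect_versions` is made up for illustration, and it assumes each package exposes a `__version__` attribute (Scrapy and Twisted both do, as far as I know). It only reports versions; it does not diagnose the `NameError` itself.

```python
import importlib
import platform
import sys


def collect_versions(packages=("scrapy", "twisted")):
    """Return a dict with the interpreter, OS, and package versions.

    Hypothetical helper for bug reports; not part of Scrapy or Twisted.
    """
    info = {
        "python": platform.python_version(),
        "platform": sys.platform,
    }
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is missing from this interpreter's site-packages.
            info[name] = "not installed"
        else:
            # Assumption: the package exposes __version__; fall back if not.
            info[name] = getattr(module, "__version__", "unknown")
    return info


if __name__ == "__main__":
    for key, value in sorted(collect_versions().items()):
        print("{}: {}".format(key, value))
```

Run it with the same interpreter that runs `scrapy` (here, the one under `c:\python34`), so the versions reported are the ones that appear in the traceback paths.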