FishC Forum


response.xpath(): parentheses or angle brackets?

Posted 2022-5-8 21:14:45

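Last edited by lzb1001 on 2022-5-8 21:16

[Image: 123.png]

In Xiaojiayu's video walkthrough, the calls look as if they are written with angle brackets <>.

For example, the first line appears to be:

>>> response.xpath<'//title'>

whereas the textbook has:

>>> response.xpath('//title')

Which one is correct?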
Want to know what Xiaojiayu has been up to lately? Visit -> ilovefishc.com

Posted 2022-5-8 21:17:21 From FishC Mobile
Parentheses, of course.
Angle brackets belong to HTML.

Posted 2022-5-8 21:18:33 From FishC Mobile
Just type it into a shell yourself and you'll see; it only looks like angle brackets.

Posted 2022-5-8 21:20:04

The second one is correct: parentheses.

The font in Xiaojiayu's video just makes them look like angle brackets.


Posted 2022-5-8 21:20:40
A method call always uses parentheses.

Posted 2022-5-8 21:20:58 From FishC Mobile
Parentheses. Python syntax has no angle brackets, only the greater-than and less-than operators; the angle-bracket look is a font issue.
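
To make the point about operators concrete, here is a minimal sketch that uses only the standard library's `ast` module to show how Python parses the two spellings. Nothing is executed, so `response` does not need to exist. The name `other` in the last expression is a hypothetical operand added purely for illustration.

```python
import ast

# The parenthesised form parses as a method call.
call_tree = ast.parse("response.xpath('//title')", mode="eval")
is_call = isinstance(call_tree.body, ast.Call)

# The "angle bracket" form is read as `<` and `>` comparisons. The trailing
# `>` has no right-hand operand, so parsing fails before anything runs.
try:
    ast.parse("response.xpath<'//title'>", mode="eval")
    angle_error = None
except SyntaxError as exc:
    angle_error = type(exc).__name__

# With a (hypothetical) right-hand operand the text does parse, but as a
# chained comparison, not as a call.
cmp_tree = ast.parse("response.xpath < '//title' > other", mode="eval")
op_names = [type(op).__name__ for op in cmp_tree.body.ops]

print(is_call)      # True
print(angle_error)  # SyntaxError
print(op_names)     # ['Lt', 'Gt']
```

So `<` and `>` can only ever act as comparison operators in an expression like this; they never delimit call arguments.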

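Original poster | Posted 2022-5-8 21:39:12
The dmoz example can no longer be followed along, which makes studying really inconvenient. I hope Xiaojiayu can pick a new example and record a new video.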

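Original poster | Posted 2022-5-8 21:42:43
wp231957 posted on 2022-5-8 21:18
Just type it into a shell yourself and you'll see; it only looks like angle brackets.

It's not that I don't want to type it myself; the dmoz example simply can't be followed along, which makes studying really inconvenient.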

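Posted 2022-5-8 21:44:27 From FishC Mobile
lzb1001 posted on 2022-5-8 21:42
It's not that I don't want to type it myself; the dmoz example simply can't be followed along, which makes studying really inconvenient.

Just type any parenthesis and then any angle bracket, and that settles it.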

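Original poster | Posted 2022-5-8 22:05:22
wp231957 posted on 2022-5-8 21:44
Just type any parenthesis and then any angle bracket, and that settles it.

Both return an error, although for different reasons. Having an example that returns correct results makes studying and typing code more motivating; seeing my own typed code produce the same correct result as the video is much more rewarding.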

>>> response.xpath<'//title'>
  File "<stdin>", line 1
    response.xpath<'//title'>
                            ^
SyntaxError: invalid syntax

>>> response.xpath('//title')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'response' is not defined
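
The NameError is expected in a plain Python session: `response` is an object that the Scrapy shell creates after it fetches a page. For practising the call syntax offline, here is a minimal sketch that does not use Scrapy at all. `LocalResponse` and `SAMPLE` are hypothetical names, and the class wraps the standard library's `xml.etree.ElementTree`, which supports only a small XPath subset and returns plain elements rather than Scrapy selectors.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page (well-formed markup); not the dmoz page from the thread.
SAMPLE = """<html><head><title>Python Books</title></head>
<body><ul>
<li><a href="/a">Book A</a></li>
<li><a href="/b">Book B</a></li>
</ul></body></html>"""


class LocalResponse:
    """Hypothetical stand-in for a response object; NOT Scrapy's API."""

    def __init__(self, text):
        self._root = ET.fromstring(text)

    def xpath(self, query):
        # ElementTree rejects absolute paths, so "//x" is rewritten to ".//x".
        if query.startswith("//"):
            query = "." + query
        return self._root.findall(query)


response = LocalResponse(SAMPLE)

titles = [t.text for t in response.xpath('//title')]
hrefs = [a.get("href") for a in response.xpath('//li/a')]

print(titles)  # ['Python Books']
print(hrefs)   # ['/a', '/b']
```

The call shape is the same as in the textbook, `response.xpath('//title')`, with parentheses around a quoted XPath string.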

Original poster | Posted 2022-5-8 22:10:06
isdkz posted on 2022-5-8 21:20
A method call always uses parentheses.


https://fishc.com.cn/thread-213082-1-1.html
--------------------------------------
D:\work\tutorial>scrapy shell 'http://dmoztools.net/Computers/Programming/Languages/Python/Books/'

2022-05-07 23:56:37 [scrapy.utils.log] INFO: Scrapy 2.6.1 started (bot: tutorial)
2022-05-07 23:56:37 [scrapy.utils.log] INFO: Versions: lxml 4.8.0.0, libxml2 2.9.12, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 22.4.0, Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 22.0.0 (OpenSSL 3.0.3 3 May 2022), cryptography 37.0.2, Platform Windows-10-10.0.19041-SP0
2022-05-07 23:56:37 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'tutorial',
'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter',
'LOGSTATS_INTERVAL': 0,
'NEWSPIDER_MODULE': 'tutorial.spiders',
'ROBOTSTXT_OBEY': True,
'SPIDER_MODULES': ['tutorial.spiders']}
2022-05-07 23:56:37 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2022-05-07 23:56:37 [scrapy.extensions.telnet] INFO: Telnet Password: b1f68aed69443fa2
2022-05-07 23:56:37 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole']
2022-05-07 23:56:37 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-05-07 23:56:37 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-05-07 23:56:37 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2022-05-07 23:56:37 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-05-07 23:56:37 [scrapy.core.engine] INFO: Spider opened
2022-05-07 23:56:38 [scrapy.downloadermiddlewares.robotstxt] ERROR: Error downloading <GET http://'http/robots.txt>: invalid hostname: 'http
Traceback (most recent call last):
  File "C:\Users\dell\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\core\downloader\middleware.py", line 49, in process_request
    return (yield download_func(request=request, spider=spider))
ValueError: invalid hostname: 'http
Traceback (most recent call last):
  File "C:\Users\dell\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\dell\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\dell\AppData\Local\Programs\Python\Python37\Scripts\scrapy.exe\__main__.py", line 7, in <module>
  File "C:\Users\dell\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\cmdline.py", line 145, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "C:\Users\dell\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\cmdline.py", line 100, in _run_print_help
    func(*a, **kw)
  File "C:\Users\dell\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\cmdline.py", line 153, in _run_command
    cmd.run(args, opts)
  File "C:\Users\dell\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\commands\shell.py", line 74, in run
    shell.start(url=url, redirect=not opts.no_redirect)
  File "C:\Users\dell\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\shell.py", line 43, in start
    self.fetch(url, spider, redirect=redirect)
  File "C:\Users\dell\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\shell.py", line 111, in fetch
    reactor, self._schedule, request, spider)
  File "C:\Users\dell\AppData\Local\Programs\Python\Python37\lib\site-packages\twisted\internet\threads.py", line 120, in blockingCallFromThread
    result.raiseException()
  File "C:\Users\dell\AppData\Local\Programs\Python\Python37\lib\site-packages\twisted\python\failure.py", line 500, in raiseException
    raise self.value.with_traceback(self.tb)
ValueError: invalid hostname: 'http

D:\work\tutorial>

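According to Xiaojiayu's video, the command should end at a >>> prompt, but that is not what happened above: it dropped back to D:\work\tutorial> with a pile of errors. What went wrong?

I then changed the single quotes around the URL on the command line to double quotes. Are double quotes required here, and are single quotes not allowed?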

D:\work\tutorial>scrapy shell "http://dmoztools.net/Computers/Programming/Languages/Python/Books/"

2022-05-08 00:00:37 [scrapy.utils.log] INFO: Scrapy 2.6.1 started (bot: tutorial)
2022-05-08 00:00:37 [scrapy.utils.log] INFO: Versions: lxml 4.8.0.0, libxml2 2.9.12, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 22.4.0, Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 22.0.0 (OpenSSL 3.0.3 3 May 2022), cryptography 37.0.2, Platform Windows-10-10.0.19041-SP0
2022-05-08 00:00:37 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'tutorial',
'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter',
'LOGSTATS_INTERVAL': 0,
'NEWSPIDER_MODULE': 'tutorial.spiders',
'ROBOTSTXT_OBEY': True,
'SPIDER_MODULES': ['tutorial.spiders']}
2022-05-08 00:00:37 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2022-05-08 00:00:37 [scrapy.extensions.telnet] INFO: Telnet Password: bb2332e9ee9c5698
2022-05-08 00:00:37 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole']
2022-05-08 00:00:38 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-05-08 00:00:38 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-05-08 00:00:38 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2022-05-08 00:00:38 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-05-08 00:00:38 [scrapy.core.engine] INFO: Spider opened
2022-05-08 00:00:38 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://dmoztools.net/robots.txt> from <GET http://dmoztools.net/robots.txt>
2022-05-08 00:00:38 [py.warnings] WARNING: C:\Users\dell\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\core\engine.py:276: ScrapyDeprecationWarning: Passing a 'spider' argument to ExecutionEngine.download is deprecated
  return self.download(result, spider) if isinstance(result, Request) else result

2022-05-08 00:00:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://dmoztools.net/robots.txt> (referer: None)
2022-05-08 00:00:40 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://dmoztools.net/Computers/Programming/Languages/Python/Books/> from <GET http://dmoztools.net/Computers/Programming/Languages/Python/Books/>
2022-05-08 00:00:40 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://dmoztools.net/Computers/Programming/Languages/Python/Books/> (referer: None)
2022-05-08 00:00:41 [asyncio] DEBUG: Using selector: SelectSelector
Available Scrapy objects:
   scrapy     scrapy module (contains scrapy.Request, scrapy.Selector, etc)
   crawler    <scrapy.crawler.Crawler object at 0x000001A105E874C8>
   item       {}
   request    <GET http://dmoztools.net/Computers/Programming/Languages/Python/Books/>
   response   <404 https://dmoztools.net/Computers/Programming/Languages/Python/Books/>
   settings   <scrapy.settings.Settings object at 0x000001A105E83F08>
   spider     <DmoztoolsSpider 'dmoztools' at 0x1a10656c308>
Useful shortcuts:
   fetch(url[, redirect=True]) Fetch URL and update local objects (by default, redirects are followed)
   fetch(req)                  Fetch a scrapy.Request and update local objects
   shelp()           Shell help (print this help)
   view(response)    View response in a browser
2022-05-08 00:00:41 [asyncio] DEBUG: Using selector: SelectSelector
In [1]:

Why does the output show strikethrough text? Here is a screenshot of the returned result:
[Image: 000405n4f6fnb6opm86r66.png.thumb.jpg]
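
On the single-quote question: cmd.exe on Windows does not treat single quotes as quoting characters, so they reach the program as part of the argument, whereas a POSIX shell strips them. The sketch below uses only the standard library. `shlex.split` models POSIX-style splitting, and `urlsplit` shows the hostname that results when the quote stays in the URL. The `http://` prefix on `quoted_url` is an assumption inferred from the logged request `http://'http/robots.txt`, and the variable names are illustrative.

```python
import shlex
from urllib.parse import urlsplit

url = "http://dmoztools.net/Computers/Programming/Languages/Python/Books/"

# POSIX-style splitting removes the single quotes around the argument.
posix_args = shlex.split(f"scrapy shell '{url}'")

# cmd.exe leaves single quotes in place, so the program receives them as
# literal characters (modelled here by keeping them in the string).
cmd_arg = f"'{url}'"

# Assumption: a scheme gets prepended because the quoted string has no valid
# one, which would match the request URL shown in the first log.
quoted_url = "http://" + cmd_arg

bad_host = urlsplit(quoted_url).hostname
good_host = urlsplit(url).hostname

print(posix_args[-1])  # the URL without quotes
print(bad_host)        # 'http
print(good_host)       # dmoztools.net
```

The hostname `'http` matches the `invalid hostname: 'http` error in the first run. Double quotes are understood by both cmd.exe and POSIX shells, which is consistent with the second run reaching an interactive prompt.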

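Posted 2022-5-8 22:10:45 From FishC Mobile
lzb1001 posted on 2022-5-8 22:05
Both return an error, although for different reasons. Having an example that returns correct results makes studying and typing code more motivating; seeing my own typed code ...

I only asked you to type the brackets and see what they look like, nothing more.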

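Original poster | Posted 2022-5-8 22:12:33
wp231957 posted on 2022-5-8 22:10
I only asked you to type the brackets and see what they look like, nothing more.

Oh, I see. In the video they looked a lot like angle brackets, which confused me, so I asked.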