The concept of backreferences in regular expressions

木一乡 · Posted on 2018-8-30 23:21:36

import re
p = re.compile(r'(\b\w+)\s')
p.search('Paris in the the spring')
<_sre.SRE_Match object; span=(0, 6), match='Paris '>
Finding 'Paris ' this way is no problem; I understand that.
p = re.compile(r'(\b\w+)\s\1')
p.search('Paris in the the spring')
<_sre.SRE_Match object; span=(9, 16), match='the the'>
By my reasoning, \1 should repeat the content of (\b\w+), so the result should be 'Paris Paris'. Why is the result 'the the'?

Help, please!

塔利班 · Posted on 2018-8-31 07:18:02

It matches repeated content; it does not automatically repeat the group for you.
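
A minimal sketch of why the search lands on the the (the variable names are illustrative, not from the thread): search tries start positions from left to right, and \1 only succeeds where the text that group 1 just captured appears again. Starting at position 0 the group captures Paris, but the text after the space is in, so that attempt fails and the scan moves on.

```python
import re

text = 'Paris in the the spring'
p = re.compile(r'(\b\w+)\s\1')

# match() only tries position 0: group 1 would be 'Paris',
# but 'Paris' is not repeated after the space, so there is no match.
at_start = p.match(text)
print(at_start)  # None

# search() keeps scanning and stops at the first position that works.
m = p.search(text)
print(m.group(1))  # the  (what the group captured)
print(m.group(0))  # the the  (group 1, whitespace, then \1)
print(m.span())    # (9, 16)
```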

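木一乡 · Posted on 2018-8-31 21:27:37

塔利班 posted on 2018-8-31 07:18
It matches repeated content; it does not automatically repeat the group for you.

That doesn't seem right.
If it matches repeated content, then with three occurrences of the, the result should be the the the. But when I tried it, it still returned the the, one short.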
import re
p = re.compile(r'(\b\w+)\s\1')
p.search('Paris in the the the spring')
<_sre.SRE_Match object; span=(9, 16), match='the the'>

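塔利班 · Posted on 2018-8-31 21:31:56

\1 only matches the same content as the subgroup formed by the first pair of parentheses. Together with the original subgroup that makes 2, so where would a third come from?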
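
If the goal is to match a whole run of three or more repeats, one possible variant (an assumption about the intent, not something proposed in the thread) wraps the whitespace and the backreference in a non-capturing group and applies + to it. The names p_pair and p_run are illustrative.

```python
import re

text = 'Paris in the the the spring'

# Original pattern: group 1 plus exactly one repeat of it.
p_pair = re.compile(r'(\b\w+)\s\1')
pair = p_pair.search(text)
print(pair.group(0))  # the the

# Variant: group 1 followed by one or more "whitespace + same word".
p_run = re.compile(r'(\b\w+)(?:\s\1)+')
run = p_run.search(text)
print(run.group(0))  # the the the
print(run.span())    # (9, 20)
```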

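木一乡 · Posted on 2018-8-31 23:56:41

塔利班 posted on 2018-8-31 21:31
\1 only matches the same content as the subgroup formed by the first pair of parentheses. Together with the original subgroup that makes 2, so where would a third come from?

Put that way: the only capture of the first subgroup that is immediately followed by a copy of itself is the, so the group is a single the. Adding \1 then matches that the once more, which gives (the the).
Verifying with Paris Paris in the the spring: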
>>> import re
>>> p = re.compile(r'(\b\w+)\s\1')
>>> p.search('Paris Paris in the the spring')
<_sre.SRE_Match object; span=(0, 11), match='Paris Paris'>
With \b in place, the check passes.
Thanks a lot, expert!
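
To see what \b actually contributes, here is a small sketch comparing the pattern with and without it. The sample string is hypothetical and does not come from the thread. Without the word boundary, group 1 may start in the middle of a word, so the tail of one word can be "repeated" by the next word.

```python
import re

sample = 'this is fine'  # hypothetical input, not from the thread

with_b = re.compile(r'(\b\w+)\s\1')
without_b = re.compile(r'(\w+)\s\1')

# With \b, group 1 must start at a word boundary, so nothing matches.
res_with = with_b.search(sample)
print(res_with)  # None

# Without \b, group 1 can start inside 'this' and capture 'is'.
res_without = without_b.search(sample)
print(res_without.group(0))  # is is
print(res_without.span())    # (2, 7)
```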
