Posted on 2017-7-3 00:08:49

I don't really understand the greedy vs. non-greedy terminology, but if you
change url a bit, the difference becomes visible:
>>> import re
>>> url = '<span>"|</span><a target="_blank" >帮助</a><span>"|</span><a target="_blank" >帮助</a><span>"|</span><a target="_blank" >帮助</a>'
>>> # Python 2 prints the UTF-8 bytes of 帮助 as \xe5\xb8\xae\xe5\x8a\xa9
>>> res = re.findall('>(.*?)</a>',url)
>>> res
['"|</span><a target="_blank" >\xe5\xb8\xae\xe5\x8a\xa9', '"|</span><a target="_blank" >\xe5\xb8\xae\xe5\x8a\xa9', '"|</span><a target="_blank" >\xe5\xb8\xae\xe5\x8a\xa9']
>>> res = re.findall('>(.*)</a>',url)
>>> res
['"|</span><a target="_blank" >\xe5\xb8\xae\xe5\x8a\xa9</a><span>"|</span><a target="_blank" >\xe5\xb8\xae\xe5\x8a\xa9</a><span>"|</span><a target="_blank" >\xe5\xb8\xae\xe5\x8a\xa9']
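The output is a bit messy, but the point is: .* matched the longest possible span, so everything came back as a single result, while .*? found every shortest match, giving three results.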
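For anyone on Python 3, here is a rough sketch of the same comparison, where the strings print as readable text. The variable names (html, lazy, greedy, link_text) and the third pattern, [^<>]*, are my own additions and not part of the original test; that last pattern is just one possible way to grab only the link text, assuming the text itself contains no angle brackets.

```python
import re

# Same test string as above, split across lines for readability.
html = ('<span>"|</span><a target="_blank" >帮助</a>'
        '<span>"|</span><a target="_blank" >帮助</a>'
        '<span>"|</span><a target="_blank" >帮助</a>')

# Non-greedy: .*? stops at the first </a> it can reach, so there is one match per link.
lazy = re.findall(r'>(.*?)</a>', html)

# Greedy: .* runs on to the last </a>, so everything collapses into one match.
greedy = re.findall(r'>(.*)</a>', html)

# Assumed alternative (not from the original post): forbid < and > inside the
# group, so the match cannot start at the '>' of an earlier tag.
link_text = re.findall(r'>([^<>]*)</a>', html)

print(len(lazy))    # 3
print(len(greedy))  # 1
print(link_text)    # ['帮助', '帮助', '帮助']
```

Note that both .*? and .* still start at the first '>' in the string (the end of the span tag), which is why the captured text includes the leftover span markup. The negated character class sidesteps that, but for real HTML a proper parser is probably the safer choice.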