Problem matching the end of a string with a regular expression
import re
ui='<video class="video-small" id="videosmall" autoplay="" crossorigin="anonymous" loop="" src="https://vd2.bdstatic.com/mda-nckd33p1p6dd8igv/cae_h264_delogo/1647854913822034441/mda-nckd33p1p6dd8igv.mp4?v_from_s=hkapp-haokan-hnb&auth_key=1647921548-0-0-5d05585767419032e4a3f644fa853af1&bcevod_channel=searchbox_feed&pd=1&vt=1&cd=0&watermark=0&did=&logid=1748363977&vid=911389646208710823&pt=0&appver=&model=&cr=0&abtest=peav_l52&sle=1&sl=320&split=298735" style="width: 238px; height: 134px;"></video><video class="art-video" preload="metadata" crossorigin="anonymous" autoplay="" src="https://vd4.bdstatic.com/mda-nckhek3dbwp00478/sc/cae_h264_delogo/1647866695166866916/mda-nckhek3dbwp00478.mp4?v_from_s=hkapp-haokan-hnb&auth_key=1647921548-0-0-75ad69749dc9894b70fefe2ae6f61558&bcevod_channel=searchbox_feed&cd=0&pd=1&pt=3&logid=1747979696&vid=9340172854313612982&abtest=100815_1-101130_2-17451_2&klogid=1747979696"></video><video class="video-small" id="videosmall" autoplay="" crossorigin="anonymous" loop="" src="https://vd2.bdstatic.com/mda-nbkg882mxu2ddzaa/cae_h264_nowatermark_delogo/1645443119480177002/mda-nbkg882mxu2ddzaa.mp4?v_from_s=hkapp-haokan-hnb&auth_key=1647921548-0-0-22a9ee13589689c684cc3b734b4fd39d&bcevod_channel=searchbox_feed&pd=1&vt=1&cd=0&watermark=0&did=&logid=1748363977&vid=8319953855371380823&pt=0&appver=&model=&cr=0&abtest=peav_l52&sle=1&sl=527&split=469076" style="width: 238px; height: 134px;"></video>'
t=re.findall('^.*split=\d*$',ui)
print(t)
I want to extract the links that start at src and end with split= followed by a run of digits, but the result is an empty list.

import re
ui='<video class="video-small" id="videosmall" autoplay="" crossorigin="anonymous" loop="" src="https://vd2.bdstatic.com/mda-nckd33p1p6dd8igv/cae_h264_delogo/1647854913822034441/mda-nckd33p1p6dd8igv.mp4?v_from_s=hkapp-haokan-hnb&auth_key=1647921548-0-0-5d05585767419032e4a3f644fa853af1&bcevod_channel=searchbox_feed&pd=1&vt=1&cd=0&watermark=0&did=&logid=1748363977&vid=911389646208710823&pt=0&appver=&model=&cr=0&abtest=peav_l52&sle=1&sl=320&split=298735" style="width: 238px; height: 134px;"></video><video class="art-video" preload="metadata" crossorigin="anonymous" autoplay="" src="https://vd4.bdstatic.com/mda-nckhek3dbwp00478/sc/cae_h264_delogo/1647866695166866916/mda-nckhek3dbwp00478.mp4?v_from_s=hkapp-haokan-hnb&auth_key=1647921548-0-0-75ad69749dc9894b70fefe2ae6f61558&bcevod_channel=searchbox_feed&cd=0&pd=1&pt=3&logid=1747979696&vid=9340172854313612982&abtest=100815_1-101130_2-17451_2&klogid=1747979696"></video><video class="video-small" id="videosmall" autoplay="" crossorigin="anonymous" loop="" src="https://vd2.bdstatic.com/mda-nbkg882mxu2ddzaa/cae_h264_nowatermark_delogo/1645443119480177002/mda-nbkg882mxu2ddzaa.mp4?v_from_s=hkapp-haokan-hnb&auth_key=1647921548-0-0-22a9ee13589689c684cc3b734b4fd39d&bcevod_channel=searchbox_feed&pd=1&vt=1&cd=0&watermark=0&did=&logid=1748363977&vid=8319953855371380823&pt=0&appver=&model=&cr=0&abtest=peav_l52&sle=1&sl=527&split=469076" style="width: 238px; height: 134px;"></video>'
t=re.findall(r'src="([^>]*?split=\d*)"',ui)
print(len(t))
print(t)

isdkz posted on 2022-3-22 14:02
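You're amazing!!
Does adding the parentheses mean that only the part inside the parentheses is returned, and does "t=\d*)" mean that it has to end with digits?
Also, in [^>], ^ means the start, so what does > mean when it is combined with ^? Could you explain that in detail?

isdkz posted on 2022-3-22 14:02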
And if, for example, I want to find something that starts with http and ends with mp4, can I write it like this?
t=re.findall(r'^http.*\.mp4$',ui)

hunter魔术师 posted on 2022-3-22 14:19
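You're amazing!!
Does adding the parentheses mean that only the part inside the parentheses is returned, and does "t=\d*)" mean that it has to end with digits?
...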
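When the pattern contains a group, findall returns only what the group captured.
^ and $ anchor the pattern to the start and end of the whole string (a single line here), so if the string does not begin and end that way, nothing matches.
That is why you should not use ^ and $ here. The ^ I put inside the square brackets does not mean "start"; it means negation: [^>] matches any character except >.
Try replacing [^>] with . and you will see why I used [^>]: . matches any character, so the match can run straight past a tag that has already been closed.

isdkz posted on 2022-3-22 14:26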
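When the pattern contains a group, findall returns only what the group captured.
^ and $ anchor the pattern to the start and end of the whole string (a single line here), so if the string does not begin and end that way, nothing matches. ...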
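The three points in that explanation (anchors, capture groups, and the negated class) can be checked with a small sketch. It uses a shortened, made-up `html` string in place of `ui`; the variable name and the example.com URLs are assumptions for illustration only.

```python
import re

# Hypothetical, shortened stand-in for ui: three tags, the middle one has no split= parameter.
html = (
    '<video src="https://example.com/a.mp4?x=1&split=111" style="w"></video>'
    '<video src="https://example.com/b.mp4?x=2"></video>'
    '<video src="https://example.com/c.mp4?x=3&split=333" style="w"></video>'
)

# ^ and $ pin the match to the whole string, which ends in </video>, not in digits.
anchored = re.findall(r'^.*split=\d*$', html)

# With one capture group, findall returns the group's text, not the whole match.
grouped = re.findall(r'src="([^>]*?split=\d*)"', html)

# Swapping [^>] for . lets the match run past a closed tag into the next one.
dotted = re.findall(r'src="(.*?split=\d*)"', html)

print(anchored)                 # empty list
print(grouped)                  # the two URLs that carry split=
print(len(dotted), dotted[1])   # second item spans two tags
```

In this sketch the `.` variant still returns two items, but the second one starts at the middle tag's URL and swallows `"></video><video src="` before reaching the next `split=`, which is the behaviour `[^>]` is there to prevent.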
Then what about matching something that starts with http and ends with mp4?
t=re.findall(r'(https://.*\w*\.mp4)',ui) Written like this, I don't get the result.

hunter魔术师 posted on 2022-3-22 14:37
Then what about matching something that starts with http and ends with mp4?
t=re.findall(r'(https://.*\w*\.mp4)',ui) Written like this, I don't get the result.
If you want everything the regular expression matches, you don't need a group.
t=re.findall(r'https://.*?\.mp4',ui)

isdkz posted on 2022-3-22 14:40
If you want everything the regular expression matches, you don't need a group.
t=re.findall(r'https://.*?\.mp4',ui)
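To see the difference between the greedy attempt and the non-greedy pattern, here is a minimal sketch on a shortened, made-up `html` string (the variable name and the example.com URLs are assumptions, not the real data). The dot before mp4 is escaped so that it matches only a literal dot.

```python
import re

# Hypothetical, shortened stand-in for ui (made-up URLs).
html = (
    '<video src="https://example.com/v/a.mp4?x=1&split=111"></video>'
    '<video src="https://example.com/v/b.mp4?x=2"></video>'
)

# Greedy .* runs from the first https:// to the last .mp4, giving one oversized match.
greedy = re.findall(r'https://.*\.mp4', html)

# Lazy .*? stops at the first .mp4 after each https://; \. matches a literal dot.
lazy = re.findall(r'https://.*?\.mp4', html)

print(len(greedy))
print(lazy)
```

On this input the greedy pattern yields a single string that spans both tags, while the lazy one yields one URL per tag, cut off right after `.mp4` (the query string is not included).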