Posted on 2017-12-1 08:58:10
A regex still has to be written against the actual surrounding markup:
- import re
- target = '''<html>
- <body>
- <img class="BDE_Image" pic_type="0" width="560" height="838" src="https://1.jpg" pic_ext="jpeg"><p>123</p>
- <img class="BDE" pic_type="0" width="560" height="838" src="https://2.jpg" pic_ext="jpeg"><p>123</p>
- <img class="BDE_Image" pic_type="0" width="560" height="838" src="https://3.jpg" pic_ext="jpeg"><span>456</span>
- <img class="BDE123" pic_type="0" width="560" height="838" src="https://4.jpg" pic_ext="jpeg"><span>456</span>
- <img class="BDE_Image" pic_type="0" width="560" height="838" src="https://5.jpg" pic_ext="jpeg">
- <img class="BDE_Image" pic_type="0" width="560" height="838" src="https://6.jpg" pic_ext="jpeg"><span>456</span>
- </body>
- </html>
- '''
- # The pattern has two groups, so findall returns (whole tag, src) tuples.
- html = re.findall(r'(?m)(<img class="BDE_Image".*? src="(?P<w>.*?)" .*?>)', target)
- for x in html:
-     print(x)
Output:
- ('<img class="BDE_Image" pic_type="0" width="560" height="838" src="https://1.jpg" pic_ext="jpeg">', 'https://1.jpg')
- ('<img class="BDE_Image" pic_type="0" width="560" height="838" src="https://3.jpg" pic_ext="jpeg">', 'https://3.jpg')
- ('<img class="BDE_Image" pic_type="0" width="560" height="838" src="https://5.jpg" pic_ext="jpeg">', 'https://5.jpg')
- ('<img class="BDE_Image" pic_type="0" width="560" height="838" src="https://6.jpg" pic_ext="jpeg">', 'https://6.jpg')
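If only the src values are needed, a tighter variant might look like the sketch below. It is not the code from the post above: `sample`, `IMG_SRC` and `extract_srcs` are made-up names, and it assumes the class attribute comes first and is written exactly as `class="BDE_Image"`. The idea is that `.*?` can run past a `>` on the same line, while `[^>]*` and `[^"]*` cannot leave the tag or the quoted value.

```python
import re

# Sample markup shaped like the `target` string above; URLs are placeholders.
sample = '''
<img class="BDE_Image" width="560" src="https://1.jpg" pic_ext="jpeg"><p>123</p>
<img class="BDE" width="560" src="https://2.jpg" pic_ext="jpeg"><p>123</p>
<img class="BDE_Image" width="560" src="https://3.jpg" pic_ext="jpeg">
'''

# [^>]* cannot cross the tag's closing ">" and [^"]* cannot cross the closing
# quote, so a match never spills into a neighbouring tag.
# Assumption: class is the first attribute and is exactly "BDE_Image".
IMG_SRC = re.compile(r'<img class="BDE_Image"[^>]*? src="([^"]*)"[^>]*>')


def extract_srcs(html):
    """Return the src value of every matching <img> tag as a list of strings."""
    # With a single group, findall returns plain strings instead of tuples.
    return IMG_SRC.findall(html)


print(extract_srcs(sample))
```

For anything beyond a fixed, known page layout, a real HTML parser is probably the safer choice; the regex only works because the markup here is predictable.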