Using a regex to match multi-line content in a text and extract the surrounding related lines into a new text file
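As the title says, I'm a beginner. I've been trying to do a multi-line match for "switchport trunk encapsulation dot1q\n switchport mode trunk\n" in a text and then write the surrounding related lines to a new text file. After two weeks of tinkering I still haven't got it working, so I'd appreciate it if someone could show me how to do this.
1. Pattern to match: "interface .* switchport trunk encapsulation dot1q\n switchport mode trunk\n!". The string starts with "interface xx" and ends with "!".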
txt=
"""
!
interface GigabitEthernet1/2
description TO_MDCN_YJL_1F_C1_SW2-H3C-S7606_xg0/0/2
switchport
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 115
switchport mode trunk
channel-protocol lacp
channel-group 10 mode active
!
!
interface Port-channel1
description To-MM-SW-C6509-02-Channel-1
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
storm-control broadcast level 1.00
!
!
interface Port-channel10
description 1JiLou-9306-1
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
!
!
interface GigabitEthernet1/1
description TO_MDCN_YJL_1F_D1_SW1-H3C-S7606_XG0/0/1
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
!
interface GigabitEthernet1/2
description TO_MDCN_YJL_1F_C1_SW2-H3C-S7606_xg0/0/2
switchport
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 115
switchport mode trunk
channel-protocol lacp
channel-group 10 mode active
!
interface GigabitEthernet1/3
description TO_MDCN_YJL_1F_C1_SW2-H3C-S7606_xg0/0/3
switchport
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 115
switchport mode trunk
channel-protocol lacp
channel-group 10 mode active
!
interface GigabitEthernet1/4
description this port is bad
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
speed nonegotiate
storm-control broadcast level 1.00
!
interface GigabitEthernet1/5
description no-use
no ip address
shutdown
!
interface GigabitEthernet1/6
description To_4F-B3-WLAN_Controller1_G0/1
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
storm-control broadcast level 10.00
!
interface GigabitEthernet1/7
description To_4F-B1-MDCN-R2-NE40_G5/0/0
switchport
switchport access vlan 501
switchport mode access
!
"""
2. Expected output:
interface Port-channel1
description To-MM-SW-C6509-02-Channel-1
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
storm-control broadcast level 1.00
interface Port-channel10
description 1JiLou-9306-1
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
interface GigabitEthernet1/1
description TO_MDCN_YJL_1F_D1_SW1-H3C-S7606_XG0/0/1
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
interface GigabitEthernet1/4
description this port is bad
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
speed nonegotiate
storm-control broadcast level 1.00
interface GigabitEthernet1/6
description To_4F-B3-WLAN_Controller1_G0/1
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
storm-control broadcast level 10.00
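3. Because the content spans several lines and the whole text has to be searched, I couldn't get re.match or re.search to work, so I switched to re.findall.
Script 1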
import re
print("".join(re.findall(r'(^interface .* switchport trunk encapsulation dot1q\n switchport mode trunk\n.*!$)',txt)))
Output:
End of script
Process finished with exit code 0
or
Script 2
import re
print("".join(re.findall(r' switchport trunk encapsulation dot1q\n switchport mode trunk\n',txt)))
Output:
switchport trunk encapsulation dot1q
switchport mode trunk
switchport trunk encapsulation dot1q
switchport mode trunk
switchport trunk encapsulation dot1q
switchport mode trunk
switchport trunk encapsulation dot1q
switchport mode trunk
switchport trunk encapsulation dot1q
switchport mode trunk
End of script
Process finished with exit code 0
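Result: script 1 matches nothing, and script 2's output is incomplete because it only returns the two key lines. I'd also like to write the final result to a new text file, ideally through a function call.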
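Script 1 probably returns nothing for two reasons. Without re.M, ^ and $ only anchor at the very start and end of the whole string. Without re.S, "." never matches a newline, so ".*" cannot span from the interface line to the trunk lines. Below is a minimal sketch of a single regex that walks the block line by line instead. The shortened sample and the name BLOCK_RE are made up for illustration, and it assumes "!" only ever appears at the start of a separator line.

```python
import re

# Hypothetical, shortened sample in the same shape as txt above.
sample = """!
interface GigabitEthernet1/2
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 115
 switchport mode trunk
!
interface Port-channel1
 description To-MM-SW-C6509-02-Channel-1
 switchport trunk encapsulation dot1q
 switchport mode trunk
 storm-control broadcast level 1.00
!
"""

BLOCK_RE = re.compile(
    r"^interface[^\n]*\n"             # block starts at an "interface ..." line
    r"(?:(?!!)[^\n]*\n)*?"            # lines that do not start with "!"
    r"[ \t]*switchport trunk encapsulation dot1q\n"
    r"[ \t]*switchport mode trunk\n"  # the two key lines, directly adjacent
    r"(?:(?!!)[^\n]*\n)*",            # rest of the block, up to the next "!"
    re.M,                             # let ^ match at the start of every line
)

blocks = BLOCK_RE.findall(sample)
print("".join(blocks), end="")
```

Because no consumed line may start with "!", a match cannot run across a block boundary. In the sample, GigabitEthernet1/2 is skipped because "allowed vlan" sits between the two key lines.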
This gives me a headache... Pick either one:
result = re.findall(r'^inter.*|^switch.*|^storm.*', txt, re.M)
result = re.findall(r'(?:^inter|^switch|^storm).*', txt, re.M)
suchocolate posted on 2020-4-24 22:19
Pick either one
That doesn't work. The output is as follows:
C:\Users\L\AppData\Local\Programs\Python\Python38\python.exe D:/桌面/code/translator/test1.py
interface GigabitEthernet1/2
interface Port-channel1
interface Port-channel10
interface GigabitEthernet1/1
interface GigabitEthernet1/2
interface GigabitEthernet1/3
interface GigabitEthernet1/4
interface GigabitEthernet1/5
interface GigabitEthernet1/6
interface GigabitEthernet1/7
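End of script
Process finished with exit code 0
It works fine on my side. How are you printing the result?
suchocolate posted on 2020-4-24 23:00
It works fine on my side. How are you printing the result?
I just copied your code and ran it. Screenshot attached.
This post was last edited by suchocolate on 2020-4-25 00:36
菜鸟江湖 posted on 2020-4-24 23:07
I just copied your code and ran it. Screenshot attached.
I see the cause now: in your txt data there are spaces in front of the switch and storm lines, whereas I took the txt data from your opening post as my sample (where no line has leading spaces). Next time, please post sample data as code as well.
Change the code like this: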
result = re.findall(r'(?:^inter|^ *switch|^ *storm).*', txt, re.M)
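To illustrate the point about leading spaces: with re.M, "^switch" only matches a line that has "switch" in column 0, whereas "^ *switch" also accepts indentation. The two-line sample below is hypothetical.

```python
import re

# Hypothetical sample: the second line is indented, as in a real device config.
sample = "interface Port-channel1\n switchport mode trunk\n"

strict = re.findall(r"(?:^inter|^switch).*", sample, re.M)
tolerant = re.findall(r"(?:^inter|^ *switch).*", sample, re.M)

print(strict)    # only the interface line
print(tolerant)  # both lines, indentation included
```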
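This post was last edited by buhaozhao on 2020-4-25 00:29
Add ? after .* so that it becomes .*?,
I misread it, please ignore this.
re.findall('interface.*?\n.*?\nswitchport\nswitchport trunk encapsulation dot1q\nswitchport mode trunk\n.*?\n',txt)
x=txt.split('!')
result=
print(result)
This one takes a bit of a shortcut.
suchocolate posted on 2020-4-25 00:08
I see the cause now: in your txt data there are spaces in front of the switch and storm lines, whereas I took the txt data from your opening post as my sample (where ...
That still doesn't work; it didn't capture the data I need. The output is as follows: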
D:\桌面\code\translator>test1.py
interface GigabitEthernet1/2
switchport
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 115
switchport mode trunk
interface Port-channel1
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
storm-control broadcast level 1.00
interface Port-channel10
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
interface GigabitEthernet1/1
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
interface GigabitEthernet1/2
switchport
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 115
switchport mode trunk
interface GigabitEthernet1/3
switchport
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 115
switchport mode trunk
interface GigabitEthernet1/4
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
storm-control broadcast level 1.00
interface GigabitEthernet1/5
interface GigabitEthernet1/6
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
storm-control broadcast level 10.00
interface GigabitEthernet1/7
switchport
switchport access vlan 501
switchport mode access
D:\桌面\code\translator>
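buhaozhao posted on 2020-4-25 00:51
x=txt.split('!')
result=
print(result)
That doesn't work: it returns all of the strings without filtering out the target data.
菜鸟江湖 posted on 2020-4-26 11:26
That still doesn't work; it didn't capture the data I need. The output is as follows:
D:\桌面\code\translator>test1.py
interface GigabitE ...
Isn't this output exactly what you expected?
suchocolate posted on 2020-4-26 12:26
Isn't this output exactly what you expected?
Sorry, I probably didn't explain it clearly.
Condition 1: the data sits between two "!" lines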
!
xxx
xxx
xxx
!
Condition 2: the xxx data must contain the key lines (a match spanning two lines)
switchport trunk encapsulation dot1q
switchport mode trunk
For example:
interface Port-channel1
description To-MM-SW-C6509-02-Channel-1
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
storm-control broadcast level 1.00
interface GigabitEthernet1/6
description To_4F-B3-WLAN_Controller1_G0/1
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
storm-control broadcast level 10.00
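The two conditions above can be sketched as a split-then-filter function, which also covers the request to write the result to a new file through a function call. The names KEY_RE, extract_trunk_blocks and save_trunk_blocks are hypothetical. The sketch assumes the separators are lines consisting of a single "!", and that the two key lines may or may not be indented.

```python
import re
from pathlib import Path

# Condition 2: the two key lines must be adjacent; indentation is optional.
KEY_RE = re.compile(
    r"^[ \t]*switchport trunk encapsulation dot1q\n"
    r"[ \t]*switchport mode trunk$",
    re.M,
)

def extract_trunk_blocks(text):
    """Return the '!'-delimited blocks that contain the two key lines."""
    blocks = []
    # Condition 1: split on lines that consist of a single "!".
    for chunk in re.split(r"^![ \t]*$", text, flags=re.M):
        block = chunk.strip("\n")
        if block and KEY_RE.search(block):
            blocks.append(block)
    return blocks

def save_trunk_blocks(text, out_path):
    """Write the matching blocks to out_path and return how many there were."""
    blocks = extract_trunk_blocks(text)
    Path(out_path).write_text(
        "".join(block + "\n" for block in blocks), encoding="utf-8"
    )
    return len(blocks)
```

A call such as save_trunk_blocks(txt, "trunk_ports.txt"), where the file name is only an example, should select the same five interfaces as the expected output in the opening post, since blocks with "switchport trunk allowed vlan 115" between the key lines do not satisfy condition 2.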
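suchocolate posted on 2020-4-26 12:26
Isn't this output exactly what you expected?
Thanks, I've solved it. I took a roundabout route and used a different method. Much appreciated!
a=re.sub("switchport trunk encapsulation dot1q\n switchport mode trunk","yes",txt)
suchocolate posted on 2020-4-26 12:26
Isn't this output exactly what you expected?
My final goal was to replace the matched multi-line string "switchport trunk encapsulation dot1q\n switchport mode trunk" with the string I need. I didn't describe it clearly in my first post, so I may have sent everyone on a detour. Thanks!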
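For the replacement itself, a slightly more tolerant variant of the re.sub call might look like the sketch below. PAIR_RE and replace_trunk_pair are hypothetical names and the sample is made up. "[ \t]*" accepts any indentation before the second key line, and re.subn additionally reports how many pairs were replaced.

```python
import re

# Hypothetical sample with indented lines, as in a real device config.
sample = (
    "interface Port-channel10\n"
    " switchport\n"
    " switchport trunk encapsulation dot1q\n"
    " switchport mode trunk\n"
    "!\n"
)

# "[ \t]*" tolerates any indentation before the second key line.
PAIR_RE = re.compile(
    r"switchport trunk encapsulation dot1q\n[ \t]*switchport mode trunk"
)

def replace_trunk_pair(text, replacement):
    """Replace every adjacent pair of key lines; return (new_text, count)."""
    return PAIR_RE.subn(replacement, text)

new_text, count = replace_trunk_pair(sample, "yes")
print(count)
print(new_text, end="")
```

Note that re.sub and re.subn process backslash escapes in the replacement string, so a replacement containing backslashes would need them doubled.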