[Solved] Regex to match multiple '0' characters

duke0522 · posted 2020-10-10 12:15:36

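Last edited by duke0522 on 2020-10-10 14:13

I want to strip all the leading '0's from the numeric part of a filename, for example renaming 0001.txt to 1.txt. The regex I wrote is re.compile(r'^(.*)(0+)(.*)$'), but when I run it I get a filename with only one '0' removed: 001.txt.
Could anyone tell me what is wrong with my regex?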

疾风怪盗 · posted 2020-10-10 12:27:11

import re
text = 'sdaa000121sada1.txt'
pattern = re.compile(r'^(.*)(0+)(.+)$')
print(pattern.fullmatch(text))

Is this what you mean? What does your (.x) mean?

一世长安呢 · posted 2020-10-10 13:13:27

What do the filenames look like? Do you only want to remove the leading 0s?

hrp · posted 2020-10-10 13:43:39

Last edited by hrp on 2020-10-10 13:55

import re
a = '0db001s.txt'
b = re.sub(r'^0+', '', a)
print(b)
# db001s.txt

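sunrise085 · posted 2020-10-10 14:14:00

I don't know what your filenames look like, so I'll assume they all follow the pattern in your example.
Here are two approaches.

import re
str1 = ["0001.txt", "0012.txt", "1002.txt"]
# Approach 1: substitute with sub
for each in str1:
    print(re.sub(r"^0*", "", each))  # just strip the 0s at the start
# Approach 2: capture with a group
# [0]* skips the leading 0s. The parenthesized group is the filename you want:
# \d* matches the digits, \. matches the dot, and the final .+ matches the extension.
p = re.compile(r"[0]*(\d*\..+)")
for each in str1:
    result = p.match(each)
    print(result.group(1))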

duke0522 · posted 2020-10-10 14:14:00

疾风怪盗 wrote on 2020-10-10 12:27
Is this what you mean? What does your (.x) mean?

That was a typo; it should be *

import re
text = 'sdaa000121sada1.txt'
pattern = re.compile(r'^(.*)(0+)(.+)')
mo = pattern.search(text)
print(mo.group(2))

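Group 2 matches only one 0, not three.

duke0522 · posted 2020-10-10 14:14:35

Last edited by duke0522 on 2020-10-10 14:19

一世长安呢 wrote on 2020-10-10 13:13
What do the filenames look like? Do you only want to remove the leading 0s?

Yes. For example, good002.txt should become good2.txt.

duke0522 · posted 2020-10-10 14:16:42

Last edited by duke0522 on 2020-10-10 14:20

hrp wrote on 2020-10-10 13:43

For example, good002.txt should become good2.txt. That is the result I'm after.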

疾风怪盗 · posted 2020-10-10 14:17:36

Last edited by 疾风怪盗 on 2020-10-10 14:24

duke0522 wrote on 2020-10-10 14:14
That was a typo; it should be *

import re
text = 'good00002.txt'
pattern = re.compile(r'^(.*)(0+)(.*)$')
print(pattern.fullmatch(text))

<re.Match object; span=(0, 13), match='good00002.txt'>

Process finished with exit code 0

It matches the whole thing, doesn't it? So what exactly is your problem?

duke0522 · posted 2020-10-10 14:29:40

疾风怪盗 wrote on 2020-10-10 14:17
Process finished with exit code 0

It matches the whole thing, doesn't it? So what exactly is your problem?

import re, os, shutil
# Create a regex that matches files with '0' in the filename.
zeroPattern = re.compile(r'^(.*)(0+)(.*)$')
# Loop over the files in the working directory.
for filename in os.listdir():
    mo = zeroPattern.search(filename)
    # Skip file without '0' in the filename.
    if mo is None:
        continue
    # Get the different parts of the filename.
    beforePart = mo.group(1)
    zeroPart = mo.group(2)
    afterPart = mo.group(3)
    # Form the filename without '0'.
    zeroFilename = beforePart + afterPart
    # Get the full, absolute file paths.
    absWorkingDir = os.path.abspath('.')
    filePath = os.path.join(absWorkingDir, filename)
    zeroFilePath = os.path.join(absWorkingDir, zeroFilename)
    # Rename the files.
    print("Renaming '%s' to '%s'" % (filePath, zeroFilePath))
    #shutil.move(filePath, zeroFilePath) # uncomment after test

My goal is to rename abd001.txt and agsad01.txt to abd1.txt and agsad1.txt respectively. But my code turns them into abd01.txt and agsad1.txt.

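疾风怪盗 · posted 2020-10-10 14:55:30

Last edited by 疾风怪盗 on 2020-10-10 15:00

duke0522 wrote on 2020-10-10 14:29
My goal is to rename abd001.txt and agsad01.txt to abd1.txt and agsad1.txt respectively. But my code turns them into abd01 ...

Why make it so complicated? The regex itself is fine and it does find a match. The problem is in your groups. Print them and take a look:

beforePart = mo.group(1)
zeroPart = mo.group(2)
afterPart = mo.group(3)

What are these three values?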
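
For reference, a minimal sketch of that check, run against the two filenames from the question rather than a real directory. The helper name split_groups is made up for illustration.

```python
import re

# The original greedy pattern from the question.
zero_pattern = re.compile(r'^(.*)(0+)(.*)$')

def split_groups(filename):
    """Return the three captured groups, or None if there is no match."""
    mo = zero_pattern.search(filename)
    return mo.groups() if mo else None

for name in ('abd001.txt', 'agsad01.txt'):
    print(name, '->', split_groups(name))
# abd001.txt -> ('abd0', '0', '1.txt')
# agsad01.txt -> ('agsad', '0', '1.txt')
```

For abd001.txt the first group has swallowed one of the zeros, which is why beforePart + afterPart comes out as abd01.txt.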

Also, if all you want is to remove the 0s, why not use replace? You don't really need a regex for this.

import re, os, shutil
# Create a regex that matches files with '0' in the filename.
zeroPattern = re.compile(r'(.*)(0+)(.*)')
# Loop over the files in the working directory.
for filename in os.listdir(r'D:\python\test\1'):
    print(filename)
    mo = zeroPattern.search(filename)
    # Skip file without '0' in the filename.
    if mo is None:
        continue
    zeroFilename = mo.group().replace('0', '')
    print(zeroFilename)
    absWorkingDir = os.path.abspath('.')
    filePath = os.path.join(absWorkingDir, filename)
    zeroFilePath = os.path.join(absWorkingDir, zeroFilename)
    # Rename the files.
    print("Renaming '%s' to '%s'" % (filePath, zeroFilePath))
    #shutil.move(filePath, zeroFilePath) # uncomment after test
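
One thing worth checking before relying on replace: it removes every '0' in the name, not only the leading ones. A quick sketch, using sample filenames taken from elsewhere in this thread (the list itself is just an assumption for illustration):

```python
# Sample names only; no files are touched.
names = ['abd001.txt', 'agsad01.txt', 'good0020.txt', '1002.txt']

# str.replace drops every occurrence of '0'.
stripped = [name.replace('0', '') for name in names]

for before, after in zip(names, stripped):
    print(before, '->', after)
# abd001.txt -> abd1.txt
# agsad01.txt -> agsad1.txt
# good0020.txt -> good2.txt
# 1002.txt -> 12.txt
```

That is fine for the two filenames in the question, but it would change the number itself in names such as good0020.txt.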

hrp · posted 2020-10-10 14:59:36

Last edited by hrp on 2020-10-10 16:18

duke0522 wrote on 2020-10-10 14:16
For example, good002.txt should become good2.txt. That is the result I'm after.

import re
a = 'good0020.txt'
# Use a lookahead assertion
b = re.sub(r'0+(?=[1-9]+.*\..*$)', '', a)
print(b)
# good20.txt
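
A note on the lookahead, since it may be unfamiliar: (?=...) only checks that the text ahead matches, without consuming it, so sub replaces just the zeros and leaves what follows in place. The sketch below is a variation and not the pattern above. It assumes the goal is to strip leading zeros from each run of digits, and adds a negative lookbehind (?<!\d) so that zeros in the middle of a number are left alone. The names LEADING_ZEROS and strip_leading_zeros are made up for illustration.

```python
import re

# (?<!\d)  not preceded by a digit, so the zeros start a number
# 0+       one or more zeros
# (?=\d)   followed by a digit, so at least one digit is always kept
LEADING_ZEROS = re.compile(r'(?<!\d)0+(?=\d)')

def strip_leading_zeros(filename):
    """Remove leading zeros from every run of digits in filename."""
    return LEADING_ZEROS.sub('', filename)

for name in ('good002.txt', 'good0020.txt', '1002.txt', 'abd000.txt'):
    print(name, '->', strip_leading_zeros(name))
# good002.txt -> good2.txt
# good0020.txt -> good20.txt
# 1002.txt -> 1002.txt
# abd000.txt -> abd0.txt
```

Because both assertions are zero-width, nothing needs to be captured and re-joined afterwards.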

疾风怪盗 · posted 2020-10-10 15:03:30

duke0522 wrote on 2020-10-10 14:29
My goal is to rename abd001.txt and agsad01.txt to abd1.txt and agsad1.txt respectively. But my code turns them into abd01 ...

If you want to stick with your approach, try adding \D.

import re, os, shutil
# Create a regex that matches files with '0' in the filename.
zeroPattern = re.compile(r'(.\D*)(0+)(.*)')
# Goal: rename abd001.txt and agsad01.txt to abd1.txt and agsad1.txt.
# The original code produced abd01.txt and agsad1.txt.
# Loop over the files in the working directory.
for filename in os.listdir(r'D:\python\test\1'):
    print(filename)
    mo = zeroPattern.search(filename)
    # Skip file without '0' in the filename.
    if mo is None:
        continue
    print(mo)
    # Get the different parts of the filename.
    beforePart = mo.group(1)
    zeroPart = mo.group(2)
    afterPart = mo.group(3)
    print('1'+beforePart,zeroPart,afterPart)
    # Form the filename without '0'.
    zeroFilename = beforePart + afterPart
    # Get the full, absolute file paths.
    absWorkingDir = os.path.abspath('.')
    filePath = os.path.join(absWorkingDir, filename)
    zeroFilePath = os.path.join(absWorkingDir, zeroFilename)
    # Rename the files.
    print("Renaming '%s' to '%s'" % (filePath, zeroFilePath))
    #shutil.move(filePath, zeroFilePath) # uncomment after test


abd001.txt
<re.Match object; span=(0, 10), match='abd001.txt'>
1abd 00 1.txt
Renaming 'D:\python\test\abd001.txt' to 'D:\python\test\abd1.txt'
agsad01.txt
<re.Match object; span=(0, 11), match='agsad01.txt'>
1agsad 0 1.txt
Renaming 'D:\python\test\agsad01.txt' to 'D:\python\test\agsad1.txt'

altf11 · posted 2020-10-10 22:33:28

This is the best answer, given by altf11. Thanks to altf11 for the reply.

Regex quantifiers are greedy by default. Add a ? to the first group in your code.

Your original code:

import re, os, shutil
# Create a regex that matches files with '0' in the filename.
zeroPattern = re.compile(r'^(.*)(0+)(.*)$')

Change the re.compile line to:

import re, os, shutil
# Create a regex that matches files with '0' in the filename.
zeroPattern = re.compile(r'^(.*?)(0+)(.*)$')
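
To see why the ? matters: a greedy (.*) takes as much as it can and gives back only what (0+) strictly needs, which is a single 0. The lazy (.*?) takes as little as it can, so (0+) gets the whole run of zeros. A small side-by-side sketch using filenames from this thread (the helper strip_zeros is made up for illustration):

```python
import re

greedy = re.compile(r'^(.*)(0+)(.*)$')   # original pattern
lazy = re.compile(r'^(.*?)(0+)(.*)$')    # first group made non-greedy

def strip_zeros(pattern, filename):
    """Drop group 2 (the zeros) and join the rest; return the name unchanged if no match."""
    mo = pattern.search(filename)
    if mo is None:
        return filename
    return mo.group(1) + mo.group(3)

for name in ('abd001.txt', 'agsad01.txt', 'good002.txt'):
    print(name, strip_zeros(greedy, name), strip_zeros(lazy, name))
# abd001.txt abd01.txt abd1.txt
# agsad01.txt agsad1.txt agsad1.txt
# good002.txt good02.txt good2.txt
```

Note that the lazy pattern removes the first run of zeros it finds anywhere in the name, so it fits filenames like the ones above, where the only zeros are leading ones.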

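duke0522 · posted 2020-10-11 11:00:13

疾风怪盗 wrote on 2020-10-10 14:55
Why make it so complicated? The regex itself is fine and it does find a match. The problem is in your groups. Print them and take a look

What are these three values ...

This approach is certainly simpler, but I still wanted to understand what was wrong with my regex, and altf11 answered that.
Thank you very much all the same.

duke0522 · posted 2020-10-11 11:02:09

hrp wrote on 2020-10-10 14:59

I haven't got my head around assertions yet, so I can't quite follow this.

Thank you very much all the same!

duke0522 · posted 2020-10-11 11:02:49

altf11 wrote on 2020-10-10 22:33
Regex quantifiers are greedy by default. Add a ? to the first group in your code.

Your original code:

I think I see where I went wrong now. Thanks!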
