python-docx: how to keep the original formatting after replacing a substring in a paragraph

bigbird0419 · Posted 2020-5-15 22:36:16


Last edited by bigbird0419 on 2020-5-15 22:39

from docx import Document

doc = Document('公司合同范本.docx')
phs = doc.paragraphs
for ph in phs:
    if 'xxx合同名称xxx' in ph.text:
        # Assigning to ph.text rewrites the whole paragraph
        ph.text = ph.text.replace('xxx合同名称xxx', '某某公司')
        print(ph.text)


After this change, the formatting of the whole paragraph is gone. Could someone help me diagnose this?

Twilight6 · Posted 2020-5-15 23:20:55

from docx import Document

doc = Document('公司合同范本.docx')
phs = doc.paragraphs
for ph in phs:
    if 'xxx合同名称xxx' in ph.text:
        style = ph.style  # remember the paragraph style
        text = ph.text.replace('xxx合同名称xxx', '某某公司')
        ph.text = text
        ph.style = style  # put the paragraph style back
        print(ph.text)
doc.save('test.docx')


How about this? Give it a try.

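txxcat · Posted 2020-5-15 23:25:15

I ran into this problem before too, but I could not find a direct way to handle it, so I fell back on a clumsy workaround: set the original formatting again by hand, like this: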

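from docx import Document
from docx.oxml.ns import qn
from docx.shared import Pt

# main_save (path of the report file) and user are defined elsewhere in the script
dc = Document(main_save)  # open the weekly report file
dc.styles['Normal'].font.name = '宋体'
dc.styles['Normal']._element.rPr.rFonts.set(qn('w:eastAsia'), '宋体')
p2 = dc.paragraphs[2]
p2.text = ''  # clear the paragraph, then add the text back with explicit formatting
run2 = p2.add_run('报告汇总：' + user)
run2.font.size = Pt(14)  # font size: 四号 (14 pt)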


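Waiting to see whether an expert can solve this properly.

txxcat · Posted 2020-5-15 23:30:17

Twilight6 posted on 2020-5-15 23:20
How about this? Give it a try.

The test failed. Centering is kept, but all the other formatting is gone.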
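
That result is consistent with how python-docx documents `Paragraph.text`: assigning to it replaces the paragraph content with a single new run, so paragraph-level settings such as style and alignment survive while run-level formatting (font, size, bold) is dropped. A minimal sketch of one workaround, assuming it is acceptable for the whole paragraph to take the formatting of its first run; `set_text_keep_first_run` is a hypothetical helper name, not part of python-docx:

```python
def set_text_keep_first_run(paragraph, new_text):
    """Put new_text into the first run and empty the other runs.

    `paragraph` is expected to behave like a python-docx Paragraph:
    it needs a `.runs` list whose items have a writable `.text`.
    The first run keeps its own character formatting; formatting that
    differed between later runs is not preserved (assumption of this sketch).
    """
    runs = paragraph.runs
    if not runs:
        # No run to reuse, so fall back to adding a plain run.
        paragraph.add_run(new_text)
        return
    runs[0].text = new_text
    for run in runs[1:]:
        run.text = ""
```

In the first post this would be called as `set_text_keep_first_run(ph, ph.text.replace('xxx合同名称xxx', '某某公司'))` in place of the assignment to `ph.text`.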

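Twilight6 · Posted 2020-5-15 23:31:22

txxcat posted on 2020-5-15 23:30
The test failed. Centering is kept, but all the other formatting is gone.

All right, I don't really know how to work with docx either...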

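bigbird0419 · Posted 2020-5-20 22:09:33

I tried everyone's methods and none of them worked.
Later I searched the paragraph's runs for the text to replace and changed it there, which does not alter the formatting. The limitation is that the text to replace has to sit inside a single run: "某某公司" is one run, "xxx公司" is two runs, and even "某某某公司" is split into several pieces.
In that last case the split is probably related to Word's spell checking: it puts a wavy underline under the third "某", which produces a new run.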
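
The single-run limitation can be worked around by searching the concatenated run texts and then editing only the runs that overlap the match. The following is a rough sketch, not an official python-docx feature: `replace_across_runs` is a hypothetical name, and it assumes the replacement should take the formatting of the first run that the match touches.

```python
def replace_across_runs(paragraph, old, new):
    """Replace each occurrence of `old`, even when it spans several runs.

    `paragraph` is expected to behave like a python-docx Paragraph
    (a `.runs` list of objects with a writable `.text`).
    Returns the number of replacements made.
    """
    if not old:
        return 0
    count = 0
    search_from = 0
    while True:
        runs = paragraph.runs
        texts = [run.text for run in runs]
        start = "".join(texts).find(old, search_from)
        if start < 0:
            return count
        end = start + len(old)
        pos = 0
        first = True
        for run, text in zip(runs, texts):
            run_start, run_end = pos, pos + len(text)
            pos = run_end
            if run_end <= start or run_start >= end:
                continue  # this run does not overlap the match
            lo = max(start, run_start) - run_start
            hi = min(end, run_end) - run_start
            # The first overlapping run receives the replacement text;
            # later overlapping runs only lose their part of the match.
            run.text = text[:lo] + (new if first else "") + text[hi:]
            first = False
        count += 1
        search_from = start + len(new)  # skip past the inserted text
```

Only `run.text` is assigned, so each run keeps its own character formatting. Runs that end up empty are left in place rather than removed.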

niezongxia · Posted 2021-11-26 11:47:27

def check_and_change(document, replace_dict, new_file):
    # A docx is handled in two parts, one item at a time:
    # runs inside paragraphs and cells inside tables.
    j = 0  # keyword count in paragraph runs
    k = 0  # keyword count in table cells
    # Check for keywords
    for para in document.paragraphs:
        for run in para.runs:
            for key in replace_dict:
                j += run.text.count(key)
    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                for key in replace_dict:
                    k += cell.text.count(key)
    if j + k > 0:  # j + k greater than zero means keywords were found
        # Replace the keywords
        for para in document.paragraphs:
            for run in para.runs:
                for key, value in replace_dict.items():
                    if key in run.text:
                        run.text = run.text.replace(key, value)
        for table in document.tables:
            for row in table.rows:
                for cell in row.cells:
                    for key, value in replace_dict.items():
                        if key in cell.text:
                            # Note: assigning cell.text rewrites the cell as a
                            # single run, so run-level formatting inside the
                            # cell is not kept.
                            cell.text = cell.text.replace(key, value)
        document.save(new_file)  # save the new file
    return document

niezongxia · Posted 2021-11-26 11:47:58

niezongxia posted on 2021-11-26 11:47
def check_and_change(document, replace_dict, new_file):  # a docx is handled in two parts, runs in paragraphs and cells in tables, one by one ...

replace_dict = {'【编辑日期】':now_date,'【版本号】':task,'【新建日期】':new_date,'【测试】':tester,'【安装包名】':auto_ipa,}
Once the dictionary is built, only the text is replaced and the original styles stay as they were.
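
For table cells, a variant that edits the runs of each cell's paragraphs instead of assigning `cell.text` should keep the run formatting there as well. This is a sketch under assumptions: `replace_in_runs` and `replace_everywhere` are hypothetical names, and each key is only found when it lies entirely inside one run.

```python
def replace_in_runs(paragraphs, replace_dict):
    """Apply replace_dict to every run of the given paragraphs.

    Returns the number of runs whose text changed.
    """
    changed = 0
    for para in paragraphs:
        for run in para.runs:
            text = run.text
            for key, value in replace_dict.items():
                text = text.replace(key, value)
            if text != run.text:
                run.text = text  # only the text changes, the run's formatting stays
                changed += 1
    return changed


def replace_everywhere(document, replace_dict):
    """Replace in body paragraphs and in the paragraphs of every table cell."""
    changed = replace_in_runs(document.paragraphs, replace_dict)
    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                changed += replace_in_runs(cell.paragraphs, replace_dict)
    return changed
```

With python-docx this would be used as `replace_everywhere(document, replace_dict)` followed by `document.save(new_file)`. Tables nested inside cells, headers and footers are not visited by this sketch.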
