python docx: how to keep the original formatting after replacing a substring in a paragraph (newbie discussion board, 鱼C论坛)

bigbird0419 posted on 2020-5-15 22:36:16


Last edited by bigbird0419 on 2020-5-15 22:39

from docx import Document

doc = Document('公司合同范本.docx')
phs = doc.paragraphs
for ph in phs:
    if ph.text.find('xxx合同名称xxx') >= 0:
        # Assigning to ph.text replaces all runs of the paragraph with one new run
        ph.text = ph.text.replace('xxx合同名称xxx', '某某公司')
        print(ph.text)

After this change, the formatting of the whole paragraph is gone. Could the experts here help me diagnose what is going wrong?

Twilight6 posted on 2020-5-15 23:20:55

from docx import Document

doc = Document('公司合同范本.docx')
phs = doc.paragraphs
for ph in phs:
    if ph.text.find('xxx合同名称xxx') >= 0:
        style = ph.style  # remember the paragraph style
        text = ph.text.replace('xxx合同名称xxx', '某某公司')
        ph.text = text
        ph.style = style  # put the style back after reassigning the text
        print(ph.text)
doc.save('test.docx')
How about this? Give it a try.

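txxcat posted on 2020-5-15 23:25:15

I ran into this problem before as well and could not find a direct solution, so I fell back on a clumsy workaround: set the formatting again to match the original, like this: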
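from docx import Document
from docx.oxml.ns import qn
from docx.shared import Pt

# main_save (file path), user and i (index of the paragraph to rewrite) come from the
# surrounding script; the "[i]" subscripts appear to have been swallowed by the forum markup
dc = Document(main_save)  # open the weekly report file
dc.styles['Normal'].font.name = u'宋体'
dc.styles['Normal']._element.rPr.rFonts.set(qn('w:eastAsia'), u'宋体')  # East Asian font
dc.paragraphs[i].text = ''  # clear the paragraph
p2 = dc.paragraphs[i]
run2 = p2.add_run('报告汇总：' + user)
run2.font.size = Pt(14)  # font size: 四号 (14 pt)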

Waiting to see whether an expert can solve this properly.
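
The "set the formatting again" workaround can be made a little less manual by recording a run's font settings before the text is reassigned and writing them back afterwards. The following is only a rough sketch under assumptions: the helper names are made up for illustration and are not part of python-docx, and the code relies only on `paragraph.runs`, `paragraph.text` and the `run.font` attributes `name`, `size`, `bold`, `italic` and `underline`. Only the first run's settings are restored, so a paragraph with mixed formatting ends up uniform.

```python
# Hypothetical helpers for illustration; none of these names exist in python-docx.
FONT_ATTRS = ("name", "size", "bold", "italic", "underline")


def snapshot_run_format(run):
    """Record a few font settings of a run in a plain dict."""
    return {attr: getattr(run.font, attr) for attr in FONT_ATTRS}


def apply_run_format(run, fmt):
    """Write previously recorded font settings onto another run."""
    for attr, value in fmt.items():
        setattr(run.font, attr, value)


def set_text_keep_first_run_format(paragraph, new_text):
    """Replace the paragraph text, then restore the first run's font settings."""
    fmt = snapshot_run_format(paragraph.runs[0]) if paragraph.runs else {}
    paragraph.text = new_text  # collapses the paragraph into a single plain run
    if paragraph.runs:
        apply_run_format(paragraph.runs[0], fmt)
```

As far as I can tell, `font.name` does not cover the East Asian font (the `w:eastAsia` attribute set via `qn` in the post above), so for Chinese text that attribute would still have to be copied separately.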

txxcat posted on 2020-5-15 23:30:17

Twilight6 posted on 2020-5-15 23:20
How about this? Give it a try.

Tested, and it fails. The centering is kept, but all other formatting is gone...

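Twilight6 posted on 2020-5-15 23:31:22

txxcat posted on 2020-5-15 23:30
Tested, and it fails. The centering is kept, but all other formatting is gone...

Fair enough, I don't really know how to work with docx either...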

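bigbird0419 posted on 2020-5-20 22:09:33

I tried everyone's methods and none of them worked.
What I ended up doing is searching for the text to replace inside the paragraph's runs (paragraph.runs) and replacing it there, which leaves the formatting untouched. The limitation is that the text to replace must sit entirely within a single run: "某某公司" is one run, "xxx公司" is two runs, and even "某某某公司" is split across several runs.
The split inside "某某某" may be related to Word's spell checking: Word puts a wavy underline under the third "某", which causes a new run to be created.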
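
This fits what the python-docx documentation says about `paragraph.text`: assigning to it keeps paragraph-level formatting such as the style, but removes run-level formatting such as bold or italic, which would explain why only the centering survived earlier in the thread. For the "must be in one run" limitation, one possible approach is to search the concatenated run texts and then edit only the runs the match overlaps. A minimal sketch, assuming nothing beyond `paragraph.runs` and `run.text`; the function name `replace_across_runs` is made up here, and the replacement simply inherits the formatting of the run in which the match starts.

```python
def replace_across_runs(paragraph, old, new):
    """Replace every occurrence of `old`, even when it spans several runs.

    Only run.text is modified, so each run keeps its own formatting.
    Returns the number of replacements made.
    """
    if not old:
        return 0
    count = 0
    search_from = 0
    while True:
        runs = paragraph.runs
        texts = [run.text for run in runs]
        start = "".join(texts).find(old, search_from)
        if start < 0:
            return count
        end = start + len(old)
        pos = 0
        inserted = False
        for run, text in zip(runs, texts):
            run_start, run_end = pos, pos + len(text)
            pos = run_end
            if run_end <= start or run_start >= end:
                continue  # this run does not overlap the match
            before = text[:max(start - run_start, 0)]  # part of the run before the match
            after = text[end - run_start:]             # part of the run after the match
            if not inserted:
                run.text = before + new + after  # first overlapping run receives `new`
                inserted = True
            else:
                run.text = before + after        # later runs only lose the matched part
        count += 1
        search_from = start + len(new)  # continue after the inserted text
```

Usage would look like `replace_across_runs(ph, 'xxx合同名称xxx', '某某公司')` for each `ph` in `doc.paragraphs`. Two caveats I am aware of: assigning `run.text` discards any non-text content in that run, and text inside hyperlinks is not part of `paragraph.runs`, so it is not touched.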

niezongxia posted on 2021-11-26 11:47:27

def check_and_change(document, replace_dict, new_file):
    # A docx is handled in two parts, runs in paragraphs and cells in tables, replaced one by one
    j = 0  # matches found in paragraph runs
    k = 0  # matches found in table cells (sensitive-word count)
    # Check for sensitive words
    for para in document.paragraphs:
        for i in range(len(para.runs)):
            for key, value in replace_dict.items():
                j = j + para.runs[i].text.count(key)
    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                for key, value in replace_dict.items():
                    k = k + cell.text.count(key)
    if j + k > 0:  # j + k greater than zero means sensitive words were found
        # Replace the sensitive words
        for para in document.paragraphs:
            for run in para.runs:
                for key, value in replace_dict.items():
                    if key in run.text:
                        run.text = run.text.replace(key, value)
        for table in document.tables:
            for row in table.rows:
                for cell in row.cells:
                    for key, value in replace_dict.items():
                        if key in cell.text:
                            # Note: assigning cell.text rebuilds the cell content as a single run
                            cell.text = cell.text.replace(key, value)
        document.save(new_file)  # save the new file
    return document

niezongxia posted on 2021-11-26 11:47:58

niezongxia posted on 2021-11-26 11:47
def check_and_change(document, replace_dict, new_file):# A docx is handled in two parts, runs in paragraphs and cells in tables, replaced one by ...

replace_dict = {'【编辑日期】':now_date,'【版本号】':task,'【新建日期】':new_date,'【测试】':tester,'【安装包名】':auto_ipa,}
Once the dictionary is built, you can replace just the text without changing the original styles.
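
One thing to watch in the function above: paragraph text is replaced run by run, but table cells are replaced through `cell.text`, and as far as I know assigning `cell.text` rebuilds the cell as one paragraph with a single run, so formatting inside cells may still be lost. A possible variant is to walk the paragraphs and runs inside each cell as well. This is only a sketch: `iter_paragraphs` and `replace_in_runs` are names made up for this example, and the code assumes only the `paragraphs`, `tables`, `rows`, `cells` and `runs` attributes.

```python
def iter_paragraphs(container):
    """Yield the paragraphs of a document or cell, descending into nested tables."""
    yield from container.paragraphs
    for table in container.tables:
        for row in table.rows:
            for cell in row.cells:
                yield from iter_paragraphs(cell)


def replace_in_runs(document, replace_dict):
    """Apply replace_dict to every run's text; returns how many runs were changed."""
    changed = 0
    for para in iter_paragraphs(document):
        for run in para.runs:
            new_text = run.text
            for key, value in replace_dict.items():
                new_text = new_text.replace(key, value)
            if new_text != run.text:
                run.text = new_text  # only the text changes, the run's formatting stays
                changed += 1
    return changed
```

With the dictionary above this would be called as `replace_in_runs(document, replace_dict)` followed by `document.save(new_file)`. It still only finds keys that sit entirely inside one run, which is the limitation bigbird0419 described, and merged cells may be visited more than once.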
