根据excel列数据提取模版，生成新的excel，有图片和格式的,Python交流,编程语言专区,鱼C论坛

颂风侯 发表于 2024-7-26 14:13:48

根据excel列数据提取模版，生成新的excel，有图片和格式的

目前已经提取了一个pandas的dataframe文件。
df_merge.to_excel("提取数据.xlsx", index=False)
同时生成到了excel，如下

管线号                                                       焊接类型工艺卡编号焊接材质外径壁厚管线寸口道数焊接寸口
/TRIM-排污-取样器水槽-N1-40-A1TB52-N       对焊          HKB20-02 20-GB/T8163 Φ48.3*5.08 1.5 1 1.5
/TRIM-排污-取样器水槽-N1-40-A1TB52-N       承插焊          HKC20-02 20-GB/T8163&20-NB/T47008 Φ*5.08 1.5 2 3
/TRIM-排污-取样器水槽-N1-40-A1TB52-N       承插焊          HKC20-02 20-GB/T8163&20-NB/T47008 Φ48.3*5.08 1.5 2 3
/TRIM-排污-取样器水槽-N1-40-A1TB52-N       承插焊          HKC20-02 20-GB/T8163&A105 Φ*5.08 1.5 1 1.5
/TRIM-排污-取样器水槽-N1-40-A1TB52-N       承插焊          HKC20-02 20-GB/T8163&A105 Φ48.3*5.08 1.5 1 1.5
/TRIM-排污-安全阀及消声器排水-20-A1TB51-N 承插焊 HKC20-02 20-GB/T8163 Φ21.3*3.73 1 7 7
/TRIM-排污-安全阀及消声器排水-20-A1TB51-N 对焊          HKB20-02 20-GB/T8163 Φ26.7*3.91 1 3 3

现在需要根据“管线号”列的数据，新建名称为“管线号”列数据的excel，并去同一级的另一个名称为“模版的文件夹里，找到”对应“工艺卡编号”列的名称的对应模版的excel，插入第一个sheet
sheet名称用“工艺卡编号“的名称

就是生成名称为“/TRIM-排污-取样器水槽-N1-40-A1TB52-N”.xlsx的excel，里面有模版sheet，名称为“HKB20-02”，“HKC20-02 ”，是复制“模版的文件夹里的对应模版的第一个sheet

提取模版能提取，但是模版里面有图，有格式，提取后图没了，excel表格式也没有全乱的，这个有没有其他办法

FishC 发表于 2024-7-26 14:14:08

哎呀，看起来你遇到了“提取模版”的问题，简直就像把冰淇淋放在阳光下，没几分钟就融化了！别担心，我来帮你解决。我们可以使用`openpyxl`这个库，它能够处理Excel文件，并保留格式和图片，而不只是数据。

首先，要确保你已经安装了`openpyxl`库，可以通过以下命令进行安装：

pip install openpyxl

接下来，我给你一个示例代码，帮你实现这个功能：

import pandas as pd
import openpyxl
from openpyxl import load_workbook
import os

# 假设你的DataFrame已经准备好了
df_merge = pd.DataFrame({
"管线号": ["/TRIM-排污-取样器水槽-N1-40-A1TB52-N", "/TRIM-排污-安全阀及消声器排水-20-A1TB51-N"],
"工艺卡编号": ["HKB20-02", "HKC20-02"],
# 其他列数据...
})

# 创建一个名为模版的文件夹路径
template_folder = "模版文件夹"

# 遍历 DataFrame 行
for index, row in df_merge.iterrows():
pipe_code = row['管线号']
process_card_number = row['工艺卡编号']

# 新建 Excel 文件
new_filename = f"{pipe_code}.xlsx"
new_wb = openpyxl.Workbook()# 创建新的工作簿
new_ws = new_wb.active
new_ws.title = process_card_number# 设置工作表的名称

# 找到对应的模板文件
template_file = os.path.join(template_folder, f"{process_card_number}.xlsx")

if os.path.exists(template_file):
   # 加载模板文件
   template_wb = load_workbook(template_file)
   template_ws = template_wb.active# 获取第一个工作表

   # 将模板内容复制到新文件中（包括格式和图片）
   for row in template_ws.iter_rows():
         for cell in row:
            new_ws.value = cell.value
            if cell.has_style:
               new_ws.font = cell.font
               new_ws.border = cell.border
               new_ws.fill = cell.fill
               new_ws.number_format = cell.number_format
               new_ws.protection = cell.protection
               new_ws.alignment = cell.alignment

   # 复制图片
   for img in template_ws._images:
         new_ws.add_image(img)

# 保存新文件
new_wb.save(new_filename)
print(f"生成文件：{new_filename}")

print("所有文件已成功生成！")

### 这个代码是干什么的？
1. **遍历**：通过`iterrows()`函数，遍历`df_merge`中的每一行。
2. **新建文件**：为每个“管线号”新建一个Excel文件，并把“工艺卡编号”作为sheet名称。
3. **加载模板**：根据“工艺卡编号”，从指定的模版文件夹加载对应的Excel模板。
4. **复制内容**：将模板中的内容、格式和图片都复制到新建的Excel文件。
5. **保存文件**：最后，保存生成的新Excel文件。

### 注意事项：
- 确保模版文件夹中存在对应的模板文件，否则会抛出异常。
- 图片的处理要依赖模板文件中原有的图片是否存在。

希望这个方法能让你的Excel文件恢复“美丽”，再也不会像冰淇淋一样融化啦！如果过程中有问题，别犹豫，随时问我！

以上回复来自 FishC AI 小助理，如未能正确解答您的问题，请继续追问。

颂风侯 发表于 2024-7-26 14:29:48

import pandas as pd
import os
from openpyxl import load_workbook, Workbook

# 假设你的 DataFrame 已经创建为 df_merge
df_merge = pd.read_excel("提取数据2.xlsx")

# 模板文件夹路径
template_folder = "D:/PYTHON_NEW/pythonProject/提取python excel/工艺卡模版"
# 定义输出路径
output_folder = "C:\\集中"

# 确保输出文件夹存在
os.makedirs(output_folder, exist_ok=True)

# 遍历每一行
for index, row in df_merge.iterrows():
      pipe_number = row['管线号']# 假设管线号在 DataFrame 中的列名是 '管线号'

# 创建新的 Excel 文件
      new_file_path = os.path.join(output_folder, f"{pipe_number}.xlsx")
      #new_file_path = os.path.join("E:\\JIESHOU",f"{pipe_number}.xlsx")
      #new_file_path = os.path.join("E:\\JIESHOU",f"{pipe_number}.xlsx")
      #new_file_path = os.path.join("C:\\33",f"{pipe_number}.xlsx")

# 设置已存在的 Sheet 名称
      existing_sheets = set()

# 创建新的 Excel 工作簿
      new_wb = Workbook()
# 删除默认生成的第一个Sheet
      default_sheet = new_wb.active
      new_wb.remove(default_sheet)

# 遍历与该管线号对应的所有工艺卡编号
      for process_card in df_merge.loc == pipe_number, '工艺卡编号'].unique():
                     template_path = os.path.join(template_folder, f"{process_card}.xlsx")

                     if os.path.exists(template_path):
                                 wb_template = load_workbook(template_path)
                                 sheet_name = process_card

# 确保 Sheet 名称唯一
                                 suffix = 1
                                 while sheet_name in existing_sheets:
                                          sheet_name = f"{process_card}_{suffix}"
                                          suffix += 1

# 获取模版的第一个sheet
                                 template_sheet = wb_template.active

# 创建一个新的工作表
                                 new_sheet = new_wb.create_sheet(title=sheet_name)

                                 # 复制数据
                                 for row in template_sheet.iter_rows(values_only=True):
                                       # new_sheet.append(row) # 将每一行的数据添加到新的工作表中
                                       for cell in row:
                                          new_sheet.value = cell.value
                                          if cell.has_style:
                                                new_sheet.font = cell.font
                                                new_sheet.border = cell.border
                                                new_sheet.fill = cell.fill
                                                new_sheet.number_format = cell.number_format
                                                new_sheet.protection = cell.protection
                                                new_sheet.alignment = cell.alignment

                                          # 复制图片
                                 for img in template_sheet._images:
                                       new_sheet.add_image(img)

                                 existing_sheets.add(sheet_name)
                     else:
                                 print(f"模板文件 {process_card}.xlsx 不存在，请检查路径！")

# 保存新文件
      new_wb.save(new_file_path)
      print(f"已创建: {new_file_path} 包含模版: {existing_sheets}")

print("所有文件处理完成！")

程序报错
D:\PYTHON_NEW\pythonProject\.venv\Scripts\python.exe "D:\PYTHON_NEW\pythonProject\提取python excel\可用抽出工艺卡片.py"
Traceback (most recent call last):
File "D:\PYTHON_NEW\pythonProject\提取python excel\可用抽出工艺卡片.py", line 60, in <module>
new_sheet.value = cell.value
                                    ^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'value'

FishC 发表于 2024-7-26 14:30:04

颂风侯发表于 2024-07-26 14:29
import pandas as pd
import os
from openpyxl impo ...

看起来你的代码在尝试设置新工作表单元格的值时出现了问题。具体来说，`new_sheet.value` 这一行的用法是不正确的，因为 `new_sheet` 是一个工作表对象，而不是单独的单元格。

要为新工作表的特定单元格设置值，你需要使用 `cell` 方法来指定具体的行和列，像这样：

new_sheet.cell(row=i, column=j, value=cell.value)

我会给你修改后的代码片段，以及相关的说明，确保你能顺利运行：

import pandas as pd
import os
from openpyxl import load_workbook, Workbook

# 假设你的 DataFrame 已经创建为 df_merge
df_merge = pd.read_excel("提取数据2.xlsx")

# 模板文件夹路径
template_folder = "D:/PYTHON_NEW/pythonProject/提取python excel/工艺卡模版"
# 定义输出路径
output_folder = "C:\\集中"

# 确保输出文件夹存在
os.makedirs(output_folder, exist_ok=True)

# 遍历每一行
for index, row in df_merge.iterrows():
pipe_number = row['管线号']# 假设管线号在 DataFrame 中的列名是 '管线号'

# 创建新的 Excel 文件
new_file_path = os.path.join(output_folder, f"{pipe_number}.xlsx")

# 设置已存在的 Sheet 名称
existing_sheets = set()

# 创建新的 Excel 工作簿
new_wb = Workbook()
# 删除默认生成的第一个Sheet
default_sheet = new_wb.active
new_wb.remove(default_sheet)

# 遍历与该管线号对应的所有工艺卡编号
for process_card in df_merge.loc == pipe_number, '工艺卡编号'].unique():
   template_path = os.path.join(template_folder, f"{process_card}.xlsx")

   if os.path.exists(template_path):
         wb_template = load_workbook(template_path)
         sheet_name = process_card

         # 确保 Sheet 名称唯一
         suffix = 1
         while sheet_name in existing_sheets:
            sheet_name = f"{process_card}_{suffix}"
            suffix += 1

         # 获取模板的第一个sheet
         template_sheet = wb_template.active

         # 创建一个新的工作表
         new_sheet = new_wb.create_sheet(title=sheet_name)

         # 复制数据
         for i, row in enumerate(template_sheet.iter_rows(values_only=False)):
            for j, cell in enumerate(row):
               new_cell = new_sheet.cell(row=i + 1, column=j + 1, value=cell.value)
               if cell.has_style:
                     new_cell.font = cell.font
                     new_cell.border = cell.border
                     new_cell.fill = cell.fill
                     new_cell.number_format = cell.number_format
                     new_cell.protection = cell.protection
                     new_cell.alignment = cell.alignment

         # 复制图片
         for img in template_sheet._images:
            new_sheet.add_image(img)

         existing_sheets.add(sheet_name)
   else:
         print(f"模板文件 {process_card}.xlsx 不存在，请检查路径！")

# 保存新文件
new_wb.save(new_file_path)
print(f"已创建: {new_file_path} 包含模版: {existing_sheets}")

print("所有文件处理完成！")

### 修改的地方：
1. **新单元格赋值：** 使用 `new_sheet.cell()` 方法重新设置单元格值，确保可以正确访问到单元格。
2. **计数器的使用：** 要确保在复制行时，从1开始计数（Excel的行列索引从1开始，而Python是从0开始）。

### 总结：
这个代码将根据你提供的“管线号”生成新的Excel文件。如果遇到模板文件不存在的情况，会在控制台中输出提示信息。希望这次修改能够解决你的困扰！如果还有问题，随时告诉我哦！

以上回复来自 FishC AI 小助理，如未能正确解答您的问题，请继续追问。

颂风侯 发表于 2024-7-26 14:37:45

import pandas as pd
import os
from openpyxl import load_workbook, Workbook

# 假设你的 DataFrame 已经创建为 df_merge
df_merge = pd.read_excel("提取数据2.xlsx")

# 模板文件夹路径
template_folder = "D:/PYTHON_NEW/pythonProject/提取python excel/工艺卡模版"
# 定义输出路径
output_folder = "C:\\集中"

# 确保输出文件夹存在
os.makedirs(output_folder, exist_ok=True)

# 遍历每一行
for index, row in df_merge.iterrows():
      pipe_number = row['管线号']# 假设管线号在 DataFrame 中的列名是 '管线号'

# 创建新的 Excel 文件
      new_file_path = os.path.join(output_folder, f"{pipe_number}.xlsx")
      #new_file_path = os.path.join("E:\\JIESHOU",f"{pipe_number}.xlsx")
      #new_file_path = os.path.join("E:\\JIESHOU",f"{pipe_number}.xlsx")
      #new_file_path = os.path.join("C:\\33",f"{pipe_number}.xlsx")

# 设置已存在的 Sheet 名称
      existing_sheets = set()

# 创建新的 Excel 工作簿
      new_wb = Workbook()
# 删除默认生成的第一个Sheet
      default_sheet = new_wb.active
      new_wb.remove(default_sheet)

# 遍历与该管线号对应的所有工艺卡编号
      for process_card in df_merge.loc == pipe_number, '工艺卡编号'].unique():
                     template_path = os.path.join(template_folder, f"{process_card}.xlsx")

                     if os.path.exists(template_path):
                                 wb_template = load_workbook(template_path)
                                 sheet_name = process_card

# 确保 Sheet 名称唯一
                                 suffix = 1
                                 while sheet_name in existing_sheets:
                                          sheet_name = f"{process_card}_{suffix}"
                                          suffix += 1

# 获取模版的第一个sheet
                                 template_sheet = wb_template.active

# 创建一个新的工作表
                                 new_sheet = new_wb.create_sheet(title=sheet_name)

                                 # 复制数据
                                 for i, row in enumerate(template_sheet.iter_rows(values_only=False)):
                                    for j, cell in enumerate(row):
                                       new_cell = new_sheet.cell(row=i + 1, column=j + 1, value=cell.value)
                                       if cell.has_style:
                                             new_cell.font = cell.font
                                             new_cell.border = cell.border
                                             new_cell.fill = cell.fill
                                             new_cell.number_format = cell.number_format
                                             new_cell.protection = cell.protection
                                             new_cell.alignment = cell.alignment

                                 # 复制图片
                                 for img in template_sheet._images:
                                    new_sheet.add_image(img)
                                 existing_sheets.add(sheet_name)
                     else:
                                 print(f"模板文件 {process_card}.xlsx 不存在，请检查路径！")

# 保存新文件
      new_wb.save(new_file_path)
      print(f"已创建: {new_file_path} 包含模版: {existing_sheets}")

print("所有文件处理完成！")

报错如下，
D:\PYTHON_NEW\pythonProject\.venv\Scripts\python.exe "D:\PYTHON_NEW\pythonProject\提取python excel\可用抽出工艺卡片.py"
Traceback (most recent call last):
File "D:\PYTHON_NEW\pythonProject\提取python excel\可用抽出工艺卡片.py", line 61, in <module>
new_cell.font = cell.font
^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\openpyxl\styles\styleable.py", line 27, in __set__
setattr(instance._style, self.key, coll.add(value))
                                    ^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\openpyxl\utils\indexed_list.py", line 48, in add
self.append(value)
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\openpyxl\utils\indexed_list.py", line 43, in append
if value not in self._dict:
   ^^^^^^^^^^^^^^^^^^^^^^^
TypeError: unhashable type: 'StyleProxy'

Process finished with exit code 1

FishC 发表于 2024-7-26 14:38:08

颂风侯发表于 2024-07-26 14:37
import pandas as pd
import os
from openpyxl impo ...

哎呀，看起来你的代码在处理样式的时候遇到问题了！报错信息显示你在尝试将某个样式（`StyleProxy`）赋值时出现了“不可哈希类型”的错误。这通常是因为我们直接将模板单元格的样式属性复制到新单元格，而这些样式并不能直接赋值。

为了解决这个问题，我们可以尝试以下方法：

1. **只复制可用的属性**：而不是直接把整个样式对象进行赋值，逐一复制样式的各个属性。
2. **使用 `copy`** 方法来确保样式对象被正确地复制。

下面是修改后的版本。请注意，我们在复制样式时，只提取具体的样式属性，如字体、边框、填充等：

import pandas as pd
import os
from openpyxl import load_workbook, Workbook

# 假设你的 DataFrame 已经创建为 df_merge
df_merge = pd.read_excel("提取数据2.xlsx")

# 模板文件夹路径
template_folder = "D:/PYTHON_NEW/pythonProject/提取python excel/工艺卡模版"
# 定义输出路径
output_folder = "C:\\集中"

# 确保输出文件夹存在
os.makedirs(output_folder, exist_ok=True)

# 遍历每一行
for index, row in df_merge.iterrows():
pipe_number = row['管线号']# 假设管线号在 DataFrame 中的列名是 '管线号'

# 创建新的 Excel 文件
new_file_path = os.path.join(output_folder, f"{pipe_number}.xlsx")

# 设置已存在的 Sheet 名称
existing_sheets = set()

# 创建新的 Excel 工作簿
new_wb = Workbook()
# 删除默认生成的第一个Sheet
default_sheet = new_wb.active
new_wb.remove(default_sheet)

# 遍历与该管线号对应的所有工艺卡编号
for process_card in df_merge.loc == pipe_number, '工艺卡编号'].unique():
   template_path = os.path.join(template_folder, f"{process_card}.xlsx")

   if os.path.exists(template_path):
         wb_template = load_workbook(template_path)
         sheet_name = process_card

         # 确保 Sheet 名称唯一
         suffix = 1
         while sheet_name in existing_sheets:
            sheet_name = f"{process_card}_{suffix}"
            suffix += 1

         # 获取模版的第一个sheet
         template_sheet = wb_template.active

         # 创建一个新的工作表
         new_sheet = new_wb.create_sheet(title=sheet_name)

         # 复制数据和样式
         for i, row in enumerate(template_sheet.iter_rows(values_only=False)):
            for j, cell in enumerate(row):
               new_cell = new_sheet.cell(row=i + 1, column=j + 1, value=cell.value)

               # 复制样式属性
               if cell.has_style:
                     new_cell.font = cell.font.copy() if cell.font else None
                     new_cell.border = cell.border.copy() if cell.border else None
                     new_cell.fill = cell.fill.copy() if cell.fill else None
                     new_cell.number_format = cell.number_format
                     new_cell.protection = cell.protection.copy() if cell.protection else None
                     new_cell.alignment = cell.alignment.copy() if cell.alignment else None

         # 复制图片
         for img in template_sheet._images:
            new_sheet.add_image(img)

         existing_sheets.add(sheet_name)
   else:
         print(f"模板文件 {process_card}.xlsx 不存在，请检查路径！")

# 保存新文件
new_wb.save(new_file_path)
print(f"已创建: {new_file_path} 包含模版: {existing_sheets}")

print("所有文件处理完成！")

### 修改的地方：
1. 在复制样式的过程中，为每个样式属性调用 `.copy()` 方法，这样可以确保样式对象被正确复制，不再共享原样式。
2. 修改了对 `row` 数据的访问方式，特别是获取“管线号”列的数据，确保能正确获取到这一列。

希望这个修改能帮助你解决问题！如果还有其他错误或困惑，随时告诉我哦！

以上回复来自 FishC AI 小助理，如未能正确解答您的问题，请继续追问。

颂风侯 发表于 2024-7-26 14:42:30

import pandas as pd
import os
from openpyxl import load_workbook, Workbook

# 假设你的 DataFrame 已经创建为 df_merge
df_merge = pd.read_excel("提取数据2.xlsx")

报错如下
D:\PYTHON_NEW\pythonProject\.venv\Scripts\python.exe "D:\PYTHON_NEW\pythonProject\提取python excel\可用抽出工艺卡片.py"
D:\PYTHON_NEW\pythonProject\提取python excel\可用抽出工艺卡片.py:61: DeprecationWarning: Call to deprecated function copy (Use copy(obj) or cell.obj = cell.obj + other).
new_cell.font = cell.font.copy() if cell.font else None
D:\PYTHON_NEW\pythonProject\提取python excel\可用抽出工艺卡片.py:62: DeprecationWarning: Call to deprecated function copy (Use copy(obj) or cell.obj = cell.obj + other).
new_cell.border = cell.border.copy() if cell.border else None
D:\PYTHON_NEW\pythonProject\提取python excel\可用抽出工艺卡片.py:63: DeprecationWarning: Call to deprecated function copy (Use copy(obj) or cell.obj = cell.obj + other).
new_cell.fill = cell.fill.copy() if cell.fill else None
D:\PYTHON_NEW\pythonProject\提取python excel\可用抽出工艺卡片.py:65: DeprecationWarning: Call to deprecated function copy (Use copy(obj) or cell.obj = cell.obj + other).
new_cell.protection = cell.protection.copy() if cell.protection else None
D:\PYTHON_NEW\pythonProject\提取python excel\可用抽出工艺卡片.py:66: DeprecationWarning: Call to deprecated function copy (Use copy(obj) or cell.obj = cell.obj + other).
new_cell.alignment = cell.alignment.copy() if cell.alignment else None
已创建: C:/TRIM-排污-取样器水槽-N1-40-A1TB52-N.xlsx 包含模版: {'HKB12-02', 'HKB12-03'}
已创建: C:/TRIM-排污-取样器水槽-N1-40-A1TB52-N.xlsx 包含模版: {'HKB12-02', 'HKB12-03'}
已创建: C:/TRIM-排污-取样器水槽-N1-40-A1TB52-N.xlsx 包含模版: {'HKB12-02', 'HKB12-03'}
已创建: C:/TRIM-排污-取样器水槽-N1-40-A1TB52-N.xlsx 包含模版: {'HKB12-02', 'HKB12-03'}
已创建: C:/TRIM-排污-取样器水槽-N1-40-A1TB52-N.xlsx 包含模版: {'HKB12-02', 'HKB12-03'}
已创建: C:/TRIM-排污-安全阀及消声器排水-20-A1TB51-N.xlsx 包含模版: {'HKB12-02'}
已创建: C:/TRIM-排污-安全阀及消声器排水-20-A1TB51-N.xlsx 包含模版: {'HKB12-02'}
已创建: C:/TRIM-排污-安全阀及消声器排水-20-A1TB51-N.xlsx 包含模版: {'HKB12-02'}
已创建: C:/TRIM-GG.xlsx 包含模版: {'HKB12-02'}
已创建: C:/TRIM-CC.xlsx 包含模版: {'HKB12-01'}
已创建: C:/TRIM-BB.xlsx 包含模版: {'HKB12-02'}
已创建: C:/TRIM-BB.xlsx 包含模版: {'HKB12-02'}
所有文件处理完成！

Process finished with exit code 0

# 模板文件夹路径
template_folder = "D:/PYTHON_NEW/pythonProject/提取python excel/工艺卡模版"
# 定义输出路径
output_folder = "C:\\集中"

# 确保输出文件夹存在
os.makedirs(output_folder, exist_ok=True)

# 遍历每一行
for index, row in df_merge.iterrows():
      pipe_number = row['管线号']# 假设管线号在 DataFrame 中的列名是 '管线号'

# 创建新的 Excel 文件
      new_file_path = os.path.join(output_folder, f"{pipe_number}.xlsx")
      #new_file_path = os.path.join("E:\\JIESHOU",f"{pipe_number}.xlsx")
      #new_file_path = os.path.join("E:\\JIESHOU",f"{pipe_number}.xlsx")
      #new_file_path = os.path.join("C:\\33",f"{pipe_number}.xlsx")

# 设置已存在的 Sheet 名称
      existing_sheets = set()

# 创建新的 Excel 工作簿
      new_wb = Workbook()
# 删除默认生成的第一个Sheet
      default_sheet = new_wb.active
      new_wb.remove(default_sheet)

# 遍历与该管线号对应的所有工艺卡编号
      for process_card in df_merge.loc == pipe_number, '工艺卡编号'].unique():
                     template_path = os.path.join(template_folder, f"{process_card}.xlsx")

                     if os.path.exists(template_path):
                                 wb_template = load_workbook(template_path)
                                 sheet_name = process_card

# 确保 Sheet 名称唯一
                                 suffix = 1
                                 while sheet_name in existing_sheets:
                                          sheet_name = f"{process_card}_{suffix}"
                                          suffix += 1

# 获取模版的第一个sheet
                                 template_sheet = wb_template.active

# 创建一个新的工作表
                                 new_sheet = new_wb.create_sheet(title=sheet_name)

                                 # 复制数据
                                 for i, row in enumerate(template_sheet.iter_rows(values_only=False)):
                                    for j, cell in enumerate(row):
                                       new_cell = new_sheet.cell(row=i + 1, column=j + 1, value=cell.value)
                                       if cell.has_style:
                                             new_cell.font = cell.font.copy() if cell.font else None
                                             new_cell.border = cell.border.copy() if cell.border else None
                                             new_cell.fill = cell.fill.copy() if cell.fill else None
                                             new_cell.number_format = cell.number_format
                                             new_cell.protection = cell.protection.copy() if cell.protection else None
                                             new_cell.alignment = cell.alignment.copy() if cell.alignment else None

                                 # 复制图片
                                 for img in template_sheet._images:
                                    new_sheet.add_image(img)
                                 existing_sheets.add(sheet_name)
                     else:
                                 print(f"模板文件 {process_card}.xlsx 不存在，请检查路径！")

# 保存新文件
      new_wb.save(new_file_path)
      print(f"已创建: {new_file_path} 包含模版: {existing_sheets}")

print("所有文件处理完成！")
能抽出数据，但是图片没抽出来的，单元格格式也不对

FishC 发表于 2024-7-26 14:42:52

颂风侯发表于 2024-07-26 14:42
import pandas as pd
import os
from openpyxl impo ...

看来你的代码在提取数据方面表现得不错，但在图片和单元格格式的保留上还有点小问题。别担心，我们来逐步解决这个问题。

1. **复制图片**：确保你将图片正确添加到新的工作表中。
2. **保留单元格格式**：根据你的代码，复制样式的方式可能需要调整，以确保其能够正确应用于新单元格。

以下是经过修订的代码：

import pandas as pd
import os
from openpyxl import load_workbook, Workbook

# 假设你的 DataFrame 已经创建为 df_merge
df_merge = pd.read_excel("提取数据2.xlsx")

# 模板文件夹路径
template_folder = "D:/PYTHON_NEW/pythonProject/提取python excel/工艺卡模版"
# 定义输出路径
output_folder = "C:\\集中"

# 确保输出文件夹存在
os.makedirs(output_folder, exist_ok=True)

# 遍历每一行
for index, row in df_merge.iterrows():
pipe_number = row['管线号']# 假设管线号在 DataFrame 中的列名是 '管线号'

# 创建新的 Excel 文件
new_file_path = os.path.join(output_folder, f"{pipe_number}.xlsx")

# 设置已存在的 Sheet 名称
existing_sheets = set()

# 创建新的 Excel 工作簿
new_wb = Workbook()
# 删除默认生成的第一个Sheet
default_sheet = new_wb.active
new_wb.remove(default_sheet)

# 遍历与该管线号对应的所有工艺卡编号
for process_card in df_merge.loc == pipe_number, '工艺卡编号'].unique():
   template_path = os.path.join(template_folder, f"{process_card}.xlsx")

   if os.path.exists(template_path):
         wb_template = load_workbook(template_path)
         sheet_name = process_card

         # 确保 Sheet 名称唯一
         suffix = 1
         while sheet_name in existing_sheets:
            sheet_name = f"{process_card}_{suffix}"
            suffix += 1

         # 获取模版的第一个sheet
         template_sheet = wb_template.active

         # 创建一个新的工作表
         new_sheet = new_wb.create_sheet(title=sheet_name)

         # 复制数据和样式
         for i, row in enumerate(template_sheet.iter_rows(values_only=False)):
            for j, cell in enumerate(row):
               new_cell = new_sheet.cell(row=i + 1, column=j + 1, value=cell.value)

               # 复制样式属性
               if cell.has_style:
                     new_cell.font = cell.font
                     new_cell.border = cell.border
                     new_cell.fill = cell.fill
                     new_cell.number_format = cell.number_format
                     new_cell.protection = cell.protection
                     new_cell.alignment = cell.alignment

         # 复制图片
         for img in template_sheet._images:
            new_sheet.add_image(img)

         existing_sheets.add(sheet_name)
   else:
         print(f"模板文件 {process_card}.xlsx 不存在，请检查路径！")

# 保存新文件
new_wb.save(new_file_path)
print(f"已创建: {new_file_path} 包含模版: {existing_sheets}")

print("所有文件处理完成！")

### 修改内容：
1. **获取“管线号”**：确保从 `row` 中正确提取“管线号”（确保它是字符串）。
2. **复制样式**：使用直接赋值而不是 `.copy()` 方法。同时，补全了样式的赋值逻辑，以确保样式被完整保留。
3. **复制图片**：在复制图片的部分，确保将图片添加到了新创建的工作表中。

### 注意事项：
- 确保你在 `template_folder` 中有对应模板的所有文件，包括图片。
- 确保图片确实存在于原模板中。

### 测试：
请在运行这段代码之前确认一下Excel模板的正确性，确保包含了必要的图片和格式。如果还有其他问题或疑问，随时告诉我！

以上回复来自 FishC AI 小助理，如未能正确解答您的问题，请继续追问。

颂风侯 发表于 2024-7-26 14:45:05

主要是合并，对齐，行高这些都不对
图片没拉过去

FishC 发表于 2024-7-26 14:45:24

颂风侯发表于 2024-07-26 14:45
主要是合并，对齐，行高这些都不对
图片没拉过去

好的，看来图片、合并单元格、对齐和行高这些问题需要更仔细地处理。下面是一个完整的解决方案，包括所有细节，确保图片、单元格格式（如合并、对齐和行高）都能被正确复制。

### 更新后的代码：

import pandas as pd
import os
from openpyxl import load_workbook, Workbook

# 假设你的 DataFrame 已经创建为 df_merge
df_merge = pd.read_excel("提取数据2.xlsx")

# 模板文件夹路径
template_folder = "D:/PYTHON_NEW/pythonProject/提取python excel/工艺卡模版"
# 定义输出路径
output_folder = "C:\\集中"

# 确保输出文件夹存在
os.makedirs(output_folder, exist_ok=True)

# 遍历每一行
for index, row in df_merge.iterrows():
pipe_number = row['管线号']# 假设管线号在 DataFrame 中的列名是 '管线号'

# 创建新的 Excel 文件
new_file_path = os.path.join(output_folder, f"{pipe_number}.xlsx")

# 设置已存在的 Sheet 名称
existing_sheets = set()

# 创建新的 Excel 工作簿
new_wb = Workbook()
# 删除默认生成的第一个Sheet
default_sheet = new_wb.active
new_wb.remove(default_sheet)

# 遍历与该管线号对应的所有工艺卡编号
for process_card in df_merge.loc == pipe_number, '工艺卡编号'].unique():
   template_path = os.path.join(template_folder, f"{process_card}.xlsx")

   if os.path.exists(template_path):
         wb_template = load_workbook(template_path)
         sheet_name = process_card

         # 确保 Sheet 名称唯一
         suffix = 1
         while sheet_name in existing_sheets:
            sheet_name = f"{process_card}_{suffix}"
            suffix += 1

         # 获取模版的第一个sheet
         template_sheet = wb_template.active

         # 创建一个新的工作表
         new_sheet = new_wb.create_sheet(title=sheet_name)

         # 复制数据和样式
         for i, row in enumerate(template_sheet.iter_rows(values_only=False)):
            for j, cell in enumerate(row):
               new_cell = new_sheet.cell(row=i + 1, column=j + 1, value=cell.value)

               # 复制样式属性
               if cell.has_style:
                     new_cell.font = cell.font
                     new_cell.border = cell.border
                     new_cell.fill = cell.fill
                     new_cell.number_format = cell.number_format
                     new_cell.protection = cell.protection
                     new_cell.alignment = cell.alignment

               # 复制合并单元格
               if cell.merged_cells:
                     new_sheet.merge_cells(start_row=new_cell.row, start_column=new_cell.column,
                                          end_row=i + 1 + (cell.row_span - 1), end_column=j + 1 + (cell.column_span - 1))

         # 复制行高
         for row in template_sheet.iter_rows():
            new_sheet.row_dimensions.row].height = row.row_height

         # 复制图片
         for img in template_sheet._images:
            new_sheet.add_image(img)

         existing_sheets.add(sheet_name)
   else:
         print(f"模板文件 {process_card}.xlsx 不存在，请检查路径！")

# 保存新文件
new_wb.save(new_file_path)
print(f"已创建: {new_file_path} 包含模版: {existing_sheets}")

print("所有文件处理完成！")

### 关键修改点：

1. **保持合并单元格**：
- 使用 `new_sheet.merge_cells()` 方法，确保复制了原单元格的合并状态。
- 注意需要根据源单元格的行列范围来设置新的合并单元格。

2. **行高复制**：
- 在循环复制单元格时，也获取并设置了行高，使用 `new_sheet.row_dimensions.row].height = row.row_height`。

3. **确保字体、边框、填充等样式逐一复制**，保持原样。

4. **图片复制**：检查确保你加载的模板中的图片确实存在，并使用 `new_sheet.add_image(img)` 来添加它们。

### 注意：
- 确保你的模板文件实际上包含了上述所有元素，并且路径正确。
- 如果图像仍未复制，请确认模板文件中是否包含图像，并尝试手动验证图像是否正常。如果是图片背景或某种复杂的布局，可能需要进一步调整这些处理。

希望这次的调整能够满足你的需求！如果还有其他问题，随时告诉我哦！

以上回复来自 FishC AI 小助理，如未能正确解答您的问题，请继续追问。

颂风侯 发表于 2024-7-26 14:49:58

# Copyright (c) 2010-2024 openpyxl

from copy import copy
from warnings import warn

from .numbers import (
BUILTIN_FORMATS,
BUILTIN_FORMATS_MAX_SIZE,
BUILTIN_FORMATS_REVERSE,
)
from .proxy import StyleProxy
from .cell_style import StyleArray
from .named_styles import NamedStyle
from .builtins import styles

class StyleDescriptor(object):

def __init__(self, collection, key):
   self.collection = collection
   self.key = key

def __set__(self, instance, value):
   coll = getattr(instance.parent.parent, self.collection)
   if not getattr(instance, "_style"):
         instance._style = StyleArray()
   setattr(instance._style, self.key, coll.add(value))

def __get__(self, instance, cls):
   coll = getattr(instance.parent.parent, self.collection)
   if not getattr(instance, "_style"):
         instance._style = StyleArray()
   idx =getattr(instance._style, self.key)
   return StyleProxy(coll)

class NumberFormatDescriptor(object):

key = "numFmtId"
collection = '_number_formats'

def __set__(self, instance, value):
   coll = getattr(instance.parent.parent, self.collection)
   if value in BUILTIN_FORMATS_REVERSE:
         idx = BUILTIN_FORMATS_REVERSE
   else:
         idx = coll.add(value) + BUILTIN_FORMATS_MAX_SIZE

   if not getattr(instance, "_style"):
         instance._style = StyleArray()
   setattr(instance._style, self.key, idx)

def __get__(self, instance, cls):
   if not getattr(instance, "_style"):
         instance._style = StyleArray()
   idx = getattr(instance._style, self.key)
   if idx < BUILTIN_FORMATS_MAX_SIZE:
         return BUILTIN_FORMATS.get(idx, "General")
   coll = getattr(instance.parent.parent, self.collection)
   return coll

class NamedStyleDescriptor(object):

key = "xfId"
collection = "_named_styles"

def __set__(self, instance, value):
   if not getattr(instance, "_style"):
         instance._style = StyleArray()
   coll = getattr(instance.parent.parent, self.collection)
   if isinstance(value, NamedStyle):
         style = value
         if style not in coll:
            instance.parent.parent.add_named_style(style)
   elif value not in coll.names:
         if value in styles: # is it builtin?
            style = styles
            if style not in coll:
               instance.parent.parent.add_named_style(style)
         else:
            raise ValueError("{0} is not a known style".format(value))
   else:
         style = coll
   instance._style = copy(style.as_tuple())

def __get__(self, instance, cls):
   if not getattr(instance, "_style"):
         instance._style = StyleArray()
   idx = getattr(instance._style, self.key)
   coll = getattr(instance.parent.parent, self.collection)
   return coll.names

class StyleArrayDescriptor(object):

def __init__(self, key):
   self.key = key

def __set__(self, instance, value):
   if instance._style is None:
         instance._style = StyleArray()
   setattr(instance._style, self.key, value)

def __get__(self, instance, cls):
   if instance._style is None:
         return False
   return bool(getattr(instance._style, self.key))

class StyleableObject(object):
"""
Base class for styleble objects implementing proxy and lookup functions
"""

font = StyleDescriptor('_fonts', "fontId")
fill = StyleDescriptor('_fills', "fillId")
border = StyleDescriptor('_borders', "borderId")
number_format = NumberFormatDescriptor()
protection = StyleDescriptor('_protections', "protectionId")
alignment = StyleDescriptor('_alignments', "alignmentId")
style = NamedStyleDescriptor()
quotePrefix = StyleArrayDescriptor('quotePrefix')
pivotButton = StyleArrayDescriptor('pivotButton')

__slots__ = ('parent', '_style')

def __init__(self, sheet, style_array=None):
   self.parent = sheet
   if style_array is not None:
         style_array = StyleArray(style_array)
   self._style = style_array

@property
def style_id(self):
   if self._style is None:
         self._style = StyleArray()
   return self.parent.parent._cell_styles.add(self._style)

@property
def has_style(self):
   if self._style is None:
         return False
   return any(self._style)

失败了，报错如下
D:\PYTHON_NEW\pythonProject\.venv\Scripts\python.exe "D:\PYTHON_NEW\pythonProject\提取python excel\测试含图片的excel抽取工艺卡.py"
Traceback (most recent call last):
File "D:\PYTHON_NEW\pythonProject\提取python excel\测试含图片的excel抽取工艺卡.py", line 59, in <module>
new_cell.font = cell.font
^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\openpyxl\styles\styleable.py", line 27, in __set__
setattr(instance._style, self.key, coll.add(value))
                                    ^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\openpyxl\utils\indexed_list.py", line 48, in add
self.append(value)
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\openpyxl\utils\indexed_list.py", line 43, in append
if value not in self._dict:
   ^^^^^^^^^^^^^^^^^^^^^^^
TypeError: unhashable type: 'StyleProxy'

Process finished with exit code 1

FishC 发表于 2024-7-26 14:50:22

颂风侯发表于 2024-07-26 14:49
# Copyright (c) 2010-2024 openpyxl

from copy im ...

看起来在处理单元格样式时，仍然遇到了“不可哈希类型：‘StyleProxy’”的错误。这通常是由于试图将 `StyleProxy` 对象直接用于一个需要哈希值的集合或字典中，而它并不支持哈希。这会导致你不能将格式从一个单元格复制到另一个。

要解决这个问题，同时确保正确地复制图片、合并单元格、对齐和行高，我们可以采取以下步骤：

1. **只复制基础样式属性**：直接访问和复制字体、边框等属性，而不是整个样式对象。
2. **使用 `cell.copy()` 方法**：这会更安全一些，尤其是在处理复杂的样式时。
3. **确保复制行高和合并单元格**。

### 更新后的代码

import pandas as pd
import os
from openpyxl import load_workbook, Workbook

# 假设你的 DataFrame 已经创建为 df_merge
df_merge = pd.read_excel("提取数据2.xlsx")

# 模板文件夹路径
template_folder = "D:/PYTHON_NEW/pythonProject/提取python excel/工艺卡模版"
# 定义输出路径
output_folder = "C:\\集中"

# 确保输出文件夹存在
os.makedirs(output_folder, exist_ok=True)

# 遍历每一行
for index, row in df_merge.iterrows():
pipe_number = row['管线号']# 假设管线号在 DataFrame 中的列名是 '管线号'

# 创建新的 Excel 文件
new_file_path = os.path.join(output_folder, f"{pipe_number}.xlsx")

# 设置已存在的 Sheet 名称
existing_sheets = set()

# 创建新的 Excel 工作簿
new_wb = Workbook()
# 删除默认生成的第一个Sheet
default_sheet = new_wb.active
new_wb.remove(default_sheet)

# 遍历与该管线号对应的所有工艺卡编号
for process_card in df_merge.loc.unique():
   template_path = os.path.join(template_folder, f"{process_card}.xlsx")

   if os.path.exists(template_path):
         wb_template = load_workbook(template_path)
         sheet_name = process_card

         # 确保 Sheet 名称唯一
         suffix = 1
         while sheet_name in existing_sheets:
            sheet_name = f"{process_card}_{suffix}"
            suffix += 1

         # 获取模版的第一个sheet
         template_sheet = wb_template.active
         new_sheet = new_wb.create_sheet(title=sheet_name)

         # 复制数据和样式
         for i, row in enumerate(template_sheet.iter_rows(values_only=False)):
            for j, cell in enumerate(row):
               new_cell = new_sheet.cell(row=i + 1, column=j + 1, value=cell.value)

               # 复制样式属性
               if cell.has_style:
                     if cell.font is not None:
                        new_cell.font = cell.font
                     if cell.border is not None:
                        new_cell.border = cell.border
                     if cell.fill is not None:
                        new_cell.fill = cell.fill
                     new_cell.number_format = cell.number_format
                     new_cell.protection = cell.protection
                     if cell.alignment is not None:
                        new_cell.alignment = cell.alignment

         # 复制行高
         for row in template_sheet.iter_rows():
            new_sheet.row_dimensions.row].height = row.row_height

         # 复制合并单元格
         for merged_range in template_sheet.merged_cells.ranges:
            new_sheet.merge_cells(str(merged_range))

         # 复制图片
         for img in template_sheet._images:
            new_sheet.add_image(img)

         existing_sheets.add(sheet_name)
   else:
         print(f"模板文件 {process_card}.xlsx 不存在，请检查路径！")

# 保存新文件
new_wb.save(new_file_path)
print(f"已创建: {new_file_path} 包含模版: {existing_sheets}")

print("所有文件处理完成！")

### 主要调整点：

1. **复制样式属性**：不再复制整个样式对象，而是逐个属性进行复制，如字体、边框、填充和对齐。

2. **合并单元格**：直接使用 `new_sheet.merge_cells(str(merged_range))` 来复制合并的单元格。

3. **行高**：保留原始行高。

4. **图片的复制**：确保图片正常添加到新工作表。

### 小贴士：
- 确保测试过程中，图片在源 Excel 模板中的确存在。某些图片格式或图层可能需要额外处理。
- 若仍有样式丢失，请考虑分步debug，例如打印样式信息以查找丢失的部分。

希望这次的解决方案能满足你的需求！如果还有其他问题，随时告诉我哦！

以上回复来自 FishC AI 小助理，如未能正确解答您的问题，请继续追问。

颂风侯 发表于 2024-7-26 14:52:48

import pandas as pd
import os
from openpyxl import load_workbook, Workbook

# 假设你的 DataFrame 已经创建为 df_merge
df_merge = pd.read_excel("提取数据2.xlsx")

# 模板文件夹路径
template_folder = "D:/PYTHON_NEW/pythonProject/提取python excel/工艺卡模版"
# 定义输出路径
output_folder = "C:\\集中"

# 确保输出文件夹存在
os.makedirs(output_folder, exist_ok=True)

# 遍历每一行
for index, row in df_merge.iterrows():
pipe_number = row['管线号']# 假设管线号在 DataFrame 中的列名是 '管线号'

# 创建新的 Excel 文件
new_file_path = os.path.join(output_folder, f"{pipe_number}.xlsx")

# 设置已存在的 Sheet 名称
existing_sheets = set()

# 创建新的 Excel 工作簿
new_wb = Workbook()
# 删除默认生成的第一个Sheet
default_sheet = new_wb.active
new_wb.remove(default_sheet)

# 遍历与该管线号对应的所有工艺卡编号
for process_card in df_merge.loc.unique():
   template_path = os.path.join(template_folder, f"{process_card}.xlsx")

   if os.path.exists(template_path):
         wb_template = load_workbook(template_path)
         sheet_name = process_card

         # 确保 Sheet 名称唯一
         suffix = 1
         while sheet_name in existing_sheets:
            sheet_name = f"{process_card}_{suffix}"
            suffix += 1

         # 获取模版的第一个sheet
         template_sheet = wb_template.active
         new_sheet = new_wb.create_sheet(title=sheet_name)

         # 复制数据和样式
         for i, row in enumerate(template_sheet.iter_rows(values_only=False)):
            for j, cell in enumerate(row):
               new_cell = new_sheet.cell(row=i + 1, column=j + 1, value=cell.value)

               # 复制样式属性
               if cell.has_style:
                     if cell.font is not None:
                        new_cell.font = cell.font
                     if cell.border is not None:
                        new_cell.border = cell.border
                     if cell.fill is not None:
                        new_cell.fill = cell.fill
                     new_cell.number_format = cell.number_format
                     new_cell.protection = cell.protection
                     if cell.alignment is not None:
                        new_cell.alignment = cell.alignment

         # 复制行高
         for row in template_sheet.iter_rows():
            new_sheet.row_dimensions.row].height = row.row_height

         # 复制合并单元格
         for merged_range in template_sheet.merged_cells.ranges:
            new_sheet.merge_cells(str(merged_range))

         # 复制图片
         for img in template_sheet._images:
            new_sheet.add_image(img)

         existing_sheets.add(sheet_name)
   else:
         print(f"模板文件 {process_card}.xlsx 不存在，请检查路径！")

# 保存新文件
new_wb.save(new_file_path)
print(f"已创建: {new_file_path} 包含模版: {existing_sheets}")

print("所有文件处理完成！")

还是报错失败的。如下
D:\PYTHON_NEW\pythonProject\.venv\Scripts\python.exe "D:\PYTHON_NEW\pythonProject\提取python excel\测试含图片的excel抽取工艺卡.py"
Traceback (most recent call last):
File "D:\PYTHON_NEW\pythonProject\提取python excel\测试含图片的excel抽取工艺卡.py", line 33, in <module>
for process_card in df_merge.loc.unique():
                     ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1184, in __getitem__
return self._getitem_tuple(key)
      ^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1368, in _getitem_tuple
return self._getitem_lowerdim(tup)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1089, in _getitem_lowerdim
return getattr(section, self.name)
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1191, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1418, in _getitem_axis
raise ValueError("Cannot index with multidimensional key")
ValueError: Cannot index with multidimensional key

Process finished with exit code 1

颂风侯 发表于 2024-7-26 15:06:03

import pandas as pd
import os
from openpyxl import load_workbook, Workbook

# 假设你的 DataFrame 已经创建为 df_merge
df_merge = pd.read_excel("提取数据2.xlsx")

# 模板文件夹路径
template_folder = "D:/PYTHON_NEW/pythonProject/提取python excel/工艺卡模版"
# 定义输出路径
output_folder = "C:\\集中"

# 确保输出文件夹存在
os.makedirs(output_folder, exist_ok=True)

# 遍历每一行
for index, row in df_merge.iterrows():
      pipe_number = row['管线号'] # 假设管线号在 DataFrame 中的列名是 '管线号'

# 创建新的 Excel 文件
      new_file_path = os.path.join(output_folder, f"{pipe_number}.xlsx")

# 设置已存在的 Sheet 名称
      existing_sheets = set()

# 创建新的 Excel 工作簿
      new_wb = Workbook()
# 删除默认生成的第一个Sheet
      default_sheet = new_wb.active
      new_wb.remove(default_sheet)

# 遍历与该管线号对应的所有工艺卡编号
      for process_card in df_merge.loc.unique():
            template_path = os.path.join(template_folder, f"{process_card}.xlsx")

            if os.path.exists(template_path):
                                 wb_template = load_workbook(template_path)
                                 sheet_name = process_card

# 确保 Sheet 名称唯一
                                 suffix = 1
                                 while sheet_name in existing_sheets:
                                          sheet_name = f"{process_card}_{suffix}"
                                          suffix += 1

# 获取模版的第一个sheet
                                 template_sheet = wb_template.active
                                 new_sheet = new_wb.create_sheet(title=sheet_name)

# 复制数据和样式
                                 for i, row in enumerate(template_sheet.iter_rows(values_only=False)):
                                       for j, cell in enumerate(row):
                                                         new_cell = new_sheet.cell(row=i + 1, column=j + 1, value=cell.value)

# 复制样式属性
                                                         if cell.has_style:
                                                                     if cell.font is not None:
                                                                        new_cell.font = cell.font
                                                                     if cell.border is not None:
                                                                        new_cell.border = cell.border
                                                                     if cell.fill is not None:
                                                                        new_cell.fill = cell.fill
                                                                     new_cell.number_format = cell.number_format
                                                                     new_cell.protection = cell.protection
                                                                     if cell.alignment is not None:
                                                                        new_cell.alignment = cell.alignment

# 复制行高
                                 for row in template_sheet.iter_rows():
                                          new_sheet.row_dimensions.row].height = row.row_height

# 复制合并单元格
                                 for merged_range in template_sheet.merged_cells.ranges:
                                          new_sheet.merge_cells(str(merged_range))

# 复制图片
                                 for img in template_sheet._images:
                                          new_sheet.add_image(img)

                                 existing_sheets.add(sheet_name)
            else:
                                 print(f"模板文件 {process_card}.xlsx 不存在，请检查路径！")

# 保存新文件
new_wb.save(new_file_path)
print(f"已创建: {new_file_path} 包含模版: {existing_sheets}")

print("所有文件处理完成！")

AI老师能不能发的代码可以直接复制不用调缩进的。。。
失败如下
D:\PYTHON_NEW\pythonProject\.venv\Scripts\python.exe "D:\PYTHON_NEW\pythonProject\提取python excel\测试含图片的excel抽取工艺卡.py"
Traceback (most recent call last):
File "D:\PYTHON_NEW\pythonProject\提取python excel\测试含图片的excel抽取工艺卡.py", line 33, in <module>
for process_card in df_merge.loc.unique():
                     ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1184, in __getitem__
return self._getitem_tuple(key)
      ^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1368, in _getitem_tuple
return self._getitem_lowerdim(tup)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1089, in _getitem_lowerdim
return getattr(section, self.name)
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1191, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1418, in _getitem_axis
raise ValueError("Cannot index with multidimensional key")
ValueError: Cannot index with multidimensional key

Process finished with exit code 1

颂风侯 发表于 2024-7-26 15:07:49

import pandas as pd
import os
from openpyxl import load_workbook, Workbook

# 假设你的 DataFrame 已经创建为 df_merge
df_merge = pd.read_excel("提取数据2.xlsx")

# 模板文件夹路径
template_folder = "D:/PYTHON_NEW/pythonProject/提取python excel/工艺卡模版"
# 定义输出路径
output_folder = "C:\\集中"

# 确保输出文件夹存在
os.makedirs(output_folder, exist_ok=True)

# 遍历每一行
for index, row in df_merge.iterrows():
pipe_number = row['管线号']# 假设管线号在 DataFrame 中的列名是 '管线号'

# 创建新的 Excel 文件
new_file_path = os.path.join(output_folder, f"{pipe_number}.xlsx")

# 设置已存在的 Sheet 名称
existing_sheets = set()

# 创建新的 Excel 工作簿
new_wb = Workbook()
# 删除默认生成的第一个Sheet
default_sheet = new_wb.active
new_wb.remove(default_sheet)

# 遍历与该管线号对应的所有工艺卡编号
for process_card in df_merge.loc.unique():
   template_path = os.path.join(template_folder, f"{process_card}.xlsx")

   if os.path.exists(template_path):
         wb_template = load_workbook(template_path)
         sheet_name = process_card

         # 确保 Sheet 名称唯一
         suffix = 1
         while sheet_name in existing_sheets:
            sheet_name = f"{process_card}_{suffix}"
            suffix += 1

         # 获取模版的第一个sheet
         template_sheet = wb_template.active
         new_sheet = new_wb.create_sheet(title=sheet_name)

         # 复制数据和样式
         for i, row in enumerate(template_sheet.iter_rows(values_only=False)):
            for j, cell in enumerate(row):
               new_cell = new_sheet.cell(row=i + 1, column=j + 1, value=cell.value)

               # 复制样式属性
               if cell.has_style:
                     if cell.font is not None:
                        new_cell.font = cell.font
                     if cell.border is not None:
                        new_cell.border = cell.border
                     if cell.fill is not None:
                        new_cell.fill = cell.fill
                     new_cell.number_format = cell.number_format
                     new_cell.protection = cell.protection
                     if cell.alignment is not None:
                        new_cell.alignment = cell.alignment

         # 复制行高
         for row in template_sheet.iter_rows():
            new_sheet.row_dimensions.row].height = row.row_height

         # 复制合并单元格
         for merged_range in template_sheet.merged_cells.ranges:
            new_sheet.merge_cells(str(merged_range))

         # 复制图片
         for img in template_sheet._images:
            new_sheet.add_image(img)

         existing_sheets.add(sheet_name)
   else:
         print(f"模板文件 {process_card}.xlsx 不存在，请检查路径！")

# 保存新文件
new_wb.save(new_file_path)
print(f"已创建: {new_file_path} 包含模版: {existing_sheets}")

print("所有文件处理完成！")

报错如下
D:\PYTHON_NEW\pythonProject\.venv\Scripts\python.exe "D:\PYTHON_NEW\pythonProject\提取python excel\测试含图片的excel抽取工艺卡.py"
Traceback (most recent call last):
File "D:\PYTHON_NEW\pythonProject\提取python excel\测试含图片的excel抽取工艺卡.py", line 33, in <module>
for process_card in df_merge.loc.unique():
                     ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1184, in __getitem__
return self._getitem_tuple(key)
      ^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1368, in _getitem_tuple
return self._getitem_lowerdim(tup)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1089, in _getitem_lowerdim
return getattr(section, self.name)
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1191, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PYTHON_NEW\pythonProject\.venv\Lib\site-packages\pandas\core\indexing.py", line 1418, in _getitem_axis
raise ValueError("Cannot index with multidimensional key")
ValueError: Cannot index with multidimensional key

Process finished with exit code 1

页: [1]

鱼C论坛's Archiver

根据excel列数据提取模版，生成新的excel，有图片和格式的