Hi! For what you describe, you can handle this with a short Python script. Here is an example:
import pandas as pd

# Read the txt file, splitting fields on the "@#" separator.
# Assumes the first line is a header that includes the "序号" and "释义" columns.
data = pd.read_csv("1.简明蒙古语词典_任务1最终版 - 副本.txt", sep="@#", engine="python")

# Extract the headword (词条) and the definition (释义) from one row
def extract_info(row):
    # Treat a missing definition as an empty string so the substring checks work
    meaning = "" if pd.isna(row["释义"]) else str(row["释义"])
    word = ""
    if "【" in meaning:
        start_pos = meaning.index("【")
        end_pos = meaning.find("】", start_pos)
        if end_pos != -1:
            # Headword is the text inside the first 【...】 pair
            word = meaning[start_pos + 1:end_pos]
            rest = meaning[end_pos + 1:]
            next_start = rest.find("【")
            next_end = rest.find("】")
            if -1 < next_start < next_end:
                # A second 【...】 pair follows: keep its content and the text after it
                meaning = meaning[:start_pos] + rest[next_start + 1:next_end] + rest[next_end + 1:]
            else:
                meaning = meaning[:start_pos]
        else:
            # Opening bracket without a closing one: everything after it is the headword
            word = meaning[start_pos + 1:]
            meaning = ""
    # Build the new row
    return {
        "序号": row["序号"],
        "词条": word,
        "西里尔蒙古文": "",
        "释义": meaning,
    }

# Process every row: keep the original row, followed by the extracted row
new_data = []
for _, row in data.iterrows():
    new_data.append(row.to_dict())
    new_data.append(extract_info(row))

# Convert the rows to a DataFrame and save it as an Excel file.
# Columns missing from a row (e.g. "词条" in the original rows) are left empty.
new_data = pd.DataFrame(new_data)
new_data.to_excel("提取后的数据.xlsx", index=False)
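Make sure you have pandas and openpyxl installed (openpyxl is used to read and write Excel files). Name the txt file "1.简明蒙古语词典_任务1最终版 - 副本.txt" and run the code above; it will produce an Excel file named "提取后的数据.xlsx" containing the extracted data.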
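If you want to sanity-check the bracket handling before running anything on the whole file, here is a minimal sketch that works on plain strings and needs only the standard library. It is a simplified variant, not the same logic as the script: it handles only the first complete 【...】 pair, using a regular expression. The helper name `split_headword` and the sample strings are made up for illustration.

```python
import re

# Hypothetical helper: split a definition string into
# (headword, remaining text) using the first complete 【...】 pair.
BRACKETED = re.compile(r"【([^【】]*)】")

def split_headword(meaning):
    match = BRACKETED.search(meaning)
    if match is None:
        # No complete 【...】 pair: return the text unchanged
        return "", meaning
    headword = match.group(1)
    # Remove the bracketed part and keep the text on both sides of it
    remainder = meaning[:match.start()] + meaning[match.end():]
    return headword, remainder.strip()

# Made-up sample strings, for illustration only
samples = ["【ном】book", "no brackets here", "x【ус】y"]
for text in samples:
    print(split_headword(text))
```

Trying a few real lines from your file this way should show quickly whether the brackets in your data look the way the script expects.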
Hope this helps! If you have any questions, feel free to ask.