My code:
import re
import json
import pandas as pd
rr = '''Data_holderStructure ={"series":[{"name":"机构持有比例","data":[0.59,0.63,0.6,1.1]},{"name":"个人持有比例","data":[99.41,99.37,99.4,98.9]},{"name":"内部持有比例","data":[0.0,0.0,0.0,0.0]}],"categories":["2021-06-30","2021-12-31","2022-06-30","2022-12-31"]};'''
result = re.findall(r'Data_holderStructure =(\{.*\});', rr)[0]
parsed_result = json.loads(result)['series']
df = pd.DataFrame(parsed_result)
Output of df:
   name            data
0  机构持有比例    [0.59, 0.63, 0.6, 1.1]
1  个人持有比例    [99.41, 99.37, 99.4, 98.9]
2  内部持有比例    [0.0, 0.0, 0.0, 0.0]
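The result I want is shown below. Could someone help me get there? (Another table layout is fine too, as long as the values inside the lists are spread out into separate cells.)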
categories    机构持有比例    个人持有比例    内部持有比例
2021-06-30    0.59            99.41           0
2021-12-31    0.63            99.37           0
2022-06-30    0.16            99.4            0.6
2022-12-31    1.1             98.9            0
This post was last edited by isdkz on 2023-4-2 22:19
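As I understand it, you want to turn the original mixed structure of dictionaries and lists into a DataFrame and reorganize the data. Here is one way to convert the raw data into a DataFrame in the desired layout: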
import re
import json
import pandas as pd
rr = '''Data_holderStructure ={"series":[{"name":"机构持有比例","data":[0.59,0.63,0.6,1.1]},{"name":"个人持有比例","data":[99.41,99.37,99.4,98.9]},{"name":"内部持有比例","data":[0.0,0.0,0.0,0.0]}],"categories":["2021-06-30","2021-12-31","2022-06-30","2022-12-31"]};'''
result = re.findall(r'Data_holderStructure =(\{.*\});', rr)[0]
parsed_result = json.loads(result)
# Collect the series into a dict that maps each name to its list of values
data = {}
for item in parsed_result['series']:
    data[item['name']] = item['data']
# Build the DataFrame, using categories as the index
df = pd.DataFrame(data, index=parsed_result['categories'])
# Give the index a name
df.index.name = "categories"
print(df)
Output:
              机构持有比例    个人持有比例    内部持有比例
categories
2021-06-30    0.59            99.41           0.0
2021-12-31    0.63            99.37           0.0
2022-06-30    0.60            99.40           0.0
2022-12-31    1.10            98.90           0.0
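The resulting DataFrame now has the layout you asked for. Note that in the sample data you provided, "内部持有比例" is [0.0, 0.0, 0.0, 0.0], so the output differs from your expected result in the values of that column.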
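If you would rather start from the two-column df shown in the question (a name column plus a list-valued data column), the lists can also be expanded directly. The following is only a sketch of that alternative, not part of the original answer; the variable names payload and wide are made up for illustration.

```python
import json
import re

import pandas as pd

rr = '''Data_holderStructure ={"series":[{"name":"机构持有比例","data":[0.59,0.63,0.6,1.1]},{"name":"个人持有比例","data":[99.41,99.37,99.4,98.9]},{"name":"内部持有比例","data":[0.0,0.0,0.0,0.0]}],"categories":["2021-06-30","2021-12-31","2022-06-30","2022-12-31"]};'''

# "payload" is an illustrative name for the parsed JSON object.
payload = json.loads(re.findall(r'Data_holderStructure =(\{.*\});', rr)[0])

# Same frame as in the question: one row per series, "data" holds a list.
df = pd.DataFrame(payload['series'])

# Turn each list into a row of separate cells (one row per series name),
# then transpose so that every series name becomes a column.
wide = pd.DataFrame(df['data'].tolist(), index=df['name']).T

# Label the rows with the dates and drop the leftover "name" label
# that the transpose leaves on the column axis.
wide.index = pd.Index(payload['categories'], name='categories')
wide.columns.name = None

print(wide)
```

This relies on every data list having the same length as categories, which holds for the sample string in this thread.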
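If you want categories to be a regular column of the DataFrame rather than the index, the original code only needs a small change: build the DataFrame without an index and insert categories as the first column: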
import re
import json
import pandas as pd
rr = '''Data_holderStructure ={"series":[{"name":"机构持有比例","data":[0.59,0.63,0.6,1.1]},{"name":"个人持有比例","data":[99.41,99.37,99.4,98.9]},{"name":"内部持有比例","data":[0.0,0.0,0.0,0.0]}],"categories":["2021-06-30","2021-12-31","2022-06-30","2022-12-31"]};'''
result = re.findall(r'Data_holderStructure =(\{.*\});', rr)[0]
parsed_result = json.loads(result)
# Collect the series into a dict that maps each name to its list of values
data = {}
for item in parsed_result['series']:
    data[item['name']] = item['data']
# Build the DataFrame
df = pd.DataFrame(data)
# Insert categories as the first column
df.insert(0, 'categories', parsed_result['categories'])
print(df)
Output:
   categories    机构持有比例    个人持有比例    内部持有比例
0  2021-06-30    0.59            99.41           0.0
1  2021-12-31    0.63            99.37           0.0
2  2022-06-30    0.60            99.40           0.0
3  2022-12-31    1.10            98.90           0.0
Now categories is a regular column of the DataFrame rather than its index.
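The two layouts can also be converted into each other after the fact, so the DataFrame does not have to be rebuilt. A minimal sketch, assuming the values from this thread are typed in directly (the names indexed, as_column and back are illustrative only):

```python
import pandas as pd

# Values copied from the sample data in this thread.
categories = ["2021-06-30", "2021-12-31", "2022-06-30", "2022-12-31"]
data = {
    "机构持有比例": [0.59, 0.63, 0.6, 1.1],
    "个人持有比例": [99.41, 99.37, 99.4, 98.9],
    "内部持有比例": [0.0, 0.0, 0.0, 0.0],
}

# Layout 1: categories is the (named) index.
indexed = pd.DataFrame(data, index=pd.Index(categories, name="categories"))

# Index -> ordinary first column, with a fresh 0..n-1 index (layout 2).
as_column = indexed.reset_index()

# Ordinary column -> index again (back to layout 1).
back = as_column.set_index("categories")

print(as_column)
print(back)
```

reset_index and set_index both return new DataFrames here, so indexed itself is left unchanged.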