Question about ranking the values in each DataFrame column. Could an expert help? Thanks.
import pandas as pd
data = [["2345", "A", "2022-12-20", "2.1016", "-0.72","-2.8", "-0.77"],
["004243", "C", "2022-12-15", "2.0891", "45","-0.77"],
["519212", "B", "2022-12-10", "333", "1.9733", "-0.77", "-3.29"],
["162411", "E", "2022-12-9", "0.7071", "-1.64","25", "-1.46"],
["007844", "FC", "2022-11-19", "0.6967", '8', "88","-1.46"],
["163208", "G", "2022-11-10", "1.0440", "-0.40","" ,"-0.40"]]
columns = ["编号", "名称", "日期", "AA", "BB","CC","DD"]
df = pd.DataFrame(data, columns=columns)
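I want to rank the values in AA, BB, CC and the other value columns of the DataFrame above, each column by its own values, so that the result looks like the table below.
The original data stays untouched; next to each column, add a new column (A, B, C, D) filled with its rank. Any help is appreciated, thanks.
"AA", "CC", "DD": descending
"BB": ascending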
   编号     名称  日期          AA      A  BB      B  CC     C  DD     D
0  2345    A   2022-12-20  2.1016  2  -0.72   2  -2.8   4  -0.77  2
1  004243  C   2022-12-15  2.0891  3  45      6  -0.77  3  None
2  519212  B   2022-12-10  333     1  1.9733  4  -0.77  3  -3.29  4
3  162411  E   2022-12-9   0.7071  5  -1.64   1  25     2  -1.46  3
4  007844  FC  2022-11-19  0.6967  6  8       5  88     1  -1.46  3
5  163208  G   2022-11-10  1.0440  4  -0.40   3            -0.40  1

You can use the pandas rank() method to rank the specified columns of a DataFrame.
Here is some code that does what you want:
import pandas as pd
data = [["2345", "A", "2022-12-20", "2.1016", "-0.72","-2.8", "-0.77"],
["004243", "C", "2022-12-15", "2.0891", "45","-0.77"],
["519212", "B", "2022-12-10", "333", "1.9733", "-0.77", "-3.29"],
["162411", "E", "2022-12-9", "0.7071", "-1.64","25", "-1.46"],
["007844", "FC", "2022-11-19", "0.6967", '8', "88","-1.46"],
["163208", "G", "2022-11-10", "1.0440", "-0.40", "", "-0.40"]]
columns = ["编号", "名称", "日期", "AA", "BB","CC","DD"]
df = pd.DataFrame(data, columns=columns)
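# Convert the string columns to numbers; errors='coerce' turns '' and None into NaN
df['AA'] = df['AA'].astype(float)
df['BB'] = df['BB'].astype(float)
df['CC'] = pd.to_numeric(df['CC'], errors='coerce')
df['DD'] = pd.to_numeric(df['DD'], errors='coerce')
# Rank each column; rank() defaults to method='average', so tied values share the mean of their ranks
df['A'] = df['AA'].rank(ascending=False)
df['B'] = df['BB'].rank(ascending=True)
df['C'] = df['CC'].rank(ascending=False)
df['D'] = df['DD'].rank(ascending=False)
# Show missing values as blanks
df = df.fillna('')
print(df)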
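Running the code above gives you the DataFrame ranked the way you asked.

# method='first' gives tied values distinct ranks, in order of appearance
df['A'] = df['AA'].rank(method='first',ascending=False)
df['B'] = df['BB'].rank(method='first',ascending=True)
df['C'] = df['CC'].rank(method='first',ascending=False)
# to_numeric makes sure DD is numeric before it is ranked
df['D'] = pd.to_numeric(df['DD'], errors='coerce').rank(method='first',ascending=False)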
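
In the desired table from the question, equal values share a rank and the next rank follows without a gap, which neither the default tie handling of rank() nor method='first' produces. Below is a minimal sketch of one way to get that, assuming method='dense' is the intended behavior. The `directions` dict, the use of `df.insert` to place each rank column next to its source, and the nullable `Int64` dtype are illustrative choices, not something taken from the posts above.

```python
import pandas as pd

data = [
    ["2345", "A", "2022-12-20", "2.1016", "-0.72", "-2.8", "-0.77"],
    ["004243", "C", "2022-12-15", "2.0891", "45", "-0.77"],  # short row: DD becomes None
    ["519212", "B", "2022-12-10", "333", "1.9733", "-0.77", "-3.29"],
    ["162411", "E", "2022-12-9", "0.7071", "-1.64", "25", "-1.46"],
    ["007844", "FC", "2022-11-19", "0.6967", "8", "88", "-1.46"],
    ["163208", "G", "2022-11-10", "1.0440", "-0.40", "", "-0.40"],
]
columns = ["编号", "名称", "日期", "AA", "BB", "CC", "DD"]
df = pd.DataFrame(data, columns=columns)

# Ranking direction per source column: True = ascending, False = descending.
# (Hypothetical helper mapping, introduced for this sketch.)
directions = {"AA": False, "BB": True, "CC": False, "DD": False}

for src, ascending in directions.items():
    # Work on a numeric copy so the original string column is left untouched;
    # '' and None become NaN and therefore receive no rank.
    values = pd.to_numeric(df[src], errors="coerce")
    # method="dense": ties share a rank, and the next rank follows without a gap.
    ranks = values.rank(method="dense", ascending=ascending)
    # Nullable Int64 keeps ranks as integers while allowing missing entries.
    # The new column is named after the first letter of the source ("AA" -> "A").
    df.insert(df.columns.get_loc(src) + 1, src[0], ranks.astype("Int64"))

print(df)
```

On the sample data this gives A = 2, 3, 1, 5, 6, 4 and C = 4, 3, 3, 2, 1, <NA>, in line with the table in the question, while the AA to DD columns keep their original string values.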