pandas: computing the mean fails when a column contains blank values. Any help appreciated, thanks.
data = {"A":['5','95','0.43','86.33','84.17','95','72.81','69.93','95','83.86','64.11','','89.16','','','0.11','','89.92','96','34.82','79.88','79.54','86.35','16.65','87.72'],"B":['0','0','0','-8.67','-3.61','0','-6.49','0','0','0','0','','-1.35','','','0','','-8.34','0','0','0','-9.72','0','0','-7.28']}
df = pd.DataFrame(data)
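Compute the mean of column A.
Count how many values in column B are greater than 0, equal to 0, less than 0, and blank.
Columns A and B both contain blank entries (empty strings), so the code below does not seem to work. Could someone take another look? Thanks.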
# Compute the mean of column A
A_average = df['A'].astype(float).mean()
print(A_average)
# Count the values in column B: greater than 0, equal to 0, less than 0, and blank
greater_than_zero = (df['B'].astype(float) > 0).sum()
equals_zero = (df['B'].astype(float) == 0).sum()
less_than_zero = (df['B'].astype(float) < 0).sum()
null_values = df['B'].isnull().sum()
print(greater_than_zero)
print(equals_zero)
print(less_than_zero)
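print(null_values)

Hello! In this case you can first replace the empty strings with numpy.nan; astype(float) cannot convert an empty string, so the calculations only work once the blanks are NaN. Please try the following code: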
import pandas as pd
import numpy as np
data = {"A":['5','95','0.43','86.33','84.17','95','72.81','69.93','95','83.86','64.11','','89.16','','','0.11','','89.92','96','34.82','79.88','79.54','86.35','16.65','87.72'],
"B":['0','0','0','-8.67','-3.61','0','-6.49','0','0','0','0','','-1.35','','','0','','-8.34','0','0','0','-9.72','0','0','-7.28']}
df = pd.DataFrame(data)
# Replace empty strings with np.nan
df.replace("", np.nan, inplace=True)
# Compute the mean of column A (mean() skips NaN by default)
A_average = df['A'].astype(float).mean()
print(A_average)
# Compute the counts for column B (comparisons against NaN are False, so blanks are not counted)
greater_than_zero = (df['B'].astype(float) > 0).sum()
equals_zero = (df['B'].astype(float) == 0).sum()
less_than_zero = (df['B'].astype(float) < 0).sum()
null_values = df['B'].isnull().sum()
print(greater_than_zero)
print(equals_zero)
print(less_than_zero)
print(null_values)
This code should do what you need. If you have any other questions, feel free to ask!
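As an alternative to replacing the empty strings by hand, the conversion and the blank handling can probably be done in one step with pd.to_numeric and errors="coerce". This is only a sketch and not part of the reply above; the names a, b, a_mean and counts are made up for illustration, and it assumes every non-numeric entry should be treated as blank.

```python
import pandas as pd

data = {"A": ['5','95','0.43','86.33','84.17','95','72.81','69.93','95','83.86','64.11','','89.16','','','0.11','','89.92','96','34.82','79.88','79.54','86.35','16.65','87.72'],
        "B": ['0','0','0','-8.67','-3.61','0','-6.49','0','0','0','0','','-1.35','','','0','','-8.34','0','0','0','-9.72','0','0','-7.28']}
df = pd.DataFrame(data)

# errors="coerce" turns anything that cannot be parsed as a number
# (including the empty strings here) into NaN instead of raising.
a = pd.to_numeric(df["A"], errors="coerce")
b = pd.to_numeric(df["B"], errors="coerce")

# mean() ignores NaN by default, so blanks do not affect the average.
a_mean = a.mean()

# Comparisons against NaN evaluate to False, so blanks fall into none
# of the three sign buckets and are counted separately with isna().
counts = {
    "greater_than_zero": int((b > 0).sum()),
    "equals_zero": int((b == 0).sum()),
    "less_than_zero": int((b < 0).sum()),
    "blank": int(b.isna().sum()),
}

print(a_mean)
print(counts)
```

One thing to keep in mind with this approach: errors="coerce" also silently turns typos such as '12a' into NaN, so it is worth checking the blank count against what you expect.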