futui posted on 2023-4-8 22:53:42

Problem computing the mean when a pandas column contains blanks. Could an expert please help? Thanks.

data = {"A":['5','95','0.43','86.33','84.17','95','72.81','69.93','95','83.86','64.11','','89.16','','','0.11','','89.92','96','34.82','79.88','79.54','86.35','16.65','87.72'],
"B":['0','0','0','-8.67','-3.61','0','-6.49','0','0','0','0','','-1.35','','','0','','-8.34','0','0','0','-9.72','0','0','-7.28']}

df = pd.DataFrame(data)

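I want the mean of column A.
For column B, I want the number of values greater than 0, equal to 0, and less than 0, plus the number of empty cells.

Because columns A and B both contain blanks, the code below does not seem to work. Could someone take another look? Thanks.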

# Compute the mean of column A
A_average = df['A'].astype(float).mean()
print(A_average)

# Count the values in column B by sign, plus the empty cells
greater_than_zero = (df['B'].astype(float) > 0).sum()
equals_zero = (df['B'].astype(float) == 0).sum()
less_than_zero = (df['B'].astype(float) < 0).sum()
null_values = df['B'].isnull().sum()

print(greater_than_zero)
print(equals_zero)
print(less_than_zero)
print(null_values)

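isdkz posted on 2023-4-8 22:55:58

Hello! The empty strings cannot be converted to float, so replace them with numpy.nan first; after that the calculations work. Please try the following code: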

import pandas as pd
import numpy as np

data = {"A":['5','95','0.43','86.33','84.17','95','72.81','69.93','95','83.86','64.11','','89.16','','','0.11','','89.92','96','34.82','79.88','79.54','86.35','16.65','87.72'],
"B":['0','0','0','-8.67','-3.61','0','-6.49','0','0','0','0','','-1.35','','','0','','-8.34','0','0','0','-9.72','0','0','-7.28']}

df = pd.DataFrame(data)

# Replace empty strings with np.nan
df.replace("", np.nan, inplace=True)

# Compute the mean of column A (mean() skips NaN by default)
A_average = df['A'].astype(float).mean()
print(A_average)

# Compute the counts for column B
greater_than_zero = (df['B'].astype(float) > 0).sum()
equals_zero = (df['B'].astype(float) == 0).sum()
less_than_zero = (df['B'].astype(float) < 0).sum()
null_values = df['B'].isnull().sum()

print(greater_than_zero)
print(equals_zero)
print(less_than_zero)
print(null_values)

This code should do what you need. If you have any other questions, feel free to ask!
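
As a possible alternative to the replace-then-astype approach above, here is a minimal sketch that uses pd.to_numeric with errors="coerce". It assumes that every cell which cannot be parsed as a number (an empty string, or a cell holding only spaces) should count as empty. The names a, b, a_average and counts are illustrative and do not come from the original posts.

```python
import pandas as pd

data = {"A": ['5','95','0.43','86.33','84.17','95','72.81','69.93','95','83.86','64.11','','89.16','','','0.11','','89.92','96','34.82','79.88','79.54','86.35','16.65','87.72'],
        "B": ['0','0','0','-8.67','-3.61','0','-6.49','0','0','0','0','','-1.35','','','0','','-8.34','0','0','0','-9.72','0','0','-7.28']}
df = pd.DataFrame(data)

# Convert each column once; errors="coerce" turns unparseable cells into NaN
# (assumption: blank cells should be treated as missing values).
a = pd.to_numeric(df["A"], errors="coerce")
b = pd.to_numeric(df["B"], errors="coerce")

# Series.mean() skips NaN by default, so only the numeric cells are averaged.
a_average = a.mean()

# Comparisons against NaN are False, so empty cells fall into none of the
# three sign buckets and are counted separately with isna().
counts = {
    "greater_than_zero": int((b > 0).sum()),
    "equals_zero": int((b == 0).sum()),
    "less_than_zero": int((b < 0).sum()),
    "empty": int(b.isna().sum()),
}

print(a_average)
print(counts)
```

With the data from this thread, the four counts for column B come out as 0, 14, 7 and 4. Converting each column once also avoids repeating astype(float) in every comparison.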