gzj137070928 posted on 2020-12-15 14:46:25

Data cleaning with Pandas: dropping NaN data

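Dropping NaN values (mainly from a column, i.e. a Series): in pandas you can use boolean selection or the dropna function to remove the NaN entries from a Series column of a DataFrame. Either way a new object is returned and the DataFrame itself is not changed.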
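The post names boolean selection but only demonstrates dropna, so here is a minimal sketch of the two side by side. The Series `s` and its labels are made-up illustration data, not taken from the post.

```python
import numpy as np
import pandas as pd

# Hypothetical example data (not the df used in this post)
s = pd.Series([1.0, np.nan, 3.0], index=["x", "y", "z"], name="e")

# Boolean selection: notna() builds a True/False mask, indexing keeps the True rows
by_mask = s[s.notna()]

# dropna() gives the same result in one call
by_dropna = s.dropna()

print(by_mask.equals(by_dropna))  # both approaches keep the same entries
print(len(s))                     # the original Series still holds its NaN
```

Both calls return a new Series, so `s` keeps its NaN entry unless the result is assigned back to it.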
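import pandas as pd
import numpy as np

# Build a 7x4 integer frame with columns a-d and words as the row index
val = np.arange(10, 38).reshape(7, 4)
col = list("abcd")
idx = "cake make fake sake wake lake take".split()
df = pd.DataFrame(val, columns=col, index=idx)
# Column e: NaN everywhere except two rows
df["e"] = np.nan
df.at["make", "e"] = 100
df.at["wake", "e"] = 300
# New row jake: NaN everywhere except column c
df.loc["jake"] = np.nan
df.at["jake", "c"] = 200
# Column f: entirely NaN
df["f"] = np.nan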
print(df)
print(df.e)
# dropna on a single column returns a new Series; df itself is unchanged
df1 = df.e.dropna()
print('NaN dropped from one column', df1, sep='\n')
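# how defaults to 'any': a row or column is dropped if it contains any NaN
# axis=1 is equivalent to axis='columns'
df2 = df.dropna(axis=1, how='all')
print('Columns that are entirely NaN dropped', df2, sep='\n')
# axis defaults to 0, which is equivalent to axis='index'
df3 = df.dropna(axis=0, how='all')
print('Rows that are entirely NaN dropped', df3, sep='\n')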
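df4 = df.copy()
# Make row jake entirely NaN (np.nan, since the np.NaN alias was removed in NumPy 2.0)
df4.at['jake', 'c'] = np.nan
# inplace=True modifies df4 directly and returns None
df4.dropna(how='all', inplace=True)
print('Rows that are entirely NaN dropped (default axis=0)', df4, sep='\n')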
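To make the difference between the how values easier to see, here is a small sketch on a separate frame. The frame `demo` and its labels are hypothetical. The thresh and subset parameters are not used in the post; they are shown only as related dropna options.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration (not the df from the post)
demo = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0, np.nan],
        "b": [1.0, 2.0, np.nan, np.nan],
        "c": [1.0, 2.0, 3.0, np.nan],
    },
    index=["r1", "r2", "r3", "r4"],
)

any_rows = demo.dropna()                  # how='any' (default): drop rows with at least one NaN
all_rows = demo.dropna(how="all")         # drop only rows where every value is NaN
thresh_rows = demo.dropna(thresh=2)       # keep rows with at least 2 non-NaN values
subset_rows = demo.dropna(subset=["a"])   # only look at column a when deciding

# inplace=True changes the object it is called on and returns None
work = demo.copy()
result = work.dropna(how="all", inplace=True)

print(list(any_rows.index))
print(list(all_rows.index))
print(list(subset_rows.index))
print(result)
```

Without inplace=True every call above returns a new DataFrame and leaves `demo` as it was, which matches the behaviour of df2 and df3 in the post.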