Code:
import pandas as pd
import numpy as np
# Open the spreadsheet; read_excel returns a DataFrame
df = pd.read_excel("analysis.xlsx")
dg = df.fillna(value=0)
With that setup, the following problem shows up:
>>> dg.shape[0]
3455
>>> df.shape[0]
3455
>>> di = dg.loc[dg["REPLY_TO_RECEIVE_POWER_SUPPLY_SCHEME_END_DATE"]!=0]
>>> di.shape[0]
3345
>>> di["REPLY_TO_RECEIVE_POWER_SUPPLY_SCHEME_END_DATE"][40]
Traceback (most recent call last):
File "<pyshell#4>", line 1, in <module>
di["REPLY_TO_RECEIVE_POWER_SUPPLY_SCHEME_END_DATE"][40]
File "C:\Users\Chysial\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\series.py", line 767, in __getitem__
result = self.index.get_value(self, key)
File "C:\Users\Chysial\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\indexes\base.py", line 3118, in get_value
tz=getattr(series.dtype, 'tz', None))
File "pandas\_libs\index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 964, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 40
>>> di["REPLY_TO_RECEIVE_POWER_SUPPLY_SCHEME_END_DATE"][0]
Timestamp('2018-02-23 10:09:51')
>>> di["REPLY_TO_RECEIVE_POWER_SUPPLY_SCHEME_END_DATE"][3447]
Timestamp('2018-05-14 14:18:54')
>>>
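Specifically: I filtered for the rows where the column is not 0, and di.shape[0] did go down accordingly, since the rows that were 0 are gone. Evaluating di["REPLY_TO_RECEIVE_POWER_SUPPLY_SCHEME_END_DATE"] on its own also displays the column without any problem. The trouble starts when I index at a position that was blank in the original df["REPLY_TO_RECEIVE_POWER_SUPPLY_SCHEME_END_DATE"]: di["REPLY_TO_RECEIVE_POWER_SUPPLY_SCHEME_END_DATE"][40], as in the example, raises a KeyError. By rights, after df.fillna(value=0) and the filter, that row should no longer be in di at all. And although di.shape[0] is 3345, di["REPLY_TO_RECEIVE_POWER_SUPPLY_SCHEME_END_DATE"][3447] still returns a result. Could anyone help me understand this?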
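A likely explanation, sketched on a small made-up frame (the column name VALUE and the data are assumptions, not taken from analysis.xlsx): a boolean filter keeps the original row labels rather than renumbering them, and on an integer index `series[n]` looks up the label n, not the n-th row. So a filtered-out label such as 40 raises KeyError, while a surviving label such as 3447 still works even though it is larger than the row count.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel data: blanks at labels 1 and 3.
df = pd.DataFrame({"VALUE": [1.5, np.nan, 2.5, np.nan, 3.5]})

dg = df.fillna(value=0)
di = dg.loc[dg["VALUE"] != 0]

# The filter drops rows but keeps their original labels.
print(di.shape[0])     # 3
print(list(di.index))  # [0, 2, 4]

# Label lookup: 4 survives the filter, 1 does not.
print(di["VALUE"][4])  # 3.5
try:
    di["VALUE"][1]
    missing_label_failed = False
except KeyError:
    missing_label_failed = True
print(missing_label_failed)  # True

# Positional lookup ignores the labels.
second_row = di["VALUE"].iloc[1]
print(second_row)      # 2.5

# Alternatively, renumber the rows 0..n-1 after filtering.
dj = di.reset_index(drop=True)
print(dj["VALUE"][1])  # 2.5
```

If the goal is "the n-th remaining row", `.iloc[n]` or `reset_index(drop=True)` after the filter should give that; if the goal is "the row that had label n in the spreadsheet", checking `n in di.index` first avoids the KeyError.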