Last edited by ilsoviet1917 on 2020-2-12 20:01
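I am filtering data, but any value above 1000 that contains a comma as a thousands separator raises an error. How should I handle this?
Running E:\数据分析\2>python pandas_value_meets_condition.py supplier_data.csv pandas_output_loc.csv produces the following error: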
Traceback (most recent call last):
  File "pandas_value_meets_condition.py", line 12, in <module>
    data_frame['Cost'] = data_frame['Cost'].str.strip('$').astype(float)
  File "C:\Users\ilsov\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\generic.py", line 5882, in astype
    dtype=dtype, copy=copy, errors=errors, **kwargs
  File "C:\Users\ilsov\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\internals\managers.py", line 581, in astype
    return self.apply("astype", dtype=dtype, **kwargs)
  File "C:\Users\ilsov\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\internals\managers.py", line 438, in apply
    applied = getattr(b, f)(**kwargs)
  File "C:\Users\ilsov\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\internals\blocks.py", line 559, in astype
    return self._astype(dtype, copy=copy, errors=errors, values=values, **kwargs)
  File "C:\Users\ilsov\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\internals\blocks.py", line 643, in _astype
    values = astype_nansafe(vals1d, dtype, copy=True, **kwargs)
  File "C:\Users\ilsov\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\dtypes\cast.py", line 729, in astype_nansafe
    return arr.astype(dtype, copy=True)
ValueError: could not convert string to float: '6,015.00 '
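The CSV file contains two values greater than 1000, both written with a comma separator, which is presumably what causes the error. How can I fix it?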
import pandas as pd
import sys

# Input and output CSV paths come from the command line
input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_csv(input_file)
# Strip the dollar sign and convert to float (this is the line that raises the error)
data_frame['Cost'] = data_frame['Cost'].str.strip('$').astype(float)
# Keep rows whose supplier name contains 'Z' or whose cost exceeds 600
data_frame_value_meets_condition = data_frame.loc[
    (data_frame['Supplier Name'].str.contains('Z')) | (data_frame['Cost'] > 600.0), :]
data_frame_value_meets_condition.to_csv(output_file, index=False)
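One possible way around the error, offered as a sketch rather than a definitive fix: remove the thousands separator (and any stray whitespace) from the Cost strings before calling astype(float). The example below assumes the Cost column is read as text such as "$6,015.00 "; the in-memory sample data and its supplier names are hypothetical stand-ins for supplier_data.csv, whose real contents are not shown in the post.

```python
import io

import pandas as pd

# Hypothetical sample standing in for supplier_data.csv (real contents unknown)
sample_csv = io.StringIO(
    'Supplier Name,Cost\n'
    'Supplier X,$500.00\n'
    'Supplier Y,"$6,015.00 "\n'
    'Supplier Z,"$1,006,015.00"\n'
)
data_frame = pd.read_csv(sample_csv)

# Drop the dollar sign and the comma separators, trim whitespace, then convert
data_frame['Cost'] = (
    data_frame['Cost']
    .str.replace('$', '', regex=False)
    .str.replace(',', '', regex=False)
    .str.strip()
    .astype(float)
)

# Same filter as in the original script
data_frame_value_meets_condition = data_frame.loc[
    data_frame['Supplier Name'].str.contains('Z') | (data_frame['Cost'] > 600.0), :
]
print(data_frame_value_meets_condition)
```

In the original script the same two str.replace calls could take the place of the single str.strip('$') call, with the file paths still read from sys.argv.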