This post was last edited by Dawnstar on 2019-11-20 15:19
import pandas as pd

path = 'G:\\deleted entries\\test\\'
df1 = pd.read_table(path + 'phosphorylation-len20.txt')
df1_col = list(df1)  # get the DataFrame's column labels
df1_col.insert(0, df1_col.pop(df1_col.index("Uniprot Entry")))  # put the column labels in the desired order
df1.index = df1['Uniprot Entry']  # use this column as the index
print(df1.index)
print(df1)
df2 = pd.read_table(path + 'deleted.txt')
print(df2.columns)
x = ''
for x in df1.index.tolist():  # look for values that appear in both columns
    if x in df2["Uniprot Entry"].tolist():
        df1 = df1.drop(x)  # delete the matching rows by index label
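The 'Uniprot Entry' column of df2 shares some values with the 'Uniprot Entry' column of df1.
Goal: find the values that appear in both columns and delete the rows of df1 that contain them.
However, running the code above raises this error: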
Traceback (most recent call last):
File "<ipython-input-25-58024f1969bb>", line 1, in <module>
runfile('G:/Posttranslational modification/uniprot deleted entries/test/test.py')
File "d:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
execfile(filename, namespace)
File "d:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "G:/Posttranslational modification/uniprot deleted entries/test/test.py", line 30, in <module>
df1 = df1.drop(x)
File "d:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4102, in drop
errors=errors,
File "d:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py", line 3914, in drop
obj = obj._drop_axis(labels, axis, level=level, errors=errors)
File "d:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py", line 3965, in _drop_axis
raise KeyError("{} not found in axis".format(labels))
KeyError: "['Q19286'] not found in axis"
I can't work out what is causing this. I'd appreciate any pointers from anyone who knows pandas.
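One possible explanation, offered as a guess because the input files are not shown: if the 'Uniprot Entry' column of df1 contains the same accession more than once, `df1.drop(x)` removes every row carrying that label the first time it is reached. The loop iterates over a list taken from the original index, so it meets the same label again later, and the second `drop` raises `KeyError` because nothing with that label is left. The sketch below uses made-up data (the `Position` column and all values except `Q19286` are hypothetical) to reproduce that behaviour, and shows an alternative that filters with a boolean mask built from `Series.isin` instead of dropping inside a loop.

```python
import pandas as pd

# Hypothetical stand-in data; the real files are not shown in the post.
df1 = pd.DataFrame({
    "Uniprot Entry": ["Q19286", "P12345", "Q19286", "O00001"],
    "Position": [10, 20, 30, 40],
})
df2 = pd.DataFrame({"Uniprot Entry": ["Q19286", "X99999"]})


def drop_in_loop(left, right, key="Uniprot Entry"):
    """Mimic the loop from the post; fails when `key` has repeated values."""
    out = left.set_index(key, drop=False)
    deleted = right[key].tolist()
    for label in out.index.tolist():  # snapshot of the original index
        if label in deleted:
            out = out.drop(label)  # removes ALL rows carrying this label
    return out


def drop_with_mask(left, right, key="Uniprot Entry"):
    """Keep only the rows of `left` whose key is absent from `right`."""
    return left[~left[key].isin(right[key])]


# The loop version hits the repeated label a second time and raises KeyError.
try:
    drop_in_loop(df1, df2)
    loop_error = None
except KeyError as exc:
    loop_error = exc

# The mask version does not depend on the index being unique.
filtered = drop_with_mask(df1, df2)
print(loop_error)
print(filtered)
```

If duplicates turn out not to be the cause, the mask approach should still be a safe way to reach the stated goal, since it never asks pandas to drop a label that might already be gone. Passing `errors="ignore"` to `drop` inside the original loop would also silence the exception, though it leaves the repeated list lookups in place.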