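First, this was written on Windows 10, and I am currently learning bioinformatics. My PC has a 9th-gen i7, yet these few lines of code take 2-3 seconds to run, which is certainly not acceptable in bioinformatics. I would like to ask the experts here how to optimize the code below. Thanks.
The code is: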
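# Read the FASTA file into a dict: header ID -> list of sequence lines
dict1 = {}
with open('total.fa') as f:
    for i in f:
        if i.startswith('>'):
            # Key is the header up to the first space, e.g. '>X02662.1'
            a = i.split(' ')[0]
            dict1[a] = []
        else:
            dict1[a].append(i)

# Print the records sorted by header ID, each sequence joined onto one line
keys = list(dict1.keys())
keys.sort()
for i in keys:
    print(i)
    for k in dict1[i]:
        print(k.split('\n')[0], end='')
    print()
    print()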
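For comparison, here is a minimal sketch of the same logic with the output assembled in memory, so it can be written in a single call instead of one print per sequence line. This has not been profiled: that the many small print calls (console output on Windows can be slow) account for the 2-3 seconds is an assumption, not a measurement. The function names read_fasta and format_sorted are hypothetical, chosen only for this illustration.

```python
def read_fasta(path):
    """Return {header_id: sequence}, each sequence joined into one string."""
    records = {}
    name = None      # header ID of the record currently being read
    chunks = []      # sequence lines collected for that record
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if name is not None:
                    records[name] = "".join(chunks)
                # Same key as the original code: header up to the first space
                name = line.split(" ")[0]
                chunks = []
            elif name is not None:
                chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def format_sorted(records):
    """Build the whole output as one string: ID, sequence, blank line."""
    parts = []
    for name in sorted(records):
        parts.append(name + "\n" + records[name] + "\n\n")
    return "".join(parts)
```

To use it, write the result once, for example sys.stdout.write(format_sorted(read_fasta('total.fa'))) after import sys. Timing the read step and the output step separately would show whether the console really is the slow part.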
The original file, total.fa, is as follows:
>X02662.1 E. coli gap gene for GAPDH (glyceraldehyde-3-phosphate dehydrogenase)
GATCAAACAGTGATATACGCCGTCACGCTTGTTATGCAGTAAACGACCCGTAAATGGCGGCTCTGTCCCA
TGATTCTGCGTCACGTAAAACTGCATCTCGGACAAATTTTTTTTCAGTTCTTCTGCCGAAGTTTATTAGC
CATTTGCTCACATCTCACTTTAATCGTGCTCACATTACGTGACTGATTCTAACAAAACATTAACACCAAC
TGGCAAAATTTTGTCCTAAACTTGATCTCGACGAAATGGCTGCACCTAAATCGTGATGAAAATCACATTT
TTATCGTAATTGCCCTTTAAAATTCGGGGCGCCGACCCCATGTGGTCTCAAGCCCAAAGGAAGAGTGAGG
CGAGTCAGTCGCGTAATGCTTAGGCACAGGATTGATTTGTCGCAATGATTGACACGATTCGCTTGACGCT
GCGTAAGGTTTTTGTAATTTTACAGGCAACCTTTTATTCACTAACAAATAGCTGGTGGAATATATGACTA
TCAAAGTAGGTATCAACGGTTTTGGCCGTATCGGTCGCATTGTTTTCCGTGCTGCTCAGAAACGTTCTGA
CATCGAGATCGTTGCAATCAACGACCTGTTAGACGCTGATTACATGGCATACATGCTGAAATATGACTCC
ACTCACGGCCGTTTCGACGGTACCGTTGAAGTGAAAGACGGTCATCTGATCGTTAACGGTAAAAAAATCC
GTGTTACCGCTGAACGTGATCCGGCTAACCTGAAATGGGACGAAGTTGGTGTTGACGTTGTCGCTGAAGC
AACTGGTCTGTTCCTGACTGACGAAACTGCTCGTAAACACATCACCGCTGGTGCGAAGAAAGTGGTTATG
ACTGGTCCGTCTAAAGACAACACTCCGATGTTCGTTAAAGGCGCTAACTTCGACAAATATGCTGGCCAGG
ACATCGTTTCCAACGCTTCCTGCACCACCAACTGCCTGGCTCCGCTGGCTAAAGTTATCAACGATAACTT
CGGCATCATCGAAGGTCTGATGACCACCGTTCACGCTACTACCGCTACTCAGAAAACCGTTGATGGCCCG
TCTCACAAAGACTGGCGCGGCGGCCGCGGCGCTTCCCAGAACATCATCCCGTCCTCTACCGGTGCTGCTA
AAGCTGTAGGTAAAGTACTGCCAGAACTGAATGGCAAACTGACTGGTATGGCGTTCCGCGTTCCGACCCC
GAACGTATCTGTAGTTGACCTGACCGTTCGTCTGGAAAAAGCTGCAACTTACGAGCAGATCAAAGCTGCC
GTTAAAGCTGCTGCTGAAGGCGAAATGAAAGGCGTTCTGGGCTACACCGAAGATGACGTAGTATCTACCG
ATTTCAACGGCGAAGTTTGCACTTCCGTGTTCGATGCTAAAGCTGGTATCGCTCTGAACGACAACTTCGT
GAAACTGGTATCCTGGTACGACAACGAAACCGGTTACTCCAACAAAGTTCTGGACCTGATCGCTCACATC
TCCAAATAAGTTGAGATGACACTGTGATCACACCATCGTCACAGCCTTCGATC
>gi|325296756|ref|NM_001204686.1|:1-968 Aplysia californica insulin precursor (PIN), mRNA
CCTGAATATAGCCAACTAAATTCTAGGAACTCTAAGAGGACTACGCTTGTCTCCAACATCTTATCGTCAA
CATCTTCTGCAAGCGATAACTATATTTCTGGTCCGCCAAAGTAGTATACGCTAAGAACAAGAGGAAGAGA
GTCGTAAGGTTTTTTATTCCCAGCCGGCGAGAGCAGAAACTGTTGTTCTAGCTGCCTTTCTGGTCTTAAC
AGGACCATTTTGCTGGCCAGTGAAAAACTAACTCGGGTGAAACAACATTGGTGCTACCAGCCTCTCCTGA
CTGTTCCAACGGTGCCTTCTCGTAGCCAGAATGAGCAAGTTCCTCCTCCAGAGCCACTCCGCCAACGCCT
GCCTGCTCACCCTTCTGCTCACGCTGGCCTCCAACCTCGACATATCCCTGGCCAACTTCGAGCACTCGTG
CAACGGCTACATGCGGCCCCACCCGCGGGGTCTGTGCGGCGAAGACCTGCACGTCATCATTTCCAACCTG
TGCAGCTCTCTGGGGGGCAACAGGAGGTTCCTGGCCAAGTACATGGTCAAAAGAGACACGGAAAATGTGA
ACGACAAGTTACGAGGGATCCTGCTCAATAAGAAAGAAGCTTTCTCCTACTTGACCAAGAGAGAGGCCTC
AGGCTCCATCACATGCGAATGTTGCTTCAACCAGTGTCGGATATTTGAGCTGGCTCAGTACTGCCGTCTG
CCAGACCATTTCTTCTCCAGAATATCCAGAACCGGAAGGAGCAACAGTGGACATGCGCAGTTGGAGGACA
ACTTTAGTTAGACATGTTGAGGGCGTAAATGCTTTTAAAATTTTTAATTTGGTGATTATTATTATAAAGG
AGGAGTCCACGTGGTGTCAGATTTAGCGGGTTTTTTCCACGTGTTTGACTAAAGTTTCCAGATTTATTTC
ATACCAGCGATACCCGCAGGAATAGAAGGTCCCCTAAGAAGCTGAAGGCATTATTGAT
>gi|568815582|ref|NC_000016.10|:176553-177647 Homo sapiens chromosome 16, GRCh38.p13 Primary Assembly
CAGGCCCCGCCCGGGACTCCCCTGCGGTCCAGGCCGCGCCCCGGGCTCCGCGCCAGCCAATGAGCGCCGC
CCGGCCGGGCGTGCCCCCGCGCCCCAAGCATAAACCCTGGCGCGCTCGCGGCCCGGCACTCTTCTGGTCC
CCACAGACTCAGAGAGAACCCACCATGGTGCTGTCTCCTGCCGACAAGACCAACGTCAAGGCCGCCTGGG
GTAAGGTCGGCGCGCACGCTGGCGAGTATGGTGCGGAGGCCCTGGAGAGGTGAGGCTCCCTCCCCTGCTC
CGACCCGGGCTCCTCGCCCGCCCGGACCCACAGGCCACCCTCAACCGTCCTGGCCCCGGACCCAAACCCC
ACCCCTCACTCTGCTTCTCCCCGCAGGATGTTCCTGTCCTTCCCCACCACCAAGACCTACTTCCCGCACT
TCGACCTGAGCCACGGCTCTGCCCAGGTTAAGGGCCACGGCAAGAAGGTGGCCGACGCGCTGACCAACGC
CGTGGCGCACGTGGACGACATGCCCAACGCGCTGTCCGCCCTGAGCGACCTGCACGCGCACAAGCTTCGG
GTGGACCCGGTCAACTTCAAGGTGAGCGGCGGGCCGGGAGCGATCTGGGTCGAGGGGCGAGATGGCGCCT
TCCTCGCAGGGCAGAGGATCACGCGGGTTGCGGGAGGTGTAGCGCAGGCGGCGGCTGCGGGCCTGGGCCC
TCGGCCCCACTGACCCTCTTCTCTGCACAGCTCCTAAGCCACTGCCTGCTGGTGACCCTGGCCGCCCACC
TCCCCGCCGAGTTCACCCCTGCGGTGCACGCCTCCCTGGACAAGTTCCTGGCTTCTGTGAGCACCGTGCT
GACCTCCAAATACCGTTAAGCTGGAGCCTCGGTGGCCATGCTTCTTGCCCCTTGGGCCTCCCCCCAGCCC
CTCCTCCCCTTCCTGCACCCGTACCCCCGTGGTCTTTGAATAAAGTCTGAGTGGGCGGCAGCCTGTGTGT
GCCTGAGTTTTTTCCCTCAGCAAACGTGCCAGGCATGGGCGTGGACAGCAGCTGGGACACACATGGCTAG
AACCTCTCTGCAGCTGGATAGGGTAGGAAAAGGCAGGGGCGGGAG
>gi|568815582|ref|NC_000016.10|:172750-173834 Homo sapiens chromosome 16, GRCh38.p13 Primary Assembly
AGGCCCCGCCCGGGACTCCCCTGCGGTCCAGGCCGCGCCCCGGGCTCCGCGCCAGCCAATGAGCGCCGCC
CGGCCGGGCGTGCCCCCGCGCCCCAAGCATAAACCCTGGCGCGCTCGCGGGCCGGCACTCTTCTGGTCCC
CACAGACTCAGAGAGAACCCACCATGGTGCTGTCTCCTGCCGACAAGACCAACGTCAAGGCCGCCTGGGG
TAAGGTCGGCGCGCACGCTGGCGAGTATGGTGCGGAGGCCCTGGAGAGGTGAGGCTCCCTCCCCTGCTCC
GACCCGGGCTCCTCGCCCGCCCGGACCCACAGGCCACCCTCAACCGTCCTGGCCCCGGACCCAAACCCCA
CCCCTCACTCTGCTTCTCCCCGCAGGATGTTCCTGTCCTTCCCCACCACCAAGACCTACTTCCCGCACTT
CGACCTGAGCCACGGCTCTGCCCAGGTTAAGGGCCACGGCAAGAAGGTGGCCGACGCGCTGACCAACGCC
GTGGCGCACGTGGACGACATGCCCAACGCGCTGTCCGCCCTGAGCGACCTGCACGCGCACAAGCTTCGGG
TGGACCCGGTCAACTTCAAGGTGAGCGGCGGGCCGGGAGCGATCTGGGTCGAGGGGCGAGATGGCGCCTT
CCTCTCAGGGCAGAGGATCACGCGGGTTGCGGGAGGTGTAGCGCAGGCGGCGGCTGCGGGCCTGGGCCGC
ACTGACCCTCTTCTCTGCACAGCTCCTAAGCCACTGCCTGCTGGTGACCCTGGCCGCCCACCTCCCCGCC
GAGTTCACCCCTGCGGTGCACGCCTCCCTGGACAAGTTCCTGGCTTCTGTGAGCACCGTGCTGACCTCCA
AATACCGTTAAGCTGGAGCCTCGGTAGCCGTTCCTCCTGCCCGCTGGGCCTCCCAACGGGCCCTCCTCCC
CTCCTTGCACCGGCCCTTCCTGGTCTTTGAATAAAGTCTGAGTGGGCAGCAGCCTGTGTGTGCCTGGGTT
CTCTCTATCCCGGAATGTGCCAACAATGGAGGTGTTTACCTGTCTCAGACCAAGGACCTCTCTGCAGCTG
CATGGGGCTGGGGAGGGAGAACTGCAGGGAGTATG