Code 1
import re

# Open a sample FASTQ file for reading
filename = '/scratch/SampleDataFiles/Sample.R1.fastq'
with open(filename, 'r') as read_sample:
    for line in read_sample:
        # Strip the trailing newline character
        line = line.rstrip()
        # Keep only lines made up entirely of the bases A, T, G, C, N
        if re.match('^[ATGCN]+$', line):
            # Print the line
            print(line)
Code 2
seq = 'GCCGGCCCTCAGACAGGAGTGGTCCTGGATGTGGATG'
kmer_length = 6
kmer_dictionary = {}
# Number of start positions that leave room for a full k-mer
stop = len(seq) - kmer_length + 1
for start in range(0, stop):
    kmer = seq[start:start + kmer_length]
    if kmer in kmer_dictionary:
        kmer_dictionary[kmer] += 1
    else:
        kmer_dictionary[kmer] = 1
t = "\t"
# The with block closes the output file once every line is written
with open("newlines1.txt", 'w') as aip_kmers:
    for kmer in kmer_dictionary:
        count = kmer_dictionary[kmer]
        out = (kmer, str(count))
        # Print each k-mer and its count, tab-separated, and write the same line to the file
        print(t.join(out))
        aip_kmers.write(t.join(out) + "\n")
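A question for the experts: Code 1 produces a very large amount of data. How can I feed all of that output from Code 1 into the seq = ' ' at the top of Code 2?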
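One possible approach, offered as a minimal sketch rather than the definitive answer: instead of pasting the output of Code 1 into seq, wrap both pieces in functions and pass each sequence line straight into the k-mer counter. The function names iter_sequences, count_kmers and write_counts are hypothetical (they are not in the original post), and the sketch assumes k-mers should be counted read by read, so that no k-mer spans two reads. Note that the same regular expression would also match a quality line that happens to contain only the letters A, T, G, C, N.

```python
import re
from collections import Counter

# Same pattern as Code 1: a line made up only of the bases A, T, G, C, N
SEQ_LINE = re.compile(r'^[ATGCN]+$')


def iter_sequences(lines):
    """Yield each line that consists only of A, T, G, C, N (hypothetical helper)."""
    for line in lines:
        line = line.rstrip()
        if SEQ_LINE.match(line):
            yield line


def count_kmers(sequences, kmer_length=6):
    """Count k-mers read by read, so no k-mer spans two reads (an assumption)."""
    counts = Counter()
    for seq in sequences:
        stop = len(seq) - kmer_length + 1
        for start in range(stop):
            counts[seq[start:start + kmer_length]] += 1
    return counts


def write_counts(counts, out_path):
    """Write one tab-separated 'kmer<TAB>count' line per k-mer."""
    with open(out_path, 'w') as out_file:
        for kmer, count in counts.items():
            out_file.write(f"{kmer}\t{count}\n")


# Usage with the paths from the post (left commented so the sketch imports cleanly):
# with open('/scratch/SampleDataFiles/Sample.R1.fastq', 'r') as read_sample:
#     counts = count_kmers(iter_sequences(read_sample), kmer_length=6)
# write_counts(counts, 'newlines1.txt')
```

Because iter_sequences is a generator, the file is processed one line at a time and the full set of reads never has to sit in memory, which matters when Code 1 returns a lot of data.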