Could someone please help? This is my first post, and it's urgent.
The data is below; the first column is in sequential order.
scaffold_1..12548466 6
scaffold_1..12548467 6
scaffold_1..12548468 6
scaffold_1..12548469 6
scaffold_1..12548470 6
scaffold_1..12548471 6
scaffold_1..12548472 6
scaffold_1..12548473 5
scaffold_1..12548474 5
scaffold_1..12548475 5
scaffold_1..12548476 5
scaffold_1..12548477 5
scaffold_1..12548478 5
scaffold_1..12548479 4
scaffold_1..12548480 4
scaffold_1..12548481 4
scaffold_1..12548482 4
scaffold_1..12548483 4
Desired output format:
scaffold_1..12548466 scaffold_1..12548472 6 7
scaffold_1..12548473 scaffold_1..12548478 5 6
scaffold_1..12548479 scaffold_1..12548483 4 5
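If a row's second column equals the previous row's second column, the two rows belong to the same run. For each run, output the first and last entries of the first column, then the second-column value and how many rows the run contains (the first column is sorted in order).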
Last edited by xingkong0214 on 2021-11-12 12:10
# coding:utf-8
if __name__ == '__main__':
    # Maps each value to the first/last position seen and the row count.
    # Note: keyed by value, so this assumes each value forms one contiguous block.
    output_text = dict()
    # Assume the data is read from a file
    with open(r'test_data.txt', 'r') as f:
        for each_line in f:
            fields = each_line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            serial_number, value = fields[0], fields[1]
            # Accumulate the statistics in the dictionary
            if value not in output_text:
                output_text[value] = {'start': serial_number, 'stop': serial_number, 'count': 1}
            else:
                output_text[value]['stop'] = serial_number
                output_text[value]['count'] += 1
    # Print in the requested format
    for value, info in output_text.items():
        print(info['start'], info['stop'], value, info['count'])
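One caveat with the dictionary approach above: because it is keyed by the value, a value that shows up again later in the file (say 6, 5, then 6 again) would be merged into a single entry. If the data can look like that, a run-based version may be closer to what the question asks for. The following is only a sketch under that assumption; `collapse_runs` is a name made up for this example, and it uses `itertools.groupby` from the standard library, which groups adjacent items only.

```python
from itertools import groupby


def collapse_runs(lines):
    """Collapse adjacent rows sharing a value into (start, stop, value, count)."""
    # Split each non-blank line into [position, value].
    rows = (line.split() for line in lines if line.strip())
    result = []
    # groupby only groups *adjacent* rows with the same key, so a value that
    # reappears later in the input starts a new run instead of being merged.
    for value, run in groupby(rows, key=lambda row: row[1]):
        run = list(run)
        result.append((run[0][0], run[-1][0], value, len(run)))
    return result


if __name__ == '__main__':
    # Tiny made-up sample (not the poster's data) where the value 6 recurs.
    sample = ["a 6", "b 6", "c 5", "d 6"]
    for start, stop, value, count in collapse_runs(sample):
        print(start, stop, value, count)
```

Since the function accepts any iterable of lines, an open file handle can be passed in directly, for example `collapse_runs(f)` inside a `with open('test_data.txt') as f:` block.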