poppy章鱼 posted on 2021-12-25 22:00:22

Python help needed: for rows with the same first column, count the second-column categories and their proportions

Sample data:
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment315
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664


Desired result:
block10_scaffold_1        total:21        alignment2        12        0.5714        alignment665        8        0.3810        alignment315        1        0.0476       
block12_scaffold_1        total:26        alignment2        14        0.5385        alignment664        12        0.4615

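For rows that share the same first-column value, count how many rows there are in total, then find the distinct values in the second column and output them after the first-column value: for each type, its count and its proportion of that total. Within each output row, the types are ordered by proportion, from largest to smallest.

Many thanks.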

傻眼貓咪 posted on 2021-12-26 10:48:20

This post was last edited by 傻眼貓咪 on 2021-12-26 11:00

data = """block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment2
block10_scaffold_1...alignment315
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block10_scaffold_1...alignment665
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment2
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664
block12_scaffold_1...alignment664"""

# Map each first-column value to [total count, {second-column value: count}].
arr = dict()

for each in data.split("\n"):
    a, b = each.split("...")
    if a in arr:
        arr[a][0] += 1
        if b in arr[a][1]:
            arr[a][1][b] += 1
        else:
            arr[a][1][b] = 1
    else:
        arr[a] = [1, dict()]
        arr[a][1][b] = 1

for each in arr.items():
    key, res = each
    a, b = res
    print(f"{key} total: {a}", end = "\t")
    ans = []
    for elem in b.items():
        k, v = elem
        ans.append((k, v, v/a))
    # Sort by proportion, largest first.
    ans.sort(key = lambda x: x[2], reverse = True)
    for k, v, p in ans:
        print(f"{k} {v} {p:.4f}", end = "\t")
    print()

Output:
block10_scaffold_1 total: 21    alignment2 12 0.5714    alignment665 8 0.3810    alignment315 1 0.0476
block12_scaffold_1 total: 26    alignment2 14 0.5385    alignment664 12 0.4615
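
For comparison, the same grouping can probably be written more compactly with collections.Counter from the standard library. This is only a minimal sketch: the function name summarize and the sep parameter are made up for illustration, and it assumes every non-blank line contains the "..." separator. The fields are joined with tabs to resemble the layout asked for in the question.

```python
from collections import Counter, defaultdict


def summarize(lines, sep="..."):
    """Group lines by first column; report count and proportion of each second-column value."""
    # first-column value -> Counter of second-column values
    groups = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, value = line.split(sep, 1)
        groups[key][value] += 1

    rows = []
    for key, counter in groups.items():
        total = sum(counter.values())
        fields = [key, f"total:{total}"]
        # most_common() yields (value, count) pairs ordered by count, largest first
        for value, count in counter.most_common():
            fields += [value, str(count), f"{count / total:.4f}"]
        rows.append("\t".join(fields))
    return rows


# Hypothetical shortened sample, built to match the block10 counts in the question.
sample = (
    ["block10_scaffold_1...alignment2"] * 12
    + ["block10_scaffold_1...alignment315"]
    + ["block10_scaffold_1...alignment665"] * 8
)
for row in summarize(sample):
    print(row)
```

Sorting by count gives the same order as sorting by proportion here, because every count in a group is divided by the same total. To read from a file instead, passing the open file object as lines should work, since the function only iterates over it.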

poppy章鱼 posted on 2021-12-26 17:00:52

Thanks for the help, I'm really touched.