PAF format and identity problem #73

@jianshu93

Hi team,

mashmap -k 16 --perc_identity 85 -r GCF_018619585.1_ASM1861958v1_genomic.fna -q query_test5.fasta

minimap2 -x map-hifi -k 16 -w 21 -o try.paf GCF_018619585.1_ASM1861958v1_genomic.fna query_test5.fasta

The mashmap3 PAF output looks strange:

[jzhao399@atl1-1-02-017-36-2 test_superani_minimap2]$ cat try.paf
test05 2880 3 2863 + NZ_JAHHEG010000008.1 2669591 28003 30863 2667 2860 60 tp:A:P cm:i:277 s1:i:2667 s2:i:0 dv:f:0.0002 rl:i:0
[jzhao399@atl1-1-02-017-36-2 test_superani_minimap2]$ cat mashmap.out
test05 2880 0 2880 + NZ_JAHHEG010000008.1 2669591 26319 29199 92 2880 19 id:f:0.988335 kc:f:1.06012

The first is the minimap2 PAF and the second is the mashmap3 PAF, with the same reference and query. According to the PAF specification (https://github.com/lh3/miniasm/blob/master/PAF.md), column 10 is the number of sequence matches and column 11 is the total number of sequence matches, mismatches, insertions and deletions in the alignment. Column 10 is 92 for mashmap3 but 2667 for minimap2. I understand these are different algorithms, but should they differ by this much? Identity from minimap2, computed as column 10 / column 11, is 2667/2860 = 93.2%, but mashmap3 reports 98.8% (id:f:0.988335). Any idea why?
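For reference, the column 10 / column 11 arithmetic above can be reproduced with a short Python sketch. The helper name `paf_identity` is hypothetical, and the lines are split on any whitespace only because the pasted output shows spaces (PAF files themselves are tab-delimited). It just computes the ratio and collects the optional tags. It does not say what either tool intends column 10 to mean.

```python
def paf_identity(line):
    """Return (column 10 / column 11, dict of optional tags) for one PAF line.

    Assumption: fields are whitespace-separated, as in the pasted output.
    """
    fields = line.split()
    matches = int(fields[9])      # column 10
    block_len = int(fields[10])   # column 11
    tags = {}
    for tag in fields[12:]:       # optional TAG:TYPE:VALUE fields
        key, _type, value = tag.split(":", 2)
        tags[key] = value
    return matches / block_len, tags


# The two records exactly as pasted in this issue.
minimap2_line = (
    "test05 2880 3 2863 + NZ_JAHHEG010000008.1 2669591 28003 30863 "
    "2667 2860 60 tp:A:P cm:i:277 s1:i:2667 s2:i:0 dv:f:0.0002 rl:i:0"
)
mashmap_line = (
    "test05 2880 0 2880 + NZ_JAHHEG010000008.1 2669591 26319 29199 "
    "92 2880 19 id:f:0.988335 kc:f:1.06012"
)

mm2_ratio, mm2_tags = paf_identity(minimap2_line)
mashmap_ratio, mashmap_tags = paf_identity(mashmap_line)

print(f"minimap2 col10/col11: {mm2_ratio:.4f}")
print(f"mashmap3 col10/col11: {mashmap_ratio:.4f}")
print(f"mashmap3 id tag:      {mashmap_tags['id']}")
```

On these two records the ratio is about 0.933 for minimap2 and about 0.032 for mashmap3, while the mashmap3 `id:f` tag reads 0.988335, which is the mismatch this issue is asking about.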

Thanks,
Jianshu
