Investigate performance with big k-mer sizes (e.g. 5,001)

Example: when creating the dot plot of two ~5 Mbp genomes using _k_ = 5,001, the common substrings method runs out of memory, even though it handles _k_ = 33 for the same sequences fine. I'm not sure exactly why.

However, the "suff-only" method handles this same case okay.

It would be nice to have some guidance on why big _k_-mer sizes cause problems and how to handle them. I am not sure if anyone out there is regularly creating dot plots with _k_-mer sizes in the thousands and up, but apparently this tool can do that at least :)
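
As a starting point for the investigation, here is a rough sketch of how peak memory could be compared across values of _k_. It is not this tool's code: `shared_kmer_positions` is a hypothetical, deliberately naive stand-in that stores every _k_-mer as its own string, and `peak_memory_bytes` is a hypothetical helper built on the standard library's `tracemalloc`. If a method materializes _k_-mers like this, it holds roughly (n - k + 1) * k characters, which for n = 5,000,000 is about 165 MB at _k_ = 33 but about 25 GB at _k_ = 5,001. Whether the common substrings method actually does this has not been confirmed.

```python
import random
import tracemalloc


def shared_kmer_positions(s1, s2, k):
    """Naive stand-in (NOT the tool's implementation).

    Indexes every k-mer of s1 in a dict keyed by the k-mer string, then looks
    up each k-mer of s2. Returns a list of (i, j) start positions that match.
    """
    index = {}
    for i in range(len(s1) - k + 1):
        index.setdefault(s1[i:i + k], []).append(i)
    matches = []
    for j in range(len(s2) - k + 1):
        for i in index.get(s2[j:j + k], ()):
            matches.append((i, j))
    return matches


def peak_memory_bytes(func, *args):
    """Run func(*args); return (result, peak bytes traced by tracemalloc)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    # Small random sequence so this runs quickly; real genomes are far larger.
    rng = random.Random(0)
    seq = "".join(rng.choice("ACGT") for _ in range(20_000))
    for k in (33, 501):
        _, peak = peak_memory_bytes(shared_kmer_positions, seq, seq, k)
        print(f"k = {k}: peak ~ {peak / 1e6:.1f} MB")
```

Swapping the stand-in for the tool's actual common substrings and "suff-only" code paths (on truncated inputs, so nothing runs out of memory) should show whether memory grows with _k_ in one method but not the other. Note that `tracemalloc` only sees allocations made through Python's allocators, so memory used inside C extensions may not be counted.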

Investigate performance with big k-mer sizes (e.g. 5,001) #22