3. How It Works

Felix Thalén edited this page Sep 26, 2021 · 5 revisions

Input/output

Input: a set of DNA query contigs (i.e., the output from a de novo whole-genome assembler such as SPAdes) and a set of protein reference sequences.

Output: a set of translated query sequences that have been merged based on how they align to the provided reference sequences.

Workflow

Patchwork's workflow can be divided into four separate steps:

  1. Pooling and DIAMOND database construction
  2. Alignment of translated DNA sequences to reference amino acid sequences
  3. A "hit-stitching" step, where overlapping hits are merged
  4. Application of filters, statistical reports, and re-evaluation of contig regions with multi-locus hits
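To make step 3 more concrete, here is a minimal Python sketch of what merging overlapping hits along a reference could look like. It is an illustration only, not Patchwork's actual implementation: the `Hit` class, the `stitch` function, and the rule that the higher-scoring hit wins in an overlap are all assumptions made for this example, and hits are treated as gap-free for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical hit: a translated query fragment placed on reference coordinates."""
    ref_start: int  # 1-based, inclusive position on the reference
    ref_end: int    # 1-based, inclusive position on the reference
    residues: str   # translated query residues, one per reference position
    score: float    # alignment score (e.g., a bitscore)


def stitch(hits, ref_length, gap="-"):
    """Merge hits into a single sequence spanning the reference.

    Assumption for this sketch: where hits overlap, the residues of the
    higher-scoring hit are kept. Uncovered positions are filled with `gap`.
    """
    merged = [gap] * ref_length
    # Place low-scoring hits first so that higher-scoring hits overwrite them.
    for hit in sorted(hits, key=lambda h: h.score):
        if len(hit.residues) != hit.ref_end - hit.ref_start + 1:
            raise ValueError("hit length does not match its reference span")
        for offset, residue in enumerate(hit.residues):
            merged[hit.ref_start - 1 + offset] = residue
    return "".join(merged)


# Three made-up hits against a reference of length 12.
hits = [
    Hit(ref_start=1, ref_end=5, residues="MKTAY", score=50.0),
    Hit(ref_start=4, ref_end=8, residues="GYIAK", score=80.0),
    Hit(ref_start=11, ref_end=12, residues="LV", score=30.0),
]
print(stitch(hits, ref_length=12))  # → MKTGYIAK--LV
```

In this toy example, the first two hits overlap at reference positions 4 and 5, so the residues from the higher-scoring hit are kept there, while positions 9 and 10 remain gaps because no hit covers them.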

Graphical Overview

[Figure: Graphical Overview]