
1. Getting started

Felicia Sandberg edited this page Feb 25, 2025 · 8 revisions

Get Patchwork

If you haven't installed Patchwork yet, begin by following the instructions for installing Patchwork from source.

Getting help

You can launch Patchwork with the flag --help to see all of the command-line options.

$ julia --project=. src/Patchwork.jl --help

Minimal example

There are a couple of example files in the test folder of the source directory. After installing Patchwork, you can run the following command to check that everything was installed correctly.

$ julia --project=. src/Patchwork.jl --contigs test/07673.fna --reference test/07673_Alitta_succinea.faa
P A T C H W O R K v0.7.2
- Developers: Felicia Sandberg, Clara G. Köhne    - Cite      : 10.1093/gbe/evad227
- Contact   : <[email protected]>         - DIAMOND v.: 2.1.8
- Docs      : github.com/fethalen/patchwork/wiki  - Threads   : 12
──────────────────────────────────────────────────────────────────────────
INPUT NEEDED: found output from a previous run, overwrite old files? (y/n):y
[ Info: Sequence data file format detected: FASTA
[ Info: Building DIAMOND database
[ Info: Aligning query sequences against reference database
[ Info: Merging overlapping hits
[ Info: Trimming alignments
──────────────────────────────────────────────────────────────────────────
# query files        : 1
# query sequences    : 1
# query nucleotides  : 1048
# reference sequences: 1
# alignments out     : 1 (100.0%)
┌────────────────┬───────┬───────┬────────┬───────┐
│       variable │  mean │   min │ median │   max │
├────────────────┼───────┼───────┼────────┼───────┤
│  reference_len │ 354.0 │   354 │  354.0 │   354 │
│      query_len │  18.0 │    18 │   18.0 │    18 │
│        regions │   1.0 │     1 │    1.0 │     1 │
│        contigs │   1.0 │     1 │    1.0 │     1 │
│        matches │  16.0 │    16 │   16.0 │    16 │
│     mismatches │   2.0 │     2 │    2.0 │     2 │
│      deletions │ 336.0 │   336 │  336.0 │   336 │
│ query_coverage │  5.08 │  5.08 │   5.08 │  5.08 │
│       identity │ 88.89 │ 88.89 │  88.89 │ 88.89 │
└────────────────┴───────┴───────┴────────┴───────┘
Query coverage in markers:
            ┌                                        ┐ 
   0.01-20% ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 1   
     21-40% ┤ 0                                        
     41-60% ┤ 0                                        
     61-80% ┤ 0                                        
    81-100% ┤ 0                                        
            └                                        ┘ 
Percent identity in markers:
             ┌                                        ┐ 
     missing ┤ 0                                        
       ≤ 30% ┤ 0                                        
    > 31-50% ┤ 0                                        
    > 51-70% ┤ 0                                        
    > 71-90% ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 1   
   > 91-100% ┤ 0                                        
             └                                        ┘ 
 12.261641 seconds (16.78 M allocations: 860.561 MiB, 1.42% gc time, 76.49% compilation time: 10% of which was recompilation)

In this run, the following output files were created:

patchwork_output/
├── database.dmnd
├── diamond_blastx.log
├── diamond_makedb.log
├── diamond_out
│   ├── ...
│   └── SEQUENCE_NAME.tsv
├── dna_query_sequences
│   ├── ...
│   └── SEQUENCE_NAME.fas
├── plots
│   ├── percent_identity.png
│   └── query_coverage.png
├── query_sequences
│   ├── ...
│   └── SEQUENCE_NAME.fas
├── sequence_stats
│   ├── average.csv
│   └── statistics.csv
├── trimmed_alignments.txt
└── untrimmed_alignments.txt
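
As a quick sanity check after a run, the top-level entries in the listing above can be verified with a few lines of Python. This is only a sketch: the names are copied from the listing, the function name missing_outputs is hypothetical, and the exact set of outputs may differ between Patchwork versions.

```python
from pathlib import Path

# Top-level entries copied from the listing above; adjust if your version differs.
EXPECTED = [
    "database.dmnd",
    "diamond_blastx.log",
    "diamond_makedb.log",
    "diamond_out",
    "dna_query_sequences",
    "plots",
    "query_sequences",
    "sequence_stats",
    "trimmed_alignments.txt",
    "untrimmed_alignments.txt",
]


def missing_outputs(output_dir="patchwork_output"):
    """Return the expected entries that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_outputs()
    print("all expected outputs present" if not missing else f"missing: {missing}")
```

An empty result means that every entry from the listing was found in the output directory.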

First, we can take a look at the alignments file (trimmed_alignments.txt or untrimmed_alignments.txt in the listing above). This file contains a visual representation of each alignment that was made:

1. -----------------------------------------------------------------------------

Reference ID:        [email protected]
Reference Length:    354
Query Length:        295
Contigs:             3
Matches:             172
Mismatches:          123
Deletions:           59
Occupancy:           0.8333333333333334

  seq:   1 AIMELGTAEELAGLIKFTSPFFVSMWKAKAAKLVKALVD-YSRHGSRNRKRSRTMQRMH*  59
           || ||| ||||||||||| ||     ||||||||  |||          |          
  ref:   1 AILELGDAEELAGLIKFTRPFLSHVSKAKAAKLVRHLVDKFLDMEAGTGKEVDLCKECID  60

  seq:  60 MGKNRKTNILRQALEARLI*LYFETGQYEDALGLEG*LIKELKKMDDKALLVEVQLLE*K 119
                |   |||||||||| |   |  || || |   | ||||| ||||||||||||| |
  ref:  61 WAMDEKRTFLRQALEARLISLFYDTKRYEEALQLGSSLLKELKKLDDKALLVEVQLLESK 120

  seq: 120 VYHALTNYQKSTSSTNMSTNNSKRNLLPTKTTSRTRLQ*GILHAADEKDFKTAF*YFYEA 179
           ||||| |  |                 | |      || | ||||||||||||| |||||
  ref: 121 VYHALSNLPKARAALTSGRTTANGIYCPPKLQAALDLQSGVLHAADEKDFKTAFSYFYEA 180

  seq: 180 FEGYD*VE*-AKAVIALKYMLLAKIMLNAADEVQ*IL*GKLALKYAGPAVEAMK*IAKA* 238
           ||||| |    ||  ||||||| ||||| || || |  ||||| | || | ||| || | 
  ref: 181 FEGYDSVDNNPKALTALKYMLLSKIMLNNADDVQSIVSGKLALRYSGPDVDAMKSIAQAS 240

  seq: 239 RDR*LAEFQQT*VIYK*ELVDDPIVRAH*TD-YTTTTRTKLCRIIEPF*RVQVNHIAN-- 295
             | || || |   ||  | ||||||||    |       |||||||| ||||||||   
  ref: 241 QKRSLADFQETLQKYKGQLADDPIVRAHLDSLYDSLLEQNLCRIIEPFSRVQVNHIASLI 300

  seq: 295 ------------------------------------------------------ 295
                                                                 
  ref: 301 KLPVEDVEKKLSQMILDKKFSGILDQGSGVLVVFDETPVDKTYDTALEVVDSLY 354
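
The header fields of the alignment above are related by simple arithmetic, which the following sketch makes explicit. The relationships are inferred from the numbers shown on this page (they also reproduce the summary table of the minimal example), not taken from Patchwork's source code, and the function name summarize is hypothetical.

```python
def summarize(reference_length, matches, mismatches):
    """Derive the remaining header fields from the reference length and counts."""
    # Assumption: every aligned query position is either a match or a mismatch.
    query_length = matches + mismatches
    return {
        "query_length": query_length,
        # Assumption: reference positions not covered by the query count as deletions.
        "deletions": reference_length - query_length,
        "occupancy": query_length / reference_length,
        "identity": round(100 * matches / query_length, 2),
    }


# Values copied from the alignment header above.
print(summarize(reference_length=354, matches=172, mismatches=123))
# {'query_length': 295, 'deletions': 59, 'occupancy': 0.8333333333333334, 'identity': 58.31}

# Values copied from the summary table of the minimal example.
print(summarize(reference_length=354, matches=16, mismatches=2))
```

For the minimal example, the same arithmetic yields a query length of 18, 336 deletions and an identity of 88.89, matching the table printed by Patchwork. The query_coverage column there (5.08) is the occupancy expressed as a percentage.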

Statistics are written to the statistics folder. If you have multiple reference files, you will find the statistics for all of the output in patchwork_output/statistics/average.csv and each individual entry in patchwork_output/statistics/statistics.csv. For example:

$ cat patchwork_output/statistics/statistics.csv | column -t
id                length_reference  length_query  regions  contigs  matches  mismatches  deletions  occupancy
[email protected]  354               295           2        2        172      123         59         0.83
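
For downstream filtering, the per-entry statistics can be read with Python's standard csv module. The sketch below assumes that the file is comma-separated and has the header row shown above. The function name low_occupancy, the 0.5 threshold and the ids in the inline sample are placeholders chosen for illustration.

```python
import csv
import io


def low_occupancy(csv_text, threshold=0.5):
    """Return the ids of entries whose occupancy is below the threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["id"] for row in reader if float(row["occupancy"]) < threshold]


# Inline sample that mirrors the columns shown above; the ids are placeholders.
SAMPLE = """id,length_reference,length_query,regions,contigs,matches,mismatches,deletions,occupancy
marker_a,354,295,2,2,172,123,59,0.83
marker_b,354,18,1,1,16,2,336,0.05
"""

print(low_occupancy(SAMPLE))  # ['marker_b']
```

To apply this to a real run, pass the contents of statistics.csv, read from disk, instead of the inline sample.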
