1. Getting started
If you haven't installed Patchwork yet, begin by following the instructions for installing Patchwork from source.
You can launch Patchwork with the flag --help to see all of the command-line options.
$ julia --project=. src/Patchwork.jl --help
There are a couple of example files in the test folder of the source directory. After installing Patchwork, you can run the following command to check that everything was installed correctly.
$ julia --project=. src/Patchwork.jl --contigs test/07673.fna --reference test/07673_Alitta_succinea.faa
P A T C H W O R K v0.7.2
- Developers: Felicia Sandberg, Clara G. Köhne - Cite : 10.1093/gbe/evad227
- Contact : <[email protected]> - DIAMOND v.: 2.1.8
- Docs : github.com/fethalen/patchwork/wiki - Threads : 12
──────────────────────────────────────────────────────────────────────────
INPUT NEEDED: found output from a previous run, overwrite old files? (y/n):y
[ Info: Sequence data file format detected: FASTA
[ Info: Building DIAMOND database
[ Info: Aligning query sequences against reference database
[ Info: Merging overlapping hits
[ Info: Trimming alignments
──────────────────────────────────────────────────────────────────────────
# query files : 1
# query sequences : 1
# query nucleotides : 1048
# reference sequences: 1
# alignments out : 1 (100.0%)
┌────────────────┬───────┬───────┬────────┬───────┐
│ variable │ mean │ min │ median │ max │
├────────────────┼───────┼───────┼────────┼───────┤
│ reference_len │ 354.0 │ 354 │ 354.0 │ 354 │
│ query_len │ 18.0 │ 18 │ 18.0 │ 18 │
│ regions │ 1.0 │ 1 │ 1.0 │ 1 │
│ contigs │ 1.0 │ 1 │ 1.0 │ 1 │
│ matches │ 16.0 │ 16 │ 16.0 │ 16 │
│ mismatches │ 2.0 │ 2 │ 2.0 │ 2 │
│ deletions │ 336.0 │ 336 │ 336.0 │ 336 │
│ query_coverage │ 5.08 │ 5.08 │ 5.08 │ 5.08 │
│ identity │ 88.89 │ 88.89 │ 88.89 │ 88.89 │
└────────────────┴───────┴───────┴────────┴───────┘
Query coverage in markers:
┌ ┐
0.01-20% ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 1
21-40% ┤ 0
41-60% ┤ 0
61-80% ┤ 0
81-100% ┤ 0
└ ┘
Percent identity in markers:
┌ ┐
missing ┤ 0
≤ 30% ┤ 0
> 31-50% ┤ 0
> 51-70% ┤ 0
> 71-90% ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 1
> 91-100% ┤ 0
└ ┘
12.261641 seconds (16.78 M allocations: 860.561 MiB, 1.42% gc time, 76.49% compilation time: 10% of which was recompilation)
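The derived rows in the summary table can be reproduced from the raw counts. The following is a minimal sketch of that arithmetic; the formulas are inferred from the numbers printed above, not taken from Patchwork's source, so treat them as an assumption.

```python
# Values copied from the summary table above.
reference_len = 354
query_len = 18
matches = 16
mismatches = 2
deletions = 336

# Assumption (inferred from the numbers, not from Patchwork's source):
# aligned query positions are matches plus mismatches, and the remaining
# reference positions are counted as deletions.
assert matches + mismatches == query_len
assert query_len + deletions == reference_len

# Query coverage: share of the reference covered by the query, in percent.
query_coverage = round(100 * query_len / reference_len, 2)
# Identity: share of aligned query positions that match, in percent.
identity = round(100 * matches / query_len, 2)

print(query_coverage, identity)  # 5.08 88.89
```

Both values agree with the query_coverage and identity rows of the table.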
In this run, the following output files were created:
patchwork_output/
├── database.dmnd
├── diamond_blastx.log
├── diamond_makedb.log
├── diamond_out
│ ├── ...
│ └── SEQUENCE_NAME.tsv
├── dna_query_sequences
│ ├── ...
│ └── SEQUENCE_NAME.fas
├── plots
│ ├── percent_identity.png
│ └── query_coverage.png
├── query_sequences
│ ├── ...
│ └── SEQUENCE_NAME.fas
├── sequence_stats
│ ├── average.csv
│ └── statistics.csv
├── trimmed_alignments.txt
└── untrimmed_alignments.txt
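To confirm that a run produced the files listed above, you could use a small check along these lines. This is a sketch, not part of Patchwork: the function name missing_outputs is hypothetical, and the list of expected entries is copied from the listing above, so it may differ between Patchwork versions.

```python
from pathlib import Path

# Entries copied from the directory listing above. The per-sequence files
# inside diamond_out, dna_query_sequences and query_sequences are not
# checked because their names depend on the input.
EXPECTED_OUTPUTS = [
    "database.dmnd",
    "diamond_blastx.log",
    "diamond_makedb.log",
    "diamond_out",
    "dna_query_sequences",
    "plots/percent_identity.png",
    "plots/query_coverage.png",
    "query_sequences",
    "sequence_stats/average.csv",
    "sequence_stats/statistics.csv",
    "trimmed_alignments.txt",
    "untrimmed_alignments.txt",
]


def missing_outputs(output_dir):
    """Return the expected entries that do not exist under output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_outputs("patchwork_output")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected output files are present.")
```

Run it from the directory that contains patchwork_output.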
First, we can take a look at the alignment files (trimmed_alignments.txt and untrimmed_alignments.txt in the listing above). These files contain a visual representation
of each alignment that was made:
1. -----------------------------------------------------------------------------
Reference ID: [email protected]
Reference Length: 354
Query Length: 295
Contigs: 3
Matches: 172
Mismatches: 123
Deletions: 59
Occupancy: 0.8333333333333334
seq: 1 AIMELGTAEELAGLIKFTSPFFVSMWKAKAAKLVKALVD-YSRHGSRNRKRSRTMQRMH* 59
|| ||| ||||||||||| || |||||||| ||| |
ref: 1 AILELGDAEELAGLIKFTRPFLSHVSKAKAAKLVRHLVDKFLDMEAGTGKEVDLCKECID 60
seq: 60 MGKNRKTNILRQALEARLI*LYFETGQYEDALGLEG*LIKELKKMDDKALLVEVQLLE*K 119
| |||||||||| | | || || | | ||||| ||||||||||||| |
ref: 61 WAMDEKRTFLRQALEARLISLFYDTKRYEEALQLGSSLLKELKKLDDKALLVEVQLLESK 120
seq: 120 VYHALTNYQKSTSSTNMSTNNSKRNLLPTKTTSRTRLQ*GILHAADEKDFKTAF*YFYEA 179
||||| | | | | || | ||||||||||||| |||||
ref: 121 VYHALSNLPKARAALTSGRTTANGIYCPPKLQAALDLQSGVLHAADEKDFKTAFSYFYEA 180
seq: 180 FEGYD*VE*-AKAVIALKYMLLAKIMLNAADEVQ*IL*GKLALKYAGPAVEAMK*IAKA* 238
||||| | || ||||||| ||||| || || | ||||| | || | ||| || |
ref: 181 FEGYDSVDNNPKALTALKYMLLSKIMLNNADDVQSIVSGKLALRYSGPDVDAMKSIAQAS 240
seq: 239 RDR*LAEFQQT*VIYK*ELVDDPIVRAH*TD-YTTTTRTKLCRIIEPF*RVQVNHIAN-- 295
| || || | || | |||||||| | |||||||| ||||||||
ref: 241 QKRSLADFQETLQKYKGQLADDPIVRAHLDSLYDSLLEQNLCRIIEPFSRVQVNHIASLI 300
seq: 295 ------------------------------------------------------ 295
ref: 301 KLPVEDVEKKLSQMILDKKFSGILDQGSGVLVVFDETPVDKTYDTALEVVDSLY 354
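If you want to process these records with a script, the numeric header fields are easy to pick out. The sketch below parses the header of the record shown above and checks how the occupancy relates to the two lengths. The function name parse_header is hypothetical, and the relation occupancy = query length / reference length is inferred from this one record rather than from Patchwork's source.

```python
import re

# Header lines copied from the alignment record above (reference ID omitted).
record_header = """\
Reference Length: 354
Query Length: 295
Contigs: 3
Matches: 172
Mismatches: 123
Deletions: 59
Occupancy: 0.8333333333333334
"""


def parse_header(text):
    """Collect 'Label: number' lines into a dict; other lines are skipped."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*([A-Za-z ]+):\s*([0-9.]+)\s*$", line)
        if match:
            label, value = match.groups()
            fields[label.strip()] = float(value) if "." in value else int(value)
    return fields


header = parse_header(record_header)

# Assumption inferred from this record: occupancy is the query length
# divided by the reference length.
occupancy = header["Query Length"] / header["Reference Length"]
print(round(occupancy, 2))                         # 0.83
print(abs(occupancy - header["Occupancy"]) < 1e-9)  # True
```

The seq and ref lines do not match the pattern, so the same function can be applied to a whole record without stripping the alignment first.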
Statistics are written to the sequence_stats folder. If you have multiple reference files,
you will find the statistics averaged over all of the output in patchwork_output/sequence_stats/average.csv and the values for each
individual entry in patchwork_output/sequence_stats/statistics.csv. For example:
$ cat patchwork_output/sequence_stats/statistics.csv | column -t
id                 length_reference  length_query  regions  contigs  matches  mismatches  deletions  occupancy
[email protected]  354               295           2        2        172      123         59         0.83
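With many markers, you may want to filter this table by occupancy. The following sketch shows one way to do that. The marker IDs, the second row, the function names and the 0.5 threshold are all made up for illustration; only the column names come from the output above. Because the exact delimiter of statistics.csv is not shown here, the parser accepts both commas and whitespace.

```python
import re

# Hypothetical sample in the layout shown above. The IDs and the second row
# are invented; only the column names come from the document.
sample = """\
id length_reference length_query regions contigs matches mismatches deletions occupancy
marker_a 354 295 2 2 172 123 59 0.83
marker_b 200 40 1 1 30 10 160 0.20
"""


def read_rows(text):
    """Split each line on commas or whitespace and map values to column names."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    header = re.split(r"[,\s]+", lines[0])
    return [dict(zip(header, re.split(r"[,\s]+", line))) for line in lines[1:]]


def well_covered(rows, min_occupancy=0.5):
    """Return the IDs whose occupancy is at least min_occupancy (arbitrary default)."""
    return [row["id"] for row in rows if float(row["occupancy"]) >= min_occupancy]


rows = read_rows(sample)
print(well_covered(rows))  # ['marker_a']
```

To use it on a real run, replace sample with the text read from your own statistics.csv. IDs that contain spaces or commas would need a stricter parser.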