Commit 3df3506

Merge pull request #995 from hugolefeuvre/genes_collection_metaG
Genes catalogue build workflow from metagenomic raw reads data
2 parents a271047 + 67e8f03 commit 3df3506

9 files changed

+2214
-8
lines changed

workflows/bacterial_genomics/amr_gene_detection/amr_gene_detection-tests.yml

Lines changed: 2 additions & 2 deletions
@@ -7,8 +7,8 @@
       - hash_function: SHA-1
         hash_value: a3545f886ccca21cba986bf12daa2240f36ef1b6
     Select a taxonomy group point mutation: Enterococcus_faecalis
-    Select a AMR genes detection database: amrfinderplus_V3.12_2024-05-02.2
-    Select a virulence genes detection database: vfdb
+    AMR genes detection database: amrfinderplus_V3.12_2024-05-02.2
+    Virulence genes detection database: vfdb
     Select a StarmAMR database: staramr_downloaded_07042025_resfinder_d1e607b_pointfinder_694919f_plasmidfinder_3e77502
     Select species for point mutation analysis using PointFinder: enterococcus_faecalis
   outputs:

workflows/bacterial_genomics/amr_gene_detection/amr_gene_detection.ga

Lines changed: 4 additions & 4 deletions
@@ -147,10 +147,10 @@
             "inputs": [
                 {
                     "description": "Select the database to identify AMR genes with AMRFinderPlus.",
-                    "name": "Select a AMR genes detection database"
+                    "name": "AMR genes detection database"
                 }
             ],
-            "label": "Select a AMR genes detection database",
+            "label": "AMR genes detection database",
             "name": "Input parameter",
             "outputs": [],
             "position": {
@@ -174,10 +174,10 @@
             "inputs": [
                 {
                     "description": "Select the database to identify virulence genes with ABRicate.",
-                    "name": "Select a virulence genes detection database"
+                    "name": "Virulence genes detection database"
                 }
             ],
-            "label": "Select a virulence genes detection database",
+            "label": "Virulence genes detection database",
             "name": "Input parameter",
             "outputs": [],
             "position": {

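The label renames in this `.ga` file go together with the renamed keys in `amr_gene_detection-tests.yml`, since the test job refers to workflow inputs by their labels. A minimal Python sketch of such a consistency check is given here. It assumes the workflow JSON keeps its steps under a `steps` mapping with a `label` field per step; the helper name `unmatched_job_keys` and the trimmed sample workflow are hypothetical and only illustrate the idea.

```python
import json


def unmatched_job_keys(workflow: dict, job_keys: list[str]) -> list[str]:
    """Return the job keys that match no step label in the workflow."""
    labels = {
        step.get("label")
        for step in workflow.get("steps", {}).values()
        if step.get("label")
    }
    return [key for key in job_keys if key not in labels]


# Trimmed, hypothetical stand-in for the .ga file after this commit.
workflow = json.loads("""
{
  "steps": {
    "0": {"label": "AMR genes detection database", "name": "Input parameter"},
    "1": {"label": "Virulence genes detection database", "name": "Input parameter"}
  }
}
""")

old_keys = [
    "Select a AMR genes detection database",
    "Select a virulence genes detection database",
]
new_keys = [
    "AMR genes detection database",
    "Virulence genes detection database",
]

print(unmatched_job_keys(workflow, old_keys))  # both old keys are reported
print(unmatched_job_keys(workflow, new_keys))  # nothing is reported
```

Running a check like this against the real files would flag a test file that still uses the old `Select a ...` keys after the workflow labels were shortened.
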
workflows/genome_annotation/functional-annotation/functional-annotation-protein-sequences/Functional_annotation_of_protein_sequences.ga

Lines changed: 1 addition & 1 deletion
@@ -124,7 +124,7 @@
             ]
         },
         "2": {
-            "annotation": "EggNOG Mapper compares each protein sequence of the annotation to a huge set of ortholog groups from the EggNOG database.",
+            "annotation": "EggNOG Mapper compares each protein sequence of the annotation to a huge set of ortholog groups from the eggNOG database.",
             "content_id": "toolshed.g2.bx.psu.edu/repos/galaxyp/eggnog_mapper/eggnog_mapper/2.1.8+galaxy4",
             "errors": null,
             "id": 2,

workflows/genome_annotation/functional-annotation/functional-annotation-protein-sequences/README.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 This workflow uses eggNOG mapper and Interproscan for functional annotation of protein sequences.
 It can be used on proteins from any organism.
 
-EggNOG Mapper compares each protein sequence of the annotation to a huge set of ortholog groups from the EggNOG database. In this database, each ortholog group is associated with functional annotation like Gene Ontology (GO) terms or KEGG pathways. When the protein sequence of a new gene is found to be very similar to one of these ortholog groups, the corresponding functional annotation is transfered to this new gene.
+EggNOG Mapper compares each protein sequence of the annotation to a huge set of ortholog groups from the eggNOG database. In this database, each ortholog group is associated with functional annotation like Gene Ontology (GO) terms or KEGG pathways. When the protein sequence of a new gene is found to be very similar to one of these ortholog groups, the corresponding functional annotation is transfered to this new gene.
 
 InterProScan is a tool that analyses each protein sequence from our annotation to determine if they contain one or several of the signatures from InterPro. When a protein contains a known signature, the corresponding functional annotation will be assigned to it by InterProScan.
 

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+version: 1.2
+workflows:
+- name: main
+  subclass: Galaxy
+  publish: true
+  primaryDescriptorPath: /metagenomic-genes-catalogue.ga
+  testParameterFiles:
+  - /metagenomic-genes-catalogue-tests.yml
+  authors:
+  - name: ABRomics
+
+  - name: abromics-consortium
+    url: https://www.abromics.fr/
+  - name: Hugo Lefeuvre
+    orcid: 0009-0005-6834-4058

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+# Changelog
+
+## [1.0] - 2025-10-14
+
+- First release

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+# Genes catalogue generation from metagenomic reads
+
+This workflow generates genes catalogue from paired short reads.
+
+The workflow supports assembly using **MEGAHIT**.
+
+## Genes catalogue Annotation and Quality Control
+
+After assembly, CDS are detected from resulting contigs with **Prodigal** and the potential genes are clustered with **MMseqs2linclust**
+
+The following processing steps are then performed:
+
+- **Genes annotation** with Eggnog-mapper
+- **Taxonomic Assignment** using MMseqs2taxonomy
+- **Assembly Quality Control** via QUAST
+- **Abundance Estimation** per sample with CoverM
+- **AMR detection** with ABRicate, AMRFinderPlus and starAMR
+
+## Input Requirements
+
+Input reads must be quality-filtered, with host reads removed.
+
+- **Trimmed reads**: Quality-trimmed reads from individual samples, used solely for abundance estimation.

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
+- doc: Test outline for Genes_catalogue_paired_collection_metaG_ABRomics
+  job:
+    Metagenomics Trimmed reads:
+      class: Collection
+      collection_type: list:paired
+      elements:
+      - class: Collection
+        type: paired
+        identifier: genes_catalogue_test
+        elements:
+        - class: File
+          identifier: forward
+          location: https://zenodo.org/records/17348100/files/genes_catalogue_R1.fastqsanger.gz
+        - class: File
+          identifier: reverse
+          location: https://zenodo.org/records/17348100/files/genes_catalogue_R2.fastqsanger.gz
+    eggNOG database: 5.0.2
+    mmseqs2 taxonomy DB: UniRef50-17-b804f-07112025
+    starAMR database: staramr_downloaded_07042025_resfinder_d1e607b_pointfinder_694919f_plasmidfinder_3e77502
+    Virulence genes detection database: vfdb
+    AMR genes detection database: amrfinderplus_V3.12_2024-05-02.2
+  outputs:
+    MMseqs2 Taxonomy Filtered:
+      asserts:
+        has_text:
+          text: "domain"
+        has_n_columns:
+          n: 4
+    Eggnog Annotation Filtered:
+      asserts:
+        has_text:
+          text: "Contig id"
+        has_n_columns:
+          n: 21
+    MMseqs2 Taxonomy Kraken:
+      asserts:
+        has_text:
+          text: "cellular root"
+        has_n_columns:
+          n: 6
+    Argnorm AMRfinderplus Report:
+      asserts:
+        has_text:
+          text: "Protein identifier"
+    Eggnog Annotations:
+      asserts:
+        has_text:
+          text: "#query"
+        has_n_columns:
+          n: 21
+    Resfinder:
+      asserts:
+        has_text:
+          text: "Isolate ID"
+        has_n_columns:
+          n: 13
+    Abricate Virulence Report:
+      asserts:
+        has_text:
+          text: "#FILE"
+        has_n_columns:
+          n: 15
+    Amrfinderplus Report:
+      asserts:
+        has_text:
+          text: "Protein identifier"
+        has_n_columns:
+          n: 22
+    Tooldistillator Summarize Collection:
+      element_tests:
+        genes_catalogue_test:
+          asserts:
+          - that: has_text
+            text: "megahit_report"
+          - that: has_text
+            text: "quast_report"
+          - that: has_text
+            text: "prodigal_report"
+          - that: has_text
+            text: "coverm_report"
+    Tooldistillator Summarize Catalogue:
+      asserts:
+      - that: has_text
+        text: "eggnogmapper_report"
+      - that: has_text
+        text: "mmseqs2linclust_report"
+      - that: has_text
+        text: "staramr_report"
+      - that: has_text
+        text: "mmseqs2taxonomy_report"
+      - that: has_text
+        text: "abricate_report"
+      - that: has_text
+        text: "amrfinderplus_report"
+      - that: has_text
+        text: "argnorm_report"
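
The `has_text` and `has_n_columns` assertions used throughout this test file can be approximated in a few lines of Python. This is a rough sketch, not Galaxy's implementation: it assumes tab-separated output and counts columns on the first line only. The function names simply mirror the assertion names, and the sample table is made up for illustration rather than taken from a real run.

```python
def has_text(output: str, text: str) -> bool:
    """True if the given text occurs anywhere in the output."""
    return text in output


def has_n_columns(output: str, n: int, sep: str = "\t") -> bool:
    """True if the first line of the output splits into exactly n columns."""
    lines = output.splitlines()
    first_line = lines[0] if lines else ""
    return len(first_line.split(sep)) == n


# Hypothetical four-column table standing in for a taxonomy output.
sample = "gene_1\t2\tdomain\tBacteria\ngene_2\t2157\tdomain\tArchaea\n"

print(has_text(sample, "domain"))
print(has_n_columns(sample, 4))
print(has_n_columns(sample, 6))
```

Checks of this kind verify only the shape of an output and the presence of a marker string, which keeps the tests stable when database versions shift the exact values.
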
