Commit 38835fb

Merge pull request #76 from fulaibaowang/reports_fraser_mae
Reports fraser mae
2 parents ef662a3 + e428a7f commit 38835fb

34 files changed: 2289 additions and 72 deletions
assets/multiqc_config.yml

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@ report_comment: >
   This report has been generated by the <a href="https://github.com/nf-core/drop/tree/dev" target="_blank">nf-core/drop</a>
   analysis pipeline. For information about how to interpret these results, please see the
   <a href="https://nf-co.re/drop/dev/docs/output" target="_blank">documentation</a>.
+
 report_section_order:
   "nf-core-drop-methods-description":
     order: -1000
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
+custom_content:
+  order:
+    - fraser_overview
+    - q_estimation_psi5
+    - q_estimation_psi3
+    - q_estimation_theta
+    - q_estimation_jaccard
+    - aberrantly_spliced_genes
+    - batch_correlation_psi5_FALSE
+    - batch_correlation_psi5_TRUE
+    - batch_correlation_psi3_FALSE
+    - batch_correlation_psi3_TRUE
+    - batch_correlation_theta_FALSE
+    - batch_correlation_theta_TRUE
+    - batch_correlation_jaccard_FALSE
+    - batch_correlation_jaccard_TRUE
+    - total_outliers
+    - results
+
+custom_data:
+  fraser_overview:
+    section_name: "Fraser overview"
+    format: "tsv"
+    plot_type: "table"
+
+  q_estimation_psi5:
+    section_name: "Hyperparameter optimization - Q_estimation_psi5"
+
+  q_estimation_psi3:
+    section_name: "Hyperparameter optimization - Q_estimation_psi3"
+
+  q_estimation_theta:
+    section_name: "Hyperparameter optimization - Q_estimation_theta"
+
+  q_estimation_jaccard:
+    section_name: "Hyperparameter optimization"
+    description: "Q_estimation_jaccard"
+
+  aberrantly_spliced_genes:
+    section_name: "Aberrantly spliced genes"
+
+  batch_correlation_psi5_FALSE:
+    section_name: "Batch Correlation psi5 raw"
+  batch_correlation_psi5_TRUE:
+    section_name: "Batch correlation psi5 normalized"
+
+  batch_correlation_psi3_FALSE:
+    section_name: "Batch Correlation psi3 raw"
+  batch_correlation_psi3_TRUE:
+    section_name: "Batch correlation psi3 normalized"
+
+  batch_correlation_theta_FALSE:
+    section_name: "Batch Correlation theta raw"
+  batch_correlation_theta_TRUE:
+    section_name: "Batch correlation theta normalized"
+
+  batch_correlation_jaccard_FALSE:
+    section_name: "Batch Correlation jaccard raw"
+  batch_correlation_jaccard_TRUE:
+    section_name: "Batch correlation jaccard normalized"
+
+  total_outliers:
+    section_name: "Results"
+    format: "tsv"
+    plot_type: "table"
+    description: "Total splicing outliers"
+
+  results:
+    section_name: "Results table"
+    description: "FRASER results (up to 500 rows shown)"
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+custom_content:
+  order:
+    - mae_overview
+    - cascade_plot
+    - variant_frequency
+    - median_of_each_category
+    - results
+
+custom_data:
+  mae_overview:
+    section_name: "MAE overview"
+    format: "tsv"
+    plot_type: "table"
+
+  cascade_plot:
+    section_name: "Cascade plot"
+    description: |
+      A cascade plot that shows the progression of added filters:
+
+      - `>10 counts`: only variants supported by more than 10 counts
+      - `+MAE`: and showing monoallelic expression
+      - `+MAE for REF`: the monoallelic expression favors the reference allele
+      - `+MAE for ALT`: the monoallelic expression favors the alternative allele
+      - `rare`:
+        - if add_AF is set to true in the config file, the allele frequency must not exceed the config value max_AF
+        - must meet the inner-cohort frequency maxVarFreqCohort cutoff
+
+  variant_frequency:
+    section_name: "Variant Frequency within Cohort"
+
+  median_of_each_category:
+    section_name: "Median of each category"
+
+  results:
+    section_name: "MAE Results table"
+    description: "MAE results (up to 500 rows shown)"
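The `rare` step in the cascade plot description above combines two cutoffs. A minimal Python sketch of that rule follows. The function name, the argument names, the inclusive comparisons, and the choice to let variants without an allele frequency pass are all assumptions made for illustration; they are not taken from the pipeline code.

```python
from typing import Optional


def is_rare(
    cohort_freq: float,
    max_var_freq_cohort: float,
    allele_freq: Optional[float] = None,
    add_af: bool = False,
    max_af: Optional[float] = None,
) -> bool:
    """Hypothetical helper mirroring the `rare` rule from the description."""
    # The inner-cohort frequency cutoff always applies (inclusive: assumption).
    if cohort_freq > max_var_freq_cohort:
        return False

    # The population allele frequency is only checked when add_AF is enabled.
    if add_af and allele_freq is not None:
        if max_af is None:
            raise ValueError("max_af must be set when add_af is True")
        return allele_freq <= max_af

    # Assumption: a variant with no annotated allele frequency is not excluded.
    return True
```

With `add_af=False` only the cohort frequency decides, so a variant that is common in the population but uncommon in the cohort would still count as rare under this sketch.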
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+custom_content:
+  order:
+    - matching_values_distribution
+    - heatmap_matching_variants
+    - identify_matching_samples
+    - false_matches
+    - false_mismatches
+
+custom_data:
+  matching_values_distribution:
+    section_name: "DNA - RNA matching values distribution"
+
+  heatmap_matching_variants:
+    section_name: "Heatmap of matching variants"
+    description: |
+      Shows the proportion of matching DNA (rows) - RNA (cols) variants. Possible values are:
+
+      - match: the DNA sample matches the annotated RNA sample
+      - no match: the DNA sample does not match the annotated RNA and no match was found
+      - matches other: the DNA sample does not match the annotated RNA, but another match was found
+      - matches more: the DNA sample matches the annotated RNA, but also other RNAs not annotated to match
+      - matches less: the DNA sample is annotated with more than 1 RNA. Not all annotated RNAs are correct.
+
+      The same applies to the RNAs.
+
+  identify_matching_samples:
+    section_name: "Identify matching samples"
+    format: "tsv"
+    plot_type: "table"
+    description: |
+      Considerations: In our experience, the median proportion of matching variants is around 0.95 in matching samples and around 0.58 in non-matching samples. Sometimes we see values between 0.7 and 0.85. That could mean that the DNA-RNA combination is not from the same person but from a relative. It could also be due to a technical error. For those cases, check the following:
+
+      - RNA sequencing depth (low sequencing depth can lead to variants not being found in the RNA)
+      - Number of variants (too many variants called due to sequencing errors)
+      - Ratio of heterozygous/homozygous variants (usually too many called variants means too many heterozygous ones)
+      - Is the sample a relative of the other?
+
+  false_matches:
+    section_name: "Samples that were annotated to match but do not"
+
+  false_mismatches:
+    section_name: "Samples that were not annotated to match but actually do"
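The five heatmap categories listed in the description above can be read as set comparisons between the RNA samples annotated for a DNA sample and the RNA samples it was actually found to match. A minimal Python sketch under that reading follows. The function name and its inputs are hypothetical. How a "match" is decided from the proportion of matching variants is outside this sketch, and checking "matches less" before "matches more" is an assumption.

```python
from typing import Set


def classify_dna_sample(annotated: Set[str], matched: Set[str]) -> str:
    """Hypothetical helper: label one DNA sample by comparing RNA ID sets.

    annotated: RNA IDs the sample annotation says belong to this DNA sample.
    matched:   RNA IDs that were found to match this DNA sample.
    """
    confirmed = annotated & matched   # annotated RNAs that really match
    unexpected = matched - annotated  # matching RNAs that were not annotated

    if not matched:
        return "no match"
    if not confirmed:
        return "matches other"
    if confirmed != annotated:
        # Several RNAs are annotated, but only some of them match.
        return "matches less"
    if unexpected:
        return "matches more"
    return "match"
```

For example, a DNA sample annotated with `{"R1"}` that matches `{"R1", "R2"}` is labelled "matches more", while one annotated with `{"R1", "R2"}` that matches only `{"R1"}` is labelled "matches less".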
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+custom_content:
+  order:
+    - number_of_samples
+    - number_of_introns
+    - number_of_splice_sites
+    - comparison_local_and_external
+    - expression_filtering
+
+custom_data:
+  number_of_samples:
+    section_name: "Number of samples"
+    format: "tsv"
+    plot_type: "table"
+
+  number_of_introns:
+    section_name: "Number of introns"
+    format: "tsv"
+    plot_type: "table"
+
+  number_of_splice_sites:
+    section_name: "Number of splice sites"
+    format: "tsv"
+    plot_type: "table"
+
+  comparison_local_and_external:
+    section_name: "Comparison of local and external counts"
+    description: |
+      Using external counts
+
+      External counts complicate junction counting because it is unknown whether a junction is absent because it has no reads or because it was filtered out for legal or personal data-sharing reasons. As a result, after merging the local counts (counted from BAM files) with the external counts, only the junctions that are present in both remain, so the number of junctions is likely to decrease after merging.
+
+  expression_filtering:
+    section_name: "Expression filtering"
+    description: |
+      The expression filtering step removes introns that are lowly expressed. The requirements for an intron to pass this filter are:
+
+      - at least 1 sample has at least 20 counts (K) for the intron
+      - at least 5% of the samples have a total of at least 10 reads for the splice metric denominator (N) of the intron
+
+  variability_filtering:
+    section_name: "Variability filtering"
+    description: |
+      The variability filtering step removes introns that have little or no variability in the splice metric values across samples. The requirement for an intron to pass this filter is:
+
+      - at least 1 sample has a difference of at least 0.05 in the splice metric compared to the mean splice metric of the intron
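The expression and variability filters described above reduce to simple per-intron checks. A minimal Python sketch follows, using the thresholds stated in the descriptions (20, 10, 5%, 0.05). The function names are hypothetical, and computing the splice metric as K / N while skipping samples with N = 0 is an assumption made for illustration, not the pipeline's implementation.

```python
from typing import Sequence


def passes_expression_filter(
    k: Sequence[int],
    n: Sequence[int],
    min_k: int = 20,
    min_n: int = 10,
    min_fraction: float = 0.05,
) -> bool:
    """k[i] and n[i] are the intron count K and denominator N for sample i."""
    # At least one sample must reach min_k counts for the intron.
    has_expressed_sample = any(count >= min_k for count in k)
    # At least min_fraction of the samples must reach min_n in the denominator.
    covered_samples = sum(1 for total in n if total >= min_n)
    return has_expressed_sample and covered_samples >= min_fraction * len(n)


def passes_variability_filter(
    k: Sequence[int],
    n: Sequence[int],
    min_delta: float = 0.05,
) -> bool:
    # Assumption: the splice metric is K / N; samples with N == 0 are skipped.
    ratios = [ki / ni for ki, ni in zip(k, n) if ni > 0]
    if not ratios:
        return False
    mean_ratio = sum(ratios) / len(ratios)
    # At least one sample must deviate from the mean by min_delta or more.
    return any(abs(ratio - mean_ratio) >= min_delta for ratio in ratios)
```

An intron with `k = [20, 20]` and `n = [40, 40]` passes the expression filter in this sketch but fails the variability filter, because both samples have the same ratio.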

conf/modules.config

Lines changed: 123 additions & 33 deletions
@@ -299,6 +299,7 @@ process {
         ext.prefix = { meta.annotation }
         publishDir = [
             path: { "${params.outdir}/processed_results/mae/${meta.drop_group}" },
+            pattern: "MAE_results_*",
             mode: params.publish_dir_mode,
             saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
             overwrite: true
@@ -332,49 +333,139 @@ process {
     //
     // Other processes
     //
-    withName: 'MULTIQC_COUNTEXPRESSION' {
+    withName: 'DROP.*MULTIQC_GENECOUNTS' {
         publishDir = [
             path: { "${params.outdir}/reports/aberrant_expression/genecounts" },
             mode: params.publish_dir_mode,
-            saveAs: {
-                filename ->
-                switch (filename) {
-                    case 'versions.yml':
-                        null
-                        break
-                    case ~/\[TAG:.+\]_multiqc_(report\.html|plots|data)/:
-                        def tag = (filename =~ /\[TAG:(.+)\]_multiqc_(report\.html|plots|data)/)[0][1]
-                        def new_filename = filename.replaceFirst(
-                            "(?<prefix>.*)\\[TAG:${tag}\\]_(?<suffix>multiqc_(report\\.html|plots|data).*)",
-                            '${prefix}${suffix}')
-                        "${tag}/${new_filename}"
-                        break
-                    default:
-                        filename
+            saveAs: { filename ->
+                if (filename == 'versions.yml') {
+                    return null
+                }
+                else if (filename ==~ /\[TAG:genecounts_.+\]_multiqc_(report_mqc\.html|plots|data)/) {
+                    def tag = (filename =~ /\[TAG:genecounts_(.+)\]_multiqc_(report_mqc\.html|plots|data)/)[0][1]
+                    def new_filename = filename.replaceFirst(
+                        "(?<prefix>.*)\\[TAG:genecounts_${tag}\\]_(?<suffix>multiqc_(report_mqc\\.html|plots|data).*)",
+                        '${prefix}${suffix}'
+                    )
+                    return "${tag}/${new_filename}"
+                }
+                else {
+                    return filename
                 }
             }
         ]
     }
 
-    withName: 'MULTIQC_OUTRIDER' {
+    withName: 'DROP.*MULTIQC_OUTRIDER' {
         publishDir = [
             path: { "${params.outdir}/reports/aberrant_expression/outrider" },
             mode: params.publish_dir_mode,
-            saveAs: {
-                filename ->
-                switch (filename) {
-                    case 'versions.yml':
-                        null
-                        break
-                    case ~/\[TAG:.+\]_multiqc_(report\.html|plots|data)/:
-                        def tag = (filename =~ /\[TAG:(.+)\]_multiqc_(report\.html|plots|data)/)[0][1]
-                        def new_filename = filename.replaceFirst(
-                            "(?<prefix>.*)\\[TAG:${tag}\\]_(?<suffix>multiqc_(report\\.html|plots|data).*)",
-                            '${prefix}${suffix}')
-                        "${tag}/${new_filename}"
-                        break
-                    default:
-                        filename
+            saveAs: { filename ->
+                if (filename == 'versions.yml') {
+                    return null
+                }
+                else if (filename ==~ /\[TAG:outrider_.+\]_multiqc_(report_mqc\.html|plots|data)/) {
+                    def tag = (filename =~ /\[TAG:outrider_(.+)\]_multiqc_(report_mqc\.html|plots|data)/)[0][1]
+                    def new_filename = filename.replaceFirst(
+                        "(?<prefix>.*)\\[TAG:outrider_${tag}\\]_(?<suffix>multiqc_(report_mqc\\.html|plots|data).*)",
+                        '${prefix}${suffix}'
+                    )
+                    return "${tag}/${new_filename}"
+                }
+                else {
+                    return filename
+                }
+            }
+        ]
+    }
+
+    withName: 'DROP.*MULTIQC_SPLICECOUNTS' {
+        publishDir = [
+            path: { "${params.outdir}/reports/aberrant_splicing/splicecounts" },
+            mode: params.publish_dir_mode,
+            saveAs: { filename ->
+                if (filename == 'versions.yml') {
+                    return null
+                }
+                else if (filename ==~ /\[TAG:splicecounts_.+\]_multiqc_(report_mqc\.html|plots|data)/) {
+                    def tag = (filename =~ /\[TAG:splicecounts_(.+)\]_multiqc_(report_mqc\.html|plots|data)/)[0][1]
+                    def new_filename = filename.replaceFirst(
+                        "(?<prefix>.*)\\[TAG:splicecounts_${tag}\\]_(?<suffix>multiqc_(report_mqc\\.html|plots|data).*)",
+                        '${prefix}${suffix}'
+                    )
+                    return "${tag}/${new_filename}"
+                }
+                else {
+                    return filename
+                }
+            }
+        ]
+    }
+
+    withName: 'DROP.*MULTIQC_FRASER' {
+        publishDir = [
+            path: { "${params.outdir}/reports/aberrant_splicing/fraser" },
+            mode: params.publish_dir_mode,
+            saveAs: { filename ->
+                if (filename == 'versions.yml') {
+                    return null
+                }
+                else if (filename ==~ /\[TAG:fraser_.+\]_multiqc_(report_mqc\.html|plots|data)/) {
+                    def tag = (filename =~ /\[TAG:fraser_(.+)\]_multiqc_(report_mqc\.html|plots|data)/)[0][1]
+                    def new_filename = filename.replaceFirst(
+                        "(?<prefix>.*)\\[TAG:fraser_${tag}\\]_(?<suffix>multiqc_(report_mqc\\.html|plots|data).*)",
+                        '${prefix}${suffix}'
+                    )
+                    return "${tag}/${new_filename}"
+                }
+                else {
+                    return filename
+                }
+            }
+        ]
+    }
+
+    withName: 'DROP.*MULTIQC_MAE' {
+        publishDir = [
+            path: { "${params.outdir}/reports/mae/mae" },
+            mode: params.publish_dir_mode,
+            saveAs: { filename ->
+                if (filename == 'versions.yml') {
+                    return null
+                }
+                else if (filename ==~ /\[TAG:mae_.+\]_multiqc_(report_mqc\.html|plots|data)/) {
+                    def tag = (filename =~ /\[TAG:mae_(.+)\]_multiqc_(report_mqc\.html|plots|data)/)[0][1]
+                    def new_filename = filename.replaceFirst(
+                        "(?<prefix>.*)\\[TAG:mae_${tag}\\]_(?<suffix>multiqc_(report_mqc\\.html|plots|data).*)",
+                        '${prefix}${suffix}'
+                    )
+                    return "${tag}/${new_filename}"
+                }
+                else {
+                    return filename
+                }
+            }
+        ]
+    }
+
+    withName: 'DROP.*MULTIQC_MAEQC' {
+        publishDir = [
+            path: { "${params.outdir}/reports/mae/maeqc" },
+            mode: params.publish_dir_mode,
+            saveAs: { filename ->
+                if (filename == 'versions.yml') {
+                    return null
+                }
+                else if (filename ==~ /\[TAG:maeqc_.+\]_multiqc_(report_mqc\.html|plots|data)/) {
+                    def tag = (filename =~ /\[TAG:maeqc_(.+)\]_multiqc_(report_mqc\.html|plots|data)/)[0][1]
+                    def new_filename = filename.replaceFirst(
+                        "(?<prefix>.*)\\[TAG:maeqc_${tag}\\]_(?<suffix>multiqc_(report_mqc\\.html|plots|data).*)",
+                        '${prefix}${suffix}'
+                    )
+                    return "${tag}/${new_filename}"
+                }
+                else {
+                    return filename
                 }
             }
         ]
@@ -389,5 +480,4 @@ process {
         ]
     }
 
-
 }
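
The new `saveAs` closures in conf/modules.config all follow one pattern: skip `versions.yml`, move outputs tagged `[TAG:<module>_<tag>]` into a `<tag>/` subdirectory with the marker stripped, and leave every other file name unchanged. A minimal Python sketch of that mapping follows, for readers who want to check which path a given file ends up at. The function name `publish_name` and the tag value `v29` are hypothetical, and the sketch mirrors the closure's branches rather than reproducing the Groovy code.

```python
import re
from typing import Optional


def publish_name(filename: str, module: str) -> Optional[str]:
    """Return the published path for a MultiQC output, or None to skip it.

    `module` is the lowercase prefix used inside the tag, e.g. "genecounts".
    This helper is hypothetical; it only mirrors the closure's three branches.
    """
    if filename == "versions.yml":
        return None  # versions.yml is never published

    # Full match, like Groovy's ==~ operator in the config.
    tagged = re.fullmatch(
        r"\[TAG:" + re.escape(module) + r"_(?P<tag>.+)\]_"
        r"(?P<rest>multiqc_(?:report_mqc\.html|plots|data))",
        filename,
    )
    if tagged is None:
        return filename  # untagged files keep their name

    # Drop the [TAG:...]_ marker and publish under a per-tag subdirectory.
    return f"{tagged['tag']}/{tagged['rest']}"
```

Under these assumptions, `publish_name("[TAG:fraser_v29]_multiqc_report_mqc.html", "fraser")` gives `v29/multiqc_report_mqc.html`. A file tagged for a different module, such as `[TAG:maeqc_v29]_multiqc_plots` checked against `"mae"`, is returned unchanged because the module prefix must be followed directly by an underscore.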
