Use --bga when creating bigWigs #470

Open: wants to merge 7 commits into base: dev

1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -13,6 +13,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [[PR #444](https://github.com/nf-core/chipseq/pull/444)] - Add empty map to ch_gff so that when provided by the user `GFFREAD` works.
- [[#451](https://github.com/nf-core/chipseq/issues/451)] - Pass `map.single_read` to `SUBREAD_FEATURECOUNTS` as to correctly set parameter `-p`.
- [[PR #462](https://github.com/nf-core/chipseq/pull/462)] - Updated pipeline template to [nf-core/tools 3.2.1](https://github.com/nf-core/tools/releases/tag/3.2.1)
+ - [[#468](https://github.com/nf-core/chipseq/issues/468)] - Changed bigWig generation to use `-bga` option instead of `-bg` in `bedtools genomecov` for lower background levels and better IGV visualization. Users can revert to previous behavior using configuration. See [documentation](https://nf-co.re/chipseq/dev/docs/output/#normalised-bigwig-files) for details.

### Parameters

3 changes: 2 additions & 1 deletion conf/base.config
@@ -10,7 +10,8 @@

process {

- // TODO nf-core: Check the defaults for all processes
+ // Default resource requirements for all processes
+ // These are optimized for ChIP-seq workflows and scale with task.attempt for retry logic
cpus = { 1 * task.attempt }
memory = { 6.GB * task.attempt }
time = { 4.h * task.attempt }
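
For reference, the `* task.attempt` closures in this hunk multiply each base value by the attempt number. A minimal shell sketch of that arithmetic, assuming three attempts (a hypothetical count; the retry limit is not part of this hunk):

```shell
set -eu

# Base values copied from the hunk above: 1 cpu, 6 GB, 4 h.
# The three attempts are an assumption made for illustration only.
TABLE=$(for attempt in 1 2 3; do
  echo "attempt=$attempt cpus=$((1 * attempt)) memory=$((6 * attempt))GB time=$((4 * attempt))h"
done)

echo "$TABLE"
```

Each retry therefore requests a whole multiple of the first attempt's resources rather than a fixed increment.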
11 changes: 11 additions & 0 deletions docs/output.md
@@ -147,6 +147,17 @@ The [Preseq](http://smithlabresearch.org/software/preseq/) package is aimed at p

The [bigWig](https://genome.ucsc.edu/goldenpath/help/bigWig.html) format is in an indexed binary format useful for displaying dense, continuous data in Genome Browsers such as the [UCSC](https://genome.ucsc.edu/cgi-bin/hgTracks) and [IGV](http://software.broadinstitute.org/software/igv/). This mitigates the need to load the much larger BAM files for data visualisation purposes which will be slower and result in memory issues. The coverage values represented in the bigWig file can also be normalised in order to be able to compare the coverage across multiple samples - this is not possible with BAM files. The bigWig format is also supported by various bioinformatics software for downstream processing such as meta-profile plotting.

+ > [!IMPORTANT]
+ > As of v2.2.0dev, the pipeline uses the `-bga` option in `bedtools genomecov` instead of `-bg`. This includes zero-coverage bins in the output, resulting in lower background levels and better visualization in IGV. Users who prefer the previous behavior can override this by adding `-bg` to the `ext.args` parameter in their configuration:
+ >
+ > ```groovy
+ > process {
+ > withName: '.*:BAM_BEDGRAPH_BIGWIG_BEDTOOLS_UCSC:BEDTOOLS_GENOMECOV' {
+ > ext.args = '-bg'
+ > }
+ > }
+ > ```
+
### ChIP-seq QC metrics

<details markdown="1">
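
As an illustration of the note added in this hunk: per the bedtools documentation, `-bga` reports the same bedGraph intervals as `-bg` plus the regions with zero coverage. The sketch below uses an invented four-interval bedGraph (not pipeline output) and an `awk` filter to show which rows the two modes differ on.

```shell
set -eu

# Hypothetical -bga-style bedGraph: chrom, start, end, depth.
# All coordinates and depths are invented for illustration.
bga=$(mktemp)
printf 'chr1\t0\t100\t0\nchr1\t100\t150\t2.5\nchr1\t150\t400\t0\nchr1\t400\t450\t1\n' > "$bga"

# Dropping the zero-depth rows leaves the covered intervals only,
# which is the set of rows a -bg style report would contain.
BG_LIKE=$(awk -F'\t' '$4 != 0' "$bga")

printf '%s\n' "$BG_LIKE"
rm -f "$bga"
```

Only the two covered intervals survive the filter; the zero-depth rows are the part that `-bga` adds.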
5 changes: 4 additions & 1 deletion modules/local/bedtools_genomecov.nf
@@ -26,10 +26,13 @@ process BEDTOOLS_GENOMECOV {
SCALE_FACTOR=\$(grep '[0-9] mapped (' $flagstat | awk '{print 1000000/\$1}')
echo \$SCALE_FACTOR > ${prefix}.scale_factor.txt

+ # Use -bga instead of -bg to include zero-coverage bins in output
+ # This results in lower background levels and better visualization in IGV
+ # Users can override this by specifying -bg in ext.args if needed
bedtools \\
genomecov \\
-ibam $bam \\
- -bg \\
+ -bga \\
-scale \$SCALE_FACTOR \\
$pe \\
$args \\
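
The scale-factor line at the top of this hunk can be tried on its own. The sketch below assumes a made-up `samtools flagstat` excerpt (the read counts are invented) and runs the same `grep`/`awk` pair without the Nextflow escaping.

```shell
set -eu

# Hypothetical flagstat excerpt; the counts are invented for illustration.
flagstat=$(mktemp)
cat > "$flagstat" <<'EOF'
2100000 + 0 in total (QC-passed reads + QC-failed reads)
2000000 + 0 mapped (95.24% : N/A)
1900000 + 0 properly paired (90.48% : N/A)
EOF

# Keep the "mapped (" line and divide one million by its first field,
# i.e. the QC-passed mapped read count.
SCALE_FACTOR=$(grep '[0-9] mapped (' "$flagstat" | awk '{print 1000000/$1}')

echo "$SCALE_FACTOR"
rm -f "$flagstat"
```

With 2,000,000 mapped reads this prints `0.5`, so every depth value passed through `-scale` is halved, giving coverage per million mapped reads.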
49 changes: 24 additions & 25 deletions nextflow.config
@@ -278,38 +278,37 @@ manifest {
name = 'nf-core/chipseq'
author = """Espinosa-Carrasco J, Patel H, Wang C, Ewels P""" // The author field is deprecated from Nextflow version 24.10.0, use contributors instead
contributors = [
- // TODO nf-core: Update the field with the details of the contributors to your pipeline. New with Nextflow version 24.10.0
[
- name: 'Espinosa-Carrasco J',
- affiliation: '',
- email: '',
- github: '',
- contribution: [], // List of contribution types ('author', 'maintainer' or 'contributor')
- orcid: ''
+ name: 'Jose Espinosa-Carrasco',
+ affiliation: 'The Comparative Bioinformatics Group at The Centre for Genomic Regulation, Spain',
+ email: '[email protected]',
+ github: 'JoseEspinosa',
+ contribution: ['author', 'maintainer'],
+ orcid: '0000-0002-1541-042X'
],
[
- name: ' Patel H',
- affiliation: '',
- email: '',
- github: '',
- contribution: [], // List of contribution types ('author', 'maintainer' or 'contributor')
- orcid: ''
+ name: 'Harshil Patel',
+ affiliation: 'Seqera Labs, Spain',
+ email: '[email protected]',
+ github: 'drpatelh',
+ contribution: ['author', 'maintainer'],
+ orcid: '0000-0003-2707-7940'
],
[
- name: ' Wang C',
- affiliation: '',
- email: '',
- github: '',
- contribution: [], // List of contribution types ('author', 'maintainer' or 'contributor')
- orcid: ''
+ name: 'Chuan Wang',
+ affiliation: 'National Genomics Infrastructure at SciLifeLab, Sweden',
+ email: '[email protected]',
+ github: 'chuan-wang',
+ contribution: ['author'],
+ orcid: '0000-0003-1113-5297'
],
[
- name: ' Ewels P',
- affiliation: '',
- email: '',
- github: '',
- contribution: [], // List of contribution types ('author', 'maintainer' or 'contributor')
- orcid: ''
+ name: 'Phil Ewels',
+ affiliation: 'National Genomics Infrastructure at SciLifeLab, Sweden',
+ email: '[email protected]',
+ github: 'ewels',
+ contribution: ['author'],
+ orcid: '0000-0003-4101-2502'
]
homePage = 'https://github.com/nf-core/chipseq'
3 changes: 0 additions & 3 deletions workflows/chipseq.nf
@@ -112,9 +112,6 @@ workflow CHIPSEQ {
params.seq_center
)
ch_versions = ch_versions.mix(INPUT_CHECK.out.versions)
- // TODO: OPTIONAL, you can use nf-validation plugin to create an input channel from the samplesheet with Channel.fromSamplesheet("input")
- // See the documentation https://nextflow-io.github.io/nf-validation/samplesheets/fromSamplesheet/
- // ! There is currently no tooling to help you write a sample sheet schema

//
// SUBWORKFLOW: Read QC and trim adapters