Conversation

@emmcauley emmcauley commented Sep 4, 2025

This PR adds a new module, samclip, and corresponding tests. samclip is used to filter SAM files for soft and hard clipped alignments. Test data is derived from nf-core test-datasets.

PR checklist

Closes #8342

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the module conventions in the contribution docs?
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfill software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

@emmcauley emmcauley force-pushed the samclip branch 3 times, most recently from a8c551d to 5b0fa90 September 4, 2025 18:13
@emmcauley emmcauley marked this pull request as ready for review September 4, 2025 18:20
@emmcauley emmcauley added the new module Adding a new module label Sep 4, 2025
Contributor

@SPPearce SPPearce left a comment

This tool only works on SAM files and only outputs SAM files?
I think this should be piped in and out of samtools inside the module.
Potentially also samtools fixmate to fix paired-end reads.

@emmcauley emmcauley requested a review from SPPearce September 14, 2025 19:03

@SPPearce SPPearce left a comment

Other things to consider:

  • I think it would be good to add args to the samtools commands, which would allow for e.g. sorting ready for tools that require name-sorted or template-coordinate-sorted input, or filtering via that first samtools view.
  • Do we want to support making CRAM files?
  • Do we want to support index creation inside the final samtools sort?
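
To make the first bullet concrete, here is a rough shell sketch of how separate argument strings could be threaded through the pipe. It assumes the common nf-core habit of numbering extra argument strings (`task.ext.args2`, `task.ext.args3`); which number maps to which command is an assumption here, not something settled in this PR, and `build_samclip_cmd` is a made-up helper that only prints the command line rather than running it.

```shell
# Sketch only: prints the pipeline instead of running it, so it can be
# inspected without samtools or samclip installed.
# args, args2 and args3 stand in for task.ext.args, task.ext.args2 and
# task.ext.args3 (assumed mapping: samclip, samtools view, samtools sort).
build_samclip_cmd() {
    bam="$1"
    ref="$2"
    prefix="$3"
    args="${4:-}"    # extra options for samclip
    args2="${5:-}"   # extra options for the first samtools view
    args3="${6:-}"   # extra options for the final samtools sort
    printf 'samtools view -h %s %s | samclip %s --ref %s | samtools sort %s -o %s.bam -\n' \
        "$args2" "$bam" "$args" "$ref" "$args3" "$prefix"
}

# Example: drop unmapped reads up front and name-sort the result.
build_samclip_cmd sample.bam ref.fa sample_samclip '--max 10' '-F 4' '-n'
```

With something along these lines, a pipeline could request name-sorted output or pre-filter reads purely through configuration, without editing the module.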


samtools view -h --output-fmt sam ${bam} | \\
samclip ${args} --ref ${ref_filename} | \\
samtools sort -n -O BAM -T /tmp | \\

Do you need to make this output a bam, or would an uncompressed sam be faster?

type: map
description: |
Groovy Map containing sample information
e.g. `[ id:'sample1', single_end:false ]`

I know it is in the template, but we don't need to mention single_end here.

Suggested change
- e.g. `[ id:'sample1', single_end:false ]`
+ e.g. `[ id:'sample1' ]`

tag "modules"
tag "modules_nfcore"
tag "samclip"
tag "samtools/view"

Suggested change
- tag "samtools/view"

tag "samtools/view"

test("test-data - NA12878.chr22.bam") {
config "./nextflow.config"

Suggested change
- config "./nextflow.config"

name "Test Process SAMCLIP"
script "../main.nf"
process "SAMCLIP"
config "./nextflow.config"

Suggested change
- config "./nextflow.config"


This can be deleted now that you are not using samtools_view.

def ref_filename = reference.getName().replaceAll(/\.gz$/, "")
"""
# decompress reference if gzipped
${is_compressed ? "gzip -c -d ${reference} > ${ref_filename}" : ""}

Can we remove the decompressed reference at the end, if it was created, please?
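
One way this cleanup could look, as a plain shell sketch rather than the module's actual script: `with_decompressed_ref` is a hypothetical helper that decompresses a gzipped reference, runs a command with the reference path appended as its last argument, and then deletes only the copy it created.

```shell
# Sketch only: names are illustrative, not taken from the module.
# Runs a command with the reference path appended as its last argument,
# decompressing a .gz reference first and deleting that copy afterwards.
with_decompressed_ref() {
    reference="$1"
    shift
    ref_filename="${reference%.gz}"   # equals $reference when there is no .gz suffix
    made_copy=false
    if [ "$ref_filename" != "$reference" ]; then
        gzip -c -d "$reference" > "$ref_filename"
        made_copy=true
    fi

    # Run the wrapped command and remember its exit status.
    status=0
    "$@" "$ref_filename" || status=$?

    # Clean up only what this function created.
    if [ "$made_copy" = true ]; then
        rm -f "$ref_filename"
    fi
    return "$status"
}
```

Inside a Nextflow script block the same idea reduces to a conditional `rm` as the last step, with shell `$` variables escaped so Nextflow does not try to interpolate them.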

def prefix = task.ext.prefix ?: "${meta.id}.samclip"
def is_compressed = reference.getName().endsWith(".gz")
def ref_filename = reference.getName().replaceAll(/\.gz$/, "")
"""

https://nf-co.re/docs/guidelines/components/modules#command-file-output-naming

Suggested change
- """
+ if ("$bam" == "${prefix}.bam") error "Input and output names are the same, set prefix in module configuration to disambiguate!"
+ """

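For illustration, the same guard expressed in plain shell. This is only a sketch of the idea: the suggestion above does the check in Groovy before the script is launched, which fails earlier, and `check_output_name` is a hypothetical helper name.

```shell
# Sketch only: refuse to run when the output file would overwrite the input.
# check_output_name is an illustrative name, not part of the module.
check_output_name() {
    input="$1"
    prefix="$2"
    if [ "$(basename "$input")" = "${prefix}.bam" ]; then
        echo "Input and output names are the same, set prefix in module configuration to disambiguate!" >&2
        return 1
    fi
}
```

With the default prefix the check passes because the suffix keeps the names apart; it only fails when a user overrides the prefix to match the input file name.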

script:
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}.samclip"

I think underscores are how most modules set the extra part of the prefix:

Suggested change
- def prefix = task.ext.prefix ?: "${meta.id}.samclip"
+ def prefix = task.ext.prefix ?: "${meta.id}_samclip"

"""

stub:
def prefix = task.ext.prefix ?: "${meta.id}.samclip"

I think underscores are how most modules set the extra part of the prefix:

Suggested change
- def prefix = task.ext.prefix ?: "${meta.id}.samclip"
+ def prefix = task.ext.prefix ?: "${meta.id}_samclip"

Development

Successfully merging this pull request may close these issues.

new module: samclip

2 participants