feat: samclip module #8999
base: master
a8c551d to 5b0fa90
Does this tool only work on SAM files and only output SAM files?
If so, I think the data should be piped in and out of samtools inside the module.
Potentially also add samtools fixmate to fix paired-end reads.
…Seqera container, update tests
Other things to consider:
- I think it would be good to add args to the samtools commands. That would allow, for example, sorting for downstream tools that require name-sorted or template-coordinate-sorted input, or filtering in the first samtools view.
- Do we want to support writing CRAM files?
- Do we want to support index creation inside the final samtools sort?
samtools view -h --output-fmt sam ${bam} | \\
samclip ${args} --ref ${ref_filename} | \\
samtools sort -n -O BAM -T /tmp | \\
Does this step need to output BAM, or would uncompressed SAM be faster?
type: map
description: |
  Groovy Map containing sample information
  e.g. `[ id:'sample1', single_end:false ]`
I know it is in the template, but we don't need to mention single_end here.
- e.g. `[ id:'sample1', single_end:false ]`
+ e.g. `[ id:'sample1' ]`
tag "modules"
tag "modules_nfcore"
tag "samclip"
tag "samtools/view"
- tag "samtools/view"
tag "samtools/view"

test("test-data - NA12878.chr22.bam") {
    config "./nextflow.config"
- config "./nextflow.config"
name "Test Process SAMCLIP"
script "../main.nf"
process "SAMCLIP"
config "./nextflow.config"
- config "./nextflow.config"
This can be deleted now that you are not using samtools_view.
def ref_filename = reference.getName().replaceAll(/\.gz$/, "")
"""
# decompress reference if gzipped
${is_compressed ? "gzip -c -d ${reference} > ${ref_filename}" : ""}
Can we remove the decompressed reference at the end, if one was created, please?
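One way to do this, sketched in plain Bash outside of Nextflow: decompress only when the name ends in `.gz`, and delete the decompressed copy only when one was actually made. The function names `prepare_reference` and `cleanup_reference` are hypothetical and not part of the module, and the FASTA content is a stand-in. Inside a Nextflow script block the shell variables would need escaping (`\$`), or the same conditional could be kept on the Groovy side as the module does now.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of the module): decompress the reference only
# if it is gzipped, and print the name of the file samclip should read.
prepare_reference() {
    local reference="$1"
    local ref_filename="${reference%.gz}"   # strip a trailing .gz, if any
    if [[ "$reference" == *.gz ]]; then
        gzip -c -d "$reference" > "$ref_filename"
    fi
    printf '%s\n' "$ref_filename"
}

# Hypothetical helper: delete the decompressed copy, but only when one was made.
cleanup_reference() {
    local reference="$1"
    local ref_filename="$2"
    if [[ "$reference" != "$ref_filename" ]]; then
        rm -f "$ref_filename"
    fi
}

# Demo with a stand-in FASTA in a scratch directory.
workdir="$(mktemp -d)"
printf '>chr1\nACGT\n' | gzip -c > "$workdir/genome.fa.gz"

ref="$(prepare_reference "$workdir/genome.fa.gz")"
# ... the samtools | samclip --ref "$ref" | samtools pipeline would run here ...
cleanup_reference "$workdir/genome.fa.gz" "$ref"
```

Comparing the input name with the derived name means an uncompressed reference staged by Nextflow is never deleted by the cleanup step.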
def prefix = task.ext.prefix ?: "${meta.id}.samclip"
def is_compressed = reference.getName().endsWith(".gz")
def ref_filename = reference.getName().replaceAll(/\.gz$/, "")
"""
https://nf-co.re/docs/guidelines/components/modules#command-file-output-naming
- """
+ if ("$bam" == "${prefix}.bam") error "Input and output names are the same, set prefix in module configuration to disambiguate!"
+ """
script:
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}.samclip"
I think most modules use an underscore to separate the extra part of the prefix:
- def prefix = task.ext.prefix ?: "${meta.id}.samclip"
+ def prefix = task.ext.prefix ?: "${meta.id}_samclip"
"""

stub:
def prefix = task.ext.prefix ?: "${meta.id}.samclip"
I think most modules use an underscore to separate the extra part of the prefix:
- def prefix = task.ext.prefix ?: "${meta.id}.samclip"
+ def prefix = task.ext.prefix ?: "${meta.id}_samclip"
This PR adds a new module, samclip, and corresponding tests. samclip is used to filter SAM files for soft and hard clipped alignments. Test data is derived from nf-core test-datasets.
PR checklist
Closes #8342
versions.yml
file.label
nf-core modules test <MODULE> --profile docker
nf-core modules test <MODULE> --profile singularity
nf-core modules test <MODULE> --profile conda
nf-core subworkflows test <SUBWORKFLOW> --profile docker
nf-core subworkflows test <SUBWORKFLOW> --profile singularity
nf-core subworkflows test <SUBWORKFLOW> --profile conda