nf-core
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 1 addition & 0 deletions
diff --git a/CITATIONS.md b/CITATIONS.md
Lines changed: 2 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 3 additions & 2 deletions
diff --git a/conf/modules.config b/conf/modules.config
Lines changed: 4 additions & 0 deletions
diff --git a/docs/output.md b/docs/output.md
Lines changed: 13 additions & 0 deletions
diff --git a/docs/usage.md b/docs/usage.md
Lines changed: 6 additions & 0 deletions
diff --git a/modules.json b/modules.json
Lines changed: 5 additions & 0 deletions
diff --git a/modules/nf-core/seqtk/sample/environment.yml b/modules/nf-core/seqtk/sample/environment.yml
Lines changed: 5 additions & 0 deletions
diff --git a/modules/nf-core/seqtk/sample/main.nf b/modules/nf-core/seqtk/sample/main.nf
Lines changed: 58 additions & 0 deletions
diff --git a/modules/nf-core/seqtk/sample/meta.yml b/modules/nf-core/seqtk/sample/meta.yml
Lines changed: 52 additions & 0 deletions
@@ -12,6 +12,7 @@ Initial release of nf-core/seqinspector, created with the [nf-core](https://nf-c
 - [#20](https://github.com/nf-core/seqinspector/pull/20) Use tags to generate group reports
 - [#13](https://github.com/nf-core/seqinspector/pull/13) Generate reports per run, per project and per lane.
 - [#49](https://github.com/nf-core/seqinspector/pull/49) Merge with template 3.0.2.
+- [#50](https://github.com/nf-core/seqinspector/pull/50) Add an optional subsampling step.
 - [#51](https://github.com/nf-core/seqinspector/pull/51) Add nf-test to CI.
 - [#63](https://github.com/nf-core/seqinspector/pull/63) Contribution guidelines added about displaying results for new tools
 
 
@@ -18,6 +18,8 @@
 
 > Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
 
+- [Seqtk](https://github.com/lh3/seqtk)
+
 ## Software packaging/containerisation tools
 
 - [Anaconda](https://anaconda.com)
 
@@ -31,8 +31,9 @@
      workflows use the "tube map" design for that. See https://nf-co.re/docs/contributing/design_guidelines#examples for examples.   -->
 <!-- TODO nf-core: Fill in short bullet-pointed list of the default steps in the pipeline -->
 
-1. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))
-2. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))
+1. Subsample reads ([`Seqtk`](https://github.com/lh3/seqtk))
+2. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))
+3. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))
 
 ## Usage
 
 
@@ -18,6 +18,10 @@ process {
         saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
     ]
 
+    withName: SEQTK_SAMPLE {
+        ext.args = '-s100'
+    }
+
     withName: FASTQC {
         ext.args = '--quiet'
     }
 
@@ -10,10 +10,23 @@ The directories listed below will be created in the results directory after the
 
 The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps:
 
+- [Seqtk](#seqtk) - Subsample a specific number of reads per sample
 - [FastQC](#fastqc) - Raw read QC
 - [MultiQC](#multiqc) - Aggregate report describing results and QC from the whole pipeline
 - [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution
 
+### Seqtk
+
+<details markdown="1">
+<summary>Output files</summary>
+
+- `seqtk/`
+  - `*_fastq`: FastQ file subsampled to the number of reads given by `sample_size`.
+
+</details>
+
+[Seqtk](https://github.com/lh3/seqtk) is used to randomly subsample the reads of each sample down to a fixed number.
+
 ### FastQC
 
 <details markdown="1">
 
@@ -93,6 +93,12 @@ genome: 'GRCh37'
 
 You can also generate such `YAML`/`JSON` files via [nf-core/launch](https://nf-co.re/launch).
 
+Optionally, use the `sample_size` parameter to randomly subsample a fixed number of reads for analysis. Note that the value is an absolute number of reads, not a fraction.
+
+```bash
+nextflow run nf-core/seqinspector --input ./samplesheet.csv --outdir ./results --sample_size 1000000 -profile docker
+```
+
 ### Updating the pipeline
 
 When you run the above command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use the cached version if available - even if the pipeline has been updated since. To make sure that you're running the latest version of the pipeline, make sure that you regularly update the cached version of the pipeline:
 
@@ -14,6 +14,11 @@
                         "branch": "master",
                         "git_sha": "cf17ca47590cc578dfb47db1c2a44ef86f89976d",
                         "installed_by": ["modules"]
+                    },
+                    "seqtk/sample": {
+                        "branch": "master",
+                        "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
+                        "installed_by": ["modules"]
                     }
                 }
             },
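
The hunks above add a `sample_size` option (an absolute number of reads) and pin the Seqtk seed with `-s100` in `conf/modules.config`. A rough Python sketch of why a fixed seed matters for paired-end data is given here. It is an illustration only, not Seqtk's actual implementation: the helper names (`read_fastq`, `reservoir_sample`, `make_lines`) and the toy reads are hypothetical, and the sampling uses the textbook reservoir method (Algorithm R).

```python
import random
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str, str, str]  # header, sequence, separator, qualities


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Group consecutive lines into 4-line FASTQ records."""
    stripped = (line.rstrip("\n") for line in lines)
    # Zipping one iterator with itself four times yields 4-line chunks.
    return zip(stripped, stripped, stripped, stripped)


def reservoir_sample(records: Iterable[Record], k: int, seed: int) -> List[Record]:
    """Keep k records chosen uniformly at random (Algorithm R)."""
    rng = random.Random(seed)
    reservoir: List[Record] = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            reservoir.append(record)
        else:
            # Replace a slot with probability k / (i + 1).
            j = rng.randint(0, i)  # both bounds inclusive
            if j < k:
                reservoir[j] = record
    return reservoir


def make_lines(mate: int, n_reads: int) -> List[str]:
    """Build toy FASTQ lines for one mate file (hypothetical data)."""
    lines: List[str] = []
    for i in range(n_reads):
        lines += [f"@read{i}/{mate}", "ACGT", "+", "IIII"]
    return lines


# Sample both mate files with the same seed, as a fixed -s value would.
sub1 = reservoir_sample(read_fastq(make_lines(1, 10)), k=3, seed=100)
sub2 = reservoir_sample(read_fastq(make_lines(2, 10)), k=3, seed=100)

# Strip the "/1" or "/2" suffix and compare which reads were kept.
names1 = [rec[0].rsplit("/", 1)[0] for rec in sub1]
names2 = [rec[0].rsplit("/", 1)[0] for rec in sub2]
print(names1 == names2)  # prints True
```

Because both calls draw the same random sequence, the same read positions are kept in each mate file and the pairs stay in sync. In this sketch, asking for more reads than the input contains simply returns every read.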