Workflow: Nanopore assembly workflow
**Workflow for sequencing with ONT Nanopore data, from basecalled reads to (meta)assembly and binning**

- Workflow Nanopore Quality
- Kraken2 taxonomic classification of FASTQ reads
- Flye (de novo assembly)
- Medaka (assembly polishing)
- metaQUAST (assembly quality reports)

**When Illumina reads are provided:**

- Workflow Illumina Quality: https://workflowhub.eu/workflows/336?version=1
- Assembly polishing with Pilon
- Workflow binning: https://workflowhub.eu/workflows/64?version=11
  - MetaBAT2
  - CheckM
  - BUSCO
  - GTDB-Tk

**All tool CWL files and other workflows can be found here:**

Tools: https://git.wur.nl/unlock/cwl/-/tree/master/cwl<br>
Workflows: https://git.wur.nl/unlock/cwl/-/tree/master/cwl/workflows

The dependencies are accessible from https://unlock-icat.irods.surfsara.nl (anonymous,anonymous) and/or through the conda / pip environments shown in https://git.wur.nl/unlock/docker/-/blob/master/kubernetes/scripts/setup.sh
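The conditional structure in the description above can be sketched in Python. This is only an illustration of the description, not the workflow's actual CWL logic: the function name `planned_stages` and the stage labels are hypothetical, and the rule that binning runs only when Illumina reads are present and the `binning` input is true is an assumption inferred from how the list is grouped. Stages are listed in the order of the description, which is not necessarily the execution order.

```python
from typing import List, Optional


def planned_stages(
    illumina_forward_reads: Optional[List[str]] = None,
    illumina_reverse_reads: Optional[List[str]] = None,
    binning: bool = False,
) -> List[str]:
    """Return the stages expected to run, following the description above.

    Hypothetical helper: the real workflow decides this inside its CWL steps.
    """
    # Stages listed unconditionally in the description.
    stages = [
        "Workflow Nanopore Quality",
        "Kraken2",
        "Flye",
        "Medaka",
        "metaQUAST",
    ]
    # Assumption: both read directions are needed for the Illumina branch.
    has_illumina = bool(illumina_forward_reads) and bool(illumina_reverse_reads)
    if has_illumina:
        stages += ["Workflow Illumina Quality", "Pilon"]
        # Assumption: binning is listed under the Illumina branch,
        # so it is only planned when Illumina reads are present.
        if binning:
            stages.append("Workflow binning")
    return stages


nanopore_only = planned_stages()
hybrid = planned_stages(["fwd.fastq.gz"], ["rev.fastq.gz"], binning=True)
```

With no Illumina reads the sketch returns only the five Nanopore stages; with both read files and `binning=True` it appends the Illumina quality, Pilon and binning stages.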
Inputs
ID | Type | Title | Doc |
---|---|---|---|
memory | Integer (Optional) | Maximum memory in MB | Maximum memory usage in megabytes |
binning | Boolean (Optional) | Run binning workflow | Run with contig binning workflow |
threads | Integer (Optional) | Number of threads | Number of threads to use for computational processes |
identifier | String | Identifier used | Identifier for this dataset used in this workflow |
metagenome | Boolean (Optional) | When working with metagenomes | Metagenome option for the Flye assembly |
deduplicate | Boolean (Optional) | Deduplicate reads | Remove exact duplicate reads (Illumina) with fastp |
destination | String (Optional) | Output destination | Optional output destination used for cwl-prov reporting. |
pilon_fixlist | String | Pilon fix list | A comma-separated list of categories of issues to try to fix |
basecall_model | String | Basecalling model | Basecalling model used with Guppy |
kraken_database | String | Kraken2 database | Absolute path to the Kraken2 database location |
filter_references | String[] | Contamination reference file(s) | Reference FASTA file(s) for contamination filtering |
nanopore_fastq_files | String[] (Optional) | Nanopore reads | List of file paths with Nanopore raw reads in FASTQ format |
nanopore_fastq_reads | File[] (Optional) | Nanopore FASTQ reads | File(s) of FASTQ reads in gzip format |
illumina_forward_reads | String[] (Optional) | Illumina forward reads | Illumina sequenced forward read file |
illumina_reverse_reads | String[] (Optional) | Illumina reverse reads | Illumina sequenced reverse read file |
use_reference_mapped_reads | Boolean | Use mapped reads | Continue with reads mapped to the given reference |
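As a rough illustration of how these inputs fit together, the sketch below builds an input object in Python and serializes it as JSON, which CWL runners such as cwltool accept as a job file alongside YAML. The helper `build_job` is hypothetical, every value (identifier, paths, model name, fix list) is a placeholder assumption, and treating the inputs without "(Optional)" as required is read from the table rather than from the CWL source.

```python
import json
from typing import Any, Dict

# Inputs listed in the table without "(Optional)".
REQUIRED_INPUTS = (
    "identifier",
    "pilon_fixlist",
    "basecall_model",
    "kraken_database",
    "filter_references",
    "use_reference_mapped_reads",
)


def build_job(**inputs: Any) -> Dict[str, Any]:
    """Hypothetical helper: check required inputs and return the job object."""
    missing = [name for name in REQUIRED_INPUTS if name not in inputs]
    if missing:
        raise ValueError("missing required inputs: " + ", ".join(missing))
    return dict(inputs)


# All values below are placeholders, not defaults of the workflow.
job = build_job(
    identifier="sample01",
    pilon_fixlist="all",
    basecall_model="r941_min_high_g360",
    kraken_database="/path/to/kraken2_db",
    filter_references=["/path/to/contamination.fasta"],
    use_reference_mapped_reads=False,
    nanopore_fastq_files=["/path/to/reads.fastq.gz"],
    threads=8,
    memory=16000,  # megabytes, per the table
)

job_json = json.dumps(job, indent=2, sort_keys=True)
```

Writing `job_json` to a file gives an input object that can be passed to a CWL runner together with the workflow file; optional inputs that are left out simply do not appear in the object.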
Steps
ID | Runs | Label | Doc |
---|---|---|---|
flye | ../flye/flye.cwl (CommandLineTool) | De novo assembler for single molecule sequencing reads, with a focus on Oxford Nanopore Technologies reads | Flye v2.9 assembler with a focus on reads from Oxford Nanopore Technologies. |
medaka | ../medaka/medaka_py.cwl (CommandLineTool) | Polishing of assembly created from ONT nanopore long reads | Uses Medaka to polish an assembly constructed from ONT nanopore reads that have been basecalled by Guppy. |
kraken2_krona | ../krona/krona.cwl (CommandLineTool) | Krona | Visualization of Kraken2 report results. `ktImportText -o $1 $2` |
workflow_pilon | workflow_pilon_mapping.cwl (Workflow) | Metagenomics workflow | Workflow pilon assembly polishing. Steps:<br>- BBmap (read mapping to assembly)<br>- Pilon |
illumina_kraken2 | ../kraken2/kraken2.cwl (CommandLineTool) | Kraken2 metagenomics read classification | Kraken2 metagenomics read classification. |
kraken2_compress | ../bash/pigz.cwl (CommandLineTool) | compress a file multithreaded with pigz | |
metaquast_medaka | ../metaquast/metaquast.cwl (CommandLineTool) | metaQUAST: Quality Assessment Tool for Metagenome Assemblies | Runs the Quality Assessment Tool for Metagenome Assemblies application |
nanopore_kraken2 | ../kraken2/kraken2.cwl (CommandLineTool) | Kraken2 metagenomics read classification | Kraken2 metagenomics read classification. |
workflow_binning | workflow_metagenomics_binning.cwl (Workflow) | Metagenomic Binning from Assembly | Workflow for Metagenomics from raw reads to annotated bins.<br>Summary<br>- MetaBAT2 (binning)<br>- CheckM (bin completeness and contamination)<br>- GTDB-Tk (bin taxonomic classification)<br>- BUSCO (bin completeness) |
flye_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
pilon_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
medaka_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
binning_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
kraken2_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
assembly_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
metaquast_nanopore_pilon | ../metaquast/metaquast.cwl (CommandLineTool) | metaQUAST: Quality Assessment Tool for Metagenome Assemblies | Runs the Quality Assessment Tool for Metagenome Assemblies application |
workflow_quality_illumina | workflow_illumina_quality.cwl (Workflow) | Illumina read quality control, trimming and contamination filter. | **Workflow for Illumina paired read quality control, trimming and filtering.**<br />Multiple paired datasets will be merged into a single paired dataset.<br />Summary:<br />- FastQC on raw data files<br />- fastp for read quality trimming<br />- BBduk for phiX and (optional) rRNA filtering<br />- Kraken2 for taxonomic classification of reads (optional)<br />- BBmap for (contamination) filtering using given references (optional)<br />- FastQC on filtered (merged) data |
workflow_quality_nanopore | workflow_nanopore_quality.cwl (Workflow) | Nanopore Quality Control and Filtering | **Workflow for nanopore read quality control and contamination filtering.**<br>- FastQC before filtering (read quality control)<br>- Kraken2 taxonomic read classification<br>- Minimap2 read filtering based on given references<br>- FastQC after filtering (read quality control) |
illumina_pilon_readmapping | ../bbmap/bbmap.cwl (CommandLineTool) | BBMap | Read filtering using BBMap against a (contamination) reference genome |
metaquast_pilon_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
illumina_pilon_sam_to_sorted_bam | ../samtools/sam_to_sorted-bam.cwl (CommandLineTool) | sam to sorted bam | `samtools view -@ $2 -hu $1 \| samtools sort -@ $2 -o $3.bam` |
metaquast_medaka_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
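The `illumina_pilon_sam_to_sorted_bam` step documents its command as `samtools view -@ $2 -hu $1 | samtools sort -@ $2 -o $3.bam`. A minimal Python sketch of how the three positional parameters map onto that pipeline is shown below. The function names and the reading of `$1`, `$2`, `$3` as SAM path, thread count and output prefix are assumptions based on the command string; the sketch only builds the commands and does not run samtools.

```python
import shlex
from typing import List, Tuple


def sam_to_sorted_bam_commands(
    sam_path: str, threads: int, out_prefix: str
) -> Tuple[List[str], List[str]]:
    """Build the two commands of the documented pipeline (hypothetical helper).

    Assumed mapping: $1 = sam_path, $2 = threads, $3 = out_prefix.
    """
    # samtools view: -h keeps the header, -u emits uncompressed BAM.
    view_cmd = ["samtools", "view", "-@", str(threads), "-hu", sam_path]
    # samtools sort reads from stdin and writes <out_prefix>.bam.
    sort_cmd = ["samtools", "sort", "-@", str(threads), "-o", f"{out_prefix}.bam"]
    return view_cmd, sort_cmd


def as_shell_pipeline(view_cmd: List[str], sort_cmd: List[str]) -> str:
    """Render both commands as one shell pipeline string, quoting arguments."""
    return " | ".join(shlex.join(cmd) for cmd in (view_cmd, sort_cmd))


view_cmd, sort_cmd = sam_to_sorted_bam_commands("mapped.sam", 4, "sample01")
pipeline = as_shell_pipeline(view_cmd, sort_cmd)
```

Keeping the commands as argument lists means they can also be passed to `subprocess.Popen`, with the first process's stdout connected to the second's stdin, without going through a shell.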
Outputs
ID | Type | Label | Doc |
---|---|---|---|
binning_output | Directory | Binning output | Binning output folders |
kraken2_output | Directory | Kraken2 reports | Kraken2 taxonomic classification reports |
assembly_output | Directory | Assembly output | Output from different assembly steps |
illumina_quality_stats | Directory | Filtered statistics | Statistics on quality and preprocessing of the reads |
nanopore_quality_output | Directory | Read quality and filtering reports | Quality reports |
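After a run, a CWL runner typically reports the outputs as a JSON object keyed by the output IDs above. The sketch below, under that assumption, extracts the directory location for each of the five outputs. The helper name `output_directories` and the sample paths are hypothetical; the `class`, `path` and `location` fields follow the CWL `Directory` object, and outputs that were not produced (for example when no Illumina reads were given) are assumed to be absent or null.

```python
import json
from typing import Dict, Optional

# Output IDs from the table above.
OUTPUT_IDS = (
    "binning_output",
    "kraken2_output",
    "assembly_output",
    "illumina_quality_stats",
    "nanopore_quality_output",
)


def output_directories(cwl_output_json: str) -> Dict[str, Optional[str]]:
    """Map each output ID to its directory path, or None if not produced."""
    outputs = json.loads(cwl_output_json)
    result: Dict[str, Optional[str]] = {}
    for output_id in OUTPUT_IDS:
        entry = outputs.get(output_id)
        if isinstance(entry, dict) and entry.get("class") == "Directory":
            # Prefer the local path; fall back to the location URI.
            result[output_id] = entry.get("path") or entry.get("location")
        else:
            result[output_id] = None
    return result


# Hypothetical runner output for a Nanopore-only run.
sample = json.dumps({
    "assembly_output": {"class": "Directory", "path": "/out/assembly"},
    "kraken2_output": {"class": "Directory", "location": "file:///out/kraken2"},
    "nanopore_quality_output": {"class": "Directory", "path": "/out/nanopore_qc"},
    "binning_output": None,
})
dirs = output_directories(sample)
```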
https://w3id.org/cwl/view/git/b9097b82e6ab6f2c9496013ce4dd6877092956a0/cwl/workflows/workflow_nanopore_assembly.cwl