Workflow: rna annotation

Fetched 2023-04-03 01:24:57 GMT

RNAs - predict, cluster, identify, annotate

Inputs

ID             Type               Title  Doc
jobid          String
m5nrBDB        File
m5rnaFull      File
sequences      File
m5rnaClust     File
m5rnaIndex     Directory
m5rnaPrefix    String
rnaIdentity    Float (Optional)
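
The inputs listed above can be supplied to a CWL runner as a job file. The following is a minimal sketch, assuming a JSON job file; the helper names, the job ID, and every path are hypothetical placeholders, and only the input IDs and their types come from the listing. The expected scale of rnaIdentity is not stated here, so the sketch leaves it unset by default.

```python
import json


def file_input(path):
    """CWL File object for a job file (path is a placeholder)."""
    return {"class": "File", "path": path}


def directory_input(path):
    """CWL Directory object for a job file (path is a placeholder)."""
    return {"class": "Directory", "path": path}


def build_job(jobid, ref_dir, sequences, rna_identity=None):
    """Assemble a job object keyed by the workflow's input IDs.

    All file names under ref_dir are hypothetical; substitute the
    real reference data locations.
    """
    job = {
        "jobid": jobid,
        "m5nrBDB": file_input(f"{ref_dir}/m5nr.bdb"),
        "m5rnaFull": file_input(f"{ref_dir}/m5rna.fasta"),
        "sequences": file_input(sequences),
        "m5rnaClust": file_input(f"{ref_dir}/m5rna.clust.fasta"),
        "m5rnaIndex": directory_input(f"{ref_dir}/index"),
        "m5rnaPrefix": "m5rna.clust",
    }
    # rnaIdentity is optional, so it is only written when provided.
    if rna_identity is not None:
        job["rnaIdentity"] = float(rna_identity)
    return job


job = build_job("example_job", "/hypothetical/refdata", "/hypothetical/reads.fasta")
print(json.dumps(job, indent=2))
```

The printed JSON could be saved to a file and passed to a CWL runner together with the workflow document.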

Steps

ID Runs Label Doc
rnaBlat
../Tools/blat.tool.cwl (CommandLineTool)
BLAT

Fast sequence search command-line tool.
>blat -fastMap -t dna -q rna -out blast8 <database> <query> <output>
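
The blast8 output requested from BLAT is the 12-column, tab-separated BLAST tabular layout. As an illustration of how downstream code might read one record, here is a small parser sketch; the class name, function name, and sample line are invented for this example and are not part of the workflow.

```python
from typing import NamedTuple


class Blast8Hit(NamedTuple):
    """One row of BLAST tabular (m8 / blast8) output."""
    query: str
    subject: str
    identity: float   # percent identity
    length: int       # alignment length
    mismatches: int
    gap_opens: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    evalue: float
    bit_score: float


def parse_blast8_line(line):
    """Split one tab-separated line into a typed Blast8Hit."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        raise ValueError(f"expected 12 columns, got {len(fields)}")
    return Blast8Hit(
        fields[0],
        fields[1],
        float(fields[2]),
        *(int(value) for value in fields[3:10]),
        float(fields[10]),
        float(fields[11]),
    )


# Invented sample record, for illustration only.
sample = "read1\tref9\t98.50\t200\t3\t0\t1\t200\t51\t250\t1e-100\t370.0"
hit = parse_blast8_line(sample)
print(hit.query, hit.subject, hit.identity, hit.evalue)
```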

sortseq
../Tools/seqUtil.tool.cwl (CommandLineTool)
seqUtil

Utility tool for various sequence file transformations.

sorttab
../Tools/sort.tool.cwl (CommandLineTool)
GNU sort

Sort a text file based on the given field(s).
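
To make the idea of field-based sorting concrete, the sketch below orders tab-separated lines by a single 1-based field. It is only an analogy for what this step does: the function name and sample rows are made up, and GNU sort differs in details (for example, a key given without an end position extends to the end of the line, and ties are broken by a whole-line comparison unless stable sorting is requested).

```python
def sort_tab_lines(lines, field, numeric=False):
    """Return lines ordered by one tab-separated field (1-based)."""
    def key(line):
        value = line.rstrip("\n").split("\t")[field - 1]
        # Numeric mode compares values as numbers rather than as text.
        return float(value) if numeric else value
    # sorted() is stable: lines with equal keys keep their input order.
    return sorted(lines, key=key)


# Invented sample rows, for illustration only.
rows = ["b\t2\n", "a\t10\n", "c\t1\n"]
as_text = sort_tab_lines(rows, field=2)
as_numbers = sort_tab_lines(rows, field=2, numeric=True)
print(as_text)
print(as_numbers)
```

The two results differ because "10" sorts before "2" as text but after it as a number.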

sortmerna
../Tools/sortmerna.tool.cwl (CommandLineTool)
sortmerna

Align an rRNA FASTA file against the clustered rRNA index; output is in BLAST m8 format.
>sortmerna -a <# core> -m <MB ram> -e 0.1 --blast '1 cigar qcov qstrand' --ref '<refFasta>,<indexDir>/<indexName>' --reads <input> --aligned <input basename>

bleachSims
../Tools/bleachsims.tool.cwl (CommandLineTool)
bleachsims

Filter a similarity file by E-value and number of hits.
>bleachsims -s <input> -o <output> -m 20 -r 0 -c 3
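
The general idea of filtering similarities by E-value and by number of hits can be sketched as follows. This is not a reimplementation of bleachsims: the meaning of its -m, -r, and -c flags is not described here, so the two thresholds below (max_evalue and max_hits_per_query) are hypothetical parameters chosen for illustration, as is the (query, subject, evalue) tuple layout.

```python
from collections import defaultdict


def filter_hits(hits, max_evalue, max_hits_per_query):
    """Keep, per query, the lowest-E-value hits that pass a cutoff.

    hits is an iterable of (query, subject, evalue) tuples.
    Both thresholds are illustrative assumptions, not bleachsims flags.
    """
    by_query = defaultdict(list)
    for query, subject, evalue in hits:
        if evalue <= max_evalue:
            by_query[query].append((query, subject, evalue))

    kept = []
    # Dicts preserve insertion order, so queries stay in first-seen order.
    for query_hits in by_query.values():
        best_first = sorted(query_hits, key=lambda hit: hit[2])
        kept.extend(best_first[:max_hits_per_query])
    return kept


# Invented sample hits, for illustration only.
hits = [
    ("read1", "refA", 1e-30),
    ("read1", "refB", 1e-50),
    ("read1", "refC", 1e-10),
    ("read2", "refA", 0.5),
    ("read3", "refD", 1e-8),
]
filtered = filter_hits(hits, max_evalue=1e-5, max_hits_per_query=2)
print(filtered)
```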

rnaCluster
../Tools/cdhit-est.tool.cwl (CommandLineTool)
CD-HIT-est

Cluster nucleotide sequences, using the maximum available CPUs and memory.
>cdhit-est -n 9 -d 0 -T 0 -M 0 -c 0.97 -i <input> -o <output>

rnaFeature
../Tools/rna_feature.tool.cwl (CommandLineTool)
rna features

Identify rRNA features from the given rRNA FASTA and BLAST-aligned files.
>rna_feature.pl --seq <sequence> --sim <aligned> --ident 75 --output <output>

annotateSims
../Tools/sims_annotate.tool.cwl (CommandLineTool)
annotate sims

Create expanded, annotated sims files from the input md5 sim file and the m5nr database.
>sims_annotate.pl --verbose --in_sim <input> --in_scg <scgs> --ann_file <database> --format <seqFormat> --out_filter <outFilter> --out_expand <outExpand> -out_lca <outLca> --frag_num 5000
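
One of the outputs of this step is an LCA (lowest common ancestor) file. How sims_annotate.pl computes it is not described here, so the following is only a generic sketch of the concept: given several taxonomic lineages, keep the ranks they all share from the root downward. The function name and the example lineages are assumptions made for illustration.

```python
def lowest_common_ancestor(lineages):
    """Return the longest root-first prefix shared by all lineages.

    Each lineage is a list of taxon names ordered from root to leaf.
    """
    if not lineages:
        return []
    common = []
    # zip stops at the shortest lineage, so no index goes out of range.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            common.append(ranks[0])
        else:
            break
    return common


# Invented example lineages, for illustration only.
lineages = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria"],
    ["Bacteria", "Proteobacteria", "Alphaproteobacteria"],
    ["Bacteria", "Proteobacteria"],
]
lca = lowest_common_ancestor(lineages)
print(lca)
```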

formatCluster
../Tools/format_cluster.tool.cwl (CommandLineTool)
cluster file reformat

Reformat a CD-HIT .clstr file into an MG-RAST .mapping file.
>format_cluster.pl --input <input> --output <output>
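
To show what kind of information a .clstr file carries, here is a sketch that reads CD-HIT cluster text into representatives and members. It assumes the usual .clstr layout (a ">Cluster N" header, then one line per sequence, with "*" marking the representative). The layout of the MG-RAST .mapping file is not given here, so the sketch stops at an in-memory structure rather than guessing an output format; the function name and sample text are invented.

```python
import re

# Matches the ">seqid... *" or ">seqid... at +/98.50%" part of a member line.
MEMBER_RE = re.compile(r">(?P<id>.+?)\.\.\.\s+(?P<tail>\*|at\s+.+)$")


def parse_clstr(text):
    """Parse CD-HIT .clstr text into a list of cluster dicts."""
    clusters = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = {"representative": None, "members": []}
            clusters.append(current)
            continue
        match = MEMBER_RE.search(line)
        if match is None or current is None:
            raise ValueError(f"unexpected line: {raw_line!r}")
        if match.group("tail") == "*":
            current["representative"] = match.group("id")
        else:
            current["members"].append(match.group("id"))
    return clusters


# Invented sample in the assumed .clstr layout.
sample_clstr = (
    ">Cluster 0\n"
    "0\t1500nt, >seq1... *\n"
    "1\t1490nt, >seq2... at +/98.50%\n"
    ">Cluster 1\n"
    "0\t900nt, >seq3... *\n"
)
clusters = parse_clstr(sample_clstr)
for cluster in clusters:
    print(cluster["representative"], cluster["members"])
```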

Outputs

ID                Type    Label  Doc
rnaLCAOut         File
rnaSimsOut        File
rnaExpandOut      File
rnaFilterOut      File
rnaFeatureOut     File
rnaClustMapOut    File
rnaClustSeqOut    File

Permalink: https://w3id.org/cwl/view/git/f5839797da8209a9d3e441023f88130219751020/CWL/Workflows/rna-annotation.workflow.cwl