COVID-19: variation analysis of ARTIC ONT data

Annotation: A Galaxy workflow that replaces the ARTIC minion shell command

Step / Annotation
Step 1: Input dataset
select at runtime
Step 2: Input dataset
select at runtime
Step 3: Input dataset collection
select at runtime
Step 4: Input parameter
Not available.
Step 5: Input parameter
Not available.
Step 6: fastp
Single-end
Output dataset 'output' from step 3
Adapter Trimming Options:
True
Empty.
Global trimming options:
Not available.
Not available.
Overrepresented Sequence Analysis:
False
Not available.
Filter Options:
Quality filtering options:
True
Not available.
Not available.
Not available.
Length filtering options:
False
Not available.
Not available.
Low complexity filtering options:
False
Not available.
Read Modification Options:
Disable polyG tail trimming
Disable polyX trimming
UMI processing:
False
Empty.
Not available.
Empty.
Per read cutting by quality options:
False
False
Not available.
Not available.
Base correction by overlap analysis options:
False
Output Options:
True
False
Step 7: Map with minimap2
Use a genome from history and build index
Output dataset 'output' from step 2
Single
Output dataset 'out1' from step 6
Oxford Nanopore read to reference mapping. Slightly more sensitive for Oxford Nanopore to reference mapping (-k15). For PacBio reads, HPC minimizers consistently lead to faster performance and more sensitive results than normal minimizers. For Oxford Nanopore data, normal minimizers are better, though not by much. The effectiveness of HPC is determined by the sequencing error mode. (map-ont)
Indexing options:
False
Not available.
Not available.
Not available.
Mapping options:
Not available.
Not available.
Not available.
Not available.
Not available.
Not available.
Not available.
Not available.
Not available.
Not available.
False
Not available.
Alignment options:
No, use profile setting or leave turned off
Not available.
Not available.
Not available.
Not available.
Not available.
Not available.
Not available.
Not available.
Not available.
True
Set advanced output options:
BAM
False
False
Not available.
Nothing selected.
False
False
False
Step 8: Samtools view
Output dataset 'alignment_output' from step 7
A filtered/subsampled selection of reads
Configure filters:
No
No
1
Empty.
Not available.
Nothing selected.
Read is unmapped; Alignment of the read is not primary
Nothing selected.
Configure subsampling:
Specify a downsampling factor
1.0
Not available.
All reads retained after filtering and subsampling
False
Read Reformatting Options:
Strip read tags from outputs
False
BAM (-b)
No, see help (-output-fmt-option no_ref)
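The two flags listed under "Configure filters" in Step 8 correspond to bits in the SAM FLAG field. The following is a minimal Python sketch of that filter, assuming the two flags are used as an exclusion filter (the listing does not say so explicitly). The names `EXCLUDE_MASK` and `keep` are illustrative and not part of the workflow.

```python
# SAM FLAG bits as defined in the SAM specification.
UNMAPPED = 0x4       # read is unmapped
NOT_PRIMARY = 0x100  # alignment of the read is not primary (secondary)

# Assumption: both flags are applied as an exclusion filter,
# so a read carrying either bit is dropped.
EXCLUDE_MASK = UNMAPPED | NOT_PRIMARY


def keep(flag: int) -> bool:
    """Return True if a read with this FLAG value passes the filter."""
    return (flag & EXCLUDE_MASK) == 0


if __name__ == "__main__":
    for flag in (0, 16, 4, 256, 2048):
        print(flag, keep(flag))
```

Under that assumption, the combined mask is the value one would pass to the `-F` option of `samtools view`.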
Step 9: ivar trim
Output dataset 'outputsam' from step 8
History
Output dataset 'output' from step 1
1
0
4
False
Step 10: Samtools stats
Output dataset 'outputsam' from step 8
No
False
One single summary file
Do not filter
Not available.
Not available.
Not available.
Not available.
Not available.
No
No
False
False
Not available.
Step 11: BamLeftAlign
History
Output dataset 'output_bam' from step 9
Output dataset 'output' from step 2
5
Step 12: medaka consensus tool
Output dataset 'output_bam' from step 11
r941_min_high_g351
100
None
800
400
False
False
False
Empty.
Empty.
Not available.
False
Result
Step 13: QualiMap BamQC
Output dataset 'output_bam' from step 11
All (whole genome)
False
Reads flagged as duplicates in input
Settings affecting specific plots:
400
True
Nothing selected.
3
Step 14: medaka variant tool
No
Output dataset 'out_result' from step 12
Use a genome from history
Output dataset 'output' from step 2
Empty.
False
Output annotated VCF
Output dataset 'output_bam' from step 11
0
0
True
Step 15: Flatten Collection
Output dataset 'raw_data' from step 13
underscore ( _ )
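Step 15 flattens the nested QualiMap raw-data collection and joins the nested element identifiers with an underscore. The effect on identifiers can be sketched as follows. The function `flatten` and the sample and file names in the usage example are hypothetical; this is not the tool's implementation.

```python
def flatten(nested: dict, sep: str = "_", prefix: str = "") -> dict:
    """Flatten a nested mapping, joining identifiers at each level with `sep`."""
    flat = {}
    for name, value in nested.items():
        ident = f"{prefix}{sep}{name}" if prefix else name
        if isinstance(value, dict):
            # Descend into the inner collection, carrying the joined identifier.
            flat.update(flatten(value, sep, ident))
        else:
            flat[ident] = value
    return flat


if __name__ == "__main__":
    # Hypothetical nested collection: one sample with two raw-data files.
    nested = {"sampleA": {"genome_results": "a.txt", "coverage_histogram": "b.txt"}}
    print(flatten(nested))
```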
Step 16: Lofreq filter
Output dataset 'out_annotated' from step 14
SNVs and Indels
Quality-based filter options:
No, don't apply call quality filter
No, don't apply call quality filter
Coverage-based filter options:
0
0
Allele frequency filter options:
0.0
0.0
Strand bias filter options:
Yes, filter on multiple testing corrected strand-bias p-value (lofreq default)
0.001
False-discovery rate
True
False
Keep variants, but indicate failed filters in output FILTER column
Step 17: MultiQC
Results
Results 1
Samtools
Samtools outputs
Samtools output 1
stats
Output dataset 'output' from step 10
Results 2
Qualimap (BamQC or RNASeq output)
Output dataset 'output' from step 15
Empty.
Empty.
False
False
Step 18: Replace
Output dataset 'outvcf' from step 16
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO.*
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE
True
False
False
False
False
entire line
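The find-and-replace in Step 18 rewrites the VCF column header line so that it ends with FORMAT and SAMPLE columns. The following is a minimal Python sketch of the same substitution, using the pattern and replacement from the step. The helper name `fix_header` is illustrative, and the remaining Step 18 options are not modelled.

```python
import re

# Pattern and replacement copied from Step 18; "\t" denotes a tab character.
PATTERN = re.compile(r"#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO.*")
REPLACEMENT = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE"


def fix_header(line: str) -> str:
    """Rewrite the VCF column header line; leave every other line unchanged."""
    # ".*" stops at the newline, so a trailing line break is preserved.
    return PATTERN.sub(lambda _match: REPLACEMENT, line)


if __name__ == "__main__":
    header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    print(repr(fix_header(header)))
```

Anything that follows INFO on the header line is discarded by the `.*` in the pattern, so the rewritten header always has exactly ten columns.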
Step 19: SnpEff eff:
Output dataset 'outfile' from step 18
VCF
NC_045512.2: COVID19 Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1
VCF (only if input is VCF)
False
No upstream / downstream intervals (0 bases)
Use 'EFF' field compatible with older versions (instead of 'ANN'); Use Classic Effect names and amino acid variant annotations (NON_SYNONYMOUS_CODING vs missense_variant and G180R vs p.Gly180Arg/c.538G>C)
select at runtime
select at runtime
Do not show DOWNSTREAM changes; Do not show INTERGENIC changes; Do not show UPSTREAM changes
No
Use default (based on input type)
Empty.
True
True