COVID-19: variation analysis on WGS SE data

Annotation: Call variants from WGS (non-ampliconic) single-end reads.

Step | Annotation
Step 1: Input dataset collection
select at runtime
Step 2: Input dataset
select at runtime
Step 3: fastp
Single-end
Output dataset 'output' from step 1
Adapter Trimming Options:
False
Empty.
Global trimming options:
Not available.
Not available.
Overrepresented Sequence Analysis:
False
Not available.
Filter Options:
Quality filtering options:
False
Not available.
Not available.
Not available.
Length filtering options:
False
Not available.
Not available.
Low complexity filtering options:
False
Not available.
Read Modification Options:
Automatic trimming for Illumina NextSeq/NovaSeq data
Not available.
Disable polyX trimming
UMI processing:
False
Empty.
Not available.
Empty.
Per read cutting by quality options:
False
False
Not available.
Not available.
Base correction by overlap analysis options:
False
Output Options:
True
True
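For readers who want to reproduce step 3 outside Galaxy, the settings above amount to running fastp on single-end reads with its defaults and keeping both reports. The following is a minimal sketch that only assembles the argument list; the file names and the helper name `fastp_single_end_cmd` are hypothetical, and the exact command Galaxy generates may differ.

```python
def fastp_single_end_cmd(reads, trimmed, json_report, html_report):
    """Assemble a fastp call for single-end reads using default filters."""
    return [
        "fastp",
        "-i", reads,            # single-end input FASTQ (hypothetical path)
        "-o", trimmed,          # trimmed reads, passed on to the mapper
        "--json", json_report,  # JSON report, the input MultiQC reads in step 6
        "--html", html_report,  # HTML report for manual inspection
    ]


fastp_cmd = fastp_single_end_cmd(
    "sample.fastq.gz", "sample.trimmed.fastq.gz", "fastp.json", "fastp.html"
)
print(" ".join(fastp_cmd))
```

The list form can be handed to `subprocess.run` unchanged, which avoids shell quoting problems with unusual file names.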
Step 4: Bowtie2
Single-end
Output dataset 'out1' from step 3
False
False
Use a genome from the history and build index
Output dataset 'output' from step 2
Automatically assign ID using name of history item(s)
1: Default setting only
Very sensitive end-to-end (--very-sensitive)
No
True
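Step 4 builds an index from the reference in the history and then maps the trimmed reads with the `--very-sensitive` end-to-end preset. A rough command-line equivalent could look like the sketch below; the paths, the index prefix, and the helper name `bowtie2_cmds` are assumptions, and the read-group assignment and conversion to sorted BAM that Galaxy performs are not shown.

```python
def bowtie2_cmds(reference_fasta, index_prefix, reads, sam_out):
    """Return (index build, alignment) argument lists for single-end mapping."""
    build = ["bowtie2-build", reference_fasta, index_prefix]
    align = [
        "bowtie2",
        "--very-sensitive",  # end-to-end preset selected in the workflow
        "-x", index_prefix,  # index built from the history reference
        "-U", reads,         # unpaired (single-end) reads from fastp
        "-S", sam_out,       # SAM output
    ]
    return build, align


bt2_build, bt2_align = bowtie2_cmds(
    "NC_045512.2.fasta", "sars_cov_2", "sample.trimmed.fastq.gz", "sample.sam"
)
print(" ".join(bt2_build))
print(" ".join(bt2_align))
```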
Step 5: MarkDuplicates
Output dataset 'output' from step 4
Comments
True
True
SUM_OF_BASE_QUALITIES
Empty.
100
Empty.
Lenient
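The MarkDuplicates values above can be read as Picard arguments. The sketch below assumes that the two unlabeled `True` values correspond to REMOVE_DUPLICATES and ASSUME_SORTED and that `100` is the optical duplicate pixel distance; the export does not name these parameters, so treat the mapping as an assumption. File names and the helper name are hypothetical.

```python
def mark_duplicates_cmd(bam_in, bam_out, metrics):
    """Assemble a Picard MarkDuplicates call (legacy KEY=VALUE syntax)."""
    return [
        "picard", "MarkDuplicates",
        f"INPUT={bam_in}",
        f"OUTPUT={bam_out}",
        f"METRICS_FILE={metrics}",  # consumed by MultiQC in step 6
        "REMOVE_DUPLICATES=true",   # assumed meaning of the first True
        "ASSUME_SORTED=true",       # assumed meaning of the second True
        "DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES",
        "OPTICAL_DUPLICATE_PIXEL_DISTANCE=100",  # assumed meaning of 100
        "VALIDATION_STRINGENCY=LENIENT",
    ]


markdup_cmd = mark_duplicates_cmd("sample.bam", "sample.dedup.bam", "markdup_metrics.txt")
print(" ".join(markdup_cmd))
```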
Step 6: MultiQC
Results
Results 1
fastp
Output dataset 'report_json' from step 3
Results 2
Bowtie 2
Output dataset 'mapping_stats' from step 4
Results 3
Picard
Picard outputs
Picard output 1
Markdups
Output dataset 'metrics_file' from step 5
Empty.
Empty.
False
False
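Step 6 collects three kinds of reports: the fastp JSON, the Bowtie 2 mapping statistics, and the Picard duplication metrics. Outside Galaxy, one way to get a comparable summary is to pass those files to MultiQC directly, as in this sketch; the file names, the output directory, and the helper name are hypothetical.

```python
def multiqc_cmd(report_files, outdir):
    """Assemble a MultiQC call over an explicit list of report files."""
    cmd = ["multiqc", "--outdir", outdir]
    cmd.extend(report_files)  # MultiQC detects each report type from its content
    return cmd


mqc_cmd = multiqc_cmd(
    ["fastp.json", "bowtie2_mapping_stats.txt", "markdup_metrics.txt"],
    "multiqc_report",
)
print(" ".join(mqc_cmd))
```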
Step 7: Realign reads
Output dataset 'outFile' from step 5
History
Output dataset 'output' from step 2
Advanced options:
False
Keep unchanged
2
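"Realign reads" is the Galaxy name for LoFreq's Viterbi realignment. A possible command-line form is sketched below. It assumes that "Keep unchanged" with the value `2` maps to `--defqual 2`, which the export does not state; file names and the helper name are hypothetical. The realigned BAM may need to be re-sorted before the next step.

```python
def lofreq_viterbi_cmd(bam_in, reference_fasta, bam_out):
    """Assemble a lofreq viterbi call for probabilistic realignment."""
    return [
        "lofreq", "viterbi",
        "--ref", reference_fasta,  # same reference as used for mapping
        "--defqual", "2",          # assumed mapping of "Keep unchanged" / 2
        "--out", bam_out,
        bam_in,                    # deduplicated BAM from step 5
    ]


viterbi_cmd = lofreq_viterbi_cmd("sample.dedup.bam", "NC_045512.2.fasta", "sample.realigned.bam")
print(" ".join(viterbi_cmd))
```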
Step 8: Insert indel qualities
Output dataset 'realigned' from step 7
Dindel
History
Output dataset 'output' from step 2
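Step 8 adds indel qualities with the Dindel model, which LoFreq needs before it can call indels. A minimal sketch of the corresponding call follows, with hypothetical file names and helper name.

```python
def lofreq_indelqual_cmd(bam_in, reference_fasta, bam_out):
    """Assemble a lofreq indelqual call using the Dindel model."""
    return [
        "lofreq", "indelqual",
        "--dindel",                # "Dindel" strategy selected in the workflow
        "--ref", reference_fasta,  # required by the Dindel model
        "--out", bam_out,
        bam_in,                    # realigned BAM from step 7
    ]


indelqual_cmd = lofreq_indelqual_cmd(
    "sample.realigned.bam", "NC_045512.2.fasta", "sample.indelqual.bam"
)
print(" ".join(indelqual_cmd))
```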
Step 9: Call variants
Output dataset 'output' from step 8
History
Output dataset 'output' from step 2
Whole reference
SNVs and indels
Configure settings
Coverage:
5
1000000
Paired reads:
False
Base-calling quality:
30
30
Use original base qualities
Base alignment quality:
Yes, and prefer existing alignment qualities encoded in input
Base and indel alignment qualities (BAQ and IDAQ)
True
Mapping quality:
20
Yes, incorporate MAPQ into joint quality score
255
Source quality:
No, don't incorporate source quality into joint quality score
Joint quality:
0
0
0
Custom filter settings/combinations
0.0005
0
False
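The labeled thresholds in step 9 translate fairly directly into `lofreq call` options. The sketch below covers only the values whose meaning is reasonably clear from their headings: coverage 5 to 1000000, base quality 30 for reference and alternate bases, and mapping quality 20 to 255. The joint-quality values, the significance threshold `0.0005`, and the remaining unlabeled entries are left out because the export does not say which options they belong to. File names and the helper name are hypothetical.

```python
def lofreq_call_cmd(bam_in, reference_fasta, vcf_out):
    """Assemble a lofreq call invocation for SNVs and indels."""
    return [
        "lofreq", "call",
        "--ref", reference_fasta,
        "--out", vcf_out,
        "--call-indels",           # "SNVs and indels"
        "--min-cov", "5",          # Coverage: minimum
        "--max-depth", "1000000",  # Coverage: maximum
        "--min-bq", "30",          # Base-calling quality, any base (assumed order)
        "--min-alt-bq", "30",      # Base-calling quality, alternate bases (assumed order)
        "--min-mq", "20",          # Mapping quality: minimum
        "--max-mq", "255",         # Mapping quality: cap
        bam_in,                    # BAM with indel qualities from step 8
    ]


call_cmd = lofreq_call_cmd("sample.indelqual.bam", "NC_045512.2.fasta", "sample.vcf")
print(" ".join(call_cmd))
```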
Step 10: Lofreq filter
Output dataset 'variants' from step 9
SNVs and Indels
Quality-based filter options:
No, don't apply call quality filter
No, don't apply call quality filter
Coverage-based filter options:
0
0
Allele frequency filter options:
0.0
0.0
Strand bias filter options:
Yes, filter on multiple testing corrected strand-bias p-value (lofreq default)
0.001
False-discovery rate
True
False
Keep variants, but indicate failed filters in output FILTER column
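Step 10 disables the quality, coverage, and allele-frequency filters and keeps only the strand-bias filter, with FDR correction at 0.001, while retaining failed variants and flagging them in the FILTER column. One plausible `lofreq filter` equivalent is sketched below. The two unlabeled `True`/`False` values are not mapped, and the file names and helper name are hypothetical.

```python
def lofreq_filter_cmd(vcf_in, vcf_out):
    """Assemble a lofreq filter call that applies only the strand-bias filter."""
    return [
        "lofreq", "filter",
        "--in", vcf_in,
        "--out", vcf_out,
        "--no-defaults",        # start from no filters, then add explicitly
        "--sb-mtc", "fdr",      # multiple-testing correction: false-discovery rate
        "--sb-alpha", "0.001",  # strand-bias significance threshold
        "--print-all",          # keep failing variants, marked in FILTER
    ]


filter_cmd = lofreq_filter_cmd("sample.vcf", "sample.filtered.vcf")
print(" ".join(filter_cmd))
```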
Step 11: SnpEff eff:
Output dataset 'outvcf' from step 10
VCF
NC_045512.2: COVID19 Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1
VCF (only if input is VCF)
False
No upstream / downstream intervals (0 bases)
Use 'EFF' field compatible with older versions (instead of 'ANN'); Use Classic Effect names and amino acid variant annotations (NON_SYNONYMOUS_CODING vs missense_variant and G180R vs p.Gly180Arg/c.538G>C)
select at runtime
select at runtime
Do not show DOWNSTREAM changes; Do not show INTERGENIC changes; Do not show UPSTREAM changes; Do not show 5_PRIME_UTR or 3_PRIME_UTR changes
No
Use default (based on input type)
Empty.
True
True
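The SnpEff options in step 11 correspond to a set of well-known command-line switches: the older 'EFF' field, classic effect names, a zero-length upstream/downstream interval, and suppression of downstream, intergenic, upstream, and UTR annotations. The sketch below assumes a locally installed SnpEff database named `NC_045512.2`; that name, the file name, and the helper name are hypothetical. SnpEff writes the annotated VCF to standard output.

```python
def snpeff_cmd(vcf_in, genome="NC_045512.2"):
    """Assemble a SnpEff eff call matching the annotation options above."""
    return [
        "snpEff", "eff",
        "-i", "vcf",        # input format
        "-o", "vcf",        # output format
        "-formatEff",       # use 'EFF' instead of 'ANN'
        "-classic",         # classic effect names and amino acid notation
        "-ud", "0",         # no upstream/downstream intervals (0 bases)
        "-no-downstream",
        "-no-intergenic",
        "-no-upstream",
        "-no-utr",
        genome,             # database name (assumed to be installed locally)
        vcf_in,             # filtered VCF from step 10
    ]


eff_cmd = snpeff_cmd("sample.filtered.vcf")
print(" ".join(eff_cmd))
```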