PXD019119_Metaproteomics workflow 09082020 with QT

Annotation:

Step	Annotation
Step 1: Input dataset
select at runtime
Step 2: Input dataset
select at runtime
Step 3: Protein Database Downloader
UniProtKB
9606
UniProtKB
Reference Proteome Set
True
Step 4: Input dataset
select at runtime
Step 5: Input dataset
select at runtime
Step 6: UniProt
A history dataset with an Organism Taxonomy Name column
Output dataset 'output' from step 1
1
fasta
Step 7: FASTA Merge Files and Filter Unique Sequences
Merge individual FASTAs (output collection if input is collection)
Input FASTA File(s)
Input FASTA File(s) 1
Output dataset 'output_database' from step 3
Input FASTA File(s) 2
Output dataset 'output' from step 4
Accession and Sequence
^>([^ ]+).*$
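The "Accession and Sequence" uniqueness setting together with the accession pattern above can be illustrated with a short Python sketch. It is not the Galaxy tool's implementation: `parse_fasta` and `merge_unique` are hypothetical helpers, and treating an entry as a duplicate only when both its accession and its sequence were already seen is an assumption about how the criterion behaves.

```python
import re

# Accession pattern taken from the step parameters: everything between
# ">" and the first space of the header line.
ACCESSION_RE = re.compile(r"^>([^ ]+).*$")


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_unique(*fasta_texts):
    """Merge FASTA inputs, keeping the first entry per (accession, sequence).

    Assumption: an entry is a duplicate only if both values match an
    earlier entry.
    """
    seen = set()
    merged = []
    for text in fasta_texts:
        for header, sequence in parse_fasta(text):
            accession = ACCESSION_RE.match(header).group(1)
            key = (accession, sequence)
            if key not in seen:
                seen.add(key)
                merged.append((header, sequence))
    return merged
```

Calling `merge_unique(first_fasta_text, second_fasta_text)` returns the retained entries in input order, so the first input listed takes precedence when two entries collide.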
Step 8: msconvert
Output dataset 'output' from step 5
True
mgf
Data Processing Filters:
True
All Levels (1-)
Prefer vendor algorithm, fallback to local-maximum
False
no
Filter by Thresholds
False
False
False
False
Scan Inclusion/Exclusion Filters:
no
Filter Scan Indices
Filter Scan Numbers
False
False
no
no
General Options:
False
False
False
False
False
False
False
0
Output Encoding Settings:
64
32
zlib
False
Step 9: FASTA Merge Files and Filter Unique Sequences
Merge individual FASTAs (output collection if input is collection)
Input FASTA File(s)
Input FASTA File(s) 1
Output dataset 'output' from step 2
Input FASTA File(s) 2
Output dataset 'output_database' from step 3
Input FASTA File(s) 3
Output dataset 'proteome' from step 6
Input FASTA File(s) 4
Output dataset 'output' from step 4
Accession and Sequence
^>([^ ]+).*$
Step 10: Search GUI
Output dataset 'output' from step 9
Protein Database Options:
True
False
False
Output dataset 'output' from step 8
Search Engine Options:
X!Tandem, MS-GF+, OMSSA
Protein Digestion Options:
Select Enzymes
Enzymes
Enzymes 1
Trypsin
2
Precursor Options:
Parts per million (ppm)
20.0
Parts per million (ppm)
20.0
2
6
b
y
0
1
Protein Modification Options:
Carbamidomethylation of C
Acetylation of protein N-term, Oxidation of M
Advanced Options:
Default
Default
Default
Default
Default
Default
Default
Default
Default
Default
Step 11: Peptide Shaker
Output dataset 'searchgui_results' from step 10
Default Processing Options
Advanced Filtering Options
8
60
20.0
ppm
True
GalaxyP Project contact (Not suitable for PRIDE submission)
Exporting options:
True
False
False
False
PSM Report, Peptide Report, Protein Report, Certificate of Analysis
Step 12: Select
Output dataset 'output_peptides' from step 11
NOT Matching
con_
Step 13: Filter
Output dataset 'out_file1' from step 12
c17=='Confident'
1
Step 14: Cut
c6
Tab
Output dataset 'out_file1' from step 13
Step 15: Remove beginning
1
Output dataset 'out_file1' from step 14
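Steps 12 to 15 form a small text-processing chain on the peptide report. A minimal Python sketch of the same chain follows, assuming a tab-separated report with one header line. `confident_peptides` is a hypothetical helper, and passing the header line through the filter untouched is an assumption about how the Filter step treats its one skipped header line.

```python
def confident_peptides(report_lines):
    """Return the column-6 values of confident, non-contaminant rows."""
    # Step 12 (Select, NOT Matching "con_"): drop contaminant lines.
    kept = [ln.rstrip("\n") for ln in report_lines if "con_" not in ln]
    if not kept:
        return []
    header, rows = kept[0], kept[1:]

    # Step 13 (Filter, c17=='Confident'): the header line is passed through.
    confident = []
    for row in rows:
        fields = row.split("\t")
        if len(fields) >= 17 and fields[16] == "Confident":
            confident.append(row)

    # Step 14 (Cut c6): keep only the sixth column of every remaining line.
    column6 = [ln.split("\t")[5] for ln in [header] + confident]

    # Step 15 (Remove beginning, 1 line): drop the header value.
    return column6[1:]
```

The result is a plain list of values, one per line, which is the shape the Unipept step that follows takes as input.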
Step 16: Unipept
pept2lca: lowest common ancestor
False
True
True
True
tabular
Output dataset 'out_file1' from step 15
1
Tabular with one line per peptide; Comma Separated Values (.csv) with one line per peptide; JSON Taxonomy Tree (for pept2lca, pept2taxa, and peptinfo)
False
Step 17: Query Tabular
workdb.sqlite
Add tables to an existing database:
select at runtime
Database Tables
Database Table 1
Output dataset 'output_tsv' from step 16
Filter Dataset Input:
Filter Tabular Input Lines
Filter Tabular Input Lines 1
regex replace value in column
1
#peptide
peptide
Table Options:
lca
True
Empty.
False
Empty.
Table Indices
Modify the database:
Database Manipulation SQL Statements
False
SELECT DISTINCT peptide FROM lca WHERE taxon_name == 'Coronaviridae' OR genus == 'Anaerococcus' OR genus == 'Dietzia' OR genus == 'Hafnia' OR species == 'Campylobacter volucris' OR species == 'Candida albicans' OR species == 'Cladophialophora psammophila' OR species == 'Johnsonella ignava' OR species == 'Mycobacterium aquaticum' OR species == 'Mycobacterium palustre' OR species == 'Prevotella buccalis' OR species == 'Rickettsia argasii' OR species == 'Scedosporium apiospermum' OR species == 'Trichosporon asahii' OR species == 'Yersinia frederiksenii'
No
Additional Queries:
Database Manipulation SQL Statements
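The taxon selection in Step 17 can be tried outside Galaxy with Python's built-in `sqlite3` module. The sketch below is illustrative only: the four-column `lca` table is a reduced, assumed layout, the rows are made-up toy data, and the query is shortened to three of the conditions from the step.

```python
import sqlite3

# In-memory database with a reduced, assumed version of the lca table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lca (peptide TEXT, taxon_name TEXT, genus TEXT, species TEXT)"
)

# Toy rows for illustration only; these are not results from the workflow.
toy_rows = [
    ("PEPTIDEA", "Coronaviridae", "", ""),
    ("PEPTIDEA", "Coronaviridae", "", ""),
    ("PEPTIDEB", "Dietzia maris", "Dietzia", "Dietzia maris"),
    ("PEPTIDEC", "Homo sapiens", "Homo", "Homo sapiens"),
    ("PEPTIDED", "Candida albicans", "Candida", "Candida albicans"),
]
conn.executemany("INSERT INTO lca VALUES (?, ?, ?, ?)", toy_rows)

# Shortened form of the Step 17 query; SQLite accepts "==" as equality.
query = (
    "SELECT DISTINCT peptide FROM lca "
    "WHERE taxon_name == 'Coronaviridae' "
    "OR genus == 'Dietzia' "
    "OR species == 'Candida albicans'"
)
selected_peptides = sorted(row[0] for row in conn.execute(query))
conn.close()
```

`DISTINCT` collapses the repeated toy peptide, and the human peptide is excluded because it matches none of the listed taxa.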
Step 18: PepQuery
Input Data:
peptide
Peptide list from your history
Output dataset 'output' from step 17
Output dataset 'output' from step 7
Output dataset 'output' from step 8
spectrum title in MGF
Modifications:
Carbamidomethylation of C (57.02146372057) modaa
Oxidation of M (15.99491461956) modaa
3
True
True
Mass spectrometer:
Tolerance:
20
ppm
0.05
Digestion:
Trypsin
2
PSM:
CID/HCD
HyperScore
6
2
10
12
45
1000
True
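The 20 ppm precursor tolerance set in Steps 10 and 18 is a relative mass window. As a rough sketch of what that setting means (not code taken from SearchGUI or PepQuery), the check can be written as below; `ppm_error` and `within_tolerance` are hypothetical helper names.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass, theoretical_mass, tolerance_ppm=20.0):
    """True if the observed mass lies inside the +/- ppm window."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tolerance_ppm
```

For a peptide with a theoretical mass of 1000 Da, a 20 ppm window corresponds to plus or minus 0.02 Da, so the absolute window widens with peptide mass.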
Step 19: Query Tabular
workdb.sqlite
Add tables to an existing database:
select at runtime
Database Tables
Database Table 1
Output dataset 'psm_rank_txt' from step 18
Filter Dataset Input:
Filter Tabular Input Lines
Table Options:
PepQueryValidation
peptide,modification,n,spectrum_file,spectrum_title,charge,exp_mass,ppm,pep_mass,mz,score,n_db,total_db,n_random,total_random,pvalue,rank,n_ptm
False
Empty.
Table Indices
False
SELECT peptide,modification,n,spectrum_file,spectrum_title,charge,exp_mass,ppm,pep_mass,mz,score,n_db,total_db,n_random,total_random,pvalue,rank,n_ptm FROM PepQueryValidation WHERE n_ptm==0
False
Step 20: Cut
c1
Tab
Output dataset 'output' from step 19
Step 21: Group
Output dataset 'out_file1' from step 20
c1
False
Nothing selected.
Operations
Operation 1
Concatenate Distinct
c1
NO
Not available.
Step 22: Cut
c1
Tab
Output dataset 'out_file1' from step 21
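Taken together, Steps 19 to 22 reduce the PepQuery PSM table to a list of distinct peptides that matched without additional modifications (`n_ptm==0`). A compact Python sketch of that reduction follows. `validated_peptides` is a hypothetical helper, the input is assumed to be tab-separated lines without a header, and returning the peptides sorted is an assumption about the Group step's output order.

```python
# Column names as listed in the Step 19 table options.
COLUMNS = (
    "peptide,modification,n,spectrum_file,spectrum_title,charge,exp_mass,"
    "ppm,pep_mass,mz,score,n_db,total_db,n_random,total_random,pvalue,"
    "rank,n_ptm"
).split(",")


def validated_peptides(psm_lines):
    """Return distinct peptides whose PSM rows have n_ptm equal to 0."""
    peptides = set()
    for line in psm_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(COLUMNS):
            continue  # skip malformed lines
        row = dict(zip(COLUMNS, fields))
        if int(row["n_ptm"]) == 0:      # Step 19: WHERE n_ptm==0
            peptides.add(row["peptide"])  # Steps 20-22: distinct column 1
    return sorted(peptides)
```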
Step 23: Unipept
peptinfo: Tryptic peptides and associated EC and GO terms and lowest common ancestor taxonomy
False
True
True
True
True
tabular
Output dataset 'out_file1' from step 22
c1
Tabular with one line per peptide; JSON Taxonomy Tree (for pept2lca, pept2taxa, and peptinfo)
False
Step 24: Cut
c1,c3,c5,c15
Tab
Output dataset 'output_tsv' from step 23
Step 25: Group
Output dataset 'out_file1' from step 24
4
False
Nothing selected.
Operations
Operation 1
Count
4
NO
Not available.
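Steps 24 and 25 keep four columns of the Unipept peptinfo table and then count rows per value of the fourth retained column. The same operation can be sketched in Python as below; `cut_and_count` is a hypothetical helper, and the input is assumed to be tab-separated lines with at least 15 columns.

```python
from collections import Counter


def cut_and_count(lines, columns=(1, 3, 5, 15), group_column=4):
    """Cut the given 1-based columns, then count rows per group value."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < max(columns):
            continue  # skip lines that are too short to cut
        cut = [fields[c - 1] for c in columns]    # Step 24: Cut c1,c3,c5,c15
        counts[cut[group_column - 1]] += 1        # Step 25: Group by c4, Count
    return sorted(counts.items())
```

Each returned pair holds a group value and the number of rows that carry it, sorted by group value.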