PHINCH - A framework for visualizing bio data

HELP

How to Use

Instructions

This visualization framework aims to address current bottlenecks in the analysis of large sequence datasets (rRNA amplicons, metagenomes), helping researchers analyze high-throughput datasets more efficiently. Phinch takes advantage of standard outputs from computational pipelines to bridge the gap between biological software (e.g. QIIME) and existing data visualization capabilities (harnessing the scalability of WebGL and HTML5 in a browser-based tool).

Phinch currently supports downstream analyses of .biom files (Biological Observation Matrix, a JSON-formatted file type typically used to represent marker gene OTUs or metagenomic data). All sample metadata and taxonomy/ontology information MUST be embedded in the .biom file before it is uploaded into Phinch.
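As a quick check before uploading, the sketch below loads a JSON-formatted .biom file with Python's standard library and lists any samples or observations that have no embedded metadata. It assumes the BIOM 1.0 JSON layout, in which "columns" lists samples and "rows" lists observations, each entry carrying an "id" and a "metadata" field. The function names are hypothetical and this is not part of Phinch or QIIME.

```python
import json


def missing_metadata(biom):
    """Return ids of samples and observations with empty or null metadata.

    `biom` is a dict parsed from a JSON-formatted .biom file
    (assumed BIOM 1.0 layout: "columns" = samples, "rows" = observations).
    """
    samples = [c["id"] for c in biom.get("columns", []) if not c.get("metadata")]
    observations = [r["id"] for r in biom.get("rows", []) if not r.get("metadata")]
    return {"samples": samples, "observations": observations}


def check_file(path):
    """Load a JSON .biom file from `path` and report missing metadata."""
    with open(path) as handle:
        return missing_metadata(json.load(handle))
```

If both lists come back empty, every sample and observation carries at least some metadata. This does not confirm that the metadata is formatted the way Phinch expects.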

In QIIME (version 1.7 or later), users can prepare the .biom file by executing the following commands:

First, construct an OTU table:

make_otu_table.py -i final_otu_map_mc2.txt -o otu_table_mc2_w_tax.biom -t rep_set_tax_assignments.txt

Here, the input file (-i) is your OTU map (defining clusters of raw sequence reads), and the taxonomy file (-t) contains the taxonomy or gene ontology strings that correspond to each OTU.

Second, add your sample metadata to your .biom file:

add_metadata.py -i otu_table_mc2_w_tax.biom -o otu_table_mc2_w_tax_and_metadata.biom -m sample_metadata_mapping_file.txt

Here, the input file (-i) is the .biom file from the previous step, and the mapping file (-m) is a tab-delimited file containing sample metadata (formatted according to these QIIME instructions).

After these two steps, you're ready to upload.

If you want to visualize biological data currently formatted as a tab-delimited text file (e.g. the style of OTU tables produced by older versions of QIIME), please refer to this documentation for conversion instructions. Phinch supports both "sparse" and "dense" BIOM formats (although sparse .biom files are highly recommended, since the file size is much smaller).
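To illustrate why sparse files are smaller, here is a minimal sketch of the two layouts, assuming the BIOM 1.0 JSON convention in which a dense "data" field is a list of rows and a sparse "data" field is a list of [row, column, value] triples for non-zero counts only. The function name is hypothetical and the counts are made up. It is not a replacement for the official conversion tools.

```python
def dense_to_sparse(dense):
    """Convert a dense count matrix (list of rows) into
    [row, column, value] triples, keeping only non-zero entries."""
    return [
        [i, j, value]
        for i, row in enumerate(dense)
        for j, value in enumerate(row)
        if value != 0
    ]


# Made-up 3 x 3 table: nine stored values when dense, two triples when sparse.
dense = [
    [0, 0, 5],
    [0, 3, 0],
    [0, 0, 0],
]
print(dense_to_sparse(dense))  # [[0, 2, 5], [1, 1, 3]]
```

OTU tables are usually mostly zeros, so the list of triples tends to be far shorter than the full matrix.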

Some important notes on metadata

To be properly detected, all date/time metadata must be formatted according to the MIxS standard (more information at the Genomic Standards Consortium wiki) and entered into one column in your original sample metadata mapping file, as follows:

[YYYY]-[MM]-[DD]T[hh]:[mm]:[ss]-[Z]
For example, metadata for a sample collected at 2:30pm EST on May 4, 2007 would be entered as: 2007-05-04T14:30:00-05:00
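Timestamps in this form can be generated rather than typed by hand. The sketch below uses Python's standard datetime module. The helper name and its parameters are hypothetical, and it assumes the UTC offset is a whole number of hours.

```python
from datetime import datetime, timedelta, timezone


def format_timestamp(year, month, day, hour, minute, second, utc_offset_hours):
    """Return a string of the form YYYY-MM-DDThh:mm:ss followed by the UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime(year, month, day, hour, minute, second, tzinfo=tz).isoformat()


# 2:30pm on May 4, 2007, five hours behind UTC
print(format_timestamp(2007, 5, 4, 14, 30, 0, -5))  # 2007-05-04T14:30:00-05:00
```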

Similarly, any geographic coordinates or GPS data must be entered as decimal degrees (the format used by Google Maps, e.g. -90.017926). We recommend using separate columns labeled “Latitude” and “Longitude” in your original sample metadata mapping file, to ensure that GPS metadata is correctly detected.
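If your coordinates were recorded as degrees, minutes and seconds, they need to be converted first. A minimal sketch of that conversion follows. The function name is hypothetical, and it assumes the usual convention that southern latitudes and western longitudes are negative.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter (N, S, E, W)
    to decimal degrees, rounded to six decimal places."""
    letter = hemisphere.upper()
    if letter not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = abs(degrees) + minutes / 60 + seconds / 3600
    # South and West are expressed as negative values.
    if letter in ("S", "W"):
        value = -value
    return round(value, 6)


print(dms_to_decimal(40, 30, 0, "N"))  # 40.5
print(dms_to_decimal(90, 15, 0, "W"))  # -90.25
```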

To label your samples with human-readable IDs in the visualizations, include a column in your metadata mapping file with the header “phinch”. These IDs will be pulled through into the visualizations to populate graph axes. If this column is not included, an arbitrary numerical ID will be assigned to each sample.
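Before running add_metadata.py, it can help to confirm that the mapping file contains the column headers mentioned in these notes. The sketch below reads the first line of a tab-delimited mapping file and reports which of “phinch”, “Latitude” and “Longitude” are absent. The function name and the example file contents are hypothetical, and it assumes the header is the first line and that names must match exactly, including case.

```python
import csv
import io


def check_mapping_header(mapping_text, expected=("phinch", "Latitude", "Longitude")):
    """Return the expected column names that are missing from the header line."""
    reader = csv.reader(io.StringIO(mapping_text), delimiter="\t")
    header = [name.strip() for name in next(reader)]
    return [name for name in expected if name not in header]


# Made-up mapping file contents, for illustration only.
example = (
    "#SampleID\tLatitude\tLongitude\tphinch\tDescription\n"
    "S1\t38.5\t-121.7\tSoil_A\tfirst sample\n"
)
print(check_mapping_header(example))  # []
```

An empty list means all three headers were found. A missing “phinch” column is not an error, since numerical IDs are then assigned automatically.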
