Dmitry Alexeev1,2, Tanya Bibikova3, Boris Kovarsky1, Damir Melnikov3, Alexander Tyakht1, Vadim Govorun1.
Abstract
BACKGROUND: Metagenomics is one of the most challenging tasks in genomic analysis today. Biomedical applications of metagenomics produce datasets containing hundreds to thousands of samples from various body sites of hundreds of patients. A metagenome is inherently far more complex than a single genome, because the abundance of the bacteria that compose it varies over time. Other levels of data complexity include the geography of the samples and the phylogenetic distance between genomes of the same operational taxonomic unit (OTU). We have developed a visualization concept for representing multilayer metagenomics data: the bacterial rose garden. The approach displays the taxonomic distance between representatives of the same OTU in different samples and can use a variety of metadata for display.
Keywords: Gut microbiota; Metagenomic data visualization; Phylogeny visualization; Rose garden
Year: 2015 PMID: 25815061 PMCID: PMC4374582 DOI: 10.1186/s13040-015-0045-5
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
Data used in the study
| Country | Source | Number of samples | Number of donors and | Sequencing platform | Reads metrics |
|---|---|---|---|---|---|
| USA | Human Microbiome Project | 138 | 50 (single samples), 41 (two samples), 2 (three samples) | Illumina | 101 bp, paired-end |
| Denmark | MetaHIT project | 85 | 85 | Illumina | 44 bp (13 samples), 75 bp (72 samples) |
| China | BGI-Shenzhen | 126 | 50 (type II diabetes), 70 (healthy), 6 (unknown) | Illumina | 75 bp |
| Russia | Metagenome.ru consortia | 162 | 116 (single sample), 2 (two technical repeats), 14 (two samples and one technical repeat) | SOLiD | 50 bp |
| Russia | Metagenome.ru consortia | 5 | 5 | Illumina | 100 bp |
Origin of the WGS metagenome data used in the prototype.
Figure 1. Basic data visualization techniques for microbiota studies. Examples of heatmap (a) and MDS (b, c) visualizations of sample distances.
Figure 2. Bacterial rose. The bacterial rose visualization principle.
Figure 3. Regional bacterial rose. a. Bacterial rose of a single OTU with all representatives of this OTU from all regions, showing the distances from the sample on each radius to all other samples. b. The same for a chosen region, i.e. only the distances from the samples on the radius to samples belonging to the chosen region are shown.
Figure 4. Rose garden. Part of the rose garden presenting the bacterial roses for the chosen OTUs.
Figure 5. Clustering and travelling rose. a. Eubacterium eligens shows distinct clustering of samples from China and Russia and intermixed samples from Europe and the USA. b. Dialister invisus as an example of travelling bacteria: within the circle whose radius is the distance to the closest Chinese sample there are many samples from Europe or the USA, so the calculated distances to those samples are smaller.
Figure 6. Two OTUs case. a. Single-sample view of the bacterial rose for Barnesiella intestinihominis, showing several samples much closer than the others. b. All-samples view showing the distribution of distances; two distinct groups can be identified, one with smaller phylogenetic distances and one with larger.
Figure 7. Outlier example. Unusual distance distribution for sample SRS014979 of the Bacteroides cellulosilyticus DSM_14838 OTU (top right corner outlier).
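The layout described in the Figure 3 caption (one radius per sample, with the distances to the other samples of the same OTU plotted along it, optionally restricted to a chosen region) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rose_points`, the even angular spacing of samples, and the toy distance matrix are all assumptions made for illustration only.

```python
import math

def rose_points(dist, regions, target_region=None):
    """Compute polar coordinates for one hypothetical bacterial rose.

    dist          -- square matrix of pairwise distances between samples of one OTU
    regions       -- region label for each sample, same order as dist
    target_region -- if given, keep only distances to samples from this region
                     (the regional view of Figure 3b)

    Returns a list of (angle, radius, source_index, target_index) tuples.
    """
    n = len(dist)
    points = []
    for i in range(n):
        # Assumption: samples are spread evenly around the circle, one ray each.
        angle = 2 * math.pi * i / n
        for j in range(n):
            if i == j:
                continue  # skip the zero distance of a sample to itself
            if target_region is not None and regions[j] != target_region:
                continue  # regional view: only distances to the chosen region
            points.append((angle, dist[i][j], i, j))
    return points

# Toy data (invented for illustration, not taken from the study).
dist = [
    [0.0, 0.1, 0.4],
    [0.1, 0.0, 0.3],
    [0.4, 0.3, 0.0],
]
regions = ["Russia", "Russia", "China"]

all_points = rose_points(dist, regions)                            # Figure 3a style
china_points = rose_points(dist, regions, target_region="China")   # Figure 3b style
```

The resulting tuples could be passed to any polar plotting routine; colouring each point by `regions[target_index]` would correspond to using sample metadata for display.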