Ilias Lagkouvardos, Divya Joseph, Martin Kapfhammer, Sabahattin Giritli, Matthias Horn, Dirk Haller, Thomas Clavel.
Abstract
The SRA (Sequence Read Archive) serves as the primary depository for massive amounts of Next Generation Sequencing data, and currently hosts over 100,000 16S rRNA gene amplicon-based microbial profiles from various host habitats and environments. This number is increasing rapidly and there is a dire need for approaches to utilize this pool of knowledge. Here we created IMNGS (Integrated Microbial Next Generation Sequencing), an innovative platform that uniformly and systematically screens for and processes all prokaryotic 16S rRNA gene amplicon datasets available in SRA and uses them to build sample-specific sequence databases and OTU-based profiles. Users can easily query this integrative sequence resource via a web interface. We show examples of how the approach allows testing the ecological importance of specific microorganisms in different hosts or ecosystems, and performing targeted diversity studies for selected taxonomic groups. The platform also offers a complete workflow for de novo analysis of users' own raw 16S rRNA gene amplicon datasets for the sake of comparison with existing data. IMNGS can be accessed at www.imngs.org.
Year: 2016 PMID: 27659943 PMCID: PMC5034312 DOI: 10.1038/srep33721
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Host range and relative sequence abundance of bacteria related to Acetatifactor muris based on sequence similarity search in IMNGS.
(A) Prevalence of A. muris-like sequences at different levels of similarity, expressed as the percentage of positive samples in each host (the number of samples included in the analysis is shown in parentheses). (B) Percentages of samples positive for the species A. muris (97% sequence similarity) at the indicated thresholds of relative abundance.
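The prevalence measure described in this caption (percentage of positive samples per host at a given relative-abundance threshold) can be illustrated with a minimal sketch. This is not the IMNGS implementation: the function name `prevalence_by_host`, the input layout, and the sample values are hypothetical, and the sketch assumes a sample counts as positive when its relative abundance is strictly greater than the threshold.

```python
from collections import defaultdict


def prevalence_by_host(samples, threshold=0.0):
    """Return the percentage of positive samples for each host.

    samples   -- iterable of (host, relative_abundance_percent) pairs,
                 one pair per sample (hypothetical input layout)
    threshold -- a sample is counted as positive when its relative
                 abundance is strictly greater than this value
                 (an assumption made for this illustration)
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for host, rel_abund in samples:
        totals[host] += 1
        if rel_abund > threshold:
            positives[host] += 1
    return {host: 100.0 * positives[host] / totals[host] for host in totals}


# Made-up values for illustration only; they are not results from the paper.
samples = [
    ("mouse", 0.5), ("mouse", 0.0), ("mouse", 1.2), ("mouse", 0.05),
    ("human", 0.0), ("human", 0.2),
]

for cutoff in (0.0, 0.1, 1.0):
    print(cutoff, prevalence_by_host(samples, threshold=cutoff))
```

Raising the cutoff can only keep or lower the percentage for a host, which matches how panel (B) reads across increasing abundance thresholds.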
Figure 2. Population structure and diversity of the bacterial phylum Poribacteria based on a taxonomic query in IMNGS.
(A) Successive clustering of Poribacteria sequences (n = 2,308) at different levels of sequence similarity (~3, 5 and 10%). Prominent molecular species and genera clusters, ranked by the number of samples they represented in the database (black bars), are written in bold and indicated by colored nodes. The star represents the candidate type species of Poribacteria originally investigated by single-cell genomics [28]. (B) Host specificity of sponge colonization by Poribacteria. Average contribution of each prominent molecular species of Poribacteria relative to the total number of sequences classified as Poribacteria in the different sponge species. The maximum relative sequence abundance of Poribacteria in the microbial profiles of each sponge is shown in parentheses (the rest corresponded to other bacteria).
Figure 3. Overview of the IMNGS system.
IMNGS can be separated into two components: the Build, where SRA files are retrieved and processed, and the Use, where users query and interact with data via the web front. The pipeline is fully automated and can run unsupervised, ensuring smooth integration of new data.