Gail L Rosen, Bahrad A Sokhansanj, Robi Polikar, Mary Ann Bruns, Jacob Russell, Elaine Garbarine, Steve Essinger, Non Yok.
Abstract
Traditionally, studies in microbial genomics have focused on single genomes from cultured species, limiting their scope to the small percentage of species that can be cultured outside their natural environment. Recent advances in high-throughput sequencing and computational analysis have ushered in the new field of metagenomics, which aims to decode the genomes of microbes from natural communities without the need for cultivation. Although metagenomic studies have provided a great deal of insight into bacterial diversity and coding capacity, several computational challenges remain because of the massive size and complexity of metagenomic sequence data. This paper reviews current tools and techniques that address challenges in 1) genomic fragment annotation, 2) phylogenetic reconstruction, 3) functional classification of samples, and 4) interpretation of complementary metaproteomics and metametabolomics data. It also surveys important applications of metagenomic studies, including microbial forensics and the roles of microbial communities in shaping human health and soil ecology.
Year: 2009 PMID: 20436876 PMCID: PMC2808676 DOI: 10.2174/138920209789208255
Source DB: PubMed Journal: Curr Genomics ISSN: 1389-2029 Impact factor: 2.236
| Features | Classifier | Published Method |
|---|---|---|
| Homology-based | Nearest-Neighbor | BLAST |
| Homology-based | Nearest-Neighbor & Last Common Ancestor | MEGAN |
| Composition-based | Naïve Bayesian | Sandberg et al. |
| Composition-based | Naïve Bayesian | RDP classifier (16S sequences only) |
| Composition-based | Naïve Bayesian | Rosen et al. |
| Composition-based | Support Vector Machines | PhyloPythia |


| Taxonomic level | BLAST accuracy | NBC accuracy |
|---|---|---|
| Strain (635 genome training data only) | 66% | 76% |
| Species (77 strains, 5-fold CV) | 89.2% ± 1.9% | 90.2% ± 1.2% |
| Genus (216 strains, 5-fold CV) | 86.0% ± 3.5% | 66.3% ± 6.3% |

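
The composition-based, naïve Bayesian approach listed in the tables above can be illustrated with a minimal sketch. It is not the implementation of any published method: the function names (`kmer_counts`, `train`, `classify`), the word length `k=3`, the add-one smoothing, the uniform prior over genomes, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping k-mers in a DNA sequence, skipping non-ACGT words."""
    seq = seq.upper()
    return Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )


def train(genomes, k):
    """Estimate log P(k-mer | genome) with add-one smoothing (an assumption)."""
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    model = {}
    for name, seq in genomes.items():
        counts = kmer_counts(seq, k)
        total = sum(counts.values()) + len(all_kmers)
        model[name] = {km: math.log((counts[km] + 1) / total) for km in all_kmers}
    return model


def classify(fragment, model, k):
    """Return the genome whose k-mer profile gives the fragment the highest
    log-likelihood, assuming a uniform prior over genomes."""
    frag_counts = kmer_counts(fragment, k)
    scores = {
        name: sum(n * logp[km] for km, n in frag_counts.items())
        for name, logp in model.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy "genomes"; real training data would be whole genomes.
toy_genomes = {
    "AT_rich": "ATATTAATTATAAATTTATATAATTAAT" * 20,
    "GC_rich": "GCGCCGGCGGCCCGCGGGCGCCGGCCGC" * 20,
}
model = train(toy_genomes, k=3)
label = classify("ATTATAATATTTAAT", model, k=3)
```

Treating k-mers as independent draws is what makes the classifier "naïve"; the sketch ignores details such as reverse complements and longer word lengths that a practical tool would likely need to address.
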

| Purpose | Tool | Algorithm | Access | Cost |
|---|---|---|---|---|
| Sequence Alignment | BLAST | Local alignment; similar to Smith-Waterman | Server; Executable | Free |
| Sequence Alignment | Clustal | Global alignment; distance matrix, neighbor-joining | Server; Executable | Free |
| Phylogeny Inference | MEGA | Graphical Clustal; parsimony, neighbor-joining, UPGMA | Executable | Free |
| Phylogeny Inference | PAUP* | Maximum parsimony | Executable | $100 |
| Phylogeny Inference | MrBayes | Bayesian inference | Executable | Free |
| Phylogeny Inference | Phylip | Parsimony, distance matrix, bootstrapping, maximum likelihood | Executable | Free |
| Phylogeny Inference | UniFrac | UniFrac distance metric; P-test | Server | Free |
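
The table above describes BLAST as local alignment "similar to Smith-Waterman". As a rough illustration of what local alignment computes, the following sketch scores two sequences with the Smith-Waterman recurrence. It is not BLAST itself, which relies on a faster heuristic search; the function name and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`, linear gap penalty) are assumptions chosen for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere, which is
            # what makes the alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Two sequences that share only the substring "ACGT".
score = smith_waterman_score("TTACGTTT", "GGACGTGG")
```

Only two rows of the matrix are kept because the sketch returns the score alone; recovering the aligned region would require storing the full matrix and tracing back from the best cell.
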