Robert A Edwards, Robert Olson, Terry Disz, Gordon D Pusch, Veronika Vonstein, Rick Stevens, Ross Overbeek.
Abstract
Annotation of metagenomes involves comparing the individual sequence reads with a database of known sequences and assigning a unique function to each read. This is a time-consuming task that is computationally intensive (though not computationally complex). Here we present a novel approach to annotating metagenomes using unique k-mer oligopeptide sequences from 7 to 12 amino acids long. We demonstrate that k-mer-based annotation is faster than blastx-based annotation and approaches its sensitivity and precision without losing accuracy. A last-common-ancestor approach was also developed to describe the members of the community.
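The k-mer lookup idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `signature_kmers` dictionary, the family names, the `min_hits` threshold, and the majority-vote rule are all assumptions made for illustration, and translating DNA reads into amino acid sequences is omitted.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of an amino acid sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def annotate(protein, signature_kmers, k=8, min_hits=2):
    """Return the family with the most signature k-mer hits, or None.

    signature_kmers is a hypothetical dict mapping a k-mer that occurs
    in only one protein family to the name of that family.
    """
    hits = Counter(
        signature_kmers[km] for km in kmers(protein, k) if km in signature_kmers
    )
    if not hits:
        return None
    family, count = hits.most_common(1)[0]
    # Require a minimum number of hits before assigning a function
    # (the threshold is an assumption of this sketch).
    return family if count >= min_hits else None


# Toy, made-up data: three 8-mers assigned to two invented families.
signature_kmers = {
    "MKTAYIAK": "FAM_A",
    "KTAYIAKQ": "FAM_A",
    "GGSLVPRG": "FAM_B",
}

result = annotate("MKTAYIAKQR", signature_kmers)  # two FAM_A hits
no_hit = annotate("AAAAAAAAAA", signature_kmers)  # no signature k-mers
```

Each lookup is a constant-time dictionary access, so the cost grows linearly with read length rather than with database size, which is one plausible reason an approach of this kind can be faster than an alignment-based search.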
Year: 2012 PMID: 23047562 PMCID: PMC3519453 DOI: 10.1093/bioinformatics/bts599
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Statistics of the k-mers and FIGfams
| k | Number of k-mers | Number of families | Mean number of k-mers per family | Median number of k-mers per family |
|---|---|---|---|---|
| 7 | 207 362 319 | 171 606 | 1208.4 | 110 |
| 8 | 639 234 488 | 173 332 | 3687.9 | 325 |
| 9 | 812 679 565 | 173 513 | 4683.7 | 404 |
| 10 | 866 382 763 | 173 561 | 4991.8 | 425 |
| 11 | 896 943 566 | 173 587 | 5167.1 | 434 |
| 12 | 921 081 710 | 173 606 | 5305.6 | 441 |
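As a quick consistency check on the table, the mean column appears to be the second column divided by the number of families. The short sketch below recomputes it from the values in the table; the variable names are illustrative only.

```python
# k -> (second column, number of families, reported mean), copied from the table
table = {
    7: (207_362_319, 171_606, 1208.4),
    8: (639_234_488, 173_332, 3687.9),
    9: (812_679_565, 173_513, 4683.7),
    10: (866_382_763, 173_561, 4991.8),
    11: (896_943_566, 173_587, 5167.1),
    12: (921_081_710, 173_606, 5305.6),
}

# Recompute the mean for each k and compare it with the reported value.
differences = {}
for k, (total, n_families, reported_mean) in table.items():
    computed_mean = total / n_families
    differences[k] = abs(computed_mean - reported_mean)
    print(f"k={k}: computed {computed_mean:.1f}, reported {reported_mean}")
```

Every recomputed mean agrees with the reported value to one decimal place. The medians are roughly an order of magnitude smaller than the means, which suggests that a small number of families contribute a large share of the k-mers.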