Viktoria Gontcharova, Eunseog Youn, Randall D Wolcott, Emily B Hollister, Terry J Gentry, Scot E Dowd.
Abstract
Existing chimera detection programs are not specifically designed for "next generation" sequence data. Technologies such as Roche 454 FLX and Titanium have been adapted in recent years, especially with the introduction of bacterial tag-encoded FLX/Titanium amplicon pyrosequencing methodologies, to produce over one million 250-600 bp 16S rRNA gene reads, which must be depleted of chimeras prior to downstream analysis. To meet the needs of basic scientists venturing into high-throughput microbial diversity studies, such as those based on pyrosequencing, and specifically to provide a solution for Windows users, the B2C2 software is designed to accept large files of multi-FASTA formatted sequences and screen them for possible chimeras in a high-throughput fashion. The graphical user interface (GUI) can also batch process multiple files. When compared with popular chimera screening software, B2C2 performed as well or better while dramatically decreasing the time required to generate and screen results. Even average computer users can interact with the Windows .Net GUI-based application and define the stringency of the analysis. B2C2 may be downloaded from http://www.researchandtesting.com/B2C2.
Keywords: Chimera; Windows; chimera detection; high throughput; next generation; pyrosequencing
Year: 2010 PMID: 21339894 PMCID: PMC3040993 DOI: 10.2174/1874285801004010047
Source DB: PubMed Journal: Open Microbiol J ISSN: 1874-2858
Results of Test Set Implementation on B2C2. The Software Classified Chimeras from Three Sets, Distant Chimeras, Medium Chimeras, and Close Chimeras, into Four Resultant Files: “NotChimera”, “PossibleChimera”, “DefiniteChimera”, and a “NotClassified” File
| Category | Distant Chimeras | Medium Chimeras | Close Chimeras | Not Chimeras |
|---|---|---|---|---|
| Problems | 0 | 0 | 0 | 0 |
| Not Chimeras | 1 | 11 | 35 | 89 |
| Maybe Chimeras | 1 | 11 | 4 | 3 |
| Chimeras | 98 | 78 | 61 | 8 |
| Total | 100 | 100 | 100 | 100 |
Summary of Results of Test Set Implementation on B2C2. The Software Classified Chimeric and Non-Chimeric Sequences. Three Possible Classification Categories Resulted: Chimera, Possible Chimera and Non-Chimera
| Classified as | Actual Chimeras | Actual Non-Chimeras |
|---|---|---|
| Chimera | 237 | 8 |
| Possible Chimera | 16 | 3 |
| Non-Chimera | 47 | 89 |
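
The summary counts follow from the per-set results by adding the three chimeric test sets together and keeping the non-chimeric set as its own column. The short Python sketch below reproduces that arithmetic for the B2C2 run. The names `b2c2_results`, `CHIMERIC_SETS`, and `summarize` are illustrative only and are not part of B2C2, and the sketch assumes the "Chimeras", "Maybe Chimeras", and "Not Chimeras" rows correspond to the Chimera, Possible Chimera, and Non-Chimera categories.

```python
# Per-set counts copied from the B2C2 results table (100 sequences per set).
# Keys and layout are illustrative, not a B2C2 file format.
b2c2_results = {
    "Chimera":          {"distant": 98, "medium": 78, "close": 61, "not_chimera": 8},
    "Possible Chimera": {"distant": 1,  "medium": 11, "close": 4,  "not_chimera": 3},
    "Non-Chimera":      {"distant": 1,  "medium": 11, "close": 35, "not_chimera": 89},
}

# The three test sets that contain only true chimeras.
CHIMERIC_SETS = ("distant", "medium", "close")


def summarize(results):
    """Collapse the three chimeric test sets into one 'actual chimera' column."""
    summary = {}
    for predicted, counts in results.items():
        summary[predicted] = {
            "actual_chimera": sum(counts[s] for s in CHIMERIC_SETS),
            "actual_non_chimera": counts["not_chimera"],
        }
    return summary


summary = summarize(b2c2_results)
for predicted, row in summary.items():
    print(predicted, row["actual_chimera"], row["actual_non_chimera"])
```

The same collapse applied to the other per-set tables yields their summary tables; for the Greengenes run, sequences in the Problems row are left out of the summary.
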
Results of Test Set Implementation on Chimera_Check for RDPII. The Software Classified Chimeras from Three Sets, Distant Chimeras, Medium Chimeras, and Close Chimeras, into Four Resultant Files: Not Chimeras, Maybe Chimeras, Definite Chimeras, and a Problem File
| Category | Distant Chimeras | Medium Chimeras | Close Chimeras | Not Chimeras |
|---|---|---|---|---|
| Problems | 0 | 0 | 0 | 0 |
| Not Chimeras | 0 | 2 | 5 | 54 |
| Maybe Chimeras | 0 | 2 | 20 | 20 |
| Chimeras | 100 | 96 | 75 | 26 |
| Total | 100 | 100 | 100 | 100 |
Summary of Results of Test Set Implementation on Chimera_Check for RDPII. The Software Classified Chimeric and Non-Chimeric Sequences. Three Possible Classification Categories Resulted: Chimera, Possible Chimera and Non-Chimera
| Classified as | Actual Chimeras | Actual Non-Chimeras |
|---|---|---|
| Chimera | 271 | 26 |
| Possible Chimera | 22 | 20 |
| Non-Chimera | 7 | 54 |
Results of Second Test Set (Composed of Longer Sequences) Implementation on the Greengenes Server. The Software Classified Chimeras from Three Sets, Distant Chimeras, Medium Chimeras, and Close Chimeras, into Four Resultant Files: Not Chimeras, Maybe Chimeras, Definite Chimeras, and a Problem File
| Category | Distant Chimeras | Medium Chimeras | Close Chimeras | Not Chimeras |
|---|---|---|---|---|
| Problems | 12 | 10 | 5 | 0 |
| Not Chimeras | 69 | 67 | 82 | 62 |
| Maybe Chimeras | 14 | 14 | 7 | 22 |
| Chimeras | 5 | 9 | 6 | 16 |
| Total | 100 | 100 | 100 | 100 |
Summary of Results of Second Test Set (Composed of Longer Sequences) Implementation on the Greengenes Server. The Software Classified Chimeric and Non-Chimeric Sequences. Three Possible Classification Categories Resulted: Chimera, Possible Chimera and Non-Chimera
| Classified as | Actual Chimeras | Actual Non-Chimeras |
|---|---|---|
| Chimera | 20 | 16 |
| Possible Chimera | 35 | 22 |
| Non-Chimera | 218 | 62 |
Results of Test Set Implementation on B2C2 Using Full Length Bacterial Sequences. The Software Classified Chimeras from Three Sets, Distant Chimeras, Medium Chimeras, and Close Chimeras, into Four Resultant Files: NotChimera, PossibleChimera, DefiniteChimera, and a NotClassified File
| Category | Distant Chimeras | Medium Chimeras | Close Chimeras | Not Chimeras |
|---|---|---|---|---|
| Problems | 0 | 0 | 0 | 0 |
| Not Chimeras | 0 | 9 | 29 | 88 |
| Maybe Chimeras | 1 | 17 | 7 | 7 |
| Chimeras | 99 | 74 | 64 | 5 |
| Total | 100 | 100 | 100 | 100 |
Summary of Results of Test Set Implementation on B2C2 Using Full Length Bacterial Sequences. The Software Classified Chimeric and Non-Chimeric Sequences. Three Possible Classification Categories Resulted: Chimera, Possible Chimera and Non-Chimera
| Classified as | Actual Chimeras | Actual Non-Chimeras |
|---|---|---|
| Chimera | 237 | 5 |
| Possible Chimera | 25 | 7 |
| Non-Chimera | 38 | 88 |
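
To put the four summary tables on a common footing, the sketch below computes two rates per run: the fraction of actual chimeras classified as Chimera, and the fraction of actual non-chimeras classified as Non-Chimera. Treating "Possible Chimera" as neither a hit nor a correct rejection is an assumption made here for illustration, not a rule stated in the source, and the names `summaries` and `rates` are hypothetical. Note that the Greengenes run used a second test set of longer sequences, and its chimera column sums to 273 rather than 300 because 27 sequences fell into the Problems category.

```python
# Each tuple holds counts classified as (Chimera, Possible Chimera, Non-Chimera),
# copied from the summary tables. Structure and names are illustrative only.
summaries = {
    "B2C2": {"chimeras": (237, 16, 47), "non_chimeras": (8, 3, 89)},
    "Chimera_Check (RDPII)": {"chimeras": (271, 22, 7), "non_chimeras": (26, 20, 54)},
    "Greengenes (second test set)": {"chimeras": (20, 35, 218), "non_chimeras": (16, 22, 62)},
    "B2C2 (full length)": {"chimeras": (237, 25, 38), "non_chimeras": (5, 7, 88)},
}


def rates(table):
    """Return (chimeras detected, non-chimeras retained) as fractions.

    Assumption: only a definite 'Chimera' call counts as a detection and only
    a 'Non-Chimera' call counts as correctly retaining a real sequence.
    """
    called_chimera = table["chimeras"][0]
    called_non_chimera = table["non_chimeras"][2]
    detected = called_chimera / sum(table["chimeras"])
    retained = called_non_chimera / sum(table["non_chimeras"])
    return detected, retained


for name, table in summaries.items():
    detected, retained = rates(table)
    print(f"{name}: chimeras detected {detected:.1%}, non-chimeras retained {retained:.1%}")
```

Under these assumptions B2C2 detects 237 of 300 chimeras (79%) and retains 89 of 100 non-chimeras, while Chimera_Check detects 271 of 300 chimeras but retains only 54 of 100 non-chimeras.
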