Zhengdong Zhang, George W Jackson, George E Fox, Richard C Willson.
Abstract
BACKGROUND: The public availability of over 180,000 bacterial 16S ribosomal RNA (rRNA) sequences has facilitated microbial identification and classification using hybridization and other molecular approaches. In their usual format, such assays are based on the presence of unique subsequences in the target RNA and require prior knowledge of which organisms are likely to be in a sample. They are thus limited in generality when analyzing an unknown sample.
Herein, we demonstrate the utility of catalogs of masses to characterize the bacterial 16S rRNA(s) in any sample. Sample nucleic acids are digested with a nuclease of known specificity and the products are characterized using mass spectrometry. The resulting catalogs of masses can subsequently be compared to the masses known to occur in previously sequenced 16S rRNAs, allowing organism identification. Alternatively, if the organism is not in the existing database, it will still be possible to determine its genetic affinity relative to the known organisms.
Year: 2006 PMID: 16524471 PMCID: PMC1488874 DOI: 10.1186/1471-2105-7-117
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Comparison between RNase T1 and RNase A 16S rRNA catalogs. See Supplementary Table for more details.
| Enzyme | Attributes of the oligoribonucleotide catalog† | | | | | |
| | Length Range | Total Oligos | Distinct Oligos | Distinct Masses | Avg. Oligo Seqs per 16S | Avg. Masses per 16S |
| RNase T1 | 1–54 | 898,494 | 8,601 | 858 | 128 | 77 |
| RNase A | 1–21 | 1,225,481 | 1,994 | 227 | 81 | 50 |
| RNase T1 | 1–54 | 898,494 | 8,601 | 2,404 | 128 | 159 |
| RNase A | 1–21 | 1,225,481 | 1,994 | 644 | 81 | 88 |
† The columns are:
Length Range: the minimum and maximum length of oligoribonucleotides in catalogs.
Total Oligos: the number of all oligoribonucleotides generated by complete RNase digestion.
Distinct Oligos: the number of different oligoribonucleotide sequences in the catalogs.
Distinct Masses: the number of different oligoribonucleotide masses in the catalogs.
Avg. Oligo Seqs per 16S: the average number of different oligoribonucleotide sequences generated by RNase digestion of a single 16S rRNA.
Avg. Masses per 16S: the average number of different oligoribonucleotide masses generated by RNase digestion of a single 16S rRNA.
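The catalog attributes defined above can be illustrated with a minimal Python sketch of an in silico digest. It assumes complete digestion, with RNase T1 cleaving on the 3' side of every G and RNase A cleaving on the 3' side of every C and U. The function names, the two toy sequences, and the use of base composition as a stand-in for mass (oligos with the same composition have the same mass) are illustrative assumptions; the sketch does not reproduce the numbers in the table.

```python
from collections import Counter

RNASE_T1 = "G"   # cleaves on the 3' side of G
RNASE_A = "CU"   # cleaves on the 3' side of pyrimidines (C and U)

def digest(seq, cut_after):
    """Completely digest seq, cutting after every base listed in cut_after."""
    oligos, start = [], 0
    for i, base in enumerate(seq):
        if base in cut_after:
            oligos.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        oligos.append(seq[start:])  # 3'-terminal fragment with no cut site
    return oligos

def catalog_stats(sequences, cut_after):
    """Summarize a digest catalog using the attribute names from the table."""
    all_oligos = [o for s in sequences for o in digest(s, cut_after)]
    distinct = set(all_oligos)
    # Composition is used as a proxy for mass (an assumption of this sketch).
    compositions = {tuple(sorted(Counter(o).items())) for o in distinct}
    lengths = [len(o) for o in all_oligos]
    per_seq = [len(set(digest(s, cut_after))) for s in sequences]
    return {
        "length_range": (min(lengths), max(lengths)),
        "total_oligos": len(all_oligos),
        "distinct_oligos": len(distinct),
        "distinct_compositions": len(compositions),
        "avg_oligo_seqs_per_16s": sum(per_seq) / len(sequences),
    }

# Hypothetical toy "16S" sequences, far shorter than real ones.
toy = ["AUGCCGAUUG", "GGAUCGAUG"]
stats = catalog_stats(toy, RNASE_T1)
print(stats)
```

Swapping `RNASE_T1` for `RNASE_A` in the call gives the corresponding RNase A catalog for the same toy sequences.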
Figure 1. Population distributions of 16S rRNA RNase digestion products of different lengths. Oligoribonucleotides were generated by RNase T1 and RNase A digestion of 16S rRNA from 1,921 organisms. Numbers of RNase T1 and RNase A fragments are shown in white and gray, respectively.
Figure 2. Comparison of the numbers of unique polyisotopic oligoribonucleotide masses actually occurring in digests of 16S rRNA versus possible masses. The numbers of unique polyisotopic oligoribonucleotide masses in the actual RNase T1 and RNase A catalogs are presented. The unique polyisotopic oligoribonucleotide masses in the theoretical sets are calculated from all possible RNase T1 and RNase A oligos, with consideration for the natural isotopic distribution. Only carbon and oxygen isotopes, and the resultant oligoribonucleotide masses above 50% of the maximum relative intensity, are considered. The counts of RNase T1-generated oligoribonucleotides longer than 26 nt are not shown.
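To give a feel for the size of the theoretical sets, the following sketch enumerates all possible internal digestion products of a given length and counts their distinct base compositions. It assumes complete digestion of unmodified RNA, so that an RNase T1 product is a run of A, C, and U ending in a single G, and an RNase A product is a run of A and G ending in a single C or U. It ignores isotopes entirely, so it is a simplification of the calculation described in the caption; the function names are illustrative.

```python
from itertools import product
from math import comb

def possible_oligos(length, interior, terminal):
    """Yield every sequence of `length` nt that ends in a cut-site base."""
    for body in product(interior, repeat=length - 1):
        for end in terminal:
            yield "".join(body) + end

def count_compositions(length, interior, terminal):
    """Count distinct base compositions (order ignored) among possible oligos."""
    return len({"".join(sorted(o))
                for o in possible_oligos(length, interior, terminal)})

for n in range(1, 7):
    t1_seqs = 3 ** (n - 1)                        # possible RNase T1 sequences
    t1_comps = count_compositions(n, "ACU", "G")
    a_seqs = 2 ** n                               # possible RNase A sequences
    a_comps = count_compositions(n, "AG", "CU")
    print(n, t1_seqs, t1_comps, a_seqs, a_comps)
```

Under these assumptions the number of compositions grows only polynomially with length, n(n+1)/2 for RNase T1 and 2n for RNase A, while the number of sequences grows exponentially.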
Examples of organisms that can be identified by the observation of one, two, or three RNase T1-generated 16S rRNA masses.
| I | II | III | |
| - | - | ||
| N | - | ||
| N | - | ||
| N | - | ||
| N | - | ||
| str. KYT0-F | N | - | |
| str. NCA 213 B ATCC 7949 | N | - | |
| str. 2023 ATCC 17851 | N | - | |
| str. Eklund 202 F ATCC 23387 | N | N | |
| str. Eklund 17B ATCC 25765 | N | - | |
| Langeland NCTC 10281 | N | - | |
| NCTC 7272 | N | - | |
| NCTC 7273 | N | N | |
| str. 468 toxin type C | N | - | |
| N | - | ||
| - | - | ||
| N | N | ||
| N | - | ||
| N | - | ||
| N | - | ||
| N | - |
A "Y" in column I indicates that a uniquely identifying RNase T1 oligoribonucleotide composition was found that was not present in any of the other 1,920 organisms sampled. A "Y" in column II or III indicates that two or three masses, respectively, were found that are uniquely identifying when taken together. An "N" indicates that the calculation did not yield a one-, two-, or three-peak signature. A "-" indicates that further determinations were not necessary.
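The search for one-, two-, or three-peak signatures can be sketched as a brute-force subset search. This is an illustrative sketch, not the authors' procedure: the function name, the organism labels, and the integer "masses" are hypothetical, and a signature is taken to be a set of masses from the target's catalog that does not occur together in any other organism's catalog.

```python
from itertools import combinations

def smallest_signature(target, catalogs, max_size=3):
    """Return the smallest set of masses (up to max_size) that occurs
    together only in the target organism, or None if there is none.

    catalogs maps an organism name to the set of masses in its digest.
    """
    own = sorted(catalogs[target])
    others = [masses for name, masses in catalogs.items() if name != target]
    for size in range(1, max_size + 1):
        for combo in combinations(own, size):
            # A combination identifies the target if no other organism
            # contains every mass in it.
            if not any(set(combo) <= other for other in others):
                return combo
    return None

# Hypothetical toy catalogs with made-up integer masses.
catalogs = {
    "org_a": {100, 200, 300},
    "org_b": {100, 200, 400},
    "org_c": {100, 300, 400},
    "org_d": {200, 300},
    "org_e": {100, 600},
}
for name in catalogs:
    print(name, smallest_signature(name, catalogs))
```

In this toy data `org_e` has a one-peak signature, `org_b` and `org_c` need two peaks, `org_a` needs three, and `org_d` has none within three peaks, which corresponds to a row of "N" entries.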
Simulation results for sample identification with four different RES and WIN settings.†
| | RES (sample mass resolution threshold, daltons) | |
| WIN (catalog mass selection threshold, daltons) | 1 | 5 |
| 1 | 89.29% | 74.83% |
| | 90.91% | 75.00% |
| | 91.55% | 76.43% |
| | 95.10% | 76.81% |
| | 95.80% | 77.27% |
| 3 | 97.90% | 97.73% |
| | 100.00% | 97.90% |
| | 100.00% | 100.00% |
| | 100.00% | 100.00% |
| | 100.00% | 100.00% |
† Bacteria in the virtual test sample are Acdp. spC1, M. sturniT, Mc. janrrnB, and Tmms. chrmg. Only the five bacteria identified with the highest fractional representations in the sample are listed.
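The roles of RES and WIN can be illustrated with a small sketch. The scoring scheme here is an assumption made for illustration and is not taken from the paper: sample masses closer together than RES are merged into one unresolved peak, a catalog mass counts as observed if it lies within WIN of a resolved peak, and an organism's fractional representation is the fraction of its catalog masses that are observed. All names and masses are hypothetical.

```python
def resolve_peaks(masses, res):
    """Merge sample masses closer than `res` Da into one peak (group mean)."""
    peaks, group = [], []
    for m in sorted(masses):
        if group and m - group[-1] >= res:
            peaks.append(sum(group) / len(group))
            group = []
        group.append(m)
    if group:
        peaks.append(sum(group) / len(group))
    return peaks

def fractional_representation(catalog_masses, peaks, win):
    """Fraction of an organism's catalog masses within `win` Da of a peak."""
    hits = sum(any(abs(m - p) <= win for p in peaks) for m in catalog_masses)
    return hits / len(catalog_masses)

def rank_organisms(catalogs, sample_masses, res, win, top=5):
    """Rank organisms by fractional representation, highest first."""
    peaks = resolve_peaks(sample_masses, res)
    scores = {name: fractional_representation(ms, peaks, win)
              for name, ms in catalogs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Hypothetical catalogs and a virtual sample containing only organism "x".
catalogs = {"x": [1000.0, 1500.0, 2000.0], "y": [1000.0, 1750.0, 2500.0]}
ranked = rank_organisms(catalogs, catalogs["x"], res=1, win=1)
print(ranked)
```

In this toy scheme, a coarse RES shifts merged peaks away from the true masses, and a wider WIN recovers some of the matches that would otherwise be lost.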
* True positive identification.
Figure 3. Identification rank and representation of truncated 16S rRNA from Nitrosospira sp. The 16S rRNA of Nitrosospira sp. strain L115 was truncated from its 5' end 10 nt at a time, 89 times. Identification of each truncated 16S rRNA was simulated with RES = 5, WIN = 3, and isotopic masses in the RNase T1 catalog. The solid line is the identification rank, with the primary y-axis on the left. Rank 0 means that one of the three Nitrosospira (Nss. multi5, Nss. spT7, and Nss. spD11) in the catalog is ranked as the bacterium with the highest fractional representation in the sample. Rank -1 means that one of the three Nitrosospira in the catalog is ranked as the second most represented bacterium in the sample, and so on. The dashed line is the fractional representation of the highest-ranked Nitrosospira in the catalog, with the secondary y-axis on the right.
Figure 4. Identification rank and representation of truncated 16S rRNA from Nitrosospira sp. The 16S rRNA of Nitrosospira sp. strain L115 was truncated from its 3' end 10 nt at a time, 89 times. As in Figure 3, identification of each truncated 16S rRNA was simulated with RES = 5, WIN = 3, and isotopic masses in the RNase T1 catalog. Axes are identical to those in Figure 3.
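The truncation series used in Figures 3 and 4 can be generated as in the sketch below, which removes 10 nt at a time from either end, 89 times. The function name and the 1,500 nt placeholder sequence are hypothetical; the real input would be the 16S rRNA sequence of Nitrosospira sp. strain L115, and each truncated sequence would then be digested and identified.

```python
def truncations(seq, step=10, times=89, end="5'"):
    """Return successively truncated copies of seq.

    The k-th copy has k * step nucleotides removed from the chosen end.
    Truncation stops early if the sequence would be used up.
    """
    out = []
    for k in range(1, times + 1):
        cut = k * step
        if cut >= len(seq):
            break
        out.append(seq[cut:] if end == "5'" else seq[:-cut])
    return out

# Hypothetical placeholder sequence of roughly 16S rRNA length.
seq = "ACGU" * 375
from_5p = truncations(seq, end="5'")
from_3p = truncations(seq, end="3'")
print(len(from_5p), len(from_5p[0]), len(from_5p[-1]))
```

With a 1,500 nt input, the final truncation in each series retains 610 nt, so more than half of the molecule has been removed by the last step.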