Kerstin Locher1,2, Corrie R Belanger1, Eric Eckbo1,2, Melissa Caza1, Billie Velapatino1, Marthe K Charles1,2.
Abstract
Sanger sequencing of the 16S rRNA gene is routinely used for the identification of bacterial isolates. However, this method is still performed mostly in more-specialized reference laboratories, and traditional protocols can be labor intensive. In this study, 99 clinical bacterial isolates were used to validate a fast, simplified, and largely automated protocol for 16S sequencing. The workflow combines real-time PCR of the first 500 bp of the bacterial 16S rRNA gene and amplicon sequencing on an automated, cartridge-based sequence analyzer. Sequence analysis, NCBI BLAST search, and result interpretation were performed using an automated R-based script.

The automated workflow and R analysis described here produced results equal to those of manual sequence analysis. Of the 96 sequences with adequate quality, 90 were concordantly identified to the genus (n = 62) or species level (n = 28) compared with routine laboratory identification of the organism. One organism identification was discordant, and 5 resulted in an inconclusive identification. For sequences that gave a valid result, the overall accuracy of identification to at least the genus level was 98.9%. This simplified sequencing protocol provides a standardized approach to clinical 16S sequencing, analysis, and quality control that would be suited to frontline clinical microbiology laboratories with minimal experience.

IMPORTANCE Sanger sequencing of the 16S rRNA gene is widely used as a diagnostic tool for bacterial identification, especially in cases where routine diagnostic methods fail to provide an identification, for organisms that are difficult to culture, or from specimens where cultures remain negative. Our simplified protocol is tailored toward use in frontline laboratories with little to no experience with sequencing. It provides a highly automated workflow that can deliver fast results with little hands-on time. Implementing 16S sequencing in-house saves additional time that is otherwise required to send out isolates/specimens for identification to reference laboratories. This makes results available much faster to physicians who can in turn initiate or adjust patient treatment accordingly.
Keywords: 16S RNA; 16S Sanger sequencing; R script; automated workflow; automation; bacterial identification
Year: 2022 PMID: 35404089 PMCID: PMC9045293 DOI: 10.1128/spectrum.00408-22
Source DB: PubMed Journal: Microbiol Spectr ISSN: 2165-0497
Time requirements for 16S sequencing workflow using the SeqStudio genetic analyzer
| Step | Total time | Hands-on time |
|---|---|---|
| A. Crude DNA extraction | 20 min | 5 min |
| B. Real-time PCR | ∼1–1.5 h | 15 min |
| C. Sequencing reaction setup | ∼3.5 h | ∼35 min |
| D. Amplicon sequencing on analyzer | 50 min | ∼5 min |
| E1. Sequence analysis using automated script | ∼15 min | ∼10 min |
| E2. Sequence analysis using manual script | ∼30 min | ∼30 min |
Time required for 4 reactions; each additional 4 reactions add ∼40 min.
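The per-step times in the table can be added up to estimate the end-to-end turnaround for one run that uses the automated script (steps A to D plus E1). The following is a minimal sketch that only sums the values listed above; the variable names are illustrative, the approximate (∼) values are treated as exact, and the extra time for additional batches of reactions is not modeled.

```python
# Per-step times in minutes, copied from the table above.
# Each entry: (total_low, total_high, hands_on). Step B is given as a range (1 to 1.5 h).
steps_automated = {
    "A. Crude DNA extraction": (20, 20, 5),
    "B. Real-time PCR": (60, 90, 15),
    "C. Sequencing reaction setup": (210, 210, 35),
    "D. Amplicon sequencing on analyzer": (50, 50, 5),
    "E1. Sequence analysis using automated script": (15, 15, 10),
}

total_low = sum(low for low, _, _ in steps_automated.values())
total_high = sum(high for _, high, _ in steps_automated.values())
hands_on = sum(h for _, _, h in steps_automated.values())

print(f"Total time: {total_low}-{total_high} min")
print(f"Hands-on time: {hands_on} min")
```

Under these assumptions the hands-on portion is a small fraction of the total run time, which is the point the table is making.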
Comparison of manual and automated R sequence analysis quality
| Sequence type | Manual analysis: >440 bp | Manual analysis: 400–440 bp | Manual analysis: <400 bp | R analysis: >440 bp | R analysis: 400–440 bp | R analysis: <400 bp |
|---|---|---|---|---|---|---|
| Consensus sequence | 83 | 5 | 0 | 86 | 6 | 0 |
| Single-read sequence | 3 | 2 | 5 | 3 | 1 | 2 |
Consensus sequence, generated from trimmed sequences.
After trimming.
Sequence failed QC.
Overview of 16S sequencing results comparing manual analysis to automated analysis using R
| Characteristic | Manual sequence analysis | R analysis |
|---|---|---|
| Initial sequence QC met | 93 | 96 |
| Excluded from analysis (poor sequence QC) | 5 | 2 |
| Identification concordant to genus or group level | 57 | 62 |
| Identification concordant to species level | 28 | 28 |
| Identification discordant with reference result | 1 | 1 |
| Identification inconclusive | 7 | 5 |
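The 98.9% accuracy reported in the abstract can be reproduced from the R analysis column of this table, assuming that a "valid result" means a concordant or discordant identification (inconclusive and excluded sequences are left out). The sketch below makes that assumption explicit; the function name is illustrative, and the figure for manual analysis is derived here from the table counts rather than quoted from the source.

```python
def genus_level_accuracy(concordant_genus: int, concordant_species: int,
                         discordant: int) -> float:
    """Percent of valid results identified correctly to at least genus level.

    Assumption: valid results = concordant + discordant identifications;
    inconclusive and excluded sequences are not counted.
    """
    concordant = concordant_genus + concordant_species
    valid = concordant + discordant
    return 100.0 * concordant / valid

# Counts taken from the table above.
r_accuracy = genus_level_accuracy(62, 28, 1)       # 90 of 91 valid results
manual_accuracy = genus_level_accuracy(57, 28, 1)  # 85 of 86 valid results

print(f"R analysis: {r_accuracy:.1f}%")
print(f"Manual analysis: {manual_accuracy:.1f}%")
```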
Overview of results with different final ID resolution between manual and R analysis
| Reference method ID | Manual BLAST: 16S result | Manual BLAST: aligned (bp) | Manual BLAST: query cover | Manual BLAST: % id | Manual BLAST: % distance | Automated R: 16S result | Automated R: aligned (bp) | Automated R: query cover | Automated R: % id | Automated R: % distance |
|---|---|---|---|---|---|---|---|---|---|---|
|  | Inconclusive | 440 | 100 | 95.7 | 2.0 |  | 509 | 99 | 99.5 | 1.5 |
|  | Inconclusive | 457 | 97 | 98.0 | 0.9 |  | 454 | 99 | 98.7 | 2.0 |
|  | Inconclusive | 478 | 97 | 99.8 | 0.9 |  | 487 | 99 | 98.4 | 0.2 |
|  | Inconclusive | 478 | 97 | 99.6 | 0 |  | 487 | 99 | 98.8 | 0 |
|  | Inconclusive | 478 | 99 | 96.9 | 1.7 |  | 496 | 100 | 97.8 | 3.0 |
|  | Inconclusive | 462 | 96 | 98.9 | 2.1 |  | 488 | 99 | 99.0 | 0.7 |
|  |  | 456 | 98 | 99.6 | 5.0 |  | 479 | 99 | 98.5 | 5.0 |
|  |  | 450 | 100 | 99.8 | 3.4 | Inconclusive | 427 | 98 | 98.9 | 0.5 |
|  |  | 482 | 98 | 99.6 | 2.2 |  | 494 | 99 | 98.5 | 2.3 |
|  |  | 429 | 98 | 99.8 | 1.6 |  | 478 | 100 | 99.4 | 2.0 |
|  |  | 412 | 100 | 99.3 | 0.2 | Inconclusive | 395 | 100 | 100 | 0.3 |
|  |  | 466 | 98 | 98.7 | NA | Inconclusive | 476 | 98 | 96.2 | NA |
|  | Excluded | 399 | 100 | 99.8 | 1.4 |  | 400 | 100 | 99.5 | 1.5 |
|  | Excluded | 381 | 99 | 99.0 | 0.3 |  | 400 | 100 | 98.9 | 2.9 |
To next species.
16S quality control parameters for identification of bacterial pathogens
| Parameter | Final ID: to species | Final ID: to genus | Final ID: to genus | Final ID: inconclusive | Final ID: inconclusive |
|---|---|---|---|---|---|
| Distance to next species | ≥0.8% | <0.8% | NA | NA | NA |
| % identity to reference sequence | ≥99% | ≥99% | 97–99% | <97% | NA |
| Query cover | ≥98% | ≥98% | ≥98% | ≥98% | <98% |
| Aligned query length | >440 bp | >440 bp | >440 bp | >440 bp | >440 bp |
In NCBI BLAST database.
If aligned query length was between 400 and 440 base pairs, sequences were identified to genus if query cover was ≥98% and % identity was ≥99%.
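The quality control parameters above amount to a small decision procedure. The following is a minimal Python sketch of that logic, not the authors' R script: the function name `classify_16s` and its arguments are hypothetical, and three points are assumptions that the table does not spell out. These are (1) aligned lengths below 400 bp are returned as inconclusive, (2) sequences of 400 to 440 bp that miss the genus criteria are inconclusive, and (3) a missing distance value (NA) with ≥99% identity falls back to genus.

```python
from typing import Optional


def classify_16s(aligned_bp: int, query_cover: float, identity: float,
                 distance: Optional[float] = None) -> str:
    """Return 'species', 'genus', or 'inconclusive' for one BLAST top hit.

    query_cover, identity, and distance (to the next species) are percentages.
    """
    if aligned_bp < 400:
        return "inconclusive"  # assumption: too short to interpret
    if query_cover < 98.0:
        return "inconclusive"  # last column of the table
    if aligned_bp <= 440:
        # Footnote rule for 400-440 bp: genus only at >=99% identity.
        return "genus" if identity >= 99.0 else "inconclusive"
    if identity < 97.0:
        return "inconclusive"
    if identity < 99.0:
        return "genus"         # 97-99% identity
    if distance is not None and distance >= 0.8:
        return "species"       # >=99% identity and >=0.8% to next species
    return "genus"             # <0.8% distance, or distance NA (assumption)


# Example inputs taken from rows of the results table above.
print(classify_16s(478, 100, 99.4, 2.0))  # species
print(classify_16s(478, 97, 99.8, 0.9))   # inconclusive (query cover <98%)
print(classify_16s(427, 98, 98.9, 0.5))   # inconclusive (400-440 bp, <99% id)
```

Checking the query cover before the identity thresholds mirrors the table, where a query cover below 98% leads to an inconclusive result regardless of the other parameters.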