Shantelle Claassen-Weitz, Sugnet Gardner-Lubbe, Kilaza S Mwaikono, Elloise du Toit, Heather J Zar, Mark P Nicol.
Abstract
BACKGROUND: Careful consideration of experimental artefacts is required in order to successfully apply high-throughput 16S ribosomal ribonucleic acid (rRNA) gene sequencing technology. Here we introduce experimental design, quality control and "denoising" approaches for sequencing low biomass specimens.
Keywords: 16S rRNA gene; Bacteriome; Contamination; High-throughput sequencing; Low biomass; Mock controls; Negative controls; Optimization; Reproducibility; Respiratory
Year: 2020 PMID: 32397992 PMCID: PMC7218582 DOI: 10.1186/s12866-020-01795-7
Source DB: PubMed Journal: BMC Microbiol ISSN: 1471-2180 Impact factor: 3.605
Fig. 1 Representation of (a) expected and (b) actual sequencing profiles from no template controls, low biomass and high biomass specimens following 16S rRNA gene sequencing. (a) Expected 16S rRNA gene sequencing profiles from (i) no template controls (NTCs), (ii) low biomass and (iii) high biomass biological specimens, which correspond with their endogenous bacterial composition. (b) Actual 16S rRNA gene sequencing profiles: (i) profiles generated from NTCs may comprise reagent and laboratory contaminants as well as exogenous sequences from low and high biomass specimens (well-to-well contamination); (ii) low biomass biological specimen profiles may be overrepresented by exogenous profiles from both NTCs (reagent and laboratory contaminants) and high biomass specimens (well-to-well contamination); whilst (iii) high biomass profiles are expected to be least affected by reagent and laboratory contaminants present in NTCs and by cross-contamination from low biomass specimens
Reference guide to DNA extraction kits, storage buffers/no template controls, bacterial mock communities, technical repeats and decontamination approaches
Extraction and sequencing controls included in this study
| Control type | Control subtype | Control name | Composition | Source | DNA extraction performed in our laboratory | DNA extraction kit | Extraction replicates | Sequencing replicates | Total included for sequencing |
|---|---|---|---|---|---|---|---|---|---|
| Extraction controls | Bacterial mock communities | Zymobiomics-Primestore-high | 900 μl of Zymobiomics-Cells suspended in 3600 μl Primestore | Zymo Research Corp., Irvine, CA, United States & Longhorn Vaccines & Diagnostics, Bethesda, MD, USA | ✓ | Kit-QS, Kit-ZB | 3 per DNA extraction kit | – | 6 |
| | | Zymobiomics-STGG-high | 900 μl of Zymobiomics-Cells suspended in 3600 μl STGG | Zymo Research Corp., Irvine, CA, United States & National Health Laboratory Services, Cape Town, South Africa | ✓ | Kit-QS, Kit-ZB | 3 per DNA extraction kit | – | 6 |
| | | Zymobiomics-Primestore-low | 1-in-10⁴ fold dilution of Zymobiomics-Primestore-high | Zymo Research Corp., Irvine, CA, United States & Longhorn Vaccines & Diagnostics, Bethesda, MD, USA | ✓ | Kit-QS, Kit-ZB | 3 per DNA extraction kit | – | 6 |
| | | Zymobiomics-STGG-low | 1-in-10⁴ fold dilution of Zymobiomics-STGG-high | Zymo Research Corp., Irvine, CA, United States & National Health Laboratory Services, Cape Town, South Africa | ✓ | Kit-QS, Kit-ZB | 3 per DNA extraction kit | – | 6 |
| | NTCs | Primestore | Storage buffer Primestore | Longhorn Vaccines & Diagnostics, Bethesda, MD, USA | ✓ | Kit-QS, Kit-ZB | 3 per DNA extraction kit | – | 6 |
| | | STGG | Storage buffer STGG | National Health Laboratory Services, Cape Town, South Africa | ✓ | Kit-QS, Kit-ZB | 3 per DNA extraction kit | – | 6 |
| Sequencing controls | Bacterial mock community DNA | BEI-DNA | 1-in-10 fold dilution of HM-783D in Milli-Q® ultrapure water | BEI Resources, NIAID, NIH as part of the Human Microbiome Project, Manassas, VA, USA & MilliporeSigma, Burlington, MA, USA | x | Not specified | – | – | 3 |
| | | Zymobiomics-DNA | 1-in-10 fold dilution of ZymoBIOMICS™ Microbial Community DNA Standard in Milli-Q® ultrapure water | Zymo Research Corp., Irvine, CA, United States & MilliporeSigma, Burlington, MA, USA | x | Not specified | – | – | 8 |
| | Technical repeats | Within-run repeats | NP and IS specimens stored in Primestore | Collected from infants enrolled in the DCHS [ | ✓ | Kit-QS | – | 2 per specimen | 86 |
| | | Between-run repeats | NP and IS specimens stored in Primestore | Collected from infants enrolled in the DCHS [ | ✓ | Kit-QS | – | 2, 3 or 4 per specimen | 123 |
| | NTCs | Primestore | Storage buffer Primestore | Longhorn Vaccines & Diagnostics, Bethesda, MD, USA | ✓ | Kit-QS | – | – | 35 |
Zymobiomics-Cells: ZymoBIOMICS™ Microbial Community Standard bacterial cells (catalogue no. D6300, Zymo Research Corp., Irvine, CA, United States); Kit-QS: DSP Virus/Pathogen Mini Kit® using QIAsymphony® SP instrument (catalogue no. 937036, Qiagen GmbH, Hilden, Germany); Kit-ZB: ZymoBIOMICS DNA Miniprep Kit (catalogue no. ZR D4300, Zymo Research Corp., Irvine, CA, United States); NTCs: no template controls; Primestore: PrimeStore® Molecular Transport medium (Longhorn Vaccines & Diagnostics, Bethesda, MD, USA); STGG: transport medium containing skim milk, tryptone, glucose and glycerol; NP: nasopharyngeal swabs; IS: induced sputum; DCHS: Drakenstein Child Health Study
Quantity and quality of DNA extracted using two DNA extraction methods
| Control | DNA extraction kit | Buffer | Replicate | 16S rRNA gene copy numbers (copies/ml of specimen input volume) | 260/280 ratio (NanoDrop® ND-1000) |
|---|---|---|---|---|---|
| Zymobiomics-Primestore-high | Kit-QS | Primestore | 1 | 2.47E9 | 1.68 |
| | | | 2 | 2.06E9 | 1.75 |
| | | | 3 | 1.92E9 | 1.90 |
| | Kit-ZB | Primestore | 1 | 2.14E9 | 2.09 |
| | | | 2 | 1.99E9 | 1.34 |
| | | | 3 | 1.61E9 | 1.19 |
| Zymobiomics-Primestore-low | Kit-QS | Primestore | 1 | 3.77E3 | – |
| | | | 2 | 5.82E3 | – |
| | | | 3 | 5.37E3 | – |
| | Kit-ZB | Primestore | 1 | 2.08E5 | – |
| | | | 2 | 2.72E5 | – |
| | | | 3 | 3.43E5 | – |
| Zymobiomics-STGG-high | Kit-QS | STGG | 1 | 7.52E8 | 1.96 |
| | | | 2 | 7.32E8 | 1.89 |
| | | | 3 | 5.58E8 | 1.93 |
| | Kit-ZB | STGG | 1 | 3.09E9 | 2.53 |
| | | | 2 | 1.89E9 | 1.34 |
| | | | 3 | 1.73E9 | 1.23 |
| Zymobiomics-STGG-low | Kit-QS | STGG | 1 | 2.79E5 | – |
| | | | 2 | 3.58E5 | – |
| | | | 3 | 5.10E5 | – |
| | Kit-ZB | STGG | 1 | 1.89E5 | – |
| | | | 2 | 2.85E5 | – |
| | | | 3 | 2.76E5 | – |
Zymobiomics-Primestore-high: 900 μl of Zymobiomics-Cells suspended in 3600 μl Primestore; Zymobiomics-Primestore-low: 1-in-10⁴ fold dilution of Zymobiomics-Primestore-high; Zymobiomics-STGG-high: 900 μl of Zymobiomics-Cells suspended in 3600 μl STGG; Zymobiomics-STGG-low: 1-in-10⁴ fold dilution of Zymobiomics-STGG-high; Kit-QS: DSP Virus/Pathogen Mini Kit® using QIAsymphony® SP instrument (catalogue no. 937036, Qiagen GmbH, Hilden, Germany); Kit-ZB: ZymoBIOMICS DNA Miniprep Kit (catalogue no. ZR D4300, Zymo Research Corp., Irvine, CA, United States); Primestore: PrimeStore® Molecular Transport medium (Longhorn Vaccines & Diagnostics, Bethesda, MD, USA); STGG: storage medium containing skim milk, tryptone, glucose and glycerol
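As a rough check on the table above, the drop in measured 16S rRNA gene copies between a high biomass control and its diluted counterpart can be compared with the nominal dilution. The sketch below uses the Kit-QS Primestore replicates from the table and assumes the dilution in the footnote is 1-in-10⁴; the function name `observed_fold_reduction` is illustrative only.

```python
from statistics import mean

# 16S rRNA gene copies/ml, Kit-QS extractions in Primestore (values from the table).
high_biomass = [2.47e9, 2.06e9, 1.92e9]  # Zymobiomics-Primestore-high
low_biomass = [3.77e3, 5.82e3, 5.37e3]   # Zymobiomics-Primestore-low

# Assumption: the footnote's dilution is read as 1-in-10^4.
NOMINAL_DILUTION = 1e4


def observed_fold_reduction(high_copies, low_copies):
    """Ratio of the mean copy number in high vs low biomass replicates."""
    return mean(high_copies) / mean(low_copies)


fold = observed_fold_reduction(high_biomass, low_biomass)
print(f"observed fold reduction: {fold:.2e}")
print(f"nominal dilution:        {NOMINAL_DILUTION:.0e}")
```

The same function can be applied to the other kit and buffer combinations by swapping in their replicate values.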
Fig. 2 16S rRNA gene bacterial profiles are reflective of specimen biomass and are further influenced by DNA extraction methods and storage buffers. (a) Differences in beta diversities (calculated at OTU-level) measured from all bacterial mock community controls and no template controls (NTCs). (b) Differences in beta diversities measured from bacterial mock community controls and NTCs generated using Primestore storage buffer. (c) Differences in beta diversities measured from bacterial mock community controls and NTCs generated using STGG storage buffer. The proportion of variance captured by each coordinate analysis axis is shown in the bottom left corner of each panel. Blue and red colours represent DNA extraction methods Kit-QS and Kit-ZB, respectively. Shades of chartreuse filled circles represent bacterial mock communities generated using Primestore storage buffer (solid-filled chartreuse circles: high biomass bacterial mock communities; pattern-filled chartreuse circles: low biomass bacterial mock communities). Shades of emerald filled circles represent bacterial mock communities generated using STGG storage buffer (solid-filled emerald circles: high biomass bacterial mock communities; pattern-filled emerald circles: low biomass bacterial mock communities). Dark green filled circles represent Zymobiomics-DNA. Chartreuse and emerald pattern-filled squares represent Primestore and STGG NTCs, respectively
Fig. 3 Proportions of operational taxonomic units (OTUs) in four bacterial mock communities using two DNA extraction methods, with triplicate testing. A gradient scale is used to represent the proportions of the 100 most abundant OTUs detected across the bacterial mock community controls. OTU and genus-level classifications are provided on the left and right side of the figure, respectively. OTUs expected in each of the four bacterial mock communities are shown using green squares. Red squares denote OTUs not expected in the four bacterial mock communities (“background OTUs”). The bacterial mock communities (Primestore versus STGG), their biomass (high versus low) and the DNA extraction methods used are denoted at the top of each heatmap
Fig. 4 Participant age at specimen collection, read counts and alpha diversity relative to specimen biomass for no template controls (Primestore, n = 35) and technical repeats (n = 209). Scatter plots of (a) participant age at specimen collection, (b) read counts following bioinformatic processing and (c) Shannon diversity indices (alpha diversity) at OTU-level, each in relation to specimen biomass (16S rRNA gene copies/μl) plotted on a logₑ scale. The vertical pink shaded area highlights 16S rRNA gene copies/μl < 500
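For readers unfamiliar with the alpha diversity measure named in this caption, the following is a minimal sketch of the Shannon diversity index computed from OTU read counts. It uses the natural logarithm, which is the usual convention but an assumption here, and the function name is illustrative rather than taken from the study's pipeline.

```python
import math


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(otu_counts)
    if total == 0:
        return 0.0  # no reads: diversity is reported as zero in this sketch
    h = 0.0
    for count in otu_counts:
        if count > 0:
            p = count / total
            h -= p * math.log(p)
    return h


# An even community of four OTUs reaches the maximum for four taxa, ln(4).
even = shannon_index([25, 25, 25, 25])
# A profile dominated by one OTU has much lower diversity.
skewed = shannon_index([97, 1, 1, 1])
print(even, skewed)
```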
Fig. 5 Logarithm of ratio-transformed data (log-ratio) biplots in relation to participant age at specimen collection, 16S rRNA gene copies/μl and read counts following bioinformatic processing. Data points are coloured according to (a) participant age at specimen collection (in days), (b) 16S rRNA gene copies/μl and (c) read counts available for downstream analyses. Technical repeats (n = 209) are represented using filled circles. No template controls (Primestore, n = 35) are represented using filled triangles
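The caption does not state which log-ratio transformation underlies the biplots. As an illustration only, the sketch below applies the centred log-ratio (clr), one common choice for compositional data, to a vector of OTU counts. The pseudocount of 0.5 used to avoid taking the logarithm of zero is an assumption, as is the function name.

```python
import math


def centred_log_ratio(otu_counts, pseudocount=0.5):
    """clr: log of each part minus the mean log across all parts.

    The pseudocount (an assumption of this sketch) is added to every count
    so that OTUs with zero reads do not produce log(0).
    """
    logs = [math.log(count + pseudocount) for count in otu_counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


transformed = centred_log_ratio([120, 0, 30, 850])
print(transformed)
```

By construction the clr values of a specimen sum to zero, which is a quick sanity check on any implementation.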
Fig. 6 Bacterial composition in no template controls (Primestore, n = 35) and low biomass technical repeats (n = 209). (a) Dendrogram representing unsupervised hierarchical clustering; distances are based on Bray-Curtis dissimilarity indices calculated at OTU-level. The dendrogram is colour-coded based on specimen type (Primestore: darkturquoise; technical repeats: deeppink). (b) Differences between Primestore and technical repeats are shown at genus-level, with colour-codes representing phylum-level classification (shades of blue: Proteobacteria; shades of red: Firmicutes). Genera with proportions < 1% in each of the specimens are grouped together as “Other” and shown in grey. (c) Most abundant genera within each of the specimens, specimen type, participant age at specimen collection (in days) and 16S rRNA gene copy numbers (copies/μl) are summarised at the bottom of the figure. A-N-P-R: Allorhizobium-Neorhizobium-Pararhizobium-Rhizobium
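The clustering in panel a rests on pairwise Bray-Curtis dissimilarities. A minimal sketch of that dissimilarity for two OTU abundance vectors is given below; it is the standard textbook formula, not code from the study, and the example vectors are invented for illustration.

```python
def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i).

    Both profiles must list abundances for the same OTUs in the same order.
    Returns 0 for identical profiles and 1 for profiles sharing no OTUs.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same OTUs")
    numerator = sum(abs(a - b) for a, b in zip(profile_a, profile_b))
    denominator = sum(a + b for a, b in zip(profile_a, profile_b))
    if denominator == 0:
        return 0.0  # two empty profiles are treated as identical here
    return numerator / denominator


# Hypothetical OTU counts for a specimen and a no template control.
specimen = [80, 15, 5, 0]
ntc = [0, 5, 5, 90]
print(bray_curtis(specimen, ntc))
```

A full distance matrix built from this function could then be passed to any hierarchical clustering routine to obtain a dendrogram.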
Fig. 7 Associations between reproducibility and (a) participant age at specimen collection, (b) 16S rRNA gene copy numbers and (c) read counts. Reproducibility is measured by coefficient of determination (R²) values, calculated by comparing proportions of each OTU present between technical repeats. Horizontal blue bars highlight R² values > 0.90. Different shades of vertical blue bars represent (a) < 7, < 14, < 30 and < 60 days; (b) < 100, < 500 and < 1000 copies/μl; and (c) < 2000, < 4000, < 6000, < 8000 and < 10,000 reads, respectively. For b and c, each set of technical repeats had two 16S rRNA gene copy number/read count measures, shown as two points connected by a horizontal line on the X-axis
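The caption does not spell out how R² was computed. One plausible reading, sketched below as an assumption, is the squared Pearson correlation between the OTU proportions of two technical repeats. The function name and the example proportions are hypothetical.

```python
def r_squared(repeat_a, repeat_b):
    """Squared Pearson correlation between two OTU proportion vectors.

    Assumption: R^2 between technical repeats is taken as the square of the
    Pearson correlation coefficient; the source does not confirm this.
    """
    if len(repeat_a) != len(repeat_b):
        raise ValueError("repeats must cover the same OTUs")
    n = len(repeat_a)
    mean_a = sum(repeat_a) / n
    mean_b = sum(repeat_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(repeat_a, repeat_b))
    var_a = sum((a - mean_a) ** 2 for a in repeat_a)
    var_b = sum((b - mean_b) ** 2 for b in repeat_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("R^2 is undefined when a repeat has constant proportions")
    return (cov * cov) / (var_a * var_b)


# Hypothetical OTU proportions for two sequencing runs of the same specimen.
run_1 = [0.60, 0.25, 0.10, 0.05]
run_2 = [0.55, 0.30, 0.10, 0.05]
reproducible = r_squared(run_1, run_2) > 0.90  # threshold highlighted in the figure
print(r_squared(run_1, run_2), reproducible)
```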
Fig. 8 Shifts in profiles of bacterial genera commonly detected from the nasopharynx prior to and following decontamination via two in silico approaches. Per specimen shifts (n = 148) in bacterial proportions are shown for bacterial genera commonly detected from the nasopharynx: (a) Moraxella, (b) Corynebacterium 1, (c) Haemophilus, (d) Staphylococcus and (e) Streptococcus. Open circles and smoothing splines (representing a factor of 2× the standard deviation) denote bacterial proportions (Y-axis) for each of the specimens (X-axis). Red: proportions prior to decontamination; Blue: proportions following the removal of “potential contaminants” identified using the “NTConly” approach; Yellow: proportions following the removal of “potential contaminants” identified using the “NTC + decontam” approach
Fig. 9 Shifts in profiles of potential contaminants prior to and following decontamination via two in silico approaches. Per specimen shifts (n = 148) in bacterial proportions are shown for bacterial genera commonly described as “potential contaminants” in 16S rRNA gene sequencing datasets: (a) Aquabacterium, (b) Acidovorax, (c) Noviherbaspirillum, (d) Acinetobacter and (e) Stenotrophomonas. Open circles and smoothing splines (representing a factor of 2× the standard deviation) denote bacterial proportions (Y-axis) for each of the specimens (X-axis). Red: proportions prior to decontamination; Blue: proportions following the removal of “potential contaminants” identified using the “NTConly” approach; Yellow: proportions following the removal of “potential contaminants” identified using the “NTC + decontam” approach
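How the two approaches flag “potential contaminants” is not described in this excerpt. The sketch below therefore only illustrates the step the captions imply happens afterwards: dropping a given set of flagged taxa from a specimen and recomputing proportions over what remains. Recomputing proportions on the retained reads is an assumption, and the function name, counts and flagged set are invented for illustration.

```python
def remove_and_renormalise(taxon_counts, flagged):
    """Drop flagged taxa, then express the remaining counts as proportions.

    taxon_counts: dict mapping taxon name to read count for one specimen.
    flagged: set of taxon names treated as potential contaminants
             (how they are identified is outside the scope of this sketch).
    """
    kept = {taxon: n for taxon, n in taxon_counts.items() if taxon not in flagged}
    total = sum(kept.values())
    if total == 0:
        return {taxon: 0.0 for taxon in kept}
    return {taxon: n / total for taxon, n in kept.items()}


# Hypothetical read counts for one nasopharyngeal specimen.
specimen = {"Moraxella": 600, "Streptococcus": 200, "Aquabacterium": 200}
cleaned = remove_and_renormalise(specimen, flagged={"Aquabacterium"})
print(cleaned)
```

In this toy example the proportion of each retained genus rises after removal, which is the kind of per-specimen shift the red, blue and yellow series in the figures compare.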