Jeremiah J Minich, Jon G Sanders, Amnon Amir, Greg Humphrey, Jack A Gilbert, Rob Knight.
Abstract
Microbial sequences inferred as belonging to one sample may not have originated from that sample. Such contamination may arise from laboratory or reagent sources or from physical exchange between samples. This study seeks to rigorously assess the behavior of this often-neglected between-sample contamination. Using unique bacteria, each assigned a particular well in a plate, we assess the frequency at which sequences from each source appear in other wells. We evaluate the effects of different DNA extraction methods performed in two laboratories using a consistent plate layout, including blanks and low-biomass and high-biomass samples. Well-to-well contamination occurred primarily during DNA extraction and, to a lesser extent, in library preparation, while barcode leakage was negligible. Laboratories differed in the levels of contamination. Extraction methods differed in their occurrences and levels of well-to-well contamination, with plate methods having more well-to-well contamination and single-tube methods having higher levels of background contaminants. Well-to-well contamination occurred primarily in neighboring samples, with rare events up to 10 wells apart. This effect was greatest in samples with lower biomass and negatively impacted metrics of alpha and beta diversity. Our work emphasizes that sample contamination is a combination of cross talk from nearby wells and background contaminants. To reduce well-to-well effects, samples should be randomized across plates, samples of similar biomasses should be processed together, and manual single-tube extractions or hybrid plate-based cleanups should be employed. 
Researchers should avoid simplistic removals of taxa or operational taxonomic units (OTUs) appearing in negative controls, as many will be microbes from other samples rather than reagent contaminants.
IMPORTANCE Microbiome research has uncovered magnificent biological and chemical stories across nearly all areas of life science, at times creating controversy when findings reveal fantastic descriptions of microbes living and even thriving in what were once thought to be sterile environments. Scientists have refuted many of these claims because of contamination, which has led to robust requirements, including the use of controls, for validating accurate portrayals of microbial communities. In this study, we describe a previously undocumented form of contamination, well-to-well contamination, and show that this sort of contamination primarily occurs during DNA extraction rather than PCR, is highest with plate-based methods compared to single-tube extraction, and occurs at a higher frequency in low-biomass samples. This finding has profound importance in the field, as many current techniques to "decontaminate" a data set simply rely on an assumption that microbial reads found in blanks are contaminants from "outside," namely, the reagents or consumables.
Keywords: 16S rRNA gene; automation; built environment; contamination; genomics; low biomass; metagenomics; microbiome; microbiota; study design
Year: 2019 PMID: 31239396 PMCID: PMC6593221 DOI: 10.1128/mSystems.00186-19
Source DB: PubMed Journal: mSystems ISSN: 2379-5077 Impact factor: 6.496
FIG 1. Plate design and experimental design. (a) NTC, sink, and source samples are distributed in a checkerboard pattern across the plate. (b and c) Antifoam A is added to the first half (b) and second half (c) of the 96-well plates processed with the robot in order to test whether antifoam A reduces foaming during bead beating and thereby well-to-well contamination. The manual samples did not receive antifoam A. Each unique DNA extraction plate is processed in duplicate PCR plates.
FIG 2. Example of plates with cross-contamination. Each panel depicts a 96-well plate, with source, sink, and blank wells denoted by “O,” “X,” and empty squares, respectively. Colors indicate the number of reads from a specific bacterium (Psychrobacter species, present in well E5). Panels a and b, c and d, and e and f correspond to two PCR replicates of robotic extractions 1 and 2 and manual extraction, respectively.
FIG 3. Distance-decay relationship of source samples contaminating surrounding samples. The distance (in units of wells) between “contaminant” observations of each sOTU and its source well was calculated. Histograms plot the number of inferred contamination events for each distance range for all 16 source microbes across the various DNA extraction plates and PCR replicate plate types from UCSD. Panels a and b, c and d, and e and f correspond to two PCR replicates of manual extraction and robotic extractions 1 and 2, respectively.
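The distance calculation described in the FIG 3 caption can be sketched as follows. This is a minimal illustration, not the authors' code: the caption does not say which distance metric was used, so Euclidean distance on the row/column grid is an assumption here, and the function names (`parse_well`, `well_distance`, `distance_histogram`) and the example wells are hypothetical.

```python
import math
from collections import Counter

ROWS = "ABCDEFGH"  # a 96-well plate has 8 rows (A-H)
N_COLS = 12        # and 12 columns (1-12)


def parse_well(well):
    """Convert a label such as 'E5' to zero-based (row, column) indices."""
    label = well.strip().upper()
    row = ROWS.find(label[0])
    col = int(label[1:]) - 1
    if row < 0 or not 0 <= col < N_COLS:
        raise ValueError("not a 96-well position: %r" % well)
    return row, col


def well_distance(source, observed):
    """Distance in units of wells (ASSUMPTION: Euclidean on the plate grid)."""
    r1, c1 = parse_well(source)
    r2, c2 = parse_well(observed)
    return math.hypot(r1 - r2, c1 - c2)


def distance_histogram(source, observed_wells):
    """Count contaminant observations per whole-well distance bin (floor)."""
    return Counter(int(well_distance(source, w)) for w in observed_wells)


# Hypothetical example: a source in well E5 whose sOTU also shows up in
# four other wells. These observations are made up for illustration.
histogram = distance_histogram("E5", ["E6", "D5", "F6", "E8"])
```

A Chebyshev or Manhattan metric would be an equally plausible reading of "units of wells"; swapping the body of `well_distance` is enough to try either.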
FIG 4. Summary statistics of sample fraction compositions of well-to-well contaminants compared across extraction types (blanks [pink], sink [blue], and source [purple]) and across extraction methods (tube versus plate). The y axis has a maximum value of 1 (corresponding to 100%). Sample types (NTC, sink, or source) were assigned an estimated input biomass of 0 to 100 cells, 1e5 cells, or 1e7 cells, respectively. For UCSD tube extractions, samples from both PCR replicate plates (PCRA and PCRB) were included. For UCSD robot plate extractions, samples from both PCR replicate plates and both DNA extraction plates were combined and organized by sample type. Argonne-processed samples included one extraction plate and one PCR replicate plate. Samples processed at UCSD are indicated by circles with no outline, and samples processed at Argonne are indicated by circles with a dark border. All samples with zero well-to-well contamination occurrences are given a count of 0.00001 to enable visualization on the graph (labeled 0 counts). Medians and interquartile ranges are displayed in black lines over the data points. ****, P < 0.001; ns, not significant.
Impact of contamination (well to well and background) on NTC, low-biomass, and high-biomass sample types
| Sample type | n | Location | Extraction | W2W prevalence (%) | Richness: avg no. of sOTUs | Richness: W2W% | W2W composition (%): mean | W2W composition (%): median | W2W composition (%): max | Background kit composition (%): mean | Background kit composition (%): median | Background kit composition (%): max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NTC | 61 | UCSD | m_tube | 47.54 | 20 | 4.12 | 4.64 | 0.00 | 56.00 | 95.36 | 100.0 | 100.0 |
| NTC | 32 | Argonne | m_tube | 53.13 | 165 | 1.56 | 0.85 | 0.03 | 8.23 | 99.15 | 99.97 | 100.0 |
| NTC | 28 | Argonne | m_plate | 10.71 | 8 | 4.23 | 3.14 | 0.00 | 75.17 | 96.86 | 100.0 | 100.0 |
| NTC | 116 | UCSD | Robot | 95.69 | 15 | 27.79 | 63.79 | 74.78 | 100.0 | 36.21 | 25.22 | 100.0 |
| Sink | 93 | UCSD | m_tube | 15.05 | 20 | 0.96 | 0.05 | 0.00 | 2.78 | 3.35 | 1.68 | 98.73 |
| Sink | 48 | Argonne | m_tube | 50.00 | 189 | 1.67 | 2.31 | 0.00 | 59.34 | 78.08 | 83.82 | 98.78 |
| Sink | 46 | Argonne | m_plate | 32.61 | 16 | 6.61 | 13.99 | 0.00 | 98.71 | 58.46 | 62.67 | 100.0 |
| Sink | 187 | UCSD | Robot | 67.38 | 15 | 12.70 | 0.70 | 0.08 | 15.61 | 0.93 | 0.25 | 40.51 |
| Source | 31 | UCSD | m_tube | 61.29 | 18 | 6.51 | 0.13 | 0.01 | 2.99 | 8.30 | 0.29 | 100.0 |
| Source | 16 | Argonne | m_tube | 87.50 | 21 | 13.78 | 0.02 | 0.02 | 0.07 | 11.54 | 0.41 | 99.98 |
| Source | 16 | Argonne | m_plate | 81.25 | 17 | 16.79 | 2.37 | 0.01 | 36.40 | 13.13 | 0.32 | 99.99 |
| Source | 64 | UCSD | Robot | 70.31 | 12 | 13.76 | 0.94 | 0.04 | 50.67 | 7.32 | 0.16 | 100.0 |
Composition refers to the mean, median, or maximum frequency of sOTU contaminants that are due to well-to-well contamination or background kits.
The first numeric value in each row (n) is the total number of samples or wells that had enough sequencing data for analysis.
Location refers to the two laboratories which processed samples, either UCSD or Argonne.
m_, manual (non-robotic-based extraction); Robot, robot-based DNA cleanup.
Prevalence is calculated as the number of samples with any well-to-well contamination/total number of samples.
W2W% is the percentage of total richness that is a result of well-to-well events, calculated as the number of unique well-to-well contaminants/total number of sOTUs (mean).
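The two ratios defined in the footnotes above (prevalence and W2W%) can be written out as a short sketch. The function names and the toy input are hypothetical, and computing W2W% per sample before averaging is an assumption, since the footnote does not spell out how the mean is taken.

```python
def prevalence_percent(w2w_counts_per_sample):
    """Percentage of samples with at least one well-to-well contaminant.

    Takes one count of well-to-well contaminant sOTUs per sample.
    """
    n_samples = len(w2w_counts_per_sample)
    if n_samples == 0:
        raise ValueError("need at least one sample")
    n_affected = sum(1 for count in w2w_counts_per_sample if count > 0)
    return 100.0 * n_affected / n_samples


def w2w_percent_of_richness(n_w2w_sotus, n_total_sotus):
    """Percentage of a sample's sOTU richness attributed to well-to-well events."""
    if n_total_sotus <= 0:
        raise ValueError("total sOTU count must be positive")
    return 100.0 * n_w2w_sotus / n_total_sotus


# Toy input (hypothetical): 61 samples, 29 of which carry one well-to-well
# contaminant. The per-sample counts are not given in the source; 29 is
# only what the rounded 47.54% for n = 61 would imply.
toy_counts = [1] * 29 + [0] * 32
toy_prevalence = round(prevalence_percent(toy_counts), 2)
```

With this toy input the prevalence rounds to 47.54, matching the value listed for the UCSD m_tube NTC row, which suggests the footnote's formula is being read correctly.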
FIG 5. Well-to-well effect size. Shown are proportions of samples containing well-to-well contaminants organized by sample type (NTC, sink, and source) (a) and extraction method (b). The y axis has a maximum value of 1 (corresponding to 100%). Statistical analyses of data within bars were performed using Kruskal-Wallis nonparametric testing and indicate differences in contaminant fractions across extraction types (a) and among sample types (b). IQR, interquartile range.