Mark B Smith, Andrea M Rocha, Chris S Smillie, Scott W Olesen, Charles Paradis, Liyou Wu, James H Campbell, Julian L Fortney, Tonia L Mehlhorn, Kenneth A Lowe, Jennifer E Earles, Jana Phillips, Steve M Techtmann, Dominique C Joyner, Dwayne A Elias, Kathryn L Bailey, Richard A Hurt, Sarah P Preheim, Matthew C Sanders, Joy Yang, Marcella A Mueller, Scott Brooks, David B Watson, Ping Zhang, Zhili He, Eric A Dubinsky, Paul D Adams, Adam P Arkin, Matthew W Fields, Jizhong Zhou, Eric J Alm, Terry C Hazen.
Abstract
Biological sensors can be engineered to measure a wide range of environmental conditions. Here we show that statistical analysis of DNA from natural microbial communities can be used to accurately identify environmental contaminants, including uranium and nitrate at a nuclear waste site. In addition to contamination, sequence data from the 16S rRNA gene alone can quantitatively predict a rich catalogue of 26 geochemical features collected from 93 wells with highly differing geochemistry characteristics. We extend this approach to identify sites contaminated with hydrocarbons from the Deepwater Horizon oil spill, finding that altered bacterial communities encode a memory of prior contamination, even after the contaminants themselves have been fully degraded. We show that the bacterial strains that are most useful for detecting oil and uranium are known to interact with these substrates, indicating that this statistical approach uncovers ecologically meaningful interactions consistent with previous experimental observations. Future efforts should focus on evaluating the geographical generalizability of these associations. Taken as a whole, these results indicate that ubiquitous, natural bacterial communities can be used as in situ environmental sensors that respond to and capture perturbations caused by human impacts. These in situ biosensors rely on environmental selection rather than directed engineering, and so this approach could be rapidly deployed and scaled as sequencing technology continues to become faster, simpler, and less expensive.

IMPORTANCE: Here we show that DNA from natural bacterial communities can be used as a quantitative biosensor to accurately distinguish unpolluted sites from those contaminated with uranium, nitrate, or oil. These results indicate that bacterial communities can be used as environmental sensors that respond to and capture perturbations caused by human impacts.
Year: 2015 PMID: 25968645 PMCID: PMC4436078 DOI: 10.1128/mBio.00326-15
Source DB: PubMed Journal: MBio Impact factor: 7.867
FIG 1 Uranium and nitrate contamination can be effectively identified using bacterial DNA. We trained a random forest classifier using 16S abundance data from 2,972 operational taxonomic units measured across 93 wells. Classifier performance data for uranium (A) and nitrate (B) across the Oak Ridge Field Site are shown. The maximum contaminant level (MCL) is the cutoff used to determine which sites are contaminated (samples below the cutoff are uncontaminated). Contaminant levels are measured at each well and linearly interpolated between wells. Overall classification performance values measured by specificity, sensitivity, and accuracy for detecting contamination were higher for uranium (0.71, 0.87, and 0.82, respectively) than for nitrate (0.81, 0.63, and 0.70).
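The three performance values quoted in this caption can be computed from any set of per-well predictions once the MCL cutoff is applied. The following is a minimal sketch, not the authors' pipeline: the function name, the toy concentrations, the cutoff of 30, and the predictions are all hypothetical, and the classifier that produces the predictions is left out.

```python
def contamination_metrics(measured, predicted_contaminated, mcl):
    """Return (specificity, sensitivity, accuracy) for per-well predictions.

    measured: contaminant concentration at each well
    predicted_contaminated: one boolean per well from any classifier
    mcl: cutoff; wells below it count as uncontaminated, as in the caption
    """
    truth = [value >= mcl for value in measured]
    pairs = list(zip(truth, predicted_contaminated))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    # Assumes both classes are present; otherwise a denominator is zero.
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)
    accuracy = (tp + tn) / len(pairs)
    return specificity, sensitivity, accuracy


# Hypothetical toy data: six wells, cutoff of 30 in arbitrary units.
measured = [5, 10, 40, 80, 120, 15]
predicted = [False, True, True, True, False, False]
spec, sens, acc = contamination_metrics(measured, predicted, mcl=30)
print(spec, sens, acc)
```

Reporting specificity and sensitivity separately matters here because a classifier can score well on one and poorly on the other, as the nitrate and uranium figures in the caption illustrate.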
FIG 2 Bacterial DNA can be used to quantitatively predict many geochemical features. Besides classification, we can use 16S sequence data to predict quantitative values for a variety of geochemical measurements at each well. (A and B) For example, the prominent features displayed in our map of true pH (A) are recovered in our map of predicted pH (B). (C) We found that predicted values for pH are highly correlated with true values (P < 1 × 10⁻¹⁰, Kendall tau rank test). (D) We extended this approach to 38 other geochemical parameters and plotted the coefficient of correlation (Kendall’s tau) between true and predicted values. Among these correlations, 18 are highly significant (P < 0.0001, indicated by closed circles), 8 are significant (P < 0.01, indicated by open circles), and 12 are not significant.
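Kendall's tau, the rank correlation used in this caption, compares every pair of wells and asks whether the true and predicted values order them the same way. The sketch below implements the simplest variant (tau-a, which ignores tied pairs); which variant the authors used is not stated here, so treat that choice, the function name, and the toy pH values as assumptions.

```python
from itertools import combinations


def kendall_tau_a(true_values, predicted_values):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (t1, p1), (t2, p2) in combinations(zip(true_values, predicted_values), 2):
        product = (t1 - t2) * (p1 - p2)
        if product > 0:
            concordant += 1  # both measures rank the pair the same way
        elif product < 0:
            discordant += 1  # the two measures disagree on the order
        # product == 0 means a tie; tau-a counts it as neither
    n = len(true_values)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical pH values for five wells.
true_ph = [3.5, 4.2, 5.8, 6.9, 7.4]
predicted_ph = [3.9, 4.0, 6.1, 7.6, 7.0]
tau = kendall_tau_a(true_ph, predicted_ph)
print(tau)
```

Because tau depends only on rank order, it tolerates predictions that are monotonically related to the true values without being linearly calibrated.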
FIG 3 Near-perfect classification of oil contamination using bacterial DNA. (A) Samples collected prior to the Deepwater Horizon oil spill (green), during the spill but outside the oil plume (white), from the oil plume (orange), and from the plume but after the oil had been degraded (red) across the Gulf of Mexico are shown. (B) To compare oil classification performance with classification of uranium and nitrate, we show the receiver operating characteristic (ROC) curves for all classifiers. The values for the area under the curve are 0.99 for oil, 0.82 for uranium, and 0.76 for nitrate, compared to 0.50 for an uninformative random classifier.
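The area under the ROC curve has a direct reading: it is the probability that a randomly chosen contaminated sample receives a higher classifier score than a randomly chosen uncontaminated one, with ties counted as half. A minimal sketch of that pairwise definition follows; the function name, labels, and scores are hypothetical, and this is not necessarily how the authors computed their values.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for label, s in zip(labels, scores) if label]
    negatives = [s for label, s in zip(labels, scores) if not label]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0  # positive sample correctly ranked higher
            elif pos == neg:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for three contaminated and three clean samples.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(auc)
```

A classifier that gives every sample the same score lands at 0.5 under this definition, which is the uninformative baseline quoted in the caption.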
FIG 4 Random forests identify highly discriminative, biologically meaningful taxonomic groups that predict environmental conditions. (A) To understand the remarkable performance of the oil classifier, we have plotted a phylogenetic tree that includes the 50 most informative taxonomic groups for predicting uranium (red) and oil (black) data. The betaproteobacteria (β) and gammaproteobacteria (γ) clades are indicated. (B) We tested each of these features by itself as a classifier and plotted the Matthews correlation coefficient (MCC) for each of these single-feature classifiers as a bar plot at each leaf of the tree. While the best uranium features are highly informative (mean MCC = 0.49), the best features for oil classification are individually nearly perfect classifiers (mean MCC = 0.97). Error bars for the summary of these single-feature classifiers reflect 1 standard deviation. (C) The relative abundances of two highly informative features are shown for each sample. The relative abundance is expressed as the z score of each group relative to the abundances of other taxonomic groups from the same sample.
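Two quantities in this caption are simple to compute by hand: the MCC of a classifier built from a single taxonomic group, and the within-sample z score of a group's abundance. The sketch below is illustrative only. Thresholding one group's abundance is an assumed way of turning a feature into a classifier, the population standard deviation is an assumed choice for the z score, and all names and numbers are hypothetical.

```python
import math
from statistics import mean, pstdev


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Conventionally 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


def single_feature_mcc(abundances, contaminated, threshold):
    """Score one taxonomic group as a classifier by thresholding its abundance."""
    predicted = [a > threshold for a in abundances]
    pairs = list(zip(contaminated, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return mcc(tp, tn, fp, fn)


def z_scores(sample_abundances):
    """z score of each group relative to all groups in the same sample."""
    mu = mean(sample_abundances)
    sigma = pstdev(sample_abundances)  # assumes abundances are not all equal
    return [(a - mu) / sigma for a in sample_abundances]


# Hypothetical abundances of one group across five samples.
abundances = [0.01, 0.02, 0.30, 0.25, 0.40]
contaminated = [False, False, True, True, True]
score = single_feature_mcc(abundances, contaminated, threshold=0.1)
z = z_scores([1.0, 2.0, 3.0])
print(score, z)
```

MCC ranges from -1 to 1 and stays informative when the two classes are unbalanced, which makes it a reasonable summary for comparing many single-feature classifiers side by side.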