
The Microbial Community and Functional Potential in the Midland Basin Reveal a Community Dominated by Both Thiosulfate and Sulfate-Reducing Microorganisms.

Kara Tinker1,2, Daniel Lipus1,3,4, James Gardiner1,2, Mengling Stuckman1,2, Djuna Gulliver1.   

Abstract

The Permian Basin is the highest producing oil and gas reservoir in the United States. Hydrocarbon resources in this region are often accessed by unconventional extraction methods, including horizontal drilling and hydraulic fracturing. Despite the importance of the Permian Basin, there is no publicly available microbiological data from this region. We completed an analysis of Permian produced water samples to understand the dynamics present in hydraulically fractured wells in this region. We analyzed produced water samples taken from 10 wells in the Permian region of the Midland Basin using geochemical measurements, 16S rRNA gene sequencing, and metagenomic sequencing. Compared to other regions, we found that Permian Basin produced water was characterized by higher sulfate and lower total dissolved solids (TDS) concentrations, with medians of 1,110 mg/L and 107,000 mg/L, respectively. Additionally, geochemical measurements revealed the presence of frac hits, or interwell communication events where an established well is affected by the pumping of fracturing fluid into a new well. The occurrence of frac hits was supported by correlations between the microbiome and the geochemical parameters. Our 16S rRNA gene sequencing identified a produced water microbiome characterized by anaerobic, halophilic, and sulfur-reducing taxa. Interestingly, sulfate- and thiosulfate-reducing taxa including Halanaerobium, Orenia, Marinobacter, and Desulfohalobium were the most prevalent microbiota in most wells. We further investigated the metabolic potential of microorganisms in the Permian Basin with metagenomic sequencing. We recovered 15 metagenome-assembled genomes (MAGs) from seven different samples representing 6 unique well sites. These MAGs corroborated the high presence of sulfate- and thiosulfate-reducing genes across all wells, especially from key taxa including Halanaerobium and Orenia. 
The observed microbiome composition and metabolic capabilities in conjunction with the high sulfate concentrations demonstrate a high potential for hydrogen sulfide production in the Permian Basin. Additionally, evidence of frac hits suggests the possibility for the exchange of microbial cells and/or genetic information between wells. This exchange would increase the likelihood of hydrogen sulfide production and has implications for the oil and gas industry. IMPORTANCE The Permian Basin is the largest producing oil and gas region in the United States and plays a critical role supplying national energy needs. Previous work in other basins has demonstrated that the geochemistry and microbiology of hydrocarbon regions can have a major impact on well infrastructure and production. Despite that, little work has been done to understand the complex dynamics present in the Permian Basin. This study characterizes and analyzes 10 unique wells and one groundwater sample in the Permian Basin using geochemical and microbial techniques. Across all wells we found a high number of classic and thiosulfate reducers, suggesting that hydrogen sulfide production may be especially prevalent in the Permian Basin. Additionally, our analysis revealed a biogeochemical signal impacted by the presence of frac hits, or interwell communication events where an established well is affected by the pumping of fracturing fluid into a new well. This information can be utilized by the oil and gas industry to improve oil recovery efforts and minimize commercial and environmental costs.

Keywords:  16S RNA; Permian Basin; environmental microbiology; geomicrobiology; hydraulic fracturing; hydrocarbons; metagenomics


Year:  2022        PMID: 35695567      PMCID: PMC9430316          DOI: 10.1128/spectrum.00049-22

Source DB:  PubMed          Journal:  Microbiol Spectr        ISSN: 2165-0497


INTRODUCTION

Hydraulically fractured oil and natural gas are one of the world’s fastest growing major fuel sources and represent an essential energy source in the United States (1, 2). While efforts to minimize greenhouse gas emissions have advanced, hydraulically fractured oil and natural gas have emerged as major replacements for coal and are an important gateway toward renewable energy sources (3, 4). Advanced horizontal drilling and hydraulic fracturing technologies have contributed to increased yields as they can extract large quantities of oil and natural gas from previously impermeable shale formations (5, 6). However, these technologies are often associated with frac hits, or interwell communication events where an established well is affected by the pumping of fracturing fluid into a new well. Additionally, hydraulic fracturing operations generate vast amounts of wastewater, referred to as produced water, which is characterized by high concentrations of salt, metals, organic compounds, radionuclides, and microorganisms (7–9). Prior to reuse or disposal, these fluids need to be remediated, creating additional operational and environmental expenses (10, 11). Issues associated with produced water management such as corrosion and fouling events, leakage and spillage of potentially hazardous substances, and hydrogen sulfide contamination highlight the need to improve and optimize current management strategies and approaches (6, 8, 12–15). Many of the key issues associated with produced water management have been linked to microbial activity (8, 16–22). Therefore, the role of microorganisms during hydraulic fracturing operations and hydraulic fracturing produced water management has become a focal point (6, 8). 
Currently, it is unclear if the majority of the microbial biomass found in produced fluids and the hydraulic fracturing infrastructure stems from the subsurface or is introduced and distributed through the fracturing process and by the continuous recycling of produced water (8, 23). However, research efforts to this point suggest that despite the use of a wide array of biocides, low diversity microbial populations adapted to the hypersaline and extreme conditions are able to establish themselves in the borehole and surrounding well environment including the gas separator and storage tanks. Biocides may actually play an important role in shaping the community, as shale-associated microorganisms are often resistant to biocides (18, 24, 25) and ineffective biocide treatments may result in increased antimicrobial resistance genes and/or enhanced microbial activity (18, 19, 26). In addition to the deleterious effects microbial activity can have on oil and gas operations and the surrounding environments, recent work also suggests that microbiological data from hydraulic fracturing well sites can be used to forecast geographic areas and subsurface sections, which may be particularly productive (27). These contrasting roles highlight the necessity to study the microbial community, microbial processes, and biogeochemical interactions in and around hydraulic fracturing wells so we can incorporate these findings in current produced water management strategies. The Permian Basin is currently the largest oil-producing basin in the United States, producing nearly 3.5 million barrels of oil and 12 million cubic feet of natural gas per day (28). Therefore, obtaining insights on microbial processes potentially interfering with operations and evaluating biogeochemical interactions as part of advanced produced water management is of significant scientific and engineering interest. 
However, to our knowledge there is little publicly available microbiological data from the Permian Basin. A 2015 study by Kilbane et al. (29) investigated the potential for microbial sulfide production using culture-based tests and the sequencing of two samples, one water and one biofilm. Results revealed microbial abundances of 10^3 cells/mL in water samples and 10^5 cells per cm^2 of biofilm (29). Arcobacter was the dominant taxon present in the water sample, while Desulfovibrio was dominant in the biofilm sample. A recent paper by Lascelles et al. (27) discusses the usefulness and necessity of analyzing DNA from Permian formations but does not provide any specific results on microbial abundance or community structure. Finally, a 2019 paper by Ursell et al. measured the presence of microbial DNA markers in produced fluid and well cuttings from different depths in the Permian Basin (30). Although they found unique DNA marker profiles at different depths, they do not report their exact methodology or the taxonomy of the microbes. These results demonstrate the overall need for robust microbial studies in the Permian Basin and suggest that there is a high potential for sulfide-producing halophilic microorganisms. However, the unavailability of proprietary sequencing data limits the impact of previous work, and therefore additional investigation is required. While microbiological data on Permian Basin produced waters are scarce, several studies have investigated the geochemistry and microbiology of the Marcellus Shale and other major oil and gas regions in the United States (22, 23, 31, 32). From this work, we know shale wells are characterized by high salinity and cation concentrations, with total dissolved solids (TDS) concentrations up to 40 g/L and excess Fe2+, Ba2+, and Mg2+ in these systems (22, 33–37). These conditions usually result in a halophilic microbial community with high osmotic and oxidative stress tolerance abilities (19, 36, 38). 
Both amplicon-based and metagenomic approaches have identified several microbial taxa and reconstructed metabolic pathways that can contribute to corrosion, fouling, and souring events in hydraulic fracturing wells, pipelines, tanks, and associated equipment (17, 22–24, 31, 35). In addition, microorganisms in produced waters are known to actively express key genes involved in sulfide production and methanogenesis pathways (24). However, this work also demonstrates that oil and gas regions have unique geochemical and microbiological signatures (22, 23, 31, 32, 35, 37). For example, Bakken region produced waters have been found to exhibit greater TDS concentrations and lower biomass than Marcellus Shale or Barnett Shale produced waters, which results in a distinct ecological niche within the Bakken Shale (8, 23, 33–35). Therefore, although we can make some predictions about the Permian Basin based on previous work in other hydrocarbon regions, it is necessary to complete a comprehensive study of produced water from the Permian region in order to accurately understand and manage local resources. Our overall goal is to assess the unique geochemical and microbiological features of hydraulically fractured shales in the Permian Basin in order to improve enhanced oil recovery efforts and minimize commercial and environmental costs. In this study, we analyzed produced water samples taken from 10 wells in the Permian region of the Midland Basin. We also analyzed one proximal groundwater sample, as groundwater is typically used as source water for fracturing fluid in this region. We characterized the microbial community of water samples using 16S rRNA gene sequencing, collected detailed geochemistry data from each sample, and used metagenomic shotgun sequencing to assess the functional potential present in select hydraulically fractured wells. 
To our knowledge, this is the first extensive analysis of Permian produced water samples. It represents a significant advancement in the understanding of microbial life in subsurface environments and provides valuable knowledge that can be utilized by the oil and gas industry.

RESULTS

Geochemistry of produced water from the Permian Basin.

We completed geochemical analysis on 14 produced water samples and one groundwater sample in order to better understand the unique ecosystem present in the Permian Basin (Fig. S1, Table S1). Our produced water samples were collected from 10 unique well sites, with four of those sites having both paired wellhead and separator samples (Table S1). We also collected gas measurements at each wellhead, although we were unable to do so at the separator. All wellheads had a measurable concentration of H2S, ranging from 111 to 391 ppm. The produced waters displayed a small range in pH, from 6.4 to 7.3, with all values being circumneutral (Table 1). Alkalinity ranged from 95 to 737 mg/L, with an average of 404 mg/L and a median of 385 mg/L across all produced water samples (Table 1). Our groundwater pH and alkalinity measurements fell within these ranges, at 7.1 and 427 mg/L, respectively (Table 1). These values are consistent with previously collected geochemistry data from the Permian Basin (39, 40).
TABLE 1

Geochemical measurements for produced water samples and proximal groundwater in the Permian Basin

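| Well no. | pH | Alkalinity (mg/L) | TDS (mg/L) | Ca2+ (mg/L) | Mg2+ (mg/L) | Na (mg/L) | K+ (mg/L) | Total Fe (mg/L) | SO42− (mg/L) | Cl (mg/L) | Br (mg/L) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1S | 6.6 | 522 | 58300 | 1680 | 601 | 18500 | 338 | 43 | 3650 | 32800 | 229 |
| 2S | 6.9 | 737 | 69700 | 1830 | 506 | 23500 | 383 | 15 | 2680 | 39800 | 308 |
| 3S | 6.4 | 366 | 54500 | 1350 | 517 | 18200 | 299 | 24 | 3490 | 30100 | 238 |
| 4S | 6.7 | 337 | 84000 | 2150 | 509 | 28400 | 436 | 53 | 2500 | 49300 | 388 |
| 5W | 7.3 | 378 | 111000 | 3210 | 593 | 37900 | 520 | 49 | 1300 | 66700 | 524 |
| 6S | 7.0 | 368 | 107000 | 3180 | 565 | 36600 | 415 | BDL | 1090 | 64300 | 469 |
| 6W(a) | 7.2 | 512 | 106500 | 3310 | 581 | 36300 | 409 | 67 | 1115 | 63650 | 470 |
| 7S | 6.6 | 290 | 122000 | 3600 | 611 | 41700 | 610 | 58 | 817 | 74100 | 614 |
| 7W | 6.7 | 388 | 123000 | 3610 | 615 | 42000 | 613 | 52 | 856 | 74500 | 604 |
| 8S | 6.5 | 425 | 120000 | 3430 | 602 | 40400 | 542 | 64 | 864 | 73300 | 555 |
| 8W | 6.7 | 381 | 118000 | 3420 | 599 | 40300 | 540 | 52 | 827 | 71300 | 554 |
| 9S | 6.6 | 398 | 120000 | 3830 | 623 | 40400 | 428 | 87 | 626 | 73100 | 530 |
| 9W | 6.6 | 464 | 118000 | 3790 | 617 | 39900 | 423 | 80 | 630 | 71300 | 524 |
| 10S | 6.9 | 95 | 53200 | 1320 | 504 | 18000 | 302 | 24 | 3420 | 29300 | 229 |
| Min(b) | 6.4 | 95 | 53200 | 1320 | 504 | 18000 | 299 | 15 | 626 | 29300 | 229 |
| Max(b) | 7.3 | 737 | 123000 | 3830 | 623 | 42000 | 613 | 87 | 3650 | 74500 | 614 |
| Median(b) | 6.7 | 385 | 109000 | 3260 | 596 | 37250 | 426 | 52 | 1103 | 65500 | 497 |
| Avg(b) | 6.8 | 404 | 97514 | 2836 | 575 | 33007 | 447 | 51 | 1705 | 58111 | 445 |
| Proximal groundwater | 7.1 | 427 | 8432 | 92.9 | 48.5 | 2830 | 11.0 | ND | 2070 | 2960 | 6.23 |

(a) Values reported for 6W are the average of two technical replicates.

(b) Value was calculated using only produced water samples; the proximal groundwater sample was not included.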

Total dissolved solids measurements for produced water samples range from 53,200 to 123,000 mg/L, with the majority of the TDS represented by high concentrations of chloride and sodium (Table 1). Notably, half of the produced water samples have a significantly lower TDS measurement than the median of 107,000 mg/L (Table 1). These lower TDS produced water samples have an elevated dissolved sulfate range (2,500–3,650 mg/L) relative to the overall dissolved sulfate median of 1,110 mg/L (Table 1). The groundwater also recorded a sulfate concentration (2,070 mg/L) higher than the median (Table 1). Since groundwater is commonly used as a major component of fracturing fluid in this region, the reduced TDS and elevated sulfate values in the produced water imply recent contact with injected groundwater. As all wells sampled had been active for more than 6 months, at which point produced water has a distinct chemical signature compared to the initial fracture fluid injection (8), this low-TDS, high-sulfate signature was most likely from a nearby frac hit. It is notable that the sulfate concentrations in produced fluids with reduced TDS measurements were higher than groundwater sulfate levels. This suggests that additional reactions occurred within the reservoir after the frac hit event.
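The low-TDS, high-sulfate signature can be checked directly against the TDS and sulfate columns of Table 1. The following is a minimal sketch rather than the authors' procedure: the flagging rule (TDS below the produced water median and sulfate above the groundwater value of 2,070 mg/L) is an assumption chosen for illustration, and the names `SAMPLES` and `flag_groundwater_like` are hypothetical.

```python
from statistics import median

# (TDS mg/L, sulfate mg/L) per produced water sample, transcribed from Table 1.
SAMPLES = {
    "1S": (58300, 3650), "2S": (69700, 2680), "3S": (54500, 3490),
    "4S": (84000, 2500), "5W": (111000, 1300), "6S": (107000, 1090),
    "6W": (106500, 1115), "7S": (122000, 817), "7W": (123000, 856),
    "8S": (120000, 864), "8W": (118000, 827), "9S": (120000, 626),
    "9W": (118000, 630), "10S": (53200, 3420),
}
GROUNDWATER_SULFATE = 2070  # mg/L, proximal groundwater (Table 1)


def flag_groundwater_like(samples, sulfate_cutoff):
    """Return sample IDs with TDS below the median and sulfate above the cutoff.

    Assumed rule for illustration only; it is not taken from the source.
    """
    tds_mid = median(tds for tds, _ in samples.values())
    return [
        name for name, (tds, sulfate) in samples.items()
        if tds < tds_mid and sulfate > sulfate_cutoff
    ]


tds_median = median(tds for tds, _ in SAMPLES.values())
sulfate_median = median(so4 for _, so4 in SAMPLES.values())
flagged = flag_groundwater_like(SAMPLES, GROUNDWATER_SULFATE)
print(tds_median, sulfate_median, flagged)
```

Under these assumed thresholds the flagged samples are 1S, 2S, 3S, 4S, and 10S, which are the samples whose sulfate falls in the 2,500 to 3,650 mg/L range. The medians computed this way (109,000 mg/L TDS and 1,102.5 mg/L sulfate) agree with the Median row of Table 1.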

Microbial diversity and distribution of produced water microbial communities.

We found that there was a large range in all alpha diversity metrics (Table 2, Table S1, Fig. S1). Produced water from the separator site 2 (2S) had the highest number of unique amplicon sequence variants (ASVs) (66) as well as the highest recorded values for richness (77), diversity (3.03) and evenness (0.720) (Table 2, Table S1, Fig. S1). In contrast, produced water from the separator at site 9 (9S) had the lowest number of unique ASVs (7) as well as the lowest Chao1 index (7) (Table 2, Table S1, Fig. S1). Notably, the wellhead sample at site 9 (9W) had a significantly higher number of unique ASVs (29) and Chao1 index (28.1), although the diversity and evenness for 9S and 9W were comparable (Table 2, Table S1, Fig. S1). Finally, sample 1S had the lowest measurements for diversity and evenness, at 0.974 and 0.299, respectively (Table 2, Table S1, Fig. S1).
TABLE 2

Alpha diversity metrics for experimental samples

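| Sample ID | ASVs | Richness(b) | Diversity(c) | Evenness(d) |
|---|---|---|---|---|
| 1S | 26 | 37.3 | 0.974 | 0.299 |
| 2S | 67 | 77 | 3.03 | 0.720 |
| 3S | 22 | 26 | 1.20 | 0.389 |
| 4S | 33 | 54 | 2.20 | 0.630 |
| 5W | 21 | 35 | 1.26 | 0.414 |
| 6W | 20 | 38 | 1.51 | 0.504 |
| 9S | 7 | 7 | 1.21 | 0.624 |
| 9W | 28 | 28.1 | 1.68 | 0.504 |
| 10S | 47 | 64.5 | 1.73 | 0.450 |
| GW | 23 | 25.5 | 1.39 | 0.444 |

(a) Metrics were calculated after sequence libraries were resampled to the depth of the sample with the fewest sequences from this experiment (1023 sequences).

(b) Chao1 Index.

(c) Shannon Index.

(d) Pielou’s Evenness.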
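The three indices named in the table footnotes can be written out in a few lines. This is a minimal sketch under stated assumptions: the source does not say which software or which Chao1 variant was used, so the bias-corrected Chao1 form below is an assumption, and the count vector `toy_counts` is hypothetical.

```python
from collections import Counter
from math import log


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def pielou(counts):
    """Pielou's evenness J' = H' / ln(S), with S the number of observed taxa."""
    observed = sum(1 for c in counts if c > 0)
    return shannon(counts) / log(observed) if observed > 1 else 0.0


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of taxa seen exactly once and exactly twice.
    """
    freq = Counter(c for c in counts if c > 0)
    observed = sum(freq.values())
    singletons, doubletons = freq[1], freq[2]
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


toy_counts = [5, 1, 1, 2]  # hypothetical ASV counts for one sample
print(shannon(toy_counts), pielou(toy_counts), chao1(toy_counts))
```

As a consistency check on the table itself, sample 1S has a Shannon index of 0.974 with 26 ASVs, and 0.974 / ln(26) is approximately 0.299, which matches the reported evenness.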

In order to visualize differences among water samples, we constructed two ordination plots using non-metric multidimensional scaling (NMDS) based on Bray-Curtis dissimilarity distances calculated after sequence libraries were subsampled to the lowest sequence depth (1023 sequences). The first NMDS plot contained only produced water samples (Fig. 1), while the second plot contained all produced water samples as well as a groundwater sample (Fig. S2). For both NMDS plots, we used analysis of similarity (ANOSIM) calculations to measure possible clustering patterns. At the time of sampling, several wells were reported as frac hits by well operators (Table S2). Therefore, we specifically tested for clustering by sample origin (separator, wellhead, or groundwater), well age, and frac hit status (affected, unaffected, or groundwater). Among only the produced water samples, we found no statistically significant clustering by sample origin, well age, or frac hit status. However, when we incorporated the groundwater sample, we measured statistically significant clustering by frac hit status (R = 0.4806, P < 0.05), although there was no statistically significant clustering by sample origin or well age.
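For readers unfamiliar with the statistics behind this comparison, the sketch below computes Bray-Curtis dissimilarity and the ANOSIM R statistic using only the standard library. It is an illustration, not the authors' pipeline: the software used is not stated in this section, the permutation step that yields the P value is omitted, and the count table and group labels are hypothetical.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    total = sum(a) + sum(b)
    return sum(abs(x - y) for x, y in zip(a, b)) / total if total else 0.0


def average_ranks(values):
    """1-based ranks; tied values share the average of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def anosim_r(counts, groups):
    """ANOSIM R = (mean between-group rank - mean within-group rank) / (M / 2),

    where M is the number of sample pairs. Needs at least one pair of each kind.
    """
    pairs = list(combinations(range(len(counts)), 2))
    ranks = average_ranks([bray_curtis(counts[i], counts[j]) for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)


# Hypothetical two-taxon count table with two cleanly separated groups.
toy_counts = [[10, 0], [9, 1], [0, 10], [1, 9]]
toy_groups = ["affected", "affected", "unaffected", "unaffected"]
print(anosim_r(toy_counts, toy_groups))
```

R approaches 1 when every between-group dissimilarity exceeds every within-group dissimilarity, and sits near 0 when grouping explains nothing, which gives a scale for reading the reported R of 0.4806.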
FIG 1

Nonmetric multidimensional scaling (NMDS) plot of produced water samples with a stress value of 9.83E-05. This plot was constructed using Bray-Curtis distance calculated after sequence libraries were resampled to the depth of the sample with the fewest sequences from this experiment (1023 sequences). ANOSIM confirmed no clustering by sample origin, well age, or frac hit status and a Mantel test revealed no significant correlation between the microbial community and associated geochemical profile. Environmental vectors for geochemical parameters (Table 1) were fit onto the ordination plot, with the direction of the arrow corresponding to the direction of the gradient and the length of the vector proportional to the strength of the correlation between ordination and environmental variables. A table containing the R2 and P-values for the corresponding vectors is located in SI Table 4, with 7 of the 11 environmental variables demonstrating a statistically significant (P < 0.05) correlation.

We were also interested in understanding the relationship between the microbial community present in each sample and the associated sample geochemistry. In order to measure the strength of this relationship, we completed a Mantel test. We found no statistically significant relationship among the produced water samples; however, we found a statistically significant relationship when we incorporated the groundwater sample (R = 0.4552, P < 0.05). We also calculated and fit environmental vectors onto our two ordination plots in order to identify any specific geochemical measurements that had a statistically significant relationship with the microbial data. We found no statistically significant environmental vectors among the entire data set (Fig. S2). However, when we examined only the produced water, we found that 7 of the 11 geochemical measurements have a statistically significant correlation with the microbial data (Table S4). 
The R2 of the statistically significant correlations ranged from 0.6137 to 0.7545, with TDS having an R2 of 0.6252 (Table S4). Total dissolved solids (TDS) is a measurement of the total amount of cations and/or anions dissolved in water; it is therefore unsurprising that many of the TDS constituents also had a significant correlation with the microbial data. These include calcium, magnesium, sodium, iron, sulfate, and chloride (Table S4). In contrast, alkalinity, potassium, bromide, and pH did not have a statistically significant relationship with the microbial data (Table S4).
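The Mantel test used to relate community composition to geochemistry can be sketched as a permutation test on two distance matrices. This is a minimal standard-library illustration under stated assumptions: it uses Pearson correlation and a one-sided P value, the number of permutations and the seed are arbitrary, and the toy distance matrix is hypothetical. The source does not state which implementation or settings were used.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant input)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order=None):
    """Flatten the upper triangle, optionally after relabeling rows and columns."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p): matrix correlation and one-sided permutation P value."""
    rng = random.Random(seed)
    flat_a = upper_triangle(dist_a)
    r_observed = pearson(flat_a, upper_triangle(dist_b))
    n = len(dist_a)
    at_least_as_large = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)  # permute sample labels of the second matrix
        if pearson(flat_a, upper_triangle(dist_b, order)) >= r_observed:
            at_least_as_large += 1
    return r_observed, (at_least_as_large + 1) / (permutations + 1)


# Hypothetical samples placed on a line; distances are absolute differences.
positions = [0, 1, 3, 7, 12]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
r_value, p_value = mantel(toy_dist, toy_dist)
print(r_value, p_value)
```

Whole rows and columns are permuted together, rather than individual distances, because the entries of a distance matrix are not independent of one another.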
FIG 2

Bubble plot showing the relative abundance of the top 25 most abundant taxa found across the sampling sites.


Composition of Permian Basin produced water microbial communities.

16S rRNA gene sequencing revealed a produced water microbial community characterized by anaerobic, halophilic, and sulfur-reducing taxa (Fig. 2, Fig. S3 and S4). We found that the majority of these taxa were bacteria, with archaea only representing an average of 0.47% of the sequences across all produced water samples. In contrast, the groundwater sample was primarily composed of an archaeon from the Methanobacteriaceae family (Fig. 2, Fig. S3). This archaeon represented 55.58% of all of the sequences from the groundwater sample and was only found in one of the produced water samples, 9S, at low abundance (1.08%) (Fig. 2, Fig. S3). Our metagenomic analysis generally supported these findings, although several produced water samples are predicted to have a significantly higher abundance of archaea than that revealed by 16S rRNA gene sequencing. Specifically, metagenomic sequencing predicted a relative abundance of Archaea of up to 54% in sample 7S, 13% in sample 2S, and 30% in the 4S sample (Fig. S4). Previous work has demonstrated that members of the Halanaerobiales order are commonly found in hydraulic fracturing produced water and are often associated with thiosulfate reduction and acid production (19, 21, 22, 41). Our samples were no exception, with an average relative abundance of 33.58% Halanaerobiales across all produced water samples (Fig. 2, Fig. S3 and S4). The majority of the Halanaerobiales present across these samples were classified as either Orenia sp. or Halanaerobium spp. (Fig. 2). Orenia sp. represented 61% to 74% of the total relative abundance across samples 1S, 3S, and 5W (Fig. 2). In contrast, the majority of the Halanaerobiales taxa from samples 2S, 4S, and 6W were classified as Halanaerobium spp., representing 20% to 44% of the total relative abundance (Fig. 2). 
Although both of these genera have previously been associated with hydrocarbon environments (42), Halanaerobium is both better characterized and more commonly associated with hydraulically fractured wells (19, 21, 22, 41). Sulfur-reducing taxa are also frequently associated with produced water, especially those within the Desulfuromonadales and Desulfovibrionales orders (22, 35, 36, 42). Members of the Desulfuromonadales order were present across seven of the nine wells, although at a low abundance (Fig. 2, Fig. S3 and S4). Members of this order represented 1% of the total relative abundance across all produced water samples, with a maximum of 3.9% relative abundance in sample 10S (Fig. S3 and S4). In contrast, we found that members of the Desulfovibrionales order were especially enriched in samples 2S, 6W, and 9S, representing 23% to 38% of the total relative abundance (Fig. 2, Fig. S3 and S4). Desulfohalobium sp. constituted the majority of the Desulfovibrionales taxa for samples 2S and 6W, at a relative abundance of 19% and 25%, respectively (Fig. 2). Sample 2S also contained a significant amount of Desulfovermiculus spp., at 14% of the total relative abundance (Fig. 2). Finally, the majority of the Desulfovibrionales taxa present in sample 9S were identified as members of the Desulfovibrio genus, at a relative abundance of 20% (Fig. 2). Interestingly, members of the Desulfovibrionales order were also highly abundant within the groundwater sample and represent 39% of the total relative abundance for that sample (Fig. 2, Fig. S3). However, the Desulfovibrionales present in the groundwater sample were identified as belonging to three distinct Desulfovibrio spp., none of which were present in any of the produced water samples (Fig. 2). Several members of the Campylobacterales and Deferribacterales orders were found across most produced water samples (Fig. 2, Fig. S3). Specifically, Arcobacter sp. was found within each produced water sample, with an average relative abundance of 12% across all samples (Fig. 2). Samples 1S, 3S, 4S, and 5W contained a particularly high amount of Arcobacter sp., ranging from 17% to 41% relative abundance (Fig. 2). Arcobacter halophilus was also present in most produced water samples, although at much lower abundances (Fig. 2). Sample 10S contained the highest amount of Arcobacter halophilus, with an overall relative abundance of 7% (Fig. 2). Members of the Arcobacter genus have previously been associated with oil and gas environments, including the Permian Basin; however, their exact metabolic role in these environments is unknown (29, 42). Similarly, members of the Deferribacterales order have previously been found in produced water samples (42). We found one member of the Deferribacterales order, Flexistipes sp., present across all produced water samples with an average relative abundance of 10% (Fig. 2). Notably, samples 6W and 9W contained exceptionally high amounts of Flexistipes sp., at a relative abundance of 19% and 61%, respectively (Fig. 2). The Deferribacterales order is understudied and only one Flexistipes sp. genome has been published, from a strain isolated from the Red Sea (43). Thus, the ecological role of Flexistipes sp. in the hydrocarbon microbial community is currently unknown.

Metagenomic assembly and draft genome recovery.

After quality trimming and assembling the generated metagenomic sequences, we recovered 15 metagenome-assembled genomes (MAGs) from seven different samples representing 6 unique well sites (Tables S1 and S3). Completeness of the recovered MAGs ranged between 41.08% and 98.33% while contamination ranged between 1.45% and 9.83% (Table S3). Taxonomic annotation using the applied marker gene sets in CheckM and PhyloPythiaS allowed assignment of the recovered MAGs to the following genera: Alcanivorax, Archaeoglobus, Desulfohalobium, Geoglobus, Halanaerobium, Marinobacter, Methanohalophilus, Methanothermococcus, Orenia, Pseudomonas, and Ralstonia (Table S3). Notably, several of these genera were present at high abundances within produced water samples (Fig. 2). In particular, the Orenia and Halanaerobium MAGs appear to be especially relevant, as Orenia sp. and Halanaerobium spp. constitute the majority of the microbial community for several produced water samples (Fig. 2, and Figs. S3 and S4). Understanding the functional capacity of subsurface microorganisms is especially important, as microbial activity has been linked with corrosion, fouling, and souring events in hydraulically fractured wells (24, 32–34, 44–46). Because these processes have been linked to fermentation pathways (47), sulfur metabolism (21, 22, 48), hydrocarbon degradation (22), and biofilm formation (22), we examined the metabolic potential for each within our MAGs. We also evaluated the presence of genes involved in stress response mechanisms, as they can increase resistance to biocides used for produced water management (18).
Fermentation and methane production. In hydraulically fractured shales, subsurface microorganisms often generate energy through the fermentation of nutrients present within the shale environment, microbial produced metabolites, and/or chemical additives in injection fluid (31). 
While these fermentation products are beneficial for microbes, the presence of products such as acetate or ethanol can contribute to infrastructure corrosion (16, 19, 49–51). Using the DRAM product summary, we were able to confirm that all MAGs except for the Archaeoglobus MAG from 3S, which was only 50.25% complete, had high gene completeness over the glycolysis pathway (Fig. S5). We also used the DRAM product summary to confirm the presence of numerous genes associated with carbohydrate-active enzymes and/or short chain fatty acid (SCFA) and alcohol conversion across all MAGs (Fig. S5). We identified genes associated with mixed acid fermentation in 14 of the 15 MAGs (Fig. S6). The genes most represented across all MAGs were ack and adh, which were present in 9 and 11 of the MAGs, respectively (Fig. S6). ack encodes acetate kinase, which catalyzes the phosphorylation of acetate, while adh encodes an alcohol dehydrogenase, which transforms simple sugars into ethanol. Methane is often present within shale formations. Although most of this methane is thermogenically produced, there is evidence that some of the methane is biogenically produced by methanogens (52). Methanogens have the capacity to produce methane through acetoclastic, hydrogenotrophic, or methylotrophic methanogenesis. Three of the 15 recovered MAGs were identified as methanogens: two as Methanohalophilus and one as Methanothermococcus. As expected, the DRAM product summary confirmed a high completeness for genes associated with methanogenesis in these MAGs (Fig. S5). Two of these MAGs contain the genes for the alpha, beta, and gamma subunits of methyl coenzyme M reductase (mcr), which catalyzes the final reaction in the methanogenesis pathway and is only found in methanogens (Fig. S6). The Methanohalophilus MAG from sample 2S did not contain any of the mcr genes, although this is likely because of its completion rate (76.2%) (Fig. S6). 
Although only methanogens contain the mcr gene, many organisms are known to contain genes related to methane production. Certain members of the Archaeoglobus genus have been reported to contain genes associated with methanogenesis (53, 54). Consistent with this, the Archaeoglobus MAG recovered from produced water sample 4S contained three genes associated with acetoclastic methanogenesis: cdh, which encodes carbon monoxide dehydrogenase; and hdrA, which encodes the coenzyme required by mcr to produce methane (Fig. S6). The Archaeoglobus MAG recovered from sample 4S also contained four genes associated with hydrogenotrophic methanogenesis: ftr, mch, mtd, and mer (Fig. S6). These genes encode the enzymes responsible for converting carbon dioxide to 5-methyl-tetrahydromethanopterin, which is an important metabolic intermediate in both the acetoclastic and hydrogenotrophic methanogenesis pathways (Fig. S6).
Sulfate and thiosulfate reduction. The presence of excess sulfide in hydraulically fractured wells can lead to gas souring. Sulfide is produced by sulfate-reducing microorganisms (SRM) present in the produced water. Out of our 15 recovered MAGs, four are SRM: two Archaeoglobus and two Desulfohalobium (Fig. S6). SRM commonly reduce sulfate via the dissimilatory sulfate reduction pathway. Consistent with this, all four SRM contained genes associated with classical sulfate reduction, including genes that encode the sulfite reduction-associated complex (DsrMKJOP) and dissimilatory sulfite reductase (dsrAB) (Fig. S6). Interestingly, the Methanohalophilus MAG recovered from 2S contained genes that encode DsrMKJOP and dsrAB, although the Methanohalophilus MAG from sample 7S did not (Fig. S6). Although Methanohalophilus is not a typical sulfate reducer, recent work demonstrates that Methanohalophilus in hypersaline habitats have adapted to have unique sulfur metabolisms, including the genes that encode dsrAB (55). 
The Methanohalophilus MAG from sample 2S may be another example of an archaeal strain uniquely adapted to a high saline environment. Recent work has shown that in some hydrocarbon environments, sulfide production primarily occurs through thiosulfate reduction, rather than through classical sulfate reduction (16, 19, 21, 22, 56). Halanaerobium often dominates produced water samples and has been increasingly linked with thiosulfate-dependent sulfidogenesis (16, 19, 21, 22, 56). We recovered a Halanaerobium MAG with 73.66% completion from sample 3S. This MAG contains two sulfur transport genes as well as a rhodanese gene, although we did not detect the presence of any anaerobic sulfite reductase genes (asrA, asrB, asrC) (Fig. S6). Many of our other MAGs also contained genes associated with thiosulfate reduction. In total, 8 MAGs contain sulfur transport genes; 9 contain rhodanese or rhodanese-like genes (rdl), which are associated with the conversion of thiosulfate to adenyl sulfate; 2 contain aprAB, which encodes the protein responsible for converting thiosulfate to sulfite; and 3 contain at least one of the asr genes, which encode the protein that converts sulfite into sulfide (Fig. S6). Among the MAGs recovered from produced water samples 1S and 3S, there was at least one organism that contained the mpsT, asr, and/or dsr gene (Fig. S6). Similarly, among the MAGs recovered from produced water samples 2S, 4S, and 10S there was at least one organism that contained the mpsT and/or dsr gene (Fig. S6). Thus, the microbial community present in these sites has the functional capacity to convert thiosulfate into sulfide.

Hydrocarbon degradation. Crude oil and gas contain complex mixtures of hydrocarbons; thus, we were interested in examining hydrocarbon degradation capabilities present in our MAGs. 
We searched for the presence of 26 common hydrocarbon degradation genes associated with toluene degradation, biphenyl degradation, naphthalene degradation, catechol ortho-cleavage, and catechol meta-cleavage (Fig. S6). Overall, we found minimal presence of major hydrocarbon degradation genes in our 15 MAGs. No MAGs contained any of the selected genes associated with toluene, biphenyl, or benzoate degradation (Fig. S6). However, our Marinobacter MAG contained the gene for 2-hydroxychromene-2-carboxylate isomerase, which is associated with naphthalene degradation (Fig. S6). Nine of our MAGs contained one gene associated with either catechol ortho- or meta-cleavage (Fig. S6). The Pseudomonas MAG had two genes associated with catechol ortho-cleavage, 3-oxoadipate CoA-transferase and acetyl-CoA C-acyltransferase, and one gene associated with catechol meta-cleavage, 2-hydroxymuconate-6-semialdehyde hydrolase (Fig. S6). Overall, the Pseudomonas and Marinobacter MAGs contained the most hydrocarbon degradation-associated genes, at 3 and 2 genes total, respectively (Fig. S6).

Motility and biofilm formation. Biofilm formation during or after hydraulic fracturing can lead to major operational challenges (23, 57). Biofilm growth begins when microbes adhere to one another and/or the well infrastructure. As the biofilm grows, microbes produce extracellular polymeric substances (EPS), which aid in cell attachment and can trap sulfate and other microbial products within the biofilm or between the biofilm and the adjacent well infrastructure, leading to clogging and/or increased rates of microbially induced corrosion. Established biofilms are often dense, which makes it difficult to control them with biocidal application. We examined our MAGs for flagellin and motility genes, which are thought to be important for cellular attachment and biofilm formation (45). Most recovered MAGs contained the flg, fli, flh, and/or motAB genes (Fig. S6). 
However, the Geoglobus, Methanothermococcus, and one of the Methanohalophilus MAGs did not (Fig. S6). We also found that three MAGs contained the genes encoding sporulation two-component response regulator Spo0A, which is associated with surface attachment initiation, and that numerous MAGs contained the genes encoding a glycosyl transferase group 2 family protein (glt2) and/or diguanylate cyclase (adrA), which are associated with exopolysaccharide production (58, 59). This suggests that, under the right circumstances, there is a high potential for biofilm formation within our sample sites.

Stress response. Produced water from hydraulically fractured wells is characterized by high salt concentrations and the presence of heavy metals (60–62). Thus, microbes in produced water employ stress response mechanisms to protect the cell. Halophilic microorganisms have two strategies that assist in maintaining proper osmotic pressure within the cell (63). Organisms utilizing the “salt-in” strategy accumulate molar concentrations of potassium and chloride, while organisms employing the “osmolyte” strategy accumulate organic osmotic solutes (63). All recovered MAGs contained the trkA, trkH, and/or ktrB genes, which are associated with potassium transport and uptake (Fig. 3). Similarly, 8 recovered MAGs contained the proX, proW, and/or opuA genes, which are associated with glycine and betaine uptake (Fig. S6).
FIG 3

A, Conceptual figure demonstrating possible spatial relationships present in subsurface environments in the Permian Basin. Although the location and spatial relationships are hypothetical, the icons indicate the presence or absence of selected microorganisms with a MAG based on 16S rRNA gene sequencing data available for shown sites. B, Summary of sulfate metabolic genes present within each of the selected organisms based on the available MAG data in Fig. S6.

The presence of heavy metals can induce oxidative stress responses in nearby microorganisms. Therefore, we examined our recovered MAGs for: perR, a redox sensitive transcriptional regulator; sor, which encodes a superoxide reductase; and a gene encoding rubredoxin. We also searched for more general stress response genes including a periplasmic stress response gene (ompH), a universal stress protein (uspA), a heat shock protein (grpE), and heat shock chaperones (groES and groEL). Unsurprisingly, all our recovered MAGs contained at least one of these genes and many recovered MAGs contained several of these genes, including the Halanaerobium, Orenia, Desulfohalobium, and Marinobacter MAGs. Finally, hydraulically fractured wells are often treated with biocides in an attempt to control or reduce the growth of detrimental microbes. We examined our recovered MAGs for a wide variety of genes associated with antibiotic and biocidal resistance, including multidrug resistance transporters and multidrug resistance proteins (Fig. S6). All our MAGs except for the Archaeoglobus from site 3S (50.25% complete) contained at least one of these genes (Fig. S6). Interestingly, the Marinobacter MAG contained 8 genes including several drug resistance transporters; acriflavin, bacteriocin, and polymyxin resistance proteins; and several heavy metal resistance genes (Fig. S6).

DISCUSSION

The development of hydraulic fracturing technology has led to the extraction of oil and gas from previously impermeable shale formations. Currently, the Permian Basin is the largest oil-producing basin in the United States, with newly fractured wells having the highest oil production rate. Despite the rapid development in the Permian region, little has been done to examine the microbial life in the Permian Basin. In this study, we address this knowledge gap by using geochemical and microbiological techniques to analyze produced water from 14 samples taken at 10 unique sites in the Permian region of the Midland Basin. We also recovered 15 MAGs from seven different samples representing 6 unique well sites (Table S1 and S3) in order to connect functional capabilities with specific microorganisms present within Permian Basin produced water samples.

Hydraulically fractured wells are generally characterized by high concentrations of salt, metals, and organics, with TDS concentrations as high as 345,000 mg/L (7–9). Our geochemical analysis of produced water from the Permian Basin supported these trends. We found excess Fe2+, Ba2+, and Mg2+ in these systems, which is consistent with previous work (22, 33–37). Interestingly, we also found lower TDS values in the Midland Basin compared to those in the Bakken, Marcellus, or Barnett Shales (64, 65), with sample values ranging from 53,200 to 123,000 mg/L. We found that the produced water from our sample sites clustered into two groups: those with high TDS and low dissolved sulfate, and those with low TDS and high dissolved sulfate, relative to the median values of 107,000 mg/L (TDS) and 1,110 mg/L (sulfate). Notably, even samples with low dissolved sulfate concentrations had higher values than those found within other basins, which had concentrations <200 mg/L (65). Produced water samples with low TDS and high dissolved sulfate concentrations had geochemical profiles that resembled the groundwater sample. 
When a new well is drilled, fracturing fluid is injected into the subsurface and then a shut-in period occurs where the fracturing fluid remains pressurized in the hydraulically fractured well. After the shut-in period, the flow-back period begins and a large volume of water, which is chemically similar to the injection fluid, is generated. Over time, as flow-back ends and hydrocarbon production begins, the chemical composition of the fluid changes to higher salinity and the fluid is referred to as produced fluid. Previous work by Mouser et al. (32) demonstrates that this process generally occurs between 3 and 6 months after the shut-in period ends. Although groundwater is commonly used as a major component of fracturing fluid in this region, because all of the wells we sampled were ≥6 months old, it would be highly unlikely that the produced fluid resembled the groundwater unless there was outside interference (i.e., a frac hit). Thus, this provides evidence that injection water from newly fractured wells floods nearby older wells in the Permian Basin. This finding represents, to our knowledge, the first documentation of mature wells having a change in geochemistry due to communication from an outside well. This geochemical evidence was further supported by local operators, who reported these wells as frac hits (Table S2). The sulfate concentrations in produced fluids with reduced TDS measurements were higher than typical basin produced water sulfate levels of <200 mg/L (65), suggesting that the Permian region may have a higher potential for sulfate reduction than other regions. Furthermore, as the potential fracture fluid source and the wells impacted by communication with newly drilled wells appeared to have higher sulfate levels, this suggests any frac hit could increase microbial sulfate reduction risk even in established wells. 
Well-to-well communication events were self-reported by well operators; they do not take into account the amount of time since the initial frac hit event and were treated as single events in our categorical analysis. It is likely that there is repeated, ongoing well-to-well communication that is dependent on numerous factors including the productivity of each well and the unique shale fractures present in that geological formation. Because of these challenges, it is difficult to accurately determine if frac hits are associated with changes in the microbial community. However, we measured two correlations that suggest frac hits contribute to the structure of the microbial community. When we examined all of the microbial data, we measured statistically significant clustering by frac hit status, but not sample origin or well age. Additionally, 7 of the 11 geochemical measurements had a statistically significant correlation with the microbial community. Two of these measurements are TDS and sulfate concentration, which is notable as frac hits have a unique geochemical signature characterized by low TDS and high sulfate concentration.

Network analysis is a common tool used to identify and visualize microbial interactions occurring in specific environments. Previously, our group completed a network analysis using data from three unconventional hydrocarbon reservoirs: the Marcellus Shale, Bakken Formation, and Permian Basin (65). This analysis relied upon previously published data from our research group (22, 35, 36, 38) as well as the data presented in this manuscript. Keystone species are highly interactive species that have a disproportionately large effect in the environment. Two of the keystone species we identified across all three basins were Marinobacter and Desulfohalobium retbaense DSM 5692 (65). This is notable as three of the recovered, highly abundant MAGs in the Permian Basin included one Marinobacter and two Desulfohalobium. We found that Marinobacter sp. was present at low abundances (<10%) in all of our produced water samples except 9S (Fig. 2). Marinobacter has previously been associated with hydrocarbon environments and is often used in bioremediation efforts (66). Marinobacter has been shown to consume hydrocarbons and use them to produce and excrete EPS, which work as an effective natural surfactant (66). Although there were minimal hydrocarbon degradation genes in our MAG, it contained numerous genes associated with biofilm formation (Fig. S6). Desulfohalobium retbaense DSM 5692 was also present in all of our samples, at up to 25% relative abundance (Fig. 2). Our Desulfohalobium MAGs contained genes associated with classical sulfate reduction, including genes that encode the sulfite reduction-associated complex (DsrMKJOP) and dissimilatory sulfite reductase (dsrAB). Although significant additional investigation is required, it seems likely that Desulfohalobium and Marinobacter participate in cross-feeding with each other and/or other members of the microbial community within the Permian Basin. Cross-feeding occurs when one species of microorganism utilizes the metabolic products of another. Cross-feeding has previously been demonstrated among Halanaerobium, Geotoga, and Methanohalophilus present in the Marcellus and Utica Basins (67), although it has not been investigated in the Permian Basin. In this case, Desulfohalobium and Marinobacter may be uniquely suited to utilize sulfate and sulfate-derivatives for anaerobic respiration and biosynthesis, respectively.

Established wells typically contain lower diversity microbial communities dominated by microbiota adapted to the unique reservoir conditions (32). Consistent with this, we found that our produced water samples, which were all obtained from wells ≥6 months old, were dominated by a smaller number of halophilic, thermophilic, sulfate-reducing, fermentative, and methanogenic taxa (Fig. 2). 
Interestingly, six of our 15 recovered MAGs were found to be within the top 25 most abundant taxa found (Fig. 2). These MAGs include one Halanaerobium, two Orenia, one Marinobacter, and two Desulfohalobium. Recent work has shown that in some hydrocarbon environments there is a high abundance of members of the Halanaerobiales order (19, 21, 22, 41). Halanaerobiales have the ability to produce sulfide through thiosulfate reduction, rather than classical sulfate reduction, and are thought to be a major contributor to microbially induced corrosion (19, 21, 22, 41). Three of our highly abundant MAGs, one Halanaerobium and two Orenia, belong to this order. Halanaerobium has previously been implicated in thiosulfate reduction in hydraulically fractured wells (16, 21). Although Orenia has previously been associated with hydrocarbon environments (42), to our knowledge it has never been found in such high abundances in hydraulically fractured wells. All three of these MAGs contain genes associated with thiosulfate reduction (Fig. 3), suggesting these taxa also contribute to thiosulfate reduction in the Permian Basin. Because these organisms are closely related, it is highly possible that well-to-well communication events would foster horizontal gene transfer or introduce competing strains of bacteria (Fig. 3). Each time a frac hit event occurs, microbial biomass is potentially exchanged between the newly fractured well and the existing well. Previous research demonstrates that newly fractured wells contain a diverse assemblage of microbes that converge over time to a low diversity microbial community dominated by halophiles uniquely adapted to the reservoir conditions (32). Established wells are often routinely treated with biocides in order to minimize microbially induced biofilm formation and corrosion (68, 69); therefore, these unique adaptations may include biocide resistance genes. 
Thus, a frac hit can act as a seeding event where highly adapted, biocide-resistant halophiles from established wells are introduced to newly fractured wells. A frac hit also acts as a disruption event for the established well, which could impact the resident microbial community composition or activity. Additionally, this may offer the opportunity for horizontal gene transfer. Although well-to-well communication is a known problem within the oil and gas industry (70), there is little published evidence investigating the impact of frac hits or other well-to-well communication events. Thus, it is challenging to know the likelihood of such scenarios. However, our microbial analysis suggests that frac hits drive the structure and functional potential of the microbial community.

This work is an important investigation into the microbial ecology of unconventional oil and gas wells. It is one of the first studies to examine the microbiology of the Permian Basin and is the first study within this region to use integrated geochemical and microbial data sets. Well-to-well communication is a known problem in the oil and gas industry. However, to our knowledge, this represents the first study to provide geochemical and microbial evidence demonstrating the occurrence of frac hits. Additionally, in this study we utilize our microbial data to provide a conceptual framework for understanding how well-to-well communication may impact microbial communities in the subsurface. The Permian Basin is the largest producing oil and gas region in the United States and continues to grow. With this continued growth, frac hits will undoubtedly become more frequent. Future work should be conducted in order to investigate the impact of frac hits on both short and long time scales. Understanding how microbial communities are impacted by well-to-well communication will allow us to reduce risks that lead to biocidal resistance or increased hydrogen sulfide production.

MATERIALS AND METHODS

Sampling.

Produced water samples were collected from 10 actively producing hydraulically fractured oil and gas wells located in the Wolfcamp Formation of the Permian Basin in Texas in April 2018 (Table S1). Proximal groundwater is typically used as fracture fluid during the hydraulic fracturing process; therefore, we also collected a groundwater sample. Our sampling method was consistent with those previously reported (8, 17, 22, 23, 35, 36, 38, 44, 71). In brief, samples were collected either from a sampling port at the wellhead (if available) or the three-phase separators. The function of the separator is to separate the hydrocarbons (oil and/or gas) from the remaining fluids (produced water). In several instances sampling from the wellhead was not possible due to missing sampling ports or high gas pressures. In those cases, the separators represented the closest available sampling location. We used the GA5000 Gas Analyzer (QED Environmental Systems, Dexter, MI) to measure the gas content produced at each wellhead, although this was not possible at the separators. Produced water samples were collected in unused plastic carboys that were pre-rinsed with sample waters and allowed between 1 and 3 h to settle into distinguishable oil and water phases in the sealed container. A portion of the water phase was collected in a sterile 1 L Nalgene bottle and immediately placed on dry ice in order to preserve the sample for microbiological analysis. Upon arrival in the lab, the frozen produced water samples were stored at –20°C until processing. The remaining water phase was passed through glass wool cartridges under gravity and then a 0.45 μm inline filter (EnviroTech, Salt Lake City, UT) in a closed loop in order to minimize oxygen exposure. A portion of this filtered water was collected on-site and measured for pH using a Horiba multimeter (Horiba, Edison, New Jersey) and titrated for alkalinity using a Hach digital titrator (Hach, Loveland, CO). 
A 30-mL volume of the filtered water was acidified to 2% by volume using nitric acid and transported on ice for inductively coupled plasma optical emission spectroscopy (ICP-OES). A 15-mL volume of the filtered water was passed through a second 0.22 μm sterile polyethersulfone filter (Millipore, Inc.) and shipped on ice for ion chromatography (IC) measurement. Finally, filtered water samples were also collected with zero-headspace in Shimadzu TOC vials and transported on ice for total organic carbon (TOC) analysis.

Chemical Analysis.

Chemical analyses were performed as previously described (36). In brief, major cations and anions were measured in triplicate using ion chromatography (IC) on a ThermoFisher (ThermoFisher, Waltham, MA) ICS-5000+ with an AS11-HC column for anion quantification and a CS16 column for cation quantification. TDS was calculated by summing the total cations and anions in each sample. Trace metals were analyzed using U.S. EPA Method 6010D with inductively coupled plasma optical emission spectroscopy (ICP-OES) on an Optima 7300 DV (Perkin Elmer, Waltham, MA). Finally, 3–5 total organic carbon (TOC) replicates were analyzed with a Shimadzu TOC Analyzer (Model: TOC-LCSN) for non-purgeable organic carbon (NPOC) and total inorganic carbon (TIC).
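The TDS calculation described above is a simple sum over the measured ions. The following is a minimal sketch of that arithmetic in Python; the ion names and concentrations are hypothetical placeholders chosen for illustration, not values from this study.

```python
# Hypothetical ion concentrations (mg/L) for a single sample.
# These numbers are illustrative only and are NOT measurements from this study.
cations_mg_per_l = {"Na": 30000.0, "Ca": 2500.0, "Mg": 400.0, "K": 300.0}
anions_mg_per_l = {"Cl": 52000.0, "SO4": 1100.0, "Br": 200.0}


def total_dissolved_solids(cations, anions):
    """Return TDS (mg/L) as the sum of all cation and anion concentrations."""
    return sum(cations.values()) + sum(anions.values())


tds = total_dissolved_solids(cations_mg_per_l, anions_mg_per_l)
print(f"TDS = {tds:,.0f} mg/L")
```

In practice each ion value would be the mean of the triplicate IC measurements before summation.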

DNA Extraction.

Frozen produced water samples were thawed at 5°C before filtering through a 0.2 μm polyethersulfone membrane filter (Qiagen, Hilden, Germany) in order to collect microbial biomass. For samples that could not be filtered (Table S2), biomass was pelleted by centrifugation at 5,000 × g. The collected biomass (from filter or centrifugation) was used for DNA extraction. To optimize DNA recovery, we utilized two different extraction kits (Table S2). We initially extracted samples using a previously described (36) modified version of the DNeasy Powersoil kit (Qiagen, Hilden, Germany) with a 60 min lysozyme digestion and 30 min of bead beating. If we were unable to obtain a sufficient amount of DNA using this protocol, we completed a second extraction using the DNA/RNA All Prep kit (Qiagen, Hilden, Germany) with 10 min of homogenization at maximum speed after the addition of the cell lysis buffer. For both methods, DNA was eluted from spin filters in 80 μL nuclease-free water. Kit blanks were also concurrently extracted to confirm that no contamination occurred.

Library preparation and sequencing.

Recovered DNA from both extraction methods was amplified using universal primers targeting the V4 region of the 16S rRNA gene, as previously described (72, 73). PCR products from both extraction methods were combined and cleaned using AMPure XP beads (Beckman Coulter, Pasadena, CA), visualized on a 1% agarose gel, and quantified using Qubit (Life Technologies, Carlsbad, CA). Negative controls were also amplified, visualized, and quantified in order to confirm that no contamination occurred. Purified 16S rRNA gene libraries were pooled (with 6–8 replicates per sample), diluted to a concentration of 2 nM, and denatured using fresh 0.2 M NaOH. Libraries were further diluted according to the manufacturer's instructions and sequenced on an Illumina MiSeq (Illumina, San Diego, CA) using a 300-cycle V2 Nano kit. For several samples, extraction yielded enough DNA for metagenomic library preparation (Table S1). We used the Nextera XT library preparation kit (Illumina, San Diego, CA), which is specifically designed for low biomass samples and only required 1 ng of input template. Briefly, 1 ng of input sample DNA was tagmented with Illumina primers containing sequencing adapters and barcodes in a 12-cycle PCR step. PCR products were cleaned up using AMPure XP beads (Life Technologies, Carlsbad, CA). DNA libraries were normalized by pooling, quantified using Qubit (Life Technologies, Carlsbad, CA), and diluted to a concentration of 20–40 pM. The DNA library was denatured by heating the sample at 96°C for 2 min, cooled in an ice bath for 5 min, and sequenced using a 600-cycle V3 kit on an Illumina MiSeq sequencer (Illumina, San Diego, CA). Produced water samples are often characterized by low biomass and the presence of various inhibitor substances (34–36, 74), which make it challenging to extract sufficient amounts of high-quality DNA and prepare sequencing libraries. 
In this study, 16S rRNA gene libraries were successfully prepared and sequenced from the groundwater sample and produced waters from 9 of the 14 unique sample locations (Table S1). In addition, we were able to perform duplicate extractions and amplifications for samples 1S, 4S, and 10S. For each produced water sample, the unique number denotes the collection site, and the letter indicates the sample origin (S for separator or W for wellhead) (Table S2). Illumina MiSeq sequencing generated a total of 1,261,357 sequences, with 296,651 remaining after quality control filtering (Table S2). Samples obtained from sites 6S, 7S, 7W, 8S, and 8W did not generate enough sequencing data for downstream analysis and thus were not included going forward (Tables S1 and S2). Across the remaining samples, 16S rRNA gene sequencing generated between 1,023 and 42,008 reads per sample with an average depth of 22,813 sequences per sample. Metagenomic libraries were prepared from produced water samples for which a sufficient amount and quality of DNA could be extracted (Tables S1 and S3). We successfully prepared metagenomic libraries from samples representing 8 of the 14 unique produced water sample locations; however, we were unable to generate metagenomic sequencing data for sites 5W, 6S, 7W, 8S, 8W, 9W, or the groundwater sample (Table S1). A total of 54,352,861 metagenomic reads were obtained, with an average sample library size of 6,039,206 reads after quality control filtering (Table S3).
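The library dilution steps in this section (for example, diluting pooled 16S rRNA gene libraries to 2 nM) rest on two standard relationships: converting a fluorometric mass concentration to molarity, and C1V1 = C2V2. The sketch below illustrates both. The Qubit reading, mean fragment length, and final volume are hypothetical inputs that are not reported in this study, and the 660 g/mol per base pair figure is the commonly used average for double-stranded DNA.

```python
def ng_per_ul_to_nM(conc_ng_per_ul, mean_fragment_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_fragment_bp)


def dilution_volumes(stock_nM, target_nM, final_volume_ul):
    """Return (library_ul, diluent_ul) from C1 * V1 = C2 * V2."""
    if target_nM > stock_nM:
        raise ValueError("stock is more dilute than the requested target")
    library_ul = target_nM * final_volume_ul / stock_nM
    return library_ul, final_volume_ul - library_ul


# Hypothetical inputs: 1.32 ng/uL pooled library, 400 bp mean fragment length.
stock_nM = ng_per_ul_to_nM(1.32, 400)
library_ul, diluent_ul = dilution_volumes(stock_nM, target_nM=2.0, final_volume_ul=20.0)
print(f"stock = {stock_nM:.2f} nM; mix {library_ul:.1f} uL library "
      f"with {diluent_ul:.1f} uL diluent")
```

The mean fragment length would normally come from a gel or fragment analyzer trace, since the molarity estimate scales inversely with it.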

16S rRNA gene sequencing data analysis.

16S rRNA gene sequences were analyzed using QIIME2 version 2019.10 (75) as previously described (36). In brief, sequences were imported as EMPSingleEndSequences, demultiplexed using the demux emp-single command, and processed for quality control using DADA2 (76) with the default settings and a truncation length of 250 bp. The classify-sklearn (77) command was used to classify representative sequences identified through DADA2 using a pre-trained Naive Bayes classifier trained on Silva 132 99% OTUs (78, 79) from the 515F/806R region. Finally, any sequences identified as chloroplast or mitochondria were filtered from the data. Data generated by QIIME2 were imported into R for further analysis (80). The number of observed ASVs and Pielou’s evenness were calculated in base R (80). The Vegan package (81) was used to calculate the Chao1 index, Shannon index, and Bray–Curtis dissimilarity values. Vegan (81) was also used to complete non-metric multidimensional scaling (NMDS) analyses, fit environmental parameters onto the generated ordination plot, conduct the Analysis of Similarities (ANOSIM), and complete the Mantel test. All NMDS plots were constructed with k = 4 dimensions. NMDS, ANOSIM, and the Mantel test were completed using Bray–Curtis distance measurements calculated after sequencing libraries were subsampled to the lowest sequence depth (1,023) represented across all samples (Table S2). The Mantel test also relied on the Euclidean distance of the geochemical data, which was calculated in base R. The statistical significance of the ANOSIM, the Mantel test, and the environmental vectors was based on 999 permutations of the grouping, geochemical, and environmental data, respectively. Finally, we also used the RColorBrewer package to generate colorblind accessible palettes when necessary (82).
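The diversity measures named above have short closed-form definitions. The sketch below re-implements subsampling to a fixed depth, the Shannon index, Pielou's evenness, and Bray–Curtis dissimilarity in plain Python for illustration. It is not the R/Vegan code used in this study, and the ASV count vectors are toy data.

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Subsample a count vector without replacement to a fixed read depth."""
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    if depth > len(pool):
        raise ValueError("depth exceeds the library size")
    out = [0] * len(counts)
    for i in random.Random(seed).sample(pool, depth):
        out[i] += 1
    return out


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou(counts):
    """Pielou's evenness J = H' / ln(S), where S is the number of observed taxa."""
    observed = sum(1 for c in counts if c > 0)
    return shannon(counts) / math.log(observed) if observed > 1 else 0.0


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


# Toy ASV count vectors (hypothetical, already at equal depth).
even_sample = [10, 10, 10, 10]
dominated_sample = [40, 0, 0, 0]

print("Shannon:", shannon(even_sample), shannon(dominated_sample))
print("Pielou:", pielou(even_sample), pielou(dominated_sample))
print("Bray-Curtis:", bray_curtis(even_sample, dominated_sample))
print("Rarefied:", rarefy([50, 30, 20], depth=40))
```

Subsampling every library to the smallest depth before computing distances, as described above, keeps differences in sequencing effort from inflating between-sample dissimilarity.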

Metagenomic data analysis.

Metagenomic reads were quality trimmed (Q20, 70 bp minimum length) using CLC Genomics (Qiagen, Hilden, Germany). Taxonomy of metagenomic reads was assessed using Kaiju (83) with default parameters, while 16S rRNA genes across metagenomic reads were predicted using phyloFlash (84). Metagenomic reads were then assembled into contiguous sequences, or contigs, using SPAdes version 3.11 with the metaSPAdes workflow (85). Assembled metagenomic contigs were binned using different binning software (MaxBin2, VizBin, and PATRIC) based on tetranucleotide frequency, differential coverage, and/or marker genes (86–88) (Table S3). Final bins are referred to as metagenome assembled genomes (MAGs), and completeness and contamination of MAGs were assessed at the genus level or the next lowest available taxonomic marker set using the CheckM taxonomy_wf workflow (89). Taxonomic thresholds of bins were determined via Tetra Correlation Search (TCS) in JSpeciesWS (90). We used RNAmmer (91) to identify rRNA genes present in our MAGs and used BLAST (92) to confirm all identified genes were consistent with the taxonomy of our MAG. Recovered bins were annotated with the RASTtk (Rapid Annotations using Subsystems Technology toolkit) pipeline using the SEED database (93–95). Relevant genes associated with subsurface hydrocarbon storage were identified using a detailed literature search. In instances where specific genes or pathways are minimally discussed in the literature, we also utilized the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways database (96–98) to identify genes and pathways of interest. MAGs were also analyzed in KBase (99) with the DRAM module (100).
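Tetranucleotide frequency, one of the binning signals mentioned above, is the normalized count of every 4-mer in a contig. The sketch below shows the basic computation in Python. It is a simplified illustration and not the internal method of any of the binning tools named above, which may additionally merge reverse complements or apply other normalizations. The example sequence is a toy input.

```python
from collections import Counter
from itertools import product

# All 256 unambiguous tetranucleotides in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(sequence):
    """Return a 256-element frequency vector of 4-mers for one contig.

    Windows containing ambiguous bases (for example N) are ignored.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


# Toy contig (hypothetical); real contigs are thousands of bases long.
freqs = tetranucleotide_frequencies("ACGTACGT")
print("ACGT frequency:", freqs[KMERS.index("ACGT")])
```

Contigs from the same genome tend to have similar frequency vectors, which is why this signature can be combined with coverage and marker genes to group contigs into bins.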

Data availability.

The 16S rRNA gene sequences, metagenomes, and MAGs generated from this experiment were submitted to the NCBI Sequence Read Archive and are available under BioProject number PRJNA726570.

1.  SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.

Authors:  Anton Bankevich; Sergey Nurk; Dmitry Antipov; Alexey A Gurevich; Mikhail Dvorkin; Alexander S Kulikov; Valery M Lesin; Sergey I Nikolenko; Son Pham; Andrey D Prjibelski; Alexey V Pyshkin; Alexander V Sirotkin; Nikolay Vyahhi; Glenn Tesler; Max A Alekseyev; Pavel A Pevzner
Journal:  J Comput Biol       Date:  2012-04-16       Impact factor: 1.479

2.  Halophiles 2010: life in saline environments.

Authors:  Yanhe Ma; Erwin A Galinski; William D Grant; Aharon Oren; Antonio Ventosa
Journal:  Appl Environ Microbiol       Date:  2010-09-03       Impact factor: 4.792

Review 3.  Hydraulic fracturing offers view of microbial life in the deep terrestrial subsurface.

Authors:  Paula J Mouser; Mikayla Borton; Thomas H Darrah; Angela Hartsock; Kelly C Wrighton
Journal:  FEMS Microbiol Ecol       Date:  2016-08-08       Impact factor: 4.194

Review 4.  Bacterial diguanylate cyclases: structure, function and mechanism in exopolysaccharide biofilm development.

Authors:  Chris G Whiteley; Duu-Jong Lee
Journal:  Biotechnol Adv       Date:  2014-12-10       Impact factor: 14.227

5.  Temporal changes in microbial ecology and geochemistry in produced water from hydraulically fractured Marcellus shale gas wells.

Authors:  Maryam A Cluff; Angela Hartsock; Jean D MacRae; Kimberly Carter; Paula J Mouser
Journal:  Environ Sci Technol       Date:  2014-05-20       Impact factor: 9.028

6.  RASTtk: a modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes.

Authors:  Thomas Brettin; James J Davis; Terry Disz; Robert A Edwards; Svetlana Gerdes; Gary J Olsen; Robert Olson; Ross Overbeek; Bruce Parrello; Gordon D Pusch; Maulik Shukla; James A Thomason; Rick Stevens; Veronika Vonstein; Alice R Wattam; Fangfang Xia
Journal:  Sci Rep       Date:  2015-02-10       Impact factor: 4.379

7.  CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes.

Authors:  Donovan H Parks; Michael Imelfort; Connor T Skennerton; Philip Hugenholtz; Gene W Tyson
Journal:  Genome Res       Date:  2015-05-14       Impact factor: 9.043

8.  Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center.

Authors:  Alice R Wattam; James J Davis; Rida Assaf; Sébastien Boisvert; Thomas Brettin; Christopher Bun; Neal Conrad; Emily M Dietrich; Terry Disz; Joseph L Gabbard; Svetlana Gerdes; Christopher S Henry; Ronald W Kenyon; Dustin Machi; Chunhong Mao; Eric K Nordberg; Gary J Olsen; Daniel E Murphy-Olson; Robert Olson; Ross Overbeek; Bruce Parrello; Gordon D Pusch; Maulik Shukla; Veronika Vonstein; Andrew Warren; Fangfang Xia; Hyunseung Yoo; Rick L Stevens
Journal:  Nucleic Acids Res       Date:  2016-11-29       Impact factor: 16.971

9.  The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST).

Authors:  Ross Overbeek; Robert Olson; Gordon D Pusch; Gary J Olsen; James J Davis; Terry Disz; Robert A Edwards; Svetlana Gerdes; Bruce Parrello; Maulik Shukla; Veronika Vonstein; Alice R Wattam; Fangfang Xia; Rick Stevens
Journal:  Nucleic Acids Res       Date:  2013-11-29       Impact factor: 16.971

View more
