
Associations between Maternal Tobacco Smoke Exposure and the Cord Blood CD4+ DNA Methylome.

Caitlin G Howe, Meng Zhou, Xuting Wang, Gary S Pittman, Isabel J Thompson, Michelle R Campbell, Theresa M Bastain, Brendan H Grubbs, Muhammad T Salam, Cathrine Hoyo, Douglas A Bell, Andrew D Smith, Carrie V Breton.

Abstract

BACKGROUND: Maternal tobacco smoke exposure has been associated with altered DNA methylation. However, previous studies largely used methylation arrays, which cover a small fraction of CpGs, and focused on whole cord blood.
OBJECTIVES: The current study examined the impact of in utero exposure to maternal tobacco smoke on the cord blood CD4+ DNA methylome.
METHODS: The methylomes of 20 Hispanic white newborns (n=10 exposed to any maternal tobacco smoke in pregnancy; n=10 unexposed) from the Maternal and Child Health Study (MACHS) were profiled by whole-genome bisulfite sequencing (median coverage: [Formula: see text]). Statistical analyses were conducted using the Regression Analysis of Differential Methylation (RADMeth) program because it performs well on low-coverage data (minimizes false positives and negatives).
RESULTS: We found that 10,381 CpGs were differentially methylated by tobacco smoke exposure [neighbor-adjusted p-values that are additionally corrected for multiple testing based on the Benjamini-Hochberg method for controlling the false discovery rate (FDR) [Formula: see text]]. From these CpGs, RADMeth identified 557 differentially methylated regions (DMRs) that were overrepresented ([Formula: see text]) in important regulatory regions, including enhancers. Of nine DMRs that could be queried in a reduced representation bisulfite sequencing (RRBS) study of adult CD4+ cells ([Formula: see text] smokers; [Formula: see text] nonsmokers), four replicated ([Formula: see text]). Additionally, a CpG in the promoter of SLC7A8 (percent methylation difference: [Formula: see text] comparing exposed to unexposed) replicated ([Formula: see text]) in an EPIC (Illumina) array study of cord blood CD4+ cells ([Formula: see text] exposed to sustained maternal tobacco smoke; [Formula: see text] unexposed) and in a study of adult CD4+ cells across two platforms (EPIC: [Formula: see text] smokers; [Formula: see text] nonsmokers; 450K: [Formula: see text] smokers; [Formula: see text] nonsmokers).
CONCLUSIONS: Maternal tobacco smoke exposure in pregnancy is associated with cord blood CD4+ DNA methylation in key regulatory regions, including enhancers. While we used a method that performs well on low-coverage data, we cannot exclude the possibility that some results may be false positives. However, we identified a highly reproducible differentially methylated CpG in the amino acid transporter gene SLC7A8, which may be sensitive to cigarette smoke in both cord blood and adult CD4+ cells. https://doi.org/10.1289/EHP3398.

Year:  2019        PMID: 31039056      PMCID: PMC6785223          DOI: 10.1289/EHP3398

Source DB:  PubMed          Journal:  Environ Health Perspect        ISSN: 0091-6765            Impact factor:   9.031


Introduction

Maternal smoking during pregnancy is one of the most prevalent and modifiable risk factors affecting newborn health (Curtin and Matthews 2016). In the United States, of women smoke during pregnancy, with the highest rates occurring during the first trimester (Curtin and Matthews 2016). Prenatal exposure to tobacco smoke has been associated with adverse health outcomes for offspring, including overall fetal and infant mortality (Kleinman et al. 1988), preterm birth (Shah and Bracken 2000), altered fetal thyroid function (Meberg and Marstein 1986; Shields et al. 2009), obesity (Behl et al. 2013), childhood cancers (John et al. 1991), behavioral and cognitive effects (Clifford et al. 2012; He et al. 2017; Holz et al. 2014), and asthma (Gilliland et al. 2001).

As epigenetic modifications represent organismal flexibility in adapting to environmental stressors, epigenetic dysregulation is one hypothesized mechanism by which exposure to tobacco smoke in utero may contribute to such a diverse array of adverse health outcomes later in life. Consistent with this, a growing body of evidence, including a meta-analysis of results from 13 cohorts (Joubert et al. 2016), suggests that maternal smoking during pregnancy alters newborn DNA methylation patterns, and some of these effects have been shown to persist throughout childhood (Bauer et al. 2016; Breton et al. 2014; Lee et al. 2015; Richmond et al. 2015), and even into adolescence (Lee et al. 2015) and adulthood (Flom et al. 2011; Richmond et al. 2015).

However, most studies evaluating associations between maternal smoking during pregnancy and DNA methylation have focused on a handful of candidate genes or utilized methylation arrays, which cover a small fraction () of CpG sites in the human genome, largely selected from promoter regions. A recent study, using whole-genome bisulfite sequencing (WGBS), observed that maternal smoking during pregnancy largely impacts DNA methylation levels within enhancer regions (Bauer et al. 2016), which until recently were underrepresented on methylation arrays. Thus, the most profound effects of maternal smoking during pregnancy may occur at distal regulatory regions, which have been largely unevaluated.

Another important limitation of previous studies is that most have measured DNA methylation in whole blood or total leukocyte samples, which consist of a mixture of different cell types, each with its own distinct DNA methylation profile. Findings from these studies therefore represent methylation differences averaged across multiple cell types. These studies also may have been susceptible to confounding by cell type heterogeneity. Although the importance of accounting for major cell types has been emphasized by two recent studies, which observed differential effects of maternal smoking during pregnancy on DNA methylation in leukocyte subtypes (Bauer et al. 2016; Su et al. 2016), studies evaluating the influence of maternal smoking during pregnancy on DNA methylation patterns within specific cell types are still generally lacking.

Additionally, most previous studies of prenatal tobacco smoke exposure and DNA methylation have focused on predominantly Caucasian populations (e.g., Joubert et al. 2012; Küpers et al. 2015; Markunas et al. 2014; Richmond et al. 2015). Much less is known about the impact of in utero tobacco smoke exposure on newborn DNA methylation patterns in Hispanic populations.

The objective of the current study was therefore to investigate the impact of maternal smoking during pregnancy on the newborn methylome, using WGBS, within sorted cord blood CD4+ cells collected from a subset of Hispanic white participants enrolled in the Maternal and Child Health Study (MACHS), a Los Angeles birth cohort that was designed to identify early-life risk factors for adverse metabolic and respiratory outcomes in childhood. On average, CD4+ cells comprise 10–16% of white blood cells in cord blood. They are essential for host defense, as they coordinate immune responses to infections and malignancies by recruiting and activating other immune cells (Zhu and Paul 2008). They were selected for the current study because they are accessible in healthy newborns and relevant to tobacco smoke toxicity, as they have also been implicated in the pathogenesis of airway inflammation and asthma (Akbari et al. 2006; Ling and Luster 2016; Lloyd and Hessel 2010).

Methods

Maternal and Child Health Study

MACHS is a birth cohort in Los Angeles. Beginning in 2012, MACHS participants were recruited from the labor and delivery ward of the Los Angeles County and University of Southern California Medical Center, which serves a predominately low-income Hispanic population. Women were excluded from the study if they were HIV positive, currently incarcerated, or old. Women were also excluded if they had a multiple pregnancy or a physical, mental, or cognitive disability that would preclude their participation in the study. Written informed consent was obtained from all participants in accordance with the Health Sciences Institutional Review Board of the University of Southern California. For the current study, 10 infants with exposure to maternal smoking in utero and 10 unexposed infants born to Hispanic white women were identified in MACHS and matched on fetal gestational age, maternal diabetes status, maternal prepregnancy body mass index (BMI), and maternal age (Table 1).
Table 1

Maternal and newborn characteristics of Maternal and Child Health Study (MACHS) participants by exposure group.

Characteristic | WGBS subset: tobacco smoke exposed (n=10), mean±SD or n (%) | WGBS subset: unexposed (n=10), mean±SD or n (%) | p-Value [a] | Larger MACHS study population: all participants (n=232), mean±SD or n (%)
Matched characteristics
 Age, years | 25±6 | 27±8 | 0.62 | 27±6 [b]
 Prepregnancy BMI, kg/m² | 29.6±5.9 [c] | 28.8±5.6 | 1 | 28.7±8 [d]
 Prepregnancy diabetes | 0 (0) | 0 (0) | 1 | 13 (5.6) [e]
 Gestational age, weeks | 38±2 | 38±2 | 0.62 | 39±2 [f]
 Preterm | 2 (20) | 2 (20) | 1 | 44 (19)
Adjustment characteristics
 Baby’s sex, male | 6 (60) | 7 (70) | 1 | 115 (50)
 Mother worked during pregnancy | 7 (70) | 6 (60) | 1 | 101 (44)

Note: BMI, body mass index; MACHS, Maternal and Child Health Study; SD, standard deviation; WGBS, whole-genome bisulfite sequencing.

[a] p-value is from Wilcoxon rank-sum test (continuous variables), chi-square test (categorical variables with counts per cell), or Fisher’s exact test (categorical variables with counts per cell), comparing participants in the maternal tobacco smoke exposed group with the unexposed group.

[b] Maternal age information available for participants.

[c] One participant from the maternal tobacco smoke exposed group was missing information on maternal prepregnancy BMI.

[d] Maternal prepregnancy BMI available for participants.

[e] Maternal prepregnancy BMI available for participants.

[f] Gestational age information available for participants.


Assessing Maternal Smoking during Pregnancy

Maternal smoking during pregnancy was assessed by an interview-administered questionnaire, which was provided in either English or Spanish, depending on the primary language of the participant. Of the 10 mothers who smoked during pregnancy who were selected for this study, nine reported that they stopped smoking soon after learning that they were pregnant or at some point during the first trimester, and one reported that she had stopped smoking during the third trimester. The 10 women who did not smoke during pregnancy were lifetime nonsmokers. Although cigarette smoke exposure from other sources was not directly considered for this analysis, four of the women who reported smoking during pregnancy, and one woman who reported being a lifetime nonsmoker, shared a residence with a cigarette smoker.

Other Maternal and Newborn Characteristics

Prepregnancy BMI was calculated based on prepregnancy height and weight values obtained from maternal medical records. Gestational age and baby’s sex were acquired from pediatric medical records. Maternal age, prepregnancy diabetes status, and working status during pregnancy were determined by an interview-administered questionnaire, provided in the primary language of the participants (English or Spanish) at the time of delivery.

Cord Blood Collection, Cell Sorting, and DNA Isolation

Cord blood samples were drained into collection containers by hospital providers and then transferred by standard syringe into EDTA tubes (BD Biosciences; Catalog Number: BD-366643). Samples were stored at room temperature until they were transported to the molecular biology laboratory at the Southern California Environmental Health Sciences Center, where they were processed. The median time from cord blood collection to sample processing was 14 h, which did not differ by exposure group (). Peripheral blood mononuclear cells (PBMCs) were isolated by treating 5–10 mL of whole blood with Human Granulocyte Depletion Cocktail (STEMCELL Technologies) for 20 min. The blood was then diluted with an equal volume of EasySep™ Buffer (STEMCELL Technologies), mixed, and overlaid into a 50-mL SepMate™ tube (STEMCELL Technologies) filled with 15 mL Lymphoprep™ (STEMCELL Technologies). The tube was centrifuged at for 10 min, and the PBMC layer was transferred into a 50-mL centrifuge tube using a transfer pipette. Cells were washed with 20 mL of EasySep™ Buffer and centrifuged for 8 min at . The cell pellet was resuspended in 2 mL of EasySep™ Buffer, then transferred into a 14-mL polystyrene tube. CD14+ monocyte cells and CD4+ T cells were then separated using the EasySep™ Human CD14 Selection Kit (STEMCELL Technologies), followed by the EasySep™ Human CD4 Selection Kit (STEMCELL Technologies), according to manufacturer protocols. The CD4 selection kit typically yields 97.4–99.5% CD4+ cell content. Lysed buffy coat, CD14+, and CD4+ cells were suspended in Qiagen and stored at until DNA extraction. DNA was isolated from sorted cells using the Qiagen All Prep DNA/RNA/miRNA kit (80224; Qiagen), according to the manufacturer’s instructions. A260:A280 ratios were determined by NanoDrop and ranged between 1.80 and 1.94.

Whole-Genome Bisulfite Sequencing

Five hundred nanograms of DNA were shipped overnight on dry ice to BGI. Library preparation consisted of: a) sonication of the DNA to create 100–300 base pair fragments; b) DNA end repair, addition of 3’ adenine overhangs, and ligation of methylated sequencing adaptors; c) bisulfite treatment using the EZ DNA Methylation-Gold Kit™ (Zymo Research); and d) desalting, size selection, polymerase chain reaction (PCR) amplification (10 cycles), and an additional round of size selection. Library fragment size (250–300 base pairs) was verified by a 2100 Bioanalyzer (Agilent). Library concentration () was verified by quantitative real-time PCR. Paired-end WGBS (150 base pairs) was performed by BGI, using Illumina’s HiSeq 4000 System. Samples were multiplexed, and three samples were run per lane. The raw and processed WGBS data that were generated and analyzed for the current study are publicly available in Gene Expression Omnibus (GEO) (accession: GSE109212; Edgar et al. 2002; Barrett et al. 2013).

Read Alignment

Trim Galore! (version 0.4.2, Babraham Bioinformatics) was used to assess read quality, remove adapters from raw sequencing reads, and conduct quality trimming, using the default Phred score cutoff of 20. Trimmed reads were mapped to human assembly GRCh37, using the Wildcard ALignment Tool (WALT) (Chen et al. 2016). Eighty-four percent of the trimmed reads were mapped, and 93% of these were uniquely mapped. The MethPipe package (version 3.4.2) was used to remove duplicate reads and to calculate average CpG methylation levels and bisulfite conversion rates (Song et al. 2013). The median percent duplication rate was 19%. The median coverage for a CpG site was . Mapping rates, percent duplication rates, and coverage were similar in the exposed and unexposed groups. Median bisulfite conversion rates were 0.99 in both groups.
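As a rough illustration of the per-CpG quantities mentioned above, the sketch below computes a methylation level (methylated reads divided by total reads) and the median coverage across sites. The read counts, the dictionary layout, and the function name are hypothetical; this is not MethPipe's file format or implementation.

```python
from statistics import median
from typing import Dict, Optional, Tuple


def methylation_level(methylated: int, total: int) -> Optional[float]:
    """Fraction of reads supporting methylation; None when a site has no reads."""
    if total == 0:
        return None
    return methylated / total


# Hypothetical counts: (chromosome, position) -> (methylated reads, total reads).
counts: Dict[Tuple[str, int], Tuple[int, int]] = {
    ("chr1", 1000): (3, 4),
    ("chr1", 1050): (0, 2),
    ("chr1", 1100): (0, 0),  # uncovered site
}

levels = {site: methylation_level(m, t) for site, (m, t) in counts.items()}
median_coverage = median(total for _, total in counts.values())
```

With low-coverage data, many sites resemble the second and third entries, which is why a method that models counts rather than point estimates is attractive.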

Statistical Analysis

Differentially methylated CpG sites were identified using Regression Analysis of Differential Methylation (RADMeth) (Dolzhenko and Smith 2014), which utilizes beta-binomial regression and is part of the MethPipe pipeline (Song et al. 2013). We selected this method because it accounts for within-group variation in methylation and is therefore more appropriate for population studies, and has been shown to perform well on low-coverage data (Dolzhenko and Smith 2014). The code for this method is publicly available (https://github.com/smithlabcode/methpipe). Although there were no significant differences between the exposed and unexposed groups for any measured maternal or newborn characteristics, there was one fewer male baby and one more mother who worked during pregnancy in the tobacco smoke–exposed group (Table 1). We therefore included baby’s sex and maternal working status as covariates in regression models.

The RADMeth program calculates three sets of p-values: a) raw p-values; b) p-values adjusted for their correlations with proximal CpGs (based on a 200-base pair window), which increases the statistical power to detect differential methylation, even at low-coverage sites, as described previously (Dolzhenko and Smith 2014); and c) neighbor-adjusted p-values that are additionally corrected for multiple testing based on the Benjamini-Hochberg method for controlling the false discovery rate (FDR) (Benjamini and Hochberg 1995) (hereafter referred to as ). The genomic inflation factor () was calculated from the raw p-values. CpG sites with a were considered differentially methylated. Differentially methylated regions (DMRs) were also called using RADMeth, which joins neighboring CpG sites that were identified as differentially methylated (). Resulting DMRs were filtered for those containing at least three CpG sites with both a raw and FDR-adjusted to capture potential methylation differences at regulatory regions that are CpG poor (e.g., enhancers) in addition to those that are CpG rich (e.g., promoters) (Jones 2012).

Since MethPipe does not systematically check for outliers, we visually inspected for outliers in a subset of loci by plotting the raw methylation values for individual participants separately by exposure group. Given the low coverage of our data, we attempted to run a sensitivity analysis restricting to the 1.4 million CpGs in our dataset that had an average coverage . However, the RADMeth program was unable to run on this restricted set of CpGs, likely because the data were too sparse. We were therefore also unable to run the subsequent DMR identification step for this restricted set of data.

Annotation of genomic regions was conducted using the Goldmine package in R (version 3.5.0, R Project) (Bhasin and Ting 2016). Significant enrichment of DMRs at genomic regions of interest was determined using Fisher’s exact test. The drawGenomePool() function in the Goldmine package was used to create a length-matched pool of background sequences from GRCh37 that was the size of the query and did not a) overlap sequences in our query; or b) extend off chromosome ends or over assembly gaps (Bhasin and Ting 2016). Additionally, overlap between DMRs and FANTOM5 enhancers was examined (Andersson et al. 2014). We used a three-pronged approach and evaluated overlap with a) all known enhancers, b) enhancers expressed in T cells, and c) enhancers expressed ubiquitously across tissue types. Additionally, we examined overlap between DMRs and DNase-sensitive regions and transcription factor binding sites.
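The Benjamini-Hochberg correction applied to the neighbor-adjusted p-values can be sketched as follows. This is a generic, minimal implementation of the published procedure, not RADMeth's own code; the example p-values and the 0.05 cutoff are hypothetical, since the threshold used in the study is not recoverable from this text.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical neighbor-adjusted p-values for five CpG sites.
example = [0.001, 0.008, 0.039, 0.041, 0.60]
fdr = benjamini_hochberg(example)
CUTOFF = 0.05  # hypothetical FDR threshold, for illustration only
significant = [i for i, q in enumerate(fdr) if q < CUTOFF]
```

In this toy input, the third and fourth sites have raw values below 0.05 but adjusted values slightly above it, which illustrates why a site can pass a raw threshold and still fail the FDR threshold.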
Coordinates for DNase-sensitive regions were determined by merging three DNase-seq datasets for adult CD4+ T cells, acquired from the ENCODE Experimental Data Matrix (file accession numbers: ENCFF569GSL, ENCFF235AEA, and ENCFF907KBL, generated in the lab of John Stamatoyannopoulos, University of Washington) (ENCODE Project Consortium 2012). Overlapping features were merged using BEDTools (version 2.26.0; Quinlan and Hall 2010). Coordinates for CD4+ transcription factor binding sites were acquired from ReMap (Griffon et al. 2014). Additionally, CD4+ CTCF binding site coordinates were obtained from GEO (accession number GSE12889) (Cuddapah et al. 2009) and converted to GRCh37 using the University of California Santa Cruz Genome Browser liftOver tool (Haeussler et al. 2019).

Gene set enrichment analyses for PANTHER pathways (Mi et al. 2017) and gene ontology (GO) terms (Gene Ontology Consortium 2017) were conducted for all genes directly overlapping DMRs using the enrichR software (Kuleshov et al. 2016).

Predicted targets of microRNAs (miRNAs) overlapping DMRs were identified using three different software programs: TargetScanHuman 7.2 (Agarwal et al. 2015), miRWalk 2.0 (Dweep et al. 2014), and DIANA-microT-CDS (version 5.0; Paraskevopoulou et al. 2016). The intersection of predicted target genes identified by all three tools was then evaluated in downstream analyses. Potential enrichment of predicted target genes within PANTHER pathways (Mi et al. 2017) and GO terms (Gene Ontology Consortium 2017) was determined using enrichR (Kuleshov et al. 2016).

To identify single-nucleotide polymorphisms (SNPs) that might impact CpGs of interest, coordinates were obtained for all SNPs identified by the 1000 Genomes Project.
Since the MXL population, which represents Mexican American individuals from the Los Angeles area, is the most similar to MACHS, we identified SNPs that are common (minor allele ) in this population using VCFtools (version 0.1.16; Danecek et al. 2011), and identified overlaps with differentially methylated CpGs using BEDTools (Quinlan and Hall 2010).

Replication studies were conducted for differentially methylated CpG sites and DMRs identified in MACHS, using DNA methylation data from two different study populations. The first study, the WakeMed Smoking Epigenetics Study (SMKE), profiled DNA methylation using the Illumina Infinium MethylationEPIC array in cord blood CD4+ cells isolated with Dynabeads (Invitrogen) from a group of 30 newborns ( exposed vs. unexposed to sustained maternal tobacco smoke), who were recruited from the WakeMed hospital in Raleigh, North Carolina. The second study, Epigenetic Biomarkers of Tobacco Smoke Exposure, profiled DNA methylation in adult CD4+ cells from a subset of smokers and nonsmokers using three different methods: a) Illumina’s EPIC array ( smokers; nonsmokers), b) Illumina’s HumanMethylation450K array ( smokers; nonsmokers), and c) reduced representation bisulfite sequencing (RRBS) ( smoking; nonsmoking women). These participants were recruited at the National Institute of Environmental Health Sciences (NIEHS) Clinical Research Unit (CRU) (Wan et al. 2018). Methods for replication analyses and participant demographics for the replication studies are described in more detail in Supplemental Materials (see “Replication Look-Up Analyses” and Tables S1–S4).

Look-up analyses were conducted for DMRs identified in a previous study of maternal tobacco smoke exposure during pregnancy, which profiled DNA methylation in whole cord blood samples using WGBS (Bauer et al. 2016). Additionally, look-up analyses were conducted a) for 25 of the 26 CpGs that were identified as differentially methylated () by sustained maternal tobacco smoke exposure during pregnancy in whole cord blood in an Infinium 450K array study (Joubert et al. 2012), many of which have been widely replicated, that were also represented in the MACHS WGBS dataset; and b) for 3,258 CpGs on the 450K array that were identified as differentially methylated () in relation to both sustained and any maternal smoking in meta-analyses conducted by the Pregnancy and Childhood Epigenetics (PACE) consortium (Joubert et al. 2016), which were also represented in the MACHS WGBS dataset.
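The DMR-calling step described in this section (joining neighboring differentially methylated CpGs, then keeping regions with at least three such sites) can be illustrated with a simplified sketch. It is an assumption-laden stand-in, not RADMeth's implementation: it filters on a single adjusted p-value rather than on both raw and FDR-adjusted values, and the `Cpg` type, the coordinates, and the 0.05 cutoff are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Cpg(NamedTuple):
    chrom: str
    pos: int
    fdr: float  # FDR-adjusted p-value for this site (hypothetical values below)


def call_dmrs(sites: List[Cpg], cutoff: float,
              min_cpgs: int = 3) -> List[Tuple[str, int, int, int]]:
    """Join consecutive significant CpGs; keep runs with >= min_cpgs sites."""
    dmrs: List[Tuple[str, int, int, int]] = []
    run: List[Cpg] = []

    def flush() -> None:
        # Record the current run as (chrom, start, end, n_cpgs) if long enough.
        if len(run) >= min_cpgs:
            dmrs.append((run[0].chrom, run[0].pos, run[-1].pos, len(run)))

    for site in sorted(sites, key=lambda s: (s.chrom, s.pos)):
        is_significant = site.fdr < cutoff
        same_chrom = not run or run[-1].chrom == site.chrom
        if is_significant and same_chrom:
            run.append(site)
            continue
        # A non-significant site or a chromosome change ends the run.
        flush()
        run = [site] if is_significant else []
    flush()
    return dmrs


sites = [
    Cpg("chr1", 100, 0.01), Cpg("chr1", 120, 0.02), Cpg("chr1", 150, 0.03),
    Cpg("chr1", 200, 0.50), Cpg("chr1", 250, 0.01), Cpg("chr1", 260, 0.01),
    Cpg("chr2", 50, 0.001), Cpg("chr2", 60, 0.002), Cpg("chr2", 70, 0.003),
    Cpg("chr2", 80, 0.004),
]
dmrs = call_dmrs(sites, cutoff=0.05)
```

The two-site run on chr1 (positions 250 to 260) is dropped by the minimum-size filter, mirroring how candidate regions were narrowed before downstream analyses.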

Results

Characteristics of Study Participants

Characteristics of the subset of MACHS participants with WGBS data are shown separately by exposure group and are compared with the entire MACHS study population in Table 1. For the subset of participants with WGBS data, the maternal age at pregnancy was , and the maternal prepregnancy BMI was . There were more male (65%) than female (35%) newborns in the current study, and the gestational age was , with 20% of babies born preterm. Sixty-five percent of mothers included in the study reported that they worked during pregnancy. Measured covariates were similar between exposure groups (Table 1). The subset of participants with WGBS data was generally similar to the overall MACHS cohort, although 5.6% of MACHS participants reported having diabetes prior to pregnancy, compared with 0% in the WGBS subset, which was by design (Table 1). There were also fewer male babies and fewer mothers who reported working during pregnancy in the overall MACHS cohort (Table 1).

Differentially Methylated CpG Sites and Regions

We identified 10,381 CpG sites that were differentially methylated in cord blood between newborns with and without exposure to maternal smoking in utero () (Excel Table S1). None of the sites were statistically significant after applying a more conservative Bonferroni correction (number of ; ; smallest ). There was evidence of some genomic inflation (). Of the 10,381 differentially methylated CpGs identified, 169 (1.6%) overlapped an SNP identified as common (minor allele frequency ) in the 1000 Genomes MXL population (Excel Table S1). In our visual inspection of a subset of highly significant CpGs with moderate percentage methylation differences (10–11%), we did identify some outliers (Figure S1). From the 10,381 differentially methylated CpGs, MethPipe identified 1,533 DMRs by merging neighboring CpGs that had a . Of these DMRs, 557 contained a minimum of three CpG sites with both a raw and FDR-adjusted and were therefore retained for downstream analyses (Excel Table S2). Of these 557 DMRs, 249 DMRs were hypomethylated and 308 were hypermethylated in tobacco smoke exposed, compared with unexposed, newborns. The 10,381 differentially methylated CpGs and 557 DMRs are shown by chromosome in Figure S2. The median absolute methylation difference for the 557 DMRs was 12%, and the largest absolute methylation difference was 40% (Figure S3). These DMRs spanned from 8 to 568 base pairs, with a median length of 149 base pairs (Figure S2). Although the majority of DMRs contained fewer than 10 CpGs with both a raw and , one DMR, located in the promoter of pseudogene HSPA7, contained as many as 31. These 31 CpGs comprise a subset of 57 CpGs that were found to be differentially methylated () in the promoter region of HSPA7 in individual CpG analyses. 
These 57 CpGs represent more than half of the 106 CpGs located in the HSPA7 promoter that were represented in the MACHS dataset; all but one of these 57 CpGs were hypomethylated in the maternal tobacco smoke exposed, compared with unexposed, group. The 20 DMRs with the largest methylation differences are listed with their nearest genes in Table 2. The majority of these DMRs were hypermethylated and located within intergenic regions.
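The genomic inflation factor mentioned above is conventionally defined as the median of the observed 1-degree-of-freedom chi-square statistics divided by the median expected under the null (about 0.455). The sketch below follows that common definition; whether the authors computed it in exactly this way is an assumption, and the input p-values are hypothetical.

```python
from statistics import NormalDist, median
from typing import List


def genomic_inflation(p_values: List[float]) -> float:
    """Median observed 1-df chi-square divided by its null median.

    Each p-value must lie in (0, 1]; a p-value of exactly 0 is not supported.
    """
    norm = NormalDist()
    # A two-sided p-value maps to a squared standard normal quantile,
    # which is a chi-square statistic with one degree of freedom.
    chi2 = [norm.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    null_median = norm.inv_cdf(0.75) ** 2  # approximately 0.455
    return median(chi2) / null_median


# Hypothetical raw p-values: evenly spread values give a factor near 1.
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
lam_null = genomic_inflation(uniform_p)
# Systematically smaller p-values push the factor above 1.
lam_inflated = genomic_inflation([p / 2 for p in uniform_p])
```

A value near 1 indicates that the bulk of test statistics behaves as expected under the null, whereas values above 1 suggest inflation.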
Table 2

The twenty DMRs with the largest methylation differences.

Position | Percent methylation difference | No. of CpGs [a] | Genomic location | Nearest gene(s) | Distance from gene (base pairs) | Mean coverage per CpG in DMR
Hypermethylated
 Chr10:46775380-46775612 | 39.7 | 7 | Intergenic | GLUD1P7 | 7,033 | 0.7
 Chr17:7114926-7114953 | 39.2 | 3 | Intron | DLG4 | 0 | 1.7
 Chr2:144749232-144749290 | 36.6 | 5 | Intron | GTDC1 | 0 | 0.8
 Chr7:38308302-38308354 | 35.3 | 3 | 3’ End | TRGC2, TCRGC2, TARP, Z22690 | 0, 0, 0, 0 | 4.0
 Chr8:38533279-38533327 | 34.8 | 3 | Intergenic | TACC1 | 52,376 | 5.8
 Chr5:26725725-26725747 | 33.3 | 3 | Intergenic | CDH9 | 154,961 | 0.7
 Chr21:39004308-39004421 | 31.9 | 3 | Intron | KCNJ6 | 0 | 4.9
 Chr22:21663324-21663499 | 31.4 | 8 | Intergenic | POM121L8P | 11,308 | 1.1
 Chr15:25200647-25200780 | 31.3 | 4 | Promoter | SNRPN, SNURF | 0 | 1.4
 Chr12:58229294-58229355 | 30.8 | 3 | Intron | CTDSP2 | 0 | 7.6
 Chr10:123100131-123100324 | 30.5 | 5 | Intergenic | FGFR2 | 137,519 | 7.4
 Chr16:57343206-57343353 | 29.9 | 6 | Intergenic | TRNA_Leu | 8,731 | 3.1
 Chr17:27359874-27360049 | 29.0 | 4 | Intergenic | PIPOX | 9,868 | 5.0
 Chr4:1607004-1607368 | 28.6 | 8 | Intergenic | AX748388 | 29,037 | 3.2
Hypomethylated
 Chr6:171026627-171026778 | 39.8 | 9 | Intergenic | BC036251 | 20,591 | 1.4
 Chr5:564378-564471 | 38.5 | 5 | Intergenic | MIR4456 | 28,380 | 0.8
 Chr1:18343023-18343127 | 34.4 | 3 | Intergenic | IGSF21 | 91,112 | 8.9
 Chr4:3295539-3295663 | 30.5 | 3 | Intron | RGS12 | 0 | 4.9
 Chr9:36154747-36154804 | 29.6 | 3 | Intron | GLIPR2 | 0 | 7.6
 Chr5:177209275-177209375 | 29.4 | 4 | Promoter | FAM153A | 0 | 3.6

Note: Chr, chromosome; DMR, differentially methylated region.

[a] Number of CpGs in the DMR of interest with a raw and false discovery rate-adjusted .


Genomic Context of Differentially Methylated Regions and Enrichment in Key Regulatory Regions

Compared with a random sample of similar-sized regions in the genome, DMRs were enriched at promoter, exon, and 3’ end regions (Table 3). In contrast, there were significantly fewer DMRs within intergenic regions than would be expected by chance (Table 3). Results were similar when DMRs were evaluated separately by those that were hyper- vs. hypomethylated in the tobacco smoke exposed, compared with the unexposed, group, and median (range) percentage methylation differences were generally similar across genomic regions (Figure S4).
Table 3

Enrichment of DMRs within regions of interest.

Region | Observed | Expected | Enrichment p-Value [a]
Promoters | 52 | 15 | 3.4×10^-14
Exons | 33 | 14 | 8.0×10^-6
Introns | 237 | 229 | 0.49
3’ End | 26 | 14 | 4.1×10^-3
Intergenic | 209 | 285 | 1.3×10^-10
All enhancers | 11 | 3 | 4.8×10^-4
T-cell enhancers | 2 | 1 | 0.11
CD4+ DNase-sensitive regions (ENCODE) | 35 | 5 | <2.2×10^-16
CD4+ transcription factor binding sites (ReMap) | 34 | 6 | 1.8×10^-17
CD4+ CTCF binding sites | 11 | 2 | 3.2×10^-5

Note: Five hundred fifty-seven DMRs were identified in cord blood CD4+ cells from newborns exposed, compared with newborns unexposed, to maternal tobacco smoke. These 557 DMRs were identified from a total of 10,381 differentially methylated CpGs using the MethPipe pipeline. Differentially methylated CpGs were identified using beta-binomial regression models, which were adjusted for baby’s sex and maternal working status during pregnancy. DMR, differentially methylated region; ENCODE, Encyclopedia of DNA Elements.

[a] p-Value from Fisher’s exact test.
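For readers unfamiliar with the test behind the enrichment p-values, a two-sided Fisher's exact test on a 2×2 table can be sketched as below. The function name and the example counts are hypothetical and are not taken from Table 3; the size of the background pool used in the study is not recoverable here, so this sketch is not expected to reproduce the published values.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, col1, col2 = a + b, a + c, b + d
    n = a + b + c + d

    def weight(x: int) -> int:
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return comb(col1, x) * comb(col2, row1 - x)

    observed = weight(a)
    lowest = max(0, row1 - col2)
    highest = min(row1, col1)
    # Integer arithmetic avoids floating-point ties when comparing tables.
    extreme = sum(w for w in (weight(x) for x in range(lowest, highest + 1))
                  if w <= observed)
    return extreme / comb(n, row1)


# Hypothetical counts: rows are query regions vs. background regions,
# columns are overlapping vs. not overlapping a feature of interest.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

In an enrichment setting, the first row would hold the DMRs that do and do not overlap a feature, and the second row the corresponding counts for the length-matched background pool.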

DMRs were also significantly enriched within FANTOM5 enhancers, DNase-sensitive regions, and at transcription factor binding sites, including CTCF binding sites (Table 3). Results were similar when DMRs were evaluated separately by those that were hypermethylated vs. hypomethylated in the exposed, compared with the unexposed, group (Table S5). Of the 557 DMRs identified, 11 overlapped active enhancer regions, two of which are active in T cells. Predicted targets of these enhancer regions and descriptive information for overlapping DMRs are shown in Table S6. The two DMRs overlapping T-cell enhancers were hypomethylated among the exposed, compared with unexposed, newborns. One DMR (chr8:141,108,929-141,109,987) had a methylation difference of . The predicted target of this enhancer was the promoter region of KCNK9 () (Andersson et al. 2014; FANTOM Consortium 2014), which is imprinted in the brain (Ruf et al. 2007). This same DMR also overlapped a DNase-sensitive region (chr8:141,109,195-141,109,405) and a CTCF binding region (chr8:141,109,278-141,109,552). The second DMR (chr21:44,104,688-44,105,340) had a methylation difference of . The enhancer overlapping this DMR had two predicted targets: the promoter region of NDUFV3 (), which codes for a mitochondrial enzyme, and also ABCG1 () (Andersson et al. 2014; FANTOM Consortium 2014), which plays an important role in cellular lipid homeostasis (Kennedy et al. 2005). None of the DMRs overlapped ubiquitous enhancers.
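The enrichment p-values reported here come from Fisher's exact test. As a minimal sketch of how such a one-sided test could be computed, the function below evaluates the hypergeometric upper tail. The background counts (`total_regions`, `annotated_regions`) are hypothetical values chosen for illustration only; the source does not state how the background set was defined, and only the 557 DMRs and the 11 enhancer overlaps are taken from the text.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact (hypergeometric) p-value for enrichment.

    N: total number of regions considered (DMRs plus background regions)
    K: regions among the N that overlap the annotation (e.g., enhancers)
    n: number of DMRs
    k: number of DMRs that overlap the annotation
    Returns P(X >= k) when n regions are drawn at random from the N.
    """
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible tables contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical background (NOT from the paper): 100,000 candidate regions,
# 500 of which overlap active enhancers.
total_regions = 100_000
annotated_regions = 500
p_enrich = fisher_enrichment_p(k=11, n=557, K=annotated_regions, N=total_regions)
print(f"one-sided enrichment p-value: {p_enrich:.3g}")
```

With a different background definition the p-value would change, so the printed number should not be read as a reproduction of Table 3.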

Gene Set Enrichment Analyses

A total of 564 genes were annotated to the 557 DMRs (Excel Table S2). Of these genes, 385 directly overlapped a DMR (Excel Table S2). These genes were not significantly enriched in any PANTHER pathways or significantly associated with any GO biological processes after adjusting for multiple testing.

Predicted Targets of Differentially Methylated MicroRNA Genes

Of the 385 genes directly overlapping DMRs, four coded for miRNAs (miR-29b-2, miR4750, miR1914, and miR646HG). There were 1,065 predicted target genes identified for these miRNAs (Excel Table S3); these genes were associated with 15 GO biological process terms, which remained statistically significant after adjusting for multiple testing (Excel Table S4). Seven of these GO terms were related to the regulation of transcription or gene expression.

Replication Studies

Because we were unable to identify other WGBS studies of tobacco smoke exposure in cells, we sought to replicate our results in populations with array-based DNA methylation measures for cells. However, only 4.7% (485/10,381) of the differentially methylated CpGs identified in MACHS are covered on the Infinium EPIC array, and 3.8% (399/10,381) are covered on the Infinium 450K array. Results for the 485 differentially methylated CpGs identified in MACHS (), which are represented on the EPIC array, are shown for the WakeMed SMKE (n = 14 newborns exposed to sustained maternal smoking during pregnancy; n = 16 unexposed) and NIEHS CRU replication study ( smoking; nonsmoking adults) in Excel Table S5. The 399 differentially methylated CpGs identified in MACHS, which are represented on the 450K array, are also shown in this table for the NIEHS CRU replication study ( smoking; nonsmoking adults). Although only 60 of the 485 CpGs examined had a raw in the WakeMed SMKE study, this was a larger number than would be expected by chance (). Of these 60 CpGs, 33 were differentially methylated in the same direction in both SMKE and MACHS (Table 4). Distributions of methylation levels within each of the 33 CpGs that replicated are shown for both MACHS and SMKE in Figure S5. The replication rate for CpGs with a mean coverage (8.1%) did not differ (p-value from Fisher’s exact ) from the replication rate for CpGs with a mean coverage (6.7%). One of the CpGs that replicated (located in the promoter of SLC7A8; ) was found to be hypomethylated in the tobacco smoke exposed, compared with unexposed, group in both MACHS and the SMKE study, and was also hypomethylated in smokers compared with nonsmokers in both the EPIC and 450K NIEHS CRU studies of adult cells (Excel Table S5).
Given the reproducibility of this result, we evaluated whether any other CpGs within SLC7A8 that were identified as differentially methylated in MACHS replicated; of the 14 CpGs that could be queried, five were differentially methylated (raw ) in the same direction in at least one of the replication studies (Excel Table S5). We also evaluated whether any of the CpGs contained within the 20 DMRs with the largest percentage methylation differences (Table 2) replicated in the WakeMed SMKE study. However, none of the six CpGs that could be queried replicated (Table S7).
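The statement that 60 nominally significant CpGs out of 485 exceeds chance expectation can be illustrated with an exact binomial upper-tail calculation. This is only a sketch: the significance threshold is missing from the extracted text, so `alpha = 0.05` is an assumption, as is treating the 485 tests as independent. The source does not say which test the authors used for this comparison.

```python
from math import comb


def binomial_upper_tail(successes, trials, p_null):
    """Exact P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, i) * p_null**i * (1 - p_null) ** (trials - i)
        for i in range(successes, trials + 1)
    )


alpha = 0.05          # assumed nominal threshold (not recoverable from the text)
n_tested = 485        # MACHS CpGs that are covered on the EPIC array
n_nominal = 60        # CpGs with a nominally significant raw p-value in SMKE

expected_by_chance = n_tested * alpha
p_excess = binomial_upper_tail(n_nominal, n_tested, alpha)
print(f"expected by chance: {expected_by_chance:.2f}")
print(f"P(X >= {n_nominal}) = {p_excess:.3g}")
```

Because neighboring CpGs are correlated, the independence assumption is optimistic, and the resulting tail probability should be read as a rough guide rather than a formal result.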
Table 4

Differentially methylated CpGs identified in MACHS that were replicated in the SMKE cord blood study (n = 14 exposed; n = 16 unexposed).

MACHS WGBS CpG | Percent methylation difference (MACHS) | Mean coverage (MACHS) | WakeMed EPIC CpG | Percent methylation difference (SMKE) | Raw p-value (SMKE) | Nearest gene | Distance (base pair)
Chr1:211200510.84.9cg020667160.10.05PRKCZ0
Chr1:1081139765.07.2cg249509181.90.02VAV30
Chr1:1725827305.55.2cg158748710.90.04SUCO1,756
Chr3:1948543418.911.1cg117962191.11.6×103XXYLT10
Chr5:845772013.97.3cg183710526.07.3×103LINC0222642
Chr6:332555896.16.5cg007832440.70.05WDR460
Chr6:15547564139.99.5cg051041890.40.04TIAM20
Chr7:6372213.95.1cg148717711.90.02PRKAR1B0
Chr7:702587259.17.0cg02327773a0.51.7×103AUTS2839
Chr7:7025875212.88.1cg185429670.50.03AUTS2866
Chr8:258600718.79.6cg022104418.15.3×103LOC10192781549
Chr8:2317928217.48.5cg120606698.50.04LOXL20
Chr8:7428293024.89.8cg209820462.57.0×103RDH10-AS114,233
Chr9:184938877.18.7cg135528320.70.03ADAMTSL10
Chr10:822961908.65.7cg096233773.90.02SH2D4B1,467
Chr10:10542050020.111.3cg190072696.83.3×103SH3PXD2A0
Chr11:736256658.35.5cg086650761.60.02PAAF10
Chr12:31310314.33.3cg201882120.30.05SLC6A120
Chr12:1333434277.79.9cg259398534.33.2×103GOLGA2,067
Chr14:236237279.48.9cg081045682.40.03SLC7A80
Chr16:8868949319.64.9cg010183600.50.04ZC3H180
Chr17:886733711.510.7cg253424095.20.02PIK3R50
Chr17:7603736316.37.7cg077876147.60.04TNRC6C0
Chr18:774873930.65.0cg138178223.40.03CTDP10
Chr18:34127815.68.1cg044317310.40.04TGIF10
Chr19:61343212.83.9cg00587228a6.20.01HCN20
Chr19:4691577516.94.6cg173752673.90.04CCDC80
Chr21:4410526419.94.7cg235902736.30.02PDE9A0
Chr21:4410547323.24.9cg167849858.57.4×103PDE9A0
Chr22:425483554.34.4cg170508072.20.02TCF207,663
ChrX:4868518117.13.4cg0951399613.24.6×103ERAS0
ChrX:4868530212.05.1cg226562757.60.02ERAS0
ChrX:1340487116.23.5cg241380843.40.01MOSPD10

Note: CpGs were considered replicated if the raw p-value in the SMKE study was and the percent methylation difference was in the same direction as that observed in MACHS. The exposed group consisted of 14 newborns whose mothers smoked during pregnancy in SMKE. The unexposed group consisted of 16 newborns whose mothers were lifetime nonsmokers. DNA methylation was measured in cord blood cells by the Infinium EPIC array for the SMKE study. Chr, chromosome; MACHS, Maternal and Child Health Study; WGBS, whole-genome bisulfite sequencing.

(a) These CpGs (cg02327773 and cg00587228) have previously been found to be polymorphic or to be targeted by a probe with a single-nucleotide polymorphism in the single base pair extension (Chen et al. 2013).

Results for the nine DMRs that were identified in MACHS that could be queried in the NIEHS CRU RRBS study of adult cells ( smoking; nonsmoking women) are shown in Table 5. Four of these DMRs (shown in Figure 1) were significantly differentially methylated () in the same direction in both studies (Table 5).
Table 5

DMR replication results.

MACHS WGBS DMR | MACHS percent methylation difference | NIEHS CRU RRBS DMR | NIEHS CRU percent methylation difference | Raw p-value | Nearest gene | Distance (base pair)
Chr1:6852075-685226014.4Chr1:6852171-68522051.80.03CAMTA10
Chr1:54701016-5470122113.3Chr1:54701218-547013822.90.01SSBP30
Chr2:237476491-23747669219.1Chr2:237476624-2374764215.60.01CXCR70
Chr6:2764191-276438411.1Chr6:2764287-27645574.10.02WRNIP11,281
Chr7:150498649-15049869413.9Chr7:150498243-1504986946.70.01TMEM176A0
Chr9:72027018-7202728117.8Chr9:72026918-7202713010.70.04APBA115,167
Chr11:47952753-479529647.8Chr11:47952767-479528022.60.01PTPRJ49,145
Chr12:7781003-778109917.1Chr12:7780979-778105219.70.02APOBEC120,896
Chr19:47273681-47278026.7Chr19:47273536-472737933.60.02SLC1A54,337

Note: Chr, chromosome; DMR, differentially methylated region; MACHS, Maternal and Child Health Study; NIEHS CRU, National Institute of Environmental Health Sciences Clinical Research Unit; RRBS, reduced representation bisulfite sequencing; WGBS, whole-genome bisulfite sequencing.

Nine DMRs identified in MACHS could be queried in a reduced representation bisulfite sequencing study conducted in cells from a subset of women from the NIEHS CRU study. DMRs identified in MACHS were considered replicated if a) they overlapped a DMR identified in the NIEHS CRU study; b) the overlapping DMR had a ; and c) the percent methylation difference for the overlapping DMR was in the same direction as the DMR identified in MACHS.
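The three replication criteria in this note (coordinate overlap, a significant p-value, and a concordant direction of effect) can be written as a small predicate. The sketch below is illustrative only: the `Dmr` class, the coordinates, the signed differences, and the `alpha = 0.05` threshold are all hypothetical, since the threshold and the signs of the methylation differences did not survive in the extracted text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dmr:
    """A differentially methylated region (hypothetical container type)."""
    chrom: str
    start: int
    end: int
    meth_diff: float                 # percent methylation difference, exposed minus unexposed
    p_value: Optional[float] = None  # only needed for the replication-study DMR


def overlaps(a: Dmr, b: Dmr) -> bool:
    """Criterion a: the two regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def is_replicated(discovery: Dmr, candidate: Dmr, alpha: float = 0.05) -> bool:
    """Apply criteria a (overlap), b (p-value below alpha), and c (same sign)."""
    significant = candidate.p_value is not None and candidate.p_value < alpha
    same_direction = discovery.meth_diff * candidate.meth_diff > 0
    return overlaps(discovery, candidate) and significant and same_direction


# Made-up regions for illustration; these are not rows of Table 5.
discovery = Dmr("chr1", 1_000_100, 1_000_300, meth_diff=-12.0)
concordant = Dmr("chr1", 1_000_250, 1_000_400, meth_diff=-3.0, p_value=0.02)
discordant = Dmr("chr1", 1_000_250, 1_000_400, meth_diff=4.0, p_value=0.02)

print(is_replicated(discovery, concordant))  # overlap, p < alpha, same sign
print(is_replicated(discovery, discordant))  # fails the direction criterion
```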

Figure 1.

The four differentially methylated regions (DMRs) identified in the Maternal and Child Health Study (MACHS), which replicated in the National Institute of Environmental Health Sciences Clinical Research Unit (NIEHS CRU) reduced representation bisulfite sequencing study of adult cells, are depicted in (A–D). DMRs were considered replicated if a) the chromosome coordinates overlapped between the two studies; b) the p-value for the NIEHS CRU study DMR was ; and c) the percent methylation difference was in the same direction. Coordinates for each DMR and the percent methylation difference (comparing maternal tobacco smoke exposed with unexposed newborns) are indicated at the top of each panel. The location of each DMR is indicated by a vertical red line in the ideogram of its respective chromosome. Plots show the percent methylation levels (y-axis) for each MACHS participant at each CpG contained in the DMR. Chromosome positions for CpGs are indicated in gray (x-axis) above the plot. Percent methylation levels for participants in the exposed group are shown as light blue dots, and mean percent methylation levels for each CpG are connected by light blue lines. Percent methylation levels for participants in the unexposed group are shown as dark blue dots, and mean percent methylation levels for each CpG are connected by dark blue lines. Vertical gray bars highlight CpGs that were identified as differentially methylated with both a raw and a after accounting for correlations with neighboring CpGs and multiple testing, using the Benjamini-Hochberg method to control the false discovery rate.


Comparisons with Previous Studies of Maternal Tobacco Smoke Exposure and DNA Methylation

A previous study by Bauer et al. examined associations between maternal smoking during pregnancy and whole cord blood DNA methylation, using WGBS (Bauer et al. 2016). Using a smoothing method, they identified 8,409 DMRs, with an FDR of 12.4%. Only six of the 557 DMRs (1.1%) identified in MACHS overlapped those identified by Bauer et al. (Table S8). The nearest genes to these six DMRs included TRNA_VAL, RIN3, MIR646HG, ZNF890P, APBA1, ANK1, and NKX6-3 (the latter two genes were associated with one DMR, located on chromosome 8). Two of these DMRs (one near TRNA_VAL and one near APBA1) were differentially methylated in the same direction (both were hypermethylated in the exposed, compared with unexposed, group), and one of these DMRs (chr9:72,027,018-72,027,281) was also identified in the NIEHS CRU RRBS replication study of cells from smoking and nonsmoking women. Across all three studies, this DMR was found to be hypermethylated in the tobacco smoke exposed, compared with unexposed, group (percent methylation differences: 17.8% in MACHS, 10.8% in the WGBS whole cord blood study by Bauer et al., and 10.7% in the NIEHS CRU RRBS study of adult cells).

We also conducted a look-up analysis for 26 CpGs identified as differentially methylated () in cord blood by sustained maternal tobacco smoke exposure in a previous 450K array study, as many of these results have been widely replicated (Joubert et al. 2012). Twenty-five of the 26 CpGs were represented in the MACHS WGBS results and could therefore be queried. Sequencing coverage at these CpGs ranged from . None of the 25 CpGs were found to be differentially methylated in MACHS (raw p-value and ), although the directions of associations were similar for several sites (Table 6). For example, all four of the CpGs that have previously been identified as differentially methylated in AHRR were differentially methylated in the same direction in MACHS. To determine whether the inconsistencies between the Joubert et al. study and MACHS were due to the differences in the cell types examined, we conducted a look-up analysis for the same CpGs in the WakeMed SMKE EPIC array study of cord blood cells. Of the 26 differentially methylated CpGs identified by Joubert et al., 23 are represented on the EPIC array; of these, 11 (47.8%) were significantly () differentially methylated in the same direction in the SMKE study. This included one CpG that could not be examined in MACHS. For the 10 CpGs that were differentially methylated in SMKE but not MACHS, the median coverage was . This did not differ () from the median coverage () of the 12 CpGs that were not differentially methylated in either SMKE or MACHS.
Table 6

Look-up results for 26 CpGs identified as differentially methylated by maternal tobacco smoke exposure by Joubert et al. (2012) in MACHS and the SMKE study.

CpG | Position | Gene | MoBa (450K) percent meth. diff. | MoBa p-value | NEST (450K) percent meth. diff. | NEST p-value | MACHS (WGBS) percent meth. diff. | MACHS p-value | MACHS pFDR (a) | MACHS ave. coverage | WakeMed SMKE (EPIC) percent meth. diff. | WakeMed SMKE p-value
cg10399789Chr1:92945668GFI13.71.1×10103.40.022.60.440.445.90.50.44
cg09662411Chr1:92946132GFI16.63.0×10176.82.3×1036.80.260.783.31.60.61
cg06338710Chr1:92946187GFI15.85.0×10145.22.6×1034.30.570.664.00.70.39
cg18146737Chr1:92946700GFI112.33.3×102515.13.7×1030.80.660.644.31.50.03
cg12876356Chr1:92946825GFI111.91.7×102510.51.5×1030.020.150.435.12.10.01
cg18316974Chr1:92947035GFI17.13.2×10209.02.3×1034.30.500.543.90.60.12
cg09935388bChr1:92947588GFI113.72.7×10317.51.2×103NANANANA5.95.0×104
cg14179389Chr1:92947961GFI18.62.6×10258.24.3×1033.40.430.744.110.71.9×103
cg23067299cChr5:323907AHRR3.24.12×1093.73.6×10310.30.070.638.1NANA
cg03991871cChr5:368447AHRR2.22.0×10102.30.076.90.520.868.2NANA
cg05575921Chr5:373378AHRR7.58.0×10337.73.0×1041.40.79>0.997.34.32.1×108
cg21161138Chr5:399360AHRR2.38.9×10101.73.6×1030.050.99>0.999.33.20.01
cg11715943Chr6:33091841HLA-DPB21.83.6×1080.30.801.70.690.967.81.20.11
cg19089201Chr7:45002287MYO1G1.49.1×10112.29.2×1030.60.710.433.50.40.05
cg22132788cChr7:45002486MYO1G2.84.8×10182.19.6×1034.30.810.422.6NANA
cg04180046Chr7:45002736MYO1G5.32.9×10194.92.7×1035.10.800.691.92.70.01
cg12803068Chr7:45002919MYO1G8.31.3×10193.84.1×1030.00.970.923.50.70.03
cg04598670Chr7:68697651ENSG000002257183.01.3×1095.33.6×1030.30.96>0.9910.30.50.89
cg25949550Chr7:145814306CNTNAP21.81.0×10262.52.5×1031.10.730.6110.61.56.7×103
cg03346806Chr8:119157879EXT11.59.3×1080.10.125.80.210.816.90.60.17
cg18655025Chr14:91008005TTC7B1.26.8×1081.10.276.00.210.829.00.10.24
cg05549655Chr15:75019143CYP1A13.52.4×10103.86.0×1041.9>0.990.907.63.96.9×103
cg22549041Chr15:75019251CYP1A17.28.9×1098.94.4×1038.70.070.987.84.90.41
cg11924019Chr15:75019283CYP1A13.24.8×1085.38.0×1042.00.570.988.21.30.68
cg18092474Chr15:75019302CYP1A15.910.0×1095.34.4×1034.50.370.997.95.10.22
cg12477880Chr21:36259241RUNX14.67.6×10104.00.093.20.210.988.11.60.21

Note: Twenty-six CpGs have previously been identified as differentially methylated (after adjusting for multiple testing using the Bonferroni correction) in whole cord blood by maternal tobacco smoke exposure in two study populations (MoBa and NEST), evaluated by Joubert et al. 2012, and have been widely replicated in subsequent studies. Look-up analyses were therefore conducted for these CpGs in both MACHS and the SMKE study. Percent methylation differences and raw p-values are presented for each CpG by study. False discovery rate-adjusted p-values are also presented for MACHS. Chr, chromosome; FDR, false discovery rate; MACHS, Maternal and Child Health Study; MoBa, Norwegian Mother and Child Cohort Study; NEST, Newborn Epigenetics Study; WGBS, whole-genome bisulfite sequencing.

(a) Raw p-values from RADMeth beta-binomial regression models were adjusted for correlations with neighboring CpGs, as described previously (Dolzhenko and Smith 2014), and were subsequently adjusted for multiple testing using the Benjamini-Hochberg method to control the false discovery rate.

(b) Since all reads in the exposed and unexposed group were methylated in MACHS, RADMeth did not calculate a p-value for this CpG (cg09935388).

(c) These CpGs (cg23067299, cg03991871, and cg22132788) are not represented on the Infinium EPIC array and therefore could not be examined in the SMKE study.
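The Benjamini-Hochberg step mentioned in the table notes can be sketched as follows. This shows only the standard step-up adjustment applied to a list of p-values; the preceding neighbor-combination step performed by RADMeth is not reproduced here, and the input values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative p-values only (not taken from Table 6).
example = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(example))
```

A CpG would then be called differentially methylated when its adjusted value falls below the chosen FDR threshold.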

We also conducted a look-up analysis for CpGs that were identified as differentially methylated () in whole cord blood in relation to both sustained and any maternal smoking during pregnancy in meta-analyses conducted by the PACE consortium (Joubert et al. 2016). Of the 3,272 differentially methylated CpGs () identified by PACE, 3,258 were represented in the MACHS WGBS dataset and could therefore be evaluated (Excel Table S6). Of these CpGs, 213 were differentially methylated in MACHS based on a raw , and 267 were differentially methylated after adjusting for correlations with proximal CpGs (). These numbers are higher than would be expected by chance ( and , respectively). Of the 213 CpGs with a raw , 123 were differentially methylated in the same direction in MACHS. Results did not differ for CpGs with an average coverage vs. ().

Discussion

Although earlier studies have evaluated the impact of maternal smoking during pregnancy on DNA methylation patterns, the majority used candidate gene approaches or methylation arrays, which cover a small fraction () of the CpG sites in the human genome, and have traditionally focused on promoter regions. Using WGBS with coverage as an alternative approach, we were able to interrogate CpG sites, including those located distally from genes, and identified 557 regions that were differentially methylated by exposure to maternal smoking in utero. While we observed enrichment of DMRs within promoter regions, we also observed enrichment in enhancer regions, which have been largely uninvestigated, and in DNase-sensitive regions and transcription factor binding sites, including binding sites for CTCF, which are methylation sensitive and regulate gene expression via alterations to the three-dimensional structure of the genome (Zuin et al. 2014). To our knowledge, only one previous study, conducted by Bauer et al. (2016), has utilized WGBS to investigate the effects of maternal tobacco smoke exposure during pregnancy on the newborn methylome. Similar to our findings, Bauer et al. observed significant enrichment of DMRs within enhancer regions. However, only six of the DMRs identified by Bauer et al. overlapped those identified by MACHS. Since the smoothing method used by Bauer et al. has been shown to yield similar results as the method used in the current study (MethPipe) (Dolzhenko and Smith 2014), other study differences likely contributed to these discrepancies. Two potential explanations include the fact that Bauer et al. evaluated sustained maternal smoking during pregnancy and measured DNA methylation in whole blood, similar to most previous studies (Breton et al. 2014; de Vocht et al. 2015; Ivorra et al. 2015; Joubert et al. 2012; Ladd-Acosta et al. 2016; Markunas et al. 2014; Richmond et al. 2015; Rotroff et al. 2016; Rzehak et al. 
2016), whereas MACHS evaluated any maternal smoking during pregnancy and focused on cells. We also conducted a look-up analysis of 3,258 CpGs that were identified as differentially methylated in cord blood in relation to both any and sustained maternal smoking during pregnancy in meta-analyses conducted by the PACE consortium (Joubert et al. 2016). While a larger number of these sites were differentially methylated in MACHS than would be expected by chance, only 123 (3.8%) were differentially methylated in the same direction with a raw . To determine if the MACHS results were more reproducible in studies focusing on the same cell type, we conducted a series of replication analyses using data from studies that measured DNA methylation in cells from newborns or adults. We were able to query nine of the DMRs identified in MACHS in an RRBS study of cells from smoking and nonsmoking women (NIEHS CRU), four (44%) of which were significantly differentially methylated in the same direction. Of these four DMRs, one (chr9:72,027,018-72,027,281) was also identified in the whole cord blood WGBS study by Bauer et al. (2016), which suggests that this region may be sensitive to tobacco smoke exposure in multiple blood cell types. In contrast with the DMR replication results, very few of the individual CpGs that were identified as differentially methylated in MACHS that could be queried were replicated. For example, of the 485 differentially methylated CpGs identified in MACHS that could be evaluated in an EPIC array study (SMKE) of cord blood cells, only 33 (6.8%) replicated. This suggests that a large fraction of the differentially methylated CpGs identified in MACHS that are represented on the EPIC array may be false positives, although it is possible that differences in participant characteristics, such as the different racial/ethnic compositions of the two study populations, and assay differences may also play a role. 
Although replication rates were low for the individual CpG results, one CpG (chr14:23,623,727) was identified as hypomethylated among maternal tobacco smoke exposed, compared with unexposed, newborns in both MACHS and the SMKE study, and was also found to be hypomethylated in smokers, compared with nonsmokers, in the NIEHS CRU study of adult cells, which was confirmed across two different platforms (i.e., the EPIC array and the 450K array). This result is therefore highly reproducible and suggests that this CpG, located in the promoter of amino acid transporter SLC7A8 (Ren et al. 2017), may be sensitive to tobacco smoke exposure in cells, irrespective of life stage. Since a subset of CpGs on the Infinium 450K array have been consistently identified as differentially methylated by sustained maternal smoking during pregnancy in whole cord blood (Joubert et al. 2012), we were interested in evaluating whether these CpGs are also differentially methylated in cells. While some of these loci (e.g., all CpGs within AHRR) were differentially methylated in the same direction in MACHS, none were statistically significant. To determine if these discrepancies were driven by the different cell types examined, we conducted the same look-up analysis in the WakeMed SMKE EPIC array study of cord blood cells. Of the 23 CpGs that could be queried in SMKE, 11 (48%) were significantly differentially methylated in the same direction as that observed by Joubert et al., which suggests that about half of these CpGs are sensitive to tobacco smoke in cells, but the other half may not be. The latter result is not surprising, as there is evidence that exposure to maternal tobacco smoke in utero alters methylation patterns differently in certain leukocyte subpopulations (Bauer et al. 2016; Su et al. 2016). While it is currently unclear why the differentially methylated CpGs identified by Joubert et al. 
that replicated in SMKE were not also differentially methylated in MACHS, two possible explanations include the low coverage of the WGBS data, which may have reduced our statistical power, and the weaker exposure variable, as most of the MACHS mothers who smoked quit early in pregnancy. Our study has some limitations that need to be considered. First, since MACHS participants were recruited at delivery, maternal samples were not collected during pregnancy, so exposure status could not be confirmed using maternal cotinine measures. However, women are more likely to underreport, rather than falsely report, that they smoked during pregnancy (Gorber et al. 2009), and there is no reason to believe that this would occur differentially by newborn methylation status. Therefore, any potential exposure misclassification would have likely biased results toward the null. Another important consideration is that 40% of the women who reported smoking during pregnancy resided in a household with another smoker. However, previous studies have observed that the effects of maternal smoking on newborn DNA methylation are much stronger than those of other secondhand smoke exposures (Bouwland-Both et al. 2015; Richmond et al. 2015). An additional limitation is that there may have been some confounding by acculturation and/or ethnic substructure, as this information was not collected for MACHS participants, but both have been related to smoking behaviors (Detjen et al. 2007; Kondo et al. 2016). Importantly, a genomic inflation factor () of 1.25 was observed for the current study, suggesting possible population stratification. We therefore cannot rule out the possibility that there may have been some genetic differences between groups, particularly given our small sample size. This is important as certain SNPs can destroy a CpG site and thus impact methylation at that locus. 
However, only a small fraction (1.6%) of the differentially methylated CpGs identified in MACHS overlapped common SNPs identified in a similar population (i.e., MXL of 1000 Genomes), so this likely had a minimal impact on our results. While we focused our study on sorted cells to minimize potential confounding by cell type heterogeneity, it is possible that maternal smoking during pregnancy may alter subpopulations, which could influence our results. Furthermore, we cannot necessarily generalize our findings to tissues other than cells. Another important limitation is that time from cord blood collection until sample processing varied between individuals. However, this did not differ by exposure status. While we observed significant enrichment of differentially methylated CpGs and DMRs within key regulatory elements, we did not measure gene expression and therefore cannot make definitive conclusions about the functional relevance of our findings. Follow-up studies will therefore be important to evaluate the impact of altered methylation levels at the sensitive regions identified in MACHS on related histone modifications and gene expression. Finally, it is possible that some of the observed associations between exposure to maternal smoking in utero and DNA methylation are driven by alterations in 5-hydroxymethycytosine and related modifications, as these marks are indistinguishable from 5-methylcytosine after bisulfite conversion (Huang et al. 2010). For the current study, we used a method (RADMeth) that has been shown to perform well on very low–coverage WGBS data within a case–control context (Dolzhenko and Smith 2014). 
Nevertheless, we cannot rule out the possibility that the low coverage of the MACHS WGBS data may have still contributed to some false positives, both for the individual CpG and DMR results, as RADmeth’s performance on low-coverage data has not been evaluated in the context of environmental exposures, which have generally been associated with smaller methylation differences. However, while the replication rate for the individual CpG results was very low (6.8%), this did not differ by sequencing coverage, which suggests that other factors likely contributed to the poor replication rate. Since MethPipe does not systematically evaluate outliers, it is possible that outliers may have contributed to some false positives in our results. It is also possible that the modest coverage and the small sample size of the current study may have contributed to some false negatives. For example, none of the CpGs identified as differentially methylated by Joubert et al. (2012) that have been widely replicated in studies of whole cord blood were significantly differentially methylated in MACHS. While some of these discrepancies may have been driven by differences in the cell types examined, a subset of these CpGs were found to be differentially methylated in the WakeMed study of cord blood cells, which suggests that other factors, such as the low coverage of our data or the weaker maternal smoking exposure in MACHS, may have contributed to the null results. Future WGBS studies with higher coverage, which focus on specific cell types, are greatly needed to address some of these limitations. In particular, down-sampling simulations using higher coverage datasets are needed to determine the average sequencing coverage and sample sizes required to minimize false positives and false negatives in studies investigating environmental impacts on the DNA methylome. Despite these limitations, our study also had several unique strengths. 
By using WGBS, we were able to interrogate millions of CpG sites that have been understudied in relation to maternal tobacco smoke exposure, including CpGs within critical regulatory regions, such as enhancers, which tend to be underrepresented in methylation arrays. Since we looked at a much more homogeneous cell population than has previously been investigated, our findings are also less subject to potential confounding by cell type heterogeneity. Furthermore, since CD4+ cells have been implicated in the development of tobacco-related illnesses, such as asthma, they represent an ideal tissue to evaluate in the context of tobacco smoke exposure (Akbari et al. 2006; Ling and Luster 2016; Lloyd and Hessel 2010). A novel aspect of the current study was its focus on Hispanic white individuals, a group that has been largely understudied in this context. An additional strength of our study was the matched design. In addition to focusing on Hispanic white participants to reduce potential confounding by race or ethnicity, we matched on several important confounders, including maternal age, gestational age, maternal diabetes status, and maternal BMI, to try to isolate the effects of exposure to maternal smoking in utero on the newborn methylome. Another major strength of our study was the inclusion of replication studies that also profiled DNA methylation in CD4+ cells, which confirmed four regions and 33 CpGs that were identified as differentially methylated in MACHS. Finally, since MACHS was designed to identify early-life risk factors for adverse outcomes in childhood, regions found to be differentially methylated by maternal tobacco smoke exposure in the current study can be examined in relation to respiratory and other health outcomes in future studies.
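
The down-sampling simulations proposed among the limitations above could start from something like the following minimal sketch. It is an illustration only, not part of RADMeth or MethPipe: the function names, the (methylated reads, total reads) representation of each CpG, and the example counts are all assumptions made here. Each read at a CpG is kept independently with a fixed probability, which mimics sequencing the same library at a lower depth.

```python
import random


def downsample_counts(meth, total, fraction, rng):
    """Thin one CpG's reads, keeping each read with probability `fraction`.

    `meth` is the number of methylated reads and `total` the read depth.
    Returns the thinned (methylated, total) pair. Hypothetical helper.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    if not 0 <= meth <= total:
        raise ValueError("need 0 <= meth <= total")
    # Methylated and unmethylated reads are thinned separately so that the
    # expected methylation level is unchanged while the depth shrinks.
    kept_meth = sum(rng.random() < fraction for _ in range(meth))
    kept_unmeth = sum(rng.random() < fraction for _ in range(total - meth))
    return kept_meth, kept_meth + kept_unmeth


def downsample_sample(cpgs, fraction, seed=0):
    """Apply the same thinning fraction to every CpG of one sample."""
    rng = random.Random(seed)  # seeded so a simulation run is reproducible
    return [downsample_counts(m, t, fraction, rng) for m, t in cpgs]


def mean_coverage(cpgs):
    """Average read depth across CpGs (0.0 for an empty sample)."""
    return sum(t for _, t in cpgs) / len(cpgs) if cpgs else 0.0


if __name__ == "__main__":
    # Made-up counts for three CpGs in a hypothetical high-coverage sample.
    high = [(18, 30), (3, 28), (25, 32)]
    low = downsample_sample(high, fraction=0.1, seed=1)
    print("mean coverage before:", mean_coverage(high))
    print("mean coverage after: ", mean_coverage(low))
```

In a full simulation, the thinned counts for all samples would be passed through the same differential methylation pipeline at several fractions and sample sizes, and the calls compared against those from the full-depth data to estimate false positive and false negative rates.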

Conclusions

Our findings contribute to growing evidence that exposure to maternal smoking in utero alters DNA methylation patterns in newborns, and present the first genome-wide description of the impact of maternal tobacco smoke exposure during pregnancy on the cord blood CD4+ DNA methylome. In particular, our results suggest that maternal tobacco smoke exposure during pregnancy alters DNA methylation levels within coding regions and in important regulatory regions, such as enhancers, which have been largely uninvestigated. Although we used a WGBS pipeline that performs well on low-coverage data, our results should be interpreted with some caution, as it is still possible that the low coverage of the data contributed to false positives. Nevertheless, a subset of the results that could be queried replicated in several external study populations. For example, four of the nine DMRs identified in MACHS that could be examined in a study of adult CD4+ cells were confirmed, and one differentially methylated CpG, located in the promoter region of SLC7A8, was validated in two different studies of CD4+ cells (one conducted in newborns and the other in adults), which suggests that these loci are sensitive to tobacco smoke exposure in CD4+ cells and that these effects may span different life stages.
References (first 10 of 62 shown)

1.  A systematic review and meta-analysis of prospective studies on the association between maternal cigarette smoking and preterm delivery.

Authors:  N R Shah; M B Bracken
Journal:  Am J Obstet Gynecol       Date:  2000-02       Impact factor: 8.661

Review 2.  Functions of T cells in asthma: more than just T(H)2 cells.

Authors:  Clare M Lloyd; Edith M Hessel
Journal:  Nat Rev Immunol       Date:  2010-11-09       Impact factor: 53.106

3.  CD4+ invariant T-cell-receptor+ natural killer T cells in bronchial asthma.

Authors:  Omid Akbari; John L Faul; Elisabeth G Hoyte; Gerald J Berry; Jan Wahlström; Mitchell Kronenberg; Rosemarie H DeKruyff; Dale T Umetsu
Journal:  N Engl J Med       Date:  2006-03-16       Impact factor: 91.245

4.  Prenatal exposure to parents' smoking and childhood cancer.

Authors:  E M John; D A Savitz; D P Sandler
Journal:  Am J Epidemiol       Date:  1991-01-15       Impact factor: 4.897

5.  An atlas of active enhancers across human cell types and tissues.

Authors:  Robin Andersson; Claudia Gebhard; Michael Rehli; Albin Sandelin; Irene Miguel-Escalada; Ilka Hoof; Jette Bornholdt; Mette Boyd; Yun Chen; Xiaobei Zhao; Christian Schmidl; Takahiro Suzuki; Evgenia Ntini; Erik Arner; Eivind Valen; Kang Li; Lucia Schwarzfischer; Dagmar Glatz; Johanna Raithel; Berit Lilje; Nicolas Rapin; Frederik Otzen Bagger; Mette Jørgensen; Peter Refsing Andersen; Nicolas Bertin; Owen Rackham; A Maxwell Burroughs; J Kenneth Baillie; Yuri Ishizu; Yuri Shimizu; Erina Furuhata; Shiori Maeda; Yutaka Negishi; Christopher J Mungall; Terrence F Meehan; Timo Lassmann; Masayoshi Itoh; Hideya Kawaji; Naoto Kondo; Jun Kawai; Andreas Lennartsson; Carsten O Daub; Peter Heutink; David A Hume; Torben Heick Jensen; Harukazu Suzuki; Yoshihide Hayashizaki; Ferenc Müller; Alistair R R Forrest; Piero Carninci
Journal:  Nature       Date:  2014-03-27       Impact factor: 49.962

6.  A promoter-level mammalian expression atlas.

Authors:  Alistair R R Forrest; Hideya Kawaji; Michael Rehli; J Kenneth Baillie; Michiel J L de Hoon; Vanja Haberle; Timo Lassmann; Ivan V Kulakovskiy; Marina Lizio; Masayoshi Itoh; Robin Andersson; Christopher J Mungall; Terrence F Meehan; Sebastian Schmeier; Nicolas Bertin; Mette Jørgensen; Emmanuel Dimont; Erik Arner; Christian Schmidl; Ulf Schaefer; Yulia A Medvedeva; Charles Plessy; Morana Vitezic; Jessica Severin; Colin A Semple; Yuri Ishizu; Robert S Young; Margherita Francescatto; Intikhab Alam; Davide Albanese; Gabriel M Altschuler; Takahiro Arakawa; John A C Archer; Peter Arner; Magda Babina; Sarah Rennie; Piotr J Balwierz; Anthony G Beckhouse; Swati Pradhan-Bhatt; Judith A Blake; Antje Blumenthal; Beatrice Bodega; Alessandro Bonetti; James Briggs; Frank Brombacher; A Maxwell Burroughs; Andrea Califano; Carlo V Cannistraci; Daniel Carbajo; Yun Chen; Marco Chierici; Yari Ciani; Hans C Clevers; Emiliano Dalla; Carrie A Davis; Michael Detmar; Alexander D Diehl; Taeko Dohi; Finn Drabløs; Albert S B Edge; Matthias Edinger; Karl Ekwall; Mitsuhiro Endoh; Hideki Enomoto; Michela Fagiolini; Lynsey Fairbairn; Hai Fang; Mary C Farach-Carson; Geoffrey J Faulkner; Alexander V Favorov; Malcolm E Fisher; Martin C Frith; Rie Fujita; Shiro Fukuda; Cesare Furlanello; Masaaki Furino; Jun-ichi Furusawa; Teunis B Geijtenbeek; Andrew P Gibson; Thomas Gingeras; Daniel Goldowitz; Julian Gough; Sven Guhl; Reto Guler; Stefano Gustincich; Thomas J Ha; Masahide Hamaguchi; Mitsuko Hara; Matthias Harbers; Jayson Harshbarger; Akira Hasegawa; Yuki Hasegawa; Takehiro Hashimoto; Meenhard Herlyn; Kelly J Hitchens; Shannan J Ho Sui; Oliver M Hofmann; Ilka Hoof; Furni Hori; Lukasz Huminiecki; Kei Iida; Tomokatsu Ikawa; Boris R Jankovic; Hui Jia; Anagha Joshi; Giuseppe Jurman; Bogumil Kaczkowski; Chieko Kai; Kaoru Kaida; Ai Kaiho; Kazuhiro Kajiyama; Mutsumi Kanamori-Katayama; Artem S Kasianov; Takeya Kasukawa; Shintaro Katayama; Sachi Kato; Shuji Kawaguchi; Hiroshi Kawamoto; Yuki I Kawamura; 
Tsugumi Kawashima; Judith S Kempfle; Tony J Kenna; Juha Kere; Levon M Khachigian; Toshio Kitamura; S Peter Klinken; Alan J Knox; Miki Kojima; Soichi Kojima; Naoto Kondo; Haruhiko Koseki; Shigeo Koyasu; Sarah Krampitz; Atsutaka Kubosaki; Andrew T Kwon; Jeroen F J Laros; Weonju Lee; Andreas Lennartsson; Kang Li; Berit Lilje; Leonard Lipovich; Alan Mackay-Sim; Ri-ichiroh Manabe; Jessica C Mar; Benoit Marchand; Anthony Mathelier; Niklas Mejhert; Alison Meynert; Yosuke Mizuno; David A de Lima Morais; Hiromasa Morikawa; Mitsuru Morimoto; Kazuyo Moro; Efthymios Motakis; Hozumi Motohashi; Christine L Mummery; Mitsuyoshi Murata; Sayaka Nagao-Sato; Yutaka Nakachi; Fumio Nakahara; Toshiyuki Nakamura; Yukio Nakamura; Kenichi Nakazato; Erik van Nimwegen; Noriko Ninomiya; Hiromi Nishiyori; Shohei Noma; Shohei Noma; Tadasuke Noazaki; Soichi Ogishima; Naganari Ohkura; Hiroko Ohimiya; Hiroshi Ohno; Mitsuhiro Ohshima; Mariko Okada-Hatakeyama; Yasushi Okazaki; Valerio Orlando; Dmitry A Ovchinnikov; Arnab Pain; Robert Passier; Margaret Patrikakis; Helena Persson; Silvano Piazza; James G D Prendergast; Owen J L Rackham; Jordan A Ramilowski; Mamoon Rashid; Timothy Ravasi; Patrizia Rizzu; Marco Roncador; Sugata Roy; Morten B Rye; Eri Saijyo; Antti Sajantila; Akiko Saka; Shimon Sakaguchi; Mizuho Sakai; Hiroki Sato; Suzana Savvi; Alka Saxena; Claudio Schneider; Erik A Schultes; Gundula G Schulze-Tanzil; Anita Schwegmann; Thierry Sengstag; Guojun Sheng; Hisashi Shimoji; Yishai Shimoni; Jay W Shin; Christophe Simon; Daisuke Sugiyama; Takaai Sugiyama; Masanori Suzuki; Naoko Suzuki; Rolf K Swoboda; Peter A C 't Hoen; Michihira Tagami; Naoko Takahashi; Jun Takai; Hiroshi Tanaka; Hideki Tatsukawa; Zuotian Tatum; Mark Thompson; Hiroo Toyodo; Tetsuro Toyoda; Elvind Valen; Marc van de Wetering; Linda M van den Berg; Roberto Verado; Dipti Vijayan; Ilya E Vorontsov; Wyeth W Wasserman; Shoko Watanabe; Christine A Wells; Louise N Winteringham; Ernst Wolvetang; Emily J Wood; Yoko Yamaguchi; Masayuki 
Yamamoto; Misako Yoneda; Yohei Yonekura; Shigehiro Yoshida; Susan E Zabierowski; Peter G Zhang; Xiaobei Zhao; Silvia Zucchelli; Kim M Summers; Harukazu Suzuki; Carsten O Daub; Jun Kawai; Peter Heutink; Winston Hide; Tom C Freeman; Boris Lenhard; Vladimir B Bajic; Martin S Taylor; Vsevolod J Makeev; Albin Sandelin; David A Hume; Piero Carninci; Yoshihide Hayashizaki
Journal:  Nature       Date:  2014-03-27       Impact factor: 49.962

7.  Tobacco smoking differently influences cell types of the innate and adaptive immune system-indications from CpG site methylation.

Authors:  Mario Bauer; Beate Fink; Loreen Thürmann; Markus Eszlinger; Gunda Herberth; Irina Lehmann
Journal:  Clin Epigenetics       Date:  2016-08-03       Impact factor: 6.551

8.  Distinct Epigenetic Effects of Tobacco Smoking in Whole Blood and among Leukocyte Subtypes.

Authors:  Dan Su; Xuting Wang; Michelle R Campbell; Devin K Porter; Gary S Pittman; Brian D Bennett; Ma Wan; Neal A Englert; Christopher L Crowl; Ryan N Gimple; Kelly N Adamski; Zhiqing Huang; Susan K Murphy; Douglas A Bell
Journal:  PLoS One       Date:  2016-12-09       Impact factor: 3.240

9.  450K epigenome-wide scan identifies differential DNA methylation in newborns related to maternal smoking during pregnancy.

Authors:  Bonnie R Joubert; Siri E Håberg; Roy M Nilsen; Xuting Wang; Stein E Vollset; Susan K Murphy; Zhiqing Huang; Cathrine Hoyo; Øivind Midttun; Lea A Cupul-Uicab; Per M Ueland; Michael C Wu; Wenche Nystad; Douglas A Bell; Shyamal D Peddada; Stephanie J London
Journal:  Environ Health Perspect       Date:  2012-07-31       Impact factor: 9.031

10.  Assessment of Offspring DNA Methylation across the Lifecourse Associated with Prenatal Maternal Smoking Using Bayesian Mixture Modelling.

Authors:  Frank de Vocht; Andrew J Simpkin; Rebecca C Richmond; Caroline Relton; Kate Tilling
Journal:  Int J Environ Res Public Health       Date:  2015-11-13       Impact factor: 3.390

