Literature DB >> 33842847

Identification of methylation changes associated with positive and negative growth deviance in Gambian infants using a targeted methyl sequencing approach of genomic DNA.

Claire R Quilter1,2, Kerry M Harvey1, Julien Bauer1, Benjamin M Skinner1,3, Maria Gomez1,4, Manu Shrivastava1,5, Andrew M Doel6,7, Saikou Drammeh7, David B Dunger8, Sophie E Moore6,7, Ken K Ong8,9,10, Andrew M Prentice7, Robin M Bernstein11,12, Carole A Sargent1, Nabeel A Affara1.   

Abstract

Low birthweight and reduced height gain during infancy (stunting) may arise at least in part from adverse early life environments that trigger epigenetic reprogramming that may favor survival. We examined differential DNA methylation patterns using targeted methyl sequencing of regions regulating gene activity in groups of rural Gambian infants: (a) low and high birthweight (DNA from cord blood (n = 16 and n = 20, respectively), from placental trophoblast tissue (n = 21 and n = 20, respectively), and DNA from peripheral blood collected from infants at 12 months of age (n = 23 and n = 17, respectively)), and, (b) the top 10% showing rapid postnatal length gain (high, n = 20) and the bottom 10% showing slow postnatal length gain (low, n = 20) based on z score change between birth and 12 months of age (LAZ) (DNA from peripheral blood collected from infants at 12 months of age). Using BiSeq analysis to identify significant methylation marks, for birthweight, four differentially methylated regions (DMRs) were identified in trophoblast DNA, compared to 68 DMRs in cord blood DNA, and 54 DMRs in 12-month peripheral blood DNA. Twenty-five DMRs were observed to be associated with high and low length for age (LAZ) at 12 months. With the exception of five loci (associated with two different genes), there was no overlap between these groups of methylation marks. Of the 194 CpG methylation marks contained within DMRs, 106 were located to defined gene regulatory elements (promoters, CTCF-binding sites, transcription factor-binding sites, and enhancers), 58 to gene bodies (introns or exons), and 30 to intergenic DNA. Distinct methylation patterns associated with birthweight between comparison groups were observed in DNA collected at birth (at the end of intrauterine growth window) compared to those established by 12 months (near the infancy/childhood growth transition). The longitudinal differences in methylation patterns may arise from methylation adjustments, changes in cellular composition of blood or both that continue during the critical postnatal growth period, and in response to early nutritional and infectious environmental exposures with impacts on growth and longer-term health outcomes.
© 2020 The Authors. FASEB BioAdvances published by the Federation of American Societies for Experimental Biology.

Entities:  

Keywords:  DNA methylation; birthweight; environmental exposures; stunting

Year:  2021        PMID: 33842847      PMCID: PMC8019263          DOI: 10.1096/fba.2020-00101

Source DB:  PubMed          Journal:  FASEB Bioadv        ISSN: 2573-9832


Cis‐acting quantitative trait methylation Cis‐acting methylation quantitative trait locus CCCTC‐binding factor Database for Annotation, Visualization, and Integrated Discovery Differentially Methylated Region European Bioinformatics Institute Epigenome‐Wide Association Study False Discovery Rate Genetic Association Database Genome‐Wide Association Study Length for Age Z score Minor Allele Frequency Medical Research Council National Center for Bioinformatics Technology Online Mendelian Inheritance in Man Principal Components Analysis Small for Gestation Age Single Nucleotide Polymorphism Trans‐acting methylation quantitative trait locus Transcription Start Site Uterine Growth Restriction

INTRODUCTION

About 45% of global deaths in children under 5 years of age are thought to be related to undernutrition. Children who survive early periods of undernutrition may suffer longer‐term consequences, including stunting and other developmental deficits, which are major contributors to long‐term morbidity and mortality. , Although the prevalence of stunting declined in sub‐Saharan Africa from 42% in 1990 to 32% in 2015, the numbers of affected individuals increased from 47 million to 58 million. Studies estimate that 20% of growth retardation starts in utero where under‐nutrition in pregnancy increases the risks of intrauterine growth retardation (IUGR) and small for gestation age (SGA) infants, preterm delivery and long‐term impaired immunity. It is hypothesized that an adverse early life environment and nutrition induce phenotypic adaptations through developmental plasticity to favor survival in the short term, but at the expense of lifelong effects on health. , Nutritional interventions to improve child growth and adult health , have had limited success, primarily for the lack of a clear understanding of optimal timing, target groups, and the composition of supplements. The period of growth and development from conception to a child's second birthday (coined the first 1000 days) is one of the most critical windows of opportunity for interventions. There is a complex interplay between an individual's genetic constitution and the environment. Responses to extrinsic factors via modifications to the epigenome (which may include to both chromatin‐associated proteins and DNA bases) in the first 1000 days are believed to be important in establishing protective adaptations against the impact of under‐nutrition and an adverse environment (thrifty phenotype). , DNA methylation at CpG couplets is one of the most actively studied modifications to the epigenome. A large meta‐analysis of multiple epigenome‐wide association studies (EWAS) by the Childhood Epigenetics Consortium found methylation at 914 CpG sites associated with birthweight in whole blood DNA from healthy neonates, but <1.3% persisted in children (2–13 years), <0.1% in adolescents (16–18 years), and none in adults (30–45 years). The current study uses samples and data from a cohort of Gambian mother–infant pairs exhibiting high rates of maternal and child under‐nutrition. Rural Gambian infants are small at birth relative to international standards, show positive growth patterns during the first few months of life and then, enter a period of reduced growth marked by profound faltering until at least 24 months of age. , Schoenbuchner et al. have suggested that stunting is an extreme adaptation to profound faltering episodes potentially arising from a complex interaction of malnutrition, infection, and disease. Despite four decades of nutrition‐sensitive and nutrition‐specific interventions halving under‐nutrition for young children from rural Gambia, substantial (30%) growth faltering remains, indicating a gap in our understanding of its complex etiology. Epigenetic studies carried out on Gambian populations have highlighted the importance of maternal nutrition and exposures and the effects of maternal nutritional supplementation in this highly seasonal environment. Many aspects of health and behavior in rural Gambia are influenced by the annual seasonality with a single rainy “hungry” season (late June–October) followed by a dry “harvest” season (November–May/June). , Specifically, there is evidence that seasonal variation in nutrition during the periconceptional period influences methylation status in postnatal infants at a number of loci, is related to methyl‐donor nutrient content of the mother's diet , , , and may be associated with an increase in both preterm and SGA infants. Periconceptional nutrition supplementation influences methylation changes in cord and postnatal infant blood DNA at CpG loci linked to genes associated with infection and immunity and alters the methylation at imprinted loci. Maternal exposure to aflatoxin B1 is also associated with DNA methylation changes at specific loci in Gambian infants. The aim of the present study was to identify epigenetic marks that are established during the critical first 1000 days in a cohort of rural Gambian infants and explore how these may be associated with normal versus stunted growth outcomes in order to determine whether any targets for intervention are associated with prenatal and/or postnatal periods of epigenetic modification. We used a targeted methyl sequencing approach of genomic DNA from placental trophoblast tissue, cord, and infant (12 months of age) blood to identify the methylation changes. These changes may be useful as biomarkers, highlighting genes influenced by exposures during embryonic and fetal development and early infancy, and identifying potential pathways through which these may influence the growth outcomes at birth and in the first year of life.

MATERIALS AND METHODS

Samples

The study was conducted among pregnant women and their infants living in the rural West Kiang region of The Gambia. Participants were recruited as part of the HERO‐G (Hormonal Regulators of Growth) study. The study cohort was 238 newborns whose growth had been assessed longitudinally to 24 months of age. Table 1 summarizes data associated with the samples from individuals used in this study. The full HERO‐G protocol is described elsewhere. Placentas from women who delivered at home were collected by trained field workers and immediately transported on ice to the nearby Medical Research Council (MRC) Unit The Gambia Keneba laboratory (within 20–30 minutes) and carefully processed to obtain trophoblast material following a standard protocol (see placenta sample collection protocol in Data S1). Placental samples each of 400 mg were taken at four different evenly spaced locations, at least 2 cm from the edge, and at consistent relative positions in each placenta to mitigate placental tissue heterogeneity. Samples were cut into four pieces, placed in RNAlater at a volume of 5 x tissue weight (Cat No 76106, Qiagen), and transported frozen on dry‐ice to the United Kingdom for DNA extraction. After extraction samples from each of the four placental regions were pooled equimolarly. Cord blood and infant blood samples were collected into EDTA‐lined tubes (BD Vacutainer, pink top) for DNA extraction in the United Kingdom. Ethical approval for the study was given by the joint Gambia Government/Medical Research Council (MRC) Unit The Gambia Ethics Committee (SCC 1313v3), with additional approval from the University of Colorado Institutional Research Board (protocol number 13–0441). Community approval was obtained from each participating village, and written, informed consent was obtained from each participating family. Samples for analysis were selected retrospectively from the study cohort representing (a) the highest 20% and lowest 20% birthweights and (b) according to the top and bottom 10% change in length‐for‐age (LAZ) from birth to 12 months. For the 12‐month samples the male average age = 376.4 days, SD 9 days (366–409 d) and females average age = 378.8 days, SD 10 days (367–413 d). Table 2 summarizes the number of samples analyzed after quality testing for each tissue and test group and those that are common between groups.
TABLE 1

Summary of Individual‐Specific Data of Those Included in the Study

Mat IDMat AgeGA (wks)Parity (Cat)S_MOCS_MOBBW (kg)BW CatBL (cm)LAZ Change V1–12 mLAZ CatTissue
MALES
20.2736PrimiparousWW1.7low45.00−0.45lowPl, CB, 12mB, 12mH
35.1938.1MultiparousDD2.14low44.431.55highPl, 12mB, 12mH,
33.3238.1MultiparousDD2.32low47.000.53midPl, CB, 12mB
38.6138.6MultiparousWD2.38low45.50−0.39lowPl, CB, 12mB, 12mH
23.637PrimiparousWD2.44low47.800.42midPl, CB, 12mB
23.3237.8MultiparousWD2.48low45.900midPl
26.3340.7MultiparousDD2.51low49.002.14highCB,12mB, 12mH
2739.9MultiparousWW2.53low49.001.7highPl, CB, 12mB, 12mH
24.3938.9MultiparousDD2.56low47.400.99highPl, CB, 12mB
18.6539.4PrimiparousDW2.59low50.400.5midPl, CB, 12mB
31.4941.2MultiparousWD2.59low42.30−0.49lowPl, 12mB, 12mH
20.6738.9PrimiparousDD2.67low46.701.8high12mB, 12mH
31.1640.7MultiparousDD2.7low48.401.21high12mB, 12mH
45WD2.91mid0.97high12mH
38.541MultiparousDW2.92mid49.00−0.91low12mH
40.8239.7MultiparousDW2.96mid50.101.96high12mH
28.6940.7MultiparousDD3.06mid49.33−0.43low12mH
41.2740.7MultiparousDD3.07mid52.00−0.75low12mH
32.8239.9MultiparousWD3.13mid53.00−0.66low12mH
27.1141.2MultiparousDD3.26high51.50−0.49low12mH
23.0439.4MultiparousWD3.26high51.23−0.91low12mH
29.1538.6MultiparousDD3.26high48.001.52high12mB, 12mH
3738.1MultiparousDD3.27high48.101.37high12mB, 12mH
37.5338.1MultiparousWD3.28high49.301.08highPl, CB, 12mB, 12mH
25.6440.2MultiparousDW3.34high51.000.95midPl, CB, 12mB
34.4640.4MultiparousWW3.34high48.300.19midPl, CB, 12mB
20.2940.2MultiparousWW3.36high53.47−1.01lowPl, CB, 12mB, 12mH
37.1839.1MultiparousDD3.36high50.501.82highPl, CB, 12mB, 12mH
22.0741MultiparousDW3.37high48.470midPl, CB
28.9140.2MultiparousDD3.39high50.500midPl, CB
23.5138.9MultiparousDW3.45high50.471.02highPl
31.4840.7MultiparousDD3.5high50.000.08mid12mB
37.9641.2MultiparousDD3.52high50.401.7high12mB, 12mH
40.3639.1MultiparousWD3.55high49.500.6midPl
41.8841MultiparousDW3.59high51.00−0.33midPl, CB, 12mB
39.3340.7MultiparousDD3.72high52.50−0.93lowPl, CB
31.1941.8MultiparousDD3.79high50.77−0.28midPl, CB, 12mB
39.6940.2MultiparousDW3.8high53.00−1.35lowPl, CB, 12mB, 12mH
39.6541MultiparousDW3.9high50.000.82midPl, CB, 12mB
36.61MultiparousDW1.31high12mH
FEMALES
18.7239.4PrimiparousDW2.42low46.001.12highPl, 12mB, 12mH,
19.0439.9PrimiparousWW2.46low46.000.34midPl, CB, 12mB
29.8939.1MultiparousDD2.48low50.000.54midPl, CB
21.1638.9MultiparousDD2.48low45.501.86high12mB, 12mH
23.0138.7MultiparousDD2.5low47.000.11mid12mB
39.7238.6MultiparousDW2.53low46.40−0.28midPl, 12mB
22.9838.3MultiparousDD2.54low49.971.14highPl, CB, 12mB, 12mH
38.3839.7MultiparousWD2.56low47.370.05midPl, CB, 12mB
29.6540.2MultiparousWD2.58low45.000midPl, CB
26.9238.1MultiparousWD2.61low46.500.65midPl, CB, 12mB
41.6938.1MultiparousDD2.63low48.500.9midPl, 12mB
40.4937MultiparousWW2.64low47.201.15highPl, CB, 12mB, 12mH
19.1938.1PrimiparousWW2.67low46.50−4.44lowPl, CB, 12mB
38.5638.9MultiparousDW2.69low45.671.44high12mH
32.9940.7MultiparousDD2.73low46.27−0.39low12mH
23.9539.9MultiparousWW2.96mid50.27−0.47low12mH
22.0342MultiparousDD2.96mid49.401.67high12mH
37.2239.4MultiparousWD2.99mid49.50−0.46low12mH
26.0838.3MultiparousDD3mid46.00−0.46low12mH
36.241.2MultiparousDD3.07mid50.07−0.89low12mH
38.0740.4MultiparousDD3.21mid50.27−0.76low12mH
34.838.9MultiparousWD3.25high53.201.1highCB
35.440.4MultiparousDD3.29high50.500.78midCB
36.7840.2MultiparousDW3.31high49.000.07midPl
32.4339.7MultiparousWD3.33high51.000.81midPl, CB
24.139.9MultiparousWD3.33high50.200midCB
34.339.4MultiparousWD3.33high50.00−0.67lowCB
34.2541MultiparousWD3.37high45.17−1.39lowPl, CB, 12mB, 12mH
39.5440.2MultiparousWD3.42high49.80−1.11low12mB, 12mH
27.2739.9MultiparousDD3.75high53.00−0.17midPl, CB
38.3840.7MultiparousDD3.84high47.500.97highPl, CB, 12mB
27.1241MultiparousDD3.97high52.00−0.87lowPl, 12mB, 12mH
27.07MultiparousDW1.1high12mH

Summary of individual‐specific data for subjects contributing to study. Key: Mat Age, Maternal age; GA, Gestational age; S_MOC, Season of month of conception; S_MOB, Season of month of birth; BW, Birthweight; LAZ, Length for age Z score change birth to 12 months; BL, Birth length; D, Dry season; W, Wet season; pl, Placenta; CB, Cord blood; 12 mB, 12‐month blood sample selected on birthweight; and 12 mH, 12‐month blood sample selected on LAZ score. The samples categorized as high or low for both birthweight and length for age were those used in the analysis. Table 2 shows the numbers and sex for each tissue.

TABLE 2

Summary of DNA Samples Analyzed in the Study

DNA extraction
Low BWHigh BWTotal
Placenta
Male101424
Female11617
Both2120 41
Cord bloods
Male81220
Female8816
Both1620 36

Numbers of DNA samples analyzed for each tissue according to sex and test group and the number of subjects in common between tissues and test groups. Key: BW, Birthweight; LAZ, Length for age Z score.

Summary of Individual‐Specific Data of Those Included in the Study Summary of individual‐specific data for subjects contributing to study. Key: Mat Age, Maternal age; GA, Gestational age; S_MOC, Season of month of conception; S_MOB, Season of month of birth; BW, Birthweight; LAZ, Length for age Z score change birth to 12 months; BL, Birth length; D, Dry season; W, Wet season; pl, Placenta; CB, Cord blood; 12 mB, 12‐month blood sample selected on birthweight; and 12 mH, 12‐month blood sample selected on LAZ score. The samples categorized as high or low for both birthweight and length for age were those used in the analysis. Table 2 shows the numbers and sex for each tissue. Summary of DNA Samples Analyzed in the Study Numbers of DNA samples analyzed for each tissue according to sex and test group and the number of subjects in common between tissues and test groups. Key: BW, Birthweight; LAZ, Length for age Z score.

Nucleic acid extraction

DNA for DNA methylation studies was extracted from tissues using the Quick‐DNA Mini Prep Plus kit (Cat No. D4068, Zymo Research). DNA extracted from blood followed the Biological Fluids and Cells protocol and DNA extracted from placenta followed the Solid Tissue protocol. DNA abundance and quality were determined after extraction using a Nanodrop ND‐1000 spectrophotometer (Thermo Fisher Scientific, Waltham, Massachusetts, USA). Absorbance ratios (A260/A280 and A60/A230) were above the recommended 1.8. DNA from each sample was further quantified on a Qubit® Fluorometer using Qubit® dsDNA HS Assay kit (Cat. No. Q32854, Thermo Fisher Scientific, Waltham, Massachusetts, USA).

Methyl‐Seq library preparation

Methyl‐Seq was performed using the SureSelectXT Methyl‐Seq kit (Cat. No. G9651B, Agilent, Santa Clara, California, USA) according to the manufacturer's protocol (SureSelect XT Methyl‐Seq Target Enrichment System for Illumina Multiplexed Sequencing protocol, version C.0, January 2015); this covers over 3.7 million individual CpG dinucleotide sequences covering CpG islands, CpG island shores, CpG island shelves, under‐methylated regions, promoters, enhancers, transcription factors, CTCF‐binding sites, DNase 1 hypersensitive sites, and DMRs. Three micrograms of DNA from each sample was initially sheared by Covaris sonication to 150–200 bp in size and used to prepare genomic DNA libraries with the SureSelectXT Methyl‐Seq Library Prep kit. After hybridization with the SureSelectXT Methyl‐Seq capture library, targeted regions were isolated using complementary RNA baits. Isolated targets were bisulfite converted using the EZ DNA Methylation‐GoldTM (Cat. No. D5005, Zymo Research) which converts unmethylated cytosines to Uracil, while methylated cytosines are unaltered. Subsequent PCR amplification creates an unmethylated CG→TA transition at unmethylated positions. Each sequence‐modified, target‐enriched library preparation was attached to a readable index (short DNA identifying code) by PCR. Libraries were quantified on a Bioanalyzer 2100 (Agilent, Santa Clara, California, USA) using the Agilent High Sensitivity DNA kit (Cat. No. 507–4626, Agilent, Santa Clara, California, USA) or using a 2200 TapeStation (Agilent, Santa Clara, California, USA) with High Sensitivity DNA ScreenTapes (Cat. No. 5067–5593, Agilent, Santa Clara, California, USA). Equimolar indexed libraries were multiplexed (in groups of 6) to a final concentration of 4 nM in 20 μL of nuclease‐free dH2O or 10 mM of Tris–Cl, pH 8.5 (Buffer EB, Cat. No. 19086, Qiagen) and run on a single flow cell on an Ilumina NextSeq 500 according to the manufacturer's instructions using a 2x75 bp paired end read kit giving a total read length of 150 (TG NextSeq® 500 kit High Output Kit v2, Cat. No. TG‐160–2002, Illumina). To overcome color imbalance inherent to low complexity in a bisulfite‐converted genome, 10% of phiX genome was spiked into the reaction. Q30 scores of bases from NextSeq runs were within the threshold recommended by the manufacturer and depth of coverage was approximately 40x.

DNA data mapping

FastQC v0.11.4 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) was used to visualize the sequencing quality of the raw reads which were then trimmed using Trim Galore! v0.4.0 (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). This removes low quality bases (Qscore <20) starting from the 3’ end of the read. After trimming, short reads are removed (<20 bases). Figure S1 shows a typical example. The Bismark package uses Bowtie 2 alignment software v2.2.6 , to align sequences to the reference genome (GRCh38/hg19 assemblies) and then, methylation data were extracted employing default settings. Alignment mapping efficiency was in the region of 80% across all samples and is illustrated by Figure S2. The bisulfite error rate, estimated from the methylation status at cytosines outside a CpG context was in the region of 1.0% (Figure S3). Duplicated reads (removed using Bismark) were in the region of 20%. At each cytosine site, the methylation level was calculated as the ratio of the count of “C” (or the number of sequencing reads with methylated cytosine) to the count of “C” plus “T” (or the total number of reads covering that site). M‐bias plots were generated after methylation data were extracted with Bismark to yield the percentage methylation across all reads in order to identify any bias (e.g., bias at the end of the read due to drop in quality) arising from the position in the read of the cytosine residue being called. Figure S4 illustrates an example (for infant bloods) of an M‐bias plot illustrating the reduction of call quality at the 5’ and 3’ ends of the paired reads. This provides a guide to the extent of necessary sequence trimming (typically four bases removed from the 5’ end and one from the 3’ end). Methylation information was then re‐extracted and the output was processed and converted to a bedgraph.

Methylation data analysis

The resulting bed files from Bismark were used for further statistical analysis. Three comparison groups based on different growth criteria were examined: (a) high versus low birthweight babies sampled at birth for placenta and cord blood, (b) high versus low birthweight groups sampled at 12 m for infant blood, and (c) high versus low length‐for‐age based on change in Z score (LAZ) between birth and 12 months sampled at 12 months for infant blood. Differential methylation between groups was examined using BiSeq. , Only CpGs covered by at least 10 reads were included in the analysis.

Detection of DMRs

Analysis was performed using R v 3.2.2 and BiSeq version 1.18.0 (32, see review 34). BiSeq is designed specifically for targeted bisulfite sequencing data and includes features such as limiting high coverage, removing low coverage, spatial correlation, a multiple testing correction, visualization, and genomic annotation. DMRs were detected by comparing birthweight categories (high vs. low) or length‐for‐age (LAZ) scores (high vs. low) and incorporated sex as a covariate. Briefly, sequences were grouped into clusters of adjacent CpG sites. CpG methylation often occurs in clusters and spatial correlation is a key characteristic of DNA methylation. As methylation is conserved across short distances, identification of these related regions reduces data dimensions and also increases detection power by borrowing nearby CpG information. BiSeq CpG clusters were defined as CpG sites covered in at least 25% of samples (defined as frequently covered CpG sites) with a maximum distance of 100 bp between CpG sites within a cluster and with clusters containing at least 5 of these CpGs. To mitigate sequence overrepresentation distorting the data, sequences with greater than 90% of maximum coverage were removed. The methylation data were smoothed within CpG clusters using the smoothing algorithm (“predictMeth”). This estimates the true methylation level of each site in each sample. The methylation data were tested for both the test groups and resampled datasets under the null hypothesis that differences in methylation are random. The data from both were modeled by beta regression, with the group as the independent variable and the methylation probability as the dependent variable. A Wald test was used to confirm the parameters used in the beta regression could be included in the model and associated p‐values were transformed into Z scores to allow DMRs to be detected. To account for multiple testing errors (multiple testing correction using the Benjamini–Hochberg method), a two‐step hierarchical procedure was employed. This first tests clusters, then individual CpG sites within those clusters. The two‐step approach avoids loss of power by first testing at the cluster level and then, the CpG in the cluster that showed a change in methylation and hence the number of CpGs needing correction is greatly reduced. A variogram was created under the null hypothesis, which estimates the correlation in methylation between two CpG sites within a cluster. This was plotted and smoothed, with a sill of 1 for all our tests and was combined with the Z scores of the test results of interest to estimate the correlation of Z scores between two locations in a cluster. Clusters without differentially methylated CpG sites were removed (FDR >= 0.1), before the remaining clusters were trimmed to indicate individual significant CpG sites (FDR <= 0.05). PCA analysis of methylation patterns determined from different DNA sequence runs did not reveal any batch effects.

Pyrosequencing

Validation of differentially methylated cytosines as detected by Methyl‐Seq was performed by bisulfite pyrosequencing on the ZFHX3 gene. Initially, PCR primers were designed using the Pyromark assay design SW 2.0 (Cat. No.9019077, Qiagen, USA) and were supplied by Sigma‐Aldrich, UK. One of the primers was biotinylated and purified by HPLC. The primers were; ZFHX3: forward PCR primer GTTTTAATTTGATTGGGGGGAAAG, reverse PCR primer CCTTTAACAAACTAACCTCCTAACA, and forward biotinylated sequencing primer TTTTTTTAAATGTAGATTTGAATT. PCR amplification was performed with 10 ng of bisulfite converted DNA using EpiTaq HS (Cat. No. R110A, TaKaRa Bio Inc, Japan). PCR was set up according to the manufacturers’ instructions but the concentration of MgCl2 varied between 15 and 25 mM dependant on the primer set. Both methylated and unmethylated controls from the EpiTect PCR control DNA kit (Cat. No. 59695, Qiagen, USA) were run alongside. Thermal cycling conditions were performed using a touchdown program with an annealing temperature range of 53°C–62°C and cycle number range of 25–35, dependant on primer set. The PCR products were electrophoresed on a 3% of agarose gel to check for product specificity. Pyrosequencing was then performed on the PyroMark Q24 Vacuum Workstation (Qiagen, USA) as described in the manufacturer's instructions. PyroMark CpG software Design 2.0 (Cat. No. 9019067, Qiagen, USA) was used in this assay and primers with the best quality score were selected. Bisulfite conversion was shown to be efficient for all samples as the fluorescence signal by cytosine in a non‐CpG context was ≤1% of the signal produced by thymine.

Cellular heterogeneity assessment between sample groups

For cord blood, cell composition was compared between low and high birthweight groups using overlaps with a cord blood cell type‐specific reference panel of 215,000 CpGs derived from the Illumina EPIC 850 k array (ref: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6284779/). The reference panel set of CpG loci was used to find overlaps with the processed Methyl‐Seq capture dataset. Co‐methylation patterns extend up to several 100 base pairs across CpG clusters. , In order to obtain enough coverage for the regions covered by the EPIC reference set, we used intervals of 200 bases centered around the locations of the EPIC reference CpG set (updated in Human Genome––HG19). This yielded 14993 regions each containing CpG loci as present in the processed methyl capture sequence dataset. Methylation calls were extracted from the processed sequence data as described above and the mean values in these regions were used to generate PCA plots and heatmaps to calculate the correlation values between experimental groups (using Pearson correlation). In the absence of a 12‐month blood reference panel, the adult blood reference panel based on the Illumina Infinium HM450 k and EPIC 850 K methylation chips , was used and processed in the same way for coverage across the Methyl‐seq capture dataset. An interval of 200 bases yielded 33 regions each containing CpG loci (providing coverage for CD4 and CD8 lymphocytes, NK cells, neutrophils, B‐cells, and monocytes) that are present in the processed methyl capture sequence dataset to compare the 12‐month groups.

CpG and gene annotation

Ensembl was used to annotate differentially methylated CpGs (based on hg38 version GRCh38 human genome build) to determine their location with respect to regulatory features. Ontologies, mutational and functional data of those genes associated with significant differentially methylated CpGs were determined using the U.S. National Center for Biotechnology Information (NCBI; Bethesda, MD, USA; http://www.ncbi.nlm.nih.gov/) Gene, Online Mendelian Inheritance in Man (OMIM), PubMed databases and the Database for Annotation, Visualization, and Integrated Discovery v6.7 (DAVID ‐ http://david.abcc.ncifcrf.gov/). Disease associations were determined by interrogating the Genetic Association Database (GAD) for complex diseases and the EBI GWAS Catalogue. PANTHER v14.0 (http://www.pantherdb.org) , was used to provide an overview of gene ontology (GO Terms) defining protein classes, cellular components, biological procesess, and molecular functions of genes implicated by methylation marks.

RESULTS

Quality Triage of Sample Cohorts

All samples underwent assessment to exclude maternal contamination and poor‐quality samples. Maternal blood contamination of cord blood (for both sexes) was assessed using marker CpGs that are only methylated in adult blood DNA, and maternal contamination of placenta trophoblast samples from males was also flagged by examining the levels of Y DNA methylation dilution ; see Figure S5a,b Poor quality and/or obvious outlier samples were identified by plotting a heatmap of the methylation data for each experimental group (see example of the methylation data from the birthweight cohort at 12 months of age in Figure S6). The major component of variation was sex. Principal Component Analysis (PCA––done with and without inclusion of the sex chromosomes) matrices were also applied to a list of available variable information for the subjects contributing to each cohort and tissue sample (see Table S1) to determine whether they had a significant effect on the variation in the data. Examples for sex (male/female), birthweight category (high/low), and season (dry/wet) are illustrated in Figures S7a,b,c; only sex contributed significantly to variation in the data. Sample sets emerging from these analyses were re‐analyzed with sex as a covariate.

Assessment of Confounding Cellular Heterogeneity

There is no available cell‐type‐specific reference set for cord and adult blood to assess cell composition changes based on DMRs detected by Methyl‐Seq data. Potentially confounding differences in cell‐type composition between cord blood groupings (high and low birthweight) and infant blood groupings (high and low birthweight and length for age) were, therefore, assessed using DMR regions within the capture DNA sequence dataset that overlap within a 200 base‐pair interval with the EPIC 850 k cord blood and Infinium HM450 k adult blood cell‐type‐specific CpG panels (see methods). The adult overlaps were used for the 12‐month infant blood data in the absence of an age‐related reference panel for this time‐point. The heatmap and PCA plots are shown in Figures S8 and S9. For both cord and infant blood data, the heatmaps indicate high correlation between samples and no clear clustering according to comparison groups. PCA analysis indicates inter‐individual differences in cellular composition. For the small number of probes from the adult blood reference panel present within the Methyl‐Seq capture DNA sequence dataset, the analysis shows separation of individual samples into two groups; this may reflect the small number of probes available and their disproportionate weighting or variation in the rate of loss of nucleated erythrocytes between individuals. However, in all three PCA plots the variation between samples captured in PC1 and PC2 is distributed fairly uniformly across both experimental groups (high or low birthweight or tall or short height for age) indicating little difference in cellular composition between comparison groups to confound the determination of differential methylation values at the same time‐point.

Differentially Methylated Loci Identified Through BiSeq Analysis

The total number of significant differentially methylated regions (DMRs) and the direction of median methylation change identified for each comparison group (placenta‐birthweight, cord blood‐birthweight, infant blood‐birthweight at 12 months, and infant blood‐LAZ) by BiSeq analysis and the total number of significant CpGs they contain is shown in Table S2a–d and summarized in Table 3; 194 CpG loci in total. Each significantly differentially methylated CpG in each DMR was examined for the presence of single‐nucleotide polymorphism (SNP) directly in the CpG; these are shown in Table S2a–D. Apart from three SNP‐containing CpGs, all the MAFs (minor allele frequencies) were <0.01. For the CpGs associated with implicated genes RPS6KA2, PRSS3, and GAR1, the MAFs were <0.03, <0.11, and <0.02, respectively. These MAFs are at a level that would not significantly alter the estimation of methylation differences between test groups.
TABLE 3

Summary of Numbers of DMRs, CpG Loci, and Implicated Genes

CohortsTotal Number of DMRsDirection of Median Methylation Change for DMRs Relative to High Groupings for Birthweight and LAZTotal Number of CpG sites in DMRs and implicated genes
+ve‐ve
Placenta BW4224 (4)
Cord blood BW68254388 (78)
Infant blood (12 m) BW54292571 (65)
Infant blood (12 m) LAZ25131231 (26)

Summary of total number of DMRs, CpG loci, implicated genes (in brackets), and direction of median methylation change identified from the comparisons made at each time‐point between groupings. Key: BW, Birthweight; LAZ, Length for age Z score. Median methylation change is expressed relative to the high birthweight and high length for age groupings.

Summary of Numbers of DMRs, CpG Loci, and Implicated Genes Summary of total number of DMRs, CpG loci, implicated genes (in brackets), and direction of median methylation change identified from the comparisons made at each time‐point between groupings. Key: BW, Birthweight; LAZ, Length for age Z score. Median methylation change is expressed relative to the high birthweight and high length for age groupings. The distribution of CpGs between gene regulatory elements, gene bodies, and intergenic regions is shown in Table 4 (see Table S2a–d for full details on all DMRs and CpG locations). Figure 1A–D provides an overview of the gene ontology (determined using PANTHER v14.0 available at http://www.pantherdb.org) characterizing genes implicated by differential methylation marks. The pie charts summarize the distribution of this gene set across GO terms defining molecular functions, biological processes, cellular components, and protein classes. It can be seen that certain GO categories predominate. For example, analysis of molecular function reveals that binding, catalytic activity, molecular function regulator, and transcriptional regulator activity are most prominent. Detailed information on the genes and proteins in each of the GO categories can be obtained by uploading the gene lists to http://www.pantherdb.org from Table S2a–b and interrogating each pie chart sector.
TABLE 4

The distribution of methylation marks between regulatory features, gene bodies, and intergenic regions

Genomic FeatureNumber of CpGs in FeatureNumber of CpGs in Feature% of Total
>5% median methylation change<5% median methylation change
Promoter304437.9
Promoter and CTCF‐binding site936.2
CTCF‐binding site1137.2
Transcription Factor‐binding site412.5
Enhancer010.5
Exon16510.8
Intron28919.5
Intergenic171315.4
FIGURE 1

Gene Ontology Analysis of Genes Implicated by Associated Methylation Marks. Panther version 14 was used to provide an overview of the gene ontology characterizing genes implicated by methylation marks. Panther 14.0 identified 161 hits from the uploaded list of 173 genes. A. GO terms for Molecular Function found 93 molecular function hits. B. GO terms for Biological Process found 276 process hits. C. GO terms for Protein Class found 94 class hits. D. GO terms for Cellular Component found 300 cellular component hits. Color coding has been assigned starting at 12 o'clock and working clockwise on the pie chart

The distribution of methylation marks between regulatory features, gene bodies, and intergenic regions Gene Ontology Analysis of Genes Implicated by Associated Methylation Marks. Panther version 14 was used to provide an overview of the gene ontology characterizing genes implicated by methylation marks. Panther 14.0 identified 161 hits from the uploaded list of 173 genes. A. GO terms for Molecular Function found 93 molecular function hits. B. GO terms for Biological Process found 276 process hits. C. GO terms for Protein Class found 94 class hits. D. GO terms for Cellular Component found 300 cellular component hits. Color coding has been assigned starting at 12 o'clock and working clockwise on the pie chart Very few of the differentially methylated CpGs found in DMRs identified from the cord blood comparisons are found in the 12‐month infant blood comparisons; (a) of the four closely linked CpGs associated with TNXB, one (upstream intergenic) is differentially methylated in the cord blood birthweight group and the remaining three (within intron 1 of the gene) in the 12‐month infant blood birthweight group and (b) the intergenic CpG upstream of the HLX gene is differentially methylated in both the 12‐month birthweight and 12‐month length for age groups.

CpG Loci Showing 5% or Greater Methylation Change

We have chosen to focus on those marks that show 5% or greater methylation change. Figure 2 summarizes the CpGs that have been located to regulatory features (promoters, CTCF‐binding sites, and transcription factor‐binding sites––56 CpG loci in total) and figure 3 those located to gene bodies (introns and exons) and closely linked intergenic regions (64 CpG loci in total). Figures 2 and 3 also present the locations of CpGs (based on hg38 release 85 from ENSEBL) with respect to the Transcription Start Site (TSS) of genes implicated by location (93 in total), the median p‐value for the DMR corrected for multiple testing, direction and change in median methylation value, GWAS disease associations, and a short vignette summarizing any mutational data and functional studies of implicated genes culled from the various databases outlined in the materials and methods. Finally, Figures 2 and 3 flag whether any of the DMR‐associated genes are also subject to Trans or Cis‐meQTLs (genetic variation that influences methylation at CpG sites adjacent to implicated genes) and/or Cis‐eQTMs (variation of methylation that influences expression of an adjacent gene) collated in the Bios QTLBrowser held at www.genenetwork.nl/biosqtlbrowser (85), these are marked in red in the first column). These data were based on the analysis of cohorts from the Dutch population and may only partially reflect genetic variation in the Gambian population with Trans or Cis‐eQTL effects.
FIGURE 2

Implicated Genes Associated with Methylation Changes of 5% or Greater in Regulatory Elements. This figure documents those genes where a methylation change of 5% or greater has occurred within a defined regulatory feature. It also provides a summary of function and any disease associations resulting from genome‐wide association studies (GWAS––culled from the GAD and EMBL genetic association catalogue databases), mutation analysis, and functional investigations. The methylation change is expressed relative to the high birthweight and high length for age groups. All mapping of DMRs is based on human genome build hg38 version GRCh38 of the human genome. The positions of CpGs is given in relation to the Transcription Start Site (TSS) of the implicated gene. Also shown highlighted in red in the first column is whether the gene is associated with Trans and/or Cis‐meQTLs and/or Cis‐eQTMs. Where plural is shown, this indicates two or more Trans‐meQTLs, Cis‐metQTLs, or Cis‐eQTMs associated with the gene (information obtained from the BIOS QTL Browser at www.genenetwork.nl/biosqtlbrowser). NI, No Information. Color key of gene disease and functional associations: brown, neurological; purple, fertility; light blue, growth and development; dark blue, oncological; and light green, immunological. The asterisks mark the implicated genes found associated with methylation marks in related studies. , ,

FIGURE 3

Implicated Genes Associated with Methylation Changes of 5% or Greater in Gene Bodies and Intergenic Regions. This figure documents those genes where a methylation change of 5% or greater has occurred within a gene body or intergenic region. It also provides a summary of function and any disease associations resulting from genome‐wide association studies (GWAS––culled from the GAD and EMBL genetic association catalogue databases), mutation analysis and functional investigations. The methylation change is expressed relative to the high birthweight and high length for age groups. All mapping of DMRs is based on human genome build hg38 version GRCh38 of the human genome. The positions of CpGs is given in relation to the Transcription Start Site (TSS) of the implicated gene. Also shown highlighted in red in the first column is whether the gene is associated with Trans and/or Cis‐meQTLs and/or Cis‐eQTMs. Where plural is shown, this indicates two or more Trans‐meQTLs, Cis‐metQTLs, or Cis‐eQTMs associated with the gene (information obtained from the BIOS QTL Browser at www.genenetwork.nl/biosqtlbrowser). NI, No Information. Color key of gene disease and functional associations: brown, neurological; purple, fertility; light blue, growth and development; dark blue, oncological; light green, immunological; dark green, connective tissue; orange, metabolic; vermillion, cardiovascular; and gray, hearing. The asterisks mark the implicated genes found associated with methylation marks in related studies. , ,

Implicated Genes Associated with Methylation Changes of 5% or Greater in Regulatory Elements. This figure documents those genes where a methylation change of 5% or greater has occurred within a defined regulatory feature. It also provides a summary of function and any disease associations resulting from genome‐wide association studies (GWAS––culled from the GAD and EMBL genetic association catalogue databases), mutation analysis, and functional investigations. The methylation change is expressed relative to the high birthweight and high length for age groups. All mapping of DMRs is based on human genome build hg38 version GRCh38 of the human genome. The positions of CpGs is given in relation to the Transcription Start Site (TSS) of the implicated gene. Also shown highlighted in red in the first column is whether the gene is associated with Trans and/or Cis‐meQTLs and/or Cis‐eQTMs. Where plural is shown, this indicates two or more Trans‐meQTLs, Cis‐metQTLs, or Cis‐eQTMs associated with the gene (information obtained from the BIOS QTL Browser at www.genenetwork.nl/biosqtlbrowser). NI, No Information. Color key of gene disease and functional associations: brown, neurological; purple, fertility; light blue, growth and development; dark blue, oncological; and light green, immunological. The asterisks mark the implicated genes found associated with methylation marks in related studies. , , Implicated Genes Associated with Methylation Changes of 5% or Greater in Gene Bodies and Intergenic Regions. This figure documents those genes where a methylation change of 5% or greater has occurred within a gene body or intergenic region. It also provides a summary of function and any disease associations resulting from genome‐wide association studies (GWAS––culled from the GAD and EMBL genetic association catalogue databases), mutation analysis and functional investigations. The methylation change is expressed relative to the high birthweight and high length for age groups. All mapping of DMRs is based on human genome build hg38 version GRCh38 of the human genome. The positions of CpGs is given in relation to the Transcription Start Site (TSS) of the implicated gene. Also shown highlighted in red in the first column is whether the gene is associated with Trans and/or Cis‐meQTLs and/or Cis‐eQTMs. Where plural is shown, this indicates two or more Trans‐meQTLs, Cis‐metQTLs, or Cis‐eQTMs associated with the gene (information obtained from the BIOS QTL Browser at www.genenetwork.nl/biosqtlbrowser). NI, No Information. Color key of gene disease and functional associations: brown, neurological; purple, fertility; light blue, growth and development; dark blue, oncological; light green, immunological; dark green, connective tissue; orange, metabolic; vermillion, cardiovascular; and gray, hearing. The asterisks mark the implicated genes found associated with methylation marks in related studies. , , The implicated gene names in Figures 2 and 3 are color coded according to categories of gene function/disease revealed by functional and/or mutation analysis (see legends) and Figure 4 summarizes the numbers of implicated genes found in associated disease categories bearing the same color coding. It is immediately clear that neurological, growth and development, and oncological disorders are the most prominent among the implicated genes showing 5% or greater methylation change.
FIGURE 4

Number of Implicated Genes from Figures 2 and 3 Associated with Different Disease Categories. The color key allows cross‐reference to the gene lists in Figures 2 and 3. Neur, Neurological; Repro, Reproductive; Growth and Dev, Growth and Development; Onco, Oncological; Imm, Immunological; Conn Tiss, Connective Tissue; and Cardio‐vasc, Cardio‐vascular

Number of Implicated Genes from Figures 2 and 3 Associated with Different Disease Categories. The color key allows cross‐reference to the gene lists in Figures 2 and 3. Neur, Neurological; Repro, Reproductive; Growth and Dev, Growth and Development; Onco, Oncological; Imm, Immunological; Conn Tiss, Connective Tissue; and Cardio‐vasc, Cardio‐vascular

Replication of Findings in Other Methylation Studies

Identification of a significant proportion of the same implicated genes reported in related studies provides strong validation of the findings reported here. Highlighted in red bold in Table S2a–d are the DMR‐implicated genes that are also documented in the recent large meta‐analysis of multiple EWAS by the Childhood Epigenetics Consortium examining DNA methylation associated with birthweight. When all genes (4848, representing about 19% of the estimated 25,000 genes in the human genome) from the consortium study associated with 8170 CpGs significant after FDR correction for multiple testing are screened, 62 DMR implicated genes from the current study show a match (of which 34 show >5% methylation change). If this is restricted to those genes (729; about 2.9% of human genes) associated with 914 CpG loci surviving Bonferroni correction (p < 1.06E‐7), then 11 matches are found (marked with a red asterisk of which 10 show >5% methylation change in the current study). The vast majority of CpGs in the meta‐analysis are located within or closely linked to genes. Thus taking the number of genes in the genome as 25,000, the approximate probability of a match by chance for any given DMR‐implicated gene in the present study is 0.19 (4848/25000) for all genes and 0.029 (729/25000) for those associated with the 914 CpG loci. The probability that these matches have occurred by chance for 62 and 11 genes is 0.19−62 and 0.029−11, respectively. Comparisons have been made to two further studies examining the impact of gestational age (where some of the data are subsumed in the large meta‐analysis mentioned above) and smoking on birthweight ; these share, respectively, 53 (marked with a green asterisk in Table S2a–d) and 11 (marked with a blue asterisk) genes associated with differential methylation identified in the current study. These two studies identify a further 22 DMR‐associated genes that overlap with our findings, bringing the replication in other related studies to 84 (48%) of the 173 implicated genes we have documented (The asterisks in figure 2 and 3 mark which of those shared genes show 5% or greater methylation change––in total 49 of the 93 in tables 5 and 6 between the three birthweight‐related studies). The genes ZNF678, VTRNA2‐1, SCRIB, and TNXB match those reported in other studies on maternal exposures and differential methylation of associated CpG loci in Gambian infants , (marked with black triple asterisks in Table S2a–d) and SEMA3B, ARID1B, and HOXA10 from the Cambridge Baby Growth Study (marked with black double asterisks). The high degree of replication observed in related studies provides robust validation of the findings reported here. Pyrosequencing analysis of the methylation mark associated with the ZFHX3 gene was performed to illustrate an example experimental confirmation of methyl‐seq derived methylation data. Table 5 summarizes the data for several individuals selected from the high and low groups for the 12‐month LAZ comparison. The results in Table 5 show good concordance between the two methods in both the quantum and direction of change when compared to the median methylation value change derived from the group comparisons by BiSeq analysis of methyl‐seq data.
TABLE 5

Methylation Analysis of the DMR Associated with the ZFHX3 Gene by Pyrosequencing

ZFHX3 (% methylation)
PyrosequencingMethyl‐seq
SampleHighSampleLowSampleHighSampleLow
120131122140
222231222242
323331323342
423433423441
530535524540
635646
Mean:
23.6 sd+/−3.3832.6 sd+/−1.7922.8 sd+/‐ 0.7441.3 sd+/‐ 1.1

Individual samples sourced from the 12‐month high and low LAZ (length for age) comparison groups.

Median methyl‐seq determined methylation change between groups = −14.8 referenced to high LAZ value.

This table compares the methylation levels determine for the DMR associated with the ZFHX3 gene by pyrosequencing and methyl‐seq analysis. Individuals from the high (n = 5) and low (n = 6) 12‐month length for age comparison groups were selected for analysis. The table shows the mean % methylation values and standard deviation for the two types of analysis. The quantum and direction of change is close to that observed for median methylation change from the comparison of the high and low groups determined by BiSeq analysis of methyl‐seq data.

Methylation Analysis of the DMR Associated with the ZFHX3 Gene by Pyrosequencing Individual samples sourced from the 12‐month high and low LAZ (length for age) comparison groups. Median methyl‐seq determined methylation change between groups = −14.8 referenced to high LAZ value. This table compares the methylation levels determine for the DMR associated with the ZFHX3 gene by pyrosequencing and methyl‐seq analysis. Individuals from the high (n = 5) and low (n = 6) 12‐month length for age comparison groups were selected for analysis. The table shows the mean % methylation values and standard deviation for the two types of analysis. The quantum and direction of change is close to that observed for median methylation change from the comparison of the high and low groups determined by BiSeq analysis of methyl‐seq data.

DISCUSSION

It has been suggested that epigenetic changes may be involved in the mechanism of reprogramming induced by under‐nutrition, infection, and adverse environmental exposures, although, it is not clear whether these are primary or secondary events in the chain of causality. This paper has used extremes of variation in birthweight and subsequent gains in length to examine associated methylation changes in DNA from trophoblast and cord blood DNA from small and large babies, and blood DNA from 12‐month old infants analyzed both according to their size at birth and their change in length from birth to 12 months (LAZ). The methylation marks found at birth and those at 12 months in relation to birthweight show little longitudinal persistence (see figures 2 and 3 and Table S2a–d). This suggests ongoing epigenetic adjustments, significant changes in blood cellular composition (such as the loss of nucleated erythrocytes or both in the critical postnatal growth period, and the subsequent infancy‐childhood growth transition (ICT). Nonetheless, what does persist at 12 months is different, almost completely non‐overlapping methylation patterns (not confounded by cellular composition differences) between the high and low birthweight comparison groups and the comparison groups showing rapid or slow postnatal height gain. These two distinct methylation patterns may reflect different interactions with nutritional, infectious, and other environmental exposures during the postnatal growth phase potentially associated with negative or positive growth trajectories or a combination of both. Thus, any continued challenges (such as those provoked by under nutrition and infection) to homeostasis during the development period may trigger epigenetic programming and shift the timing and duration of these periods of growth. The study reported by Bernstein et al. has revealed an accelerated transition to a childhood pattern of growth in Gambian compared to UK infants. A later transition, observed in U.K. infants, extends the high growth rate experienced during the infancy stage. This is reduced in Gambian infants, potentially impacting on growth outcomes in childhood while diverting energy into other processes critical for responses to acute infectious challenges; later developmental stages in this population offer an extended window for catch‐up growth. Over half (54.3%) of the identified methylation marks are located in gene regulatory elements, 30.3% in gene bodies, and the remaining 15.4% in intergenic regions closely linked to implicated genes. Alteration of gene activity by methylation of implicated genes may occur by impacting the functionality of cis‐transcriptional regulatory elements or changing chromatin conformation and accessibility to the transcriptional machinery. Several of the methylation marks are found in the binding site (an estimated 326,000 in the human genome) for the multifunctional CTCF zinc finger protein. This protein plays a key regulatory role through a number of varied functions that include influencing chromatin architecture (binding at chromatin domain boundaries and the formation of chromatin loops), binding to promoters, enhancers and within gene bodies, and recruitment of transcription factors. The protein can also act as an insulator, blocking long‐range promoter‐enhancer interactions for review see (92). Of particular relevance is the observation that methylation at CTCF‐binding sites in imprinted regions can disrupt the binding of the CTCF protein and its insulator activity and, more generally, at many other methylated sites outside imprinted regions. From the annotation associated with each of the CpG loci covered by this methyl‐seq capture set, almost all the methylation marks described in this study are in regions containing DNAse 1 hypersensitive Sites (DHS–markers of DNA regulatory regions and transcriptionally active open chromatin) described by the ENCODE (Encyclopedia of DNA Elements project. The ENCODE project has shown that a small proportion (~ 5%) of DHSs are found in TSS (Transcription Start Site) regions, that most are located in introns and intergenic DNA and that there is cell‐type specificity in the distribution of DHSs. This indicates that the majority of methylation marks reported in our study are potentially in areas of remodeled open chromatin associated with transcriptional activity and may influence target gene activity possibly by altering chromatin architecture. Figure 2 and 3 also indicate that a number of the DMR‐associated genes showing 5% or greater methylation change are subject to trans and/or cis genetic variation (Trans‐meQTL and Cis‐meQTL) that impacts the level of methylation of closely linked CpG loci; in some cases these methylation changes affect gene expression (Cis‐eQTM). One consequence of this polymorphism in the genetic modulation of methylation marks is that it is likely to lead to a diversity of methylation responses to environmental exposures in different populations. Thus interaction between environmental exposures, genetic background and modulation of methylation patterns will have to be assessed for each study population. Distribution of implicated genes across GO term categories demonstrates that they encompass biochemical and biological functions that include signaling or interaction with signaling pathways; interacting with or acting as receptors; constituents of or interacting with the extracellular matrix; deposition of connective tissue; structure and function of the actin cytoskeleton; trafficking across cellular membranes; cell cycle control and cellular growth; transcription regulation; and metabolic regulation (see Figure 1). Biological functions revealed by functional studies, animal models, and mutation analysis primarily highlight roles in neurological, growth and developmental, neoplastic, and immunological dysfunction (see figures 2, 3, and 4 and Table S2a–d for details). The precise impact of the methylation changes on the expression of implicated genes, however, is unknown and awaits more detailed functional analysis. Nevertheless, the location of these methylation marks within appropriately positioned regulatory elements and gene bodies or in close intergenic linkage to implicated genes, encourages their consideration as biomarkers associated with and the genetic pathways within which they are active in as potential contributors to variation in prenatal and postnatal growth, subsequent outcomes in later life and as possible intervention targets. In total, 84 genes implicated by DMRs (shown in red bold and flagged by green and blue asterisks‐see Table S2a–d) are shared with DMR‐associated genes reported in the large array‐based meta‐analysis of multiple EWAS by the Childhood Epigenetics Consortium and two further related studies. , , This demonstrates concordance with a substantial proportion (48%) of the genes documented in the current study and provides robust validation of the BiSeq analysis of methyl‐seq data. Eleven matched genes are associated with CpG loci surviving stringent Bonferroni correction in the Kuppers et al. study (marked with a red asterisk in Table S2a–d). Differences in genetic background, environmental exposures and nutrition between populations contributing to different studies could lead to methylation changes at different CpG loci but still affect DMRs associated with the same implicated genes. In the case of MAD1L1 and NFIX, differential methylation has been detected at the same Bonferroni significant CpG sites that are reported in the meta‐analysis ). MAD1L1 (a component of the mitotic spindle‐assembly checkpoint) has a role in cell cycle control and tumor suppression and methylation levels have been strongly correlated with hepatocellular carcinoma. It is also a susceptibility gene for bipolar disorder and schizophrenia with a risk allele linked to reward systems in healthy adults. NFIX is most highly expressed in brain, fat, and prostate, is linked to cancer (DNA hypermethylation associated with lung adenocarcinoma––LUAD), muscle development and dystrophies. Interestingly, 19p13 microduplications encompassing NFIX are responsible for intellectual disability, short stature, and small head circumference. Three implicated genes match those flagged by methylation changes found in DNA from babies in the Cambridge Baby Growth Study investigating the effects of maternal gestational diabetes or intrauterine growth retardation. ARID1B and SEMA3B are potential tumor suppressor genes. ARID1B is a chromatin remodeling factor and individuals with ARID1B‐related disorder have many phenotypic features including slow growth. The third gene is HOXA10 (homeobox A10), whose expression is downregulated in endometriosis but in late gestation is required for proper placental differentiation and function. , Implicated genes ZNF678 (a zinc finger gene), VTRNA2‐1, SCRIB, and TNXB have been reported in other Gambian‐based studies investigating periconceptional nutritional exposures associated with differential methylation. , , VTRNA2a‐1 is a noncoding RNA gene that functions as a tumor suppressor , , , and is an imprinted locus. , SCRIB (a scaffold protein found at epithelial adherens junctions and neuronal presynaptic compartments) can act as a tumor suppressor gene and has been shown to be mutated in severe neural tube defects (see OMIM entry 607733). TNXB is an extracellular matrix glycoprotein thought to function in matrix maturation during wound healing. Different pathogenic alleles give rise to Ehlers–Danlos Syndrome and a form of chronic kidney failure, Vesicoureteral Reflux –VUR, both of which involve alterations to collagen deposition in the extracellular matrix. The genes DLK1 and MEG9 (LINC00584––long intergenic noncoding RNA) are worthy of further comment given their location within an important imprinted region on chromosome 14 at 14q32. As revealed by maternal and paternal Uniparental Disomy (UPDm and UDPp, respectively), and genetic and functional studies of individual genes encompassed within the locus (for review see OMIM entries 601038, 60563, 611896, 172690, 613648, and ref (103), the region has a major impact on growth and development. The 14q23 locus is complex with a cluster of maternally and paternally imprinted genes, noncoding snoRNAs (small nucleolar organizer RNA), miRNAs (microRNAs), LncRNAs (long noncoding RNAs), and LINC RNAs under the control of an intergenic differentially methylated region (IG‐DMR). Three genes (DLK1, RTL1, and DIO3) are all expressed from paternal alleles. DLK1, containing six epidermal growth factor repeats, has reduced plasma levels in women bearing small for gestational age babies, is an inhibitor of adipocyte differentiation and shows genetic association with age of menarche. , , , RTL1 is essential for maintenance of fetal capillaries and potentially involved in formation of the chorioallantoic placenta, while DIO3 (Thyroxine Deiodinase Type III) is essential for the maturation and function of the thyroid axis. A further four genes (MEG3, RTL1as, MEG8, and MEG9) are all expressed from maternal alleles. MEG3 is a LncRNA affecting growth and development in MEG3 knock‐out mice ; RTL1as is an antisense transcript to the paternally expressed gene RTL1 and encodes a number of microRNAs that may regulate the expression of RTL1 ; MEG8 is a LncRNA involved in the regulation of trophoblast proliferation and invasion, and implicated in spontaneous early abortion and MEG9, a LINC RNA involved in megakaryocyte differentiation and angiogenesis shows genetic association with body mass index and age of menarche. The UPDm (no paternal transcripts: Temple Syndrome) phenotype is characterized by prenataland postnatal growth retardation, neonatal hypotonia, precocious puberty, and facial dysmorphism. The UPDp (no maternal transcripts: Kagami–Ogata Syndrome) phenotype is characterized by severe growth retardation, skeletal abnormalities, facial anomalies, and abdominal muscular defects. Trans‐regulation by maternally expressed small noncoding RNAs from the 14q32 region on the activity of other genes in the genome is likely to contribute to these complex phenotypes. On the maternal chromosome DLK1 is silenced. The current study shows a 6% methylation difference of a DLK1 DMR (higher in high birthweight than low birthweight babies). In contrast, MEG9 is silenced on the paternal chromosome and shows a 22% methylation difference of a MEG9 DMR (higher in high birthweight than in low birthweight babies). It is not clear what the impact of these methylation marks is on expression levels as they lie outside the immediate promoter within the gene body. Nevertheless, given that both methylation marks are in DMRs containing DHSs marking potentially open chromatin, it is reasonable to suggest that alteration of the methylation landscape in this region of chromosome 14 could impact chromatin architecture and gene activity with a bearing on growth and development outcomes. It is interesting to note that a study examining the effect of maternal periconceptional micronutrient supplementation of Gambian mothers found increased methylation of a DLK1‐associated CpG in cord blood DNA from offspring of mothers who had received the supplements. A number of limitations should be noted. An accessible tissue such as blood as a proxy for methylation changes in other key target tissues will not capture all the relevant alterations in methylation status. However, there is sufficient concordance between tissues to yield a subset of potentially relevant loci. , , Analysis has been performed with males and females combined; hence sex differences in the methylation patterns have not been determined. The sample size is small, nevertheless, as outlined in the methods, BiSeq is designed for the analysis of targeted methyl sequence data and takes advantage of the conservation of methylation across short distances, co‐assessing methylation changes at several individual cytosine residues within intervals of 100 base pairs. This reduces data dimensions and increases detection power by borrowing nearby CpG information and provides a more detailed and statistically significant evaluation of the methylation status across any given genomic region; this has allowed identification of statistically valid differentially methylated CpGs from this small study. Greater coverage (3.7 million CpGs as compared to the Illumina HM450 k and Epic 850 k chips) of the SureSelect targeted sequencing approach of key gene regulatory elements (adjacent and distant, proximal or distal) to genes they control, offers the opportunity to identify additional methylation marks not necessarily scored by the array‐based platforms. Studies, such as the one reported here, provide associations and not cause and effect relationships between genes and phenotypes. Mutational evidence is helpful in establishing the likelihood that a gene contributes to a complex phenotype. Identification of methylation marks can be useful in that (a) they might act as biomarkers of early life adverse exposures that impact on early growth and may potentially indicate those individuals with higher future disease risks and (b) potentially flag genes that may be useful intervention targets to ameliorate the consequences of stunting. An integrated large scale analysis of inter‐individual variation of methylation marks in relation to genotype (Trans and Cis‐meQTLs), eQTLs (expression quantitative trait loci including Cis‐eQTMs), disease susceptibility, developmental phenotypes, nutrition, and environmental exposures provides a means of potentially unpicking causal relationships and the relevance of implicated genes. Clearly, the most effective approach to mitigate stunting and associated disease susceptibilities would be to ensure healthy nutrition, adequate sanitation, and living conditions early in the life course.

CONFLICT OF INTEREST

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

AUTHOR CONTRIBUTIONS

CRQ developed and managed the experimental procedures and KMH performed them; MS and MG devised the initial analysis pipeline and JB and BS finalized the pipelines and provided the bioinformatic support; CRQ performed the methylation data analysis; CAS and NAA provided supervisory support; AMD and SD managed the collection and extraction of DNA from samples; CRQ and NAA wrote the first draft of the manuscript; RMB, NA, DBD, KKO, AMP, and SEM conceived of and designed the HERO‐G study. All authors contributed to manuscript revision, read, and approved the submitted version. Fig S1 Click here for additional data file. Fig S2 Click here for additional data file. Fig S3 Click here for additional data file. Fig S4 Click here for additional data file. Fig S5 Click here for additional data file. Fig S6 Click here for additional data file. Fig S7 Click here for additional data file. Fig S8 Click here for additional data file. Fig S9 Click here for additional data file. Table S1 Click here for additional data file. Table S2 Click here for additional data file.
  111 in total

1.  HOXA-10 gene expression in ectopic and eutopic endometrium tissues: Does it differ between fertile and infertile women with endometriosis?

Authors:  Canan Özcan; Özkan Özdamar; Merve Ertan Gökbayrak; Emek Doğer; Yiğit Çakıroğlu; Naci Çine
Journal:  Eur J Obstet Gynecol Reprod Biol       Date:  2018-12-11       Impact factor: 2.435

2.  Allelic methylation levels of the noncoding VTRNA2-1 located on chromosome 5q31.1 predict outcome in AML.

Authors:  Marianne Bach Treppendahl; Xiangning Qiu; Alexandra Søgaard; Xiaojing Yang; Cecilie Nandrup-Bus; Christoffer Hother; Mette Klarskov Andersen; Lars Kjeldsen; Lars Möllgård; Lars Möllgaard; Eva Hellström-Lindberg; Johan Jendholm; Bo T Porse; Peter A Jones; Gangning Liang; Kirsten Grønbæk
Journal:  Blood       Date:  2011-11-04       Impact factor: 22.113

3.  Expanding the genetic heterogeneity of intellectual disability.

Authors:  Shams Anazi; Sateesh Maddirevula; Vincenzo Salpietro; Yasmine T Asi; Saud Alsahli; Amal Alhashem; Hanan E Shamseldin; Fatema AlZahrani; Nisha Patel; Niema Ibrahim; Firdous M Abdulwahab; Mais Hashem; Nadia Alhashmi; Fathiya Al Murshedi; Adila Al Kindy; Ahmad Alshaer; Ahmed Rumayyan; Saeed Al Tala; Wesam Kurdi; Abdulaziz Alsaman; Ali Alasmari; Selina Banu; Tipu Sultan; Mohammed M Saleh; Hisham Alkuraya; Mustafa A Salih; Hesham Aldhalaan; Tawfeg Ben-Omran; Fatima Al Musafri; Rehab Ali; Jehan Suleiman; Brahim Tabarki; Ayman W El-Hattab; Caleb Bupp; Majid Alfadhel; Nada Al Tassan; Dorota Monies; Stefan T Arold; Mohamed Abouelhoda; Tammaryn Lashley; Henry Houlden; Eissa Faqeih; Fowzan S Alkuraya
Journal:  Hum Genet       Date:  2017-09-22       Impact factor: 4.132

4.  NFIX downregulation independently predicts poor prognosis in lung adenocarcinoma, but not in squamous cell carcinoma.

Authors:  Jun Ge; Hang Dong; Ye Yang; Bin Liu; Min Zheng; Qing Cheng; Li Peng; Juan Li
Journal:  Future Oncol       Date:  2018-11-12       Impact factor: 3.404

Review 5.  Origins of lifetime health around the time of conception: causes and consequences.

Authors:  Tom P Fleming; Adam J Watkins; Miguel A Velazquez; John C Mathers; Andrew M Prentice; Judith Stephenson; Mary Barker; Richard Saffery; Chittaranjan S Yajnik; Judith J Eckert; Mark A Hanson; Terrence Forrester; Peter D Gluckman; Keith M Godfrey
Journal:  Lancet       Date:  2018-04-16       Impact factor: 79.321

6.  A human homeobox gene, HB24, inhibits development of CD4+ T cells and impairs thymic involution in transgenic mice.

Authors:  Y Deguchi; D Agus; J H Kehrl
Journal:  J Biol Chem       Date:  1993-02-15       Impact factor: 5.157

7.  Correspondence of DNA Methylation Between Blood and Brain Tissue and Its Application to Schizophrenia Research.

Authors:  Esther Walton; Johanna Hass; Jingyu Liu; Joshua L Roffman; Fabio Bernardoni; Veit Roessner; Matthias Kirsch; Gabriele Schackert; Vince Calhoun; Stefan Ehrlich
Journal:  Schizophr Bull       Date:  2015-06-08       Impact factor: 9.306

8.  PARK10 is a major locus for sporadic neuropathologically confirmed Parkinson disease.

Authors:  Gary W Beecham; Dennis W Dickson; William K Scott; Eden R Martin; Gerard Schellenberg; Karen Nuytemans; Eric B Larson; Joseph D Buxbaum; John Q Trojanowski; Vivianna M Van Deerlin; Howard I Hurtig; Deborah C Mash; Thomas G Beach; Juan C Troncoso; Olga Pletnikova; Matthew P Frosch; Bernardino Ghetti; Tatiana M Foroud; Lawrence S Honig; Karen Marder; Jean Paul Vonsattel; Samuel M Goldman; Harry V Vinters; Owen A Ross; Zbigniew K Wszolek; Liyong Wang; Derek M Dykxhoorn; Margaret A Pericak-Vance; Thomas J Montine; James B Leverenz; Ted M Dawson; Jeffery M Vance
Journal:  Neurology       Date:  2015-02-06       Impact factor: 9.910

9.  DNA methylation profiling at imprinted loci after periconceptional micronutrient supplementation in humans: results of a pilot randomized controlled trial.

Authors:  Wendy N Cooper; Batbayar Khulan; Stephen Owens; Cathy E Elks; Veronica Seidel; Andrew M Prentice; Gusztav Belteki; Ken K Ong; Nabeel A Affara; Miguel Constância; David B Dunger
Journal:  FASEB J       Date:  2012-01-20       Impact factor: 5.191

10.  TGF- β Signaling Cooperates with AT Motif-Binding Factor-1 for Repression of the α -Fetoprotein Promoter.

Authors:  Nobuo Sakata; Satoshi Kaneko; Souichi Ikeno; Yutaka Miura; Hidekazu Nakabayashi; Xue-Yuan Dong; Jin-Tang Dong; Taiki Tamaoki; Naoko Nakano; Susumu Itoh
Journal:  J Signal Transduct       Date:  2014-07-03
View more
  1 in total

Review 1.  Epigenetics in the Uterine Environment: How Maternal Diet and ART May Influence the Epigenome in the Offspring with Long-Term Health Consequences.

Authors:  Irene Peral-Sanchez; Batoul Hojeij; Diego A Ojeda; Régine P M Steegers-Theunissen; Sandrine Willaime-Morawek
Journal:  Genes (Basel)       Date:  2021-12-23       Impact factor: 4.096

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.