Literature DB >> 34055787

DNA Methylation Variation Is Identified in Monozygotic Twins Discordant for Non-syndromic Cleft Lip and Palate.

Juan I Young1, Susan Slifer1, Jacqueline T Hecht2, Susan H Blanton1.   

Abstract

Non-syndromic cleft lip with or without cleft palate (NSCLP) is the most common craniofacial birth defect. The etiology of NSCLP is complex with multiple genes and environmental factors playing causal roles. Although studies have identified numerous genetic markers associated with NSCLP, the role of epigenetic variation remains relatively unexplored. Because of their identical DNA sequences, monozygotic (MZ) twins discordant for NSCLP are an ideal model for examining the potential contribution of DNA methylation to non-syndromic orofacial clefting. In this study, we compared the patterns of whole genome DNA methylation in six MZ twin pairs discordant for NSCLP. Differentially methylated positions (DMPs) and regions (DMRs) were identified in NSCLP candidate genes, including differential methylation in MAFB and ZEB2 in two independent MZ twin pairs. In addition to DNA methylation differences in NSCLP candidate genes, we found common differential methylation in genes belonging to the Hippo signaling pathway, implicating this mechanosensory pathway in the etiology of NSCLP. The results of this novel approach using MZ twins discordant for NSCLP suggests that differential methylation is one mechanism contributing to NSCLP, meriting future studies on the role of DNA methylation in familial and sporadic NSCLP.
Copyright © 2021 Young, Slifer, Hecht and Blanton.

Entities:  

Keywords:  NSCLP; methylation; non-syndromic cleft lip and cleft palate; twins; whole genome bisulfite sequencing

Year:  2021        PMID: 34055787      PMCID: PMC8149607          DOI: 10.3389/fcell.2021.656865

Source DB:  PubMed          Journal:  Front Cell Dev Biol        ISSN: 2296-634X


Introduction

Non-syndromic cleft lip with and without cleft palate (NSCLP) is a common birth defect occurring in 1/700-1000 livebirths, affecting 135,000 newborns worldwide. The severity varies based on laterality (bi- versus unilateral involvement) and extensiveness (involves only the lip, the primary palate and/or the secondary palate). The defect affects feeding initially and, in the long term, speech articulation, hearing, dental hygiene, dentition and overall well-being as there is always residual scarring. The cost of multiple surgical interventions exact a large emotional burden for patients and families, and high financial burden for families and society. Despite many decades of research, only limited environmental agents and genetic causes of NSCLP are known (Chiquet et al., 2011; Letra et al., 2012, 2014; Cvjetkovic et al., 2015; Leslie et al., 2015, 2016a,b; Fish et al., 2017; Ludwig et al., 2017; Cox et al., 2018). While candidate gene and genome-wide studies of case-control and family-based datasets have identified causal/susceptibility loci and variants, the majority of the genetic causes remain elusive. Based on the analysis of multiplex families, in which there are multiple, but not necessarily first degree, affected relatives, there are likely extra-genic factors affecting penetrance and/or expression of the genetic liability. For example, MZ twins have a concordance rate of » 25–50% for NSCLP (compared to 1–3% in DZ twins) with a heritability estimate of approximately 70%, indicating that other factors play a role (Christensen, 1999; Grosen et al., 2011). It is likely that variation in multiple genes is etiologically necessary, with extra-genic factors contributing with a threshold effect. Epigenetics is well-recognized as a critical mediator of the interplay between genes and the environment and influences phenotypic outcomes. DNA methylation, a major substrate of epigenetic information, regulates genome activity and provides an alternative mechanism for modulating cellular physiology compared to those arising solely from genetic changes (Esteller, 2008). Thus, a potential mechanism by which environmental factors could elicit altered tissue morphogenesis is variation in gene methylation. Support for this hypothesis comes from the observation that maternal use of the methyl-group donor folic acid may decrease the risk for NSCLP (Butali et al., 2013; Wallenstein et al., 2013). Methylation differences within normal MZ twin pairs are well documented (Kaminsky et al., 2009; Czyz et al., 2012; Ollikainen et al., 2015). Importantly, methylation differences have been identified in MZ twin pairs discordant for a variety of congenital defects, including Beckwith-Wiedemann syndrome (OMIM 130650), congenital cataracts, congenital renal agenesis, cryptorchidism, Mayer-Rokitansky-Kuster-Hauser syndrome (OMIM 277000) and congenital heart disease (OMIM 217095), yielding insights into the underlying pathology (Weksberg et al., 2002; Rall et al., 2011; Tierling et al., 2011; Jin et al., 2014; Wei et al., 2015; Lu et al., 2017; Lyu et al., 2018). Given the relevance of DNA methylation in neural crest development, which plays a critical role in craniofacial development (Hu et al., 2014; Seelan et al., 2019), and the potential influence of epigenetics in the etiology of NSCLP, we compared genomic DNA methylation patterns between six pairs of monozygotic twins discordant for NSCLP.

Materials and Methods

Editorial Policies and Ethical Considerations

All DNA samples were collected and studied following informed and signed written consent from a parent or guardian. This study was approved by the University of Texas Health Science Center Committee for Protection of Human Subjects (HSC-MS-03-090) and the University of Miami Human Subject Research Office (IRB #20100505).

Dataset

Six monozygotic twin pairs discordant for NSCLP were ascertained and characterized as part of the larger genetic study of NSCLP that has been previously described (Chiquet et al., 2007; Morris et al., 2020). Briefly, participants were recruited at the craniofacial clinics affiliated with the UTHealth Science Center at Houston under IRB-approved protocols and written informed consent was obtained from parents and assents from the children. All participants were evaluated and syndromic forms of NSCLP were excluded. Study inclusion was irrespective of ethnicity and/or sex. Family history and a minimum of a three-generation pedigree were obtained from all families. The characteristics of the six pairs of MZ twins and their samples are listed in Table 1. All affected individuals were post-surgical repair. Saliva samples were collected using the Oragene® Saliva Collection Kit or the Oragene® Saliva Self−Collection Kit for Young Children (Sponge Kit) according to manufacturer’s protocol. Genomic DNA was isolated from saliva samples obtained from these male MZ twins discordant for NSCLP and from their available parents.
TABLE 1

Description of twin pairs.

Pair*Sex#Status&Age (yrs)Cleft typeEthnicity@
MZ1MA8LeftNHW
MU8
MZ2MA1LeftH
MU1
MZ3MA10LeftH
MU10
MZ6MA8LeftH
MU7
MZ7MA2LeftNHW
MU2
MZ8MA15LeftNHW
MU15
Description of twin pairs.

Candidate Gene Assembly

A list of candidate genes for NSCLP was assembled from a thorough review of the literature of orofacial clefting including human association and sequencing studies and animal models. Genes associated with non-syndromic cleft lip with or without cleft palate, cleft palate only, and syndromic forms of clefting were included (Supplementary Table 1).

Whole Genome Sample Preparation and Sequencing

gDNA samples were evaluated for concentration by Qubit® fluorometric assay or by using Picogreen®. The integrity of the gDNA was assessed by loading approximately 100 ng per sample on a 0.75% agarose gel and comparing size distribution to a size marker (Quick-Load® 1 kb Extend DNA Ladder, NEB). 0.6–1 μg of gDNA was fragmented using the Covaris L-series or E-series instrument and sheared to a final insert size of approximately 400 bp. The purified fragmented DNA was repaired by the phosphorylation of the 5′ end and the conversion of overhangs to blunt ends. One “A” nucleotide was added to each 3′ end of the blunt fragments. The indexing-specific paired-end adaptors were ligated to the ends of the DNA fragments and the adaptor-ligated library was PCR amplified. Quality and quantity of the purified ligated products were verified using the Caliper LabChip® GX. Libraries were pooled and sequenced according to manufacturer’s instructions via the Illumina HiSeq X platform. Multiplexing 6 whole human genomes per flow cell generated an average depth of 30× per sample, exceeding read depth recommendations for Whole genome bisulfite sequencing (WGBS) analysis (Ziller et al., 2015). Comparison of the whole genome sequencing (WGS, Table 2) between the affected and unaffected co-twins did not identify a pathogenic point mutation, insertion/deletion, or structural alteration in any of the known CLP-associated genes listed in Supplementary Table 1.
TABLE 2

Whole genome DNA sequencing statistics.

Average reads uniquely aligned to human genome*Average genome coverageAverage discordant SNP callsAverage discordant exonic/splicing SNP callsDiscordant SNP calls in NSCLP candidate genes
9.19E + 10 (8.61E + 10–1.01E + 11)29.12 (28.42–32.52)196,253 (187,112–213,468)14,920 (13,435–19,423)0 (0–0)
Whole genome DNA sequencing statistics.

Whole Genome Bisulfite Sequencing

Whole genome bisulfite sequencing gDNA library preparations were carried out using the TruSeq DNA Sample Prep Kit v2 (Illumina Inc.) combined with sodium bisulfite conversion. Briefly, 1 μg of gDNA spiked with 5 ng unmethylated λ DNA (Promega) was fragmented to 200–300 bp peak size (Covaris, Woburn, MA, United States). Fragment size was analyzed on a Bioanalyzer DNA 1000 Chip (Agilent). End repair, sample purification with AMPure beads (Agencourt Bioscience Corp.), adenylation of 3′ ends, and adaptor ligation were carried out following the Illumina standard protocol. Ligation DNA was purified with AMPure, analyzed on a Bioanalyzer and subjected to bisulfite conversion using the MethylCode Bisulfite Conversion Kit (ThermoFisher) according to manufacturer’s instructions. Adaptor-ligated DNA was amplified through five cycles of PCR using the Hifi Uracil + DNA polymerase (Kapa Biosystems, Woburn, MA, United States) according to manufacturer’s protocol. Library quality was monitored using the Bioanalyzer (Agilent) and the concentration of fragments carrying adapters at both ends determined by quantitative PCR using the Library Quantification Kit from KAPA Biosystem. Paired-end DNA sequencing (124 bp each) using the Illumina Hi-Seq 2500 was performed. Generated sequencing reads (average read length 120.9) were trimmed using Trim Galore[1] and aligned to the bisulfite-converted human reference genome (GRCh37/hg19) using Burrows–Wheeler alignment. Methylation states were called using Bismark v0.10.0 (Krueger and Andrews, 2011). Confirmed clonal reads and reads with low mapping quality score were removed (Johnson et al., 2012), as well as, reads aligning to regions of the reference genome identified as troublesome for high- throughput sequencing aligners [DAC Blacklisted Regions (DBR) and Duke Excluded Regions (DER)[2]]. After data preprocessing, »54.3% of the resulting reads were uniquely mapped to the human reference genome. The median bisulfite conversion efficiency of CpGs was determined to be ∼98%. CpGs not covered by at least 20 reads were excluded, which resulted in a minimum of two reads per strand across all twelve samples. Reads on opposite strands with an absolute strand difference in methylation > 20% were excluded, leaving on average 2.4 × 106 unique and high-confidence CpGs per sample. Most CpG sites retained were highly methylated (over 50%, Supplementary Figure 1).

Array-Based Methylation Assay

Array-based methylation profiling was performed using the MethylationEPIC BeadChip (Illumina, San Diego, CA, United States) assay. Briefly, 300 ng gDNA per sample were bisulfite converted using the EZ-96 DNA Methylation kit (Zymo Research, Irvine, CA, United States) and processed as described by Illumina. The MethylationEPIC BeadChip measures over 850,000 CpG loci across a diverse set of functionally relevant genomic regions that include CpG islands (CGI) and promoters as well as CGI shores, gene body and intergenic CpGs. BeadChip images were scanned and the data analyzed using the R software (version 3.2.4[3]). Raw intensity files (.idat) were obtained and processed using RnBeads package (Assenov et al., 2014). For quality control, we selected probes that had a detection p-value < 0.01 for all samples. The detection p-value represents the quality of the probes compared to background noise. Because all samples were male, we did not filter probes associated with sex. Methylated and unmethylated signal intensities were quantile normalized for each probe type separately and both beta and M values were estimated. The beta values are the ratio of the methylated signal intensity to the sum of both methylated and unmethylated signals after background subtraction (beta values range from 0 to 1, corresponding to completely unmethylated and fully methylated sites, respectively). M values are logit transformation of the beta values, which is M = log (beta/1-beta). M values were used as outcome variables for subsequent analysis because they have been shown to have better statistical properties such as homoscedasticity (Du et al., 2017). For the analysis of individual CpGs, an absolute delta beta > 0.30 was used to detect differential methylation between twins.

Identification of Differentially Methylated Positions Within Twin Pairs

From the strand-overlapping sites with > 20 reads, we selected differentially methylated sites based on the following criteria: (1) intra-twin pair difference in methylation levels of at least 30% (delta methylation > | 0.3|) and (2) a coefficient of variation in methylation levels of the unaffected individuals of less than 0.25. This strategy of restricting the search on differential methylation to sites in which the variation in unaffected twins is <25% is based on the idea that sites that exhibit large inter-individual variation in methylation levels have less regulatory potential than those sites in which the methylation levels are more tightly preserved between unaffected individuals.

DMR Analysis

Differentially methylated regions (DMR) analysis used a Bayesian hierarchical model (Wu et al., 2015) to detect regional methylation differences with ≥ 3 contiguous sites each with a methylation difference > 20% within a 2 kb distance with unidirectional methylation changes and a region-wise average methylation difference > 25%.

DNA Bisulfite Conversion and Bisulfite Specific PCR

Flanking primers (methylation-unbiased nested PCR primers) were designed for a subset of differentially methylated CpG sites and quantitative levels of methylation for each CpG dinucleotide were evaluated using the BiQ Analyzer software. Bisulfite-PCR was followed by cloning and Sanger sequencing of multiple clones for short-range mapping of methylation levels in selected loci. Genomic DNA (0.25 μg) was bisulfite converted using the EpiTect Kit (Qiagen) following the manufacturer’s protocol. Converted DNA was amplified with primers designed in MethPrimer. Optimized PCR reactions were followed by cloning using the TOPO TA Kit (Invitrogen). Multiple clones were sequenced. Bisulfite-cloning validated 70% of the DMS (Supplementary Figure 2).

Pyrosequencing Validation

Locus-specific pyrosequencing was conducted to validate the methylation data at selected loci. Pyrosequencing assays were designed and implemented by EpigenDx (Worcester, MA, United States) following the manufacturer’s recommended protocol. The correlation between the WGBS and pyrosequencing data was evaluated using the Spearman rank order correlation test. Pyrosequencing validated 73% of the DMS. Interestingly, the majority of validated DMSs had contiguous CpGs that also showed differential methylation in the same direction and with a similar magnitude to the targeted CpG.

Results

DNA Methylation Profiles Are Different in Monozygotic Twins Discordant for NSCLP

Methylome analysis was performed on saliva samples from six sets of male MZ twins discordant for NSCLP using whole-genome bisulfite sequencing (WGBS). Globally, DNA methylation profiles generated by WGBS were highly correlated across all individuals, indicating inter-individual conservation in methylation levels. The correlation of methylation patterns of the co-twins was significantly higher compared to unrelated individuals, indicative of low within-pair DNA methylation variation (average r = 0.84 for twins and r = 0.75 for unrelated individuals at 30X coverage; Figure 1A and Supplementary Figure 3). Removal of areas of extreme variability across individuals (variably methylated regions, VMRs) (Feinberg et al., 2010) from the analysis of the 30X coverage increased the correlation for intra-twin comparisons (average r = 0.93), but decreased it for unrelated samples (average r = 0.65) (Figure 1B). This pattern of correlation change suggests that the polymorphic nature of these VMRs is affected to a larger extent by environmental rather than genetic influences.
FIGURE 1

MZ co-twins are highly correlated for DNA methylation but have site-specific differences. (A) Pearson’s correlation coefficients derived from comparing intra-twin, parent-child and parent to parent (unrelated) methylation levels at 100X (blue) or 30X (brown) genome coverage. (B) Correlation coefficients using only variable CpGs as identified by Feinberg et al. (2010) at 30X genome coverage. (C) Comparison between DNA methylation values for 16,920 individual CpG loci generated by MethylationEPIC microarray and whole-genome bisulfite sequencing. Methylation values generated by Rnbeads for the MethylationEPIC microarray were subtracted from methylation levels calculated from the sequencing data for the same sample (MZ1A to MZ3U). Averaged comparisons contrasting EPIC and WGBS data from unpaired (Unp.) samples is also included (black line). (D) Distribution of average absolute differences in DNA methylation within the MZ twins as well as comparisons of unrelated individuals. A significantly (P<1.2e-10 for WGBS and P<1.9E-12 for EPIC, Kolmogorov–Smirnov test) smaller number of CpG sites with a large average within-twin differences in methylation level was observed in within twin comparisons as contrasted with comparisons of unrelated individuals (unaffected-unaffected, NSCLP-unaffected (inter-twins) or NSCLP-NSCLP).

MZ co-twins are highly correlated for DNA methylation but have site-specific differences. (A) Pearson’s correlation coefficients derived from comparing intra-twin, parent-child and parent to parent (unrelated) methylation levels at 100X (blue) or 30X (brown) genome coverage. (B) Correlation coefficients using only variable CpGs as identified by Feinberg et al. (2010) at 30X genome coverage. (C) Comparison between DNA methylation values for 16,920 individual CpG loci generated by MethylationEPIC microarray and whole-genome bisulfite sequencing. Methylation values generated by Rnbeads for the MethylationEPIC microarray were subtracted from methylation levels calculated from the sequencing data for the same sample (MZ1A to MZ3U). Averaged comparisons contrasting EPIC and WGBS data from unpaired (Unp.) samples is also included (black line). (D) Distribution of average absolute differences in DNA methylation within the MZ twins as well as comparisons of unrelated individuals. A significantly (P<1.2e-10 for WGBS and P<1.9E-12 for EPIC, Kolmogorov–Smirnov test) smaller number of CpG sites with a large average within-twin differences in methylation level was observed in within twin comparisons as contrasted with comparisons of unrelated individuals (unaffected-unaffected, NSCLP-unaffected (inter-twins) or NSCLP-NSCLP). The WGBS data was validated globally by subjecting DNA from three of the MZ pairs to the MethylationEPIC microarray (Illumina) assay. This assay interrogates the DNA methylation status of ∼865,000 CpG sites, which account for only ∼3% of all human autosomal CpGs. The methylation values obtained from the microarray were correlated with those from WGBS. WGBS data (prefiltered as described in the Methods section) were further filtered to include only CpGs present in the MethylationEPIC microarray, resulting in a total of 16,920 loci for comparison (Figure 1C). The mean methylation level correlation between WGBS and the microarray determinations for the six samples was r = 0.90 and, on average, 97.63% of loci have delta methylation levels < | 0.2|, indicating that the methylation values derived from the sequencing and the array methods are consistent across commonly interrogated loci. To identify CpGs displaying differential methylation in any of the discordant twin pairs, the differences in CpG methylation between any two paired samples were estimated. Figure 1D shows the distribution of the mean absolute differences in DNA methylation levels per individual CpG within twins, as well as between unrelated samples. The overall distribution of average intra-twin DNA methylation differences showed a significant skew to the left as compared to the distribution of the average unrelated pair differences, with fewer CpG sites having large average differences in DNA methylation (Figure 1D). However, there is still variability in DNA methylation levels at individual CpG sites within the pairs of MZ twins.

DNA Methylation Differences in MZ Twin Pairs

The differentially methylated CpG sites (DMSs) identified within twin pairs is presented in Supplementary Table 2. The number of DMS per MZ pair varies from 7 (MZ2) to 42(MZ1), for a total of 151 DMSs associated with 147 genes. No single CpG is differentially methylated in more than one twin pair. This lack of overlap in DMS per pair is also seen at the gene level; there are no shared genes containing DMSs in the different MZ pairs. Table 3 lists the five genes with the most differential methylation per twin pair.
TABLE 3

Five most differentially methylated CpGs for each NSCLP twin pair.

ChrPositionGene 1Gene 2
MZ11684479557TLDC1ATP2C2
1024754430KIAA1217ARHGAP21
1024754448KIAA1217ARHGAP21
633518020BAK1ZBTB9
1623296275SCNN1BSCNN1G
MZ2124141700CCND2PARP11
1548849784FBN1DUT
828206128PNOCZNF395
124141713CCND2PARP11
6129494272LAMA2ARHGAP18
MZ3X85175162POF1BCHM
1234913189IRF2BP2TOMM20
1541104572DNAJC17
1541104527DNAJC17
145280071TCTEX1D4PTCH2
MZ62132153021TUBA3DPOTEE
510869280DAP
750850334GRB10COBL
750850302GRB10COBL
2230617445LIFHORMAD2
MZ7Y15676869TMSB4YUTY
Y15676854TMSB4YUTY
628663545SCAND3TRIM27
152616167ZFYVE9CC2D1B
152616219ZFYVE9CC2D1B
MZ8871368648NCOA2TRAM1
162141571NTHL1PKD1
51331076TERTCLPTM1L
7150822700AGAP3GBX1
7150822698AGAP3GBX1
Five most differentially methylated CpGs for each NSCLP twin pair. To evaluate epigenetic involvement of candidate NSCLP genes, we tested whether any of the identified DMSs in the discordant twins were located in known or suspected NSCLP genes (Supplementary Table 1). Evidence of aberrant methylation in single CpGs was observed in seven NSCLP genes, four in MZ1, two in MZ2 and one in MZ6 (Table 4). Note that the number of DMSs in candidate genes is more than expected by chance in MZ1 (p-value < 0.003) and in MZ2 (p-value < 0.0001).
TABLE 4

Differentially methylated positions in candidate genes identified in three NSCLP twin pairs.

MZ pairGeneChrPositionDelta methylationGenomic feature*Distance to TSS (kb)#Candidate group&
MZ1TP6331895090450.80In (alt. promoter)1.6AGV
WNT7A313787881−0.78IGR133AGV
MAFB2039373248−0.84IGR−55AGV
GDF68963658150.85IGR807AGV
MZ2MAF16792111330.63IGR400AGV
FBN11548849784−0.78In88SCLP
MZ6SOX91771010900−0.59IGR890SCLP
Differentially methylated positions in candidate genes identified in three NSCLP twin pairs. DNA methylation levels are strongly correlated among neighboring sites across the genome, and identification of clusters of differentially methylated positions rather than single CpGs usually predicts functional relevance (Eckhardt et al., 2006). Regional analysis is also less prone to be affected by the technical artifacts associated with individual sites. Thus, we performed systematic regional analyses of the WGBS data to identify differentially methylated regions (DMRs) in the MZ NSCLP discordant twins. Amongst the six MZ pairs, we identified a total of 768 DMRs, varying in numbers from 61 in MZ2 to 349 in MZ1 (Supplementary Table 2). Table 5 lists the 10 most differentially methylated DMRs per MZ twin pair. Candidate genes containing DMRs are listed in Table 6. While each twin pair had DMRs in one or more candidate genes, only twin pairs MZ1 MZ2, and MZ3 had an overrepresentation of candidate genes among all their respective DMRs (MZ1 p < 0.004, MZ2 p < 0.02, and MZ3 p < 0.0006). A gene-based overlap analysis detected only two candidate genes containing DMRs in more than a single MZ pair, MAFB (MZ6 and MZ8) and ZEB2 (MZ3 and MZ8). The DMRs in ZEB2 are in two different areas of intron 2 (AB011141) while the DMRs in MAFB are overlapping and located in the proximal upstream region.
TABLE 5

Top ten differentially methylated regions in each NSCLP twin pair.

ChrStartEnd# CpGsGene 1Gene2
MZ11013399222813399434025DPYSL4JAKMIP3
123147564323147606920EXOC8SPRTN
22067238742067239536NRP2INO80D
2230309896303101889HORMAD2MTMR3
2955372679553739111TEKT4
1746632154466322117HOXB3
41852512921852513457ENPP6IRF2
91364058361364059176ADAMTSL2FAM163B
1069664114696645887SIRT1HERC4
1735285044352852189LHX1MRM1
MZ212414167241417515CCND2PARP11
1938803473388041515YIF1BC19orf33
72832917283299010GNA12AMZ1
1714358178143582304HS3ST3B1PMP22
1757860285578603527VMP1TUBD1
22234810942234811977FARSBSGPP2
61586196351586197346TULP4GTF2H5
2119274639192747047CHODLC21orf91
1264216292642164116SRGAP1TMEM5
21318520871318521756FAM168B
MZ31541104483411046587DNAJC17
145280056452801318TCTEX1D4PTCH2
111178496111786118ANGPTL7EXOSC10
2219825520198256366C22orf29TBX1
1144359724443598096CD82ALX4
1685033868850339244ZDHHC7CRISPLD2
71371547591371548615PTNDGKI
1269069932690700325NUP107RAP1B
1748520295485203966ACSF2CHAD
1777638489776396566RBFOX3ENPP7
MZ6616997735816997744012THBS2WDR27
192960100296047911TLE6ZNF77
1913129284131293826NFIXLYL1
6289454932894556610TRIM27ZNF311
1615083564150836669PDXDC1NTAN1
854442087544421578OPRK1ATP6V1H
2230617390306174707LIFHORMAD2
1471721206717212849SIPA1L1PCNX
12394853539486149EFCAB4BPARP11
624647358246474468KIAA0319
MZ76286630142866532729SCAND3TRIM27
Y15676834156769057TMSB4YUTY
625882373258824348SLC17A3SLC17A2
316937763516937835511MECOMACTRT3
11549084641549087828PMVK
515008177150088617FBXL7ANKH
1756634655566348218SEPT4TEX14
X299467229947257ARSFMXRA5
91383391801383393998PPP1R26OLFM1
81412201601412202186TRAPPC9C8orf17
MZ81013534318313534328116CYP2E1SYCE1
1171210236712103159KRTAP5-7NADSYN1
913954644013954662911NOTCH1EGFL7
2246279171462793359WNT7BATXN10
517154173017154185211FBXW11STK10
31074168651074169689BBXCD47
268907054689071877ARHGAP25PROKR1
7520513652052687WIPI2RBAK
2223605028236050909BCRIGLL1
164135554641356396ROR1PGM1
TABLE 6

Differentially methylated regions identified in candidate genes.

ChrStartEnd# CpGDelta methylationGenedistance to TSSGenomic feature*Candidate group&
MZ1129763182979103100.50PRDM16−8IGRMGI
155359014553591535−0.51DHCR24−6IGRSCLP
1934257519342580350.50RPL5128InSCLP
12434514502434515444−0.53SDCCAG832InMGI
21140996571141003026−0.59PAX8−63IGRAGV
313787881137879714−0.78WNT7A134IGRAGV
341282685412832684−0.52CTNNB142IGRAGV
312932881312932911880.54PLXND1−3IGRMGI
318950904518950953540.58TP63160InAGV
51498612031498614154−0.51NDST1−26IGRMGI
8963658159636586940.85GDF6807IGRSCLP
101315555491315558496−0.51MGMT290InAGV
12996528409965289650.54APAF1614InMGI
1212500843812501178350.55NCOR230InSCLP
1310692969810693028540.55EFNB2257IGRMGI
1437123643371237005−0.62PAX9−7IGRMGI
14539767365397692350.64BMP4447IGRAGV
15268776532687785940.54GABRB3140InMGI
16793615367936163750.60MAF273IGRAGV
1742646731426467965−0.66FZD212IGRMGI
17720385937204063850.53RPL38−160IGRMGI
18463454074634555350.52SMAD7132InMGI
2234207931342113294−0.50LARGE107InAGV
MZ2187668656876687255−0.63HS2ST1288IGRMGI
11649049231649049784−0.54PBX1376IGRAGV
3577410065774317150.50SLMAP0PromMGI
41578632341578633284−0.55PDGFC29InAGV
644424513444245704−0.54RUNX2−872IGRMGI
1429568885295695974−0.56FOXG1334IGRSCLP
MZ321452164301452167095−0.58ZEB262InMGI
51765512691765513574−0.52NSD1−10IGRSCLP
9170054121700552240.58BNC2−135IGRMGI
164000600400065240.54CREBBP/ADCY9−70/166IGRAGV
16850338688503392440.73CRISPLD2180InAGV
1744918054449198024−0.53WNT3/WNT9B−23/−10IGRAGV
2219825520198256366−0.62TBX181InAGV
22464381394643820340.51WNT7B−65IGRAGV
MZ6115609550515609560850.63LMNA11InMGI
2438591164385922150.52THADA−36IGRAGV
91292502701292503326−0.57LMX1B−126IGRSCLP
11611434406114354940.56TMEM216−16IGRSCLP
15283388412833889850.50OCA26InMGI
2039319752393209396−0.68MAFB−2PromAGV
MZ713224733532247344750.51FGF9228IGRMGI
215860786015860801960.55ACVR1124InMGI
21430483894304856860.50RIPK4139IGRSCLP
MZ82039320842393210114−0.59MAFB−3IGRAGV
17886957988696508−0.51PIK3R5−1PromAGV
16688298936882995840.51CDH159InAGV
214524723314524729340.60ZEB231InMGI
Top ten differentially methylated regions in each NSCLP twin pair. Differentially methylated regions identified in candidate genes.

Enrichment of Identified DMRs in Shared Disease-Relevant Pathways

To identify enriched molecular pathways, EnrichR (Chen et al., 2013; Kuleshov et al., 2016) was used to analyze the DMR-gene set from each twin pair (Table 7). Across the six twin pairs, comparison of the top 10 significantly enriched Kegg pathways from all differentially methylated genes identified common enrichment of the “Hippo Signaling Pathway” in three pairs (MZ1, MZ3 and MZ8) and “Signaling pathways regulating pluripotency of stem cells” in two pairs (MZ1 and MZ3). Interestingly, pathway analysis using only our candidate genes identified enrichment in Hippo signaling in MZ1 (BMP4, FZD2, WNT7A, CTNNB1, GDF6, and SMAD7) and Wnt signaling in MZ3 (CREBBP, WNT9B, and WNT3). Additionally, we found that the combined set of genes identified as differentially methylated in any twin pair (set of 1,234 genes) was significantly enriched in the terms Osteoblast differentiation (p < 6.6E-8), and Neural crest migration (p < 4.1E-4) (Supplementary Figure 4).
TABLE 7

Pathways enriched for differentially methylated genes*.

IndexNameAdjusted p-value
MZ1
1Signaling pathways regulating pluripotency of stem cells0.009879
2Thyroid hormone signaling pathway0.01003
3Longevity regulating pathway0.0102
4Transcriptional misregulation in cancer0.01558
5Small cell lung cancer0.01709
6Hippo signaling pathway0.02574
7Longevity regulating pathway0.0261
MZ3
1Wnt signaling pathway0.0008854
2Melanogenesis0.001263
3Pathways in cancer0.001309
4Insulin secretion0.00152
5Glucagon signaling pathway0.002417
6Pancreatic secretion0.002557
7Cushing syndrome0.002559
8Basal cell carcinoma0.002761
9Salivary secretion0.01002
10cAMP signaling pathway0.01172
11Aldosterone synthesis and secretion0.01216
12Gastric cancer0.01276
13Hippo signaling pathway0.01431
14Gastric acid secretion0.02412
15Thyroid hormone synthesis0.02429
16Oocyte meiosis0.02514
17Relaxin signaling pathway0.02519
18Vascular smooth muscle contraction0.02559
19Cell cycle0.02588
20p53 signaling pathway0.02664
21Estrogen signaling pathway0.02739
22Signaling pathways regulating pluripotency of stem cells0.02791
23Apelin signaling pathway0.0287
24Colorectal cancer0.02993
25Adrenergic signaling in cardiomyocytes0.03086
MZ8
1Hippo signaling pathway0.03994
Pathways enriched for differentially methylated genes*.

Between-Group Analyses Identified Additional DMRs

Group level DNA methylation differences between NSCLP cases and controls were also examined. Unlike the within-pair discordant MZ twin design, between-group DNA methylation differences can be attributed to both genetic and epigenetic factors. This analysis identified significantly different methylation regions only in the genes KCNAB3, NWD1, IFLTD1, and TRBJ2-3. No DMRs in NSCLP candidate genes were identified in this analysis.

Discussion

This is the first comprehensive study of epigenetic differences in MZ twins discordant for NSCLP. This epigenome-wide study was performed using saliva DNA from six male MZ twin pairs discordant for NSCLP to determine if differences in DNA methylation could explain the discordant NSCLP phenotype. Results from this study provide novel and exciting findings which point to the importance of methylation in craniofacial development. We found regional methylation differences in a number of genes in each twin pair (range 61–349). Less than 3% (37 out of 1,338) of these genes were differentially methylated in more than one twin pair. A single gene, TRIM27, was differentially methylated in three pairs. TRIM27 has not been previously associated with clefting or craniofacial development but has been shown to be differentially methylated in prostate and lung cancers (Ji et al., 2020; Wang et al., 2020). Interestingly, differential methylation of TRIM27 was reported in a study comparing cleft lip only to cleft palate only (Sharp et al., 2017). When we restricted our analysis to NSCLP candidate genes, evidence of methylation differences was found, suggesting their possible contribution to NSCLP through epigenetic variation. Of the 49 candidate genes which exhibited differential methylation, only two, MAFB and ZEB2, were differentially methylated in more than one twin pair. Importantly, the changes were all in the same direction, suggesting that these two genes are epigenetically important in orofacial development. This provides additional support for the previously reported association between NSCLP and MAFB, which is expressed in palatal shelves (Beaty et al., 2010). ZEB2 is involved in embryonic development of facial features (Ghoumid et al., 2013; Funato and Nakamura, 2017), and Zeb2 mutant mice exhibit alterations in neural crest migration and midfacial clefting (Van de Putte et al., 2003; Miyoshi et al., 2006). The lack of significant gene overlap points to the underlying heterogeneity of NSCLP. Interestingly, although the majority of the identified DMRs do not lie in genes previously associated with NSCLP, pathway enrichment analysis revealed overrepresentation of the Hippo pathway in three of the six twin pairs. This signaling pathway was initially discovered as a key regulator of tissue growth in flies and regulates organ size and tissue repair by controlling cell proliferation, survival, mobility, stemness, and differentiation. The Hippo pathway is tightly regulated by both intrinsic and extrinsic signals, such as mechanical force, cell–cell contact, polarity, energy status, stress, and many diffusible hormonal factors. Studies have shown evidence of crosstalk between the Hippo pathway and other key signaling pathways, such as Wnt signaling (Li et al., 2019), and has recently been associated with craniofacial development (Wang et al., 2016; Wang and Martin, 2017; Sun et al., 2018) and mouse secondary palate formation (Goodwin et al., 2020). The common enrichment in this pathway is not due to alterations in a small subset of similar genes in all pairs (see Supplementary Figure 5). Rather, there is pathway convergence through alterations in MZ pair-specific methylation in different genes. A comparison of our results with a recent orofacial clefting case-control report revealed a little overlap (Alvizi et al., 2017). Interestingly, this overlap includes the NSCLP candidate gene MAFB (Supplementary Table 3). The different findings between these studies provides further evidence for complex genetic and epigenetics involvement in the etiology of NSCLP. It is well known that genetic variation in a single gene can cause syndromic cases of orofacial clefting. It is therefore possible that extreme epigenetic variation in a few key genes could produce a clefting phenotype. However, it is more likely that the effects of most DNA methylation changes are subtle, similar to some non-coding genetic variants, and that multiple changes act in concert to ultimately contribute to non-sydromic orofacial clefting, as discussed above. The observation of DMPs in genes previously implicated in NSCLP, such as TP63, TBX1, CRISPLD2, CREBBP, and BMP4, suggest that de novo epimutations, which could imitate genetic mutations (Martin et al., 2005), are likely involved in the pathogenesis of NSCLP. Since we found methylation changes in several NSCLP candidate genes in each affected twin, we propose that an “epigenetic load” mechanism is involved in the dysregulation of orofacial morphogenesis, rather than a single dysfunctional NSCLP gene. This is consistent with the accepted multifactorial inheritance of NSCLP (Carter, 1969; Hecht et al., 1991; Howe et al., 2018). Based on the important role of epigenetic mechanisms in regulating gene expression, it is likely that methylomic variation could mediate disease susceptibility by cumulative altered gene expression, similar to genetic variation in regulatory regions of the genome. The use of saliva as a proxy for the lip and palatal tissues is justified by their common developmental origin (Patel and Hoffman, 2014). Salivary glands, lip, and palate all arise from neural crest cells, making cells from saliva a developmentally appropriate proxy for these tissues. In addition, saliva is a good source of high quality DNA for use in epigenomics (Thompson et al., 2013; Smith et al., 2015) and has been deemed more informative than blood for DNA methylation analysis (Lowe et al., 2013) in spite of the possibility of bacterial contamination resulting in non-human reads in human sequencing data (Samson et al., 2020). Our sample size, while small, exceeds the number of twins in most epigenetic studies of birth defects with most being based on a single discordant twin pair (Weksberg et al., 2002; Rall et al., 2011; Tierling et al., 2011; Jin et al., 2014; Wei et al., 2015; Lu et al., 2017; Lyu et al., 2018). NSCLP has an estimated frequency of 1 in 700; thus, discordant MZ twin pairs occur infrequently but are rich source of information. The stringent parameters used in this study, together with the validation of selected DMPs by either bisulfite sequencing, pyrosequencing, or methylation arrays, suggest that these are likely real differences rather than reflecting experimental noise. Amongst the identified methylation differences between co-twins, we propose that many of them underlie the discordance in NSCLP but recognize that there will be some that may reflect individual exposure to environmental factors unrelated to NSCLP, or perhaps non-craniofacial phenotypic differences. However, identification of DMRs in NSCLP candidate genes and the pathway analyses of the unbiased genes containing DMRs suggest that these DMRs play an etiologic role in NSCLP. In summary, we show for the first time that MZ twins discordant for NSCLP have differential methylation. The findings from this novel study demonstrate that methylation may play a significant role in the etiology of NSCLP and, more importantly, differential methylation could account for the “reduced penetrance” seen in many multiplex families (Hecht et al., 1991; Hecht and Blanton, 1998). Since the discordant twins share age, sex and socioeconomic status and do not exhibit CLP-related genotype variability (as analyzed by WGS), our data highlights the role of non-shared environmental and stochastic factors in the etiology of epigenetic differences and resulting discordant NSCLP phenotypes. Importantly, we find methylation differences in genes previously implicated by NSCLP genetic studies. Moreover, our data suggest that although DNA methylation at some CpG sites is consistently altered in affected twins, most differences are individual-specific, revealing epigenetic heterogeneity between NSCLP individuals. This suggests that each affected individual/family may have their own unique epigenetic signature making etiologic identification complex. However, the differences converge at the level of potentially affected pathways including the Hippo signaling pathway. Our findings complement a recent NSCLP case-control study suggesting that epigenetic mechanisms underlie NSCLP (Alvizi et al., 2017). Future studies are aimed at defining epigenetic changes in sporadic cases, as well as in multiplex NSCLP families in which DNA methylation patterns can be traced through multiple generations.

Data Availability Statement

The data presented in the study are deposited in the Gene Expression Omnibus (GEO) repository, accession number GSE173211.

Ethics Statement

The studies involving human participants were reviewed and approved by this study was approved by the University of Texas Health Science Center Committee for Protection of Human Subjects (HSC-MS-03-090) and the University of Miami Human Subject Research Office (IRB #20100505). Written informed consent to participate in this study was provided by the participants’ legal guardian/next of kin.

Author Contributions

JY, JH, and SB: conceptualization, investigation, writing—original draft, and writing—review and editing. JY and SS: formal analysis and methodology. All authors contributed to the article and approved the submitted version.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
  62 in total

Review 1.  Salivary gland development: a template for regeneration.

Authors:  Vaishali N Patel; Matthew P Hoffman
Journal:  Semin Cell Dev Biol       Date:  2013-12-11       Impact factor: 7.727

2.  Complementary expression pattern of Zfhx1 genes Sip1 and deltaEF1 in the mouse embryo and their genetic interaction revealed by compound mutants.

Authors:  Tomoya Miyoshi; Mitsuji Maruhashi; Tom Van De Putte; Hisato Kondoh; Danny Huylebroeck; Yujiro Higashi
Journal:  Dev Dyn       Date:  2006-07       Impact factor: 3.780

3.  A genome-wide association study of cleft lip with and without cleft palate identifies risk variants near MAFB and ABCA4.

Authors:  Terri H Beaty; Jeffrey C Murray; Mary L Marazita; Ronald G Munger; Ingo Ruczinski; Jacqueline B Hetmanski; Kung Yee Liang; Tao Wu; Tanda Murray; M Daniele Fallin; Richard A Redett; Gerald Raymond; Holger Schwender; Sheng-Chih Jin; Margaret E Cooper; Martine Dunnwald; Maria A Mansilla; Elizabeth Leslie; Stephen Bullard; Andrew C Lidral; Lina M Moreno; Renato Menezes; Alexandre R Vieira; Aline Petrin; Allen J Wilcox; Rolv T Lie; Ethylin W Jabs; Yah Huei Wu-Chou; Philip K Chen; Hong Wang; Xiaoqian Ye; Shangzhi Huang; Vincent Yeow; Samuel S Chong; Sun Ha Jee; Bing Shi; Kaare Christensen; Mads Melbye; Kimberly F Doheny; Elizabeth W Pugh; Hua Ling; Eduardo E Castilla; Andrew E Czeizel; Lian Ma; L Leigh Field; Lawrence Brody; Faith Pangilinan; James L Mills; Anne M Molloy; Peadar N Kirke; John M Scott; James M Scott; Mauricio Arcos-Burgos; Alan F Scott
Journal:  Nat Genet       Date:  2010-05-02       Impact factor: 38.330

4.  Genetics of common disorders.

Authors:  C O Carter
Journal:  Br Med Bull       Date:  1969-01       Impact factor: 4.291

5.  Evidence for craniofacial enhancer variation underlying nonsyndromic cleft lip and palate.

Authors:  Vershanna E Morris; S Shahrukh Hashmi; Lisha Zhu; Lorena Maili; Christian Urbina; Steven Blackwell; Matthew R Greives; Edward P Buchanan; John B Mulliken; Susan H Blanton; W Jim Zheng; Jacqueline T Hecht; Ariadne Letra
Journal:  Hum Genet       Date:  2020-04-21       Impact factor: 4.132

6.  Genomic and epigenomic analyses of monozygotic twins discordant for congenital renal agenesis.

Authors:  Meiling Jin; Shida Zhu; Panpan Hu; Dongbing Liu; Qinggang Li; Zuoxiang Li; Xueguang Zhang; Yuansheng Xie; Xiangmei Chen
Journal:  Am J Kidney Dis       Date:  2014-02-28       Impact factor: 8.860

7.  Enrichr: a comprehensive gene set enrichment analysis web server 2016 update.

Authors:  Maxim V Kuleshov; Matthew R Jones; Andrew D Rouillard; Nicolas F Fernandez; Qiaonan Duan; Zichen Wang; Simon Koplev; Sherry L Jenkins; Kathleen M Jagodnik; Alexander Lachmann; Michael G McDermott; Caroline D Monteiro; Gregory W Gundersen; Avi Ma'ayan
Journal:  Nucleic Acids Res       Date:  2016-05-03       Impact factor: 16.971

8.  Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications.

Authors:  Felix Krueger; Simon R Andrews
Journal:  Bioinformatics       Date:  2011-04-14       Impact factor: 6.937

9.  A combination of transcriptome and methylation analyses reveals embryologically-relevant candidate genes in MRKH patients.

Authors:  Katharina Rall; Gianmaria Barresi; Michael Walter; Sven Poths; Karina Haebig; Karin Schaeferhoff; Birgitt Schoenfisch; Olaf Riess; Diethelm Wallwiener; Michael Bonin; Sara Brucker
Journal:  Orphanet J Rare Dis       Date:  2011-05-28       Impact factor: 4.123

10.  Identification of shared and unique gene families associated with oral clefts.

Authors:  Noriko Funato; Masataka Nakamura
Journal:  Int J Oral Sci       Date:  2017-01-20       Impact factor: 6.344

View more
  4 in total

1.  Variable paralog expression underlies phenotype variation.

Authors:  Raisa Bailon-Zambrano; Juliana Sucharov; Abigail Mumme-Monheit; Matthew Murry; Amanda Stenzel; Anthony T Pulvino; Jennyfer M Mitchell; Kathryn L Colborn; James T Nichols
Journal:  Elife       Date:  2022-09-22       Impact factor: 8.713

2.  KDM6B interacts with TFDP1 to activate P53 signaling in regulating mouse palatogenesis.

Authors:  Tingwei Guo; Xia Han; Jinzhi He; Jifan Feng; Junjun Jing; Eva Janečková; Jie Lei; Thach-Vu Ho; Jian Xu; Yang Chai
Journal:  Elife       Date:  2022-02-25       Impact factor: 8.713

3.  Mouse models in palate development and orofacial cleft research: Understanding the crucial role and regulation of epithelial integrity in facial and palate morphogenesis.

Authors:  Yu Lan; Rulang Jiang
Journal:  Curr Top Dev Biol       Date:  2022-02-28       Impact factor: 5.242

4.  Exploring the Molecular Mechanism of lncRNA-miRNA-mRNA Networks in Non-Syndromic Cleft Lip with or without Cleft Palate.

Authors:  Xiangpu Wang; Siyuan Guo; Xinli Zhou; Yupei Wang; Ting Zhang; Renji Chen
Journal:  Int J Gen Med       Date:  2021-12-16
  4 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.