Literature DB >> 25406947

Meta-analysis of human methylation data for evidence of sex-specific autosomal patterns.

Nina S McCarthy1, Phillip E Melton, Gemma Cadby, Seyhan Yazar, Maria Franchina, Eric K Moses, David A Mackey, Alex W Hewitt.   

Abstract

BACKGROUND: Several individual studies have suggested that autosomal CpG methylation differs by sex both in terms of individual CpG sites and global autosomal CpG methylation. However, these findings have been inconsistent and plagued by spurious associations due to the cross reactivity of CpG probes on commercial microarrays. We collectively analysed 76 published studies (n = 6,795) for sex-associated differences in both autosomal and sex chromosome CpG sites.
RESULTS: Overall autosomal methylation profiles varied substantially by study, and we encountered substantial batch effects. We accounted for these by conducting random effects meta-analysis for individual autosomal CpG methylation associations. After excluding non-specific probes, we found 184 autosomal CpG sites differentially methylated by sex after correction for multiple testing. In line with previous studies, average beta differences were small. Many of the most significantly associated CpG probes were new. Of note was differential CpG methylation in the promoters of genes thought to be involved in spermatogenesis and male fertility, such as SLC9A2, SPESP1, CRISP2, and NUPL1. Pathway analysis revealed overrepresentation of genes differentially methylated by sex in several broad Gene Ontology biological processes, including RNA splicing and DNA repair.
CONCLUSIONS: This study represents a comprehensive analysis of sex-specific methylation patterns. We demonstrate the existence of sex-specific methylation profiles and report a large number of novel DNA methylation differences in autosomal CpG sites between sexes.

Entities:  

Mesh:

Year:  2014        PMID: 25406947      PMCID: PMC4255932          DOI: 10.1186/1471-2164-15-981

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

DNA methylation of the eukaryotic genome is essential for normal cellular differentiation and embryonic development [1, 2]. Methylation within gene promoter regions is important in the regulation of gene expression [3], and changes in both overall methylation and specific methylation patterns have been shown to vary within an individual according to tissue type and disease status [4]. In healthy people, inter-individual differences in methylation are also observed and it has been postulated that these differences may be influenced by various factors, including sex [5-16]. Studies that have investigated methylation differences between males and females in repeat sequences (LINE and Alu) and other targeted genomic areas have reported sex-specific methylation differences at various autosomal sites and suggested a tendency toward higher methylation levels in males than in females [5-11], although one study reported no significant difference [17]. Advances in microarray technology have enabled the assessment of genome-wide methylation by surveying CpG methylation at thousands of sites across the genome. The Illumina Infinium series is a commonly used commercial platform and the HumanMethylation27K was one of the first comprehensive methylation microarrays. It interrogates over 27,500 CpG sites in the promoter regions of more than 14,000 RefSeq genes. Studies using this array have reported that a large number of autosomal CpG sites (up to 5% of autosomal loci, or 1,333 CpGs) appear to be differentially methylated in females and males, and that global CpG methylation is higher in males than in females [10, 12–16]. Recent work has demonstrated substantial sequence overlap between autosomal and sex-linked probes on the HumanMethylation27K microarray and that up to 10% of the probes are nonspecific and map to highly homologous genomic sequences [18]. The majority of the sex-associated methylation sites at autosomal CpG loci reported to date are likely to be technical artefacts created by the presence of cross-reactive autosomal probes hybridizing to both autosomal and sex chromosomes [18, 19]. We report new CpG associations with sex that are not due to nonspecific probes on the array. Many of the CpG associations lie in genes thought to be involved in spermatogenesis and male fertility, such as SLC9A2, SPESP1, CRISP2, and NUPL1. Pathway analysis revealed overrepresentation of genes differentially methylated between males and females in several Gene Ontology (GO) biological processes, such as regulation of immune response, RNA splicing and DNA repair. Our findings support previously reported global methylation differences between females and males: increased autosomal methylation in males and increased X chromosome methylation in females.

Results

X chromosome analysis and sex assignment

Following quality control (see Methods section), the final dataset included data on 7,333 samples from 81 studies from the European Bioinformatics Institute (EBI) database. Of these, 5,207 samples had sex recorded from the EBI phenotype files. The characteristics of this final dataset are displayed in Table 1.
Table 1

Characteristics of the final dataset (n = 7,333, 81 studies) included in the analysis

Study characteristics
Mean study sample size (sd; range)90.5 (120.0; 6-719)
Sex:number recorded (%)5,207 (71.0)
-of which n female (%)2,870 (55.1)
n male (%)2,277 (43.7)
n other (trisomies/XXY) (%)60 (1.1)
following final classification:
n female (%)3,647 (49.7)
n male (%)3,686 (50.3)
Sample Source:
n PBLs (%)3,416 (46.6)
n tissue (%)3,917 (53.4)
-of which n cancer tissue (%)2,432 (62.1)

Abbreviations: PBL peripheral blood leukocytes, n number, SD standard deviation.

Characteristics of the final dataset (n = 7,333, 81 studies) included in the analysis Abbreviations: PBL peripheral blood leukocytes, n number, SD standard deviation. We initially sought to investigate whether sex could be inferred from X chromosome methylation data using principal component analysis (PCA). The first two principal components (PCs) were plotted against each other for all samples of known sex (Figure 1(A)). Colouring by the recorded sex from the EBI phenotype files indicates that sex can be determined by classifying samples based on their first PC, with samples recorded as ‘other/trisomy’ (n = 60) clustering in the middle. The second PC, in contrast, contributes little to the separation of males and females. Figure 1(B) shows that the samples of unknown sex cluster well with those of known sex. Logistic regression of recorded sex on PC1 and recorded sex on PC2 showed that both relationships were significant and, as expected, PC1 was a much better predictor than PC2 (PC1: P < 2e-16, AIC 2447.4; PC2: P = 1e-08, AIC 7037.2).
Figure 1

Plot of the first two principal components (PC1 and PC2) based on the 999 X chromosome CpG sites for known males (n = 2,277), females (n = 2,870) and other (n = 60) (plot A) and with those of unknown sex (n = 2,126) superimposed (plot B).

Plot of the first two principal components (PC1 and PC2) based on the 999 X chromosome CpG sites for known males (n = 2,277), females (n = 2,870) and other (n = 60) (plot A) and with those of unknown sex (n = 2,126) superimposed (plot B). Due to X-inactivation as a result of lyonisation [13], global methylation values across the 999 X chromosome sites contained on the HumanMethylation27K BeadChip were expected to be higher in females than in males and could therefore represent a robust means to distinguish between the sexes. A global X chromosome methylation value was calculated for each individual by summing the individual beta values at each of the 999 X chromosome CpG sites. Global X chromosome methylation across the whole cohort (n = 7,333) was approximately normally distributed (mean ± sd: 395 ± 83; kurtosis = 2.95, skewness = 0.21). For those of known sex (n = 5,207), global X chromosome methylation was significantly higher in females (mean ± sd: 455 ± 52) compared to males (mean ± sd: 328 ± 48; Welch Two Sample t-test P <2.2e-16). Sex was also inferred by global X chromosome methylation values using the midpoint between the mean global X chromosome methylation values for males and females (391.5). Of the 5,147 recorded sexes (2,277 males and 2,870 females), using global X chromosome methylation to identify males gave a sensitivity (percentage of true males correctly identified) of 91.7% (2,088/2,277) and a specificity (percentage of true females correctly identified) of 89.3% (2,563/2,870) with an overall percentage of 90.7% samples where sex was correctly identified. By comparison, using PC1 to identify males gave a sensitivity (percentage of true males correctly identified) of 93.4% (2,127/2,277) and a specificity (percentage of true females correctly identified) of 93.4% (2,682/2,870) with an overall percentage of 93.4% samples in which sex was correctly identified. Given that PC1 was a more sensitive and specific method than using global X chromosome methylation, we used PC1 to re-classify all sexes in the sample (n = 7,333), resulting in a population of 3,647 females and 3,686 males. Inspecting all samples (n = 7,333) after sex re-classification revealed some outliers, which were removed (Additional file 1: Figure S2 and S3), leaving a final population of n = 6,795, with n = 5,016 samples of recorded sex. Of these 5,016, 94.6% were consistent with sex as classified using PC1 (Additional file 2: Table S1). For this final sample of n = 6,795, density plots of global X chromosome methylation by sex revealed distinct peaks using sex as assigned by PC1 (Additional file 1: Figure S3D). Global X chromosome methylation was significantly higher in females (mean ± sd: 463.9 ± 45.6) compared to males (mean ± sd: 314.9 ± 29.0; Welch Two Sample t-test P <2.2e-16) using sex as assigned by PC1.A receiver-operating characteristic (ROC) curve comparing the predictive ability of three metrics generated from the X chromosome methylation data (PC1, PC2 and global X chromosome methylation) showed that PC1 was the best predictor of sex (Figure 2). The area under the curve (AUC) was 0.948 for PC1 versus 0.936 for global X chromosome methylation, and only 0.553 for PC2.
Figure 2

Receiver operating characteristic (ROC) curve comparing the predictive ability of three of the metrics generated from the X chromosome methylation data; PC1, PC2 and global X chromosome methylation.

Receiver operating characteristic (ROC) curve comparing the predictive ability of three of the metrics generated from the X chromosome methylation data; PC1, PC2 and global X chromosome methylation.

Differences in autosomal methylation between sexes

Following quality control, 27,231 CpG sites on the HumanMethylation27K chip remained for analysis in 6,795 individuals who were successfully classified by sex (Figure 3(A) and (B)). Of these, 26,225 CpG sites were located on the autosomes (Figure 3(B)). A density plot of individual methylation beta values for each of the 26,225 autosomal CpG sites for all 6,795 individuals (Figure 4(A)) showed that across all studies and for both sexes, the majority (68%) of CpG sites had methylation values <0.3, whilst 17.4% of CpG sites had beta values >0.7, the range at which probes would be considered to be fully methylated [20, 21]. These percentages were not substantially different by sex (Additional file 2: Table S2).
Figure 3

Flow chart of sample quality control (A) and CpG quality control (B) for the data downloaded from the European Bioinformatics Institute (EBI) database. PBL: Peripheral blood leukocyte.

Figure 4

Density plots of methylation beta values at individual CpG sites (A) and global autosomal methylation (B) across all 26,225 autosomal CpGs in all 6,795 samples, coloured by sex.

Flow chart of sample quality control (A) and CpG quality control (B) for the data downloaded from the European Bioinformatics Institute (EBI) database. PBL: Peripheral blood leukocyte. Density plots of methylation beta values at individual CpG sites (A) and global autosomal methylation (B) across all 26,225 autosomal CpGs in all 6,795 samples, coloured by sex. Global methylation across the 22 autosomes was calculated for each sample by summing the individual CpG beta values across the 26,225 autosomal CpG sites. Global autosomal methylation was approximately normally distributed (mean ± sd: 7055 ± 652; kurtosis = 6.6, skewness = -0.31) but there were a number of female outliers (Figure 4(B) and Additional file 1: Figure S4), which skewed the mean female global autosomal methylation value (Additional file 2: Table S3). Median global autosomal methylation in males was slightly, yet statistically significantly higher than in women (median [IQR]: males 7,190 [6,770-7,426], females 7,135 [6,754-7,368]), Wilcoxon rank sum P =5.2e-05). Global methylation across autosomal CpG sites varied greatly between studies (Additional file 1: Figure S5). Given that this variability could be due to underlying batch or cohort effects, a random effects meta-analysis was conducted for studies that included ≥20 individuals, and in which both sexes were represented in at least a 1:4 ratio (Additional file 3). A total of 39 studies (n = 4,172) met these criteria, of which 10 included cancer samples. The difference in global autosomal methylation according to sex in each of these studies is shown in Additional file 1: Figure S6. Meta-analysis revealed that global autosomal methylation was indeed very heterogeneous across studies, with the proportion of global autosomal variation in study estimates that is due to heterogeneity (I2) =87.9% [95% CI: 84.5%; 90.6%, P < 0.0001]. However, a funnel plot of the 39 studies did not indicate any major outliers (Additional file 1: Figure S7) and the 10 cancer studies clustered well with the 29 non-cancer studies. Despite the high heterogeneity, application of a random effects model suggested that global autosomal methylation was nominally higher in males than in females (mean beta difference = 48.9, P = 0.049). PCA of individual beta values at autosomal CpG sites by sex was also highly confounded by batch effect, even after adjustment for study (Figure 5). Hence, to identify individual CpG sites differentially methylated between males and females, we conducted a random effects meta-analysis using the 39 studies (n = 4,172) in which both sexes were represented. Estimated overall effects (mean difference in beta values between females and males, weighted by study) and corresponding P values were calculated. Overall, 235 (0.9%) of the 26,225 CpG sites were associated with sex after Bonferroni correction (P <1.9e-06).
Figure 5

Principal Components plots using the 1 two principal components calculated using individual beta values at all autosomal CpG sites adjusted for study, coloured by study (A) and sex (B).

Principal Components plots using the 1 two principal components calculated using individual beta values at all autosomal CpG sites adjusted for study, coloured by study (A) and sex (B).

Autosomal gene-specific methylation

All previously identified non-specific or polymorphic CpG probes (as described by Chen and colleagues [18]) on the HumanMethylation27K array were cross-referenced against the 26,225 autosomal CpGs in the meta-analysis. In total, 2,783 of the 26,225 autosomal CpGs were non-specific and 838 were polymorphic. Of the 235 CpG sites associated with sex after Bonferroni correction in this study, 48 probes are non-specific and of the remaining 187 probes, 3 were polymorphic. These probes are indicated in the full results table (Additional file 4). Meta-analysis of the cohort excluding cancer samples (31 studies, n = 2,900) was performed to ensure results were not due to confounding by cancer samples. There was good correlation between the P values from both meta-analyses, with linear regression r2 = 0.33, P < 2.2e-16 (Additional file 1: Figure S8). In addition, the majority (150/235) of the CpG sites that passed Bonferroni correction (P < 1.91e-06) in the meta-analysis of all samples passed the same threshold in the meta-analysis excluding cancer samples. The remaining 85 had P values between 1.94e-06 and 1.28e-02 in the meta-analysis excluding cancer samples (Additional file 4). Mean differences in beta values between females and males were small. The mean beta difference between sexes for the 184 statistically significant probes was 0.037 (3.7%, Additional file 2: Table S4), with the vast majority of associated CpG sites (n = 178, 97%) more methylated in females than in males. Pathway analysis revealed significant enrichment of genes with sex-associated changes in CpG methylation in 53 GO Biological Pathways at P < 0.05 (FDR adjusted). All biological processes enriched at P < 0.01 (FDR adjusted) are shown in Table 2 and largely comprise cellular ‘housekeeping’ functions. Gene overlap between pathways was relatively low, with 367 (73%) of the 500 genes in the top four processes (RNA splicing, DNA repair, protein modification by small protein conjugation and viral reproduction) unique to only one of these pathways (Additional file 1: Figure S9). These top four processes were relatively distinct from the individually most strongly sex-associated CpG sites, with only 10 genes from these processes represented in the CpG sites which passed Bonferroni correction. The median –log10 P values of the CpG sites of the genes in the top four biological processes compared to P values across all CpG sites are displayed in Figure 6.
Table 2

All results P < 0.01 (FDR adjusted) from the NetGestalt Pathway Analysis

GO Biological ProcessGO AccessionNumber of genes Pvalue (FDR adjusted)D
RNA splicingGO:00083801225.73E-040.21
DNA repairGO:00062811165.73E-040.17
Protein modification by small protein conjugationGO:00324461225.73E-040.17
Viral reproductionGO:00160321405.73E-040.15
mRNA processingGO:00063971631.13E-030.18
Interphase of mitotic cell cycleGO:00513291701.13E-030.16
Protein catabolic processGO:00301631961.86E-030.15
InterphaseGO:00513251721.86E-030.15
RNA splicing, via transesterification reactionsGO:0000375832.31E-030.22
Protein ubiquitinationGO:00165671192.76E-030.15
TranslationGO:00064121013.53E-030.16
rRNA metabolic processGO:0016072307.11E-030.33
Chromatin modificationGO:0016568977.11E-030.15
M phaseGO:00002791657.84E-030.13
Cellular protein catabolic processGO:00442571648.92E-030.14
rRNA processingGO:0006364289.76E-030.33

GO: Gene Ontology. D: D-statistic; the maximum difference in cumulative fraction (of P value distribution). Only positive associations are shown (pathways enriched for genes containing CpG sites with lower P values compared to all CpG sites). The number of genes in each pathway that were represented by CpG sites in the meta-analysis is also shown.

Figure 6

Boxplot of meta analysis –log10 for all 26, 225 autosomal CpGs, compared to those located in the genes included in the top four enriched GO Biological Processes; RNA splicing (n = 112 genes), DNA repair (n = 116 genes), protein modification by small protein conjugation (n = 122 genes) and viral reproduction (n = 140 genes).

All results P < 0.01 (FDR adjusted) from the NetGestalt Pathway Analysis GO: Gene Ontology. D: D-statistic; the maximum difference in cumulative fraction (of P value distribution). Only positive associations are shown (pathways enriched for genes containing CpG sites with lower P values compared to all CpG sites). The number of genes in each pathway that were represented by CpG sites in the meta-analysis is also shown. Boxplot of meta analysis –log10 for all 26, 225 autosomal CpGs, compared to those located in the genes included in the top four enriched GO Biological Processes; RNA splicing (n = 112 genes), DNA repair (n = 116 genes), protein modification by small protein conjugation (n = 122 genes) and viral reproduction (n = 140 genes).

Discussion

In this study we conducted a meta-analysis of publicly available, genome-wide methylation datasets to examine the possible relationship between methylation profiles and sex. Following rigorous quality control, we analysed the data for differential methylation patterns in terms of global X chromosome methylation, global autosomal methylation, and differences between the methylation status at individual autosomal CpG probes according to sex.

X chromosome analysis

The X chromosome analysis demonstrated that, as expected, global X chromosome methylation is considerably higher in females compared to males. We also demonstrated that it is possible to accurately infer sex, based on PCA of X chromosome methylation data, and that this method is superior to using global X chromosome methylation. In light of the increasing evidence for the importance of correction for sex in analysis of disease-specific methylation patterns, this provides a valuable technique for studies where sex information is not easily accessible.

Autosomal methylation analysis

This analysis showed a small but significant increase in global autosomal methylation associated with male sex, concurring with previous studies [7]. Many mechanisms may account for this global difference, but have so far remained elusive. A recent study indicated that this global difference in autosomal methylation does not appear to be driven by sex hormones [22]. The findings of the analysis of methylation at individual CpG sites in this study have a number of features in common with the findings of Chen et al. [18]. After excluding probes which were found to be non-specific, Chen et al. [18] also found only small absolute differences between male and female samples in the CpG probes that they found to be significantly differentially methylated by sex (2-12%). Similar to this study, the majority (5/7) of the CpG sites that they found to be associated with sex were more methylated in females. Chen and colleagues reported associations between three (cg08124399 in DDX43, cg08532057 in NUPL1 and cg18485485 in DECR1) of our top 20 CpG sites and similar differences in beta values between sexes [18]. As predicted, the majority of the genes that Liu et al. [13] found to be differentially methylated according to sex were on the X chromosome. These investigators reported 11 genes with very small CpG differences in autosomes: LRRC2 TDGF1, RAB9P1, C6ORF68, TLE1, GLUD1, ALX4, DPPA3, NUPL1, FLJ20582 and FLJ43276. All of these genes were found to be significant in our analysis (Additional file 3), with most being among our top 20 loci. Apart from the NUPL1 CpG probe, however, all of these CpG sites were found to be non-specific by Chen et al. [18]. Autosomal sex-specific DNA methylation has been demonstrated in several candidate gene studies [5, 6, 14, 23, 24]. Although some of these genes (CDKN2A, MTHFR and MGMT) are also interrogated to some extent by the HumanMethylation27K array, we observed significant sex-specific DNA methylation only with MGMT (Additional file 3), with all MGMT probes being more methylated in females. These findings are in contrast to previous reports in which higher methylation was shown in males [5, 24]. Genes containing CpG islands with methylation differences of >5% between the sexes are listed in Table 3. Many of these top CpG associations are in genes that may play a role in sex-specific functions. The top-hit from our meta-analysis, SLC9A2, encodes Solute Carrier Family 9. Other Solute Carrier Family proteins have been shown to be associated with male infertility [25], suggesting a possible sex-specific role for this class of proteins. DDX43 has been shown to be differentially expressed in normal testis compared to testis of patients diagnosed with Sertolicell-only (SCO) syndrome, and may play a role in spermatogenesis [25]. NUPL1 has been shown to be highly expressed in the testes of fertile men compared to that of infertile men, suggesting its function is important for male fertility [25]. These genes may be associated with male infertility secondary to epigenetic deregulation. The results of the pathway analysis, on the other hand, suggest that the influence of sex on methylation is broad. We report enriched biological processes across a wide range of cellular functions, some of which, including DNA transcription and RNA splicing, were also reported by Liu et al. [13].
Table 3

CpG sites with a differences in methylation of >5% between males and females in meta-analysis of 4,172 samples

GENECHRTarget IDMeta-analysis PvalueMeta-analysis Δ betaNCBI description of protein functionn studies consistent effect
SLC9A2 2cg20050113<2.2e-16-0.09Involved in pH regulation to eliminate acids generated by active metabolism or to counter adverse environmental conditions.36
DDX43 6cg08124399<2.2e-16-0.07ATP-dependent RNA helicase in the DEAD-box family.39
SPESP1 15cg09886641<2.2e-16-0.06Human alloantigen involved in sperm-egg binding and fusion.37
FIGNL1 7cg05072008<2.2e-16-0.06May regulate osteoblast proliferation and differentiation.36
CRISP2 6cg04595372<2.2e-16-0.06Also known as Testis-Specific Protein TPX-1. May regulate some ion channels’ activity and thereby regulate calcium fluxes during sperm capacitation.36
NUPL1 13cg08532057<2.2e-16-0.06Component of the nuclear pore complex, a complex required for the trafficking across the nuclear membrane.30

Direction of effect (Δ beta) is for male relative to female methylation beta value, so that negative values indicate lower methylation in males. The number of the 39 studies in the meta-analysis which had individual associations consistent with the direction of effect in the meta analysis are also shown. All listed probes are located within a CpG island.

CpG sites with a differences in methylation of >5% between males and females in meta-analysis of 4,172 samples Direction of effect (Δ beta) is for male relative to female methylation beta value, so that negative values indicate lower methylation in males. The number of the 39 studies in the meta-analysis which had individual associations consistent with the direction of effect in the meta analysis are also shown. All listed probes are located within a CpG island. Since the completion of this study, an increasing amount of data from the HumanMethylation450K microarray are becoming publicly available, along with tools for their analyses [26]. The 450K array offers higher-resolution methylation data than the 27K, and it will be interesting see what this new set of data will reveal about sex-specific methylation patterns.

Conclusions

With the inclusion and careful analysis of all publicly available datasets, this study represents the most comprehensive analysis of sex-specific methylation patterns to date. This is likely the reason that, unlike previous studies, which reported few detectable DNA methylation differences in autosomal genes between sexes [13, 18], we identified a modest number of CpG sites associated with sex after Bonferroni correction. Similar to other studies using microarray platforms, however, the mean beta differences associated with sex are very small (approximately 5%) [13, 14, 18]. This is far below the smallest difference (approximately 17%) that microarray platforms reliably estimate [27]. The reason for this is not clear; it may be that sex-associated methylation differences are highly dynamic, leading to heterogeneous samples. Nonetheless, the results reported here reliably demonstrate the existence of sex-specific methylation profiles, which is important not only in a biological context, but in recognising and dealing with potential confounding when undertaking methylation-disease association studies.

Methods

Data collation and quality control

We extracted publicly available methylation datasets from the European Bioinformatics Institute (EBI) database (http://www.ebi.ac.uk/arrayexpress). The vast majority of studies in this repository were conducted using the Illumina Infinium HumanMethylation27K microarray to assess genome-wide DNA methylation, and we therefore chose to analyse studies using this platform. This microarray uses 50-mer oligonucleotide probes to target 27,578 CpG sites covering ~14,000 autosomal and sex chromosome genes. These CpGs map to the promoter regions of genes with an average coverage of two CpGs per gene and more extensive coverage (3–20 CpGs) for cancer-related and imprinted genes. Infinium technology has previously been described for SNP genotyping [28]. In order to detect methylation differences, DNA is treated with sodium bisulfite, which converts unmethylated cytosines to uracil, whereas methylated cytosine is protected and remains unchanged. Two probes are designed for each CpG site—one is specific for the methylated allele (cytosine) and the other for the unmethylated allele (thymidine). The DNA methylation level for a CpG site is determined by dividing the signal intensity for the methylated CpG by the sum of both the methylated and unmethylated CpGs, previously shown to be a reliable estimate of the level of methylation at a locus [20]. This is referred to as the ‘beta value’ and approximates to the percent methylation divided by 100. Previous studies have shown that probes with beta values <0.3 represent unmethylated areas of the genome [20] and we therefore defined unmethylated probes on this basis. In cell lines, probes with beta >0.7 represent genomic loci that are fully methylated; however, as tissue samples frequently comprise a mixture of cell types and therefore a mixture of methylated and unmethylated probes, we set a beta value threshold of >0.3 to define methylated probes [20, 21]. In this study, in addition to analysing beta values at individual CpG sites, we also investigated global methylation values. A global methylation value was calculated for each individual in the study as the sum of the beta values at each CpG site, either across the X chromosome (global X chromosome methylation), or across all autosomes (global autosomal methylation). Experiments from the EMBL-EBI database (http://www.ebi.ac.uk/arrayexpress) using the Illumina HumanMethylation27K BeadChip (HumanMethylation27_270596_v.1.2) up to March 2013 were incorporated into the study. As the majority of experiments made processed but not raw methylation data available in the EMBL-EBI database, we included only processed (normalised) datasets. Initially, 92 studies were retrieved from the EBI database. Samples not of human origin, and all samples derived from cell lines were excluded from further analysis. Duplicate entries were removed. Datasets that appeared to have been incorrectly processed (for example, those containing beta values <0 or >1) were excluded. Datasets which reported non- standard CpG site identifiers were also excluded. Following dataset quality control, phenotype and methylation files were merged. There were 7,614 samples from 84 studies for whom matching methylation and phenotype data were available (Figure 3A). A list of the EBI Accession numbers for these studies, and the number of samples included from each study, is provided in Additional file 5. Within each study, probes with reported detection P values (generated by Illumina GenomeStudio Software as an objective measure of overall probe performance) >0.05 were removed from the analysis. In the merged dataset of 7,614 samples, all CpG sites and samples with >10% missing methylation data were excluded. Consequently, the final dataset comprised 27,231 CpG sites from 7,381 samples (Figure 3A and B). Following the exclusion of CpG sites and samples with >10% missing methylation data, 0.13% of the remaining 27,231 CpG sites from 7,381 samples had missing beta values. The median number of missing beta values per CpG site was 3 (IQR 1-5, range 0-191). The median number of missing beta values per sample was 8 (IQR 3-15, range 0-298). Missingness was at random (the probability of missingness was not significantly related to any observed variables – sex, study, CpG site nor individual). Given that missingness rates were low, and randomly distributed, we did not anticipate missingness to be a problem in this analysis. Missing beta values were removed from the analysis of individual CpG probes.

Sex assignation using X chromosome data

The HumanMethylation27K Chip contains 1,085 CpG probes on the X chromosome. Of these, 86 were removed during quality control (Figure 3(B)), leaving 999 X chromosome CpG sites for analysis. A principal components analysis (PCA) was performed in R, using the prcomp function in the R package ‘stats’ [29]. One study (accession ID: E-GEOD-23311, ‘DNA Methylation in Human Chorionic Villus and Maternal Blood Cells’) was excluded due to being a major outlier on the PCA (Additional file 1: Figure S1).

Meta-analysis of individual CpG probes

A meta-analysis was performed to investigate associations between individual CpG probes and sex in a subset of studies that included at least 20 individuals, and both sexes in a ratio of at least 1:4 (Additional file 4). The meta-analysis was carried out using the R package ‘meta’ [29]. Inverse variance weighting was used for pooling. The DerSimonian-Laird estimate for the between-study variance was used in a random effects model [30]. To ensure that the meta-analysis results were not confounded by cancer samples, a second meta-analysis was performed using the same protocol. All 1,240 cancer samples from the original 4,172 were excluded from this analysis, leaving 31 studies (n = 2,900) that had > 20 individuals and were composed of both sexes.

Pathway analysis of the associations between individual CpG probes and sex

We conducted a pathway analysis using –log10 P values of all CpG probes from the meta-analysis, annotated by gene name. The NetGestalt web application was used to integrate these continuous data over GO biological processes using random walk distance-based hierarchical clustering for module identification [31]. Enriched modules were identified using the Kolmogorov-Smirnov test [32, 33]. The Benjamini–Hochberg method was used for controlling the false discovery rate (FDR) [34]. All annotated genes on the Infinium HumanMethylation27 arrays were used as the background list against which enriched-GO terms in target lists of genes were compared. Additional file 1: Figure S1: The 1st two PCs from the 999 X chromosome CpG sites in the initial 82 studies (n = 7,381). Figure S2. Distribution of PC1 and global methylation values across all 999 X chromosome sites in all samples (n = 7,333) and samples with sex recorded only (n = 5,147) coloured by sex. Figure S3. Distribution of global methylation values across all 999 X chromosome sites (A) and PC1 from X chromosome CpG sites (B) by sex after exclusion of spurious ‘male’ samples (n = 6,812), PC plot of X chromosome showing further outliers (C) and density plot of final dataset (n = 6,795) (D). Figure S4. Boxplot of female and male global autosomal methylation (n = 6,795). Figure S5. Density of global autosomal CpG methylation in the final cohort (n = 6,795), coloured by study. Figure S6. Boxplots of global autosomal methylation by sex in each study in the meta-analysis (n = Figure S7. Funnel plot for meta-analysis of global autosomal methylation by sex (n = 4,172, 39 studies). Figure S8. Meta-analysis P values for all samples (n = 4,172, 39 studies) plotted against those for the meta-analysis excluding cancer samples (n = 2,900, 31 studies). Figure S9. Venn diagram of the top four GO biological processes which were enriched for genes differentially methylated by sex in the pathway analysis. (DOCX 2 MB) Additional file 2: Table S1: Comparison of the sex assigned by recorded phenotype, PC1 and global X chromosome methylation in the final dataset (n = 6,795). Table S2. Percentage of the 26,225 autosomal CpGs in each methylation category according to sex. Table S3. Summary of the global methylation across the 24,225 autosomal CpG sites (n = 6,795 samples). Table S4 Average absolute difference in beta value by sex in CpG methylation across all 184 non cross-reactive, non-polymorphic CpG sites which passed Bonferroni Correction. (DOCX 20 KB) Additional file 3: All studies from the original cohort (76 studies, n = 6,795) which included ≥20 individuals and were comprised of both sexes in a ratio of at least 1:4 were included in the meta analysis. (XLSX 18 KB) Additional file 4: The 235 autosomal CpG sites associated with sex in the meta-analysis (n = 4,172, 39 studies), after Bonferroni Correction for multiple testing ( <1.918e-06). All three of the probes with beta differences of >10% in the meta-analysis [cg15915418 in TLE1 (Δ beta = 31%); cg27063525 in C6orf69 (Δ beta = 17%) and cg11673803 in GLUD1 (Δ beta = 15%)] were non-specific, and feature highly in the top non-specific probes reported by Chen et al. All three of these 50-mer probes have 100% match identity to non-target sequences on the X chromosome. Of the 235 CpG sites which were significant after Bonferroni correction, 31 of the top 100 CpG probes by beta difference were cross reactive. (XLSX 43 KB) Additional file 5: A list of the EBI Accession numbers for the initial dataset of 7,614 samples from 84 studies, and the number of samples included from each study. (XLSX 11 KB)
  30 in total

1.  The role of DNA methylation in mammalian epigenetics.

Authors:  P A Jones; D Takai
Journal:  Science       Date:  2001-08-10       Impact factor: 47.728

2.  Age related changes in 5-methylcytosine content in human peripheral leukocytes and placentas: an HPLC-based study.

Authors:  C Fuke; M Shimabukuro; A Petronis; J Sugimoto; T Oda; K Miura; T Miyazaki; C Ogura; Y Okazaki; Y Jinno
Journal:  Ann Hum Genet       Date:  2004-05       Impact factor: 1.670

3.  DNA methylation inhibits transcription indirectly via a methyl-CpG binding protein.

Authors:  J Boyes; A Bird
Journal:  Cell       Date:  1991-03-22       Impact factor: 41.582

Review 4.  DNA methylation and human disease.

Authors:  Keith D Robertson
Journal:  Nat Rev Genet       Date:  2005-08       Impact factor: 53.242

5.  Sex differential in methylation patterns of selected genes in Singapore Chinese.

Authors:  Barbara Sarter; Tiffany I Long; Wan H Tsong; Woon-Puay Koh; Mimi C Yu; Peter W Laird
Journal:  Hum Genet       Date:  2005-06-01       Impact factor: 4.132

6.  Whole-genome genotyping with the single-base extension assay.

Authors:  Frank J Steemers; Weihua Chang; Grace Lee; David L Barker; Richard Shen; Kevin L Gunderson
Journal:  Nat Methods       Date:  2006-01       Impact factor: 28.547

7.  Meta-analysis in clinical trials.

Authors:  R DerSimonian; N Laird
Journal:  Control Clin Trials       Date:  1986-09

8.  Aberrant methylation of p16INK4a in anatomic and gender-specific subtypes of sporadic colorectal cancer.

Authors:  J K Wiencke; S Zheng; A Lafuente; M J Lafuente; C Grudzen; M R Wrensch; R Miike; A Ballesta; M Trias
Journal:  Cancer Epidemiol Biomarkers Prev       Date:  1999-06       Impact factor: 4.254

9.  High-throughput DNA methylation profiling using universal bead arrays.

Authors:  Marina Bibikova; Zhenwu Lin; Lixin Zhou; Eugene Chudin; Eliza Wickham Garcia; Bonnie Wu; Dennis Doucet; Neal J Thomas; Yunhua Wang; Ekkehard Vollmer; Torsten Goldmann; Carola Seifart; Wei Jiang; David L Barker; Mark S Chee; Joanna Floros; Jian-Bing Fan
Journal:  Genome Res       Date:  2006-01-31       Impact factor: 9.043

10.  Tissue of origin determines cancer-associated CpG island promoter hypermethylation patterns.

Authors:  Duncan Sproul; Robert R Kitchen; Colm E Nestor; J Michael Dixon; Andrew H Sims; David J Harrison; Bernard H Ramsahoye; Richard R Meehan
Journal:  Genome Biol       Date:  2012-10-03       Impact factor: 13.583

View more
  41 in total

1.  Integrative analysis of methylome and transcriptome in human blood identifies extensive sex- and immune cell-specific differentially methylated regions.

Authors:  Shimrat Mamrut; Nili Avidan; Elsebeth Staun-Ram; Elizabeta Ginzburg; Frederique Truffault; Sonia Berrih-Aknin; Ariel Miller
Journal:  Epigenetics       Date:  2015       Impact factor: 4.528

2.  Natural variation in maternal care and cross-tissue patterns of oxytocin receptor gene methylation in rats.

Authors:  Annaliese K Beery; Lisa M McEwen; Julia L MacIsaac; Darlene D Francis; Michael S Kobor
Journal:  Horm Behav       Date:  2015-06-27       Impact factor: 3.587

3.  Associations between placental CpG methylation of metastable epialleles and childhood body mass index across ages one, two and ten in the Extremely Low Gestational Age Newborns (ELGAN) cohort.

Authors:  Jeliyah Clark; Elizabeth Martin; Catherine M Bulka; Lisa Smeester; Hudson P Santos; T Michael O'Shea; Rebecca C Fry
Journal:  Epigenetics       Date:  2019-07-02       Impact factor: 4.528

Review 4.  Genetic and epigenetic sex-specific adaptations to endurance exercise.

Authors:  Shanie Landen; Sarah Voisin; Jeffrey M Craig; Sean L McGee; Séverine Lamon; Nir Eynon
Journal:  Epigenetics       Date:  2019-04-13       Impact factor: 4.528

5.  Sexual epigenetic dimorphism in the human placenta: implications for susceptibility during the prenatal period.

Authors:  Elizabeth Martin; Lisa Smeester; Paige A Bommarito; Matthew R Grace; Kim Boggess; Karl Kuban; Margaret R Karagas; Carmen J Marsit; T Michael O'Shea; Rebecca C Fry
Journal:  Epigenomics       Date:  2017-02-17       Impact factor: 4.778

6.  Genome-wide interaction studies identify sex-specific risk alleles for nonsyndromic orofacial clefts.

Authors:  Jenna C Carlson; Nichole L Nidey; Azeez Butali; Carmen J Buxo; Kaare Christensen; Frederic W-D Deleyiannis; Jacqueline T Hecht; L Leigh Field; Lina M Moreno-Uribe; Ieda M Orioli; Fernando A Poletta; Carmencita Padilla; Alexandre R Vieira; Seth M Weinberg; George L Wehby; Eleanor Feingold; Jeffrey C Murray; Mary L Marazita; Elizabeth J Leslie
Journal:  Genet Epidemiol       Date:  2018-09-11       Impact factor: 2.135

Review 7.  The genetics of pubertal timing in the general population: recent advances and evidence for sex-specificity.

Authors:  Diana L Cousminer; Elisabeth Widén; Mark R Palmert
Journal:  Curr Opin Endocrinol Diabetes Obes       Date:  2016-02       Impact factor: 3.243

Review 8.  Clinical value of DNA methylation markers in autoimmune rheumatic diseases.

Authors:  Esteban Ballestar; Amr H Sawalha; Qianjin Lu
Journal:  Nat Rev Rheumatol       Date:  2020-08-05       Impact factor: 20.543

9.  Estimation of blood cellular heterogeneity in newborns and children for epigenome-wide association studies.

Authors:  Paul Yousefi; Karen Huen; Hong Quach; Girish Motwani; Alan Hubbard; Brenda Eskenazi; Nina Holland
Journal:  Environ Mol Mutagen       Date:  2015-08-31       Impact factor: 3.216

10.  Sex-differential DNA methylation and associated regulation networks in human brain implicated in the sex-biased risks of psychiatric disorders.

Authors:  Yan Xia; Rujia Dai; Kangli Wang; Chuan Jiao; Chunling Zhang; Yuchen Xu; Honglei Li; Xi Jing; Yu Chen; Yi Jiang; Richard F Kopp; Gina Giase; Chao Chen; Chunyu Liu
Journal:  Mol Psychiatry       Date:  2019-04-11       Impact factor: 15.992

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.