
Differential DNA methylation in familial hypercholesterolemia.

Laurens F Reeskamp1, Andrea Venema2, Joao P Belo Pereira3, Evgeni Levin3, Max Nieuwdorp4, Albert K Groen5, Joep C Defesche2, Aldo Grefhorst5, Peter Henneman2, G Kees Hovingh6.   

Abstract

BACKGROUND: Familial hypercholesterolemia (FH) is a monogenic disorder characterized by elevated low-density lipoprotein cholesterol (LDL-C). An FH-causing genetic variant in LDLR, APOB, or PCSK9 is not identified in 12-60% of clinical FH patients (FH mutation-negative patients). We aimed to assess whether altered DNA methylation might be associated with FH in this latter group.
METHODS: In this study we included 78 FH mutation-negative patients and 58 FH mutation-positive patients with a pathogenic LDLR variant. All patients were male, not using lipid lowering therapies and had LDL-C levels >6 mmol/L and triglyceride levels <3.5 mmol/L. DNA methylation was measured with the Infinium Methylation EPIC 850 K beadchip assay. Multiple linear regression analyses were used to explore DNA methylation differences between the two groups in genes related to lipid metabolism. A gradient boosting machine learning model was applied to investigate accumulated genome-wide differences between the two groups.
FINDINGS: Candidate gene analysis revealed one significantly hypomethylated CpG site in CPT1A (cg00574958) in FH mutation-negative patients, while no differences in methylation in other lipid genes were observed. The machine learning model did distinguish the two groups with a mean Area Under the Curve (AUC)±SD of 0.80±0.17 and provided two CpG sites (cg26426080 and cg11478607) in genes with a possible link to lipid metabolism (PRDM16 and GSTT1).
INTERPRETATION: FH mutation-negative patients are characterized by accumulated genome-wide DNA methylation differences, but not by major DNA methylation alterations in known lipid genes compared to FH mutation-positive patients.
FUNDING: ZonMW grant (VIDI no. 016.156.445).
Copyright © 2020 The Authors. Published by Elsevier B.V. All rights reserved.

Keywords:  DNA methylation; Epigenetics; Familial hypercholesterolemia; LDLR

Year:  2020        PMID: 33096472      PMCID: PMC7581877          DOI: 10.1016/j.ebiom.2020.103079

Source DB:  PubMed          Journal:  EBioMedicine        ISSN: 2352-3964            Impact factor:   8.143


Introduction

Familial hypercholesterolemia (FH) is a common inherited autosomal dominant disease characterized by high plasma levels of low-density lipoprotein cholesterol (LDL-C) and high risk for premature cardiovascular disease (CVD). Pathogenic variants in the genes coding the low-density lipoprotein receptor (LDLR), apolipoprotein B (APOB), and proprotein convertase subtilisin/kexin type 9 (PCSK9) have been shown to cause FH. However, no pathogenic variant in any of these three FH genes is identified in a large proportion of patients who are diagnosed with FH based on clinical signs and symptoms [1], fuelling an ongoing search for novel pathogenic pathways causing FH.

Evidence before this study

A causal pathogenic variant in one of the Familial Hypercholesterolemia (FH) genes (i.e. LDLR, APOB, PCSK9) is not found in a large proportion of patients with clinical FH. We hypothesized that differential DNA methylation, a form of epigenetic regulation, contributes to the FH phenotype in these FH mutation-negative patients. We performed a PubMed search with the following terms: “Familial Hypercholesterolemia” AND “DNA methylation” and found 11 studies. None of the studies investigated the DNA methylation pattern in FH mutation-negative patients. Next, we searched PubMed with the terms ("dna methylation" OR "methylation" OR "cpg islands" OR "ewas" OR "CpG Islands"[MeSH Terms] OR "DNA Methylation"[MAJR]) AND ("ldl" OR "low-density lipoprotein") Filters: Humans. This yielded 370 articles, and in 5 of these, epigenome wide association studies showed an association between DNA methylation in multiple genes and LDL cholesterol levels. None of the studies investigated DNA methylation patterns in FH mutation-negative patients.

Added value of this study

This study was the first large-scale study in FH mutation-negative patients. In order to control for confounding due to high lipid levels, we studied two unique FH patient groups: FH mutation-negative patients and FH mutation-positive patients. Although classical candidate gene analysis did not reveal major DNA methylation differences in known lipid genes, except for CPT1A, a machine learning approach showed that FH mutation-negative patients are characterized by a different genome-wide DNA methylation pattern compared to FH mutation-positive patients, with important model features for the genes PRDM16 and GSTT1.

Implications of all the available evidence

Despite extensive sequencing efforts, a causative genetic variant is not found in a large proportion of patients with a clinical FH diagnosis. Hence efforts to find novel factors causing the FH phenotype are deemed of great relevance. Additional studies to further investigate DNA methylation and its causal role in (familial) hypercholesterolemia are warranted and might benefit from focusing on accumulation of genome-wide methylation differences instead of single gene or CpG site methylation.

Differential epigenetic regulation of the genes involved in lipid metabolism may be such a factor causing FH. DNA methylation, in which a methyl group is covalently bound to the fifth carbon atom of the nucleotide cytosine when it is followed by guanine (CpG site), is the most studied form of epigenetic gene expression regulation [2]. In general, methylation of CpG sites in promoter regions of genes results in low expression of the gene, while methylation of CpG sites within the gene typically results in high expression of the gene [2]. The role of DNA methylation in lipid metabolism is relatively underinvestigated, but some studies have shown that DNA methylation of multiple genes is associated with plasma LDL-C as well as other lipid levels [3], [4], [5], [6]. The expression of known genes involved in LDL-C metabolism (e.g. APOE, NPC1L1) has been found to be regulated by CpG methylation [7,8]. Moreover, DNA methylation of multiple genes (i.e. ABCA1, ABCG1, LIPC, PLTP, CETP, and LPL) was associated with lipid traits [9], [10], [11] and coronary artery disease outcomes (i.e. ABCA1) [9] in patients with molecularly proven FH. However, the impact of methylation of lipid genes has not been investigated in FH patients in whom no variant in the coding region of the three major FH genes is found. In the current study we analysed the methylation pattern in patients with and without FH-causing variants.
A potential confounding factor is the effect of elevated lipid levels on DNA methylation itself. To overcome this issue, we compared DNA methylation in FH patients without an FH-causing variant (FH mutation-negative) to a group of FH patients with a known pathogenic variant in LDLR (FH mutation-positive). We not only investigated methylation differences in single genes using classical regression analysis, but also used an unbiased machine learning approach to identify whole-genome differences in DNA methylation between the two groups.

Materials and methods

Study population

In this study we investigated DNA methylation differences between FH mutation-negative patients and FH mutation-positive patients. The Amsterdam UMC, location Academic Medical Center (AMC) in Amsterdam, is the national referral center for the genetic analysis of all Dutch patients with various forms of dyslipidaemias. For this study we analysed the DNA derived from index patients for whom the referring physician, after clinical evaluation (laboratory results, family history, and physical examination) based on national guidelines [12,13], requested molecular testing for FH-causing genetic variants between 2012 and 2017. In the samples collected before 2016 (n = 122), molecular analysis was performed by Sanger sequencing of LDLR, APOB, and PCSK9 and was followed by multiplex ligation-dependent probe amplification (MLPA) of LDLR when no pathogenic variants in these three genes were found. In samples collected from 2016 onwards (n = 14), a targeted next-generation sequencing (NGS) capture covering 27 lipid genes (including LDLR, APOB, and PCSK9) was used (Supplementary Table 4). Subsequent genetic cascade screening within families of index patients is done in a separate diagnostic program. These patients were not included in the current study. The DNA of male patients was used for the current study when the patients did not carry any known FH-causing variant in LDLR, APOB, and PCSK9 (FH mutation-negative) or had an FH-causing variant in LDLR (FH mutation-positive). We selected patients who had plasma LDL-C levels above 6 mmol/L, which corresponds to the >99th percentile for males of all ages in The Netherlands [14]. Moreover, patients whose triglyceride levels were above 3.5 mmol/L and those who were using lipid lowering therapies (i.e. statins) at the time of DNA sampling were excluded. Females were excluded from this study because of the influence of sex differences on DNA methylation [15].
The study size was based on the availability of DNA samples of patients meeting these criteria. All included subjects gave written informed consent for re-use of their DNA samples for research into novel causes of hypercholesterolemia. The Medical Ethics Review Committee of the Amsterdam UMC, location AMC, provided a waiver for the re-use of the patients' clinical data and DNA samples in the current study (reference ID: W20_246 # 20.281).

DNA methylation measurements

The Gentra Puregene kit was used to isolate DNA from whole blood collected in EDTA containing tubes according to standard protocols. Samples were stored at 4 °C until analysis. DNA concentrations were measured using Qubit standard methodology. DNA was treated with bisulfite using the EZ DNA Methylation kit of ZYMO® according to the standard protocol recommended by Illumina. DNA methylation of the bisulfite treated DNA was analysed with the Illumina Infinium Methylation EPIC 850 K beadchip (Illumina, California, USA) at GenomeScan (Leiden, The Netherlands). Samples of FH mutation-negative and FH mutation-positive patients were randomly assigned to different slides to avoid potential confounding batch effects.

Statistical analysis

We analysed the methylation data in a two-step approach. First, linear regression models for each CpG site were constructed to test for major differences in DNA methylation between FH mutation-positive and FH mutation-negative patients. Next, a gradient boosting machine learning technique was used to investigate, in an unbiased manner, subtle genome-wide DNA methylation differences between the two FH groups.

Quality control and normalization of methylation data

Quality control of the obtained data was performed using the R package MethylAid (version 1.30.0) with default settings [16]. Concordance between sex chromosome probes and self-reported sex was evaluated using principal component analysis (PCA). Next, the data were normalized using the Funnorm function from the minfi R package (version 1.30.0) [17]. Probes susceptible to cross-hybridization (12), probes previously described to include single nucleotide polymorphisms (SNPs) with a minor allele frequency (MAF) >0.01 in either the CpG dinucleotide itself or at the position of the single base extension, and probes which included SNPs in the probe binding position were excluded (according to the Illumina manifest).
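The probe exclusion rules described above can be illustrated with a minimal Python sketch. This is not the pipeline used in the study (which relied on R packages and the Illumina manifest); the `Probe` record and its field names are hypothetical and only mirror the three exclusion criteria named in the text.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

MAF_THRESHOLD = 0.01  # threshold stated in the text (MAF > 0.01)


@dataclass
class Probe:
    """Hypothetical probe annotation record (field names are assumptions)."""
    probe_id: str
    cross_reactive: bool                     # flagged as susceptible to cross-hybridization
    snp_maf_at_cpg_or_sbe: Optional[float]   # MAF of a SNP at the CpG or single base extension, None if absent
    snp_in_probe_body: bool                  # SNP located in the probe binding position


def keep_probe(probe: Probe) -> bool:
    """Return True only if the probe passes all three exclusion criteria."""
    if probe.cross_reactive:
        return False
    if probe.snp_maf_at_cpg_or_sbe is not None and probe.snp_maf_at_cpg_or_sbe > MAF_THRESHOLD:
        return False
    if probe.snp_in_probe_body:
        return False
    return True


def filter_probes(probes: Iterable[Probe]) -> List[Probe]:
    """Keep the probes that survive filtering, preserving input order."""
    return [p for p in probes if keep_probe(p)]


# Made-up example annotations, one per exclusion scenario.
example = [
    Probe("probe_a", False, None, False),   # kept
    Probe("probe_b", True, None, False),    # cross-reactive
    Probe("probe_c", False, 0.05, False),   # common SNP at CpG/SBE
    Probe("probe_d", False, 0.005, False),  # SNP below the MAF threshold, kept
    Probe("probe_e", False, None, True),    # SNP in probe binding position
]
kept_ids = [p.probe_id for p in filter_probes(example)]
```

In this sketch a SNP at the CpG or single base extension only leads to exclusion when its MAF exceeds the threshold, whereas any SNP in the probe binding position leads to exclusion, which is one possible reading of the criteria above.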

Candidate gene analysis

In the candidate gene analysis, CpG methylation was the dependent variable, with group (FH mutation-negative or FH mutation-positive), age and leukocyte cell distributions incorporated as independent variables in the model. Leukocyte cell distributions were estimated using the obtained data according to the method of Houseman et al., resulting in information on relative cell counts of CD8+ and CD4+ T cells, natural killer cells, B cells, monocytes and granulocytes [18]. Quality of the epigenetic profiles was further evaluated using density plots of raw and normalized data and PCA. Correlations of the principal components one to eight with all available variables were evaluated upon entering our statistical model. For differentially methylated positions (DMPs), we applied the lmFit function in the R package limma (version 3.40.2). Cell distribution was determined with the R package FlowSorted.Blood.EPIC. To control for multiple testing the false discovery rate (FDR) method was used, where an FDR <0.05 was defined to be significant. We corrected for inflation using the BACON package for R (version 1.12.0) [19]. We generated four groups of genes according to the grade of impact on lipid metabolism (Supplementary Table 1). Tier 1 and 2 comprised the major (LDLR, APOB, PCSK9) and minor (LDLRAP, STAP1, ABCG5, ABCG8, APOE, LIPA) FH genes, respectively. Tier 3 comprised all genes that were shown to be significantly associated with plasma LDL-C or total cholesterol levels in a large genome-wide association study [20]. Tier 4 included eighteen cytosine-guanine dinucleotide positions that have been shown to be associated with LDL-C and total cholesterol levels in previous studies [3,5,6,21]. All CpG probes within 3000 base pairs surrounding the candidate genes on either side were analysed in order to cover CpG sites in the 5′ promoter region and possible downstream regulatory regions that were not annotated to a gene by the Illumina manifest.
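The multiple testing step can be sketched as follows. The text states only that "the FDR method" was used; the Benjamini-Hochberg procedure shown here is an assumption about which FDR procedure was meant, and the function names and example p-values are illustrative only.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is the minimum of p * m / rank over all equal or higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_sites(site_ids: Sequence[str], p_values: Sequence[float], fdr: float = 0.05) -> List[str]:
    """Return the sites whose adjusted p-value is below the FDR threshold."""
    adjusted = benjamini_hochberg(p_values)
    return [site for site, q in zip(site_ids, adjusted) if q < fdr]


# Made-up p-values for four hypothetical CpG sites.
example_sites = ["site_a", "site_b", "site_c", "site_d"]
example_p = [0.0001, 0.04, 0.03, 0.5]
example_q = benjamini_hochberg(example_p)
example_hits = significant_sites(example_sites, example_p)
```

Note that two sites with nominal p-values below 0.05 (`site_b` and `site_c`) no longer pass after adjustment in this toy example, which is the behaviour the FDR threshold of 0.05 is meant to enforce.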

Machine learning analysis

Statistical machine learning analysis was used to identify differentially methylated CpG sites that could discriminate between FH mutation-negative and FH mutation-positive subjects on an unbiased genome-wide level. In brief, we used a combination of multiple gradient boosting classifiers to improve prediction accuracy [22,23]. To avoid over-fitting, we used a 5-fold stratified cross-validation over the training partition of the data (80%) while the remaining data (20%) was used as the test dataset [24]. The latter set was not used for the construction of the machine learning models. We conducted a rigorous stability selection procedure to ensure the reliability and robustness of the biomarker signatures [25]. This was repeated 50 times and Receiver Operating Characteristics (ROC) Area Under Curve (AUC) scores were computed each time and averaged for the final test ROC AUC. A permutation (randomization) test was used to evaluate statistical validity of the results [26]. In the permutation test, the outcome variable (i.e., the FH group, either FH mutation-negative or FH mutation-positive) was randomly reshuffled 1000 times while the corresponding epigenetic profiles were kept intact. By evaluating the distribution of all the results obtained in these simulations and comparing it to the result obtained with the true outcome variable, we computed the statistical significance associated with the joint panel of the selected CpG sites. To gain insight into the features that contributed the most to the model, we also report relative feature importance scores for each of the CpG sites, which reflect their contribution to predicting the outcome variable in the gradient boosting model. To gain insight into the biological relationship between the top features of this model and lipid metabolism, we searched for publications listed in PubMed that described a relationship between the genes identified in the top 20 contributing CpG sites and hypercholesterolemia.
We used Python version 3.8 (www.python.org), with the packages NumPy, SciPy and scikit-learn for implementing the model and R version 3.5.3 (R Foundation, Vienna, Austria) for visualizations.
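The logic of the permutation test can be sketched in plain Python. This is a simplified illustration and not the study's implementation: it shuffles the group labels against a fixed vector of model scores, whereas the study presumably re-evaluated the full model for each permutation. The function names and example values are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_p_value(
    scores: Sequence[float],
    labels: Sequence[int],
    n_permutations: int = 1000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return the observed AUC and a one-sided permutation p-value."""
    rng = random.Random(seed)
    observed = roc_auc(scores, labels)
    shuffled: List[int] = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # reshuffle the outcome, keep the scores intact
        if roc_auc(scores, shuffled) >= observed:
            at_least_as_good += 1
    # The +1 terms keep the estimate strictly positive.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Made-up scores for eight hypothetical subjects (1 = one FH group, 0 = the other).
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 1, 0, 0, 0]
observed_auc, p_value = permutation_p_value(example_scores, example_labels)
```

The p-value is the share of label permutations that reach an AUC at least as high as the one observed with the true labels, so a small value indicates that the discrimination is unlikely to arise from chance label assignments.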

Correlation methylation and gene expression

Significantly differentially methylated CpG sites identified in the candidate gene analysis and the top 20 CpG sites that contributed the most to the machine learning model were submitted for in silico validation by exploring their correlation with gene expression data in two publicly accessible liver hepatocellular carcinoma datasets, accessible via the webtools SMART [27] and MEXPRESS [28]. These datasets were based on the smaller 450 K Illumina Infinium Beadchip assay, implying that only EPIC/450 K overlapping CpG sites were investigated. Spearman's and Pearson's correlations were retrieved from both databases. Correlations between DNA methylation and gene expression with a P-value < 0.05 and a correlation coefficient (R) > 0.1 were considered suggestive of biological relevance.
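For readers unfamiliar with the two correlation measures, the sketch below computes Pearson's and Spearman's coefficients in plain Python and applies the thresholds stated above. In the study these values were retrieved from SMART and MEXPRESS rather than computed locally. Applying the threshold to the absolute value of R is an assumption, as is the hypothetical helper `passes_thresholds`, which takes the P-value as an externally supplied input.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def passes_thresholds(r: float, p_value: float) -> bool:
    """Thresholds from the text; using abs(r) is an assumption."""
    return p_value < 0.05 and abs(r) > 0.1


# Made-up methylation (beta) and expression values for five samples.
methylation = [0.10, 0.20, 0.30, 0.40, 0.50]
expression = [9.0, 7.5, 8.0, 5.0, 4.0]
rho = spearman(methylation, expression)
r = pearson(methylation, expression)
```

A negative coefficient, as in this toy example, corresponds to higher methylation going together with lower expression.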

Role of funders

The funder (ZonMW) was not involved in the design, data collection, analysis, interpretation or any other aspect of this study.

Results

Subjects for this study were diagnosed with clinical FH by the physician who requested genetic analysis for FH in our center. The analysed cohort comprised 78 FH mutation-negative and 58 FH mutation-positive patients. Characteristics of the cohort are shown in Table 1. FH mutation-negative patients were older (50.7 ± 12.3 vs. 39.1 ± 12.0 years old, p < 0.05 [Student's t-test]), had slightly lower LDL-C levels (median [IQR] 6.7 [6.4–7.2] mmol/L vs. 7.4 [6.7–8.4] mmol/L, p < 0.05 [Mann-Whitney U test]) and had higher TG levels (1.8 [1.3–2.3] mmol/L vs. 1.3 [1.1–2.0] mmol/L, p = 0.011 [Mann-Whitney U test]) compared to the FH patients with an LDLR mutation.
Table 1

Characteristics of study population.

| | FH mutation-positive | FH mutation-negative | P-value* |
|---|---|---|---|
| N | 58 | 78 | |
| Age in years (mean (SD)) | 38.1 (12.0) | 50.7 (12.3) | <0.001 |
| Males (n (%)) | 58 (100) | 78 (100) | |
| Total cholesterol, mmol/L (mean (SD)) | 9.6 (1.3) | 9.0 (1.4) | 0.022 |
| LDL cholesterol, mmol/L (median [IQR]) | 7.4 [6.7–8.4] | 6.7 [6.4–7.2] | 0.001 |
| HDL cholesterol, mmol/L (mean (SD)) | 1.3 (0.8) | 1.3 (0.4) | 0.668 |
| Triglycerides, mmol/L (median [IQR]) | 1.3 [1.1–2.0] | 1.8 [1.3–2.3] | 0.011 |

SD, Standard Deviation; IQR, interquartile range; LDL, low-density lipoproteins; HDL, high-density lipoproteins.

* Normally distributed values (age, total cholesterol, HDL cholesterol) were compared using Student's t-test; non-normally distributed values (LDL cholesterol and triglycerides) were compared using a Mann-Whitney U test.


Quality control of data

No major inflation was observed after BACON inflation correction of the data (lambda = 0.9546; Supplementary Figure 1).
To investigate the association between CpG sites related to genes involved in lipid metabolism and the FH group, we performed a candidate gene analysis according to the four predefined tiers of genes (Supplementary Table 1). Tier 1 consisted of the three major FH genes: LDLR, APOB, and PCSK9. None of the studied CpG sites in these genes were significantly differently methylated in FH mutation-negative patients compared to FH mutation-positive patients (see Fig. 1, panel A). Also, in tier 2, consisting of so-called "minor" FH genes, no differences between the two groups were observed (Fig. 1, panel B). Next, we investigated methylation differences in genes that were previously shown to be associated with LDL-C and total cholesterol in a large GWAS [20]. Again, no significantly differently methylated CpG sites between FH mutation-negative and FH mutation-positive patients were found (Fig. 1, panel C). Lastly, in tier 4, consisting of CpG sites previously associated with LDL-C or total cholesterol, one CpG site (cg00574958 in the gene CPT1A) showed a significant 1.3% lower methylation in FH mutation-negative patients compared to FH mutation-positive patients (β −0.013, FDR = 0.001; see Fig. 1, panel D). Methylation of the identified CPT1A CpG site is associated with decreased expression of the CPT1A gene according to MEXPRESS (Supplementary Table 3) and was negatively associated with triglyceride levels in our study (r = −0.27, p = 0.001 [Spearman Rank Correlation Test]).
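The relation between the reported effect size (β −0.013) and the "1.3% lower methylation" can be made explicit with a short sketch. It assumes that the regression was run on methylation beta values (the 0 to 1 scale), which the text does not state explicitly. The beta-value formula with an offset of 100 is the commonly used Illumina convention; the function names are illustrative.

```python
from math import log2


def beta_value(methylated: float, unmethylated: float, offset: float = 100.0) -> float:
    """Methylation beta value from methylated and unmethylated signal intensities."""
    return methylated / (methylated + unmethylated + offset)


def m_value(beta: float) -> float:
    """Logit-like M-value that is often used instead of beta for statistical testing."""
    return log2(beta / (1.0 - beta))


def difference_in_percentage_points(beta_difference: float) -> float:
    """Express a difference on the beta scale as methylation percentage points."""
    return beta_difference * 100.0


# Effect size reported for cg00574958 (FH mutation-negative vs. FH mutation-positive).
cpt1a_difference = difference_in_percentage_points(-0.013)
```

Under this assumption, a regression coefficient of −0.013 corresponds to an absolute difference of about 1.3 percentage points in methylation between the groups.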
Fig. 1

Candidate gene analysis

Four tiers of genes were constructed based on literature (genes are listed in Supplementary Table 1). Shown is the difference in methylation (effect size) between FH mutation-negative and FH mutation-positive patients for the four tiers (panels A-D). Only in tier 4 (panel D) was one CpG site (CPT1A-cg00574958) significantly less methylated in FH mutation-negative patients. Significance was defined as a false discovery rate (FDR) of <0.05. FH, Familial Hypercholesterolemia; GWAS, genome-wide association study; EWAS, epigenome-wide association study.

These results indicate that methylation of single genes is unlikely to account for the FH phenotype in FH mutation-negative patients. To investigate whether methylation changes in multiple genes may underlie the phenotype, we next applied a gradient boosting machine learning analysis to the whole-genome methylation dataset to discover genome-wide differences in methylation between FH mutation-positive and FH mutation-negative patients. A hierarchical structure was generated based on the effect size, and the top 20 probes with the highest relative feature importance in this model are reported in Table 2 and shown in Fig. 2. Fifty percent of the top 20 CpG sites were hypermethylated, with the biggest median methylation difference between the two groups for the genes PRDM16, GSTT1, and LOC728743 (Fig. 2A). In contrast, DOCK11 and KCNMA1 were most differentially hypomethylated in FH mutation-negative compared to FH mutation-positive patients. All probes with a Relative Feature Importance >10% are listed in Supplementary Table 2.
Table 2

Top 20 machine learning identified CpG sites.

| | CpG | Gene | Chromosome and position¹ | Gene feature | Methylation direction in FH mutation-negative² | Relative Feature Importance | Protein function³ |
|---|---|---|---|---|---|---|---|
| 1 | cg14265823 | PAX3 | chr2:223,163,326 | Exon 1 | Hyper | 100 | Paired Box 3; involved in neural development and myogenesis during fetal development. |
| 2 | cg02558132 | MYLK | chr3:123,411,198 | Intron 19 | Hypo | 97.97 | Myosin light chain kinase; involved in smooth muscle contraction via phosphorylation of myosin light chains. |
| 3 | cg22162835 | TEAD3 | chr6:35,457,472 | Intron 1 | Hypo | 92.2 | TEA Domain Transcription Factor 3; mainly expressed in placenta and involved in transactivation of chorionic somatomammotropin-B. |
| 4 | cg00415024 | | chr20:56,044,352 | Intergenic | Hypo | 87.39 | |
| 5 | cg26426080 | PRDM16 | chr1:3,039,210 | Intron 1 | Hypo | 84.61 | PR/SET Domain 16; transcription factor involved in brown adipose tissue differentiation. |
| 6 | cg07051648 | NTN5/SEC1P | chr19:49,177,693 | Intron 4 (SEC1P) | Hypo | 76.65 | Netrin 5; plays a role in neurogenesis, prevents motor neuron cell body migration out of the neural tube. |
| 7 | cg05071823 | DOCK11 | chrX:117,628,671 | Intergenic | Hypo | 61.17 | Dedicator Of Cytokinesis 11; involved in megakaryocyte development and platelet production. |
| 8 | cg05541727 | EXD3 | chr9:140,277,740 | Intron 2 | Hyper | 54.31 | Exonuclease 3′-5′ Domain Containing 3; involved in RNA degradation. |
| 9 | cg24051749 | MYCBP | chr1:39,340,282 | Intron 1 | Hypo | 53.71 | MYC Binding Protein; can bind to oncogenic protein C-MYC and is possibly involved in spermatogenesis. |
| 10 | cg11478607 | GSTT1 | chr22:24,384,400 | Intergenic | Hyper | 51.79 | Glutathione S-Transferase Theta 1; conjugates reduced glutathione to exogenous and endogenous hydrophobic electrophiles. |
| 11 | cg10020385 | MAF1 | chr8:145,159,706 | Exon 1 | Hyper | 49.8 | Repressor of RNA polymerase III transcription MAF1 homolog; involved in repression of RNA polymerase III-mediated transcription. |
| 12 | cg11136235 | | chr10:81,077,552 | Intergenic | Hyper | 48.55 | |
| 13 | cg16370685 | SETDB1 | chr1:150,899,163 | Intron 1 | Hyper | 46.59 | SET Domain Bifurcated 1; regulates histone methylation, potential target for treatment in Huntington Disease. |
| 14 | cg09138267 | LOC728743 | chr7:150,102,791 | Intron 1 | Hyper | 46.47 | Zinc Finger Protein Pseudogene |
| 15 | cg04900489 | | chr13:31,272,551 | Intergenic | Hypo | 46.29 | |
| 16 | cg16685760 | | chrX:145,701,257 | Intergenic | Hyper | 46.17 | |
| 17 | cg07336544 | KCNMA1 | chr10:79,194,347 | Intron 1 | Hypo | 44.54 | Potassium Calcium-Activated Channel Subfamily M Alpha 1; encodes alpha subunit of the MaxiK calcium-sensitive potassium channels in smooth muscle cells. |
| 18 | cg00578917 | CYYR1 | chr21:27,945,542 | Exon 1 | Hyper | 42.69 | Cysteine And Tyrosine Rich 1 |
| 19 | cg20588438 | KNTC1 | chr12:123,089,881 | Exon 51 | Hypo | 41.65 | Kinetochore Associated 1; involved in proper chromosome segregation during cell division. |
| 20 | cg15458017 | | chr179,672,274 | Intergenic | Hyper | 41.5 | |

Top 20 CpG sites sorted by relative feature importance for contribution in the machine learning model distinguishing FH mutation-negative from FH mutation-positive subjects.

¹ Genomic positions as provided in human genome build hg19.

² Hypo- or hypermethylation in the FH mutation-negative group compared to the FH mutation-positive group, based on the direction of the difference in median normalized betas in both groups (see Supplementary Figure 2).

³ Gene names and functions (when known/available) were derived from GeneCards.org (Stelzer et al., 2016).

Fig. 2

Top 20 machine learning identified CpG sites

Top 20 CpG sites contributing most to the machine learning model performance, selected on relative feature importance. (A) Bar chart of top 20 CpG sites ordered from highest relative feature importance to lowest, coloured for absolute difference in mean methylation (%) in FH mutation-negative patients vs. FH mutation-positive patients. (B) Radar plot displaying top 20 CpG sites that differentiate between FH mutation-negative and FH mutation-positive patients. The axes represent the standardized mean CpG methylation levels (scaled zero-mean unit-variance).

Most of the top 20 CpG sites were located within introns or exons of known genes, and none were located in promoter regions of genes. Of the top 20 CpG sites, five were not located in close proximity of a gene. Eleven of the top 20 CpG sites were hypomethylated in FH mutation-negative patients compared to FH mutation-positive patients. Boxplots of the methylation per top 20 CpG site per patient group are shown in Supplementary Figure 2 and their correlation with gene expression in Supplementary Table 3. The model generated by machine learning distinguished the methylation landscapes of FH mutation-negative and FH mutation-positive patients with an average Area Under the Curve (AUC) of 0.80±0.17 over 50 repeat runs with different validation and test sets (Fig. 3A).
A principal component analysis showed an explained variance of 11.33% for component 1 and 9.52% for component 2 (Fig. 3B). Permutation analysis revealed that the observed AUC was statistically significant (p < 0.05).
Fig. 3

Performance of machine learning model

Performance of machine learning model in distinguishing FH mutation-negative from FH mutation-positive patients. (A) ROC curve of the model. The machine learning model was able to distinguish FH mutation-positive and FH mutation-negative patients with an Area Under the Curve (AUC±SD) of 0.80±0.17. (B) Principal Component Analysis of the top 20 CpG sites with the highest relative feature importance.


Discussion

Two findings stand out from our analysis. First, no alterations were observed in the candidate gene analysis, apart from a significant 1.3% decrease in methylation in the CPT1A gene in the FH mutation-negative group, suggesting that single gene methylation is not a cause of FH in our cohort. Secondly, gradient boosting machine learning revealed an overall difference in genome-wide DNA methylation between the FH mutation-positive and FH mutation-negative subjects, with a reasonable model performance (AUC 0.80±0.17). This finding underscores that these groups do differ from each other with regard to the epigenetic architecture at a genome-wide scale.
CPT1A was the only locus at which a statistically significant difference in methylation between the two groups was found. This gene encodes carnitine palmitoyltransferase 1A (CPT1A) and was found to be less methylated in FH mutation-negative patients compared to FH mutation-positive patients. CPT1A is a mitochondrial enzyme that catalyses the transfer of an acyl group from fatty acids to a carnitine molecule, hence controlling mitochondrial uptake and subsequent oxidation of the acyl group, especially in the liver. In line with this role in the regulation of fatty acid metabolism, hypomethylation of cg00574958 in the CPT1A gene is associated with plasma triglyceride concentrations [4,29]. However, previous studies have shown that triglycerides affect methylation of CPT1A and not vice versa [30]. The lower cg00574958 methylation observed in the FH mutation-negative patients might thus be explained by the higher triglyceride levels in this group compared to the group of FH patients in whom a causative variant was identified (Table 1), since triglyceride levels were also found to correlate negatively with cg00574958 methylation in our study.
Altogether, our results confirm the earlier described association between methylation in CPT1A and triglyceride levels; an underlying mechanism relating it to LDL cholesterol is likely not present and cannot be deduced from this study. Moreover, it is uncertain how a small methylation difference of 1.3% in this gene could account for the severe hypercholesterolemic phenotype observed in the patients.
Next, we set out to incorporate methylation of CpG sites across the whole epigenome in a machine learning model to investigate whether the net effect of multiple small methylation differences could be used to identify specific patterns in FH mutation-negative and FH mutation-positive patients. Indeed, the resulting model performed well in distinguishing FH mutation-negative and FH mutation-positive patients (AUC 0.80±0.17), which emphasizes that the two selected FH groups differ on a genome-wide methylation level. The question arises whether the epigenetic changes in the group are causal or the consequence of environmental influences. For example, lifestyle factors resulting in triglyceride level differences between the two groups might also cause epigenetic differences, or the resulting triglycerides themselves might influence genome-wide methylation.
The top 20 CpG sites with a considerable impact on the model comprised two genes that have been linked to cholesterol metabolism in previous studies: PRDM16 and GSTT1. PRDM16 encodes PR/SET Domain 16, a protein involved in brown adipose tissue differentiation [31]. Common variants in the PRDM16 locus are associated with plasma LDL-C and triglyceride levels [32], and methylation at CpG site cg26426080 is positively associated with PRDM16 gene expression (Supplementary Table 3), suggesting that the observed hypomethylation in FH mutation-negative patients also reflects PRDM16 expression differences in these patients.
GSTT1 encodes glutathione S-transferase theta 1, an enzyme involved in the cellular defense against oxidative stress. Genetic variants in this gene have been associated with risk for diabetes and atherosclerosis [33], and with plasma total cholesterol, LDL-C and apolipoprotein B levels [34,35]. As for PRDM16, methylation of the identified CpG site in GSTT1 (cg11478607) is correlated with expression of GSTT1 (Supplementary Table 3), suggesting that the differential methylation observed in our study has an effect on GSTT1 expression. However, the absolute differences in methylation between the two groups at these two and the other top 20 CpG sites are small (Supplementary Figure 2), suggesting that the phenotype in FH mutation-negative patients is not caused by any single CpG methylation site, but rather results from the aggregate of a number of small methylation effects.
Our study has several limitations. Firstly, we measured DNA methylation in peripheral white blood cells, while the liver is known for its central role in LDL homeostasis. The results we obtained from the analyses in peripheral blood cells may therefore not reflect the deranged hepatic LDL metabolism in our patients. Secondly, the mutation-negative FH patient group comprised patients in whom not only epigenetic factors, but also other unknown genetic phenomena such as intronic variants [36] or polygenic hypercholesterolemia [37] may be the causal factor. Thirdly, as can be appreciated from Supplementary Figure 2, the machine learning model apparently identified some CpG sites that had two or three distinguishable groups of methylation levels (e.g., MYCBP-cg24051749), suggesting the presence of a SNP, even though we rigorously excluded CpG sites near SNPs according to the Illumina manifest using widely accepted pre-processing steps before the analysis.
The gradient boosting model used, however, allows for the identification of DNA methylation differences between the two groups even when the distribution of the methylation data is skewed by a SNP. Further studies are needed to assess whether such SNPs have biologically relevant effects in these patients or were identified coincidentally. Moreover, in our study the FH mutation-negative patients were diagnosed with FH by the referring physician based on national guidelines [12,13] and thus potentially form a heterogeneous clinical FH group that differs in some characteristics from the FH mutation-positive patients. For example, the FH patients with a pathogenic LDLR variant were younger and had higher LDL-C levels than the FH mutation-negative patients (Table 1). Although age and lipoproteins can modulate DNA methylation [30], we estimate this effect to be minimal, since we explored methylation only in patients with very high LDL-C levels (above 6 mmol/L and above the 99th percentile in the general population) in both groups. Furthermore, we selected only male participants who were not using statins, since these lipid lowering drugs have been shown to alter DNA methylation by reducing DNA methyltransferase mRNA levels [38], and are associated with less methylation in promoter regions of various genes [39,40]. It is also possible that other confounders, such as obesity, are present in the current study. Additionally, we enrolled a relatively small number of individuals in our study. Our stringent selection criteria, chosen to avoid spurious findings, did not allow for a larger study group to be analysed. Lastly, in the current model we analysed the data at a group level, and we might therefore have missed specific causal methylation patterns that would explain the FH phenotype at an individual patient level.
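The CpG sites with two or three distinguishable groups of methylation levels mentioned among the limitations above could, in principle, be screened for with a simple heuristic. The Python sketch below is an assumption-laden illustration and not a method used in the study: it counts clusters of beta values separated by a gap wider than a hypothetical threshold (`min_gap=0.15`), and the probe names and beta values are invented for the example.

```python
def count_clusters(betas, min_gap=0.15):
    """Count groups of beta values separated by gaps wider than min_gap.

    min_gap is a hypothetical threshold chosen for illustration only.
    """
    if not betas:
        return 0
    ordered = sorted(betas)
    clusters = 1
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > min_gap:
            clusters += 1
    return clusters


def flag_snp_like(sites, min_gap=0.15):
    """Return probe IDs whose beta values fall into two or three separated
    groups, the pattern a nearby SNP (0, 1 or 2 alleles) can produce."""
    return [
        probe
        for probe, betas in sites.items()
        if count_clusters(betas, min_gap) in (2, 3)
    ]


# Invented beta values for two hypothetical probes.
sites = {
    "probe_A": [0.10, 0.12, 0.50, 0.52, 0.88, 0.90],  # three groups
    "probe_B": [0.40, 0.45, 0.50, 0.48, 0.43, 0.47],  # one continuous group
}
print(flag_snp_like(sites))  # prints: ['probe_A']
```

A gap-based count is deliberately crude; it only flags candidates for follow-up and says nothing about whether a SNP is actually present or biologically relevant.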
Despite extensive sequencing efforts, a causative genetic variant is not found in a large proportion of patients with a clinical FH diagnosis [1]. Hence, efforts to find novel factors causing the FH phenotype are deemed of great relevance. The data presented in the current study suggest that monogenic DNA methylation alterations are not a major contributing factor in FH in our cohort and thus are unlikely to be a common contributing factor to the FH phenotype in FH mutation-negative patients. Nevertheless, the current study does not exclude the possibility that rare monogenic DNA methylation alterations can cause FH in some individuals. On the other hand, the genome-wide methylation differences observed with advanced machine learning models between FH mutation-negative and FH mutation-positive subjects might suggest that a large number of small DNA methylation effects play a role in high plasma LDL-C. This phenomenon resembles a polygenic score, in which the inter-individual differences in LDL-C levels are not explained by individual genetic variations but rather by the sum of a large number of small effect-size genetic factors. This finding raises the question of whether it is clinically relevant. In contrast to monogenic FH, family screening for polygenic or epigenetic hypercholesterolemia does not make sense, as these do not follow an autosomal dominant inheritance pattern. At this stage, the treatment of these patients will not change either, since FH guidelines recommend the same aggressive lipid lowering with statins and add-on therapeutics, irrespective of the FH cause. Epigenetic hypercholesterolemia may only prove to be clinically relevant if it affects the efficacy of lipid lowering therapies.
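The analogy with a polygenic score drawn above can be illustrated with a small Python sketch. This is not the gradient boosting model used in the study; it is a hypothetical additive score in which each CpG site contributes its weight multiplied by its deviation from a reference methylation level. The function name `aggregate_score`, the site names, weights, and beta values are all assumptions made for the example.

```python
def aggregate_score(betas, weights, reference):
    """Sum of weight * (beta - reference beta) over the sites that are
    present in all three mappings (site ID -> value)."""
    shared = betas.keys() & weights.keys() & reference.keys()
    return sum(weights[s] * (betas[s] - reference[s]) for s in shared)


# 100 hypothetical sites, each differing from the reference by only 0.01
# (one percentage point of methylation), all with unit weight.
site_ids = [f"site_{i}" for i in range(100)]
reference = {s: 0.50 for s in site_ids}
weights = {s: 1.0 for s in site_ids}
patient = {s: 0.51 for s in site_ids}

score = aggregate_score(patient, weights, reference)
print(round(score, 6))  # prints: 1.0
```

No single site in this toy example differs by more than one percentage point, yet the summed score is a hundred times larger than any individual contribution, which is the sense in which many small effects can add up.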
This study was the first of its kind to be conducted in FH patients and sought to control for confounding by differences in lipid levels through the inclusion of two unique FH patient groups: the group of interest, FH mutation-negative patients, and a group of FH mutation-positive patients. Although classical candidate gene analysis did not, except for CPT1A, reveal major DNA methylation differences in known lipid genes, a machine learning approach showed that FH mutation-negative patients are characterized by a different genome-wide DNA methylation pattern compared to FH mutation-positive patients, with important model features for the genes PRDM16 and GSTT1.
Data sharing statement: All individual normalized DNA methylation data are available via https://dx.doi.org/10.6084/m9.figshare.12334586.

Funding sources

This study was funded by a grant (VIDI No. 016.156.445) obtained by G.K. Hovingh. The funder (ZonMW) was not involved in the design, data collection, analysis, interpretation or any other aspect of this study.

Author contributions

Conceptualization, L.F.R., P.H., and G.K.H.; Methodology, L.F.R., E.L., A.K.G., P.H. and G.K.H.; Formal Analysis, L.F.R., A.V. and J.P.B.P.; Resources, J.C.D. and G.K.H.; Writing – Original Draft, L.F.R.; Writing – Review & Editing, E.L., M.N., A.K.G., A.G., P.H. and G.K.H.; Visualization, L.F.R., J.P.B.P. and L.E.; Supervision, P.H. and G.K.H.; Funding Acquisition, G.K.H.

Declaration of Competing Interests

LFR is co-founder of Lipid Tools B.V. MN reports reimbursement from Kaleido Biosciences and Caelus Health, outside the submitted work. GKH has served as consultant and speaker for biotechnology and pharmaceutical companies that develop molecules that influence lipoprotein metabolism, including Regeneron, Aegerion, Pfizer, Merck, KOWA, Sanofi, and Amgen; has served as principal investigator for clinical trials conducted with, among others, Amgen, Sanofi, Eli Lilly, Novartis, Kowa, Genzyme, Cerenis, Pfizer, Dezima, and AstraZeneca; has received research grants from ZonMW (Vidi grant [016.156.445]), Klinkerpad fonds, the European Union, Amgen, Sanofi, AstraZeneca, Aegerion, and Synageva; has received honoraria and investigator fees (to the Department of Vascular Medicine) for sponsor-driven studies and lectures for companies with approved lipid-lowering therapy in the Netherlands; and is partly employed by Novo Nordisk AS, Copenhagen, Denmark (0.7 FTE) and the Amsterdam UMC, Amsterdam, the Netherlands (0.3 FTE). All other authors declare no competing interests.