Literature DB >> 30275901

Identification of epistatic interactions between the human RNA demethylases FTO and ALKBH5 with gene set enrichment analysis informed by differential methylation.

Elizabeth R Piette1, Jason H Moore1.   

Abstract

The Genetic Analysis Workshop (GAW) presents an opportunity to collaboratively evaluate methodology relevant to current issues in genetic epidemiology. The GAW20 data combine real clinical trial data with fictitious epigenetic drug response endpoints. Considering the evidence suggesting that networks of interactions between many genes underlie complex phenotypes, we utilize differential methylation status to identify a relevant gene set for enrichment analysis and use this to infer potential biological function underlying drug response. We highlight the pertinence of considering the potential for widespread epistatic interactions in the absence of main effects, and present evidence of epistasis between single-nucleotide polymorphisms (SNPs) on the two RNA demethylases FTO and ALKBH5.

Entities:  

Year:  2018        PMID: 30275901      PMCID: PMC6157145          DOI: 10.1186/s12919-018-0122-0

Source DB:  PubMed          Journal:  BMC Proc        ISSN: 1753-6561


Background

The GAW is a forum for investigators to develop and critique new analytical methods for complex traits on a shared data set. The GAW20 data provide simulated replications based on the Genetics of Lipid-Lowering Drugs and Diet Network (GOLDN) clinical trial, had participants been subject to treatment with a fictitious drug with a pharmacoepigenetic effect on triglyceride response [1]. These data present a unique analysis opportunity as all phenotypes, subject covariates, genotypes, pre-treatment methylation levels, etc. are real data from the trial, but are accompanied by simulated post-treatment methylation and triglyceride levels. GAW participants choose to analyze the data with or without knowing the simulation methods; we chose to perform the simulated data analysis prior to attending GAW, without knowledge of the simulation methods. Analysis of the real data was performed following GAW attendance, as it was revealed that the data simulation methods did not consider interactions, and therefore analysis of interactions in the simulated data was not appropriate. Despite evidence for multi-locus underpinnings of phenotype-genotype association, the multiple-testing burden associated with fitting interaction models is stringent due to the high dimensionality of genomic data [2]. Strategies for better detecting these interactions can aim to avoid exhaustively testing each potential interaction via data reduction methods, integrating expert knowledge, and/or consolidating multiple sources of evidence to narrow the search space [3-5]. In this study, we hypothesize that CpG sites that are differentially methylated with respect to treatment are associated with the pharmacoepigenetic mechanism of the fictitious drug. Considering the evidence for multi-locus models of complex disease etiology, we hypothesize that the drug response is better evaluated by gene set enrichment analyses than by single locus models. By integrating results from gene ontology, drug-disease association, and microRNA (miRNA) target analyses we find evidence implicating the relevance of adenosine and miRNAs with known epigenetic regulation and roles in lipid metabolism. From this, we infer the potential importance of the N6-methyladenosine modification in the pharmacoepigenetic response on triglycerides, and consider how miRNA adenosine methylation rather than CpG methylation may impact the phenotype. Lacking direct data for miRNA adenosine methylation, we perform a targeted epistasis search between loci on the two RNA demethylases FTO and ALKBH5, and find evidence for statistical epistasis between one variant within each respective gene. Repeating the analysis with the real data revealed four significant interactions between variants across these genes. Overall, we present an example workflow in which integration of multiple sources of information can help uncover biological meaning in the absence of significant main effects.

Methods

Data set

The GOLDN data and companion simulations for GAW20 are previously described [1, 6]. Relevant to this analysis, subject data includes fasting lipid profiles prior to and post-treatment, methylation at more than 450,000 CpG sites prior to and post-treatment, GWAS of more than 700,000 autosomal SNPs, and covariates including age, center, metabolic syndrome-related traits, and smoking status. This analysis is of the pre-defined single representative replicate (n = 680) of the post-treatment methylation and triglyceride levels.

Phenotype definition

We define the phenotype of interest as the log ratio of the average post-treatment triglyceride level to the average pre-treatment triglyceride level. Due to the high correlations between triglyceride levels at pre-treatment time points 1 and 2 and post-treatment time points 3 and 4 (0.90 and 0.91, respectively), and presence of a value for at least one of time point 1 or 2 and 3 or 4 for each individual, we singly impute missing values via linear regression.

CpG site filtering

Significantly differentially methylated CpG sites are identified via paired t-tests for pre- and post-treatment methylation levels (α = 0.05 for 463,995 hypotheses yields Bonferroni cutoff of 1.08 × 10− 7).

Modeling the relationship between phenotype and CpG site methylation

Linear models are fit to test the relationships between the phenotype and methylation status of the significant CpG sites identified above, characterized as a single predictor: the log ratio of post-treatment to pre-treatment methylation (α = 0.05 for 212,018 hypotheses yields Bonferroni cutoff of 2.36 × 10− 6).

Gene set enrichment analyses

All CpG sites that pass the initial t-test filter and have a p-value ≤0.05 for the phenotype ~ methylation predictor model are used to curate a list of corresponding genes with evidence for both differential methylation and association with the phenotype. This gene list is used for gene ontology, drug association, and miRNA target enrichment analyses using the WEB-based GEneSeTAnaLysis Toolkit (WebGestalt, http://www.webgestalt.org/) [7].

Targeted epistasis search

We investigate potential epistasis between the two RNA demethylases FTO and ALKBH5 by calculating p-values for the likelihood ratio tests, comparing the linear models containing each FTO-ALKBH5 SNP-SNP pair, with and without their interaction term (α = 0.05 for 340 hypotheses yields Bonferroni cutoff of 0.00015 for the simulated data; 255 hypotheses yields a cutoff of 0.000196 for the real data).

Results

We tested 463,995 CpG sites for differential methylation prior to versus post-treatment; 212,018 CpG sites passed the Bonferroni threshold of 1.08 × 10− 7. None of the 212,018 CpG sites that are significantly differentially methylated reached genome-wide significance for association with the phenotype (Fig. 1).
Fig. 1

Manhattan plot of triglyceride phenotype ~ CpG site methylation log ratio

Manhattan plot of triglyceride phenotype ~ CpG site methylation log ratio Our gene set is constructed from all CpG sites with p-values ≤0.05 for the models above for which corresponding gene annotations are available (5413 of the 212,018 differentially methylated sites). Some CpGs have more than one corresponding gene listed, and many genes have multiple CpGs, for a total gene list length of 4443. The top result from the drug association analysis is adenosine (number of reference genes in the category = 477; number of genes in the gene set and also in the category = 126; expected number in the category = 49.14; ratio of enrichment = 2.56; raw p-value from hypergeometric test = 1.31 × 10− 23; p-value adjusted by multiple test adjustment = 8 × 10− 21). The top result from the miRNA target analysis is miR-124a (number of reference genes in the category = 542; number of genes in the gene set and also in the category = 175; expected number in the category = 55.84; ratio of enrichment = 3.13; raw p-value from hypergeometric test = 7.82 × 10− 45; p-value adjusted by multiple test adjustment = 1.70 × 10− 42). The GAW20 GWAS data with the simulated phenotype include complete observations for 68 SNPs on FTO and 5 SNPs on ALKBH5 for the 680 subjects. We only test for epistatic interactions between SNPs across the two genes (and do not test for interactions between SNPs on the same gene), for a total of 340 tested interactions and therefore a Bonferroni threshold of 0.00015. One pair of SNPs, rs2192872 from FTO and rs8068517 from ALKBH5, have a significant p-value for the likelihood ratio test comparing the models with and without the interaction (p = 2.01 × 10− 5). The analysis of the real phenotype and GWAS data was performed in the same manner for 778 subjects for 51 SNPs on FTO and 5 SNPs on ALKBH5 (Bonferroni threshold of 0.000196). Four pairs of SNPs have significant p values for the likelihood ratio test comparing the models with and without the interaction (Table 1). Figure 2 visually summarizes the distribution of phenotype by genotype for the two SNPs involved in the most significant identified interaction.
Table 1

Summary of significant interactions, Variant annotations are from Ensembl [29]. Base model covariate selection is based on significance at the 0.05 level and includes average pre-treatment triglyceride level, age, center, current smoker status, and sex

SNPAllelesMAFLocationGeneConsequence typeLRTp-value
rs1362571 G/T0.34 (G)16:53877858 FTO Intron variant2.76 × 10− 6
rs11655588 A/G0.18 (G)17:18204137 ALKBH5 Intron variant
rs10521304 T/C0.41 (C)16:53874745 FTO Intron variant8.80 × 10− 6
rs11655588 A/G0.18 (G)17:18204137 ALKBH5 Intron variant
rs1421090 A/G0.29 (G)16:53816258 FTO Intron variant0.000158
rs8071834 T/C0.45 (C)17:18196677 ALKBH5 Intron variant
rs17820875 A/G0.12 (G)16:53892878 FTO Intron variant0.000177
rs8068517 G/A0.24 (G)17:18192664 ALKBH5 Intron variant

Abbreviations: LRT likelihood ratio test, MAF minor allele frequency

Fig. 2

Phenotype distributions by genotype a. Phenotype distributions for individuals with 0, 1, or 2 copies of the minor allele for rs1362571. b. Phenotype distributions for individuals with 0, 1, or 2 copies of the minor allele for rs11655588 among those with 0 copies of the minor allele for rs1362571 (left); 1 copy (center); 2 copies (right). c. Phenotype distributions for individuals with 0, 1, or 2 copies of the minor allele for rs11655588. d. Phenotype distributions for individuals with 0, 1, or 2 copies of the minor allele for rs1362571 among those with 0 copies of the minor allele for rs11655588 (left); 1 copy (center); 2 copies (right)

Summary of significant interactions, Variant annotations are from Ensembl [29]. Base model covariate selection is based on significance at the 0.05 level and includes average pre-treatment triglyceride level, age, center, current smoker status, and sex Abbreviations: LRT likelihood ratio test, MAF minor allele frequency Phenotype distributions by genotype a. Phenotype distributions for individuals with 0, 1, or 2 copies of the minor allele for rs1362571. b. Phenotype distributions for individuals with 0, 1, or 2 copies of the minor allele for rs11655588 among those with 0 copies of the minor allele for rs1362571 (left); 1 copy (center); 2 copies (right). c. Phenotype distributions for individuals with 0, 1, or 2 copies of the minor allele for rs11655588. d. Phenotype distributions for individuals with 0, 1, or 2 copies of the minor allele for rs1362571 among those with 0 copies of the minor allele for rs11655588 (left); 1 copy (center); 2 copies (right)

Discussion and conclusions

Our gene set enrichment analyses were motivated by the goal of making inferences about the mechanism of action of the fictitious drug, assuming that differential methylation could reveal a set of genes associated with a drug that is functionally similar to the fictitious one in question. Upon attending GAW, it became apparent that interaction analysis of the simulated data was not appropriate given the nature of the simulation, and the interaction identified can therefore be considered a false positive. However, reimplementing the same analytic pipeline using the real data produced largely comparable results and identified four pairs of loci between FTO and ALKBH5 with significant interactions. The joint evidence implicating adenosine as the top result from the drug association gene set enrichment analysis and numerous miRNAs involved in metabolic traits from the miRNA target analysis, taken with the assumption that the drug has some unknown epigenetic mechanism, lead us to consider that mRNA or miRNA adenosine methylation, rather than CpG methylation, may be associated with drug response. miRNAs in general are known important regulators of lipid metabolism [8-10]. The top miRNA hit, miR-124a, has evidence for both its role in metabolic traits [11-14] and for its epigenetic regulation in the context of risk of diverse diseases [15-19]. N6-Methyladenosine (m6A) is a reversible, dynamic posttranscriptional modification that is regulated by miRNAs, and its demethylation has been shown to regulate adipogenesis [20-25]. Recent work demonstrates that RNA conformational changes induced by m6A determine substrate specificity for the two RNA demethylases, FTO and ALKBH5 [26-28]. If miRNA adenosine methylation rather than CpG methylation affects the phenotype, although the available data lacks observations on miRNA adenosine methylation, interactions between the two genes that demethylate miRNAs may be biologically relevant and can be assessed with the GOLDN SNP data. Given the evidence for physical interactions between RNA with the m6A mark and these demethylases, we were motivated to check for epistasis between SNPs on these genes. Although we did find four pairs of loci with statistically significant interactions, the small sample size means that some SNP-SNP genotypes have few observations, warranting investigation of this interaction in a larger study and further molecular clarification of the distinct and mutual roles of FTO and ALKBH5. Rather than attempting to explain complex phenotypes solely in terms of single locus main effects, we posit that interaction models better represent the underlying regulatory nature of the genome, and that the joint effect of perturbations to multiple interacting partners can help better explain complex phenotypes. This analysis of epistatic interactions between loci on two genes serves as an illustrative example of how interactions can be significant in the absence of significant main effects, and highlights the need for analyses that integrate multiple sources of data to narrow the search space for plausible interactions.
  28 in total

1.  Genome-wide strategies for detecting multiple loci that influence complex diseases.

Authors:  Jonathan Marchini; Peter Donnelly; Lon R Cardon
Journal:  Nat Genet       Date:  2005-03-27       Impact factor: 38.330

2.  Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer.

Authors:  M D Ritchie; L W Hahn; N Roodi; L R Bailey; W D Dupont; F F Parl; J H Moore
Journal:  Am J Hum Genet       Date:  2001-06-11       Impact factor: 11.025

3.  MicroRNA-124a is hyperexpressed in type 2 diabetic human pancreatic islets and negatively regulates insulin secretion.

Authors:  Guido Sebastiani; Agnese Po; Evelina Miele; Giuliana Ventriglia; Elena Ceccarelli; Marco Bugliani; Lorella Marselli; Piero Marchetti; Alberto Gulino; Elisabetta Ferretti; Francesco Dotta
Journal:  Acta Diabetol       Date:  2014-11-19       Impact factor: 4.280

4.  Epigenome-wide association study of fasting blood lipids in the Genetics of Lipid-lowering Drugs and Diet Network study.

Authors:  Marguerite R Irvin; Degui Zhi; Roby Joehanes; Michael Mendelson; Stella Aslibekyan; Steven A Claas; Krista S Thibeault; Nikita Patel; Kenneth Day; Lindsay Waite Jones; Liming Liang; Brian H Chen; Chen Yao; Hemant K Tiwari; Jose M Ordovas; Daniel Levy; Devin Absher; Donna K Arnett
Journal:  Circulation       Date:  2014-06-11       Impact factor: 29.690

5.  Significance of serum microRNAs in pre-diabetes and newly diagnosed type 2 diabetes: a clinical study.

Authors:  Lei Kong; Junjie Zhu; Wenxia Han; Xiuyun Jiang; Min Xu; Yue Zhao; Qiongzhu Dong; Zengfen Pang; Qingbo Guan; Ling Gao; Jiajun Zhao; Lei Zhao
Journal:  Acta Diabetol       Date:  2010-09-21       Impact factor: 4.280

6.  MicroRNA-124a is epigenetically regulated and acts as a tumor suppressor by controlling multiple targets in uveal melanoma.

Authors:  Xiaoyan Chen; Dandan He; Xiang Da Dong; Feng Dong; Jiao Wang; Lihua Wang; Jiang Tang; Dan-Ning Hu; Dongsheng Yan; LiLi Tu
Journal:  Invest Ophthalmol Vis Sci       Date:  2013-03-01       Impact factor: 4.799

7.  Epigenetic silencing of the tumor suppressor microRNA Hsa-miR-124a regulates CDK6 expression and confers a poor prognosis in acute lymphoblastic leukemia.

Authors:  Xabier Agirre; Amaia Vilas-Zornoza; Antonio Jiménez-Velasco; José Ignacio Martin-Subero; Lucia Cordeu; Leire Gárate; Edurne San José-Eneriz; Gloria Abizanda; Paula Rodríguez-Otero; Puri Fortes; José Rifón; Eva Bandrés; María José Calasanz; Vanesa Martín; Anabel Heiniger; Antonio Torres; Reiner Siebert; José Román-Gomez; Felipe Prósper
Journal:  Cancer Res       Date:  2009-05-12       Impact factor: 12.701

Review 8.  Influence of miRNA in insulin signaling pathway and insulin resistance: micro-molecules with a major role in type-2 diabetes.

Authors:  Chiranjib Chakraborty; C George Priya Doss; Sanghamitra Bandyopadhyay; Govindasamy Agoramoorthy
Journal:  Wiley Interdiscip Rev RNA       Date:  2014-06-18       Impact factor: 9.957

9.  ALKBH5 is a mammalian RNA demethylase that impacts RNA metabolism and mouse fertility.

Authors:  Guanqun Zheng; John Arne Dahl; Yamei Niu; Peter Fedorcsak; Chun-Min Huang; Charles J Li; Cathrine B Vågbø; Yue Shi; Wen-Ling Wang; Shu-Hui Song; Zhike Lu; Ralph P G Bosmans; Qing Dai; Ya-Juan Hao; Xin Yang; Wen-Ming Zhao; Wei-Min Tong; Xiu-Jie Wang; Florian Bogdan; Kari Furu; Ye Fu; Guifang Jia; Xu Zhao; Jun Liu; Hans E Krokan; Arne Klungland; Yun-Gui Yang; Chuan He
Journal:  Mol Cell       Date:  2012-11-21       Impact factor: 17.970

10.  Ensembl 2016.

Authors:  Andrew Yates; Wasiu Akanni; M Ridwan Amode; Daniel Barrell; Konstantinos Billis; Denise Carvalho-Silva; Carla Cummins; Peter Clapham; Stephen Fitzgerald; Laurent Gil; Carlos García Girón; Leo Gordon; Thibaut Hourlier; Sarah E Hunt; Sophie H Janacek; Nathan Johnson; Thomas Juettemann; Stephen Keenan; Ilias Lavidas; Fergal J Martin; Thomas Maurel; William McLaren; Daniel N Murphy; Rishi Nag; Michael Nuhn; Anne Parker; Mateus Patricio; Miguel Pignatelli; Matthew Rahtz; Harpreet Singh Riat; Daniel Sheppard; Kieron Taylor; Anja Thormann; Alessandro Vullo; Steven P Wilder; Amonida Zadissa; Ewan Birney; Jennifer Harrow; Matthieu Muffato; Emily Perry; Magali Ruffier; Giulietta Spudich; Stephen J Trevanion; Fiona Cunningham; Bronwen L Aken; Daniel R Zerbino; Paul Flicek
Journal:  Nucleic Acids Res       Date:  2015-12-19       Impact factor: 16.971

View more
  3 in total

1.  The m6A methyltransferase METTL3 cooperates with demethylase ALKBH5 to regulate osteogenic differentiation through NF-κB signaling.

Authors:  Jinjin Yu; Lujun Shen; Yanli Liu; Hong Ming; Xinxing Zhu; Maoping Chu; Juntang Lin
Journal:  Mol Cell Biochem       Date:  2019-10-23       Impact factor: 3.396

Review 2.  Emerging role of m6A methylation modification in ovarian cancer.

Authors:  Lin-Lin Chang; Xia-Qing Xu; Xue-Ling Liu; Qian-Qian Guo; Yan-Nan Fan; Bao-Xia He; Wen-Zhou Zhang
Journal:  Cancer Cell Int       Date:  2021-12-11       Impact factor: 5.722

Review 3.  Functions and mechanisms of N6‑methyladenosine in prostate cancer (Review).

Authors:  Hongyuan Wan; Yanyan Feng; Junjie Wu; Lijie Zhu; Yuanyuan Mi
Journal:  Mol Med Rep       Date:  2022-07-20       Impact factor: 3.423

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.