
MethylMix: an R package for identifying DNA methylation-driven genes.

Olivier Gevaert

Abstract

DNA methylation is an important mechanism regulating gene transcription, and its role in carcinogenesis has been extensively studied. Hyper- and hypomethylation of genes is an alternative mechanism to deregulate gene expression in a wide range of diseases. At the same time, high-throughput DNA methylation assays have been developed, generating vast amounts of genome-wide DNA methylation measurements. Yet few tools exist that can formally identify hypo- and hypermethylated genes that are predictive of transcription and thus functionally relevant for a particular disease. To address this lack of tools, we developed MethylMix, an algorithm implemented in R to identify disease-specific hyper- and hypomethylated genes. MethylMix is based on a beta mixture model to identify methylation states and compares them with the normal DNA methylation state. MethylMix introduces a novel metric, the 'Differential Methylation value' or DM-value, defined as the difference of a methylation state with the normal methylation state. Finally, matched gene expression data are used to identify methylation states that are not only differential but also transcriptionally predictive, by focusing on methylation changes that affect gene expression.
AVAILABILITY AND IMPLEMENTATION: MethylMix was implemented as an R package and is available in Bioconductor.
© The Author 2015. Published by Oxford University Press.

Year:  2015        PMID: 25609794      PMCID: PMC4443673          DOI: 10.1093/bioinformatics/btv020

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

DNA methylation is one of the most studied epigenetic aberrations underlying oncogenesis. Besides genetic mutations, hyper- and hypomethylation of genes is an alternative mechanism capable of altering the normal state and driving a wide range of diseases. Prior studies have identified hypo- or hypermethylation based on heuristic measures, for example in breast cancer (Hill et al., 2011). Additionally, computational methods have been developed to identify differentially methylated regions for specific DNA methylation platforms (Aryee et al., 2014; Wang et al., 2012; Warden et al., 2013). However, few methods formalize the identification of DNA methylation-driven genes using a model-based approach. We identified three key criteria that should be addressed to derive key methylation-driven genes. First, the determination of the degree of methylation cannot hinge on arbitrary thresholds, as is commonly done. Second, the assessment of a gene as hyper- or hypomethylated must be made in comparison to normal tissue. Finally, the methylation of genes identified as hyper- or hypomethylated should have a transcriptionally predictive effect, implying that it is functionally relevant. We designed MethylMix to accommodate these three criteria to identify methylation-driven genes in diseases.

2 Algorithm

MethylMix integrates DNA methylation from normal and disease samples and matched disease gene expression data via a three-step algorithm:

Step i: Genes are filtered by identifying transcriptionally predictive methylation. First, each CpG site is associated with its closest gene. Next, MethylMix requires that the DNA methylation of a CpG site has a significant effect on its corresponding gene expression in order for the gene to be considered a methylation-driven gene. We define such genes as transcriptionally predictive genes.

Step ii: The methylation states of a gene are identified using univariate beta mixture modeling to identify subgroups of patients with similar DNA methylation level for a specific CpG site. We use the Bayesian Information Criterion (BIC) to select the number of methylation states by iteratively adding a new mixture component if the BIC score improves. Each beta mixture component is referred to as a methylation state and represented by its mean methylation level.

Step iii: Hyper- and hypomethylated genes are defined relative to normal by comparing the methylation levels of each methylation state to the mean of the DNA methylation levels of normal tissue samples using a Wilcoxon rank sum test. Based on this test, Differential Methylation values or DM-values are created, defined as the difference of a methylation state with the normal methylation state. Genes with methylation states different from normal are called differential genes.

The final output of MethylMix is genes that are both transcriptionally predictive and differential, together with the parameters of their methylation states. Additionally, a matrix of DM-values is part of the output and can be used in subsequent analysis, for example to define methylation-driven subgroups using clustering algorithms.
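Steps ii and iii can be illustrated with a small base R sketch on simulated beta values. This is not the MethylMix implementation: the helper names (`beta_moments`, `fit_beta_mixture`), the moment-matching update used in place of a maximum-likelihood fit, the 0.05 significance level and the choice to set non-significant DM-values to zero are all assumptions made for illustration. Only the first round of the "add a component if BIC improves" rule is shown.

```r
# Illustrative sketch only: base R, simulated data, not the MethylMix internals.
set.seed(42)
normal  <- rbeta(50, 2, 20)                          # normal tissue, mean near 0.09
disease <- c(rbeta(100, 2, 20), rbeta(100, 20, 5))   # unchanged and hypermethylated patients

# Weighted method-of-moments estimate of the two beta shape parameters
# (an assumed shortcut standing in for a maximum-likelihood M-step).
beta_moments <- function(x, w) {
  m <- sum(w * x) / sum(w)
  v <- sum(w * (x - m)^2) / sum(w)
  k <- m * (1 - m) / v - 1
  c(m * k, (1 - m) * k)
}

# EM-style fit of a k-component beta mixture.
# Returns the state means, the BIC (lower is better) and hard state assignments.
fit_beta_mixture <- function(x, k, iterations = 100) {
  n <- length(x)
  start <- ceiling(rank(x, ties.method = "first") * k / n)  # split sorted values into k blocks
  resp <- outer(start, seq_len(k), "==") * 1                # n x k responsibilities
  for (i in seq_len(iterations)) {
    weights <- colMeans(resp)
    shapes <- sapply(seq_len(k), function(j) beta_moments(x, resp[, j]))
    dens <- sapply(seq_len(k), function(j) weights[j] * dbeta(x, shapes[1, j], shapes[2, j]))
    dens <- matrix(dens, nrow = n)
    resp <- dens / rowSums(dens)
  }
  loglik <- sum(log(rowSums(dens)))
  list(means = shapes[1, ] / colSums(shapes),
       bic = -2 * loglik + (3 * k - 1) * log(n),   # 2k shapes plus k - 1 free weights
       state = max.col(resp, ties.method = "first"))
}

one <- fit_beta_mixture(disease, 1)
two <- fit_beta_mixture(disease, 2)
best <- if (two$bic < one$bic) two else one   # keep the second state only if BIC improves

# DM-value per state: state mean minus normal mean, kept only when a Wilcoxon
# rank sum test separates that state's samples from normal (alpha is assumed).
dm_values <- sapply(seq_along(best$means), function(s) {
  p <- wilcox.test(disease[best$state == s], normal)$p.value
  if (p < 0.05) best$means[s] - mean(normal) else 0
})

print(round(best$means, 2))
print(round(dm_values, 2))
```

In this simulation the high-methylation state receives a large positive DM-value, which is how a hypermethylated state would show up in the DM-value matrix. A fuller version would keep adding components until the BIC stops improving.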

3 Functions and examples

MethylMix was implemented in the statistical language R and is provided as an R package in the supplementary data. MethylMix contains two key functionalities: the creation of MethylMix models for a set of genes of any size and the visualization of a MethylMix plot for each gene. MethylMix needs three datasets: normal DNA methylation data, disease DNA methylation data and matched disease gene expression data. The normal DNA methylation data should ideally be from the same tissue or cell type as the disease DNA methylation data. We provide example data in the package for 14 genes from 251 glioblastoma patients from The Cancer Genome Atlas (TCGA) (McLendon et al., 2008). The 14 genes were selected based on their documented differential DNA methylation status in glioblastoma in the literature (Etcheverry et al., 2010; Hegi et al., 2004). First, a MethylMix model is created for all genes as follows:

> library(MethylMix)
> data(METcancer)
> data(METnormal)
> data(MAcancer)
> MethylMixResults = MethylMix(METcancer, METnormal, MAcancer)

MethylMix will first investigate for each gene whether it is transcriptionally predictive by building a linear regression model that estimates the association between DNA methylation and gene expression. MethylMix only selects genes with a significant inverse relationship (P value < 0.01), resulting in nine transcriptionally predictive genes in this example. Then a MethylMix model is created for these nine transcriptionally predictive genes and MethylMix reports how many methylation states each gene has. For large datasets with more genes, MethylMix can be run in parallel mode and take advantage of multiple cores.

Next, a MethylMix model plot can be created for each gene, visualizing the beta mixture model and the methylation states that were identified for that gene. Additional parameters can be passed to the plot function by adding the normal methylation data and the matched gene expression data. These additional parameters will visualize the 95% confidence interval of the normal DNA methylation data and the relationship with matched gene expression data.

> MethylMix_PlotModel('MGMT', METcancer, MethylMixResults, MAcancer, METnormal)

For example, Figure 1 displays the MethylMix model for MGMT showing two methylation states, whereby the low methylation state matches the normal methylation and the high methylation state corresponds to hypermethylation of MGMT, a well-known case of hypermethylation influencing treatment of glioblastoma patients (Hegi et al., 2004). Next, Figure 2 shows the inverse correlation between DNA methylation and matched gene expression of MGMT.
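The transcriptional filter described in this section (a linear regression with a significant inverse relationship, P value < 0.01) can be sketched in base R as follows. The data are simulated and the helper `is_transcriptionally_predictive` is a hypothetical name introduced here; it is a minimal illustration of the idea, not the code used inside the package.

```r
# Illustrative sketch only: simulated data and a hypothetical helper function.
set.seed(7)
n <- 100
methylation   <- rbeta(n, 2, 5)                             # beta values for one CpG site
expr_silenced <- 8 - 4 * methylation + rnorm(n, sd = 0.5)   # expression falls as methylation rises
expr_positive <- 8 + 4 * methylation + rnorm(n, sd = 0.5)   # expression rises with methylation

# A gene passes when the regression slope is negative and significant at alpha.
is_transcriptionally_predictive <- function(met, expr, alpha = 0.01) {
  coefs <- summary(lm(expr ~ met))$coefficients
  slope <- coefs["met", "Estimate"]
  p_value <- coefs["met", "Pr(>|t|)"]
  slope < 0 && p_value < alpha
}

print(is_transcriptionally_predictive(methylation, expr_silenced))
print(is_transcriptionally_predictive(methylation, expr_positive))
```

Requiring a negative slope as well as a small P value means that a strong positive association is rejected, matching the requirement of an inverse relationship between methylation and expression.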
Fig. 1. MethylMix model for the MGMT gene based on 251 glioblastoma patients from TCGA

Fig. 2. Inverse correlation of DNA methylation and gene expression for MGMT in 251 glioblastoma patients from TCGA


4 Conclusion

MethylMix is an R package that identifies hyper- and hypomethylated genes using a beta mixture modeling approach. MethylMix also quantifies the effect DNA methylation has on gene expression, thereby identifying transcriptionally predictive DNA methylation events. MethylMix can be used either to study single genes, as in the example above, or in parallel mode to build MethylMix models genome-wide. MethylMix requires a large cohort to identify methylation states and capture the DNA methylation heterogeneity present in a particular disease. We have used MethylMix and its associated DM-values to identify driver genes (Gevaert and Plevritis, 2013; Gevaert et al., 2014) on datasets of 100 samples or more, and on more than 4000 TCGA cases across 12 tissues to identify methylation-driven subgroups (Gevaert et al., 2015). In summary, MethylMix offers a new tool to identify methylation-driven genes, providing a complementary source of information to copy number and mutation spectra for identifying disease driver genes.

Funding

This work was partially supported by NIH/NCI R01 CA160251 and NIH/NCI U01 CA176299.

Conflict of Interest: none declared.

References

1.  IMA: an R package for high-throughput analysis of Illumina's 450K Infinium methylation data.

Authors:  Dan Wang; Li Yan; Qiang Hu; Lara E Sucheston; Michael J Higgins; Christine B Ambrosone; Candace S Johnson; Dominic J Smiraglia; Song Liu
Journal:  Bioinformatics       Date:  2012-01-16       Impact factor: 6.937

2.  Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays.

Authors:  Martin J Aryee; Andrew E Jaffe; Hector Corrada-Bravo; Christine Ladd-Acosta; Andrew P Feinberg; Kasper D Hansen; Rafael A Irizarry
Journal:  Bioinformatics       Date:  2014-01-28       Impact factor: 6.937

3.  Identification of ovarian cancer driver genes by using module network integration of multi-omics data.

Authors:  Olivier Gevaert; Victor Villalobos; Branimir I Sikic; Sylvia K Plevritis
Journal:  Interface Focus       Date:  2013-08-06       Impact factor: 3.906

4.  Glioblastoma multiforme: exploratory radiogenomic analysis by using quantitative image features.

Authors:  Olivier Gevaert; Lex A Mitchell; Achal S Achrol; Jiajing Xu; Sebastian Echegaray; Gary K Steinberg; Samuel H Cheshier; Sandy Napel; Greg Zaharchuk; Sylvia K Plevritis
Journal:  Radiology       Date:  2014-05-12       Impact factor: 11.105

5.  Clinical trial substantiates the predictive value of O-6-methylguanine-DNA methyltransferase promoter methylation in glioblastoma patients treated with temozolomide.

Authors:  Monika E Hegi; Annie-Claire Diserens; Sophie Godard; Pierre-Yves Dietrich; Luca Regli; Sandrine Ostermann; Philippe Otten; Guy Van Melle; Nicolas de Tribolet; Roger Stupp
Journal:  Clin Cancer Res       Date:  2004-03-15       Impact factor: 12.531

6.  Genome-wide DNA methylation profiling of CpG islands in breast cancer identifies novel genes associated with tumorigenicity.

Authors:  Victoria K Hill; Christopher Ricketts; Ivan Bieche; Sophie Vacher; Dean Gentle; Cheryl Lewis; Eamonn R Maher; Farida Latif
Journal:  Cancer Res       Date:  2011-03-01       Impact factor: 12.701

7.  DNA methylation in glioblastoma: impact on gene expression and clinical outcome.

Authors:  Amandine Etcheverry; Marc Aubry; Marie de Tayrac; Elodie Vauleon; Rachel Boniface; Frederique Guenot; Stephan Saikali; Abderrahmane Hamlat; Laurent Riffaud; Philippe Menei; Veronique Quillien; Jean Mosser
Journal:  BMC Genomics       Date:  2010-12-14       Impact factor: 3.969

8.  Pancancer analysis of DNA methylation-driven genes using MethylMix.

Authors:  Olivier Gevaert; Robert Tibshirani; Sylvia K Plevritis
Journal:  Genome Biol       Date:  2015-01-29       Impact factor: 13.583

9.  Comprehensive genomic characterization defines human glioblastoma genes and core pathways.

Authors: 
Journal:  Nature       Date:  2008-09-04       Impact factor: 49.962

10.  COHCAP: an integrative genomic pipeline for single-nucleotide resolution DNA methylation analysis.

Authors:  Charles D Warden; Heehyoung Lee; Joshua D Tompkins; Xiaojin Li; Charles Wang; Arthur D Riggs; Hua Yu; Richard Jove; Yate-Ching Yuan
Journal:  Nucleic Acids Res       Date:  2013-04-17       Impact factor: 16.971

