MethPed: an R package for the identification of pediatric brain tumor subtypes.

Mohammad Tanvir Ahamed, Anna Danielsson, Szilárd Nemes, Helena Carén.

Abstract

BACKGROUND: DNA methylation profiling of pediatric brain tumors offers a new way of diagnosing and subgrouping these tumors, which improves on current clinical diagnostics based on histopathology. We have therefore developed the MethPed classifier, a multiclass random forest algorithm based on DNA methylation profiles from many subgroups of pediatric brain tumors.
RESULTS: We developed an R package that implements the MethPed classifier, making it easily available and accessible. The package can be used for estimating the probability that an unknown sample belongs to each of nine pediatric brain tumor diagnoses/subgroups.
CONCLUSIONS: The MethPed R package efficiently classifies pediatric brain tumors using the developed MethPed classifier. MethPed is available via Bioconductor: http://bioconductor.org/packages/MethPed/.

Keywords:  450K; Astrocytoma; Classifier (classification tool); DNA methylation; Ependymoma; Glioblastoma; Medulloblastoma; MethPed; R package; Random forest

Year:  2016        PMID: 27370569      PMCID: PMC4930602          DOI: 10.1186/s12859-016-1144-0

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


Background

Carcinogenesis involves changes in gene expression that result in tumor-specific gene and protein signatures. Such signatures have been used to classify different subtypes of cancers. Gene expression is partly regulated by the methylation state of CpG islands, and cancer tissue is characterized by increased variability in DNA methylation patterns. DNA methylation profiling has been reported as a robust method to classify and subgroup tumors of different origin [1]. For most pediatric brain tumor diagnoses, methylation profiling can divide the tumors into clinically relevant subgroups that reflect the diverse biology of the different subtypes, which further highlights the need for specific therapeutic strategies to target different subgroups. With the increased knowledge about specific brain tumor subgroups and the development of targeted therapy for different entities, it is essential to quickly and accurately determine the correct diagnosis for pediatric brain tumor patients. The most commonly used platform for genome-wide methylation profiling is the Illumina Infinium HumanMethylation 450 BeadChip array. These arrays profile ~485,000 CpG sites and have been used by the Cancer Genome Atlas Project (TCGA) and in numerous studies of pediatric brain tumors. A correct diagnosis is vital for determining the appropriate treatment protocol for a specific patient, for selecting the right patients for clinical trials investigating novel therapy for specific diagnoses and subgroups, and for enabling basic researchers to draw correct conclusions from experiments. We therefore developed the MethPed classifier [2], which is a multiclass random forest algorithm [3] based on DNA methylation profiles from many subgroups of pediatric brain tumors. We have now developed an R package that uses this method, making it easily available and accessible.

Implementation

The MethPed classifier was developed using the Random Forest (RF) algorithm [3] for robust classification of unknown brain tumor samples into subtypes, as described in Danielsson et al. [2]. Briefly, the RF algorithm was applied to beta values, which estimate methylation levels (between 0 and 1, with 0 being unmethylated and 1 fully methylated) from the ratio of intensities between methylated and unmethylated alleles generated by the Illumina Infinium HumanMethylation 450 BeadChip array. A training probe pool of 900 methylation sites that showed the highest predictive power (AUC values) in a large number of regression analyses was selected from 472 clinically diagnosed brain tumor cases available on GEO, after necessary data cleaning and KNN imputation of missing values. The RF algorithm was then applied to classify unknown samples based on the selected training probe pool. See Fig. 1 for a summary of the workflow for the MethPed classifier and R package.
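
As an illustration of how a beta value follows from the two intensities, the Python sketch below computes the ratio for a single probe. It is not part of the MethPed package: the function name `beta_value` and the stabilizing `offset` term (a constant often added to the denominator for Illumina arrays, set to 100 here as an assumption) do not come from the paper.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Estimate the methylation level of one probe.

    Returns a number between 0 (unmethylated) and 1 (fully methylated),
    or None when both intensities and the offset are zero.
    `offset` is an assumed stabilizing constant, not taken from the paper.
    """
    if methylated < 0 or unmethylated < 0 or offset < 0:
        raise ValueError("intensities and offset must be non-negative")
    total = methylated + unmethylated + offset
    if total == 0:
        return None  # no signal at all: the ratio is undefined
    return methylated / total
```

With `offset=0` the function reduces to the plain ratio of methylated to total intensity; a positive offset keeps low-intensity probes from producing unstable values.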
Fig. 1

Implementation and workflow of the MethPed classifier and package

The MethPed classifier can be accessed through the ‘MethPed’ package, which can be downloaded from Bioconductor, a repository for bioinformatics-related applications. The ‘MethPed’ package includes the MethPed classifier and an example data set of two tumors. After installing the package, the example data can be read into the R computing environment with the data function. Data for running MethPed are generated by Illumina Infinium HumanMethylation 450 BeadChip arrays. Beta values for two samples (Tumor A and Tumor B) are provided with the ‘MethPed’ package as an example [2]. These data have no missing values. If a data set does contain missing values, the impute package can be used for missing value imputation, as described in the MethPed vignette in Bioconductor.
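
To make the idea of nearest-neighbor imputation concrete, here is a minimal Python sketch that fills a missing beta value in one probe with the mean of the same sample's values in the k most similar probes. It is a simplified illustration of the general technique, not the algorithm used by the impute package; the function name `knn_impute`, the distance measure and the default `k` are all assumptions.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries using the k nearest rows (probes).

    `rows` is a list of probes, each a list of beta values per sample,
    with None marking a missing value. The input is not modified.
    """
    def distance(a, b):
        # Mean squared difference over samples observed in both probes.
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        return sum((x - y) ** 2 for x, y in shared) / len(shared)

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        missing = [j for j, value in enumerate(row) if value is None]
        if not missing:
            continue
        # Rank all other probes by their distance to the incomplete probe.
        ranked = sorted((distance(row, other), idx)
                        for idx, other in enumerate(rows) if idx != i)
        for j in missing:
            # Take the k closest probes that have a value for sample j.
            donors = [rows[idx][j] for dist, idx in ranked
                      if dist != math.inf and rows[idx][j] is not None][:k]
            if donors:
                imputed[i][j] = sum(donors) / len(donors)
    return imputed
```

Entries with no usable neighbor are left as None, so a caller can detect probes that could not be imputed.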

Results and discussion

The MethPed analysis starts by loading the data and checking for missing values in the data file, and then runs the classification. The error rate of the prediction is estimated, and the probability that a sample belongs to each of nine tumor diagnoses/subgroups is given. The current version of MethPed includes the following groups: glioblastoma (GBM), pilocytic astrocytoma, medulloblastoma (Wnt, Shh, group 3 and group 4), diffuse intrinsic pontine glioma (DIPG), ependymoma and embryonal tumor with multilayered rosettes (ETMR). These cover the most common diagnoses and subgroups, but not all. The robustness of a classifier is highly dependent on the accuracy of the training data, and we therefore chose not to include groups with limited data available. By default, the MethPed classifier reports the probability that a tumor sample belongs to each tumor group (Fig. 2a). Alternatively, supplying the extra parameter ‘prob = FALSE’ to the classifier reports only the group with the maximum probability and assigns zero to the other tumor groups (Fig. 2b). It should be noted that tumors belonging to diagnoses not included in the MethPed classification will either be classified as inconclusive (with low probabilities of belonging to any of the groups) or be assigned to the most similar tumor form present in the classifier. For more information, see Danielsson et al. [2].
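
The difference between the two output modes can be pictured with a short Python sketch. It only mimics the behavior described above for ‘prob = FALSE’ and is not MethPed code; the function name `to_max_class` and the probabilities in the usage example are made up for illustration.

```python
def to_max_class(probabilities):
    """Keep the highest class probability and set all others to zero.

    `probabilities` maps a tumor group name to its predicted probability.
    On a tie, the first group in the mapping wins.
    """
    best = max(probabilities, key=probabilities.get)
    return {group: (p if group == best else 0.0)
            for group, p in probabilities.items()}


# Hypothetical prediction for one sample (not real MethPed output).
example = {"GBM": 0.10, "Ependymoma": 0.70, "DIPG": 0.20}
reduced = to_max_class(example)
```

Here `reduced` keeps 0.70 for the ependymoma group and holds 0.0 for the other two groups.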
Fig. 2

Bar plots of the diagnosis prediction for the two test samples. (a) Classification probability of a sample belonging to each of the pediatric brain tumor diagnoses currently included in MethPed. (b) Maximum classification probability of a specific tumor group for each sample

The conditional probability matrix of the classification can be viewed by applying the ‘summary’ command in R to the MethPed output. To visualize the prediction, bar plots can be generated with the ‘plot’ command (Fig. 2). If probes in the training data included with the package are missing from the sample, they can be listed with the ‘probeMis’ command. MethPed is currently the only publicly available tool for classification of pediatric brain tumors. The use of methylation profiling for classification of these tumors adds new and important knowledge in the clinical setting for choosing the optimal care and treatment of these patients, and will therefore likely complement histopathological diagnoses in the near future [1, 2].
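
The missing-probe check described above amounts to a set difference between the training probe pool and the probes present in a sample. The Python sketch below shows that idea only; it is not the ‘probeMis’ implementation, and the function name `missing_probes` and the probe identifiers are hypothetical.

```python
def missing_probes(training_probes, sample_probes):
    """Return the training probes absent from the sample, in training order."""
    present = set(sample_probes)
    return [probe for probe in training_probes if probe not in present]


# Hypothetical probe identifiers for illustration.
training = ["cg01", "cg02", "cg03"]
sample = ["cg03", "cg01", "cg99"]
absent = missing_probes(training, sample)
```

In this example `absent` contains only "cg02"; probes that appear in the sample but not in the training pool (such as "cg99") are ignored.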

Conclusions

The MethPed R package can be used to efficiently classify pediatric brain tumors using DNA methylation profiles generated by the Illumina 450K methylation arrays.

Abbreviations

DIPG, diffuse intrinsic pontine glioma; ETMR, embryonal tumor with multilayered rosettes; GBM, glioblastoma; TCGA, the Cancer Genome Atlas Project.

References

1.  New Brain Tumor Entities Emerge from Molecular Classification of CNS-PNETs.

Authors:  Dominik Sturm; Brent A Orr; Umut H Toprak; Volker Hovestadt; David T W Jones; David Capper; Martin Sill; Ivo Buchhalter; Paul A Northcott; Irina Leis; Marina Ryzhova; Christian Koelsche; Elke Pfaff; Sariah J Allen; Gnanaprakash Balasubramanian; Barbara C Worst; Kristian W Pajtler; Sebastian Brabetz; Pascal D Johann; Felix Sahm; Jüri Reimand; Alan Mackay; Diana M Carvalho; Marc Remke; Joanna J Phillips; Arie Perry; Cynthia Cowdrey; Rachid Drissi; Maryam Fouladi; Felice Giangaspero; Maria Łastowska; Wiesława Grajkowska; Wolfram Scheurlen; Torsten Pietsch; Christian Hagel; Johannes Gojo; Daniela Lötsch; Walter Berger; Irene Slavc; Christine Haberler; Anne Jouvet; Stefan Holm; Silvia Hofer; Marco Prinz; Catherine Keohane; Iris Fried; Christian Mawrin; David Scheie; Bret C Mobley; Matthew J Schniederjan; Mariarita Santi; Anna M Buccoliero; Sonika Dahiya; Christof M Kramm; André O von Bueren; Katja von Hoff; Stefan Rutkowski; Christel Herold-Mende; Michael C Frühwald; Till Milde; Martin Hasselblatt; Pieter Wesseling; Jochen Rößler; Ulrich Schüller; Martin Ebinger; Jens Schittenhelm; Stephan Frank; Rainer Grobholz; Istvan Vajtai; Volkmar Hans; Reinhard Schneppenheim; Karel Zitterbart; V Peter Collins; Eleonora Aronica; Pascale Varlet; Stephanie Puget; Christelle Dufour; Jacques Grill; Dominique Figarella-Branger; Marietta Wolter; Martin U Schuhmann; Tarek Shalaby; Michael Grotzer; Timothy van Meter; Camelia-Maria Monoranu; Jörg Felsberg; Guido Reifenberger; Matija Snuderl; Lynn Ann Forrester; Jan Koster; Rogier Versteeg; Richard Volckmann; Peter van Sluis; Stephan Wolf; Tom Mikkelsen; Amar Gajjar; Kenneth Aldape; Andrew S Moore; Michael D Taylor; Chris Jones; Nada Jabado; Matthias A Karajannis; Roland Eils; Matthias Schlesner; Peter Lichter; Andreas von Deimling; Stefan M Pfister; David W Ellison; Andrey Korshunov; Marcel Kool
Journal:  Cell       Date:  2016-02-25       Impact factor: 41.582

2.  MethPed: a DNA methylation classifier tool for the identification of pediatric brain tumor subtypes.

Authors:  Anna Danielsson; Szilárd Nemes; Magnus Tisell; Birgitta Lannering; Claes Nordborg; Magnus Sabel; Helena Carén
Journal:  Clin Epigenetics       Date:  2015-07-09       Impact factor: 6.551

