
DNA methylation-based classification of central nervous system tumours.

David Capper^1,2,3,4, David T W Jones^5,6, Martin Sill^5,6,7, Volker Hovestadt⁸, Daniel Schrimpf^1,2, Dominik Sturm^5,6,9, Christian Koelsche^1,2, Felix Sahm^1,2, Lukas Chavez^5,6, David E Reuss^1,2, Annekathrin Kratz^1,2, Annika K Wefers^1,2, Kristin Huang^1,2, Kristian W Pajtler^5,6,9, Leonille Schweizer^1,3, Damian Stichel^1,2, Adriana Olar^10,11,12, Nils W Engel^13,14, Kerstin Lindenberg², Patrick N Harter^15,16, Anne K Braczynski^15,16, Karl H Plate^15,16, Hildegard Dohmen¹⁷, Boyan K Garvalov¹⁷, Roland Coras¹⁸, Annett Hölsken¹⁸, Ekkehard Hewer¹⁹, Melanie Bewerunge-Hudler²⁰, Matthias Schick²⁰, Roger Fischer²⁰, Rudi Beschorner²¹, Jens Schittenhelm²¹, Ori Staszewski²², Khalida Wani²³, Pascale Varlet²⁴, Melanie Pages²⁴, Petra Temming²⁵, Dietmar Lohmann²⁶, Florian Selt^5,9,27, Hendrik Witt^5,6,9, Till Milde^5,9,27, Olaf Witt^5,9,27, Eleonora Aronica^28,29,30, Felice Giangaspero^31,32, Elisabeth Rushing³³, Wolfram Scheurlen³⁴, Christoph Geisenberger^35,36, Fausto J Rodriguez³⁷, Albert Becker³⁸, Matthias Preusser³⁹, Christine Haberler⁴⁰, Rolf Bjerkvig^41,42, Jane Cryan⁴³, Michael Farrell⁴³, Martina Deckert⁴⁴, Jürgen Hench⁴⁵, Stephan Frank⁴⁵, Jonathan Serrano⁴⁶, Kasthuri Kannan⁴⁶, Aristotelis Tsirigos⁴⁶, Wolfgang Brück⁴⁷, Silvia Hofer⁴⁸, Stefanie Brehmer⁴⁹, Marcel Seiz-Rosenhagen⁴⁹, Daniel Hänggi⁴⁹, Volkmar Hans^50,51, Stephanie Rozsnoki⁵², Jordan R Hansford^53,54,55, Patricia Kohlhof⁵⁶, Bjarne W Kristensen⁵⁷, Matt Lechner⁵⁸, Beatriz Lopes⁵⁹, Christian Mawrin⁶⁰, Ralf Ketter⁶¹, Andreas Kulozik^5,9, Ziad Khatib⁶², Frank Heppner^3,63,64, Arend Koch³, Anne Jouvet⁶⁵, Catherine Keohane⁶⁶, Helmut Mühleisen⁶⁷, Wolf Mueller⁶⁸, Ute Pohl⁶⁹, Marco Prinz^22,70, Axel Benner⁷, Marc Zapatka⁸, Nicholas G Gottardo^71,72,73, Pablo Hernáiz Driever⁷⁴, Christof M Kramm⁷⁵, Hermann L Müller⁷⁶, Stefan Rutkowski⁷⁷, Katja von Hoff^74,77, Michael C Frühwald⁷⁸, Astrid Gnekow⁷⁸, Gudrun Fleischhack²⁵, Stephan Tippelt²⁵, Gabriele Calaminus⁷⁹, Camelia-Maria Monoranu⁸⁰, Arie Perry⁸¹, Chris Jones⁸², 
Thomas S Jacques⁸³, Bernhard Radlwimmer⁸, Marco Gessi³⁸, Torsten Pietsch³⁸, Johannes Schramm⁸⁴, Gabriele Schackert⁸⁵, Manfred Westphal⁸⁶, Guido Reifenberger^87,88, Pieter Wesseling^89,90, Michael Weller⁹¹, Vincent Peter Collins⁹², Ingmar Blümcke¹⁸, Martin Bendszus⁹³, Jürgen Debus⁹⁴, Annie Huang⁹⁵, Nada Jabado⁹⁶, Paul A Northcott⁹⁷, Werner Paulus⁵², Amar Gajjar⁹⁸, Giles W Robinson⁹⁸, Michael D Taylor⁹⁹, Zane Jaunmuktane^100,101,102, Marina Ryzhova¹⁰³, Michael Platten¹⁰⁴, Andreas Unterberg³⁵, Wolfgang Wick¹⁰⁵, Matthias A Karajannis¹⁰⁶, Michel Mittelbronn^{15,16,107,108,109,110}, Till Acker¹⁷, Christian Hartmann¹¹¹, Kenneth Aldape¹¹², Ulrich Schüller^{14,113,114,115}, Rolf Buslei^18,116, Peter Lichter⁸, Marcel Kool^5,6, Christel Herold-Mende³⁵, David W Ellison¹¹⁷, Martin Hasselblatt⁵², Matija Snuderl¹¹⁸, Sebastian Brandner^100,102, Andrey Korshunov^1,2, Andreas von Deimling^1,2, Stefan M Pfister^5,6,9.

Abstract

Accurate pathological diagnosis is crucial for optimal management of patients with cancer. For the approximately 100 known tumour types of the central nervous system, standardization of the diagnostic process has been shown to be particularly challenging, with substantial inter-observer variability in the histopathological diagnosis of many tumour types. Here we present a comprehensive approach for the DNA methylation-based classification of central nervous system tumours across all entities and age groups, and demonstrate its application in a routine diagnostic setting. We show that the availability of this method may have a substantial impact on diagnostic precision compared to standard methods, resulting in a change of diagnosis in up to 12% of prospective cases. For broader accessibility, we have designed a free online classifier tool, the use of which does not require any additional onsite data processing. Our results provide a blueprint for the generation of machine-learning-based tumour classifiers across other cancer entities, with the potential to fundamentally transform tumour pathology.


Year: 2018; PMID: 29539639; PMCID: PMC6093218; DOI: 10.1038/nature26000

Source DB: PubMed; Journal: Nature; ISSN: 0028-0836; Impact factor: 49.962

The developmental complexity of the brain is reflected in the vast array of distinct brain tumour entities defined in the current WHO classification of central nervous system (CNS) tumours [1]. These tumours are clinically and biologically highly diverse, encompassing a wide spectrum from benign neoplasms that can frequently be cured by surgery alone (e.g. pilocytic astrocytoma), to highly malignant tumours responding poorly to any therapy (e.g. glioblastoma). Previous studies reported substantial inter-observer variability in the histopathological diagnosis of many CNS tumours, e.g., in diffuse gliomas [2], ependymomas [3] and supratentorial PNETs [4]. To address this, some molecular grouping has been introduced into the update of the WHO classification, but only for selected entities such as medulloblastoma. Furthermore, several single-gene tests based on DNA methylation analysis (e.g., MGMT promoter methylation status), FISH (e.g., 1p/19q, EGFR, MYC, MYCN, PDGFRA, 19q13.42, etc.) or immunohistochemistry (CTNNB1, LIN28A, etc.) that are required to cover the most important differential diagnoses have been shown to be difficult to standardize. Such diagnostic discordance and uncertainty may confound decision-making in clinical practice as well as the interpretation and validity of clinical trial results. The cancer methylome is a combination of both somatically acquired DNA methylation changes and characteristics reflecting the cell of origin [5,6]. The latter property allows, for example, tracing of the primary site of highly dedifferentiated metastases of cancers of unknown origin [7]. It has been convincingly shown that DNA methylation profiling is highly robust and reproducible even from small samples and poor quality material [8], and such profiles have been widely used to subclassify CNS tumours that were previously considered homogeneous diseases [4,9-16]. 
Based on this preliminary work within single entities, we herein present a comprehensive approach for DNA methylation-based classification of all CNS tumour entities across age groups.

CNS tumour reference cohort

To establish a comprehensive CNS tumour reference cohort, we generated genome-wide DNA methylation profiles (minimum of eight cases per group) representing almost all WHO defined neuroectodermal and sellar region tumours [1]. We further profiled mesenchymal tumours, melanoma, diffuse large B-cell lymphoma, plasmacytoma and six types of pituitary adenomas, in total comprising 76 histopathological entities and seven entity variants occurring in the CNS. All histopathological entities and variants were analysed by unsupervised clustering both within each entity and across histologically similar tumour entities, aiming to identify (i) distinct DNA methylation classes within one histopathological entity and (ii) DNA methylation classes comprising tumours displaying a varied histological phenotype. This iterative process led to the designation of 82 CNS tumour classes characterised by distinct DNA methylation profiles (Figure 1a). Twenty-nine of these were equivalent to a single WHO entity (category 1), 29 represented subclasses within a WHO entity (category 2), in eight the WHO grading was not fully recapitulated (category 3) and in 11 the boundaries of methylation classes were not identical to the entity boundaries of WHO (category 4) (Figure 1a). The remaining five represented DNA methylation classes not defined by the WHO classification (category 5), three of which were recently described [4] as well as the not yet well-defined class of anaplastic pilocytic astrocytoma and one new subclass of infantile hemispheric glioma. There was evidence for several additional classes of rare tumours, with too few cases to be included at present. In consideration of the impact of the tumour microenvironment on the methylation profile, we included 47 tumour samples with a pronounced inflammatory or reactive tumour microenvironment, respectively, both demonstrating distinct methylation profiles. 
We additionally selected 72 samples representing seven non-neoplastic CNS regions, resulting in a combined reference cohort of 2,801 samples from 91 classes (Figure 1a) that was visualized using t-SNE dimensionality reduction [17] (Figure 1b). This analysis further supported the separation of samples into the defined DNA methylation classes (see also Extended Data Figure 1a, b). Unprocessed .idat files can be downloaded from NCBI's Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo) under accession number GSE90496. Supplementary Table 1 gives an overview of methylation class characteristics and Supplementary Table 2 shows case-by-case information of the reference samples.
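The class counts reported above can be checked with a short tally. This is only an arithmetic sketch: the category counts are taken from the text, while the split of the nine control classes into seven non-neoplastic regions plus two microenvironment classes is our reading of the text and of the Figure 1a legend, not a breakdown stated explicitly.

```python
# Tumour methylation classes per category, as reported in the text.
tumour_classes_by_category = {
    1: 29,  # equivalent to a single WHO entity
    2: 29,  # subclass within a WHO entity
    3: 8,   # WHO grading not fully recapitulated
    4: 11,  # class boundaries differ from WHO entity boundaries
    5: 5,   # not defined by the WHO classification
}

# Assumption: the nine control classes are the seven non-neoplastic CNS
# regions plus the inflammatory and reactive microenvironment classes.
control_classes = {"non_neoplastic_regions": 7, "microenvironment": 2}

n_tumour = sum(tumour_classes_by_category.values())
n_control = sum(control_classes.values())
n_total = n_tumour + n_control

print(n_tumour, n_control, n_total)
```

Under that reading, the totals agree with the 82 tumour classes and 91 classes overall given in the text.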

Figure 1 |

Establishment of the DNA methylation-based CNS tumour reference cohort.

a, Overview of the 82 CNS tumour methylation classes and nine control tissue methylation classes of the reference cohort. The methylation classes are grouped by histology and colour-coded. Category 1 methylation classes are equivalent to a WHO entity, category 2 methylation classes are a subgroup of a WHO entity, category 3 methylation classes are not equivalent to a unique WHO entity with combining of WHO grades, category 4 methylation classes are not equivalent to a unique WHO entity with combining of WHO entities, and category 5 methylation classes are not recognized as a WHO entity. Full names and further details of the abbreviated 91 classes are given in Supplementary Table 1. Embryonal tumours: shades of blue; Glioblastomas: shades of green; Other gliomas: shades of violet; Ependymomas: shades of red; Glio-neuronal tumours: shades of orange; IDH-mutated gliomas: shades of yellow; Choroid plexus tumours: shades of brown; Pineal region tumours: shades of mint green; Melanocytic tumours: shades of dark blue; Sellar region tumours: shades of cyan; Mesenchymal tumours: shades of pink; Nerve tumours: shades of beige; Haematopoietic tumours: shades of dark purple; Control tissues: shades of grey. b, Unsupervised clustering of reference cohort samples (n=2,801) using t-SNE dimensionality reduction. Individual samples are colour-coded in the respective class colour (n=91) and labelled with the class abbreviation. The colour code and abbreviations are identical to Figure 1a.

Extended Data Figure 1 |

Unsupervised clustering of the DNA methylation-based reference cohort.

a, Heatmap showing the pairwise Pearson correlation (lower left) of the 32,000 most variably methylated CpG probes of all 2,801 biologically independent samples of the reference cohort. A detailed view on closely related ependymal classes (upper right) and the three subclasses identified in ATRT tumours (lower right) indicates higher correlation within classes. The colour code and abbreviations are identical to main Figure 1a. b, Barplot showing eigenvalue frequencies of a principal component analysis (PCA) using the same 32,000 most variably methylated CpG probes of all 2,801 biologically independent samples as in (a). The number of non-trivial components was determined by comparing eigenvalues to the maximum eigenvalue of a PCA using randomized beta-values (shuffling of sample labels per probe). c, X and Y coordinates of the first five of a total of 500 iterations of t-SNE dimensionality reduction generated by random downsampling to 90% of the 2,801 biologically independent samples to assess clustering stability. Axis positions of individual cases are connected by a line coloured according to the colour code of Figure 1a. The depiction illustrates the close proximity of cases of the same class across iterations, indicative of a high stability independent of the exact composition of the reference cohort. d, Pairwise correlation of X and Y coordinates between 2,801 biologically independent samples over all iterations of the downsampling analysis demonstrates a very high correlation within classes (average correlation 0.982), indicating a high stability of the t-SNE analysis.
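The two operations behind panel a, selecting the most variably methylated probes and computing pairwise Pearson correlations between samples, can be sketched in a few lines. This is a minimal illustration on invented toy beta-values, assuming that probe variability is ranked by standard deviation; the function names are hypothetical and the sketch does not reproduce the original analysis pipeline.

```python
from math import sqrt
from statistics import mean, pstdev


def most_variable_probes(beta, k):
    """Return (sorted) indices of the k probes with the largest standard deviation.

    beta: list of samples, each a list of beta-values (one per probe).
    """
    n_probes = len(beta[0])
    spread = [pstdev([sample[j] for sample in beta]) for j in range(n_probes)]
    ranked = sorted(range(n_probes), key=lambda j: spread[j], reverse=True)
    return sorted(ranked[:k])


def pearson(x, y):
    """Pearson correlation coefficient of two equally long vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(beta, probes):
    """Sample-by-sample Pearson correlation restricted to the chosen probes."""
    reduced = [[sample[j] for j in probes] for sample in beta]
    return [[pearson(a, b) for b in reduced] for a in reduced]


# Toy beta-values: 4 samples x 5 probes (invented numbers, two "classes").
beta = [
    [0.10, 0.90, 0.50, 0.20, 0.51],
    [0.12, 0.85, 0.50, 0.25, 0.49],
    [0.80, 0.15, 0.50, 0.70, 0.50],
    [0.85, 0.10, 0.50, 0.75, 0.52],
]
probes = most_variable_probes(beta, k=3)
corr = correlation_matrix(beta, probes)
```

In this toy example the first two samples correlate strongly with each other and negatively with the last two, which is the within-class pattern the heatmap is meant to display.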

The stability of separation of methylation classes by t-SNE was analysed by iterative random downsampling of the reference cohort and indicated a high stability of the groups (Extended Data Figure 1c, d). Testing for confounding batch effects within our reference cohort did not reveal unexpected confounding factors (Extended Data Figure 2, Extended Data Figure 3a-c). For reference astrocytomas, oligodendrogliomas and glioblastomas we performed additional classification according to the TCGA pan-glioma DNA methylation model [18], which indicated a strong association of the TCGA classes LGm1–6 with specific classes defined in our reference cohort (Extended Data Figure 3d, Supplementary Table 2).

Extended Data Figure 2 |

Unsupervised clustering is not biased by a range of possible confounding factors.

a, t-SNE representations of the 2,801 biologically independent samples constituting the reference cohort as shown in Figure 1b overlaid with potentially confounding factors (b-f). b, Distribution of patient sex among the classes illustrates equal or near equal distribution of many classes, but also an expected enrichment for one sex in some classes (e.g. female in meningioma or CNS high-grade neuroepithelial tumour with MN1 alteration). c, Patient age illustrates the expected age distribution of many tumour classes. d-f, The slightly uneven distribution of type of material (e.g. pilocytic astrocytoma or meningioma) (d), array preparation date (e), and tissue source (f) are related to the specifics of assembling the reference cohort and do not indicate an apparent confounding effect on the unsupervised clustering.

Extended Data Figure 3 |

Estimation of tumour purity and relation to TCGA pan-glioma methylation classes.

a, A Random Forest model was trained to predict ABSOLUTE tumour purity estimates[50] using the TCGA pan-glioma dataset (795 biologically independent samples)[18]. The plot shows ABSOLUTE purity estimates and out-of-bag Random Forest tumour purity predictions (i.e. using only RF trees for which the respective sample was not involved in the training). The estimated mean squared error is 0.015, indicating that this model is able to yield reasonable predictions of tumour purity from methylation data. b, Bar plot showing the distribution of Random Forest predicted purity in the reference dataset (2,801 biologically independent samples). Purity estimates have been transformed into five categories indicated by different shades of blue. The exact case-by-case values are given in Supplementary Table 2. The median estimated purity in the reference cohort is 66% (range 42% to 87%) and 78% of samples have an estimated purity of at least 60%. c, t-SNE representation of the reference cohort (2,801 biologically independent samples) overlaid with Random Forest predicted purity categories. Methylation classes are generally composed of mixed tumour purity categories. Tumour purity shows some association with the WHO grade (WHO I median tumour purity 60%, range 39–77%; WHO II median 66%, range 43–80%; WHO III median 68%, range 54–84%; WHO IV median 69%, range 49–87%). A further association of tumour purity with the composition of classes in the unsupervised t-SNE analysis was not evident. d, t-SNE representation of the reference cohort (2,801 biologically independent samples) overlaid with predicted TCGA pan-glioma DNA methylation classes according to Ceccarelli et al. 2016. Pan-glioma methylation classes were predicted by training a Random Forest (RF) on the Ceccarelli et al. 2016 dataset including methylation data of 418 low grade glioma and 377 glioblastoma samples acquired using the Illumina 450k and 27k platforms. The RF was trained using the 1,300 CpG signature as described by the authors[18] and using the default settings of the RF algorithm implemented in the R package randomForest. Pan-glioma class prediction was only performed for subsets of mostly adult astrocytomas, oligodendrogliomas and glioblastomas (magnified areas) included in the Ceccarelli et al. 2016 data set. LGm1, LGm2 and LGm3 show a high overlap with the methylation classes A IDH HG, A IDH and O IDH, respectively. LGm4 shows highest overlap with methylation class GBM RTK II. LGm5 shows highest overlap with methylation classes GBM MES and GBM RTK I. LGm6 shows highest overlap with DMG K27, GBM MID and GBM MYCN.

Classifier development

Application in routine diagnostics requires fast and reproducible classification of samples as well as a measure of confidence for the specific call. To this end, we employed the Random Forest (RF) algorithm, a so-called ensemble method that combines the predictions of several ‘weak’ classifiers to achieve improved prediction accuracy [19]. Using this algorithm, we generated 10,000 binary decision trees, incorporating genome-wide information from all 2,801 reference samples and 91 methylation classes (Extended Data Figure 4). Each of these trees assigns a given diagnostic sample to one of the 91 classes, resulting in an aggregate raw score (Figure 2a). To obtain class probability estimates that can be used to guide diagnostic decision-making, we fitted a multinomial logistic regression calibration model that transforms the raw score into a probability that measures the confidence in the class assignment (‘calibrated score’). The calibration allows a comparison of classifier results between classes despite a different raw score distribution (Extended Data Figure 5a, b). Cross-validation of the RF classifier resulted in an estimated error rate of 4.89% for raw and 4.28% for calibrated scores and an area under receiver operating characteristic curve (AUC) of 0.99, indicating a high discriminating power (Figure 2b, Extended Data Figure 5c). The vast majority of cross-validation misclassifications occurred within eight groups of histologically and biologically closely related tumour classes, distinction of which is currently without clinical impact (with the possible exception of choroid plexus tumours [13]; Figure 2b). We therefore defined eight ‘methylation class families’ (MCF), for which calibrated scores are summed to a single score. This reduced the cross-validated error rate for the clinically relevant groupings to 1.14% (Figure 2b, Extended Data Figure 5c). Taking the maximum score for class assignment and using a multiclass approach [20], overall sensitivity and specificity were 0.989 and 0.999, respectively (Extended Data Figure 5c).
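The relationship between tree votes, raw scores and calibrated scores can be made concrete with a toy example. The sketch below assumes that the raw score of a class is the fraction of trees voting for it and that calibration applies a multinomial logistic (softmax) transform to the raw score vector. The weights and intercepts are invented for illustration; in practice they would be fitted on cross-validated scores. The function names are hypothetical and only ten toy votes stand in for the 10,000 trees.

```python
from collections import Counter
from math import exp


def raw_scores(tree_votes, classes):
    """Fraction of trees voting for each class."""
    counts = Counter(tree_votes)
    n_trees = len(tree_votes)
    return {c: counts[c] / n_trees for c in classes}


def calibrate(raw, weights, intercepts):
    """Multinomial logistic transform of a raw score vector.

    weights[k][c] is the coefficient linking raw score c to class k.
    """
    classes = list(raw)
    logits = {
        k: intercepts[k] + sum(weights[k][c] * raw[c] for c in classes)
        for k in classes
    }
    top = max(logits.values())  # subtract the maximum for numerical stability
    expo = {k: exp(v - top) for k, v in logits.items()}
    total = sum(expo.values())
    return {k: v / total for k, v in expo.items()}


classes = ["GBM RTK I", "GBM RTK II", "GBM MES"]
# Ten toy tree votes instead of 10,000.
votes = ["GBM RTK I"] * 6 + ["GBM RTK II"] * 3 + ["GBM MES"] * 1
raw = raw_scores(votes, classes)

# Hypothetical calibration coefficients (not fitted values).
weights = {k: {c: (10.0 if c == k else 0.0) for c in classes} for k in classes}
intercepts = {k: 0.0 for k in classes}
calibrated = calibrate(raw, weights, intercepts)

print(raw)
print(calibrated)
```

With these invented coefficients the leading class keeps its rank while its score is pushed closer to 1, which illustrates how calibration can place classes with different raw score distributions on a common probability scale.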

Extended Data Figure 4 |

Development of the Random Forest classifier.

a, The RF training consists of four steps. First, basic filtering removes probes that are not included on the EPIC array, probes located on the X and Y chromosomes, probes affected by SNPs, and probes not mapping uniquely to the genome. In a second step, the probe-wise batch effects between samples from FFPE and frozen material are estimated and adjusted by a linear model approach. In a third step, feature selection is performed by training an RF using all probes and selecting the 10,000 probes with highest variable importance measure. In a last step, the final RF is trained using only the 10,000 selected probes. The validation of the RF classifier involves a three-fold nested cross-validation (CV). In the outer loop of the CV the complete RF training procedure described before is applied to the training data and the resulting RF is used to predict the test data to generate RF scores. In the inner loop of the CV a three-fold CV is applied to training data of the outer loop in order to generate RF scores independent of the test data in the outer loop. These scores are then used to fit a calibration model, i.e. an L2-penalized, multinomial, logistic regression that takes the RF scores of the test data in the outer CV loop to estimate tumour class probabilities (P1, P2, P3). To fit a calibration model to estimate class probabilities of diagnostic samples using all data in the reference set, the RF scores generated in the outer CV loop are used. b, Schematic depiction of three exemplary binary decision trees of the Random Forest classifier (left), and magnification on five exemplary decision nodes relevant for glioblastoma classification (right). For prediction, a diagnostic sample enters the root node of each of the 10,000 trees. At every decision node, the decision path is determined by the methylation level of a single CpG, until reaching a terminal node that provides the class prediction. The joint class prediction of all trees represents the raw prediction score. The colour code and abbreviations are identical to Figure 1a.
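The index bookkeeping of the three-fold nested cross-validation described in panel a can be sketched as follows. This is a simplified illustration, assuming plain random (unstratified) folds; the function names are hypothetical, and the RF training and calibration steps themselves are omitted. The point is only that the inner folds are drawn exclusively from the outer training data, so scores used to fit the calibration model never involve the outer test samples.

```python
import random


def k_fold_indices(indices, k, seed):
    """Shuffle indices and deal them into k folds of near-equal size."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, k=3, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for nested k-fold CV.

    inner_splits is a list of (inner_train, inner_test) pairs drawn only
    from outer_train.
    """
    outer_folds = k_fold_indices(range(n_samples), k, seed)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [
            s for j, fold in enumerate(outer_folds) if j != i for s in fold
        ]
        inner_folds = k_fold_indices(outer_train, k, seed + i + 1)
        inner_splits = []
        for m, inner_test in enumerate(inner_folds):
            inner_train = [
                s for j, fold in enumerate(inner_folds) if j != m for s in fold
            ]
            inner_splits.append((inner_train, inner_test))
        yield outer_train, outer_test, inner_splits


# Twelve toy samples instead of 2,801.
splits = list(nested_cv_splits(n_samples=12))
for outer_train, outer_test, inner_splits in splits:
    print(sorted(outer_test), [sorted(t) for _, t in inner_splits])
```

Each outer iteration would train the full pipeline on `outer_train`, score `outer_test`, and use the inner test-fold scores to fit the calibration model for that iteration.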

Figure 2 |

Development and cross-validation of the DNA methylation-based CNS tumour classifier.

a, Schematic of principal classifier components (grey) and processing steps for individual test samples (white). The most informative probes are selected for training of the Random Forest classifier. The classifier produces raw scores representing the number of decision trees assigning a test sample to a specific methylation class. To enable inter-class comparability a calibration model is used, which transforms raw into calibrated scores. Calibrated scores represent an estimated probability measure of methylation class assignment. b, Heatmap showing results of a three-fold cross-validation of the Random Forest classifier incorporating information of n=2,801 biologically independent samples allotted to 91 methylation classes. Deviations from the bisecting line represent misclassification errors (using the maximum calibrated score for class prediction). Methylation class families (MCF) are indicated by black squares. The colour code and abbreviations are identical to Figure 1a.

Extended Data Figure 5 |

Comparison of raw and calibrated classifier scores and threshold definition.

a, Density plots illustrating the distribution of raw and calibrated classifier scores for samples correctly classified during cross-validation (n=2,701 independent biological samples for raw and n=2,769 independent biological samples for calibrated), depicted for each methylation class or methylation class family (MCF). Score calibration results in a harmonization of score distribution and allows the establishment of a shared classification threshold. Three thresholds for maximizing specificity (0.958), maximizing the Youden index (0.836), and the cutoff used in this study (0.9) are indicated by red lines (see also panels d and e). b, Multivariate score calibration exemplified in a ternary plot showing scores of the three ATRT subclasses (MYC, SHH, and TYR; together n=112 independent biological samples). Arrows indicate transformation of the scores for individual samples by the calibration model, which increases the discrimination between the three subclasses. c, The accuracy of prediction of the Random Forest classifier constructed of n=2,801 biologically independent samples (measured by misclassification error, area under receiver operating characteristic curve (AUC), Brier score, multiclass sensitivity and specificity) is improved by score calibration and by combining classes into methylation class families (MCF). d, To determine a common threshold for the calibrated MCF scores, we performed a Receiver Operating Characteristic (ROC) analysis of the maximum calibrated MCF scores of all n=2,801 biologically independent samples calculated via cross-validation. For this ROC analysis we defined a new binary class, i.e. samples correctly classified during the CV using the maximum calibrated MCF score for classification were considered as ‘classifiable’ (n=2,769) and samples that were falsely classified by using this score were considered ‘non classifiable’ (n=32). Three thresholds for different sensitivity and specificity are highlighted in the ROC curve: a threshold of 0.958 achieving a maximum specificity of 1 with a sensitivity of 0.827, a threshold of 0.836 obtaining a maximum Youden index with specificity 0.938 and sensitivity 0.934, and our recommended compromise threshold of 0.9 that results in a specificity of 0.938 and a sensitivity of 0.9. Bootstrapped 95% confidence intervals for estimated sensitivity and specificity are indicated in grey. e, Sensitivity and specificity for all possible thresholds applied to cross-validated maximum MCF classifier scores of all n=2,801 biologically independent samples. Three thresholds for maximizing specificity (0.958), maximizing the Youden index (0.836) and 0.9 are highlighted by red lines.
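The threshold analysis in panels d and e can be illustrated on toy data. The sketch below assumes the binary setup described above: each sample carries its maximum calibrated score and a flag saying whether the cross-validated call was correct ('classifiable'). Sensitivity and specificity are then computed for every candidate threshold, and the Youden index J = sensitivity + specificity - 1 is maximized. The scores are invented and the function names are hypothetical.

```python
def sensitivity_specificity(scores, classifiable, threshold):
    """Treat 'score >= threshold' as the call 'classifiable'."""
    pairs = list(zip(scores, classifiable))
    tp = sum(1 for s, c in pairs if c and s >= threshold)
    fn = sum(1 for s, c in pairs if c and s < threshold)
    tn = sum(1 for s, c in pairs if not c and s < threshold)
    fp = sum(1 for s, c in pairs if not c and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(scores, classifiable):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1."""
    best = None
    for t in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, classifiable, t)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (t, j)
    return best


# Toy maximum calibrated scores; True = correctly classified in CV.
scores = [0.99, 0.97, 0.95, 0.92, 0.88, 0.80, 0.85, 0.60, 0.40]
classifiable = [True, True, True, True, True, True, False, False, False]

best_t, best_j = youden_threshold(scores, classifiable)
sens_09, spec_09 = sensitivity_specificity(scores, classifiable, 0.9)
print(best_t, round(best_j, 3), round(sens_09, 3), spec_09)
```

As in the figure, a stricter cutoff than the Youden optimum trades sensitivity for specificity; in the toy data, moving from the Youden threshold to 0.9 loses one classifiable sample without changing specificity.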

For application to diagnostic tumour samples, a threshold value for the prediction of a matching class is required. Using Receiver Operating Characteristic (ROC) curve analysis of the maximum calibrated scores we devised an optimal “common” calibrated score threshold of ≥0.9 (Extended Data Figure 5d, e). For subclasses within methylation class families, we defined a threshold value of ≥0.5 as sufficient for a valid prediction, as long as all family member scores add up to a total score of ≥0.9. Single class specificity and sensitivity for the ≥0.9 threshold are provided in Supplementary Table 3.
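The two-level rule described above can be written as a small decision function. This is a minimal sketch under stated assumptions: calibrated scores are summed within each methylation class family, the top-scoring family or stand-alone class must reach 0.9, and a subclass is reported only if its own score reaches 0.5. The function name, the family name FAMILY_X and its members are hypothetical placeholders.

```python
def classify(calibrated, families, family_cutoff=0.9, member_cutoff=0.5):
    """Apply the two-level threshold rule to calibrated class scores.

    calibrated: dict mapping class -> calibrated score.
    families:   dict mapping family name -> list of member classes.
    Returns (family_or_class, subclass_or_None), or (None, None) for no match.
    """
    member_of = {c: fam for fam, members in families.items() for c in members}

    # Sum calibrated scores within each family; other classes stand alone.
    grouped = {}
    for cls, score in calibrated.items():
        key = member_of.get(cls, cls)
        grouped[key] = grouped.get(key, 0.0) + score

    best = max(grouped, key=grouped.get)
    if grouped[best] < family_cutoff:
        return None, None          # 'no match' case
    if best not in families:
        return best, None          # stand-alone class above the cutoff

    top_member = max(families[best], key=lambda c: calibrated.get(c, 0.0))
    if calibrated.get(top_member, 0.0) >= member_cutoff:
        return best, top_member    # family match with valid subclass
    return best, None              # family match, subclass undetermined


families = {"FAMILY_X": ["X1", "X2"]}  # hypothetical family and members

case_1 = classify({"X1": 0.62, "X2": 0.31, "Y": 0.07}, families)
case_2 = classify({"X1": 0.48, "X2": 0.46, "Y": 0.06}, families)
case_3 = classify({"X1": 0.50, "X2": 0.30, "Y": 0.20}, families)
case_4 = classify({"X1": 0.02, "X2": 0.03, "Y": 0.95}, families)
print(case_1, case_2, case_3, case_4)
```

The second toy case shows why the family-level sum matters: neither subclass alone is convincing, yet the sample still matches the family with high confidence.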

Clinical implementation

For evaluation of clinical utility, we prospectively analysed a series of 1,155 diagnostic CNS tumours in parallel with standard histopathological workup (Figure 3a, b). For 51 cases (4%) the material was not suitable for methylation profiling, mostly because of too low tumour cell content or limited total material. Methylation profiling was performed for the remaining 1,104 samples and the cases were assigned as either ‘matching to a defined DNA methylation class’ (calibrated score ≥0.9) or as ‘no match’ cases (highest score <0.9) (for a case-by-case list see Supplementary Table 4). The investigated cases comprised 64 different histopathological entities from both adult (71%) and paediatric patients (29%). The spectrum of entities was enriched for rare and difficult to diagnose cases received for referral, and therefore did not exactly match the distribution seen in daily routine diagnostic practice. Histopathological evaluation was performed blinded to DNA methylation profiling results and included standard molecular testing.

Figure 3 |

Implementation of the classifier in diagnostic practice.

a, Classifier validation by an independent prospective cohort of diagnostic samples. Pathological diagnosis was established by current pathological standard according to the 2016 version of the WHO classification of CNS tumours and compared to classification by methylation profiling. Cases were categorized as “confirmation of diagnosis”, “establishing new diagnosis”, “misleading profile”, or “no match to defined class”. b, Overview of methylation profiling results from 1,155 diagnostic samples and integration with pathological diagnosis.

In total, 88% of profiled samples (n=977/1,104) matched to an established DNA methylation class with a calibrated classifier score ≥0.9 (Figure 3b). For 838 of these (838/1,104; 76%), results obtained by pathology and DNA methylation profiling were concordant. In 171 of the cases, an unambiguous molecular subgroup could be assigned, which would not have been available based on histopathology evaluation only (e.g., molecular subgroups of medulloblastoma and ependymoma, many of which were included in the latest version of the WHO classification of CNS tumours [1]). For the remaining 139 samples with a calibrated classifier score ≥0.9, the DNA methylation class was discordant from the pathological diagnosis. These cases were histologically and molecularly re-evaluated, including additional molecular diagnostics (DNA copy-number profiling, targeted gene sequencing, gene panel sequencing[21], and gene fusion analysis of a subset of cases, see Supplementary Table 5). This resulted in a revision of the initial histopathological diagnosis in 129 of the 139 cases (12% of all cases, Figure 4) in favour of the predicted methylation class. In agreement with several recent reports [16,22,23], several of these were IDH-wildtype astrocytomas and anaplastic astrocytomas reclassified as IDH-wildtype glioblastomas. Establishing a new diagnosis had a profound clinical impact: a change in WHO grading was observed in 71% of these cases (92/129), with both upgrading (41%, 53/129) and downgrading (30%, 39/129; Figure 4). Discrepant results could not be resolved in only 10 cases (<1% of profiled cases), and the histopathological diagnosis was retained.
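The case counts and percentages of the prospective cohort follow from a few subtractions and ratios, which the sketch below recomputes. All input counts are taken from the text; the variable names and the rounding helper are ours.

```python
received = 1155
unsuitable = 51
profiled = received - unsuitable       # samples with a methylation profile

matched = 977                          # calibrated score >= 0.9
concordant = 838                       # pathology and methylation class agree
discordant = matched - concordant      # methylation class differs from pathology
revised = 129                          # diagnosis changed after re-evaluation
unresolved = discordant - revised      # histopathological diagnosis retained
no_match = profiled - matched          # highest calibrated score < 0.9

upgraded, downgraded = 53, 39
grade_changed = upgraded + downgraded


def pct(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


summary = {
    "matched_pct": pct(matched, profiled),
    "concordant_pct": pct(concordant, profiled),
    "revised_pct": pct(revised, profiled),
    "no_match_pct": pct(no_match, profiled),
    "grade_changed_pct": pct(grade_changed, revised),
    "upgraded_pct": pct(upgraded, revised),
    "downgraded_pct": pct(downgraded, revised),
}
print(profiled, discordant, unresolved, no_match, summary)
```

The recomputed values agree with the reported figures: 1,104 profiled samples, 139 discordant cases, 10 unresolved cases and 127 samples without a match.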

Figure 4 |

Reassessment of discrepant cases and establishment of new diagnosis.

Discrepancy between pathological diagnosis (left) and methylation profiling (middle) was observed for 139 cases. For 129 cases histological and molecular reassessment (Supplementary Table 5) resulted in change of the initial diagnosis with formulation of a new integrated diagnosis (right). For 92 cases this involved change of WHO grading, with both down- (blue) and upgrading (red). Integrated diagnoses in brackets are not recognized as a WHO entity. For methylation class abbreviations see Supplementary Table 1.

To substantiate the impact in clinical practice we contacted five external centres that have started to implement methylation profiling for diagnostic cases using our algorithm. In total, these centres analysed 401 diagnostic cases and in 50 cases (12%) a new diagnosis was established after methylation profiling, very closely recapitulating our rate of reclassification (Extended Data Figure 6a, Supplementary Table 6). For individual centres the rate of reclassification varied between 6% and 25%, most likely due to differences in the spectrum of investigated cases and more upfront molecular testing by some centres (Extended Data Figure 6b, Supplementary Table 6).

Extended Data Figure 6 |

Diagnostic utility of the DNA-methylation based classifier, assessed at different centres.

a, Implementation of the DNA methylation classifier by five external centres. In total, 401 independent biological samples were analysed. 78% matched to an established class with a cut-off score of ≥0.9 (class colours as in Figure 1a). A new diagnosis was established in 12% of cases. b, Depiction of individual centre results, illustrating the different composition of samples included in the analysis, variation in the rate of non-matching cases, and of cases where a new diagnosis was established. Case-by-case details are given in Supplementary Table 6.

Twelve percent of tumours from the prospective cohort (127/1,104) could not be assigned to a DNA methylation class using the rigid calibrated classifier score cutoff of ≥0.9 (Figure 3b). To further clarify the role of these non-classifiable cases we performed an unsupervised t-SNE analysis of the reference cohort together with the diagnostic cohort (Figure 5a). This demonstrated a high overlap of the classifiable cases with the reference cohort, whereas non-classifiable cases frequently fell in the periphery of the reference classes or even completely separate from these and frequently grouped with other non-classifiable cases (Figure 5a). This may indicate that such cases represent rare novel molecular entities that have not been previously recognized. A likely novel CNS tumour entity is exemplified in Figure 5b, c.

Figure 5 |

DNA methylation-based identification of potential new CNS tumour entities.

a, Unsupervised clustering of the combined reference (n=2,801, grey) and diagnostic cohort (n=1,104, coloured) using t-SNE dimensionality reduction. Abbreviated names indicate the reference cohort classes as in Figure 1. The diagnostic samples are colour coded as “confirmation of diagnosis” (n=838, green), “establishing new diagnosis” (n=129, blue), “misleading profile” (n=10, red) and “no match to defined class” (n=127, dark grey). The matching (green) and reclassified (blue) cases show high overlap with the reference cases. The non-classifiable (black) and the misleading (red) cases frequently fall in the periphery of the reference classes or are completely separate from these. The magnification (right) highlights two non-classifiable cases (here in magenta for easier identification) that group together in the t-SNE representation. b, Both highlighted non-classifiable cases occurred in female children, and had primitive neuroectodermal histology (glioblastoma- or embryonal tumour-like). Histology was assessed by three independent pathologists with similar results. c, Both cases shared a high-level amplification of chromosome 6q24.2 (common amplified region chr6:144,149,293–144,649,987). The common region includes only 5 protein coding genes: LTV1 (LTV1 ribosome biogenesis factor), ZC2HC1B (zinc finger C2HC-type containing 1B), PLAGL1 (PLAG1 like zinc finger 1), SF3B5 (splicing factor 3b subunit 5) and STX11 (syntaxin 11). This amplification was not observed in any of the other tumours from the reference or diagnostic cohort. Copy number analysis was performed once using copy number information derived from the methylation array data.

Technical and inter-laboratory testing

Technical robustness of the RF classifier was investigated by inter-laboratory comparison. Results of two independent laboratories (starting from DNA extraction) were highly correlated, with only two of 53 samples (4%) showing a classifier score slightly lower than 0.9 in one of the centres whereas all other cases were classified identically (Extended Data Figure 7a). Calculation of copy number profiles was also stable across laboratories (Extended Data Figure 7b). To ascertain forward compatibility with developing technologies, we further used the RF classifier to interrogate newer EPIC DNA methylation arrays and high-coverage whole-genome bisulfite sequencing data. For all 16 samples from different CNS tumours profiled on both array platforms, raw scores (Extended Data Figure 7c) and calibrated scores (not shown) were highly correlated and running them through the classifying algorithm resulted in the same prediction for every case. Further, for all 50 high-coverage whole-genome bisulfite sequencing samples (11 different CNS tumour entities), the highest prediction score was for the same class as with the 450k array, suggesting that our approach is applicable to different DNA methylation profiling techniques with only slight adaptations (Extended Data Figure 7d).

Extended Data Figure 7 |

Inter-centre and inter-platform reproducibility of DNA methylation-based classification.

a, Calibrated scores of 53 independent biological samples representing diagnostic CNS tumour cases analysed at the University of Heidelberg and at the New York University pathology department. Both laboratories performed independent DNA extraction, array hybridization, and data analysis. Cases falling into green areas were classified identically in both centres (96%); cases in the red area were non-classifiable in one centre (4%). None of the 53 samples was assigned to a different methylation class by the two centres. b, Copy-number profiles calculated from the array data generated at both centres were highly comparable and allowed identification of chromosomal gains, losses, amplifications, and deletions. Calculations and interpretation were performed once at each centre. c, Plot of maximum raw classification scores of 16 different tumour samples generated using both 450k and EPIC arrays. All cases fall close to the bisecting line (red) indicating a high concordance of the scores. Further, the methylation class prediction was identical for all samples. d, The CNS tumour classifier also performs well with data generated by whole-genome bisulfite sequencing (WGBS). The plot shows classifier scores calculated from WGBS and 450k arrays of 50 cases comprising 11 different brain tumour entities (bisecting line in red). Methylation beta-values were calculated from high-coverage WGBS data (>10 fold average coverage) and run through the CNS tumour classifier and plotted against the same case analysed using 450k arrays. The highest class prediction score was identical in all cases.

Global dissemination of the platform

To ensure unrestricted community access to our classification system, we created a free web platform for data upload, automatic normalization, Random Forest classification, and PDF report generation (www.molecularneuropathology.org). DNA copy-number profiles[24] and O6-methylguanine-DNA-methyltransferase (MGMT) promoter methylation status[25] are additionally provided, since they can be generated from the same data source – thus having the potential of replacing several time- and cost-intensive single-gene tests. A representative website report is shown in Extended Data Figure 8. During upload, the data provider can choose to give consent that the data may be used for further classifier development. We expect that this web platform can thereby act as a hub for a worldwide cooperative network to continuously identify and track rare tumour classes so that they can eventually be added to the catalogue of known human cancers. Since the launch of the website 14 months ago in December 2016, over 4,500 cases have been uploaded from over 15 participating centres. New biological insights are also likely to be gained based on the interrelationships of tumour classes, and by closer examination of how differential DNA methylation affects tumour biology.

Extended Data Figure 8 |

Sample website PDF report of an IDH wildtype glioblastoma sample.

Discussion

We here demonstrate that DNA methylation-based CNS tumour classification using a comprehensive machine learning approach is a valuable asset for clinical decision making. In particular, the high level of standardization has great promise to reduce the substantial inter-observer variability observed in current CNS tumour diagnostics. Further, in contrast to traditional pathology, whereby there is a pressure to assign all tumours to a described entity even for atypical or challenging cases, the objective measure that we provide here allows for ‘no match’ to a defined class. This information can also be of substantial value in highlighting that a tumour is not a typical example of a given differential diagnosis, and may rather belong to a rarer, yet undefined class. We defined 5 categories of methylation classes that have different clinical implications. Category 1 can be directly translated to WHO entities. Category 2 represents subclasses of WHO entities. For all but ependymal tumours, subclassification currently has little clinical consequence and a translation back to the WHO class may be appropriate for clinical purposes. Category 3 reflects the fact that WHO grading cannot be fully recapitulated by methylation profiling for several classes. Further data is required to assess if the methylation classes of this category may provide a more robust means of prognostication than histology alone, as has been demonstrated for several other classes [4,9,11]. In category 4, the WHO entity boundaries are not identical to the boundaries of the methylation classes. Until additional data on the exact boundaries become available, this category should be critically discussed in the clinical context and orthogonal testing should be undertaken whenever possible. Category 5 represents putative new entities that are currently not recognized by the WHO, and while limited data on these cases is currently available, the biological rationale for a novel class was considered strong.
A study in which reference pathology and molecular diagnostics including DNA methylation profiling are blinded to each other's results is currently ongoing for all childhood brain tumours diagnosed in Germany to objectivise the potential effect of re-classification on patient outcome (http://pediatric-neurooncology.dkfz.de/index.php/en/diagnostics/molecular-neuropathology), with results due over the next few years. A uniform implementation of the classification algorithm holds great promise for standardization of tumour diagnostics across centres and across clinical trials. Further, the digital nature of methylation data facilitates easy exchange and will allow aggregation of extensive tumour libraries. This will likely result in the detection of exceptionally rare tumour classes and a continued refinement of classifiers. Inclusion of new classes will allow a prompt translation into diagnostic practice, almost certainly resulting in a more dynamic tumour classification. In our experience, adaptation of this technique in diagnostic laboratories is relatively straightforward. Extended Data Figure 9 summarizes a sample workflow for diagnostic implementation. We expect that the principle of using DNA methylation signatures as part of a combined histo-molecular tumour classification will not only improve diagnostic accuracy in neuropathology, but will also serve as a blueprint in other fields of tumour pathology.

Extended Data Figure 9 |

Exemplary workflow and timeline of diagnostic methylation profiling.

Methods (online only)

Patient material

Patient material and clinical data of the retrospective reference cohort (total n=2,801) were obtained from the National Center for Tumour Diseases (NCT) in Heidelberg and supplemented with samples from additional centres (Supplementary Table 2) according to protocols approved by the institutional review boards with written consent obtained from each patient. Tumours were histopathologically re-assessed according to the current WHO classification[1]. Areas with highest tumour cell content (≥70%) were selected for DNA extraction. Subsets of the reference cohort have been previously published[4,9-16,26-33]. Additional patient characteristics are given in Supplementary Table 2. The prospectively assessed clinical cohort was analysed as part of the National Center for Tumour Diseases Precision Oncology Program according to procedures approved by the institutional review board at the Medical Faculty Heidelberg. All patients gave written consent for diagnostic procedures, comprising onward molecular testing including methylation profiling. Additional patient characteristics are given in Supplementary Table 4. Details of the online-analysed cohort of the five additional centres are given in Supplementary Table 6. Usage of the data was according to protocols approved by the institutional review boards of the University of Basel, Frankfurt am Main University Hospital, University Medical Center Utrecht and Princess Máxima Center for Pediatric Oncology Utrecht, Giessen University Hospital and University College London Hospitals. All patients gave written consent for diagnostic procedures, comprising onward molecular testing including methylation profiling. For all the above human research participants all relevant ethical regulations were followed.

Data generation, processing and Random Forest classifier generation

Samples were analysed using Illumina Infinium HumanMethylation450 BeadChip (450k) arrays according to the manufacturer’s instructions. To investigate stability across platforms a selection of samples were additionally assessed using the successor Methylation BeadChip (EPIC) array or whole-genome bisulfite sequencing (WGBS, generated and analysed as described[6]). Array data analysis was performed using R version 3.2.0 [34], using a number of packages from Bioconductor[35] and other repositories. A Random Forest[19] classifier compatible with both 450k and EPIC platforms was trained, and a calibration model that calculates class probabilities from Random Forest scores was devised. A detailed description of all methods is provided below.

Methylation array processing

The 450k array was used to obtain genome-wide DNA methylation profiles for tumour samples and normal control tissues, according to the manufacturer’s instructions (Illumina, San Diego, USA). DNA methylation data was generated at the Genomics and Proteomics Core Facility of the DKFZ (Heidelberg, Germany) and the NYU Langone Medical Center (New York, USA). Data was generated from both fresh-frozen and formalin-fixed paraffin-embedded (FFPE) tissue samples. For most fresh-frozen samples, >500 ng of DNA was used as input material. 250 ng of DNA was used for most FFPE tissues. On-chip quality metrics of all samples were carefully controlled. Copy-number variation (CNV) analysis from 450k methylation array data was performed using the conumee Bioconductor package version 1.3.0. Two sets of 50 control samples displaying a balanced copy-number profile from both male and female donors were used for normalization. Raw signal intensities were obtained from IDAT-files using the minfi Bioconductor package version 1.14.0 [36]. Each sample was individually normalized by performing a background correction (shifting of the 5th percentile of negative control probe intensities to 0) and a dye-bias correction (scaling of the mean of normalization control probe intensities to 10,000) for both colour channels. Subsequently, a correction for the type of tissue material (FFPE/frozen) was performed by fitting univariate, linear models to the log2-transformed intensity values (removeBatchEffect function, limma package version 3.24.15). The methylated and unmethylated signals were corrected individually. Estimated batch effects were also used to adjust diagnostic samples or test samples within the cross-validation. Beta-values were calculated from the retransformed intensities using an offset of 100 (as recommended by Illumina).
To analyse for possible confounding batch effects within our pre-processed reference cohort dataset (after adjusting for FFPE versus frozen material) we applied the sva algorithm [37,38]. We found no significant surrogate variable (data not shown). The following filtering criteria were applied: Removal of probes targeting the X and Y chromosomes (n=11,551), removal of probes containing a single-nucleotide polymorphism (dbSNP132 Common) within five base pairs of and including the targeted CpG site (n=7,998), probes not mapping uniquely to the human reference genome (hg19) allowing for one mismatch (n=3,965), and probes not included on the Illumina EPIC array (n=32,260). In total, 428,799 probes targeting CpG sites were kept for further analysis.
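The beta-value calculation mentioned above can be illustrated with a minimal sketch. It assumes the usual definition of the beta-value as the methylated intensity divided by the sum of methylated and unmethylated intensities plus the offset; the function name and the example intensities are hypothetical and not taken from the study.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Fraction of methylated signal, damped by a constant offset.

    The offset keeps the ratio stable when both intensities are low.
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)


# Hypothetical (methylated, unmethylated) intensities for three probes.
probes = [(9000.0, 900.0), (400.0, 9500.0), (0.0, 0.0)]
betas = [beta_value(m, u) for m, u in probes]
```

With the offset in place, a probe without any signal yields a beta-value of 0 rather than an undefined ratio.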

Unsupervised analysis

Pairwise Pearson correlation was calculated for all 2,801 reference samples by selecting the 32,000 most variably methylated probes (s.d. > 0.228, Extended Data Figure 1a). The same probes were used for principal component analysis (PCA). For PCA, pairwise probe covariances of centred beta-values were calculated. Eigenvalue decomposition was performed using the eigs function of the RSpectra package version 0.12. The number of non-trivial components was determined by comparing eigenvalues to the maximum eigenvalue of a PCA using randomized beta-values (shuffling of sample labels per probe) (Extended Data Figure 1b). Principal component scores for all non-trivial components (n=94) were used for t-SNE analysis (t-Distributed Stochastic Neighbour Embedding[17], Rtsne package version 0.11, Figure 1b). The following non-default parameters were used: theta=0, pca=F, max_iter=2500. A similar approach was used for the combined analysis of reference and diagnostic cases (Figure 5a).
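The first step of this analysis, restricting the data to the most variable probes and then correlating samples, can be sketched as follows. This is a toy illustration in plain Python rather than the R code used in the study; the function names and the four-probe matrix are assumptions made for the example.

```python
import math
import statistics


def most_variable_probes(beta, n_keep):
    """Indices of the n_keep probes (rows) with the largest standard deviation."""
    sds = [statistics.stdev(row) for row in beta]
    order = sorted(range(len(beta)), key=lambda i: sds[i], reverse=True)
    return sorted(order[:n_keep])


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy beta-value matrix: rows are probes, columns are four samples.
beta = [
    [0.10, 0.90, 0.15, 0.85],  # variable probe
    [0.50, 0.51, 0.50, 0.49],  # nearly constant probe, filtered out
    [0.80, 0.20, 0.75, 0.10],  # variable probe
    [0.05, 0.60, 0.10, 0.70],  # variable probe
]
keep = most_variable_probes(beta, n_keep=3)

# Per-sample profiles restricted to the retained probes.
profiles = [[beta[p][s] for p in keep] for s in range(4)]
r_similar = pearson(profiles[0], profiles[2])
r_different = pearson(profiles[0], profiles[1])
```

Samples 0 and 2 have similar profiles and correlate strongly, whereas samples 0 and 1 are anticorrelated; in the study the same idea is applied to 32,000 probes and 2,801 samples.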

The Random Forest algorithm

The Random Forest (RF) [19] algorithm is a so-called ensemble method that combines the predictions of several ‘weak’ classifiers to achieve improved prediction accuracy. The RF algorithm uses binary decision trees (Classification and Regression Trees, CART[39]) as ‘weak’ classifiers (Extended Data Fig. 4). Each of these trees is a sequence of binary splitting rules that are learned by recursive binary splitting. The CART algorithm starts with all samples assigned to a ‘root’ node and tries to find the variable, e.g., a measured CpG probe, and a corresponding cutoff that results in the purest split into the different classes. To measure this gain in class ‘purity’ the Gini index is used. To fit a tree, the CART algorithm iteratively repeats these steps until no further improvements can be made. To predict the class of a new diagnostic case the binary splitting rules are compared with the new data starting in the root node down to one of the leaf nodes. The tree then predicts or votes for the class of that leaf node. Decision trees have the advantage that they are non-parametric and do not rely on any distributional assumptions. The main disadvantages of decision trees are that they often tend to overfit the data and that they have a weak prediction performance. To improve the prediction accuracy the RF algorithm combines thousands of trees by bootstrap aggregation (bagging). In brief, each tree is fitted using training datasets that are generated by drawing bootstrap samples. In addition, at each node only a random subset of the available variables is used to find an optimal splitting rule. This additional source of randomization allows selecting variables with lower predictive value. This feature guarantees that the resulting trees are decorrelated, i.e., they use different variables to find an optimal prediction rule. Taking the majority vote over thousands of bootstrap aggregated and decorrelated trees greatly improves the prediction accuracy of the RF.
The majority vote, i.e., the proportion of trees voting for a class, can be interpreted as empirical class probabilities.
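Two of the building blocks described above, the Gini-based purity gain of a single split and the conversion of tree votes into class scores, can be sketched in a few lines. The class labels "A" and "B", the beta-values and the function names are hypothetical; the sketch is not the randomForest implementation.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gain(values, labels, cutoff):
    """Decrease in Gini impurity when splitting samples at values <= cutoff."""
    left = [lab for v, lab in zip(values, labels) if v <= cutoff]
    right = [lab for v, lab in zip(values, labels) if v > cutoff]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


def vote_proportions(tree_votes):
    """Proportion of trees voting for each class (the raw RF score)."""
    n = len(tree_votes)
    return {cls: count / n for cls, count in Counter(tree_votes).items()}


# Beta-values of one hypothetical CpG probe in six samples of two classes.
beta = [0.10, 0.20, 0.15, 0.80, 0.90, 0.85]
labels = ["A", "A", "A", "B", "B", "B"]
gain = split_gain(beta, labels, cutoff=0.5)  # this cutoff separates the classes

# Ten hypothetical trees: seven vote for class A, three for class B.
scores = vote_proportions(["A"] * 7 + ["B"] * 3)
```

A cutoff that separates the two classes perfectly removes all impurity of the parent node, which is the largest gain possible for a balanced two-class node.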

Classifier development

To train the RF classifier, the randomForest R package [40] was used. First, the most important features (probes) were selected by applying the Random Forest algorithm to the beta-values of all filtered 428,799 probes. For efficient computation, the probes were split into 43 sets of approximately 10,000 probes. For each set, 100 trees were fitted using 654 randomly sampled candidate features at each split (mtry parameter, square root of 428,799, as would be used by default when not splitting into sets). To take the imbalanced methylation class sizes into account a downsampling strategy was followed that ensures an identical number of samples per class (parameter sampsize=rep(8, 91), eight reflecting the minimum number of cases in the 91 classes) [41]. For all other parameters the default settings were used. This procedure was repeated 100 times, essentially fitting 10,000 trees per probe. Finally, features were selected by the permutation-based variable importance measure as implemented in the randomForest R package[40]. The importance measure is the class-specific mean decrease in classification accuracy when the feature is permuted. We selected features by ranking them using the minimal rank of the variable importance measures across all classes. The final RF classifier was trained by fitting 10,000 trees with the parameter mtry=100 using beta-values of the 10,000 probes selected during feature selection. Imbalanced class sizes were accounted for by downsampling (as described above), and for all other parameters the default settings were used. An overview of the processes is given in Extended Data Fig. 4.
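The arithmetic behind the parameters above, the per-class downsampling and the minimal-rank feature selection can be sketched as follows. All function names, class labels and probe identifiers are hypothetical. Drawing without replacement is a simplifying assumption of this sketch, and the importance values are invented for illustration.

```python
import math
import random
from collections import defaultdict

N_PROBES = 428_799
mtry = math.isqrt(N_PROBES)             # integer part of the square root
n_sets = math.ceil(N_PROBES / 10_000)   # sets of roughly 10,000 probes


def downsample_per_class(labels, per_class, rng):
    """Draw the same number of sample indices from every class.

    Assumption of this sketch: sampling without replacement.
    """
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    picked = []
    for lab in sorted(by_class):
        picked.extend(rng.sample(by_class[lab], per_class))
    return picked


def min_rank_selection(importance, n_select):
    """Rank features within each class, then order by the best rank in any class."""
    best_rank = {}
    for imp in importance.values():
        ranked = sorted(imp, key=lambda f: imp[f], reverse=True)
        for rank, feat in enumerate(ranked, start=1):
            best_rank[feat] = min(rank, best_rank.get(feat, rank))
    ordered = sorted(best_rank, key=lambda f: (best_rank[f], f))
    return ordered[:n_select]


labels = ["A"] * 20 + ["B"] * 8         # imbalanced toy cohort
picked = downsample_per_class(labels, per_class=8, rng=random.Random(1))

# Invented class-specific importance values for four probes.
importance = {
    "class_1": {"cg_a": 0.9, "cg_b": 0.1, "cg_c": 0.5, "cg_d": 0.0},
    "class_2": {"cg_a": 0.0, "cg_b": 0.8, "cg_c": 0.4, "cg_d": 0.1},
}
selected = min_rank_selection(importance, n_select=3)
```

Ranking by the minimal rank keeps a probe such as cg_a, which is highly informative for one class but uninformative for the other; averaging the importance over classes would favour probes that are moderately informative everywhere and could miss markers of small classes.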

Classifier cross-validation

Overfitting of the training data is a typical problem expected when training classifiers on high-dimensional data. As it often cannot be avoided, the typical strategy to deal with this problem is to evaluate the model accuracy on an independent test dataset or apply cross-validation methods[42]. Because some of the newly defined methylation groups presented in this work cannot be diagnosed by classical histopathological methods or other established molecular assays, an independent test set to assess model accuracy is not available. Therefore, the accuracy of the presented RF model with the accompanying calibration model was evaluated by a three-fold, nested cross-validation (CV). For this, the reference dataset is split into three equally sized parts. In each CV iteration, two-thirds of the data were used to train a RF classifier in the same way as the RF classifier for the complete dataset was trained. Then, the remaining one-third of the data is predicted using this RF classifier. After the third iteration of the CV is completed, each of the 2,801 reference samples has been predicted by an independent RF classifier, i.e. where the sample was not used for estimating batch effects, performing variable selection, or training of the classifier.
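The partitioning logic of the three-fold CV can be sketched as below. The sketch only produces index sets; in the procedure described above, batch-effect estimation, variable selection and classifier training would all be carried out on the training indices alone. The random, unstratified assignment and the function name are assumptions of this example.

```python
import random


def three_fold_indices(n_samples, n_folds=3, seed=0):
    """Shuffle sample indices and deal them into n_folds nearly equal parts."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[k::n_folds] for k in range(n_folds)]


folds = three_fold_indices(2801)

splits = []
for k, test in enumerate(folds):
    # Two-thirds of the data train the model; the held-out third is predicted.
    train = [i for j, fold in enumerate(folds) if j != k for i in fold]
    splits.append((train, test))
```

Because the folds are disjoint and together cover the whole cohort, every sample is predicted exactly once by a model that never saw it.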

Classifier score calibration

The classification scores generated by our multiclass RF (i.e. the proportion of trees voting for a class) perform well when they are used to assign the correct class labels, but they do not reflect class probabilities. Furthermore, the distribution of the RF scores varies between classes, which makes an inter-class comparison difficult. Moreover, to evaluate a diagnostic classification, the uncertainties associated with an individual prediction in terms of confidence scores or estimated class probabilities are needed. To obtain scores that are comparable between classes and that are improved estimates of the certainty of individual predictions we performed a classification score recalibration by mapping the original scores to more accurate class probabilities[43,44]. To find such a mapping, an L2-penalized, multinomial, logistic regression model was fitted, which takes the methylation class as response variable and the RF scores as explanatory variables. The R package glmnet[45] was used to fit this model. The small ridge penalty (L2) on the likelihood serves to prevent overfitting, as well as to stabilize estimation in situations where classes are perfectly separable. The amount of this regularization, i.e. the penalization parameter, is determined by running a ten-fold cross-validation and choosing the largest value that lies within one standard error of the minimum cross-validation error. Independent RF scores are needed to fit this model, i.e. the scores need to be generated by a RF classifier that was not trained using the same samples, otherwise the RF scores will be systematically biased and not comparable to scores of unseen cases. As such, RF scores generated by the three-fold CV are used.
To validate the class predictions generated by using the recalibrated scores of the calibration model, a nested three-fold CV loop is incorporated into the main three-fold CV that validates the RF classifier (Extended Data Fig. 4). Within each CV run this nested three-fold CV is applied to generate independent RF scores, which are then used to train a calibration model. The predicted RF scores resulting from predicting the one-third test data of the outer CV loop are then recalibrated by applying the calibration model that was fitted on the RF scores generated during the nested CV. A similar CV scheme was used by Appel et al.[46] to validate estimated classification probabilities.
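The rule used above to choose the penalization parameter of the calibration model (the largest value within one standard error of the minimum cross-validation error) can be sketched as follows. The candidate values, errors and standard errors are invented for illustration, and the function name is hypothetical.

```python
def lambda_one_se(lambdas, cv_mean, cv_se):
    """Largest penalty whose CV error is within one SE of the minimum CV error."""
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    limit = cv_mean[i_min] + cv_se[i_min]
    eligible = [lam for lam, err in zip(lambdas, cv_mean) if err <= limit]
    return max(eligible)


# Invented cross-validation results for four candidate penalties.
lambdas = [1.0, 0.1, 0.01, 0.001]
cv_mean = [0.30, 0.12, 0.10, 0.11]   # mean CV error per candidate
cv_se = [0.03, 0.02, 0.03, 0.02]     # standard error of that mean

chosen = lambda_one_se(lambdas, cv_mean, cv_se)
```

In this example the minimum error occurs at a penalty of 0.01, but the stronger penalty of 0.1 is statistically indistinguishable from it and is therefore preferred as the more conservative model.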

Classifier performance measures

Performances of the resulting classifier predictions and scores generated by the CV were assessed by the misclassification error, multiclass area under the curve (AUC) and the multiclass Brier score. The misclassification error measures the frequency of falsely assigned class labels when using the maximum of the RF scores or re-calibrated scores as a cutoff to determine the predicted class, i.e. the majority vote. To measure the AUC for our multiclass RF the generalization of the AUC for multiclass classification problems by Hand and Till[47] was used. To measure how well the resulting RF scores and recalibrated scores perform when used as class probabilities, the multiclass Brier Score[42,48,49] was used. The Brier score is the mean-squared difference between the actual and the predicted class probability and thus measures the same characteristic as the mean squared error (MSE) measures for a continuous forecast.
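Two of these measures can be written down compactly. The sketch below assumes the common form of the multiclass Brier score, in which squared differences are summed over classes and averaged over samples; other normalizations exist. Class names and probabilities are hypothetical.

```python
def multiclass_brier(probabilities, true_labels, classes):
    """Mean over samples of the summed squared error across classes."""
    total = 0.0
    for probs, truth in zip(probabilities, true_labels):
        total += sum(
            (probs.get(c, 0.0) - (1.0 if c == truth else 0.0)) ** 2
            for c in classes
        )
    return total / len(true_labels)


def misclassification_error(probabilities, true_labels):
    """Fraction of samples whose highest-scoring class is not the true class."""
    wrong = sum(
        1 for probs, truth in zip(probabilities, true_labels)
        if max(probs, key=probs.get) != truth
    )
    return wrong / len(true_labels)


classes = ["A", "B", "C"]
probabilities = [
    {"A": 1.0, "B": 0.0, "C": 0.0},   # confident and correct
    {"A": 0.6, "B": 0.4, "C": 0.0},   # wrong, with moderate confidence
]
true_labels = ["A", "B"]

brier = multiclass_brier(probabilities, true_labels, classes)
error = misclassification_error(probabilities, true_labels)
```

The misclassification error only counts the second sample as wrong, whereas the Brier score additionally reflects how far the predicted probabilities were from the truth.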

Methylation class families

We observed that the majority of misclassification errors occurred within eight groups of histologically and biologically closely related tumour classes. We therefore defined eight ‘methylation class families’ (MCF). Since calibrated scores represent class probabilities, it is possible to apply the addition rule of probabilities to sum up calibrated class scores within one MCF to get a class probability for the MCF.
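Because a tumour can belong to only one methylation class, the class events are mutually exclusive and their calibrated probabilities can simply be added. A minimal sketch, with hypothetical class and family names and invented scores:

```python
def family_scores(class_scores, families):
    """Sum calibrated class scores within each family.

    Classes that are not assigned to a family keep their own score.
    """
    out = {}
    for cls, score in class_scores.items():
        key = families.get(cls, cls)
        out[key] = out.get(key, 0.0) + score
    return out


# Invented calibrated scores for one sample (they sum to 1).
class_scores = {"class_A1": 0.55, "class_A2": 0.38, "class_B": 0.07}
families = {"class_A1": "family_A", "class_A2": "family_A"}

mcf = family_scores(class_scores, families)
```

In this example neither member class reaches a score of 0.9 on its own, but the family score does, so the sample would still receive a confident family-level assignment.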

Threshold analysis

Finding an optimal cutoff for diagnostic tests usually involves finding an optimal trade-off between sensitivity and specificity. If there are no preferences regarding specificity or sensitivity, the optimal cutoff is chosen by the upper left corner of the ROC curve or by maximizing the Youden index (specificity+sensitivity-1). In an application like the one described here, where the cost of a false negative is that a tumour cannot be classified and the cost of a false positive is a falsely predicted methylation class, a threshold with high specificity is preferred. ROC analysis is typically defined for binary classification problems. Finding a threshold for multiclass classifiers either involves performing a ROC analysis for each class resulting in class-wise individual thresholds or finding some common threshold for all classes. The calibrated MC/MCF scores (here referring to MCF and MC classes that are not assigned to a MCF) are already validated probability estimates for the methylation class with a direct interpretation, i.e. we expect among all samples with scores of approx. 0.9 that 10% are falsely predicted. Applying an additional threshold is not required from a statistical point of view, but desired in clinical practice. In addition, due to calibration, scores are comparable across classes and it is thus reasonable to define a common threshold for all classes instead of finding an optimal cutoff for each individual methylation class. To determine a common threshold for the calibrated MC/MCF scores, we performed a ROC analysis of the maximum calibrated MC/MCF scores calculated via cross-validation. For this ROC analysis we defined a new binary class, i.e. samples correctly classified during the CV using the maximum calibrated MC/MCF score for classification were considered as ‘classifiable’ and samples falsely classified by using this score were considered ‘non-classifiable’.
Following this ROC analysis approach, we determined a cutoff of 0.836 that maximises the Youden index with a specificity of 93.8% and sensitivity of 93.4% (Extended Data Figure 5d and e). A maximum specificity of 100% with a sensitivity of 82.7% can be achieved with a threshold of 0.958. Bootstrapped 95% confidence intervals (grey area in Extended Data Figure 5d) demonstrate the uncertainty of sensitivity and specificity estimates, especially in the left upper corner of the ROC figure, where the considered thresholds are located. Both thresholds have been determined by cross-validation on our training data of high quality, but real life diagnostic samples were found to achieve slightly lower scores, due to a number of factors we cannot control, such as lower overall sample quality and lower tumour purity compared to samples in our reference cohort. Therefore, we decided to lower the maximum specificity threshold to allow a wider spectrum of samples to become a match. For this, we chose a threshold of ≥0.9 that lies midway between the cutoff maximising the Youden index (0.836) and the threshold for maximum specificity (0.958).
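The threshold search described above can be sketched as follows: each sample carries its maximum calibrated score and a flag saying whether the resulting prediction was correct in the CV, and every observed score is tried as a cutoff. The scores and flags below are invented, and the function name is hypothetical; only the two published thresholds in the last line are taken from the text.

```python
def youden_cutoff(scores, correct):
    """Cutoff maximising sensitivity + specificity - 1.

    'Positive' means correctly classified in the CV; a sample is called
    positive when its score is at or above the cutoff.
    """
    n_pos = sum(1 for c in correct if c)
    n_neg = len(correct) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, c in zip(scores, correct) if s >= t and c)
        tn = sum(1 for s, c in zip(scores, correct) if s < t and not c)
        j = tp / n_pos + tn / n_neg - 1.0
        if best is None or j > best[1]:
            best = (t, j)
    return best


# Invented maximum calibrated scores and CV correctness flags.
scores = [0.99, 0.97, 0.95, 0.92, 0.85, 0.80, 0.70, 0.60]
correct = [True, True, True, True, True, False, True, False]

cutoff, youden = youden_cutoff(scores, correct)

# Midpoint of the two thresholds reported in the text.
midpoint = (0.836 + 0.958) / 2
```

The midpoint of the two reported thresholds is 0.897, which is consistent with the rounded diagnostic cutoff of 0.9.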

Comparison to TCGA pan-glioma methylation classes

To compare our methylation-based classification of CNS tumours with described methylation classes of brain tumours by the Cancer Genome Atlas (TCGA) project, we downloaded the pre-processed methylation dataset described in Ceccarelli et al. 2016[18] including methylation data of 418 low grade glioma and 377 glioblastoma samples analysed by using the Illumina 450k array or 27k array platforms. To classify our samples according to the TCGA pan-glioma DNA methylation classification, we trained a Random Forest classifier on this dataset using the 1,300 CpG probe signature provided by the authors and using the default settings of the Random Forest algorithm implemented in the R package randomForest. The results of this classification for astrocytomas, oligodendrogliomas and glioblastomas are shown in Extended Data Figure 3d and are given on a case-by-case basis in Supplementary Tables 2 and 4.

Estimating tumour purity from DNA methylation data

Due to the subjective nature of histological assessment of tumour purity, we additionally used the Ceccarelli et al. 2016 dataset[18] to train a Random Forest regression (continuous response variable) model to predict tumour purity[50]. This Random Forest was trained on the 1,000 most important CpG probes for purity estimation, which were also selected by a Random Forest (similar to the variable selection described for the Random Forest classifier). The out-of-bag (i.e. RF trees in which the respective sample, for which purity is predicted, was not used for training) mean squared error of the final model is 0.015, indicating that this model is able to yield reasonable predictions of tumour purity from methylation data (Extended Data Figure 3a-c). The estimated tumour purity for individual cases is given in Supplementary Tables 2 and 4.

Unsupervised clustering of the DNA methylation-based reference cohort.

Unsupervised clustering is not biased by a range of possible confounding factors.

Estimation of tumour purity and relation to TCGA pan-glioma methylation classes.

Development of the Random Forest classifier.

Comparison of raw and calibrated classifier scores and threshold definition.

Diagnostic utility of the DNA-methylation based classifier, assessed at different centres.

Inter-centre and inter-platform reproducibility of DNA methylation-based classification.

37 in total

1. Melanotic tumors of the nervous system are characterized by distinct mutational, chromosomal and epigenomic profiles.

Authors: Christian Koelsche; Volker Hovestadt; David T W Jones; David Capper; Dominik Sturm; Felix Sahm; Daniel Schrimpf; Sebastian Adeberg; Katja Böhmer; Christian Hagenlocher; Gunhild Mechtersheimer; Patricia Kohlhof; Helmut Mühleisen; Rudi Beschorner; Christian Hartmann; Anne Kristin Braczynski; Michel Mittelbronn; Rolf Buslei; Albert Becker; Alexander Grote; Horst Urbach; Ori Staszewski; Marco Prinz; Ekkehard Hewer; Stefan M Pfister; Andreas von Deimling; David E Reuss
Journal: Brain Pathol Date: 2014-12-15 Impact factor: 6.508

2. Overfitting, generalization, and MSE in class probability estimation with high-dimensional data.

Authors: Kyung In Kim; Richard Simon
Journal: Biom J Date: 2013-12-16 Impact factor: 2.207

3. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays.

Authors: Martin J Aryee; Andrew E Jaffe; Hector Corrada-Bravo; Christine Ladd-Acosta; Andrew P Feinberg; Kasper D Hansen; Rafael A Irizarry
Journal: Bioinformatics Date: 2014-01-28 Impact factor: 6.937

4. Histopathological grading of pediatric ependymoma: reproducibility and clinical relevance in European trial cohorts.

Authors: David W Ellison; Mehmet Kocak; Dominique Figarella-Branger; Giangaspero Felice; Godfraind Catherine; Torsten Pietsch; Didier Frappaz; Maura Massimino; Jacques Grill; James M Boyett; Richard G Grundy
Journal: J Negat Results Biomed Date: 2011-05-31

5. Molecular Classification of Ependymal Tumors across All CNS Compartments, Histopathological Grades, and Age Groups.

Authors: Kristian W Pajtler; Hendrik Witt; Martin Sill; David T W Jones; Volker Hovestadt; Fabian Kratochwil; Khalida Wani; Ruth Tatevossian; Chandanamali Punchihewa; Pascal Johann; Jüri Reimand; Hans-Jörg Warnatz; Marina Ryzhova; Steve Mack; Vijay Ramaswamy; David Capper; Leonille Schweizer; Laura Sieber; Andrea Wittmann; Zhiqin Huang; Peter van Sluis; Richard Volckmann; Jan Koster; Rogier Versteeg; Daniel Fults; Helen Toledano; Smadar Avigad; Lindsey M Hoffman; Andrew M Donson; Nicholas Foreman; Ekkehard Hewer; Karel Zitterbart; Mark Gilbert; Terri S Armstrong; Nalin Gupta; Jeffrey C Allen; Matthias A Karajannis; David Zagzag; Martin Hasselblatt; Andreas E Kulozik; Olaf Witt; V Peter Collins; Katja von Hoff; Stefan Rutkowski; Torsten Pietsch; Gary Bader; Marie-Laure Yaspo; Andreas von Deimling; Peter Lichter; Michael D Taylor; Richard Gilbertson; David W Ellison; Kenneth Aldape; Andrey Korshunov; Marcel Kool; Stefan M Pfister
Journal: Cancer Cell Date: 2015-05-11 Impact factor: 31.743

6. Regularization Paths for Generalized Linear Models via Coordinate Descent.

Authors: Jerome Friedman; Trevor Hastie; Rob Tibshirani
Journal: J Stat Softw Date: 2010 Impact factor: 6.440

Review 7. Interobserver variation of the histopathological diagnosis in clinical trials on glioma: a clinician's perspective.

Authors: Martin J van den Bent
Journal: Acta Neuropathol Date: 2010-07-20 Impact factor: 17.088

8. Hotspot mutations in H3F3A and IDH1 define distinct epigenetic and biological subgroups of glioblastoma.

Authors: Dominik Sturm; Hendrik Witt; Volker Hovestadt; Dong-Anh Khuong-Quang; David T W Jones; Carolin Konermann; Elke Pfaff; Martje Tönjes; Martin Sill; Sebastian Bender; Marcel Kool; Marc Zapatka; Natalia Becker; Manuela Zucknick; Thomas Hielscher; Xiao-Yang Liu; Adam M Fontebasso; Marina Ryzhova; Steffen Albrecht; Karine Jacob; Marietta Wolter; Martin Ebinger; Martin U Schuhmann; Timothy van Meter; Michael C Frühwald; Holger Hauch; Arnulf Pekrun; Bernhard Radlwimmer; Tim Niehues; Gregor von Komorowski; Matthias Dürken; Andreas E Kulozik; Jenny Madden; Andrew Donson; Nicholas K Foreman; Rachid Drissi; Maryam Fouladi; Wolfram Scheurlen; Andreas von Deimling; Camelia Monoranu; Wolfgang Roggendorf; Christel Herold-Mende; Andreas Unterberg; Christof M Kramm; Jörg Felsberg; Christian Hartmann; Benedikt Wiestler; Wolfgang Wick; Till Milde; Olaf Witt; Anders M Lindroth; Jeremy Schwartzentruber; Damien Faury; Adam Fleming; Magdalena Zakrzewska; Pawel P Liberski; Krzysztof Zakrzewski; Peter Hauser; Miklos Garami; Almos Klekner; Laszlo Bognar; Sorana Morrissy; Florence Cavalli; Michael D Taylor; Peter van Sluis; Jan Koster; Rogier Versteeg; Richard Volckmann; Tom Mikkelsen; Kenneth Aldape; Guido Reifenberger; V Peter Collins; Jacek Majewski; Andrey Korshunov; Peter Lichter; Christoph Plass; Nada Jabado; Stefan M Pfister
Journal: Cancer Cell Date: 2012-10-16 Impact factor: 31.743

9. Recurrent somatic alterations of FGFR1 and NTRK2 in pilocytic astrocytoma.

Authors: David T W Jones; Barbara Hutter; Natalie Jäger; Andrey Korshunov; Marcel Kool; Hans-Jörg Warnatz; Thomas Zichner; Sally R Lambert; Marina Ryzhova; Dong Anh Khuong Quang; Adam M Fontebasso; Adrian M Stütz; Sonja Hutter; Marc Zuckermann; Dominik Sturm; Jan Gronych; Bärbel Lasitschka; Sabine Schmidt; Huriye Seker-Cin; Hendrik Witt; Marc Sultan; Meryem Ralser; Paul A Northcott; Volker Hovestadt; Sebastian Bender; Elke Pfaff; Sebastian Stark; Damien Faury; Jeremy Schwartzentruber; Jacek Majewski; Ursula D Weber; Marc Zapatka; Benjamin Raeder; Matthias Schlesner; Catherine L Worth; Cynthia C Bartholomae; Christof von Kalle; Charles D Imbusch; Sylwester Radomski; Chris Lawerenz; Peter van Sluis; Jan Koster; Richard Volckmann; Rogier Versteeg; Hans Lehrach; Camelia Monoranu; Beate Winkler; Andreas Unterberg; Christel Herold-Mende; Till Milde; Andreas E Kulozik; Martin Ebinger; Martin U Schuhmann; Yoon-Jae Cho; Scott L Pomeroy; Andreas von Deimling; Olaf Witt; Michael D Taylor; Stephan Wolf; Matthias A Karajannis; Charles G Eberhart; Wolfram Scheurlen; Martin Hasselblatt; Keith L Ligon; Mark W Kieran; Jan O Korbel; Marie-Laure Yaspo; Benedikt Brors; Jörg Felsberg; Guido Reifenberger; V Peter Collins; Nada Jabado; Roland Eils; Peter Lichter; Stefan M Pfister
Journal: Nat Genet Date: 2013-06-30 Impact factor: 38.330

10. Prognostic significance of clinical, histopathological, and molecular characteristics of medulloblastomas in the prospective HIT2000 multicenter clinical trial cohort.

Authors: Torsten Pietsch; Rene Schmidt; Marc Remke; Andrey Korshunov; Volker Hovestadt; David T W Jones; Jörg Felsberg; Kerstin Kaulich; Tobias Goschzik; Marcel Kool; Paul A Northcott; Katja von Hoff; André O von Bueren; Carsten Friedrich; Martin Mynarek; Heyko Skladny; Gudrun Fleischhack; Michael D Taylor; Friedrich Cremer; Peter Lichter; Andreas Faldum; Guido Reifenberger; Stefan Rutkowski; Stefan M Pfister
Journal: Acta Neuropathol Date: 2014-05-04 Impact factor: 17.088

615 in total

1. Molecular subgrouping of primary pineal parenchymal tumors reveals distinct subtypes correlated with clinical parameters and genetic alterations.

Authors: Elke Pfaff; Christian Aichmüller; Martin Sill; Damian Stichel; Matija Snuderl; Matthias A Karajannis; Martin U Schuhmann; Jens Schittenhelm; Martin Hasselblatt; Christian Thomas; Andrey Korshunov; Marina Rhizova; Andrea Wittmann; Anna Kaufhold; Murat Iskar; Petra Ketteler; Dietmar Lohmann; Brent A Orr; David W Ellison; Katja von Hoff; Martin Mynarek; Stefan Rutkowski; Felix Sahm; Andreas von Deimling; Peter Lichter; Marcel Kool; Marc Zapatka; Stefan M Pfister; David T W Jones
Journal: Acta Neuropathol Date: 2019-11-25 Impact factor: 17.088

2. DNA methylation-based profiling of uterine neoplasms: a novel tool to improve gynecologic cancer diagnostics.

Authors: Felix K F Kommoss; Damian Stichel; Daniel Schrimpf; Mark Kriegsmann; Basile Tessier-Cloutier; Aline Talhouk; Jessica N McAlpine; Kenneth T E Chang; Dominik Sturm; Stefan M Pfister; Laura Romero-Pérez; Thomas Kirchner; Thomas G P Grünewald; Rolf Buslei; Hans-Peter Sinn; Gunhild Mechtersheimer; Peter Schirmacher; Dietmar Schmidt; Hans-Anton Lehr; Felix Sahm; David G Huntsman; C Blake Gilks; Friedrich Kommoss; Andreas von Deimling; Christian Koelsche
Journal: J Cancer Res Clin Oncol Date: 2019-11-25 Impact factor: 4.553

3. MR imaging phenotype correlates with extent of genome-wide copy number abundance in IDH mutant gliomas.

Authors: Chih-Chun Wu; Rajan Jain; Lucidio Neto; Seema Patel; Laila M Poisson; Jonathan Serrano; Victor Ng; Sohil H Patel; Dimitris G Placantonakis; David Zagzag; John Golfinos; Andrew S Chi; Matija Snuderl
Journal: Neuroradiology Date: 2019-05-27 Impact factor: 2.804

Review 4. Glioblastoma in adults: a Society for Neuro-Oncology (SNO) and European Society of Neuro-Oncology (EANO) consensus review on current management and future directions.

Authors: Patrick Y Wen; Michael Weller; Eudocia Quant Lee; Brian M Alexander; Jill S Barnholtz-Sloan; Floris P Barthel; Tracy T Batchelor; Ranjit S Bindra; Susan M Chang; E Antonio Chiocca; Timothy F Cloughesy; John F DeGroot; Evanthia Galanis; Mark R Gilbert; Monika E Hegi; Craig Horbinski; Raymond Y Huang; Andrew B Lassman; Emilie Le Rhun; Michael Lim; Minesh P Mehta; Ingo K Mellinghoff; Giuseppe Minniti; David Nathanson; Michael Platten; Matthias Preusser; Patrick Roth; Marc Sanson; David Schiff; Susan C Short; Martin J B Taphoorn; Joerg-Christian Tonn; Jonathan Tsang; Roel G W Verhaak; Andreas von Deimling; Wolfgang Wick; Gelareh Zadeh; David A Reardon; Kenneth D Aldape; Martin J van den Bent
Journal: Neuro Oncol Date: 2020-08-17 Impact factor: 12.300

Review 5. Genetic and molecular epidemiology of adult diffuse glioma.

Authors: Annette M Molinaro; Jennie W Taylor; John K Wiencke; Margaret R Wrensch
Journal: Nat Rev Neurol Date: 2019-06-21 Impact factor: 42.937

6. Superiority of temozolomide over radiotherapy for elderly patients with RTK II methylation class, MGMT promoter methylated malignant astrocytoma.

Authors: Antje Wick; Tobias Kessler; Michael Platten; Christoph Meisner; Michael Bamberg; Ulrich Herrlinger; Jörg Felsberg; Astrid Weyerbrock; Kirsten Papsdorf; Joachim P Steinbach; Michael Sabel; Jan Vesper; Jürgen Debus; Jürgen Meixensberger; Ralf Ketter; Caroline Hertler; Regine Mayer-Steinacker; Sarah Weisang; Hanna Bölting; David Reuss; Guido Reifenberger; Felix Sahm; Andreas von Deimling; Michael Weller; Wolfgang Wick
Journal: Neuro Oncol Date: 2020-08-17 Impact factor: 12.300

7. Molecular and translational advances in meningiomas.

Authors: Suganth Suppiah; Farshad Nassiri; Wenya Linda Bi; Ian F Dunn; Clemens Oliver Hanemann; Craig M Horbinski; Rintaro Hashizume; Charles David James; Christian Mawrin; Houtan Noushmehr; Arie Perry; Felix Sahm; Andrew Sloan; Andreas von Deimling; Patrick Y Wen; Kenneth Aldape; Gelareh Zadeh
Journal: Neuro Oncol Date: 2019-01-14 Impact factor: 12.300

8. Subgroup-specific outcomes of children with malignant childhood brain tumors treated with an irradiation-sparing protocol.

Authors: Eveline Teresa Hidalgo; Matija Snuderl; Cordelia Orillac; Svetlana Kvint; Jonathan Serrano; Peter Wu; Matthias A Karajannis; Sharon L Gardner
Journal: Childs Nerv Syst Date: 2019-08-02 Impact factor: 1.475

9. Clinical impact of combined epigenetic and molecular analysis of pediatric low-grade gliomas.

Authors: Kohei Fukuoka; Yasin Mamatjan; Ruth Tatevossian; Michal Zapotocky; Scott Ryall; Ana Guerreiro Stucklin; Julie Bennett; Liana Figueiredo Nobre; Anthony Arnoldo; Betty Luu; Ji Wen; Kaicen Zhu; Alberto Leon; Dax Torti; Trevor J Pugh; Lili-Naz Hazrati; Normand Laperriere; James Drake; James T Rutka; Peter Dirks; Abhaya V Kulkarni; Michael D Taylor; Ute Bartels; Annie Huang; Gelareh Zadeh; Kenneth Aldape; Vijay Ramaswamy; Eric Bouffet; Matija Snuderl; David Ellison; Cynthia Hawkins; Uri Tabori
Journal: Neuro Oncol Date: 2020-10-14 Impact factor: 12.300

10. Molecular heterogeneity and CXorf67 alterations in posterior fossa group A (PFA) ependymomas.

Authors: Kristian W Pajtler; Ji Wen; Martin Sill; Tong Lin; Wilda Orisme; Bo Tang; Jens-Martin Hübner; Vijay Ramaswamy; Sujuan Jia; James D Dalton; Kelly Haupfear; Hazel A Rogers; Chandanamali Punchihewa; Ryan Lee; John Easton; Gang Wu; Timothy A Ritzmann; Rebecca Chapman; Lukas Chavez; Fredrick A Boop; Paul Klimo; Noah D Sabin; Robert Ogg; Stephen C Mack; Brian D Freibaum; Hong Joo Kim; Hendrik Witt; David T W Jones; Baohan Vo; Amar Gajjar; Stan Pounds; Arzu Onar-Thomas; Martine F Roussel; Jinghui Zhang; J Paul Taylor; Thomas E Merchant; Richard Grundy; Ruth G Tatevossian; Michael D Taylor; Stefan M Pfister; Andrey Korshunov; Marcel Kool; David W Ellison
Journal: Acta Neuropathol Date: 2018-06-16 Impact factor: 17.088