Machine learning workflows to estimate class probabilities for precision cancer diagnostics on DNA methylation microarray data.

Máté E Maros, David Capper, David T W Jones, Volker Hovestadt, Andreas von Deimling, Stefan M Pfister, Axel Benner, Manuela Zucknick, Martin Sill.

Abstract

DNA methylation data-based precision cancer diagnostics is emerging as the state of the art for molecular tumor classification. Standards for choosing statistical methods with regard to well-calibrated probability estimates for these typically highly multiclass classification tasks are still lacking. To support this choice, we evaluated well-established machine learning (ML) classifiers including random forests (RFs), elastic net (ELNET), support vector machines (SVMs) and boosted trees in combination with post-processing algorithms and developed ML workflows that allow for unbiased class probability (CP) estimation. Calibrators included ridge-penalized multinomial logistic regression (MR) and Platt scaling by fitting logistic regression (LR) and Firth's penalized LR. We compared these workflows on a recently published brain tumor 450k DNA methylation cohort of 2,801 samples with 91 diagnostic categories using a 5 × 5-fold nested cross-validation scheme and demonstrated their generalizability on external data from The Cancer Genome Atlas. ELNET was the top stand-alone classifier with the best calibration profiles. The best overall two-stage workflow was MR-calibrated SVM with linear kernels closely followed by ridge-calibrated tuned RF. For calibration, MR was the most effective regardless of the primary classifier. The protocols developed as a result of these comparisons provide valuable guidance on choosing ML workflows and their tuning to generate well-calibrated CP estimates for precision diagnostics using DNA methylation data. Computation times vary depending on the ML algorithm from <15 min to 5 d using multi-core desktop PCs. Detailed scripts in the open-source R language are freely available on GitHub, targeting users with intermediate experience in bioinformatics and statistics and using R with Bioconductor extensions.
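
As a rough illustration of the calibration step described above, the following is a minimal sketch of binary Platt scaling: a logistic regression is fitted on a classifier's raw scores so that the resulting probabilities are better calibrated. It is not the authors' implementation (their scripts are in R and address the multiclass case with ridge-penalized multinomial regression). The synthetic data, the overconfidence factor, and all function names (`make_data`, `fit_platt`, `log_loss`, `brier`) are assumptions made for illustration only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_data(n, overconfidence=5.0):
    # Hypothetical synthetic data: the true log-odds are z, but the
    # "classifier" reports scores inflated by a constant factor.
    scores, labels = [], []
    for _ in range(n):
        z = random.gauss(0.0, 1.0)
        y = 1 if random.random() < sigmoid(z) else 0
        scores.append(overconfidence * z)
        labels.append(y)
    return scores, labels


def fit_platt(scores, labels, lr=0.1, n_iter=500):
    # Fit p(y=1 | s) = sigmoid(a * s + b) by gradient descent on the log loss.
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def log_loss(probs, labels, eps=1e-12):
    # Mean negative log-likelihood; probabilities are clipped to avoid log(0).
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def brier(probs, labels):
    # Mean squared difference between predicted probability and outcome.
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


random.seed(0)
train_s, train_y = make_data(400)  # data used to fit the calibrator
test_s, test_y = make_data(400)    # held-out data used to evaluate it

a, b = fit_platt(train_s, train_y)
raw_probs = [sigmoid(s) for s in test_s]
cal_probs = [sigmoid(a * s + b) for s in test_s]

print("slope a, intercept b:", a, b)
print("log loss raw vs calibrated:",
      log_loss(raw_probs, test_y), log_loss(cal_probs, test_y))
print("Brier raw vs calibrated:",
      brier(raw_probs, test_y), brier(cal_probs, test_y))
```

In a nested cross-validation setting such as the one the abstract describes, the calibrator would be fitted on scores produced by the inner folds and evaluated on the outer fold, so that calibrated probabilities are never assessed on data used to fit either stage.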

Year: 2020 | PMID: 31932775 | DOI: 10.1038/s41596-019-0251-6

Source DB: PubMed | Journal: Nat Protoc | ISSN: 1750-2799 | Impact factor: 13.491


