Elham Sherafat (1), Jordan Force (1), Ion I Măndoiu (2). (1) Computer Science and Engineering Department, University of Connecticut, Storrs, CT, 06269, USA. (2) Computer Science and Engineering Department, University of Connecticut, Storrs, CT, 06269, USA. ion@engr.uconn.edu.
Abstract
BACKGROUND: Personalized cancer vaccines are emerging as one of the most promising approaches to immunotherapy of advanced cancers. However, only a small proportion of the neoepitopes generated by somatic DNA mutations in cancer cells lead to tumor rejection. Since it is impractical to experimentally assess all candidate neoepitopes prior to vaccination, developing accurate methods for predicting tumor-rejection mediating neoepitopes (TRMNs) is critical for enabling routine clinical use of cancer vaccines.
RESULTS: In this paper we introduce Positive-unlabeled Learning using AuTOml (PLATO), a general semi-supervised approach to improving the accuracy of model-based classifiers. PLATO generates a set of high-confidence positive calls by applying a stringent filter to model-based predictions, then rescores the remaining candidates using positive-unlabeled learning. To achieve robust performance on clinical samples with large patient-to-patient variation, PLATO further integrates AutoML hyper-parameter tuning, spy-based selection of the classification threshold, and support for bootstrapping.
CONCLUSIONS: Experimental results on real datasets demonstrate that PLATO improves on model-based approaches for two key steps in TRMN prediction, namely somatic variant calling from exome sequencing data and peptide identification from MS/MS data.
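The rescoring idea summarized under RESULTS can be illustrated with a minimal sketch. This is not the PLATO implementation. It assumes a plain gradient-descent logistic regression in place of an AutoML-tuned model, omits bootstrapping, and uses hypothetical names and parameters (`pu_rescore_with_spies`, `strict_cutoff`, `spy_frac`, `spy_recall`) that do not come from the paper. Candidates passing a stringent score cutoff are treated as positives, a fraction of them is hidden among the unlabeled candidates as "spies", and the classification threshold is set so that most spies are recovered.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Gradient-descent logistic regression (assumed stand-in for a tuned model)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

def pu_rescore_with_spies(X, model_scores, strict_cutoff,
                          spy_frac=0.2, spy_recall=0.95, seed=0):
    """Hypothetical helper: rescore candidates with positive-unlabeled learning.

    X            : feature matrix, one row per candidate
    model_scores : scores from the original model-based classifier
    strict_cutoff: stringent filter defining the high-confidence positives
    """
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(model_scores >= strict_cutoff)
    unl_idx = np.flatnonzero(model_scores < strict_cutoff)
    if len(pos_idx) < 2:
        raise ValueError("need at least two high-confidence positives")

    # Hide a fraction of the positives among the unlabeled set as spies.
    n_spies = max(1, int(spy_frac * len(pos_idx)))
    spies = rng.choice(pos_idx, size=n_spies, replace=False)
    train_pos = np.setdiff1d(pos_idx, spies)

    # Train positives (label 1) against unlabeled candidates plus spies (label 0).
    y = np.zeros(len(X))
    y[train_pos] = 1.0
    w = fit_logistic(X, y)
    scores = predict_proba(w, X)

    # Choose the threshold so that roughly spy_recall of the spies score above it.
    threshold = np.quantile(scores[spies], 1.0 - spy_recall)

    calls = np.zeros(len(X), dtype=bool)
    calls[pos_idx] = True                          # keep high-confidence calls
    calls[unl_idx] = scores[unl_idx] >= threshold  # rescue rescored candidates
    return scores, threshold, calls

# Synthetic usage example: 200 true positives, 300 true negatives.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(2.0, 1.0, (200, 2)), rng.normal(-2.0, 1.0, (300, 2))])
model_scores = np.concatenate([rng.uniform(0.3, 1.0, 200), rng.uniform(0.0, 0.6, 300)])
scores, threshold, calls = pu_rescore_with_spies(X, model_scores, strict_cutoff=0.9)
```

The spy-based threshold is useful because no labeled negatives are available. The scores of known positives hidden in the unlabeled pool indicate how low a true positive can plausibly score under the trained classifier.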