Yuantong Li1, Fei Wang2, Mengying Yan3, Edward Cantu4, Fan Nils Yang5, Hengyi Rao6, Rui Feng7. 1. Department of Statistics, Purdue University, West Lafayette, IN, 47907, USA. 2. Department of Healthcare Policy and Research, Cornell University Weill Medical School, New York, NY, 10065, USA. 3. Department of Statistics, George Washington University, Washington, DC, 20052, USA. 4. Department of Surgery, University of Pennsylvania, Philadelphia, PA, 19104, USA. 5. Department of Neuroscience, Georgetown University, Washington, DC, 20057, USA. 6. Department of Neurology, University of Pennsylvania, Philadelphia, PA, 19104, USA. 7. Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA.
Abstract
MOTIVATION: Traditional regression models are limited in outcome prediction due to their parametric nature. Current deep learning methods allow for various effects and interactions and have shown improved performance, but they typically need to be trained on a large amount of data to obtain reliable results. Gene expression studies often have small sample sizes but high-dimensional, correlated predictors, so traditional deep learning methods are not readily applicable. RESULTS: In this paper, we propose peel learning (PL), a novel neural network that incorporates the prior relationship among genes. In each layer of learning, the overall structure is peeled into multiple local substructures. Within each substructure, dependency among variables is reduced through linear projections. The overall structure is gradually simplified over layers, and the weight parameters are optimized through a revised backpropagation. We applied PL to a small lung transplantation study to predict recipients' post-surgery primary graft dysfunction using donors' gene expressions within several immunology pathways, where PL showed improved prediction accuracy compared with conventional penalized regression, classification trees, a feed-forward neural network, and a neural network assuming a prior network structure. Through simulation studies, we also demonstrated the advantage of imposing a specific structure among predictor variables in a neural network, over no structure or a uniform group structure, which is especially favorable in smaller studies. The empirical evidence is consistent with our theoretical proof of an improved upper bound on PL's complexity over that of ordinary neural networks. AVAILABILITY AND IMPLEMENTATION: The PL algorithm was implemented in Python, and the open-source code and instructions will be available at https://github.com/Likelyt/Peel-Learning.
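The core idea described above, peeling the predictor structure into local substructures and reducing within-substructure dependency through linear projections, can be sketched as follows. This is an illustrative toy in NumPy, not the authors' implementation: the function name `peel_layer`, the hard-coded gene groups, and the random stand-in weights are all assumptions for exposition.

```python
# Illustrative sketch of one "peel layer": each local substructure
# (a group of correlated predictors) is reduced through a linear
# projection, yielding a simplified structure for the next layer.
# Names (peel_layer, groups) and the random weights are hypothetical;
# in PL the weights would be learned via the revised backpropagation.
import numpy as np

rng = np.random.default_rng(0)

def peel_layer(X, groups, out_dim=1):
    """Project each column group (substructure) of X down to out_dim columns."""
    pieces = []
    for g in groups:
        W = rng.normal(size=(len(g), out_dim))  # stand-in for learned weights
        pieces.append(X[:, g] @ W)              # linear projection within the group
    return np.hstack(pieces)                    # simplified structure for the next layer

# Toy data: 20 samples, 6 genes forming two correlated substructures.
X = rng.normal(size=(20, 6))
groups = [[0, 1, 2], [3, 4, 5]]
H = peel_layer(X, groups)  # shape (20, 2): one summary column per substructure
```

Stacking such layers gradually collapses the predictor graph toward the outcome, which is the sense in which the overall structure is "simplified over layers."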