Shuanglong Fan (1), Zhiqiang Zhao (2), Hongmei Yu (1), Lei Wang (1), Chuchu Zheng (1), Xueqian Huang (1), Zhenhuan Yang (1), Meng Xing (1), Qing Lu (3), Yanhong Luo (4).
1. Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China.
2. Department of Hematology, Shanxi Cancer Hospital, Taiyuan, China.
3. Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, USA. lucienq@hotmail.com.
4. Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China. sxmulyh@163.com.
Abstract
BACKGROUND: Survival rates among patients with diffuse large B-cell lymphoma (DLBCL) vary with chemotherapy regimen, clinical stage, immunologic expression and other factors. Accurate prediction of mortality hazard is key to precision medicine because it helps clinicians choose therapies that extend the survival of individual patients with DLBCL. We therefore developed a model to predict the mortality hazard of DLBCL patients within 2 years of treatment.
METHODS: We evaluated 406 patients with DLBCL and collected 17 variables from each patient. Predictive variables were selected using the Cox model, the logistic model and the random forest algorithm. Five classifiers served as base models for ensemble learning: naïve Bayes, logistic regression, random forest, support vector machine and feedforward neural network. We first calibrated the biased outputs of the five base models using probability calibration methods (including shape-restricted polynomial regression, Platt scaling and isotonic regression). We then aggregated the base-model outputs to predict the 2-year mortality of DLBCL patients using three strategies (stacking, simple averaging and weighted averaging). Finally, we assessed model performance over 300 hold-out tests.
RESULTS: Gender, stage, International Prognostic Index (IPI), Karnofsky performance status (KPS) and rituximab were significant predictors of death within 2 years of treatment. The stacking model whose base models were first calibrated by shape-restricted polynomial regression performed best among all methods (AUC = 0.820, ECE = 8.983, MCE = 21.265). By contrast, the stacking model without probability calibration performed worse (AUC = 0.806, ECE = 9.866, MCE = 24.850). For the simple averaging and weighted averaging models, probability calibration likewise reduced the prediction error of the ensemble.
CONCLUSIONS: Among all the methods compared, the proposed model had the lowest prediction error when predicting the 2-year mortality of DLBCL patients. These results suggest that applying probability calibration to ensemble learning is an effective modeling strategy.
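The calibrate-then-aggregate idea described in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation. It assumes isotonic regression fitted by the standard pool-adjacent-violators algorithm as the calibration step, simple and weighted averaging as the aggregation step, and ECE/MCE computed over ten equal-width probability bins (the abstract does not state the binning or the scale of its reported values). All function names, the weights and the toy data are hypothetical; stacking and shape-restricted polynomial regression are not shown, and tied scores receive no special handling.

```python
from typing import List, Optional, Sequence, Tuple

# Illustrative sketch only: helper names, binning scheme and toy data are assumptions.
IsotonicModel = List[Tuple[float, float]]  # (upper score bound, calibrated probability)


def fit_isotonic(scores: Sequence[float], labels: Sequence[int]) -> IsotonicModel:
    """Fit a non-decreasing step function with pool-adjacent-violators (PAV)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    blocks: List[List[float]] = []  # each block: [label sum, count, largest score]
    for i in order:
        blocks.append([float(labels[i]), 1.0, float(scores[i])])
        # Merge backwards while the previous block's mean exceeds the current one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            label_sum, count, top = blocks.pop()
            blocks[-1][0] += label_sum
            blocks[-1][1] += count
            blocks[-1][2] = top
    return [(top, label_sum / count) for label_sum, count, top in blocks]


def apply_isotonic(model: IsotonicModel, score: float) -> float:
    """Map a raw score to the value of the first block whose bound covers it."""
    for upper, value in model:
        if score <= upper:
            return value
    return model[-1][1]  # scores above every bound take the last block's value


def ensemble_average(per_model_probs: Sequence[Sequence[float]],
                     weights: Optional[Sequence[float]] = None) -> List[float]:
    """Simple average when weights is None, otherwise a normalized weighted average."""
    if weights is None:
        weights = [1.0] * len(per_model_probs)
    total = sum(weights)
    n = len(per_model_probs[0])
    return [
        sum(w * probs[j] for w, probs in zip(weights, per_model_probs)) / total
        for j in range(n)
    ]


def calibration_errors(probs: Sequence[float], labels: Sequence[int],
                       n_bins: int = 10) -> Tuple[float, float]:
    """Return (ECE, MCE) on a 0-1 scale using equal-width probability bins."""
    bins: List[List[int]] = [[] for _ in range(n_bins)]
    for i, p in enumerate(probs):
        bins[min(int(p * n_bins), n_bins - 1)].append(i)  # p == 1.0 goes to last bin
    ece, mce = 0.0, 0.0
    for members in bins:
        if not members:
            continue
        mean_pred = sum(probs[i] for i in members) / len(members)
        observed = sum(labels[i] for i in members) / len(members)
        gap = abs(observed - mean_pred)
        ece += len(members) / len(probs) * gap  # bin-size-weighted gap
        mce = max(mce, gap)                     # worst single-bin gap
    return ece, mce


if __name__ == "__main__":
    # Toy data, not from the study: 1 = died within 2 years, 0 = survived.
    labels = [0, 0, 1, 0, 1, 1]
    raw_outputs = [
        [0.30, 0.35, 0.40, 0.45, 0.50, 0.55],  # hypothetical base model A
        [0.05, 0.20, 0.60, 0.70, 0.90, 0.95],  # hypothetical base model B
    ]
    calibrated = []
    for raw in raw_outputs:
        model = fit_isotonic(raw, labels)
        calibrated.append([apply_isotonic(model, s) for s in raw])
    print("simple   (ECE, MCE):", calibration_errors(ensemble_average(calibrated), labels))
    print("weighted (ECE, MCE):",
          calibration_errors(ensemble_average(calibrated, weights=[1.0, 2.0]), labels))
```

In this toy run the calibration maps are fitted and evaluated on the same six cases, which would overstate performance in practice. A faithful evaluation would fit the calibrators on training data and score held-out cases, in the spirit of the repeated hold-out tests the abstract mentions.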
Keywords:
Calibration; DLBCL; Discrimination; Ensemble method; Probability calibration; Risk prediction