Classifying 2-year recurrence in patients with DLBCL using clinical variables with imbalanced data and machine learning methods.

Lei Wang, ZhiQiang Zhao, YanHong Luo, HongMei Yu, ShuQing Wu, XiaoLu Ren, ChuChu Zheng, XueQian Huang.

Abstract

BACKGROUND: Treatments are limited for patients with relapsed/refractory diffuse large B-cell lymphoma (DLBCL), and their survival rate is low. Predicting each patient's recurrence hazard could give clinicians a reference for choosing chemotherapy regimens that extend the period of long-term remission. Because current strategies cannot meet this need, we established predictive models to distinguish patients with DLBCL in complete remission who relapsed within 2 years from those who did not.
METHODS: We assessed 518 patients with DLBCL who were treated between January 2011 and July 2016 and measured 52 variables for each patient. Seventeen variables were first selected by variable selection methods (including Lasso, Adaptive Lasso, and Elastic net). We then built classifiers and probability models for imbalanced data by combining SMOTE sampling, cost-sensitive learning, and ensemble learning (AdaBoost, voting strategy, and Stacking) with machine learning methods (Support Vector Machine, BackPropagation Artificial Neural Network, Random Forest), respectively. Finally, we assessed their performance.
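As an illustration of the SMOTE sampling step named in the methods, the following is a minimal sketch of the general SMOTE idea: each synthetic minority sample is an interpolation between a minority sample and one of its nearest minority neighbours. It is not the authors' implementation; the function name `smote`, the parameters `n_new`, `k`, and `seed`, and the toy data are all assumptions made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples (hypothetical helper).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class (made-up values, two features per sample).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6, k=2)
```

In practice the oversampling would be applied to the training split only, so that synthetic samples do not leak into the data used for evaluation.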
RESULTS: The disease stage and five other variables are significant indicators of recurrence. Among all classifiers, the SVM with the AdaBoost ensemble learning method trained on SMOTE-sampled data performs best (Sensitivity=97.3%, AUC=96%, RMSE=19.6%, G-mean=96%). Among the probability models, the SVM with AdaBoost (AUC=98.7%, RMSE=17.7%, MXE=12.7%, Cal mean=3.2%, BS0=2.5%, BS1=4%, BSALL=3.1%) and random forest (AUC=99.5%, RMSE=19.8%, MXE=16.2%, Cal mean=9.1%, BS0=4.8%, BS1=2.9%, BSALL=3.9%), both trained on SMOTE-sampled data, perform well.
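For readers unfamiliar with the metrics reported above, here is a small sketch of how some of them are commonly computed. It assumes that G-mean is the geometric mean of sensitivity and specificity, and that BS0, BS1, and BSALL denote the Brier score on non-recurrent cases, recurrent cases, and all cases; the abstract does not define them, so these readings are assumptions. The reported figures are at least consistent with RMSE being the square root of the overall Brier score (for example, sqrt(0.031) ≈ 0.176). The function names, the 0.5 threshold, and the toy data are hypothetical.

```python
import math


def sensitivity_specificity_gmean(y_true, y_pred):
    """Return (sensitivity, specificity, G-mean) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens, spec, math.sqrt(sens * spec)


def brier(y_true, prob, label=None):
    """Mean squared error between predicted probability and outcome.

    If label is 0 or 1, only samples of that true class are used
    (assumed meaning of BS0 / BS1); otherwise all samples (BSALL).
    """
    pairs = [(t, p) for t, p in zip(y_true, prob) if label is None or t == label]
    return sum((p - t) ** 2 for t, p in pairs) / len(pairs)


# Toy imbalanced data (made up): 2 recurrent cases, 6 non-recurrent.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
prob = [0.9, 0.4, 0.1, 0.2, 0.6, 0.05, 0.3, 0.1]
y_pred = [1 if p >= 0.5 else 0 for p in prob]  # assumed 0.5 threshold

sens, spec, gmean = sensitivity_specificity_gmean(y_true, y_pred)
bs0 = brier(y_true, prob, label=0)
bs1 = brier(y_true, prob, label=1)
bs_all = brier(y_true, prob)
rmse = math.sqrt(bs_all)
```

On imbalanced data, G-mean is often preferred over plain accuracy because it drops toward zero when either class is poorly recognised.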
CONCLUSIONS: The predictive model has high accuracy for almost all DLBCL patients, and the six indicators can serve as signals of recurrence.
Copyright © 2020 Elsevier B.V. All rights reserved.

Keywords:  Classification and possibility prediction; Imbalanced data; Indicators; Machine learning; Relapsed/refractory DLBCL

Year:  2020        PMID: 32544778     DOI: 10.1016/j.cmpb.2020.105567

Source DB:  PubMed          Journal:  Comput Methods Programs Biomed        ISSN: 0169-2607            Impact factor:   5.428


  1 in total

1.  LASSO Model Better Predicted the Prognosis of DLBCL than Random Forest Model: A Retrospective Multicenter Analysis of HHLWG.

Authors:  Ziyuan Shen; Shuo Zhang; Yaxue Jiao; Yuye Shi; Hao Zhang; Fei Wang; Ling Wang; Taigang Zhu; Yuqing Miao; Wei Sang; Guoqi Cai; Working Group Huaihai Lymphoma
Journal:  J Oncol       Date:  2022-09-16       Impact factor: 4.501
