Ziyuan Shen, Shuo Zhang, Yaxue Jiao, Yuye Shi, Hao Zhang, Fei Wang, Ling Wang, Taigang Zhu, Yuqing Miao, Wei Sang, Guoqi Cai, Working Group Huaihai Lymphoma.
Abstract
Background: Diffuse large B-cell lymphoma (DLBCL) is a heterogeneous non-Hodgkin's lymphoma that poses a substantial clinical challenge. Machine learning (ML) has attracted considerable attention in the diagnosis, prognosis, and treatment of diseases. This study aimed to explore the prognostic factors of DLBCL using ML.
Year: 2022 PMID: 36157230 PMCID: PMC9507678 DOI: 10.1155/2022/1618272
Source DB: PubMed Journal: J Oncol ISSN: 1687-8450 Impact factor: 4.501
The baseline characteristics between the training cohort and the validation cohort.
| Variables | Training cohort | Validation cohort | P value |
|---|---|---|---|
| Gender (%) | |||
| Male | 451 (53.2) | 203 (55.9) | 0.416 |
| Female | 397 (46.8) | 160 (44.1) | |
| Age (year) | 62.00 (52.00, 70.00) | 62.00 (52.00, 70.00) | 0.903 |
| TC (mmol/L) | 4.32 (3.68, 4.96) | 4.23 (3.67, 4.96) | 0.730 |
| ALB (g/L) | 38.80 (34.80, 42.80) | 39.00 (34.40, 43.35) | 0.744 |
| RBC (10¹²/L) | 4.10 (3.66, 4.47) | 4.06 (3.73, 4.50) | 0.565 |
| HB (g/L) | 124.00 (108.00, 135.00) | 123.00 (108.00, 138.00) | 0.442 |
| PLT (10⁹/L) | 217.00 (165.00, 269.00) | 213.00 (154.00, 272.50) | 0.304 |
| LDH (U/L) | 236.00 (185.00, 404.25) | 233.00 (181.20, 350.00) | 0.328 |
| Ki-67 | 0.75 (0.60, 0.80) | 0.70 (0.60, 0.80) | 0.915 |
| B symptom (%) | |||
| Absence | 636 (75.0) | 264 (72.7) | 0.449 |
| Presence | 212 (25.0) | 99 (27.3) | |
| CNS involvement (%) | |||
| Absence | 761 (89.7) | 333 (91.7) | 0.332 |
| Presence | 87 (10.3) | 30 (8.3) | |
| BM involvement (%) | |||
| Absence | 776 (91.5) | 333 (91.7) | 0.987 |
| Presence | 72 (8.5) | 30 (8.3) | |
| Liver involvement (%) | |||
| Absence | 808 (95.3) | 343 (94.5) | 0.661 |
| Presence | 40 (4.7) | 20 (5.5) | |
| Ann Arbor stage (%) | |||
| I/II | 391 (46.1) | 166 (45.7) | 0.954 |
| III/IV | 457 (53.9) | 197 (54.3) | |
| NCCN-IPI (%) | |||
| LR/LIR | 477 (56.2) | 193 (53.2) | 0.355 |
| HIR/HR | 371 (43.8) | 170 (46.8) | |
| IPI (%) | |||
| LR/LIR | 529 (62.4) | 221 (60.9) | 0.743 |
| HIR/HR | 318 (37.5) | 141 (38.8) | |
| Bulky (%) | |||
| Absence | 799 (94.2) | 343 (94.5) | 0.961 |
| Presence | 49 (5.8) | 20 (5.5) | |
Note: TC: total cholesterol; ALB: albumin; RBC: red blood cell count; HB: hemoglobin; PLT: platelet; LDH: lactate dehydrogenase; CNS involvement: central nervous system involvement; BM involvement: bone marrow involvement; IPI: International Prognostic Index.
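The P values for the categorical rows can be approximated with a standard test of independence. The source does not state which test was used; the sketch below assumes Pearson's chi-square test with Yates continuity correction on a 2×2 table, which gives values close to those listed for gender and B symptoms. The function name `chi2_2x2_yates` is illustrative only.

```python
import math

def chi2_2x2_yates(a, b, c, d):
    """Chi-square test (Yates-corrected) for the 2x2 table [[a, b], [c, d]].

    Assumption: this is one plausible test for the table above, not
    necessarily the one the authors used.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    chi2 = sum(max(abs(o - e) - 0.5, 0.0) ** 2 / e
               for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: category; columns: training cohort, validation cohort
gender_chi2, gender_p = chi2_2x2_yates(451, 203, 397, 160)
bsym_chi2, bsym_p = chi2_2x2_yates(636, 264, 212, 99)
print(f"Gender:    chi2 = {gender_chi2:.3f}, P = {gender_p:.3f}")
print(f"B symptom: chi2 = {bsym_chi2:.3f}, P = {bsym_p:.3f}")
```

Continuous variables are summarised as median (interquartile range), so a rank-based test would be the natural counterpart for those rows; that is likewise an assumption, since the source does not name the test.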
Figure 1: Clinical variable selection using the LASSO model. (a) Coefficient profiles of the variables in the LASSO model; (b) selection of the optimal value of the parameter λ by cross-validation.
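To illustrate how a LASSO penalty drives coefficients to zero as λ grows, here is a minimal sketch using cyclic coordinate descent. It is a simplification under stated assumptions: it fits a plain linear model on synthetic data rather than a survival model on the study cohort, and it omits the cross-validation step that would pick λ. The data and the names `lasso_cd` and `soft_threshold` are hypothetical.

```python
import random

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; exactly zero when |rho| <= lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            # Covariance of feature j with the residual, with feature j added back
            rho = sum(v * (r + v * b[j]) for v, r in zip(col, resid)) / n
            new_bj = soft_threshold(rho, lam) / z
            delta = new_bj - b[j]
            if delta:
                resid = [r - v * delta for v, r in zip(col, resid)]
                b[j] = new_bj
    return b

# Hypothetical toy data: y depends on features 0 and 2; feature 1 is irrelevant.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [2.0 * r[0] - 1.0 * r[2] + rng.gauss(0, 0.1) for r in X]

# Smallest lambda at which every coefficient is zero
lam_max = max(abs(sum(r[j] * t for r, t in zip(X, y))) / len(X) for j in range(3))
path = {lam: lasso_cd(X, y, lam) for lam in (lam_max, 0.5, 0.01)}
for lam, coefs in path.items():
    print(round(lam, 3), [round(c, 3) for c in coefs])
```

Reading the printed path from large λ to small λ mirrors a coefficient-profile plot: all coefficients start at zero and the informative features enter first.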
Figure 2: Error rate as a function of the number of trees.
Multivariable analysis of OS based on LASSO and random forest.
| Variables | HR | 95% CI | P value |
|---|---|---|---|
| LASSO | |||
| Age | 1.032 | 1.022-1.042 | <0.001 |
| WBC | 1.028 | 1.017-1.040 | <0.001 |
| HB | 0.988 | 0.983-0.993 | <0.001 |
| CNS involvement | 2.241 | 1.636-3.068 | <0.001 |
| Gender | 0.659 | 0.524-0.829 | <0.001 |
| Ann Arbor stage | 1.644 | 1.286-2.100 | <0.001 |
| Random forest | |||
| Age | 1.031 | 1.021-1.041 | <0.001 |
| WBC | 1.031 | 1.020-1.041 | <0.001 |
| HB | 0.988 | 0.983-0.994 | <0.001 |
| CNS involvement | 1.992 | 1.462-2.715 | <0.001 |
| ALB | 0.984 | 0.969-0.999 | 0.038 |
| ECOG | 1.295 | 1.011-1.659 | 0.040 |
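The hazard ratios above can be read on the log scale: HR = exp(β), and a Wald-type 95% interval is exp(β ± 1.96·SE). The sketch below back-calculates β, SE, and an approximate two-sided P value from a published HR and interval. It assumes the intervals are Wald intervals, which the source does not confirm, and because the inputs are rounded the results are only approximate. The function name `wald_from_hr` is illustrative.

```python
import math

def wald_from_hr(hr, lower, upper, z_crit=1.96):
    """Recover log-hazard coefficient, standard error, z, and two-sided P.

    Assumption: (lower, upper) is a Wald 95% interval, symmetric on the log scale.
    """
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return beta, se, z, p_value

# ALB row of the random forest model: HR 0.984 (0.969-0.999)
alb_beta, alb_se, alb_z, alb_p = wald_from_hr(0.984, 0.969, 0.999)
print(f"ALB: beta = {alb_beta:.4f}, SE = {alb_se:.4f}, P ~ {alb_p:.3f}")

# Age is per year, so the HR for a 10-year difference compounds
age_hr_per_decade = 1.032 ** 10
print(f"Age, per 10 years: HR ~ {age_hr_per_decade:.2f}")
```

For the ALB row this gives a P value close to the listed 0.038, and it shows why a per-unit HR near 1 for a continuous variable such as age can still correspond to a sizeable effect across its clinical range.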
Figure 3: Comparison of the predictive ability of the LASSO and random forest models in (a) the training cohort and (b) the validation cohort.
Figure 4: Comparison of the predictive ability of the LASSO and random forest models by decision curve analysis (DCA) in (a) the training cohort and (b) the validation cohort.
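Decision curve analysis compares models by their net benefit at a chosen threshold probability: net benefit = TP/n − (FP/n) × pt/(1 − pt). The sketch below computes this for a binary outcome on a small invented dataset. It is a simplified illustration, since the study concerns survival outcomes and the source gives no implementation details. All data values and function names are hypothetical.

```python
def net_benefit(probs, outcomes, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    n = len(outcomes)
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical predicted risks and observed events (1 = event)
probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1]
outcomes = [1, 1, 0, 1, 0, 0, 1, 0]

nb_model = net_benefit(probs, outcomes, 0.5)
nb_all = net_benefit_treat_all(outcomes, 0.5)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}, treat none: 0.000")
```

A decision curve repeats this calculation over a range of thresholds; the model with the higher curve offers more net benefit at that threshold.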
Figure 5: Comparison between the LASSO, IPI, and NCCN-IPI models.
Figure 6: (a) Kaplan-Meier survival curves of DLBCL patients by the LASSO model; comparison of the LASSO, IPI, and NCCN-IPI models in the LR (b), LIR (c), and HR (d) groups.
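For readers unfamiliar with the Kaplan-Meier estimator behind such curves, here is a minimal sketch: at each distinct event time the survival estimate is multiplied by (1 − deaths / number at risk), and censored patients leave the risk set without changing the estimate. The follow-up times below are invented for illustration and are not study data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve

# Hypothetical follow-up times (months) and event indicators
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(t, round(s, 4))
```

Stratified curves such as those in a risk-group comparison come from applying the same estimator separately to each group.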