Fengying Zhang, Yan Liu, Weijie Ma, Shengming Zhao, Jin Chen, Zhichun Gu.
Abstract
Objective: This study aimed to systematically assess the characteristics and risk of bias of previous studies that investigated nonlinear machine learning algorithms for warfarin dose prediction.
Keywords: PROBAST; algorithms; model prediction; nonlinear machine learning; warfarin
Year: 2022 PMID: 35629140 PMCID: PMC9147332 DOI: 10.3390/jpm12050717
Source DB: PubMed Journal: J Pers Med ISSN: 2075-4426
Figure 1. PRISMA flow chart of included studies. * The same study was published in both Chinese and English.
Table 1. Summary characteristics of the included studies.
| Study | Year | Study Type | Source | Patients | Indication | Target INR * | Features | Feature Selection | Missing Data Handling | Model Type | Machine Learning Algorithms # | Performance Measures |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Solomon | 2004 | Retrospective | Israel | 148 | NA | NA | clinical | Univariate analysis | NA | Development | NNM | r |
| Cosgun | 2011 | Retrospective | USA | 290 | NA | 2.0–3.0 | clinical + genetic | Univariate analysis | Single imputation | Development | DT, SV, Ensemble learning | R2 |
| Hu | 2012 | Retrospective | China | 587 | NA | 1.0–3.0 | clinical | Expert opinion and literature review | NA | Development | DT, SV, KNN, Ensemble learning | MAE |
| Grossi | 2014 | Retrospective | Italy | 377 | PE, DVT, AF, AHV, CM, Stroke, Others | 2.0–4.0 | clinical + genetic | Machine learning algorithm (TWIST system) | NA | Development | NNM | R2, MAE, ideal dose |
| Saleh | 2014 | Retrospective | IWPC sites | 4271 | PE, DVT, AF, AHV, CM, Stroke, Others | 2.0–3.0 | clinical + genetic | Backward variable selection | Excluded | Development | NNM | R2, MAE, ideal dose |
| Zhou | 2014 | Retrospective | China | 1093 | HVR | 1.5–2.5 | clinical | Univariate analysis, stepwise regression | NA | Development | NNM | MAE, ideal dose |
| Li | 2015 | Retrospective | China; IWPC sites | 1511 | HVR | 1.7–3.0; 2.0–3.0 | clinical + genetic | Stepwise regression | Excluded | External validation | DT, SV, NNM, Ensemble learning, Other | MAE, ideal dose |
| Liu | 2015 | Retrospective | IWPC sites | 4797 | PE, DVT, AF, AHV, CM, Stroke, Others | 2.0–3.0 | clinical + genetic | Stepwise regression | Excluded | Development | DT, SV, NNM, Ensemble learning, Other | MAE, ideal dose |
| Alzubiedi | 2016 | Retrospective | IWPC sites | 163 | PE, DVT, AF, Stroke, Others | 2.0–3.0 | clinical + genetic | Backward variable selection | NA | Development | NNM | R2, MAE, ideal dose |
| Pavani | 2016 | NR | India | 240 | PE, AF, HVR | No limitation | clinical + genetic | NA | NA | Development | NNM | R2, MAE |
| Li | 2018 | Retrospective | China | 15,694 | HVR | 1.5–2.5 | clinical | Covariance analysis, expert opinion, and literature review | NA | Development with external validation (same data) | NNM | MAE, RMSE, ideal dose |
| Ma | 2018 | Retrospective | IWPC sites | 5743 | PE, DVT, AF, AHV, CM, Stroke, Others | 1.7–3.3 | clinical + genetic | Expert opinion and literature review | Single imputation | Development | SV, NNM, Ensemble learning, KNN | MAE, ideal dose |
| Tao | 2018 | Retrospective | China | 13,639 | HVR | 1.5–2.5 | clinical | Univariate analysis | NA | Development with external validation (same data) | NNM | MAE, MSE, ideal dose |
| Li | 2019 | Retrospective | China | 13,639 | HVR | 1.5–2.5 | clinical | Univariate analysis | NA | Development with external validation (same data) | NNM | MAE, MSE, RMSE, ideal dose |
| Tao | 2019 | Retrospective | China | 289 | NR | 2.0–3.0 | clinical + genetic | NA | NA | Development | NNM, SV, GP, Ensemble learning | R2, MAE, MSE, ideal dose |
| Tao | 2019 | Retrospective | China; IWPC sites | 617 | PE, DVT, AF, VR, ICT, EVE, Stroke | 2.0–3.0; 2.0–2.5 | clinical + genetic | Expert opinion and literature review | NA | Development | DT, SV, Ensemble learning | R2, MAE, MSE, ideal dose |
| Roche-Lima | 2020 | Retrospective | USA | 190 | PE, DVT, AF, VR, DM2, CHF, Stroke, Others | 2.0–3.0 | clinical + genetic | NA | Excluded | Development | DT, SV, NNM, KNN, Ensemble learning, Other | MAE, ideal dose |
| Asiimwe | 2021 | Retrospective | Uganda, South Africa | 634 | AF, VT, VHT | 2.5–3.5; 2.0–3.0 | clinical | Expert opinion and literature review | Multivariate imputation | Development with external validation (another dataset) | DT, SV, KNN, NNM, Ensemble learning, Other | MAE, MAPE, ideal dose |
| Gu | 2021 | Retrospective | China | 15,108 | HVR | 1.5–2.5 | clinical | Univariate analysis | Excluded | Development with external validation (same data) | NNM | MAE, MSE, ideal dose |
| Liu | 2021 | Retrospective | China | 377 | PE, DVT, AF, HF, PAH, Stroke | 1.5–3.0 | clinical + genetic | Univariate analysis | Not imputed | Development | Ensemble learning | R2, MAE, MSE, RMSE, ideal dose |
| Ma | 2021 | Retrospective | China | 19,060 | HVR | 1.5–2.5 | clinical | Univariate analysis | NA | Development with external validation (same data) | NNM | MAE, MSE, ideal dose |
| Nguyen | 2021 | Retrospective | Korea | 650 | PE, DVT, HVR, VHD, Stroke, Arrhythmia, Others | 1.5–3.0 | clinical + genetic | Recursive feature elimination | Single imputation | Development | Ensemble learning | r, MAE, RMSE, ideal dose |
| Steiner | 2021 | Retrospective | IWPC sites, North and South America | 7030 | PE, DVT, TIA, Others | 2.0–3.0; No limitation | clinical + genetic | NA | Multivariate imputation | Development | DT, SV, Other | MAE, ideal dose |
AF—atrial fibrillation; AHV—artificial heart valves; PE—pulmonary embolism; DVT—deep vein thrombosis; CM—cardiomyopathy; ICT—intracardiac thrombus; EVE—endovascular exclusion of aortic dissection; DM2—type 2 diabetes mellitus; CHF—congestive heart failure; HF—heart failure; VT—venous thromboembolism; VHT—valvular heart disease; PAH—pulmonary arterial hypertension; TIA—transient ischemic attack; HVR—heart valve replacement; INR—international normalized ratio; NNM—neural network model; DT—decision tree; SV—support vector; KNN—K-nearest neighbor; GP—genetic programming; Other—other nonlinear regression model; r—coefficient of correlation; R2—coefficient of determination; MAE—mean absolute error; MAPE—mean absolute percentage error; MSE—mean square error; RMSE—root mean square error. * Where an article reported different target INRs for different indications within the same dataset, we took the minimum and maximum of the target INRs; where target INRs differed between datasets, we reported them separately. # The nonlinear machine learning algorithms were divided into seven categories (DT, SV, NNM, KNN, GP, ensemble learning, and other nonlinear regression) based on the algorithms involved in the studies. When a study reported several subcategories within a larger category, only the larger category is reported in the table.
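The performance measures abbreviated above are standard regression metrics. As a minimal sketch, they can be computed as follows in Python; the ±20% tolerance used for the "ideal dose" rate is an illustrative assumption, since the table does not state which threshold each study applied:

```python
import math

def regression_metrics(actual, predicted, ideal_window=0.20):
    """Compute the performance measures reported across the included studies.

    actual, predicted: weekly warfarin doses (mg/week).
    ideal_window: fractional tolerance for the 'ideal dose' rate.
    The 0.20 default (within +/-20% of the actual dose) is an assumed
    convention, not one stated in the table.
    """
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n        # MAE: mean absolute error
    mse = sum(e * e for e in errors) / n         # MSE: mean square error
    rmse = math.sqrt(mse)                        # RMSE: root mean square error
    mean_a = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot                     # R2: coefficient of determination
    ideal = sum(abs(p - a) <= ideal_window * a   # fraction of 'ideal' predictions
                for a, p in zip(actual, predicted)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2, "ideal": ideal}
```

Because MAE and RMSE are in mg/week, they are only comparable across studies with similar dosing scales, which is one reason the table reports them per study rather than pooled.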
Figure 2. Predictors that were included in at least four of the studies. SCr—serum creatinine; CHF—congestive heart failure.
Table 2. Performance evaluation of the included nonlinear machine learning algorithm studies.
| Studies | No. of Models | Models | No. of Patients | No. of Features | MAE (mg/week) |
|---|---|---|---|---|---|
| Development | | | | | |
| Solomon 2004 | 1 | NNM | 148 | 3 | NR |
| Cosgun 2011 | 3 | DT, SV, Ensemble learning | 290 | 11 | NR |
| Hu 2012 | 9 | DT, SV, KNN, Ensemble learning | 587 | 7 | (1.47, 1.55) |
| Grossi 2014 | 1 | NNM | 377 | 14 | 5.72 |
| Saleh 2014 | 1 | NNM | 4271 | 9 | 9 |
| Zhou 2014 | 1 | NNM | 1093 | 11 | 0.08 * |
| Liu 2015 | 7 | DT, SV, NNM, Ensemble learning, Other | 4797 | 9 | (8.84, 9.82) |
| Alzubiedi 2016 | 1 | NNM | 163 | 7 | 11.2 |
| Pavani 2016 | 1 | NNM | 240 | 9 | −1.97 * |
| Ma 2018 | 8 | SV, NNM, KNN, Ensemble learning | 5743 | 13 | (8.31, 10.86) |
| Tao 2019 | 6 | NNM, SV, GP, Ensemble learning | 289 | 7 | NR |
| Tao 2019 | 4 | DT, SV, Ensemble learning | 617 | 11 | (4.73, 5.36) |
| Roche-Lima 2020 | 9 | DT, SV, NNM, KNN, Ensemble learning, Other | 190 | 24 | (4.73, 9.87) |
| Liu 2021 | 3 | Ensemble learning | 377 | 11 | (2.98, 4.54) |
| Nguyen 2021 | 1 | Ensemble learning | 650 | 17 | 4.48 |
| Steiner 2021 | 3 | DT, SV, Other | 7030 | 13 | (8.11, 8.18) |
| Development with external validation (same data) | | | | | |
| Li 2018 IV | 1 | NNM | 15,694 | 12 | 2.59 |
| Li 2018 EV | 1 | NNM | 15,694 | 12 | 2.68 |
| Tao 2018 IV | 1 | NNM | 13,639 | 9 | 4.07 |
| Tao 2018 EV | 1 | NNM | 13,639 | 9 | 4.22 |
| Li 2019 IV | 1 | NNM | 13,639 | 10 | 4.82 |
| Li 2019 EV | 1 | NNM | 13,639 | 10 | 5.18 |
| Gu 2021 IV | 1 | NNM | 15,108 | 8 | 2.58 |
| Gu 2021 EV | 1 | NNM | 15,108 | 8 | 2.59 |
| Ma 2021 IV | 2 | NNM | 19,060 | 8 | (2.28, 3.04) |
| Ma 2021 EV | 2 | NNM | 19,060 | 8 | (2.42, 2.88) |
| Development with external validation (another dataset) | | | | | |
| Asiimwe 2021 | 13 | DT, SV, KNN, NNM, Ensemble learning, Other | 270 | 7 | (12.07, 17.59) |
| External validation | | | | | |
| Li 2015 | 6 | DT, SV, NNM, Ensemble learning, Other | 1295 | 10 | (4.41, 4.76) |
| Li 2015 | 6 | DT, SV, NNM, Ensemble learning, Other | 216 | 10 | (4.40, 4.84) |
NNM—neural network model; DT—decision tree; SV—support vector; KNN—K-nearest neighbor; GP—genetic programming; IV—internal validation; EV—external validation. * The value reported in the literature was unclear, and it was impossible to distinguish whether it was derived from the training set or the test set.
Figure 3. PROBAST (Prediction Model Risk of Bias Assessment Tool) risk of bias assessment for the 23 studies.
Table 3. PROBAST signaling questions in the 23 included studies.
Counts are given as number (percentage, 95% confidence interval); n = 23 included studies.

| Signaling Question No. | Signaling Question | Yes or Probably Yes | No or Probably No | No Information |
|---|---|---|---|---|
| Participant domain | | | | |
| 1.1 | Were appropriate data sources used, e.g., cohort, RCT, or nested case–control study data? | 23 (100, 100 to 100) | 0 | 0 |
| 1.2 | Were all inclusions and exclusions of participants appropriate? | 13 (57, 36 to 77) | 8 (35, 15 to 54) | 2 (8, 3 to 20) |
| Predictor domain | | | | |
| 2.1 | Were predictors defined and assessed in a similar way for all participants? | 23 (100, 100 to 100) | 0 | 0 |
| 2.2 | Were predictor assessments made without knowledge of outcome data? | 22 (96, 87 to 100) | 1 (4, 4 to 13) | 0 |
| 2.3 | Are all predictors available at the time the model is intended to be used? | 22 (96, 87 to 100) | 1 (4, 4 to 13) | 0 |
| Outcome domain | | | | |
| 3.1 | Was the outcome determined appropriately? | 16 (70, 51 to 89) | 7 (30, 12 to 49) | 0 |
| 3.3 | Were predictors excluded from the outcome definition? | 22 (96, 87 to 100) | 1 (4, 4 to 13) | 0 |
| 3.4 | Was the outcome defined and determined in a similar way for all participants? | 21 (91, 80 to 100) | 0 | 2 (9, 3 to 20) |
| 3.5 | Was the outcome determined without knowledge of predictor information? | 22 (96, 87 to 100) | 1 (4, 4 to 13) | 0 |
| 3.6 | Was the time interval between predictor assessment and outcome determination appropriate? | 23 (100, 100 to 100) | 0 | 0 |
| Analysis domain | | | | |
| 4.1 | Were there a reasonable number of participants with the outcome? | 8 (35, 15 to 54) | 12 (52, 32 to 73) | 3 (13, 1 to 27) |
| 4.3 | Were all enrolled participants included in the analysis? | 23 (100, 100 to 100) | 0 | 0 |
| 4.4 | Were participants with missing data handled appropriately? | 3 (13, 1 to 27) | 20 (87, 73 to 100) | 0 |
| 4.5 | Was selection of predictors based on univariable analysis avoided? | 11 (48, 27 to 68) | 8 (35, 15 to 54) | 4 (17, 2 to 33) |
| 4.7 | Were relevant model performance measures evaluated appropriately? | 21 (91, 80 to 100) | 2 (9, 3 to 20) | 0 |
| 4.8 | Were model overfitting and optimism in model performance accounted for? | 19 (90, 78 to 100) | 2 (10, 3 to 22) | 0 |
Signaling questions 3.2, 4.2, 4.6, and 4.9 were not included (Table S3). The risk of bias judgment for each domain was based on the answers to its signaling questions. If all signaling questions were answered yes or probably yes, the domain was judged as low risk of bias. If the reported information was insufficient to answer a signaling question, that question was judged as no information. If any signaling question was answered no or probably no, the domain was judged as high risk of bias. If more than half of the signaling questions were judged as no information, the domain was judged as high risk of bias; otherwise, the domain was judged as unclear risk of bias. After judging all the domains, we performed an overall assessment for each application of PROBAST. The tool recommends rating a study as low risk of bias if all domains had low risk of bias. If at least one domain had a high risk of bias, the overall judgment was high risk of bias. If the risk of bias was unclear in at least one domain and all other domains had a low risk of bias, an unclear risk of bias was assigned.
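The domain-level and overall judgment rules described above form a small decision procedure. A minimal sketch in Python, assuming answers are coded as the strings "yes", "no", and "no information" (with "probably yes"/"probably no" collapsed into "yes"/"no"); the function names are illustrative, not from the article:

```python
def judge_domain(answers):
    """PROBAST domain-level risk-of-bias judgment, per the rules above.

    answers: list of strings in {"yes", "no", "no information"},
    one per signaling question in the domain.
    """
    if any(a == "no" for a in answers):
        return "high"        # any no / probably no -> high risk of bias
    if all(a == "yes" for a in answers):
        return "low"         # all yes / probably yes -> low risk of bias
    n_noinfo = sum(a == "no information" for a in answers)
    # more than half 'no information' -> high risk; otherwise unclear
    return "high" if n_noinfo > len(answers) / 2 else "unclear"

def judge_overall(domains):
    """Overall PROBAST judgment from the per-domain judgments."""
    if any(d == "high" for d in domains):
        return "high"        # at least one high-risk domain -> high risk
    if all(d == "low" for d in domains):
        return "low"         # all domains low risk -> low risk
    return "unclear"         # otherwise unclear risk of bias
```

For example, a study with one unanswerable signaling question in an otherwise clean analysis domain would receive an unclear domain judgment, and therefore at best an unclear overall rating.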