Paola Berchialla, Corrado Lanera, Veronica Sciannameo, Dario Gregori, Ileana Baldi.
Abstract
A central problem in most data-driven personalized medicine scenarios is the estimation of heterogeneous treatment effects to stratify individuals into subpopulations that differ in their susceptibility to a particular disease or response to a specific treatment. In this work, with an illustrative example on type 2 diabetes, we showed how the increasing ability to access and analyze open data from randomized clinical trials (RCTs) makes it possible to build machine learning applications in a framework of personalized medicine. An ensemble machine learning predictive model is first developed and then applied to estimate the expected treatment response according to the medication that would be prescribed. Machine learning is quickly becoming indispensable to bridge science and clinical practice, but it is not sufficient on its own. A collaborative effort among clinicians, statisticians, and computer scientists is required to strengthen tools built on machine learning and take advantage of this evidence flow.
Year: 2022 PMID: 35260665 PMCID: PMC8904517 DOI: 10.1038/s41598-022-07801-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Base learners and super learner (SL) training performance.
| Algorithms | R implementation | Risk (1-AUC) | Weight |
|---|---|---|---|
| BART | bartMachine | 0.374 | 0.012 |
| BART on variables selected by RF | bartMachine_screen.randomForest | 0.456 | 0.012 |
| Random Forest | Caret | 0.452 | 0.012 |
| Random Forest on variables selected by RF | Caret_screen.randomForest | 0.468 | 0.012 |
| Recursive Partitioning and Regression Trees | Rpart | 0.592 | 0.012 |
| Recursive Partitioning and Regression Trees on variables selected by RF | Rpart_screen.randomForest | 0.592 | 0.012 |
| Bagging | ipredbagg | 0.610 | 0.012 |
| Bagging on variables selected by RF | ipredbagg_screen.randomForest | 0.582 | 0.012 |
| Kernel SVM | KSVM | 0.477 | 0.012 |
| Kernel SVM on variables selected by RF | KSVM_screen.randomForest | 0.646 | 0.012 |
| Elastic Net regularized GLM on variables selected by RF | GLMNET_screen.randomForest | 0.549 | 0.012 |
| Logistic model | GLM | 0.379 | 0.012 |
| Logistic model on variables selected by RF | GLM_screen.randomForest | 0.428 | 0.012 |
| Not adjusted logistic model | GLM | 0.546 | 0.012 |
| Not adjusted logistic model on variables selected by RF | GLM_screen.randomForest | 0.546 | 0.012 |
| Multivariate adaptive polynomial regression splines | Polymars | 0.592 | 0.012 |
| Multivariate adaptive polynomial regression splines on variables selected by RF | Polymars_screen.randomForest | 0.546 | 0.012 |
| Super Learner | | 0.079 | |
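The risk reported in the table above is 1 − AUC, and a super learner typically combines base-learner predictions through a weighted combination. The following is a minimal Python sketch of those two ideas only, not the R pipeline listed in the table; the function names, the toy labels and scores, and the equal weights are all hypothetical and chosen purely for illustration.

```python
from itertools import product


def auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def risk(labels, scores):
    """Risk as used in the table: 1 - AUC (lower is better)."""
    return 1.0 - auc(labels, scores)


def convex_combination(predictions, weights):
    """Weighted average of base-learner predictions.
    Weights are normalised so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    n = len(predictions[0])
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n)
    ]


# Hypothetical toy data: 1 = responder, 0 = non-responder.
labels = [1, 1, 0, 0, 1, 0]
learner_a = [0.9, 0.4, 0.5, 0.2, 0.7, 0.6]
learner_b = [0.6, 0.8, 0.3, 0.4, 0.5, 0.55]

# Equal weights are an assumption; a real super learner estimates them
# from cross-validated predictions.
ensemble = convex_combination([learner_a, learner_b], [0.5, 0.5])

for name, scores in [("A", learner_a), ("B", learner_b), ("ensemble", ensemble)]:
    print(name, round(risk(labels, scores), 3))
```

In this toy case the two learners misrank different patients, so their average has a lower risk than either one alone, which is the motivation for combining learners in the first place.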
SAIS1 and PROLOGUE patients’ characteristics.
| | SAIS1: Sitagliptin (N = 43) | PROLOGUE: Conventional (N = 192) | PROLOGUE: Sitagliptin (N = 193) | PROLOGUE: Combined (N = 385) |
|---|---|---|---|---|
| Age | 55.50/61.00/67.00 | 64.00/70.00/76.00 | 64.00/70.00/75.25 | 64.00/70.00/76.00 |
| Male | 67% (29) | 69% (134) | 66% (126) | 68% (260) |
| BMI | 22.85/25.20/29.15 | 22.32/24.39/27.05 | 22.63/24.93/27.42 | 22.42/24.62/27.08 |
| Hypertension | 60% (26) | 52% (100) | 56% (108) | 54% (208) |
| Dyslipidemia | 53% (23) | 49% (94) | 41% (79) | 45% (173) |
| Adiponectin | 1.40/2.20/3.05 | 1.93/3.64/6.24 | 2.03/3.53/5.22 | 1.98/3.53/5.67 |
| SBP | 118/129/139 | 117.00/128.00/140.00 | 120.00/130.00/138.25 | 118.00/129.00/140.00 |
| DBP | 71.5/78/84.5 | 64.00/72.00/80.00 | 64.75/73.00/80.00 | 64.00/72.00/80.00 |
| HbA1c (%) | 7.1/7.4/7.8 | 6.6/6.9/7.2 | 6.5/6.8/7.2 | 6.5/6.8/7.2 |
| FPG | 7.3/7.9/8.8 | 6.16/6.94/8.21 | 6.15/6.94/8.38 | 6.16/6.94/8.27 |
| LDL | 88.5/100/117.5 | 77.40/89.40/112.20 | 77.53/94.90/109.30 | 77.40/92.40/111.00 |
| ΔHbA1c | −1.15/−0.7/−0.3 | −0.3/−0.5/0.00 | −0.7/−0.35/−0.2 | −0.6/−0.3/−0.10 |
Continuous variables are reported as 1st quartile/median/3rd quartile; categorical variables are reported as percentage (frequency).
Body Mass Index (BMI, kg/m²); hypertension (SBP ≥ 130 mmHg OR DBP ≥ 80 mmHg); dyslipidemia (LDL ≥ 130 mg/dl OR HDL < 35 mg/dl OR triglyceride ≥ 150 mg/dl OR total cholesterol (= LDL + HDL + (Triglyceride/5)) ≥ 200 mg/dl); adiponectin (mg/l); Systolic Blood Pressure (SBP, mmHg); Diastolic Blood Pressure (DBP, mmHg); HbA1c (%); Fasting Plasma Glucose (FPG, mmol/l); Low-Density Lipoprotein (LDL, mg/dl); ΔHbA1c: reduction of glycated haemoglobin (HbA1c) at 6 months for SAIS1 and at 12 months for PROLOGUE.
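The hypertension and dyslipidemia definitions in the footnote above are simple threshold rules, so they can be written out directly. This is a minimal Python sketch under the stated thresholds; the function and parameter names are hypothetical, and inputs are assumed to be in the units given in the footnote (mmHg and mg/dl).

```python
def has_hypertension(sbp, dbp):
    """Hypertension: SBP >= 130 mmHg OR DBP >= 80 mmHg."""
    return sbp >= 130 or dbp >= 80


def total_cholesterol(ldl, hdl, triglyceride):
    """Total cholesterol (mg/dl) as LDL + HDL + triglyceride / 5."""
    return ldl + hdl + triglyceride / 5


def has_dyslipidemia(ldl, hdl, triglyceride):
    """Dyslipidemia: any one of the four lipid criteria is met."""
    return (
        ldl >= 130
        or hdl < 35
        or triglyceride >= 150
        or total_cholesterol(ldl, hdl, triglyceride) >= 200
    )


# Hypothetical patient values, for illustration only.
print(has_hypertension(sbp=129, dbp=78))                    # neither threshold reached
print(has_dyslipidemia(ldl=100, hdl=50, triglyceride=100))  # total cholesterol = 170
```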
Outcome results in SAIS1 and PROLOGUE Study.
| SAIS1 Study | Glimepiride (N = 49) | Sitagliptin (N = 43) | p-value |
|---|---|---|---|
| | 69% (34) | 70% (30) | 0.969 |
| ΔHbA1c | −0.90/−0.70/−0.40 | −1.150/−0.700/−0.30 | 0.316 |
ΔHbA1c: reduction of glycated haemoglobin (HbA1c) at 6 months for SAIS1 and at 12 months for PROLOGUE.
Figure 1. Treatment effect and 95% confidence intervals obtained by varying the probability (cut-off level) that defines the patients most responsive to the DPP-4 inhibitor Sitagliptin. The number of responsive patients is reported below each cut-off level.
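The idea behind this figure can be sketched as follows: keep only the patients whose predicted probability of response reaches a cut-off, then compare the two arms within that subgroup. The Python sketch below is an illustration under explicit assumptions, not the authors' analysis: the treatment effect is assumed to be the difference in mean HbA1c change between arms, the interval is a normal-approximation 95% CI, and the field names and toy records are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def effect_at_cutoff(patients, cutoff, z=1.96):
    """Difference in mean HbA1c change (sitagliptin minus conventional)
    among patients with predicted probability >= cutoff.

    Returns None when either arm has fewer than two selected patients,
    because a sample variance cannot be computed in that case.
    """
    selected = [p for p in patients if p["prob"] >= cutoff]
    treated = [p["delta"] for p in selected if p["arm"] == "sitagliptin"]
    control = [p["delta"] for p in selected if p["arm"] == "conventional"]
    if len(treated) < 2 or len(control) < 2:
        return None
    diff = mean(treated) - mean(control)
    # Standard error of a difference in means with unequal variances.
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return {"n": len(selected), "effect": diff, "ci": (diff - z * se, diff + z * se)}


# Hypothetical toy records: "prob" is the predicted probability of response,
# "delta" the observed HbA1c change.
patients = [
    {"arm": "sitagliptin", "prob": 0.90, "delta": -1.0},
    {"arm": "sitagliptin", "prob": 0.80, "delta": -0.8},
    {"arm": "sitagliptin", "prob": 0.30, "delta": -0.2},
    {"arm": "conventional", "prob": 0.85, "delta": -0.4},
    {"arm": "conventional", "prob": 0.70, "delta": -0.2},
    {"arm": "conventional", "prob": 0.20, "delta": -0.1},
]

for cutoff in (0.0, 0.5, 0.95):
    print(cutoff, effect_at_cutoff(patients, cutoff))
```

Raising the cut-off shrinks the subgroup, so the interval generally widens even when the point estimate moves in favour of the treatment, which is why the subgroup size is worth reporting alongside each estimate.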
Figure 2. Distribution of the HbA1c difference at 12 months (Δ0–12 HbA1c) among patients targeted as best responsive in both the conventional and Sitagliptin arms, according to increasing levels (from bottom to top) of the predicted probability of lowering HbA1c by more than 0.5% (red dashed line). At level 0, the distributions of Δ0–12 HbA1c are based on the entire RCT sample.