Joe Alexander, Roger A Edwards, Marina Brodsky, Luigi Manca, Roberto Grugni, Alberto Savoldelli, Gianluca Bonfanti, Birol Emir, Ed Whalen, Steve Watt, Bruce Parsons.
Abstract
Prior work applied hierarchical clustering, coarsened exact matching (CEM), time series regressions with lagged variables as inputs, and microsimulation to data from three randomized clinical trials (RCTs) and a large German observational study (OS) to predict pregabalin pain reduction outcomes for patients with painful diabetic peripheral neuropathy. Here, data were added from six RCTs to reduce covariate bias of the same OS and improve accuracy and/or increase the variety of patients for pain response prediction. Using hierarchical cluster analysis and CEM, a matched dataset was created from the OS (N = 2642) and nine total RCTs (N = 1320). Using a maximum likelihood method, we estimated weekly pain scores for pregabalin-treated patients for each cluster (matched dataset); the models were validated with RCT data that did not match with OS data. We predicted novel 'virtual' patient pain scores over time using simulations including instance-based machine learning techniques to assign novel patients to a cluster, then applying cluster-specific regressions to predict pain response trajectories. Six clusters were identified according to baseline variables (gender, age, insulin use, body mass index, depression history, pregabalin monotherapy, prior gabapentin, pain score, and pain-related sleep interference score). CEM yielded 1766 patients (matched dataset) having lower covariate imbalances. Regression models for pain performed well (adjusted R-squared 0.90-0.93; root mean square errors 0.41-0.48). Simulations showed positive predictive values for achieving >50% and >30% change-from-baseline pain score improvements (range 68.6-83.8% and 86.5-93.9%, respectively). Using more RCTs (nine vs. the earlier three) enabled matching of 46.7% more patients in the OS dataset, with substantially reduced global imbalance vs. not matching. This larger RCT pool covered 66.8% of possible patient characteristic combinations (vs. 25.0% with three original RCTs) and made prediction possible for a broader spectrum of patients. Trial Registration: www.clinicaltrials.gov (as applicable): NCT00156078, NCT00159679, NCT00143156, NCT00553475.
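The matching step named in the abstract can be illustrated with a minimal sketch of coarsened exact matching: each covariate is coarsened into bins, patients sharing the same combination of bins form a stratum, and only strata containing patients from both sources are kept. The field names (`sex`, `age`, `pain`, `insulin`), the choice of four covariates, and the function names are hypothetical; the age and pain bins are borrowed from the categories in the baseline table. This is not the authors' implementation.

```python
from collections import defaultdict


def coarsen(patient):
    """Map a patient's covariates to a coarse stratum key (illustrative bins)."""
    age = patient["age"]
    if age < 45:
        age_bin = "0-44"
    elif age < 65:
        age_bin = "45-64"
    elif age < 75:
        age_bin = "65-74"
    else:
        age_bin = "75+"
    pain = patient["pain"]
    pain_bin = "mild" if pain < 4 else "moderate" if pain < 7 else "severe"
    return (patient["sex"], age_bin, pain_bin, patient["insulin"])


def cem_match(os_patients, rct_patients):
    """Keep only strata that contain at least one OS and one RCT patient."""
    strata = defaultdict(lambda: {"OS": [], "RCT": []})
    for p in os_patients:
        strata[coarsen(p)]["OS"].append(p)
    for p in rct_patients:
        strata[coarsen(p)]["RCT"].append(p)
    return {key: grp for key, grp in strata.items() if grp["OS"] and grp["RCT"]}
```

Patients falling in strata that lack a counterpart from the other source are left unmatched, which is how matching reduces covariate imbalance at the cost of sample size.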
Year: 2018 PMID: 30521533 PMCID: PMC6283469 DOI: 10.1371/journal.pone.0207120
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of patients from RCTs included in virtual Lab 2.0 by maintenance dose.
| Pregabalin dose | 5/6 Weeks Studies, n | 5/6 Weeks Studies, % of total | 12/13 Weeks Studies, n | 12/13 Weeks Studies, % of total | Total RCT Patients, n | Total RCT Patients, % of total |
|---|---|---|---|---|---|---|
| Flexible dose | 0 | 0.0 | 83 | 6.3 | 83 | 6.3 |
| Flexible adjusted dose | 0 | 0.0 | 193 | 14.6 | 193 | 14.6 |
| 75 mg/day | 59 | 4.5 | 0 | 0.0 | 59 | 4.5 |
| 150 mg/day | 69 | 5.2 | 74 | 5.6 | 143 | 10.8 |
| 300 mg/day | 124 | 9.4 | 297 | 22.5 | 421 | 31.9 |
| 600 mg/day | 129 | 9.8 | 292 | 22.1 | 421 | 31.9 |
n, number of patients; RCT, randomized controlled trial.
a Patients with 1–4 weeks escalation phase and 8–11 weeks maintenance (Protocol 1008–155).
b Patients with 6 weeks escalation phase and 6 weeks maintenance (Protocol A0081030).
Baseline patient characteristics from calibration dataset (N = 1766), by cluster.
| Characteristic | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Total |
|---|---|---|---|---|---|---|---|
| Patients, n | 431 | 189 | 437 | 266 | 127 | 316 | 1766 |
| Females (%) | 0.0 | 47.1 | 38.0 | 100.0 | 26.8 | 41.8 | 38.9 |
| Age (years), mean (SD) | 60.2 (9.3) | 62.9 (8.6) | 62.9 (8.5) | 62.2 (8.5) | 61.3 (9.7) | 63.7 (8.3) | 62.2 (8.9) |
| Age group (years), % | |||||||
| 0–44 | 4.8 | 0.5 | 0.9 | 1.9 | 4.7 | 1.3 | 2.3 |
| 45–64 | 63.6 | 54.5 | 53.3 | 59.8 | 60.6 | 53.5 | 57.5 |
| 65–74 | 27.4 | 37.1 | 38.4 | 32.7 | 26.8 | 37.3 | 33.7 |
| 75+ | 4.2 | 7.9 | 7.3 | 5.6 | 7.9 | 7.9 | 6.5 |
| BMI (kg/m2) | |||||||
| Mean (SD) | 28.1 (3.5) | 29.5 (4.5) | 29.7 (4.6) | 28.9 (4.6) | 28.5 (4.4) | 28.4 (3.8) | 28.8 (4.2) |
| Normal (%) | 13.5 | 13.2 | 10.8 | 17.7 | 13.4 | 14.8 | 13.6 |
| Overweight (%) | 64.5 | 47.1 | 49.2 | 47.7 | 59.1 | 58.2 | 54.8 |
| Obese (%) | 22.0 | 39.7 | 40.0 | 34.6 | 27.6 | 27.0 | 31.5 |
| Baseline pain score | |||||||
| Mean (SD) | 6.3 (1.3) | 7.1 (1.3) | 6.6 (1.4) | 6.4 (1.3) | 6.3 (1.4) | 6.3 (1.3) | 6.5 (1.4) |
| Pain severity category (%) | |||||||
| Mild (0–3) | - | - | - | - | - | - | - |
| Moderate (4–6) | 55.4 | 31.7 | 46.0 | 50.0 | 54.3 | 56.3 | 49.8 |
| Severe (7–10) | 44.6 | 68.3 | 54.0 | 50.0 | 45.7 | 43.7 | 50.2 |
| Baseline PRSI score | |||||||
| Mean (SD) | 5.3 (2.2) | 6.8 (1.9) | 5.9 (2.2) | 5.8 (1.9) | 5.6 (2.1) | 5.5 (2.0) | 5.7 (2.1) |
| PRSI category | |||||||
| Mild (0–3) | 21.1 | 6.9 | 13.9 | 12.1 | 20.5 | 15.5 | 15.4 |
| Moderate (4–6) | 43.4 | 30.2 | 41.7 | 50.0 | 44.9 | 51.6 | 44.1 |
| Severe (7–10) | 35.5 | 62.9 | 44.4 | 37.9 | 34.6 | 32.9 | 40.5 |
| Duration of pDPN (years), % | |||||||
| 0 to ≤5 | 28.8 | 15.9 | 21.3 | 25.9 | 24.4 | 24.4 | 24.0 |
| >5 to ≤10 | 23.2 | 21.7 | 23.6 | 21.8 | 24.4 | 22.2 | 22.8 |
| >10 to ≤15 | 25.5 | 24.9 | 24.0 | 27.1 | 22.8 | 18.3 | 23.8 |
| >15 to ≤20 | 11.1 | 15.3 | 12.8 | 15.4 | 10.2 | 18.7 | 13.9 |
| >20 to ≤25 | 3.7 | 5.8 | 5.5 | 2.6 | 7.1 | 6.0 | 4.9 |
| >25 | 7.7 | 16.4 | 12.8 | 7.1 | 11.0 | 10.4 | 10.5 |
| Past or current medical history of depression (%) | 0.0 | 100.0 | 2.1 | 0.0 | 2.4 | 0.0 | 11.4 |
| Prior or current therapy (%) | |||||||
| Pregabalin monotherapy | 100.0 | 34.9 | 59.9 | 100.0 | 64.6 | 0.0 | 62.7 |
| Gabapentin | 0.0 | 11.1 | 2.8 | 0.0 | 100.0 | 1.0 | 9.2 |
| Insulin | 0.0 | 34.9 | 99.8 | 1.5 | 48.8 | 7.9 | 33.6 |
| Full of energy at baseline (%) | |||||||
| Always | 1.2 | 1.1 | 0.7 | 1.1 | 0.0 | 0.6 | 0.8 |
| Mostly | 7.7 | 1.1 | 5.3 | 3.4 | 3.9 | 4.4 | 4.9 |
| Fairly often | 13.5 | 2.1 | 7.1 | 10.5 | 13.4 | 7.9 | 9.2 |
| Sometimes | 29.9 | 17.5 | 29.5 | 31.2 | 25.9 | 20.9 | 26.8 |
| Seldom | 40.6 | 49.7 | 43.3 | 48.1 | 43.3 | 57.9 | 46.7 |
| Never | 7.2 | 28.6 | 14.2 | 5.6 | 13.4 | 8.2 | 11.6 |
| Calm and relaxed at baseline (%) | |||||||
| Always | 2.1 | 2.1 | 2.9 | 0.8 | 0.8 | 1.3 | 1.9 |
| Mostly | 15.3 | 4.2 | 11.2 | 13.2 | 12.6 | 13.3 | 12.2 |
| Fairly often | 16.9 | 9.0 | 16.5 | 15.8 | 20.5 | 13.6 | 15.5 |
| Sometimes | 32.2 | 20.1 | 29.9 | 28.9 | 29.9 | 31.9 | 29.7 |
| Seldom | 31.8 | 47.6 | 33.9 | 38.4 | 29.1 | 36.7 | 35.7 |
| Never | 1.6 | 16.9 | 5.5 | 3.0 | 7.1 | 3.2 | 5.1 |
| Sad and discouraged at baseline (%) | |||||||
| Always | 1.6 | 6.4 | 2.9 | 1.9 | 3.9 | 1.6 | 2.7 |
| Mostly | 15.3 | 35.9 | 17.6 | 15.4 | 15.8 | 15.5 | 18.2 |
| Fairly often | 28.3 | 33.3 | 27.0 | 26.3 | 26.8 | 31.0 | 28.6 |
| Sometimes | 29.3 | 15.3 | 29.5 | 30.8 | 26.8 | 33.2 | 28.6 |
| Seldom | 20.2 | 8.5 | 16.3 | 21.8 | 22.1 | 15.5 | 17.5 |
| Never | 5.3 | 0.5 | 6.6 | 3.8 | 4.7 | 3.2 | 4.5 |
| Pain responders at 50% threshold at endpoint (%) | 87.2 | 72.5 | 79.6 | 78.6 | 77.9 | 79.4 | 80.4 |
| Daily treatment dose (mg) | |||||||
| 75 | 3.5 | 3.7 | 4.8 | 4.5 | 2.4 | 6.0 | 4.4 |
| 150 | 38.3 | 31.2 | 35.7 | 38.7 | 31.5 | 34.2 | 35.7 |
| 300 | 53.8 | 59.8 | 54.7 | 52.6 | 61.4 | 55.4 | 55.3 |
| 600 | 4.4 | 5.3 | 4.8 | 4.1 | 4.7 | 4.4 | 4.4 |
| Other | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
BMI, body mass index; pDPN, painful diabetic peripheral neuropathy; PRSI, pain-related sleep interference; RCT, randomized controlled trial; SD, standard deviation.
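The abstract mentions instance-based machine learning for assigning a novel patient to a cluster. One common instance-based method is k-nearest neighbours, sketched below over the nine baseline variables used for clustering. The choice of k-NN, the mixed distance, the scaling constants, and all field names are assumptions made for illustration; the source does not say which technique or distance was used.

```python
from collections import Counter

# Hypothetical field names for the nine baseline variables.
CATEGORICAL = ("female", "insulin", "depression", "monotherapy", "prior_gabapentin")
# Illustrative scales so that numeric gaps are comparable to one categorical mismatch.
NUMERIC_SCALE = {"age": 10.0, "bmi": 5.0, "pain": 1.5, "prsi": 2.0}


def distance(a, b):
    """Mixed distance: count of categorical mismatches plus scaled numeric gaps."""
    mismatches = sum(a[k] != b[k] for k in CATEGORICAL)
    gaps = sum(abs(a[k] - b[k]) / s for k, s in NUMERIC_SCALE.items())
    return mismatches + gaps


def assign_cluster(new_patient, reference, k=3):
    """Return the most frequent cluster label among the k nearest reference patients."""
    nearest = sorted(reference, key=lambda r: distance(new_patient, r["x"]))[:k]
    votes = Counter(r["cluster"] for r in nearest)
    return votes.most_common(1)[0][0]
```

Here `reference` is a list of already clustered patients, each a dict with a `cluster` label and a feature dict `x`. With a scheme like this, a male patient on pregabalin monotherapy with no insulin use and no depression history would tend to land near reference patients from cluster 1, whose profile in the table above has those characteristics.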
Fig 1. Simulation steps.
OS, observational study; PDF, probability density function; RCT, randomized controlled trial.
Statistical comparison of clusters within the matched dataset, within the validation dataset, and between the matched and validation datasets.
| Variable | Across Clusters Within Calibration Dataset (# of Unique Clusters out of 15 Pairwise Comparisons) | Across Clusters Within Validation Dataset (# of Unique Clusters out of 15 Pairwise Comparisons) | Calibration vs. Validation Dataset (# of Unique Clusters out of 36 Pairwise Comparisons) |
|---|---|---|---|
| Gender | 13 of 15 (87%) | 9 of 15 (60%) | 30 of 36 (83%) |
| Age group | 6 of 15 (40%) | 5 of 15 (33%) | 36 of 36 (100%) |
| BMI | 8 of 15 (53%) | 6 of 15 (40%) | 29 of 36 (81%) |
| Insulin use | 15 of 15 (100%) | 13 of 15 (87%) | 27 of 36 (75%) |
| Past or current medical history of depression | 11 of 15 (73%) | 6 of 15 (40%) | 13 of 36 (36%) |
| Prior gabapentin | 11 of 15 (73%) | 10 of 15 (67%) | 23 of 36 (64%) |
| Pregabalin monotherapy | 13 of 15 (87%) | 13 of 15 (87%) | 26 of 36 (72%) |
| PRSI score at baseline | 9 of 15 (60%) | 5 of 15 (33%) | 36 of 36 (100%) |
| Pain score at baseline | 7 of 15 (47%) | 2 of 15 (13%) | 14 of 36 (39%) |
| Dose | 0 of 15 (0%) | 0 of 15 (0%) | 2 of 36 (6%) |
BMI, body mass index; PRSI, pain-related sleep interference.
a Each pairwise comparison of one variable in one cluster to the same variable in another cluster was evaluated using Fisher’s exact test. The numbers of significant P-values (P < 0.05) were tallied, and those counts are shown in the table.
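The comparison counts in the table follow from the cluster structure: six clusters give 15 unordered pairs within a dataset, and 36 pairs are consistent with comparing each of six calibration clusters to each of six validation clusters (the latter reading is an inference, not stated in the source). The sketch below reproduces those counts and implements a two-sided Fisher's exact test for a 2×2 table using only the standard library. The 2×2 restriction is a simplification; variables with more than two categories would need the general r×c form. The example counts are approximations reconstructed from the percentages in the baseline table.

```python
from itertools import combinations, product
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))


clusters = range(1, 7)
within_pairs = list(combinations(clusters, 2))        # 15 unordered pairs
between_pairs = list(product(clusters, repeat=2))     # 36 calibration x validation pairs

# Females vs. males in clusters 2 and 3 (approx. 47.1% of 189 and 38.0% of 437).
p_value = fisher_exact_2x2(89, 189 - 89, 166, 437 - 166)
print(len(within_pairs), len(between_pairs), p_value < 0.05)
```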
Regression model input variables and resulting regression coefficients by cluster for the calibration dataset.
| Regression model input variable | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 |
|---|---|---|---|---|---|---|
| y-intercepts for regression models, not variables | - | - | - | - | - | - |
| Age cohort (75+) | - | 0.1039 | - | - | - | - |
| General feeling: calm and relaxed (t = 0) | 0.0354 | 0.0557 | - | 0.0671 | 0.0150 | 0.0107 |
| General feeling: full of energy (t = 0) | - | - | 0.0162 | 0.0043 | 0.0133 | 0.0449 |
| Pain score (t-1) | 0.6229 | 0.7590 | 0.7605 | 0.6294 | 0.6415 | 0.6652 |
| PRSI score (t) | 0.2461 | 0.1417 | 0.2393 | 0.2146 | 0.2234 | 0.1945 |
| PRSI score (t-3) | - | -0.0455 | -0.0707 | - | - | - |
| Dose (t-3) | 0.0002 | 0.0002 | 0.0002 | 0.0003 | - | - |
| Model performance measures applied | Performance, by cluster | | | | | |
| Likelihood ratio | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 |
| | 0.97 | 0.98 | 0.98 | 0.98 | 0.97 | 0.97 |
| Root mean square error | 0.49 | 0.51 | 0.49 | 0.47 | 0.51 | 0.50 |
| Observed vs. estimated responders (Student’s t-test) | 0.55 | 0.64 | 0.46 | 0.90 | 0.33 | 0.83 |
PRSI, pain-related sleep interference.
a The first number in each column is the regression intercept value. Blank spaces in columns indicate that the associated row variable was not a predictor in the final model for that cluster.
b (t-1) indicates 1 week before prediction.
c (t-3) indicates 3 weeks before prediction.
d (t) indicates the same week of the prediction.
e (t = 0) indicates baseline.
f Dummy variables have been introduced for categorical variables. For example, Age cohort (75+) is the dummy variable related to the “75+” value of the Age cohort variable; it means that the corresponding coefficient affects only patients having Age cohort = 75+, but not patients with different values of the Age cohort variable (i.e. 0–44, 45–64, and 65–74).
The regression model inputs were assigned unique variable names, $x_1$–$x_7$, and are represented in the cluster-specific regression equations below:
Equations for the regression models (where ‘y’ is the fitted pain score) both for H.1
CLUSTER 1: $y = 0.6229x_1 + 0.2461x_2 + 0.0002x_4 + 0.0354x_6$
CLUSTER 2: $y = 0.7590x_1 + 0.1417x_2 - 0.0455x_3 + 0.0002x_4 + 0.1039x_5 + 0.0557x_6$
CLUSTER 3: $y = 0.7605x_1 + 0.2393x_2 - 0.0707x_3 + 0.0002x_4 + 0.0162x_7$
CLUSTER 4: $y = 0.6294x_1 + 0.2146x_2 + 0.0003x_4 + 0.0671x_6 + 0.0043x_7$
CLUSTER 5: $y = 0.6415x_1 + 0.2234x_2 + 0.0150x_6 + 0.0133x_7$
CLUSTER 6: $y = 0.6652x_1 + 0.1945x_2 + 0.0107x_6 + 0.0449x_7$
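As a minimal sketch, the cluster-specific equations above can be evaluated in code. The mapping from x1–x7 to named inputs is inferred by matching each coefficient to its row in the coefficient table (x1 pain score at t-1, x2 PRSI score at t, x3 PRSI score at t-3, x4 dose at t-3, x5 age cohort 75+ dummy, x6 calm and relaxed at baseline, x7 full of energy at baseline). The dictionary keys and function name are hypothetical, and the numeric coding of the two general-feeling items is not given in the source, so the values used in the example are placeholders.

```python
# Coefficients copied from the cluster-specific equations; key names are illustrative.
COEFFS = {
    1: {"pain_prev": 0.6229, "prsi_now": 0.2461, "dose_lag": 0.0002, "calm": 0.0354},
    2: {"pain_prev": 0.7590, "prsi_now": 0.1417, "prsi_lag": -0.0455,
        "dose_lag": 0.0002, "age_75_plus": 0.1039, "calm": 0.0557},
    3: {"pain_prev": 0.7605, "prsi_now": 0.2393, "prsi_lag": -0.0707,
        "dose_lag": 0.0002, "energy": 0.0162},
    4: {"pain_prev": 0.6294, "prsi_now": 0.2146, "dose_lag": 0.0003,
        "calm": 0.0671, "energy": 0.0043},
    5: {"pain_prev": 0.6415, "prsi_now": 0.2234, "calm": 0.0150, "energy": 0.0133},
    6: {"pain_prev": 0.6652, "prsi_now": 0.1945, "calm": 0.0107, "energy": 0.0449},
}


def predict_pain(cluster, inputs):
    """Fitted pain score for one week: weighted sum of that cluster's inputs."""
    # Raises KeyError if an input required by the cluster's equation is missing.
    return sum(coef * inputs[name] for name, coef in COEFFS[cluster].items())


# Placeholder inputs: last week's pain 6, current PRSI 5, 300 mg/day, calm item coded 3.
example = {"pain_prev": 6, "prsi_now": 5, "dose_lag": 300, "calm": 3}
print(round(predict_pain(1, example), 4))
```

Because last week's pain score carries the largest coefficient in every cluster, a weekly trajectory would be produced by feeding each fitted value back in as `pain_prev` for the following week.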
g The regressions estimate pain score, but we also wanted to identify whether a given patient is a responder at different thresholds (e.g., 50% or 30% reduction in pain score). Hence, we wanted to confirm estimation of responder level based on the regression for pain score.
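Turning a fitted pain score into a responder flag is a short calculation, sketched here. The function names are hypothetical, and treating a reduction exactly equal to the threshold as a response is an assumption (the abstract writes ">50%" and ">30%").

```python
def percent_reduction(baseline, current):
    """Percentage change from baseline, positive when pain has decreased."""
    if baseline <= 0:
        raise ValueError("baseline pain score must be positive")
    return 100.0 * (baseline - current) / baseline


def is_responder(baseline, current, threshold_pct):
    """True if the reduction from baseline meets the threshold (e.g. 30 or 50)."""
    return percent_reduction(baseline, current) >= threshold_pct
```

For instance, a patient moving from a baseline score of 6.5 to 4.0 has a reduction of about 38%, which counts as a response at the 30% threshold but not at the 50% threshold.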
Fig 2. Plots of observed vs. predicted pain scores and residuals in validation dataset.
RMSE, root mean square error.
PPV and accuracy for best scenario.
| Model Performance Measures Applied | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Overall |
|---|---|---|---|---|---|---|---|
| PPV at 30% pain responder level threshold | 93.9% | 86.5% | 92.5% | 93.4% | 89.6% | 90.0% | 91.6% |
| Accuracy at 30% pain responder level threshold | 93.9% | 86.5% | 92.5% | 93.4% | 89.6% | 90.0% | 91.6% |
| PPV at 50% pain responder level threshold | 81.5% | 68.6% | 78.0% | 83.8% | 68.7% | 75.9% | 77.8% |
| Accuracy at 50% pain responder level threshold | 81.8% | 64.6% | 75.9% | 82.7% | 68.7% | 75.3% | 76.5% |
PPV, positive predictive value.
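For reference, PPV and accuracy can be computed from paired predicted and observed responder flags as sketched below. The function name and the boolean-list input format are assumptions for illustration.

```python
def ppv_and_accuracy(predicted, observed):
    """Return (PPV, accuracy) from two equal-length lists of booleans."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(predicted, observed))
    tp = sum(p and o for p, o in pairs)          # predicted responder, was responder
    fp = sum(p and not o for p, o in pairs)      # predicted responder, was not
    tn = sum(not p and not o for p, o in pairs)  # predicted non-responder, was not
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs)
    return ppv, accuracy
```

The two measures coincide whenever every patient is predicted to be a responder, because accuracy then reduces to TP / (TP + FP). Whether that explains the identical 30% threshold rows in the table is not stated in the source.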
Fig 3. A) Monotonicity results. N = 106 for decreased pain, N = 762 for maintained pain response from weeks 6 to 12–13 (358 responders + 404 non-responders), N = 71 for increased pain. B) ROC curves for monotonicity prediction of pain beyond 6 weeks. Correct prediction based on majority of simulated patient outcomes.
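The majority rule mentioned in the caption can be sketched as follows, assuming each real patient is paired with several simulated patients whose outcome is labelled as decreased, maintained, or increased pain. The labels and the function name are hypothetical, and requiring a strict majority (more than half) is an assumption.

```python
from collections import Counter


def majority_outcome(simulated_outcomes):
    """Return the label held by more than half of the simulations, else None."""
    if not simulated_outcomes:
        raise ValueError("at least one simulated outcome is required")
    label, count = Counter(simulated_outcomes).most_common(1)[0]
    return label if count > len(simulated_outcomes) / 2 else None
```

A prediction would then be scored as correct when the returned label equals the outcome observed for the real patient.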
Fig 4. Simulation output.