Hiraku Kumamaru, Sebastian Schneeweiss, Robert J Glynn, Soko Setoguchi, Joshua J Gagne.
Abstract
BACKGROUND: Multivariable confounder adjustment in comparative studies of newly marketed drugs can be limited by small numbers of exposed patients and even fewer outcomes. Disease risk scores (DRSs) developed in historical comparator drug users before the new drug entered the market may improve adjustment. However, in a high-dimensional data setting, empirically selecting hundreds of potential confounders and modeling the DRS, even in the historical cohort, can lead to over-fitting and reduced predictive performance in the study cohort. We propose combining dimension reduction and shrinkage methods to overcome this problem. We compared the performance of these modeling strategies for implementing high-dimensional (hd) DRSs from historical data in two empirical examples of newly marketed drugs versus comparator drugs after the new drugs' market entry: dabigatran versus warfarin for the outcome of major hemorrhagic events, and cyclooxygenase-2 inhibitors (coxibs) versus nonselective non-steroidal anti-inflammatory drugs (nsNSAIDs) for gastrointestinal bleeds.
Keywords: Comparative study; Confounding; Disease risk score; High dimensional propensity score; Historical data; Shrinkage
Year: 2016 PMID: 27053942 PMCID: PMC4822311 DOI: 10.1186/s12982-016-0047-x
Source DB: PubMed Journal: Emerg Themes Epidemiol ISSN: 1742-7622
Fig. 1 Patient enrollment and follow-up in the example studies
Baseline characteristics and observed risk of major hemorrhagic events within 180 days of the warfarin and dabigatran initiators in the historical and concurrent cohorts
| Variable | Historical cohort (Oct 1 2008–Sept 30 2010^a): warfarin (N = 10,014) | Concurrent cohort (Oct 1 2010–June 30 2012^a): warfarin (N = 5360) | Concurrent cohort (Oct 1 2010–June 30 2012^a): dabigatran (N = 3874) |
|---|---|---|---|
| Age, mean (SD) | 63.9 (11.5) | 64.6 (11.8) | 61.8 (11.5) |
| Female, n (%) | 2974 (29.7) | 1686 (31.5) | 1040 (26.8) |
| Nursing home stay, n (%) | 376 (3.8) | 282 (5.3) | 3815 (1.5) |
| Num. of medications, mean (SD) | 11.0 (6.4) | 11.2 (6.5) | 10.1 (5.9) |
| Num. of physician visits, mean (SD) | 15.8 (18.9) | 17.1 (20.8) | 12.9 (12.4) |
| At least 1 hospitalization, n (%) | 4911 (49.0) | 2686 (50.1) | 1594 (41.2) |
| Use of proton pump inhibitors, n (%) | 1864 (18.6) | 1092 (20.4) | 689 (17.8) |
| Use of antiplatelets, n (%) | 1155 (11.5) | 611 (11.4) | 399 (10.3) |
| Use of nsNSAIDs, n (%) | 2231 (22.3) | 1105 (20.6) | 842 (21.7) |
| ICH hosp., n (%) | 25 (0.2) | 20 (0.4) | 5 (0.1) |
| GI bleed hosp., n (%) | 69 (0.7) | 40 (0.7) | 16 (0.4) |
| GI bleed, n (%) | 457 (4.6) | 262 (4.9) | 152 (3.9) |
| Peripheral artery disease, n (%) | 1190 (11.9) | 747 (13.9) | 362 (9.3) |
| Anemia, n (%) | 1412 (14.1) | 855 (16.0) | 400 (10.3) |
| Chronic liver disease, n (%) | 217 (2.2) | 136 (2.5) | 89 (2.3) |
| Chronic kidney disease, n (%) | 1524 (15.2) | 1046 (19.5) | 452 (11.7) |
| Alcohol addiction, n (%) | 247 (2.5) | 116 (2.2) | 85 (2.2) |
| Drug abuse, n (%) | 106 (1.1) | 53 (1.0) | 46 (1.2) |
| HAS-BLED mean (SD) | 2.0 (1.5) | 2.1 (1.6) | 1.8 (1.4) |
| Major hemorrhagic event, n (%) | 254 (2.5 %) | 129 (2.4 %) | 49 (1.3 %) |
GI bleed: gastrointestinal bleeding; HAS-BLED: HAS-BLED hemorrhage risk score; hosp.: hospitalizations; Num.: number
^a Enrollment period
Baseline characteristics and observed risk of gastrointestinal bleeds within 180 days among initiators of non-selective nonsteroidal anti-inflammatory drugs and cyclooxygenase-2 inhibitors in the historical and concurrent cohorts
| Variable | Historical cohort (Jan 1 1997–Dec 31 1998^a): nsNSAIDs (N = 28,533) | Concurrent cohort (Jan 1 1999–Dec 31 2001^a): nsNSAIDs (N = 15,930) | Concurrent cohort (Jan 1 1999–Dec 31 2001^a): coxibs (N = 31,875) |
|---|---|---|---|
| Age, mean (SD) | 78.6 (6.8) | 78.9 (6.9) | 80.5 (6.8) |
| White, n (%) | 26,535 (93.0) | 14,450 (90.7) | 30,583 (95.9) |
| Female, n (%) | 24,011 (84.2) | 13,287 (83.4) | 27,812 (87.3) |
| Nursing home stay, n (%) | 1757 (6.2) | 989 (6.2) | 2781 (8.7) |
| Num. of medications, mean (SD) | 9.4 (5.1) | 9.7 (5.2) | 10.5 (5.4) |
| Num. of physician visits, mean (SD) | 9.7 (6.9) | 9.3 (6.7) | 9.8 (6.7) |
| At least 1 hospitalization, n (%) | 8514 (29.8) | 2267 (28.0) | 10,497 (32.9) |
| Use of warfarin, n (%) | 2148 (7.5) | 1212 (7.6) | 4509 (14.1) |
| Use of gastroprotective drugs, n (%) | 9059 (31.7) | 4560 (28.6) | 12,238 (38.4) |
| Use of corticosteroids, n (%) | 2445 (8.6) | 1435 (9.0) | 3265 (10.2) |
| Use of clopidogrel, n (%) | 671 (2.4) | 785 (4.9) | 2148 (6.7) |
| Rheumatoid arthritis, n (%) | 1215 (4.3) | 519 (3.3) | 1721 (5.4) |
| Congestive heart failure, n (%) | 5892 (20.6) | 3295 (20.7) | 7864 (24.7) |
| Osteoarthritis, n (%) | 13,044 (45.7) | 6645 (41.7) | 17,812 (55.9) |
| Peripheral vascular disease, n (%) | 5173 (18.1) | 3192 (20.0) | 7387 (23.2) |
| GI bleeding, n (%) | 1352 (4.7) | 788 (4.9) | 1972 (6.2) |
| Chronic kidney diseases, n (%) | 965 (3.4) | 684 (4.3) | 1275 (4.0) |
| Carotid artery disease, n (%) | 11,551 (40.5) | 6460 (40.6) | 13,944 (43.8) |
| Combined comorbidity score >2, n (%) | 7266 (25.5) | 4170 (26.2) | 10,051 (31.5) |
| GI bleed, n (%) | 201 (0.7 %) | 87 (0.6 %) | 189 (0.6 %) |
Coxibs: cyclooxygenase-2 inhibitors; GI bleed: gastrointestinal bleeding; nsNSAIDs: non-selective nonsteroidal anti-inflammatory drugs; Num.: number
^a Enrollment period
Predictive performance of the disease risk score (DRS) models in the warfarin versus dabigatran historical and concurrent cohorts
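| Num. | Model data component and dimension reduction/shrinkage method | EPV^b | Historical cohort, warfarin initiators: c-stat | Historical cohort, warfarin initiators: 95 % CI | Historical cohort, warfarin initiators: x-valid. c-stat | Concurrent cohort, warfarin initiators: c-stat | Concurrent cohort, warfarin initiators: 95 % CI | Concurrent cohort, warfarin initiators: HL (P value) | Concurrent cohort, dabigatran initiators: c-stat | Concurrent cohort, dabigatran initiators: 95 % CI | Concurrent cohort, dabigatran initiators: HL (P value) |
|---|---|---|---|---|---|---|---|---|---|---|---|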
| 1 | Demo (age + sex) | 127.0 | 0.58 | 0.55, 0.62 | 0.59 | 0.59 | 0.54, 0.63 | 2.7 (0.95) | 0.68 | 0.61, 0.75 | 28.1 (<0.01) |
| 2 | Demo + score^a | 84.7 | 0.61 | 0.57, 0.64 | 0.60 | 0.62 | 0.57, 0.67 | 5.6 (0.69) | 0.70 | 0.63, 0.77 | 24.6 (<0.01) |
| 3 | Demo + score + predef. | 16.9 | 0.64 | 0.60, 0.67 | 0.61 | 0.61 | 0.56, 0.66 | 7.5 (0.48) | 0.69 | 0.63, 0.76 | 23.0 (<0.01) |
| 4 | Demo + cov500 | 0.5 | 0.86 | 0.84, 0.89 | 0.61 | 0.54 | 0.49, 0.60 | 384 (<0.01) | 0.56 | 0.48, 0.64 | 126 (<0.01) |
| 5 | Demo + PCA(10) | 21.2 | 0.67 | 0.64, 0.71 | 0.65 | 0.66 | 0.61, 0.71 | 5.2 (0.73) | 0.68 | 0.61, 0.75 | 27.0 (<0.01) |
| 6 | Demo + PCA(30) | 7.9 | 0.69 | 0.65, 0.72 | 0.63 | 0.63 | 0.59, 0.68 | 10.2 (0.25) | 0.64 | 0.56, 0.72 | 18.5 (0.02) |
| 7 | Demo + score + PCA(10) | 19.5 | 0.67 | 0.64, 0.71 | 0.64 | 0.64 | 0.59, 0.69 | 9.8 (0.28) | 0.67 | 0.60, 0.73 | 18.1 (0.02) |
| 8 | Demo + score + PCA(30) | 7.7 | 0.69 | 0.65, 0.72 | 0.63 | 0.62 | 0.57, 0.67 | 10.1 (0.26) | 0.64 | 0.56, 0.72 | 17.6 (0.02) |
| 9 | Demo + predef. + score + PCA(10) | 10.2 | 0.69 | 0.66, 0.72 | 0.64 | 0.64 | 0.59, 0.69 | 16.0 (0.04) | 0.66 | 0.59, 0.73 | 20.3 (0.01) |
| 10 | Demo + predef. + score + PCA(30) | 5.6 | 0.71 | 0.67, 0.74 | 0.63 | 0.63 | 0.58, 0.68 | 21.1 (0.01) | 0.66 | 0.59, 0.73 | 18.9 (0.02) |
| 11 | Ridge (Demo + predef. + score + cov500) | 0.5 | 0.83 | 0.81, 0.86 | 0.69 | 0.63 | 0.59, 0.68 | 12.3 (0.14) | 0.63 | 0.55, 0.71 | 19.5 (0.01) |
| 12 | Ridge (Demo + predef. + score + PCA(30)) | 5.6 | 0.71 | 0.68, 0.74 | 0.65 | 0.64 | 0.58, 0.68 | 14.9 (0.06) | 0.67 | 0.59, 0.75 | 25.5 (<0.01) |
| 13 | Lasso (Demo + predef. + score + cov500) | 0.5 | 0.72 | 0.69, 0.75 | 0.63 | 0.64 | 0.59, 0.68 | 10.8 (0.21) | 0.66 | 0.59, 0.73 | 27.7 (<0.01) |
| 14 | Lasso (Demo + predef. + score + PCA(30)) | 5.6 | 0.70 | 0.67, 0.73 | 0.65 | 0.65 | 0.61, 0.70 | 10.7 (0.22) | 0.67 | 0.59, 0.74 | 19.5 (0.01) |
CI: confidence interval; Demo: demographic variables; HL: Hosmer–Lemeshow test statistic; Num.: model number; predef.: predefined variables; PCA(10): top 10 components from principal component analysis; PCA(30): top 30 components from principal component analysis; c-stat: c-statistic; x-valid.: 10-fold cross-validated
^a Score = HAS-BLED score [23]
^b Events per variable: ratio of the number of outcomes to the number of variables included in the DRS model
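The events-per-variable (EPV) definition in the footnote can be checked against the first two rows of the table. The following is a minimal sketch, assuming that model 1 contains 2 variables (age, sex) and model 2 contains 3 (age, sex, HAS-BLED score); these counts are inferred from the model labels and are not stated explicitly in the source. The event count (254) is the number of major hemorrhagic events among historical warfarin initiators in the baseline table. The function name `events_per_variable` is illustrative only.

```python
def events_per_variable(n_events: int, n_variables: int) -> float:
    """Ratio of outcome events to the number of variables in the DRS model."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return n_events / n_variables


# Major hemorrhagic events among historical warfarin initiators (baseline table).
HISTORICAL_EVENTS = 254

# Assumed variable counts, inferred from the model labels:
# model 1 = age + sex; model 2 = age + sex + HAS-BLED score.
epv_model_1 = round(events_per_variable(HISTORICAL_EVENTS, 2), 1)
epv_model_2 = round(events_per_variable(HISTORICAL_EVENTS, 3), 1)

print(epv_model_1, epv_model_2)  # 127.0 84.7
```

Both values agree with the EPV column for models 1 and 2. An EPV below 1, as for the models that include all 500 empirically selected covariates, means the model has more variables than observed events.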
Predictive performance of the disease risk score (DRS) models in the cyclooxygenase-2 inhibitor versus non-selective non-steroidal anti-inflammatory drug historical and concurrent cohorts
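| Num. | Model data component and dimension reduction/shrinkage method | EPV^c | Historical cohort, nsNSAID initiators: c-stat | Historical cohort, nsNSAID initiators: 95 % CI | Historical cohort, nsNSAID initiators: x-valid. c-stat | Concurrent cohort, nsNSAID initiators: c-stat | Concurrent cohort, nsNSAID initiators: 95 % CI | Concurrent cohort, nsNSAID initiators: HL (P value) | Concurrent cohort, coxib initiators: c-stat | Concurrent cohort, coxib initiators: 95 % CI | Concurrent cohort, coxib initiators: HL (P value) |
|---|---|---|---|---|---|---|---|---|---|---|---|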
| 1 | Demo (age + sex + race) | 40.2 | 0.64 | 0.60, 0.67 | 0.63 | 0.64 | 0.59, 0.70 | 9.6 (0.30) | 0.59 | 0.55, 0.63 | 24.4 (<0.01) |
| 2 | Demo + score^a | 33.5 | 0.66 | 0.62, 0.69 | 0.65 | 0.67 | 0.61, 0.72 | 9.6 (0.30) | 0.65 | 0.62, 0.68 | 34.2 (<0.01) |
| 3 | Demo + score + predef. | 7.2 | 0.70 | 0.67, 0.74 | 0.66 | 0.66 | 0.59, 0.72 | 12.0 (0.15) | 0.62 | 0.58, 0.66 | 41.2 (<0.01) |
| 4 | Demo + cov500 | 0.4 | 0.96 | 0.95, 0.97 | 0.71 | 0.45 | 0.38, 0.51 | >999 (<0.01) | 0.48 | 0.43, 0.52 | >999 (<0.01) |
| 5 | Demo + PCA(10) | 13.4 | 0.72 | 0.68, 0.75 | 0.68 | 0.66 | 0.60, 0.72 | 10.4 (0.24) | 0.63 | 0.59, 0.67 | 32.1 (<0.01) |
| 6 | Demo + PCA(30) | 5.7 | 0.75 | 0.72, 0.79 | 0.69 | 0.66 | 0.61, 0.72 | 12.4 (0.13) | 0.65 | 0.62, 0.69 | 56.5 (<0.01) |
| 7 | Demo + score + PCA(10) | 12.6 | 0.72 | 0.68, 0.75 | 0.68 | 0.66 | 0.60, 0.72 | 12.1 (0.15) | 0.62 | 0.58, 0.66 | 34.8 (<0.01) |
| 8 | Demo + score + PCA(30) | 5.6 | 0.75 | 0.72, 0.79 | 0.69 | 0.66 | 0.60, 0.72 | 12.6 (0.12) | 0.65 | 0.61, 0.69 | 58.5 (<0.01) |
| 9 | Demo + predef. + score + PCA(10) | 5.3 | 0.74 | 0.70, 0.77 | 0.67 | 0.65 | 0.59, 0.72 | 21.7 (0.01) | 0.60 | 0.56, 0.64 | 74.3 (<0.01) |
| 10 | Demo + predef. + score + PCA(30) | 3.5 | 0.78 | 0.75, 0.81 | 0.70 | 0.66 | 0.60, 0.72 | 23.6 (<0.01) | 0.63 | 0.58, 0.67 | 102.6 (<0.01) |
| 11 | Ridge (Demo + predef. + score + cov500) | 0.4 | 0.92 | 0.90, 0.93 | 0.77 | 0.55 | 0.48, 0.62 | 77.3 (<0.01) | 0.59 | 0.54, 0.63 | 172.9 (<0.01) |
| 12 | Ridge (Demo + predef. + score + PCA(30)) | 3.5 | 0.77 | 0.74, 0.80 | 0.71 | 0.67 | 0.61, 0.72 | 11.9 (0.16) | 0.64 | 0.60, 0.68 | 35.2 (<0.01) |
| 13 | Lasso (Demo + predef. + score + cov500) | 0.4 | 0.93 | 0.91, 0.94 | 0.72^b | 0.53 | 0.46, 0.60 | 409 (<0.01) | 0.57 | 0.52, 0.61 | 578.8 (<0.01) |
| 14 | Lasso (Demo + predef. + score + PCA(30)) | 3.5 | 0.78 | 0.74, 0.81 | 0.72 | 0.67 | 0.61, 0.73 | 9.1 (0.33) | 0.65 | 0.61, 0.69 | 38.3 (<0.01) |
c-stat: c-statistic; coxibs: cyclooxygenase-2 inhibitors; Demo: demographic variables; HL: Hosmer–Lemeshow test statistic; nsNSAIDs: non-selective nonsteroidal anti-inflammatory drugs; Num.: model number; PCA(10): top 10 components from principal component analysis; PCA(30): top 30 components from principal component analysis; predef.: predefined variables; x-valid.: 10-fold cross-validated
^a Score = combined comorbidity score [24]
^b Average over the 3 of 10 cross-validation folds that reached convergence; the remaining 7 did not converge
^c Events per variable: ratio of the number of outcomes to the number of variables included in the model
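The c-statistics reported in the two predictive-performance tables measure discrimination: the probability that a randomly chosen patient with the outcome has a higher predicted risk than a randomly chosen patient without it. The sketch below computes this by direct pair counting on made-up toy data; it is an illustration of the general definition, not the authors' code, and the function name `c_statistic` is hypothetical.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Share of (event, non-event) pairs in which the event has the higher score.

    Tied scores count as half a concordant pair.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for case_score, control_score in product(cases, controls):
        if case_score > control_score:
            concordant += 1.0
        elif case_score == control_score:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Made-up predicted risks and outcomes, for illustration only.
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20]
toy_outcomes = [0, 0, 1, 1, 0]

# 5 of the 6 (event, non-event) pairs are concordant.
print(round(c_statistic(toy_scores, toy_outcomes), 3))  # 0.833
```

A value of 0.5 corresponds to no discrimination. Read this way, a model whose c-statistic is high in the historical cohort but near or below 0.5 in the concurrent cohort (as for model 4, Demo + cov500) ranks patients well only in the data it was fitted to.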
The relative odds of major hemorrhagic events within 180 days for dabigatran initiators compared to warfarin initiators adjusted by DRS decile stratification
| Model Num. | Model data component and dimension reduction/shrinkage method used | Odds ratio, adjusted by DRS deciles^b | 95 % CI |
|---|---|---|---|
| | Crude (unadjusted) | 0.52 | 0.37, 0.72 |
| 1 | Demo (age + sex) | 0.58 | 0.41, 0.81 |
| 2 | Demo + score^a | 0.60 | 0.43, 0.84 |
| 3 | Demo + score + predef. | 0.58 | 0.41, 0.81 |
| 4 | Demo + cov500 | 0.53 | 0.38, 0.74 |
| 5 | Demo + PCA(10) | 0.64 | 0.45, 0.89 |
| 6 | Demo + PCA(30) | 0.62 | 0.44, 0.87 |
| 7 | Demo + score + PCA(10) | 0.61 | 0.44, 0.86 |
| 8 | Demo + score + PCA(30) | 0.61 | 0.44, 0.86 |
| 9 | Demo + predef. + score + PCA(10) | 0.60 | 0.43, 0.84 |
| 10 | Demo + predef. + score + PCA(30) | 0.60 | 0.43, 0.84 |
| 11 | Ridge (Demo + predef. + score + cov500) | 0.58 | 0.41, 0.81 |
| 12 | Ridge (Demo + predef. + score + PCA(30)) | 0.61 | 0.44, 0.85 |
| 13 | Lasso (Demo + predef. + score + cov500) | 0.60 | 0.43, 0.85 |
| 14 | Lasso (Demo + predef. + score + PCA(30)) | 0.64 | 0.46, 0.90 |
CI: confidence interval; Demo: demographic variables; DRS: disease risk scores; OR: odds ratios; PCA(10): top 10 components from principal component analysis; PCA(30): top 30 components from principal component analysis; predef.: predefined variables
^a Score = HAS-BLED score [23]
^b Stratified by DRS decile indicators
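The crude odds ratio in this table can be reproduced from the event counts in the baseline table: 49 events among 3874 dabigatran initiators and 129 events among 5360 concurrent warfarin initiators. The sketch below uses a Wald (log-scale) confidence interval; the source does not state which interval method was used, so that choice is an assumption, although it matches the reported interval to two decimals. The function name `crude_odds_ratio` is illustrative.

```python
import math


def crude_odds_ratio(events_new, n_new, events_ref, n_ref, z=1.96):
    """Unadjusted odds ratio (new drug vs. reference) with a Wald confidence interval."""
    a, b = events_new, n_new - events_new  # new drug: events, non-events
    c, d = events_ref, n_ref - events_ref  # reference drug: events, non-events
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts (assumed method).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts taken from the baseline table (concurrent cohort).
or_, lo, hi = crude_odds_ratio(49, 3874, 129, 5360)
print(f"{or_:.2f} ({lo:.2f}, {hi:.2f})")  # 0.52 (0.37, 0.72)
```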
Relative odds of gastrointestinal bleeds within 180 days for cyclooxygenase-2 inhibitor initiators compared to nsNSAID initiators adjusted by DRS decile stratification
| Model Num. | Model data component and dimension reduction/shrinkage method used | Odds ratio, adjusted by DRS deciles^b | 95 % CI |
|---|---|---|---|
| | Crude (unadjusted) | 1.09 | 0.84, 1.40 |
| 1 | Demo (age + sex + race) | 0.98 | 0.76, 1.27 |
| 2 | Demo + score^a | 0.93 | 0.72, 1.20 |
| 3 | Demo + score + predef. | 0.97 | 0.75, 1.25 |
| 4 | Demo + cov500 | 1.05 | 0.82, 1.36 |
| 5 | Demo + PCA(10) | 0.97 | 0.75, 1.25 |
| 6 | Demo + PCA(30) | 0.97 | 0.75, 1.25 |
| 7 | Demo + score + PCA(10) | 0.98 | 0.76, 1.26 |
| 8 | Demo + score + PCA(30) | 0.97 | 0.75, 1.25 |
| 9 | Demo + predef. + score + PCA(10) | 1.00 | 0.77, 1.29 |
| 10 | Demo + predef. + score + PCA(30) | 0.98 | 0.76, 1.27 |
| 11 | Ridge (Demo + predef. + score + cov500) | 1.04 | 0.80, 1.34 |
| 12 | Ridge (Demo + predef. + score + PCA(30)) | 0.96 | 0.74, 1.24 |
| 13 | Lasso (Demo + predef. + score + cov500) | 1.05 | 0.82, 1.36 |
| 14 | Lasso (Demo + predef. + score + PCA(30)) | 0.95 | 0.73, 1.22 |
CI: confidence interval; Demo: demographic variables (age, sex and race); DRS: disease risk scores; OR: odds ratios; PCA(10): top 10 components from principal component analysis; PCA(30): top 30 components from principal component analysis; predef.: predefined variables
^a Score = combined comorbidity score [24]
^b Stratified by DRS decile indicators
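To illustrate what adjustment by DRS decile stratification involves, the sketch below assigns each patient to a decile of the predicted disease risk and pools the decile-specific 2x2 tables with a Mantel–Haenszel estimator. This is a stand-in: the source says only that estimates were "stratified by DRS decile indicators" and does not specify the estimator, so the pooling method, the rank-based decile rule, and all function names here are assumptions made for illustration.

```python
def assign_deciles(scores):
    """Return a decile index (0-9) per score, based on rank; ties are broken by input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    deciles = [0] * len(scores)
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // len(scores)
    return deciles


def mantel_haenszel_or(strata):
    """Pool 2x2 tables (a, b, c, d) across strata into a single odds ratio.

    a = exposed events, b = exposed non-events,
    c = comparator events, d = comparator non-events.
    """
    numerator = denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty strata
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: no informative strata")
    return numerator / denominator


def decile_stratified_or(scores, exposed, outcome):
    """Odds ratio for exposure vs. comparator, stratified by disease risk score decile."""
    tables = [[0, 0, 0, 0] for _ in range(10)]
    for decile, e, y in zip(assign_deciles(scores), exposed, outcome):
        cell = (0 if y else 1) if e else (2 if y else 3)
        tables[decile][cell] += 1
    return mantel_haenszel_or(tables)


# Two hand-built strata, each with a stratum-specific odds ratio of 2.
print(mantel_haenszel_or([(20, 80, 10, 80), (10, 40, 5, 40)]))  # 2.0
```

In use, `scores` would be the predicted risks from a DRS model fitted in the historical cohort and applied to the concurrent cohort, with `exposed` marking new-drug initiators and `outcome` marking events within follow-up.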