Md Saiful Islam, Md Sarowar Morshed, Gary J Young, Md Noor-E-Alam.
Abstract
Under the current policy decision-making paradigm, we make or evaluate a policy decision by intervening on different socio-economic parameters and analyzing the impact of those interventions. This process involves identifying the causal relation between interventions and outcomes. Matching is one of the popular techniques for identifying such causal relations. However, in one-to-one matching, when a treatment or control unit has multiple pair assignment options with similar match quality, different matching algorithms often assign different pairs. Since all the matching algorithms assign pairs without considering the outcomes, it is possible that, with the same data and the same hypothesis, different experimenters reach different conclusions, creating uncertainty in policy decision making. This problem becomes more prominent in large-scale observational studies, as there are more pair assignment options. Recently, a robust approach has been proposed to tackle this uncertainty; it uses an integer programming model to explore all possible assignments. Though the proposed integer programming model is very efficient in making robust causal inference, it is not scalable to big data observational studies. With the current approach, an observational study with 50,000 samples will generate hundreds of thousands of binary variables. Solving such an integer programming problem is computationally expensive, and the cost grows further as the sample size increases. In this work, we consider causal inference testing with binary outcomes and propose computationally efficient algorithms that are adaptable for large-scale observational studies. By leveraging the structure of the optimization model, we propose a robustness condition that further reduces the computational burden.
We validate the efficiency of the proposed algorithms by testing the causal relation between the Medicare Hospital Readmission Reduction Program (HRRP) and non-index readmissions (i.e., readmission to a hospital that is different from the hospital that discharged the patient), using the State of California Patient Discharge Database from 2010 to 2014. Our results show that HRRP has a causal relation with the increase in non-index readmissions. The proposed algorithms proved to be highly scalable in testing causal relations from large-scale observational studies.
Year: 2019 PMID: 31603910 PMCID: PMC6788711 DOI: 10.1371/journal.pone.0223360
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Uncertainty due to multiple pair assignment options.
Shapes and colors represent the treatment status, and variations in size represent the difference in outcomes.
Fig 2. Comparison of matching for hypothesis testing under the classical approach and the robust approach [26].
The steps before covariate balance is achieved remain the same for each approach. In the remaining steps, black arrows show the classical approach and blue arrows show the robust approach proposed in [26].
Table 1. Contingency table of the outcomes of treatment and control observations.
| Control \ Treatment | Yes | No |
|---|---|---|
| Yes | A | B |
| No | C | D |
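As a minimal sketch of how matched pairs populate this table, the snippet below tallies binary outcomes into the four cells. The function name `contingency_counts`, the pair encoding, and the sample data are illustrative assumptions; the mapping of outcome combinations to cells is read off the table as laid out here (rows: control, columns: treatment).

```python
from collections import Counter


def contingency_counts(pairs):
    """Tally matched pairs into cells A, B, C, D.

    Each pair is (treatment_outcome, control_outcome), with outcomes
    coded 1 for Yes and 0 for No (an assumed encoding).
    """
    cell_of = {
        (1, 1): "A",  # treatment Yes, control Yes (concordant)
        (0, 1): "B",  # treatment No,  control Yes (discordant)
        (1, 0): "C",  # treatment Yes, control No  (discordant)
        (0, 0): "D",  # treatment No,  control No  (concordant)
    }
    counts = Counter({"A": 0, "B": 0, "C": 0, "D": 0})
    for t_out, c_out in pairs:
        counts[cell_of[(t_out, c_out)]] += 1
    return dict(counts)


# Hypothetical matched pairs, for illustration only.
pairs = [(1, 1), (0, 1), (1, 0), (1, 0), (0, 0)]
counts = contingency_counts(pairs)
discordant = counts["B"] + counts["C"]  # pairs whose two outcomes differ
```

Only the discordant cells B and C change when the two members of a pair disagree, so a different pair assignment over the same units can shift B and C even though every unit's outcome is unchanged.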
Fig 3. Example of maximum pair assignments between treatment and control group.
t represents the treatment group and c represents the control group. An arrow connects a treatment unit with a control unit to form a pair.
Table 2. Characteristics of readmitted patients in the State of California Patient Discharge Database from 2010 to 2014.
| Variable | All Readmission | Index Hospital | Non-index Hospital | Before HRRP | After HRRP |
|---|---|---|---|---|---|
| Readmitted Patients | 90553 | 67341 | 23212 | 53353 | 37200 |
| | |||||
| 0-20 | 635 (0.70) | 505 (0.75) | 130 (0.56) | 427 (0.8) | 208 (0.56) |
| 21-30 | 1073 (1.18) | 717 (1.06) | 356 (1.53) | 566 (1.06) | 507 (1.36) |
| 31-40 | 2186 (2.41) | 1471 (2.18) | 715 (3.08) | 1269 (2.38) | 917 (2.47) |
| 41-50 | 6336 (7.00) | 4196 (6.23) | 2140 (9.22) | 3714 (6.96) | 2622 (7.05) |
| 51-65 | 21018 (23.21) | 14470 (21.49) | 6548 (28.21) | 11950 (22.4) | 9068 (24.38) |
| 65 and above | 59305 (65.49) | 45982 (68.28) | 13323 (57.4) | 35427 (66.4) | 23878 (64.19) |
| | |||||
| Female | 45124 (49.80) | 34240 (50.80) | 10884 (46.9) | 27049 (59.94) | 18075 (40.06) |
| Male | 45429 (50.20) | 33101 (49.20) | 12328 (53.1) | 26304 (57.9) | 19125 (42.1) |
| | |||||
| Quartile 1 | 22428 (24.77) | 15640 (23.23) | 6788 (29.24) | 13108 (24.57) | 9320 (25.05) |
| Quartile 2 | 22629 (24.99) | 16633 (24.70) | 5996 (25.83) | 13331 (24.99) | 9298 (24.99) |
| Quartile 3 | 22450 (24.79) | 17160 (25.48) | 5290 (22.79) | 13165 (24.68) | 9285 (24.96) |
| Quartile 4 | 23046 (25.45) | 17908 (26.59) | 5138 (22.14) | 13749 (25.77) | 9297 (24.99) |
| | |||||
| CHF | 50151 (55.40) | 37404 (55.50) | 12747 (54.9) | 29351 (55.01) | 20800 (55.91) |
| AMI | 11917 (13.20) | 8148 (12.10) | 3769 (16.2) | 6865 (12.87) | 5052 (13.58) |
| Pneumonia | 28485 (31.40) | 21789 (32.40) | 6696 (28.9) | 17137 (32.12) | 11348 (30.51) |
| | |||||
| Low (0-2) | 35394 (39.09) | 25884 (38.44) | 9510 (40.97) | 21282 (39.89) | 14112 (37.94) |
| Medium (3-6) | 51301 (56.65) | 38454 (57.10) | 12847 (55.35) | 29850 (55.95) | 21451 (57.66) |
| Medium High (7-10) | 3396 (3.75) | 2628 (3.90) | 768 (3.31) | 1944 (3.64) | 1452 (3.9) |
| High (10 and above) | 462 (0.51) | 375 (0.56) | 87 (0.37) | 277 (0.52) | 185 (0.5) |
| | |||||
| Teaching Hospital | 10261 (11.30) | 7706 (11.40) | 2555 (11) | 5882 (11.02) | 4379 (11.77) |
| Non-teaching Hospital | 80272 (88.70) | 59635 (88.50) | 20657 (89) | 47471 (88.98) | 32821 (88.23) |
| | |||||
| Non-profit Hospital | 58592 (64.70) | 45210 (67.10) | 13382 (57.6) | 34252 (64.2) | 24340 (65.43) |
| Investor Hospital | 17902 (19.80) | 11389 (16.90) | 6513 (28.1) | 10839 (20.32) | 7063 (18.99) |
| Public Hospital | 14059 (15.50) | 10742 (16.00) | 3317 (14.3) | 8262 (15.49) | 5797 (15.58) |
| | |||||
| Small (below 100 beds) | 4982 (5.50) | 3453 (5.10) | 1529 (6.6) | 2979 (5.58) | 2003 (5.38) |
| Medium (100-399 beds) | 61167 (67.60) | 45149 (67.10) | 16018 (69) | 36314 (68.06) | 24853 (66.81) |
| Large (400 and above beds) | 24404 (26.90) | 18739 (27.80) | 5665 (24.4) | 14060 (26.35) | 10344 (27.81) |
| | |||||
| Rural | 2426 (2.68) | 1809 (2.69) | 617 (2.66) | 1348 (2.53) | 1078 (2.90) |
| Metro | 88127 (97.32) | 65532 (97.31) | 22595 (97.34) | 52005 (97.47) | 36122 (97.10) |
The entries in each cell are numbers of patients, presented in “N (%)” form. February 1, 2010 to September 30, 2012 is considered the “Before HRRP” period. October 1, 2012 to December 31, 2014 is considered the “After HRRP” period. CHF: Congestive Heart Failure; AMI: Acute Myocardial Infarction.
Table 3. Test statistic Λ(a) calculated using the optimization model and Algorithm 1.
| m | Optimization model: Λ(a), minimization | Optimization model: Λ(a), maximization | Optimization model: CPU time (s) | Algorithm 1: Λ(a) under robustness condition* | Algorithm 1: CPU time (s) |
|---|---|---|---|---|---|
| 50 | -7.21 | 6.93 | 918.69 | | |
| 100 | -10.10 | 9.90 | 982.23 | | |
| 300 | -17.38 | 17.26 | 1203.68 | | |
| 500 | -22.41 | 22.32 | 2037.37 | | |
| 800 | -28.32 | 28.25 | 2204.52 | | |
| 1000 | -31.65 | 61.60 | 2218.27 | | |
| 5000 | -70.72 | 70.69 | 2563.32 | | |
| 10000 | -88.97 | 99.99 | 2934.47 | | |
| 15000 | -31.82 | 74.82 | 2386.53 | | |
| 20000 | 7.80 | 21.83 | 2659.94 | | |
| 21000 | 14.51 | 21.83 | 2640.60 | | |
| 21500 | 17.75 | 18.16 | 2219.23 | | |
| 21530 | 17.94 | 17.94 | 3246.64 | 17.94 | 1.13 |
The optimization model is solved iteratively over different numbers of discordant pairs (m) until a robust solution is reached.
*Algorithm 1 identifies the robustness condition (B = 12082 and C = 9448) and calculates the test statistic for that condition only. CPU times are in seconds and cover solving both the minimization and the maximization problems.
Fig 4. The range of p-values achievable for different numbers of discordant pairs m.
The p-values were calculated using the test statistics presented in Table 3. The red line represents the minimum possible p-value (corresponding to Λ(a)) and the blue line represents the maximum possible p-value (corresponding to Λ(a)).
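To illustrate how a range of test statistics translates into a range of p-values, here is a hedged sketch. It assumes a one-sided (upper-tail) test against a standard normal reference distribution; the section does not state the sidedness of the test, so this is an assumption and the function names are illustrative.

```python
import math


def upper_tail_p(z: float) -> float:
    """P(Z >= z) for a standard normal Z (assumed reference distribution)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p_value_range(stat_min: float, stat_max: float) -> tuple[float, float]:
    """Return (smallest p, largest p) over a range of test statistics.

    With an upper-tail test, the largest statistic gives the smallest p-value.
    """
    return upper_tail_p(stat_max), upper_tail_p(stat_min)


# m = 50 row of Table 3: the statistic ranges from -7.21 to 6.93.
p_small, p_large = p_value_range(-7.21, 6.93)

# m = 21530 row: both extremes are 17.94, so the range collapses to a point.
p_robust = p_value_range(17.94, 17.94)
```

Under these assumptions the m = 50 range spans almost the whole interval from 0 to 1, while at m = 21530 the minimum and maximum p-values coincide, which is what a robust conclusion looks like in this framing.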