Abstract
Causal inference is a broad field that seeks to build and apply models that learn the effect of interventions on outcomes using many data types. While the field has existed for decades, its potential to impact healthcare outcomes has increased dramatically in recent years, owing both to advances in machine learning and to the unprecedented amounts of observational data produced by the electronic capture of patient claims by medical insurance companies and the widespread adoption of electronic health records (EHRs) worldwide. However, there are many schools of thought on learning causality, arising from different fields of statistics, and some of them conflict strongly. While recent advances in machine learning have greatly enhanced causal inference from a modeling perspective, they have also exacerbated the fractured state of the field. This fractured state has limited research at the intersection of causal inference, modern machine learning, and EHRs that could potentially transform healthcare. In this paper, we unify the classical causal inference approaches with new machine learning developments into a straightforward framework based on whether the researcher is most interested in finding the best intervention for an individual, a group of similar people, or an entire population. Through this lens, we then provide a timely review of the applications of causal inference in healthcare from the literature. As expected, we found that applications of causal inference in medicine were mostly limited to just a few technique types and lag behind other domains. In light of this gap, we offer a helpful schematic to guide data scientists and healthcare stakeholders in selecting appropriate causal methods and reviewing the findings generated by them.
Keywords: causal inference; electronic health record; healthcare; machine learning; patient population; potential outcome framework; review; treatment effects
Year: 2022 PMID: 35872797 PMCID: PMC9300826 DOI: 10.3389/fmed.2022.864882
Source DB: PubMed Journal: Front Med (Lausanne) ISSN: 2296-858X
Summary of causal inference approaches in healthcare.
| Target population | Treatment effect | Approach | Advantages | Disadvantages | | | Applications |
|---|---|---|---|---|---|---|---|
| Whole population | ATE | Propensity score-based: propensity score matching and IPTW | Simple, transparent, mimics clinical trials | Model can be misspecified | Low | High | Widely used (…) |
| Whole population | ATE | Outcome regression, variations of G-computation | No need to estimate propensity score | Model can be misspecified | Low | High | Few applications |
| Whole population | ATE | Doubly robust estimator, targeted maximum likelihood estimator | Efficient, doubly robust property | Yields biased estimates if both models are misspecified | Low | High | Widely used (…) |
| Subpopulation | CATE | Direct stratification | Easy to interpret | Data sparsity problem | Medium | Medium | Widely used (…) |
| Subpopulation | CATE | Indirect stratification, propensity score-based approach | Robust, easy to satisfy positivity assumption | Subpopulation hard to interpret | Medium | Medium | |
| Subpopulation | CATE | Data-driven, tree-based algorithms | Low variance within subpopulation | Subpopulation hard to interpret | Medium | Medium | Few applications (…) |
| Individuals | ITE | Fit one outcome surface (BART model, etc.) | Captures common underlying data structure | Not flexible, especially when the outcome surfaces are very different in distinct groups | High | Low | Few applications (…) |
| Individuals | ITE | Fit two outcome surfaces | Flexible, allows for different data structures in groups | Does not capture common data patterns in the two groups | High | Low | |
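
To make the whole-population (ATE) rows of the table more concrete, the following is a minimal Python sketch that compares an unadjusted difference in means with IPTW, outcome regression (G-computation), and a doubly robust (AIPW) estimator on simulated data. Everything in it is an assumption made for illustration rather than something taken from the paper: the data-generating process with a single confounder, the true ATE of 2.0, the logistic propensity model, the linear outcome model, and the function names `simulate`, `fit_logistic`, `fit_linear`, and `estimate_ate`.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, t, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (t - p)                       # gradient of the log-likelihood
        hess = (X * (p * (1 - p))[:, None]).T @ X  # negative Hessian, X^T W X
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def fit_linear(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def simulate(n=20000, true_ate=2.0, seed=0):
    """Hypothetical observational data: x confounds both treatment and outcome."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    t = rng.binomial(1, sigmoid(0.8 * x))          # sicker patients treated more often
    y = true_ate * t + 1.5 * x + rng.normal(size=n)
    return x, t, y


def estimate_ate(x, t, y):
    n = len(y)
    ones = np.ones(n)

    # Propensity model e(x) = P(T = 1 | x).
    Xp = np.column_stack([ones, x])
    e = sigmoid(Xp @ fit_logistic(Xp, t))

    # Outcome model y ~ 1 + t + x, then predict both potential outcomes.
    b = fit_linear(np.column_stack([ones, t, x]), y)
    mu1 = np.column_stack([ones, ones, x]) @ b
    mu0 = np.column_stack([ones, np.zeros(n), x]) @ b

    return {
        # Unadjusted comparison, confounded by x.
        "naive": y[t == 1].mean() - y[t == 0].mean(),
        # Inverse probability of treatment weighting.
        "iptw": np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e)),
        # G-computation: average the predicted individual contrasts.
        "gcomp": np.mean(mu1 - mu0),
        # Augmented IPW: outcome model plus weighted residual correction.
        "aipw": np.mean(
            mu1 - mu0 + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)
        ),
    }


if __name__ == "__main__":
    x, t, y = simulate()
    for name, value in estimate_ate(x, t, y).items():
        print(f"{name:>6}: {value:.3f}")
```

In this simulation both working models are correctly specified, so the three adjusted estimates should land near the simulated effect while the naive contrast should not. The misspecification entries in the table correspond to replacing either model with a wrong functional form; the AIPW estimate is expected to remain consistent as long as one of the two models is still correct.
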
Figure 1. Treatment effect estimator selection guide based on target-population intervention size and prior knowledge. Colors in the figure indicate the bias-variance tradeoff. Light blue: high bias and low variance; blue: medium bias and variance; dark blue: low bias and high variance. Person icons under each estimator illustrate the composition of the targeted population.