Johnny Kahlert, Sigrid Bjerge Gribsholt, Henrik Gammelager, Olaf M Dekkers, George Luta.
Abstract
In observational studies, confounding can be controlled in both the design and the analysis phases. Using examples from large health care database studies, this article provides clinicians with an overview of standard methods in the analysis phase, such as stratification, standardization, multivariable regression analysis and propensity score (PS) methods, together with the more advanced high-dimensional propensity score (HD-PS) method. We describe the progression from simple stratification, confined to the inclusion of a few potential confounders, to complex modeling procedures such as the HD-PS approach, by which hundreds of potential confounders are extracted from large health care databases. Stratification and standardization assist in understanding the data at a detailed level while accounting for potential confounders. Incorporating several potential confounders in the analysis typically implies a choice between multivariable analysis and PS methods. Although PS methods have gained remarkable popularity in recent years, there is an ongoing discussion on their advantages and disadvantages compared with those of multivariable analysis. Furthermore, the HD-PS method, despite its generous inclusion of potential confounders, is also associated with potential pitfalls. All methods depend on the assumption of no unknown, unmeasured or residual confounding and suffer from the difficulty of identifying true confounders. Even in large health care databases, insufficient or poor data may contribute to these challenges. The trend in data collection is to compile more fine-grained data on lifestyle and severity of diseases, based on self-reporting and modern technologies. This will surely improve our ability to incorporate relevant confounders or their proxies. However, despite a remarkable development of methods that account for confounding and new data opportunities, confounding will remain a serious issue.
Considering the advantages and disadvantages of different methods, we emphasize the importance of clinical input and of the interplay between clinicians and analysts to ensure a proper analysis.
Keywords: adjustment; confounding; multivariable analysis; observational studies; propensity score; stratification
Year: 2017; PMID: 28408854; PMCID: PMC5384727; DOI: 10.2147/CLEP.S129886
Source DB: PubMed; Journal: Clin Epidemiol; ISSN: 1179-1349; Impact factor: 4.790
Summary of the pros and cons of five methods used to control confounding in observational studies
| Method | Advantages | Disadvantages |
|---|---|---|
| Stratification | Simple and transparent method | Limitations on the number of strata that are practically manageable |
| Standardization | Specifically developed to compare rates and ratios | Sensitive to the choice of standard or reference population |
| Multivariable regression analysis | Easy to include a large number of potential confounders with standard statistical software | Many assumptions such as linearity in linear models, no collinearity between factors, normality and homoscedasticity of error terms |
| PS methods | Robust method when exposure is common and outcome is rare | Exposure must be a categorical variable (information potentially lost) |
| HD-PS method | The very large number of variables included in the analysis may comprise proxies for unmeasured confounders (although there is no guarantee) | Data greediness |
Abbreviations: PS, propensity score; HD-PS, high-dimensional propensity score.
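The first two rows of the table can be illustrated with a minimal sketch. It is not taken from the article: the counts, the two-level stratifying variable and all function names are hypothetical, and the Mantel-Haenszel estimator is assumed here as one common way of pooling stratum-specific risk ratios. The example shows a crude risk ratio that is distorted by the stratifying variable, a stratified estimate that removes the distortion, and directly standardized risks whose values depend on the chosen standard population.

```python
from typing import List, NamedTuple


class Stratum(NamedTuple):
    """Counts within one level of a hypothetical confounder (e.g. an age band)."""
    exposed_events: int
    exposed_total: int
    unexposed_events: int
    unexposed_total: int


def crude_risk_ratio(strata: List[Stratum]) -> float:
    """Risk ratio after collapsing all strata (no control of confounding)."""
    a = sum(s.exposed_events for s in strata)
    n1 = sum(s.exposed_total for s in strata)
    c = sum(s.unexposed_events for s in strata)
    n0 = sum(s.unexposed_total for s in strata)
    return (a / n1) / (c / n0)


def mantel_haenszel_risk_ratio(strata: List[Stratum]) -> float:
    """Mantel-Haenszel pooled risk ratio across strata."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        n = s.exposed_total + s.unexposed_total
        numerator += s.exposed_events * s.unexposed_total / n
        denominator += s.unexposed_events * s.exposed_total / n
    return numerator / denominator


def directly_standardized_risk(stratum_risks: List[float],
                               standard_weights: List[float]) -> float:
    """Weighted average of stratum-specific risks using a standard population."""
    total = sum(standard_weights)
    return sum(w * r for w, r in zip(standard_weights, stratum_risks)) / total


# Hypothetical data: exposure is more common in the high-risk stratum.
strata = [
    Stratum(exposed_events=20, exposed_total=100,
            unexposed_events=40, unexposed_total=400),   # younger people
    Stratum(exposed_events=160, exposed_total=400,
            unexposed_events=20, unexposed_total=100),   # older people
]

exposed_risks = [s.exposed_events / s.exposed_total for s in strata]
unexposed_risks = [s.unexposed_events / s.unexposed_total for s in strata]

# Two hypothetical standard populations with different age structures.
standard_a = [0.5, 0.5]
standard_b = [0.8, 0.2]

if __name__ == "__main__":
    print("crude RR:", crude_risk_ratio(strata))
    print("Mantel-Haenszel RR:", mantel_haenszel_risk_ratio(strata))
    for name, std in (("A", standard_a), ("B", standard_b)):
        print("standard", name,
              directly_standardized_risk(exposed_risks, std),
              directly_standardized_risk(unexposed_risks, std))
```

With these made-up counts the crude risk ratio is about 3, whereas the risk ratio is 2 within each stratum and the pooled estimate is also 2. The standardized risk among exposed people is about 0.30 under standard A and about 0.24 under standard B, which illustrates the sensitivity to the choice of standard population noted in the table.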
Summary of the methodological pros and cons of four different types of PS methods
| Method | Methodology | Advantages | Disadvantages |
|---|---|---|---|
| Stratification | People are assigned to a stratum based upon their PS. Strata are typically defined by percentiles of the PS, eg, quintiles. Hence, within each stratum, treated and untreated people roughly share the same characteristics. A treatment effect is calculated within each stratum, and the overall effect is a weighted average across strata. The typical approach estimates ATE | Simpler approach in comparison with matching and weighting | Comparability of treatment groups must be checked for all strata |
| Matching | For each treated person, one or more untreated person(s) with a comparable PS are selected. A comparable PS can be defined in different ways, eg, nearest neighbor or caliper width. The typical approach estimates ATT | Potentially more efficient in providing comparable treatment groups | Treated people may not have a match with the untreated people, leading to biased results |
| Covariate adjustment | An outcome regression model is used. As a minimum, the treatment and the PS must be included in the model as independent variables. Other variables may also be included | Simple approach: PS is used to balance treatment groups and is incorporated directly in an outcome regression model | Stronger assumptions than other methods |
| Inverse probability of treatment weighting | Weights are used to create a pseudo-population in which the characteristics are comparable across the treatment groups. Thus, weights are increased for those people who have received the treatment unexpectedly. The typical approach estimates ATE | Potentially more efficient in providing comparable treatment groups | A setting involving treated people with a low PS (or untreated people with a high PS) will generate large weights and variances |
Abbreviations: PS, propensity score; ATE, average treatment effect for the population (both treated and untreated people); ATT, average treatment effect among treated people.
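Two of the rows above, weighting and matching, can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: the PS values are assumed to have been estimated already, the six records are invented, and the function names, the greedy 1:1 matching rule without replacement and the caliper of 0.05 on the PS scale are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical records: (treated, propensity score, outcome).
records = [
    (1, 0.8, 1.0),
    (1, 0.5, 1.0),
    (1, 0.1, 0.0),   # treated despite a low PS ("unexpected" treatment)
    (0, 0.8, 1.0),   # untreated despite a high PS
    (0, 0.5, 0.0),
    (0, 0.2, 0.0),
]


def iptw_weight(treated: int, ps: float, estimand: str = "ATE") -> float:
    """Inverse probability of treatment weight for one person."""
    if estimand == "ATE":
        return 1.0 / ps if treated else 1.0 / (1.0 - ps)
    if estimand == "ATT":
        return 1.0 if treated else ps / (1.0 - ps)
    raise ValueError("estimand must be 'ATE' or 'ATT'")


def weighted_mean_difference(data, estimand: str = "ATE") -> float:
    """Difference in weighted mean outcome, treated minus untreated."""
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}   # group -> [sum w*y, sum w]
    for treated, ps, outcome in data:
        w = iptw_weight(treated, ps, estimand)
        sums[treated][0] += w * outcome
        sums[treated][1] += w
    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]


def caliper_match(treated_ps: List[float], control_ps: List[float],
                  caliper: float) -> List[Tuple[int, Optional[int]]]:
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns (treated index, control index) pairs; the control index is
    None when no unused control lies within the caliper.
    """
    available = set(range(len(control_ps)))
    pairs: List[Tuple[int, Optional[int]]] = []
    for i, p in enumerate(treated_ps):
        best = min(sorted(available),
                   key=lambda j: abs(control_ps[j] - p), default=None)
        if best is not None and abs(control_ps[best] - p) <= caliper:
            available.remove(best)
            pairs.append((i, best))
        else:
            pairs.append((i, None))
    return pairs


ate_weights = [iptw_weight(t, ps) for t, ps, _ in records]
matches = caliper_match(
    [ps for t, ps, _ in records if t == 1],
    [ps for t, ps, _ in records if t == 0],
    caliper=0.05,
)

if __name__ == "__main__":
    print("ATE weights:", ate_weights)
    print("weighted difference:", weighted_mean_difference(records))
    print("matches:", matches)
```

In this toy data set the treated person with a PS of 0.1 receives a weight of 10, far larger than any other, which is the large-weight problem listed as a disadvantage of weighting. The same person has no untreated counterpart within the caliper and is left unmatched, which corresponds to the disadvantage listed for matching.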