Shaun P Forbes (1, 2), Issa J Dahabreh (3, 4, 5). 1. Center for Evidence Synthesis in Health, Brown University School of Public Health, Providence, USA. 2. Department of Health Services, Policy & Practice, Brown University School of Public Health, Providence, USA. 3. Center for Evidence Synthesis in Health, Brown University School of Public Health, Providence, USA. 4. Department of Health Services, Policy & Practice, Brown University School of Public Health, Providence, USA. 5. Department of Epidemiology, Brown University School of Public Health, Providence, USA. Correspondence: issa_dahabreh@brown.edu.
Abstract
BACKGROUND: Observational analysis methods can be refined by benchmarking against randomized trials. We reviewed studies systematically comparing observational analyses using propensity score methods against randomized trials to explore whether intervention or outcome characteristics predict agreement between designs.

METHODS: We searched PubMed (from January 1, 2000, to April 30, 2017), the AHRQ Scientific Resource Center Methods Library, reference lists, and bibliographies to identify systematic reviews that compared estimates from observational analyses using propensity scores against randomized trials across three or more clinical topics; reported extractable relative risk (RR) data; and were published in English. One reviewer extracted data from all eligible systematic reviews; a second reviewer verified the extracted data.

RESULTS: Six systematic reviews matching published observational studies to randomized trials, published between 2012 and 2016, met our inclusion criteria. The reviews reported on 127 comparisons overall, in cardiology (29 comparisons), surgery (49), critical care medicine and sepsis (46), nephrology (2), and oncology (1). Disagreements were large (relative RR < 0.7 or > 1.43) in 68 (54%) and statistically significant in 12 (9%) of the comparisons. The degree of agreement varied among reviews but was not strongly associated with intervention or outcome characteristics.

DISCUSSION: Disagreements between observational studies using propensity score methods and randomized trials can occur for many reasons and the available data cannot be used to discern the reasons behind specific disagreements. Better benchmarking of observational analyses using propensity scores (and other causal inference methods) is possible using observational studies that explicitly attempt to emulate target trials.
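
The two disagreement criteria in the results (a relative RR outside 0.7 to 1.43, and statistical significance) can be illustrated with a minimal sketch. The abstract does not say how the relative RR or its significance test was computed, so everything below is an assumption: the relative RR is taken as the observational RR divided by the trial RR, standard errors are recovered from 95% confidence intervals on the log scale, the two estimates are treated as independent, and a two-sided normal test at the 0.05 level is used. The function names and the input numbers are hypothetical and are not data from the review.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate standard error of log(RR) from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def compare_estimates(rr_obs, ci_obs, rr_rct, ci_rct, low=0.7, high=1.43):
    """Compare an observational RR with a trial RR (illustrative only).

    Assumes relative RR = rr_obs / rr_rct and independent estimates.
    """
    log_rrr = math.log(rr_obs) - math.log(rr_rct)
    # Variances add on the log scale when the two estimates are independent.
    se = math.sqrt(se_from_ci(*ci_obs) ** 2 + se_from_ci(*ci_rct) ** 2)
    z = log_rrr / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    rrr = math.exp(log_rrr)
    return {
        "relative_rr": rrr,
        "large": rrr < low or rrr > high,
        "p_value": p_value,
        "significant": p_value < 0.05,
    }


# Hypothetical inputs, not taken from any of the included reviews.
result = compare_estimates(0.60, (0.45, 0.80), 0.90, (0.75, 1.08))
print(result)
```

The thresholds are symmetric on the log scale (1/0.7 is approximately 1.43), so under these assumptions the "large" label does not depend on which design is placed in the numerator. A comparison can be large without being statistically significant when the confidence intervals are wide, which is consistent with the gap between 54% and 9% reported above.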