Perrine Janiaud, Arnav Agarwal, Ioanna Tzoulaki, Evropi Theodoratou, Konstantinos K Tsilidis, Evangelos Evangelou, John P A Ioannidis.
Abstract
BACKGROUND: The validity of observational studies and their meta-analyses is contested. Here, we aimed to appraise thousands of meta-analyses of observational studies using a pre-specified set of quantitative criteria that assess the significance, amount, consistency, and bias of the evidence. We also aimed to compare results from meta-analyses of observational studies against meta-analyses of randomized controlled trials (RCTs) and Mendelian randomization (MR) studies.
Keywords: Mendelian randomization; Observational studies; Randomized clinical trials; Umbrella review
Year: 2021 PMID: 34225716 PMCID: PMC8259334 DOI: 10.1186/s12916-021-02020-6
Source DB: PubMed Journal: BMC Med ISSN: 1741-7015 Impact factor: 11.150
The seven standardized criteria
| Levels of evidence | Description |
|---|---|
| Convincing | • Associations with a statistical significance at P < 10⁻⁶ • More than 1000 cases included (or more than 20,000 participants for continuous outcomes) • The largest component study reporting a significant result at P < 0.05 • A 95% prediction interval that excluded the null • Absence of large heterogeneity (I² < 50%) • No evidence of small study effects (P > 0.10) • No evidence of excess significance (P > 0.10) |
| Highly suggestive | • Associations with a statistical significance at P < 10⁻⁶ • More than 1000 cases included (or more than 20,000 participants for continuous outcomes) • The largest component study reporting a significant result at P < 0.05 |
| Suggestive | • Associations with a statistical significance at P < 10⁻³ • More than 1000 cases included (or more than 20,000 participants for continuous outcomes) |
| Weak | • Associations with a statistical significance at P < 0.05 |
Previous umbrella reviews have used various criteria to assess the evidence from meta-analyses of observational epidemiological studies. Combining these criteria allows evidence from meta-analyses of statistically significant risk and protective factors to be tentatively classified into the four levels listed in the table. A more detailed description of the criteria can be found in Additional file 1: Appendix Method 1.
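The grading logic in the table can be sketched as a small decision function. This is only an illustration, not the authors' code: the `Association` schema and the `grade` function are hypothetical names, and the P-value cut-offs are exposed as parameters whose defaults (10⁻⁶, 10⁻³, 0.05, and 0.10 for the two bias tests) follow the values conventionally used in umbrella reviews and should be checked against Additional file 1. Missing criteria are not handled here.

```python
from dataclasses import dataclass


@dataclass
class Association:
    """Summary statistics for one meta-analysed association (hypothetical schema)."""
    p_value: float           # P of the random-effects summary estimate
    sample_size: int         # cases, or participants when `continuous` is True
    largest_study_p: float   # P of the largest component study
    pi_excludes_null: bool   # 95% prediction interval excludes the null
    i_squared: float         # heterogeneity I², in percent
    small_study_p: float     # P of the small-study-effects test
    excess_sig_p: float      # P of the excess-significance test
    continuous: bool = False


def grade(a, p_strict=1e-6, p_moderate=1e-3, p_nominal=0.05, p_bias=0.10):
    """Return the level of evidence for one association.

    All thresholds are assumptions passed as defaults, not values
    taken from the authors' implementation.
    """
    if a.p_value >= p_nominal:
        return "Non-significant"
    min_size = 20_000 if a.continuous else 1_000
    enough_data = a.sample_size > min_size
    highly = (a.p_value < p_strict and enough_data
              and a.largest_study_p < p_nominal)
    convincing = (highly and a.pi_excludes_null and a.i_squared < 50
                  and a.small_study_p > p_bias and a.excess_sig_p > p_bias)
    if convincing:
        return "Convincing"
    if highly:
        return "Highly suggestive"
    if a.p_value < p_moderate and enough_data:
        return "Suggestive"
    return "Weak"
```

Because each level adds requirements to the one below it, the function checks the strictest level first and falls through to "Weak" for any nominally significant association that meets nothing more.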
Fig. 1 Flowchart of the literature search. *Umbrella reviews reported summarized effect sizes but did not report the other criteria of interest. †Umbrella reviews assessing the same associations
Fig. 2 Overview of the included associations. MR, Mendelian randomization; OBS, observational studies; RCTs, randomized controlled trials. The statistical significance threshold was P < 0.05. *Twenty-one of which could no longer be assessed because they included only one cohort study per association. †Sixteen of which could no longer be assessed because they included only one cohort study per association
Overview of the included umbrella reviews
| Topic | First author | Year | Type of studies | N total studies<sup>a</sup> | Median [IQR] (min to max) of studies per association<sup>b</sup> | N total associations<sup>c</sup> | N associations included | Convincing | Highly suggestive | Suggestive | Weak | Non-significant |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Adiposity and cancer outcomes | Kyrgiou | 2017 | OBS | 507 | 6 [4; 11.50] (2 to 44) | 194 | 67<sup>d</sup> | 8 (11.9%) | 14 (20.9%) | 15 (22.4%) | 11 (16.4%) | 19 (28.4%) |
| Antidepressant and adverse events | Dragioti | 2019 | OBS | 1012 | 6 [4; 11.25] (2 to 44) | 120 | 120 | 3 (2.5%) | 8 (6.7%) | 24 (20%) | 38 (31.7%) | 47 (39.2%) |
| Antipsychotics and life-threatening events | Papola | 2019 | OBS | 68 | 9 [6.75; 12.75] (6 to 24) | 6 | 6 | 1 (16.7%) | 2 (33.3%) | 3 (50%) | 0 (0%) | 0 (0%) |
| Low-dose aspirin and health outcomes | Veronese | 2020 | OBS-RCT | NR | 3.50 [3; 7.75] (2 to 32) | 156 | 42 | 0 (0%) | 0 (0%) | 0 (0%) | 11 (26.2%) | 31 (73.8%) |
| Aspirin and cancer outcomes | Song | 2020 | OBS | NR | 11 [7; 18.25] (3 to 309) | 27 | 18<sup>e</sup> | 0 (0%) | 0 (0%) | 0 (0%) | 12 (66.7%) | 6 (33.3%) |
| Birth weight and later life events | Belbasis | 2016 | OBS | NR | 10 [6.25; 16] (3 to 45) | 78 | 78 | 3 (3.8%) | 8 (10.3%) | 10 (12.8%) | 29 (37.2%) | 28 (35.9%) |
| Chocolate and health outcomes | Veronese | 2018 | OBS-RCT | NR | 5 [4; 6] (4 to 6) | 19 | 7 | 0 (0%) | 0 (0%) | 0 (0%) | 4 (57.1%) | 3 (42.9%) |
| Chronic kidney disease and mortality | Kim | 2020 | OBS-RCT | NR | 9 [4; 14] (2 to 26) | 105 | 49 | 0 (0%) | 0 (0%) | 22 (44.9%) | 11 (22.4%) | 16 (32.7%) |
| Coffee and cancer risk | Zhao | 2020 | OBS | 448 | 15 [7; 21.25] (4 to 54) | 36 | 36 | 0 (0%) | 3 (8.3%) | 7 (19.4%) | 7 (19.4%) | 19 (52.8%) |
| C-reactive protein and health outcomes | Markozannes | 2020 | OBS-MR | 952 | 6 [4; 11] (3 to 53) | 309 | 113 | 2 (1.8%) | 12 (10.6%) | 14 (12.4%) | 67 (59.3%) | 18 (15.9%) |
| Depression and mortality | Machado | 2018 | OBS | 246 | 6 [4; 12] (3 to 111) | 17 | 17 | 0 (0%) | 4 (23.5%) | 2 (11.8%) | 11 (64.7%) | 0 (0%) |
| Antidepressants during pregnancy and neonatal outcomes | Biffi | 2020 | OBS | NR | 7 [4; 10] (3 to 28) | 69 | 69 | 0 (0%) | 5 (7.2%) | 11 (15.9%) | 18 (26.1%) | 35 (50.7%) |
| Dietary Fiber and health outcomes | Veronese | 2018 | OBS | NR | 12 [6; 19] (3 to 26) | 21 | 21 | 2 (9.5%) | 1 (4.8%) | 9 (42.9%) | 6 (28.6%) | 3 (14.3%) |
| Fish and ω-3 fatty acid consumption and cancer outcomes | Lee | 2020 | OBS | NR | 5 [3; 10] (2 to 17) | 57 | 52<sup>f</sup> | 0 (0%) | 0 (0%) | 2 (3.8%) | 10 (19.2%) | 40 (76.9%) |
| Influenza vaccine in elderly and health outcomes | Demurtas | 2020 | OBS-RCT | NR | 3.50 [2; 6.75] (2 to 27) | 60 | 38 | 1 (2.6%) | 3 (7.9%) | 6 (15.8%) | 15 (39.5%) | 13 (34.2%) |
| Handgrip strength and health outcomes | Soysal | 2020 | OBS | NR | 9 [8; 10.50] (7 to 34) | 11 | 8<sup>e</sup> | 0 (0%) | 1 (12.5%) | 1 (12.5%) | 4 (50%) | 2 (25%) |
| Human immunodeficiency virus infections and health outcomes | Grabovac | 2019 | OBS | NR | 8 [4; 13.50] (2 to 43) | 55 | 55 | 0 (0%) | 0 (0%) | 9 (16.4%) | 30 (54.5%) | 16 (29.1%) |
| Metformin and cancer outcomes | Yu | 2019 | OBS | 327 | 7 [5; 15] (2 to 29) | 33 | 33 | 1 (3%) | 3 (9.1%) | 5 (15.2%) | 14 (42.4%) | 10 (30.3%) |
| Magnesium and health outcomes | Veronese | 2019 | OBS-RCT | NR | 6 [3; 9.50] (3 to 32) | 55 | 19 | 0 (0%) | 0 (0%) | 2 (10.5%) | 7 (36.8%) | 10 (52.6%) |
| Obesity and gynecology/obstetric outcomes | Kalliala | 2017 | OBS-RCT | 427 | 6 [3; 9] (2 to 40) | 248 | 144 | 11 (7.6%) | 28 (19.4%) | 23 (16%) | 42 (29.2%) | 40 (27.8%) |
| Physical activity and cancer outcomes | Rezende | 2017 | OBS | 297 | 6 [3.25; 10] (2 to 38) | 46 | 46 | 1 (2.2%) | 2 (4.3%) | 5 (10.9%) | 5 (10.9%) | 33 (71.7%) |
| Physical activity and atrial fibrillation outcomes | Valenzuela | 2020 | OBS | NR | 8 [7; 19] (6 to 20) | 5 | 5 | 0 (0%) | 0 (0%) | 0 (0%) | 3 (60%) | 2 (40%) |
| Statins and multiple non-cardiovascular outcomes | Yazhou | 2018 | OBS-RCT | NR | 6 [4; 9] (2 to 27) | 278 | 115 | 0 (0%) | 2 (1.7%) | 21 (18.3%) | 42 (36.5%) | 50 (43.5%) |
| Serum uric acid and health outcomes | Li | 2017 | OBS-RCT-MR | NR | 5 [3; 9] (2 to 31) | 152 | 76 | 0 (0%) | 7 (9.2%) | 9 (11.8%) | 41 (53.9%) | 19 (25%) |
| Type 2 diabetes mellitus and cancer | Tsilidis | 2015 | OBS | 474 | 14 [9; 21] (5 to 45) | 27 | 27 | 2 (7.4%) | 4 (14.8%) | 4 (14.8%) | 10 (37%) | 7 (25.9%) |
| Tea consumption and cancer | Kim | 2020 | OBS | NR | 10 [6.75; 16] (4 to 53) | 150 | 68<sup>g</sup> | 0 (0%) | 1 (1.5%) | 2 (2.9%) | 18 (26.5%) | 47 (69.1%) |
| Telomere length and health outcomes | Smith | 2019 | OBS | NR | 5.50 [3; 8] (2 to 20) | 50 | 50 | 0 (0%) | 1 (2%) | 0 (0%) | 23 (46%) | 26 (52%) |
| Vitamin D and health outcomes | Theodoratou | 2014 | OBS-RCT | NR | 7 [5; 10] (2 to 37) | 48 | 48 | 0 (0%) | 6 (12.5%) | 7 (14.6%) | 16 (33.3%) | 19 (39.6%) |
| Risk factor for attention deficit hyperactivity disorder | Kim | 2020 | OBS | NR | 6 [4; 9] (2 to 30) | 63 | 63 | 5 (7.9%) | 3 (4.8%) | 11 (17.5%) | 26 (41.3%) | 18 (28.6%) |
| Risk and protective factors for mental disorders with onset in childhood/adolescence | Marco | 2020 | OBS | 192 | 6 [4.50; 9] (2 to 26) | 23 | 23 | 0 (0%) | 0 (0%) | 1 (4.3%) | 8 (34.8%) | 14 (60.9%) |
| Environmental factors and serum biomarkers for atrial fibrillation | Belbasis | 2020 | OBS | NR | 6 [4; 8] (3 to 31) | 51 | 51 | 6 (11.8%) | 11 (21.6%) | 8 (15.7%) | 10 (19.6%) | 16 (31.4%) |
| Factors associated to loneliness | Solmi | 2020 | OBS | NR | 13 [8; 18] (3 to 31) | 5 | 5 | 0 (0%) | 0 (0%) | 1 (20%) | 4 (80%) | 0 (0%) |
| Risk factor for amyotrophic lateral sclerosis | Belbasis | 2016 | OBS | NR | 8 [5.75; 9.25] (3 to 20) | 16 | 16 | 0 (0%) | 0 (0%) | 3 (18.8%) | 6 (37.5%) | 7 (43.8%) |
| Risk and protective factors for anxiety and obsessive compulsive disorders | Fullana | 2019 | OBS | 216 | 3 [2; 6] (2 to 112) | 427 | 128<sup>f,g</sup> | 4 (3.1%) | 2 (1.6%) | 3 (2.3%) | 60 (46.9%) | 59 (46.1%) |
| Environmental risk factors and biomarkers for autism spectrum disorder | Kim | 2019 | OBS | NR | 8 [3.50; 13] (2 to 24) | 67 | 67 | 8 (11.9%) | 7 (10.4%) | 11 (16.4%) | 26 (38.8%) | 15 (22.4%) |
| Environmental risk factors for bipolar disorder | Bortolato | 2017 | OBS | 54 | 8 [5; 10] (3 to 13) | 7 | 7 | 1 (14.3%) | 1 (14.3%) | 2 (28.6%) | 2 (28.6%) | 1 (14.3%) |
| Risk factors for colorectal cancer metastasis and recurrence | Xu | 2020 | OBS | NR | 6 [3.50; 9] (2 to 41) | 47 | 47 | 0 (0%) | 0 (0%) | 4 (8.5%) | 27 (57.4%) | 16 (34%) |
| Non-genetic biomarkers and colorectal cancer risk | Zhang | 2020 | OBS-RCT-MR | NR | 7 [3; 10] (2 to 28) | 112 | 65 | 0 (0%) | 0 (0%) | 4 (6.2%) | 25 (38.5%) | 36 (55.4%) |
| Risk factors for chronic obstructive pulmonary disease | Bellou | 2019 | OBS-MR | NR | 5 [4; 11] (3 to 22) | 22 | 18 | 0 (0%) | 0 (0%) | 8 (44.4%) | 5 (27.8%) | 5 (27.8%) |
| Environmental risk factors for dementia | Bellou | 2017 | OBS | NR | 7 [4.75; 13] (3 to 43) | 76 | 76 | 7 (9.2%) | 5 (6.6%) | 10 (13.2%) | 33 (43.4%) | 21 (27.6%) |
| Risk factors for depression | Kohler | 2018 | OBS-MR | NR | 7.50 [5; 11] (3 to 77) | 140 | 134 | 0 (0%) | 0 (0%) | 41 (30.6%) | 57 (42.5%) | 36 (26.9%) |
| Risk factors for eating disorders | Solmi | 2020 | OBS | NR | 6 [4; 9] (2 to 33) | 49 | 49 | 0 (0%) | 0 (0%) | 6 (12.2%) | 35 (71.4%) | 8 (16.3%) |
| Risk factors for endometrial cancer | Raglan | 2019 | OBS | 604 | 4 [3; 6] (2 to 28) | 127 | 127 | 3 (2.4%) | 13 (10.2%) | 14 (11%) | 26 (20.5%) | 71 (55.9%) |
| Environmental risk factors for obesity | Solmi | 2018 | OBS-RCT | 166 | 8 [6; 10.75] (2 to 22) | 60 | 26 | 4 (15.4%) | 2 (7.7%) | 1 (3.8%) | 15 (57.7%) | 4 (15.4%) |
| Prognostic biomarkers for gastric cancer | Zhou | 2019 | OBS | >1000 | 7 [4; 11] (3 to 51) | 119 | 119 | 3 (2.5%) | 7 (5.9%) | 3 (2.5%) | 82 (68.9%) | 24 (20.2%) |
| Risk factors for gestational diabetes | Giannakou | 2019 | OBS | NR | 8 [5; 14] (3 to 40) | 61 | 61 | 1 (1.6%) | 13 (21.3%) | 9 (14.8%) | 28 (45.9%) | 10 (16.4%) |
| Peripheral biomarkers and major mental disorders | Carvalho | 2020 | OBS | NR | 7 [5; 13] (3 to 55) | 358 | 318<sup>g</sup> | 0 (0%) | 0 (0%) | 3 (0.9%) | 175 (55%) | 140 (44%) |
| Environmental risk factors for multiple sclerosis | Belbasis | 2015 | OBS | NR | 8 [6; 12] (3 to 30) | 44 | 44 | 2 (4.5%) | 2 (4.5%) | 2 (4.5%) | 17 (38.6%) | 21 (47.7%) |
| Prognostic biomarkers for pancreatic ductal adenocarcinoma | Wang | 2020 | OBS | >300 | 4 [3; 6] (2 to 43) | 63 | 63 | 0 (0%) | 2 (3.2%) | 1 (1.6%) | 41 (65.1%) | 19 (30.2%) |
| Environmental risk factors and Parkinson’s | Bellou | 2016 | OBS | 755 | 7 [5; 10] (2 to 67) | 75 | 75 | 2 (2.7%) | 6 (8%) | 9 (12%) | 18 (24%) | 40 (53.3%) |
| Risk and protective factors for prostate cancer | Markozannes | 2016 | OBS | 1907 | 5 [3.75; 7] (2 to 45) | 248 | 176<sup>d</sup> | 0 (0%) | 2 (1.1%) | 7 (4%) | 25 (14.2%) | 142 (80.7%) |
| Non-genetic risk factors for pre-eclampsia | Giannakou | 2017 | OBS | NR | 7 [4; 12.25] (3 to 34) | 130 | 64<sup>h</sup> | 1 (1.6%) | 11 (17.2%) | 5 (7.8%) | 22 (34.4%) | 25 (39.1%) |
| Risk and protective factors for psychosis | Radua | 2018 | OBS | 683 | 6 [3; 9] (2 to 55) | 170 | 128<sup>i</sup> | 1 (0.8%) | 2 (1.6%) | 11 (8.6%) | 64 (50%) | 50 (39.1%) |
| Environmental risk factors for rheumatic diseases | Belbasis | 2018 | OBS | NR | 10.50 [7; 13] (3 to 51) | 42 | 42 | 0 (0%) | 0 (0%) | 7 (16.7%) | 26 (61.9%) | 9 (21.4%) |
| Risk factors and peripheral biomarkers for schizophrenia spectrum disorders | Belbasis | 2017 | OBS-MR | NR | 8 [5.25; 13] (3 to 42) | 98 | 98 | 1 (1%) | 4 (4.1%) | 5 (5.1%) | 52 (53.1%) | 36 (36.7%) |
| Non-genetic risk factors for skin cancer | Belbasis | 2016 | OBS | NR | 10 [7; 18] (3 to 41) | 85 | 85 | 4 (4.7%) | 9 (10.6%) | 11 (12.9%) | 34 (40%) | 27 (31.8%) |
| Risk factors for type 2 diabetes mellitus | Bellou | 2018 | OBS-MR | NR | 9.50 [6; 14] (3 to 88) | 155 | 142 | 11 (7.7%) | 34 (23.9%) | 28 (19.7%) | 43 (30.3%) | 26 (18.3%) |
IQR interquartile range, MA meta-analyses, MR Mendelian randomization, NR not reported, OBS observational studies, OCD obsessive and compulsive disorders, RCT randomized controlled trials
<sup>a</sup> Total number of primary studies included in the meta-analyses assessed by the umbrella reviews.
<sup>b</sup> Median [IQR] and minimum to maximum number of primary studies included in the associations assessed.
<sup>c</sup> Total number of associations assessed in the included umbrella reviews.
<sup>d</sup> The umbrella review presented data for continuous and binary outcomes, but its principal analyses focused only on continuous outcomes, which are the ones we included; hence the lower number of included associations in our work compared with the original umbrella review.
<sup>e</sup> Some associations were excluded as they were assessed by a mix of RCTs and observational studies.
<sup>f</sup> Associations assessed by only one study were removed.
<sup>g</sup> Duplicated associations were excluded.
<sup>h</sup> Associations assessing genetic factors were excluded.
<sup>i</sup> The authors of the umbrella review mention 170 associations but report only 145. Of these 145, 17 meta-analyses were excluded because they included only one study.
Meta-analyses of the proportions of associations for each criterion and level of evidence (random effects)
| | n/N associations (crude proportion) | Proportions | I² | Range of proportions across topics |
|---|---|---|---|---|
| Convincing | 99/3744 (2.6%) | 1.3% [1.0%; 2.2%] | 73.9% | 0–16.7% |
| Highly suggestive | 253/3744 (6.7%) | 4.6% [2.9%; 6.6%] | 85.7% | 0–33.3% |
| Suggestive | 440/3744 (11.8%) | 11.0% [8.5%; 13.8%] | 83.9% | 0–50% |
| Weak | 1497/3744 (40.0%) | 39.1% [34.8%; 43.5%] | 86.2% | 0–71.4% |
| Non-significant | 1455/3744 (38.9%) | 34.7% [29.2%; 40.3%] | 90.8% | 0–80.7% |
| Statistical significance | | | | |
| P < 10⁻⁶ | 762/2289 (33.3%) | 29.0% [24.9%; 33.3%] | 74.8% | 0–66.7% |
| P < 10⁻³ | 1377/2289 (60.2%) | 58.6% [54.1%; 63.0%] | 73.3% | 0–100% |
| Cases > 1000 (or > 20,000 participants for continuous outcomes) | 1182/2107 (56.1%) | 65.3% [56.9%; 73.2%] | 94.9% | 1.7–100% |
| Largest study with P < 0.05 | 1343/1781 (75.4%) | 74.9% [71.2%; 78.4%] | 63.6% | 28.6–100% |
| 95% prediction interval that excluded the null | 642/2136 (30.1%) | 30.3% [26.5%; 34.2%] | 71.0% | 9.0–100% |
| Absence of large heterogeneity (I² < 50%) | 1050/2277 (46.1%) | 46.6% [41.8%; 51.3%] | 79.5% | 0–88.2% |
| No evidence of small study effects (P > 0.10) | 1628/2164 (75.2%) | 75.3% [72.2%; 78.3%] | 63.2% | 40–100% |
| No evidence of excess significance (P > 0.10) | 1599/2052 (77.9%) | 77.7% [72.6%; 82.5%] | 83.0% | 33.3–100% |
<sup>a</sup> Meta-analyses for the individual criteria excluded associations with missing data. Meta-analyses for the levels of evidence were conducted across all 3744 associations regardless of their statistical significance status. The meta-analyses for the individual criteria were conducted across the 2289 statistically significant associations. Of these 2289 statistically significant associations, 182 did not report the number of cases, 508 whether the largest study had P < 0.05, 153 the 95% prediction interval, 12 the I² for heterogeneity, 125 the small study effect test, and 237 the excess significance test. Data on all 7 criteria were available for 1457 statistically significant meta-analyses.
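The denominators in the table follow from the missing-data counts in this footnote: each criterion is evaluated over the 2289 statistically significant associations minus those that did not report it. A minimal sketch of that arithmetic is given below; the dictionary keys and the function name `crude_proportions` are hypothetical labels, while the counts are copied from the table and footnote. It reproduces only the crude proportions, not the random-effects summary proportions.

```python
SIGNIFICANT = 2289  # statistically significant associations

# Associations that did not report each criterion (from the footnote).
MISSING = {
    "cases": 182,
    "largest_study": 508,
    "prediction_interval": 153,
    "heterogeneity": 12,
    "small_study": 125,
    "excess_significance": 237,
}

# Associations that met each criterion (numerators from the table).
MET = {
    "cases": 1182,
    "largest_study": 1343,
    "prediction_interval": 642,
    "heterogeneity": 1050,
    "small_study": 1628,
    "excess_significance": 1599,
}


def crude_proportions():
    """Return {criterion: (n, N, percent)} using N = SIGNIFICANT - missing."""
    result = {}
    for name, n_met in MET.items():
        denominator = SIGNIFICANT - MISSING[name]
        result[name] = (n_met, denominator, 100 * n_met / denominator)
    return result


if __name__ == "__main__":
    for name, (n, N, pct) in crude_proportions().items():
        print(f"{name}: {n}/{N} ({pct:.1f}%)")
```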
Fig. 3 Kappa heatmap for the seven criteria across all umbrella reviews. Only statistically significant associations (with P < 0.05 for the random effects summary effect) were included in the Cohen's kappa analysis. A κ < 0.6 (lighter red) represents weak, 0.6 ≤ κ < 0.8 (red) moderate, and κ ≥ 0.8 (dark red) strong agreement. Conversely, a κ > −0.6 (light blue) represents weak, −0.8 < κ ≤ −0.6 (blue) moderate, and κ ≤ −0.8 (dark blue) strong disagreement. The kappa values estimated within each umbrella review and combined using random effects meta-analyses are presented in eFigure 5 and eFigure 6
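For readers unfamiliar with the statistic, Cohen's kappa between two criteria can be sketched as follows, treating each criterion as a yes/no rating of the same set of associations. The function names `cohen_kappa` and `strength` are hypothetical, and assigning κ = 0 to the agreement side is a simplifying assumption; the cut-offs in `strength` are the ones given in the caption.

```python
import math


def cohen_kappa(x, y):
    """Cohen's kappa for two equal-length sequences of binary ratings."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    observed = sum(a == b for a, b in zip(x, y)) / n
    px = sum(bool(a) for a in x) / n   # share rated "met" on criterion x
    py = sum(bool(b) for b in y) / n   # share rated "met" on criterion y
    expected = px * py + (1 - px) * (1 - py)  # chance agreement
    if math.isclose(expected, 1.0):
        return float("nan")  # both ratings constant: kappa is undefined
    return (observed - expected) / (1 - expected)


def strength(kappa):
    """Label kappa using the cut-offs from the Fig. 3 caption."""
    direction = "agreement" if kappa >= 0 else "disagreement"
    size = abs(kappa)
    if size >= 0.8:
        level = "strong"
    elif size >= 0.6:
        level = "moderate"
    else:
        level = "weak"
    return f"{level} {direction}"
```

Kappa corrects the raw agreement rate for the agreement expected by chance, which matters here because some criteria are met by most associations and would agree often even if unrelated.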
Changes in number of associations that are graded as having convincing evidence when one criterion is dropped or replaced by a more lenient version
| Credibility assessment | N associations | Proportion |
|---|---|---|
| Replace P < 10⁻⁶ with P < 10⁻³ | 134 | 9.2% |
| Replace P < 10⁻⁶ with P < 0.05 | 142 | 9.7% |
| Without the minimum number of cases criterion | 149 | 10.2% |
| Without the largest study at P < 0.05 criterion | 103 | 7.1% |
| Without the 95% prediction interval criterion | 106 | 7.3% |
| Without the heterogeneity I² < 50% criterion | 159 | 10.9% |
| Without the small study effects criterion | 122 | 8.4% |
| Without the excess significance criterion | 111 | 7.6% |
<sup>a</sup> These are the associations that are statistically significant (P < 0.05) and also have information on all criteria
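The sensitivity analysis in this table amounts to regrading every association with one criterion removed and counting how many then qualify as convincing. A minimal sketch, assuming each association is stored as a dictionary of booleans (one per criterion), is shown below. The criterion keys, the function names, and the toy records are all hypothetical and are not the study data; the proportions in the table appear to use the 1457 associations with complete information as the denominator (for example, 134/1457 ≈ 9.2%).

```python
CRITERIA = (
    "p_strict",                # summary effect meets the strictest P threshold
    "enough_cases",            # minimum number of cases or participants
    "largest_study_sig",       # largest study is statistically significant
    "pi_excludes_null",        # 95% prediction interval excludes the null
    "low_heterogeneity",       # I² < 50%
    "no_small_study_effects",
    "no_excess_significance",
)


def count_convincing(associations, drop=None):
    """Count associations meeting every criterion except `drop`."""
    required = [c for c in CRITERIA if c != drop]
    return sum(all(a[c] for c in required) for a in associations)


def leave_one_out(associations):
    """Map each criterion to the convincing count when it is dropped."""
    return {c: count_convincing(associations, drop=c) for c in CRITERIA}


def make_record(**failed):
    """Build a toy record that meets all criteria except those named."""
    record = {c: True for c in CRITERIA}
    record.update({name: False for name in failed})
    return record


# Hypothetical toy data, for illustration only.
TOY = [
    make_record(),
    make_record(low_heterogeneity=True),
    make_record(enough_cases=True),
    make_record(enough_cases=True, low_heterogeneity=True),
]
```

With the toy records, dropping a criterion rescues only those associations for which it was the single unmet requirement, which is why the counts in the table rise by different amounts for different criteria.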