Extensive Mendelian randomization study identifies potential causal risk factors for severe COVID-19.

Yitang Sun¹, Jingqi Zhou¹,², Kaixiong Ye¹,³.

Abstract

Background: Identifying causal risk factors for severe coronavirus disease 2019 (COVID-19) is critical for its prevention and treatment. Many associated pre-existing conditions and biomarkers have been reported, but these observational associations suffer from confounding and reverse causation.
Methods: Here, we perform a large-scale two-sample Mendelian randomization (MR) analysis to evaluate the causal roles of many traits in severe COVID-19.
Results: Our results highlight multiple body mass index (BMI)-related traits as risk-increasing: BMI (OR: 1.89, 95% CI: 1.51-2.37), hip circumference (OR: 1.46, 1.15-1.85), and waist circumference (OR: 1.82, 1.36-2.43). Our multivariable MR analysis further suggests that the BMI-related effect might be driven by fat mass (OR: 1.63, 1.03-2.58), but not fat-free mass (OR: 1.00, 0.61-1.66). Several white blood cell counts are negatively associated with severe COVID-19, including those of neutrophils (OR: 0.76, 0.61-0.94), granulocytes (OR: 0.75, 0.601-0.93), and myeloid white blood cells (OR: 0.77, 0.62-0.96). Furthermore, some circulating proteins are associated with an increased risk of (e.g., zinc-alpha-2-glycoprotein) or protection from severe COVID-19 (e.g., prostate-associated microseminoprotein).
Conclusions: Our study suggests that fat mass and white blood cells might be involved in the development of severe COVID-19. It also prioritizes potential risk and protective factors that might serve as drug targets and guide the effective protection of high-risk individuals.

Keywords: Epidemiology; Genetic association study

Year: 2021 PMID: 35602208 PMCID: PMC9053245 DOI: 10.1038/s43856-021-00061-9

Source DB: PubMed Journal: Commun Med (Lond) ISSN: 2730-664X

Introduction

The coronavirus disease 2019 (COVID-19) is a global pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)[1]. As of mid-April 2021, 146 million confirmed cases and three million deaths from COVID-19 have been reported worldwide[2]. Despite substantial public health and medical efforts, COVID-19 continues to cause irreversible damage and death[3-5]. It is essential to identify risk factors and potential drug targets for COVID-19 in order to improve primary prevention and to develop treatment strategies. Many observational studies have reported that older age, male gender, non-White ethnicity, and pre-existing conditions, such as cardiovascular disease, diabetes, chronic respiratory disease, hypertension, and cancers, are associated with increased COVID-19 susceptibility and severity[5-8]. Moreover, retrospective observational studies have noted that hospitalized COVID-19 patients, especially those with severe respiratory or systemic conditions, are at increased risk of atrial fibrillation, nonsustained ventricular tachycardia, acute kidney injury, neurologic disorders, and thrombotic complications[9-12]. Vitamin-D deficiency, higher body mass index (BMI), and obesity have been associated with an increased risk of COVID-19[13,14]. Some lifestyle factors were also identified as risk-increasing, such as smoking, alcohol consumption, and lack of physical activity[15]. However, it is difficult to infer causal effects from observational studies because they are susceptible to confounding and reverse causation, while data from randomized controlled trials are scarce and inconclusive.
Mendelian randomization (MR) provides a promising opportunity to validate and prioritize putative risk factors and drug targets. MR studies use randomly allocated genetic variants related to the exposure as instrumental variables for investigating the effect of the exposure on an outcome[16]. MR is expected to be independent of confounding factors and has been shown to be an efficient and cost-effective strategy for identifying causal effects[17]. Recent MR studies have provided evidence of causality for a range of risk factors on COVID-19 (Supplementary Data 1). For instance, BMI and smoking are associated with an increased COVID-19 risk, while no evidence of causal effects was found for circulating 25-hydroxy-vitamin-D levels[18-20]. However, inconsistent results were also reported for some factors, such as Alzheimer’s disease, blood lipids, and physical activity[18,19,21-26]. Some of these inconsistencies are likely due to the use of early genome-wide association studies (GWAS) of COVID-19, which have small sample sizes. Moreover, most studies are limited to a small number of candidate factors, leaving many more to be tested and identified. The recent release of large GWAS meta-analyses for various COVID-19 phenotypes offers a great opportunity for MR studies[27]. However, special care and caution are also needed when interpreting the results. Sampling from COVID-19 patients, individuals tested for infection, voluntary participants, or existing cohorts may result in nonrepresentative samples and induce collider bias that distorts phenotypic and genetic associations[28-30]. The inherent complexity of COVID-19 as an infectious disease and the potential complications in ascertaining cases and controls make it challenging to disentangle risk factors for an increased chance of infection, susceptibility to infection, and disease severity[27,28].
In this study, we conducted an unbiased and large-scale MR analysis to examine the potential causal effects of an extensive list of exposures on severe COVID-19. All existing GWAS, as compiled by the Integrative Epidemiology Unit (IEU) OpenGWAS project, were included[31,32]. We note that some GWAS were on the same traits. In each of these GWAS, independent genetic variants at genome-wide significance were selected as instrumental variables for the studied trait. The associations between genetic instruments and the risk of severe COVID-19 were evaluated based primarily on three nonindependent GWAS of COVID-19. The COVID-19 Host Genetics Initiative (HGI) study A2 from release 4 alpha was used in our discovery analysis. HGI A2 compared COVID-19 patients with confirmed severe respiratory symptoms to population controls[27]. The HGI B2 study, comparing hospitalized COVID-19 patients to population controls, was used as one of our two replication datasets. The other replication dataset, labeled as the NEJM study, was drawn from the first published GWAS of COVID-19, which compared patients with respiratory failure to healthy controls from Italy and Spain[33]. We note that due to sample overlap and different phenotypic definitions, the HGI B2 and NEJM studies are not independent or strict replications of HGI A2. They mainly serve the purpose of reducing false positives in our prioritized list of risk factors. To ensure the robustness of the prioritized list of risk factors, results based on these three COVID-19 GWAS were compared to those from different releases (4 alpha vs. 4 and 5), different case definitions (very severe respiratory COVID-19 in A2 and hospitalized COVID-19 in B2 vs. any reported infection in C2), and different control groups (population controls in A2 and B2 vs. nonhospitalized COVID-19 patients in A1 and B1). Furthermore, multiple sensitivity analyses were performed to detect and correct for the presence of pleiotropy in genetic instruments. Here, we only report associations that do not have evidence of pleiotropy in genetic instruments and are observed in at least one of the two primary replication analyses. As an in-depth investigation into the BMI-related traits, we further conducted a multivariable MR analysis to disentangle the effects of fat mass and fat-free mass. Our findings provide profound insights into the etiology of severe COVID-19 and prioritize candidate causal risk factors for public health intervention and for drug discovery.

Methods

Exposure data sources

This study analyzed publicly available summary statistics from previous GWAS and did not include individual-level data. Ethical approval or informed consent was not required. To obtain a comprehensive list of traits with existing GWAS, the summary statistics of 34,519 published GWAS were extracted from the MRC Integrative Epidemiology Unit (University of Bristol) GWAS database (https://gwas.mrcieu.ac.uk/). Details of each GWAS can be found at https://gwas.mrcieu.ac.uk/datasets/. The R package TwoSampleMR (version 0.5.5) was used to retrieve the IEU GWAS datasets[31,32]. The univariable MR study was conducted using the same package. This study is reported as per the guidelines for strengthening the reporting of Mendelian randomization studies (STROBE-MR, Supplementary Data 2)[34]. The retrieved GWAS were further filtered based on the following criteria: (1) European ancestry; and (2) not eQTL studies, that is, excluding those labeled as “eqtl” from eQTLGen 2019[35]. A total of 14,385 GWAS summary datasets were retained and used in this study. Detailed information on data sources, all GWAS, and their corresponding traits is available in Supplementary Data 3. The units of the exposures follow the definitions in the prior GWAS since we directly used the existing summary statistics. We note that all exposures reported in the main text and Supplementary Data 4 have the unit of standard deviation (SD).
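
The two filtering criteria above can be illustrated with a minimal sketch. The records, the field names `id` and `population`, and the `eqtl` identifier prefix are assumptions made for illustration only; they are not taken from the study's pipeline, which used the TwoSampleMR package in R.

```python
# Hypothetical metadata records; field names and values are illustrative
# assumptions, not the actual schema used in the study.
gwas_catalog = [
    {"id": "example-a-1", "trait": "Body mass index", "population": "European"},
    {"id": "example-b-2", "trait": "Body mass index", "population": "East Asian"},
    {"id": "eqtl-a-example", "trait": "Gene expression", "population": "European"},
]


def keep_gwas(record):
    """Return True if a GWAS passes both filters described in the text."""
    is_european = record["population"] == "European"  # criterion (1)
    is_eqtl = record["id"].startswith("eqtl")         # criterion (2), assumed labeling
    return is_european and not is_eqtl


retained = [r for r in gwas_catalog if keep_gwas(r)]
print([r["id"] for r in retained])
```

Only the first hypothetical record survives both filters in this toy input.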

Outcome data sources

To evaluate associations with COVID-19 severity, the instrument-outcome effects were retrieved from the recent version of the GWAS meta-analysis by the COVID-19 Host Genetics Initiative (HGI, release 4 alpha, accessed on October 9, 2020)[27]. Detailed information has been provided on the COVID-19 HGI website (https://www.covid19hg.org/results/). In our primary discovery analysis, we used the summary statistics based on the comparison of 2972 patients with confirmed “very severe respiratory” COVID-19 with 284,472 general population samples. This is called “the HGI A2 study”. To reduce false positives and to ensure the robustness of our discoveries, replication analyses were performed with two additional GWAS of COVID-19. One of them was also from the COVID-19 HGI, comparing 6492 hospitalized COVID-19 patients with 1,012,809 control participants. We called it “the HGI B2 study”. Only single nucleotide polymorphisms (SNPs) with imputation-quality scores >0.6 were retained. The other GWAS was on 1610 COVID-19 patients with respiratory failure and 2180 controls from Italy and Spain, and it was called “the NEJM study”[33]. To evaluate the robustness of our findings to different data releases, we performed additional analyses with the A2 and B2 COVID-19 GWAS from HGI releases 4 and 5. Since the A2 and B2 GWAS utilized population samples with unknown COVID-19 status, we further performed MR analyses with the A1 and B1 COVID-19 GWAS, which utilized nonhospitalized COVID-19 patients as the control, in order to evaluate the impact of different control samples. In an attempt to distinguish the effects of risk factors on COVID-19 susceptibility and severity, we also included HGI C2 studies, which compared any COVID-19 case (laboratory-confirmed or clinically confirmed SARS-CoV-2 infection, or self-reported COVID-19) to population controls. Summary statistics from HGI releases 4 and 5 were accessed on March 27, 2021 and analyzed with the same computational pipeline as described above. In addition, genetic correlations between various COVID-19 phenotypes from HGI releases 4 and 5 were estimated using linkage-disequilibrium score (LDSC) regression[36,37].

Selection of instrumental variables

For the implementation of MR, SNPs were selected based on the genome-wide significance threshold (p < 5 × 10⁻⁸). To ensure that SNPs are independent, we pruned the variants by linkage disequilibrium (LD) (R² threshold of 0.001 within a clumping window of 10,000 kb). When target SNPs were not present in the outcome dataset, proxy SNPs were used instead through LD tagging (minimum LD R² threshold of 0.8). The effect alleles of selected genetic variants were harmonized across the exposure and outcome associations. F statistics were calculated to assess instrument strength[38]. F statistics ≥ 10 indicate strong instruments.
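
As a rough illustration of the significance threshold and the strength check, the sketch below keeps SNPs with p < 5 × 10⁻⁸ and flags those with F ≥ 10. It assumes the common per-SNP approximation F = (beta / se)², which may differ from the exact formula in the cited reference[38]; the SNP names and effect sizes are hypothetical, and LD clumping and proxy lookup are not shown.

```python
GENOME_WIDE_P = 5e-8   # genome-wide significance threshold from the text
F_STRONG = 10.0        # conventional cutoff for a strong instrument


def f_statistic(beta, se):
    """Approximate per-SNP F statistic (assumption: F = (beta / se)^2)."""
    return (beta / se) ** 2


def select_instruments(snps):
    """Keep genome-wide significant SNPs and annotate instrument strength."""
    selected = []
    for snp in snps:
        if snp["pval"] < GENOME_WIDE_P:
            f_value = f_statistic(snp["beta"], snp["se"])
            selected.append({**snp, "F": f_value, "strong": f_value >= F_STRONG})
    return selected


# Hypothetical exposure summary statistics, for illustration only.
exposure_snps = [
    {"snp": "rs_example_1", "beta": 0.05, "se": 0.005, "pval": 1e-20},
    {"snp": "rs_example_2", "beta": 0.02, "se": 0.010, "pval": 0.045},
]

instruments = select_instruments(exposure_snps)
for inst in instruments:
    print(inst["snp"], round(inst["F"], 1), inst["strong"])
```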

Univariable Mendelian randomization

Two-sample Mendelian randomization analysis was undertaken using GWAS summary statistics for each exposure-outcome pair. To estimate the causal effect of each trait on severe COVID-19, the inverse-variance-weighted (IVW) method with a multiplicative random-effect model was used as the primary analysis[39-41]. Horizontal pleiotropy occurs when SNPs exert an effect on severe COVID-19 through other biological pathways independent of the studied exposure. To assess heterogeneity among genetic instruments, Cochran’s Q statistic was calculated for the IVW analyses[42]. An extended version of Cochran’s Q statistic (Rücker’s Q′) can be estimated for MR-Egger[43]. We used the MR-Egger intercept test to evaluate the presence of unbalanced horizontal pleiotropy[40]. To account for pleiotropy, additional sensitivity analyses were performed with the MR-Egger[40,41], weighted median (WM)[44], and weighted-mode methods[45]. The MR-Egger method allows unbalanced horizontal pleiotropic effects even when all SNPs are invalid instruments[40]. The WM method can provide robust causal estimates when at least 50% of SNPs are valid genetic instruments, while the weighted-mode method reports the causal-effect estimate supported by the largest number of instruments[44,45]. The false-discovery rate (FDR) approach was used to correct for multiple testing, and it was applied to the p values from the IVW random-effect model[46]. An association was declared significant if the q-value was < 0.05, and was deemed suggestive if the unadjusted p-value was < 0.05. Two exclusion criteria were applied to filter out exposures before FDR correction: (1) exposures with fewer than three genetic instruments, because three or more are required for statistical tests of pleiotropic effects and for sensitivity analyses that correct for pleiotropy; and (2) exposures with indications of pleiotropy in their genetic instruments, because the presence of pleiotropy violates the assumptions of MR analysis. For the remaining exposures, FDR correction for multiple testing was applied separately for each analysis with the HGI A2, HGI B2, or NEJM study.
To identify potential causal risk factors for severe COVID-19, we used two approaches to consider the evidence strength. First, the significant and replicated results were defined as those with a q-value < 0.05 in the discovery analysis and a p-value < 0.05 in either one of the replication studies (Supplementary Data 5). Second, the suggestive and replicated results were defined as those with a p-value < 0.05 in the discovery analysis and a p-value < 0.05 in either one of the replication studies (Supplementary Data 6). All MR analyses were conducted in R with the TwoSampleMR package[31]. Additional MR analyses were also performed using GWAS from HGI releases 4 and 5. An analysis flowchart is shown in Fig. 1.
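
To make the primary analysis more concrete, here is a minimal sketch of an IVW estimate with a multiplicative random-effect standard error, Cochran’s Q, and Benjamini-Hochberg q-values. It is not the TwoSampleMR implementation: the standard error is inflated by max(1, Q / (J − 1)), one common convention that software packages may handle differently, and the choice of Benjamini-Hochberg as the FDR procedure is an assumption. All input values are made up.

```python
import math


def ivw_mre(beta_x, beta_y, se_y):
    """IVW estimate, multiplicative random-effect SE, and Cochran's Q."""
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y)]
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]  # per-SNP Wald ratios
    total_w = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_w
    q_stat = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    dof = len(beta_x) - 1
    scale = max(1.0, q_stat / dof)  # assumed convention: never shrink the SE
    se = math.sqrt(scale / total_w)
    return estimate, se, q_stat


def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        qvals[idx] = running_min
    return qvals


# Made-up SNP effects in which the outcome effect is half the exposure effect.
beta_x = [0.10, 0.20, 0.15]
beta_y = [0.05, 0.10, 0.075]
se_y = [0.02, 0.02, 0.02]

estimate, se, q_stat = ivw_mre(beta_x, beta_y, se_y)
odds_ratio = math.exp(estimate)  # log-odds estimate converted to an OR
qvals = bh_qvalues([0.01, 0.04, 0.03, 0.5])
print(round(estimate, 3), round(se, 4), round(odds_ratio, 3))
print([round(q, 4) for q in qvals])
```

In this toy input the instruments agree perfectly, so Q is effectively zero and the random-effect standard error equals the fixed-effect one.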

Fig. 1

The workflow of our extensive MR study for severe COVID-19.

GWAS genome-wide association studies, IEU Integrative Epidemiology Unit, eqtl expression quantitative trait loci, HGI Host Genetics Initiative, NEJM New England Journal of Medicine, IVW inverse-variance weighted, SNP single-nucleotide polymorphism, SNP# number of SNPs used as genetic instruments, Novel not reported before, Confirming confirming some previously reported results, even though previous results may be conflicting among themselves, Conflicting conflicting with previously reported results. The definition of novelty is by comparison to existing COVID-19 MR studies, as summarized in Supplementary Data 1. Detailed summary statistics can be found in Supplementary Data 5 and 6.

For a few exposures with a small number of genetic instruments, we performed an exemplary sensitivity analysis by excluding SNPs with potential pleiotropic effects. For each SNP, we queried PhenoScanner[47,48] and retrieved any associations at genome-wide significance. After excluding SNPs associated with blood cell or BMI-related traits, we repeated the MR analysis as described above.

Multivariable Mendelian randomization

As many BMI-related traits are typically correlated with each other, we conducted a two-sample multivariable MR (MVMR) analysis to explore independent causal risk factors for severe COVID-19[49]. SNPs associated with fat mass and fat-free mass were obtained from previous GWAS by MRC IEU and the Neale Lab through the TwoSampleMR package. The effects of genetically predicted fat mass and fat-free mass were estimated jointly for each body compartment (whole body, left arm, right arm, left leg, right leg, and trunk) using the MVMR package (version 0.2.0) in R.
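
The idea behind the multivariable analysis can be sketched as a weighted regression of SNP-outcome effects on the SNP effects for both exposures, with no intercept and weights 1/se². This is a simplified illustration of multivariable IVW and is not claimed to reproduce the MVMR package; the function name and all numbers are hypothetical, and standard errors are omitted.

```python
import math


def mvmr_ivw(bx1, bx2, by, se_y):
    """Two-exposure multivariable IVW via weighted least squares.

    Solves the 2x2 normal equations for by = theta1*bx1 + theta2*bx2
    with weights 1 / se_y^2 and no intercept.
    """
    w = [1.0 / s ** 2 for s in se_y]
    s11 = sum(wi * a * a for wi, a in zip(w, bx1))
    s22 = sum(wi * b * b for wi, b in zip(w, bx2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, bx1, bx2))
    t1 = sum(wi * a * y for wi, a, y in zip(w, bx1, by))
    t2 = sum(wi * b * y for wi, b, y in zip(w, bx2, by))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("exposure effects are collinear; cannot separate them")
    theta1 = (s22 * t1 - s12 * t2) / det
    theta2 = (s11 * t2 - s12 * t1) / det
    return theta1, theta2


# Made-up SNP effects: the outcome depends on exposure 1 only (effect 0.4).
fat_mass = [0.10, 0.20, 0.15, 0.05]
fat_free_mass = [0.05, 0.10, 0.20, 0.12]
outcome = [0.4 * b for b in fat_mass]
se_outcome = [0.02, 0.02, 0.02, 0.02]

theta_fat, theta_ffm = mvmr_ivw(fat_mass, fat_free_mass, outcome, se_outcome)
print(round(math.exp(theta_fat), 3), round(math.exp(theta_ffm), 3))
```

Because the toy outcome depends only on the first exposure, the regression attributes the whole effect to it and returns an estimate near zero (odds ratio near 1) for the second, which mirrors how the joint model separates correlated exposures.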
