
A Mendelian randomization-based exploration of red blood cell distribution width and mean corpuscular volume with risk of hemorrhagic strokes.

Jundong Liu¹, Elizabeth L Chou², Kui Kai Lau³,⁴, Peter Yat Ming Woo⁵, Tsz Kin Wan⁶, Ruixuan Huang⁶, Kei Hang Katie Chan¹,⁶,⁷.

Abstract

Red blood cell distribution width (RCDW) and mean corpuscular volume (MCV) are associated with different risk factors for hemorrhagic stroke. However, whether RCDW and MCV are causally related to hemorrhagic stroke remains poorly understood. Therefore, we explored the causality between RCDW/MCV and nontraumatic hemorrhagic strokes using Mendelian randomization (MR) methods. We extracted exposure and outcome summary statistics from the UK Biobank and FinnGen. We evaluated the causality of RCDW/MCV on four outcomes (subarachnoid hemorrhage [SAH], intracerebral hemorrhage [ICH], nontraumatic intracranial hemorrhage [nITH], and a combination of SAH, cerebral aneurysm, and aneurysm operations) using univariable MR (UMR) and multivariable MR (MVMR). We further performed colocalization and mediation analyses. UMR and MVMR revealed that higher genetically predicted MCV is protective of ICH (UMR: odds ratio [OR] = 0.89 [0.8-0.99], p = 0.036; MVMR: OR = 0.87 [0.78-0.98], p = 0.021) and nITH (UMR: OR = 0.89 [0.82-0.97], p = 0.005; MVMR: OR = 0.88 [0.8-0.96], p = 0.004). There were no strong causal associations between RCDW/MCV and any other outcome. Colocalization analysis revealed a shared causal variant between MCV and ICH; it was not reported to be associated with ICH. Proportion mediated via diastolic blood pressure was 3.1% (0.1%,14.3%) in ICH and 3.4% (0.2%,15.8%) in nITH. The study constitutes the first MR analysis on whether genetically elevated RCDW and MCV affect the risk of hemorrhagic strokes. UMR, MVMR, and mediation analysis revealed that MCV is a protective factor for ICH and nITH, which may inform new insights into the treatments for hemorrhagic strokes.


Keywords: Mendelian randomization; causal inference; hemorrhagic stroke; mean corpuscular volume; red blood cell distribution width

Year: 2022 PMID: 36051507 PMCID: PMC9424589 DOI: 10.1016/j.xhgg.2022.100135

Source DB: PubMed Journal: HGG Adv ISSN: 2666-2477

Introduction

Hemorrhagic stroke results from bleeding into or around the brain when a weakened vessel ruptures. The blood accumulates and compresses the surrounding brain cells, leading to brain tissue damage and neurological deficits. According to the American Stroke Association, hemorrhagic strokes account for about 13% of all strokes and consist of intracerebral hemorrhage (ICH), which is bleeding into the brain parenchyma, and subarachnoid hemorrhage (SAH), which is bleeding into the subarachnoid space. Mortality and morbidity are high for hemorrhagic stroke. Identifying its risk factors helps detect and modify the risk, potentially reducing the odds of death or disability from this often fatal disease. Mean corpuscular volume (MCV) describes the average size and volume of erythrocytes circulating in the bloodstream. It is calculated by dividing the hematocrit by the concentration of erythrocytes (hematocrit [%] × 10/red blood cell count [millions/mm³ blood]). Red blood cell distribution width (RCDW) is defined as one standard deviation (SD) of erythrocyte volume divided by the MCV (SD/MCV × 100) and is used to measure the heterogeneity of red blood cell size. Therefore, RCDW and MCV are correlated biologically. RCDW could be a strong predictor of the risk and prognosis of ischemic stroke. Similarly, MCV can be used to estimate the mortality and morbidity rates of ischemic stroke. RCDW and MCV are often used to diagnose hematological system diseases, such as iron-deficiency anemia and bone marrow dysfunction. They also help determine the risk of a cardiovascular event occurring after surgery or blood transfusions. Increases in RCDW can be observed more easily when MCV tends to remain at low levels; the association between RCDW and stroke is generally seen in patients with low MCV.
However, studies on the causal effects of RCDW and MCV on stroke are limited. Recently, a Mendelian randomization (MR) study proposed that RCDW and MCV are causally associated with small-vessel and cardioembolic stroke, respectively. Whether these two metrics are causally related to hemorrhagic stroke remains poorly understood. Numerous studies suggest that RCDW and MCV are associated with the risk factors for hemorrhagic stroke [15–27]. Individuals with higher MCV levels had lower body mass index, hemoglobin A1c, cholesterol, triglyceride, and uric acid levels and a lower prevalence of diabetes mellitus, dyslipidemia, hypertension, and metabolic syndrome [28–31]. Risk factors for hemorrhagic stroke include hypertension, smoking, sleep apnea, cocaine use, alcohol abuse, and clotting problems due to blood disorders or medicine (from Winchester Hospital). For hypertension, clinical studies propose that hypertensives have lower MCV than normotensives, while epidemiological studies suggest that they have higher MCV or that no relation exists. Among chronic kidney disease patients, diabetes mellitus and hypertension were more common in the low-MCV group. A study on erythrocytotic transgenic mice demonstrated experimentally that blood viscosity regulation is an important adaptive mechanism to excessive erythrocytosis. Under greater shear stress, blood decreases its viscosity, primarily owing to the high deformability of red blood cells. This can be achieved by increasing the proportion of juvenile, flexible erythrocytes, which are more deformable and have higher MCV than older ones. Therefore, higher MCV correlates with lower blood viscosity, while blood viscosity was reported to be positively associated with systolic blood pressure, diastolic blood pressure, and mean arterial pressure.
Also, RCDW was reported to increase the risk of venous thrombosis and diabetes mellitus and to be positively associated with blood pressure. Hypertensives are prone to carotid artery atherosclerosis if they have high RCDW. For smoking, continuous cigarette smoking has severe undesirable influences on hematological indexes, including MCV, hemoglobin, and red blood cell count, leading to the development of cardiovascular diseases. The MCV level in smokers is higher than in non-smokers. For clotting disorders, blood clot risk can result from low iron levels, while iron deficiency, best characterized by anisocytosis (high RCDW), tends to cause microcytic anemia (low-MCV anemia) in children. For alcohol use disorder, a study found that MCV increases in blood after 4–8 weeks of excessive alcohol intake. A previous study also reported that RCDW and D-dimer were associated with cerebral venous thrombosis, a determinant of ICH. Thus, RCDW and MCV have been shown to be associated with different risk factors of hemorrhagic stroke. Based on the above associations, we postulated that RCDW/MCV is causally associated with hemorrhagic strokes, mediated by some causal pathways. Therefore, this study explored the causal link between RCDW/MCV and hemorrhagic stroke using MR methods, with a focus on nontraumatic hemorrhagic strokes.
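The two red blood cell indices defined earlier in this Introduction can be written out as a short worked example. This is a minimal sketch of the stated formulas only; the function names and the input values are illustrative assumptions, not data from the study.

```python
def mean_corpuscular_volume(hematocrit_pct, rbc_millions_per_mm3):
    """MCV in fL: hematocrit [%] x 10 / red blood cell count [millions/mm^3]."""
    return hematocrit_pct * 10.0 / rbc_millions_per_mm3


def rcdw_percent(sd_volume_fl, mcv_fl):
    """RCDW in %: one SD of erythrocyte volume divided by MCV, times 100."""
    return sd_volume_fl / mcv_fl * 100.0


# Illustrative (hypothetical) values for a single blood sample.
mcv = mean_corpuscular_volume(hematocrit_pct=45.0, rbc_millions_per_mm3=5.0)
rcdw = rcdw_percent(sd_volume_fl=11.7, mcv_fl=mcv)
print(f"MCV = {mcv:.1f} fL, RCDW = {rcdw:.1f}%")
```

Because MCV is the denominator of RCDW, a lower MCV raises RCDW for a fixed SD, which is one way to see why the two measures are biologically correlated.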

Materials and methods

Data source

All summary-level data can be obtained from the publicly available Medical Research Council Integrative Epidemiology Unit OpenGWAS database. The exposure and outcome data are from non-overlapping European populations. The two exposures (MCV and RCDW) are derived from the UK Biobank database, where MCV is extracted from the second wave of a genome-wide association study (GWAS) by the Neale Lab and RCDW from another GWAS. The four hemorrhagic stroke outcomes are from the FinnGen consortium; we chose the R5 release of GWAS results. These outcomes are SAH, ICH, nontraumatic intracranial hemorrhage (nITH), and a combination of SAH, unruptured cerebral aneurysm, and aneurysm operations (AOS). Aneurysm operations in AOS are endovascular or surgical operations to intracerebral aneurysms, coded with the Nordic Medico-Statistical Committee classification of surgical procedures: AAC0[0-5], AAC1[0-5], AAC99, and AAL00. We included AOS because SAH is one of its endpoints. Based on the definition of the endpoint in FinnGen, we considered that nITH excludes other brain hemorrhages, so nITH is equivalent to hemorrhagic stroke. To test the mediation effect, we included blood disorders (elevated erythrocyte sedimentation rate and abnormality of plasma viscosity) from FinnGen and blood pressure (SBP, systolic blood pressure; DBP, diastolic blood pressure) from a combined GWAS from the UK Biobank and the International Consortium of Blood Pressure. Table 1 presents the description of the GWAS summary data used, the name of each trait, the specific endpoints coded with the International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD-10), the sample size, and the mean age of the first event; more information is recorded across Tables S1, S2, and S3. No ethical approval was required for this study as we used publicly available data. The MR study was approved by the City University of Hong Kong Research Committee (JCC approval no.: jcc2122ay006).

Table 1

Description of genome-wide association study summary data for exposures and outcomes

Trait	ICD-10	Sample size	Event age (years)
Mean corpuscular volume (fL)	–	350,473	57
Red blood cell distribution width (10¹² cells/L)	–	116,666	56.5ᵃ
Nontraumatic intracranial hemorrhage	I6[0–1]	2,794 cases	61.0
Nontraumatic intracranial hemorrhage	I6[0–1]	203,068 controls	61.0
Intracerebral hemorrhage	I61	1,687 cases	65.6
Intracerebral hemorrhage	I61	201,146 controls	65.6
Subarachnoid hemorrhage	I60	1,338 cases	54.9
Subarachnoid hemorrhage	I60	201,230 controls	54.9
Subarachnoid hemorrhage, unruptured cerebral aneurysm, and aneurysm operations	I60, I671	2,127 cases	56.0
Subarachnoid hemorrhage, unruptured cerebral aneurysm, and aneurysm operations	I60, I671	203,068 controls	56.0
Elevated erythrocyte sedimentation rate and abnormality of plasma viscosity	R70	1,093 cases	66.5
Elevated erythrocyte sedimentation rate and abnormality of plasma viscosity	R70	212,004 controls	66.5
Systolic blood pressure (mmHg)	–	757,601	NA
Diastolic blood pressure (mmHg)	–	757,601	NA

ᵃ Mean age is inferred from Hewitt et al. ICD-10, International Statistical Classification of Diseases and Related Health Problems, 10th Revision; NA, not available. Event age is the mean age of the first event.


Phenotypic correlations

First, we estimated the phenotypic correlations between the traits from the GWAS summary statistics using the PhenoSpD R toolkit. However, PhenoSpD assumes that the GWAS samples overlap substantially in order to compute the phenotypic correlation effectively; otherwise, the correlation attenuates toward zero as the overlap decreases. Our exposures and outcomes are from two different cohorts (UK Biobank and FinnGen). Therefore, the estimated correlations between the exposures and the outcomes might unreliably approach zero. This may be why we observed correlations only between the exposures or between the outcomes.

Univariable and multivariable MR

The principles of two-sample univariable MR (UMR) and multivariable MR (MVMR) have been described elsewhere [41–44]. We adopted single-nucleotide polymorphisms (SNPs) as instruments from non-overlapping individuals of the same ethnicity for RCDW/MCV and outcomes. As RCDW and MCV are correlated biologically, they might share genetic variants associated with the risk of stroke, that is, the variants are pleiotropic. We adjusted for MCV and RCDW in MVMR to control for the pleiotropic SNPs and assess the causality between exposures and outcomes. In general, UMR is performed under three assumptions (Figure 1A): (1) SNPs are correlated with the exposure (RCDW or MCV); (2) lack of correlated horizontal pleiotropy (e.g., SNPs are not associated with confounders between MCV/RCDW and stroke); and (3) lack of uncorrelated horizontal pleiotropy (e.g., SNPs only affect stroke through the exposure of interest). Similar assumptions are proposed for MVMR (Figure 1B): (1) each SNP is associated with RCDW and/or MCV; (2) each variant is not associated with a confounder other than RCDW or MCV; and (3) each variant is conditionally independent of stroke given the exposures.

Figure 1

Flowchart of univariable and multivariable Mendelian randomization

UMR, univariable Mendelian randomization; MVMR, multivariable Mendelian randomization; IV, instrumental variable; SNPs, single-nucleotide polymorphisms; RCDW, red blood cell distribution width; MCV, mean corpuscular volume; ICH, intracerebral hemorrhage; nITH, nontraumatic intracranial hemorrhage; SBP, systolic blood pressure; DBP, diastolic blood pressure; BD, blood disorders.

We preprocessed the data separately for two-sample UMR and MVMR. For UMR, we performed linkage disequilibrium clumping (p = 5 × 10⁻⁸ and kb = 10,000), allowed for proxy SNPs, corrected the strand for non-palindromic SNPs, and excluded palindromic and ambiguous SNPs via the TwoSampleMR R package. Then, we excluded outliers, computed MR estimates, and performed the sensitivity analyses. For MVMR, we also extracted the SNPs using the TwoSampleMR package. Before clumping, we first kept the SNPs that were present for both exposure traits. Then, we selected those with p < 5 × 10⁻⁸ for either of the exposures. We performed clumping and looked for proxies with kb = 10,000 and harmonized the SNPs on the same strand. Finally, we extracted outcome SNPs using a 0.01 minor allele frequency (MAF) threshold to infer palindromic SNPs; we used default settings for the other parameters.
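The instrument-selection and harmonization steps described above can be illustrated with a small sketch. It is not the TwoSampleMR implementation: the record layout, the function names, and the `rs_demo*` identifiers are hypothetical, linkage disequilibrium clumping and proxy lookup are omitted, and palindromic SNPs are simply dropped rather than inferred from allele frequencies.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_palindromic(effect_allele, other_allele):
    """A/T and C/G SNPs look identical on both strands, so strand is ambiguous."""
    return COMPLEMENT[effect_allele] == other_allele


def select_instruments(snps, p_threshold=5e-8):
    """Keep genome-wide significant, non-palindromic SNPs (no clumping here)."""
    return [s for s in snps
            if s["pval"] < p_threshold and not is_palindromic(s["ea"], s["oa"])]


def harmonize_beta(exp_snp, out_snp):
    """Return the outcome beta aligned to the exposure effect allele, or None."""
    ea, oa = exp_snp["ea"], exp_snp["oa"]
    for flip_strand in (False, True):
        o_ea, o_oa = out_snp["ea"], out_snp["oa"]
        if flip_strand:  # try the complementary strand as a second attempt
            o_ea, o_oa = COMPLEMENT[o_ea], COMPLEMENT[o_oa]
        if (o_ea, o_oa) == (ea, oa):
            return out_snp["beta"]
        if (o_ea, o_oa) == (oa, ea):  # alleles swapped: reverse the sign
            return -out_snp["beta"]
    return None  # alleles cannot be reconciled


# Hypothetical exposure records.
exposure = [
    {"rsid": "rs_demo1", "ea": "A", "oa": "G", "beta": 0.03, "pval": 1e-12},
    {"rsid": "rs_demo2", "ea": "A", "oa": "T", "beta": 0.05, "pval": 1e-20},
    {"rsid": "rs_demo3", "ea": "C", "oa": "T", "beta": 0.01, "pval": 1e-3},
]
instruments = select_instruments(exposure)
aligned = harmonize_beta(instruments[0], {"ea": "G", "oa": "A", "beta": 0.02})
print([s["rsid"] for s in instruments], aligned)
```

Dropping every palindromic SNP is the most conservative choice; the MAF-based inference mentioned in the text keeps more instruments at the cost of an extra assumption about allele frequencies.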

Model selection framework

According to the Rücker model selection framework, we selected the inverse-variance weighted (IVW) analysis under a fixed effect model as the primary MR estimation when MR assumptions are satisfied without pleiotropy or heterogeneity effects. According to the MVMR package framework, we prioritized IVW in the absence of weak instrument bias and pleiotropy. Multiple MR methods have been developed to obtain MR estimates under different violations of assumptions or in the presence of heterogeneity, for example, MR-Egger, weighted median, and MR-pleiotropy residual sum and outlier. In this study, we computed UMR estimates using the following models: Radial IVW fixed effect (RadialMR IVW [FE]), Radial IVW random effect (RadialMR IVW [RE]), and Radial MR-Egger (RadialMR Egger) in the RadialMR R package; weighted median and weighted mode in the TwoSampleMR R package; and MR-pleiotropy residual sum and outlier (MR-PRESSO) in the MR-PRESSO package. Also, we obtained MVMR estimates using the following models: MVMR IVW in the MVMR package; multivariable IVW fixed effect, IVW random effect, and Egger models (namely, MR IVW [FE], MR IVW [RE], and MR Egger) in the MendelianRandomization package; and multivariable MR-PRESSO in the MR-PRESSO package. Moreover, we computed MVMR estimates under multivariable robust models (Robust), which are unbiased and more efficient than IVW when the proportion of invalid instruments is low. All of the causal estimates for RCDW and MCV are represented as odds ratios (ORs) per SD increase in genetically predicted levels of the exposures. The significance level for the p value is 0.05.
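As a rough illustration of the primary estimator named above, a fixed effect IVW estimate can be computed from per-SNP Wald ratios. The sketch assumes first-order weights (the squared exposure effect divided by the outcome variance); the packages used in the study may apply other weighting schemes, and all input numbers are hypothetical.

```python
import math


def ivw_fixed_effect(beta_exp, beta_out, se_out):
    """Inverse-variance weighted mean of Wald ratios with first-order weights."""
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    weights = [(bx / so) ** 2 for bx, so in zip(beta_exp, se_out)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    standard_error = math.sqrt(1.0 / sum(weights))
    return estimate, standard_error


# Hypothetical SNP effects: exposure in SD units, outcome in log-odds.
beta_exp = [0.10, 0.20, 0.05]
beta_out = [-0.012, -0.024, -0.006]
se_out = [0.01, 0.01, 0.01]

log_or, se = ivw_fixed_effect(beta_exp, beta_out, se_out)
odds_ratio = math.exp(log_or)  # OR per SD increase in the exposure
print(f"log OR = {log_or:.3f} (SE {se:.3f}), OR = {odds_ratio:.3f}")
```

Exponentiating the pooled log-odds estimate gives the OR per SD increase in the genetically predicted exposure, which is the scale on which the results are reported.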

Outlier exclusion, heterogeneity, and weak instrument assessment

We applied the following heterogeneity test statistics for outlier exclusion and heterogeneity assessment. For UMR, we excluded outliers and tested for heterogeneity and pleiotropy using Cochran’s Q (assuming balanced pleiotropy), Rücker’s Q (assuming unbalanced pleiotropy), and the MR-PRESSO global test. We also performed MR under the Wald model to exclude influential SNPs. Other heterogeneity assessments based on the funnel plot and the Higgins I² index were provided; for the index, 0% or a negative value indicates no observed heterogeneity, and increasing values show higher heterogeneity. We examined weak instrument bias using the F statistic, where F > 10 is the rule of thumb for strong instruments. Finally, we performed the leave-one-out test as a sensitivity analysis. For MVMR, we still removed pleiotropic SNPs using MR-Lasso and MR-PRESSO, as there might be some exposures acting as confounders that are not controlled for in our model. For example, hematocrit might be a confounder between MCV and hemorrhagic stroke, so there might be some pleiotropic instruments associated with both MCV and hematocrit. Besides Cochran’s Q and MR-PRESSO, we computed QA (robust to weak instruments, with an appropriate type 1 error rate) to evaluate the heterogeneity. Finally, we calculated the conditional F statistic with the MVMR package to assess weak instrument bias.
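The heterogeneity and instrument-strength checks described above reduce to a few formulas, sketched below. The sketch assumes first-order IVW weights for Cochran's Q, the usual Higgins form I² = (Q − df)/Q, and the common approximation F = ((N − k − 1)/k) × R²/(1 − R²); the study may have used different variants, and the inputs are hypothetical.

```python
def cochran_q(beta_exp, beta_out, se_out):
    """Cochran's Q: weighted squared deviations of Wald ratios from the IVW mean."""
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    weights = [(bx / so) ** 2 for bx, so in zip(beta_exp, se_out)]
    ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    return sum(w * (r - ivw) ** 2 for w, r in zip(weights, ratios))


def i_squared(q, degrees_of_freedom):
    """Higgins I^2 in percent; zero or negative means no observed heterogeneity."""
    if q <= 0:
        return 0.0
    return 100.0 * (q - degrees_of_freedom) / q


def f_statistic(r2, n_samples, n_snps):
    """Instrument strength from variance explained; F > 10 is the rule of thumb."""
    return (n_samples - n_snps - 1) / n_snps * r2 / (1.0 - r2)


# Hypothetical two-SNP example.
q = cochran_q([0.1, 0.2], [0.01, 0.04], [0.01, 0.01])
print(f"Q = {q:.2f}, I^2 = {i_squared(q, degrees_of_freedom=1):.1f}%")
print(f"F = {f_statistic(r2=0.05, n_samples=350473, n_snps=100):.1f}")
```

In this toy example Q is below its degrees of freedom, so I² is negative, which the text above reads as no observed heterogeneity.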

Colocalization analysis

Apart from MR and genetic correlations, colocalization analysis is another widely used approach that leverages genetic data to obtain insights into the genetic relationship between two traits. We performed colocalization analysis only between traits that had a significant causal effect in MR to supplement the association between the two traits.

Description of colocalization analysis

The colocalization analysis used is based on a novel Bayesian statistical test, which focuses on a single genomic region at a time to identify shared genes. This method explores whether the data support a shared causal variant for both traits. If there is more than one independent association at a locus for a trait of interest, only the strongest of these distinct association signals is considered by the algorithm. Therefore, using the Bayesian-based colocalization analysis, we have at most a single locus in a region at a time, which is the shared causal variant for the two traits. First, the two datasets for the two traits come from unrelated individuals of the same ethnicity. We relate genotypes to phenotypes in a linear regression, assuming that the causal variant is included in the set of variants and that there is at most one association for each trait in the genomic region of interest. Then, we have a binary matrix with one row per trait, in which each element (0 or 1) indicates whether the variant is causally associated with the trait (1 means association). At most one element is non-zero for each trait within the region. The two row vectors generate five configurations, which correspond to five hypotheses: H0: no association. H1: association between a variant and trait 1, but not trait 2. H2: association between a variant and trait 2, but not trait 1. H3: association with trait 1 and with trait 2, through two distinct variants. H4: association between one shared variant and both trait 1 and trait 2. Each hypothesis corresponds to a set of variant configurations, and we would like to calculate the posterior probability of each hypothesis given the data. The probability of the data under each configuration can be computed from GWAS summary statistics, and the prior probabilities are assigned by the study. Finally, we denoted the posterior probabilities as PP0, PP1, PP2, PP3, and PP4 for SNP causality relating two traits under the above five mutually exclusive hypotheses. A large PP4 (PP4 > 75%) is evidence of colocalization, suggesting a variant shared between RCDW/MCV and the outcome.
Although there are other colocalization methods that allow for multiple causal variants, for example, through the Sum of Single Effects regression framework, our focus was on the causal inference of MCV on hemorrhagic stroke. Colocalization analysis can supplement the relationship between MCV and hemorrhagic strokes but cannot infer a causal effect between the two traits. We did not assess the number of shared variants and used the more efficient Bayesian statistical test for colocalization.
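How the five posterior probabilities arise from summary statistics can be sketched as follows. This is an illustrative re-implementation of the single-causal-variant approach using Wakefield approximate Bayes factors, not the code used in the study; the prior values (1e-4, 1e-4, 1e-5), the prior effect-size SD, and the toy Bayes factors are assumptions, and the region must contain at least two SNPs.

```python
import math


def log_sum(values):
    """log(sum(exp(v))) computed stably."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_diff(a, b):
    """log(exp(a) - exp(b)) for a > b."""
    return a + math.log1p(-math.exp(b - a))


def log_abf(beta, se, prior_sd=0.15):
    """Wakefield log approximate Bayes factor for one SNP (prior_sd is assumed)."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Return [PP0, PP1, PP2, PP3, PP4] from per-SNP log Bayes factors."""
    s1, s2 = log_sum(labf1), log_sum(labf2)
    s12 = log_sum([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                                  # H0: no association
        math.log(p1) + s1,                                    # H1: trait 1 only
        math.log(p2) + s2,                                    # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12), # H3: two variants
        math.log(p12) + s12,                                  # H4: shared variant
    ]
    total = log_sum(log_h)
    return [math.exp(h - total) for h in log_h]


# Toy region of three SNPs: strong signals at the same SNP vs. at different SNPs.
pp_shared = coloc_posteriors([20.0, 0.0, 0.0], [15.0, 0.0, 0.0])
pp_distinct = coloc_posteriors([20.0, 0.0, 0.0], [0.0, 15.0, 0.0])
print([round(p, 3) for p in pp_shared], [round(p, 3) for p in pp_distinct])
```

When both traits peak at the same SNP, almost all posterior mass falls on H4; when they peak at different SNPs, it shifts to H3, which is the distinction the PP4 > 75% threshold is meant to capture.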

Selection of locus for colocalization analysis

Before colocalization, we removed SNPs whose minor allele frequency, allele frequency, or p value fell outside the valid range and deleted entries with a missing SNP ID. The search regions used for colocalization are not consistent across studies. One study defined the regions within 500 kb of the lead SNPs, another study searched for significant SNPs at ±200 kb of the transcribed region of interest, and another study defined test regions from MR-independent SNPs within a 200 kb distance to define 118 unique regions. In this study, we searched for the lead SNP within a region of ±1 MB and repeated the colocalization analysis within ±3 MB using all the GWAS data. We did not screen the lead SNPs from the UMR or MVMR data, because a shared SNP might not be on the causal path between exposure and outcome, and the most strongly associated SNPs might be excluded during MR pre-processing, such as during clumping or harmonization. We used the default setting of prior probabilities.

Mediation analysis

A mediator is a variable on the causal path between an exposure and an outcome variable; it is vertically pleiotropic. We used mediation analysis to account for the mechanism whereby our exposure could affect the outcome. The rationale of mediation analysis, its assumptions, the estimation of mediation by MR, and implementation challenges have been discussed elsewhere. In mediation analysis, assuming no interaction, we defined and estimated three parameters for the effects of the exposure on the outcome: (1) the total effect is the effect of the exposure on the outcome via all pathways; (2) the direct effect (controlled or natural) is the remaining effect beyond mediator pathways; and (3) the indirect effect (or mediation effect) is what acts through the mediator(s). The indirect effect can be calculated as the total effect minus the direct effect with the difference method, or as the effect of the exposure on the mediator multiplied by the effect of the mediator on the outcome with the product of coefficients method. As shown in Figure 1C, suppose we only involve MCV as the exposure and ICH as the outcome in the mediation analysis (this is shown in the results). As suggested, we applied the two-step MR (akin to the product of coefficients method), which assumes no interactions between exposures and mediators. We computed the effects of MCV on ICH and of MCV on the mediator in MVMR, controlling for RCDW. We also evaluated the effect of the mediator on ICH in MVMR, controlling for both exposures, to ensure that any effect of the mediator on the outcome is independent of the exposure. Considering the binary outcome with low prevalence (<10%), the proportion mediated is approximately the indirect effect divided by the total effect, per SD increase in genetically instrumented MCV. We applied bootstrap resampling to calculate the confidence intervals.
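The product of coefficients calculation and its resampled confidence interval can be sketched as below. This is a minimal illustration, assuming a parametric bootstrap in which each coefficient is redrawn from a normal distribution around its estimate; the study does not specify its resampling scheme, and every number here is hypothetical rather than taken from the results.

```python
import random


def proportion_mediated(beta_exp_med, beta_med_out, beta_total):
    """Indirect effect (product of coefficients) divided by the total effect."""
    return beta_exp_med * beta_med_out / beta_total


def bootstrap_ci(b_em, se_em, b_mo, se_mo, b_tot, se_tot, n_draws=10000, seed=0):
    """Percentile 95% interval from a parametric bootstrap of the three estimates."""
    rng = random.Random(seed)
    draws = sorted(
        proportion_mediated(
            rng.gauss(b_em, se_em), rng.gauss(b_mo, se_mo), rng.gauss(b_tot, se_tot)
        )
        for _ in range(n_draws)
    )
    return draws[int(0.025 * n_draws)], draws[int(0.975 * n_draws) - 1]


# Hypothetical estimates: exposure -> mediator, mediator -> outcome (log-odds),
# and total exposure -> outcome (log-odds).
point = proportion_mediated(-0.10, 0.05, -0.20)
low, high = bootstrap_ci(-0.10, 0.02, 0.05, 0.01, -0.20, 0.03)
print(f"proportion mediated = {point:.1%} ({low:.1%}, {high:.1%})")
```

A ratio of estimates has a skewed sampling distribution, which is why a resampled percentile interval is often preferred over a symmetric normal approximation and can be quite asymmetric around the point estimate.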

Results

Phenotypic correlation analysis

Figure 2 visualizes the phenotypic correlations between all traits after correction for multiple testing, where green indicates a positive correlation and red indicates a negative correlation. As MCV and RCDW are biologically correlated, they appear negatively associated (r2 = −0.10; Figure 2A). We observed high positive associations between SAH and AOS (r2 = 0.79), nITH and ICH (r2 = 0.78), nITH and SAH (r2 = 0.69), and nITH and AOS (r2 = 0.57); these traits cluster together in the hierarchically clustered heatmap (Figure 2B), possibly due to the overlapping endpoints from SAH/ICH. Moreover, SAH is more closely related to AOS than to nITH.

Figure 2

Phenotypic correlation after applying multiple testing corrections

(A) Phenotypic correlation matrix.

(B) Hierarchically clustered heatmap. MCV, mean corpuscular volume; RCDW, red blood cell distribution width; SAH, subarachnoid hemorrhage; ICH, intracerebral hemorrhage; nITH, nontraumatic intracranial hemorrhage; AOS, cerebral aneurysm, aneurysm operations, and SAH.


UMR results

MR estimates for UMR are presented in the forest plots across Figures 3A–3D. Each forest plot presents the MR estimates between MCV/RCDW and the outcome, including the method, number of SNPs, OR (95% CI), and p values. The outcome title is presented at the bottom middle of each plot. A pair of arrows above the outcome title indicates the direction of risk that the exposure confers on the outcome. When the OR and its CI are below one, toward the “lower risk” arrow (p < 0.05), genetically determined higher MCV/RCDW lowers the risk of the outcome; high MCV/RCDW is protective against the outcome. On the other hand, if the OR and its CI are above one, toward the “higher risk” arrow (p < 0.05), genetically determined higher MCV/RCDW increases the risk of the outcome; high MCV/RCDW is a risk factor for the outcome. If the CI crosses one (p ≥ 0.05), the causal effect of MCV/RCDW on the outcome is non-significant.
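The reading rule for the forest plots can be expressed as a small helper that converts a log-odds estimate and its standard error into an OR, a 95% CI, a two-sided p value, and a direction label. The function name and labels are illustrative; the standard error in the first example is back-calculated by us from a reported interval and is only approximate.

```python
import math


def summarize_or(beta, se):
    """Return (OR, CI lower, CI upper, two-sided p, direction) for a log-odds beta."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - 1.96 * se)
    upper = math.exp(beta + 1.96 * se)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))  # normal approximation
    if upper < 1.0:
        direction = "lower risk"       # whole CI below one: protective
    elif lower > 1.0:
        direction = "higher risk"      # whole CI above one: risk factor
    else:
        direction = "non-significant"  # CI crosses one
    return odds_ratio, lower, upper, p_value, direction


# An OR of 0.89 with an assumed SE of 0.0544 on the log-odds scale.
protective = summarize_or(math.log(0.89), 0.0544)
# An OR of 1.01 with an assumed SE of 0.05 on the log-odds scale.
null_result = summarize_or(math.log(1.01), 0.05)
print(protective[4], null_result[4])
```

Rounding of the reported OR and CI means a p value recomputed this way will usually differ slightly from the published one.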

Figure 3

Univariable Mendelian randomization analysis results

OR (95% CI) estimates in odds ratio and the 95% confidence intervals for outcome risk per 1 SD increase in exposure genetically predicted levels. Significance level for p is 0.05. IVW, inverse variance weighted; FE, fixed effect; RE, random effect; RCDW, red blood cell distribution width; MCV, mean corpuscular volume; MR-PRESSO, Mendelian randomization pleiotropy residual sum and outlier; SNP, single-nucleotide polymorphism; CIs, confidence intervals.

The IVW fixed effect model, RadialMR IVW (FE), showed that each SD increase in genetically predicted MCV decreased the odds of ICH (OR = 0.89 [0.8–0.99], p = 0.036) and nITH (OR = 0.89 [0.82–0.97], p = 0.005) (Figures 3A and 3C). No evidence of causality was found between MCV and SAH (OR = 0.88 [0.78–1], p = 0.051) or between MCV and AOS (OR = 1.01 [0.91–1.11], p = 0.89); the same null results were observed between RCDW and every outcome: ORRCDW-SAH = 1 (0.85–1.19), p = 0.96; ORRCDW-ICH = 0.96 (0.83–1.11), p = 0.57; ORRCDW-nITH = 0.97 (0.87–1.09), p = 0.62; and ORRCDW-AOS = 0.96 (0.84–1.1), p = 0.6 (Figures 3A–3D). Heterogeneity test statistics indicated a lack of heterogeneity and pleiotropy bias for every exposure-outcome pair (Cochran’s Q: p > 0.93; Rücker’s Q: p > 0.06; MR-PRESSO global test: p > 0.95; Table S1). The F statistics were over 84, indicating no weak instrument bias (Table S1). The leave-one-out analyses showed a lack of influential instruments in any MR estimate (Figures S1–S8). The funnel plots are relatively symmetrical (Figures S9A–S10D), suggesting a lack of directional pleiotropy. These test indexes supported the use of RadialMR IVW (FE) as the primary analysis for UMR estimates. The IVW under the fixed effect model is most powerful when all IVs are valid.
Therefore, for the other models (RadialMR Egger, weighted median, and weighted mode), MCV showed nonsignificant relations with ICH and nITH, possibly due to limited power. MR-Egger estimators are less powerful and less efficient than IVW estimators due to the need to estimate both the slope and intercept parameters. The confidence interval for MR-Egger is wider than that for IVW because the standard error of the MR-Egger estimate tends to be larger. Moreover, violation of the instrument strength independent of direct effect assumption introduces more bias in the MR-Egger estimate than in the IVW estimate. Similarly, the weighted median and weighted mode take the median or mode of the ratio estimate distribution as the causal effect. They could be less efficient than estimators that combine multiple genetic variants more directly, e.g., IVW. For SAH, we observed marginal significance for MCV (OR = 0.88 [0.78–1], p = 0.051) in RadialMR IVW (FE), while we observed significant effects in RadialMR IVW (RE) (OR = 0.88 [0.79–0.99], p = 0.035) and MR-PRESSO (OR = 0.88 [0.79–0.98], p = 0.025); the remaining models were non-significant. MR-PRESSO also adopted IVW before and after removal of outliers. However, we still considered that the random effects and fixed effect models gave rather consistent results, because the point estimates were the same and the latter produced a confidence interval only 0.01 wider than the former. The fixed effect p value of 0.051 is marginal and may be clinically significant. We further compared the models when adjusting for RCDW in MVMR. To conclude, UMR suggested that higher genetically predicted MCV levels are associated with lower ICH and nITH risk. The association between MCV and SAH needed further exploration in MVMR. No substantial evidence was found for an association between MCV/RCDW and the other outcomes.

MVMR results

MR estimates for MVMR are presented in the forest plots across Figures 4A–4D, which can be interpreted in the same way as Figures 3A–3D. After adjusting for RCDW, we found that MCV still reduced the risk of ICH (OR = 0.87 [0.78–0.98], p = 0.021) and nITH (OR = 0.88 [0.8–0.96], p = 0.004) under the IVW fixed effect model, MR IVW (FE); other models also showed similar MR estimates (ORMCV-ICH ranged from 0.87 to 0.88; ORMCV-nITH were all 0.88) with consistent significance (p < 0.05) (Figures 4B and 4C). There was no support for causal effects in the remaining exposure-outcome estimates in any model. For example, under MR IVW (FE), ORMCV-SAH = 0.91 (0.8–1.03), p = 0.14; ORMCV-AOS = 1.02 (0.92–1.13), p = 0.73; ORRCDW-SAH = 0.94 (0.79–1.12), p = 0.49; ORRCDW-ICH = 0.88 (0.76–1.03), p = 0.1; ORRCDW-nITH = 0.91 (0.81–1.03), p = 0.14; ORRCDW-AOS = 1.02 (0.89–1.17), p = 0.79 (Figures 4A–4D). Heterogeneity test statistics all strongly indicated a lack of heterogeneity and pleiotropy (Cochran’s Q p > 0.44, QA p > 0.44, and MR-PRESSO global test p > 0.46 in Table S2). Moreover, for weak instrument assessment, the conditional F statistics (Table S2) were over 18; the instruments were adequately powerful for MVMR. Besides the Rücker model-selection framework, following the MVMR model-selection framework, we could also choose MVMR IVW as our preferred model because of the lack of evidence of heterogeneity and weak instruments. Thus, we considered MVMR IVW as the sensitivity analysis and alternative MVMR estimation. We still observed similar results and drew the same causal inference for MCV and ICH (OR = 0.87 [0.78–0.98], p = 0.02) and for MCV and nITH (OR = 0.88 [0.8–0.96], p = 0.004) (Figures 4A and 4C); we found no evidence to support causation between any other exposure and outcome (Figures 4A–4D).

Figure 4

MVMR analysis results

OR (95% CI) estimates in odds ratio and the 95% confidence intervals for outcome risk per 1 SD increase in exposure genetically predicted levels. Significance level for p is 0.05. RCDW, red blood cell distribution width; MCV, mean corpuscular volume; IVW, inverse variance weighted; FE, fixed effect; RE, random effect; MR-PRESSO, Mendelian randomization pleiotropy residual sum and outlier; SNP, single-nucleotide polymorphism; OR, odds ratio.

MVMR produced more consistent MR estimates than UMR, justifying the control of pleiotropic instruments from RCDW and MCV in our analysis. MR estimates for MCV with SAH in UMR all became nonsignificant in MVMR, indicating that the adjustment strengthened the stability of the MR estimates. We considered MVMR, especially the IVW models, as the final MR estimates for the interpretation of our study. Therefore, genetically determined MCV was significantly related to ICH and nITH. The ORs of ICH and nITH were 0.87 (0.78–0.98), p = 0.021 and 0.88 (0.8–0.96), p = 0.004, respectively, per genetically predicted 1 SD increase in the MCV level. We conducted the colocalization analyses between MCV and ICH and between MCV and nITH. The PP4 values for these two pairs of traits within the ±1 MB region of the top hit were 75.1% and 26.4%; similar values were observed within the ±3 MB region (75.6% and 25.9%) (Table 2). Using colocalization analysis based on the Bayesian statistical test, we identified a shared SNP (rs62393683) for MCV and ICH with PP4 over 75% in both regions (Table 2), indicating strong support for colocalization between the two traits; an association was found between MCV and ICH [61]. For MCV and nITH, the two PP4 values did not support colocalization, possibly as a result of limited power [61]. This SNP (rs62393683) was not involved in our UMR or MVMR data.
We queried PhenoScanner, a database of human genotype-phenotype associations, for this SNP and found that it is associated with MCV, RCDW, hemoglobin concentration, hematocrit, diastolic blood pressure, hypertension, and hereditary/genetic hematological disorders, among other traits. However, hemorrhagic stroke is not recorded among these associations. Our colocalization result might therefore fill a gap in the known association between MCV and ICH and contribute to large-scale genetic association studies.
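The reported ORs, confidence intervals, and p values can be cross-checked against one another. The sketch below is a reader-side sanity check, not the authors' code: it assumes the intervals are symmetric Wald-type 95% confidence intervals on the log-odds scale and that the p values come from a two-sided normal test, neither of which is stated in this section. The function name `wald_from_or` is hypothetical; the inputs are the three-decimal total-effect ORs from Table 3.

```python
import math
from statistics import NormalDist


def wald_from_or(or_point, ci_low, ci_high, level=0.95):
    """Back-calculate log-OR, standard error, and two-sided p value.

    Assumes (not confirmed by the paper) a symmetric Wald interval
    on the log-odds scale and a normal test statistic.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return beta, se, p


# MVMR IVW estimates of MCV per 1 SD increase, as reported in Table 3.
beta_ich, se_ich, p_ich = wald_from_or(0.873, 0.778, 0.98)
beta_nith, se_nith, p_nith = wald_from_or(0.876, 0.801, 0.958)

print(f"ICH : beta={beta_ich:.3f}, se={se_ich:.3f}, p={p_ich:.3f}")
print(f"nITH: beta={beta_nith:.3f}, se={se_nith:.3f}, p={p_nith:.3f}")
```

Under these assumptions the back-calculated p values round to the reported 0.021 and 0.004, which suggests the published intervals and p values are mutually consistent.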

Table 2

Colocalization analysis results under significant MR estimates

Region within position ± 1 MB
Chr	Position	Genes	Traits	Candidate SNP	NSNP	PP0	PP1	PP2	PP3	PP4
6	25,633,204	ZFP57	MCV-ICH	rs62393683	138,664	0.000	0.134	0.000	0.115	0.751
6	25,633,204	ZFP57	MCV-nITH	rs62393683	138,664	0.000	0.425	0.000	0.311	0.264

PP0-PP4: five posterior probabilities for SNP causality relating two traits for five mutually exclusive hypotheses: (1) H0, association with neither trait; (2) H1, association with trait 1 but not trait 2; (3) H2, association with trait 2 but not trait 1; (4) H3, association with both traits, with two distinct SNPs; and (5) H4, association with both traits, with one shared SNP. SNP, single nucleotide polymorphism; candidate SNP, candidate causal SNP with maximum PP4 in a test region; NSNP, the number of SNPs included in the colocalization analysis; MCV, mean corpuscular volume; RCDW, red blood cell distribution width; ICH, intracerebral hemorrhage; SAH, subarachnoid hemorrhage; nITH, nontraumatic intracranial hemorrhage.
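The five posterior probabilities in Table 2 are mutually exclusive and should sum to one, and the text reads PP4 above 75% as support for colocalization. A minimal sketch of that reading follows; the helper names are hypothetical, the values are copied from Table 2, and the 0.75 cut-off is taken from the wording of the text rather than from a stated analysis setting.

```python
# Posterior probabilities copied from Table 2 (region within position +/- 1 MB).
TABLE2 = {
    "MCV-ICH": {"PP0": 0.000, "PP1": 0.134, "PP2": 0.000, "PP3": 0.115, "PP4": 0.751},
    "MCV-nITH": {"PP0": 0.000, "PP1": 0.425, "PP2": 0.000, "PP3": 0.311, "PP4": 0.264},
}

# Assumption: cut-off inferred from the text ("PP4 over 75%").
PP4_THRESHOLD = 0.75


def posteriors_sum_to_one(pp, tol=0.005):
    """Check that H0..H4 posteriors sum to 1 within rounding tolerance."""
    return abs(sum(pp.values()) - 1.0) <= tol


def supports_colocalization(pp, threshold=PP4_THRESHOLD):
    """True when the shared-variant hypothesis (H4) exceeds the cut-off."""
    return pp["PP4"] > threshold


def most_probable_hypothesis(pp):
    """Return the label of the hypothesis with the largest posterior."""
    return max(pp, key=pp.get)


for pair, pp in TABLE2.items():
    print(pair, posteriors_sum_to_one(pp), supports_colocalization(pp),
          most_probable_hypothesis(pp))
```

For MCV-nITH the largest posterior is PP1 (association with MCV only), which is in line with the limited-power explanation given in the text: the region shows a clear MCV signal but too little outcome signal to separate H3 from H4.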


Mediation results

We applied the IVW fixed-effect model to compute the MVMR estimates of MCV on the mediators and outcomes (results are shown in Figure 5). Figure 5 is interpreted differently from Figures 3 and 4. As described in the Materials and methods, it reports four estimates: the MR estimate of MCV on the outcome adjusted for RCDW, i.e., the total effect; the MR estimate of MCV on the mediator adjusted for RCDW; the estimate of MCV on the outcome adjusted for RCDW and the mediator, i.e., the direct effect; and the estimate of the mediator on the outcome adjusted for MCV and RCDW. Figure 5 includes the names of the outcomes or mediators, the number of SNPs, the MR estimates (95% CI), and the p values.

Figure 5

Two-step MVMR analysis results

Estimates shown: total effect of mean corpuscular volume (MCV) on outcomes conditional on red blood cell distribution width (RCDW); direct effect of MCV on outcomes conditional on RCDW; effect of MCV on mediators conditional on RCDW; effect of MCV on outcomes conditional on RCDW and diastolic blood pressure. ICH, intracerebral hemorrhage; nITH, nontraumatic intracranial hemorrhage; BD, blood disorders; SBP, systolic blood pressure; DBP, diastolic blood pressure.

In two-step MVMR, for step one (MCV on mediator), only the estimate of MCV on DBP, conditional on RCDW, was significant; we had similar findings when we used MVMR IVW: −0.085 (−0.166 to −0.003, p = 0.031; Table S3). Therefore, we selected DBP as the mediator in the subsequent analysis. For step two (the effect of DBP on ICH/nITH), the total effect, and the direct effect, the estimates were also significant. The heterogeneity tests (Cochran’s Q p > 0.45 and QA p > 0.45), the MR-PRESSO global test (p > 0.46), and the conditional F statistics (>10), calculated as in the MVMR analysis, showed no evidence of heterogeneity or weak instrument bias (Table S3). However, because the DBP data came from the International Consortium for Blood Pressure and UK Biobank, the study participants with DBP and MCV data partly overlapped. We therefore assessed sample overlap bias and type 1 error inflation; across the whole range of possible sample overlap, there was no increase in bias (Table S4). We could thus infer that the MR-based statistics for the mediation effect should be robust to unmeasured confounding.

Table 3 presents the ORs for the total, indirect, and direct effects and the proportion mediated for MCV on ICH via DBP and for MCV on nITH via DBP. The significant results for these statistics indicate that diastolic blood pressure mediates the relationship between MCV and hemorrhagic strokes. An indirect effect of MCV on hemorrhagic strokes via DBP existed, although it was close to null (OR = 0.995 [0.99,0.999]). The proportion mediated was also small, 3.1% (0.1%,14.3%) for ICH and 3.4% (0.2%,15.8%) for nITH; the causal effect of MCV on hemorrhagic strokes might therefore result mainly from an independent causal mechanism of MCV or from other mediators.

Table 3

Mediation effect of MCV on ICH and nITH via DBP

Exposure	Mediator	Outcome	Total effect (OR)	Indirect effect (OR)	Direct effect (OR)	Proportion mediated (%)
MCV	DBP	ICH	0.873 (0.778,0.98); p = 0.021	0.995 (0.99,0.999); p < 0.001	0.866 (0.767,0.977); p = 0.02	3.1% (0.1%,14.3%)
MCV	DBP	nITH	0.876 (0.801,0.958); p = 0.004	0.995 (0.99,0.999); p < 0.001	0.866 (0.767,0.977); p = 0.02	3.4% (0.2%,15.8%)
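Two-step MR mediation is commonly summarized with the product-of-coefficients approach, in which the indirect effect is the exposure-to-mediator estimate multiplied by the mediator-to-outcome estimate, and the proportion mediated is the indirect effect divided by the total effect (all on the log-odds scale). The sketch below illustrates that arithmetic only. Whether the authors computed Table 3 exactly this way is an assumption, the function name is hypothetical, and the mediator-to-outcome coefficient is an invented placeholder because this section does not report it, so the output is not expected to reproduce the table.

```python
import math


def product_of_coefficients(beta_exposure_mediator, beta_mediator_outcome, beta_total):
    """Indirect effect and proportion mediated on the log-odds scale."""
    indirect = beta_exposure_mediator * beta_mediator_outcome
    return {
        "indirect_log_or": indirect,
        "indirect_or": math.exp(indirect),
        "proportion_mediated": indirect / beta_total,
    }


# Step one, MCV on DBP conditional on RCDW: value reported in the text.
beta_mcv_dbp = -0.085
# Step two, DBP on ICH conditional on MCV and RCDW: HYPOTHETICAL placeholder,
# not a value from the paper.
beta_dbp_ich = 0.06
# Total effect of MCV on ICH: log of the OR reported in Table 3.
beta_total_ich = math.log(0.873)

summary = product_of_coefficients(beta_mcv_dbp, beta_dbp_ich, beta_total_ich)
print(summary)
```

Because the step-one estimate is negative and a higher DBP is expected to raise hemorrhage risk, the product is negative, i.e., the indirect path works in the same protective direction as the total effect.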


Discussion

This analysis explores the causal effects of MCV and RCDW on hemorrhagic strokes, including SAH, ICH, nITH, and AOS. First, our phenotypic correlation analysis suggested a negative correlation between RCDW and MCV. Second, UMR and MVMR revealed that only higher genetically predicted MCV is protective against ICH and nITH; there is no strong evidence of a causal relationship between RCDW/MCV and any other outcome. Third, the colocalization analysis pointed to a single variant on the causal pathway between MCV and ICH. Finally, mediation analysis suggested that diastolic blood pressure mediates the relationship between MCV and ICH/nITH. To the best of our knowledge, this is the first study to report the causal effects of RCDW/MCV on the risk of hemorrhagic strokes.

The Rücker model selection framework suggested that the IVW fixed-effect model be selected for interpretation of our results. The IVW fixed-effect and multiplicative random-effect models should produce a consistent MR estimate when there is either no pleiotropy (intercept = 0 for all variants) or balanced pleiotropy, although the residual standard error of the random-effect model can be greater. However, in the UMR estimate between MCV and SAH, our standard error was smaller in IVW (RE) than in IVW (FE), and the causal effect shown by UMR IVW (RE) moved toward the null in the MVMR. We speculated that there might be pleiotropic SNPs in UMR that were subsequently controlled for in MVMR. We also observed that all estimates were consistent for each exposure-outcome pair. Therefore, our MVMR should produce more robust estimates than the UMR.

Our results are consistent with previous studies. Hypertension was more common in a low-MCV group, although that observation was made among chronic kidney disease patients. We could also infer that genetically determined lower MCV increased DBP in our study.
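The relationship between the fixed-effect and multiplicative random-effect IVW models discussed above can be made concrete. In the sketch below (a generic textbook formulation, not the authors' pipeline), both models share one point estimate, and the random-effect standard error is the fixed-effect one scaled by the square root of the dispersion Q/(n − 1). Many implementations floor that dispersion at 1 so the random-effect standard error is never smaller; without the floor, an underdispersed instrument set yields a smaller random-effect standard error, which is one way the pattern described for MCV and SAH could arise. The SNP-level numbers are invented for illustration, and the function name `ivw` is hypothetical.

```python
import math


def ivw(bx, by, se_y, floor_at_one=True):
    """Inverse-variance weighted MR estimate from summary statistics.

    bx, by: SNP-exposure and SNP-outcome associations; se_y: SEs of by.
    Returns (beta, se_fixed, se_random). The random-effect SE uses the
    multiplicative model: se_fixed * sqrt(Q / (n - 1)).
    """
    w = [1.0 / s ** 2 for s in se_y]
    sxx = sum(wi * x * x for wi, x in zip(w, bx))
    sxy = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    beta = sxy / sxx                      # weighted regression through origin
    se_fixed = math.sqrt(1.0 / sxx)
    q = sum(wi * (y - beta * x) ** 2 for wi, x, y in zip(w, bx, by))
    phi = q / (len(bx) - 1)               # dispersion (Cochran's Q / df)
    if floor_at_one:
        phi = max(1.0, phi)
    return beta, se_fixed, se_fixed * math.sqrt(phi)


# HYPOTHETICAL summary statistics for five SNPs (not from the paper).
bx = [0.10, 0.08, 0.12, 0.05, 0.09]
by = [-0.012, -0.011, -0.016, -0.004, -0.013]
se_y = [0.010] * 5

beta, se_fe, se_re_floored = ivw(bx, by, se_y, floor_at_one=True)
_, _, se_re_unfloored = ivw(bx, by, se_y, floor_at_one=False)
print(beta, se_fe, se_re_floored, se_re_unfloored)
```

With these invented inputs the SNP-specific ratios agree closely, so the dispersion is below 1: the floored random-effect standard error equals the fixed-effect one, while the unfloored version is smaller.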
Our phenotypic correlation analysis showed a negative association between RCDW and MCV, consistent with previous studies. A meta-analysis and a longitudinal study found that elevated RCDW was a risk factor for general stroke and ischemic stroke but not for SAH or ICH. We found no evidence for an effect of MCV on SBP after controlling for RCDW, which is consistent with a previous report of no significant relationship between SBP and MCV after adjusting for red blood cell count.

The mechanism linking MCV to ICH and nITH may be mediated by blood pressure. Studies have shown an important adaptive mechanism that regulates blood viscosity in response to excessive erythrocytosis by increasing the proportion of juvenile, flexible erythrocytes with higher MCV. We found a mediation effect from DBP rather than SBP. This might be because DBP is a more specific measure of overall resistance to blood flow than SBP and is thus more strongly associated with blood characteristics that influence viscosity, such as MCV. When whole blood viscosity increases, peripheral resistance to blood flow also increases, and blood pressure then rises to maintain blood flow against the increased peripheral resistance. Blood viscosity was positively associated with systolic blood pressure, diastolic blood pressure, and mean arterial pressure. Therefore, we could infer that decreased whole blood viscosity would result in decreased blood pressure.

How MCV can be applied to the prevention of hemorrhagic strokes in medical practice, i.e., the external validity of our findings, is still an open question. First, our result was based on the European population and cannot be extended to populations with different ethnic backgrounds. In the Korean elderly population, MCV values were significantly greater than in young adults, while the mean age of first-ever hemorrhagic stroke in our data was between 50 and 70 years.
The association of MCV with disease may also be heterogeneous across conditions. For example, some studies proposed that elevated MCV is associated with a reduced incidence of metabolic syndrome and predicts a lower risk of in-stent restenosis. Multiple studies indicate that subjects with higher MCV levels have lower body mass index and lower hemoglobin A1c, cholesterol, triglyceride, and uric acid levels, as well as less diabetes mellitus, dyslipidemia, hypertension, and (for men only) metabolic syndrome (refs. 28, 29, 30, 31). However, higher MCV levels are also related to an increased risk of all-cause mortality and cognitive impairment. Athletes with adequate and regular training have higher MCV levels than sedentary subjects. Taken together, elevated MCV seems to reduce a range of health problems, although this is not always the case. Heterogeneity exists between genders, in patients with anemia or chronic diseases, and in athletes without a normal MCV. Our study conclusion can only be generalized to the European population, and there was not much other phenotypic information in our data.

Second, MCV may be altered by changes in crystalline osmotic pressure. However, according to one of our co-authors, a neurologist, there is currently still a lack of prescribed medications that can modulate MCV. As mentioned above, trained athletes have higher MCV levels than sedentary subjects, so it is likely that MCV can be altered by physical activity. Still, the efficacy of short-term targeted interventions on MCV cannot be assessed. These points do not preclude the efficacy of a therapeutic intervention on MCV levels, and our study still provides a cost-effective target for drug development for the prevention of hemorrhagic strokes.

Our study may have some clinical implications. We have identified novel potential targets for pharmacological therapy (RCDW/MCV) to reduce the risk of ICH and nITH.
As higher genetically predicted MCV is considered protective against ICH and nITH, experiment-based research is warranted to explore whether this finding can be exploited to develop new pharmaceutical treatments for hemorrhagic strokes. However, as we found supportive evidence of causality between MCV and ICH or nITH (a combination of ICH and SAH), but not between MCV and SAH, the causal effect of MCV on nITH might be driven by that of MCV on ICH. Further investigations are needed to confirm this inference and clarify the association between MCV and nITH.

Our study has some limitations. First, we did not include or adjust for other red blood cell indices in MVMR, such as hematocrit and mean corpuscular hemoglobin, which are correlated with MCV and RCDW and might affect stroke; they might share pleiotropic SNPs and should be further controlled for in MVMR. Second, we did not conduct a reverse causal analysis of hemorrhagic strokes on MCV/RCDW, so it is unknown whether reverse causation exists. Third, our data were collected from the European population, and our conclusion cannot be extended to other populations. Fourth, we did not perform causal inference using non-linear models; for the results that were null under linear models, we therefore cannot rule out a causal relationship that is non-linear. Despite these limitations, we found no heterogeneity or weak instrument bias, adopted multiple rigorous MR estimates, and followed documented approaches to interpret our results and present convincing findings on the causal effect of RCDW and MCV on hemorrhagic strokes.

In conclusion, we conducted the first MR analysis to investigate the role of genetically elevated RCDW and MCV in the risk of hemorrhagic strokes. We found that MCV might reduce the risk of ICH and nITH. These findings may provide new insights into potential drug targets for hemorrhagic stroke.

Data and code availability

This research used secondary, publicly available GWAS data from studies in which all participants gave informed consent and which were approved by review boards and/or ethics committees. Data are available directly from the Medical Research Council Integrative Epidemiology Unit OpenGWAS database (IDs as shown in Tables S1, S2, and S3), the Neale Lab – UK Biobank, and the FinnGen consortium (see Web resources). We also provide the processed data in Tables S5, S6, S7, and S8. Code for the analysis is available on GitHub (https://github.com/bbb801/MR).

75 in total

1. Low serum iron levels are associated with elevated plasma levels of coagulation factor VIII and pulmonary emboli/deep venous thromboses in replicate cohorts of patients with hereditary haemorrhagic telangiectasia.

Authors: John A Livesey; Richard A Manning; John H Meek; James E Jackson; Elena Kulinskaya; Michael A Laffan; Claire L Shovlin
Journal: Thorax Date: 2011-12-14 Impact factor: 9.139

2. Hematologic variables and venous thrombosis: red cell distribution width and blood monocyte count are associated with an increased risk.

Authors: Suely Meireles Rezende; Willem M Lijfering; Frits R Rosendaal; Suzanne C Cannegieter
Journal: Haematologica Date: 2013-07-26 Impact factor: 9.941

3. Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome.

Authors: Fabiola Del Greco M; Cosetta Minelli; Nuala A Sheehan; John R Thompson
Journal: Stat Med Date: 2015-05-07 Impact factor: 2.373

4. Colocalization of association signals at nicotinic acetylcholine receptor genes between schizophrenia and smoking traits.

Authors: Laila Al-Soufi; Javier Costas
Journal: Drug Alcohol Depend Date: 2021-01-10 Impact factor: 4.492

5. Mean corpuscular volume levels and all-cause and liver cancer mortality.

Authors: Hyung-Jin Yoon; Kyaehyung Kim; You-Seon Nam; Jae-Moon Yun; Minseon Park
Journal: Clin Chem Lab Med Date: 2016-07-01 Impact factor: 3.694

6. Mendelian randomization analysis with multiple genetic variants using summarized data.

Authors: Stephen Burgess; Adam Butterworth; Simon G Thompson
Journal: Genet Epidemiol Date: 2013-09-20 Impact factor: 2.135

7. Red cell distribution width in relation to incidence of stroke and carotid atherosclerosis: a population-based cohort study.

Authors: Martin Söderholm; Yan Borné; Bo Hedblad; Margaretha Persson; Gunnar Engström
Journal: PLoS One Date: 2015-05-07 Impact factor: 3.240

8. Improving the visualization, interpretation and analysis of two-sample summary data Mendelian randomization via the Radial plot and Radial regression.

Authors: Jack Bowden; Wesley Spiller; Fabiola Del Greco M; Nuala Sheehan; John Thompson; Cosetta Minelli; George Davey Smith
Journal: Int J Epidemiol Date: 2018-12-01 Impact factor: 7.196

Review 9. Red blood cell distribution width and ischaemic stroke.

Authors: Gang-Hua Feng; Hai-Peng Li; Qiu-Li Li; Ying Fu; Ren-Bin Huang
Journal: Stroke Vasc Neurol Date: 2017-06-23

10. The relationship between red blood cell distribution width and metabolic syndrome in elderly Chinese: a cross-sectional study.

Authors: Ziyu Yan; Yaguang Fan; Zhaowei Meng; Chao Huang; Ming Liu; Qing Zhang; Kun Song; Qiyu Jia
Journal: Lipids Health Dis Date: 2019-01-31 Impact factor: 3.876