Tianzhong Yang, Jingbo Niu, Han Chen, Peng Wei.
Abstract
BACKGROUND: Environmental exposures can regulate intermediate molecular phenotypes, such as gene expression, through different mechanisms and thereby lead to various health outcomes. It is of significant scientific interest to unravel the role of potentially high-dimensional intermediate phenotypes in the relationship between environmental exposures and traits. Mediation analysis is an important tool for investigating such relationships. However, it has mainly focused on low-dimensional settings, and a good measure of the total mediation effect is lacking. Here, we extend an R-squared (R²) effect size measure, originally proposed in the single-mediator setting, to the moderate- and high-dimensional mediator settings in the mixed model framework.
Keywords: R²-based effect; Aging; High-dimensional mediators; Iterative sure independence screening; Mediation analysis
Year: 2021 PMID: 34425752 PMCID: PMC8381496 DOI: 10.1186/s12859-021-04322-1
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Bias and standard deviation under high-dimensional settings (Simulation setting I): bias in the first row, and standard deviation in the second row for each scenario
| SOS | ab | ab (Lasso) | prop | prop (Lasso) | ratio | ratio (Lasso) | ||
|---|---|---|---|---|---|---|---|---|
| H1 | 0.0006 | 0.0013 | 0.0001 | 0.0069 | ||||
| ( | (0.0181) | (0.0370) | (0.2846) | (0.2744) | (0.0833) | (0.0795) | (0.1161) | (0.1117) |
| H2 | 0.0146 | 0.0299 | 0.1602 | 0.0058 | 0.0075 | |||
| ( | (0.0184) | (0.0375) | (0.6463) | (0.2604) | (0.1960) | (0.0777) | (0.2886) | (0.1165) |
| H3 | 0.0006 | 0.0053 | 0.0923 | 0.0547 | ||||
| ( | (0.0071) | (0.0653) | (0.7443) | (0.7547) | (0.2520) | (0.2495) | (0.2983) | (0.3392) |
| H4 | 0.0047 | 0.0095 | 0.1421 | 0.0025 | 0.0498 | |||
| ( | (0.0198) | (0.0403) | (0.2613) | (0.2519) | (0.0785) | (0.0689) | (0.0982) | (0.1055) |
| H5 | ||||||||
| ( | (0.0095) | (0.0109) | (0.0956) | (0.1618) | (0.0158) | (0.0184) | (0.0482) | (0.0295) |
ab: product measure; prop: proportion measure. (Lasso) indicates that the estimation is based on the Lasso regression; otherwise, it is estimated by a mixed-effect model. The true values are presented in Additional file 1: Table S2. The set of variables included in the model is denoted as . The set of true mediators is denoted as , the set of variables associated with exposure but not with outcome is denoted as , and the set of variables associated with outcome but not the exposure is denoted as . Variables in and are non-mediators falsely included in the putative mediator set
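The measures named in the table note can be illustrated in the simplest single-mediator case. The sketch below is not the authors' mixed-model estimator: it uses ordinary least squares on simulated data, and it assumes the usual textbook definitions, namely ab as the product of the exposure-to-mediator and mediator-to-outcome coefficients, prop as ab divided by the total effect, ratio as ab divided by the direct effect, the single-mediator R² measure as R²(Y~M) + R²(Y~X) − R²(Y~X+M), and SOS as that measure divided by R²(Y~X). All function names and simulation parameters are hypothetical.

```python
import numpy as np


def _design(n, predictors):
    """Design matrix with an intercept column followed by the predictors."""
    return np.column_stack([np.ones(n), *predictors])


def ols_coefs(y, *predictors):
    """OLS slope coefficients of y on the predictors (intercept dropped)."""
    X = _design(len(y), predictors)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]


def r_squared(y, *predictors):
    """Coefficient of determination of an OLS fit of y on the predictors."""
    X = _design(len(y), predictors)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()


def mediation_measures(x, m, y):
    """Single-mediator effect sizes under the assumed definitions above."""
    a = ols_coefs(m, x)[0]            # exposure -> mediator
    c_prime, b = ols_coefs(y, x, m)   # direct effect, mediator -> outcome
    c = ols_coefs(y, x)[0]            # total effect
    ab = a * b
    r2_yx = r_squared(y, x)
    r2_med = r_squared(y, m) + r2_yx - r_squared(y, x, m)
    return {
        "ab": ab,
        "prop": ab / c,
        "ratio": ab / c_prime,
        "R2_med": r2_med,
        "SOS": r2_med / r2_yx,
    }


# Simulated example (hypothetical parameters): a = 0.5, b = 0.4, c' = 0.3.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.3 * x + 0.4 * m + rng.normal(size=n)

measures = mediation_measures(x, m, y)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these simulated coefficients the true product measure is 0.5 × 0.4 = 0.2 and the true proportion is 0.2 / 0.5 = 0.4, so the printed estimates should fall near those values.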
Fig. 1 Boxplots of the bias across simulation replications based on a two-step variable selection method, either the iterative SIS or FDR (simulation setting II). The X-axis shows the percentage of true mediators; the Y-axis shows the bias across simulation replications. A, B non-mediators are included in addition to the true mediators; C, D non-mediators are included in addition to the true mediators. Rsq (All): based on all the data without variable selection; Rsq (VS): based on the variables selected either by iterative SIS (A, C) or by FDR (B, D); Rsq (True): based on the true model/mediator set using all the data. The numerical values, and the bias and variance for the case with no mediators (null model), are available in Additional file 1: Tables S4 and S5
Mediation effect size estimated using the Framingham Heart Study data.
| Outcome | SOS | ab | prop | ratio | |||||
|---|---|---|---|---|---|---|---|---|---|
| FVC | 1378 | 0 | 0.207 | 0 | 0 | 0 | 0 | 0 | 0 |
| (0, 0) | (0.153, 0.265) | (0, 0) | (0, 0) | (0, 0) | (0, 0) | (0, 0) | (0, 0) | ||
| Systolic BP | 1711 | 207 | 0.069 | 0.026 | 0.381 | 0.002 | 0.008 | 0.008 | 4.1e−6 |
| (146, 224) | (0.035, 0.111) | (−0.003, 0.066) | (−0.085, 0.771) | (−0.04, 0.03) | (−0.17, 0.14) | (−0.14, 0.16) | (1.1e−6, 1.8e−3) |
95% CI is within the parentheses based on percentiles of 500 bootstrap samples; is the number of genes in estimation; n is the sample size for each trait; A mixed model is used to estimate the quantities, including ’s, ab (the product measure), prop (the proportion measure), ratio, and the measure for multiple mediators
1 ab and were calculated based on standardized residuals with SD = 1
2 Lasso and FDR methods were also applied to FVC; neither selected any gene
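The percentile bootstrap interval mentioned in the table notes (500 bootstrap samples, 95% level) can be sketched as follows. This is a generic illustration on simulated single-mediator data with an OLS estimate of the product measure ab, not the mixed-model procedure applied to the Framingham Heart Study data; the function names, seed, and simulation parameters are assumptions made for the example.

```python
import numpy as np


def product_ab(data):
    """OLS estimate of ab from rows of (x, m, y); an illustrative statistic."""
    x, m, y = data.T
    ones = np.ones(len(x))
    # a: slope of the mediator on the exposure.
    a = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    # b: coefficient of the mediator in the outcome model adjusted for x.
    b = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0][2]
    return a * b


def percentile_bootstrap_ci(statistic, data, n_boot=500, level=0.95, seed=0):
    """Percentile CI from n_boot resamples of the rows of data."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        estimates[i] = statistic(data[idx])
    alpha = 1.0 - level
    lower, upper = np.percentile(
        estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    return lower, upper


# Simulated data (hypothetical parameters): true ab = 0.5 * 0.4 = 0.2.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.3 * x + 0.4 * m + rng.normal(size=n)
data = np.column_stack([x, m, y])

ci_low, ci_high = percentile_bootstrap_ci(product_ab, data)
print(f"ab = {product_ab(data):.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

Resampling whole rows keeps each subject's exposure, mediator, and outcome together, which is what lets the resampled estimates reflect the joint sampling variability of a and b.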
Fig. 2 Demonstration of mediation analysis. X is the independent variable, Y is the dependent variable, and is the true mediator; A Single-mediator model; B Multiple-mediator model; C shows that is a non-mediator not associated with X, but with Y; D demonstrates a non-mediator that is associated with X, but not associated with Y after adjusting for X