Yidan Cui, Chengwen Luo, Linghao Luo, Zhangsheng Yu.
Abstract
Mediation analysis has been extensively used to identify potential pathways between exposure and outcome. However, analytical methods for high-dimensional mediation analysis of survival data remain underdeveloped, especially for non-Cox model approaches. We propose a procedure that combines "two-step" variable selection with indirect effect estimation for the additive hazards model with high-dimensional mediators. We first apply sure independence screening (SIS) and smoothly clipped absolute deviation (SCAD) regularization to select mediators. We then use the Sobel test and the Benjamini-Hochberg (BH) method for indirect effect hypothesis testing. Simulation results demonstrate good performance, with a higher true-positive rate and accuracy, as well as a lower false-positive rate. We apply the proposed procedure to analyze DNA methylation markers mediating the effect of smoking on the survival time of lung cancer patients in a TCGA (The Cancer Genome Atlas) cohort study. The real data application identifies four mediating CpG sites, three of which are newly identified.
Keywords: SIS; additive hazards model; high-dimensional mediators; mediation analysis; survival data
Year: 2021 PMID: 35003213 PMCID: PMC8734376 DOI: 10.3389/fgene.2021.771932
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
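The testing step described in the abstract (a Sobel test on each selected mediator, followed by multiple-testing adjustment) can be illustrated with a minimal sketch. It assumes the first-order Sobel standard error and a two-sided normal p value; the function names `sobel_test` and `adjust_pvalues` are hypothetical, and the record does not confirm that the authors' implementation takes exactly this form. The BY (Benjamini-Yekutieli) option is included because the tables report BY-adjusted p values alongside BH.

```python
import math


def sobel_test(alpha, se_alpha, beta, se_beta):
    """First-order Sobel test for the indirect effect alpha * beta.

    alpha: exposure -> mediator coefficient, beta: mediator -> outcome
    coefficient (assumed notation). Returns (estimate, standard error, p value).
    """
    ie = alpha * beta
    se = math.sqrt(alpha ** 2 * se_beta ** 2 + beta ** 2 * se_alpha ** 2)
    z = ie / se
    # Two-sided p value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return ie, se, p


def adjust_pvalues(pvals, method="BH"):
    """Step-up adjusted p values: 'BH' or 'BY' (BH times the harmonic sum)."""
    m = len(pvals)
    c = sum(1.0 / k for k in range(1, m + 1)) if method == "BY" else 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, i in enumerate(order):
        rank = m - pos  # rank of this p value in ascending order
        running_min = min(running_min, pvals[i] * m * c / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    ie, se, p = sobel_test(alpha=1.0, se_alpha=0.1, beta=1.0, se_beta=0.2)
    print(f"IE = {ie:.3f}, SE = {se:.4f}, p = {p:.2e}")
    print(adjust_pvalues([0.01, 0.04, 0.03, 0.005], method="BH"))
```

A mediator would then be declared significant when its adjusted p value falls below the chosen level (for example 0.05).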
FIGURE 1. Directed acyclic graph with the exposure, outcome, and high-dimensional mediators.
FIGURE 2. Overall workflow of the proposed procedure.
Selection accuracy of the proposed procedure compared with the naive method.
| Censoring rate | Sample size | Adjustment | Proposed: TPR | Proposed: FP | Proposed: FDP | Naive: TPR | Naive: FP | Naive: FDP |
|---|---|---|---|---|---|---|---|---|
| 15% | n = 500 | BH | 0.9105 | 0.2380 | 0.0471 | 0.0830 | | |
| 15% | n = 500 | BY | 0.8345 | 0.0160 | 0.0038 | 0.0230 | | |
| 15% | n = 1,000 | BH | 0.9980 | 0.2400 | 0.0447 | 0.7100 | | |
| 15% | n = 1,000 | BY | 0.9950 | 0.0200 | 0.0040 | 0.4735 | | |
| 20% | n = 500 | BH | 0.8765 | 0.1980 | 0.0402 | 0.0645 | | |
| 20% | n = 500 | BY | 0.7915 | 0.0160 | 0.0036 | 0.0130 | | |
| 20% | n = 1,000 | BH | 0.9975 | 0.2600 | 0.0488 | 0.6230 | | |
| 20% | n = 1,000 | BY | 0.9890 | 0.0360 | 0.0072 | 0.3815 | | |
| 25% | n = 500 | BH | 0.8455 | 0.2160 | 0.0448 | 0.0410 | | |
| 25% | n = 500 | BY | 0.7290 | 0.0240 | 0.0061 | 0.0095 | | |
| 25% | n = 1,000 | BH | 0.9945 | 0.2760 | 0.0512 | 0.5350 | | |
| 25% | n = 1,000 | BY | 0.9855 | 0.0200 | 0.0041 | 0.3005 | | |
| 30% | n = 500 | BH | 0.7855 | 0.2180 | 0.0493 | 0.0280 | | |
| 30% | n = 500 | BY | 0.6550 | 0.0140 | 0.0036 | 0.0045 | | |
| 30% | n = 1,000 | BH | 0.9885 | 0.3340 | 0.0617 | 0.4630 | | |
| 30% | n = 1,000 | BY | 0.9725 | 0.0220 | 0.0044 | 0.2210 | | |
| 35% | n = 500 | BH | 0.7480 | 0.1740 | 0.0420 | 0.0240 | | |
| 35% | n = 500 | BY | 0.6115 | 0.0200 | 0.0059 | 0.0025 | | |
| 35% | n = 1,000 | BH | 0.9820 | 0.2380 | 0.0446 | 0.3480 | | |
| 35% | n = 1,000 | BY | 0.9575 | 0.0200 | 0.0040 | 0.1560 | | |
| 40% | n = 500 | BH | 0.6885 | 0.1680 | 0.0425 | 0.0120 | | |
| 40% | n = 500 | BY | 0.5475 | 0.0160 | 0.0060 | 0.0015 | | |
| 40% | n = 1,000 | BH | 0.9650 | 0.3200 | 0.0602 | 0.2700 | | |
| 40% | n = 1,000 | BY | 0.9285 | 0.0180 | 0.0037 | 0.1110 | | |
| 45% | n = 500 | BH | 0.6220 | 0.1900 | 0.0485 | 0.0055 | | |
| 45% | n = 500 | BY | 0.4655 | 0.0080 | 0.0034 | 0.0005 | | |
| 45% | n = 1,000 | BH | 0.9420 | 0.2080 | 0.0393 | 0.2035 | | |
| 45% | n = 1,000 | BY | 0.8975 | 0.0200 | 0.0042 | 0.0705 | | |
| 50% | n = 500 | BH | 0.5485 | 0.2080 | 0.0593 | 0.0055 | | |
| 50% | n = 500 | BY | 0.4145 | 0.0100 | 0.0050 | 0.0005 | | |
| 50% | n = 1,000 | BH | 0.9235 | 0.2420 | 0.0474 | 0.1340 | | |
| 50% | n = 1,000 | BY | 0.8545 | 0.0140 | 0.0031 | 0.0465 | | |
Each scenario has two rows of results: the first uses BH-adjusted p values and the second uses BY-adjusted p values. TPR, proportion of true mediators that are correctly selected; FP, number of non-mediators that are incorrectly selected; FDP, proportion of false positives among all selected mediators. Results are averaged over 500 replications.
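The three selection metrics defined in the note above can be computed per replicate and then averaged. The following is a minimal sketch under stated assumptions: mediators are identified by arbitrary labels, the function names `selection_metrics` and `average_metrics` are hypothetical, and FDP is set to 0 when nothing is selected (a common convention that the record does not confirm).

```python
def selection_metrics(selected, true_mediators):
    """Return TPR, FP count, and FDP for one replicate.

    selected: labels of mediators declared significant.
    true_mediators: labels of mediators with a nonzero indirect effect.
    """
    selected = set(selected)
    truth = set(true_mediators)
    true_positives = len(selected & truth)
    false_positives = len(selected - truth)
    tpr = true_positives / len(truth) if truth else float("nan")
    # Assumed convention: FDP is 0 when no mediator is selected.
    fdp = false_positives / len(selected) if selected else 0.0
    return {"TPR": tpr, "FP": false_positives, "FDP": fdp}


def average_metrics(replicates):
    """Average the metrics over (selected, true_mediators) pairs."""
    results = [selection_metrics(sel, truth) for sel, truth in replicates]
    return {
        key: sum(r[key] for r in results) / len(results)
        for key in ("TPR", "FP", "FDP")
    }


if __name__ == "__main__":
    truth = ["M1", "M2", "M3", "M4"]
    replicates = [
        (["M1", "M2", "M3", "M9"], truth),  # one false positive
        (["M1", "M2"], truth),              # two true mediators missed
    ]
    print(average_metrics(replicates))
```

Because FP is a count rather than a rate, its average over replicates can be a non-integer, as in the table.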
Indirect effect estimation of the proposed procedure.
| Mediator (α, β) = αβ | Statistic | cen = 15%, n = 500 | cen = 15%, n = 1,000 | cen = 25%, n = 500 | cen = 25%, n = 1,000 | cen = 35%, n = 500 | cen = 35%, n = 1,000 | cen = 50%, n = 500 | cen = 50%, n = 1,000 |
|---|---|---|---|---|---|---|---|---|---|
| (1,1) = 1 | Est. | 0.9973 | 0.9795 | 1.0114 | 0.9763 | 1.0355 | 0.9854 | 1.1351 | 1.0051 |
| | CP | 0.9509 | 0.9500 | 0.9534 | 0.9559 | 0.9626 | 0.9539 | 0.9524 | 0.9690 |
| | Emp.SE | 0.2937 | 0.1910 | 0.3175 | 0.2113 | 0.3354 | 0.2243 | 0.3909 | 0.2590 |
| | Est.SE | 0.2907 | 0.1997 | 0.3166 | 0.2174 | 0.3481 | 0.2389 | 0.4086 | 0.2806 |
| (1,1) = 1 | Est. | 1.0171 | 0.9854 | 1.0355 | 0.9848 | 1.0866 | 0.9962 | 1.1713 | 1.0086 |
| | CP | 0.9351 | 0.9400 | 0.9662 | 0.9520 | 0.9727 | 0.9556 | 0.9661 | 0.9568 |
| | Emp.SE | 0.2978 | 0.2022 | 0.3123 | 0.2213 | 0.3169 | 0.2354 | 0.3460 | 0.2781 |
| | Est.SE | 0.2924 | 0.1991 | 0.3192 | 0.2167 | 0.3511 | 0.2384 | 0.4124 | 0.2796 |
| (1,1) = 1 | Est. | 1.0387 | 0.9860 | 1.0678 | 0.9970 | 1.0877 | 0.9913 | 1.1984 | 1.0156 |
| | CP | 0.9430 | 0.9440 | 0.9556 | 0.9380 | 0.9581 | 0.9499 | 0.9523 | 0.9591 |
| | Emp.SE | 0.3131 | 0.2028 | 0.3275 | 0.2204 | 0.3554 | 0.2400 | 0.3868 | 0.2714 |
| | Est.SE | 0.2926 | 0.2003 | 0.3184 | 0.2186 | 0.3489 | 0.2395 | 0.4094 | 0.2816 |
| (1,1) = 1 | Est. | 1.0510 | 0.9845 | 1.0539 | 0.9875 | 1.0706 | 0.9978 | 1.1842 | 1.0259 |
| | CP | 0.9390 | 0.9520 | 0.9459 | 0.9540 | 0.9667 | 0.9480 | 0.9522 | 0.9654 |
| | Emp.SE | 0.3051 | 0.1969 | 0.3198 | 0.2162 | 0.3329 | 0.2428 | 0.3547 | 0.2699 |
| | Est.SE | 0.2941 | 0.1995 | 0.3190 | 0.2174 | 0.3499 | 0.2390 | 0.4137 | 0.2805 |
| (0.5,0) = 0 | Est. | 0.2354 | 0.0916 | 0.3143 | 0.1348 | 0.3465 | 0.1302 | 0.3417 | 0.1423 |
| | CP | 0.6071 | 0.3571 | 0.4231 | 0.3684 | 0.5333 | 0.3529 | 0.6500 | 0.5455 |
| | Emp.SE | 0.2106 | 0.2029 | 0.1076 | 0.1883 | 0.1281 | 0.2490 | 0.2653 | 0.2662 |
| | Est.SE | 0.1473 | 0.1000 | 0.1634 | 0.1091 | 0.1732 | 0.1240 | 0.2073 | 0.1389 |
| (0.5,0) = 0 | Est. | 0.0985 | 0.1518 | 0.0599 | 0.2226 | 0.2380 | 0.2593 | 0.3769 | 0.2927 |
| | CP | 0.5263 | 0.7000 | 0.4706 | 0.3750 | 0.5263 | 0.2500 | 0.2308 | 0.3529 |
| | Emp.SE | 0.3193 | 0.1412 | 0.3669 | 0.0725 | 0.2928 | 0.0587 | 0.3874 | 0.0546 |
| | Est.SE | 0.1643 | 0.0988 | 0.1794 | 0.1043 | 0.1852 | 0.1196 | 0.2396 | 0.1451 |
| (0,0.5) = 0 | Est. | (—) | 0.0097 | (—) | 0.0019 | (—) | 0.0647 | (—) | −0.0012 |
| | CP | (—) | 0.3077 | (—) | 0.1667 | (—) | 0.2500 | (—) | 0.2000 |
| | Emp.SE | (—) | 0.1225 | (—) | 0.1347 | (—) | 0.1207 | (—) | 0.1623 |
| | Est.SE | (—) | 0.0554 | (—) | 0.0616 | (—) | 0.0636 | (—) | 0.0769 |
| (0,0.5) = 0 | Est. | 0.0901 | 0.0772 | 0.0871 | 0.0802 | 0.0771 | 0.1376 | 0.0261 | 0.1526 |
| | CP | 0.8000 | 0.2500 | 0.5000 | 0.4000 | 0.7500 | 0.4286 | 1.0000 | 0.5000 |
| | Emp.SE | 0.1546 | 0.0944 | 0.1935 | 0.0952 | 0.1877 | 0.0145 | 0.1869 | 0.0155 |
| | Est.SE | 0.0941 | 0.0547 | 0.1071 | 0.0587 | 0.1162 | 0.0649 | 0.1217 | 0.0754 |
The first column gives M (α, β) for each mediator; the product αβ is the true indirect effect (IE). cen, censoring rate; Est., mean of the coefficient estimates; CP, coverage probability, the proportion of replicates whose 95% confidence interval (CI) covers the true value of the coefficient; Emp. SE, empirical standard error, calculated from the estimates across all replicates; Est. SE, mean of the estimated standard errors across all replicates. (—) indicates that the mediator was not selected in any of the 500 replicates. Results are averaged over 500 replications.
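The four summary statistics defined in the note above can be reproduced from per-replicate output. This is a minimal sketch with assumptions made explicit: the 95% CI is taken to be a normal-approximation interval (estimate ± 1.96 × standard error), Emp. SE is taken to be the sample standard deviation of the estimates, and only replicates in which the mediator was selected are passed in. The function name `summarize_replicates` is hypothetical.

```python
import math
import statistics


def summarize_replicates(estimates, std_errors, true_value, z=1.96):
    """Summarize indirect-effect estimates across simulation replicates.

    estimates, std_errors: per-replicate estimate and its estimated SE.
    true_value: the true indirect effect (alpha * beta).
    z: normal quantile for the interval (1.96 gives an assumed 95% Wald CI).
    """
    covered = [
        abs(est - true_value) <= z * se
        for est, se in zip(estimates, std_errors)
    ]
    return {
        "Est.": statistics.fmean(estimates),
        "CP": sum(covered) / len(covered),
        # Sample standard deviation of the estimates (needs >= 2 replicates).
        "Emp.SE": statistics.stdev(estimates),
        "Est.SE": statistics.fmean(std_errors),
    }


if __name__ == "__main__":
    summary = summarize_replicates(
        estimates=[0.9, 1.1, 1.0, 1.5],
        std_errors=[0.2, 0.2, 0.2, 0.2],
        true_value=1.0,
    )
    for name, value in summary.items():
        print(f"{name}: {value:.4f}")
```

When Emp. SE and Est. SE are close and CP is near 0.95, the standard error estimator is behaving as intended; large gaps between the two, as in the rows with a true IE of 0, indicate that the estimated standard error is unreliable for those mediators.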
FIGURE 3. Selection accuracy of the proposed procedure. (A) shows how TPR varies with censoring rate and sample size, (B) shows the variation in FP, and (C) shows the variation in FDP.
FIGURE 4. Indirect effect estimation of the proposed procedure. (A) shows the estimated coefficients of four mediators with sample size 500 in the simulation studies, (B) shows the coverage probability of the four mediators with sample size 500, and (C) shows the empirical and estimated standard errors of the four mediators with sample size 500. (D), (E), and (F) show the same results as (A), (B), and (C), respectively, with sample size 1,000.
Significant mediating CpG sites with positive indirect effects.
| CpG | Est. IE | 95% CI | P(BH) | P(BY) | SE | α̂ | β̂ | Chr | Gene |
|---|---|---|---|---|---|---|---|---|---|
| cg19757631 | 0.0296 | (0.0129–0.0464) | 0.0112 | 0.0428 | 0.0086 | −0.2806 | −0.1056 | chr1 | SRM |
| cg08636115 | 0.0263 | (0.0093–0.0433) | 0.0152 | 0.0581 | 0.0087 | −0.3811 | −0.0690 | chr1 | PRDM16 |
| cg05147638 | 0.0185 | (0.0047–0.0323) | 0.0422 | 0.1612 | 0.0070 | 0.4918 | 0.0376 | chr12 | COPZ1 |
| cg24720672 | 0.0269 | (0.0100–0.0438) | 0.0151 | 0.0575 | 0.0086 | −1.4889 | −0.0181 | chr15 | LOC283663 |
Est. IE, the estimated IE (α̂β̂); P(BH), BH-adjusted p value; P(BY), BY-adjusted p value; SE, the estimated standard error; Chr, the chromosome on which the CpG is located; Gene, the gene in which the CpG is located or the nearest gene.
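As a consistency check on the table above, the estimated IE for each CpG should equal the product of the two coefficient columns that precede Chr, and the 95% CI should be close to Est. IE ± 1.96 × SE. The sketch below assumes those two columns hold the path-coefficient estimates (their labels are not legible in this record) and assumes a normal-approximation interval; `check_row` is a hypothetical helper, and agreement is only expected up to the four-decimal rounding of the published values.

```python
# Values copied from the table: (CpG, Est. IE, CI low, CI high, SE, coef_1, coef_2).
# coef_1 and coef_2 are the two columns before Chr (assumed path coefficients).
ROWS = [
    ("cg19757631", 0.0296, 0.0129, 0.0464, 0.0086, -0.2806, -0.1056),
    ("cg08636115", 0.0263, 0.0093, 0.0433, 0.0087, -0.3811, -0.0690),
    ("cg05147638", 0.0185, 0.0047, 0.0323, 0.0070, 0.4918, 0.0376),
    ("cg24720672", 0.0269, 0.0100, 0.0438, 0.0086, -1.4889, -0.0181),
]


def check_row(est_ie, se, coef_1, coef_2, z=1.96):
    """Return the coefficient product and an assumed Wald-type interval."""
    product = coef_1 * coef_2
    wald_interval = (est_ie - z * se, est_ie + z * se)
    return product, wald_interval


if __name__ == "__main__":
    for name, ie, ci_low, ci_high, se, c1, c2 in ROWS:
        product, (low, high) = check_row(ie, se, c1, c2)
        print(
            f"{name}: product = {product:.4f} (reported IE {ie:.4f}); "
            f"interval = ({low:.4f}, {high:.4f}) "
            f"(reported ({ci_low:.4f}, {ci_high:.4f}))"
        )
```

Note that both coefficients are negative for three of the four sites, so the indirect effect is positive even though each individual path has a negative sign.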