Jianwei Gou, Yang Zhao, Yongyue Wei, Chen Wu, Ruyang Zhang, Yongyong Qiu, Ping Zeng, Wen Tan, Dianke Yu, Tangchun Wu, Zhibin Hu, Dongxin Lin, Hongbing Shen, Feng Chen.
Abstract
BACKGROUND: Evidence suggests that common complex diseases may be partially due to SNP-SNP interactions, but methods for detecting such interactions are not yet fully established for high-dimensional, small-sample (small-n-large-p) studies. A number of penalized regression techniques are gaining popularity within the statistical community and are now being applied to detect interactions. These techniques tend to over-fit and are prone to false positives. The recently developed stability least absolute shrinkage and selection operator (SLASSO) has been used to control the family-wise error rate, but often at the expense of power (and thus with more false negative results).
Year: 2014 PMID: 24580776 PMCID: PMC3984751 DOI: 10.1186/1471-2105-15-62
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Pairwise r² among the 327 SNPs across the gene. The color of each box signifies the value of r² between SNP alleles, with black indicating the strongest relationship between a pair of markers (1 = black, 0 = white).
Parameter settings for the different scenarios
| Scenario | No. of main effects | No. of interactions | Model terms | OR |
|---|---|---|---|---|
| A1/A2/A3 | 0 | 1 | SNP33 × SNP197 | 1.5/1.4/1.3 |
| B1/B2/B3 | 1 | 1 | SNP18 + SNP18 × SNP134 | 1.5/1.4/1.3 |
| C1/C2/C3 | 2 | 1 | SNP33 + SNP134 + SNP33 × SNP134 | 1.5/1.4/1.3 |
Abbreviations: ABCC4, ATP-binding cassette sub-family C member 4; OR, odds ratio; SNP, single nucleotide polymorphism.
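To make the scenario settings concrete, the following is a minimal sketch of how an odds ratio could translate into case probabilities under a logistic model with main and interaction terms. It rests on assumptions that the record above does not state: additive genotype coding (0, 1 or 2 minor alleles), the same coefficient ln(OR) for every term, and a hypothetical `baseline_risk` of 0.1. The function name `disease_probability` is illustrative only.

```python
import math


def disease_probability(genotypes, main_effects, interactions,
                        odds_ratio, baseline_risk=0.1):
    """Return P(case) under a logistic model (illustrative sketch).

    genotypes:     dict mapping SNP name -> minor-allele count (0, 1 or 2).
    main_effects:  list of SNP names that carry a main effect.
    interactions:  list of (snp_a, snp_b) pairs that carry a product term.
    odds_ratio:    OR shared by every term (assumption), so beta = ln(OR).
    baseline_risk: hypothetical P(case) when all genotypes are 0.
    """
    beta = math.log(odds_ratio)
    logit = math.log(baseline_risk / (1.0 - baseline_risk))
    for snp in main_effects:
        logit += beta * genotypes[snp]
    for snp_a, snp_b in interactions:
        logit += beta * genotypes[snp_a] * genotypes[snp_b]
    return 1.0 / (1.0 + math.exp(-logit))


# Scenario C1 from the table: SNP33 + SNP134 + SNP33 x SNP134, OR = 1.5.
main = ["SNP33", "SNP134"]
pairs = [("SNP33", "SNP134")]
p_ref = disease_probability({"SNP33": 0, "SNP134": 0}, main, pairs, 1.5)
p_het = disease_probability({"SNP33": 1, "SNP134": 1}, main, pairs, 1.5)
```

Under these assumptions, a carrier of one minor allele at each SNP picks up three terms (two main effects and one interaction), so the odds of being a case are multiplied by 1.5³ relative to the reference genotype.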
Selection performance of different methods in different scenarios (OR = 1.5)
| Metric | Method | Scenario A1 | Scenario B1 | Scenario C1 |
|---|---|---|---|---|
| TPR | LASSO | 0.768(0.027) | 0.775(0.020) | 0.857(0.018) |
| TPR | SCAD | 0.801(0.025) | 0.775(0.021) | 0.603(0.016) |
| TPR | SLASSO | 0.750(0.018) | 0.780(0.009) | 0.990(0.005) |
| TPR | SSCAD | 0.933(0.013) | 0.897(0.002) | 0.968(0.003) |
| MCC | LASSO | 0.292(0.014) | 0.384(0.012) | 0.356(0.021) |
| MCC | SCAD | 0.350(0.013) | 0.435(0.013) | 0.373(0.016) |
| MCC | SLASSO | 0.776(0.011) | 0.706(0.012) | 0.875(0.011) |
| MCC | SSCAD | 0.826(0.008) | 0.843(0.016) | 0.837(0.015) |
| AUC | LASSO | 0.593(0.018) | 0.601(0.002) | 0.632(0.002) |
| AUC | SCAD | 0.596(0.018) | 0.606(0.002) | 0.632(0.002) |
| AUC | SLASSO | 0.612(0.001) | 0.616(0.002) | 0.638(0.005) |
| AUC | SSCAD | 0.631(0.001) | 0.615(0.001) | 0.630(0.002) |
| FDR | LASSO | 0.859(0.013) | 0.827(0.007) | 0.789(0.026) |
| FDR | SCAD | 0.837(0.009) | 0.813(0.008) | 0.742(0.018) |
| FDR | SLASSO | 0.186(0.006) | 0.160(0.009) | 0.107(0.017) |
| FDR | SSCAD | 0.167(0.003) | 0.147(0.007) | 0.166(0.014) |
Abbreviations: TPR, true positive rate; MCC, Matthews correlation coefficient; AUC, area under the ROC curve; FDR, false discovery rate.
Numbers in each cell represent the mean (standard error) over 200 simulations.
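As a reminder of how three of the reported metrics are defined, here is a minimal sketch that computes TPR, FDR and MCC for one simulated replicate from the set of selected predictors and the set of truly causal ones. It uses the standard textbook definitions; whether the study computed them in exactly this way is an assumption, and the function name `selection_metrics` and the toy predictor labels are hypothetical. AUC is omitted because it requires a ranking of predictors rather than a selected set.

```python
import math


def selection_metrics(selected, causal, n_candidates):
    """Compute TPR, FDR and MCC for a single variable-selection result.

    selected:     iterable of predictors chosen by the method.
    causal:       iterable of predictors that truly have an effect.
    n_candidates: total number of candidate predictors considered.
    """
    selected, causal = set(selected), set(causal)
    tp = len(selected & causal)          # causal and selected
    fp = len(selected - causal)          # selected but not causal
    fn = len(causal - selected)          # causal but missed
    tn = n_candidates - tp - fp - fn     # correctly left out

    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"TPR": tpr, "FDR": fdr, "MCC": mcc}


# Toy replicate: 10 candidates, 2 causal, 2 selected (1 correct, 1 false).
metrics = selection_metrics(selected={"a", "c"}, causal={"a", "b"},
                            n_candidates=10)
```

Averaging such per-replicate values over the 200 simulations would give cells of the form mean (standard error), as reported in the tables.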
Selection performance of different methods in different scenarios (OR = 1.4)
| Metric | Method | Scenario A2 | Scenario B2 | Scenario C2 |
|---|---|---|---|---|
| TPR | LASSO | 0.726(0.028) | 0.726(0.021) | 0.797(0.020) |
| TPR | SCAD | 0.750(0.026) | 0.740(0.021) | 0.734(0.022) |
| TPR | SLASSO | 0.706(0.019) | 0.720(0.012) | 0.942(0.010) |
| TPR | SSCAD | 0.861(0.016) | 0.890(0.004) | 0.953(0.002) |
| MCC | LASSO | 0.240(0.015) | 0.334(0.013) | 0.309(0.013) |
| MCC | SCAD | 0.295(0.014) | 0.380(0.015) | 0.333(0.014) |
| MCC | SLASSO | 0.716(0.012) | 0.636(0.015) | 0.826(0.013) |
| MCC | SSCAD | 0.787(0.009) | 0.800(0.018) | 0.817(0.018) |
| AUC | LASSO | 0.537(0.018) | 0.542(0.002) | 0.577(0.003) |
| AUC | SCAD | 0.538(0.019) | 0.529(0.003) | 0.570(0.004) |
| AUC | SLASSO | 0.557(0.002) | 0.569(0.003) | 0.590(0.003) |
| AUC | SSCAD | 0.557(0.003) | 0.561(0.003) | 0.560(0.001) |
| FDR | LASSO | 0.868(0.014) | 0.829(0.010) | 0.797(0.008) |
| FDR | SCAD | 0.838(0.009) | 0.817(0.010) | 0.754(0.009) |
| FDR | SLASSO | 0.198(0.007) | 0.146(0.009) | 0.122(0.010) |
| FDR | SSCAD | 0.166(0.004) | 0.149(0.007) | 0.180(0.007) |
Abbreviations: TPR, true positive rate; MCC, Matthews correlation coefficient; AUC, area under the ROC curve; FDR, false discovery rate.
Numbers in each cell represent the mean (standard error) over 200 simulations.
Selection performance of different methods in different scenarios (OR = 1.3)
| Metric | Method | Scenario A3 | Scenario B3 | Scenario C3 |
|---|---|---|---|---|
| TPR | LASSO | 0.681(0.032) | 0.664(0.023) | 0.777(0.021) |
| TPR | SCAD | 0.715(0.028) | 0.678(0.023) | 0.7853(0.022) |
| TPR | SLASSO | 0.651(0.019) | 0.692(0.012) | 0.874(0.012) |
| TPR | SSCAD | 0.835(0.014) | 0.808(0.005) | 0.878(0.005) |
| MCC | LASSO | 0.173(0.018) | 0.298(0.018) | 0.303(0.016) |
| MCC | SCAD | 0.253(0.016) | 0.350(0.017) | 0.312(0.014) |
| MCC | SLASSO | 0.676(0.012) | 0.600(0.015) | 0.768(0.013) |
| MCC | SSCAD | 0.736(0.012) | 0.737(0.014) | 0.778(0.019) |
| AUC | LASSO | 0.506(0.018) | 0.524(0.003) | 0.536(0.006) |
| AUC | SCAD | 0.517(0.021) | 0.514(0.006) | 0.536(0.006) |
| AUC | SLASSO | 0.518(0.002) | 0.545(0.005) | 0.535(0.005) |
| AUC | SSCAD | 0.537(0.005) | 0.541(0.002) | 0.551(0.002) |
| FDR | LASSO | 0.869(0.015) | 0.838(0.011) | 0.798(0.010) |
| FDR | SCAD | 0.832(0.013) | 0.819(0.008) | 0.726(0.009) |
| FDR | SLASSO | 0.188(0.007) | 0.164(0.006) | 0.182(0.010) |
| FDR | SSCAD | 0.169(0.006) | 0.163(0.007) | 0.174(0.008) |
Abbreviations: TPR, true positive rate; MCC, Matthews correlation coefficient; AUC, area under the ROC curve; FDR, false discovery rate.
Numbers in each cell represent the mean (standard error) over 200 simulations.
Empirical selection probability of significant SNP pairs by LASSO and SCAD under subsampling
| SNP 1 | Gene 1 | SNP 2 | Gene 2 | Nanjing Π (LASSO) | Nanjing Π (SCAD) | Nanjing P | Beijing Π (LASSO) | Beijing Π (SCAD) | Beijing P | Π (LASSO) | Π (SCAD) | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rs7839119 | PTK2 | rs12544802 | PTK2 | 0.724* | 0.702* | 1.04 × 10⁻⁶ | 0.628* | 0.940* | 6.34 × 10⁻² | 0.668* | 0.801* | 3.10 × 10⁻⁴ |
| rs3781626 | RAPSN | rs6018348 | SRC | 0.734* | 0.680* | 3.25 × 10⁻³ | 0.514* | 0.654* | 2.88 × 10⁻³ | 0.617* | 0.824* | 1.40 × 10⁻⁴ |
| rs7839119 | PTK2 | rs4524871 | MUSK | 0.746* | 0.734* | 7.19 × 10⁻⁴ | 0.478 | 0.518* | 4.98 × 10⁻² | 0.600* | 0.980* | 5.02 × 10⁻⁴ |
| rs2736100 | TERT | rs40318 | PIK3R1 | 0.586 | 0.830* | 9.87 × 10⁻⁶ | 0.390 | 0.582* | 2.22 × 10⁻² | 0.696* | 0.977* | 2.51 × 10⁻⁶ |
Π represents the empirical selection probability of SNP pairs under subsampling.
P is the trend-test p-value from simple marginal logistic regression.
An asterisk (*) marks SNP pairs that are significant under stability selection, i.e. whose selection probability is higher than the threshold value (implied by FDR < 0.1).
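The selection probability Π in the table can be illustrated with a minimal sketch of the subsampling loop behind stability selection. The base selector is passed in as a callable, so a penalized regression such as LASSO or SCAD would be plugged in there; the toy selector below is purely hypothetical. The half-sample fraction, the function names and the example threshold of 0.9 are assumptions and are not taken from the study, which derives its threshold from an FDR bound.

```python
import random
from collections import Counter


def selection_probabilities(n_samples, select_fn, n_subsamples=500,
                            fraction=0.5, seed=0):
    """Estimate each predictor's empirical selection probability.

    n_samples:    number of subjects in the full data set.
    select_fn:    callable taking a list of row indices and returning the
                  predictors selected on that subsample (e.g. a LASSO fit).
    n_subsamples: number of random subsamples to draw.
    fraction:     share of subjects in each subsample (assumed 0.5 here).
    """
    rng = random.Random(seed)
    size = max(1, int(n_samples * fraction))
    counts = Counter()
    for _ in range(n_subsamples):
        rows = rng.sample(range(n_samples), size)  # without replacement
        counts.update(set(select_fn(rows)))
    return {feature: c / n_subsamples for feature, c in counts.items()}


def stable_set(probabilities, threshold):
    """Keep predictors whose selection probability reaches the threshold."""
    return {f for f, p in probabilities.items() if p >= threshold}


def toy_selector(rows):
    """Hypothetical selector: 'pair_A' is always chosen, while 'pair_B'
    is chosen only when subject 0 happens to be in the subsample."""
    chosen = {"pair_A"}
    if 0 in rows:
        chosen.add("pair_B")
    return chosen


probs = selection_probabilities(n_samples=100, select_fn=toy_selector)
stable = stable_set(probs, threshold=0.9)
```

In this toy run `pair_A` is selected in every subsample, whereas `pair_B` depends on a single subject and is selected in roughly half of them, which is the kind of unstable signal that thresholding Π is meant to filter out.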
Empirical selection probability of significant SNP pairs in Nanjing study by the LASSO
| SNP 1 | SNP 2 | Π |
|---|---|---|
| rs929087 | rs12544802 | 0.964 |
| rs4946933 | rs11231740 | 0.890 |
| rs2853462 | rs7856889 | 0.880 |
| rs7445640 | rs10733579 | 0.824 |
| rs411751 | rs939269 | 0.794 |
| rs7839119 | rs4524871 | 0.746 |
| rs3781626 | rs6018348 | 0.734 |
| rs7839119 | rs12544802 | 0.724 |
| rs725787 | rs5998196 | 0.688 |
| rs6578141 | rs1940245 | 0.636 |
Π represents a predictor’s empirical probability of model inclusion over 500 subsamples.
Empirical selection probability of significant SNP pairs in Beijing study by the LASSO
| SNP 1 | SNP 2 | Π |
|---|---|---|
| rs3779632 | rs9644448 | 1.000 |
| rs2736100 | rs11994882 | 0.904 |
| rs10109684 | rs11231735 | 0.904 |
| rs11994882 | rs4983387 | 0.840 |
| rs6969923 | rs11997161 | 0.788 |
| rs2677764 | rs2821142 | 0.784 |
| rs4551415 | rs1359711 | 0.740 |
| rs1550099 | rs10817088 | 0.706 |
| rs12466358 | rs2565062 | 0.700 |
| rs3791723 (CHRNG) | rs7839119 (PTK2) | 0.684 |
| rs7839119 | rs12544802 | 0.628 |
| rs9773817 | rs6018088 | 0.624 |
| rs479744 | rs7952435 | 0.610 |
| rs2736122 | rs12945577 | 0.598 |
| rs10817082 | rs5994451 | 0.596 |
| rs251398 | rs10733579 | 0.530 |
| rs3781626 | rs6018348 | 0.514 |
| rs4727666 | rs7856889 | 0.506 |
Π represents a predictor’s empirical probability of model inclusion over 500 subsamples.
Empirical selection probability of significant SNP pairs in Nanjing study by the SCAD
| SNP 1 | SNP 2 | Π |
|---|---|---|
| rs929087 | rs12544802 | 0.904 |
| rs2853462 | rs7856889 | 0.876 |
| rs2736100 | rs40318 | 0.830 |
| rs7445640 | rs10733579 | 0.786 |
| rs411751 | rs939269 | 0.756 |
| rs7839119 | rs4524871 | 0.734 |
| rs7839119 | rs12544802 | 0.702 |
| rs725787 | rs5998196 | 0.692 |
| rs3781626 | rs6018348 | 0.680 |
| rs4946933 | rs11231740 | 0.640 |
| rs6578141 | rs1940245 | 0.604 |
Π represents a predictor’s empirical probability of model inclusion over 500 subsamples.
Empirical selection probability of significant SNP pairs in Beijing study by the SCAD
| SNP 1 | SNP 2 | Π |
|---|---|---|
| rs7839119 | rs12544802 | 0.940 |
| rs3779632 | rs9644448 | 0.828 |
| rs10515077 | rs10817088 | 0.658 |
| rs3781626 | rs6018348 | 0.654 |
| rs2736122 | rs12945577 | 0.610 |
| rs3639 | rs3781626 | 0.602 |
| rs3791723 | rs7839119 | 0.594 |
| rs2736100 | rs40318 | 0.582 |
| rs9773817 | rs6018088 | 0.574 |
| rs3800230 | rs7856889 | 0.568 |
| rs10980510 | rs3829603 | 0.560 |
| rs2677764 | rs2853668 | 0.556 |
| rs411751 | rs9609396 | 0.526 |
| rs9480867 | rs11231741 | 0.522 |
| rs7839119 | rs4524871 | 0.518 |
| rs4524871 | rs10980564 | 0.502 |
Π represents a predictor’s empirical probability of model inclusion over 500 subsamples.