Jin Liu, Jian Huang, Shuangge Ma.
Abstract
Genome-wide association studies have been conducted extensively to search for markers of biologically meaningful outcomes and phenotypes. Penalization methods have been adopted to analyze the joint effects of a large number of SNPs (single nucleotide polymorphisms) and to identify markers. This study is partly motivated by the analysis of the heterogeneous stock mice dataset, in which multiple correlated phenotypes and a large number of SNPs are available. Existing penalization methods designed to analyze a single response variable cannot accommodate the correlation among multiple response variables. With multiple response variables sharing the same set of markers, joint modeling is first employed to accommodate the correlation. The group Lasso approach is then adopted to select markers associated with all the outcome variables. An efficient computational algorithm is developed. A simulation study and analysis of the heterogeneous stock mice dataset show that the proposed method can outperform existing penalization methods.
Year: 2012 PMID: 23272092 PMCID: PMC3522680 DOI: 10.1371/journal.pone.0051198
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
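The abstract describes using a group Lasso penalty so that a marker is selected for all outcome variables at once. A minimal sketch of the standard group soft-thresholding step that underlies such penalties is given below. It assumes each SNP's coefficients across the phenotypes form one group; the function name `group_soft_threshold` and the argument `lam` are illustrative, and this is not claimed to be the authors' algorithm.

```python
import math

def group_soft_threshold(z, lam):
    """Shrink a group of coefficients toward zero as a block.

    z   : list of coefficients for one SNP, one entry per phenotype (assumed grouping)
    lam : non-negative penalty level (hypothetical tuning parameter)
    The whole group is set to zero when its Euclidean norm is at most lam,
    otherwise every entry is scaled by the same factor.
    """
    norm = math.sqrt(sum(v * v for v in z))
    if norm <= lam:
        return [0.0] * len(z)
    scale = 1.0 - lam / norm
    return [scale * v for v in z]

# A weak group is removed entirely, a strong group is shrunk but kept.
weak = group_soft_threshold([0.3, 0.4], lam=1.0)
strong = group_soft_threshold([3.0, 4.0], lam=1.0)
print(weak, strong)
```

Because the threshold acts on the norm of the whole group, a SNP is either kept for every phenotype or dropped for every phenotype, which is the "select markers associated with all the outcome variables" behavior the abstract refers to.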
Simulation studies: the numbers are mean (standard deviation) based on 100 replicates.
| Combined Individual | ||||||
| p | ρ | True Positive | Model Size | FDR | FNR | SSE |
| 5000 | 0.1 | 17.60(1.99) | 20.18(2.93) | 0.12(0.09) | 0.27(0.08) | 96.89(12.67) |
| 5000 | 0.5 | 17.80(1.82) | 20.66(3.02) | 0.13(0.08) | 0.26(0.08) | 97.85(13.51) |
| 5000 | 0.9 | 17.66(1.72) | 20.32(3.11) | 0.12(0.09) | 0.26(0.07) | 97.39(15.61) |
| 10000 | 0.1 | 16.78(1.64) | 19.24(2.76) | 0.12(0.09) | 0.30(0.07) | 92.76(11.16) |
| 10000 | 0.5 | 16.84(1.80) | 18.96(2.56) | 0.10(0.08) | 0.30(0.07) | 92.44(12.57) |
| 10000 | 0.9 | 16.70(1.79) | 18.22(2.41) | 0.08(0.06) | 0.30(0.07) | 91.37(13.68) |
False discovery rate (FDR) and false negative rate (FNR) are reported together with true positives and model sizes.
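The summary columns can be computed from the set of truly associated SNPs and the set of selected SNPs. The sketch below assumes the usual definitions, FDR = false positives / model size and FNR = false negatives / number of true SNPs; the page does not state the definitions, so treat them, and the function name `selection_metrics`, as assumptions.

```python
def selection_metrics(true_snps, selected_snps):
    """Return true positives, model size, FDR and FNR for one replicate.

    Assumed definitions (not stated in the source):
      FDR = (model size - true positives) / model size
      FNR = (number of true SNPs - true positives) / number of true SNPs
    """
    true_set = set(true_snps)
    selected_set = set(selected_snps)
    tp = len(true_set & selected_set)
    size = len(selected_set)
    fdr = (size - tp) / size if size else 0.0
    fnr = (len(true_set) - tp) / len(true_set) if true_set else 0.0
    return {"true_positive": tp, "model_size": size, "fdr": fdr, "fnr": fnr}

# Hypothetical replicate: 4 true SNPs, 3 selected, 2 of them correct.
print(selection_metrics({1, 2, 3, 4}, {1, 2, 5}))
```

Averaging these values over the replicates gives entries of the form "mean (standard deviation)" as in the table.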
Multi-split p-values for simulated data with all non-zero regression coefficients matched and ρ = 0.9.
| SNP index | Estimate (p = 5000) | p-value (p = 5000) | Estimate (p = 10000) | p-value (p = 10000) |
| 25 | 0.293 | 9.3E−10 | 0.318 | 2.2E−10 |
| 26 | 0.270 | 7.3E−12 | 0.263 | 1.5E−07 |
| 27 | 0.263 | 3.1E−10 | 0.346 | 2.9E−10 |
| 28 | 0.251 | 1.1E−11 | 0.301 | 1.2E−08 |
| 41 | 0.264 | 0.054 | 0.182 | 1.000 |
| 42 | 0.100 | 1.000 | 0.336 | 0.007 |
| 43 | 0.245 | 0.006 | 0.249 | 0.019 |
| 44 | 0.096 | 1.000 | 0.177 | 0.798 |
| 57 | 0.107 | 0.004 | 0.093 | 1.000 |
| 58 | 0.174 | 1.000 | 0.071 | 1.000 |
| 59 | 0.183 | 2.3E−05 | 0.173 | 7.4E−05 |
| 60 | 0.089 | 1.000 | 0.094 | 1.000 |
| 342 | 0.006 | 1.000 | ||
| 2200 | 0.009 | 1.000 | ||
| 3623 | 0.010 | 1.000 | ||
| 3920 | 0.013 | 1.000 | ||
| 4177 | 0.004 | 1.000 | ||
| 4555 | 0.003 | 1.000 | ||
| 5494 | | | 0.008 | 1.000 |
| 5899 | | | 0.037 | 1.000 |
| 7156 | | | 0.020 | 1.000 |
| 9061 | | | 0.001 | 1.000 |
| 9343 | | | 0.004 | 1.000 |
| 9501 | | | 0.004 | 1.000 |
| 9884 | | | 0.013 | 1.000 |
Empty cells stand for SNPs that are not identified from the model.
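The page reports multi-split p-values without describing how they are aggregated. The sketch below shows one widely used rule, the quantile-based aggregation of Meinshausen, Meier and Bühlmann, for a single SNP. It assumes the per-split adjusted p-values are already available (with 1.0 for splits in which the SNP was not selected); whether the authors used this exact rule, the lower bound `gamma_min = 0.05`, or a grid search is an assumption, and the function name is illustrative.

```python
import math

def aggregate_multisplit(pvals, gamma_min=0.05, grid=100):
    """Combine per-split p-values for one SNP into a single p-value.

    For each gamma in [gamma_min, 1], take the empirical gamma-quantile of
    pvals / gamma (capped at 1), then minimise over gamma and multiply by
    the correction factor (1 - log(gamma_min)).
    """
    n_splits = len(pvals)
    ordered = sorted(pvals)
    best = 1.0
    for k in range(grid + 1):
        gamma = gamma_min + (1.0 - gamma_min) * k / grid
        # Index of the empirical gamma-quantile, clamped to a valid position.
        idx = min(n_splits - 1, max(0, math.ceil(gamma * n_splits) - 1))
        best = min(best, min(1.0, ordered[idx] / gamma))
    return min(1.0, (1.0 - math.log(gamma_min)) * best)

# A SNP that is never selected keeps a p-value of 1.000, as for the
# noise SNPs in the table.
print(aggregate_multisplit([1.0] * 50))
```

Under this rule a SNP needs small p-values in a sizeable fraction of the splits to end up significant, which is why SNPs selected only occasionally appear with a p-value of 1.000.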
Simulation studies: the numbers are mean (standard deviation) based on 100 replicates.
| Combined Individual | ||||||
| p | ρ | True Positive | Model Size | FDR | FNR | SSE |
| 5000 | 0.1 | 16.96(1.92) | 19.52(3.07) | 0.12(0.09) | 0.29(0.08) | 90.82(12.20) |
| 5000 | 0.5 | 16.96(1.82) | 19.82(3.45) | 0.13(0.10) | 0.29(0.08) | 91.35(12.19) |
| 5000 | 0.9 | 17.10(1.67) | 19.68(3.45) | 0.12(0.10) | 0.29(0.07) | 91.63(13.66) |
| 10000 | 0.1 | 16.06(1.73) | 18.42(3.38) | 0.11(0.09) | 0.33(0.07) | 86.30(10.36) |
| 10000 | 0.5 | 15.92(1.70) | 18.24(2.88) | 0.12(0.09) | 0.34(0.07) | 86.13(10.90) |
| 10000 | 0.9 | 15.96(1.75) | 17.88(2.80) | 0.10(0.08) | 0.34(0.07) | 85.43(12.50) |
False discovery rate (FDR), false negative rate (FNR) and sum of squared errors (SSE) are reported together with true positives and model sizes. 25 of the regression coefficients are not matched.
Simulation studies: the numbers are mean (standard deviation) based on 100 replicates.
| Combined Individual | ||||||
| p | ρ | True Positive | Model Size | FDR | FNR | SSE |
| 5000 | 0.1 | 16.94(1.89) | 19.80(2.92) | 0.14(0.09) | 0.29(0.08) | 89.29(11.00) |
| 5000 | 0.5 | 17.00(1.92) | 19.82(2.99) | 0.13(0.08) | 0.29(0.08) | 89.67(11.41) |
| 5000 | 0.9 | 17.08(1.90) | 20.02(3.47) | 0.13(0.09) | 0.29(0.08) | 89.94(14.40) |
| 10000 | 0.1 | 16.26(1.55) | 19.36(3.08) | 0.15(0.09) | 0.32(0.06) | 84.41(10.06) |
| 10000 | 0.5 | 16.20(1.58) | 19.06(2.85) | 0.14(0.09) | 0.32(0.07) | 84.24(10.48) |
| 10000 | 0.9 | 16.16(1.60) | 18.34(2.50) | 0.11(0.08) | 0.33(0.07) | 83.38(11.16) |
False discovery rate (FDR), false negative rate (FNR) and sum of squared errors (SSE) are reported together with true positives and model sizes. 50 of the regression coefficients are not matched.
Figure 1. Absolute values of estimates from the simple linear regression on CD4/CD8 ratio and CD4∶CD3.
Figure 2. Absolute values of estimates from Lasso on CD4/CD8 ratio and CD4∶CD3, and estimates for the proposed method. Smaller dots represent SNPs selected by the Lasso/proposed method with insignificant multi-split p-values. Larger dots represent SNPs with significant p-values.
Number of SNPs identified, and overlap of SNPs among the proposed method, the Lasso and single-SNP analysis for the heterogeneous stock mice dataset.
| Method | Number of SNPs | L1 | L2 | S1 | S2 |
| The Proposed Method | 45(38) | 12 | 13 | 38 | 45 |
| Lasso on M1 | 53(49) | | 10 | 51 | 53 |
| Lasso on M2 | 31(28) | | | 30 | 31 |
| single-SNP analysis on M1 | 2964 | | | | 2964 |
| single-SNP analysis on M2 | 3128 | | | | |
L1: short for Lasso on M1.
L2: short for Lasso on M2.
S1: short for single-SNP analysis on M1.
S2: short for single-SNP analysis on M2.
The number in parentheses is the number of SNPs with significant p-values.
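Overlap counts of this kind are pairwise set intersections between the SNP lists returned by each method. The sketch below is illustrative only: the function name `overlap_counts` and the dictionary keys are hypothetical, and the short SNP lists are partial subsets copied from the selection tables on this page, so the printed counts do not reproduce the table entries.

```python
def overlap_counts(selections):
    """Count shared SNPs for every pair of methods.

    selections : dict mapping a method label to an iterable of SNP IDs
    Returns a dict keyed by (method_a, method_b) in insertion order.
    """
    names = list(selections)
    counts = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            counts[(a, b)] = len(set(selections[a]) & set(selections[b]))
    return counts

# Partial, illustrative subsets of the SNP lists (not the full selections).
selections = {
    "proposed": ["rs13475794", "rs13475847", "rs3718812"],
    "lasso_cd4_cd8": ["rs13475847", "rs3727162", "rs3718812"],
    "lasso_cd4_cd3": ["rs13475794", "rs13476239"],
}
print(overlap_counts(selections))
```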
SNPs selected by the proposed method on both phenotypes CD4/CD8 ratio and CD4∶CD3.
| SNP | Chromosome | Position | MAF | Band | Gene | Proposed Method: Estimate | Proposed Method: p-value |
| rs13475794 | 1 | 32202097 | 0.189 | 1qB | Khdrbs2 | 0.024 | 1.7E−07 |
| rs13475847 | 1 | 45969220 | 0.301 | 1qC1.1 | Slc40a1 | 0.008 | 2.8E−01 |
| rs3679459 | 1 | 120341835 | 0.098 | 1qE2.3 | Clasp1 | 0.024 | 4.7E−08 |
| rs8256197 | 1 | 130485642 | 0.428 | 1qE4 | Cxcr4 | 0.006 | 3.8E−05 |
| rs8256196 | 1 | 130485675 | 0.428 | 1qE4 | Cxcr4 | 5.1E−15 | 4.3E−05 |
| rs3682465 | 2 | 156317950 | 0.146 | 2qH1 | Epb4.1l1 | 0.004 | 3.9E−07 |
| rs3718812 | 3 | 52605874 | 0.155 | 3qC | Cog6 | 0.036 | 3.1E−08 |
| rs3659643 | 3 | 115759847 | 0.205 | 3qG1 | Extl2 | 2.3E−03 | 2.2E−06 |
| rs6176477 | 3 | 117874757 | 0.259 | 3qG1 | Snx7 | 4.3E−04 | 9.6E−06 |
| rs13460366 | 4 | 129804978 | 0.137 | 4qD2.2 | Pef1 | 5.5E−04 | 0.241 |
| rs13477979 | 4 | 130004434 | 0.137 | 4qD2.2 | Zcchc17 | 2.6E−15 | 0.332 |
| rs13477980 | 4 | 130281564 | 0.137 | 4qD2.2 | Pum1 | 2.4E−16 | 0.332 |
| rs13478285 | 5 | 61706070 | 0.078 | 5qC3.1 | G6pd2 | 0.015 | 0.003 |
| rs3692826 | 5 | 63287018 | 0.078 | 5qC3.1 | Gm17384 | 8.5E−16 | 0.004 |
| rs6222023 | 5 | 76590704 | 0.397 | 5qC3.3 | Srd5a3 | 0.007 | 2.1E−08 |
| rs3711751 | 5 | 137393986 | 0.290 | 5qG2 | 4933404O12Rik | 0.009 | 5.9E−07 |
| rs13478656 | 6 | 21893927 | 0.078 | 6qA3.1 | Ing3 | 0.038 | |
| rs3665567 | 6 | 71342207 | 0.442 | 6qC1 | Rmnd5a | 0.043 | 6.4E−13 |
| rs3671932 | 6 | 134808128 | 0.228 | 6qG1 | Crebl2 | 0.041 | 2.9E−07 |
| rs3657482 | 7 | 121209199 | 0.458 | 7qF1 | Rras2 | 0.013 | 0.559 |
| rs13479673 | 8 | 30344780 | 0.102 | 8qA3 | Unc5d | 0.017 | 3.5E−08 |
| rs33227034 | 8 | 131027085 | 0.480 | 8qE2 | Nrp1 | 0.013 | 0.016 |
| rs29634420 | 9 | 16961090 | 0.075 | 9qA2 | Gm5611 | 0.006 | 0.015 |
| rs13480141 | 9 | 36754648 | 0.474 | 9qA4 | Pknox2 | 0.015 | 6.7E−07 |
| rs13480826 | 10 | 127874456 | 0.194 | 10qD3 | Rnf41 | 0.017 | 2.6E−04 |
| rs3719526 | 10 | 127890255 | 0.194 | 10qD3 | Smarcc2 | 1.4E−14 | 2.6E−04 |
| rs3670360 | 11 | 6153674 | 0.107 | 11qA1 | Ddx56 | 0.051 | 4.2E−10 |
| rs13481186 | 11 | 100224674 | 0.441 | 11qD | Jup | 0.003 | 2.1E−06 |
| rs13481187 | 11 | 100513551 | 0.441 | 11qD | Zfp385c | 5.5E−15 | 2.1E−06 |
| rs6393715 | 11 | 111796714 | 0.322 | 11qE2 | Gm11679 | 0.002 | 1.000 |
| rs13472132 | 13 | 55515090 | 0.184 | 13qB1 | Slc34a1 | 0.002 | 3.0E−05 |
| rs3692326 | 13 | 99316615 | 0.143 | 13qD1 | Gm10320 | 0.028 | |
| rs4161101 | 16 | 10701008 | 0.369 | 16qA1 | Clec16a | 0.002 | 0.537 |
| rs4163042 | 16 | 13142435 | 0.172 | 16qA1 | Ercc4 | 0.001 | 0.005 |
| rs3714738 | 16 | 14722893 | 0.091 | 16qA1 | Si2 | 0.008 | 1.5E−03 |
| rs4219905 | 16 | 92999911 | 0.348 | 16qC4 | Runx1 | 0.024 | 3.3E−09 |
| rs33886220 | 17 | 33354677 | 0.345 | 17qB1 | Zfp955a | 0.165 | |
| rs33477985 | 17 | 33744640 | 0.345 | 17qB1 | Myo1f | 1.7E−15 | |
| rs33661797 | 17 | 35276713 | 0.456 | 17qB1 | Bag6 | 0.038 | 2.4E−10 |
| rs13482968 | 17 | 37268628 | 0.445 | 17qB1 | Olfr93 | 0.061 | 1.8E−10 |
| rs33270235 | 17 | 38311721 | 0.093 | 17qB1 | Olfr134 | 0.011 | |
| rs3668036 | 17 | 45823731 | 0.339 | 17qB3 | Tmem63b | 0.004 | 7.4E−11 |
| rs3712953 | 17 | 50402827 | 0.076 | 17qC | Dazl | 0.021 | 6.0E−08 |
| rs3720827 | 18 | 63449870 | 0.248 | 18qE1 | Fam38b | 0.020 | |
| rs13483449 | 18 | 77559708 | 0.141 | 18qE3 | 8030462N17Rik | 0.023 | 5.1E−11 |
Gene names that SNPs belong to or are closest to.
SNPs selected by individual Lasso on CD4/CD8 ratio.
| SNP | Chromosome | Position | MAF | Band | Gene | Lasso: Estimate | Lasso: p-value |
| rs13475847 | 1 | 45969220 | 0.301 | 1qC1.1 | Slc40a1 | 0.017 | 1.4E−04 |
| rs3727162 | 1 | 118830782 | 0.098 | 1qE2.3 | Cntp5a | −0.015 | 8.7E−11 |
| rs13476234 | 1 | 172771818 | 0.479 | 1qH3 | Atf6 | 0.031 | 0.004 |
| rs13476239 | 1 | 174151892 | 0.346 | 1qH3 | Atp1a4 | −0.002 | 1.9E−05 |
| rs13476242 | 1 | 175295510 | 0.423 | 1qH3 | Cadm3 | −0.034 | 2.5E−04 |
| rs13476251 | 1 | 176722388 | 0.430 | 1qH3 | Fmn2 | −0.024 | 1.000 |
| rs13476764 | 2 | 127974055 | 0.360 | 2qF1 | Bcl2l11 | 0.007 | 1.000 |
| rs6411422 | 2 | 128199227 | 0.447 | 2qF1 | Gm14005 | 0.018 | 1.000 |
| rs3718812 | 3 | 52605874 | 0.155 | 3qC | Cog6 | 2.4E−04 | 8.2E−10 |
| rs3674296 | 3 | 52738092 | 0.155 | 3qC | Cog6 | −0.010 | 8.5E−10 |
| rs3709732 | 3 | 117669810 | 0.259 | 3qG1 | Snx7 | −0.012 | 7.6E−12 |
| rs6176477 | 3 | 117874757 | 0.259 | 3qG1 | Snx7 | 0.013 | 7.8E−12 |
| rs13477434 | 3 | 136689645 | 0.482 | 3qG3 | Gm10955 | 0.008 | 3.7E−07 |
| rs13477551 | 4 | 9051848 | 0.463 | 4qA1 | Rps18-ps2 | 0.008 | 2.9E−09 |
| rs13477584 | 4 | 17051798 | 0.353 | 4qA2 | Gm11850 | −0.009 | 9.7E−10 |
| rs13478285 | 5 | 61706070 | 0.078 | 5qC3.1 | G6pd2 | −0.027 | 0.001 |
| rs13478286 | 5 | 62201328 | 0.078 | 5qC3.1 | G6pd2 | 3.1E−13 | 0.003 |
| rs29501536 | 5 | 72711078 | 0.465 | 5qC3.2 | Corin | −0.011 | |
| rs31537882 | 5 | 72995337 | 0.465 | 5qC3.2 | Cnga1 | −3.4E−14 | 6.4E−13 |
| rs3711751 | 5 | 137393986 | 0.290 | 5qG2 | 4933404O12Rik | 0.026 | 2.29E−12 |
| rs13478801 | 6 | 65057795 | 0.365 | 6qC1 | Smarcad1 | −0.009 | |
| rs3665567 | 6 | 71342207 | 0.436 | 6qC1 | Rmnd5a | −0.072 | |
| rs13478941 | 6 | 103348834 | 0.441 | 6qE1 | Chl1 | 0.015 | 5.6E−08 |
| rs6334723 | 6 | 134651968 | 0.368 | 6qG1 | Loh12cr1 | −0.042 | 1.1E−12 |
| rs13479376 | 7 | 91596873 | 0.074 | 7qD3 | Gm2115 | 0.005 | 4.2E−08 |
| rs13479465 | 7 | 120046978 | 0.075 | 7qF1 | Tead1 | −0.020 | 2.1E−04 |
| rs13479621 | 8 | 15993378 | 0.451 | 8qA1.1 | Csmd1 | −0.044 | |
| rs13479930 | 8 | 97201198 | 0.248 | 8qC5 | Pllp | 0.013 | 0.001 |
| rs6180306 | 8 | 109166165 | 0.074 | 8qD3 | Cdh1 | −0.006 | 0.002 |
| rs29634420 | 9 | 16961090 | 0.075 | 9qA2 | Gm5611 | −0.018 | 4.2E−04 |
| rs6280411 | 10 | 125575083 | 0.451 | 10qD3 | AC153489.1 | −0.024 | 1.11E−07 |
| rs3701568 | 10 | 128933102 | 0.248 | 10qD3 | Olfr790 | 0.001 | 1.1E−09 |
| rs3670360 | 11 | 6153674 | 0.107 | 11qA1 | Ddx56 | 0.039 | |
| rs3656583 | 11 | 64442910 | 0.456 | 11qB3 | Gm12291 | 0.030 | 1.5E−09 |
| rs6297520 | 11 | 64472210 | 0.456 | 11qB3 | Gm12291 | −0.003 | 2.5E−09 |
| rs13481170 | 11 | 95489416 | 0.074 | 11qD | Gm11528 | 0.038 | |
| rs3684699 | 12 | 28209015 | 0.076 | 12qA2 | Sox11 | 0.027 | 0.001 |
| rs13481411 | 12 | 42060667 | 0.071 | 12qB1 | Immp2l | −0.015 | 0.001 |
| rs13481412 | 12 | 42722660 | 0.071 | 12qB1 | Immp2l | 2.4E−16 | 0.004 |
| rs13472132 | 13 | 55515090 | 0.184 | 13qB1 | Slc34a1 | 0.020 | 3.4E−08 |
| rs13482225 | 14 | 65324729 | 0.363 | 14qD1 | Kif13b | −0.011 | 0.076 |
| rs4139535 | 14 | 109988359 | 0.082 | 14qE3 | Slitrk1 | −0.004 | 0.009 |
| rs6209981 | 14 | 110067383 | 0.082 | 14qE3 | Slitrk1 | −6.3E−15 | 0.031 |
| rs31100152 | 14 | 110432009 | 0.082 | 14qE3 | n-R5s50 | 7.7E−14 | 0.041 |
| rs4163058 | 16 | 13269758 | 0.181 | 16qA1 | Mkl2 | 0.003 | 9.3E−06 |
| rs4163196 | 16 | 13400890 | 0.181 | 16qA1 | Mkl2 | 1.5E−16 | 8.8E−05 |
| rs4199044 | 16 | 69289859 | 0.449 | 16qC2 | Speer2 | 0.007 | 1.2E−06 |
| rs13482952 | 17 | 32937360 | 0.345 | 17qB1 | Zfp811 | 0.230 | |
| rs13459151 | 17 | 33078090 | 0.345 | 17qB1 | Cyp4f13 | 1.3E−16 | |
| rs33886220 | 17 | 33354677 | 0.345 | 17qB1 | Zfp955a | −1.4E−17 | |
| rs33661797 | 17 | 35276713 | 0.456 | 17qB1 | Bag6 | 0.126 | 2.1E−11 |
| rs3712953 | 17 | 50402827 | 0.076 | 17qC | Dazl | 0.076 | |
| rs6194426 | 19 | 50203520 | 0.286 | 19qD1 | Sorcs1 | 0.010 | 1.5E−06 |
Gene names that SNPs belong to or are closest to.
SNPs selected by individual Lasso on CD4∶CD3.
| SNP | Chromosome | Position | MAF | Band | Gene | Lasso: Estimate | Lasso: p-value |
| rs13475794 | 1 | 32202097 | 0.189 | 1qB | Khdrbs2 | −9.5E−04 | 1.9E−05 |
| rs13476239 | 1 | 174151892 | 0.346 | 1qH3 | Atp1a4 | 0.012 | 0.014 |
| rs3682465 | 2 | 156317950 | 0.146 | 2qH1 | Epb4.1l1 | 0.007 | 4.7E−08 |
| rs3679962 | 3 | 127795535 | 0.490 | 3qG2 | Gm10650 | 0.016 | 0.135 |
| rs6290401 | 3 | 142297855 | 0.314 | 3qH1 | Gbp2 | 0.003 | 1.2E−08 |
| rs13477459 | 3 | 142492044 | 0.354 | 3qH1 | Pkn2 | 0.011 | 2.6E−09 |
| rs29501536 | 5 | 72711078 | 0.465 | 5qC3.2 | Corin | 6.1E−05 | |
| rs3710735 | 5 | 73123583 | 0.465 | 5qC3.2 | Txk | −1.4E−16 | 2.2E−12 |
| rs6340166 | 5 | 73188279 | 0.465 | 5qC3.2 | Tec | 0.011 | 1.000 |
| rs4225267 | 5 | 73700837 | 0.465 | 5qC3.2 | Ociad1 | −1.8E−16 | 1.000 |
| rs3711751 | 5 | 137393986 | 0.290 | 5qG2 | 4933404O12Rik | −0.010 | 9.4E−12 |
| rs13478656 | 6 | 21893927 | 0.078 | 6qA3.1 | Ing3 | 0.014 | 2.0E−10 |
| rs13478800 | 6 | 64766250 | 0.435 | 6qC1 | Atoh1 | 0.005 | |
| rs3665567 | 6 | 71342207 | 0.442 | 6qC1 | Rmnd5a | 0.100 | |
| rs13479621 | 8 | 15993378 | 0.441 | 8qA1.1 | Csmd1 | 0.027 | |
| rs13479673 | 8 | 30344780 | 0.102 | 8qA3 | Unc5d | 0.011 | 8.6E−09 |
| rs13480141 | 9 | 36754648 | 0.474 | 9qA4 | Pknox2 | 0.029 | 1.2E−11 |
| rs13480153 | 9 | 40483617 | 0.455 | 9qA5.1 | 9030425E11Rik | −0.003 | 1.7E−11 |
| rs6280411 | 10 | 125575083 | 0.451 | 10qD3 | AC153489.1 | 0.028 | 8.5E−11 |
| rs13480817 | 10 | 125932724 | 0.451 | 10qD3 | AC153489.1 | −1.7E−16 | 3.2E−10 |
| rs29383570 | 10 | 127146595 | 0.420 | 10qD3 | Myo1a | 0.004 | 4.3E−10 |
| rs13481170 | 11 | 95489416 | 0.074 | 11qD | Gm11528 | −0.055 | |
| rs3692326 | 13 | 99316615 | 0.143 | 13qD1 | Gm10320 | 0.014 | |
| rs4219905 | 16 | 92999911 | 0.348 | 16qC4 | Runx1 | −0.040 | |
| rs13482952 | 17 | 32937360 | 0.345 | 17qB1 | Zfp811 | −0.194 | |
| rs33661797 | 17 | 35276713 | 0.456 | 17qB1 | Bag6 | −0.154 | |
| rs33270235 | 17 | 38311721 | 0.093 | 17qB1 | Olfr134 | 0.007 | |
| rs3712953 | 17 | 50402827 | 0.076 | 17qC | Dazl | −0.113 | |
| rs13483448 | 18 | 77559708 | 0.141 | 18qE3 | Loxhd1 | −0.023 | 6.7E−13 |
| rs13483449 | 18 | 77876027 | 0.141 | 18qE3 | 8030462N17Rik | 4.8E−15 | 8.6E−13 |
Gene names that SNPs belong to or are closest to.