Abdul Wahid, Dost Muhammad Khan, Ijaz Hussain.
Abstract
High-dimensional data are commonly encountered in various scientific fields and pose great challenges to modern statistical analysis. To address this issue, different penalized regression procedures have been introduced in the literature, but these methods cannot cope with the problem of outliers and leverage points in heavy-tailed high-dimensional data. For this purpose, a new Robust Adaptive Lasso (RAL) method is proposed, which is based on a Pearson residuals weighting scheme. The weight function determines the compatibility of each observation with the assumed model and downweights observations that are inconsistent with it. It is observed that the RAL estimator can correctly select the covariates with non-zero coefficients and simultaneously estimate the parameters, not only in the presence of influential observations but also in the presence of high multicollinearity. We also discuss the model selection oracle property and the asymptotic normality of the RAL. Simulation findings and real data examples also demonstrate the better performance of the proposed penalized regression approach.
Year: 2017 PMID: 28846717 PMCID: PMC5573134 DOI: 10.1371/journal.pone.0183518
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
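As a rough guide to what the abstract describes, a weighted adaptive lasso objective can be written as below. This is a generic sketch, not the paper's exact estimator: the observation weights $w_i$ (which the abstract says come from a Pearson-residual weight function), the pilot estimate $\tilde{\beta}$, the tuning parameter $\lambda$ and the exponent $\gamma$ are all notation assumed here for illustration.

```latex
\hat{\beta}
  = \arg\min_{\beta}
    \sum_{i=1}^{n} w_i \left( y_i - x_i^{\top}\beta \right)^2
    + \lambda \sum_{j=1}^{p} \frac{|\beta_j|}{|\tilde{\beta}_j|^{\gamma}},
  \qquad 0 \le w_i \le 1 .
```

Under this reading, an observation that is inconsistent with the assumed model receives a small $w_i$ and contributes little to the fit, while the coefficient-specific penalty performs the variable selection.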
Simulation results for the location level of contamination in the model (1 − δ)N(0, 1) + δN(−10, 1).
| (n, p) | δ | Method | Med.PMSE (r = 0.5) | C (r = 0.5) | I (r = 0.5) | Med.PMSE (r = 0.85) | C (r = 0.85) | I (r = 0.85) |
|---|---|---|---|---|---|---|---|---|
| (50,8) | 0% | Lasso | 1.448(1.58) | 4.384 | 0 | 1.361(1.76) | 5 | 0 |
|  |  | E.net | 1.479(1.82) | 2.388 | 0 | 1.387(1.71) | 2.242 | 0 |
|  |  | Alasso | 1.335(1.83) | 4.837 | 0 | 1.405(2.03) | 3 | 1 |
|  |  | CBPR | 1.202(1.46) | 3.593 | 0 | 1.311(1.68) | 3.512 | 0 |
|  |  | RAL | 1.254(1.33) | 4.713 | 0 | 1.430(1.72) | 2.62 | 0 |
|  | 10% | Lasso | 1.187(1.53) | 0.422 | 0 | 1.131(1.31) | 4 | 0 |
|  |  | E.net | 1.198(1.60) | 1.946 | 0 | 1.161(1.45) | 1.532 | 0 |
|  |  | Alasso | 1.124(1.89) | 4.568 | 0 | 1.208(2.05) | 3 | 0.094 |
|  |  | CBPR | 1.100(1.30) | 4.798 | 0 | 1.076(1.19) | 4 | 0 |
|  |  | RAL | 1.001(1.02) | 3.913 | 0 | 1.029(1.24) | 4.837 | 0 |
|  | 20% | Lasso | 0.969(1.34) | 3.084 | 0.02 | 0.930(1.16) | 5 | 1 |
|  |  | E.net | 1.011(1.39) | 2.189 | 0 | 0.939(1.20) | 1 | 0 |
|  |  | Alasso | 0.983(2.62) | 3.993 | 0 | 1.082(2.54) | 2.386 | 1.756 |
|  |  | CBPR | 0.900(1.12) | 3.268 | 0 | 0.920(1.11) | 2.704 | 0 |
|  |  | RAL | 0.850(0.92) | 4.735 | 0 | 0.823(0.89) | 4.910 | 0 |
|  | 30% | Lasso | 0.801(0.95) | 3.004 | 0 | 0.779(0.98) | 3.296 | 0 |
|  |  | E.net | 0.837(1.13) | 2.358 | 0 | 0.807(0.99) | 1.935 | 0 |
|  |  | Alasso | 0.865(3.65) | 3 | 0 | 1.001(3.42) | 3.998 | 0.014 |
|  |  | CBPR | 0.753(0.96) | 4.006 | 0 | 0.802(0.98) | 4.113 | 0 |
|  |  | RAL | 0.704(0.80) | 4.442 | 0 | 0.707(0.81) | 5 | 0 |
| (100,8) | 0% | Lasso | 1.187(0.83) | 3 | 0 | 1.169(0.79) | 4.621 | 0 |
|  |  | E.net | 1.215(0.85) | 1.576 | 0 | 1.179(0.77) | 2.231 | 0 |
|  |  | Alasso | 1.140(0.87) | 4.897 | 0 | 1.162(0.82) | 4.003 | 0 |
|  |  | CBPR | 1.210(0.75) | 3.457 | 0 | 1.139(0.78) | 5 | 0 |
|  |  | RAL | 1.100(0.70) | 4.365 | 0 | 1.121(0.73) | 4.159 | 0 |
|  | 10% | Lasso | 0.964(0.69) | 3 | 0 | 0.956(0.64) | 4 | 0 |
|  |  | E.net | 0.981(0.67) | 3.706 | 0 | 0.968(0.70) | 3.714 | 0 |
|  |  | Alasso | 0.934(0.62) | 3.232 | 0 | 0.937(0.71) | 4.5 | 0.002 |
|  |  | CBPR | 0.932(0.61) | 4 | 0 | 0.938(0.62) | 3.602 | 0 |
|  |  | RAL | 0.895(0.60) | 4.578 | 0 | 0.910(0.51) | 4.330 | 0 |
|  | 20% | Lasso | 0.805(0.61) | 3.065 | 0 | 0.791(0.56) | 4 | 0 |
|  |  | E.net | 0.812(0.54) | 2.498 | 0 | 0.793(0.53) | 2.107 | 0 |
|  |  | Alasso | 0.773(0.60) | 4.597 | 0 | 0.809(0.90) | 3.084 | 0 |
|  |  | CBPR | 0.784(0.55) | 3.616 | 0 | 0.775(0.50) | 3.272 | 0 |
|  |  | RAL | 0.758(0.54) | 5 | 0 | 0.745(0.50) | 4.530 | 0 |
|  | 30% | Lasso | 0.678(0.45) | 4 | 0 | 0.667(0.47) | 3 | 0 |
|  |  | E.net | 0.697(0.54) | 1.296 | 0 | 0.684(0.48) | 3.98 | 0 |
|  |  | Alasso | 0.666(0.84) | 4.483 | 0.01 | 0.740(1.13) | 4.15 | 0.012 |
|  |  | CBPR | 0.664(0.46) | 5 | 0 | 0.672(0.44) | 4 | 0.09 |
|  |  | RAL | 0.649(0.40) | 4.805 | 0 | 0.630(0.39) | 5 | 0 |
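The location-contamination model in the caption above, (1 − δ)N(0, 1) + δN(−10, 1), can be simulated with a few lines of Python. This is a minimal sketch using only the standard library; the function name `contaminated_normal_errors` and its parameters are hypothetical and are not taken from the paper.

```python
import random

def contaminated_normal_errors(n, delta, loc=-10.0, scale=1.0, seed=None):
    """Draw n errors from (1 - delta) * N(0, 1) + delta * N(loc, scale^2).

    Returns the errors and a parallel list of flags marking which draws
    came from the contaminating component. All names are illustrative.
    """
    rng = random.Random(seed)
    errors, flags = [], []
    for _ in range(n):
        # With probability delta the draw comes from the contaminating component.
        is_outlier = rng.random() < delta
        if is_outlier:
            errors.append(rng.gauss(loc, scale))
        else:
            errors.append(rng.gauss(0.0, 1.0))
        flags.append(is_outlier)
    return errors, flags

# Example: n = 50 observations with a 10% contamination level.
errors, flags = contaminated_normal_errors(n=50, delta=0.10, seed=1)
print(sum(flags), "of", len(errors), "errors come from the contaminating component")
```

The same helper could cover a scale-contamination setting such as N(0, 25) by passing `loc=0.0` and the corresponding `scale`; whether 25 denotes the variance or the standard deviation is not stated in this record, so that choice is left to the reader.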
Simulation results for the scale level of contamination in the model (1 − δ)N(0, 1) + δN(0, 25).
| (n, p) | δ | Method | Med.PMSE (r = 0.5) | C (r = 0.5) | I (r = 0.5) | Med.PMSE (r = 0.85) | C (r = 0.85) | I (r = 0.85) |
|---|---|---|---|---|---|---|---|---|
| (50,8) | 0% | Lasso | 1.390(1.70) | 3.411 | 0 | 1.372(1.62) | 3.022 | 0 |
|  |  | E.net | 1.486(1.78) | 1.990 | 0 | 1.390(1.72) | 2.683 | 0 |
|  |  | Alasso | 1.357(1.91) | 4.851 | 0 | 1.431(1.97) | 4.543 | 0.07 |
|  |  | CBPR | 1.335(1.65) | 3.751 | 0 | 1.325(1.37) | 3.337 | 0 |
|  |  | RAL | 1.262(1.35) | 4.658 | 0 | 1.262(1.33) | 4.586 | 0 |
|  | 10% | Lasso | 10.246(13.64) | 2.103 | 0 | 9.521(10.76) | 3.938 | 0 |
|  |  | E.net | 9.986(12.59) | 2.126 | 0 | 9.402(10.27) | 2.587 | 0 |
|  |  | Alasso | 10.445(15.52) | 2.839 | 0.447 | 10.037(11.99) | 2 | 0.747 |
|  |  | CBPR | 10.262(12.79) | 3.497 | 0 | 9.619(10.40) | 2.940 | 0.424 |
|  |  | RAL | 9.215(11.76) | 3.451 | 0 | 8.720(9.32) | 4.07 | 0 |
|  | 20% | Lasso | 34.591(41.62) | 2.191 | 0.382 | 32.384(35.41) | 3.603 | 0 |
|  |  | E.net | 34.014(40.81) | 0.976 | 0.776 | 32.458(38.22) | 1.934 | 0.92 |
|  |  | Alasso | 37.373(52.49) | 3.515 | 0.64 | 35.498(46.89) | 0 | 1 |
|  |  | CBPR | 37.649(48.59) | 1 | 0 | 31.654(33.38) | 1.730 | 1.645 |
|  |  | RAL | 32.768(41.57) | 3.972 | 0 | 30.902(33.21) | 4.369 | 0.043 |
|  | 30% | Lasso | 73.419(82.02) | 1.909 | 1 | 69.504(76.62) | 3.213 | 0.921 |
|  |  | E.net | 73.889(90.57) | 0.958 | 0 | 68.915(91.13) | 0.870 | 1.793 |
|  |  | Alasso | 81.505(97.31) | 4 | 1.749 | 76.578(96.12) | 1.948 | 2.01 |
|  |  | CBPR | 74.194(98.32) | 3.370 | 0.034 | 69.384(80.65) | 3.265 | 2 |
|  |  | RAL | 72.284(93.72) | 4.639 | 0 | 67.299(82.93) | 4.037 | 0.745 |
| (100,8) | 0% | Lasso | 1.194(0.78) | 3.07 | 0 | 1.162(0.79) | 4.007 | 0 |
|  |  | E.net | 1.190(0.87) | 2.488 | 0 | 1.199(0.83) | 1 | 0 |
|  |  | Alasso | 1.150(0.78) | 4.524 | 0 | 1.159(0.84) | 4.012 | 0 |
|  |  | CBPR | 1.159(0.80) | 3.752 | 0 | 1.161(0.80) | 3.157 | 0 |
|  |  | RAL | 1.113(0.70) | 4.866 | 0 | 1.102(0.68) | 5 | 0 |
|  | 10% | Lasso | 8.487(5.71) | 1.106 | 0 | 8.165(5.31) | 4.795 | 0.02 |
|  |  | E.net | 8.547(6.11) | 2.318 | 0 | 8.155(5.47) | 3.340 | 0 |
|  |  | Alasso | 8.125(5.75) | 3.759 | 0 | 8.431(5.56) | 2.992 | 0 |
|  |  | CBPR | 8.294(5.65) | 3.456 | 0 | 8.108(5.38) | 3.335 | 0 |
|  |  | RAL | 8.012(5.45) | 4.307 | 0 | 7.491(4.93) | 4.267 | 0 |
|  | 20% | Lasso | 30.023(19.73) | 1.086 | 0 | 29.266(19.08) | 3.811 | 0.809 |
|  |  | E.net | 29.779(19.84) | 2.259 | 0 | 29.768(18.34) | 0.960 | 0 |
|  |  | Alasso | 30.922(21.29) | 2 | 0.805 | 29.275(19.77) | 2.99 | 1.982 |
|  |  | CBPR | 29.539(20.40) | 2.82 | 0 | 29.056(19.04) | 4.392 | 0.948 |
|  |  | RAL | 29.322(18.19) | 4.72 | 0 | 28.592(18.94) | 4.572 | 0.114 |
|  | 30% | Lasso | 66.129(44.12) | 2.474 | 0.959 | 62.991(42.60) | 1.749 | 1.808 |
|  |  | E.net | 64.387(41.11) | 2.014 | 0 | 63.316(39.76) | 3.268 | 0.230 |
|  |  | Alasso | 67.047(44.33) | 3.011 | 2 | 65.929(43.11) | 2.847 | 0.639 |
|  |  | CBPR | 65.303(43.02) | 3.685 | 0 | 63.563(39.06) | 3.506 | 0.715 |
|  |  | RAL | 63.560(44.96) | 4.659 | 0 | 63.039(42.10) | 4.078 | 0.039 |
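The tables report Med.PMSE, C and I without defining them in this record. The sketch below assumes a convention that is common in the variable-selection literature: C is the average number of truly zero coefficients that are estimated as zero, I is the average number of truly non-zero coefficients that are wrongly estimated as zero, and Med.PMSE is the median prediction mean squared error over replications. The function names, the coefficient vector and the three replications are hypothetical illustrations, not values from the paper.

```python
from statistics import median

def selection_counts(true_beta, est_beta, tol=1e-8):
    """Return (C, I) for one fitted model under the assumed convention.

    C: true zeros estimated as zero; I: true non-zeros estimated as zero.
    """
    c = sum(1 for t, e in zip(true_beta, est_beta) if t == 0 and abs(e) <= tol)
    i = sum(1 for t, e in zip(true_beta, est_beta) if t != 0 and abs(e) <= tol)
    return c, i

def summarize(replications, true_beta):
    """Summarize (pmse, est_beta) pairs, one pair per simulated data set."""
    counts = [selection_counts(true_beta, est) for _, est in replications]
    med_pmse = median(p for p, _ in replications)
    mean_c = sum(c for c, _ in counts) / len(counts)
    mean_i = sum(i for _, i in counts) / len(counts)
    return med_pmse, mean_c, mean_i

# Hypothetical setup: 8 predictors, 5 of which are truly zero.
true_beta = [3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
reps = [
    (1.2, [2.9, 1.4, 0.0, 0.0, 2.1, 0.0, 0.0, 0.0]),  # all zeros recovered
    (1.5, [3.1, 0.0, 0.0, 0.3, 1.8, 0.0, 0.0, 0.0]),  # one signal dropped, one noise kept
    (0.9, [2.8, 1.6, 0.0, 0.0, 2.2, 0.1, 0.0, 0.0]),  # one noise variable kept
]
med_pmse, mean_c, mean_i = summarize(reps, true_beta)
print(med_pmse, round(mean_c, 3), round(mean_i, 3))
```

Averaging the per-replication counts in this way would be consistent with the non-integer C and I entries in the tables, and with C never exceeding 5 when p = 8.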
Simulation results for the location level of contamination in the model (1 − δ)N(0, 1) + δN(−10, 1) in high-dimensional data set.
| δ | Method | Med.PMSE (r = 0.5) | C (r = 0.5) | I (r = 0.5) | Med.PMSE (r = 0.85) | C (r = 0.85) | I (r = 0.85) |
|---|---|---|---|---|---|---|---|
| 0% | Lasso | 5.413(27.8) | 379.001 | 1.003 | 2.017(2.00) | 384.3 | 3.17 |
|  | E.net | 6.639(23.2) | 379.003 | 0 | 1.957(1.98) | 383.043 | 0.14 |
|  | CBPR | 2.410(2.20) | 376.620 | 0 | 2.257(1.93) | 366.941 | 0 |
|  | RAL | 2.054(1.50) | 370.820 | 0 | 2.052(2.00) | 373.156 | 0 |
| 10% | Lasso | 8.691(27.20) | 372.002 | 2.005 | 1.640(1.66) | 376.106 | 0 |
|  | E.net | 4.942(22.2) | 379.995 | 0 | 1.576(1.63) | 373.850 | 0 |
|  | CBPR | 1.687(1.50) | 381.009 | 0 | 1.674(1.51) | 363.644 | 0 |
|  | RAL | 1.894(1.01) | 369.526 | 0 | 1.400(1.20) | 379.245 | 0 |
| 20% | Lasso | 4.896(27.6) | 379.002 | 2.001 | 1.369(1.49) | 376.624 | 0 |
|  | E.net | 5.769(22.5) | 379.001 | 0 | 1.357(1.35) | 373.523 | 0 |
|  | CBPR | 1.975(1.80) | 381.012 | 0 | 1.213(1.03) | 375.590 | 0 |
|  | RAL | 1.678(1.20) | 370.550 | 0 | 1.181(1.04) | 377.505 | 0 |
| 30% | Lasso | 3.140(35.07) | 364.230 | 0.005 | 1.178(1.23) | 369.374 | 0 |
|  | E.net | 4.361(32.05) | 357.672 | 0 | 1.150(1.33) | 374.236 | 0 |
|  | CBPR | 2.273(1.63) | 368.892 | 0 | 1.103(0.96) | 369.537 | 0 |
|  | RAL | 1.297(1.66) | 370.747 | 0 | 0.152(1.05) | 381.078 | 0 |
Simulation results are based on 1000 replications and δ is the level of contamination in the model (1 − δ)exp(1) + δexp(0.2).
| (n, p) | δ | Method | Med.PMSE (r = 0.5) | C (r = 0.5) | I (r = 0.5) | Med.PMSE (r = 0.85) | C (r = 0.85) | I (r = 0.85) |
|---|---|---|---|---|---|---|---|---|
| (50,8) | 0% | Lasso | 1.354(2.73) | 3.325 | 0 | 1.333(2.43) | 2.660 | 1 |
|  |  | E.net | 1.336(2.41) | 1.669 | 0 | 1.356(2.48) | 1.795 | 0 |
|  |  | Alasso | 1.282(4.11) | 3.648 | 0 | 1.389(3.22) | 4.382 | 0 |
|  |  | CBPR | 1.375(2.60) | 3.204 | 0.007 | 1.321(2.33) | 3.179 | 0 |
|  |  | RAL | 1.192(2.21) | 4.639 | 0 | 1.227(2.09) | 4.575 | 0 |
|  | 10% | Lasso | 1.455(2.48) | 1.554 | 0 | 1.414(2.10) | 3.617 | 0 |
|  |  | E.net | 1.494(2.60) | 0.540 | 0 | 1.394(2.22) | 1.832 | 1 |
|  |  | Alasso | 1.385(3.13) | 4.064 | 0 | 1.610(3.19) | 3 | 0 |
|  |  | CBPR | 1.372(2.24) | 4.590 | 0 | 1.375(2.34) | 2.06 | 0 |
|  |  | RAL | 1.253(2.06) | 4.892 | 0 | 1.243(2.00) | 4.668 | 0 |
|  | 20% | Lasso | 2.226(3.68) | 2.311 | 0 | 2.117(3.16) | 2.323 | 1 |
|  |  | E.net | 2.305(4.05) | 2.373 | 0 | 2.256(3.35) | 2.829 | 0 |
|  |  | Alasso | 2.382(4.85) | 4.249 | 0.021 | 2.480(4.58) | 4.163 | 1 |
|  |  | CBPR | 2.134(3.36) | 3.472 | 0.002 | 2.157(3.02) | 3.619 | 0 |
|  |  | RAL | 1.953(2.89) | 4.949 | 0 | 2.016(2.81) | 4.888 | 0.001 |
|  | 30% | Lasso | 3.800(6.33) | 2.393 | 0 | 3.567(5.71) | 4.091 | 0.17 |
|  |  | E.net | 3.810(6.51) | 1.522 | 0 | 3.508(6.24) | 3.261 | 0 |
|  |  | Alasso | 3.941(8.40) | 3.930 | 0.052 | 3.908(6.68) | 3.599 | 0.233 |
|  |  | CBPR | 3.493(5.85) | 3.719 | 0 | 3.527(6.40) | 3.843 | 0 |
|  |  | RAL | 2.468(5.42) | 4.221 | 0 | 2.469(5.32) | 4.489 | 0.04 |
| (100,8) | 0% | Lasso | 1.135(1.27) | 3.198 | 0 | 1.154(0.81) | 3 | 0 |
|  |  | E.net | 1.133(1.28) | 1.421 | 0 | 1.185(0.80) | 1.62 | 0 |
|  |  | Alasso | 1.081(1.33) | 4.859 | 0 | 1.158(0.78) | 4.99 | 0 |
|  |  | CBPR | 1.103(1.29) | 3.069 | 0 | 1.111(0.77) | 4.73 | 0 |
|  |  | RAL | 1.173(1.44) | 3.473 | 0 | 1.106(0.74) | 5 | 0 |
|  | 10% | Lasso | 1.205(1.19) | 3.117 | 0 | 1.222(0.90) | 3 | 0 |
|  |  | E.net | 1.235(1.25) | 0.388 | 0 | 1.235(0.92) | 1.361 | 0 |
|  |  | Alasso | 1.176(1.14) | 4.189 | 0 | 1.232(0.88) | 2.943 | 1 |
|  |  | CBPR | 1.166(1.30) | 3.683 | 0 | 1.220(0.85) | 4 | 0 |
|  |  | RAL | 1.131(1.17) | 4.228 | 0 | 1.181(0.81) | 4.430 | 0 |
|  | 20% | Lasso | 1.925(1.98) | 2.004 | 0 | 1.863(1.64) | 3.329 | 0 |
|  |  | E.net | 1.935(1.88) | 1.547 | 0 | 1.876(1.72) | 3.564 | 0 |
|  |  | Alasso | 1.860(1.85) | 3.007 | 0 | 1.898(1.82) | 3.519 | 0 |
|  |  | CBPR | 1.786(1.69) | 4.166 | 0 | 1.868(1.60) | 2.015 | 0 |
|  |  | RAL | 1.062(1.68) | 4.301 | 0 | 1.754(1.55) | 4.865 | 0 |
|  | 30% | Lasso | 3.130(3.18) | 2.418 | 0.031 | 3.091(3.09) | 3.001 | 0 |
|  |  | E.net | 3.203(3.64) | 1.883 | 0 | 3.090(3.03) | 2.554 | 0 |
|  |  | Alasso | 3.148(3.56) | 3.370 | 0.926 | 3.240(3.52) | 1 | 0.001 |
|  |  | CBPR | 3.043(3.26) | 4.392 | 0.012 | 3.075(3.21) | 2.957 | 0 |
|  |  | RAL | 2.960(3.00) | 4.768 | 0 | 2.939(2.98) | 4.614 | 0 |
Simulation results for the location level of contamination in the model (1 − δ)exp(1) + δexp(0.2) in high-dimensional data set.
| δ | Method | Med.PMSE (r = 0.5) | C (r = 0.5) | I (r = 0.5) | Med.PMSE (r = 0.85) | C (r = 0.85) | I (r = 0.85) |
|---|---|---|---|---|---|---|---|
| 0% | Lasso | 5.170(36.74) | 354.096 | 2.143 | 2.057(2.12) | 374.469 | 0 |
|  | E.net | 6.704(33.29) | 352.160 | 1 | 1.982(1.81) | 375.198 | 0 |
|  | CBPR | 2.684(3.97) | 368.523 | 0 | 1.707(1.49) | 376.233 | 0 |
|  | RAL | 2.878(4.37) | 367.174 | 0 | 1.677(1.55) | 380.644 | 0 |
| 10% | Lasso | 5.763(35.56) | 363.011 | 0.034 | 2.160(2.26) | 370.476 | 0.015 |
|  | E.net | 7.306(37.42) | 354.849 | 0 | 2.072(2.18) | 373.299 | 0 |
|  | CBPR | 3.919(5.57) | 359.637 | 0 | 1.866(1.66) | 372.595 | 0 |
|  | RAL | 3.881(5.10) | 367.295 | 0 | 1.892(1.58) | 381.919 | 0 |
| 20% | Lasso | 8.281(32.98) | 364.791 | 0.122 | 3.300(3.63) | 370.232 | 0 |
|  | E.net | 9.923(37.64) | 356.168 | 0 | 3.116(3.48) | 375.916 | 0 |
|  | CBPR | 3.682(5.37) | 369.929 | 0 | 3.947(3.10) | 366.586 | 0 |
|  | RAL | 2.551(4.43) | 370.475 | 0 | 2.813(2.95) | 380.964 | 0 |
| 30% | Lasso | 13.869(46.19) | 361.1 | 0.233 | 5.369(6.33) | 371.226 | 0 |
|  | E.net | 15.110(44.32) | 347.852 | 0 | 5.092(5.55) | 375.691 | 0.108 |
|  | CBPR | 5.566(8.13) | 365.307 | 0 | 5.391(6.381) | 363.874 | 0 |
|  | RAL | 4.659(7.91) | 377.308 | 0 | 4.263(4.92) | 379.011 | 0 |
Estimated coefficients and model error results for various regularization procedures applied to the prostate data.
Dashed entries (shown here as "…") correspond to predictors whose coefficients are estimated as 0.
| Variable | Lasso | Elastic net | Alasso | CBPR | RAL |
|---|---|---|---|---|---|
| Intercept | 0.635 | 0.206 | 2.921 | 1.844 | 0.208 |
| lweight | −0.282 | … | −0.821 | −0.425 | … |
| age | 0.027 | 0.008 | … | 0.029 | … |
| lbph | … | … | … | … | … |
| svi | … | … | … | … | … |
| lcp | 1.135 | 0.836 | 1.213 | 1.266 | 0.648 |
| gleason | 0.088 | 0.102 | 0.267 | … | … |
| pgg45 | … | … | … | … | … |
| lpsa | 0.094 | 0.05 | 0.374 | … | 0.342 |
| Test Error | 1.997 | 1.132 | 3.112 | 2.413 | 0.807 |
Fig 1. The Q-Q plot and boxplot of the response variable “lcavol”.
Fig 2. (a) Histogram and (b) boxplot for production rate in riboflavin.
Fig 3. Comparison of percentage test error.