Changying Guo, Biqin Song, Yingjie Wang, Hong Chen, Huijuan Xiong.
Abstract
Model-free variable selection has attracted increasing interest recently due to its flexibility in algorithmic design and outstanding performance in real-world applications. However, most existing statistical methods are formulated under the mean square error (MSE) criterion and are therefore susceptible to non-Gaussian noise and outliers. Because the MSE criterion requires the noise to satisfy a Gaussian condition, it can hamper the effectiveness of model-free methods in complex circumstances. To circumvent this issue, we present a new model-free variable selection algorithm that integrates kernel modal regression with gradient-based variable identification. The derived modal regression estimator is closely related to information-theoretic learning under the maximum correntropy criterion, and it ensures algorithmic robustness to complex noise by learning the conditional mode instead of the conditional mean. The gradient information of the estimator offers a model-free metric for screening the key variables. In theory, we establish a generalization bound and variable selection consistency for the new model. In applications, the effectiveness of the proposed method is verified by data experiments.
Keywords: generalization error; maximum correntropy criterion; modal regression; reproducing kernel Hilbert space; variable selection
Year: 2019 PMID: 33267117 PMCID: PMC7514890 DOI: 10.3390/e21040403
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
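The abstract's central idea, replacing the conditional mean with the conditional mode through the maximum correntropy criterion, can be illustrated on the simplest possible case: estimating a single location parameter. The sketch below is not the paper's kernel modal regression estimator; it is a minimal scalar analogue, and the function name `mcc_location`, the bandwidth `sigma`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def mcc_location(y, sigma=1.0, n_iter=100, tol=1e-8):
    """Scalar location estimate under the maximum correntropy criterion.

    Maximises (1/n) * sum_i exp(-(y_i - m)^2 / (2 * sigma^2)) over m with a
    fixed-point (mean-shift style) iteration. Hypothetical helper, not taken
    from the paper.
    """
    y = np.asarray(y, dtype=float)
    m = np.median(y)  # robust starting point
    for _ in range(n_iter):
        # Gaussian kernel weights: points far from m contribute almost nothing.
        w = np.exp(-(y - m) ** 2 / (2.0 * sigma ** 2))
        m_new = np.sum(w * y) / np.sum(w)
        if abs(m_new - m) < tol:
            m = m_new
            break
        m = m_new
    return m

# Synthetic sample (assumed for illustration): bulk near 2.0 plus gross outliers.
rng = np.random.default_rng(0)
clean = rng.normal(loc=2.0, scale=0.3, size=200)
outliers = rng.normal(loc=20.0, scale=1.0, size=20)
y = np.concatenate([clean, outliers])

mean_est = y.mean()                    # MSE-optimal estimate, pulled by outliers
mode_est = mcc_location(y, sigma=0.5)  # correntropy estimate, stays near the bulk

print(f"mean estimate: {mean_est:.3f}")
print(f"mode estimate: {mode_est:.3f}")
```

The sample mean minimises the squared error and is dragged toward the outliers, while the correntropy-based estimate down-weights them exponentially and stays near the dominant mode. The bandwidth `sigma` controls how local the estimate is; a very large `sigma` makes the weights nearly uniform and recovers the mean.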
Properties of different regression algorithms.
| | Lasso | RMR | SpAM | COSSO | GM | Ours |
|---|---|---|---|---|---|---|
| Learning criterion | MSE | MRC | MSE | MSE | MSE | MRC |
| Model assumption | linear | linear | additive | additive | model-free | model-free |
The averaged performance on simulated data in Example 1 (left) and Example 2 (right).
| Noise | Setting | Method | SIZE (Ex. 1) | TP (Ex. 1) | FP (Ex. 1) | Up (Ex. 1) | Op (Ex. 1) | Cp (Ex. 1) | ASE (Ex. 1) | SIZE (Ex. 2) | TP (Ex. 2) | FP (Ex. 2) | Up (Ex. 2) | Op (Ex. 2) | Cp (Ex. 2) | ASE (Ex. 2) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gaussian | (100, 150) | Lasso | 3.92 | 3.92 | 0.00 | 0.36 | 0.00 | 0.64 | 1.369 | 4.40 | 4.28 | 0.12 | 0.44 | 0.12 | 0.44 | 5.112 |
| | | SpAM | 4.12 | 3.92 | 0.20 | 0.08 | 0.16 | 0.76 | 1.075 | 5.02 | 4.98 | 0.04 | 0.04 | 0.04 | | 1.611 |
| | | GM | 4.12 | 3.88 | 0.24 | 0.12 | 0.16 | 0.72 | 1.123 | 5.14 | 4.98 | 0.16 | 0.04 | 0.12 | 0.84 | 1.775 |
| | | | 4.00 | 3.92 | 0.08 | 0.08 | 0.04 | | | 5.12 | 5.00 | 0.12 | 0.00 | 0.08 | | |
| | | | 3.84 | 3.80 | 0.04 | 0.20 | 0.04 | 0.76 | 1.131 | 5.12 | 4.92 | 0.20 | 0.08 | 0.16 | 0.76 | 1.914 |
| | (150, 150) | Lasso | 4.20 | 3.92 | 0.04 | 0.16 | 0.12 | 0.72 | 1.245 | 4.48 | 4.28 | 0.20 | 0.40 | 0.16 | 0.44 | 4.794 |
| | | SpAM | 4.00 | 4.00 | 0.00 | 0.00 | 0.00 | | | 5.00 | 5.00 | 0.00 | 0.00 | 0.00 | | 1.612 |
| | | GM | 3.96 | 3.92 | 0.04 | 0.08 | 0.04 | 0.88 | 1.011 | 5.04 | 5.00 | 0.04 | 0.00 | 0.04 | 0.96 | 1.627 |
| | | | 3.96 | 3.96 | 0.00 | 0.04 | 0.00 | 0.96 | 0.899 | 5.00 | 5.00 | 0.00 | 0.00 | 0.00 | | |
| | | | 3.96 | 3.80 | 0.04 | 0.08 | 0.04 | 0.88 | 1.083 | 5.02 | 5.00 | 0.02 | 0.00 | 0.03 | 0.97 | 1.622 |
| | (200, 150) | Lasso | 4.00 | 3.92 | 0.08 | 0.20 | 0.00 | 0.80 | 1.252 | 4.52 | 4.52 | 0.00 | 0.40 | 0.00 | 0.60 | 2.507 |
| | | SpAM | 4.00 | 4.00 | 0.00 | 0.00 | 0.00 | | 0.829 | 5.04 | 5.00 | 0.04 | 0.00 | 0.02 | 0.98 | 1.497 |
| | | GM | 3.96 | 3.96 | 0.00 | 0.04 | 0.00 | 0.96 | 1.012 | 5.06 | 5.00 | 0.06 | 0.00 | 0.04 | 0.96 | 1.528 |
| | | | 4.00 | 4.00 | 0.00 | 0.00 | 0.00 | | | 5.00 | 5.00 | 0.00 | 0.00 | 0.00 | | 1.485 |
| | | | 3.96 | 3.96 | 0.00 | 0.04 | 0.00 | 0.96 | 0.915 | 5.00 | 5.00 | 0.00 | 0.00 | 0.00 | | |
| Chi-square | (100, 150) | Lasso | 3.72 | 3.64 | 0.08 | 0.36 | 0.04 | 0.60 | 5.854 | 4.32 | 3.72 | 0.60 | 0.68 | 0.12 | 0.20 | 6.888 |
| | | SpAM | 4.24 | 3.76 | 0.48 | 0.24 | 0.28 | 0.48 | 4.342 | 5.52 | 5.00 | 0.52 | 0.00 | 0.40 | 0.60 | 4.328 |
| | | GM | 4.24 | 3.80 | 0.44 | 0.20 | 0.36 | 0.44 | 4.012 | 4.48 | 4.44 | 0.04 | 0.30 | 0.05 | 0.65 | 3.611 |
| | | | 4.16 | 3.96 | 0.20 | 0.04 | 0.20 | | 2.937 | 5.08 | 4.92 | 0.16 | 0.06 | 0.16 | | |
| | | | 4.18 | 3.90 | 0.28 | 0.16 | 0.12 | 0.72 | | 5.08 | 4.84 | 0.24 | 0.08 | 0.18 | 0.74 | 3.968 |
| | (150, 150) | Lasso | 5.32 | 3.84 | 1.48 | 0.16 | 0.48 | 0.36 | 5.392 | 5.16 | 4.16 | 1.00 | 0.60 | 0.16 | 0.24 | 4.503 |
| | | SpAM | 4.04 | 3.96 | 0.08 | 0.04 | 0.08 | | 2.765 | 5.32 | 5.00 | 0.32 | 0.00 | 0.24 | 0.76 | 3.748 |
| | | GM | 4.00 | 3.88 | 0.12 | 0.12 | 0.08 | 0.80 | 2.873 | 4.98 | 4.92 | 0.06 | 0.05 | 0.05 | 0.90 | 4.173 |
| | | | 3.96 | 3.92 | 0.04 | 0.08 | 0.04 | | 2.809 | 5.02 | 5.00 | 0.02 | 0.00 | 0.02 | | |
| | | | 4.08 | 3.96 | 0.12 | 0.04 | 0.12 | 0.84 | | 5.04 | 5.00 | 0.04 | 0.00 | 0.04 | 0.96 | 3.519 |
| | (200, 150) | Lasso | 4.24 | 4.00 | 0.24 | 0.00 | 0.28 | 0.72 | 5.805 | 4.36 | 4.32 | 0.04 | 0.52 | 0.04 | 0.44 | 3.754 |
| | | SpAM | 4.08 | 4.00 | 0.08 | 0.00 | 0.08 | 0.92 | 2.463 | 5.04 | 5.00 | 0.04 | 0.00 | 0.04 | 0.96 | 3.634 |
| | | GM | 4.04 | 4.00 | 0.04 | 0.00 | 0.04 | | 2.523 | 5.18 | 5.00 | 0.18 | 0.00 | 0.20 | 0.80 | 3.816 |
| | | | 3.96 | 3.96 | 0.00 | 0.04 | 0.00 | | 2.449 | 5.00 | 5.00 | 0.00 | 0.00 | 0.00 | | |
| | | | 3.96 | 3.96 | 0.00 | 0.04 | 0.00 | | | 5.00 | 5.00 | 0.00 | 0.00 | 0.00 | | 3.457 |
| Exponential | (100, 150) | Lasso | 3.46 | 3.46 | 0.00 | 0.60 | 0.00 | 0.40 | 4.631 | 4.64 | 4.00 | 0.64 | 0.60 | 0.20 | 0.20 | 4.567 |
| | | SpAM | 4.28 | 3.88 | 0.40 | 0.12 | 0.28 | 0.60 | 4.599 | 5.84 | 5.00 | 0.84 | 0.00 | 0.44 | 0.56 | 4.224 |
| | | GM | 4.20 | 3.64 | 0.56 | 0.36 | 0.36 | 0.28 | 3.941 | 5.36 | 4.68 | 0.68 | 0.32 | 0.28 | 0.40 | 4.528 |
| | | | 4.20 | 3.88 | 0.32 | 0.12 | 0.20 | | 3.274 | 5.06 | 4.82 | 0.24 | 0.14 | 0.14 | 0.72 | 3.907 |
| | | | 3.96 | 3.80 | 0.16 | 0.20 | 0.16 | 0.64 | | 5.12 | 4.92 | 0.20 | 0.04 | 0.16 | | |
| | (150, 150) | Lasso | 5.30 | 3.66 | 1.64 | 0.20 | 0.44 | 0.36 | 4.747 | 5.64 | 4.16 | 1.48 | 0.48 | 0.28 | 0.24 | 4.786 |
| | | SpAM | 4.04 | 3.96 | 0.08 | 0.04 | 0.08 | 0.88 | 3.403 | 5.28 | 5.00 | 0.28 | 0.16 | 0.00 | 0.84 | 4.969 |
| | | GM | 4.08 | 3.96 | 0.12 | 0.04 | 0.12 | 0.84 | 3.177 | 4.98 | 4.92 | 0.06 | 0.08 | 0.04 | 0.88 | 4.129 |
| | | | 4.00 | 4.00 | 0.00 | 0.00 | 0.00 | | 2.724 | 5.02 | 4.98 | 0.04 | 0.02 | 0.04 | | |
| | | | 4.00 | 4.00 | 0.00 | 0.00 | 0.00 | | | 5.00 | 4.96 | 0.04 | 0.04 | 0.04 | 0.92 | 3.918 |
| | (200, 150) | Lasso | 3.80 | 3.80 | 0.00 | 0.20 | 0.00 | 0.80 | 4.291 | 4.68 | 4.60 | 0.08 | 0.28 | 0.08 | 0.64 | 3.669 |
| | | SpAM | 4.00 | 4.00 | 0.00 | 0.00 | 0.00 | | 2.988 | 5.24 | 5.00 | 0.24 | 0.00 | 0.20 | 0.80 | 4.808 |
| | | GM | 3.96 | 3.96 | 0.00 | 0.04 | 0.00 | 0.96 | 3.016 | 4.98 | 4.98 | 0.00 | 0.04 | 0.00 | 0.96 | 3.878 |
| | | | 4.00 | 4.00 | 0.00 | 0.00 | 0.00 | | 2.884 | 5.00 | 5.00 | 0.00 | 0.00 | 0.00 | | |
| | | | 3.96 | 3.92 | 0.04 | 0.09 | 0.00 | 0.91 | 3.113 | 4.96 | 4.96 | 0.00 | 0.04 | 0.00 | 0.96 | 3.771 |
| Student | (100, 150) | Lasso | 4.92 | 3.80 | 1.12 | 0.28 | 0.32 | 0.40 | 2.301 | 6.52 | 3.92 | 2.60 | 0.64 | 0.20 | 0.16 | 6.971 |
| | | SpAM | 4.90 | 3.80 | 1.10 | 0.24 | 0.20 | 0.56 | 1.698 | 7.92 | 4.72 | 3.20 | 0.24 | 0.44 | 0.32 | 4.658 |
| | | GM | 5.00 | 3.64 | 1.36 | 0.32 | 0.32 | 0.36 | 1.551 | 5.68 | 4.32 | 1.32 | 0.40 | 0.32 | 0.28 | 3.561 |
| | | | 4.14 | 3.94 | 0.20 | 0.05 | 0.10 | | | 5.00 | 4.84 | 0.16 | 0.08 | 0.16 | | |
| | | | 4.14 | 3.88 | 0.26 | 0.12 | 0.16 | 0.72 | 1.208 | 4.96 | 4.80 | 0.16 | 0.16 | 0.12 | 0.72 | 2.339 |
| | (150, 150) | Lasso | 5.08 | 3.72 | 1.36 | 0.24 | 0.40 | 0.36 | 1.793 | 6.32 | 3.80 | 2.52 | 0.68 | 0.20 | 0.12 | 6.020 |
| | | SpAM | 4.30 | 4.00 | 0.30 | 0.00 | 0.32 | 0.68 | 0.955 | 5.44 | 5.00 | 0.44 | 0.00 | 0.28 | 0.72 | 2.739 |
| | | GM | 4.04 | 3.80 | 0.24 | 0.16 | 0.16 | 0.68 | 1.046 | 5.56 | 4.60 | 0.96 | 0.28 | 0.08 | 0.64 | 2.557 |
| | | | 4.00 | 4.00 | 0.00 | 0.00 | 0.00 | | | 4.98 | 4.98 | 0.00 | 0.08 | 0.00 | 0.92 | |
| | | | 3.92 | 3.88 | 0.04 | 0.12 | 0.04 | 0.84 | 1.169 | 4.96 | 4.96 | 0.00 | 0.04 | 0.00 | | 1.723 |
| | (200, 150) | Lasso | 5.00 | 3.92 | 1.08 | 0.32 | 0.20 | 0.48 | 1.262 | 5.44 | 4.36 | 1.08 | 0.44 | 0.28 | 0.28 | 2.976 |
| | | SpAM | 4.10 | 4.00 | 0.10 | 0.00 | 0.27 | 0.73 | 1.060 | 5.64 | 5.00 | 0.64 | 0.00 | 0.28 | 0.72 | 2.427 |
| | | GM | 4.00 | 3.96 | 0.04 | 0.04 | 0.04 | 0.92 | 1.011 | 5.20 | 4.72 | 0.48 | 0.20 | 0.04 | 0.76 | 2.350 |
| | | | 4.04 | 4.00 | 0.04 | 0.00 | 0.10 | 0.90 | | 5.00 | 5.00 | 0.00 | 0.00 | 0.00 | | |
| | | | 4.04 | 4.00 | 0.04 | 0.00 | 0.04 | | 0.884 | 4.96 | 4.96 | 0.00 | 0.04 | 0.00 | 0.96 | 1.672 |
The averaged performance with simulated data in Example 1.
| Noise | Setting | Method | SIZE | TP | FP | Up | Op | Cp | ASE |
|---|---|---|---|---|---|---|---|---|---|
| Gaussian | (300, 500) | Lasso | 1.98 | 1.98 | 0.00 | 1.00 | 0.00 | 0.00 | 1.98 |
| | | GM | 4.04 | 4.00 | 0.04 | 0.00 | 0.04 | | 0.80 |
| | | | 4.06 | 4.00 | 0.06 | 0.00 | 0.06 | 0.94 | |
| | | | 4.14 | 3.98 | 0.16 | 0.01 | 0.03 | | 0.88 |
| | (500, 500) | Lasso | 1.92 | 1.92 | 0.00 | 1.00 | 0.00 | 0.00 | 1.35 |
| | | GM | 4.06 | 4.00 | 0.06 | 0.00 | 0.06 | 0.94 | 0.78 |
| | | | 4.02 | 4.00 | 0.02 | 0.00 | 0.02 | | |
| | | | 4.04 | 4.00 | 0.04 | 0.00 | 0.02 | | 0.74 |
| | (700, 500) | Lasso | 1.88 | 1.88 | 0.00 | 1.00 | 0.00 | 0.00 | 1.55 |
| | | GM | 4.04 | 4.00 | 0.04 | 0.00 | 0.04 | 0.96 | 0.77 |
| | | | 4.02 | 4.00 | 0.02 | 0.00 | 0.02 | 0.98 | |
| | | | 4.00 | 4.00 | 0.00 | 0.00 | 0.00 | | 0.73 |
| Chi-square | (300, 500) | Lasso | 1.80 | 1.80 | 0.00 | 1.00 | 0.00 | 0.00 | 4.45 |
| | | GM | 4.18 | 4.00 | 0.18 | 0.00 | 0.14 | 0.86 | 2.92 |
| | | | 4.09 | 4.00 | 0.09 | 0.00 | 0.11 | | 2.39 |
| | | | 4.06 | 3.88 | 0.18 | 0.12 | 0.14 | 0.74 | |
| | (500, 500) | Lasso | 1.74 | 1.74 | 0.00 | 1.00 | 0.00 | 0.00 | 4.62 |
| | | GM | 4.14 | 4.00 | 0.14 | 0.00 | 0.14 | 0.86 | 3.01 |
| | | | 4.08 | 4.00 | 0.08 | 0.00 | 0.06 | | 2.22 |
| | | | 4.04 | 3.98 | 0.06 | 0.02 | 0.06 | 0.92 | |
| | (700, 500) | Lasso | 1.86 | 1.86 | 0.00 | 1.00 | 0.00 | 0.00 | 4.37 |
| | | GM | 4.28 | 4.00 | 0.28 | 0.00 | 0.24 | 0.76 | 2.96 |
| | | | 4.02 | 4.00 | 0.02 | 0.00 | 0.02 | | 2.13 |
| | | | 4.02 | 4.00 | 0.02 | 0.00 | 0.02 | | |
| Exponential | (300, 500) | Lasso | 2.04 | 2.04 | 0.00 | 1.00 | 0.00 | 0.00 | 4.25 |
| | | GM | 3.94 | 3.87 | 0.07 | 0.13 | 0.05 | 0.82 | 3.14 |
| | | | 4.02 | 4.00 | 0.02 | 0.00 | 0.02 | | 2.36 |
| | | | 3.98 | 3.94 | 0.04 | 0.06 | 0.02 | 0.92 | |
| | (500, 500) | Lasso | 1.94 | 1.94 | 0.00 | 1.00 | 0.00 | 0.00 | 4.34 |
| | | GM | 4.12 | 4.00 | 0.12 | 0.00 | 0.10 | 0.90 | 2.35 |
| | | | 3.99 | 3.96 | 0.03 | 0.04 | 0.03 | 0.93 | 2.37 |
| | | | 4.02 | 4.00 | 0.02 | 0.00 | 0.02 | | |
| | (700, 500) | Lasso | 1.90 | 1.90 | 0.00 | 1.00 | 0.00 | 0.00 | 4.67 |
| | | GM | 4.08 | 4.00 | 0.08 | 0.00 | 0.06 | 0.94 | 2.33 |
| | | | 3.99 | 3.99 | 0.00 | 0.01 | 0.00 | | |
| | | | 4.05 | 4.00 | 0.05 | 0.00 | 0.05 | 0.95 | 1.92 |
| Student | (300, 500) | Lasso | 1.96 | 1.96 | 0.00 | 1.00 | 0.00 | 0.00 | 4.63 |
| | | GM | 3.50 | 3.46 | 0.04 | 0.24 | 0.04 | 0.72 | 2.48 |
| | | | 4.14 | 3.94 | 0.20 | 0.06 | 0.10 | 0.84 | |
| | | | 4.00 | 3.98 | 0.02 | 0.02 | 0.00 | | 0.90 |
| | (500, 500) | Lasso | 1.76 | 1.76 | 0.00 | 0.98 | 0.00 | 0.02 | 3.83 |
| | | GM | 4.30 | 4.00 | 0.30 | 0.00 | 0.16 | 0.84 | 1.96 |
| | | | 4.00 | 4.00 | 0.00 | 0.00 | 0.00 | | 0.76 |
| | | | 4.02 | 4.00 | 0.02 | 0.00 | 0.01 | 0.99 | |
| | (700, 500) | Lasso | 1.96 | 1.96 | 0.00 | 0.96 | 0.00 | 0.04 | 2.46 |
| | | GM | 4.06 | 4.00 | 0.06 | 0.00 | 0.04 | 0.96 | 1.95 |
| | | | 4.04 | 4.00 | 0.04 | 0.00 | 0.06 | 0.94 | |
| | | | 4.00 | 4.00 | 0.00 | 0.00 | 0.00 | | 0.74 |
Figure 1. The average squared error (ASE) vs. the sample size n under different noise distributions (A and B represent Example 1 and Example 2, respectively).
Figure 2. The correct-fitting probability (Cp) vs. the sample size n under different noise distributions (A and B represent Example 1 and Example 2, respectively).
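The selection metrics reported in the simulation tables can be computed from the variable sets chosen over repeated runs. This excerpt only names Cp as the correct-fitting probability, so the remaining definitions in the sketch below are assumptions based on a common convention: SIZE is the average number of selected variables, TP and FP are the average numbers of selected informative and uninformative variables, and Up and Op are the fractions of runs that miss an informative variable or select all of them plus extras. The function name `selection_metrics` and the toy runs are hypothetical.

```python
def selection_metrics(selected_sets, true_set):
    """Average variable-selection metrics over repeated runs.

    selected_sets: iterable of collections of selected variable indices, one per run.
    true_set: indices of the truly informative variables.
    Definitions are assumed (see the lead-in); they are not quoted from the paper.
    """
    true_set = set(true_set)
    runs = [set(s) for s in selected_sets]
    n_runs = len(runs)
    totals = {"SIZE": 0, "TP": 0, "FP": 0, "Up": 0, "Op": 0, "Cp": 0}
    for chosen in runs:
        hits = len(chosen & true_set)    # informative variables found
        extras = len(chosen - true_set)  # uninformative variables included
        totals["SIZE"] += len(chosen)
        totals["TP"] += hits
        totals["FP"] += extras
        if hits < len(true_set):
            totals["Up"] += 1            # under-fitting: missed at least one
        elif extras > 0:
            totals["Op"] += 1            # over-fitting: all found, plus extras
        else:
            totals["Cp"] += 1            # correct fitting: exact recovery
    return {name: value / n_runs for name, value in totals.items()}

# Toy example (made up): four runs with informative variables 0..3.
runs = [{0, 1, 2, 3}, {0, 1, 2}, {0, 1, 2, 3, 7}, {0, 1, 2, 3}]
metrics = selection_metrics(runs, true_set={0, 1, 2, 3})
print(metrics)
```

Under these definitions every run falls into exactly one of the three fitting categories, so Up, Op, and Cp sum to one, which matches most complete rows of the tables.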
Learning performance on Auto-Mpg.
| Method | CyL | DISP | HPOWER | WEIG | ACCELER | YEAR | ORIGN | RSSE(std) |
|---|---|---|---|---|---|---|---|---|
| Lasso | - | - | - | ✔ | - | ✔ | - | 0.5918(0.3762) |
| SpAM | ✔ | ✔ | - | ✔ | - | - | - | 0.2754(0.0191) |
| GM | ✔ | ✔ | ✔ | ✔ | - | ✔ | ✔ | 0.2547(0.0313) |
| RGVS | ✔ | ✔ | ✔ | ✔ | - | ✔ | - | 0.1425(0.0277) |
| RGVS | ✔ | ✔ | ✔ | ✔ | - | ✔ | ✔ | |
Learning performance on Heating Load and Cooling Load.
| Target | Method | RC | SA | WA | RA | OH | ORIENT | GA | GAD | RSSE(std) |
|---|---|---|---|---|---|---|---|---|---|---|
| Heating Load | Lasso | - | - | - | - | ✔ | - | ✔ | - | 0.1739(0.0801) |
| Heating Load | SpAM | - | - | - | ✔ | ✔ | - | - | - | 0.1684( |
| Heating Load | GM | ✔ | ✔ | ✔ | ✔ | ✔ | - | - | - | 0.1244(0.0383) |
| Heating Load | RGVS | - | ✔ | ✔ | ✔ | ✔ | - | ✔ | - | |
| Heating Load | RGVS | - | - | ✔ | ✔ | ✔ | - | ✔ | - | 0.1110(0.0066) |
| Cooling Load | Lasso | - | - | ✔ | - | ✔ | - | ✔ | - | 0.2119(0.0926) |
| Cooling Load | SpAM | - | - | - | ✔ | ✔ | - | - | - | 0.1910(0.0131) |
| Cooling Load | GM | ✔ | ✔ | ✔ | ✔ | ✔ | - | - | - | 0.1515(0.0120) |
| Cooling Load | RGVS | - | ✔ | ✔ | ✔ | ✔ | - | ✔ | - | |
| Cooling Load | RGVS | ✔ | ✔ | ✔ | ✔ | ✔ | - | - | - | 0.1368( |
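The real-data tables report an RSSE value with a standard deviation in parentheses, but this excerpt does not define the acronym. The sketch below assumes it stands for the relative sum of squared errors, that is, the residual sum of squares on test data divided by the sum of squares around the test mean, and that the reported pair is the mean and standard deviation over repeated train/test splits. Both points are assumptions, and the function name `rsse` and the numbers are illustrative only.

```python
import numpy as np

def rsse(y_true, y_pred):
    """Relative sum of squared errors (assumed definition, see lead-in).

    Returns sum((y - f(x))^2) / sum((y - mean(y))^2), so 0 means a perfect fit
    and 1 matches the error of always predicting the test mean.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = np.sum((y_true - y_pred) ** 2)
    baseline = np.sum((y_true - y_true.mean()) ** 2)
    return residual / baseline

# Toy check with made-up numbers.
y_test = np.array([1.0, 2.0, 3.0, 4.0])
perfect = rsse(y_test, y_test)               # predictions equal the targets
mean_only = rsse(y_test, np.full(4, 2.5))    # constant prediction at the mean

# Summarising several hypothetical splits as "mean(std)", like the table cells.
split_scores = np.array([0.12, 0.15, 0.14])
summary = f"{split_scores.mean():.4f}({split_scores.std():.4f})"
print(perfect, mean_only, summary)
```

A value well below 1 indicates that the fitted model explains most of the variation that a constant predictor leaves unexplained.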