Baobin Wang, Ting Hu.
Abstract
The minimum error entropy (MEE) principle is an alternative to classical least squares, valued for its robustness to non-Gaussian noise. This paper studies the gradient descent algorithm for MEE with a semi-supervised approach and a distributed method, and shows that the additional information in unlabeled data can enhance the learning ability of the distributed MEE algorithm. Our result proves that the mean squared error of the distributed gradient descent MEE algorithm can be minimax optimal for regression if the number of local machines increases polynomially with the total data size.
Keywords: MEE algorithm; distributed method; gradient descent; information theoretical learning; reproducing kernel Hilbert spaces; semi-supervised approach
Year: 2018 PMID: 33266692 PMCID: PMC7512566 DOI: 10.3390/e20120968
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
List of notations used throughout the paper.
| Notation | Meaning of the Notation |
|---|---|
| | the explanatory variable |
| | the response variable |
| | |
| | |
| | a Borel measure on |
| | the marginal probability measure of |
| | the conditional probability measure of |
| | the mean regression function |
| | the target function of MEE induced by |
| | a reproducing kernel on |
| | the labeled data set |
| | the size of the labeled data set |
| | the largest integer not exceeding |
| | the cardinality of |
| | the unlabeled data set |
| | the size of the unlabeled data set |
| | the cardinality of |
| | the training data set used in the distributed MEE algorithm, consisting of |
| | the cardinality of |
| | the number of local machines |
| | the |
| | the loss function of the MEE algorithm |
| | the integral operator associated with |
| | the empirical operator of |
| | the function output by the kernel gradient descent MEE algorithm with data |
| | the function output by the kernel gradient descent MEE algorithm with data |
| | the global output averaging over local outputs |
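
The objects listed above (labeled and unlabeled data sets, local machines, local outputs, and the averaged global output) can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. The Gaussian kernel, the window function G(u) = 1 - exp(-u), the step size, the scaling parameter `h`, and all function names are hypothetical choices. The pseudo-labeling of unlabeled points (label 0 for unlabeled inputs, labeled responses rescaled by the ratio of local training size to local labeled size) is one common construction in distributed semi-supervised kernel regression and is assumed here rather than taken from the text above.

```python
import numpy as np


def gaussian_kernel(A, B, sigma=0.5):
    """Gram matrix K[i, j] = exp(-(a_i - b_j)^2 / (2 sigma^2)) for 1-D inputs."""
    diff = A[:, None] - B[None, :]
    return np.exp(-diff**2 / (2.0 * sigma**2))


def mee_risk(pred, y, h=2.0):
    """Pairwise MEE-type empirical risk; the window G(u) = 1 - exp(-u) is an assumption."""
    e = y - pred                      # errors e_i = y_i - f(x_i)
    d = e[:, None] - e[None, :]       # pairwise error differences e_i - e_j
    return h**2 * np.mean(1.0 - np.exp(-d**2 / (2.0 * h**2)))


def mee_kernel_gd(X, y, steps=100, eta=0.5, h=2.0, sigma=0.5):
    """Gradient descent on mee_risk over f = sum_k alpha[k] * K(X[k], .), from f = 0."""
    n = len(X)
    K = gaussian_kernel(X, X, sigma)
    alpha = np.zeros(n)
    for _ in range(steps):
        e = y - K @ alpha
        d = e[:, None] - e[None, :]
        w = np.exp(-d**2 / (2.0 * h**2))   # down-weights large error differences
        # Negative RKHS gradient of the risk, expressed in the coefficients alpha.
        alpha += (2.0 * eta / n**2) * (w * d).sum(axis=1)
    return alpha


def distributed_semi_supervised_mee(X_lab, y_lab, X_unl, X_test, m,
                                    steps=100, eta=0.5, h=2.0, sigma=0.5):
    """Average the local MEE estimators trained on m disjoint subsets (needs m <= len(X_lab))."""
    lab_parts = np.array_split(np.arange(len(X_lab)), m)
    unl_parts = np.array_split(np.arange(len(X_unl)), m)
    local_preds = []
    for li, ui in zip(lab_parts, unl_parts):
        # Assumed pseudo-labeling: rescale labeled responses, give unlabeled inputs label 0.
        scale = (len(li) + len(ui)) / len(li)
        Xj = np.concatenate([X_lab[li], X_unl[ui]])
        yj = np.concatenate([scale * y_lab[li], np.zeros(len(ui))])
        alpha = mee_kernel_gd(Xj, yj, steps=steps, eta=eta, h=h, sigma=sigma)
        local_preds.append(gaussian_kernel(X_test, Xj, sigma) @ alpha)
    return np.mean(local_preds, axis=0)   # global output: average of local outputs


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X_lab = rng.uniform(0.0, 1.0, 60)
y_lab = np.sin(2 * np.pi * X_lab) + 0.1 * rng.standard_normal(60)
X_unl = rng.uniform(0.0, 1.0, 120)
X_test = np.linspace(0.0, 1.0, 5)
f_bar = distributed_semi_supervised_mee(X_lab, y_lab, X_unl, X_test, m=4)
print(f_bar)
```

For this particular loss and a Gaussian kernel bounded by 1, the step size `eta = 0.5` is small enough that an iteration does not increase the empirical risk. Other kernels or window functions would need their own step-size choice.
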
Figure 1. The mean squared errors for the size of unlabeled data as the number of local machines m varies.