Konrad Furmańczyk, Wojciech Rejchel.
Abstract
In this paper, we consider prediction and variable selection in misspecified binary classification models in the high-dimensional setting. We focus on two approaches to classification that are computationally efficient but lead to model misspecification. The first is to apply penalized logistic regression to classification data that possibly do not follow the logistic model. The second is even more radical: we simply treat the class labels of objects as if they were numbers and apply penalized linear regression. We investigate these two approaches thoroughly and provide conditions that guarantee they are successful in prediction and variable selection. Our results hold even if the number of predictors is much larger than the sample size. The paper concludes with experimental results.
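The second approach described in the abstract (treating class labels as numbers and applying penalized linear regression) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the L1 (Lasso) penalty, the coordinate-descent solver, the 0/1 label coding with a 0.5 decision threshold, the simulated probit-type data, and the hypothetical helper names `soft_threshold`, `lasso_cd`, and `simulate`. It covers only the linear-regression approach, not penalized logistic regression.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - b0 - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    b0 = sum(y) / n
    beta = [0.0] * p
    resid = [yi - b0 for yi in y]  # residuals at beta = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (coordinate j excluded).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new_bj
        # The intercept is unpenalized: re-center the residuals.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta


def simulate(n, p, seed=0):
    """Assumed toy data: labels follow a probit-type model, so the linear model is misspecified."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # Only the first two predictors carry signal in this illustration.
    y = [1.0 if 2.0 * row[0] - 2.0 * row[1] + rng.gauss(0.0, 1.0) > 0 else 0.0
         for row in X]
    return X, y


X, y = simulate(n=80, p=30)
b0, beta = lasso_cd(X, y, lam=0.05, n_iter=50)  # lam chosen arbitrarily

# Variable selection: predictors with a nonzero coefficient.
selected = [j for j, b in enumerate(beta) if b != 0.0]

# Prediction: threshold the fitted linear score at 0.5 (assumed rule for 0/1 labels).
pred = [1.0 if b0 + sum(b * x for b, x in zip(beta, row)) > 0.5 else 0.0 for row in X]
train_error = sum(pi != yi for pi, yi in zip(pred, y)) / len(y)

print("selected predictors:", selected)
print("training misclassification rate:", train_error)
```

In practice the penalty level would be tuned, for example by cross-validation, rather than fixed in advance as it is here.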
Keywords: misclassification risk; model misspecification; penalized estimation; supervised classification; variable selection consistency
Year: 2020 PMID: 33286314 PMCID: PMC7517038 DOI: 10.3390/e22050543
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524