John D Rice, Jeremy M G Taylor.
Abstract
One common use of binary response regression methods is classification based on an arbitrary probability threshold dictated by the particular application. Since this threshold is given to us a priori, it is sensible to incorporate it into our estimation procedure. Specifically, for the linear logistic model, we solve a set of locally weighted score equations, using a kernel-like weight function centered at the threshold. The bandwidth for the weight function is selected by cross-validation of a novel hybrid loss function that combines classification error and a continuous measure of divergence between observed and fitted values; other possible cross-validation functions based on more common binary classification metrics are also examined. This work has much in common with robust estimation, but differs from previous approaches in this area in its focus on prediction, specifically classification into high- and low-risk groups. Simulation results are given showing the reduction in error rates that can be obtained with this method when compared with maximum likelihood estimation, especially under certain forms of model misspecification. Analysis of a melanoma data set is presented to illustrate the use of the method in practice.
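The locally weighted score equations described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the weight function, so the Gaussian kernel on the fitted-probability scale, the alternating reweighting scheme, and the names `threshold_weighted_fit`, `weighted_logistic_fit`, `c` (threshold), and `h` (bandwidth) are all assumptions made for illustration. The sketch solves sum_i w_i (y_i - p_i) x_i = 0, with w_i depending on how close the current fitted probability p_i is to the threshold.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def weighted_logistic_fit(X, y, w, beta0, n_iter=25, ridge=1e-8):
    """Solve sum_i w_i (y_i - p_i) x_i = 0 by Newton's method, weights held fixed."""
    beta = np.array(beta0, dtype=float)
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        score = X.T @ (w * (y - p))
        # Weighted information matrix; the tiny ridge only stabilizes the solve.
        info = (X * (w * p * (1.0 - p))[:, None]).T @ X
        info += ridge * np.eye(X.shape[1])
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta


def threshold_weighted_fit(X, y, c=0.5, h=0.2, n_outer=50, tol=1e-6):
    """Alternate between updating kernel weights and refitting (illustrative only).

    ASSUMPTION: Gaussian kernel weights exp(-0.5 * ((p - c) / h) ** 2) on the
    fitted-probability scale; the paper's actual weight function may differ.
    Convergence of this alternating scheme is not guaranteed in general.
    """
    n, d = X.shape
    # Start from the ordinary maximum likelihood estimate (all weights equal to 1).
    beta = weighted_logistic_fit(X, y, np.ones(n), np.zeros(d))
    for _ in range(n_outer):
        p = sigmoid(X @ beta)
        w = np.exp(-0.5 * ((p - c) / h) ** 2)  # largest where p is near the threshold
        new_beta = weighted_logistic_fit(X, y, w, beta)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta
```

Under these assumptions, a very large bandwidth `h` makes all weights nearly equal, so the estimate reduces to ordinary maximum likelihood, while a small `h` concentrates the fit on observations whose fitted probabilities lie near `c`. Bandwidth selection by cross-validation of the hybrid loss is not sketched here because the abstract does not define that loss.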
Keywords: asymmetric loss; binary classification; local likelihood; logistic regression; robust estimation
Year: 2016 PMID: 28018492 PMCID: PMC5173294 DOI: 10.1007/s12561-016-9147-y
Source DB: PubMed Journal: Stat Biosci ISSN: 1867-1764