David W Scott, Zhipeng Wang.
Abstract
As modern data analysis pushes the boundaries of classical statistics, it is timely to reexamine alternate approaches to dealing with outliers in multiple regression. As sample sizes and the number of predictors increase, interactive methodology becomes less effective. Likewise, with limited understanding of the underlying contamination process, diagnostics are likely to fail as well. In this article, we advocate for a non-likelihood procedure that attempts to quantify the fraction of bad data as a part of the estimation step. These ideas also allow for the selection of important predictors under some assumptions. As there are many robust algorithms available, running several and looking for interesting differences is a sensible strategy for understanding the nature of the outliers.
Keywords: influence functions; maximum likelihood estimation; minimum distance estimation
Year: 2021; PMID: 33435467; PMCID: PMC7826993; DOI: 10.3390/e23010088
Source DB: PubMed; Journal: Entropy (Basel); ISSN: 1099-4300; Impact factor: 2.524