
Benign overfitting in linear regression.

Peter L Bartlett, Philip M Long, Gábor Lugosi, Alexander Tsigler.

Abstract

The phenomenon of benign overfitting is one of the key mysteries uncovered by deep learning methodology: deep neural networks seem to predict well, even with a perfect fit to noisy training data. Motivated by this phenomenon, we consider when a perfect fit to training data in linear regression is compatible with accurate prediction. We give a characterization of linear regression problems for which the minimum norm interpolating prediction rule has near-optimal prediction accuracy. The characterization is in terms of two notions of the effective rank of the data covariance. It shows that overparameterization is essential for benign overfitting in this setting: the number of directions in parameter space that are unimportant for prediction must significantly exceed the sample size. By studying examples of data covariance properties that this characterization shows are required for benign overfitting, we find an important role for finite-dimensional data: the accuracy of the minimum norm interpolating prediction rule approaches the best possible accuracy for a much narrower range of properties of the data distribution when the data lie in an infinite-dimensional space vs. when the data lie in a finite-dimensional space with dimension that grows faster than the sample size.
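The two objects named in the abstract, the minimum norm interpolating rule and the effective ranks of the data covariance, can be illustrated with a short NumPy sketch. The abstract does not spell out the effective ranks; the definitions used here, r_k = (sum of eigenvalues after the k-th) / (the (k+1)-th eigenvalue) and R_k = (sum of eigenvalues after the k-th)^2 / (sum of their squares), are the ones given in the full paper. The function names, the 1/i spectrum, the sample sizes, and the noise level are all assumptions chosen only for illustration.

```python
import numpy as np

def min_norm_interpolator(X, y):
    """Minimum l2-norm theta with X @ theta = y (assumes rank(X) = n <= p)."""
    return np.linalg.pinv(X) @ y

def effective_ranks(eigvals, k):
    """Return (r_k, R_k) for covariance eigenvalues sorted in decreasing order."""
    tail = np.asarray(eigvals, dtype=float)[k:]   # eigenvalues lambda_{k+1}, lambda_{k+2}, ...
    s = tail.sum()
    r_k = s / tail[0]                 # tail sum relative to its largest eigenvalue
    R_k = s**2 / np.sum(tail**2)      # squared tail sum relative to sum of squares
    return r_k, R_k

# Hypothetical overparameterized problem: p = 200 features, n = 20 samples.
rng = np.random.default_rng(0)
n, p = 20, 200
eigvals = 1.0 / np.arange(1, p + 1)                   # assumed covariance spectrum
X = rng.standard_normal((n, p)) * np.sqrt(eigvals)    # rows have covariance diag(eigvals)
theta_star = np.zeros(p)
theta_star[0] = 1.0                                   # assumed true parameter
y = X @ theta_star + 0.1 * rng.standard_normal(n)     # noisy labels

theta_hat = min_norm_interpolator(X, y)
train_residual = np.max(np.abs(X @ theta_hat - y))    # ~0: the noisy data are fit exactly
r0, R0 = effective_ranks(eigvals, 0)
```

Whether such a perfect fit also predicts well is what the characterization in the paper addresses; this sketch only computes the quantities involved and makes no claim about the resulting excess risk.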

Keywords:  interpolation; linear regression; overfitting; statistical learning theory

Year:  2020        PMID: 32332161     DOI: 10.1073/pnas.1907378117

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205



1.  The science of deep learning.

Authors:  Richard Baraniuk; David Donoho; Matan Gavish
Journal:  Proc Natl Acad Sci U S A       Date:  2020-11-23       Impact factor: 11.205

2.  Surprises in high-dimensional ridgeless least squares interpolation.

Authors:  Trevor Hastie; Andrea Montanari; Saharon Rosset; Ryan J Tibshirani
Journal:  Ann Stat       Date:  2022-04-07       Impact factor: 4.904

3.  A smart, practical, deep learning-based clinical decision support tool for patients in the prostate-specific antigen gray zone: model development and validation.

Authors:  Sang Hun Song; Hwanik Kim; Jung Kwon Kim; Hakmin Lee; Jong Jin Oh; Sang-Chul Lee; Seong Jin Jeong; Sung Kyu Hong; Junghoon Lee; Sangjun Yoo; Min-Soo Choo; Min Chul Cho; Hwancheol Son; Hyeon Jeong; Jungyo Suh; Seok-Soo Byun
Journal:  J Am Med Inform Assoc       Date:  2022-10-07       Impact factor: 7.942

4.  High-dimensional dynamics of generalization error in neural networks.

Authors:  Madhu S Advani; Andrew M Saxe; Haim Sompolinsky
Journal:  Neural Netw       Date:  2020-09-05

5.  Using Muse: Rapid Mobile Assessment of Brain Performance.

Authors:  Olave E Krigolson; Mathew R Hammerstrom; Wande Abimbola; Robert Trska; Bruce W Wright; Kent G Hecker; Gordon Binsted
Journal:  Front Neurosci       Date:  2021-01-28       Impact factor: 4.677

6.  Establishment and Effectiveness Evaluation of a Scoring System-RAAS (RDW, AGE, APACHE II, SOFA) for Sepsis by a Retrospective Analysis.

Authors:  Yingying Huang; Shaowei Jiang; Wenjie Li; Yiwen Fan; Yuxin Leng; Chengjin Gao
Journal:  J Inflamm Res       Date:  2022-01-20

7.  Small facial image dataset augmentation using conditional GANs based on incomplete edge feature input.

Authors:  Shih-Kai Hung; John Q Gan
Journal:  PeerJ Comput Sci       Date:  2021-11-17

8.  Ridge Penalization in High-Dimensional Testing With Applications to Imaging Genetics.

Authors:  Iris Ivy Gauran; Gui Xue; Chuansheng Chen; Hernando Ombao; Zhaoxia Yu
Journal:  Front Neurosci       Date:  2022-03-24       Impact factor: 4.677

9.  Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training.

Authors:  Cong Fang; Hangfeng He; Qi Long; Weijie J Su
Journal:  Proc Natl Acad Sci U S A       Date:  2021-10-26       Impact factor: 11.205

10.  The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation.

Authors:  Davide Chicco; Matthijs J Warrens; Giuseppe Jurman
Journal:  PeerJ Comput Sci       Date:  2021-07-05
