SURPRISES IN HIGH-DIMENSIONAL RIDGELESS LEAST SQUARES INTERPOLATION.

Trevor Hastie, Andrea Montanari, Saharon Rosset, Ryan J Tibshirani.

Abstract

Interpolators (estimators that achieve zero training error) have attracted growing attention in machine learning, mainly because state-of-the-art neural networks appear to be models of this type. In this paper, we study minimum $\ell_2$ norm ("ridgeless") interpolation least squares regression, focusing on the high-dimensional regime in which the number of unknown parameters $p$ is of the same order as the number of samples $n$. We consider two different models for the feature distribution: a linear model, where the feature vectors $x_i \in \mathbb{R}^p$ are obtained by applying a linear transform to a vector of i.i.d. entries, $x_i = \Sigma^{1/2} z_i$ (with $z_i \in \mathbb{R}^p$); and a nonlinear model, where the feature vectors are obtained by passing the input through a random one-layer neural network, $x_i = \varphi(W z_i)$ (with $z_i \in \mathbb{R}^d$, $W \in \mathbb{R}^{p \times d}$ a matrix of i.i.d. entries, and $\varphi$ an activation function acting componentwise on $W z_i$). We recover, in a precise quantitative way, several phenomena that have been observed in large-scale neural networks and kernel machines, including the "double descent" behavior of the prediction risk, and the potential benefits of overparametrization.
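The two feature models and the minimum $\ell_2$ norm interpolator described in the abstract can be illustrated with a small numerical sketch. This is not the authors' code: the Gaussian entries, the `tanh` activation, the $1/\sqrt{d}$ scaling of `W`, the identity choice for $\Sigma^{1/2}$, the noise level, and all function names are assumptions made only for illustration. The sketch computes the minimum-norm solution with the pseudoinverse, which for $p > n$ and full row rank fits the training data exactly.

```python
import numpy as np

def min_norm_interpolator(X, y):
    """Minimum l2-norm least squares solution via the Moore-Penrose pseudoinverse."""
    return np.linalg.pinv(X) @ y

def linear_features(n, p, sigma_sqrt, rng):
    """Rows are x_i = Sigma^{1/2} z_i with z_i having i.i.d. entries (Gaussian here, an assumption)."""
    Z = rng.standard_normal((n, p))
    return Z @ sigma_sqrt.T

def nonlinear_features(n, p, d, rng, phi=np.tanh):
    """Rows are x_i = phi(W z_i); tanh and the 1/sqrt(d) scaling of W are illustrative choices."""
    Z = rng.standard_normal((n, d))
    W = rng.standard_normal((p, d)) / np.sqrt(d)
    return phi(Z @ W.T)

rng = np.random.default_rng(0)
n, p, d = 50, 200, 30  # overparametrized: p > n

# Linear feature model with Sigma = I (hypothetical choice) and a noisy linear response.
X_lin = linear_features(n, p, np.eye(p), rng)
beta = rng.standard_normal(p) / np.sqrt(p)
y = X_lin @ beta + 0.5 * rng.standard_normal(n)

beta_hat = min_norm_interpolator(X_lin, y)
train_residual = np.linalg.norm(X_lin @ beta_hat - y)

# Another interpolating solution: add a null-space component of X_lin.
v = rng.standard_normal(p)
other = beta_hat + v - np.linalg.pinv(X_lin) @ (X_lin @ v)

# Nonlinear (random one-layer network) feature model, fitted to the same responses.
X_nl = nonlinear_features(n, p, d, rng)
theta_hat = min_norm_interpolator(X_nl, y)

print("training residual (linear model):", train_residual)
print("norm of min-norm solution:", np.linalg.norm(beta_hat))
print("norm of another interpolator:", np.linalg.norm(other))
```

Repeating the fit over a range of ratios $p/n$ and plotting the test error would be one way to look for the "double descent" shape the abstract mentions; the sketch above only checks interpolation and the minimum-norm property.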

Keywords: regression; interpolation; overparametrization; random matrix theory; ridge regression. MSC subject classifications: primary 62J05, 62J07; secondary 62J02, 62F12

Year: 2022; PMID: 36120512; PMCID: PMC9481183; DOI: 10.1214/21-aos2133

Source DB: PubMed; Journal: Ann Stat; ISSN: 0090-5364; Impact factor: 4.904

