
Theoretical issues in deep networks.

Tomaso Poggio, Andrzej Banburski, Qianli Liao.

Abstract

While deep learning is successful in a number of applications, it is not yet well understood theoretically. A theoretical characterization of deep learning should answer questions about its approximation power, the dynamics of optimization, and good out-of-sample performance, despite overparameterization and the absence of explicit regularization. We review our recent results toward this goal. In approximation theory, both shallow and deep networks are known to approximate any continuous function at an exponential cost. However, we proved that for certain types of compositional functions, deep networks of the convolutional type (even without weight sharing) can avoid the curse of dimensionality. In characterizing minimization of the empirical exponential loss, we consider the gradient flow of the weight directions rather than the weights themselves, since the relevant function underlying classification corresponds to normalized networks. The dynamics of normalized weights turn out to be equivalent to those of the constrained problem of minimizing the loss subject to a unit norm constraint. In particular, the dynamics of typical gradient descent have the same critical points as the constrained problem. Thus there is implicit regularization in training deep networks under exponential-type loss functions during gradient flow. As a consequence, the critical points correspond to minimum norm infima of the loss. This result is especially relevant because it has recently been shown that, for overparameterized models, selection of a minimum norm solution optimizes cross-validation leave-one-out stability and thereby the expected error. Thus our results imply that gradient descent in deep networks minimizes the expected error.
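
The role of weight directions under an exponential loss can be illustrated with a minimal sketch. It is not the deep-network analysis reviewed in the abstract: it assumes a linear classifier, a hypothetical four-point separable dataset, plain gradient descent in place of gradient flow, and an arbitrary step size and iteration count. All names (`exp_loss_grad`, `normalized_margin`, `run_gradient_descent`) are made up for illustration. In this linear setting the weight norm keeps growing while the normalized weights drift toward the maximum-margin direction, which is the same direction as the minimum-norm separator under a fixed margin constraint.

```python
import math

# Toy, linearly separable data (hypothetical values chosen for illustration).
X = [(2.0, 1.0), (1.0, 2.0), (-2.0, -1.0), (-1.0, -2.0)]
Y = [1.0, 1.0, -1.0, -1.0]


def exp_loss_grad(w):
    """Gradient of the exponential loss L(w) = sum_i exp(-y_i * <w, x_i>)."""
    g = [0.0, 0.0]
    for (x1, x2), y in zip(X, Y):
        coef = -y * math.exp(-y * (w[0] * x1 + w[1] * x2))
        g[0] += coef * x1
        g[1] += coef * x2
    return g


def norm(w):
    """Euclidean norm of a 2D weight vector."""
    return math.hypot(w[0], w[1])


def normalized_margin(w):
    """Smallest margin y_i * <w, x_i> after rescaling w to unit norm."""
    smallest = min(y * (w[0] * x1 + w[1] * x2) for (x1, x2), y in zip(X, Y))
    return smallest / norm(w)


def run_gradient_descent(w0, lr=0.1, steps=5000):
    """Plain gradient descent on the unnormalized weights (assumed settings)."""
    w = list(w0)
    for _ in range(steps):
        g = exp_loss_grad(w)
        w = [w[0] - lr * g[0], w[1] - lr * g[1]]
    return w


w_init = [0.5, 0.0]
w_final = run_gradient_descent(w_init)

# On this dataset the best unit-norm direction is (1, 1)/sqrt(2), with margin 3/sqrt(2).
max_margin = 3.0 / math.sqrt(2.0)

print("norm:", norm(w_init), "->", norm(w_final))
print("normalized margin:", normalized_margin(w_init), "->", normalized_margin(w_final))
print("maximum achievable margin:", max_margin)
```

The loss itself has no finite minimizer on separable data, so only the direction `w / norm(w)` carries information about the classifier; the sketch tracks that direction through its normalized margin rather than through the loss value.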

Keywords:  approximation; deep learning; generalization; machine learning; optimization

Year:  2020        PMID: 32518109      PMCID: PMC7720221          DOI: 10.1073/pnas.1907369117

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


  3 in total

1.  Error bounds for approximations with deep ReLU networks.

Authors:  Dmitry Yarotsky
Journal:  Neural Netw       Date:  2017-07-13

2.  Optimal approximation of piecewise smooth functions using deep ReLU neural networks.

Authors:  Philipp Petersen; Felix Voigtlaender
Journal:  Neural Netw       Date:  2018-09-07

3.  Complexity control by gradient descent in deep networks.

Authors:  Tomaso Poggio; Qianli Liao; Andrzej Banburski
Journal:  Nat Commun       Date:  2020-02-24       Impact factor: 14.919

  9 in total

1.  The science of deep learning.

Authors:  Richard Baraniuk; David Donoho; Matan Gavish
Journal:  Proc Natl Acad Sci U S A       Date:  2020-11-23       Impact factor: 11.205

2.  Deep physical neural networks trained with backpropagation.

Authors:  Logan G Wright; Tatsuhiro Onodera; Martin M Stein; Tianyu Wang; Darren T Schachter; Zoey Hu; Peter L McMahon
Journal:  Nature       Date:  2022-01-26       Impact factor: 69.504

Review 3.  Artificial intelligence and machine learning approaches for drug design: challenges and opportunities for the pharmaceutical industries.

Authors:  Chandrabose Selvaraj; Ishwar Chandra; Sanjeev Kumar Singh
Journal:  Mol Divers       Date:  2021-10-23       Impact factor: 2.943

4.  Can Robots Do Epidemiology? Machine Learning, Causal Inference, and Predicting the Outcomes of Public Health Interventions.

Authors:  Alex Broadbent; Thomas Grote
Journal:  Philos Technol       Date:  2022-02-26

5.  Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training.

Authors:  Cong Fang; Hangfeng He; Qi Long; Weijie J Su
Journal:  Proc Natl Acad Sci U S A       Date:  2021-10-26       Impact factor: 11.205

6.  Application of Bayesian Algorithm in Risk Quantification for Network Security.

Authors:  Lei Wei
Journal:  Comput Intell Neurosci       Date:  2022-07-08

7.  Efficient enumeration-selection computational strategy for adaptive chemistry.

Authors:  Yachong Guo; Marco Werner; Vladimir A Baulin
Journal:  Sci Rep       Date:  2022-08-29       Impact factor: 4.996

Review 8.  Face Recognition by Humans and Machines: Three Fundamental Advances from Deep Learning.

Authors:  Alice J O'Toole; Carlos D Castillo
Journal:  Annu Rev Vis Sci       Date:  2021-08-04       Impact factor: 7.745

Review 9.  Illuminating the Black Box: Interpreting Deep Neural Network Models for Psychiatric Research.

Authors:  Yi-Han Sheu
Journal:  Front Psychiatry       Date:  2020-10-29       Impact factor: 4.157

