The Loss Surface of Deep Linear Networks Viewed Through the Algebraic Geometry Lens.

Dhagash Mehta, Tianran Chen, Tingting Tang, Jonathan D Hauenstein.   

Abstract

Using the viewpoint of modern computational algebraic geometry, we explore properties of the optimization landscapes of deep linear neural network models. After clarifying the various definitions of "flat" minima, we show that the geometrically flat minima, which are merely artifacts of residual continuous symmetries of the deep linear networks, can be straightforwardly removed by a generalized L2-regularization. Then, we establish upper bounds on the number of isolated stationary points of these networks with the help of algebraic geometry. Combining these upper bounds with a method in numerical algebraic geometry, we find all stationary points for modest depth and matrix size. We demonstrate that, in the presence of non-zero regularization, deep linear networks can indeed possess local minima which are not global minima. Finally, we show that even though the number of stationary points increases as the number of neurons increases or as the regularization parameters decrease, higher-index saddles are surprisingly rare.
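
The continuous symmetry mentioned in the abstract can be illustrated with a minimal sketch. It assumes a two-layer linear network with a squared-error loss and a plain L2 penalty with a single weight `lam`; the paper's generalized regularization, its data, and its network sizes are not given in this record, so the names `loss`, `lam`, `W1`, `W2`, `A` and all shapes here are hypothetical. Rescaling the hidden layer by an invertible matrix `A` leaves the product `W2 @ W1`, and hence the unregularized loss, unchanged, which is what produces flat directions; the penalty term is not invariant under that rescaling.

```python
import numpy as np

def loss(W1, W2, X, Y, lam=0.0):
    """Squared error of the linear network W2 @ W1 @ X plus an L2 penalty.

    Assumption: a single penalty weight `lam`, not the paper's
    generalized regularization.
    """
    residual = W2 @ W1 @ X - Y
    data_term = 0.5 * np.sum(residual ** 2)
    penalty = 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    return data_term + penalty

# Hypothetical random data and weights, used for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 10))
Y = rng.normal(size=(2, 10))
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))

# Any invertible A works here, since (W2 A^{-1})(A W1) = W2 W1.
# A simple scaling is used so that invertibility is obvious.
A = 3.0 * np.eye(4)
W1_t = A @ W1
W2_t = W2 @ np.linalg.inv(A)

# Without regularization the two parameter settings give the same loss.
unreg_gap = abs(loss(W1, W2, X, Y) - loss(W1_t, W2_t, X, Y))

# With lam > 0 the penalty differs, so the symmetry is broken.
reg_gap = abs(loss(W1, W2, X, Y, lam=0.1) - loss(W1_t, W2_t, X, Y, lam=0.1))

print(f"unregularized gap: {unreg_gap:.2e}")
print(f"regularized gap:   {reg_gap:.2e}")
```

In this sketch the unregularized gap is zero up to floating-point error, while the regularized gap is not, which is consistent with the abstract's statement that regularization removes the symmetry-induced flat minima.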

Year:  2022        PMID: 33822722     DOI: 10.1109/TPAMI.2021.3071289

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0098-5589            Impact factor:   9.322


  2 in total

1.  Critical Point-Finding Methods Reveal Gradient-Flat Regions of Deep Network Losses.

Authors:  Charles G Frye; James Simon; Neha S Wadia; Andrew Ligeralde; Michael R DeWeese; Kristofer E Bouchard
Journal:  Neural Comput       Date:  2021-05-13       Impact factor: 2.026

2.  Thalamic control of cortical dynamics in a model of flexible motor sequencing.

Authors:  Laureline Logiaco; L F Abbott; Sean Escola
Journal:  Cell Rep       Date:  2021-06-01       Impact factor: 9.423
