Vanessa D'Amario1,2, Sanjana Srivastava2,3, Tomotake Sasaki4, Xavier Boix1,2.
Abstract
Biological learning systems are outstanding in their ability to learn from limited training data compared to the most successful learning machines, i.e., Deep Neural Networks (DNNs). Which key aspects underlie this data-efficiency gap is an unresolved question at the core of biological and artificial intelligence. We hypothesize that one important aspect is that biological systems rely on mechanisms such as foveation to reduce unnecessary input dimensions for the task at hand, e.g., the background in object recognition, whereas state-of-the-art DNNs do not. Datasets used to train DNNs often contain such unnecessary input dimensions, and these lead to more trainable parameters. Yet it is not clear whether this affects DNNs' data efficiency, because DNNs are robust to an increasing number of parameters in the hidden layers, and it is uncertain whether this robustness extends to the input layer. In this paper, we investigate the impact of unnecessary input dimensions on DNNs' data efficiency, namely, the number of examples needed to achieve a given generalization performance. Our results show that unnecessary input dimensions that are task-unrelated substantially degrade data efficiency. This highlights the need for mechanisms that remove task-unrelated dimensions, such as foveation for image classification, in order to enable data efficiency gains.
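The core experimental manipulation the abstract describes — augmenting inputs with task-unrelated dimensions (e.g., noise standing in for background) before training — can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, dataset shapes, and the choice of i.i.d. Gaussian noise are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_task_unrelated_dims(X, n_extra, sigma=1.0):
    """Append n_extra task-unrelated dimensions to each example.

    The extra dimensions are independent Gaussian noise, so they carry
    no information about the label; a classifier trained on the padded
    data must learn to ignore them.
    """
    noise = rng.normal(0.0, sigma, size=(X.shape[0], n_extra))
    return np.concatenate([X, noise], axis=1)

# e.g., pad a 10-dimensional dataset with 90 noise dimensions
X = rng.normal(size=(100, 10))
X_padded = add_task_unrelated_dims(X, n_extra=90)
print(X_padded.shape)  # → (100, 100)
```

Sweeping `n_extra` while holding the informative dimensions fixed, and retraining at each training-set size, yields the data-efficiency curves the figures report.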
Keywords: data efficiency; deep learning; object background; object recognition; overparameterization; unnecessary input dimensions
Year: 2022 PMID: 35173595 PMCID: PMC8842477 DOI: 10.3389/fncom.2022.760085
Source DB: PubMed Journal: Front Comput Neurosci ISSN: 1662-5188 Impact factor: 2.380
Figure 1. Data Efficiency of Linear and Fully Connected Networks Trained on a Dataset for Binary Classification. Data efficiency for different amounts of unnecessary input dimensions. Error bars indicate the standard deviation across experiment repetitions. (A) Test accuracy of the pseudo-inverse solution as a function of the number of training examples. Each curve corresponds to a different number of additional task-unrelated (left plot), task-related (middle plot), and task-related/unrelated (right plot) dimensions, as reported in the legend. (B) Area Under the Test Curve (AUTC) of the accuracy for different numbers of training examples. The gradient bar on the x-axis indicates the number of additional task-unrelated dimensions (left and right plots) and additional task-related dimensions (middle plot). The legend indicates the different networks trained and tested on the dataset. (C) AUTC values for an MLP with ReLU activation and cross-entropy loss, for different types of task-unrelated dimensions: independent Gaussian components with increasing variance, Gaussian with non-diagonal covariance, and salt-and-pepper noise. The gradient on the x-axis indicates the number of task-unrelated dimensions.
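The AUTC summarizes a whole accuracy-vs-training-set-size curve in one number. The record does not spell out the exact definition, so the following is a minimal sketch under the assumption of trapezoidal integration normalized by the range of training-set sizes (so a constant accuracy a gives AUTC = a); the paper's exact normalization, and the log-spacing behind the log-AUTC in Figure 2, may differ.

```python
import numpy as np

def autc(n_train, test_acc):
    """Area Under the Test Curve: trapezoidal integral of test accuracy
    over training-set size, normalized by the size range."""
    n = np.asarray(n_train, dtype=float)
    a = np.asarray(test_acc, dtype=float)
    # trapezoid rule: mean of adjacent accuracies times the size gap
    area = np.sum(0.5 * (a[1:] + a[:-1]) * np.diff(n))
    return area / (n[-1] - n[0])

# accuracy measured at training-set sizes 10, 100, and 1000
print(round(autc([10, 100, 1000], [0.6, 0.8, 0.9]), 3))  # → 0.836
```

A higher AUTC means the model reaches good accuracy with fewer examples, which is why the figures use it as the data-efficiency score.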
Figure 2. Data Efficiency in Object Recognition Datasets. DNNs' data efficiency for different amounts of unnecessary input dimensions, tasks, and networks. Error bars indicate the standard deviation across experiment repetitions. (A) log-AUTC for CNNs and MLPs trained on synthetic MNIST, for different numbers of unnecessary dimensions. (B) Left plot: log-AUTC for networks trained on Natural MNIST with larger numbers of training examples; right plot: test accuracy on the smallest training set. (C) AUTC on the Stanford Dogs dataset for the five cases shown on the left of the panel.