Ariel Keller Rorabaugh, Silvina Caíno-Lores, Travis Johnston, Michela Taufer.
Abstract
Neural Networks (NNs) are increasingly used across scientific domains to extract knowledge from experimental or computational data. An NN is composed of natural or artificial neurons that serve as simple processing units and are interconnected into a model architecture; it acquires knowledge from the environment through a learning process and stores this knowledge in its connections. The learning process is conducted by training. During NN training, the learning process can be tracked by periodically validating the NN and calculating its fitness. The resulting sequence of fitness values (i.e., validation accuracy or validation loss) is called the NN learning curve. The development of tools for NN design requires knowledge of diverse NNs and their complete learning curves. Generally, only final fully-trained fitness values for highly accurate NNs are made available to the community, hampering efforts to develop tools for NN design and leaving unaddressed aspects such as explaining the generation of an NN and reproducing its learning process. Our dataset fills this gap by fully recording the structure, metadata, and complete learning curves for a wide variety of random NNs throughout their training. Our dataset captures the lifespan of 6000 NNs throughout the generation, training, and validation stages. It consists of a suite of 6000 tables, each table representing the lifespan of one NN. We generate each NN with randomized parameter values and train it for 40 epochs on one of three diverse image datasets (i.e., CIFAR-100, FashionMNIST, SVHN). We calculate and record each NN's fitness at high frequency (every half epoch) to capture the evolution of the training and validation process. As a result, for each NN, we record the generated parameter values describing the structure of that NN, the image dataset on which the NN trained, and all loss and accuracy values for the NN every half epoch. We put our dataset at the service of researchers studying NN performance and its evolution throughout training and validation. Statistical methods can be applied to our dataset to analyze the shape of learning curves in diverse NNs and the relationship between an NN's structure and its fitness. Additionally, the structural data and metadata that we record enable the reconstruction and reproducibility of the associated NN.
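The recording procedure described above lends itself to a compact illustration. Below is a minimal PyTorch sketch, not the authors' code, of how a learning curve with half-epoch validation checkpoints can be collected; names such as `train_loader` and `val_loader` are hypothetical placeholders.

```python
import torch
import torch.nn as nn

def validate(model, val_loader, criterion):
    """Return mean validation loss and accuracy over val_loader."""
    model.eval()
    total_loss, correct, seen = 0.0, 0, 0
    with torch.no_grad():
        for x, y in val_loader:
            out = model(x)
            total_loss += criterion(out, y).item() * y.size(0)
            correct += (out.argmax(dim=1) == y).sum().item()
            seen += y.size(0)
    return total_loss / seen, correct / seen

def train_with_curve(model, train_loader, val_loader, epochs=40, lr=0.01):
    criterion = nn.CrossEntropyLoss()              # loss function named in the paper
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    checkpoints = {len(train_loader) // 2, len(train_loader)}  # twice per epoch
    curve = []                                     # (epoch, val_loss, val_accuracy)
    for epoch in range(epochs):
        model.train()
        for i, (x, y) in enumerate(train_loader, start=1):
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
            if i in checkpoints:                   # validate every half epoch
                val_loss, val_acc = validate(model, val_loader, criterion)
                curve.append((epoch + i / len(train_loader), val_loss, val_acc))
                model.train()                      # resume training mode
    return curve
```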
Keywords: Accuracy curve; Artificial intelligence; Classification; Early stopping; Loss curve; Machine learning; Neural architecture search; Performance prediction
Year: 2022 PMID: 35036484 PMCID: PMC8749157 DOI: 10.1016/j.dib.2021.107780
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Fig. 1 Structure of Generated NNs.
Block of non-linear layers encoded by each integer value.

| 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|
| ReLU | pooling | ReLU, pooling | dropout | ReLU, dropout | dropout, pooling | ReLU, dropout, pooling |
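The seven codes follow a three-bit pattern (ReLU = 1, pooling = 2, dropout = 4; e.g., 3 = 1 + 2 = ReLU plus pooling), so decoding a code into concrete layers is mechanical. A minimal PyTorch sketch follows; the dropout rate and pool kernel here are illustrative placeholders, since the dataset stores the actual values per NN.

```python
import torch.nn as nn

def decode_block(code, dropout_rate=0.5, pool_kernel=2):
    """Decode an integer code from Fig. 1 into a block of non-linear layers.

    The layer order (ReLU, dropout, pooling) matches the listing order in
    the table above; whether the dataset's NNs use exactly this order is an
    assumption.
    """
    assert 1 <= code <= 7
    layers = []
    if code & 1:                      # codes 1, 3, 5, 7 contain ReLU
        layers.append(nn.ReLU())
    if code & 4:                      # codes 4, 5, 6, 7 contain dropout
        layers.append(nn.Dropout(p=dropout_rate))
    if code & 2:                      # codes 2, 3, 6, 7 contain pooling
        layers.append(nn.MaxPool2d(kernel_size=pool_kernel))
    return nn.Sequential(*layers)
```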
First three rows of an NN table. Each row corresponds to a half-epoch validation checkpoint (0.5, 1.0, ...) and records the loss and accuracy values at that checkpoint alongside the NN's metadata (e.g., timestamp, random seed, image dataset such as CIFAR100, loss function CrossEntropyLoss, optimizer SGD, learning rate) and the encoded parameter values describing the NN's structure.
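For researchers who want to work with the suite programmatically, a minimal loading sketch follows. The tab delimiter, the absence of a header row, and the file name are assumptions to verify against an actual file from the repository, not dataset specifications.

```python
import pandas as pd

def load_nn_table(path):
    # Assumes tab-separated values with no header row; adjust after
    # inspecting a file downloaded from the Dataverse repository.
    return pd.read_csv(path, sep="\t", header=None)

table = load_nn_table("nn_0001.txt")   # hypothetical file name
print(table.head(3))                   # e.g., the first three rows
```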
Parameters for NN generation. Values are uniformly randomized in the specified intervals.
| Parameter | Values |
|---|---|
| Number of convolutional layers | |
| Kernel | |
| Stride | |
| Padding | [0, 5] |
| Number of filters | |
| Number and type of non-linear layers in blocks | [1, 30]; ReLU, dropout, pooling |
| Dropout rate for dropout layers | [0.1, 0.7] |
| Pool kernel for pooling layers | |
| Stride for pooling layers | |
| Padding for pooling layers | |
| Number of fully connected layers | [1, 5] |
| Number of filters | [0, 400] |
| Dropout rate for dropout layers | [0.1, 0.7] |
| Learning rate | |
| Momentum | |
| Dampening | |
| Weight decay | |
| Batch size | [25, 250] |
Either the number of input channels or the number of filters of the previous layer, depending on whether an increasing number of filters is enforced.
These parameters are taken into account only if they are randomized to be true.
Fig. 2 Randomizing the number of filters of each convolutional layer.
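The mechanism Fig. 2 depicts can be sketched from the footnote above: when an increasing number of filters is enforced, each layer's lower bound is the previous layer's filter count; otherwise it stays at the input channel count. A hedged Python sketch, where `max_filters` and the function name are placeholders:

```python
import random

def random_filter_counts(n_conv_layers, in_channels, max_filters=400,
                         enforce_increasing=True):
    """Randomize the filter count of each convolutional layer.

    max_filters is an illustrative upper bound, not a value confirmed by
    the parameter table for convolutional layers.
    """
    counts, lower = [], in_channels
    for _ in range(n_conv_layers):
        f = random.randint(lower, max_filters)
        counts.append(f)
        if enforce_increasing:
            lower = f          # next layer gets at least this many filters
    return counts
```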
Fig. 3 Randomizing training parameters.
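Similarly, the training parameters can be drawn at random; per the footnote above, momentum, dampening, and weight decay enter only when their corresponding flags are randomized to true. A sketch with placeholder ranges and flag probabilities (none of these constants come from the dataset):

```python
import random
import torch

def random_sgd(model_params):
    """Build an SGD optimizer with randomized training parameters.

    All ranges and the 0.5 flag probability are illustrative placeholders.
    """
    kwargs = {"lr": random.uniform(1e-4, 1e-1)}
    if random.random() < 0.5:                      # use momentum?
        kwargs["momentum"] = random.uniform(0.0, 1.0)
        if random.random() < 0.5:                  # dampening applies with momentum
            kwargs["dampening"] = random.uniform(0.0, 1.0)
    if random.random() < 0.5:                      # use weight decay?
        kwargs["weight_decay"] = random.uniform(0.0, 1e-3)
    return torch.optim.SGD(model_params, **kwargs)
```

A call such as `random_sgd(model.parameters())` would pair with the training loop sketched after the abstract.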
| Subject | Applied Machine Learning |
| Specific subject area | Neural network metadata and learning curve data |
| Type of data | Tabular data in TXT files. |
| How the data were acquired | The neural networks were generated, trained, and validated on the POWER9-based Summit supercomputer. |
| Data format | Raw |
| Description of data collection | The data consist of tables describing NNs and their learning curves. We generate each NN with random parameters and train it on an image dataset for 40 epochs, using stochastic gradient descent and cross-entropy loss. For each NN, we record the randomized parameter values and the image dataset used for training. Every half epoch throughout training, we validate the NN and record its fitness. |
| Data source location | Summit supercomputer at Oak Ridge National Laboratory, Oak Ridge, TN, United States |
| Data accessibility | Repository name: Harvard Dataverse; Data identification number: doi: |
| Related research article | A. Keller Rorabaugh, S. Caíno-Lores, T. Johnston, M. Taufer, Building high-throughput neural architecture search workflows via a decoupled fitness prediction engine. IEEE Transactions on Parallel and Distributed Systems, 2022, In Press. DOI |