BUTTER - Empirical Deep Learning Dataset
The BUTTER Empirical Deep Learning Dataset represents an empirical study of deep learning phenomena in dense fully connected networks, scanning across thirteen datasets, eight network shapes, fourteen depths, twenty-three network sizes (numbers of trainable parameters), four learning rates, six minibatch sizes, four levels of label noise, and fourteen levels each of L1 and L2 regularization. Multiple repetitions (typically 30, sometimes 10) of each combination of hyperparameters were performed, and statistics including training and test loss (using an 80% / 20% shuffled train-test split) were recorded at the end of each training epoch. In total, the dataset covers 178 thousand distinct hyperparameter settings ("experiments"), 3.55 million individual training runs (an average of 20 repetitions of each experiment), and a total of 13.3 billion training epochs (most runs covered three thousand epochs). Accumulating this dataset consumed 5,448.4 CPU core-years, 17.8 GPU-years, and 111.2 node-years.
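The per-epoch records lend themselves to tabular analysis. As a minimal sketch, assuming the data has been downloaded from the OEDI submission as a Parquet file, the snippet below loads a slice with pandas; the file name and column names here are hypothetical placeholders, not the published schema, so consult the submission files for the actual layout.

import pandas as pd

# Hypothetical local file name; download the actual data files from the
# OEDI submission at https://data.openei.org/submissions/5708.
runs = pd.read_parquet("butter_runs.parquet")

# Illustrative query: per-epoch test loss for one hyperparameter slice.
# Column names ("dataset", "depth", "epoch", "test_loss") are assumptions.
subset = runs[(runs["dataset"] == "mnist") & (runs["depth"] == 4)]
print(subset[["epoch", "test_loss"]].head())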
Citation Formats
TY - DATA
AB - The BUTTER Empirical Deep Learning Dataset represents an empirical study of deep learning phenomena in dense fully connected networks, scanning across thirteen datasets, eight network shapes, fourteen depths, twenty-three network sizes (numbers of trainable parameters), four learning rates, six minibatch sizes, four levels of label noise, and fourteen levels each of L1 and L2 regularization. Multiple repetitions (typically 30, sometimes 10) of each combination of hyperparameters were performed, and statistics including training and test loss (using an 80% / 20% shuffled train-test split) were recorded at the end of each training epoch. In total, the dataset covers 178 thousand distinct hyperparameter settings ("experiments"), 3.55 million individual training runs (an average of 20 repetitions of each experiment), and a total of 13.3 billion training epochs (most runs covered three thousand epochs). Accumulating this dataset consumed 5,448.4 CPU core-years, 17.8 GPU-years, and 111.2 node-years.
AU - Tripp, Charles
A2 - Perr-Sauer, Jordan
A3 - Hayne, Lucas
A4 - Lunacek, Monte
DB - Open Energy Data Initiative (OEDI)
DP - Open EI | National Renewable Energy Laboratory
DO - 10.25984/1872441
KW - neural networks
KW - machine learning
KW - training
KW - benchmark
KW - deep learning
KW - empirical deep learning
KW - empirical machine learning
KW - empirical
KW - learning rate
KW - batch size
KW - minibatch size
KW - regularization
KW - label noise
KW - depth
KW - shape
KW - topology
KW - network shape
KW - network topology
KW - epoch
KW - training epoch
KW - neural architecture search
LA - English
DA - 2022/05/20
PY - 2022
PB - National Renewable Energy Laboratory
T1 - BUTTER - Empirical Deep Learning Dataset
UR - https://doi.org/10.25984/1872441
ER -
Tripp, Charles, et al. BUTTER - Empirical Deep Learning Dataset. National Renewable Energy Laboratory, 20 May 2022, Open Energy Data Initiative (OEDI). https://doi.org/10.25984/1872441.
Tripp, C., Perr-Sauer, J., Hayne, L., & Lunacek, M. (2022). BUTTER - Empirical Deep Learning Dataset. [Data set]. Open Energy Data Initiative (OEDI). National Renewable Energy Laboratory. https://doi.org/10.25984/1872441
Tripp, Charles, Jordan Perr-Sauer, Lucas Hayne, and Monte Lunacek. BUTTER - Empirical Deep Learning Dataset. National Renewable Energy Laboratory, May 20, 2022. Distributed by Open Energy Data Initiative (OEDI). https://doi.org/10.25984/1872441
@misc{OEDI_Dataset_5708,
title = {BUTTER - Empirical Deep Learning Dataset},
author = {Tripp, Charles and Perr-Sauer, Jordan and Hayne, Lucas and Lunacek, Monte},
abstractNote = {The BUTTER Empirical Deep Learning Dataset represents an empirical study of deep learning phenomena in dense fully connected networks, scanning across thirteen datasets, eight network shapes, fourteen depths, twenty-three network sizes (numbers of trainable parameters), four learning rates, six minibatch sizes, four levels of label noise, and fourteen levels each of L1 and L2 regularization. Multiple repetitions (typically 30, sometimes 10) of each combination of hyperparameters were performed, and statistics including training and test loss (using an 80\% / 20\% shuffled train-test split) were recorded at the end of each training epoch. In total, the dataset covers 178 thousand distinct hyperparameter settings ("experiments"), 3.55 million individual training runs (an average of 20 repetitions of each experiment), and a total of 13.3 billion training epochs (most runs covered three thousand epochs). Accumulating this dataset consumed 5,448.4 CPU core-years, 17.8 GPU-years, and 111.2 node-years.},
url = {https://data.openei.org/submissions/5708},
year = {2022},
howpublished = {Open Energy Data Initiative (OEDI), National Renewable Energy Laboratory, https://doi.org/10.25984/1872441},
note = {Accessed: 2025-05-08},
doi = {10.25984/1872441}
}
Details
Data from May 20, 2022
Last updated Jan 2, 2024
Submitted Jun 15, 2022
Organization
National Renewable Energy Laboratory
Contact
Charles Edison Tripp
303.275.4082
Keywords
neural networks, machine learning, training, benchmark, deep learning, empirical deep learning, empirical machine learning, empirical, learning rate, batch size, minibatch size, regularization, label noise, depth, shape, topology, network shape, network topology, epoch, training epoch, neural architecture search
DOE Project Details
Project Name National Renewable Energy Laboratory (NREL) Lab Directed Research and Development (LDRD)
Project Number GO0028308