Reproducibility Requires Consolidated Artifacts
Machine learning is facing a 'reproducibility crisis' where a significant number of works report failures when attempting to reproduce previously published results. We evaluate the sources of reproducibility failures using a meta-analysis of 142 replication studies from ReScience C and 204 code repositories. We find that missing experiment details such as hyperparameters are potential causes of unreproducibility. We experimentally show the bias of different hyperparameter selection strategies and conclude that consolidated artifacts with a unified framework can help support reproducibility.
READ FULL TEXT