An Experience Report on Machine Learning Reproducibility: Guidance for Practitioners and TensorFlow Model Garden Contributors

by Vishnu Banna, et al.

Machine learning techniques are becoming a fundamental tool for scientific and engineering progress. These techniques are applied in contexts as diverse as astronomy and spam filtering. However, correctly applying these techniques requires careful engineering. Much attention has been paid to the technical potential; relatively little attention has been paid to the software engineering process required to bring research-based machine learning techniques into practical utility. Technology companies have supported the engineering community through machine learning frameworks such as TensorFlow and PyTorch, but the details of how to engineer complex machine learning models in these frameworks have remained hidden. To promote best practices within the engineering community, academic institutions and Google have partnered to launch a Special Interest Group on Machine Learning Models (SIGMODELS) whose goal is to develop exemplary implementations of prominent machine learning models in community locations such as the TensorFlow Model Garden (TFMG). The purpose of this report is to define a process for reproducing a state-of-the-art machine learning model at a level of quality suitable for inclusion in the TFMG. We define the engineering process and elaborate on each step, from paper analysis to model release. We report on our experiences implementing the YOLO model family with a team of 26 student researchers, share the tools we developed, and describe the lessons we learned along the way.




