Proof-of-Learning: Definitions and Practice

by Hengrui Jia, et al.

Training machine learning (ML) models typically involves expensive iterative optimization. Once the model's final parameters are released, there is currently no mechanism for the entity which trained the model to prove that these parameters were indeed the result of this optimization procedure. Such a mechanism would support the security of ML applications in several ways. For instance, it would simplify ownership resolution when multiple parties contest ownership of a specific model. It would also facilitate distributed training across untrusted workers, where Byzantine workers might otherwise mount a denial-of-service attack by returning incorrect model updates. In this paper, we address this problem by introducing the concept of proof-of-learning in ML. Inspired by research on both proof-of-work and verified computations, we observe how a seminal training algorithm, stochastic gradient descent, accumulates secret information due to its stochasticity. This produces a natural construction for a proof-of-learning which demonstrates that a party has expended the compute required to obtain a set of model parameters correctly. In particular, our analyses and experiments show that an adversary seeking to illegitimately manufacture a proof-of-learning needs to perform *at least* as much work as is needed for gradient descent itself. We also instantiate a concrete proof-of-learning mechanism in both of the scenarios described above. In model ownership resolution, it protects the intellectual property of models released publicly. In distributed training, it preserves availability of the training procedure. Our empirical evaluation validates that our proof-of-learning mechanism is robust to variance induced by the hardware (ML accelerators) and software stacks.
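The abstract does not spell out the construction, so the following is only a rough illustration of the general idea, not the paper's mechanism. It assumes that a prover logs intermediate weights together with the randomly drawn batch indices (the "secret information" produced by SGD's stochasticity), and that a verifier replays each logged step and compares the result within a tolerance. The toy 1-D least-squares model and the names `sgd_step`, `train_with_proof`, `verify`, and `tol` are all hypothetical.

```python
import random


def sgd_step(w, batch, lr):
    """One SGD step on the 1-D least-squares loss mean((w*x - y)^2)."""
    grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
    return w - lr * grad


def train_with_proof(data, steps, batch_size, lr, seed):
    """Train and log (weights, batch indices) after every step.

    Assumption: the log of checkpoints and batch indices plays the role
    of the proof; the batch order is only known to whoever ran training.
    """
    rng = random.Random(seed)
    w = 0.0
    proof = [(w, None)]  # initial weights have no producing batch
    for _ in range(steps):
        idx = rng.sample(range(len(data)), batch_size)
        w = sgd_step(w, [data[i] for i in idx], lr)
        proof.append((w, idx))
    return w, proof


def verify(proof, data, lr, tol=1e-9):
    """Replay every logged step and check it reproduces the next checkpoint.

    The tolerance stands in for the hardware and software variance the
    abstract mentions; its value here is arbitrary.
    """
    for (w_prev, _), (w_next, idx) in zip(proof, proof[1:]):
        w_check = sgd_step(w_prev, [data[i] for i in idx], lr)
        if abs(w_check - w_next) > tol:
            return False
    return True


if __name__ == "__main__":
    data = [(x / 10, 3 * x / 10) for x in range(1, 11)]
    w, proof = train_with_proof(data, steps=50, batch_size=4, lr=0.1, seed=0)
    print("honest proof accepted:", verify(proof, data, lr=0.1))
```

In this sketch the verifier replays every step, which costs as much as training; checking only a subset of steps would be a natural way to reduce that cost, at the price of weaker guarantees.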


