MGit: A Model Versioning and Management System

07/14/2023
by   Wei Hao, et al.

Models derived from other models are extremely common in machine learning (ML) today. For example, transfer learning is used to create task-specific models from "pre-trained" models through finetuning. This has led to an ecosystem where models are related to each other, sharing structure and often even parameter values. However, it is hard to manage these model derivatives: the storage overhead of storing all derived models quickly becomes onerous, prompting users to get rid of intermediate models that might be useful for further analysis. Additionally, undesired behaviors in models are hard to track down (e.g., is a bug inherited from an upstream model?). In this paper, we propose a model versioning and management system called MGit that makes it easier to store, test, update, and collaborate on model derivatives. MGit introduces a lineage graph that records provenance and versioning information between models, optimizations to efficiently store model parameters, as well as abstractions over this lineage graph that facilitate relevant testing, updating and collaboration functionality. MGit is able to reduce the lineage graph's storage footprint by up to 7x and automatically update downstream models in response to updates to upstream models.
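To make the lineage-graph idea concrete, the following is a minimal sketch, not MGit's actual API. It assumes a graph with two edge types, provenance (a model derived from another) and versioning (a newer version of the same model). The class name, method names, and model names are all hypothetical. It shows the two traversals the abstract motivates: walking upstream to check where a bug might have been inherited from, and walking downstream to find the models affected when an upstream model is updated.

```python
from collections import deque


class LineageGraph:
    """Toy lineage graph (hypothetical; not MGit's real interface)."""

    def __init__(self):
        self.parents = {}       # provenance: model -> models it was derived from
        self.children = {}      # provenance: model -> models derived from it
        self.next_version = {}  # versioning: model -> its newer version

    def add_model(self, name, derived_from=None):
        # Register the node, then record the provenance edge if there is one.
        self.parents.setdefault(name, [])
        self.children.setdefault(name, [])
        if derived_from is not None:
            self.parents[name].append(derived_from)
            self.children.setdefault(derived_from, []).append(name)

    def add_version(self, old, new):
        # Record that `new` supersedes `old`.
        self.add_model(new)
        self.next_version[old] = new

    @staticmethod
    def _walk(start, edges):
        # Breadth-first traversal; returns reachable nodes, nearest first.
        seen, order, queue = {start}, [], deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    order.append(nxt)
                    queue.append(nxt)
        return order

    def ancestors(self, name):
        """Upstream models a behavior could have been inherited from."""
        return self._walk(name, self.parents)

    def descendants(self, name):
        """Downstream models to revisit when `name` is updated."""
        return self._walk(name, self.children)


# Hypothetical model names, for illustration only.
graph = LineageGraph()
graph.add_model("base")
graph.add_model("base-task-a", derived_from="base")
graph.add_model("base-task-b", derived_from="base")
graph.add_model("base-task-a-pruned", derived_from="base-task-a")
graph.add_version("base", "base-v2")

print(graph.ancestors("base-task-a-pruned"))
print(graph.descendants("base"))
```

In this sketch, an update from "base" to "base-v2" would use `descendants("base")` to list the derived models that are candidates for re-derivation. How MGit itself performs that update is described in the full paper, not here.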


research
07/19/2022

Metadata Representations for Queryable ML Model Zoos

Machine learning (ML) practitioners and organizations are building model...
research
05/17/2023

G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks

It has become a popular paradigm to transfer the knowledge of large-scal...
research
09/04/2023

Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models

With ever increasing parameters and computation, vision-language pre-tra...
research
03/07/2023

Private Read-Update-Write with Controllable Information Leakage for Storage-Efficient Federated Learning with Top r Sparsification

In federated learning (FL), a machine learning (ML) model is collectivel...
research
10/19/2022

Revision Transformers: Getting RiT of No-Nos

Current transformer language models (LM) are large-scale models with bil...
research
02/26/2023

Scalable Weight Reparametrization for Efficient Transfer Learning

This paper proposes a novel, efficient transfer learning method, called ...
research
07/17/2023

DeepMem: ML Models as storage channels and their (mis-)applications

Machine learning (ML) models are overparameterized to support generality...
