MDB: Interactively Querying Datasets and Models

08/13/2023
by Aaditya Naik, et al.

As models are trained and deployed, developers need to be able to systematically debug errors that emerge in the machine learning pipeline. We present MDB, a debugging framework for interactively querying datasets and models. MDB integrates functional programming with relational algebra to build expressive queries over a database of datasets and model predictions. Queries are reusable and easily modified, enabling developers to rapidly iterate on and refine queries to discover and characterize errors and model behaviors. We evaluate MDB on object detection, bias discovery, image classification, and data imputation tasks across self-driving videos, large language models, and medical records. Our experiments show that MDB enables up to 10x faster and 40% shorter queries than the baselines. In a user study, we find that developers can successfully construct complex queries that describe errors of machine learning models.
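To make the idea of combining functional programming with relational algebra concrete, the following is a minimal Python sketch. It is not MDB's API, which the abstract does not describe. Every name here (`select`, `project`, `join`, `pipeline`, and the toy `labels` and `predictions` tables) is a hypothetical stand-in chosen for illustration. The sketch treats relational operators as higher-order functions that return queries, so queries can be composed and reused.

```python
from functools import reduce
from typing import Callable, Dict, List

# Hypothetical types: a table is a list of rows, a query maps a table to a table.
Row = Dict[str, object]
Table = List[Row]
Query = Callable[[Table], Table]


def select(pred: Callable[[Row], bool]) -> Query:
    """Relational selection: keep the rows that satisfy pred."""
    return lambda rows: [r for r in rows if pred(r)]


def project(*cols: str) -> Query:
    """Relational projection: keep only the named columns."""
    return lambda rows: [{c: r[c] for c in cols} for r in rows]


def join(other: Table, key: str) -> Query:
    """Equi-join the input table with `other` on a shared key column."""
    def run(rows: Table) -> Table:
        index: Dict[object, List[Row]] = {}
        for o in other:
            index.setdefault(o[key], []).append(o)
        return [{**r, **o} for r in rows for o in index.get(r[key], [])]
    return run


def pipeline(*steps: Query) -> Query:
    """Compose queries left to right into a single reusable query."""
    return lambda rows: reduce(lambda acc, step: step(acc), steps, rows)


# Toy data (invented for this example): ground-truth labels and model predictions.
labels = [
    {"frame": 1, "label": "car"},
    {"frame": 2, "label": "pedestrian"},
    {"frame": 3, "label": "car"},
]
predictions = [
    {"frame": 1, "pred": "car", "score": 0.94},
    {"frame": 2, "pred": "car", "score": 0.61},
    {"frame": 3, "pred": "truck", "score": 0.55},
]

# A reusable query: frames where the prediction disagrees with the label.
mismatches = pipeline(
    join(predictions, "frame"),
    select(lambda r: r["label"] != r["pred"]),
    project("frame", "label", "pred"),
)
errors = mismatches(labels)

# Refining the query is just another composition step.
missed_pedestrians = pipeline(
    mismatches,
    select(lambda r: r["label"] == "pedestrian"),
)
```

Because each query is an ordinary function, narrowing `mismatches` down to `missed_pedestrians` reuses the earlier query unchanged, which is the kind of iterative refinement the abstract describes.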
