Enabling SQL-based Training Data Debugging for Federated Learning

08/26/2021
by Yejia Liu, et al.

How can we debug a logistic regression model in a federated learning setting when the model behaves unexpectedly (e.g., it rejects all high-income customers' loan applications)? The SQL-based training data debugging framework has proved effective at fixing this kind of issue in a non-federated learning setting. Given an unexpected query result over model predictions, this framework automatically removes label errors from the training data so that the unexpected behavior disappears in the retrained model. In this paper, we enable this powerful framework for federated learning. The key challenge is to develop a security protocol for federated debugging that is provably secure, efficient, and accurate. Achieving this goal requires seamlessly integrating techniques from multiple fields (Databases, Machine Learning, and Cybersecurity). We first propose FedRain, which extends Rain, the state-of-the-art SQL-based training data debugging framework, to our federated learning setting. We address several technical challenges to make FedRain work and analyze its security guarantee and time complexity. The analysis shows that FedRain falls short in terms of both efficiency and security. To overcome these limitations, we redesign our security protocol and propose Frog, a novel SQL-based training data debugging framework tailored for federated learning. Our theoretical analysis shows that Frog is more secure, more accurate, and more efficient than FedRain. We conduct extensive experiments using several real-world datasets and a case study. The experimental results are consistent with our theoretical analysis and validate the effectiveness of Frog in practice.
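To make the debugging loop described above concrete, here is a minimal, non-federated sketch of the idea, not the algorithm of Rain, FedRain, or Frog. It assumes a one-dimensional logistic regression, a toy dataset whose last two labels are deliberately corrupted, and a brute-force leave-one-out search standing in for whatever ranking method the actual frameworks use. The complaint ("all queried high-income customers should be approved") is relaxed to a sum of predicted approval probabilities. All names (`train`, `approval_mass`, `debug`) are hypothetical, and the security protocol is ignored entirely.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(xs, ys, steps=500, lr=0.5, l2=0.01):
    """Fit 1-D logistic regression (weight w, bias b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n + l2 * w
        gb = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= lr * gw
        b -= lr * gb
    return w, b


def approval_mass(model, query_xs):
    """Relaxed form of: SELECT COUNT(*) FROM q WHERE predict(income) = 1."""
    w, b = model
    return sum(sigmoid(w * x + b) for x in query_xs)


def debug(xs, ys, query_xs, k):
    """Greedily drop up to k training rows whose removal most raises the
    relaxed query result after retraining (brute force, for illustration)."""
    keep = list(range(len(xs)))
    removed = []

    def score(idx):
        model = train([xs[i] for i in idx], [ys[i] for i in idx])
        return approval_mass(model, query_xs)

    current = score(keep)
    for _ in range(k):
        best_i, best = None, current
        for i in keep:
            s = score([j for j in keep if j != i])
            if s > best:
                best_i, best = i, s
        if best_i is None:  # no single removal helps any further
            break
        keep.remove(best_i)
        removed.append(best_i)
        current = best
    return removed, current


# Toy data (assumed): scaled income -> loan approved (1) or rejected (0).
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 0, 0]  # the last two labels are deliberately corrupted
query_xs = [1.0, 1.5, 2.0]     # high-income customers named in the complaint

before = approval_mass(train(xs, ys), query_xs)
removed, after = debug(xs, ys, query_xs, k=2)
print("removed rows:", removed)
print("approval mass before/after:", round(before, 3), round(after, 3))
```

Retraining once per candidate row is only feasible at toy scale, and it requires all training data to sit in one place, which is exactly what a federated setting rules out.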


