Practical Federated Gradient Boosting Decision Trees

11/11/2019
by Qinbin Li, et al.

Gradient Boosting Decision Trees (GBDTs) have become very successful in recent years, winning many awards in machine learning and data mining competitions. Several recent studies have examined how to train GBDTs in the federated learning setting. In this paper, we focus on horizontal federated learning, where data samples with the same features are distributed among multiple parties. However, existing studies are not efficient or effective enough for practical use: they suffer either from inefficiency caused by costly data transformations such as secure sharing and homomorphic encryption, or from low model accuracy caused by differential privacy designs. We therefore study a practical federated environment with relaxed privacy constraints. In this environment, a dishonest party might obtain some information about the other parties' data, but it still cannot derive their actual raw data. Specifically, each party boosts a number of trees by exploiting similarity information based on locality-sensitive hashing. We prove that our framework is secure in that it does not expose the original records to other parties, while keeping the computation overhead of training low. Our experimental studies show that, compared with training each party's model on its local data alone, our approach significantly improves predictive accuracy and achieves accuracy comparable to the original GBDT trained on the data from all parties.
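To make the idea of "similarity information based on locality-sensitive hashing" more concrete, the following is a minimal sketch of how parties could compare records through hash signatures instead of raw feature vectors. The abstract does not specify the hash family or its parameters, so everything here is an assumption for illustration: the Gaussian (p-stable) family h(v) = floor((a·v + b) / r), the names `make_lsh` and `most_similar`, and the values of `num_hashes`, `bucket_width`, and `seed`. It is not the authors' implementation.

```python
import math
import random


def make_lsh(dim, num_hashes, bucket_width, seed):
    """Build a signature function from `num_hashes` Gaussian LSH functions.

    Each hash is h(v) = floor((a . v + b) / bucket_width), with `a` drawn from
    a standard normal and `b` uniform in [0, bucket_width). Parties that use
    the same seed obtain the same hash functions (an assumption of this sketch).
    """
    rng = random.Random(seed)
    params = []
    for _ in range(num_hashes):
        a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        b = rng.uniform(0.0, bucket_width)
        params.append((a, b))

    def signature(v):
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / bucket_width)
            for a, b in params
        )

    return signature


def most_similar(query_sig, candidate_sigs):
    """Return the index of the candidate sharing the most hash values with the query."""
    def matches(sig):
        return sum(1 for x, y in zip(query_sig, sig) if x == y)

    return max(range(len(candidate_sigs)), key=lambda i: matches(candidate_sigs[i]))


# Hypothetical usage: party B shares only signatures, never raw records.
sign_a = make_lsh(dim=3, num_hashes=32, bucket_width=4.0, seed=7)
sign_b = make_lsh(dim=3, num_hashes=32, bucket_width=4.0, seed=7)
party_b_records = [[900.0, -750.0, 1200.0], [1.0, 2.0, 3.0]]
party_b_sigs = [sign_b(r) for r in party_b_records]
closest = most_similar(sign_a([1.0, 2.0, 3.0]), party_b_sigs)
```

Nearby vectors fall into the same bucket with higher probability than distant ones, so the number of matching hash values serves as a rough similarity score. The signatures are lossy, which is why exchanging them reveals some information without handing over the original record.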


