Genomic Data Sharing under Dependent Local Differential Privacy

by Emre Yilmaz, et al.

Privacy-preserving genomic data sharing is essential for accelerating genomic research and, in turn, for paving the way towards personalized genomic medicine. In this paper, we introduce (ϵ, T)-dependent local differential privacy (LDP) for privacy-preserving sharing of correlated data and propose a genomic data sharing mechanism under this privacy definition. We first show that the original definition of LDP is not suitable for genomic data sharing, and then we propose a new mechanism to share genomic data. The proposed mechanism considers the correlations in the data during sharing, eliminates statistically unlikely data values beforehand, and adjusts the probability distributions for each shared data point accordingly. By doing so, we show that we can prevent an attacker from exploiting the correlations in the data to infer the correct values of the shared data points. By adjusting the probability distributions of the shared states of each data point, we also improve the utility of the shared data for the data collector. Furthermore, we develop a greedy algorithm that strategically determines the processing order of the shared data points with the aim of maximizing the utility of the shared data. Because sharing genomic data carries interdependent privacy risks, we also analyze how much information an attacker gains about the genomes of a donor's family members by observing the donor's perturbed data, and we propose a mechanism that selects the donor's privacy budget (i.e., the ϵ parameter of LDP) while also taking the privacy preferences of her family members into account. Our evaluation on a real-life genomic dataset shows that the proposed mechanism outperforms the randomized response mechanism (a widely used technique to achieve LDP).
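
For orientation, the randomized response baseline mentioned above can be sketched as follows. This is a minimal illustration of standard k-ary randomized response under ϵ-LDP, not the mechanism proposed in the paper: it perturbs each data point independently and ignores correlations. The three-state encoding of a SNP (0, 1, or 2 minor alleles) and the names `SNP_STATES`, `rr_probabilities`, and `randomized_response` are assumptions made for this example.

```python
import math
import random

# Assumed encoding: a SNP is reported as its number of minor alleles.
SNP_STATES = (0, 1, 2)


def rr_probabilities(true_state, epsilon, states=SNP_STATES):
    """Return the report distribution of k-ary randomized response.

    The true state is reported with probability e^eps / (e^eps + k - 1);
    every other state is reported with probability 1 / (e^eps + k - 1).
    The ratio between any two report probabilities is therefore at most
    e^eps, which is the eps-LDP condition for a single data point.
    """
    k = len(states)
    denom = math.exp(epsilon) + k - 1
    return {
        s: (math.exp(epsilon) if s == true_state else 1.0) / denom
        for s in states
    }


def randomized_response(true_state, epsilon, states=SNP_STATES, rng=random):
    """Sample one perturbed report for a single SNP value."""
    probs = rr_probabilities(true_state, epsilon, states)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]


if __name__ == "__main__":
    # Hypothetical donor genotype, perturbed point by point.
    genotype = [0, 2, 1, 1, 0]
    shared = [randomized_response(g, epsilon=1.0) for g in genotype]
    print(shared)
```

Because each point is perturbed on its own, an attacker who knows the correlations between SNPs may discard reported values that are statistically implausible; this is the weakness the abstract attributes to applying plain LDP to genomic data.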



Near-Optimal Privacy-Utility Tradeoff in Genomic Studies Using Selective SNP Hiding

Motivation: Researchers need a rich trove of genomic datasets that they ...

OptimShare: A Unified Framework for Privacy Preserving Data Sharing – Towards the Practical Utility of Data with Privacy

Tabular data sharing serves as a common method for data exchange. Howeve...

Collusion-Resilient Probabilistic Fingerprinting Scheme for Correlated Data

In order to receive personalized services, individuals share their perso...

GenShare: Sharing Accurate Differentially-Private Statistics for Genomic Datasets with Dependent Tuples

Motivation: Cutting the cost of DNA sequencing technology led to a quant...

Local Obfuscation Mechanisms for Hiding Probability Distributions

We introduce a formal model for the information leakage of probability d...

Online Context-aware Data Release with Sequence Information Privacy

Publishing streaming data in a privacy-preserving manner has been a key ...

Understanding Compressive Adversarial Privacy

Designing a data sharing mechanism without sacrificing too much privacy ...
