A Statistical Threshold for Adversarial Classification in Laplace Mechanisms
This paper studies the statistical characterization of detecting an adversary who wants to harm some computation such as machine learning models or aggregation by altering the output of a differentially private mechanism in addition to discovering some information about the underlying dataset. An adversary who is able to modify the published information from a differentially private mechanism aims to maximize the possible damage to the system while remaining undetected. We present a trade-off between the privacy parameter of the system, the sensitivity and the attacker's advantage (the bias) through determining the threshold for the best critical region of the hypothesis testing problem for deciding whether or not the adversary's attack is detected. Such trade-offs are provided for Laplace mechanisms using one-sided and two-sided hypothesis tests. Corresponding error probabilities are analytically derived and ROC curves are presented for various levels of the sensitivity, the absolute mean of the attack and the privacy parameter. Subsequently, we provide an interval for the bias induced by the adversary so that the defender detects the attack. Finally, we adapt the Kullback-Leibler differential privacy to adversarial classification.
READ FULL TEXT