CS SEMINAR

Differential Privacy Dynamics of Langevin Diffusion and Noisy Gradient Descent

Speaker
Miss Ye Jiayuan, PhD student, School of Computing, National University of Singapore
Chaired by
Dr Arnab BHATTACHARYYA, Associate Professor, School of Computing
arnab@comp.nus.edu.sg

31 Mar 2021 Wednesday, 04:00 PM to 05:00 PM

via Zoom

Abstract:
In an iterative machine learning algorithm (e.g., noisy gradient descent), the final released model leaks information about individual data records. As an estimate of this leakage, a differential privacy guarantee upper-bounds how much changing a single data record can influence the released model's distribution. Given an algorithm, we would like to derive the tightest differential privacy guarantee, so as to enable more training iterations under a constrained privacy budget and thus improve the released model's utility.

The fundamental difficulty of the privacy analysis lies in the complex random mapping from the dataset to the released model parameters after many training iterations. Because of this obstacle, most existing techniques [Bassily, Smith & Thakurta, 2014; Abadi et al., 2016] assume a strong adversary who observes the model parameters at every training iteration. They analyze the privacy loss of a single iteration and then compose it over multiple iterations. In these analyses, therefore, the privacy loss grows without bound as the number of training iterations increases.
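For intuition (this sketch is not from the talk), the growth of the composed privacy loss can be computed numerically. The functions below implement the standard basic and advanced composition bounds for $K$ iterations, each satisfying a per-iteration guarantee of $\epsilon$; the function names and parameter values are illustrative:

```python
import math

def basic_composition(eps_step, k):
    # Basic composition: per-iteration privacy losses add up,
    # so the total guarantee grows linearly in the iteration count k.
    return k * eps_step

def advanced_composition(eps_step, k, delta_slack):
    # Advanced composition [Dwork, Rothblum & Vadhan, 2010]:
    # at the cost of an extra failure probability delta_slack, the total
    # loss grows roughly like sqrt(k) for small per-step epsilon --
    # but still without bound as k increases.
    return (math.sqrt(2 * k * math.log(1 / delta_slack)) * eps_step
            + k * eps_step * (math.exp(eps_step) - 1))

# Either way, more iterations strictly increase the privacy loss.
total_basic = basic_composition(0.1, 100)
total_advanced = advanced_composition(0.1, 100, 1e-5)
```

This is exactly the limitation the abstract points to: under composition, training longer always costs more privacy budget.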

Our analysis overcomes this limitation by assuming that the adversary observes only the last model. We formulate the iterative evolution of the intermediate model parameters' probability density function, and use it to derive the differential privacy loss dynamics as the number of training iterations increases. For $K$ iterations of noisy gradient descent, we prove an exponentially converging differential privacy guarantee for strongly convex, smooth loss functions. We also prove by construction that this guarantee is a tight upper bound, in the sense that the exact differential privacy loss on a constructed dataset and loss function is of the same order. This talk is based on joint work with Rishav Chourasia and Reza Shokri; the paper is available at https://arxiv.org/abs/2102.05855.
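As a rough illustration of the algorithm being analyzed, here is a minimal noisy gradient descent loop on a strongly convex, smooth toy loss. The step size, noise scale, and the $\sqrt{2\eta}\,\sigma$ noise parameterization follow one common Langevin-style convention and are illustrative choices, not taken from the paper:

```python
import numpy as np

def noisy_gradient_descent(grad, theta0, eta, sigma, num_iters, seed=0):
    """Noisy GD: theta <- theta - eta * grad(theta) + sqrt(2*eta) * sigma * N(0, I).

    In the last-iterate analysis, only the final theta is released;
    the intermediate iterates stay hidden from the adversary."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for _ in range(num_iters):
        noise = np.sqrt(2 * eta) * sigma * rng.standard_normal(theta.shape)
        theta = theta - eta * grad(theta) + noise
    return theta

# Strongly convex, smooth toy loss: L(theta) = 0.5 * ||theta - c||^2,
# with gradient theta - c (c plays the role of the data-dependent minimizer).
c = np.array([1.0, -2.0])
theta_final = noisy_gradient_descent(lambda t: t - c, np.zeros(2),
                                     eta=0.1, sigma=0.05, num_iters=500)
```

For such a loss, the iterates contract toward the minimizer while the injected Gaussian noise keeps the final iterate's distribution spread out, which is the mechanism behind the converging (rather than unboundedly growing) privacy guarantee.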


Biodata:
Ye Jiayuan is a PhD student in computer science at the School of Computing, National University of Singapore, advised by Professor Reza Shokri. Her research interests lie in trustworthy machine learning, with a current focus on privacy-preserving machine learning.