Privacy Threats and Protection across Data Lifecycle
19 Jul 2019 Friday, 09:00 AM to 10:30 AM
COM2 Level 2
As massive amounts of data are aggregated into the cloud platforms that power our society, protecting the privacy of user data has become a major challenge. User data faces different threats at different stages of its lifecycle, which begins when the data is generated and continues as it is transmitted and eventually used. In this thesis, we investigate threats to data privacy and develop protection mechanisms across the data lifecycle. Privacy threats in the data lifecycle fall into two main types: explicit leakage in cloud platforms and implicit leakage in data processing. We develop isolation and inference techniques to address both types of threat. Specifically, this thesis makes three contributions. First, we propose an isolation framework for web systems that uses a trusted environment to protect sensitive user data in web sessions. Second, we apply machine learning algorithms to infer sensitive information from massive data. We develop an approach that combines machine learning and natural language processing techniques to automatically identify sensitive data in mobile applications by learning its semantic meaning. Finally, we study privacy leakage in machine learning models and propose an effective inversion attack against neural networks. Our attack precisely recovers input data using only the model's prediction on it, which seriously violates the privacy of the training and test data used by machine learning algorithms.