Privacy Threats and Protection across Data Lifecycle
11 Dec 2018 Tuesday, 10:00 AM to 11:30 AM
COM2 Level 4
Executive Classroom, COM2-04-02
Data is being aggregated in the cloud from a massive number of devices, applications, and services. Protecting user data privacy is one of the major challenges of the cloud era. User data faces different threats at different stages of the data lifecycle, which starts when the data is generated and continues as it is transmitted and eventually used. In this seminar, we investigate threats to data privacy and protection mechanisms across the data lifecycle. The sources of privacy threats in the data lifecycle are twofold: explicit leakage in cloud systems and implicit leakage in data manipulation algorithms. We develop isolation and inference techniques to address both threats. Specifically, we make three contributions. First, we propose an isolation framework for web systems that protects sensitive user data in web sessions using a trusted environment. Second, we apply machine learning algorithms to infer sensitive information from massive data: we develop an approach that combines machine learning and natural language processing techniques to automatically identify sensitive data in mobile applications by learning its semantic meaning. Finally, we study privacy leakage in machine learning models.