DOCTORAL SEMINAR

Incorporation of User Shared Content for Improved User Entity Resolution

Speaker
Mr Sapumal Ahangama
Advisor
Dr Danny Poo Chiang Choon, Associate Professor, School of Computing


29 Aug 2018 Wednesday, 03:00 PM to 04:30 PM

Executive Classroom, COM2-04-02

Abstract:

Widespread adoption of information systems has created vast amounts of digital data traces about users, their relationships and their personal activities. Interconnecting a user's digital data traces across different information systems is an open problem, commonly known as user entity resolution. Because prior research has paid only limited attention to user shared content, this thesis proposal aims to develop methods that incorporate user shared content into user entity resolution to further improve accuracy.

Prior studies on user entity resolution have followed two directions: first, approximate matching or blocking methods that reduce the search space, and second, methods that identify the exact matching user. Taking into account both the limitations and the differentiating power of user shared content, this thesis proposal presents two studies, one in each of these directions.

The first study is an approximate matching method for cross-domain datasets that efficiently searches for users in different localities of the data in the latent space of user shared content. The second study is a network embedding method that preserves both the associated text and the relational data, for both first-order and second-order proximity in the network.

Preliminary experimental results for both proposed studies indicate improvements over state-of-the-art methods. The outcome of these studies will contribute to a better understanding of how user shared content can be used in user entity resolution.