PH.D DEFENCE - PUBLIC SEMINAR

Analyzing Image Tweets in Microblogs

Speaker
Ms Chen Tao
Advisor
Dr Kan Min Yen, Associate Professor, School of Computing


18 Apr 2016 Monday, 02:00 PM to 03:30 PM

Executive Classroom, COM2-04-02

Abstract:

Social media platforms now allow users to share images alongside their textual posts. These image tweets make up a fast-growing percentage of tweets but, unlike their text-only counterparts, have not been studied in depth. In this thesis, we aim to answer four fundamental questions about image tweets: 1) What are the characteristics of image tweets? 2) How are the image and text in image tweets correlated? 3) How is an image tweet generated? 4) How can the semantics of image tweets be interpreted?

To answer the first question, we collect a large corpus of microblog posts from Western Twitter and Chinese Sina Weibo and perform a multipronged analysis of image tweets from the perspectives of image characteristics, user posting behaviors and textual content.

Through corpus analysis, we identify two key image-text relations for image tweets: visual relevance and emotional relevance. Given the practical value of visually relevant image tweets, we build an automated classifier that uses text, image and social context features to distinguish them from other image tweets.

We then develop Visual-Emotional LDA (VELDA), a novel topic model that captures the image-text correlation from multiple perspectives (namely, visual and emotional) to model the image tweet generation process. Experiments on real-world image tweets in both English and Chinese, and on other user-generated content, show that VELDA significantly outperforms existing methods in the task of cross-modality image retrieval.

Finally, we devise a context-aware image tweets modeling (CITING) framework to interpret the semantics of image tweets from both intrinsic and extrinsic contexts. To demonstrate the effectiveness of our framework, we focus on the task of personalized image tweet recommendation, developing a feature-aware matrix factorization model that encodes the contexts as part of user interest modeling. Extensive experiments on a large Twitter dataset show that our proposed method significantly improves recommendation performance.