Data Exploration with k-Nearest Neighbor Graph

Ms Ramon Bespinyowong
Dr Anthony Tung Kum Hoe, Professor, School of Computing

  12 Nov 2019 Tuesday, 02:00 PM to 03:30 PM

 SR6, COM1-02-03


The graph model is very intuitive, and expressiveness and has gained much attention recently. Researchers have also developed many graph mining and graph visualization algorithms. However, most of the data are still non-graph data and cannot be applied to those techniques directly. This thesis aims to deploy and develop a graph-based approach for exploring and visualizing the non-graph dataset, especially tabular data.

There are three parts to this work. Firstly, we showed that applying data mining algorithms, such as PageRank, on the k-nearest neighbor graph built from the tabular data, was useful and gave users insight about the data. Secondly, most applications, including our first work, construct the k-nearest neighbor graph based on only one distance function. How- ever, there are infinite weight vectors. We proposed a summary graph that considered all the possible weight vectors. In order to cater to differ- ent users' weight spaces, we implemented an interactive visualization tool allowing users to modify the graph directly and to see changes in the weight space. Thirdly, users can feel overwhelmed when seeing a cluttering graph. We defined a stable community, a community that exists in most of the weight space. We can group a stable community as a super-node and reduce the number of the displayed nodes. However, the complexity of community stability-related problems is still huge.

Our future work will focus on improving the complexity and studying the usefulness of community stability.