Querying Big-graphs - Streaming and Beyond

Assistant Professor Arijit Khan
School of Computer Science and Engineering
Nanyang Technological University

Chaired by
Dr HE Bingsheng, Associate Professor, School of Computing

  17 Nov 2017 Friday, 10:30 AM to 12:00 PM

 SR3, COM1-02-12


Graphs are a ubiquitous model to represent objects and their relations. However, the complex combinations of structure and content, coupled with massive volume, high streaming rate, and uncertainty inherent in the data, raise several challenges that require new efforts for smarter and faster graph querying.

Many graphs such as those formed by the activity on social networks, communication networks, and telephone networks are defined dynamically as rapid edge streams on a massive domain of nodes. Efficient processing of rapid and massive streams, often in space-limited architectures such as FPGAs, network interface cards, routers, and switches, requires approximation through succinct synopses created in a single-pass. In the first half of the talk, I shall discuss our novel synopsis structure that efficiently summarizes massive graph streams, while also retaining information about the structural behavior of the underlying graph dataset. I shall demonstrate how one can use our synopsis to determine important structural properties such as reachability over high-frequency edges. In the second half of the talk, I shall discuss our newest progress with faster and more accurate stream processing: How adding a pre-filtering stage which dynamically identifies and aggregates the most frequent items improves the accuracy and throughput of stream processing.


Arijit Khan is an assistant professor in the School of Computer Science and Engineering, Nanyang Technological University, Singapore. He earned his PhD from the Department of Computer Science, University of California, Santa Barbara, USA, and did a post-doc in the Systems group at ETH Zurich, Switzerland. Arijit is the recipient of the prestigious IBM PhD Fellowship in 2012-13. He published several papers in premier databases and data mining conferences and journals including SIGMOD, VLDB, TKDE, ICDE, SDM, EDBT, and CIKM. Arijit co-presented tutorials on emerging graph queries and big-graph systems at ICDE 2012, and VLDB (2017, 2015, and 2014). He served in the program committee of KDD, SIGMOD, VLDB, ICDE, ICDM, EDBT, WWW, and CIKM. Arijit served as the co-chair of Big-O (Q) workshop co-located with VLDB 2015, and contributed a chapter on Big-Graphs querying and mining in the Springer Handbook of Big Data Technologies. He was invited to give tutorials in the Asia Pacific Web and Web-Age Information Management Joint Conference on Web and Big Data (APWeb-WAIM 2017), and in the International Conference on Management of Data (COMAD 2016).