CS SEMINAR

From Deep Data Integration to Using LLMs to Query Unstructured and Structured Data

Speaker
Tan Wang-Chiew (Research Scientist @ Meta AI)
Chaired by
Dr HSU, Wynne, Provost's Chair Professor, School of Computing
whsu@comp.nus.edu.sg

26 Sep 2023 Tuesday, 10:00 AM to 11:30 AM

Multipurpose Hall 1, 2 and 3 (COM3 01-26, 01-27 and 01-28)

Interaction Tea: 11:30AM – 12PM

Abstract
In recent years, we have witnessed the widespread adoption of deep learning techniques as avant-garde solutions to different computational problems. In data integration, the use of deep learning techniques has helped establish several state-of-the-art results in long-standing problems, including information extraction, entity matching, data cleaning, and table understanding. In this talk, I will reflect on the strengths of deep learning and how they have helped move the needle forward in data integration. I will also discuss a few challenges associated with solutions based on deep learning techniques and describe some opportunities for future work.
Recently, Large Language Models (LLMs) have emerged as a powerful tool for accessing parametric knowledge, but the potential of tapping into the vast expanse of external or private data remains largely unexplored. This talk presents an open-source question-answering system for seamlessly integrating model parameters with knowledge from external data sources to enhance its predictive capabilities. Our larger vision transcends question answering. We envision a personal insight assistant, adept at sifting through your past data to offer invaluable insights to help make informed decisions and plan with foresight.

Biodata
Wang-Chiew is a research scientist at Meta AI. Before that, she was the Head of Research at Megagon Labs, where she led the research efforts on building advanced technologies to enhance search by experience. Prior to joining Megagon Labs, she was a Professor of Computer Science at the University of California, Santa Cruz. She also spent two years at IBM Research - Almaden. She received her B.Sc. (First Class) in Computer Science from the National University of Singapore and her Ph.D. in Computer Science from the University of Pennsylvania. Her research interests include data integration and exchange, data provenance, and natural language processing. She is the recipient of an NSF CAREER award, a Google Faculty Award, and an IBM Faculty Award. She has co-authored best papers and is a co-recipient of the 2014 ACM PODS Alberto O. Mendelzon Test-of-Time Award, the 2018 ICDT Test-of-Time Award, and the 2020 Alonzo Church Award. She received the 2019 VLDB Women in Database Research Award. She was on the VLDB Board of Trustees (2014-2019) and is a Fellow of the ACM.