Capturing and Exploring Textual Information

Professor Ido Dagan
Department of Computer Science
Bar-Ilan University

Chaired by
Dr NG Hwee Tou, Provost's Chair Professor, School of Computing

Thursday, 12 Jul 2018, 02:00 PM to 03:00 PM

MR1, COM1-03-19


In this talk I will describe work on two research lines, both of which fit within a broader program aimed at representing and utilizing the consolidated information conveyed in sets of texts. The first line develops a natural-language-based, information-oriented meaning representation for sentences. Unlike typical semantic and knowledge representations, which rely on pre-specified schemata for elements and relations, our approach structures and relates textual elements using natural language question-answer pairs, following and extending the recent QA-SRL paradigm. This approach is amenable to annotation by non-expert crowdsourcing, while still relying on human interpretations of meaning.

The second research line sets up a framework for developing and consistently evaluating interactive summarization methods, targeting effective exploration of multi-text content. While previous evaluations of interactive summarization relied on subjective user assessments, we cast interactive summarization as producing incrementally growing static summary snapshots. We then propose to evaluate these snapshots automatically, in line with current summary evaluation practices, motivated by a recent finding on the reliability of ROUGE scores across varying summary lengths. For human-based evaluation of summary content, we aim to use easy-to-answer questionnaires that mimic sampling of Pyramid evaluation.


Ido Dagan is a Professor at the Department of Computer Science at Bar-Ilan University, Israel, and a Fellow of the Association for Computational Linguistics (ACL). His interests are in applied semantic processing, focusing on textual inference, natural-language-based knowledge representation and acquisition, and text exploration. Dagan and colleagues established the textual entailment recognition paradigm. He was the President of the ACL in 2010 and served on its Executive Committee during 2008-2011. In that capacity, he led the establishment of the journal Transactions of the Association for Computational Linguistics. Dagan received his B.A. summa cum laude and his Ph.D. (1992) in Computer Science from the Technion. He was a research fellow at the IBM Haifa Scientific Center (1991) and a Member of Technical Staff at AT&T Bell Laboratories (1992-1994). During 1998-2003 he was co-founder and CTO of FocusEngine and VP of Technology of LingoMotors. He currently heads the initiative to set up the Bar-Ilan University Data Science Institute.