The 2018 Singapore Symposium on Natural Language Processing

Speakers

Professor Ido Dagan
Bar Ilan University, Israel

Ms Linlin Li
Alibaba, China

Mr Thang Luong
Google Brain, USA

Professor Vincent Ng
University of Texas, USA

Associate Professor Noah Smith
University of Washington, USA

Professor Junichi Tsujii
National Institute of Advanced Industrial Science and Technology, Japan

Contact Person

Dr KAN Min-Yen, Associate Professor, School of Computing

kanmy@comp.nus.edu.sg

13 Jul 2018 Friday, 09:00 AM to 05:15 PM

Singapore University of Technology and Design, 8 Somapah Road, Singapore 487372

For more details on The 2018 Singapore Symposium on Natural Language Processing, please go to http://event.statnlp.org/.

Free registration at the website.

9.00am - 9.15am
Welcome and Opening Remarks

9.15am - 9.50am
Invited Talk (Linlin Li: The text processing engine that powers Alibaba's business applications)
AliNLP is a large-scale NLP technology platform for the entire ecosystem of Alibaba. It covers the major NLP areas of lexical, syntactic, semantic, and discourse analysis, as well as distributed representation of text. AliNLP v0.1 was released in Dec 2016 and v4.0 was released in Aug 2017. As of Jun 2018, it is used in 250+ business scenarios, with more than 500 billion API calls per day in Alibaba. The AliNLP platform supports both monolingual and cross-lingual applications, and it is widely used not only in e-commerce but also in other vertical domains such as retail, healthcare, and security.

9.50am - 10.15am
Tea Break

10.15am - 10.50am
Invited Talk (Ido Dagan: Consolidating Textual Information)
How can we effectively capture the information expressed in multiple texts? How can we allow people, as well as computer applications, to easily explore it? The current semantic NLP pipeline typically ends at the single sentence or text level, putting the burden on applications to consolidate related information across different texts. Further, semantic representations, which may provide the basis for text consolidation, are often based on non-trivial pre-specified schemata, which require expert annotation and hence complicate the creation of large-scale corpora for training. In this talk, I will outline a research program, titled Natural Knowledge, whose goals are to represent the consolidated information conveyed in multiple texts and to communicate it effectively to users. This program consists of three novel research lines. First, we aim to establish a "natural" semantic representation for individual texts, which is based solely on crowdsourceable natural language expressions rather than on pre-specified schemata. To that end, we follow and extend the recent Question-Answer Semantic Role Labeling (QA-SRL) approach, through which we decompose sentence information into minimal question-answer pairs, each representing an "atomic" statement. Second, we are developing principles for consolidating the information structures of different texts, requiring substantial extension of the expressiveness and performance of cross-text co-reference detection. This yields a consolidated structure that bears similarities to traditional knowledge graphs, where representations correspond to real-world elements and statements relating them. Third, we are developing a framework for interactive exploration of the consolidated content, including methodologies for evaluating interactive presentation of information.
In the talk I will provide an overview of the framework and its three research lines and illustrate some concrete research tasks.

10.50am - 11.25am
Invited Talk (Noah A. Smith: Syncretizing Structured and Learned Representations)
My group's research targets automated understanding of natural language text, in both general-purpose and application-driven settings. In this talk, I will describe new ways to use representation learning for NLP. Noting that a data-driven model always assumes a theory (not necessarily a good one), I will argue for language-appropriate inductive bias in representation-learning-infused models of language. I'll focus on how representation learning and structure combine to achieve new states of the art on broad-coverage semantic analysis tasks.

11.25am - 12.00pm
Invited Talk (Vincent Ng: Towards Content-Based Essay Scoring)
State-of-the-art automated essay scoring engines such as E-rater do not grade essay content, focusing instead on providing diagnostic trait feedback on categories such as grammar, usage, mechanics, style and organization. Content-based essay scoring is very challenging: it requires an understanding of essay content and is beyond the reach of today's automated essay scoring technologies. As a result, content-dependent dimensions of essay quality are largely ignored in existing automated essay scoring research. In this talk, we describe our recent and ongoing efforts on content-based essay scoring, sharing the lessons we learned from automatically scoring arguably one of the most important content-dependent dimensions of persuasive essay quality, argument persuasiveness.

12.15pm - 1.30pm
Lunch Break and Poster/Demo Session

1.30pm - 2.15pm
Industry Panel Discussion

2.15pm - 3.15pm
Tea Break and Poster/Demo Session

3.15pm - 3.50pm
Invited Talk (Junichi Tsujii: Artificial Intelligence Embedded in the Real World)
Artificial intelligence technologies are being actively deployed in a broad spectrum of fields, including logistics and financial services, medical care and nursing, manufacturing, education, and scientific research, to name only a few. These technologies also play a critical role in the efficient and effective operation of social systems such as controlling flows of people, things, and energy. The Artificial Intelligence Research Center (AIRC) strives to serve as one of the core research bases in Japan, with a focus on the artificial intelligence technologies that have become fundamental technologies in today's society. Through a systematic approach, we conduct integrated research and development that embraces the entire spectrum from basic research to application, and emphasize the importance of deploying AI technologies in society.

3.50pm - 4.25pm
Invited Talk (Thang Minh Luong: The Google Brain QANet systems and reading comprehension)
Abstract TBA

4.25pm - 5.00pm
Academic Panel Discussion

5.00pm - 5.15pm
Closing Remarks