Do Large Language Models "Understand" Word Senses?
Abstract
Large Language Models (LLMs) appear to resolve lexical ambiguity effortlessly, raising a provocative question: has Word Sense Disambiguation (WSD) become obsolete? In this talk, we challenge that assumption by jointly examining WSD and Machine Translation (MT) as two facets of the same unresolved problem: sense-level semantic understanding. We evaluate instruction-tuned LLMs on classic and generative WSD settings, comparing them to specialized systems, and analyze how lexical ambiguity manifests in MT, especially for rare or non-predominant senses. Despite strong surface-level performance, both disambiguation and translation reveal systematic failures: dominant-sense bias, brittleness under domain shift, and sensitivity to evaluation framing.
Our analysis shows that standard task-level metrics often mask these weaknesses, giving a misleading impression of semantic competence. We argue that WSD is not dead; rather, it has been transformed into a diagnostic lens for probing what LLMs truly understand about word meaning. By exposing hidden biases and failure modes across languages and tasks, sense-centric evaluation remains essential for assessing robustness, interpretability, and genuine lexical-semantic understanding in the LLM era.
Bio
Roberto Navigli is Professor of Natural Language Processing at the Sapienza University of Rome, where he leads the Sapienza NLP Group. He has received two ERC grants on multilingual semantics, highlighted among the 15 projects through which the ERC has transformed science, as well as several prizes from top journals and conferences. He leads the Italian Minerva LLM Project, the first LLM pre-trained in Italian and the only public, open-source LLM project in Italy, and is the Scientific Director and co-founder of Babelscape, a successful deep-tech company focused on neuro-symbolic NLP. He is a Fellow of ACL, AAAI, ELLIS, and EurAI, and served as General Chair of ACL 2025.