The ability to interpret word meanings in context is a core yet underexplored challenge for Large Language Models (LLMs). While these models demonstrate remarkable linguistic fluency, the extent to which they genuinely grasp word semantics remains an open question. In this talk, we investigate the disambiguation capabilities of state-of-the-art instruction-tuned LLMs, benchmarking their performance against specialized systems designed for Word Sense Disambiguation (WSD). We also examine lexical ambiguity as a persistent challenge in Machine Translation (MT), particularly for rare or context-dependent word senses. Through an in-depth error analysis of both disambiguation and translation tasks, we reveal systematic weaknesses in LLMs, shedding light on the fundamental challenges they face in semantic interpretation. Furthermore, we show that standard evaluation metrics fall short in capturing disambiguation performance, reinforcing the need for more targeted evaluation frameworks. By presenting dedicated testbeds, we introduce more effective ways to assess lexical understanding both within and across languages. With this talk, we highlight the gap between the impressive fluency of LLMs and their actual semantic comprehension, raising important questions about their reliability in critical applications.