Tutorial on Uncertainty Quantification for Large Language Models

Session Information

Large language models (LLMs) are widely used in NLP applications, but their tendency to produce hallucinations poses significant challenges to their reliability and safety, ultimately undermining user trust. This tutorial offers the first systematic introduction to uncertainty quantification (UQ) for LLMs in text generation tasks -- a conceptual and methodological framework that provides tools for communicating the reliability of a model's answer. This additional output can be leveraged for a range of downstream tasks, including hallucination detection and selective generation.

We begin with the theoretical foundations of uncertainty, highlighting why techniques developed for classification might fall short in text generation. Building on this grounding, we survey state-of-the-art white-box and black-box UQ methods, from simple entropy-based scores to supervised probes over hidden states and attention weights, and show how they enable selective generation and hallucination detection. Additionally, we discuss the calibration of uncertainty scores for better interpretability.
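To make the simplest white-box scores concrete, the sketch below (illustrative only, not taken from the tutorial materials) computes two common sequence-level uncertainty scores with Hugging Face transformers: the length-normalized negative log-likelihood and the mean token entropy of a generated answer. The model name and prompt are placeholders.

```python
# Minimal sketch of two white-box uncertainty scores for a generated answer:
# length-normalized negative log-likelihood and mean token entropy.
# Model name and prompt are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "Question: What is the capital of Australia? Answer:"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    gen = model.generate(
        **inputs,
        max_new_tokens=20,
        do_sample=False,
        return_dict_in_generate=True,
        output_scores=True,
    )

# Per-step logits over the vocabulary for each generated token.
scores = torch.stack(gen.scores, dim=0).squeeze(1)           # (steps, vocab)
log_probs = torch.log_softmax(scores, dim=-1)                 # per-step log-probs
gen_tokens = gen.sequences[0, inputs["input_ids"].shape[1]:]  # generated token ids

# 1) Length-normalized negative log-likelihood of the generated sequence.
token_logp = log_probs[torch.arange(len(gen_tokens)), gen_tokens]
nll = -token_logp.mean().item()

# 2) Mean token entropy: average entropy of the predictive distribution per step.
entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean().item()

print("Answer:", tokenizer.decode(gen_tokens, skip_special_tokens=True))
print(f"Length-normalized NLL: {nll:.3f}   Mean token entropy: {entropy:.3f}")
```

Higher values of either score suggest a less confident generation; both are among the standard baselines against which more sophisticated white-box and black-box methods are compared.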

A key feature of the tutorial is practical examples using LM-Polygraph, an open-source framework that unifies more than a dozen recent UQ and calibration algorithms and provides a large-scale benchmark, allowing participants to implement UQ in their applications, as well as reproduce and extend experimental results with only a few lines of code. 
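As a preview of that usage pattern, the snippet below follows the style shown in the LM-Polygraph README (https://github.com/IINemo/lm-polygraph): wrap a Hugging Face model and pass it, together with an interchangeable estimator, to a single scoring call. The class and function names are assumed from the public README and may differ between versions, so treat them as a sketch rather than a verified API.

```python
# Sketch of LM-Polygraph-style usage, following the pattern in the project
# README; names are assumptions and may change between library versions.
from lm_polygraph.utils.model import WhiteboxModel
from lm_polygraph.estimators import MeanTokenEntropy
from lm_polygraph.utils.manager import estimate_uncertainty

# Wrap any Hugging Face causal LM (model name and device are placeholders).
model = WhiteboxModel.from_pretrained("bigscience/bloomz-560m", device="cuda:0")

# Choose one of the implemented estimators; here, mean token entropy.
estimator = MeanTokenEntropy()

# Returns the generated text together with its uncertainty score.
result = estimate_uncertainty(model, estimator, input_text="When did Albert Einstein die?")
print(result)
```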

By the end of the session, researchers and practitioners will be equipped to (i) evaluate and compare existing UQ techniques, (ii) develop new methods, and (iii) implement UQ in their code for deploying safer, more trustworthy LLM-based systems.
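As a rough illustration of point (iii), selective generation often amounts to thresholding an uncertainty score and abstaining (or deferring to a human or a retrieval step) when the score is too high. The sketch below is a generic pattern rather than tutorial code; `generate_with_uncertainty` stands in for any of the scoring approaches above.

```python
# Generic selective-generation wrapper: answer only when the uncertainty
# score is at or below a threshold, otherwise abstain. The callable
# `generate_with_uncertainty` is a placeholder for any scoring approach
# discussed above (entropy-based scores, probes, LM-Polygraph estimators, ...).
from typing import Callable, Optional, Tuple

def selective_generate(
    question: str,
    generate_with_uncertainty: Callable[[str], Tuple[str, float]],
    threshold: float,
) -> Optional[str]:
    """Return the model's answer, or None (abstain) if uncertainty is too high."""
    answer, uncertainty = generate_with_uncertainty(question)
    return answer if uncertainty <= threshold else None

if __name__ == "__main__":
    # Toy scorer standing in for a real model call.
    def toy_scorer(question: str) -> Tuple[str, float]:
        return "Canberra", 0.4

    print(selective_generate("What is the capital of Australia?", toy_scorer, threshold=1.0))
    # In practice the threshold is tuned on held-out data to hit a target
    # risk (error rate among answered questions) or coverage level.
```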


Website: https://sites.google.com/view/ecir-2026tutorial

Mar 29, 2026, 13:30 - 17:00 (Europe/Amsterdam)
Venue: Lecture room C

Sub Sessions

Uncertainty Quantification for Large Language Models

Tutorials | 01:30 PM - 05:00 PM (Europe/Amsterdam) | 2026/03/29 11:30:00 UTC - 2026/03/29 15:00:00 UTC
"Uncertainty quantification has gained increasing importance in natural language processing (NLP), offering a conceptual and methodological framework to address critical issues such as hallucinations in the answers of LLMs, detection of low-quality responses, out-of-distribution detection, and reducing response latency, among others. While UQ for text classification models in NLP has been covered in previous tutorials, applying UQ to LLMs poses far greater challenges. This complexity stems from the fact that LLMs generate sequences of conditionally dependent predictions with varying levels of importance. As a result, many UQ techniques that are effective for classification models are either ineffective or not directly applicable to LLMs. In this tutorial, we cover foundational concepts of UQ for LLMs, present state-of-the-art techniques, demonstrate practical applications of UQ in various tasks, and equip researchers and practitioners with tools for developing new UQ methods and harnessing uncertainty in various contexts. Recently, retrieval-augmented generation (RAG) systems have become the backbone of many modern LLM-based applications. Augmenting inputs to the model with information retrieved from additional sources poses unique challenges and opportunities for UQ. In this edition of the tutorial, we cover the techniques most suitable for RAG-based LLMs and touch upon applications of uncertainty in agentic frameworks. Through this tutorial, we aim to lower the barrier to entry into UQ research and applications for individual researchers and developers."
Presenters
Maxim Panov, Assistant Professor, MBZUAI

Co-Authors
Artem Shelmanov, Assistant Professor, MBZUAI
Roman Vashurin, Senior Research Engineer, MBZUAI
Artem Vazhentsev, Postdoctoral Associate, MBZUAI
Ekaterina Fadeeva, ETH Zurich
Lyudmila Rvanova
Timothy Baldwin