Talmud-IR: A Talmud-Inspired Interface for Discussing RAG Response Quality

This abstract has open access

Abstract Summary

Retrieval-augmented generation (RAG) systems promise factually grounded answers, yet evaluating their quality remains difficult. Automated metrics and LLM-as-judge approaches offer scalability but risk circularity, benchmark leakage, and loss of diversity. Human assessors, meanwhile, often struggle to notice subtle omissions or hallucinations when responses appear linguistically fluent and confident. We present Talmud-IR, a novel user interface inspired by the dialogic structure of the Talmud. It visualizes RAG outputs as a central text surrounded by layers of evidence, commentary, and meta-assessment, enabling sustained human-LLM discussion about system quality and failure priorities. The prototype supports comparative RAG evaluation, collaborative exploration of "unknown unknowns," and pedagogical use for teaching critical reading of AI-generated content.

Abstract ID:

NKDR147

Submission Type

Demos

Submission Topics

Associated Sessions

Poster Session (Demos)

Author
Co-Authors

NASK National Research Institute

Niklas Deckers

Uni-Kassel

Maik Fröbe

PhD Student, Friedrich-Schiller-Universität Jena

Dr. Laura Dietz

Associate Professor, University of New Hampshire

Birte Platow

TU Dresden

Mark Sanderson

RMIT University

Abstracts With Same Type

Abstract ID

Abstract Title

Abstract Topic

Submission Type

Primary Author

NKDR143

CancerRAGent: Evidence-Linked and Safety-Guided Oncology Question Answering

Applications, Machine Learning and Large Language Models, Recommender systems, Search and ranking

Demos

Trung Vo

NKDR166

CitiLink: Enhancing Municipal Transparency and Citizen Engagement through Searchable Meeting Minutes

Applications, Machine Learning and Large Language Models, Search and ranking, Societally-motivated IR research

Demos

Rodrigo Silva

NKDR168

Context Engineering for Agentic Data Science

Demos

Rishiraj Saha Roy

NKDR169

Creating Specialized RAG-Based Search Engines Using the Open Web Index

Demos

Mr. Alexander Nussbaumer

NKDR156

Enhancing Job Search Effectiveness with LLM-Powered Context-Aware Query Reformulation

Applications, Machine Learning and Large Language Models, Search and ranking, System aspects

Demos

Quang Hieu Vu

NKDR209

GutBrainKB: Exploring the Gut-Brain Interaction through a Reliable Biomedical KB

Demos

Ornella Irrera

NKDR159

ImageSeek: A Hybrid Text-to-Image Image Retrieval System for Domain-Specific Collections

Applications, Machine Learning and Large Language Models, Search and ranking

Demos

Rodrigo Duarte

NKDR160

LectureChat: Hybrid RAG over Wikipedia and Multilingual Lectures

Applications, Conversational search and recommender systems, Societally-motivated IR research

Demos

Markos Dimitsas

NKDR163

MedNuggetizer: Confidence-Based Information Nugget Extraction from Medical Documents

Demos

Gregor Donabauer
