Accepted Papers

This heading text can be changed from Forms > User instructions

Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets

RAG systems are increasingly evaluated and optimized using LLM judges, an approach that is rapidly becoming the dominant paradigm for system assessment. Nugget-based approaches in particular are now embedded not only in evaluation frameworks but also in the architectures of RAG systems themselves. W...

Evaluation research
Full papers

Dr. Laura Dietz

Reducing Human Effort to Validate LLM Relevance Judgements via Stratified Sa...

NKDR45

Information Retrieval (IR) evaluation deeply relies on human-made relevance judgments. To overcome the high costs of the judgment collection process, a potential solution is to utilize LLMs as judges to replace human annotators. However, the validation of LLM-generated judgments is fundamental for i...

Evaluation research
Full papers

Simone Merlo

Validating Search Query Simulations: A Taxonomy of Measures

NKDR31

Assessing the validity of user simulators when used for the evaluation of information retrieval systems remains an open question, constraining their effective use and the reliability of simulation-based results. To address this issue, we conduct a comprehensive literature review with a particular fo...

Evaluation research
Full papers

Andreas Konstantin Kruff

Display #