SuiteEval: Simplifying Retrieval Benchmarks

Abstract Summary
Information retrieval evaluation often suffers from fragmented practices (varying dataset subsets, aggregation methods, and pipeline configurations) that undermine reproducibility and comparability, especially for foundation embedding models, which require robust out-of-domain performance. We introduce SuiteEval, a unified framework that offers automatic end-to-end evaluation, dynamic indexing that reuses on-disk indices to minimise disk usage, and built-in support for major benchmarks (BEIR, LoTTE, MS MARCO, NanoBEIR, and BRIGHT). Users need only supply a pipeline generator; SuiteEval handles data loading, indexing, ranking, metric computation, and result aggregation, and new benchmark suites can be added in a single line. By reducing boilerplate and standardising evaluations, SuiteEval facilitates reproducible IR research at a time when evaluation over a broader set of benchmarks is increasingly expected.
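As a rough illustration of the workflow the abstract describes, the sketch below shows how a pipeline-generator interface might look. Every name in it (the suiteeval module, the Suite class, the bm25 constructor) is an assumption made for illustration, not SuiteEval's documented API.

```python
# Purely illustrative sketch: the module, class, and method names below
# are assumptions for illustration, not SuiteEval's documented API.
from suiteeval import Suite  # hypothetical import


def pipeline_generator(index):
    # The user supplies only this: given an index that the framework has
    # built (or reused from disk), yield the retrieval pipeline(s) to
    # evaluate against it.
    yield index.bm25()  # hypothetical placeholder pipeline


# Hypothetically, a built-in benchmark suite is selected by name,
suite = Suite("NanoBEIR")

# and evaluation then runs end to end: data loading, indexing (reusing
# on-disk indices where possible), ranking, metric computation, and
# result aggregation across the suite's datasets.
results = suite(pipeline_generator)
print(results)
```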
Abstract ID: NKDR165
Submission Type: Demos

Senior Lecturer, University of Glasgow

Abstracts With Same Type (all Demos)

Abstract ID | Primary Author | Submission Topics
NKDR143 | Trung Vo | Applications; Machine Learning and Large Language Models; Recommender systems; Search and ranking
NKDR166 | Rodrigo Silva | Applications; Machine Learning and Large Language Models; Search and ranking; Societally-motivated IR research
NKDR168 | Rishiraj Saha Roy | (none listed)
NKDR156 | Quang Hieu Vu | Applications; Machine Learning and Large Language Models; Search and ranking; System aspects
NKDR159 | Rodrigo Duarte | Applications; Machine Learning and Large Language Models; Search and ranking
NKDR160 | Markos Dimitsas | Applications; Conversational search and recommender systems; Societally-motivated IR research