RoutIR: Fast Serving of Retrieval Pipelines for Retrieval-Augmented Generation

This abstract has open access
Abstract Summary
Retrieval models are key components of Retrieval-Augmented Generation (RAG) systems, which generate search queries, process the documents returned, and generate a response. RAG systems are often dynamic and may involve multiple rounds of retrieval. While many state-of-the-art retrieval methods are available through academic IR platforms, these platforms are typically designed for the Cranfield paradigm, in which all queries are known up front and can be batch processed offline. This simplification accelerates research but leaves state-of-the-art retrieval models unable to support downstream applications that require online services, such as arbitrary dynamic RAG pipelines that involve looping, feedback, or even self-organizing agents. In this work, we introduce RoutIR, a Python package that provides a simple and efficient HTTP API that wraps arbitrary retrieval methods, including first-stage retrieval, reranking, query expansion, and result fusion. Given a minimal JSON configuration file that specifies the retrieval models to serve, RoutIR can construct and query retrieval pipelines on the fly using any available models (e.g., fusing the results of several first-stage retrieval methods and then reranking). By default, the API performs asynchronous query batching and result caching. While many state-of-the-art retrieval methods are already supported by the package, RoutIR can also be extended by implementing the Engine abstract class. The package is publicly available on GitHub: http://github.com/hltcoe/routir.
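To make the idea of asynchronous query batching with result caching concrete, the following is a minimal sketch in plain Python asyncio. It is not RoutIR's implementation or API: the `Engine` interface shown here, along with `ToyEngine`, `BatchingService`, `search_batch`, `max_wait`, and `demo`, are hypothetical names chosen for illustration, and the term-overlap scoring is a stand-in for a real retrieval model.

```python
import asyncio
from abc import ABC, abstractmethod


class Engine(ABC):
    """Hypothetical retrieval interface (an assumption, not RoutIR's actual class)."""

    @abstractmethod
    async def search_batch(self, queries: list[str], limit: int) -> list[dict[str, float]]:
        """Return one {doc_id: score} mapping per query, in query order."""


class ToyEngine(Engine):
    """Scores documents by term overlap and records the size of each batch it sees."""

    def __init__(self, docs: dict[str, str]):
        self.docs = docs
        self.batch_sizes: list[int] = []

    async def search_batch(self, queries, limit):
        self.batch_sizes.append(len(queries))
        results = []
        for query in queries:
            terms = set(query.lower().split())
            scores = {
                doc_id: float(len(terms & set(text.lower().split())))
                for doc_id, text in self.docs.items()
            }
            # Highest score first; ties broken by document ID for determinism.
            top = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
            results.append(dict(top))
        return results


class BatchingService:
    """Collects concurrent queries for a short window, then issues one batched call."""

    def __init__(self, engine: Engine, limit: int = 2, max_wait: float = 0.01):
        self.engine = engine
        self.limit = limit
        self.max_wait = max_wait
        self.cache: dict[str, dict[str, float]] = {}
        self.pending: list[tuple[str, asyncio.Future]] = []
        self.flush_task = None

    async def search(self, query: str) -> dict[str, float]:
        if query in self.cache:  # cache hit: no engine call needed
            return self.cache[query]
        future = asyncio.get_running_loop().create_future()
        self.pending.append((query, future))
        if self.flush_task is None:  # the first waiter schedules the flush
            self.flush_task = asyncio.create_task(self._flush_later())
        return await future

    async def _flush_later(self):
        await asyncio.sleep(self.max_wait)  # let concurrent requests accumulate
        batch, self.pending, self.flush_task = self.pending, [], None
        unique = list(dict.fromkeys(query for query, _ in batch))  # deduplicate, keep order
        try:
            results = await self.engine.search_batch(unique, self.limit)
        except Exception as exc:
            for _, future in batch:  # propagate the failure to every waiter
                future.set_exception(exc)
            return
        self.cache.update(zip(unique, results))
        for query, future in batch:
            future.set_result(self.cache[query])


async def demo():
    engine = ToyEngine({
        "d1": "neural retrieval models",
        "d2": "query expansion",
        "d3": "result fusion",
    })
    service = BatchingService(engine)
    first = await asyncio.gather(
        service.search("neural retrieval"),
        service.search("result fusion"),
        service.search("neural retrieval"),  # duplicate within the same window
    )
    again = await service.search("result fusion")  # served from the cache
    return engine.batch_sizes, first, again
```

Running `asyncio.run(demo())` sends three concurrent requests that the service merges into a single engine call over two unique queries, and the follow-up request is answered from the cache. The fixed wait window is one simple batching policy among several; a production service would more likely also cap the batch size and bound the cache.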
Abstract ID: NKDR133
Research Scientist, Human Language Technology Center Of Excellence, Johns Hopkins University
Johns Hopkins University, HLTCOE
Senior Research Scientist, HLTCOE At Johns Hopkins University
Principal Computer Scientist, JHU HLTCOE
Johns Hopkins University
