Information Retrieval (IR) evaluation benchmarks have been analyzed extensively, using many different approaches. Benchmarking also plays a central role in evaluating the capabilities of Large Language Models (LLMs). However, concerns have recently emerged regarding the robustness of benchmarks a...
IR evaluation, Short papers