Accepted Papers

Analyzing AI Evaluation Benchmarks Through Information Retrieval and Network S...

Many analyses have been performed on Information Retrieval (IR) evaluation benchmarks, with many different approaches. Benchmarking also plays a central role in evaluating the capabilities of Large Language Models (LLMs). However, recent concerns have emerged regarding the robustness of benchmarks a...

IR evaluation
Short papers

Gaia Simeoni

Correct but Incomplete: Why Chain-of-Thought Cannot Currently Support Audita...

NKDR118

Large Language Models (LLMs) are increasingly promoted for knowledge-intensive reasoning tasks. Effective oversight requires faithful reasoning traces which show how answers are actually produced. Chain-of-Thought (CoT) prompting is positioned as a technique to promote both accuracy and transparency...

IR evaluation
Short papers

Edward Richards

Structure-aware Pre-Retrieval Performance Prediction on Query Affinity Graphs

NKDR111

IR evaluation
Short papers

Abbas Saleminezhad
