MemTool: Optimizing Short-Term Memory Management for Dynamic Tool Retrieval and Invocation in LLM Agent Multi-Turn Conversations

This abstract has open access
Abstract Summary
Large Language Model (LLM) agents have shown significant autonomous capabilities in dynamically retrieving and utilizing relevant tools or Model Context Protocol (MCP) servers for individual queries. However, fixed context windows limit effectiveness in multi-turn interactions requiring repeated, independent tool usage. We introduce MemTool, a short-term memory framework enabling LLM agents to dynamically retrieve and manage tools or MCP server contexts across multi-turn conversations, outperforming previous state-of-the-art tool retrieval approaches that lack multi-turn support and memory management of available tools or MCPs. MemTool offers three agentic architectures: 1) Autonomous Agent Mode, granting full tool management autonomy, 2) Workflow Mode, providing deterministic control without autonomy, and 3) Hybrid Mode, combining autonomous and deterministic control. We evaluate all modes across 13+ LLMs on the ScaleMCP benchmark, conducting experiments over 100 consecutive user interactions, measuring tool removal ratios (short-term memory efficiency), task completion accuracy, and comprehensive cost analysis across modes. Our results significantly outperform existing state-of-the-art tool retrieval methods which cannot handle multi-turn tool retrieval and management. In Autonomous Agent Mode, reasoning LLMs achieve high tool-removal efficiency (90–94% over a 3-window average), while medium-sized models exhibit significantly lower efficiency (0–60%). Workflow and Hybrid modes consistently manage tool removal effectively, whereas Autonomous and Hybrid modes excel at task completion. We present trade-offs, cost analysis, and recommendations for each MemTool mode based on task accuracy, agency, and model capabilities.
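The abstract does not define the tool removal ratio or the "3-window average" precisely, so the following Python sketch is only one plausible reading and not the authors' implementation. It assumes that the ratio is the number of tools removed from context divided by the number of tools added, and that the window average is a rolling mean over the last three measurements. The names `ToolMemory`, `limit`, `removal_ratio`, and `rolling_mean` are hypothetical and do not come from MemTool.

```python
from collections import deque


class ToolMemory:
    """Hypothetical short-term store of the tools currently in an agent's context."""

    def __init__(self, limit):
        self.limit = limit    # assumed cap on tools held in context at once
        self.tools = set()    # names of tools currently in context
        self.added = 0        # total tools ever added
        self.removed = 0      # total tools ever removed

    def add(self, names):
        """Add newly retrieved tools, skipping duplicates and respecting the cap."""
        for name in names:
            if name not in self.tools and len(self.tools) < self.limit:
                self.tools.add(name)
                self.added += 1

    def remove(self, names):
        """Drop tools that are no longer needed for the conversation."""
        for name in names:
            if name in self.tools:
                self.tools.remove(name)
                self.removed += 1

    def removal_ratio(self):
        """Assumed definition: tools removed divided by tools added."""
        return self.removed / self.added if self.added else 0.0


def rolling_mean(values, window=3):
    """Rolling mean over the most recent `window` values (assumed 3-window average)."""
    recent = deque(maxlen=window)
    means = []
    for value in values:
        recent.append(value)
        means.append(sum(recent) / len(recent))
    return means
```

Under these assumptions, a memory that adds two tools and later removes one of them reports a removal ratio of 0.5, and `rolling_mean` smooths a sequence of such per-turn ratios. In a workflow-style setup the calls to `remove` and `add` would be issued by fixed code on every turn, whereas in an autonomous setup the agent itself would decide when to call them.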
Abstract ID:
NKDR64
Submission Type
Lead AI Researcher, PricewaterhouseCoopers U.S.
