Abstract
Event extraction (EE) aims to identify event triggers and their corresponding arguments in unstructured text, providing structured knowledge essential for many downstream tasks. Despite the success of instruction-tuned large language models (LLMs) on this task, current methods often produce outputs with inconsistent formats, semantic drift from the source text, and event types that deviate from predefined schemas. These issues arise partly because supervised fine-tuning relies on static loss functions that fail to reflect task-specific objectives such as schema alignment. To address these limitations, we introduce a reinforcement learning framework based on Group Relative Policy Optimization (GRPO) that optimizes instruction-tuned LLMs for event trigger and argument extraction. We propose three complementary reward functions: a format reward to enforce syntactic and structural validity, a BM25-based reward to enhance lexical and semantic consistency with the input text, and a task-specific supervision reward that directly aligns optimization with task-level performance. Extensive experiments on three standard EE datasets demonstrate that our approach consistently and significantly improves EE performance over strong baselines.
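To make the three rewards concrete, the following is a minimal sketch of how they might be computed and combined before being fed to GRPO. It assumes a JSON output format with `trigger`, `event_type`, and `arguments` fields; the function names, schema, squashing, and reward weights are illustrative assumptions, not the paper's released implementation.

```python
# Hypothetical sketch of the three rewards described in the abstract.
# The JSON schema, helper names, and weights are assumptions for illustration.
import json
import math
from collections import Counter

def format_reward(output: str) -> float:
    """1.0 if the generation parses as a JSON list of events with the
    expected keys (syntactic and structural validity), else 0.0."""
    try:
        events = json.loads(output)
        ok = isinstance(events, list) and all(
            isinstance(e, dict) and {"trigger", "event_type", "arguments"} <= e.keys()
            for e in events
        )
        return 1.0 if ok else 0.0
    except ValueError:
        return 0.0

def bm25_reward(output: str, source: str, k1: float = 1.5, b: float = 0.75) -> float:
    """BM25 of the generated text (query) against the source passage
    (document), with IDF estimated over the passage's sentences; higher
    scores reward outputs that stay lexically grounded in the input."""
    sents = [s.split() for s in source.lower().split(".") if s.strip()]
    doc = source.lower().split()
    n, avgdl = len(sents), sum(len(s) for s in sents) / max(1, len(sents))
    tf = Counter(doc)
    score = 0.0
    for t in set(output.lower().split()):
        df = sum(1 for s in sents if t in s)
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)  # Lucene-style, >= 0
        score += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(doc) / avgdl))
    return score / (1.0 + score)  # squash into [0, 1)

def task_reward(output: str, gold: list[dict]) -> float:
    """Task-level supervision: micro-F1 of predicted (trigger, event_type)
    pairs against gold annotations."""
    try:
        pred = {(e["trigger"], e["event_type"]) for e in json.loads(output)}
    except (ValueError, TypeError, KeyError):
        return 0.0
    ref = {(e["trigger"], e["event_type"]) for e in gold}
    tp = len(pred & ref)
    if not tp:
        return 0.0
    p, r = tp / len(pred), tp / len(ref)
    return 2 * p * r / (p + r)

def total_reward(output: str, source: str, gold: list[dict],
                 w: tuple = (0.2, 0.2, 0.6)) -> float:
    """Weighted combination used as the GRPO reward; weights are illustrative."""
    return (w[0] * format_reward(output)
            + w[1] * bm25_reward(output, source)
            + w[2] * task_reward(output, gold))
```

Under GRPO, this scalar reward would be computed for each sampled completion in a group, and advantages derived from the group-relative reward statistics; the weighting of the three terms is a design choice not specified here.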