pt-image-ir-dataset: An Image Retrieval Dataset in European Portuguese

Abstract Summary
With the surge of multimodal models and the demand for effective image Information Retrieval (IR) systems, high-quality text-to-image datasets have become paramount. However, most existing datasets are in English, which limits their applicability to multilingual settings. To address this, we introduce the pt-image-ir-dataset, a manually annotated resource for text-based image IR in European Portuguese. The dataset comprises 80 diverse queries and a curated pool of 5,201 images, each annotated for relevance by multiple human judges. It supports the development and evaluation of image IR systems for European Portuguese, addressing a clear gap in multilingual multimodal research. We have made the dataset publicly available, alongside baseline experimental results that demonstrate its suitability for the image IR task across different retrieval paradigms: traditional text-based lexical IR methods, semantic dense retrieval models based on language embeddings, cutting-edge vision-language models, and end-to-end image retrieval systems. The results show that vision-language models, particularly OpenCLIP/xlm-roberta-base-ViT-B-32, significantly outperform the other approaches (MRR = 0.610).
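The baseline results above are reported as Mean Reciprocal Rank (MRR). As a reading aid, the following is a minimal sketch of how MRR is commonly computed from per-query rankings and binary relevance judgments. The function names, the query and image identifiers, and the toy data are hypothetical and are not taken from the dataset or from the authors' evaluation code, which may aggregate the multiple human judgments differently.

```python
from typing import Dict, List, Set


def reciprocal_rank(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """Return 1/rank of the first relevant image, or 0.0 if none is retrieved."""
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    rankings: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
) -> float:
    """Average the reciprocal rank over all queries that have a ranking."""
    if not rankings:
        return 0.0
    total = sum(
        reciprocal_rank(ranked, qrels.get(query_id, set()))
        for query_id, ranked in rankings.items()
    )
    return total / len(rankings)


# Hypothetical toy data: three queries, each with a ranked list of image ids.
toy_rankings = {
    "q1": ["img_a", "img_b", "img_c"],  # first relevant image at rank 1
    "q2": ["img_d", "img_e", "img_f"],  # first relevant image at rank 2
    "q3": ["img_g", "img_h", "img_i"],  # no relevant image retrieved
}
toy_qrels = {
    "q1": {"img_a"},
    "q2": {"img_e", "img_f"},
    "q3": {"img_z"},
}

toy_mrr = mean_reciprocal_rank(toy_rankings, toy_qrels)
print(f"MRR = {toy_mrr:.3f}")
```

In this toy setting the three reciprocal ranks are 1, 1/2 and 0, so the mean is 0.5. Under this definition, an MRR of 0.610 means that the average of the per-query reciprocal ranks over the queries is 0.610.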
Abstract ID: NKDR126
University of Beira Interior; INESC TEC
Professor, University of Beira Interior / INESC TEC
