Starbucks: Improved Training for 2D Matryoshka Embeddings

This abstract is open access.
Abstract Summary
2D Matryoshka training enables a single embedding model to produce sub-network representations across varying layers and embedding dimensions, offering flexibility under different computational and task constraints. However, its performance remains below that of individually trained models of comparable sizes. To address this, we propose Starbucks, a new training strategy for Matryoshka-style embedding models that combines structured fine-tuning with masked autoencoder (MAE) pre-training. During fine-tuning, we compute the loss over a fixed set of layer-dimension pairs, ordered from small to large, which significantly improves over random sub-network sampling and matches the performance of separately trained models. Our MAE-based pre-training further strengthens sub-network representations, providing a more robust backbone for downstream tasks. Experiments on both in-domain (semantic similarity and passage retrieval) and out-of-domain (BEIR) benchmarks show that Starbucks consistently outperforms 2D Matryoshka models and matches or exceeds the performance of individually trained models, while maintaining high efficiency and flexibility. Ablation studies validate our loss design and the benefits of SMAE pre-training, and demonstrate Starbucks' applicability across backbones. We further show that depth- and width-wise Starbucks variants encode complementary information, and that combining them yields further gains with minimal latency overhead via parallelization. Code is available at https://anonymous.4open.science/r/Starbucks-Official-02E7.
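The fine-tuning strategy described above can be sketched in a few lines: the loss is accumulated over a fixed list of layer-dimension pairs rather than randomly sampled sub-networks. This is a minimal illustrative sketch only; the specific pair list, normalization, and loss weighting here are assumptions, not the published Starbucks configuration.

```python
import torch
import torch.nn as nn

# Illustrative (layer, dim) pairs, ordered from the smallest sub-network to
# the largest. The actual pairs used by Starbucks may differ.
PAIRS = [(2, 32), (4, 64), (6, 128), (8, 256), (10, 512), (12, 768)]

def starbucks_loss(hidden_states, labels, loss_fn):
    """Average an embedding loss over fixed layer-dimension pairs.

    hidden_states: list of per-layer [batch, dim] sentence embeddings.
    loss_fn: any embedding loss taking (embeddings, labels), e.g. a
             contrastive or similarity loss (hypothetical placeholder).
    """
    total = 0.0
    for layer, dim in PAIRS:  # small-to-large ordering, as in the abstract
        emb = hidden_states[layer - 1][:, :dim]     # truncate embedding width
        emb = nn.functional.normalize(emb, dim=-1)  # compare in cosine space
        total = total + loss_fn(emb, labels)
    return total / len(PAIRS)
```

At inference time, any single pair from the list can be used on its own, which is what gives the model its flexibility across compute budgets.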
Abstract ID: NKDR17
Author Affiliations
The University of Queensland
The University of Queensland
Principal Research Scientist, The University of Queensland & CSIRO
The University of Queensland & Google

Abstracts With Same Type

Abstract ID | Abstract Topic                                         | Submission Type | Primary Author
NKDR52      | Search and ranking                                     | Full papers     | Emmanouil Georgios Lionis
NKDR51      | Search and ranking; Societally-motivated IR research   | Full papers     | Martim Baltazar
NKDR15      | Applications; Machine Learning and Large Language Models | Full papers   | Saeedeh Javadi
NKDR49      | Societally-motivated IR research; User aspects in IR   | Full papers     | Niall McGuire
NKDR177     | Applications; Search and ranking                       | Full papers     | Danyang Hou
NKDR184     | Applications; Evaluation research                      | Full papers     | Danyang Hou
NKDR193     | Applications; Search and ranking                       | Full papers     | Danyang Hou
NKDR39      | Applications; Machine Learning and Large Language Models | Full papers   | Sarmistha Das