Abstract
This paper presents an inference-time method to mitigate demographic bias in CLIP-like vision-language models through targeted interventions in their internal attention mechanisms. We first identify ``expert'' attention heads that encode demographic information by systematically analyzing CLIP's internal representations in response to labeled inputs. At inference, we intervene on these heads, either replacing their activations with demographic prototypes or neutralizing them via zero ablation. We intervene specifically at the CLS token because it aggregates information globally across image patches and directly determines the final image embedding. Results across multiple evaluation frameworks show that these targeted interventions significantly reduce both gender and ethnicity biases in cross-modal retrieval and zero-shot classification without compromising model performance.
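The core intervention described above can be sketched as a simple tensor operation. The following is a minimal illustration, not the paper's implementation: it assumes per-head activations are available as an array of shape `(batch, seq_len, n_heads, head_dim)` with the CLS token at position 0, and the function name `intervene_heads` and its parameters are hypothetical stand-ins for however the actual model exposes its attention outputs.

```python
import numpy as np

def intervene_heads(head_acts, expert_heads, prototype=None, cls_index=0):
    """Intervene on 'expert' attention heads at the CLS token.

    head_acts: (batch, seq_len, n_heads, head_dim) per-head activations.
    expert_heads: indices of heads found to encode demographic attributes.
    prototype: optional (head_dim,) demographic prototype vector; when None,
        the expert heads are zero-ablated instead of replaced.
    cls_index: position of the CLS token in the sequence (assumed 0 here).
    """
    out = head_acts.copy()
    for h in expert_heads:
        if prototype is None:
            # Zero ablation: neutralize the head's contribution at CLS.
            out[:, cls_index, h, :] = 0.0
        else:
            # Prototype replacement: overwrite with a neutral demographic prototype.
            out[:, cls_index, h, :] = prototype
    return out

# Example: ablate heads 1 and 3 at the CLS token only.
acts = np.ones((2, 5, 4, 8))
ablated = intervene_heads(acts, expert_heads=[1, 3])
```

Note that only the CLS position is modified; patch-token activations at other sequence positions pass through unchanged, which is what keeps the rest of the representation intact.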