Off-policy evaluation (OPE) and optimization for learning to rank (LTR) leverage document placement probabilities to correct for the effects of various statistical biases, e.g., position bias. However, computing these propensities poses a challenge, as for most ranking models this requires iterating...
Machine Learning and Large Language Models · Search and ranking · Full papers