

Poster Session in Workshop: Scientific Methods for Understanding Neural Networks

Eliminating Position Bias of Language Models: A Mechanistic Approach

Ziqi Wang · Hanlin Zhang · Xiner Li · Kuan-Hao Huang · Chi Han · Shuiwang Ji · Sham Kakade · Hao Peng · Heng Ji

[ Project Page ]
Sun 15 Dec 11:20 a.m. PST — 12:20 p.m. PST

Abstract: Position bias is a prevalent issue in modern language models (LMs): models prioritize content based on its position within the given context. This bias often leads to unexpected model failures and hurts performance, robustness, and reliability across various applications. Our mechanistic analysis attributes position bias to two components employed in nearly all state-of-the-art LMs: causal attention and relative positional encodings. Based on these analyses, we propose to **eliminate** the position bias caused by different input segment orders (e.g., options in LM-as-a-judge, retrieved documents in QA) in a **training-free zero-shot** manner. Our method changes causal attention to bidirectional attention between segments and uses the model's attention values, rather than the order given in the input prompt, to decide the relative order of segments, thereby enabling **P**osition-**IN**variant inferenc**E** (**PINE**) at the segment level. Results on the LM-as-a-judge task show that PINE is especially useful when adapting LMs to evaluate reasoning pairs: it consistently yields gains of $8$ to $10$ percentage points in most cases, and it makes Llama-3-70B-Instruct outperform GPT-4-0125-preview and GPT-4o-2024-08-06 on the RewardBench reasoning subset.
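To make the two mechanisms in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation, of (1) an attention mask that stays causal within a segment but is bidirectional across segments, and (2) ranking segments for a query token by attention mass so that positions can be assigned by importance rather than prompt order. The names and shapes (`seg_ids`, `inter_segment_bidirectional_mask`, `segment_order_by_attention`, single-head attention in NumPy) are illustrative assumptions, not names from the paper or its codebase.

```python
# Toy, single-head illustration of PINE's two ideas as described in the abstract.
# All helper names and the seg_ids convention (-1 = shared prefix/suffix tokens)
# are hypothetical choices for this sketch.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def inter_segment_bidirectional_mask(seg_ids):
    """Boolean (n, n) mask: True where attention is allowed.

    Tokens follow normal causal masking, except that tokens belonging to
    two *different* segments may attend to each other in both directions.
    """
    n = len(seg_ids)
    causal = np.tril(np.ones((n, n), dtype=bool))
    same_or_shared = (seg_ids[:, None] == seg_ids[None, :]) | \
                     (seg_ids[:, None] < 0) | (seg_ids[None, :] < 0)
    # Allowed if causal, OR if the two tokens sit in different segments.
    return causal | ~same_or_shared

def segment_order_by_attention(q, k, seg_ids, mask, query_idx):
    """Rank segments for one query token by total attention mass (descending).

    This ordering, rather than the order in the prompt, would then be used to
    assign relative positions to the segments for that query.
    """
    d = q.shape[-1]
    logits = q[query_idx] @ k.T / np.sqrt(d)
    logits = np.where(mask[query_idx], logits, -np.inf)
    scores = softmax(logits)
    segs = sorted(set(int(s) for s in seg_ids if s >= 0))
    mass = {s: scores[seg_ids == s].sum() for s in segs}
    return sorted(segs, key=lambda s: -mass[s])

# Toy example: a 2-token shared prefix followed by two 3-token segments.
rng = np.random.default_rng(0)
d = 8
seg_ids = np.array([-1, -1, 0, 0, 0, 1, 1, 1])
q = rng.normal(size=(8, d))
k = rng.normal(size=(8, d))

mask = inter_segment_bidirectional_mask(seg_ids)
order = segment_order_by_attention(q, k, seg_ids, mask, query_idx=7)
print(mask.astype(int))
print("segment order for the last query token:", order)
```

In this sketch the mask lets segment 0 and segment 1 see each other in both directions while the shared prefix remains strictly causal, and the per-query segment ranking is what makes the final positional assignment independent of how the segments were ordered in the prompt.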
