

Poster

RankRAG: Unifying Retrieval-Augmented Generation and Context Ranking in LLMs

Yue Yu · Wei Ping · Zihan Liu · Boxin Wang · Jiaxuan You · Chao Zhang · Mohammad Shoeybi · Bryan Catanzaro

Thu 12 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract:

Large language models (LLMs) typically utilize the top-k contexts from a retriever in retrieval-augmented generation (RAG). In this work, we propose a novel method called RankRAG, which instruction-tunes a single LLM for both context ranking and answer generation in RAG. In particular, the instruction-tuned LLM ranks contexts surprisingly well when only a small fraction of ranking data is added to the training blend, outperforming existing expert ranking models, including the same LLM fine-tuned exclusively on a large amount of ranking data. For generation, we compare our model with many strong baselines, including ChatQA-1.5, an open-source model with state-of-the-art performance on RAG benchmarks. Specifically, our Llama3-RankRAG-8B and Llama3-RankRAG-70B significantly outperform Llama3-ChatQA-1.5-8B and Llama3-ChatQA-1.5-70B, respectively, on nine general and five biomedical knowledge-intensive benchmarks for RAG. For example, Llama3-RankRAG-70B achieves a record-high exact-match score of 54.2% on Natural Questions in zero-shot evaluation.
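The pipeline the abstract describes (retrieve top-k candidates, have one instruction-tuned LLM rerank them, then generate from the highest-ranked contexts) can be sketched as below. This is a minimal illustration, not the paper's implementation: the retriever and the LLM ranking step are both stubbed with a simple term-overlap scorer, and all function names are hypothetical.

```python
import re

def _terms(text):
    """Tokenize into a set of lowercase alphabetic terms."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query, corpus, k=5):
    """Stand-in retriever: return the top-k passages by term overlap
    (a real system would use a dense or sparse retriever)."""
    q = _terms(query)
    return sorted(corpus, key=lambda p: -len(q & _terms(p)))[:k]

def llm_rank(query, passages, top_n=2):
    """Stub for the instruction-tuned LLM's context-ranking step: keep
    the top_n candidates it judges most relevant. Here relevance is
    approximated by term overlap; in RankRAG the same LLM that
    generates the answer scores each (query, passage) pair."""
    q = _terms(query)
    return sorted(passages, key=lambda p: -len(q & _terms(p)))[:top_n]

def build_prompt(query, contexts):
    """Assemble the generation prompt from the reranked contexts,
    which would then be fed back to the same LLM for answering."""
    ctx = "\n".join(f"Context: {c}" for c in contexts)
    return f"{ctx}\nQuestion: {query}\nAnswer:"

corpus = [
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
    "Berlin is the capital of Germany.",
    "France borders Spain and Italy.",
]
query = "What is the capital of France?"
candidates = retrieve(query, corpus, k=3)        # coarse retrieval
contexts = llm_rank(query, candidates, top_n=2)  # LLM-as-ranker step
prompt = build_prompt(query, contexts)
```

The key design point from the abstract is that ranking and generation share one set of model weights, so `llm_rank` and the final answering step would be two prompts to the same instruction-tuned LLM rather than a separate expert reranker.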
