

Oral in Workshop: Table Representation Learning Workshop (TRL)

The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models

Karime Maamari · Fadhil Abubaker · Daniel Jaroslawicz · Amine Mhedhbi

Keywords: [ Natural Language Interfaces to Databases ] [ BIRD Benchmark ] [ Schema Linking ] [ Text-to-SQL ]

Sat 14 Dec 2:20 p.m. PST — 2:30 p.m. PST

Abstract:

In Text-to-SQL pipelines, schema linking is used to retrieve tables and columns that are relevant to the user's natural language query. However, inaccuracies in schema linking can lead to the exclusion of crucial information, which in turn adversely affects SQL generation. In this work, we revisit the need for schema linking when using the latest generation of large language models (LLMs). We find that newer models can accurately identify relevant schema during SQL generation, even in the presence of substantial irrelevant data. Consequently, our Text-to-SQL pipeline forgoes schema linking when the entire database schema fits within the model's context window. This approach eliminates errors due to faulty schema linking by ensuring that no schema information is omitted. Furthermore, we introduce techniques such as augmentation, selection, and correction, which improve Text-to-SQL accuracy without the risk of filtering out essential schema information. Our approach ranks first on the BIRD benchmark, achieving an accuracy of 71.83%.
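The decision rule described in the abstract (pass the full schema when it fits in the context window, and fall back to schema linking only otherwise) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `estimate_tokens`, `select_schema`, and `keyword_linker` are hypothetical, the four-characters-per-token estimate is a crude assumption standing in for a real tokenizer, and the keyword matcher is only a placeholder for whatever schema linker a pipeline might use.

```python
from typing import Callable, Dict

Schema = Dict[str, str]  # table name -> DDL text (hypothetical representation)


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly one token per four characters.
    # A real pipeline would use the target model's tokenizer.
    return max(1, len(text) // 4)


def select_schema(
    tables: Schema,
    question: str,
    context_budget: int,
    link_schema: Callable[[Schema, str], Schema],
) -> Schema:
    """Return the full schema if it fits the budget, else a linked subset."""
    full_schema = "\n".join(tables.values())
    needed = estimate_tokens(full_schema) + estimate_tokens(question)
    if needed <= context_budget:
        # Nothing is filtered out, so no schema element can be lost.
        return dict(tables)
    # Fallback: the schema is too large, so some filtering is unavoidable.
    return link_schema(tables, question)


def keyword_linker(tables: Schema, question: str) -> Schema:
    # Placeholder linker: keep tables whose name appears in the question.
    words = {w.strip("?.,").lower() for w in question.split()}
    return {name: ddl for name, ddl in tables.items() if name.lower() in words}


tables = {
    "singer": "CREATE TABLE singer (id INT, name TEXT)",
    "concert": "CREATE TABLE concert (id INT, year INT)",
}
question = "How many singer rows are there?"

full = select_schema(tables, question, 1000, keyword_linker)
linked = select_schema(tables, question, 10, keyword_linker)
```

With a generous budget, `full` keeps both tables. With a very small budget, `linked` keeps only `singer`, which shows how a linker can drop tables that a later generation step might have needed.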
