Extractive Schema Linking for Text-to-SQL
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | Text-to-SQL is emerging as a practical interface for real-world databases.
The dominant paradigm for Text-to-SQL is cross-database or schema-independent,
supporting application schemas unseen during training. The schema of a database
defines the tables, columns, column types and foreign key connections between
tables. Real-world schemas can be large, containing hundreds of columns, but
for any particular query only a small fraction will be relevant. Placing the
entire schema in the prompt for an LLM can be impossible for models with
smaller token windows and expensive even when the context window is large
enough to allow it. Even apart from computational considerations, the accuracy
of the model can be improved by focusing the SQL generation on only the
relevant portion of the database. Schema linking identifies the portion of the
database schema useful for the question. Previous work on schema linking has
used graph neural networks, generative LLMs, and cross-encoder classifiers. We
introduce a new approach to adapt decoder-only LLMs to schema linking that is
both computationally more efficient and more accurate than the generative
approach. Additionally, our extractive approach permits fine-grained control
over the precision-recall trade-off for schema linking. |
---|---|
DOI: | 10.48550/arxiv.2501.17174 |
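
The abstract's remark about controlling the precision-recall trade-off can be illustrated with a minimal sketch. It assumes that some schema linker has already assigned each column a relevance score for the question; the column names, scores, gold set, and thresholds below are made up for illustration and are not taken from the paper, and the thresholding shown here is a generic technique rather than the authors' method.

```python
from typing import Dict, Set, Tuple


def link_schema(scores: Dict[str, float], threshold: float) -> Set[str]:
    """Keep the columns whose (hypothetical) relevance score meets the threshold."""
    return {column for column, score in scores.items() if score >= threshold}


def precision_recall(predicted: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Precision and recall of the predicted columns against the gold columns."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    return precision, recall


# Hypothetical per-column relevance scores for a single question.
scores = {
    "singer.name": 0.95,
    "singer.age": 0.60,
    "singer.country": 0.40,
    "concert.year": 0.25,
}
# Hypothetical set of columns that the correct SQL query actually uses.
gold = {"singer.name", "singer.age", "concert.year"}

strict = link_schema(scores, threshold=0.5)  # smaller prompt, may miss columns
loose = link_schema(scores, threshold=0.2)   # larger prompt, fewer misses

print(sorted(strict), precision_recall(strict, gold))
print(sorted(loose), precision_recall(loose, gold))
```

In this toy setting, lowering the threshold keeps more of the schema in the prompt and raises recall at the cost of precision; raising it does the opposite.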