Natural language to SQL: where are we today?


Bibliographic details
Published in: Proceedings of the VLDB Endowment, 2020-06, Vol. 13 (10), p. 1737-1750
Main authors: Kim, Hyeonji; So, Byeong-Hoon; Han, Wook-Shin; Lee, Hongrae
Format: Article
Language: English
Description
Abstract: Translating natural language to SQL (NL2SQL) has received extensive attention lately, especially with the recent success of deep learning technologies. However, despite the large number of studies, we do not have a thorough understanding of how good existing techniques really are and how applicable they are to real-world situations. A key difficulty is that different studies are based on different datasets, which often have their own limitations and assumptions that are implicitly hidden in the context or datasets. Moreover, a couple of evaluation metrics are commonly employed, but they are rather simplistic and do not properly depict the accuracy of results, as will be shown in our experiments. To provide a holistic view of NL2SQL technologies and assess current advancements, we perform extensive experiments under our unified framework using eleven recent techniques over 10+ benchmarks, including a new benchmark (WTQ) and TPC-H. We provide a comprehensive survey of recent NL2SQL methods and introduce a taxonomy of them. We reveal major assumptions of the methods and classify translation errors through extensive experiments. We also provide a practical tool for validation that uses existing, mature database technologies such as query rewrite and database testing. We then suggest future research directions so that the translation can be used in practice.
ISSN: 2150-8097
DOI: 10.14778/3401960.3401970