PRACTIQ: A Practical Conversational Text-to-SQL dataset with Ambiguous and Unanswerable Queries
Main authors: | |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Previous text-to-SQL datasets and systems have primarily focused on user
questions with clear intentions that can be answered. However, real user
questions can often be ambiguous with multiple interpretations or unanswerable
due to a lack of relevant data. In this work, we construct a practical
conversational text-to-SQL dataset called PRACTIQ, consisting of ambiguous and
unanswerable questions inspired by real-world user questions. We first
identify four categories of ambiguous questions and four categories of
identified four categories of ambiguous questions and four categories of
unanswerable questions by studying existing text-to-SQL datasets. Then, we
generate conversations with four turns: the initial user question, an assistant
response seeking clarification, the user's clarification, and the assistant's
clarified SQL response with the natural language explanation of the execution
results. For some ambiguous queries, we also directly generate helpful SQL
responses that consider multiple aspects of ambiguity instead of requesting
user clarification. To benchmark the performance on ambiguous, unanswerable,
and answerable questions, we implement large language model (LLM)-based
baselines using various LLMs. Our approach involves two steps: question
category classification and clarification SQL prediction. Our experiments
reveal that state-of-the-art systems struggle to handle ambiguous and
unanswerable questions effectively. We will release our code for data
generation and experiments on GitHub. |
---|---|
DOI: | 10.48550/arxiv.2410.11076 |
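
The summary describes a four-turn conversation and a two-step baseline (question category classification, then clarification SQL prediction). The following is a minimal sketch of how those two ideas could be represented, not the authors' implementation. All names (`Turn`, `is_four_turn_conversation`, `respond`, `classify`, `predict`) are hypothetical, and only the three coarse labels mentioned in the summary are modeled, since the finer-grained categories are not listed in this record.

```python
from dataclasses import dataclass
from typing import Callable, List

# Coarse question labels named in the summary; the finer-grained
# sub-categories are not listed there, so they are not modeled here.
CATEGORIES = ("ambiguous", "unanswerable", "answerable")

# Assumed role order for one four-turn conversation: user question,
# assistant clarification request, user clarification, assistant answer.
TURN_ROLES = ("user", "assistant", "user", "assistant")


@dataclass
class Turn:
    role: str      # "user" or "assistant"
    text: str      # natural-language content of the turn
    sql: str = ""  # SQL attached to the turn, if any


def is_four_turn_conversation(turns: List[Turn]) -> bool:
    """Return True if the turns follow the four-turn shape and end with SQL."""
    roles = tuple(t.role for t in turns)
    return roles == TURN_ROLES and bool(turns[-1].sql)


def respond(question: str,
            classify: Callable[[str], str],
            predict: Callable[[str, str], str]) -> str:
    """Two-step flow: classify the question, then predict a response for it."""
    category = classify(question)
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return predict(question, category)
```

Here `classify` and `predict` are placeholders for whatever model calls a given baseline would use; passing them in as callables keeps the sketch independent of any particular LLM API.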