Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge

In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions coveri...

Bibliographic details
Main authors: Dou, Longxu, Gao, Yan, Liu, Xuqi, Pan, Mingyang, Wang, Dingzirui, Che, Wanxiang, Zhan, Dechen, Kan, Min-Yen, Lou, Jian-Guang
Format: Article
Language: English
creator Dou, Longxu
Gao, Yan
Liu, Xuqi
Pan, Mingyang
Wang, Dingzirui
Che, Wanxiang
Zhan, Dechen
Kan, Min-Yen
Lou, Jian-Guang
description In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by annotating additional data examples. More concretely, we construct a formulaic knowledge bank as a domain knowledge base and propose a framework (ReGrouP) to leverage this formulaic knowledge during parsing. Experiments using ReGrouP demonstrate a significant 28.2% improvement overall on KnowSQL.
doi_str_mv 10.48550/arxiv.2301.01067
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2301_01067</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2301_01067</sourcerecordid><originalsourceid>FETCH-LOGICAL-a677-f6f13bfd5620995e6c56a43d3f0bd06f3f437006ab221dd7acc2fed60e5b87083</originalsourceid><addsrcrecordid>eNo9z99KwzAYBfDceCHTB_DKvEDql6RJussxnA4LOtb78rVJtkD_SBrX-fbqFK8OHDgHfoTcccjyQil4wHgOp0xI4Blw0Oaa7Kpxxmgn-jKMc-fswbHtkNwwhZOjlTsnlka235V073ocUmjpG8YpDAc6h3SkmzH2Hx1-1__7G3LlsZvc7V8uSLV5rNbPrHx92q5XJUNtDPPac9l4q7SA5VI53SqNubTSQ2NBe-lzaQA0NkJwaw22rfDOanCqKQwUckHuf28vpvo9hh7jZ_1jqy82-QWxY0o6</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge</title><source>arXiv.org</source><creator>Dou, Longxu ; Gao, Yan ; Liu, Xuqi ; Pan, Mingyang ; Wang, Dingzirui ; Che, Wanxiang ; Zhan, Dechen ; Kan, Min-Yen ; Lou, Jian-Guang</creator><creatorcontrib>Dou, Longxu ; Gao, Yan ; Liu, Xuqi ; Pan, Mingyang ; Wang, Dingzirui ; Che, Wanxiang ; Zhan, Dechen ; Kan, Min-Yen ; Lou, Jian-Guang</creatorcontrib><description>In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by annotating additional data examples. More concretely, we construct a formulaic knowledge bank as a domain knowledge base and propose a framework (ReGrouP) to leverage this formulaic knowledge during parsing. 
Experiments using ReGrouP demonstrate a significant 28.2% improvement overall on KnowSQL.</description><identifier>DOI: 10.48550/arxiv.2301.01067</identifier><language>eng</language><subject>Computer Science - Computation and Language</subject><creationdate>2023-01</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2301.01067$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2301.01067$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Dou, Longxu</creatorcontrib><creatorcontrib>Gao, Yan</creatorcontrib><creatorcontrib>Liu, Xuqi</creatorcontrib><creatorcontrib>Pan, Mingyang</creatorcontrib><creatorcontrib>Wang, Dingzirui</creatorcontrib><creatorcontrib>Che, Wanxiang</creatorcontrib><creatorcontrib>Zhan, Dechen</creatorcontrib><creatorcontrib>Kan, Min-Yen</creatorcontrib><creatorcontrib>Lou, Jian-Guang</creatorcontrib><title>Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge</title><description>In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by annotating additional data examples. 
More concretely, we construct a formulaic knowledge bank as a domain knowledge base and propose a framework (ReGrouP) to leverage this formulaic knowledge during parsing. Experiments using ReGrouP demonstrate a significant 28.2% improvement overall on KnowSQL.</description><subject>Computer Science - Computation and Language</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNo9z99KwzAYBfDceCHTB_DKvEDql6RJussxnA4LOtb78rVJtkD_SBrX-fbqFK8OHDgHfoTcccjyQil4wHgOp0xI4Blw0Oaa7Kpxxmgn-jKMc-fswbHtkNwwhZOjlTsnlka235V073ocUmjpG8YpDAc6h3SkmzH2Hx1-1__7G3LlsZvc7V8uSLV5rNbPrHx92q5XJUNtDPPac9l4q7SA5VI53SqNubTSQ2NBe-lzaQA0NkJwaw22rfDOanCqKQwUckHuf28vpvo9hh7jZ_1jqy82-QWxY0o6</recordid><startdate>20230103</startdate><enddate>20230103</enddate><creator>Dou, Longxu</creator><creator>Gao, Yan</creator><creator>Liu, Xuqi</creator><creator>Pan, Mingyang</creator><creator>Wang, Dingzirui</creator><creator>Che, Wanxiang</creator><creator>Zhan, Dechen</creator><creator>Kan, Min-Yen</creator><creator>Lou, Jian-Guang</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230103</creationdate><title>Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge</title><author>Dou, Longxu ; Gao, Yan ; Liu, Xuqi ; Pan, Mingyang ; Wang, Dingzirui ; Che, Wanxiang ; Zhan, Dechen ; Kan, Min-Yen ; Lou, Jian-Guang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a677-f6f13bfd5620995e6c56a43d3f0bd06f3f437006ab221dd7acc2fed60e5b87083</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Computation and Language</topic><toplevel>online_resources</toplevel><creatorcontrib>Dou, Longxu</creatorcontrib><creatorcontrib>Gao, Yan</creatorcontrib><creatorcontrib>Liu, Xuqi</creatorcontrib><creatorcontrib>Pan, 
Mingyang</creatorcontrib><creatorcontrib>Wang, Dingzirui</creatorcontrib><creatorcontrib>Che, Wanxiang</creatorcontrib><creatorcontrib>Zhan, Dechen</creatorcontrib><creatorcontrib>Kan, Min-Yen</creatorcontrib><creatorcontrib>Lou, Jian-Guang</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Dou, Longxu</au><au>Gao, Yan</au><au>Liu, Xuqi</au><au>Pan, Mingyang</au><au>Wang, Dingzirui</au><au>Che, Wanxiang</au><au>Zhan, Dechen</au><au>Kan, Min-Yen</au><au>Lou, Jian-Guang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge</atitle><date>2023-01-03</date><risdate>2023</risdate><abstract>In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by annotating additional data examples. More concretely, we construct a formulaic knowledge bank as a domain knowledge base and propose a framework (ReGrouP) to leverage this formulaic knowledge during parsing. Experiments using ReGrouP demonstrate a significant 28.2% improvement overall on KnowSQL.</abstract><doi>10.48550/arxiv.2301.01067</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2301.01067
language eng
recordid cdi_arxiv_primary_2301_01067
source arXiv.org
subjects Computer Science - Computation and Language
title Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T12%3A46%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Towards%20Knowledge-Intensive%20Text-to-SQL%20Semantic%20Parsing%20with%20Formulaic%20Knowledge&rft.au=Dou,%20Longxu&rft.date=2023-01-03&rft_id=info:doi/10.48550/arxiv.2301.01067&rft_dat=%3Carxiv_GOX%3E2301_01067%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true