An Empirical Comparison of LM-based Question and Answer Generation Methods

Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context (e.g. a paragraph). This task has a variety of applications, such as data augmentation for question answering (QA) models, information retrieval and education. In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning. Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches. However, there are differences depending on the underlying generative LM. Finally, our analysis shows that QA models fine-tuned solely on generated question-answer pairs can be competitive when compared to supervised QA models trained on human-labeled data.
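As a rough illustration of the end-to-end setup the abstract favours, the sketch below feeds a paragraph to a fine-tuned sequence-to-sequence LM and reads all question-answer pairs back from a single generation call. It is only a sketch: the checkpoint name (lmqg/t5-base-squad-qag, assumed to be one of the QAG models released alongside the paper) and the " | "-delimited "question: ..., answer: ..." output format are assumptions, not details taken from this record.

    # Minimal end-to-end QAG sketch using HuggingFace transformers.
    # Assumptions (not verified here): the checkpoint name below, and that its
    # output lists "question: ..., answer: ..." pairs separated by " | ".
    from transformers import pipeline

    qag = pipeline("text2text-generation", model="lmqg/t5-base-squad-qag")

    context = (
        "William Turner was an English painter who specialised in watercolour "
        "landscapes. He is often known as William Turner of Oxford."
    )

    # A single forward pass is expected to emit every question-answer pair for
    # the paragraph, which is what keeps this variant cheap at inference time.
    generated = qag(context, max_length=256)[0]["generated_text"]
    for pair in generated.split(" | "):
        print(pair.strip())

Because one generation call covers the whole paragraph, this corresponds to the "computationally light" variant the abstract contrasts with more convoluted approaches, such as multi-stage pipelines that first select answer spans and then generate one question per answer.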

Bibliographic details
Main authors: Ushio, Asahi; Alva-Manchego, Fernando; Camacho-Collados, Jose
Format: Article
Language: English
Subjects: Computer Science - Computation and Language
creator Ushio, Asahi; Alva-Manchego, Fernando; Camacho-Collados, Jose
format Article
creationdate 2023-05-26
rights http://creativecommons.org/licenses/by/4.0
identifier DOI: 10.48550/arxiv.2305.17002
language eng
recordid cdi_arxiv_primary_2305_17002
source arXiv.org
subjects Computer Science - Computation and Language
title An Empirical Comparison of LM-based Question and Answer Generation Methods
url https://arxiv.org/abs/2305.17002