An Empirical Comparison of LM-based Question and Answer Generation Methods

Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context (e.g. a paragraph). This task has a variety of applications, such as data augmentation for question answering (QA) models, information retrieval and education. In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning. Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches. However, there are differences depending on the underlying generative LM. Finally, our analysis shows that QA models fine-tuned solely on generated question-answer pairs can be competitive when compared to supervised QA models trained on human-labeled data.
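As a rough illustration of the end-to-end setup the abstract favours, the sketch below feeds a paragraph to a fine-tuned sequence-to-sequence LM and reads all question-answer pairs back from a single generation call. It is only a sketch: the checkpoint name (lmqg/t5-base-squad-qag, assumed to be one of the QAG models released alongside the paper) and the " | "-delimited "question: ..., answer: ..." output format are assumptions, not details taken from this record.

    # Minimal end-to-end QAG sketch using HuggingFace transformers.
    # Assumptions (not verified here): the checkpoint name below, and that its
    # output lists "question: ..., answer: ..." pairs separated by " | ".
    from transformers import pipeline

    qag = pipeline("text2text-generation", model="lmqg/t5-base-squad-qag")

    context = (
        "William Turner was an English painter who specialised in watercolour "
        "landscapes. He is often known as William Turner of Oxford."
    )

    # A single forward pass is expected to emit every question-answer pair for
    # the paragraph, which is what keeps this variant cheap at inference time.
    generated = qag(context, max_length=256)[0]["generated_text"]
    for pair in generated.split(" | "):
        print(pair.strip())

Because one generation call covers the whole paragraph, this corresponds to the "computationally light" variant the abstract contrasts with more convoluted approaches, such as multi-stage pipelines that first select answer spans and then generate one question per answer.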

Bibliographic details
Main authors: Ushio, Asahi; Alva-Manchego, Fernando; Camacho-Collados, Jose
Format: Article
Language: English
Subjects: Computer Science - Computation and Language
creator Ushio, Asahi; Alva-Manchego, Fernando; Camacho-Collados, Jose
format Article
creationdate 2023-05-26
rights http://creativecommons.org/licenses/by/4.0
identifier DOI: 10.48550/arxiv.2305.17002
language eng
recordid cdi_arxiv_primary_2305_17002
source arXiv.org
subjects Computer Science - Computation and Language
title An Empirical Comparison of LM-based Question and Answer Generation Methods
url https://arxiv.org/abs/2305.17002