orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions

Frequently Asked Questions (FAQ) webpages are created by organizations for their users. FAQs are used in several scenarios, e.g., to answer user questions. On the other hand, the content of FAQs is affected by user questions by definition. In order to promote research in this field, several FAQ data...

Bibliographic details
Main authors:	Lev, Guy ; Shmueli-Scheuer, Michal ; Jerbi, Achiya ; Konopnicki, David
Format:	Article
Language:	eng
Subjects:	Computer Science - Computation and Language
Online access:	Order full text

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Lev, Guy ; Shmueli-Scheuer, Michal ; Jerbi, Achiya ; Konopnicki, David
description	Frequently Asked Questions (FAQ) webpages are created by organizations for their users. FAQs are used in several scenarios, e.g., to answer user questions. On the other hand, the content of FAQs is affected by user questions by definition. In order to promote research in this field, several FAQ datasets exist. However, we claim that being collected from community websites, they do not correctly represent challenges associated with FAQs in an organizational context. Thus, we release orgFAQ, a new dataset composed of $6988$ user questions and $1579$ corresponding FAQs that were extracted from organizations' FAQ webpages in the Jobs domain. In this paper, we provide an analysis of the properties of such FAQs, and demonstrate the usefulness of our new dataset by utilizing it in a relevant task from the Jobs domain. We also show the value of the orgFAQ dataset in a task of a different domain - the COVID-19 pandemic.
doi_str_mv	10.48550/arxiv.2009.01460
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2009_01460</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2009_01460</sourcerecordid><originalsourceid>FETCH-LOGICAL-a670-941e737ef5ec52308946e2d1a101038c60ab7c640f5061a006b75d769e7dfb473</originalsourceid><addsrcrecordid>eNotj8tKw0AUhmfjQqoP4MrzAolnMrfEXahWhWAp1HU4SU7KQE1kJl7q09umrn74b_AJcSMx1bkxeEfhx3-lGWKRotQWL0U1ht2q3NxDCa_8DQ80UeQJaOigHGh_iD7COMA67GjwvzT58ejCcRHnzlvkAJtPjqcgXomLnvaRr_91Ibarx-3yOanWTy_LskrIOkwKLdkpx73h1mQK80JbzjpJEiWqvLVIjWutxt6glYRoG2c6Zwt2Xd9opxbi9nw749Qfwb9TONQnrHrGUn-bpUXh</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions</title><source>arXiv.org</source><creator>Lev, Guy ; Shmueli-Scheuer, Michal ; Jerbi, Achiya ; Konopnicki, David</creator><creatorcontrib>Lev, Guy ; Shmueli-Scheuer, Michal ; Jerbi, Achiya ; Konopnicki, David</creatorcontrib><description>Frequently Asked Questions (FAQ) webpages are created by organizations for their users. FAQs are used in several scenarios, e.g., to answer user questions. On the other hand, the content of FAQs is affected by user questions by definition. In order to promote research in this field, several FAQ datasets exist. However, we claim that being collected from community websites, they do not correctly represent challenges associated with FAQs in an organizational context. Thus, we release orgFAQ, a new dataset composed of $6988$ user questions and $1579$ corresponding FAQs that were extracted from organizations' FAQ webpages in the Jobs domain. In this paper, we provide an analysis of the properties of such FAQs, and demonstrate the usefulness of our new dataset by utilizing it in a relevant task from the Jobs domain. 
We also show the value of the orgFAQ dataset in a task of a different domain - the COVID-19 pandemic.</description><identifier>DOI: 10.48550/arxiv.2009.01460</identifier><language>eng</language><subject>Computer Science - Computation and Language</subject><creationdate>2020-09</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2009.01460$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2009.01460$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Lev, Guy</creatorcontrib><creatorcontrib>Shmueli-Scheuer, Michal</creatorcontrib><creatorcontrib>Jerbi, Achiya</creatorcontrib><creatorcontrib>Konopnicki, David</creatorcontrib><title>orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions</title><description>Frequently Asked Questions (FAQ) webpages are created by organizations for their users. FAQs are used in several scenarios, e.g., to answer user questions. On the other hand, the content of FAQs is affected by user questions by definition. In order to promote research in this field, several FAQ datasets exist. However, we claim that being collected from community websites, they do not correctly represent challenges associated with FAQs in an organizational context. Thus, we release orgFAQ, a new dataset composed of $6988$ user questions and $1579$ corresponding FAQs that were extracted from organizations' FAQ webpages in the Jobs domain. 
In this paper, we provide an analysis of the properties of such FAQs, and demonstrate the usefulness of our new dataset by utilizing it in a relevant task from the Jobs domain. We also show the value of the orgFAQ dataset in a task of a different domain - the COVID-19 pandemic.</description><subject>Computer Science - Computation and Language</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj8tKw0AUhmfjQqoP4MrzAolnMrfEXahWhWAp1HU4SU7KQE1kJl7q09umrn74b_AJcSMx1bkxeEfhx3-lGWKRotQWL0U1ht2q3NxDCa_8DQ80UeQJaOigHGh_iD7COMA67GjwvzT58ejCcRHnzlvkAJtPjqcgXomLnvaRr_91Ibarx-3yOanWTy_LskrIOkwKLdkpx73h1mQK80JbzjpJEiWqvLVIjWutxt6glYRoG2c6Zwt2Xd9opxbi9nw749Qfwb9TONQnrHrGUn-bpUXh</recordid><startdate>20200903</startdate><enddate>20200903</enddate><creator>Lev, Guy</creator><creator>Shmueli-Scheuer, Michal</creator><creator>Jerbi, Achiya</creator><creator>Konopnicki, David</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20200903</creationdate><title>orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions</title><author>Lev, Guy ; Shmueli-Scheuer, Michal ; Jerbi, Achiya ; Konopnicki, David</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a670-941e737ef5ec52308946e2d1a101038c60ab7c640f5061a006b75d769e7dfb473</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Computer Science - Computation and Language</topic><toplevel>online_resources</toplevel><creatorcontrib>Lev, Guy</creatorcontrib><creatorcontrib>Shmueli-Scheuer, Michal</creatorcontrib><creatorcontrib>Jerbi, Achiya</creatorcontrib><creatorcontrib>Konopnicki, David</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Lev, Guy</au><au>Shmueli-Scheuer, Michal</au><au>Jerbi, Achiya</au><au>Konopnicki, David</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions</atitle><date>2020-09-03</date><risdate>2020</risdate><abstract>Frequently Asked Questions (FAQ) webpages are created by organizations for their users. FAQs are used in several scenarios, e.g., to answer user questions. On the other hand, the content of FAQs is affected by user questions by definition. In order to promote research in this field, several FAQ datasets exist. However, we claim that being collected from community websites, they do not correctly represent challenges associated with FAQs in an organizational context. Thus, we release orgFAQ, a new dataset composed of $6988$ user questions and $1579$ corresponding FAQs that were extracted from organizations' FAQ webpages in the Jobs domain. In this paper, we provide an analysis of the properties of such FAQs, and demonstrate the usefulness of our new dataset by utilizing it in a relevant task from the Jobs domain. We also show the value of the orgFAQ dataset in a task of a different domain - the COVID-19 pandemic.</abstract><doi>10.48550/arxiv.2009.01460</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2009.01460
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2009_01460
source	arXiv.org
subjects	Computer Science - Computation and Language
title	orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-20T13%3A20%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=orgFAQ:%20A%20New%20Dataset%20and%20Analysis%20on%20Organizational%20FAQs%20and%20User%20Questions&rft.au=Lev,%20Guy&rft.date=2020-09-03&rft_id=info:doi/10.48550/arxiv.2009.01460&rft_dat=%3Carxiv_GOX%3E2009_01460%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true