orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions


Bibliographic details
Main authors: Lev, Guy; Shmueli-Scheuer, Michal; Jerbi, Achiya; Konopnicki, David
Format: Article
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Lev, Guy
Shmueli-Scheuer, Michal
Jerbi, Achiya
Konopnicki, David
description Frequently Asked Questions (FAQ) webpages are created by organizations for their users. FAQs are used in several scenarios, e.g., to answer user questions. On the other hand, the content of FAQs is affected by user questions by definition. In order to promote research in this field, several FAQ datasets exist. However, we claim that being collected from community websites, they do not correctly represent challenges associated with FAQs in an organizational context. Thus, we release orgFAQ, a new dataset composed of $6988$ user questions and $1579$ corresponding FAQs that were extracted from organizations' FAQ webpages in the Jobs domain. In this paper, we provide an analysis of the properties of such FAQs, and demonstrate the usefulness of our new dataset by utilizing it in a relevant task from the Jobs domain. We also show the value of the orgFAQ dataset in a task of a different domain - the COVID-19 pandemic.
doi_str_mv 10.48550/arxiv.2009.01460
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2009_01460</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2009_01460</sourcerecordid><originalsourceid>FETCH-LOGICAL-a670-941e737ef5ec52308946e2d1a101038c60ab7c640f5061a006b75d769e7dfb473</originalsourceid><addsrcrecordid>eNotj8tKw0AUhmfjQqoP4MrzAolnMrfEXahWhWAp1HU4SU7KQE1kJl7q09umrn74b_AJcSMx1bkxeEfhx3-lGWKRotQWL0U1ht2q3NxDCa_8DQ80UeQJaOigHGh_iD7COMA67GjwvzT58ejCcRHnzlvkAJtPjqcgXomLnvaRr_91Ibarx-3yOanWTy_LskrIOkwKLdkpx73h1mQK80JbzjpJEiWqvLVIjWutxt6glYRoG2c6Zwt2Xd9opxbi9nw749Qfwb9TONQnrHrGUn-bpUXh</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions</title><source>arXiv.org</source><creator>Lev, Guy ; Shmueli-Scheuer, Michal ; Jerbi, Achiya ; Konopnicki, David</creator><creatorcontrib>Lev, Guy ; Shmueli-Scheuer, Michal ; Jerbi, Achiya ; Konopnicki, David</creatorcontrib><description>Frequently Asked Questions (FAQ) webpages are created by organizations for their users. FAQs are used in several scenarios, e.g., to answer user questions. On the other hand, the content of FAQs is affected by user questions by definition. In order to promote research in this field, several FAQ datasets exist. However, we claim that being collected from community websites, they do not correctly represent challenges associated with FAQs in an organizational context. Thus, we release orgFAQ, a new dataset composed of $6988$ user questions and $1579$ corresponding FAQs that were extracted from organizations' FAQ webpages in the Jobs domain. In this paper, we provide an analysis of the properties of such FAQs, and demonstrate the usefulness of our new dataset by utilizing it in a relevant task from the Jobs domain. 
We also show the value of the orgFAQ dataset in a task of a different domain - the COVID-19 pandemic.</description><identifier>DOI: 10.48550/arxiv.2009.01460</identifier><language>eng</language><subject>Computer Science - Computation and Language</subject><creationdate>2020-09</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2009.01460$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2009.01460$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Lev, Guy</creatorcontrib><creatorcontrib>Shmueli-Scheuer, Michal</creatorcontrib><creatorcontrib>Jerbi, Achiya</creatorcontrib><creatorcontrib>Konopnicki, David</creatorcontrib><title>orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions</title><description>Frequently Asked Questions (FAQ) webpages are created by organizations for their users. FAQs are used in several scenarios, e.g., to answer user questions. On the other hand, the content of FAQs is affected by user questions by definition. In order to promote research in this field, several FAQ datasets exist. However, we claim that being collected from community websites, they do not correctly represent challenges associated with FAQs in an organizational context. Thus, we release orgFAQ, a new dataset composed of $6988$ user questions and $1579$ corresponding FAQs that were extracted from organizations' FAQ webpages in the Jobs domain. 
In this paper, we provide an analysis of the properties of such FAQs, and demonstrate the usefulness of our new dataset by utilizing it in a relevant task from the Jobs domain. We also show the value of the orgFAQ dataset in a task of a different domain - the COVID-19 pandemic.</description><subject>Computer Science - Computation and Language</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj8tKw0AUhmfjQqoP4MrzAolnMrfEXahWhWAp1HU4SU7KQE1kJl7q09umrn74b_AJcSMx1bkxeEfhx3-lGWKRotQWL0U1ht2q3NxDCa_8DQ80UeQJaOigHGh_iD7COMA67GjwvzT58ejCcRHnzlvkAJtPjqcgXomLnvaRr_91Ibarx-3yOanWTy_LskrIOkwKLdkpx73h1mQK80JbzjpJEiWqvLVIjWutxt6glYRoG2c6Zwt2Xd9opxbi9nw749Qfwb9TONQnrHrGUn-bpUXh</recordid><startdate>20200903</startdate><enddate>20200903</enddate><creator>Lev, Guy</creator><creator>Shmueli-Scheuer, Michal</creator><creator>Jerbi, Achiya</creator><creator>Konopnicki, David</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20200903</creationdate><title>orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions</title><author>Lev, Guy ; Shmueli-Scheuer, Michal ; Jerbi, Achiya ; Konopnicki, David</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a670-941e737ef5ec52308946e2d1a101038c60ab7c640f5061a006b75d769e7dfb473</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Computer Science - Computation and Language</topic><toplevel>online_resources</toplevel><creatorcontrib>Lev, Guy</creatorcontrib><creatorcontrib>Shmueli-Scheuer, Michal</creatorcontrib><creatorcontrib>Jerbi, Achiya</creatorcontrib><creatorcontrib>Konopnicki, David</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Lev, Guy</au><au>Shmueli-Scheuer, Michal</au><au>Jerbi, Achiya</au><au>Konopnicki, David</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions</atitle><date>2020-09-03</date><risdate>2020</risdate><abstract>Frequently Asked Questions (FAQ) webpages are created by organizations for their users. FAQs are used in several scenarios, e.g., to answer user questions. On the other hand, the content of FAQs is affected by user questions by definition. In order to promote research in this field, several FAQ datasets exist. However, we claim that being collected from community websites, they do not correctly represent challenges associated with FAQs in an organizational context. Thus, we release orgFAQ, a new dataset composed of $6988$ user questions and $1579$ corresponding FAQs that were extracted from organizations' FAQ webpages in the Jobs domain. In this paper, we provide an analysis of the properties of such FAQs, and demonstrate the usefulness of our new dataset by utilizing it in a relevant task from the Jobs domain. We also show the value of the orgFAQ dataset in a task of a different domain - the COVID-19 pandemic.</abstract><doi>10.48550/arxiv.2009.01460</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2009.01460
ispartof
issn
language eng
recordid cdi_arxiv_primary_2009_01460
source arXiv.org
subjects Computer Science - Computation and Language
title orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-20T13%3A20%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=orgFAQ:%20A%20New%20Dataset%20and%20Analysis%20on%20Organizational%20FAQs%20and%20User%20Questions&rft.au=Lev,%20Guy&rft.date=2020-09-03&rft_id=info:doi/10.48550/arxiv.2009.01460&rft_dat=%3Carxiv_GOX%3E2009_01460%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true