WikiAsp: A Dataset for Multi-domain Aspect-based Summarization

Aspect-based summarization is the task of generating focused summaries based on specific points of interest. Such summaries aid efficient analysis of text, such as quickly understanding reviews or opinions from different angles. However, due to large differences in the type of aspects for different...

Bibliographic details
Main authors:	Hayashi, Hiroaki; Budania, Prashant; Wang, Peng; Ackerson, Chris; Neervannan, Raj; Neubig, Graham
Format:	Article
Language:	eng
Subjects:	Computer Science - Computation and Language

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Hayashi, Hiroaki ; Budania, Prashant ; Wang, Peng ; Ackerson, Chris ; Neervannan, Raj ; Neubig, Graham
description	Aspect-based summarization is the task of generating focused summaries based on specific points of interest. Such summaries aid efficient analysis of text, such as quickly understanding reviews or opinions from different angles. However, due to large differences in the type of aspects for different domains (e.g., sentiment, product features), the development of previous models has tended to be domain-specific. In this paper, we propose WikiAsp, a large-scale dataset for multi-domain aspect-based summarization that attempts to spur research in the direction of open-domain aspect-based summarization. Specifically, we build the dataset using Wikipedia articles from 20 different domains, using the section titles and boundaries of each article as a proxy for aspect annotation. We propose several straightforward baseline models for this task and conduct experiments on the dataset. Results highlight key challenges that existing summarization models face in this setting, such as proper pronoun handling of quoted sources and consistent explanation of time-sensitive events.
doi_str_mv	10.48550/arxiv.2011.07832
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2011_07832</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2011_07832</sourcerecordid><originalsourceid>FETCH-LOGICAL-a672-af02742b23d3a4a82388530994e9f9a79ba831775bafbccf5f771848e07c3f333</originalsourceid><addsrcrecordid>eNotj71OwzAURr0woMIDMOEXcLB9ba7NgBSVv0pFDFRijK4TW7JomipxEfD0hML0DUf6dA5jF0pWxlkrr2j8zB-VlkpVEh3oU3b7lt9zPe1veM3vqNAUC0_DyJ8P25JFN_SUd3zmsS0izLTjr4e-pzF_U8nD7oydJNpO8fx_F2zzcL9ZPon1y-NqWa8FXaMWlKRGo4OGDsiQ0-CcBem9iT55Qh_IgUK0gVJo22QTonLGRYktJABYsMu_22NAsx_zrPDV_IY0xxD4AdnmQhg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>WikiAsp: A Dataset for Multi-domain Aspect-based Summarization</title><source>arXiv.org</source><creator>Hayashi, Hiroaki ; Budania, Prashant ; Wang, Peng ; Ackerson, Chris ; Neervannan, Raj ; Neubig, Graham</creator><creatorcontrib>Hayashi, Hiroaki ; Budania, Prashant ; Wang, Peng ; Ackerson, Chris ; Neervannan, Raj ; Neubig, Graham</creatorcontrib><description>Aspect-based summarization is the task of generating focused summaries based on specific points of interest. Such summaries aid efficient analysis of text, such as quickly understanding reviews or opinions from different angles. However, due to large differences in the type of aspects for different domains (e.g., sentiment, product features), the development of previous models has tended to be domain-specific. In this paper, we propose WikiAsp, a large-scale dataset for multi-domain aspect-based summarization that attempts to spur research in the direction of open-domain aspect-based summarization. Specifically, we build the dataset using Wikipedia articles from 20 different domains, using the section titles and boundaries of each article as a proxy for aspect annotation. 
We propose several straightforward baseline models for this task and conduct experiments on the dataset. Results highlight key challenges that existing summarization models face in this setting, such as proper pronoun handling of quoted sources and consistent explanation of time-sensitive events.</description><identifier>DOI: 10.48550/arxiv.2011.07832</identifier><language>eng</language><subject>Computer Science - Computation and Language</subject><creationdate>2020-11</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2011.07832$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2011.07832$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Hayashi, Hiroaki</creatorcontrib><creatorcontrib>Budania, Prashant</creatorcontrib><creatorcontrib>Wang, Peng</creatorcontrib><creatorcontrib>Ackerson, Chris</creatorcontrib><creatorcontrib>Neervannan, Raj</creatorcontrib><creatorcontrib>Neubig, Graham</creatorcontrib><title>WikiAsp: A Dataset for Multi-domain Aspect-based Summarization</title><description>Aspect-based summarization is the task of generating focused summaries based on specific points of interest. Such summaries aid efficient analysis of text, such as quickly understanding reviews or opinions from different angles. However, due to large differences in the type of aspects for different domains (e.g., sentiment, product features), the development of previous models has tended to be domain-specific. 
In this paper, we propose WikiAsp, a large-scale dataset for multi-domain aspect-based summarization that attempts to spur research in the direction of open-domain aspect-based summarization. Specifically, we build the dataset using Wikipedia articles from 20 different domains, using the section titles and boundaries of each article as a proxy for aspect annotation. We propose several straightforward baseline models for this task and conduct experiments on the dataset. Results highlight key challenges that existing summarization models face in this setting, such as proper pronoun handling of quoted sources and consistent explanation of time-sensitive events.</description><subject>Computer Science - Computation and Language</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj71OwzAURr0woMIDMOEXcLB9ba7NgBSVv0pFDFRijK4TW7JomipxEfD0hML0DUf6dA5jF0pWxlkrr2j8zB-VlkpVEh3oU3b7lt9zPe1veM3vqNAUC0_DyJ8P25JFN_SUd3zmsS0izLTjr4e-pzF_U8nD7oydJNpO8fx_F2zzcL9ZPon1y-NqWa8FXaMWlKRGo4OGDsiQ0-CcBem9iT55Qh_IgUK0gVJo22QTonLGRYktJABYsMu_22NAsx_zrPDV_IY0xxD4AdnmQhg</recordid><startdate>20201116</startdate><enddate>20201116</enddate><creator>Hayashi, Hiroaki</creator><creator>Budania, Prashant</creator><creator>Wang, Peng</creator><creator>Ackerson, Chris</creator><creator>Neervannan, Raj</creator><creator>Neubig, Graham</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20201116</creationdate><title>WikiAsp: A Dataset for Multi-domain Aspect-based Summarization</title><author>Hayashi, Hiroaki ; Budania, Prashant ; Wang, Peng ; Ackerson, Chris ; Neervannan, Raj ; Neubig, 
Graham</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a672-af02742b23d3a4a82388530994e9f9a79ba831775bafbccf5f771848e07c3f333</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Computer Science - Computation and Language</topic><toplevel>online_resources</toplevel><creatorcontrib>Hayashi, Hiroaki</creatorcontrib><creatorcontrib>Budania, Prashant</creatorcontrib><creatorcontrib>Wang, Peng</creatorcontrib><creatorcontrib>Ackerson, Chris</creatorcontrib><creatorcontrib>Neervannan, Raj</creatorcontrib><creatorcontrib>Neubig, Graham</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hayashi, Hiroaki</au><au>Budania, Prashant</au><au>Wang, Peng</au><au>Ackerson, Chris</au><au>Neervannan, Raj</au><au>Neubig, Graham</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>WikiAsp: A Dataset for Multi-domain Aspect-based Summarization</atitle><date>2020-11-16</date><risdate>2020</risdate><abstract>Aspect-based summarization is the task of generating focused summaries based on specific points of interest. Such summaries aid efficient analysis of text, such as quickly understanding reviews or opinions from different angles. However, due to large differences in the type of aspects for different domains (e.g., sentiment, product features), the development of previous models has tended to be domain-specific. In this paper, we propose WikiAsp, a large-scale dataset for multi-domain aspect-based summarization that attempts to spur research in the direction of open-domain aspect-based summarization. Specifically, we build the dataset using Wikipedia articles from 20 different domains, using the section titles and boundaries of each article as a proxy for aspect annotation. 
We propose several straightforward baseline models for this task and conduct experiments on the dataset. Results highlight key challenges that existing summarization models face in this setting, such as proper pronoun handling of quoted sources and consistent explanation of time-sensitive events.</abstract><doi>10.48550/arxiv.2011.07832</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2011.07832
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2011_07832
source	arXiv.org
subjects	Computer Science - Computation and Language
title	WikiAsp: A Dataset for Multi-domain Aspect-based Summarization
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T20%3A27%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=WikiAsp:%20A%20Dataset%20for%20Multi-domain%20Aspect-based%20Summarization&rft.au=Hayashi,%20Hiroaki&rft.date=2020-11-16&rft_id=info:doi/10.48550/arxiv.2011.07832&rft_dat=%3Carxiv_GOX%3E2011_07832%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true