Social context summarization using user-generated content and third-party sources

• A novel framework for social context summarization is proposed. • The framework relies on the reinforcement support of external information. • 23 features in three groups (local, user-generated, and third-party) are proposed. • A new open-domain dataset is created and manually annotated. • Combining interna...

Detailed description

Bibliographic details
Published in: Knowledge-based systems 2018-03, Vol.144, p.51-64
Main authors: Nguyen, Minh-Tien, Tran, Duc-Vu, Nguyen, Le-Minh
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 64
container_issue
container_start_page 51
container_title Knowledge-based systems
container_volume 144
creator Nguyen, Minh-Tien
Tran, Duc-Vu
Nguyen, Le-Minh
description • A novel framework for social context summarization is proposed. • The framework relies on the reinforcement support of external information. • 23 features in three groups (local, user-generated, and third-party) are proposed. • A new open-domain dataset is created and manually annotated. • Combining internal and external information benefits the summarization. In the context of social media, users share their interest in an event mentioned in a Web document. The content of that document can also be found at other news providers, with variations in writing. This paper presents a framework that exploits social context (user-generated content such as comments or tweets, and third-party sources such as relevant documents retrieved from a search engine) to extract high-quality summaries. The extraction is formulated in two steps: sentence scoring and selection. Scoring is modeled as a learning-to-rank problem in which Ranking SVM jointly exploits sentences, user-generated content, and third-party sources, encoded as features, to cover summary aspects. For selection, summaries are extracted using a score-based or voting method. For evaluation, three datasets for sentence and highlight extraction in two languages were used as a case study. Experimental results indicate that, by integrating user-generated content and third-party sources, our framework improves ROUGE scores over state-of-the-art methods for single-document summarization.
doi_str_mv 10.1016/j.knosys.2017.12.023
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2041718667</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0950705117306019</els_id><sourcerecordid>2041718667</sourcerecordid><originalsourceid>FETCH-LOGICAL-c400t-555c277fd852c8c0531be93f0541503074580d4d6fc1cbfbfa2d58de7dedb0613</originalsourceid><addsrcrecordid>eNp9kE9LxDAQxYMouK5-Aw8Fz62TtGm6F0HEf7Agop5Dm0zX1N1kTVJx_fRmqWcvM5f33sz7EXJOoaBA68uh-LAu7ELBgIqCsgJYeUBmtBEsFxUsDskMFhxyAZwek5MQBgBgjDYz8vzilGnXmXI24nfMwrjZtN78tNE4m43B2FWa6PMVWvRtRD1Jbcxaq7P4brzOt62Puyy40SsMp-Sob9cBz_72nLzd3b7ePOTLp_vHm-tlriqAmHPOFROi1w1nqlHAS9rhouyBV5RDCaLiDehK172iquu7vmWaNxqFRt1BTcs5uZhyt959jhiiHNIDNp2UDCoqaFPXIqmqSaW8C8FjL7fepIY7SUHu4clBTvDkHp6kTCZ4yXY12TA1-DLoZVAGrUJtPKootTP_B_wCHNx7mA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2041718667</pqid></control><display><type>article</type><title>Social context summarization using user-generated content and third-party sources</title><source>Access via ScienceDirect (Elsevier)</source><creator>Nguyen, Minh-Tien ; Tran, Duc-Vu ; Nguyen, Le-Minh</creator><creatorcontrib>Nguyen, Minh-Tien ; Tran, Duc-Vu ; Nguyen, Le-Minh</creatorcontrib><description>•A novel framework for social context summarization is proposed.•The framework relies on the reinforcement support of external information.•23 features in three groups: local, user-generated, and third-party are proposed.•A new open-domain dataset is created and manually annotated.•Combining internal and external information benefits the summarization. In the context of social media, users mutually share their interests of an event mentioned in a Web document. Its content can also be found in different news providers with a writing variation. 
This paper presents a framework which exploits the support of social context (user-generated content such as comments or tweets and third-party sources such as relevant documents retrieved from a search engine) to extract high-quality summaries. The extraction was formulated in two steps: sentence scoring and selection. The scoring is modeled as a learning to rank problem, which employs Ranking SVM to mutually exploits sentences, user-generated content, and third-party sources in the form of features to cover summary aspects. For the selection, summaries are extracted by using a score-based or voting method. For evaluation, three datasets of sentence and highlight extraction in two languages were taken as a case study. Experimental results indicate that by integrating user-generated content and third-party sources, our framework obtains improvements of ROUGE-scores over state-of-the-art methods for single-document summarization.</description><identifier>ISSN: 0950-7051</identifier><identifier>EISSN: 1872-7409</identifier><identifier>DOI: 10.1016/j.knosys.2017.12.023</identifier><language>eng</language><publisher>Amsterdam: Elsevier B.V</publisher><subject>Case studies ; Data mining ; Digital media ; Document summarization ; Electronic documents ; Feature extraction ; Information retrieval ; Learning to rank ; Sentences ; Social context summarization ; Social networks ; Summaries ; User generated content</subject><ispartof>Knowledge-based systems, 2018-03, Vol.144, p.51-64</ispartof><rights>2017</rights><rights>Copyright Elsevier Science Ltd. 
Mar 15, 2018</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c400t-555c277fd852c8c0531be93f0541503074580d4d6fc1cbfbfa2d58de7dedb0613</citedby><cites>FETCH-LOGICAL-c400t-555c277fd852c8c0531be93f0541503074580d4d6fc1cbfbfa2d58de7dedb0613</cites><orcidid>0000-0002-5028-0608</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.knosys.2017.12.023$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>315,781,785,3551,27929,27930,46000</link.rule.ids></links><search><creatorcontrib>Nguyen, Minh-Tien</creatorcontrib><creatorcontrib>Tran, Duc-Vu</creatorcontrib><creatorcontrib>Nguyen, Le-Minh</creatorcontrib><title>Social context summarization using user-generated content and third-party sources</title><title>Knowledge-based systems</title><description>•A novel framework for social context summarization is proposed.•The framework relies on the reinforcement support of external information.•23 features in three groups: local, user-generated, and third-party are proposed.•A new open-domain dataset is created and manually annotated.•Combining internal and external information benefits the summarization. In the context of social media, users mutually share their interests of an event mentioned in a Web document. Its content can also be found in different news providers with a writing variation. This paper presents a framework which exploits the support of social context (user-generated content such as comments or tweets and third-party sources such as relevant documents retrieved from a search engine) to extract high-quality summaries. The extraction was formulated in two steps: sentence scoring and selection. 
The scoring is modeled as a learning to rank problem, which employs Ranking SVM to mutually exploits sentences, user-generated content, and third-party sources in the form of features to cover summary aspects. For the selection, summaries are extracted by using a score-based or voting method. For evaluation, three datasets of sentence and highlight extraction in two languages were taken as a case study. Experimental results indicate that by integrating user-generated content and third-party sources, our framework obtains improvements of ROUGE-scores over state-of-the-art methods for single-document summarization.</description><subject>Case studies</subject><subject>Data mining</subject><subject>Digital media</subject><subject>Document summarization</subject><subject>Electronic documents</subject><subject>Feature extraction</subject><subject>Information retrieval</subject><subject>Learning to rank</subject><subject>Sentences</subject><subject>Social context summarization</subject><subject>Social networks</subject><subject>Summaries</subject><subject>User generated content</subject><issn>0950-7051</issn><issn>1872-7409</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNp9kE9LxDAQxYMouK5-Aw8Fz62TtGm6F0HEf7Agop5Dm0zX1N1kTVJx_fRmqWcvM5f33sz7EXJOoaBA68uh-LAu7ELBgIqCsgJYeUBmtBEsFxUsDskMFhxyAZwek5MQBgBgjDYz8vzilGnXmXI24nfMwrjZtN78tNE4m43B2FWa6PMVWvRtRD1Jbcxaq7P4brzOt62Puyy40SsMp-Sob9cBz_72nLzd3b7ePOTLp_vHm-tlriqAmHPOFROi1w1nqlHAS9rhouyBV5RDCaLiDehK172iquu7vmWaNxqFRt1BTcs5uZhyt959jhiiHNIDNp2UDCoqaFPXIqmqSaW8C8FjL7fepIY7SUHu4clBTvDkHp6kTCZ4yXY12TA1-DLoZVAGrUJtPKootTP_B_wCHNx7mA</recordid><startdate>20180315</startdate><enddate>20180315</enddate><creator>Nguyen, Minh-Tien</creator><creator>Tran, Duc-Vu</creator><creator>Nguyen, Le-Minh</creator><general>Elsevier B.V</general><general>Elsevier Science 
Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>E3H</scope><scope>F2A</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-5028-0608</orcidid></search><sort><creationdate>20180315</creationdate><title>Social context summarization using user-generated content and third-party sources</title><author>Nguyen, Minh-Tien ; Tran, Duc-Vu ; Nguyen, Le-Minh</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c400t-555c277fd852c8c0531be93f0541503074580d4d6fc1cbfbfa2d58de7dedb0613</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Case studies</topic><topic>Data mining</topic><topic>Digital media</topic><topic>Document summarization</topic><topic>Electronic documents</topic><topic>Feature extraction</topic><topic>Information retrieval</topic><topic>Learning to rank</topic><topic>Sentences</topic><topic>Social context summarization</topic><topic>Social networks</topic><topic>Summaries</topic><topic>User generated content</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Nguyen, Minh-Tien</creatorcontrib><creatorcontrib>Tran, Duc-Vu</creatorcontrib><creatorcontrib>Nguyen, Le-Minh</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; Information Science Abstracts (LISA)</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Knowledge-based 
systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Nguyen, Minh-Tien</au><au>Tran, Duc-Vu</au><au>Nguyen, Le-Minh</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Social context summarization using user-generated content and third-party sources</atitle><jtitle>Knowledge-based systems</jtitle><date>2018-03-15</date><risdate>2018</risdate><volume>144</volume><spage>51</spage><epage>64</epage><pages>51-64</pages><issn>0950-7051</issn><eissn>1872-7409</eissn><abstract>•A novel framework for social context summarization is proposed.•The framework relies on the reinforcement support of external information.•23 features in three groups: local, user-generated, and third-party are proposed.•A new open-domain dataset is created and manually annotated.•Combining internal and external information benefits the summarization. In the context of social media, users mutually share their interests of an event mentioned in a Web document. Its content can also be found in different news providers with a writing variation. This paper presents a framework which exploits the support of social context (user-generated content such as comments or tweets and third-party sources such as relevant documents retrieved from a search engine) to extract high-quality summaries. The extraction was formulated in two steps: sentence scoring and selection. The scoring is modeled as a learning to rank problem, which employs Ranking SVM to mutually exploits sentences, user-generated content, and third-party sources in the form of features to cover summary aspects. For the selection, summaries are extracted by using a score-based or voting method. For evaluation, three datasets of sentence and highlight extraction in two languages were taken as a case study. 
Experimental results indicate that by integrating user-generated content and third-party sources, our framework obtains improvements of ROUGE-scores over state-of-the-art methods for single-document summarization.</abstract><cop>Amsterdam</cop><pub>Elsevier B.V</pub><doi>10.1016/j.knosys.2017.12.023</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-5028-0608</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0950-7051
ispartof Knowledge-based systems, 2018-03, Vol.144, p.51-64
issn 0950-7051
1872-7409
language eng
recordid cdi_proquest_journals_2041718667
source Access via ScienceDirect (Elsevier)
subjects Case studies
Data mining
Digital media
Document summarization
Electronic documents
Feature extraction
Information retrieval
Learning to rank
Sentences
Social context summarization
Social networks
Summaries
User generated content
title Social context summarization using user-generated content and third-party sources
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-11T16%3A00%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Social%20context%20summarization%20using%20user-generated%20content%20and%20third-party%20sources&rft.jtitle=Knowledge-based%20systems&rft.au=Nguyen,%20Minh-Tien&rft.date=2018-03-15&rft.volume=144&rft.spage=51&rft.epage=64&rft.pages=51-64&rft.issn=0950-7051&rft.eissn=1872-7409&rft_id=info:doi/10.1016/j.knosys.2017.12.023&rft_dat=%3Cproquest_cross%3E2041718667%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2041718667&rft_id=info:pmid/&rft_els_id=S0950705117306019&rfr_iscdi=true
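
The description field above outlines a two-step extraction: sentence scoring as a learning-to-rank problem, followed by selection. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a linear scorer trained with a pairwise hinge loss (in the spirit of Ranking SVM) and a plain score-based selection of the top m sentences. The three feature columns (a local feature, overlap with user comments, overlap with third-party documents), the toy values, and all function names are hypothetical; the record does not list the paper's 23 features.

```python
from itertools import combinations


def train_pairwise_ranker(features, labels, epochs=200, lr=0.1, reg=0.01):
    """Fit a linear scorer w with a pairwise hinge loss (Ranking-SVM-style).

    For every pair where one sentence has a higher label than the other,
    we want w . (x_high - x_low) >= 1; violations trigger a subgradient step.
    """
    dim = len(features[0])
    w = [0.0] * dim
    # Build difference vectors for all pairs with different labels.
    pairs = []
    for i, j in combinations(range(len(features)), 2):
        if labels[i] == labels[j]:
            continue
        high, low = (i, j) if labels[i] > labels[j] else (j, i)
        pairs.append([a - b for a, b in zip(features[high], features[low])])
    for _ in range(epochs):
        for diff in pairs:
            margin = sum(wk * dk for wk, dk in zip(w, diff))
            for k in range(dim):
                # L2 regularization plus hinge subgradient when the margin is violated.
                grad = reg * w[k] - (diff[k] if margin < 1.0 else 0.0)
                w[k] -= lr * grad
    return w


def score(w, x):
    """Linear score of one sentence's feature vector."""
    return sum(wk * xk for wk, xk in zip(w, x))


def select_top(sentences, features, w, m):
    """Score-based selection: keep the m best sentences, in document order."""
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(w, features[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:m])]


# Hypothetical toy data: [local feature, comment overlap, third-party overlap].
train_features = [
    [1.0, 0.8, 0.7],
    [0.5, 0.6, 0.9],
    [0.3, 0.1, 0.2],
    [0.1, 0.0, 0.1],
]
train_labels = [1, 1, 0, 0]  # 1 = summary-worthy sentence, 0 = not

w = train_pairwise_ranker(train_features, train_labels)

doc_sentences = ["s0: main event", "s1: aside", "s2: key detail", "s3: filler"]
doc_features = [
    [0.9, 0.7, 0.8],
    [0.2, 0.1, 0.0],
    [0.6, 0.9, 0.5],
    [0.1, 0.2, 0.1],
]
summary = select_top(doc_sentences, doc_features, w, m=2)
print(summary)
```

The pairwise formulation only needs relative preferences between sentences, which is why it fits a ranking task better than predicting absolute labels. The voting-based selection mentioned in the description is not sketched here because the record gives no detail on how votes are combined.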