Enhancing unsupervised neural networks based text summarization with word embedding and ensemble learning
•Word2vec representation improves the summarization task compared to bag of words.•Feature learning using unsupervised neural networks improves the summarization task.•Unsupervised neural networks trained on word2vec vectors give promising results.•Ensemble learning with word2vec representation obtains the best results.
Saved in:
Veröffentlicht in: | Expert systems with applications 2019-06, Vol.123, p.195-211 |
---|---|
Main authors: | Alami, Nabil ; Meknassi, Mohammed ; En-nahnahi, Noureddine |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 211 |
---|---|
container_issue | |
container_start_page | 195 |
container_title | Expert systems with applications |
container_volume | 123 |
creator | Alami, Nabil ; Meknassi, Mohammed ; En-nahnahi, Noureddine |
description | •Word2vec representation improves the summarization task compared to bag of words.•Feature learning using unsupervised neural networks improves the summarization task.•Unsupervised neural networks trained on word2vec vectors give promising results.•Ensemble learning with word2vec representation obtains the best results.
The vast amounts of data being collected and analyzed have become an invaluable source of information, which needs to be easily handled by humans. Automatic Text Summarization (ATS) systems enable users to get the gist of information and knowledge in a short time in order to make critical decisions quickly. Deep neural networks have proven their ability to achieve excellent performance in many real-world Natural Language Processing and computer vision applications; however, they have received little attention in ATS. The key problem of traditional approaches is that they involve high-dimensional and sparse data, which makes it difficult to capture relevant information. One technique for overcoming these problems is learning features via dimensionality reduction. Word embedding, on the other hand, is another neural network technique that generates a much more compact word representation than the traditional Bag-of-Words (BOW) approach. In this paper, we seek to enhance the quality of ATS by integrating unsupervised deep neural network techniques with the word embedding approach. First, we develop a word embedding based text summarizer and show that the Word2Vec representation gives better results than the traditional BOW representation. Second, we propose further models that combine Word2Vec with unsupervised feature learning methods in order to merge information from different sources. We show that unsupervised neural network models trained on the Word2Vec representation give better results than those trained on the BOW representation. Third, we also propose three ensemble techniques. The first ensemble combines BOW and Word2Vec using a majority voting technique. The second aggregates the information provided by the BOW approach and unsupervised neural networks. The third aggregates the information provided by Word2Vec and unsupervised neural networks.
We show that the ensemble methods improve the quality of ATS; in particular, the ensemble based on the Word2Vec approach gives the best results. Finally, we perform various experiments to evaluate the performance of the investigated models, using two kinds of datasets that are publicly available for evaluating the ATS task. Results of statistical studies affirm that word embedding based models outperform those based on the BOW approach on the summarization task. In particular, the ensemble learning technique with the Word2Vec representation surpasses all the investigated models. |
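The abstract contrasts sparse BOW vectors with dense Word2Vec-style sentence vectors. The paper's own code is not part of this record; the toy sketch below only illustrates that contrast, with a made-up four-word vocabulary and invented 2-d "embeddings" (all names and values are hypothetical, not the authors' data):

```python
# Sparse Bag-of-Words sentence vector vs. dense Word2Vec-style vector
# (mean of word embeddings). Vocabulary and vectors invented for illustration.
import math

EMB = {  # hypothetical pretrained word vectors
    "neural":  [0.9, 0.1],
    "network": [0.8, 0.2],
    "text":    [0.1, 0.9],
    "summary": [0.2, 0.8],
}
VOCAB = sorted(EMB)  # ["network", "neural", "summary", "text"]

def bow_vector(tokens):
    # One count per vocabulary entry: high-dimensional and sparse.
    return [tokens.count(w) for w in VOCAB]

def w2v_vector(tokens):
    # Mean of the word embeddings: low-dimensional and dense.
    vecs = [EMB[t] for t in tokens if t in EMB]
    return [sum(c) / len(vecs) for c in zip(*vecs)]

def cosine(a, b):
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

s1, s2 = ["neural", "network"], ["text", "summary"]
print(cosine(bow_vector(s1), bow_vector(s2)))                 # → 0.0
print(round(cosine(w2v_vector(s1), w2v_vector(s2)), 3))       # → 0.342
```

With no shared words the BOW similarity is exactly zero, while the embedding-based similarity is positive: this is the sense in which Word2Vec captures relatedness that sparse BOW representations miss.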
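The first ensemble described above combines base summarizers by majority voting. The abstract does not specify the voting rule in detail; the sketch below assumes a strict majority over extracted sentence indices, with three hypothetical base pickers standing in for the BOW, Word2Vec, and autoencoder models:

```python
# Majority-vote ensemble over extractive summarizers (illustrative sketch).
# Each base model contributes the set of sentence indices it would extract;
# a sentence joins the final summary when a strict majority selects it.
from collections import Counter

def majority_vote(selections):
    """selections: one set of chosen sentence indices per base model."""
    votes = Counter(i for sel in selections for i in sel)
    threshold = len(selections) / 2  # strict majority of models
    return sorted(i for i, v in votes.items() if v > threshold)

# Hypothetical picks from three base summarizers over a 6-sentence document.
bow_pick = {0, 2, 5}
w2v_pick = {0, 1, 2}
ae_pick  = {0, 2, 4}
print(majority_vote([bow_pick, w2v_pick, ae_pick]))  # → [0, 2]
```

Only sentences 0 and 2 are chosen by at least two of the three models, so only they survive the vote; singleton picks are discarded as likely noise from an individual representation.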
doi_str_mv | 10.1016/j.eswa.2019.01.037 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2193150625</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0957417419300375</els_id><sourcerecordid>2193150625</sourcerecordid><originalsourceid>FETCH-LOGICAL-c328t-1cf1e854aa864a7ae88f1ac10d81353b19ad77a393213f7311ae773bc84ae4bd3</originalsourceid><addsrcrecordid>eNp9UMtOwzAQtBBIlMIPcIrEOcEbJ3UicUFVeUiVuMDZ2tgb6tI6xU5a4OtxVM6cRjuamd0dxq6BZ8BhdrvOKBwwyznUGYeMC3nCJlBJkc5kLU7ZhNelTAuQxTm7CGHNOUjO5YTZhVuh09a9J4MLw4783gYyiaPB4yZCf-j8R0gaHNmevvokDNstevuDve1ccrD9Kokak9C2IWPGJHRxciESG0o2hN5F9pKdtbgJdPWHU_b2sHidP6XLl8fn-f0y1SKv-hR0C1SVBWI1K1AiVVULqIGbCkQpGqjRSImiFjmIVgoAJClFo6sCqWiMmLKbY-7Od58DhV6tu8G7uFLlUAso-Swvoyo_qrTvQvDUqp238a1vBVyNlaq1GitVY6WKg4qVRtPd0UTx_r0lr4K25DQZ60n3ynT2P_svtgGB-Q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2193150625</pqid></control><display><type>article</type><title>Enhancing unsupervised neural networks based text summarization with word embedding and ensemble learning</title><source>ScienceDirect Journals (5 years ago - present)</source><creator>Alami, Nabil ; Meknassi, Mohammed ; En-nahnahi, Noureddine</creator><creatorcontrib>Alami, Nabil ; Meknassi, Mohammed ; En-nahnahi, Noureddine</creatorcontrib><description>•Word2vec representation improves the summarization task compared to bag of words.•Feature learning using unsupervised neural networks improves the summarization task.•Unsupervised neural networks trained on word2vec vectors gives promising results.•Ensemble learning with word2vec representation obtains the best results.
The vast amounts of data being collected and analyzed have led to invaluable source of information, which needs to be easily handled by humans. Automatic Text Summarization (ATS) systems enable users to get the gist of information and knowledge in a short time in order to make critical decisions quickly. Deep neural networks have proven their ability to achieve excellent performance in many real-world Natural Language Processing and computer vision applications. However, it still lacks attention in ATS. The key problem of traditional applications is that they involve high dimensional and sparse data, which makes it difficult to capture relevant information. One technique for overcoming these problems is learning features via dimensionality reduction. On the other hand, word embedding is another neural network technique that generates a much more compact word representation than a traditional Bag-of-Words (BOW) approach. In this paper, we are seeking to enhance the quality of ATS by integrating unsupervised deep neural network techniques with word embedding approach. First, we develop a word embedding based text summarization, and we show that Word2Vec representation gives better results than traditional BOW representation. Second, we propose other models by combining word2vec and unsupervised feature learning methods in order to merge information from different sources. We show that unsupervised neural networks models trained on Word2Vec representation give better results than those trained on BOW representation. Third, we also propose three ensemble techniques. The first ensemble combines BOW and word2vec using a majority voting technique. The second ensemble aggregates the information provided by the BOW approach and unsupervised neural networks. The third ensemble aggregates the information provided by Word2Vec and unsupervised neural networks. 
We show that the ensemble methods improve the quality of ATS, in particular the ensemble based on word2vec approach gives better results. Finally, we perform different experiments to evaluate the performance of the investigated models. We use two kind of datasets that are publically available for evaluating ATS task. Results of statistical studies affirm that word embedding-based models outperform the summarization task compared to those based on BOW approach. In particular, ensemble learning technique with Word2Vec representation surpass all the investigated models.</description><identifier>ISSN: 0957-4174</identifier><identifier>EISSN: 1873-6793</identifier><identifier>DOI: 10.1016/j.eswa.2019.01.037</identifier><language>eng</language><publisher>New York: Elsevier Ltd</publisher><subject>Aggregates ; Artificial neural networks ; Auto-encoder ; Computer vision ; Embedding ; Ensemble learning ; Extreme learning machine ; Natural language processing ; Neural networks ; Performance evaluation ; Representations ; Text summarization ; Variational auto-encoder ; Word2vec</subject><ispartof>Expert systems with applications, 2019-06, Vol.123, p.195-211</ispartof><rights>2019 Elsevier Ltd</rights><rights>Copyright Elsevier BV Jun 1, 2019</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c328t-1cf1e854aa864a7ae88f1ac10d81353b19ad77a393213f7311ae773bc84ae4bd3</citedby><cites>FETCH-LOGICAL-c328t-1cf1e854aa864a7ae88f1ac10d81353b19ad77a393213f7311ae773bc84ae4bd3</cites><orcidid>0000-0003-1641-1501</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.eswa.2019.01.037$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids></links><search><creatorcontrib>Alami, 
Nabil</creatorcontrib><creatorcontrib>Meknassi, Mohammed</creatorcontrib><creatorcontrib>En-nahnahi, Noureddine</creatorcontrib><title>Enhancing unsupervised neural networks based text summarization with word embedding and ensemble learning</title><title>Expert systems with applications</title><description>•Word2vec representation improves the summarization task compared to bag of words.•Feature learning using unsupervised neural networks improves the summarization task.•Unsupervised neural networks trained on word2vec vectors gives promising results.•Ensemble learning with word2vec representation obtains the best results.
The vast amounts of data being collected and analyzed have led to invaluable source of information, which needs to be easily handled by humans. Automatic Text Summarization (ATS) systems enable users to get the gist of information and knowledge in a short time in order to make critical decisions quickly. Deep neural networks have proven their ability to achieve excellent performance in many real-world Natural Language Processing and computer vision applications. However, it still lacks attention in ATS. The key problem of traditional applications is that they involve high dimensional and sparse data, which makes it difficult to capture relevant information. One technique for overcoming these problems is learning features via dimensionality reduction. On the other hand, word embedding is another neural network technique that generates a much more compact word representation than a traditional Bag-of-Words (BOW) approach. In this paper, we are seeking to enhance the quality of ATS by integrating unsupervised deep neural network techniques with word embedding approach. First, we develop a word embedding based text summarization, and we show that Word2Vec representation gives better results than traditional BOW representation. Second, we propose other models by combining word2vec and unsupervised feature learning methods in order to merge information from different sources. We show that unsupervised neural networks models trained on Word2Vec representation give better results than those trained on BOW representation. Third, we also propose three ensemble techniques. The first ensemble combines BOW and word2vec using a majority voting technique. The second ensemble aggregates the information provided by the BOW approach and unsupervised neural networks. The third ensemble aggregates the information provided by Word2Vec and unsupervised neural networks. 
We show that the ensemble methods improve the quality of ATS, in particular the ensemble based on word2vec approach gives better results. Finally, we perform different experiments to evaluate the performance of the investigated models. We use two kind of datasets that are publically available for evaluating ATS task. Results of statistical studies affirm that word embedding-based models outperform the summarization task compared to those based on BOW approach. In particular, ensemble learning technique with Word2Vec representation surpass all the investigated models.</description><subject>Aggregates</subject><subject>Artificial neural networks</subject><subject>Auto-encoder</subject><subject>Computer vision</subject><subject>Embedding</subject><subject>Ensemble learning</subject><subject>Extreme learning machine</subject><subject>Natural language processing</subject><subject>Neural networks</subject><subject>Performance evaluation</subject><subject>Representations</subject><subject>Text summarization</subject><subject>Variational auto-encoder</subject><subject>Word2vec</subject><issn>0957-4174</issn><issn>1873-6793</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><recordid>eNp9UMtOwzAQtBBIlMIPcIrEOcEbJ3UicUFVeUiVuMDZ2tgb6tI6xU5a4OtxVM6cRjuamd0dxq6BZ8BhdrvOKBwwyznUGYeMC3nCJlBJkc5kLU7ZhNelTAuQxTm7CGHNOUjO5YTZhVuh09a9J4MLw4783gYyiaPB4yZCf-j8R0gaHNmevvokDNstevuDve1ccrD9Kokak9C2IWPGJHRxciESG0o2hN5F9pKdtbgJdPWHU_b2sHidP6XLl8fn-f0y1SKv-hR0C1SVBWI1K1AiVVULqIGbCkQpGqjRSImiFjmIVgoAJClFo6sCqWiMmLKbY-7Od58DhV6tu8G7uFLlUAso-Swvoyo_qrTvQvDUqp238a1vBVyNlaq1GitVY6WKg4qVRtPd0UTx_r0lr4K25DQZ60n3ynT2P_svtgGB-Q</recordid><startdate>20190601</startdate><enddate>20190601</enddate><creator>Alami, Nabil</creator><creator>Meknassi, Mohammed</creator><creator>En-nahnahi, Noureddine</creator><general>Elsevier Ltd</general><general>Elsevier 
BV</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-1641-1501</orcidid></search><sort><creationdate>20190601</creationdate><title>Enhancing unsupervised neural networks based text summarization with word embedding and ensemble learning</title><author>Alami, Nabil ; Meknassi, Mohammed ; En-nahnahi, Noureddine</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c328t-1cf1e854aa864a7ae88f1ac10d81353b19ad77a393213f7311ae773bc84ae4bd3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Aggregates</topic><topic>Artificial neural networks</topic><topic>Auto-encoder</topic><topic>Computer vision</topic><topic>Embedding</topic><topic>Ensemble learning</topic><topic>Extreme learning machine</topic><topic>Natural language processing</topic><topic>Neural networks</topic><topic>Performance evaluation</topic><topic>Representations</topic><topic>Text summarization</topic><topic>Variational auto-encoder</topic><topic>Word2vec</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Alami, Nabil</creatorcontrib><creatorcontrib>Meknassi, Mohammed</creatorcontrib><creatorcontrib>En-nahnahi, Noureddine</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Expert systems with applications</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Alami, Nabil</au><au>Meknassi, Mohammed</au><au>En-nahnahi, Noureddine</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Enhancing unsupervised neural networks based text summarization with word embedding and ensemble learning</atitle><jtitle>Expert systems with applications</jtitle><date>2019-06-01</date><risdate>2019</risdate><volume>123</volume><spage>195</spage><epage>211</epage><pages>195-211</pages><issn>0957-4174</issn><eissn>1873-6793</eissn><abstract>•Word2vec representation improves the summarization task compared to bag of words.•Feature learning using unsupervised neural networks improves the summarization task.•Unsupervised neural networks trained on word2vec vectors gives promising results.•Ensemble learning with word2vec representation obtains the best results.
The vast amounts of data being collected and analyzed have led to invaluable source of information, which needs to be easily handled by humans. Automatic Text Summarization (ATS) systems enable users to get the gist of information and knowledge in a short time in order to make critical decisions quickly. Deep neural networks have proven their ability to achieve excellent performance in many real-world Natural Language Processing and computer vision applications. However, it still lacks attention in ATS. The key problem of traditional applications is that they involve high dimensional and sparse data, which makes it difficult to capture relevant information. One technique for overcoming these problems is learning features via dimensionality reduction. On the other hand, word embedding is another neural network technique that generates a much more compact word representation than a traditional Bag-of-Words (BOW) approach. In this paper, we are seeking to enhance the quality of ATS by integrating unsupervised deep neural network techniques with word embedding approach. First, we develop a word embedding based text summarization, and we show that Word2Vec representation gives better results than traditional BOW representation. Second, we propose other models by combining word2vec and unsupervised feature learning methods in order to merge information from different sources. We show that unsupervised neural networks models trained on Word2Vec representation give better results than those trained on BOW representation. Third, we also propose three ensemble techniques. The first ensemble combines BOW and word2vec using a majority voting technique. The second ensemble aggregates the information provided by the BOW approach and unsupervised neural networks. The third ensemble aggregates the information provided by Word2Vec and unsupervised neural networks. 
We show that the ensemble methods improve the quality of ATS, in particular the ensemble based on word2vec approach gives better results. Finally, we perform different experiments to evaluate the performance of the investigated models. We use two kind of datasets that are publically available for evaluating ATS task. Results of statistical studies affirm that word embedding-based models outperform the summarization task compared to those based on BOW approach. In particular, ensemble learning technique with Word2Vec representation surpass all the investigated models.</abstract><cop>New York</cop><pub>Elsevier Ltd</pub><doi>10.1016/j.eswa.2019.01.037</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0003-1641-1501</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0957-4174 |
ispartof | Expert systems with applications, 2019-06, Vol.123, p.195-211 |
issn | 0957-4174 1873-6793 |
language | eng |
recordid | cdi_proquest_journals_2193150625 |
source | ScienceDirect Journals (5 years ago - present) |
subjects | Aggregates ; Artificial neural networks ; Auto-encoder ; Computer vision ; Embedding ; Ensemble learning ; Extreme learning machine ; Natural language processing ; Neural networks ; Performance evaluation ; Representations ; Text summarization ; Variational auto-encoder ; Word2vec |
title | Enhancing unsupervised neural networks based text summarization with word embedding and ensemble learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T00%3A13%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Enhancing%20unsupervised%20neural%20networks%20based%20text%20summarization%20with%20word%20embedding%20and%20ensemble%20learning&rft.jtitle=Expert%20systems%20with%20applications&rft.au=Alami,%20Nabil&rft.date=2019-06-01&rft.volume=123&rft.spage=195&rft.epage=211&rft.pages=195-211&rft.issn=0957-4174&rft.eissn=1873-6793&rft_id=info:doi/10.1016/j.eswa.2019.01.037&rft_dat=%3Cproquest_cross%3E2193150625%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2193150625&rft_id=info:pmid/&rft_els_id=S0957417419300375&rfr_iscdi=true |