Computational linguistics literature and citations oriented citation linkage, classification and summarization
Scientific literature is currently the most important resource for scholars, and their citations have provided researchers with a powerful latent way to analyze scientific trends, influences and relationships of works and authors. This paper is focused on automatic citation analysis and summarizatio...
Published in: | International journal on digital libraries 2018-09, Vol.19 (2-3), p.173-190 |
---|---|
Main authors: | Li, Lei ; Mao, Liyuan ; Zhang, Yazhao ; Chi, Junqi ; Huang, Taiwen ; Cong, Xiaoyue ; Peng, Heng |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 190 |
---|---|
container_issue | 2-3 |
container_start_page | 173 |
container_title | International journal on digital libraries |
container_volume | 19 |
creator | Li, Lei ; Mao, Liyuan ; Zhang, Yazhao ; Chi, Junqi ; Huang, Taiwen ; Cong, Xiaoyue ; Peng, Heng |
description | Scientific literature is currently the most important resource for scholars, and their citations have provided researchers with a powerful latent way to analyze scientific trends, influences and relationships of works and authors. This paper is focused on automatic citation analysis and summarization for the scientific literature of computational linguistics, which are also the shared tasks in the 2016 workshop of the 2nd Computational Linguistics Scientific Document Summarization at BIRNDL 2016 (The Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries). Each citation linkage between a citation and the spans of text in the reference paper is recognized according to their content similarities via various computational methods. Then the cited text span is classified to five pre-defined facets, i.e., Hypothesis, Implication, Aim, Results and Method, based on various features of lexicons and rules via Support Vector Machine and Voting Method. Finally, a summary of the reference paper from the cited text spans is generated within 250 words. hLDA (hierarchical Latent Dirichlet Allocation) topic model is adopted for content modeling, which provides knowledge about sentence clustering (subtopic) and word distributions (abstractiveness) for summarization. We combine hLDA knowledge with several other classical features using different weights and proportions to evaluate the sentences in the reference paper. Our systems have been ranked top one and top two according to the evaluation results published by BIRNDL 2016, which has verified the effectiveness of our methods. |
doi_str_mv | 10.1007/s00799-017-0219-5 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2088229280</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2088229280</sourcerecordid><originalsourceid>FETCH-LOGICAL-c316t-638d9c3f8ee53a68de141eb6b1bd258c0a1b00fea81920d23621b90518d3459c3</originalsourceid><addsrcrecordid>eNp1kE9PwzAMxSMEEmPwAbhV4krBTpYuPaKJf9IkLnCO0jSdMrp2xOkBPj0pnbQTF9t6ej_LfoxdI9whwPKeUinLHHCZA8cylydshgvBcxQAp4dZAvJzdkG0BQBUuJyxbtXv9kM00fedabPWd5vBU_SW0hxdMHEILjNdnVk_uSjrg3dddEdpxD7Nxt1mtjVEvvF20keOht3OBP_zp1yys8a05K4Ofc4-nh7fVy_5-u35dfWwzq3AIuaFUHVpRaOck8IUqna4QFcVFVY1l8qCwQqgcUZhyaHmouBYlSBR1WIhEzlnN9Pefei_BkdRb_shpA9Jc1CK85IrSC6cXDb0RME1eh98OvZbI-gxVj3FqlOseoxVy8TwiaHk7TYuHDf_D_0COmN9Ag</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2088229280</pqid></control><display><type>article</type><title>Computational linguistics literature and citations oriented citation linkage, classification and summarization</title><source>SpringerLink Journals - AutoHoldings</source><creator>Li, Lei ; Mao, Liyuan ; Zhang, Yazhao ; Chi, Junqi ; Huang, Taiwen ; Cong, Xiaoyue ; Peng, Heng</creator><creatorcontrib>Li, Lei ; Mao, Liyuan ; Zhang, Yazhao ; Chi, Junqi ; Huang, Taiwen ; Cong, Xiaoyue ; Peng, Heng</creatorcontrib><description>Scientific literature is currently the most important resource for scholars, and their citations have provided researchers with a powerful latent way to analyze scientific trends, influences and relationships of works and authors. This paper is focused on automatic citation analysis and summarization for the scientific literature of computational linguistics, which are also the shared tasks in the 2016 workshop of the 2nd Computational Linguistics Scientific Document Summarization at BIRNDL 2016 (The Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries). 
Each citation linkage between a citation and the spans of text in the reference paper is recognized according to their content similarities via various computational methods. Then the cited text span is classified to five pre-defined facets, i.e., Hypothesis, Implication, Aim, Results and Method, based on various features of lexicons and rules via Support Vector Machine and Voting Method. Finally, a summary of the reference paper from the cited text spans is generated within 250 words. hLDA (hierarchical Latent Dirichlet Allocation) topic model is adopted for content modeling, which provides knowledge about sentence clustering (subtopic) and word distributions (abstractiveness) for summarization. We combine hLDA knowledge with several other classical features using different weights and proportions to evaluate the sentences in the reference paper. Our systems have been ranked top one and top two according to the evaluation results published by BIRNDL 2016, which has verified the effectiveness of our methods.</description><identifier>ISSN: 1432-5012</identifier><identifier>EISSN: 1432-1300</identifier><identifier>DOI: 10.1007/s00799-017-0219-5</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Bibliometrics ; Citation analysis ; Clustering ; Computation ; Computer Science ; Database Management ; Dirichlet problem ; Information retrieval ; Information Systems and Communication Service ; Linguistics ; Natural language processing ; Sentences ; Support vector machines</subject><ispartof>International journal on digital libraries, 2018-09, Vol.19 (2-3), p.173-190</ispartof><rights>Springer-Verlag GmbH Germany 2017</rights><rights>International Journal on Digital Libraries is a copyright of Springer, (2017). 
All Rights Reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c316t-638d9c3f8ee53a68de141eb6b1bd258c0a1b00fea81920d23621b90518d3459c3</citedby><cites>FETCH-LOGICAL-c316t-638d9c3f8ee53a68de141eb6b1bd258c0a1b00fea81920d23621b90518d3459c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s00799-017-0219-5$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s00799-017-0219-5$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,27923,27924,41487,42556,51318</link.rule.ids></links><search><creatorcontrib>Li, Lei</creatorcontrib><creatorcontrib>Mao, Liyuan</creatorcontrib><creatorcontrib>Zhang, Yazhao</creatorcontrib><creatorcontrib>Chi, Junqi</creatorcontrib><creatorcontrib>Huang, Taiwen</creatorcontrib><creatorcontrib>Cong, Xiaoyue</creatorcontrib><creatorcontrib>Peng, Heng</creatorcontrib><title>Computational linguistics literature and citations oriented citation linkage, classification and summarization</title><title>International journal on digital libraries</title><addtitle>Int J Digit Libr</addtitle><description>Scientific literature is currently the most important resource for scholars, and their citations have provided researchers with a powerful latent way to analyze scientific trends, influences and relationships of works and authors. This paper is focused on automatic citation analysis and summarization for the scientific literature of computational linguistics, which are also the shared tasks in the 2016 workshop of the 2nd Computational Linguistics Scientific Document Summarization at BIRNDL 2016 (The Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries). 
Each citation linkage between a citation and the spans of text in the reference paper is recognized according to their content similarities via various computational methods. Then the cited text span is classified to five pre-defined facets, i.e., Hypothesis, Implication, Aim, Results and Method, based on various features of lexicons and rules via Support Vector Machine and Voting Method. Finally, a summary of the reference paper from the cited text spans is generated within 250 words. hLDA (hierarchical Latent Dirichlet Allocation) topic model is adopted for content modeling, which provides knowledge about sentence clustering (subtopic) and word distributions (abstractiveness) for summarization. We combine hLDA knowledge with several other classical features using different weights and proportions to evaluate the sentences in the reference paper. Our systems have been ranked top one and top two according to the evaluation results published by BIRNDL 2016, which has verified the effectiveness of our methods.</description><subject>Bibliometrics</subject><subject>Citation analysis</subject><subject>Clustering</subject><subject>Computation</subject><subject>Computer Science</subject><subject>Database Management</subject><subject>Dirichlet problem</subject><subject>Information retrieval</subject><subject>Information Systems and Communication Service</subject><subject>Linguistics</subject><subject>Natural language processing</subject><subject>Sentences</subject><subject>Support vector 
machines</subject><issn>1432-5012</issn><issn>1432-1300</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>8G5</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNp1kE9PwzAMxSMEEmPwAbhV4krBTpYuPaKJf9IkLnCO0jSdMrp2xOkBPj0pnbQTF9t6ej_LfoxdI9whwPKeUinLHHCZA8cylydshgvBcxQAp4dZAvJzdkG0BQBUuJyxbtXv9kM00fedabPWd5vBU_SW0hxdMHEILjNdnVk_uSjrg3dddEdpxD7Nxt1mtjVEvvF20keOht3OBP_zp1yys8a05K4Ofc4-nh7fVy_5-u35dfWwzq3AIuaFUHVpRaOck8IUqna4QFcVFVY1l8qCwQqgcUZhyaHmouBYlSBR1WIhEzlnN9Pefei_BkdRb_shpA9Jc1CK85IrSC6cXDb0RME1eh98OvZbI-gxVj3FqlOseoxVy8TwiaHk7TYuHDf_D_0COmN9Ag</recordid><startdate>20180901</startdate><enddate>20180901</enddate><creator>Li, Lei</creator><creator>Mao, Liyuan</creator><creator>Zhang, Yazhao</creator><creator>Chi, Junqi</creator><creator>Huang, Taiwen</creator><creator>Cong, Xiaoyue</creator><creator>Peng, Heng</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature 
B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7XB</scope><scope>88I</scope><scope>8AL</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ALSLI</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>CNYFK</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M1O</scope><scope>M2O</scope><scope>M2P</scope><scope>MBDVC</scope><scope>P5Z</scope><scope>P62</scope><scope>PADUT</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope></search><sort><creationdate>20180901</creationdate><title>Computational linguistics literature and citations oriented citation linkage, classification and summarization</title><author>Li, Lei ; Mao, Liyuan ; Zhang, Yazhao ; Chi, Junqi ; Huang, Taiwen ; Cong, Xiaoyue ; Peng, Heng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c316t-638d9c3f8ee53a68de141eb6b1bd258c0a1b00fea81920d23621b90518d3459c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Bibliometrics</topic><topic>Citation analysis</topic><topic>Clustering</topic><topic>Computation</topic><topic>Computer Science</topic><topic>Database Management</topic><topic>Dirichlet problem</topic><topic>Information retrieval</topic><topic>Information Systems and Communication Service</topic><topic>Linguistics</topic><topic>Natural language processing</topic><topic>Sentences</topic><topic>Support vector machines</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Li, Lei</creatorcontrib><creatorcontrib>Mao, 
Liyuan</creatorcontrib><creatorcontrib>Zhang, Yazhao</creatorcontrib><creatorcontrib>Chi, Junqi</creatorcontrib><creatorcontrib>Huang, Taiwen</creatorcontrib><creatorcontrib>Cong, Xiaoyue</creatorcontrib><creatorcontrib>Peng, Heng</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Science Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Social Science Premium Collection</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>Library & Information Science Collection</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Library Science 
Database</collection><collection>Research Library</collection><collection>Science Database</collection><collection>Research Library (Corporate)</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Research Library China</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><jtitle>International journal on digital libraries</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Li, Lei</au><au>Mao, Liyuan</au><au>Zhang, Yazhao</au><au>Chi, Junqi</au><au>Huang, Taiwen</au><au>Cong, Xiaoyue</au><au>Peng, Heng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Computational linguistics literature and citations oriented citation linkage, classification and summarization</atitle><jtitle>International journal on digital libraries</jtitle><stitle>Int J Digit Libr</stitle><date>2018-09-01</date><risdate>2018</risdate><volume>19</volume><issue>2-3</issue><spage>173</spage><epage>190</epage><pages>173-190</pages><issn>1432-5012</issn><eissn>1432-1300</eissn><abstract>Scientific literature is currently the most important resource for scholars, and their citations have provided researchers with a powerful latent way to analyze scientific trends, influences and relationships of works and authors. 
This paper is focused on automatic citation analysis and summarization for the scientific literature of computational linguistics, which are also the shared tasks in the 2016 workshop of the 2nd Computational Linguistics Scientific Document Summarization at BIRNDL 2016 (The Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries). Each citation linkage between a citation and the spans of text in the reference paper is recognized according to their content similarities via various computational methods. Then the cited text span is classified to five pre-defined facets, i.e., Hypothesis, Implication, Aim, Results and Method, based on various features of lexicons and rules via Support Vector Machine and Voting Method. Finally, a summary of the reference paper from the cited text spans is generated within 250 words. hLDA (hierarchical Latent Dirichlet Allocation) topic model is adopted for content modeling, which provides knowledge about sentence clustering (subtopic) and word distributions (abstractiveness) for summarization. We combine hLDA knowledge with several other classical features using different weights and proportions to evaluate the sentences in the reference paper. Our systems have been ranked top one and top two according to the evaluation results published by BIRNDL 2016, which has verified the effectiveness of our methods.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s00799-017-0219-5</doi><tpages>18</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1432-5012 |
ispartof | International journal on digital libraries, 2018-09, Vol.19 (2-3), p.173-190 |
issn | 1432-5012 ; 1432-1300 |
language | eng |
recordid | cdi_proquest_journals_2088229280 |
source | SpringerLink Journals - AutoHoldings |
subjects | Bibliometrics ; Citation analysis ; Clustering ; Computation ; Computer Science ; Database Management ; Dirichlet problem ; Information retrieval ; Information Systems and Communication Service ; Linguistics ; Natural language processing ; Sentences ; Support vector machines |
title | Computational linguistics literature and citations oriented citation linkage, classification and summarization |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T08%3A43%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Computational%20linguistics%20literature%20and%20citations%20oriented%20citation%20linkage,%20classification%20and%20summarization&rft.jtitle=International%20journal%20on%20digital%20libraries&rft.au=Li,%20Lei&rft.date=2018-09-01&rft.volume=19&rft.issue=2-3&rft.spage=173&rft.epage=190&rft.pages=173-190&rft.issn=1432-5012&rft.eissn=1432-1300&rft_id=info:doi/10.1007/s00799-017-0219-5&rft_dat=%3Cproquest_cross%3E2088229280%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2088229280&rft_id=info:pmid/&rfr_iscdi=true |