A multi-view method of scientific paper classification via heterogeneous graph embeddings

The classification task of scientific papers can be implemented based on contents or citations. In order to improve the performance on this task, we express papers as nodes and integrate scientific papers’ contents and citations into a heterogeneous graph. It has two types of edges. One type represe...

Bibliographic Details
Published in: Scientometrics 2022-08, Vol.127 (8), p.4847-4872
Main Authors: Lv, Yiqin; Xie, Zheng; Zuo, Xiaojing; Song, Yiping
Format: Article
Language: eng
Subjects:
Online Access: Full text
container_end_page 4872
container_issue 8
container_start_page 4847
container_title Scientometrics
container_volume 127
creator Lv, Yiqin
Xie, Zheng
Zuo, Xiaojing
Song, Yiping
description The classification task of scientific papers can be implemented based on contents or citations. In order to improve the performance on this task, we express papers as nodes and integrate scientific papers’ contents and citations into a heterogeneous graph. It has two types of edges. One type represents the semantic similarity between papers, derived from papers’ titles and abstracts. The other type represents the citation relationship between papers and the journals or proceedings of conferences of their references. We utilize a contrastive learning method to embed the nodes in the heterogeneous graph into a vector space. Then, we feed the paper node vectors into classifiers, such as the decision tree, multilayer perceptron, and so on. We conduct experiments on three datasets of scientific papers: the Microsoft Academic Graph with 63,211 scientific papers in 20 classes, the Proceedings of the National Academy of Sciences with 38,243 scientific papers in 18 classes, and the American Physical Society with 443,845 scientific papers in 5 classes. The experimental results on the multi-class task show that our multi-view method achieves a classification accuracy of up to 98%, outperforming state-of-the-art methods.
doi_str_mv 10.1007/s11192-022-04419-1
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2700752681</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2700752681</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-75a52f085f95d2af660d9b2eeb3d39de82f541786d5cb243d71cee51fef43bec3</originalsourceid><addsrcrecordid>eNp9kE1LxDAQhoMouK7-AU8Bz9VM0rTpcVn8ggUvevAU0mbSzbL9MGlX_Pd2reDNwzAMvM8M8xByDewWGMvvIgAUPGF8qjSFIoETsgCpVMJVBqdkwUCopADBzslFjDs2QYKpBXlf0WbcDz45ePykDQ7bztLO0Vh5bAfvfEV702Og1d7EeJzN4LuWHryhWxwwdDW22I2R1sH0W4pNidb6to6X5MyZfcSr374kbw_3r-unZPPy-LxebZJKQDEkuTSSO6akK6TlxmUZs0XJEUthRWFRcSdTyFVmZVXyVNgcKkQJDl0qSqzEktzMe_vQfYwYB73rxtBOJzXPpzclzxRMKT6nqtDFGNDpPvjGhC8NTB8V6lmhnhTqH4X6CIkZilO4rTH8rf6H-gbOgnV7</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2700752681</pqid></control><display><type>article</type><title>A multi-view method of scientific paper classification via heterogeneous graph embeddings</title><source>SpringerNature Journals</source><creator>Lv, Yiqin ; Xie, Zheng ; Zuo, Xiaojing ; Song, Yiping</creator><creatorcontrib>Lv, Yiqin ; Xie, Zheng ; Zuo, Xiaojing ; Song, Yiping</creatorcontrib><description>The classification task of scientific papers can be implemented based on contents or citations. In order to improve the performance on this task, we express papers as nodes and integrate scientific papers’ contents and citations into a heterogeneous graph. It has two types of edges. One type represents the semantic similarity between papers, derived from papers’ titles and abstracts. The other type represents the citation relationship between papers and the journals or proceedings of conferences of their references. We utilize a contrastive learning method to embed the nodes in the heterogeneous graph into a vector space. Then, we feed the paper node vectors into classifiers, such as the decision tree, multilayer perceptron, and so on. 
We conduct experiments on three datasets of scientific papers: the Microsoft Academic Graph with 63,211 scientific papers in 20 classes, the Proceedings of the National Academy of Sciences with 38,243 scientific papers in 18 classes, and the American Physical Society with 443,845 scientific papers in 5 classes. The experimental results on the multi-class task show that our multi-view method scores the classification accuracy up to 98%, outperforming state-of-the-arts.</description><identifier>ISSN: 0138-9130</identifier><identifier>EISSN: 1588-2861</identifier><identifier>DOI: 10.1007/s11192-022-04419-1</identifier><language>eng</language><publisher>Cham: Springer International Publishing</publisher><subject>Classification ; Computer Science ; Decision trees ; Graph theory ; Information Storage and Retrieval ; Library Science ; Multilayer perceptrons ; Nodes ; Scientific papers ; Vector spaces</subject><ispartof>Scientometrics, 2022-08, Vol.127 (8), p.4847-4872</ispartof><rights>Akadémiai Kiadó, Budapest, Hungary 2022</rights><rights>Akadémiai Kiadó, Budapest, Hungary 2022.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-75a52f085f95d2af660d9b2eeb3d39de82f541786d5cb243d71cee51fef43bec3</citedby><cites>FETCH-LOGICAL-c319t-75a52f085f95d2af660d9b2eeb3d39de82f541786d5cb243d71cee51fef43bec3</cites><orcidid>0000-0003-0391-8725</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11192-022-04419-1$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11192-022-04419-1$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,41488,42557,51319</link.rule.ids></links><search><creatorcontrib>Lv, Yiqin</creatorcontrib><creatorcontrib>Xie, 
Zheng</creatorcontrib><creatorcontrib>Zuo, Xiaojing</creatorcontrib><creatorcontrib>Song, Yiping</creatorcontrib><title>A multi-view method of scientific paper classification via heterogeneous graph embeddings</title><title>Scientometrics</title><addtitle>Scientometrics</addtitle><description>The classification task of scientific papers can be implemented based on contents or citations. In order to improve the performance on this task, we express papers as nodes and integrate scientific papers’ contents and citations into a heterogeneous graph. It has two types of edges. One type represents the semantic similarity between papers, derived from papers’ titles and abstracts. The other type represents the citation relationship between papers and the journals or proceedings of conferences of their references. We utilize a contrastive learning method to embed the nodes in the heterogeneous graph into a vector space. Then, we feed the paper node vectors into classifiers, such as the decision tree, multilayer perceptron, and so on. We conduct experiments on three datasets of scientific papers: the Microsoft Academic Graph with 63,211 scientific papers in 20 classes, the Proceedings of the National Academy of Sciences with 38,243 scientific papers in 18 classes, and the American Physical Society with 443,845 scientific papers in 5 classes. 
The experimental results on the multi-class task show that our multi-view method scores the classification accuracy up to 98%, outperforming state-of-the-arts.</description><subject>Classification</subject><subject>Computer Science</subject><subject>Decision trees</subject><subject>Graph theory</subject><subject>Information Storage and Retrieval</subject><subject>Library Science</subject><subject>Multilayer perceptrons</subject><subject>Nodes</subject><subject>Scientific papers</subject><subject>Vector spaces</subject><issn>0138-9130</issn><issn>1588-2861</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNp9kE1LxDAQhoMouK7-AU8Bz9VM0rTpcVn8ggUvevAU0mbSzbL9MGlX_Pd2reDNwzAMvM8M8xByDewWGMvvIgAUPGF8qjSFIoETsgCpVMJVBqdkwUCopADBzslFjDs2QYKpBXlf0WbcDz45ePykDQ7bztLO0Vh5bAfvfEV702Og1d7EeJzN4LuWHryhWxwwdDW22I2R1sH0W4pNidb6to6X5MyZfcSr374kbw_3r-unZPPy-LxebZJKQDEkuTSSO6akK6TlxmUZs0XJEUthRWFRcSdTyFVmZVXyVNgcKkQJDl0qSqzEktzMe_vQfYwYB73rxtBOJzXPpzclzxRMKT6nqtDFGNDpPvjGhC8NTB8V6lmhnhTqH4X6CIkZilO4rTH8rf6H-gbOgnV7</recordid><startdate>20220801</startdate><enddate>20220801</enddate><creator>Lv, Yiqin</creator><creator>Xie, Zheng</creator><creator>Zuo, Xiaojing</creator><creator>Song, Yiping</creator><general>Springer International Publishing</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>E3H</scope><scope>F2A</scope><orcidid>https://orcid.org/0000-0003-0391-8725</orcidid></search><sort><creationdate>20220801</creationdate><title>A multi-view method of scientific paper classification via heterogeneous graph embeddings</title><author>Lv, Yiqin ; Xie, Zheng ; Zuo, Xiaojing ; Song, 
Yiping</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-75a52f085f95d2af660d9b2eeb3d39de82f541786d5cb243d71cee51fef43bec3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Classification</topic><topic>Computer Science</topic><topic>Decision trees</topic><topic>Graph theory</topic><topic>Information Storage and Retrieval</topic><topic>Library Science</topic><topic>Multilayer perceptrons</topic><topic>Nodes</topic><topic>Scientific papers</topic><topic>Vector spaces</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lv, Yiqin</creatorcontrib><creatorcontrib>Xie, Zheng</creatorcontrib><creatorcontrib>Zuo, Xiaojing</creatorcontrib><creatorcontrib>Song, Yiping</creatorcontrib><collection>CrossRef</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; Information Science Abstracts (LISA)</collection><jtitle>Scientometrics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lv, Yiqin</au><au>Xie, Zheng</au><au>Zuo, Xiaojing</au><au>Song, Yiping</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A multi-view method of scientific paper classification via heterogeneous graph embeddings</atitle><jtitle>Scientometrics</jtitle><stitle>Scientometrics</stitle><date>2022-08-01</date><risdate>2022</risdate><volume>127</volume><issue>8</issue><spage>4847</spage><epage>4872</epage><pages>4847-4872</pages><issn>0138-9130</issn><eissn>1588-2861</eissn><abstract>The classification task of scientific papers can be implemented based on contents or citations. In order to improve the performance on this task, we express papers as nodes and integrate scientific papers’ contents and citations into a heterogeneous graph. It has two types of edges. 
One type represents the semantic similarity between papers, derived from papers’ titles and abstracts. The other type represents the citation relationship between papers and the journals or proceedings of conferences of their references. We utilize a contrastive learning method to embed the nodes in the heterogeneous graph into a vector space. Then, we feed the paper node vectors into classifiers, such as the decision tree, multilayer perceptron, and so on. We conduct experiments on three datasets of scientific papers: the Microsoft Academic Graph with 63,211 scientific papers in 20 classes, the Proceedings of the National Academy of Sciences with 38,243 scientific papers in 18 classes, and the American Physical Society with 443,845 scientific papers in 5 classes. The experimental results on the multi-class task show that our multi-view method scores the classification accuracy up to 98%, outperforming state-of-the-arts.</abstract><cop>Cham</cop><pub>Springer International Publishing</pub><doi>10.1007/s11192-022-04419-1</doi><tpages>26</tpages><orcidid>https://orcid.org/0000-0003-0391-8725</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0138-9130
ispartof Scientometrics, 2022-08, Vol.127 (8), p.4847-4872
issn 0138-9130
1588-2861
language eng
recordid cdi_proquest_journals_2700752681
source SpringerNature Journals
subjects Classification
Computer Science
Decision trees
Graph theory
Information Storage and Retrieval
Library Science
Multilayer perceptrons
Nodes
Scientific papers
Vector spaces
title A multi-view method of scientific paper classification via heterogeneous graph embeddings
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T02%3A53%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20multi-view%20method%20of%20scientific%20paper%20classification%20via%20heterogeneous%20graph%20embeddings&rft.jtitle=Scientometrics&rft.au=Lv,%20Yiqin&rft.date=2022-08-01&rft.volume=127&rft.issue=8&rft.spage=4847&rft.epage=4872&rft.pages=4847-4872&rft.issn=0138-9130&rft.eissn=1588-2861&rft_id=info:doi/10.1007/s11192-022-04419-1&rft_dat=%3Cproquest_cross%3E2700752681%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2700752681&rft_id=info:pmid/&rfr_iscdi=true