A multi-view method of scientific paper classification via heterogeneous graph embeddings
Scientific papers can be classified based on their contents or their citations. To improve performance on this task, we represent papers as nodes and integrate their contents and citations into a heterogeneous graph. It has two types of edges. One type represe...
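Published in: | Scientometrics 2022-08, Vol.127 (8), p.4847-4872 |
---|---|
Main authors: | Lv, Yiqin ; Xie, Zheng ; Zuo, Xiaojing ; Song, Yiping |
Format: | Article |
Language: | eng |
Subjects: | Classification ; Computer Science ; Decision trees ; Graph theory ; Information Storage and Retrieval ; Library Science ; Multilayer perceptrons ; Nodes ; Scientific papers ; Vector spaces |
Online access: | Full text |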
container_end_page | 4872 |
---|---|
container_issue | 8 |
container_start_page | 4847 |
container_title | Scientometrics |
container_volume | 127 |
creator | Lv, Yiqin ; Xie, Zheng ; Zuo, Xiaojing ; Song, Yiping |
description | Scientific papers can be classified based on their contents or their citations. To improve performance on this task, we represent papers as nodes and integrate their contents and citations into a heterogeneous graph with two types of edges. One type represents the semantic similarity between papers, derived from their titles and abstracts. The other type represents the citation relationship between papers and the journals or conference proceedings of their references. We use a contrastive learning method to embed the nodes of the heterogeneous graph into a vector space, and then feed the paper node vectors into classifiers such as decision trees and multilayer perceptrons. We conduct experiments on three datasets of scientific papers: the Microsoft Academic Graph, with 63,211 papers in 20 classes; the Proceedings of the National Academy of Sciences, with 38,243 papers in 18 classes; and the American Physical Society, with 443,845 papers in 5 classes. The experimental results on the multi-class task show that our multi-view method achieves a classification accuracy of up to 98%, outperforming state-of-the-art methods. |
doi_str_mv | 10.1007/s11192-022-04419-1 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2700752681</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2700752681</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-75a52f085f95d2af660d9b2eeb3d39de82f541786d5cb243d71cee51fef43bec3</originalsourceid><addsrcrecordid>eNp9kE1LxDAQhoMouK7-AU8Bz9VM0rTpcVn8ggUvevAU0mbSzbL9MGlX_Pd2reDNwzAMvM8M8xByDewWGMvvIgAUPGF8qjSFIoETsgCpVMJVBqdkwUCopADBzslFjDs2QYKpBXlf0WbcDz45ePykDQ7bztLO0Vh5bAfvfEV702Og1d7EeJzN4LuWHryhWxwwdDW22I2R1sH0W4pNidb6to6X5MyZfcSr374kbw_3r-unZPPy-LxebZJKQDEkuTSSO6akK6TlxmUZs0XJEUthRWFRcSdTyFVmZVXyVNgcKkQJDl0qSqzEktzMe_vQfYwYB73rxtBOJzXPpzclzxRMKT6nqtDFGNDpPvjGhC8NTB8V6lmhnhTqH4X6CIkZilO4rTH8rf6H-gbOgnV7</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2700752681</pqid></control><display><type>article</type><title>A multi-view method of scientific paper classification via heterogeneous graph embeddings</title><source>SpringerNature Journals</source><creator>Lv, Yiqin ; Xie, Zheng ; Zuo, Xiaojing ; Song, Yiping</creator><creatorcontrib>Lv, Yiqin ; Xie, Zheng ; Zuo, Xiaojing ; Song, Yiping</creatorcontrib><description>The classification task of scientific papers can be implemented based on contents or citations. In order to improve the performance on this task, we express papers as nodes and integrate scientific papers’ contents and citations into a heterogeneous graph. It has two types of edges. One type represents the semantic similarity between papers, derived from papers’ titles and abstracts. The other type represents the citation relationship between papers and the journals or proceedings of conferences of their references. We utilize a contrastive learning method to embed the nodes in the heterogeneous graph into a vector space. 
Then, we feed the paper node vectors into classifiers, such as the decision tree, multilayer perceptron, and so on. We conduct experiments on three datasets of scientific papers: the Microsoft Academic Graph with 63,211 scientific papers in 20 classes, the Proceedings of the National Academy of Sciences with 38,243 scientific papers in 18 classes, and the American Physical Society with 443,845 scientific papers in 5 classes. The experimental results on the multi-class task show that our multi-view method scores the classification accuracy up to 98%, outperforming state-of-the-arts.</description><identifier>ISSN: 0138-9130</identifier><identifier>EISSN: 1588-2861</identifier><identifier>DOI: 10.1007/s11192-022-04419-1</identifier><language>eng</language><publisher>Cham: Springer International Publishing</publisher><subject>Classification ; Computer Science ; Decision trees ; Graph theory ; Information Storage and Retrieval ; Library Science ; Multilayer perceptrons ; Nodes ; Scientific papers ; Vector spaces</subject><ispartof>Scientometrics, 2022-08, Vol.127 (8), p.4847-4872</ispartof><rights>Akadémiai Kiadó, Budapest, Hungary 2022</rights><rights>Akadémiai Kiadó, Budapest, Hungary 
2022.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-75a52f085f95d2af660d9b2eeb3d39de82f541786d5cb243d71cee51fef43bec3</citedby><cites>FETCH-LOGICAL-c319t-75a52f085f95d2af660d9b2eeb3d39de82f541786d5cb243d71cee51fef43bec3</cites><orcidid>0000-0003-0391-8725</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11192-022-04419-1$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11192-022-04419-1$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,41488,42557,51319</link.rule.ids></links><search><creatorcontrib>Lv, Yiqin</creatorcontrib><creatorcontrib>Xie, Zheng</creatorcontrib><creatorcontrib>Zuo, Xiaojing</creatorcontrib><creatorcontrib>Song, Yiping</creatorcontrib><title>A multi-view method of scientific paper classification via heterogeneous graph embeddings</title><title>Scientometrics</title><addtitle>Scientometrics</addtitle><description>The classification task of scientific papers can be implemented based on contents or citations. In order to improve the performance on this task, we express papers as nodes and integrate scientific papers’ contents and citations into a heterogeneous graph. It has two types of edges. One type represents the semantic similarity between papers, derived from papers’ titles and abstracts. The other type represents the citation relationship between papers and the journals or proceedings of conferences of their references. We utilize a contrastive learning method to embed the nodes in the heterogeneous graph into a vector space. Then, we feed the paper node vectors into classifiers, such as the decision tree, multilayer perceptron, and so on. 
We conduct experiments on three datasets of scientific papers: the Microsoft Academic Graph with 63,211 scientific papers in 20 classes, the Proceedings of the National Academy of Sciences with 38,243 scientific papers in 18 classes, and the American Physical Society with 443,845 scientific papers in 5 classes. The experimental results on the multi-class task show that our multi-view method scores the classification accuracy up to 98%, outperforming state-of-the-arts.</description><subject>Classification</subject><subject>Computer Science</subject><subject>Decision trees</subject><subject>Graph theory</subject><subject>Information Storage and Retrieval</subject><subject>Library Science</subject><subject>Multilayer perceptrons</subject><subject>Nodes</subject><subject>Scientific papers</subject><subject>Vector spaces</subject><issn>0138-9130</issn><issn>1588-2861</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNp9kE1LxDAQhoMouK7-AU8Bz9VM0rTpcVn8ggUvevAU0mbSzbL9MGlX_Pd2reDNwzAMvM8M8xByDewWGMvvIgAUPGF8qjSFIoETsgCpVMJVBqdkwUCopADBzslFjDs2QYKpBXlf0WbcDz45ePykDQ7bztLO0Vh5bAfvfEV702Og1d7EeJzN4LuWHryhWxwwdDW22I2R1sH0W4pNidb6to6X5MyZfcSr374kbw_3r-unZPPy-LxebZJKQDEkuTSSO6akK6TlxmUZs0XJEUthRWFRcSdTyFVmZVXyVNgcKkQJDl0qSqzEktzMe_vQfYwYB73rxtBOJzXPpzclzxRMKT6nqtDFGNDpPvjGhC8NTB8V6lmhnhTqH4X6CIkZilO4rTH8rf6H-gbOgnV7</recordid><startdate>20220801</startdate><enddate>20220801</enddate><creator>Lv, Yiqin</creator><creator>Xie, Zheng</creator><creator>Zuo, Xiaojing</creator><creator>Song, Yiping</creator><general>Springer International Publishing</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>E3H</scope><scope>F2A</scope><orcidid>https://orcid.org/0000-0003-0391-8725</orcidid></search><sort><creationdate>20220801</creationdate><title>A multi-view method of scientific paper classification via heterogeneous graph embeddings</title><author>Lv, Yiqin ; 
Xie, Zheng ; Zuo, Xiaojing ; Song, Yiping</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-75a52f085f95d2af660d9b2eeb3d39de82f541786d5cb243d71cee51fef43bec3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Classification</topic><topic>Computer Science</topic><topic>Decision trees</topic><topic>Graph theory</topic><topic>Information Storage and Retrieval</topic><topic>Library Science</topic><topic>Multilayer perceptrons</topic><topic>Nodes</topic><topic>Scientific papers</topic><topic>Vector spaces</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lv, Yiqin</creatorcontrib><creatorcontrib>Xie, Zheng</creatorcontrib><creatorcontrib>Zuo, Xiaojing</creatorcontrib><creatorcontrib>Song, Yiping</creatorcontrib><collection>CrossRef</collection><collection>Library & Information Sciences Abstracts (LISA)</collection><collection>Library & Information Science Abstracts (LISA)</collection><jtitle>Scientometrics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lv, Yiqin</au><au>Xie, Zheng</au><au>Zuo, Xiaojing</au><au>Song, Yiping</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A multi-view method of scientific paper classification via heterogeneous graph embeddings</atitle><jtitle>Scientometrics</jtitle><stitle>Scientometrics</stitle><date>2022-08-01</date><risdate>2022</risdate><volume>127</volume><issue>8</issue><spage>4847</spage><epage>4872</epage><pages>4847-4872</pages><issn>0138-9130</issn><eissn>1588-2861</eissn><abstract>The classification task of scientific papers can be implemented based on contents or citations. In order to improve the performance on this task, we express papers as nodes and integrate scientific papers’ contents and citations into a heterogeneous graph. 
It has two types of edges. One type represents the semantic similarity between papers, derived from papers’ titles and abstracts. The other type represents the citation relationship between papers and the journals or proceedings of conferences of their references. We utilize a contrastive learning method to embed the nodes in the heterogeneous graph into a vector space. Then, we feed the paper node vectors into classifiers, such as the decision tree, multilayer perceptron, and so on. We conduct experiments on three datasets of scientific papers: the Microsoft Academic Graph with 63,211 scientific papers in 20 classes, the Proceedings of the National Academy of Sciences with 38,243 scientific papers in 18 classes, and the American Physical Society with 443,845 scientific papers in 5 classes. The experimental results on the multi-class task show that our multi-view method scores the classification accuracy up to 98%, outperforming state-of-the-arts.</abstract><cop>Cham</cop><pub>Springer International Publishing</pub><doi>10.1007/s11192-022-04419-1</doi><tpages>26</tpages><orcidid>https://orcid.org/0000-0003-0391-8725</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0138-9130 |
ispartof | Scientometrics, 2022-08, Vol.127 (8), p.4847-4872 |
issn | 0138-9130 ; 1588-2861 |
language | eng |
recordid | cdi_proquest_journals_2700752681 |
source | SpringerNature Journals |
subjects | Classification ; Computer Science ; Decision trees ; Graph theory ; Information Storage and Retrieval ; Library Science ; Multilayer perceptrons ; Nodes ; Scientific papers ; Vector spaces |
title | A multi-view method of scientific paper classification via heterogeneous graph embeddings |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T02%3A53%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20multi-view%20method%20of%20scientific%20paper%20classification%20via%20heterogeneous%20graph%20embeddings&rft.jtitle=Scientometrics&rft.au=Lv,%20Yiqin&rft.date=2022-08-01&rft.volume=127&rft.issue=8&rft.spage=4847&rft.epage=4872&rft.pages=4847-4872&rft.issn=0138-9130&rft.eissn=1588-2861&rft_id=info:doi/10.1007/s11192-022-04419-1&rft_dat=%3Cproquest_cross%3E2700752681%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2700752681&rft_id=info:pmid/&rfr_iscdi=true |
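
The description field above outlines a graph with paper nodes and two edge types: semantic similarity between papers, and links from a paper to the venues of its references. The following is a minimal sketch of that construction step only, not the authors' implementation. The toy records, the field names `text` and `ref_venues`, the bag-of-words cosine similarity, and the `threshold` value are all assumptions made for illustration; the record does not say how similarity is computed, and the contrastive embedding and classifier stages are omitted.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy records; field names and values are illustrative only.
papers = {
    "p1": {"text": "graph embedding for paper classification",
           "ref_venues": ["Scientometrics", "KDD"]},
    "p2": {"text": "contrastive learning of graph node embedding",
           "ref_venues": ["KDD"]},
    "p3": {"text": "protein folding in yeast cells",
           "ref_venues": ["PNAS"]},
}


def bag_of_words(text):
    """Count lowercase word occurrences (an assumed stand-in for a semantic model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_graph(papers, threshold=0.2):
    """Return the two edge lists of the heterogeneous graph, keyed by edge type."""
    vectors = {pid: bag_of_words(p["text"]) for pid, p in papers.items()}
    # Edge type 1: paper-paper edges where text similarity reaches the threshold.
    semantic = [
        (u, v)
        for u, v in combinations(sorted(papers), 2)
        if cosine(vectors[u], vectors[v]) >= threshold
    ]
    # Edge type 2: paper-venue edges, one per distinct venue among a paper's references.
    citation = sorted(
        (pid, venue) for pid, p in papers.items() for venue in set(p["ref_venues"])
    )
    return {"semantic": semantic, "citation": citation}


graph = build_graph(papers)
```

Keeping the two edge lists separate, rather than merging them into one adjacency structure, leaves each view available on its own to whatever embedding method is applied next.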