Improving citation mining

In recent years, the number of citations a paper receives has increasingly (perhaps too much so) been seen as an important indicator of the quality of a paper, of researchers, of journals, etc. Based on the number of citations a scholar has received over his lifetime or over the...

Bibliographic details
Main authors: Afzal, M.T., Balke, W.-T., Maurer, H., Kulathuramaiyer, N.
Format: Conference proceedings
Language: eng
Subjects:
Online access: Order full text
container_end_page 121
container_issue
container_start_page 116
container_title
container_volume
creator Afzal, M.T.
Balke, W.-T.
Maurer, H.
Kulathuramaiyer, N.
description In recent years, the number of citations a paper receives has increasingly (perhaps too much so) been seen as an important indicator of the quality of a paper, of researchers, of journals, etc. Various measures have been introduced based on the number of citations a scholar has received over a lifetime or over the last few years. The number of citations (often excluding self-citations or citations from "minor" sources, however these may be defined), or some measurement based on it (such as the h- or the g-factor), is used to evaluate scholars; the citation index of a journal (again with a variety of parameters) is seen as measuring the impact of the journal, and hence the importance assigned to publications there. The number of measurements based on citation numbers is steadily increasing, and their definition has become a science in itself. However, they all rest on finding all relevant citations. Thus, "citation mining tools" used for the ISI Web of Knowledge, the CiteSeer citation index, Google Scholar, or software such as "publishorperish.com", which is based on Google Scholar, are the critical starting points for all measurement efforts. In this paper we show that current citation mining techniques do not discover all relevant citations. We propose a technique that increases accuracy substantially and show numeric evaluations for one typical journal. In the absence of very reliable citation mining tools, all current measurements based on citation counting should be taken with a grain of salt.
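The abstract's point that every citation-based measure depends on finding all relevant citations can be illustrated with the two measures it names. The following is a minimal sketch, not the authors' method: it uses the commonly cited definitions of the h-index (largest h such that h papers have at least h citations each) and the g-index (largest g such that the g most-cited papers together have at least g squared citations). The function names and the citation counts are hypothetical and chosen only for illustration.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the g most-cited papers total at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical per-paper citation counts for one scholar (made-up numbers).
found_by_tool = [10, 8, 5, 4, 3]   # what an incomplete mining tool reports
actually_cited = [10, 8, 5, 5, 5]  # the same papers with missed citations added

print(h_index(found_by_tool), g_index(found_by_tool))
print(h_index(actually_cited), g_index(actually_cited))
```

In this made-up case, three missed citations are enough to change the h-index, which is the kind of sensitivity the abstract warns about.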
doi_str_mv 10.1109/NDT.2009.5272186
format Conference Proceeding
isbn 1424446147
9781424446148
eisbn 1424446155
9781424446155
lccn 2009904260
publisher IEEE
date 2009-07
tpages 6
oa free_for_read
linktohtml https://ieeexplore.ieee.org/document/5272186
fulltext fulltext_linktorsrc
identifier ISSN: 2155-8728
ispartof 2009 First International Conference on Networked Digital Technologies, 2009, p.116-121
issn 2155-8728
language eng
recordid cdi_ieee_primary_5272186
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Current measurement
Data mining
Indexing
Information systems
Intersymbol interference
Paper technology
Software engineering
Software measurement
Software tools
title Improving citation mining
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T00%3A54%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Improving%20citation%20mining&rft.btitle=2009%20First%20International%20Conference%20on%20Networked%20Digital%20Technologies&rft.au=Afzal,%20M.T.&rft.date=2009-07&rft.spage=116&rft.epage=121&rft.pages=116-121&rft.issn=2155-8728&rft.isbn=1424446147&rft.isbn_list=9781424446148&rft_id=info:doi/10.1109/NDT.2009.5272186&rft_dat=%3Cieee_6IE%3E5272186%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1424446155&rft.eisbn_list=9781424446155&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5272186&rfr_iscdi=true