A note on measuring overlap

In measuring the overlap between two sets A and B (e.g. libraries, databases) one is obliged to calculate the overlap O(A|B) of A with respect to B (i.e. the fraction of elements of B that are also in A) and of O(B|A) of B with respect to A (i.e. the fraction of elements in A that are also in B). Th...
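
The two overlap measures defined in the abstract can be illustrated with a minimal sketch. It assumes both sets are fully known and small enough to hold in memory, which is not the situation the paper addresses: the paper estimates these quantities from a sample, and its confidence-interval method is not reproduced here. The function names and the example identifiers are hypothetical; the Jaccard index is included only because the abstract mentions it.

```python
def overlap(a: set, b: set) -> float:
    """O(A|B): the fraction of elements of B that are also in A."""
    if not b:
        raise ValueError("B must be non-empty")
    return len(a & b) / len(b)


def jaccard(a: set, b: set) -> float:
    """Jaccard index: |A intersect B| / |A union B|."""
    union = a | b
    if not union:
        raise ValueError("A and B must not both be empty")
    return len(a & b) / len(union)


# Hypothetical record identifiers, for illustration only.
A = {"p1", "p2", "p3", "p4"}
B = {"p3", "p4", "p5", "p6", "p7", "p8"}

o_ab = overlap(A, B)  # share of B that is also in A
o_ba = overlap(B, A)  # share of A that is also in B
j = jaccard(A, B)

print(f"O(A|B) = {o_ab:.3f}, O(B|A) = {o_ba:.3f}, Jaccard = {j:.3f}")
```

The two overlaps generally differ because they share a numerator but are normalised by different set sizes, which is why the abstract treats them as separate quantities.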

Bibliographic details
Published in: Journal of information science 2007-04, Vol.33 (2), p.189-195
Main authors: Egghe, L., Goovaerts, M.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 195
container_issue 2
container_start_page 189
container_title Journal of information science
container_volume 33
creator Egghe, L.
Goovaerts, M.
description In measuring the overlap between two sets A and B (e.g. libraries, databases) one is obliged to calculate the overlap O(A|B) of A with respect to B (i.e. the fraction of elements of B that are also in A) and of O(B|A) of B with respect to A (i.e. the fraction of elements in A that are also in B). Theoretically this requires two samples. In this paper we explain that one sample can suffice to determine confidence intervals for both O(A|B) and O(B|A). The paper closes with the example of measuring the overlap between the secondary sources in mathematics MathSciNet and Zentralblatt MATH and with a remark on the estimation of the Jaccard index.
doi_str_mv 10.1177/0165551506075325
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_743610772</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sage_id>10.1177_0165551506075325</sage_id><sourcerecordid>1928815073</sourcerecordid><originalsourceid>FETCH-LOGICAL-c444t-5339a9462fcef7752565648d7fe8bd7e84078b73a3b6564796058b99ebf821e43</originalsourceid><addsrcrecordid>eNp9kM1Lw0AQxRdRsFbvgpeAqKfo7MfsbI6l-AUFL3oOm3RTUtKk7jaC_303tKAU9DSH93vvMY-xSw73nBM9ANeIyBE0EEqBR2zESfFUK4PHbDTI6aCfsrMQlgCAmVQjdjVJ2m7jkq5NVs6G3tftIum-nG_s-pydVLYJ7mJ_x-zj6fF9-pLO3p5fp5NZWiqlNilKmdlMaVGVriJCgRpj65wqZ4o5OaOATEHSymIQKNOApsgyV1RGcKfkmN3tcte---xd2OSrOpSuaWzruj7kpKTmQCQiefsviaQNGIURvD4Al13v2_hFzjNhTNyJZKRgR5W-C8G7Kl_7emX9d84hH1bND1eNlpt9sA2lbSpv27IOPz6jCYSAyKU7LtiF-1X-V-4WPEF_HA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1928815073</pqid></control><display><type>article</type><title>A note on measuring overlap</title><source>SAGE Complete A-Z List</source><creator>Egghe, L. ; Goovaerts, M.</creator><creatorcontrib>Egghe, L. ; Goovaerts, M.</creatorcontrib><description>In measuring the overlap between two sets A and B (e.g. libraries, databases) one is obliged to calculate the overlap O(A|B) of A with respect to B (i.e. the fraction of elements of B that are also in A) and of O(B|A) of B with respect to A (i.e. the fraction of elements in A that are also in B). Theoretically this requires two samples. In this paper we explain that one sample can suffice to determine confidence intervals for both O(A|B) and O(B|A). 
The paper closes with the example of measuring the overlap between the secondary sources in mathematics MathSciNet and Zentralblatt MATH and with a remark on the estimation of the Jaccard index.</description><identifier>ISSN: 0165-5515</identifier><identifier>EISSN: 1741-6485</identifier><identifier>DOI: 10.1177/0165551506075325</identifier><identifier>CODEN: JISCDI</identifier><language>eng</language><publisher>Thousand Oaks, CA: Sage Publications</publisher><subject>Bibliometrics. Scientometrics. Evaluation ; Confidence intervals ; Exact sciences and technology ; Information and communication sciences ; Information science. Documentation ; Informetrics ; Libraries ; Library and information science. General aspects ; Mathematics ; Overlap ; Sciences and techniques of general use</subject><ispartof>Journal of information science, 2007-04, Vol.33 (2), p.189-195</ispartof><rights>2007 INIST-CNRS</rights><rights>Copyright Bowker-Saur Ltd. Apr 2007</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c444t-5339a9462fcef7752565648d7fe8bd7e84078b73a3b6564796058b99ebf821e43</citedby><cites>FETCH-LOGICAL-c444t-5339a9462fcef7752565648d7fe8bd7e84078b73a3b6564796058b99ebf821e43</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://journals.sagepub.com/doi/pdf/10.1177/0165551506075325$$EPDF$$P50$$Gsage$$H</linktopdf><linktohtml>$$Uhttps://journals.sagepub.com/doi/10.1177/0165551506075325$$EHTML$$P50$$Gsage$$H</linktohtml><link.rule.ids>314,776,780,21798,27901,27902,43597,43598</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=18670220$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Egghe, L.</creatorcontrib><creatorcontrib>Goovaerts, 
M.</creatorcontrib><title>A note on measuring overlap</title><title>Journal of information science</title><description>In measuring the overlap between two sets A and B (e.g. libraries, databases) one is obliged to calculate the overlap O(A|B) of A with respect to B (i.e. the fraction of elements of B that are also in A) and of O(B|A) of B with respect to A (i.e. the fraction of elements in A that are also in B). Theoretically this requires two samples. In this paper we explain that one sample can suffice to determine confidence intervals for both O(A|B) and O(B|A). The paper closes with the example of measuring the overlap between the secondary sources in mathematics MathSciNet and Zentralblatt MATH and with a remark on the estimation of the Jaccard index.</description><subject>Bibliometrics. Scientometrics. Evaluation</subject><subject>Confidence intervals</subject><subject>Exact sciences and technology</subject><subject>Information and communication sciences</subject><subject>Information science. Documentation</subject><subject>Informetrics</subject><subject>Libraries</subject><subject>Library and information science. 
General aspects</subject><subject>Mathematics</subject><subject>Overlap</subject><subject>Sciences and techniques of general use</subject><issn>0165-5515</issn><issn>1741-6485</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2007</creationdate><recordtype>article</recordtype><recordid>eNp9kM1Lw0AQxRdRsFbvgpeAqKfo7MfsbI6l-AUFL3oOm3RTUtKk7jaC_303tKAU9DSH93vvMY-xSw73nBM9ANeIyBE0EEqBR2zESfFUK4PHbDTI6aCfsrMQlgCAmVQjdjVJ2m7jkq5NVs6G3tftIum-nG_s-pydVLYJ7mJ_x-zj6fF9-pLO3p5fp5NZWiqlNilKmdlMaVGVriJCgRpj65wqZ4o5OaOATEHSymIQKNOApsgyV1RGcKfkmN3tcte---xd2OSrOpSuaWzruj7kpKTmQCQiefsviaQNGIURvD4Al13v2_hFzjNhTNyJZKRgR5W-C8G7Kl_7emX9d84hH1bND1eNlpt9sA2lbSpv27IOPz6jCYSAyKU7LtiF-1X-V-4WPEF_HA</recordid><startdate>200704</startdate><enddate>200704</enddate><creator>Egghe, L.</creator><creator>Goovaerts, M.</creator><general>Sage Publications</general><general>Bowker-Saur</general><general>Bowker-Saur Ltd</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>E3H</scope><scope>F2A</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>200704</creationdate><title>A note on measuring overlap</title><author>Egghe, L. ; Goovaerts, M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c444t-5339a9462fcef7752565648d7fe8bd7e84078b73a3b6564796058b99ebf821e43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2007</creationdate><topic>Bibliometrics. Scientometrics. Evaluation</topic><topic>Confidence intervals</topic><topic>Exact sciences and technology</topic><topic>Information and communication sciences</topic><topic>Information science. Documentation</topic><topic>Informetrics</topic><topic>Libraries</topic><topic>Library and information science. 
General aspects</topic><topic>Mathematics</topic><topic>Overlap</topic><topic>Sciences and techniques of general use</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Egghe, L.</creatorcontrib><creatorcontrib>Goovaerts, M.</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; Information Science Abstracts (LISA)</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Journal of information science</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Egghe, L.</au><au>Goovaerts, M.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A note on measuring overlap</atitle><jtitle>Journal of information science</jtitle><date>2007-04</date><risdate>2007</risdate><volume>33</volume><issue>2</issue><spage>189</spage><epage>195</epage><pages>189-195</pages><issn>0165-5515</issn><eissn>1741-6485</eissn><coden>JISCDI</coden><abstract>In measuring the overlap between two sets A and B (e.g. libraries, databases) one is obliged to calculate the overlap O(A|B) of A with respect to B (i.e. the fraction of elements of B that are also in A) and of O(B|A) of B with respect to A (i.e. the fraction of elements in A that are also in B). Theoretically this requires two samples. In this paper we explain that one sample can suffice to determine confidence intervals for both O(A|B) and O(B|A). 
The paper closes with the example of measuring the overlap between the secondary sources in mathematics MathSciNet and Zentralblatt MATH and with a remark on the estimation of the Jaccard index.</abstract><cop>Thousand Oaks, CA</cop><pub>Sage Publications</pub><doi>10.1177/0165551506075325</doi><tpages>7</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0165-5515
ispartof Journal of information science, 2007-04, Vol.33 (2), p.189-195
issn 0165-5515
1741-6485
language eng
recordid cdi_proquest_miscellaneous_743610772
source SAGE Complete A-Z List
subjects Bibliometrics. Scientometrics. Evaluation
Confidence intervals
Exact sciences and technology
Information and communication sciences
Information science. Documentation
Informetrics
Libraries
Library and information science. General aspects
Mathematics
Overlap
Sciences and techniques of general use
title A note on measuring overlap
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T18%3A08%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20note%20on%20measuring%20overlap&rft.jtitle=Journal%20of%20information%20science&rft.au=Egghe,%20L.&rft.date=2007-04&rft.volume=33&rft.issue=2&rft.spage=189&rft.epage=195&rft.pages=189-195&rft.issn=0165-5515&rft.eissn=1741-6485&rft.coden=JISCDI&rft_id=info:doi/10.1177/0165551506075325&rft_dat=%3Cproquest_cross%3E1928815073%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1928815073&rft_id=info:pmid/&rft_sage_id=10.1177_0165551506075325&rfr_iscdi=true