A note on measuring overlap
In measuring the overlap between two sets A and B (e.g. libraries, databases) one is obliged to calculate the overlap O(A|B) of A with respect to B (i.e. the fraction of elements of B that are also in A) and of O(B|A) of B with respect to A (i.e. the fraction of elements in A that are also in B). Th...
Published in: | Journal of information science 2007-04, Vol.33 (2), p.189-195 |
---|---|
Main authors: | Egghe, L. ; Goovaerts, M. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 195 |
---|---|
container_issue | 2 |
container_start_page | 189 |
container_title | Journal of information science |
container_volume | 33 |
creator | Egghe, L. ; Goovaerts, M. |
description | In measuring the overlap between two sets A and B (e.g. libraries, databases) one is obliged to calculate the overlap O(A|B) of A with respect to B (i.e. the fraction of elements of B that are also in A) and of O(B|A) of B with respect to A (i.e. the fraction of elements in A that are also in B). Theoretically this requires two samples. In this paper we explain that one sample can suffice to determine confidence intervals for both O(A|B) and O(B|A). The paper closes with the example of measuring the overlap between the secondary sources in mathematics MathSciNet and Zentralblatt MATH and with a remark on the estimation of the Jaccard index. |
doi_str_mv | 10.1177/0165551506075325 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_743610772</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sage_id>10.1177_0165551506075325</sage_id><sourcerecordid>1928815073</sourcerecordid><originalsourceid>FETCH-LOGICAL-c444t-5339a9462fcef7752565648d7fe8bd7e84078b73a3b6564796058b99ebf821e43</originalsourceid><addsrcrecordid>eNp9kM1Lw0AQxRdRsFbvgpeAqKfo7MfsbI6l-AUFL3oOm3RTUtKk7jaC_303tKAU9DSH93vvMY-xSw73nBM9ANeIyBE0EEqBR2zESfFUK4PHbDTI6aCfsrMQlgCAmVQjdjVJ2m7jkq5NVs6G3tftIum-nG_s-pydVLYJ7mJ_x-zj6fF9-pLO3p5fp5NZWiqlNilKmdlMaVGVriJCgRpj65wqZ4o5OaOATEHSymIQKNOApsgyV1RGcKfkmN3tcte---xd2OSrOpSuaWzruj7kpKTmQCQiefsviaQNGIURvD4Al13v2_hFzjNhTNyJZKRgR5W-C8G7Kl_7emX9d84hH1bND1eNlpt9sA2lbSpv27IOPz6jCYSAyKU7LtiF-1X-V-4WPEF_HA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1928815073</pqid></control><display><type>article</type><title>A note on measuring overlap</title><source>SAGE Complete A-Z List</source><creator>Egghe, L. ; Goovaerts, M.</creator><creatorcontrib>Egghe, L. ; Goovaerts, M.</creatorcontrib><description>In measuring the overlap between two sets A and B (e.g. libraries, databases) one is obliged to calculate the overlap O(A|B) of A with respect to B (i.e. the fraction of elements of B that are also in A) and of O(B|A) of B with respect to A (i.e. the fraction of elements in A that are also in B). Theoretically this requires two samples. In this paper we explain that one sample can suffice to determine confidence intervals for both O(A|B) and O(B|A). 
The paper closes with the example of measuring the overlap between the secondary sources in mathematics MathSciNet and Zentralblatt MATH and with a remark on the estimation of the Jaccard index.</description><identifier>ISSN: 0165-5515</identifier><identifier>EISSN: 1741-6485</identifier><identifier>DOI: 10.1177/0165551506075325</identifier><identifier>CODEN: JISCDI</identifier><language>eng</language><publisher>Thousand Oaks, CA: Sage Publications</publisher><subject>Bibliometrics. Scientometrics. Evaluation ; Confidence intervals ; Exact sciences and technology ; Information and communication sciences ; Information science. Documentation ; Informetrics ; Libraries ; Library and information science. General aspects ; Mathematics ; Overlap ; Sciences and techniques of general use</subject><ispartof>Journal of information science, 2007-04, Vol.33 (2), p.189-195</ispartof><rights>2007 INIST-CNRS</rights><rights>Copyright Bowker-Saur Ltd. Apr 2007</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c444t-5339a9462fcef7752565648d7fe8bd7e84078b73a3b6564796058b99ebf821e43</citedby><cites>FETCH-LOGICAL-c444t-5339a9462fcef7752565648d7fe8bd7e84078b73a3b6564796058b99ebf821e43</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://journals.sagepub.com/doi/pdf/10.1177/0165551506075325$$EPDF$$P50$$Gsage$$H</linktopdf><linktohtml>$$Uhttps://journals.sagepub.com/doi/10.1177/0165551506075325$$EHTML$$P50$$Gsage$$H</linktohtml><link.rule.ids>314,776,780,21798,27901,27902,43597,43598</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=18670220$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Egghe, L.</creatorcontrib><creatorcontrib>Goovaerts, M.</creatorcontrib><title>A 
note on measuring overlap</title><title>Journal of information science</title><description>In measuring the overlap between two sets A and B (e.g. libraries, databases) one is obliged to calculate the overlap O(A|B) of A with respect to B (i.e. the fraction of elements of B that are also in A) and of O(B|A) of B with respect to A (i.e. the fraction of elements in A that are also in B). Theoretically this requires two samples. In this paper we explain that one sample can suffice to determine confidence intervals for both O(A|B) and O(B|A). The paper closes with the example of measuring the overlap between the secondary sources in mathematics MathSciNet and Zentralblatt MATH and with a remark on the estimation of the Jaccard index.</description><subject>Bibliometrics. Scientometrics. Evaluation</subject><subject>Confidence intervals</subject><subject>Exact sciences and technology</subject><subject>Information and communication sciences</subject><subject>Information science. Documentation</subject><subject>Informetrics</subject><subject>Libraries</subject><subject>Library and information science. 
General aspects</subject><subject>Mathematics</subject><subject>Overlap</subject><subject>Sciences and techniques of general use</subject><issn>0165-5515</issn><issn>1741-6485</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2007</creationdate><recordtype>article</recordtype><recordid>eNp9kM1Lw0AQxRdRsFbvgpeAqKfo7MfsbI6l-AUFL3oOm3RTUtKk7jaC_303tKAU9DSH93vvMY-xSw73nBM9ANeIyBE0EEqBR2zESfFUK4PHbDTI6aCfsrMQlgCAmVQjdjVJ2m7jkq5NVs6G3tftIum-nG_s-pydVLYJ7mJ_x-zj6fF9-pLO3p5fp5NZWiqlNilKmdlMaVGVriJCgRpj65wqZ4o5OaOATEHSymIQKNOApsgyV1RGcKfkmN3tcte---xd2OSrOpSuaWzruj7kpKTmQCQiefsviaQNGIURvD4Al13v2_hFzjNhTNyJZKRgR5W-C8G7Kl_7emX9d84hH1bND1eNlpt9sA2lbSpv27IOPz6jCYSAyKU7LtiF-1X-V-4WPEF_HA</recordid><startdate>200704</startdate><enddate>200704</enddate><creator>Egghe, L.</creator><creator>Goovaerts, M.</creator><general>Sage Publications</general><general>Bowker-Saur</general><general>Bowker-Saur Ltd</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>E3H</scope><scope>F2A</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>200704</creationdate><title>A note on measuring overlap</title><author>Egghe, L. ; Goovaerts, M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c444t-5339a9462fcef7752565648d7fe8bd7e84078b73a3b6564796058b99ebf821e43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2007</creationdate><topic>Bibliometrics. Scientometrics. Evaluation</topic><topic>Confidence intervals</topic><topic>Exact sciences and technology</topic><topic>Information and communication sciences</topic><topic>Information science. Documentation</topic><topic>Informetrics</topic><topic>Libraries</topic><topic>Library and information science. 
General aspects</topic><topic>Mathematics</topic><topic>Overlap</topic><topic>Sciences and techniques of general use</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Egghe, L.</creatorcontrib><creatorcontrib>Goovaerts, M.</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Library & Information Sciences Abstracts (LISA)</collection><collection>Library & Information Science Abstracts (LISA)</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Journal of information science</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Egghe, L.</au><au>Goovaerts, M.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A note on measuring overlap</atitle><jtitle>Journal of information science</jtitle><date>2007-04</date><risdate>2007</risdate><volume>33</volume><issue>2</issue><spage>189</spage><epage>195</epage><pages>189-195</pages><issn>0165-5515</issn><eissn>1741-6485</eissn><coden>JISCDI</coden><abstract>In measuring the overlap between two sets A and B (e.g. libraries, databases) one is obliged to calculate the overlap O(A|B) of A with respect to B (i.e. the fraction of elements of B that are also in A) and of O(B|A) of B with respect to A (i.e. the fraction of elements in A that are also in B). Theoretically this requires two samples. In this paper we explain that one sample can suffice to determine confidence intervals for both O(A|B) and O(B|A). 
The paper closes with the example of measuring the overlap between the secondary sources in mathematics MathSciNet and Zentralblatt MATH and with a remark on the estimation of the Jaccard index.</abstract><cop>Thousand Oaks, CA</cop><pub>Sage Publications</pub><doi>10.1177/0165551506075325</doi><tpages>7</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0165-5515 |
ispartof | Journal of information science, 2007-04, Vol.33 (2), p.189-195 |
issn | 0165-5515 1741-6485 |
language | eng |
recordid | cdi_proquest_miscellaneous_743610772 |
source | SAGE Complete A-Z List |
subjects | Bibliometrics. Scientometrics. Evaluation ; Confidence intervals ; Exact sciences and technology ; Information and communication sciences ; Information science. Documentation ; Informetrics ; Libraries ; Library and information science. General aspects ; Mathematics ; Overlap ; Sciences and techniques of general use |
title | A note on measuring overlap |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T18%3A08%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20note%20on%20measuring%20overlap&rft.jtitle=Journal%20of%20information%20science&rft.au=Egghe,%20L.&rft.date=2007-04&rft.volume=33&rft.issue=2&rft.spage=189&rft.epage=195&rft.pages=189-195&rft.issn=0165-5515&rft.eissn=1741-6485&rft.coden=JISCDI&rft_id=info:doi/10.1177/0165551506075325&rft_dat=%3Cproquest_cross%3E1928815073%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1928815073&rft_id=info:pmid/&rft_sage_id=10.1177_0165551506075325&rfr_iscdi=true |
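
The two overlap measures defined in the abstract, and the Jaccard index it mentions, can be illustrated with a minimal Python sketch. This is not the authors' sampling or confidence-interval procedure, which the record does not describe; it only computes the definitions exactly on two small, made-up sets. The function names `overlap` and `jaccard` and the item labels are hypothetical, and the Jaccard index is taken in its standard form |A ∩ B| / |A ∪ B|.

```python
from fractions import Fraction


def overlap(a: set, b: set) -> Fraction:
    """O(A|B): the fraction of elements of B that are also in A."""
    if not b:
        raise ValueError("B must be non-empty")
    return Fraction(len(a & b), len(b))


def jaccard(a: set, b: set) -> Fraction:
    """Standard Jaccard index: |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        raise ValueError("A and B must not both be empty")
    return Fraction(len(a & b), len(union))


# Made-up example data: item identifiers held by two hypothetical databases.
A = {"p1", "p2", "p3", "p4"}
B = {"p3", "p4", "p5", "p6", "p7", "p8"}

o_ab = overlap(A, B)  # share of B's items that A also holds
o_ba = overlap(B, A)  # share of A's items that B also holds
j = jaccard(A, B)

# When the intersection is non-empty, the Jaccard index follows from the
# two overlaps alone: 1/J = 1/O(A|B) + 1/O(B|A) - 1.
j_from_overlaps = 1 / (1 / o_ab + 1 / o_ba - 1)

print(o_ab, o_ba, j, j_from_overlaps)
```

The last identity shows why estimates of both O(A|B) and O(B|A) are enough to estimate the Jaccard index, assuming the standard definition used here.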