GlOSS: text-source discovery over the Internet

The dramatic growth of the Internet has created a new problem for users: location of the relevant sources of documents. This article presents a framework for (and experimentally analyzes a solution to) this problem, which we call the text-source discovery problem. Our approach consists of two phases...

Full description

Bibliographic details
Published in: ACM transactions on database systems 1999-06, Vol.24 (2), p.229-264
Main authors: Gravano, Luis; García-Molina, Héctor; Tomasic, Anthony
Format: Article
Language: English
Online access: Full text
container_end_page 264
container_issue 2
container_start_page 229
container_title ACM transactions on database systems
container_volume 24
creator Gravano, Luis
García-Molina, Héctor
Tomasic, Anthony
description The dramatic growth of the Internet has created a new problem for users: location of the relevant sources of documents. This article presents a framework for (and experimentally analyzes a solution to) this problem, which we call the text-source discovery problem. Our approach consists of two phases. First, each text source exports its contents to a centralized service. Second, users present queries to the service, which returns an ordered list of promising text sources. This article describes GlOSS, Glossary of Servers Server, with two versions: bGlOSS, which provides a Boolean query retrieval model, and vGlOSS, which provides a vector-space retrieval model. We also present hGlOSS, which provides a decentralized version of the system. We extensively describe the methodology for measuring the retrieval effectiveness of these systems and provide experimental evidence, based on actual data, that all three systems are highly effective in determining promising text sources for a given query.
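The two phases in the description can be illustrated with a minimal Python sketch in the spirit of the Boolean variant (bGlOSS). The abstract does not spell out the estimator, so everything here is an assumption made for illustration: each source is assumed to export only its document count and per-term document frequencies, query terms are assumed to occur independently, and the names `SourceSummary`, `estimate_matches`, and `rank_sources` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SourceSummary:
    """Phase one: what a text source exports to the central service (assumed shape)."""
    name: str
    num_docs: int
    # term -> number of documents in the source that contain the term
    doc_freq: dict[str, int] = field(default_factory=dict)


def estimate_matches(summary: SourceSummary, query_terms: list[str]) -> float:
    """Estimate how many documents contain ALL query terms.

    Assumes terms occur independently of each other, so the estimate is
    num_docs * product(doc_freq[t] / num_docs) over the query terms.
    """
    if summary.num_docs == 0:
        return 0.0
    estimate = float(summary.num_docs)
    for term in query_terms:
        estimate *= summary.doc_freq.get(term, 0) / summary.num_docs
    return estimate


def rank_sources(summaries: list[SourceSummary], query_terms: list[str]) -> list[str]:
    """Phase two: return source names ordered by estimated matches, best first.

    Sources with an estimate of zero are dropped; ties are broken by name.
    """
    scored = [(estimate_matches(s, query_terms), s.name) for s in summaries]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored if score > 0]


if __name__ == "__main__":
    # Made-up summaries, purely for demonstration.
    summaries = [
        SourceSummary("A", 100, {"database": 50, "retrieval": 25}),
        SourceSummary("B", 1000, {"database": 500, "retrieval": 250}),
        SourceSummary("C", 500, {"database": 300}),
    ]
    print(rank_sources(summaries, ["database", "retrieval"]))
```

The point of the sketch is that the central service never sees the documents themselves, only compact per-source statistics, which is what makes ranking many sources cheap. A vector-space variant would replace the product of frequencies with a similarity-based score.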
doi_str_mv 10.1145/320248.320252
format Article
fullrecord <record><control><sourceid>gale_cross</sourceid><recordid>TN_cdi_gale_infotracmisc_A58224701</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A58224701</galeid><sourcerecordid>A58224701</sourcerecordid><originalsourceid>FETCH-LOGICAL-a243t-30db56611c31f32651782eac3ca428bdcf9654e31d82f51df761b6c435898fcc3</originalsourceid><addsrcrecordid>eNp90E1LAzEQBuAgFlyrRw9ePZuaSTJJ9liKVqHQQ_UcstmkrOyHJL34792yRTwUmcPAzMMwvITcAVsASHwSnHFpFseG_IIUgKipVFJekoIJxSmWgFfkOudPxpg0pS7IbN1ud7sbMouuzeH21Ofk4-X5ffVKN9v122q5oY5LcaCC1RUqBeAFRMEVgjY8OC-8k9xUtY-lQhkE1IZHhDpqBZXyUqApTfRezMnDdHfv2mCbPg6H5HzXZG-XaDiXmsGIHs-gfehDcu3Qh9iM47-cnuFj1aFr_D_epyHnFKL9Sk3n0rcFZo852ilHO-U4-vvJj4_-0tPuByDsZ9g</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>GlOSS: text-source discovery over the Internet</title><source>ACM Digital Library Complete</source><creator>Gravano, Luis ; García-Molina, Héctor ; Tomasic, Anthony</creator><creatorcontrib>Gravano, Luis ; García-Molina, Héctor ; Tomasic, Anthony</creatorcontrib><description>The dramatic growth of the Internet has created a new problem for users: location of the relevant sources of documents. This article presents a framework for (and experimentally analyzes a solution to) this problem, which we call the text-source discovery problem. Our approach consists of two phases. First, each text source exports its contents to a centralized service. Second, users present queries to the service, which returns an ordered list of promising text sources. This article describes GlOSS, Glossary of Servers Server, with two versions: bGlOSS, which provides a Boolean query retrieval model, and vGlOSS, which provides a vector-space retrieval model. We also present hGlOSS, which provides a decentralized version of the system. 
We extensively describe the methodology for measuring the retrieval effectiveness of these systems and provide experimental evidence, based on actual data, that all three systems are highly effective in determining promising text sources for a given query.</description><identifier>ISSN: 0362-5915</identifier><identifier>EISSN: 1557-4644</identifier><identifier>DOI: 10.1145/320248.320252</identifier><language>eng</language><publisher>New York, NY, USA: ACM</publisher><subject>Applied computing ; Computers in other domains ; Data management systems ; Database industry ; Database management system engines ; Digital libraries and archives ; Evaluation ; Information management ; Information retrieval ; Information storage systems ; Information systems ; Information systems applications ; Internet/Web search services ; Services ; Work measurement</subject><ispartof>ACM transactions on database systems, 1999-06, Vol.24 (2), p.229-264</ispartof><rights>ACM</rights><rights>COPYRIGHT 1999 Association for Computing Machinery, Inc.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a243t-30db56611c31f32651782eac3ca428bdcf9654e31d82f51df761b6c435898fcc3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://dl.acm.org/doi/pdf/10.1145/320248.320252$$EPDF$$P50$$Gacm$$H</linktopdf><link.rule.ids>314,780,784,2282,27924,27925,40196,76228</link.rule.ids></links><search><creatorcontrib>Gravano, Luis</creatorcontrib><creatorcontrib>García-Molina, Héctor</creatorcontrib><creatorcontrib>Tomasic, Anthony</creatorcontrib><title>GlOSS: text-source discovery over the Internet</title><title>ACM transactions on database systems</title><addtitle>ACM TODS</addtitle><description>The dramatic growth of the Internet has created a new problem for users: location of the relevant sources of documents. 
This article presents a framework for (and experimentally analyzes a solution to) this problem, which we call the text-source discovery problem. Our approach consists of two phases. First, each text source exports its contents to a centralized service. Second, users present queries to the service, which returns an ordered list of promising text sources. This article describes GlOSS, Glossary of Servers Server, with two versions: bGlOSS, which provides a Boolean query retrieval model, and vGlOSS, which provides a vector-space retrieval model. We also present hGlOSS, which provides a decentralized version of the system. We extensively describe the methodology for measuring the retrieval effectiveness of these systems and provide experimental evidence, based on actual data, that all three systems are highly effective in determining promising text sources for a given query.</description><subject>Applied computing</subject><subject>Computers in other domains</subject><subject>Data management systems</subject><subject>Database industry</subject><subject>Database management system engines</subject><subject>Digital libraries and archives</subject><subject>Evaluation</subject><subject>Information management</subject><subject>Information retrieval</subject><subject>Information storage systems</subject><subject>Information systems</subject><subject>Information systems applications</subject><subject>Internet/Web search services</subject><subject>Services</subject><subject>Work 
measurement</subject><issn>0362-5915</issn><issn>1557-4644</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>1999</creationdate><recordtype>article</recordtype><recordid>eNp90E1LAzEQBuAgFlyrRw9ePZuaSTJJ9liKVqHQQ_UcstmkrOyHJL34792yRTwUmcPAzMMwvITcAVsASHwSnHFpFseG_IIUgKipVFJekoIJxSmWgFfkOudPxpg0pS7IbN1ud7sbMouuzeH21Ofk4-X5ffVKN9v122q5oY5LcaCC1RUqBeAFRMEVgjY8OC-8k9xUtY-lQhkE1IZHhDpqBZXyUqApTfRezMnDdHfv2mCbPg6H5HzXZG-XaDiXmsGIHs-gfehDcu3Qh9iM47-cnuFj1aFr_D_epyHnFKL9Sk3n0rcFZo852ilHO-U4-vvJj4_-0tPuByDsZ9g</recordid><startdate>19990601</startdate><enddate>19990601</enddate><creator>Gravano, Luis</creator><creator>García-Molina, Héctor</creator><creator>Tomasic, Anthony</creator><general>ACM</general><general>Association for Computing Machinery, Inc</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>19990601</creationdate><title>GlOSS</title><author>Gravano, Luis ; García-Molina, Héctor ; Tomasic, Anthony</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a243t-30db56611c31f32651782eac3ca428bdcf9654e31d82f51df761b6c435898fcc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>1999</creationdate><topic>Applied computing</topic><topic>Computers in other domains</topic><topic>Data management systems</topic><topic>Database industry</topic><topic>Database management system engines</topic><topic>Digital libraries and archives</topic><topic>Evaluation</topic><topic>Information management</topic><topic>Information retrieval</topic><topic>Information storage systems</topic><topic>Information systems</topic><topic>Information systems applications</topic><topic>Internet/Web search services</topic><topic>Services</topic><topic>Work measurement</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Gravano, Luis</creatorcontrib><creatorcontrib>García-Molina, Héctor</creatorcontrib><creatorcontrib>Tomasic, 
Anthony</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on database systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Gravano, Luis</au><au>García-Molina, Héctor</au><au>Tomasic, Anthony</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>GlOSS: text-source discovery over the Internet</atitle><jtitle>ACM transactions on database systems</jtitle><stitle>ACM TODS</stitle><date>1999-06-01</date><risdate>1999</risdate><volume>24</volume><issue>2</issue><spage>229</spage><epage>264</epage><pages>229-264</pages><issn>0362-5915</issn><eissn>1557-4644</eissn><abstract>The dramatic growth of the Internet has created a new problem for users: location of the relevant sources of documents. This article presents a framework for (and experimentally analyzes a solution to) this problem, which we call the text-source discovery problem. Our approach consists of two phases. First, each text source exports its contents to a centralized service. Second, users present queries to the service, which returns an ordered list of promising text sources. This article describes GlOSS, Glossary of Servers Server, with two versions: bGlOSS, which provides a Boolean query retrieval model, and vGlOSS, which provides a vector-space retrieval model. We also present hGlOSS, which provides a decentralized version of the system. We extensively describe the methodology for measuring the retrieval effectiveness of these systems and provide experimental evidence, based on actual data, that all three systems are highly effective in determining promising text sources for a given query.</abstract><cop>New York, NY, USA</cop><pub>ACM</pub><doi>10.1145/320248.320252</doi><tpages>36</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0362-5915
ispartof ACM transactions on database systems, 1999-06, Vol.24 (2), p.229-264
issn 0362-5915
1557-4644
language eng
recordid cdi_gale_infotracmisc_A58224701
source ACM Digital Library Complete
subjects Applied computing
Computers in other domains
Data management systems
Database industry
Database management system engines
Digital libraries and archives
Evaluation
Information management
Information retrieval
Information storage systems
Information systems
Information systems applications
Internet/Web search services
Services
Work measurement
title GlOSS: text-source discovery over the Internet
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T14%3A26%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=GlOSS:%20text-source%20discovery%20over%20the%20Internet&rft.jtitle=ACM%20transactions%20on%20database%20systems&rft.au=Gravano,%20Luis&rft.date=1999-06-01&rft.volume=24&rft.issue=2&rft.spage=229&rft.epage=264&rft.pages=229-264&rft.issn=0362-5915&rft.eissn=1557-4644&rft_id=info:doi/10.1145/320248.320252&rft_dat=%3Cgale_cross%3EA58224701%3C/gale_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_galeid=A58224701&rfr_iscdi=true