Finding the most similar documents across multiple text databases

We present a methodology for finding the n most similar documents across multiple text databases for any given query and for any positive integer n. This methodology consists of two steps. First, databases are ranked in a certain order. Next, documents are retrieved from the databases according to t...

Bibliographic details
Main authors:	Clement Yu, King-Lup Liu, Wensheng Wu, Weiyi Meng, Rishe, N.
Format:	Conference proceeding
Language:	English
Subjects:	Australia; Computer networks; Database systems; Indexing; Information retrieval; Information systems; Internet; ISDN; Machine learning; Transaction databases

container_end_page	162
container_issue
container_start_page	150
container_title
container_volume
creator	Clement Yu; King-Lup Liu; Wensheng Wu; Weiyi Meng; Rishe, N.
description	We present a methodology for finding the n most similar documents across multiple text databases for any given query and for any positive integer n. This methodology consists of two steps. First, databases are ranked in a certain order. Next, documents are retrieved from the databases according to the order and in a particular way. If the databases containing the n most similar documents for a given query can be ranked ahead of other databases, the methodology will guarantee the retrieval of the n most similar documents for the query. A statistical method is provided to identify databases, each of which is estimated to contain at least one of the n most similar documents. Then, a number of strategies are presented to retrieve documents from the identified databases. Experimental results are given to illustrate the relative performance of different strategies.
doi_str_mv	10.1109/ADL.1999.777710
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_777710</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>777710</ieee_id><sourcerecordid>777710</sourcerecordid><originalsourceid>FETCH-LOGICAL-i104t-64ad574fd23358a0b86fddc94ad7c0a960ee0ba0b1b0e8346e6bac3c7bd34ac53</originalsourceid><addsrcrecordid>eNotT8lOwzAUtFgkSukZiZN_IOE5Trwco0IBKRIXOFdeXsAoSxW7Evw9ltq5zGg08_SGkHsGJWOgH9unrmRa61JmMLggq4pLVWRZX5KNlgqk0A1UOXNFVrlRFVo3-obcxvgDUAFXakXaXZh8mL5o-kY6zjHRGMYwmIX62R1HnFKkxi1zjHQ8DikcBqQJfxP1JhlrIsY7ct2bIeLmzGvyuXv-2L4W3fvL27btipA_SoWojW9k3fuK80YZsEr03judbenAaAGIYLPPLKDitUBhjeNOWs9r4xq-Jg-nuwER94cljGb525_G838IrUyk</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Finding the most similar documents across multiple text databases</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Clement Yu ; King-Lup Liu ; Wensheng Wu ; Weiyi Meng ; Rishe, N.</creator><creatorcontrib>Clement Yu ; King-Lup Liu ; Wensheng Wu ; Weiyi Meng ; Rishe, N.</creatorcontrib><description>We present a methodology for finding the n most similar documents across multiple text databases for any given query and for any positive integer n. This methodology consists of two steps. First, databases are ranked in a certain order. Next, documents are retrieved from the databases according to the order and in a particular way. If the databases containing the n most similar documents for a given query can be ranked ahead of other databases, the methodology will guarantee the retrieval of the n most similar documents for the query. A statistical method is provided to identify databases, each of which is estimated to contain at least one of the n most similar documents. Then, a number of strategies are presented to retrieve documents from the identified databases. 
Experimental results are given to illustrate the relative performance of different strategies.</description><identifier>ISSN: 1092-9959</identifier><identifier>ISBN: 9780769502199</identifier><identifier>ISBN: 0769502199</identifier><identifier>EISSN: 2378-7104</identifier><identifier>DOI: 10.1109/ADL.1999.777710</identifier><language>eng</language><publisher>IEEE</publisher><subject>Australia ; Computer networks ; Database systems ; Indexing ; Information retrieval ; Information systems ; Internet ; ISDN ; Machine learning ; Transaction databases</subject><ispartof>Proceedings IEEE Forum on Research and Technology Advances in Digital Libraries, 1999, p.150-162</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/777710$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,4050,4051,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/777710$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Clement Yu</creatorcontrib><creatorcontrib>King-Lup Liu</creatorcontrib><creatorcontrib>Wensheng Wu</creatorcontrib><creatorcontrib>Weiyi Meng</creatorcontrib><creatorcontrib>Rishe, N.</creatorcontrib><title>Finding the most similar documents across multiple text databases</title><title>Proceedings IEEE Forum on Research and Technology Advances in Digital Libraries</title><addtitle>ADL</addtitle><description>We present a methodology for finding the n most similar documents across multiple text databases for any given query and for any positive integer n. This methodology consists of two steps. First, databases are ranked in a certain order. Next, documents are retrieved from the databases according to the order and in a particular way. 
If the databases containing the n most similar documents for a given query can be ranked ahead of other databases, the methodology will guarantee the retrieval of the n most similar documents for the query. A statistical method is provided to identify databases, each of which is estimated to contain at least one of the n most similar documents. Then, a number of strategies are presented to retrieve documents from the identified databases. Experimental results are given to illustrate the relative performance of different strategies.</description><subject>Australia</subject><subject>Computer networks</subject><subject>Database systems</subject><subject>Indexing</subject><subject>Information retrieval</subject><subject>Information systems</subject><subject>Internet</subject><subject>ISDN</subject><subject>Machine learning</subject><subject>Transaction databases</subject><issn>1092-9959</issn><issn>2378-7104</issn><isbn>9780769502199</isbn><isbn>0769502199</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>1999</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNotT8lOwzAUtFgkSukZiZN_IOE5Trwco0IBKRIXOFdeXsAoSxW7Evw9ltq5zGg08_SGkHsGJWOgH9unrmRa61JmMLggq4pLVWRZX5KNlgqk0A1UOXNFVrlRFVo3-obcxvgDUAFXakXaXZh8mL5o-kY6zjHRGMYwmIX62R1HnFKkxi1zjHQ8DikcBqQJfxP1JhlrIsY7ct2bIeLmzGvyuXv-2L4W3fvL27btipA_SoWojW9k3fuK80YZsEr03judbenAaAGIYLPPLKDitUBhjeNOWs9r4xq-Jg-nuwER94cljGb525_G838IrUyk</recordid><startdate>1999</startdate><enddate>1999</enddate><creator>Clement Yu</creator><creator>King-Lup Liu</creator><creator>Wensheng Wu</creator><creator>Weiyi Meng</creator><creator>Rishe, N.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>1999</creationdate><title>Finding the most similar documents across multiple text databases</title><author>Clement Yu ; King-Lup Liu ; Wensheng Wu ; Weiyi 
Meng ; Rishe, N.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i104t-64ad574fd23358a0b86fddc94ad7c0a960ee0ba0b1b0e8346e6bac3c7bd34ac53</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>1999</creationdate><topic>Australia</topic><topic>Computer networks</topic><topic>Database systems</topic><topic>Indexing</topic><topic>Information retrieval</topic><topic>Information systems</topic><topic>Internet</topic><topic>ISDN</topic><topic>Machine learning</topic><topic>Transaction databases</topic><toplevel>online_resources</toplevel><creatorcontrib>Clement Yu</creatorcontrib><creatorcontrib>King-Lup Liu</creatorcontrib><creatorcontrib>Wensheng Wu</creatorcontrib><creatorcontrib>Weiyi Meng</creatorcontrib><creatorcontrib>Rishe, N.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Clement Yu</au><au>King-Lup Liu</au><au>Wensheng Wu</au><au>Weiyi Meng</au><au>Rishe, N.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Finding the most similar documents across multiple text databases</atitle><btitle>Proceedings IEEE Forum on Research and Technology Advances in Digital Libraries</btitle><stitle>ADL</stitle><date>1999</date><risdate>1999</risdate><spage>150</spage><epage>162</epage><pages>150-162</pages><issn>1092-9959</issn><eissn>2378-7104</eissn><isbn>9780769502199</isbn><isbn>0769502199</isbn><abstract>We present a methodology for finding the n most similar documents 
across multiple text databases for any given query and for any positive integer n. This methodology consists of two steps. First, databases are ranked in a certain order. Next, documents are retrieved from the databases according to the order and in a particular way. If the databases containing the n most similar documents for a given query can be ranked ahead of other databases, the methodology will guarantee the retrieval of the n most similar documents for the query. A statistical method is provided to identify databases, each of which is estimated to contain at least one of the n most similar documents. Then, a number of strategies are presented to retrieve documents from the identified databases. Experimental results are given to illustrate the relative performance of different strategies.</abstract><pub>IEEE</pub><doi>10.1109/ADL.1999.777710</doi><tpages>13</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1092-9959
ispartof	Proceedings IEEE Forum on Research and Technology Advances in Digital Libraries, 1999, p.150-162
issn	1092-9959 2378-7104
language	eng
recordid	cdi_ieee_primary_777710
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Australia; Computer networks; Database systems; Indexing; Information retrieval; Information systems; Internet; ISDN; Machine learning; Transaction databases
title	Finding the most similar documents across multiple text databases
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T11%3A09%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Finding%20the%20most%20similar%20documents%20across%20multiple%20text%20databases&rft.btitle=Proceedings%20IEEE%20Forum%20on%20Research%20and%20Technology%20Advances%20in%20Digital%20Libraries&rft.au=Clement%20Yu&rft.date=1999&rft.spage=150&rft.epage=162&rft.pages=150-162&rft.issn=1092-9959&rft.eissn=2378-7104&rft.isbn=9780769502199&rft.isbn_list=0769502199&rft_id=info:doi/10.1109/ADL.1999.777710&rft_dat=%3Cieee_6IE%3E777710%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=777710&rfr_iscdi=true
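
The two-step methodology summarized in the description field (rank the databases, then retrieve documents from them in that order) can be illustrated with a minimal sketch. This is not the authors' algorithm: the record does not say how databases are ranked or which retrieval strategies are used. The function name `top_n_across_databases`, the per-database estimate `estimated_max_sim`, and the stopping rule are all assumptions made for illustration only.

```python
import heapq
from typing import Dict, List, Tuple

Doc = Tuple[str, float]  # (document id, similarity to the query)


def top_n_across_databases(
    databases: Dict[str, List[Doc]],
    estimated_max_sim: Dict[str, float],
    n: int,
) -> List[Doc]:
    """Hypothetical sketch: rank databases, then merge their results in order."""
    # Step 1 (assumed ranking criterion): order databases by an estimate of
    # the highest similarity any of their documents has to the query.
    ranked = sorted(databases, key=lambda name: estimated_max_sim[name], reverse=True)

    best: List[Tuple[float, str]] = []  # min-heap holding the current top n
    for name in ranked:
        # Assumed stopping rule: once n documents are held and this database's
        # estimate cannot beat the weakest of them, skip the remaining databases.
        if len(best) == n and estimated_max_sim[name] <= best[0][0]:
            break
        # Step 2: take this database's n most similar documents and merge them.
        for doc_id, sim in heapq.nlargest(n, databases[name], key=lambda d: d[1]):
            if len(best) < n:
                heapq.heappush(best, (sim, doc_id))
            elif sim > best[0][0]:
                heapq.heapreplace(best, (sim, doc_id))
    # Return the held documents, most similar first.
    return [(doc_id, sim) for sim, doc_id in sorted(best, reverse=True)]


# Toy data: similarities are made up for the example.
databases = {
    "db_a": [("a1", 0.9), ("a2", 0.4)],
    "db_b": [("b1", 0.8), ("b2", 0.7)],
    "db_c": [("c1", 0.3)],
}
estimated = {"db_a": 0.9, "db_b": 0.8, "db_c": 0.3}
result = top_n_across_databases(databases, estimated, n=3)
print(result)
```

In this toy run `db_c` is never searched, because its estimate falls below the weakest of the three documents already held. The result is exact only if the estimates never understate a database's true best similarity, which mirrors the condition in the description that the databases holding the n most similar documents must be ranked ahead of the others.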