Finding the most similar documents across multiple text databases
We present a methodology for finding the n most similar documents across multiple text databases for any given query and for any positive integer n. This methodology consists of two steps. First, databases are ranked in a certain order. Next, documents are retrieved from the databases according to t...
Main authors: | Clement Yu ; King-Lup Liu ; Wensheng Wu ; Weiyi Meng ; Rishe, N. |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 162 |
---|---|
container_issue | |
container_start_page | 150 |
container_title | |
container_volume | |
creator | Clement Yu ; King-Lup Liu ; Wensheng Wu ; Weiyi Meng ; Rishe, N. |
description | We present a methodology for finding the n most similar documents across multiple text databases for any given query and for any positive integer n. This methodology consists of two steps. First, databases are ranked in a certain order. Next, documents are retrieved from the databases according to the order and in a particular way. If the databases containing the n most similar documents for a given query can be ranked ahead of other databases, the methodology will guarantee the retrieval of the n most similar documents for the query. A statistical method is provided to identify databases, each of which is estimated to contain at least one of the n most similar documents. Then, a number of strategies are presented to retrieve documents from the identified databases. Experimental results are given to illustrate the relative performance of different strategies. |
doi_str_mv | 10.1109/ADL.1999.777710 |
format | Conference Proceeding |
fullrecord | <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_777710</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>777710</ieee_id><sourcerecordid>777710</sourcerecordid><originalsourceid>FETCH-LOGICAL-i104t-64ad574fd23358a0b86fddc94ad7c0a960ee0ba0b1b0e8346e6bac3c7bd34ac53</originalsourceid><addsrcrecordid>eNotT8lOwzAUtFgkSukZiZN_IOE5Trwco0IBKRIXOFdeXsAoSxW7Evw9ltq5zGg08_SGkHsGJWOgH9unrmRa61JmMLggq4pLVWRZX5KNlgqk0A1UOXNFVrlRFVo3-obcxvgDUAFXakXaXZh8mL5o-kY6zjHRGMYwmIX62R1HnFKkxi1zjHQ8DikcBqQJfxP1JhlrIsY7ct2bIeLmzGvyuXv-2L4W3fvL27btipA_SoWojW9k3fuK80YZsEr03judbenAaAGIYLPPLKDitUBhjeNOWs9r4xq-Jg-nuwER94cljGb525_G838IrUyk</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Finding the most similar documents across multiple text databases</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Clement Yu ; King-Lup Liu ; Wensheng Wu ; Weiyi Meng ; Rishe, N.</creator><creatorcontrib>Clement Yu ; King-Lup Liu ; Wensheng Wu ; Weiyi Meng ; Rishe, N.</creatorcontrib><description>We present a methodology for finding the n most similar documents across multiple text databases for any given query and for any positive integer n. This methodology consists of two steps. First, databases are ranked in a certain order. Next, documents are retrieved from the databases according to the order and in a particular way. If the databases containing the n most similar documents for a given query can be ranked ahead of other databases, the methodology will guarantee the retrieval of the n most similar documents for the query. A statistical method is provided to identify databases, each of which is estimated to contain at least one of the n most similar documents. Then, a number of strategies are presented to retrieve documents from the identified databases. Experimental results are given to illustrate the relative performance of different strategies.</description><identifier>ISSN: 1092-9959</identifier><identifier>ISBN: 9780769502199</identifier><identifier>ISBN: 0769502199</identifier><identifier>EISSN: 2378-7104</identifier><identifier>DOI: 10.1109/ADL.1999.777710</identifier><language>eng</language><publisher>IEEE</publisher><subject>Australia ; Computer networks ; Database systems ; Indexing ; Information retrieval ; Information systems ; Internet ; ISDN ; Machine learning ; Transaction databases</subject><ispartof>Proceedings IEEE Forum on Research and Technology Advances in Digital Libraries, 1999, p.150-162</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/777710$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,4050,4051,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/777710$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Clement Yu</creatorcontrib><creatorcontrib>King-Lup Liu</creatorcontrib><creatorcontrib>Wensheng Wu</creatorcontrib><creatorcontrib>Weiyi Meng</creatorcontrib><creatorcontrib>Rishe, N.</creatorcontrib><title>Finding the most similar documents across multiple text databases</title><title>Proceedings IEEE Forum on Research and Technology Advances in Digital Libraries</title><addtitle>ADL</addtitle><subject>Australia</subject><subject>Computer networks</subject><subject>Database systems</subject><subject>Indexing</subject><subject>Information retrieval</subject><subject>Information systems</subject><subject>Internet</subject><subject>ISDN</subject><subject>Machine learning</subject><subject>Transaction databases</subject><issn>1092-9959</issn><issn>2378-7104</issn><isbn>9780769502199</isbn><isbn>0769502199</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>1999</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNotT8lOwzAUtFgkSukZiZN_IOE5Trwco0IBKRIXOFdeXsAoSxW7Evw9ltq5zGg08_SGkHsGJWOgH9unrmRa61JmMLggq4pLVWRZX5KNlgqk0A1UOXNFVrlRFVo3-obcxvgDUAFXakXaXZh8mL5o-kY6zjHRGMYwmIX62R1HnFKkxi1zjHQ8DikcBqQJfxP1JhlrIsY7ct2bIeLmzGvyuXv-2L4W3fvL27btipA_SoWojW9k3fuK80YZsEr03judbenAaAGIYLPPLKDitUBhjeNOWs9r4xq-Jg-nuwER94cljGb525_G838IrUyk</recordid><startdate>1999</startdate><enddate>1999</enddate><creator>Clement Yu</creator><creator>King-Lup Liu</creator><creator>Wensheng Wu</creator><creator>Weiyi Meng</creator><creator>Rishe, N.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>1999</creationdate><title>Finding the most similar documents across multiple text databases</title><author>Clement Yu ; King-Lup Liu ; Wensheng Wu ; Weiyi Meng ; Rishe, N.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i104t-64ad574fd23358a0b86fddc94ad7c0a960ee0ba0b1b0e8346e6bac3c7bd34ac53</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>1999</creationdate><topic>Australia</topic><topic>Computer networks</topic><topic>Database systems</topic><topic>Indexing</topic><topic>Information retrieval</topic><topic>Information systems</topic><topic>Internet</topic><topic>ISDN</topic><topic>Machine learning</topic><topic>Transaction databases</topic><toplevel>online_resources</toplevel><creatorcontrib>Clement Yu</creatorcontrib><creatorcontrib>King-Lup Liu</creatorcontrib><creatorcontrib>Wensheng Wu</creatorcontrib><creatorcontrib>Weiyi Meng</creatorcontrib><creatorcontrib>Rishe, N.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Clement Yu</au><au>King-Lup Liu</au><au>Wensheng Wu</au><au>Weiyi Meng</au><au>Rishe, N.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Finding the most similar documents across multiple text databases</atitle><btitle>Proceedings IEEE Forum on Research and Technology Advances in Digital Libraries</btitle><stitle>ADL</stitle><date>1999</date><risdate>1999</risdate><spage>150</spage><epage>162</epage><pages>150-162</pages><issn>1092-9959</issn><eissn>2378-7104</eissn><isbn>9780769502199</isbn><isbn>0769502199</isbn><pub>IEEE</pub><doi>10.1109/ADL.1999.777710</doi><tpages>13</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1092-9959 |
ispartof | Proceedings IEEE Forum on Research and Technology Advances in Digital Libraries, 1999, p.150-162 |
issn | 1092-9959 ; 2378-7104 |
language | eng |
recordid | cdi_ieee_primary_777710 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Australia ; Computer networks ; Database systems ; Indexing ; Information retrieval ; Information systems ; Internet ; ISDN ; Machine learning ; Transaction databases |
title | Finding the most similar documents across multiple text databases |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T11%3A09%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Finding%20the%20most%20similar%20documents%20across%20multiple%20text%20databases&rft.btitle=Proceedings%20IEEE%20Forum%20on%20Research%20and%20Technology%20Advances%20in%20Digital%20Libraries&rft.au=Clement%20Yu&rft.date=1999&rft.spage=150&rft.epage=162&rft.pages=150-162&rft.issn=1092-9959&rft.eissn=2378-7104&rft.isbn=9780769502199&rft.isbn_list=0769502199&rft_id=info:doi/10.1109/ADL.1999.777710&rft_dat=%3Cieee_6IE%3E777710%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=777710&rfr_iscdi=true |
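
The two-step approach summarized in the description field (rank the databases, then retrieve documents in that order) can be illustrated with a minimal sketch. This is not the authors' algorithm: the record does not describe their statistical estimator or their retrieval strategies. The sketch assumes each database can report exact query similarities and uses the similarity of each database's best document as its ranking score. The names `rank_databases` and `top_n_across` and the sample data are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Doc = Tuple[str, float]  # (document id, similarity to the query)


def rank_databases(databases: Dict[str, List[Doc]]) -> List[Tuple[str, float]]:
    """Order databases by the similarity of their best document, descending.

    Assumption: the exact maximum is known. The paper instead estimates
    which databases hold top-n documents with a statistical method.
    """
    scores = [
        (name, max((sim for _, sim in docs), default=0.0))
        for name, docs in databases.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


def top_n_across(databases: Dict[str, List[Doc]], n: int) -> List[Doc]:
    """Return the n most similar documents, visiting databases in ranked order."""
    best: List[Tuple[float, str]] = []  # min-heap holding the current top n
    for name, bound in rank_databases(databases):
        # Stop once n documents are held and no unvisited database
        # can contain a document better than the weakest one kept.
        if len(best) == n and best[0][0] >= bound:
            break
        for doc_id, sim in databases[name]:
            if len(best) < n:
                heapq.heappush(best, (sim, doc_id))
            elif sim > best[0][0]:
                heapq.heapreplace(best, (sim, doc_id))
    return [(doc_id, sim) for sim, doc_id in sorted(best, reverse=True)]


# Hypothetical similarities for one query over three databases.
sample = {
    "A": [("a1", 0.9), ("a2", 0.2)],
    "B": [("b1", 0.8), ("b2", 0.7)],
    "C": [("c1", 0.1)],
}
print(top_n_across(sample, 3))
```

With this sample, database C is never searched: after A and B have been visited, the weakest of the three retained documents already scores higher than C's best document. This mirrors the condition stated in the description, that the top n documents are guaranteed when the databases containing them are ranked ahead of the others.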