Implicit ambiguity resolution using incremental clustering in cross-language information retrieval

This paper presents a method for implicitly resolving ambiguities using dynamic incremental clustering in cross-language information retrieval (CLIR), such as Korean-to-English and Japanese-to-English CLIR. The main objective of this paper is to show that document clusters can effectively resolve the ambigui...

Detailed description

Bibliographic details
Published in:	Information processing & management 2004, Vol.40 (1), p.145-159
Main authors:	LEE, Kyung-Soon ; KAGEURA, Kyo ; CHOI, Key-Sun
Format:	Article
Language:	eng
Subjects:	Ambiguity ; Cluster analysis ; Computerized information storage and retrieval ; Document delivery ; Exact sciences and technology ; Information and communication sciences ; Information retrieval ; Information retrieval systems. Information and document management system ; Information science. Documentation ; Languages ; Multilingual systems ; Queries ; Sciences and techniques of general use ; Searching ; Studies ; System design and modelling ; Translations ; Vector space
Online access:	Full text

container_end_page	159
container_issue	1
container_start_page	145
container_title	Information processing & management
container_volume	40
creator	LEE, Kyung-Soon ; KAGEURA, Kyo ; CHOI, Key-Sun
description	This paper presents a method for implicitly resolving ambiguities using dynamic incremental clustering in cross-language information retrieval (CLIR), such as Korean-to-English and Japanese-to-English CLIR. The main objective of this paper is to show that document clusters can effectively resolve the ambiguities that increase greatly in translated queries, while also taking into account the context of all the terms in a document. In the framework we propose, a query in Korean or Japanese is first translated into English by looking up bilingual dictionaries; documents are then retrieved for the translated query terms using the vector space retrieval model or the probabilistic retrieval model. For the top-ranked retrieved documents, query-oriented document clusters are incrementally created, and the weight of each retrieved document is recalculated using the clusters. In experiments on the TREC CLIR test collection, our method achieved improvements of 39.41% and 36.79% for translated queries without ambiguity resolution in Korean-to-English CLIR, and of 17.89% and 30.46% in Japanese-to-English CLIR, with vector space retrieval and probabilistic retrieval, respectively. Our method achieved a 12.30% improvement for all translation queries compared with blind feedback for probabilistic retrieval in Korean-to-English CLIR. These results indicate that cluster analysis helps to resolve ambiguity.
doi_str_mv	10.1016/S0306-4573(03)00028-1
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_57599873</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>57599873</sourcerecordid><originalsourceid>FETCH-LOGICAL-c339t-30e316b66ccf53ef6a2761bfcd07f1abf0ee363d798d6efad763d3faf8b5d1463</originalsourceid><addsrcrecordid>eNpdkF1LwzAUhoMoOKc_QSiCohfVpGnS9lKGH4OBF-p1SNOTkpG2M0mF_XvTbSgIB8I5PO_h5EHokuB7ggl_eMcU8zRnBb3F9A5jnJUpOUIzUhY0ZbQgx2j2i5yiM-_XEcoZyWaoXnYba5QJiexq044mbBMHfrBjMEOfjN70bWJ65aCDPkibKDv6AG4_TpQbvE-t7NtRthAnenCd3EUdBGfgW9pzdKKl9XBxeOfo8_npY_Gart5elovHVaoorUJKMVDCa86V0oyC5jIrOKm1anChiaw1BqCcNkVVNhy0bIrYUC11WbOG5JzO0c1-78YNXyP4IDrjFdh4HQyjF6xgVRWVRPDqH7geRtfH2wSp8irPSk4ixPbQ7osOtNg400m3FQSLSbvYaReTU4FjTdrFlLs-LJdeSaud7JXxf2FG84qXGf0BU12FjA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>194942861</pqid></control><display><type>article</type><title>Implicit ambiguity resolution using incremental clustering in cross-language information retrieval</title><source>Elsevier ScienceDirect Journals</source><creator>LEE, Kyung-Soon ; KAGEURA, Kyo ; CHOI, Key-Sun</creator><creatorcontrib>LEE, Kyung-Soon ; KAGEURA, Kyo ; CHOI, Key-Sun</creatorcontrib><description>This paper presents a method to implicitly resolve ambiguities using dynamic incremental clustering in cross-language information retrieval (CLIR) such as Korean-to-English and Japanese-to-English CLIR. The main objective of this paper shows that document clusters can effectively resolve the ambiguities tremendously increased in translated queries as well as take into account the context of all the terms in a document. In the framework we propose, a query in Korean/Japanese is first translated into English by looking up bilingual dictionaries, then documents are retrieved for the translated query terms based on the vector space retrieval model or the probabilistic retrieval model. 
For the top-ranked retrieved documents, query-oriented document clusters are incrementally created and the weight of each retrieved document is re-calculated by using the clusters. In the experiment based on TREC CLIR test collection, our method achieved 39.41% and 36.79% improvement for translated queries without ambiguity resolution in Korean-to-English CLIR, and 17.89% and 30.46% improvements in Japanese-to-English CLIR, on the vector space retrieval and on the probabilistic retrieval, respectively. Our method achieved 12.30% improvement for all translation queries, compared with blind feedback for the probabilistic retrieval in Korean-to-English CLIR. These results indicate that cluster analysis helps to resolve ambiguity.</description><identifier>ISSN: 0306-4573</identifier><identifier>EISSN: 1873-5371</identifier><identifier>DOI: 10.1016/S0306-4573(03)00028-1</identifier><identifier>CODEN: IPMADK</identifier><language>eng</language><publisher>Oxford: Elsevier Science</publisher><subject>Ambiguity ; Cluster analysis ; Computerized information storage and retrieval ; Document delivery ; Exact sciences and technology ; Information and communication sciences ; Information retrieval ; Information retrieval systems. Information and document management system ; Information science. Documentation ; Languages ; Multilingual systems ; Queries ; Sciences and techniques of general use ; Searching ; Studies ; System design and modelling ; Translations ; Vector space</subject><ispartof>Information processing & management, 2004, Vol.40 (1), p.145-159</ispartof><rights>2004 INIST-CNRS</rights><rights>Copyright Pergamon Press Inc. 
Jan 2004</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c339t-30e316b66ccf53ef6a2761bfcd07f1abf0ee363d798d6efad763d3faf8b5d1463</citedby><cites>FETCH-LOGICAL-c339t-30e316b66ccf53ef6a2761bfcd07f1abf0ee363d798d6efad763d3faf8b5d1463</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,777,781,4010,27904,27905,27906</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=15349682$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>LEE, Kyung-Soon</creatorcontrib><creatorcontrib>KAGEURA, Kyo</creatorcontrib><creatorcontrib>CHOI, Key-Sun</creatorcontrib><title>Implicit ambiguity resolution using incremental clustering in cross-language information retrieval</title><title>Information processing & management</title><description>This paper presents a method to implicitly resolve ambiguities using dynamic incremental clustering in cross-language information retrieval (CLIR) such as Korean-to-English and Japanese-to-English CLIR. The main objective of this paper shows that document clusters can effectively resolve the ambiguities tremendously increased in translated queries as well as take into account the context of all the terms in a document. In the framework we propose, a query in Korean/Japanese is first translated into English by looking up bilingual dictionaries, then documents are retrieved for the translated query terms based on the vector space retrieval model or the probabilistic retrieval model. For the top-ranked retrieved documents, query-oriented document clusters are incrementally created and the weight of each retrieved document is re-calculated by using the clusters. 
In the experiment based on TREC CLIR test collection, our method achieved 39.41% and 36.79% improvement for translated queries without ambiguity resolution in Korean-to-English CLIR, and 17.89% and 30.46% improvements in Japanese-to-English CLIR, on the vector space retrieval and on the probabilistic retrieval, respectively. Our method achieved 12.30% improvement for all translation queries, compared with blind feedback for the probabilistic retrieval in Korean-to-English CLIR. These results indicate that cluster analysis helps to resolve ambiguity.</description><subject>Ambiguity</subject><subject>Cluster analysis</subject><subject>Computerized information storage and retrieval</subject><subject>Document delivery</subject><subject>Exact sciences and technology</subject><subject>Information and communication sciences</subject><subject>Information retrieval</subject><subject>Information retrieval systems. Information and document management system</subject><subject>Information science. Documentation</subject><subject>Languages</subject><subject>Multilingual systems</subject><subject>Queries</subject><subject>Sciences and techniques of general use</subject><subject>Searching</subject><subject>Studies</subject><subject>System design and modelling</subject><subject>Translations</subject><subject>Vector 
space</subject><issn>0306-4573</issn><issn>1873-5371</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2004</creationdate><recordtype>article</recordtype><recordid>eNpdkF1LwzAUhoMoOKc_QSiCohfVpGnS9lKGH4OBF-p1SNOTkpG2M0mF_XvTbSgIB8I5PO_h5EHokuB7ggl_eMcU8zRnBb3F9A5jnJUpOUIzUhY0ZbQgx2j2i5yiM-_XEcoZyWaoXnYba5QJiexq044mbBMHfrBjMEOfjN70bWJ65aCDPkibKDv6AG4_TpQbvE-t7NtRthAnenCd3EUdBGfgW9pzdKKl9XBxeOfo8_npY_Gart5elovHVaoorUJKMVDCa86V0oyC5jIrOKm1anChiaw1BqCcNkVVNhy0bIrYUC11WbOG5JzO0c1-78YNXyP4IDrjFdh4HQyjF6xgVRWVRPDqH7geRtfH2wSp8irPSk4ixPbQ7osOtNg400m3FQSLSbvYaReTU4FjTdrFlLs-LJdeSaud7JXxf2FG84qXGf0BU12FjA</recordid><startdate>2004</startdate><enddate>2004</enddate><creator>LEE, Kyung-Soon</creator><creator>KAGEURA, Kyo</creator><creator>CHOI, Key-Sun</creator><general>Elsevier Science</general><general>Elsevier Science Ltd</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>E3H</scope><scope>F2A</scope></search><sort><creationdate>2004</creationdate><title>Implicit ambiguity resolution using incremental clustering in cross-language information retrieval</title><author>LEE, Kyung-Soon ; KAGEURA, Kyo ; CHOI, Key-Sun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c339t-30e316b66ccf53ef6a2761bfcd07f1abf0ee363d798d6efad763d3faf8b5d1463</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Ambiguity</topic><topic>Cluster analysis</topic><topic>Computerized information storage and retrieval</topic><topic>Document delivery</topic><topic>Exact sciences and technology</topic><topic>Information and communication sciences</topic><topic>Information retrieval</topic><topic>Information retrieval systems. Information and document management system</topic><topic>Information science. 
Documentation</topic><topic>Languages</topic><topic>Multilingual systems</topic><topic>Queries</topic><topic>Sciences and techniques of general use</topic><topic>Searching</topic><topic>Studies</topic><topic>System design and modelling</topic><topic>Translations</topic><topic>Vector space</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>LEE, Kyung-Soon</creatorcontrib><creatorcontrib>KAGEURA, Kyo</creatorcontrib><creatorcontrib>CHOI, Key-Sun</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Library & Information Sciences Abstracts (LISA)</collection><collection>Library & Information Science Abstracts (LISA)</collection><jtitle>Information processing & management</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>LEE, Kyung-Soon</au><au>KAGEURA, Kyo</au><au>CHOI, Key-Sun</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Implicit ambiguity resolution using incremental clustering in cross-language information retrieval</atitle><jtitle>Information processing & management</jtitle><date>2004</date><risdate>2004</risdate><volume>40</volume><issue>1</issue><spage>145</spage><epage>159</epage><pages>145-159</pages><issn>0306-4573</issn><eissn>1873-5371</eissn><coden>IPMADK</coden><abstract>This paper presents a method to implicitly resolve ambiguities using dynamic incremental clustering in cross-language information retrieval (CLIR) such as Korean-to-English and Japanese-to-English CLIR. The main objective of this paper shows that document clusters can effectively resolve the ambiguities tremendously increased in translated queries as well as take into account the context of all the terms in a document. 
In the framework we propose, a query in Korean/Japanese is first translated into English by looking up bilingual dictionaries, then documents are retrieved for the translated query terms based on the vector space retrieval model or the probabilistic retrieval model. For the top-ranked retrieved documents, query-oriented document clusters are incrementally created and the weight of each retrieved document is re-calculated by using the clusters. In the experiment based on TREC CLIR test collection, our method achieved 39.41% and 36.79% improvement for translated queries without ambiguity resolution in Korean-to-English CLIR, and 17.89% and 30.46% improvements in Japanese-to-English CLIR, on the vector space retrieval and on the probabilistic retrieval, respectively. Our method achieved 12.30% improvement for all translation queries, compared with blind feedback for the probabilistic retrieval in Korean-to-English CLIR. These results indicate that cluster analysis helps to resolve ambiguity.</abstract><cop>Oxford</cop><pub>Elsevier Science</pub><doi>10.1016/S0306-4573(03)00028-1</doi><tpages>15</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0306-4573
ispartof	Information processing & management, 2004, Vol.40 (1), p.145-159
issn	0306-4573 1873-5371
language	eng
recordid	cdi_proquest_miscellaneous_57599873
source	Elsevier ScienceDirect Journals
subjects	Ambiguity ; Cluster analysis ; Computerized information storage and retrieval ; Document delivery ; Exact sciences and technology ; Information and communication sciences ; Information retrieval ; Information retrieval systems. Information and document management system ; Information science. Documentation ; Languages ; Multilingual systems ; Queries ; Sciences and techniques of general use ; Searching ; Studies ; System design and modelling ; Translations ; Vector space
title	Implicit ambiguity resolution using incremental clustering in cross-language information retrieval
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T12%3A51%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Implicit%20ambiguity%20resolution%20using%20incremental%20clustering%20in%20cross-language%20information%20retrieval&rft.jtitle=Information%20processing%20&%20management&rft.au=LEE,%20Kyung-Soon&rft.date=2004&rft.volume=40&rft.issue=1&rft.spage=145&rft.epage=159&rft.pages=145-159&rft.issn=0306-4573&rft.eissn=1873-5371&rft.coden=IPMADK&rft_id=info:doi/10.1016/S0306-4573(03)00028-1&rft_dat=%3Cproquest_cross%3E57599873%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=194942861&rft_id=info:pmid/&rfr_iscdi=true
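
The description field above outlines a pipeline: retrieve documents for the dictionary-translated query, cluster the top-ranked documents incrementally, then recalculate each document's weight using the clusters. The record gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. The cosine similarity measure, the single-pass clustering rule, the `threshold` and `alpha` parameters, and the linear score combination are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def incremental_clusters(doc_vectors, threshold=0.3):
    """Single-pass clustering (assumed rule): each document, taken in rank
    order, joins the most similar existing cluster, or starts a new cluster
    when no similarity reaches the threshold."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [doc index]}
    for idx, vec in enumerate(doc_vectors):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [idx]})
        else:
            # Summed term counts serve as the centroid; cosine ignores scale.
            best["centroid"].update(vec)
            best["members"].append(idx)
    return clusters


def rescore(query_vec, doc_vectors, scores, threshold=0.3, alpha=0.5):
    """Assumed re-weighting: mix each document's original retrieval score
    with the query similarity of the cluster the document belongs to."""
    new_scores = list(scores)
    for cluster in incremental_clusters(doc_vectors, threshold):
        cluster_sim = cosine(query_vec, cluster["centroid"])
        for idx in cluster["members"]:
            new_scores[idx] = alpha * scores[idx] + (1 - alpha) * cluster_sim
    return new_scores


# Toy data (invented): "bank" is ambiguous between finance and river senses.
docs = [
    "bank loan interest rate",
    "bank interest loan credit",
    "river bank shore erosion flood",
]
doc_vectors = [Counter(d.split()) for d in docs]
query_vec = Counter("bank loan".split())
initial_scores = [0.5, 0.5, 0.5]  # pretend the first-pass ranker tied them
print(rescore(query_vec, doc_vectors, initial_scores))
```

In this toy run the two finance documents fall into one cluster whose centroid matches the query more closely than the river document does, so their scores rise relative to it. This is the kind of implicit disambiguation the abstract attributes to document clusters, although the actual weighting scheme in the paper may differ.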