Query translation for CLIR: EWC vs. Google Translate
A new approach to accurately translating search engine queries from Japanese into English for the cross-language information retrieval (CLIR) task is proposed. The MeCab system and the online dictionary SPACEALC are used to segment Japanese queries and to retrieve all possible English senses for every detected term. To disambiguate term...
Saved in:
Main authors: | Klyuev, V. ; Haralambous, Y. |
---|---|
Format: | Conference proceeding |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 711 |
---|---|
container_issue | |
container_start_page | 707 |
container_title | |
container_volume | |
creator | Klyuev, V. ; Haralambous, Y. |
description | A new approach to accurately translating search engine queries from Japanese into English for the cross-language information retrieval (CLIR) task is proposed. The MeCab system and the online dictionary SPACEALC are used to segment Japanese queries and to retrieve all possible English senses for every detected term. Terms are disambiguated by finding the shortest path on an oriented graph: nodes of this graph represent word senses, and edges connect nodes that belong to neighboring Japanese terms. The EWC semantic relatedness measure is used to select the most closely related meanings for the translation. This measure combines the Wikipedia-based Explicit Semantic Analysis measure, the WordNet path measure, and the mixed collocation index. The proposed technique is tested on the NTCIR data collection. Queries generated by Google Translate are used to evaluate the quality of the translation. |
doi_str_mv | 10.1109/ICIST.2012.6221738 |
format | Conference Proceeding |
fullrecord | <record><control><sourceid>hal_6IE</sourceid><recordid>TN_cdi_ieee_primary_6221738</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6221738</ieee_id><sourcerecordid>oai_HAL_hal_00959927v1</sourcerecordid><originalsourceid>FETCH-LOGICAL-h1688-c2bfae4aade4613ac2ca9589c6a4696f0452e8ce7c6117d93c21bbd0a61223fe3</originalsourceid><addsrcrecordid>eNo9kE1Lw0AYhFdUsNb8Ab3s1UPivvu93kqobSAgasRjeLPZ2EhsJKmF_nsrrc5lmOFhDkPINbAEgLm7LM1eioQz4InmHIywJ-QSpDKGCancKYmcsX9ZsDMy4aBlLIUyFyQaxw-2l1EgtJ0Q-fQdhh3dDLgeO9y0_Zo2_UDTPHu-p_O3lG7HhC76_r0LtDhC4YqcN9iNITr6lLw-zIt0GeePiyyd5fEKtLWx51WDQSLWQWoQ6LlHp6zzGqV2umFS8WB9MF4DmNoJz6GqaoYaOBdNEFNye9hdYVd-De0nDruyx7ZczvLyt2PMKee42cKevTmwbQjhHz4eJH4A82hU0Q</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Query translation for CLIR: EWC vs. Google Translate</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Klyuev, V. ; Haralambous, Y.</creator><creatorcontrib>Klyuev, V. ; Haralambous, Y.</creatorcontrib><description>A new approach to find accurate translation of search engine queries from Japanese into English for the CLIR task is proposed. The Mecab system and online dictionary SPACEALC are utilized to segment Japanese queries and to get all possible English senses for every term detected. To disambiguate terms, the idea of the shortest path on an oriented graph is applied. Nodes of this graph symbolize word senses and edges connect nodes representing neighboring Japanese terms. The EWC semantic relatedness measure is used to select the most related meanings for the translation results. This measure combines the Wikipedia-based Explicit Semantic Analysis measure, the WordNet path measure and the mixed collocation index. The proposed technique is tested on the NTCIR data collection. 
Queries generated by Google Translate were used to evaluate the quality of translation.</description><identifier>ISSN: 2164-4357</identifier><identifier>ISBN: 9781457703430</identifier><identifier>ISBN: 1457703432</identifier><identifier>EISBN: 1457703459</identifier><identifier>EISBN: 1457703440</identifier><identifier>EISBN: 9781457703447</identifier><identifier>EISBN: 9781457703454</identifier><identifier>DOI: 10.1109/ICIST.2012.6221738</identifier><language>eng</language><publisher>IEEE</publisher><subject>Electronic publishing ; Encyclopedias ; Engineering Sciences ; Google ; Information retrieval ; Internet ; Semantics</subject><ispartof>2012 IEEE International Conference on Information Science and Technology, 2012, p.707-711</ispartof><rights>Attribution</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><orcidid>0000-0003-1443-6115</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6221738$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>230,309,310,776,780,785,786,881,2052,4036,4037,27902,54895</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6221738$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://hal.science/hal-00959927$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Klyuev, V.</creatorcontrib><creatorcontrib>Haralambous, Y.</creatorcontrib><title>Query translation for CLIR: EWC vs. Google Translate</title><title>2012 IEEE International Conference on Information Science and Technology</title><addtitle>ICIST</addtitle><description>A new approach to find accurate translation of search engine queries from Japanese into English for the CLIR task is proposed. 
The Mecab system and online dictionary SPACEALC are utilized to segment Japanese queries and to get all possible English senses for every term detected. To disambiguate terms, the idea of the shortest path on an oriented graph is applied. Nodes of this graph symbolize word senses and edges connect nodes representing neighboring Japanese terms. The EWC semantic relatedness measure is used to select the most related meanings for the translation results. This measure combines the Wikipedia-based Explicit Semantic Analysis measure, the WordNet path measure and the mixed collocation index. The proposed technique is tested on the NTCIR data collection. Queries generated by Google Translate were used to evaluate the quality of translation.</description><subject>Electronic publishing</subject><subject>Encyclopedias</subject><subject>Engineering Sciences</subject><subject>Google</subject><subject>Information retrieval</subject><subject>Internet</subject><subject>Semantics</subject><issn>2164-4357</issn><isbn>9781457703430</isbn><isbn>1457703432</isbn><isbn>1457703459</isbn><isbn>1457703440</isbn><isbn>9781457703447</isbn><isbn>9781457703454</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2012</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo9kE1Lw0AYhFdUsNb8Ab3s1UPivvu93kqobSAgasRjeLPZ2EhsJKmF_nsrrc5lmOFhDkPINbAEgLm7LM1eioQz4InmHIywJ-QSpDKGCancKYmcsX9ZsDMy4aBlLIUyFyQaxw-2l1EgtJ0Q-fQdhh3dDLgeO9y0_Zo2_UDTPHu-p_O3lG7HhC76_r0LtDhC4YqcN9iNITr6lLw-zIt0GeePiyyd5fEKtLWx51WDQSLWQWoQ6LlHp6zzGqV2umFS8WB9MF4DmNoJz6GqaoYaOBdNEFNye9hdYVd-De0nDruyx7ZczvLyt2PMKee42cKevTmwbQjhHz4eJH4A82hU0Q</recordid><startdate>201203</startdate><enddate>201203</enddate><creator>Klyuev, V.</creator><creator>Haralambous, 
Y.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope><scope>1XC</scope><scope>VOOES</scope><orcidid>https://orcid.org/0000-0003-1443-6115</orcidid></search><sort><creationdate>201203</creationdate><title>Query translation for CLIR: EWC vs. Google Translate</title><author>Klyuev, V. ; Haralambous, Y.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-h1688-c2bfae4aade4613ac2ca9589c6a4696f0452e8ce7c6117d93c21bbd0a61223fe3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Electronic publishing</topic><topic>Encyclopedias</topic><topic>Engineering Sciences</topic><topic>Google</topic><topic>Information retrieval</topic><topic>Internet</topic><topic>Semantics</topic><toplevel>online_resources</toplevel><creatorcontrib>Klyuev, V.</creatorcontrib><creatorcontrib>Haralambous, Y.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection><collection>Hyper Article en Ligne (HAL)</collection><collection>Hyper Article en Ligne (HAL) (Open Access)</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Klyuev, V.</au><au>Haralambous, Y.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Query translation for CLIR: EWC vs. 
Google Translate</atitle><btitle>2012 IEEE International Conference on Information Science and Technology</btitle><stitle>ICIST</stitle><date>2012-03</date><risdate>2012</risdate><spage>707</spage><epage>711</epage><pages>707-711</pages><issn>2164-4357</issn><isbn>9781457703430</isbn><isbn>1457703432</isbn><eisbn>1457703459</eisbn><eisbn>1457703440</eisbn><eisbn>9781457703447</eisbn><eisbn>9781457703454</eisbn><abstract>A new approach to find accurate translation of search engine queries from Japanese into English for the CLIR task is proposed. The Mecab system and online dictionary SPACEALC are utilized to segment Japanese queries and to get all possible English senses for every term detected. To disambiguate terms, the idea of the shortest path on an oriented graph is applied. Nodes of this graph symbolize word senses and edges connect nodes representing neighboring Japanese terms. The EWC semantic relatedness measure is used to select the most related meanings for the translation results. This measure combines the Wikipedia-based Explicit Semantic Analysis measure, the WordNet path measure and the mixed collocation index. The proposed technique is tested on the NTCIR data collection. Queries generated by Google Translate were used to evaluate the quality of translation.</abstract><pub>IEEE</pub><doi>10.1109/ICIST.2012.6221738</doi><tpages>5</tpages><orcidid>https://orcid.org/0000-0003-1443-6115</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2164-4357 |
ispartof | 2012 IEEE International Conference on Information Science and Technology, 2012, p.707-711 |
issn | 2164-4357 |
language | eng |
recordid | cdi_ieee_primary_6221738 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Electronic publishing ; Encyclopedias ; Engineering Sciences ; Google ; Information retrieval ; Internet ; Semantics |
title | Query translation for CLIR: EWC vs. Google Translate |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T08%3A02%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-hal_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Query%20translation%20for%20CLIR:%20EWC%20vs.%20Google%20Translate&rft.btitle=2012%20IEEE%20International%20Conference%20on%20Information%20Science%20and%20Technology&rft.au=Klyuev,%20V.&rft.date=2012-03&rft.spage=707&rft.epage=711&rft.pages=707-711&rft.issn=2164-4357&rft.isbn=9781457703430&rft.isbn_list=1457703432&rft_id=info:doi/10.1109/ICIST.2012.6221738&rft_dat=%3Chal_6IE%3Eoai_HAL_hal_00959927v1%3C/hal_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1457703459&rft.eisbn_list=1457703440&rft.eisbn_list=9781457703447&rft.eisbn_list=9781457703454&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6221738&rfr_iscdi=true |
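
The description field of this record outlines sense selection as a shortest-path search: one layer of candidate English senses per Japanese term, with edges only between senses of neighboring terms. The following is a minimal sketch of that idea, not the authors' implementation. It assumes an edge cost of `1 - relatedness`, which the record does not specify, and it replaces the EWC measure with a hypothetical hand-written lookup table (`TOY_SCORES`). The function names and the example words are likewise illustrative only.

```python
from typing import Callable, Dict, List, Sequence


def best_translation(
    senses_per_term: Sequence[Sequence[str]],
    relatedness: Callable[[str, str], float],
) -> List[str]:
    """Choose one sense per term via a shortest path through a layered graph.

    Each inner sequence holds the candidate English senses of one query term.
    Edges link senses of neighboring terms only. The edge cost used here,
    1 - relatedness, is an assumption made for this sketch.
    """
    if not senses_per_term or any(len(layer) == 0 for layer in senses_per_term):
        return []

    # cost[s]: cheapest path cost ending at sense s of the current term.
    cost: Dict[str, float] = {s: 0.0 for s in senses_per_term[0]}
    back: List[Dict[str, str]] = []  # back-pointers, one dict per later layer

    for layer in senses_per_term[1:]:
        new_cost: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in layer:
            # Best predecessor among the senses of the previous term.
            prev = min(cost, key=lambda p: cost[p] + 1.0 - relatedness(p, cur))
            new_cost[cur] = cost[prev] + 1.0 - relatedness(prev, cur)
            pointers[cur] = prev
        cost = new_cost
        back.append(pointers)

    # Walk the back-pointers from the cheapest final sense to the first term.
    path = [min(cost, key=cost.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical relatedness scores in [0, 1]; a stand-in for EWC.
TOY_SCORES = {
    frozenset(("bank", "interest")): 0.9,
    frozenset(("bank", "curiosity")): 0.1,
    frozenset(("shore", "interest")): 0.1,
    frozenset(("shore", "curiosity")): 0.2,
    frozenset(("interest", "rate")): 0.9,
    frozenset(("curiosity", "rate")): 0.1,
}


def toy_relatedness(a: str, b: str) -> float:
    """Symmetric lookup; unknown pairs count as unrelated."""
    return TOY_SCORES.get(frozenset((a, b)), 0.0)


if __name__ == "__main__":
    candidates = [["bank", "shore"], ["interest", "curiosity"], ["rate"]]
    print(best_translation(candidates, toy_relatedness))
```

Because edges exist only between adjacent layers, a single left-to-right pass with back-pointers is enough to find the shortest path; a general-purpose algorithm such as Dijkstra's would give the same result on this graph.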