Unsupervised group matching with application to cross-lingual topic matching without alignment information

We propose a method for unsupervised group matching, which is the task of finding correspondence between groups across different domains without cross-domain similarity measurements or paired data. For example, the proposed method can find matching of topic categories in different languages without...

Full description

Bibliographic details
Published in: Data mining and knowledge discovery, 2017-03, Vol.31 (2), p.350-370
Main authors: Iwata, Tomoharu; Kanagawa, Motonobu; Hirao, Tsutomu; Fukumizu, Kenji
Format: Article
Language: eng
Online access: Full text
container_end_page 370
container_issue 2
container_start_page 350
container_title Data mining and knowledge discovery
container_volume 31
creator Iwata, Tomoharu
Kanagawa, Motonobu
Hirao, Tsutomu
Fukumizu, Kenji
description We propose a method for unsupervised group matching, which is the task of finding correspondence between groups across different domains without cross-domain similarity measurements or paired data. For example, the proposed method can find matching of topic categories in different languages without alignment information. The proposed method interprets a group as a probability distribution, which enables us to handle uncertainty in a limited amount of data, and to incorporate the high order information on groups. Groups are matched by maximizing the dependence between distributions, in which we use the Hilbert Schmidt independence criterion for measuring the dependence. By using kernel embedding which maps distributions into a reproducing kernel Hilbert space, we can calculate the dependence between distributions without density estimation. In the experiments, we demonstrate the effectiveness of the proposed method using synthetic and real data sets including an application to cross-lingual topic matching.
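The matching idea in the description can be illustrated with a small sketch. It is not the authors' implementation: the function names (`gauss`, `embedding_gram`, `center`, `match_groups`), the toy data, the use of the inner product of empirical kernel mean embeddings as the group-level kernel, and the brute-force search over permutations are all assumptions made for illustration. The record does not say how the paper defines the group-level kernel or optimizes the matching. The sketch only shows the general shape: each group is summarized by its kernel mean embedding, and the assignment that maximizes an empirical Hilbert-Schmidt independence criterion between the two domains is selected, with no cross-domain similarity ever computed.

```python
import itertools
import math


def gauss(a, b, sigma=1.0):
    """Gaussian kernel between two points given as tuples of floats."""
    sq = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))


def embedding_gram(groups, sigma=1.0):
    """Gram matrix of empirical kernel mean embeddings (illustrative choice):
    G[i][j] = average of k(x, x') over x in group i and x' in group j."""
    n = len(groups)
    return [[sum(gauss(a, b, sigma) for a in groups[i] for b in groups[j])
             / (len(groups[i]) * len(groups[j]))
             for j in range(n)] for i in range(n)]


def center(g):
    """Double-centre a square matrix, i.e. H G H with H = I - (1/n) 11^T."""
    n = len(g)
    row = [sum(r) / n for r in g]
    col = [sum(g[i][j] for i in range(n)) / n for j in range(n)]
    tot = sum(row) / n
    return [[g[i][j] - row[i] - col[j] + tot for j in range(n)]
            for i in range(n)]


def match_groups(groups_x, groups_y, sigma=1.0):
    """Return perm such that group i of domain X is matched to group perm[i]
    of domain Y, maximizing empirical HSIC by exhaustive search
    (a stand-in optimizer, feasible only for a handful of groups)."""
    n = len(groups_x)
    kc = center(embedding_gram(groups_x, sigma))
    lc = center(embedding_gram(groups_y, sigma))

    def hsic(perm):
        return sum(kc[i][j] * lc[perm[i]][perm[j]]
                   for i in range(n) for j in range(n)) / n ** 2

    return max(itertools.permutations(range(n)), key=hsic)


# Hypothetical toy data: domain X is 1-D, domain Y is 2-D, so points from the
# two domains cannot be compared directly. Y holds the same three groups in a
# shuffled order.
shape = (-0.1, 0.0, 0.1)
groups_x = [[(c + d,) for d in shape] for c in (0.0, 1.0, 3.0)]
groups_y = [[(7.0, c + d - 50.0) for d in shape] for c in (3.0, 0.0, 1.0)]

match = match_groups(groups_x, groups_y)
print(match)  # (1, 2, 0)
```

Exhaustive search grows factorially with the number of groups, so this form is only useful for checking the objective on tiny inputs.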
doi_str_mv 10.1007/s10618-016-0470-1
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1884108487</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>4313540631</sourcerecordid><originalsourceid>FETCH-LOGICAL-c349t-b5fc77858629a469cdae74d34f6e00ddf7ddbecd72bc9bedf4ebf0dc878ee7013</originalsourceid><addsrcrecordid>eNp1kE9LxDAQxYsouK5-AG8FL16ikzZt0qMs_oMFLy54C2mSdrO0SU1axW9v1npQwdMMM783vHlJco7hCgPQ64ChxAwBLhEQCggfJAtc0BzRonw5jH3OCCoYhuPkJIQdABRZDotkt7FhGrR_M0GrtPVuGtJejHJrbJu-m3GbimHojBSjcTYdXSq9CwF1cT2JLg4GI38L3DSmojOt7bUdU2Mb5_sv9Wly1Igu6LPvukw2d7fPqwe0frp_XN2skcxJNaK6aCSlrGBlVglSVlIJTYnKSVNqAKUaqlStpaJZLataq4bougElGWVaU8D5Mrmc7w7evU46jLw3QequE1a7KXDMGMHACKMRvfiD7tzkbXQXqZIylpWkihSeqa_fvW744E0v_AfHwPfp8zl9HtPn-_T53kQ2a0Jkbav9j8v_ij4BJqWLWA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1867882649</pqid></control><display><type>article</type><title>Unsupervised group matching with application to cross-lingual topic matching without alignment information</title><source>Springer Online Journals</source><creator>Iwata, Tomoharu ; Kanagawa, Motonobu ; Hirao, Tsutomu ; Fukumizu, Kenji</creator><creatorcontrib>Iwata, Tomoharu ; Kanagawa, Motonobu ; Hirao, Tsutomu ; Fukumizu, Kenji</creatorcontrib><description>We propose a method for unsupervised group matching, which is the task of finding correspondence between groups across different domains without cross-domain similarity measurements or paired data. For example, the proposed method can find matching of topic categories in different languages without alignment information. The proposed method interprets a group as a probability distribution, which enables us to handle uncertainty in a limited amount of data, and to incorporate the high order information on groups. 
Groups are matched by maximizing the dependence between distributions, in which we use the Hilbert Schmidt independence criterion for measuring the dependence. By using kernel embedding which maps distributions into a reproducing kernel Hilbert space, we can calculate the dependence between distributions without density estimation. In the experiments, we demonstrate the effectiveness of the proposed method using synthetic and real data sets including an application to cross-lingual topic matching.</description><identifier>ISSN: 1384-5810</identifier><identifier>EISSN: 1573-756X</identifier><identifier>DOI: 10.1007/s10618-016-0470-1</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Alignment ; Artificial Intelligence ; Chemistry and Earth Sciences ; Computer Science ; Correlation analysis ; Correspondence ; Data mining ; Data Mining and Knowledge Discovery ; Hilbert space ; Information Storage and Retrieval ; Kernels ; Matching ; Mathematical analysis ; Methods ; Multilingualism ; Ontology ; Physics ; Probability distribution ; Reproduction ; Similarity ; Statistics for Engineering ; Tasks</subject><ispartof>Data mining and knowledge discovery, 2017-03, Vol.31 (2), p.350-370</ispartof><rights>The Author(s) 2016</rights><rights>Data Mining and Knowledge Discovery is a copyright of Springer, 
2017.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c349t-b5fc77858629a469cdae74d34f6e00ddf7ddbecd72bc9bedf4ebf0dc878ee7013</citedby><cites>FETCH-LOGICAL-c349t-b5fc77858629a469cdae74d34f6e00ddf7ddbecd72bc9bedf4ebf0dc878ee7013</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s10618-016-0470-1$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s10618-016-0470-1$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Iwata, Tomoharu</creatorcontrib><creatorcontrib>Kanagawa, Motonobu</creatorcontrib><creatorcontrib>Hirao, Tsutomu</creatorcontrib><creatorcontrib>Fukumizu, Kenji</creatorcontrib><title>Unsupervised group matching with application to cross-lingual topic matching without alignment information</title><title>Data mining and knowledge discovery</title><addtitle>Data Min Knowl Disc</addtitle><description>We propose a method for unsupervised group matching, which is the task of finding correspondence between groups across different domains without cross-domain similarity measurements or paired data. For example, the proposed method can find matching of topic categories in different languages without alignment information. The proposed method interprets a group as a probability distribution, which enables us to handle uncertainty in a limited amount of data, and to incorporate the high order information on groups. Groups are matched by maximizing the dependence between distributions, in which we use the Hilbert Schmidt independence criterion for measuring the dependence. 
By using kernel embedding which maps distributions into a reproducing kernel Hilbert space, we can calculate the dependence between distributions without density estimation. In the experiments, we demonstrate the effectiveness of the proposed method using synthetic and real data sets including an application to cross-lingual topic matching.</description><subject>Alignment</subject><subject>Artificial Intelligence</subject><subject>Chemistry and Earth Sciences</subject><subject>Computer Science</subject><subject>Correlation analysis</subject><subject>Correspondence</subject><subject>Data mining</subject><subject>Data Mining and Knowledge Discovery</subject><subject>Hilbert space</subject><subject>Information Storage and Retrieval</subject><subject>Kernels</subject><subject>Matching</subject><subject>Mathematical analysis</subject><subject>Methods</subject><subject>Multilingualism</subject><subject>Ontology</subject><subject>Physics</subject><subject>Probability distribution</subject><subject>Reproduction</subject><subject>Similarity</subject><subject>Statistics for Engineering</subject><subject>Tasks</subject><issn>1384-5810</issn><issn>1573-756X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><sourceid>8G5</sourceid><sourceid>BENPR</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNp1kE9LxDAQxYsouK5-AG8FL16ikzZt0qMs_oMFLy54C2mSdrO0SU1axW9v1npQwdMMM783vHlJco7hCgPQ64ChxAwBLhEQCggfJAtc0BzRonw5jH3OCCoYhuPkJIQdABRZDotkt7FhGrR_M0GrtPVuGtJejHJrbJu-m3GbimHojBSjcTYdXSq9CwF1cT2JLg4GI38L3DSmojOt7bUdU2Mb5_sv9Wly1Igu6LPvukw2d7fPqwe0frp_XN2skcxJNaK6aCSlrGBlVglSVlIJTYnKSVNqAKUaqlStpaJZLataq4bougElGWVaU8D5Mrmc7w7evU46jLw3QequE1a7KXDMGMHACKMRvfiD7tzkbXQXqZIylpWkihSeqa_fvW744E0v_AfHwPfp8zl9HtPn-_T53kQ2a0Jkbav9j8v_ij4BJqWLWA</recordid><startdate>20170301</startdate><enddate>20170301</enddate><creator>Iwata, Tomoharu</creator><creator>Kanagawa, Motonobu</creator><creator>Hirao, 
Tsutomu</creator><creator>Fukumizu, Kenji</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7WY</scope><scope>7WZ</scope><scope>7XB</scope><scope>87Z</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>8FL</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FRNLG</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K60</scope><scope>K6~</scope><scope>K7-</scope><scope>L.-</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0C</scope><scope>M0N</scope><scope>M2O</scope><scope>MBDVC</scope><scope>P5Z</scope><scope>P62</scope><scope>PQBIZ</scope><scope>PQBZA</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope></search><sort><creationdate>20170301</creationdate><title>Unsupervised group matching with application to cross-lingual topic matching without alignment information</title><author>Iwata, Tomoharu ; Kanagawa, Motonobu ; Hirao, Tsutomu ; Fukumizu, Kenji</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c349t-b5fc77858629a469cdae74d34f6e00ddf7ddbecd72bc9bedf4ebf0dc878ee7013</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Alignment</topic><topic>Artificial Intelligence</topic><topic>Chemistry and Earth Sciences</topic><topic>Computer Science</topic><topic>Correlation analysis</topic><topic>Correspondence</topic><topic>Data mining</topic><topic>Data Mining and Knowledge Discovery</topic><topic>Hilbert space</topic><topic>Information Storage and 
Retrieval</topic><topic>Kernels</topic><topic>Matching</topic><topic>Mathematical analysis</topic><topic>Methods</topic><topic>Multilingualism</topic><topic>Ontology</topic><topic>Physics</topic><topic>Probability distribution</topic><topic>Reproduction</topic><topic>Similarity</topic><topic>Statistics for Engineering</topic><topic>Tasks</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Iwata, Tomoharu</creatorcontrib><creatorcontrib>Kanagawa, Motonobu</creatorcontrib><creatorcontrib>Hirao, Tsutomu</creatorcontrib><creatorcontrib>Fukumizu, Kenji</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>ABI/INFORM Collection (ProQuest)</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection (Alumni Edition)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Business Premium Collection</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Business Premium Collection 
(Alumni)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Business Collection (Alumni Edition)</collection><collection>ProQuest Business Collection</collection><collection>Computer Science Database</collection><collection>ABI/INFORM Professional Advanced</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ABI/INFORM global</collection><collection>Computing Database</collection><collection>ProQuest research library</collection><collection>Research Library (Corporate)</collection><collection>ProQuest advanced technologies &amp; aerospace journals</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>One Business (ProQuest)</collection><collection>ProQuest One Business (Alumni)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><jtitle>Data mining and knowledge discovery</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Iwata, Tomoharu</au><au>Kanagawa, Motonobu</au><au>Hirao, Tsutomu</au><au>Fukumizu, Kenji</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Unsupervised group matching with application to cross-lingual topic matching without alignment information</atitle><jtitle>Data mining and knowledge 
discovery</jtitle><stitle>Data Min Knowl Disc</stitle><date>2017-03-01</date><risdate>2017</risdate><volume>31</volume><issue>2</issue><spage>350</spage><epage>370</epage><pages>350-370</pages><issn>1384-5810</issn><eissn>1573-756X</eissn><abstract>We propose a method for unsupervised group matching, which is the task of finding correspondence between groups across different domains without cross-domain similarity measurements or paired data. For example, the proposed method can find matching of topic categories in different languages without alignment information. The proposed method interprets a group as a probability distribution, which enables us to handle uncertainty in a limited amount of data, and to incorporate the high order information on groups. Groups are matched by maximizing the dependence between distributions, in which we use the Hilbert Schmidt independence criterion for measuring the dependence. By using kernel embedding which maps distributions into a reproducing kernel Hilbert space, we can calculate the dependence between distributions without density estimation. In the experiments, we demonstrate the effectiveness of the proposed method using synthetic and real data sets including an application to cross-lingual topic matching.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s10618-016-0470-1</doi><tpages>21</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1384-5810
ispartof Data mining and knowledge discovery, 2017-03, Vol.31 (2), p.350-370
issn 1384-5810
1573-756X
language eng
recordid cdi_proquest_miscellaneous_1884108487
source Springer Online Journals
subjects Alignment
Artificial Intelligence
Chemistry and Earth Sciences
Computer Science
Correlation analysis
Correspondence
Data mining
Data Mining and Knowledge Discovery
Hilbert space
Information Storage and Retrieval
Kernels
Matching
Mathematical analysis
Methods
Multilingualism
Ontology
Physics
Probability distribution
Reproduction
Similarity
Statistics for Engineering
Tasks
title Unsupervised group matching with application to cross-lingual topic matching without alignment information
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-12T00%3A24%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Unsupervised%20group%20matching%20with%20application%20to%20cross-lingual%20topic%20matching%20without%20alignment%20information&rft.jtitle=Data%20mining%20and%20knowledge%20discovery&rft.au=Iwata,%20Tomoharu&rft.date=2017-03-01&rft.volume=31&rft.issue=2&rft.spage=350&rft.epage=370&rft.pages=350-370&rft.issn=1384-5810&rft.eissn=1573-756X&rft_id=info:doi/10.1007/s10618-016-0470-1&rft_dat=%3Cproquest_cross%3E4313540631%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1867882649&rft_id=info:pmid/&rfr_iscdi=true