Unsupervised group matching with application to cross-lingual topic matching without alignment information

We propose a method for unsupervised group matching, which is the task of finding correspondence between groups across different domains without cross-domain similarity measurements or paired data. For example, the proposed method can find matching of topic categories in different languages without...

Bibliographic details
Published in:	Data mining and knowledge discovery, 2017-03, Vol. 31 (2), p. 350-370
Main authors:	Iwata, Tomoharu; Kanagawa, Motonobu; Hirao, Tsutomu; Fukumizu, Kenji
Format:	Article
Language:	English
Subjects:	Alignment; Artificial Intelligence; Chemistry and Earth Sciences; Computer Science; Correlation analysis; Correspondence; Data mining; Data Mining and Knowledge Discovery; Hilbert space; Information Storage and Retrieval; Kernels; Matching; Mathematical analysis; Methods; Multilingualism; Ontology; Physics; Probability distribution; Reproduction; Similarity; Statistics for Engineering; Tasks
Online access:	Full text

container_end_page	370
container_issue	2
container_start_page	350
container_title	Data mining and knowledge discovery
container_volume	31
creator	Iwata, Tomoharu; Kanagawa, Motonobu; Hirao, Tsutomu; Fukumizu, Kenji
description	We propose a method for unsupervised group matching, the task of finding correspondences between groups across different domains without cross-domain similarity measurements or paired data. For example, the proposed method can match topic categories in different languages without alignment information. The method interprets each group as a probability distribution, which lets it handle the uncertainty that comes with a limited amount of data and incorporate higher-order information about the groups. Groups are matched by maximizing the dependence between distributions, measured with the Hilbert-Schmidt independence criterion. A kernel embedding, which maps distributions into a reproducing kernel Hilbert space, makes it possible to calculate this dependence without density estimation. Experiments on synthetic and real data sets, including an application to cross-lingual topic matching, demonstrate the effectiveness of the proposed method.
doi_str_mv	10.1007/s10618-016-0470-1
format	Article
publisher	New York: Springer US
rights	The Author(s) 2016
tpages	21
fulltext	fulltext
identifier	ISSN: 1384-5810
ispartof	Data mining and knowledge discovery, 2017-03, Vol.31 (2), p.350-370
issn	1384-5810 1573-756X
language	eng
recordid	cdi_proquest_miscellaneous_1884108487
source	Springer Online Journals
subjects	Alignment; Artificial Intelligence; Chemistry and Earth Sciences; Computer Science; Correlation analysis; Correspondence; Data mining; Data Mining and Knowledge Discovery; Hilbert space; Information Storage and Retrieval; Kernels; Matching; Mathematical analysis; Methods; Multilingualism; Ontology; Physics; Probability distribution; Reproduction; Similarity; Statistics for Engineering; Tasks
title	Unsupervised group matching with application to cross-lingual topic matching without alignment information
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-12T00%3A24%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Unsupervised%20group%20matching%20with%20application%20to%20cross-lingual%20topic%20matching%20without%20alignment%20information&rft.jtitle=Data%20mining%20and%20knowledge%20discovery&rft.au=Iwata,%20Tomoharu&rft.date=2017-03-01&rft.volume=31&rft.issue=2&rft.spage=350&rft.epage=370&rft.pages=350-370&rft.issn=1384-5810&rft.eissn=1573-756X&rft_id=info:doi/10.1007/s10618-016-0470-1&rft_dat=%3Cproquest_cross%3E4313540631%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1867882649&rft_id=info:pmid/&rfr_iscdi=true
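
The matching criterion summarized in the description can be illustrated with a small sketch. It is not the authors' implementation: the Gaussian kernel, the `gamma` value, the function names (`rbf`, `group_gram`, `hsic`, `match_groups`), and the brute-force search over permutations are all assumptions made for illustration. Each group is represented by its kernel mean embedding, the similarity of two groups within one domain is the inner product of their embeddings, and the correspondence is the permutation that maximizes the empirical Hilbert-Schmidt independence criterion between the two domains' group-level Gram matrices.

```python
import itertools
import numpy as np

def rbf(x, y, gamma=1.0):
    # Gaussian kernel matrix between the rows of x and the rows of y
    # (kernel choice and gamma are illustrative assumptions).
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def group_gram(groups, gamma=1.0):
    # K[i, j] is the inner product of the kernel mean embeddings of
    # groups i and j, i.e. the average kernel value over all sample pairs.
    n = len(groups)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = rbf(groups[i], groups[j], gamma).mean()
    return K

def hsic(K, L):
    # Biased empirical HSIC: trace(K H L H) / (n - 1)^2 with centering
    # matrix H. The constant factor does not affect the maximizer.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def match_groups(groups_a, groups_b, gamma=1.0):
    # Exhaustive search (an assumption, feasible only for a few groups):
    # result[i] is the index of the group in domain B matched to group i
    # in domain A.
    K = group_gram(groups_a, gamma)
    L = group_gram(groups_b, gamma)
    best = max(
        itertools.permutations(range(len(groups_b))),
        key=lambda p: hsic(K, L[np.ix_(p, p)]),
    )
    return list(best)
```

Note that only within-domain kernels are evaluated, so the two domains may use unrelated feature spaces, which is the setting the description calls matching without cross-domain similarity measurements. The exhaustive search grows factorially with the number of groups, so this sketch is suitable only for a handful of groups.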