Word Alignment Between Chinese and Japanese Using Maximum Weight Matching on Bipartite Graph

The word-aligned bilingual corpus is an important knowledge source for many NLP tasks, especially machine translation. Among the existing word alignment methods, the unknown word problem, the synonym problem and the global optimization problem are very important factors impacting the recall and...

Detailed description


Bibliographic details
Main authors:	Wu, Honglin ; Liu, Shaoming
Format:	Conference proceeding
Language:	eng
Subjects:	Applied sciences ; Artificial intelligence ; bipartite graph ; Computer science; control theory; systems ; Exact sciences and technology ; matching ; similarity measure ; Speech and sound recognition and synthesis. Linguistics ; word alignment
Online access:	Full text

container_end_page	84
container_issue
container_start_page	75
container_title
container_volume
creator	Wu, Honglin ; Liu, Shaoming
description	The word-aligned bilingual corpus is an important knowledge source for many NLP tasks, especially machine translation. Among the existing word alignment methods, the unknown word problem, the synonym problem and the global optimization problem are very important factors affecting the recall and precision of alignment results. In this paper, we propose a word alignment model between Chinese and Japanese that measures similarity in terms of morphological similarity, semantic distance, part of speech and co-occurrence, and matches words by maximum weight matching on a bipartite graph. The model partly solves the problems mentioned above. Experiments showed the model to be effective: it achieved an F-score of 80%, compared with 72% for GIZA++.
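The matching step named in the description can be illustrated with a small sketch. It is not the authors' implementation: it assumes the four similarity measures have already been combined into a single non-negative score per word pair, and it uses the standard Hungarian (Kuhn-Munkres) algorithm as one common way to compute a maximum weight matching on a bipartite graph. The function name `max_weight_matching`, the `min_weight` threshold and all scores below are hypothetical.

```python
def max_weight_matching(weights, min_weight=0.0):
    """Return sorted (source_index, target_index) pairs of a maximum weight matching.

    weights[i][j] is an assumed, non-negative similarity between source word i
    and target word j. Pairs whose weight is not above min_weight are dropped.
    """
    n = len(weights)
    m = len(weights[0]) if n else 0
    if n == 0 or m == 0:
        return []
    size = max(n, m)
    # Pad with zero-weight dummy words to get a square matrix, and negate the
    # weights so that a minimum-cost assignment is a maximum-weight matching.
    cost = [[-weights[i][j] if i < n and j < m else 0.0 for j in range(size)]
            for i in range(size)]

    inf = float("inf")
    u = [0.0] * (size + 1)   # potentials of rows (1-indexed)
    v = [0.0] * (size + 1)   # potentials of columns (1-indexed)
    p = [0] * (size + 1)     # p[j] = row currently assigned to column j
    way = [0] * (size + 1)   # previous column on the augmenting path

    for i in range(1, size + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (size + 1)
        used = [False] * (size + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, size + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(size + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the assignments along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1

    pairs = []
    for j in range(1, size + 1):
        i = p[j]
        # Keep only pairs of real words with a weight above the threshold.
        if i <= n and j <= m and weights[i - 1][j - 1] > min_weight:
            pairs.append((i - 1, j - 1))
    return sorted(pairs)


if __name__ == "__main__":
    # Invented scores: a greedy aligner would take 0.9 first and end with a
    # total of 1.0, while the global optimum pairs (0, 1) and (1, 0) for 1.6.
    print(max_weight_matching([[0.9, 0.8], [0.8, 0.1]]))
```

The two-by-two case in the usage lines shows why a global matching can differ from picking the best pair greedily, which is the global optimization problem the description refers to.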
doi_str_mv	10.1007/11940098_8
format	Conference Proceeding
fullrecord	<record><control><sourceid>pascalfrancis_sprin</sourceid><recordid>TN_cdi_pascalfrancis_primary_20127831</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>20127831</sourcerecordid><originalsourceid>FETCH-LOGICAL-p1338-28c257dbe4c849341b31c4ea68cd57c24b29e8fa97dff7e5bce9850398e88a4d3</originalsourceid><addsrcrecordid>eNpFULtOwzAUNS-JUrrwBV6QWAJ27MT22FZQQEUsVGVAihznJjE0TmQHAX9PSpF6lqtzz2M4CF1Qck0JETeUKk6Ikpk8QGcs4YSrNJXyEI1oSmnEGFdHe0G8HqMRYSSOlODsFE1CeCcDGB0yyQi9rVtf4OnGVq4B1-MZ9F8ADs9r6yAA1q7Aj7rTf2QVrKvwk_62zWeD12Cruh9ob-rtv3V4Zjvte9sDXnjd1efopNSbAJP_O0aru9uX-X20fF48zKfLqKOMySiWJk5EkQM3kivGac6o4aBTaYpEmJjnsQJZaiWKshSQ5AaUTAhTEqTUvGBjdLnr7XQwelN67YwNWedto_1PFhMaC8no4Lva-cIguQp8lrftR8goybbTZvtp2S8rqGY9</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Word Alignment Between Chinese and Japanese Using Maximum Weight Matching on Bipartite Graph</title><source>Springer Books</source><creator>Wu, Honglin ; Liu, Shaoming</creator><contributor>Matsumoto, Yuji ; Sproat, Richard W. ; Zhang, Min ; Wong, Kam-Fai</contributor><creatorcontrib>Wu, Honglin ; Liu, Shaoming ; Matsumoto, Yuji ; Sproat, Richard W. ; Zhang, Min ; Wong, Kam-Fai</creatorcontrib><description>The word-aligned bilingual corpus is an important knowledge source for many tasks in NLP especially in machine translation. Among the existing word alignment methods, the unknown word problem, the synonym problem and the global optimization problem are very important factors impacting the recall and precision of alignment results. In this paper, we proposed a word alignment model between Chinese and Japanese which measures similarity in terms of morphological similarity, semantic distance, part of speech and co-occurrence, and matches words by maximum weight matching on bipartite graph. 
The model can partly solve the problems mentioned above. The model was proved to be effective by experiments. It achieved 80% as F-Score than 72% of GIZA++.</description><identifier>ISSN: 0302-9743</identifier><identifier>ISBN: 354049667X</identifier><identifier>ISBN: 9783540496670</identifier><identifier>EISSN: 1611-3349</identifier><identifier>EISBN: 3540496688</identifier><identifier>EISBN: 9783540496687</identifier><identifier>DOI: 10.1007/11940098_8</identifier><language>eng</language><publisher>Berlin, Heidelberg: Springer Berlin Heidelberg</publisher><subject>Applied sciences ; Artificial intelligence ; bipartite graph ; Computer science; control theory; systems ; Exact sciences and technology ; matching ; similarity measure ; Speech and sound recognition and synthesis. Linguistics ; word alignment</subject><ispartof>Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead, 2006, p.75-84</ispartof><rights>Springer-Verlag Berlin Heidelberg 2006</rights><rights>2008 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/11940098_8$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/11940098_8$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>309,310,779,780,784,789,790,793,4050,4051,27925,38255,41442,42511</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=20127831$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><contributor>Matsumoto, Yuji</contributor><contributor>Sproat, Richard W.</contributor><contributor>Zhang, Min</contributor><contributor>Wong, Kam-Fai</contributor><creatorcontrib>Wu, Honglin</creatorcontrib><creatorcontrib>Liu, 
Shaoming</creatorcontrib><title>Word Alignment Between Chinese and Japanese Using Maximum Weight Matching on Bipartite Graph</title><title>Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead</title><description>The word-aligned bilingual corpus is an important knowledge source for many tasks in NLP especially in machine translation. Among the existing word alignment methods, the unknown word problem, the synonym problem and the global optimization problem are very important factors impacting the recall and precision of alignment results. In this paper, we proposed a word alignment model between Chinese and Japanese which measures similarity in terms of morphological similarity, semantic distance, part of speech and co-occurrence, and matches words by maximum weight matching on bipartite graph. The model can partly solve the problems mentioned above. The model was proved to be effective by experiments. It achieved 80% as F-Score than 72% of GIZA++.</description><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>bipartite graph</subject><subject>Computer science; control theory; systems</subject><subject>Exact sciences and technology</subject><subject>matching</subject><subject>similarity measure</subject><subject>Speech and sound recognition and synthesis. 
Linguistics</subject><subject>word alignment</subject><issn>0302-9743</issn><issn>1611-3349</issn><isbn>354049667X</isbn><isbn>9783540496670</isbn><isbn>3540496688</isbn><isbn>9783540496687</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2006</creationdate><recordtype>conference_proceeding</recordtype><recordid>eNpFULtOwzAUNS-JUrrwBV6QWAJ27MT22FZQQEUsVGVAihznJjE0TmQHAX9PSpF6lqtzz2M4CF1Qck0JETeUKk6Ikpk8QGcs4YSrNJXyEI1oSmnEGFdHe0G8HqMRYSSOlODsFE1CeCcDGB0yyQi9rVtf4OnGVq4B1-MZ9F8ADs9r6yAA1q7Aj7rTf2QVrKvwk_62zWeD12Cruh9ob-rtv3V4Zjvte9sDXnjd1efopNSbAJP_O0aru9uX-X20fF48zKfLqKOMySiWJk5EkQM3kivGac6o4aBTaYpEmJjnsQJZaiWKshSQ5AaUTAhTEqTUvGBjdLnr7XQwelN67YwNWedto_1PFhMaC8no4Lva-cIguQp8lrftR8goybbTZvtp2S8rqGY9</recordid><startdate>2006</startdate><enddate>2006</enddate><creator>Wu, Honglin</creator><creator>Liu, Shaoming</creator><general>Springer Berlin Heidelberg</general><general>Springer</general><scope>IQODW</scope></search><sort><creationdate>2006</creationdate><title>Word Alignment Between Chinese and Japanese Using Maximum Weight Matching on Bipartite Graph</title><author>Wu, Honglin ; Liu, Shaoming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p1338-28c257dbe4c849341b31c4ea68cd57c24b29e8fa97dff7e5bce9850398e88a4d3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2006</creationdate><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>bipartite graph</topic><topic>Computer science; control theory; systems</topic><topic>Exact sciences and technology</topic><topic>matching</topic><topic>similarity measure</topic><topic>Speech and sound recognition and synthesis. 
Linguistics</topic><topic>word alignment</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wu, Honglin</creatorcontrib><creatorcontrib>Liu, Shaoming</creatorcontrib><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wu, Honglin</au><au>Liu, Shaoming</au><au>Matsumoto, Yuji</au><au>Sproat, Richard W.</au><au>Zhang, Min</au><au>Wong, Kam-Fai</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Word Alignment Between Chinese and Japanese Using Maximum Weight Matching on Bipartite Graph</atitle><btitle>Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead</btitle><date>2006</date><risdate>2006</risdate><spage>75</spage><epage>84</epage><pages>75-84</pages><issn>0302-9743</issn><eissn>1611-3349</eissn><isbn>354049667X</isbn><isbn>9783540496670</isbn><eisbn>3540496688</eisbn><eisbn>9783540496687</eisbn><abstract>The word-aligned bilingual corpus is an important knowledge source for many tasks in NLP especially in machine translation. Among the existing word alignment methods, the unknown word problem, the synonym problem and the global optimization problem are very important factors impacting the recall and precision of alignment results. In this paper, we proposed a word alignment model between Chinese and Japanese which measures similarity in terms of morphological similarity, semantic distance, part of speech and co-occurrence, and matches words by maximum weight matching on bipartite graph. The model can partly solve the problems mentioned above. The model was proved to be effective by experiments. It achieved 80% as F-Score than 72% of GIZA++.</abstract><cop>Berlin, Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/11940098_8</doi><tpages>10</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0302-9743
ispartof	Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead, 2006, p.75-84
issn	0302-9743 1611-3349
language	eng
recordid	cdi_pascalfrancis_primary_20127831
source	Springer Books
subjects	Applied sciences ; Artificial intelligence ; bipartite graph ; Computer science; control theory; systems ; Exact sciences and technology ; matching ; similarity measure ; Speech and sound recognition and synthesis. Linguistics ; word alignment
title	Word Alignment Between Chinese and Japanese Using Maximum Weight Matching on Bipartite Graph
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T12%3A23%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_sprin&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Word%20Alignment%20Between%20Chinese%20and%20Japanese%20Using%20Maximum%20Weight%20Matching%20on%20Bipartite%20Graph&rft.btitle=Computer%20Processing%20of%20Oriental%20Languages.%20Beyond%20the%20Orient:%20The%20Research%20Challenges%20Ahead&rft.au=Wu,%20Honglin&rft.date=2006&rft.spage=75&rft.epage=84&rft.pages=75-84&rft.issn=0302-9743&rft.eissn=1611-3349&rft.isbn=354049667X&rft.isbn_list=9783540496670&rft_id=info:doi/10.1007/11940098_8&rft_dat=%3Cpascalfrancis_sprin%3E20127831%3C/pascalfrancis_sprin%3E%3Curl%3E%3C/url%3E&rft.eisbn=3540496688&rft.eisbn_list=9783540496687&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true