Comparison of L2 Korean pronunciation error patterns from five L1 backgrounds by using automatic phonetic transcription

This paper presents a large-scale analysis of L2 Korean pronunciation error patterns from five different language backgrounds, Chinese, Vietnamese, Japanese, Thai, and English, by using automatic phonetic transcription. For the analysis, confusion matrices are generated for each L1, by aligning cano...

Detailed description

Bibliographic details
Main authors: Yeo, Eun Jung; Ryu, Hyungshin; Lee, Jooyoung; Kim, Sunhee; Chung, Minhwa
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Yeo, Eun Jung
Ryu, Hyungshin
Lee, Jooyoung
Kim, Sunhee
Chung, Minhwa
description This paper presents a large-scale analysis of L2 Korean pronunciation error patterns from five different language backgrounds (Chinese, Vietnamese, Japanese, Thai, and English) by using automatic phonetic transcription. For the analysis, confusion matrices are generated for each L1 by aligning canonical phone sequences with automatically transcribed phone sequences obtained from a fine-tuned Wav2Vec2 XLS-R phone recognizer. The values in the confusion matrices are compared across L1s to capture frequent common error patterns and to identify patterns unique to a particular language background. Using the Foreign Speakers' Voice Data of Korean for Artificial Intelligence Learning dataset, common error pattern types are found to be (1) substitutions of aspirated or tense consonants with plain consonants, (2) deletions of syllable-final consonants, and (3) substitutions of diphthongs with monophthongs. On the other hand, thirty-nine patterns, including (1) syllable-final /l/ substitutions with /n/ for Vietnamese and (2) /ɯ/ insertions for Japanese, are discovered as language-dependent.
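The alignment step mentioned in the description can be illustrated with a minimal sketch. It assumes a plain unit-cost edit-distance alignment; the record does not state which alignment procedure the authors used, and the names `GAP`, `align`, and `confusion_counts`, as well as the toy phone symbols in the usage note, are hypothetical.

```python
from collections import Counter

GAP = "<eps>"  # hypothetical placeholder marking an insertion or a deletion


def align(canonical, transcribed):
    """Align two phone sequences with unit-cost edit distance (an assumption).

    Returns (canonical_phone, transcribed_phone) pairs. GAP on the right
    marks a deletion; GAP on the left marks an insertion.
    """
    n, m = len(canonical), len(transcribed)
    # cost[i][j] = edit distance between canonical[:i] and transcribed[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (canonical[i - 1] != transcribed[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and cost[i][j]
                == cost[i - 1][j - 1] + (canonical[i - 1] != transcribed[j - 1])):
            pairs.append((canonical[i - 1], transcribed[j - 1]))  # match or substitution
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((canonical[i - 1], GAP))  # deletion
            i -= 1
        else:
            pairs.append((GAP, transcribed[j - 1]))  # insertion
            j -= 1
    pairs.reverse()
    return pairs


def confusion_counts(utterances):
    """Accumulate aligned phone pairs over (canonical, transcribed) utterances."""
    counts = Counter()
    for canonical, transcribed in utterances:
        counts.update(align(canonical, transcribed))
    return counts
```

For example, `confusion_counts([(["k", "a", "l"], ["k", "a", "n"])])` counts one `("l", "n")` substitution alongside the two matches. Building one such counter per L1 group and normalizing each row would give per-L1 confusion matrices that can then be compared, which is the kind of comparison the description refers to.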
doi_str_mv 10.48550/arxiv.2306.10821
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2306_10821</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2306_10821</sourcerecordid><originalsourceid>FETCH-LOGICAL-a671-cbca46b48ffd94a8a23d26fd6ed8e6095634dfc11f5f395568334fd71b62932d3</originalsourceid><addsrcrecordid>eNotkMtOwzAURL1hgQofwIr7AwmxHbvJEkW8RKRuuo9u_CgWxI5ukkL_nqawmpFGc6QZxu54kZeVUsUD0k845kIWOudFJfg1-27SMCKFKUVIHloB74kcRhgpxSWagHM4R44oEYw4z47iBJ7SAD4cHbQcejSfB0pLtBP0J1imEA-Ay5yGc9fA-JGiW81MGCdDYVyJN-zK49fkbv91w_bPT_vmNWt3L2_NY5uh3vLM9AZL3ZeV97YusUIhrdDeamcrp4taaVlabzj3ystaKV1JWXq75b0WtRRWbtj9H_ayvBspDEinbn2guzwgfwFh3lkl</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Comparison of L2 Korean pronunciation error patterns from five L1 backgrounds by using automatic phonetic transcription</title><source>arXiv.org</source><creator>Yeo, Eun Jung ; Ryu, Hyungshin ; Lee, Jooyoung ; Kim, Sunhee ; Chung, Minhwa</creator><creatorcontrib>Yeo, Eun Jung ; Ryu, Hyungshin ; Lee, Jooyoung ; Kim, Sunhee ; Chung, Minhwa</creatorcontrib><description>This paper presents a large-scale analysis of L2 Korean pronunciation error patterns from five different language backgrounds, Chinese, Vietnamese, Japanese, Thai, and English, by using automatic phonetic transcription. For the analysis, confusion matrices are generated for each L1, by aligning canonical phone sequences and automatically transcribed phone sequences obtained from fine-tuned Wav2Vec2 XLS-R phone recognizer. Each value in the confusion matrices is compared to capture frequent common error patterns and to specify patterns unique to a certain language background. 
Using the Foreign Speakers' Voice Data of Korean for Artificial Intelligence Learning dataset, common error pattern types are found to be (1) substitutions of aspirated or tense consonants with plain consonants, (2) deletions of syllable-final consonants, and (3) substitutions of diphthongs with monophthongs. On the other hand, thirty-nine patterns including (1) syllable-final /l/ substitutions with /n/ for Vietnamese and (2) /\textturnm/ insertions for Japanese are discovered as language-dependent.</description><identifier>DOI: 10.48550/arxiv.2306.10821</identifier><language>eng</language><subject>Computer Science - Computation and Language ; Computer Science - Sound</subject><creationdate>2023-06</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2306.10821$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2306.10821$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Yeo, Eun Jung</creatorcontrib><creatorcontrib>Ryu, Hyungshin</creatorcontrib><creatorcontrib>Lee, Jooyoung</creatorcontrib><creatorcontrib>Kim, Sunhee</creatorcontrib><creatorcontrib>Chung, Minhwa</creatorcontrib><title>Comparison of L2 Korean pronunciation error patterns from five L1 backgrounds by using automatic phonetic transcription</title><description>This paper presents a large-scale analysis of L2 Korean pronunciation error patterns from five different language backgrounds, Chinese, Vietnamese, Japanese, Thai, and English, by using automatic phonetic transcription. 
For the analysis, confusion matrices are generated for each L1, by aligning canonical phone sequences and automatically transcribed phone sequences obtained from fine-tuned Wav2Vec2 XLS-R phone recognizer. Each value in the confusion matrices is compared to capture frequent common error patterns and to specify patterns unique to a certain language background. Using the Foreign Speakers' Voice Data of Korean for Artificial Intelligence Learning dataset, common error pattern types are found to be (1) substitutions of aspirated or tense consonants with plain consonants, (2) deletions of syllable-final consonants, and (3) substitutions of diphthongs with monophthongs. On the other hand, thirty-nine patterns including (1) syllable-final /l/ substitutions with /n/ for Vietnamese and (2) /\textturnm/ insertions for Japanese are discovered as language-dependent.</description><subject>Computer Science - Computation and Language</subject><subject>Computer Science - Sound</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotkMtOwzAURL1hgQofwIr7AwmxHbvJEkW8RKRuuo9u_CgWxI5ukkL_nqawmpFGc6QZxu54kZeVUsUD0k845kIWOudFJfg1-27SMCKFKUVIHloB74kcRhgpxSWagHM4R44oEYw4z47iBJ7SAD4cHbQcejSfB0pLtBP0J1imEA-Ay5yGc9fA-JGiW81MGCdDYVyJN-zK49fkbv91w_bPT_vmNWt3L2_NY5uh3vLM9AZL3ZeV97YusUIhrdDeamcrp4taaVlabzj3ystaKV1JWXq75b0WtRRWbtj9H_ayvBspDEinbn2guzwgfwFh3lkl</recordid><startdate>20230619</startdate><enddate>20230619</enddate><creator>Yeo, Eun Jung</creator><creator>Ryu, Hyungshin</creator><creator>Lee, Jooyoung</creator><creator>Kim, Sunhee</creator><creator>Chung, Minhwa</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230619</creationdate><title>Comparison of L2 Korean pronunciation error patterns from five L1 backgrounds by using automatic phonetic transcription</title><author>Yeo, Eun Jung ; Ryu, Hyungshin ; Lee, Jooyoung ; Kim, Sunhee ; Chung, 
Minhwa</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a671-cbca46b48ffd94a8a23d26fd6ed8e6095634dfc11f5f395568334fd71b62932d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Computation and Language</topic><topic>Computer Science - Sound</topic><toplevel>online_resources</toplevel><creatorcontrib>Yeo, Eun Jung</creatorcontrib><creatorcontrib>Ryu, Hyungshin</creatorcontrib><creatorcontrib>Lee, Jooyoung</creatorcontrib><creatorcontrib>Kim, Sunhee</creatorcontrib><creatorcontrib>Chung, Minhwa</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yeo, Eun Jung</au><au>Ryu, Hyungshin</au><au>Lee, Jooyoung</au><au>Kim, Sunhee</au><au>Chung, Minhwa</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Comparison of L2 Korean pronunciation error patterns from five L1 backgrounds by using automatic phonetic transcription</atitle><date>2023-06-19</date><risdate>2023</risdate><abstract>This paper presents a large-scale analysis of L2 Korean pronunciation error patterns from five different language backgrounds, Chinese, Vietnamese, Japanese, Thai, and English, by using automatic phonetic transcription. For the analysis, confusion matrices are generated for each L1, by aligning canonical phone sequences and automatically transcribed phone sequences obtained from fine-tuned Wav2Vec2 XLS-R phone recognizer. Each value in the confusion matrices is compared to capture frequent common error patterns and to specify patterns unique to a certain language background. 
Using the Foreign Speakers' Voice Data of Korean for Artificial Intelligence Learning dataset, common error pattern types are found to be (1) substitutions of aspirated or tense consonants with plain consonants, (2) deletions of syllable-final consonants, and (3) substitutions of diphthongs with monophthongs. On the other hand, thirty-nine patterns including (1) syllable-final /l/ substitutions with /n/ for Vietnamese and (2) /\textturnm/ insertions for Japanese are discovered as language-dependent.</abstract><doi>10.48550/arxiv.2306.10821</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2306.10821
ispartof
issn
language eng
recordid cdi_arxiv_primary_2306_10821
source arXiv.org
subjects Computer Science - Computation and Language
Computer Science - Sound
title Comparison of L2 Korean pronunciation error patterns from five L1 backgrounds by using automatic phonetic transcription
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-12T15%3A16%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Comparison%20of%20L2%20Korean%20pronunciation%20error%20patterns%20from%20five%20L1%20backgrounds%20by%20using%20automatic%20phonetic%20transcription&rft.au=Yeo,%20Eun%20Jung&rft.date=2023-06-19&rft_id=info:doi/10.48550/arxiv.2306.10821&rft_dat=%3Carxiv_GOX%3E2306_10821%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true