Towards automatic assessment of spontaneous spoken English

With increasing global demand for learning English as a second language, there has been considerable interest in methods of automatic assessment of spoken language proficiency for use in interactive electronic learning tools as well as for grading candidates for formal qualifications. This paper pre...

Detailed description

Bibliographic details
Published in:	Speech communication 2018-11, Vol.104, p.47-56
Main authors:	Wang, Y., Gales, M.J.F., Knill, K.M., Kyriakopoulos, K., Malinin, A., van Dalen, R.C., Rashid, M.
Format:	Article
Language:	eng
Subjects:	Automatic assessment of spoken English; Deep learning; Distance learning; English as a second language; English as an international language; English language; Evaluation; Gaussian process; Interpolation; Language proficiency; Machine learning; Online instruction; Pronunciation; Recording; Rejection scheme; Speech; Speech recognition; Spoken language; Spontaneous speech
Online access:	Full text

container_end_page	56
container_issue
container_start_page	47
container_title	Speech communication
container_volume	104
creator	Wang, Y. ; Gales, M.J.F. ; Knill, K.M. ; Kyriakopoulos, K. ; Malinin, A. ; van Dalen, R.C. ; Rashid, M.
description	With increasing global demand for learning English as a second language, there has been considerable interest in methods of automatic assessment of spoken language proficiency for use in interactive electronic learning tools as well as for grading candidates for formal qualifications. This paper presents an automatic system to address the assessment of spontaneous spoken language. Prompts or questions requiring spontaneous speech responses elicit more natural speech which better reflects a learner’s proficiency level than read speech. In addition to the challenges of highly variable non-native, learner, speech and noisy real-world recording conditions, this requires any automatic system to handle disfluent, non-grammatical, spontaneous speech with the underlying text unknown. To handle these, a strong deep learning based speech recognition system is applied in combination with a Gaussian Process (GP) grader. A range of features derived from the audio using the recognition hypothesis are investigated for their efficacy in the automatic grader. The proposed system is shown to predict grades at a similar level to the original examiner graders on real candidate entries. Interpolation with the examiner grades further boosts performance. The ability to reject poorly estimated grades is also important and measures are proposed to evaluate the performance of rejection schemes. The GP variance is used to decide which automatic grades should be rejected. Back-off to an expert grader for the least confident grades gives gains.
doi_str_mv	10.1016/j.specom.2018.09.002
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2154224951</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0167639317304545</els_id><sourcerecordid>2154224951</sourcerecordid><originalsourceid>FETCH-LOGICAL-c446t-ca4fd406dc0167e945ac387785d5ca4b5246219aefda364d074002fa849c96d63</originalsourceid><addsrcrecordid>eNp9UEtLxDAQDqLguvoPPBQ8tyZpmiYeBFnWByx4Wc8hJqm2bpuaSRX_vSn17GkG5nvNh9AlwQXBhF93BYzO-L6gmIgCywJjeoRWRNQ0r4mgx2iVYHXOS1meojOADmPMhKArdLP33zpYyPQUfa9jazIN4AB6N8TMNxmMfoh6cH6Cef9wQ7Yd3g4tvJ-jk0YfwF38zTV6ud_uN4_57vnhaXO3yw1jPOZGs8YyzK2ZMzjJKm1KUdeislW6vVaUcUqkdo3VJWcW1yzFb7Rg0khueblGV4vuGPzn5CCqzk9hSJaKkopRymRFEootKBM8QHCNGkPb6_CjCFZzS6pTS0tqbklhqZJNot0uNJc--GpdUGBaNxhn2-BMVNa3_wv8Ain8cis</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2154224951</pqid></control><display><type>article</type><title>Towards automatic assessment of spontaneous spoken English</title><source>Access via ScienceDirect (Elsevier)</source><creator>Wang, Y. ; Gales, M.J.F. ; Knill, K.M. ; Kyriakopoulos, K. ; Malinin, A. ; van Dalen, R.C. ; Rashid, M.</creator><creatorcontrib>Wang, Y. ; Gales, M.J.F. ; Knill, K.M. ; Kyriakopoulos, K. ; Malinin, A. ; van Dalen, R.C. ; Rashid, M.</creatorcontrib><description>With increasing global demand for learning English as a second language, there has been considerable interest in methods of automatic assessment of spoken language proficiency for use in interactive electronic learning tools as well as for grading candidates for formal qualifications. This paper presents an automatic system to address the assessment of spontaneous spoken language. Prompts or questions requiring spontaneous speech responses elicit more natural speech which better reflects a learner’s proficiency level than read speech. 
In addition to the challenges of highly variable non-native, learner, speech and noisy real-world recording conditions, this requires any automatic system to handle disfluent, non-grammatical, spontaneous speech with the underlying text unknown. To handle these, a strong deep learning based speech recognition system is applied in combination with a Gaussian Process (GP) grader. A range of features derived from the audio using the recognition hypothesis are investigated for their efficacy in the automatic grader. The proposed system is shown to predict grades at a similar level to the original examiner graders on real candidate entries. Interpolation with the examiner grades further boosts performance. The ability to reject poorly estimated grades is also important and measures are proposed to evaluate the performance of rejection schemes. The GP variance is used to decide which automatic grades should be rejected. Back-off to an expert grader for the least confident grades gives gains.</description><identifier>ISSN: 0167-6393</identifier><identifier>EISSN: 1872-7182</identifier><identifier>DOI: 10.1016/j.specom.2018.09.002</identifier><language>eng</language><publisher>Amsterdam: Elsevier B.V</publisher><subject>Automatic assessment of spoken english ; Deep learning ; Distance learning ; English as a second language ; English as an international language ; English language ; Evaluation ; Gaussian process ; Interpolation ; Language proficiency ; Machine learning ; Online instruction ; Pronunciation ; Recording ; Rejection scheme ; Speech ; Speech recognition ; Spoken language ; Spontaneous speech</subject><ispartof>Speech communication, 2018-11, Vol.104, p.47-56</ispartof><rights>2018 Elsevier B.V.</rights><rights>Copyright Elsevier Science Ltd. 
Nov 2018</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c446t-ca4fd406dc0167e945ac387785d5ca4b5246219aefda364d074002fa849c96d63</citedby><cites>FETCH-LOGICAL-c446t-ca4fd406dc0167e945ac387785d5ca4b5246219aefda364d074002fa849c96d63</cites><orcidid>0000-0001-9500-081X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.specom.2018.09.002$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids></links><search><creatorcontrib>Wang, Y.</creatorcontrib><creatorcontrib>Gales, M.J.F.</creatorcontrib><creatorcontrib>Knill, K.M.</creatorcontrib><creatorcontrib>Kyriakopoulos, K.</creatorcontrib><creatorcontrib>Malinin, A.</creatorcontrib><creatorcontrib>van Dalen, R.C.</creatorcontrib><creatorcontrib>Rashid, M.</creatorcontrib><title>Towards automatic assessment of spontaneous spoken English</title><title>Speech communication</title><description>With increasing global demand for learning English as a second language, there has been considerable interest in methods of automatic assessment of spoken language proficiency for use in interactive electronic learning tools as well as for grading candidates for formal qualifications. This paper presents an automatic system to address the assessment of spontaneous spoken language. Prompts or questions requiring spontaneous speech responses elicit more natural speech which better reflects a learner’s proficiency level than read speech. In addition to the challenges of highly variable non-native, learner, speech and noisy real-world recording conditions, this requires any automatic system to handle disfluent, non-grammatical, spontaneous speech with the underlying text unknown. 
To handle these, a strong deep learning based speech recognition system is applied in combination with a Gaussian Process (GP) grader. A range of features derived from the audio using the recognition hypothesis are investigated for their efficacy in the automatic grader. The proposed system is shown to predict grades at a similar level to the original examiner graders on real candidate entries. Interpolation with the examiner grades further boosts performance. The ability to reject poorly estimated grades is also important and measures are proposed to evaluate the performance of rejection schemes. The GP variance is used to decide which automatic grades should be rejected. Back-off to an expert grader for the least confident grades gives gains.</description><subject>Automatic assessment of spoken english</subject><subject>Deep learning</subject><subject>Distance learning</subject><subject>English as a second language</subject><subject>English as an international language</subject><subject>English language</subject><subject>Evaluation</subject><subject>Gaussian process</subject><subject>Interpolation</subject><subject>Language proficiency</subject><subject>Machine learning</subject><subject>Online instruction</subject><subject>Pronunciation</subject><subject>Recording</subject><subject>Rejection scheme</subject><subject>Speech</subject><subject>Speech recognition</subject><subject>Spoken language</subject><subject>Spontaneous 
speech</subject><issn>0167-6393</issn><issn>1872-7182</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNp9UEtLxDAQDqLguvoPPBQ8tyZpmiYeBFnWByx4Wc8hJqm2bpuaSRX_vSn17GkG5nvNh9AlwQXBhF93BYzO-L6gmIgCywJjeoRWRNQ0r4mgx2iVYHXOS1meojOADmPMhKArdLP33zpYyPQUfa9jazIN4AB6N8TMNxmMfoh6cH6Cef9wQ7Yd3g4tvJ-jk0YfwF38zTV6ud_uN4_57vnhaXO3yw1jPOZGs8YyzK2ZMzjJKm1KUdeislW6vVaUcUqkdo3VJWcW1yzFb7Rg0khueblGV4vuGPzn5CCqzk9hSJaKkopRymRFEootKBM8QHCNGkPb6_CjCFZzS6pTS0tqbklhqZJNot0uNJc--GpdUGBaNxhn2-BMVNa3_wv8Ain8cis</recordid><startdate>201811</startdate><enddate>201811</enddate><creator>Wang, Y.</creator><creator>Gales, M.J.F.</creator><creator>Knill, K.M.</creator><creator>Kyriakopoulos, K.</creator><creator>Malinin, A.</creator><creator>van Dalen, R.C.</creator><creator>Rashid, M.</creator><general>Elsevier B.V</general><general>Elsevier Science Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7T9</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0001-9500-081X</orcidid></search><sort><creationdate>201811</creationdate><title>Towards automatic assessment of spontaneous spoken English</title><author>Wang, Y. ; Gales, M.J.F. ; Knill, K.M. ; Kyriakopoulos, K. ; Malinin, A. ; van Dalen, R.C. 
; Rashid, M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c446t-ca4fd406dc0167e945ac387785d5ca4b5246219aefda364d074002fa849c96d63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Automatic assessment of spoken english</topic><topic>Deep learning</topic><topic>Distance learning</topic><topic>English as a second language</topic><topic>English as an international language</topic><topic>English language</topic><topic>Evaluation</topic><topic>Gaussian process</topic><topic>Interpolation</topic><topic>Language proficiency</topic><topic>Machine learning</topic><topic>Online instruction</topic><topic>Pronunciation</topic><topic>Recording</topic><topic>Rejection scheme</topic><topic>Speech</topic><topic>Speech recognition</topic><topic>Spoken language</topic><topic>Spontaneous speech</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Y.</creatorcontrib><creatorcontrib>Gales, M.J.F.</creatorcontrib><creatorcontrib>Knill, K.M.</creatorcontrib><creatorcontrib>Kyriakopoulos, K.</creatorcontrib><creatorcontrib>Malinin, A.</creatorcontrib><creatorcontrib>van Dalen, R.C.</creatorcontrib><creatorcontrib>Rashid, M.</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Linguistics and Language Behavior Abstracts (LLBA)</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Speech communication</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wang, Y.</au><au>Gales, M.J.F.</au><au>Knill, K.M.</au><au>Kyriakopoulos, K.</au><au>Malinin, A.</au><au>van Dalen, R.C.</au><au>Rashid, M.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Towards automatic assessment of spontaneous spoken English</atitle><jtitle>Speech communication</jtitle><date>2018-11</date><risdate>2018</risdate><volume>104</volume><spage>47</spage><epage>56</epage><pages>47-56</pages><issn>0167-6393</issn><eissn>1872-7182</eissn><abstract>With increasing global demand for learning English as a second language, there has been considerable interest in methods of automatic assessment of spoken language proficiency for use in interactive electronic learning tools as well as for grading candidates for formal qualifications. This paper presents an automatic system to address the assessment of spontaneous spoken language. Prompts or questions requiring spontaneous speech responses elicit more natural speech which better reflects a learner’s proficiency level than read speech. In addition to the challenges of highly variable non-native, learner, speech and noisy real-world recording conditions, this requires any automatic system to handle disfluent, non-grammatical, spontaneous speech with the underlying text unknown. To handle these, a strong deep learning based speech recognition system is applied in combination with a Gaussian Process (GP) grader. A range of features derived from the audio using the recognition hypothesis are investigated for their efficacy in the automatic grader. The proposed system is shown to predict grades at a similar level to the original examiner graders on real candidate entries. Interpolation with the examiner grades further boosts performance. The ability to reject poorly estimated grades is also important and measures are proposed to evaluate the performance of rejection schemes. 
The GP variance is used to decide which automatic grades should be rejected. Back-off to an expert grader for the least confident grades gives gains.</abstract><cop>Amsterdam</cop><pub>Elsevier B.V</pub><doi>10.1016/j.specom.2018.09.002</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0001-9500-081X</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0167-6393
ispartof	Speech communication, 2018-11, Vol.104, p.47-56
issn	0167-6393 1872-7182
language	eng
recordid	cdi_proquest_journals_2154224951
source	Access via ScienceDirect (Elsevier)
subjects	Automatic assessment of spoken English ; Deep learning ; Distance learning ; English as a second language ; English as an international language ; English language ; Evaluation ; Gaussian process ; Interpolation ; Language proficiency ; Machine learning ; Online instruction ; Pronunciation ; Recording ; Rejection scheme ; Speech ; Speech recognition ; Spoken language ; Spontaneous speech
title	Towards automatic assessment of spontaneous spoken English
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T16%3A53%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Towards%20automatic%20assessment%20of%20spontaneous%20spoken%20English&rft.jtitle=Speech%20communication&rft.au=Wang,%20Y.&rft.date=2018-11&rft.volume=104&rft.spage=47&rft.epage=56&rft.pages=47-56&rft.issn=0167-6393&rft.eissn=1872-7182&rft_id=info:doi/10.1016/j.specom.2018.09.002&rft_dat=%3Cproquest_cross%3E2154224951%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2154224951&rft_id=info:pmid/&rft_els_id=S0167639317304545&rfr_iscdi=true