Investigating the Application of Automated Writing Evaluation to Chinese Undergraduate English Majors: A Case Study of WriteToLearn

This study investigated the application of WriteToLearn to Chinese undergraduate English majors' essays in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university in Sichuan province who wrote 326 essays in response to two writing prompts. Each paper was marked by four human raters as well as by WriteToLearn. Many-facet Rasch measurement (MFRM) was conducted to calibrate WriteToLearn's rating performance in scoring the whole set of essays against that of the four trained human raters. In addition, the accuracy of WriteToLearn's feedback on 60 randomly selected essays was compared with the feedback provided by human raters. The two main findings related to scoring were that: (1) WriteToLearn was more consistent but considerably more stringent than the four trained human raters in scoring essays; and (2) WriteToLearn failed to score seven essays. In terms of error feedback, WriteToLearn had an overall precision of 49% and recall of 18.7%. These figures did not meet the minimum threshold of 90% precision (set by Burstein, Chodorow, and Leacock, 2003) for it to be considered a reliable error-detecting tool. Furthermore, it had difficulty identifying errors made by Chinese undergraduate English majors in the use of articles, prepositions, word choice, and expression.
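
The precision and recall figures reported in the abstract follow the standard error-detection definitions: precision is the share of system-flagged errors that human raters confirmed, and recall is the share of human-identified errors that the system flagged. The sketch below only illustrates that arithmetic; the function and the counts are hypothetical, chosen so the output reproduces the reported 49% precision and 18.7% recall, and are not taken from the study's data.

# Minimal sketch (Python): precision and recall for error feedback,
# comparing system-flagged errors against human annotations.
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for an error-detection comparison."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Hypothetical counts: the system flags 100 errors, 49 of which raters
# confirm; raters mark 262 errors in total, so 213 are missed.
p, r = precision_recall(true_positives=49, false_positives=51, false_negatives=213)
print(f"precision = {p:.1%}, recall = {r:.1%}")  # precision = 49.0%, recall = 18.7%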

Bibliographic details

Published in: CALICO Journal, 2016-01, Vol. 33 (1), p. 71-91
Main authors: Liu, Sha; Kunnan, Antony John
Format: Article
Language: English
DOI: 10.1558/cj.v33i1.26380
ISSN: 0742-7778
EISSN: 2056-9017
Publisher: Equinox Publishing Ltd
Online access: Full text
Subjects:
Accuracy
Bursting
Calibration
Case studies
Classroom Environment
College English
College students
Comparative Analysis
Computer Assisted Instruction
Computer assisted language learning
Computer Assisted Testing
Computer generated language analysis
Construct Validity
Correlation
Educational Testing
English (Second Language)
English as a second language instruction
English language
English Language Learners
Error detection
Error feedback
Error of Measurement
Errors
Essays
Evaluators
Feedback
Feedback (Response)
Foreign Countries
Human
Human performance
Instructional Systems
Item Response Theory
Majors (Students)
Native Speakers
Rating Scales
Recall
Reliability
Researchers
Scoring
Second Language Learning
Second language writing
Second language writing instruction
Undergraduate Students
Validity
Writing
Writing Evaluation