Investigating the Application of Automated Writing Evaluation to Chinese Undergraduate English Majors: A Case Study of WriteToLearn
This study investigated the application of WriteToLearn on Chinese undergraduate English majors' essays in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was marked by four human raters as well as WriteToLearn. Many-facet Rasch measurement (MFRM) was conducted to calibrate WriteToLearn's rating performance in scoring the whole set of essays against those of four trained human raters. In addition, the accuracy of WriteToLearn's feedback on 60 randomly selected essays was compared with the feedback provided by human raters. The two main findings related to scoring were that: (1) WriteToLearn was more consistent but highly stringent when compared to the four trained human raters in scoring essays; and (2) WriteToLearn failed to score seven essays. In terms of error feedback, WriteToLearn had an overall precision and recall of 49% and 18.7% respectively. These figures did not meet the minimum threshold of 90% precision (set by Burstein, Chodorow, and Leacock, 2003) for it to be considered a reliable error detecting tool. Furthermore, it had difficulty in identifying errors made by Chinese undergraduate English majors in the use of articles, prepositions, word choice and expression.
Saved in:

| Published in: | CALICO journal, 2016-01, Vol.33 (1), p.71-91 |
|---|---|
| Main authors: | Liu, Sha; Kunnan, Antony John |
| Format: | Article |
| Language: | eng |
| Subjects: | |
| Online access: | Full text |
| container_end_page | 91 |
|---|---|
| container_issue | 1 |
| container_start_page | 71 |
| container_title | CALICO journal |
| container_volume | 33 |
| creator | Liu, Sha; Kunnan, Antony John |
| description | This study investigated the application of WriteToLearn on Chinese undergraduate English majors' essays in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was marked by four human raters as well as WriteToLearn. Many-facet Rasch measurement (MFRM) was conducted to calibrate WriteToLearn's rating performance in scoring the whole set of essays against those of four trained human raters. In addition, the accuracy of WriteToLearn's feedback on 60 randomly selected essays was compared with the feedback provided by human raters. The two main findings related to scoring were that: (1) WriteToLearn was more consistent but highly stringent when compared to the four trained human raters in scoring essays; and (2) WriteToLearn failed to score seven essays. In terms of error feedback, WriteToLearn had an overall precision and recall of 49% and 18.7% respectively. These figures did not meet the minimum threshold of 90% precision (set by Burstein, Chodorow, and Leacock, 2003) for it to be considered a reliable error detecting tool. Furthermore, it had difficulty in identifying errors made by Chinese undergraduate English majors in the use of articles, prepositions, word choice and expression. |
| doi_str_mv | 10.1558/cj.v33i1.26380 |
| format | Article |
| fulltext | fulltext |
| identifier | ISSN: 0742-7778 |
| ispartof | CALICO journal, 2016-01, Vol.33 (1), p.71-91 |
| issn | 0742-7778; 2056-9017 |
| language | eng |
| recordid | cdi_eric_primary_EJ1143726 |
| source | ERIC - Full Text Only (Discovery); JSTOR Archive Collection A-Z Listing; Equinox Journals; EZB-FREE-00999 freely available EZB journals; EBSCOhost Education Source |
| subjects | Accuracy; Bursting; Calibration; Case studies; Classroom Environment; College English; College students; Comparative Analysis; Computer Assisted Instruction; Computer assisted language learning; Computer Assisted Testing; Computer generated language analysis; Construct Validity; Correlation; Educational Testing; English (Second Language); English as a second language instruction; English language; English Language Learners; Error detection; Error feedback; Error of Measurement; Errors; Essays; Evaluators; Feedback; Feedback (Response); Foreign Countries; Human; Human performance; Instructional Systems; Item Response Theory; Majors (Students); Native Speakers; Rating Scales; Recall; Reliability; Researchers; Scoring; Second Language Learning; Second language writing; Second language writing instruction; Undergraduate Students; Validity; Writing; Writing Evaluation |
| title | Investigating the Application of Automated Writing Evaluation to Chinese Undergraduate English Majors: A Case Study of WriteToLearn |
| url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T03%3A21%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_eric_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Investigating%20the%20Application%20of%20Automated%20Writing%20Evaluation%20to%20Chinese%20Undergraduate%20English%20Majors:%20A%20Case%20Study%20of%20WriteToLearn&rft.jtitle=CALICO%20journal&rft.au=Liu,%20Sha&rft.date=2016-01-01&rft.volume=33&rft.issue=1&rft.spage=71&rft.epage=91&rft.pages=71-91&rft.issn=0742-7778&rft.eissn=2056-9017&rft_id=info:doi/10.1558/cj.v33i1.26380&rft_dat=%3Cjstor_eric_%3Ecalicojournal.33.1.71%3C/jstor_eric_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1786538428&rft_id=info:pmid/&rft_ericid=EJ1143726&rft_jstor_id=calicojournal.33.1.71&rfr_iscdi=true |
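
The abstract reports WriteToLearn's error feedback at 49% precision and 18.7% recall, against a 90% precision criterion attributed to Burstein, Chodorow, and Leacock (2003). As a minimal sketch of what those figures mean (not taken from the article), the Python snippet below computes precision and recall from error-detection counts and checks the threshold; the true-positive, false-positive, and false-negative counts are hypothetical and were chosen only so that the output reproduces the reported percentages.

```python
# Illustrative sketch: precision and recall for automated error feedback.
# The counts below are hypothetical, not data from Liu and Kunnan (2016).

def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) for error-detection counts.

    precision = TP / (TP + FP): share of flagged errors that are real errors.
    recall    = TP / (TP + FN): share of real errors that were flagged.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts chosen to reproduce the reported 49% / 18.7%.
    tp, fp, fn = 98, 102, 426
    p, r = precision_recall(tp, fp, fn)
    print(f"precision = {p:.1%}, recall = {r:.1%}")      # precision = 49.0%, recall = 18.7%
    print("meets 90% precision threshold:", p >= 0.90)   # False
```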