Examining the Calibration Process for Raters of the "GRE"® General Test. ETS GRE® Board Research Report. GRE®-19-01. Research Report Series. ETS RR-19-09
One of the challenges in scoring constructed-response (CR) items and tasks is ensuring that rater drift does not occur during or across scoring windows. Rater drift reflects changes in how raters interpret and use established scoring criteria to assign essay scores. Calibration is a process used to help control rater drift and, as such, serves as a type of quality control during CR scoring.
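The abstract refers to "scoring accuracy" without defining how it is computed; in essay scoring, accuracy on a calibration set is often operationalized as agreement between a rater's scores and pre-established benchmark scores. The Python sketch below is illustrative only: the report does not supply this formula, and the function name, tolerance parameter, and sample numbers are invented for the example.

```python
# Minimal sketch (not from the report): one common way to quantify a rater's
# "scoring accuracy" on a calibration set is agreement with benchmark scores.
# All names and numbers here are illustrative, not ETS's operational method.

def agreement_rates(rater_scores, benchmark_scores, adjacent_tolerance=1):
    """Return (exact, adjacent) agreement rates between rater and benchmark scores."""
    if len(rater_scores) != len(benchmark_scores):
        raise ValueError("score lists must be the same length")
    pairs = list(zip(rater_scores, benchmark_scores))
    exact = sum(r == b for r, b in pairs) / len(pairs)
    adjacent = sum(abs(r - b) <= adjacent_tolerance for r, b in pairs) / len(pairs)
    return exact, adjacent

# Example: one rater's scores vs. benchmark scores on a 10-essay calibration set,
# scored on the GRE Analytical Writing 0-6 scale (made-up numbers).
rater = [4, 3, 5, 2, 6, 4, 3, 5, 1, 4]
benchmark = [4, 3, 4, 2, 6, 5, 3, 5, 2, 4]
exact, adjacent = agreement_rates(rater, benchmark)
print(f"exact agreement: {exact:.0%}, adjacent agreement: {adjacent:.0%}")
```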
Saved in:
Published in: | ETS research report series, 2019-12 |
---|---|
Main authors: | Wendler, Cathy; Glazer, Nancy; Cline, Frederick |
Format: | Article |
Language: | eng |
Keywords: | Accuracy; College Entrance Examinations; Essays; Examiners; Graduate Study; Interrater Reliability; Quality Control; Scoring; Test Reliability; Writing Evaluation |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | ETS research report series |
container_volume | |
creator | Wendler, Cathy; Glazer, Nancy; Cline, Frederick |
description | One of the challenges in scoring constructed-response (CR) items and tasks is ensuring that rater drift does not occur during or across scoring windows. Rater drift reflects changes in how raters interpret and use established scoring criteria to assign essay scores. Calibration is a process used to help control rater drift and, as such, serves as a type of quality control during CR scoring. Calibration sets are designed to provide sufficient evidence that raters have understood and internalized the rubrics and can score accurately across all score points of the score scale. This study examined the calibration process used to qualify raters to score essays from the "GRE"® Analytical Writing measure. A total of 46 experienced raters participated in the study, and each rater scored up to 630 essays from 1 of 2 essay prompt types. Two research questions were evaluated: "Does calibration influence scoring accuracy?" and "Does reducing the frequency of calibration impact scoring accuracy?" While the distribution of score points represented by the essays used in the study did not necessarily reflect what raters see during operational scoring, results suggest that the influence of calibration on Day 1 remains with raters through at least 3 scoring days. Results further suggest that scoring accuracy may be moderated by prompt type. Nevertheless, study results indicate that daily calibration for GRE prompt types may not be necessary and that reducing the frequency of calibration is unlikely to reduce scoring accuracy. |
format | Article |
fulltext | fulltext |
identifier | ISSN: 2330-8516 |
ispartof | ETS research report series, 2019-12 |
issn | 2330-8516 |
language | eng |
recordid | cdi_eric_primary_EJ1238527 |
source | Wiley Free Content; ERIC - Full Text Only (Discovery); Education Source; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | Accuracy; College Entrance Examinations; Essays; Examiners; Graduate Study; Interrater Reliability; Quality Control; Scoring; Test Reliability; Writing Evaluation |
title | Examining the Calibration Process for Raters of the "GRE"® General Test. ETS GRE® Board Research Report. GRE®-19-01. Research Report Series. ETS RR-19-09 |