Examining the Calibration Process for Raters of the "GRE"® General Test. ETS GRE® Board Research Report. GRE®-19-01. Research Report Series. ETS RR-19-09
One of the challenges in scoring constructed-response (CR) items and tasks is ensuring that rater drift does not occur during or across scoring windows. Rater drift reflects changes in how raters interpret and use established scoring criteria to assign essay scores. Calibration is a process used to help control rater drift and, as such, serves as a type of quality control during CR scoring.
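The abstract refers to "scoring accuracy" without defining how it is computed; in essay scoring, accuracy on a calibration set is often operationalized as agreement between a rater's scores and pre-established benchmark scores. The Python sketch below is illustrative only: the report does not supply this formula, and the function name, tolerance parameter, and sample numbers are invented for the example.

```python
# Minimal sketch (not from the report): one common way to quantify a rater's
# "scoring accuracy" on a calibration set is agreement with benchmark scores.
# All names and numbers here are illustrative, not ETS's operational method.

def agreement_rates(rater_scores, benchmark_scores, adjacent_tolerance=1):
    """Return (exact, adjacent) agreement rates between rater and benchmark scores."""
    if len(rater_scores) != len(benchmark_scores):
        raise ValueError("score lists must be the same length")
    pairs = list(zip(rater_scores, benchmark_scores))
    exact = sum(r == b for r, b in pairs) / len(pairs)
    adjacent = sum(abs(r - b) <= adjacent_tolerance for r, b in pairs) / len(pairs)
    return exact, adjacent

# Example: one rater's scores vs. benchmark scores on a 10-essay calibration set,
# scored on the GRE Analytical Writing 0-6 scale (made-up numbers).
rater = [4, 3, 5, 2, 6, 4, 3, 5, 1, 4]
benchmark = [4, 3, 4, 2, 6, 5, 3, 5, 2, 4]
exact, adjacent = agreement_rates(rater, benchmark)
print(f"exact agreement: {exact:.0%}, adjacent agreement: {adjacent:.0%}")
```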
Saved in:
Published in: | ETS research report series, 2019-12 |
---|---|
Main authors: | Wendler, Cathy; Glazer, Nancy; Cline, Frederick |
Format: | Article |
Language: | eng |
Keywords: | Accuracy; College Entrance Examinations; Essays; Examiners; Graduate Study; Interrater Reliability; Quality Control; Scoring; Test Reliability; Writing Evaluation |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | ETS research report series |
container_volume | |
creator | Wendler, Cathy; Glazer, Nancy; Cline, Frederick |
description | One of the challenges in scoring constructed-response (CR) items and tasks is ensuring that rater drift does not occur during or across scoring windows. Rater drift reflects changes in how raters interpret and use established scoring criteria to assign essay scores. Calibration is a process used to help control rater drift and, as such, serves as a type of quality control during CR scoring. Calibration sets are designed to provide sufficient evidence that raters have understood and internalized the rubrics and can score accurately across all score points of the score scale. This study examined the calibration process used to qualify raters to score essays from the "GRE"® Analytical Writing measure. A total of 46 experienced raters participated in the study, and each rater scored up to 630 essays from 1 of 2 essay prompt types. Two research questions were evaluated: "Does calibration influence scoring accuracy?" and "Does reducing the frequency of calibration impact scoring accuracy?" While the distribution of score points represented by the essays used in the study did not necessarily reflect what raters see during operational scoring, results suggest that the influence of calibration on Day 1 remains with raters through at least 3 scoring days. Results further suggest that scoring accuracy may be moderated by prompt type. Nevertheless, study results indicate that daily calibration for GRE prompt types may not be necessary and that reducing the frequency of calibration is unlikely to reduce scoring accuracy. |
format | Article |
fulltext | fulltext |
identifier | ISSN: 2330-8516 |
ispartof | ETS research report series, 2019-12 |
issn | 2330-8516 |
language | eng |
recordid | cdi_eric_primary_EJ1238527 |
source | Wiley Free Content; ERIC - Full Text Only (Discovery); Education Source; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | Accuracy; College Entrance Examinations; Essays; Examiners; Graduate Study; Interrater Reliability; Quality Control; Scoring; Test Reliability; Writing Evaluation |
title | Examining the Calibration Process for Raters of the "GRE"® General Test. ETS GRE® Board Research Report. GRE®-19-01. Research Report Series. ETS RR-19-09 |