What Affects the Quality of Score Transformations? Potential Issues in True-Score Equating Using the Partial Credit Model

This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person desig...

Bibliographic details
Published in: Educational and psychological measurement 2023-12, Vol.83 (6), p.1249-1290
Main authors: Fellinghauer, Carolina, Debelak, Rudolf, Strobl, Carolin
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1290
container_issue 6
container_start_page 1249
container_title Educational and psychological measurement
container_volume 83
creator Fellinghauer, Carolina
Debelak, Rudolf
Strobl, Carolin
description This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person design. Practical implications of the simulation results are discussed with a focus on scale equating in health-related research settings. The study simulated data for two scales, varying the number of items and the sample sizes. The factor correlation between scales was used to operationalize construct similarity. Targeting of the scales was operationalized through increasing departure from equal difficulty and by varying the dispersion of the item and person parameters in each scale. The results show that low similarity between scales goes along with lower transformation precision. In cases with equal levels of similarity, precision improves in settings where the range of the item parameters encompasses the range of the person parameters. With decreasing similarity, score transformation precision benefits more from good targeting. Difficulty shifts of up to two logits somewhat increased the estimation bias but did not affect the transformation precision. The observed robustness against difficulty shifts supports the advantage of applying true-score equating methods over identity equating, which was used as a naive baseline method for comparison. Finally, a larger sample size did not improve the transformation precision in this study, and longer scales improved the quality of the equating only marginally. The insights from the simulation study are used in a real-data example.
doi_str_mv 10.1177/00131644221143051
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10638984</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ericid>EJ1400231</ericid><sage_id>10.1177_00131644221143051</sage_id><sourcerecordid>2888605041</sourcerecordid><originalsourceid>FETCH-LOGICAL-c495t-6fbbd6dff4400dc87c9859ac5d10fa7f4086e0532db7fe328c0c809e5392e3c03</originalsourceid><addsrcrecordid>eNp1kU9v1DAQxS0EokvhA3AAWeLCJWUc24lzqqrVAkVFFNGKY-R1xruusnFrO0j77XFIWf4JH-zD-82bGT9CnjM4Yayu3wAwziohypIxwUGyB2TBpCwLrpR6SBaTXkzAEXkS4w3kIxh7TI543dQglFqQ_detTvTMWjQp0rRF-nnUvUt76i39YnxAehX0EK0PO52cH-IpvfQJh-R0T89jHDFSN2RoxGLmV3djJocNvY7TPXle6vCDXwbsXKIffYf9U_LI6j7is_v3mFy_XV0t3xcXn96dL88uCiMamYrKrtdd1VkrBEBnVG0aJRttZMfA6toKUBWC5GW3ri3yUhkwChqUvCmRG-DH5HT2vR3XO-xMHj3ovr0NbqfDvvXatX8qg9u2G_-tZVBx1SiRHV7fOwR_l_dN7c5Fg32vB_RjbEvVQC0r4BP66i_0xo9hyPtlSqkKZE4gU2ymTPAxBrSHaRi0U7LtP8nmmpe_r3Go-BllBl7MAAZnDvLqA8v_VvLJ4GTWo97gr7H-3_E7dAa1ng</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2888605041</pqid></control><display><type>article</type><title>What Affects the Quality of Score Transformations? Potential Issues in True-Score Equating Using the Partial Credit Model</title><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>SAGE Complete A-Z List</source><source>PubMed Central</source><creator>Fellinghauer, Carolina ; Debelak, Rudolf ; Strobl, Carolin</creator><creatorcontrib>Fellinghauer, Carolina ; Debelak, Rudolf ; Strobl, Carolin</creatorcontrib><description>This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person design. 
Practical implications of the simulation results are discussed with a focus on scale equating in health-related research settings. The study simulated data for two scales, varying the number of items and the sample sizes. The factor correlation between scales was used to operationalize construct similarity. Targeting of the scales was operationalized through increasing departure from equal difficulty and by varying the dispersion of the item and person parameters in each scale. The results show that low similarity between scales goes along with lower transformation precision. In cases with equal levels of similarity, precision improves in settings where the range of the item parameters is encompassing the person parameters range. With decreasing similarity, score transformation precision benefits more from good targeting. Difficulty shifts up to two logits somewhat increased the estimation bias but without affecting the transformation precision. The observed robustness against difficulty shifts supports the advantage of applying a true-score equating methods over identity equating, which was used as a naive baseline method for comparison. Finally, larger sample size did not improve the transformation precision in this study, longer scales improved only marginally the quality of the equating. 
The insights from the simulation study are used in a real-data example.</description><identifier>ISSN: 0013-1644</identifier><identifier>EISSN: 1552-3888</identifier><identifier>DOI: 10.1177/00131644221143051</identifier><identifier>PMID: 37970488</identifier><language>eng</language><publisher>Los Angeles, CA: SAGE Publications</publisher><subject>Difficulty Level ; Equated Scores ; Item Response Theory ; Rasch model ; Sample Size ; Simulation ; Test Items ; Test Length ; True Scores</subject><ispartof>Educational and psychological measurement, 2023-12, Vol.83 (6), p.1249-1290</ispartof><rights>The Author(s) 2023</rights><rights>The Author(s) 2023.</rights><rights>The Author(s) 2023 2023 SAGE Publications</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c495t-6fbbd6dff4400dc87c9859ac5d10fa7f4086e0532db7fe328c0c809e5392e3c03</cites><orcidid>0000-0002-0042-9945</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638984/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638984/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,21819,27924,27925,43621,43622,53791,53793</link.rule.ids><backlink>$$Uhttp://eric.ed.gov/ERICWebPortal/detail?accno=EJ1400231$$DView record in ERIC$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/37970488$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Fellinghauer, Carolina</creatorcontrib><creatorcontrib>Debelak, Rudolf</creatorcontrib><creatorcontrib>Strobl, Carolin</creatorcontrib><title>What Affects the Quality of Score Transformations? 
Potential Issues in True-Score Equating Using the Partial Credit Model</title><title>Educational and psychological measurement</title><addtitle>Educ Psychol Meas</addtitle><description>This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person design. Practical implications of the simulation results are discussed with a focus on scale equating in health-related research settings. The study simulated data for two scales, varying the number of items and the sample sizes. The factor correlation between scales was used to operationalize construct similarity. Targeting of the scales was operationalized through increasing departure from equal difficulty and by varying the dispersion of the item and person parameters in each scale. The results show that low similarity between scales goes along with lower transformation precision. In cases with equal levels of similarity, precision improves in settings where the range of the item parameters is encompassing the person parameters range. With decreasing similarity, score transformation precision benefits more from good targeting. Difficulty shifts up to two logits somewhat increased the estimation bias but without affecting the transformation precision. The observed robustness against difficulty shifts supports the advantage of applying a true-score equating methods over identity equating, which was used as a naive baseline method for comparison. Finally, larger sample size did not improve the transformation precision in this study, longer scales improved only marginally the quality of the equating. 
The insights from the simulation study are used in a real-data example.</description><subject>Difficulty Level</subject><subject>Equated Scores</subject><subject>Item Response Theory</subject><subject>Rasch model</subject><subject>Sample Size</subject><subject>Simulation</subject><subject>Test Items</subject><subject>Test Length</subject><subject>True Scores</subject><issn>0013-1644</issn><issn>1552-3888</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>AFRWT</sourceid><recordid>eNp1kU9v1DAQxS0EokvhA3AAWeLCJWUc24lzqqrVAkVFFNGKY-R1xruusnFrO0j77XFIWf4JH-zD-82bGT9CnjM4Yayu3wAwziohypIxwUGyB2TBpCwLrpR6SBaTXkzAEXkS4w3kIxh7TI543dQglFqQ_detTvTMWjQp0rRF-nnUvUt76i39YnxAehX0EK0PO52cH-IpvfQJh-R0T89jHDFSN2RoxGLmV3djJocNvY7TPXle6vCDXwbsXKIffYf9U_LI6j7is_v3mFy_XV0t3xcXn96dL88uCiMamYrKrtdd1VkrBEBnVG0aJRttZMfA6toKUBWC5GW3ri3yUhkwChqUvCmRG-DH5HT2vR3XO-xMHj3ovr0NbqfDvvXatX8qg9u2G_-tZVBx1SiRHV7fOwR_l_dN7c5Fg32vB_RjbEvVQC0r4BP66i_0xo9hyPtlSqkKZE4gU2ymTPAxBrSHaRi0U7LtP8nmmpe_r3Go-BllBl7MAAZnDvLqA8v_VvLJ4GTWo97gr7H-3_E7dAa1ng</recordid><startdate>20231201</startdate><enddate>20231201</enddate><creator>Fellinghauer, Carolina</creator><creator>Debelak, Rudolf</creator><creator>Strobl, Carolin</creator><general>SAGE Publications</general><general>SAGE PUBLICATIONS, INC</general><scope>AFRWT</scope><scope>7SW</scope><scope>BJH</scope><scope>BNH</scope><scope>BNI</scope><scope>BNJ</scope><scope>BNO</scope><scope>ERI</scope><scope>PET</scope><scope>REK</scope><scope>WWN</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-0042-9945</orcidid></search><sort><creationdate>20231201</creationdate><title>What Affects the Quality of Score Transformations? 
Potential Issues in True-Score Equating Using the Partial Credit Model</title><author>Fellinghauer, Carolina ; Debelak, Rudolf ; Strobl, Carolin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c495t-6fbbd6dff4400dc87c9859ac5d10fa7f4086e0532db7fe328c0c809e5392e3c03</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Difficulty Level</topic><topic>Equated Scores</topic><topic>Item Response Theory</topic><topic>Rasch model</topic><topic>Sample Size</topic><topic>Simulation</topic><topic>Test Items</topic><topic>Test Length</topic><topic>True Scores</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fellinghauer, Carolina</creatorcontrib><creatorcontrib>Debelak, Rudolf</creatorcontrib><creatorcontrib>Strobl, Carolin</creatorcontrib><collection>Sage Journals GOLD Open Access 2024</collection><collection>ERIC</collection><collection>ERIC (Ovid)</collection><collection>ERIC</collection><collection>ERIC</collection><collection>ERIC (Legacy Platform)</collection><collection>ERIC( SilverPlatter )</collection><collection>ERIC</collection><collection>ERIC PlusText (Legacy Platform)</collection><collection>Education Resources Information Center (ERIC)</collection><collection>ERIC</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Educational and psychological measurement</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fellinghauer, Carolina</au><au>Debelak, Rudolf</au><au>Strobl, Carolin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><ericid>EJ1400231</ericid><atitle>What Affects the Quality of Score Transformations? 
Potential Issues in True-Score Equating Using the Partial Credit Model</atitle><jtitle>Educational and psychological measurement</jtitle><addtitle>Educ Psychol Meas</addtitle><date>2023-12-01</date><risdate>2023</risdate><volume>83</volume><issue>6</issue><spage>1249</spage><epage>1290</epage><pages>1249-1290</pages><issn>0013-1644</issn><eissn>1552-3888</eissn><abstract>This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person design. Practical implications of the simulation results are discussed with a focus on scale equating in health-related research settings. The study simulated data for two scales, varying the number of items and the sample sizes. The factor correlation between scales was used to operationalize construct similarity. Targeting of the scales was operationalized through increasing departure from equal difficulty and by varying the dispersion of the item and person parameters in each scale. The results show that low similarity between scales goes along with lower transformation precision. In cases with equal levels of similarity, precision improves in settings where the range of the item parameters is encompassing the person parameters range. With decreasing similarity, score transformation precision benefits more from good targeting. Difficulty shifts up to two logits somewhat increased the estimation bias but without affecting the transformation precision. The observed robustness against difficulty shifts supports the advantage of applying a true-score equating methods over identity equating, which was used as a naive baseline method for comparison. Finally, larger sample size did not improve the transformation precision in this study, longer scales improved only marginally the quality of the equating. 
The insights from the simulation study are used in a real-data example.</abstract><cop>Los Angeles, CA</cop><pub>SAGE Publications</pub><pmid>37970488</pmid><doi>10.1177/00131644221143051</doi><tpages>42</tpages><orcidid>https://orcid.org/0000-0002-0042-9945</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0013-1644
ispartof Educational and psychological measurement, 2023-12, Vol.83 (6), p.1249-1290
issn 0013-1644
1552-3888
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10638984
source Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; SAGE Complete A-Z List; PubMed Central
subjects Difficulty Level
Equated Scores
Item Response Theory
Rasch model
Sample Size
Simulation
Test Items
Test Length
True Scores
title What Affects the Quality of Score Transformations? Potential Issues in True-Score Equating Using the Partial Credit Model
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T17%3A46%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=What%20Affects%20the%20Quality%20of%20Score%20Transformations?%20Potential%20Issues%20in%20True-Score%20Equating%20Using%20the%20Partial%20Credit%20Model&rft.jtitle=Educational%20and%20psychological%20measurement&rft.au=Fellinghauer,%20Carolina&rft.date=2023-12-01&rft.volume=83&rft.issue=6&rft.spage=1249&rft.epage=1290&rft.pages=1249-1290&rft.issn=0013-1644&rft.eissn=1552-3888&rft_id=info:doi/10.1177/00131644221143051&rft_dat=%3Cproquest_pubme%3E2888605041%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2888605041&rft_id=info:pmid/37970488&rft_ericid=EJ1400231&rft_sage_id=10.1177_00131644221143051&rfr_iscdi=true
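
The description field above refers to true-score equating under the partial credit model after a concurrent calibration. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes both scales already share one latent metric, that each item is described by a list of step thresholds in logits, and that only interior scores (strictly between 0 and the maximum) are equated. All function names and threshold values are hypothetical and chosen for illustration only.

```python
import math


def pcm_probs(theta, thresholds):
    """Category probabilities of one partial credit item at ability theta."""
    # Category k has logit sum_{j<=k} (theta - delta_j); category 0 has logit 0.
    logits = [0.0]
    for delta in thresholds:
        logits.append(logits[-1] + (theta - delta))
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta, scale):
    """Expected (true) sum score of a scale, i.e. a list of items, at theta."""
    return sum(
        k * p
        for item in scale
        for k, p in enumerate(pcm_probs(theta, item))
    )


def theta_for_score(score, scale, lo=-10.0, hi=10.0, tol=1e-8):
    """Invert expected_score by bisection; it is increasing in theta."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_score(mid, scale) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def equate(score, scale_a, scale_b):
    """Map an interior score on scale A to its true-score equivalent on scale B."""
    return expected_score(theta_for_score(score, scale_a), scale_b)


# Hypothetical thresholds: scale B is scale A shifted by one logit (harder).
scale_a = [[-1.0, 0.0], [0.0, 1.0]]
scale_b = [[0.0, 1.0], [1.0, 2.0]]
print(equate(3.0, scale_a, scale_b))
```

Identity equating, the naive baseline named in the description, would simply return the score unchanged. In this sketch the difficulty shift between the two hypothetical scales moves the equated score away from the identity value, which is the kind of difference the study examines.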