Identifying Fraudulent Respondents in Preference Surveys: An Example in Inflammatory Bowel Disease

Background Social media and online surveys are commonly used to recruit and collect data for health preference studies. Online data fraud (i.e., intentional duplicate responses/straight-lining, bots, professional survey takers who provide fraudulent responses to meet study eligibility) is increasing...

Bibliographic details
Published in: The patient : patient-centered outcomes research 2024-11, Vol.17 (6), p.719-720
Main authors: MacDonald, Karen V, Nguyen, Geoff C, Barker, Karis L, Harris, Meghan, Sewitch, Maida J, Marshall, Deborah A
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 720
container_issue 6
container_start_page 719
container_title The patient : patient-centered outcomes research
container_volume 17
creator MacDonald, Karen V
Nguyen, Geoff C
Barker, Karis L
Harris, Meghan
Sewitch, Maida J
Marshall, Deborah A
description Background Social media and online surveys are commonly used to recruit and collect data for health preference studies. Online data fraud (i.e., intentional duplicate responses/straight-lining, bots, professional survey takers who provide fraudulent responses to meet study eligibility) is increasing and difficult to identify. We developed a fraud identification algorithm and verification process and demonstrate the impact of fraudulent respondents on data and results. Methods We administered an online best-worst scaling (BWS) survey on healthcare processes for managing inflammatory bowel disease (IBD) to Canadian IBD patients. Recruitment was done in clinic and online (mailing lists, social media). A gift card was offered for participation, which resulted in an influx of fraudulent respondents. We developed a fraud identification algorithm with 13 binary 'red flag' variables related to respondent age, year of IBD diagnosis, postal code, survey duration, responses to open text questions, email address, and Qualtrics fraud variables. These variables were used to generate a fraudulent response score (FRS; range: 0 [most likely real (LR)] to 13 [most likely fraudulent (LF)]). Respondents with FRS >3 were categorized as LF. Data of respondents with FRS ≤3 were further reviewed to determine categorization (LF, LR, unsure). Next, respondents categorized as LR or unsure underwent age verification; those who correctly verified age remained categorized as LR. BWS data were analyzed using conditional logit. We explored differences in results by FRS (>3 vs ≤3) and categorization (LF, LR, unsure). Results Based on FRS, 75% (n = 3258) of the 4334 respondents were initially categorized as LF, 17% (n = 727) as unsure, and 8% (n = 349) as LR. After age verification, 76% (n = 3297) were categorized as LF, 14% (n = 592) as unsure, 10% (n = 442) as LR, and <1% (n = 3) were duplicates of LR respondents. The ranking of BWS items differed by FRS and LF/LR/unsure categories. Conclusions Despite convenience, social media and online surveys may be prone to data fraud, especially when incentives are offered. We developed an algorithm and verification process to identify fraudulent responses. Given that only 10% of our sample was considered likely real and ranking of BWS items differed by which respondents were included in the analysis, health preference researchers using social media and online surveys should carefully examine data for fraudulent responses and apply strategies to mitigate risks.
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3115796661</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3115796661</sourcerecordid><originalsourceid>FETCH-proquest_journals_31157966613</originalsourceid><addsrcrecordid>eNqNjssKwjAURIMoWB__cMF1wVCsj52Pit2Jupeot5KSJjW3Ufv3piCuXc0w5yymxQLOp7OQxzFv__ok6rIeUT4exx7EAbukN9SVzGqp77C1wt2c8gMckEqjG0YgNewtZmhRXxGOzj6xpgUsNSRvUZQKGyPVmRJFISpja1iZFyrYSEJBOGCdTCjC4Tf7bLRNTutdWFrzcEjVOTfOao_OEeeT6dwf49F_1gfVkEcf</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3115796661</pqid></control><display><type>article</type><title>Identifying Fraudulent Respondents in Preference Surveys: An Example in Inflammatory Bowel Disease</title><source>SpringerLink Journals - AutoHoldings</source><creator>MacDonald, Karen V ; Nguyen, Geoff C ; Barker, Karis L ; Harris, Meghan ; Sewitch, Maida J ; Marshall, Deborah A</creator><creatorcontrib>MacDonald, Karen V ; Nguyen, Geoff C ; Barker, Karis L ; Harris, Meghan ; Sewitch, Maida J ; Marshall, Deborah A</creatorcontrib><description>Background Social media and online surveys are commonly used to recruit and collect data for health preference studies. Online data fraud (i.e., intentional duplicate responses/straight-lining, bots, professional survey takers who provide fraudulent responses to meet study eligibility) is increasing and difficult to identify. We developed a fraud identification algorithm and verification process and demonstrate the impact of fraudulent respondents on data and results. Methods We administered an online best-worst scaling (BWS) survey on healthcare processes for managing inflammatory bowel disease (IBD) to Canadian IBD patients. Recruitment was done in clinic and online (mailing lists, social media). A gift card was offered for participation, which resulted in an influx of fraudulent respondents. 
We developed a fraud identification algorithm with 13 binary 'red flag' variables related to respondent age, year of IBD diagnosis, postal code, survey duration, responses to open text questions, email address, and Qualtrics fraud variables. These variables were used to generate a fraudulent response score (FRS; range: 0 [most likely real (LR)] to 13 [most likely fraudulent (LF)]). Respondents with FRS &gt;3 were categorized as LF. Data of respondents with FRS ≤3 were further reviewed to determine categorization (LF, LR, unsure). Next, respondents categorized as LR or unsure underwent age verification; those who correctly verified age remained categorized as LR. BWS data were analyzed using conditional logit. We explored differences in results by FRS (&gt;3 vs ≤3) and categorization (LF, LR, unsure). Results Based on FRS, 75% (n = 3258) of the 4334 respondents were initially categorized as LF, 17% (n = 727) as unsure, and 8% (n = 349) as LR. After age verification, 76% (n = 3297) were categorized as LF, 14% (n = 592) as unsure, 10% (n = 442) as LR, and &lt;1% (n = 3) were duplicates of LR respondents. The ranking of BWS items differed by FRS and LF/LR/unsure categories. Conclusions Despite convenience, social media and online surveys may be prone to data fraud, especially when incentives are offered. We developed an algorithm and verification process to identify fraudulent responses. 
Given that only 10% of our sample was considered likely real and ranking of BWS items differed by which respondents were included in the analysis, health preference researchers using social media and online surveys should carefully examine data for fraudulent responses and apply strategies to mitigate risks.</description><identifier>ISSN: 1178-1653</identifier><identifier>EISSN: 1178-1661</identifier><language>eng</language><publisher>Auckland: Springer Nature B.V</publisher><subject>Algorithms ; Fraud ; Inflammatory bowel disease ; Polls &amp; surveys ; Social networks</subject><ispartof>The patient : patient-centered outcomes research, 2024-11, Vol.17 (6), p.719-720</ispartof><rights>Copyright Springer Nature B.V. Nov 2024</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784</link.rule.ids></links><search><creatorcontrib>MacDonald, Karen V</creatorcontrib><creatorcontrib>Nguyen, Geoff C</creatorcontrib><creatorcontrib>Barker, Karis L</creatorcontrib><creatorcontrib>Harris, Meghan</creatorcontrib><creatorcontrib>Sewitch, Maida J</creatorcontrib><creatorcontrib>Marshall, Deborah A</creatorcontrib><title>Identifying Fraudulent Respondents in Preference Surveys: An Example in Inflammatory Bowel Disease</title><title>The patient : patient-centered outcomes research</title><description>Background Social media and online surveys are commonly used to recruit and collect data for health preference studies. Online data fraud (i.e., intentional duplicate responses/straight-lining, bots, professional survey takers who provide fraudulent responses to meet study eligibility) is increasing and difficult to identify. We developed a fraud identification algorithm and verification process and demonstrate the impact of fraudulent respondents on data and results. 
Methods We administered an online best-worst scaling (BWS) survey on healthcare processes for managing inflammatory bowel disease (IBD) to Canadian IBD patients. Recruitment was done in clinic and online (mailing lists, social media). A gift card was offered for participation, which resulted in an influx of fraudulent respondents. We developed a fraud identification algorithm with 13 binary 'red flag' variables related to respondent age, year of IBD diagnosis, postal code, survey duration, responses to open text questions, email address, and Qualtrics fraud variables. These variables were used to generate a fraudulent response score (FRS; range: 0 [most likely real (LR)] to 13 [most likely fraudulent (LF)]). Respondents with FRS &gt;3 were categorized as LF. Data of respondents with FRS ≤3 were further reviewed to determine categorization (LF, LR, unsure). Next, respondents categorized as LR or unsure underwent age verification; those who correctly verified age remained categorized as LR. BWS data were analyzed using conditional logit. We explored differences in results by FRS (&gt;3 vs ≤3) and categorization (LF, LR, unsure). Results Based on FRS, 75% (n = 3258) of the 4334 respondents were initially categorized as LF, 17% (n = 727) as unsure, and 8% (n = 349) as LR. After age verification, 76% (n = 3297) were categorized as LF, 14% (n = 592) as unsure, 10% (n = 442) as LR, and &lt;1% (n = 3) were duplicates of LR respondents. The ranking of BWS items differed by FRS and LF/LR/unsure categories. Conclusions Despite convenience, social media and online surveys may be prone to data fraud, especially when incentives are offered. We developed an algorithm and verification process to identify fraudulent responses. 
Given that only 10% of our sample was considered likely real and ranking of BWS items differed by which respondents were included in the analysis, health preference researchers using social media and online surveys should carefully examine data for fraudulent responses and apply strategies to mitigate risks.</description><subject>Algorithms</subject><subject>Fraud</subject><subject>Inflammatory bowel disease</subject><subject>Polls &amp; surveys</subject><subject>Social networks</subject><issn>1178-1653</issn><issn>1178-1661</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNqNjssKwjAURIMoWB__cMF1wVCsj52Pit2Jupeot5KSJjW3Ufv3piCuXc0w5yymxQLOp7OQxzFv__ok6rIeUT4exx7EAbukN9SVzGqp77C1wt2c8gMckEqjG0YgNewtZmhRXxGOzj6xpgUsNSRvUZQKGyPVmRJFISpja1iZFyrYSEJBOGCdTCjC4Tf7bLRNTutdWFrzcEjVOTfOao_OEeeT6dwf49F_1gfVkEcf</recordid><startdate>20241101</startdate><enddate>20241101</enddate><creator>MacDonald, Karen V</creator><creator>Nguyen, Geoff C</creator><creator>Barker, Karis L</creator><creator>Harris, Meghan</creator><creator>Sewitch, Maida J</creator><creator>Marshall, Deborah A</creator><general>Springer Nature B.V</general><scope>4T-</scope><scope>K9.</scope><scope>NAPCQ</scope></search><sort><creationdate>20241101</creationdate><title>Identifying Fraudulent Respondents in Preference Surveys: An Example in Inflammatory Bowel Disease</title><author>MacDonald, Karen V ; Nguyen, Geoff C ; Barker, Karis L ; Harris, Meghan ; Sewitch, Maida J ; Marshall, Deborah A</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_31157966613</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Fraud</topic><topic>Inflammatory bowel disease</topic><topic>Polls &amp; surveys</topic><topic>Social networks</topic><toplevel>online_resources</toplevel><creatorcontrib>MacDonald, 
Karen V</creatorcontrib><creatorcontrib>Nguyen, Geoff C</creatorcontrib><creatorcontrib>Barker, Karis L</creatorcontrib><creatorcontrib>Harris, Meghan</creatorcontrib><creatorcontrib>Sewitch, Maida J</creatorcontrib><creatorcontrib>Marshall, Deborah A</creatorcontrib><collection>Docstoc</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Nursing &amp; Allied Health Premium</collection><jtitle>The patient : patient-centered outcomes research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>MacDonald, Karen V</au><au>Nguyen, Geoff C</au><au>Barker, Karis L</au><au>Harris, Meghan</au><au>Sewitch, Maida J</au><au>Marshall, Deborah A</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Identifying Fraudulent Respondents in Preference Surveys: An Example in Inflammatory Bowel Disease</atitle><jtitle>The patient : patient-centered outcomes research</jtitle><date>2024-11-01</date><risdate>2024</risdate><volume>17</volume><issue>6</issue><spage>719</spage><epage>720</epage><pages>719-720</pages><issn>1178-1653</issn><eissn>1178-1661</eissn><abstract>Background Social media and online surveys are commonly used to recruit and collect data for health preference studies. Online data fraud (i.e., intentional duplicate responses/straight-lining, bots, professional survey takers who provide fraudulent responses to meet study eligibility) is increasing and difficult to identify. We developed a fraud identification algorithm and verification process and demonstrate the impact of fraudulent respondents on data and results. Methods We administered an online best-worst scaling (BWS) survey on healthcare processes for managing inflammatory bowel disease (IBD) to Canadian IBD patients. Recruitment was done in clinic and online (mailing lists, social media). 
A gift card was offered for participation, which resulted in an influx of fraudulent respondents. We developed a fraud identification algorithm with 13 binary 'red flag' variables related to respondent age, year of IBD diagnosis, postal code, survey duration, responses to open text questions, email address, and Qualtrics fraud variables. These variables were used to generate a fraudulent response score (FRS; range: 0 [most likely real (LR)] to 13 [most likely fraudulent (LF)]). Respondents with FRS &gt;3 were categorized as LF. Data of respondents with FRS ≤3 were further reviewed to determine categorization (LF, LR, unsure). Next, respondents categorized as LR or unsure underwent age verification; those who correctly verified age remained categorized as LR. BWS data were analyzed using conditional logit. We explored differences in results by FRS (&gt;3 vs ≤3) and categorization (LF, LR, unsure). Results Based on FRS, 75% (n = 3258) of the 4334 respondents were initially categorized as LF, 17% (n = 727) as unsure, and 8% (n = 349) as LR. After age verification, 76% (n = 3297) were categorized as LF, 14% (n = 592) as unsure, 10% (n = 442) as LR, and &lt;1% (n = 3) were duplicates of LR respondents. The ranking of BWS items differed by FRS and LF/LR/unsure categories. Conclusions Despite convenience, social media and online surveys may be prone to data fraud, especially when incentives are offered. We developed an algorithm and verification process to identify fraudulent responses. Given that only 10% of our sample was considered likely real and ranking of BWS items differed by which respondents were included in the analysis, health preference researchers using social media and online surveys should carefully examine data for fraudulent responses and apply strategies to mitigate risks.</abstract><cop>Auckland</cop><pub>Springer Nature B.V</pub></addata></record>
fulltext fulltext
identifier ISSN: 1178-1653
ispartof The patient : patient-centered outcomes research, 2024-11, Vol.17 (6), p.719-720
issn 1178-1653
1178-1661
language eng
recordid cdi_proquest_journals_3115796661
source SpringerLink Journals - AutoHoldings
subjects Algorithms
Fraud
Inflammatory bowel disease
Polls & surveys
Social networks
title Identifying Fraudulent Respondents in Preference Surveys: An Example in Inflammatory Bowel Disease
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T01%3A18%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Identifying%20Fraudulent%20Respondents%20in%20Preference%20Surveys:%20An%20Example%20in%20Inflammatory%20Bowel%20Disease&rft.jtitle=The%20patient%20:%20patient-centered%20outcomes%20research&rft.au=MacDonald,%20Karen%20V&rft.date=2024-11-01&rft.volume=17&rft.issue=6&rft.spage=719&rft.epage=720&rft.pages=719-720&rft.issn=1178-1653&rft.eissn=1178-1661&rft_id=info:doi/&rft_dat=%3Cproquest%3E3115796661%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3115796661&rft_id=info:pmid/&rfr_iscdi=true
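
Illustrative sketch (not part of the catalog record): the abstract describes a fraudulent response score (FRS) built by summing 13 binary red-flag variables, with respondents scoring above 3 categorized as likely fraudulent (LF) and the rest sent for further review. A minimal Python sketch of that scoring rule might look as follows. The abstract names only the categories of red flags, not the individual variables, so the flag names, the `Respondent` class, and the `"needs_review"` label below are all hypothetical; the manual review and age verification steps are not modeled.

```python
from dataclasses import dataclass, field

# From the abstract: respondents with FRS > 3 are categorized as likely fraudulent (LF).
FRS_THRESHOLD = 3


@dataclass
class Respondent:
    respondent_id: str
    # Hypothetical binary red flags (True = flag raised). The study used 13 such
    # variables; the actual variable definitions are not given in the abstract.
    red_flags: dict = field(default_factory=dict)


def fraudulent_response_score(respondent: Respondent) -> int:
    """Count the raised red flags (0 = most likely real, 13 = most likely fraudulent)."""
    return sum(1 for raised in respondent.red_flags.values() if raised)


def initial_category(respondent: Respondent) -> str:
    """Apply the FRS > 3 rule; scores of 3 or less go to manual review."""
    if fraudulent_response_score(respondent) > FRS_THRESHOLD:
        return "LF"
    return "needs_review"  # hypothetical label for the further-review step


if __name__ == "__main__":
    example = Respondent(
        "r001",
        {
            "implausible_age": True,        # hypothetical flag names
            "invalid_postal_code": True,
            "very_short_duration": True,
            "duplicate_email": True,
            "nonsense_open_text": False,
        },
    )
    print(fraudulent_response_score(example), initial_category(example))
```

Summing unweighted flags keeps the score easy to audit; whether the study weighted any flags differently is not stated in the abstract.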