Identifying Fraudulent Respondents in Preference Surveys: An Example in Inflammatory Bowel Disease
Background Social media and online surveys are commonly used to recruit and collect data for health preference studies. Online data fraud (i.e., intentional duplicate responses/straight-lining, bots, professional survey takers who provide fraudulent responses to meet study eligibility) is increasing...
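Published in: | The patient : patient-centered outcomes research 2024-11, Vol.17 (6), p.719-720 |
---|---|
Main authors: | MacDonald, Karen V ; Nguyen, Geoff C ; Barker, Karis L ; Harris, Meghan ; Sewitch, Maida J ; Marshall, Deborah A |
Format: | Article |
Language: | eng |
Subjects: | Algorithms ; Fraud ; Inflammatory bowel disease ; Polls & surveys ; Social networks |
Online access: | Full text |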
container_end_page | 720 |
---|---|
container_issue | 6 |
container_start_page | 719 |
container_title | The patient : patient-centered outcomes research |
container_volume | 17 |
creator | MacDonald, Karen V ; Nguyen, Geoff C ; Barker, Karis L ; Harris, Meghan ; Sewitch, Maida J ; Marshall, Deborah A |
description | Background Social media and online surveys are commonly used to recruit and collect data for health preference studies. Online data fraud (i.e., intentional duplicate responses/straight-lining, bots, professional survey takers who provide fraudulent responses to meet study eligibility) is increasing and difficult to identify. We developed a fraud identification algorithm and verification process and demonstrate the impact of fraudulent respondents on data and results. Methods We administered an online best-worst scaling (BWS) survey on healthcare processes for managing inflammatory bowel disease (IBD) to Canadian IBD patients. Recruitment was done in clinic and online (mailing lists, social media). A gift card was offered for participation, which resulted in an influx of fraudulent respondents. We developed a fraud identification algorithm with 13 binary 'red flag' variables related to respondent age, year of IBD diagnosis, postal code, survey duration, responses to open text questions, email address, and Qualtrics fraud variables. These variables were used to generate a fraudulent response score (FRS; range: 0 [most likely real (LR)] to 13 [most likely fraudulent (LF)]). Respondents with FRS >3 were categorized as LF. Data of respondents with FRS ≤3 were further reviewed to determine categorization (LF, LR, unsure). Next, respondents categorized as LR or unsure underwent age verification; those who correctly verified age remained categorized as LR. BWS data were analyzed using conditional logit. We explored differences in results by FRS (>3 vs ≤3) and categorization (LF, LR, unsure). Results Based on FRS, 75% (n = 3258) of the 4334 respondents were initially categorized as LF, 17% (n = 727) as unsure, and 8% (n = 349) as LR. After age verification, 76% (n = 3297) were categorized as LF, 14% (n = 592) as unsure, 10% (n = 442) as LR, and <1% (n = 3) were duplicates of LR respondents. The ranking of BWS items differed by FRS and LF/LR/unsure categories. Conclusions Despite convenience, social media and online surveys may be prone to data fraud, especially when incentives are offered. We developed an algorithm and verification process to identify fraudulent responses. Given that only 10% of our sample was considered likely real and ranking of BWS items differed by which respondents were included in the analysis, health preference researchers using social media and online surveys should carefully examine data for fraudulent responses and apply strategies to mitigate risks. |
format | Article |
eissn | 1178-1661 |
publisher | Auckland: Springer Nature B.V |
rights | Copyright Springer Nature B.V. Nov 2024 |
fulltext | fulltext |
identifier | ISSN: 1178-1653 |
ispartof | The patient : patient-centered outcomes research, 2024-11, Vol.17 (6), p.719-720 |
issn | 1178-1653 ; 1178-1661 |
language | eng |
recordid | cdi_proquest_journals_3115796661 |
source | SpringerLink Journals - AutoHoldings |
subjects | Algorithms ; Fraud ; Inflammatory bowel disease ; Polls & surveys ; Social networks |
title | Identifying Fraudulent Respondents in Preference Surveys: An Example in Inflammatory Bowel Disease |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T01%3A18%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Identifying%20Fraudulent%20Respondents%20in%20Preference%20Surveys:%20An%20Example%20in%20Inflammatory%20Bowel%20Disease&rft.jtitle=The%20patient%20:%20patient-centered%20outcomes%20research&rft.au=MacDonald,%20Karen%20V&rft.date=2024-11-01&rft.volume=17&rft.issue=6&rft.spage=719&rft.epage=720&rft.pages=719-720&rft.issn=1178-1653&rft.eissn=1178-1661&rft_id=info:doi/&rft_dat=%3Cproquest%3E3115796661%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3115796661&rft_id=info:pmid/&rfr_iscdi=true |
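
The scoring rule summarized in the description field (binary red flags summed into a fraudulent response score, with FRS >3 categorized as LF) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not name the 13 individual flags, so the flag names below are hypothetical, as are the function names and the `manual_review` label used for responses at or below the cut-off.

```python
from typing import Mapping

# Cut-off reported in the abstract: FRS > 3 means "likely fraudulent" (LF).
FRS_CUTOFF = 3


def fraudulent_response_score(red_flags: Mapping[str, bool]) -> int:
    """Count the raised binary red flags for one respondent."""
    return sum(1 for raised in red_flags.values() if raised)


def initial_category(score: int) -> str:
    """LF above the cut-off; otherwise the response goes to manual review."""
    return "LF" if score > FRS_CUTOFF else "manual_review"


# Hypothetical flag names, loosely based on the areas the abstract lists
# (age, diagnosis year, postal code, survey duration, email address).
respondent = {
    "implausible_age": True,
    "implausible_diagnosis_year": True,
    "invalid_postal_code": True,
    "survey_too_fast": True,
    "suspicious_email": False,
}
score = fraudulent_response_score(respondent)
print(score, initial_category(score))
```

In this example four flags are raised, so the score exceeds the cut-off and the respondent is labeled LF. A score of 3 or lower would not settle the category on its own; per the abstract, those responses were reviewed further and then passed to age verification.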