An experimental comparison of real and artificial deception using a deception generation model

To develop a data mining approach for a deception application, data collection costs can be prohibitive because both deceptive data and truthful data are necessary to be collected. To reduce data collection costs, artificially generated deception data can be used, but the impact of using artificiall...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Decision Support Systems 2012-06, Vol.53 (3), p.543-553
Hauptverfasser: Yang, Yanjuan, Mannino, Michael V.
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:To develop a data mining approach for a deception application, data collection costs can be prohibitive because both deceptive data and truthful data are necessary to be collected. To reduce data collection costs, artificially generated deception data can be used, but the impact of using artificially generated deception data is not well understood. To study the relationship between artificial and real deception, this paper presents an experimental comparison using a novel deception generation model. The deception and truth data were collected from financial aid applications, a document centric area with limited resources for verification. The data collection provided a unique data set containing truth, natural deception, and boosted deception. To simulate deception, the Application Deception Model was developed to generate artificial deception in different deception scenarios. To study differences between artificial and real deception, an experiment was performed using deception level and data generation method as factors and directed distance and outlier score as outcome variables. Our results provided evidence of a reasonable similarity between artificial and real deception, suggesting the possibility of using artificially generated deception to reduce the costs associated with obtaining training data. ► This paper develops a new data generation model for document deception. ► This paper presents an experiment comparing the similarity of real deception and artificially generated noise and deception. ► This paper provides evidence that artificially generated deception is similar to real deception.
ISSN:0167-9236
1873-5797
DOI:10.1016/j.dss.2012.04.009