Finding Deceptive Opinion Spam by Correcting the Mislabeled Instances

Bibliographic details
Published in: Chinese Journal of Electronics 2015-01, Vol. 24 (1), p. 52-57
Main authors: Ren, Yafeng; Ji, Donghong; Yin, Lan; Zhang, Hongbin
Format: Article
Language: English
Description
Summary: Assessing the trustworthiness of reviews is a key problem in natural language processing and computational linguistics. Previous work mainly relies on heuristic strategies or simple supervised learning methods, which limits performance on this task. This paper presents a new approach to finding deceptive opinion spam from the viewpoint of correcting mislabeled instances. The dataset is partitioned into several subsets, a set of classifiers is constructed for each subset, and the best one is selected to evaluate the whole dataset. Error variables are defined to compute the probability that an instance has been mislabeled. Mislabeled instances are then corrected under two threshold schemes, majority and non-objection. The results show significant improvements for the proposed method over existing baselines.
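
The two threshold schemes named in the summary can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the classifiers, the features, or the error variables, so the code below assumes a toy one-dimensional feature, a hypothetical nearest-centroid classifier trained on each subset, and a simple count of disagreeing classifiers in place of the paper's probability estimate. It also skips the step of selecting the best classifier per subset. All function names are made up for illustration.

```python
from statistics import mean


def train_centroid(subset):
    """Toy stand-in classifier (assumption): one centroid per class label."""
    labels = {y for _, y in subset}
    return {label: mean(x for x, y in subset if y == label) for label in labels}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def flag_mislabeled(votes, label, scheme):
    """Decide whether `label` looks wrong given the classifiers' votes.

    majority:      more than half of the classifiers disagree with the label.
    non_objection: every classifier disagrees with the label.
    """
    errors = sum(vote != label for vote in votes)
    if scheme == "majority":
        return errors > len(votes) / 2
    if scheme == "non_objection":
        return errors == len(votes)
    raise ValueError(f"unknown scheme: {scheme}")


def correct_labels(data, n_subsets=3, scheme="majority"):
    """Partition the data, train one classifier per subset, relabel flagged items."""
    # Round-robin partition into n_subsets disjoint subsets.
    subsets = [data[i::n_subsets] for i in range(n_subsets)]
    models = [train_centroid(subset) for subset in subsets]
    corrected = []
    for x, y in data:
        votes = [predict(model, x) for model in models]
        if flag_mislabeled(votes, y, scheme):
            # Replace the label with the most common vote.
            y = max(set(votes), key=votes.count)
        corrected.append((x, y))
    return corrected


# Hypothetical data: a scalar feature per review; index 6 is deliberately mislabeled.
reviews = [
    (1.0, "truthful"), (9.0, "deceptive"), (1.1, "truthful"), (8.8, "deceptive"),
    (0.9, "truthful"), (9.2, "deceptive"), (1.2, "deceptive"), (9.1, "deceptive"),
    (1.3, "truthful"), (8.9, "deceptive"), (0.8, "truthful"), (9.3, "deceptive"),
]
cleaned = correct_labels(reviews, n_subsets=3, scheme="non_objection")
```

In this toy run all three classifiers disagree with the label at index 6, so both schemes relabel it as truthful. The schemes differ when the classifiers are split: with two of three votes against a label, the majority scheme flags the instance while the stricter non-objection scheme leaves it unchanged.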
ISSN: 1022-4653, 2075-5597
DOI: 10.1049/cje.2015.01.009