Machine learning approaches to detect online harassment using bag of words

Bibliographic Details
Main Authors: Amer, Noor; Dhannoon, Ban N.
Format: Conference Proceeding
Language: English
Online Access: Full Text
Description
Summary: People are spending dramatically more time online, where they can post anonymously, share their opinions, and take part in online chat. As a result, more and more people, especially children, are sexually harassed on social media. This paper aims to detect sexual harassment at an early stage using three machine learning classifiers (SVM, logistic regression, and XGBoost) with a bag-of-words representation of the text. The paper works with two datasets: Chats Sexual Predators (CSP) and Comments Sexual Harassment (CSH). Logistic regression reached an accuracy of 98.44% on CSH and 93.71% on CSP, while XGBoost reached 96.57% and 90.65%, respectively. XGBoost is used to avoid overfitting and shows promising results on both datasets.
ISSN: 0094-243X, 1551-7616
DOI:10.1063/5.0118599
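
The summary names a bag-of-words text representation fed into a classifier such as logistic regression. As a rough illustration of that idea only, the sketch below builds word-count vectors and fits a small logistic regression using nothing but the Python standard library. It is not the authors' implementation: the example messages are invented placeholders (not drawn from the CSP or CSH datasets), and the function names, learning rate, and epoch count are assumptions chosen for readability.

```python
import math
from collections import Counter

# Invented toy messages, NOT taken from the CSP or CSH datasets.
# Label 1 = harassing, 0 = benign (hypothetical labels for illustration).
TRAIN = [
    ("send me a private photo now", 1),
    ("you must send photo or else", 1),
    ("keep this chat secret from your parents", 1),
    ("do not tell your parents about me", 1),
    ("thanks for the homework help", 0),
    ("see you at the game tomorrow", 0),
    ("great photo of the team", 0),
    ("your parents were at the game", 0),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_vocabulary(texts):
    """Map every distinct training word to a column index."""
    words = sorted({word for text in texts for word in tokenize(text)})
    return {word: index for index, word in enumerate(words)}


def bag_of_words(text, vocab):
    """Return a count vector; words outside the vocabulary are ignored."""
    vector = [0.0] * len(vocab)
    for word, count in Counter(tokenize(text)).items():
        if word in vocab:
            vector[vocab[word]] = float(count)
    return vector


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, lr=0.1, epochs=300):
    """Fit weights and bias with plain stochastic gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, xi)))
            error = p - yi  # gradient of the log loss w.r.t. the logit
            for j, xj in enumerate(xi):
                if xj:
                    weights[j] -= lr * error * xj
            bias -= lr * error
    return weights, bias


def predict_proba(text, vocab, weights, bias):
    """Estimated probability that a message belongs to class 1."""
    x = bag_of_words(text, vocab)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


texts = [text for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
vocab = build_vocabulary(texts)
X = [bag_of_words(text, vocab) for text in texts]
weights, bias = train_logistic_regression(X, labels)
train_predictions = [
    int(predict_proba(text, vocab, weights, bias) >= 0.5) for text in texts
]
```

Because bag of words discards word order, each message is reduced to a vector of counts, which is what makes linear models such as logistic regression or SVM straightforward to apply. A real experiment would use a held-out test set and the actual datasets; the toy data here only shows the mechanics.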