Machine learning approaches to detect online harassment using bag of words
Main authors: | , |
---|---|
Format: | Conference proceedings |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | The time people spend online has increased dramatically in recent years, and people can post anonymously, share their opinions, and participate in online chat. As a result, more and more people, especially children, are sexually harassed on social media. This paper aims to detect sexual harassment at an early stage using three machine learning classifiers (SVM, logistic regression, and XGBoost) with a bag-of-words representation of the text. The paper deals with two data sets: Chats Sexual Predators (CSP) and Comments Sexual Harassment (CSH). Logistic regression reached an accuracy of 98.44% on CSH and 93.71% on CSP, while XGBoost reached 96.57% and 90.65%, respectively. XGBoost is used to avoid overfitting and shows promising results on both data sets. |
---|---|
ISSN: | 0094-243X 1551-7616 |
DOI: | 10.1063/5.0118599 |
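
The summary describes representing text as a bag of words and passing it to a classifier such as logistic regression. A minimal sketch of that idea, using only the Python standard library, might look like the following. The toy messages, the function names, and the training settings (500 epochs, learning rate 0.5) are illustrative assumptions, not the authors' data or implementation, and the paper's SVM and XGBoost models are not reproduced here.

```python
import math
import re
from collections import Counter

# Hypothetical toy messages; label 1 = suspicious, 0 = benign.
TRAIN_TEXTS = [
    "keep this chat secret from your parents",
    "send me a private photo and tell nobody",
    "do not tell anyone about our secret",
    "the homework is due on friday",
    "see you at football practice tomorrow",
    "thanks for the notes from class",
]
TRAIN_LABELS = [1, 1, 1, 0, 0, 0]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocabulary)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = count
    return vector


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(x, weights, bias):
    """Predicted probability of label 1 for one count vector."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def train_logistic_regression(X, y, epochs=500, lr=0.5):
    """Fit weights and bias with plain batch gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = score(xi, weights, bias) - yi
            for j, v in enumerate(xi):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(text, vocabulary, weights, bias):
    """Classify a message as 1 (suspicious) or 0 (benign)."""
    x = bag_of_words(text, vocabulary)
    return 1 if score(x, weights, bias) >= 0.5 else 0


vocabulary = build_vocabulary(TRAIN_TEXTS)
X_train = [bag_of_words(t, vocabulary) for t in TRAIN_TEXTS]
weights, bias = train_logistic_regression(X_train, TRAIN_LABELS)

print(predict("tell nobody about this secret chat", vocabulary, weights, bias))
print(predict("the notes from class are due tomorrow", vocabulary, weights, bias))
```

Word order is discarded in this representation: each message becomes a vector of token counts over a fixed vocabulary, which is what makes it usable as input to linear models and tree ensembles alike. A real experiment would need held-out evaluation data rather than six hand-written sentences.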