Classification of online toxic comments using the logistic regression and neural networks models

Bibliographic details
Main authors:	Saif, Mujahed A., Medvedev, Alexander N., Medvedev, Maxim A., Atanasova, Todorka
Format:	Conference proceeding
Language:	English
Subjects:	Artificial neural networks; Classification; Machine learning; Neural networks; Problem solving; Regression analysis; Regression models
Online access:	Full text

Description
Summary:	The paper addresses the identification of abusive content on the Internet. It presents a solution to the toxic online comment classification task posted on the machine learning site Kaggle (www.Kaggle.com) in March 2018. Based on an analysis of the initial data, four models are proposed for the task: a logistic regression model and three neural network models, namely a convolutional neural network (Conv), a long short-term memory network (LSTM), and a combined Conv + LSTM model. All models are implemented in a Python 3 program that has a simple structure and can be adapted to other tasks. The classification results obtained with the proposed models are presented. The paper concludes that all models solve the task successfully, but that the combined Conv + LSTM model is the most effective because it provides the best accuracy.
ISSN:	0094-243X 1551-7616
DOI:	10.1063/1.5082126