Bengali News Headline Categorization Using Optimized Machine Learning Pipeline

Bengali text based news portal is now very common and increasing day by day. With easy access of internet technology, reading news through online is now a regular task. Different types of news are represented in the news portal. The system presented in this paper categorizes the news headline of new...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:International journal of information engineering and electronic business 2021-02, Vol.13 (1), p.15-24
Hauptverfasser: Dhar, Prashengit, Abedin, Md. Zainal
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Bengali text based news portal is now very common and increasing day by day. With easy access of internet technology, reading news through online is now a regular task. Different types of news are represented in the news portal. The system presented in this paper categorizes the news headline of news portal or sites. Prediction is made by machine learning algorithm. Large number of collected data are trained and tested. As pre-processing tasks such as tokenization, digit removal, removing punctuation marks, symbols, and deletion of stop words are processed. A set of stop words is also created manually. Strong stop words leads to better performance. Stop words deletion plays a lead role in feature selection. For optimization, genetic algorithm is used which results in reduced feature size. A comparison is also explored without optimization process. Dataset is established by collecting news headline from various Bengali news portal and sites. Resultant output shows well performance in categorization.
ISSN:2074-9023
2074-9031
DOI:10.5815/ijieeb.2021.01.02