Bengali News Headline Categorization Using Optimized Machine Learning Pipeline
Published in: | International Journal of Information Engineering and Electronic Business, 2021-02, Vol. 13 (1), pp. 15-24 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Bengali-language news portals are now common, and their number grows daily. With easy access to the internet, reading news online has become a routine activity. News portals publish many different types of news. The system presented in this paper categorizes the headlines of news portals and sites. Prediction is made by a machine learning algorithm. A large collection of data is used for training and testing. Pre-processing consists of tokenization, digit removal, removal of punctuation marks and symbols, and deletion of stop words. A set of stop words was also created manually. A strong stop-word list leads to better performance, and stop-word deletion plays a leading role in feature selection. For optimization, a genetic algorithm is used, which reduces the feature size. A comparison with results obtained without the optimization process is also presented. The dataset was built by collecting news headlines from various Bengali news portals and sites. The results show good categorization performance. |
---|---|
ISSN: | 2074-9023 2074-9031 |
DOI: | 10.5815/ijieeb.2021.01.02 |
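
The abstract names two technical steps: pre-processing (tokenization, removal of digits, punctuation, and symbols, stop-word deletion) and genetic-algorithm feature reduction. The following is a minimal sketch of how such steps might look, not the authors' implementation. Everything in it is an assumption made for illustration: the function names `preprocess` and `genetic_select`, the tiny stop-word list, the sample headlines, the GA settings, and especially `toy_fitness`, which stands in for whatever classifier-based score a real pipeline would use.

```python
import random
import unicodedata

# Hypothetical stop-word list for illustration only; the paper's manually
# built list is not reproduced in this record.
STOP_WORDS = {"ও", "এবং", "এই", "থেকে", "করে"}

def preprocess(headline, stop_words=STOP_WORDS):
    """Tokenize a headline, dropping digits, punctuation, symbols, and stop words."""
    cleaned = []
    for ch in headline:
        # Unicode categories: N* = numbers (incl. Bengali digits),
        # P* = punctuation, S* = symbols. Replace these with a space.
        major = unicodedata.category(ch)[0]
        cleaned.append(" " if major in "NPS" else ch)
    tokens = "".join(cleaned).split()  # whitespace tokenization
    return [t for t in tokens if t not in stop_words]

def genetic_select(n_features, fitness, pop_size=20, generations=30,
                   mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask).

    Assumes n_features >= 2 (needed for one-point crossover).
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_gen = [best[:]]  # elitism: keep the best mask unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: the fitter of two random masks is a parent.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with a small per-gene probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)
    return best

# Toy usage with made-up headlines.
headlines = ["নতুন খেলা ও নতুন দল ২০২১!", "বাজার থেকে নতুন দাম: ১০%"]
docs = [set(preprocess(h)) for h in headlines]
vocab = sorted(set().union(*docs))

def toy_fitness(mask):
    """Stand-in score: reward words seen in one headline only, penalize the rest."""
    score = 0.0
    for word, keep in zip(vocab, mask):
        if keep:
            doc_freq = sum(word in d for d in docs)
            score += 1.0 if doc_freq == 1 else -1.0
    return score

mask = genetic_select(len(vocab), toy_fitness)
selected = [w for w, keep in zip(vocab, mask) if keep]
```

In a real experiment the fitness function would more plausibly be something like the validation accuracy of a classifier trained on the masked features, and the comparison mentioned in the abstract would correspond to training on the full vocabulary versus on `selected`.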