Turkish Text Classification: From Lexicon Analysis to Bidirectional Transformer
Main Author: | |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Text classification is increasingly used in both academic and industry
settings. Although rule-based methods have been fairly successful, supervised
machine learning has been shown to be the most successful approach for most
languages, with most of the research conducted on English. In this article,
lexicon analysis, support vector machines, and extreme gradient boosting are
evaluated on Turkish text classification and sentiment analysis, and a
pretrained transformer-based classifier is proposed that outperforms previous
methods for Turkish text classification. For text classification, all machine
learning models proposed in the article are domain-independent and do not
require any task-specific modifications. |
---|---|
DOI: | 10.48550/arxiv.2104.11642 |