UKARA 1.0 Challenge Track 1: Automatic Short-Answer Scoring in Bahasa Indonesia

We describe our third-place solution to the UKARA 1.0 challenge on automated essay scoring. The task consists of a binary classification problem on two datasets | answers from two different questions. We ended up using two different models for the two datasets. For task A, we applied a random forest...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	arXiv.org 2020-02
Hauptverfasser:	Ali Akbar Septiandri, Winatmoko, Yosef Ardhito
Format:	Artikel
Sprache:	eng
Schlagworte:	Algorithms Datasets Feature extraction Regression analysis
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	We describe our third-place solution to the UKARA 1.0 challenge on automated essay scoring. The task consists of a binary classification problem on two datasets \| answers from two different questions. We ended up using two different models for the two datasets. For task A, we applied a random forest algorithm on features extracted using unigram with latent semantic analysis (LSA). On the other hand, for task B, we only used logistic regression on TF-IDF features. Our model results in F1 score of 0.812.
ISSN:	2331-8422