Automated Essay Scoring by Capturing Relative Writing Quality

Bibliographic Details
Published in: The Computer Journal, September 2014, Vol. 57 (9), pp. 1318-1330
Authors: Chen, Hongbo; Xu, Jungang; He, Ben
Format: Article
Language: English
Online access: Full text
Description
Abstract: Automated essay-scoring (AES) systems use computational techniques and algorithms to automatically rate essays written in an educational setting, greatly reducing the workload of human raters. AES is usually addressed as a classification or regression problem, to which classical machine-learning algorithms such as k-nearest neighbors and support vector machines are applied. In this paper, we argue that essay rating is based on comparing the writing quality of essays, and we instead treat AES as a ranking problem, capturing the difference in writing quality between essays. We propose a rank-based approach that trains an essay-rating model with learning-to-rank algorithms, which have been widely used in many information retrieval and social Web mining tasks. Various linguistic and statistical features are extracted to support the learning algorithms. Extensive experiments on two public English essay datasets, the Automated Student Assessment Prize (ASAP) dataset and the Chinese Learner English Corpus (CLEC), show that our proposed approach based on pairwise learning outperforms previous classification- and regression-based methods on all 15 topics. Finally, an analysis of the importance of the extracted features reveals that content, organization, and structure are the main factors affecting the ratings of essays written by native English speakers, while non-native speakers tend to lose points for improper term usage, limited syntactic complexity, and grammar errors.
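
To make the abstract's pairwise learning-to-rank idea concrete, below is a minimal, hypothetical sketch: it reduces ranking to binary classification over feature differences, in the style of RankSVM. The three toy features and the use of scikit-learn's LinearSVC are illustrative assumptions, not the paper's actual feature set or learner.

# Minimal, hypothetical sketch of pairwise learning-to-rank for AES.
# Assumptions (not from the paper): a RankSVM-style reduction to binary
# classification over feature differences, and three toy features standing
# in for the linguistic/statistical features the abstract mentions.
import numpy as np
from itertools import combinations
from sklearn.svm import LinearSVC

def to_pairs(X, y):
    # For each pair of essays with different human scores, emit the
    # feature difference, labeled +1 if the first essay is rated higher
    # and -1 otherwise. Ties carry no preference signal and are skipped.
    diffs, labels = [], []
    for i, j in combinations(range(len(y)), 2):
        if y[i] == y[j]:
            continue
        diffs.append(X[i] - X[j])
        labels.append(1 if y[i] > y[j] else -1)
    return np.array(diffs), np.array(labels)

# Toy data: one row per essay; columns are hypothetical features
# (length in hundreds of words, lexical diversity, grammar-error rate).
X = np.array([[2.5, 0.62, 0.04],
              [4.0, 0.71, 0.02],
              [1.5, 0.55, 0.09],
              [5.2, 0.80, 0.01]])
y = np.array([2, 3, 1, 4])  # human ratings on this topic

X_pairs, y_pairs = to_pairs(X, y)
model = LinearSVC(C=1.0).fit(X_pairs, y_pairs)

# The learned weight vector induces a total order over essays:
# a higher dot product means higher predicted writing quality.
quality = X @ model.coef_.ravel()
print(np.argsort(-quality))  # essay indices, best to worst

In the paper's setting, the induced ranking would then be mapped back to each topic's score scale; this sketch stops at ordering the essays by predicted quality.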
ISSN: 0010-4620 (print), 1460-2067 (online)
DOI: 10.1093/comjnl/bxt117