Heterogeneous Educational Data Classification at the Course Level

Bibliographic details
Published in: Vietnam journal of computer science 2021-08, Vol.8 (3), p.337-355
Main authors: Phuc, Nguyen Hua Gia; Chau, Vo Thi Ngoc
Format: Article
Language: English
Description
Abstract: Teaching and learning activities in a course are now heavily supported by information technologies. Forums are one such technology, used to encourage students to communicate with lecturers outside the traditional classroom. The free-form textual posts in these forums express the problems students are facing, as well as their interest and activity with respect to each course topic. Our work considers exploiting such course forum text for course-level student prediction. Because course forum texts have hierarchical structures, we propose a solution that combines a deep convolutional neural network (CNN) with a loss function to extract features from the textual data in a way that supports more correct recognition of instances of the minority class, which consists of students who fail. In addition, other numeric data are examined and used so that all students, with and without posts, can be covered by the prediction task. Compared to existing works, ours is therefore the first to define and solve this prediction task with heterogeneous educational data at the course level. In the proposed solution, Random Forests are suggested as an effective ensemble model suited to our heterogeneous data, since many single prediction models (random trees) can be built over various subspaces with different random features in a supervised learning process. Experimental results from an empirical evaluation on two real datasets show that a heterogeneous combination of textual and numeric data with a Random Forest model can enhance the effectiveness of our solution. The best accuracy and F-measure values can be obtained for early predictions of students with either success or failure. Such improved predictions can help both students and lecturers stay aware of students' study progress and provide support in time for ultimate success in the course.
ISSN: 2196-8888, 2196-8896
DOI:10.1142/S2196888821500147
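
The abstract reports both accuracy and F-measure and stresses recognition of the minority class (students who fail). The following minimal sketch illustrates why the two metrics can diverge on imbalanced data. It assumes the balanced F1 variant of the F-measure, and the labels, predictions, and function names are hypothetical illustrations, not data or code from the article.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, positive="fail"):
    """Return (precision, recall, F1) for the class named by `positive`.

    F1 is assumed here; the abstract only says "F-measure".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical course: 8 students pass, 2 fail (the minority class).
y_true = ["pass"] * 8 + ["fail"] * 2

# Baseline that always predicts the majority class.
y_majority = ["pass"] * 10

# Hypothetical model: one false alarm, one failing student caught, one missed.
y_model = ["pass"] * 7 + ["fail"] + ["fail", "pass"]

for name, y_pred in [("majority", y_majority), ("model", y_model)]:
    p, r, f1 = f_measure(y_true, y_pred)
    print(name, "accuracy:", accuracy(y_true, y_pred), "F1(fail):", f1)
```

Both sets of predictions are 80% accurate, yet only the second identifies any failing student, which is what the F-measure on the minority class captures.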