Unsupervised Characterization and Visualization of Students’ Academic Performance Features


Bibliographic details
Published in:	Computer and information science (Toronto) 2019-04, Vol.12 (2), p.103
Main authors:	Inyang, Udoinyang G., Umoh, Uduak A., Nnaemeka, Ifeoma C., Robinson, Samuel A.
Format:	Article
Language:	English
Online access:	Full text

Description
Abstract:	The sheer size of student datasets makes it difficult to find patterns associated with students' academic performance (AP) using conventional methods. This has increased the rate of drop-outs, of graduands with a weak class of degree (CoD), and of students who spend more than the minimum stipulated duration of studies. It is therefore necessary to assess students' AP with educational data mining (EDM) tools so that students who are likely to perform poorly can be identified at an early stage of their studies. This paper explores k-means and the self-organizing map (SOM) for mining knowledge about the natural number of clusters in a student dataset and about the association among the input features, using selected demographic, pre-admission and first-year performance attributes. Matlab 2015a was the programming environment, and the dataset consists of nine sets of computer science graduands. Cluster validity assessment with k-means discovered four clusters, with the correlation metric yielding the highest mean silhouette value of 0.5912. SOM provided a hexagonal-grid visualization of feature component planes and scatter plots of each significant input attribute. The results show that the significant attributes were highly correlated with each other except entry mode (EM), indicating that the impact of EM on CoD varies with students irrespective of mode of admission. SOM also discovered four distinct clusters in the dataset: 7.7% belonged to cluster 1 (first class) and 25% to cluster 2 (second class upper), while clusters 3 and 4 accounted for 35% each. This validates the k-means results, confirms the importance of early detection of students' AP, and confirms the effectiveness of SOM as a cluster validity tool. As further work, the labels from SOM will be associated with records in the dataset for association rule mining, supervised learning and prediction of students' AP.
ISSN:	1913-8989 1913-8997
DOI:	10.5539/cis.v12n2p103
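
The cluster validity step described in the abstract (running k-means for several candidate cluster counts and keeping the count with the highest mean silhouette value under a correlation metric) can be illustrated with a minimal sketch. This is not the authors' Matlab code: it is a pure-Python approximation, the data are synthetic stand-ins for student records, and the helper names (standardize, corr_dist, kmeans, mean_silhouette) as well as the farthest-point seeding are assumptions made for the example.

```python
import math
import random


def standardize(p):
    """Center a feature vector and scale it to unit length."""
    mean = sum(p) / len(p)
    centered = [x - mean for x in p]
    norm = math.sqrt(sum(x * x for x in centered))
    return [x / norm for x in centered] if norm > 0 else centered


def corr_dist(a, b):
    """Correlation distance: 1 - Pearson correlation of two vectors."""
    a, b = standardize(a), standardize(b)
    return 1.0 - sum(x * y for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """k-means under correlation distance (assumed farthest-point seeding)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(corr_dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: corr_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [standardize(p) for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def mean_silhouette(points, labels):
    """Mean silhouette value; points in singleton clusters score 0."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    scores = []
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            d = [corr_dist(p, q) for j, q in enumerate(points) if labels[j] == c and j != i]
            if d:
                mean_dist[c] = sum(d) / len(d)
        if labels[i] not in mean_dist:
            scores.append(0.0)
            continue
        a = mean_dist[labels[i]]                                   # cohesion
        b = min(v for c, v in mean_dist.items() if c != labels[i])  # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Synthetic data (hypothetical): four feature profiles, ten noisy records each.
rng = random.Random(0)
profiles = [[1, 2, 3, 4, 5], [5, 4, 3, 2, 1], [1, 5, 1, 5, 1], [5, 1, 5, 1, 5]]
points = [[v + rng.gauss(0, 0.2) for v in prof] for prof in profiles for _ in range(10)]

# Score each candidate cluster count and keep the best one.
scores = {k: mean_silhouette(points, kmeans(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 4))
```

On real student records the silhouette values would be far lower than on this cleanly separated toy data (the paper reports 0.5912), and a production run would normally use several random restarts rather than a single deterministic seeding.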