APJITM UKM

Article Info

Combining Cluster Quality Index and Supervised Learning to Predict Students’ Academic Performance

Suhaila Zainudin, Rapi’ah Ibrahim, Hafiz Mohd Sarim
dx.doi.org/10.17576/apjitm-2024-1301-06

Pages 72–92

Abstract

Predicting students’ academic performance can help an institution take timely action, such as planning intervention measures to improve students’ academic achievement. This study aims to identify the main factors contributing to postgraduate students’ academic performance. Early predictions can help prevent student dropouts, especially at the postgraduate level. The results of this study are significant for supporting institutional decision-making and for formulating the best strategies for the primary stakeholders (students). The study combines two data mining tasks, clustering and classification, to carry out the prediction. First, clustering with the K-Means algorithm identifies different student groups. Then, the clusters are evaluated with three cluster quality indexes, namely the Silhouette Coefficient, the Calinski-Harabasz Index and the Davies-Bouldin Index, to determine the best clusters. The best number of clusters is selected based on the Silhouette Coefficient score because its values fall within a fixed range of -1 to 1. The best cluster is further analysed using classification to predict students’ academic performance. Three classification algorithms were selected: Logistic Regression (LR), Support Vector Machine (SVM) and Decision Tree (DT). The results show that the LR model predicts students’ academic performance levels better than SVM and DT.
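The two-stage approach described in the abstract can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic (generated with make_blobs), the range of cluster counts (2 to 6) is arbitrary, and the label y is a hypothetical stand-in for the academic performance level. How the study passes the selected clusters to the classifiers is not stated in the abstract, so the sketch simply trains the three classifiers on the same features.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic data standing in for student records.
# y is a hypothetical "performance level" label, not data from the study.
X, y = make_blobs(n_samples=300, centers=3, n_features=4, random_state=0)
X = StandardScaler().fit_transform(X)

# Stage 1: run K-Means for several cluster counts and score each result
# with the three cluster quality indexes named in the abstract.
scores = {}
for k in range(2, 7):  # assumed search range
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = {
        "silhouette": silhouette_score(X, labels),                # in [-1, 1], higher is better
        "calinski_harabasz": calinski_harabasz_score(X, labels),  # unbounded, higher is better
        "davies_bouldin": davies_bouldin_score(X, labels),        # >= 0, lower is better
    }

# Select the number of clusters by the Silhouette Coefficient.
best_k = max(scores, key=lambda k: scores[k]["silhouette"])

# Stage 2: compare the three classifiers with 5-fold cross-validated accuracy.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "DT": DecisionTreeClassifier(random_state=0),
}
accuracy = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

print("best k by silhouette:", best_k)
for name, acc in accuracy.items():
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```

Of the three indexes, only the Silhouette Coefficient has a fixed range, which makes its scores easy to compare across different cluster counts; the other two are reported alongside it as a cross-check.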