Multi-Stage Feature Selection for Optimizing Student Dropout Prediction
Abstract
The high rate of student dropout is a significant challenge in higher education. Accurate dropout prediction depends on both a strong model and the selection of relevant features. This study proposes a multi-stage feature selection framework to improve prediction accuracy, consisting of three stages: Variance Threshold, Mutual Information, and Boruta. The classification model is built with the Extreme Gradient Boosting (XGBoost) algorithm and evaluated through stratified 10-fold cross-validation. The dataset comprises records of 4,423 students, covering academic, demographic, and socioeconomic information. Boruta confirmed a total of 18 features as relevant. The XGBoost model trained on the selected features shows high performance, with an accuracy of 90.77%, precision of 92.07%, recall of 83.68%, and an F1-score of 87.63%. These results show that integrating filter and wrapper approaches in the feature selection process effectively improves the performance of the dropout prediction model. The framework isolates the most important features and produces a more stable and efficient classification model in the context of higher education.
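The three-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the Boruta stage is approximated with scikit-learn's `SelectFromModel` over a random forest (the original Boruta algorithm uses shadow features, typically via the third-party BorutaPy package), and XGBoost is replaced by scikit-learn's `GradientBoostingClassifier` for portability. Dataset dimensions and all thresholds here are illustrative assumptions.

```python
# Sketch of a three-stage feature-selection pipeline (filter + wrapper)
# followed by a boosted-tree classifier and stratified 10-fold CV.
# Assumptions: synthetic data; Boruta approximated by SelectFromModel;
# XGBoost replaced by GradientBoostingClassifier.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import (SelectFromModel, SelectKBest,
                                       VarianceThreshold, mutual_info_classif)
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the student dataset (36 hypothetical features).
X, y = make_classification(n_samples=1000, n_features=36, n_informative=10,
                           n_redundant=5, random_state=42)

# Stage 1 - filter: drop near-constant features.
X1 = VarianceThreshold(threshold=0.01).fit_transform(X)

# Stage 2 - filter: keep the features most informative about the label.
X2 = SelectKBest(mutual_info_classif, k=25).fit_transform(X1, y)

# Stage 3 - wrapper: Boruta-style relevance check via a random forest
# (keeps features with above-median importance).
rf = RandomForestClassifier(n_estimators=100, random_state=42)
X3 = SelectFromModel(rf, threshold="median").fit_transform(X2, y)

# Final model, evaluated with stratified 10-fold cross-validation.
clf = GradientBoostingClassifier(random_state=42)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(clf, X3, y, cv=cv, scoring="accuracy")
print(f"selected features: {X3.shape[1]}, mean CV accuracy: {scores.mean():.3f}")
```

In practice, the wrapper stage would be replaced by a genuine Boruta run, which iteratively compares each feature's importance against randomized "shadow" copies before confirming or rejecting it.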
Copyright (c) 2025 ITEGAM-JETIA

This work is licensed under a Creative Commons Attribution 4.0 International License.