Register: Jurnal Ilmiah Teknologi Sistem Informasi
Vol 7, No 1 (2021): January

An in-depth performance analysis of the oversampling techniques for high-class imbalanced dataset

Wibowo, Prasetyo
Fatichah, Chastine



Article Info

Publish Date
28 Feb 2021

Abstract

Class imbalance occurs when the classes in a dataset are not equally represented, so that the majority class outnumbers the minority class. The degree of imbalance may range from mild to severe. High-class imbalance can distort overall classification accuracy, since the model is likely to assign most of the data to the majority class. Such a model gives biased results, and its performance on the minority class often has little influence on the overall score. Oversampling is one way to deal with high-class imbalance, but only a few oversampling techniques are used in practice to address data imbalance. This study presents an in-depth performance analysis of oversampling techniques for the high-class imbalance problem. Applying oversampling balances the data in each class so that modeling yields unbiased evaluation results. We compare the performance of Random Oversampling (ROS), ADASYN, SMOTE, and Borderline-SMOTE. Each oversampling technique is combined with machine learning methods such as Random Forest, Logistic Regression, and k-Nearest Neighbor (KNN). The test results show that Random Forest with Borderline-SMOTE performs best among all oversampling techniques, with 0.9997 accuracy, 0.9474 precision, 0.8571 recall, 0.9000 F1-score, 0.9388 ROC-AUC, and 0.8581 PRAUC.
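
To make the two simplest techniques named in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation (the abstract does not say which tooling the study used). It assumes that Random Oversampling duplicates existing minority samples and that SMOTE creates synthetic samples by interpolating between a minority sample and one of its k nearest minority neighbours. The function names `random_oversample` and `smote_like`, the parameter `k`, and the toy dataset are hypothetical and chosen only for illustration. ADASYN and Borderline-SMOTE, which additionally weight or filter the samples used for interpolation, are not shown.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """ROS sketch: duplicate random samples of each smaller class
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))  # exact copy of an existing sample
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, minority, k=2, seed=0):
    """SMOTE-style sketch: add synthetic minority points on the line segment
    between a minority sample and one of its k nearest minority neighbours.
    Assumes the minority class has at least two samples."""
    rng = random.Random(seed)
    pool = [x for x, lab in zip(X, y) if lab == minority]
    n_needed = max(Counter(y).values()) - len(pool)
    X_out, y_out = list(X), list(y)
    for _ in range(n_needed):
        i = rng.randrange(len(pool))
        base = pool[i]
        others = pool[:i] + pool[i + 1:]
        # k nearest minority neighbours by squared Euclidean distance
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        X_out.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
        y_out.append(minority)
    return X_out, y_out


# Hypothetical toy data: 8 majority samples (class 0), 3 minority samples (class 1).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3),
     (0.1, 0.4), (0.4, 0.2), (0.2, 0.5), (0.5, 0.1),
     (2.0, 2.0), (2.2, 2.1), (2.1, 2.4)]
y = [0] * 8 + [1] * 3

X_ros, y_ros = random_oversample(X, y)
X_sm, y_sm = smote_like(X, y, minority=1)

print("original:", Counter(y))
print("after ROS:", Counter(y_ros))
print("after SMOTE-style:", Counter(y_sm))
```

In both cases the resampled data would be used only to fit the classifier; evaluating on untouched test data keeps metrics such as precision, recall, and PRAUC meaningful for the minority class.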

Copyrights © 2021






Journal Info

Abbrev

register

Publisher

Subject

Computer Science & IT

Description

Register: Jurnal Ilmiah Teknologi Sistem Informasi is published by the Department of Information Systems, Unipdu Jombang. Register is published twice a year, in January and July. Register includes research in the fields of Information Technology, Information Systems Engineering, Intelligent Business Systems, ...