Building of Informatics, Technology and Science
Vol 4 No 4 (2023): March 2023

Undersampling and K-Fold Random Forest for Imbalanced Class Classification (Undersampling dan K-Fold Random Forest Untuk Klasifikasi Kelas Tidak Seimbang)

Laila Qadrini (Universitas Sulawesi Barat, Majene)



Article Info

Publish Date
31 Mar 2023

Abstract

Classification in data mining is a modelling process that describes and distinguishes data classes in order to estimate the class of an object whose class is unknown. Because classification applies to many domains, a large number of classification algorithms have been developed over time, but one problem is encountered often: data imbalance. Class imbalance is a condition in which the number of observations per class is unequal, or differs significantly between classes. Most classification datasets do not have the same number of observations in each class. Imbalance is not a problem when the class proportions differ only slightly. Left untreated, however, it can cause problems: the resulting model's predictions tend toward the majority class, so the minority class contributes little to the model. Resampling is one of the approaches most often used to handle imbalanced classes. The purpose of this research is to apply the Resampling Undersampling Random Forest and Random Forest K-Fold Undersampling algorithms to the Breast Cancer Diagnostic dataset from the UCI Machine Learning Repository. Undersampling was chosen because it produces better accuracy than oversampling. The recall of the 10-fold (K-Fold) Random Forest algorithm is 83%, and the recall of the Undersampling Random Forest algorithm is 65%.
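The ideas named in the abstract (random undersampling, K-fold splitting, and recall) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses only the standard library, a made-up toy label set rather than the UCI Breast Cancer Diagnostic data, and no Random Forest model. The function names `random_undersample`, `k_fold_indices`, and `recall`, the class labels, and the 90/10 split are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until each class
    has as many rows as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    keep = []
    for idx in by_class.values():
        keep.extend(rng.sample(idx, n_min))
    keep.sort()
    return [rows[i] for i in keep], [labels[i] for i in keep]


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k folds
    of nearly equal size (hypothetical helper)."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def recall(y_true, y_pred, positive):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy imbalanced data (assumed, not from the paper): 90 majority, 10 minority.
labels = ["benign"] * 90 + ["malignant"] * 10
rows = [[float(i)] for i in range(100)]

# A model that always predicts the majority class looks accurate
# but never finds the minority class.
majority_pred = ["benign"] * len(labels)
baseline_accuracy = sum(t == p for t, p in zip(labels, majority_pred)) / len(labels)
baseline_recall = recall(labels, majority_pred, positive="malignant")

# Undersample to balance the classes, then split the result into 10 folds.
X_bal, y_bal = random_undersample(rows, labels)
class_counts = Counter(y_bal)
folds = k_fold_indices(len(y_bal), k=10)

print(baseline_accuracy, baseline_recall)
print(dict(class_counts), [len(f) for f in folds])
```

In a full pipeline, each fold would serve once as the test set while a classifier such as Random Forest is trained on the remaining folds; it is usually safer to undersample only the training folds so that the test fold keeps the original class proportions.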

Copyrights © 2023






Journal Info

Abbrev

bits

Publisher

Subject

Computer Science & IT

Description

Building of Informatics, Technology and Science (BITS) is an open-access journal that publishes scientific articles reporting research in information technology and computing. Papers submitted to this journal are checked for plagiarism and peer-reviewed first to maintain quality. ...