Class imbalance is a serious problem in classification because the class distribution is heavily skewed. When the majority class dominates the dataset, a classifier tends to misclassify minority-class instances. A second problem, in both classification and clustering, is high dimensionality, which can degrade an algorithm's computational cost and accuracy. In this study, class imbalance was handled with the ADASYN k-NN resampling technique, and feature selection was performed with Information Gain. The evaluation results indicate that resampling improves the classification model by raising its geometric mean score. Feature selection makes the data easier to interpret by reducing it to a smaller set of features, but it can lower accuracy. Overall, applying ADASYN k-NN together with Information Gain increased the accuracy and geometric mean scores of Decision Tree C4.5 and Naive Bayes. In future work, the proposed method will be tested on multiclass imbalanced datasets.
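The two quantities named in the abstract, the geometric mean score and Information Gain, can be illustrated with a minimal sketch. It is not the study's implementation: it assumes binary class labels and discrete feature values, the function names (geometric_mean, information_gain) are hypothetical, and the ADASYN resampling step is not shown. The geometric mean is taken here as the square root of sensitivity times specificity, the usual definition for binary imbalanced problems.

```python
import math
from collections import Counter


def geometric_mean(y_true, y_pred, positive=1):
    """Square root of sensitivity * specificity for binary labels (assumed definition)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus their entropy conditioned on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy imbalanced labels: 8 majority (0) and 2 minority (1) instances.
y_true = [0] * 8 + [1] * 2

# Predicting only the majority class gives 80% accuracy but a geometric mean of 0.
always_majority = [0] * 10
print(geometric_mean(y_true, always_majority))

# A classifier that recovers one minority instance scores above 0.
mixed = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
print(geometric_mean(y_true, mixed))

# A feature that separates the classes perfectly versus one that carries no information.
labels = [0, 0, 1, 1]
print(information_gain([0, 0, 1, 1], labels))
print(information_gain([0, 1, 0, 1], labels))
```

The toy labels show why the geometric mean is reported alongside accuracy on imbalanced data: a model that ignores the minority class can still reach high accuracy, while its geometric mean collapses to zero. Ranking features by information_gain and keeping the top-scoring ones is one common way to apply Information Gain for feature selection; the cutoff used in the study is not stated here.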
Copyrights © 2022