JOIV : International Journal on Informatics Visualization
Vol 7, No 3 (2023)

Extreme Gradient Boosting Algorithm to Improve Machine Learning Model Performance on Multiclass Imbalanced Dataset

Yoga Pristyanto (Universitas Amikom Yogyakarta, Yogyakarta, 55281, Indonesia)
Zulfikar Mukarabiman (Universitas Amikom Yogyakarta, Yogyakarta, 55281, Indonesia)
Anggit Ferdita Nugraha (Universitas Amikom Yogyakarta, Yogyakarta, 55281, Indonesia)



Article Info

Publish Date
10 Sep 2023

Abstract

Imbalanced datasets are a common real-world problem in machine learning. Class imbalance is a condition in which the minority classes contain far fewer instances than the majority class, or too few instances to learn from. As a result, machine learning models tend to recognize patterns in the majority class better than in the minority classes. This problem is one of the most critical challenges in machine learning research, and several methods have been developed to overcome it. However, most of these methods focus on binary datasets, and few address multiclass datasets. Handling multiclass imbalance is more complex than handling binary imbalance because more classes are involved. An algorithm is therefore needed whose features can be adjusted to the difficulties that arise in multiclass imbalanced datasets. One algorithm with such features is the ensemble algorithm Extreme Gradient Boosting (XGBoost). In this study, our proposed method with Extreme Gradient Boosting showed better results than the other classification and ensemble algorithms on eight datasets across five evaluation metrics: balanced accuracy, geometric mean, multiclass area under the curve, true positive rate, and true negative rate. Given this performance gain, Extreme Gradient Boosting can serve as a solution and a reference for handling multiclass imbalance problems. For future research, we suggest combining data-level methods with Extreme Gradient Boosting. We also recommend testing on datasets that contain both categorical and continuous data.
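
To make the evaluation metrics named in the abstract concrete, the sketch below computes per-class true positive rate and true negative rate, balanced accuracy, and the geometric mean from a multiclass confusion matrix. The abstract gives no formulas, so the definitions here are the commonly used one-vs-rest ones (balanced accuracy as the mean of per-class recall, geometric mean as the n-th root of the product of per-class recalls). They are an assumption and not necessarily the authors' exact computation. The function names and the toy labels are hypothetical. Multiclass area under the curve is left out because it needs predicted probabilities rather than hard labels.

```python
from math import prod


def confusion_matrix(y_true, y_pred, labels):
    """Count matrix with rows = true class and columns = predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_rates(matrix):
    """One-vs-rest (TPR, TNR) pair for each class."""
    total = sum(sum(row) for row in matrix)
    rates = []
    for k in range(len(matrix)):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                 # class k predicted as another class
        fp = sum(row[k] for row in matrix) - tp  # another class predicted as k
        tn = total - tp - fn - fp
        tpr = tp / (tp + fn) if tp + fn else 0.0
        tnr = tn / (tn + fp) if tn + fp else 0.0
        rates.append((tpr, tnr))
    return rates


def summarize(y_true, y_pred, labels):
    """Assumed definitions: macro-averaged recall and the geometric mean of recalls."""
    rates = per_class_rates(confusion_matrix(y_true, y_pred, labels))
    tprs = [tpr for tpr, _ in rates]
    tnrs = [tnr for _, tnr in rates]
    n = len(labels)
    return {
        "per_class": dict(zip(labels, rates)),
        "balanced_accuracy": sum(tprs) / n,
        "g_mean": prod(tprs) ** (1 / n),
        "macro_tnr": sum(tnrs) / n,
    }


# Toy imbalanced example (hypothetical data): 6 / 3 / 1 instances per class.
labels = ["a", "b", "c"]
y_true = ["a"] * 6 + ["b"] * 3 + ["c"]
y_pred = ["a"] * 5 + ["b"] + ["b"] * 2 + ["a"] + ["c"]
result = summarize(y_true, y_pred, labels)
for name in ("balanced_accuracy", "g_mean", "macro_tnr"):
    print(name, round(result[name], 4))
```

Both balanced accuracy and the geometric mean weight every class equally regardless of its size, and the geometric mean drops to zero if any single class is never predicted correctly. That is why such metrics are preferred over plain accuracy on imbalanced data.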

Copyright © 2023






Journal Info

Abbrev

joiv

Subject

Computer Science & IT

Description

JOIV : International Journal on Informatics Visualization is an international peer-reviewed journal dedicated to the interchange of results of high-quality research in all aspects of Computer Science, Computer Engineering, Information Technology and Visualization. The journal publishes state-of-the-art ...