Zulfikar Mukarabiman
Universitas Amikom Yogyakarta, Yogyakarta, 55281, Indonesia

Published : 1 Documents Claim Missing Document
Claim Missing Document

Found 1 Documents

Extreme Gradient Boosting Algorithm to Improve Machine Learning Model Performance on Multiclass Imbalanced Dataset Yoga Pristyanto; Zulfikar Mukarabiman; Anggit Ferdita Nugraha
JOIV : International Journal on Informatics Visualization Vol 7, No 3 (2023)
Publisher : Politeknik Negeri Padang

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.30630/joiv.7.3.1102


Unbalanced conditions in the dataset often become a real-world problem, especially in machine learning. Class imbalance in the dataset is a condition where the number of minority classes is much smaller than the majority class, or the number is insufficient. Machine learning models tend to recognize patterns in the majority class more than in the minority class. This problem is one of the most critical challenges in machine learning research, so several methods have been developed to overcome it. However, most of these methods only focus on binary datasets, so few methods still focus on multiclass datasets. Handling unbalanced multiclass is more complex than handling unbalanced binary because it involves more classes than binary class datasets. With these problems, we need an algorithm with features that can support adjustments to the difficulties that arise in multiclass unbalanced datasets. One of the algorithms that have features for adjustment is the ensemble algorithm, namely Xtreme Gradient Boosting. Based on the research, our proposed method with Xtreme Gradient Boosting showed better results than the other classification and ensemble algorithms on eight datasets with five evaluation metrics indicators such as balanced accuracy, the geometric-mean, multiclass area under the curve, true positive rate, and true negative rate. In future research, we suggest combining methods at the data level and Xtreme Gradient Boosting. With the performance increase in Xtreme Gradient Boosting, it can be a solution and reference in the case of handling multiclass imbalanced problems. Besides, we also recommended testing with datasets in the form of categorical and continuous data.