METHOMIKA: Jurnal Manajemen Informatika & Komputerisasi Akuntansi
Vol. 6 No. 2 (2022): METHOMIKA: Jurnal Manajemen Informatika & Komputerisasi Akuntansi

TEXT MINING AND MULTI-LABEL CLASSIFICATION USING XGBOOST

Rimbun Siringoringo (Universitas Methodist Indonesia)
Jamaluddin Jamaluddin (Universitas Methodist Indonesia)
Resianta Perangin-angin (Universitas Methodist Indonesia)



Article Info

Publish Date
31 Oct 2022

Abstract

Conventional classification assigns a single label to each instance. Multi-label classification is more complex because a large number of labels results in more classes, and because the labels may depend on one another. Traditional binary classification only determines whether a text is positive or negative. This approach is sub-optimal because the relationships between labels cannot be determined. Multi-label classification addresses this weakness: a document may carry several labels at once, and those labels may be semantically correlated. This research performs multi-label classification on research article texts using an ensemble classifier, XGBoost. Classification performance is evaluated using the confusion matrix, accuracy, and F1 score, and the model is also evaluated by comparing XGBoost with Logistic Regression. Using a train-test split and cross-validation, Logistic Regression obtained an average training and testing accuracy of 0.81 and an average F1 score of 0.47, while XGBoost obtained an average accuracy of 0.88 and an average F1 score of 0.78. The results show that the XGBoost classifier can deliver good classification performance.
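The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes each label is scored as an independent binary problem and that the "average" figures are unweighted means over labels; the abstract does not state how the averages were computed, so this is an assumption. The label names and the toy data are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for one binary label column."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of documents for which this label was predicted correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when the label never occurs."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def per_label_report(
    Y_true: List[List[int]], Y_pred: List[List[int]], label_names: List[str]
) -> Dict[str, Dict[str, float]]:
    """Score every label column separately (assumed averaging scheme)."""
    report = {}
    for j, name in enumerate(label_names):
        col_true = [row[j] for row in Y_true]
        col_pred = [row[j] for row in Y_pred]
        tp, fp, fn, tn = confusion_counts(col_true, col_pred)
        report[name] = {"accuracy": accuracy(tp, fp, fn, tn), "f1": f1_score(tp, fp, fn)}
    return report


def macro_average(report: Dict[str, Dict[str, float]], metric: str) -> float:
    """Unweighted mean of one metric over all labels."""
    return sum(scores[metric] for scores in report.values()) / len(report)


# Hypothetical toy data: six documents, three labels, one row per document.
LABELS = ["computer_science", "mathematics", "physics"]
Y_TRUE = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1], [1, 0, 1], [0, 1, 0]]
Y_PRED = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0], [1, 0, 1], [0, 1, 1]]

REPORT = per_label_report(Y_TRUE, Y_PRED, LABELS)
MACRO_ACCURACY = macro_average(REPORT, "accuracy")
MACRO_F1 = macro_average(REPORT, "f1")
```

On imbalanced labels, accuracy can stay high while F1 drops, because true negatives count toward accuracy but not toward F1. This may explain why the two classifiers differ far more in F1 score than in accuracy.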

Copyrights © 2022






Journal Info

Abbrev

methomika

Publisher

Subject

Computer Science & IT; Economics, Econometrics & Finance

Description

Information Systems; Management Information Systems; Accounting Information Systems; Database Management; Web and Mobile Application Development; Decision Support Systems; Graphic Design and Multimedia; Information Systems Audit; Other topics relevant to the field of Informatics Management; Other topics ...