Elkom: Jurnal Elektronika dan Komputer
Vol 13 No 1 (2020): Juli: Jurnal Elektronika dan Komputer

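EVALUASI EKSTRAKSI FITUR KLASIFIKASI TEKS UNTUK PENINGKATAN AKURASI KLASIFIKASI MENGGUNAKAN NAIVE BAYES (Evaluation of Text Classification Feature Extraction for Improving Classification Accuracy Using Naive Bayes)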

Aji Priyambodo (Institut Teknologi dan Bisnis Semarang)
Prihati Prihati (Institut Teknologi dan Bisnis Semarang)



Article Info

Publish Date
01 Jul 2020

Abstract

Classification is one of the most widely used techniques in machine learning. Text classification is the process of assigning documents to predetermined groups or classes. In most cases, it uses labeled training data to learn the rules that are then used to classify test data into those predefined groups. This study proposes using CountVectorizer for Indonesian text classification and compares it with TF-IDF term weighting at three feature levels (character level, word level, and n-gram level) as feature extraction methods. Each method is combined with a Naive Bayes classifier and applied to the BPPPTIndToEngCorpusHalfM dataset. Classification performance is compared using 10-fold cross-validation and a 90:10 data split, and accuracy is evaluated with the F1-score and AUC. The aim is to obtain good accuracy so that the results can serve as a reference for further development with other methods. The study obtained an F1-score of 0.93 and an AUC of 0.95.
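The pipeline outlined in the abstract (feature extraction at several levels, followed by Naive Bayes and an F1-score) can be illustrated with a minimal sketch. This is not the authors' code: it uses only the Python standard library rather than scikit-learn's CountVectorizer, it works on raw counts without TF-IDF weighting, and the sentences, the labels "sport" and "economy", and all function and class names are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text, n=1):
    """Word-level n-grams (n=1 gives plain word tokens)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n=3):
    """Character-level n-grams over the lowercased text."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


class CountNaiveBayes:
    """Multinomial Naive Bayes over raw feature counts (Laplace smoothing)."""

    def __init__(self, extract, alpha=1.0):
        self.extract = extract  # function: text -> list of features
        self.alpha = alpha      # smoothing constant (assumed value)

    def fit(self, texts, labels):
        self.class_docs = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = self.extract(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for feat in self.extract(text):
                if feat in self.vocab:  # skip features unseen in training
                    score += math.log((counts[feat] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, not taken from the BPPPTIndToEngCorpusHalfM dataset.
train_texts = [
    "tim sepak bola menang pertandingan",
    "pemain mencetak gol di pertandingan",
    "harga saham naik di pasar",
    "bank menaikkan suku bunga pasar",
]
train_labels = ["sport", "sport", "economy", "economy"]
test_texts = ["pemain bola mencetak gol", "harga saham bank naik"]
test_labels = ["sport", "economy"]

extractors = {
    "character level (3-grams)": char_ngrams,
    "word level": word_ngrams,
    "n-gram level (word 2-grams)": lambda t: word_ngrams(t, 2),
}
for name, extract in extractors.items():
    model = CountNaiveBayes(extract).fit(train_texts, train_labels)
    predictions = [model.predict(t) for t in test_texts]
    print(name, f1_score(test_labels, predictions, positive="sport"))
```

The same classifier is reused for every feature level, so any difference in score comes from the extraction step alone, which is the comparison the study describes. A faithful reproduction would also need TF-IDF weighting, AUC, 10-fold cross-validation, and the original dataset.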

Copyrights © 2020






Journal Info

Abbrev

elkom

Publisher

Subject

Education

Description

Elkom : Jurnal Elektronika dan Komputer is a journal published by SEKOLAH TINGGI ELEKTRONIKA DAN KOMPUTER (STEKOM). The journal is published twice a year, in July and December. The mission of Jurnal ELKOM is to disseminate, develop, and facilitate the results ...