Classification of Indonesian-Language Tweets Containing Hate Speech Using the Improved K-Nearest Neighbor Method with BM25F Weighting
Nurdifa Febrianti; Indriati Indriati; Muhammad Tanzil Furqon
Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer Vol 3 No 10 (2019): October 2019
Publisher: Fakultas Ilmu Komputer (FILKOM), Universitas Brawijaya

Abstract

Hate speech is a verbal act of hatred that targets a group of people or part of a particular community. In Indonesia, hate speech is increasingly common, especially on text-based social media such as Twitter. This motivated the present research, which identifies hate speech on Twitter by classifying tweets, specifically those written in Indonesian. The authors use Improved K-Nearest Neighbor (IKNN) with BM25F term weighting, a weighting scheme that takes into account the fields (streams) of a document. Each tweet used as a training or test document therefore consists of two streams: the tweet text and the hashtags. K-Fold Cross Validation (with K = 5) was used to test the parameter k for IKNN classification and the parameters bs, vs, and k1 for BM25F weighting, with 400 training documents and 100 test documents. The results show that the choice of stream weight values in BM25F appreciably influences the results of the IKNN classification. The best F-Measure, Accuracy, Precision, and Recall, averaged over the 5 folds, were 79.77%, 68.80%, 68.80%, and 89.92%, respectively, obtained with k = 70, bs = 0.6, v1 = 2, v2 = 5, and k1 = 2.
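
To make the role of the stream weights concrete, the following is a minimal Python sketch of BM25F term weighting over two streams. It is not the authors' implementation: the function name `bm25f_weights`, the stream names `tweet` and `hashtag`, the use of a single length-normalization value `b` for both streams, the smoothed IDF formula, and the assignment of weight 2 to the tweet text and 5 to the hashtags are all assumptions made for illustration. The abstract reports v1 = 2 and v2 = 5 but does not say which stream each belongs to.

```python
import math
from collections import Counter


def bm25f_weights(docs, stream_weights, b=0.6, k1=2.0):
    """Compute a BM25F-style weight for every term of every document.

    docs: list of dicts mapping a stream name to a list of tokens.
    stream_weights: dict mapping a stream name to its weight (v_s).
    b: length-normalization strength, shared by all streams (assumption).
    k1: term-frequency saturation parameter.
    """
    n_docs = len(docs)
    streams = list(stream_weights)

    # Average token count of each stream across the collection.
    avg_len = {s: sum(len(d[s]) for d in docs) / n_docs for s in streams}

    # Document frequency: number of documents containing the term in any stream.
    df = Counter()
    for d in docs:
        terms = set()
        for s in streams:
            terms.update(d[s])
        df.update(terms)

    weights = []
    for d in docs:
        # Combine per-stream frequencies into one weighted pseudo-frequency.
        pseudo_tf = Counter()
        for s in streams:
            if avg_len[s] > 0:
                norm = (1 - b) + b * (len(d[s]) / avg_len[s])
            else:
                norm = 1.0
            for term, tf in Counter(d[s]).items():
                pseudo_tf[term] += stream_weights[s] * tf / norm

        doc_weights = {}
        for term, tf in pseudo_tf.items():
            # Smoothed IDF that stays positive (an assumed variant).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            doc_weights[term] = tf / (k1 + tf) * idf
        weights.append(doc_weights)
    return weights


# Hypothetical toy collection: two tokenized tweets with their hashtags.
docs = [
    {"tweet": ["a", "b"], "hashtag": ["x"]},
    {"tweet": ["c", "d"], "hashtag": ["y"]},
]
weights = bm25f_weights(docs, {"tweet": 2, "hashtag": 5}, b=0.6, k1=2.0)
```

In this toy collection the terms `a` and `x` each occur once in a single document, so they differ only in the stream they come from, and the higher hashtag weight gives `x` the larger final weight. The resulting per-document weight dictionaries could then serve as the vectors compared by a nearest-neighbor classifier.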