IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
Vol 16, No 1 (2022): January

Selection of the Best K-Gram Value on Modified Rabin-Karp Algorithm

Wahyu Hidayat (Magister Teknik Informatika, Universitas Amikom Yogyakarta, Yogyakarta)
Ema Utami (Magister Teknik Informatika, Universitas Amikom Yogyakarta, Yogyakarta)
Andi Sunyoto (Magister Teknik Informatika, Universitas Amikom Yogyakarta, Yogyakarta)



Article Info

Publish Date
31 Jan 2022

Abstract

The Rabin-Karp algorithm detects text similarity using hashing. Related studies have modified the hashing process, but none has investigated the best value of k in the K-Gram process. In the stemming stage, the Nazief & Adriani algorithm is used to reduce words to their base forms. This study tests several K-Gram values to determine the best one. The analysis uses the public Ukara Enhanced dataset obtained from Kaggle, which contains 12215 records. The student essay answers comprise 258 records in group A and 305 in group B; each answer is compared with the answers of the other members of the same group. The results show that k = 3 performs best: it yields the highest number of results interpreted as 1-14% (Little degree of similarity) and 15-50% (Medium level of similarity), whereas k = 5, 7, and 9 yield the highest number of results interpreted as 0%-0.99% (Document is different). However, when the compared essay answers are 100% similar (Exactly the same), the value of k in K-Gram does not affect the result.
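
The effect of k described above can be illustrated with a minimal Python sketch. It is not the authors' modified algorithm: the preprocessing rule, the hash base and modulus, the use of the Dice coefficient as the similarity measure, and all function names are assumptions made for illustration, and Nazief & Adriani stemming is omitted because it requires an Indonesian root-word dictionary. Only the interpretation labels named in the abstract are reproduced; ranges the abstract does not name are left unlabelled.

```python
import re


def preprocess(text):
    """Lowercase and drop everything except letters and digits (assumed rule)."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def kgrams(text, k):
    """Return every length-k substring (k-gram) of text, in order."""
    return [text[i:i + k] for i in range(len(text) - k + 1)]


def rolling_hashes(text, k, base=256, mod=2**61 - 1):
    """Hash every k-gram with a Rabin-Karp style polynomial rolling hash."""
    if len(text) < k:
        return []
    high = pow(base, k - 1, mod)  # weight of the character leaving the window
    h = 0
    for ch in text[:k]:
        h = (h * base + ord(ch)) % mod
    hashes = [h]
    for i in range(k, len(text)):
        h = (h - ord(text[i - k]) * high) % mod  # drop the oldest character
        h = (h * base + ord(text[i])) % mod      # append the newest character
        hashes.append(h)
    return hashes


def similarity(a, b, k):
    """Dice coefficient, in percent, over the sets of k-gram hashes (assumed measure)."""
    ha = set(rolling_hashes(preprocess(a), k))
    hb = set(rolling_hashes(preprocess(b), k))
    if not ha and not hb:
        return 0.0  # both texts are shorter than k, so nothing can be compared
    return 200.0 * len(ha & hb) / (len(ha) + len(hb))


def interpret(percent):
    """Map a percentage to the interpretation labels named in the abstract."""
    if percent < 1:
        return "Document is different"
    if percent < 15:
        return "Little degree of similarity"
    if percent <= 50:
        return "Medium level of similarity"
    if percent == 100:
        return "Exactly the same"
    return "(range not named in the abstract)"


if __name__ == "__main__":
    # Hypothetical answers, not taken from the Ukara Enhanced dataset.
    answer_a = "saya suka makan nasi goreng"
    answer_b = "saya suka minum teh manis"
    for k in (3, 5, 7, 9):
        score = similarity(answer_a, answer_b, k)
        print(k, round(score, 2), interpret(score))
```

In this sketch a larger k requires longer exact character runs to match, so two partly overlapping answers share fewer k-grams as k grows, while two identical answers at least k characters long share all of them for every k.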

Copyrights © 2022






Journal Info

Abbrev

ijccs

Publisher

Subject

Computer Science & IT; Control & Systems Engineering

Description

Indonesian Journal of Computing and Cybernetics Systems (IJCCS), published twice a year, provides a forum for the full range of scholarly study. IJCCS focuses on advanced computational intelligence, including the synergetic integration of neural networks, fuzzy logic and evolutionary computation, so ...