SemanTIK : Teknik Informasi
Vol 2, No 1 (2016): semanTIK

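APLIKASI PENDETEKSI KEMIRIPAN ISI TEKS DOKUMEN MENGGUNAKAN METODE LEVENSHTEIN DISTANCE (Application for Detecting Similarity of Text Document Content Using the Levenshtein Distance Method)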

Na’firul Hasna Ariyani (Halu Oleo University)
Sutardi Sutardi (Halu Oleo University)
Rahmat Ramadhan (Halu Oleo University)



Article Info

Publish Date
18 Apr 2016

Abstract

The rapid evolution of information technology brings both positive and negative impacts to daily life. One of the negative impacts is plagiarism, the act of copying the work of others and presenting it as one's own. Plagiarism detection is therefore needed to reduce plagiarism of other people's work. This thesis aims to detect the similarity of text documents using the Levenshtein Distance algorithm so that the result can help determine plagiarism. The document types tested are .docx, .pdf, and .txt. The system first performs preprocessing, which consists of case folding, tokenizing, filtering, stemming, and sorting. After preprocessing, the Levenshtein Distance is calculated and the similarity is measured, yielding a percentage similarity between the two documents. In tests using real data, namely plagiarized documents, the Levenshtein Distance algorithm produced high similarity values, above 77% and up to 100%, for documents with a high level of similarity. Documents with a low degree of similarity, or that were not plagiarized, produced similarity values below 40%.

Keywords: Document, Levenshtein Distance, Preprocessing, Similarity, Plagiarism
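The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The stop-word list, the placeholder `stem` function (which returns tokens unchanged), and the similarity formula `(1 - distance / longest length) * 100` are all assumptions made for illustration, since the abstract does not specify them.

```python
import re

# Hypothetical, tiny stop-word list used only for this illustration.
STOP_WORDS = {"the", "a", "an", "of", "is", "and", "to", "in"}


def stem(token: str) -> str:
    """Placeholder stemmer (assumption): returns the token unchanged.

    A real system would plug in a language-specific stemmer here.
    """
    return token


def preprocess(text: str) -> str:
    """Apply case folding, tokenizing, filtering, stemming, and sorting."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())      # case folding + tokenizing
    tokens = [t for t in tokens if t not in STOP_WORDS]  # filtering (stop words)
    tokens = [stem(t) for t in tokens]                   # stemming
    tokens.sort()                                        # sorting
    return " ".join(tokens)


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, computed row by row."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity_percent(doc_a: str, doc_b: str) -> float:
    """Similarity in percent, using an assumed normalization by the longer text."""
    a, b = preprocess(doc_a), preprocess(doc_b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty documents are treated as identical
    return (1 - levenshtein(a, b) / longest) * 100
```

Because the tokens are sorted before comparison, two texts that contain the same words in a different order score as identical in this sketch, which is one plausible reason for including a sorting stage.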

Copyrights © 2016






Journal Info

Abbrev

semantik

Publisher

Subject

Computer Science & IT; Control & Systems Engineering

Description

The journal "semanTIK" is a publication medium for research results in the field of information technology. The research areas covered by the journal are Software Engineering, Computer Networks, Intelligent Systems, Information Systems, and Robotics. The target audience of this journal is lecturers, students ...