Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
Vol 4 No 5 (2020): Oktober 2020

Analisis Fitur Stilometri dan Strategi Segmentasi pada Sistem Deteksi Plagiasi Intrinsik Teks

Sylvia Putri Gunawan (Universitas Kristen Duta Wacana)
Lucia Dwi Krisnawati (Universitas Kristen Duta Wacana)
Antonius Rachmat Chrismanto (UKDW)



Article Info

Publish Date
30 Oct 2020

Abstract

Two different paradigms in the field of plagiarism detection resulting in External Plagiarism Detection (EPD) and Intrinsic Plagiarism Detection (IPD) systems. The most common applied system is EPD, which requires its algorithm to make a heuristic comparison between a suspicious document with documents in a corpus. In contrast, given a suspicious document only, an algorithm of IPD should be able to find the plagiarism section by looking for text segments having different writing styles. Previous researches for Indonesian texts fell only in the field of the EPD development system. Therefore, this research focuses on and contributes to experimenting and analyzing the stylometric features and segmentation strategies to build an IPD system for Indonesian texts. The experimentation results show that the paragraph segment performs better by scoring 0.92 for Macro Averaged-Accuracy and 0.54 for Macro Averaged-F1. The stylometric features achieving the highest scores of F-1 and Accuracy are the frequency of punctuation, the average paragraph length, and the type-token ratio.

Copyrights © 2020






Journal Info

Abbrev

RESTI

Publisher

Subject

Computer Science & IT Engineering

Description

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) dimaksudkan sebagai media kajian ilmiah hasil penelitian, pemikiran dan kajian analisis-kritis mengenai penelitian Rekayasa Sistem, Teknik Informatika/Teknologi Informasi, Manajemen Informatika dan Sistem Informasi. Sebagai bagian dari semangat ...