MULTINETICS
Vol 1, No 2 (2015): MULTINETICS November (2015)

Language Detection System for Documents Using N-Grams (Sistem Deteksi Bahasa pada Dokumen menggunakan N-Gram)

Zaman, Badrus (Unknown)
Hariyanti, Eva (Unknown)
Purwanti, Endah (Unknown)

Article Info

Publish Date
20 Nov 2015

Abstract

Language detection over a very large document collection can improve the performance of an information retrieval system. One popular method for language detection is the N-gram approach, which is based on sequences of n characters taken from a string. This research develops an N-gram-based language detection system that distinguishes Indonesian from English documents. The work proceeds in three phases: creating a profile for each language, system testing, and system evaluation. Fifty documents were used to create the language profiles: 25 Indonesian and 25 English. Sixty documents were used for system testing. System performance was evaluated using the F-measure. In testing, the F-measures obtained for unigrams, bigrams, and trigrams were 0.933, 0.917, and 0.933, respectively.
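
The three phases named in the abstract (profile creation, testing, evaluation) can be illustrated with a minimal Python sketch. The abstract does not say how a document is compared against a language profile, so the sketch assumes the rank-based "out-of-place" distance commonly used with character N-gram profiles. The function names, the `top_k` cutoff, and the toy sentences are hypothetical and are not taken from the paper.

```python
from collections import Counter


def ngrams(text, n):
    """Return the character n-grams of text (lowercased, whitespace collapsed)."""
    cleaned = " ".join(text.lower().split())
    return [cleaned[i:i + n] for i in range(len(cleaned) - n + 1)]


def build_profile(documents, n, top_k=300):
    """Map each of the top_k most frequent n-grams to its frequency rank (0 = most frequent)."""
    counts = Counter()
    for doc in documents:
        counts.update(ngrams(doc, n))
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return {gram: rank for rank, (gram, _) in enumerate(ranked)}


def out_of_place(doc_profile, lang_profile):
    """Assumed distance: sum of rank differences, with a fixed penalty for unseen n-grams."""
    penalty = len(lang_profile)
    return sum(
        abs(rank - lang_profile[gram]) if gram in lang_profile else penalty
        for gram, rank in doc_profile.items()
    )


def detect(text, profiles, n):
    """Return the language whose profile is closest to the profile of the text."""
    doc_profile = build_profile([text], n)
    return min(profiles, key=lambda lang: out_of_place(doc_profile, profiles[lang]))


def f_measure(y_true, y_pred, positive):
    """Balanced F-measure (F1) for one class, computed from precision and recall."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy training data standing in for the 25 + 25 profile documents (hypothetical).
training = {
    "id": ["saya sedang membaca buku di perpustakaan",
           "kami pergi ke pasar untuk membeli sayur dan buah"],
    "en": ["the quick brown fox jumps over the lazy dog",
           "we are going to the market to buy some fruit"],
}
N = 2  # bigrams; 1 and 3 would give the unigram and trigram variants
profiles = {lang: build_profile(docs, N) for lang, docs in training.items()}
print(detect("dia membeli buku di pasar", profiles, N))
```

Changing `N` switches between the unigram, bigram, and trigram configurations that the abstract compares; `f_measure` would then be applied to the predicted and true labels of the test documents.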

Copyrights © 2015

Journal Info

Abbrev

multinetics

Publisher

Subject

Computer Science & IT

Description

Multinetics is a peer-reviewed journal published twice a year (May and November). Multinetics aims to provide a forum for exchange and an interface between researchers and practitioners in any field related to computer and informatics engineering. The scope of this journal includes Content-Based Multimedia ...