Garuda - Garba Rujukan Digital

Amalia Amalia

Universitas Sumatera Utara

Author-ID : 2757017

Humanities; Computer Science & IT; Control & Systems Engineering; Economics, Econometrics & Finance; Education; Electrical & Electronics Engineering; Energy; Engineering; Health Professions; Public Health

Published : 4 Documents

Articles

Perbandingan Metode Klaster dan Preprocessing Untuk Dokumen Berbahasa Indonesia (Comparison of Clustering Methods and Preprocessing for Indonesian-Language Documents)
Amalia Amalia; Maya Silvi Lydia; Siti Dara Fadilla; Miftahul Huda
Jurnal Rekayasa Elektrika Vol 14, No 1 (2018)
Publisher : Universitas Syiah Kuala

Clustering is an unsupervised method that automatically groups objects based on their similarity. Clustering accuracy is determined by the number of similar objects placed in the correct cluster. Robust preprocessing and the choice of clustering algorithm can increase the efficiency of clustering. The objective of this study is to identify the most suitable method for clustering documents in Bahasa Indonesia. We tested several clustering algorithms, such as K-Means, K-Means++, and Agglomerative clustering, with various preprocessing stages and recorded the accuracy of each algorithm. The clustering experiments were conducted on a corpus of 100 documents in Bahasa Indonesia with a commonly used preprocessing scenario. In addition, we extended the preprocessing with an LSA function, a TF-IDF function, and a combined LSA/TF-IDF function. We tested LSA dimension reduction values from 10% to 90%, and the results show that the best reduction rates lie between 50% and 80%. The results also indicate that the K-Means++ algorithm produces better purity values than the other algorithms.
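The abstract reports results as purity values but does not say how they were computed. As a minimal sketch, assuming the standard definition of purity (each cluster is credited with its most frequent true class, and the credits are summed and divided by the number of documents), the metric could be computed as below. The function name `purity` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Fraction of documents that belong to the majority class of their cluster.

    Assumes the standard purity definition; the paper's exact procedure
    is not given in the abstract.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one document is required")

    # Count the true labels of the documents assigned to each cluster.
    members = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        members[cluster][label] += 1

    # Each cluster contributes the size of its largest single class.
    majority_total = sum(max(counts.values()) for counts in members.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: six documents, two clusters, two topics.
clusters = [0, 0, 0, 1, 1, 1]
topics = ["sports", "sports", "politics", "politics", "politics", "sports"]
score = purity(clusters, topics)
print(score)
```

In this toy case each cluster holds two documents of its majority topic and one stray, so four of the six documents count as correctly grouped. A purity of 1.0 means every cluster contains a single class only.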

DOI: 10.17529/jre.v14i1.9027
