Jurnal POINTER
Vol 2, No 1 (2011): Jurnal Pointer - Ilmu Komputer

Improving Performance of Document Clustering Using Latent Semantic Index Approach

Lailil Muflikhah (Universitas Brawijaya)
Baharudin B. Baharum (Unknown)



Article Info

Publish Date
19 Feb 2012

Abstract

Document clustering helps users retrieve the information they need. Clustering was originally introduced to improve precision and recall in information retrieval. In this work, a fuzzy clustering method is used to categorize document collections. Document clustering involves large volumes of data that may be correlated both within and between documents, and these patterns can be uncovered with the Latent Semantic Index (LSI) approach. Two methods are used in this research: Singular Value Decomposition (SVD) and Principal Component Analysis (PCA). PCA is treated as an extension of the SVD method that uses the data covariance. The aim of this study is to improve the performance of an existing clustering algorithm (fuzzy c-means) by reducing the dimension of the matrix, which in turn can improve the quality of document categorization. Experiments with various data volumes (class sizes) and topics show a significant improvement in cluster quality under both internal and external measures.

Keywords: document clustering, Latent Semantic Index, SVD, PCA, fuzzy c-means
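
The pipeline described in the abstract (reduce the document matrix with SVD or PCA, then run fuzzy c-means in the reduced space) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy document-term matrix, the choice of k = 2 latent dimensions, the fuzzifier m = 2, and all function names are assumptions made for illustration only.

```python
import numpy as np

def lsi_reduce(X, k):
    """Truncated SVD: document coordinates in a k-dimensional latent space.
    Rows of X are documents, columns are terms (assumed layout)."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]

def pca_reduce(X, k):
    """PCA via SVD of the column-centered matrix, i.e. a decomposition
    of the data covariance structure."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]

def fuzzy_c_means(Z, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means with Euclidean distance.
    Returns (memberships, centers); each membership row sums to 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((Z.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m
        # Cluster centers are membership-weighted means of the points.
        centers = (W.T @ Z) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

# Hypothetical toy corpus: 6 documents x 6 terms, two disjoint vocabularies.
X = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 0],
    [0, 0, 0, 1, 1, 2],
], dtype=float)

Z = lsi_reduce(X, k=2)                    # or pca_reduce(X, k=2)
memberships, centers = fuzzy_c_means(Z, c=2)
labels = memberships.argmax(axis=1)       # hard labels from fuzzy memberships
print(labels)
```

Swapping `lsi_reduce` for `pca_reduce` gives the covariance-based variant; in both cases fuzzy c-means operates on a matrix with k columns rather than one column per term, which is the dimension reduction the abstract refers to.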

Copyright © 2011






Journal Info

Abbrev

POINTER

Publisher

Subject

Computer Science & IT

Description

Jurnal POINTER is published by the Computer Science Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Brawijaya, Malang. Jurnal POINTER is issued twice a year, in February and ...