Garuda - Garba Rujukan Digital

This author published in these journals:

Seminar Nasional Aplikasi Teknologi Informasi (SNATI)

Jazi Eko Istyanto

Unknown Affiliation

Author-ID : 2615505

Computer Science & IT

Published : 1 Document


Articles

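Perbandingan Feature Kata dan Frasa dalam Kinerja Clustering Dokumen Teks Berbahasa Indonesia (Comparison of Word and Phrase Features in the Clustering Performance of Indonesian-Language Text Documents)
Amir Hamzah; Adhi Susanto; F. Soesianto; Jazi Eko Istyanto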
Seminar Nasional Aplikasi Teknologi Informasi (SNATI) 2007
Publisher : Jurusan Teknik Informatika, Fakultas Teknologi Industri, Universitas Islam Indonesia


Text document clustering has been studied intensively because of its important role in text mining and information retrieval. Word-based clustering techniques that use the vector space model always suffer from high dimensionality caused by the large number of words. Although extracting words in the preprocessing phase is simple, a collection can be viewed not only as a set of words but also, in part, as a set of phrases of more than one word. Separating a phrase into its parts can destroy the actual meaning of the phrase. Therefore, to maintain the context of words, a phrase must be maintained as a phrase. It is assumed that adding phrases to words as clustering features will improve performance. This paper compares word-based and phrase-based clustering. Three clustering models were chosen: hierarchical, partitional, and hybrid. Four similarity techniques (GroupAverage, CompleteLink, SingleLink, and ClusterCenter) were tried for hierarchical clustering, K-Means and Bisecting K-Means for partitional clustering, and Buckshot for hybrid clustering. Collections of 200 to 800 news texts that had been categorized manually were used to test these algorithms, with the F-measure as the criterion of clustering performance. This value is derived from recall and precision and can be used to measure how well the algorithms correctly classify the collections. Results show that adding phrases, or simply word pairs, slightly improves clustering performance, although the improvement is not yet statistically significant.

Keywords: word-based document clustering, phrase-based document clustering, clustering performance
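The abstract says the F-measure is derived from recall and precision but does not give the formula. The sketch below uses one common definition of the clustering F-measure, which may differ from the one used in the paper: for each manually assigned category, take the best F-score over all clusters, then average those scores weighted by category size. The function name `clustering_f_measure` and its arguments are hypothetical and chosen for illustration only.

```python
from collections import Counter


def clustering_f_measure(true_labels, cluster_ids):
    """Overall F-measure of a clustering against manual categories.

    Assumption: this is the class-weighted "best match" definition,
    not necessarily the exact variant used in the paper.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("true_labels and cluster_ids must have equal length")
    n = len(true_labels)
    if n == 0:
        raise ValueError("at least one document is required")

    class_sizes = Counter(true_labels)                # n_i: documents per category
    cluster_sizes = Counter(cluster_ids)              # n_j: documents per cluster
    overlap = Counter(zip(true_labels, cluster_ids))  # n_ij: shared documents

    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap.get((cls, clu), 0)
            if n_ij == 0:
                continue  # no shared documents, so F is 0 for this pair
            precision = n_ij / n_j
            recall = n_ij / n_i
            f_score = 2 * precision * recall / (precision + recall)
            best = max(best, f_score)
        # Weight each category's best F-score by its share of the collection.
        total += (n_i / n) * best
    return total
```

A clustering that reproduces the manual categories exactly scores 1.0 under this definition, while placing every document into a single cluster lowers precision and therefore the score.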

Co-Authors: Adhi Susanto; Amir Hamzah; F. Soesianto
