International Journal of Advances in Intelligent Informatics
Vol 1, No 3 (2015): November 2015

Automatic Text Summarization Using Latent Dirichlet Allocation (LDA) for Document Clustering

Erwin Yudi Hidayat (Faculty of Computer Science, Universitas Dian Nuswantoro)
Fahri Firdausillah (Faculty of Computer Science, Universitas Dian Nuswantoro)
Khafiizh Hastuti (Faculty of Computer Science, Universitas Dian Nuswantoro)
Ika Novita Dewi (Faculty of Computer Science, Universitas Dian Nuswantoro)
Azhari Azhari (Computer Science and Electronics Department, Universitas Gadjah Mada)



Article Info

Publish Date
01 Dec 2015

Abstract

In this paper, we apply Latent Dirichlet Allocation (LDA) to automatic text summarization in order to improve accuracy in document clustering. The experiments use a data set of 398 public blog articles obtained with a Python Scrapy crawler and scraper. The clustering pipeline in this research consists of preprocessing, automatic document compression using the feature method, automatic document compression using LDA, word weighting, and a clustering algorithm. The results show that automatic document summarization with LDA reaches 72% accuracy at LDA 40%, compared to the traditional k-means method, which reaches only 66%.
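The LDA compression step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "LDA 40%" means keeping 40% of each document's sentences, it uses a small collapsed Gibbs sampler in place of whatever LDA implementation the paper used, and the sentence-scoring rule, the stopword list, the toy corpus, and all function names (`tokenize`, `fit_lda`, `summarize`) are hypothetical. The word weighting and k-means steps that follow in the pipeline are left out.

```python
import math
import random
import re
from collections import Counter

# Illustrative stopword list only; the paper's preprocessing is not specified here.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "on", "for", "with"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords (a stand-in for preprocessing)."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def fit_lda(docs, n_topics=2, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns topic-word probabilities phi[k][w]."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_dk = [[0] * n_topics for _ in docs]        # document-topic counts
    n_kw = [Counter() for _ in range(n_topics)]  # topic-word counts
    n_k = [0] * n_topics                         # tokens assigned to each topic
    z = []                                       # topic of every token
    for d, doc in enumerate(docs):
        z.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, z[d]):
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                      # remove the token's current assignment
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta) / (n_k[t] + len(vocab) * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k                      # record the resampled assignment
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    return [
        {w: (n_kw[k][w] + beta) / (n_k[k] + len(vocab) * beta) for w in vocab}
        for k in range(n_topics)
    ]

def summarize(sentences, phi, ratio=0.4):
    """Keep the top `ratio` share of sentences, scored by their best-matching topic."""
    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        # Assumed scoring rule: mean topic-word probability under the best topic.
        return max(sum(topic.get(w, 0.0) for w in tokens) / len(tokens) for topic in phi)

    n_keep = max(1, math.ceil(ratio * len(sentences)))
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n_keep])]  # restore original order

# Toy corpus (hypothetical): each document is a list of sentences.
corpus = [
    ["Crawlers collect blog articles.", "The crawler follows article links.",
     "Lunch was late today.", "Scraped articles are stored as text.",
     "Each blog article is parsed."],
    ["Clustering groups similar documents.", "Documents are weighted by words.",
     "The office is cold.", "Clusters form around document topics.",
     "Similar documents share words."],
]
phi = fit_lda([tokenize(" ".join(doc)) for doc in corpus])
summaries = [summarize(doc, phi, ratio=0.4) for doc in corpus]
```

Each entry of `summaries` is the compressed form of one document and would then be passed on to word weighting and clustering. With five sentences per document and `ratio=0.4`, two sentences are kept per document.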

Copyright © 2015






Journal Info

Abbrev

IJAIN

Subject

Computer Science & IT

Description

International Journal of Advances in Intelligent Informatics (IJAIN), e-ISSN: 2442-6571, is a peer-reviewed, open-access journal published three times a year in English. It provides scientists and engineers throughout the world with a forum for the exchange and dissemination of theoretical and ...