Found 2 Documents

INTER AND INTRA CLUSTER ON SELF-ADAPTIVE DIFFERENTIAL EVOLUTION FOR MULTI-DOCUMENT SUMMARIZATION Alifia Puspaningrum; Adhi Nurilham; Eva Firdayanti Bisono; Khoirul Umam; Agus Zainal Arifin
Jurnal Ilmu Komputer dan Informasi Vol 11, No 2 (2018): Jurnal Ilmu Komputer dan Informasi (Journal of Computer Science and Information
Publisher : Faculty of Computer Science - Universitas Indonesia

DOI: 10.21609/jiki.v11i2.547

Abstract

Multi-document summarization is more challenging than single-document summarization because the search space is larger and the content differs from one document to the next. Hence, some optimization algorithms consider several criteria in producing the best summary, such as relevancy, content coverage, and diversity. Those weighted criteria rest on the assumption that the documents already belong to the same cluster. However, in some conditions the documents span many categories, and this also needs to be considered. In this paper, we propose an inter- and intra-cluster method consisting of four weighted criterion functions (coherence, coverage, diversity, and inter-cluster analysis), optimized using SaDE (Self-Adaptive Differential Evolution) to obtain the best summary. The proposed method therefore deals not only with the compactness within each cluster but also with the separation between clusters. Experiments on the Text Analysis Conference (TAC) 2008 dataset yield better summaries, with average ROUGE-1 precision, recall, and F-measure of 0.77, 0.07, and 0.12, respectively, compared to another method that considers only intra-cluster analysis.
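
As a reading aid for the ROUGE-1 figures reported above, here is a minimal sketch of how unigram precision, recall, and F-measure are commonly computed for one candidate summary against one reference. It is not the evaluation code used in the paper: the function name `rouge_1`, the whitespace tokenization, and the lowercasing are assumptions made for illustration, and official ROUGE tooling applies its own preprocessing.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> tuple[float, float, float]:
    """Return (precision, recall, f_measure) over unigram overlap.

    Assumption: tokens are lowercase whitespace-separated words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    p, r, f = rouge_1("the cat sat", "the cat sat on the mat")
    print(f"precision={p:.2f} recall={r:.2f} f={f:.2f}")
```

Precision is normalized by the candidate length and recall by the reference length, so a candidate that is much shorter than its reference can score high precision and low recall at the same time.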
Ekstraksi Frasa Kunci pada Penggabungan Klaster berdasarkan Maximum-Common-Subgraph [Key Phrase Extraction in Cluster Merging Based on Maximum Common Subgraph] Adhi Nurilham; Diana Purwitasari; Chastine Fatichah
Jurnal Nasional Teknik Elektro dan Teknologi Informasi Vol 7 No 3: Agustus 2018
Publisher : Departemen Teknik Elektro dan Teknologi Informasi, Fakultas Teknik, Universitas Gadjah Mada

Abstract

Document clustering based on topic similarity helps users search a collection of scientific articles. Topic labels are necessary for describing the subjects of the document clusters. Clusters with related subjects or contextual similarities can be merged to produce more descriptive labels. Relations between words in one context can be modelled as a graph. Instead of single words, this paper proposes labeling clusters of scientific articles with phrases, using graph-based cluster merging. The proposed method begins with K-Means++ to cluster the scientific articles. Then, candidate phrases are extracted from the document clusters using Frequent Phrase Mining, which is inspired by the Apriori algorithm. Each resulting cluster has a representation graph built from those extracted phrases. An indicator value, calculated with Maximum Common Subgraph (MCS), shows structural similarities between graphs. Clusters are merged if there are structural similarities between them. The topic labels of the clusters are keyword phrases extracted from the representation graph of the merged clusters using the TopicRank algorithm. The merging process, which is the contribution of this paper, considers the topic distribution within clusters for phrase extraction. The proposed method is evaluated based on the topic coherence of the merged cluster labels. The results show that the proposed method can improve topic coherence on the merged clusters, with MCS graph size percentage as the key factor. Further observation shows that the merged cluster labels are consistent with the MCS graph.
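
The merging step described above can be illustrated with a simplified sketch. It is not the authors' implementation. It assumes that each cluster's representation graph has one node per distinct word and an undirected edge between adjacent words in a phrase; under that assumption a common subgraph can be approximated by the shared edge set, ignoring connectivity. The helper names `phrase_graph`, `mcs_ratio`, and `should_merge`, the normalization by the smaller graph, and the 0.5 threshold are all hypothetical choices for illustration.

```python
def phrase_graph(phrases: list[str]) -> set[frozenset[str]]:
    """Build an undirected edge set linking adjacent words in each phrase.

    Assumption: nodes are lowercase words, so a node label is unique.
    """
    edges: set[frozenset[str]] = set()
    for phrase in phrases:
        words = phrase.lower().split()
        for left, right in zip(words, words[1:]):
            if left != right:  # skip self-loops from repeated words
                edges.add(frozenset((left, right)))
    return edges


def mcs_ratio(g1: set[frozenset[str]], g2: set[frozenset[str]]) -> float:
    """Size of the shared edge set relative to the smaller graph.

    Stand-in for an MCS size percentage; the normalization is assumed.
    """
    smaller = min(len(g1), len(g2))
    return len(g1 & g2) / smaller if smaller else 0.0


def should_merge(g1: set[frozenset[str]], g2: set[frozenset[str]],
                 threshold: float = 0.5) -> bool:
    """Merge two clusters when their graphs overlap enough (hypothetical rule)."""
    return mcs_ratio(g1, g2) >= threshold


if __name__ == "__main__":
    a = phrase_graph(["document clustering", "topic label extraction"])
    b = phrase_graph(["document clustering method", "topic label"])
    print(mcs_ratio(a, b), should_merge(a, b))
```

A general maximum common subgraph search on unlabeled graphs is far more expensive than this set intersection; the shortcut only applies because word nodes are assumed to be uniquely labeled.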