Khazanah Informatika: Jurnal Ilmu Komputer dan Informatika
Vol. 9 No. 2 October 2023

Combination of Graph-based Approach and Sequential Pattern Mining for Extractive Text Summarization with Indonesian Language

Dian Sa'adillah Maylawati (Centre for Advanced Computing Technology, Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Malaysia and Department of Informatics, UIN Sunan Gunung Djati Bandung, Indonesia)
Yogan Jaya Kumar (Centre for Advanced Computing Technology, Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Malaysia)
Fauziah Binti Kasmin (Centre for Advanced Computing Technology, Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Malaysia)



Article Info

Publish Date
29 Oct 2023

Abstract

A major challenge in Indonesian automatic text summarization research is producing readable summaries. Summary quality depends on how well the meaning of the source text is preserved. This study therefore aims to improve the quality of extractive automatic summarization of Indonesian text by taking into account the quality of the structured text representation. It uses Sequential Pattern Mining (SPM) to generate sequences of words as a structured representation of the text, and a graph-based approach to generate the summaries. The SPM algorithm used is PrefixSpan, and the graph-based approach uses the Bellman-Ford algorithm. Experiments on the IndoSum dataset show that combining SPM with Bellman-Ford improves the precision, recall, and F-measure of ROUGE-1, ROUGE-2, and ROUGE-L. When Bellman-Ford is combined with SPM, the ROUGE-1 F-measure increases from 0.2299 to 0.3342, the ROUGE-2 F-measure from 0.1342 to 0.2191, and the ROUGE-L F-measure from 0.1904 to 0.2878. These results indicate that SPM can improve the performance of the Bellman-Ford algorithm in producing Indonesian text summaries.
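The ROUGE-N scores reported in the abstract can be illustrated with a minimal Python sketch. It is a simplified illustration and not the evaluation code used in the study: it assumes whitespace tokenization and lowercasing, applies no stemming or stop-word removal, and the function names `ngrams` and `rouge_n` and the example sentences are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F-measure) of n-gram overlap.

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace; a real evaluation may tokenize differently.
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical system summary and reference summary.
system = "saya suka makan nasi goreng"
gold = "saya suka nasi goreng pedas"
print(rouge_n(system, gold, n=1))
print(rouge_n(system, gold, n=2))
```

Precision divides the overlap by the number of n-grams in the system summary, recall divides it by the number in the reference summary, and the F-measure is their harmonic mean. ROUGE-L, which is based on the longest common subsequence, is not covered by this sketch.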

Copyrights © 2023






Journal Info

Abbrev

khif

Publisher

Subject

Computer Science & IT

Description

Khazanah Informatika: Jurnal Ilmu Komputer dan Informatika, an Indonesian national journal, publishes high-quality research papers in the broad field of Informatics and Computer Science, which encompasses software engineering, information system development, computer systems, computer networks, ...