Jurnal INFOTEL
Vol 15 No 2 (2023): May 2023

Indonesian news classification application with named entity recognition approach

Nurchim Nurchim (Universitas Duta Bangsa Surakarta, Indonesia)
Nurmalitasari Nurmalitasari (Universitas Duta Bangsa Surakarta, Indonesia)
Zalizah Awang Long (Universiti Kuala Lumpur, Malaysia)



Article Info

Publish Date
23 May 2023

Abstract

Many internet users now look for news through search engines, which return vast amounts of information. Because the set of news articles that appears changes quickly and dynamically, it is increasingly difficult to pick out the information that matters. News information therefore needs to be extracted so that the core content of each article can be displayed. The problem is particularly pronounced in Indonesian, whose noun phrase entities take varied structures under shallow parsing or grammatical induction. Named Entity Recognition (NER) can help overcome this because it extracts news entities in depth, starting from proper nouns in text documents, and supports tasks such as information retrieval, machine translation, question answering, and automatic summarization. This study aims to apply NER to Indonesian-language news classification. It follows a Design-Based Research process consisting of (1) pre-implementation, (2) design, (3) implementation and revision, and (4) reflection and evaluation. The application was developed in Python using the Streamlit, BeautifulSoup, gnews, and spaCy libraries. Accuracy testing of the application yielded an F1-score of 89.69% across all entity types: place, figure, day, date, and organization.
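
The abstract reports an F1-score but does not say how it was computed. As an illustration only, the sketch below shows one common way to score NER output: micro-averaged precision, recall, and F1 over (entity text, label) pairs. The function name `entity_f1`, the label names, and the sample entities are all hypothetical and are not taken from the study.

```python
from collections import Counter


def entity_f1(gold, predicted):
    """Micro-averaged precision, recall and F1 over (text, label) pairs.

    Assumption: an entity counts as correct only when both its text and
    its label match a gold entity exactly.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # True positives: pairs present in both the gold and predicted lists.
    tp = sum((gold_counts & pred_counts).values())
    precision = tp / sum(pred_counts.values()) if pred_counts else 0.0
    recall = tp / sum(gold_counts.values()) if gold_counts else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations for one news article (not data from the study).
gold = [
    ("Surakarta", "PLACE"),
    ("Joko Widodo", "FIGURE"),
    ("Senin", "DAY"),
    ("23 Mei 2023", "DATE"),
    ("Universitas Duta Bangsa", "ORGANIZATION"),
]
predicted = [
    ("Surakarta", "PLACE"),
    ("Joko Widodo", "FIGURE"),
    ("Senin", "DATE"),  # wrong label, so it is not a true positive
    ("23 Mei 2023", "DATE"),
]

precision, recall, f1 = entity_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2%}")
# prints: precision=0.75 recall=0.60 F1=66.67%
```

Here three of the four predicted entities are correct and three of the five gold entities are found, so precision is 0.75 and recall is 0.60. Scoring per entity type instead would only require filtering both lists by label before calling the function.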

Copyright © 2023






Journal Info

Abbrev

infotel

Subject

Computer Science & IT; Electrical & Electronics Engineering

Description

Jurnal INFOTEL is a scientific journal published by Lembaga Penelitian dan Pengabdian Masyarakat (LPPM) of Institut Teknologi Telkom Purwokerto, Indonesia. Jurnal INFOTEL covers the fields of informatics, telecommunication, and electronics. First published in 2009 for a printed version and published ...