Garuda - Garba Rujukan Digital

Jurnal Mantik

Vol. 6 No. 1 (2022): May: Manajemen, Teknologi Informatika dan Komunikasi (Mantik)

Samuel Situmeang (Institut Teknologi Del)

Publish Date
16 May 2022

The text preprocessing stage in a natural language processing pipeline removes noise and other parts of the text that do not help the analysis. Despite its potential impact on final application performance, text preprocessing has received little attention in the text analysis literature, especially for named entity recognition (NER) in Indonesian text. This paper comprehensively examines the impact of text preprocessing on Indonesian NER using a baseline model, the Conditional Random Field (CRF), to find the preprocessing procedures that give the NER model its best performance. The contribution of various forms of text preprocessing to successful recognition is assessed comparatively across three entity categories: people, places, and organizations. Experimental analysis of the data set shows that, rather than enabling or disabling all preprocessing steps at once, certain combinations can significantly improve the accuracy of Indonesian NER, depending on the entity category.
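
The idea of comparing combinations of preprocessing steps, rather than switching them all on or off, can be illustrated with a minimal Python sketch. The abstract does not name the steps that were studied, so the three steps here (`lowercase`, `remove_stopwords`, `strip_punctuation`), the tiny stopword set, and the feature names in `token_features` are all hypothetical, chosen only for illustration. The sketch does not train a CRF; it only shows how each combination changes the tokens and the per-token features that a CRF-style tagger might consume.

```python
from itertools import combinations

# Hypothetical stopword set for illustration only; not taken from the paper.
STOPWORDS = {"di", "dan", "yang"}


def lowercase(tokens):
    """Lowercase every token."""
    return [t.lower() for t in tokens]


def remove_stopwords(tokens):
    """Drop tokens that appear in the illustrative stopword set."""
    return [t for t in tokens if t.lower() not in STOPWORDS]


def strip_punctuation(tokens):
    """Drop tokens that contain no letter or digit."""
    return [t for t in tokens if any(ch.isalnum() for ch in t)]


# Assumed preprocessing steps; the abstract does not list the real ones.
STEPS = {
    "lowercase": lowercase,
    "remove_stopwords": remove_stopwords,
    "strip_punctuation": strip_punctuation,
}


def all_combinations(step_names):
    """Yield every subset of step names, from none to all of them."""
    names = sorted(step_names)
    for size in range(len(names) + 1):
        yield from combinations(names, size)


def apply_steps(tokens, combo):
    """Apply the named steps in order and return the resulting tokens."""
    for name in combo:
        tokens = STEPS[name](tokens)
    return tokens


def token_features(tokens, i):
    """Build a simple feature dict for token i (hypothetical feature set)."""
    word = tokens[i]
    return {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


if __name__ == "__main__":
    sentence = ["Joko", "Widodo", "lahir", "di", "Surakarta", "."]
    for combo in all_combinations(STEPS):
        print(combo, apply_steps(sentence, combo))
```

One thing the sketch makes visible: lowercasing erases the capitalisation cue (`word.istitle`), which may matter differently for each entity category. Steps that drop tokens would also require the entity labels to be realigned with the remaining tokens, which is omitted here.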


Journal Info

Jurnal Mantik

Abbrev

mantik

Publisher

Institute Of Computer Science

Subject

Computer Science & IT; Economics, Econometrics & Finance; Language, Linguistic, Communication & Media

Description

Jurnal Mantik (Manajemen, Teknologi Informatika dan Komunikasi) is a scientific journal in information systems/informatics containing the scientific literature on studies of pure and applied research in information systems/information technology, computer science, and management science and public ...

Impact of Text Preprocessing on Named Entity Recognition Based on Conditional Random Field in Indonesian Text
