Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control
Vol. 7, No. 1, February 2022

Rule-based Disease Classification using Text Mining on Symptoms Extraction from Electronic Medical Records in Indonesian

Alfonsus Haryo Sangaji (Institut Teknologi Sepuluh Nopember)
Yuri Pamungkas (Institut Teknologi Sepuluh Nopember)
Supeno Mardi Susiki Nugroho (Institut Teknologi Sepuluh Nopember)
Adhi Dharma Wibawa (Institut Teknologi Sepuluh Nopember)



Article Info

Publish Date
28 Feb 2022

Abstract

Electronic medical records (EMRs) have recently become a source of many insights for clinicians and hospital management. An EMR stores important information and new knowledge that can give hospitals and clinicians a competitive advantage. It is valuable not only for mining patterns in the recorded patient symptoms, medication, and treatment, but also as a repository from which new strategies and future trends in medicine can be derived. However, EMRs remain a challenge for many clinicians because of their unstructured form. Information extraction helps find valuable information in unstructured data. This study focuses on disease symptom information recorded as text. Only the diseases with the highest prevalence in Indonesia, such as tuberculosis, malignant neoplasm, diabetes mellitus, hypertensive disease, and renal failure, are analyzed. Pre-processing techniques such as data cleansing and correction play a significant role in obtaining the features. Because the data are imbalanced across classes, the SMOTE technique is applied to address this. Symptoms are extracted from the EMR data with a rule-based algorithm. Two algorithms, SVM and Random Forest (RF), were then used to classify the disease from the extracted features. The results show that rule-based symptom extraction works well for extracting valuable information from unstructured EMRs. The classifiers achieved accuracies of 78% (SVM) and 89% (RF).
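The abstract does not give the extraction rules themselves, so the following is only a minimal sketch of what rule-based symptom extraction from an Indonesian free-text note could look like. The lexicon `SYMPTOM_LEXICON`, the negation cues, and the function names are all hypothetical and are not taken from the paper; the negation rule in particular is an illustrative assumption. The sketch normalizes the note, matches the longest lexicon phrase at each position, and turns the result into a binary feature vector of the kind a classifier such as SVM or Random Forest could consume.

```python
import re

# Hypothetical lexicon (NOT from the paper): Indonesian surface form -> symptom label.
SYMPTOM_LEXICON = {
    "batuk": "cough",
    "batuk darah": "hemoptysis",
    "sesak napas": "dyspnea",
    "sesak nafas": "dyspnea",   # common spelling variant
    "demam": "fever",
    "nyeri kepala": "headache",
    "mual": "nausea",
}

# Hypothetical negation cues; a symptom directly preceded by one is skipped.
NEGATION_CUES = ("tidak", "tanpa", "tdk")

# Longest phrase length (in tokens) present in the lexicon.
MAX_PHRASE_LEN = max(len(phrase.split()) for phrase in SYMPTOM_LEXICON)


def normalize(text):
    """Lowercase, replace non-letters with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def extract_symptoms(note):
    """Return symptom labels found in the note, in order of first appearance."""
    tokens = normalize(note).split()
    found = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first so "batuk darah" beats "batuk".
        for n in range(MAX_PHRASE_LEN, 0, -1):
            window = tokens[i:i + n]
            phrase = " ".join(window)
            if len(window) == n and phrase in SYMPTOM_LEXICON:
                negated = i > 0 and tokens[i - 1] in NEGATION_CUES
                label = SYMPTOM_LEXICON[phrase]
                if not negated and label not in found:
                    found.append(label)
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return found


def to_feature_vector(symptoms, vocabulary):
    """Encode the extracted symptoms as a 0/1 vector over a fixed vocabulary."""
    present = set(symptoms)
    return [1 if label in present else 0 for label in vocabulary]


VOCABULARY = sorted(set(SYMPTOM_LEXICON.values()))

note = "Pasien mengeluh batuk darah, sesak nafas; tidak demam."
symptoms = extract_symptoms(note)
features = to_feature_vector(symptoms, VOCABULARY)
```

In a pipeline like the one described, vectors of this kind would be collected per record, rebalanced (the abstract names SMOTE), and then passed to the classifiers. Those later steps are left out here because they depend on third-party libraries and on details the abstract does not state.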

Copyrights © 2022






Journal Info

Abbrev

kinetik

Publisher

Subject

Computer Science & IT Control & Systems Engineering Electrical & Electronics Engineering Energy Engineering

Description

Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control is published by Universitas Muhammadiyah Malang. The journal is an open access journal in the field of Informatics and Electrical Engineering. This journal is available for researchers who want to improve ...