Jurnal Nasional Teknik Elektro dan Teknologi Informasi
Vol 11 No 2: May 2022

Early Detection of Diabetes Using Machine Learning with the Logistic Regression Algorithm

Erlin (Institut Bisnis dan Teknologi Pelita Indonesia)
Yulvia Nora Marlim (Institut Bisnis dan Teknologi Pelita Indonesia)
Junadhi (STMIK Amik Riau)
Laili Suryati (Universitas Persada Indonesia Y.A.I)
Nova Agustina (Sekolah Tinggi Teknologi Bandung)



Article Info

Publish Date
30 May 2022

Abstract

Diabetes is one of the deadliest diseases in the world, including in Indonesia. It can cause complications in many parts of the body and increases the overall risk of death. One way to detect diabetes is to use machine learning algorithms. Logistic regression is a classification model in machine learning that is widely used in clinical analysis. In this paper, a predictive model was built in a Python IDE using logistic regression to detect at an early stage whether a person has diabetes, based on the initial data provided. The experiment used the Pima Indians Diabetes Database, which consists of 768 patient records with eight independent variables and one dependent variable. Exploratory data analysis was applied to gain as much insight as possible into the dataset, using statistical summaries and visualization techniques. Some variables in the dataset contained incomplete data; missing values were replaced with the median of the corresponding variable. Class imbalance was handled with the synthetic minority over-sampling technique (SMOTE), which enlarges the minority class by generating synthetic samples. The model was evaluated using the confusion matrix and showed reasonably good performance, with an accuracy of 77%, precision of 75%, recall of 77%, and F1-score of 76%. In addition, grid search was used for hyperparameter tuning to improve the performance of the logistic regression model. The baseline model and the model obtained after grid search were both tested and evaluated. The experimental results showed that hyperparameter tuning improved the performance of the logistic regression algorithm, yielding an accuracy of 82%, precision of 81%, recall of 79%, and F1-score of 80%.
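Two of the steps named in the abstract, median imputation of missing values and the metrics derived from a confusion matrix, can be illustrated with a minimal sketch. This is not the authors' code: the function names `impute_median` and `classification_metrics`, the use of `None` to mark a missing value, and all numbers in the usage note are assumptions chosen for illustration only. SMOTE and grid search are left out because they depend on third-party libraries.

```python
from statistics import median


def impute_median(values):
    """Replace missing entries (assumed to be None) with the median
    of the observed entries in the same variable."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1-score from the four
    cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted
    # or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

As a usage example with made-up values, `impute_median([148, None, 183, 89, None, 116])` fills both gaps with 132.0, the median of the four observed values. `classification_metrics(tp=30, fp=10, fn=20, tn=40)` gives an accuracy of 0.7, a precision of 0.75, and a recall of 0.6. These counts are hypothetical and are not the confusion matrix reported in the paper.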

Copyrights © 2022






Journal Info

Abbrev

JNTETI

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering; Energy Engineering

Description

Topics cover the fields of (but not limited to):
1. Information Technology: Software Engineering, Knowledge and Data Mining, Multimedia Technologies, Mobile Computing, Parallel/Distributed Computing, Artificial Intelligence, Computer Graphics, Virtual Reality
2. Power Systems: Power Generation, ...