JOURNAL OF SCIENCE AND SOCIAL RESEARCH
Vol 5, No 3 (2022): October 2022

COMPARISON OF RANDOM FOREST AND LOGISTIC REGRESSION IN CLASSIFYING COVID-19 PATIENTS BASED ON THEIR SYMPTOMS

Ichsan Firmansyah
Jaka Tirta Samudra
Doughlas Pardede
Zakarias Situmorang



Article Info

Publish Date
18 Oct 2022

Abstract

Abstract: The symptoms suffered by patients can serve as a reference for classifying them as Covid-19 positive or negative using data mining. Random forest and logistic regression are two data mining algorithms with high accuracy, precision, and sensitivity (recall) in data classification. This study compares the random forest algorithm with the logistic regression algorithm, the latter under both lasso and ridge regularization, in classifying Covid-19 positive and negative patients based on their symptoms. On a data set of 5434 records, the evaluation results show that random forest performs best in terms of accuracy, precision, and sensitivity, while logistic regression with ridge regularization performs worst. Random forest is the most reliable at classifying Covid-19 positive patients, while logistic regression with ridge regularization is the least reliable. Random forest is also the most reliable at classifying Covid-19 negative patients, while logistic regression with lasso regularization is the least reliable.

Keywords: classification; covid-19; data mining; logistic regression; random forest.
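
The three evaluation measures named in the abstract (accuracy, precision, and sensitivity) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' evaluation code: the helper names (`ConfusionCounts`, `count_outcomes`, and the metric functions) and the toy labels are assumptions made for this example, and none of the numbers come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for one class treated as 'positive' (hypothetical helper)."""
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative


def count_outcomes(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells for the chosen positive label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def accuracy(c):
    """Share of all predictions that are correct."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c):
    """Share of predicted positives that are truly positive."""
    predicted_pos = c.tp + c.fp
    return c.tp / predicted_pos if predicted_pos else 0.0


def sensitivity(c):
    """Share of true positives that the classifier finds (also called recall)."""
    actual_pos = c.tp + c.fn
    return c.tp / actual_pos if actual_pos else 0.0


# Toy labels invented for illustration only (1 = Covid-19 positive, 0 = negative).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

counts = count_outcomes(y_true, y_pred, positive=1)
print(accuracy(counts), precision(counts), sensitivity(counts))
```

Calling `count_outcomes(y_true, y_pred, positive=0)` treats the negative class as the class of interest, which is one way to obtain the separate per-class reliability figures for positive and negative patients that the abstract refers to.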

Copyrights © 2022






Journal Info

Abbrev

JSSR

Publisher

Subject

Computer Science & IT; Economics, Econometrics & Finance; Education; Social Sciences

Description

Journal of Science and Social Research accepts research works from academicians in their respective fields of expertise. Journal of Science and Social Research is a platform to disclose the research abilities and promote quality and excellence of young researchers and experienced thoughts towards ...