Journal : TIN: TERAPAN INFORMATIKA NUSANTARA

The Utilization of Resampling Techniques and the Random Forest Method in Data Classification
Ciciana Ciciana; Rahmawati Rahmawati; Laila Qadrini
TIN: Terapan Informatika Nusantara Vol 4 No 4 (2023): September 2023
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/tin.v4i4.4342

Abstract

Random forest is one of several methods available for data classification. It handles non-linear data well, is robust to outliers and noise, and is easy to use while producing high-quality classification results. A common problem is class imbalance, where one class has far more or far fewer instances than the others. Under class imbalance, most classification models tend to favor the majority class, which can lead to overfitting and unsatisfactory classification results. To address this, resampling techniques can be applied. One such technique is SMOTE, an oversampling method that augments the minority class by generating synthetic data points. This research evaluates the accuracy of data classification using the random forest method and assesses the impact of resampling combined with random forest on classification. The data used in this study comprise simulated breast cancer data and real-world LBW patient data from Puskesmas Banggae I, Kabupaten Majene. The results show an accuracy of 94.74%, a sensitivity of 93.33%, and an F1-score of 95.89% for the breast cancer data. For the LBW data, accuracy reached 73.75%, with a sensitivity of 77.63% and an F1-score of 84.89%.
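
The abstract does not include code, so the following is only an illustrative sketch of two ideas it mentions: SMOTE-style oversampling (new minority points interpolated between a sample and one of its nearest minority neighbours) and the reported metrics (accuracy, sensitivity, F1-score). It is not the authors' implementation. The function names `smote` and `binary_metrics`, the parameter defaults, and the toy data are assumptions made for illustration, and the random forest step itself is omitted. In practice, library implementations such as `SMOTE` from imbalanced-learn and `RandomForestClassifier` from scikit-learn are commonly used instead.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (the SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, F1) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return accuracy, sensitivity, f1


# Toy data (hypothetical, not from the study): three minority-class points.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]
new_points = smote(minority, n_new=4, k=2)

# Toy labels (hypothetical): one positive case is missed by the classifier.
accuracy, sensitivity, f1 = binary_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

Because each synthetic point lies on a line segment between two existing minority samples, the oversampled class stays inside the region already occupied by the minority data. Sensitivity and F1-score are computed alongside accuracy because accuracy alone can look good on imbalanced data even when the minority class is mostly misclassified.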