Data imbalance is a common problem in classification, including maternal health risk classification. It occurs when the positive class has far fewer samples than the negative class, which can make a classifier inaccurate and biased toward predicting the majority class. One way to address the problem is random oversampling. In this study, random oversampling is applied to the classification of maternal health risks, and particle swarm optimization (PSO) is used for attribute weighting to further improve model performance after oversampling. The results show that random oversampling improves accuracy and reduces errors in predicting minority classes, and that PSO attribute weighting further improves the model's accuracy. A random forest classifier evaluated with 10-fold cross-validation on the maternal health risk data reaches an accuracy of 80.77%. With random oversampling, accuracy rises to 81.86%; with PSO optimization added, it reaches 82.92%, an increase of 2.15 percentage points over the baseline.
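To make the oversampling step concrete, the following is a minimal sketch of random oversampling in plain Python. It is not the study's implementation: the function name `random_oversample`, the toy rows, and the labels are hypothetical and chosen only for illustration. The sketch duplicates randomly chosen minority-class rows until every class matches the size of the largest class.

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group the rows by class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement to fill the gap for this class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (hypothetical values, not taken from the study).
rows = [[25, 120], [30, 130], [35, 140], [40, 150], [45, 160], [50, 170]]
labels = ["low", "low", "low", "low", "high", "high"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

When a scheme like this is combined with cross-validation, the duplicates are usually generated from the training folds only, so that copies of a test sample cannot leak into training. Whether the study did so is not stated in the abstract.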
Copyrights © 2023