Malware is software designed to harm devices. As malware evolves and diversifies, traditional signature-based detection has become less effective against advanced variants such as polymorphic, metamorphic, and oligomorphic malware. Machine learning-based malware detection has emerged as a promising way to address this challenge. In this study, we evaluated the performance of several machine learning algorithms in detecting malware, then applied Principal Component Analysis (PCA) to the best-performing algorithm to reduce the number of features and improve performance. Random Forest outperformed AdaBoost, Neural Network, Support Vector Machine, and k-Nearest Neighbor, achieving 98.3% for both accuracy and recall. Applying PCA raised the accuracy and recall of Random Forest to 98.7% while reducing the number of features from 1084 to 32.
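The PCA-then-Random-Forest workflow summarized above can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data from `make_classification` as a stand-in for the malware feature set, which is not reproduced here. The feature scaling step, the train/test split, the sample count, and all hyperparameters are illustrative assumptions rather than the settings used in the study, so the printed scores are not comparable to the reported results. Only the 1084 input features and the 32 retained components are taken from the text.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 1084 features, as in the abstract,
# with binary labels where 1 plays the role of "malware".
X, y = make_classification(
    n_samples=600,
    n_features=1084,
    n_informative=40,
    n_redundant=100,
    random_state=0,
)

# Hold out a stratified test set (split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale, project onto 32 principal components, then classify.
# Scaling before PCA is an assumption; the abstract does not mention it.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=32, random_state=0),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
n_reduced = model.named_steps["pca"].n_components_
accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)

print(f"features: {X.shape[1]} -> {n_reduced}")
print(f"accuracy: {accuracy:.3f}, recall: {recall:.3f}")
```

Wrapping the steps in a single pipeline means the scaler and the PCA projection are fitted on the training split only, which keeps information from the test split out of the dimensionality reduction.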
Copyrights © 2023