Lontar Komputer: Jurnal Ilmiah Teknologi Informasi
Vol. 9, No. 3 December 2018

Dimensionality Reduction using PCA and K-Means Clustering for Breast Cancer Prediction

Ade Jamal (Universitas Al-Azhar Indonesia)
Annisa Handayani
Ali Akbar Septiandri
Endang Ripmiatin
Yunus Effendi



Article Info

Publish Date
22 Dec 2018

Abstract

Breast cancer is a leading cause of death among women. Predicting breast cancer at an early stage gives a greater chance of a cure. This calls for a prediction tool that can classify a breast tumor as either a harmful malignant tumor or a harmless benign one. In this paper, two machine learning algorithms, Support Vector Machine and Extreme Gradient Boosting, are compared for this classification task. Prior to classification, the number of data attributes is reduced from the raw data by extracting features using Principal Component Analysis. A clustering method, K-Means, is also used for dimensionality reduction as an alternative to Principal Component Analysis. This paper presents a comparison of four models, formed by combining the two dimensionality reduction methods with the two classifiers, applied to the Wisconsin Breast Cancer Dataset. The comparison is measured using accuracy, sensitivity, and specificity, evaluated from the confusion matrices. The experimental results indicate that K-Means, which is not usually used for dimensionality reduction, can perform well compared to the popular Principal Component Analysis.
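
The three evaluation metrics named in the abstract can all be read off a binary confusion matrix. The following is a minimal sketch of that calculation, not the authors' code: the function names, the convention that label 1 means malignant, and the toy labels are assumptions made for illustration, and the numbers are not results from the paper.

```python
from typing import Dict, NamedTuple, Sequence


class ConfusionMatrix(NamedTuple):
    tp: int  # malignant tumors predicted as malignant
    fn: int  # malignant tumors predicted as benign
    fp: int  # benign tumors predicted as malignant
    tn: int  # benign tumors predicted as benign


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> ConfusionMatrix:
    """Count the four outcomes; `positive` marks the malignant class (assumed 1)."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return ConfusionMatrix(tp, fn, fp, tn)


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else float("nan")


def metrics(cm: ConfusionMatrix) -> Dict[str, float]:
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    total = cm.tp + cm.fn + cm.fp + cm.tn
    return {
        "accuracy": _ratio(cm.tp + cm.tn, total),
        "sensitivity": _ratio(cm.tp, cm.tp + cm.fn),
        "specificity": _ratio(cm.tn, cm.tn + cm.fp),
    }


# Toy labels invented for this example: 4 malignant (1) and 6 benign (0) cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
toy_cm = confusion_matrix(y_true, y_pred)
toy_metrics = metrics(toy_cm)
```

On these toy labels the counts are 3 true positives, 1 false negative, 1 false positive, and 5 true negatives, giving an accuracy of 0.8, a sensitivity of 0.75, and a specificity of 5/6. In a tumor-screening setting sensitivity is usually watched most closely, since a false negative corresponds to a missed malignant tumor.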

Copyrights © 2018






Journal Info

Abbrev

lontar

Subject

Computer Science & IT

Description

Lontar Komputer [ISSN Print 2088-1541] [ISSN Online 2541-5832] is a journal that focuses on the theory, practice, and methodology of all aspects of technology in the field of computer science and engineering as well as productive and innovative ideas related to new technology and information ...