Knowledge Engineering and Data Science
Vol 3, No 1 (2020)

Parallelization of Partitioning Around Medoids (PAM) in K-Medoids Clustering on GPU

Adhi Prahara (Department of Informatics Faculty of Industrial Technology Universitas Ahmad Dahlan)
Dewi Pramudi Ismi (Department of Informatics Faculty of Industrial Technology Universitas Ahmad Dahlan)
Ahmad Azhari (Department of Informatics Faculty of Industrial Technology Universitas Ahmad Dahlan)



Article Info

Publish Date
30 Jun 2020

Abstract

K-medoids clustering is a partitional clustering method. It offers better results when dealing with outliers and arbitrary distance metrics, and in situations where a mean or median does not exist within the data. However, k-medoids has a high computational complexity. Partitioning Around Medoids (PAM) was developed to improve k-medoids clustering; it consists of build and swap steps and uses the entire dataset to find the best potential medoids. Thus, PAM produces better medoids than other algorithms. This research proposes a parallelization of PAM in k-medoids clustering on the GPU to reduce the computational time of the swap step of PAM. The parallelization scheme uses shared memory, a reduction algorithm, and an optimized thread block configuration to maximize occupancy. Based on the experimental results, the proposed parallelized PAM k-medoids is faster than the CPU and Matlab implementations and is efficient for large datasets.
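The build and swap steps named in the abstract can be illustrated with a small sequential sketch. This is not the authors' GPU implementation: it is a simplified CPU version in Python, and the function names `total_cost` and `pam`, the precomputed distance matrix, and the toy data are all assumptions made for illustration. It is meant only to show why the swap step dominates the running time: every (medoid, non-medoid) exchange requires a full pass over the dataset.

```python
from itertools import product


def total_cost(dist, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def pam(dist, k):
    """Simplified PAM on a precomputed n x n distance matrix (illustrative only)."""
    n = len(dist)

    # Build step: greedily add the point that lowers the total cost the most.
    medoids = []
    for _ in range(k):
        candidates = [i for i in range(n) if i not in medoids]
        best = min(candidates, key=lambda c: total_cost(dist, medoids + [c]))
        medoids.append(best)

    # Swap step: evaluate every (medoid, non-medoid) exchange, apply the best
    # one, and repeat until no exchange lowers the total cost.
    cost = total_cost(dist, medoids)
    while True:
        best_cost, best_swap = cost, None
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            trial_cost = total_cost(dist, trial)
            if trial_cost < best_cost:
                best_cost, best_swap = trial_cost, trial
        if best_swap is None:
            return sorted(medoids), cost
        medoids, cost = best_swap, best_cost


# Toy one-dimensional data with two obvious groups (hypothetical example).
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
medoids, cost = pam(dist, 2)
print(medoids, cost)
```

Each trial cost in the swap loop is independent of the others, and each is itself a sum over all points. That structure is what makes the step a plausible fit for a GPU scheme of the kind the abstract describes, where partial sums are accumulated in shared memory and combined with a reduction.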

Copyrights © 2020

Journal Info

Abbrev

keds

Subject

Computer Science & IT Engineering

Description

Knowledge Engineering and Data Science (2597-4637), KEDS, brings together researchers, industry practitioners, and potential users, to promote collaborations, exchange ideas and practices, discuss new opportunities, and investigate analytics frameworks on data-driven and knowledge base systems. ...