JAREE (Journal on Advanced Research in Electrical Engineering)
Vol 4, No 2 (2020): October

Clustering Data National Examinations based on Social Media Using K-Means Methods

Chandra Eko Wahyudi Utomo (Institut Teknologi Sepuluh Nopember; Pranata Komputer at University of Jember)

Mochamad Hariadi (Institut Teknologi Sepuluh Nopember)
Surya Sumpeno (Institut Teknologi Sepuluh Nopember)



Article Info

Publish Date
16 Oct 2020

Abstract

Social media is an increasingly interesting source of data for research. This study focuses on Twitter, one of the most widely accessed social media platforms in Indonesia. Public behavior can be studied by collecting and processing such data, including people's sentiments or opinions about the national examination in Indonesia as expressed in Twitter users' comments. This study aims to analyze the sentiment of social media users toward the National Examination in Indonesia. Data are retrieved by crawling via the Twitter API, then preprocessed, and features are extracted using TF-IDF. Because text data on Twitter is unstructured and highly varied, a grouping stage must be carried out first. Grouping is performed with K-Means clustering on Spark, which is used to handle clustering of very large and complex data. Clustering on Spark produced 3 clusters: among candidate cluster counts from 2 to 50, the elbow was detected at 3 clusters. The 3 resulting groups were then processed further with classification techniques to obtain a comparison of positive and negative sentiment in social media users' comments about the national exam. These results serve as recommendations and new knowledge about community behavior regarding the National Examination, based on social media.

Keywords: clustering, K-Means, national exam, sentiment analysis, social media.
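The pipeline described in the abstract (TF-IDF features, K-Means grouping, elbow-based choice of the cluster count) can be illustrated with a minimal single-machine sketch. It is not the authors' Spark implementation: it uses plain Python, the tokenized tweets are made-up placeholders, the function names (`tfidf`, `kmeans`, `elbow`) are hypothetical, and the elbow rule (largest second difference of the within-cluster cost) is an assumed heuristic, since the abstract does not say how the elbow was detected. On Spark, MLlib provides distributed counterparts such as `HashingTF`, `IDF`, and `KMeans` in `pyspark.ml`.

```python
import math
import random
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, dense TF-IDF vectors) for a list of tokenized documents."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    # Smoothed IDF, so a term present in every document keeps a nonzero weight.
    idf = {t: math.log((n + 1) / (df[t] + 1)) + 1.0 for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        length = len(d) or 1  # guard against empty documents
        vectors.append([tf[t] / length * idf[t] for t in vocab])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm. Returns (labels, centers, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # random initial centers
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    wcss = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return labels, centers, wcss


def elbow(ks, costs):
    """Assumed heuristic: pick the k where the cost curve bends most sharply."""
    best = max(range(1, len(ks) - 1),
               key=lambda i: costs[i - 1] - 2 * costs[i] + costs[i + 1])
    return ks[best]


# Made-up, already-tokenized example tweets (placeholders, not data from the study).
tweets = [
    ["exam", "hard", "stress"],
    ["exam", "stress", "tired"],
    ["exam", "pass", "happy"],
    ["exam", "happy", "proud"],
    ["school", "closed", "holiday"],
    ["holiday", "school", "trip"],
]
_, X = tfidf(tweets)
ks = list(range(2, 6))  # the study scanned 2 to 50; a toy range is used here
costs = [kmeans(X, k)[2] for k in ks]
print("WCSS per k:", dict(zip(ks, costs)))
print("elbow at k =", elbow(ks, costs))
```

Because K-Means starts from random centers, the cost for a given k can vary with the seed; in practice the scan is often repeated with several seeds and the lowest cost per k is kept before looking for the elbow.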

Copyrights © 2020






Journal Info

Abbrev

jaree

Publisher

Subject

Control & Systems Engineering Electrical & Electronics Engineering

Description

JAREE is an Open Access Journal published by the Department of Electrical Engineering, Institut Teknologi Sepuluh Nopember (ITS), Surabaya – Indonesia. Published twice a year every April and October, JAREE welcomes research papers with topics including power and energy systems, telecommunications ...