Indonesian Journal of Electrical Engineering and Computer Science
Vol 22, No 2: May 2021

The effect of the TF-IDF algorithm in times series in forecasting word on social media

Arif Ridho Lubis (Universitas Sumatera Utara)
Mahyuddin K. M. Nasution (Universitas Sumatera Utara)
Opim Salim Sitompul (Universitas Sumatera Utara)
Elviawaty Muisa Zamzami (Universitas Sumatera Utara)



Article Info

Publish Date
01 May 2021

Abstract

Forecasting is one of the main topics in data mining and machine learning, in which the data used carry a class label or target. Many algorithms for solving forecasting problems are therefore categorized as supervised learning, in which a model is trained on labeled data. Here, the label or target data play the role of a 'supervisor' that guides the training process toward a certain level of accuracy or precision. Time series analysis is a method generally used to forecast based on time, and it can be used to forecast words on social media. This study conducted word forecasting on Twitter with 1734 tweets, which were treated as documents weighted using the TF-IDF algorithm: the more frequently a word appears in the tweets, the smaller its TF-IDF value, and vice versa. After the word weights of the tweets were obtained, a time series forecast was performed with the 1734 tweets as test data; these comprised 1203 tweets in the Slack words category and 531 verb tweets, used as training data, and the forecast achieved good accuracy. Word forecasting was divided into two groups, i.e. inactive users and active users. The resulting MAPE was 50% for inactive users and 0.1980198% for active users.
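The two calculations named in the abstract, TF-IDF weighting and MAPE, can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant or tokenization the authors used, so this example assumes a common form (term frequency normalized by tweet length, multiplied by the natural log of N over the document frequency). The function names `tf_idf` and `mape` and the toy tweets are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    docs is a list of token lists (each tweet already tokenized).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def mape(actual, forecast):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical toy tweets, for illustration only.
tweets = [
    ["traffic", "jam", "today"],
    ["rain", "today"],
    ["holiday", "today"],
]
weights = tf_idf(tweets)
```

In this toy corpus, "today" occurs in every tweet, so its weight falls to zero, while a word that occurs in only one tweet, such as "traffic", keeps a positive weight. This matches the inverse relationship between frequency and TF-IDF value described in the abstract.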

Copyrights © 2021