Journal of Data Science and Its Applications
Vol 4 No 1 (2021): Journal of Data Science and Its Applications

Sentiment Analysis of Beauty Product Reviews Using the K-Nearest Neighbor (KNN) and TF-IDF Methods with Chi-Square Feature Selection

Yusrifa Deta Kirana (student)
Said Al Faraby (Unknown)



Article Info

Publish Date
17 Oct 2021

Abstract

The recent proliferation of beauty products can make consumers, especially women, hesitant when choosing one. Beauty product reviews have become a valuable source of information, both for consumers making purchase decisions and for producers improving their products and marketing strategies. Sentiment analysis classifies each review individually as negative or positive. In this study, sentiment analysis was therefore applied to beauty product review data using the K-Nearest Neighbor (KNN) method, with the aim of finding the best value of k for this case. The dataset was pre-processed with case folding, noise removal, tokenization, stemming, stopword removal, and slang-word handling. Feature extraction used Term Frequency-Inverse Document Frequency (TF-IDF) to calculate the weight of each word in a document, and feature selection used Chi-Square to keep only the features needed to increase accuracy. The best accuracy obtained in this study was 71%, achieved by KNN with k = 50 and Chi-Square feature selection retaining 76 features.
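
The pipeline named in the abstract (TF-IDF weighting, Chi-Square feature selection, KNN classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy reviews, and the use of cosine similarity are all assumptions made for illustration, the abstract does not state which distance measure was used, and the reviews are assumed to be already pre-processed into tokens. The values k = 3 and 3 features are chosen only because the toy data is tiny; the study reports k = 50 and 76 features.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Inverse document frequency for every term in the tokenized training docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # The +1 keeps a term that occurs in every document from getting weight 0.
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}

def tfidf(doc, idf):
    """Sparse TF-IDF vector (term -> weight); terms unseen in training are dropped."""
    return {t: tf * idf[t] for t, tf in Counter(doc).items() if t in idf}

def chi_square(docs, labels, term, positive):
    """Chi-square statistic of the 2x2 table: term present/absent vs. class."""
    a = sum(1 for d, y in zip(docs, labels) if term in d and y == positive)
    b = sum(1 for d, y in zip(docs, labels) if term in d and y != positive)
    c = sum(1 for d, y in zip(docs, labels) if term not in d and y == positive)
    d_ = len(docs) - a - b - c
    denom = (a + b) * (c + d_) * (a + c) * (b + d_)
    return len(docs) * (a * d_ - b * c) ** 2 / denom if denom else 0.0

def select_features(docs, labels, n_features, positive):
    """Keep the n_features terms with the highest chi-square score."""
    vocab = {t for doc in docs for t in doc}
    # Ties are broken alphabetically so the selection is deterministic.
    ranked = sorted(vocab, key=lambda t: (-chi_square(docs, labels, t, positive), t))
    return set(ranked[:n_features])

def cosine(u, v):
    """Cosine similarity between two sparse vectors (assumed similarity measure)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def knn_predict(query, train_vecs, train_labels, k):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(query, pair[0]), reverse=True)
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

# Hypothetical, already pre-processed reviews (not data from the study).
train_docs = [
    ["love", "this", "serum", "skin", "glow"],
    ["great", "moisturizer", "love", "it"],
    ["smooth", "skin", "great", "product"],
    ["bad", "smell", "broke", "out"],
    ["terrible", "product", "skin", "itchy"],
    ["bad", "texture", "waste", "money"],
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

idf = fit_idf(train_docs)
selected = select_features(train_docs, train_labels, n_features=3, positive="pos")

def restrict(vec):
    """Drop every term that feature selection did not keep."""
    return {t: w for t, w in vec.items() if t in selected}

train_vecs = [restrict(tfidf(doc, idf)) for doc in train_docs]

def classify(tokens, k=3):
    """Weight a tokenized review, apply the feature mask, and classify it."""
    return knn_predict(restrict(tfidf(tokens, idf)), train_vecs, train_labels, k)

print(sorted(selected))                    # ['bad', 'great', 'love']
print(classify(["love", "great", "glow"]))  # pos
print(classify(["bad", "smell"]))           # neg
```

Searching for the best k, as the study does, would amount to repeating the classification step for several values of k on held-out reviews and comparing the resulting accuracy.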

Copyrights © 2021






Journal Info

Abbrev

jdsa

Publisher

Subject

Computer Science & IT; Decision Sciences, Operations Research & Management

Description

JDSA welcomes all topics that are relevant to data science, computational linguistics, and information sciences. The listed topics of interest are as follows: Big Data Analytics, Computational Linguistics, Data Clustering and Classifications, Data Mining and Data Analytics, Data Visualization, ...