Garuda - Garba Rujukan Digital

Jurnal Teknik Informatika (JUTIF)

Vol. 4 No. 1 (2023): JUTIF Volume 4, Number 1, February 2023

Fahry Maodah (Magister Teknik Informatika, Universitas Amikom Yogyakarta, Indonesia)
Ema Utami (Magister Teknik Informatika, Universitas Amikom Yogyakarta, Indonesia)
Sudarmawan Sudarmawan (Magister Teknik Informatika, Universitas Amikom Yogyakarta, Indonesia)

Publish Date
10 Feb 2023

This research seeks to identify the most accurate and effective model for sentiment analysis of marketplace product reviews using a combination of preprocessing techniques, word2vec, and a convolutional neural network (CNN). We collected 20,986 reviews of 720 products from a marketplace by web scraping, then cleaned and labeled the data, yielding 515 positive and 490 negative reviews. We preprocessed the data under four different scenarios and built word vector representations with word2vec. We then fed the word2vec vectors into a CNN architecture to classify the sentiment of each review. After testing several variants of each technique, we found that the best results came from combining the third preprocessing scenario (case folding, punctuation removal, word normalization, and stemming), the second word2vec parameter combination (size 50, window 2, hs 0, and negative 10), and the fourth CNN parameter combination (kernel size 2, dropout 0.2, and learning rate 0.01): accuracy of 99.00%, precision of 98.96%, and recall of 98.96%. Word normalization contributed substantially to model accuracy by correcting misspelled or nonstandard words in the reviews. In the word2vec evaluation, hs 0 produced a higher average accuracy than hs 1; with hs 0 the model is trained with negative sampling rather than hierarchical softmax, which helped it capture the context of the trained words. For the CNN, a higher learning rate lets the model learn faster but can make training unstable, whereas a lower learning rate makes training more stable but slower.
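The third preprocessing scenario named in the abstract (case folding, punctuation removal, word normalization, and stemming) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The normalization table, the function names `preprocess` and `identity_stem`, and the sample review are hypothetical and chosen only for illustration. The stemmer is left as a pluggable placeholder because the abstract does not say which stemmer was used.

```python
import string

# Hypothetical normalization table: informal spelling -> standard form.
# These entries are illustrative only and are not taken from the paper.
NORMALIZATION = {
    "thx": "thanks",
    "gud": "good",
    "u": "you",
}

# Translation table that maps every ASCII punctuation mark to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def identity_stem(token):
    """Placeholder stemmer (assumption): returns the token unchanged."""
    return token


def preprocess(review, normalization=NORMALIZATION, stem=identity_stem):
    """Apply the four steps in order and return a list of tokens."""
    # 1. Case folding
    text = review.lower()
    # 2. Punctuation removal
    text = text.translate(_PUNCT_TO_SPACE)
    # 3. Word normalization: replace known informal spellings
    tokens = [normalization.get(tok, tok) for tok in text.split()]
    # 4. Stemming, delegated to the supplied stemmer
    return [stem(tok) for tok in tokens]


print(preprocess("Thx, the product is GUD!!!"))
```

A real stemmer for the review language would be passed through the `stem` argument, and the token lists returned by `preprocess` would then serve as the training sentences for word2vec.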

Journal Info

Jurnal Teknik Informatika (JUTIF)

Abbrev

jurnal

Publisher

Universitas Jenderal Soedirman

Subject

Computer Science & IT

Description

Jurnal Teknik Informatika (JUTIF) is an Indonesian national journal that publishes high-quality research papers in the broad field of Informatics, Information Systems and Computer Science, which encompasses software engineering, information system development, computer systems, computer networks, ...

Article Info

Abstract

OPTIMIZING SENTIMENT ANALYSIS OF PRODUCT REVIEWS ON MARKETPLACE USING A COMBINATION OF PREPROCESSING TECHNIQUES, WORD2VEC, AND CONVOLUTIONAL NEURAL NETWORK
