Jurnal Sisfokom (Sistem Informasi dan Komputer)
Vol 12, No 2 (2023): JULY

Automatic Categorization of Multi Marketplace FMCGs Products using TF-IDF and PCA Features

Sri Suci Indasari (Institut Teknologi Sepuluh Nopember)
Aris Tjahyanto (Institut Teknologi Sepuluh Nopember)



Article Info

Publish Date
01 Jul 2023

Abstract

The growing use of technology, in line with the increasing number of internet users, has shifted the product sales ecosystem toward electronic commerce (e-commerce). A total of 73.23% of customers made purchase transactions through e-commerce, and the most purchased products were those classified as Fast Moving Consumer Goods (FMCGs). As FMCG data grows more varied and the number of marketplaces increases, the data needs to be broken down into specific groups. This is done by analyzing e-commerce product information, especially product names and descriptions. In this study, we propose an automatic categorization of FMCG products using data from multiple marketplaces. Text data is converted into structured data through a series of preprocessing steps, and comprehensive experiments are carried out to compare the performance of three feature extraction methods: TF-IDF, BOW, and N-Gram. All three methods are evaluated on the text data sets using K-Means grouping results, with PCA used to reduce the data dimensions. The results show that TF-IDF with a dimension reduction value of 70, implemented in Python, provides optimal results for the percentage of grouped data.
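The pipeline described in the abstract (feature extraction, PCA, then K-Means) could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes scikit-learn, which the abstract does not name. The product names, the `cluster_products` helper, the use of unigram-plus-bigram counts for the N-Gram variant, the silhouette score as a quality measure, and the values `n_components=5` and `n_clusters=3` are all assumptions chosen for a toy data set; the toy data is too small for the dimension reduction value of 70 that the abstract reports.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import silhouette_score

# Hypothetical product names for illustration; not data from the study.
products = [
    "instant noodles chicken flavor 85g",
    "instant noodles beef flavor 80g",
    "fried instant noodles spicy 5 pack",
    "shampoo anti dandruff 170ml",
    "shampoo hair fall control 340ml",
    "herbal shampoo conditioner 2in1 170ml",
    "liquid detergent floral 800ml",
    "powder detergent lemon 900g",
    "detergent concentrate softener 1l",
]

# Three feature extractors to compare. Treating "N-Gram" as
# unigram + bigram counts is an assumption made for this sketch.
extractors = {
    "bow": CountVectorizer(),
    "tfidf": TfidfVectorizer(),
    "ngram": CountVectorizer(ngram_range=(1, 2)),
}


def cluster_products(texts, vectorizer, n_components=5, n_clusters=3, seed=0):
    """Vectorize texts, reduce with PCA, cluster with K-Means.

    Returns the cluster labels and the silhouette score of the clustering
    in the reduced space (an assumed quality measure for this sketch).
    """
    # PCA expects a dense array, so the sparse matrix is converted.
    features = vectorizer.fit_transform(texts).toarray()
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(features)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(reduced)
    return labels, silhouette_score(reduced, labels)


results = {name: cluster_products(products, vec) for name, vec in extractors.items()}
for name, (labels, score) in results.items():
    print(f"{name:>5}: labels={labels.tolist()} silhouette={score:.3f}")
```

On real marketplace data, the texts would first go through the preprocessing steps mentioned in the abstract, and `n_components` and `n_clusters` would be tuned rather than fixed.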

Copyrights © 2023






Journal Info

Abbrev

sisfokom

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

Jurnal Sisfokom is short for Jurnal Sistem Informasi dan Komputer (Journal of Information Systems and Computers). The journal is a collaboration between the academic community of STMIK Atma Luhur and other colleges and universities in Indonesia. It contains scientific articles from researchers, academics, and IT observers. Jurnal ...