Amanda, Riyan
Unknown Affiliation

Published : 1 Documents Claim Missing Document
Claim Missing Document
Check
Articles

Found 1 Documents
Search

Analysis and Implementation Machine Learning for YouTube Data Classification by Comparing the Performance of Classification Algorithms Amanda, Riyan; Negara, Edi Surya
JOIN (Jurnal Online Informatika) Vol 5, No 1 (2020)
Publisher : Department of Informatics, UIN Sunan Gunung Djati Bandung

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.15575/join.v5i1.505

Abstract

Every day, people around the world upload 1.2 million videos to YouTube or more than 100 hours per minute, and this number is increasing. The condition of this continuous data will be useless if not utilized again. To dig up information on large-scale data, a technique called data mining can be a solution. One of the techniques in data mining is classification. For most YouTube users, when searching for video titles do not match the desired video category. Therefore, this research was conducted to classify YouTube data based on its search text. This article focuses on comparing three algorithms for the classification of YouTube data into the Kesenian and Sains category. Data collection in this study uses scraping techniques taken from the YouTube website in the form of links, titles, descriptions, and searches. The method used in this research is an experimental method by conducting data collection, data processing, proposed models, testing, and evaluating models. The models applied are Random Forest, SVM, Naive Bayes. The results showed that the accuracy rate of the random forest model was better by 0.004%, with the label encoder not being applied to the target class, and the label encoder had no effect on the accuracy of the classification models. The most appropriate model for YouTube data classification from data taken in this study is Naïve Bayes, with an accuracy rate of 88% and an average precision of 90%.