Journal of Data Science and Its Applications
Vol 4 No 1 (2021): Journal of Data Science and Its Applications

Implementation of Enhance Confix Stripping Stemmer Algorithm for Multiclass Dataset Classification in News Text using K-Nearest Neighbor

Alvianda Ricky Lukman (Student)
Widi Astuti



Article Info

Publish Date
07 Oct 2021

Abstract

The need for news information has increased since the shift from physical media to online media. News is grouped into categories to make it easier for readers to find the news they want. Assigning news items to categories is known as text classification. The large number of words in news text creates a wide variety of word forms, which can be reduced by stemming, the process of changing an affixed word into its root word. This study compares classification with and without stemming and looks for the best value of K and the best distance metric for K-Nearest Neighbor. The best accuracy, 0.9671, is obtained when the stemming algorithm is not applied, with K=9 and cosine distance as the distance metric. This is higher than the accuracy of 0.9660 obtained when the stemming algorithm is applied, with K=7 and cosine distance.
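The comparison described in the abstract can be illustrated with a minimal sketch of K-Nearest Neighbor classification using cosine distance, with stemming as an optional preprocessing step. This is not the authors' implementation: the function names, the four-item toy training set, and `toy_stem` (a hypothetical stand-in that strips a few suffixes and is not the Enhance Confix Stripping algorithm) are all assumptions made for illustration, and plain term counts are used as features because the abstract does not state the term weighting.

```python
import math
from collections import Counter


def toy_stem(word):
    # Hypothetical stand-in for a real stemmer: strips a few common
    # Indonesian suffixes only. NOT the Enhance Confix Stripping algorithm.
    for suffix in ("kan", "nya", "an"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def vectorize(text, stem=None):
    # Bag-of-words term counts over lowercased, whitespace-split tokens.
    tokens = text.lower().split()
    if stem is not None:
        tokens = [stem(token) for token in tokens]
    return Counter(tokens)


def cosine_distance(a, b):
    # 1 - cosine similarity between two sparse term-count vectors.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(train, query, k, stem=None):
    # train is a list of (text, label) pairs; returns the majority label
    # among the k training texts closest to the query.
    query_vec = vectorize(query, stem)
    ranked = sorted(
        train,
        key=lambda item: cosine_distance(vectorize(item[0], stem), query_vec),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data, invented for this sketch (not the paper's dataset).
train = [
    ("tim menang pertandingan bola", "sport"),
    ("pemain bola cetak gol", "sport"),
    ("harga saham naik di bursa", "economy"),
    ("bank menaikkan suku bunga", "economy"),
]

query = "pertandingan bola malam ini"
print(knn_predict(train, query, k=3))                 # without stemming
print(knn_predict(train, query, k=3, stem=toy_stem))  # with the toy stemmer
```

Passing `stem=None` or a stemming function switches between the two conditions compared in the study, and varying `k` or swapping `cosine_distance` for another metric corresponds to the search over K and distance calculation.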

Copyrights © 2021






Journal Info

Abbrev

jdsa

Subject

Computer Science & IT; Decision Sciences, Operations Research & Management

Description

JDSA welcomes all topics that are relevant to data science, computational linguistics, and information sciences. The listed topics of interest are as follows: Big Data Analytics, Computational Linguistics, Data Clustering and Classifications, Data Mining and Data Analytics, Data Visualization, ...