Garuda - Garba Rujukan Digital

P-Index (2019 - 2024): 0.444

This author has published in these journals

Telematika : Jurnal Informatika dan Teknologi Informasi Techno LPPM Seminar Nasional Informatika (SEMNASIF) Register: Jurnal Ilmiah Teknologi Sistem Informasi Proceeding of the Electrical Engineering Computer Science and Informatics

Hidayatulah Himawan

Informatika, Universitas Pembangunan Nasional Veteran Yogyakarta

Author-ID : 282152

Computer Science & IT; Electrical & Electronics Engineering

Published : 22 Documents

Articles

Effect of information gain on document classification using k-nearest neighbor
Rifki Indra Perwira; Bambang Yuwono; Risya Ines Putri Siswoyo; Febri Liantoni; Hidayatulah Himawan
Register: Jurnal Ilmiah Teknologi Sistem Informasi Vol. 8 No. 1 (2022): January
Publisher : Information Systems - Universitas Pesantren Tinggi Darul Ulum

DOI: 10.26594/register.v8i1.2397

State universities maintain libraries to support students' education and research, with collections that include books, journals, and final assignments. An intelligent document classification system is needed to make these collections easier for library visitors to use, as a form of service to students. The documents held in the library are generally research results. The main reasons a classification system is needed are recurring complaints when searching for documents: imbalanced text data, categories based on irrelevant document titles, and words with ambiguous meanings. This research uses k-Nearest Neighbor (k-NN) to categorize documents by study interest, with information gain feature selection to handle unbalanced data and cosine similarity to measure the distance between test and training data. In tests conducted with 276 training data items, the best result with information gain feature selection was an accuracy of 87.5%, obtained with 80% training data, 20% test data, and k=5. The highest accuracy overall, 92.9%, was achieved without information gain feature selection, using 90% training data, 10% test data, and k=5, 7, and 9. The paper concludes that the system is more accurate without information gain feature selection than with it, because every word in a document title is considered to play an essential role in forming the classification.
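The pipeline described in the abstract (information gain feature selection, then k-NN with cosine similarity over document titles) can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the presence/absence form of information gain, the raw term-count vectors, and all function names and sample titles below are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(title):
    """Lowercase whitespace tokenizer (an assumption; the paper's preprocessing is not given)."""
    return title.lower().split()


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(term, docs, labels):
    """Entropy reduction from splitting documents by presence or absence of `term`."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part) for part in (present, absent) if part)
    return entropy(labels) - conditional


def select_terms(docs, labels, top_n):
    """Keep the top_n terms by information gain (ties broken alphabetically)."""
    vocab = sorted({t for d in docs for t in d})
    ranked = sorted(vocab, key=lambda t: information_gain(t, docs, labels), reverse=True)
    return set(ranked[:top_n])


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query, train_docs, train_labels, k=5, vocab=None):
    """Majority vote among the k training documents most similar to `query`.

    If `vocab` is given, only those terms are used (feature selection on);
    if it is None, every term is used (feature selection off).
    """
    def vectorize(tokens):
        return Counter(t for t in tokens if vocab is None or t in vocab)

    q = vectorize(query)
    scored = sorted(
        ((cosine(q, vectorize(d)), y) for d, y in zip(train_docs, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data, not taken from the paper.
titles = [
    "image classification neural network",
    "neural network training",
    "database query optimization",
    "sql database index",
]
labels = ["ai", "ai", "db", "db"]
docs = [tokenize(t) for t in titles]

selected = select_terms(docs, labels, top_n=3)
query = tokenize("neural network model")
print(knn_predict(query, docs, labels, k=3, vocab=selected))  # with feature selection
print(knn_predict(query, docs, labels, k=3))                  # without feature selection
```

Passing `vocab=None` corresponds to the configuration without feature selection, so the two settings compared in the abstract can be evaluated with the same function.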

Co-Authors: Abdur Rahman, Hafidz Fajar; Adi Yusuf; Agus Sasmito Aribowo; Agus Triawan; Anak Agung Istri Sri Wiadnyani; Annesa Maya Sabarina; Awang Hendrianto Pratomo; Bambang Yuwono; Debby Gybson Putri; Dessyanto Boedi Prasetyo; Eko Yuli Prasetyo; Febri Liantoni; Hafidz Fajar Abdur Rahman; Hafidz Fajar Abdur Rahman, Hafidz Fajar Abdur; Heru Cahya Rustamaji; Mangaras Yanu F; Muhammad Afif Karomi; Muhammad Ali Husaini; Nur Heri Cahyana; Oliver Samuel Simanjuntak; Pratomo, Awang Hendrianto; Reza Raditya Setyo Putra; Rifki Indra Perwira; Risya Ines Putri Siswoyo; Wilis Kaswidjanti
