Journal of Applied Data Sciences
Vol 4, No 4: DECEMBER 2023

Multiple Choice Question Difficulty Level Classification with Multi Class Confusion Matrix in the Online Question Bank of Education Gallery

Pariang Sonang Siregar (Department of Elementary Teacher Education, Universitas Rokania, Indonesia)
Rindi Genesa Hatika (Department of Physics Education, Universitas Pasir Pengaraian, Indonesia)
B. Herawan Hayadi (Information Technology Education, Universitas Bina Bangsa, Indonesia)



Article Info

Publish Date
05 Dec 2023

Abstract

Test question planning is a critical element in improving the quality of education because it helps teachers evaluate student understanding. Question writers must consider the level of difficulty, which is often divided into three categories: easy, medium, and difficult. Predicting the difficulty level of a question helps teachers create test questions that match students' abilities. In this study, we treat the identification of item difficulty as a classification problem. The data comprise questions from elementary and junior high school, to which we apply several machine learning methods. We tested Random Forest, Logistic Regression, SVM, Gaussian, and Dense NN classifiers, using embedding, lexical, and syntactic features. The evaluation shows that Random Forest is the best method for identifying the difficulty level of questions in subjects, with an accuracy of 84%. In the other cases, Random Forest is also the best method, with an accuracy of 80%. Our results show that embedding and TF-IDF features have a significant positive effect on the accuracy of the resulting model.
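
To make the evaluation terms concrete, the following is a minimal sketch of how a three-class confusion matrix and the accuracy derived from it can be computed for the easy, medium, and difficult labels. It is not the authors' code: the function names and the ten example labels are hypothetical and chosen only for illustration, and no figures from the study are reproduced.

```python
LABELS = ["easy", "medium", "difficult"]  # the three levels named in the abstract


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Return matrix[i][j] = number of items with true label i predicted as label j."""
    index = {label: k for k, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of items on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_precision_recall(matrix, labels=LABELS):
    """Return {label: (precision, recall)} from column sums and row sums."""
    scores = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        predicted = sum(row[k] for row in matrix)  # column sum: items predicted as this label
        actual = sum(matrix[k])                    # row sum: items that truly have this label
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


# Made-up labels for ten questions, for illustration only (not data from the study).
y_true = ["easy", "easy", "easy", "medium", "medium",
          "medium", "medium", "difficult", "difficult", "difficult"]
y_pred = ["easy", "easy", "medium", "medium", "medium",
          "difficult", "medium", "difficult", "difficult", "medium"]

cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
scores = per_class_precision_recall(cm)

for label, row in zip(LABELS, cm):
    print(f"{label:>9}: {row}")
print(f"accuracy = {acc:.2f}")
```

Rows are true labels and columns are predicted labels, so off-diagonal cells show which difficulty levels are confused with one another, which a single accuracy figure cannot reveal.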

Copyrights © 2023






Journal Info

Abbrev

JADS

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...