Contact Name
Aji Prasetya Wibawa
Contact Email
aji.prasetya.ft@um.ac.id
Phone
+62818539333
Journal Mail Official
keds.journal@um.ac.id
Editorial Address
Gedung G4. Lantai 1 Jl. Semarang No.5, Malang
Location
Kota Malang,
Jawa Timur
INDONESIA
Knowledge Engineering and Data Science
ISSN : -     EISSN : 2597-4637     DOI : http://dx.doi.org/10.17977
Knowledge Engineering and Data Science (2597-4637), KEDS, brings together researchers, industry practitioners, and potential users to promote collaboration, exchange ideas and practices, discuss new opportunities, and investigate analytics frameworks for data-driven and knowledge-based systems.
Articles 81 Documents
A Comparative Study of Machine Learning-based Approach for Network Traffic Classification Kien Trang; An Hoang Nguyen
Knowledge Engineering and Data Science Vol 4, No 2 (2021)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v4i22021p128-137

Abstract

Internet usage has increased rapidly and become an essential part of human life, in step with the rapid development of network infrastructure in recent years. Protecting users' confidential information on the global network has therefore become one of the most significant considerations. Although multiple encryption algorithms and techniques have been applied by different parties, including internet providers and web hosting services, encryption also allows attackers to target the network system anonymously. Consequently, classifying network data streams to improve network quality and security is attracting increasing research interest. This work introduces a machine learning-based approach to find the most suitable training model for network traffic classification tasks. Data pre-processing is first applied to normalize each feature type in the dataset. In the classification phase, different machine learning techniques, including k-Nearest Neighbors (KNN), Artificial Neural Network (ANN), and Random Forest (RF), are applied to the normalized features. The open-access dataset ISCXVPN2016 is used for this research; it includes two types of encryption (VPN and Non-VPN) and seven traffic categories. Experimental results on the open dataset show that the proposed models reach a high classification rate, over 85% in some cases, with the RF model obtaining the best results among the three techniques.
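The pre-processing and KNN steps named in the abstract can be illustrated with a minimal sketch. It assumes min-max scaling as the normalization and uses made-up two-feature flows (duration, mean packet size) with hypothetical labels; the abstract specifies neither the normalization method nor the features, so this is not the authors' pipeline.

```python
import math
from collections import Counter


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical flows: (duration in seconds, mean packet size in bytes).
flows = [[1.0, 100.0], [2.0, 120.0], [9.0, 900.0], [10.0, 1000.0]]
labels = ["chat", "chat", "streaming", "streaming"]
unseen = [8.0, 800.0]

# Normalize the training flows and the unseen flow with the same scaling.
scaled = min_max_normalize(flows + [unseen])
prediction = knn_predict(scaled[:-1], labels, scaled[-1], k=3)
```

Scaling matters here because the raw packet-size column would otherwise dominate the Euclidean distance.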
A Comprehensive Analysis of Reward Function for Adaptive Traffic Signal Control Abu Rafe Md Jamil; Naushin Nower
Knowledge Engineering and Data Science Vol 4, No 2 (2021)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v4i22021p85-96

Abstract

Adaptive traffic signal control (ATSC) systems can play an important role in reducing traffic congestion in urban areas. The main challenge for ATSC is to determine the proper signal timing. Recently, deep reinforcement learning (DRL) has been used to determine proper signal timing. However, the success of a DRL algorithm depends on appropriate reward function design. Various reward functions for ATSC exist in the literature. In this research, a comprehensive analysis of the widely used reward functions is presented. The pros and cons of the various reward functions are discussed, and experimental analysis shows that a multi-objective reward function enhances the performance of ATSC.
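One common way to build a multi-objective reward is a weighted sum of several traffic measures. The sketch below assumes three objectives (queue length, waiting time, throughput) and illustrative weights; the abstract does not state which objectives or weights the paper uses, so all names and values here are hypothetical.

```python
def multi_objective_reward(queue_length, waiting_time, throughput,
                           weights=(0.4, 0.4, 0.2)):
    """Weighted-sum reward: penalize congestion, reward vehicles served.

    The objectives and weights are assumptions chosen for illustration.
    """
    w_queue, w_wait, w_throughput = weights
    return (-w_queue * queue_length
            - w_wait * waiting_time
            + w_throughput * throughput)


# A congested step should score lower than a free-flowing one.
congested = multi_objective_reward(queue_length=10, waiting_time=20, throughput=5)
free_flow = multi_objective_reward(queue_length=2, waiting_time=3, throughput=8)
```

In practice the terms are usually normalized to comparable ranges first, so that no single objective dominates the sum.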
Stress Classification using Deep Learning with 1D Convolutional Neural Networks Abdulrazak Yahya Saleh; Lau Khai Xian
Knowledge Engineering and Data Science Vol 4, No 2 (2021)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v4i22021p145-152

Abstract

Stress is a major problem that affects people in various ways, and it is becoming more serious every day. Identifying whether someone is suffering from stress is crucial before it becomes a severe illness. Artificial Intelligence (AI) interprets external data, learns from such data, and uses the learning to achieve specific goals and tasks. Deep Learning (DL) has made an impact in the field of AI because it can perform tasks with high accuracy. The primary purpose of this paper is therefore to evaluate the performance of 1D Convolutional Neural Networks (1D CNNs) for stress classification. A Psychophysiological Stress (PS) dataset is used in this paper. The PS dataset consists of twelve features obtained from the expert. The 1D CNNs are trained and tested on the PS dataset using 10-fold cross-validation. The algorithm's performance is evaluated based on accuracy and loss metrics. The 1D CNNs achieve 99.7% accuracy in stress classification, outperforming Backpropagation (BP), which achieves only 65.57%. The findings therefore indicate that 1D CNNs classify stress more effectively than BP. Further explanation is provided in the paper to demonstrate the efficiency of 1D CNNs for stress classification.
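The 10-fold cross-validation protocol mentioned in the abstract can be sketched as an index splitter. This is a generic illustration, not the paper's code: the helper name `k_fold_indices` is hypothetical, and the sketch omits the shuffling that is usually applied before splitting.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(25, k=10))
```

Each sample is held out exactly once, so the reported accuracy is the average over the ten held-out folds.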
Parallel Approach of Adaptive Image Thresholding Algorithm on GPU Adhi Prahara; Andri Pranolo; Nuril Anwar; Yingchi Mao
Knowledge Engineering and Data Science Vol 4, No 2 (2021)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v4i22021p69-84

Abstract

Image thresholding is used to segment an image into background and foreground using a given threshold. The threshold can be generated by a specific algorithm instead of being a pre-defined value obtained from observation or experiment. However, such algorithms involve per-pixel operations, histogram calculation, and an iterative search for the optimum threshold, which is costly for high-resolution images. In this research, parallel GPU implementations of three adaptive image thresholding methods, namely Otsu, ISODATA, and minimum cross-entropy, are proposed to reduce their computation time on high-resolution images. The approach uses parallel reduction and parallel prefix sum (scan) techniques to optimize the calculation. The proposed approach was tested on grayscale images of various sizes. The results show that the parallel implementations of the three adaptive image thresholding methods on GPU achieve a 4-6 times speedup over the CPU implementation, significantly reducing computation time and handling high-resolution images effectively.
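To show where prefix sums enter the calculation, here is a sequential Python sketch of Otsu's method built on cumulative histogram sums. It is not the paper's GPU code: `itertools.accumulate` stands in for the parallel scan, and the function name and the convention that pixels at or below the threshold form one class are assumptions.

```python
from itertools import accumulate


def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Prefix sums of counts and of intensity-weighted counts; on a GPU
    # these two passes are the natural place for a parallel scan.
    cum_count = list(accumulate(hist))
    cum_sum = list(accumulate(i * h for i, h in enumerate(hist)))
    total, total_sum = cum_count[-1], cum_sum[-1]

    best_t, best_score = 0, -1.0
    for t in range(levels - 1):
        w0 = cum_count[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class would be empty at this threshold
        mu0 = cum_sum[t] / w0
        mu1 = (total_sum - cum_sum[t]) / w1
        score = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# A toy bimodal "image": half dark pixels, half bright pixels.
threshold = otsu_threshold([10] * 50 + [200] * 50)
```

With the prefix sums in place, each candidate threshold is evaluated in constant time, which is what makes the search cheap once the scan is done.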
Melanoma Classification based on Simulated Annealing Optimization in Neural Network Edi Jaya Kusuma; Ika Pantiawati; Sri Handayani
Knowledge Engineering and Data Science Vol 4, No 2 (2021)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v4i22021p97-104

Abstract

Technology development in the fields of image processing and artificial intelligence has led to high demand for smart systems, especially in the health sector. Cancer is among the diseases with the highest mortality worldwide. Melanoma is a type of cancer caused by high exposure to UV light. The earlier melanoma is identified, the higher the patient's chance of recovery. Therefore, this study performs melanoma detection based on a BPNN optimized by a simulated annealing (SA) algorithm. This research uses the PH2 dermoscopic image data, which contains 200 color digital images in BMP format. The data is processed using color feature extraction techniques to identify the characteristics of each image according to the target data. The color spaces used for extraction include mean RGB, HSV, CIE LAB, YCbCr, and XYZ. The evaluation results show that the BPNN-SA method improves accuracy in classifying skin cancer compared to the original BPNN method, with an overall average accuracy of 84.03%.
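Simulated annealing itself can be sketched on a toy problem. The example below minimizes a one-dimensional quadratic standing in for a network's loss; the abstract does not say which network parameters SA tunes or which schedule is used, so the step size, cooling rate, and loss function are all illustrative assumptions.

```python
import math
import random


def simulated_annealing(loss, start, step=0.5, temp=1.0, cooling=0.95,
                        iters=500, seed=0):
    """Minimize loss(x) over a single real value x; return (best_x, best_loss)."""
    rng = random.Random(seed)
    current, current_loss = start, loss(start)
    best, best_loss = current, current_loss
    for _ in range(iters):
        candidate = current + rng.uniform(-step, step)
        candidate_loss = loss(candidate)
        delta = candidate_loss - current_loss
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            current, current_loss = candidate, candidate_loss
            if current_loss < best_loss:
                best, best_loss = current, current_loss
        temp = max(temp * cooling, 1e-12)  # geometric cooling with a floor
    return best, best_loss


def toy_loss(w):
    """Hypothetical stand-in for a training loss, minimized at w = 3."""
    return (w - 3.0) ** 2


best_w, best_value = simulated_annealing(toy_loss, start=0.0)
```

The occasional acceptance of worse candidates early on is what lets the search leave poor local minima, which plain gradient-based backpropagation cannot do.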
Similarity Identification of Large-scale Biomedical Documents using Cosine Similarity and Parallel Computing Merlinda Wibowo; Christoph Quix; Nur Syahela Hussien; Herman Yuliansyah; Faisal Dharma Adhinata
Knowledge Engineering and Data Science Vol 4, No 2 (2021)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v4i22021p105-116

Abstract

Document similarity computation is an important research topic in information retrieval and a crucial issue for automatic document categorization. The similarity value lies between 0 and 1; the closer the value is to 1, the more relevant the two documents are considered, and vice versa. However, the large scale of textual information makes it difficult to find the relevance level between documents. This study finds that the relevance between MeSH heading texts in the PubMed documents is higher than the relevance between abstract texts. Furthermore, parallel computing is implemented to speed up the large-scale document similarity identification process, which is calculated automatically in the PubMed application. The execution time for MeSH headings is 15.447 seconds, and the execution time for abstracts is 74.191 seconds. The execution time for abstracts is higher than for MeSH headings because abstracts contain more words. This study has successfully identified the similarity between large-scale biomedical documents from PubMed using a cosine similarity algorithm. The results, shown as a graph and a table in the PubMed application, indicate that the cosine similarity of the MeSH heading texts is higher than that of the abstract texts. Cosine similarity is useful for measuring the similarity between documents based on the TF*IDF calculation result.
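The TF*IDF weighting and cosine similarity described in the abstract can be sketched in a few lines. This minimal version assumes whitespace tokenization, relative term frequency, and an unsmoothed `log(N / df)` inverse document frequency; the paper's exact weighting variant and its parallelization are not given in the abstract, and the sample texts are made up.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse {term: weight} TF*IDF vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up heading-like texts for illustration only.
docs = ["heart disease risk", "heart disease treatment", "protein folding dynamics"]
vecs = tf_idf_vectors(docs)
related = cosine_similarity(vecs[0], vecs[1])
unrelated = cosine_similarity(vecs[0], vecs[2])
```

Because TF*IDF weights are non-negative, the resulting similarity stays in the 0 to 1 range stated in the abstract. Each document pair is independent of the others, which is why the pairwise comparisons lend themselves to parallel execution.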
Optimized Three Deep Learning Models Based-PSO Hyperparameters for Beijing PM2.5 Prediction Andri Pranolo; Yingchi Mao; Aji Prasetya Wibawa; Agung Bella Putra Utama; Felix Andika Dwiyanto
Knowledge Engineering and Data Science Vol 5, No 1 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i12022p53-66

Abstract

Deep learning is a machine learning approach that produces excellent performance in various applications, including natural language processing, image identification, and forecasting. Deep learning network performance depends on the hyperparameter settings. This research attempts to optimize the deep learning architectures of Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Multilayer Perceptron (MLP) for forecasting tasks using Particle Swarm Optimization (PSO), a swarm intelligence-based metaheuristic optimization method. The proposed models are M-1 (PSO-LSTM), M-2 (PSO-CNN), and M-3 (PSO-MLP). The Beijing PM2.5 dataset was analyzed to measure the performance of the proposed models. PM2.5, the target variable, is affected by dew point, pressure, temperature, cumulated wind speed, hours of snow, and hours of rain. The deep learning network inputs cover three different scenarios: daily, weekly, and monthly. The results show that the proposed M-1 with three hidden layers produces the best RMSE and MAPE results compared to the proposed M-2, M-3, and all the baselines. These optimized models could be used to generate recommendations for air pollution management.
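The two error measures used to rank the models, RMSE and MAPE, follow their standard definitions, sketched below. The sample values are made up for illustration and are not results from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is 0."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical PM2.5 readings and forecasts.
observed = [100.0, 200.0]
forecast = [110.0, 180.0]
error_rmse = rmse(observed, forecast)
error_mape = mape(observed, forecast)
```

RMSE penalizes large misses more heavily, while MAPE expresses the error relative to the observed level, so the two can rank models differently.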
Non-Gaussian Analysis of Herbarium Specimen Damage to Optimize Specimen Collection Management Aris Yaman; Yulia Aris Kartika; Ariani Indrawati; Zaenal Akbar; Lindung Parningotan Malik; Wita Wardani; Tutie Djarwaningsih; Taufik Mahendra; Dadan Ridwan Saleh
Knowledge Engineering and Data Science Vol 5, No 1 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i12022p1-16

Abstract

Damage to specimen collections occurs in practically every herbarium across the world. Hence, some precautions must be taken, such as investigating the factors that cause specimen damage in the collections and evaluating the herbarium's collection handling and usage policy. However, manual investigation of the causes of herbarium collection damage requires a lot of effort and time, and only a few studies have attempted it. So far, a non-Gaussian approach to detecting the causes of damage to herbarium specimens has not been studied. This study explores the effect of species type, time, location, storage, and remounting status on the level of damage to herbarium specimens, especially those in the genus Excoecaria. Gaussian modeling is not adequate for count data (the amount of damage to herbarium specimens). Negative binomial regression (NBR) provides a better model than the generalized Poisson regression and ordinary Gaussian regression approaches. NBR detects non-uniformity in the storage process, which causes damage to herbarium specimens. Natural damage to herbarium specimens is caused by differences in species and the origin of specimens.
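A common first check when choosing between Poisson-type and negative binomial models for count data is the dispersion index, the sample variance divided by the mean. The sketch below uses made-up damage counts; the abstract does not say which diagnostics the authors applied, so this only illustrates the general idea of overdispersion.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by the mean.

    Values near 1 are consistent with a Poisson model; values well above 1
    indicate overdispersion, where a negative binomial model often fits better.
    """
    return variance(counts) / mean(counts)


# Hypothetical damage counts per specimen: mostly low, with a few large values.
damage_counts = [0, 0, 1, 0, 7, 0, 12, 1]
index = dispersion_index(damage_counts)
```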
Social Distancing Monitoring System using Deep Learning Amelia Ritahani Ismail; Nur Shairah Muhd Affendy; Ahsiah Ismail; Asmarani Ahmad Puzi
Knowledge Engineering and Data Science Vol 5, No 1 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i12022p17-26

Abstract

COVID-19 was declared a pandemic in 2020. One way to prevent COVID-19, as the World Health Organization (WHO) suggests, is to keep a distance from other people. It is advised to stay at least 1 meter away from others, even if they do not appear to be sick, because people can carry the virus without showing any symptoms. Thus, many countries have included social distancing rules in their Standard Operating Procedures (SOP) to prevent the spread of the virus. Monitoring social distance is challenging because it requires authorities to carefully observe every single person in an area, especially in crowded places. Real-time object detection can improve the efficiency of social distancing SOP inspection. Therefore, in this paper, object detection using a deep neural network is proposed to help the authorities monitor social distancing even in crowded places. The proposed system uses the You Only Look Once (YOLO) v4 object detection model for detection. The proposed system is tested on the MS COCO image dataset, which contains a total of 330,000 images. The mean average precision (mAP) and frames per second (FPS) of the proposed object detector are compared with those of the Faster Region-based Convolutional Neural Network (R-CNN) and Single Shot MultiBox Detector (SSD) models. Finally, the results are analyzed across all the models.
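Once people have been detected, the distancing check reduces to pairwise distances between their positions. The sketch below assumes the detector's output has already been converted to ground-plane coordinates in meters, a calibration step the abstract does not describe; the function name and sample positions are hypothetical.

```python
import math
from itertools import combinations


def find_violations(positions, min_distance=1.0):
    """Return index pairs of people standing closer than min_distance meters."""
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(positions), 2)
        if math.dist(p, q) < min_distance
    ]


# Hypothetical ground-plane positions (x, y) in meters for three detections.
people = [(0.0, 0.0), (0.5, 0.0), (3.0, 3.0)]
violations = find_violations(people)
```

The default threshold follows the 1-meter guidance quoted in the abstract. The pairwise loop is quadratic in the number of people, which is acceptable for the handful of detections in a single frame.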
Automatic 3D Cranial Landmark Positioning based on Surface Curvature Feature using Machine Learning Putu Hendra Suputra; Anggraini Dwi Sensusiati; Myrtati Dyah Artaria; Gijsbertus Jacob Verkerke; Eko Mulyanto Yuniarno; I Ketut Eddy Purnama
Knowledge Engineering and Data Science Vol 5, No 1 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i12022p27-40

Abstract

Cranial anthropometric reference points (landmarks) play an important role in craniofacial reconstruction and identification. Knowledge of how to detect the position of landmarks is critical. This work aims to locate landmarks automatically. Landmark positioning using the Surface Curvature Feature (SCF) is inspired by conventional methods of finding landmarks based on morphometrical features. Each cranial landmark has a unique shape. With appropriate 3D descriptors, the computer can draw associations between shapes and landmarks using machine learning. The challenge in classification and detection in three-dimensional space is to determine the model and data representation. Using raw three-dimensional data in machine learning is a serious volumetric issue. This work uses the SCF as a three-dimensional descriptor. It extracts the local surface curvature shape into a sequence of projection (depth) values. A machine learning method is developed to determine the position of landmarks based on local surface shape characteristics. Classification is carried out on the top-n prediction probabilities for each landmark class; the resulting set of predictions is then filtered to obtain pinpoint accuracy. The landmark prediction points are hypothesized to cluster in a particular area, so a cluster-based filter is appropriate to isolate them. The learning model successfully detected the landmarks, with an average distance between the prediction points and the ground truth of 0.0326 normalized units. The cluster-based filter is implemented to increase accuracy with respect to the ground truth. Thus, SCF is suitable as a 3D descriptor of cranial landmarks.