Articles

Found 6 Documents

Cluster Analysis on Dengue Incidence and Weather Data Using K-Medoids and Fuzzy C-Means Clustering Algorithms (Case Study: Spread of Dengue in the DKI Jakarta Province) Cindy; Cynthia; Valentino Vito; Devvi Sarwinda; Bevina Desjwiandra Handari; Gatot Fatwanto Hertono
Journal of Mathematical and Fundamental Sciences Vol. 53 No. 3 (2021)
Publisher : Institute for Research and Community Services (LPPM) ITB

DOI: 10.5614/j.math.fund.sci.2021.53.3.9

Abstract

In Indonesia, Dengue incidence tends to increase every year but has been fluctuating in recent years. The potential for Dengue outbreaks in DKI Jakarta, the capital city, deserves serious attention. Weather factors are suspected of being associated with the incidence of Dengue in Indonesia. This research used weather and Dengue incidence data for five regions of DKI Jakarta, Indonesia, from December 30, 2008, to January 2, 2017. The study used a clustering approach on time-series and non-time-series data using K-Medoids and Fuzzy C-Means Clustering. The clustering results for the non-time-series data showed a positive correlation between the number of Dengue incidents and both average relative humidity and amount of rainfall, whereas Dengue incidence and average temperature were negatively correlated. Moreover, the clustering implementation on the time-series data showed that rainfall patterns most closely resembled those of Dengue incidence; therefore, rainfall can be used to estimate Dengue incidence. Both results suggest that the government could utilize weather data to predict possible spikes in dengue hemorrhagic fever (DHF) incidence, especially when entering the rainy season, and alert the public to a greater probability of a Dengue outbreak.
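
A minimal sketch of the two clustering steps described in the abstract, assuming the scikit-learn-extra (KMedoids) and scikit-fuzzy (cmeans) packages are available and using synthetic stand-in data in place of the Jakarta weather and Dengue records:

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn_extra.cluster import KMedoids   # assumes scikit-learn-extra is installed
import skfuzzy as fuzz                       # assumes scikit-fuzzy is installed

rng = np.random.default_rng(0)
# Synthetic stand-in for weekly records: [cases, rainfall, humidity, temperature]
X = rng.normal(size=(100, 4))
X = StandardScaler().fit_transform(X)

# K-Medoids with 3 clusters
km = KMedoids(n_clusters=3, random_state=0).fit(X)
print("K-Medoids labels:", km.labels_[:10])

# Fuzzy C-Means expects data shaped (n_features, n_samples)
cntr, u, _, _, _, _, fpc = fuzz.cmeans(X.T, c=3, m=2.0, error=1e-5, maxiter=1000)
print("FCM fuzzy partition coefficient:", round(fpc, 3))

On real data, cluster-wise averages of rainfall, humidity, and temperature could then be compared against the case counts in each cluster, which is how the correlations in the abstract would be read off.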
Implementation of Ensemble Self-Organizing Maps for Missing Values Imputation Titin Siswantining; Kathan Gerry Vivaldi; Devvi Sarwinda; Saskya Mary Soemartojo; Ika Mattasari; Herley Al-Ash
Indonesian Journal of Statistics and Applications Vol 6 No 1 (2022)
Publisher : Departemen Statistika, IPB University with Forum Perguruan Tinggi Statistika (FORSTAT)

DOI: 10.29244/ijsa.v6i1p1-12

Abstract

The purpose of this study is to implement the ensemble self-organizing maps (E-SOM) method to impute missing values at the data preprocessing stage, which is an important stage when making predictions or classifications. Ensemble Self-Organizing Maps (E-SOM) is a development of the SOM imputation method in which an ensemble framework of several SOMs is applied to improve generalization capability. In this study, the E-SOM imputation method is applied to South African heart disease data, with random forest as the classification model. The model evaluation showed that, for accuracy on the test data, the Random Forest model built from E-SOM-imputed data yields better accuracy than the Random Forest model built from SOM-imputed data for variations of 36, 49, 64, and 81 neurons, while for the variation of 25 neurons both models produce the same accuracy. Among the ensemble sizes tested, the E-SOM imputation method with a combination of 81 neurons and 15 ensembles produced the Random Forest model with the best accuracy.
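
Since there is no off-the-shelf E-SOM imputation routine, the following is a hand-rolled sketch of the idea, assuming the minisom package for the individual SOMs and synthetic data in place of the South African heart disease dataset: each SOM in the ensemble fills a missing entry from its best-matching codebook vector, and the ensemble fills are averaged.

import numpy as np
from minisom import MiniSom   # assumes the 'minisom' package; the ensemble logic is sketched by hand

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))                      # stand-in for the heart-disease features
mask = rng.random(X.shape) < 0.1                   # ~10% of cells made missing
X_miss = X.copy(); X_miss[mask] = np.nan

complete = X_miss[~np.isnan(X_miss).any(axis=1)]   # train each SOM on complete rows only

def impute_with_som(som, row):
    w = som.get_weights().reshape(-1, row.size)    # flattened codebook vectors
    obs = ~np.isnan(row)
    bmu = np.argmin(((w[:, obs] - row[obs]) ** 2).sum(axis=1))  # BMU on observed dims only
    filled = row.copy(); filled[~obs] = w[bmu, ~obs]
    return filled

fills = []
for seed in range(5):                              # ensemble of 5 SOMs with different seeds
    som = MiniSom(7, 7, X.shape[1], sigma=1.0, learning_rate=0.5, random_seed=seed)
    som.train_random(complete, 1000)
    fills.append(np.array([impute_with_som(som, r) for r in X_miss]))
X_imputed = np.mean(fills, axis=0)                 # average the ensemble's fills
print("RMSE on imputed cells:", np.sqrt(np.mean((X_imputed[mask] - X[mask]) ** 2)))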
Lung cancer classification based on support vector machine-recursive feature elimination and artificial bee colony Alhadi Bustamam; Zuherman Rustama; Selly A. A. K; Nyoman A. Wibawa; Devvi Sarwinda; Nadya Asanul Husna
Annals of Mathematical Modeling Vol 3, No 1 (2021)
Publisher : Research and Social Study Institute

DOI: 10.33292/amm.v13i1.71

Abstract

Early detection of cancerous cells can increase survival rates for patients by more than 97%. Microarray data, used for cancer classification, are composed of many thousands of features and from tens to hundreds of instances. Handling these huge datasets is the most important challenge in data classification; feature selection or reduction is therefore an essential task. We propose a cancer diagnostic tool that uses a support vector machine for classification and feature selection. First, we use support vector machine-recursive feature elimination to prefilter the genes. This was enhanced with the artificial bee colony algorithm. We ran four simulations using the Ontario and Michigan lung cancer datasets. This approach provides higher classification accuracy than approaches without feature selection, without support vector machine-recursive feature elimination, or without the artificial bee colony algorithm. The accuracy of a support vector machine using recursive feature elimination combined with the artificial bee colony algorithm for feature selection reached 98% with the 100 best features for the Michigan lung cancer dataset and 97% with the 70 best features for the Ontario lung cancer dataset. SVM with RFE-ABC as the feature selection method gives an accurate result for diagnosing lung cancer from microarray data.
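
A hedged sketch of the SVM-RFE prefiltering step with scikit-learn on synthetic microarray-like data; the artificial bee colony refinement has no standard library implementation and is omitted here:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for a microarray dataset: few samples, thousands of features
X, y = make_classification(n_samples=80, n_features=2000, n_informative=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# A linear SVM exposes coefficients, which RFE uses to rank and eliminate features
selector = RFE(SVC(kernel="linear"), n_features_to_select=100, step=0.1).fit(X_tr, y_tr)

clf = SVC(kernel="linear").fit(selector.transform(X_tr), y_tr)
print("Test accuracy with 100 selected features:",
      clf.score(selector.transform(X_te), y_te))

In the paper's pipeline, the 100 prefiltered features would then be handed to the bee colony search, which explores feature subsets and keeps the one with the best cross-validated SVM accuracy.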
Diabetic Retinopathy Detection Using GoogleNet Architecture of Convolutional Neural Network Through Fundus Images Amnia Salma; Alhadi Bustamam; Devvi Sarwinda
Nusantara Science and Technology Proceedings Bioinformatics and Biodiversity Conferences (BBC)
Publisher : Future Science

DOI: 10.11594/nstp.2021.0701

Abstract

About 422 million people in the world have Diabetes. Diabetes is a group of metabolic diseases characterized by elevated levels of blood glucose. The serious damage Diabetes causes to the blood vessels in the retinal tissue is called Diabetic Retinopathy. Diabetic Retinopathy can cause severe blindness. Early detection can help patients find suitable treatment and prevent blindness. Ophthalmologists can detect this disease by screening, but this method takes a long time, is very costly, and needs professional skills to perform. In the big data era, many researchers use deep learning models for medical assistance; one such approach is image classification. We have designed a tool using image classification to help ophthalmologists detect diabetic retinopathy. In this research, we use image classification to classify fundus images into two classes: normal (No DR) and Diabetic Retinopathy. We use 200 fundus images obtained from the Kaggle database. The deep learning model used in this research is a Convolutional Neural Network architecture called GoogleNet. For training the model we used Python as the programming language with the PyTorch library. GoogleNet has very good performance for image classification and achieved an accuracy of 88%.
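
A minimal PyTorch sketch of the setup described above, assuming a recent torchvision for the GoogLeNet implementation and using random tensors in place of the Kaggle fundus images:

import torch
import torch.nn as nn
from torchvision import models

# GoogLeNet backbone with the 1000-class head replaced by a 2-class head (No DR vs DR)
model = models.googlenet(weights=None, aux_logits=False, init_weights=True)
model.fc = nn.Linear(model.fc.in_features, 2)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Stand-in batch of 8 "fundus images" (3x224x224) with random binary labels
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8,))

model.train()
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print("One training step done, loss =", loss.item())

In practice the random tensors would be replaced by a DataLoader over the labeled fundus images, with the usual resize/normalize transforms, and the loop repeated for several epochs.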
MULTIPLE IMPUTATION FOR ORDINARY COUNT DATA BY NORMAL DISTRIBUTION APPROXIMATION Titin Siswantining; Muhammad Ihsan; Saskya Mary Soemartojo; Devvi Sarwinda; Herley Shaori Al-Ash; Ika Marta Sari
MEDIA STATISTIKA Vol 14, No 1 (2021): Media Statistika
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/medstat.14.1.68-78

Abstract

Missing values are a problem that is often encountered in various fields and must be addressed to obtain good statistical inference, such as parameter estimation. Missing values can be found in any type of data, including count data that is Poisson distributed. One solution to this problem is to apply multiple imputation techniques. The multiple imputation technique for count data consists of three main stages, namely imputation, analysis, and parameter pooling. The use of the normal distribution refers to the sampling distribution given by the central limit theorem for discrete distributions. This study is also equipped with numerical simulations that compare accuracy based on the resulting bias values. Based on the study, the solution proposed to overcome missing values in count data yields satisfactory results, as indicated by the small bias of the parameter estimates. However, the bias tends to increase as the percentage of missing observations increases and when the parameter values are small.
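
A small NumPy sketch of the three stages (imputation via the normal approximation N(lambda, lambda), analysis, and pooling) on a simulated Poisson sample; the parameter values and missingness rate are illustrative, not those used in the paper:

import numpy as np

rng = np.random.default_rng(0)
y = rng.poisson(lam=6.0, size=500).astype(float)
y[rng.random(500) < 0.2] = np.nan                  # 20% of observations made missing

obs = y[~np.isnan(y)]
lam_hat = obs.mean()                               # Poisson mean (= variance) from observed data

M = 10                                             # number of imputations
estimates = []
for _ in range(M):
    draws = rng.normal(lam_hat, np.sqrt(lam_hat), size=np.isnan(y).sum())
    completed = y.copy()
    completed[np.isnan(y)] = np.clip(np.round(draws), 0, None)   # imputation stage
    estimates.append(completed.mean())                           # analysis stage
print("Pooled estimate of lambda:", np.mean(estimates))          # pooling stage

The pooled estimate is simply the average of the per-dataset estimates; the bias reported in the abstract corresponds to the gap between this pooled value and the true lambda across repeated simulations.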
SPRATAMA MODEL FOR INDONESIAN PARAPHRASE DETECTION USING BIDIRECTIONAL LONG SHORT-TERM MEMORY AND BIDIRECTIONAL GATED RECURRENT UNIT Titin Siswantining; Stanley Pratama; Devvi Sarwinda
MEDIA STATISTIKA Vol 15, No 2 (2022): Media Statistika
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/medstat.15.2.129-138

Abstract

Paraphrasing is a way of writing sentences in other words with the same intent or purpose. Automatic paraphrase detection can be done using Natural Language Sentence Matching (NLSM), which is part of Natural Language Processing (NLP). NLP covers computational techniques for processing text in general, while NLSM is used specifically to find the relationship between two sentences. With the development of Neural Networks (NN), NLP can nowadays be done more easily by computers. Many more models for paraphrase detection have been developed for English than for Indonesian, which has less training data. This study proposes the SPratama Model, which performs paraphrase detection for Indonesian using Recurrent Neural Networks (RNN), namely Bidirectional Long Short-Term Memory (BiLSTM) and Bidirectional Gated Recurrent Unit (BiGRU). The data used is the "Quora Question Pairs" dataset taken from Kaggle and translated into Indonesian using Google Translate. The results of this study indicate that the proposed model has an accuracy of around 80% for the detection of paraphrased sentences.
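
A hypothetical PyTorch sketch of a sentence-pair classifier built from stacked BiLSTM and BiGRU encoders; the vocabulary size, dimensions, and the way the two sentence encodings are combined are illustrative assumptions, not the exact SPratama architecture:

import torch
import torch.nn as nn

class ParaphraseDetector(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.bigru = nn.GRU(2 * hidden, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(4 * hidden, 2)        # two encoded sentences, concatenated

    def encode(self, tokens):
        x, _ = self.bilstm(self.emb(tokens))       # BiLSTM layer
        x, _ = self.bigru(x)                       # BiGRU layer on top
        return x.mean(dim=1)                       # average over time steps

    def forward(self, s1, s2):
        return self.out(torch.cat([self.encode(s1), self.encode(s2)], dim=-1))

model = ParaphraseDetector()
s1 = torch.randint(1, 10000, (4, 20))              # batch of 4 token-id sequences, length 20
s2 = torch.randint(1, 10000, (4, 20))
print(model(s1, s2).shape)                         # torch.Size([4, 2]): paraphrase vs not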