LAD-LASSO: SIMULATION STUDY OF ROBUST REGRESSION IN HIGH DIMENSIONAL DATA
Septian Rahardiantoro; Anang Kurnia
FORUM STATISTIKA DAN KOMPUTASI Vol. 20 No. 2 (2015)
Publisher : FORUM STATISTIKA DAN KOMPUTASI

Abstract

A common situation in regression is that the number of predictor variables exceeds the number of observations, a setting known as high-dimensional data. The classical problem in this setting is multicollinearity. The problem worsens when the data are subject to heavy-tailed errors or outliers, which may appear in the responses and/or the predictors. For this reason, Wang et al. (2007) combined Least Absolute Deviation (LAD) regression, which is useful for robust regression, with LASSO, a popular choice for shrinkage estimation and variable selection, to form LAD-LASSO. Extensive simulation studies show that LAD-LASSO performs satisfactorily on high-dimensional datasets containing outliers, and better than LASSO.
Keywords: high dimensional data, LAD-LASSO, robust regression
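For orientation, the LAD-LASSO criterion is usually written as below. The notation (responses y_i, predictor vectors x_i, coefficients beta_j, penalty weights lambda_j, n observations, p predictors) is an assumption for illustration and is not taken from the abstract itself.

```latex
\hat{\beta}_{\text{LAD-LASSO}}
  = \arg\min_{\beta}\;
    \sum_{i=1}^{n} \left| y_i - x_i^{\top}\beta \right|
    \;+\; n \sum_{j=1}^{p} \lambda_j \left| \beta_j \right|
```

The absolute-value loss in the first term is what gives robustness to heavy-tailed errors, and the L1 penalty in the second term shrinks coefficients and can set some of them exactly to zero. Ordinary LASSO keeps the same kind of penalty but uses a squared-error loss.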
ALTERNATIF PENGGEROMBOLAN DATA DERET WAKTU DENGAN KONDISI TERDAPAT DATA KOSONG: Studi Kasus Penggerombolan Provinsi di Indonesia Berdasarkan Data Deret Waktu Rasio Gini tahun 2007 – 2017
Yusma Yanti; Septian Rahardiantoro
Indonesian Journal of Statistics and Applications Vol 2 No 1 (2018)
Publisher : Departemen Statistika, IPB University dengan Forum Perguruan Tinggi Statistika (FORSTAT)

DOI: 10.29244/ijsa.v2i1.55

Abstract

Panel data describe a setting in which many units are each observed periodically over a period of time. Clustering units on the basis of such data is known as time series clustering. Many methods have been developed for fluctuating time series, but missing data cause problems in this analysis. Missing data means that a value is unavailable for an observation because no information about it exists. This study proposes an alternative method for clustering time series that contain missing data: correlation matrices are converted into Euclidean distance matrices, to which hierarchical clustering is then applied. A simulation compared the performance of the alternative method with that of the commonly used method on data with 0%, 10%, 20% and 40% missing values. The clustering accuracy of the proposed method was always better than that of the commonly used method. The method was then applied to the annual Gini ratio of each province in Indonesia from 2007 to 2017, which contains missing data for North Kalimantan Province. Two clusters of provinces with different characteristics were found.
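The idea of clustering on a correlation-based distance can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not say how missing values enter the correlation or which conversion and linkage were used, so the sketch assumes pairwise-complete Pearson correlation, the conversion d = sqrt(2(1 - r)), and average linkage. All function names and the toy series are hypothetical.

```python
import math

def pairwise_corr(x, y):
    """Pearson correlation using only time points observed in both series.

    Assumes at least two shared time points with non-constant values.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)

def corr_distance(series):
    """Distance matrix d_ij = sqrt(2 * (1 - r_ij)) (assumed conversion)."""
    k = len(series)
    return [
        [math.sqrt(max(0.0, 2 * (1 - pairwise_corr(series[i], series[j]))))
         for j in range(k)]
        for i in range(k)
    ]

def average_linkage(dist, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)

# Toy data: None marks a missing year. Series 0 and 1 rise, 2 and 3 fall.
series = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, None, 6.1, 8.0, 9.9],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [9.0, 8.2, None, 4.1, 2.0],
]
clusters = average_linkage(corr_distance(series), n_clusters=2)
print(clusters)
```

Because the correlation is computed only from time points observed in both series, a series with gaps can still be compared with every other series without imputing values first.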
PENERAPAN ANALISIS LASSO DAN GROUP LASSO DALAM MENGIDENTIFIKASI FAKTOR-FAKTOR YANG BERHUBUNGAN DENGAN TUBERKULOSIS DI JAWA BARAT
Stephan Chen; Khairil Anwar Notodiputro; Septian Rahardiantoro
Indonesian Journal of Statistics and Applications Vol 4 No 1 (2020)
Publisher : Departemen Statistika, IPB University dengan Forum Perguruan Tinggi Statistika (FORSTAT)

DOI: 10.29244/ijsa.v4i1.510

Abstract

Tuberculosis is the deadliest infectious disease in Indonesia, and West Java is the province with the largest number of tuberculosis cases in Indonesia. This research was conducted to identify variables and groups of variables that could explain the number of tuberculosis cases in West Java. The data have many explanatory variables, and these variables form groups. LASSO and group LASSO can both be used for variable selection on data with many explanatory variables, and group LASSO can additionally be used when the variables are grouped. According to the LASSO analysis, the variables that can explain the number of tuberculosis cases in West Java are the number of people with disabilities, the number of pharmacy staff, the number of malnourished people, the number of people working and the number of cities. According to the group LASSO analysis, the variables that can explain the number of tuberculosis cases in West Java are those in the health and environmental groups. The government can focus on these factors if it wants to reduce the number of tuberculosis cases in West Java.
Algoritma Genetik: Alternatif Metode Penentuan Strata Optimum dalam Perancangan Survei
Yusma Yanti; Septian Rahardiantoro
KOMPUTASI Vol 14, No 1 (2017): Komputasi: Jurnal Ilmiah Ilmu Komputer dan Matematika
Publisher : Ilmu Komputer, FMIPA, Universitas Pakuan

DOI: 10.33751/komputasi.v14i1.275

Abstract

The purpose of stratification when sampling in a survey is to produce parameter estimators with small variance, so the allocation of strata needs to be determined. This study focuses on determining the number of strata and the allocation of elements to strata from a set of response values. A genetic algorithm (GA) is applied to this problem by minimizing the within-strata variance on the available set, for numbers of strata from 2 to 6. An empirical simulation study is developed for populations whose true number of strata is known, and the GA is then applied to the data with several candidate numbers of strata. Based on the simulation results, the GA can recover a number of strata that matches the true number, so it can serve as a good alternative method for choosing the optimal number of strata in survey sampling.
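A rough sketch of how a genetic algorithm might search for stratum boundaries is given below. It is not the authors' implementation: the abstract does not describe the encoding, the genetic operators, or how the number of strata is finally chosen. The sketch assumes a fixed number of strata k, encodes a solution as cut positions in the sorted values, and uses the total within-strata sum of squares as the quantity to minimize. All names and parameter values are hypothetical.

```python
import random

def within_ss(values, cuts):
    """Total within-strata sum of squares; `cuts` are split indices
    into the sorted values."""
    data = sorted(values)
    total, start = 0.0, 0
    for end in list(cuts) + [len(data)]:
        stratum = data[start:end]
        mean = sum(stratum) / len(stratum)
        total += sum((v - mean) ** 2 for v in stratum)
        start = end
    return total

def random_cuts(n, k, rng):
    """k - 1 distinct cut positions, so every stratum is non-empty."""
    return sorted(rng.sample(range(1, n), k - 1))

def crossover(a, b, k, rng):
    """Child draws its cuts from the union of both parents' cuts."""
    pool = sorted(set(a) | set(b))
    return sorted(rng.sample(pool, k - 1))

def mutate(cuts, n, rng):
    """Move one cut to a position that is not already used."""
    new = list(cuts)
    free = [i for i in range(1, n) if i not in new]
    if free and new:
        new[rng.randrange(len(new))] = rng.choice(free)
    return sorted(new)

def ga_strata(values, k, pop_size=30, generations=50, seed=0):
    rng = random.Random(seed)
    n = len(values)
    pop = [random_cuts(n, k, rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half unchanged (elitism), refill with children.
        pop.sort(key=lambda c: within_ss(values, c))
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = crossover(a, b, k, rng)
            if rng.random() < 0.3:
                child = mutate(child, n, rng)
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda c: within_ss(values, c))
    return best, within_ss(values, best)

values = [1, 2, 3, 10, 11, 12, 20, 21, 22]
cuts, score = ga_strata(values, k=3)
print(cuts, score)
```

Note that the within-strata sum of squares can only stay the same or shrink as k grows, so comparing numbers of strata from 2 to 6 needs an additional criterion that this sketch does not attempt to reproduce.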
PENERAPAN METODE COKRIGING DENGAN VARIOGRAM ISOTROPI DAN ANISOTROPI DALAM MEMPREDIKSI CURAH HUJAN BULANAN JAWA BARAT
Anik Djuraidah; Septian Rahardiantoro; Azizah Desiwari
Jurnal Meteorologi dan Geofisika Vol 20, No 1 (2019)
Publisher : Pusat Penelitian dan Pengembangan BMKG

DOI: 10.31172/jmg.v20i1.594

Abstract

Rainfall is one of the important climatic elements in agriculture. Information on the amount of rainfall is obtained from the weather stations in a region. The problem is that not every region has its own weather station, so spatial interpolation can be used to predict the amount of rainfall in a region. Cokriging is a spatial interpolation method with the Best Linear Unbiased Prediction (BLUP) property that involves at least two variables. The variables used in this study, rainfall and weather station elevation, were chosen because they are closely related. The data are monthly rainfall from 1981 to 2013 at 38 weather stations in West Java. The analysis began by determining an isotropic variogram, based on spatial distance, and an anisotropic variogram, based on distance and direction, for both variables. The best variogram was then used for rainfall prediction. The results showed that the best variogram is the isotropic one, with monthly rainfall predictions from cokriging having reduced mean square error values ranging from 0.54 to 1.46 and an average error of almost 0.
EKSPLORASI DAN ANALISIS REGRESI LOGISTIK TERHADAP KONDISI SUNGAI TERCEMAR LIMBAH DI DESA/KELURAHAN PROVINSI DKI JAKARTA INDONESIA
Septian Rahardiantoro; Yusma Yanti
Jurnal Matematika Sains dan Teknologi Vol. 23 No. 1 (2022)
Publisher : LPPM Universitas Terbuka

DOI: 10.33830/jmst.v23i1.3136.2022

Abstract

DKI Jakarta Province has the highest population density in Indonesia, which makes it vulnerable to environmental problems, including in its rivers. About 52.5% of the villages (desa/kelurahan) crossed by a river have a river polluted by waste. This study aims to explore the condition of waste-polluted rivers and to identify the factors thought to influence it, based on the 2018 Village Potential (Potensi Desa, Podes) data (Badan Pusat Statistik, 2018). Data exploration was carried out with scatter plots and summary statistics, and showed that the majority of the waste comes from households. Logistic regression with the stepwise method was then used to identify which factors, covering environmental conditions, function conversion, and the socioeconomic conditions of the village, influence whether a river is polluted by waste. The factors found to influence river pollution are the presence of river function conversion and a high number of poor households. In addition, the presence of river maintenance can be used as an indicator that a river in a DKI Jakarta village is polluted by waste.
Pemodelan Regresi Spasial Kekar: Studi Kasus Jumlah Kunjungan Wisatawan Mancanegara Asal Eurasia di Indonesia Tahun 2015
Resti Cahyati; Anik Djuraidah; Septian Rahardiantoro
Xplore: Journal of Statistics Vol. 2 No. 1 (2018): 30 Juni 2018
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v2i1.85

Abstract

A spatial regression model evaluates the relationship between one variable and several other variables while accounting for the spatial effects in each region. One cause of imprecise predictions from a spatial regression model is the presence of outliers or extreme values, which can distort the parameter estimators. Discarding them, however, could change the composition of the spatial effect in the data. Visitor arrivals from Eurasia to Indonesia by nationality in 2015 show great diversity caused by outliers. This paper therefore needs a robust method for estimating the spatial regression parameters, one whose estimates are not much affected by small changes in the data. The S-estimator principle is applied to estimate spatial regression coefficients that are robust to outlying observations. Modeling with the S-estimator principle in robust spatial regression accommodates the outlying observations quite effectively. This is indicated by a considerable change in the estimated regression coefficients and by a decrease in the MAPE and MAD values produced by the spatial regression model.
Two Step Method for Clustering Mixed Data untuk Menggerombolkan Toko Mainan Anak Digital
Muhammad Shalih; Cici Suhaeni; Septian Rahardiantoro
Xplore: Journal of Statistics Vol. 7 No. 3 (2018): 31 Desember 2018
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v7i3.131

Abstract

The development of digital trading has triggered a proliferation of shops selling all kinds of goods on various marketplaces. This is supported by the large number of internet users in Indonesia, which helps digital commercial stores reach market share. One of the growing categories in a marketplace is stores that sell toys. However, not all toy stores have a good reputation. Clustering based on store reputation indicators can show the condition of the toy stores in a marketplace. The store reputation indicators used are categorical and numerical variables. This study uses A Two-Step Method for Mixed Categorical and Numerical Data (TMCM), a clustering method that handles mixed numerical and categorical data using a co-occurrence concept. The clustering found that the optimal number of clusters is five, based on the maximum value of Pseudo-F and the minimum value of the ratio (R).
Penerapan Metode DBSCAN dalam Memperbaiki Kinerja K-Means untuk Penggerombolan Data Tweet
Astri Fatimah; Anang Kurnia; Septian Rahardiantoro; Yani Nurhadryani
Xplore: Journal of Statistics Vol. 8 No. 1 (2019): 30 April 2019
Publisher : Department of Statistics, IPB

DOI: 10.29244/xplore.v8i1.159

Abstract

Text mining is the process of mining text data by computer to extract the information it contains. Text data are unstructured and difficult to analyze, but they can be turned into structured data through pre-processing. After pre-processing, the text is represented as numerical data using the vector space model and term frequency-inverse document frequency (TF-IDF) weighting so that it can be analyzed. K-Means cluster analysis is one method that can be used for such data, but K-Means is not robust to noise. Outliers can be detected using Density Based Spatial Clustering of Applications with Noise (DBSCAN). The outliers identified by DBSCAN can be omitted from the data, and cluster analysis is then carried out again with K-Means using the same number of clusters k. The Silhouette Coefficient (SC) is used to evaluate the quality of the clustering. After removal of outliers, the SC value of the K-Means method increases significantly, by 0.21, for a small amount of data. Increasing the amount of text data also affects the number of clusters. This is influenced by the number of words in a document that are given a weight: the fewer words that are weighted, the more clusters are generated.
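The outlier-removal step described in this abstract can be illustrated with a small sketch. It only marks the points that DBSCAN would label as noise, meaning points that are neither core points nor within eps of a core point, so that K-Means could then be rerun on the remaining points. The parameter values, the toy 2-D points and the function name are assumptions for illustration; the paper works on TF-IDF document vectors, and its settings are not given in the abstract.

```python
import math

def dbscan_noise(points, eps, min_pts):
    """Return indices of points that DBSCAN would label as noise.

    A point is a core point if at least `min_pts` points (itself included)
    lie within distance `eps`. Noise points are non-core points with no
    core point in their eps-neighborhood.
    """
    def neighbors(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    core = {i for i in range(len(points)) if len(neighbors(i)) >= min_pts}
    return [i for i in range(len(points))
            if i not in core and not any(j in core for j in neighbors(i))]

# Toy data: four points forming a dense square and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
noise = dbscan_noise(points, eps=1.5, min_pts=3)
kept = [p for i, p in enumerate(points) if i not in noise]
print(noise)  # → [4]
```

The `kept` list is what a second K-Means run with the same k would receive. In practice a library implementation of both algorithms would normally replace this hand-written step.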