Articles

Found 33 Documents

ENSEMBLE LEARNING DENGAN METODE SMOTEBAGGING PADA KLASIFIKASI DATA TIDAK SEIMBANG Siringoringo, Rimbun; Jaya, Indra Kelana
Journal Information System Development (ISD) Vol 3, No 2 (2018): Journal Information System Development (ISD)
Publisher : UNIVERSITAS PELITA HARAPAN


Abstract

Classifying imbalanced data is a central problem in machine learning and data mining. Class imbalance degrades classification results because minority-class instances are often misclassified as the majority class. Conventional machine learning algorithms are not designed for imbalanced data, so their performance is typically suboptimal. In this study, ensemble learning with the SMOTEBagging method was applied to classify 11 imbalanced datasets. SMOTEBagging was also compared with three conventional classification algorithms: SVM, k-NN, and C4.5. Under 5-fold cross-validation, SMOTEBagging produced the higher AUC on 10 of the datasets. The mean AUC values, from lowest to highest, were 0.638 for SVM, 0.742 for k-NN, 0.770 for C4.5, and 0.895 for SMOTEBagging. A Friedman test showed that the AUC performance of SMOTEBagging differed significantly from that of the three conventional methods.
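The abstract above reports a Friedman test over per-dataset AUC values. As a rough illustration of how that statistic is formed, the sketch below ranks the algorithms within each dataset and computes the Friedman chi-square value. The function name `friedman_statistic` and the score table are hypothetical, made up only to show the arithmetic; they are not the paper's data, and the sketch assumes no tied scores within a dataset.

```python
def friedman_statistic(scores):
    """Friedman chi-square statistic for a table of scores.

    scores: one row per dataset, one column per algorithm (higher is better).
    Assumes no ties within a row, so ranks are the integers 1..k.
    """
    n = len(scores)        # number of datasets
    k = len(scores[0])     # number of algorithms
    rank_sums = [0.0] * k
    for row in scores:
        # Rank 1 goes to the best (highest) score in this dataset.
        order = sorted(range(k), key=lambda j: row[j], reverse=True)
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    mean_ranks = [r / n for r in rank_sums]
    return chi2, mean_ranks


# Hypothetical AUC values for 4 datasets and 3 algorithms (illustration only).
toy_scores = [
    [0.60, 0.70, 0.90],
    [0.65, 0.75, 0.88],
    [0.70, 0.68, 0.95],
    [0.62, 0.74, 0.91],
]
chi2, mean_ranks = friedman_statistic(toy_scores)
```

The statistic is then compared against a chi-square distribution with k - 1 degrees of freedom; a lower mean rank indicates an algorithm that tends to score higher across datasets.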
KLASIFIKASI DATA TIDAK SEIMBANG MENGGUNAKAN ALGORITMA SMOTE DAN k-NEAREST NEIGHBOR Siringoringo, Rimbun
Journal Information System Development (ISD) Vol 3, No 1 (2018): Journal Information System Development (ISD)
Publisher : UNIVERSITAS PELITA HARAPAN


Abstract

Classifying imbalanced data is a central problem in machine learning and data mining. Class imbalance degrades classification results because minority-class instances are often misclassified as the majority class. k-Nearest Neighbor is one of the most popular and simplest classification methods, but it is not designed to work on imbalanced datasets. In this study, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to address the class imbalance problem on the Credit Card Fraud dataset. Under 10-fold cross-validation, SMOTE raised the mean G-Mean from 53.4% to 81.0% and the mean F-Measure from 38.7 to 81.8%.

Keywords: Class imbalance, Synthetic Minority Over-sampling Technique, k-Nearest Neighbor
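To make the oversampling idea concrete, here is a minimal sketch of the core SMOTE step (interpolating between a minority sample and one of its nearest minority neighbors) together with the G-Mean and F-Measure formulas computed from confusion-matrix counts. The function names, the toy points, and the parameter defaults are assumptions for illustration; this is not the authors' implementation.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (squared Euclidean distance).
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic


def g_mean_and_f_measure(tp, fn, fp, tn):
    """G-Mean and F-Measure from confusion-matrix counts (positive = minority class)."""
    sensitivity = tp / (tp + fn)   # recall on the minority class
    specificity = tn / (tn + fp)   # recall on the majority class
    precision = tp / (tp + fp)
    g_mean = (sensitivity * specificity) ** 0.5
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return g_mean, f_measure


# Hypothetical 2-D minority samples, for illustration only.
minority_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority_points, n_synthetic=6, k=2)
```

Because each synthetic point lies on a segment between two existing minority points, it stays inside the region the minority class already occupies. The metric helper does not guard against empty classes, so the counts passed in must make every denominator nonzero.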
PEMODELAN TOPIK BERITA MENGGUNAKAN LATENT DIRICHLET ALLOCATION DAN K-MEANS CLUSTERING Siringoringo, Rimbun; Jamaluddin, Jamaluddin; Gea, Asaziduhu
Journal Information System Development (ISD) Vol 5, No 1 (2020): Journal Information System Development (ISD)
Publisher : UNIVERSITAS PELITA HARAPAN


Abstract

The majority of internet users today browse the internet to follow current news or information. The growth of the internet and social media has led to the emergence of hundreds of online news portals covering a very wide range of topics. Tracking news topics manually is ineffective and time-consuming. In this study, news topics were modeled using Latent Dirichlet Allocation (LDA). Before the LDA model was applied, supporting steps were also carried out: tokenization, lemmatization, tf-idf factorization, and non-negative matrix factorization. The results show that LDA can model news topics well, with a log-likelihood score of -13615.912 and a perplexity score of 378.958. In addition to LDA, topic modeling was also performed as clustering by applying k-means. Using the elbow method, the ideal number of clusters for k-means was found to be 5, with a silhouette score of 0.62.
MODEL HIBRID GENETIC-XGBOOST DAN PRINCIPAL COMPONENT ANALYSIS PADA SEGMENTASI DAN PERAMALAN PASAR Siringoringo, Rimbun; Perangin-angin, Resianta; Jamaluddin, Jamaluddin
METHOMIKA: Jurnal Manajemen Informatika & Komputerisasi Akuntansi Vol. 5 No. 2 (2021): METHOMIKA: Jurnal Manajemen Informatika & Komputerisasi Akuntansi
Publisher : Universitas Methodist Indonesia

DOI: 10.46880/jmika.Vol5No2.pp97-103

Abstract

Extreme Gradient Boosting (XGBoost) is a popular boosting algorithm based on decision trees. It is regarded as one of the strongest methods in the boosting family and converges very well. On the other hand, XGBoost has many hyperparameters. Choosing a value for each one is difficult, and poor choices can leave the results trapped in a local optimum; tuning each value manually also takes a lot of time. In this study, a Genetic Algorithm (GA) is applied to find optimal XGBoost hyperparameter values for a market segmentation problem. The model is evaluated using the ROC curve. The ROC test results for the SVM, Logistic Regression, and Genetic-XGBoost models are 0.89, 0.98, and 0.99, respectively. The results show that the Genetic-XGBoost model can be applied to market segmentation and forecasting.
ENSEMBLE LEARNING DAN ANALISIS SENTIMEN PADA DATA ULASAN PRODUK Rimbun Siringoringo; Resianta Perangin Angin; Mufria J. Purba
Jurnal Informatika Kaputama (JIK) Vol 3, No 2 (2019): VOLUME 3 NOMOR 2, EDISI JULI 2019
Publisher : STMIK KAPUTAMA

DOI: 10.1234/jik.v3i2.161

Abstract

The majority of internet users now search the internet before buying a product. One thing prospective buyers consider is product reviews. A prospective consumer may decide to buy a product under the influence of reviews with positive sentiment, or decide not to buy it under the influence of reviews with negative sentiment. Product reviews are a way of conveying consumer opinions and sentiments about a product online. Product review data mined directly from a database is inherently imbalanced between positive and negative sentiment. This makes it difficult for machine learning algorithms to perform classification and clustering. In this study, sentiment analysis was conducted on reviews of Trendy Shoes products from Denim Shoes. The stages of the sentiment analysis are data collection, initial processing, data transformation, feature selection, and classification using SMOTEBoost. Initial processing applies the text mining steps of case folding, non-alphanumeric removal, stop word removal, and stemming. The results were measured using Accuracy, G-Mean, and F-Measure. Tests on two sets of sentiment data show that SMOTEBoost can classify sentiment well. SMOTEBoost's performance was compared with other ensemble techniques, namely AdaBoost, RUSBoost, and SMOTEBagging. On the review_1 data, SMOTEBoost is better in Accuracy and G-Mean, while on the review_2 data it is better on all criteria: Accuracy, F-Measure, and G-Mean.
Pemodelan Topik Berita Menggunakan Latent Dirichlet Allocation dan K-Means Clustering Rimbun Siringoringo; Jamaluddin Jamaluddin; Resianta Perangin-Angin
Jurnal Informatika Kaputama (JIK) Vol 4, No 2 (2020): Volume 4, Nomor 2 Juli 2020
Publisher : STMIK KAPUTAMA

DOI: 10.1234/jik.v4i2.263

Abstract

The majority of people now search the internet for news or information. The growth of the internet and social media has led to the emergence of hundreds of online news portals with very diverse news topics. Searching for news topics manually is ineffective and time-consuming. In this study, news topic modeling was performed using Latent Dirichlet Allocation (LDA). Prior to applying the LDA model, supporting processes such as tokenization, lemmatization, tf-idf factorization, and non-negative matrix factorization were also applied. The results show that LDA can model news topics well, with a log-likelihood score of -13615.912 and a perplexity score of 378.958. In addition to LDA, topic modeling was also done in the form of clusters by applying k-means clustering. With the elbow method, the ideal number of clusters for k-means clustering is 5, and the silhouette score is 0.62.
PENDAMPINGAN DISAIN KEMASAN MAKANAN TIPA-TIPA DI DESA MAROM KECAMATAN ULUAN KABUPATEN TOBA SAMOSIR Rimbun Siringoringo; Jamaluddin Jamaluddin; Yosephine Sembiring
Jurnal Pengabdian Masyarakat Borneo Vol 4, No 2 (2020)
Publisher : Universitas Borneo Tarakan

DOI: 10.35334/jpmb.v4i2.1851

Abstract

Tipa-tipa is one of the best-known food souvenirs from Toba Samosir. The product is an expression of local culinary heritage in Toba Samosir, particularly in Uluan District, and comes from a tradition handed down through generations of the local community. Sales of these two local foods are currently declining because they are squeezed out by contemporary foods and snacks that are widely available in shops and supermarkets. The main contributing factors include packaging whose design and quality remain very traditional and simple, declining consumer confidence in the hygiene and fitness for consumption of tipa-tipa and sasagun, and minimal promotion of and information about the products for tourists. Through this PKM, the team assisted the partner community with packaging design, food labeling, and food wrapping. The partner community in this PKM is the group of tipa-tipa sellers in Marom Village, Uluan District, Toba Samosir Regency. The science and technology transfer carried out covered packaging design, packaging labeling, use of a sealer, use of an expiry-date stamp, and selection of modern packaging models.
Peningkatan Performa Cluster Fuzzy C-Means pada Klastering Sentimen Menggunakan Particle Swarm Optimization Rimbun Siringoringo; Jamaluddin Jamaluddin
Jurnal Teknologi Informasi dan Ilmu Komputer Vol 6, No 4: Agustus 2019
Publisher : Fakultas Ilmu Komputer, Universitas Brawijaya

DOI: 10.25126/jtiik.2019641090

Abstract

Fuzzy C-Means (FCM) is a very good clustering algorithm and is more flexible than conventional clustering algorithms. Despite these strengths, its main weakness is sensitivity to the cluster centers. This sensitivity makes the final result hard to control and leaves FCM prone to getting trapped in local optima. To address this problem, this study improves FCM by applying Particle Swarm Optimization (PSO) to determine better cluster centers. The approach is applied to sentiment clustering on high-dimensional data, namely product reviews collected from several online store sites in Indonesia. The results show that applying PSO to generate the FCM cluster centers improves FCM performance and yields more suitable output. The clustering performance measures used are Rand Index, F-Measure, and Objective Function Value (OFV). On all of these measures, FCM-PSO gives better results than FCM. The better OFV indicates that FCM-PSO converges faster and handles noise better.

Abstract

The Fuzzy C-Means (FCM) algorithm is one of the popular fuzzy clustering techniques. Compared with hard clustering algorithms, FCM is more flexible and fair. However, FCM is highly sensitive to the initial cluster centers and is easily trapped in a local optimum. To overcome this problem, this study proposes an improved FCM that uses the Particle Swarm Optimization (PSO) algorithm to determine better cluster centers for high-dimensional, unstructured sentiment clustering. The study uses product review data collected from several online shopping websites in Indonesia. Initial processing of the review data consists of case folding, non-alphanumeric removal, stop word removal, and stemming. PSO is applied to determine suitable cluster centers. The clustering performance criteria are Rand Index, F-Measure, and Objective Function Value (OFV). The results show that FCM-PSO performs better than conventional FCM in terms of Rand Index, F-Measure, and OFV. The better OFV indicates that FCM-PSO converges faster and handles noise better.
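The sensitivity of FCM to its initial cluster centers can be seen in a small sketch. The code below implements the standard FCM membership and center updates and the objective function value (OFV), then runs it from two starting points on hypothetical 2-D data. It does not implement PSO; in the approach described above, PSO would be the component that searches for good initial centers. All names and data here are illustrative assumptions.

```python
import math


def memberships_for(points, centers, m):
    """FCM membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))."""
    rows = []
    for x in points:
        # Clamp distances so a point sitting exactly on a center does not divide by zero.
        dists = [max(math.dist(x, c), 1e-12) for c in centers]
        rows.append([
            1.0 / sum((d_i / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
            for d_i in dists
        ])
    return rows


def update_centers(points, u, m, n_clusters):
    """Each center is the mean of all points weighted by u_ij^m."""
    dim = len(points[0])
    centers = []
    for j in range(n_clusters):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * x[d] for w, x in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


def objective(points, centers, u, m):
    """Objective function value: sum of u_ij^m times squared distance."""
    return sum(
        (u[i][j] ** m) * math.dist(x, c) ** 2
        for i, x in enumerate(points)
        for j, c in enumerate(centers)
    )


def fuzzy_c_means(points, initial_centers, m=2.0, iterations=30):
    centers = list(initial_centers)
    for _ in range(iterations):
        u = memberships_for(points, centers, m)
        centers = update_centers(points, u, m, len(centers))
    u = memberships_for(points, centers, m)
    return centers, u, objective(points, centers, u, m)


# Hypothetical data: two small groups of points.
data = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
_, u_good, ofv_good = fuzzy_c_means(data, [(0.0, 0.0), (5.0, 5.0)])
# Two identical starting centers never separate, so the run stays at a poor solution.
_, u_stuck, ofv_stuck = fuzzy_c_means(data, [(3.0, 3.0), (3.0, 3.0)])
```

With identical starting centers every point gets membership 0.5 in both clusters, both centers move to the overall mean, and the OFV stays well above the value reached from the well-separated start. That is the kind of poor initialization a center-search step is meant to avoid.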
OPTIMASI FUNGSI KEANGGOTAAN FUZZY BERBASIS ALGORITMA MODIFIED PARTICLE SWARM OPTIMIZATION Rimbun Siringoringo; Zakarias Situmorang
Komputa : Jurnal Ilmiah Komputer dan Informatika Vol 3 No 2 (2014): Komputa : Jurnal Ilmiah Komputer dan Informatika
Publisher : Program Studi Teknik Informatika - Universitas Komputer Indonesia (UNIKOM)

DOI: 10.34010/komputa.v3i2.2391

Abstract

In this study, optimization based on the Modified Particle Swarm Optimization (MPSO) algorithm is applied to optimize fuzzy membership functions. Two MPSO methods are applied: the Linear Decreasing Inertia Weight (LDIW) method and the Constriction Factor Method (CFM). Each method was tested in 10 trials with two swarm sizes, 50 and 20 particles. The tests show that, for the same number of particles, CFM obtains a better global best fitness value than LDIW. In the test with 10 trials and 50 particles, the first trial gave global best fitness values of 1.4, 1.4, 2.36, and 3.28 for the variables productivity, isolation, social relations, and accessibility, respectively. In the test with 10 trials and 20 particles, the global best fitness values were 2.34, 2.40, 2.37, and 3.36 for the respective variables. CFM also converges faster than LDIW. In tests with 100 swarms, LDIW found the global best fitness at swarm 91, 84, 54, and 38 for the respective variables, while CFM found it at swarm 81, 23, 34, and 23.
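The two velocity-update schemes named above differ only in how the previous velocity is damped. The sketch below shows both in a small PSO loop that minimizes a toy sphere function. The coefficient values (inertia from 0.9 to 0.4, c1 = c2 = 2.0 for LDIW, c1 = c2 = 2.05 for CFM), the objective, the bounds, and all function names are common textbook choices assumed here for illustration; the abstract does not state the settings or the fitness function actually used.

```python
import math
import random


def ldiw_weight(iteration, max_iterations, w_start=0.9, w_end=0.4):
    """Linearly decreasing inertia weight: w_start at iteration 0, w_end at the end."""
    return w_start - (w_start - w_end) * iteration / max_iterations


def constriction_factor(c1=2.05, c2=2.05):
    """Constriction coefficient chi = 2 / |2 - phi - sqrt(phi^2 - 4*phi)|, phi = c1 + c2 > 4."""
    phi = c1 + c2
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))


def pso_minimize(objective, dim, variant, n_particles=20, max_iterations=100,
                 bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` with PSO; variant is "ldiw" or "cfm"."""
    rng = random.Random(seed)
    lo, hi = bounds
    positions = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_val = [objective(p) for p in positions]
    best_index = min(range(n_particles), key=lambda i: personal_val[i])
    global_best = personal_best[best_index][:]
    global_val = personal_val[best_index]
    chi = constriction_factor()
    for t in range(max_iterations):
        w = ldiw_weight(t, max_iterations)
        for i in range(n_particles):
            for d in range(dim):
                pull_personal = rng.random() * (personal_best[i][d] - positions[i][d])
                pull_global = rng.random() * (global_best[d] - positions[i][d])
                if variant == "ldiw":
                    # Inertia weight damps only the previous velocity.
                    velocities[i][d] = w * velocities[i][d] + 2.0 * (pull_personal + pull_global)
                else:
                    # Constriction factor scales the whole update.
                    velocities[i][d] = chi * (velocities[i][d] + 2.05 * (pull_personal + pull_global))
                # Keep the particle inside the search bounds.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            value = objective(positions[i])
            if value < personal_val[i]:
                personal_best[i], personal_val[i] = positions[i][:], value
                if value < global_val:
                    global_best, global_val = positions[i][:], value
    return global_best, global_val


def sphere(x):
    """Toy objective (hypothetical): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


best_ldiw, val_ldiw = pso_minimize(sphere, dim=2, variant="ldiw")
best_cfm, val_cfm = pso_minimize(sphere, dim=2, variant="cfm")
```

Running both variants with the same seed and swarm size gives a like-for-like comparison of the best value found, which mirrors the kind of comparison the abstract describes. Results on this toy function say nothing about the paper's fuzzy membership problem.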