
Found 24 Documents

The Accuracy Comparison Between Word2Vec and FastText On Sentiment Analysis of Hotel Reviews Siti Khomsah; Rima Dias Ramadhani; Sena Wijaya
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Vol 6 No 3 (2022): Juni 2022
Publisher : Ikatan Ahli Informatika Indonesia (IAII)

DOI: 10.29207/resti.v6i3.3711

Abstract

Word embedding vectorization is more efficient than Bag-of-Words in word vector size. Word embedding also overcomes the loss of information about sentence context, word order, and semantic relationships between words in sentences. Several kinds of word embedding are often considered for sentiment analysis, such as Word2Vec and FastText. FastText works on character n-grams, while Word2Vec works on whole words. This research aims to compare the accuracy of sentiment analysis models using Word2Vec and FastText. Both models are tested on sentiment analysis of Indonesian hotel reviews using a dataset from TripAdvisor. Word2Vec and FastText both use the Skip-gram model, with the same parameters: number of features, minimum word count, number of parallel threads, and context window size. The vectorizers are combined with ensemble learners: Random Forest, Extra Tree, and AdaBoost. A Decision Tree serves as the baseline for measuring the performance of both models. The results show that both FastText and Word2Vec increase accuracy with Random Forest and Extra Tree. FastText reached higher accuracy than Word2Vec when using Extra Tree and Random Forest as classifiers. FastText raised accuracy by 8 percentage points over the Decision Tree baseline (85%), reaching 93% with 100 estimators.
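As a rough illustration of the n-gram versus whole-word distinction in this abstract, the sketch below lists the character n-grams of a single word. It is not the authors' code: the function name `char_ngrams` is hypothetical, and the 3 to 6 size range is an assumption taken from FastText's commonly documented defaults, since the abstract does not state the sizes used.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word, in the style FastText uses.

    Assumption: n-gram sizes 3..6 and the '<' / '>' boundary markers follow
    FastText's usual defaults; the abstract does not specify them.
    """
    wrapped = f"<{word}>"  # boundary markers tell prefixes and suffixes apart
    grams = []
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the wrapped word.
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


# A word-level model would see only the single token "hotel";
# a subword model also sees the pieces below.
print(char_ngrams("hotel", 3, 3))
```

Because the vector of a word is built from its pieces, a subword model can still produce a vector for a misspelled or unseen review word, which a purely word-level vocabulary cannot.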
Big Data Analytics to Analyze Sentiment, Emotions, and Perceptions of Travelers (Case Study: Tourism Destination in Purwokerto Indonesia) Siti Khomsah; Rima Dias Ramadhani; Sena Wijayanto
Jurnal E-Komtek (Elektro-Komputer-Teknik) Vol 5 No 2 (2021)
Publisher : Politeknik Piksi Ganesha Indonesia

DOI: 10.37339/e-komtek.v5i2.791

Abstract

Big data analytics can extract travelers' sentiment, emotions, and experiences from the opinions they post online. This study analyzes sentiment, emotion, and traveler experiences at eight tourism destinations in Purwokerto, Central Java, Indonesia. The methods are a lexicon approach using the NRC vocabulary (EmoLex) and word cloud analysis. The results show that visitors generally have a positive sentiment. The five destinations with high positive sentiment are the Village (91%), Lokawisata Baturaden (81%), Baturaden Forest (79%), Limpa Kuwus (78%), and Taman Andang (77%). The other destinations achieve positive sentiment below 70%. Only a few visitors express negative sentiment toward any of the destinations. The emotions that stand out among visitors are Joy and Trust. NRC also revealed sadness and anger, but only at about 20%. Word cloud analysis does not reveal distinguishing keywords because the word features still contain noise such as conjunctions, adverbs, and the names of the sites. Further research must consider additional text preprocessing to handle this noise.
Performance Comparison Supervised Machine Learning Models to Predict Customer Transaction Through Social Media Ads Afandi Nur Aziz Thohari; Rima Dias Ramadhani
Journal of Computer Networks, Architecture and High Performance Computing Vol. 4 No. 2 (2022): Article Research Volume 4 Number 2, July 2022
Publisher : Information Technology and Science (ITScience)

DOI: 10.47709/cnahpc.v4i2.1488

Abstract

Machine learning has been applied in various sectors, one of which is digital marketing. This research compares the performance of six machine learning algorithms for predicting customer transaction decisions. The six algorithms compared are Perceptron, Linear Regression, K-Nearest Neighbors, Naïve Bayes, Decision Tree, and Random Forest. The dataset is obtained from Facebook ads transaction data in 2020. The goal is to find the best-performing model so that it can be deployed to the web. The results are compared using a confusion matrix, together with a visualization of each model that shows the prediction errors that occurred. Based on the test results, the random forest algorithm has the highest accuracy, recall, and f1-score, with scores of 96.35%, 95.45%, and 93.32%, respectively. The highest precision was produced by the logistic regression algorithm, at 94.44%. The data visualization shows that the random forest algorithm has the fewest prediction errors, with four misclassified records. It can therefore be concluded that the random forest algorithm performs best, because it has the highest value on three of the confusion matrix measurements and the smallest number of prediction errors. The random forest model is deployed to the web platform and can be accessed at iklan-sosmed.herokuapp.com.
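The four measures named in this abstract all derive from the cells of a binary confusion matrix. The sketch below shows the standard formulas; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative and
    true-negative counts. Ratios with an empty denominator fall back to 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
print(binary_metrics(tp=40, fp=10, fn=10, tn=40))
```

A model can lead on accuracy, recall, and F1 while another leads on precision, as reported here, because precision depends only on the false positives among predicted positives.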
Hasil Klasifikasi Algoritma Backpropagation dan K-Nearest Neighbor pada Cardiovascular Disease Nashrulloh Khoiruzzaman; Rima Dias Ramadhani; Apri Junaidi
Indonesian Journal of Data Science, IoT, Machine Learning and Informatics Vol 1 No 1 (2021): February
Publisher : Research Group of Data Engineering, Faculty of Informatics

DOI: 10.20895/dinda.v1i1.141

Abstract

Cardiovascular disease is a disease caused by abnormalities of the heart. It can affect people from young to old age, and 13 factors influence it: Age, Sex, Chest pain, Trestbps, Chol, Fbs, Restecg, Thalach, Exang, Oldpeak, Slope, Ca, and Thal. Cardiovascular disease takes many forms, including coronary heart disease, heart failure, high blood pressure, low blood pressure, and others. This study therefore aims to classify cardiovascular disease, using the backpropagation and K-nearest neighbor (K-NN) algorithms. For K-NN, the first step computes Euclidean distances to find the k nearest neighbors and assigns the category that occurs most frequently among them. For backpropagation, new weights are computed until the output matches the expected value. System testing consists of testing accuracy against the value of K, K-fold cross validation, and the effect of the hidden layer. The results show that the backpropagation algorithm achieves 64% accuracy, 62% precision, and 64% recall, while the K-nearest neighbor algorithm achieves 66% accuracy, 61% precision, and 66% recall. The hidden layer has a large effect on how well backpropagation classifies cardiovascular disease: when the number of hidden layers is small, the resulting value is also small, but when the number of hidden layers is high, the accuracy actually becomes low.
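The K-NN step described in this abstract (Euclidean distance, then a majority vote among the k closest records) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `knn_predict`, the two-feature points, and the labels are all made up for the example, whereas the study uses 13 features.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most frequent label among the k neighbours is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy two-feature data, invented for illustration only.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9), (0.9, 1.1)]
labels = ["healthy", "healthy", "sick", "sick", "healthy"]

print(knn_predict(points, labels, (1.1, 1.0), k=3))
print(knn_predict(points, labels, (5.1, 5.0), k=3))
```

Because Euclidean distance is sensitive to feature scale, features with large numeric ranges dominate the vote unless the data are normalized first.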
Analisis Sentimen Masyarakat Terhadap COVID-19 Pada Media Sosial Twitter Ardianne Luthfika Fairuz; Rima Dias Ramadhani; Nia Annisa Ferani Tanjung
Indonesian Journal of Data Science, IoT, Machine Learning and Informatics Vol 1 No 1 (2021): February
Publisher : Research Group of Data Engineering, Faculty of Informatics

DOI: 10.20895/dinda.v1i1.180

Abstract

At the end of 2019, the world was shaken by the emergence of a disease caused by SARS-CoV-2, the newest type of coronavirus. The disease is known as COVID-19. It spread widely and quickly, and within a short time it reached every corner of the world, Indonesia included. The high rate of spread, together with the absence of a vaccine for COVID-19, caused disorder in society and affected many sectors of daily life. Many people are active on social media and write their opinions and thoughts on platforms such as Twitter, and the pandemic prompted them to post their opinions about COVID-19 there. A sentiment analysis model is needed to classify public tweets on Twitter as positive or negative. Sentiment analysis is a part of Natural Language Processing that builds systems to recognize and extract opinions in text form. This study uses the Naïve Bayes and K-Nearest Neighbor algorithms to build a sentiment analysis model for Twitter users' tweets about COVID-19. The accuracy obtained is 85% for the Naïve Bayes algorithm and 82% for the K-Nearest Neighbor algorithm at k = 6, 8, and 14.
Prediksi Harga Saham Bank Bri Menggunakan Algoritma Linear Regresion Sebagai Strategi Jual Beli Saham Janur Syah Putra; Rima Dias Ramadhani; Auliya Burhanuddin
Indonesian Journal of Data Science, IoT, Machine Learning and Informatics Vol 2 No 1 (2022): February
Publisher : Research Group of Data Engineering, Faculty of Informatics

DOI: 10.20895/dinda.v2i1.273

Abstract

Shares are securities that prove an investor's ownership in a company. Stocks are volatile, which makes them difficult to predict. Stock prediction is an effort to estimate future stock prices, here those of Bank Rakyat Indonesia (BRI), and to increase investors' profit opportunities when making investment decisions. During the COVID-19 pandemic, Bank BRI's shares experienced significant ups and downs over four months, which illustrates the stock's sensitivity to events. It is therefore important to predict stock prices to reduce the risk borne by investors. Prediction requires time series data, that is, data collected sequentially over time. The method used is Linear Regression because it can handle time series data. This study therefore predicts the Bank Rakyat Indonesia stock price using the Linear Regression method. Bank Rakyat Indonesia share price data were obtained from the investing.com website for the period from January 1, 2008, to June 1, 2020. Preprocessing determines the attributes, removes unnecessary attributes, and converts data types; the dataset is then split into training and test data. The attributes used in this study are Date and Price, and the split ratios tested are 60:40, 65:35, 70:30, 75:25, and 80:20. The best ratio is 80:20, which produces train and test accuracy of 0.89 and 0.91. The training and test data are then entered into the linear regression model for prediction. The prediction error, calculated using MAPE, is 13.751% for training data, 13.773% for test data, and 13.755% for the overall data. The MAPE results indicate that the linear regression method can be used to predict the stock price of Bank BRI.
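To make the two calculations in this abstract concrete, the sketch below fits a one-variable least-squares line and computes the mean absolute percentage error (MAPE). It is an assumption-laden illustration rather than the study's code: `fit_line` and `mape` are hypothetical names, dates are represented as day indices, and the numbers are invented rather than BRI prices.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Invented data: day index as the Date attribute, price as the target.
days = [0, 1, 2, 3]
prices = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(days, prices)
predicted = [slope * d + intercept for d in days]

print(slope, intercept)
print(mape(prices, predicted))
```

MAPE expresses the average error relative to the actual price, so a value near 13.75% means predictions were off by roughly that share of the true price on average.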
Perbandingan Performa Antara Algoritma Naive Bayes Dan K-Nearest Neighbour Pada Klasifikasi Kanker Payudara Annisa Nugraheni; Rima Dias Ramadhani; Amalia Beladinna Arifa; Agi Prasetiadi
Indonesian Journal of Data Science, IoT, Machine Learning and Informatics Vol 2 No 1 (2022): February
Publisher : Research Group of Data Engineering, Faculty of Informatics

DOI: 10.20895/dinda.v2i1.391

Abstract

Breast cancer is the second most common cause of cancer death, after lung cancer. Breast cancer occurs when cells in breast tissue begin to grow uncontrollably and can disrupt existing healthy tissue. A classification is therefore needed to distinguish breast cancer patients from healthy people. Based on previous research, the Naïve Bayes and K-Nearest Neighbor algorithms are considered capable of classifying breast cancer. This study uses the Breast Cancer Coimbra dataset (2018) from the UCI Machine Learning Repository, with a total of 116 records, and evaluates the methods using the confusion matrix (accuracy, precision, and recall) and the ROC-AUC curve. The purpose of this study is to compare the performance of the Naïve Bayes and K-Nearest Neighbor algorithms. Testing covers several scenarios: testing the data before and after normalization, testing the model with different ratios of training data to testing data, testing the model with different K values in K-Nearest Neighbors, and testing the model after selecting the strongest attributes with the Pearson correlation test. The results indicate that the Naïve Bayes algorithm has a highest average accuracy of 69.12%, healthy precision of 64.90%, sick precision of 83%, healthy recall of 88%, sick recall of 61.11%, and an AUC of 0.82, which falls in the good classification category. Meanwhile, the highest average results of the K-Nearest Neighbor algorithm are 76.83% accuracy, 76% healthy precision, 80.21% sick precision, 74.18% healthy recall, 80.81% sick recall, and an AUC of 0.91, which falls in the excellent classification category.
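Two of the test scenarios in this abstract, normalization and attribute selection by Pearson correlation, can be sketched as below. The abstract does not say which normalization was applied, so min-max scaling is an assumption here, and the function names and sample values are hypothetical.

```python
import math


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented feature column and binary class labels, for illustration only.
feature = [70.0, 92.0, 110.0, 150.0]
target = [0, 0, 1, 1]

print(min_max_normalize(feature))
print(pearson(feature, target))
```

Attributes whose absolute correlation with the class label is highest would be kept as the strongest attributes; the threshold for "strongest" is a design choice the abstract does not specify.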
PEMBERDAYAAN DAN PENINGKATAN KAPASITAS KELEMBAGAAN MASYARAKAT DESA MELALUI AGROWISATA BERBAHASA INGGRIS Novanda Alim Setya Nugraha; Siti Khomsah; Rima Dias Ramadhani; Tri Ginanjar Laksana
DEVOTE: Jurnal Pengabdian Masyarakat Global Vol. 1 No. 2 (2022): DEVOTE: Jurnal Pengabdian Masyarakat Global, Desember 2022
Publisher : LPPM Institut Pendidikan Nusantara Global

DOI: 10.55681/devote.v1i2.402

Abstract

Agribusiness utilizes inorganic waste that can be recycled to produce selling points such as basket bags from plastic waste and dolls from cloth waste. There is a superior product in this business unit, namely processed Jenitri seeds which are processed into handicrafts. Jenitri seeds are not only used as an ornamental tool. In certain beliefs, the work of Jenitri seeds is used as a worship tool and a medical device. Therefore, Jenitri seeds have a fairly high selling value compared to other handicrafts because of their various functions. Adiluhur Tourism Village is a village that is currently under development as a tourist spot with the name Kebumen English Tourism Village (KWIK) and is located in Adiluhur Village, Kec. Adimulyo, Kab. Kebumen. There are 3 business units that are superior and are still in the development stage, namely business units in the fields of tourism (agro-tourism), agriculture (agriculture), and handicrafts (agribusiness). Currently, the primary superior unit in Adiluhur Tourism Village is a business unit in the tourism sector. Agrotourism is managed by CV in collaboration with BUMDes (Village Owned Enterprise) Mulia Jaya. The tour featured in this unit is an introduction to several types of captive animals (various types of snakes, monitor lizards, sea urchins, iguanas, mongooses, Australian geckos, crocodiles, alligator fish, and many more) as well as a museum containing ancient agricultural tools (bronze spoon, harrows, sickles, hoes, nails, antique lamps, and many more). The potential that is being developed in this unit is outbound with the target visitor being Elementary Schools. Not only that, the agro-tourism manager plans to work with the BKSDA (Natural Resources Conservation Center) in caring for the animals in the unit.
Implementation of Encrypt National ID Card in Sinovi Application Use Waterfall Methodology Teguh Rijanandi; Ayu Silvia; Bintang Abillah Safna; Rima Dias Ramadhani
RIGGS: Journal of Artificial Intelligence and Digital Business Vol. 1 No. 2 (2023)
Publisher : Prodi Bisnis Digital Universitas Pahlawan Tuanku Tambusai

DOI: 10.31004/riggs.v1i2.15

Abstract

In this era of rapid technological change, information systems are also developing quickly, because they provide what users need. Information is very valuable. When information or data falls into irresponsible hands, it can bring disaster to its owner, and many past data leaks have harmed the parties involved. There are various ways to protect data or information, and data security techniques are therefore needed. Message security processes are very diverse and include cryptography, which aims to scramble messages so that they are difficult for unauthorized parties to read. In this study, we use the AES 256 encryption method, following the waterfall method, to secure KTP (national ID card) images in the Sinovi application. Sinovi is a copyright registration application at the Telkom Institute of Technology Purwokerto. The AES method converts data in plain-text form into cipher-text, which is then stored in a file that replaces the image file. The image therefore cannot be viewed directly, because it must first be decrypted. The research data are presented in a table of black-box testing results. It is hoped that a security system like this can protect data from unauthorized parties, and that further research will test AES 256 with different methods. https://github.com/teguh02/T-Encryption
Perbandingan Algoritme Naïve Bayes dan C4.5 Pada Pengklasifikasian Tingkat Pemahaman Belajar Mahasiswa Dalam Pembelajaran Daring Nora Trivetisia; Rima Dias Ramadhani; Merlinda Wibowo
Progresif: Jurnal Ilmiah Komputer Vol 19, No 1: Februari 2023
Publisher : STMIK Banjarbaru

DOI: 10.35889/progresif.v19i1.1081

Abstract

Online learning is a learning system that has been widely implemented since the Covid-19 pandemic. This learning system is synonymous with the use of internet-based learning media. In practice, teachers often have difficulty knowing how well their students understand the material being taught. A classification is therefore needed to make it easier for teachers to assess the level of understanding in terms of health, motivation, and teaching methods. Many classification algorithms can be used, so analysis is needed to find the best one. This study focuses on a comparative observation of two classification algorithms, Naïve Bayes and C4.5. The dataset is the result of a student questionnaire at the Telkom Purwokerto Institute of Technology, answered on a Likert scale. The steps taken were data preprocessing followed by classification using Naïve Bayes and C4.5. The result is that Naïve Bayes is superior to C4.5, with a testing accuracy of 99% compared to 91% for C4.5. It can therefore be concluded that Naïve Bayes is superior to C4.5 in this case.
Keywords: Online Learning; Naïve Bayes; C4.5; Classification; Data Mining