Analysis Sentiment Aspect Level on Beauty Product Reviews Using Chi-Square and Naïve Bayes Felia Novitasari; Mahendra Dwifebri Purbolaksono
Journal of Data Science and Its Applications Vol 4 No 1 (2021): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2021.4.72

Abstract

The many platforms equipped with review features make it easy for people to share opinions. Product reviews are consumers' judgments of the products they have purchased, and they can benefit both producers and consumers. Reviews can contain ratings that cover individual aspects of a product, and a single product can accumulate hundreds or even thousands of them. Such a large volume makes sentiment analysis difficult, so a model is needed that can analyze sentiment by product aspect. Sentiment analysis was performed using the naive Bayes algorithm, with TF-IDF for feature extraction and chi-square for feature selection. Applying stopword removal or stemming during preprocessing and using n-grams in feature extraction can affect the resulting performance. In addition, feature selection plays an important role in the model because it can improve classification performance. The best results obtained were an accuracy of 80.18%, recall of 72.49%, precision of 77.25%, and F1-score of 74.73%.
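As a rough illustration of the chi-square feature selection mentioned in this abstract, the sketch below scores each term against one sentiment class using the standard 2x2 contingency-table statistic. It is not the authors' implementation: the function names `chi_square` and `term_scores` and the four toy reviews are hypothetical, chosen only to show how a class-dependent term outranks a class-independent one.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    numerator = n * (n11 * n00 - n10 * n01) ** 2
    denominator = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    # A zero denominator means the term or class never varies: no signal.
    return numerator / denominator if denominator else 0.0


def term_scores(docs, labels, target):
    """Score every vocabulary term against the `target` class."""
    vocab = {token for tokens in docs for token in tokens}
    scores = {}
    for term in vocab:
        n11 = n10 = n01 = n00 = 0
        for tokens, label in zip(docs, labels):
            has_term = term in tokens
            in_class = label == target
            if has_term and in_class:
                n11 += 1
            elif has_term:
                n10 += 1
            elif in_class:
                n01 += 1
            else:
                n00 += 1
        scores[term] = chi_square(n11, n10, n01, n00)
    return scores


# Hypothetical tokenized reviews, for illustration only.
docs = [["smooth", "skin"], ["sticky", "skin"],
        ["smooth", "texture"], ["sticky", "texture"]]
labels = ["pos", "neg", "pos", "neg"]

scores = term_scores(docs, labels, target="pos")
# Keep the two highest-scoring terms as the selected features.
selected = sorted(scores, key=scores.get, reverse=True)[:2]
print(scores, selected)
```

In this toy data "smooth" and "sticky" split perfectly by class and receive the highest scores, while "skin" and "texture" appear equally in both classes and score zero, so a selection step would drop them.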
Classification of Personality based on Beauty Product Reviews Using the TF-IDF and Naïve Bayes (Case Study : Female Daily) Novia Russelia Wassi; Adiwijaya Adiwijaya; Mahendra Dwifebri Purbolaksono
Journal of Data Science and Its Applications Vol 3 No 2 (2020): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.61

Abstract

Personality is an important parameter for characterizing a person and is used in many kinds of assessment. Today, personality can be inferred not only from psychological tests but also in other ways, one of which is through reviews posted in electronic media. In this study, personality was classified into three of the "Big Five" personality groups, namely Openness, Conscientiousness, and Extraversion, using the Naïve Bayes method with TF-IDF for feature extraction. The classification achieved 81% accuracy with a preprocessing scenario that used stemming and stopword removal, unigram TF-IDF, and the BernoulliNB classifier type.
FASE-1: IMPLEMENTASI STANDAR ATURAN PEMODELAN UML SEBAGAI DASAR SPESIFIKASI KEBUTUHAN DI EFARMING CORPORA BANDUNG Gede Agung Ary Wisudiawan; Mahendra Dwifebri Purbolaksono; Pramoedya Syachrizalhaq Lyanda; Yudi Priyadi
BERNAS: Jurnal Pengabdian Kepada Masyarakat Vol 2 No 1 (2021)
Publisher : Universitas Majalengka

DOI: 10.31949/jb.v2i1.734

Abstract

In Phase 1 of this community service (Abdimas) program, activities focused on general analysis presented through system modeling rules and data modeling. This activity will affect the subsequent Abdimas phases. The problem addressed is how to describe the system and the process of sending data, specifically how to analyze the requirements specifications for sending agricultural activist data to members. In general, the potential for empowering farmer members of the EFarming Corpora community can be increased by identifying requirements specifications, which are defined through system modeling and data modeling. A modified method was adopted for Phase 1 of the Regular Scheme of community service activities, which gives farmer activists in the EFarming Corpora Bandung environment an understanding of information technology that supports Farmer Community activities. The material delivered to this community service partner includes several results that support the next phase. It forms the basis for development that can be explored in the System Development Life Cycle stage, supported by the Software Requirement Specification, Elicitation, Requirement Statement, Software Modeling, and Software Prototype. In Phase 1, the Abdimas program comprised several activities: scientific training, compilation of the required information systems, and surveys and analysis of industrial needs.
Identifying Emotion on Indonesian Tweets using Convolutional Neural Networks Naufal Hilmiaji; Kemas Muslim Lhaksmana; Mahendra Dwifebri Purbolaksono
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Vol 5 No 3 (2021): Juni 2021
Publisher : Ikatan Ahli Informatika Indonesia (IAII)

DOI: 10.29207/resti.v5i3.3137

Abstract

especially with the advancement of deep learning methods for text classification. Despite some efforts to identify emotion in Indonesian tweets, performance evaluation results have not reached acceptable levels. To address this problem, this paper implements a classification model using a convolutional neural network (CNN), which has demonstrated the expected performance in text classification. To allow easy comparison with previous research, the classification is performed on the same dataset, which consists of 4,403 Indonesian tweets labeled with five emotion classes: anger, fear, joy, love, and sadness. The model achieves precision, recall, and F1-score of 90.1%, 90.3%, and 90.2%, respectively, with a highest accuracy of 89.8%. These results outperform previous research that performed the same classification on the same dataset.
Perbandingan Support Vector Machine dan Modified Balanced Random Forest dalam Deteksi Pasien Penyakit Diabetes Mahendra Dwifebri Purbolaksono; Muhammad Irvan Tantowi; Adnan Imam Hidayat; Adiwijaya Adiwijaya
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Vol 5 No 2 (2021): April 2021
Publisher : Ikatan Ahli Informatika Indonesia (IAII)

DOI: 10.29207/resti.v5i2.3008

Abstract

Diabetes is a metabolic disorder characterized by high blood sugar levels and caused by disorders of the pancreas and insulin. According to data from the Ministry of Health of the Republic of Indonesia, diabetes is the third-largest cause of death in Indonesia, accounting for 6.7% of deaths. The high death rate from diabetes motivated this study, which aims at early detection. This research used a machine learning approach to classify the data. This paper compares Support Vector Machine (SVM) and Modified Balanced Random Forest (MBRF) for classifying diabetes patient data. Both methods were chosen because previous studies showed that they achieve high accuracy, and they are compared here to find the best classification model. Several preprocessing methods were used to prepare the data for classification. Every combination of preprocessing steps was applied for both classification methods so that both worked on the same dataset. The evaluation was carried out using the confusion matrix method. Based on the experimental results from testing the system, the maximum performance was 87.94% using SVM and 97.8% using MBRF.
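The confusion-matrix evaluation mentioned above can be sketched as follows for a binary (diabetic / non-diabetic) classifier. This is only an illustration of how the usual metrics are derived from the four cells; the function name `confusion_metrics` and the counts are hypothetical and are not taken from the paper.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Derive common metrics from the four cells of a binary confusion matrix.

    tp: positives predicted positive    fp: negatives predicted positive
    fn: positives predicted negative    tn: negatives predicted negative
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for a test set of 100 patients.
metrics = confusion_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```

Reporting precision and recall alongside accuracy matters for medical data, where the classes are often imbalanced and a missed positive case is costlier than a false alarm.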
A Multi-label Classification on Topic of Hadith Verses in Indonesian Translation using CART and Bagging Rendi Kustiawan; Adiwijaya Adiwijaya; Mahendra Dwifebri Purbolaksono
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 6, No 2 (2022): April 2022
Publisher : STMIK Budi Darma

DOI: 10.30865/mib.v6i2.3787

Abstract

Hadith is a source of law for Muslims after the Qur'an, containing instructions in the form of words, actions, attitudes, and more. Hadith must be studied and practiced by Muslims, then used as a way of life after the Qur'an. Classifying hadith makes it easier for Muslims to learn hadith by examining text patterns in the translation of Bukhari hadith across three classes: suggestions, prohibitions, and information. The classification carried out is a multi-label classification. The classification process uses N-gram and TF-IDF for feature extraction, CART and bagging as classification methods, and Hamming loss as the evaluation metric. Bagging compensates for a weakness of CART: the model is unstable, so a slight change in the training data can significantly change the learned model. Several test scenarios were carried out to obtain the best Hamming loss value in this study. Based on these tests, the best Hamming loss value is 0.1914, meaning that about 80.86% of the labels were predicted correctly. These results indicate that the use of bagging can help increase accuracy by 5%.
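For readers unfamiliar with the metric, Hamming loss in multi-label classification is the fraction of individual label slots that are predicted incorrectly, so a loss of 0.1914 corresponds to 1 - 0.1914 = 0.8086 of label slots being correct. The sketch below is a minimal pure-Python version; the function name `hamming_loss` and the toy label vectors (ordered as suggestion, prohibition, information) are hypothetical and not data from the study.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label slots where prediction and ground truth differ."""
    mismatches = 0
    slots = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            mismatches += int(t != p)
            slots += 1
    return mismatches / slots if slots else 0.0


# Hypothetical multi-label data: [suggestion, prohibition, information].
y_true = [[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]]

loss = hamming_loss(y_true, y_pred)   # 2 wrong slots out of 12
correct_share = 1 - loss              # share of label slots predicted correctly
print(loss, correct_share)
```

Note that this share counts individual labels, not whole samples: a hadith with one wrong label out of three still contributes two correct slots.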
Aspect Based Sentiment Analysis on Beauty Product Review Using Random Forest Anggitha Yohana Clara; Adiwijaya Adiwijaya; Mahendra Dwifebri Purbolaksono
Journal of Data Science and Its Applications Vol 3 No 2 (2020): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.58

Abstract

Cosmetics and beauty products (including skincare) are used for body or face care and to accentuate the body's allure. A product can evoke diverse sentiment among consumers, both positive and negative. Many consumers of beauty products share reviews to help other consumers find the right products to buy and to give feedback to the brand. Despite the large number of reviews, opinions about specific product aspects are rarely identified. Hence, a study was conducted to analyze reviews of beauty products such as toner, serum, sun protection, and exfoliator. The analysis is aspect-based: it determines the sentiment toward each aspect of a beauty product from the reviews. The results are intended to help skincare users and beauty product brands deduce consumers' opinions. The proposed solution uses Random Forest with hyperparameter tuning as the classification method, and TF-IDF and n-grams as feature extraction methods. The multi-aspect sentiment analysis in this study obtained a highest accuracy of 90.48%, precision of 87.27%, recall of 70.13%, and F1-score of 71.77%.
Peningkatan Hasil Klasifikasi pada Algoritma Random Forest untuk Deteksi Pasien Penderita Diabetes Menggunakan Metode Normalisasi Gde Agung Brahmana Suryanegara; Adiwijaya; Mahendra Dwifebri Purbolaksono
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Vol 5 No 1 (2021): Februari 2021
Publisher : Ikatan Ahli Informatika Indonesia (IAII)

DOI: 10.29207/resti.v5i1.2880

Abstract

Diabetes is a disease caused by blood sugar levels that exceed normal limits. The number of diabetics in Indonesia has risen significantly: Basic Health Research reports that prevalence increased from 6.9% to 8.5% between 2013 and 2018, with an estimated 16 million or more sufferers. It is therefore necessary to have technology that can detect diabetes accurately and with good performance, so that diabetes can be treated early to reduce the number of sufferers, disabilities, and deaths. The different value scales of the attributes in Gula Karya Medika's data can complicate the classification process. For this reason, the researchers compare two data normalization methods, min-max normalization and z-score normalization, against a baseline without normalization, using Random Forest (RF) as the classification method. RF has been tested in several previous studies and is able to produce good performance with high accuracy. Based on the research results, the best accuracy is achieved by model 1 (min-max normalization with RF) at 95.45%, followed by model 2 (z-score normalization with RF) at 95%, and model 3 (RF without data normalization) at 92%. From these results, it can be concluded that model 1 outperforms the other two models and raises the accuracy of Random Forest classification to 95.45%.
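The two normalization methods compared in this abstract can be sketched in a few lines. Min-max normalization rescales a feature to the range [0, 1], while z-score normalization shifts it to zero mean and unit standard deviation. The code below is a minimal sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, the sample column is invented, and the population standard deviation is an arbitrary choice for illustration.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no scale information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values):
    """Shift and scale values to zero mean and unit (population) std deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical attribute column, for illustration only.
column = [2.0, 4.0, 6.0, 8.0]

scaled = min_max_normalize(column)
standardized = z_score_normalize(column)
print(scaled)
print(standardized)
```

In practice the minimum, maximum, mean, and standard deviation would be computed on the training split only and then reused for the test split, so that no information from the test data leaks into the model.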
Classification of Hadith Topic of Indonesian Translation Using K-Nearest Neighbor and Chi-Square Ghinaa Zain Nabiilah; Said Al Faraby; Mahendra Dwifebri Purbolaksono
International Journal on Information and Communication Technology (IJoICT) Vol. 7 No. 2 (2021): December 2021
Publisher : School of Computing, Telkom University

DOI: 10.21108/ijoict.v7i2.573

Abstract

Hadith is the main way of life for Muslims besides the Qur'an and can be applied in everyday life. Hadith also contains all the words and deeds of the Prophet Muhammad, which are used as a source of Islamic law. Therefore, many readers, especially Muslims, are interested in studying hadith. However, the large number of hadiths makes them difficult to read for readers who are still unfamiliar with Islam. We therefore conducted a study to classify hadith textually by type of teaching, so that readers can more easily get an overview or reference when reading and searching for hadith by type of teaching. This study uses KNN as the classifier and chi-square for feature selection. We also carried out several test scenarios, including a modified stopword removal step in preprocessing and experiments with different k values for KNN, to determine the best performance. The best performance was obtained with k = 7 on KNN, without chi-square and with the modified stopword removal, giving a Hamming loss of 0.1042, or about 89.58% of the labels correctly classified.