Contact Name: Adiwijaya
Contact Email: adiwijaya@telkomuniversity.ac.id
Phone: +6282217633999
Journal Mail Official: jdsa@telkomuniversity.ac.id
Editorial Address: Telkom University Jl. Telekomunikasi Terusan Buah Batu Indonesia, 40257, Bandung, Indonesia
Location: Kota Bandung, Jawa Barat, Indonesia
Journal of Data Science and Its Applications
Published by Universitas Telkom
ISSN: - | EISSN: 2614-7408 | DOI: https://doi.org/10.34818/jdsa
Core Subject: Science
JDSA welcomes all topics that are relevant to data science, computational linguistics, and information sciences. The listed topics of interest are as follows: Big Data Analytics; Computational Linguistics; Data Clustering and Classifications; Data Mining and Data Analytics; Data Visualization; Information Science; Tools and Applications in Data Science.
Articles 30 Documents
Classification of Personality based on Beauty Product Reviews Using the TF-IDF and Naïve Bayes (Case Study: Female Daily)
Wassi, Novia Russelia; Adiwijaya, Adiwijaya; Purbolaksono, Mahendra Dwifebri
Journal of Data Science and Its Applications Vol 3 No 2 (2020)
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.61

Abstract

Personality is an important parameter for describing an individual's character and is used as a basis for assessment in many settings. Today, personality can be inferred not only from psychological tests but also from other sources, such as reviews posted in electronic media. In this study, personality was classified into three of the "Big Five" personality groups, namely Openness, Conscientiousness, and Extraversion, using the Naïve Bayes method with TF-IDF for feature extraction. The best classification result was 81% accuracy, obtained with a preprocessing scenario that used stemming and stopword removal, unigram TF-IDF, and the BernoulliNB classifier type.
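The pipeline named in this abstract (unigram TF-IDF followed by a Bernoulli Naive Bayes classifier) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the toy reviews, the labels, the smoothing value `alpha`, and the exact TF-IDF formula are all assumptions. Bernoulli Naive Bayes models only the presence or absence of a term, so the TF-IDF weights are binarised before classification in this sketch.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Unigram TF-IDF with tf = raw count and idf = ln(N / df) + 1 (one common variant)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * (math.log(n_docs / doc_freq[term]) + 1.0)
         for term, count in Counter(doc).items()}
        for doc in docs
    ]

def train_bernoulli_nb(vectors, labels, vocab, alpha=1.0):
    """Estimate class priors and Laplace-smoothed term-presence probabilities."""
    model = {}
    for cls in sorted(set(labels)):
        rows = [v for v, y in zip(vectors, labels) if y == cls]
        prior = len(rows) / len(vectors)
        presence = {
            term: (sum(1 for v in rows if v.get(term, 0.0) > 0.0) + alpha)
            / (len(rows) + 2 * alpha)
            for term in vocab
        }
        model[cls] = (prior, presence)
    return model

def predict(model, vector, vocab):
    """Return the class with the highest Bernoulli log-posterior."""
    scores = {}
    for cls, (prior, presence) in model.items():
        score = math.log(prior)
        for term in vocab:
            p = presence[term]
            # A weight above zero counts as "term present" (binarisation).
            score += math.log(p) if vector.get(term, 0.0) > 0.0 else math.log(1.0 - p)
        scores[cls] = score
    return max(scores, key=scores.get)

# Hypothetical reviews, already stemmed and with stopwords removed.
train_docs = [
    ["love", "try", "new", "serum"],
    ["curious", "new", "texture"],
    ["always", "follow", "routine", "daily"],
    ["routine", "schedule", "daily"],
]
train_labels = ["openness", "openness", "conscientiousness", "conscientiousness"]

vocab = sorted({term for doc in train_docs for term in doc})
model = train_bernoulli_nb(tfidf_vectors(train_docs), train_labels, vocab)

# After binarisation only term presence matters, so a weight of 1.0 is enough here.
new_review = {term: 1.0 for term in ["new", "serum"]}
predicted = predict(model, new_review, vocab)
```

On this toy data the review containing "new" and "serum" is assigned to the hypothetical "openness" class, because those terms appear only in reviews with that label.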
Comparative Analysis of Support Vector Machine-Recursive Feature Elimination and Chi-Square on Microarray Classification for Cancer Detection with Naïve Bayes
Amory, Talitha Kayla; Adiwijaya, Adiwijaya; Astuti, Widi
Journal of Data Science and Its Applications Vol 3 No 2 (2020)
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.62

Abstract

Cancer is a deadly disease known worldwide. According to the World Health Organization (WHO), cancer is the second leading cause of death globally and was responsible for an estimated 9.6 million deaths in 2018. One well-known technique for cancer detection is the DNA microarray technique. DNA microarray technology allows researchers to analyze thousands of gene expression profiles at the same time to determine whether a person has cancer. However, one problem with DNA microarray data is the large number of features, which makes feature selection necessary. To address this problem, this study uses Support Vector Machine-Recursive Feature Elimination (SVM-RFE) and Chi-Square for feature selection and Naïve Bayes for classification. Accuracy with feature selection is compared against accuracy without it. The accuracy of the two feature selection methods is also compared to find which one performs better when combined with the Naïve Bayes classification method. To give an overall picture of the performance comparison, this study also considers precision, recall, and F1-score. The best accuracy results obtained were 100% on lung cancer data with SVM-RFE and Chi-Square, 99.6% on ovarian cancer data with SVM-RFE, 93.7% on breast cancer data with SVM-RFE, and 90% on colon cancer data with SVM-RFE.
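The Chi-Square half of the feature selection step can be illustrated with a short Python sketch. It assumes non-negative expression values and the count-style formulation (observed per-class feature totals compared with totals expected from class proportions); the abstract does not say which formulation the authors used, SVM-RFE is not shown, and the tiny data set and function names are hypothetical.

```python
def chi_square_scores(X, y):
    """Score each feature by a chi-square statistic between feature mass and class."""
    n_samples = len(X)
    n_features = len(X[0])
    classes = sorted(set(y))
    # Share of samples belonging to each class.
    class_share = {c: sum(1 for label in y if label == c) / n_samples for c in classes}
    scores = []
    for j in range(n_features):
        total = sum(row[j] for row in X)
        score = 0.0
        for c in classes:
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            expected = class_share[c] * total
            if expected > 0:  # skip features that are zero everywhere
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores

def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring features, in ascending order."""
    scores = chi_square_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

# Hypothetical expression values: 4 samples x 3 genes.
X = [
    [5.0, 1.0, 2.0],
    [6.0, 1.0, 3.0],
    [1.0, 1.0, 1.0],
    [0.0, 1.0, 2.0],
]
y = ["cancer", "cancer", "normal", "normal"]

scores = chi_square_scores(X, y)
selected = select_top_k(X, y, 2)
```

Here the first gene separates the two classes strongly, the second is constant and scores zero, and the third is weakly informative, so keeping two features retains genes 0 and 2. The reduced feature matrix would then be passed to the classifier.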
Cancer Detection based on Microarray Data Classification Using Principal Component Analysis and Functional Link Neural Network
Priyono, Iyon; Adiwijaya, Adiwijaya; Aditsania, Annisa
Journal of Data Science and Its Applications Vol 3 No 2 (2020)
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.52

Abstract

Cancer is a deadly disease caused by the uncontrolled, abnormal growth of tissue cells in the body. According to Globocan data, the number of cancer cases in 2018 rose from previous years to 18.1 million, with 9.6 million deaths. In recent years, cancer prediction using DNA microarray data has helped medical experts analyze whether a person has cancer. DNA microarray data contain very large and complex gene expression profiles, so a dimension reduction method is needed. The reduced data are then used to classify samples as cancer or not. In this paper, Principal Component Analysis (PCA) is used as a feature extraction method to reduce dimension, and a Functional Link Neural Network (FLNN) is used as the classifier. Based on the simulation, the average accuracy using FLNN with PCA is about 76.08%.
Keywords: cancer detection, microarray data, Functional Link Neural Network, Principal Component Analysis.
Aspect Based Sentiment Analysis on Beauty Product Review Using Random Forest
Clara, Anggitha Yohana; Adiwijaya, Adiwijaya; Purbolaksono, Mahendra Dwifebri
Journal of Data Science and Its Applications Vol 3 No 2 (2020)
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.58

Abstract

Cosmetics and beauty products (including skincare) are used for body or face care and to accentuate the body's allure. A product can evoke diverse sentiments among consumers, both positive and negative. Many consumers of beauty products share their reviews to help other consumers find the right products to buy and to give feedback to the brand itself. However, the large number of reviews is not matched by the identification of opinions on individual product aspects. Hence, a study was conducted to analyze reviews of beauty products such as toner, serum, sun protection, and exfoliator. The analysis is aspect-based: it determines the sentiment towards each aspect of a beauty product from the reviews. The results are intended for skincare users and for beauty product brands that want to deduce consumers' opinions. The proposed solution uses Random Forest with hyperparameter tuning as the classification method, and TF-IDF and n-gram as feature extraction methods. The multi-aspect sentiment analysis in this study achieved a highest accuracy of 90.48%, precision of 87.27%, recall of 70.13%, and F1-score of 71.77%.
Forecasting Number of Passengers of TransJakarta using Seasonal ARIMAX Method
Virati, Maftukhatul Qomariyah; Pamanik, Diory Paulus; Pramana, Setia
Journal of Data Science and Its Applications Vol 3 No 1 (2020)
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.45

Abstract

TransJakarta is one of the most widely used modes of public transportation in Jakarta. More than 300,000 people use TransJakarta every day. The number of TransJakarta buses is still limited, so to optimize services it is necessary to know when the number of users peaks and when it is low. In addition to improving customer comfort, this allows bus maintenance to be optimized, thereby reducing incidents and unwanted events. This study investigates how the pattern of TransJakarta passenger numbers differs across weekends, weekdays, and holidays. It also predicts future TransJakarta passenger numbers using the SARIMAX method, which is the SARIMA method with an exogenous (X) factor. The study was implemented in R, with the X factor added as a dummy variable for tap-in data during holiday periods. The predictions are close to the actual figures; the best model is SARIMA(0,0,0)(2,1,0)[7] with the X factor, with error measures MSE = 162402173, MAPE = 2.6122, and MASE = 0.211698.
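The three error measures reported in this abstract can be computed as in the Python sketch below. The abstract does not state how MASE was scaled, so the sketch assumes the common choice of scaling by the in-sample mean absolute error of a seasonal naive forecast with period 7, matching the weekly seasonality of the reported model. All numbers in the example are made up for illustration.

```python
def mse(actual, forecast):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def mase(actual, forecast, train, period=7):
    """Mean absolute scaled error.

    Assumption: the scale is the in-sample MAE of a seasonal naive forecast,
    i.e. predicting each training value by the value one period earlier.
    """
    scale = sum(abs(train[i] - train[i - period]) for i in range(period, len(train)))
    scale /= len(train) - period
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale

# Hypothetical daily passenger counts for two training weeks.
train = [100.0, 110.0, 120.0, 130.0, 140.0, 90.0, 80.0,
         110.0, 120.0, 130.0, 140.0, 150.0, 100.0, 90.0]
actual = [120.0, 130.0]
forecast = [115.0, 135.0]

mse_value = mse(actual, forecast)
mape_value = mape(actual, forecast)
mase_value = mase(actual, forecast, train)
```

A MASE below 1 means the forecast errors are smaller on average than those of the seasonal naive baseline on the training data, which is how a value such as the reported 0.211698 is usually read under this scaling.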
Movie Recommendation Using Conversational Mechanism and Knowledge Based Filtering
Septianta, Marendra; Baizal, Z. K. Abdurahman; Lhaksmana, Kemas Muslim
Journal of Data Science and Its Applications Vol 3 No 2 (2020)
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.49

Abstract

Conversational recommender systems are created to help users search for information in a domain through a conversational mechanism. These systems help users get recommendations by asking about their needs and selecting the items that best suit their preferences. The recommendations are generated by eliciting the user's experience (e.g., their favourite movies, actors, and directors) and then returning items that match their interests. There are many methods for obtaining recommendations that match a user's preferences. In this paper, we use an ontology to represent knowledge and knowledge-based filtering to determine the user's needs, so that the recommendations fit the user's preferences. Our system has been implemented for the movie domain. We test the system's performance by studying users' perceptions.
Identification of Pedestrians Attributes Based on Multi-Class Multi-Label Classification using Convolutional Neural Network (CNN)
Wardana, Wrida Adi; Siradjuddin, Indah Agustien; Muntasa, Arif
Journal of Data Science and Its Applications Vol 3 No 1 (2020)
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.43

Abstract

The use of computer vision to identify pedestrian attributes has received great attention, especially in visual surveillance systems. One example is a search system based on attributes. This article presents attribute identification using a Convolutional Neural Network (CNN) architecture, chosen because the architecture can perform feature learning. A CNN consists of convolution, ReLU, pooling, and fully connected layers. Three experiment scenarios, which differ in the number of convolution layers, were conducted to determine the effect of the layers on CNN performance. Three different CNN architectures were trained and tested using the PETA dataset with 35 attributes. The highest accuracy achieved across the tested numbers of convolution layers is 75.66%. The experiments showed that using more convolution layers produced better CNN performance.
Implementation of Minimum Redundancy Maximum Relevance (MRMR) and Genetic Algorithm (GA) for Microarray Data Classification with C4.5 Decision Tree
Mabarti, Irne
Journal of Data Science and Its Applications Vol 3 No 1 (2020)
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.37

Abstract

Cancer is one of the leading causes of death in many countries, and mortality rates rise every year. Bioinformatics technology can help predict cancer, and one method worth considering is the classification of microarray data. Microarray data contain many gene expressions that describe DNA cells and have very high dimensionality. In this study, Minimum Redundancy Maximum Relevance (MRMR) is used for dimension reduction, the Genetic Algorithm (GA) for optimization, and C4.5 for classifying the gene data. Two trials were conducted. The first combined MRMR with GA as the optimization method and C4.5 as the classification method, and produced an average accuracy of 79%. The second used GA for feature selection with the C4.5 classification method and produced an average accuracy of 78%.
Snakebite Classification Using Active Contour Model and K Nearest Neighbor
Cakravania, Chiara Janetra; Utama, Dody Qori
Journal of Data Science and Its Applications Vol 3 No 1 (2020)
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.38

Abstract

Indonesia is one of the tropical countries with a high risk of snakebites. This endangers the lives of rural residents, because many snakes are still found in rural areas. The main cause of death in snakebite cases is the venom released from the snake's canine teeth. Another cause is error in visually identifying the bite marks. There are anatomical differences between puncture wounds from venomous and non-venomous snakes. This study built a snakebite identification system using the Active Contour Model and K Nearest Neighbor (KNN) methods. Tests on the method's parameters showed that the highest KNN accuracy was obtained with the correlation distance rule, K = 3, and no distance weighting in the classification system.
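The best-performing KNN configuration reported here (correlation distance, K = 3, no distance weighting) can be sketched in Python as follows. The feature vectors are hypothetical stand-ins for whatever shape features the Active Contour Model stage produces, which the abstract does not specify, and the function names are illustrative only.

```python
import math
from collections import Counter

def correlation_distance(u, v):
    """1 minus the Pearson correlation of u and v (both must be non-constant)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [x - mean_u for x in u]
    dv = [x - mean_v for x in v]
    numerator = sum(a * b for a, b in zip(du, dv))
    denominator = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    return 1.0 - numerator / denominator

def knn_predict(train_X, train_y, query, k=3):
    """Unweighted majority vote among the k training samples closest to the query."""
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: correlation_distance(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors for bite-mark images.
train_X = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 9.0],
    [1.0, 3.0, 5.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
    [9.0, 6.0, 4.0, 2.0],
    [8.0, 5.0, 3.0, 1.0],
]
train_y = ["venomous", "venomous", "venomous",
           "non-venomous", "non-venomous", "non-venomous"]

label_a = knn_predict(train_X, train_y, [2.0, 3.0, 5.0, 7.0])
label_b = knn_predict(train_X, train_y, [7.0, 5.0, 3.0, 2.0])
```

Correlation distance compares the shape of two feature vectors rather than their magnitude, so a query whose values rise in the same pattern as the "venomous" examples is assigned to that class even when its scale differs.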
Sentiment Analysis of Movie Review using Naïve Bayes Method with Gini Index Feature Selection
Purnomoputra, Riko Bintang; Adiwijaya, Adiwijaya; Novia Wisesty, Untari
Journal of Data Science and Its Applications Vol 2 No 2 (2019)
Publisher : Telkom University

DOI: 10.34818/jdsa.2019.2.36

Abstract

Movie reviews contain information that indicates whether a movie is good or bad. Sentiment analysis processes this information to determine the polarity of a sentence. Because reviews are unstructured and have many data attributes, classification requires considerable time and computational capacity, which becomes a problem. Feature selection offers a solution: by reducing dimensions, it accelerates the classification process and reduces misclassification. Gini Index Text feature selection was first used for document classification, where it successfully improved classifier performance. Multinomial Naïve Bayes (MNNB) is a popular classifier for document classification; the question is whether Gini Index Text feature selection can improve its performance. Therefore, this study uses Gini Index Text (GIT) for text feature selection with the MNNB classifier to classify movie reviews into positive and negative classes. The data are from the IMDB dataset, which contains reviews in English, and are divided into 90% training data and 10% testing data. The test results show that the Gini index as a feature selection method increases accuracy: accuracy is 56% without feature selection and 59.54% with it, an increase of 3.54 percentage points.
