Articles

Design of the use of chatbot as a virtual assistant in banking services in Indonesia
Bhakti Prabandyo Wicaksono; Amalia Zahra
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 11, No 1: March 2022
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v11.i1.pp23-33

Abstract

A chatbot is a computer program designed to simulate interactive communication with a (human) user via text, audio, or video. Several banks in Indonesia have already adopted chat technology in customer service. The application of artificial intelligence in customer service aims to prepare banks for the challenges of banking industry 4.0 and to solve problems that customer service currently faces. Implementing a chatbot platform in Indonesian banking is not simply plug and play, even though quite a few chatbot platforms are available, including the Rasa, Botika, and Kata.ai platforms. This study evaluates only two of them, Rasa and Botika, neither of which is considered ready for immediate adoption by banks. This is because the application of banking technology in Indonesia must comply with regulations, including those related to environmental needs, language, speed, and accuracy in understanding user intent. Hence, research is needed to decide which chatbot platform can be implemented in the banking industry without violating those regulations. Evaluations conducted using usability and the hedonic-motivation system adoption model (HMSAM) show that users prefer the Botika platform for implementation in the banking industry.

Spoken language identification using i-vectors, x-vectors, PLDA and logistic regression
Ahmad Iqbal Abdurrahman; Amalia Zahra
Bulletin of Electrical Engineering and Informatics Vol 10, No 4: August 2021
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v10i4.2893

Abstract

In this paper, i-vectors and x-vectors are used to extract features from speech signals in local Indonesian languages, namely Javanese, Sundanese, and Minang, to help a classifier identify the language spoken by the speaker. Probabilistic linear discriminant analysis (PLDA) is used as the baseline classifier, and logistic regression is also used because prior studies show that it performs better than PLDA for classifying speech data. Once the features are extracted, they are classified using these classifiers. In the experiment, the test data are segmented into three durations: 3, 10, and 30 seconds. The study is expanded by testing multiple parameters of the i-vector and x-vector methods and then comparing the performance of PLDA and logistic regression as classifiers. The x-vector scores better than the i-vector for every segment duration when PLDA is the classifier, whereas with logistic regression the i-vector still has better accuracy than the x-vector.
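The segmentation of test data into 3, 10, and 30 second pieces can be sketched as follows. This is only an illustration: the abstract does not say whether segments overlap or how leftover audio is handled, so the sketch assumes non-overlapping segments and drops any incomplete tail. The function name `split_into_segments` and the toy sample rate are hypothetical.

```python
def split_into_segments(samples, sample_rate, seconds):
    """Cut a signal into non-overlapping segments of a fixed duration.

    Assumption: an incomplete final segment is discarded.
    """
    seg_len = sample_rate * seconds
    return [
        samples[start:start + seg_len]
        for start in range(0, len(samples) - seg_len + 1, seg_len)
    ]

# Toy signal: 65 seconds of silence at a hypothetical 100 Hz sample rate.
SAMPLE_RATE = 100
signal = [0.0] * (65 * SAMPLE_RATE)

# Number of complete segments for each duration used in the abstract.
segment_counts = {
    seconds: len(split_into_segments(signal, SAMPLE_RATE, seconds))
    for seconds in (3, 10, 30)
}
print(segment_counts)
```

Shorter segments yield more test utterances from the same recording, but each one carries less evidence about the spoken language.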

Spoken language identification on 4 Indonesian local languages using deep learning
Panji Wijonarko; Amalia Zahra
Bulletin of Electrical Engineering and Informatics Vol 11, No 6: December 2022
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v11i6.4166

Abstract

Language identification is at the forefront of assistance in many applications, including multilingual speech systems, spoken language translation, multilingual speech recognition, and human-machine interaction via voice. The identification of Indonesian local languages using spoken language identification technology has enormous potential to advance tourism and digital content in Indonesia. The goal of this study is to identify four Indonesian local languages (Javanese, Sundanese, Minangkabau, and Buginese) using deep learning classification techniques: artificial neural network (ANN), convolutional neural network (CNN), and long short-term memory (LSTM). Mel-frequency cepstral coefficients (MFCC) are used as the features extracted from the audio data. The results showed that the LSTM model had the highest accuracy for each speech duration (3 s, 10 s, and 30 s), followed by the CNN and ANN models.

Multimodal music emotion recognition in Indonesian songs based on CNN-LSTM, XLNet transformers
Andrew Steven Sams; Amalia Zahra
Bulletin of Electrical Engineering and Informatics Vol 12, No 1: February 2023
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v12i1.4231

Abstract

Music carries emotional information and allows the listener to feel the emotions it contains. This study proposes a multimodal music emotion recognition (MER) system using Indonesian song and lyrics data. In the proposed system, the audio data are represented by mel spectrogram features, and the lyrics features are extracted through the XLNet tokenizing process. A convolutional neural network combined with long short-term memory (CNN-LSTM) performs the audio classification task, while an XLNet transformer performs the lyrics classification task. The outputs of the two classification tasks are probability weights and actual predictions over positive, neutral, and negative emotions, which are then combined using the stacking ensemble method. The combined output is used to train an artificial neural network (ANN) model to obtain the best probability weight output. The multimodal system achieves its best performance with an accuracy of 80.56%. The results showed that the multimodal method of recognizing musical emotions performed better than the single-modal methods. In addition, hyperparameter tuning can affect the performance of multimodal systems.

Multi-feature stacking order impact on speech emotion recognition performance
Yoga Tanoko; Amalia Zahra
Bulletin of Electrical Engineering and Informatics Vol 11, No 6: December 2022
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v11i6.4287

Abstract

One of the biggest challenges in implementing speech emotion recognition (SER) is to produce a model that performs well and is lightweight. One approach is to use a one-dimensional convolutional neural network (1D CNN) and combine several handcrafted features. 1D CNNs are mostly used for time series data, where the order of information plays an important role. In this case, the order of the stacked features also plays an important role. In this work, the impact of changing that order is analyzed. The work proposes to brute-force all possible feature orders of five features: Mel-frequency cepstral coefficient (MFCC), Mel-spectrogram, chromagram, spectral contrast, and tonnetz. It then uses a 1D CNN as the model architecture and benchmarks the model's performance on the Ryerson audio-visual database of emotional speech and song (RAVDESS) dataset. The results show that changing the order of features can affect overall classification accuracy, specific emotion accuracy, and model size. The best model has an accuracy of 79.17% for classifying 8 emotion classes with the following order: spectral contrast, tonnetz, chromagram, Mel-spectrogram, and MFCC. Finding a suitable order can increase the accuracy by up to 16.05% and reduce the model size by up to 96%.
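The brute-force search over feature orders described in this abstract covers every ordering of the five features, which gives 5! = 120 candidate stackings. A minimal sketch of that enumeration follows; the function name `feature_orders` and the string labels are hypothetical, and training a 1D CNN on each order is omitted.

```python
from itertools import permutations

# The five handcrafted features named in the abstract, as plain labels.
FEATURES = ["mfcc", "mel_spectrogram", "chromagram", "spectral_contrast", "tonnetz"]

def feature_orders(features):
    """Return every possible stacking order of the given features."""
    return list(permutations(features))

orders = feature_orders(FEATURES)
print(len(orders))  # 120

# The order the abstract reports as best is one of the enumerated candidates.
best_reported = ("spectral_contrast", "tonnetz", "chromagram", "mel_spectrogram", "mfcc")
print(best_reported in orders)  # True
```

Because the count grows factorially, this exhaustive approach stays practical only for a small number of features.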

Multi-Class Speaker Recognition Using Deep Learning with the CN-Celeb Dataset (original title: Multi Kelas Speaker Recognition Menggunakan Deep Learning dengan CN-Celeb Dataset)
Adipta Martulandi; Amalia Zahra
Building of Informatics, Technology and Science (BITS) Vol 4 No 3 (2022): Desember 2022
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v4i3.2467

Abstract

Speaker recognition has been widely applied in various fields of human life, for example in Siri from Apple, Cortana from Microsoft, and Voice Assistant by Google. One of the problems in building a speaker recognition system is the dataset used for modeling: it is mostly data that cannot represent real-world conditions. As a result, the models perform suboptimally when deployed in real-world conditions. This study develops a speaker recognition model using deep learning (LSTM) with the CN-Celeb dataset. The CN-Celeb dataset was collected directly from the real world, so it contains a lot of noise, and the expectation is that it can represent real-world conditions. The model uses 2 stacked LSTM layers for the multi-class speaker recognition task. In addition, this study tunes the hyperparameters with a grid search to obtain the best model configuration. The LSTM model achieved an equal error rate (EER) of 10.13%, which is better than the 15.52% of the reference baseline paper. When compared with other studies that also used the CN-Celeb dataset but with different models, namely x-vectors, PLDA, TDNN, and transformers, the LSTM model gave promising results.
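The equal error rate (EER) quoted in this abstract is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below approximates it by sweeping a threshold over observed scores. The function name `equal_error_rate` and the toy scores are hypothetical, and the paper's actual evaluation protocol is not described in the abstract.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a genuine trial scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance: an impostor trial scored at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Toy similarity scores, invented for illustration only.
genuine = [0.9, 0.8, 0.7, 0.3]
impostor = [0.6, 0.4, 0.2, 0.1]
eer = equal_error_rate(genuine, impostor)
print(eer)  # 0.25
```

A lower EER means fewer errors of both kinds at the balanced threshold, which is why 10.13% counts as an improvement over 15.52%.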

Predicting Junior High School Student Graduation Using Machine Learning (original title: PREDIKSI KELULUSAN SISWA SEKOLAH MENENGAH PERTAMA MENGGUNAKAN MACHINE LEARNING)
Agusti Frananda Alfonsus Naibaho; Amalia Zahra
Jurnal Informatika dan Teknik Elektro Terapan Vol 11, No 3 (2023)
Publisher : Universitas Lampung

DOI: 10.23960/jitet.v11i3.3056

Abstract

In recent years, a number of students have graduated late at Lubuk Alung 1st State Junior Highschool. This statement is supported by graduation data from Lubuk Alung 1st State Junior Highschool. Therefore, it is necessary to predict students’ graduation status to identify which factors influence graduation, which will in turn help the school to solve the problem more easily. To this end, the researchers predict student graduation based on student graduation information. The attributes used are students’ personal data, student academic data, and data related to the work of the students’ parents. The research retrieved recapitulated student graduation data from the school. The classification algorithms used to predict students’ graduation are decision tree, random forest, and extreme gradient boosting, tuned with grid search cross-validation (GridSearchCV) using k = 5 folds. The random forest algorithm outperforms the others, with a prediction accuracy of 99.5%.
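Grid search with 5-fold cross-validation, as mentioned in this abstract, trains every hyperparameter combination once per fold. The sketch below shows only the bookkeeping, using the standard library. The grid values are hypothetical because the abstract does not list the values searched, and `k_fold_indices` is an illustrative helper rather than a library function.

```python
from itertools import product

def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop

# Hypothetical random forest grid, invented for illustration.
param_grid = {"n_estimators": [100, 200], "max_depth": [4, 8, None]}
candidates = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]

folds = list(k_fold_indices(23, k=5))
total_fits = len(candidates) * len(folds)
print(len(candidates), len(folds), total_fits)  # 6 5 30
```

Each candidate is scored by its mean validation accuracy across the five folds, and the best-scoring combination is kept.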

Stacking ensemble learning for optical music recognition
Francisco Calvin Arnel Ferano; Amalia Zahra; Gede Putra Kusuma
Bulletin of Electrical Engineering and Informatics Vol 12, No 5: October 2023
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v12i5.5129

Abstract

The development of music culture has given rise to the problem of optical music recognition (OMR). OMR is a computer vision task that explores algorithms and models for recognizing musical notation. This study proposed a stacking ensemble learning model to complete the OMR task on common western musical notation (CWMN). The ensemble learning model used four deep convolutional neural network (DCNN) models, namely ResNeXt50, Inception-V3, RegNetY-400MF, and EfficientNet-V2-S, as the base classifiers. This study also analysed which technique is most appropriate as the ensemble learning model’s meta-classifier. To that end, several machine learning techniques were evaluated, namely support vector machine (SVM), logistic regression (LR), random forest (RF), K-nearest neighbor (KNN), decision tree (DT), and Naïve Bayes (NB). Six publicly available OMR datasets were combined, downsampled, and used to test the proposed model: HOMUS_V2, Rebelo1, Rebelo2, Fornes, OpenOMR, and PrintedMusicSymbols. The proposed ensemble learning model outperformed the model built in the previous study and achieved its best accuracy and F1-score of 97.51% and 97.52%, respectively, both with the LR meta-classifier.
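In a stacking ensemble such as the one in this abstract, the meta-classifier is trained on the outputs of the base classifiers rather than on the raw images. The sketch below shows one common way to build that input, by concatenating per-model class probabilities. The abstract does not say whether probabilities or predicted labels are passed on, so this choice is an assumption, and the function name and numbers are hypothetical.

```python
def build_meta_features(base_probabilities):
    """Concatenate per-model class probabilities into one meta-feature vector."""
    meta = []
    for probs in base_probabilities:
        meta.extend(probs)
    return meta

# Hypothetical outputs of four base classifiers for one symbol image
# over three symbol classes; values are invented for illustration.
base_outputs = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
    [0.5, 0.4, 0.1],
]
meta_vector = build_meta_features(base_outputs)
print(len(meta_vector))  # 12
```

A meta-classifier such as logistic regression would then be fitted on one such vector per training image.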

Data augmentation and enhancement for multimodal speech emotion recognition
Jonathan Christian Setyono; Amalia Zahra
Bulletin of Electrical Engineering and Informatics Vol 12, No 5: October 2023
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v12i5.5031

Abstract

Interaction with one another, for example through conversation or speech, is a fundamental human need. It is therefore crucial to analyze speech using computer technology to determine emotions. Speech emotion recognition (SER) detects emotions in speech by examining various aspects of it, and is a supervised method for deciding the emotion class of an utterance. This research proposed a multimodal SER model using one of the deep learning based enhancement techniques, the attention mechanism. Additionally, this research addresses the imbalanced dataset problem in the SER field using generative adversarial networks (GAN) as a data augmentation technique. The proposed model achieved an excellent evaluation performance of 0.96, or 96%, for the proposed GAN configuration. This work showed that the GAN method in the multimodal SER model could enhance performance and create a balanced dataset.

The Website Optimization and Analysis on XYZ Website using the Web Core Vital Method
Kristian Handoko Wijaya Sukardjoh; Amalia Zahra
Indonesian Journal of Computer Science Vol. 12 No. 5 (2023): Indonesian Journal of Computer Science (IJCS) Volume 12 Number 5 (2023)
Publisher : STMIK Indonesia

Abstract

The XYZ website is an e-commerce business delivered through websites and applications. Over several years, its speed score for users has decreased considerably and the site has become outdated, because some features of the web application have not been maintained. This has reduced customers' interest in buying goods on the XYZ website and, more significantly, affects access through Google search: according to Google's notifications, a website whose score declines over a long period will not be allowed to publish advertisements. In this study, we analyze the problem starting from small details, namely the programming languages used, the data provided, how the code is written, third-party or vendor support, the website content, and the website's use of the Core Web Vitals. The aim is for the website to be comfortable to use, to remain fast under a large number of customer visitors, to make management and maintenance easier for the development team, and to benefit customers, so that the business runs quickly, is convenient for customers, and the XYZ website can be accepted by Google.