Faiza Renaldi
Universitas Jenderal Achmad Yani

Published : 7 Documents
Articles

Journal : Proceeding of the Electrical Engineering Computer Science and Informatics

Spoken Word and Speaker Recognition Using MFCC and Multiple Recurrent Neural Networks
Yoga Utomo; Esmeralda Contessa Djamal; Fikri Nugraha; Faiza Renaldi
Proceeding of the Electrical Engineering Computer Science and Informatics Vol 7, No 1: EECSI 2020
Publisher : IAES Indonesia Section

DOI: 10.11591/eecsi.v7.2059

Abstract

Identification of spoken words and speakers has featured in many studies. A persistent obstacle is variation in how a particular word is pronounced, and noise makes words harder still to identify. Furthermore, every person has different pronunciation habits, influenced by several variables such as amplitude, frequency, tempo, and rhythm. This study proposes identifying spoken sounds from specific word inputs, determining the patterns of both the speaker and the spoken word using Mel-frequency Cepstral Coefficients (MFCC) and multiple Recurrent Neural Networks (RNN). The MFCC coefficients are used as input features for identifying spoken words and speakers using an RNN with Long Short-Term Memory (LSTM). The multiple RNNs handle spoken word and speaker recognition in parallel. The multiple-RNN configuration achieved an accuracy of 87.74%, while a single RNN achieved 80.58%, both on new data using the Adam optimizer. To check the consistency of the model, the experiment applied K-fold cross-validation to the spoken-word and speaker datasets, giving an average accuracy of 86.07%, which indicates that the model can learn from the dataset without being affected by the order or selection of the test data.
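
The K-fold cross-validation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the number of folds or how the split was implemented, so everything here is an assumption made for illustration: the fold count, the helper names `k_fold_indices` and `cross_validate`, and the majority-label stand-in classifier, which takes the place of the MFCC and RNN pipeline.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs so that each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Shuffle so that fold membership does not depend on recording order.
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def cross_validate(samples, labels, k, train_fn, predict_fn):
    """Train and evaluate once per fold; return per-fold accuracies and their mean."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies, mean(accuracies)


# Hypothetical stand-in classifier (not the paper's model):
# it always predicts the most common label seen during training.
def train_majority(xs, ys):
    return max(set(ys), key=ys.count)


def predict_majority(model, x):
    return model
```

Calling `cross_validate(features, labels, 5, train_majority, predict_majority)` returns the accuracy of each fold together with their average. Because every sample serves as test data exactly once, the averaged figure depends less on which samples happen to be held out than a single train/test split does.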