De Rosal Igniatus Moses Setiadi
Universitas Dian Nuswantoro

Improving Indonesian multietnics speaker recognition using pitch shifting data augmentation
Kristiawan Nugroho; Isworo Nugroho; De Rosal Igniatus Moses Setiadi; Omar Farooq
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 12, No 4: December 2023
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v12.i4.pp1901-1908

Abstract

Recognizing speakers from multiple ethnic groups is an active research topic in speaker recognition. Studies that involve many ethnicities need a suitable approach to achieve good model performance. Deep learning has been used in speaker recognition research involving many classes and has achieved high accuracy. However, multi-class and imbalanced datasets remain obstacles in studies using deep learning, causing overfitting and reduced accuracy. Data augmentation is an approach used to address small datasets and multi-class problems; depending on the method applied, it can improve the quality of the research data. This study proposes a data augmentation method that combines pitch shifting with a deep neural network, called pitch shifting data augmentation deep neural network (PSDA-DNN), to identify multiethnic Indonesian speakers. The results indicate that PSDA-DNN is the best-performing method in multiethnic speaker recognition, reaching an accuracy of 99.27% and a precision, recall, and F1 score of 97.60%.
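
As a rough illustration of pitch-shifting data augmentation, the sketch below creates extra training examples by shifting each utterance up and down by a few semitones. It is not the authors' PSDA-DNN pipeline: the abstract does not state the shift amounts, the features, or the network architecture, so the function names `pitch_shift_resample` and `augment` and the default steps of -2 and +2 semitones are assumptions made for this example. The shift is also a deliberately naive one based on resampling, which changes the playback speed as well as the pitch; a duration-preserving shift such as `librosa.effects.pitch_shift` would be the more usual choice in practice.

```python
import numpy as np


def pitch_shift_resample(signal, n_steps):
    """Naive pitch shift by n_steps semitones via resampling (hypothetical helper).

    Reading the samples faster or slower scales every frequency by the same
    ratio. The result is zero-padded or trimmed back to the input length.
    """
    signal = np.asarray(signal, dtype=float)
    rate = 2.0 ** (n_steps / 12.0)  # frequency ratio for n_steps semitones
    n = len(signal)
    # Positions in the original signal to read from (step > 1 raises the pitch).
    src_positions = np.arange(0, n, rate)
    shifted = np.interp(src_positions, np.arange(n), signal)
    # Restore the original length: zero-pad if shorter, trim if longer.
    if len(shifted) < n:
        shifted = np.pad(shifted, (0, n - len(shifted)))
    return shifted[:n]


def augment(signals, labels, steps=(-2, 2)):
    """Return the original examples plus one pitch-shifted copy per step.

    The speaker label is kept unchanged for every shifted copy.
    """
    out_x, out_y = list(signals), list(labels)
    for x, y in zip(signals, labels):
        for s in steps:
            out_x.append(pitch_shift_resample(x, s))
            out_y.append(y)
    return out_x, out_y
```

With two shift values, each original utterance yields two additional examples, so a class with few recordings triples in size. Applying more shifts to under-represented classes than to well-represented ones is one way such augmentation could be used against class imbalance, though the abstract does not say whether the study did this.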