Syed Asif Ahmad Qadri
International Islamic University Malaysia

Published: 6 Documents
Articles

On the use of voice activity detection in speech emotion recognition
Muhammad Fahreza Alghifari; Teddy Surya Gunawan; Mimi Aminah binti Wan Nordin; Syed Asif Ahmad Qadri; Mira Kartiwi; Zuriati Janin
Bulletin of Electrical Engineering and Informatics Vol 8, No 4: December 2019
Publisher: Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v8i4.1646

Abstract

Emotion recognition through speech has many potential applications; however, the challenge lies in achieving a high emotion recognition rate under limited resources or interference such as noise. In this paper, we explore the possibility of improving speech emotion recognition by applying voice activity detection (VAD). Emotional voice data from the Berlin Emotion Database (EMO-DB) and a custom-made database, the LQ Audio Dataset, are first preprocessed by VAD before feature extraction. The features are then passed to a deep neural network for classification. MFCC is used as the sole feature. Comparing the results obtained with and without VAD, we found that VAD improved the recognition rate of 5 emotions (happy, angry, sad, fear, and neutral) by 3.7% when recognizing clean signals, while using VAD when training a network with both clean and noisy signals improved our previous results by 50%.

A critical insight into multi-languages speech emotion databases
Syed Asif Ahmad Qadri; Teddy Surya Gunawan; Muhammad Fahreza Alghifari; Hasmah Mansor; Mira Kartiwi; Zuriati Janin
Bulletin of Electrical Engineering and Informatics Vol 8, No 4: December 2019
Publisher: Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v8i4.1645

Abstract

With increased interest in human-computer and human-human interaction, systems that deduce and identify the emotional aspects of a speech signal have emerged as a hot research topic. Recent research is directed towards the development of automated and intelligent analysis of human utterances. Although numerous studies have been devoted to designing systems, algorithms, and classifiers in this field, the area is still far from standardization. There still exists a considerable amount of uncertainty with regard to aspects such as determining influencing features, better-performing algorithms, the number of emotion classes, etc. Among the influencing factors, the differences between speech databases, such as the data collection method, are accepted as significant by the research community. A speech emotion database is essentially a repository of varied human speech samples collected and sampled using a specified method. This paper reviews 34 speech emotion databases for their characteristics and specifications. Furthermore, critical insight into the imitational aspects of these databases is also highlighted.