In a communication using texts input, viseme (visual phonemes) is derived from a group of phonemes having similar visual appearances. Hidden Markov model (HMM) has been a popular mathematical approach for sequence classification such as speech recognition. For speech emotion recognition, a HMM is trained for each emotion and an unknown sample is classified according to the model which illustrate the derived feature sequence best. Viterbi algorithm, HMM is used for guessing the most possible state sequence of observable states. In this work, first stage, we defined system of an Indonesian viseme set and the associated mouth shapes, namely system of text input segmentation. The second stage, we defined a choice of one of affection type as input in the system. The last stage, we experimentally using Trigram HMMs for generating the viseme sequence to be used for synchronized mouth shape and lip movements. The whole system is interconnected in a sequence. The final system produced a viseme sequence for natural speech of Indonesian sentences with affection. We show through various experiments that the proposed, the results in about 82,19% relative improvement in classification accuracy.
Copyrights © 2016