Emotion Recognition from Speech Using IG-Based Feature Compensation

This paper presents a feature compensation approach for emotion recognition from speech signals. The intonation groups (IGs) of the input speech signal are extracted first, and speech features are then extracted from each selected intonation group. Assuming a linear mapping between the feature spaces of different emotional states, a feature compensation approach is proposed to obtain a feature space with better discriminability among emotional states. The compensation vector for each emotional state is estimated using the Minimum Classification Error (MCE) algorithm. For the final emotional state decision, the compensated IG-based feature vectors are used to train Gaussian Mixture Models (GMMs) and Continuous Support Vector Machines (CSVMs) for each emotional state. For GMMs, the emotional state whose GMM yields the maximal likelihood ratio is taken as the final output. For CSVMs, the emotional state is determined from the probability outputs of the CSVMs. The CSVM kernel was chosen experimentally to be a radial basis function. Experimental comparisons show that the proposed IG-based feature compensation achieves encouraging performance for emotion recognition.
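The GMM decision stage described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the paper: compensation is modelled as a simple additive shift per emotional state, the shift vectors are supplied by hand instead of being estimated with MCE, the models use diagonal covariances, and the decision uses the plain log-likelihood rather than the likelihood ratio the abstract mentions. The names `compensate`, `classify`, `MODELS`, and `SHIFTS` are hypothetical, and all numbers are toy values.

```python
import math


def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_loglik(x, components):
    """Log-likelihood of x under a GMM given as (weight, mean, var) tuples."""
    logs = [math.log(w) + diag_gauss_logpdf(x, m, v) for w, m, v in components]
    top = max(logs)
    # log-sum-exp keeps the sum numerically stable
    return top + math.log(sum(math.exp(value - top) for value in logs))


def compensate(x, shift):
    """Assumed additive compensation: move x by a per-state vector."""
    return [xi + si for xi, si in zip(x, shift)]


def classify(x, models, shifts):
    """Return the state whose GMM scores the compensated vector highest."""
    scores = {
        state: gmm_loglik(compensate(x, shifts[state]), gmm)
        for state, gmm in models.items()
    }
    return max(scores, key=scores.get), scores


# Toy two-dimensional models; every number below is made up for illustration.
MODELS = {
    "neutral": [(1.0, [0.0, 0.0], [1.0, 1.0])],
    "angry": [(0.5, [3.0, 3.0], [1.0, 1.0]), (0.5, [4.0, 2.0], [1.0, 1.0])],
}
# Zero shifts stand in for compensation vectors that MCE would estimate.
SHIFTS = {"neutral": [0.0, 0.0], "angry": [0.0, 0.0]}

if __name__ == "__main__":
    label, scores = classify([3.5, 2.5], MODELS, SHIFTS)
    print(label, scores)
```

In a fuller pipeline the input vector would be an IG-level feature vector and the shift vectors would come from discriminative training; this sketch only shows how per-state compensation and per-state model scoring fit together.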

Keywords

Emotional Speech; Emotion Recognition; Intonation Group; Feature Compensation

Cited by

Huang, W. Y. (2012). 結合情緒關鍵字以提升語音情緒辨識率之研究 [A study on improving speech emotion recognition rates by incorporating emotion keywords] [master's thesis, Tatung University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0081-3001201315113175
