Article Info

Robust Speaker Gender Identification Using Empirical Decomposition-Based Cepstral Features

Ghasem Alipoor, Ehsan Samadi
dx.doi.org/10.17576/apjitm-2018-0701-06

Abstract

Automatic gender identification is an appealing field of research with numerous practical applications. However, it has not received the attention it deserves, particularly in the presence of environmental noise. In this paper, new and improved mel-frequency cepstral coefficient (MFCC) features are developed using empirical mode decomposition (EMD) to address this problem. In the proposed approach, EMD is employed as a filter bank to decompose the speech signal into its frequency bands. A second variant is also developed in which the complete ensemble EMD (CEEMD) replaces the EMD. A support vector machine (SVM) with a radial basis function (RBF) kernel is employed for classification. The performance of these methods is examined for gender identification in noise-free environments as well as in the presence of various Gaussian and non-Gaussian noises. Simulation results show that, in noiseless conditions, the improved EMD-based cepstral features achieve the same accuracy as the original MFCCs while using fewer features. In noisy environments, the proposed methods outperform the conventional MFCC extraction.
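The idea of using a decomposition as a filter bank for cepstral features can be illustrated with a minimal sketch. It assumes the band signals (standing in for the intrinsic mode functions produced by EMD or CEEMD) are already available, and it takes the log energy of each band followed by a discrete cosine transform, as in conventional MFCC extraction. The function names, the toy sinusoidal bands, and the choice of an unnormalized DCT-II are assumptions made for illustration; the abstract does not specify the exact pipeline, and the decomposition step itself is not shown.

```python
import math

def log_band_energies(bands, eps=1e-12):
    """Log energy of each band signal (each band is a list of samples)."""
    return [math.log(sum(x * x for x in band) + eps) for band in bands]

def dct2(values, n_coeffs):
    """Unnormalized DCT-II of `values`, keeping the first n_coeffs terms."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]

def band_cepstral_features(bands, n_coeffs):
    """Cepstral-style features: DCT of the per-band log energies."""
    return dct2(log_band_energies(bands), n_coeffs)

# Toy stand-ins for decomposed bands (hypothetical; a real system would
# obtain these from an EMD or CEEMD of one speech frame).
fs = 8000
t = [i / fs for i in range(256)]
bands = [
    [amp * math.sin(2 * math.pi * freq * ti) for ti in t]
    for amp, freq in [(1.0, 2000), (0.5, 800), (0.25, 200)]
]

features = band_cepstral_features(bands, n_coeffs=3)
```

The resulting feature vectors, one per frame, would then be passed to a classifier such as an SVM with an RBF kernel.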

Keywords

Automatic Gender Identification; Empirical Mode Decomposition; Mel-Frequency Cepstral Coefficients; Support Vector Machine

Area

Pattern Recognition