Acoustical Science and Technology
Online ISSN : 1347-5177
Print ISSN : 1346-3969
ISSN-L : 0369-4232
PAPERS
Formant frequency estimation of high-pitched speech by homomorphic prediction
M. Shahidur Rahman, Tetsuya Shimamura
JOURNAL FREE ACCESS

2005 Volume 26 Issue 6 Pages 502-510

Abstract

The conventional linear prediction model has difficulty estimating the vocal tract characteristics of high-pitched speakers. This is because the autocorrelation function that the autocorrelation method of linear prediction uses to estimate the autoregressive (AR) coefficients is actually an “aliased” version of the autocorrelation of the vocal tract impulse response. This “aliasing” arises from the periodic nature of voiced speech. It is generally accepted that homomorphic filtering can yield an estimate of the vocal tract impulse response that is free from periodicity. Linear prediction of the resulting impulse response (referred to as homomorphic prediction) is therefore expected to be insensitive to variations in fundamental frequency. To our knowledge, however, no experimental study has yet examined the suitability of this method for analyzing high-pitched speech. This paper presents a detailed study of the prospects of homomorphic prediction as a formant tracking tool, especially for high-pitched speech, where linear prediction fails to give accurate estimates. The formant frequencies estimated with the proposed method are found to be more accurate than those of the conventional procedure by more than an order of magnitude. The accuracy of formant estimation is verified on synthetic vowels over a wide range of pitch periods covering typical male and high-pitched female speakers. The validity of the proposed method is also examined by inspecting the spectral envelopes of natural speech spoken by high-pitched female speakers. We observe that almost all previous methods addressing this limitation of linear prediction are based on the covariance technique, in which the resulting AR filter can be unstable. The solutions obtained by the current method are guaranteed to be stable, which makes it superior for many speech analysis applications.
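The pipeline described in the abstract (homomorphic filtering to estimate the vocal tract impulse response, followed by autocorrelation-method linear prediction) might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: it assumes NumPy, a Hamming window, a rectangular low-quefrency lifter, and a minimum-phase reconstruction by cepstral folding. The function names and the parameters `n_lifter` and `nfft` are hypothetical, since the abstract does not specify these details.

```python
import numpy as np

def levinson(r, order):
    """Levinson-Durbin recursion: autocorrelation r[0..order] -> AR polynomial a, a[0] = 1."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Prediction of r[i] from the current coefficients gives the reflection coefficient.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def homomorphic_impulse_response(frame, n_lifter, nfft=1024):
    """Estimate a minimum-phase vocal tract impulse response by cepstral liftering.

    Assumption: n_lifter is chosen smaller than the pitch period in samples,
    so that the pitch peaks in the cepstrum are excluded.
    """
    spectrum = np.fft.fft(frame * np.hamming(len(frame)), nfft)
    log_mag = np.log(np.abs(spectrum) + 1e-12)   # small floor avoids log(0)
    ceps = np.fft.ifft(log_mag).real             # real cepstrum
    # Keep only low quefrencies and fold them to form a minimum-phase complex cepstrum.
    folded = np.zeros(nfft)
    folded[0] = ceps[0]
    folded[1:n_lifter] = 2.0 * ceps[1:n_lifter]
    return np.fft.ifft(np.exp(np.fft.fft(folded))).real

def homomorphic_lpc(frame, order, n_lifter, nfft=1024):
    """Autocorrelation-method LPC applied to the liftered impulse response."""
    h = homomorphic_impulse_response(frame, n_lifter, nfft)
    r = np.array([np.dot(h[:len(h) - k], h[k:]) for k in range(order + 1)])
    return levinson(r, order)

def formant_frequencies(a, fs):
    """Candidate formant frequencies (Hz) from the upper-half-plane roots of the AR polynomial."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]
    return np.sort(np.angle(roots) * fs / (2.0 * np.pi))
```

Because the AR coefficients come from the autocorrelation method (via the Levinson-Durbin recursion) rather than the covariance method, the resulting all-pole filter has its poles inside the unit circle, which is the stability property the abstract highlights. In practice one would call `homomorphic_lpc` frame by frame and pass the result to `formant_frequencies`; choosing the lifter cutoff relative to the pitch period is the main tuning decision in this sketch.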

© 2005 by The Acoustical Society of Japan