ISCA Archive Interspeech 2005

Implementing frequency-warping and VTLN through linear transformation of conventional MFCC

S. Umesh, András Zolnay, Hermann Ney

In this paper, we show that frequency-warping (including VTLN) can be implemented through a linear transformation of conventional MFCC. Unlike the Pitz-Ney [1] continuous-domain approach, we determine the relation between frequency-warping and the linear transformation directly in the discrete domain. The advantage of this approach is that it applies to any frequency-warping and is not limited to cases where an analytical closed-form solution can be found. The proposed method uses bandlimited interpolation in the frequency domain to perform the necessary frequency-warping and yields exact results as long as the cepstral coefficients are quefrency-limited. This notion of quefrency limitedness shows the importance of filter-bank smoothing of the spectra, which has been ignored in [1, 2]. Furthermore, unlike [1], because we operate in the discrete domain, we can also apply the usual discrete cosine transform (i.e., DCT-II) to the logarithm of the filter-bank output to obtain conventional MFCC features. Therefore, with the proposed method, VTLN can be performed by linearly transforming conventional MFCC cepstra, without recomputing the warped features. We provide experimental results in support of this approach.
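The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes an unnormalized DCT-II over M log filter-bank outputs, treats the inverse DCT as a cosine series that can be evaluated between the filter-bank centres (the bandlimited interpolation step), and resamples that series at warped positions. All function names, the normalized filter-bank axis on (0, 1), and the piecewise-linear warp with a knee at 0.8 are hypothetical choices made for illustration; the paper's actual filter-bank layout and warping functions may differ.

```python
import numpy as np

def dct2_matrix(M):
    """Unnormalized DCT-II matrix: C[k, m] = cos(pi * k * (m + 0.5) / M)."""
    k = np.arange(M)[:, None]
    m = np.arange(M)[None, :]
    return np.cos(np.pi * k * (m + 0.5) / M)

def warped_idct_matrix(M, warp):
    """Inverse DCT-II evaluated at warped positions instead of the usual
    centres (m + 0.5) / M. This is the interpolation step (assumed form)."""
    x = (np.arange(M) + 0.5) / M          # normalized filter-bank centres
    xw = np.asarray(warp(x), dtype=float)  # warped sampling positions
    k = np.arange(M)[None, :]
    basis = np.cos(np.pi * k * xw[:, None])
    scale = np.full(M, 2.0 / M)            # inverse scaling of the unnormalized DCT-II
    scale[0] = 1.0 / M
    return basis * scale[None, :]

def warp_transform(M, warp):
    """Matrix T such that T @ c gives the cepstra of the warped log spectrum."""
    return dct2_matrix(M) @ warped_idct_matrix(M, warp)

def piecewise_linear_warp(alpha, knee=0.8):
    """Hypothetical VTLN-style warp on [0, 1]: slope alpha up to the knee,
    then a straight segment that maps 1 to 1."""
    def warp(x):
        x = np.asarray(x, dtype=float)
        upper = alpha * knee + (1.0 - alpha * knee) * (x - knee) / (1.0 - knee)
        return np.where(x <= knee, alpha * x, upper)
    return warp

if __name__ == "__main__":
    M = 20
    T = warp_transform(M, piecewise_linear_warp(1.1))
    c = np.random.default_rng(0).standard_normal(M)  # stand-in cepstral vector
    c_warped = T @ c                                  # no feature recomputation
    print(T.shape, c_warped.shape)
```

In this sketch the matrix depends only on the warp and on M, so it could be precomputed once per warping factor and applied to existing cepstra. With the identity warp (alpha = 1) it reduces to the identity matrix, which is a convenient sanity check.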


doi: 10.21437/Interspeech.2005-155

Cite as: Umesh, S., Zolnay, A., Ney, H. (2005) Implementing frequency-warping and VTLN through linear transformation of conventional MFCC. Proc. Interspeech 2005, 269-272, doi: 10.21437/Interspeech.2005-155

@inproceedings{umesh05_interspeech,
  author={S. Umesh and András Zolnay and Hermann Ney},
  title={{Implementing frequency-warping and VTLN through linear transformation of conventional MFCC}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={269--272},
  doi={10.21437/Interspeech.2005-155}
}