Abstract
Cosine similarity scoring is widely used with the i-vector model in text-independent speaker recognition because of its computational efficiency and good performance. In this paper, we propose a new Mahalanobis distance scoring method based on a distance metric learning algorithm. The Mahalanobis metric matrix is learned using the KISS (keep it simple and straightforward!) method, which is motivated by a statistical inference perspective based on a likelihood-ratio test. After whitening and length normalization, the i-vectors extracted from the development utterances are used to train the metric matrix. The score between a target i-vector and a test i-vector is then computed from the Mahalanobis distance. Results on NIST 2008 telephone data show that the new scoring outperforms cosine similarity scoring.
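The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the commonly cited KISS metric form, M = inv(Sigma_same) - inv(Sigma_diff), where the two covariances are estimated from differences of same-speaker and different-speaker i-vector pairs. The toy 4-dimensional "i-vectors", the `ridge` regularizer, the function names, and the choice of the negated distance as the score are all assumptions made for illustration. Whitening is assumed to have been applied already, and a projection of M onto the positive semidefinite cone, which is often used in practice, is omitted to keep the example short.

```python
import math
import random

def length_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def diff_covariance(pairs):
    """Average outer product of the pairwise difference vectors."""
    dim = len(pairs[0][0])
    cov = [[0.0] * dim for _ in range(dim)]
    for a, b in pairs:
        d = [x - y for x, y in zip(a, b)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += d[i] * d[j] / len(pairs)
    return cov

def invert(m):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def kiss_metric(same_pairs, diff_pairs, ridge=1e-6):
    """Assumed KISS form: M = inv(Sigma_same) - inv(Sigma_diff)."""
    def regularized_inverse(cov):
        # The small ridge on the diagonal is an assumption for stability.
        return invert([[c + (ridge if i == j else 0.0)
                        for j, c in enumerate(row)]
                       for i, row in enumerate(cov)])
    inv_same = regularized_inverse(diff_covariance(same_pairs))
    inv_diff = regularized_inverse(diff_covariance(diff_pairs))
    return [[s - d for s, d in zip(rs, rd)]
            for rs, rd in zip(inv_same, inv_diff)]

def mahalanobis_score(m, target, test):
    """Negated squared Mahalanobis distance, so higher means more similar."""
    d = [x - y for x, y in zip(target, test)]
    n = len(d)
    return -sum(d[i] * m[i][j] * d[j] for i in range(n) for j in range(n))

# Synthetic development set: 20 hypothetical speakers, 4 sessions each.
random.seed(0)
DIM = 4
speakers = []
for _ in range(20):
    center = [random.gauss(0.0, 1.0) for _ in range(DIM)]
    sessions = [length_normalize([c + random.gauss(0.0, 0.1) for c in center])
                for _ in range(4)]
    speakers.append(sessions)

# Same-speaker pairs: every pair of sessions within one speaker.
same_pairs = [(s[i], s[j]) for s in speakers
              for i in range(4) for j in range(i + 1, 4)]
# Different-speaker pairs: first session of every pair of speakers.
diff_pairs = [(speakers[a][0], speakers[b][0])
              for a in range(20) for b in range(a + 1, 20)]

M = kiss_metric(same_pairs, diff_pairs)
same_mean = sum(mahalanobis_score(M, a, b) for a, b in same_pairs) / len(same_pairs)
diff_mean = sum(mahalanobis_score(M, a, b) for a, b in diff_pairs) / len(diff_pairs)
print("mean same-speaker score:", same_mean)
print("mean different-speaker score:", diff_mean)
```

On this synthetic data the mean same-speaker score should come out higher than the mean different-speaker score, which is the behavior a verification threshold relies on. A real system would estimate the covariances from development i-vectors of much higher dimension and would use a numerical linear algebra library instead of the hand-written inverse.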
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Lei, Z., Luo, J., Wan, Y., Yang, Y. (2015). A Mahalanobis Distance Scoring with KISS Metric Learning Algorithm for Speaker Recognition. In: Yang, J., Yang, J., Sun, Z., Shan, S., Zheng, W., Feng, J. (eds) Biometric Recognition. CCBR 2015. Lecture Notes in Computer Science, vol 9428. Springer, Cham. https://doi.org/10.1007/978-3-319-25417-3_51
DOI: https://doi.org/10.1007/978-3-319-25417-3_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25416-6
Online ISBN: 978-3-319-25417-3
eBook Packages: Computer Science, Computer Science (R0)