Abstract
In this chapter we discuss, in an informal manner, some of the successes and a few of the outstanding problems of automatic speech recognition (ASR) and speaker identification for forensic, business and banking purposes. ASR can also help the hard-of-hearing by giving them printed text to read, and the wheelchair-bound by allowing them to control their vehicles by voice. Together with speech synthesis from text, human-machine dialogue systems offer attractive possibilities for all manner of information services.
Civilization advances by extending the number of important operations which we can perform without thinking.
Alfred North Whitehead
If anything can go wrong, it will.
Murphy’s Law
Chapter 3 — Speech Recognition
C.-H. Lee, F.K. Soong, K.K. Paliwal: Automatic Speech and Speaker Recognition (Kluwer, Boston 1996)
K.H. Davis, R. Biddulph, S. Balashek: Automatic recognition of spoken digits. J. Acoust. Soc. Am. 24, 637–642 (1952)
L.R. Rabiner, B.-H. Juang: Fundamentals of Speech Recognition (Prentice-Hall, Englewood Cliffs, New Jersey, 1993)
S.E. Levinson, L.R. Rabiner: A Task-Oriented Conversational Mode Speech Understanding System, in M.R. Schroeder (ed.): Speech and Speaker Recognition (Karger, Basel 1985)
R.K. Potter, G.A. Kopp, H.C. Green: Visible Speech (D. van Nostrand Co., New York 1947)
R.H. Bolt, F.S. Cooper, E.E. David, Jr., P.B. Denes, J.M. Pickett, K.N. Stevens: Speaker identification by speech spectrograms: A scientists’ view of its reliability for legal purposes. J. Acoust. Soc. Am. 47, 597–612 (1970)
S. Furui: An Overview of Speaker Recognition Technology, in [3.1] pp. 31–56
H.W. Strube, D. Helling, A. Krause, M.R. Schroeder: Word and Speaker Recognition Based on Entire Words, in M.R. Schroeder (ed.): Speech and Speaker Recognition (Karger, Basel 1985)
E.J. Gumbel (ed.): The Emil J. Gumbel Collection: Political Papers of an Anti-Nazi Scholar in Weimar and Exile, 1914–1966 (1990). See also S. Fleishman: Gumbel, the Fire-Breathing Dragon (1970)
L.R. Rabiner, B.H. Juang: An introduction to hidden Markov models. IEEE ASSP Magazine (January 1986)
J. Glimm, J. Impagliazzo, I. Singer (eds.): The Legacy of John von Neumann (Proceedings of Symposia in Pure Mathematics), 50 (American Mathematical Society, Washington 1988)
T. Gramss, S. Bornholdt, M. Gross, M. Mitchell, T. Pellizzari (eds.): Non-Standard Computation (Wiley-VCH, Weinheim 1998)
W.S. McCulloch: The Complete Works of Warren S. McCulloch (Intersystems Publications, Salinas, California, 1993)
F. Rosenblatt: Principles of Neurodynamics (Spartan Books, New York 1962)
D.E. Rumelhart, J.L. McClelland: Parallel Distributed Processing (MIT Press, Cambridge, Massachusetts, 1986)
T. Kohonen: Self-Organizing Maps, 2nd ed. (Springer, Berlin, Heidelberg 1995)
J.J. Hopfield: Neural networks and physical systems with emergent collective computational abilities. Proc. Nat. Acad. Sciences, USA 79, 2554–2558 (1982)
L.P. Yaroslavsky: Digital Picture Processing: An Introduction (Springer, Berlin, Heidelberg 1985)
J.-C. Junqua, J.-P. Haton (eds.): Special Issue on Robust Speech Recognition. Speech Communication 25, 1–192 (1998)
B.E.D. Kingsbury, N. Morgan, S. Greenberg: Robust speech recognition using the modulation spectrogram. Speech Communication (Special Issue on Robust Speech Recognition) 25, 3–27 (1998)
H. Hermansky: Should recognizers have ears? Speech Communication (Special Issue) 25, 3–27 (1998)
B. Kollmeier, R. Koch: Speech enhancement based on physiological and psychoacoustic models of modulation perception and binaural interaction. J. Acoust. Soc. Am. 95, 1593–1602 (1994)
T. Houtgast, H.J.M. Steeneken: A review of the MTF concept in room acoustics and its use for estimating speech intelligibility. J. Acoust. Soc. Am. 77, 1069–1077 (1985)
M.R. Schroeder: Modulation transfer function: definition and measurement. Acustica 49, 179–182 (1980)
© 1999 Springer-Verlag Berlin Heidelberg
Schroeder, M.R. (1999). Speech Recognition and Speaker Identification. In: Computer Speech. Springer Series in Information Sciences, vol 35. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-03861-1_3
DOI: https://doi.org/10.1007/978-3-662-03861-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-03863-5
Online ISBN: 978-3-662-03861-1
eBook Packages: Springer Book Archive