Speech Recognition and Speaker Identification

Schroeder, Manfred R.

doi:10.1007/978-3-662-03861-1_3

Part of the book series: Springer Series in Information Sciences (SSINF, volume 35)

Abstract

In this chapter we discuss, in an informal manner, some of the successes and a few of the outstanding problems of automatic speech recognition (ASR) and speaker identification — for forensic, business and banking purposes. ASR can also help the hard-of-hearing by giving them printed text to read, and the wheelchair-bound by allowing them to control their vehicles by voice. Together with speech synthesis from text, human-machine dialogue systems offer attractive possibilities for all manner of information services.

Civilization advances by extending the number of important operations which we can perform without thinking.

Alfred North Whitehead

If anything can go wrong, it will.

Murphy’s Law

Chapter 3 — Speech Recognition

C.-H. Lee, F.K. Soong, K.K. Paliwal: Automatic Speech and Speaker Recognition (Kluwer, Boston 1996)
K.H. Davis, R. Biddulph, S. Balashek: Automatic recognition of spoken digits. J. Acoust. Soc. Am. 24, 637–642 (1952)
L.R. Rabiner, B.-H. Juang: Fundamentals of Speech Recognition (Prentice-Hall, Englewood Cliffs, New Jersey, 1993)
S.E. Levinson, L.R. Rabiner: A Task-Oriented Conversational Mode Speech Understanding System, in M.R. Schroeder (ed.): Speech and Speaker Recognition (Karger, Basel 1985)
R.K. Potter, G.A. Kopp, H.C. Green: Visible Speech (D. van Nostrand Co., New York 1947)
R.H. Bolt, F.S. Cooper, E.E. David, Jr., P.B. Denes, J.M. Pickett, K.N. Stevens: Speaker identification by speech spectrograms: A scientists’ view of its reliability for legal purposes. J. Acoust. Soc. Am. 47, 597–612 (1970)
S. Furui: An Overview of Speaker Recognition Technology, in [3.1] pp. 31–56
H.W. Strube, D. Helling, A. Krause, M.R. Schroeder: Word and Speaker Recognition Based on Entire Words, in M.R. Schroeder (ed.): Speech and Speaker Recognition (Karger, Basel 1985)
E.J. Gumbel (ed.): The Emil J. Gumbel Collection: Political Papers of an Anti-Nazi Scholar in Weimar and Exile, 1914–1966 (1990). See also S. Fleishman: Gumbel, the Fire-Breathing Dragon (1970)
L.R. Rabiner, B.H. Juang: An introduction to hidden Markov models. IEEE ASSP Magazine (January 1986)
J. Glimm, J. Impagliazzo, I. Singer (eds.): The Legacy of John von Neumann (Proceedings of Symposia in Pure Mathematics), 50 (American Mathematical Society, Washington 1988)
T. Gramss, S. Bornholdt, M. Gross, M. Mitchell, T. Pellizzari (eds.): Non-Standard Computation (Wiley-VCH, Weinheim 1998)
W.S. McCulloch: The Complete Works of Warren S. McCulloch (Intersystems Publications, Salinas, California, 1993)
F. Rosenblatt: Principles of Neurodynamics (Spartan Books, New York 1962)
D.E. Rumelhart, J.L. McClelland: Parallel Distributed Processing (MIT Press, Cambridge, Massachusetts, 1986)
T. Kohonen: Self-Organizing Maps, 2nd ed. (Springer, Berlin, Heidelberg 1995)
J.J. Hopfield: Neural networks and physical systems with emergent collective computational abilities. Proc. Nat. Acad. Sciences, USA 79, 2554–2558 (1982)
L.P. Yaroslavsky: Digital Picture Processing: An Introduction (Springer, Berlin, Heidelberg 1985)
J.-C. Junqua, J.-P. Haton (eds.): Special Issue on Robust Speech Recognition. Speech Communication 25, 1–192 (1998)
B.E.D. Kingsbury, N. Morgan, S. Greenberg: Robust speech recognition using the modulation spectrogram. Speech Communication (Special Issue on Robust Speech Recognition) 25, 117–132 (1998)
H. Hermansky: Should recognizers have ears? Speech Communication (Special Issue) 25, 3–27 (1998)
B. Kollmeier, R. Koch: Speech enhancement based on physiological and psychoacoustic models of modulation perception and binaural interaction. J. Acoust. Soc. Am. 95, 1593–1602 (1994)
T. Houtgast, H.J.M. Steeneken: A review of the MTF concept in room acoustics and its use for estimating speech intelligibility. J. Acoust. Soc. Am. 77, 1069–1077 (1985)
M.R. Schroeder: Modulation transfer function: definition and measurement. Acustica 49, 179–182 (1980)

Author information

Authors and Affiliations

Drittes Physikalisches Institut, Universität Göttingen, Bürgerstrasse 42-44, D-37073, Göttingen, Germany
Professor Dr. Manfred R. Schroeder

About this chapter

Cite this chapter

Schroeder, M.R. (1999). Speech Recognition and Speaker Identification. In: Computer Speech. Springer Series in Information Sciences, vol 35. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-03861-1_3

DOI: https://doi.org/10.1007/978-3-662-03861-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-03863-5
Online ISBN: 978-3-662-03861-1
eBook Packages: Springer Book Archive
