Abstract
In this chapter we address the limitations of current approaches to using microphone arrays for speech acquisition and advocate the development of multi-channel techniques which employ non-traditional processing and an explicit model of the speech signal. The goal is to combine the advantages of spatial filtering achieved through beamforming with knowledge of the desired time-series attributes and intuitive nonlinear processing. We then offer a multi-channel algorithm which incorporates these principles. The enhanced speech is synthesized using a linear predictive filter. The excitation signal is computed from a nonlinear wavelet-domain process, which uses extrema clustering of the multi-channel speech data to discriminate portions of the linear prediction residual produced by the desired speech signal from those due to multipath effects and uncorrelated noise. The algorithm is shown to be capable of identifying and attenuating reverberant portions of the speech signal and reducing the effects of additive noise.
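The linear predictive analysis and synthesis step mentioned in the abstract can be illustrated with a minimal single-channel sketch. This is not the chapter's algorithm: the wavelet-domain extrema clustering that shapes the excitation is omitted, the test signal is synthetic, and the function names (`lpc_coefficients`, `lp_residual`, `lp_synthesize`) and the prediction order of 10 are assumptions chosen for illustration only.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC. Returns a with a[0] = 1 (error filter A(z))."""
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R p = r[1:] for the predictor p
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)  # small ridge for numerical stability
    p = np.linalg.solve(R, r[1:])
    return np.concatenate(([1.0], -p))

def lp_residual(frame, a):
    """Inverse (FIR) filtering: e[n] = sum_k a[k] * x[n-k]."""
    return np.convolve(frame, a)[: len(frame)]

def lp_synthesize(excitation, a):
    """All-pole synthesis: y[n] = e[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, min(len(a), n + 1)):
            acc -= a[k] * y[n - k]
        y[n] = acc
    return y

# Synthetic stand-in for a speech frame (assumption): white noise
# passed through a stable, resonant two-pole filter.
rng = np.random.default_rng(0)
x = lp_synthesize(rng.standard_normal(400), np.array([1.0, -1.3, 0.8]))

a = lpc_coefficients(x, order=10)  # order 10 is an illustrative choice
e = lp_residual(x, a)              # linear prediction residual
y = lp_synthesize(e, a)            # resynthesis from the unmodified residual
```

With the residual left untouched, the synthesis filter reproduces the input frame. An enhancement scheme of the kind the abstract describes would instead modify the excitation before this final synthesis step, for example by attenuating residual portions attributed to reverberation or noise.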
Copyright information
© 2000 Springer Science+Business Media New York
About this chapter
Cite this chapter
Brandstein, M.S., Griebel, S.M. (2000). Nonlinear, Model-Based Microphone Array Speech Enhancement. In: Gay, S.L., Benesty, J. (eds) Acoustic Signal Processing for Telecommunication. The Springer International Series in Engineering and Computer Science, vol 551. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-8644-3_12
DOI: https://doi.org/10.1007/978-1-4419-8644-3_12
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-4656-2
Online ISBN: 978-1-4419-8644-3
eBook Packages: Springer Book Archive