Nonlinear, Model-Based Microphone Array Speech Enhancement

  • Chapter
Acoustic Signal Processing for Telecommunication

Abstract

In this chapter we address the limitations of current approaches to using microphone arrays for speech acquisition and advocate the development of multichannel techniques which employ non-traditional processing and an explicit model of the speech signal. The goal is to combine the advantages of spatial filtering achieved through beamforming with knowledge of the desired time-series attributes and intuitive nonlinear processing. We then offer a multichannel algorithm which incorporates these principles. The enhanced speech is synthesized using a linear predictive filter whose excitation signal is computed by a nonlinear wavelet-domain process. This process uses extrema clustering of the multichannel speech data to discriminate portions of the linear prediction residual produced by the desired speech signal from those due to multipath effects and uncorrelated noise. The algorithm is shown to be capable of identifying and attenuating reverberant portions of the speech signal and reducing the effects of additive noise.
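The linear predictive analysis and synthesis steps mentioned in the abstract can be illustrated with a minimal single-channel sketch. It is not the chapter's algorithm: the function names, the autocorrelation-method (Levinson-Durbin) estimator, the model order, and the synthetic test signal are all assumptions made for illustration, and the wavelet extrema clustering that would modify the residual is left out. The sketch only shows that inverse filtering yields a prediction residual and that passing an excitation through the all-pole filter resynthesizes the signal.

```python
import numpy as np

def lpc_coefficients(x, order):
    """Autocorrelation-method LPC via Levinson-Durbin.

    Returns a with a[0] == 1, so that A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p.
    Assumes x is not all zeros (otherwise r[0] is 0 and the division fails).
    """
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]  # prediction error energy for order 0
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def lpc_residual(x, a):
    """Inverse (FIR) filtering: e[n] = sum_j a[j] * x[n - j], zero history."""
    return np.convolve(x, a)[:len(x)]

def lpc_synthesize(e, a):
    """All-pole synthesis: y[n] = e[n] - sum_{j>=1} a[j] * y[n - j]."""
    p = len(a) - 1
    y = np.zeros(len(e))
    for n in range(len(e)):
        acc = e[n]
        for j in range(1, min(p, n) + 1):
            acc -= a[j] * y[n - j]
        y[n] = acc
    return y

# Hypothetical test signal: two sinusoids plus a little noise.
rng = np.random.default_rng(0)
t = np.arange(400)
x = np.sin(0.1 * t) + 0.5 * np.sin(0.27 * t) + 0.05 * rng.standard_normal(400)

a = lpc_coefficients(x, order=10)
e = lpc_residual(x, a)

# An enhancement scheme would replace e with a modified excitation here;
# this sketch passes the residual through unchanged.
y = lpc_synthesize(e, a)
print("max reconstruction error:", np.max(np.abs(y - x)))
```

With the residual left unchanged, the synthesis filter simply inverts the analysis filter, so `y` matches `x` up to rounding. In a scheme of the kind the abstract describes, the enhancement would come from what is done to the excitation between those two steps.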

Copyright information

© 2000 Springer Science+Business Media New York

About this chapter

Cite this chapter

Brandstein, M.S., Griebel, S.M. (2000). Nonlinear, Model-Based Microphone Array Speech Enhancement. In: Gay, S.L., Benesty, J. (eds) Acoustic Signal Processing for Telecommunication. The Springer International Series in Engineering and Computer Science, vol 551. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-8644-3_12

  • DOI: https://doi.org/10.1007/978-1-4419-8644-3_12

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-4656-2

  • Online ISBN: 978-1-4419-8644-3

  • eBook Packages: Springer Book Archive
