Fuzzy Logic Speech/Non-speech Discrimination for Noise Robust Speech Processing

Culebras, R.; Ramírez, J.; Górriz, J. M.; Segura, J. C.

doi:10.1007/11758501_55

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3991)

Included in the following conference series:

International Conference on Computational Science

Abstract

This paper presents a fuzzy logic speech/non-speech discrimination method for improving the performance of speech processing systems working in noisy environments. The fuzzy system is based on a Sugeno inference engine with membership functions defined as a combination of two Gaussian functions. The rule base consists of ten fuzzy if-then statements defined in terms of the denoised subband signal-to-noise ratios (SNRs) and the zero crossing rates (ZCRs). Its operation is optimized by means of a hybrid training algorithm that combines the least-squares method with backpropagation gradient descent to train the membership function parameters. Experiments conducted on the Spanish SpeechDat-Car database show that the proposed method yields clear improvements over a set of standardized voice activity detectors (VADs) for discontinuous transmission (DTX) and distributed speech recognition (DSR), as well as over recently published VAD methods.
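
The ingredients named in the abstract can be illustrated with a small Python sketch. It is not the authors' system: it assumes that "a combination of two Gaussian functions" means a two-sided Gaussian with a flat top, uses a single SNR value instead of subband SNRs, and replaces the ten trained rules with four hand-picked zero-order Sugeno rules. The function names, membership parameters, rule outputs and decision threshold are all hypothetical.

```python
import math


def gauss2mf(x, sigma1, c1, sigma2, c2):
    """Two-sided Gaussian membership (assumed shape, requires c1 <= c2):
    left Gaussian below c1, value 1 on [c1, c2], right Gaussian above c2."""
    if x < c1:
        return math.exp(-((x - c1) ** 2) / (2.0 * sigma1 ** 2))
    if x > c2:
        return math.exp(-((x - c2) ** 2) / (2.0 * sigma2 ** 2))
    return 1.0


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Hypothetical membership parameters (sigma1, c1, sigma2, c2); not from the paper.
SNR_LOW = (1.0, -100.0, 3.0, 0.0)    # SNR in dB
SNR_HIGH = (3.0, 10.0, 1.0, 100.0)
ZCR_LOW = (0.01, -1.0, 0.1, 0.1)     # ZCR as a fraction in [0, 1]
ZCR_HIGH = (0.1, 0.4, 0.01, 2.0)

# Illustrative zero-order Sugeno rules: (SNR set, ZCR set, constant output).
# Output 1.0 stands for "speech", 0.0 for "non-speech".
RULES = [
    (SNR_HIGH, ZCR_LOW, 1.0),
    (SNR_HIGH, ZCR_HIGH, 1.0),
    (SNR_LOW, ZCR_LOW, 0.0),
    (SNR_LOW, ZCR_HIGH, 0.0),
]


def sugeno_vad(snr_db, zcr, threshold=0.5):
    """Return (score, is_speech) using product AND and weighted-average defuzzification."""
    num = 0.0
    den = 0.0
    for snr_mf, zcr_mf, out in RULES:
        weight = gauss2mf(snr_db, *snr_mf) * gauss2mf(zcr, *zcr_mf)
        num += weight * out
        den += weight
    score = num / den if den > 0.0 else 0.0
    return score, score > threshold


if __name__ == "__main__":
    frame = [0.3, 0.5, -0.2, -0.4, 0.1, 0.6, 0.2, -0.1]
    print(sugeno_vad(15.0, zero_crossing_rate(frame)))
```

In an ANFIS-style hybrid scheme such as the one the abstract mentions, the rule outputs would typically be fitted by least squares and the membership parameters adjusted by gradient descent; that training step is not shown here.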

Author information

Authors and Affiliations

Dept. of Signal Theory, Networking and Communications, University of Granada, Spain
R. Culebras, J. Ramírez, J. M. Górriz & J. C. Segura

Editor information

Editors and Affiliations

Advanced Computing and Emerging Technologies Centre, The School of Systems Engineering, University of Reading, RG6 6AY, Reading, United Kingdom
Vassil N. Alexandrov
Department of Mathematics and Computer Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands
Geert Dick van Albada
Faculty of Sciences, Section of Computational Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands
Peter M. A. Sloot
Computer Science Department, University of Tennessee, 37996-3450, Knoxville, TN, USA
Jack Dongarra

About this paper

Cite this paper

Culebras, R., Ramírez, J., Górriz, J.M., Segura, J.C. (2006). Fuzzy Logic Speech/Non-speech Discrimination for Noise Robust Speech Processing. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds) Computational Science – ICCS 2006. ICCS 2006. Lecture Notes in Computer Science, vol 3991. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11758501_55

DOI: https://doi.org/10.1007/11758501_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34379-0
Online ISBN: 978-3-540-34380-6
eBook Packages: Computer Science, Computer Science (R0)
