Voice-Based Gender Recognition Using Neural Network

  • Conference paper
Information and Communication Technology for Competitive Strategies (ICTCS 2020)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 191)

Abstract

Human speech carries paralinguistic information that is used in many speech processing applications, such as automatic speech recognition, speaker recognition, and verification. Detecting gender from voice is considered one of the essential tasks for such applications. To build a model from a training set, a set of relevant speech features is extracted to distinguish gender (i.e., female or male) from a speech signal. This paper compares the performance of the proposed neural network (NN) model across different features extracted from the speech signal, namely MFCC and mel spectrogram, for recognizing gender. Experiments are carried out on the Mozilla voice dataset to evaluate the performance of the network. They show that the combination of the MFCC and mel feature sets gives the best accuracy, 94.32%.
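To make the feature combination concrete, the following is a minimal NumPy-only sketch of computing a log-mel spectrogram and MFCCs from a signal and concatenating them into one vector. It is not the authors' pipeline: the frame length, hop size, number of mel bands, number of coefficients, the time-averaging step, and all function names are assumptions chosen for illustration, and the input is a synthetic tone rather than a real recording.

```python
import numpy as np

def hz_to_mel(f):
    # Common (HTK-style) mel-scale formula.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(y, sr, n_fft=512, hop=256, n_mels=40):
    """Log-mel spectrogram, shape (n_frames, n_mels). Parameter values are assumed."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-10)  # small offset avoids log(0)

def mfcc_from_log_mel(log_mel, n_mfcc=13):
    """MFCCs as the first n_mfcc coefficients of an unnormalized DCT-II over the mel bands."""
    n_mels = log_mel.shape[1]
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return log_mel @ dct.T

def combined_features(y, sr):
    """Concatenate time-averaged MFCC and log-mel features into one vector (assumed pooling)."""
    log_mel = log_mel_spectrogram(y, sr)
    mfcc = mfcc_from_log_mel(log_mel)
    return np.concatenate([mfcc.mean(axis=0), log_mel.mean(axis=0)])

# Synthetic one-second 120 Hz tone standing in for a speech recording.
sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 120.0 * t)
features = combined_features(y, sr)
print(features.shape)
```

A vector like `features` could then be fed to a classifier; averaging over time is only one simple way to obtain a fixed-length input and may differ from what the paper does.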

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chachadi, K., Nirmala, S.R. (2022). Voice-Based Gender Recognition Using Neural Network. In: Joshi, A., Mahmud, M., Ragel, R.G., Thakur, N.V. (eds) Information and Communication Technology for Competitive Strategies (ICTCS 2020). Lecture Notes in Networks and Systems, vol 191. Springer, Singapore. https://doi.org/10.1007/978-981-16-0739-4_70
