Voice-Based Gender Recognition Using Neural Network

Chachadi, Kavita; Nirmala, S. R.

doi:10.1007/978-981-16-0739-4_70

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 191)

Abstract

Human speech carries paralinguistic information that is used in many speech applications, such as automatic speech recognition, speaker recognition, and speaker verification. Identifying gender from voice is considered an essential task for such applications. To build a model from a training set, a set of relevant speech features is extracted to distinguish gender (i.e., female or male) from a speech signal. This paper compares the performance of the proposed neural network (NN) model with different features, namely MFCC and mel spectrogram, extracted from the speech signal to recognize gender. Experiments are carried out on the Mozilla voice dataset, and the performance of the network is evaluated. The experiments show that the combination of the MFCC and mel feature sets performs better, reaching an accuracy of 94.32%.
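The abstract does not say how the MFCC and mel feature sets are combined. As a purely illustrative sketch, one common approach is to average each per-frame feature matrix over time and concatenate the two resulting vectors into a single fixed-length input for a classifier. The function names `time_average` and `combine_features` and the toy numbers below are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def time_average(frames):
    """Average a list of per-frame feature vectors into one vector.

    `frames` is a list of equal-length lists, one list per audio frame.
    """
    if not frames:
        raise ValueError("need at least one frame")
    width = len(frames[0])
    if any(len(frame) != width for frame in frames):
        raise ValueError("all frames must have the same length")
    # zip(*frames) groups the values of each coefficient across frames.
    return [fmean(column) for column in zip(*frames)]


def combine_features(mfcc_frames, mel_frames):
    """Concatenate time-averaged MFCC and mel features (assumed scheme)."""
    return time_average(mfcc_frames) + time_average(mel_frames)


# Toy input: 2 frames with 2 MFCC coefficients and 3 mel bands each.
mfcc_frames = [[1.0, 2.0], [3.0, 4.0]]
mel_frames = [[0.0, 10.0, 20.0], [2.0, 10.0, 22.0]]
combined = combine_features(mfcc_frames, mel_frames)
```

In practice the per-frame matrices would come from an audio library rather than being written by hand, and the combined vector would be fed to the network as one training example.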

Author information

Authors and Affiliations

KLE Technological University, Hubballi, Karnataka, India
Kavita Chachadi & S. R. Nirmala

Editor information

Editors and Affiliations

Global Knowledge Research Foundation, Ahmedabad, Gujarat, India
Amit Joshi
Computing and Technology, Nottingham Trent University, Nottingham, Nottinghamshire, UK
Mufti Mahmud
University of Peradeniya, Kandy, Sri Lanka
Roshan G. Ragel
Prof Ram Meghe College of Engineering and Management, Amravati, India
Nileshsingh V. Thakur

Cite this paper

Chachadi, K., Nirmala, S.R. (2022). Voice-Based Gender Recognition Using Neural Network. In: Joshi, A., Mahmud, M., Ragel, R.G., Thakur, N.V. (eds) Information and Communication Technology for Competitive Strategies (ICTCS 2020). Lecture Notes in Networks and Systems, vol 191. Springer, Singapore. https://doi.org/10.1007/978-981-16-0739-4_70

DOI: https://doi.org/10.1007/978-981-16-0739-4_70
Published: 27 July 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-0738-7
Online ISBN: 978-981-16-0739-4
eBook Packages: Intelligent Technologies and Robotics (R0)