Abstract
Technological progress is now intrinsic to everyday transactions, which depend on electronic applications for financial and banking transfers, health care, project management, and other crucial aspects of life. At the core of these applications are person identification and/or verification steps, which remain among their most difficult challenges. Biometric attributes can yield promising outcomes in these fields. A person's voice is a unique biometric feature by which people can be authenticated, and it prevents others from assuming a person's identity without their prior knowledge or consent. This work proposes a deep learning model with a new architecture that identifies a person by exploiting the unique individual characteristics of their voice. An augmentation method is used to increase the number of samples in the available dataset. The temporal information in an input audio file is analysed, and feature maps representing the salient time-domain features are extracted from it. The decision is made by tracking these voice features over time. The results are promising: on the VoxCeleb1 dataset, for identifying 40 subjects, the accuracy is close to 99.81% (± 1.78%) and the loss is close to 0.009.
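The abstract does not say which augmentation method is used. As an illustration only, the sketch below shows two common time-domain augmentations, a circular time shift and additive Gaussian noise at a chosen signal-to-noise ratio, applied to a synthetic tone. The function names, the 20 dB SNR, the 1600-sample maximum shift, and the 16 kHz sample rate are all assumptions made for this example, not details taken from the paper. It uses only the Python standard library so that it stays self-contained.

```python
import math
import random


def time_shift(wave, shift):
    """Circularly shift the waveform by `shift` samples (positive = delay)."""
    shift %= len(wave)
    return wave[-shift:] + wave[:-shift] if shift else list(wave)


def add_noise(wave, snr_db, rng):
    """Add white Gaussian noise so the result has roughly `snr_db` dB SNR."""
    signal_power = sum(x * x for x in wave) / len(wave)
    noise_std = math.sqrt(signal_power / 10 ** (snr_db / 10))
    return [x + rng.gauss(0.0, noise_std) for x in wave]


def augment(wave, rng, snr_db=20.0, max_shift=1600):
    """Produce one augmented copy: random shift, then additive noise.

    The defaults are illustrative assumptions, not values from the paper.
    """
    shift = rng.randint(-max_shift, max_shift)
    return add_noise(time_shift(wave, shift), snr_db, rng)


# Synthetic stand-in for a speech clip: one second of a 220 Hz tone.
rng = random.Random(0)
sample_rate = 16000  # assumed sample rate
wave = [0.5 * math.sin(2 * math.pi * 220 * n / sample_rate)
        for n in range(sample_rate)]

# Three extra training samples derived from the same recording.
augmented = [augment(wave, rng) for _ in range(3)]
```

Each augmented copy keeps the length and speaker identity label of the original clip, so a small dataset can be enlarged without new recordings.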
About this article
Cite this article
AL-Shakarchy, N.D., Obayes, H. & Abdullah, Z.N. Person identification based on voice biometric using deep neural network. Int. j. inf. tecnol. 15, 789–795 (2023). https://doi.org/10.1007/s41870-022-01142-1
DOI: https://doi.org/10.1007/s41870-022-01142-1