Simultaneous speaker identification and watermarking

Abd El-Wahab, Basant S.; El-khobby, Heba A.; Abd Elnaby, Mustafa M.; Abd El-Samie, Fathi E.

doi:10.1007/s10772-019-09658-x

Published: 15 January 2021

International Journal of Speech Technology, Volume 24, pages 205–218 (2021)


Abstract

Protecting biometric speech templates and hiding information in speech signals are two challenging problems. To address them and raise the level of security, we build a multi-level security system based on speech signals, in which speech watermarking is used together with automatic speaker identification. Images are embedded as watermarks into the speech signals used for speaker identification. The watermark is extracted for authentication, and the effect of watermark removal on the performance of the speaker identification system in the presence of degradations is then studied. This paper presents a speech watermarking approach based on empirical mode decomposition (EMD) in different transform domains and on singular value decomposition (SVD). The speech signal is decomposed with EMD in different transform domains to yield zero-mean components called intrinsic mode functions (IMFs), and the watermark is inserted into one of these IMFs with SVD. Different transform domains are compared for implementing the proposed watermarking scheme on different IMFs, using the log-likelihood ratio (LLR), correlation coefficient (C_r), signal-to-noise ratio (SNR), and spectral distortion (SD) as metrics. According to the simulation results, watermark embedding in the discrete sine transform domain provides higher SNR and C_r values and lower SD and LLR values than the other domains. The proposed approach is robust to different attacks.
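
The embedding and evaluation steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact embedding rule, so everything below is an assumption: the code uses one common non-blind SVD scheme (adding a scaled watermark to the singular-value matrix), skips the EMD stage and applies the embedding directly to the discrete sine transform (DST-I) coefficients of a synthetic signal, and uses hypothetical names (`dst1_matrix`, `embed`, `extract`) and an arbitrary strength `alpha`. Only SNR and C_r are computed.

```python
import numpy as np

def dst1_matrix(n):
    """Orthonormal DST-I matrix; it is symmetric and equal to its own inverse."""
    k = np.arange(1, n + 1)
    return np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * np.outer(k, k) / (n + 1))

def embed(coeffs, watermark, alpha):
    """Assumed rule: add alpha * watermark to the singular-value matrix."""
    m = watermark.shape[0]
    block = coeffs[: m * m].reshape(m, m)
    u, s, vt = np.linalg.svd(block)
    # Second SVD of the perturbed singular-value matrix.
    uw, sw, vwt = np.linalg.svd(np.diag(s) + alpha * watermark)
    marked = coeffs.copy()
    marked[: m * m] = (u @ np.diag(sw) @ vt).ravel()
    # The key (uw, vwt, s) is needed at the receiver: the scheme is non-blind.
    return marked, (uw, vwt, s)

def extract(coeffs, key, m, alpha):
    """Recover the watermark from marked coefficients using the stored key."""
    uw, vwt, s = key
    sw = np.linalg.svd(coeffs[: m * m].reshape(m, m), compute_uv=False)
    return (uw @ np.diag(sw) @ vwt - np.diag(s)) / alpha

# Synthetic stand-in for a speech component (not real speech, not an IMF).
rng = np.random.default_rng(0)
m, alpha = 32, 0.01
n = m * m
t = np.arange(n) / 8000.0
speech = (0.5 * np.sin(2 * np.pi * 200 * t)
          + 0.3 * np.sin(2 * np.pi * 450 * t)
          + 0.05 * rng.standard_normal(n))
watermark = rng.integers(0, 2, size=(m, m)).astype(float)  # binary "image"

S = dst1_matrix(n)
marked_coeffs, key = embed(S @ speech, watermark, alpha)
marked_speech = S @ marked_coeffs              # inverse DST-I back to time domain
extracted = extract(S @ marked_speech, key, m, alpha)

# SNR of the watermarked signal and correlation coefficient of the watermark.
snr_db = 10 * np.log10(np.sum(speech ** 2) / np.sum((speech - marked_speech) ** 2))
corr = np.corrcoef(watermark.ravel(), extracted.ravel())[0, 1]
print(f"SNR = {snr_db:.2f} dB, C_r = {corr:.4f}")
```

In this sketch a larger `alpha` would be expected to lower the SNR of the watermarked signal while making the extracted watermark less sensitive to perturbations; that trade-off is why imperceptibility and robustness metrics are usually reported together.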

Author information

Authors and Affiliations

Department of Electronics and Electrical Communications Engineering, Faculty of Engineering, Tanta University, Tanta, Egypt
Basant S. Abd El-Wahab, Heba A. El-khobby & Mustafa M. Abd Elnaby
Department of Electronics and Electrical Communications, Faculty of Electronic Engineering, Menoufia University, Al Minufiyah, Egypt
Fathi E. Abd El-Samie

Corresponding author

Correspondence to Fathi E. Abd El-Samie.

About this article

Cite this article

Abd El-Wahab, B.S., El-khobby, H.A., Abd Elnaby, M.M. et al. Simultaneous speaker identification and watermarking. Int J Speech Technol 24, 205–218 (2021). https://doi.org/10.1007/s10772-019-09658-x

Received: 28 February 2019
Accepted: 25 November 2019
Published: 15 January 2021
Issue Date: March 2021
DOI: https://doi.org/10.1007/s10772-019-09658-x
