Bengali Phonetics Identification Using Wavelet Based Signal Feature

Phadikar, Santanu; Das, Piyali; Bhakta, Ishita; Roy, Asmita; Midya, Sadip; Majumder, Koushik

doi:10.1007/978-981-10-6427-2_21

Part of the book series: Communications in Computer and Information Science (CCIS, volume 775)

Included in the following conference series:

International Conference on Computational Intelligence, Communications, and Business Analytics

Abstract

With advances in voice signal processing, speech-to-text recognition has become an important area of research. Although some work exists for English, comparable efforts for regional languages such as Bengali, Hindi, and Gujarati are rare or have not yet begun. The objective of this work is therefore to develop a method for identifying isolated Bengali letters (Swarabarna and Banjanbarna) from uttered sound. In speech processing, identifying an uttered letter consists of two major steps: speech feature extraction and feature classification. Studies show that Mel Frequency Cepstral Coefficients (MFCC) give a better representation of the human auditory system, but MFCC performance degrades as noise increases; this degradation may be reduced by applying the Discrete Wavelet Transform (DWT). MFCC combined with DWT is therefore used as the feature in this work, termed the Mel Frequency Wavelet Transform Coefficient (MFWTC). For the experiment, a sound database was developed from utterances of 43 Bengali letters (11 Swarabarna and 32 Banjanbarna) by ten speakers, 20 times for each letter. These signals are pre-processed to remove the silent portions at both ends, and a pre-emphasis filter is then applied. Next, MFCC features are extracted from the pre-processed signals and refined by applying DWT to compute the MFWTC features. In addition to these features, the Zero Crossing Count (ZCC) is used independently for comparison with the proposed method. Finally, the features are used to recognize the Bengali Barnas with different classifiers available in the Weka tool (BayesNet, NaiveBayes, IB1, LWL, Classification Via Clustering, Dagging, Multi Scheme, VFI, Conjunctive Rule, ZeroR, BFTree, and Simple Cart). Classification accuracy, measured using 10-fold cross-validation, averages 47.61% for Swarabarna and 62.19% for Banjanbarna.
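
The signal-level steps named in the abstract (end-point silence removal, pre-emphasis, zero crossing count, and a wavelet pass over a feature vector) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the authors' configuration: the amplitude threshold of 0.02, the pre-emphasis coefficient of 0.97, the choice of a single-level Haar wavelet, and all function names are hypothetical. The MFCC computation itself is omitted; `haar_dwt` would be applied to whatever MFCC vector an external extractor returns.

```python
import math


def trim_silence(samples, threshold=0.02):
    """Drop leading and trailing samples whose magnitude is at or below threshold.

    The threshold value is an assumed placeholder, not taken from the paper.
    """
    active = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not active:
        return []
    return list(samples[active[0]:active[-1] + 1])


def pre_emphasis(samples, alpha=0.97):
    """First-order pre-emphasis filter: y[n] = x[n] - alpha * x[n-1].

    alpha = 0.97 is a commonly used value, assumed here.
    """
    if not samples:
        return []
    out = [float(samples[0])]
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - alpha * prev)
    return out


def zero_crossing_count(samples):
    """Count sign changes between consecutive samples (zero counts as positive)."""
    signs = [1 if s >= 0 else -1 for s in samples]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def haar_dwt(values):
    """One level of the Haar DWT, returning (approximation, detail) coefficients.

    The Haar wavelet is an assumed stand-in; the abstract does not name the
    wavelet family. Odd-length input is padded by repeating the last value.
    """
    x = list(values)
    if len(x) % 2:
        x.append(x[-1])
    root2 = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / root2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / root2 for i in range(0, len(x), 2)]
    return approx, detail


# Example on a toy signal (values are made up for illustration).
raw = [0.0, 0.01, 0.5, -0.3, 0.2, -0.4, 0.0, 0.0]
speech = pre_emphasis(trim_silence(raw))
zcc = zero_crossing_count(speech)
approx, detail = haar_dwt(speech)  # in practice, applied to an MFCC vector
```

In this sketch the ZCC is a single scalar per utterance, while the wavelet pass halves the length of its input, which is one way a DWT step can compact or smooth a coefficient vector before classification.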

Author information

Authors and Affiliations

Maulana Abul Kalam Azad University of Technology, BF-142, Sector-I, Salt Lake, Kolkata, 700064, West Bengal, India
Santanu Phadikar, Piyali Das, Ishita Bhakta, Asmita Roy, Sadip Midya & Koushik Majumder

Corresponding author

Correspondence to Santanu Phadikar.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, University of Kalyani, Kalyani, West Bengal, India
J. K. Mandal
Department of Computer and System Sciences, Visva Bharati University, Bolpur Santiniketan, West Bengal, India
Paramartha Dutta
Department of Information Technology, Calcutta Business School, Kolkata, India
Somnath Mukhopadhyay

About this paper

Cite this paper

Phadikar, S., Das, P., Bhakta, I., Roy, A., Midya, S., Majumder, K. (2017). Bengali Phonetics Identification Using Wavelet Based Signal Feature. In: Mandal, J., Dutta, P., Mukhopadhyay, S. (eds) Computational Intelligence, Communications, and Business Analytics. CICBA 2017. Communications in Computer and Information Science, vol 775. Springer, Singapore. https://doi.org/10.1007/978-981-10-6427-2_21

DOI: https://doi.org/10.1007/978-981-10-6427-2_21
Published: 24 September 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6426-5
Online ISBN: 978-981-10-6427-2
eBook Packages: Computer Science, Computer Science (R0)
