Bengali Phonetics Identification Using Wavelet Based Signal Feature

  • Conference paper
Computational Intelligence, Communications, and Business Analytics (CICBA 2017)

Abstract

With advances in voice signal processing, speech-to-text recognition has become an important area of research. Although some work exists for English, work on regional languages such as Bengali, Hindi, and Gujarati is rare or has not yet started. The objective of this work is therefore to develop a method to identify isolated Bengali letters (Swarabarna and Banjanbarna) from uttered sound. In speech processing, identifying an uttered letter consists of two major steps: speech feature extraction and feature classification. Studies show that Mel Frequency Cepstral Coefficients (MFCC) give a better representation of the human auditory system, but MFCC performance degrades as noise increases; this degradation may be reduced by the Discrete Wavelet Transform (DWT). MFCC combined with DWT is therefore used as the feature in this work, termed the Mel Frequency Wavelet Transform Coefficient (MFWTC). For the experiments, a sound database was developed from utterances of 43 Bengali letters (11 Swarabarna and 32 Banjanbarna) by ten speakers, 20 times for each letter. The signals are pre-processed by removing the silent portions at both ends and then applying a pre-emphasis filter. Next, MFCC features are extracted from the pre-processed signals and refined by applying DWT to compute the MFWTC features. In addition, Zero Crossing Count (ZCC) is used independently as a feature for comparison with this method. Finally, these features are used to recognize the Bengali Barnas with different classifiers (BayesNet, NaiveBayes, IB1, LWL, Classification Via Clustering, Dagging, Multi Scheme, VFI, Conjunctive Rule, ZeroR, BFTree, and Simple Cart) available in the Weka toolkit. Classification accuracy is measured using 10-fold cross-validation, giving averages of 47.61% and 62.19% for Swarabarna and Banjanbarna respectively.
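The pre-processing and feature steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the silence threshold, the pre-emphasis coefficient of 0.97, and the choice of a one-level Haar wavelet are all assumptions made for illustration, since the abstract does not state them. MFCC extraction itself is omitted; the DWT step is shown on a placeholder coefficient vector.

```python
import math

def trim_silence(signal, threshold=0.02):
    """Drop leading and trailing samples with magnitude below threshold.
    The threshold value is an assumption, not taken from the paper."""
    voiced = [i for i, x in enumerate(signal) if abs(x) >= threshold]
    if not voiced:
        return []
    return signal[voiced[0]:voiced[-1] + 1]

def pre_emphasis(signal, alpha=0.97):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1].
    alpha = 0.97 is a common default and an assumption here."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def zero_crossing_count(signal):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0))

def haar_dwt(values):
    """One-level Haar DWT, returning (approximation, detail) coefficients.
    The Haar wavelet is an assumed choice; odd-length input is padded
    by repeating the last value."""
    if len(values) % 2:
        values = values + [values[-1]]
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail

# Toy signal: silence, a short voiced burst, silence.
raw = [0.0, 0.001, 0.5, -0.4, 0.3, -0.2, 0.001, 0.0]
trimmed = trim_silence(raw)
emphasized = pre_emphasis(trimmed)
zcc = zero_crossing_count(trimmed)

# Placeholder standing in for one frame of MFCC values.
mfcc_frame = [1.0, 3.0, 2.0, 2.0]
approx, detail = haar_dwt(mfcc_frame)
```

In a pipeline of this shape, the approximation (and optionally detail) coefficients of each MFCC frame would form the wavelet-refined feature vector passed to a classifier, while the zero crossing count serves as a separate baseline feature.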

Author information

Corresponding author

Correspondence to Santanu Phadikar.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Phadikar, S., Das, P., Bhakta, I., Roy, A., Midya, S., Majumder, K. (2017). Bengali Phonetics Identification Using Wavelet Based Signal Feature. In: Mandal, J., Dutta, P., Mukhopadhyay, S. (eds) Computational Intelligence, Communications, and Business Analytics. CICBA 2017. Communications in Computer and Information Science, vol 775. Springer, Singapore. https://doi.org/10.1007/978-981-10-6427-2_21

  • DOI: https://doi.org/10.1007/978-981-10-6427-2_21

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6426-5

  • Online ISBN: 978-981-10-6427-2

  • eBook Packages: Computer Science (R0)
