ABSTRACT
Accurate unsupervised learning of phonemes of a language directly from speech is demonstrated via an algorithm for joint unsupervised learning of the topology and parameters of a hidden Markov model (HMM); states and short state-sequences through this HMM correspond to the learnt sub-word units. The algorithm, originally proposed for unsupervised learning of allophonic variations within a given phoneme set, has been adapted to learn without any knowledge of the phonemes. An evaluation methodology is also proposed, whereby the state-sequence that aligns to a test utterance is transduced in an automatic manner to a phoneme-sequence and compared to its manual transcription. Over 85% phoneme recognition accuracy is demonstrated for speaker-dependent learning from fluent, large-vocabulary speech.
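The evaluation idea described above (transduce an aligned state-sequence to a phoneme-sequence, then compare it with the manual transcription) can be illustrated with a minimal sketch. The abstract does not say how the transduction is built, so the frame-level majority-vote mapping below is an assumption made for illustration, not the authors' method. All function names are hypothetical, and accuracy is taken to be one minus the normalized edit distance.

```python
from collections import Counter, defaultdict
from itertools import groupby


def learn_state_to_phoneme(state_frames, phoneme_frames):
    """Assumed mapping: each state -> the phoneme it co-occurs with most often."""
    counts = defaultdict(Counter)
    for state, phoneme in zip(state_frames, phoneme_frames):
        counts[state][phoneme] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


def transduce(state_frames, mapping):
    """Map states to phonemes and merge consecutive repeats.

    Note: merging also collapses genuinely repeated phonemes, a limitation
    of this simplified sketch.
    """
    labels = [mapping[state] for state in state_frames]
    return [label for label, _ in groupby(labels)]


def edit_distance(ref, hyp):
    """Levenshtein distance (substitutions, deletions, insertions cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def phoneme_accuracy(ref, hyp):
    """Accuracy = 1 - (S + D + I) / N, with N the reference length."""
    return 1.0 - edit_distance(ref, hyp) / len(ref)


# Toy frame-level data (hypothetical): state ids aligned with phoneme labels.
train_states = [0, 0, 1, 1, 1, 2, 2, 0]
train_phones = ["a", "a", "k", "k", "k", "i", "i", "a"]
mapping = learn_state_to_phoneme(train_states, train_phones)

hypothesis = transduce([0, 0, 1, 2, 2], mapping)
print(hypothesis, phoneme_accuracy(["a", "k", "i", "t"], hypothesis))
```

In this toy run the test utterance is missing one reference phoneme, so the score reflects a single deletion out of four reference phonemes. A real evaluation would learn the mapping on held-out alignments rather than on the test utterance.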
Index Terms
- Unsupervised learning of acoustic sub-word units