ABSTRACT
Several applications of statistical tree-based modelling to problems in speech and language are described here. Classification and regression trees are well suited to many of the pattern recognition problems encountered in this area since they (1) statistically select the most significant features involved, (2) provide "honest" estimates of their performance, (3) permit both categorical and continuous features to be considered, and (4) allow human interpretation and exploration of their result. First the method is summarized; then its application to automatic stop classification, segment duration prediction for synthesis, phoneme-to-phone classification, and end-of-sentence detection in text is described. For other applications to speech and language, see [Lucassen 1984], [Bahl et al. 1987].
- Bahl, L., et al. 1987. A tree-based statistical language model for natural language speech recognition. IBM Research Report 13112.
- Breiman, L., et al. 1984. Classification and regression trees. Monterey, CA: Wadsworth & Brooks.
- Chou, P. 1988. Applications of information theory to pattern recognition and the design of decision trees and trellises. Ph.D. thesis, Stanford University, Stanford, CA.
- Klatt, D. 1976. Linguistic uses of segmental duration in English: acoustic and perceptual evidence. J. Acoust. Soc. Am. 59. 1208--1221.
- Lucassen, J. M. & Mercer, R. L. 1984. An information theoretic approach to the automatic determination of phonemic baseforms. Proc. ICASSP '84. 42.5.1--42.5.4.
- Talkin, D. 1987. Speech formant trajectory estimation using dynamic programming with modulated transition costs. ATT-BL Technical Memo. 11222-87-0715-07.
- Some applications of tree-based modelling to speech and language