DOI: 10.5555/1557690.1557736
research-article

Unsupervised learning of acoustic sub-word units

Published: 16 June 2008

ABSTRACT

Accurate unsupervised learning of phonemes of a language directly from speech is demonstrated via an algorithm for joint unsupervised learning of the topology and parameters of a hidden Markov model (HMM); states and short state-sequences through this HMM correspond to the learnt sub-word units. The algorithm, originally proposed for unsupervised learning of allophonic variations within a given phoneme set, has been adapted to learn without any knowledge of the phonemes. An evaluation methodology is also proposed, whereby the state-sequence that aligns to a test utterance is transduced in an automatic manner to a phoneme-sequence and compared to its manual transcription. Over 85% phoneme recognition accuracy is demonstrated for speaker-dependent learning from fluent, large-vocabulary speech.
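
The evaluation idea in the abstract (transduce the aligned state sequence to a phoneme sequence, then compare it with the manual transcription) can be illustrated with a minimal sketch. The abstract does not say how the transduction is performed, so everything below is an assumption made for illustration: each state is mapped to the phoneme it most often co-occurs with in frame-aligned data, runs of identical labels are merged, and accuracy is taken as one minus the edit distance divided by the reference length. The function names and toy data are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict
from itertools import groupby


def learn_state_to_phoneme(state_frames, phoneme_frames):
    """Assumed mapping: each state goes to its most frequent co-occurring phoneme."""
    counts = defaultdict(Counter)
    for state, phoneme in zip(state_frames, phoneme_frames):
        counts[state][phoneme] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


def transduce(state_frames, mapping):
    """Relabel each frame with a phoneme, then merge runs of identical labels."""
    labels = [mapping[state] for state in state_frames]
    return [label for label, _ in groupby(labels)]


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1]


def phoneme_accuracy(ref, hyp):
    """Accuracy as 1 - (edit distance / reference length); an assumed definition."""
    return 1.0 - edit_distance(ref, hyp) / len(ref)


# Toy, made-up frame alignments (not data from the paper).
train_states = [0, 0, 1, 1, 1, 2, 2]
train_phones = ["k", "k", "ae", "ae", "ae", "t", "t"]
mapping = learn_state_to_phoneme(train_states, train_phones)

hypothesis = transduce([0, 0, 1, 2, 2, 2], mapping)
accuracy = phoneme_accuracy(["k", "ae", "t", "s"], hypothesis)
```

A state that never appears in the aligned data would raise a `KeyError` in `transduce`; a real evaluation would need a policy for such states, and for units that span short state sequences rather than single states.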

Published in

HLT-Short '08: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
June 2008, 307 pages

Publisher: Association for Computational Linguistics, United States


Acceptance Rates

Overall acceptance rate: 240 of 768 submissions, 31%
