DOI: 10.5555/1603899.1603931
research-article

Modularity in inductively-learned word pronunciation systems

Published: 11 January 1998

ABSTRACT

In leading morpho-phonological theories and state-of-the-art text-to-speech systems, it is assumed that word pronunciation cannot be learned or performed without intermediate analyses at several abstraction levels (e.g., morphological, graphemic, phonemic, syllabic, and stress levels). We challenge this assumption for the case of English word pronunciation. Using IGTree, an inductive-learning decision-tree algorithm, we train and test three word-pronunciation systems in which the number of abstraction levels (implemented as sequenced modules) is reduced from five, via three, to one. The last system, which classifies letter strings directly as mapping to phonemes with stress markers, yields significantly better generalisation accuracies than the two multi-module systems. Analyses of empirical results indicate that the positive utility effects of sequencing modules are outweighed by cascading errors passed on between modules.
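
The single-module setup described above can be pictured as a plain classification task: each letter, seen in its context, is mapped to one combined class (a phoneme together with a stress marker). The sketch below is only an illustration under stated assumptions. The abstract does not say how letter strings are encoded, so the fixed-width window, the padding symbol `PAD`, the `context` parameter, and the function names `letter_windows` and `make_instances` are all hypothetical, and the class labels are placeholders rather than the paper's actual phoneme and stress inventory.

```python
from typing import List, Tuple

PAD = "_"  # hypothetical padding symbol for positions outside the word


def letter_windows(word: str, context: int = 3) -> List[Tuple[str, ...]]:
    """Return one fixed-width letter window per letter of `word`.

    Each window holds the focus letter plus `context` letters on either
    side; positions beyond the word boundaries are filled with PAD.
    The window width is an assumption, not taken from the abstract.
    """
    padded = PAD * context + word + PAD * context
    width = 2 * context + 1
    return [tuple(padded[i:i + width]) for i in range(len(word))]


def make_instances(
    word: str, labels: List[str], context: int = 3
) -> List[Tuple[Tuple[str, ...], str]]:
    """Pair each window with one combined class label.

    In a single-module system the label would stand for a phoneme plus a
    stress marker; here the labels are opaque placeholder strings.
    """
    if len(labels) != len(word):
        raise ValueError("expected exactly one label per letter")
    return list(zip(letter_windows(word, context), labels))


if __name__ == "__main__":
    # Placeholder labels; a real data set would supply phoneme+stress classes.
    for window, label in make_instances("cat", ["A", "B", "C"], context=1):
        print("".join(window), "->", label)
```

Instances of this shape could be handed to any decision-tree learner. A multi-module system would instead chain several such classifiers, each consuming the previous one's output, which is where errors can cascade from one module to the next.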


Published in

NeMLaP3/CoNLL '98: Proceedings of the Joint Conferences on New Methods in Language Processing and Computational Natural Language Learning
January 1998
332 pages
ISBN: 0725806346

Publisher

Association for Computational Linguistics, United States

