DOI: 10.3115/981574.981596

Contextual word similarity and estimation from sparse data

Published: 22 June 1993

ABSTRACT

In recent years there has been much interest in word cooccurrence relations, such as n-grams, verb-object combinations, or cooccurrence within a limited context. This paper discusses how to estimate the probability of cooccurrences that do not occur in the training data. We present a method that makes local analogies between each specific unobserved cooccurrence and other cooccurrences that contain similar words, as determined by an appropriate word similarity metric. Our evaluation suggests that this method performs better than existing smoothing methods, and may provide an alternative to class-based models.
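
To make the idea of estimating an unobserved cooccurrence from similar words concrete, here is a minimal sketch. It is not the paper's method: the abstract does not specify the similarity metric or the combination rule, so the cosine similarity over raw cooccurrence counts, the similarity-weighted average, the neighbourhood size `k`, the function names, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_counts(pairs):
    """Count cooccurrences: counts[w1][w2] = times the pair (w1, w2) was seen."""
    counts = defaultdict(Counter)
    for w1, w2 in pairs:
        counts[w1][w2] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed metric)."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_estimate(counts, w1, w2, k=2):
    """Estimate P(w2 | w1) for an unseen pair from the k words most similar to w1."""
    # Use .get so that an unknown w1 does not add a key while we iterate.
    target = counts.get(w1, Counter())
    neighbours = sorted(
        ((cosine(target, counts[v]), v) for v in counts if v != w1),
        reverse=True,
    )[:k]
    total_sim = sum(sim for sim, _ in neighbours)
    if total_sim == 0:
        return 0.0  # no similar word carries any evidence
    # Similarity-weighted average of each neighbour's relative frequency for w2.
    weighted = sum(
        sim * counts[v][w2] / sum(counts[v].values()) for sim, v in neighbours
    )
    return weighted / total_sim


# Hypothetical toy training data: (verb, object) pairs.
pairs = [
    ("drink", "water"), ("drink", "tea"),
    ("sip", "water"), ("sip", "tea"), ("sip", "coffee"),
    ("read", "book"),
]
counts = build_counts(pairs)
print(similarity_estimate(counts, "drink", "coffee"))
```

In this toy data the pair (drink, coffee) never occurs, so a plain relative-frequency estimate would be zero. The only word with non-zero similarity to "drink" is "sip", so the sketch returns the relative frequency of "coffee" after "sip", which is 1/3.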

Published in: ACL '93: Proceedings of the 31st annual meeting on Association for Computational Linguistics, June 1993, 320 pages
Publisher: Association for Computational Linguistics, United States
Overall acceptance rate: 85 of 443 submissions, 19%
