Term-weighting approaches in automatic text retrieval

https://doi.org/10.1016/0306-4573(88)90021-0

Abstract

The experimental evidence accumulated over the past 20 years indicates that text indexing systems based on the assignment of appropriately weighted single terms produce retrieval results that are superior to those obtainable with other more elaborate text representations. These results depend crucially on the choice of effective term-weighting systems. This article summarizes the insights gained in automatic term weighting, and provides baseline single-term-indexing models with which other more elaborate content analysis procedures can be compared.

This study was supported in part by the National Science Foundation under grants IST 83-16166 and IRI 87-02735.
