WEIRD: an approach to concept-based information retrieval

Published: 01 April 1979

Abstract

WEIRD is an automatic document retrieval system designed and implemented at Syracuse University, which attempts to advance the art of computerized retrieval from word-matching to judging conceptual similarity. WEIRD uses a vector space model to represent the relations among terms and documents. Items in the space are located according to their "meaning", which is their proximity to all other items in the data base as measured by co-occurrence frequencies. This is done without manipulating large matrices. The dimensions of the space are not used to define relations; items are defined solely by their position relative to the other items. Retrieval is determined by Euclidean distance from the plotted query. In the first section of the paper the basic characteristics of WEIRD are described. Second, the results of a preliminary evaluation are reported. Alternatives for further development of WEIRD are then considered.
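
The retrieval step described in the abstract, ranking documents by Euclidean distance from a plotted query, can be illustrated with a minimal sketch. The abstract does not say how WEIRD derives item positions from co-occurrence frequencies or how it plots a query, so the coordinates below are made-up values and the centroid rule in `plot_query` is an assumption for illustration only, not the system's actual method. All names (`term_positions`, `doc_positions`, `plot_query`, `rank_documents`) are hypothetical.

```python
import math

# Hypothetical 2-D positions for terms and documents. In WEIRD these would
# come from co-occurrence frequencies; the values here are invented.
term_positions = {
    "retrieval": (0.0, 0.0),
    "indexing": (1.0, 0.0),
    "scaling": (4.0, 3.0),
}
doc_positions = {
    "doc_a": (0.5, 0.0),
    "doc_b": (4.0, 4.0),
    "doc_c": (1.0, 1.0),
}


def plot_query(terms, positions):
    """Place a query in the space.

    Assumption: the query sits at the centroid of its known terms.
    The abstract does not specify how WEIRD plots a query.
    """
    known = [positions[t] for t in terms if t in positions]
    if not known:
        raise ValueError("no query term has a position in the space")
    dims = len(known[0])
    return tuple(sum(p[i] for p in known) / len(known) for i in range(dims))


def rank_documents(query_point, positions):
    """Return document ids ordered by Euclidean distance from the query."""
    return sorted(positions, key=lambda d: math.dist(query_point, positions[d]))


query_point = plot_query(["retrieval", "indexing"], term_positions)
ranking = rank_documents(query_point, doc_positions)
print(query_point, ranking)
```

Only relative position matters in this sketch: rotating or translating every point leaves the ranking unchanged, which matches the abstract's statement that the dimensions themselves are not used to define relations.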

References

  1. Bookstein, A.; Kraft, D. "Operations Research Applied to Document Indexing and Retrieval Decisions." Journal of the ACM, 24(3): 418--427 (1977).
  2. Cagan, C. "A Highly Associative Document Retrieval System." Journal of the American Society for Information Science, 21(5): 330--337 (1970).
  3. Cleveland, D. B. "An n-Dimensional Retrieval Model." Journal of the American Society for Information Science, 27(5/6): 342--347 (1976).
  4. Cooper, W. S. "Expected Search Length: A Single Measure of Retrieval Effectiveness Based on the Weak Ordering Action of Retrieval Systems." American Documentation, 19(1): 30--41 (1968).
  5. Doyle, L. B. "Semantic Road Maps for Literature Searches." Journal of the ACM, 8(4) (1961).
  6. Giuliano, V. E. "Analog Networks for Word Associations." IEEE Transactions on Military Electronics, 1963: 221--225.
  7. Harter, S. P. "A Probabilistic Model for Automatic Keyword Indexing, Part 1." Journal of the American Society for Information Science, 26(4): 197--206 (1975).
  8. Iker, H. P. "An Historical Note on the Use of Word Frequency Contiguities in Content Analysis." Computers and the Humanities, 8: 93--98 (1974).
  9. Katter, R. V. A Study of Document Representations: Multidimensional Scaling of Index Terms. SDC - Final Report, 1967.
  10. Kim, C. "Theoretical Foundation of Thesaurus-Construction and Some Methodological Considerations for Thesaurus Updating." Journal of the American Society for Information Science, 24(2): 148--156 (1973).
  11. Maron, M. E.; Kuhns, J. L. "On Relevance, Probabilistic Indexing, and Information Retrieval." Journal of the ACM, 7(3): 216--244 (1960).
  12. Noreault, T.; Koll, M. B.; McGill, M. J. "Automatic Ranked Output from Boolean Searches in SIRE." (Accepted for publication in Journal of the American Society for Information Science, 1977).
  13. Osgood, C.; Suci, G.; Tannenbaum, P. The Measurement of Meaning. Urbana: The University of Illinois Press, 1957.
  14. Smith, L. C. "Artificial Intelligence in Information Retrieval Systems." Information Processing and Management, 12(3): 189--222 (1976).
  15. Sparck Jones, K. "Index Term Weighting." Information Storage and Retrieval, 9(11): 619--633 (1973).
  16. Switzer, P. "Vector Images in Information Retrieval." In: Statistical Association Methods for Mechanical Documentation, Symposium Proceedings, Wash., D.C., 1964. (NBS Misc. Publ. 269, 1965) Stevens, M. E.; Heilprin, L.; Giuliano, V. E. (eds.). 163--171.
  17. Tars, A. "Stemming as a System Design Consideration." ACM SIGIR Forum, XI(1): 9--15 (1976).
  18. Woelfel, J. Sociology and Science. Unpublished manuscript, Michigan State University, Department of Communication, 1971.
  19. Yu, C. T.; Salton, G. "Effective Information Retrieval Using Term Accuracy." Communications of the ACM, 20(3): 135--142 (1977).

    • Published in

      ACM SIGIR Forum, Volume 13, Issue 4
      Spring 1979
      19 pages
      ISSN:0163-5840
      DOI:10.1145/1095366

      Copyright © 1979 Author

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 April 1979
