Probabilistic Topic Maps: Navigating through Large Text Collections

  • Conference paper
Advances in Intelligent Data Analysis (IDA 1999)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1642)

Abstract

The visualization of large text databases and document collections is an important step towards more flexible and interactive types of information retrieval. This paper presents a probabilistic approach that combines a statistical, model-based analysis with a topological visualization principle. Our method can be used to derive topic maps, which represent topical information by characteristic keyword distributions arranged in a two-dimensional spatial layout. Combined with multi-resolution techniques, this provides a three-dimensional space for interactive information navigation in large text collections.

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hofmann, T. (1999). Probabilistic Topic Maps: Navigating through Large Text Collections. In: Hand, D.J., Kok, J.N., Berthold, M.R. (eds) Advances in Intelligent Data Analysis. IDA 1999. Lecture Notes in Computer Science, vol 1642. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48412-4_14

  • DOI: https://doi.org/10.1007/3-540-48412-4_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66332-4

  • Online ISBN: 978-3-540-48412-7

  • eBook Packages: Springer Book Archive
