Probabilistic Topic Maps: Navigating through Large Text Collections

Hofmann, Thomas

doi:10.1007/3-540-48412-4_14

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1642)

Included in the following conference series:

International Symposium on Intelligent Data Analysis

Abstract

The visualization of large text databases and document collections is an important step towards more flexible and interactive types of information retrieval. This paper presents a probabilistic approach which combines a statistical, model-based analysis with a topological visualization principle. Our method can be utilized to derive topic maps which represent topical information by characteristic keyword distributions arranged in a two-dimensional spatial layout. Combined with multi-resolution techniques, this provides a three-dimensional space for interactive information navigation in large text collections.

Author information

Authors and Affiliations

Computer Science Division, UC Berkeley & International CS Institute, Berkeley, CA
Thomas Hofmann

Editor information

Editors and Affiliations

Department of Mathematics, Imperial College, Huxley Building, 180 Queen’s Gate, London, SW7 2BZ, UK
David J. Hand
Leiden Institute for Advanced Computer Science, Leiden University, 2300 RA Leiden, The Netherlands
Joost N. Kok
Berkeley Initiative in Soft Computing, University of California at Berkeley, 329 Soda Hall, Berkeley, CA, 94720, USA
Michael R. Berthold

About this paper

Cite this paper

Hofmann, T. (1999). Probabilistic Topic Maps: Navigating through Large Text Collections. In: Hand, D.J., Kok, J.N., Berthold, M.R. (eds) Advances in Intelligent Data Analysis. IDA 1999. Lecture Notes in Computer Science, vol 1642. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48412-4_14

DOI: https://doi.org/10.1007/3-540-48412-4_14
Published: 08 July 1999
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66332-4
Online ISBN: 978-3-540-48412-7
eBook Packages: Springer Book Archive
