Abstract
Traditional text clustering methods require enormous computing resources, which makes them unsuitable for processing large-scale data collections. In this paper we present a clustering method based on the word category map approach using a two-level Growing Self-Organising Map (GSOM). A significant part of the clustering task is divided into separate sub-tasks that can be executed on different computers using emerging Grid technology, thus enabling the rapid analysis of information gathered globally. The performance of the proposed method is comparable to that of traditional approaches, while the execution time improves by a factor of 15.
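The idea of dividing the work into independent sub-tasks can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a word-to-category mapping is already available (in the word category map approach this would come from a first-level map; here it is a hand-written, hypothetical dictionary), and it uses a local thread pool as a stand-in for Grid nodes. Each sub-task encodes one partition of the documents as histograms over word categories, which would then be the input to a second-level map. All names (`WORD_CATEGORY`, `encode_partition`, `encode_collection`) are illustrative.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical word -> category mapping. In the word category map approach
# this would be produced by a first-level map; it is hard-coded here.
WORD_CATEGORY = {
    "grid": 0, "cluster": 0, "node": 0,
    "map": 1, "neuron": 1,
    "text": 2, "document": 2,
}
NUM_CATEGORIES = 3


def encode_document(text):
    """Encode one document as a histogram over word categories."""
    counts = Counter(
        WORD_CATEGORY[w] for w in text.lower().split() if w in WORD_CATEGORY
    )
    return [counts.get(c, 0) for c in range(NUM_CATEGORIES)]


def encode_partition(docs):
    """Sub-task: encode every document in one partition."""
    return [encode_document(d) for d in docs]


def partition(items, n_parts):
    """Split items into at most n_parts contiguous chunks."""
    if not items:
        return []
    size = -(-len(items) // n_parts)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def encode_collection(docs, n_workers=2):
    """Run the independent sub-tasks concurrently and merge the results.

    The thread pool only stands in for separate machines; the partitions
    share no state, so each could be shipped to a different computer.
    """
    parts = partition(docs, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(encode_partition, parts)  # preserves input order
    return [vec for part in results for vec in part]
```

Because the partitions are independent, the merged result does not depend on how many workers are used; only the wall-clock time changes.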
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhai, Y.Z., Hsu, A., Halgamuge, S.K. (2006). Scalable Dynamic Self-Organising Maps for Mining Massive Textual Data. In: King, I., Wang, J., Chan, LW., Wang, D. (eds) Neural Information Processing. ICONIP 2006. Lecture Notes in Computer Science, vol 4234. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893295_30
DOI: https://doi.org/10.1007/11893295_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46484-6
Online ISBN: 978-3-540-46485-3
eBook Packages: Computer Science, Computer Science (R0)