Abstract
The paper presents a concept of a hybrid system that combines two of our original computational intelligence techniques, and applies it to knowledge discovery from full-text document collections. The first technique, a self-organizing neural network with a one-dimensional neighborhood and a dynamically evolving topological structure, automatically determines the number of groups in the document collection and groups the documents by similarity. The second, a multi-objective evolutionary technique for designing fuzzy rule-based classifiers with an optimized accuracy-interpretability trade-off, extracts the most important keywords from the documents and generates classification rules that help in understanding and isolating the subjects of the documents in the discovered groups. The proposed concept may also be useful for developing systems that address a wide range of human language understanding problems.
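To make the first step more concrete, the sketch below trains a plain self-organizing neuron chain with a one-dimensional neighborhood on term-frequency vectors and assigns each document to its winning neuron. It is an illustration under stated assumptions, not the network described in the paper: the chain length is fixed in advance, so it does not evolve its topology or determine the number of groups on its own. All names (`tf_vectors`, `train_chain_som`, `winner`), the toy documents, and the parameter values are hypothetical.

```python
import math
import random
from collections import Counter


def tf_vectors(docs):
    """Turn raw texts into L2-normalised term-frequency vectors."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        v = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def winner(weights, v):
    """Index of the neuron whose weight vector is closest to v."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], v))


def train_chain_som(vectors, n_neurons=4, epochs=50, seed=0):
    """Train a FIXED-length neuron chain (assumption: no dynamic topology)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_neurons)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)  # linearly decaying learning rate
        radius = max(1e-3, (n_neurons / 2) * (1.0 - frac))  # shrinking neighborhood
        for v in vectors:
            win = winner(weights, v)
            for i, w in enumerate(weights):
                # Gaussian neighborhood measured along the chain index (1-D)
                h = math.exp(-((i - win) ** 2) / (2 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * h * (v[k] - w[k])
    return weights


# Toy collection (hypothetical data, for illustration only)
docs = [
    "neural network training",
    "neural network layers",
    "fuzzy rule classifier",
    "fuzzy rule base",
]
vocab, vectors = tf_vectors(docs)
weights = train_chain_som(vectors)
labels = [winner(weights, v) for v in vectors]
print(labels)  # one neuron index per document
```

Documents that map to the same or to adjacent neurons on the chain are the ones this sketch treats as similar. A fixed chain needs the neuron count chosen by hand, which is exactly the limitation the dynamically evolving structure in the abstract is meant to remove.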
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Rudziński, F. (2019). A Knowledge Discovery from Full-Text Document Collections Using Clustering and Interpretable Genetic-Fuzzy Systems. In: Choroś, K., Kopel, M., Kukla, E., Siemiński, A. (eds) Multimedia and Network Information Systems. MISSI 2018. Advances in Intelligent Systems and Computing, vol 833. Springer, Cham. https://doi.org/10.1007/978-3-319-98678-4_44
DOI: https://doi.org/10.1007/978-3-319-98678-4_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98677-7
Online ISBN: 978-3-319-98678-4
eBook Packages: Intelligent Technologies and Robotics (R0)