Abstract
Clustering web user sessions is widely used to extract customer usage behavior so that customized content can be served to individual users. Because web usage data are generated by human activity, they usually contain noisy, incomplete and vague information. Neural networks can extract embedded knowledge, in the form of user session clusters, from large volumes of web usage data, and they tolerate imperfect and noisy data. Fuzzy sets are another popular tool for handling the uncertainty and vagueness hidden in the data. In this paper, a framework based on a fuzzy neural clustering network (FNCN) is proposed. It combines the fuzzy membership concept of fuzzy c-means (FCM) clustering with the learning rate of a modified self-organizing map (MSOM) neural network model, and it seeks to minimize the weighted sum of squared errors. FNCN is applied to cluster users’ web access data extracted from the web logs of an educational institution’s proxy web server. The performance of FNCN is compared with that of FCM- and MSOM-based clustering methods using various validity indexes. Our results show that FNCN produces better-quality clusters than FCM and MSOM.
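To make the fuzzy membership concept and the weighted sum of squared errors concrete, the following minimal sketch computes both quantities in the standard FCM formulation. It is not the FNCN update rule, which the abstract does not specify. The function names `fcm_memberships` and `weighted_sse`, the fuzzifier default `m=2.0`, and the use of squared Euclidean distance are assumptions made for illustration.

```python
import numpy as np


def fcm_memberships(X, centers, m=2.0, eps=1e-12):
    """Standard FCM memberships (illustrative, not the FNCN rule).

    X: (n_samples, n_features) session vectors.
    centers: (n_clusters, n_features) cluster prototypes.
    m: fuzzifier, m > 1 (assumed default of 2.0).
    Returns U with shape (n_samples, n_clusters); each row sums to 1.
    """
    # Squared Euclidean distance from every sample to every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Guard against division by zero when a sample sits on a center.
    d2 = np.maximum(d2, eps)
    # u_ik is proportional to d_ik^(-2/(m-1)), then normalized per sample.
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def weighted_sse(X, centers, U, m=2.0):
    """FCM objective: sum over samples and clusters of u_ik^m * d_ik^2."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float(((U ** m) * d2).sum())


if __name__ == "__main__":
    # Toy data: three one-feature-pair "sessions" and two prototypes.
    X = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
    centers = np.array([[0.0, 0.0], [10.0, 0.0]])
    U = fcm_memberships(X, centers)
    print(U)
    print(weighted_sse(X, centers, U))
```

A sample close to one prototype receives a membership near 1 for that cluster, while a sample between prototypes is shared among them; this graded assignment is what allows vague sessions to contribute to more than one cluster.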
Notes
“1212264494.796 829 192.168.23.12 TCP_MISS/200 1014 GET http://tools.google.com/versioncheck.txt DEFAULT_PARENT/192.168.20.1 text/plain”
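As an illustration of how such a log entry might be split into fields before sessions are built, the sketch below parses the sample line above. The layout resembles the native access log format of the Squid proxy, but the source does not name the format or its fields, so every field name here (`elapsed_ms`, `cache_result`, `hierarchy`, and so on), the assumed units, and the helper `parse_proxy_log_line` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ProxyLogEntry:
    # Field names and units are assumptions, not taken from the source.
    timestamp: datetime   # request time, read as a Unix epoch value
    elapsed_ms: int       # assumed: request duration in milliseconds
    client_ip: str
    cache_result: str     # e.g. TCP_MISS
    http_status: int
    size_bytes: int       # assumed: reply size in bytes
    method: str
    url: str
    hierarchy: str        # e.g. DEFAULT_PARENT/192.168.20.1
    content_type: str


def parse_proxy_log_line(line: str) -> ProxyLogEntry:
    """Split one whitespace-delimited log line into typed fields."""
    # Drop surrounding whitespace and any straight or curly quotes.
    fields = line.strip().strip('“”"').split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    ts, elapsed, client, result, size, method, url, hierarchy, ctype = fields
    cache_result, status = result.split("/", 1)
    return ProxyLogEntry(
        timestamp=datetime.fromtimestamp(float(ts), tz=timezone.utc),
        elapsed_ms=int(elapsed),
        client_ip=client,
        cache_result=cache_result,
        http_status=int(status),
        size_bytes=int(size),
        method=method,
        url=url,
        hierarchy=hierarchy,
        content_type=ctype,
    )


if __name__ == "__main__":
    sample = (
        "1212264494.796 829 192.168.23.12 TCP_MISS/200 1014 GET "
        "http://tools.google.com/versioncheck.txt "
        "DEFAULT_PARENT/192.168.20.1 text/plain"
    )
    print(parse_proxy_log_line(sample))
```

Grouping parsed entries by `client_ip` and ordering them by `timestamp` would be one plausible starting point for identifying user sessions, although the session construction used in the paper is not described in this section.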
About this article
Cite this article
Ansari, Z.A., Sattar, S.A. & Babu, A.V. A fuzzy neural network based framework to discover user access patterns from web log data. Adv Data Anal Classif 11, 519–546 (2017). https://doi.org/10.1007/s11634-015-0228-4
DOI: https://doi.org/10.1007/s11634-015-0228-4