Abstract
The success of an organization depends greatly on its customers and their relationship with the organization. Knowledge of consumer behavior and a sound understanding of consumer expectations are important for developing strategic management decisions that improve business value. Customer Relationship Management (CRM) is intensively applied to the analysis of consumer behavior patterns using Machine Learning (ML) techniques. Naive Bayes (NB), a supervised ML classification model, is used to predict customer behavior. In some domains, NB performance degrades because the dataset contains redundant, noisy, and irrelevant attributes, which violate the underlying assumption made by naive Bayes. Various enhancements have been suggested to address the primary assumption of the NB classifier: conditional independence between the attributes given the class label. In this research, we propose a simple, straightforward, and efficient approach called BHFS (Bagging Homogeneous Feature Selection), based on ensemble data-perturbation feature selection. The BHFS method eliminates correlated and irrelevant attributes in the dataset and selects a stable feature subset, improving the prediction performance of the NB model. BHFS has the advantage of requiring little running time while selecting the most relevant attributes for evaluating naive Bayes. Experimental results demonstrate that the BHFS-naive Bayes model makes better predictions than standard NB. The running time of BHFS-NB is also lower, since the naive Bayes model is built only on the features selected by BHFS.
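The paper's method section is not reproduced here, so the following is a minimal sketch of the BHFS idea as the abstract describes it: draw bootstrap samples of the training data (data perturbation), apply the same filter ranker to each sample (a homogeneous ensemble), aggregate the per-sample scores into a stable ranking, and train naive Bayes on the top-k features only. The choice of mutual information as the filter, the bag count, the value of k, and the Gaussian NB variant are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.naive_bayes import GaussianNB

def bhfs_select(X, y, n_bags=20, k=10, random_state=0):
    """Rank features by aggregating filter scores over bootstrap samples."""
    rng = np.random.default_rng(random_state)
    n_samples, n_features = X.shape
    scores = np.zeros(n_features)
    for _ in range(n_bags):
        # Data perturbation: draw a bootstrap sample of the training set.
        idx = rng.integers(0, n_samples, size=n_samples)
        # Homogeneous ensemble: the same filter ranker (mutual
        # information here) is applied to every bag.
        scores += mutual_info_classif(X[idx], y[idx], random_state=random_state)
    # Average the per-bag scores and keep the k highest-ranked features.
    return np.argsort(scores / n_bags)[::-1][:k]

# Usage sketch on synthetic data (a stand-in for a customer dataset).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=30, n_informative=8,
                           n_redundant=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

selected = bhfs_select(X_tr, y_tr, n_bags=20, k=10)
nb = GaussianNB().fit(X_tr[:, selected], y_tr)
print("held-out accuracy:", nb.score(X_te[:, selected], y_te))
```

Because scores are averaged across bags, a feature that ranks highly by chance in a single bootstrap sample is down-weighted, which is the stability property the abstract attributes to BHFS.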
Change history
30 May 2022
This article has been retracted. Please see the Retraction Notice for more details: https://doi.org/10.1007/s12652-022-03968-w
About this article
Cite this article
Siva Subramanian, R., Prabha, D. RETRACTED ARTICLE: Customer behavior analysis using Naive Bayes with bagging homogeneous feature selection approach. J Ambient Intell Human Comput 12, 5105–5116 (2021). https://doi.org/10.1007/s12652-020-01961-9