Abstract
In real-world binary classification datasets, the two classes often contain unequal numbers of samples: the majority class is much larger than the minority class. This is known as the class imbalance problem. In many classification problems, the main interest lies in correctly classifying the samples that belong to the minority class. Because support vector machine (SVM) and twin support vector machine (TWSVM) give the same importance to all training samples, the resulting classifier is biased towards the majority class on imbalanced datasets. In this paper, we propose an efficient approach, the entropy-based fuzzy twin support vector machine for class imbalanced datasets (EFTWSVM-CIL), in which each sample is assigned a fuzzy membership value based on its entropy. The minority class is given more importance by assigning relatively larger fuzzy memberships to its samples. Further, the method solves a pair of smaller quadratic programming problems (QPPs) rather than a single large one, as in the case of SVM. Experiments are performed on various real-world imbalanced datasets, and the results of the proposed EFTWSVM-CIL are compared with those of twin support vector machine (TWSVM), fuzzy twin support vector machine (FTWSVM) and entropy-based fuzzy SVM (EFSVM) for imbalanced datasets.
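The abstract does not spell out the membership formula, so the following is only a minimal sketch of one way entropy-based fuzzy memberships could be computed, not the authors' exact scheme. It assumes that class probabilities are estimated from the k nearest neighbours of each sample, that entropy is the binary Shannon entropy of those probabilities, and that majority-class memberships shrink linearly with entropy while minority-class memberships stay at 1. The function name `entropy_fuzzy_memberships` and the parameters `k` and `beta` are hypothetical.

```python
import numpy as np


def entropy_fuzzy_memberships(X, y, k=5, beta=0.5, minority_label=1):
    """Hypothetical helper: entropy-based fuzzy memberships for binary labels.

    Assumptions (not taken from the abstract): k-NN class-probability
    estimates, binary Shannon entropy, and a linear entropy penalty that is
    applied to majority-class samples only. beta should lie in [0, 1).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = len(y)

    # Pairwise squared Euclidean distances; a sample is not its own neighbour.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=2)
    np.fill_diagonal(d2, np.inf)
    nn = np.argsort(d2, axis=1)[:, :k]

    # Fraction of the k nearest neighbours that belong to each class.
    p_min = np.mean(y[nn] == minority_label, axis=1)
    p_maj = 1.0 - p_min

    # Binary Shannon entropy, with 0 * log(0) treated as 0 via clipping.
    eps = 1e-12
    H = -(p_min * np.log(np.clip(p_min, eps, 1.0))
          + p_maj * np.log(np.clip(p_maj, eps, 1.0)))
    H_norm = H / np.log(2)  # rescale to [0, 1]

    # Minority samples keep membership 1; majority samples with high entropy
    # (mixed neighbourhoods) receive smaller memberships.
    memberships = np.ones(n)
    majority = y != minority_label
    memberships[majority] = 1.0 - beta * H_norm[majority]
    return memberships
```

Under these assumptions, a majority sample surrounded only by its own class keeps a membership of 1, while one lying in a mixed neighbourhood is down-weighted, so minority samples never receive a smaller membership than majority samples.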
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
Cite this article
Gupta, D., Richhariya, B. & Borah, P. A fuzzy twin support vector machine based on information entropy for class imbalance learning. Neural Comput & Applic 31, 7153–7164 (2019). https://doi.org/10.1007/s00521-018-3551-9
DOI: https://doi.org/10.1007/s00521-018-3551-9