
A fuzzy twin support vector machine based on information entropy for class imbalance learning

  • Original Article
  • Neural Computing and Applications

Abstract

In real-world binary classification datasets, the two classes often contain very different numbers of samples: the majority class is much larger than the minority class, a situation known as the class imbalance problem. In many classification problems, the main interest is in correctly classifying the samples that belong to the minority class. Because the support vector machine (SVM) and the twin support vector machine (TWSVM) give equal importance to all training samples, the resulting classifier is biased towards the majority class on imbalanced datasets. In this paper, we propose an efficient approach, the entropy-based fuzzy twin support vector machine for class imbalanced datasets (EFTWSVM-CIL), in which each sample is assigned a fuzzy membership value based on its entropy. The minority class is given more importance by assigning relatively larger fuzzy memberships to its samples. Further, the method solves a pair of smaller quadratic programming problems (QPPs) rather than a single large one as in the case of SVM. Experiments are performed on various real-world imbalanced datasets, and the results of the proposed EFTWSVM-CIL are compared with those of the twin support vector machine (TWSVM), fuzzy twin support vector machine (FTWSVM) and entropy-based fuzzy SVM (EFSVM) for imbalanced datasets.
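The membership idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the entropy of a sample is computed or how it is mapped to a membership, so everything below is an assumption made for illustration: entropy is estimated from the class mix of each sample's k nearest neighbours, minority samples receive membership 1, and majority samples receive a membership that decreases linearly with their (min-max scaled) entropy. The names `knn_entropy`, `fuzzy_memberships`, `k` and `beta` are hypothetical and are not taken from the paper.

```python
import numpy as np


def knn_entropy(X, y, k=5, positive_label=1):
    """Entropy of the class mix among each sample's k nearest neighbours.

    Assumption for illustration: the abstract does not specify this estimator.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances between all samples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbour
    nn = np.argsort(d2, axis=1)[:, :k]
    p_pos = (y[nn] == positive_label).mean(axis=1)
    p = np.stack([p_pos, 1.0 - p_pos], axis=1)
    # Replace zero probabilities by 1 inside the log so that 0 * log(0) = 0.
    return (-p * np.log(np.where(p > 0, p, 1.0))).sum(axis=1)


def fuzzy_memberships(X, y, k=5, beta=0.5, minority_label=1):
    """Assign memberships: 1 for minority, in [1 - beta, 1] for majority.

    Majority samples with high neighbourhood entropy (mixed neighbourhoods)
    get the smallest memberships. The linear mapping is an assumption.
    """
    y = np.asarray(y)
    H = knn_entropy(X, y, k=k, positive_label=minority_label)
    s = np.ones(len(y))
    maj = y != minority_label
    h = H[maj]
    span = h.max() - h.min()
    scaled = (h - h.min()) / span if span > 0 else np.zeros_like(h)
    s[maj] = 1.0 - beta * scaled
    return s
```

In a fuzzy SVM or TWSVM formulation, weights of this kind would typically scale the penalty on each sample's slack variable, so that errors on minority samples cost more than errors on ambiguous majority samples. How EFTWSVM-CIL does this exactly is described in the full article, not in the abstract.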



Author information

Corresponding author

Correspondence to Deepak Gupta.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.


About this article

Cite this article

Gupta, D., Richhariya, B. & Borah, P. A fuzzy twin support vector machine based on information entropy for class imbalance learning. Neural Comput & Applic 31, 7153–7164 (2019). https://doi.org/10.1007/s00521-018-3551-9

  • DOI: https://doi.org/10.1007/s00521-018-3551-9
