Enhanced classification of hyperspectral images using improvised oversampling and undersampling techniques

  • Original Research
International Journal of Information Technology

Abstract

In the era of climate change, monitoring and effective retrieval of soil, water-body and vegetation parameters is of utmost importance, and remote sensing has been used successfully for this purpose over the last few decades. Advances in technology have enabled effective decision making based on these sensors. However, the advantage of acquiring multitemporal, spatially continuous data can turn into a disadvantage because of class imbalance: most classifiers often misclassify instances of the minority classes. This work addresses the problem by resampling the datasets before applying classification algorithms, and proposes a new, computationally efficient class-wise resampling technique based on SMOTE and centroid-based clustering. Experiments were conducted on two publicly available benchmark hyperspectral datasets. The results indicate that the proposed technique outperforms past studies in terms of accuracy, precision, recall and kappa values.
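
The abstract names the two building blocks (SMOTE and centroid-based clustering) but does not spell out the algorithm, so the following is only a minimal sketch of how a class-wise resampler of this general kind might look, not the authors' method. It assumes that classes smaller than a target size are oversampled by SMOTE-style interpolation and that larger classes are replaced by k-means centroids. The function names, the choice of the median class size as the target, and the parameters `k` and `n_iter` are all hypothetical.

```python
import numpy as np

def smote_oversample(X, n_new, k=5, rng=None):
    """Create n_new synthetic points by interpolating between a sampled
    point and one of its k nearest neighbours (the core SMOTE idea)."""
    rng = np.random.default_rng(rng)
    n = len(X)
    if n < 2 or n_new <= 0:
        # Cannot interpolate with fewer than two samples: duplicate instead.
        return X[rng.integers(0, n, size=max(n_new, 0))]
    k = min(k, n - 1)
    # Full pairwise distance matrix: fine for a sketch, costly for big classes.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    nb = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X[base] + gap * (X[nb] - X[base])

def centroid_undersample(X, n_keep, n_iter=20, rng=None):
    """Replace X by n_keep k-means centroids (plain Lloyd iterations)."""
    rng = np.random.default_rng(rng)
    centroids = X[rng.choice(len(X), size=n_keep, replace=False)].astype(float)
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(n_keep):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def resample_classwise(X, y, target=None, k=5, seed=0):
    """Bring every class to `target` samples (assumed default: median size)."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    if target is None:
        target = int(np.median(counts))
    X_out, y_out = [], []
    for c, n in zip(classes, counts):
        Xc = X[y == c].astype(float)
        if n < target:
            Xc = np.vstack([Xc, smote_oversample(Xc, target - n, k, rng)])
        elif n > target:
            Xc = centroid_undersample(Xc, target, rng=rng)
        X_out.append(Xc)
        y_out.append(np.full(len(Xc), c))
    return np.vstack(X_out), np.concatenate(y_out)

# Toy usage: three classes with 60, 20 and 5 four-band "pixels".
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (60, 4)),
               rng.normal(3, 1, (20, 4)),
               rng.normal(-3, 1, (5, 4))])
y = np.array([0] * 60 + [1] * 20 + [2] * 5)
Xr, yr = resample_classwise(X, y)
```

In this toy run the target is the median class size, so the largest class is reduced to centroids and the smallest is filled with interpolated samples. Resampling of this kind would normally be applied to the training split only, so that evaluation metrics are computed on untouched data.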


Acknowledgement

The first author thanks the University Grants Commission (UGC), New Delhi, for a fellowship to pursue his research under the UGC Junior Research Fellowship scheme. The authors also thank the Department of Science and Technology (DST), New Delhi, for technical support provided through the DST-PURSE scheme.

Author information

Corresponding author

Correspondence to Vijendra Pratap Singh.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Singh, P.S., Singh, V.P., Pandey, M.K. et al. Enhanced classification of hyperspectral images using improvised oversampling and undersampling techniques. Int. j. inf. tecnol. 14, 389–396 (2022). https://doi.org/10.1007/s41870-021-00676-0

  • DOI: https://doi.org/10.1007/s41870-021-00676-0
