Abstract
In the era of climate change, monitoring and retrieving soil, water-body and vegetation parameters is of utmost importance, and remote sensing has served this purpose successfully for the last few decades. Advances in sensor technology have made effective decision making possible. However, the advantage of acquiring multitemporal, spatially continuous data can turn into a disadvantage because of class imbalance: most classifiers often misclassify minority-class instances. This work addresses the problem by resampling the datasets before classification, and proposes a new, computationally efficient class-wise resampling technique based on SMOTE and centroid-based clustering. Experiments were conducted on two publicly available benchmark hyperspectral datasets. The results show that the proposed approach outperforms past studies on the evaluation metrics accuracy, precision, recall and kappa.
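The general idea of class-wise resampling can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given in the abstract; it only assumes that classes smaller than a chosen target size are oversampled by SMOTE-style interpolation and that larger classes are replaced by k-means centroids. The function names (`smote_like`, `kmeans_centroids`, `classwise_resample`), the `target` parameter and the toy data are all hypothetical.

```python
import numpy as np

def smote_like(X, n_new, k=5, rng=None):
    """SMOTE-style oversampling: interpolate between a sample and one of
    its k nearest neighbours within the same class."""
    rng = np.random.default_rng(rng)
    n = len(X)
    if n < 2 or n_new <= 0:
        # A single sample has no neighbour to interpolate with.
        return np.empty((0, X.shape[1]))
    k = min(k, n - 1)
    # Dense pairwise squared distances: fine for a demo, O(n^2) in memory.
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    neigh = nn[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X[base] + gap * (X[neigh] - X[base])

def kmeans_centroids(X, n_clusters, n_iter=20, rng=None):
    """Centroid-based undersampling: run plain Lloyd's k-means and return
    the centroids as the reduced representation of the class."""
    rng = np.random.default_rng(rng)
    centroids = X[rng.choice(len(X), size=n_clusters, replace=False)].astype(float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(n_clusters):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    return centroids

def classwise_resample(X, y, target, k=5, seed=0):
    """Bring every class to `target` samples (hypothetical parameter)."""
    rng = np.random.default_rng(seed)
    X_out, y_out = [], []
    for label in np.unique(y):
        Xc = X[y == label].astype(float)
        if len(Xc) > target:
            Xc = kmeans_centroids(Xc, target, rng=rng)
        elif len(Xc) < target:
            Xc = np.vstack([Xc, smote_like(Xc, target - len(Xc), k=k, rng=rng)])
        X_out.append(Xc)
        y_out.append(np.full(len(Xc), label))
    return np.vstack(X_out), np.concatenate(y_out)

# Toy data standing in for pixel spectra: 200 majority vs. 20 minority samples.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (200, 4)), rng.normal(3, 1, (20, 4))])
y = np.array([0] * 200 + [1] * 20)
X_bal, y_bal = classwise_resample(X, y, target=60)
print(np.bincount(y_bal))  # → [60 60]
```

The sketch uses only NumPy so that it stays self-contained. In practice, the imbalanced-learn library provides `SMOTE` and `ClusterCentroids` samplers that could play the same two roles.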
Acknowledgement
The first author would like to thank the University Grants Commission (UGC), New Delhi, for providing a fellowship to pursue his research through the UGC-Junior Research Fellowship Scheme. The authors also thank the Department of Science and Technology (DST), New Delhi, for the technical support provided through the DST-PURSE scheme.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Singh, P.S., Singh, V.P., Pandey, M.K. et al. Enhanced classification of hyperspectral images using improvised oversampling and undersampling techniques. Int. j. inf. tecnol. 14, 389–396 (2022). https://doi.org/10.1007/s41870-021-00676-0
DOI: https://doi.org/10.1007/s41870-021-00676-0