Abstract
Pumpkin seeds are widely consumed as a confectionery snack worldwide because they contain adequate amounts of protein, fat, carbohydrate, and minerals. This study was carried out on the two most important high-quality varieties of pumpkin seeds, “Ürgüp Sivrisi” and “Çerçevelik”, generally grown in the Ürgüp and Karacaören regions of Turkey. Morphological measurements of 2500 pumpkin seeds of both varieties were obtained using the gray and binary forms produced by thresholding techniques. Using these morphological features, all the data were modeled with five machine learning methods, Logistic Regression (LR), Multilayer Perceptron (MLP), Support Vector Machine (SVM), Random Forest (RF), and k-Nearest Neighbor (k-NN), in order to determine the most successful method for classifying pumpkin seed varieties. The performance of the models was evaluated with 10-fold cross-validation. The classifiers achieved accuracy rates of 87.92 percent (LR), 88.52 percent (MLP), 88.64 percent (SVM), 87.56 percent (RF), and 87.64 percent (k-NN).
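As an illustration of the evaluation scheme named in the abstract, the following is a minimal sketch of 10-fold cross-validation applied to a k-NN classifier, written with the Python standard library only. It is not the authors' pipeline: the two-feature synthetic data, the function names (make_synthetic_seeds, k_fold_indices, knn_predict, cross_val_accuracy), and the choice of k = 5 neighbors are all assumptions made for this example, and the accuracy it prints says nothing about the accuracies reported in the study.

```python
import random
from collections import Counter
from math import dist


def make_synthetic_seeds(n_per_class=100, seed=0):
    """Return (features, labels) for two made-up seed classes.

    The two features are hypothetical stand-ins for morphological
    measurements; they are NOT the study's data.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label, (mu_a, mu_b) in enumerate([(0.0, 0.0), (2.0, 2.0)]):
        for _ in range(n_per_class):
            X.append((rng.gauss(mu_a, 1.0), rng.gauss(mu_b, 1.0)))
            y.append(label)
    return X, y


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle indices once and deal them into n_folds disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def knn_predict(train_X, train_y, query, k=5):
    """Majority vote among the k nearest training points (Euclidean)."""
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cross_val_accuracy(X, y, n_folds=10, k=5, seed=0):
    """Mean accuracy over folds; each fold is the test set exactly once."""
    folds = k_fold_indices(len(X), n_folds, seed)
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


X, y = make_synthetic_seeds()
accuracy = cross_val_accuracy(X, y)
print(f"10-fold CV accuracy (synthetic data): {accuracy:.3f}")
```

Every sample is used for testing exactly once and for training in the other nine folds, so the averaged score uses all of the data without ever scoring a model on samples it was fitted to. Comparing several classifiers, as the study does, would amount to swapping knn_predict for another model while keeping the same folds.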
Cite this article
Koklu, M., Sarigil, S. & Ozbek, O. The use of machine learning methods in classification of pumpkin seeds (Cucurbita pepo L.). Genet Resour Crop Evol 68, 2713–2726 (2021). https://doi.org/10.1007/s10722-021-01226-0
DOI: https://doi.org/10.1007/s10722-021-01226-0