Decision tree models in predicting water quality parameters of dissolved oxygen and phosphorus in lake water

  • Original Article
Sustainable Water Resources Management

Abstract

Water quality matters because it directly affects humans and other living organisms, and predicting water quality parameters is essential for better management of water resources. The decision tree is a data mining method that creates rules for classifying and predicting data using a tree structure. The purpose of this study is to use data mining techniques to investigate and predict soluble phosphorus and dissolved oxygen in Lake Erie. To this end, the Classification And Regression Tree (CART) model is compared with the Chi-squared Automatic Interaction Detector (CHAID) model, and the Quick Unbiased Efficient Statistical Trees (QUEST) model is compared with the C5 model, to assess how well each identifies water quality parameters. The results show that decision tree methods, supplied with hydrochemical parameters, can classify and predict water quality with high accuracy and in a short time. The dataset contains 327 records. Model accuracy is checked using the difference between observed and predicted values. For dissolved oxygen, 214 predictions from the CART model and 185 from the CHAID model differ from the observed data by less than 2 units. For phosphorus, 245 predictions from the CART model and 237 from the CHAID model differ from the observed data by less than 0.2. The CART model is therefore the more accurate of the two. The C5 algorithm predicts the correct group number in 256 cases for phosphorus and 230 cases for dissolved oxygen. Overall, the CART model is better than the CHAID model at predicting data values, and the C5 model is better than the QUEST model at predicting group numbers.
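The accuracy check described in the abstract (counting cases whose prediction falls within a fixed distance of the observation) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names `count_within_tolerance` and `share_of_dataset` are hypothetical, the sample values in the usage note are invented, and the only figures taken from the abstract are the dataset size (327) and the reported counts.

```python
from typing import Sequence

DATASET_SIZE = 327  # number of records reported in the abstract


def count_within_tolerance(
    observed: Sequence[float],
    predicted: Sequence[float],
    tolerance: float,
) -> int:
    """Count cases where |observed - predicted| is strictly below tolerance."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum(1 for o, p in zip(observed, predicted) if abs(o - p) < tolerance)


def share_of_dataset(count: int, total: int = DATASET_SIZE) -> float:
    """Express a count of cases as a fraction of the dataset."""
    return count / total


# Counts as reported in the abstract, keyed by (parameter, model).
REPORTED_COUNTS = {
    ("dissolved oxygen", "CART"): 214,
    ("dissolved oxygen", "CHAID"): 185,
    ("phosphorus", "CART"): 245,
    ("phosphorus", "CHAID"): 237,
    ("phosphorus", "C5"): 256,
    ("dissolved oxygen", "C5"): 230,
}

if __name__ == "__main__":
    for (parameter, model), count in REPORTED_COUNTS.items():
        print(f"{parameter:16s} {model:5s} {count:3d}/{DATASET_SIZE} "
              f"= {share_of_dataset(count):.1%}")
```

For example, with the invented values `observed = [8.0, 6.5, 9.1]` and `predicted = [7.2, 9.0, 9.0]`, a tolerance of 2 counts two of the three cases as matches. Expressed as shares of the 327 records, the reported CART counts correspond to roughly 65% of cases for dissolved oxygen and 75% for phosphorus.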

Data availability

All data used for this study will be made available upon reasonable request.

Author information

Corresponding author

Correspondence to Mohammad Zounemat-Kermani.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gorgan-Mohammadi, F., Rajaee, T. & Zounemat-Kermani, M. Decision tree models in predicting water quality parameters of dissolved oxygen and phosphorus in lake water. Sustain. Water Resour. Manag. 9, 1 (2023). https://doi.org/10.1007/s40899-022-00776-0

  • DOI: https://doi.org/10.1007/s40899-022-00776-0
