Journal of Hydrology

Volume 476, 7 January 2013, Pages 97-111
A comparison of methods to avoid overfitting in neural networks training in the case of catchment runoff modelling

https://doi.org/10.1016/j.jhydrol.2012.10.019

Summary

Artificial neural networks (ANNs) have become a very popular tool in hydrology, especially in rainfall–runoff modelling. However, a number of issues must be addressed to apply this technique to a particular problem efficiently, including the selection of the network type, its architecture, a proper optimization algorithm and a method to deal with overfitting of the data. The present paper addresses the last, rarely considered issue, namely the comparison of methods to prevent multi-layer perceptron neural networks from overfitting the training data in the case of daily catchment runoff modelling. Among the methods to avoid overfitting, early stopping, noise injection and weight decay have been known for about two decades; however, only the first is frequently applied in practice. Recently a new methodology called the optimized approximation algorithm has been proposed in the literature.

Overfitting of the training data deteriorates the generalization properties of the model and results in untrustworthy performance when the model is applied to novel measurements. Hence the purpose of methods to avoid overfitting is somewhat contradictory to the goal of optimization algorithms, which aim at finding the best possible solution in parameter space according to a pre-defined objective function and the available data. Moreover, different optimization algorithms may perform better for simpler or larger ANN architectures. This suggests the importance of properly coupling optimization algorithms, ANN architectures and methods to avoid overfitting of real-world data – an issue that is also studied in detail in the present paper.

The study is performed for the Annapolis River catchment, characterized by significant seasonal changes in runoff, rapid floods during winter and spring, moderately dry summers, and severe winters with snowfall, snow melting, frequent freeze and thaw, and the presence of river ice. The present paper shows that the elaborated noise injection method may prevent overfitting slightly better than the most popular early stopping approach. However, the implementation of noise injection in real-world problems is difficult, and the final model performance depends significantly on a number of very technical details, which somewhat limits its practical applicability. It is shown that the optimized approximation algorithm does not improve the results obtained by the older methods, possibly due to its over-simplified stopping criterion. Extensive calculations reveal that the Evolutionary Computation-based algorithm performs better for simpler ANN architectures, whereas the classical gradient-based Levenberg–Marquardt algorithm is able to benefit from additional input variables, representing precipitation and snow cover from one more previous day, and from more complicated ANN architectures. This confirms that the curse of dimensionality has a severe impact on the performance of Evolutionary Computing methods.

Highlights

► Comparison of different methods to avoid ANN overfitting to data.
► Noise injection outperforms the early stopping and optimized approximation algorithm methods.
► Different results were obtained using Evolutionary Computation-based and gradient-based algorithms for ANN training.
► Rainfall–runoff modelling by means of MLP ANNs provides reasonable results.

Introduction

During the last 20 years artificial neural networks (ANNs) (Haykin, 1999) have become very popular in various scientific disciplines (Paliwal and Kumar, 2009, Wen et al., 2009, Al-Garni, 2010). Within the field of hydrology, different Artificial Intelligence methods, including ANNs, have also gained much popularity (Maier and Dandy, 2000, Cheng et al., 2002, Cheng et al., 2008, Muttil and Chau, 2006, Lin et al., 2006, Piotrowski et al., 2007, Maier et al., 2010, Acharya et al., 2012, Huo et al., 2012, Nourani et al., 2012). ANN applications to rainfall–runoff modelling are plentiful and include ASCE Task Committee (2000), Solomatine (2003), Cherkassky et al. (2006), Dawson et al. (2006), Piotrowski et al. (2006), Solomatine and Ostfeld (2008), Wu et al. (2009), Siou et al. (2011) and Wu and Chau (2011). Among different ANN types, multi-layer perceptron neural networks (MLPs) are especially popular due to their simplicity, relatively low number of parameters, clear biological inspirations and the debate on whether or not they may be considered universal approximators (Hecht-Nielsen, 1987, Girosi and Poggio, 1989, Nakamura et al., 1993, Braun and Griebel, 2009). MLPs are also of special interest in the present paper.

Recently Wang et al. (2009) and Elshorbagy et al. (2010) presented comparisons of various Artificial Intelligence techniques for rainfall–runoff forecasting, encouraging the search for novel methods to improve ANN training and the selection of their different features. Apart from choosing the neural network type, the successful application of a neural network to a particular problem requires the determination of a model architecture (which defines the number of parameters), an optimization algorithm and a method to avoid overfitting. However, in practice ANNs are frequently used out of hand without discussing such details, which may have a significant impact on model performance.

The present paper is a continuation of the Piotrowski and Napiorkowski (2011) study, which aimed at choosing the best optimization method for training MLPs applied to daily catchment runoff forecasting in colder climate zones. The main objective of the current paper is the comparison of different methods to avoid overfitting when MLPs are applied to a similar task. We pay special attention to the noise injection approach, which is rarely considered in hydrological applications. The performance of a new method called the optimized approximation algorithm, and of early stopping, the most popular approach to deal with overfitting, is also studied in detail. However, the methods to avoid overfitting cannot be compared or discussed apart from the ANN architecture and the training algorithm. For instance, some optimization algorithms perform poorly in uncertain environments (Jin and Branke, 2005), of which neural networks trained with the noise injection method may be an example. This emphasizes the importance of properly coupling methods to avoid overfitting with training algorithms. On the other hand, some optimization methods may quickly converge to good solutions for simple ANN architectures with a small number of parameters, but perform poorly for more complicated ones with more parameters. One may note that different ANN architectures mean different numbers of parameters and different fitness landscapes, hence formally different problems. It is well known that the performance of optimization algorithms depends on the problem. This was verified empirically (e.g. Epitropakis et al., 2011); it was also proved (Wolpert and Macready, 1997, Wolpert and Macready, 2005) that under certain assumptions the performance of any two algorithms averaged over all possible problems (fitness landscapes) is equal. Of course, in practice few people may be interested in all problems, but the proof presented in Wolpert and Macready (1997) carries an important warning: an optimization algorithm applied to a novel task can fail even if it was successful in solving some other problems. In the present paper, first, two training methods are chosen based on previous findings in the literature, one gradient-based and one based on Evolutionary Computation (EC). Then the optimal set of input variables and the MLP architecture for each training algorithm are found experimentally. Finally, three different methods to avoid ANN overfitting are compared for the chosen ANN architectures and training algorithms. Below we very briefly introduce the main features of MLP training algorithms, architectures and methods to avoid overfitting.

A number of studies have addressed the application of different optimization algorithms to ANN training for various regression problems. The most popular optimization approaches are gradient-based methods, among which the Levenberg–Marquardt (LM) algorithm (Press et al., 2006) is considered one of the most efficient (see e.g. Adamowski and Karapataki, 2010). In a few studies EC methods were applied to the same problems – with various opinions on their performance. Some papers suggested an advantage of EC algorithms over gradient-based methods for different ANN training tasks (Sexton and Gupta, 2000, Jain and Srinivasulu, 2004, Martinez-Estudillo et al., 2006, Chau, 2006, Zhang et al., 2007, Huang et al., 2009), but a number of other studies showed that EC approaches are at least not better than gradient-based algorithms in terms of ANN model performance and are, of course, much slower (Mandischer, 2002, Ilonen et al., 2003, Socha and Blum, 2007). Motivated by this diversity of opinions, the authors of the present study conducted a detailed survey of recently developed EC methods from two “families” – Differential Evolution (DE) (Storn and Price, 1995) and Particle Swarm Optimization (Kennedy and Eberhart, 1995). Eight EC algorithms were compared with the LM method for MLP training with the early stopping approach for rainfall–runoff modelling of the Annapolis River, Nova Scotia, Canada (Piotrowski and Napiorkowski, 2011). The results are generally in agreement with the opinions of Mandischer (2002), Ilonen et al. (2003) and Socha and Blum (2007). Only one of the EC algorithms, namely Differential Evolution with Global and Local neighbors (DEGL, Das et al., 2009), showed performance similar to the LM method, with much slower convergence. It is worth noting that DEGL was also found to be among the best EC-based algorithms, along with Grouped Differential Evolution (GDE) (Piotrowski and Napiorkowski, 2010) and Self-Adaptive Differential Evolution (SADE) (Qin et al., 2009), in training MLPs applied to the estimation of the longitudinal dispersion coefficient in rivers (Piotrowski et al., 2012b). In that study the number of data points was very small (under 100 observations), imposing the use of a very simple MLP architecture; the objective function was non-differentiable and a kind of noise injection method was used to avoid overfitting. This suggests that DEGL may be well suited for MLP training in general. Based on the above findings, in the present paper the LM and DEGL methods are used as training algorithms.
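For reference, the following is a minimal Python sketch of the DEGL donor-vector construction, following the general scheme of Das et al. (2009). It is not code from the present study: the neighbourhood radius k, the scale factors alpha and beta, and the fixed combination weight w are illustrative assumptions (in the original algorithm the weight may also be self-adapted).

```python
import numpy as np

def degl_donor(pop, fitness, i, k=2, alpha=0.8, beta=0.8, w=0.5, rng=None):
    """Sketch of the DEGL donor vector (after Das et al., 2009).

    pop     : (NP, D) array of candidate solutions (e.g. MLP weight vectors).
    fitness : (NP,) objective values, lower is better (e.g. training MSE).
    i       : index of the target vector; k : ring-neighbourhood radius.
    """
    rng = rng or np.random.default_rng()
    NP = len(pop)

    # Ring neighbourhood of the target (indices wrap around the population).
    nbh = [(i + j) % NP for j in range(-k, k + 1)]
    n_best = min(nbh, key=lambda j: fitness[j])
    p, q = rng.choice([j for j in nbh if j != i], size=2, replace=False)

    # Local donor: attracted to the best vector in the neighbourhood.
    local = pop[i] + alpha * (pop[n_best] - pop[i]) + beta * (pop[p] - pop[q])

    # Global donor: attracted to the best vector of the whole population.
    g_best = int(np.argmin(fitness))
    r1, r2 = rng.choice([j for j in range(NP) if j != i], size=2, replace=False)
    glob = pop[i] + alpha * (pop[g_best] - pop[i]) + beta * (pop[r1] - pop[r2])

    # Weighted combination; the donor then undergoes the usual DE
    # crossover and greedy selection against the target vector.
    return w * glob + (1 - w) * local
```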

The architecture of an MLP defines the number of parameters to be optimized. This architecture should always be adapted to the problem (Zhang et al., 1998, Mahmoud and Ben-Nahki, 2003, Siou et al., 2011, De et al., 2011), as it depends on the number of input and output variables, the number and quality of the available data, the presence of noise in the data, etc. Over-parameterization may have a significant negative impact on the performance of neural networks, also in the case of rainfall–runoff modelling (Gaume and Gosset, 2003). A smaller architecture usually yields better generalization properties (Haykin, 1999) and is easier to train, especially by means of non-gradient-based methods. Some Evolutionary Computation algorithms were proposed to determine an optimal ANN architecture (Castillo et al., 2000, Huang et al., 2009) and were applied to hydrological problems (Chen and Chang, 2009); however, they are applicable rather to problems where neither expert knowledge is available nor a physically-based choice of input variables and model complexity is possible. Although a number of other methods to develop an ANN architecture exist (Sietsma and Dow, 1991, Wang et al., 1994, Islam et al., 2009, Ssegane et al., 2012, Nourani and Sayyah Frad, 2012), they usually rely on heuristic or subjective decisions and none is widely applied (Zhang et al., 1998).

The impact of different methods used to avoid overfitting on ANN performance, which is the main focus of the present paper, has rarely been studied in the literature, and such papers usually dealt with classification problems (Holmstrom and Koistinen, 1992, Hua et al., 2006, Zur et al., 2009) or used artificial functions for comparison (Holmstrom and Koistinen, 1992, Reed et al., 1995). Only Giustolisi and Laucelli (2005) studied the impact of a number of methods to avoid ANN overfitting for hydrological data, namely in the case of rainfall–runoff modelling for two very small catchments (up to 5 km²) in Italy. However, EC-based optimization methods were not used, and the popular noise injection method based on maximization of the cross-validated likelihood (Holmstrom and Koistinen, 1992) was not compared. The early stopping technique led to poor results in Giustolisi and Laucelli (2005), which may be surprising, as it is a very popular and usually successful approach to avoid overfitting. Moreover, recently a novel methodology called the optimized approximation algorithm (Liu et al., 2008) was proposed and gained much interest. The present paper tries to fill the gaps left by Giustolisi and Laucelli (2005), Hua et al. (2006) and Zur et al. (2009) and presents a comparison of catchment runoff modelling results obtained when the three techniques designed to avoid overfitting are coupled with MLPs of different architectures and gradient-based or EC-based optimization algorithms. The neural networks are applied to runoff forecasting for the Annapolis River, which is located in a moderate climate zone.
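To make the noise injection idea concrete, here is a minimal Python sketch (not taken from the original paper). Each training epoch the network sees a freshly jittered copy of the inputs with Gaussian noise of spread h; the helper `train_one_epoch` is a hypothetical placeholder, and in practice h would be chosen by the cross-validated likelihood procedure of Holmstrom and Koistinen (1992) rather than fixed by hand.

```python
import numpy as np

def jittered_inputs(X_train, h, rng):
    """Return a perturbed copy of the training inputs for one epoch.

    h is the noise spread (standard deviation of the added Gaussian
    noise); training on freshly jittered inputs each epoch smooths the
    fitted mapping and discourages overfitting of the raw data.
    """
    return X_train + rng.normal(0.0, h, size=X_train.shape)

# Hypothetical usage inside a training loop (train_one_epoch stands for
# one pass of any optimizer, e.g. LM or DEGL):
#
# rng = np.random.default_rng(0)
# for epoch in range(n_epochs):
#     X_noisy = jittered_inputs(X_train, h, rng)
#     train_one_epoch(net, X_noisy, y_train)
```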

Section snippets

Study area and hydro-meteorological data

The present paper is a continuation of the Piotrowski and Napiorkowski (2011) study for the same catchment, namely the upper part of the Annapolis River (Nova Scotia, Canada) up to the Wilmot settlement, with an area of 546 km². Hydrological and meteorological data are available from the Water Survey of Canada and Canada’s National Climate Data and Information Archive for the gauge station situated in the Wilmot settlement (44°56′57″N, 65°01′45″W) and the meteorological station at Greenwood Airfield (44°58′40″N, 64°55′33″W),

Multi-layer perceptron artificial neural networks and optimization algorithms

The MLP neural network (Haykin, 1999) is a nonlinear data-based model that approximates the values of output variables (y) dependent on a set of input variables (x). An MLP is formed by several nodes arranged in groups called layers (see Fig. 3). Usually three layers, an input layer, a hidden layer, and an output layer, are sufficient in practice (Haykin, 1999; see also the real-world data applications in De et al., 2011 and Siou et al., 2011). The number of nodes in the input and output layers is determined
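As a concrete illustration, a minimal numpy sketch of such a three-layer mapping follows (not taken from the paper; the tanh hidden activation, layer sizes and random initialization are illustrative assumptions).

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Three-layer MLP: input -> one tanh hidden layer -> linear output.

    x  : (D,) input vector (e.g. lagged runoff and meteorological variables).
    W1 : (H, D) input-to-hidden weights, b1 : (H,) hidden biases.
    W2 : (O, H) hidden-to-output weights, b2 : (O,) output biases.
    """
    h = np.tanh(W1 @ x + b1)    # hidden-layer activations
    return W2 @ h + b2          # linear output (e.g. predicted runoff)

# Example: 7 inputs, 5 hidden nodes, 1 output, small random initial weights.
rng = np.random.default_rng(0)
D, H, O = 7, 5, 1
params = (rng.normal(0, 0.1, (H, D)), np.zeros(H),
          rng.normal(0, 0.1, (O, H)), np.zeros(O))
y_hat = mlp_forward(rng.normal(size=D), *params)
```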

Methods to avoid neural network overfitting

To be successfully applied in practice, an ANN should have the ability to generalize the input–output mapping. In other words, the model should be able to correctly approximate observations not included in the training set (Geman et al., 1992). In the case of catchment runoff modelling this means the ability to make good runoff predictions for future hydro-meteorological conditions. To allow proper generalization capabilities one must avoid ANN overfitting of the training data, i.e. model should be fitted
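The following is a minimal, optimizer-agnostic sketch of the early stopping procedure, the most popular of the methods considered here (an illustration, not the paper's implementation; `train_step` and `val_error` are hypothetical callables supplied by the user).

```python
import copy

def train_with_early_stopping(net, train_step, val_error,
                              max_epochs=1000, patience=50):
    """Keep the weights that minimise the error on a held-out validation
    set, and stop once it has not improved for `patience` epochs.

    net        : any mutable model object (e.g. an MLP weight container).
    train_step : callable performing one training epoch on `net`.
    val_error  : callable returning the current validation-set MSE.
    """
    best_err, best_net, stall = float("inf"), copy.deepcopy(net), 0
    for epoch in range(max_epochs):
        train_step(net)
        err = val_error(net)
        if err < best_err:
            best_err, best_net, stall = err, copy.deepcopy(net), 0
        else:
            stall += 1
            if stall >= patience:
                break          # validation error stopped improving
    return best_net, best_err
```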

Selection of input variables and MLP architecture

To predict one-day-ahead runoff Q(t + 1), different variants of input variables are considered (see Table 2). The best MLP architecture is chosen according to the MSE criterion, Eq. (3). In the simplest combination of inputs, only the most recent measurements of meteorological variables are used, namely UT(t), LT(t), RF(t), SF(t), SC(t), together with the two last runoff measurements Q(t) and Q(t − 1), which gives seven input variables in total. Each of them is physically important for runoff forecasting – UT
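A short Python sketch of how this seven-input variant can be assembled from the daily series (illustrative only; the variable names follow the notation above).

```python
import numpy as np

def build_dataset(UT, LT, RF, SF, SC, Q):
    """Assemble the seven-input variant: UT(t), LT(t), RF(t), SF(t),
    SC(t), Q(t), Q(t-1) as predictors of the target Q(t+1).

    All arguments are 1-D daily series of equal length T; rows of X
    correspond to t = 1, ..., T-2, so that Q(t-1) and Q(t+1) exist.
    """
    UT, LT, RF, SF, SC, Q = map(np.asarray, (UT, LT, RF, SF, SC, Q))
    t = np.arange(1, len(Q) - 1)
    X = np.column_stack([UT[t], LT[t], RF[t], SF[t], SC[t], Q[t], Q[t - 1]])
    y = Q[t + 1]
    return X, y
```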

Results and discussion

Three criteria are used in the present paper to compare the results obtained by means of different methods to avoid ANN overfitting, namely mean square error (MSE), mean absolute error (MAE), and Nash–Sutcliffe coefficient (NSC).

During the optimization MSE is used as the objective function (Eq. (3)). MAE is defined as

$$\mathrm{MAE} = \frac{1}{N}\sum_{n=1}^{N}\left|y_n^P - y_n\right|$$

The NSC, very popular in river runoff forecasting, is computed according to the following equation:

$$\mathrm{NSC} = 1 - \frac{\frac{1}{N}\sum_{n=1}^{N}\left(y_n^P - y_n\right)^2}{\frac{1}{N}\sum_{n=1}^{N}\left(y_n - y^a\right)^2}, \qquad y^a = \frac{1}{N}\sum_{n=1}^{N} y_n$$

The maximum
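The three criteria are straightforward to compute; a short numpy sketch consistent with the definitions above (an illustration, not the paper's code):

```python
import numpy as np

def mse(y_pred, y):
    """Mean square error (the objective function during optimization)."""
    y_pred, y = np.asarray(y_pred), np.asarray(y)
    return np.mean((y_pred - y) ** 2)

def mae(y_pred, y):
    """Mean absolute error."""
    y_pred, y = np.asarray(y_pred), np.asarray(y)
    return np.mean(np.abs(y_pred - y))

def nsc(y_pred, y):
    """Nash-Sutcliffe coefficient: 1 for a perfect model, 0 for a model
    no better than the mean of the observations."""
    y_pred, y = np.asarray(y_pred), np.asarray(y)
    return 1.0 - np.sum((y_pred - y) ** 2) / np.sum((y - y.mean()) ** 2)
```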

Conclusions

The present paper aims at a comparison of a number of techniques to avoid ANN overfitting in the case of catchment runoff modelling in an area located in a moderately cold climate zone. Three methods were considered, namely noise injection with the spread factor h estimated by means of maximizing the cross-validation likelihood function (Holmstrom and Koistinen, 1992), the optimized approximation algorithm proposed by Liu et al. (2008) and the most popular early stopping (Prechelt, 1998,

Acknowledgments

This work has been supported by the Inner Grant of the Institute of Geophysics, Polish Academy of Sciences Nr. 1b/IGF PAN/2012/MŁ.

References (88)

  • H.R. Maier et al.

    Methods used for the development of neural networks for the prediction of water resource variables in river systems: current status and future directions

    Environ. Modell. Softw.

    (2010)
  • M. Mandischer

    A comparison of evolution strategies and backpropagation for neural network training

    Neurocomputing

    (2002)
  • A. Martinez-Estudillo et al.

    Evolutionary product unit based neural networks for regression

    Neural Networks

    (2006)
  • V. Nourani et al.

    Sensitivity analysis of the artificial neural network outputs in simulation of the evaporation process at different climatological regimes

    Adv. Eng. Softw.

    (2012)
  • A.P. Piotrowski et al.

    Optimizing neural networks for river flow forecasting – evolutionary computation methods versus the Levenberg–Marquardt approach

    J. Hydrol.

    (2011)
  • R.S. Sexton et al.

Comparative evaluation of genetic algorithm and backpropagation for training neural networks

    Inf. Sci.

    (2000)
  • J. Sietsma et al.

    Creating artificial neural networks that generalize

    Neural Networks

    (1991)
  • L.K.A. Siou et al.

    Complexity selection of a neural network model for karst flood forecasting: the case of the Lez Basin (southern France)

    J. Hydrol.

    (2011)
  • H. Ssegane et al.

    Advances in variable selection methods I: causal selection methods versus stepwise regression and principal component analysis on data of known and unknown functional relationships

    J. Hydrol.

    (2012)
  • A. Varhola et al.

    Forest canopy effect on snow accumulation and ablation: an integrative review of empirical results

    J. Hydrol.

    (2010)
  • W.C. Wang et al.

    A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series

    J. Hydrol.

    (2009)
  • Z. Wang et al.

    A procedure for determining the topology of multilayer feedforward neural networks

    Neural Networks

    (1994)
  • U.P. Wen et al.

    A review of Hopfield neural networks for solving mathematical programming problems

    Eur. J. Oper. Res.

    (2009)
  • C.L. Wu et al.

    Methods to improve neural network performance in daily flow prediction

    J. Hydrol.

    (2009)
  • C.L. Wu et al.

    Rainfall–runoff modeling using artificial neural network coupled with singular spectrum analysis

    J. Hydrol.

    (2011)
  • G. Zhang et al.

    Forecasting with artificial neural networks: the state of the art

    Int. J. Forecast.

    (1998)
  • J.R. Zhang et al.

    A hybrid particle swarm optimization – back-propagation algorithm for feedforward neural network training

    Appl. Math. Comput.

    (2007)
  • N. Acharya et al.

    A neurocomputing approach to predict monsoon rainfall in monthly scale using SST anomaly as a predictor

    Acta Geophys.

    (2012)
  • J. Adamowski et al.

    Comparison of multivariate regression and artificial neural networks for peak urban water-demand forecasting: evaluation of different ANN learning algorithms

    J. Hydrol. Eng.

    (2010)
  • M.A. Al-Garni

    Interpretation of spontaneous potential anomalies from some simple geometrically shaped bodies using neural network inversion

    Acta Geophys.

    (2010)
  • S. Amari et al.

    Asymptotic statistical theory of overfitting and cross-validation

    IEEE Trans. Neural Networks

    (1997)
  • G. An

    The effect of adding noise during backpropagation training on a generalization performance

    Neural Comput.

    (1996)
  • ASCE Task Committee, 2000. Artificial neural networks in hydrology. II: hydrologic applications. J. Hydrol. Eng. 5(2),...
  • J. Braun et al.

    On a constructive proof of the Kolmogorov’s superposition theorem

    Constr. Approx.

    (2009)
  • W.M. Brown et al.

    Use of noise to augment training data: a neural network method of mineral–potential mapping in regions of limited known deposit examples

    Nat. Resour. Res.

    (2003)
  • C.T. Cheng et al.

    Optimizing hydropower reservoir operation using hybrid genetic algorithm and chaos

    Water Resour. Manage.

    (2008)
  • V. Cherkassky et al.

    Computational intelligence in earth sciences and environmental applications: issues and challenges

    Neural Networks

    (2006)
  • S. Das et al.

    Differential evolution using a neighborhood-based mutation operator

    IEEE Trans. Evol. Comput.

    (2009)
  • S. Das et al.

    Differential evolution – a survey of the state-of-the-art

    IEEE Trans. Evol. Comput.

    (2011)
  • S.S. De et al.

    Identification of the best architecture of a multilayer perceptron in modelling daily total ozone concentration in Kolkata, India

    Acta Geophys.

    (2011)
  • B. Dorronsoro et al.

    Improving classical and decentralized differential evolution with new mutation operator and population topologies

    IEEE Trans. Evol. Comput.

    (2011)
  • A. Elshorbagy et al.

    Experimental investigation of the predictive capabilities of data driven modeling techniques in hydrology – part 2: application

    Hydrol. Earth Syst. Sci.

    (2010)
  • M.G. Epitropakis et al.

    Enhancing differential evolution utilizing proximity-based mutation operators

    IEEE Trans. Evol. Comput.

    (2011)
  • E. Gaume et al.

    Over-parameterisation, a major obstacle to the use of artificial neural networks in hydrology?

    Hydrol. Earth Syst. Sci.

    (2003)