Forecasting daily potential evapotranspiration using machine learning and limited climatic data
Research highlights
▶ Forecasting ETo models for water managing purposes (1985 Hargreaves ETo). ▶ 2 ETo forecasting approaches using air temperature and Multivariate Relevance Vector Machine. ▶ Ind. Approach provides better and larger forecast lags than Dir. Approach. ▶ Better ETo forecast when using forecasted weather variables rather than ETo values. ▶ In study case, ETo is forecasted of up to 4 days using the Ind. Approach.
Introduction
Population increases over the next decades will place a substantial emphasis on achieving higher irrigation efficiencies and greater production per unit of water. A key component of any strategy to improve irrigation water management will be the improvement of water delivery strategies and efficiencies within the irrigation delivery networks. As Kumar et al. (2002) note, evapotranspiration (ET) is one of the most important components of the hydrologic cycle and its accurate estimation is of vital importance for such diverse areas as hydrologic water balance, irrigation system design and management, crop yield simulation and water resources planning and management. Likewise, achieving higher irrigation system performance will depend on reliable forecasts of cropland ET and will require that such forecasts be far enough in the future to compensate for the time travel of the water supply. ET estimation is an important input to water management and irrigation scheduling because crop demands are generally the largest component of water diversions.
A number of computational methods have been developed to estimate potential evapotranspiration (ETo) from climatic data. These methods vary in complexity from models that require only basic information, such as maximum and minimum air temperature (Hargreaves, 1974), to complex models that estimate ETo through energy balance models, such as the Penman-Montieth method (Allen et al., 1998). The advantage of simple models is their use in regions of minimal available weather data. Their major disadvantage is that these may not reflect the effects of localized climatic and geographic variations such as narrow valleys, presence of high ground elevations, extreme latitudes or strong winds. Also, simple methods are usually best suited to weekly or monthly ETo estimates and are less suited to daily estimates.
In recent years there have been several attempts to estimate and forecast ETo with a higher degree of accuracy and over extended futures. Some of them involve numerical and statistical approaches that attempt to accurately simulate the random nature of the meteorological variables (Yamashita and Walker, 1994). The occurrence of difficulties related with these attempts forced researchers to look for other techniques using data-driven tools or statistical learning machines, such as Artificial Neural Networks (Kumar et al., 2002, Smith et al., 2006, Lai et al., 2004), Simple Bayes Classifier and k-Nearest Neighbors (Verdes et al., 2000), Support Vector Machines and Relevance Vector Machines (Gill et al., 2006). These newer approaches have been used primarily to forecast hourly ETo values up to 24 h in advance. Forecasting of daily ETo for 1 to K days ahead in time using data-driven tools have not been reported, even though these methods are known for having excellent modeling accuracy, particularly in representing complex nonlinear behavior (Lai et al., 2004).
The objective of this paper is to demonstrate the adequacy of two different approaches for forecasting daily ETo using statistical learning machines and limited climatic information. The learning machine algorithm used is the Multivariate Relevance Vector Machine (MVRVM). The potential crop evapotranspiration (ETo) is estimated by the 1985 Hargreaves Equation. These results are then compared with crop ET values of the area under study to determine the accuracy of the forecasted estimates obtained. Also, for comparison and benchmarking analysis, another learning machine was tested, the Multi-layer Perceptron, to determine the suitability of the proposed forecasting schemes.
The potential or reference evapotranspiration (ETo) expresses the evaporating power of the atmosphere at a specific location and time of the year and does not consider the crop characteristics and soil factors. As it is mentioned by Allen et al. (1998), the only factors affecting ETo are climatic parameters. Consequently, ETo is a climatic parameter and can be computed from weather data. Among the several methods to estimate ETo, the FAO Penman–Monteith method is recommended as the sole method for determining ETo. This method has been selected because it closely approximates grass ETo at the location evaluated, is physically based, and explicitly incorporates both physiological and aerodynamic parameters.
Situations might occur where data for some weather variables are missing. The use of an alternative ETo calculation procedure, requiring only limited meteorological parameters, should generally be avoided. It is recommended that one calculate ETo using the standard FAO Penman–Monteith method after resolving the specific problem of the missing data (Allen et al., 1998). Despite of this, with limited climatic data as Net Radiation not available, Allen et al. (1998) suggest the Hargreaves ETo equation for its use, specially for water management purposes.
Hargreaves and Allen (2003) note that the current Hargreaves equation was developed in 1975 in an attempt to improve the ETo equation developed by Christiansen (1968). Using eight years of daily cool season grass in precision weighting lysimeters and weather data, Hargreaves performed regressions among measured ETo and temperature data using several ETo methods. Several posterior attempts to improve the resulting ETo equation led to the 1985 Hargreaves ETo equation:where
ETo: potential evapotranspiration (mm/day) of a reference crop (grass),
Tmax, Tmin: maximum and minimum daily air temperature (°C),
TC: 0.5 (Tmax + Tmin),
TR: Tmax − Tmin and;
Ra: extraterrestrial solar radiation (mm/day).
The following empirical simplifications allows to estimate Ra using the latitude and the day of the year, as mentioned by Allen et al. (1998):where
dr: relative distance from the earth to the sun,
J: julian day,
ωs: sunset hour angle (rad),
φl: latitude (rad),
δ: declination of the sun (rad) and
λ: latent heat of vaporization, λ ≈ 2.54.
Hargreaves and Allen (2003) stated that the best use of Eq. (1) would be for ETo estimation in regional planning and reservoir operation studies. The attractiveness of the Hargreaves ETo model is its simplicity, reliability, minimal data requirement, and ease of computation. The viability of using Hargreaves instead of Penman-Montieth ETo equation is demonstrated by the study performed by Trajkovic and Kolakovic (2009) where the difference between Hargreaves and Penmman-Montieth ETo equations is in the range of −4.7% to 6.9% for all the weather stations used in their study.
Once determined the reference or potential evapotranspiration value, the actual evapotranspiration from a cropped area, ET, is determined by adjusting ETo for crop variety and stage of grow as follows:in which Kc is the crop growth stage factor. Tables and information related with the crop stage factor have been developed for most of the agricultural crops around the world. Average values of Kc for various crops can be found in Allen et al. (1998), while location-specific Kc can be found in research publications (Wright, 1982).
Artificial Neural Networks (ANN) have been used extensively for simulation and forecasting in such diverse areas as finances, power generation, water resources and environmental science (Maier and Dandy, 2000). These algorithms can be considered as benchmark for comparison purposes with newer or existing learning machine algorithms for given tasks or analysis. Among the vast implementation of Artificial Neural Network models, the Multi-layer Perceptron (MLP) is one of the most widely used models of ANNs (Nabney, 2002) and they are attractive because of their ability to approximate any smooth function, provided there is enough data to estimate the MLP parameters. Also, despite the successful application of ANN-based models, one of the critical issues often mentioned is the absence of uncertainty measurements associated with their predicted output (Khan and Coulibaly, 2005). To overcome this limitation, the implementation of the Bayesian Inference Method (MacKay, 1992) to calibrate the MLP has made possible measures of the uncertainty related with the predicted outputs.
The MLP architecture can be described as mentioned by Pierce et al. (2008):where
yk: kth component of the MLP output vector y(n); yk ∈ 1 ≤ k ≤ K,
xd: dth component of the input vector x(n); xd ∈ 1 ≤ d ≤ D,
, : matrix weights for the first and second layer respectively, 1 ≤ m ≤ M,
K: number of outputs or forecasted values,
D: number of inputs or values in the past required by the MLP,
M: number of hidden neurons,
bm, bk: bias vector for the first and second layer respectively.
Using a dataset Λ = [x(n), t(n)] with n = 1…N, where t(n) is the vector of the target or actual values t(n) = (t1, … tk, … tK), and N is the number of training examples provided to the learning machine, the training of the MLP is performed by optimizing the network parameters in order to minimize the Overall Error Function E (Bishop, 1995):where
EΛ: data error function,
EW: penalization term,
W = number of weights and biases in the neural network, and
α and β: Bayesian hyperparameters.
In Bayesian terms, the goal is to estimate the probability of the weights and bias of the MLP model, given the dataset Λ (Ungar et al., 1996):where as explained by MacKay (1992),
p(W|t(n)): the posterior probability of the weights,
p(t(n)|W): the likelihood function,
p(W): the prior probability of the weights, and
p(t(n)): the evidence for the dataset.
Assuming a Gaussian distribution for the error term ξ(n) = t(n) − y(n) and the weights W, the likelihood and the prior probabilities can be expressed as mentioned by Petrovic (2007):EΛ models the uncertainty (or error) of the target variables as Gaussian zero-mean noise and variance σ2 β−1. EW defines the conditional probability of W with variance Then Eq. (9) can be expressed as:in which,
E(WMP): expected most probable values for the weights and bias matrices,
H = Hessian matrix and
ΔW = W − WMP.
Once the distribution of W has been estimated by maximizing the likelihood for α and β, the prediction y(n) and the variance of the predictions can be estimated by integrating (marginalizing) over W and the regularization parameters α and β (Bishop, 1995):which can be approximated by:where is the output or estimate variance vector
Then the output variance can be expressed as:where g denotes the gradient of y(n) with respect to the weights; g ▿Wy(n)|WMP. The output variance has then two sources; the first arises from the intrinsic noise in the target data; and the second from the posterior distribution of the ANN weights (Pierce et al., 2008). The output standard deviation vector σy of the predictive distribution for y(n) can be interpreted as an error bar for confidence interval estimation (Bishop, 1995).
The Multivariate Relevance Vector Machine (MVRVM), developed by Thayananthan et al. (2008), is a general Bayesian framework for obtaining sparse solutions to regression and classification tasks based on the Relevance Vector Machines developed by Tipping (2001), and Tipping and Faul (2003). These learning machines are also being used in many of the same areas as the ANNs and are particularly useful in hydrology and water resources because of the generalization properties and the probabilistic estimation, useful to estimate prediction uncertainty (Tripathi and Govindaraju, 2007). The mathematical formulation of the MVRVM is:where
x(n) and t(n): the input and output values that belong to the dataset Λ, as defined for the MLP,
: rth and kth component of the weight matrix
R: number of relevance vectors or cases selcted by the MVRVM, 1 ≤ r ≤ R,
φr(x(n)): rth component of the design matrix or basis function Φ that can be related with a kernel function φr = κ(x(n),x(r)).
The kernel function is a weighting function for the inputs used in non-parametric estimation techniques. It provides an adjustment ratio to the x(n) input vector based on x(R) selected or ‘relevant’ input vectors, which are selected automatically among the N training input vectors (R ≪ N). For the calibration of the MVRVM, this learning machine uses a variation of the Overall Error function (Eq. (8)) and, by using the Bayesian Inference Method, estimates the distribution of the weights of the model (Eq. (9)), similar to the MLP. Also, the error term or noise ξ(n) is assumed to be probabilistic independent zero-mean Gaussian, with variance σ2. The Overall Error function to be minimized is:with:where
k is the kth component of the K different regression functions (outputs) to learn by the MVRVM,
is the adjusted target values,
is the kth component of the adjusted design matrix,
ck is the adjustment factor.
Assuming a Gaussian prior probability distribution for the weights (Tipping, 2001), and representing where each element αn is a hyperparameter that determines the relevance of the associated basis function and βr represents the noise variance in the target data (Thayananthan et al., 2008), the prior is represented by:where is the element at (r,k) of the weighting matrix WRK. Dropping the index k for clarity in the following equations, the likelihood of the target t(n) can be written as:τr is the vector with the rth component of the target data. The posterior probability of W can be written as the product of separate Gaussians of the weights vectors of each output dimension:where μr = βrΣrΦTτr and Σr = (βrΦTΦ + A)−1 are the mean and the variance of the weights. Marginalizing the data likelihood over the weights:where Hr is the Hessian matrix for the rth relevance vector, A most probable set of hyperparameters AMP and noise parameters BMP is obtained by maximizing the marginal likelihood as described by Tipping and Faul (2003). The final hyperparameter values are:andwith output and variance:
Section snippets
Area of study
The water resources of the Sevier River Basin in Central Utah (Fig. 1) are among the most heavily utilized in the Western US. Substantial efforts to increase efficiency via canal lining and on-farm improvement such as conversion to sprinkler irrigation and laser land leveling were made during 1960–1990 period. From 1990 to the present, all reservoirs and stream offtakes have been equipped with SCADA technology and web-based data summaries (SRWUA, 2009). Canal automation was introduced in 1994
Results
Using the training and testing datasets described earlier in Section 2.3, the calibration of the machine learning algorithms was made using both the Direct and Indirect Approaches. To determine the performance and accuracy of the learning machines for forecasting, a 7-day forecast horizon was solicited for the ETo analyses. This value (K = 7) is in fact larger than the required 4-day ahead forecast for the area under study, nevertheless, additional information could be useful for extended daily
Conclusions
The present study demonstrates the adequacy of forecasting near term daily ETo information necessary for water management purposes based on the 1985 Hargreaves ETo equation. Two approaches were tested using the Multivariate Relevance Vector Machine algorithm. The first approach, Direct Approach, involves the estimation of future ETo times series from historical ETo data. The second approach, Indirect Approach, considers to forecast the required climatic data for the 1985 Hargreaves ETo
Acknowledgments
The authors would like to thank the Utah Water Research Laboratory, the Utah Center for Water Resources Research, and the Utah State University Research Foundation for their support of this research. The authors would also like to thank the US Bureau of Reclamation and the Lower Basin Commissioner of the Sevier River Water Users Association for their continued assistance and support in this research.
References (31)
- et al.
Multiobjective analysis of chaotic dynamic systems with sparse learning machines
Advances in Water Resources
(2006) - et al.
Uncertainty analysis of a neural network used for fatigue lifetime prediction
Mechanical Systems and Signal Processing
(2008) - et al.
Pose estimation and tracking using multivariate regression
Pattern Recognition Letters
(2008) - et al.
Crop evapotranspiration guidelines for computing crop water requirements
Irrigation and Drainage, FAO
(1998) Neural Networks for Pattern Recognition
(1995)Pan evaporation and evapotranspiration from climatic data
Proc. Am. Soc. Civil Eng., Journal of Irrigation and Drainage Engineering, ASCE
(1968)- et al.
Soil moisture prediction using support vector machines
American Water Resources Association
(2006) Moisture availability and crop production
Transactions, ASAE
(1974)- et al.
History and evaluation of hargreaves evapotranspiration equation
Journal of Irrigation and Drainage Engineering, ASCE
(2003) - et al.
Streamflow forecasting with uncertainty estimate using Bayesian learning For ANN
Estimating evapotranspiration using Artificial Neural Network
Journal of Irrigation and Drainage Engineering, ASCE
Intelligent weather forecast
A practical Bayesian framework for backpropagation networks
Neural Computation
Neural Networks for the prediction and forecasting of water resources variables: a review of modeling issues and applications
Environmental Modeling and Software
Cited by (143)
Evaluating the ability of deep learning on actual daily evapotranspiration estimation over the heterogeneous surfaces
2024, Agricultural Water ManagementA review of recent advances and future prospects in calculation of reference evapotranspiration in Bangladesh using soft computing models
2024, Journal of Environmental ManagementEnsemble learning-based estimation of reference evapotranspiration (ET<inf>o</inf>)
2023, Internet of Things (Netherlands)Forecasting actual evapotranspiration without climate data based on stacked integration of DNN and meta-heuristic models across China from 1958 to 2021
2023, Journal of Environmental ManagementShort-term daily reference evapotranspiration forecasting using temperature-based deep learning models in different climate zones in China
2023, Agricultural Water Management