A novel stacked generalization ensemble-based hybrid LGBM-XGB-MLP model for Short-Term Load Forecasting

doi:10.1016/j.energy.2020.118874

Energy

Volume 214, 1 January 2021, 118874

https://doi.org/10.1016/j.energy.2020.118874

Highlights

• A novel Stacked XGB-LGBM-MLP model is proposed to improve the overall regression performance.
• A novel STLF technique is explored and developed in this study.
• A comprehensive comparative analysis of five hyperparameter optimization algorithms is presented for STLF.
• An assessment of the proposed technique is conducted using two real datasets.
• A comparative study with the recent benchmark techniques is performed.

Abstract

This paper proposes an effective computing framework for Short-Term Load Forecasting (STLF). The proposed technique copes with the stochastic variations of the load demand using a stacked generalization approach. This approach combines three models, namely, Light Gradient Boosting Machine (LGBM), eXtreme Gradient Boosting machine (XGB), and Multi-Layer Perceptron (MLP). The inner mechanism of the Stacked XGB-LGBM-MLP model consists of generating meta-data from the XGB and LGBM models and computing the final predictions from that meta-data using the MLP network. The performance of the proposed Stacked XGB-LGBM-MLP model is validated using two datasets from different locations: Malaysia and New England. The main contributions of this paper are: 1) a novel stacking ensemble-based algorithm is proposed; 2) an effective STLF technique is introduced; 3) a critical multi-study analysis of hyperparameter optimization with five techniques is performed; 4) a comparative performance study using two datasets and reference models is conducted. Several case studies have been carried out to demonstrate the superior performance of the proposed model compared to both existing benchmark techniques and hybrid models.

Introduction

With the recent waves of digitization and the rapid spread of information and communication technologies, many initiatives have placed added emphasis on the development and modernization of traditional centralized power grids [1]. The balance between electricity generation and load demand must be optimally maintained to avoid fatal disturbances on the grid due to overloads [2]. To that end, electric Load Forecasting (LF) offers stakeholders and energy suppliers the tools they need to increase their profitability from Renewable Energy Sources (RES) and meet the ever-growing electricity demand. The high complexity of utility grid operations has paved the way for LF to monitor, control, and manage electric system operations with high efficiency [3].

Recently, LF has reached a state of maturity that allows it to be applied safely and profitably in both smart grids and traditional utility grids with satisfactory results. LF has proven to be a cost-effective, efficient, and reliable technique within the energy management framework [4]. Electrical LF assists the scheduling of load response and helps keep dispatch fast and economically optimal. Furthermore, it provides a reliable indicator for managing the complex pricing strategies in liberalized and deregulated energy markets with higher financial benefits [5]. Efficient energy management systems strongly require intelligent algorithms to effectively support the electric operations strategy, decrease electricity bills, and enhance energy trading and planning [5]. LF is conducted using a variety of input features, including social, economic, and weather conditions [3]. Based on the application type, LF is categorized along two dimensions: 1) the time horizon and 2) the scope of the variables employed [3].

For the time horizon, most LF techniques can be divided into four major classes: long-term LF, valid for years; medium-term LF, valid from months to years; Short-Term Load Forecasting (STLF), used from minutes up to one week; and very short-term LF, valid from seconds to minutes [6]. STLF ensures high asset commitment and flexibility of the grid to enhance the serviceability of electricity operations over several time scales for day-to-day operations [7]. In Ref. [3], it was concluded that the most useful LF horizon is STLF.

STLF models are classified into three categories: soft computing techniques, conventional forecasting techniques, and modified traditional techniques [8,9]. Traditional techniques involve regression methods, iteratively reweighted least squares, and exponential smoothing [10]. However, three major drawbacks are associated with these methods: overfitting on massive amounts of data, difficulties in feature engineering, and relatively lower accuracy than advanced techniques. Modified techniques include adaptive demand forecasting, stochastic time series, auto-regressive and moving average-based models, and support vector machine-based techniques [3]. Soft computing techniques essentially consist of genetic algorithms, fuzzy logic, neural networks, and knowledge-based expert systems [11]. However, these methods also have major drawbacks, including loss of model interpretability, higher execution time and computational burden, and limited generalization capabilities. Moreover, Hyperparameter Optimization (HO) for soft computing techniques is computationally expensive and time-consuming [12].

Various algorithms have been proposed to solve optimization problems and enhance ML model performance. In Ref. [13], the authors proposed an Exchange Market-Genetic Algorithm (EMGA) technique to solve optimization problems with fewer iterations and better-quality results. The proposed technique combines the merits of the genetic algorithm and the exchange market algorithm [14]. The EMGA algorithm took 2.82 min for 641 iterations to solve twelve benchmark functions. Despite the fast execution time and low error rate, the simulation results show that EMGA exhibits a high time-to-iteration ratio. The authors in Ref. [15] used a Simulated Annealing (SA) algorithm for the HO of a Deep Neural Network (DNN). The proposed SA-DNN achieves accurate results in terms of RMSE. However, the search space of the SA-DNN HO is kept very small to avoid computational burden (it only includes the number of neurons), which leads to a low accuracy improvement over the plain DNN. In Ref. [16], the authors used the Spearmint Bayesian Optimization (BO) method for the HO of recurrent neural networks. The proposed technique is highly effective for both short-term and long-term forecasting. The authors in Ref. [17] used a Derivative-Free Optimization (DFO) technique with deep learning models. The proposed technique uses efficient feature selection via ensemble structures to predict a variety of RES. However, a comparative analysis of DFO with other HO benchmark techniques is missing.
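To make the simulated-annealing idea above more concrete, the following is a minimal sketch of SA searching over a single hyperparameter, the number of neurons. It is not the method of Ref. [15]: the objective `validation_error` is a hypothetical toy function standing in for "train a network and return its validation RMSE", and the search range, step size, and geometric cooling schedule are all assumptions chosen for illustration.

```python
import math
import random


def validation_error(n_neurons):
    # Hypothetical stand-in for training a network with n_neurons and
    # returning its validation RMSE. The toy bowl has its minimum at 48.
    return 0.05 + ((n_neurons - 48) / 100.0) ** 2


def anneal_neurons(objective, low=4, high=256, start=128, steps=300,
                   t_start=1.0, t_end=1e-3, seed=0):
    """Simulated annealing over one integer hyperparameter (assumed setup)."""
    rng = random.Random(seed)
    current, current_err = start, objective(start)
    best, best_err = current, current_err
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / (steps - 1))
        # Propose a neighbour with a small random step, clipped to the range.
        candidate = min(high, max(low, current + rng.randint(-8, 8)))
        cand_err = objective(candidate)
        delta = cand_err - current_err
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_err = candidate, cand_err
        # Remember the best configuration seen so far.
        if current_err < best_err:
            best, best_err = current, current_err
    return best, best_err


best_neurons, best_err = anneal_neurons(validation_error)
print(best_neurons, round(best_err, 4))
```

In a real setting each call to the objective means a full training run, which is why restricting the search space, as noted above, trades accuracy gains for lower computational cost.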

Ensemble methods have been widely deployed for forecasting applications due to their ease of implementation. In Ref. [18], the authors employed the eXtreme Gradient Boosting (XGB) technique to predict the load based on similar days using cluster analysis. The presented results confirmed the superiority of the ensemble methods in terms of accuracy and generalization capabilities compared to deep learning techniques such as Long Short-Term Memory (LSTM). Nevertheless, the results are unsatisfactory, with a relatively poor Mean Square Error (MSE). In Ref. [19], a combination of a Convolutional Neural Network (CNN) and a Light Gradient Boosting Machine (LGBM) was proposed. The feature extraction process was carried out using the CNN model on data from five wind turbines. The reliability of the proposed technique was verified by the low values of the registered error metrics. However, this technique shows sensitivity to time-order character size and requires high computing resources.

In this paper, a stacked generalization approach combining XGB, LGBM, and Multi-Layer Perceptron (MLP) models, named Stacked XGB-LGBM-MLP, is explored for STLF for the first time. To the best of the authors’ knowledge, no prior work has addressed this architecture for STLF. The Stacked XGB-LGBM-MLP model is characterized by high accuracy, excellent performance, and ease of implementation. Moreover, simulation results on data from an open data portal demonstrate that the Stacked XGB-LGBM-MLP model outperforms 11 benchmarks for the STLF application. The main contributions of this paper are as follows:

• A novel Stacked XGB-LGBM-MLP model is proposed to improve the overall regression performance. Despite the learning ability and rigorous mathematical theory of the XGB and LGBM models, they rely only on tree models of the same category, and it is difficult to fundamentally overcome the inherent defects of tree models. Feeding the meta-data to an MLP model enhances its training performance and yields better-quality results.
• A novel STLF technique is explored and developed in this study. Most of the existing techniques use either ensemble models or neural networks for STLF. However, mixing ensembles and neural networks in a single framework has not received enough attention in previous studies.
• A comprehensive comparative analysis of five HO algorithms is presented for STLF. Previous studies focused mainly on using various HO techniques to enhance the performance of ML models. However, selecting the most appropriate HO technique has received little attention in the field of ML.
• An assessment of the proposed technique is conducted using two real datasets. The sensitivity of the proposed technique to the size and nature of the data is given significant attention in this study.
• A comparative study with the recent benchmark techniques is performed. A large comparative study with 11 benchmarks has been conducted to demonstrate the high performance of the proposed Stacked XGB-LGBM-MLP model.

The remainder of the paper is organized as follows: Section 2 presents the preliminaries of the proposed technique. In Section 3, two case studies are conducted, several existing Machine Learning (ML) models are compared to the proposed technique, and a comparative study with a recent benchmark technique on the same dataset is discussed. Finally, Section 4 draws conclusions to end this paper.

Section snippets

Preliminaries on ML models and stacked XGB-LGBM-MLP method

In this section, three ML models are introduced according to their distinct architectures, namely, LGBM, XGB, and the MLP network. Moreover, the proposed technique is comprehensively investigated.

Case studies and performance assessment

Based on publicly available datasets, real case studies were conducted to assess the proposed approach and illustrate the prediction performance of the hybrid model. Furthermore, a comparative analysis with recent benchmarks is performed. Moreover, the high accuracy of the proposed method is verified against the latest hybrid STLF technique on the same dataset.
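For reference, the error metrics that such a performance assessment typically relies on can be written in a few lines. The introduction mentions RMSE and MSE; MAPE is added here as an assumption because it is common in load forecasting, and this excerpt does not state which metrics the case studies actually report. The sample values are made up for illustration.

```python
import math


def _check(actual, predicted):
    # Both series must be non-empty and aligned element by element.
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")


def mse(actual, predicted):
    """Mean squared error."""
    _check(actual, predicted)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the load."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is 0."""
    _check(actual, predicted)
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


# Made-up load values (e.g. in MW) purely for illustration.
actual = [100.0, 120.0, 80.0, 100.0]
predicted = [110.0, 114.0, 84.0, 100.0]
print(mse(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

RMSE penalizes large deviations more heavily than MAPE, while MAPE is scale-free, which makes it convenient when comparing datasets of different sizes.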

Conclusions and future work

This paper proposes a novel computing framework based on the stacked generalization method for STLF. In order to improve on the accuracy of single techniques, the proposed technique combines three efficient methods, namely, eXtreme Gradient Boosting (XGB), Light Gradient Boosting Machine (LGBM), and Multi-Layer Perceptron (MLP) models. The components of the proposed model are selected based on a trade-off among individual performance, training time, and ease of implementation. The presented experiments strongly

Credit author statement

Mohamed Massaoudi: Conceptualization, Methodology, Software, Original draft preparation. Shady S. Refaat: Validation, Writing – review & editing, Project administration, Funding acquisition. Ines Chihi: Validation, Formal analysis, Data curation, Writing – review & editing. Mohamed Trabelsi: Formal analysis, Investigation, Writing – review & editing, Resources. Fakhreddine S. Oueslati: Supervision, Project administration. Haitham Abu-Rub: Proofreading & editing, Review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This publication was made possible by NPRP grant [NPRP10-0101-170082] from the Qatar National Research Fund (a member of Qatar Foundation), the co-funding by IBERDROLA QSTP LLC and sponsorship by Texas A&M Energy Institute Fellowship. The statements made herein are solely the responsibility of the authors.

References (41)

M. Faheem et al., Smart grid communication and information technologies in the perspective of Industry 4.0: opportunities and challenges, Comput Sci Rev (2018)
B. Yildiz et al., A review and analysis of regression and machine learning models on commercial building electricity load forecasting, Renew Sustain Energy Rev (2017)
D.W. van der Meer et al., Review on probabilistic forecasting of photovoltaic power production and electricity consumption, Renew Sustain Energy Rev (2018)
S.-M. Zhou et al., Low-level interpretability and high-level interpretability: a unified view of data-driven interpretable fuzzy system modelling, Fuzzy Set Syst (2008)
T. Khalili et al., Optimal battery technology selection and incentive-based demand response program utilization for reliability improvement of an insular microgrid, Energy (2019)
C.W. Tsai et al., Optimizing hyperparameters of deep learning in predicting bus passengers based on simulated annealing, Appl Soft Comput J (2020)
M. Pirhooshyaran et al., Feature engineering and forecasting via derivative-free optimization and ensemble of sequence-to-sequence networks with applications in renewable energy, Energy (2020)
M.Q. Raza et al., A review on artificial intelligence based load demand forecasting techniques for smart grid and buildings, Renew Sustain Energy Rev (2015)
D.H. Wolpert, Stacked generalization, Neural Network (1992)
R. Khalid et al., A survey on hyperparameters optimization algorithms of forecasting models in smart grid, Sustain Cities Soc (2020)
G.R. Esteves et al., Long term electricity forecast: a systematic review (2015)
L. Hernandez et al., A survey on electric power demand forecasting: future trends in smart grids, microgrids and smart buildings, IEEE Commun Surv Tutorials (2014)
K. Chen et al., Short-term load forecasting with deep residual networks, IEEE Trans Smart Grid (2019)
H. Quan et al., Short-term load and wind power forecasting using neural network-based prediction intervals, IEEE Trans Neural Netw Learn Syst (2014)
S. Wanqing et al., Multifractional Brownian motion and quantum-behaved partial Swarm optimization for bearing degradation forecasting (2020)
N. Zhang, Z. Li, X. Zou, S. M. Quiring, Comparison of three short-term load forecast models in Southern California,...
C. Sigauke et al., Peak electricity demand forecasting using time series regression models: an application to South African data, J Stat Manag Syst (2016)
Y. Yang, W. Hong, S. Li, Deep ensemble learning based probabilistic load forecasting in smart grids, Energy 189, ISSN...
A. Jafari et al., A hybrid optimization technique using exchange market and genetic algorithms, IEEE Access (2020)
M. Pirhooshyaran, L. V. Snyder, Forecasting, hindcasting and feature selection of ocean waves via recurrent and...
