A novel stacked generalization ensemble-based hybrid LGBM-XGB-MLP model for Short-Term Load Forecasting
Introduction
With the recent waves of digitization and the rapid spread of information and communication technologies, many initiatives have placed added emphasis on the development and modernization of traditional centralized power grids [1]. The balance between electricity generation and load demand must be optimally maintained to avoid fatal disturbances on the grid due to overloads [2]. To that end, electric Load Forecasting (LF) offers stakeholders and energy suppliers the tools needed to increase their profitability from Renewable Energy Sources (RES) and meet the ever-growing electricity demand. The high complexity of utility grid operations has paved the way for LF to monitor, control, and manage electric system operations with high efficiency [3].
Recently, LF has reached a state of maturity that supports its safe and profitable application in both smart grids and traditional utility grids with satisfactory results. Within the energy management framework, LF is a cost-effective, efficient, and reliable technique [4]. Electrical LF assists the scheduling of load response and helps keep economic dispatch fast and close to optimal. Furthermore, it provides a reliable indicator for managing complex pricing strategies in liberalized and deregulated energy markets, with higher financial benefits [5]. Efficient energy management systems require intelligent algorithms to support the electric operations strategy, decrease electricity bills, and enhance energy trading and planning [5]. LF is conducted using a variety of input features, including social, economic, and weather conditions [3]. Based on the application type, LF is classified according to two criteria: 1) the time horizon and 2) the scope of the variables employed [3].
With respect to the time horizon, most LF techniques fall into four major classes: long-term LF, valid for years; medium-term LF, valid from months to years; Short-Term Load Forecasting (STLF), used from minutes up to one week; and very short-term LF, valid from seconds to minutes [6]. STLF ensures high asset commitment and grid flexibility, enhancing the serviceability of electricity operations over several time scales for day-to-day operations [7]. Ref. [3] concluded that STLF is the most useful LF horizon.
STLF models are classified into three categories: soft computing techniques, conventional forecasting techniques, and modified traditional techniques [8,9]. Traditional techniques involve regression methods, iteratively reweighted least squares, and exponential smoothing [10]. However, these methods have three major drawbacks: overfitting with massive amounts of data, difficulties in feature engineering, and relatively lower accuracy than advanced techniques. Modified techniques include adaptive demand forecasting, stochastic time series, auto-regressive and moving average-based models, and support vector machine-based techniques [3]. Soft computing techniques essentially consist of genetic algorithms, fuzzy logic, neural networks, and knowledge-based expert systems [11]. However, these methods also have major drawbacks, including loss of model interpretability, higher execution time and computational burden, and limited generalization capabilities. Moreover, Hyperparameter Optimization (HO) for soft computing techniques is a computationally expensive and time-consuming task [12].
Various algorithms have been proposed to solve optimization problems and enhance ML model performance. In Ref. [13], the authors proposed an Exchange Market-Genetic Algorithm (EMGA) technique to solve optimization problems with fewer iterations and better-quality results. The proposed technique combines the merits of the genetic algorithm and the exchange market algorithm [14]. EMGA took 2.82 min for 641 iterations to solve twelve benchmark functions. Despite the fast execution time and low error rate, the simulation results show that EMGA exhibits a high time-per-iteration ratio. The authors in Ref. [15] used a Simulated Annealing (SA) algorithm for the HO of a Deep Neural Network (DNN). The proposed SA-DNN achieves accurate results in terms of RMSE. However, to avoid computational burden, the search space of the SA-DNN HO is very limited (it only includes the number of neurons), which leads to a low accuracy improvement over the plain DNN. In Ref. [16], the authors used the Spearmint Bayesian Optimization (BO) method for the HO of recurrent neural networks. The proposed technique is highly effective for both short-term and long-term forecasting. The authors in Ref. [17] used a Derivative-Free Optimization (DFO) technique with deep learning models. The proposed technique uses efficient feature selection via ensemble structures to predict a variety of RES. However, a comparative analysis of DFO against other HO benchmark techniques is missing.
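To make the idea of SA-based HO over a single hyperparameter concrete, the following is a minimal sketch, not the implementation of Ref. [15]. It anneals over a neuron count only, mirroring the restricted search space described above. The function `validation_error`, its optimum at 48 neurons, the search bounds, the step sizes, and the linear cooling schedule are all hypothetical assumptions chosen for illustration; in practice the objective would train a network and return its validation RMSE.

```python
import math
import random


def validation_error(n_neurons):
    """Hypothetical stand-in for 'train a DNN and return its validation RMSE'.

    Assumption: a smooth bowl with its minimum at 48 neurons.
    """
    return (n_neurons - 48) ** 2 / 100.0 + 1.0


def simulated_annealing(objective, low, high, n_iter=2000, t_start=10.0, seed=0):
    """Minimize `objective` over the integers in [low, high]."""
    rng = random.Random(seed)
    current = rng.randint(low, high)
    current_err = objective(current)
    best, best_err = current, current_err
    for i in range(n_iter):
        # Linear cooling schedule (an assumption; other schedules are common).
        temperature = t_start * (1 - i / n_iter) + 1e-9
        step = rng.choice([-4, -2, -1, 1, 2, 4])
        candidate = min(high, max(low, current + step))
        cand_err = objective(candidate)
        delta = cand_err - current_err
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_err = candidate, cand_err
            if current_err < best_err:
                best, best_err = current, current_err
    return best, best_err


best_neurons, best_error = simulated_annealing(validation_error, low=8, high=128)
print(best_neurons, best_error)
```

Because every evaluation of a real objective means training a network, the iteration budget dominates the cost, which is why the search space in such studies is often kept small.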
Ensemble methods have been widely deployed for forecasting applications due to their ease of implementation. In Ref. [18], the authors employed the extreme Gradient Boosting (XGB) technique to predict the load based on similar days identified through cluster analysis. The presented results confirmed the superiority of ensemble methods in terms of accuracy and generalization capabilities compared to deep learning techniques such as Long Short-Term Memory (LSTM). Nevertheless, the results remain unsatisfactory, with a relatively poor Mean Square Error (MSE). In Ref. [19], a combination of a Convolutional Neural Network (CNN) and a Light Gradient Boosting Machine (LGBM) was proposed. Feature extraction was carried out with the CNN model on data from five wind turbines. The reliability of the proposed technique was verified by the low values of the reported error metrics. However, this technique shows sensitivity to time-order character size and requires high computing resources.
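Since the studies above are compared through MSE and RMSE, a short sketch of both metrics may help. It uses the standard definitions, MSE as the mean of squared errors and RMSE as its square root; the load values in the example are invented purely for illustration.

```python
import math


def mse(actual, predicted):
    """Mean Square Error: average of the squared forecast errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: same unit as the load itself."""
    return math.sqrt(mse(actual, predicted))


# Illustrative values only (not taken from any dataset in the paper).
actual_load = [100.0, 110.0, 120.0]
forecast = [98.0, 113.0, 120.0]

mse_value = mse(actual_load, forecast)    # (4 + 9 + 0) / 3
rmse_value = rmse(actual_load, forecast)  # sqrt(13 / 3)
print(mse_value, rmse_value)
```

RMSE is usually easier to interpret than MSE because it is expressed in the same unit as the forecast quantity.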
In this paper, a stacked generalization approach combining XGB, LGBM, and Multi-Layer Perceptron (MLP) models, named Stacked XGB-LGBM-MLP, is explored for STLF for the first time. To the best of the authors' knowledge, no prior work has addressed this architecture for STLF. The Stacked XGB-LGBM-MLP model is characterized by high accuracy, excellent performance, and ease of implementation. Moreover, simulation results on data from an open data portal demonstrate that the Stacked XGB-LGBM-MLP model outperforms 11 benchmarks for the STLF application. The main contributions of this paper are as follows:
- A novel Stacked XGB-LGBM-MLP model is proposed to improve the overall regression performance. Despite the learning ability and rigorous mathematical foundation of the XGB and LGBM models, they can only combine tree models of the same category, which makes it difficult to fundamentally overcome the inherent defects of tree models. Training an MLP model on the meta-data enhances the training performance and generates a better-quality result.
- A novel STLF technique is explored and developed in this study. Most existing techniques use either ensemble models or neural networks for STLF. However, mixing ensembles and neural networks in a single framework has not received enough attention in previous studies.
- A comprehensive comparative analysis of five HO algorithms is presented for STLF. Previous studies focused mainly on using various HO techniques to enhance the performance of ML models. However, selecting the most appropriate HO technique has received little attention in the field of ML.
- An assessment of the proposed technique is conducted using two real datasets. The sensitivity of the proposed technique to the size and nature of the data is given significant importance in this study.
- A large comparative study with 11 recent benchmark techniques is conducted to demonstrate the high performance of the proposed Stacked XGB-LGBM-MLP model.
The remainder of the paper is organized as follows: Section 2 presents the preliminaries of the proposed technique. In Section 3, two case studies are conducted, several existing Machine Learning (ML) models are compared to the proposed technique, and a comparative study with a recent benchmark technique on the same dataset is discussed. Finally, Section 4 draws conclusions to end this paper.
Section snippets
Preliminaries on ML models and stacked XGB-LGBM-MLP method
In this section, three ML models are introduced according to their distinguished architecture, namely, LGBM, XGB and MLP Network. Moreover, the proposed technique is comprehensively investigated.
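The mechanics of stacked generalization can be illustrated with a minimal, dependency-free sketch. It is not the authors' model: a least-squares line and a k-nearest-neighbour average stand in for the XGB and LGBM base learners, a single blending weight chosen by grid search stands in for the MLP meta-learner, and the hourly load series is synthetic. All function names, the fold scheme, and the data are assumptions made for illustration. What the sketch does preserve is the defining step of stacking: the meta-learner is fitted on out-of-fold predictions of the base learners, never on predictions for samples the base learners have already seen.

```python
import math


def fit_linear(xs, ys):
    """Least-squares line y = a*x + b (stand-in for one base learner)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    b = my - a * mx
    return lambda x: a * x + b


def fit_knn(xs, ys, k=3):
    """Mean of the k nearest neighbours (stand-in for a second base learner)."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def out_of_fold(fit, xs, ys, n_folds=5):
    """Predict every sample with a model that did not see it during training."""
    preds = [0.0] * len(xs)
    for fold in range(n_folds):
        train = [i for i in range(len(xs)) if i % n_folds != fold]
        held_out = [i for i in range(len(xs)) if i % n_folds == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = model(xs[i])
    return preds


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_stack(xs, ys, base_fits, steps=20):
    """Level 0: out-of-fold base predictions. Level 1: blend weight fitted on them."""
    oof = [out_of_fold(fit, xs, ys) for fit in base_fits]
    best_w = min(
        (s / steps for s in range(steps + 1)),
        key=lambda w: mse(ys, [w * a + (1 - w) * b for a, b in zip(*oof)]),
    )
    models = [fit(xs, ys) for fit in base_fits]  # refit base learners on all data
    return (lambda x: best_w * models[0](x) + (1 - best_w) * models[1](x)), best_w


# Synthetic hourly "load": linear trend plus a daily cycle (illustrative only).
hours = list(range(96))
load = [50 + 0.1 * h + 10 * math.sin(2 * math.pi * h / 24) for h in hours]

stack, weight = fit_stack(hours, load, [fit_linear, fit_knn])
print(weight, stack(30))
```

Replacing the stand-ins with gradient-boosted trees at level 0 and a neural network at level 1 changes the learners but not this two-level structure.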
Case studies and performance assessment
Real case studies based on publicly available datasets were conducted to assess the proposed approach and illustrate the prediction performance of the hybrid model. Furthermore, a comparative analysis with recent benchmarks is performed. Moreover, the accuracy of the proposed method was verified against the latest hybrid STLF technique on the same dataset.
Conclusions and future work
This paper proposes a novel computing framework based on the stacked generalization method for STLF. To improve on the accuracy of single techniques, the proposed technique combines three efficient models, namely, Extreme Gradient Boosting (XGB), Light Gradient Boosting Machine (LGBM), and Multi-Layer Perceptron (MLP). The components of the proposed model are selected based on a trade-off between individual performance, training time, and ease of implementation. The presented experiments strongly
Credit author statement
Mohamed Massaoudi: Conceptualization, Methodology, Software, Original draft preparation. Shady S. Refaat: Validation, Writing – review & editing, Project administration, Funding acquisition. Ines Chihi: Validation, Formal analysis, Data curation, Writing – review & editing. Mohamed Trabelsi: Formal analysis, Investigation, Writing – review & editing, Resources. Fakhreddine S. Oueslati: Supervision, Project administration. Haitham Abu-rub: Proofreading & editing, Review & editing.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This publication was made possible by NPRP grant [NPRP10-0101-170082] from the Qatar National Research Fund (a member of Qatar Foundation), the co-funding by IBERDROLA QSTP LLC and sponsorship by Texas A&M Energy Institute Fellowship. The statements made herein are solely the responsibility of the authors.
References (41)
- et al. Smart grid communication and information technologies in the perspective of Industry 4.0: opportunities and challenges. Comput Sci Rev (2018)
- et al. A review and analysis of regression and machine learning models on commercial building electricity load forecasting. Renew Sustain Energy Rev (2017)
- et al. Review on probabilistic forecasting of photovoltaic power production and electricity consumption. Renew Sustain Energy Rev (2018)
- et al. Low-level interpretability and high-level interpretability: a unified view of data-driven interpretable fuzzy system modelling. Fuzzy Set Syst (2008)
- et al. Optimal battery technology selection and incentive-based demand response program utilization for reliability improvement of an insular microgrid. Energy (2019)
- et al. Optimizing hyperparameters of deep learning in predicting bus passengers based on simulated annealing. Appl Soft Comput J (2020)
- et al. Feature engineering and forecasting via derivative-free optimization and ensemble of sequence-to-sequence networks with applications in renewable energy. Energy (2020)
- et al. A review on artificial intelligence based load demand forecasting techniques for smart grid and buildings. Renew Sustain Energy Rev (2015)
- Stacked generalization. Neural Network (1992)
- et al. A survey on hyperparameters optimization algorithms of forecasting models in smart grid. Sustain Cities Soc (2020)
- Long term electricity forecast: a systematic review
- A survey on electric power demand forecasting: future trends in smart grids, microgrids and smart buildings. IEEE Commun Surv Tutorials
- Short-term load forecasting with deep residual networks. IEEE Trans Smart Grid
- Short-term load and wind power forecasting using neural network-based prediction intervals. IEEE Trans Neural Netw Learn Syst
- Multifractional Brownian motion and quantum-behaved particle swarm optimization for bearing degradation forecasting
- Peak electricity demand forecasting using time series regression models: an application to South African data. J Stat Manag Syst
- A hybrid optimization technique using exchange market and genetic algorithms. IEEE Access