1 Introduction

Natural gas usage has seen a continuous surge in the world energy systems due to sustainability challenges and to mitigate hazardous environmental impacts (Avraam et al., 2021; Zhou et al., 2021). Economic activities are closely linked with natural gas prices. As a result, predicting natural gas prices becomes a highly arduous task of paramount relevance in the planning and operations of various key assets, including public utilities (Goncharuk & Cirella, 2020). Natural gas price predictions assist in regulatory decision-making regarding electric and gas utilities (Costello, 2010). Despite an uninterrupted growth of natural gas trade, bubbles in natural gas prices in the European Union, Asia, and the US markets have been documented (Li et al., 2020). Economic euphoria, geopolitical events, fluctuations of crude oil prices largely account for the same. Unavoidable lockdowns and curbs owing to the COVID-19 pandemic have further triggered uncertainty in the overall consumption of natural gas and (Cihan, 2022).

The literature reported the rambling price movements of natural gas played a pivotal role in political chaos in Venezuela, public protests in Iran, and political turmoil in Kazakhstan (Rostan & Rostan, 2021). Precise predictions of natural gas prices can help optimize resource utilization, reserves management, and drawing decisions on economic reforms. Natural gas price movements are susceptible to natural calamities and external economic volatility (Huang & Etienne, 2021; Shi & Shen, 2021). Its co-movement and strong nexus with the crude oil price and stock market across different economies are also apparent (Ftiti et al., 2020; Mensi et al., 2021). Fluctuations of natural gas prices induce volatility towards gas-and-coal-fired electricity generation (Linn and Muehlenbachs, 2018). The potential implications of natural gas price predictions motivate us to develop a robust framework that yields highly accurate forecasts during the normal and new-normal timelines.

Natural gas futures price forecasting has received less attention than crude oil. The literature is more focused on predictive analysis of natural gas consumption at industrial and household levels (Gao & Shao, 2021; Hu et al., 2020; Xiao et al., 2020). We retrieve some literature that deals with natural gas futures price forecasting. Variants of regression and artificial neural network (ANN) models have been used for predictive modeling of natural gas prices (Salehnia et al., 2013). The results revealed that the accuracy of the ANN models was the best. Historical prices coupled with news on the internet and search data are used to forecast natural gas prices (Tang et al., 2019). The latter two components impacted positively to achieve better predictions. A Variational mode decomposition-based approach is used to estimate the point and interval forecasts for natural gas prices (Jiang et al., 2021). The obtained forecasts were more reliable and stable than the other models. Some hybrid approaches are also proposed to forecast daily natural gas prices (Wang et al., 2020). Unfortunately, the dearth of predictive modeling work to predict natural gas prices during the COVID-19 pandemic is pretty apparent. It is also important to acknowledge that the previous studies have considered the larger sample sizes for model building. They are not tailor-made to identify the patterns for a shorter sample size, which imposes several practical decision-making restrictions.

This paper contributes in the following two ways: First, it proposes a unique residual driven ensemble machine learning approach. The residuals of the previous stage have been utilized as a technical indicator to perform predictive tasks. The inclusion of residuals of predictive modeling of the previous stage has transpired to be of utmost importance in drawing forecasts of supreme precision in the subsequent stage. Second, explicitly carrying out forecasting during the timeline of the pandemic covering the first and second waves is of enormous practical implications and acts as an acid test to check the robustness and capacity of any predictive structure. The predictive modeling approach proposed in this study helps to produce quality forecasts during said periods. Also, it uses relatively small segments of data of the first and second waves of pandemics exhibiting a high degree of turbulence owing to apparent implications.

The rest of the paper is organized as follows: In Sect. 2, we discuss the pertinent literature in brevity to comprehend the trend and highlight the existing gaps to justify the positioning of current work. Data descriptions alongside the settings to ascertain the prediction quality during Pre COVID-19 and COVID-19 regimes have been elaborated in Sect. 3. The research methodology has been elucidated in Sect. 4. The outcome of predictive exercises has been presented and analyzed in Sect. 5. Finally, the paper is concluded in Sect. 6 by mentioning the key findings, limitations, and future research agendas.

2 Literature review

Owing to its contribution to the energy market, different aspects of natural gas modeling have received considerable attention from researchers. The co-movement patterns of natural gas and crude oil prices were studied (Mensi et al., 2021). Natural gas prices offered diversification benefits. Scholars also studied the interplay of natural gas prices with other commodity and financial assets (Ameur et al., 2020; Kumar et al., 2021; Liao et al., 2021). On a similar note, the natural gas consumption prediction has garnered attention, as reported in the literature (Liu et al., 2021; Lu et al., 2020; Qiao et al., 2020; Wei et al., 2021; Zhou et al., 2020). Unfortunately, few studies are available so far in the literature for forecasting natural gas prices using machine and deep learning techniques despite being a widely used source of clean energy.

An ensemble predictive structure was proposed to analyze Henry Hub's natural gas spot prices using the least-squares boosting technique (Ali, 2020). The predictive performance was satisfactory. A similar framework was constructed to estimate the daily US natural gas price (Wang et al., 2020). Long short-term memory network (LSTM), Support vector regression (SVR), and improved pattern sequence similarity search (IPSS) models were used for generating initial forecasts that were combined using variance reciprocal framework to obtain the final forecasts. The approach was found to be extremely proficient in yielding high-quality predictions.

A granular predictive model was proposed to predict monthly spot prices of natural gas (Li et al., 2021) of Henry Hub. The methodological framework incorporated variational mode decomposition (VMD) to disentangle the original series into granular components and combined particle swarm optimization (PSO) and deep belief network (DBN) to draw forecasts. The findings revealed that WTI crude oil spot prices and natural gas consumption largely governed the evolution of Henry Hub's natural gas spot price. A neoteric short-term forecasting structure was proposed for natural gas price prediction (Wang et al., 2021). The granular framework utilized ensemble empirical mode decomposition to decompose the original series into subseries wherein an RNN model optimized through PSO was applied to fetch component-wise forecasts to draw the final aggregate forecasts. The framework performed better than the other benchmark models.

Most of these studies rely on sophisticated mathematical modeling in conjunction with machine learning and computational intelligence techniques. Unlike natural gas price modeling, the evolutionary pattern of natural gas consumption has been critically delved into during the COVID-19 pandemic (Cihan, 2022). Although natural gas prices and consumption are not entirely independent and exclusive, the evolutionary patterns and overall dynamics of both variables are influenced by different macroeconomic and financial conditions. Therefore, extending models built for consumption patterns recognition to predict price movement may not be ideal in challenging circumstances. Hence, the development of a dedicated framework to estimate futuristic figures of natural gas prices becomes essential.

The review of related literature demonstrates an inclination towards granular and artificial intelligence-driven methodologies for predictive analytics of natural gas prices. Despite being an influential energy commodity with substantial nexus with other financial assets, modeling futures prices has seen meager attention. To the best of our knowledge, no attempt has been made to date to specifically estimate forecast in the COVID-19 period encompassing both first and second waves of infection. Unlike Pre COVID-19 phase, performing prediction in the COVID-19 regime is extremely difficult irrespective of the high-end model used for drawing predictions due to profound uncertainty, disruptions, and anticipation of the market crash. Thus, barring the contribution towards the research void, the current work also propounds a neoteric feature engineering framework to accomplish the objective that can be leveraged for predicting extremely chaotic and uncertain events. Precise predictions of the UK and the US natural gas futures can directly be utilized for risk mitigation through portfolio realignment. Therefore, the apparent absence of research and profound practical applications justify the underlying research significantly.

3 Data description

We investigate the predictability of natural gas prices for the pre-and during-COVID-19 periods and see whether the current pandemic leads to more uncertainty in the pricing structure. Accordingly, we collect daily data for the NGF-UK and NGF-USA from Bloomberg. The ‘During Covid-19’ period is taken from December 31, 2019, the day of receiving the first confirmed COVID-19 case in Wuhan, China, to May 20, 2021. The span of the ‘During COVID-19’ period for this study is 506 days containing 367 observations each for both the considered series. In the ‘Pre COVID-19’ period, we consider data from August 09, 2018, to December 30, 2019, to keep the uniformity in the number of observations for both series.

In summary, we have only 367 observations in each of the pre-and during-COVID-19 periods. We present the temporal patterns of Natural gas futures for the UK and USA during the investigation periods in Fig. 1. The visual inspection implies that all four price movements are highly nonlinear. In the rest of the paper, we use ‘UK-Pre’, ‘UK-During’, ‘US-Pre’, and ‘US-During’ to denote the time series for the NGF-UK in the ‘Pre COVID-19’ period, NGF-UK prices in the ‘During COVID-19’ period, NGF-USA in the ‘Pre COVID-19’ period, and NGF-USA ‘During COVID-19’ period, respectively.

Fig. 1
figure 1

Evolutionary patterns of natural gas futures. Figure 1 represents the temporal patterns on Natural gas futures for the UK and USA during the mentioned periods. The x-axis shows the duration and the y-axis of the figures (a) and (b) shows the natural gas futures for the UK in US$ per MMBtu. The y-axis of figures (c) and (d) shows the NGF-USA in US$ per MMBtu

Table 1 provides the summary statistics and statistical tests results to establish the behavioral patterns of the datasets.

Table 1 Descriptive statistics

From Table 1, we find that the test statistic values of Jarque–Bera, Shapiro Wilk, Kolmogorov–Smirnov, and Anderson–Darling are highly significant for all the time series. So, the natural gas futures exhibit non-parametric behavior. The ADF test statistic values appear insignificant for all the series in pre-and during COVID-19 periods. Therefore, we conclude that no series is stationary. A highly significant test statistic value of the Ljung-Box test at lag 10 confirms the presence of a strong autocorrelation structure in all the time series in pre-and during COVID-19 periods. The test statistic values of both neural network tests substantiate the nonlinear behavior of all four time series. The Hurst exponent values are significantly higher than 0.5 for all the time series. So, we find the existence of persistent trends in the evolutionary patterns of natural gas futures of the UK and the USA.

Several statistical tests on the time series of natural gas futures of the UK and the USA reveal the presence of long memory and autocorrelation structure in pre-and during COVID-19 periods. In such situations, a predictive framework developed based on technical indicators is proven to provide superior forecasts (Ghosh et al., 2019). This encourages us to use a technical indicator-based predictive modeling framework. We select 30 technical indicators for predicting natural gas prices (Ghosh et al., 2019; Hsu, 2011; Huang & Tsai, 2009; Jana et al., 2020), following the previous works. The brief description of the technical indicators used in this work are as follows: MA5, MA10, and MA20 denote 5-day, 10-day, and 20-day moving average, respectively; B5, B10, and B20 denote 5-day, 10-day, and 20-day bias, respectively; MTM5, MTM10, and MTM20 denote 5-day, 10-day, and 20-day momentum, respectively; ROC5, ROC10, and ROC20 denote 5-day, 10-day, and 20-day rate of change, respectively; LAG1, LAG2, LAG3, LAG4, and LAG5 denote 1-day, 2-day, 3-day, 4-day, and 5-day back prices, respectively; EMA5, EMA10, EMA12, EMA20, and EMA26 denote 5-day, 10-day, 12-day, 20-day, and 26-day exponential moving average, respectively; UB, and LB denote upper and lower Bollinger band, respectively; LOSS denotes the difference between today’s and next day’s prices; DIFF denotes the difference between 12-day and 26-day moving average; MACD denotes the moving average convergence divergence; HP signifies the highest price of the day; LP represents the lowest price of the day; and WMS denotes the Williams overbought/oversold index.

The proposed approach uses residual (Res) of predictive modeling in the previous stage as an input feature for predicting the movements in the next stage. So, the forecasts are obtained in the first stage by considering the above 30 features as independent input variables. In the second stage, we propose to include \(Res\) as an explanatory feature in addition to the previous 30 features.

4 Methodology

In this section, we enunciate the components of the adopted integrated framework to carry out natural gas futures price forecasting in the considered regime. The construction of the predictive architecture commences with the feature engineering process. The most important contribution of the predictive structure lies in the selection of features that can compensate for the volatile impact of macroeconomic and other external shocks in recognizing the hidden pattern. Boruta feature screening algorithm has been utilized for accomplishing the task. MODWT and RF are the other two principal components of the proposed methodology.

4.1 BORUTA feature selection algorithm

Boruta (Kursa et al., 2010) is an ensemble feature selection algorithm that is in many ways similar to the RF algorithm with marginal changes. It depends on bootstrapped samples that augment the induced randomness in the underlying information system. Interested readers may follow Kursa and Rudnicki (2010) for details about this algorithm. The fundamental steps of the algorithm are outlined below. In our work, Boruta has been used to evaluate the importance of residual as an explanatory feature and identify the top three features across the granular components on which actual predictive exercises are carried out to obtain the final forecasts. We now discuss the MODWT technique, which is used for the granular decomposition of considered time series.

4.2 Maximal Overlap Discrete Wavelet Transformation (MODWT)

Wavelet decomposition is a widely used technique ideal for the multi-resolution analysis of complex time series, allowing simultaneous decomposition in time and frequency space (Das et al., 2018; Ghosh et al., 2021). We have invoked MODWT to execute the said decomposition for component-wise predictive analysis. MODWT has been preferred over conventional discrete wavelet transformation (DWT) as dyadic restriction and sensitivity to the circular shift of the latter are not applicable to the former.

The technique transforms a given function \(f\left( t \right)\) into a mother \(\psi \left( t \right)\) and father \(\varphi \left( t \right)\) wavelets at a defined scale. If \(F_{it}\) is the i-th time series value at time t, then the detail (\(\psi_{j,k}^{{r_{i} }}\)) and approximation (\(\varphi_{J,k}^{{r_{i} }}\)) can be written as follows:

$$ F_{it} = S_{J}^{{F_{i} }} \left( t \right) + D_{J}^{{F_{i} }} \left( t \right) + D_{J - 1}^{{F_{i} }} \left( t \right) + \ldots + D_{1}^{{F_{i} }} \left( t \right) $$
(1)

where \(S_{J}^{{F_{i} }} \left( t \right) = \mathop \sum \limits_{k} \varphi_{J, k}^{{F_{i} }} \phi_{J,k} \left( t \right)\), \(D_{j}^{{F_{i} }} \left( t \right) = \mathop \sum \limits_{k} \psi_{J, k}^{{F_{i} }} {\upomega }_{j,k} \left( t \right)\), \(\phi_{J,k} \left( t \right) = 2^{ - J/2} \phi \left( {\frac{{t - 2^{J} k}}{{2^{J} }}} \right)\), and \({\upomega }_{j,k} \left( t \right) = 2^{ - j/2} {\upomega }\left( {\frac{{t - 2^{j} k}}{{2^{j} }}} \right)\).

The inverse wavelet transform is

$$ f\left( t \right) = \sum\limits_{l = - \infty }^{\infty } {\sum\limits_{k = - \infty }^{\infty } {\widetilde{{\mathcal{W}}}_{l,k} \times \psi_{l,k} \left( t \right)} } $$
(2)

where \({\stackrel{\sim }{\mathcal{W}}}_{l,k}=\underset{-\infty }{\overset{\infty }{\int }}f\left(t\right){\psi }_{l,k}\left(t\right)dt\) is the discrete wavelet transform of \(f\left(t\right)\).

The decomposition can swiftly disentangle the linear and nonlinear parts of original time series observations into manageable subseries, preserving the fundamental features of financial time series such as spillovers, heteroscedasticity, volatility clustering, etc. (Das et al., 2018; Jana et al., 2021; Singh et al., 2019). The ‘haar’ wavelet transform has been used to implement MODWT to carry out four decomposition levels as it is memory efficient, fast, and precisely reversible with no edge effects. As a result, one approximation (S4) and four detailed components (D1, D2, D3, and D4) are generated. The RF is invoked to perform a forecasting task at each granular component to derive the final forecasts. Mathematically, the final forecast is derived using (3).

$$ \begin{aligned} Final\_Forecast_{i}^{RF} &= Forecast\_D_{1i}^{RF} + Forecast\_D_{2i}^{RF} + Forecast\_D_{3i}^{RF}\nonumber\\ &\quad\; + Forecast\_D_{4i}^{RF} + Forecast\_S_{4i}^{RF}\end{aligned} $$
(3)

where \({Final\_Forecast}_{i}^{RF}\) represents the final forecast obtained from RF for a particular variable i; \({Forecast\_{D}_{j}}_{i}^{RF}\) accounts for the obtained forecasts by RF on detailed components, and \({Forecast\_{S}_{j}}_{i}^{RF}\) reflects the forecasted figure at approximation component by RF.

4.3 Random forest (RF)

The RF (Breiman, 2001) is an ensemble machine learning algorithm capable of handling complex nonlinear and volatile patterns to carry out classification and regression tasks. It has been found to be extremely resilient to outliers and overfitting. The algorithm is relatively faster than classical bagging-based ensemble modeling and better than the adaptive boosting procedure in terms of accuracy (Breiman, 2001). It relies upon the bootstrapping process to construct decision trees as a base learner to classify regression tasks. Classical regression trees are used as base learners in this algorithm. RF utilizes a random sub-spacing principle in growing the trees wherein the best feature to perform branching operations is selected from a random subset of features. The aforesaid setup of randomization eventually results in an unbiased estimation. The operational steps can be found in Breiman (2001).

4.4 Performance evaluation

We have resorted to five distinct metrics to ascertain the quality of predictions as briefly mentioned next:

  • Mean Absolute Error (MAE) \(MAE = \frac{1}{N}\mathop \sum \nolimits_{i = 1}^{N} \left| {Y_{act} \left( i \right) - Y_{pred} \left( i \right)} \right|\), where \(Y_{pred} \left( i \right)\) and \(Y_{act} \left( i \right)\) are predicted and actual values for the i-th observation. A lower MAE value implies a superior prediction.

  • Root Mean Squared Error (RMSE) \(RMSE = \sqrt {\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left( {Y_{act} \left( i \right) - Y_{pred} \left( i \right)} \right)^{2} }\), where \(Y_{pred} \left( i \right)\) and \(Y_{act} \left( i \right)\) are predicted and actual values for the i-th observation. A lower RMSE value implies a superior prediction.

  • Nash–Sutcliffe Efficiency (NSE) \(NSE = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left\{ {Y_{act} \left( i \right) - Y_{pred} \left( i \right)} \right\}^{2} }}{{\mathop \sum \nolimits_{i = 1}^{N} \left\{ {Y_{act} \left( i \right) - \overline{Y}_{act} } \right\}^{2} }}\), where \(\overline{Y}_{act}\) represents the mean of actual observations. NSE values lie between \(- \infty\) to 1. The NSE value should be close to 1 for superior forecast accuracy.

  • Index of Agreement (IA) \(IA = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {Y_{act} \left( i \right) - Y_{pred} \left( i \right)} \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{N} \left\{ {\left| {Y_{pred} \left( i \right) - \overline{Y}_{act} } \right| + \left| {Y_{act} \left( i \right) - \overline{Y}_{act} } \right|} \right\}^{2} }}\). The IA value should be close to 1 for superior forecast accuracy.

  • Theil Index (TI) \(TI = \frac{{\left[ {\frac{1}{N}\mathop \sum \nolimits_{i = 1}^{N} \left( {Y_{act} \left( i \right) - Y_{pred} \left( i \right)} \right)^{2} } \right]^{1/2} }}{{\left[ {\frac{1}{N}\mathop \sum \nolimits_{i = 1}^{N} Y_{act} \left( i \right)^{2} } \right]^{1/2} + \left[ {\frac{1}{N}\mathop \sum \nolimits_{i = 1}^{N} Y_{pred} \left( i \right)^{2} } \right]^{1/2} }}\). The TI value should be close to zero for superior forecast accuracy.

5 Results and discussion

This section systematically presents the main findings of MODWT decomposition of the original time series corresponding to the natural gas futures of the UK and the USA, feature importance and impact of Error as a feature, predictive modeling exercise, and a comparative performance assessment.

5.1 MODWT decomposition

We decompose the natural gas futures time series in pre-and during COVID-19 periods using the MODWT model into four components. The “haar” filter is selected for proving the best decomposed components. The decomposition results are presented in Fig. 2a–d. In each figure, observations are shown along the horizontal axis, and absolute values are given along the vertical axis of respective decomposed components.

Fig. 2
figure 2

MODWT decomposition of natural gas futures

5.2 Feature importance and impact of \(Res\) as a feature

We extensively study the feature importance and the impact of the newly introduced residual (\(Res\)) in the second stage of forecasting the natural gas prices. For this purpose, the Boruta feature selection algorithm is run for 100 iterations only as we do not find significant changes in the importance of features and their rankings. Tables 2, 3, 4, 5 denote the key findings of the feature selection experiment that include mean of squared residuals and percentage variance explained top three features with their importance and execution time. For each decomposed component of the time series representing the NGF-UK and the NGF-USA in the pre-and during COVI-19 periods, we compare the performance of the of by considering the \(Res\) as not a feature (NF) and \(Res\) as a feature (F).

Table 2 Features importance of the decomposed components of UK-Pre (Pre COVID-19 NGF-UK)
Table 3 Features importance of the decomposed components of UK-During (During COVID-19 NGF-UK)
Table 4 Features importance of the decomposed components of US-Pre (Pre COVID-19 NGF-USA)
Table 5 Features importance of the decomposed components of US-During (During COVID-19 NGF-USA)

For the UK-Pre, \(Res\) has transpired to be extremely important features for detailed components. The \(Res\) turned out to be the most important feature for D1, D2, and D3 and occupied the second spot for D4. These detailed components reflect the high-frequency nonlinear parts of the original series. Thus, the inclusion of \(Res\) is extremely useful for nonlinear modeling. The high quantum of actual variation can be explained by \(Res\) solely. The degree of variation explained by \(Res\) is substantially higher than any other feature. When \(Res\) has not been used, other technical indicators have taken the spot. Among the technical indicators, LAG1 has appeared to be very influential that signifies the dependence of natural gas prices on the immediate past information. On the approximation component, \(Res\) has not turned out to be lying on the top three spots in terms of feature importance.

The features importance UK-During are almost like that of UK-Pre. \(Res\) has appeared to possess significant explanatory power for modeling nonlinear components. It would be interesting to compare whether the actual predictive performance reflects the same phenomenon or not. Among the technical indicators, LAG1 has turned out to be highly important as well.

For the US-Pre, \(Res\) has appeared to be crucial as well. Unlike the UK counterparts, \(Res\) has not appeared to be important for the D4 component. Nevertheless, it has transpired to play a critical role in explaining the variation for the approximation component. Its inclusion increases the overall explained variation component. LAG1 has emerged to be important among the set of technical indicators implying that natural gas futures for the US depend on its immediate past behavior.

Features evaluation of US-During justifies the inclusion of \(Res\). It is amply apparent that the said feature has appeared to be an extremely impactful feature that largely expounds the overall variability on all high-frequency detailed components. Likewise, in earlier instances, LAG1 has turned out to be a critical technical indicator also. Hence, it is evident that \(Res\) plays a pivotal role in explaining highly nonlinear trends that may be beneficial for prediction exercise in the pandemic time horizon. It becomes imperative to evaluate how the previous stage unexplained behavior can explain the future volatile traits. Natural gas futures of both the UK and USA across the regimes have significantly depended on immediate historical information.

The decomposed series obtained through MODWT contains one approximation (S4) and four detailed components (D1, D2, D3, and D4). Out of the four detailed components, D1 accounts for the most volatile part (high frequency). It contains the maximum unexplained component compared to the other three detailed components. So, the \(Res\) corresponding to D1 has the maximum importance as a feature used in the second stage and plays the most crucial role in the forecasting process.

5.3 Predictive analytics

The proposed two-stage predictive framework utilizes the MODWT, Boruta, and RF algorithms. Time series corresponding to the NGF-UK and the NGF-USA is initially decomposed using MODWT. In the first stage, 30 independent features are used, whereas, in the second stage \(Res\) is added as the 31st independent feature. In each stage, the final forecast is obtained by summing the component-wise forecasts.

The original daily time series corresponding to all the natural gas futures are randomly portioned into training (70%), testing (15%), and validation (15%) datasets to conduct the predictive exercise. It may be noted here that there are only 367 observations for each time series. Therefore, the training dataset contains 256 observations to build the random forest model. The entire predictive performance is done using the ‘rattle()’ package in R. The number of trees in the model is set to 500, a default value in the package. In this study, we extensively investigate the predictability the natural gas prices in the pre-and during COVID-19 periods. The investigation roadmap is as follows:

  1. a.

    We partition each of the NGF-UK and the NGF-USA into two time zones—pre (August 09, 2018, to December 30, 2019) and during (December 31, 2019, to May 20, 2021) COVID-19 periods. So, there are four time series that are denoted by UK-Pre, UK-During, US-Pre, and US-During. We then decompose each of them into five granular components using MODWT. So, we have a total of 20-time series.

  2. b.

    We apply the FR algorithm on the mentioned 20-time series to obtain the stage 1 forecasts for UK-Pre, UK-During, US-Pre, and US-During based on the 30 selected features. The final forecast of this stage is obtained by aggregating the forecasts obtained from the granular decomposed components.

  3. c.

    We calculate the \(Res\) for all 20-time series and update the feature lists.

  4. d.

    Then, we calculate the stage 1 forecasts for all 20 decomposed time series based on initially selected 30 features and the \(Res\) as the additional feature. Following a similar process, as followed in stage 1, the final forecast of the second stage is obtained.

  5. e.

    To compare the performance of the proposed two-stage framework, we obtain forecasts from the original UK-Pre, UK-During, US-Pre, and US-During time series without decomposing them and applying the RF algorithm.

Finally, we calculate forecast errors and access the predictive performances for the training and the whole datasets for all the decomposed components, aggregate series, and the original series without decomposition considering. For the first two cases, we compute forecast errors and predictive performances for stage 1 (\(Res\) not as a feature) and stage 2 (\(Res\) as a feature). These results are presented in Tables 6, 7, 8. Also, the forecast errors and predictive performances based on the training dataset on MODWT-decomposed individual components for both the stages are shown in the "Appendix" to obtain more insights on the overall predictive performance of the proposed two-stage ensemble machine learning approach.

Table 6 Forecast error and predictive performance based on the training dataset on MODWT-decomposed series
Table 7 Forecast error and predictive performance based on the whole dataset on MODWT-decomposed series
Table 8 Forecast error and predictive performance based on the whole dataset without MODWT-decomposition

We discuss the forecasting performance of the proposed approach based on the forecast errors and predictive performances presented in Tables 6, 7, 8. As expected, the MAE and RMSE values for UK-Pre, UK-During, US-Pre, and US-During are superior for the training dataset with \(Res\) as a feature compared to the MODWT-decomposed series with \(Res\) not as a feature and without MODWT-decomposition. The forecasting errors are minimum for the NGF-UK in the pre-COVID-19 period. The corresponding MAE and RMSE values are 0.001732 and 0.002878, respectively. On the other hand, the forecasting errors are worst for the NGF-USA in the pre-COVID-19 period. Surprisingly, the forecast errors are smaller for the NGF-USA during the during-COVID-19 period compared to the pre-COVID-19 period. However, the forecast errors are smaller for the NGF-UK in the pre-COVID-19 period compared to the during-COVID-19 period.

The predictive performances manifested through IA, NSF, and TI values indicate that the proposed approach is the best. The IA and NSE values are close to 1 for the whole series. For the training dataset, these values are slightly inferior as there are only 256 observations. Smaller TI values indicate superior predictive performance. The best TI values have been obtained for the whole series. As before, these values are marginally less for the training dataset. The predictive performances of the ensemble approach without MODWT decomposition are second best (Table 8). However, both the error components are relatively higher compared to the proposed approach under training and whole datasets.

In terms of the methodology, the forecast errors are least for the proposed MODWT-based ensemble technique with \(Res\) as a feature, followed by the MODWT-based ensemble machine learning technique with \(Res\) not as a feature as evident from Table 6 and 7 on both training and the whole datasets. From Table 6 and Table 8, we find that the proposed approach outperforms the ensemble machine learning technique without integrating MODWT applied to all the four original series in pre-and during-COVID-19 periods. Even the MODWT-based ensemble technique with \(Res\) not as a feature also outperforms the ensemble technique. Another interesting fact is coming out from the comparison of the second part of Tables 7 and 8. The MODWT-based ensemble machine learning technique with \(Res\) not as a feature offers no better performance than the ensemble machine learning technique. The MAE and RMSE values are higher in this case. In summary, we conclude that \(Res\) residual-driven MODWT-based ensemble machine learning approach for forecasting natural gas prices appeared to be the best compared to all other scenarios. This was possible because of the inclusion of \(Res\) of the first stage as a feature in the second stage for forecasting the future movements of natural gas prices.

Interested readers may please refer to the "Appendix" for finding the forecast error and predictive performance based on the training dataset on decomposed components with \(Res\) as a feature and \(Res\) not as a feature (Fig. 3).

Fig. 3
figure 3

Prediction of natural gas futures. Note. Figure 3 represents the comparison of prediction of Natural gas futures for the UK and USA during the mentioned periods. The x-axis shows the duration, and the y-axis of the figures (a) and (b) shows the natural gas futures for the UK in US$ per MMBtu. The y-axis of figures (c) and (d) shows the NGF-USA in US$ per MMBtu

Table 9 presents the correlation between the predicted and actual values. These values should ideally be close to 1 for a better fit model. In all three cases, the correlation values are very high and almost approach 1. This confirms the appropriateness of the predictive frameworks used for forecasting future values of natural gas future prices. For the proposed approach with \(Res\) as a factor, the correlation value is maximum for the UK-Pre, followed by UK-During, US-During, and US-Pre. Correlation values obtained for the proposed approach are higher than the values obtained for the other two approaches. In particular, the correlation values are slightly inferior for the decomposed series with \(Res\) not as a feature than the series without decomposition. So, the correlation test indicates that the proposed residual-based predictive framework is a better fit than the other two models.

Table 9 Correlation between the actual and predicted values

6 Conclusion

This research proposes a residual-driven ensemble machine learning approach for forecasting natural gas prices. We combine the MODWT decomposition technique and an ensemble machine learning algorithm to achieve the research endeavors. The introduction of the unexplained components in terms of the residuals of the previous stage to build a granular forecasting structure for the next stage is a substantial contribution that resulted in highly accurate predictions. The significant findings of this research are summarized next.

The natural gas futures markets of the UK and US have exhibited persistent trends that suggested the deployment of technical indicators as explanatory features. The feature, \(Res\) has enhanced the quality of forecasts by largely explaining the nonlinear granular components of respective original series. The efficacy of the two-stage forecasting approach is amply evident through the incorporation of \(Res\) as a feature. Despite being developed on a relatively small chunk of samples, the predictive architecture demonstrates the ability to yield high-quality predictions in challenging circumstances. The role of MODWT in improving the accuracy of obtained forecasts is imminent that conforms to previous cognate literature of MODWT based complex financial time series prediction (Jana et al., 2021). The prediction accuracy of NGF-UK in Pre COVID-19 phase has appeared to be better than the COVID-19 phase. On the other hand, the prediction accuracy of the NGF-USA has turned out to be better in the COVID-19 phase than Pre COVID-19 regime. So, there is no clear evidence of the impact of COVID-19 on the natural gas future prices was established through this study.

The most challenging part of this research was to obtain a superior forecast for the natural gas prices in the presence of a limited number of observations using machine learning. The availability of more observations will undoubtedly lead to superior predictive performance. The scope of the present work is limited to the UK and US natural gas futures markets. However, this research opens up the opportunities of applying the proposed residual-driven framework for predictive modeling of any economic and financial time series. The approach will offer more benefits for the time series that are more volatile and complex. The scalability of the presented framework makes it ideal for carrying out the predictive task for time series representing high-frequency intraday data as well. Real-life portfolio management and realignment can be conducted based on future figures. The exercise can separately be performed during normal and new normal timelines using predictive residuals of the previous stage as a key component. The unique feature processing utility described here needs further exploration to predict extremely volatile events, including market crashes.