Introduction

In recent years, India’s economy has grown very quickly, and the country exhibits one of the highest growth rates among developing countries. Numerous production sectors have grown: in addition to the steel and metallurgical sectors, the textile and oil refining industries have also expanded. This economic growth has increased the number of jobs in cities and the populations of large urban areas, such as Delhi and Mumbai. However, the growth of towns and the consequent need for more supplies have damaged the delicate environment of India, where there are high levels of smog, fine dust, and water pollution. Air contamination in India has worsened quickly with the increasing population, number of vehicles, and energy use, a poor transportation framework, poor land use, industrialization, and especially inadequate environmental guidelines. According to Conibear et al. (2018), sulfur dioxide (SO2), nitrogen dioxide (NO2), and particulate matter (PM) are among the pollutants responsible for this contamination. Many Indian urban communities, including Mumbai, Kolkata, and Pune, are at risk of air contamination. India’s air contamination emergency is largely due to the toxic winter air quality in Delhi and in a few urban areas in north and central India (Awasthi et al. 2016).

According to Gurjar et al. (2016), air quality in India is so poor that about 1.2 million deaths can be directly attributed to pollution. Gurjar et al. (2016) claim that one out of eight (about 12.5%) deaths in 2017 were attributable to high rates of respiratory disease, stroke, heart disease, diabetes, and lung cancer, all conditions for which a certain percentage of cases result from severe air pollution. Of the 1.2 million deaths, about 51.4% were people below 70 years of age (Solgi and Keramaty 2016). More than three quarters of the population in India are exposed to air pollution far higher than the limit set by the Indian government, which is itself four times higher than the limit set by the World Health Organization (WHO).

Since March 2020, India has been affected, like much of the world, by the COVID-19 pandemic. In a very short time, India has experienced an unprecedented increase in infections. The toll of the coronavirus emergency in India has exceeded the threshold of 100,000 cases. According to reports from Johns Hopkins University, the number of cases in India has reached 101,139, while the deaths caused to date by the virus total 3164. The country’s Ministry of Health has confirmed the accuracy of these data. The Indian Medical Research Council has announced that 2,404,267 tests have been performed nationwide. The rapid spread of the virus despite the lockdown measures imposed raises many questions. In particular, many scholars are looking for a relationship between pollution and the spread of COVID-19. The literature on this topic is very scarce and the few existing studies are very recent. However, relevant scientific literature highlights that exposure to air pollution may be relevant to virus infection spread (Chen et al. 2010; Ye et al. 2016; Chen et al. 2017; Peng et al. 2020), and more recent literature focuses on COVID-19 diffusion (Conticini et al. 2020; Setti et al. 2020; Wu et al. 2020). These latter studies concluded that air pollution is an effective determinant of COVID-19 infection spread. However, the feedback channel remains less investigated. Mitra et al. (2020) studied the atmospheric carbon dioxide (CO2) levels for the city of Kolkata (India), comparing the April 2019 and April 2020 periods. Focusing on Chinese provinces, Huang et al. (2020) analyzed the variations in primary and secondary pollution emissions during the COVID-19 lockdown. Using a different method, Wang et al. (2020) explored how emission reductions due to reduced human activities during the COVID-19 outbreak in China influenced air pollution. Becchetti et al. (2020) analyzed the data of all the municipalities and all the Italian provinces, both in terms of deaths and daily infections, in relation to pollution levels. In their study, the significant causal variables for contagion and death with COVID-19 are the combination of three factors: the lockdown measures; the level of local pollution (especially fine dust but also NO2); and the types of local production structures, in particular non-digitalized activities, which in the most acute period of the epidemic crisis had greater resistance to closure. The study estimated the difference between provinces most exposed to fine dust (Lombardy) and least exposed (Sardinia) to be around 1200 cases and 600 deaths per month, a figure that implies a doubling in mortality for the most exposed province. Basically, according to the research, coronavirus infections were higher where the air pollution was higher, although the authors specify that a causal link was not established. Becchetti et al. (2020) discuss statistical relevance, however, which suggests a strong correlation between pollution and infection/mortality. Studies on the relationship between COVID-19 and pollution present statistical analyses but fail to take into account the relationship with economic growth. Furthermore, these studies do not adopt the most modern techniques, which are based on machine learning (ML) approaches.

This paper, starting from the underlying assumption that economic growth in developing countries generates pollution, first verifies the causal link through an econometric approach: it applies the Toda-Yamamoto test to assess causality between economic growth and PM2.5, NO2, and CO2. Subsequently, the (short-term) causal link between PM2.5, NO2, and CO2 (resulting from unsustainable economic growth) and COVID-19 deaths is verified through a complex causality algorithm (D2C).

The rest of this paper is organized as follows: “Methods” shows the time series and ML methods; “Results” analyses in detail the results obtained by our algorithm; “Discussion” presents a discussion of the results and “Conclusions” reports our conclusions.

Methods

Data source and strategy: time series analysis

The econometric analysis aims to analyze the presence of a causal relationship between pollution (CO2 emissions, PM2.5, and NO2) and per capita gross domestic product (GDP). In our study, we use annual data from 1980 to 2018. Table 1 shows the sources of data used in our empirical analyses: CO2 is CO2 emissions (metric tons per capita); PM2.5 is primary particulate matter (PM); NO2 is NO2 concentration levels; PCGDP is GDP per capita in 2000 US$.

Table 1 List of the variables

For this work, to avoid distortions in the analysis, values for the variables used were calculated in logarithmic terms. Table 2 presents an exploratory data analysis.

Table 2 Exploratory data analysis

Means are positive for all variables; 10-Trim values are near the means; the interquartile range shows the absence of outliers.
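The summary statistics described above can be illustrated with a minimal sketch. It assumes that "10-Trim" denotes a mean computed after dropping 10% of the observations from each tail and that outliers are screened with the usual 1.5 × IQR fences; neither convention is stated in the text. The function names and the series `pcgdp` are hypothetical placeholders, not the data used in this paper.

```python
import math
import statistics


def log_transform(values):
    """Natural logarithm of each strictly positive observation."""
    return [math.log(v) for v in values]


def trimmed_mean(values, proportion=0.10):
    """Mean after dropping `proportion` of the observations from each tail (assumed convention)."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return statistics.fmean(kept)


def iqr_outliers(values):
    """Observations lying more than 1.5 * IQR beyond the first or third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


# Hypothetical placeholder series, NOT the paper's data.
pcgdp = [400.0, 420.0, 450.0, 490.0, 540.0, 600.0, 670.0, 750.0, 840.0, 950.0]
log_pcgdp = log_transform(pcgdp)

summary = {
    "mean": statistics.fmean(log_pcgdp),
    "trim10": trimmed_mean(log_pcgdp),
    "outliers": iqr_outliers(log_pcgdp),
}
print(summary)
```

A trimmed mean close to the ordinary mean, together with an empty outlier list, is the pattern reported for Table 2.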

The correlation analysis shows that the variables in our dataset are strongly correlated: corr (CO2, PCGDP) = 0.9745; corr (CO2, NO2) = 0.9875; corr (PCGDP, NO2) = 0.9754; corr (CO2, PM2.5) = 0.9912; corr (PCGDP, PM2.5) = 0.9245; corr (NO2, PM2.5) = 0.9675. All correlations are statistically significant (0.000).
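For reference, pairwise coefficients of this kind can be computed as in the following sketch, which assumes the standard Pearson coefficient (the text does not name the estimator). The function name and the toy inputs are illustrative only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy series (hypothetical): two variables trending upwards together.
series_a = [1.0, 2.0, 3.0, 4.0, 5.0]
series_b = [2.1, 3.9, 6.2, 8.1, 9.8]
print(round(pearson(series_a, series_b), 4))
```

High correlations are expected between series that trend together over time and do not by themselves indicate causality, which is why stationarity and causality tests are applied to the series.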

ML evidence

Following Sundararajan et al. (2017), we selected an ML algorithm capable of identifying causal effects between inputs and one or more targets. Hu et al. (2012) recommend a D2C algorithm on proportion-based causality using the Oryx 2.0.8 protocol in Apache. However, since an algorithm in ML needs many variables (remembering that the data are not interpreted as a time series), we completed mathematical transformations. So, in addition to the general logarithmic transformation, we generated the square of the considered values, the first difference, and the first difference calculated in logarithmic terms. In this way, our model computes a combination of 37,040-1 variables with an artificial intelligence approach. Basically, we adopted an empirical strategy similar to Magazzino et al. (2020a, b, c), and Mele and Magazzino (2020). We used a dataset with daily cumulative variables in time series (not considered as such by our neural networks) from January 29 to May 18, 2020.
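The four transformations listed above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline actually used: the function name `expand_features`, the output keys, and the toy series are hypothetical, the series is assumed to be strictly positive so that logarithms are defined, and the first observation is dropped so that all derived columns have the same length.

```python
import math


def expand_features(series):
    """Derive log, square, first difference, and log-difference columns from one series."""
    logs = [math.log(v) for v in series]
    return {
        "log": logs[1:],
        "square": [v * v for v in series[1:]],
        "diff": [b - a for a, b in zip(series, series[1:])],
        "dlog": [b - a for a, b in zip(logs, logs[1:])],
    }


# Toy cumulative series (hypothetical values, not the paper's data).
cumulative = [1.0, 2.0, 4.0, 7.0, 11.0]
features = expand_features(cumulative)
for name, column in features.items():
    print(name, [round(v, 4) for v in column])
```

Applying such a function to every input variable multiplies the number of columns available to the classifier, which is the purpose of the transformations described in the text.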

We performed the analysis in ML following the process shown in Fig. 1. Starting from our dataset, we increased the number of variables through mathematical transformations to obtain the large dataset required by our D2C algorithm. Subsequently, the causality model was processed and we analyzed those variables deemed significant. Once the D2C commands were imported into the Oryx software, the analysis generated the causalities mentioned above, as is typical of an ML process. Finally, we completed the predictive linear regression test and the perform training test to verify the accuracy of the algorithm.

Fig. 1 The ML process

Results

Subsequently, stationarity tests (ADF, ERS, PP, and KPSS) were performed for each time series of each variable, first on levels and then on the first differences (Table 3).

Table 3 Results for unit roots and stationarity tests

The ADF, ERS, and PP tests failed to reject the null hypothesis of a unit root for all the variables at the 5% significance level. The KPSS test, which takes stationarity as its null hypothesis, instead rejected I(0) at the 5% significance level, indirectly confirming the other tests.

To verify the causal relationship between each of PM2.5, CO2, and NO2 and per capita GDP, we used the Toda-Yamamoto test. This test of Granger non-causality allows causal inference on a VAR whether or not the series are co-integrated. Table 4 presents the results of the test carried out on our historical data series.

Table 4 Toda-Yamamoto causality tests results
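For readers unfamiliar with the procedure, the Toda-Yamamoto test can be summarized by the following textbook formulation, given here as a sketch rather than as the exact specification estimated in this paper. The symbols are assumptions introduced for illustration: E_t is one of the emission variables (CO2, PM2.5, or NO2), Y_t is PCGDP, k is the optimal VAR lag length, and d_max is the maximum order of integration of the series.

```latex
\begin{aligned}
\ln E_t &= \alpha_0
  + \sum_{i=1}^{k+d_{\max}} \alpha_i \,\ln E_{t-i}
  + \sum_{i=1}^{k+d_{\max}} \beta_i \,\ln Y_{t-i}
  + \varepsilon_t, \\
H_0 &: \ \beta_1 = \beta_2 = \dots = \beta_k = 0 .
\end{aligned}
```

The restriction is imposed only on the first k lags of Y; the additional d_max lags are left unrestricted. Under the null hypothesis that Y does not Granger-cause E, the Wald statistic is asymptotically chi-square distributed with k degrees of freedom. The test in the opposite direction swaps the roles of E and Y.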

Our main result is that India registers a clear direction of causality: there is a unidirectional causality from economic growth to PM2.5, CO2, and NO2. The results obtained are consistent with the hypothesis that, in a developing country, the relationship between economic growth and pollution follows a bell-shaped curve. The relationship between economic development and environmental sustainability is best represented by the so-called environmental Kuznets curve (EKC). The basis of this theory is the idea that developing countries tend to pollute more and that reaching a mature and stable phase of economic growth is essential for reducing environmental damage. However, for India, the downward phase of the curve has not yet been observed. Polluting emissions, therefore, still derive from unsustainable economic growth. Polluting emissions may have influenced the spread of COVID-19 in Indian territory, also causing the death of many people. This statement requires an empirical verification through the most current methodologies in ML. Therefore, we next estimated the D2C algorithm, aiming to verify the causal link between polluting emissions and COVID-19 deaths.

Table 5 presents the results of causality and significance tests to determine the relationship between the variables of interest in the study. In the model, n-filtered factors were used (which do not appear in the table), which performed the task of training the classification of our model. The self-learning machine worked as explained here. It started from a set of commands with functionality to be preset. Subsequently, as shown in the Appendix Table 6, we sequentially imported various features and parameterized our variables from letter a to letter m. Hence, ten classifiers were trained and tested to achieve the predictive causal link between our variables. These ten classifiers worked through a binary calculation sequence, alternating the values [0] with those of [1].

Table 5 Rank of predictor and significant causality results

The algorithm we used performed on average over 19,000 repetitions for each combination of causality between our variables. The closing percentage of the calculation, within the average of the repeats, was always higher than 70%. Hence, our algorithm always completed each cycle for each pair of variables. The value of the average causality was uniform for all pairs of values analyzed. To assess the significance of the predictive causality results, we parameterized the AUPRC. Results were classified as true or false according to whether the p value was lower or higher than 5%. We thus ascertained that only one causal relationship was significant within the AUPRC analysis: a unidirectional causality running from the PM2.5 value to the deaths value. Finally, we tested the accuracy of the whole process and algorithm. We applied the “Importance test” to the hyperbolic equations of the D2C to verify which input sent the most signals to the target (Deaths) (see Fig. 2).
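As background on the metric mentioned above, the area under the precision-recall curve (AUPRC) can be approximated by the step-wise average precision sum. The sketch below is a generic illustration, not the computation performed by the software used here; the function name, the labels (1 for a true causal link, 0 otherwise), and the scores are hypothetical.

```python
def average_precision(labels, scores):
    """Step-wise approximation of the area under the precision-recall curve."""
    positives = sum(labels)
    if positives == 0:
        raise ValueError("at least one positive label is required")
    # Rank the candidates from the highest to the lowest classifier score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each true positive
    return total / positives


# Hypothetical classifier output for four candidate causal pairs.
example_labels = [1, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.1]
print(average_precision(example_labels, example_scores))
```

A value of 1 means every true link is ranked above every false one; lower values indicate that false links receive higher scores than some true links.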

Fig. 2 Importance test results. Source: our elaboration with BGML

As evident from Fig. 2, PM2.5 emissions appear to have the largest influence on the target variable (deaths). This result confirms the findings from the D2C causality model. A standard method to test the loss of our model is to perform a predictive linear regression analysis (Fig. 3).

Fig. 3 Predictive linear regression test. Sources: our elaboration with NN designer

The prediction line (with respect to the target, deaths) confirms the goodness-of-fit of the elaboration with the algorithm in the final architecture. The second phase of the model algorithm’s goodness-of-fit begins with the analysis of perform training. The Quasi-Newton method was used for training (Fig. 4). This is based on Newton’s method, but does not require calculation of second derivatives. Instead, the Quasi-Newton method computes an approximation of the inverse Hessian at each iteration of the algorithm, by only using gradient information.
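To make the idea concrete, the sketch below implements BFGS, one common quasi-Newton variant, on a small test function. The text does not report which variant or settings the training software uses, so this is an assumption for illustration only; the function names, the backtracking line search, and the quadratic objective are all hypothetical choices.

```python
def bfgs(f, grad, x0, tol=1e-8, max_iter=200):
    """Minimize f with BFGS, updating an inverse-Hessian approximation from gradients only."""
    n = len(x0)
    x = list(x0)
    # Start from the identity matrix as the inverse-Hessian approximation.
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = grad(x)
    for _ in range(max_iter):
        if max(abs(v) for v in g) < tol:
            break
        # Search direction p = -H g.
        p = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        # Backtracking line search with the Armijo sufficient-decrease condition.
        step, fx = 1.0, f(x)
        slope = sum(gi * pi for gi, pi in zip(g, p))
        while f([xi + step * pi for xi, pi in zip(x, p)]) > fx + 1e-4 * step * slope:
            step *= 0.5
        x_new = [xi + step * pi for xi, pi in zip(x, p)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:  # curvature condition; skip the update otherwise
            rho = 1.0 / sy
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hyi for yi, hyi in zip(y, Hy))
            # H <- (I - rho s y^T) H (I - rho y s^T) + rho s s^T, written out term by term.
            for i in range(n):
                for j in range(n):
                    H[i][j] += ((1.0 + rho * yHy) * rho * s[i] * s[j]
                                - rho * (Hy[i] * s[j] + s[i] * Hy[j]))
        x, g = x_new, g_new
    return x


def objective(v):
    """Toy convex loss with its minimum at (1, -2)."""
    return (v[0] - 1.0) ** 2 + 10.0 * (v[1] + 2.0) ** 2


def objective_grad(v):
    """Analytic gradient of the toy loss."""
    return [2.0 * (v[0] - 1.0), 20.0 * (v[1] + 2.0)]


solution = bfgs(objective, objective_grad, [0.0, 0.0])
print(solution)
```

No second derivatives are evaluated anywhere: the matrix H is built up only from differences of successive gradients, which is the property described in the paragraph above.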

Fig. 4 Perform training test. Sources: our elaborations with NN designer

Figure 4 shows the training and selection errors in each iteration. The blue line represents the training error, while the orange line is the selection error. The initial value of the training error is 6.78169 and its final value after 29 epochs is 0.0781. The initial value of the selection error is 5.44918 and its final value after 108 epochs is 0.0008.

Discussion

The results obtained from our D2C model show that there is a one-way causal link from PM2.5 concentrations to COVID-19 deaths. This result is significant. It points out that high levels of fine particulates are related to the increase in pandemic deaths, which we elaborate on here. Atmospheric PM is composed of solid and liquid particles of microscopic dimensions, suspended in the air.

In particular, the term PM2.5 includes all powders with an aerodynamic diameter less than or equal to 2.5 μm. It is fine particulate matter sourced from all types of combustion, including that of engines in cars and motorcycles, in plants for production of electricity and from wood burning for domestic heating. Forest fires and many other industrial processes also contribute. These tiny particles can be inhaled, reaching into the deepest part of the human respiratory system and lungs. A few days inhaling at high concentrations can cause inflammation of the respiratory tract, asthma crisis, or malfunction of the cardiovascular system. Prolonged exposure can, however, generate chronic effects with symptoms such as a cough, a decrease in the lung or cardiac capacity, asthma and other inflammatory conditions, some even with fatal consequences. According to Wu et al. (2020), a 1 μg/m3 increase in air PM2.5 corresponds to a 15% increase in the mortality rate from COVID-19. Therefore, patients who have been exposed to polluted air for a prolonged period of time are at higher risk of death from complications related to coronavirus than others who have lived in areas with cleaner air. PM2.5, therefore, can greatly aggravate the symptoms of COVID-19 infection, significantly increasing the risk of mortality in patients affected by the virus.

The correlation between concentrations of PM2.5 and COVID-19 deaths is closely connected with India’s economic growth process. As mentioned above, India has one of the highest economic growth rates for a developing country, with rapid development in the steel, metallurgical, textile, and oil refining industries. This growth has led to an increased number of jobs in cities, increasing populations of large urban agglomerations such as Delhi and Mumbai. This growth and the consequent need for more supplies have affected the environment resulting in higher levels of smog, fine dust, and water pollution. In particular, we think that the connection between fine particulate matter and COVID-19 deaths happens through the process of social and economic urbanization. Since 2018, most of the world’s population have lived in urban areas and the United Nations predicts that by 2030, 60% will do so. Urbanization is likely to be one of the main future challenges in developing countries such as India. Migration from the countryside to cities is a critical factor in terms of environmental impact and pollution. According to the WHO (2018), air quality in most cities (which use systems for monitoring and detecting air pollution levels) is below the established limits. However, the WHO itself reports that in many countries with higher economic growth, such as India, urban agglomerations are characterized by high levels of pollution. In fact, Delhi currently holds the highest negative record in terms of emission of polluting particles into the atmosphere. In our dataset, we observed very high daily and hourly levels of PM2.5 in Delhi (from February 2020 to May 2020) (Fig. 5), with values above the limit recommended by the WHO (10 μg/m3).

Fig. 5 Concentration (μg/m3) PM2.5 in Delhi. Sources: our elaboration on hourly data. https://openaq.org/

These high concentrations of PM2.5 could aggravate the COVID-19 crisis in India. Our model, excluding the presence of other variables with direct causality for COVID-19 deaths, therefore supports the design of policies for rapid reduction of polluting particles, because their function as a vehicle for the virus (Dominici et al., 2020) could accelerate the number of deaths across the country.

Conclusions

This paper aimed to analyze the relationship between economic growth, polluting emissions, and COVID-19 deaths. We used two estimation models. The econometric model verified the unidirectional causal link between economic growth and PM2.5, CO2, and NO2. Our ML analysis, with D2C algorithm, showed a direct relationship between concentration of PM2.5 and COVID-19 deaths. These results confirm those of numerous very recent contributions, which discuss or present data analyses on the relationship between air pollution levels and the COVID-19 epidemic (Bianchi and Cibella 2020; Cheng et al. 2020; Conticini et al. 2020; Magazzino et al. 2020b; Schwartz et al. 2020). Particular attention is paid to the potential effects of fine particulate matter, the spread of the epidemic, and the prognosis of respiratory infections. The underlying hypothesis is that a high concentration of PM (PM10, PM2.5) makes the respiratory system more susceptible to infection and complications of coronavirus disease. The higher and more constant the exposure to PM over time (as with the elderly), the higher the probability that the respiratory system is predisposed to more severe disease.

This air pollution situation developed from rapid economic growth, as demonstrated by the time series model. Such air pollution is a characteristic of developing countries and their intensive polluting activities. Therefore, it is necessary to put in place measures that limit the damage caused by pollution. They can be summarized as follows.

  • To control the emissions arising from motor vehicles, it is best to put in place measures to adopt vehicles that emit less. Such measures would include the use of vehicles that consume less fossil fuel, which results in lower emissions due to less combustion. There should also be advocacy for cleaner fuels with lower toxic emissions to adequately reduce toxic gas emissions. In the long run, the country should consider adopting the latest technologies already adopted by developed countries, such as electrically driven vehicles. These pollutant-free vehicles need to replace the existing fuel-dependent vehicles in order to reduce emissions of combustion gases, such as carbon monoxide and sulfur dioxide, which are being released into the environment. This will help, in the long term, to reduce respiratory diseases, whose treatment is expensive for the majority of citizens.

  • Agricultural activities should be streamlined to adopt current, environmentally friendly methods. Farmers should be educated on best practices such as the use of natural fertilizers, which cause no pollution of the soil. The continued use of current fertilizers will continue to affect the environment by contaminating the soil with chemicals. Current fertilizers should be replaced with manure; farmers should collect manure from cattle and chicken droppings. Twigs and bushes should be ground and buried in the soil to create natural humus content for the crops. Burning should be discouraged at all costs, as it kills living organisms that would otherwise aerate the soil and maximize its output. The government could set aside funds to educate farmers on this, and the air pollution that has always been witnessed will be a story of the past in the state of Punjab and other states.

  • Dust, which contributes to smog during winter, can be minimized by increasing green cover in major parts of the country. Places with bare land should be covered by planting grass or other land cover such as trees to remove the source of dust. Planting trees is the better option, as trees not only create soil cover but also serve as windbreaks, thus preventing dust from moving from one point to another. If this is achieved, the formation of smog during winter will be greatly reduced; forest fires, on the other hand, can be contained by putting in place measures that curb wrongdoers such as smokers near forest areas. Clear instructions should be displayed to the public on the consequences of causing a fire in a forest. Firefighting departments should also be set up near forest areas and major towns to curb any possible incidence of fire.

  • Municipalities should create rules and guidelines on emissions and emission limits for toxic substances. Power plants should treat the gases they emit so that no harmful substance is released into the atmosphere. Such mechanisms will only be possible if the environmental agency sets out measures to guide the population and if standards are established. This will go a long way in ensuring that most of the toxic gases that cause respiratory diseases are controlled. Such control mechanisms will lead to an increase in the life expectancy of people living in India, because chronic illnesses like cancer and asthma will considerably decrease. Regular inspections of power plants should be conducted to ascertain the rate of pollution.

  • Waste management should be handled by establishing strategies for the implementation of waste treatment and recycling. The long-term goals should be to ensure that landfills are no longer present in major towns and that all waste is treated and recycled. Corporate management should do away with landfills and ensure that every firm, industry, corporation, and even individual treats their waste before disposing of it in the environment. Such measures will ensure that no toxins are disposed of. Local management should also ensure that offenders are punished severely to serve as examples. Such punishments would include the closing of factories that discharge untreated waste. Such mechanisms will deter other firms from disposing of untreated waste. The money that municipalities would spend on transporting waste from towns to landfills should instead be used to treat the waste at the source.