Revisiting the importance of temperature, weather and air pollution variables in heat-mortality relationships with machine learning

  • Research Article
  • Environmental Science and Pollution Research

Abstract

Extreme heat events have significant health impacts that need to be adequately quantified in the context of climate change. Heat-health associations have traditionally been estimated with statistical models that use a single air temperature index, without considering other heat-related variables that may influence the relationship or their potentially complex interactions. This study introduces and compares different machine learning (ML) models, which naturally account for non-linearities and interactions between predictors, to re-examine the importance of temperature, weather and air pollution predictors in modeling the heat-mortality relationship. ML approaches based on tree ensembles and neural networks, as well as non-linear statistical models, were used to model the heat-mortality relationship in the two most populated metropolitan areas of the province of Quebec, Canada. The models were calibrated using a comprehensive database of heat-related predictors, including various lagged temperature indices, temperature variations, and meteorological and air pollution variables. Performance was evaluated based on out-of-sample summer mortality predictions. In both regions, models relying only on lagged temperature indices performed as well as or better than models that also considered other heat-related predictors such as temperature variations, weather and air pollution variables. The best-performing temperature index differed by region, but mean temperature and humidex were both among the best indices. In terms of modeling approaches, non-linear statistical models performed as well as more advanced ML models in predicting out-of-sample summer mortality. This research supports the current use of non-linear statistical models with an appropriate lagged temperature index to model the heat-mortality relationship. Although ML models did not improve the modeling of all-cause mortality, these approaches should continue to be explored, particularly for other health effects that may be more directly linked to heat exposure and, in the future, when more data become available.
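As a rough illustration of the evaluation idea described in the abstract (comparing lagged temperature indices by their out-of-sample prediction error), the sketch below builds lagged averages of a daily temperature series and scores each one on held-out days. It is not the authors' code: the function names, the toy numbers and the choice of RMSE as the error metric are all assumptions, and a simple least-squares line stands in for the non-linear statistical and ML models actually used in the study.

```python
from math import sqrt


def lagged_mean(series, lag):
    """Average of each day's value and the `lag` preceding days.

    The first `lag` days have no complete window and are returned as None.
    """
    out = []
    for i in range(len(series)):
        if i < lag:
            out.append(None)
        else:
            window = series[i - lag:i + 1]
            out.append(sum(window) / len(window))
    return out


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Stand-in model only (assumption): the study uses non-linear models.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


def out_of_sample_rmse(index, deaths, n_train):
    """Fit on the first n_train valid days, score on the remaining days."""
    pairs = [(x, y) for x, y in zip(index, deaths) if x is not None]
    train, test = pairs[:n_train], pairs[n_train:]
    intercept, slope = fit_line([x for x, _ in train], [y for _, y in train])
    predicted = [intercept + slope * x for x, _ in test]
    return rmse([y for _, y in test], predicted)


# Toy values only (hypothetical, not the study's data).
temperature = [20, 22, 25, 30, 33, 31, 27, 24, 29, 34, 35, 28]
deaths = [40, 41, 42, 46, 50, 51, 47, 43, 44, 49, 53, 50]

for lag in (0, 1, 2):
    score = out_of_sample_rmse(lagged_mean(temperature, lag), deaths, n_train=8)
    print(f"lag 0-{lag}: out-of-sample RMSE = {score:.2f}")
```

The split here is chronological, so the held-out days always come after the training days. In a setting like the study's, whole summers would presumably be held out rather than the last few days of a short series.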

Data Availability

The authors do not have permission to share health data. Weather and air pollution data are freely available from Environment and Climate Change Canada (ECCC).

Acknowledgements

The authors would like to thank Denis Hamel and Louis Rochette for their help with the mortality data extraction. They would also like to thank Yohann Chiu, Magalie Canuel, Ray Bustinza, Felix Lamothe, Natalie Gravel and Annabel Ruf for their comments on early versions of this work. The authors also thank the editor and two anonymous reviewers, who helped improve the quality of this paper.

Funding

The main author has received funding from the Natural Sciences and Engineering Research Council of Canada (Vanier Scholarship, #CGV-180 821), the Canadian Institutes of Health Research (Health System Impact Fellowship, #IF1-184093), Ouranos (Real-Décoste Excellence Scholarship, #RDX-317725) and the National Institute of Public Health of Quebec (no grant number).

Author information

Contributions

Jérémie Boudreault: conceptualization, methodology, data curation, formal analysis, visualization, software, writing — original draft, review and editing, funding acquisition. Céline Campagna: conceptualization, writing — review and editing, supervision, project administration, funding acquisition. Fateh Chebana: conceptualization, writing — review and editing, supervision, project administration, funding acquisition.

Corresponding author

Correspondence to Jérémie Boudreault.

Ethics declarations

Ethical approval

This project received ethics approval from the Human Research Ethics Committee of the National Institute of Scientific Research (CER-22–693).

Consent to participate

Not applicable. Consent to participate was not required due to the nature of the data used (i.e., anonymized administrative health database).

Consent for publication

Not applicable. Consent to publish was not required due to the nature of the data used (i.e., anonymized administrative health database).

Competing interests

The authors declare no competing interests.

Additional information

Responsible Editor: Lotfi Aleya

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 303 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Boudreault, J., Campagna, C. & Chebana, F. Revisiting the importance of temperature, weather and air pollution variables in heat-mortality relationships with machine learning. Environ Sci Pollut Res 31, 14059–14070 (2024). https://doi.org/10.1007/s11356-024-31969-z

  • DOI: https://doi.org/10.1007/s11356-024-31969-z
