Elsevier

Environmental Pollution

Volume 256, January 2020, 113367

Comparison of land use regression and random forests models on estimating noise levels in five Canadian cities

https://doi.org/10.1016/j.envpol.2019.113367

Highlights

  • The RF model has higher accuracy than the LUR model at the local and global scales.

  • People living in Montreal and Longueuil are exposed to relatively high noise levels.

  • The noise estimates were assigned to postal code areas for future health studies.

Abstract

Chronic exposure to environmental noise is associated with sleep disturbance and cardiovascular diseases. Assessing population exposure to environmental noise is critical for controlling exposure and mitigating adverse health effects, but it is limited by a lack of routine noise sampling. Land use regression (LUR) modelling has recently been applied to estimate environmental exposure to noise. Machine-learning approaches offer opportunities to improve on the noise estimates from LUR models. In this study, we employed a random forests (RF) model to estimate environmental noise levels in five Canadian cities and compared the noise estimates from the RF and LUR models. A total of 729 measurements and 33 built environment-related variables were used to estimate spatial variation in environmental noise at the global (multi-city) and local (individual city) scales. Leave-one-out cross-validation suggested that noise estimates derived from the RF global model explained a greater proportion of variation (R²: RF = 0.58, LUR = 0.47) with lower root mean squared errors (RF = 4.44 dB(A), LUR = 4.99 dB(A)). The cross-validation also indicated that the RF models generally performed better than the LUR models at the city scale. By applying the global models to estimate noise levels at the postal code level, we found that noise levels were higher in Montreal and Longueuil than in other major Canadian cities.

Introduction

Chronic exposure to environmental noise has been increasingly linked to adverse human health effects (Brown and van Kamp, 2017). Long-term exposure to environmental noise is likely to lead to sleep disturbance (Clark and Paunovic, 2018a; Guski et al., 2017; Shepherd et al., 2010), cardiovascular and metabolic diseases (van Kempen et al., 2018; Clark et al., 2017), adverse birth outcomes (Gehring et al., 2014), and cognitive and mental health issues (Clark and Paunovic, 2018b). An accurate assessment of the spatial variation in noise levels is critical for assessing the human health impacts of noise exposure, for managing environmental noise sources, and for controlling and mitigating the associated adverse health outcomes. For example, the Environmental Noise Directive of the European Commission requires strategic noise mapping every five years in agglomerations with more than 100,000 residents (European Commission, 2002). However, at present no country in the world has established a population-wide noise monitoring network, even in North America and Western Europe, where noise has been more extensively studied as an environmental issue (Héroux et al., 2015).

For the purpose of noise mapping, modelling has been a common approach to address gaps in noise measurement data. Propagation models and geostatistical models (such as kriging) were the two common approaches in early noise mapping studies. Propagation models, e.g., FHWA TNM (Barry and Reagan, 1978; Shu et al., 2007), SoundPlan (Oiamo et al., 2018), CoRTN (Gulliver et al., 2015), and ASJ RTN (Koyasu, 1978), use meteorological, land surface, and transportation variables to simulate acoustic reflection, diffraction, absorption, and transmission according to the physical mechanisms of sound propagation and attenuation (Xie et al., 2011). Despite their relatively high accuracy for small areas with well-defined noise sources, laboratory-simulated noise levels deviate from actual measurements because it is not possible to fully characterize the types and distributions of noise sources, or to fully define the features of the built environment that influence sound propagation (Cvetković et al., 2011). Geostatistical models start by placing portable noise sensors to obtain noise levels at individual geographic points. Based on the principle of spatial autocorrelation, these in situ measurements can then be interpolated and output as continuous surfaces (Aguilera et al., 2015). Using ordinary kriging, Tsai et al. (2009) and Harman et al. (2016) mapped noise levels for Taiwan and Isparta, Turkey, respectively. Kriging is limited both by the density of the in situ noise measurements and by the fact that it does not consider the many geographical, environmental, and social factors that strongly influence the spatial variation of noise. Thus, geostatistical models are rarely used alone for noise mapping; they are usually combined with information on emission sources from the geographic and socioeconomic environment.

The application of land use regression (LUR) modelling is a relatively new approach to map the spatial distribution of noise levels. LUR modelling, initially developed for estimating traffic-related air pollution, is based on the principle that concentrations of pollutants at a given location depend on the environmental features of the surrounding area (Hoek et al., 2008). Several studies have used LUR modelling to estimate noise levels in Asian, North American, African, and European cities, and modelled values tend to show better agreement with noise measurements than the conventional kriging models (Xie et al., 2011; Chang et al., 2019; Harouvi et al., 2018; Ragettli et al., 2016; Sieber et al., 2017; Aguilera et al., 2015). Hybrid approaches that combine LUR with geo-statistical models (Ryu et al., 2017; Zuo et al., 2014) or noise propagation models (Oiamo et al., 2018) have also been developed in recent years.
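The LUR principle described above (the level at a site depends on features of its surroundings) can be sketched as an ordinary least squares regression. The Python example below is only an illustration under stated assumptions: the site values, the predictor names (road length in km within a buffer, NDVI), and the helper functions are invented here and are not taken from the study, and a real LUR workflow would also include predictor and buffer selection.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_lur(features, noise):
    """Return [intercept, coef_1, ...] from the normal equations X'X b = X'y."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, noise)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_lur(coefs, feature_row):
    """Predict the noise level (dB(A)) for one site."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], feature_row))


# Hypothetical sites: (road length in km inside a buffer, NDVI).
sites = [(1.2, 0.2), (0.3, 0.6), (2.5, 0.1), (0.8, 0.4), (1.5, 0.3), (0.1, 0.7)]
# Synthetic noise levels generated as 50 + 5 * road_km - 10 * ndvi.
noise_db = [54.0, 45.5, 61.5, 50.0, 54.5, 43.5]

coefs = fit_lur(sites, noise_db)
print([round(c, 3) for c in coefs])
print(round(predict_lur(coefs, (1.0, 0.5)), 3))
```

Because the synthetic noise values are exactly linear in the two predictors, the fit recovers the generating coefficients; measured data would of course leave residual error.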

A major limitation of the LUR modelling approach is its inability to capture the complex nonlinear relationships that exist between noise levels and the related characteristics of the built and social environment (predictor variables). Machine learning methods have shown utility in dealing with this nonlinearity when assessing the relationship between characteristics of the built and social environment and noise levels in urban areas. For example, as a classic machine learning method, the artificial neural network (ANN) was used in early studies to assess traffic or construction noise (Cammarata et al., 1995; Hamoda, 2008; Givargis and Karimi, 2010). Parbat and Nagarnaik (2008) and Genaro et al. (2009) improved the ANN models with a multi-layer perceptron approach to estimate traffic noise levels for Yavatmal, India and Granada, Spain, obtaining higher model accuracy than linear regression models.

Random forests (RF) is a nonparametric decision tree-based machine learning algorithm (Breiman, 2001) that is useful in overcoming the over-fitting common to decision trees, artificial neural networks, and other machine learning methods. The RF is robust, can handle multiple heterogeneous covariates, and has been successfully used to map population density (Gaughan et al., 2016; Stevens et al., 2015), soil properties (Guo et al., 2015; Hengl et al., 2015), and concentrations of air pollutants (Liu et al., 2018; Brokamp et al., 2017). Compared with LUR modelling, RF regression has several major advantages. First, the RF approach can capture the complex non-linear interactions between the dependent and independent variables so as to achieve high model accuracy (Liu et al., 2018). Second, the RF model is non-additive, which allows optimal selection of predictors to establish the best splits for regression and leads to improved model accuracy. Third, the RF model avoids outlying predictions by constraining predictions to the range of the training data (Craig and Huettmann, 2009). Mennitt et al. (2014) first utilized the RF model to estimate noise levels in national parks across the United States. To our knowledge, the RF model has not been applied to estimate the spatial distribution of noise levels in urban environments.
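The third advantage above, that RF predictions stay within the range of the training data, follows from how a forest is built: each leaf stores the mean of some training targets, and the forest averages the leaves. The Python sketch below is a deliberately small bagged regression-tree ensemble written for illustration; it is not the implementation used in the study, the data and all function names are invented, and production random forest libraries differ in many details.

```python
import random


def build_tree(rows, targets, depth, n_try, rng):
    """Grow one regression tree; a leaf is the mean of its training targets."""
    if depth == 0 or len(set(targets)) == 1:
        return sum(targets) / len(targets)
    best = None
    # Consider only a random subset of predictors at each split.
    for j in rng.sample(range(len(rows[0])), n_try):
        for threshold in sorted({r[j] for r in rows})[:-1]:
            left = [i for i, r in enumerate(rows) if r[j] <= threshold]
            right = [i for i, r in enumerate(rows) if r[j] > threshold]
            sse = 0.0
            for idx in (left, right):
                mean = sum(targets[i] for i in idx) / len(idx)
                sse += sum((targets[i] - mean) ** 2 for i in idx)
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left, right)
    if best is None:  # sampled predictors are constant in this node
        return sum(targets) / len(targets)
    _, j, threshold, left, right = best
    return (
        j,
        threshold,
        build_tree([rows[i] for i in left], [targets[i] for i in left],
                   depth - 1, n_try, rng),
        build_tree([rows[i] for i in right], [targets[i] for i in right],
                   depth - 1, n_try, rng),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(node, tuple):
        j, threshold, left, right = node
        node = left if row[j] <= threshold else right
    return node


def fit_forest(rows, targets, n_trees=50, depth=3, n_try=1, seed=0):
    """Fit each tree on a bootstrap sample of the training sites."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in sample],
                                 [targets[i] for i in sample],
                                 depth, n_try, rng))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)


# Hypothetical sites: (road length in km inside a buffer, NDVI) and dB(A).
rows = [(1.2, 0.2), (0.3, 0.6), (2.5, 0.1), (0.8, 0.4), (1.5, 0.3), (0.1, 0.7)]
targets = [54.0, 45.5, 61.5, 50.0, 54.5, 43.5]

forest = fit_forest(rows, targets)
extreme = predict_forest(forest, (10.0, 0.0))  # far outside the training range
print(min(targets) <= extreme <= max(targets))
```

A linear model fitted to the same sites would extrapolate well beyond the observed range at the extreme input, whereas the forest cannot, which is a strength for outlier control and a limitation when genuine extrapolation is needed.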

The major objectives of this study were to 1) develop RF models to estimate noise levels for five Canadian cities and 2) compare the RF estimation results with those derived from LUR models. To achieve these objectives, we identified the best predictor combinations from thirty-three candidate variables using the leave-one-out cross-validation (CVone) method for the LUR model and the variable importance index (i.e., the percentage increase in mean squared error, defined in section 3.3) for the RF model. Next, we developed RF and LUR models at the global (all five cities) and the local (individual cities) scales and compared the accuracy of their estimations. Then, we examined the importance of each predictor used in developing the RF and LUR models for noise exposure estimation. Both global RF and LUR models were applied to compare noise levels at the residential postal code level for the five cities.
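Leave-one-out cross-validation, as used above, refits the model once per site with that site held out and then scores the held-out predictions. The Python sketch below illustrates the procedure with a one-predictor linear model. The data, the function names, and the choice of R² as 1 minus the ratio of residual to total sum of squares are assumptions made for illustration; the study may define its metrics differently.

```python
import math


def fit_line(x, y):
    """Least squares fit of y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loocv_predictions(x, y):
    """Predict each site from a model fitted to all the other sites."""
    preds = []
    for i in range(len(x)):
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(intercept + slope * x[i])
    return preds


def rmse(observed, predicted):
    """Root mean squared error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """1 - SSE / SST computed on the held-out predictions."""
    mean_o = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - sse / sst


# Hypothetical sites: road length (km) in a buffer and measured dB(A).
road_km = [0.1, 0.3, 0.8, 1.2, 1.5, 2.5]
noise_db = [44.0, 46.5, 49.0, 55.0, 53.5, 61.0]

cv_preds = loocv_predictions(road_km, noise_db)
cv_rmse = rmse(noise_db, cv_preds)
cv_r2 = r_squared(noise_db, cv_preds)
print(round(cv_rmse, 2), round(cv_r2, 2))
```

The same loop applies to any model: only `fit_line` and the prediction step would change when comparing, for example, an LUR model with an RF model on identical held-out sites.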

Study area

The study cities were selected for several reasons. First, noise measurements at a reasonably fine spatial scale are available for each city. Second, the cities are representative of regional variations in Canadian urban development approaches. Vancouver is located on the west coast; Toronto, Montreal, and Longueuil are situated in central Canada; and Halifax sits on the east coast. Third, these cities have the greatest populations within each region. According to Statistics Canada (2017),

Data

The noise measurements for these five municipalities were retrieved from Ragettli et al. (2016) for Montreal (87 measurements in summer 2010 and 117 measurements in spring 2014; a total of 29 repeated sampling sites in 2010 and 2014), Oiamo et al. (2018) for Toronto (217 measurements in summer 2016 and 54 measurements in winter 2018; a total of 49 repeated sampling sites in 2016 and 2018), Rainham and Dummer (2011) for Halifax (48 two-week average measurements in fall 2010), and Davies et al.

Data preprocessing

Multiple buffers (i.e., 50 m, 100 m, 150 m, 200 m, 300 m, 400 m, 500 m, 600 m, 700 m, 800 m, 900 m, 1000 m, 1500 m, 2000 m, 2500 m, 3000 m, 3500 m, 4000 m, 4500 m, 5000 m) surrounding the individual noise monitoring sites were created. The sum lengths of traffic lines (i.e., local roads, major roads, highways, railways, and bus routes), the total numbers of bus stops, transitions, and intersections, POIs (i.e., cinemas, health centers, fire stations, education centers, and police stations), and
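The buffer-based predictors described above can be illustrated for point features such as bus stops or intersections. The Python sketch below counts points within each buffer radius of a monitoring site; it assumes coordinates in a projected system measured in metres, and the site, the point locations, and the function name are invented for illustration. Summed line lengths inside a buffer would additionally require a geometry library and are not shown.

```python
import math

# Buffer radii in metres, as listed in the text.
BUFFER_RADII_M = [50, 100, 150, 200, 300, 400, 500, 600, 700, 800, 900, 1000,
                  1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000]


def count_within_buffers(site, points, radii):
    """Return {radius: number of points within that distance of the site}."""
    distances = [math.hypot(px - site[0], py - site[1]) for px, py in points]
    return {radius: sum(d <= radius for d in distances) for radius in radii}


# Hypothetical monitoring site and bus stops (projected coordinates, metres).
site = (0.0, 0.0)
bus_stops = [(30.0, 30.0), (0.0, 120.0), (600.0, 700.0),
             (3000.0, 3500.0), (6000.0, 0.0)]

counts = count_within_buffers(site, bus_stops, BUFFER_RADII_M)
print(counts[50], counts[150], counts[1000], counts[5000])
```

Each radius yields one candidate predictor per feature type, and a "best buffer" is then the radius whose predictor performs best in the model.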

The best buffer and predictors

Table S3 lists all candidate predictors, the predictors used, and their best buffers in the LUR and the RF models at the global and the local scales. The RF model employed more predictors than the LUR model at the global scale (30 vs. 25) but used fewer predictors for the city models. Compared with the global models, fewer predictors were adopted in the local models. Similar predictors, e.g., NDVI, population density, traffic flow, distance to highway, number of intersections, sum length

Discussion

This study is the first attempt to incorporate a machine learning-based RF modelling approach for estimating the spatial distribution of noise in urban areas at such a large geographical scale. To assess the superiority of the RF approach, we developed the predictive models at the local level (each of the five cities) and global level (all five cities) and compared the outcomes with the traditional LUR model. The RMSEs, MAEs, and fitting R² values by cross-validation indicated the RF modelling

Limitations and conclusion

This study has several limitations. First, we did not consider temporal variation in estimating the spatial variation of noise levels across the five cities. Second, noise measurements used in this study were derived from sampling campaigns that employed diverse measurement periods. For example, one-week measurement durations were used in Toronto, Montreal, and Longueuil while noise in Halifax was measured over a two-week period. Noise levels in Vancouver were based on short-term

Acknowledgements

This work was supported by the Canadian Urban Environmental Health Research Consortium grant funded by the Canadian Institutes of Health Research. The authors acknowledge Dr. Jeffrey Brook for his leadership in obtaining this grant that allowed this work to be performed.

References (72)

  • H. Halim et al. Effectiveness of existing noise barriers: comparison between vegetation, concrete hollow block, and panel concrete. Procedia Environ. Sci. (2015)
  • B.I. Harman et al. Performance evaluation of IDW, Kriging and multiquadric interpolation methods in producing noise mapping: a case study at the city of Isparta, Turkey. Appl. Acoust. (2016)
  • G. Hoek et al. A review of land-use regression models to assess spatial variation of outdoor air pollution. Atmos. Environ. (2008)
  • C. Huang et al. Satellite data regarding the eutrophication response to human activities in the plateau lake Dianchi in China from 1974 to 2009. Sci. Total Environ. (2014)
  • K. Huang et al. Predicting monthly high-resolution PM2.5 concentrations with random forest model in the North China Plain. Environ. Pollut. (2018)
  • S. Keola et al. Monitoring economic development from space: using nighttime light and land cover data to measure economic growth. World Dev. (2015)
  • J. Kragh. Road traffic noise attenuation by belts of trees. J. Sound Vib. (1981)
  • Y. Liu et al. Improve ground-level PM2.5 concentration mapping using a random forests-based geostatistical approach. Environ. Pollut. (2018)
  • T.H. Oiamo et al. A combined emission and receptor-based approach to modelling environmental noise in urban environments. Environ. Pollut. (2018)
  • H. Ryu et al. Spatial statistical analysis of the effects of urban form indicators on road-traffic noise exposure of a city in South Korea. Appl. Acoust. (2017)
  • N. Shu et al. Comparative evaluation of the ground reflection algorithm in FHWA Traffic Noise Model (TNM 2.5). Appl. Acoust. (2007)
  • C. Steele. A critical review of some traffic noise prediction models. Appl. Acoust. (2001)
  • K.T. Tsai et al. Noise mapping in urban environments: a Taiwan study. Appl. Acoust. (2009)
  • A. Wang et al. Automated, electric, or both? Investigating the effects of transportation and technology scenarios on metropolitan greenhouse gas emissions. Sustain. Cities Soc. (2018)
  • Y. Xu et al. Evaluation of machine learning techniques with multiple remote sensing datasets in estimating monthly concentrations of ground-level PM2.5. Environ. Pollut. (2018)
  • F. Zuo et al. Temporal and spatial variability of traffic-related noise in the City of Toronto, Canada. Sci. Total Environ. (2014)
  • I. Aguilera et al. Application of land use regression modelling to assess the spatial distribution of road traffic noise in three European cities. J. Expo. Sci. Environ. Epidemiol. (2015)
  • P. Banerjee et al. GIS based spatial noise impact analysis (SNIA) of the broadening of national highway in Sikkim Himalayas: a case study. AIMS Environ. Sci. (2016)
  • T.M. Barry et al. FHWA Highway Traffic Noise Prediction Model (1978)
  • L. Breiman. Random forests. Mach. Learn. (2001)
  • A. Brown et al. WHO environmental noise guidelines for the European region: a systematic review of transport noise interventions and their impacts on health. Int. J. Environ. Res. Public Health (2017)
  • A.P. Carvalho et al. Sound and noise in urban parks
  • X. Chen et al. Using luminosity data as a proxy for economic statistics. Proc. Natl. Acad. Sci. (2011)
  • C. Clark et al. WHO Environmental noise guidelines for the European Region: a systematic review on environmental noise and quality of life, wellbeing and mental health. Int. J. Environ. Res. Public Health (2018)
  • C. Clark et al. WHO environmental noise guidelines for the European Region: a systematic review on environmental noise and cognition. Int. J. Environ. Res. Public Health (2018)
  • C. Clark et al. Association of long-term exposure to transportation noise and traffic-related air pollution with the incidence of diabetes: a prospective cohort study. Environ. Health Perspect. (2017)

    This paper has been recommended for acceptance by Eddy Y. Zeng.
