Introduction

Small area population forecasts are used by government and business for many purposes, including planning future service provision, infrastructure, housing, market analysis, policy development, and research. For example, they inform decisions about whether to build a new school or hospital, assist companies to determine where potential markets are expanding, provide input data for other models such as those projecting transport use, water demand, or numbers of households, and in some countries directly influence the redrawing of electoral boundaries. Decisions making use of small area forecasts can involve substantial investments and affect the provision of essential services to local communities. In addition, in situations where the latest population estimates are out of date, forecasts sometimes have to be used as interim population estimates or nowcasts.

Despite these important uses, small area forecasts remain the poor cousin of forecasts at the national and large subnational region scales, which have benefitted from much more research on methods and data preparation (e.g. Booth, 2006; Mazzuco & Keilman, 2020). Unfortunately, existing cohort-component projection models for countries and large subnational regions, and accompanying fertility, mortality, and migration forecasting methods (e.g. Li & Lee, 2005; Raymer et al., 2006; Shang & Booth, 2020), are often unsuitable for small areas. Difficulties include less detailed and poorer quality demographic data (due to geocoding inaccuracy, imputation, data adjustment/suppression, among others), short time series of datasets due to boundary changes, often erratic demographic trends, zero cell counts, and random noise in the data which mask underlying patterns. In addition, the populations of small areas located in growing urban areas are strongly influenced by changes in the dwelling stock, which are not easily handled by standard cohort-component models. In practice, many small area forecasts are produced using methods which have undergone little development over the last few decades. Not surprisingly, given the challenges of data, methods, and the variability and randomness inherent in small area populations, forecasts for small areas tend to suffer from considerably higher population forecast errors than those for larger subnational regions and countries (e.g. Rayer, 2008; Tayman, 2011; Wilson & Rowe, 2011). Overall, errors tend to increase as population size decreases, with errors rising rapidly once population size declines below about 10,000 (Wilson et al., 2018).

Because of these challenges, and the importance of small area projections across the public and private sectors, it is useful to undertake an assessment of the current state of small area population forecast methodology, and offer some suggestions for research to extend and build on current methods, and fill important gaps. Existing reviews of population forecasting methods tend to focus mostly on the national scale (e.g. Alho, 2015; Booth, 2006), or national and subnational regional scales (e.g. Rees et al., 2015; Wilson & Rees, 2005), or do include small area forecasting but are in need of updating and extending (e.g. Bell, 1997; Rayer, 2015; Wilson, 2011).

This paper presents a review of the literature on small area population forecasting methods, covering items published over the 20 years 2001–2020 inclusive. Primarily, we focus on the academic literature, but also include published methodological reports by statistical offices where they contribute novel methods. The scope of the review is limited to small area population forecasting models and methods for preparing forecast assumptions, but not associated topics, such as the geographical conversion of historical data to consistent boundaries prior to the preparation of forecasts (e.g. Norman et al., 2003; Simpson, 2002; Vasan et al., 2018), or the evaluation and quality assurance of forecasts (e.g. Rees et al., 2019; Wilson, 2017a). Nor do we cover derived small area forecasts or those for population sub-groups, such as school enrolment, ethnic group, or living arrangement and household-type forecasts (e.g. Feng et al., 2020; GLA, 2018; Grip & Grip, 2020; Rees et al., 2020). These topics would benefit from separate dedicated reviews. Later in the paper, we identify research needs and make some suggestions for research avenues which we believe are worth exploring to generate more accurate, informative, and useful small area population forecasts.

There is no widely accepted definition of a ‘small area’ in the demographic literature (Rayer, 2015). Sometimes the term is used to describe geographies for which there is limited demographic data available (however this might be defined); or a particular population threshold might be specified. We use the term to refer to areas with populations under 100,000, although this cut-off is to some extent arbitrary. We also include grid cells under 25 km² or 0.05 × 0.05 degrees latitude and longitude in area. Examples of small areas include census tracts (US and Canada), electoral wards (UK), and SA2 areas (Australia and New Zealand). Strictly, in geographically large countries with concentrated population distributions, ‘small areas’ can be huge in area and might be more accurately described as geographical units containing small populations. Population is defined as the usually resident population, to be consistent with estimates and forecasts produced by national statistical offices, rather than de facto populations (the number of people present in an area at a particular time irrespective of whether they are usual residents or visitors).

We also use the term ‘forecasts’ throughout the paper but acknowledge that some small area forecasts are labelled ‘projections’ by their creators who emphasise that their numbers are not intended to be forecasts but simply the outcome of selected assumptions and models. Projections are often defined as a calculation of future population based on chosen assumptions about the future drivers of population change (which may or may not be plausible), whereas a forecast is deemed the most likely future. However, for consistency and because most users tend to interpret projections as forecasts, we primarily use the term ‘forecasts’ here.

Following this introduction, we briefly describe the approach taken in undertaking this review. The next section sets out key themes of the surveyed literature in separate sub-sections followed by a summary of the main findings, while the following section outlines priority research areas and suggests potential avenues for further investigation. An annotated bibliography of literature included in the review is provided in the Supplementary Information accompanying the paper.

Literature Search and Selection

For this review, we searched for journal papers, book chapters, and reports which

  (i) contained details on forecasting methods and approaches,

  (ii) were for small areas, and

  (iii) were published between 1st January 2001 and 31st December 2020.

We included literature which focussed on forecasting the components which directly feed into small area population forecasts, such as fertility, mortality, and migration rates. We also included methods for quantifying the uncertainty of small area population forecasts.

Our approach to searching the literature on small area population forecasting methods made use of several information sources. First, we searched for papers on Google Scholar using Harzing’s Publish or Perish (as it allows search results to be exported; Harzing, 2007), JSTOR, Scopus, Web of Science, Springer, Emerald, Wiley, Gale, PubMed, and the arXiv preprint server. Search terms used were population projection, population forecast, population prediction; small area, local area, grid square, gridded, spatial, subnational population, sub-regional population, small area population, ward-level population, county-level population, population project* (where * refers to any other letters), population predict*, and population forecast*. Additional terms used to find articles on estimating the components of small area population projections included age-specific migration, age-specific mortality, age-specific fertility, fertility, age profile, estimating, and estimation.

Second, we followed up references in selected papers which appeared to focus on small area population forecasts. Third, we added our own papers on small area forecasting which we have collected over the years. We created an annotated bibliography and drafted a brief summary of up to 100 words of the main points for each of the selected papers. Finally, we approached the corresponding authors of these papers and asked them to check over our draft summaries and correct them if necessary. We also asked if there were any other papers within the scope of the review which we may have missed. A total of 51 authors were contacted and 20 summaries were revised. Following consultations with the authors, a further 6 papers were added to the review and summarised in the annotated bibliography. The final total number of papers included in the review is 84.

Small Area Population Forecasting Methods, 2001–2020

The literature on small area population forecasting methods from the last two decades is presented in the following ten sub-sections, categorised by type of forecasting method or approach. Any categorisation such as this is to some extent subjective, and there is inevitably research which can be categorised under more than one theme.

Extrapolative and Comparative Methods

Simple extrapolative models are generally applied to project population totals rather than age- and sex-specific populations. Although atheoretical and based solely on past trends, their strengths include minimal data requirements, simple and quick calculation, and accuracy which is often comparable to, or better than, more detailed and complex methods (Smith et al., 2013: chapter 8). Comparative (or ratio) methods create a small area projection through a relationship with an independent projection for a larger geographical area (e.g. for a state, large region, or country), and are also generally applied just to population totals. Many of these simple methods date back many decades (Isserman, 1977; Openshaw & Van Der Knaap, 1983; Pearl & Reed, 1920; Pritchett, 1891; Schmitt & Crosetti, 1951; Smith, 1987; White, 1954) though they became less popular in the later decades of the twentieth century as computing power increased and more complex models gained prominence. However, over the last decade or so, a number of studies have re-examined the properties of existing simple extrapolative and comparative methods for local and small area forecasts, and introduced some new variants.

To create grid square population forecasts across the world, Hachadoorian et al. (2011) tested two simple methods, one which applied a projected share of national population (shift-share), and the other a constant share of national population growth. The forecasts covered the period 1995–2025. Over the long run both methods were found to suffer problems, such as negative populations and trend reversals. Xanthos et al. (2013) tested several share-of-population and share-of-population-growth models over 10-year historical periods for small areas in a region of Greece, finding that the lowest errors were obtained from the constant share-of-population model.

Rayer (2008) conducted a comprehensive evaluation of simple methods for forecasting the total populations of 2482 counties in the US. ‘Forecasts’ over historical periods were calculated for multiple jump-off years and projection horizons. Among the five individual extrapolative and comparative models tested, linear extrapolation proved marginally the most accurate, but the constant share-of-population model did not perform particularly well. At 10- and 20-year horizons mean errors did not vary greatly between methods; however, after 30 years the exponential extrapolation gave by far the highest mean errors. The accuracy and characteristics of several extrapolative and comparative methods were also analysed as part of a study by Wilson (2015). Retrospective small area forecasts were produced for a 10-year horizon and compared to actual population estimates in Australia, New Zealand, and England & Wales. Of the methods evaluated, the best models were found to be a modified share-of-growth model (calculated as shares of national base period growth using only positive shares, with negative shares set to zero), and a constant share-of-population model. Methods susceptible to producing negative populations or runaway growth in the long run were identified (linear, projected share-of-population, constant share-of-growth, exponential, and constant growth rate difference).
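The share-based comparative methods discussed above can be illustrated with a short Python sketch. It is a minimal illustration only and not the code used in any of the cited studies. The function names are hypothetical, and the rescaling of the remaining positive shares to sum to one in the modified share-of-growth variant is an assumption made here.

```python
def constant_share_of_population(area_pop, national_pop, national_forecast):
    """Hold the area's jump-off share of the national population fixed."""
    return area_pop / national_pop * national_forecast


def constant_share_of_growth(area_start, area_end, nat_start, nat_end,
                             national_forecast):
    """Give the area the same share of national growth it had in the base period."""
    share = (area_end - area_start) / (nat_end - nat_start)
    return area_end + share * (national_forecast - nat_end)


def positive_share_of_growth(areas_start, areas_end, nat_end, national_forecast):
    """Modified share-of-growth: negative base period shares are set to zero.

    Assumption (not from the source): the remaining positive shares are
    rescaled so that they sum to one.
    """
    growth = [max(end - start, 0.0) for start, end in zip(areas_start, areas_end)]
    total_positive = sum(growth)
    if total_positive == 0:
        return list(areas_end)  # no area grew, so hold populations constant
    national_growth = national_forecast - nat_end
    return [end + g / total_positive * national_growth
            for end, g in zip(areas_end, growth)]
```

With a declining area and growing national population, the constant share-of-growth function keeps subtracting population from that area and can eventually produce a negative value, whereas the modified variant holds the declining area constant.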

Hauer et al. (2015) employed a housing-unit model to forecast the total populations of sub-county areas of Georgia, USA, with the number of housing units being projected by a Linear/Exponential model (linear if base period growth was positive; exponential if it was negative) fitted to a 70-year base period of dwelling counts. The advantage of this composite model is that negative populations and runaway growth are avoided. Linear/Exponential extrapolation was also applied by Hauer et al. (2016) to forecast the populations of census block groups in US counties expected to be severely impacted by sea level rises in the coming decades. These forecasts were then used in further work which modelled internal migration flows out of coastal areas which may occur due to sea level rise (Hauer, 2017). Baker et al. (2008) projected populations via the housing-unit method, forecasting dwelling numbers using a logistic curve because it mimics the growth in dwelling stock as new housing estates are developed. In their application to urban census tract populations in New Mexico, USA, the authors concluded that this approach worked well over a short-term forecast horizon.
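The switching rule behind the Linear/Exponential model can be sketched as follows. This is a simplified illustration under stated assumptions: the function name is hypothetical, and the change is calculated from just the first and last values of the base period, whereas the cited studies fitted their models to longer series of dwelling counts.

```python
def linear_exponential(start_value, end_value, base_years, horizon_years):
    """Project end_value forward by horizon_years.

    Linear change is used if the base period grew (or was flat), and
    exponential change (constant annual rate) if it declined.
    Assumption: only the two end points of the base period are used.
    """
    if end_value >= start_value:
        annual_change = (end_value - start_value) / base_years
        return end_value + annual_change * horizon_years
    annual_rate = (end_value / start_value) ** (1 / base_years)
    return end_value * annual_rate ** horizon_years
```

For an area that fell from 1,000 to 500 dwellings over 10 years, a linear projection would reach zero after a further 10 years and turn negative thereafter, while the exponential branch halves the count every 10 years and always remains positive.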

Also included under the general heading of extrapolation are time series models. An example can be found in Athukorala et al. (2010), who forecast the total populations of two local government areas in Queensland, Australia, using Autoregressive Integrated Moving Average (ARIMA) models. Walters and Cai (2008) evaluated the Holt-Winters (exponential smoothing), ARIMA, and linear and exponential extrapolative models. They created several sets of retrospective forecasts for US states and local areas in Virginia, USA. The authors conclude that both Holt-Winters and ARIMA models are roughly comparable in accuracy to cohort-component models, and that Holt-Winters is preferable to ARIMA because of its more intuitive nature. Nonetheless, time series models are more complex and time-consuming to implement than simple extrapolative or comparative models, and usually require more data (Tayman et al., 2007); they are not commonly used to forecast small area populations.
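For annual population totals, which have no seasonal cycle, Holt-Winters smoothing reduces to Holt's linear trend method. The sketch below implements that reduced form in plain Python. The function name and the fixed smoothing parameters are illustrative assumptions; in practice the parameters would normally be estimated from the data, and the sketch does not reproduce the specification used in the cited studies.

```python
def holt_forecast(series, horizon, alpha=0.5, beta=0.5):
    """Holt's linear trend smoothing: forecast `horizon` steps past the series.

    alpha smooths the level and beta smooths the trend; both are fixed here
    for illustration rather than estimated.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        # Update the level from the new observation and the prior one-step forecast.
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        # Update the trend from the change in level.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + horizon * trend
```

On a perfectly linear series the method reproduces linear extrapolation; the two differ when recent observations depart from the earlier trend, because the smoothing gives more weight to recent change.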

Simplified Cohort-Component Methods

The simplified version of the cohort-component model, the Hamilton–Perry model (Hamilton & Perry, 1962), has experienced a revival for forecasting local and small area populations in recent years (Baker et al., 2017; Smith et al., 2013, pp. 176–179). In this model, the population is still projected by cohort, but not by demographic components. Instead, the population of a cohort at the start of a projection interval is multiplied by a Cohort Change Ratio to obtain the aged-on cohort population at the end of the interval. Young childhood populations can be forecast using Child/Woman Ratios. The model yields age-specific population forecasts but does not require fertility, mortality, or migration input data, making it applicable in data-limited circumstances which prevent the use of regular cohort-component models. The consequent limitation of this model is that fertility, mortality, and migration assumptions cannot be specified, and projected demographic components of change are not produced (although Baker et al. (2017, chapter 13) decompose cohort change into mortality and net migration components). Most published applications have been in the USA where small area demographic component data are limited.
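The mechanics of the Hamilton–Perry model can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions: the function names are hypothetical, the two base populations are taken to be exactly one age group width apart (for example, 5-year age groups and counts 5 years apart), the last age group is open-ended, and ratios are held constant over the projection interval.

```python
def cohort_change_ratios(prev, curr):
    """Cohort Change Ratios between two counts one age group width apart.

    prev and curr list population by age group, youngest first, with the
    last group open-ended. ccr[i] carries group i into group i + 1; the
    final ratio carries the last two groups into the open-ended group.
    """
    n = len(curr)
    ccr = [curr[i + 1] / prev[i] for i in range(n - 2)]
    ccr.append(curr[n - 1] / (prev[n - 2] + prev[n - 1]))
    return ccr


def project_cohorts(curr, ccr):
    """Age the current population on by one interval.

    The youngest group is left at zero; it is projected separately.
    """
    n = len(curr)
    projected = [0.0] * n
    for i in range(n - 2):
        projected[i + 1] = curr[i] * ccr[i]
    projected[n - 1] = (curr[n - 2] + curr[n - 1]) * ccr[n - 2]
    return projected


def project_children(children_now, women_now, women_projected):
    """Apply the current Child/Woman Ratio to projected women of childbearing age."""
    return children_now / women_now * women_projected
```

In a full application the cohort step would be run separately for females and males, so that the projected number of women of childbearing age is available for the Child/Woman Ratio step.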

The Hamilton–Perry model has been implemented in various ways, and with some interesting extensions. Swanson et al. (2010) used it to forecast the populations of census tracts and block groups, adjusting cohort change ratios to match population estimates early in the forecast horizon, and applying ceiling and floor limits to prevent growth becoming too high or too negative. Baker et al. (2014) used it to project the populations of urban census tracts with spatial weighting applied to preliminary forecasts. This was implemented by averaging preliminary forecasts for each census tract with those of its neighbours as defined by rook and queen contiguity matrices. The spatially weighted forecasts were found to be more accurate than the original unweighted forecasts. In related work, Inoue (2017) tested alternative ways of smoothing small area cohort change ratios and child/woman ratios. In an application to small areas of Japan, it was found that smoothing which combined both local ratios and ratios from the larger prefecture in which it was located improved the quality of forecasts relative to not applying smoothing.

Tayman and Swanson (2017) evaluated several ways of adjusting cohort change ratios and child/woman ratios throughout the forecast horizon. One option averaged these ratios over two 10-year inter-censal intervals, while a second involved extrapolating the difference in ratios over the two 10-year intervals of the base period. A third option modified local ratios over time to the same extent as ratios calculated from independent forecasts for a larger region. When applied to counties in Washington State and census tracts in New Mexico, the third option gave the most accurate forecasts.

Studies have also assessed unconstrained versus constrained Hamilton–Perry forecasts. Baker et al. (2020) evaluated the Hamilton–Perry model for census tracts across the US, with one set constrained to linear extrapolations of total population, and another remaining unconstrained. The constrained set proved more accurate. Similarly, Wilson (2016) tested several forms of cohort-component model for local areas in Australia, including the standard Hamilton–Perry model and a version constrained to total populations from an extrapolative model. The latter proved more accurate. This constrained Hamilton–Perry model gave only slightly less accurate forecasts by age (except at the highest ages) than versions of the cohort-component model incorporating migration.
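Constraining an age-specific forecast to an independently projected total can be done in several ways. The sketch below uses simple pro rata scaling, which is an assumption made here for illustration; the cited studies do not necessarily use this exact adjustment, and the function name is hypothetical.

```python
def constrain_to_total(age_forecast, target_total):
    """Scale every age group by the same factor so the forecast sums to target_total.

    Assumption: pro rata scaling, which preserves the unconstrained age
    distribution while replacing the unconstrained total.
    """
    unconstrained_total = sum(age_forecast)
    factor = target_total / unconstrained_total
    return [value * factor for value in age_forecast]
```

The design separates two tasks: the cohort model supplies the age structure, while the extrapolative model supplies the total population.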

Hauer (2019) created population forecasts by age, sex, and race for all counties of the US for 2020–2100 using the Hamilton–Perry model. He introduced one key variation: for county/race populations which were declining he applied cohort change ratios, but for growing populations he used cohort change differences (i.e. absolute numbers). This is similar to the linear/exponential composite model noted earlier. Cohort change ratios and differences were forecast using ARIMA(0,1,1) models, and all population forecasts were constrained to those from the Shared Socioeconomic Pathways for the US as a whole. An evaluation of forecasts from 2000 to 2015 revealed the composite cohort change ratio/difference approach to be slightly more accurate than using cohort change ratios or differences alone.
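The switching rule between cohort change ratios and cohort change differences can be sketched as below. This is a simplified illustration and not Hauer's implementation: the function name is hypothetical, ratios and differences are held constant rather than forecast with ARIMA models, no top-down constraining is applied, and the two base populations are assumed to be one age group width apart with an open-ended final age group.

```python
def project_ccr_or_ccd(prev, curr):
    """Project one interval ahead using ratios if the total population declined
    over the base interval, and differences if it grew.

    prev and curr list population by age group, youngest first, with the last
    group open-ended. The youngest group is left at zero because it needs a
    separate step (for example, a child/woman ratio).
    """
    n = len(curr)
    declining = sum(curr) < sum(prev)
    projected = [0.0] * n
    for i in range(n - 1):
        feeds_open_group = i == n - 2
        # The open-ended group is fed by the last two groups combined.
        base = prev[i] + prev[n - 1] if feeds_open_group else prev[i]
        start = curr[i] + curr[n - 1] if feeds_open_group else curr[i]
        if declining:
            projected[i + 1] = start * (curr[i + 1] / base)   # cohort change ratio
        else:
            projected[i + 1] = start + (curr[i + 1] - base)   # cohort change difference
    return projected
```

Ratios shrink a declining population proportionally and so cannot push it below zero, while differences add a fixed number to a growing population and so avoid compounding growth.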

Dockery et al. (2020) proposed a type of cohort model to project small Indigenous populations in rural and remote regions by sex and 5-year age group. Although reference is not explicitly made to the Hamilton–Perry model, their approach is effectively a Hamilton–Perry model in which the population is projected using cohort change differences. The unique feature is that cohort change is projected with a regression model which uses a range of demographic, social, and geographic variables. A regression model is also used to project the number of 0–4 year olds based on population numbers in older childhood ages. The method was applied to produce 5-year horizon projections of small area Indigenous populations by age and sex in regional and remote parts of Australia.

Model Averaging and Combining

The benefits of model averaging and combining were highlighted in the statistics and forecasting literature several decades ago (e.g. Bates & Granger, 1969; Clemen, 1989) but until quite recently such approaches were uncommon in demographic forecasting (though see, for example, Isserman, 1977, and Smith & Shahidullah, 1995 for explorations of some of these techniques). Averaging and combining are often found to reduce errors. This is thought to be because they draw on a wider amount of information than any one model, and because errors in different directions from individual models offset one another to some extent (Goodwin, 2009).

For small area population forecasts, evaluations of averaged or composite models generally find that they perform quite well. Rayer (2008) created retrospective total population forecasts for US counties using five simple methods, and found that a mean of all five methods and a trimmed mean based on the three remaining forecasts after excluding the highest and lowest forecasts, produced mean errors comparable to the best individual methods, or better. For sub-county areas in Florida, Rayer and Smith (2010) similarly found that averaged forecasts and trimmed means performed well. Composite models, where different methods are applied to areas based on their population size and growth rate, were found to reduce errors a little further. The Bureau of Economic and Business Research at the University of Florida has employed a trimmed mean method to produce its annual county population forecasts in Florida for some time (e.g. Rayer & Wang, 2020).
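The trimmed mean described above, in which the highest and lowest forecasts are dropped before averaging, can be sketched as follows. The function name and the example figures are hypothetical.

```python
def trimmed_mean(forecasts):
    """Mean of the forecasts left after dropping the single highest and lowest."""
    if len(forecasts) < 3:
        raise ValueError("need at least three forecasts to trim")
    kept = sorted(forecasts)[1:-1]
    return sum(kept) / len(kept)


# Five illustrative forecasts for one area, one of them an outlier.
forecasts = [1000, 1050, 1100, 1150, 1600]
simple_mean = sum(forecasts) / len(forecasts)   # 1180.0, pulled up by the outlier
trimmed = trimmed_mean(forecasts)               # 1100.0
```

Trimming limits the influence of any single method that produces an extreme forecast for a particular area, such as exponential extrapolation in a rapidly growing county.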

Averages of extrapolative and comparative models for forecasting small area population totals were found to result in modest gains in accuracy in Australia, New Zealand, and England & Wales (Wilson, 2015). The best averaged models mostly contained a constant share-of-population model (share of national population) and at least one type of share-of-growth model (national population growth), suggesting that a projection which takes an average of national and local trends is beneficial. The best composite models, based on population size and growth rate categories, also gave errors slightly better than the best individual models. Further assessment of the averaged constant share-of-population/share-of-growth model was reported in Wilson (2017b). It was applied at various geographical scales over several 10-year forecast horizons. Median Absolute Percentage Errors (MedAPE) were low, and consistently lower than those from linear extrapolation.
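The Median Absolute Percentage Error used in such evaluations is the median, across areas, of the absolute difference between forecast and actual population expressed as a percentage of the actual population. A minimal sketch, with a hypothetical function name:

```python
from statistics import median


def medape(forecasts, actuals):
    """Median Absolute Percentage Error across a set of areas."""
    absolute_percentage_errors = [
        abs(forecast - actual) / actual * 100
        for forecast, actual in zip(forecasts, actuals)
    ]
    return median(absolute_percentage_errors)
```

Because it is a median, the measure is not dominated by the very large percentage errors that often occur in a few areas with tiny populations.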

Reinhold and Thomsen (2015) evaluated several versions of the cohort-component model applied to districts in Lower Saxony, Germany. Retrospective forecasts were created for a 10-year horizon and compared to population estimates. The models tested consisted of the cohort-component model with and without migration, and the cohort-component model constrained to total populations from simple linear, no change, projected share-of-population, and constant share-of-population models. The mean of all forecasts gave slightly lower errors than any individual method for both total population and age-specific populations. A similar evaluation of several versions of the cohort-component model applied to local government areas in New South Wales, Australia, was undertaken by Wilson (2016). The cohort-component models tested included those with bi-regional migration flows, net migration numbers, net migration rates, a composite model (using net migration numbers for positive net migration and rates if negative), and the Hamilton–Perry model. A second set of forecasts was produced where the cohort-component forecasts were constrained to total populations from an averaged extrapolative model. The constrained forecasts proved much more accurate than the unconstrained forecasts, and the bi-regional model produced the most accurate age-specific forecasts by a small margin, followed by the composite net migration model.

Incorporating Socioeconomic Variables and Spatial Relationships

Researchers have argued that demography is an inherently spatial social science (e.g. Matthews & Parker, 2013; Voss, 2007), but small area demography is especially so. Demographic trends and patterns in small areas do not occur in isolation, but are geographically connected to other areas—most obviously through migration and short-term mobility, but also by possessing characteristics which are often similar or identical to nearby areas, such as social norms, culture, politics, housing types, environment, laws, and regulations. As summarised by Tobler (1970, p. 236) “everything is related to everything else, but near things are more related than distant things”. In demography more generally, spatial modelling and analysis have grown in importance over the last decade or two (e.g. Chi & Zhu, 2008; Gu et al., 2020; Matthews & Parker, 2013). Yet, only a few contributions to small area population forecasting incorporate explicitly spatial relationships (other than migration flows) in their modelling.

Chi (2009) investigated whether forecasts of sub-county total populations could be improved by incorporating variables correlated with population change in the modelling. Four regression models were applied to forecast the populations of minor civil divisions in Wisconsin which included explanatory variables related to an area’s demographic and socioeconomic characteristics, transport accessibility, natural amenities, and land development characteristics. Comparisons with four simple linear and exponential extrapolative models reveal the regression approach does not outperform extrapolation. Chi and Voss (2011) then created and evaluated regression models which incorporate variables from neighbouring areas. In their “spatio-temporal regression forecasting approach” they selected the seven nearest neighbouring areas and weighted variables according to inter-centroid distances. The neighbour variables included lagged population growth rates and a selection of social, economic, and environmental characteristics. Although conceptually appealing, the regression models did not produce lower errors than a simple extrapolative model when applied to minor civil divisions in Wisconsin, over the 1990–2000 period. Chi et al. (2011) applied these regression models to census tracts in the city of Milwaukee to assess whether they worked better in an urban setting. In this study, Mean Absolute Percentage Errors were found to be slightly lower than those of several simple extrapolative models. In later work, a geographically weighted regression approach was proposed and tested (Chi & Wang, 2017). When applied to minor civil divisions in Wisconsin, the geographically weighted regression model was found to be slightly less accurate than several simple extrapolative forecasts.

Other researchers have applied spatial weighting to the results of population forecasts or to input assumptions. Baker et al. (2014) created three sets of census tract forecasts using the Hamilton–Perry model, two of which used initial projections for each census tract averaged with those of its neighbours. This approach yielded lower errors than not applying spatial averaging. Inoue (2017), also using the Hamilton–Perry model, smoothed cohort change ratios and child/woman ratios for small areas by incorporating ratios for the prefecture in which each small area was located. This lowered errors relative to forecasts with no smoothing.
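Spatial averaging of preliminary forecasts can be sketched as follows. The sketch assumes equal weights for an area and each of its neighbours, with contiguity supplied as a plain adjacency dictionary; both are illustrative assumptions, as are the function name and example values, and the weighting used in the cited study may differ.

```python
def spatially_average(forecasts, neighbours):
    """Average each area's preliminary forecast with those of its neighbours.

    forecasts maps area id to preliminary forecast; neighbours maps area id
    to a list of contiguous area ids. Assumption: equal weights.
    """
    smoothed = {}
    for area, value in forecasts.items():
        values = [value] + [forecasts[n] for n in neighbours.get(area, [])]
        smoothed[area] = sum(values) / len(values)
    return smoothed


preliminary = {"A": 100, "B": 200, "C": 300}
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
smoothed = spatially_average(preliminary, adjacency)
```

Note that averaging changes the sum across areas unless neighbourhoods are symmetric in size, so a final adjustment to a control total may be needed.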

Housing-Led Population Projections

At the small area scale, population size and composition are often closely associated with the number and types of dwellings (Bell, 1997; Dittgen & Dutreuilh, 2005; Franzén & Karlsson, 2010; Myers, 1990). The housing-unit method has long been used by demographers to create population estimates (Bryan, 2004; Swanson & Tayman, 2012) as well as forecasts (Bell, 1997; Foss, 2002). The total population is calculated as the number of private dwelling units multiplied by the proportion of dwellings occupied by usual residents multiplied by average household size, plus the number of people living in non-private (institutional) dwellings. The method is computationally simple and, with accurate estimates of the component terms, it provides accurate population numbers. A limitation is that it only generates population totals (i.e. no age and sex detail), although these totals can be produced for different dwelling types. In forecasting applications, the accuracy of the housing-unit model depends on the reliability of recent data and assumptions made about each of the individual inputs to the model. In many urban areas, planning policies and systems mean that the future number of private dwellings which will be built over the next 5 or 10 years is known with a reasonable degree of certainty. Future change in average household size and the proportion of dwellings occupied may be harder to estimate.
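The housing-unit calculation described above can be written directly as a function. The function and parameter names, and the example values in the usage note, are illustrative.

```python
def housing_unit_population(private_dwellings, occupancy_rate,
                            average_household_size, non_private_population):
    """Total population from the housing-unit method.

    occupancy_rate is the proportion of private dwellings occupied by usual
    residents; non_private_population is the number of people living in
    non-private (institutional) dwellings.
    """
    household_population = (private_dwellings * occupancy_rate
                            * average_household_size)
    return household_population + non_private_population
```

For example, 4,000 private dwellings with 95% occupancy and an average household size of 2.5, plus 300 people in institutions, gives a population of about 9,800. A small error in the assumed household size carries through proportionally to the household population, which is why that input matters so much in forecasting applications.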

The housing-unit model regularly proves a useful tool for researchers, often in combination with other models. As noted earlier, Baker et al. (2008) used it for census tract population forecasts, employing a logistic curve to model the growth of dwellings, while Hauer et al. (2015) used it for sub-county area forecasts, applying a Linear/Exponential model to forecast dwelling numbers. The housing-unit model is also regularly used by government, academic, and private sector demographers to produce local and small area population forecasts (e.g. Portland State University Population Research Center, 2015; QGSO, 2018).

A mixed housing-unit/cohort-component approach was taken by Dittgen and Dutreuilh (2005) in preparing population forecasts for Paris. Although details of the model are not provided, the description suggests that they created a cohort-component model which separates out migration affecting existing dwellings from the population moving into new dwellings. The number of new dwellings is multiplied by a projected average household size, and then disaggregated using an assumed age–sex profile. The additional population is then added to the population produced by the cohort-component model. A similarly adapted cohort-component model, which handles population in existing and new dwellings separately, is described by Hansen (2010).

An alternative is to create dwelling-led forecasts which combine forecasts of future dwelling numbers with a headship rate or propensity household model together with a cohort-component model. The dwelling input consists of future numbers of dwellings occupied on a usual residence basis plus the number of persons living in institutional accommodation. However, average household size is not an input assumption, but one of the model outputs. This allows changes in population age structure to influence average household size. Marshall and Simpson (2009) and Simpson (2017) describe how dwelling-led projections for local areas in the UK-based POPGROUP software are produced in this way. Migration is adjusted in volume (but not age pattern) so that population and household forecasts are consistent with assumed dwelling numbers. The POPART local area population forecasting program takes a similar approach, adjusting inward and outward migration flows in an iterative calculation scheme (and indirectly adjusting numbers of births and deaths as well as populations-at-risk change) to match future occupied private dwelling numbers (Wilson, 2014).

Franzén and Karlsson (2010) propose quite a different approach in which small area migration rate age profiles for the cohort-component model are not based on noisy local data, but derived from national data and the housing mix of a small area. The key assumption is that people living in certain dwelling type/tenure categories share the same demographic characteristics throughout the country, so that national data on internal migration by dwelling type can be used at the small area scale. In an application to Sweden, the authors identified 33 housing categories using cluster analysis, all with specific in- and out-migration age profiles. For small area forecasts, the authors suggest weighting the housing-specific migration age profiles by the dwelling composition of each small area.
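The suggested weighting of housing-specific migration age profiles can be sketched as a weighted average. The sketch assumes that the weights are each housing category's share of dwellings in the small area; the function name, category labels, and rates are hypothetical, and the authors' exact weighting scheme may differ.

```python
def area_migration_profile(dwelling_counts, profiles_by_housing_type):
    """Weight national housing-specific migration rate profiles by local housing mix.

    dwelling_counts maps housing category to the number of dwellings in the
    small area; profiles_by_housing_type maps housing category to a list of
    age-specific migration rates. Assumption: weights are dwelling shares.
    """
    total_dwellings = sum(dwelling_counts.values())
    n_ages = len(next(iter(profiles_by_housing_type.values())))
    profile = [0.0] * n_ages
    for housing_type, count in dwelling_counts.items():
        weight = count / total_dwellings
        for age, rate in enumerate(profiles_by_housing_type[housing_type]):
            profile[age] += weight * rate
    return profile


local_mix = {"detached": 300, "apartment": 100}
national_profiles = {"detached": [0.04, 0.08], "apartment": [0.12, 0.24]}
local_profile = area_migration_profile(local_mix, national_profiles)
```

The local profile is therefore smooth by construction, because it is built from national profiles rather than from noisy local migration counts.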

‘Downscaling’ and Disaggregation Approaches

Research on ‘downscaling’ (or disaggregating) national or large regional population forecasts to small areas has emerged over recent years, primarily as part of the literature on the Shared Socioeconomic Pathways (scenarios of macro-scale global social, economic, and natural systems change to facilitate studies of adaptation to climate change; O’Neill et al., 2014). In practice, the methods often involve a mixture of national forecast disaggregation and small area forecasting, but always with the application of top-down constraining. This stream of research appears to be relatively distinct from other work in small area population forecasting, and is generally more focussed on ‘big picture’ global patterns of population with spatial detail for many decades into the future, whereas most other small area forecasting studies are more geographically and temporally limited in extent and have an applied planning and policy focus. Many of the downscaling methods start with national-level population forecasts produced by the United Nations Population Division, the International Institute for Applied Systems Analysis (IIASA), or national statistical offices, and small area grid square population estimates produced as part of the Gridded Population of the World datasets (https://sedac.ciesin.columbia.edu/). Given the ‘big picture’ emphasis and huge amount of data involved in these studies, most use assumptions which considerably simplify reality. Many models in the ‘downscaling’ literature have not undergone comprehensive forecast accuracy assessments.

Some studies apply relatively simple methods. For example, Breidenbach et al. (2019) created age–sex population forecasts for 1 km² grid squares of Germany for the period 2015–2050. They applied a cohort-component model for each grid square, assuming national mortality rates for each area, area-specific fertility rates, zero internal migration, and net international migration distributed according to the population distribution. Merkens et al. (2016) created small grid square forecasts of coastal population totals across the world from 2010 to 2100. Their method divided each country into 4 ‘zones’ (coastal-urban, coastal-rural, inland-urban, and inland-rural) and assumed recent growth rate differences between zones would continue, with forecasts constrained to independent national population and urbanisation forecasts. Grid square forecasts were created by assuming the grid square population will grow at the same rate as the zonal population.

Other studies have used more complex methods involving multiple models. McKee et al. (2015) prepared small grid square population forecasts for the contiguous USA. First, they created county forecasts by age and sex using a cohort-component model with net migration rates. Second, constraining to US Census Bureau State and national forecasts was applied. Third, the county forecasts were disaggregated to grid squares according to a “potential development coefficient” based on a set of variables which included land use, roads, urban accessibility, and current population. Boke-Olén et al. (2017) took a similar approach to create gridded population forecasts for Africa. National scale forecasts from IIASA were disaggregated using spatially detailed data on roads, water bodies, country borders, and distance to urban centres. Chen et al. (2020) applied various machine learning algorithms to forecast the populations of tiny 100 m² grid squares for China for the period 2015–2050.

Zoraghein and O’Neill (2020a, 2020b) also created small area population forecasts for the USA, though for 1 km² grid squares and for population totals only. Existing State population forecasts were downscaled to grid squares using a gravity-type model, specified separately for urban and rural areas. Forecast population growth was allocated to each grid square proportional to its assumed ‘suitability for growth’ based on several socioeconomic, land use, and environmental variables, while forecast population decrease was distributed according to the inverse of suitability.
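
The allocation rule described here (growth in proportion to suitability, decline in proportion to its inverse) can be sketched in a few lines. This covers only the final allocation step under assumed, strictly positive suitability scores; it does not reproduce the gravity-type model used to derive suitability, and the function name and values are hypothetical.

```python
def allocate_change(populations, suitability, total_change):
    """Distribute a State-level population change across grid squares.

    Growth is shared in proportion to suitability; decline is shared in
    proportion to the inverse of suitability. Scores must be positive.
    """
    if any(score <= 0 for score in suitability):
        raise ValueError("suitability scores must be strictly positive")
    if total_change >= 0:
        weights = [float(score) for score in suitability]
    else:
        weights = [1.0 / score for score in suitability]
    weight_total = sum(weights)
    return [
        population + total_change * weight / weight_total
        for population, weight in zip(populations, weights)
    ]


grid_populations = [100.0, 200.0, 300.0]
grid_suitability = [1.0, 2.0, 2.0]
after_growth = allocate_change(grid_populations, grid_suitability, 50.0)
after_decline = allocate_change(grid_populations, grid_suitability, -40.0)
```

With these inputs the least suitable grid square receives the smallest share of growth but the largest share of decline. The sketch does not prevent a grid square from falling below zero under large decreases.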

Some other studies involving disaggregation to small areas fall outside the climate-themed literature. They take a variety of approaches. Stimson et al. (2012) describe the creation of a large-scale urban model to forecast population, housing, employment, and other socioeconomic variables for the region of South East Queensland, Australia. Small area population forecasts are closely related to the future pattern of dwelling growth. Future dwelling growth for the region as a whole is allocated to small grid squares using a model which mimics the land development process and considers residential land availability, proximity to existing development, and the accessibility of local areas to services and transport. In New Zealand, Cameron and Cochrane (2017) also used a detailed land-use model, the Waikato Integrated Scenario Explorer (WISE), to disaggregate population forecasts for larger regions created by a cohort-component model to smaller area units. Several regression models were fitted to distribute population. An evaluation over the recent past compared the forecasts with those of two simple models, finding the authors’ forecasts were competitive with one of the simple models and slightly less accurate than the other.

Other approaches apply a variety of disaggregation techniques. Li and Corcoran (2011) evaluated alternative dasymetric methods to distribute population forecasts for the region of South East Queensland to small areas based on their dwelling density category, with each density category assigned a proportion of regional population growth. They concluded that a method involving four density classes produced the best results. The classic shift-share method was adapted by Alonso González et al. (2015) to disaggregate regional age–sex population forecasts in Spain to small areas. Local area population growth is modelled as the function of three effects: regional total population growth (regional effect), regional change by age group (age effect), and local area age-specific growth (local effect). The data requirements are modest and the calculations relatively simple.

Several studies involve the disaggregation of outputs of cohort-component models applied to demographic data from higher geographies. In Kanaroglou et al. (2009), the authors develop a model which combines a multiregional cohort-based demographic model and an aggregated spatial multinomial logit model to forecast small area populations. Migration flows are estimated by disaggregating migration data available at the municipal level by using migrant and destination area characteristics (e.g. migrant’s age, number of schools, distance to city centre). Jiang et al. (2007) present a Hidden-Markov model-based small area population forecasting method which incorporates a hierarchical framework, such that data from larger geographies can be used to support small area forecasts. Compared to the cohort-component model, the proposed method has lower data requirements, is suited to producing longer-term forecasts, and can output prediction intervals. Jannuzzi (2005) couples a regional-level cohort-component method with a system of differential equations, based on models of population dynamics used in ecology, to disaggregate population forecasts to municipalities.

Small Area Microsimulation

In contrast to most small area population forecasting methods, microsimulation models, by definition, operate at the scale of individuals rather than populations. Consequently, they require considerably more input data and data preparation than macro-scale models, but possess several beneficial features, including rich output detail across many population characteristics (van Imhoff & Post, 1998). They also avoid the need for constraining between geographical scales because all outputs are aggregated from the individual scale. Two broad categories of spatial microsimulation model may be distinguished: dynamic models in which transitions between demographic states are modelled explicitly, and static models in which base period data are re-weighted to aggregate constraints from macro-scale population estimates and projections (Ballas et al., 2005a; Tanton, 2014). Because they do not explicitly model demographic processes, “static spatial microsimulation models are best for ‘next day’ analyses” based on what-if? policy scenarios (Tanton & Edwards, 2013, p. 4). A pedagogical example of small area microsimulation is given in Lomax and Smith (2017).

Examples of dynamic spatial microsimulation models can be found in Ballas et al. (2005b), Marois and Bélanger (2014, 2015), Wu and Birkin (2013), and Wu et al. (2011). We briefly describe two models here. Ballas et al. (2005a) created SMILE (Simulation Model for the Irish Local Economy) to produce short-term population forecasts for small areas of Ireland for the periods 1991–1996 and 1996–2002. In the model, mortality probabilities are based on age, sex, and location, while fertility probabilities are dependent on age, marital status, and location. Probabilities of internal migration are obtained by randomly sampling from census data based on an individual’s age, sex, and county. International migration is excluded due to data limitations. The Mean Absolute Percentage Error (MAPE) for the total populations of small areas was 6.4% for the 1991–1996 projections. Marois and Bélanger (2014, 2015) describe the development and application of a microsimulation model for forecasting the populations of municipalities within the metropolitan region of Montreal, Canada. A key feature is the use of conditional logistic regression to allocate internal and international migrants to municipalities based on distance to the city centre, current municipality size, municipality development potential, population composition, and geographic location. The model was applied to produce forecasts from 2011 to 2031. Validation over the 2006–2011 period revealed a MAPE of 3.4% for municipality total populations.
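
For readers unfamiliar with the error measure quoted in these validations, MAPE is the average across areas of the absolute forecast error expressed as a percentage of the observed population. A minimal sketch, with invented forecast and observed values for three small areas, is:

```python
def mean_absolute_percentage_error(forecasts, actuals):
    """Average of |forecast - actual| / actual * 100 across areas."""
    if len(forecasts) != len(actuals) or not actuals:
        raise ValueError("need equal-length, non-empty inputs")
    errors = [
        abs(forecast - actual) / actual * 100.0
        for forecast, actual in zip(forecasts, actuals)
    ]
    return sum(errors) / len(errors)


# Hypothetical forecast and observed populations for three small areas.
forecast_populations = [1050.0, 480.0, 2100.0]
observed_populations = [1000.0, 500.0, 2000.0]
mape = mean_absolute_percentage_error(forecast_populations, observed_populations)
```

Because each area contributes equally regardless of its size, MAPE for small areas is sensitive to large percentage errors in areas with very small populations.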

An example of a static spatial microsimulation model for small area population forecasting is described by Harding et al. (2011) and Vidyattama and Tanton (2010). The model SpatialMSM forecasts population and household characteristics for Statistical Local Areas (SLAs) of Australia (Harding et al., 2011). It takes independent macro-level forecasts of SLA populations by age and sex from 2007 to 2027, disaggregates them by labour force status, and then re-weights the base period small area micro-level dataset to achieve consistency with the area/age/sex/labour force forecasts.
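
The re-weighting idea can be illustrated with iterative proportional fitting (raking). This technique is swapped in purely for illustration: the text above does not specify the algorithm used by SpatialMSM, and the record categories, targets, and function name below are all hypothetical.

```python
def rake_weights(weights, memberships, targets, iterations=50):
    """Re-weight micro records so weighted totals match forecast margins.

    weights     -- initial weight for each micro record
    memberships -- one list per constraint variable, giving each record's
                   category index for that variable
    targets     -- one list per constraint variable, giving the forecast
                   total for each category
    """
    adjusted = [float(weight) for weight in weights]
    for _ in range(iterations):
        for categories, target in zip(memberships, targets):
            # Current weighted total in each category of this variable.
            current = [0.0] * len(target)
            for weight, category in zip(adjusted, categories):
                current[category] += weight
            # Scale every record so its category hits the forecast total.
            adjusted = [
                weight * target[category] / current[category]
                if current[category] > 0 else weight
                for weight, category in zip(adjusted, categories)
            ]
    return adjusted


# Four micro records classified by age group (0/1) and labour force status (0/1).
age_group = [0, 0, 1, 1]
labour_force = [0, 1, 0, 1]
new_weights = rake_weights(
    weights=[1.0, 1.0, 1.0, 1.0],
    memberships=[age_group, labour_force],
    targets=[[30.0, 70.0], [40.0, 60.0]],   # forecast totals for one area
)
```

The targets for each constraint variable must sum to the same area total for the procedure to converge on weights that satisfy all margins.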

Machine Learning

Machine learning methods have proved very successful in a range of fields such as speech recognition (Nassif et al., 2019), translation (Wu et al., 2016), computer vision (Voulodimos et al., 2018), bioinformatics (Li et al., 2019), and in games such as chess (Silver et al., 2018). However, some research suggests that these methods are often less accurate than traditional statistical methods for forecasting work. Makridakis et al. (2018) set out to evaluate the accuracy of forecasts produced by machine learning methods (including Long Short-Term Memory (LSTM) models, Bayesian Neural Networks, and Regression Trees) against traditional statistical methods (such as Exponential Smoothing, ARIMA, and an average of Simple Exponential Smoothing, Holt, and Damped exponential smoothing). They found that traditional methods were more accurate and had lower computational requirements than machine learning methods. The authors suggest that the reason why many articles claim superior machine learning performance is that they do not include a comparison with a suitable benchmark method (particularly a traditional statistical model).

Riiman et al. (2019) evaluated an artificial neural network method known as the Long Short-Term Memory (LSTM) model for forecasting the total population of Alabama counties at 10-year horizons, and compared the results with forecasts from a cohort-component model, which produced a MAPE of 6.5% at 10 years. They produced machine learning forecasts using two data types (annual population estimates and decennial census populations) and two training regimes (training models on data from all counties and training a separate model for each county), giving four forecasts in total. When models were trained on one county at a time, they produced lower errors than when models were trained on data from all counties (6.3% MAPE compared to 16.7% MAPE for annual population data, and 5.0% compared to 6.1% for decennial census populations). Thus, the LSTM model produced a significantly better forecast than the cohort-component model when a model was trained for each county and decennial data were used. The Alabama counties had relatively long time series available to train the models (decennial census data from 1910 to 2000, and mid-year population estimates from 1969 to 2000), a time series length not always available for small area populations.

Several papers have also applied machine learning methods to the demographic component inputs of small area population forecasts. Weber (2020) used several machine learning methods and data from 2005 to 2009 to predict net migration rates at the municipal level in Germany for ‘education migration’ (ages 18–24) and ‘family migration’ (ages 0–17 and 30–49) for the period 2011–2015. The author reported forecasting performance by correlating observed and predicted values. Net migration rates for education migration achieved R² > 0.5 but R² = 0.25 was the best result for family migration. This suggests that machine learning methods with short time series could not adequately support forecasts of migration.

Several studies have applied such methods in forecasting related to the Shared Socioeconomic Pathways population scenarios. Chen et al. (2020) applied machine learning techniques to predict residential land use, and subsequently to forecast small grid square populations from 2015 to 2050, with forecasting constraints taken from the population scenarios. The methods used included two decision tree-based ensemble methods, XGBoost (Georganos et al., 2018) and random forest (Belgiu & Drăguţ, 2016), and a neural network algorithm as implemented in Hu et al. (2019). Striessnig et al. (2019) prepared county population forecasts by broad age group for the USA consistent with various Shared Socioeconomic Pathway scenarios for the period 2000–2100. Instead of relying on the cohort-component model, they used regression trees to forecast the share of a county’s population in each broad age group based on past demographic characteristics. Census population data from 1980 to 2000 were used for model training. The age group shares were then applied to separate county population totals and constrained to national forecasts by age. Forecasts were compared against actual data during the 2000–2010 period, revealing an R² of about 0.9.

Estimating and Projecting Small Area Component Input Data

Another area of the small area population modelling literature focuses on estimating and forecasting the separate components of population change: fertility, mortality, and migration. As is the case with small area estimation in general, one of the main challenges in obtaining estimates of components of population change at the local level is dealing with small or zero event counts, which make the underlying trends or patterns unclear. Additionally, data on components may come from multiple sources, each with its own measurement issues and biases, which need to be reconciled within the same estimation framework.

Because data in small areas are often sparse, many of the methods used for larger areas are unreliable. Scherbov and Ediev (2011) demonstrated the difficulty of creating robust life table statistics for small areas. They showed that traditional life table construction methods often give unacceptably high estimation bias and standard errors for population sizes below 5000. Several approaches have been proposed to deal with this problem. Anson (2018) describes a multilevel model which uses national mortality rates to support the estimation of subnational mortality rates and applies it to 87 subnational units in Belgium to construct age–sex-specific mortality rates. Congdon (2014) produced county-level life expectancy estimates in the US based on a structured random effects model with a regression extension. This method allows for the consideration of various area characteristics, such as the ethnic mix, population density, and socioeconomic indices, which enable stable life expectancy estimates to be formed with minimal amalgamation of counties. Dyrting (2020) and Gonzaga and Schmertmann (2016) both describe estimation of small area demographic rates based on TOPALS, a relational model developed by de Beer (2012) which can be used to smooth and project various age-specific demographic rates. It assumes that an age schedule can be represented as a combination of a standard age schedule and spline-smoothed ratios of local to standard rates. Gonzaga and Schmertmann (2016) developed a TOPALS-based method to estimate age-specific mortality rates for small areas with incomplete death registrations in Brazil. Dyrting (2020) extended this method using a penalised-splines approach, where the smoothness of the fit is controlled with a single parameter, and then applied it to estimate small area age-specific migration schedules in the Northern Territory, Australia.

Simpson and Snowling (2011) evaluated three methods for preparing input data for small area cohort-component forecasts where input data were not available (or affordable). Methods evaluated included a No Local Variation approach which uses the same fertility, mortality, and migration inputs for all small areas within a wider region. The Local Calibration option constrains the No Local Variation assumptions by using total recorded births and deaths in each area, and estimates age–sex-specific migration indirectly from two sets of population estimates. The Local Direct comparison projection uses small area fertility, mortality, and migration age profiles of rates, directly estimated from available data. The authors conclude that the Local Calibration option gives the most plausible forecasts.

While there are several approaches to support the estimation of population components in small areas, a large part of the recent literature has focussed on developing Bayesian methods for estimation and forecasting. Bayesian methods have gained traction in demographic estimation since the United Nations Population Division began to utilise a Bayesian framework to produce its regular national and global population forecasts (Raftery et al., 2012). These methods have since been employed to estimate fertility, mortality, and migration at the global, regional, national, and subnational levels (e.g. Alexander et al., 2017; Alkema & New, 2014; Alkema et al., 2011; Assunção et al., 2005; Azose & Raftery, 2015; Leknes & Løkken, 2020; Schmertmann & Hauer, 2019; Schmertmann et al., 2013). Bryant and Zhang (2018) describe a Bayesian statistical framework which uses multiple, often unreliable, data sources to estimate and forecast entire demographic systems (including births, deaths, internal migration, and international migration) with age, sex, area, and time details.

In brief, Bayesian methods allow population processes to be conceptualised in a statistical framework which can combine both information from observed data and also prior beliefs about likely patterns in demographic processes. These methods are particularly suited to modelling demographic components, because they facilitate combining multiple data sources and dealing with missing data, and allow prior information on likely patterns across age and time to be incorporated. The use of statistical models allows patterns to be readily smoothed across time and information to be shared across space, and automatically accounts for and estimates uncertainty in forecasts.

Bayesian methods for small area demographic estimation first appeared in Peter Congdon’s work on estimating local-level life expectancies in the UK (Congdon, 2009). This approach of using Bayesian multilevel models has been extended by others studying mortality and other demographic processes. For example, Wakefield et al. (2019) use a spatio-temporal Bayesian model to estimate child mortality at the subnational level in data-sparse contexts. Alexander et al. (2017) build on classical demographic model lifetable techniques to estimate small area age-specific mortality schedules. Jonker et al. (2012) show that a Bayesian Random Effects approach outperforms traditional life expectancy estimations. Traditional life expectancy estimations tend to generate unacceptably large biases and standard errors for populations of fewer than 5000 person years, whereas the Bayesian approach allows estimates to be produced down to 2000 person years (Jonker et al., 2012). Bryant and Zhang show how these types of methods can be employed to estimate small area internal migration rates in both New Zealand and Iceland (Bryant & Zhang, 2016; Zhang & Bryant, 2020). A model providing a fuller demographic account for subnational regions of New Zealand is presented in Bryant and Graham (2013), where estimates are created using multiple data sources, including census and administrative data (e.g. tax system data, school enrollments, and electoral roll). In a different application of Bayesian methods, Schmertmann and Hauer (2019) develop methods to estimate fertility rates given population age structures. While not explicitly dealing with small populations, these methods are potentially useful in the small population context because often only population counts may be available.

Forecast Uncertainty

Over the last three decades, the literature on quantifying population forecast uncertainty has grown considerably in scope and complexity. Methods have become increasingly sophisticated, probabilistic forecasts have been published for many case study countries, and several software packages have been made publicly available (Keilman, 2018, 2020). A few statistical agencies have now adopted probabilistic methods to prepare their official population forecasts (e.g. Statistics Netherlands, 2020; Statistics New Zealand, 2020; UN, 2019). Some researchers have focussed on uncertainty for large subnational regions (e.g. Bertino et al., 2014; Rees & Turton, 1998; Swanson & Tayman, 2014; Wilson & Bell, 2007; Wisniowski & Raymer, 2016) but very little attention has been devoted to small area population forecasts, where errors tend to be highest and the need for quantifying uncertainty the greatest. This neglect is not due to a lack of interest from users. In a recent survey, the majority of subnational population forecast users stated that they would like to receive information about population forecast uncertainty (Wilson & Shalley, 2019). Among the methods which quantify uncertainty, Tayman (2011) makes a broad distinction between empirically based prediction intervals based on analyses of historical forecast errors, and model-based intervals generated by probabilistic and statistical models. Model-based intervals are quite data-hungry and complex to produce but provide an almost unlimited set of prediction intervals; empirically based intervals provide less information but require less data and avoid complex model estimation.

Few papers describe probabilistic methods for quantifying local and small area forecast uncertainty. Bayesian methods for population forecasting estimate uncertainty as a natural by-product of the estimation process (Bryant & Zhang, 2018; Bryant & Graham, 2013). Of the non-Bayesian methods, one early contribution, due to Gullickson and Moen (2001), presented probabilistic population forecasts for one county in Minnesota and the rest of the State. A cohort-component model was run 1000 times using fertility rates from a random walk with drift model, Lee–Carter mortality forecasts, and migration rates obtained by sampling randomly from recent net migration rates. Although relatively simple by today’s standards, this was an innovative contribution at the time. A probabilistic cohort-component model was also created by Cameron and Poot (2011) to produce forecasts of district populations in the Waikato region of New Zealand. To keep the model relatively simple, they multiplied age-specific rates by factors drawn randomly from separate distributions for fertility, mortality, and net migration. These factors were maintained for all age-specific rates throughout the forecast horizon.
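
The random-factor idea can be illustrated with a heavily simplified simulation. The sketch below replaces the age-specific cohort-component model with crude annual rates applied to the total population, and assumes normally distributed factors with a standard deviation of 0.1; neither choice comes from Cameron and Poot (2011), and all names and values are hypothetical.

```python
import random


def simulate_final_populations(base_population, birth_rate, death_rate,
                               net_migration_rate, years, runs=1000,
                               factor_sd=0.1, seed=0):
    """Return sorted final-year populations from repeated random-factor runs.

    In each run one factor per component is drawn and then held constant
    for the whole forecast horizon, as in the approach described above.
    """
    rng = random.Random(seed)
    finals = []
    for _ in range(runs):
        fertility_factor = max(0.0, rng.gauss(1.0, factor_sd))
        mortality_factor = max(0.0, rng.gauss(1.0, factor_sd))
        migration_factor = rng.gauss(1.0, factor_sd)
        annual_growth = (birth_rate * fertility_factor
                         - death_rate * mortality_factor
                         + net_migration_rate * migration_factor)
        finals.append(base_population * (1.0 + annual_growth) ** years)
    return sorted(finals)


def percentile(sorted_values, q):
    """Nearest-rank style percentile of an already sorted list (0 <= q <= 1)."""
    return sorted_values[round(q * (len(sorted_values) - 1))]


simulated = simulate_final_populations(10000.0, 0.012, 0.008, 0.005, years=10)
interval_80 = (percentile(simulated, 0.1), percentile(simulated, 0.9))
```

Holding the factors fixed across the horizon keeps the model simple but implies perfectly correlated errors over time, which tends to widen long-horizon intervals relative to models with independent annual shocks.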

In contrast to this simulation approach, Rayer et al. (2009) created empirically based prediction intervals. To do so, they prepared retrospective forecasts of total population for 2482 counties in the US over many past decades using seven trend extrapolation methods. Final forecasts were created from a trimmed mean of the five remaining forecasts after excluding the highest and lowest for each area, and their errors were used to estimate 90th percentile predictive intervals for use in later forecasts. The authors found that in most circumstances, the 90th percentile error from one period proved quite accurate in predicting the 90% range of absolute percentage errors obtained in a subsequent period. This empirical approach is applied by the Bureau of Economic and Business Research at the University of Florida to create 75% prediction intervals for county population forecasts in Florida (Rayer & Wang, 2020). The prediction intervals are allocated according to population size and recent growth rate.
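
The two steps described above, a trimmed mean of several extrapolations followed by an interval derived from past absolute percentage errors, can be sketched as follows. The seven forecast values and ten historical errors are invented, and the symmetric interval around the point forecast is a simplifying assumption of this sketch.

```python
import math


def trimmed_mean_forecast(forecasts):
    """Average the forecasts after dropping the single highest and lowest."""
    if len(forecasts) < 3:
        raise ValueError("need at least three forecasts to trim")
    trimmed = sorted(forecasts)[1:-1]
    return sum(trimmed) / len(trimmed)


def empirical_interval(forecast, past_abs_pct_errors, level=0.9):
    """Interval whose half-width is the chosen percentile of past errors.

    past_abs_pct_errors -- absolute percentage errors from earlier forecasts
    for comparable areas and horizons (nearest-rank percentile is used).
    """
    ordered = sorted(past_abs_pct_errors)
    rank = min(len(ordered), max(1, math.ceil(level * len(ordered))))
    half_width = ordered[rank - 1] / 100.0 * forecast
    return forecast - half_width, forecast + half_width


# Seven hypothetical trend extrapolations for one county.
extrapolations = [980.0, 1000.0, 1010.0, 1020.0, 1030.0, 1050.0, 1200.0]
point_forecast = trimmed_mean_forecast(extrapolations)

# Hypothetical absolute percentage errors from a previous forecast round.
historical_errors = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 20.0]
lower, upper = empirical_interval(point_forecast, historical_errors, level=0.9)
```

In practice the historical errors would be drawn from areas with similar population size and growth rate, as in the Florida application.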

Similarly, Wilson et al. (2018) created empirical prediction intervals for total population forecasts from an evaluation of 30 years’ worth of local area population forecasts in Australia. 80th percentile intervals were estimated from historical errors, smoothed across population size categories, and then applied to contemporary local area population forecasts based on jump-off year population sizes. Past error distributions were also used to estimate the ‘shelf life’ of a population forecast to indicate how far into the future the forecast could be regarded as still ‘suitable for consumption’. It was defined in this case as the number of years into the forecast horizon the 80% prediction interval would remain within 10% absolute percentage error. Up to 10% error was regarded as a reasonable quality forecast.
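
Under this definition, shelf life can be read directly from a series of interval half-widths by forecast horizon. The sketch below uses invented half-widths (in absolute percentage error terms) for a single population size category; the function name is hypothetical.

```python
def shelf_life(interval_half_widths_pct, max_error_pct=10.0):
    """Count the years for which the prediction interval stays within the limit.

    interval_half_widths_pct -- 80% interval half-width (as % error) for
    forecast horizons of 1, 2, 3, ... years, in that order.
    """
    years = 0
    for half_width in interval_half_widths_pct:
        if half_width > max_error_pct:
            break
        years += 1
    return years


# Hypothetical 80% interval half-widths for horizons of 1 to 7 years.
half_widths = [2.0, 4.1, 6.0, 7.8, 9.5, 11.2, 13.0]
usable_years = shelf_life(half_widths)   # 5: the limit is first exceeded in year 6
```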

Summary

We briefly summarise the key contributions of the literature reviewed in the sub-sections above in Table 1 below.

Table 1 Key contributions in small area population forecasting methods, 2001–2020

Research Needs

The review of small area population forecasting methods above documents many useful and novel contributions over the last two decades. But it also underlines the modest size of the literature on this topic, the many gaps and limitations, and the need for more research to enable the production of more accurate and informative small area population forecasts. Table 2 summarises what we consider to be the most important research needs in small area population forecasting.

Table 2 Some key research needs in small area population forecasting

Small area population forecasts depend crucially on past demographic data being accurate, complete, and up-to-date. Unfortunately, this is often not the case. Accuracy tends to be lower for small areas than large subnational regions, some small area demographic data are unavailable (often the most important component, migration), and population estimates and components are regularly not published until 1–2 years after the reference date. Small area data also need to be based on a consistent set of geographical boundaries, preferably the current set, so that recorded demographic change is real and not just an artefact of boundary change (Norman et al., 2003; Rees et al., 2004). Although past demographic data were outside the scope of our review, a prerequisite for improving small area population forecasting is to have accurate, complete, and timely past and current small area demographic data. Much of this will require efforts from national statistical offices (perhaps including greater use of linked administrative datasets), but researchers can also contribute methods for geographical data conversion and more up-to-date estimation and nowcasting.

Where small area component data are available, a common challenge is dealing with sparse, noisy, and/or inaccurate data. For forecasting purposes, it is best to forecast smooth underlying rates, not noisy base period data. Our review notes several contributions to estimating and smoothing small area rates, particularly through the application of Bayesian methods. The next step is to refine methods and make user-friendly tools available. An additional challenge arises when available data are perturbed or suppressed in some manner by data providers for confidentiality reasons. More research is needed to devise strategies and methods to recover or estimate data where this issue exists.

It would also be helpful if historical validation routinely occurred when new forecasting methods are proposed. A key question to ask is: if a forecasting method was applied n years ago, how well would it have forecast the population for the most recent year we have a population estimate for? And would it have done better than existing methods? While many studies address these questions, some of the forecasting methods covered in the review did not undergo any validation, or were subject to only a very limited evaluation. This was particularly the case for some of the downscaling methods and microsimulation models. Evaluations of new methods can reveal important information about their forecast characteristics, typical forecast accuracy, and whether they improve on existing methods. In addition to historical validation, there has been limited evaluation of ‘real’ small area population forecasts produced by statistical offices and researchers. This too can reveal important findings about methods that work well, and those that do not. It would be helpful if all such evaluations used a consistent set of error measures to enable comparisons to be made easily between different studies. Ideally, demographers would agree on a recommended standard set of error measures to be applied in all forecast evaluations.

The historical validation of methods would be made much easier if there were publicly available small area demographic datasets. Ideally, the datasets would include population estimates for many commonly used spatial units (e.g. census tracts, wards, SA2 areas) and many countries so that forecasting methods could be evaluated in a variety of demographic environments. Ideally, the datasets would span several decades of population data on a consistent set of geographical boundaries, and include not just population estimates but also the demographic components of change. Data for several countries might shed light on why different extrapolative and comparative small area forecasting methods proved most accurate in different countries (“Extrapolative and Comparative Methods” section). Small area demographic data covering a useful period of time are currently quite difficult to obtain. We therefore wish to invite researchers and statistical offices to share small area population estimates (where permitted), and accompanying component data time series if available, to be uploaded to the GitHub page at https://demographic-datasets-network.github.io/.

Most small area forecasting methods require some base period data, either to extrapolate past population numbers into the future, or to estimate recent fertility, mortality, and migration rates or patterns. But sometimes, a small area starts the forecast horizon with a population of zero and no demographic history. This is the case for small areas on a city fringe where construction will begin on a new housing estate in the near future. The housing-unit model could be used to forecast total population numbers, but any age–sex detail will need to be provided by additional data or another model, such as the cohort-component model (along with fertility, mortality, and migration assumptions). The literature currently provides little guidance on how such forecasts are best prepared (especially migration rate age profiles), so further research to enable good quality forecasts to be produced for these types of areas would be helpful.

Similarly, many current methods are poorly suited to forecast the populations of small areas undergoing major changes which are likely to re-shape the area’s population size and age structure, such as major housing redevelopment projects or the construction of large-scale institutional accommodation. In addition, many existing modelling approaches cannot easily incorporate major shocks to demographic trends, such as those caused by COVID-19 or the closure of the local area’s main employer. Methods which are able to handle these sorts of changes would likely improve the quality and plausibility of small area population forecasts. A related challenge is how to deal with those small area populations, often in rural and remote regions, experiencing sustained long-term decline. Should their populations be forecast to decline for many decades ahead, possibly reaching zero at some point? It would be useful to develop methods which forecast a plausible minimum population size based on theory and past experience.

Forecast accuracy is one of the key concerns of demographers creating new forecasting methods, and our review documents numerous efforts to improve forecast accuracy using a variety of methods. While this is challenging at the small area scale, several studies show that averaging or combining models often lowers forecast errors. Further research along these lines to investigate model combinations not yet tested would seem sensible. In addition, further exploration of machine learning methods, spatial statistics, and Bayesian methods, which are growing in popularity in the social sciences, would be valuable to determine whether forecast accuracy could be improved further. But it is important not to ‘throw the baby out with the bathwater’. Our view is that these newer methods are best combined with proven existing demographic models, to draw on the strength of both new and established methods.
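The simplest form of model combination is a weighted average of the forecasts produced by several models for the same areas. The following is a minimal sketch of that idea only, not the procedure of any particular study; the model names, area labels, and population values are hypothetical.

```python
def average_forecasts(forecasts, weights=None):
    """Combine forecasts from several models into one forecast per area.

    forecasts: dict mapping model name -> dict mapping area -> forecast population.
    weights:   optional dict mapping model name -> weight (equal weights if omitted).
    """
    models = list(forecasts)
    if weights is None:
        weights = {model: 1.0 for model in models}
    total_weight = sum(weights[model] for model in models)
    areas = forecasts[models[0]].keys()
    return {
        area: sum(weights[m] * forecasts[m][area] for m in models) / total_weight
        for area in areas
    }


# Hypothetical forecasts for two small areas from two illustrative models.
model_forecasts = {
    "linear_extrapolation": {"A": 5200.0, "B": 900.0},
    "constant_share": {"A": 5000.0, "B": 1100.0},
}
combined = average_forecasts(model_forecasts)
```

Unequal weights, for example based on past performance, can be supplied through the `weights` argument, although whether they outperform a simple average is an empirical question.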

One of the themes absent in the literature review is how best to produce forecasts which are consistent between geographical scales (Rees, 1994). Small area population forecasts are often produced as part of a wider set of forecasts at multiple geographical scales (e.g. small area, State/province, and national). A common approach is to apply top-down constraining to the projected subnational demographic components of change, in which the components for each area are scaled by the same factor. Where components are not projected, an alternative is to adjust population growth using, for example, the plus-minus technique (Smith et al., 2013). But sometimes considerable amounts of constraining are required, effectively revising assumptions by a substantial margin. An alternative is to apply a bottom-up method in which higher geography forecasts are simply the sum of smaller area forecasts. Although a few methods for ensuring forecast consistency exist (e.g. Li & Hyndman, 2021; van Imhoff & Keilman, 1991), research is needed to find optimal, and ideally simple, ways of creating assumptions and forecasts which are consistent between geographical scales.
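Top-down constraining with a single scaling factor can be sketched in a few lines. The example below is illustrative only: the function name, area labels, and numbers are hypothetical, and it shows proportional scaling of one component (births), not the plus-minus technique.

```python
def constrain_top_down(area_values, control_total):
    """Scale small area values by one common factor so they sum to control_total."""
    unconstrained_total = sum(area_values.values())
    if unconstrained_total == 0:
        raise ValueError("cannot scale: small area values sum to zero")
    factor = control_total / unconstrained_total
    return {area: value * factor for area, value in area_values.items()}


# Hypothetical projected births for three small areas, and an independently
# projected total for the larger region that contains them.
projected_births = {"A": 120.0, "B": 80.0, "C": 50.0}
constrained_births = constrain_top_down(projected_births, control_total=275.0)
```

Here the unconstrained small area births sum to 250, so every area is scaled up by a factor of 1.1. A factor far from 1 signals that the small area assumptions are being revised substantially, which is the concern raised above.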

Another geography-related challenge is to investigate the best ways to output consistent forecasts for multiple custom geographies. While geographical conversion is relatively straightforward for historical data using various spatial interpolation methods (Comber & Zeng, 2019), the equivalent for forecasts is more complex because population growth is unlikely to be uniform across a small area and ancillary data for interpolation are required for future years. In addition, the geographical conversion of internal migration projections is far from simple because some inter-area flows become intra-area flows, and vice versa. One approach would be to create forecasts for ultra-small areas which can be aggregated on a best-fit basis to most custom geographies, but forecasting ultra-small area populations presents a huge challenge.

Also requiring further research is the quantification of forecast uncertainty (Tayman, 2011). Estimates of uncertainty are provided for a few sets of small area forecasts, as the review shows, but mostly uncertainty is not considered. Where measures of uncertainty are provided, they are generally empirically based intervals for total populations only. Much remains to be done to provide users with comprehensive estimates of uncertainty for small area populations by age and sex and for projected components. A challenge exists in adapting probabilistic methods to cope with the more limited data environment for small areas, and in preventing the forecasting process from becoming very complex and time-consuming. Also needed are user-friendly methods of communicating uncertainty to non-technical users of forecasts (Wilson, 2018).

We hope at least some of the above suggestions for further research are realised in coming years, but it is just as important to translate new research into practical tools. Doing so hugely increases the value of the research. Recent years have seen the creation of many useful demography tools and software packages for practitioners and researchers, some of which are available via the very useful Applied Demography Toolbox website at https://applieddemogtoolbox.github.io/. The development of additional user-friendly small area population forecasting tools and packages, based on new research, would be most helpful.

Another issue which has received little attention is understanding how forecasts are used (Diamond et al., 1990). If demographers know how forecasts are interpreted and used in planning and decision-making, then it should be possible for information about the forecasts, including their strengths and limitations, to be tailored to key user groups. And it might result in greater attention being given to particular aspects of the forecasts when preparing input data and assumptions, and validating outputs. Research on understanding more about how small area population forecasts are used and interpreted could lead to improvements which enhance their value to users.

Finally, it would be a positive development if more subnational population forecasting research were undertaken in countries other than the usual English-speaking Western countries, and by researchers from those countries. It is certainly the case that our review is dominated by research from North America, the UK, and Australia. It might also be beneficial if small area population forecasting research involved more researchers from non-demography backgrounds to cross-fertilise ideas from other disciplines.

Conclusions

In this paper, we have reviewed the literature on small area population forecasting from the last two decades, and suggested several areas suitable for further research. The scope of our review was fairly tightly defined in terms of small area size, forecasting methods, and publication dates, so we did not include related topics such as derived forecasts, quality assurance, and small area population estimation. Even within the narrow scope of the review, there are inevitably some papers we have missed. However, the limited scope of the review enabled us to cover small area forecasting methods in some detail, and we hope this is of use to researchers and practitioners.

A key conclusion from this review is that there is a discrepancy between the importance of small area population forecasting for planning purposes on the one hand, and the state of the small area forecasting methods toolbox on the other. Small area population forecasts are widely used for planning purposes and influence many substantial investment decisions. Although many important methodological contributions have been made over the last 20 years, research funding and progress lag behind those for national level, and large subnational level, population forecasting. We hope that further research, particularly along the lines of the suggestions made in Table 2, will lead to the development of better forecasting methods, and therefore more accurate and useful small area population forecasts, in the future.