7.1 Background

Forecasts of life expectancy have become essential in the estimation of future health care and pension costs and in planning social security policies. Demand for accurate mortality forecasts is high and new models are being introduced each year. One of the most commonly used is the Lee-Carter (LC) model (Lee and Carter 1992), which forecasts age-specific death rates in a log-linear way. Most high-income countries have recorded a log-linear decline of their age-specific death rates, as well as a linear increase of their life expectancy (White 2002). Given these regularities, linear extrapolation is a justifiable approach to predict future mortality and is at the foundation of most forecasting models (Booth and Tickle 2008). However, when mortality development is not linear, reliance on such an assumption can be problematic.

Signs of stagnation in period life expectancy were observed in many low-mortality countries during the second half of the twentieth century. For example, life expectancy stagnated in Eastern European countries between the 1960s and 1980s, in the Netherlands between 1988 and the early 2000s (especially for females) and in Denmark in the 1980s. While each case of stagnation is unique, behaviors such as drinking and smoking play an important role in non-linear mortality development (Vallin and Meslé 2004; Stoeldraijer 2019). Cohort effects are also at play in some countries: stagnation or slower decline in mortality can result from the childhood living conditions of certain birth cohorts, or from their harmful behavior in adulthood, such as smoking (Lindahl-Jacobsen et al. 2016; Janssen and Kunst 2005). This chapter explores the difficulty in forecasting mortality when breaks in the trends are observed, using the example of Denmark.

In the 1950s, Denmark had one of the world’s highest life expectancies for both sexes, but fell behind many other European countries in the following decades (Jarner et al. 2008). In particular, female life expectancy stagnated during the 1980s and did not make significant gains until the mid-1990s (Christensen et al. 2010). This stagnation has been mainly attributed to high death rates for generations born between the two World Wars, due to high smoking prevalence and other risk factors (Lindahl-Jacobsen et al. 2016). Since the mid-1990s, life expectancy in Denmark has increased at a similar rate to that of other high-income countries, but continues to lag behind Sweden, a country similar to Denmark in many societal aspects (Christensen et al. 2010).

Such broken trends render forecasting more complex. Should the irregularities of the past be used in forecasting? The official forecasts of life expectancy in Denmark are based on data from 1990, to lower the effect of the stagnation period. However, Danish life expectancy is currently catching up with that of other high-income countries and the recent increase might not be representative of a long-term trend.

This chapter summarizes the conclusions of the Forecasting Danish Life Expectancy and Age at Retirement Workshop, held on December 10, 2018 in Odense, Denmark, and is divided into three main sections. First, methodological issues relating to forecasting mortality in Denmark are discussed. Second, the forecasting results and accuracy of different models are compared for Danish females and males and for both cohorts and periods. Third, implications of the different forecast models for Danish society are presented, both in terms of age at retirement and lifespan variability.

7.2 Methods

Danish official forecasts are based on the LC model (Lee and Carter 1992), with an adjustment of the initial parameters using the Lee and Miller (2001) variant and based on data since 1990 only (Hansen and Stephensen 2013). Whether the approach is optimal has not, however, been demonstrated. In this chapter, 11 models to forecast Danish life expectancy are compared (Table 7.1). The list of models is far from exhaustive, but provides an overview of a range of available forecast models.

Table 7.1 Summary of the forecast models compared

7.2.1 Period Forecasts

The models compared here are extrapolative. The extrapolative approach is often preferred by statistical offices (Booth and Tickle 2008; Stoeldraijer et al. 2013). The models were selected based on their use of different indicators. Bergeron-Boucher et al. (2019) show that the use of different indicators for forecasting leads to significant differences in the results. Other forecasting models could have been used, but we have limited the list to the models enumerated in Table 7.1, both because they cover the variety of life table indicators and to limit the number of cross comparisons. For each indicator, at least one model is a coherent model (see Sect. 7.4.1 for further discussion on coherent models), with the exception of model no. 9 based on statistical moments, as coherent models following such an approach have not been developed.

The first model involves applying a random walk with drift (RWD) to age-specific logged death rates. This approach is a simple log-linear extrapolation over time $t$ of death rates ($m_{x,t}$) at each age $x$ independently (Bell 1997).
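As an illustration, the point forecast of a random walk with drift can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the code used for the chapter: the function name `rwd_forecast`, the array layout (ages in rows, years in columns) and the toy rates are all hypothetical, and prediction intervals are not covered.

```python
import numpy as np

def rwd_forecast(log_m, horizon):
    """Point forecast of a random walk with drift, fitted to each age independently.

    log_m: array of shape (n_ages, n_years) holding log death rates (assumed layout).
    Returns an array of shape (n_ages, horizon).
    """
    n_years = log_m.shape[1]
    # The estimated drift is the mean annual change, i.e. (last - first) / (n_years - 1).
    drift = (log_m[:, -1] - log_m[:, 0]) / (n_years - 1)
    steps = np.arange(1, horizon + 1)
    # Extrapolate linearly from the last observed value at each age.
    return log_m[:, [-1]] + drift[:, None] * steps[None, :]

# Toy data (not Danish data): two ages whose log rates fall by 0.01 and 0.02 per year.
years = np.arange(10)
log_m = np.log(np.array([0.010, 0.050]))[:, None] + np.array([-0.01, -0.02])[:, None] * years
forecast = rwd_forecast(log_m, horizon=5)
```

Because each age is treated separately, nothing in this sketch prevents the forecast age pattern from becoming irregular over long horizons.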

The second model is the Lee-Carter (LC) model. Lee and Carter (1992) popularized the use of the age-specific death rates and principal component analysis to forecast mortality. This method has been extensively used and many extensions have been suggested (Brouhns et al. 2002; Hyndman and Ullah 2007; Li et al. 2013; Li and Lee 2005; Booth et al. 2002, 2006; Lee and Miller 2001; Alho 1998). The model decomposes a centered matrix of log death rates indexed by time and age, using a singular value decomposition (SVD), into an overall level of mortality over time and the age-specific responses to this level. The time-level is extrapolated using time series models with a linear deterministic trend. The method has many advantages, including simplicity, easily interpretable parameters, and minimal subjective judgment (Booth and Tickle 2008). However, the age-specific responses, which can be interpreted as the age-specific rates of mortality improvement if multiplied by the time-level, are constant over time in this model, while evidence shows that they have been increasing, especially at older ages (Kannisto et al. 1994; Booth and Tickle 2008).
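The decomposition described above can be sketched with NumPy's SVD. This is a simplified illustration, assuming a rank-one approximation, the usual normalisation (age responses sum to one) and a random walk with drift for the time index, whose point forecast is a straight line. The function names and toy inputs are hypothetical, and refinements such as the Lee and Miller (2001) adjustment are left out.

```python
import numpy as np

def lee_carter_fit(log_m):
    """Fit log m[x, t] ~ a[x] + b[x] * k[t] using the first SVD component."""
    a = log_m.mean(axis=1)                     # average age profile
    centered = log_m - a[:, None]              # centred matrix of log death rates
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    b = U[:, 0]
    k = s[0] * Vt[0, :]
    # Normalise so that b sums to 1; k sums to 0 because each row was centred.
    scale = b.sum()
    return a, b / scale, k * scale

def lee_carter_forecast(a, b, k, horizon):
    """Extrapolate k with a random walk with drift and rebuild log death rates."""
    drift = (k[-1] - k[0]) / (len(k) - 1)
    k_future = k[-1] + drift * np.arange(1, horizon + 1)
    return a[:, None] + np.outer(b, k_future)

# Toy data that follow the model exactly (three ages, 21 years).
a_true = np.array([-4.0, -3.0, -2.0])
b_true = np.array([0.5, 0.3, 0.2])
k_true = np.linspace(10.0, -10.0, 21)
log_m = a_true[:, None] + np.outer(b_true, k_true)

a_hat, b_hat, k_hat = lee_carter_fit(log_m)
log_m_future = lee_carter_forecast(a_hat, b_hat, k_hat, horizon=5)
```

Since `b` is fixed, every age keeps the same relative pace of improvement throughout the forecast, which is the limitation noted in the paragraph above.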

The third model is the Li-Lee (LL) model, which is an extension of the LC model to coherent forecasts for a group of populations (Li and Lee 2005). The LL model is based on the idea that closely related populations – e.g., provinces in a country or neighboring countries – are likely to have similar mortality trends. Forecasting such populations separately tends to increase their differences. Li and Lee (2005) thus suggest forecasting the average of the populations using the LC model and then forecasting the population-specific deviations from this average using a stationary process. With this approach, the population-specific mortality trends are constrained so that they do not extensively diverge from the average.
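In symbols, the structure described above is commonly written as follows. The notation is an assumption made here for illustration and is not taken from this chapter: $i$ indexes the population, $B_x K_t$ is the common factor fitted to the average of the group, and $b_{x,i} k_{t,i}$ is the population-specific deviation.

```latex
\log m_{x,t,i} = a_{x,i} + B_x K_t + b_{x,i}\, k_{t,i} + \varepsilon_{x,t,i}
```

The common index $K_t$ is extrapolated with a trend, as in the LC model, while $k_{t,i}$ is modeled with a stationary process (for example a first-order autoregressive process), so that in the long run the forecast death rates of each population move in parallel with those of the group.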

Rather than using age-specific death rates, Oeppen (2008) suggests using the life table distribution of deaths ($d_{x,t}$) to forecast mortality with Compositional Data Analysis (CoDA). CoDA is a framework to deal with compositional data, which are defined as positive values representing part of a whole and summing to a constant (e.g., percentages) (Pawlowsky-Glahn and Buccianti 2011). By treating life table deaths as compositional data and using a CoDA framework, the deaths are constrained to vary between 0 and the life table radix (e.g., 1 or 100,000), which conditions the relationship between components. Bergeron-Boucher et al. (2017) show that, by using Oeppen’s CoDA approach, the rates of mortality improvement increase over time, providing more optimistic and less biased forecasts than the LC model. The fourth model is an adaptation of the LC model to CoDA using life table deaths distribution (Oeppen 2008) and the fifth model is an adaptation of the LL model to CoDA (Bergeron-Boucher et al. 2017). These models are respectively called CoDA and CoDA coherent (CoDA-C).
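The sketch below illustrates how an LC-type decomposition might be carried out on life table deaths within a CoDA framework, using the centred log-ratio (clr) transformation. It is an assumption-laden illustration rather than the published implementation: the helper names, the radix of 1, the rank-one approximation and the random walk with drift for the time index are choices made here, and the toy composition is invented.

```python
import numpy as np

def closure(x, radix=1.0):
    """Rescale each column (one year) so that it sums to the radix."""
    return radix * x / x.sum(axis=0, keepdims=True)

def clr(d):
    """Centred log-ratio transform applied to each column."""
    log_d = np.log(d)
    return log_d - log_d.mean(axis=0, keepdims=True)

def coda_forecast(d, horizon):
    """d: life table deaths, shape (n_ages, n_years), each column summing to 1."""
    # Age-specific geometric mean over time, used to centre the compositions.
    alpha = closure(np.exp(np.log(d).mean(axis=1, keepdims=True)))
    z = clr(closure(d / alpha))
    U, s, Vt = np.linalg.svd(z, full_matrices=False)
    b, k = U[:, 0], s[0] * Vt[0, :]
    # Random walk with drift on the time index (an assumption of this sketch).
    drift = (k[-1] - k[0]) / (len(k) - 1)
    k_future = k[-1] + drift * np.arange(1, horizon + 1)
    # Back-transform: exponentiate, add the centre back, and close to the radix.
    return closure(np.exp(np.outer(b, k_future)) * alpha)

# Toy composition over five age groups and ten years, with deaths shifting to older ages.
b_true = np.array([-0.4, -0.2, 0.0, 0.2, 0.4])
k_true = np.linspace(-1.0, 1.0, 10)
base = np.array([0.1, 0.2, 0.4, 0.2, 0.1])[:, None]
d = closure(np.exp(np.outer(b_true, k_true)) * base)
d_future = coda_forecast(d, horizon=5)
```

The closure step is what keeps every forecast distribution positive and summing to the radix, which is the constraint discussed above.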

Models extrapolating life expectancy directly can also be used. Among them, we compare a simple approach extrapolating the life expectancy at birth $e_{0,t}$ using the mean rate of improvement in $e_{0,t}$ over past years. We call this approach constant increase (CI).

Alternatively, life expectancy can be assumed to increase by 2.2 years per decade. This increase is equal to the gains in the female best-practice in life expectancy, as defined by Oeppen and Vaupel (2002), since 1960. This approach is here called Oeppen and Vaupel (OV) best practice increase.
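The two direct extrapolations of life expectancy (CI and OV) reduce to simple arithmetic, sketched below. The function names and the toy series are hypothetical, and the "mean rate of improvement" is interpreted here as the mean annual change over the fitting period, which is an assumption.

```python
import numpy as np

def constant_increase(e0, horizon):
    """CI: extend e0 by its mean annual change over the fitting period."""
    e0 = np.asarray(e0, dtype=float)
    rate = (e0[-1] - e0[0]) / (len(e0) - 1)
    return e0[-1] + rate * np.arange(1, horizon + 1)

def ov_increase(e0_last, horizon, gain_per_decade=2.2):
    """OV: add a fixed gain of 2.2 years per decade (0.22 years per year)."""
    return e0_last + gain_per_decade / 10 * np.arange(1, horizon + 1)

# Toy values, not Danish data.
ci = constant_increase([70.0, 70.5, 71.0], horizon=2)
ov = ov_increase(80.0, horizon=50)
```

Under the OV assumption a 50-year horizon always adds 11 years to the last observed value, whatever the fitting period, whereas the CI result depends entirely on the years chosen for fitting.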

Another model based on life expectancy extrapolation is the double-gap (DG) model. The DG model is used to coherently forecast female and male life expectancy in a certain country or region with reference to a benchmark level, for example, the trend given by the long-term historical record life expectancy in the world. The sex-gap in life expectancy is assessed to forecast the male life expectancy in the analyzed population. The extrapolation process is based on classic time series methods (Pascariu et al. 2018).

The final period model is the maximum entropy method (MEM). The MEM makes use of the statistical properties of a probability density function in order to estimate the distribution of deaths of a population in the future (Pascariu et al. 2019). Time series methods for forecasting a limited number of central statistical moments are used and then a reconstruction of the future distribution of deaths using the predicted moments is performed. The estimation of the density function is made using the maximum entropy approach (Mead and Papanicolaou 1984).

7.2.2 Cohort Forecasts

All models described so far are designed to forecast period mortality. The first five models (RWD, LC, LL, CoDA and CoDA-C), based on age-specific data, can also be used to forecast cohort mortality by reading the period forecast matrices of death rates by time and age along a diagonal. With the CoDA and CoDA-C models, the forecast life table deaths distributions are transformed into death rates using life table calculations (Preston et al. 2001) and a similar reading is made. Additionally, we compared two models specifically designed to use cohort data to make forecasts: the Cohort Segmented Transformation Age-at-death Distributions (C-STAD) model and a Penalized Composite Link Model (PCLM). True cohort forecasts, i.e. those based on cohort data only, have rarely been achieved (Booth and Tickle 2008), and the C-STAD and PCLM models are among the first to obtain such forecasts.
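Reading a period matrix along a diagonal can be sketched as follows. The sketch assumes single-year ages and calendar years and uses the simplification that a cohort born in year $c$ is aged $x$ in year $c + x$. The function name and the toy matrix are hypothetical.

```python
import numpy as np

def cohort_rates(m, ages, years, birth_year):
    """Extract a cohort's death rates from a period matrix m of shape (n_ages, n_years).

    Ages not covered by the matrix (before the first or after the last year)
    are returned as NaN.
    """
    out = np.full(len(ages), np.nan)
    for i, x in enumerate(ages):
        t = birth_year + x                      # calendar year in which the cohort is aged x
        j = int(np.searchsorted(years, t))
        if j < len(years) and years[j] == t:
            out[i] = m[i, j]
    return out

# Toy matrix whose entries encode (age, year) so that the diagonal is easy to follow.
ages = np.arange(5)
years = np.arange(2000, 2005)
m = ages[:, None] * 10000 + years[None, :]
c2000 = cohort_rates(m, ages, years, 2000)
c2002 = cohort_rates(m, ages, years, 2002)
```

In practice the matrix would combine observed rates with forecast rates, so that the diagonal for a cohort not yet extinct is completed by the period forecast.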

The C-STAD model is a method that has been recently proposed to model and forecast cohort mortality (Basellini et al. 2020). Specifically, the C-STAD is a relational model based on a warping transformation of the age axis of a reference distribution of deaths. The parameters of the transformation function capture mortality changes in terms of shifting and compression dynamics. Mortality forecasts are obtained from their extrapolation using standard time series models. The C-STAD is a generalization of the approach proposed by Basellini and Camarda (2019) to model and forecast adult period mortality from age-at-death distributions.

Another method recently proposed to forecast cohort age-at-death distributions is based on the PCLM (Penalized Composite Link Model) for ungrouping data (Eilers 2007; Rizzi et al. 2015). The counts of a cohort life table distribution of deaths are treated as realizations of a Poisson process. The age-at-death distribution is modeled by a penalized maximum likelihood, under the following assumptions: (i) the forecast age-at-death distribution is smooth; (ii) no deaths are observed after age 120; (iii) when the last observed age of deaths is far from the mode, the latter is a priori forecast with a simple ARIMA model. The PCLM smoothly redistributes the remaining deaths in the right-hand tail of the age-at-death distribution of a cohort not yet extinct (Rizzi et al. 2019).

7.3 Data

Observed death rates for Denmark were extracted from the Human Mortality Database (HMD 2019) by sex, and life tables were constructed using the standard procedure (Preston et al. 2001). When a death rate was equal to zero, the value was replaced by half of the minimum death rate observed in the dataset, as many of the models cannot be estimated in the presence of zeros. Zeros are, however, rare in the dataset. Overall, data from 1925 to 2016 for both females and males were extracted, but different fitting periods are used across the analyses.
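The zero replacement and a simplified life table calculation can be sketched as below. Several assumptions are made for illustration only: the minimum is taken over the non-zero rates, deaths are assumed to occur on average halfway through each single-year age interval (including infancy), and the last age is treated as an open interval. The function names are hypothetical.

```python
import numpy as np

def replace_zeros(m):
    """Replace zero death rates by half of the smallest non-zero rate."""
    m = np.asarray(m, dtype=float).copy()
    positive_min = m[m > 0].min()
    m[m == 0] = positive_min / 2
    return m

def life_expectancy(m, a=0.5):
    """Life expectancy at birth from single-year death rates (simplified)."""
    m = np.asarray(m, dtype=float)
    q = m / (1 + (1 - a) * m)       # probability of dying within each age interval
    q[-1] = 1.0                     # everyone dies in the open-ended last interval
    l = np.concatenate(([1.0], np.cumprod(1 - q[:-1])))   # survivors, radix 1
    d = l * q                       # life table deaths
    L = l - (1 - a) * d             # person-years lived in each interval
    L[-1] = l[-1] / m[-1]           # person-years lived in the open interval
    return float(L.sum())

cleaned = replace_zeros([0.0, 0.002, 0.01])
e0_flat = life_expectancy(np.full(111, 0.02))   # toy schedule with a constant rate
```

With a constant death rate of 0.02 at all ages, the function returns a life expectancy of 50 years (the reciprocal of the rate), which is a convenient check of the arithmetic.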

For the LL, CoDA-C and DG models, a reference population is needed. For the LL and CoDA-C models, the reference population is the average mortality trend for Denmark, Sweden, the Netherlands and the United Kingdom. The average is the geometric mean of the death rates of these four countries for the LL model and the associated life table distribution of deaths for the CoDA-C. Data for these countries were also extracted from the HMD. The choice of the reference population is based on the analysis of Kjærgaard et al. (2016). The selected reference population provides the most accurate forecasts for Denmark and consists of countries with similar mortality trends that are geographically close to Denmark. For the DG model, the reference population is the best practice in life expectancy, as defined by Oeppen and Vaupel (2002) and based on countries within the HMD that have the highest life expectancy each year (Pascariu et al. 2018).
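The geometric mean used to build the reference population for the LL model can be sketched as follows. The function name, the dictionary layout and the toy rates are hypothetical, and all matrices are assumed to share the same ages and years.

```python
import numpy as np

def geometric_mean_rates(rates_by_country):
    """Geometric mean of death-rate matrices that share the same shape."""
    stacked = np.stack(list(rates_by_country.values()))
    # Averaging on the log scale and exponentiating gives the geometric mean.
    return np.exp(np.log(stacked).mean(axis=0))

# Toy rates for two hypothetical populations (2 ages x 2 years).
toy = {
    "A": np.array([[0.01, 0.01], [0.10, 0.08]]),
    "B": np.array([[0.04, 0.01], [0.10, 0.02]]),
}
reference = geometric_mean_rates(toy)
```

Averaging on the log scale matches the log-linear structure of the LC-type models, since the log of the geometric mean is the arithmetic mean of the log rates.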

7.4 Methodological Challenges in Forecasting Life Expectancy in Denmark

7.4.1 Non-linear Trends

Figure 7.1 shows life expectancy at birth over time in Denmark and Sweden, for females and males. Segmented regressions (Muggeo 2003) have been applied to the trends. The slope of each segment and the year when a break occurs are marked in the figure. For both females and males, the increase in life expectancy was similar for Sweden and Denmark until the second half of the 1970s. After 1977, the Danish female life expectancy increase slowed down until 1995, thus lagging more and more behind Sweden. After 1995, female life expectancy in Denmark increased faster than in the previous period and faster than that of Sweden. The gap in life expectancy between these two countries has been closing in recent years. For males, the Swedish life expectancy increase accelerated in 1979, while in Denmark this break first occurred in 1992. However, the increase in life expectancy since the mid-1990s has been faster for Danish males than for Swedish males. As for females, the gap between the two countries has been closing since the mid-1990s.
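To make the idea of a segmented trend concrete, the sketch below fits a continuous two-segment line by trying each candidate break year and keeping the one with the smallest sum of squared errors. This grid search is a simplified stand-in and is not the iterative estimation procedure of Muggeo (2003). It handles a single break only, and the function name and toy series are hypothetical.

```python
import numpy as np

def fit_one_break(t, y):
    """Return (break_year, slope_before, slope_after) for a continuous two-segment fit."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    tc = t - t[0]                                  # shift time for numerical stability
    best = None
    for c in t[2:-2]:                              # keep a few points on each side
        hinge = np.maximum(t - c, 0.0)             # zero before the break, linear after
        X = np.column_stack([np.ones_like(tc), tc, hinge])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = float(((y - X @ coef) ** 2).sum())
        if best is None or sse < best[0]:
            best = (sse, float(c), coef[1], coef[1] + coef[2])
    _, c, slope_before, slope_after = best
    return c, slope_before, slope_after

# Toy series: slow gains until 1995, faster gains afterwards (not Danish data).
t = np.arange(1960, 2017)
y = 72 + 0.05 * (t - 1960) + 0.20 * np.maximum(t - 1995, 0)
break_year, s1, s2 = fit_one_break(t, y)
```

On this noise-free toy series the search recovers the break in 1995 and the two slopes (0.05 and 0.25 years per year). Real series would also require a test of whether a break is supported by the data at all.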

Fig. 7.1 Life expectancy at birth in Denmark (lower curve) and Sweden (upper curve) between 1925 and 2016, with segmented regressions, (a) Females and (b) Males

Breaks in trends are also observed in the age-specific death rates, especially between age 20 and 70 (Fig. 7.2). Imposing a linear development of past trends and extrapolating these trends in the future thus seems to be inadequate to forecast Danish mortality. Using non-linear or segmented trends could be an option. However, predicting when or if the next break would occur is arduous. When non-linearity in the trends is observed, Stoeldraijer (2019) suggests two approaches.

Fig. 7.2 Log death rates in Denmark between 1925 and 2016 at specific ages, with segmented regressions, (a) Females and (b) Males

First, if the causes of the non-linear trends are known, information about these causes could be included in the forecasts. For example, the non-linearity of life expectancy in Denmark has been attributed to smoking (Christensen et al. 2010; Lindahl-Jacobsen et al. 2016). Adjusting for the distorting effect of smoking on mortality is thus likely to improve forecast accuracy (Janssen and Kunst 2007; Bongaarts 2014). Some authors have developed models to forecast mortality that account for smoking (Preston et al. 2014; Janssen et al. 2013; Wang and Preston 2009; Bongaarts 2006). Janssen et al. (2013) show that non-smoking mortality has more linear trends than all-cause mortality. However, risk factors (e.g., smoking) and other epidemiological information are often difficult to forecast as they often have non-linear trends; their relationship with mortality is often imperfectly understood; assumptions about future behaviors are often required; and data on, e.g., smoking or smoking-related mortality, are needed (Booth and Tickle 2008; Wilmoth 1995; Raftery et al. 2014). Given these constraints, epidemiological models are not compared here.

The second recommendation of Stoeldraijer (2019) is to use coherent forecast models (e.g., the LL model) for countries with less linear trends, especially if the causes of the non-linearity are unknown. White (2002) and Oeppen and Vaupel (2002) show that, among high-income countries, gains in life expectancy from countries lagging behind tend to be faster than those of leading countries. They also found that gains from leaders in life expectancy tend to slow down. White (2002) attributes these trends to a convergence in life expectancy towards a mean. Country-specific trends might deviate temporarily from the mean, but will eventually converge towards it. White (2002) also notices that the mean life expectancy among a group of high-income countries is more linear than country-specific trends. Oeppen and Vaupel (2002) find a nearly perfect linear trend in the increase in the record life expectancy over time. Both White (2002) and Oeppen and Vaupel (2002) conclude that these regularities (in the record or average) could be used to forecast mortality and highlight the need to consider mortality changes in an international perspective. Janssen and Kunst (2007) state “[…] we recommend using the experience of other countries not to set target values of life expectancy, but to create a broader empirical basis for the identification of the most likely long-term trend” (Janssen and Kunst 2007, p. 323).

7.4.2 Length of Fitting Period

Given the non-linear mortality trends in Denmark, a basic question is whether or not only recent trends should be used to forecast Danish life expectancy. Table 7.2 shows the difference in predicted life expectancy in 2066 with eight models, when different fitting periods are used: 1960–2016, 1975–2016 and 1990–2016. As the OV approach is not affected by a fitting period, this model is ignored in this section, as well as the models using cohort data. All the other models are sensitive to the fitting period, leading to differences of between 0.3 and 5.7 years for the same model in a 50-year forecast. The forecast results are as sensitive to the fitting period as they are to the model selected. The forecasts based on the most recent period are the most optimistic for both sexes and all models. The Danish population experienced fast improvements in mortality in the recent period and it is thus not surprising that forecasts based on data since the 1990s are more optimistic than those that take the period of stagnation into account.

Table 7.2 Forecasts of life expectancy at birth in 2066 using eight models and three fitting periods: 1960–2016, 1975–2016 and 1990–2016

To evaluate which length of fitting period would have produced the most accurate forecasts for Denmark, an out-of-sample analysis is performed. Forecasts with a horizon of 20 years are produced for each starting year from 1985 to 1997, based on different lengths of the fitting period. For example, life expectancy between 1985 and 2004 is forecast using fitting periods ranging from the previous 15 years (1970–1984) to the previous 60 years (1925–1984). In total, 552 forecasts were made. The root mean square error (RMSE) of each forecast is calculated and averaged by length of the fitting period.
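The design of this out-of-sample exercise can be sketched as a double loop over starting years and fitting lengths. The forecaster used here is a simple linear extrapolation of life expectancy, a stand-in chosen for brevity and not one of the models as implemented in the chapter. The function names and the toy series are hypothetical.

```python
import numpy as np

def linear_forecast(e0_fit, horizon):
    """Stand-in forecaster: extend the series by its mean annual change."""
    rate = (e0_fit[-1] - e0_fit[0]) / (len(e0_fit) - 1)
    return e0_fit[-1] + rate * np.arange(1, horizon + 1)

def rmse_by_fit_length(years, e0, start_years, fit_lengths, horizon=20):
    """Average RMSE over starting years, for each length of fitting period."""
    results = {}
    for n in fit_lengths:
        errors = []
        for s in start_years:
            i = int(np.searchsorted(years, s))     # index of the first forecast year
            if i - n < 0 or i + horizon > len(e0):
                raise ValueError("not enough data for this start year and fit length")
            fit = e0[i - n:i]                      # the n years preceding the start year
            obs = e0[i:i + horizon]                # the years to be forecast
            pred = linear_forecast(fit, horizon)
            errors.append(np.sqrt(np.mean((pred - obs) ** 2)))
        results[n] = float(np.mean(errors))
    return results

# Toy series on the same calendar span as the data used in the chapter.
years = np.arange(1925, 2017)
e0 = 60 + 0.2 * (years - 1925)
rmse = rmse_by_fit_length(years, e0, range(1985, 1998), range(15, 61))
```

With a perfectly linear toy series every RMSE is zero up to rounding; with real data the dictionary would trace out how accuracy varies with the length of the fitting period.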

Figure 7.3 shows the RMSE for forecasts based on different lengths of fitting period. The results differ by model, but as a general conclusion, the longer the fitting period, the better. A general rule of thumb among forecasting experts is that the fitting period should be at least as long as the forecast horizon. Following this rule, a 20-year forecast should be based on, at least, 20 years of historical data. Our results suggest that longer fitting periods, rather than shorter ones, generally would have provided more accurate forecasts for recent mortality trends. A similar conclusion is drawn for a 50-year forecast (results not shown here). The results also suggest that the coherent models (LL and CoDA-C) are less sensitive to the length of the fitting period, especially for females. For males, a shorter fitting period for the LL and CoDA-C models would have been more accurate.

Fig. 7.3 Average RMSE of life expectancy for a 20-year forecast with starting year from 1985 to 1997, by length of fitting period and model; and smoothed average across models (full line), (a) Females and (b) Males

It is important to understand whether an observed period of stagnation or acceleration is the emergence of a new dynamic or a temporary effect. Janssen and Kunst (2007) argue that, because the stagnation in Denmark and also in Norway and the Netherlands is mainly attributable to smoking and was not observed in other countries, it should be regarded as a temporary effect and longer fitting periods should be preferred. Our results are in line with those of Janssen and Kunst (2007) and suggest that long fitting periods should be used to forecast Danish life expectancy. A new dynamic has been in place since the late 1950s (see Fig. 7.1), with gains in life expectancy being mainly attributable to mortality reductions at old ages and from cardiovascular diseases (Christensen et al. 2009; Vallin and Meslé 2010). Lee and Miller (2001) argue that using data since 1950, with the LC model, reduces the bias of the forecasts for the United States.

7.5 Forecasting with Different Models

7.5.1 Period Forecasts

Given the results of Sect. 7.4.2, a fitting period from 1960 is selected and we forecast life expectancy 50 years ahead with the models described in Sect. 7.2.1. As the official Danish forecasts are based on an LC model that uses data since 1990 only, we also use a similar approach which we call LC90.

In 2066, life expectancy at birth is forecast to be between 87.2 and 95.3 years for females and between 83.9 and 91.4 years for males (Fig. 7.4). The forecast results thus vary by the model selected. The most pessimistic model is LC and the most optimistic is the OV for both sexes, for the period selected.

Fig. 7.4 Period life expectancy at birth forecast 50 years ahead using ten models, (a) Females and (b) Males

Given the variations across models, the forecast accuracy of the models is estimated by way of an out-of-sample analysis. Recent life expectancy trends are forecast for a horizon of 6 to 26 years using historical data, with 2016 being the final year of the forecast horizon, and the RMSE is calculated for each horizon and then averaged. For example, if the forecast horizon is 26 years, we use data from 1960 to 1990 as the fitting period and forecast life expectancy from 1991 to 2016. As the LC90 model is based on data from 1990, this approach is not evaluated but can be considered similar to the LC approach. The results are presented in Table 7.3. The OV approach would have been the most accurate to forecast recent life expectancy trends in Denmark. The increase in life expectancy of 0.22 years annually is close to the yearly gain in life expectancy observed in Denmark since the mid-1990s (Fig. 7.1). Aside from the OV approach, models using a reference population – i.e. LL, CoDA-C and DG – would have predicted recent life expectancy in Denmark more accurately than the other models. Danish life expectancy has been catching up with other countries in recent years and the results confirm that these models better capture this trend, as discussed in Sect. 7.4.1.

Table 7.3 Average RMSE of life expectancy at birth over forecast horizons of 6 to 26 years, with the two lowest values in bold and rankings displayed in parentheses, females and males

7.5.2 Cohort Forecasts

When looking at forecasts of cohort life expectancy (Table 7.4), the results among models described in Sect. 7.2.2 are similar for older cohorts. For example, females born in 1950 are predicted to live between 79.8 and 80.5 years with all models, except the C-STAD model, which forecasts a life expectancy of 81.1 years. Differences across models are even smaller for males for this cohort, with a predicted life expectancy of between 74.4 and 74.9 years. As mortality was observed until age 66 in 2016 for this cohort, less variation is seen in the forecasts. As for the period forecasts, the difference across models increases with the forecast horizon. The models based on cohort experience – i.e. C-STAD and PCLM – tend to be more optimistic than the other models, which are based on period forecasts. These models are based on cohort data only. In order to fit the models and complete the mortality experience of a cohort, partial information on this specific cohort is needed. Reliable estimates are obtained for cohorts born up to 1970 and 1960, for C-STAD and PCLM, respectively. Thus, the C-STAD and PCLM models cannot be used to forecast mortality of more recent cohorts.

Table 7.4 Cohort life expectancy at birth for specific cohorts forecast with eight models, the range of the forecast values across the eight models and range across the six models based on period forecasts

7.6 Implications for Danish Society

Forecasts are key to planning economic, health, education and social policies, among others. Large variations in forecast results lead to greater uncertainty about costs, investments and policy planning. Two estimates derived from mortality forecasts are here compared across models: (1) Age at retirement and (2) Lifespan variability.

The forecasts presented in Sect. 7.5.1 are used to estimate the predicted age at retirement and lifespan variability, when possible. The DG, CI and OV models do not allow for an estimation of indicators based on life table statistics, other than $e_0$.

7.6.1 Age at Retirement

To ensure the sustainability of the Danish pension system, the Danish government implemented in 2007 a system in which the pension age is increased if life expectancy is increasing. The legislation regulates the pension age 15 years ahead and is based on life expectancy at age 60 and an expected increase of 0.6 years over a 15-year period. Based on this assumption, if the Danish population is expected to have a life expectancy at retirement age higher than 14.5 years, the pension age is increased by a maximum of one year over a five-year period. Changes to the pension age need to be approved by a majority in the Danish parliament. Regulations are voted on with 15 years' notice every five years, with the next regulation coming up in 2020. Future pension ages have been decided until 2030 and pension ages until 2035 will be decided in 2020.

As pension ages after 2030 are unknown, we focus on the desired number of years lived after retirement – i.e. 14.5 years – to evaluate the consequences of the different mortality forecasts. Figure 7.5 shows the age with a remaining life expectancy of 14.5 years ($x_{e(x)=14.5}$), for both sexes combined, forecast using different models. The figure also shows the official pension age approved by the Danish parliament and the maximum increase in the pension age of one year every five years (dashed) after 2030. In 2016, the official pension age was 65 and $x_{e(x)=14.5}$ was 72. The gap between the official pension age and $x_{e(x)=14.5}$ persists in the forecasts, as the official pension age cannot increase faster than one year every five years. Nevertheless, the gap is expected to narrow for all models, if the pension age is increased by its maximum. A maximum increase in the pension age is likely if the policymakers want to bring the average number of years lived after retirement down to 14.5 years. A large gap between $x_{e(x)=14.5}$ and the pension age is forecast with all models, meaning that the expected number of years spent at retirement will be higher than 14.5.
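The age at which remaining life expectancy equals 14.5 years can be located from a life table by interpolation, as sketched below. The sketch assumes that remaining life expectancy decreases with age over the range supplied and interpolates linearly between the two bracketing ages. The function name and the toy schedule are hypothetical.

```python
import numpy as np

def age_with_remaining_ex(ages, ex, target=14.5):
    """Age at which remaining life expectancy e(x) falls to the target value."""
    ages = np.asarray(ages, dtype=float)
    ex = np.asarray(ex, dtype=float)
    below = np.nonzero(ex <= target)[0]
    if len(below) == 0 or below[0] == 0:
        raise ValueError("target is not bracketed by the supplied e(x) values")
    j = below[0]                                   # first age with e(x) <= target
    x0, x1 = ages[j - 1], ages[j]
    e_hi, e_lo = ex[j - 1], ex[j]
    # Linear interpolation between the two bracketing ages.
    return float(x0 + (e_hi - target) * (x1 - x0) / (e_hi - e_lo))

# Toy schedule: e(60) = 26 years, falling by 0.8 years per year of age.
ages = np.arange(60, 81)
ex = 26 - 0.8 * (ages - 60)
x_target = age_with_remaining_ex(ages, ex)
```

For the toy schedule the result is 74.375. Applying the same function to each forecast year and model would give curves comparable to those described for Fig. 7.5.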

Fig. 7.5 Age with remaining life expectancy of 14.5 years, based on seven models, and official age at retirement, 2017–2049

Figure 7.6a shows the predicted number of years lived at retirement by sex, based on the Danish official pension age for the years where pension ages are determined. With all models, except the LC90 for males, the number of years lived after retirement is predicted to decline over time. The Danish population for most models is expected to be entitled to fewer years with a pension compared to older generations. Males are also expected to live fewer years after retirement than are females. Similar trends are also observed for the cohort forecasts (results not shown here).

Fig. 7.6 Number of years lived at retirement and probability of surviving from birth to retirement age using seven forecasting models, Denmark, 2017–2034

However, the models provide different trends when looking at the probability of surviving to the age at retirement (Fig. 7.6b). With the LC and LL models, the survival probability to age at retirement decreases until 2022 and then fluctuates at around 90.3% for females and 85.6% for males. For the MEM and LC90 models, an increase in the survival probabilities to retirement is expected, after an initial decline until 2022.

7.6.2 Lifespan Inequalities

Population health is often summarized by a single measure – life expectancy. However, standard measures of longevity, such as life expectancy, conceal variations in lifespan. Inequality in the length of life is an important indicator of the uncertainty in the timing of death and of heterogeneity in underlying population health at the macro level (van Raalte et al. 2018). Life expectancy and lifespan inequality are usually negatively correlated (Fig. 7.7) (Colchero et al. 2016; Vaupel et al. 2011). Here, we measure lifespan inequality with average life expectancy lost at death, denoted by $e^\dagger$ (Vaupel and Canudas-Romo 2003). For example, if an individual at time of death has 20 years of remaining life expectancy, then he/she contributes 20 years to lifespan inequality. Since 1960, Danish improvements in life expectancy and lifespan equality were halted by smoking-related mortality in those born between 1919 and 1939, while reductions in old-age cardiovascular mortality further held back lifespan equality (Aburto et al. 2018). It has been shown that, in Denmark, early deaths are more common in underprivileged groups, simultaneously reducing life expectancy and increasing lifespan inequality (Brønnum-Hansen 2017). Therefore, lifespan inequality, together with life expectancy, gives a broader perspective on the effect of mortality changes on population health.
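Average life expectancy lost at death can be computed from a life table by weighting the remaining life expectancy at each age by the share of deaths occurring at that age. The sketch below uses a simplified single-year life table in which deaths are assumed to occur on average halfway through each interval and the last age is open-ended. The function names are hypothetical and the discretisation is one of several in use.

```python
import numpy as np

def life_table(m, a=0.5):
    """Return life table deaths d(x) and remaining life expectancy e(x), radix 1."""
    m = np.asarray(m, dtype=float)
    q = m / (1 + (1 - a) * m)
    q[-1] = 1.0                                    # open-ended last interval
    l = np.concatenate(([1.0], np.cumprod(1 - q[:-1])))
    d = l * q
    L = l - (1 - a) * d
    L[-1] = l[-1] / m[-1]
    T = L[::-1].cumsum()[::-1]                     # person-years lived above each age
    return d, T / l

def life_expectancy_lost(m):
    """Average remaining life expectancy at the ages where deaths occur."""
    d, e = life_table(m)
    # Remaining life expectancy at mid-interval, approximated by averaging e(x) and e(x+1).
    e_mid = np.append((e[:-1] + e[1:]) / 2, e[-1])
    return float((d * e_mid).sum())

lost_flat = life_expectancy_lost(np.full(111, 0.02))   # toy constant-rate schedule
```

For a constant death rate, remaining life expectancy is the same at every age, so the years lost at death equal life expectancy itself (50 years in this toy case). Real mortality schedules, with deaths concentrated at old ages, give much lower values.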

Fig. 7.7 Relation between life expectancy and lifespan inequality observed (lines) and forecast (shapes) between 1935 and 2066 in Denmark. (a) Females. (b) Males

Moreover, evaluating the predictive ability of mortality forecasts is imperative, yet difficult. Accounting for lifespan inequality can help with this challenge (Bohk-Ewald et al. 2017). Therefore, we included lifespan inequality in our forecasting scenarios. As life expectancy at birth increases, lifespan inequality decreases (Fig. 7.7). However, at advanced ages, life expectancy increases can coincide with a rise in lifespan inequality (Engelman et al. 2010), as observed until the 1990s in Denmark when age at retirement was 65 (Fig. 7.8). Our mortality forecasts suggest a decrease in lifespan inequality from age at retirement in Denmark. This implies that ages at death after retirement could become more equal, which could help in the distribution of health resources by concentrating them in a narrow group of ages.

Fig. 7.8 Lifespan inequality observed (lines) and forecast (shapes) from the age at retirement between 1935 and 2034, Denmark. (a) Females. (b) Males

7.7 Discussion

The choice of model and fitting period leads to large variations in forecasts. Bergeron-Boucher et al. (2019) show that the choice of indicator used to forecast mortality (e.g., death rates or life expectancy) also leads to significant differences in the forecasts, even when a similar extrapolative model is applied to each indicator. Some scholars have proposed that assigning a higher weight to the most recent observations would produce better forecasts (Hyndman and Shang 2009), a procedure that is not discussed in our analysis. Such an approach is equivalent to downplaying trends in the more distant past. Preliminary results suggest that this practice does not improve forecasts in all cases. For instance, when forecasting Danish mortality with commonly used models, such as the LC, the most accurate results were achieved without weighting schemes and by using long fitting periods. Despite our findings for Danish mortality, further research on how to weight historical data is necessary, in particular for countries exhibiting mortality deterioration and life expectancy reversals (e.g., former Soviet countries). Given the sensitivity of forecasts to these different factors, forecasters must make decisions that often involve subjectivity, which makes choosing the optimal approach a difficult task.
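
To make the idea of weighting concrete, the following sketch fits a linear trend with and without weights that decay geometrically into the past. The weight form, the parameter `kappa`, the function names and the toy series are all assumptions chosen for illustration. They are not the scheme or the data evaluated in this chapter.

```python
def geometric_weights(n, kappa):
    """Weights kappa * (1 - kappa)**(n - 1 - i): the last observation weighs most."""
    return [kappa * (1.0 - kappa) ** (n - 1 - i) for i in range(n)]


def weighted_slope(t, y, w=None):
    """Slope of a (weighted) least-squares line through the points (t, y)."""
    if w is None:
        w = [1.0] * len(t)                     # equal weights: ordinary least squares
    sw = sum(w)
    t_bar = sum(wi * ti for wi, ti in zip(w, t)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    num = sum(wi * (ti - t_bar) * (yi - y_bar) for wi, ti, yi in zip(w, t, y))
    den = sum(wi * (ti - t_bar) ** 2 for wi, ti in zip(w, t))
    return num / den


# Toy series (hypothetical): gains of 0.1 per year, then gains of 0.3 per year.
years = list(range(10))
series = [0.0, 0.1, 0.2, 0.3, 0.4, 0.7, 1.0, 1.3, 1.6, 1.9]

slope_equal = weighted_slope(years, series)
slope_recent = weighted_slope(years, series, geometric_weights(len(years), kappa=0.3))
```

In this toy series the weighted slope lies closer to the recent pace than the unweighted slope does, which is the sense in which weighting downplays the more distant past. Whether that helps depends on whether the recent pace persists, as the Danish results above illustrate.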

Nevertheless, our results show that the best extrapolative model to forecast recent period life expectancy in Denmark is based on a simple assumption of an increase of 2.2 years per decade, with the gap between Danish life expectancy and forecast best-practice life expectancy neither widening nor narrowing. The reason for this result is that, in our out-of-sample analysis, the increase during the validation period (1991–2016) was close to 2.2 years per decade. If other periods had been used for validation, this approach might not have performed as well. Aside from this OV approach, our results suggest that coherent models, such as the LL, CoDA-C or DG models, would have provided more accurate forecasts of recent mortality trends in Denmark than other models. One could also argue that the OV approach is coherent, if life expectancy in all countries is assumed to increase at the same pace as that of a benchmark, which here is best-practice life expectancy. Additionally, the results show that a longer fitting period would generally have increased forecast accuracy. The stagnation in life expectancy in Denmark should thus be considered a temporary effect, and a model that accounts for the catching up of Danish mortality trends with those of other high-income countries should be preferred.
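
The arithmetic behind the constant-pace assumption can be sketched as follows. This is a minimal illustration, not the chapter's implementation. The function name and the starting values are hypothetical placeholders rather than observed Danish or best-practice life expectancies. Only the pace of 2.2 years per decade is taken from the text.

```python
def constant_pace_forecast(last_year, last_e0, target_years, gain_per_decade=2.2):
    """Extrapolate life expectancy linearly at a fixed gain per decade."""
    annual_gain = gain_per_decade / 10.0
    return {year: last_e0 + annual_gain * (year - last_year) for year in target_years}


# Placeholder jump-off values, for illustration only (not observed data).
horizon = [2026, 2036]
country = constant_pace_forecast(2016, 80.0, horizon)
benchmark = constant_pace_forecast(2016, 83.0, horizon)

# Both series rise at the same pace, so the gap to the benchmark stays constant.
gaps = [benchmark[year] - country[year] for year in horizon]
```

Because the country and the benchmark are extrapolated at the same pace, the gap between them at the jump-off year is carried forward unchanged, which is the sense in which the gap neither widens nor narrows.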

Stoeldraijer (2019) and Kjærgaard et al. (2016) found that forecasts with coherent models are sensitive to the choice of the reference population. Stoeldraijer (2019) found that the sensitivity of different coherent models differs between females and males, with the LL model being the most sensitive for females and the least sensitive for males, compared with two other coherent models. Kjærgaard et al. (2016) explored which reference population provides the most accurate forecasts and found that the optimal reference population differs across countries. The results of their analysis suggest that forecast accuracy increases when the reference population consists of a few countries whose trends in life expectancy are similar to those of the population of interest. This strategy was used here for the LL and CoDA-C models.

Accounting for smoking and cohort effects is also worth exploring when forecasting Danish mortality. However, as stated by Stoeldraijer (2019): “Because more assumptions are required in a method that incorporates smoking, a trade-off must be made between the advantage of being able to take the impact of smoking into account and the advantage of the objectivity of a pure extrapolation approach based on total mortality” (Stoeldraijer 2019, p. 21). As such, in this chapter we have limited our analyses to extrapolative models, often favored by statistical offices to produce official forecasts.

An important aspect of forecasting that was not discussed in this chapter is the estimation of prediction intervals. As the future is uncertain, it is important to estimate the uncertainty of a forecast. An indication of a likely range of values should thus be included when forecasting (Booth and Tickle 2008).

This chapter highlights the challenges in forecasting mortality in Denmark and the sensitivity of the forecasts to the different choices faced by the forecasters, e.g., which models, indicators and reference period should be used? Given that official forecasts are used to plan economic and social policies, these choices should be made carefully and analytically.

7.8 Replicability

The data and R code used for the LC, LL, CoDA, CoDA-C and MEM models are publicly available at https://github.com/mpascariu/MortalityForecast. The DG model data and R code are available in the MortalityGaps R package (Pascariu 2018) and at https://github.com/mpascariu/MortalityGaps.