Model selection for time series of count data

https://doi.org/10.1016/j.csda.2018.01.002

Abstract

Selecting between competing statistical models is a challenging problem, especially when the competing models are non-nested. An effective algorithm is developed in a Bayesian framework for selecting between a parameter-driven autoregressive Poisson regression model and an observation-driven integer valued autoregressive model when modelling time series count data. In order to achieve this, a particle MCMC algorithm for the autoregressive Poisson regression model is introduced. The particle filter underpinning the particle MCMC algorithm plays a key role in estimating the marginal likelihood of the autoregressive Poisson regression model via importance sampling and is also utilised to estimate the DIC. The performance of the model selection algorithms is assessed via a simulation study. Two real-life data sets, monthly US polio cases (1970–1983) and monthly benefit claims from the logging industry to the British Columbia Workers Compensation Board (1985–1994), are successfully analysed.

Introduction

There is a plethora of integer valued time series models for modelling low count time series data. There are two broad classes of approaches for constructing integer valued time series models, observation-driven (e.g. McKenzie (2003); Neal and Subba Rao (2007); Enciso-Mora et al. (2009a)) and parameter-driven (e.g. Davis et al. (2003)) models; see Davis et al. (2015) for an overview. The INAR(p) model, the $p$th order integer autoregressive model, is a prime example of an observation-driven model. These models are motivated by real-valued time series models, primarily ARMA(p,q) (autoregressive-moving average) models, and the desire to adapt such models to an integer-valued scenario. A time series $\{X_t; t \in \mathbb{Z}\}$ is said to follow an INAR(p) model if, for $t \in \mathbb{Z}$,
$$X_t = \sum_{i=1}^p \alpha_i \circ X_{t-i} + Z_t, \qquad (1.1)$$
where $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_p)$ are the autoregressive parameters, $\circ$ denotes a thinning operator and $\{Z_t\}$ are independent (typically identically distributed), integer-valued random variables. Note that if $\circ$ in (1.1) represented multiplication and the $\{Z_t\}$ were Gaussian, then we would recover a standard real-valued AR(p) process. The thinning operator, a generalised Steutel and van Harn operator (Steutel and van Harn, 1979), ensures that $\alpha_i \circ X_{t-i}$ is an integer-valued random variable. A binomial thinning operator is the most common choice, such that $\alpha_i \circ X_{t-i} \sim \mathrm{Bin}(X_{t-i}, \alpha_i)$. The most common choice of $Z_t$ is a Poisson distribution with mean $\lambda$, which, combined with the binomial thinning operator and the condition $\sum_{i=1}^p \alpha_i < 1$, leads to the stationary distribution of $X_t$ being Poisson with mean $\lambda/(1 - \sum_{i=1}^p \alpha_i)$.

Parameter-driven models are based on the observed counts $\{X_t\}$ being driven by some underlying, unobserved latent process $\{Y_t\}$ (Durbin and Koopman, 2000; Davis et al., 2003), for example a real-valued ARMA(p,q) process; see Dunsmuir (2015). With Poisson distributed counts, a log-link function is used to link the latent process, $Y_t$, and the observed count process, $X_t$. This results in a generalised linear model with
$$X_t \,|\, Y_t \sim \mathrm{Po}\big(\mu \exp(Y_t)\big). \qquad (1.2)$$
The observation-driven and parameter-driven models described above can be extended in many ways, for example by the inclusion of time dependent covariates $z_t$ into the INAR(p) parameters (Enciso-Mora et al., 2009b) or into (1.2) to give $\mu = \exp(z_t^T \beta)$ (Davis et al., 2003). Other examples are the development of INARMA(p,q) extensions of (1.1), see Neal and Subba Rao (2007), and INGARCH models (Fokianos, 2011), where for $t \geq 1$,
$$X_t \,|\, \mathcal{F}_{t-1}^{X,\lambda} \sim \mathrm{Po}(\lambda_t), \qquad (1.3)$$
with $\lambda_t = \mu + a \lambda_{t-1} + b X_{t-1}$, and $\mathcal{F}_{t-1}^{X,\lambda}$ is the $\sigma$-field generated by $\{X_0, X_1, \ldots, X_{t-1}, \lambda_0\}$. The INGARCH model seeks to mimic the behaviour of GARCH models, with alternative forms of $\lambda_t$ considered in Fokianos (2011). It should be noted that for $a = 0$, the INGARCH model reduces to an INAR(1) model with $b \circ X_{t-1} \sim \mathrm{Po}(b X_{t-1})$ and $Z_t \sim \mathrm{Po}(\mu)$. For parameter-driven models there are alternative latent process formulations, such as replacing $\exp(Y_t)$ by $\theta_t$, where $\theta_t$ is a Markovian process satisfying
$$\theta_t = \frac{\theta_{t-1}}{\gamma}\,\epsilon_t, \qquad \epsilon_t \sim \mathrm{Beta}\big(\gamma \alpha_{t-1}, (1-\gamma)\alpha_{t-1}\big),$$
see Aktekin et al. (2013) and Aktekin et al. (in press). Negative binomially distributed counts, as opposed to Poisson distributed counts, can also be included in (1.2), see for example Windle et al. (2013).
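To make the binomial thinning in (1.1) concrete, the following minimal R sketch simulates an INAR(1) process with Poisson($\lambda$) innovations; the function name and the parameter values are illustrative choices of ours rather than anything taken from the paper.

    # Minimal sketch: simulate an INAR(1) series X_t = alpha o X_{t-1} + Z_t
    # with binomial thinning and Poisson(lambda) innovations.
    # (Illustrative only; alpha, lambda and n are arbitrary choices.)
    simulate_inar1 <- function(n, alpha, lambda) {
      x <- numeric(n)
      # start from the stationary mean lambda / (1 - alpha)
      x[1] <- rpois(1, lambda / (1 - alpha))
      for (t in 2:n) {
        thinned <- rbinom(1, size = x[t - 1], prob = alpha)  # alpha o X_{t-1}
        x[t] <- thinned + rpois(1, lambda)                    # add innovation Z_t
      }
      x
    }

    set.seed(1)
    x <- simulate_inar1(n = 200, alpha = 0.5, lambda = 1)
    mean(x)  # should be close to lambda / (1 - alpha) = 2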

Given the wide range of models for integer valued time series, a key question is: what is the most appropriate model for a given data set? This leads on to a secondary question of the appropriate order p for an INAR(p) model or an AR(p) autoregressive latent process. For INAR(p) models, efficient reversible jump MCMC algorithms (Green, 1995) have been developed in Enciso-Mora et al. (2009a), Enciso-Mora et al. (2009b) for determining the order p of the model and the inclusion/exclusion of covariates. Reversible jump MCMC could also be employed for determining the most appropriate order of an AR(p) autoregressive latent process. However, it is far more difficult to employ reversible jump MCMC for comparing between different classes of models due to the need to develop efficient trans-dimensional moves between different models, see Brooks et al. (2003). Therefore, in this work we focus primarily on choosing between different classes of integer valued time series models, although we also illustrate our approach for determining the model order p.

In this paper we consider model selection in a Bayesian framework, via direct computation of the marginal likelihood, also known as the model evidence, and alternatively using the DIC (deviance information criterion, Spiegelhalter et al. (2002)). We focus, for illustration purposes, on three models: the INAR(p) model (1.1), the AR(p) Poisson regression model given by (1.2) with $\{Y_t\}$ being an AR(p) process, and the INGARCH model (1.3). To estimate the marginal likelihood, we extend the two stage algorithm given in Touloupou et al. (in press), which first estimates the posterior distribution using MCMC and then uses a parametric approximation of the posterior distribution to estimate the marginal likelihood via importance sampling. This leads to the two key novel contributions of this paper. Firstly, we introduce a particle MCMC algorithm (Andrieu et al., 2010) for estimating the parameters of the AR(p) Poisson regression model. This involves using a particle filter (Gordon et al., 1993) to estimate the likelihood, $\pi(x|\theta)$, where $\theta$ denotes the parameters of the model. The use of the particle filter to estimate $\pi(x|\theta)$ is then exploited both in the effective estimation of the marginal likelihood using the algorithm of Touloupou et al. (in press) and also in giving a mechanism for estimating the DIC without the need to resort to data augmentation and the problems that this potentially entails (Celeux et al., 2006).
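To illustrate the second stage of this approach, the following minimal R sketch estimates the log marginal likelihood by importance sampling from a multivariate normal approximation fitted to MCMC output. The choice of a multivariate normal importance density, and the placeholder functions `loglik` and `logprior`, are illustrative assumptions rather than details of the algorithm in Touloupou et al. (in press); for the AR(p) Poisson regression model, `loglik` would itself be a particle filter estimate of $\pi(x|\theta)$.

    # Sketch: importance sampling estimate of the log marginal likelihood,
    # using a multivariate normal approximation to the posterior fitted from
    # MCMC output. 'theta_mcmc' is a matrix of posterior draws (rows = draws);
    # 'loglik' and 'logprior' are placeholder functions for a given model.
    library(mvtnorm)

    estimate_log_evidence <- function(theta_mcmc, loglik, logprior, n_is = 1000) {
      mu    <- colMeans(theta_mcmc)
      Sigma <- cov(theta_mcmc)
      theta_is <- rmvnorm(n_is, mean = mu, sigma = Sigma)   # importance samples
      logw <- apply(theta_is, 1, function(th) {
        loglik(th) + logprior(th) -
          dmvnorm(th, mean = mu, sigma = Sigma, log = TRUE)
      })
      # log-mean-exp of the importance weights for numerical stability
      m <- max(logw)
      m + log(mean(exp(logw - m)))
    }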

The remainder of this paper is structured as follows. In Section 2 we introduce the particle MCMC algorithm for the AR(p) Poisson regression model. Given that Neal and Subba Rao (2007) provides an effective data augmentation MCMC algorithm for INAR(p) models, we utilise the algorithm provided there in our analysis, whilst in Section 3 we give brief details of an MCMC algorithm for the INGARCH model, which is particularly straightforward to implement as no data augmentation is required. In Section 4 we present the generic approach to model selection which is employed for all three integer valued time series models under consideration. In Section 5, we conduct a simulation study which demonstrates the ability of the approaches described in Section 4 to determine the true model. The simulation study also provides insights into the AR(p) Poisson regression model and issues associated with identifying the autoregressive parameters in the latent process. In Section 6 we apply the AR(p) Poisson regression, INAR(p) and INGARCH models to two real-life data sets, monthly US polio cases (1970–1983) and monthly benefit claims from the logging industry to the British Columbia Workers Compensation Board (1985–1994). We show that an AR(1) Poisson regression model is preferred for the polio data, and that the inclusion of the covariates proposed by Zeger (1988) for these data leads to only a small increase in the marginal likelihood. By contrast, the INGARCH model is preferred for the benefit claims data, with significant evidence for the inclusion of a summer effect. All the data sets analysed in Sections 5 and 6, along with the R code used for the analysis, are provided as supplementary material. Finally, in Section 7 we make some concluding observations.

Section snippets

AR(p) Poisson regression model

In this Section we introduce an adaptive particle MCMC algorithm for obtaining samples from the posterior distribution of AR(p) Poisson regression models (Zeger, 1988; Davis et al., 2003). The AR(p) Poisson regression model assumes that we observe a (Poisson) count process $X_1, X_2, \ldots, X_n$ which depends upon a (typically unobserved) latent AR(p) process $Y_1, Y_2, \ldots, Y_n$. Specifically, we assume that
$$X_t \,|\, Y_t \sim \mathrm{Po}\big(\mu_t \exp(Y_t)\big), \qquad Y_t = \sum_{i=1}^p a_i Y_{t-i} + e_t,$$
where $\mu_t = \exp(z_t^T \beta)$ depends upon $k$ explanatory variables $z_t = (1, z_{t1}, \ldots, z_{tk}$…
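As a rough illustration of how a particle filter produces an estimate of the likelihood $\pi(x|\theta)$ for this model, a minimal bootstrap particle filter for the AR(1) case ($p = 1$, no covariates, so $\mu_t = \mu$ constant) might be sketched in R as follows; the function and argument names are ours, and this is a plain bootstrap filter, not the adaptive particle MCMC algorithm of the paper.

    # Sketch: bootstrap particle filter estimate of log pi(x | theta) for the
    # AR(1) Poisson regression model X_t | Y_t ~ Po(mu * exp(Y_t)),
    # Y_t = a * Y_{t-1} + e_t, e_t ~ N(0, tau2).  Assumes |a| < 1 so that the
    # stationary initial distribution exists.  Illustrative names/values only.
    pf_loglik_ar1 <- function(x, mu, a, tau2, n_part = 500) {
      n <- length(x)
      # initialise particles from the stationary distribution of the latent AR(1)
      y <- rnorm(n_part, mean = 0, sd = sqrt(tau2 / (1 - a^2)))
      loglik <- 0
      for (t in 1:n) {
        if (t > 1) y <- a * y + rnorm(n_part, 0, sqrt(tau2))  # propagate particles
        logw <- dpois(x[t], lambda = mu * exp(y), log = TRUE) # weight by observation
        m <- max(logw)
        w <- exp(logw - m)
        loglik <- loglik + m + log(mean(w))                   # accumulate log-likelihood
        y <- sample(y, n_part, replace = TRUE, prob = w)      # multinomial resampling
      }
      loglik
    }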

INGARCH model

In this Section we briefly discuss an MCMC algorithm for the INGARCH model. Given that, for the INGARCH model, $X_t \,|\, \mathcal{F}_{t-1}^{X,\lambda} \sim \mathrm{Po}(\lambda_t)$ with $\lambda_t = \mu + a \lambda_{t-1} + b X_{t-1}$, for observations $x = (x_1, x_2, \ldots, x_n)$ from $\{X_t\}$ the likelihood satisfies
$$\pi(x \,|\, \mu, a, b, \lambda_0, x_0) = \prod_{t=1}^n \pi(x_t \,|\, \mu, a, b, \lambda_{t-1}, x_{t-1}) = \prod_{t=1}^n \frac{\lambda_t^{x_t}}{x_t!} \exp(-\lambda_t).$$
Consequently, no data augmentation is required for analysing this model using MCMC, and given priors on the parameters $\theta = (\mu, a, b, \lambda_0)$ we can construct a random walk Metropolis algorithm to explore $\pi(\theta \,|\, x)$. We choose independent gamma…
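A minimal R sketch of such a random walk Metropolis sampler for the INGARCH(1,1) model is given below, working on the log scale for $(\mu, a, b, \lambda_0)$. The snippet above indicates independent gamma priors, but the particular Gamma(1,1) choice, the fixed value of $x_0$, the starting values and the proposal scale are all illustrative assumptions of ours.

    # Sketch: likelihood and random walk Metropolis for the INGARCH(1,1) model.
    # Priors, x0, starting values and proposal scale are illustrative only.
    ingarch_loglik <- function(x, mu, a, b, lambda0, x0 = 0) {
      lambda <- lambda0; xprev <- x0; ll <- 0
      for (t in seq_along(x)) {
        lambda <- mu + a * lambda + b * xprev       # conditional mean lambda_t
        ll <- ll + dpois(x[t], lambda, log = TRUE)  # Poisson log-density
        xprev <- x[t]
      }
      ll
    }

    ingarch_rwm <- function(x, n_iter = 10000, step = 0.1) {
      theta <- log(c(mu = 1, a = 0.3, b = 0.3, lambda0 = mean(x)))  # log scale
      out <- matrix(NA, n_iter, 4, dimnames = list(NULL, names(theta)))
      logpost <- function(lt) {
        p <- exp(lt)
        # independent Gamma(1, 1) priors (illustrative) plus log-scale Jacobian
        ingarch_loglik(x, p[1], p[2], p[3], p[4]) +
          sum(dgamma(p, 1, 1, log = TRUE)) + sum(lt)
      }
      lp <- logpost(theta)
      for (i in 1:n_iter) {
        prop <- theta + rnorm(4, 0, step)           # random walk proposal
        lp_prop <- logpost(prop)
        if (log(runif(1)) < lp_prop - lp) { theta <- prop; lp <- lp_prop }
        out[i, ] <- exp(theta)
      }
      out
    }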

Model selection

In this Section we consider model selection tools for choosing between competing integer valued time series models, highlighting a range of approaches in the Bayesian paradigm.

Reversible jump MCMC (Green, 1995) extends MCMC to allow trans-dimensional moves, enabling the comparison of different models within a single MCMC algorithm. Reversible jump MCMC is particularly well suited for moving between nested models, where effective trans-dimensional moves can be identified, and has…
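As noted in the Introduction, the particle filter's estimate of $\log \pi(x|\theta)$ can also be used to approximate the DIC without data augmentation. A minimal R sketch of this idea follows; `pf_loglik` is a placeholder for any wrapper function mapping a parameter vector to an estimated log-likelihood (for example the particle filter sketched above), and plugging in the posterior mean for $\bar{\theta}$ is one of several possible conventions.

    # Sketch: DIC estimate from MCMC output 'theta_mcmc' (rows = posterior draws,
    # possibly thinned) using an estimated log-likelihood function 'pf_loglik'.
    # DIC = -2 log pi(x | theta_bar) + 2 * p_D, where
    # p_D = 2 * ( log pi(x | theta_bar) - mean_i log pi(x | theta_i) ).
    estimate_dic <- function(theta_mcmc, pf_loglik) {
      theta_bar <- colMeans(theta_mcmc)                  # posterior mean plug-in
      ll_bar    <- pf_loglik(theta_bar)                  # log-lik at theta_bar
      ll_mean   <- mean(apply(theta_mcmc, 1, pf_loglik)) # mean log-lik over draws
      p_d <- 2 * (ll_bar - ll_mean)                      # effective no. of parameters
      -2 * ll_bar + 2 * p_d
    }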

Simulation study

In this Section we present a simulation study which investigates the effectiveness of the model selection techniques in selecting the order $p$ of an AR(p) Poisson regression model and in identifying the true model for three data sets, one simulated from each of an AR(1) Poisson regression model, an INAR(1) model and an INGARCH model.

A time series of length 200 was generated from an AR(3) Poisson regression model with $a = (0.4, 0.25, 0.15)$, $\tau^2 = 0.5$ and $\phi = 1$ (no covariate data). The data x…
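For concreteness, a data set of this form could be generated along the following lines in R; here we read $\tau^2$ as the variance of the latent AR innovations and $\phi$ as the constant mean $\mu_t = \phi$, which is our interpretation of the notation rather than a detail stated in the snippet above.

    # Sketch: simulate a series of length 200 from an AR(3) Poisson regression
    # model with a = (0.4, 0.25, 0.15), tau2 = 0.5 and phi = 1 (no covariates).
    # Reading of tau2 and phi as innovation variance and constant mean is ours.
    set.seed(1)
    n <- 200; a <- c(0.4, 0.25, 0.15); tau2 <- 0.5; phi <- 1
    p <- length(a)
    burn <- 100                                   # burn-in towards stationarity
    y <- numeric(n + burn)
    for (t in (p + 1):(n + burn)) {
      y[t] <- sum(a * y[(t - 1):(t - p)]) + rnorm(1, 0, sqrt(tau2))
    }
    y <- y[(burn + 1):(burn + n)]                 # latent AR(3) process
    x <- rpois(n, lambda = phi * exp(y))          # observed counts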

Concluding remarks

In this paper we have shown how a particle filter algorithm can be successfully applied to estimate the likelihood, $\pi(x|\theta)$, for an AR(p) Poisson regression model. The particle filter is then utilised both within a particle MCMC algorithm and for computation of the marginal likelihood and the DIC for model selection. This has enabled us to conduct model selection both within AR(p) Poisson regression models to select the appropriate order p of the model and between AR(p) Poisson regression, INAR(p…

Acknowledgements

NA was supported by a Ph.D. scholarship from the Saudi Arabian Government (1008074575).

We thank an associate editor and two anonymous referees for insightful comments which helped improve the paper.

References (36)

  • McKenzie, E.
  • Aktekin, T., et al. (2017). Sequential Bayesian analysis of multivariate count data. Bayesian Anal.
  • Aktekin, T., et al. (2013). Assessment of mortgage default risk via Bayesian state space models. Ann. Appl. Stat.
  • Andrieu, C., et al. (2010). Particle Markov chain Monte Carlo methods (with discussion). J. R. Stat. Soc. Ser. B.
  • Andrieu, C., et al. (2009). The pseudo-marginal approach for efficient Monte Carlo computations. Ann. Statist.
  • Brockwell, P.J., et al. (1996). Time Series: Theory and Methods.
  • Brooks, S.P., et al. (2003). Efficient construction of reversible jump Markov chain Monte Carlo proposal distributions. J. R. Stat. Soc. Ser. B.
  • Carter, C.K., et al. (1994). On Gibbs sampling for state space models. Biometrika.
  • Carvalho, C.M., et al. (2010). Particle learning and smoothing. Statist. Sci.
  • Celeux, G., et al. (2006). Deviance information criteria for missing data models. Bayesian Anal.
  • Chib, S. (1995). Marginal likelihood from the Gibbs output. J. Amer. Statist. Assoc.
  • Chib, S., et al. (2001). Marginal likelihood from the Metropolis–Hastings output. J. Amer. Statist. Assoc.
  • Davis, R.A., et al. (2003). Observation-driven models for Poisson counts. Biometrika.
  • Davis, R.A., et al. (2000). On autocorrelation in a Poisson regression model. Biometrika.
  • Davis, R.A., et al. (2015). Handbook of Discrete-Valued Time Series.
  • Dunsmuir, W.T.M. Generalized linear autoregressive moving average models.
  • Durbin, J., et al. (2000). Time series analysis of non-Gaussian observations based on state space models from both classical and Bayesian perspectives. J. R. Stat. Soc. Ser. B Stat. Methodol.
  • Enciso-Mora, V., et al. (2009). Efficient order selection algorithms for integer valued ARMA processes. J. Time Ser. Anal.