Using capture-recapture data and hybrid Monte Carlo sampling to estimate an animal population affected by an environmental catastrophe

doi:10.1016/j.csda.2010.06.009

Computational Statistics & Data Analysis

Volume 55, Issue 1, 1 January 2011, Pages 655-666

https://doi.org/10.1016/j.csda.2010.06.009

Abstract

We propose a dynamic model for the evolution of an open animal population that is subject to an environmental catastrophe. The model incorporates a capture-recapture experiment of the kind often conducted to study wildlife populations, and enables inference on the population size and the possible effect of the catastrophe. A Bayesian approach is used to model unobserved quantities in the problem as latent variables, and Markov chain Monte Carlo (MCMC) is used for posterior computation. Because the particular interrelationship between observed and latent variables makes standard MCMC methods infeasible, we propose a hybrid Monte Carlo approach that integrates a Gibbs sampler with sequential importance sampling (SIS) and acceptance-rejection (AR) sampling for model estimation. We develop results on how to construct effective proposal densities for the SIS scheme. The approach is illustrated through a simulation study and is applied to data from a mountain pygmy possum (Burramys parvus) population that was affected by a bushfire.

Introduction

Open population models are important in ecological and environmental research. Early work on open population inference was largely conducted within the maximum likelihood framework. With advances in simulation techniques, evaluating the integrals required to compute posterior distributions has become increasingly feasible. As a result, more studies now take a Bayesian approach. For example, Crome et al. (1996) presented a Bayesian analysis to assess the impacts of rain forest logging on fauna using Before-After-Control-Impact-Pairs experimental data. Brooks et al. (2000) applied Bayesian methods to Cormack–Jolly–Seber (CJS) type models to estimate survival and recovery/capture probabilities from band-return and capture-recapture data. Link and Barker (2005) presented a hierarchical extension of the CJS models for estimating the correlation between biological birth rate and survival rate. Lee et al. (2006) analyzed data collected from a two-release experiment. A comprehensive review of Bayesian approaches to open population models can be found in Buckland et al. (2007).

Recently, Huggins (2007) proposed a martingale estimating equation approach to obtain the size of an open population, where the population size is assumed to be a deterministic function of certain covariates and parameters, and linear models are used to estimate the effects of the covariates on the population size. This approach uses full individual capture histories in its bootstrap estimation of the standard deviations of the parameter estimators. However, individuals are sometimes difficult to distinguish from each other in a capture-recapture experiment. For example, in studies of goose populations (Macinnes, 1966; Malecki et al., 1981) the data are usually collected by sighting the neck or foot bands of the geese, so an individual's capture history is not available. Instead, summary data, i.e., the numbers of marked and unmarked animals captured on each occasion, are often obtainable. In this paper we aim to develop a population model and an estimation approach for this type of data. We assume that the population under study is open and has suffered an environmental catastrophe during the data collection period. Further, suppose that the population size before the catastrophe has a stationary distribution with mean $N$, that the catastrophe reduces the mean population size by $N δ$ for some $0 \leq δ \leq 1$, and that the population size eventually returns to the stationary distribution after the catastrophe. To model a stationary population it is reasonable to assume that the mean number of births and immigrants equals the mean number of deaths and emigrants.
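The assumptions above can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are not taken from the paper: the initial size is drawn from a Poisson distribution with mean $N$, each animal departs between occasions with probability $β$, arrivals are Poisson with mean $N β$ (so the mean stays at $N$), the catastrophe is a single step in which each animal survives with probability $1 - δ$, and every animal present is captured with probability `p`. The function names and the argument `catastrophe_at` are hypothetical.

```python
import math
import random


def binomial_draw(rng, n, prob):
    """Number of successes in n independent trials with success probability prob."""
    return sum(1 for _ in range(n) if rng.random() < prob)


def poisson_draw(rng, lam):
    """Poisson draw by Knuth's multiplication method (adequate for moderate means)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def expected_mean_path(N, beta, delta, n_occasions, catastrophe_at):
    """Mean population size at occasions 1..n_occasions under the assumed dynamics."""
    means = [float(N)]
    for i in range(2, n_occasions + 1):
        prev = means[-1]
        if i == catastrophe_at:
            # Assumed catastrophe step: the mean drops by a fraction delta.
            means.append((1.0 - delta) * prev)
        else:
            # Departures at rate beta are balanced by arrivals with mean beta * N.
            means.append((1.0 - beta) * prev + beta * N)
    return means


def simulate_capture_recapture(N, beta, delta, p, n_occasions, catastrophe_at, rng):
    """Return (sizes, u, m): population size and the numbers of unmarked and
    marked animals captured on each occasion (summary data only)."""
    unmarked, marked = poisson_draw(rng, N), 0
    sizes, u, m = [], [], []
    for i in range(1, n_occasions + 1):
        if i > 1:
            if i == catastrophe_at:
                keep, arrivals = 1.0 - delta, 0
            else:
                keep, arrivals = 1.0 - beta, poisson_draw(rng, beta * N)
            # New arrivals are unmarked; marks persist on animals that stay.
            unmarked = binomial_draw(rng, unmarked, keep) + arrivals
            marked = binomial_draw(rng, marked, keep)
        sizes.append(unmarked + marked)
        u_i = binomial_draw(rng, unmarked, p)
        m_i = binomial_draw(rng, marked, p)
        u.append(u_i)
        m.append(m_i)
        # Newly captured unmarked animals are marked and released.
        unmarked -= u_i
        marked += u_i
    return sizes, u, m


if __name__ == "__main__":
    rng = random.Random(0)
    print(expected_mean_path(100, 0.2, 0.5, 5, 3))
    print(simulate_capture_recapture(100, 0.2, 0.5, 0.3, 5, 3, rng))
```

Under these assumed dynamics the mean stays at $N$ before the catastrophe, drops to $N (1 - δ)$ at the catastrophe step, and then moves back towards $N$ geometrically at rate $1 - β$. Only the pairs of marked and unmarked counts would be observed in practice; the population sizes are returned here purely for inspection.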

In Section 2 we develop a dynamic population model that describes the evolution of the population sizes and the capture-recapture data. The model consists of latent variables, observed capture-recapture variables and the parameters of interest. In Section 3 we develop MCMC methods for computing the posterior distributions. When computing the conditional distribution of the latent variables, the distributions involved are too convoluted to admit feasible proposal densities with sizeable acceptance probabilities, so we develop a sequential importance sampling (SIS) scheme (cf. Doucet et al. (2001), Liu (2001) and Chapter 14 of Robert and Casella (2004)). Simulation studies are reported in Section 4. In Section 5 the model and algorithm are applied to capture-recapture data on mountain pygmy possums (Burramys parvus) whose habitat was affected by a bushfire during the data collection period. Further discussion is given in Section 6.
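For readers unfamiliar with SIS, the general idea can be sketched on a toy problem. The example below is not the scheme developed in the paper: it assumes a simplified latent population (Poisson initial size with mean $N$, binomial departures, Poisson arrivals, no catastrophe), a single binomial count per occasion as the observation, and it uses the state transition itself as the proposal density rather than the tailored proposals the paper constructs. All function names are hypothetical.

```python
import math
import random


def binomial_draw(rng, n, prob):
    """Number of successes in n independent trials with success probability prob."""
    return sum(1 for _ in range(n) if rng.random() < prob)


def poisson_draw(rng, lam):
    """Poisson draw by Knuth's multiplication method (adequate for moderate means)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def log_binomial_pmf(k, n, prob):
    """Log of the Binomial(n, prob) probability of k; -inf outside the support."""
    if k < 0 or k > n:
        return -math.inf
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(prob) + (n - k) * math.log1p(-prob))


def sis_filter(y, N, beta, p, n_particles, rng):
    """Sequential importance sampling for the toy model.

    Particles are propagated with the (assumed) state transition, and each
    weight is multiplied by the likelihood of the new observation.
    Returns the weighted estimates of E[x_t | y_1..y_t] and the final
    effective sample size. Expects a non-empty list y.
    """
    particles = [poisson_draw(rng, N) for _ in range(n_particles)]
    log_w = [0.0] * n_particles
    means, ess = [], float(n_particles)
    for t, y_t in enumerate(y):
        if t > 0:
            # Proposal = state transition: survivors plus Poisson arrivals.
            particles = [binomial_draw(rng, x, 1.0 - beta) + poisson_draw(rng, beta * N)
                         for x in particles]
        # Incremental weight = observation density of y_t given the particle.
        log_w = [lw + log_binomial_pmf(y_t, x, p) for lw, x in zip(log_w, particles)]
        top = max(log_w)
        if top == -math.inf:
            raise ValueError("all particles have zero weight")
        w = [math.exp(lw - top) for lw in log_w]  # rescaled for numerical stability
        total = sum(w)
        means.append(sum(wi * x for wi, x in zip(w, particles)) / total)
        ess = total ** 2 / sum(wi * wi for wi in w)
    return means, ess


if __name__ == "__main__":
    rng = random.Random(1)
    estimates, ess = sis_filter([30, 28, 33, 25, 31], N=100, beta=0.2, p=0.3,
                                n_particles=500, rng=rng)
    print(estimates, ess)
```

Because no resampling is performed, the weights tend to concentrate on a few particles as the number of occasions grows, which the effective sample size makes visible. This degeneracy is one reason why the choice of proposal density matters in SIS, and a short series of occasions keeps it manageable.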

Section snippets

The model and notation

Consider a capture-recapture experiment conducted at times $t_{1}, t_{2}, \dots, t_{I}$, with $t_{0}$ denoting an initial time point. The number of occasions, $I$, is usually small.

Assume that the captured animals are marked and released; the marks are recognizable on later occasions; individuals have the same probability $p$ of being captured, and $p$ does not vary with capture history; the mean departure rate from the population is the same as the mean arrival rate $β$, apart from the time period in which the catastrophic

Fitting the model

The state-space or hidden Markov framework is often used to formulate dynamic population models. In this framework the state transition distribution, say $g (x_{t} | x_{1 : (t - 1)})$, has a Markovian dependence structure among the latent variables $x_{1 : t} = (x_{1}, \dots, x_{t})$, whereas the observation distribution, say $h (y_{t} | x_{t})$, is conditional on the current state $x_{t}$. West et al. (1985), Fahrmeir (1992), West and Harrison (1997) and Durbin and Koopman (1997) discuss dynamic generalized linear models, where $g$ is a

Simulation study

Simulation studies were conducted to test the performance of the proposed method. We generated data according to the model in Fig. 1 with various parameter combinations that represent small, moderate and large catastrophe effects. We fixed the number of capture occasions at $I = 5$ and set $c = 4$. The entire process of population evolution was recorded, including the realized quantities $τ_{i} = (Y_{i}, W_{i}, M_{i 1}, M_{i 2})$ and $d_{i} = (u_{i}, m_{i})$ at all occasions. We then treated the $τ = (τ_{1}, \dots, τ_{I})$ as though they were unobserved and we

Application

The data we consider consist of annual captures of the mountain pygmy possum (Burramys parvus) at Mt. Hotham, Australia, from 2000 to 2004. Heinze et al. (2004) contains some discussion of this species. In each year trapping was conducted in November and the traps were placed in the same grid. There was a major bushfire in the area in 2003, so the captures in 2000–2002 were made before the fire and those in 2003 and 2004 after it. Interest lies in estimating the possible effect of the fire on the

Discussion

In traditional open population inference the population size on each occasion is regarded as a model parameter. In this paper we treat the population size as a latent stationary stochastic process. Our interest lies in changes to the mean of this process in response to an environmental catastrophe. Methodologically, the modelling approach in this paper can be regarded as a missing-data formulation. The Bayesian framework accommodates such a formulation naturally, because in Bayes all

Acknowledgements

The authors would like to thank Dean Heinze and Paul Mitrovski for providing the mountain pygmy possum data. They would also like to thank the associate editor and the referees for their suggestions and comments. The research was supported by an Australian Research Council Discovery Grant.

References (22)

R.M. Huggins, On the use of linear models in the estimation of the size of a population using capture-recapture data, Statistics and Probability Letters (2007)
S.P. Brooks et al., Bayesian animal survival estimation, Statistical Science (2000)
S.T. Buckland et al., Embedding population dynamics models in inference, Statistical Science (2007)
G. Casella et al., Explaining the Gibbs sampler, The American Statistician (1992)
F.J.J. Crome et al., A novel Bayesian approach to assessing impacts of rain forest logging, Ecological Applications (1996)
J. Durbin et al., Monte Carlo maximum likelihood estimation for non-Gaussian state space models, Biometrika (1997)
L. Fahrmeir, Posterior mode estimation by extended Kalman filtering for multivariate dynamic generalized linear model, Journal of the American Statistical Association (1992)
A. Gelman et al., Bayesian Data Analysis (1995)
A. Gelman et al., Inference from iterative simulation using multiple sequences, Statistical Science (1992)
