Population size estimation using local sample coverage for open populations

https://doi.org/10.1016/S0378-3758(02)00093-9

Abstract

An unsolved problem in the analysis of capture–recapture experiments is the estimation of the size of an open population when the capture probabilities are heterogeneous across the population. Here, we extend a kernel smoothing approach of Huggins and Yip (Biometrics 55 (1999) 387) to the martingale estimating functions based on sample coverage of Chao et al. (J. Statist. Plann. Inference 92 (2001) 213) and solve this problem when there are frequent capture occasions. Simulation results are shown to examine the performance of the proposed estimation procedure. A real data set is used for illustration.

Introduction

Capture–recapture models have been used for estimating parameters in biological populations. There are two types of models: closed and open. A closed model, which is usually valid for data collected in a short time, assumes that there are no additions (birth or immigration) or losses (death or emigration). The population size, which is the main parameter of interest, remains constant during the study period. An open model, which is used for long-term investigations, allows for additions and losses so that the population size varies with time during the experiment. The focus of this paper is on population size estimation in an open model, although the estimation of survival probabilities is also of interest in the biological and ecological sciences.

For closed models, there are many approaches to estimate population size under various assumptions. A practical and important class of models is called heterogeneous models in which capture probabilities are allowed to vary among animals. Several authors have proposed estimators covering a wide range of statistical methodologies. See a recent review by Schwarz and Seber (1999) on animal abundance models, in general, and on this topic in particular.

For open models, a commonly used model is the Jolly–Seber model (Seber, 1982; Pollock et al., 1990). An unresolved problem with this model is the effect of unequal capture probabilities on the Jolly–Seber estimators under general conditions. Another problem with the Jolly–Seber model is that all emigration is assumed to be permanent, so animals cannot leave and later rejoin the population. It has been documented that the Jolly–Seber population size estimators are generally biased downwards if there is heterogeneity in the capture probabilities; see Carothers (1979), Hwang and Chao (1995) and Pledger and Efford (1998). These authors have proposed approaches to take account of heterogeneity under various assumptions. For example, Hwang and Chao (1995) assumed permanent emigration and a multiplicative form of individual heterogeneity and time effect. Pledger and Efford (1998) extended the simulation method of Carothers (1979) under an “open but stable population” assumption, that is, a population with births and immigration replacing deaths and emigration such that the population size is constant throughout the study. A basic idea in this paper is to treat the open model as a series of overlapping “locally closed” models (defined later) so that techniques valid for closed models can be applied to an open model. In this framework, some restrictive assumptions can be relaxed.

Huggins and Yip (1999) used kernel smoothing to extend weighted martingale methods, developed to estimate the size of a closed population, to open populations. They showed that their method performed better than the Jolly–Seber estimator when individuals could leave and re-enter the population. However, they assumed the capture probabilities were homogeneous across the population at each time point. Their method can therefore be expected to share the disadvantages of the Jolly–Seber method in the presence of heterogeneous capture probabilities. Chao et al. (2001) developed martingale-based estimating functions to estimate the size of a closed population when the capture probabilities are heterogeneous, and Huggins and Chao (2002) examined the asymptotic properties of these estimators. Here, we apply the kernel smoothing approach of Huggins and Yip (1999) to these latter estimating functions to estimate the size of an open population with heterogeneous capture probabilities, under the assumption that the population is locally closed and there are regular capture occasions. We show in simulations that this estimator performs better than the Huggins and Yip (1999) estimator in the presence of heterogeneous capture probabilities.

As in Huggins and Yip (1999), we are motivated by weekly banding data on the bird species Prinia flaviventris (yellow-bellied prinia) collected at the Mai Po bird sanctuary in Hong Kong over 34 weeks from September 1991 to April 1992; the data are described in that paper. Lin and Yip (1999) analyzed part of the data (January–April 1992) using relevant covariates. Prinia flaviventris is a very common territorial species mainly inhabiting the reed beds in the swamp. The birds were captured in mist nets set in the reed beds. The analysis of Huggins and Yip (1999) revealed distinct seasonal behavior, with a large population peak from October to mid-December representing the presence of juvenile birds after the breeding season. The population then decreased in late December as many of the juvenile birds moved out of the trapping area or died. Another peak in February–March represented increased activity prior to breeding. During the subsequent breeding season, the females were on their nests and were largely not catchable, so the catchable population again decreased. Thus catchability may be expected to depend on the mobility of the birds, which may vary between age classes and genders. Recall that model Mh assumes that the capture probability may differ between individuals but remains fixed over time for each individual. Although age and gender information is available for some captured birds, it is missing for most of the birds caught in the study period. Therefore, the covariates cannot be used for the full 34 weeks of data. If it could be assumed that the population was closed over a relatively short time and the capture probabilities were time homogeneous, then model Mh would be most appropriate for estimating the population size over this short time period.

Consider a capture–release experiment with captures at times $0<t_1<\cdots<t_s$. For notational simplicity we suppose the capture occasions are evenly spaced, although this is not crucial. Let $k(t)$ denote the capture occasion closest to $t$. Our approach to estimating the population size at $t$, $N_t$, treats the capture data in the local time interval $[t_{k(t)-K},\,t_{k(t)+K}]$, where $K$ is a pre-selected value, as a closed model with constant population size $N_t$.
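The local window can be made concrete with a small sketch. The function names, the tie-breaking rule (the earlier occasion wins) and the clipping of the window at the ends of the study period are illustrative assumptions, not details taken from the paper.

```python
def closest_occasion(times, t):
    """Return the 0-based index k(t) of the capture time closest to t.

    Ties are broken in favour of the earlier occasion (an assumption).
    """
    return min(range(len(times)), key=lambda j: abs(times[j] - t))


def local_window(times, t, K):
    """Indices of the occasions from k(t)-K to k(t)+K inclusive.

    The window is clipped to the available occasions near the start and
    end of the study (an assumption made for this sketch).
    """
    k = closest_occasion(times, t)
    lo = max(0, k - K)
    hi = min(len(times) - 1, k + K)
    return list(range(lo, hi + 1))


# Example: 20 evenly spaced occasions at times 1, 2, ..., 20.
times = list(range(1, 21))
window = local_window(times, 10.2, 3)
print([times[j] for j in window])  # occasions 7, 8, ..., 13
```

Data from the occasions in this window would then be analysed as if they came from a closed population.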

We say a heterogeneous population is approximately locally closed if:

  1. The capture probabilities of individuals arriving into the population are independent random variables from the distribution $F$ with mean $E(p)$ and coefficient of variation (CV) $\gamma$, where $\gamma=\{E[p-E(p)]^2\}^{1/2}/E(p)$, and these capture probabilities are independent from individual to individual.
  2. The capture probabilities do not depend on $t$.
  3. The size of the population at time $t$ is $N_t=[\eta(t)]$, where $\eta(t)>0$ is a continuous function that is bounded away from 0 in the study time period, and $[x]$ denotes the closest integer to $x$. There exists a constant $\lambda$ such that $|\eta(t)-\eta(s)|<\lambda|s-t|$.
  4. The individuals in the population behave independently of each other. Given any individual, captures at different occasions are independent.
  5. In any interval, the number of marked individuals that die or are otherwise removed from the population tends to zero as the width of the interval tends to zero.
  6. Marking does not affect the capture probabilities.
  7. The probability of removal has the same distribution for individuals captured and released at a given capture occasion as for individuals captured and released before that occasion.

This definition extends the definition of Huggins and Yip (1999) to the case where the capture probabilities are heterogeneous. Assumption 1 concerns the distribution of the capture probabilities and was not required in the homogeneous case considered in Huggins and Yip (1999). This assumption is required for the closed population estimator of Chao et al. (2001) as is Assumption 4. Assumptions 3–6 are as in Huggins and Yip (1999). Assumptions 3 and 5 ensure the population size changes smoothly. Assumption 6 could be relaxed but for simplicity we concentrate on model Mh rather than model Mbh, which includes the behavioural response to capture. Similarly, Assumption 2 could be relaxed if one used the appropriate martingale estimating equations of Chao et al. (2001) or assumed the individual capture probabilities were smooth functions of t. However, this latter condition would require showing the existence of a stochastic process with the appropriate marginal distributions and this is beyond the scope of the present work. Assumption 7 allows us to estimate the number of marked animals that are actually in the population. This assumption is implicit in Huggins and Yip (1999) but was not explicitly stated there. It implies that the relationship between capture and removal does not change as a function of time and seems reasonable in practice. For example, it would be implied by the assumptions that the removal probabilities are independent of the capture probabilities and that marking and release do not affect the removal probabilities.
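To illustrate the coefficient of variation in Assumption 1, the following sketch computes it for a beta distribution, using the standard formulas for the beta mean and variance, and for a finite list of capture probabilities. The choice of a beta distribution for F and the function names are assumptions made for illustration only.

```python
import math


def beta_cv(a, b):
    """CV of a Beta(a, b) capture probability: sd(p) / E(p)."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return math.sqrt(var) / mean


def sample_cv(p):
    """CV of a finite list of capture probabilities (population form)."""
    m = sum(p) / len(p)
    var = sum((x - m) ** 2 for x in p) / len(p)
    return math.sqrt(var) / m


# A uniform distribution on (0, 1) is Beta(1, 1); its CV is 1/sqrt(3).
print(beta_cv(1, 1))
# Equal capture probabilities (the homogeneous case) give a CV of zero.
print(sample_cv([0.5, 0.5, 0.5]))
```

A CV of zero corresponds to the homogeneous case, and larger values indicate stronger heterogeneity.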

Section 2 lists further notation. Our proposed estimating procedure is presented in Section 3. The weekly banding data collected at the Mai Po bird sanctuary in Hong Kong are analysed in Section 4, where the proposed estimator is compared with other estimators. Simulation results are reported in Section 5.

The weighted martingale estimators

The approach to estimating the population size at time t is to consider capture occasions close to t and weight the resulting estimating functions so that occasions closest to t have the greatest weight.

Let $X_{ij}=1$ if the $i$th individual is captured on the $j$th occasion and $0$ otherwise. For a given $t$, $0<t<t_s$, let $\omega_j(t)$, $j=1,2,\ldots,s$, be a set of weights such that $\sum_{j=1}^{s}\omega_j(t)=1$. We suppose that $\omega_j(t)$ is non-zero if and only if the time $t_j$ of capture occasion $j$ is in $(t_{k(t)-K},\,t_{k(t)+K})$. In this paper,
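The two stated properties of the weights (they sum to one, and they are non-zero only for occasions strictly inside the local window) can be sketched as follows. The kernel actually used in the paper is not shown in this excerpt, so the quadratic (Epanechnikov-type) kernel below, the centring on occasion k(t) and the function name are all hypothetical choices, and evenly spaced occasions are assumed.

```python
def kernel_weights(s, k, K):
    """Hypothetical weights for occasions j = 0..s-1, centred on occasion k.

    Uses a quadratic kernel 1 - (|j - k| / K)^2, which is positive only when
    |j - k| < K, then normalises so that the weights sum to one. This is an
    illustrative choice, not necessarily the kernel used in the paper.
    """
    raw = []
    for j in range(s):
        d = abs(j - k) / K
        raw.append(1.0 - d * d if d < 1.0 else 0.0)
    total = sum(raw)  # positive whenever 0 <= k < s and K >= 1
    return [w / total for w in raw]


# Example: 20 occasions, window centred on index 9 with K = 3.
w = kernel_weights(20, 9, 3)
print([j for j, wj in enumerate(w) if wj > 0])  # indices 7 to 11
print(sum(w))
```

With this kernel the occasion closest to t receives the largest weight, and the weight decreases smoothly to zero at the edges of the window.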

Example

As indicated in the Introduction, we were motivated by the capture data set of Prinia flaviventris collected at the Mai Po bird sanctuary over a period of 34 weeks from September 1991 to April 1992. In several weeks no banding was conducted because of weather conditions. A total of 216 birds were captured in this period, of which 163 were captured only once, 45 were captured twice, 6 were captured three times, 1 was captured four times and 1 was captured six times. Detailed trapping data are given in
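As a quick arithmetic check of these counts, the sketch below tabulates the capture frequencies and computes the Good (1953) sample coverage estimate, one minus the proportion of capture events that belong to birds caught exactly once. Pooling all 34 weeks in this way ignores the openness of the population and is done here only to illustrate the quantity; it is not the local analysis proposed in the paper.

```python
# Number of birds captured exactly k times, as reported in the text.
freq = {1: 163, 2: 45, 3: 6, 4: 1, 6: 1}

n_birds = sum(freq.values())                        # distinct birds captured
n_captures = sum(k * f for k, f in freq.items())    # total capture events

# Good's coverage estimate: 1 - f_1 / n, pooled over the whole study
# (an illustrative simplification, see the lead-in above).
coverage = 1 - freq[1] / n_captures

print(n_birds)             # 216
print(n_captures)          # 281
print(round(coverage, 3))  # 0.42
```

The low pooled coverage reflects the fact that most birds were caught only once.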

Simulation results

A limited simulation study was performed to examine the performance of the estimation procedure for open heterogeneous populations. We considered death-only models. The initial population size was fixed at 400 and the capture probabilities for these 400 birds were generated from beta distributions. We selected ten beta distributions as given in Table 1. The capture data were generated using 20 evenly spaced trapping occasions at capture times 1,2,…,20. Each bird has a survival
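A minimal sketch of a death-only simulation of this kind is given below. The beta parameters (Table 1 is not reproduced in this excerpt), the constant per-occasion survival probability phi, the seed and the function name are all assumptions made for illustration; only the initial size of 400 and the 20 occasions come from the text.

```python
import random


def simulate_death_only(n0=400, a=2.0, b=6.0, s=20, phi=0.95, seed=1):
    """Simulate capture histories for a death-only open population.

    n0 and s follow the text; a, b, phi and seed are hypothetical values.
    Returns (history, sizes), where history[i][j] is 1 if bird i was caught
    on occasion j, and sizes[j] is the number of birds alive at occasion j.
    """
    rng = random.Random(seed)
    p = [rng.betavariate(a, b) for _ in range(n0)]  # heterogeneous capture probabilities
    alive = [True] * n0
    history = [[0] * s for _ in range(n0)]
    sizes = []
    for j in range(s):
        sizes.append(sum(alive))
        for i in range(n0):
            if not alive[i]:
                continue
            if rng.random() < p[i]:
                history[i][j] = 1
            # Death-only model: each bird survives to the next occasion
            # with probability phi and never returns once removed.
            if rng.random() >= phi:
                alive[i] = False
    return history, sizes


history, sizes = simulate_death_only()
n_caught = sum(1 for h in history if any(h))
print(sizes[0], sizes[-1], n_caught)
```

The true sizes in `sizes` can be compared with estimates computed from `history` to assess bias.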

Discussion

The simulation study reveals that our approach has advantages over the Huggins and Yip (1999) method in the presence of heterogeneous capture probabilities. However, if there is a non-negligible fraction of uncatchable individuals in the population, then the estimator is biased. This is consistent with the analytic results of Huggins and Chao (2002) for the closed population estimator upon which our approach is based, and with the more general results of Huggins (2001). In the two

Acknowledgements

This research was completed while the first and the last authors were visiting National Tsing Hua University, Hsin-Chu, Taiwan. The visit of the first author was supported by an exchange program between the Australian Academy of Science and the Taiwan National Science Council.

References (17)

  • A. Chao et al. Population size estimation based on estimating functions for closed capture–recapture models. J. Statist. Plann. Inference (2001)
  • R.M. Huggins. A note on the difficulties associated with the analysis of capture–recapture experiments. Statist. Probab. Lett. (2001)
  • A.D. Carothers. Quantifying unequal catchability and its effects on survival estimates in an actual population. J. Anim. Ecol. (1979)
  • W.W. Esty. The efficiency of Good's nonparametric estimator. Ann. Statist. (1986)
  • I.J. Good. The population frequencies of species and the estimation of population parameters. Biometrika (1953)
  • R.M. Huggins et al. Asymptotic properties of an optimal estimating function approach to the analysis of mark recapture data. Comm. Stat. (2002)
  • R.M. Huggins et al. Estimation of the size of an open population from capture–recapture data using weighted martingale methods. Biometrics (1999)
  • W.-D. Hwang et al. Quantifying the effects of unequal catchabilities on Jolly–Seber estimators via sample coverage. Biometrics (1995)
There are more references available in the full text version of this article.
