Some properties of a nonparametric estimator of the size of an open population
Introduction
Local polynomial models are used extensively in nonparametric regression (Fan and Gijbels, 1996). Their use in estimating the size of an open population using capture–recapture data follows from Huggins and Yip (1999). Their approach was based on martingale estimating equations and extended the well-known Jolly–Seber estimators (Seber, 1982) by giving smooth estimates of both the number of marked individuals in the population and the population size. These estimators arose from applying kernel smoothing methods to the closed population martingale estimators of Yip (1993).
For closed populations, there is a population of size N and a number of capture occasions on which individuals in the population can be captured. On each occasion the captured individuals that have not been previously captured are marked and the marks of the recaptured individuals are noted. Thus, for each individual that was captured at least once, the raw data consist of the occasions on which it was captured. Let $n_j$ denote the number of individuals captured on occasion j, $m_j$ the number of these that had been previously marked and $M_j$ the known number of marked individuals in the population just prior to occasion j. Under the assumption that the capture probabilities are homogeneous across the population on each occasion, given $n_j$, $M_j$ and N, $m_j$ has a hypergeometric distribution so that $E(m_j \mid n_j, M_j, N) = n_j M_j / N$. This gives rise to the martingale estimating equations, and a simple closed form estimator for N.
In an open population, where the population size varies with the occasion j, Huggins and Yip (1999) developed estimating equations for the population size, weighted by kernel weight functions, in which the population size was supposed to be locally constant, i.e. a polynomial of degree zero. Subsequently Huggins et al. (2003) extended the approach to sample coverage estimators to relax the equal catchability assumptions of Huggins and Yip (1999), and Yang and Huggins (2003) and Yang et al. (2003) further extended the models to local polynomial models, but relied on bootstrap procedures to estimate standard errors. More recently, Huggins (2006) gave expressions for the standard errors in the semi-parametric case and verified them in simulations but only outlined their derivation. These previous articles have demonstrated the utility of the method and motivate a deeper examination of the properties of the estimators. In this note we begin the formal derivation of these properties. We return to the simpler nonparametric estimators of Huggins and Yip (1999) and further simplify the setting by supposing the number of marked animals is observable.
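The closed-population estimating equation described above can be illustrated with a minimal sketch. It assumes unit weights, so the equation is sum_j (m_j - n_j M_j / N) = 0; the weights used by Yip (1993) may differ. The symbols n_j, m_j, M_j, the function names and the simulation settings are illustrative assumptions, not taken from the article.

```python
import random


def simulate_closed(N, occasions, p, seed=0):
    """Simulate a closed population with homogeneous capture probability p.

    Returns per-occasion lists (n, m, M): number captured, number of those
    already marked, and number marked just before the occasion.
    """
    rng = random.Random(seed)
    marked = [False] * N
    n, m, M = [], [], []
    for _ in range(occasions):
        M.append(sum(marked))
        caught = [i for i in range(N) if rng.random() < p]
        n.append(len(caught))
        m.append(sum(marked[i] for i in caught))
        for i in caught:
            marked[i] = True  # newly captured individuals are marked
    return n, m, M


def martingale_estimate(n, m, M):
    """Solve sum_j (m_j - n_j * M_j / N) = 0 for N (unit weights assumed)."""
    recaptures = sum(m)
    if recaptures == 0:
        raise ValueError("no recaptures: N is not estimable")
    return sum(nj * Mj for nj, Mj in zip(n, M)) / recaptures


if __name__ == "__main__":
    n, m, M = simulate_closed(N=500, occasions=10, p=0.2, seed=1)
    print(martingale_estimate(n, m, M))
```

With unit weights the solution is the ratio of sum_j n_j M_j to the total number of recaptures, which is why no estimate exists when nothing is recaptured.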
Here we develop a setting for capture–recapture experiments in which we can determine conditions under which the large sample properties of the local polynomial estimators of population size may be derived and the rates of convergence can be examined. Local polynomial models involve the order p of the polynomial and a bandwidth h, and the bias and variance depend on both these quantities. When local polynomials are applied to regression models with n observations, the bias terms are $O(h^{p+1})$ for odd p and $O(h^{p+2})$ for even p, and the variance is $O((nh)^{-1})$ (Fan and Gijbels, 1996), giving the familiar trade-off between bias and variance. These results require that $h \to 0$ and $nh \to \infty$, which is only of interest if the design points become dense. In nonparametric regression of Y on x this can be achieved by assuming we observe independent pairs $(x_i, Y_i)$, $i = 1, \dots, n$, where the $x_i$ are independently and identically distributed with some common density $f$. Thus, to justify the use of local polynomial models in capture–recapture experiments, it is necessary to develop an asymptotic setting where the capture occasions become dense over the time period in which the experiment is conducted. The results obtained here are perhaps of less practical importance but are important in understanding the procedures and how standard errors may be derived in more complex situations.
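To make the roles of the order p and the bandwidth h concrete, the following is a minimal sketch of a local polynomial regression fit at a single point, written in plain Python. The Epanechnikov kernel, the function names and the small linear solver are illustrative choices, not part of the article.

```python
def epanechnikov(u):
    """Symmetric kernel supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: widen the bandwidth or lower p")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            f = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * k
    for r in range(k - 1, -1, -1):
        s = aug[r][k] - sum(aug[r][c] * x[c] for c in range(r + 1, k))
        x[r] = s / aug[r][r]
    return x


def local_poly(x0, xs, ys, h, p):
    """Local polynomial estimate of E(Y | x = x0) with order p and bandwidth h."""
    w = [epanechnikov((x - x0) / h) / h for x in xs]
    # Weighted normal equations for a degree-p polynomial in (x - x0).
    A = [[sum(wi * (x - x0) ** (r + c) for wi, x in zip(w, xs))
          for c in range(p + 1)] for r in range(p + 1)]
    b = [sum(wi * (x - x0) ** r * y for wi, x, y in zip(w, xs, ys))
         for r in range(p + 1)]
    return solve(A, b)[0]  # the intercept is the fitted value at x0


if __name__ == "__main__":
    xs = [i / 100 for i in range(101)]
    ys = [x * x for x in xs]
    for p in (0, 1, 2):
        print(p, local_poly(0.5, xs, ys, h=0.2, p=p))
```

A fit of order p reproduces any polynomial of degree at most p without error, so the bias comes only from higher-order terms of the regression function within the kernel window, while a smaller h leaves fewer design points in the window and so raises the variance.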
In developing the asymptotic properties of a population size estimator from capture–recapture data collected at discrete capture occasions there are three factors to consider: the population size N, the time between capture occasions, and the length of time over which the experiment is conducted. In traditional experiments for closed populations, with the time between occasions and the length of the experiment fixed, it is implicit in the development of the asymptotic properties and the associated approximating distributions that $N \to \infty$. Sometimes this is made explicit (Darroch, 1958; Huggins, 1989). Keeping N and the length of the experiment fixed and letting the time between occasions tend to zero approximates the continuous time counting process considered by Becker (1984). To the author's knowledge, keeping N and the time between occasions fixed and letting the length of the experiment increase has not been explicitly studied, although clearly for closed populations the number of animals sighted over the course of the experiment will tend to N as its length increases. In open populations with immigration this situation may be more interesting.
We obtain our asymptotic results by considering a sequence of populations of increasing size and allowing the capture occasions to become closer together while the length of the experiment is held fixed. We also require that the probability of capture on each occasion is proportional to the time between capture occasions. This is natural if the traps are always deployed but inspected at regular intervals. This situation could occur if trapping were conducted over a larger and larger area with increasingly frequent inspections of the traps.
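One consequence of making the capture probability proportional to the time between occasions can be checked numerically. In this sketch, as the occasions become dense over a fixed period, the chance that an individual is caught at least once approaches the value for a continuous-time process with constant capture rate. The names rate, duration and spacing are hypothetical and stand in for the article's own notation.

```python
import math


def prob_captured_at_least_once(rate, duration, spacing):
    """P(at least one capture) when each occasion has probability rate * spacing."""
    occasions = round(duration / spacing)
    p = rate * spacing
    if not 0.0 <= p <= 1.0:
        raise ValueError("rate * spacing must be a probability")
    return 1.0 - (1.0 - p) ** occasions


def continuous_limit(rate, duration):
    """Limit as spacing -> 0: capture times form a Poisson process."""
    return 1.0 - math.exp(-rate * duration)


if __name__ == "__main__":
    for spacing in (1.0, 0.1, 0.01, 0.001):
        print(spacing, prob_captured_at_least_once(0.5, 10.0, spacing))
    print("limit", continuous_limit(0.5, 10.0))
```

Without the proportionality requirement, adding occasions at a fixed per-occasion probability would drive the probability of capture to one and the limit would be degenerate.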
In Section 2 we give the model and assumptions. In Section 3 the estimators are given and we derive expressions for their bias and variance. Section 4 contains some discussion of our results.
Model and assumptions
Consider a sequence of capture–recapture experiments over the time interval . Let denote the size of the population at time t in the rth experiment and suppose as . For a given r, we take the point of view that the population size is fixed and that the sample space consists of the outcomes of capture–recapture experiments on this population. We suppose the capture experiment on the rth population consists of captures at the equally spaced capture occasions
The estimators and their properties
Let be a symmetric kernel function with support , , , be a matrix with element , and . For a given t and bandwidth , weighted estimating equations for areThis yields the estimatorsand , whereand
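As a rough illustration of kernel-weighted estimating equations in the simplest case, the sketch below treats the population size as locally constant (degree zero) and assumes the number of marked individuals is known at each occasion. It solves sum_j K((t_j - t)/h) (m_j - n_j M_j / N) = 0 for N. This form, the Epanechnikov kernel and all names are assumptions made for illustration. It is not the article's local polynomial estimator, whose expressions are not reproduced here.

```python
def epanechnikov(u):
    """Symmetric kernel supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_constant_size(t, times, n, m, M, h):
    """Locally constant population size estimate at time t.

    times: capture occasion times; n: numbers captured; m: numbers of marked
    individuals among those captured; M: known numbers marked before each
    occasion; h: bandwidth.
    """
    w = [epanechnikov((tj - t) / h) for tj in times]
    num = sum(wj * nj * Mj for wj, nj, Mj in zip(w, n, M))
    den = sum(wj * mj for wj, mj in zip(w, m))
    if den == 0:
        raise ValueError("no weighted recaptures near t: widen the bandwidth")
    return num / den


if __name__ == "__main__":
    times = [j / 10 for j in range(11)]
    n = [20] * 11
    M = [50] * 11
    m = [5] * 11  # equals n_j * M_j / N for N = 200
    print(local_constant_size(0.5, times, n, m, M, h=0.3))
```

Only occasions within h of t contribute, so the estimate tracks changes in population size at the cost of using fewer recaptures at each t.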
Discussion
We have given conditions under which the local polynomial estimator for the size of a population for a simple model that assumes homogeneous capture probabilities has analogues of the classical bias and variance expressions when the number of marked individuals is known: the bias in estimating by is for , for odd p it is , for even p it is and its variance is . To obtain these results we have supposed that the
Acknowledgement
The author is grateful to two referees for comments that helped clarify the exposition.
References (12)
- Huggins et al. (2003). Population size estimation using local sample coverage for open populations. J. Statist. Plan. Inference.
- Yip (1993). Statistical inference procedure for a hypergeometric model for capture–recapture experiment. Appl. Math. Comput.
- Becker (1984). Estimating population size from capture–recapture experiments in continuous time. Austral. J. Statist.
- Darroch (1958). The multiple-recapture census: I. Estimation of a closed population. Biometrika.
- Fan and Gijbels (1996). Local Polynomial Modelling and its Applications.
- Huggins (1989). On the statistical analysis of capture experiments. Biometrika.