On GMM estimation of distributions from grouped data
Introduction
Consider a sample of observations, randomly drawn from a parametric distribution and grouped into classes defined by exogenously chosen class limits. Let an indicator function take the value 1 when an observation falls in a given class, and 0 otherwise. Assume that the data available to the researcher are (a) the sample mean, (b) the proportion of observations in each class, and (c) the proportion of the total value of all observations in each class. Our problem is to estimate the parameters of the distribution and, if they are unknown, the class limits.
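As a minimal sketch of how these three grouped-data quantities arise from an underlying sample, the following pure-Python example groups simulated observations. The lognormal draws, the class limits, and the names `group_sample`, `ybar`, `c`, and `s` are illustrative assumptions rather than anything taken from the study; classes are assumed to be intervals that are open on the left and closed on the right, with the lowest limit at zero and the highest at infinity.

```python
import random
from bisect import bisect_left


def group_sample(y, limits):
    """Return (sample mean, population shares, income shares).

    `limits` holds the interior class limits in increasing order;
    a lowest limit of 0 and a highest limit of infinity are implied.
    An observation equal to a limit is placed in the lower class.
    """
    n_classes = len(limits) + 1
    counts = [0] * n_classes
    totals = [0.0] * n_classes
    for value in y:
        i = bisect_left(limits, value)  # index of the class containing value
        counts[i] += 1
        totals[i] += value
    n_obs = len(y)
    grand_total = sum(totals)
    mean = grand_total / n_obs
    pop_shares = [n / n_obs for n in counts]
    income_shares = [t / grand_total for t in totals]
    return mean, pop_shares, income_shares


# Hypothetical data: lognormal "incomes" and arbitrary class limits.
random.seed(0)
sample = [random.lognormvariate(0.0, 0.8) for _ in range(10_000)]
limits = [0.5, 1.0, 2.0, 4.0]

ybar, c, s = group_sample(sample, limits)

# Group means can be recovered from the published quantities alone.
group_means = [s_i * ybar / c_i for c_i, s_i in zip(c, s)]
```

The last line uses only the mean and the two sets of shares, which is all a researcher working with grouped data would observe.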
The motivation for this problem is the availability and use of grouped data on income or expenditure, typically provided in this form on the websites of the World Bank and the World Institute for Development Economics Research (WIDER). Data on population shares, income shares, and mean income are available for estimating income distributions. Examples of studies that use such data for estimation and for measuring poverty and inequality are Chotikapanich et al. (2007, 2012) and Hajargasht et al. (2012). A method of moments estimator that utilizes the population shares, income shares, and mean income to estimate beta-2 distributions was proposed by Chotikapanich et al. (2007), and later used in a large-scale study of changes in global and regional inequality by Chotikapanich et al. (2012). Hajargasht et al. (2012) refined this earlier work by deriving an optimal GMM estimator, and showing how it can be used to estimate parametric income distributions of any form.
These studies set up moment conditions for the population proportions, and for either the mean income in each group or that part of total mean income attributable to each group. Chotikapanich et al. (2007, 2012) used moment conditions for the group means, whereas, for deriving an optimal GMM estimator, Hajargasht et al. (2012) found it easier to work with each group's contribution to total mean income. In this paper we derive an expression for the optimal GMM estimator when using the moment conditions for the group mean incomes. Although both approaches are asymptotically equivalent, specification of the moment conditions in terms of the group means is more natural. More importantly, the resulting GMM objective function is more convenient computationally than its counterpart for the group contributions to total mean income; the minimization problem is simpler and convergence is easier to achieve. A small Monte Carlo experiment is used to demonstrate the validity and practicality of the proposed estimator.
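The link between the two candidate quantities can be written out explicitly. The symbols below are assumed for illustration, since the notation did not survive in this text: $T$ is the sample size, $c_i$ the population share of group $i$, $s_i$ its income share, $\bar{y}$ the overall mean, $\bar{y}_i$ the mean income in group $i$, and $\tilde{y}_i$ the part of total mean income in group $i$. Group $i$ then contains $T c_i$ observations whose values sum to $T \bar{y} s_i$, so that:

```latex
\begin{align*}
\bar{y}_i   &= \frac{T\,\bar{y}\,s_i}{T\,c_i} = \frac{s_i\,\bar{y}}{c_i}, \\
\tilde{y}_i &= s_i\,\bar{y} = c_i\,\bar{y}_i, \\
\sum_{i} \tilde{y}_i &= \bar{y}\sum_{i} s_i = \bar{y}.
\end{align*}
```

Under these assumed definitions, either quantity can be computed from the other once the population shares are known, so the same data support both sets of moment conditions.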
Throughout, we treat the class limits as unknown since doing so is more general than treating them as known, and they are not provided in the data source that motivated this study. However, our results also hold for the case where the class limits are known; there is simply a reduction in the number of parameters to be estimated. Also, since our GMM estimator makes a distributional assumption about the data generating process, it differs from the traditional GMM estimator, which is based on a less restrictive set of assumptions. We utilize GMM in spite of the distributional assumption because derivation of the likelihood function that uses information on both the population shares and the group incomes is not straightforward.
In Section 2 we review the moment conditions, optimal weight matrix and GMM objective function set up and derived by Hajargasht et al. (2012) for data on each group's contribution to total mean income. In Section 3 we use these results to derive the optimal weight matrix and GMM objective function for the case where the group means are used to set up the moment conditions. Results from a Monte Carlo experiment are presented in Section 4.
Section snippets
Previous results
Let the complete set of unknown parameters comprise the class limits and the parameters of the distribution. The population moments corresponding to the population shares and to the group contributions to total mean income follow from Eqs. (1) and (4). Setting up corresponding moment conditions in matrix notation, we define
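Optimal weight matrix and objective function for the group means
Considering now the moment conditions for the group means, we have (as before) a condition for each population share, together with a condition for each group mean. Collecting these terms into a vector, we define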
A Monte Carlo analysis
The design of the Monte Carlo experiment is as follows. We generate data from a generalized beta distribution of the second kind (GB2). This distribution is a flexible and popular candidate for estimating income distributions (see, e.g., McDonald, 1984; McDonald and Ransom, 2008; or Kleiber and Kotz, 2003). Its density function is defined as $f(y; a, b, p, q) = \dfrac{a\,y^{ap-1}}{b^{ap}\,B(p,q)\,\left[1 + (y/b)^{a}\right]^{p+q}}$ for $y > 0$, with parameters $a, b, p, q > 0$, where $B(p,q)$ is the beta function. The settings in the experiment were and , implying a relatively
Conclusion
With the estimation of income distributions as our motivation, we have derived the moment conditions and optimal weight matrix for GMM estimation of distributions with grouped data when the moment conditions are based on data in the form of group population shares and group means. Importantly, we showed that in this case the optimal weight matrix is considerably simpler than that used in a previous formulation, leading to a computationally simpler objective function for GMM estimation. A
References (8)
- Cameron, A.C., Trivedi, P.K. (2005). Microeconometrics: Methods and Applications.
- Chotikapanich et al. (2012). Global income distributions and inequality, 1993 and 2000: incorporating country-level inequality modelled with beta distributions. Rev. Econ. Stat.
- Chotikapanich et al. (2007). Estimating and combining national income distributions using limited data. J. Bus. Econom. Statist.
- Greene, W.H. (2012). Econometric Analysis.