Elsevier

Economics Letters

Volume 126, January 2015, Pages 122-126
On GMM estimation of distributions from grouped data

https://doi.org/10.1016/j.econlet.2014.11.031

Highlights

  • Use of grouped data comprising the group proportions and group means.

  • GMM estimation of a parametric distribution from the grouped data.

  • A simplification of the GMM estimator suggested in earlier literature.

  • Monte Carlo evidence of the usefulness and practicality of the estimator.

Abstract

For estimating distributions from grouped data, setting up moment conditions in terms of group shares and group means leads to an optimal weight matrix and a GMM objective function that are considerably simpler than those from a previous specification. Minimization is more efficient and convergence is more reliable.

Introduction

Consider a sample of $T$ observations $(y_1, y_2, \ldots, y_T)$, randomly drawn from a parametric distribution $f(y;\phi)$, $y>0$, and grouped into $N$ classes defined by exogenously chosen class limits $(z_0,z_1),(z_1,z_2),\ldots,(z_{N-1},z_N)$, with $z_0=0$ and $z_N=\infty$. Let $g_i(y)$ be an indicator function such that $g_i(y)=1$ if $z_{i-1}<y\le z_i$, and 0 otherwise. Assume that the data available to the researcher are (a) the sample mean $\bar y$, (b) the proportion of observations in each class $c_i=\frac{1}{T}\sum_{t=1}^{T}g_i(y_t)=\frac{T_i}{T}$, and (c) the proportion of the total value of all observations in each class $s_i=\frac{1}{T\bar y}\sum_{t=1}^{T}y_t\,g_i(y_t)$. Our problem is to estimate $\phi$ and, if they are unknown, the class limits $z_1,z_2,\ldots,z_{N-1}$.
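As an illustration of the three data items just defined, the following minimal Python sketch computes the sample mean, the class proportions and the class shares of the total from a raw sample. The function name `grouped_summaries`, the five observations and the class limits are hypothetical and are not taken from the paper; in the applications that motivate the paper only the summaries, not the raw sample, are observed.

```python
import math

def grouped_summaries(y, limits):
    """Return (ybar, c, s) for a sample y and class limits z_0 < z_1 < ... < z_N."""
    T = len(y)
    N = len(limits) - 1
    ybar = sum(y) / T
    counts = [0] * N      # T_i: number of observations in class i
    totals = [0.0] * N    # sum of y_t over observations in class i
    for obs in y:
        for i in range(N):
            # indicator g_i(y) = 1 if z_{i-1} < y <= z_i
            if limits[i] < obs <= limits[i + 1]:
                counts[i] += 1
                totals[i] += obs
                break
    c = [n / T for n in counts]               # population shares c_i = T_i / T
    s = [tot / (T * ybar) for tot in totals]  # shares of the total value s_i
    return ybar, c, s

sample = [5.0, 12.0, 18.0, 30.0, 35.0]   # hypothetical observations
limits = [0.0, 10.0, 20.0, math.inf]     # z_0 = 0 and z_N = infinity
ybar, c, s = grouped_summaries(sample, limits)
```

For this toy sample the mean is 20, and both the class proportions and the class shares sum to one, as the definitions require.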

The motivation for this problem is the availability and use of grouped data on income or expenditure, typically provided in this form on the websites of the World Bank and the World Institute for Development Economics Research (WIDER). Data on population shares $c_i$, income shares $s_i$, and mean income $\bar y$ are available for estimating income distributions $f(y;\phi)$. Examples of where such data are used for estimation and for measuring poverty and inequality are Chotikapanich et al. (2007, 2012) and Hajargasht et al. (2012). A method of moments estimator that utilizes $c_i$, $s_i$, and $\bar y$ to estimate beta-2 distributions was proposed by Chotikapanich et al. (2007), and later used in a large scale study of changes in global and regional inequality by Chotikapanich et al. (2012). Hajargasht et al. (2012) refined this earlier work by deriving an optimal GMM estimator, and showing how it can be used to estimate parametric income distributions of any form.

These studies set up moment conditions for the proportions $c_i$, and for either mean income in each group $\bar y_i=\frac{1}{T_i}\sum_{t=1}^{T}y_t\,g_i(y_t)=\frac{s_i\bar y}{c_i}$, or for that part of total mean income in the $i$th group $\tilde y_i=\frac{1}{T}\sum_{t=1}^{T}y_t\,g_i(y_t)=s_i\bar y=c_i\bar y_i$. Chotikapanich et al. (2007, 2012) used moment conditions for $\bar y_i$, whereas, for deriving an optimal GMM estimator, Hajargasht et al. (2012) found it easier to work with $\tilde y_i$. In this paper we derive an expression for the optimal GMM estimator when using the moment conditions for the group mean incomes $\bar y_i$. Although both approaches are asymptotically equivalent, specification of the moment conditions in terms of the group means $\bar y_i$ is more natural. More importantly, the resulting GMM objective function is more convenient computationally than its counterpart for $\tilde y_i$; the minimization problem is simpler and convergence is easier to achieve. A small Monte Carlo experiment is used to demonstrate the validity and practicality of the proposed estimator.
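The two ways of expressing group income information are linked by simple identities, which the following short Python sketch makes concrete. The function names and the numerical values are hypothetical illustrations, not data from the paper.

```python
def group_means(c, s, ybar):
    """Mean within each class: ybar_i = s_i * ybar / c_i."""
    return [si * ybar / ci for ci, si in zip(c, s)]

def group_mean_contributions(s, ybar):
    """Part of the overall mean contributed by each class: ytilde_i = s_i * ybar."""
    return [si * ybar for si in s]

# Hypothetical grouped data: population shares, income shares, overall mean
c = [0.2, 0.4, 0.4]
s = [0.05, 0.30, 0.65]
ybar = 20.0

ybar_i = group_means(c, s, ybar)
ytilde_i = group_mean_contributions(s, ybar)
```

With these inputs the group means are approximately 5, 15 and 32.5, the contributions $\tilde y_i$ are approximately 1, 6 and 13, and the contributions sum to the overall mean of 20. Each contribution also equals $c_i\bar y_i$, so either set of quantities can be recovered from the other once the $c_i$ are known.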

Throughout, we treat the class limits $(z_1,z_2,\ldots,z_{N-1})$ as unknown since doing so is more general than treating them as known, and they are not provided in the data source that motivated this study. However, our results also hold for the case where $(z_1,z_2,\ldots,z_{N-1})$ are known; there is simply a reduction in the number of parameters to be estimated. Also, since our GMM estimator makes a distributional assumption about the data generating process, it differs from the traditional GMM estimator, which is based on a less restrictive set of assumptions. We utilize GMM in spite of the distributional assumption because derivation of the likelihood function that uses information on both the $c_i$ and the $\bar y_i$ is not straightforward.

In Section 2 we review the moment conditions, optimal weight matrix and GMM objective function set up and derived by Hajargasht et al. (2012) for using data on $(c_i,\tilde y_i)$. In Section 3 we use these results to derive the optimal weight matrix and GMM objective function for the case where $(c_i,\bar y_i)$ are used to set up the moment conditions. Results from a Monte Carlo experiment are presented in Section 4.

Previous results

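Let the complete set of unknown parameters be given by $\theta=(z_1,z_2,\ldots,z_{N-1},\phi)$. Defining the population moments corresponding to $c_i$ and $\tilde y_i$ as $k_i(\theta)$ and $\tilde\mu_i(\theta)$, respectively, we have, from Eqs. (1), (4), $k_i(\theta)=E[g_i(y)]=\int_0^\infty g_i(y)f(y;\phi)\,dy=\int_{z_{i-1}}^{z_i}f(y;\phi)\,dy$ and $\tilde\mu_i(\theta)=E[y\,g_i(y)]=\int_0^\infty y\,g_i(y)f(y;\phi)\,dy=\int_{z_{i-1}}^{z_i}y\,f(y;\phi)\,dy$. Setting up corresponding moment conditions in matrix notation, we define $H(\theta)=\frac{1}{T}\sum_{t=1}^{T}h(y_t;\theta)=\frac{1}{T}\sum_{t=1}^{T}\big[g_1(y_t)-k_1(\theta),\ \ldots,\ g_{N-1}(y_t)-k_{N-1}(\theta),\ y_t g_1(y_t)-\tilde\mu_1(\theta),\ \ldots,\ y_t g_N(y_t)-\tilde\mu_N(\theta)\big]'$, whose elements are $c_1-k_1(\theta),\ \ldots,\ c_{N-1}-k_{N-1}(\theta),\ \tilde y_1-\tilde\mu_1(\theta)$ …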
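To make the population moments $k_i(\theta)$ and $\tilde\mu_i(\theta)$ concrete, the sketch below evaluates them in closed form for an exponential density $f(y;\lambda)=\lambda e^{-\lambda y}$. The exponential is chosen here only because its integrals are elementary; it is an assumption of this illustration and not a distribution used in the paper. The function names and the stacking helper are likewise hypothetical, with the stacking order ($N-1$ share conditions followed by $N$ mean conditions) taken from the definition of $H(\theta)$ above.

```python
import math

def exp_population_moments(limits, lam):
    """k_i and mu_tilde_i for the density f(y) = lam * exp(-lam * y), y > 0."""
    def survival(z):
        # P(Y > z); zero at the open upper limit z_N = infinity
        return 0.0 if math.isinf(z) else math.exp(-lam * z)

    def partial_mean_above(z):
        # integral of y * f(y) from z to infinity = (z + 1/lam) * exp(-lam * z)
        return 0.0 if math.isinf(z) else (z + 1.0 / lam) * math.exp(-lam * z)

    k, mu_tilde = [], []
    for lo, hi in zip(limits[:-1], limits[1:]):
        k.append(survival(lo) - survival(hi))
        mu_tilde.append(partial_mean_above(lo) - partial_mean_above(hi))
    return k, mu_tilde

def moment_vector_H(c, ytilde, k, mu_tilde):
    """Stack c_i - k_i (i = 1..N-1) followed by ytilde_i - mu_tilde_i (i = 1..N)."""
    share_part = [ci - ki for ci, ki in zip(c[:-1], k[:-1])]
    mean_part = [yi - mi for yi, mi in zip(ytilde, mu_tilde)]
    return share_part + mean_part

limits = [0.0, 10.0, 20.0, math.inf]   # hypothetical class limits
k, mu_tilde = exp_population_moments(limits, lam=0.05)
```

The $k_i$ sum to one and the $\tilde\mu_i$ sum to the population mean $1/\lambda$, which explains why only $N-1$ of the share conditions are retained: the $N$th is implied by the others.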

Optimal weight matrix and objective function for $(c_i,\bar y_i)$

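Considering now the moment conditions for $(c_i,\bar y_i)$, we have $E[c_i-k_i(\theta)]=0$ (as before) and $\operatorname{plim}\,[\bar y_i-\mu_i(\theta)]=0$, where $\mu_i(\theta)=\tilde\mu_i(\theta)/k_i(\theta)$. Collecting these terms into a vector, we define $L(\theta)$, with first element $c_1-k_1(\theta)$ …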
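The following Python sketch shows how moment discrepancies based on the group means could be assembled and passed to a quadratic-form objective. Several elements are assumptions of this illustration: the stacking order of the vector (mirroring the $N-1$ share conditions and $N$ mean conditions of the earlier specification), the function names, the numerical inputs, and in particular the identity weight matrix, which is only a placeholder and is not the optimal weight matrix derived in the paper.

```python
def moment_vector_L(c, ybar_i, k, mu_tilde):
    """Stack c_i - k_i (i = 1..N-1) followed by ybar_i - mu_i (i = 1..N)."""
    mu = [mt / ki for mt, ki in zip(mu_tilde, k)]   # mu_i = mu_tilde_i / k_i
    share_part = [ci - ki for ci, ki in zip(c[:-1], k[:-1])]
    mean_part = [yi - mi for yi, mi in zip(ybar_i, mu)]
    return share_part + mean_part

def quadratic_form(v, W):
    """Compute v' W v for a list v and a square list-of-lists W."""
    n = len(v)
    return sum(v[i] * W[i][j] * v[j] for i in range(n) for j in range(n))

# Hypothetical observed data and hypothetical model-implied moments
c = [0.2, 0.4, 0.4]
ybar_i = [5.0, 15.0, 32.5]
k = [0.25, 0.35, 0.40]
mu_tilde = [1.5, 5.25, 13.0]

L = moment_vector_L(c, ybar_i, k, mu_tilde)
identity = [[1.0 if i == j else 0.0 for j in range(len(L))] for i in range(len(L))]
objective = quadratic_form(L, identity)   # placeholder weights, not the optimal ones
```

In an actual estimation the model-implied moments would be recomputed for each trial value of $\theta$ and the objective minimized numerically; the sketch evaluates it at a single point only.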

A Monte Carlo analysis

The design of the Monte Carlo experiment is as follows: We generate data from a generalized beta distribution of the second kind (GB2). This distribution is a flexible and popular candidate for estimating income distributions (see, e.g., McDonald, 1984; McDonald and Ransom, 2008; or Kleiber and Kotz, 2003). Its density function is defined as $f(y;\phi)=\dfrac{a\,y^{ap-1}}{b^{ap}\,B(p,q)\,[1+(y/b)^a]^{p+q}}$, where $B(p,q)$ is the beta function, with parameters $\phi=(b,p,q,a)$. The settings in the experiment were $b=100$, $p=1$, $q=1.5$ and $a=1.5$, implying a relatively
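A minimal Python sketch of the data-generating step is given below, using only the standard library. It evaluates the GB2 density on the log scale and draws variates through the standard representation $Y=b\,[U/(1-U)]^{1/a}$ with $U\sim\text{Beta}(p,q)$. The function names, the sample size of 2000 and the seed are arbitrary choices for illustration and are not the settings of the paper's experiment, apart from the four parameter values quoted in the text.

```python
import math
import random

def gb2_pdf(y, a, b, p, q):
    """GB2 density: a*y^(a*p-1) / (b^(a*p) * B(p,q) * (1 + (y/b)^a)^(p+q))."""
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    log_f = (math.log(a) + (a * p - 1.0) * math.log(y) - a * p * math.log(b)
             - log_beta - (p + q) * math.log1p((y / b) ** a))
    return math.exp(log_f)

def gb2_sample(n, a, b, p, q, rng):
    """Draw n GB2 variates via Y = b * (U / (1 - U))^(1/a), with U ~ Beta(p, q)."""
    draws = []
    for _ in range(n):
        u = rng.betavariate(p, q)
        draws.append(b * (u / (1.0 - u)) ** (1.0 / a))
    return draws

a, b, p, q = 1.5, 100.0, 1.0, 1.5   # parameter settings quoted in the text
rng = random.Random(12345)           # arbitrary seed, for reproducibility only
y = gb2_sample(2000, a, b, p, q, rng)
```

With $p=1$ the GB2 reduces to a distribution with closed-form cdf $F(y)=1-[1+(y/b)^a]^{-q}$, which gives a convenient check on both the density and the sampler.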

Conclusion

With the estimation of income distributions as our motivation, we have derived the moment conditions and optimal weight matrix for GMM estimation of distributions with grouped data when the moment conditions are based on data in the form of group population shares and group means. Importantly, we showed that in this case the optimal weight matrix is considerably simpler than that used in a previous formulation, leading to a computationally simpler objective function for GMM estimation. A

References (8)

  • C. Cameron et al., Microeconometrics: Methods and Applications (2005)

  • D. Chotikapanich et al., Global income distributions and inequality, 1993 and 2000: incorporating country-level inequality modelled with beta distributions, Rev. Econ. Stat. (2012)

  • D. Chotikapanich et al., Estimating and combining national income distributions using limited data, J. Bus. Econom. Statist. (2007)

  • W.H. Greene, Econometric Analysis (2012)

There are more references available in the full text version of this article.
