Two-stage benchmarking as applied to small area estimation

  • Original Paper
  • TEST

Abstract

Interest in small area estimation has grown recently because of the need for more precise estimates for small geographic areas, and groups such as the U.S. Census Bureau, Google, and the RAND Corporation now use small area estimation procedures. We develop a novel two-stage benchmarking methodology using a single weighted squared error loss function that combines the loss at the unit level and the area level without any specific distributional assumptions. Under this loss, we benchmark either the weighted means at each level or both the weighted means and the weighted variability at the unit level. Furthermore, we provide multivariate extensions for benchmarking weighted means at both levels. We analyze the behavior of our methods using a complex study from the 2000 National Health Interview Survey (NHIS), which estimates the proportion of people without health insurance for many domains of an Asian subpopulation. Finally, we explore the methodology via data simulated under the proposed model. Ultimately, none of the three proposed benchmarked Bayes estimators dominates the others, which leaves several questions open for such complex studies: the choice of weights, optimal algorithms for efficiency, and extensions to multi-stage benchmarking methods.
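As a rough illustration of what benchmarking a weighted mean under a weighted squared error loss involves, the sketch below treats only the simpler single-stage case, not the two-stage estimators developed in the paper. It assumes we are given area-level estimates, benchmark weights, loss weights, and a target value, and it applies the standard Lagrange-multiplier solution that minimizes the weighted squared distance to the original estimates subject to the weighted-mean constraint. The function name, argument names, and numbers are hypothetical.

```python
def benchmark_weighted_mean(estimates, weights, loss_weights, target):
    """Single-stage benchmarking sketch (an assumption, not the paper's method).

    Returns adjusted values d_i that minimize
        sum(phi_i * (d_i - e_i) ** 2)
    subject to sum(w_i * d_i) == target, where e_i are the original
    estimates, w_i the benchmark weights, and phi_i the loss weights.
    """
    if not (len(estimates) == len(weights) == len(loss_weights)):
        raise ValueError("all inputs must have the same length")

    # Weighted mean of the unadjusted estimates.
    current = sum(w * e for w, e in zip(weights, estimates))
    # Normalizing constant from the Lagrange-multiplier solution.
    scale = sum(w * w / phi for w, phi in zip(weights, loss_weights))

    # Each estimate is shifted in proportion to w_i / phi_i.
    return [
        e + (target - current) * (w / phi) / scale
        for e, w, phi in zip(estimates, weights, loss_weights)
    ]


# Hypothetical inputs, for illustration only.
estimates = [0.10, 0.20, 0.30]
weights = [0.5, 0.3, 0.2]
loss_weights = [1.0, 1.0, 1.0]
target = 0.20

adjusted = benchmark_weighted_mean(estimates, weights, loss_weights, target)
benchmarked_mean = sum(w * d for w, d in zip(weights, adjusted))

# With loss weights equal to the benchmark weights (which sum to 1 here),
# every estimate receives the same additive shift.
shifted = benchmark_weighted_mean(estimates, weights, weights, target)
```

In this sketch the choice of loss weights controls how the discrepancy between the target and the unadjusted weighted mean is spread across areas, which is one reason the choice of weights matters for the behavior of the resulting estimators.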

Acknowledgements

This research was partially supported by the United States Census Bureau Dissertation Fellowship Program and NSF Grants SES 1026165 and SES 1130706. The views expressed reflect those of the authors and not of the National Health Interview Survey, the United States Census Bureau, or NSF. We would like to express our thanks to the Associate Editor and referees for their helpful suggestions.

Author information

Corresponding author

Correspondence to Malay Ghosh.

Appendix


Fig. 1

The first plot shows \(\hat{\delta}_{i}^{I}- \hat{\delta}_{i}^{B}\) and \(\hat{\delta}_{i}^{C}- \hat{\delta}_{i}^{B}\) versus domain sample size \(n_i\), illustrating that the estimator \(\hat{\delta}_{i}^{C}\) is not desirable. The middle plot shows \(\hat{\delta}_{i}^{I}- \hat{\delta}_{i}^{B}\) and \(\hat{\delta}_{i}^{n}- \hat{\delta}_{i}^{B}\) versus domain sample size \(n_i\), while the last plot shows \(\hat{\delta}_{i}^{R}- \hat{\delta}_{i}^{B}\) and \(\hat{\delta}_{i}^{n}- \hat{\delta}_{i}^{B}\) versus domain sample size \(n_i\)

Fig. 2

This plot shows the percent increase in PMSE of the raked estimator \(\hat{\delta}_{i}^{R}\), the inverse-variance estimator \(\hat{\delta}_{i}^{I}\), and the domain-weighted inverse-variance estimator \(\hat{\delta}_{i}^{n}\) versus the domain sample size \(n_i\)

Fig. 3

On the left, we plot the difference \(\hat{\delta}_{i}^{R}- \hat{\delta}_{i}^{B}\) versus \(n_i\) for the simulated and NHIS data. In the middle, we plot the difference \(\hat{\delta}_{i}^{I}- \hat{\delta}_{i}^{B}\) versus \(n_i\) for the simulated and NHIS data. On the right, we plot the same for \(\hat{\delta}_{i}^{n}-\hat{\delta}_{i}^{B}\). All three plots show that the simulated data behave similarly to the NHIS data

Fig. 4

The figure on the left plots the % increase in PRMSE of the raked estimator under the NHIS and the simulated data versus the domain sample size. This plot illustrates that under both data sources there is more spread as the inverse-variance estimator increases. The middle figure plots the % increase in PRMSE of the inverse-variance estimator under the NHIS data (circles) and the simulated data (plusses) versus the domain sample size, while the right-hand plot does the same for the domain-weighted inverse-variance estimator

Fig. 5

The figure on the left plots a kernel density estimate of the difference of the inverse-variance estimator calculated under (i) the NHIS data and (ii) the simulated data. The figure on the right does this for the raked estimator. Both plots illustrate that the posterior mean of the difference of the densities is centered near 0, with the function being unimodal

About this article

Cite this article

Ghosh, M., Steorts, R.C. Two-stage benchmarking as applied to small area estimation. TEST 22, 670–687 (2013). https://doi.org/10.1007/s11749-013-0338-2

  • DOI: https://doi.org/10.1007/s11749-013-0338-2
