Using Global Optimization to Estimate Population Class Sizes

Journal of Global Optimization

Abstract

In this paper we formulate a nonlinear optimization model to estimate population class sizes from sample information. The model is nonconvex and has several local minima, each corresponding to a different population that could have been the source of the sample data. We show that many, if not all, of these local solutions can be found using a new global optimization algorithm called OptQuest/NLP (OQNLP). The approach can be used to estimate the number of individuals in a population with unique or rarely occurring characteristics, which is useful for assessing disclosure risk. It can also be used to estimate the number of classes in a population, a problem with applications in a variety of disciplines.
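
To illustrate what it means to recover several local minima of a nonconvex model, the following minimal sketch runs a plain random multistart: a local search is launched from many random starting points and the distinct end points are collected. This is not OQNLP and not the model formulated in the paper. The toy objective `f`, the gradient-descent local solver, and all parameter values (`step`, `n_starts`, `merge_tol`, the search interval) are assumptions chosen only for illustration.

```python
import random


def f(x):
    # Toy nonconvex objective with two local minima (illustrative assumption,
    # not the population class size model from the paper).
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytic derivative of the toy objective.
    return 4 * x**3 - 6 * x + 1


def local_descent(x, step=0.01, tol=1e-10, max_iter=100_000):
    # Fixed-step gradient descent, standing in for a local NLP solver.
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x


def multistart(n_starts=50, lo=-2.0, hi=2.0, seed=0, merge_tol=1e-4):
    # Run the local solver from random starting points and keep each
    # distinct local minimum once.
    rng = random.Random(seed)
    minima = []
    for _ in range(n_starts):
        x = local_descent(rng.uniform(lo, hi))
        if all(abs(x - m) > merge_tol for m in minima):
            minima.append(x)
    return sorted(minima)


minima = multistart()
best = min(minima, key=f)  # the local minimum with the lowest objective value
for m in minima:
    print(f"local minimum at x = {m:.4f}, f(x) = {f(m):.4f}")
```

In this toy setting each local minimum plays the role of one candidate solution, and the best of them is the global minimum. In the setting of the abstract, the local solutions would instead correspond to different populations consistent with the same sample.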

Author information

Corresponding author

Correspondence to Betsy S. Greenberg.

About this article

Cite this article

Greenberg, B.S., Lasdon, L.S. Using Global Optimization to Estimate Population Class Sizes. J Glob Optim 36, 319–338 (2006). https://doi.org/10.1007/s10898-006-9011-6

  • DOI: https://doi.org/10.1007/s10898-006-9011-6
