Abstract
Household surveys often fail to capture the top tail of income and wealth distributions, as evidenced by studies based on tax data. Yet to date there is no consensus on how best to reconcile the two sources of information, given the multiple biases at play. This paper contributes a novel method, rooted in standard calibration theory, that directly addresses survey non-response by combining survey microdata with anonymous tax data under reasonable assumptions. Our key innovation is to endogenously determine a “merging point” between the datasets, above which we start to incorporate information from tax data into the survey, under the assumption that the rate of representativeness is first constant, then decreasing with income. This is followed by a “reweighting” step and a “replacing” step, which preserve the microdata structure of the original survey, assuming no re-ranking of observations. We illustrate our approach with simulations, which show that our method is robust to income misreporting and performs better than alternative methods. We also apply it to real data from five countries, both developed and less developed, and find changes in both the levels and the trends of income inequality. We discuss several limitations of our approach and suggest some guidelines for future research.
Acknowledgements
We gratefully acknowledge funding from the Fundación Ramón Areces, ERC (Grant 340831), Ford Foundation, INET (Grant INO14-00023) and from other partners of the World Inequality Lab. We thank Facundo Alvaredo, Yonatan Berman, François Bourguignon, Lucas Chancel, Mauricio De Rosa, Francisco Ferreira, Emmanuel Flachaire, Pablo Gutiérrez, Amory Gethin, Thanasak Jenmana, Nora Lustig, Brian Nolan, Thomas Piketty, Li Yang and Gabriel Zucman for helpful discussions of earlier versions of this paper, as well as participants at the 2018 INET seminar series at the University of Oxford, the May 2018 Workshop on harmonising surveys and tax data at the Paris School of Economics, the ECINEQ 2019 conference, the inequality seminar at ECLAC Santiago (2019), the STEP Seminar at the Université Paris 1 Panthéon–Sorbonne and the Stone Center of Inequality Research seminar at INSEAD.
Cite this article
Blanchet, T., Flores, I. & Morgan, M. The weight of the rich: improving surveys using tax data. J Econ Inequal 20, 119–150 (2022). https://doi.org/10.1007/s10888-021-09509-3
DOI: https://doi.org/10.1007/s10888-021-09509-3