
Using Split Samples to Improve Inference on Causal Effects

Published online by Cambridge University Press:  18 September 2017

Marcel Fafchamps
Affiliation:
Stanford University, Freeman Spogli Institute for International Studies, Encina Hall E105, Stanford, CA 94305, USA. Email: fafchamp@stanford.edu
Julien Labonne*
Affiliation:
Blavatnik School of Government, University of Oxford Radcliffe Observatory Quarter, Woodstock Road, Oxford, OX2 6GG, UK. Email: julien.labonne@bsg.ox.ac.uk

Abstract

We discuss a statistical procedure for carrying out empirical research that combines recent insights about pre-analysis plans (PAPs) and replication. Researchers send their datasets to an independent third party who randomly generates training and testing samples. Researchers perform their analysis on the training sample and are able to incorporate feedback from colleagues, editors, and referees. Once the paper is accepted for publication, the method is applied to the testing sample, and it is those results that are published. Simulations indicate that, under empirically relevant settings, the proposed method delivers more power than a PAP. The effect mostly operates through a lower likelihood that relevant hypotheses are left untested. The method appears better suited for exploratory analyses where there is significant uncertainty about the outcomes of interest. We do not recommend using the method in situations where the treatment is very costly and thus the available sample size is limited. An interpretation of the method is that it allows researchers to perform a direct replication of their own work. We also discuss a number of practical issues regarding the method’s feasibility and implementation.
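To make the procedure concrete, the sketch below illustrates the split step in Python with pandas: an independent third party partitions the dataset into a training sample (released to the researchers for exploratory analysis and peer review) and a testing sample (held back until the paper is accepted). This is a minimal illustration only; the file names, 50/50 split share, and seed are hypothetical assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): an independent third party
# randomly partitions a dataset into training and testing samples.
import pandas as pd


def split_sample(data: pd.DataFrame, train_share: float = 0.5, seed: int = 20170918):
    """Randomly partition `data` into a training and a testing sample."""
    train = data.sample(frac=train_share, random_state=seed)  # random draw of rows
    test = data.drop(train.index)                             # remaining rows
    return train, test


if __name__ == "__main__":
    df = pd.read_csv("study_data.csv")                        # hypothetical dataset
    train_df, test_df = split_sample(df)
    train_df.to_csv("training_sample.csv", index=False)       # released to researchers
    test_df.to_csv("testing_sample.csv", index=False)         # withheld until acceptance
```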

Type
Articles
Copyright
Copyright © The Author(s) 2017. Published by Cambridge University Press on behalf of the Society for Political Methodology. 


Footnotes

Authors’ note: We thank Michael Alvarez (Co-Editor), two anonymous referees, Rob Garlick, and Kate Vyborny for discussions and comments. All remaining errors are ours. Replication data are available on the Harvard Dataverse (Fafchamps and Labonne 2017). Supplementary materials for this article are available on the Political Analysis Web site.

Contributing Editor: R. Michael Alvarez

References

Anderson, Michael L. 2008. Multiple inference and gender differences in the effects of early intervention: A reevaluation of the Abecedarian, Perry Preschool, and Early Training Projects. Journal of the American Statistical Association 103(484):1481–1495.
Athey, Susan, and Imbens, Guido. 2015. Machine learning methods for estimating heterogeneous causal effects. Stanford University. Mimeo.
Bell, Mark, and Miller, Nicholas. 2015. Questioning the effect of nuclear weapons on conflict. Journal of Conflict Resolution 59(1):74–92.
Belloni, Alexandre, Chernozhukov, Victor, and Hansen, Christian. 2014. High-dimensional methods and inference on structural and treatment effects. Journal of Economic Perspectives 28(2):29–50.
Benjamini, Yoav, Krieger, Abba M., and Yekutieli, Daniel. 2006. Adaptive linear step-up procedures that control the false discovery rate. Biometrika 93(3):491–507.
Benjamini, Yoav, and Yekutieli, Daniel. 2001. The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics 29(4):1165–1188.
Benjamini, Yoav, and Hochberg, Yosef. 1995. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological) 57(1):289–300.
Blair, Graeme, Cooper, Jasper, Coppock, Alexander, and Humphreys, Macartan. 2016. Declaring and diagnosing research designs. Columbia University. Mimeo.
Brodeur, Abel, Le, Mathias, Sangnier, Marc, and Zylberberg, Yanos. 2016. Star wars: The empirics strike back. American Economic Journal: Applied Economics 8(1):1–32.
Coffman, Lucas C., and Niederle, Muriel. 2015. Pre-analysis plans are not the solution replications might be. Journal of Economic Perspectives 29(3):81–98.
Dunning, Thad. 2016. Transparency, replication, and cumulative learning: What experiments alone cannot achieve. Annual Review of Political Science 19(1):S1–S23.
Einav, Liran, and Levin, Jonathan. 2014. Economics in the age of big data. Science 346(6210):715.
Fafchamps, Marcel, and Labonne, Julien. 2017. Replication data for “Using split samples to improve inference on causal effects”. doi:10.7910/DVN/Q0IXQY, Harvard Dataverse, V1.
Findley, Michael G., Jensen, Nathan M., Malesky, Edmund J., and Pepinsky, Thomas B. Forthcoming. Can results-free review reduce publication bias? The results and implications of a pilot study. Comparative Political Studies.
Franco, Annie, Malhotra, Neil, and Simonovits, Gabor. 2014. Publication bias in the social sciences: Unlocking the file drawer. Science 345(6203):1502–1505.
Gelman, Andrew. 2014. Preregistration: What’s in it for you? http://andrewgelman.com/2014/03/10/preregistration-whats/.
Gelman, Andrew. 2015. The connection between varying treatment effects and the crisis of unreplicable research. Journal of Management 41(2):632–643.
Gelman, Andrew, Carlin, John, Stern, Hal, Dunson, David, Vehtari, Aki, and Rubin, Donald. 2013. Bayesian data analysis. 3rd edn. London: Chapman and Hall/CRC.
Gerber, Alan, and Malhotra, Neil. 2008. Do statistical reporting standards affect what is published? Publication bias in two leading political science journals. Quarterly Journal of Political Science 3(3):313–326.
Gerber, Alan S., Green, Donald P., and Nickerson, David. 2001. Testing for publication bias in political science. Political Analysis 9(4):385–392.
Green, Don, Humphreys, Macartan, and Smith, Jenny. 2013. Read it, understand it, believe it, use it: Principles and proposals for a more credible research publication. Columbia University. Mimeo.
Grimmer, Justin. 2015. We are all social scientists now: How big data, machine learning, and causal inference work together. PS: Political Science & Politics 48(1):80–83.
Hainmueller, Jens, and Hazlett, Chad. 2013. Kernel regularized least squares: Reducing misspecification bias with a flexible and interpretable machine learning approach. Political Analysis 22(2):143–168.
Hartman, Erin, and Hidalgo, F. Daniel. 2015. What’s the alternative? An equivalence approach to balance and placebo tests. UCLA. Mimeo.
Humphreys, Macartan, Sanchez de la Sierra, Raul, and van der Windt, Peter. 2013. Fishing, commitment, and communication: A proposal for comprehensive nonbinding research registration. Political Analysis 21(1):1–20.
Ioannidis, John. 2005. Why most published research findings are false. PLOS Medicine 2(8):e124.
Laitin, David D. 2013. Fisheries management. Political Analysis 21:42–47.
Leamer, Edward. 1974. False models and post-data model construction. Journal of the American Statistical Association 69(345):122–131.
Leamer, Edward. 1978. Specification searches: Ad hoc inference with nonexperimental data. New York, NY: Wiley.
Leamer, Edward. 1983. Let’s take the con out of econometrics. American Economic Review 73(1):31–43.
Lin, Winston, and Green, Donald P. 2016. Standard operating procedures: A safety net for pre-analysis plans. PS: Political Science & Politics 49(3):495–500.
Lovell, M. 1983. Data mining. Review of Economics and Statistics 65(1):1–12.
Miguel, E., Camerer, C., Casey, K., Cohen, J., Esterling, K. M., Gerber, A., Glennerster, R., Green, D. P., Humphreys, M., Imbens, G., Laitin, D., Madon, T., Nelson, L., Nosek, B. A., Petersen, M., Sedlmayr, R., Simmons, J. P., Simonsohn, U., and Van der Laan, M. 2014. Promoting transparency in social science research. Science 343(6166):30–31.
Monogan, James E. 2015. Research preregistration in political science: The case, counterarguments, and a response to critiques. PS: Political Science & Politics 48(3):425–429.
Nyhan, Brendan. 2015. Increasing the credibility of political science research: A proposal for journal reforms. PS: Political Science & Politics 48(S1):78–83.
Olken, Benjamin. 2015. Pre-analysis plans in economics. Journal of Economic Perspectives 29(3):61–80.
Pepinsky, Tom. 2013. The perilous peer review process. http://tompepinsky.com/2013/09/16/the-perilous-peer-review-process/.
Rauchhaus, Robert. 2009. Evaluating the nuclear peace hypothesis: A quantitative approach. Journal of Conflict Resolution 53(2):258–277.
Sankoh, A. J., Huque, M. F., and Dubey, S. D. 1997. Some comments on frequently used multiple endpoint adjustment methods in clinical trials. Statistics in Medicine 16(22):2529–2542.
Supplementary material: Fafchamps and Labonne supplementary material (File, 167.8 KB)