Logistic regression protects against population structure in genetic association studies

  1. Efrosini Setakis (1, 3),
  2. Heide Stirnadel (2), and
  3. David J. Balding (1)

  1. Department of Epidemiology and Public Health, Imperial College, St. Mary's Campus, London W2 1PG, United Kingdom
  2. Worldwide Epidemiology, GlaxoSmithKline, Harlow CM19 5AW, United Kingdom

Abstract

We conduct an extensive simulation study to compare the merits of several methods for using null (unlinked) markers to protect against false positives due to cryptic substructure in population-based genetic association studies. The more sophisticated “structured association” methods perform well but are computationally demanding and rely on estimating the correct number of subpopulations. The simple and fast “genomic control” approach can lose power in certain scenarios. We find that procedures based on logistic regression that are flexible, computationally fast, and easy to implement also provide good protection against the effects of cryptic substructure, even though they do not explicitly model the population structure.
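To make the idea of regression-based protection concrete, here is a minimal sketch in Python. The abstract does not spell out the exact procedures, so everything below is an assumption made for illustration: the hypothetical helpers `solve` and `fit_logistic`, the 0/1 carrier coding of genotypes, the cell counts, and above all the idealized null marker that tags subpopulation membership perfectly. The counts are built so that genotype and disease status are independent within each subpopulation, while the pooled data show a spurious association because both allele frequency and disease prevalence differ between the two groups.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_logistic(rows, y, iters=25):
    """Fit a logistic regression by Newton-Raphson.

    rows: covariate lists without the intercept column.
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Hypothetical counts: (subpopulation, carrier genotype, individuals, cases).
# Within each subpopulation the case fraction is the same for both genotypes.
cells = [(0, 0, 80, 16), (0, 1, 20, 4), (1, 0, 20, 16), (1, 1, 80, 64)]

geno, marker, status = [], [], []
for pop, g, n, cases in cells:
    for i in range(n):
        geno.append(float(g))
        marker.append(float(pop))  # idealized null marker: equals subpopulation
        status.append(1.0 if i < cases else 0.0)

# Candidate-locus log odds ratio, ignoring structure.
unadjusted = fit_logistic([[g] for g in geno], status)[1]
# Same coefficient with the null marker added as a covariate.
adjusted = fit_logistic([[g, m] for g, m in zip(geno, marker)], status)[1]

print(f"unadjusted log odds ratio: {unadjusted:.4f}")
print(f"adjusted log odds ratio:   {adjusted:.4f}")
```

In this contrived case the unadjusted coefficient equals the pooled log odds ratio, log((68 × 68) / (32 × 32)), while the adjusted coefficient is zero up to rounding error. Real null markers are only weakly informative about ancestry, so in practice many of them would be needed and the correction would be approximate rather than exact.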

Footnotes

  • Article published online ahead of print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.4346306.

  • 3 Corresponding author. E-mail e.setakis{at}imperial.ac.uk; fax 44 20 75941530.

  • Received June 27, 2005.

  • Accepted October 6, 2005.