Article
Efficient mixed model approach for large-scale genome-wide association studies of ordinal categorical phenotypes

https://doi.org/10.1016/j.ajhg.2021.03.019
Under an Elsevier user license
open archive

Summary

In genome-wide association studies, ordinal categorical phenotypes are widely used to measure human behaviors, satisfaction, and preferences. However, because suitable analysis tools are lacking, methods designed for binary or quantitative traits are often applied inappropriately to ordinal categorical phenotypes. To accurately model the dependence of an ordinal categorical phenotype on covariates, we propose an efficient mixed model association test, the proportional odds logistic mixed model (POLMM). POLMM is computationally efficient enough to analyze large datasets with hundreds of thousands of samples, controls type I error rates at a stringent significance level regardless of the phenotypic distribution, and is more powerful than alternative methods. In contrast, standard linear mixed model approaches cannot control type I error rates for rare variants when the phenotypic distribution is unbalanced, although they perform well when testing common variants. We applied POLMM to 258 ordinal categorical phenotypes on array genotypes and imputed samples from 408,961 individuals in UK Biobank. In total, we identified 5,885 genome-wide significant variants, of which 424 (7.2%) are rare variants with minor allele frequency (MAF) < 0.01.
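To make the "proportional odds" idea concrete, the following is a minimal Python sketch of how a generic proportional odds (cumulative logit) model turns a linear predictor into category probabilities. It assumes the textbook parameterization logit P(Y <= j) = c_j - eta with increasing cutpoints c_j. The function name `category_probabilities`, the argument names, and the example values are illustrative assumptions, not taken from the article or from the POLMM software, and the sketch leaves out the random effect and the association test entirely.

```python
import math


def category_probabilities(eta, cutpoints):
    """Return P(Y = j) for j = 1..J under a generic proportional odds model.

    Assumption (textbook form, not quoted from the article):
        logit P(Y <= j) = cutpoints[j - 1] - eta,  j = 1..J-1,
    where the J-1 cutpoints are strictly increasing and eta is the linear
    predictor (covariate and genotype effects, plus any random effect).
    """
    if any(b <= a for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")

    # Cumulative probabilities P(Y <= j) for j = 1..J-1, then P(Y <= J) = 1.
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints]
    cumulative.append(1.0)

    # Category probabilities are differences of consecutive cumulative values.
    probs = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)
        previous = cum
    return probs


# Hypothetical example: three ordered categories, two cutpoints.
baseline = category_probabilities(eta=0.0, cutpoints=[-1.0, 1.0])
shifted = category_probabilities(eta=0.5, cutpoints=[-1.0, 1.0])
```

Under this parameterization a larger linear predictor shifts probability mass toward the higher categories while the cutpoints stay fixed, which is the sense in which the odds are "proportional" across category thresholds.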

Keywords

genome-wide association studies
GWAS
phenome-wide association studies
PheWAS
ordinal categorical data
mixed model approach
proportional odds logistic mixed model
POLMM
UK Biobank
saddlepoint approximation
unbalanced phenotypic distribution
food and other preferences
genetic relationship matrix
GRM
