Introduction

Alcohol dependence (AD) is substantially heritable, with studies estimating heritability at between 50 and 70 % (Heath et al. 1997; Hiroi and Agatsuma 2005; Ystrom et al. 2011; Young-Wolff et al. 2012). Genes associated with AD include those coding for enzymes involved in the absorption and elimination of ethanol, such as alcohol dehydrogenase (ADH), aldehyde dehydrogenase (ALDH) and cytochrome P450 2E1 (CYP2E1) (Zakhari 2006; Edenberg 2007). Metabolism of ethanol consists of two rate-limiting reactions (Zakhari 2006). Firstly, ethanol is converted to acetaldehyde, which is subsequently metabolised to acetate. The first step is predominantly catalysed by ADH, with minor roles for CYP2E1 and catalase. In the second step, acetaldehyde is metabolised by ALDH. Acetaldehyde is considerably more toxic than ethanol, and its accumulation leads to a highly aversive reaction that includes anxiety, facial flushing, nausea, and rapid heartbeat (Eriksson 2001).

Genetic variants that cause a build-up of acetaldehyde, either by rapid ethanol metabolism or reduced acetaldehyde metabolism, have been found to be associated with lower risk for AD and heavy drinking (Edenberg 2007). The frequency of these genetic variants varies between ancestral groups (Edenberg 2012). The two polymorphisms that have been most strongly associated with AD in Asian populations, ADH1B Arg47His (rs1229984) and ALDH2 Glu487Lys (rs671), have little or no variation in one African population (Goedde et al. 1992).

AD is a heterogeneous disorder, highly comorbid with internalising disorders (Kessler et al. 1996). Various subtypes of AD have previously been described each with different reasons for developing an addiction, different withdrawal syndromes, different prognoses, and different responses to therapeutic approaches (Lesch et al. 1988). Research has suggested there may be an anxious subtype of AD characterized by high harm avoidance, high reward dependence, and low novelty-seeking behaviour (Cloninger 1987). More recently, reports have suggested this anxious AD may be a genetically specific subtype of AD (Lee et al. 2010). Therefore, if genetic markers could be used to identify this subtype of AD, patient care could be improved by tailoring treatment accordingly.

The median age of onset for AD (23 years of age) is much later than for anxiety disorders (11 years of age) (Kessler et al. 2005). The lifetime risk of alcohol dependence is far greater for individuals who start drinking at an earlier age (Grant and Dawson 1997). An adolescent cohort of individuals with AD identifies the most serious cases of AD, and anxiety symptoms would be expected to have been reported by this age.

We investigated whether variation in genes encoding CYP2E1 or acetaldehyde-metabolising enzymes (ALDH1A1, ALDH2) might alter the risk of AD in an adolescent Cape population with mixed ancestry by performing systematic haplotype association analyses to maximize the chances of capturing functional variation. We also investigated the association between a genotype risk score and AD. Investigating genetic associations in different population groups is important in order to replicate and validate previous findings; where results do not agree, this may indicate heterogeneity. Additionally, we investigated whether AD with or without comorbid symptoms of anxiety may be a genetically specific subtype of AD.

Methods

Participants

Details of the participants have been reported previously (Ferrett et al. 2011). In brief, 80 case control pairs (one with AD, one without AD) from within the Cape Flats region (Cape Town, South Africa) were individually matched for age (within 1 year), gender (each group consisted of 47 females and 33 males), education level, language and socioeconomic status (SES). The average participant was aged 14.8 years (sd 0.76) and had completed 7.6 years (sd 0.82) of education. The sample reflected the sociodemographic profile of the Cape Flats population (100 % Coloured; Language, 69 % Afrikaans, 31 % English; 86 % in households with formal housing; and 85 % earning a gross annual income of less than ZAR 100 000). Exclusion criteria included, but were not limited to: mental retardation; lifetime DSM-IV Axis I diagnoses other than AD (including the following disorders: depressive, anxiety, psychotic, post-traumatic stress, eating, tic, attention-deficit/hyperactivity, oppositional defiant, and conduct); less than 6 years of formal education; and lack of proficiency in English or Afrikaans. Volunteers were screened for eligibility after written informed assent/consent was obtained from volunteers and parents or guardians.

The study protocol and procedures complied with and were conducted in strict adherence to the guidelines contained in the Declaration of Helsinki (2008). Full written approval to conduct the study was obtained from the Western Cape Education Department and the Research Ethics Committee of the Stellenbosch University Faculty of Health Sciences.

Measures

Alcohol use

A revised version of the Timeline Followback (TLFB) procedure (Sobell and Sobell 1992), a semi-structured, clinician-administered assessment of alcohol use and drinking patterns, was used in collaboration with the Kiddie Schedule for Affective Disorders and Schizophrenia Present and Lifetime Versions (K-SADS-PL) (Kaufman et al. 1997) to elicit alcohol-use data. A standard drink was defined as one beer or wine cooler, one glass of wine, or one 1.5-oz shot of liquor (alone or in a mixed drink). AD was defined by a lifetime dosage in excess of 100 units plus a DSM-IV diagnosis of alcohol abuse or dependence. The control group were non-drinkers (who had never consumed alcohol) and light drinkers (lifetime dosage not exceeding 76 units of alcohol), with no history of AD.

Psychopathology

Total symptom counts from the K-SADS-PL were recorded for generalised anxiety disorder. As previously mentioned, individuals with a diagnosis of anxiety disorders were excluded from the study. However, individuals reporting low levels of anxiety symptoms, not severe enough for a diagnosis of anxiety disorders, were included in the study. A binary variable was generated for the presence or absence of these anxiety symptoms in individuals with AD (anxious-AD). Of the 80 individuals with AD, there were 59 individuals without any anxiety symptoms and 21 with anxiety symptoms.

Genotyping

Genotype data were available for a total of 29 single-nucleotide polymorphisms (SNPs) (16 SNPs in ALDH1A1, 7 SNPs in ALDH2, 6 SNPs in CYP2E1). Genotyping was carried out using a custom Illumina Infinium iSelect 6000 bead chip.

Genotype risk score

We calculated a genotype risk score using all SNPs moderately associated with outcome (chi-square value greater than 1). The score was the unweighted sum of the number of risk alleles (0, 1 or 2) at each of these SNP loci. Separate genotype risk scores were created for the outcomes of AD and anxious-AD.
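As an illustration only, the unweighted score described above can be sketched in Python. The function name, SNP identifiers and alleles below are hypothetical placeholders and are not taken from the study data; the sketch simply counts risk alleles per SNP and sums them.

```python
from typing import Dict


def genotype_risk_score(genotypes: Dict[str, str],
                        risk_alleles: Dict[str, str]) -> int:
    """Unweighted sum of risk-allele counts (0, 1 or 2 per SNP).

    genotypes:    SNP id -> two-letter genotype, e.g. "AG"
    risk_alleles: SNP id -> single-letter risk allele, e.g. "A"
    """
    score = 0
    for snp, risk in risk_alleles.items():
        # str.count gives 0, 1 or 2 copies of the risk allele
        score += genotypes[snp].count(risk)
    return score


# Hypothetical SNPs and one hypothetical individual
risk_alleles = {"snp_a": "A", "snp_b": "G", "snp_c": "T"}
person = {"snp_a": "AA", "snp_b": "AG", "snp_c": "CC"}

print(genotype_risk_score(person, risk_alleles))  # 2 + 1 + 0 = 3
```

Because the score is unweighted, every risk allele contributes equally regardless of the strength of its individual association.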

Statistical analysis

The genotype distributions for each SNP in the control group (without AD) were used to calculate deviation from Hardy–Weinberg equilibrium (HWE) using a χ² test, and those SNPs which showed evidence of deviation were excluded from further analysis. Linkage disequilibrium (LD) D′ values were evaluated for all marker pairs. Customised haplotype blocks were defined in Haploview version 4.2 (Barrett et al. 2005). Allele and haplotype frequencies were compared between cases and controls using a χ² test. Logistic regression models were used to investigate the association between genotype risk score and the outcomes of AD, or AD with anxiety symptoms. A Bonferroni correction was applied to address the issues associated with multiple testing (Bland and Altman 1995). A power calculation was performed using the Quanto software (Version 1.2.4) (Gauderman and Morrison 2006). Statistical analyses were performed using Haploview version 4.2 (Barrett et al. 2005) and Stata version 12.1 (StataCorp 2011).
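For readers unfamiliar with the HWE check, a minimal Python sketch of a one-degree-of-freedom goodness-of-fit test on genotype counts is given below. This is not the Haploview implementation, which may use a different (for example, exact) procedure; the function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    Returns (chi-square statistic, p-value) for counts of the three genotypes.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control-group counts (n = 80)
print(hwe_chi_square(20, 40, 20))  # counts match HWE expectations exactly
print(hwe_chi_square(40, 0, 40))   # no heterozygotes: strong deviation
```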

Results

There was evidence that one SNP (rs348457) deviated from HWE (p < 0.001) and there was no genetic variation in another SNP (rs671). Both these SNPs were excluded from our analyses. This left a total of 27 SNPs in the analysis (15 for ALDH1A1, 6 for ALDH2, 6 for CYP2E1). The allele frequencies for the 27 SNPs are presented in Table 1. To correct for the multiple testing of 27 SNPs in two disease models, a threshold level of significance was calculated as p < 0.0009. This is a conservative estimate due to the LD between SNPs. Given the number of tests, there was no evidence of any associations other than one would expect by chance.
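The corrected threshold follows from dividing the conventional 5 % significance level by the number of tests (27 SNPs in each of two disease models). The short Python sketch below reproduces this arithmetic; the variable names are illustrative.

```python
alpha = 0.05     # conventional significance level
n_snps = 27      # SNPs retained after quality control
n_models = 2     # AD and anxious-AD analyses

# Bonferroni-corrected threshold: alpha divided by the number of tests
threshold = alpha / (n_snps * n_models)

print(round(threshold, 4))  # 0.0009
```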

Table 1 Allele frequencies and associations with alcohol dependence or alcohol dependence with anxiety symptoms

Linkage disequilibrium

The extent of LD between the SNPs was determined for ALDH1A1 (Fig. 1), ALDH2 (Fig. 2) and CYP2E1 (Fig. 3).
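As a brief illustration of the LD measure, the sketch below computes the absolute value of Lewontin's D′ for a pair of biallelic markers from haplotype and allele frequencies. It assumes phased haplotype frequencies are already known, whereas in practice these are estimated from unphased genotypes; the function name and frequencies are hypothetical.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Absolute Lewontin's D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # at least one locus is monomorphic
    return abs(d) / d_max


# Hypothetical frequencies
print(d_prime(0.50, 0.5, 0.5))  # alleles always co-occur: complete LD
print(d_prime(0.25, 0.5, 0.5))  # haplotype frequency equals product: no LD
```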

Fig. 1 Haplotype block structure for the ALDH1A1 gene on chromosome 9. Haplotype blocks are outlined. Figures represent D′

Fig. 2 Haplotype block structure for the ALDH2 gene on chromosome 12. Haplotype blocks are outlined. Figures represent D′

Fig. 3 Haplotype block structure for the CYP2E1 gene on chromosome 10. Haplotype blocks are outlined. Figures represent D′

Alcohol dependence analysis

A total of 160 individuals were included in the analysis (80 with AD, 80 without AD). There was some evidence of an association between AD and rs6413419 in the CYP2E1 gene (p = 0.04) (Table 1). In the haplotype analysis there was some evidence of an association with the ACAG haplotype in block 4 of the ALDH1A1 gene (p = 0.03) (Table 2).

Table 2 ALDH1A1, ALDH2 and CYP2E1 haplotype frequencies and associations with alcohol dependence or alcohol dependence with anxiety symptoms

Alcohol dependence with anxiety symptoms analysis

A total of 80 individuals with AD were included in the analysis (21 with anxiety symptoms, 59 without anxiety symptoms). There was weak evidence of an association between anxious-AD and rs63319 of the ALDH1A1 gene (p = 0.10) (Table 1). In the haplotype analysis there was weak evidence of an association with the ACAG haplotype in block 4 of the ALDH1A1 gene (p = 0.06) (Table 2).

Genotype risk score

There were 8 SNPs (4 SNPs in ALDH1A1, 2 SNPs in ALDH2, 2 SNPs in CYP2E1) associated with AD that were included in this genotype risk score. This score ranged from 5 to 13. For every unit increase in genotype risk score (i.e. for every extra risk allele) the odds of AD increased by 35 % (OR 1.35, 95 % CI 1.08 to 1.68, p = 0.008).

There were 8 SNPs (3 SNPs in ALDH1A1, 2 SNPs in ALDH2, 3 SNPs in CYP2E1) associated with anxious-AD that were included in this genotype risk score. This score ranged from 5 to 16. For every unit increase in genotype risk score (i.e. for every extra risk allele) the odds of having AD with anxiety symptoms (rather than AD without anxiety symptoms) increased by 53 % (OR 1.53, 95 % CI 1.14 to 2.05, p = 0.004). The effect of both genotype risk scores appeared to be linear, although interpretation is limited by the small number of individuals at the extremes.
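To clarify how a per-allele odds ratio is read, the sketch below shows that, under a logistic model that is linear in the score, k additional risk alleles multiply the odds by OR raised to the power k. The choice of three extra alleles is purely illustrative, and the calculation takes the reported point estimates at face value without reflecting their confidence intervals.

```python
import math


def odds_multiplier(or_per_allele: float, extra_alleles: int) -> float:
    """Multiplicative change in odds for a given number of extra risk alleles.

    Assumes the log-odds are linear in the genotype risk score.
    """
    beta = math.log(or_per_allele)  # logistic regression coefficient
    return math.exp(beta * extra_alleles)


# Reported per-allele odds ratios; three extra alleles is an arbitrary example
print(round(odds_multiplier(1.35, 3), 2))  # AD score: 2.46
print(round(odds_multiplier(1.53, 3), 2))  # anxious-AD score: 3.58
```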

Power calculation

We performed a post hoc statistical power and sample size analysis. Statistical power is defined as the probability of rejecting the null hypothesis while the alternative hypothesis is true. The results vary for each SNP investigated but assuming an allele frequency of 0.9 (rs6413419), a population risk of 0.1, an additive genetic model, an odds ratio of 3 (aa v. AA) and significance set at 5 %, we had 27 % power. Using the same assumptions but setting the power to 80 % we would need a sample size approximately five times the size of the current study (or ten times the size for the anxious-AD analysis).

Discussion

Main findings and comparisons with the literature

Although AD is prevalent in South Africa (Williams et al. 2008; Peltzer et al. 2011) there has been a paucity of previous research investigating genetic variants associated with this phenotype in a South African population. The ACAG haplotype in block 4 of the ALDH1A1 gene had a frequency of 6.9 % in our cohort and provided some evidence of an association with AD. This haplotype was more frequent in individuals with AD than controls (10 % v. 3.8 %, p = 0.03) and more frequent in individuals with AD but without anxiety symptoms than individuals with AD and anxiety symptoms (12.7 % v. 2.4 %, p = 0.06). Therefore, it is possible that this haplotype may identify non-anxious AD as a genetically specific subtype of AD. There were also encouraging findings with regard to the genotype risk score.

The association between AD and the ACAG haplotype was driven by rs11143443. This SNP is located upstream of the 5′ promoter region of the ALDH1A1 gene and has previously been associated with AD in an African American population (Liu et al. 2011). Genetic variation in the promoter region has been reported to affect ALDH1A1 gene expression (Spence et al. 2003), although to our knowledge no such gene expression data exist for this particular SNP. Liu et al. (2011) also reported evidence of an association between AD and a haplotype in ALDH2, driven by rs7311852. The same SNP provided weak evidence of an association with AD in our data. However, the authors report that the positive association was influenced by population stratification.

The Cape Mixed Ancestry group, which has been shown to have the greatest level of intercontinental admixture compared to any other international population group, consists of individuals of Khoesan, Bantu-speaking African, European and Asian ethnicity (Tishkoff et al. 2009; de Wit et al. 2010). A principal components analysis, using genetic markers across the genome, would be required to determine whether this cohort consists of a single genetic population, and thus whether our associations are influenced by population stratification. Comparing the allele frequencies of individuals in this study with populations in the HapMap project (Gibbs et al. 2003) showed the frequencies to be intermediate between the Utah residents with ancestry from northern and western Europe (CEU) and the Yoruba in Ibadan, Nigeria (YRI) (data not shown).

Strengths, limitations and future directions

The main limitation of this study is the small sample size, which is further compounded in our pre-specified subgroup analysis of AD with or without anxiety symptoms. Therefore, our findings must be interpreted with caution and should be considered preliminary. This is especially true given that there was no evidence of any association other than that which one would expect by chance. However, we limited our analyses to three biologically relevant genes and have been clear about the tests performed, which is generally more important when dealing with multiple comparisons (Perneger 1998).

This adolescent cohort contained extremely well defined cases of AD, as individuals with a diagnosis of comorbid generalised anxiety disorder were excluded from the study. Therefore, our power to identify a genetically specific subtype of non-anxious AD will be greater than that of other cohorts of a similar size. Our power calculation clearly indicates that larger studies are required in this area. A potential limitation of this selective cohort is that the results may not be generalisable to the general population, where AD tends to be highly comorbid with other disorders. Additionally, as this is an adolescent cohort, and the median age of onset for AD is 23 years of age, it is possible that individuals included as controls in this study may go on to develop AD. However, AD is inversely associated with age of first drink and so individuals with AD in this cohort are likely to have a more severe phenotype than a control individual who goes on to develop AD at a later stage (Grant and Dawson 1997).

For some SNPs the evidence of an association was stronger in the AD analysis than in the anxious-AD analysis. This may be because such a SNP is a marker associated with AD in general rather than with a specific subtype of AD. Alternatively, it may be explained by the reduction in sample size between the two analyses. This can be seen for the SNP rs7296651, where the effect size in the AD analysis is less extreme, but the strength of evidence for the association is stronger than in the anxious-AD analysis (AD analysis: OR 0.71, 95 % CI 0.45 to 1.12, p = 0.14, compared with anxious-AD analysis: OR 0.62, 95 % CI 0.28 to 1.37, p = 0.24). The same set of individuals was used in the discovery and testing of the genetic risk scores. Therefore, although we observed relatively large effect sizes, future work should seek to validate these findings in an independent sample.

This study increases the body of evidence investigating genetic variants and AD in a genetically admixed population. It is important to investigate this type of population in order to replicate previous findings, as well as attempting to identify genetic variants for complex disorders (de Wit et al. 2010). Meta-analyses of genetic studies will be needed to identify genetically-specific subtypes of AD, potentially providing insights into the biological mechanism of these disorders.