A meta-analysis of the convergent validity of self-control measures

https://doi.org/10.1016/j.jrp.2011.02.004

Abstract

There is extraordinary diversity in how the construct of self-control is operationalized in research studies. We meta-analytically examined evidence of convergent validity among executive function, delay of gratification, and self- and informant-report questionnaire measures of self-control. Overall, measures demonstrated moderate convergence (r_random = .27 [95% CI = .24, .30]; r_fixed = .34 [.33, .35], k = 282 samples, N = 33,564 participants), although there was substantial heterogeneity in the observed correlations. Correlations within and across types of self-control measures were strongest for informant-report questionnaires and weakest for executive function tasks. Questionnaires assessing sensation seeking impulses could be distinguished from questionnaires assessing processes of impulse regulation. We conclude that self-control is a coherent but multidimensional construct best assessed using multiple methods.

Introduction

The construct of self-control has attracted substantial attention from psychologists working within a variety of theoretical and methodological frameworks. At present, more than 3% of all publications are indexed in the PsycINFO database by the keywords self-control, impulsivity, or related terms. However, operational definitions of self-control vary widely, raising the question: do these varied measures tap the same underlying construct? For instance, does the Eysenck Impulsiveness Questionnaire (Eysenck, Easton, & Pearson, 1984) assess the same trait as the preschool delay of gratification task (Mischel, Shoda, & Rodriguez, 1989)? Do these measures tap the same underlying construct as the Stroop (Wallace & Baumeister, 2002) or go/no-go (Eigsti et al., 2006) executive function tasks?

Evidence of convergent validity (i.e., substantial and significant correlations between different instruments designed to assess a common construct) is a “minimal and basic requirement” for the validity of any psychological test (Fiske, 1971, p. 164). Unfortunately, the rather “modest requirement” of convergent validity in psychological measurement is often assumed rather than tested directly (Fiske, 1971, p. 164). In the current investigation, we meta-analytically synthesized evidence from 282 multi-method samples to examine the convergent validity of self-control measures.

Several authors have noted the challenge of defining and measuring self-control (also referred to as self-regulation, self-discipline, willpower, effortful control, ego strength, and inhibitory control, among other terms) and its converse, impulsivity or impulsiveness (e.g., Depue and Collins, 1999, Evenden, 1999, White et al., 1994, Whiteside and Lynam, 2001). As an illustration of the diversity of measures that have been used to assess self-control, consider the following: refraining from pushing a button when a non-target stimulus appears on a computer screen, matching two geometric patterns from a selection of highly similar patterns, choosing between $1 today and $2 one week later, refraining from immediately eating a single marshmallow in order to obtain two marshmallows later, and responding to questionnaire items such as “Do you often long for excitement?” or “I make my mind up quickly.” Given the rather extraordinary range of measures used, one might expect a lively interdisciplinary debate in the self-control literature as to whether these measures are, in fact, tapping the same underlying construct. Instead, self-control researchers tend to read and cite the work of others working in their same methodological tradition: “Unfortunately, with a few exceptions, researchers interested in the personality trait of impulsivity, in the experimental analysis of impulsive behavior, in psychiatric studies of impulsivity or in the neurobiology of impulsivity form largely independent schools, who rarely cite one another’s work, and consequently rarely gain any insight into their own work from the progress made by others” (Evenden, 1999, pp. 348–349).

What do diverse measures of self-control have in common? We suggest that the common conceptual thread running through varied operationalizations of self-control is the idea of voluntary self-governance in the service of personally valued goals and standards. This idea is captured concisely by Baumeister, Vohs, and Tice (2007): “Self-control is the capacity for altering one’s own responses, especially to bring them into line with standards such as ideals, values, morals, and social expectations, and to support the pursuit of long-term goals” (p. 351). Tasks and questionnaire items that attempt to measure self-control implicitly or explicitly posit a plurality of mutually exclusive responses (e.g., if I eat my cake now, I cannot save it too). One response is recognized by the individual as superior insofar as it is aligned with their long-term goals and standards (saving the cake for after dinner will make me happier in the long-run), but an alternative response (eating the cake now) is more gratifying or automatic in the short-term. In such situations, self-controlled individuals tend to choose the superior response, whereas impulsive individuals tend to choose the immediately gratifying or automatic response.

Our conceptualization of self-control emphasizes “top-down” processes that inhibit or obviate impulses, and thus implicitly assumes “bottom-up” psychological processes that generate these impulses. While individuals surely vary in what they find tempting (Tsukayama, Duckworth, & Kim, submitted for publication), given that adults and children across cultures reliably rate themselves lower in self-control than in any other character strength (Peterson, 2006), it seems reasonable to assume that almost everyone is tempted by something. That is, while attraction to various vices may vary across individuals, we agree with Oscar Wilde (2009) that for each of us “there are terrible temptations which it requires strength, strength and courage, to yield to” (Second Act, Line 42).

Our review of the self-control literature revealed four distinct approaches to the measurement of self-control: executive function tasks, delay of gratification tasks, self-report questionnaires, and informant-report questionnaires. Arguably, each of these approaches assesses voluntary self-governance in the service of goals or standards. Still, diversity both within and across these types of measures is striking. Because of their distinct histories, we review the four measurement traditions separately below.

Executive function refers to goal-directed, higher-level cognitive processing in which top-down control is exerted over lower-level cognitive processes (Williams & Thayer, 2009). Emerging first in the neuropsychology literature, executive function is a relatively new construct (Burgess, 1997) that, like the construct of self-control, continues to be inconsistently defined and measured (Banfield et al., 2004, Miller, 2000). Behavioral tasks designed to assess executive function have been used to assess individual differences in self-control (e.g., Eigsti et al., 2006, White et al., 1994), the presence of clinical levels of impulsivity (Baker, Taylor, & Leyva, 1995), the effect of self-control interventions (Diamond, Barnett, Thomas, & Munro, 2007), and experimental manipulations aimed at taxing self-control (Hagger, Wood, Stiff, & Chatzisarantis, 2010).

There is growing evidence that executive function is not unitary in nature, but rather a collection of distinct processes associated with the frontal lobes, including working memory, attention, response inhibition, and task switching (Kramer et al., 1994, Miller, 2000, Miyake et al., 2000). Because any single executive function task tends to assess a plurality of these cognitive processes (Burgess, 1997, Zaparniuk and Taylor, 1997), it was not feasible to organize executive function tasks according to any of the proposed taxonomies of executive function. As an alternative, we noted that authors often explicitly referred to measures as belonging to one of 12 subtypes of executive function task (e.g., sun–moon Stroop, color-word Stroop, counting Stroop) and categorized measures accordingly (see Table 1).

Whereas executive function tasks have their roots in the neuropsychology literature and the study of neurological impairment, delay of gratification tasks were first developed to understand normative, age-related changes in child development. The ability to delay the discharge of impulses figured prominently in Freud’s (1922) psychoanalytic theory of ego development. Early attempts to operationalize the capacity to delay gratification for the sake of long-term gain included coding images of humans in action from responses to Rorschach ink blots (Singer, 1955). Such projective measures of delay of gratification generally demonstrated poor reliability and validity and were supplanted by more direct measures developed by Walter Mischel in the 1960s. Performance in delay tasks has been shown to predict academic achievement (Evans and Rosenbaum, 2008, Mischel et al., 1989), drug use (Kirby, Petry, & Bickel, 1999), and aggressive and delinquent behavior (Krueger, Caspi, Moffitt, White, & Stouthamer-Loeber, 1996).

Mischel’s research included three subtypes of delay tasks (see Table 2). In a hypothetical choice delay task, subjects make a series of choices between smaller, immediate rewards and larger, delayed rewards, most or none of which they expect to actually receive. For instance, children answer questionnaire items such as, “I would rather get ten dollars right now than have to wait a whole month and get thirty dollars then” (Mischel, 1961, p. 3). More recently, similar questionnaires have been used to calculate a discount rate for each individual that relates the subjective value of a delayed reward to the delay required to receive it (e.g., Green et al., 1994, Kirby et al., 1999). In a real choice delay task, subjects make an actual (i.e., not hypothetical) choice between a small, immediate reward and a larger, delayed reward (e.g., Duckworth and Seligman, 2005, Mischel, 1958). This decision happens at a single point in time, after which the decision cannot be revoked. The third subtype, the sustained delay task, differs from hypothetical and real choice tasks in that subjects first choose the preferred delayed reward, which is clearly “worth the wait”. Subsequently, the ability to delay gratification is measured as the time subjects can resist the smaller, immediate reward in order to obtain the larger, deferred reward (e.g., Mischel et al., 1989, Solnick et al., 1980).

A fourth subtype of delay task not used by Mischel and colleagues is the repeated trials delay task, in which subjects complete a series of brief trials, in each of which they choose between a smaller, more immediate reward and a larger, delayed reward (e.g., Newman, Kosson, & Patterson, 1992). As in choice delay procedures, subjects receive actual rewards (e.g., nickels or candy) and cannot revoke their decision.
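The discount rate mentioned above for hypothetical choice questionnaires can be illustrated with a short sketch. It assumes the hyperbolic form V = A / (1 + kD), which is common in the delay discounting literature; the text here does not state which functional form the cited questionnaires use, and the function names and dollar amounts below are illustrative only.

```python
def hyperbolic_value(amount, delay_days, k):
    """Subjective present value of a delayed reward, assuming V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay_days)


def indifference_k(immediate, delayed, delay_days):
    """Discount rate k at which the immediate and delayed rewards are valued equally.

    Obtained by solving immediate = delayed / (1 + k * delay_days) for k.
    """
    return (delayed / immediate - 1.0) / delay_days


def prefers_delayed(immediate, delayed, delay_days, k):
    """True if a person with discount rate k values the delayed reward more."""
    return hyperbolic_value(delayed, delay_days, k) > immediate


# Illustrative item: $10 now versus $30 in 30 days.
k_star = indifference_k(10, 30, 30)
print(f"indifference discount rate: {k_star:.4f} per day")
print("patient chooser (k = 0.01) waits:", prefers_delayed(10, 30, 30, 0.01))
print("steep discounter (k = 0.10) waits:", prefers_delayed(10, 30, 30, 0.10))
```

Under this assumed model, a respondent who picks the immediate reward on such an item has a discount rate above the item's indifference value, so a series of items with different indifference values brackets that individual's k.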

In individual difference and clinical psychology research, self-control is most often measured by pen-and-paper personality questionnaires completed by the participant or a close informant (e.g., parent). Questionnaire measures of self-control have been shown to predict academic achievement (Duckworth, Tsukayama, & May, 2010), physical health (Moffitt et al., 2011, Tsukayama et al., 2010), wealth (Moffitt et al., 2011), juvenile delinquency (Benda, 2005), criminal activity in adulthood (Moffitt et al., 2011), and even longevity (Kern & Friedman, 2008).

Our literature search revealed over 100 unique self- and informant-report questionnaires, most designed as stand-alone measures and a few as subscales of omnibus personality, temperament, or psychopathology inventories. Items on these questionnaires suggested considerable heterogeneity in the underlying constructs assessed. For instance, the Eysenck I7 Impulsiveness Scale includes items about doing and saying things without thinking (Eysenck et al., 1984). The Self-Control Scale (Tangney, Baumeister, & Boone, 2004) casts a wider net, including items about acting “without thinking through all the alternatives,” as well as “resisting temptation,” and “concentrating.” Likewise, the Barratt Impulsiveness Scale Version 11 (BIS-11) includes separate scales for motor impulsiveness, non-planning impulsiveness, and cognitive impulsiveness (Barratt, 1985).

In an attempt “to bring order to the myriad of measures and conceptions of impulsivity” (p. 684) in the individual difference and clinical psychology literatures, Whiteside and Lynam (2001) administered several previously published self-control questionnaires to a large sample of undergraduates. Item-level factor analyses produced four distinct factors (UPPS) interpreted as “discrete psychological processes that lead to impulsive-like behaviors” (Whiteside & Lynam, 2001, p. 685): Urgency is the inability to override strong impulses (e.g., “I have trouble controlling my impulses”). (Lack of) premeditation is similar to Eysenck’s conception of acting before thinking (e.g., “My thinking is usually careful and purposeful” (reverse-scored)). (Lack of) perseverance refers to the inability to focus on boring or difficult tasks (e.g., “I tend to give up easily”). Finally, sensation seeking refers to an attraction to exciting and risky activities (e.g., “I’ll try anything once”).

Whiteside and Lynam’s (2001) UPPS model has been validated in subsequent studies (e.g., Miller et al., 2003, Whiteside et al., 2005) but is not the only multidimensional model for self-control. Indeed, at least a dozen different factor structures for self-control (e.g., Barratt, 1985, Buss and Plomin, 1975, Miller et al., 2004, White et al., 1994) have been suggested. One attractive feature of the UPPS is that it situates facets of self-control within the five-factor model of personality, relating urgency to neuroticism, perseverance and planning to conscientiousness, and sensation seeking to extraversion. Also notable is the similarity between the UPPS and Buss and Plomin’s (1975) four-factor model, and at least partial overlap with other proposed factor structures for self-control. Finally, the distinction between sensation seeking impulses and a variety of psychological processes that direct behavior away from those impulses is consistent with dual-system models of self-control (Carver et al., 2009, Eisenberg et al., 2004, Hofmann et al., 2009, Metcalfe and Mischel, 1999, Steinberg, 2008). While dual-system models vary somewhat in their particulars, they all posit two opponent systems underlying the generation of quick, involuntary, and often consummatory impulses on the one hand, and the control of these impulses by deliberate, volitional processes on the other.

Our initial, qualitative survey of the self-control literature indicated considerable heterogeneity in the targeted psychological processes and, in addition, in the level of intended description. Some tasks and questionnaire items, it seemed, were designed to assess aggregate self-controlled behavior (i.e., ultimately behaving in accordance with long-term goals and standards at the expense of short-term gratification). For instance, the Self-Control Scale (Tangney et al., 2004) includes the item, “People would say that I have iron self-discipline.” Other tasks and questionnaire items, in contrast, seemed to target the component psychological processes that precede and contribute to self-controlled behavior. In addition to the four processes specified by Whiteside and Lynam’s (2001) UPPS model, we suggest that self-control may be facilitated by accurately weighing long-term and short-term consequences (delay discounting questionnaire; Kirby et al., 1999), following through on a decision to resist immediate gratification (preschool delay of gratification task, Mischel et al., 1989), suppressing habitual or automatic responses that conflict with one’s goals (go/no-go task; Eigsti et al., 2006), and effectively regulating attention in the face of distractors (Attentional Network Task; Rueda et al., 2004).

Heterogeneity in the targeted dimensions of self-control and in the level of description suggests that correlations among diverse self-control measures may be relatively modest. In addition, measurement error, whether from task-specific or random error variance, should further attenuate estimates of convergent validity. A meta-analysis of published studies reporting multi-method, multi-trait matrices of correlations found that more than 60% of the variance in personality measures was accounted for by task-specific and random error variance (Cote & Buckley, 1987).
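The attenuating effect of measurement error described above can be made concrete with the classical correction for attenuation from test theory. The sketch below is illustrative: the reliabilities and the true correlation are hypothetical values, not estimates from this meta-analysis, and nothing here implies that the authors applied this correction.

```python
import math


def attenuated_r(true_r, rel_x, rel_y):
    """Correlation expected between two error-laden measures.

    Classical test theory: r_observed = r_true * sqrt(rel_x * rel_y),
    where rel_x and rel_y are the reliabilities of the two measures.
    """
    return true_r * math.sqrt(rel_x * rel_y)


def disattenuated_r(observed_r, rel_x, rel_y):
    """Estimate of the error-free correlation given an observed correlation."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return observed_r / math.sqrt(rel_x * rel_y)


# Hypothetical example: two measures whose true scores correlate .50,
# with reliabilities of .70 and .60.
observed = attenuated_r(0.50, 0.70, 0.60)
print(f"expected observed r: {observed:.2f}")
print(f"recovered true r:    {disattenuated_r(observed, 0.70, 0.60):.2f}")
```

With these hypothetical reliabilities the observed correlation falls well below the true-score correlation, which is why modest convergent validity coefficients are expected even when two instruments target the same construct.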

In the current investigation, we meta-analytically examined evidence for convergent validity among self-control measures from published and unpublished studies that used at least two different measures of self-control. We had several specific goals in our analyses: First, we sought an overall estimate of the convergent validity among executive function, delay of gratification, and questionnaire self-control measures. Second, we examined sources of heterogeneity in convergent validity estimates, including type and subtype of self-control measure. Finally, we examined our meta-analytically derived correlation matrix for support of the UPPS model (Whiteside & Lynam, 2001).

Literature search

We used two strategies to search the PsycINFO database for published articles and dissertations available by February 2008 that used more than one measure of self-control. First, keyword searches were conducted for self-control related terms and popular self-control measures.

Results

In total, 236 studies met our inclusion criteria, comprising k = 282 independent samples and N = 33,564 participants. Altogether, j = 3,654 effect sizes were culled from these study reports, which were aggregated to 282 effect sizes for the overall and moderator analyses (at the sample level), and 907 effect sizes for intertype comparisons. Study characteristics are summarized in Appendix A, with corresponding references in Appendix B.
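To show how sample-level correlations can yield different fixed-effect and random-effects summaries, the sketch below pools correlations on the Fisher z scale, using inverse-variance weights and the DerSimonian-Laird estimate of between-sample variance. This is one common approach and is an assumption on our part: the excerpt does not say which estimator the authors used, and the input values are made up for illustration.

```python
import math


def pool_correlations(rs, ns):
    """Pool correlations with fixed-effect and random-effects models (Fisher z scale).

    rs: sample correlations; ns: sample sizes (each must exceed 3).
    """
    if len(rs) != len(ns) or any(n <= 3 for n in ns):
        raise ValueError("need one sample size > 3 per correlation")

    zs = [math.atanh(r) for r in rs]   # Fisher r-to-z transform
    ws = [n - 3 for n in ns]           # inverse of var(z) = 1 / (n - 3)
    sum_w = sum(ws)

    # Fixed-effect estimate: weighted mean of z.
    z_fixed = sum(w * z for w, z in zip(ws, zs)) / sum_w

    # Heterogeneity statistic Q and DerSimonian-Laird between-sample variance.
    q = sum(w * (z - z_fixed) ** 2 for w, z in zip(ws, zs))
    df = len(rs) - 1
    c = sum_w - sum(w * w for w in ws) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects estimate: weights add tau2 to each sampling variance.
    ws_re = [1.0 / (1.0 / w + tau2) for w in ws]
    z_random = sum(w * z for w, z in zip(ws_re, zs)) / sum(ws_re)

    return {
        "r_fixed": math.tanh(z_fixed),
        "r_random": math.tanh(z_random),
        "q": q,
        "tau2": tau2,
    }


# Made-up inputs: one large sample with a small r, two small samples with larger r.
result = pool_correlations([0.10, 0.50, 0.20], [1000, 50, 60])
print({name: round(value, 3) for name, value in result.items()})
```

The fixed-effect mean is dominated by the largest samples, whereas the random-effects mean weights samples more evenly when heterogeneity is present, so the two summaries diverge whenever effect size is related to sample size.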

Based on the total sample of 282 independent effect sizes, the

Discussion

Across 282 multi-method studies and over 33,000 participants, we found moderate convergence across self-control measures. Correlations did not vary systematically by sample characteristics, including gender, sample type, publication year, or whether the correlations were extracted from published articles, dissertations, or email correspondence with authors. In contrast, over half of the heterogeneity in correlations was explained by the type of self-control measure used. Both within and across

Conclusion

The promise of psychology as a cumulative science depends not only upon field-unifying theories and well-designed studies, but also upon valid, consensually understood measures (Mischel, 2009). On the basis of the current meta-analysis, we suggest that evidence for the convergent validity of self-control measures is adequate – and as strong as the evidence of convergent validity for other psychological measures. Looking to the future, we hope the current investigation encourages collaboration

Acknowledgements

This study was funded by the National Institute on Aging K01 mentored research scientist award (Duckworth PI) and the John Templeton Foundation (Seligman PI; Duckworth Co-PI). The authors acknowledge equal contribution to the manuscript.

References (83)

  • S.P. Whiteside et al. (2001). The five factor model and impulsivity: Using a structural model of personality to understand impulsivity. Personality and Individual Differences.
  • D.B. Baker et al. (1995). Continuous performance tests: A comparison of modalities. Journal of Clinical Psychology.
  • J. Banfield et al. The cognitive neuroscience of self-regulation.
  • E.S. Barratt. Impulsive subtraits: Arousal and information processing.
  • R.F. Baumeister et al. (2007). The strength model of self-control. Current Directions in Psychological Science.
  • Beck, D. M., Carlson, S. M., & Rothbart, M. K. (2011). Executive function, effortful control and parent report of...
  • B.B. Benda (2005). The robustness of self-control in relation to form of delinquency. Youth & Society.
  • W. Brown (1910). Some experimental results in the correlation of mental abilities. British Journal of Psychology.
  • P.W. Burgess. Theory and methodology in executive function research.
  • A.H. Buss et al. (1975). A temperament theory of personality development.
  • C.S. Carver et al. (2009). Two-mode models of self-regulation as a tool for conceptualizing effects of the serotonin system in normal behavior and diverse disorders. Current Directions in Psychological Science.
  • H. Cooper (1998). Synthesizing research: A guide for literature reviews.
  • J.A. Cote et al. (1987). Estimating trait, method, and error variance. Generalizing across 70 construct validation studies. Journal of Marketing Research.
  • R.A. Depue et al. (1999). Neurobiology of the structure of personality: Dopamine, facilitation of incentive motivation, and extraversion. Behavioral and Brain Sciences.
  • A. Diamond et al. (2007). Preschool program improves cognitive control. Science.
  • D.M. Dougherty et al. (2005). Laboratory behavioral measures of impulsivity. Behavior Research Methods.
  • A.L. Duckworth et al. (2005). Self-discipline outdoes IQ in predicting academic performance of adolescents. Psychological Science.
  • A.L. Duckworth et al. (2010). Establishing causality using longitudinal hierarchical linear modeling: An illustration predicting achievement from self-control. Social Psychology and Personality Science.
  • I.-M. Eigsti et al. (2006). Predicting cognitive control from preschool to late adolescence and young adulthood. Psychological Science.
  • N. Eisenberg et al. (2004). The relations of effortful control and impulsivity to children’s resiliency and adjustment. Child Development.
  • S. Epstein (1979). The stability of behavior: I. On predicting most of the people much of the time. Journal of Personality and Social Psychology.
  • B.A. Eriksen et al. (1974). Effects of noise letters upon the identification of a target letter in a nonsearch task. Perception & Psychophysics.
  • J.L. Evenden (1999). Varieties of impulsivity. Psychopharmacology. Special Issue: Impulsivity.
  • S.B. Eysenck et al. (1984). Age norms for impulsiveness, venturesomeness and empathy in children. Personality and Individual Differences.
  • A.P. Field (2001). Meta-analysis of correlation coefficients: A Monte Carlo comparison of fixed- and random-effects methods. Psychological Methods.
  • D.W. Fiske (1971). Measuring the concepts of personality.
  • S. Freud (1922). Beyond the pleasure principle.
  • L. Green et al. (1994). Discounting of delayed rewards: A life-span comparison. Psychological Science.
  • M.S. Hagger et al. (2010). Ego depletion and the strength model of self-control: A meta-analysis. Psychological Bulletin.
  • R.K. Heaton et al. (1981). Use of neuropsychological tests to predict adult patients’ everyday functioning. Journal of Consulting and Clinical Psychology.
  • L.V. Hedges. Meta-analysis.
