Modeling perceptual discrimination in dynamic noise: Time-changed diffusion and release from inhibition

https://doi.org/10.1016/j.jmp.2013.05.007

Highlights

  • Dynamic noise impairs performance and shifts RT distributions on the time axis.

  • We describe two diffusion process models for discrimination in dynamic noise.

  • The integrated system model is based on a time-changed diffusion process.

  • The release from inhibition model is based on known physiological processes.

  • Both models gave good accounts of the RT distributions and accuracy from the task.

Abstract

The speed and accuracy of discrimination of featurally-defined stimuli such as letters, oriented bars, and Gabor patches are reduced when they are embedded in dynamic visual noise, but, unlike other discriminability manipulations, dynamic noise produces significant shifts of RT distributions on the time axis. These shifts appear to be associated with a delay in the onset of evidence accumulation by a decision process until a stable perceptual representation of the stimulus has formed. We consider two models for this task, which assume that evidence accumulation and perceptual processes are dynamically coupled. One is a time-changed diffusion model in which the drift and diffusion coefficient grow in proportion to one another. The other is a release from inhibition model, in which the emerging perceptual representation modulates an Ornstein–Uhlenbeck decay coefficient. Both models successfully reproduce the families of RT distributions found in the dynamic noise task, including the shifts in the leading edge of the distribution and the pattern of fast errors. We conclude that both models are plausible psychological models for this task.

Introduction

In contributing an article to honor William Estes as one of the creators of mathematical psychology, we begin by reflecting on what it means to have done as Estes did, and created a discipline where none was before. Estes made numerous deep and influential contributions during his long and distinguished career, but, arguably, none had greater or more enduring significance for the future of the discipline than his original seminal work in animal learning, stimulus sampling theory (Estes, 1950, Estes, 1955a, Estes, 1955b, Estes and Burke, 1953). In creating stimulus sampling theory, Estes not only constructed an elegant and powerful theory of learning, but also showed by example just what it means to develop and test a process model of a psychological phenomenon. Stimulus sampling theory first confronted the issue that has confronted every process model since then, namely, the inherent variability of behavior: the fact that organisms, whether human or nonhuman, do not exhibit the same behavior from trial to trial or from one presentation of a stimulus to the next. Consequently, a process model for learning must be expressed at the level of operators that show how choice probabilities evolve from trial to trial. Such probabilistic variation is not just a layering of a measurement error model on top of a deterministic process, but is integral to the theory itself.

Those of us who work with process models for psychological phenomena belong to a tradition begun by Estes and are profoundly indebted to him. From his example we understand that the development of a process model is the discipline of expressing a psychological explanation in quantitative terms and, in so doing, of determining precisely what its empirical consequences might be. It is also the discipline of testing a quantitatively expressed explanation against empirical data. Like all applied mathematics, it is the art of making complex problems tractable. In this, it is the art of distinguishing the essential from the superfluous and the simple from the simplistic. Anyone who does work of this kind knows what the benefits of this undertaking can be. The attempt to express a psychological principle in quantitative terms is usually, in the first instance, a process of discovering that the things you thought were precise are in fact not so. It is also a way of flushing out unexamined assumptions and of exposing them to critical scrutiny.

Estes began his long career during the ascendancy of behaviorism and finished it long after the cognitive revolution had become the cognitive orthodoxy. The evolution of his research interests over time reflected the change in the conceptual landscape, from learning, which was the driving force for behaviorism, to perception, memory, categorization, and decision-making. These are topics that remain of central concern to mathematical psychologists today. A number of his later papers focused on the problem of determining whether variables that affect performance in visual recognition tasks do so by affecting perceptual or decision processes (Bjork and Estes, 1973, Estes, 1972, Estes, 1975, Estes, 1982). Estes was profoundly aware of the contribution made by decision processes, which match incoming sensory information against task representations in immediate memory, to performance in simple cognitive tasks. He was also aware of the hazards of theorizing about perceptual and decision processes in isolation, arguing that a proper understanding could only be gained by considering how they act in concert. That question, although framed in somewhat different terms, is the focus of this article.

In a sequence of 12 experiments, Ratcliff and Smith (2010) investigated performance in a novel two-choice discrimination task in which letter stimuli were degraded by embedding them in dynamic visual noise. In their task, a randomly-chosen proportion of the pixels in the letter and the background were inverted in each consecutive frame of the display. Like other manipulations of discriminability, dynamic noise increased response time (RT) and reduced accuracy, but unlike other manipulations, it also produced significant shifts of the RT distribution on the time axis. These were manifested as changes in the distribution’s leading edge, as indexed by its 0.10 quantile. Changes in the 0.10 quantile depend only on the fastest 10% of responses in the distribution and are relatively independent of changes in its variance or higher moments. Ratcliff and Smith found that dynamic noise shifted the leading edge of the distribution by more than 100 ms in the most difficult as compared to the easiest condition.

Fig. 1 shows examples of the stimuli used by Ratcliff and Smith (2010) in their Experiment 1, together with a quantile–probability plot (Ratcliff & Tuerlinckx, 2002) of group data from an unpublished experiment that used the same task. The details of the method can be found in Appendix A. Participants performed the task under speed and accuracy instructions in alternating blocks at five levels of stimulus discriminability, formed by inverting 0.35, 0.40, 0.425, 0.45, or 0.475 of the pixels in the display. (When 0.5 of the pixels are inverted, the display becomes a homogeneous, random field of black and white pixels that carries no stimulus information.) In quantile–probability plots, selected quantiles of the RT distributions for correct responses and errors are plotted against the choice probabilities, $p_i$ and $1 - p_i$, for each condition, $i$. Such plots show how distribution shape, response accuracy, and the relationship between mean RTs for correct responses and errors all change as stimulus discriminability is varied. The distributions in Fig. 1 have been summarized using their 0.1, 0.3, 0.5, 0.7, and 0.9 quantiles.
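The coordinates of a quantile–probability plot can be computed directly from trial-level data. The following is a minimal sketch, not the authors' analysis code: the `(rt_seconds, is_correct)` data layout, the function names, and the linear-interpolation rule for quantiles are all assumptions made for illustration.

```python
QUANTILES = (0.1, 0.3, 0.5, 0.7, 0.9)


def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted, non-empty list."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def qp_points(trials, quantiles=QUANTILES):
    """Summarize one condition for a quantile-probability plot.

    trials: list of (rt_seconds, is_correct) pairs (hypothetical data layout).
    Returns {"correct": (p, rt_quantiles), "error": (1 - p, rt_quantiles)};
    a response class with no trials gets an empty quantile list.
    """
    correct = sorted(rt for rt, ok in trials if ok)
    errors = sorted(rt for rt, ok in trials if not ok)
    p_correct = len(correct) / len(trials)

    def summarize(rts):
        return [quantile(rts, q) for q in quantiles] if rts else []

    return {
        "correct": (p_correct, summarize(correct)),
        "error": (1.0 - p_correct, summarize(errors)),
    }


# Made-up trials for one condition: five correct responses, two errors.
demo = [(0.4, True), (0.5, True), (0.6, True), (0.7, True), (0.8, True),
        (0.5, False), (0.9, False)]
print(qp_points(demo))
```

Each condition contributes two columns of points to the plot: the correct-response quantiles at abscissa $p_i$ and the error quantiles at $1 - p_i$. The lowest point in each column is the 0.1 quantile, the leading edge of the distribution.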

The unusual result in Fig. 1 is the systematic change in the leading edge of the distribution as a function of noise, which appears as a bowing of the curve representing the 0.1 quantile (the bottom curve in the plot) in both the speed and accuracy conditions. This is unlike the results found in the vast majority of speeded two-choice decision tasks. In most tasks, most of the changes in the distributions are in the upper quantiles; the leading edge is relatively unaffected and the curve representing the 0.1 quantile is almost flat (Ratcliff & Smith, 2004). Following Ratcliff and Smith (2010), we refer to the bowing of the 0.1 quantile function in Fig. 1 as the leading edge effect.

The leading edge effect in Ratcliff and Smith’s (2010) study was found only with letter discrimination in dynamic noise. There was no leading edge effect in a brightness discrimination task with dynamic noise, in which participants were required to judge whether the average proportion of light pixels in the display was greater or less than 50%. There was no leading edge effect in a letter discrimination task, in which letters were degraded by a simultaneous structure mask composed of random letter fragments in the same stroke font as the stimuli. There was a smaller leading edge effect in the letter discrimination task when the noise was static rather than dynamic.

Ratcliff and Smith (2010) attributed the leading edge effect to a delay in the onset of information accumulation by a decision process until a stable perceptual representation of the stimulus had formed. The phenomenological basis for this interpretation is compelling: When letters are viewed in dynamic noise, they appear to emerge slowly out of the noise. The perceptual experience is quite unlike that in the masking-by-structure discrimination task, in which the stimuli seem to appear instantaneously.

The data in Fig. 1 can be well fitted by a version of Ratcliff’s (1978) diffusion model in which the non-decision time, or time for other processes, $T_{er}$, varies systematically with the level of noise in the display. Ratcliff’s model assumes that RT can be additively decomposed into a decision time, $T_D$, and a time for other processes,

$$RT = T_D + T_{er}. \tag{1}$$

The decision time is the first passage time for a Wiener or Brownian motion diffusion process through one of two absorbing boundaries that represent decision criteria. The absorbing boundaries represent upper and lower limits of diffusion: once a boundary is reached, diffusion ceases. Formally, if $X(t)$ is a diffusion process starting at zero, $X(0) = 0$, and $a_1$ and $a_2$ are absorbing boundaries, with $a_2 < 0 < a_1$, then we define the first passage times, $T(a_1)$ and $T(a_2)$, as

$$T(a_1) = \min\{t : X(t) \ge a_1\}, \qquad T(a_2) = \min\{t : X(t) \le a_2\}.$$

The decision time, $T_D$, in Eq. (1) is the first of these events to occur: $T_D = \min\{T(a_1), T(a_2)\}$. Either $T(a_1)$ or $T(a_2)$ may be infinite, but $T_D$ is finite with probability one (Cox & Miller, 1965). That is, the process is guaranteed to terminate in finite time. The boundaries are defined as absorbing by the relations

$$P[X(t) = a_1 \mid T(a_1) \le t] = 1, \qquad P[X(t) = a_2 \mid T(a_2) \le t] = 1.$$

These equations state that once the process has reached an absorbing boundary its value does not change any further. Absorbing boundaries are one of several kinds of possible boundary for diffusion processes, which may be either accessible or inaccessible and absorbing, reflecting, or sticky. A discussion of the varieties of boundary behavior may be found in Karlin and Taylor (1981, pp. 226–242). A combination of absorbing and reflecting boundaries has been used in decision models with racing, parallel diffusion processes (Ratcliff and Smith, 2004, Usher and McClelland, 2001) and in models of simple RT (Diederich, 1995).

The information accumulation process in Ratcliff’s model, again denoted as $X(t)$, can be described by a stochastic differential equation,

$$dX(t) = \xi\,dt + s\,dW(t). \tag{2}$$

In this equation, $\xi$ is a (random) drift coefficient, $s$ is the square root of the diffusion coefficient, and $W(t)$ is a standard Brownian motion process. The square root of the diffusion coefficient is also termed the infinitesimal standard deviation. A standard Brownian motion process has zero drift, unit variance, independent increments, covariance function $\mathrm{cov}[W(\tau), W(t)] = \min(\tau, t)$, and possesses a version which is almost surely continuous and is almost everywhere non-differentiable (Karlin & Taylor, 1981). In the psychological model of Eq. (2), it describes a process in which evidence is accumulated continuously in time and perturbed by broad spectrum Gaussian noise, idealized as white noise. The drift is assumed to be normally distributed, $\xi \sim N(\nu, \eta)$, with mean $\nu$ and standard deviation $\eta$. The standard deviation $\eta$ describes the between-trial variability in stimulus quality, like the noise in signal detection theory. In most applications of diffusion process models, the model can be fitted with a single value of $T_{er}$, but Ratcliff and Smith (2010) found that a separate value of $T_{er}$ was needed for each condition to account for the data from the dynamic noise task. We have used $\xi$ and $s$ to denote the drift and infinitesimal standard deviation in Eq. (2) to emphasize the link with Ratcliff’s work, but elsewhere in the article we denote them by $\nu$ and $\sigma$, respectively.
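The additive decomposition of RT, the first-passage definition of decision time, and the stochastic differential equation for evidence accumulation can be illustrated by simulation. The sketch below uses a simple Euler discretization, which is an assumption made for illustration and not the fitting method used in the article. The parameter values, the function names, and the `t_max` guard are likewise illustrative.

```python
import math
import random


def simulate_trial(nu, eta, s, a1, a2, ter, rng, dt=0.001, t_max=10.0):
    """Simulate one trial of the Wiener diffusion model by the Euler method.

    Returns (rt, response), where response is "upper", "lower", or None if
    no boundary was reached before t_max (a guard for the simulation only).
    """
    xi = rng.gauss(nu, eta)          # trial-specific drift, xi ~ N(nu, eta)
    x, t = 0.0, 0.0                  # the process starts at X(0) = 0
    noise_sd = s * math.sqrt(dt)     # sd of the Brownian increment over dt
    while t < t_max:
        x += xi * dt + noise_sd * rng.gauss(0.0, 1.0)
        t += dt
        if x >= a1:
            return t + ter, "upper"  # RT = decision time + time for other processes
        if x <= a2:
            return t + ter, "lower"
    return None, None


def simulate(n_trials, seed=1, **params):
    """Simulate n_trials independent trials with a seeded generator."""
    rng = random.Random(seed)
    return [simulate_trial(rng=rng, **params) for _ in range(n_trials)]


# Illustrative parameter values, not estimates from the article.
results = simulate(200, nu=0.3, eta=0.1, s=0.1, a1=0.06, a2=-0.06, ter=0.3)
finished = [(rt, resp) for rt, resp in results if resp is not None]
p_upper = sum(resp == "upper" for _, resp in finished) / len(finished)
mean_rt = sum(rt for rt, _ in finished) / len(finished)
print(f"P(upper) = {p_upper:.3f}, mean RT = {mean_rt:.3f} s")
```

In this sketch a change in `ter` shifts every simulated RT by the same amount, which is how a condition-dependent value of the non-decision time moves the whole distribution along the time axis.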

The additive decomposition in Eq. (1) is consistent with the kind of discrete stages model proposed by Sternberg (1969), in which the process of stimulus identification does not begin until after the process of stimulus encoding is complete. If we make such an interpretation, and if we identify the process of stimulus identification with the accumulation of evidence by a diffusion process or some other sequential-sampling process, then Ratcliff and Smith’s (2010) results imply that the effect of dynamic noise is to delay the onset of evidence accumulation: the noisier the stimulus, the more evidence accumulation is delayed. This is consistent with the phenomenology, but it leaves open the deeper question of how the decision process knows when to “turn on.” To express this in less homuncular terms, what is the trigger signal or other mechanism that initiates the process of evidence accumulation, and how is it linked to the process of perceptual encoding?

Ratcliff and Smith (2010) proposed two general mechanisms that could adaptively couple the onset of evidence accumulation to the time course of stimulus encoding. One mechanism was based on the integrated system model of Smith and Ratcliff (2009), which is a form of stochastic continuous-flow system, like the cascade model of McClelland (1979). In the integrated system model, the onset of evidence accumulation is gradual rather than abrupt. The decision process becomes active as soon as stimulus information becomes available, but the rate of accumulation increases as the stimulus representation develops. The rate of evidence accumulation is controlled by a time-dependent diffusion coefficient that sets the clock of the process: the larger the diffusion coefficient, the more rapidly the process diffuses towards the absorbing boundaries. The coupling of the encoding and decision processes provided by the time-dependent diffusion coefficient avoids the need for a mechanism that initiates evidence accumulation based on an assessment of stimulus quality. The accumulation process in the integrated system is described by the stochastic differential equation

$$dX(t) = \nu(t)\,dt + \sigma(t)\,dW(t). \tag{3}$$

Here $\nu(t)$ is a time-dependent drift and $\sigma(t)$ is a time-dependent infinitesimal standard deviation. The time dependency in the coefficients reflects the time course of perceptual encoding. In tasks with brief stimulus exposures the encoded stimulus information is identified with the contents of visual short-term memory (VSTM). In fitting the model to data, we again assume the additive decomposition of Eq. (1), but make a slightly different interpretation of $T_{er}$. In Ratcliff’s model, $T_{er}$ is an aggregate of the times for perceptual encoding, response selection, and response execution. In the model of Eq. (3), stimulus information becomes available part way through the encoding process and begins to drive the decision process. The component of encoding that begins when the decision process becomes active and ends when the drift and diffusion coefficients have reached their maximum value is excluded from $T_{er}$.

To obtain a well-behaved model, the drift and diffusion coefficients are assumed to be proportional to one another: $\nu(t) \propto \sigma^2(t)$. Here “well-behaved” means a model that predicts distributions of RT and orderings of correct responses and errors that resemble those found in empirical data. The drift is assumed to depend on the stimulus condition whereas the diffusion coefficient is the same for all conditions. In a neurally-inspired version of the model, the drift depends on the difference between an excitatory and an inhibitory process and the diffusion coefficient depends on their sum (Smith, 2010, Smith and McKenzie, 2011). The proportionality between the drift and diffusion coefficients is then only approximate rather than exact, but suffices to yield a well-behaved model. In either case, the rate of evidence accumulation depends on the diffusion coefficient. When no stimulus information is present, $\nu(t)$ and $\sigma^2(t)$ are both zero and no accumulation takes place.
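The consequence of letting the drift and diffusion coefficients grow in proportion can be seen in a small simulation. The sketch below assumes a hypothetical exponential growth function, $\theta(t) = 1 - e^{-\beta t}$, that scales both coefficients. This function, the parameter values, and the Euler scheme are assumptions chosen for illustration and are not the VSTM-based drift model used in the article.

```python
import math
import random


def growth(t, beta):
    """Hypothetical growth function theta(t), rising from 0 towards 1."""
    return 1.0 - math.exp(-beta * t)


def decision_time(nu, sigma, a, beta, rng, dt=0.002, t_max=5.0):
    """First passage time through +a or -a when the drift nu*theta(t) and the
    diffusion coefficient sigma**2*theta(t) grow in proportion to one another."""
    x, t = 0.0, 0.0
    while t < t_max:
        theta = growth(t + 0.5 * dt, beta)   # midpoint value over the step
        x += nu * theta * dt + sigma * math.sqrt(theta * dt) * rng.gauss(0.0, 1.0)
        t += dt
        if abs(x) >= a:
            return t
    return t_max                             # simulation guard only


def leading_edge(beta, n_trials=300, seed=7):
    """Approximate 0.1 quantile of simulated decision times for growth rate beta."""
    rng = random.Random(seed)
    times = sorted(decision_time(0.3, 0.1, 0.06, beta, rng) for _ in range(n_trials))
    return times[int(0.1 * (n_trials - 1))]


fast = leading_edge(beta=50.0)   # representation forms quickly
slow = leading_edge(beta=3.0)    # representation forms slowly
print(f"leading edge: fast growth {fast:.3f} s, slow growth {slow:.3f} s")
```

Because both coefficients are scaled by the same function, slowing the growth stretches the time axis of the process rather than changing its path through the evidence space. In this sketch that stretching shows up as a later leading edge for the slowly growing representation.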

The second mechanism proposed by Ratcliff and Smith (2010) was release from inhibition, which they conceptualized as a stimulus-dependent modulation of decay in an Ornstein–Uhlenbeck (OU) diffusion process (Busemeyer and Townsend, 1992, Busemeyer and Townsend, 1993, Smith, 1995, Smith, 2000, Usher and McClelland, 2001). The information accumulation process in this model can be described by the stochastic differential equation

$$dX(t) = [\nu(t) - \lambda(t)X(t)]\,dt + \sigma\,dW(t). \tag{4}$$

In this equation, $\nu(t)$ and $\lambda(t)$ are, respectively, time-dependent stimulus information and decay coefficients. Unlike Eq. (3), the diffusion coefficient, $\sigma^2$, is constant. In Eq. (4), evidence accumulation is controlled by $\lambda(t)$ rather than by the diffusion coefficient.

The release from inhibition mechanism relies on the properties of the stationary distribution of the OU process. Unlike the Wiener process in Eqs. (2) and (3), the OU process possesses a stationary distribution. For an OU process with constant stimulus information, $\nu(t) \equiv \nu$, and constant decay, $\lambda(t) \equiv \lambda$, the mean and variance of the process are

$$E[X(t)] = \frac{\nu}{\lambda}\left[1 - e^{-\lambda t}\right] \qquad \text{and} \qquad \mathrm{var}[X(t)] = \frac{\sigma^2}{2\lambda}\left[1 - e^{-2\lambda t}\right],$$

respectively (Karlin and Taylor, 1981, Smith, 2000). Because the process $X(t)$ is Gaussian, its finite dimensional distributions are completely characterized by its first two moments, together with its covariance function. At large values of $t$, $X(t)$ has a stationary Gaussian distribution, $N(\nu/\lambda, \sigma/\sqrt{2\lambda})$. If $\lambda$ is large, most of the probability mass will be concentrated in the vicinity of the starting point, $X(0) = 0$. There is a non-negligible probability that the process can reach an absorbing boundary and trigger a response, but when $\lambda$ is large this probability will be small.
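The moments of the OU process with constant coefficients can be checked numerically. For a linear stochastic differential equation of this form, the mean $m(t)$ and variance $v(t)$ of an unbounded process started at zero satisfy $dm/dt = \nu - \lambda m$ and $dv/dt = \sigma^2 - 2\lambda v$. The sketch below integrates these two equations by the Euler method and compares the result with the closed-form expressions. The parameter values and function names are illustrative assumptions.

```python
import math


def ou_mean(t, nu, lam):
    """Closed-form mean of the OU process started at X(0) = 0."""
    return (nu / lam) * (1.0 - math.exp(-lam * t))


def ou_var(t, sigma, lam):
    """Closed-form variance of the OU process started at X(0) = 0."""
    return (sigma ** 2 / (2.0 * lam)) * (1.0 - math.exp(-2.0 * lam * t))


def integrate_moments(t_end, nu, sigma, lam, dt=1e-4):
    """Euler integration of dm/dt = nu - lam*m and dv/dt = sigma^2 - 2*lam*v."""
    m, v = 0.0, 0.0
    for _ in range(round(t_end / dt)):
        m, v = m + (nu - lam * m) * dt, v + (sigma ** 2 - 2.0 * lam * v) * dt
    return m, v


nu, sigma, lam = 1.0, 0.1, 5.0   # illustrative values only
for t in (0.1, 0.5, 2.0):
    m_num, v_num = integrate_moments(t, nu, sigma, lam)
    print(f"t={t}: mean {ou_mean(t, nu, lam):.6f} vs {m_num:.6f}, "
          f"var {ou_var(t, sigma, lam):.8f} vs {v_num:.8f}")

# Limits as t grows: mean nu/lam and standard deviation sigma/sqrt(2*lam).
print("stationary mean", nu / lam, "stationary sd", sigma / math.sqrt(2.0 * lam))
```

The stationary standard deviation shrinks as the decay coefficient grows, which is why a large decay keeps the process close to its starting point and makes boundary crossings rare.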

In Ratcliff and Smith’s (2010) proposed release from inhibition model, at the beginning of the trial $\lambda(t)$ is large. Because the diffusion coefficient is constant, the process accumulates information, but on the majority of trials it stays near its starting point. At a point at which the quality of the information provided by the perceptual encoding process is sufficient, the inhibition is released, and the decay coefficient changes from a large to a small value. An OU process with small decay approximates a Wiener process (Ratcliff & Smith, 2004). Consequently, once inhibition is released, the accumulation process will satisfy a stochastic differential equation like Eq. (2). With $\xi = \nu$ and $s = \sigma$, the mean and variance of this process are $E[X(t)] = \nu t$ and $\mathrm{var}[X(t)] = \sigma^2 t$. That is, once inhibition is released, the mean and variance of the process increase linearly with time as occurs in Ratcliff’s (1978) model. The resulting model would be expected to behave similarly to Ratcliff’s model with a random starting point and a value of $T_{er}$ that depends on the time at which release from inhibition occurs.
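The release from inhibition mechanism described above can be sketched in a few lines of simulation. The version below makes two simplifying assumptions that are not taken from the article: the decay coefficient changes as a step function at a fixed release time, and the stimulus information is constant throughout the trial. The parameter values, function names, and Euler scheme are also illustrative.

```python
import math
import random


def decay(t, t_release, lam_high, lam_low):
    """Hypothetical step change in the decay coefficient lambda(t)."""
    return lam_high if t < t_release else lam_low


def decision_time(t_release, rng, nu=0.3, sigma=0.1, a=0.06,
                  lam_high=40.0, lam_low=0.5, dt=0.001, t_max=5.0):
    """First passage time of dX = [nu - lambda(t) X] dt + sigma dW through +a or -a."""
    x, t = 0.0, 0.0
    noise_sd = sigma * math.sqrt(dt)
    while t < t_max:
        lam = decay(t, t_release, lam_high, lam_low)
        x += (nu - lam * x) * dt + noise_sd * rng.gauss(0.0, 1.0)
        t += dt
        if abs(x) >= a:
            return t
    return t_max                     # simulation guard only


def leading_edge(t_release, n_trials=300, seed=11):
    """Approximate 0.1 quantile of simulated decision times for a release time."""
    rng = random.Random(seed)
    times = sorted(decision_time(t_release, rng) for _ in range(n_trials))
    return times[int(0.1 * (n_trials - 1))]


early = leading_edge(t_release=0.1)   # inhibition released early
late = leading_edge(t_release=0.3)    # inhibition released late
print(f"leading edge: early release {early:.3f} s, late release {late:.3f} s")
```

While the decay is large the process hovers near its starting point, so few responses occur before release. In this sketch a later release therefore moves the leading edge of the decision time distribution later, and the position of the process at the moment of release acts like a random starting point for the subsequent accumulation.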

Ratcliff and Smith (2010) discussed these mechanisms only in general qualitative terms. Our aim in this article is to describe formal implementations of them and to report fits to experimental data. To foreshadow our results, both models provide a good quantitative account of performance in the dynamic noise task. They accurately capture the leading edge effect and also the pattern of fast errors in Fig. 1. Notably, they do so without the assumption of between-trial variance in starting point. In the diffusion model the starting point of the accumulation process, $z$, is assumed to be uniformly distributed with range $s_z$ (Ratcliff, Van Zandt, & McKoon, 1999). Starting point variability allows the model to capture the pattern of fast errors that is often found when discriminability is high and speed is stressed (Luce, 1986). Our models are able to capture this pattern without assuming trial-to-trial variation in starting point. Before we describe fits of the models we first characterize their qualitative properties to give the reader insight into the way in which they are able to predict the patterns of performance found in the data.

Section snippets

The integrated system model as a time-changed diffusion process

The accumulation process in the integrated system model can be viewed as a time-changed diffusion process, in which the instantaneous rate of evidence accumulation depends on the time-dependent diffusion coefficient, $\sigma^2(t)$. Useful insights into the properties of such processes can be obtained by considering the transformation that changes them into a standard Brownian motion process. This transformation is the basis for numerical integration equation methods for solving first-passage time

The release from inhibition model

Fig. 3 shows the transformation that maps the release from inhibition model to a standard Wiener process. In Appendix B it is shown that this transformation is

$$x^{*} = \frac{1}{\sigma}\left\{x \exp\left[\int_0^t \lambda(s)\,ds\right] - \int_0^t \nu(s) \exp\left[\int_0^s \lambda(z)\,dz\right] ds\right\}, \qquad t^{*} = \int_0^t \exp\left[2\int_0^s \lambda(z)\,dz\right] ds.$$

This transformation generalizes the well-known result (Cox & Miller, 1965, p. 229), that a standard OU process, with drift $-\lambda x$ and unit variance, can be realized from the Wiener process by an exponential expansion of the time variable and an exponential contraction of the

Assumptions of the model

The integrated system model (Sewell and Smith, 2012, Smith et al., 2010, Smith and Ratcliff, 2009) combines a time-inhomogeneous diffusion decision process with a process model of drift (Appendix C). The drift model seeks to characterize the combined effects of perception, memory, and attention on performance in speeded two-choice tasks. The model was developed to account for the effects of spatial attention in near-threshold visual tasks with briefly presented stimuli and, to that end, it

The time course of perceptual encoding

The two key assumptions of the release from inhibition model are: (a) dynamic noise delays the process of forming a representation of the information in the stimulus, and (b) the process of evidence accumulation is controlled by a time-dependent OU decay coefficient that is time-locked to the developing representation. Instead of assuming the VSTM-based drift model of the integrated system model, we sought to implement these assumptions in the simplest possible way, to allow us to evaluate the

Generalizing to other dynamic noise tasks

Ratcliff and Smith (2010) attributed the leading edge effect to the time needed to form a perceptual representation of the features of a noisy stimulus. Although they showed the effect occurs in letter discrimination, they did not investigate whether it occurs in other tasks in which stimulus features are presented in dynamic noise. Letter discrimination is a relatively complex task because the stimuli are comprised of multiple features. To perform it, people must form representations of the

The overconstrained estimation view

Compared to the standard diffusion model of Eq. (2), the integrated system model and the release from inhibition model make relatively complex assumptions about the time course of processing within a trial. We were motivated to consider such complex models because the standard diffusion model is unable to account for data like those in Fig. 1. An alternative view was proposed by Donkin, Brown, and Heathcote (2009), who argued that poor fits like those reported by Ratcliff and Smith (2010) may

Discussion

In this article, we considered two process models for decision-making in the dynamic noise task. Our larger theoretical concern in investigating this task was to try to understand the relationship between perceptual and decision processes in two-choice discrimination. Diffusion models have been extremely successful in accounting for performance in such tasks, but they do so by subsuming all of the processes prior to the decision process into a single value of drift, which is most often treated

Coda

As mathematical psychologists working today we often tend to take for granted the idea that psychological processes can be characterized mathematically. We are comfortable that our program—of developing psychological explanations for behavior, expressing them in mathematical form, and then testing the resulting model against empirical data—is a meaningful one. It is therefore easy to forget that when William Estes and others created mathematical psychology in the 1950s it was by no means

Acknowledgments

The research in this article was supported by Australian Research Council Discovery Grant DP 110103406 to Philip Smith and Air Force Office of Scientific Research Grant FA9550-11-1-0130 to Roger Ratcliff.

References (64)

  • P.L. Smith (2000). Stochastic dynamic models of response time and accuracy: a foundational primer. Journal of Mathematical Psychology.
  • P.L. Smith et al. (2004). Attention orienting and the time course of perceptual decisions: response time distributions with masked and unmasked displays. Vision Research.
  • S. Sternberg. The discovery of processing stages: extension of Donders’ method.
  • M. Abramowitz et al. (1964). Handbook of mathematical functions.
  • G.S. Berns et al. How the basal ganglia make decisions.
  • R.B. Bhattacharya et al. (1990). Stochastic processes with applications.
  • E.L. Bjork et al. (1973). Letter identification in relation to linguistic context and masking conditions. Memory & Cognition.
  • A. Buonocore et al. (1990). On the two-boundary first-crossing-time problem for diffusion processes. Journal of Applied Probability.
  • A. Buonocore et al. (1987). A new integral equation for the evaluation of first-passage-time probabilities densities. Advances in Applied Probability.
  • J. Busemeyer et al. (1993). Decision field theory: a dynamic-cognitive approach to decision making in an uncertain environment. Psychological Review.
  • T.A. Busey et al. (1994). Sensory and cognitive components of visual information acquisition. Psychological Review.
  • I.D. Cherkasov (1957). On the transformation of the diffusion process to a Wiener process. Theory of Probability and its Applications.
  • D.R. Cox et al. (1965). The theory of stochastic processes.
  • R.L. De Valois et al. (1990). Spatial vision.
  • C. Donkin et al. (2009). The overconstraint of response time models: rethinking the scaling problem. Psychonomic Bulletin & Review.
  • W.K. Estes (1950). Toward a statistical theory of learning. Psychological Review.
  • W.K. Estes (1955). Statistical theory of spontaneous recovery and regression. Psychological Review.
  • W.K. Estes (1955). Statistical theory of distributional phenomena in learning. Psychological Review.
  • W.K. Estes (1972). Interactions of signal and background variables in visual processing. Perception & Psychophysics.
  • W.K. Estes (1975). The locus of inferential and perceptual processes in letter identification. Journal of Experimental Psychology: General.
  • W.K. Estes (1982). Similarity-related channel interactions in visual processing. Journal of Experimental Psychology: Human Perception and Performance.
  • W.K. Estes et al. (1953). A theory of stimulus variability in learning. Psychological Review.