Introduction

A thorny issue in timing research is how to prevent people from adopting a counting strategy when they perform a temporal task. The spontaneous use of a counting strategy to estimate durations emerges at around 10 years of age (Levin & Wilkening, 1989; Wilkening, Levin, & Druyan, 1987), although some younger 7- to 8-year-old children also count time (Clément & Droit-Volet, 2006; Espinosa-Fernandez, de la Torre Vacas, García-Viedma, García-Guitiérrez, & Torres Colmenero, 2004). Beyond this age, most people count at a constant rhythm to ensure the accuracy of their temporal estimates. According to Fraisse (1963), 97% of adults spontaneously use a counting strategy when they have to estimate time. However, timing with or without counting has different effects on the two fundamental properties of time perception, known as the scalar properties of time (for a review, see Wearden & Lejeune, 2008). First, mean temporal estimates (M) are more accurate with counting than without it. Second and more important, the counting of time constitutes a violation of the scalar property of variance. This property requires the variability in temporal estimates (SD) to increase with duration value, such that the coefficient of variation (SD/M) remains constant across different duration values, consistent with Weber’s law. SD/M is an index of variability in temporal judgments. Thus, the lower SD/M is, or the lower the magnitude of variability in temporal judgments is, the higher is the sensitivity to time. Contrary to the scalar property of variance, when participants count time, SD remains constant, and SD/M decreases as duration value increases, indicating that sensitivity to time thus increases with duration value. 
The violation of this scalar timing property is explained by the fact that the counting activity reduces the variance in estimated duration by subdividing time into fixed subintervals of approximately 1 s (Grondin, Meilleur-Wells, & Lachance, 1999; Killeen, 1992; Killeen & Weiss, 1987). The sum of the variances of these subintervals is thus lower than the variance produced for the estimation of the interval taken as a whole. With a counting strategy, temporal estimates therefore remain more constant from trial to trial, and the amount of noise introduced into the representation of time does not increase with the magnitude of the duration (e.g., Clément & Droit-Volet, 2006; Grondin, 2001; Grondin & Killeen, 2009; Hinton, Harrington, Binder, Dargerian, & Rao, 2004; Hinton & Rao, 2004; Rakitin et al., 1998; Wearden, 1991). To avoid counting strategies and their effects on the scalar properties of timing, researchers thus choose a specific no-counting method, but often in an arbitrary way. Indeed, no consistent strategy has been employed across studies to deal with the counting problem. This may be one of the main reasons why there is no consensus regarding which processes, brain regions, and time scales determine our experience of time. The present study was thus the first to test systematically how the various methods classically used to suppress counting affect time perception in different temporal tasks.
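
The variance argument can be illustrated with a small numerical sketch. It assumes, purely for illustration, that timing an interval as a whole produces an SD proportional to its duration (SD = k × T) and that counting splits the interval into independent 1-s subintervals, each timed with the same proportional noise. The constant k = 0.15 and the function names are hypothetical; they are not values or procedures taken from the studies cited above.

```python
import math

def cv_whole(duration_s, k=0.15):
    """CV when the interval is timed as a whole: SD = k * T, so SD/M = k."""
    sd = k * duration_s
    return sd / duration_s

def cv_counting(duration_s, k=0.15, sub_s=1.0):
    """CV when the interval is split into independent subintervals of sub_s seconds."""
    n = duration_s / sub_s            # number of counted subintervals
    sd_sub = k * sub_s                # SD of one subinterval (assumed scalar noise)
    sd_total = math.sqrt(n) * sd_sub  # variances of independent subintervals add
    return sd_total / duration_s

for t in (1, 2, 4, 8):
    print(t, round(cv_whole(t), 3), round(cv_counting(t), 3))
```

Under these assumptions the SD of a counted interval still grows, but only with the square root of the duration, so SD/M falls as duration increases. This is the qualitative pattern described above, not a fitted model.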

To avoid chronometric counting, researchers often decide to use durations shorter than 1 s (e.g., Grondin et al., 1999; Grondin, Ouellet, & Roussel, 2004; McCormack, Wearden, Smith, & Brown, 2005; Rattat & Picard, 2011; Wearden, Pilkington, & Carter, 1999). Grondin et al. (1999) demonstrated that “it becomes useful to count explicitly when intervals are longer than 1.18 s” (p. 993). However, the use of short durations raises the question of the mechanism(s) underlying the processing of different duration ranges. Within the framework of scalar expectancy theory (Gibbon, 1977), some researchers have postulated that the same timing mechanism underlies the processing of durations both shorter and longer than 1 s. However, a growing number of studies suggest that there are two distinct timing mechanisms: a sensory mechanism for durations in the milliseconds range and a more cognitively mediated mechanism for durations in the seconds-to-minutes range (e.g., Gutyrchik et al., 2010; Kagerer, Wittmann, Szelag, & Steinbüchel, 2002; Lewis & Miall, 2003; Rammsayer, 2009; Wittmann, Leland, Churan, & Paulus, 2007; Zélanti & Droit-Volet, 2011). For instance, Ulbrich, Churan, Fink, and Wittmann (2007) showed that adults with high and low working memory capacity differ in their temporal reproduction of durations that are longer than 2–3 s, but not of those that are shorter. More recently, neuroimaging studies have revealed the involvement of different anatomical substrates in the timing of durations shorter or longer than 1–2 s—namely, the cerebellum for the former (Ivry & Spencer, 2004) and the thalamo-cortico-striatal circuits for the latter, including the prefrontal cortex involved in high-level cognitive activities (for reviews, see Coull, Cheng, & Meck, 2011; Lewis & Miall, 2006).

Therefore, to examine the estimation of suprasecond durations, it is necessary to develop procedures that avoid the use of counting strategies, although some studies take no such methodological precautions. Those that do often choose a procedure in an arbitrary way without knowing whether it is the best procedure and what its effects will be on temporal performance. Although most of them simply tell the participants not to count, some are more skeptical about this procedure, because they do not entirely trust their participants. Put differently, even if participants are instructed not to count, researchers assume that they will count in their heads. Consequently, some researchers have preferred to employ a secondary task that interferes with the counting activity, such as the verbal repetition of random digits presented on a computer screen (e.g., Pouthas & Perbal, 2004; Rakitin et al., 1998; Rakitin, Stern, & Malapani, 2005; Wearden, Rogers, & Thomas, 1997a). However, it is well known that when participants keep track of the passage of time while simultaneously performing a concurrent task, the load in working memory increases, and temporal judgment is disrupted (for a review, see Fortin, 1999). Finally, other authors have preferred to adopt the repetitive speech method used by Baddeley (1997) in order to suppress vocal or subvocal counting, such as repeating “blablabla” as quickly as possible (e.g., Baudouin, Vanneste, Isingrini, & Pouthas, 2006; Delgado & Droit-Volet, 2007; Droit-Volet & Clément, 2005; Droit-Volet, Clément, & Fayol, 2008; Droit-Volet & Rattat, 2007). This method is supposed to encumber the phonological loop without consuming resources from working memory. However, according to Franssen, Vandierendonck, and Van Hiel (2006), articulatory suppression also affects timing performance by producing an underestimation of duration. 
In sum, it appears that the different methods employed to suppress counting are not equivalent and affect the processing of time differently.

The aim of the present study was to determine the best procedure for investigating time estimations in humans with a minimal disruption of the processing of time per se—that is to say, the procedure that provides temporal judgments that conform to the scalar properties of time (accuracy of temporal estimates and holding of Weber’s Law with a constant coefficient of variation for different duration ranges). We tested the three methods that are classically employed to suppress counting—no-counting instructions, articulatory suppression, and an interference task—on temporal performance in the most often used temporal tasks: generalization, bisection, and reproduction. Two duration ranges (1–4 and 2–8 s) were also used in order to verify the scalar properties of time. Moreover, we compared temporal performances obtained in these no-counting conditions with performances in a counting condition in which the participants were explicitly instructed to count at the rhythm with which they felt most comfortable.

Method

Participants

A total of 240 students from Clermont University, 18–46 years of age (165 women and 75 men; mean age: 24.9 years, SD = 7.69), participated voluntarily in the present experiment.

Materials

Each participant was seated at a table approximately 50 cm from the screen of a Power Macintosh computer in a quiet room. The computer controlled all the experimental events and recorded data via PsyScope software (Cohen, McWhinney, Flatt, & Provost, 1993). In the three temporal tasks, the stimuli to be timed consisted of 500-Hz tones played over the computer speakers. Participants responded by pressing the “S” or “L” key of the computer keyboard for the bisection and generalization tasks and the space bar for the reproduction task. For the interference task, random digits (range, 1–100) were displayed in the center of the computer screen for 150 ms, with an interdigit interval randomly chosen between 450 and 750 ms. Digit presentation lasted from the beginning of the period before the temporal stimulus was presented (i.e., during the intertrial interval that was randomly chosen between 1 and 3 s) to the offset of this stimulus (see Wearden et al., 1997a, for a similar procedure).
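
As an illustration of the digit stream described above, the following sketch generates one trial's schedule of digit onsets. It is not the PsyScope script used in the experiment. It assumes that the 450- to 750-ms interdigit interval runs from the offset of one digit to the onset of the next, which the text does not specify, and the function name and seed are hypothetical.

```python
import random

DIGIT_MS = 150  # display time of each digit, as stated in the text

def digit_schedule(total_ms, seed=None):
    """Return (onset_ms, digit) pairs covering a window of total_ms milliseconds."""
    rng = random.Random(seed)
    schedule = []
    onset = 0.0
    while onset + DIGIT_MS <= total_ms:
        schedule.append((onset, rng.randint(1, 100)))  # digits in the range 1-100
        # Assumption: the interval is measured from digit offset to next onset.
        onset += DIGIT_MS + rng.uniform(450, 750)
    return schedule

# Hypothetical trial: a 2-s intertrial interval followed by a 4-s tone.
trial = digit_schedule(total_ms=2000 + 4000, seed=1)
for onset_ms, digit in trial:
    print(round(onset_ms), digit)
```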

Procedure

Participants were randomly assigned to one of the three temporal tasks (bisection, generalization, or reproduction) and one of the four counting conditions (counting instructions, no-counting instructions, articulatory suppression, or interference task). There were, therefore, 20 participants in each condition. In the counting instructions condition, the participants were asked to count aloud during the presentation of the stimulus at the rhythm with which they felt most comfortable. In the no-counting instructions condition, they were explicitly told not to count, and the experimenter added that if they did count, the results would be distorted. In the articulatory suppression condition, the participants were instructed to generate repetitive speech as quickly as possible (“blablabla”), and the experimenter monitored the continuity of their verbal activity. Finally, in the interference task condition, the participants had to repeat aloud the successive digits presented at random in the center of the computer screen.
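
The between-participants design (3 tasks × 4 conditions, 20 participants per cell) can be sketched as a balanced random assignment. The text does not describe how randomization was implemented, so the shuffling scheme, function name, and seed below are assumptions made for illustration.

```python
import itertools
import random
from collections import Counter

TASKS = ["bisection", "generalization", "reproduction"]
CONDITIONS = ["counting", "no-counting", "articulatory suppression", "interference task"]

def assign(participant_ids, per_cell=20, seed=None):
    """Shuffle participants, then fill each task x condition cell with per_cell of them."""
    cells = list(itertools.product(TASKS, CONDITIONS))
    ids = list(participant_ids)
    if len(ids) != per_cell * len(cells):
        raise ValueError("number of participants must equal per_cell * number of cells")
    rng = random.Random(seed)
    rng.shuffle(ids)
    return {pid: cells[i // per_cell] for i, pid in enumerate(ids)}

assignment = assign(range(1, 241), seed=0)
cell_sizes = Counter(assignment.values())
print(len(cell_sizes), set(cell_sizes.values()))  # 12 {20}
```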

For each condition, the participants performed the task in two sessions, separated by a 5-min interval: one session with a 1- to 4-s duration range and the other with a 2- to 8-s range. The presentation order was counterbalanced across participants. The comparison durations were 1, 1.5, 2, 2.5, 3, 3.5, and 4 s for the 1- to 4-s condition and 2, 3, 4, 5, 6, 7, and 8 s for the 2- to 8-s condition. In the temporal bisection task, the short and long standard durations were 1 versus 4 s and 2 versus 8 s, while in the generalization task the standard durations were 2.5 and 5 s. The procedure for the two sessions was similar for each temporal task, with the exception of the durations that were tested.

In the temporal generalization task, the standard duration was played four times. Then, in a testing phase, participants were given seven blocks of nine trials without feedback: one trial for each comparison duration that was either shorter or longer than the standard duration and three for the comparison duration that was equal to the standard duration. This made a total of 126 trials for the two duration ranges. As with the other temporal tasks, the trials were presented in a random order within each block, and the intertrial interval was randomly chosen between 1 and 3 s; a cross was displayed in the center of the computer screen, indicating that the participant could begin the trial by pressing the space bar. Participants had to judge whether the comparison duration was similar (yes response) or not similar (no response) to the standard duration by pressing the corresponding key. The keypress order was counterbalanced across participants.
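
The trial structure of the generalization task can be checked with a short sketch that builds the seven blocks of nine trials for each duration range. The function name and seed are hypothetical; only the block composition and the durations come from the text.

```python
import random

def generalization_trials(comparisons, standard, n_blocks=7, seed=None):
    """Build n_blocks shuffled blocks: one trial per nonstandard duration, three for the standard."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_blocks):
        block = [d for d in comparisons if d != standard] + [standard] * 3
        rng.shuffle(block)  # random order within each block
        trials.extend(block)
    return trials

short = generalization_trials([1, 1.5, 2, 2.5, 3, 3.5, 4], standard=2.5, seed=0)
long_ = generalization_trials([2, 3, 4, 5, 6, 7, 8], standard=5, seed=0)
print(len(short), len(long_), len(short) + len(long_))  # 63 63 126
```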

In the temporal bisection task, the participants initially heard four successive presentations of the short and long standard durations. Then they were given eight blocks of 7 trials, that is, 1 trial per block for each of the seven comparison durations (a total of 112 trials without feedback across the two duration ranges). The participants’ task was to judge whether the comparison duration was more similar to the short or the long standard duration by pressing the corresponding key, the keypress order again being counterbalanced.

For the temporal reproduction task, each trial consisted of two phases: a duration presentation phase and a duration reproduction phase. The participants were first presented with a stimulus whose duration had to be reproduced (presentation phase). Immediately afterward, they triggered the start of a second stimulus by pressing the space bar once and pressed the space bar again when they judged that the elapsed duration was similar to the first one (reproduction phase). In each session, the participants performed 28 trials, 4 for each of the seven target durations, presented in a random order (a total of 56 trials across the two duration ranges). As a function of the counting condition, the participants had to count, to not count, to generate repetitive speech, or to repeat the displayed digits during both phases.

Data analysis

For the three temporal tasks (generalization, bisection, and reproduction), we assessed temporal performance using indexes of time accuracy and temporal variability, which differed from one task to another but were consistent with the literature (see the Results section for a description of these indexes). In order to test the effect of counting/no-counting conditions on the fundamental properties of time perception, separate repeated measures analyses of variance (ANOVAs) including condition as a between-participants factor and duration range as a within-participants factor were performed on these two indexes, with the criterion for statistical significance set at α = .05. We report partial eta-squared (ηp²) as a measure of effect size whenever a significant main or interaction effect was revealed by the ANOVAs. Note that preliminary analyses revealed neither a significant main effect nor any interaction effect involving the keypress order factor. This factor was therefore not included in the statistical analyses. A significant main or interaction effect including condition on the time accuracy measures would suggest that mean temporal accuracy differed with and without counting and/or between the no-counting conditions. A significant main or interaction effect including duration range on the variability measures would suggest a violation of the scalar property of variance. Indeed, as was reported in the introduction, according to the scalar property of variance, the magnitude of variability in temporal estimates should be proportional to duration values, so that variability relative to duration should not decrease as duration values increase, as it does with counting strategies.
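
Partial eta-squared is defined as SS_effect / (SS_effect + SS_error), so it can be recovered from an F ratio and its degrees of freedom as (F × df_effect) / (F × df_effect + df_error). The sketch below applies this identity; whether the reported effect sizes were computed in exactly this way is not stated in the text, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta-squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Example: an effect tested with F(3, 76) = 3.84.
print(round(partial_eta_squared(3.84, 3, 76), 2))  # 0.13
```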

Results

Temporal generalization task

Figure 1 shows the generalization gradients, with the proportion of yes responses (i.e., identification of a stimulus as having the standard duration) plotted against the comparison stimulus durations, as a function of the counting/no-counting condition, for the short (upper panel) and long (lower panel) duration ranges. To examine generalization performance, we calculated for each participant (1) the hit score (i.e., the proportion of yes responses made to the standard duration), which provided an index of temporal accuracy, and (2) an index of response dispersion, which corresponded to the proportion of total yes responses that were made to the standard duration and the two immediately adjacent comparison durations (Wearden, Wearden, & Rabbitt, 1997b). Theoretically, this last measure would approach 1.0 if all yes responses were clustered closely around the standard value, while it would be lower if generalization gradients were flatter.
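
The two generalization indexes defined above can be sketched as follows for a single participant. The response counts are invented for illustration and the function name is hypothetical.

```python
def generalization_indexes(yes_counts, trial_counts, standard_index):
    """Return (hit score, response dispersion index) for one participant.

    yes_counts[i]   -- number of yes responses to comparison duration i
    trial_counts[i] -- number of trials presented at comparison duration i
    """
    hit_score = yes_counts[standard_index] / trial_counts[standard_index]
    lo = max(standard_index - 1, 0)
    hi = min(standard_index + 1, len(yes_counts) - 1)
    central_yes = sum(yes_counts[lo:hi + 1])  # standard plus its two neighbours
    total_yes = sum(yes_counts)
    dispersion = central_yes / total_yes if total_yes else float("nan")
    return hit_score, dispersion

# Hypothetical participant, short range (standard = 2.5 s at index 3),
# seven blocks: 7 trials per nonstandard duration, 21 for the standard.
yes = [0, 1, 4, 15, 5, 2, 0]
n = [7, 7, 7, 21, 7, 7, 7]
hit, disp = generalization_indexes(yes, n, standard_index=3)
print(round(hit, 2), round(disp, 2))  # 0.71 0.89
```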

Fig. 1

Mean proportion of yes responses plotted against comparison durations in the short (upper panel) and long (lower panel) duration ranges for the counting, no-counting, articulatory suppression, and interference task conditions in the temporal generalization task

The ANOVA run on the hit score revealed a significant main effect of condition, F(3, 76) = 3.84, p < .05, ηp² = .13. By contrast, neither the main effect of duration range nor the interaction effect between duration range and condition was significant, both Fs < 1. A posteriori t-tests with Bonferroni adjustment indicated that the hit score was significantly higher in the counting condition than it was in the interference task condition (p = .01), suggesting a lower temporal accuracy in the latter. No other significant between-condition comparisons were found (all ps > .10). In fact, when the interference task was used to avoid counting strategies, the generalization gradients were not peaked at the standard duration value. As is revealed in Fig. 1, for the short-duration range, the comparison duration just longer than the standard duration (3 s) was judged to be similar to the standard duration just as often as was the comparison duration that was equal to the standard (2.5 s), t < 1, while the participants gave more yes responses for the 2.5-s comparison duration than they did for the 2-s one, t(19) = 4.86, p < .0001. By contrast, for the long-duration range, the 4-s comparison duration that was shorter than the standard 5-s one was judged to be similar to the standard duration just as often as was the 5-s comparison duration, t < 1, while the comparison duration that was longer than the standard one (i.e., 6 s) was clearly differentiated from the 5-s comparison duration, with a significantly lower proportion of yes responses, t(19) = 3.46, p < .01. As is discussed below, this indicates that the interference task produced a shortening effect for the long stimulus durations and a lengthening effect for the shorter ones. Overall, these results suggest that the temporal accuracy was relatively good in all experimental conditions in the generalization task, except when an interference task was used, especially for the comparison durations between 1 and 3 s.

As for the hit score, for the index of response dispersion, the main effect of condition reached statistical significance, F(3, 76) = 3.79, p < .05, ηp² = .13, whereas the main effect of duration range did not, F < 1. However, the interaction effect between these two factors was significant here, F(3, 76) = 4.48, p < .01, ηp² = .15. As revealed by the significant condition × duration range effect, although the proportion of yes responses to the three central comparison durations remained, on average, constant in all conditions for the short-duration range (i.e., 1–4 s), F < 1, it varied as a function of condition for the long-duration range (i.e., 2–8 s), F(3, 79) = 8.67, p < .0001. More precisely, the index of response dispersion was significantly lower in the counting condition (.45) than it was in the three no-counting ones (post hoc Scheffé tests: no-counting instructions [.56], p < .0001; articulatory suppression [.54], p < .05; interference task [.53], p < .001); no significant differences emerged between the three no-counting conditions (all ps > .10). These results suggest that each no-counting method used in the present study flattened the participants’ generalization gradients, as compared with the counting condition, thus revealing a decrease in their sensitivity to time in the generalization task. Furthermore, as was also revealed by the significant condition × duration range interaction, there was no significant main effect of duration range in the three no-counting conditions [no-counting instructions, t < 1; articulatory suppression, t < 1; interference task, t(19) = 1.84, p > .05], consistent with the scalar property of variance. In contrast, the index of response dispersion was significantly lower for the long- than for the short-duration range in the counting condition (.39 vs. .51), t(19) = 3.54, p < .01. 
This significant effect of duration range in the counting condition was a typical case of violation of Weber’s law, with the shape of the generalization gradient varying with the range of durations. Consistent with the results of studies on counting time (e.g., Clément & Droit-Volet, 2006; Wearden, 1991), the shape of the generalization gradient was relatively steeper for the long-duration range than for the short one (Fig. 1). This indicates that when the participants counted time, their temporal sensitivity improved when duration magnitude increased. By contrast with the explicit counting condition, when the participants were prevented from counting by simple instructions not to count or articulatory suppression or even the interference task, Weber’s law held, since the sensitivity to time did not vary with duration range.

An additional method for checking whether temporal performances exhibit the scalar property of variance is to test the superimposition of the generalization gradients for the short- and long-duration ranges when they are plotted on the same relative scale, a good superimposition indicating that the scalar property of variance holds (for the method, see, e.g., Droit-Volet & Wearden, 2001; Penney, Gibbon, & Meck, 2000). Figure 2 illustrates this superimposition test and clearly shows that the generalization gradient was steeper for the long durations than for the short durations in the counting condition, consistent with the violation of the scalar property described above. In contrast, the generalization gradients for the two duration ranges superimposed perfectly well in the no-counting condition. For the other methods of preventing counting, the superimposition between the generalization gradients was reasonably good for the articulatory suppression method and worse for the interference task, with a shifting of the gradient toward the right for the short durations. Overall, the present results showed that the three methods used efficiently prevented the participant from counting in a generalization task but the interference task disrupted temporal discrimination to a greater degree, especially for the short durations.
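
The rescaling behind the superimposition test can be sketched as follows: each comparison duration is divided by the standard of its range, so that both ranges share the same relative axis. The response proportions and the mean absolute gap used to summarize superimposition are hypothetical; the study itself assessed superimposition from the plotted gradients.

```python
def relative_scale(durations, standard):
    """Express comparison durations as a fraction of the standard duration."""
    return [round(d / standard, 3) for d in durations]

def mean_abs_gap(gradient_a, gradient_b):
    """Hypothetical summary index: mean absolute difference between two gradients."""
    if len(gradient_a) != len(gradient_b):
        raise ValueError("gradients must have the same number of points")
    return sum(abs(a - b) for a, b in zip(gradient_a, gradient_b)) / len(gradient_a)

short_rel = relative_scale([1, 1.5, 2, 2.5, 3, 3.5, 4], standard=2.5)
long_rel = relative_scale([2, 3, 4, 5, 6, 7, 8], standard=5)
print(short_rel == long_rel, short_rel)

# Invented proportions of yes responses for the two ranges.
p_short = [0.05, 0.20, 0.55, 0.80, 0.60, 0.25, 0.10]
p_long = [0.05, 0.15, 0.50, 0.85, 0.55, 0.20, 0.05]
gap = mean_abs_gap(p_short, p_long)
print(round(gap, 3))
```

A smaller gap would correspond to better superimposition and hence to performance closer to the scalar property.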

Fig. 2

Mean proportion of yes responses plotted against comparison durations expressed as a fraction of the standard duration in the two duration ranges (1–4 and 2–8 s) for the counting, no-counting, articulatory suppression, and interference task conditions in the temporal generalization task

Temporal bisection task

As with the generalization task, the bisection functions corresponding to the proportion of long responses plotted against comparison durations differed according to the counting method for both the short (upper panel) and the long (lower panel) duration range, although these differences were smaller than they were in the generalization task (Fig. 3). To examine bisection performances, we calculated the bisection point (BP) and the Weber ratio (WR) (Table 1), applying the regression method to the steepest part of the individual psychophysical function (for the method, see Church & Deluty, 1977; Wearden, 1991). Note that although there are various ways of calculating these two indexes, they generally yield nearly identical results (Wearden & Ferrara, 1995). BP (also called point of subjective equality) corresponded to the stimulus duration that gave rise to 50% long responses. This provided a sort of index of temporal accuracy in comparison with a control condition, such that a change in the location of the BP due to an experimental manipulation (in the present study, the counting/no-counting instructions) suggests a temporal distortion (i.e., under- or overestimation of durations) from one condition to another. The WR corresponded to the difference limen (half the difference between the stimulus giving rise to 75% long responses and that giving rise to 25% long responses) divided by BP. This provided an index of temporal sensitivity that reflected the slope of the bisection function, in that the smaller the WR, the greater the sensitivity to time. The ANOVA on BP showed neither a main effect of condition, F(3, 76) = 1.06, p > .10, nor a condition × duration range interaction, F < 1. By contrast, the main effect of duration range did reach statistical significance, F(1, 76) = 586.44, p < .0001, ηp² = .88, indicating that the BP was higher for the long-duration range than for the short one (4.57 vs. 2.39). 
In short, the BP did not change as a function of condition.
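
The BP and WR definitions given above can be sketched with a simple least-squares fit. The text does not say how the steepest part of the function was selected, so the sketch assumes it is the run of three consecutive points with the largest rise in the proportion of long responses. The data and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def bisection_indexes(durations, p_long, window=3):
    """Return (BP, WR) from a line fitted to the steepest part of the function."""
    # Assumption: steepest part = window of consecutive points with the largest rise.
    starts = range(len(durations) - window + 1)
    best = max(starts, key=lambda i: p_long[i + window - 1] - p_long[i])
    slope, intercept = fit_line(durations[best:best + window], p_long[best:best + window])
    if slope <= 0:
        raise ValueError("the fitted bisection function must be increasing")
    bp = (0.50 - intercept) / slope   # duration giving 50% long responses
    t25 = (0.25 - intercept) / slope
    t75 = (0.75 - intercept) / slope
    difference_limen = (t75 - t25) / 2
    return bp, difference_limen / bp

# Invented proportions of long responses for the 1- to 4-s range.
durations = [1, 1.5, 2, 2.5, 3, 3.5, 4]
p_long = [0.0, 0.05, 0.2, 0.5, 0.8, 0.95, 1.0]
bp, wr = bisection_indexes(durations, p_long)
print(round(bp, 2), round(wr, 3))  # 2.5 0.167
```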

Fig. 3

Mean proportion of long responses plotted against comparison durations in the short (upper panel) and long (lower panel) duration ranges for the counting, no-counting, articulatory suppression, and interference task conditions in the temporal bisection task

Table 1 Bisection points (BPs) and Weber ratios (WRs) in the two duration ranges for the four experimental conditions in the temporal bisection task

Contrary to the BP, there was a significant main effect of condition for WR, F(3, 76) = 9.10, p < .0001, ηp² = .26, as well as a significant condition × duration range interaction, F(3, 76) = 3.65, p < .05, ηp² = .13, but no main effect of duration range, F < 1. To examine the significant interaction further, we compared WRs in the two duration ranges for each condition taken separately. When the participants were explicitly instructed to count, WR was significantly lower for the long-duration range than for the short one (.19 vs. .22), t(19) = −2.27, p < .05. As with the generalization task, this demonstrated that counting time produced a violation of the scalar property of variance. By contrast, the scalar property of variance held with all the methods used to prevent counting. The WR values for the short- and long-duration ranges were similar: no-counting instructions, t < 1; articulatory suppression, t < 1; and interference task, t(19) = 1.79, p > .05. This was confirmed by the superimposition of the psychometric functions for the two duration ranges on a relative scale, which was better in the three no-counting conditions than in the counting one (Fig. 4). However, when we compared the three no-counting conditions (Fig. 4), the superimposition of psychometric functions for the two duration ranges appeared better for the no-counting condition than for the other methods of preventing counting. This was linked to the WR, which was higher with the interference task than with both the instructions not to count and articulatory suppression, at least for the long-duration range (post hoc Scheffé tests: both ps < .05), the differences not reaching significance for the short-duration range (Scheffé tests: all ps > .05). 
In sum, consistent with the data from the generalization task, the present results showed that the three no-counting methods successfully prevented the participants from counting in a bisection task, as indicated by the application of Weber’s law in these three conditions. However, temporal sensitivity was lower with the interference task than with the other two no-counting methods.

Fig. 4

Mean proportion of long responses plotted against comparison durations expressed as a fraction of the bisection point in the two duration ranges (1–4 and 2–8 s) for the counting, no-counting, articulatory suppression, and interference task conditions in the temporal bisection task

Temporal reproduction task

In order to investigate temporal reproduction performances, we calculated a relative temporal reproduction score (difference between temporal reproduction and stimulus duration divided by stimulus duration) that indicated the extent to which the stimulus duration was estimated accurately, overestimated (>0), or underestimated (<0) and the coefficient of variation in the reproduced durations (SD/M) (see Table 2; for an example, see Droit-Volet, 2010). As for the two previous tasks, for each score, an ANOVA was carried out with condition as a between-participants factor and duration range as a within-participants factor, but an additional within-participants factor was added, namely, stimulus duration. For the relative temporal reproduction score (Fig. 5), while the ANOVA did not reveal a significant three-way interaction, F < 1, it did reveal a significant two-way interaction between condition and stimulus duration, F(8.24, 208.77) = 13.91, p < .0001, ηp² = .35, and more interestingly, between condition and duration range, F(3, 76) = 6.94, p < .0001, ηp² = .21. The three main effects were also significant [condition, F(3, 76) = 14.24, ηp² = .36; stimulus duration, F(2.74, 208.77) = 35.06, ηp² = .32; duration range, F(1, 76) = 15.32, ηp² = .17; all ps < .0001]. As is shown in Fig. 5, the participants underreproduced durations when they were instructed to count, as well as when they were instructed not to count or to perform articulatory suppression, in both the short- and long-duration ranges [one-sample t-test: counting instructions, t(19) = −8.14, and t(19) = −8.54; no-counting instructions, t(19) = −4.97, and t(19) = −10.43; articulatory suppression, t(19) = −4.31, and t(19) = −6.83, respectively; all ps < .0001]. However, temporal accuracy was better in the counting condition than in the no-counting ones, at least for the longer durations. 
Indeed, the participants underestimated durations by 9% with counting and by 20% and 18% with the instructions not to count and articulatory suppression, respectively (a posteriori comparisons with Bonferroni adjustment: both ps < .05). No significant difference was found between these two no-counting conditions for either the short- or the long-duration range (both ps > .10). In addition, as suggested by the significant condition × stimulus duration interaction, with the instructions to count, temporal underestimation was greater for the shortest stimulus durations (i.e., the mean temporal reproduction scores for stimuli 1, 2, and 3) than for the longest ones (i.e., the mean temporal reproduction scores for stimuli 5, 6, and 7) (13% vs. 0.8%), t(19) = 3.76, p < .001, whereas the reverse was observed in the no-counting (10% vs. 22%), t(19) = 5.48, and the articulatory suppression (11% vs. 23%), t(19) = 6.51, conditions (all ps < .0001).
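
The two reproduction indexes defined at the start of this section can be sketched for one participant and one target duration as follows. The reproduced durations are invented, the function name is hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption, since the text does not specify it.

```python
import statistics

def reproduction_indexes(reproductions_s, target_s):
    """Return (relative reproduction score, coefficient of variation)."""
    mean_repro = statistics.mean(reproductions_s)
    # Negative values indicate underestimation, positive values overestimation.
    relative_score = (mean_repro - target_s) / target_s
    # Assumption: sample standard deviation (n - 1 denominator).
    cv = statistics.stdev(reproductions_s) / mean_repro
    return relative_score, cv

# Four hypothetical reproductions of a 2-s target.
score, cv = reproduction_indexes([1.8, 2.0, 1.6, 1.8], target_s=2.0)
print(round(score, 2), round(cv, 3))  # -0.1 0.091
```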

Table 2 Coefficient of variation in the two duration ranges for the five experimental conditions, as a function of stimulus duration, in the temporal reproduction task
Fig. 5

Relative temporal reproductions for each stimulus duration in the short (upper panel) and long (lower panel) duration ranges for the counting, no-counting, articulatory suppression, and interference task conditions in the temporal reproduction task

Moreover, as is illustrated in Fig. 5, when participants performed the interference task, we found a specific temporal pattern that clearly differed from the other conditions, especially for the shortest stimulus durations. One-sample t-tests showed that the interference task produced a temporal overestimation of 16% for the short-duration range, t(19) = 2.31, p < .05, but not for the long one, t < 1. However, as revealed by the post hoc analyses, for the short-duration range, only the shortest stimulus durations (<2.5 s) were overestimated, by 21%, t(19) = 3.20, p < .01, and not the longest ones (>2.5 s), t < 1. Accordingly, this difference between the short- and long-duration ranges was due to temporal overestimation, which decreased as duration value increased, with a specific temporal overestimation for stimulus durations shorter than 2.5 s. Put differently, all these results suggest that the interference task produced a lengthening effect for the short durations, while the two other no-counting conditions produced a shortening effect for both short and long durations.

Since each trial in the reproduction task consisted of two phases (a duration presentation phase and a duration reproduction phase), we decided to verify whether the lengthening effect observed when the interference task was performed during both phases would also emerge when it was performed only during the duration presentation phase. To that end, we asked 20 new participants (13 women and 7 men; mean age: 27.3 years, SD = 7.4) to repeat the digits displayed on the computer screen solely during the playing of the first stimulus, that is, during the presentation phase. An ANOVA was run on the relative temporal reproduction score, with location of the interference task (presentation + reproduction vs. presentation) and duration range as the between-participants factors and stimulus duration as the within-participants factor. This ANOVA revealed a significant main effect of the location of the interference task, F(1, 38) = 29.90, p < .0001, ηp² = .44, indicating that whereas the participants overestimated durations by 7% when the interference task was performed during both the presentation and the reproduction phases, they underestimated durations by 24% when the interference task was performed only during the presentation phase (Fig. 6). There was also a significant main effect of stimulus duration, F(6, 87.18) = 49.83, p < .0001, ηp² = .57, and of duration range, F(1, 38) = 19.86, p < .0001, ηp² = .34, suggesting that temporal accuracy was better for the shortest stimulus durations (i.e., the mean temporal reproduction scores for stimuli 1, 2, and 3) than for the longest ones (i.e., the mean temporal reproduction scores for stimuli 5, 6, and 7), as well as for the short- than for the long-duration range. No significant interaction effects were found.
In sum, when the interference task was performed only during the presentation phase, there was an underestimation of time, as obtained with the other methods of preventing counting (no-counting instructions and articulatory suppression), but with a greater shortening effect. Nevertheless, whatever the direction of the time distortion (over- or underestimation), the magnitude of this distortion appeared to be smaller when the interference task occurred in the two task phases (presentation + reproduction) than when it occurred only in the presentation phase.

Fig. 6 Relative temporal reproductions for each stimulus duration in the short (upper panel) and long (lower panel) duration ranges for the interference task presentation + reproduction and interference task presentation conditions in the temporal reproduction task

The ANOVA (see note 2) conducted on SD/M (Table 2) revealed significant main effects of stimulus duration, F(5.16, 392.40) = 12.24, p < .0001, ηp² = .14, condition, F(3, 76) = 26.06, p < .0001, ηp² = .51, and duration range, F(1, 76) = 6.23, p < .05, ηp² = .08, as well as a significant interaction between condition and duration range, F(3, 76) = 7.78, p < .0001, ηp² = .23. There was no other significant interaction effect. The significant main effect of condition indicated that temporal reproduction was less variable in the counting condition (.10) than in the no-counting instructions (.15), articulatory suppression (.16), and interference task (.19) conditions (post hoc Scheffé tests: all ps < .05). Conversely, among the methods used to avoid counting, the interference task produced significantly more variability in temporal reproduction than did the instructions not to count (p < .05). A posteriori comparisons with Bonferroni adjustment did not reveal any other significant difference between two no-counting conditions (all ps > .05). However, as revealed by the significant condition × duration range interaction, although the main effect of condition remained significant for both the short-duration, F(3, 76) = 20.30, p < .0001, and the long-duration, F(3, 76) = 21.74, p < .0001, ranges, the magnitude of differences in the variability of temporal reproductions between the counting condition and the no-counting conditions was greater for the long-duration range than for the short one. For the counting instructions, the coefficient of variation of reproduced durations was lower for the long-duration range than for the short one, t(19) = 4.56, p < .0001, whereas there was no such change for the instructions not to count and articulatory suppression, t < 1, and t(19) = 1.73, p > .10, respectively.
This was because temporal performance conformed to the scalar property of variance when counting was prevented by the instructions not to count or by repetitive speech (articulatory suppression).
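
The contrast between a constant and a decreasing coefficient of variation can be illustrated with a minimal sketch. The reproduction values below are hypothetical and chosen only for illustration; they are not data from the study, and the function name is an assumption.

```python
from statistics import mean, stdev

def coefficient_of_variation(reproductions):
    """Return SD/M for a set of reproductions of one target duration."""
    return stdev(reproductions) / mean(reproductions)

# Hypothetical reproductions in seconds, keyed by target duration.
# Scalar-like pattern: the spread grows in proportion to the duration.
scalar_like = {
    2.0: [1.6, 2.0, 2.4, 1.8, 2.2],
    8.0: [6.4, 8.0, 9.6, 7.2, 8.8],
}
# Counting-like pattern: the spread stays about the same at both durations.
counting_like = {
    2.0: [1.8, 2.0, 2.2, 1.9, 2.1],
    8.0: [7.8, 8.0, 8.2, 7.9, 8.1],
}

cv_scalar = {t: coefficient_of_variation(r) for t, r in scalar_like.items()}
cv_counting = {t: coefficient_of_variation(r) for t, r in counting_like.items()}

for label, cvs in (("scalar-like", cv_scalar), ("counting-like", cv_counting)):
    for target, cv in sorted(cvs.items()):
        print(f"{label:14s} target={target:4.1f} s  SD/M={cv:.3f}")
```

In the scalar-like set, SD/M is the same at 2 s and 8 s; in the counting-like set, it is lower at 8 s, which is the pattern reported above for the counting condition.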

Contrary to these two methods of preventing counting, when the participants performed the interference task to avoid counting, they reproduced significantly more variable durations for the short-duration range than for the long one (.20 vs. .17), t(19) = 3.39, p < .01. The conformity to the scalar property of variance that we observed with the interference task in the temporal generalization and bisection tasks was thus not replicated in the temporal reproduction task. However, this specific violation of the scalar property of variance was not replicated when the interference task was performed only during the duration presentation phase. Indeed, the additional ANOVA conducted on SD/M with location of the interference task (presentation + reproduction vs. presentation) and duration range as the between-participants factors and stimulus duration as the within-participants factor revealed significant main effects of stimulus duration, F(6, 228) = 10.94, p < .0001, ηp² = .22, and of duration range, F(1, 38) = 5.05, p < .05, ηp² = .12, as well as a significant duration range × location of the interference task interaction, F(1, 38) = 6.17, p < .05, ηp² = .14. There was no other significant main or interaction effect. The significant duration range × location of the interference task interaction revealed that when the interference task was performed during both the presentation and reproduction phases, the coefficient of variation of reproduced durations was lower for the long-duration range than for the short one, t(19) = 3.39, p < .01 (as previously shown), whereas there was no such change when the interference task was performed only during the presentation phase. In sum, these results suggested that temporal reproduction performance conformed to the scalar property of variance when the interference task was performed solely during the presentation phase, but not when it was also performed during the reproduction phase.

Discussion

The increase in SD with the increase in duration magnitude is the fundamental scalar property of the perception of time that has been found in animals, as well as in human adults. What is specific to human adults, however, is that they are aware of this variability and, therefore, use a counting strategy to ensure the accuracy and the precision of their temporal estimates, whatever the duration being estimated. The results of the present study using three different temporal tasks showed that when the participants counted time, their temporal judgments were indeed systematically more accurate, the generalization gradients and the bisection functions being steeper and the reproduced durations more accurate with than without counting. Furthermore, when they counted, SD/M did not remain constant but, rather, decreased with the increase of duration values, such that their sensitivity to time increased with longer durations, instead of remaining constant. Therefore, our results confirmed previous findings that the use of a verbal counting strategy to estimate time produces temporal behavior that does not conform to the scalar property of variance (e.g., Grondin et al., 1999; Killeen, 1992; see also Wearden & Lejeune, 2008, for a review).
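
Why counting lowers SD/M at longer durations can be sketched with a deliberately simplified model of the variance-sum argument given in the Introduction. The sketch assumes that counts are statistically independent, that each count lasts about 1 s, and that each count is itself timed with a fixed Weber fraction; the value .15 and the function names are hypothetical and not taken from the study.

```python
import math

def cv_without_counting(duration_s, weber_fraction=0.15):
    """Scalar timing: SD is proportional to duration, so SD/M is constant."""
    sd = weber_fraction * duration_s
    return sd / duration_s

def cv_with_counting(duration_s, count_period_s=1.0, weber_fraction=0.15):
    """Counting: the interval is timed as a sum of short, independent counts.

    Variances of independent counts add, so the total SD grows only with
    the square root of the number of counts.
    """
    n_counts = duration_s / count_period_s
    sd_per_count = weber_fraction * count_period_s
    sd_total = sd_per_count * math.sqrt(n_counts)
    return sd_total / duration_s

for duration in (1.0, 4.0, 16.0):
    print(
        f"{duration:5.1f} s  no counting: {cv_without_counting(duration):.3f}"
        f"  counting: {cv_with_counting(duration):.3f}"
    )
```

Under these assumptions, SD/M with counting falls in proportion to the inverse square root of the duration, whereas it stays flat without counting. Real counting data need not follow this idealized form exactly.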

However, the main challenge of our study was to determine the method of preventing counting in different temporal tasks that allows temporal estimations to be examined with minimal disruption of time processing per se. In the present study, we tested three methods of avoiding counting that have classically been used in research on time perception—namely, no-counting instructions, articulatory suppression, and an interference task. First of all, our results demonstrated that each method efficiently prevents participants from counting time, in that contrary to the counting condition, temporal performances in these three no-counting conditions all conformed to the scalar property of variance, whichever temporal task was administered: temporal generalization, bisection, or reproduction. Only in one specific condition—that is, when the participants performed the interference task during both the presentation of the target stimulus and its reproduction—did temporal performance fail to conform to this scalar property of time, the interference task increasing SD for the short-duration range (1–4 s). As is discussed later, this suggests that using an interference task in the reproduction task during both the presentation and the reproduction phase is not a good strategy for investigating time perception without counting.

Despite their efficiency in preventing the participants from counting, the three no-counting methods are not equivalent in their effects on temporal judgments, in particular on the two fundamental properties of time perception. Our results revealed that the interference task disrupted temporal behavior to a greater extent than the no-counting instructions and articulatory suppression did. More specifically, repeating digits aloud while processing time introduced a greater amount of noise into time estimation. In the generalization task, this resulted in flatter generalization gradients, and in the bisection task, in flatter psychometric functions. In the reproduction task, the interference task also tended to produce more variable reproductions than did the other no-counting conditions, but mainly when the interference task was performed during both the presentation and the reproduction phases. This indicates that the interference task decreased sensitivity to time, as compared with the other methods of avoiding counting. We can therefore conclude that the concurrent interference task does more than just prevent participants from counting time, since it also affects the processing of time per se. As in most other studies of time perception (e.g., Rakitin et al., 2005; Rakitin et al., 1998; Wearden et al., 1997a, 1997b), the interference task used in the present study consisted of repeating digits aloud that were displayed at random intervals during the presentation of the stimuli that had to be timed. According to the attentional versions of internal clock models (e.g., Lejeune, 1998; Zakay & Block, 1996), an attentional switch controls the passage of pulses emitted by the internal clock into an accumulator by closing and opening at the beginning and the end of the stimulus being timed, respectively.
However, during stimulus presentation, the switch may run in a “flickering” mode, closing briefly at the beginning of each event (Lejeune, 1998; Penney et al., 2000). We therefore suggest that the random presentation of digits produced a series of random attentional interferences, which introduced noise into the temporal processing. Interference is known to produce more errors and greater variability in time judgments (e.g., Brown, 1997, 2006; Witherspoon & Allan, 1985). For instance, Gautier and Droit-Volet (2002) showed that the random presentation of attentional distractors in a bisection task increased the variability in time discrimination.

The repetition of digits used to prevent counting is thus a particularly attention-demanding second task that imposes an additional load on working memory, thereby disrupting the processing of time per se. This would also explain why the interference task had a relatively greater effect on the coefficient of variation in the temporal reproduction task than in the two temporal discrimination tasks, especially when it was performed during both the presentation of the stimulus duration and its reproduction. Baudouin et al. (2006) compared different temporal tasks (temporal production and reproduction) and demonstrated that temporal reproduction is the most attention-demanding task. The participants must indeed encode the duration and keep it in working memory while it is reproduced. Consistent with this idea, while the scalar property of variance held when the interference task was combined with the generalization and bisection tasks, it no longer held when this method was combined with the temporal reproduction task during the two phases of this task. This violation of the scalar property of variance in the reproduction task was due mainly to the interference task generating more variable reproductions of short stimulus durations than of long ones. In all probability, repeating digits presented randomly was a greater source of interference in the reproduction task for short durations because the participants had to press the key twice in quick succession: once to initiate the stimulus and once when they judged that its duration was equal to the duration stored in working memory.

In addition, our results showed that the interference task not only increased variability in time judgments, but also distorted time perception (i.e., affected mean temporal accuracy) in both the temporal reproduction and the generalization tasks. By contrast, in the temporal bisection task, we did not observe any time distortion related to the interference task. This may be due to the fact that the bisection task is less demanding in terms of attentional and working memory resources. Several studies have suggested that, in bisection, participants simply classify the duration as short or long (e.g., Allan, 2002; Droit-Volet & Rattat, 2007; Rodriguez-Girones & Kacelnik, 2001), rather than storing the standard durations in reference memory and basing their temporal judgment on the direct comparison of these standards with the probe stimulus duration. Furthermore, whereas the variability of duration judgments significantly increased with general cognitive ability as assessed by IQ test scores in the temporal generalization task, it remained stable in the bisection task (Wearden et al., 1997b). In the present study, our results in the reproduction task showed that the effect of the interference task on temporal judgments varied according to the task phases during which it was performed. When the interference task was performed during both the presentation and reproduction phases, it led the participants to overreproduce the stimulus durations, especially those shorter than 2.5 s. The interference task brought about a similar temporal overestimation in the generalization task for the short-duration range, with a rightward shift of the generalization gradient. However, for the long-duration range, we observed a temporal underestimation, rather than an overestimation, with a leftward shift of the generalization gradient.
This leftward shift indicated that the standard duration was judged relatively shorter with the interference task than with the other no-counting methods. However, in the temporal reproduction task, when the same interference task was performed solely during the presentation of the target duration, it led the participants to underreproduce, and not overreproduce, the stimulus durations. In similar conditions, Pouthas and Perbal (2004) had already found an underreproduction of target durations. This shortening effect observed in both the temporal generalization and reproduction tasks is consistent with the results of studies using a time-sharing paradigm, which have shown that when attentional resources are distracted away from the processing of time, durations are judged shorter (e.g., Brown & Merchant, 2007; Champagne & Fortin, 2008; Rattat, 2010; see also Brown, 2010, and Coull, 2004, for reviews). As was suggested previously, the poorer temporal accuracy and the greater variability in temporal performance with the interference task than with the other no-counting methods are presumably due to the fact that this task consumed attentional resources to the detriment of time processing.

However, as was previously mentioned, for the shortest stimulus durations in the temporal generalization and temporal reproduction tasks, our results showed no such underestimation but, instead, an overestimation of durations. This temporal overestimation was not found in previous studies using similar interference tasks (e.g., Pouthas & Perbal, 2004; Wearden et al., 1997a). However, these studies used longer target durations than those employed in the present study. Although this overestimation is not easy to explain, it is possible that requiring the participants to repeat digits aloud increased response times for the shortest durations, especially in the temporal reproduction task. According to the models of temporal reproduction, the duration reproduced is \( (1 - b)t + d \), where t is the target duration, b is a sort of threshold factor (the percentage through the target duration at which the response is initiated), and d is the time taken to generate the response (Droit-Volet, 2010; Wearden, 2003). Recently, Droit-Volet (2010) highlighted the greater importance of the motor components for the accuracy of temporal reproduction for short stimulus durations (<3 s) than for longer ones (>3 s). In our study, an obvious difference between the two phases of the temporal reproduction task concerned the motor response, which was required during the reproduction phase, but not during the presentation one. It is therefore likely that the digit repetition could have partially delayed the response production processes, especially for short durations, for which the participants must relatively quickly stop the stimulus with a keypress when they judge that its duration is equal to the duration that has just been presented. Another possibility is that the digit repetition may have partially delayed the opening of the attentional switch at the end of the stimulus duration.
The digits presented may indeed have diverted attention from generating the response and, thereby, increased the participants’ temporal reproductions.
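
The reproduction model cited above, in which the reproduced duration is (1 - b)t + d, shows why a fixed delay in response generation would matter most for short durations. In the sketch below, the values of b and d are hypothetical, as is the assumption that digit repetition adds a constant 0.3 s to d; none of these figures come from the study.

```python
def reproduced_duration(t, b, d):
    """Reproduced duration under the model (1 - b) * t + d."""
    return (1 - b) * t + d

def relative_error(t, b, d):
    """Signed error as a proportion of the target (positive = overreproduction)."""
    return (reproduced_duration(t, b, d) - t) / t

B = 0.10            # hypothetical threshold factor
D_BASELINE = 0.30   # hypothetical response-generation time, in seconds
D_DELAYED = 0.60    # same, with a hypothetical 0.3-s delay from digit repetition

def shift_from_delay(t):
    """Change in relative error caused by the extra response delay."""
    return relative_error(t, B, D_DELAYED) - relative_error(t, B, D_BASELINE)

for target in (1.5, 3.0, 12.0):
    print(f"target={target:4.1f} s  shift in relative error={shift_from_delay(target):+.3f}")
```

Because the relative error equals -b + d/t, a constant increase in d raises the relative reproduction by (increase in d)/t, which is large for short targets and negligible for long ones.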

Whatever the case, our data clearly show that using an interference method to avoid counting is not the best procedure for investigating time perception without counting, although the difference was less marked in the case of the bisection task. The interference task induces concurrent processing that interferes with the processing of time per se, sometimes even violating the scalar properties of time. In the case of people with limited attentional resources, such as older people or children, this may decrease their temporal performance because of their inability to divide their attention between two tasks, rather than because of any difficulty in estimating time. All the methods have disadvantages, however; the best methods are instructions not to count or articulatory suppression, even if our results revealed that the latter produces more noise in the perception of time than does the former. For several decades, a number of researchers investigating time perception in human adults have simply instructed their participants not to count, despite criticism from their peers. The present study has at long last vindicated them, by providing evidence that their method is both simple and efficient. Finally, the method that can now be recommended for the investigation of time perception in human adults is to instruct participants not to count and to add that the results would be distorted if they counted, in order to encourage them to respect the instructions. In addition, to verify the effects of these instructions, a series of different duration values must be used to systematically test whether the scalar property of variance still holds. This verification could also be carried out individual by individual, in order to exclude from the final sample the few participants who have not respected the instructions.
In addition, as has already been suggested by Droit-Volet (2010), temporal discrimination tasks should be preferred over the particularly complex temporal reproduction task. However, it is now important to further investigate the role of explicit instructions in our experience of time.
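
The individual-by-individual check suggested above could be sketched as follows. This is only one possible screening heuristic, not a procedure used in the present study: it fits a least-squares slope of SD/M against target duration for each participant and flags a clearly negative slope. The threshold, the function names, and the data are all hypothetical, and a real analysis would need a proper statistical criterion.

```python
from statistics import mean, stdev

def cv_slope(reproductions_by_duration):
    """Least-squares slope of SD/M against target duration for one participant."""
    durations = sorted(reproductions_by_duration)
    cvs = [
        stdev(reproductions_by_duration[t]) / mean(reproductions_by_duration[t])
        for t in durations
    ]
    t_bar, cv_bar = mean(durations), mean(cvs)
    numerator = sum((t - t_bar) * (cv - cv_bar) for t, cv in zip(durations, cvs))
    denominator = sum((t - t_bar) ** 2 for t in durations)
    return numerator / denominator

def likely_counted(reproductions_by_duration, slope_threshold=-0.005):
    """Flag a participant whose SD/M clearly decreases as duration increases."""
    return cv_slope(reproductions_by_duration) < slope_threshold

# Hypothetical participants: reproductions (s) keyed by target duration (s).
participant_a = {  # spread proportional to duration: SD/M roughly constant
    2.0: [1.6, 2.0, 2.4, 1.8, 2.2],
    4.0: [3.2, 4.0, 4.8, 3.6, 4.4],
    8.0: [6.4, 8.0, 9.6, 7.2, 8.8],
}
participant_b = {  # constant spread: SD/M falls with duration
    2.0: [1.8, 2.0, 2.2, 1.9, 2.1],
    4.0: [3.8, 4.0, 4.2, 3.9, 4.1],
    8.0: [7.8, 8.0, 8.2, 7.9, 8.1],
}

for name, data in (("A", participant_a), ("B", participant_b)):
    print(f"participant {name}: slope={cv_slope(data):+.4f}  flagged={likely_counted(data)}")
```

With these made-up data, only participant B is flagged. The approach depends on having several target durations per participant, in line with the recommendation to use a series of different duration values.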