Sustained attention is conceptualized as the “…ability to self-sustain mindful, conscious processing of non-arousing stimuli whose repetitive non-arousing properties would otherwise lead to habituation and distraction” (Robertson, Manly, Andrade, Baddeley, & Yiend, 1997, p. 747). It is a critical ability for everyday functioning and is argued to underpin more complex processes, such as inhibitory control (Reck & Hund, 2011), mathematical ability (Anobile, Stievano, & Burr, 2013) and the self-regulation of behavior (Martin, Razza, & Brooks-Gunn, 2012). A more complete understanding of sustained attention development requires clarification of two issues. First, while overall response time (RT) variability (RTV) is often used to measure sustained attention, analytic techniques that decompose RT data (e.g., into transient fluctuations and slow gradual changes) might provide more insight into sustained attention development (Castellanos et al., 2005; Johnson, Kelly, et al., 2007; Stuss, Murphy, Binns, & Alexander, 2003). Second, little longitudinal research has been conducted on sustained attention development in children, so little is known about how individual children change with age.

One difficulty in characterizing sustained attention development is that different developmental effects are found depending on the task and measures used. Some cross-sectional research using child-friendly tasks suggests that sustained attention ability reaches asymptote at around 6 or 7 years, as measured by mean RT (MRT), standard deviation of RT (SDRT), and omission errors on the Sustained Attention subtest of the German Test Battery of Attention Performance (KITAP; Sobeh & Spijkers, 2012); and as measured by the number of correctly selected targets in the Visual Attention subtest of the Developmental Neuropsychological Assessment battery (NEPSY; Visu-Petra, Benga, & Miclea, 2007). Other cross-sectional research suggests sustained attention continues to develop until 8 years, as measured by omission errors in the divided attention, visual scanning, and Go/No-Go tasks in the KITAP (Trautmann & Zepf, 2012). Others report improvement until 10 years in the auditory and visual attention subtests of the NEPSY; and in the number of errors and RTV on the Score! subtest of the Test of Everyday Attention in Children (TEA-CH; Klenberg, Korkman, & Lahti-Nuuttila, 2001; Betts, Mckay, Maruff, & Anderson, 2006). Child-friendly tasks are designed to be engaging and arousing, and so may provide external arousal and attentional support rather than assess the ability to self-sustain processing of non-arousing stimuli. Such support may produce ceiling effects, because the ability to self-sustain processing of non-arousing stimuli is not adequately assessed.

Other cross-sectional research reports improvement in sustained attention into early adolescence, in terms of SDRT, coefficient of variation (CV; SDRT divided by MRT) and omission errors on the Continuous Performance Task (CPT) and standard Go/No-Go tasks. These studies found that with increasing age there is a decrease in omission errors and SDRT in 6- to 12-year-olds (Brocki, Tillman, & Bohlin, 2010; although cf. Somerville, Hare, & Casey, 2011), a decrease in CV in 7- to 13-year-olds (Klarborg et al., 2013), and a decrease in percentage of omissions in 8- to 13-year-olds (Rebok et al., 1997). A lifespan study of 10- to 70-year-olds reported a decrease in CV until around 16 years with limited change between 17 and 40 years (Fortenbaugh et al., 2015). Meanwhile, two other lifespan studies reported an ongoing decrease in omission errors and CV from 12 years (McAvinue et al., 2012) and 14 years (Carriere, Cheyne, Solman, & Smilek, 2010) into adulthood. These two studies did not include children younger than 12 years, but provide evidence of ongoing maturation of sustained attention processes through late childhood and adolescence. Further evidence for ongoing improvement in sustained attention through childhood comes from a small number of studies using an ex-Gaussian measure of very slow responses, tau, as a measure of attention lapses (Leth-Steensen, King Elbaz, & Douglas, 2000). There was a decrease in tau from 6 to 13 years of age on a CPT (Tarantino et al., 2013) and between 6 and 19 years on a Go/No-Go task (van Belle, van Hulst, & Durston, 2015), although a study using a stimulus-response compatibility task reported no difference in tau between 6- to 15-year-olds and 17- to 22-year-olds (McAuley, Yap, Christ, & White, 2006). Yet more evidence of sustained attention development through childhood comes from studies that measured target sensitivity using the signal detection theory measure d’. 
Some researchers regard d’ as sensitive to arousal in the context of certain tasks (Mackworth, 1968; Sergeant, Oosterlaan, & van der Meere, 1999; van der Meere & Sergeant, 1988). For example, Corkum and Siegel (1993) reviewed CPT studies and highlighted that declines in target sensitivity occur with increased attentional demand, and that certain stimulant drugs affect both arousal levels and perceptual sensitivity. Studies suggest an increase in target sensitivity from 6 to 15 years (Lin, Hsiao, & Chen, 1999), 7 to 13 years (Klarborg et al., 2013), and 10 to 16 years (Fortenbaugh et al., 2015). Overall, findings suggest that sustained attention continues to mature through childhood and into early adulthood.

Sustained attention tasks, such as vigilance monitoring tasks and some variants of the CPT (e.g., Brocki et al., 2010; Rebok et al., 1997), often involve responding to an infrequent target. These rare “Go” stimuli may be exogenously arousing and interfere with sustained attention (Johnson, Kelly, et al., 2007). In the fixed version of the Sustained Attention to Response Task (SART), the routine response is to press a button to stimuli and withhold a response to infrequent No-Go stimuli (Manly et al., 2003; Robertson et al., 1997). In this task, commission errors, which are responses to the No-Go target, are a measure of both sustained attention and inhibitory control. The sequential presentation of the digits 1 through 9 may enable participants to predict when a target will next appear; this is thought to reduce the arousing properties of the No-Go target. The repetitive and predictable nature of the Fixed SART likely induces prepotent automatic responding. It has been argued that simpler tasks are more taxing on sustained attention processes (Langner & Eickhoff, 2013), so it is likely that the Fixed SART may increase the burden on sustained attention processes relative to a random presentation of stimuli. McAvinue and colleagues (2012) found that a sample of adults had lower CV and fewer commission and omission errors on the Fixed SART compared with a sample of 12- to 19-year-olds. To date, however, no research has examined developmental changes in Fixed SART performance in typically developing children younger than 12 years. Such an investigation would help clarify developmental trajectories of sustained attention on a non-arousing task. This clarification is important since most researchers have used tasks that may be exogenously arousing. It is therefore currently unknown how a repetitive, fixed sequence of stimuli may affect performance in children.

Response time variability (RTV) is a commonly used measure of sustained attention, with increasing variability thought to reflect poorer ability to sustain attention (Barkley, 1997; Castellanos et al., 2005; Johnson et al., 2008; MacDonald, Nyberg, & Bäckman, 2006; Stuss et al., 2003). For example, children with lower RTV on several neuropsychological tasks were better able to stay on-task during a separate mathematics task (Antonini, Narad, Landberg, & Epstein, 2013), and children with higher RTV across executive control tasks, such as an N-back task with a Go/No-Go component, had higher parent ratings of attention difficulties (Gomez-Guerrero et al., 2011). Most researchers have averaged response variability across a given task, using measures such as SDRT and CV. These measures do not provide details on changes in performance at certain timescales or over the course of the task, and therefore little is known about the developmental trajectories of these changes in RT. Several approaches to measuring RTV exist that offer richer indices of RTV than standard deviation measures (Barkley, 1997; Castellanos et al., 2005; Johnson, Kelly et al., 2007; Stuss et al., 2003). One approach used in this study involves computing the Fast Fourier Transform (FFT) spectrum of the RT data, which provides measures of the relative power of periodic changes in responses across different temporal frequencies (Castellanos et al., 2005; Johnson, Kelly, et al., 2007); another is the use of an ex-Gaussian analysis to capture very long responses. These are described further in the Method. To date, no research has investigated age-related changes in slow- and fast-frequency RTV in childhood. This is a major gap because these changes in RT at different timescales may reflect different attentional processes (Johnson, Kelly et al., 2007; Johnson, Healy, Dooley, Kelly, & McNicholas, 2015) and therefore may provide important insight into cognitive development.

The present study investigates the developmental trajectories of sustained attention in typically developing children on a particularly unengaging task. Six-, 8- and 10-year-olds completed the Fixed SART on three occasions, at 6-monthly intervals. We addressed development in two ways. Cross-sectional differences between age groups provided an estimate of development between 6 and 11 years of age. Longitudinal change between 6 and 7 years, between 8 and 9 years, and between 10 and 11 years of age provided fine-grained analysis of developmental change in childhood. This is important because previous research has been cross-sectional and has not provided detail of individual improvement with age throughout childhood. Previous research suggests that this period of childhood, especially 6 to 8 years, is a period of rapid development of sustained attention; therefore measuring change over a relatively short period of time may provide a more detailed understanding of these critical developmental changes.

In the current study, we make several specific predictions on the basis of the dominant observations in the literature. Most cross-sectional research reports a decrease in SDRT, omission errors, and tau, and an increase in d’, on Go/No-Go tasks through childhood. Therefore, it is predicted that omission errors, SDRT and all derivations of RTV would show a within-participant decrease over the year and a decrease between groups of ascending age, and that d’ would show corresponding increases. No previous research has investigated developmental changes in commission errors in childhood on a non-arousing task such as the Fixed SART; however, research has shown a decrease in commission errors on the Fixed SART from 12 years into adulthood (McAvinue et al., 2012). Therefore, it is expected that commission errors would show a within-participant decrease over the year, and a decrease between groups of ascending age. The response cue in the SART is used to reduce impulsive responding (Bellgrove, Hawi, Lowe, et al., 2005; Manly, Davison, Heutink, Galloway, & Robertson, 2000) and is said to decrease individual differences in RT (Manly et al., 2000). Therefore, little difference in mean RT and mu between the groups is expected. Predictions were also made regarding rates of change over the year. If only the younger groups improved over the year, this would suggest an early maturation of sustained attention processes and, in turn, would support previous research using child-friendly tasks. If, on the other hand, there is a similar rate of change over the year for all age groups, it would reflect ongoing improvement of performance between 6 and 11 years. Such ongoing improvement would be more consistent with previous research using Go/No-Go tasks that shows maturation of sustained attention into adolescence.

Method

Participants

Thirty-five 6-year-olds (22 females; 6- to 7-year-old group), 31 8-year-olds (23 females; 8- to 9-year-old group), and 37 10-year-olds (19 females; 10- to 11-year-old group) participated in the study (four children were excluded because their estimated IQ was below 70). All children were right-handed and attended primary/elementary schools in a large Australian city. The three age groups did not differ in estimated full-scale IQ, ADHD symptoms, or gender distribution (see Table 1). This sample also participated in our previous work with the Random variant of the SART (Lewis, Reeve, Kelly, & Johnson, 2017).

Table 1 Participant demographics and the number of participants in each group who had data that could be fitted for each dependent variable

Materials & procedure

At the first test occasion, children completed the Fixed SART in a quiet location in their school. They also completed four subtests from the WISC-IV on a separate day. Parents completed the Parent Questionnaire of the Conners 3 Rating Scale (Conners, 2008) (details below). Children completed the SART again 6 and 12 months later. The study was conducted with the approval of the authors’ university Human Ethics Committee.

WISC-IV

At the first test occasion, four subtests (Block Design, Similarities, Digit Span and Coding) from the WISC-IV were administered, and full-scale IQ was estimated following Sattler and Dumont (2004).

ADHD symptoms

Parents completed the Parent Questionnaire of the Conners 3 Rating Scale of ADHD symptoms. This was to check that the spread of ADHD scores did not differ between groups. Seventy-five children had valid data on the Conners scale.

SART

The Fixed SART was presented on a 15” laptop computer using E-Prime 2.0. During the Fixed SART, the routine response is to press a button to stimuli and withhold a response to infrequent No-Go stimuli (Manly et al., 2003; Robertson et al., 1997). The digits 1 to 9 were presented individually, in sequential ascending order from 1 through 9; this cycle was repeated 25 times. The participant was instructed to press the left mouse button with their right index finger in response to every digit, except “3” (i.e., the No-Go target). The digit stimulus was displayed for 313 ms, followed by a mask of 125 ms, a response cue for 63 ms, a mask for 375 ms, and a fixation cross for 563 ms (Johnson, Kelly, et al., 2007). The response cue was a bold-font cross and circle in central fixation, and the participant was instructed to time their button-press response to the onset of this response cue rather than the onset of the digit. In this way, a response cue may reduce impulsive, anticipatory responding to digits (Bellgrove, Hawi, Kirley, Gill, & Robertson, 2005; Manly et al., 2000). The total inter-stimulus interval (ISI) was 1439 ms (digit onset to digit onset). The SART took approximately 5.5 minutes to complete. Figure 1 outlines the timing of the SART.

Fig. 1

Timing of a single trial of the Sustained Attention to Response Task. A represents a Go trial, which could be digits 1, 2, 4, 5, 6, 7, 8 or 9. B represents a No-Go trial, which was always the digit 3. Note that the digits were presented in an ascending sequential order from 1 through 9, a cycle which repeated 25 times

Children completed a practice block of 36 trials, which included four No-Go trials. Following the practice block, they completed the experimental block of 225 trials, including 25 No-Go targets. Prior to commencing the task, children were asked to re-explain the task to ensure they understood instructions; all children successfully re-explained task instructions. The two follow-up test occasions followed the same procedure.
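The trial structure described above can be checked with a few lines of arithmetic. The following is a minimal sketch for illustration, not the E-Prime script used in the study; the dictionary keys and constant names are hypothetical, and the response-cue duration is taken as 63 ms, an assumption chosen because it makes the components sum to the stated 1439 ms ISI.

```python
# Durations (ms) of the events within one Fixed SART trial.
# The 63 ms response cue is an assumption (it yields the stated 1439 ms ISI).
TRIAL_EVENTS_MS = {
    "digit": 313,
    "mask_1": 125,
    "response_cue": 63,
    "mask_2": 375,
    "fixation": 563,
}

N_CYCLES = 25          # repetitions of the 1-9 sequence in the experimental block
DIGITS_PER_CYCLE = 9   # digits 1 through 9
NO_GO_DIGIT = 3

# Digit onset to digit onset.
isi_ms = sum(TRIAL_EVENTS_MS.values())

# Fixed sequence: 1, 2, ..., 9 repeated once per cycle.
sequence = [d for _ in range(N_CYCLES) for d in range(1, DIGITS_PER_CYCLE + 1)]
n_trials = len(sequence)
n_no_go = sequence.count(NO_GO_DIGIT)

# Total running time of the experimental block in minutes.
duration_min = n_trials * isi_ms / 1000 / 60

print(isi_ms, n_trials, n_no_go, round(duration_min, 1))
```

Under these assumptions the block comprises 225 trials with 25 No-Go targets and runs for a little under five and a half minutes.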

Data analysis

Counts of commission errors (responses to the No-Go digit “3”) and omission errors (non-responses on the Go-trials) were calculated for the first and second halves of the SART as well as for the entire SART. The mean and standard deviation of response times (MRT and SDRT) for the Go-trials were calculated for the entire SART and then for the half-by-half analysis. MRT and SDRT were calculated after RTs of less than 100 ms were removed (see Luce, 1986).
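As an illustration of these summary measures, the sketch below computes error counts, MRT, SDRT and CV for a list of trials. The data layout (pairs of digit and RT, with `None` marking no response) and the function name `summarise` are hypothetical, and the use of the sample standard deviation is an assumption; the same function could be applied to each task half separately.

```python
from statistics import mean, stdev

NO_GO_DIGIT = 3
MIN_VALID_RT_MS = 100  # RTs below this are dropped before computing MRT and SDRT


def summarise(trials):
    """Summarise one SART run.

    trials: list of (digit, rt_ms) pairs; rt_ms is None when no response was made.
    """
    # Commission error: any response to the No-Go digit.
    commissions = sum(1 for d, rt in trials if d == NO_GO_DIGIT and rt is not None)
    # Omission error: no response on a Go trial.
    omissions = sum(1 for d, rt in trials if d != NO_GO_DIGIT and rt is None)
    # Valid Go RTs, excluding very fast responses.
    go_rts = [rt for d, rt in trials
              if d != NO_GO_DIGIT and rt is not None and rt >= MIN_VALID_RT_MS]
    mrt = mean(go_rts)
    sdrt = stdev(go_rts)  # sample SD (n - 1); an assumption of this sketch
    return {
        "commissions": commissions,
        "omissions": omissions,
        "mrt": mrt,
        "sdrt": sdrt,
        "cv": sdrt / mrt,  # coefficient of variation
    }


example = [(1, 400), (2, 500), (3, 450), (4, None), (5, 50), (6, 600)]
print(summarise(example))
```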

Fast Fourier Transform data preparation

A modified version of the method of Johnson and colleagues was followed (Johnson et al., 2008; Johnson, Kelly, et al., 2007; see Footnote 1). RTs of less than 100 ms and those for digit “3” were linearly interpolated from the immediately preceding and following RTs.
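A minimal sketch of this interpolation step is given below, assuming that missing or excluded RTs are marked as `None` in a list. The function name is hypothetical, and filling gaps at the very start or end of the series with the nearest valid RT is an assumption of the sketch, since the text does not describe that case.

```python
def interpolate_missing(rts):
    """Fill None entries by linear interpolation between the nearest valid RTs.

    Gaps at the start or end of the series are filled with the nearest valid
    value (an assumption; the source does not describe this edge case).
    """
    valid = [i for i, rt in enumerate(rts) if rt is not None]
    if not valid:
        raise ValueError("no valid RTs to interpolate from")
    filled = list(rts)
    for i, rt in enumerate(rts):
        if rt is not None:
            continue
        prev = max((j for j in valid if j < i), default=None)
        nxt = min((j for j in valid if j > i), default=None)
        if prev is None:
            filled[i] = float(rts[nxt])
        elif nxt is None:
            filled[i] = float(rts[prev])
        else:
            # Straight line between the two neighbouring valid RTs.
            filled[i] = rts[prev] + (rts[nxt] - rts[prev]) * (i - prev) / (nxt - prev)
    return filled


print(interpolate_missing([400, None, 500, None, None, 800]))
```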

Derivation of the FFT spectra

The RT data were analysed using Welch’s averaged, modified, periodogram method, separately for both the full SART and the half-by-half analysis. The full SART time series was divided into seven segments of 75 data points; each segment overlapped by 50 data points. Each segment was detrended by subtracting linear components. Each segment was Hamming-windowed and zero-padded to length 450, and the FFT was then calculated for each segment. Any segment with more than 5 missing data points in a row was excluded. Reasons for a missing data point included very fast RTs (all RTs less than 100 ms were removed), omission errors, and digit “3” trials. Any participant with more than three out of the seven segments missing was excluded. Consequently, a small number of participants were not included in the FFT analyses (see Table 1). Note, these exclusion criteria resulted in a reduction in sample size for the youngest group on the FFT measures, which should be taken into account when interpreting findings. For a half by half analysis, the first three FFT segments were averaged in the first half, and the last three segments were averaged in the second half.
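The segmenting, detrending, windowing and averaging steps can be sketched as follows. This is an illustrative, dependency-free sketch rather than the authors' analysis code: a plain discrete Fourier transform stands in for an FFT routine (the values are the same, only slower to compute), the density scaling of the spectrum is an assumption, and all function and constant names are hypothetical.

```python
import cmath
import math

ISI_S = 1.439                        # one RT sample per trial
FS_HZ = 1 / ISI_S                    # sampling rate of the RT series
SEG_LEN, STEP, NFFT = 75, 25, 450    # 75-point segments overlapping by 50 points
MAX_RUN_MISSING = 5                  # longest tolerated run of missing points
MAX_BAD_SEGMENTS = 3                 # more excluded segments than this: drop participant


def detrend_linear(y):
    """Subtract the least-squares straight line from a segment."""
    n = len(y)
    x_mean, y_mean = (n - 1) / 2, sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    slope = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y)) / sxx
    return [v - (y_mean + slope * (i - x_mean)) for i, v in enumerate(y)]


def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def welch_spectrum(rts, missing):
    """Average spectrum of an interpolated RT series.

    missing[i] is True where rts[i] was interpolated. Returns (freqs, power);
    power is None when too many segments had to be excluded.
    """
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (SEG_LEN - 1))
              for n in range(SEG_LEN)]                      # Hamming window
    scale = 1 / (FS_HZ * sum(w * w for w in window))        # density scaling (assumption)
    freqs = [k * FS_HZ / NFFT for k in range(NFFT // 2 + 1)]
    starts = range(0, len(rts) - SEG_LEN + 1, STEP)
    spectra = []
    for s in starts:
        if longest_run(missing[s:s + SEG_LEN]) > MAX_RUN_MISSING:
            continue
        seg = [v * w for v, w in zip(detrend_linear(rts[s:s + SEG_LEN]), window)]
        power = []
        for k in range(NFFT // 2 + 1):
            # Zero-padding adds only zero terms, so summing over the 75
            # samples equals the 450-point transform of the padded segment.
            x_k = sum(v * cmath.exp(-2j * math.pi * k * n / NFFT)
                      for n, v in enumerate(seg))
            one_sided = 1 if k in (0, NFFT // 2) else 2
            power.append(one_sided * scale * abs(x_k) ** 2)
        spectra.append(power)
    if len(starts) - len(spectra) > MAX_BAD_SEGMENTS:
        return freqs, None
    return freqs, [sum(col) / len(spectra) for col in zip(*spectra)]
```

With 225 trials, a segment length of 75 and a step of 25, the loop produces the seven segments described above. A series that slows and speeds once per nine-digit cycle should show its largest peak close to .0772 Hz.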

The calculation of the area under the spectrum (AUS) over a broad band of interest allowed for an estimate of the power (variance) in the RT signal across the corresponding timescales. Consistency and distinctiveness of any particular regularly repeating RT pattern is measured as a peak at a particular point in the spectrum. Figure 2 provides an example of the spectrum. Healthy adults typically slow on digit “1” in anticipation of the No-Go digit “3”, and this regularly recurring pattern shows a peak at .0772 Hz, which is the reciprocal of the cycle duration (9 digits × 1.439 s ISI). RTs are sampled once per trial, that is, every 1.439 seconds; the sampling rate is therefore 1/1.439, or 0.695 Hz. This peak was used as a marker to differentiate two forms of variability in responding. The Fast Frequency Area under the Spectra (FFAUS) is a measure of the area under the curve to the right of the peak at .0772 Hz, and the Slow Frequency Area under the Spectra (SFAUS) is a measure of the area under the curve to the left of the peak at .0772 Hz (Johnson, Kelly, et al., 2007). FFAUS encompasses variability within one SART cycle (a cycle of the nine digits), and SFAUS encompasses variability over a time scale greater than one SART cycle. The FFAUS measures momentary fluctuations in responding, and may represent fluctuations in cognitive control of attention (Johnson, Kelly, et al., 2007). The SFAUS measures slow changes in responding that may occur over longer timescales, such as those arising from declining arousal over the task (Johnson, Kelly, et al., 2007). The approach thus distinguishes brief fluctuations from gradual shifts in attention, whereas standard, averaged measures such as SDRT aggregate them. Note that a half-by-half analysis was conducted for FFAUS only. This was not conducted for SFAUS because we were interested in slow gradual changes over the whole task.
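The frequencies involved, and the split of the spectrum into slow and fast bands, can be illustrated as follows. This is a sketch under stated assumptions: the spectrum is assumed to be available as two equal-length lists of frequencies and power values, the split is made at the frequency bin nearest the cycle peak, and the areas are computed with the trapezoidal rule. The function names are hypothetical, and the source does not say how the peak bin itself was handled.

```python
ISI_S = 1.439          # seconds between successive digit onsets
DIGITS_PER_CYCLE = 9

SAMPLING_RATE_HZ = 1 / ISI_S                      # about 0.695 Hz
CYCLE_FREQ_HZ = 1 / (DIGITS_PER_CYCLE * ISI_S)    # about 0.0772 Hz


def trapezoid_area(x, y):
    """Area under y(x) by the trapezoidal rule."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))


def split_area(freqs, power, split_hz=CYCLE_FREQ_HZ):
    """Return (SFAUS, FFAUS): areas to the left and right of the cycle peak."""
    peak = min(range(len(freqs)), key=lambda i: abs(freqs[i] - split_hz))
    sfaus = trapezoid_area(freqs[:peak + 1], power[:peak + 1])
    ffaus = trapezoid_area(freqs[peak:], power[peak:])
    return sfaus, ffaus


# Usage with a flat, made-up spectrum on a 450-point frequency grid.
freqs = [k * SAMPLING_RATE_HZ / 450 for k in range(226)]
sfaus, ffaus = split_area(freqs, [1.0] * len(freqs))
print(round(SAMPLING_RATE_HZ, 3), round(CYCLE_FREQ_HZ, 4), sfaus, ffaus)
```

For a flat spectrum the two areas simply equal the widths of the two bands, which gives a quick sanity check on the split.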

Fig. 2

An example of the FFT spectra derived from the Fixed SART. This represents the data from T1 only

Ex-Gaussian analysis

RT may not be normally distributed, but may be positively skewed due to occasional very long responses (Luce, 1986; van Zandt, 2000). An ex-Gaussian model extends a normally distributed model to include a distinct right tail capturing these very slow responses, providing more detail than the Gaussian distribution alone (Hockley, 1984; Luce, 1986; McElree & Carrasco, 1999; Palmer, Horowitz, Torralba, & Wolfe, 2011). In this model, mu represents the mean of the Gaussian component and sigma its standard deviation, similar to MRT and SDRT. Meanwhile, the exponential component tau represents the skewed tail of the distribution and accounts for very long response times (Johnson et al., 2015; Tarantino, Cutini, Mogentale, & Bisiacchi, 2013). The measure of very slow responses, tau, is argued to reflect attention lapses (Leth-Steensen, King Elbaz, & Douglas, 2000; Tarantino et al., 2013), and has been proposed as an endophenotype for Attention Deficit Hyperactivity Disorder (ADHD) (Henríquez-Henríquez et al., 2015; Johnson et al., 2015; Leth-Steensen et al., 2000; Lin, Hwang-Gu, & Gau, 2015; Tarantino et al., 2013).

Mu, sigma, and tau were fitted to each participant’s data sets using Matlab, following the procedure of Lacouture and Cousineau (2008; also see Johnson et al., 2015). A small number of participants were excluded due to the fitting procedure failing to converge on reliable estimates of the parameters (see Table 1).
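To make the three parameters concrete, the sketch below estimates them by the method of moments, which follows from the ex-Gaussian identities mean = mu + tau, variance = sigma² + tau², and third central moment = 2·tau³. This is a simpler technique than the fitting procedure of Lacouture and Cousineau (2008) used in the study and is shown only for illustration; the function names and the simulated values are hypothetical.

```python
import math
import random
from statistics import mean


def exgauss_moments(rts):
    """Method-of-moments estimates (mu, sigma, tau) for an ex-Gaussian sample."""
    n = len(rts)
    m = mean(rts)
    m2 = sum((x - m) ** 2 for x in rts) / n   # variance
    m3 = sum((x - m) ** 3 for x in rts) / n   # third central moment
    if m3 <= 0:
        raise ValueError("sample is not positively skewed; estimates undefined")
    tau = (m3 / 2) ** (1 / 3)
    sigma_sq = m2 - tau ** 2
    if sigma_sq <= 0:
        raise ValueError("moment estimates are inadmissible for this sample")
    return m - tau, math.sqrt(sigma_sq), tau


def simulate_exgauss(mu, sigma, tau, n, seed=0):
    """Draw n values as a Gaussian plus an independent exponential (illustration only)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.expovariate(1 / tau) for _ in range(n)]


sample = simulate_exgauss(mu=400, sigma=40, tau=100, n=100_000)
print(exgauss_moments(sample))
```

Moment estimates can be unstable in samples as small as a single SART run, which is one reason likelihood-based fitting is generally preferred in practice.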

Signal detection theory parameters

Simple choice signal detection theory was used to calculate sensitivity to the No-Go target (d’) and response bias (criterion) (Johnson et al., 2008). We were interested in the participants’ sensitivity to the No-Go target. Therefore, a hit was a correct inhibition to a No-Go target rather than a correct response to a Go trial, a miss was a commission error, a false alarm was an omission error, and a correct rejection was a correct Go response. Higher d’ indicates better discrimination between the target and non-target, sometimes interpreted as better maintenance of arousal (Mackworth, 1968; Sergeant, Oosterlaan, & van der Meere, 1999; van der Meere & Sergeant, 1988). To calculate d’, the z transform of the false alarm rate was subtracted from the z transform of the hit rate; d’ = z(H) - z(F) (Stanislaw & Todorov, 1999). Criterion is the level of bias towards saying a stimulus is a target. To calculate criterion, -0.5 was multiplied by the sum of the z transforms of the hit and false alarm rates, then divided by d’; c’ = (-(z(H) + z(F))/2)/d’ (Macmillan & Creelman, 1991). An unbiased responder will have a criterion of 0. A zero hit rate for one participant was adjusted to assume a hit on 0.5 trials (Kelly, Gomez-Ramirez, & Foxe, 2009).
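These definitions translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and extending the half-trial adjustment to perfect rates and to the false alarm rate is an assumption of the sketch (the text describes the adjustment only for a zero hit rate).

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # z transform: inverse of the standard normal CDF


def adjusted_rate(count, n_trials):
    """Proportion with counts of 0 or n_trials moved by half a trial.

    Keeps the z transform finite. Applying it to perfect rates and to false
    alarms is an assumption of this sketch.
    """
    count = min(max(count, 0.5), n_trials - 0.5)
    return count / n_trials


def sdt_measures(correct_inhibitions, n_no_go, omissions, n_go):
    """Return (d_prime, c, c_relative) with No-Go trials treated as targets."""
    z_h = _z(adjusted_rate(correct_inhibitions, n_no_go))  # hit: correct inhibition
    z_f = _z(adjusted_rate(omissions, n_go))               # false alarm: omission error
    d_prime = z_h - z_f
    c = -0.5 * (z_h + z_f)
    c_relative = c / d_prime if d_prime != 0 else float("nan")
    return d_prime, c, c_relative


# Made-up counts: 20 of 25 No-Go trials withheld, 10 omissions on 200 Go trials.
print(sdt_measures(20, 25, 10, 200))
```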

Statistics

All dependent variables were calculated per participant, per testing occasion, and averaged per group. All variables, except omission errors and criterion, were normally distributed and were analysed using a three-way Test Occasion (Time One/T1, Time Two/T2, Time Three/T3) by Within-Occasion Task Half (first half of the SART, second half of the SART) by Group (6- to 7-year-olds, 8- to 9-year-olds and 10- to 11-year-olds) mixed factorial ANOVA. Within-Occasion Task Half effects will be referred to as Half effects from this point. Including Half as a factor allowed an estimation of time-on-task effects; that is, within-occasion decline in performance as the task progressed. The exception was SFAUS, which was analyzed with a two-way Test Occasion (T1, T2, T3) by Group (6- to 7-year-olds, 8- to 9-year-olds and 10- to 11-year-olds) mixed factorial ANOVA. SFAUS is a measure of gradual change over the task, so a half-by-half analysis is not appropriate. Significant main effects were investigated with Least Significant Difference-adjusted pairwise comparisons, because a significant omnibus test provides sufficient protection against family-wise error when there are three groups (Keselman, Cribbie, & Holland, 1999; Levin, Serlin, & Seaman, 1994). Significant interactions were investigated with Bonferroni-adjusted pairwise comparisons. When the assumption of sphericity was violated, Greenhouse-Geisser statistics were reported. Partial eta squared provided a measure of effect size. Omission errors and criterion were non-normal, according to Kolmogorov-Smirnov and Shapiro-Wilk tests, and were analyzed using non-parametric tests. The Kruskal-Wallis test was used to investigate differences between the three age groups at each test occasion. Friedman’s ANOVA was used to investigate differences between test occasions for each age group. The Wilcoxon Signed-Rank test was used to investigate within-occasion differences between task halves for each age group. 
For non-parametric tests, when an overall test was significant, Dunn’s pairwise comparison tests were run, and effect size was calculated for each comparison; the standardized test statistic is reported. As a measure of effect size, the standardized test statistic was divided by the square root of the number of observations to calculate r (Rosenthal, 1991, p. 19). The unadjusted significance level was used for all non-parametric tests, for the same reason that Least Significant Difference adjustments were used, as described above. All analyses used an alpha level of 0.05. Means and standard deviations for each variable are shown in Table 2.
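Both effect size measures can be recovered from reported test statistics. The sketch below uses the standard identity that partial eta squared equals F·df_effect / (F·df_effect + df_error), together with r = Z/√N as described above; the function names are hypothetical and the Z value in the usage example is made up.

```python
import math


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


def r_from_z(z, n_observations):
    """Effect size r from a standardized test statistic (Rosenthal, 1991)."""
    return z / math.sqrt(n_observations)


# F(2, 100) = 4.773 corresponds to a partial eta squared of about .087.
print(round(partial_eta_squared(4.773, 2, 100), 3))
# A made-up Z of 2.0 over 100 observations gives r = 0.2.
print(r_from_z(2.0, 100))
```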

Table 2 Group means and standard deviations for each normally distributed measure; and median and interquartile range for omission errors and criterion measures

Results

Standard deviation of response time

Significant Group and Half main effects were further explained by a significant Group × Half interaction, F(2, 100) = 4.773, p = .01, ηp² = .087 (Fig. 3), indicating a cross-sectional difference between groups which did not change over the year. In the first half of the task there was a separation between the three groups, reflecting decreasing variability with increasing age (6- to 7-year-old vs 8- to 9-year-old group p = .022; 6- to 7-year-old vs 10- to 11-year-old group p < .001; 8- to 9-year-old vs 10- to 11-year-old group p = .025). In the second half of the task the 6- to 7-year-old group performed the SART with greater variability than both older groups (both p < .001); the difference between the two older groups was not significant (p = .174). Each group performed with significantly increased SDRT in the second compared with the first half of the task (6- to 7-year-old and 10- to 11-year-old groups p < .001; 8- to 9-year-old group p = .020). The 10- to 11-year-old group performed with a larger increase in SDRT between halves compared with the 8- to 9-year-old group; this led to a reduction in the difference between the two older groups as the task progressed.

Fig. 3

Half by Age interaction for Standard Deviation of RT (left) and FFAUS (right). Error bars represent the standard error of the mean. Note * indicates significantly different compared with both other age groups; + indicates significantly different to the first half

There was a significant Test Occasion main effect, F(2, 200) = 24.178, p < .001, ηp² = .195. All children, regardless of age, improved over the year. All children performed the SART with significantly less variability in RT as the year progressed (T1 vs. T2 p < .001; T2 vs. T3 p = .002; T1 vs. T3 p < .001). There were no other significant findings on the SDRT measure.

FFAUS

Significant Group and Half main effects were further explained by a significant Group × Half interaction, F(2, 80) = 3.181, p = .047, ηp² = .074 (Fig. 3), indicating a cross-sectional difference between groups that did not change over the year. In the first half of the SART, the 6- to 7-year-old and 8- to 9-year-old groups did not differ significantly (p = .999), and both were significantly more variable than the 10- to 11-year-old group (both p < .001). In the second half of the SART, the 6- to 7-year-old group was significantly more variable than both older groups (8- to 9-year-old group p = .046; 10- to 11-year-old group p < .001); the two older groups did not significantly differ (p = .153). The 6- to 7-year-old and the 10- to 11-year-old groups each performed with significantly increased moment-to-moment variability in the second compared to the first half of the task (p = .001 and p = .003 respectively); the 8- to 9-year-old group did not (p = .674). The 10- to 11-year-old group performed with a larger increase in FFAUS between halves compared with the 8- to 9-year-old group; this led to a reduction in the difference between the two older groups as the task progressed.

There was a significant Test Occasion main effect, F(2, 160) = 9.768, p < .001, ηp² = .109. All children, regardless of age, improved over the year. Children performed the SART with significantly less moment-to-moment variability at T3 compared with T1 (p < .001) and T2 (p = .008); there was no significant difference between T1 and T2 (p = .121). There were no other significant findings.

SFAUS

There was a significant Group main effect, F(2, 80) = 5.550, p = .006, ηp² = .122, indicating a cross-sectional difference between groups that did not change over the year. The 6- to 7-year-old group performed the SART with significantly greater slow variability than the 10- to 11-year-old group (p = .004); the two older groups did not significantly differ (p = .133). The 6- to 7-year-old group performed with marginally greater slow variability than the 8- to 9-year-old group, but this did not reach significance (p = .058).

There was a significant Test Occasion main effect, F(2, 160) = 7.208, p = .001, ηp² = .083. All children, regardless of age, improved over the year. Children performed the SART with significantly less slow variability at T2 (p = .006) and T3 (p = .001) compared with T1; there was no significant difference in SFAUS between T2 and T3 (p = .399). There were no other significant findings.

Sigma

A significant Half main effect was further explained by a significant Group × Half interaction, F(2, 80) = 5.665, p = .005, ηp² = .124 (Fig. 4), indicating a cross-sectional difference between groups that did not change over the year. The groups did not differ significantly in the first half of the SART (all ps > .788). During the second half, the 6- to 7-year-old group performed with significantly greater sigma than the 10- to 11-year-old group (p = .012); there were no other significant differences between groups (both ps > .321). The 6- to 7-year-old group performed with increasing sigma over the course of the SART (p = .001); the 8- to 9-year-old and 10- to 11-year-old groups did not (both ps > .113). The 6- to 7-year-old group initially performed at a similar level to the older groups, but became more variable as the task progressed.

Fig. 4

Half by Age interaction for Mean RT (left) and sigma (right). Error bars represent the standard error of the mean. Note ǂ indicates significantly different to the 10-11 group; + indicates significantly different to the first half

There was a significant Test Occasion main effect, F(2, 160) = 3.398, p = .036, ηp² = .041. All children, regardless of age, improved over the year. Children performed the Fixed SART with significantly reduced sigma at T3 compared with T1 (p = .015) and T2 (p = .039). T1 and T2 did not differ (p = .599). There were no other significant findings.

Omission errors

Results of non-parametric analyses are presented in Table 3 and in Fig. 5. At T1, the 6- to 7-year-old group made significantly more omission errors than the two older groups, and the 8- to 9-year-old group made significantly more omission errors than the 10- to 11-year-old group. At T2 and T3, the 6- to 7-year-old group made significantly more omission errors than the two older groups; the two older groups did not differ significantly.

Table 3 Results of non-parametric tests for omission errors

Fig. 5 Boxplot of omission errors. The boxes represent the 25th percentile to 75th percentile range, with the middle line representing the median. The top and bottom of the whiskers represent the maximum and minimum scores, respectively. Note: * indicates significantly different to both other groups; 2 indicates significantly different to T2; 3 indicates significantly different to T3

The 6- to 7-year-old group improved in performance over the year, with a significant reduction in omission errors between each test occasion (see Table 3). The 8- to 9-year-old group also improved over the year, and made significantly fewer omission errors at T3 compared with T1. The 10- to 11-year-old group made significantly more omission errors at T2 than at T1 and T3.

The 6- to 7-year-old group made significantly more omission errors in the second compared with the first half of the task at T1 and T2 only (see Table 3). The 8- to 9-year-old group made a consistent number of omission errors over the task at each test occasion. The 10- to 11-year-old group made significantly more omission errors in the second compared with the first half of the task at T2.

Criterion

Results of non-parametric tests on the criterion measure are shown in Table 4. At T1 and T3 there was no significant difference in criterion between the three groups. At T2, the 6- to 7-year-old group showed a significantly greater bias to withhold a response than the 10- to 11-year-old group, while the 8- to 9-year-old group did not differ significantly from the other groups. For each group there was no significant difference between test occasions and no significant within-occasion differences between halves.

Table 4 Results of non-parametric tests for criterion

d’

There was a significant Group main effect, F(2, 100) = 24.293, p < .001, ηp² = .327, indicating a cross-sectional difference between groups that did not change over the year. The 6- to 7-year-old group performed with lower target sensitivity than the two older groups (both p’s < .001), and the 8- to 9-year-old group performed with lower target sensitivity than the 10- to 11-year-old group (p = .008).

There was a significant Half main effect, F(1, 100) = 33.124, p < .001, ηp² = .249. Regardless of age, all children showed a within-occasion change in performance across the task, and this pattern did not change over the year. Participants performed with significantly greater target sensitivity in the first compared with the second half of the SART.

There was a significant Test Occasion main effect, F(2, 200) = 11.012, p < .001, ηp² = .099. All children, regardless of age, improved over the year. Children performed the SART with significantly lower target sensitivity at T1 compared with T2 (p = .002) and T3 (p < .001); there was no significant difference between T2 and T3 (p = .121). There were no other significant findings.
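
For readers less familiar with signal detection measures, the following sketch shows one conventional way to derive target sensitivity (d’) and criterion from response counts, using d’ = z(hit rate) − z(false-alarm rate) and c = −0.5 × [z(hit rate) + z(false-alarm rate)]. Two points are assumptions rather than details from this study: the mapping of SART outcomes onto hits and false alarms is left to the caller, and a log-linear correction (adding 0.5 to each count) is applied so that perfect or zero rates remain finite. The function name is hypothetical.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) from raw outcome counts.

    Assumption: a log-linear correction (0.5 added to each count) keeps
    rates of exactly 0 or 1 finite. The study may have handled extreme
    rates differently.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z_hit, z_fa = _z(hit_rate), _z(fa_rate)
    d_prime = z_hit - z_fa             # separation of signal and noise
    criterion = -0.5 * (z_hit + z_fa)  # 0 indicates no response bias
    return d_prime, criterion


# Illustrative counts only, not data from the study.
d_prime, criterion = sdt_measures(hits=40, misses=10,
                                  false_alarms=10, correct_rejections=40)
print(round(d_prime, 2), round(abs(criterion), 2))
```

Whether a positive criterion corresponds to a bias to withhold or to respond depends on which response is treated as the "signal" response, so the sign should be interpreted against the definitions used in the Method.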

Tau

There was a significant Group main effect, F(2, 80) = 4.207, p = .018, ηp² = .095, indicating a cross-sectional difference between groups that did not change over the year. The 6- to 7-year-old and 8- to 9-year-old groups did not differ significantly (p = .392), but both groups performed with significantly more long responses than the 10- to 11-year-old group (6- to 7-year-old p = .007; 8- to 9-year-old p = .043).

There was a significant Half main effect, F(1, 80) = 13.517, p < .001, ηp² = .145. Regardless of age, all children showed a within-occasion change in performance across the task, and this pattern did not change over the year. Children performed the Fixed SART with significantly more long responses in the second compared with the first half of the task. There were no other significant findings.

There was a significant Test Occasion main effect, F(2, 160) = 4.128, p = .021, ηp² = .049. All children, regardless of age, improved over the year. Children performed the Fixed SART with significantly fewer long responses at T3 compared with T1 (p = .013), with no other differences between sessions (all p’s > .089).
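
To make the relationship between mu, sigma and tau concrete, the sketch below simulates response times from an ex-Gaussian distribution (a normal component with mean mu and standard deviation sigma, plus an exponential component with mean tau) and recovers the three parameters with a simple method-of-moments estimator. This is a stand-in chosen for brevity, not the fitting procedure used in the study; ex-Gaussian parameters are more commonly estimated by maximum likelihood. All function names and parameter values are hypothetical.

```python
import random
from statistics import fmean


def simulate_ex_gaussian(n, mu, sigma, tau, seed=0):
    """Draw n response times (ms) as Normal(mu, sigma) + Exponential(mean=tau)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


def fit_ex_gaussian_moments(rts):
    """Method-of-moments estimates of (mu, sigma, tau).

    Uses: mean = mu + tau, variance = sigma^2 + tau^2,
    third central moment = 2 * tau^3.
    """
    n = len(rts)
    mean = fmean(rts)
    m2 = sum((x - mean) ** 2 for x in rts) / n
    m3 = sum((x - mean) ** 3 for x in rts) / n
    if m3 <= 0:
        raise ValueError("non-positive skew: moment estimates are undefined")
    tau = (m3 / 2.0) ** (1.0 / 3.0)
    sigma_sq = m2 - tau ** 2
    if sigma_sq <= 0:
        raise ValueError("variance too small for the estimated tau")
    return mean - tau, sigma_sq ** 0.5, tau


# Illustrative values only: a larger tau lengthens the right tail
# (more very long responses) without moving the normal component.
rts = simulate_ex_gaussian(200_000, mu=400.0, sigma=80.0, tau=100.0, seed=1)
mu_hat, sigma_hat, tau_hat = fit_ex_gaussian_moments(rts)
```

With short SART runs the third moment is estimated from few trials and is noisy, which is one reason likelihood-based fitting is usually preferred in practice.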

Commission errors

There was a significant Group main effect, F(2, 100) = 14.400, p < .001, ηp² = .224, indicating a cross-sectional difference between groups that did not change over the year. The 6- to 7-year-old group made significantly more commission errors on the Fixed SART than both older groups (8- to 9-year-old group p = .006; 10- to 11-year-old group p < .001). The 8- to 9-year-old group made significantly more commission errors than the 10- to 11-year-old group (p = .022).

There was a significant Half main effect, F(1, 100) = 12.356, p = .001, ηp² = .110. Regardless of age, all children showed a within-occasion change in performance across the task, and this pattern did not change over the year. Participants made significantly more commission errors in the second compared with the first half of the Fixed SART.

There was a significant Test Occasion main effect, F(2, 200) = 3.101, p = .05, ηp² = .03. All children, regardless of age, improved over the year. Children made significantly fewer commission errors at T2 (p = .033) and T3 (p = .041) compared with T1, but made a similar number of commission errors at T2 and T3 (p = .725). There were no other significant findings.

Mean response time

A significant Half main effect was further explained by a significant Group × Half interaction, F(2, 100) = 3.637, p = .030, ηp² = .068 (Fig. 4), indicating a cross-sectional difference between groups that did not change over the year. In the first half of the SART, there were no significant differences between the three groups (all p’s > .226). In the second half of the task the 6- to 7-year-old group was significantly slower than the 10- to 11-year-old group (p = .018); there were no other significant differences between groups (both p’s > .322). The 6- to 7-year-old group slowed significantly over the course of the Fixed SART (p < .001); the older groups maintained a consistent MRT throughout the task (both p’s > .465). The 6- to 7-year-old group initially performed with an MRT similar to the older groups, but became slower to respond as the task progressed. There were no other significant findings.

Mu

There was a significant Group × Half interaction, F(2, 80) = 3.380, p = .039, ηp² = .078, indicating a cross-sectional difference between groups that did not change over the year. In both halves of the SART, all groups performed with similar mu (all p’s > .776). The 10- to 11-year-old group performed the SART with significantly reduced mu in the second compared with the first half of the task (p = .007); the 6- to 7-year-old group and 8- to 9-year-old group performed with a similar level of mu in both halves (both p’s > .254). There were no other significant findings. The 10- to 11-year-old group became faster as the task progressed.

Discussion

This is the first longitudinal investigation of children’s performance on the Fixed SART. There are four key findings. First, the 6- to 7-year-old group showed a diminishing ability to maintain arousal and attention throughout the task; they showed within-occasion time-on-task effects on almost all measures. Second, the 10- to 11-year-old group performed well relative to younger groups, but their high performance level was difficult to maintain for the full length of the SART. Within each occasion, they showed time-on-task effects on several measures, particularly in terms of FFAUS and SDRT. Third, the 8- to 9-year-old group performed the SART in an intermediate and stable manner; at each test occasion they performed more consistently throughout the task than the other groups. Fourth, there were significant main effects of test occasion on most measures, indicating changes in performance across the year regardless of age. This suggests ongoing development of sustained attention throughout childhood. The intermediate performance of the 8- to 9-year-old group, and the improvement between assessments, provide support for the first and second predictions that there would be both between-group and between-occasion differences in SDRT, FFAUS, SFAUS, sigma, commission errors, omission errors and d’. There was partial support for the prediction that there would be between-group differences and between-occasion change in tau. There was significant improvement in tau between assessments independent of age; however, there was no significant difference between the two younger groups. Overall, results suggest that between 6 and 11 years, there is ongoing improvement and further development in sustained attention to a non-arousing task.

The Fixed SART was relatively difficult for the 6- to 7-year-old group. As expected, this group performed less competently than the oldest group on almost all measures at each occasion, and less competently than the 8- to 9-year-old group on several measures. The youngest group also showed within-occasion time-on-task effects for almost all measures. This suggests that at 6 to 7 years, maintaining attention for more than a few minutes on an unengaging, repetitive and predictable task was difficult. Maintaining attention to a task requires cognitive control regions such as the anterior cingulate cortex, as well as regions for maintaining alertness such as the medial frontal and prefrontal cortex (Coste & Kleinschmidt, 2016; Fassbender et al., 2004). These regions of the brain are still maturing at 6 and 7 years of age (Giedd et al., 2014; Rubia, 2013), so our findings are not unexpected. Some researchers suggest most development in sustained attention occurs between 6 and 7 years (Sobeh & Spijkers, 2012; Visu-Petra et al., 2007). The current study instead indicates ongoing development through childhood; however, it does indicate a period of greater development between 6 and 7 years than after 8 years. These findings may suggest that there is a great amount of improvement in sustained attention to an unengaging task between 6 and 8 years of age.

The 10- to 11-year-old group generally performed the Fixed SART well. As expected, this group outperformed the 6- to 7-year-old group on almost all measures at each occasion, and outperformed the 8- to 9-year-old group at each occasion on several measures. Despite their superior performance, the 10- to 11-year-old group performed with time-on-task effects within each occasion on several measures. These effects were particularly notable for FFAUS and SDRT; on those measures the 10- to 11-year-old group performed at a similar level to the 8- to 9-year-old group in the second half of the task. FFAUS reflects momentary fluctuations in responding, which may be suggestive of fluctuating attention control (Johnson, Robertson, et al., 2007). SDRT incorporates variability across both the slow and fast frequency domains, and the strong FFAUS effect will have been reflected to some extent in the SDRT measure. At 10 to 11 years, there is continuing development of sustained attention on an unengaging task.
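
The distinction between slow and fast frequency variability can be illustrated by splitting the power spectrum of a trial-by-trial RT series at a cutoff frequency and summing the area on each side. The sketch below is a simplified illustration under stated assumptions and not the study's pipeline: the inter-stimulus interval and the cutoff are placeholder values, the spectrum is a plain discrete Fourier transform of the mean-centred series, and the detrending, windowing and handling of missing responses that a full analysis would need are omitted. The function name is hypothetical.

```python
import cmath
import math


def band_areas(rts, sample_interval_s, cutoff_hz):
    """Return (slow_area, fast_area) of the power spectrum of an RT series."""
    n = len(rts)
    mean = sum(rts) / n
    centred = [x - mean for x in rts]       # removes the 0 Hz component
    bin_width = 1.0 / (n * sample_interval_s)
    slow = fast = 0.0
    for k in range(1, n // 2 + 1):          # positive frequencies only
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        power = (abs(coeff) / n) ** 2
        if k * bin_width < cutoff_hz:
            slow += power * bin_width       # below cutoff: slow variability
        else:
            fast += power * bin_width       # at or above cutoff: fast variability
    return slow, fast


# Placeholder values for illustration only (assumptions, not from this section).
INTERVAL_S = 1.439   # assumed time between successive trials
CUTOFF_HZ = 0.0772   # assumed boundary between slow and fast bands

# A series that drifts slowly over the task loads on the slow band.
drifting = [500 + 50 * math.sin(2 * math.pi * 0.02 * t * INTERVAL_S)
            for t in range(225)]
slow_area, fast_area = band_areas(drifting, INTERVAL_S, CUTOFF_HZ)
```

Because SDRT is computed from the whole series, it reflects variability in both bands at once, which is why a strong effect in one band also appears in SDRT.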

The ongoing development of sustained attention performance at 10-11 years of age is in line with past research. Previous cross-sectional studies using Go/No-Go tasks reported a decrease in SDRT and commission errors from childhood into adolescence (Brocki et al., 2010; Ciesielski et al., 2004; Fortenbaugh et al., 2015; Lin et al., 1999; McAvinue et al., 2012; Rebok et al., 1997). Previous fMRI research has reported increasing activation from 10 years into early adulthood of frontoparietal-striatothalamic sustained attention networks during a CPT (Smith, Halari, Giampetro, Brammer, & Rubia, 2011) and during a vigilance delay task (Murphy et al., 2014), suggesting ongoing functional maturation of these networks. The current findings may reflect this protracted development of frontoparietal networks (Casey, Galvan, & Hare, 2005; Casey, Giedd, & Thomas, 2000; Gogtay et al., 2004; Rubia, 2013). While 10- to 11-year-olds performed the Fixed SART well relative to younger children, their high level of performance was not maintained for the duration of the task. Immature frontoparietal networks may make the maintenance of task goals and monitoring of task performance difficult.

The 8- to 9-year-old group performed the Fixed SART in a stable and consistent manner, at a moderately good level. As expected, this middle group mostly performed the SART at a level intermediate to the two other groups. This group showed superior and more consistent sustained attention on the Fixed SART compared with the youngest group at each occasion. Only in terms of FFAUS and tau did the 8- to 9-year-old group perform at a level similar to the 6- to 7-year-old group. Maintenance of cognitive control of attention and avoidance of extremely slow responses on an unengaging task may show improvements later in childhood than other measures of sustained attention. A key finding of this study is that at each occasion the 8- to 9-year-old group did not show significant differences between the first and second halves of the SART on most measures; this group generally showed a stable level of performance across the task. This is an important finding because time-on-task effects are a common finding even in adult data (e.g., Anderson, Wales, & Horne, 2010; Lee, Williams, Sargent, Williams, & Johnson, 2015), so this stability is unexpected. A developmental plateau may have been reached at 8 and 9 years. This may reflect a transition period between the development seen between 6 and 8 years of age, and the improvements in attention control that may occur in adolescence. The 8- to 9-year-old group performed moderately well on the Fixed SART but had yet to undergo the more subtle development that likely occurs into adolescence.

Tau followed a somewhat similar pattern of results to commission errors and d’, measures associated with deciding whether or not to respond. On all three measures, all age groups showed within-occasion time-on-task effects, with no Half × Group interactions. These were the only measures on which the 8- to 9-year-old group showed a time-on-task effect. Tau reflects very long responses, and captures the skewed tail (the exponential component) of the ex-Gaussian distribution. Research has linked increased tau to lapses in attention (Leth-Steensen et al., 2000; Lin et al., 2015; McVay & Kane, 2012; Tarantino et al., 2013; Weissman, Roberts, Visscher, & Woldorff, 2006) and to a slower rate of decision making and information processing (Luce, 1986; Schmiedek, Oberauer, Wilhelm, & Wittmann, 2007). There is, however, debate regarding the interpretation of tau; it has been suggested that tau may not reflect any particular cognitive process (Matzke & Wagenmakers, 2009). Therefore, it is an assumption that increased tau reflects increased lapses in attention. Further research is required to test this assumption. Our data do not contradict either interpretation of tau. Attention lapses leading to occasional very long response times may also lead to slower information processing and decreased target sensitivity and inhibitory control. This may explain the similarity in findings between tau, commission errors and d’. There are also other techniques to analyse RT data, such as the drift-diffusion model (Ratcliff, 1978; Ratcliff & Smith, 2004). This model separates decision processes, such as the rate of evidence accumulation (drift rate v) which is determined by the quality of information extracted from the stimulus, from non-decision processes such as encoding processes and response output processes. Future drift-diffusion analysis of RT data from the SART is still required.
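
To illustrate how a drift-diffusion account separates decision from non-decision time, the sketch below simulates a basic two-boundary diffusion process: noisy evidence accumulates at drift rate v from an unbiased starting point until it reaches one of two boundaries, and a fixed non-decision time is added to give the observed RT. It is a minimal simulation under assumed parameter values, not a model fitted to the SART data, and the function name and defaults are hypothetical.

```python
import random


def simulate_ddm(n_trials, drift, boundary=0.1, non_decision=0.3,
                 noise=0.1, dt=0.001, seed=0):
    """Simulate a simple drift-diffusion process with Euler steps.

    Returns a list of (rt_seconds, reached_upper_boundary) tuples.
    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    step_sd = noise * dt ** 0.5           # noise scales with sqrt(dt)
    trials = []
    for _ in range(n_trials):
        evidence = boundary / 2.0         # unbiased starting point
        elapsed = 0.0
        while 0.0 < evidence < boundary:  # accumulate until a boundary is hit
            evidence += drift * dt + rng.gauss(0.0, step_sd)
            elapsed += dt
        # observed RT = decision time + encoding/response output time
        trials.append((elapsed + non_decision, evidence >= boundary))
    return trials


# Higher-quality stimulus information (larger drift) versus lower-quality.
high = simulate_ddm(1000, drift=0.3, seed=1)
low = simulate_ddm(1000, drift=0.05, seed=2)
mean_rt_high = sum(rt for rt, _ in high) / len(high)
mean_rt_low = sum(rt for rt, _ in low) / len(low)
```

In this simulation a lower drift rate produces slower and less consistent responding while non-decision time is held constant, which is the kind of dissociation a drift-diffusion analysis of SART data could test.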

On almost all measures, all three age groups showed the same rate of change over the year. There were no Test Occasion × Age Group interactions; no group showed a differential rate of change relative to the other groups. This may be interpreted as evidence of practice effects; however, there are several reasons to consider alternative explanations. First, each measure showed different rates of change over the year. Second, the task was a short 5.5 minutes and there was a long 6-month interval between tests. Lastly, while interactions cannot be investigated with non-parametric tests, results suggest that the two younger groups made a decreasing number of omission errors over the year, while the 10- to 11-year-old group performed with a similar number of omission errors at each assessment. Therefore, practice effects cannot fully explain these findings. Changes over the year may instead reflect incremental and consistent improvement throughout childhood of sustained attention ability on an unengaging task.

This study has implications for our understanding of sustained attention. Previous research on RTV in children has used SDRT or CV as a measure of sustained attention ability, often reporting a decrease with increasing age through childhood (Brocki et al., 2010; Klarborg et al., 2013). In the current study, each measure of RTV showed a different pattern of results. The two older groups did not significantly differ in slow fluctuations in RT, and the two younger groups did not significantly differ in very long responses. Meanwhile, in terms of both SDRT and fast frequency fluctuations in RT, the 10- to 11-year-old group performed in a superior manner to the 8- to 9-year-old group, but this superior performance was not maintained for the duration of the task. These different findings may suggest that these measures represent different components of attention. We propose that tau may be indicative of attention lapses (Henríquez-Henríquez et al., 2015; Leth-Steensen, King Elbaz, & Douglas, 2000; Lin, Hwang-Gu, & Gau, 2015), slow fluctuations in RT may indicate declining arousal over the task (Johnson, Kelly, et al., 2007) and fast frequency fluctuations in RT may indicate cognitive control of attention (Johnson, Kelly, et al., 2007). The current study may therefore suggest that sustained attention does not follow a single developmental pathway, and should not be considered a unitary process. These interpretations of what these RTV measures represent are just one possible explanation, and further research is required to assess which cognitive and biological mechanisms these measures reflect. Regardless of the meaning of these measures, the current study highlights different rates of development for different measures of RTV. The current study adds weight to the previous suggestion that FFAUS and tau are highly sensitive measures of sustained attention development (Lewis et al., 2017). Future research must, therefore, be careful in generalizing findings from simple statistics such as SDRT because alternative methods reveal a more complex development of sustained attention.

To build on the findings of the current study, further research on RTV is still needed. In particular, there is a need for clarification of the neural bases of RTV in childhood. The neural sources of slow frequency variability, fast frequency variability, and very long responses are still largely unknown in children. Some research has investigated the neural correlates of SDRT and CV in children (McIntosh, Kovacevic, & Itier, 2008; Simmonds et al., 2007; Suskauer et al., 2008; Tamnes, Fjell, Westlye, Østby, & Walhovd, 2012), which largely implicate frontoparietal networks. A few recent studies have investigated the neural correlates of sigma and tau in children (Fassbender et al., 2009; Lin, Gau, Shang, Wu, & Tseng, 2014; van Belle, Van Raalten, et al., 2015), but the focus of these studies was ADHD, so they did not specifically investigate age-related changes. More research is needed to clarify how developmental changes in the various measures of RTV might relate to brain maturation. The various measures of RTV may be underpinned by different neural networks that show different rates of structural and functional maturation. Further longitudinal imaging studies are needed to test this hypothesis, and to further complete our understanding of RTV and sustained attention in childhood.

In conclusion, this study revealed consistent and ongoing sustained attention development between 6 and 11 years, as measured by the Fixed SART. At each test occasion, children aged 6 to 7 years performed the SART in a less competent manner than the 10- to 11-year-old group, and with time-on-task effects on most measures. At 6 to 7 years, arousal and sustained attention were difficult to maintain on this predictable and repetitive task. At each test occasion, children aged 10 to 11 years performed the SART well relative to younger children, but this was effortful to maintain over the entire task. At 10 to 11 years there was ongoing development of arousal, sustained attention, and executive control of attention. Children aged 8 to 9 years performed the SART in a stable and intermediate manner at each test occasion, with little change in performance throughout the task on most measures. Sustained attention to non-arousing stimuli appears to follow incremental and consistent development between 6 and 11 years, but there is a period of stability in performance at 8 and 9 years.