Introduction

One of our cognitive core functions has attracted increasing research interest in recent decades: working memory (WM), defined as the ability to control attention and simultaneously manipulate and temporarily store information (Kane & Engle, 2002). WM is a multifaceted construct in which storage and executive processing interact. The storage component holds a limited amount of information in an active state for a short time. In contrast, the central executive component controls resources and monitors information processing across informational domains (Baddeley, 1986). WM is essential for tasks in everyday life because it enables us to filter, analyze, and act on a steady stream of information. Furthermore, WM is associated with a broad range of higher-order cognitive abilities, such as executive control and problem-solving (e.g., Jaeggi et al., 2010; Lu et al., 2011; Miyake et al., 2000). Our WM capacity circumscribes our ability to learn to a large extent (Cowan, 2014), which has led scholars to investigate its impact on academic success. Results confirm that WM is one of the best single predictors of children’s academic achievements (Alloway & Alloway, 2010; Fitzpatrick et al., 2015). Children with WM deficits show below-average academic performance (Titz & Karbach, 2014). To minimize cascade-like deficits, interventions targeting WM can be applied (cf. Jones et al., 2020a; Wass et al., 2012). This idea is supported by studies suggesting greater neural and behavioral malleability in children (e.g., Heckman, 2006; Stiles et al., 2005; Thomas & Johnson, 2006).

However, WM interventions are still rare in ordinary elementary school settings. Findings on whether typically developing children benefit from training in novel tasks are neither robust nor consistent (Sala & Gobet, 2017). The present study aimed to identify which training task features and trainees’ characteristics promote cognitive and academic benefits by assessing personal variables of participants and comparing the effects of WM training with those of perceptual training implemented in a standard school setting.

WM Training

Debate on the efficacy of WM training is ongoing. WM training has the potential to improve children’s WM performance (Diamond & Lee, 2011), executive functions (Scionti et al., 2020), some higher-order cognitive performance (Alloway et al., 2013; Studer-Luethi et al., 2016), academic abilities (Holmes & Gathercole, 2014; Karbach et al., 2015), or even real-life behavior (Luis-Ruiz et al., 2020). In addition to behavioral effects, several neuroimaging studies have demonstrated that the effects of WM training couple with complex patterns of subtle, localized structural and functional changes in the brain (Astle et al., 2015; Bäckman et al., 2011; Caeyenberghs et al., 2016; Sánchez-Pérez et al., 2019). These findings indicate the alteration of a biological system responsible for information processing.

Nevertheless, the data are inconsistent. More consistent training-induced benefits have been found in children than in adults (see Wass et al., 2012) and especially in children with some cognitive impairments (Jones et al., 2020a; Ko et al., 2020; Oldrati et al., 2020; Passarotti et al., 2020; Veloso et al., 2020). However, several studies have failed to confirm WM training’s effects on children’s academic or higher-order cognitive performance (Dunning et al., 2013; Thorell et al., 2009). Such inconsistent results have led to several reviews with disparate conclusions (e.g., Au et al., 2015; Constantinidis & Klingberg, 2016; Melby-Lervåg et al., 2016; Titz & Karbach, 2014). Besides different methodological standards across studies, such inconsistencies may arise from variations in training task features, trainees’ characteristics, or both, which may evoke varying influences on the transfer-enabling cognitive processing systems (Pergher et al., 2020; Pergher et al., 2020).

Characteristics of Training Tasks and Trainees

Training-induced improvements may rely on specific processes activated in particular tasks. Some interesting insights from this research include the relevance of task features, such as gamification (Katz et al., 2014; Khaleghi et al., 2021; Shaban et al., 2021) or paradigm-specificity and complexity (Byrne et al., 2020; Gathercole et al., 2019; Gibson et al., 2013; Holmes et al., 2019; Küper & Karbach, 2016; Minear et al., 2016). A more pronounced transfer to nontrained tasks seems to occur from adaptive (Brehmer et al., 2011), multiparadigm and multifactorial training (Jaušovec & Jaušovec, 2012; Owen et al., 2010; Schmiedek et al., 2010; von Bastian & Oberauer, 2013), and complex storage and processing training (Gibson et al., 2013; Jones et al., 2020b; von Bastian & Oberauer, 2013). Two examples of such training tasks are the n-back and complex span tasks, as they combine demands on storage and processing components of WM (Chein & Morrison, 2010; Jones et al., 2020a).

Research into how trainees’ characteristics influence the effectiveness of interventions provides evidence that interindividual differences are an essential factor (Melby-Lervåg et al., 2016). Individual factors that influence both training compliance and training outcomes in adults and children include personality (Studer-Luethi et al., 2012, 2016; Urbánek & Marček, 2016), emotions (Brose et al., 2014), and motivation (Appelgren et al., 2016; Jaeggi et al., 2011). For instance, the personality traits neuroticism and conscientiousness interact with training outcomes (Studer-Luethi et al., 2012). Furthermore, children’s effortful control abilities predicted training improvement and larger post-training gains (Studer-Luethi et al., 2016).

Other studies have associated education, strategy use, motivation, parenting structures, and family functioning with short-term improvements in WM and training compliance following WM training in school-age children (Appelgren et al., 2016; Pascoe et al., 2019; Pergher et al., 2020). More remains to be learned about the impact of trainees’ characteristics on cognitive training outcomes.

Study Paradigm

This study implemented WM training in a standard school setting to explore how training task features and trainees’ characteristics influenced training and transfer outcomes. We compared the effectiveness of a complex WM training with a control intervention involving perceptual-matching training on the same set of transfer measures. We also assessed some personal, regulatory, motivational, and social variables of children completing the training. The study complies with the basic methodological criteria suggested for WM training in children (cf. Vernucci et al., 2022).

Our WM training paradigm applied two complex visual WM tasks that are often used for training WM, an n-back and a complex span task. In the n-back task, participants see a series of stimuli and are required to judge for each stimulus whether it is the same as the stimulus seen n items back. In the complex span task, participants must remember a series of stimuli with a processing task between each item.
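As an illustration of the n-back judgment described above, the following minimal sketch labels each stimulus in a sequence as target or nontarget. The function name and the handling of the first n items (no judgment possible, returned as None) are assumptions made for this example and are not taken from the study software.

```python
from typing import List, Optional, Sequence


def nback_targets(stimuli: Sequence[str], n: int) -> List[Optional[bool]]:
    """Label each stimulus in an n-back sequence.

    True  = target (same as the stimulus n items back)
    False = nontarget
    None  = fewer than n predecessors, so no comparison is possible
            (how the real task treats these trials is an assumption here)
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    labels: List[Optional[bool]] = []
    for i, item in enumerate(stimuli):
        if i < n:
            labels.append(None)
        else:
            labels.append(item == stimuli[i - n])
    return labels


# Hypothetical 2-back sequence of screen locations
example = nback_targets(["A", "B", "A", "C", "A", "C"], n=2)
print(example)
```

The same function covers any level of n, which is what makes the paradigm easy to adapt in difficulty.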

Our control training used an audiovisual training task that placed far lower demands on the storage and processing components of WM but instead placed high demands on the perceptual integration of auditory and visual information in short-term memory. We chose auditory-visual matching tasks from the AUDILEX program (Karma, 1989), which has been used in research before (Kujala et al., 2001). Its theoretical background involves auditory structuring ability and auditory-visual matching. The tasks require participants to compare a stream of auditory tones with visual patterns of differing heights and lengths representing the tones on the computer screen.

Thus, both training regimes require attentional control and the processing of a stream of stimuli kept active for a short time (Engle, 2002). Critically, the WM tasks also require extensive processing of information, storage of items in primary memory, and retrieval and comparison of information from secondary memory (Unsworth & Engle, 2007). In contrast, perceptual training tasks require fewer but overlapping components, especially perceptual processing and auditory and visual information matching. According to Baddeley (2000) and Miyake’s (Miyake & Friedman, 2012; Miyake et al., 2000) models, the components involved in the WM tasks are storage, retrieval, and the central executive functions of inhibition, updating, and shifting. However, the components involved in the perceptual training are the visual sketchpad and the phonological loop. In Bastian et al.’s (2013) terms, our WM tasks place high demands on the storage and processing functions of the WM. In contrast, the perceptual tasks place demands on the relational integration of information in perceptual processing in WM.

Regarding trainees’ characteristics, we assessed individual variables that may influence WM training outcomes. Volitional, motivational, and social factors seem relevant for training performance, as frustration and emotions during the training need to be regulated if the trainee is to perform well.

Building upon previous findings outlined above, we had the following hypotheses:

1) Children in the WM training group will reach higher transfer on cognitive performance measures than the perceptual training group, assuming that improvements may depend on the specific processes involved in WM training regimens, such as demands on the active processing and storage components of WM.

2) Several personal factors will relate positively to training outcomes. These include two personality factors, low neuroticism and high conscientiousness, and two self-regulation factors, high effortful control and power of endurance. In addition, taking into consideration that the training was conducted in groups in a school setting, we hypothesized that two school-related factors, the joy of learning and social integration, would positively influence training success.

Methods

Participants

Eighty-six elementary school students (42 girls, 44 boys) with a mean age of 10.1 years (SD = 0.74; range = 8.2–12.1) participated in the study. We recruited from five elementary schools in Switzerland and enrolled participants attending third grade (62.9%, 30 boys) and fourth grade (37.1%, 19 boys). The informed consent communicated to parents, teachers, and students that the participants would receive one of two interventions, both of which were beneficial in different ways. All caregivers provided informed written consent before participation. The teachers indicated that all the participants were able to understand and write German well. Each participating class received an award after completing the training.

Participants were matched for age, gender, and general intelligence. Then, we randomly assigned participants to the WM training group (n = 43; mean age = 10.1 years; SD = 0.64; 22 boys) or to the control group (n = 43; mean age = 10.1 years; SD = 0.83; 22 boys). We excluded the data of seven children (three in the WM training group, four in the control group) from the analyses due to their infrequent attendance at training. The minimal attendance required was 17 sessions. Both experimental groups performed an average of 17.5 training sessions (SD = 2.53).

Design and Procedure

Teachers and parents rated the individual variables for participating students, and the authors collected these ratings before starting the intervention. Pre-post tests bookended the intervention. A pre-test (T1) was given the week before starting the training, followed by a post-test (T2) the week after training completion. The whole class received simultaneous assessments of intelligence and academic measures, while the WM tasks were administered individually. The cognitive tests were randomized such that A and B versions appeared in the pre-post-tests in a counterbalanced order to reduce retest effects. Additionally, self-report questionnaires were administered only at pre-test.

After the pre-tests, the authors randomly assigned participants to one of two study groups. One group completed the WM training (WM training group), whereas the other group participated in the perceptual-matching training (control group). The authors and master-level students conducted the testing and training sessions during regular school lessons three times per week over 6 weeks. Each session lasted approximately 15 min. The children trained individually with headphones in groups of five to eight in a separate room in each school. In every training session, the WM training group completed the farm span task and jumping animal task, and the perceptual-matching training group completed two auditory-visual matching tasks. After the final training sessions, a questionnaire was administered to measure training enjoyment, motivation, and perceived benefit.

Materials

WM Training Tasks

The WM training consists of two tasks, an n-back task named the “jumping animal task,” and a span task named the “farm span task.” The jumping animal task is based conceptually on the visual single n-back task (Jaeggi et al., 2010). The farm span task derives from a complex span task (i.e., animal span task in Buschkuehl et al., 2008). We modified the training tasks to improve their attractiveness for children using findings of specific features relevant to young trainees’ motivation and performance (Mohammed et al., 2017; Moret-Tatay et al., 2016; Prins et al., 2011). Specifically, we modified the layout, storyline, adaptivity, feedback, and reward system, as shown in Fig. 1 (Studer-Luethi et al., 2012).

Fig. 1

Adapted n-back and complex span task applied in the working memory training group. “Jumping animal task”: A Task design and immediate feedback, B feedback slide; “farm span task”: C sequence recall slide, D feedback slide

Jumping Animal Task

In this n-back task, pictures of a jumping animal, such as a rabbit or a kangaroo, appear at a sequence of different locations on the screen (presentation time: 500 ms, interstimulus interval: 2500 ms). During each interval, the participant “feeds the animal” by pressing a pre-defined target key whenever the animal’s current location is the same as n positions back in the sequence, or presses a pre-defined nontarget key in any other case. Immediate feedback pops up at the top of the screen for each response (Fig. 1A). Every level of n contains three blocks, represented by field sizes of 4, 8, and 11 grid compartments. If the participant has made fewer than three mistakes, the field size increases. The level of n increases after the successful completion of the third block. Conversely, the field size decreases after more than five mistakes, but the level of n decreases only after three unsuccessful blocks. After each block, the participant receives visual performance feedback (Fig. 1B).
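The adaptivity rule described above can be sketched as a small state update. This is only an illustration under stated assumptions: the starting level of n = 1, the unchanged difficulty for three to five mistakes, the counter of unsuccessful blocks (reset whenever n changes), and the restart at the smallest field after a change of n are all hypothetical choices, since the text does not specify them.

```python
from dataclasses import dataclass

# Field sizes (grid compartments) of the three blocks per n level, as described
FIELD_SIZES = (4, 8, 11)


@dataclass
class NBackState:
    n: int = 1              # assumed starting level
    field_index: int = 0    # position in FIELD_SIZES
    failed_blocks: int = 0  # unsuccessful blocks since the last change of n (assumption)


def update(state: NBackState, mistakes: int) -> NBackState:
    """Return the difficulty state for the next block, given the mistakes in this block."""
    n, idx, failed = state.n, state.field_index, state.failed_blocks
    if mistakes < 3:
        # Successful block: larger field, or next n level after the third block
        if idx == len(FIELD_SIZES) - 1:
            n, idx, failed = n + 1, 0, 0
        else:
            idx += 1
    elif mistakes > 5:
        # Unsuccessful block: smaller field; lower n only after three such blocks
        failed += 1
        idx = max(idx - 1, 0)
        if failed >= 3 and n > 1:
            n, idx, failed = n - 1, 0, 0
    # Three to five mistakes: difficulty assumed to stay unchanged
    return NBackState(n, idx, failed)


state = NBackState()
for mistakes in (0, 2, 1):  # three successful blocks in a row
    state = update(state, mistakes)
print(state.n, FIELD_SIZES[state.field_index])
```

Separating the two thresholds (fewer than three versus more than five mistakes) leaves a band in which difficulty is held constant, which keeps the level from oscillating after every block.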

Farm Span Task

At the encoding stage, the participant is presented with a sequence of animals, each shown either the right way up or upside-down. The participant must determine the animal’s orientation as quickly as possible by pressing the right or the left mouse button. If the participant waits longer than 3000 ms to answer, they receive a reminder to respond more quickly. At the recall stage at the end of each animal sequence, all the animals appear on the screen. The participant is prompted to reproduce the sequence of initial presentations by clicking on the animals (Fig. 1C). The participant receives visual performance feedback, and a bar indicates the level reached (Fig. 1D). The following sequence length increases by one if the participant has responded quickly enough and without mistakes. Conversely, the following sequence length decreases by one if the sequence contains mistakes.
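The span adjustment can be written as a one-step rule. The sketch below is illustrative only: the minimum length of two and the unchanged length after a correct but slow sequence are assumptions, because the description covers only the increase and decrease cases.

```python
def next_sequence_length(current: int, all_correct: bool, fast_enough: bool,
                         minimum: int = 2) -> int:
    """Sequence length for the next trial of the span task.

    `minimum` is a hypothetical floor; it is not reported in the task description.
    """
    if all_correct and fast_enough:
        return current + 1                 # quick and error-free: one item longer
    if not all_correct:
        return max(current - 1, minimum)   # mistakes: one item shorter
    return current                         # correct but slow: assumed unchanged


print(next_sequence_length(4, all_correct=True, fast_enough=True))
print(next_sequence_length(4, all_correct=False, fast_enough=True))
```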

For both training tasks, the mean task level of every training session served as the dependent variable that defined training performance; the difference between the last two training sessions and the first two training sessions served as the dependent variable defining training gain.
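The two dependent variables defined above can be computed as follows. The data layout (one list of task levels per session) and the example values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean
from typing import List, Sequence


def training_performance(levels_per_session: Sequence[Sequence[float]]) -> List[float]:
    """Mean task level of every training session."""
    return [mean(levels) for levels in levels_per_session]


def training_gain(session_means: Sequence[float]) -> float:
    """Mean of the last two sessions minus mean of the first two sessions."""
    if len(session_means) < 4:
        raise ValueError("need at least four sessions to compare first two and last two")
    return mean(session_means[-2:]) - mean(session_means[:2])


# Hypothetical levels reached within five sessions of one child
sessions = [[2, 2], [2, 3], [3, 3], [3, 4], [4, 4]]
means = training_performance(sessions)
print(means, training_gain(means))
```

Averaging over two sessions at each end makes the gain score less sensitive to a single unusually good or bad session.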

Perceptual Training Tasks

We applied two auditory-visual matching tasks (Fig. 2; Audilex; Karma, 1989), in which sound patterns with 3–15 elements were graphically presented on the screen as horizontal sequences of rectangles. Synchronously, sound elements that varied in pitch, duration, and intensity were presented through headphones. These variations were visually represented by the vertical position, length, and thickness of the rectangles on the screen. Importantly, relevant characteristics such as immediate feedback and game-like features such as pictures and colors were also present in this task (Mohammed et al., 2017; Moret-Tatay et al., 2016). The participant plays both tasks, typically starting the session by playing task 1 and then continuing with the more difficult task 2.

Fig. 2

Perceptual-matching training tasks 1 and 2 applied in the control group. Note. A Task design; B task examples of task 1, in which participants are required to choose the matching pattern, and task 2, in which participants are required to press a key as soon as the pattern is complete

In task 1, two patterns appear on the screen. Two seconds later, a tone sequence beeps through headphones, and the software prompts the participant to indicate which of the visual patterns corresponds to the presented sound pattern. In task 2, only one pattern appears visually on screen, while the corresponding sound sequence plays simultaneously. The software directs the participant to follow the pattern and press the spacebar as soon as the sound corresponding to the last element of the visual pattern plays.

Smiling faces appear on the screen after each correct response, whereas the same sequence repeats in case of an incorrect response. Easy and complex patterns are presented randomly throughout each session. The task is adaptive in that the participant can change the stimulus-onset asynchrony (SOA; between 200 and 1800 ms) and the sound duration (within a window of 30–80% of the SOA, i.e., 60–1440 ms) to make the task harder. The participant can also change the musical instrument on which the sound plays according to preference (e.g., trumpet, flute, violin, piano). The software guides the participant to change the duration and instruments in the tasks during the training period.
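The reported duration limits follow directly from the SOA range: 30% of the shortest SOA and 80% of the longest SOA. A brief sketch of this arithmetic, with a hypothetical function name:

```python
from typing import Tuple


def sound_duration_window(soa_ms: int, low_pct: int = 30, high_pct: int = 80) -> Tuple[float, float]:
    """Shortest and longest sound duration (ms) allowed for a given SOA."""
    return (soa_ms * low_pct / 100, soa_ms * high_pct / 100)


shortest, _ = sound_duration_window(200)    # lower bound at the shortest SOA
_, longest = sound_duration_window(1800)    # upper bound at the longest SOA
print(shortest, longest)                    # overall limits across the SOA range
```

For a 1000-ms SOA the window is 300–800 ms.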

Training task performance was measured at the first and last training session with a test version of task 2. This included a set of 30 audiovisual matching tasks, each presented with a 1000-ms SOA and a sound duration of 550 ms throughout the test. The number of hits served as the dependent variable, registered as space-bar presses occurring during the time window in which the last sound of the pattern was played.

Measures for Cognitive and Scholastic Abilities

We chose a battery of measures covering the cognitive and scholastic areas that were of interest regarding the hypotheses.

General Cognitive Ability

Nonverbal intelligence was assessed using the revised German adaptation of the Culture Fair Intelligence Test (CFT 20-R; Weiss, 2006). We used a short form suitable for young children, consisting of four subtests: series, classifications, matrices, and topologies. The composite score of the four subtests served as the dependent variable.

Proxy for Crystallized Intelligence (Gc)

The German vocabulary intelligence test taken from the revised CFT 20-R was administered to measure crystallized intelligence (CFT-WS; Weiss, 2006). The test consists of 30 keywords that are not part of the basic vocabulary of the German language. For each of the keywords, the software directs the participant to choose the word with the same or closest meaning from a sample of five words; the test ends after 6 min. The number of correct word choices served as the dependent variable.

Working Memory

WM capacity was assessed with a backward color recall task, conducted individually with each participant (Roebers & Kauer, 2009). The participant is presented with a sequence of colored discs on a computer screen; each disc appears for 1 s. At the end of each sequence, a dwarf appears on the screen. The software invites the participant to help the dwarf collect the discs by recalling the presented sequence in reverse order. Sequence length is two at the beginning and increases by one item when the participant has correctly recalled two of three sequences at a particular level. The dependent variable was the number of correctly reproduced sequences.
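The scoring of this task can be sketched as follows. The stopping rule (the test ends once fewer than two of the three sequences at a level are recalled correctly) is an assumption, as the description states only when the sequence length increases; the function names are likewise hypothetical.

```python
from typing import List, Sequence, Tuple


def is_correct_backward(presented: Sequence[str], response: Sequence[str]) -> bool:
    """A response is correct if it reproduces the presented colors in reverse order."""
    return list(response) == list(reversed(presented))


def backward_span_score(results_by_level: Sequence[Sequence[bool]],
                        start_length: int = 2) -> Tuple[int, int]:
    """Return (number of correctly reproduced sequences, last sequence length administered).

    `results_by_level` holds the outcomes of the three sequences at each level,
    in order of administration.
    """
    score = 0
    length = start_length
    for i, trials in enumerate(results_by_level):
        length = start_length + i
        correct = sum(1 for ok in trials if ok)
        score += correct
        if correct < 2:   # assumed stopping rule
            break
    return score, length


print(is_correct_backward(["red", "blue", "green"], ["green", "blue", "red"]))
print(backward_span_score([[True, True, True], [True, False, True], [False, False, True]]))
```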

Math

We used a standardized curriculum-based math test (DEMAT 2+/3+/4+; Krajewski, Liehm, & Schneider, 2004). The five subtests deal with characteristics of numbers, comparison of numbers, addition and subtraction, duplication and bisection, and division. The sum of all arithmetic subtest scores was z-scored for each class, and we integrated the standardized variables into one dependent variable reflecting math ability.
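Standardizing within each class places scores from different classes on a common scale. A minimal sketch of that step is given below; the use of the sample standard deviation (n − 1) and the data layout are assumptions, and the example values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, List, Sequence, Tuple


def zscore_within_class(scores: Sequence[Tuple[str, float]]) -> List[float]:
    """z-score each child's summed math score relative to their own class.

    `scores` is a sequence of (class_id, raw_sum) pairs; the result keeps the input order.
    Each class needs at least two children with non-identical scores.
    """
    by_class: Dict[str, List[float]] = defaultdict(list)
    for class_id, raw in scores:
        by_class[class_id].append(raw)
    stats = {c: (mean(v), stdev(v)) for c, v in by_class.items()}
    return [(raw - stats[c][0]) / stats[c][1] for c, raw in scores]


# Hypothetical raw sums for two classes
data = [("3a", 10), ("3a", 20), ("3a", 30), ("4b", 5), ("4b", 15)]
print(zscore_within_class(data))
```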

Reading

We used a German reading diagnostic test to assess reading ability (LDL; Walter, 2010). In this test, the participant is prompted to read out a text for 1 min as quickly and accurately as possible. The number of correctly read words serves as the dependent variable.

Questionnaires for Personal Variables

Neuroticism

The authors applied the Hamburger assessment of neuroticism and extraversion (HANES—KJ, Form 1; Buggle & Baumgärtel, 1975). The instrument is a self-report questionnaire for children and adolescents based on Eysenck’s model of personality (Eysenck, 1967). For this study, we used only the neuroticism subscale. The instructor read questions aloud to the class, and participants responded by marking yes or no for each question on the questionnaire. Cronbach’s alpha was 0.90.

Conscientiousness

Since conscientiousness is difficult for children to evaluate (Tackman et al., 2017), we used the conscientiousness subscale of the parent-reported five factors questionnaire for children (FFK, Asendorpf & van Aken, 1999). Cronbach’s alpha was 0.93.

Effortful Control

To obtain information on participants’ effortful control, reflecting the temperament category self-regulation (Rothbart, Derryberry, & Posner, 1994), we used a short questionnaire for parents (nine items). The questions came from the Children’s Behavior Questionnaire (CBQ, Blair & Razza, 2007; Putnam & Rothbart, 2006). Parents indicated on a 7-point Likert scale how well a description of behavior fitted that of their child. Questions refer to attention (“shows strong concentration when drawing or coloring in a book”), inhibitory control (“is good at following instructions”), emotion (“gets angry when s/he can’t find something”), and approach (“becomes very excited before an outing”). Cronbach’s alpha was 0.95.

Power of Endurance

Teachers rated participants’ power of endurance with four items drawn from the Intelligence and Development Scale (IDS; Meyer et al., 2009). Questions referred to the participant’s endurance to execute and finish a task even when experiencing difficulties or tiredness. Teachers indicated their level of agreement for every statement on a 4-point Likert scale. Cronbach’s alpha was 0.94.

Learning Enjoyment

Teachers reported the general learning enjoyment of the participant with eight questions drawn from the IDS (Meyer et al., 2009). Questions referred to the joy of the participant in learning and understanding new information. Teachers indicated their level of agreement for every statement on a 4-point Likert scale. Cronbach’s alpha was 0.92.

Social Integration in Class

To find out about the degree of social integration of the students, we asked the teachers to evaluate several statements. These included the following: “The participant has positive social interactions”; “The participant is well integrated into the group”; and “The participant can cooperate in teamwork.” Teachers indicated their level of agreement for every statement on a 4-point Likert scale. Cronbach’s alpha was 0.91.

Results

Training Performance

Both groups improved their performance in the training tasks (Figs. 3 and 4). Participants in the WM training group significantly improved their performance from the first two training sessions to the last two sessions in both the farm span task, t(40) = 3.05, p = 0.004, d = 0.47, and the jumping animal task, t(40) = 4.87, p < 0.001, d = 1.82. Participants in the control group significantly improved their response accuracy in the visual-auditory matching task from the first to the last training session, t(39) = 4.26, p < 0.001, d = 0.673.

Fig. 3

Working memory training performance

Fig. 4

Training task performance in the first and last training session in both training groups. a Working memory training. b Perceptual training

Personal Measures and Training Improvements

None of the personal variables was associated with initial performance in the WM training tasks (with r ranging from 0.06 (endurance) to 0.20 (effortful control), both n.s.). To test whether any of the personal variables would help predict training task performance, we analyzed the correlations of these variables with average WM training level and training task improvement in both groups. Training task improvement is the difference between the first training session and the final training session in both WM training tasks (average z-scores of span and n-back tasks) and in the test version of the perceptual training task, respectively. The correlations appear in Table 1.

Table 1 Correlations of personal variables with training and transfer performance

Regarding WM training, neuroticism related to decreased training performance (p = 0.065) and training gain (p < 0.05). In contrast, conscientiousness positively related to training mean (p = 0.11) and marginally significantly related to training gain (p = 0.049). Effortful control was positively associated with mean WM training performance (p < 0.05), but it was not related to training gain. Power of endurance was related to higher training mean (p < 0.01) and training gain (p < 0.01). The joy of learning was unrelated to training mean but strongly related to training gain (p < 0.01). Finally, social integration was associated with training mean (p < 0.05) and training improvement (p < 0.001). The scatterplots of the relationships between the personal variables and WM training gain appear in Fig. 5. There were no significant correlations between the personal variables and perceptual training task gain.

Fig. 5

Association between working memory training gain and personal variables. Correlations between working memory training improvements in z-scores and a neuroticism, b conscientiousness, c effortful control, d endurance, e joy of learning, and f social integration

We also collected responses to four questions eliciting training feedback: (1) How motivated were you for the training? (2) How much effort did you put into the training? (3) How challenging did you find the training intervention? and (4) To what extent do you feel it has improved your cognitive abilities? Notably, there were no group differences between the WM training and control groups for the mean feedback score (F(1, 71) = 0.64, n.s., d = 0.19). We did not find any significant association between these motivational variables and either training or transfer performance in the two groups (r ranging from 0.03 to 0.12).

Transfer Performance

We conducted ANOVAs for repeated measures for the transfer variables with the factors group (WM training group vs active control group) and time (pre- and post-training assessment). To control for any baseline group differences (Lord’s paradox), we also conducted ANCOVA on posttest measures with pretest measures as the covariate. We found no difference in direction or magnitude of effects and therefore only included the analyses of the ANOVA for repeated measures. We conducted post hoc analyses of differences of means (Δ), corrected for multiple comparisons with the Bonferroni correction. We also computed the within group changes by calculating the effect size Cohen’s d with the correction for repeated measures as proposed by Morris (2007). The resulting transfer effects are presented in Table 2 and Fig. 6.
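For readers unfamiliar with repeated-measures effect sizes, the sketch below shows one widely used correction, in which the standardized mean change is rescaled by the pre-post correlation: d_rm = (M_diff / SD_diff) · sqrt(2(1 − r)). The text cites Morris (2007) without giving a formula, so this variant is an assumption about the kind of computation involved and not necessarily the exact one applied; the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cohens_d_rm(pre: Sequence[float], post: Sequence[float]) -> float:
    """Within-group Cohen's d, corrected for the pre-post correlation (assumed variant)."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain one score per participant")
    diffs = [b - a for a, b in zip(pre, post)]
    d_z = mean(diffs) / stdev(diffs)          # standardized mean change
    r = pearson_r(pre, post)
    return d_z * sqrt(2 * (1 - r))            # rescale to the raw-score metric


# Hypothetical pre-test and post-test scores of four children
print(round(cohens_d_rm([1, 2, 3, 4], [3, 2, 5, 4]), 3))
```

When pre-test and post-test standard deviations are equal, this expression reduces to the mean change divided by that common standard deviation, which makes it comparable to a between-group d.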

Table 2 Pre-test and post-test performance in cognitive and academic tasks
Fig. 6

Performance in transfer measures before and after cognitive training. Mean pre-test (T1) and post-test (T2) performance and standard deviation in the transfer variables for the WM training group (left) and the control group (right). *p < 0.05; **p < 0.01 (Bonferroni-corrected differences of means)

Notably, the WM training and control groups did not differ in their performance at pre-test (all t < 0.65, p = n.s.; see supplementary material).

Transfer to Cognitive Measures

Working Memory

There was no main effect of time on the improvement in visual WM, F(1,78) = 1.129, p = 0.291, ηp2 = 0.014, but a marginally significant interaction between time and group, F(1,78) = 3.846, p = 0.051, ηp2 = 0.05. Post hoc analysis revealed a significant improvement in the WM training group (Δ = 0.82, p = 0.038) but not in the control group (Δ = −0.24, p = 0.522).

Fluid Intelligence

The improvement in nonverbal intelligence performance between pre-test and post-test reached significance across groups, F(1,74) = 22.98, p < 0.001, ηp2 = 0.24, and the time ⨯ group interaction was not significant, F(1,74) = 0.126, p = 0.724, ηp2 = 0.002. Post hoc analyses confirmed the improvement in nonverbal intelligence in both the WM training group (Δ = 6.48, p < 0.001) and the control group (Δ = 5.58, p = 0.003).

Crystallized Intelligence

Pre- to post-intervention improvements were found in vocabulary performance, F(1,73) = 11.443, p = 0.001, ηp2 = 0.14. The time ⨯ group interaction was not significant, F(1,73) = 2.247, p = 0.138, ηp2 = 0.03. Post hoc analyses revealed a significant improvement in the WM training group (Δ = 1.87, p = 0.001), but not in the control group (Δ = 0.72, p = 0.196).

Transfer to Academic Abilities

Mathematics

There was no main effect of time on the improvement in mathematical abilities, F(1,69) = 2.46, p = 0.122, ηp2 = 0.03, but there was a significant time ⨯ group interaction, F(1,69) = 6.45, p = 0.013, ηp2 = 0.09. Post hoc analyses revealed a significant improvement in mathematical ability in the WM training group (Δ = 0.35, p = 0.006), but not in the control group (Δ = 0.09, p = 0.485).

Reading

We found overall improvements in reading, F(1,79) = 23.36, p < 0.001, ηp2 = 0.23, independent of intervention group, F(1,79) = 0.00, p = 0.987, ηp2 = 0.00. Post hoc analyses confirmed the improvement in reading skill in both the WM training group (Δ = 7.00, p = 0.001) and the control group (Δ = 6.95, p = 0.001).

Personal Variables and Transfer Performance

Possible associations between personal variables and transfer measures were analyzed using correlational analysis, analysis of variance (ANOVA), and analysis of covariance (ANCOVA). There were no significant results.

Discussion

This study examined how training task features and trainees’ characteristics influence children’s WM training outcomes in an elementary school setting. To do so, we compared the outcomes of a WM training group with those of a control group that trained with perceptual-matching tasks. We found that the WM training group showed a significant increase in math performance and in a WM task, compared to the control group. Post hoc analyses revealed a small improvement in vocabulary after WM training compared to a null effect in the control group. No differential training effects were found for fluid intelligence and reading. We also found that several personal factors positively influence children’s WM training performance. These include endurance, personality factors of low neuroticism and high conscientiousness, and school-related factors of the joy of learning and social integration. Thus, our results suggest that training-induced effects may depend on the demands of the cognitive training tasks as well as on participants’ personal, regulatory, and school-related characteristics.

The following discussion first considers training outcomes for various transfer measures, then discusses individual differences, and finally integrates the results and draws conclusions.

To start with WM performance, a marginally significant interaction between group and transfer performance indicates that WM task performance improved more after WM training than after perceptual-matching training. This finding supports the training effect on WM capacity that we previously found in a sample of school children (Studer-Luethi et al., 2016). It is in line with meta-analyses demonstrating transfer effects to WM measures (near transfer; e.g., Soveri et al., 2017; Weicker et al., 2016). Because our transfer WM measure (backward color recall task) differed structurally from the trained WM tasks, it seems plausible that the training-induced performance improvement arose from improved processing rather than improved strategy. This assumption is supported by the finding of improved functional connectivity in the attentional network of school children following WM training (Sánchez-Pérez et al., 2019) and by reported associations between increased WM performance and changes in cerebral activity following training (Astle et al., 2015; Brehmer et al., 2011; Olesen et al., 2003; Stevens et al., 2016).

However, it is also possible that the participants learned strategies during WM training with the complex span task (e.g., grouping items) and later applied that learning to the WM measure (strategy mediation hypothesis; Dunning et al., 2013; Peng & Fuchs, 2017; Malinovitch et al., 2020). One important caveat is that this study’s participants were children, who generally show greater plasticity than adults (Zhao et al., 2018).

On the vocabulary task, a measure of crystallized intelligence, significant improvement was observed only in the WM training group (increase of 1.9 points) and not in the control group (increase of 0.7 points). However, the interaction between group and test performance was nonsignificant. The transfer effect was approximately as strong as in our previous study (Studer-Luethi et al., 2016) and similar to the result of Alloway et al. (2013). WM capacity is a crucial factor for learning and the ability to retrieve knowledge (Gathercole et al., 2006). We speculate that WM training increased these abilities by improving WM processes.

We found no difference between the WM training group and the control group in performance changes on a matrix test, our measure of fluid intelligence; both groups improved by between 5.5 and 6.5 points. Transfer to intelligence after WM training has been extensively reviewed in the literature, with inconsistent conclusions (Au et al., 2015; Melby-Lervåg et al., 2016). Our findings join a growing body of research that demonstrates small or no WM training-induced effects on transfer to reasoning or fluid ability tasks (far transfer), in comparison to active or passive control groups (Gathercole et al., 2019; Soveri et al., 2017). As we have no comparison to a no-contact control, we cannot rule out a mere retest effect. Also, the intervention length was quite short, which itself is a limiting factor (Pergher et al., 2020). Nevertheless, both of our experimental groups completed training tasks with high attentional demands; thus, increased attentional processing may have improved their reasoning performance. More research is needed to identify training task characteristics that promote fluid abilities.

Regarding academic abilities, we found evidence of improved performance in a standardized math test after WM training (d = 0.35) in comparison to a control training (d = 0.09). To the best of our knowledge, this study is the first to find improved arithmetic performance after short, intense WM training with typically developing primary school children. Similarly, another study found significant improvements in math school performance among children who participated in a computer-based WM and math training in comparison to children with no training (Sanchez-Perez et al., 2018). Furthermore, other WM training studies demonstrate transfer to the arithmetic abilities of children with some intellectual or academic difficulties or special needs (Bergman-Nutley & Klingberg, 2014; Dahlin, 2013; Holmes & Gathercole, 2014; Layes et al., 2018; Nelwan et al., 2018; Zhang et al., 2018). Interestingly, another study demonstrated that the other direction seems to work too: training in mental calculation enhanced visuospatial WM in children (Wang et al., 2019). These and the present results are supported by the finding that mathematical abilities and counting strategies in young children are heavily reliant on WM (Bull & Lee, 2014; Lee et al., 2009), so we assume that a training-induced increase in WM processes improved mathematical performance. Alternatively, one can propose an effect of WM training on learning capacity (Nutley & Söderqvist, 2017; Söderqvist & Nutley, 2015). Given that skills in mathematics are increasingly crucial to educational and career success across the lifespan (Geary, 2013), this result is promising.

In contrast, we did not find substantial differences between the effects of our WM training and those of the perceptual training on reading performance. Both training groups improved their performance in the reading test to a similar degree (around 7 points). This finding supports our previous result, in which improvement in reading performance following WM training was as large as following reading training (Studer-Luethi et al., 2016). Other study results are inconsistent, in that several demonstrated a positive impact on reading after WM training or after interventions similar to our perceptual training that foster phonological awareness (Karbach et al., 2015; Kujala et al., 2001; Pfost et al., 2019) while others did not (Ang et al., 2015; Chacko et al., 2014; Dunning et al., 2013). Again, as we have no comparison to a no-contact control, we cannot rule out a mere retest effect. WM capacity, efficient processing, and matching visual and auditory information are required for fast and accurate reading (Phillips et al., 2016). Thus, we speculate that a training-induced increase in these processes may have some positive impacts on reading abilities.

We now move to the results about the characteristics of trainees, starting with personality. We found a moderate inverse relationship between neuroticism and training gain (r = −0.42), which was comparable to previously found associations in young adults (r = −0.25, r = −0.24; Minear et al., 2016; Studer-Luethi et al., 2012) and children (r = −0.32; Studer-Luethi et al., 2016). This finding supports the notion that neuroticism can diminish WM performance, probably through emotional and cognitive resource-demanding interference, such as stressful thoughts and anxiety (cf. Derakshan & Eysenck, 2009). In contrast, conscientiousness was positively related to training gain (r = 0.28), which was similar to previous findings (r = 0.28; Studer-Luethi et al., 2016; Studer-Luethi et al., 2012). Likewise, this finding is in line with the repeatedly found positive relation between conscientiousness and training outcomes, which is plausible because conscientious participants tend to be more motivated to excel and improve their skills (e.g., Woods et al., 2016). Thus, we assume that lower neuroticism and higher conscientiousness lead to higher focus, commitment, and motivation in the context of a cognitive training assignment.

The self-regulatory factor of effortful control was unrelated to training gain but positively related to training performance (r = 0.37), similar to the association found in another sample of children (r = 0.33; Studer-Luethi et al., 2016). In addition, we found that children’s power of endurance had an impact on their training mean and training gain. This finding corroborates other results showing that children with good self-regulation can inhibit a dominant response, such as feeling tired or not motivated, and regulate external (e.g., noise in the classroom) and internal distraction (i.e., feeling unmotivated; Blair & Diamond, 2008; Eisenberg et al., 2004). Thus, we assume that effortful control and power of endurance may have helped the children efficiently regulate emotional and cognitive processes to perform well in the WM training tasks.

Teachers’ evaluation of children’s joy of learning was a strong predictor of high training gain (r = 0.67). In contrast, children who did not enjoy learning or being challenged in school did not improve their scores in the training tasks. Indeed, the joy of learning seems more critical for training than training-related motivation or enjoyment (Söderqvist et al., 2012). Comparably, another study demonstrated that children’s desire to master school learning was more closely related to short-term improvements after WM training than WM processes (Pascoe et al., 2019). Thus, we assume that children’s prior enjoyment of challenges may be critical in facilitating WM training outcome.

Interestingly, we also found a strong association between training performance and the estimated degree of social integration (r = 0.68), in that only children with higher social integration reached above-average training scores and improved their scores throughout the training phase. In line with that, other research shows that higher peer popularity is related to higher WM performance, while peer rejection lowers performance (de Wilde et al., 2016; McQuade et al., 2013). Furthermore, children with emotionally secure relationships have shown enhanced task involvement and persistence (Koomen et al., 2004; Thijs & Koomen, 2008). The regulatory depletion model can serve as a theoretical framework explaining such effects, by stating that social processes can reduce a shared pool of resources (Davies et al., 2008). In our case, social interference, such as comparisons with or comments from others, may have reduced the cognitive resources that less integrated participants had available for the training. This link highlights the importance of considering the social context in which a cognitive intervention with children is implemented.
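To put the correlations discussed in this section on a common scale, each coefficient can be squared to give the proportion of variance shared between a personal variable and the training measure. The sketch below does this arithmetic for the coefficients reported in the text; the dictionary labels are illustrative shorthand, and the squared values describe shared variance only and do not imply causation.

```python
# Correlations between personal variables and training measures,
# copied from the coefficients reported in the text.
correlations = {
    "neuroticism (training gain)": -0.42,
    "conscientiousness (training gain)": 0.28,
    "effortful control (training performance)": 0.37,
    "joy of learning (training gain)": 0.67,
    "social integration (training performance)": 0.68,
}

# Squaring r gives the proportion of variance the two measures share.
shared_variance = {name: r ** 2 for name, r in correlations.items()}

for name, r_squared in shared_variance.items():
    r = correlations[name]
    print(f"{name}: r = {r:+.2f}, shared variance = {r_squared:.0%}")
```

On this scale, the joy of learning and social integration each share a little under half of the variance with the respective training measure, whereas conscientiousness shares under a tenth.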

Finally, transfer variance after training could not be explained by personal variables. These personal characteristics do not seem to act as significant facilitators of or obstructions to benefits following WM training. This result indicates that the positive transfer effects of WM training on the cognitive and academic measures found in this study seem to stem from the assumed WM processes and not from confounding variables of personality, self-regulation, or motivation.

Limitations

The study has several main limitations. One limitation is the small sample size, resulting in weak statistical power and rather exploratory statistical analyses. Another limitation is the lack of a no-contact control group. However, most authors (e.g., Redick et al., 2013; Vernucci et al., 2022) advocate using active control groups over simple no-contact control groups. A recent meta-analysis concluded that passive and active controls do not differ meaningfully in their performance (Au et al., 2020).

Furthermore, participants were not blinded, as they completed the training in mixed groups within their school classes. They could see each other’s training, which might have influenced the children’s training motivation or performance. Finally, the application of conceptually different cognitive training tasks makes it more difficult to draw inferences about underlying mechanisms of transfer. Nevertheless, an advantage of this study’s approach is that specific types of training tasks are available.

Conclusions and Implications

This study demonstrates some evidence for cognitive and academic benefits, namely in WM and math performance, after WM training implemented in a school with second- to fourth-grade children. These benefits were significantly stronger than after training with a perceptual-matching task, suggesting that WM training tasks that place high demands on active processing, storage, and retrieval of information improve cognitive performance more than do training tasks with low demands on these factors. We also found a small transfer effect on vocabulary in the WM training group, as post hoc analyses showed, even though the effect of experimental condition did not reach significance. Finally, we found no differential training effect but overall improvements in reading and fluid intelligence. Thus, we found no generalized effects, which aligns with other inconsistent results in this research field, but the effects we did find are nevertheless promising. While some meta-analyses conclude that WM training with typically developing children yields no benefits (Sala and Gobet, 2017), our results suggest that there is still reason to keep investing resources in WM training research with children. Undeniably, more research is needed to identify underlying factors of WM training, and neuroimaging signatures of intervention effects are needed in addition to the behavioral results reported in this study (cf. Tymofiyeva & Gaschler, 2021).

No less importantly, the data reveal the strong impact of personality and regulatory factors, the joy of learning, and social integration on training performance. The interindividual factors influenced children’s success in the training so that children with positive scores for these factors could increase their training performance throughout the training phase. These are meaningful findings for learning activities in general because they indicate that individual variables may determine how much children want to learn and progress. If the scores on these variables are low, personal variables should be the target of interventions alongside content teaching.

Furthermore, the finding that interindividual variables influence training highlights the importance of ascertaining which children benefit most from computerized training. For example, it might not be very beneficial to advise a student with a low joy of learning or effortful control to undertake WM training because the training tasks are challenging and repetitive and thus demand high self-control. For such a participant, strategy instructions and real-world activities such as martial arts interventions may be more beneficial (Blair & Raver, 2014; Lakes & Hoyt, 2004). Moreover, teachers and other caregivers should consider encouraging children to approach training with curiosity and the motivation to learn new things, but without pressure. Assessments of individual differences might help children become aware of their approach to learning so that they can adapt it during training.

These results have implications for the utility of WM training in institutional settings such as schools (Rode et al., 2014): In addition to subject-specific teaching, underlying essential functions, such as WM capacity and self-regulation, can be fostered with computerized WM training alongside other interventions, such as behavioral and mindset interventions, physical activity, imaginary play, as well as phonological awareness and inhibition interventions (see Rowe et al., 2019).

Our results contribute to the field by demonstrating that WM training implemented in a regular school setting can foster some cognitive and academic performance. Interindividual factors need to be considered in further training studies, as our results demonstrate that training success is strongly associated with personal and school-related factors such as the power of endurance, the joy of learning, and social integration.