Benefit of enactment over oral repetition of verbal instruction does not require additional working memory during encoding

Yang, Tianxiao; Gathercole, Susan E.; Allen, Richard J.

doi:10.3758/s13423-013-0471-7

Brief Report
Open access
Published: 02 July 2013

Volume 21, pages 186–192, (2014)

Psychonomic Bulletin & Review

Abstract

For this research, we used a dual-task approach to investigate the involvement of working memory in following written instructions. In two experiments, participants read instructions to perform a series of actions on objects and then recalled the instructions either by spoken repetition or performance of the action sequence. Participants engaged in concurrent articulatory suppression, backward-counting, and spatial-tapping tasks during the presentation of the instructions, in order to disrupt the phonological-loop, central-executive, and visuospatial-sketchpad components of working memory, respectively. Recall accuracy was substantially disrupted by all three concurrent tasks, indicating that encoding and retaining verbal instructions depends on multiple components of working memory. The accuracy of recalling the instructions was greater when the actions were performed than when the instructions were repeated, and this advantage was unaffected by the concurrent tasks, suggesting that the benefit of enactment over oral repetition does not cost additional working memory resources.

Performing actions to command requires that a series of operational steps be memorized with the intention to subsequently carry them out, as is common in everyday activities. Many earlier studies have focused on instructions transmitted either verbally (De Renzi & Vignolo, 1962; Engle, Carullo, & Collins, 1991; Gathercole, Durling, Evans, Jeffcock, & Stone, 2008; Kaplan & White, 1980; Kim, Bayles, & Beeson, 2008; Lesser, 1976) or through demonstration (Meltzoff & Prinz, 2002). Although written instructions are commonly encountered in daily life, the cognitive processes underpinning this mode of directing behavior have been less intensively investigated. This form of instruction is the focus of the present research.

The ability to follow instructions is closely related to the capacity to temporarily hold and manipulate information, an ability known as working memory (WM; Brener, 1940; Engle et al., 1991; Gathercole et al., 2008). The link was first noted by Brener, who found correlations between performance accuracy for simple instructions and digit span, a measure of short-term memory. The token test, which involves the execution of a series of actions upon oral commands—for instance, “after picking up the green rectangle, touch the white circle” (De Renzi & Vignolo, 1962)—was also found to be significantly correlated with verbal, visual, and motor aspects of short-term memory (Lesser, 1976; Wold & Reinvang, 1990), though as this test was developed to discriminate subtle oral comprehension difficulties in aphasic patients, grammatical complexity also varied with the length of the instructions.

The ability to successfully encode and act on instructions is particularly important in educational settings, where children with poor WM show particular difficulty remembering and following classroom-based instruction (Gathercole, Lamont, & Alloway, 2006). In line with this finding, Engle et al. (1991) found positive correlations between WM capacity and children’s ability to manually implement spoken instructions analogous to those often encountered in classrooms. Gathercole et al. (2008) reported similar findings in 5-year-old children using spoken instructions such as “touch the red pencil, then pick up the blue ruler and put it in the black box,” and they also found that children were substantially better at enacting than at verbally repeating such instructions. Gathercole et al. (2008) speculated that this advantage arises from the formation of a motoric or spatial representation linking objects to physical movements. A similar advantage has also been observed in young adults, using written instructions (Koriat, Ben-Zur, & Nussbaum, 1990).

In the present work, we used a dual-task methodology to explore how the components of the multicomponent model of WM (Baddeley, 2000; Baddeley & Hitch, 1974) might contribute to encoding instructions for enactment versus verbal recall in a healthy young adult population. Constructing and storing detailed action plans may be an effortful process, possibly engaging the central executive, a limited-capacity resource responsible for the attentional control of WM (Baddeley, 1996, 2007). The central executive might contribute to several key processes in remembering instructions: for example, dividing attention between reading instructions and monitoring the locations of relevant objects, associating specific movements with objects, and keeping track of task progress (Gathercole & Alloway, 2008). More generally, the construction of a spatial/motoric representation of instruction sequences in the action condition may be attentionally demanding. This is supported by the higher correlations between instruction performance and backward digit recall (which loads on both verbal storage and the central executive) than forward digit recall (depending primarily on verbal storage) that were observed by Gathercole et al. (2008) in 5-year-old children. We therefore predicted that concurrent backward counting, a task widely used to disrupt this component of WM (Allen, Baddeley, & Hitch, 2006; Baddeley, Hitch, & Allen, 2009), would lead to reduced accuracy in general, but also have a relatively larger adverse effect on action than on verbal recall.

The visuospatial-sketchpad component, providing visual and spatial storage and incorporating a spatial maintenance mechanism (Logie, 1995), may also be important. This component may maintain visual codes of written words during reading (Logie, 2003), and spatial/motoric representations of the action sequences (Gathercole et al., 2008). If so, disruption by a complex spatial-tapping activity should particularly impair the performance of instructions. In contrast, when the task simply involves verbal repetition, the storage of a phonological representation of the sentence may be sufficient. According to Baddeley (1986), verbal STM is based on a short-term store (the phonological loop) supplemented by a subvocal rehearsal mechanism that can recode visual information and offset decay. The loop is also involved in integrating information across saccades during reading (Rayner, 1998). It was predicted that disrupting the phonological loop via articulatory suppression (the repetition of irrelevant digits during encoding) should particularly impair the accuracy of verbal recall.

In the present experiments, we investigated how these WM subcomponents may contribute to the recall of instructions. In each of two experiments, participants read a sentence describing a series of action commands involving operations upon different colored objects, such as “push the black pencil and spin the green rubber, then pick up the red pencil and put it into the blue folder and touch the white bag.” At test, participants either performed the instructions on objects placed in front of them or recalled the instructions aloud. In Experiment 1, we used the dual-task methodology to examine the contributions of the phonological loop and the central executive during encoding, whereas in Experiment 2 we compared verbal and visuospatial WM.

Experiment 1

Method

Participants

A group of 24 native English speakers 18 to 28 years of age (20 females, four males), all students at the University of York, took part in exchange for course credit or payment.

Materials

The object set consisted of 12 objects, including six smaller items (yellow ruler, blue ruler, white eraser, green eraser, red pencil, and black pencil) and six containers (black box, red box, yellow bag, white bag, blue folder, and green folder). We used five types of movement (touch, pick up . . . put it into, push, and spin). Following pilot work, each instruction contained five action phrases, with each phrase containing a movement, a color, and an object. No repetition of objects was permitted within an instruction sequence, and adjacent objects in the sequence always featured different colors—for instance, “push red box, pick up black pencil, put it into yellow bag, touch red pencil, spin blue ruler.” Three sets of 12 instruction sequences were created, and these were implemented in counterbalanced order for each participant, balanced across each concurrent task condition. Three practice sets containing six trials (two for each condition) were also prepared. Randomly generated three-digit numbers were used for each of the articulatory suppression and backward counting trials.
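The constraints on the instruction sequences described above can be illustrated with a minimal Python sketch. It is not the authors' stimulus-generation procedure, which is not reported. The function name `make_sequence`, the use of rejection sampling, the assumption of exactly one "pick up" / "put it into" pair per sequence, and the assumption that touch, push, and spin may apply to any object are all hypothetical choices based on the example sequences.

```python
import random

# Object set as listed in the Materials section: (color, object) pairs.
SMALL_ITEMS = [("yellow", "ruler"), ("blue", "ruler"), ("white", "eraser"),
               ("green", "eraser"), ("red", "pencil"), ("black", "pencil")]
CONTAINERS = [("black", "box"), ("red", "box"), ("yellow", "bag"),
              ("white", "bag"), ("blue", "folder"), ("green", "folder")]
SINGLE_MOVES = ["touch", "push", "spin"]  # assumed applicable to any object


def make_sequence(rng):
    """Return five (movement, color, object) phrases meeting the stated constraints."""
    while True:  # rejection sampling: redraw until adjacent colors differ
        pair_start = rng.randrange(4)  # assumed: one pick-up/put-into pair
        item = rng.choice(SMALL_ITEMS)
        container = rng.choice(CONTAINERS)
        others = [o for o in SMALL_ITEMS + CONTAINERS
                  if o not in (item, container)]
        fillers = rng.sample(others, 3)  # sampling without replacement
        phrases = []
        for pos in range(5):
            if pos == pair_start:
                phrases.append(("pick up",) + item)
            elif pos == pair_start + 1:
                phrases.append(("put it into",) + container)
            else:
                phrases.append((rng.choice(SINGLE_MOVES),) + fillers.pop())
        colors = [color for _, color, _ in phrases]
        if all(a != b for a, b in zip(colors, colors[1:])):
            return phrases


sequence = make_sequence(random.Random(0))
print(", ".join(" ".join(phrase) for phrase in sequence))
```

Drawing objects without replacement guarantees that no object repeats within a sequence, and the final check rejects any draw in which two adjacent objects share a color.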

All objects were placed on a 146-cm (length) × 75-cm (width) × 71-cm (height) desk, with object location varying randomly between trials. A monitor displaying the written instructions was placed behind the objects (see Fig. 1).

Design and procedure

In a 3 × 2 mixed design, concurrent task was a within-subjects variable (baseline vs. articulatory suppression vs. backward counting), and recall type was a between-subjects variable (verbal vs. enactment recall). The order of the concurrent task conditions was counterbalanced between participants.

Each participant carried out the six-trial practice for all conditions, before commencing the test trials. In all conditions, the entire instructional sequence (containing five action segments) was simultaneously presented on screen in Times New Roman font, size 16, for 13 s. Each action segment appeared on a different line, aligned to the screen center. This was followed by a 1-s blank-screen delay and then a beep sound indicating recall. For articulatory suppression, participants first saw a three-digit number (e.g., 358) at screen center (same font type and size as the instructions) for 3 s and began repeating it continuously, at a paced speed of 2 s per cycle, through instruction presentation to the point of recall. The backward counting procedure was similar, except that participants counted in decrements of two.
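As a rough illustration of the two concurrent tasks, the following Python sketch lists the numbers a participant would be expected to say on successive cycles. Several details are assumptions rather than reported facts: that responses are counted from instruction onset to the beep (13 s plus 1 s), that backward counting proceeds at the same 2-s pace stated for suppression, and that the first backward-counting response is the first decrement. The function names are hypothetical.

```python
PRESENTATION_S = 13  # instruction display time (from the procedure)
DELAY_S = 1          # blank screen before the recall beep (from the procedure)
CYCLE_S = 2          # stated pace for suppression; assumed for backward counting


def n_cycles(duration_s=PRESENTATION_S + DELAY_S, cycle_s=CYCLE_S):
    """Number of complete response cycles that fit into the encoding period."""
    return duration_s // cycle_s


def suppression(start, cycles):
    """Articulatory suppression: the same three-digit number on every cycle."""
    return [start] * cycles


def backward_counting(start, cycles, step=2):
    """Backward counting: each response is `step` lower than the previous one."""
    return [start - step * (i + 1) for i in range(cycles)]


cycles = n_cycles()
print(suppression(358, cycles))        # [358, 358, 358, 358, 358, 358, 358]
print(backward_counting(358, cycles))  # [356, 354, 352, 350, 348, 346, 344]
```

The contrast between the two lists shows why backward counting is treated as the more demanding task: suppression repeats a fixed item, whereas backward counting requires a new computation on every cycle.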

According to the assigned groups, participants either repeated the instructions back (verbal recall) or performed the actions (enactment recall), with the experimenter recording these responses. At the end of each trial, the experimenter changed the locations of objects randomly on the table while the participants closed their eyes.

Results and discussion

For both the verbal and enactment recall conditions, the dependent variable was the mean number of correct actions (including movement, object color, and object identity) per instruction sequence. We defined an action as a “chunk” of elements containing items and movement, with recall being scored as correct only when the combination of movement, color, object, and ordinal position was accurately produced. This response metric had been used by Gathercole et al. (2008) to define span performance on this task. The means and standard errors are illustrated in Fig. 2.
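The scoring rule can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the strict position-by-position alignment is an assumption, since the text does not state how omitted actions affect the ordinal positions of later ones.

```python
def score_sequence(target, response):
    """Count actions whose movement, color, object, and position all match."""
    return sum(
        1
        for pos, action in enumerate(target)
        if pos < len(response) and response[pos] == action
    )


def mean_correct(trials):
    """Mean number of correct actions per sequence over (target, response) pairs."""
    scores = [score_sequence(target, response) for target, response in trials]
    return sum(scores) / len(scores)


# Target taken from the example sequence in the Materials section.
target = [("push", "red", "box"), ("pick up", "black", "pencil"),
          ("put it into", "yellow", "bag"), ("touch", "red", "pencil"),
          ("spin", "blue", "ruler")]
# Hypothetical response: wrong color in the fourth action, fifth action omitted.
response = [("push", "red", "box"), ("pick up", "black", "pencil"),
            ("put it into", "yellow", "bag"), ("touch", "black", "pencil")]

print(score_sequence(target, response))  # 3
```

Because an action counts only when every element matches, a single color error costs the whole action, which makes this a conservative measure of recall.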

A 3 × 2 (Concurrent Task × Recall Type) analysis of variance (ANOVA) revealed significant main effects of concurrent task, F(2, 44) = 61.017, MSE = 0.271, p < .001, η_p² = .735, and recall type, F(1, 22) = 12.509, MSE = 0.219, p = .002, η_p² = .362, with enactment recall being superior to verbal recall. No significant interaction was apparent between concurrent task and recall type, F(2, 44) = 0.014, MSE = 0.271, p = .986, η_p² = .001. The effects of both articulatory suppression, F(1, 22) = 11.511, MSE = 0.547, p = .003, η_p² = .344, and backward counting, F(1, 22) = 56.894, MSE = 0.520, p < .001, η_p² = .721, were found to be significant, but we observed no interactions with recall type: suppression, F(1, 22) < 0.001, MSE = 0.547, p = .998, η_p² = .001; backward counting, F(1, 22) = 0.021, MSE = 0.520, p = .887, η_p² = .001.
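The reported partial eta squared values can be cross-checked from the F ratios and degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The Python sketch below is a reader's consistency check and not part of the authors' analysis; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error) as reported for Experiment 1.
reported = [
    ("concurrent task", 61.017, 2, 44),
    ("recall type", 12.509, 1, 22),
    ("backward counting", 56.894, 1, 22),
]
for label, f_value, df_effect, df_error in reported:
    print(label, round(partial_eta_squared(f_value, df_effect, df_error), 3))
```

For these three effects the recomputed values round to .735, .362, and .721, matching the reported figures. Small discrepancies in the third decimal can arise elsewhere because the F ratios themselves are rounded.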

Therefore, we obtained three principal findings in Experiment 1. First, the disruptive effects of articulatory suppression and backward counting were consistent with the involvement of the phonological loop and central executive components of WM in encoding verbal sequences (Baddeley et al., 2009; Gathercole et al., 2008). Second, the performance of recall by enactment was more accurate than recall by spoken repetition, replicating previous observations of an action advantage (Gathercole et al., 2008; Koriat et al., 1990). Crucially, because the concurrent tasks disrupted verbal and enacted recall equivalently, neither the central executive nor the phonological loop appeared to be the source of the enactment advantage.

Experiment 2

The purpose of Experiment 2 was to explore the involvement of a further component of the Baddeley and Hitch (1974) WM model, the visuospatial sketchpad, in following instructions. A complex spatial-tapping activity adapted from the Corsi-block task (Corsi, 1972; Milner, 1971) was employed as a dual task to disrupt the operation of the visuospatial sketchpad. The original Corsi task involves participants repeating the sequence in which blocks are tapped by the experimenter. In the present experiment, participants were required only to tap three blocks in sequence, to minimize the involvement of executive resources that are required by longer sequences (Vandierendonck, Kemps, Fastame, & Szmalec, 2004). The tapping pattern varied from trial to trial so that it would not become an automatic procedural-memory task. Research has shown that complex tapping configurations involve greater spatial demands than do simple configurations (Busch, Farrell, Lisdahl-Medina, & Krikorian, 2005); hence, the tapping patterns were designed to ensure substantial spatial interference.

An articulatory suppression condition was also included in order to directly compare the contributions of the phonological loop and visuospatial WM. As in Experiment 1, suppression involved retaining and verbally repeating three digits, whereas tapping required maintaining and tapping three locations. Articulation and tapping rates were equated across the two conditions, at 2 s per cycle.

We investigated three hypotheses. First, as spatial coding may contribute to the process of representing instructions in a 3-D task environment, and as Corsi-block tapping is assumed to disrupt this coding, the tapping should also impair subsequent recall of the instructions. The remaining two hypotheses, as in Experiment 1, were that recall performance should be superior for enactment relative to oral repetition, and that significant articulatory suppression effects would be observed. Although both verbal and visuospatial WM may be important in remembering verbal instructions involving actions in an environment rich in visual and spatial cues, no specific hypotheses were made regarding the relative sizes of their contributions to each response condition.