1 Introduction

Ongoing digitization at workplaces opens up new possibilities to support workers in their daily tasks and to optimize their cognitive workload. For example, smart devices in order picking present task instructions on the display of handheld scanners, tablets and smart glasses (pick-by-vision) or as pick-by-voice via headsets. Task instructions, formerly presented on a sheet of paper, may contain article location and quantity or route information. To ensure healthy workplace conditions, these instructions should be adapted to the worker’s mental state. If cognitive load is high due to task difficulty, the device could adapt perceptual dimensions of the content or the content itself on the display of smart glasses or in audio devices. This individual instruction adaptation should support the worker context-specifically and could prevent additional work strain. In the current study, we focus on the order picking context and its state-of-the-art smart devices, as order pickers are still a very relevant human resource in intralogistics with high demands regarding regular changes in physical and mental load as well as time pressure [1,2,3,4].

Findings from previous studies on dual-task walking suggest that gait analysis could provide useful insights into the order picker’s current mental state. Dual-task walking refers to performing locomotion concurrently with another task [5], for example walking while observing the smart device display or warehouse surroundings. A frequent finding is a performance decrement in the other task while walking compared to a neutral condition, termed dual-task cost. However, dual-task costs may also refer to a performance decrement in the walking task associated with the dual-task situation [5] and are correlated with the risk of falling (for a competing hypothesis of walking as cognitive facilitation/arousal cf. [6]).

The degree of dual-task costs is in turn determined by the nature of the secondary task [5, 7, 8]. Bock [7] compared the performance of young and older adults in several cognitive tasks while walking and found especially severe performance degradation in a visual control task. He therefore concluded that the involvement of the visual system is a crucial factor for stable walking. Two visual processing streams need to be attended to, so that allocating mental resources to a visually demanding task hampers walking performance, and walking in turn reduces performance in the visually demanding task. Along this line, Barra and colleagues [8] found that spatial tasks, but not verbal tasks, increase the risk of falling. Both studies are relevant for determining the mental state of an order picker who uses smart glasses to read a picking list (a visual observation task) while order picking.

Tomporowski and Audiffren [9] conducted a study in which young and older participants performed auditory task switching while standing, walking at preferred speed and walking faster than preferred speed (see below for an explanation of task switching and switch costs). All participants showed switch costs, but these were not modulated by the movement condition in young adults. Older adults showed a speed-accuracy trade-off: slower responses while standing compared with the walking conditions, but also fewer errors while standing than while walking. In this study employing an auditory task, the typical dual-task costs while walking could not be observed. This study may provide further evidence for differential mental states while walking depending on the nature of the task and might prove useful in the assessment of the order picker’s mental state when picking with instructions via headset.

In a previous study, we investigated the influence of smart devices on cognitive performance in a simulated order picking task while standing. Results showed generally faster responses to the use of headsets compared with smart glasses, but also worse performance with respect to selective attention and cognitive flexibility [10]. Here, the mental state of the order picker depended clearly on the smart device presenting instructions (cues) either visually or auditorily.

Nonetheless, additional research is required to understand the relationship and potential interaction of gait and cognitive functions while using smart devices. If, for example, cognitive load has a deteriorating effect on gait and therefore on task performance, the order picker may compensate for it by reducing gait speed. However, if the work context requires an increased gait speed, with a potentially detrimental effect on cognitive functions, the cognitive difficulty of the task should be reduced. A failure to do so would particularly affect the cognitive resources of older order pickers. Ample empirical evidence suggests that gait and posture control require additional mental resources in older people [5, 11]. If these findings are not taken into account when adjusting information load, performance and safety are at risk.

In our study, we set out to investigate the influence of gait and smart devices on cognitive functions, more precisely on selective attention. Selective attention (i.e., focusing on relevant information while ignoring irrelevant information) is crucial while reading picking information as well as searching for the relevant item in the storage racks [1, 10]. We investigated cognitive performance in an order picking setting employing a task switching paradigm and an Eriksen flanker task [12] while standing or walking at preferred speed. Generally, the task switching paradigm as well as the Eriksen flanker task enable assessing selective attention and cognitive flexibility as participants are asked to change their focus of attention regularly to attend to varying relevant information in quick succession.

Congruency effects (i.e., performance decrement of incongruent relative to congruent stimuli) and switch costs (i.e., performance decrement of task switches relative to task repetitions) serve as performance indicators for the aforementioned cognitive concepts. Congruency effects arise from conflicting information in incongruent trials, in which both possible responses are triggered by components of the stimuli, and their size represents the size of the cognitive conflict [13]. Switch costs result from the additional effort in switch trials to update new task requirements as well as from interference control of competing tasks and can be seen as empirical markers of cognitive flexibility [14].
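
To make the two indicators concrete, the following minimal sketch computes a congruency effect and a switch cost as differences of mean RT. The trial format, the function names and the toy values are illustrative assumptions and are not taken from the study data.

```python
from statistics import mean


def mean_rt(trials, **conditions):
    """Mean RT (ms) over all trials that match every given condition value."""
    selected = [
        trial["rt"]
        for trial in trials
        if all(trial[key] == value for key, value in conditions.items())
    ]
    return mean(selected)


def congruency_effect(trials):
    """Incongruent minus congruent mean RT; larger values mean more conflict."""
    return mean_rt(trials, congruency="incongruent") - mean_rt(trials, congruency="congruent")


def switch_cost(trials):
    """Switch minus repetition mean RT; larger values mean less flexibility."""
    return mean_rt(trials, transition="switch") - mean_rt(trials, transition="repetition")


# Hypothetical toy trials, for illustration only (RT in ms).
toy_trials = [
    {"rt": 560, "congruency": "congruent", "transition": "repetition"},
    {"rt": 590, "congruency": "incongruent", "transition": "repetition"},
    {"rt": 640, "congruency": "congruent", "transition": "switch"},
    {"rt": 670, "congruency": "incongruent", "transition": "switch"},
]

print(congruency_effect(toy_trials), switch_cost(toy_trials))
```

The same difference scores can be formed on error rates by replacing the RT values with per-trial error indicators.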

At the beginning of each trial, a cue indicated the relevant customer order (one of two tasks) and thereby the correct stimulus-response rule either via smart glasses or headset (pick-by-vision vs. pick-by-voice). Participants were asked to classify the central letter of a five-letter string with a manual response. Importantly, the central letter was either congruent or incongruent with the surrounding letters, introducing potential cognitive conflict and the need to focus attention. Also, the customer order could change repeatedly (task transition: repetition vs. switch), requiring cognitive flexibility. Participants were either walking at their personal comfort speed or standing on a treadmill.

Based on our preceding study [10], we expected generally increased switch costs and congruency effects for the headset use compared with smart glasses while standing. In addition, we expected a stronger decrease of performance with smart glasses compared with the headset use while walking as dual-task costs due to the involvement of the visual system [7].

2 Methods

2.1 Participants

Twenty-four participants (21 female) with a mean age of 24 years (SD = 4) were tested in this experiment. They reported normal or corrected-to-normal vision and normal hearing abilities. Participants gave informed consent and received course credit or 20 € for participation.

2.2 Apparatus and Stimuli

Participants were either standing or walking on a treadmill (Woodway PPS 70 Ortho) equipped with a safety belt system while responding to a letter discrimination task [12]. The central target letter in between distractors of a letter string was supposed to be “placed” to the left or right side of a palette by a keypress depending on the currently relevant “customer order”. The “customer order” determining the response rule was indicated by a cue and could change from one trial to the next.

Participants were standing or walking on the treadmill at a distance of ca. 2.85 m in front of a screen (2.53 m × 1.58 m). The fixation cross and letter string stimuli were presented on it by a Panasonic PT-EZ590 WUXGA projector. Cues and response feedback were presented via Brother AiRScouter WD-200B in the smart glasses part or via Bose SoundTrue Ultra in-ear headphones in the headset part. Manual responses were recorded via Speedlink Phantom Hawk joysticks held in the left and right hand.

Two color words (red or green) served as cues for the “customer order” and corresponding response rule. The visual cues were presented in white on black background on the smart glass display (75 px height, display resolution of 1280 × 720, font: Consolas). The auditory cues, which were artificially created via an online text-to-speech service (fromtexttospeech.com), were presented on the headphones.

Stimuli consisted of five-letter strings in white on the black screen made of the letters “S” and “H” (35 px height, resolution of 1920 × 1200, font: Consolas). While in congruent trials all five letters were identical, asking for the same response (e.g., left keypress), the central letter in incongruent trials was surrounded by the other letter type (e.g., SSHSS) associated with the opposite response.

Response feedback was displayed on the smart glasses as a green checkmark or a red cross (correct vs. incorrect). In the headset part, response feedback was presented over headphones: high-pitched bell sound or low-pitched “buzzer” after a correct or incorrect response.

2.3 Procedure

At the beginning of the experiment, the comfort speed (in mm/s) of each participant was assessed by a repeated step-wise acceleration routine. Also, participants’ eyesight as well as hearing abilities were tested. Subsequently, participants underwent a calibration routine to standardize the orientation of the smart glasses display relative to the screen across all participants (central alignment of a cross on the smart glasses to a square on the screen).

The experiment consisted of two movement and two smart device parts: standing or walking while using smart glasses or the headset (counterbalanced across participants). During all parts, participants were asked to discriminate the central letter of a letter string stimulus as either “H” or “S” by pressing a key on the joystick in either the left or right hand. The “customer order” (representing the task) determined the stimulus-response association (counterbalanced across participants) and was indicated by a cue (e.g., “green”: S = right, H = left; “red”: S = left, H = right) presented via smart glasses or headset. In single task blocks, only one type of customer order was presented (e.g., only red) as a baseline for behavioral performance, while both types were randomly presented in mixed task blocks (resulting in task repetitions or switches). Both block types appeared in counterbalanced order across participants (two single task blocks of 20 trials and two mixed task blocks of 160 trials each per smart device and movement condition), with single tasks either preceding or succeeding the mixed tasks of each condition. Practice trials were included before each new block type (eight trials before each single task block and sixteen trials before the mixed task blocks).
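
The response rules and trial classifications described above can be sketched as follows. The mapping uses the example assignment given in the text (the actual assignment was counterbalanced across participants); the function names and the sequence format are illustrative assumptions.

```python
# Example response rules from the text; real assignments were counterbalanced.
RESPONSE_RULES = {
    "green": {"S": "right", "H": "left"},
    "red": {"S": "left", "H": "right"},
}


def target_letter(stimulus):
    """Central letter of the five-letter string."""
    return stimulus[len(stimulus) // 2]


def correct_response(cue, stimulus):
    """Correct key side given the cued customer order and the stimulus."""
    return RESPONSE_RULES[cue][target_letter(stimulus)]


def congruency(stimulus):
    """'congruent' if all flankers equal the target letter, else 'incongruent'."""
    middle = len(stimulus) // 2
    flankers = stimulus[:middle] + stimulus[middle + 1:]
    target = stimulus[middle]
    return "congruent" if all(letter == target for letter in flankers) else "incongruent"


def task_transitions(cues):
    """Label each trial as repetition or switch; the first trial has no label."""
    if not cues:
        return []
    labels = [None]
    for previous, current in zip(cues, cues[1:]):
        labels.append("repetition" if current == previous else "switch")
    return labels


print(correct_response("green", "SSHSS"), congruency("SSHSS"))
print(task_transitions(["red", "red", "green"]))
```

The unlabeled first trial corresponds to the first trial of each block, which cannot be classified as repetition or switch.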

Each trial began with the cue presentation for 400 ms, followed by a pause/blank screen on the headphones or smart glasses for 200 ms. Meanwhile, a fixation cross was presented on the screen until the stimulus appeared (cue-stimulus interval of 600 ms). The stimulus was presented for a maximum of 1500 ms or until a response was made. A missing response within this interval was registered as a response omission and therefore as an error. Response feedback was presented either on the headphones or smart glasses for 300 ms, followed by a pause/blank screen for 200 ms (response-stimulus interval of 1100 ms). The experiment lasted approx. 2 h.
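
The reported intervals follow directly from the individual durations, as the short sketch below shows. The constant and function names are illustrative; the sketch additionally assumes that feedback starts immediately after the response and that an omission is followed by feedback after the full 1500 ms window.

```python
# Durations in ms, taken from the trial description.
CUE_MS = 400
POST_CUE_BLANK_MS = 200
STIMULUS_MAX_MS = 1500
FEEDBACK_MS = 300
POST_FEEDBACK_BLANK_MS = 200

# Cue onset to stimulus onset.
cue_stimulus_interval = CUE_MS + POST_CUE_BLANK_MS

# Response to next stimulus onset: feedback, blank, then the next cue phase.
response_stimulus_interval = (
    FEEDBACK_MS + POST_FEEDBACK_BLANK_MS + cue_stimulus_interval
)


def trial_duration_ms(rt_ms=None):
    """Time from cue onset to the end of the post-feedback blank.

    rt_ms=None stands for a response omission (assumption: the stimulus
    then stays on for the full response window).
    """
    stimulus_ms = STIMULUS_MAX_MS if rt_ms is None else min(rt_ms, STIMULUS_MAX_MS)
    return cue_stimulus_interval + stimulus_ms + FEEDBACK_MS + POST_FEEDBACK_BLANK_MS


print(cue_stimulus_interval, response_stimulus_interval)
```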

2.4 Design

The independent within-subject variables were movement (standing vs. walking), smart device (smart glasses vs. headset), task transition (repetition vs. switch), and congruency (congruent vs. incongruent). The levels of movement and smart device were blocked and counterbalanced. The levels task transition and congruency were varied randomly. Response times (RT) and error rates (ER) were registered as dependent variables. All tests of significance were conducted at an alpha level of 0.05.

3 Results

For data analyses, the first trial of each block was excluded as it cannot be classified as task repetition or switch. Trials with RT less than 100 ms were removed (0.1%), as were trials with RT deviating more than 3 SD from the individual mean (1.6%). Also, trials following an error were discarded (15.7%). Incorrect responses and response omissions were registered as errors. For RT analyses, only correct trials were analyzed. We conducted repeated-measures analyses of variance (ANOVAs) with the variables above and report only significant results. Mean RT and ER of single tasks are depicted in Fig. 1 and those of mixed tasks in Fig. 2.
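
The exclusion criteria can be sketched as a small filter for the RT analysis of one participant. The order of the steps, the set of trials over which mean and SD are computed, and all names are assumptions made for illustration; the text does not specify them.

```python
from statistics import mean, stdev


def select_rt_trials(blocks, min_rt=100, sd_cutoff=3.0):
    """Return the trials of one participant that enter the RT analysis.

    blocks: list of blocks; each block is a list of dicts with
    'rt' (ms, or None for an omission) and 'correct' (bool).
    """
    candidates = []
    for block in blocks:
        for index, trial in enumerate(block):
            if index == 0:
                continue  # first trial of a block: no repetition/switch label
            if not block[index - 1]["correct"]:
                continue  # trial following an error
            if trial["rt"] is None or trial["rt"] < min_rt:
                continue  # omission or anticipatory response
            candidates.append(trial)

    if len(candidates) >= 2:
        rts = [trial["rt"] for trial in candidates]
        centre, spread = mean(rts), stdev(rts)
        # Assumption: the SD criterion is applied to the remaining trials.
        candidates = [
            trial for trial in candidates
            if abs(trial["rt"] - centre) <= sd_cutoff * spread
        ]

    # Only correct responses enter the RT analysis.
    return [trial for trial in candidates if trial["correct"]]


# Hypothetical toy block, for illustration only.
toy_block = [
    {"rt": 500, "correct": True},   # first trial, dropped
    {"rt": 520, "correct": True},
    {"rt": 50, "correct": True},    # faster than 100 ms, dropped
    {"rt": 540, "correct": False},  # error, dropped from RT analysis
    {"rt": 530, "correct": True},   # follows an error, dropped
    {"rt": 510, "correct": True},
]
print([trial["rt"] for trial in select_rt_trials([toy_block])])
```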

Fig. 1. Response times and error rates in single tasks as a function of movement, device and congruency. Error bars reflect the 95% confidence interval around the mean.

Fig. 2. Response times and error rates of mixed tasks as a function of movement, device, task transition and congruency. Error bars reflect the 95% confidence interval around the mean.

3.1 Single Tasks

The RT analysis indicated significant main effects of device, F(1, 23) = 12.22, p = .002, \( \eta_{p}^{2} \) = .35, and congruency, F(1, 23) = 96.70, p < .001, \( \eta_{p}^{2} \) = .81. Responding to instructions via smart glasses was slower compared with instructions via headset (527 ms vs. 494 ms). Congruent trials yielded faster responses than incongruent trials (492 ms vs. 529 ms), indicating a congruency effect.
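
The reported effect sizes can be cross-checked against the F statistics using the standard relation \( \eta_{p}^{2} = F \cdot df_{\text{effect}} / (F \cdot df_{\text{effect}} + df_{\text{error}}) \). The following minimal sketch applies it to the two main effects above; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effects of device and congruency in single task RT, F(1, 23).
for label, f_value in [("device", 12.22), ("congruency", 96.70)]:
    print(label, round(partial_eta_squared(f_value, 1, 23), 2))
```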

The ER analysis showed a main effect of congruency, F(1, 23) = 38.63, p < .001, \( \eta_{p}^{2} \) = .63, with fewer errors in congruent compared with incongruent trials (3.8% vs. 12.2%). In addition, the three-way interaction of movement, device, and congruency was significant, F(1, 23) = 7.71, p = .011, \( \eta_{p}^{2} \) = .25. Post-hoc analyses showed a significant two-way interaction of device and congruency while standing, F(1, 23) = 13.70, p = .001, \( \eta_{p}^{2} \) = .37, but not while walking, F < 1.00. While standing, the congruency effect was more pronounced for the headset than for the smart glasses: the headset yielded a significant congruency effect of 13%, t(23) = 5.33, p < .001 (one-tailed), and the smart glasses one of only 3.7%, t(23) = 1.75, p = .047 (one-tailed). While walking, the congruency effects of headset and smart glasses did not differ significantly: 7.9%, t(23) = 4.82, p < .001 (one-tailed), and 9.3%, t(23) = 3.26, p = .002 (one-tailed), respectively.

3.2 Mixed Tasks

The RT analysis showed main effects of task transition, F(1, 23) = 49.59, p < .001, \( \eta_{p}^{2} \) = .68, and congruency, F(1, 23) = 86.57, p < .001, \( \eta_{p}^{2} \) = .79. Repetition trials yielded shorter RT than switch trials (572 ms vs. 651 ms), indicating switch costs, and congruent trials yielded shorter RT than incongruent trials (593 ms vs. 630 ms), indicating a congruency effect.

Moreover, the two-way interactions of movement and device, F(1, 23) = 4.72, p = .040, \( \eta_{p}^{2} \) = .17, and of device and task transition, F(1, 23) = 31.89, p < .001, \( \eta_{p}^{2} \) = .58, were significant. While standing, responses to the devices were comparable (smart glasses: 610 ms, headset: 607 ms), whereas while walking, responses to smart glasses were slower (628 ms) than to the headset (601 ms). Importantly, switch costs were larger in the headset condition (104 ms) than in the smart glasses condition (54 ms).

The three-way interaction of movement, device, and congruency was also significant, F(1, 23) = 6.20, p = .020, \( \eta_{p}^{2} \) = .21. The congruency effect in the smart glasses condition was less pronounced than in the headset condition while standing (31 ms vs. 41 ms) and more pronounced while walking (48 ms vs. 33 ms).

However, since there was a significant four-way interaction of movement, device, task transition, and congruency, F(1, 23) = 5.71, p = .025, \( \eta_{p}^{2} \) = .20, which qualifies the lower-level interactions, this interaction is explored in detail. We conducted post-hoc tests separately for each task transition condition, the factor added to the three-way interaction above. The three-way interaction of movement, device and congruency was significant for task repetitions, F(1, 23) = 7.44, p = .012, \( \eta_{p}^{2} \) = .25, but not for task switches, F < 1.00, indicating similar congruency effects across movement and device conditions for switches. We therefore explored task repetitions further, separately for each device: the two-way interaction of movement and congruency was significant for smart glasses, F(1, 23) = 9.53, p = .005, \( \eta_{p}^{2} \) = .29, but not for headsets, F(1, 23) = 2.39, p = .136, \( \eta_{p}^{2} \) = .09. Using the headset, the congruency effect was 47 ms while standing, t(23) = 6.67, p < .001 (one-tailed), and 31 ms while walking, t(23) = 4.46, p < .001 (one-tailed), although this difference was not significant, as indicated by the non-significant interaction above. With smart glasses, however, the congruency effect amounted to 27 ms while standing, t(23) = 4.15, p < .001 (one-tailed), and increased significantly to 57 ms while walking, t(23) = 5.36, p < .001 (one-tailed).

The ER analysis indicated main effects of device, F(1, 23) = 6.61, p = .017, \( \eta_{p}^{2} \) = .22, task transition, F(1, 23) = 157.04, p < .001, \( \eta_{p}^{2} \) = .87, and congruency, F(1, 23) = 19.55, p < .001, \( \eta_{p}^{2} \) = .46. Participants made more errors using the smart glasses compared with the headset (17.4% vs. 14.0%). ER were higher in switch compared with repetition trials (21.7% vs. 9.7%), indicating switch costs, and higher in incongruent compared with congruent trials (17.6% vs. 13.8%), indicating a congruency effect.

The two-way interaction of task transition and congruency was also significant, F(1, 23) = 12.80, p = .002, \( \eta_{p}^{2} \) = .36. The congruency effect was more pronounced in repetition trials compared with switch trials (5.3% vs. 2.1%). Post-hoc t-tests indicated significant congruency effects for both conditions: t(23) = 5.42, p < .001 (one-tailed) for repetitions, t(23) = 2.31, p = .015 (one-tailed) for switches.

3.3 Comfort Speed Analysis

To assess the relationship between individual gait aspects and cognitive performance, the personal comfort speed (in mm/s) of each participant was extracted. Comfort speed was then correlated with measures of cognitive performance in the walking conditions across single and mixed tasks for RT and ER for both devices. For single tasks, neither individual mean RT nor mean ER in walking blocks correlated with comfort speed. There was also no significant correlation with the individual congruency effect across devices in ER. In RT, however, there was a significant positive correlation between comfort speed and the size of the congruency effect when using the headset, r(22) = .57, p = .003, indicating that the faster the comfort speed, the more pronounced the congruency effect. The corresponding correlation for smart glasses was not significant. For mixed tasks, there was no significant correlation of comfort speed with switch costs or congruency effects in RT or ER, but there was one significant negative correlation of individual mean RT while walking with comfort speed, r(22) = −.41, p = .046: the faster the comfort speed, the slower the individual mean RT. At first sight, these findings might suggest a detrimental effect of increased gait speed on cognitive performance; however, the data are not convincing enough to generalize at this point (only two significant correlations out of 16 contrasts in total). Importantly, personal comfort speed was originally introduced to compensate for individual differences in walking speed and to set an individual speed baseline. It is therefore not surprising that there were hardly any correlations between comfort speed and measures of cognitive performance.
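
The caution regarding the two significant correlations can be illustrated with a short calculation. The sketch converts a correlation into its t statistic and estimates how likely at least two of 16 tests would reach significance by chance alone. The second function assumes independent tests, which is a simplification that does not hold exactly for these correlated measures; the function names are illustrative.

```python
from math import comb, sqrt


def t_from_r(r, df):
    """t statistic of a Pearson correlation with df = n - 2 degrees of freedom."""
    return r * sqrt(df / (1 - r * r))


def prob_at_least_k_significant(n_tests, k, alpha=0.05):
    """P(at least k of n_tests are significant) if all null hypotheses hold.

    Assumption: the tests are independent (binomial model).
    """
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


print(round(t_from_r(0.57, 22), 2), round(t_from_r(-0.41, 22), 2))
print(round(prob_at_least_k_significant(16, 2), 3))
```

Under this simplified model, two or more significant results out of 16 tests would occur by chance in roughly 19% of cases, which supports treating the correlations as tentative.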

4 Discussion

In this study we investigated the influence of gait on selective attention while using smart devices in an order picking setting. Results shall be used to assess the order picker’s current mental state to adapt task instructions on the smart devices (smart glasses and headset). To this end, we employed a task switching paradigm (two customer orders as task) [14] and an Eriksen flanker task (assorting articles to the left or right) [12] while standing or walking at comfort speed on a treadmill. Task instructions (cues) were presented on smart glasses or a headset. Participants performed single task blocks with only one customer order or mixed tasks in which the customer order could change from trial to trial. We registered RT and ER and expected differential switch costs (cognitive flexibility) and congruency effects (distractibility) as cognitive load indicators with respect to smart device and movement condition.

4.1 Synopsis of Results

Results indicated switch costs (i.e., performance decrement of task switches relative to task repetitions) and congruency effects (i.e., performance decrement of incongruent relative to congruent stimuli), which depended further on the current smart device and walking condition. In single task response times, responses to smart glasses were slower than to the headset. Results of single task errors suggested increased congruency effects for the headset while standing compared with the use of smart glasses, while congruency effects between the smart devices were similar during walking. In mixed tasks response times, switch costs were more pronounced for headset use than smart glasses. Congruency effects were most pronounced in repetitions during the use of smart glasses while walking. In mixed tasks errors, smart glasses produced more errors. In sum, the combination of a particular smart device and movement condition determined the effect on attention and thus the order picker’s mental state.

4.2 Influence of Smart Devices on Cognitive Performance – Application in Warehouses

Results of single tasks can be used to infer the cognitive ergonomic use of smart devices in well-structured warehouses without much distraction [10]. Here, smart glasses yielded slower responses but also, while standing, smaller congruency effects; participants were therefore less prone to distraction or cognitive conflict. Depending on the overall goal of system adaptation regarding speed and accuracy as cognitive load indicators, either the order picker’s speed while using headsets or the distraction resistance of smart glasses can be favored.

The performance in mixed tasks can be compared with the performance in crowded and busy warehouses, which require cognitive flexibility and resistance to distraction and cognitive conflict [10]. Smaller switch costs with smart glasses indicate better cognitive flexibility. However, smart glasses also showed larger congruency effects while walking in task repetitions and generally more errors. Smart glasses are therefore only favorable when distractions while walking are infrequent in the warehouse.

4.3 Influence of Gait on Cognitive Performance – Application in Warehouses

The movement condition influenced cognitive performance: in single task errors, the device-dependent difference in congruency effects was present only while standing. The standing condition can simulate the situation of an order picker in front of a shelf, where selective attention is needed to search for the respective item. Here, congruency effects were smaller for smart glasses compared with headsets, that is, participants were less distracted by conflicting information. With the headset, the modality switch from instructions (auditory) to stimuli on the screen (visual) was accompanied by costs [15]. Instructions via smart glasses therefore seem favorable in this situation.

Gait also had an effect on cognitive performance in mixed tasks. Here, in task repetitions, congruency effects were especially large while walking with smart glasses. In line with the idea of two visual processing streams, one focusing on walking and the other on the cognitive task [5], incongruent stimuli yielded a large decrease in cognitive performance when the visual modality was involved in the cognitive task. Displaying instructions on smart glasses, which draws on visual attention, is therefore less favorable while order pickers move through warehouses. While walking, staying focused is crucial for workplace safety. Busy and crowded warehouses provide much visual input and are potentially dangerous, so the use of headsets for instructions should be preferred over smart glasses.

4.4 Instruction Adaptation in Technical Systems

The finding of context-specific cognitive load, which depends on gait as well as on the smart device in use, suggests that order pickers could benefit from adaptive instructions. Generally, instructions via smart glasses are favorable regarding cognitive flexibility, but perceiving conflicting visual information is more harmful when wearing smart glasses while walking. In order to establish safe and healthy workplaces, these differences in cognitive load indicators based on gait information and task requirements need to be taken into account to adapt instructions accordingly.

For example, in challenging situations, the physical properties (e.g., contrast, color intensity, font size) or the content of instructions could be adjusted according to the respective load. As soon as the cognitive load decreases, the changes can be reversed. Moreover, instructions could fade out on the smart glasses to reduce distraction while the order picker moves around in the warehouse. In addition, other parameters of the work situation could be adapted, such as the physical workload of a worker or the number of assignments and breaks in a given time period.
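
One conceivable form of such an adaptation rule is sketched below. It is purely hypothetical: the class, the function, the baseline values and the concrete adjustments are illustrative assumptions and do not describe an implemented or evaluated system.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class InstructionStyle:
    """Display properties of an instruction on smart glasses (hypothetical)."""
    font_scale: float = 1.0
    contrast: float = 0.7
    faded_out: bool = False


BASELINE = InstructionStyle()


def adapt_style(load_is_high: bool, is_walking: bool) -> InstructionStyle:
    """Derive the display style from the current load and movement state.

    The style is always derived from the baseline, so changes are reversed
    automatically as soon as the load decreases or the worker stops walking.
    """
    style = BASELINE
    if load_is_high:
        # Hypothetical adjustment: larger font and maximal contrast.
        style = replace(style, font_scale=1.5, contrast=1.0)
    if is_walking:
        # Fade out the instruction to reduce visual distraction while walking.
        style = replace(style, faded_out=True)
    return style


print(adapt_style(load_is_high=True, is_walking=False))
print(adapt_style(load_is_high=False, is_walking=True))
```

How cognitive load and walking are detected is left open here; both inputs are treated as given.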

Using gait information might prove beneficial in the future compared to other indicators because no additional mobile equipment for monitoring the mental state of the worker is required. Acceleration sensors can easily be integrated into the smart devices themselves, and gait information could be derived from their data. However, more research is needed to link the acceleration information to cognitive functions. In addition, the influence of smart devices and cognitive functions on gait, which was outside the scope of the current study, needs to be further explored. Doing so would make it possible to differentiate between dual-task costs and effects due to different task prioritization [5].
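
As a rough illustration of how gait information might be derived from acceleration data, the following sketch estimates cadence by counting upward threshold crossings. The signal format (acceleration magnitude in g), the threshold and the function name are assumptions; a real device would require filtering and validation that are not covered here.

```python
def estimate_cadence(samples, sample_rate_hz, threshold=1.2):
    """Estimate steps per minute from acceleration magnitudes (in g).

    A step is counted each time the signal rises above the threshold
    (hypothetical value) after having been at or below it.
    """
    steps = 0
    above = False
    for value in samples:
        if value > threshold and not above:
            steps += 1
        above = value > threshold
    duration_min = len(samples) / sample_rate_hz / 60
    return steps / duration_min if duration_min > 0 else 0.0


# Synthetic signal: 6 s at 10 Hz with one peak every 0.5 s (two steps per second).
synthetic = [1.5 if i % 5 == 0 else 1.0 for i in range(60)]
print(estimate_cadence(synthetic, sample_rate_hz=10))
```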

5 Conclusion

The combination of a particular smart device and walking condition determined the effect on attention and thus the order picker’s mental state, suggesting that gait speed and performance requirements can be used as cognitive load indicators in technical systems to adapt instructions. Physical properties (e.g., contrast, color intensity, font size) or the content of instructions can be adjusted to match the current mental state.