Research Report

Influences of intra- and crossmodal grouping on visual and tactile Ternus apparent motion
Research Highlights
- ‘Non-motion’ stimuli influence visual and tactile Ternus apparent motion.
- Influences of cue-target onset asynchronies on visual/tactile apparent motion.
- The roles of temporal grouping and attentional modulation on crossmodal interaction.
Introduction
Investigating crossmodal interactions is essential for a comprehensive understanding of the perceptual system (Welch and Warren, 1986). A number of studies on crossmodal interaction have shown asymmetrical influences on the (perceived) direction of apparent motion between different modalities (Craig, 2006, Occelli et al., 2009, Sanabria et al., 2007a, Sanabria et al., 2004, Soto-Faraco et al., 2002, Strybel and Vatakis, 2004). In a typical paradigm for investigating crossmodal capture of apparent motion (e.g., Soto-Faraco et al., 2002), a pair of stimuli in one modality is presented synchronously or asynchronously with another pair of collocated stimuli in a second modality; that is, one apparent-motion stream is induced in each of the two modalities, with either congruent or incongruent motion directions. Observers are asked to judge the movement direction of the stimuli in the target modality while ignoring those in the distractor modality. Observers usually make highly accurate direction judgments with congruent motion directions, but their performance is relatively poor with incongruent directions in the ‘synchronous’ condition. This phenomenon has been termed the ‘dynamic-capture effect’ (Soto-Faraco et al., 2002, Soto-Faraco et al., 2004a, Soto-Faraco et al., 2004b). However, if the distractor stimuli are presented asynchronously, the capture effect becomes weak or even disappears, owing to a reduced crossmodal interaction between the two (temporally separated) motion streams (Soto-Faraco et al., 2004a, Soto-Faraco et al., 2004b). Performance in the incongruent condition also depends on which modality serves as the (irrelevant) distractor modality.
For example, direction judgments of auditory apparent motion have been shown to be reduced to chance level by incongruent visual apparent motion, while the direction of visual apparent motion is rather unaffected by incongruent auditory apparent motion (Sanabria et al., 2007a, Sanabria et al., 2004, Soto-Faraco et al., 2002). Besides audio–visual interactions, asymmetric dynamic-capture effects have also been observed between touch and vision, and touch and audition (Craig, 2006, Lyons et al., 2006, Occelli et al., 2009).
Recently, dynamic-capture effects have been found to be dependent on intramodal motion grouping (Sanabria et al., 2005, Sanabria et al., 2004, Lyons et al., 2006), stimulus intensity (Occelli et al., 2009), as well as attention (Oruc et al., 2008). For example, Sanabria and colleagues (Sanabria et al., 2005, Sanabria et al., 2004) manipulated the strength of task-irrelevant visual apparent motion (distractor modality) by increasing the number of visual stimuli and extending their presentation from before to after the presentation of auditory apparent motion (target modality). They found dynamic capture to be significantly reduced when the visual stimuli were presented prior to the combined audiovisual display, compared to when the audiovisual display was presented first (Sanabria et al., 2005); using a central visual alerting signal (the fixation point flashing in the same rhythm as the first two lights in the six-light apparent-motion stream presented in their experiment), they were able to rule out that this reduction reflected a temporal-warning effect. Given this, Sanabria et al. concluded that strengthening intramodal visual grouping improves perceptual segregation of the auditory (target) from the visual (distractor) events. However, rather than manipulating the number of intramodal stimuli, Occelli et al. (2009) recently varied sound intensity to examine its effect on crossmodal audiotactile dynamic capture; they found more intense auditory distractors to induce a stronger crossmodal capture effect than less intense distractors. They argued that sounds of higher intensity may attract attention, thus increasing their capability of capturing the tactile apparent motion. It is worth noting that although both studies manipulated the strength of intramodal (visual and, respectively, auditory) grouping, the results are somewhat discordant. The disparity is perhaps due to differences in how attentional engagement (capture) operates in the various modalities.
Previous studies of crossmodal attention have revealed that stimulation in one modality can enhance perceptual sensitivity for spatially congruent stimulus locations in another modality (Eimer et al., 2002, Gray and Tan, 2002, Kennett et al., 2001, Macaluso et al., 2000). For example, a concurrent tactile cue can improve discriminability of visual stimuli at the same location (Macaluso et al., 2000), and a sudden sound can improve the detectability of a flash subsequently appearing at the same location (McDonald et al., 2000). The role of attention in crossmodal dynamic capture has been systematically investigated recently by Oruc et al. (2008). They modulated attention by trial-wise precueing/postcueing or by blockwise cueing of the target modality — that is, the target modality was specified either before the presentation of a given trial display, after the presentation, or before each trial block. The results revealed divided attention (i.e., in the postcueing condition, where participants had to attend to both modalities on each trial) to greatly increase the asymmetric capture effect with audiotactile apparent-motion streams; however, there was no effect on the discrimination of visual apparent motion in visuotactile and audiovisual motion streams. The results showed attention to influence crossmodal dynamic-capture effects, but the attentional modulations were asymmetric among modalities.
The above-mentioned ‘crossmodal dynamic-capture’ studies have examined motion-direction capture effects mainly by using simultaneous presentation of apparent motion in both the target and distractor modalities. One limitation of the crossmodal dynamic-capture paradigm is that it is hard to rule out response biases induced by motion in the distractor modality, as well as the influence of a static ‘ventriloquism’ effect (where the position of the stimulation in one modality is captured by that of the stimulation in another modality) (Sanabria et al., 2007b). In addition, participants in these studies were only required to judge the motion direction of the target stimuli; thus, how the strength of the motion percept is modulated was largely neglected (but see Occelli et al., 2009). In contrast, several recent studies examining apparent motion have shown that the motion percept in the target modality can be modulated by spatially uninformative and task-irrelevant grouping stimuli in the distractor modality (henceforth, we refer to the latter stimuli as ‘non-motion’ distractors) (Bruns and Getzmann, 2008, Getzmann, 2007, Shi et al., 2010). For example, Getzmann and colleagues found that short sounds (at a fixed location) temporally intervening between the visual stimuli facilitated the impression of continuous visual motion relative to the baseline (visual stimuli without sounds), whereas sounds presented before the first or after the second visual stimulus, as well as simultaneously presented sounds, reduced the continuous-motion impression (Getzmann, 2007). Bruns and Getzmann (2008) argued that crossmodal temporal grouping, which gives rise to a temporal ventriloquism effect (Morein-Zamir et al., 2003), is the main factor influencing the visual motion impression.
However, despite the recent focus on the influence of crossmodal interactions on apparent motion (Bruns and Getzmann, 2008, Occelli et al., 2009, Sanabria et al., 2004), the role of intramodal perceptual grouping is still not clear. Moreover, it is currently not known how modulations of non-motion perceptual grouping in the distractor modality would influence the motion percept in the target modality. Finally, the role of attention in crossmodal apparent motion is not fully understood.
To examine the roles of non-motion perceptual grouping and attention in crossmodal motion interaction, we adopted the two-state Ternus apparent-motion paradigm in the present study (Ternus, 1926). Ternus apparent motion arises from a typical Ternus display (see Fig. 1), which consists of two sequential visual frames, each presenting two horizontal dots (with the same inter-dot distance in the two frames), where the two frames, when overlaid, share one common dot at the center (the ‘middle’ dot). Depending on the stimulus onset asynchrony (SOA) between the frames, there are two distinct percepts: ‘element motion’ and ‘group motion’. In element motion, the outer dots are perceived as moving, while the center dot appears to remain static or flashing; in group motion, the two dots are perceived to move together as a group. Similar motion percepts have been demonstrated recently in touch (Harrar and Harris, 2007). It has been proposed that in Ternus apparent motion, temporal and spatial grouping processes are in competition (Kramer and Yantis, 1997). At short SOAs between the two frames, temporal grouping (temporal proximity) prevails; that is, the stimulus in the ‘overlapping’ position of the first frame is likely to be grouped with the stimulus appearing at the same location in the second frame, leading to the percept of ‘element motion’. By contrast, at long SOAs, temporal proximity weakens and spatial grouping within a frame becomes more prominent, giving rise to a dominant percept of ‘group motion’. In contrast to simple direction judgments of apparent motion, this spatio-temporal grouping mechanism makes the Ternus display a useful tool for examining intra- and crossmodal perceptual grouping effects on crossmodal motion perception.
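The two-frame structure and the SOA-dependent percept described above can be summarized in a toy sketch. This is purely illustrative: the function names, dot positions, and the 150 ms switch point are our placeholder assumptions, not the study's actual parameters.

```python
def ternus_frames(soa_ms, positions=(-1.0, 0.0, 1.0)):
    """Build the two frames of a Ternus display: each frame shows two dots,
    and the frames share the middle position (the 'overlapping' dot)."""
    left, middle, right = positions
    frame1 = {"onset_ms": 0.0, "dots": (left, middle)}
    frame2 = {"onset_ms": float(soa_ms), "dots": (middle, right)}
    return frame1, frame2

def dominant_percept(soa_ms, threshold_ms=150.0):
    """Toy rule for the competition described in the text: temporal grouping
    wins at short SOAs ('element motion'), spatial grouping within a frame
    wins at long SOAs ('group motion'). The threshold is a placeholder."""
    return "element motion" if soa_ms < threshold_ms else "group motion"

print(dominant_percept(80))   # short inter-frame SOA -> element motion
print(dominant_percept(300))  # long inter-frame SOA  -> group motion
```

In practice the transition is not a hard threshold but a gradual shift in report probabilities, which is why the percept is measured psychometrically across a range of SOAs.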
We chose vision and touch to investigate crossmodal apparent motion in the present study chiefly for two reasons. First, Ternus apparent motion has been demonstrated to be similar in vision and touch (Harrar and Harris, 2007). Second, vision and touch have different temporal properties. Vision has a lower temporal resolution but a higher spatial resolution than touch. Accordingly, the role of spatio-temporal grouping for crossmodal apparent motion may differ between vision and touch.
To induce non-motion perceptual grouping, we introduced rhythmic precues or synchronous cues at the middle position of the Ternus display (Fig. 2). Rhythmic cues extended the presentation rhythm of the (repeated) middle stimulus to the period before the onset of the Ternus apparent-motion display, analogous to the enhancement of apparent motion in the distractor modality by increasing the number of distractors prior to the onset of the target stimuli in previous studies (Sanabria et al., 2005, Sanabria et al., 2004). The major difference, however, is that we manipulated only non-motion temporal grouping, instead of both spatial and temporal grouping. Synchronous cues, implemented by adding synchronous stimuli in the irrelevant modality at the middle position, could enhance the multisensory salience of the middle stimuli, similar to the auditory intensity manipulation in Occelli et al.'s (2009) study of audiotactile crossmodal dynamic capture. As shown in most crossmodal dynamic-capture studies, increasing the distractor-target asynchrony eventually diminishes the crossmodal interaction, owing to the clear separation of the two motion streams (Soto-Faraco et al., 2004a, Soto-Faraco et al., 2004b). To systematically investigate how temporal grouping and attention shape this crossmodal interaction, we varied the onset asynchronies between the distractor stimuli and the target Ternus display (henceforth, we refer to the distractor stimuli as ‘(pre-)cues’ and to the asynchrony as the cue-target onset asynchrony, CTOA). If the crossmodal interaction operates (only) within a short temporal range, one would expect precues with long CTOAs not to influence the perception of apparent motion.
However, crossmodal cueing has also been shown to enhance perceptual sensitivity (Eimer et al., 2002, Gray and Tan, 2002, Kennett et al., 2001, Macaluso et al., 2000); thus, in the Ternus display, it may improve the detection of stimuli within one frame and enhance the spatial linkage (grouping) of these stimuli. On this logic, one would expect modulations of apparent motion induced by crossmodal precues even with long CTOAs. Furthermore, crossmodal temporal grouping may be weaker than intramodal temporal grouping (Gilbert, 1938) and it may operate via different mechanisms. Thus, in the present study, we also compared the effects of intra- with those of crossmodal temporal grouping.
Specifically, in Experiments 1 and 2, we manipulated intramodal temporal grouping of the middle stimuli within a given (either the visual or the tactile) modality by presenting the middle stimulus twice prior to the Ternus display. In Experiment 1, the SOA between the two precues as well as the CTOA were kept the same as the SOA between the Ternus frames to maintain the same presentation rhythm, so as to enhance temporal grouping. Note that the CTOA (and thus the SOA between the two precues and that between the two Ternus display frames) varied randomly across trials. We refer to this as the rhythmic-cue condition. The sequence of stimuli and their timing was essentially the same in Experiment 2, except that the CTOA between the second precue and the first Ternus frame was fixed at either a short or a long value (170 vs. 530 ms). Short and long CTOAs were compared in order to examine the dynamic change of the cueing effect. Experiments 3 and 4 were designed to examine the crossmodal effects of tactile precues and tactile synchronous cues on visual apparent motion. The presentation rhythm in Experiment 3 was the same as in Experiment 1, and fixed short and long CTOAs were compared in Experiment 4 (similar to Experiment 2). Experiments 5 and 6 were analogous to Experiments 3 and 4, however with the cue and target modalities switched.
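As a minimal illustration of the rhythmic-cue timing in Experiment 1, the event schedule can be sketched as follows. The function and key names are our own; the only property taken from the design is the equal spacing of the two precues and the two Ternus frames (precue SOA = CTOA = inter-frame SOA).

```python
def rhythmic_cue_schedule(ctoa_ms):
    """Onset times (ms) for a rhythmic-cue trial: two middle-position precues
    followed by the two Ternus frames, all separated by the same interval,
    so that the precues establish the rhythm the Ternus frames continue."""
    return {
        "precue_1": 0.0,
        "precue_2": float(ctoa_ms),
        "ternus_frame_1": 2.0 * ctoa_ms,
        "ternus_frame_2": 3.0 * ctoa_ms,
    }

# One possible trial rhythm (interval value chosen for illustration):
print(rhythmic_cue_schedule(170))
```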
Results
The percentages of ‘group motion’ responses were calculated for each condition individually for each observer; then the psychometric curves were fitted using a logistic function (Treutwein and Strasburger, 1999). Fig. 3 illustrates typical results and psychometric estimates for one observer, namely, for the rhythmic-cue and baseline conditions in the visual Ternus apparent-motion task with intramodal cues (Experiment 1). Subsequently, for each observer, the SOA at which he/she was equally likely to report ‘element motion’ and ‘group motion’ (the point of subjective equality, PSE) was estimated from the fitted curve.
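The fitting step can be sketched in a minimal, self-contained example. The response proportions below are hypothetical, and the coarse grid search is our simple stand-in for a proper fitting routine; only the logistic form of the psychometric function follows the analysis described above.

```python
import math

def logistic(soa, pse, slope):
    """P('group motion') as a logistic function of the inter-frame SOA.
    pse: SOA (ms) at which both percepts are equally likely; slope: steepness."""
    return 1.0 / (1.0 + math.exp(-(soa - pse) / slope))

def fit_logistic(soas, p_group):
    """Least-squares fit of (pse, slope) by a coarse grid search over
    plausible values (a toy stand-in for standard curve-fitting tools)."""
    best_pse, best_slope, best_err = None, None, float("inf")
    for pse in range(50, 251):        # candidate PSEs, in ms
        for slope in range(5, 101):   # candidate slopes, in ms
            err = sum((logistic(s, pse, slope) - p) ** 2
                      for s, p in zip(soas, p_group))
            if err < best_err:
                best_pse, best_slope, best_err = pse, slope, err
    return best_pse, best_slope

# Hypothetical data: proportion of 'group motion' reports at each SOA (ms)
soas = [50, 80, 110, 140, 170, 200, 230]
p_group = [0.05, 0.12, 0.30, 0.55, 0.78, 0.92, 0.97]

pse, slope = fit_logistic(soas, p_group)
print(f"Estimated PSE: {pse} ms")
```

The fitted PSE is then the per-observer, per-condition summary statistic: shifts of the PSE between conditions indicate shifts of the element-/group-motion transition.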
Discussion
The major findings of the present study are summarized in Table 2.
With non-motion intramodal precueing of the middle position, differential effects were observed between visual and tactile Ternus apparent motion. Cueing the middle flash at a fixed short CTOA or in rhythmic fashion shifted the visual Ternus apparent motion towards ‘element motion’. In contrast, an opponent-type pattern was found for tactile precues between short and long CTOAs: precueing the middle tap at short CTOAs shifted
Participants
In each experiment, thirteen naïve observers were enrolled for payment: mean age 24 (9 females), 22 (7 females), 22.4 (7 females), 23.5 (9 females), 23.3 (5 females), and 22.5 (8 females) for Experiments 1 to 6, respectively. All had normal or corrected-to-normal vision, and none of them reported any history of somatosensory disorders. They all gave informed consent prior to the experiment. The experiments were conducted in accordance with the guidelines of the LMU Department of Psychology ethics committee.
References
- et al. (2004). The ventriloquist effect results from near-optimal bimodal integration. Curr. Biol.
- et al. (2008). Audiovisual influences on the perception of visual apparent motion: exploring the effect of a single sound. Acta Psychol. (Amst).
- et al. (2008). The cognitive and neural correlates of “tactile consciousness”: a multisensory perspective. Conscious. Cogn.
- et al. (1979). The peripheral critical flicker frequency. Vision Res.
- et al. (2003). Auditory capture of vision: examining temporal ventriloquism. Brain Res. Cogn. Brain Res.
- et al. (2008). The effect of attention on the illusory capture of motion in bimodal stimuli. Brain Res.
- et al. (2005). Intramodal perceptual grouping modulates multisensory integration: evidence from the crossmodal dynamic capture task. Neurosci. Lett.
- et al. (2007). Perceptual and decisional contributions to audiovisual interactions in the perception of apparent motion: a signal detection study. Cognition.
- et al. (2002). The ventriloquist in motion: illusory capture of dynamic information across sensory modalities. Brain Res. Cogn. Brain Res.
- (2002). Multisensory attention and tactile information-processing. Behav. Brain Res.
- Sensory specificity of apparent motion. J. Exp. Psychol. Hum. Percept. Perform.
- The Psychophysics Toolbox. Spat. Vis.
- Visual motion interferes with tactile motion perception. Perception.
- Cross-modal interactions between audition, touch, and vision in endogenous spatial attention: ERP evidence on preparatory states and sensory modulations. J. Cogn. Neurosci.
- The effect of brief auditory stimuli on visual apparent motion. Perception.
- A study in inter-sensory Gestalten. Psychol. Bull.
- Dynamic and predictive links between touch and vision. Exp. Brain Res.
- Tactile motion activates the human middle temporal/V5 (MT/V5) complex. Eur. J. Neurosci.
- Multimodal Ternus: visual, tactile, and visuo-tactile grouping in apparent motion. Perception.
- Perceptual organization of apparent motion in the Ternus display. Perception.
- Tactile-visual temporal ventriloquism: no effect of spatial disparity. Percept. Psychophys.
- Tactile-visual links in exogenous spatial attention under different postures: convergent evidence from psychophysics and ERPs. J. Cogn. Neurosci.
- Perceptual grouping in space and time: evidence from the Ternus display. Percept. Psychophys.