Elsevier

Brain Research

Volume 1354, 1 October 2010, Pages 152-162

Research Report
Influences of intra- and crossmodal grouping on visual and tactile Ternus apparent motion

https://doi.org/10.1016/j.brainres.2010.07.064

Abstract

Previous studies of dynamic crossmodal integration have revealed that the direction of apparent motion in a target modality can be influenced by a spatially incongruent motion stream in another, distractor modality. Yet, it remains to be examined whether non-motion intra- and crossmodal perceptual grouping can affect apparent motion in a given target modality. To address this question, we employed Ternus apparent-motion displays, which consist of three horizontally aligned visual (or tactile) stimuli that can alternately be seen as either ‘element motion’ or ‘group motion’. We manipulated intra- and crossmodal grouping by cueing the middle stimulus with different cue-target onset asynchronies (CTOAs). In unimodal conditions, we found Ternus apparent motion to be readily biased towards ‘element motion’ by precues with short or intermediate CTOAs in the visual modality and by precues with short CTOAs in the tactile modality. By contrast, crossmodal precues with short or intermediate CTOAs had no influence on Ternus apparent motion. However, crossmodal synchronous tactile cues led to dominant ‘group motion’ percepts. In addition, for unimodal visual apparent motion, precues with long CTOAs shifted apparent motion towards ‘group motion’ in general. The results suggest that intra- and crossmodal interactions on visual and tactile apparent motion take place in different temporal ranges, but both are subject to attentional modulations at long CTOAs.

Research Highlights

► ‘Non-motion’ stimuli influence visual and tactile Ternus apparent motion.
► Influences of cue-target onset asynchronies on visual/tactile apparent motion.
► The roles of temporal grouping and attentional modulation on crossmodal interaction.

Introduction

Investigating crossmodal interactions is essential for a comprehensive understanding of the perceptual system (Welch and Warren, 1986). A number of studies on crossmodal interaction have shown asymmetrical influences on the (perceived) direction of apparent motion between different modalities (Craig, 2006, Occelli et al., 2009, Sanabria et al., 2007a, Sanabria et al., 2004, Soto-Faraco et al., 2002, Strybel and Vatakis, 2004). In a typical paradigm for investigating crossmodal capture of apparent motion (e.g., Soto-Faraco et al., 2002), a pair of stimuli in one modality is presented synchronously or asynchronously with another pair of collocated stimuli in a second modality; that is, one apparent-motion stream is induced in one modality and another in a second modality, with either congruent or incongruent directions of motion. Observers are asked to judge the movement direction of the stimuli in the target modality, while ignoring the one in the distractor modality. Observers usually make highly accurate direction judgments with congruent motion directions, but their performance is relatively poor with incongruent directions in the ‘synchronous’ condition. This phenomenon has been termed the ‘dynamic-capture effect’ (Soto-Faraco et al., 2002, Soto-Faraco et al., 2004a, Soto-Faraco et al., 2004b). However, if the distractor stimuli are presented asynchronously, the capture effects become weak or even disappear, owing to a reduced crossmodal interaction of two (temporally separated) motion streams (Soto-Faraco et al., 2004a, Soto-Faraco et al., 2004b). Performance in the incongruent condition also depends on which modality has been selected as the (irrelevant) distractor modality. For example, direction judgments of auditory apparent motion have been shown to be reduced to chance level by incongruent visual apparent motion, while the direction of visual apparent motion is rather unaffected by incongruent auditory apparent motion (Sanabria et al., 2007a, Sanabria et al., 2004, Soto-Faraco et al., 2002). Besides audio–visual interactions, asymmetric dynamic-capture effects have also been observed between touch and vision, and touch and audition (Craig, 2006, Lyons et al., 2006, Occelli et al., 2009).

Recently, dynamic-capture effects have been found to be dependent on intramodal motion grouping (Sanabria et al., 2005, Sanabria et al., 2004, Lyons et al., 2006), stimulus intensity (Occelli et al., 2009), as well as attention (Oruc et al., 2008). For example, Sanabria et al. (2004, 2005) manipulated the strength of task-irrelevant visual apparent motion (distractor modality) by increasing the number of visual stimuli and extending their presentation from before to after the presentation of auditory apparent motion (target modality). They found dynamic capture to be significantly reduced when the visual stimuli were presented prior to the combined audiovisual display, compared to when the audiovisual display was presented first (Sanabria et al., 2005); using a central visual alerting signal (the fixation point flashing in the same rhythm as the first two lights in the six-lights apparent-motion stream presented in their experiment), they were able to rule out that this reduction reflected a temporal-warning effect. Given this, Sanabria et al. concluded that strengthening intramodal visual grouping would improve perceptual segregation of the auditory (target) from the visual (distractor) events. However, rather than manipulating the number of intramodal stimuli, Occelli et al. (2009) recently varied sound intensity to examine its effect on crossmodal audiotactile dynamic capture; they found more intense auditory distractors to induce a stronger crossmodal capture effect compared to less intense distractors. They argued that sounds of higher intensity may attract attention, thus increasing their capability of capturing the tactile apparent motion. It is worth noting that although both studies manipulated the strength of intramodal (visual and, respectively, auditory) grouping, the results are somewhat discordant. The disparity is perhaps due to differences in how attentional engagement (capture) operates in the various modalities.

Previous studies of crossmodal attention have revealed that stimulation in one modality can enhance perceptual sensitivity for spatially congruent stimulus locations in another modality (Eimer et al., 2002, Gray and Tan, 2002, Kennett et al., 2001, Macaluso et al., 2000). For example, a concurrent tactile cue can improve discriminability of visual stimuli at the same location (Macaluso et al., 2000), and a sudden sound can improve the detectability of a flash subsequently appearing at the same location (McDonald et al., 2000). The role of attention in crossmodal dynamic capture has been systematically investigated recently by Oruc et al. (2008). They modulated attention by trial-wise precueing/postcueing or by blockwise cueing of the target modality — that is, the target modality was specified either before the presentation of a given trial display, after the presentation, or before each trial block. The results revealed divided attention (i.e., in the postcueing condition, where participants had to attend to both modalities on each trial) to greatly increase the asymmetric capture effect with audiotactile apparent-motion streams; however, there was no effect on the discrimination of visual apparent motion in visuotactile and audiovisual motion streams. The results showed attention to influence crossmodal dynamic-capture effects, but the attentional modulations were asymmetric among modalities.

The above-mentioned ‘crossmodal dynamic-capture’ studies have examined motion direction capture effects mainly by using simultaneous presentation of apparent motion in both the target and distractor modalities. One limitation of the crossmodal dynamic-capture paradigm is that it is hard to rule out response biases induced by motion in the distractor modality as well as the influence of a static ‘ventriloquism’ effect (where the position of the stimulation in one modality is captured by that of the stimulation in another modality) (Sanabria et al., 2007b). In addition, participants in these studies were only required to judge the motion direction of the target stimuli. Thus, how the strength of the motion percept is modulated was largely neglected (but see Occelli et al., 2009). In contrast, several recent studies examining apparent motion have shown that the motion percept in the target modality can be modulated by spatially uninformative but temporally irrelevant grouping stimuli in the distractor modality (henceforth, we refer to the latter stimuli as ‘non-motion’ distractors) (Bruns and Getzmann, 2008, Getzmann, 2007, Shi et al., 2010). For example, Getzmann and colleagues found that the presentation of short sounds (at a fixed location) temporally intervening between the visual stimuli facilitated the impression of continuous visual motion relative to the baseline (visual stimuli without sounds), whereas sounds presented before the first or after the second visual stimulus as well as simultaneously presented sounds reduced the continuous-motion impression (Getzmann, 2007). Bruns and Getzmann (2008) argued that crossmodal temporal grouping, which gives rise to a temporal ventriloquism effect (Morein-Zamir et al., 2003), is the main factor influencing the visual motion impression.

However, despite the recent focus on the influence of crossmodal interactions on apparent motion (Bruns and Getzmann, 2008, Occelli et al., 2009, Sanabria et al., 2004), the role of intramodal perceptual grouping is still not clear. Moreover, it is currently not known how modulations of non-motion perceptual grouping in the distractor modality would influence the motion percept in the target modality. Finally, the role of attention in crossmodal apparent motion is not fully understood.

To examine the roles of non-motion perceptual grouping and attention in crossmodal motion interaction, we adopted the two-state Ternus apparent-motion paradigm in the present study (Ternus, 1926). Ternus apparent motion arises from a typical Ternus display (see Fig. 1), which consists of two sequential visual frames, each presenting two horizontally arranged dots (with the same inter-dot distance in the two frames), where the two frames, when overlaid, share one common dot at the center (the ‘middle’ dot). Depending on the stimulus onset asynchrony (SOA) between the frames, one of two distinct percepts typically arises: ‘element motion’ or ‘group motion’. In element motion, the outer dots are perceived as moving, while the center dot appears to remain static or flashing; in group motion, the two dots are perceived to move together as a group. Similar motion percepts have been demonstrated recently in touch (Harrar and Harris, 2007). It has been proposed that in Ternus apparent motion, temporal and spatial grouping processes are in competition (Kramer and Yantis, 1997). At short SOAs between the two frames, temporal grouping (temporal proximity) prevails, that is, the stimulus in the ‘overlapping’ position of the first frame is likely to be grouped with the stimulus appearing at the same location in the second frame, leading to the percept of ‘element motion’. By contrast, at long SOAs, temporal proximity weakens and spatial grouping within a frame becomes more prominent, giving rise to a dominant percept of ‘group motion’. In contrast to simple direction judgments of apparent motion, this spatio-temporal grouping mechanism in Ternus apparent motion provides a useful tool to examine intra- and crossmodal perceptual grouping effects on crossmodal motion perception.

We chose vision and touch to investigate crossmodal apparent motion in the present study chiefly for two reasons. First, Ternus apparent motion has been demonstrated to be similar in vision and touch (Harrar and Harris, 2007). Second, vision and touch have different temporal properties. Vision has a lower temporal resolution but a higher spatial resolution than touch. Accordingly, the role of spatio-temporal grouping for crossmodal apparent motion may differ between vision and touch.

To induce non-motion perceptual grouping, we introduced rhythmic precues or synchronous cues at the middle position of the Ternus display (Fig. 2). Rhythmic cues extended the presentation rhythm of the (repeated) middle stimuli back to before the onset of the Ternus apparent-motion display, which is analogous to the enhancement of apparent motion in the distractor modality by increasing the number of distractors prior to the onset of the target stimuli in previous studies (Sanabria et al., 2005, Sanabria et al., 2004). However, the major difference is that we only manipulated non-motion temporal grouping, instead of both spatial and temporal grouping. Synchronous cues, implemented by adding synchronous stimuli in the irrelevant modality to the middle position, could enhance the multisensory salience of the middle stimuli, which is similar to the auditory intensity modulation in Occelli et al.'s (2009) study of audiotactile crossmodal dynamic capture. As shown in most crossmodal dynamic-capture studies, increasing the asynchrony between distractor and target would eventually diminish the crossmodal interaction due to clearly separated motion streams (Soto-Faraco et al., 2004a, Soto-Faraco et al., 2004b). In order to systematically investigate the crossmodal interaction by temporal grouping and attention, we varied the onset asynchronies between the distractor stimuli and the target Ternus display (henceforth, we will refer to the distractor stimuli as ‘(pre-) cues’ and the asynchrony as cue-target onset asynchrony, CTOA). If the crossmodal interaction operates (only) within a short temporal range, one would expect precues with long CTOAs not to influence the perception of apparent motion. However, crossmodal cueing has also been shown to enhance perceptual sensitivity (Eimer et al., 2002, Gray and Tan, 2002, Kennett et al., 2001, Macaluso et al., 2000); thus, in the Ternus display, it may improve the detection of stimuli within one frame and enhance the spatial linkage (grouping) of these stimuli. On this logic, one would expect modulations of apparent motion induced by crossmodal precues even with long CTOAs. Furthermore, crossmodal temporal grouping may be weaker than intramodal temporal grouping (Gilbert, 1938) and it may operate via different mechanisms. Thus, in the present study, we also compared the effects of intramodal with those of crossmodal temporal grouping.

Specifically, in Experiments 1 and 2, we manipulated intramodal temporal grouping of the middle stimuli within a given (either the visual or the tactile) modality by presenting the middle stimulus twice prior to the Ternus display. In Experiment 1, the SOA between the two precues as well as the CTOA were kept the same as the SOA between the Ternus frames to maintain the same presentation rhythm, so as to enhance the temporal grouping. Note that the CTOA (and thus the SOA between the two precues and that between the two Ternus display frames) varied randomly across trials. We refer to this as the rhythmic-cue condition. The sequence of stimuli and their timing was essentially the same in Experiment 2, except that the CTOA between the second precue and the first Ternus frame was fixed at either a short or a long value (170 vs. 530 ms). Short and long CTOAs were compared in order to examine the dynamic change of the cueing effect. Experiments 3 and 4 were designed to examine the crossmodal effects of tactile precues and tactile synchronous cues on visual apparent motion. The presentation rhythm in Experiment 3 was the same as that in Experiment 1, and fixed short and long CTOAs were compared in Experiment 4 (similar to Experiment 2). Experiments 5 and 6 were analogous to Experiments 3 and 4, but with the cue and target modalities switched.
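The event timing just described can be made concrete with a minimal sketch. It assumes that, in the fixed-CTOA design, the SOA between the two precues still equals the SOA between the Ternus frames (the text says only that the timing was "essentially the same"). The function name `trial_timeline`, the event labels, and the example frame SOA of 200 ms are hypothetical and serve only as illustration; the 170 ms CTOA is the short value given in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    label: str
    onset_ms: int


def trial_timeline(frame_soa_ms: int,
                   fixed_ctoa_ms: Optional[int] = None) -> List[Event]:
    """Onsets (ms) of two precues followed by two Ternus frames.

    fixed_ctoa_ms=None  -> rhythmic-cue condition: CTOA equals the frame SOA.
    fixed_ctoa_ms=value -> fixed-CTOA condition (e.g. 170 or 530 ms).
    """
    # Assumption: the precue SOA matches the frame SOA in both designs.
    precue_soa_ms = frame_soa_ms
    ctoa_ms = frame_soa_ms if fixed_ctoa_ms is None else fixed_ctoa_ms

    cue1 = 0
    cue2 = cue1 + precue_soa_ms
    frame1 = cue2 + ctoa_ms          # CTOA: second precue -> first frame
    frame2 = frame1 + frame_soa_ms   # SOA between the two Ternus frames
    return [
        Event("precue 1", cue1),
        Event("precue 2", cue2),
        Event("Ternus frame 1", frame1),
        Event("Ternus frame 2", frame2),
    ]


# Hypothetical frame SOA of 200 ms, for illustration only.
rhythmic = [e.onset_ms for e in trial_timeline(200)]
fixed_short = [e.onset_ms for e in trial_timeline(200, fixed_ctoa_ms=170)]
```

In the rhythmic case all four events are equally spaced, whereas a fixed CTOA breaks the rhythm between the second precue and the first Ternus frame.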

Results

The percentages of ‘group motion’ responses were calculated for each condition individually for each observer; then the psychometric curves were fitted using a logistic function (Treutwein and Strasburger, 1999). Fig. 3 illustrates typical results and psychometric estimates for one observer, namely, for the rhythmic-cue and baseline conditions in the visual Ternus apparent-motion task with intramodal cues (Experiment 1). Subsequently, for each observer, the SOA at which he/she was equally
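As a rough illustration of fitting a logistic psychometric function to the proportion of ‘group motion’ responses, the sketch below uses a plain least-squares grid search. This is a stand-in technique chosen for brevity, not the fitting procedure of Treutwein and Strasburger (1999). The parameterization (a 50% point `pse_ms` and a spread `slope_ms`), the SOA levels, and the synthetic data are all assumptions made for the example.

```python
import math


def logistic(soa_ms, pse_ms, slope_ms):
    """Proportion of 'group motion' responses predicted at a given SOA."""
    return 1.0 / (1.0 + math.exp(-(soa_ms - pse_ms) / slope_ms))


def fit_logistic(soas_ms, p_group, pse_grid, slope_grid):
    """Return the (pse, slope) grid point with the smallest squared error."""
    best = None
    for pse in pse_grid:
        for slope in slope_grid:
            err = sum((logistic(s, pse, slope) - p) ** 2
                      for s, p in zip(soas_ms, p_group))
            if best is None or err < best[0]:
                best = (err, pse, slope)
    return best[1], best[2]


# Hypothetical SOA levels and noise-free synthetic proportions.
soas = [80, 110, 140, 170, 200, 230, 260]
observed = [logistic(s, 165.0, 20.0) for s in soas]

pse, slope = fit_logistic(soas, observed, range(100, 251), range(5, 61))
```

Under this parameterization, `pse` is the SOA at which the fitted curve crosses 50%, so a shift of `pse` between a cue condition and its baseline would index a bias towards one of the two percepts.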

Discussion

The major findings of the present study are summarized in Table 2.

With non-motion intramodal precueing of the middle position, differential effects were observed between visual and tactile Ternus apparent motion. Cueing the middle flash at a fixed short CTOA or in rhythmic fashion shifted the visual Ternus apparent motion towards ‘element motion’. In contrast, an opponent-type pattern was found for tactile precues between short and long CTOAs: precueing the middle tap at short CTOAs shifted

Participants

In each experiment, thirteen naïve observers were enrolled for payment: mean age 24 (9 females), 22 (7 females), 22.4 (7 females), 23.5 (9 females), 23.3 (5 females), and 22.5 (8 females) for Experiments 1 to 6 respectively. All had normal or corrected-to-normal vision and none of them reported any history of somatosensory disorders. They all gave informed consent prior to the experiment. The experiments were conducted in accordance with the guidelines of the LMU Department of Psychology ethics

References (46)

  • P.G. Allen et al. Sensory specificity of apparent motion. J. Exp. Psychol. Hum. Percept. Perform. (1981)
  • D.H. Brainard. The Psychophysics Toolbox. Spat. Vis. (1997)
  • J.C. Craig. Visual motion interferes with tactile motion perception. Perception (2006)
  • M. Eimer et al. Cross-modal interactions between audition, touch, and vision in endogenous spatial attention: ERP evidence on preparatory states and sensory modulations. J. Cogn. Neurosci. (2002)
  • S. Getzmann. The effect of brief auditory stimuli on visual apparent motion. Perception (2007)
  • G.M. Gilbert. A study in inter-sensory Gestalten. Psychol. Bull. (1938)
  • R. Gray et al. Dynamic and predictive links between touch and vision. Exp. Brain Res. (2002)
  • M.C. Hagen et al. Tactile motion activates the human middle temporal/V5 (MT/V5) complex. Eur. J. Neurosci. (2002)
  • V. Harrar et al. Multimodal Ternus: visual, tactile, and visuo-tactile grouping in apparent motion. Perception (2007)
  • Z.J. He et al. Perceptual organization of apparent motion in the Ternus display. Perception (1999)
  • M. Keetels et al. Tactile-visual temporal ventriloquism: no effect of spatial disparity. Percept. Psychophys. (2008)
  • S. Kennett et al. Tactile-visual links in exogenous spatial attention under different postures: convergent evidence from psychophysics and ERPs. J. Cogn. Neurosci. (2001)
  • P. Kramer et al. Perceptual grouping in space and time: evidence from the Ternus display. Percept. Psychophys. (1997)