1. Introduction
Due to the growing interest in the multisensory processes influencing eating and drinking behaviours (see [1,2,3,4,5,6] for reviews), the interactions between dimensions of the sensory signal occurring during food and beverage consumption have received a great deal of attention in different fields of research. In the case of flavour, several of these crossmodal interactions have already been reported (e.g., interactions between taste and smell [5] and vision and taste [7]). In addition, the relative contributions of each sensory modality to flavour perception have been partially documented. Some studies have recently provided empirical evidence on the influence of auditory cues on flavour perception (see [8] for a review) or other food and beverage dimensions. For instance, Zampini and Spence’s study [9] showed that a product’s sound properties (loudness, frequency) can modulate perceived crispness of potato chips. Indeed, this study showed that participants’ rating of both the crispness and staleness of potato chips was influenced by the loudness and/or the pitch of the auditory feedback elicited during the biting action.
Most research has thus focused on the distinct influence of visual or auditory cues on flavour expectations or perception. Although multisensory research applied to consumer experience is now a blossoming field, the audiovisual interactions occurring before consumption are still poorly documented. This is particularly surprising given that these interactions influence consumers’ subsequent experiences by triggering specific sensory expectations (see [7] for a review). In order to foster knowledge of the mechanisms underlying the interactions between visual and auditory signals, an increasing number of studies have investigated crossmodal correspondences, a set of heterogeneous non-arbitrary associations between perceptual dimensions from different sensory modalities or higher-order dimensions (e.g., emotional or linguistic dimensions; see [10] for a review). The association between size and pitch is one of the most documented crossmodal correspondences. In particular, using simple cues such as pure tones and grey circles, it has been reported that smaller visually presented circles are typically matched with higher-pitched sounds and larger circles with lower-pitched ones [11,12,13].
In a review on multisensory integration, perception, and ecological validity, De Gelder and Bertelson [14] argued that although simple cues have been shown to elicit robust multisensory integration, it is reasonable to question whether simple cue combinations are sufficient to fully understand the complex naturalistic situations in which humans operate. In line with this statement, recent studies have investigated crossmodal correspondences using more complex and ecological stimuli in the food and beverage domain. For instance, Velasco et al.’s study [15] showed crossmodal correspondences between the auditory properties of pouring water (lower pitch for hot water versus higher pitch for cold water) and the congruent words “hot drink” and “cold drink” (see also [16,17]). Spence and Gallace’s study [18] reported another crossmodal correspondence related to carbonation in beverages. In their study, carbonated water was more strongly associated with high-pitched meaningless words, such as “kiki” and “takete”. By contrast, still mineral water was more strongly associated with lower-pitched pseudo-words, such as “bouba” and “maluma”.
Investigating crossmodal interactions between two sensory modalities already adds to knowledge in flavour perception research. However, it has also been reported that several crossmodal correspondences are mutually dependent [19]. According to Parise and Spence [19], it is likely that modifying one of the crossmodal correspondences will affect those that are mutually dependent. It is therefore necessary to investigate further the potential consequences of manipulating perceptual features that are involved in many crossmodal correspondences, such as pitch, and that may be part of a broader associative network. For instance, auditory pitch has correspondences not only with visual size but also with spatial elevation [11]. Visual stimuli with high elevation are associated with high-pitched tones and, conversely, stimuli with low elevation are associated with low-pitched tones. It should be noted that few studies have investigated cases of intramodal correspondences. One of them, reported by Evans and Treisman [11], actually failed to show an intramodal correspondence between size and spatial elevation. This can appear surprising since this correspondence has been found crossmodally (e.g., [20]). In addition, neither the size–elevation nor the pitch–elevation correspondence has been investigated using complex and ecological stimuli.
Following from the literature described above, Experiments 1 and 2 reported here aim at investigating whether there is an ecological counterpart of the pitch–size correspondence using more complex and ecological stimuli cuing carbonation in beverages. We hypothesize (1) that there are pitch–size crossmodal correspondences between small bubbles and high-pitched pouring sounds and between big bubbles and low-pitched pouring sounds; and (2) that, given the reported influence of audiovisual carbonation on perceived freshness in beverages [21], a congruent semantic prime will improve participants’ performance in pitch–size crossmodal congruency tasks. Experiment 3 then aims at investigating the possible mutual dependence between pitch, size, and elevation using ecological stimuli cuing carbonation in beverages. In particular, we investigated whether a bi-dimensional congruent visual stimulus (e.g., small bubbles and high elevation in space) jointly presented with a congruent auditory stimulus (i.e., high pitch) improves participants’ performance.
4. Experiment 3: The Mutual Dependence of Crossmodal Correspondences and Attentional Effects
Experiment 3 aims at replicating the results reported in Experiments 1 and 2, and at providing additional evidence on the robustness of the pitch–size correspondence effects between bubbles’ size and pouring sounds’ pitch. In addition, Experiment 3 aims at investigating the mutual dependence between the pitch–size correspondence and spatial elevation, as well as attentional effects. This was done while taking into account the complex naturalistic situations that humans face during their drinking experiences.
4.1. Methods
4.1.1. Participants
Forty-eight French participants took part in Experiment 3 (24 in Experiment 3a: 50% female, mean age 36 ± 13 SD; and 24 in Experiment 3b: 50% female, mean age 34 ± 11 SD). The participants were recruited with the same criteria as in Experiments 1 and 2. In Experiment 3a, each individual session lasted approximately 45 min and the participants received a 10 € voucher for completing the study. In Experiment 3b, each individual session lasted approximately 1 h and the participants received a 15 € voucher for completing the study. All the participants provided written informed consent prior to taking part in the study. The experiment was approved by the local ethics committee (University Hospital Center of Lyon).
4.1.2. Stimuli
The visual stimuli were the same as in Experiment 1, with an additional picture involving an intermediate bubble size. This picture was selected from the discriminability task conducted before Experiment 1 (see Supplementary Materials Figure S1). The auditory stimuli were the same as in Experiment 1 (low pitch: 677 Hz, high pitch: 1086 Hz).
4.1.3. Design and Procedure
Three crossmodal correspondences (pitch–size, pitch–elevation, and pitch–size–elevation) and one intramodal correspondence (size–elevation) were investigated. This was done across two types of tasks: the “direct tasks” in Experiment 3a and the “indirect tasks” in Experiment 3b (see details below). The experiments were conducted in a quiet testing room. The participants sat on a chair 70 cm from an LCD computer monitor with a resolution of 1600 × 900 pixels (60 Hz refresh rate). They wore headphones (Sony MDR-ZX110) for which the volume was adjusted to a clearly audible level.
For each tested correspondence, three conditions were investigated. One of them was unimodal and the other two were bimodal (or bi-dimensional), one consisting of a congruent and the other of an incongruent combination of stimuli. Each time, the participants were instructed to focus on one dimension of the signal (i.e., bubbles’ size, pitch, or elevation) and to respond only to the stimulus presented in this dimension while ignoring the stimuli presented in the other dimensions. The participants were instructed to press the space bar as rapidly and accurately as possible if the picture or the sound belonged to the target category. Emphasis was put on rapidity over accuracy, although the participants were instructed to try and avoid errors as much as possible. The visual and auditory stimuli were presented for 2 s or until the space bar was pressed. Trials were separated by an inter-stimulus interval (ISI) of 150 ms. On-screen instructions were given before each block of trials.
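The three conditions described above can be illustrated with a minimal sketch. The function name, the feature labels, and the rule that any mismatch among the presented features counts as incongruent are assumptions made for illustration; only the pairing of small bubbles, high pitch, and high elevation (versus big bubbles, low pitch, and low elevation) comes from the text.

```python
# Pole assigned to each feature level, following the pairings described in the text
# (small bubbles / high pitch / high elevation versus big / low / low).
POLE = {
    ("size", "small"): +1, ("size", "big"): -1,
    ("pitch", "high"): +1, ("pitch", "low"): -1,
    ("elevation", "high"): +1, ("elevation", "low"): -1,
}


def classify_trial(size=None, pitch=None, elevation=None):
    """Label a trial as 'unimodal', 'congruent' or 'incongruent' (hypothetical helper).

    Feature levels without a pole (e.g. size='intermediate' or a centrally
    presented picture) are ignored, so they never affect congruency.
    """
    features = {"size": size, "pitch": pitch, "elevation": elevation}
    poles = [POLE[(name, level)] for name, level in features.items()
             if (name, level) in POLE]
    if len(poles) < 2:
        # Only one varying feature: unimodal / unidimensional condition.
        return "unimodal"
    # Assumption: every presented feature must share the same pole to be congruent.
    return "congruent" if len(set(poles)) == 1 else "incongruent"


print(classify_trial(size="small", pitch="high"))                  # congruent
print(classify_trial(size="big", pitch="high"))                    # incongruent
print(classify_trial(size="intermediate", pitch="low"))            # unimodal
print(classify_trial(size="small", pitch="high", elevation="low")) # incongruent
```

How partially matching three-feature trials were actually scored is not specified in this section, so the last case above reflects the sketch's assumption rather than the authors' coding.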
The participants’ level of thirst was evaluated at the beginning of each experiment on a 7-point Likert scale. Before starting each experimental session, the participants were also asked to drink a 200 mL glass of still water in order to control for their level of thirst.
Experiment 3a: Direct Tasks
In the direct tasks, the participants were asked to respond to the same features on which the correspondences were tested (i.e., bubbles’ size, pouring sounds’ pitch, and spatial elevation). Thus, for the pitch–size correspondence, the participants were asked to perform a speeded classification task according to pouring sounds’ pitch (high pitch vs. low pitch) and bubbles’ size (small bubbles vs. big bubbles). For the pitch–elevation correspondence, the participants had to classify pouring sounds’ pitch (high pitch vs. low pitch) and the spatial elevation of the visual stimuli (high elevation vs. low elevation), with a fixed picture showing the intermediate bubbles’ size. Note that this intermediate size was used so that bubbles’ size remained constant. The size–elevation correspondence tested the speeded classification of bubbles’ size (small bubbles vs. big bubbles) and the spatial elevation of visual stimuli (high elevation vs. low elevation). Finally, the pitch–size–elevation correspondence tested the interaction between these three features. In each experiment, one-third of the trials consisted of unimodal presentations of the stimuli for comparison.
For both the pitch–size and pitch–elevation correspondences, there were 80 unimodal trials (40 visual and 40 auditory) and 80 bimodal trials. The size–elevation unidimensional task on spatial elevation contained 40 trials whereas the bidimensional tasks contained 80 trials. Finally, the bimodal and bidimensional tasks for the pitch–size–elevation correspondence contained 320 trials, giving rise to 760 trials in total. The number of trials was determined in order to have ten repetitions of each stimulus in the unimodal tasks and five repetitions of each stimulus in the bimodal tasks per condition. The number of targeted features within the stimuli was two in the unimodal tasks and four in the bimodal tasks. In the direct tasks, the location of the visual stimuli could be either central (for the pitch–size correspondence), low, or high elevation (for the size–elevation and pitch–size–elevation correspondences).
Experiment 3b: Indirect Tasks
In the indirect tasks, the participants were asked to respond to different features than those corresponding to the tested correspondences (i.e., bubbles’ size, pouring sounds’ pitch, and spatial elevation). In particular, for the pitch–size correspondence, the participants were asked to perform the speeded classification task according to either the lateralization of the auditory stimuli (sounds heard in their left ear versus right ear) or the lateralization of the visual stimuli (pictures seen on the right or left side of the screen). As in the direct tasks, the sounds varied in pitch and the pictures varied in bubbles’ size. For the pitch–elevation correspondence, the participants likewise had to classify the lateralization of either the auditory stimuli (still varying in pitch) or the visual stimuli (fixed intermediate bubbles’ size). When responding to the auditory stimuli, the intermediate bubbles’ size picture was presented at two possible locations: high or low. When responding to the visual stimuli, there were four possible locations for the intermediate bubbles’ size picture: high elevation right/left vs. low elevation right/left. The size–elevation correspondence tested the speeded classification of pictures’ location. There were four possible locations: high elevation right/left vs. low elevation right/left. The pictures still varied in bubbles’ size. Finally, the pitch–size–elevation correspondence tested the interaction between all three features. In all the conditions, one-third of the trials were presented as unimodal stimuli for comparison.
For both the pitch–size and the pitch–elevation correspondences, the unimodal auditory and visual tasks were made of 80 trials each, and the bimodal tasks contained 160 trials. For size–elevation, both the unidimensional condition on the horizontal position and the bidimensional condition contained 80 trials. Finally, the bimodal and bidimensional conditions for the pitch–size–elevation correspondence included 320 trials giving rise to 1120 trials in total. The number of trials was determined according to the same criteria as in the direct tasks.
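As a quick consistency check, the trial counts reported for the direct and indirect tasks can be summed per correspondence. The sketch below only restates the numbers given in the text; the dictionary layout and variable names are illustrative.

```python
# Trial counts per correspondence, as reported in the text.
# Direct tasks (Experiment 3a): unimodal/unidimensional + bimodal/bidimensional trials.
direct = {
    "pitch-size": 80 + 80,          # 40 visual + 40 auditory unimodal, 80 bimodal
    "pitch-elevation": 80 + 80,     # same structure as pitch-size
    "size-elevation": 40 + 80,      # 40 unidimensional, 80 bidimensional
    "pitch-size-elevation": 320,
}

# Indirect tasks (Experiment 3b).
indirect = {
    "pitch-size": 80 + 80 + 160,    # 80 auditory + 80 visual unimodal, 160 bimodal
    "pitch-elevation": 80 + 80 + 160,
    "size-elevation": 80 + 80,      # 80 unidimensional, 80 bidimensional
    "pitch-size-elevation": 320,
}

total_direct = sum(direct.values())
total_indirect = sum(indirect.values())
print(total_direct, total_indirect)  # 760 1120
```

Both sums match the totals stated for Experiment 3a (760 trials) and Experiment 3b (1120 trials).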
At the end of the two experiments (direct and indirect tasks), explicit measures about the association between bubbles’ size and pouring sounds’ pitch were collected. The participants were asked to indicate which bubbles’ size and which pouring sounds’ pitch (without any particular stimuli presented) they generally associate with the consumption of fresh sparkling beverages on two 7-point Likert scales ranging from “Very small bubbles” to “Very big bubbles” and from “Very low-pitched sound” to “Very high-pitched sound”.
4.1.4. Data Analysis
In Experiment 3a, data from one participant were not analysed due to a software malfunction during the test. In Experiment 3b, six participants (2 males, 4 females) appeared as outliers in the data distribution and were removed from the subsequent analyses because they left more than 40% of trials unanswered (mean 48.8%; mean of unanswered trials for the remaining eighteen participants: 3.2%). In order to normalize the RT distributions, the RT data were log-transformed. The RTs from those trials in which the participants responded correctly were submitted to mixed model analyses of variance with the within-participant factors of Correspondence (pitch–size, pitch–elevation, size–elevation, pitch–size–elevation), Congruency (congruent, incongruent, and unimodal/unidimensional) for each correspondence, and Attended modality (the target modality on which the participants were asked to focus for a given set of trials; visual vs. auditory), and the between-participant factor of Task (direct vs. indirect). The same analysis was performed considering the participants’ mean percentage of errors as the dependent variable. Tukey post-hoc analyses were subsequently conducted.
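The preprocessing steps described above (exclusion of participants with more than 40% of unanswered trials, selection of correct trials, log-transformation of RTs) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the data layout, the function names, the toy values, and the use of the natural logarithm are assumptions, since the text does not specify the logarithm base or the software used.

```python
import math

MAX_UNANSWERED = 0.40  # exclusion threshold stated in the text


def unanswered_rate(trials):
    """Proportion of trials without a response (RT recorded as None, an assumed encoding)."""
    return sum(1 for rt, _ in trials if rt is None) / len(trials)


def is_excluded(trials):
    """A participant is excluded when more than 40% of trials are unanswered."""
    return unanswered_rate(trials) > MAX_UNANSWERED


def log_correct_rts(trials):
    """Log-transform the RTs (ms) of answered, correct trials only.

    Assumption: natural logarithm.
    """
    return [math.log(rt) for rt, correct in trials if rt is not None and correct]


# Hypothetical toy data: each trial is (RT in ms or None, response correct?).
data = {
    "p1": [(500.0, True), (None, False), (450.0, True), (600.0, False)],  # 25% unanswered
    "p2": [(None, False), (None, False), (400.0, True), (None, False)],   # 75% unanswered
}

kept = {pid: log_correct_rts(trials)
        for pid, trials in data.items() if not is_excluded(trials)}
print(sorted(kept))  # ['p1']
```

The resulting per-participant log RTs would then enter the mixed model analysis, which is not reproduced here.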
4.2. Results and Discussion
When considering the participants’ RTs as the dependent variable, the mixed model ANOVA revealed significant main effects of Task (F = 9.2, p = 0.004, partial η2 = 0.19), Correspondence (F = 16.5, p < 0.0001, partial η2 = 0.005), Pitch–Size Congruency (F = 11.8, p < 0.0001, partial η2 = 0.0015), Size–Elevation Congruency (F = 3.7, p = 0.02, partial η2 = 0.0005), and Attended Modality (F = 102, p < 0.0001, partial η2 = 0.006). There were significant interactions between Task and Correspondence (F = 25.3, p < 0.0001, partial η2 = 0.008) and between Task and Pitch–Size Congruency (F = 3.7, p = 0.03, partial η2 = 0.0005).
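For readers who wish to relate the reported F values to the effect sizes, partial η2 can be recovered from F and its degrees of freedom as (F × df_effect) / (F × df_effect + df_error). The degrees of freedom are not reported in this section; the sketch below assumes df_effect = 1 and df_error = 39 for the between-participant factor Task, a value inferred from the 41 participants retained (23 in Experiment 3a and 18 in Experiment 3b) split into two groups.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of Task: F = 9.2 as reported; df = (1, 39) is an assumption (see lead-in).
eta_task = partial_eta_squared(9.2, 1, 39)
print(round(eta_task, 2))  # 0.19
```

Under this assumption the result agrees with the partial η2 of 0.19 reported for Task. The much smaller effect sizes reported for the within-participant factors cannot be checked in the same way without their degrees of freedom.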
Overall, the participants’ RTs were shorter in the indirect task (m = 417 ms ± 1.7 SEM) than in the direct one (m = 531 ms ± 2.6 SEM). Regarding the two-way interaction between Task and Pitch–Size Congruency, a Tukey post-hoc analysis revealed that the participants’ RTs were shorter in congruent trials (m = 537 ms ± 5.3 SEM) than in incongruent ones (m = 551 ms ± 5.5 SEM) in the direct task only (Figure 4), p = 0.02. However, there were no significant differences in mean RTs between congruent and incongruent trials according to the Attended Modality (visual vs. auditory), and this was the case independently of the task (direct vs. indirect). Therefore, in the direct task, there was an overall effect of pitch–size congruency. As in Experiment 2, this effect did not occur for each sensory modality. The mean percentages of errors were not significantly different between the two conditions of Pitch–Size Congruency in either the direct task or the indirect one. Even though further investigation is needed, our results suggest that pitch–size correspondence effects occur only when the participants’ attention is directed toward the same features on which the correspondence is tested (i.e., bubbles’ size versus pouring sounds’ pitch). In the direct task, the visual and auditory stimuli that the participants had to discriminate were presented in the task’s instructions. It is thus possible that during the processing of these stimuli throughout the task, the participants formed higher-order associations between the target stimuli and the corresponding category to which they might relate (e.g., freshness perception in beverages, or positively correlated psychophysiological concepts such as thirst-quenching, see [21,32]). According to this interpretation of our results, the pitch–size correspondence effects reported in the case of beverages might be driven by top-down influences such as attentional processes.
A Tukey post-hoc analysis conducted on size–elevation congruency revealed that the participants’ RTs were shorter in unidimensional trials (m = 516 ms ± 3.8 SEM) than in congruent bidimensional trials (m = 542 ms ± 6.2 SEM). However, there were no significant differences in mean RTs between bidimensional congruent and incongruent trials (p = 0.7). This null effect of the size–elevation correspondence is consistent with the results obtained by Evans and Treisman [11]. Our result brings additional evidence that the correspondence between size and elevation is actually not straightforward, and further research is needed to confirm or refute this relation. We believe that the complexity and the ecological character of the stimuli used here might require adjustments and additional testing. For instance, spatial elevation could be manipulated with bubbles in motion in the liquid instead of manipulating the position of the picture on the screen. However, Parise [19] underlined that while the use of complex stimuli might be more ecological, the multidimensionality of such stimuli is likely to make it harder for researchers to identify the relevant stimulus dimensions that drive the corresponding crossmodal correspondences.
Regarding the interaction between Pitch, Size, and Elevation, the mixed model ANOVA revealed a significant effect of Congruency on the participants’ RTs only in the direct task (F = 3, p = 0.03, partial η2 = 0.002). The participants’ RTs for congruent pitch–size bimodal stimuli (m = 527 ms ± 8.2 SEM) were not significantly different from their RTs for congruent pitch–size–elevation bimodal stimuli (m = 542 ms ± 8.8 SEM). Thus, there is no additive facilitation effect since a bidimensional congruent visual stimulus (e.g., small bubbles and high elevation in space) presented together with a congruent auditory stimulus (i.e., high pitch) did not significantly reduce the participants’ RTs.
Finally, regarding the analysis of explicit measures, it appeared that the more the participants associated the consumption of fresh sparkling beverages with small bubbles, the more they also associated it with high-pitched pouring sounds (Spearman correlation: rho = 0.21, p < 0.0001).
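The rank correlation used for the explicit measures can be illustrated with a minimal, self-contained sketch. It computes Spearman's rho as the Pearson correlation of average ranks, which handles the ties that are frequent with 7-point Likert ratings. The function names and the ratings below are hypothetical; they are not the study's data, and no p-value is computed.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical 7-point ratings from six participants (illustration only).
bubble_size_ratings = [2, 3, 3, 5, 6, 2]
pitch_ratings = [6, 5, 4, 3, 2, 7]
rho = spearman_rho(bubble_size_ratings, pitch_ratings)
```

How the two scales were coded before computing the reported correlation is not stated in this section, so the sign of rho in such a sketch depends on the coding chosen.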
5. General Discussion
The three experiments reported here aimed at investigating crossmodal correspondences between audiovisual stimuli cuing carbonation in beverages using more complex and ecological stimuli than those generally used in the literature on audiovisual correspondences [13]. More specifically, Experiment 1 aimed at testing if there are pitch–size correspondence effects between bubbles’ size and pouring sounds’ pitch and at investigating whether these effects can be modulated by a congruent semantic prime (IAT paradigm). Experiment 2 aimed at investigating the relative effects of the two associations involved in the pitch–size correspondence, previously uncovered in the first experiment. In addition, variations in the stimuli (colour of the liquid, width of the glass, and pitch) were included in order to test the robustness of the pitch–size correspondence effects (GNAT paradigm). Finally, Experiments 3a and 3b first aimed at testing the interactions between pitch and size on the one hand and spatial elevation on the other hand. The second aim was to investigate the influence of attentional factors on the correspondence effects (speeded classification task).
In Experiment 1, the results revealed shorter RTs and more accurate responses in congruent blocks than in incongruent ones. These results represent the first piece of empirical evidence showing a more ecological counterpart of the reported pitch–size correspondence. In particular, they reveal pitch–size correspondence effects between bubbles’ size and pouring sounds’ pitch. Inter-individual differences may play a role in the way participants indirectly associate the stimuli. For instance, McEwan and Colwill [41] reported that the level of carbonation required in a drink for it to be considered as either thirst-quenching or acceptable varies between individuals. Although some people like a highly carbonated drink, most prefer a drink to be slightly sparkling or still. The number of bubbles in our experiments did not vary, but it is possible that the participants associated a quantity of bubbles of a given size with a specific carbonation intensity in a drink, and consequently with a certain valence.
Experiment 1 also investigated whether the semantic relation between small bubbles and high-pitched pouring sounds is more strongly related to the concept of freshness than the association between big bubbles and low-pitched pouring sounds, and whether this relation could influence participants’ performance. The results did not show any influence of the semantic prime on participants’ responses. The concept of freshness is broad and heterogeneous among people, even when it is used within the category of beverages [21,31,32,42]. Roque et al. [21,32] previously described the different meanings of freshness, which can refer to: (a) the overall multisensory experience during drinking (involving for instance coldness, sourness, or a menthol odour that will contribute to an actual perception of freshness), (b) the aging of the organic ingredients contained in the drink (e.g., aging of the mint leaves in a mojito), or (c) the time (stated and/or perceived) elapsed between the preparation of the drink and its being served. As a consequence, this semantic ambiguity may prevent observing congruency effects when the word “freshness” is used as a semantic prime in relation to specific perceptual features of the beverage because only one meaning is relevant. Heyman et al. [43] reported that priming effects arise as a result of automatic pre-activation processes and/or the use of strategies such as expectancy generation and semantic matching. Automatic priming thus emerges when the presentation of a related prime (partially) activates the target’s representation, thereby lowering its recognition threshold. Different theories and models are still debated in the literature in marketing [44] and in cognitive psychology [45] regarding the appropriate prime and target sizes, the familiarity of both primes and stimuli, or the influence of personal characteristics such as individuals’ awareness, motivation, and capacity for evaluation.
In Experiment 2, the RT analysis confirmed the existence of the pitch–size correspondence effects revealed in Experiment 1. Variations in the colour of the liquid, width of the glass, and pitch did not significantly interact with the congruency of the audiovisual interaction. This suggests that the pitch–size correspondence effect is robust to variations in the stimulus context. The sensitivity of participants’ responses was not influenced by the congruency of the interaction, the colour of the liquid, the width of the glass, or the pitch’s height. On the basis of our results, it is not possible to draw conclusions about the relative effects of the two associations involved in the pitch–size correspondence reported in the IAT and the GNAT experiments.
As in the IAT experiment, the carbonation intensity was kept constant throughout the GNAT experiment. It is possible that this fixed bubble density, together with variations in the colour of the liquid, created a perceptual mismatch between the expected and the actual attributes of the stimuli during the task. For instance, if a person expects brown carbonated drinks to have a high bubble density, then the discrepancy with the actual bubble density in our stimuli might interfere with the relation of congruency between the visual stimuli and the auditory stimuli.
In Experiment 3a (i.e., direct task), pitch–size correspondence effects were obtained. This result is consistent with those obtained in Experiments 1 and 2. Moreover, manipulating participants’ attention provided the first empirical evidence that the pitch–size correspondence effect is more likely to occur when participants’ attention is directed toward the same features on which the correspondence is tested (i.e., bubbles’ size and pouring sounds’ pitch). Two different modes of attention have been distinguished, namely the exogenous and the endogenous [46]. In Experiment 3, it is possible that the presentation of the stimuli only in the instructions of the direct task activated both endogenous and exogenous attentional processes toward the perceptual features of interest during the task, when the participants actually had to classify the previously presented stimuli. Consequently, these attentional processes could be partly responsible for the pitch–size correspondence effects that occurred in the direct task only.
In Experiments 3a and 3b, the results did not reveal either pitch–elevation or size–elevation correspondence effects. The pitch–elevation correspondence investigated in Experiment 3 has generally been considered as semantically mediated since common terms are used to describe both pitch and elevation, namely high and low. However, it is worth underlining here that this shared terminology does not exist in everyday French, in which the verbal labels used to describe pitch and elevation are different (i.e., “aigu” and “grave” for pitch, “haut” and “bas” for elevation). Such lack of semantic mediation might explain the lack of pitch–elevation correspondence effects in Experiment 3. Against this hypothesis, several cross-cultural (e.g., [47]) and developmental (e.g., [48]) studies have shown that the pitch–elevation correspondence is not purely semantic in nature, as infants and individuals from cultures that use other terms to describe pitch still show effects of pitch directionality on elevation judgments. Cross-cultural studies would make it possible to verify whether or not the pitch–elevation correspondence is culturally mediated as a function of the stimulus context.
The implicit measures collected in our experiments did not allow for determining whether the two associations (small bubbles–high pitch and big bubbles–low pitch) equally contribute to the global effect. However, when the participants were required in explicit questionnaires to assess which bubbles’ size and pouring sounds’ pitch they generally associate with fresh sparkling beverages, their results indicate that they associate small bubbles and high-pitched pouring sounds with freshness in beverages more than big bubbles and low-pitched pouring sounds. Several interpretations about the links between the results obtained with implicit and explicit measures have been proposed in the literature. If there is no implicit–explicit correlation, one of the possible interpretations is that the implicit measures have not been “contaminated” by the inherent biases that belong to the explicit measures (such as self-observation bias or limited introspective abilities). In this case, the dissociation between implicit and explicit measures can be interpreted as an index of discriminant validity. However, if there is an implicit–explicit correlation, researchers may interpret it as an index of convergent validity showing the reliability of the paradigm they used. In one of the reviews focusing on the IAT and related tasks, Teige-Mocigemba et al. [49] reported that one ongoing debate is whether low (implicit–explicit correlation of 0.24, see [50]) to moderate (implicit–explicit correlation of 0.37, see [51]) correlations between the IAT and explicit measures should be interpreted as cues of discriminant or convergent validity. For instance, a self-observation bias can occur in explicit measures. However, a bias that corresponds to what the participants are willing to say about their own perception, attitudes, or behaviours is unlikely to fully explain the dissociations between implicit and explicit measures in a crossmodal version of the IAT. In particular, in our study the participants had no reasons to hide or to be ashamed of a particular behaviour concerning the bubbles’ size and pouring sounds’ pitch that they associate with the consumption of fresh sparkling beverages. Correlational analyses between implicit and explicit measures and the computation of other cues showing the reliability of the paradigm used represent promising research avenues in the food and beverage domain. In particular, the combination of approaches may enable researchers to provide a more in-depth and comprehensive model of consumers’ perception and behaviours, for instance in the case of complex multisensory perception such as freshness in beverages.
6. Conclusions
The results of the three experiments reported here confirm the existence of a pitch–size crossmodal correspondence between bubble size and pouring sound pitch in carbonated beverages. Experiment 1 provides the first empirical evidence that complex and ecological audiovisual stimuli also induce the pitch–size correspondence effects well-established with simple stimuli. The result obtained in Experiment 2 using GNAT did not allow for determining the relative effects of the two associations involved in the pitch–size crossmodal correspondence. However, such pitch–size crossmodal correspondence appears to be robust in several respects because the effects still hold when the colour of the liquid, width of the glass, and pitch varied. In Experiment 3, changes in participants’ attention showed that the pitch–size correspondence effect in beverages is more likely to occur when the participants’ attention is directed toward the same features on which the correspondence is tested (i.e., bubbles’ size and pouring sounds’ pitch). On the other hand, the results obtained with explicit measures suggest that the participants associate small bubbles and high-pitched pouring sounds with freshness in beverages more than big bubbles and low-pitched pouring sounds. In terms of applications, the results obtained for the pitch–size correspondence, both implicitly and explicitly, suggest that companies aiming at increasing beverage attractiveness and perceived freshness of their products would benefit from directing consumers’ attention toward the features of interest, such as the bubbles. This could be done, for instance, by increasing the saliency of the perceptual features associated with bubbles’ size and pouring sounds’ pitch. 
Understanding the relations between different crossmodal correspondences would represent a promising way to develop efficient strategies (in terms of formulation, packaging, retail experience, or ads) to better catch consumers’ attention and likely increase attractiveness and appreciation of products.