Elsevier

Hearing Research

Volume 404, May 2021, 108213
Research Paper
The perception of octave pitch affinity and harmonic fusion have a common origin

https://doi.org/10.1016/j.heares.2021.108213

Highlights

  • The detection of deviations from an octave interval is generally asymmetric.

  • Octave compressions are generally better detected than octave stretchings.

  • This is true for simultaneous (harmonic) as well as sequential (melodic) octaves.

  • In these two conditions, the magnitude of the asymmetry is listener-dependent.

  • The listener-specific asymmetries found in the two conditions are correlated.

Abstract

Musicians say that the pitches of tones with a frequency ratio of 2:1 (one octave) have a distinctive affinity, even if the tones do not have common spectral components. It has been suggested, however, that this affinity judgment has no biological basis and originates instead from an acculturation process ‒ the learning of musical rules unrelated to auditory physiology. We measured, in young amateur musicians, the perceptual detectability of octave mistunings for tones presented alternately (melodic condition) or simultaneously (harmonic condition). In the melodic condition, mistuning was detectable only by means of explicit pitch comparisons. In the harmonic condition, listeners could use a different and more efficient perceptual cue: in the absence of mistuning, the tones fused into a single sound percept; mistunings decreased fusion. Performance was globally better in the harmonic condition, in line with the hypothesis that listeners used a fusion cue in this condition; this hypothesis was also supported by results showing that an illusory simultaneity of the tones was much less advantageous than a real simultaneity. In the two conditions, mistuning detection was generally better for octave compressions than for octave stretchings. This asymmetry varied across listeners, but crucially the listener-specific asymmetries observed in the two conditions were highly correlated. Thus, the perception of the melodic octave appeared to be closely linked to the phenomenon of harmonic fusion. As harmonic fusion is thought to be determined by biological factors rather than factors related to musical culture or training, we argue that octave pitch affinity also has, at least in part, a biological basis.

Introduction

Humans enjoy melody, "the essential basis of music" in the words of Helmholtz (1863/1954). A melody is a sequence of periodic sounds with specific frequency ratios, forming musical intervals that are perceived as pitch relations. The precision with which these intervals are perceived is of course limited; it depends on the listener's musical training, the intervals themselves, and other factors (Burns and Ward, 1978; Rakowski, 1990; Perlman and Krumhansl, 1996; McDermott et al., 2010; McClaskey, 2017; Graves and Oxenham, 2017). However, in the Western world at least, even people with no substantial musical education readily detect an error of only one semitone (corresponding to a frequency change of about 6%) in the production of one note of a familiar melody (Dowling and Fujitani, 1971; Trainor and Trehub, 1994).

Throughout the human auditory system, up to the cortical level, frequency is represented tonotopically, along unidimensional neural maps (Romani et al., 1982; Talavage et al., 2004). A straightforward hypothesis, therefore, is that the representation of a melodic interval in the auditory system is simply a distance between neural excitations along an axis representing pitch as a logarithmic function of frequency. This would suggest that there is no "physiologically special" melodic interval (apart from the unison). Psychophysical results in line with this hypothesis were obtained by Kallman (1982, experiment 1), who required ordinary Western students to rate the similarity of successive pure tones as a function of their frequency ratio. The ratings smoothly decreased as the frequency ratio varied in small logarithmic steps from 1:1 to about 5:1. Remarkably, no local peak was observed for the simple frequency ratio 2:1, i.e., one octave, even though in the Western musical system two notes forming an octave interval bear the same name and are treated as equivalent sounds (Krumhansl and Shepard, 1979). Analogous findings were reported by Hoeschele et al. (2012, experiment 1).
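The logarithmic pitch axis described above can be illustrated with a minimal Python sketch that expresses a melodic interval as a distance in semitones. It assumes 12-tone equal temperament, in which one semitone is a frequency ratio of 2^(1/12); the function name and the example frequencies are illustrative and are not taken from the cited studies.

```python
import math


def interval_in_semitones(f_low_hz: float, f_high_hz: float) -> float:
    """Distance between two frequencies on a log-frequency axis, in semitones.

    Assumes 12-tone equal temperament: 12 semitones per 2:1 frequency ratio.
    """
    return 12.0 * math.log2(f_high_hz / f_low_hz)


# A 2:1 frequency ratio (one octave) spans 12 semitones on this axis.
octave_semitones = interval_in_semitones(220.0, 440.0)  # 12.0

# One equal-tempered semitone as a percentage change in frequency.
semitone_percent = (2.0 ** (1.0 / 12.0) - 1.0) * 100.0  # about 5.95

print(f"octave: {octave_semitones:.1f} semitones")
print(f"one semitone: {semitone_percent:.2f}% change in frequency")
```

On such an axis the octave is simply one distance among others, which is why a purely tonotopic account predicts no special status for the 2:1 ratio.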

However, at odds with these results, a number of other experiments have suggested that for a substantial proportion of human listeners, two tones forming a small-integer frequency ratio have a distinctive affinity (or similarity) in pitch. Ratios such as 3:2, 4:3, or 5:4 have been used in some of these experiments (Cohen et al., 1987; Schellenberg and Trehub, 1994, 1996), but the ratio most often used was 2:1, one octave. The demonstrations of octave pitch affinity (OPA) have been based on a variety of methodologies (Deutsch, 1973; Idson and Massaro, 1978; Kallman and Massaro, 1979; Massaro et al., 1980; Demany and Armand, 1984; Hoeschele et al., 2012; Borra et al., 2013; Jacoby et al., 2019). In the eight studies that we just cited, OPA was observed using pure-tone stimuli. This is an important detail since the peripheral auditory system behaves as a spectrum analyzer (Schnupp et al., 2012). Ordinary periodic sounds are instead complex tones, and thus consist of a sum of harmonics with frequencies equal to integer multiples of a given fundamental frequency. Consequently, two complex tones one octave apart typically have common spectral components. In addition, the pitch of certain complex tones is subject to octave ambiguities (Terhardt et al., 1982, 1986), which could explain the perception of an affinity between such tones when their fundamental frequencies are one octave apart (Regev et al., 2019). The phenomenon of OPA is more intriguing when it is observed for sounds with no common spectral component and an unambiguous pitch, such as pure tones.

The origin of OPA, for sounds such as pure tones, is the subject of a basic controversy. On one side of the debate, it is contended that OPA is essentially the consequence of an acculturation process (Burns and Ward, 1982; Sergeant, 1983; Jacoby et al., 2019). According to this culturalist hypothesis, Western listeners exhibit OPA because they have learned, consciously or unconsciously, a musical grammar in which tones one octave apart are functionally equivalent. Arbitrary musical grammars can be learned quite rapidly, by mere passive exposure to sound sequences constructed from these grammars (Loui et al., 2010; Rohrmeier et al., 2011). The musical rule of octave equivalence is certainly not arbitrary, because this rule is culturally widespread (Dowling and Harwood, 1986; Brown and Jordania, 2011). However, its main origin might be unrelated to the perception of pitch relations (Burns and Ward, 1982; McPherson et al., 2020). The rule might originate from the mere fact that the sum of two complex tones one octave apart is a single complex tone, with the same period as one of the two added tones. The culturalist explanation of OPA is consistent with the fact that, within the Western adult population, sensitivity to OPA appears to be stronger in musicians than in non-musicians (Allen, 1967; Demany and Armand, 1984; Jacoby et al., 2019), although this could of course be due to an influence of sensitivity to OPA on the willingness to become a musician. Jacoby et al. (2019) suggested in addition that the Tsimane', an Amazonian population living in isolation from Western culture, are completely insensitive to OPA. Western children tested by Sergeant (1983) showed a similar insensitivity and this led the author to assert that OPA was a "concept" rather than a percept. In line with such a view, Regev et al. (2019) found that musically educated listeners who were able to identify an octave interval as such did not manifest a sensitivity to OPA when their brain response to pitch changes was assessed via the "mismatch negativity" evoked potential.

On the other side of the debate, it is contended that OPA originates from physiological processes that are essentially independent of the cultural environment. The experimental evidence supporting this general hypothesis is currently very limited. Sensitivity to OPA has been found in two studies on non-human animals (Blackwell and Schlosberg, 1943; Wright et al., 2000); but the stimuli used by Wright et al. were spectrally rich periodic sounds. Using instead pure tones, Demany and Armand (1984) obtained results suggesting that OPA exists, and is even strong, in 3-month-old human infants. Another argument was put forth by Terhardt (1971, 1974, 1987). In his view, OPA originates from a learning process, but not from musical acculturation: what is learned is the harmonic structure of natural periodic sounds, such as human vocalizations. Due to this learning process, the pitch interval corresponding to a subjectively perfect melodic octave is the pitch interval of harmonics with a frequency ratio of 2:1 in natural periodic sounds. A well-established fact is that when musically educated listeners are requested to set two successive pure tones exactly one octave apart by adjusting their frequency ratio, the obtained ratio is generally slightly larger than 2:1 (Ward, 1954; Ohgushi, 1983; Demany and Semal, 1990; Hartmann, 1993; Rosner, 1999). Terhardt argued that this apparent anomaly ‒ often called the "octave enlargement" effect ‒ can be explained by small repulsive interactions between the representations of simultaneous pure tones in the periphery of the auditory system. He found confirmation of this hypothesis in precise measurements of the pitch of individual spectral components of complex tones. However, Peters et al. (1983) and Hartmann and Doty (1996) failed to replicate Terhardt's observations: they found that the pitch of a complex tone component is not significantly affected by the other components. Their work thus cast serious doubts on the validity of Terhardt's ideas about OPA.

Here, we report new evidence that OPA has a natural basis. More precisely, our study indicates that even for musically educated Western listeners, the pitch interval defining a subjectively perfect melodic octave is largely determined by universal auditory processes rather than by cultural factors. Our essential finding is that the perception of OPA is closely linked to the auditory phenomenon of harmonic fusion. A periodic complex tone is normally heard as a single sound, with a single pitch (related to the fundamental frequency). Yet, it is initially represented in the auditory system as a set of harmonics that, in isolation, evoke different pitches. Their subsequent fusion involves a detection of small-integer frequency ratios ("harmonicity"). When, for example, an 800-Hz harmonic is mistuned by 5% in a complex tone with a 400-Hz fundamental frequency, adult Western listeners perceive two sounds rather than one: the mistuned harmonic is heard as a pure tone standing out of a complex tone (Moore et al., 1986; Hartmann et al., 1990). Harmonic fusion is thought to be helpful in everyday life because real-world acoustic scenes often include simultaneous periodic sounds, produced by separate sources and differing in fundamental frequency; the perceptual segregation of such sounds requires a grouping of their respective spectral components (Bregman, 1990; de Cheveigné, 1997; Kidd et al., 2003; Carlyon and Gockel, 2008; Micheyl and Oxenham, 2010; Popham et al., 2018). Harmonic fusion is apparently operative in newborn infants (Bendixen et al., 2015), in Amazonian listeners isolated from Western culture (McDermott et al., 2016; McPherson et al., 2020), and in at least some non-human mammals (Tomlinson and Schwarz, 1988; Kalluri et al., 2008; Song et al., 2016). Moreover, neural correlates of this perceptual phenomenon have been found in the auditory cortex of monkeys (Fishman and Steinschneider, 2010; Fishman et al., 2014; Feng and Wang, 2017). Thus, harmonic fusion clearly has a natural basis. This should also be the case for OPA if OPA is closely linked to harmonic fusion.
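The mistuned-harmonic example given above (the 800-Hz component of a 400-Hz complex tone shifted by 5%) can be made concrete with a short Python sketch. The number of harmonics (four) and the helper names are assumptions chosen for illustration; the text does not specify them. The mistuning is also converted to cents (1200 cents per octave), a common unit that the text itself does not use.

```python
import math


def harmonic_frequencies(f0_hz, n_harmonics, mistuned_rank=None, mistuning_pct=0.0):
    """Component frequencies of a complex tone, optionally with one harmonic mistuned.

    mistuned_rank is the harmonic number (1 = fundamental) whose frequency is
    shifted by mistuning_pct percent; all other components stay exact multiples of f0.
    """
    freqs = []
    for rank in range(1, n_harmonics + 1):
        freq = rank * f0_hz
        if rank == mistuned_rank:
            freq *= 1.0 + mistuning_pct / 100.0
        freqs.append(freq)
    return freqs


def ratio_to_cents(ratio):
    """Convert a frequency ratio to cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(ratio)


# Hypothetical 4-component tone: 400-Hz fundamental, second harmonic raised by 5%.
components = harmonic_frequencies(400.0, 4, mistuned_rank=2, mistuning_pct=5.0)
mistuning_cents = ratio_to_cents(components[1] / 800.0)

print([round(f, 1) for f in components])            # second component is near 840 Hz
print(f"mistuning: {mistuning_cents:.1f} cents")    # a little under one semitone
```

The mistuned component no longer forms a 2:1 ratio with the fundamental (840:400 is 2.1:1), which is the kind of deviation from harmonicity that the fusion mechanism is thought to detect.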

In all but three of the past studies concerning OPA and harmonic fusion, these two phenomena have been investigated in isolation. Interestingly, a similar asymmetry was observed in both cases. First, the "octave enlargement" effect mentioned above suggests that OPA is generally stronger slightly above the physical octave (2:1) than slightly below it. Second, when the listeners' task was to detect small octave mistunings in stimuli consisting of simultaneous pure tones, performance was found to be generally poorer when the octave was stretched than when it was compressed, thus suggesting that harmonic fusion is more tolerant to stretchings than to compressions (Demany et al., 1991; Borchert et al., 2011; Bonnard et al., 2013, 2017). From this resemblance, one could suspect the existence of a link between OPA and harmonic fusion. However, the three studies in which the two phenomena were examined jointly, in the same listeners, did not provide evidence for such a link (Demany and Semal, 1990; Bonnard et al., 2013, 2016). In the present study, OPA and harmonic fusion were again investigated in the same listeners, but with a new methodology. We provide evidence that the two phenomena are linked by showing that the perception of OPA by a given listener is highly correlated with the perception of harmonic fusion by the same listener.

Conditions and stimuli

In this experiment, as well as experiments 2, 5, and 6, we measured the perceptual detectability of octave mistunings, i.e., deviations from a frequency ratio of 2:1, in cyclical sound sequences. Each sequence was built from two short stimuli: (1) a pure tone (T1); (2) a sum of two simultaneous pure tones with higher frequencies that were always exactly one octave apart (T2+T3). Each stimulus had a total duration of 130 ms and was gated on and off with 5-ms raised-cosine amplitude ramps. There
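As a rough illustration of the stimulus parameters stated above (130-ms total duration, 5-ms raised-cosine onset and offset ramps, a pure tone T1 and a two-tone complex T2+T3 whose components are exactly one octave apart), the following Python sketch synthesizes both stimuli with the standard library only. The sampling rate, the specific frequencies, and the amplitude normalization are assumptions made for this sketch; the excerpt does not state them, and this is not the authors' synthesis code.

```python
import math

SAMPLE_RATE_HZ = 48000  # assumed; not stated in the text


def raised_cosine_envelope(n_samples, ramp_samples):
    """Amplitude envelope with raised-cosine onset and offset ramps."""
    env = [1.0] * n_samples
    for i in range(ramp_samples):
        gain = 0.5 * (1.0 - math.cos(math.pi * i / ramp_samples))
        env[i] = gain                   # onset ramp rises from 0
        env[n_samples - 1 - i] = gain   # offset ramp falls back to 0
    return env


def gated_tones(freqs_hz, duration_s=0.130, ramp_s=0.005, sample_rate_hz=SAMPLE_RATE_HZ):
    """Sum of equal-amplitude sine tones, gated on and off with raised-cosine ramps."""
    n_samples = round(duration_s * sample_rate_hz)
    ramp_samples = round(ramp_s * sample_rate_hz)
    env = raised_cosine_envelope(n_samples, ramp_samples)
    scale = 1.0 / len(freqs_hz)  # keep the summed waveform within [-1, 1]
    return [
        env[i] * scale * sum(math.sin(2.0 * math.pi * f * i / sample_rate_hz) for f in freqs_hz)
        for i in range(n_samples)
    ]


# Hypothetical frequencies: T1 at 400 Hz; T2 and T3 higher and exactly 2:1 apart.
t1 = gated_tones([400.0])
t2_t3 = gated_tones([1000.0, 2000.0])

print(len(t1), len(t2_t3))  # number of samples in each 130-ms stimulus
```

The ramps are counted inside the 130-ms total duration, which matches the wording "total duration" in the stimulus description.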

Method

In experiment 1, mistuning detection was easier in SIM than in ALT, as revealed by the fact that Δ had to be larger in ALT than in SIM to get a similar level of performance. Experiment 2 confirmed that the SIM condition was easier than the ALT condition, and determined whether the perceptual advantage provided by a simultaneous presentation of T1 and T2+T3 could be obtained if the simultaneity was illusory rather than real.

Four conditions were employed. In two of them, the sound sequences were

Experiment 3

To check that T1 and T2+T3 were perceived as simultaneous in ALTnoise, we first verified that the noise bands were of a sufficiently high level to elicit a continuity illusion. In experiment 3, the 12 listeners who had completed experiment 2 were presented with ALTnoise and SIMnoise sequences in which the level difference between the noise bands and the tones (+8 dB in experiment 2) was now adjustable. The task was to set the noise bands (as a whole) to the level just sufficient for the

Rationale and method

In the experiments described above, mistuning detection was investigated in a limited frequency register: the frequency of T1 varied between 300 and 600 Hz. Experiment 6 essentially replicated the ALTnoise and SIMnoise conditions of experiment 2 with two new ranges of T1 frequency: a "low" register, 200–300 Hz, and a "high" register, 1200–1800 Hz. In the low register, there was no a priori reason to expect results very different from those of experiment 2. However, previous research suggested

General discussion

In the present study, we investigated the perceptual detectability of octave mistunings via two subjectively quite different cues: OPA (for tones presented sequentially) and harmonic fusion (for tones presented simultaneously). Our results demonstrate, in a population of musically educated Western listeners, the existence of an intimate link between OPA and harmonic fusion. Since harmonic fusion undoubtedly originates from physiological processes taking place in every human auditory system, we

Declaration of Competing Interests

None.

Acknowledgements

We thank Josh McDermott, Peter Cariani, Alain de Cheveigné, and an anonymous reviewer for discussions and/or comments on a previous version of the manuscript. This research was partly funded by an MRC Core Award G101400 to author RPC.

References (103)

  • H.R. Blackwell et al. Octave generalization, pitch discrimination, and loudness thresholds in the white rat. J. Exp. Psychol. (1943)
  • D. Bonnard et al. Auditory discrimination of frequency ratios: the octave singularity. J. Exp. Psychol. Hum. Percept. Perform. (2013)
  • D. Bonnard et al. The effect of cochlear damage on the sensitivity to harmonicity. Ear Hear. (2017)
  • E.M.O. Borchert et al. Perceptual grouping affects pitch judgments across time and frequency. J. Exp. Psychol. Hum. Percept. Perform. (2011)
  • T. Borra et al. Octave effect in auditory attention. Proc. Natl Acad. Sci. USA (2013)
  • A.S. Bregman. Auditory Scene Analysis: The Perceptual Organization of Sound. (1990)
  • S. Brown et al. Universals in the world's music. Psychol. Music (2011)
  • E.M. Burns et al. Categorical perception - phenomenon or epiphenomenon: evidence from experiments in the perception of melodic musical intervals. J. Acoust. Soc. Am. (1978)
  • P. Cariani. Temporal codes, timing nets, and music perception. J. New Music Res. (2001)
  • P. Cariani. Musical intervals, scales and tunings: auditory representations and neural codes.
  • R.P. Carlyon et al. The continuity illusion and vowel identification. Acta Acust. United Acust. (2002)
  • R.P. Carlyon et al. Effects of harmonicity and regularity on the perception of sound sources.
  • A. de Cheveigné. Concurrent vowel identification. III. A neural model of harmonic interference cancellation. J. Acoust. Soc. Am. (1997)
  • A. de Cheveigné et al. The case of the missing delay lines: synthetic delays obtained by cross-channel phase interaction. J. Acoust. Soc. Am. (2006)
  • A.J. Cohen et al. Infants' perception of musical relations in short transposed tone sequences. Can. J. Psychol./Revue Can. Psychol. (1987)
  • C.J. Darwin. Simultaneous grouping and auditory continuity. Percept. Psychophys. (2005)
  • L. Demany et al. The perceptual reality of tone chroma in early infancy. J. Acoust. Soc. Am. (1984)
  • L. Demany et al. Dichotic fusion of two tones one octave apart: evidence for internal octave templates. J. Acoust. Soc. Am. (1988)
  • L. Demany et al. Harmonic and melodic octave templates. J. Acoust. Soc. Am. (1990)
  • L. Demany et al. On the perceptual limits of octave harmony and their origin. J. Acoust. Soc. Am. (1991)
  • D. Deutsch. Octave generalization of specific interference effects in memory for tonal pitch. Percept. Psychophys. (1973)
  • P.A. Dobbins et al. Octave discrimination: an experimental confirmation of the “stretched” subjective octave. J. Acoust. Soc. Am. (1982)
  • W.J. Dowling et al. Contour, interval, and pitch recognition in memory for melodies. J. Acoust. Soc. Am. (1971)
  • W.J. Dowling et al. Music Cognition. (1986)
  • L. Feng et al. Harmonic template neurons in primate auditory cortex underlying complex sound processing. Proc. Natl Acad. Sci. USA (2017)
  • Y.I. Fishman et al. Neural correlates of auditory scene analysis based on inharmonicity in monkey primary auditory cortex. J. Neurosci. (2010)
  • Y.I. Fishman et al. Neural representation of concurrent harmonic sounds in monkey primary auditory cortex: implications for models of auditory scene analysis. J. Neurosci. (2014)
  • H.E. Gockel et al. Detection of mistuning in harmonic complex tones at high frequencies. Acta Acust. United Acust. (2018)
  • J.E. Graves et al. Familiar tonal context improves accuracy of pitch interval perception. Front. Psychol. (2017)
  • D.M. Green et al. Signal Detection Theory and Psychophysics. (1974)
  • W.M. Hartmann. On the origin of the enlarged melodic octave. J. Acoust. Soc. Am. (1993)
  • W.M. Hartmann et al. On the pitches of the components of a complex tone. J. Acoust. Soc. Am. (1996)
  • W.M. Hartmann et al. Hearing a mistuned harmonic in an otherwise periodic complex tone. J. Acoust. Soc. Am. (1990)
  • A. Heinrich et al. The continuity illusion does not depend on attentional state: fMRI evidence from illusory vowels. J. Cogn. Neurosci. (2011)
  • H. von Helmholtz. On the Sensations of Tone. (1954)
  • M. Hoeschele et al. Pitch chroma discrimination, generalization, and transfer tests of octave equivalence in humans. Atten. Percept. Psychophys. (2012)
  • S. Holm. A simple sequentially rejective multiple test procedure. Scand. J. Stat. (1979)
  • T. Houtgast. Psychophysical evidence for lateral inhibition in hearing. J. Acoust. Soc. Am. (1972)
  • A.J.M. Houtsma et al. Analytic and synthetic pitch of two-tone complexes. J. Acoust. Soc. Am. (1991)
  • W.L. Idson et al. A bidimensional model of pitch in the recognition of melodies. Percept. Psychophys. (1978)