Introduction

Mozart began piano training at the age of 3, Beethoven before the age of 8. Many music programs for children emphasize beginning training as early as possible in order to develop musical skill. However, very little is known about the real effects of early musical training on adult performance. Behavioural studies comparing early-trained (ET) and late-trained (LT) musicians have shown that early training is essential for the development of absolute or “perfect” pitch (Baharloo et al. 1998; Costa-Giomi et al. 2001; Miyazaki and Rakowski 2002). More recently, brain-imaging studies have shown structural and functional changes in the brain associated with musical training that are greater for those who began training early in life (Elbert et al. 1995; Schlaug et al. 1995; Schneider et al. 2002; Gaser and Schlaug 2003; Koeneke et al. 2004). These findings suggest that there may be a critical or sensitive period for musical training, similar to that observed for language acquisition. However, previous studies have not controlled for differences between ET and LT musicians in the total number of years of musical training and experience. By definition, a musician who begins training early has more years of experience than one who begins later when both are the same age. Therefore, it is possible that the previously observed differences in performance and brain structure can be accounted for simply by the duration of musical training. Therefore, the present experiment examined the effect of musical training on performance of a rhythmic tapping task in ET and LT musicians who were matched for years of musical experience.

Evidence for the effect of musical training on later perceptual skill comes from studies of musicians with absolute or perfect pitch. Baharloo et al. (1998) tested a large sample of 691 musicians. They found that of the 92 musicians in that sample who exhibited perfect pitch, 78% began training before the age of 6. Similar results have been obtained by other groups (Costa-Giomi et al. 2001; Miyazaki and Rakowski 2002) and it has been suggested that there may be a genetic component to the development of this skill (Baharloo et al. 1998).

Maturational changes in the human brain coincide with and underlie changes in a wide range of cognitive and motor abilities (Giedd et al. 1999; Paus et al. 1999). Recent studies have shown that early musical training can result in both structural and functional plasticity in auditory and motor regions of the brain (Elbert et al. 1995; Gaser and Schlaug 2003; Koeneke et al. 2004). Elbert et al. (1995) found that expert string players showed a larger cortical representation of the digits of the left hand. Further, they showed a strong correlation between the size of the digit representation and the age of start of musical training, with those who began earlier showing larger representations. Schlaug et al. (1995) reported a larger anterior corpus callosum in musicians compared with non-musicians, with musicians who began training before the age of 7 showing a greater difference than those who began after the age of 7. In a recent study, Bengtsson et al. (2005) showed evidence for greater myelination in the right cortico-spinal tract of professional musicians, and that this difference was specifically related to the number of hours practiced in childhood (before age 11). Taken together, these findings suggest that there may be a critical or sensitive period in development for the motor component of musical training.

The concept of “critical” and “sensitive” periods in development is drawn from work showing that certain behaviours and their neural substrates do not develop normally if appropriate stimulation is not received during a restricted time period in development (Knudsen 2004). During a sensitive period, neural systems are particularly responsive to relevant stimuli, and are more susceptible to change when stimulated. Critical periods are sensitive periods that have relatively abrupt onsets and offsets. The classic example of a critical period comes from the work of Hubel and Wiesel who showed that if cats are deprived of vision to one eye during the first month after birth, they do not develop normal binocular vision, even when vision is restored to the deprived eye (Hubel and Wiesel 1965). At the neural level, the pattern of cellular connectivity is altered and cannot be changed after the critical period has elapsed (Wiesel and Hubel 1965). In contrast to a critical period, where a function cannot be acquired outside the specific developmental window, a sensitive period denotes a period of development where the ability to acquire a specific skill is enhanced compared to other developmental periods. An example of a sensitive period comes from experiments in owls, where spatial representation can be changed by altered sensory input early in life, but normal representation can be relearned later in life (Brainard and Knudsen 1998).

Evidence for critical or sensitive periods in humans is drawn largely from the domain of language acquisition. Single case studies of individuals chronically deprived of linguistic stimulation in early childhood have shown that these individuals fail to develop normal language even after intensive exposure (Curtiss 1977). Further, studies of children with complete removal of the language-dominant left hemisphere revealed that as long as the removal occurred early, language could develop relatively normally. These findings led Lenneberg (1967) to propose that there is a critical period for neural plasticity underlying language functions that extends from early infancy to puberty. This hypothesis has been adapted to the study of second-language acquisition to suggest that exposure to the second language during the sensitive period results in greater fluency than exposure after it. This adapted hypothesis has been supported by the results of a number of studies showing that second-language proficiency is greater in individuals who were exposed to the second language before age 11–13 (Johnson and Newport 1989; Weber-Fox and Neville 2001).

Surprisingly, there are no experimental studies looking at the effect of early motor training on adult performance, despite conventional wisdom and anecdotal evidence suggesting that early training is a prerequisite for excellence in many domains of skilled motor performance. Some suggestive evidence for the importance of motor experience early in life comes from recent studies of children confined to orphanages and later adopted into families in the UK and US. During their time in the orphanages, these children were highly restricted in terms of motor experience. Investigations of these children’s motor abilities following adoption have shown subtle deficits in motor skills, such as standing balance and fine-motor coordination (Tober and Pollak 2005). These results indicate that motor deprivation during a putative sensitive period for motor learning can result in long-lasting impairments. Based on this, we can hypothesize that enriched motor experience, such as musical training, during this sensitive period could result in lasting neural changes and improved motor performance later in life.

In the present experiment, we tested musicians who began training before and after the age of seven on a timed motor sequence task (TMST), which has been used in several previous behavioural and neuroimaging studies (Penhune and Doyon 2002, 2005; Savion-Lemieux and Penhune 2005). This task requires participants to reproduce a temporally complex rhythmic motor sequence by tapping in synchrony with a series of visual stimuli (Fig. 1). The use of this task is advantageous for two reasons. First, the tapped sequences are non-metrical, making them relatively difficult even for musicians, and requiring them to generalize from the more common metrical rhythms encountered in musical training. Second, the task requires synchronization of the motor response with a visual stimulus, again requiring generalization from the usual auditory-motor synchronization required in musical training.

Fig. 1

a Experimental setup. Stimulus sequences were made up of ten white squares, which appeared sequentially at the center of the computer screen. Subjects responded by tapping on a single key of the computer mouse (a). b Stimulus sequences in the practice and learning conditions. Each square in the sequences appeared for either a short (250 ms) or a long duration (750 ms), represented by the short or long line lengths. The ISI was constant (500 ms). Practice sequences consisted of four trials of three sequences: all short, all long and a simple mixture. For the learning condition, sequences were made up of five long and five short elements. Subjects were tested on only one of the two possible learning sequences

As described above, none of the previous studies examining behavioural and neural differences between ET and LT musicians have controlled for differences between the groups in the total number of years of musical training and experience (Elbert et al. 1995; Schlaug et al. 1995; Schneider et al. 2002; Gaser and Schlaug 2003; Koeneke et al. 2004). In a preliminary study using the same task (Watanabe et al. 2004) we compared ET and LT musicians matched for age, but not years of experience, and found the predicted enhanced performance for ET musicians. However, we also found a significant correlation between years of experience and performance, indicating that those who had played longer performed better. This suggested that the most important predictor of performance was not age at the start of training, but simply years of experience. Therefore, for the present experiment we moved to a matched sample in which subjects in the two groups were matched for years of musical experience, such that later starters had the same number of years of musical experience as those who started earlier. Years of formal musical training and hours of current practice were also controlled.

Methods

Participants

Participants were 30 young, healthy, right-handed practicing musicians between 17.8 and 36.8 years of age (17 women and 13 men, M = 24.9 years, SD = 5.3) and 10 non-musicians between 19.3 and 33.4 years of age (5 men and 5 women, M = 26.2 years, SD = 5.1) tested for a previous study using the same protocol (Savion-Lemieux and Penhune 2005). All participants were recruited from the undergraduate student population of Concordia University. Musicians were recruited from the Department of Music and from the Montreal-area population. All participants were right-handed as assessed by a handedness questionnaire adapted from Crovitz and Zener (1962). None of the participants had a history of neurological disorders.

For the purposes of this study, a practicing musician was operationally defined as an individual who was currently practicing music and had at least four years of musical experience (range 7.5–26.0 years). Musical experience was defined as the ability to play a musical instrument or sing, acquired through formal and practical training. The participants were predominantly piano and string players. Two of the musicians identified voice as their current primary musical focus, but both had extensive instrumental training and experience (15 years of guitar; 12 years of piano) and both currently practiced these instruments, although not to the same degree as voice. Similarly, a number of other musicians in the sample had played several instruments over their careers and some continued to play a secondary instrument throughout their careers.

Musicians were divided into two groups: early trained musicians (ET; n = 15, 9 women and 6 men) who began training before the age of 7 and late-trained musicians (LT; n = 14, 8 women and 6 men) who began training after the age of 7. The age at which musicians began training, the number of years of experience, the amount of formal training and the number of hours per week currently practiced were assessed using a modified version of the Global Index of Musical Training and Experience (Penhune et al. 1999; see supplementary material). Musicians in the two groups were individually matched for years of musical experience and formal training. Years of experience were defined as the total number of years of musical training. Years of formal training were defined as the total number of years spent in formal training (e.g., lessons). Both groups of musicians were compared to a group of non-musician controls who had been tested for a previous experiment using the same protocol (Savion-Lemieux and Penhune 2005). These individuals were selected to have less than 3 years of musical training or experience and were not currently practicing music. All subjects were right-handed with no history of neurological or psychiatric disorder. The experimental protocol was approved by the Concordia University Human Research Ethics Committee, Montreal, Canada. Participants gave informed consent and were compensated for their time.

Stimuli

The TMST used in this experiment required participants to reproduce a temporally complex rhythmic motor sequence by tapping in synchrony with a series of visual stimuli using a single key on a computer mouse (Fig. 1). The stimuli were ten-element visual sequences consisting of a series of white squares (3 cm²) presented sequentially in the centre of the dark grey background of a computer screen (21-in. Sony Multiscan G500 monitor, 100 Hz). Two different sequences, designed to be of equal difficulty, were used in this study. Each participant was tested on only one of the two possible sequences, which were counterbalanced across participants. Each sequence was composed of five long (750 ms) and five short (250 ms) elements, with a constant inter-stimulus interval (500 ms). The sequences were constructed to have no more than two repeating elements as well as seven transitions from short to long elements. This results in sequences that are temporally regular, but do not conform to a standard musical rhythm. The elements of each sequence can be grouped into a series of intervals of three and five beats (3:5 ratio) based on the beat unit of 250 ms underlying both the stimuli and the inter-stimulus intervals. As these intervals do not represent a simple integer ratio (i.e., division of the intervals yielding an integer value), the sequences represent non-metrical rhythms (Essens 1986; Essens and Povel 1985). The presentation of each sequence was cued by a small white square (1 cm²) that appeared in the middle of the screen. Each block of practice on the TMST contained 12 presentations of the same sequence and lasted 132 s.
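The timing arithmetic described above can be checked with a short sketch. The element durations, inter-stimulus interval and 250-ms beat unit are taken from the text; the example sequence and the function names are hypothetical, since the two sequences actually used are not listed here. The sketch also reads "transitions" as any change between short and long elements, which is an assumption.

```python
# Values stated in the Stimuli section.
BEAT_MS = 250
ISI_MS = 500
DURATIONS_MS = {"S": 250, "L": 750}  # short and long elements


def onset_intervals_in_beats(sequence):
    """Onset-to-onset interval of each element (duration + ISI), in 250-ms beats."""
    return [(DURATIONS_MS[e] + ISI_MS) // BEAT_MS for e in sequence]


def count_transitions(sequence):
    """Number of adjacent pairs in which the element type changes."""
    return sum(a != b for a, b in zip(sequence, sequence[1:]))


def total_duration_ms(sequence):
    """Total time of one presentation, counting one ISI per element."""
    return sum(DURATIONS_MS[e] + ISI_MS for e in sequence)


# Hypothetical sequence: five long, five short, runs of at most two.
example = "SLSSLSLLSL"

print(onset_intervals_in_beats(example))  # short -> 3 beats, long -> 5 beats
print(count_transitions(example))
print(total_duration_ms(example))
```

Under these assumptions, one presentation lasts 10 s, so 12 presentations account for 120 s of the reported 132-s block. The text does not itemize the remaining time; it may correspond to the cue that precedes each sequence.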

Participants performed the TMST using a desktop computer that recorded all generated responses (Intel Pentium III 800 MHz computer, Windows 2000 Professional). Customized media control functions (MCF) software (Digivox, Montreal, Canada) controlled the presentation of the visual stimuli as well as automatically recorded participants’ key-press and release durations, which were subsequently used to calculate two indices of learning: accuracy of reproduction and percent asynchrony of responses with target stimuli (described in detail below).

Procedure

At the beginning of each testing session, before performing the TMST, participants completed a baseline practice task. This task consisted of three simple ten-element sequences that were made up of either all long, all short or a simple mixture (Fig. 1). Participants were instructed to press and hold the mouse key down at the onset of each stimulus in the sequence, and to release it each time the stimulus disappeared. All participants used the index finger of the right hand. The experimenter provided feedback to ensure that the participant understood and learned the motor skill required for the study.

Once the baseline task was completed on day 1, participants were explicitly trained to reproduce one of two TMST sequences to a criterion of three consecutive correct repetitions. After the initial training, no further feedback was provided to the individual. Participants then performed three blocks of their assigned TMST sequence. Participants were seated 57 cm away from the computer monitor and short breaks were provided between blocks of practice to prevent fatigue and optimize performance. Upon completion of the last block of trials, participants were asked whether they used any strategy to learn the TMST, and were instructed not to practice their assigned sequence in between test sessions. On each of the four consecutive days (day 2–5), participants returned to the laboratory to perform the baseline task, review their assigned TMST by reproducing one to two trials of the sequence, and complete three blocks of trials.

Behavioural measures

The learning of motor skill tasks, such as the serial reaction time task (SRT), is typically assessed by reductions in reaction time to individual elements of the motor sequence. That is, faster responses correspond to improved performance. However, performance is measured on the TMST by requiring participants to synchronize their responses as precisely as possible with the presented stimuli. Therefore, learning of the TMST was assessed by examining changes in two different variables: accuracy of responses and synchrony of responses with target stimuli. These measures examined learning of two different aspects of the task. Accuracy reflects learning of the more explicit component of the task—encoding of the correct order of short and long durations in the sequence. However, it still requires the participant to make a relatively accurate motor response—within 2SD of his/her baseline. Response asynchrony reflects the ability to precisely time key-press and key-release responses relative to the visual stimuli.

Performance of the learned sequences was scored individually by using each participant’s average short and long responses from the practice sequences for each day of training (Penhune and Doyon 2002). The first step in scoring was to calculate the average and SD for each participant’s long and short responses on the simple practice sequences (Fig. 1). Responses on the simple practice sequences that were greater than 2SD from the mean were excluded. The average was then recalculated, and the recalculated average ± 2SD was used as the upper and lower limit for accurate response on the learned sequences. This means that as subjects become more accurate and less variable with practice, the criteria for scoring their performance becomes more stringent. The percent of correctly reproduced elements was calculated for each trial and the measure of asynchrony was calculated on correct responses only. This was done so that the asynchrony measure was not contaminated by gross errors. Percent response asynchrony (PASY) measures the percent difference between onset and offset of stimuli and the onset and offset of participant’s key-press responses.
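The accuracy criterion described above can be illustrated with a minimal sketch. It assumes that both the mean and the SD are recalculated after the 2SD exclusion and that the sample SD is used; the text specifies only that the average was recalculated. The function names and data are hypothetical, and PASY is not implemented because its exact formula is not given here.

```python
from statistics import mean, stdev


def accuracy_limits(baseline_ms, n_sd=2.0):
    """Lower and upper limits for a correct response, from practice-sequence durations.

    Responses more than n_sd SDs from the mean are dropped, then the
    mean and SD are recalculated (assumption: the SD is recalculated too).
    """
    m, sd = mean(baseline_ms), stdev(baseline_ms)
    kept = [x for x in baseline_ms if abs(x - m) <= n_sd * sd]
    m2, sd2 = mean(kept), stdev(kept)
    return m2 - n_sd * sd2, m2 + n_sd * sd2


def percent_correct(sequence, responses_ms, limits):
    """Percent of elements whose key-press duration falls inside the limits
    for that element type ('S' or 'L')."""
    hits = sum(
        limits[e][0] <= r <= limits[e][1]
        for e, r in zip(sequence, responses_ms)
    )
    return 100.0 * hits / len(sequence)


# Hypothetical practice data for the short elements, with one gross error (600 ms).
short_baseline = [240, 250, 260, 255, 245, 250, 248, 252, 246, 600]
lo, hi = accuracy_limits(short_baseline)

# Hypothetical trial scored against fixed illustrative limits.
limits = {"S": (200, 300), "L": (700, 800)}
print(percent_correct("SLSL", [250, 750, 400, 760], limits))
```

Because the limits are derived from each day's practice sequences, a participant whose practice responses become less variable is scored against a narrower window on that day.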

Data analysis

All behavioural measures were averaged across blocks of trials and days of practice. The data were assessed using repeated-measures analysis of variance (ANOVA) with Greenhouse–Geisser correction, with Group as the between-subjects factor and Block or Day as within-subjects factors. Percent correct and PASY measures were analysed separately. Significant differences across days for the two groups were analysed using tests of simple main effects with Bonferroni correction for multiple comparisons.

Results

Analysis of demographic data (Table 1) showed that the ET and LT musician groups were well matched, with no significant differences in the total number of years of musical experience, the number of years of formal training, or the number of hours per week they currently practiced. As predicted, the groups were significantly different in terms of the age of start of musical training and in current age. No significant correlations were found between age and any of the behavioural measures of performance. Repeated measures ANOVAs revealed no significant differences in either percent correct or percent response asynchrony between the two different sequences used in the learning trials and no significant differences between the sexes. Therefore, behavioural data were collapsed across these dimensions. Data for the two vocalists was examined separately, and it was found that these musicians’ performance did not differ significantly from the mean of their groups on any behavioural variable.

Table 1 Measures of musical training and experience

For percent correct (Fig. 2a), a repeated measures ANOVA showed no significant main effect of group, but a significant main effect of day (F (4, 108) = 18.8, p < .001), indicating improved performance across groups for the 5 days of training. Post hoc planned comparisons between groups across days of learning revealed a marginally significant difference between groups only on day 1 (p < .09). Although there was no significant interaction, planned comparisons examining changes in performance across days for the two groups showed that LT musicians showed significant improvement between day 1 and day 2 (p < .01), whereas ET musicians showed only marginally significant gains between day 1 and 3 (p < .09) and day 1 and 4 (p < .07). In a separate analysis, both groups were compared with non-musicians (Fig. 2b). In this comparison, only ET musicians showed overall better performance (ANOVA, days 1–5: F (2, 36) = 3.2, p < .05; planned comparison, ET versus NM, p < .05).

Fig. 2

a Average percent correct and percent response asynchrony (PASY) data for ET, LT and NM groups across 5 days of learning. For percent correct, ET musicians performed better than LT musicians only on day 1; and only ET musicians performed better than NM overall. For PASY, ET musicians performed better than LT musicians overall and specifically on days 2–5. Both ET and LT musicians performed better than NM for this measure. b Box and whisker plots for ET, LT and Non-musician (NM) groups. For percent correct, only ET musicians differed from NM. For PASY, ET musicians performed better than both LT and NM groups. Although the ET and LT groups differ on average, of note is the considerable overlap in individual performance.

For percent response asynchrony, a significant main effect of group was found (F (1, 27) = 3.95, P = 0.056), such that ET musicians performed better than LT musicians across the five days of learning (Fig. 2a). Post hoc tests of simple main effects showed that ET musicians performed better than LT musicians on days 2–4 (P < .05) and remained marginally significantly different on day 5 (P < .07). In addition, a main effect of day was observed, such that both groups improved across days of learning (F (4, 108) = 45.9, P < .001). No significant day × group interaction was observed; however, post hoc pair-wise comparisons indicated that both groups showed significant improvement between days 1 and 2 (ET: P < .001; LT: P < .008), and between days 2 and 3 (ET: P < .03; LT: P < .004), but that only LT musicians continued to show improvements between days 3 and 4 (LT: P < .005). Neither group showed improvement between days 4 and 5. In a separate analysis, when the two groups were compared with non-musicians (Fig. 2b), both ET and LT musicians showed overall better performance (F (2, 36) = 7.4, P < .01; ET vs. NM, P < .01; LT vs. NM, P < .05).

Discussion

The results of this experiment show that ET musicians showed better performance on a novel rhythmic tapping task than LT musicians with similar levels of training and experience. For the more global measure, ET musicians performed better than LT musicians only on day 1, and both groups improved across days of practice. In contrast, for the measure of response synchronization, both groups started out at the same level and showed similar improvements with practice. However, from day 2 onward, ET musicians showed better performance than LT musicians and this persisted after 5 days of practice. These findings support the idea that there may be a sensitive period in childhood where enriched motor training through musical practice results in long-lasting benefits for performance later in life. Performance differences were greatest for the measure of response synchronization, suggesting that early training has its greatest effect on neural systems involved in sensorimotor integration and timing. This is consistent with evidence for age-specific developmental changes in motor performance, and age-specific changes in brain regions important for motor control. It is also consistent with the results of studies showing structural changes in motor-related regions of the brain in trained musicians. Importantly, because the task required synchronization with a visual stimulus, these results show that the effects of early training can be generalized to novel motor tasks. Finally, while the ET and LT musicians differed on average, there was considerable overlap in performance between the two groups. This indicates that early training is not the only factor affecting adult performance. Other potential factors that might contribute to the enhanced performance of ET musicians are individual differences in early ability, motivation and family support for musical training.

The results of this experiment suggest that there may be a sensitive period in brain development where musical training can have long-term effects on motor performance that can be generalized to novel motor tasks. Maturational changes in the human brain are greatest in childhood, but continue into early adulthood. Following birth, the number of synapses and therefore the volume of grey matter continues to increase for between 3 and 15 months, depending on the region of the brain (Huttenlocher and Dabholkar 1997). Once this peak is reached, the number of synapses decreases through the process of pruning, which is thought to underlie experience-dependent specialization. In contrast, the amount of white matter increases throughout development. Therefore, although the total size of the brain does not change substantially after the age of 5, the amount of white matter increases until sometime around age 20 (Casey et al. 2000). Over the last 10 years, a number of studies using structural MRI techniques have examined developmental changes in the volume and proportion of grey and white matter in the brain. The results of these studies have shown that increases in white matter volume are age- and region-specific, with sensory and motor regions showing increases earlier, and frontal and temporal–parietal association areas later (Casey et al. 2000; Gogtay et al. 2004; Sowell et al. 2004). Increasing white matter volume measured by MRI is thought to correspond to increasing number of neuronal axons, greater diameter of axons, or greater thickness of the myelin sheath that surrounds them. A number of studies have shown increases in the white matter concentration of the cortico-spinal tract and corpus callosum between childhood and late adolescence (Paus et al. 1999; Barnea-Goraly et al. 2005). It has been hypothesized that these increases may underlie decreases in nerve conduction time that are observed with development, and might be related to behavioural phenomena such as decreasing reaction times and increasing motor speed associated with the improvement of fine motor skills across early childhood. A recent study (Bengtsson et al. 2005) examined white matter structure in professional pianists and non-musicians and showed evidence for greater myelination in the right cortico-spinal tract of musicians, and that this difference was specifically related to the number of hours practiced in childhood (before age 11).

These changes in brain development during childhood are paralleled by changes in motor performance. Children show increasing speed in simple reaction time and repetitive finger tapping (Garvey et al. 2003). Motor evoked potentials show decreasing conduction times and increasing inhibition between the hemispheres (ages 10–13). At the same time, mirror movements, which are relatively common in children up to the age of 6–7, decrease. It appears likely that motor development depends on the maturation of multiple central and peripheral control mechanisms. A recent study of sequential finger pointing in 6–11-year-old children showed a discontinuity in performance around the age of 6–7 (Badan et al. 2000). Their data show that at that age, when the task is easy, children can utilize strategies that are more typical of older children, but that when the task is hard, they perform more similarly to younger children.

Taken together with the results of the current study, the above evidence suggests that enriched motor experience during the period when neural and behavioural systems are immature can induce lasting enhancement in performance. As described in the Introduction, during a sensitive period neural systems are particularly sensitive to relevant stimuli, and are more susceptible to change when stimulated. In a recent review of the neural mechanisms underlying sensitive periods, Knudsen described evidence of synaptic changes at the cellular level that indicate that a sensitive period can be opened by experience (Knudsen 2004). Further, he suggests that intensive experience that occurs early in a sensitive period has a unique advantage, because sculpting of circuits by experience early in a sensitive period will shape the way those circuits respond to additional experience later in the sensitive period and beyond. For example, early plasticity in motor implementation and sensorimotor integration may lay down highly tuned circuits that can later be further optimized by learning mechanisms that remain plastic throughout life. This is consistent with the results of our study, which show that ET musicians continued to improve on the measure of response synchronization, and to out-perform the LT musicians across 5 days of practice. Further support for enhanced plasticity comes from a recent study of tactile discrimination in professional pianists (Ragert et al. 2004). This study showed that not only did pianists have lower sensory discrimination thresholds compared to non-pianists, but that with additional training, pianists were able to improve those thresholds to a greater degree than non-pianists.

In the present study, ET musicians showed specific enhancement of their ability to learn to synchronize their motor responses to a rhythmic visual sequence. Current theories of motor control maintain that learning and sensorimotor integration are based on error-correction and predictive control mechanisms that have been linked to the cerebellum. Both neurophysiological studies in animals and neuroimaging studies in humans have demonstrated cerebellar involvement in tasks requiring motor learning (Karni et al. 1995; Toni et al. 1998; Kleim et al. 2002; Doyon et al. 2003), timing (Schubotz et al. 2000; Ivry et al. 2003) and sensorimotor integration (Bower 1995; Gao et al. 1996). Further, structural changes in the cerebellum have been shown to occur with learning of a novel task (Kleim et al. 2004). Neuroimaging studies from our lab, using the same task, have shown engagement of the cerebellum during learning (Penhune and Doyon 2003, 2005). It is possible that in ET musicians, intensive early experience with tasks requiring motor learning, timing and sensorimotor integration results in preferential enhancement of cerebellar circuits. Evidence for cerebellar plasticity in musicians comes from a structural MRI study showing enlargement of the cerebellum that was correlated with lifetime practice in male keyboard players (Hutchinson et al. 2002). The cerebellum, along with the hippocampus, maintains a high degree of plasticity throughout life. Early training may simply enhance the cerebellum’s ability to integrate the sensory and motor information required for learning.

The results of this study show convincing group differences in performance for ET and LT musicians. This conclusion is strengthened by the fact that the groups were matched for years of experience, formal training and current practice. However, there were also clear individual differences in performance, and not all ET musicians performed better than LT musicians. Therefore, it is likely that there are other factors that we did not control that contribute to the observed differences between the groups. The most important of these is early motor ability. Early ability may be related to two factors: (1) genetically determined differences in central and peripheral motor control, or general cognitive abilities such as sustained attention; and (2) individual differences in motivation or environment. Evidence from studies of musicians with absolute pitch shows that there may be a genetic contribution to this ability (Baharloo et al. 1998; Zatorre 2003), although it cannot be developed without training. Similarly, a genetic predisposition for earlier development of motor skills or sustained attention abilities could underlie ET musicians’ tendency to start training earlier and to obtain greater benefit from practice. Importantly, motivation can strongly affect learning and plasticity, as demonstrated by experiments in which auditory learning is enhanced by reward or survival saliency (Beitel et al. 2003; Knudsen 2004). Therefore, children with greater intrinsic motivation or with greater family motivation may begin earlier and learn better. Finally, environmental factors such as access to musical training and family support for persistence in musical training could also play important roles. In the future, studies examining matched groups of early- and late-starting children undergoing the same type of musical training will shed light on the contributions of these factors.