Introduction

The diagnostic category of developmental speech sound disorders (SSD) poses a clinical problem due to its size, heterogeneous symptomatology, limited research base, and poor long-term outcomes. SSD is the most prevalent of childhood communication difficulties, constituting more than 70 % of pediatric speech-language pathology caseloads [1]. One UK study [2] reported an incidence (referral) level of 6.4 % of the total child population aged 2–16 years. However, these children differed in terms of age at referral, severity, type of speech errors, associated impairments, aetiological factors identified, developmental trajectory, and response to specific intervention programmes. Such heterogeneity bedevils research into SSD and other developmental communication impairments [3•], leading to contradictory findings that diminish development of theory and clinical practice. While there is consensus with regard to the need for a way of classifying children with SSD into homogeneous subgroups in order to extend research and inform clinical intervention, as yet there is no agreed system [4•]. What is incontrovertible is that unintelligible speech has immediate and long-term negative consequences for children and their families. Even in adulthood, a history of childhood SSD is associated with poorer academic, social, and psychological outcomes than those of childhood peers exhibiting typical speech development [5]. This review examines issues that have been identified, focusing on recent research.

The Nature of SSD

Diagnostic Guidelines

A lack of clear diagnostic guidelines for the identification of children with SSD is a source of confusion. The American Speech-Language-Hearing Association (ASHA) website [6] currently defines SSD as including ‘problems with articulation (making sounds) and phonological processes (sound patterns),’ and provides a link to Bowen’s website [7], where the difference between the two types of speech errors is elaborated. Articulation disorder involves the oral movements that result in speech sound production. The aetiology may be organic, such as anatomical anomaly (e.g., cleft palate) or impaired muscle function (e.g., cerebral palsy). Other articulation difficulties, such as a lisp, have no known cause, appearing to be a mis-learning of how to move muscles to articulate a perceptually recognized version of the target speech sound. In bilingual children, an articulation disorder shows identical characteristics in both languages [8]. In contrast, a phonological disorder is language-specific, with some different error patterns apparent in each language spoken. A phonological impairment is an impaired ability to learn the speech-sound contrasts that discriminate words and the constraints that govern how those sounds can be combined [3•]. For example, in English, /r/ and /l/ are two separate speech sounds because they contrast word pairs such as roar and law, whereas native speakers of Japanese do not perceive the two sounds as being different. Each language’s phonology also disallows use of sounds in certain word positions (e.g., in English, /ts/ occurs word finally in bits, but not word initially, explaining why English speakers have difficulty saying the Japanese word tsunami).

The distinction between articulation and phonology, however, seems to be poorly understood. The ASHA website [6] designates the speech error [nanə] for banana as an articulation error, whereas Bowen [7] would identify it as an example of the common phonological error pattern weak syllable deletion, where ‘unstressed syllables are deleted from words of more than one syllable.’ It is not surprising, then, that recent research articles have not always determined whether participants have a motor-based disorder affecting articulation, a language-based disorder affecting phonology, or comorbid articulation and phonological impairments. A widely cited 2006 definition [e.g., 9•] stated that SSD was ‘a significant delay in the acquisition of articulate speech sounds’ [10, p. 1294], excluding phonology as well as the concept of disorder. In 2013, however, many articles acknowledged that SSD is a generic term for a diverse population (e.g., SSD ‘refers to problems with speech sound production, perception, and/or phonological representation which may make speech difficult to understand’ [11, p. 296]).

Describing Speech Sound Disorder

The way a child’s speech is described reflects researchers’ and clinicians’ understanding of the behaviour. There are three categories of descriptors: speech error symptomatology, abilities (sometimes hypothesised to be causally) associated with speech impairment, and the language learning environment. Each domain can be assessed in a variety of ways, with no one measure providing an adequate theoretical or clinical account [12•]. Furthermore, research suggests that these domains interact to determine long-term outcome for children with speech difficulties [5, 13].

Speech Characteristics

Speech characteristics are described in a range of different ways. Speech sound repertoires list sounds whose articulation is stimulable – those sounds used as phonological contrasts to distinguish between words. The most commonly used measures count the number of correct productions of particular aspects of speech. Percent consonants correct (PCC) is nearly always reported, despite being affected by the size and content of the speech sample analysed [7]. It provides no information about vowels or phonotactic constraints and makes no distinction among omission, substitution, and distortion of speech sounds. The most significant issue with PCC is that age-appropriate, delayed, and developmentally atypical errors have equal weighting. As a severity measure, PCC therefore places typically developing children at risk of being wrongly identified as having SSD, and grants children with delayed phonological development the same priority for intervention as those with disordered development.
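The equal-weighting problem with PCC can be illustrated with a minimal sketch. It assumes the usual definition of PCC as the number of correctly produced consonants divided by the number of consonants attempted, expressed as a percentage. The function name, the score labels, and the two samples are hypothetical and are not taken from any of the cited assessments.

```python
from typing import Iterable

# Hypothetical labels for how each target consonant was realised.
ERROR_LABELS = {"omission", "substitution", "distortion"}


def percent_consonants_correct(scores: Iterable[str]) -> float:
    """Return PCC given one label per target consonant in the sample."""
    labels = list(scores)
    if not labels:
        raise ValueError("at least one consonant must be scored")
    unknown = set(labels) - ERROR_LABELS - {"correct"}
    if unknown:
        raise ValueError(f"unknown score labels: {sorted(unknown)}")
    # Every error type counts the same: only 'correct' adds to the numerator.
    correct = sum(1 for label in labels if label == "correct")
    return 100.0 * correct / len(labels)


# Two invented samples with very different error types.
sample_with_omissions = ["correct"] * 6 + ["omission"] * 4
sample_with_distortions = ["correct"] * 6 + ["distortion"] * 4

print(percent_consonants_correct(sample_with_omissions))
print(percent_consonants_correct(sample_with_distortions))
```

Both invented samples receive the same score, although one child omits consonants and the other distorts them, which is the loss of information described above.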

Recent literature reveals a trend to investigate and compare different types of speech measures [14, 15•]. Proportion of whole-word proximity counts all phonemes (consonants and vowels) in a child’s production of a word (correct plus incorrect), and then adds the number of accurately produced consonants. This sum is then divided by the number of phonemes plus consonants in the adult production of that word. Although unwieldy, this measure was shown to be more sensitive to phonological change than other measures in four case studies of persisting speech disorder [14].
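The calculation described above can be written out as a short sketch. The function and parameter names are hypothetical, and the phoneme counts in the example are an illustrative transcription of the [nanə] for banana error mentioned earlier (assuming a six-phoneme adult target with three consonants), not data from the cited case studies.

```python
def whole_word_proximity(
    child_phonemes: int,
    child_correct_consonants: int,
    target_phonemes: int,
    target_consonants: int,
) -> float:
    """Proportion of whole-word proximity for a single word.

    Numerator: all phonemes the child produced (correct plus incorrect)
    plus the consonants produced accurately.
    Denominator: phonemes plus consonants in the adult production.
    """
    denominator = target_phonemes + target_consonants
    if denominator <= 0:
        raise ValueError("the adult target must contain at least one phoneme")
    return (child_phonemes + child_correct_consonants) / denominator


# Assumed counts: adult target has 6 phonemes (3 consonants);
# the child's [nanə] has 4 phonemes, with both consonants accurate.
proximity = whole_word_proximity(
    child_phonemes=4,
    child_correct_consonants=2,
    target_phonemes=6,
    target_consonants=3,
)
print(round(proximity, 2))
```

Under these assumed counts the proximity is 6/9, so the deleted syllable lowers the score even though every consonant the child produced was accurate.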

A study exploring consistency of word and speech sound production found that these two measures were not related [15•]. High word inconsistency, however, was associated with poor performance on receptive vocabulary measures. This finding led to the conclusion that inconsistency is associated with ‘unstable’ lexical representations that may not be fully specified [15•]. An alternative account proposes that word variability reflects errors selecting and sequencing phonemes when assembling a plan for uttering a word, a deficit requiring a specific intervention approach [16]. Another measure of speech characteristics describes error patterns (phonological processes [17]), often reflecting developmental phonological constraints (e.g., prevocalic voicing, cluster reduction, substitution of velar by alveolar sounds), but also developmentally atypical constraints (e.g., initial consonant deletion, backing of alveolars). Some standardized assessments [e.g., 18] identify the number of times error patterns are used in a picture-naming task by more than 10 % of children in six-month age bands, allowing identification of age-appropriate errors, delay (use of error patterns typical of a younger age group), and disorder (use of error patterns not apparent at any age in the normative sample). Few studies provide this information about their participants with SSD.

The assessments reviewed rely on auditory-perceptual measurement, which has strengths in clinical practice where resources for instrumentation are scarce [19]. Instrumental measurement has been shown to provide more valid and reliable description of the characteristics of speech, particularly for children whose speech sound disorder involves organic impairment (cleft palate, childhood apraxia of speech, and dysarthria) and those with voice and fluency disorders. Subtle motor control impairments can co-occur with phonological delays and disorders, making differential diagnosis difficult [20].

Associated Abilities

Associated abilities include measures of speech input proficiency, motor skills affecting output, and cognitive-linguistic processing. One common lay assumption is that SSD is caused by peripheral difficulties such as poor auditory discrimination or oromotor difficulties. Research evidence is ambiguous. Parallel difficulties in perception and production of the same speech sounds have raised the question of direction of causality. A recent case study [21] of a four-year-old boy examined this phenomenon. He substituted word initial /k/ with [t] (e.g., [tʌp] for cup) but /k/ was correct word finally (e.g., [wʊk] for look). In contrast, he made few perception errors discriminating non-word initial /t/ from /k/, but many word finally. Across all phonetic contexts, however, he made more perceptual errors discriminating /t/ from /k/ than for similar sound pairs he did not produce in error. The conclusion that the authors reached was that ‘a primary production deficit can cause decreased perceptual ability’ [21, p. 259]. The production errors of some children reflect speech motor control deficits (e.g., childhood apraxia of speech, different types of dysarthria), although the relationship between motor control, speech articulation, and intelligibility is complex [20]. For example, performance on a standard test was not significantly correlated with speech motor control measures in a group of children with severe speech impairment and poor sequencing on a Verbal Motor Production Assessment for Children subtest [20]. As such, current research casts doubt on input and output deficits as a general explanation for SSD, although perceptual and motor control impairments can cause speech difficulties in some cases.

Cognitive-linguistic abilities provide alternative accounts of SSD. Impaired phonological working memory (PWM) is thought to underlie speech and language difficulties because it underlies the creation of phonological representations [22]. Recent research, however, suggests that PWM is dependent upon the ability to assemble phonology for speech output [23]. Children were better able to repeat non-words that contained consonants and consonant combinations that they used frequently as opposed to non-words constructed of infrequently used consonant sequences. Another study that longitudinally explored PWM revealed that slowed acquisition of phonotactic probability knowledge could explain poor non-word repetition performance [12•]. Both of these studies provide evidence that having a speech difficulty impairs PWM, not that PWM underlies speech errors.

Two other cognitive-linguistic abilities have been associated with SSD. Poor phonological awareness is associated with impaired acquisition of speech and literacy, suggesting that the same cognitive-linguistic deficit in phonological processing may underlie both spoken and written phonological errors [24]. While the term phonological awareness (PA) usually refers to knowledge of linguistic units of speech (syllables, onset-rimes, phonemes), assessment tasks can involve abilities crucial for phonological acquisition. Most current theories assume that toddlers have the ability to implicitly abstract the ‘constraints’ or ‘regularities’ of the phonological system(s) they are learning (e.g., through statistical learning [25]). There is some empirical support for SSD involving an impaired ability to derive phonological constraints. Toddlers who consistently make developmentally atypical speech errors also do poorly on non-verbal rule-learning tasks [26], and children with disordered phonological development do more poorly on executive function tasks of rule abstraction and cognitive flexibility than those with delayed development [27•]. In addition, in a longitudinal follow-up of children with SSD, atypical speech errors made as preschoolers predicted PA at seven years of age [28•]. There is a trend for research investigating abilities associated with SSD to increasingly use executive function tasks [e.g., 29].

The Language Learning Environment

The language learning environment was the default explanation for SSD when no identifiable cause for a child’s speech difficulties was apparent. Although a genetic explanation has provided an alternative account for functional SSD [10], current opinion holds that ‘multiple environmental factors can influence developmental pathways’ irrespective of organic conditions [30•, p. 487]. These factors include socioeconomic status, lifestyle, and environment. An epidemiological study [13] reported that children with SSD raised in adversity were at greater risk than children from affluent families in terms of severity and additional diagnoses. There were complex interrelationships between demographics, case history factors, non-verbal abilities, and co-occurring developmental difficulties that affected the nature and severity of presenting cases of SSD, thus influencing outcome of intervention.

Limitations of the Research Base

Evidence-based practice for SSD depends upon the validity of published research. Evaluation of that knowledge base is crucial as it grows and changes. Three limiting factors on SSD research are participant selection, measures used, and evaluating outcome of intervention.

Participant Selection

The participants in studies of SSD are nearly always heterogeneous, both within and across studies. The latest review of evidence-based practice for children with SSD included 134 studies that excluded children with any difficulties (e.g., organic motor, language, and hearing) other than ‘organisation and use of phonemes to signal meaning’ [9•, p. 102]. Additional information about the nature of the participants included age (1;11–10;5) and severity category, limiting the understanding of when a particular approach to intervention might be used with a specific case of SSD. In contrast, some studies published in the last year recruited participants within a narrow age range and provided information about referral, demographics, language background, standard score criteria on assessments of speech, language and cognition, as well as detailed description of the nature of speech errors made [e.g., 15•, 20, 28•]. This focus on qualitative information regarding participants’ speech characteristics allows investigation of specific questions in specific populations of children [e.g., 12•, 23].

Assessment Measures

The tasks researchers use to assess the abilities of children with SSD are rarely pure measures of a specific ability. For example, phonological awareness (PA) performance varies with linguistic unit (syllables, onset-rimes, phonemes) and task demand (identification, segmentation, manipulation of speech units). An individual’s task performance is affected by age, the phonological structure of language(s) learned, and exposure to PA activities at home and at school [24, 31]. Thus, task demands and participant-specific factors have major implications for assessment of particular abilities associated with SSD. It is not surprising, then, that assumptions about what a particular task may assess will change as knowledge grows. As an example, it is generally acknowledged that the use of non-word repetition to assess PWM is problematic for children who make errors pronouncing real words, which led to an assessment specifically for children with SSD [32]. Knowledge about the effect of SSD on non-word repetition tasks has been extended by two recent studies [12•, 23] that challenge PWM as an adequate explanation for SSD. The proposed deficits underlying SSD in some children may involve phonological planning and/or the ability to abstract phonological constraints. Research studies identifying tasks that would discriminate between different types of SSD would allow development of novel theoretically based interventions.

Outcome of Intervention

Determining the value of intervention in SSD is complicated by the specific nature of the disorder, therapeutic approach, the amount and scheduling of intervention, and the timing of assessment of outcome. Reviews [e.g., 9•, 33] of past intervention studies provide limited description of the abilities of participants receiving a range of treatment approaches (e.g., articulation, phonology, whole language) of varying amounts and scheduling: sessions 15–270 minutes in length, occurring between one and five times a week, over six to 46 months, amounting to between 17 and 240 hours of intervention. More recent intervention research provides more information with regard to speech characteristics, focusing on specific aspects of the intervention process, like service delivery or intervention approach. For example, a Portuguese study [34] compared phonological and articulation approaches for SSD. Fourteen children aged 4.0–6.7 years, with severe phonological rather than articulatory difficulties, were randomly allocated to two treatment groups receiving 25 sessions of 45 minutes each. Assessments conducted before and after therapy episodes revealed variation in severity and error types. Both groups improved, but children receiving phonological intervention (including PA activities) made much greater progress and showed more generalization to untreated words than children receiving articulation therapy. The strengths of this paper included detailed description of each child’s speech characteristics and the intervention provided. The study might also have provided multiple baseline assessments to detect any spontaneous pre-therapy improvement and maintenance of therapy gains post-therapy, which would have further clarified the nature of their SSD and the usefulness of the intervention. Furthermore, like most current intervention research on SSD, this study rarely differentiates between delay and disorder and, if it is assessed, excludes children making inconsistent errors. These factors limit the potential for the generalisation of findings, both clinically and theoretically.

Classification of Children with SSD

Consequently, Waring and Knight [4•] recently argued for debate on ways of classifying children currently categorized under the generic term ‘speech sound disorders’ in an effort to promote professional communication between clinicians and to elucidate research issues. Their review examines approaches to classification using criteria that evaluate reliability (clinical agreement about a given case), validity (face, descriptive, predictive, construct), coverage (ability to differentially diagnose all presenting individuals), and feasibility (use by speech language pathology clinicians). Three systems were reviewed in depth.

Speech Disorders Classification System (SDCS)

The Speech Disorders Classification System (SDCS) [e.g., 10] is aetiologically based. It comprises eight subgroups: three types of speech delay (genetic, otitis media with effusion, psychosocial), three types of motor speech disorders (apraxia, dysarthria, others not specified), and two groups of residual speech errors (/s/ and /r/). All but the speech error groups are argued to reflect a genetic anomaly linked to specific speech behaviour. The review [4•] concluded that the SDCS is an evolving research tool, currently limited by issues with validity, the potential for children to belong to more than one aetiological subgroup (e.g., OME and family history of SSD), and the ability of clinicians to identify putative subgroups.

Psycholinguistic Framework

The psycholinguistic framework [e.g., 35] was designed to identify underlying deficits in speech processing. Deficits can occur in peripheral hearing, phoneme discrimination, storing accurate phonological representation, and phonological planning and/or execution. Psycholinguistic assessment is the same for all children with SSD. A model of the speech processing chain is rigorously tested by a series of tasks to identify poor performance thought to reflect an impaired speech processing ability. The major strength of the psycholinguistic framework is in demonstrating that children with SSD who share the same aetiology had different deficits in speech processing, revealing the complexity of associations between causal factors and associated deficits. One problem identified with the framework was the need for extensive assessment that weakened its clinical feasibility.

Model for Differential Diagnosis

The model for differential diagnosis [e.g., 36] used speech characteristics to identify separate subgroups that were then compared to identify underlying subgroup-specific processing deficits and response to therapy. The five subgroups of speech disorder identified were:

(i) Articulation disorder: substitutions or distortions of the same sounds in isolation and in all phonetic contexts during imitation, elicitation, and spontaneous speech tasks (e.g., lateral lisp). This phonetic disorder affects around 12 % of all children with functional SSD and is most successfully treated by traditional articulation therapy [13].

(ii) Phonological delay: presence of speech error patterns that are typical of younger children as determined by normative data where fewer than 10 % of children in a six-month age band produced the error in five different words on a standard test of 50 words (e.g., in English: stopping of fricatives; deletion of /l, r, w, j/ in stop + continuant clusters; weak syllable deletion). This phonemic disorder affects around 55 % of all children with functional SSD. Intervention studies indicate that both whole language and phonological contrast intervention are successful approaches to therapy [34, 36].

(iii) Consistent atypical phonological disorder: consistent use of one or more unusual non-developmental error patterns as determined by normative data where fewer than 10 % of children, in any age band, produced the error pattern in five different words (e.g., backing, initial consonant deletion). A child may also display some developmental error patterns that are delayed or age-appropriate. This phonemic disorder affects around 20 % of all children with functional SSD. Phonological contrast therapy is the only therapeutic approach thus far that has been shown to resolve this SSD [36] (Table 1).

(iv) Inconsistent phonological disorder: multiple phonemic error forms for the same lexical item while having no oromotor difficulties, determined by the production of 25 words in three separate trials, with a criterion of 40 % for diagnosis of inconsistency (based on normative data of <10 % inconsistency for typically developing children and <30 % for children with delay or consistent atypical disorder). Children perform better in imitation than spontaneous production (cf. CAS). This phonological assembly disorder affects about 10 % of children with functional SSD. Core vocabulary therapy that focuses on whole words usually generalises to non-targeted words, establishing consistency and improving accuracy, although follow-up phonological contrast intervention may be indicated once speech is consistent [16].

(v) Childhood apraxia of speech (CAS): Speech characterised by inconsistency, oromotor signs (e.g., groping, difficulty sequencing articulatory movements), slow speech rate, disturbed prosody, short utterance length, poorer performance in imitation than spontaneous production. CAS is rare, and reliable identification is clinically challenging. It may involve multiple deficits affecting phonological and phonetic planning as well as motor program implementation.

Table 1 Summary of evidence for differential diagnosis approach to classification

The review concluded that the differential diagnosis approach to classification had strengths in reliability, validity, and coverage, but that it was not being widely used either clinically or in research. Table 1 summarises, and updates, the evidence supporting the case for the differential diagnosis model, set out in the review [4•].
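The inconsistency criterion described for subgroup (iv) lends itself to a small worked sketch. It assumes that a word counts as inconsistent when its three productions are not all identical, and that a score at or above 40 % meets the criterion. Both are simplifying assumptions, since the text specifies multiple phonemic error forms and does not state whether the cut-off is inclusive. The function names and the placeholder transcriptions are hypothetical.

```python
from typing import Mapping, Sequence

# Cut-off taken from the text; treating it as inclusive is an assumption.
INCONSISTENCY_CRITERION = 40.0


def inconsistency_percent(productions: Mapping[str, Sequence[str]]) -> float:
    """Percentage of words produced differently across three trials."""
    if not productions:
        raise ValueError("at least one word is required")
    for word, trials in productions.items():
        if len(trials) != 3:
            raise ValueError(f"expected three trials for {word!r}")
    # Simplification: any difference between trials marks the word as variable.
    variable = sum(1 for trials in productions.values() if len(set(trials)) > 1)
    return 100.0 * variable / len(productions)


def meets_inconsistency_criterion(score: float) -> bool:
    """True when the score reaches the assumed inclusive 40 % cut-off."""
    return score >= INCONSISTENCY_CRITERION


# Placeholder data: 25 words, 10 of which vary across the three trials.
example = {
    f"word{i}": ("form_a", "form_b", "form_a") if i < 10 else ("form_a",) * 3
    for i in range(25)
}
score = inconsistency_percent(example)
print(score, meets_inconsistency_criterion(score))
```

In this placeholder sample, 10 of 25 words vary, giving a score of 40 %, which sits exactly on the assumed cut-off. A clinical judgement would also need the absence of oromotor difficulties, which the sketch does not model.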

Longitudinal Studies of Outcomes for Children with SSD

One way of evaluating classification systems is to monitor change in different subgroups longitudinally. Unfortunately, recent large-scale longitudinal studies have provided too little information about the nature of the participants to allow identification of different subgroups [e.g., 37•]. Earlier small-scale studies that had followed up children for 12–18 months reported that those with phonological delay made spontaneous progress, while those with consistent or inconsistent disorder did not [38]. Larger-scale long-term studies are needed to determine the usefulness of the differential diagnosis model, not only for evaluating different approaches to therapy for different subgroups but also for predicting literacy difficulties. For example, one study [28•, p. 173] concluded that ‘different preschool speech error patterns predict different school-age clinical outcomes.’ Preschoolers who use atypical speech error patterns are at risk for literacy disorders once they reach school age.

Longitudinal studies thus far have first identified children with SSD when they are at least three years old, yet children produce their first words at around one year, and there is evidence that speech disorder is apparent from around 24 months [39•]. Two assessments have been developed for two-year-olds suspected of being at risk for SSD [40•, 41]. The Toddler Phonology Test [40•] predicts speech disorder one year later if toddlers produce errors atypical of their peers’ normative data. Reliable identification of toddlers with SSD would allow early intervention that might avoid the negative consequences associated with later identification and treatment [5].

Conclusions

One large-scale UK study reported that 5.6 % of all children have persistent SSD at eight years of age, with an additional 7.9 % having residual articulatory errors (e.g., [w] for /r/) [37•]. A similar finding was made for seven-year-olds in Australia [42]. Current limitations in clinical management of SSD place a significant number of school-age children at risk for academic and social failure [5]. Research is emerging that addresses the current poor outcomes for cases of SSD. A number of 2013 studies have documented the value of differential diagnosis in understanding different types of speech impairment [4•, 15•, 28•] and their treatment [e.g., 16]. Better outcomes may result from early intervention for toddlers with SSD now that they can be identified at two years [40•, 41]. Importantly, the attention to participant selection and focus on task validity in recent studies should add to the base of knowledge with regard to the many ways in which speech may be impaired.