1 Introduction

1.1 School readiness in the US

What makes a child “ready for school”? Although definitions of school readiness vary, the concept is generally agreed to encompass physical health as well as socioemotional, language, and cognitive skills that empower children to learn effectively in mainstream classrooms (Kagan et al., 1995; Sabol & Pianta, 2017). Many “school readiness” skills (including attention, classroom engagement, social skills, and key pre-reading and math skills) predict learning throughout later school years, suggesting that they lay an important foundation for academic success (Cooper et al., 2014; Duncan et al., 2007; Fitzpatrick & Pagani, 2013; Pagani et al., 2012). In the United States (US), school readiness is most often assessed as children enter kindergarten, the first year of universally available public education, which most children begin at age 5 (National Center for Education Statistics, 2020). Although these assessments do not determine children’s eligibility for kindergarten (in fact, some states explicitly forbid this; Education Commission of the States, 2020), they provide valuable information about the state of early childhood in the US.

Research on school readiness consistently indicates that children experiencing more developmental risk factors (e.g., living in socioeconomically disadvantaged conditions or belonging to a racial group that experiences discrimination) are often less prepared for kindergarten than their more advantaged peers (Bernstein et al., 2014; Chernoff et al., 2007). In one nationally representative US sample of children entering kindergarten in 2010, those experiencing four risk factors were found to enter kindergarten nearly a year behind in math and reading skills compared to their peers who had not experienced any risk factors (Bernstein et al., 2014). Unsurprisingly, given the associations between school readiness and later learning, these school readiness gaps can translate to persistent gaps in academic achievement (Bloom et al., 2008; Burkam & Lee, 2002).

Participation in high-quality preschool education programs has been associated with boosts in school readiness, especially for children experiencing high levels of risk, and can be an important tool for reducing the aforementioned sociodemographic achievement gaps (Barnett, 1995; Camilli et al., 2010; Cannon et al., 2017; Lazar et al., 1982; Manning et al., 2010; Ramey et al., 2000). However, funding for public preschool programs, and consequently access to those programs, remains limited in the US (U.S. Department of Education, 2015). It also remains unclear which elements of effective preschool programs are most responsible for their benefits, which presents a challenge for programs and policymakers seeking to prioritize the most cost-effective changes. Many features have been used to define preschool program quality, including curriculum, class size, teacher training, parent engagement opportunities, program duration, and instructional quality (Barnett et al., 2017; Burchinal, 2018; Hayakawa & Reynolds, 2014; Ramey & Ramey, 1998; Reynolds et al., 2014; Schweinhart & Weikart, 1988). The present study contributes to efforts to pinpoint mechanisms of preschool program effects by homing in on one such element, the balance of child-initiated and teacher-directed instruction, and its relation to school readiness.

1.2 Child-initiated practices

Child-initiated approaches to education derive from a Piagetian, constructivist view of children as active learners who construct their own knowledge (Piaget, 1970; Schweinhart & Weikart, 1988; Stipek & Byler, 2004). They are characterized by hands-on activities that give children substantial agency over their learning, and are developmentally aligned with young children’s propensity to learn through hands-on experimentation and play (Fisher et al., 2010; National Association for the Education of Young Children, 2009; Schweinhart & Weikart, 1988). Child-initiated instruction is thought to support learning by building on children’s desire for exploration (which can be stunted by explicit teaching; Bonawitz et al., 2011); maintaining their intrinsic motivation (Deci & Ryan, 2000; Stipek et al., 1995); and promoting deeper, more generalizable learning by allowing children to discover ideas for themselves (Alfieri et al., 2011; McDaniel & Schlager, 1990).

The use of child-initiated learning activities in early childhood has repeatedly been associated with improved socioemotional outcomes. In a landmark study by Stipek et al. (1995), 3- to 7-year-olds attending child-centered (versus more didactic) preschool and kindergarten classes showed evidence of greater motivation, higher self-esteem, and less anxiety about school. Other research has echoed this finding, observing that more child-centered teaching practices in preschool and the early school years are associated with less anxiety (Hirsh-Pasek, 1991), fewer stress behaviors (Burts et al., 1992; Hart et al., 1998), greater academic motivation and interest (Kikas et al., 2018; Lerkkanen et al., 2012), more sophisticated social interactions (DeVries et al., 1991), and lower rates of problem behavior (Barnett et al., 2008) during or soon after program participation. Moreover, such effects seem to endure. Attending a child-centered preschool, versus a more teacher-directed one, has been found to predict better classroom behavior in later elementary school (Marcon, 2002), higher rates of high school completion (Graue et al., 2004), and lower rates of misconduct in adolescence and young adulthood (Schweinhart et al., 1986b; Schweinhart & Weikart, 1997) (although this latter finding has been debated; Bereiter, 1986; Mills et al., 2002; Schweinhart et al., 1986a).

For cognitive and pre-academic skills, evidence regarding the effectiveness of child-initiated practices is more equivocal. In some cases, instructional approaches that grant children more autonomy have been associated with greater gains in literacy (Huffman & Speer, 2000; Kikas et al., 2018; Lerkkanen et al., 2016), science (Marcon, 1993), numerical/math (Huffman & Speer, 2000; Lerkkanen et al., 2016; Pakarinen & Kikas, 2019), and more general cognitive skills like problem-solving (Stipek et al., 1998). Other findings, however, are not fully consistent with these benefits (Duncan et al., 2015; Graue et al., 2004; Karnes et al., 1970).

1.3 Teacher-directed practices

Child-initiated learning can be contrasted with more teacher-directed approaches, which emphasize the need for students to acquire basic skills in academic domains like reading and math (Gersten et al., 1986; Stipek et al., 1995). In this approach, the teacher has primary control over the learning process and uses explicit instruction to transmit specific skills and knowledge (Becker & Carnine, 1980; Bereiter & Engelmann, 1966; Fisher et al., 2010; Gersten et al., 1986; Stipek & Byler, 2004). Material is selected by the teacher, and learning usually takes the form of structured, teacher-led activities including worksheets and group instruction (Becker & Carnine, 1980; Cole et al., 1991; Fisher et al., 2010; Schweinhart & Weikart, 1988).

Although didactic, teacher-directed instruction is less aligned with developmental principles, there is some evidence that it can support gains in cognitive and academic skills in pre-K and kindergarten classrooms. In one foundational study, Karnes et al. (1970) randomly assigned socioeconomically disadvantaged preschool students (around 4 years old) to four different programs with varied degrees of teacher-imposed structure. Over the course of a year, children in highly teacher-structured, academically focused programs showed larger IQ gains than children in less-structured ones. Numerous subsequent studies have found similar patterns, linking more didactic instruction in early childhood to higher IQ scores (Miller & Dyer, 1975; Schweinhart et al., 1986b), improved language and literacy skills (Duncan et al., 2015; Graue et al., 2004; Marcon, 1999; Stipek et al., 1995, 1998), and better math skills (Duncan et al., 2015; Graue et al., 2004; Marcon, 1999; Stipek et al., 1998). The persistence of these cognitive and academic gains, however, is variable and remains an outstanding research question (DeVries et al., 1991; Duncan et al., 2015; Miller & Dyer, 1975; Schweinhart et al., 1986b).

1.4 The role of child characteristics

Despite evidence for main effects of child-initiated and teacher-directed practices, it is also important to recognize that children enter preschool and early school settings with vast individual differences in skills, knowledge, personality, and socioemotional development. Based on these factors, different instructional approaches may provide a good fit for different children (Cole et al., 1993; Connor et al., 2004, 2009; Kikas et al., 2014; Snow, 2005). Children who are not yet proficient in foundational literacy, math, and self-regulation skills, for instance, might find it difficult to extract information from unstructured, child-initiated activities and may benefit from more structured, direct instruction (Becker & Carnine, 1980; Gersten et al., 1986; Kinder & Carnine, 1991; Kirshner et al., 2006; Sweller et al., 2019).

Cole et al. (1993) found some evidence for this phenomenon in a study of 4- to 5-year-old preschool and kindergarten students eligible for special education; a didactic curriculum seemed to be more effective at promoting reading skills for students with a weaker initial vocabulary, while a more child-initiated curriculum was more effective for those with a stronger vocabulary. Similarly, Connor et al. (2004) found that teacher-managed, explicit instruction was associated with greater improvement in first graders’ decoding skills, but only for students whose skills were initially relatively weak. More recently, Kikas et al. (2014) noted similar patterns at the classroom level. They observed the extent to which child-centered and teacher-directed practices were used in first- and second-grade classrooms (with 6- to 7-year-old students), and examined how these practices predicted task persistence and academic skills in classes with different initial characteristics. In classes with high overall levels of task persistence at the start of the study, child-centered practices predicted increases and teacher-directed practices predicted decreases in task persistence over time. In classes with initially low task persistence, the inverse pattern was found. These results suggest that the initial collective behavior of the class altered how effectively different teaching practices supported the development of students’ task persistence. There is also evidence that teachers in the early elementary grades adjust their teaching to be more didactic and teacher-directed when the overall skill level of their classroom is lower, perhaps because they intuitively believe these approaches to be most helpful when students’ skills are weakest (Kikas et al., 2018; Pakarinen & Kikas, 2019). Collectively, these findings highlight the importance of further exploring how children’s initial skill level might shape their responses to different teaching practices and contribute to the effectiveness of these practices.

Other research points to the importance of considering not just specific learning-related skills but also development more broadly. Stipek et al. (1998), for instance, found that kindergarteners (5.25 years old, on average) who spent a school year in more academically focused classes subsequently performed better on tests of math and reading. For preschoolers (average age of 4.4 years), however, an academic focus was associated with relatively lower math and reading scores. This pattern appears to suggest that more child-initiated approaches may be more developmentally appropriate for younger children, while older children can benefit more from teacher-directed learning activities. It is worth noting, however, that some studies have failed to find variation in instructional effects based on children’s skill or developmental level (e.g., Dale & Cole, 1988; Lerkkanen et al., 2016), or found that these effects fade over time (Jenkins et al., 2006). Clearly, further research is needed to probe the validity of these moderated instructional effects, the conditions in which they occur, and their persistence.

In addition to skill and developmental level, sex and gender have been proposed as potential influences on children’s response to instructional practices (Burts et al., 1992; Hart et al., 1998; Kiuru et al., 2012). Young boys, for instance, have been found to show more stress behaviors than girls in teacher-directed but not in more child-centered kindergarten (Burts et al., 1992) and preschool (ages 3 to 5; Hart et al., 1998) classrooms, suggesting that a didactic approach might be less optimal for them. Perhaps the intersection of gender-specific socialization and the behavioral demands of more teacher-directed classrooms tends to produce a better fit for female than for male students. A number of other studies have also reported gender-specific effects of early childhood program instruction (DeVries et al., 1991; Graue et al., 2004; Marcon, 1999; Miller & Dyer, 1975). However, these effects have often appeared only post hoc, have not always shown a consistent pattern, and have not consistently replicated (Huffman & Speer, 2000; Huston-Stein et al., 1977; Kiuru et al., 2012; Marcon, 1993, 2002).

Overall, there is a need for further examination of how individual characteristics shape children’s response to instructional experiences. This endeavor has both theoretical and practical importance. Understanding which approaches work for which groups of children can provide insight into the mechanisms whereby instruction supports positive outcomes, and can assist programs and teachers seeking to optimize instruction for diverse students.

1.5 A blended approach

Child-initiated activities that allow children’s exploration to drive learning have been found to be effective for preschoolers—evidence that is reflected in professional guidelines for early childhood educators (National Association for the Education of Young Children, 2009). However, as reviewed above, evidence for teacher-directed instruction is not uniformly negative; depending on learning aims and child needs, it can be effective for teachers to take a more directive role (Cole et al., 1993; Kikas et al., 2014). Furthermore, even “child-initiated” approaches tend to involve some active teacher participation. There is a difference between instruction that is child-initiated or -centered (children’s interests take the lead, but teachers provide feedback and opportunities for exploration) and child-dominated (teachers step back, allowing children to fully take charge) (Kikas et al., 2014). An interesting question, therefore, is how child- and teacher-directed practices can be blended in early childhood classrooms to allow children and teachers to most effectively share control over the learning process (Lerkkanen et al., 2012; Stipek & Byler, 2004). Thus far, mixed results and methodological differences, especially in conceptualization of what it means to “blend” instructional practices, have made it difficult to draw firm conclusions.

One way of addressing this question involves comparing child outcomes across programs whose theoretical basis promotes more or less teacher control over the learning process. The High/Scope Preschool Curriculum Study, for instance, compared the DISTAR program (which focused on teaching pre-academic skills using direct instruction and teacher reinforcement), the High/Scope program (in which teachers structured opportunities for child exploration), and a “nursery school” program (which emphasized children’s desires and free play). As noted above, the High/Scope study and others using similar methods have found that highly didactic, teacher-directed models tend to be associated with relatively large short-term cognitive gains (Karnes et al., 1970; Miller & Dyer, 1975; Schweinhart et al., 1986b) but small socioemotional gains (Schweinhart et al., 1986b; Schweinhart & Weikart, 1997). However, among programs that are not mainly didactic and emphasize children’s active role in their learning (such as Montessori, High/Scope, and traditional “nursery schools”), more subtle variation in emphasis on child autonomy has not been consistently associated with outcomes. In the High/Scope study, for instance, children attending a highly child-initiated “nursery school” showed few social or academic differences from children attending the High/Scope program where teachers played a more active role in structuring learning (Schweinhart et al., 1986b; Schweinhart & Weikart, 1997).

These studies may appear to suggest that, past a certain point, granting young learners more autonomy has limited implications for learning. However, their conclusions are somewhat limited by a focus on theoretically distinct programs that likely differed in many respects and the fact that within-program variation was not examined (Marcon, 1992). Marcon (1992, 1999, 2002) aimed to address these limitations by using cluster analysis to characterize existing preschool classrooms, regardless of program philosophy, based on teachers’ self-reported educational beliefs and practices. Three clusters were identified: child-centered classrooms where teachers allowed children to drive learning; academically directed classrooms where teachers favored direct instruction; and “middle-of-the-road” classrooms where teachers combined elements of the other models. Preschool (4- to 5-year-old) students attending “middle-of-the-road” classrooms had the weakest academic and behavioral outcomes at the end of the year—perhaps, Marcon suggests, because instruction in these classrooms was less likely to follow a coherent theory. Graue et al. (2004) characterized the balance of child- and teacher-directed instruction in a similarly bottom-up way by asking teachers to rate their early childhood program on two dimensions—focus on child-initiated and teacher-directed instruction—and to describe the instructional materials and teaching philosophy used. They classified programs into four groups: low on both dimensions, high on both dimensions, or low on one but high on the other. Children attending preschool programs with a strong focus on both child- and teacher-directed instruction tended to have the highest scores on a school readiness assessment (a composite of cognitive skills including pre-literacy and math) at kindergarten entry, suggesting that a blend of child-initiated and teacher-directed practices might best support school readiness.

Although both Graue et al. (2004) and Marcon strove to classify programs based on practices more than philosophy, their findings appear contradictory. One seems to suggest that blending child- and teacher-directed approaches is optimal for preschool students (Graue et al., 2004) and the other that a blend is the least optimal (Marcon, 1992, 1999, 2002). The studies’ categorical approaches to instruction make this discrepancy difficult to parse. A classroom that is “midway” between highly child-initiated and highly academically focused, and a classroom that is “high” in both child-initiated and teacher-directed practices, could both be said to “blend” approaches. However, without more precise measurement of the degree to which different practices were used and how they were blended, it is difficult to interpret the discrepant findings.

Recent work on child-initiated and teacher-directed methods has taken steps to better specify an optimal balance by conceptualizing these approaches more continuously. In some cases, balance has been examined at the level of individual learning activities, as the extent to which the child’s vs. the teacher’s agenda drives the structure and content. In research on playful learning, for instance, teacher involvement has been conceptualized along a continuum from free, fully child-directed play to didactic teaching, with degrees of guidance in between (Fisher et al., 2010; Pyle & Danniels, 2017). Evidence suggests that guided play, in which adults provide guidance via co-play and questions while allowing children to explore and take initiative, can best support learning (Fisher et al., 2013; Weisberg et al., 2016). Research on autonomy-supportive teaching also supports the idea that sharing control within learning activities can support student outcomes. Autonomy-supportive teaching centers student perspectives through practices like offering choice, inviting students to pursue their interests, and allowing them to work at their own pace (Cadima et al., 2019; Reeve & Cheon, 2021); it seeks to balance teacher and child control by providing space for child autonomy while still providing scaffolding for developing skills. These practices have been associated with improved intrinsic motivation, self-regulation, and other academic benefits (Furtak & Kunter, 2012; Reeve & Cheon, 2021).

Other recent work has taken a slightly different approach, examining how child-initiated and teacher-directed practices are balanced across a school day by looking at the frequency of activities that reflect each approach. Much of this research has used the Early Childhood Classroom Observation Measure (ECCOM), which describes practices falling into two dimensions (child-centered and teacher-directed); observers rate the percentage of time that each practice is seen (Lerkkanen et al., 2012; Stipek & Byler, 2004). “Child-centered” practices reflect constructivist principles and involve facilitating child exploration, providing opportunities to engage with material in diverse ways, and drawing meaningful links between lessons (Stipek & Byler, 2004). “Teacher-directed” practices, in turn, are more didactic, with a focus on paper/pencil tasks and discrete skills taught in relative isolation (Stipek & Byler, 2004).

The ECCOM conceptualizes child-centered and teacher-directed practices as separate dimensions; it is possible for a classroom to score low on both dimensions if there is an absence of deliberate instructional methods (Stipek & Byler, 2004). However, in multiple studies that have used this measure to describe instruction in kindergarten to third grade classrooms across several countries, the two dimensions have been strongly inversely correlated (Kikas et al., 2014; Lerkkanen et al., 2012; Pakarinen & Kikas, 2019). Findings regarding child-centered and teacher-directed practices have also sometimes shown inverse patterns (e.g., Kikas et al., 2014), suggesting that, although these practices might be conceptually orthogonal, they tend to balance each other in practice. If a teacher spends a large percentage of instructional time using child-centered methods, they likely spend less time being didactic and directive.

The ECCOM has shown good psychometric properties (Stipek & Byler, 2004), and has the typical benefits of an observational measure; observers can be trained to look for the same behaviors, making the tool an arguably more reliable assessment of teaching practices than a teacher self-report. However, the ECCOM is also time-intensive, requiring a 3-h observation per classroom (Stipek & Byler, 2004). This makes it well-suited as a research tool, but difficult to use with very large samples and not very practical for time-strapped teachers and programs to use in self-monitoring. If the balance between child-initiated and teacher-directed practices is to be examined in larger samples, and potentially incorporated into program monitoring and professional development, a more practical tool is needed.

Thus far, research using continuous measures of the frequency of child-centered and teacher-directed practices in preschool, kindergarten, and early primary school classrooms has also generally focused on linear associations between practices and outcomes (e.g., Kikas et al., 2014; Lerkkanen et al., 2012). This work has suggested that more child-initiated practices are often associated with better student academic skills and greater motivation (Kikas et al., 2018; Lerkkanen et al., 2012, 2016; Pakarinen & Kikas, 2019). However, as suggested by Lerkkanen et al. (2012), it may also be fruitful to examine non-linear associations between child outcomes and the relative proportion of child-centered and teacher-directed practices. This could help identify the optimal blend of approaches, and address whether increasing child-initiated practices is always beneficial or whether there is a threshold beyond which the benefits diminish. Research on non-linear instructional effects, in short, is needed to help pinpoint how preschool children and teachers can most effectively share control across the school day.

1.6 The present study

The present study draws on data from the Midwest Expansion of the Child-Parent Center (CPC) program to examine the optimal balance between child-initiated and teacher-directed preschool instruction in a student population with multiple sociodemographic risk factors. The CPC program is an early childhood intervention that provides comprehensive educational and family services to high-risk children and families. First implemented in the Chicago Public Schools in 1967, the program has shown evidence of numerous positive effects, ranging from greater school readiness to higher rates of high school completion and reduced juvenile arrest (Reynolds et al., 2001, 2003, 2011; Reynolds, Hayakawa, et al., 2016a; Richardson et al., 2017). Further details about the CPC program are available in the program manual (Reynolds, Richardson, et al., 2016b). As part of the Midwest CPC Expansion (MCPC) project, the program has been implemented in a diverse set of schools across four school districts in the Midwestern United States, beginning with the 2012 preschool cohort (Reynolds, Hayakawa, et al., 2016a).

The MCPC project is uniquely positioned to examine instructional balance in three ways. First and most notably, the project used a newly developed teacher-report measure of child-initiated and teacher-directed practices that (a) enabled data collection from a large sample of classrooms, (b) defined child- vs. teacher-directed instruction as a single, continuous dimension representing the relative balance of approaches, and (c) asked teachers to report their actual practices rather than their philosophy or curriculum. This measure, called the Classroom Activity Report (CAR; see 11.), is brief and potentially practical for widespread use as a program monitoring tool. Second, data on instructional practices were collected from a sample of classrooms that shared many structural similarities: all were public school programs in the same school district, and all were currently implementing the CPC program. Examining variation in child-initiated instruction within an otherwise relatively homogeneous group of classrooms eliminates some of the confounds inherent in comparing programs that adopt qualitatively different philosophies, which may vary on many features other than their child- vs. teacher-directed focus. Finally, as an existing publicly funded program, CPC serves an ethnically diverse population of young learners in the MCPC project. This increases the generalizability of findings to current public programs.

The present study had two primary aims:

1. Examine both linear and curvilinear associations between the proportion of child-initiated instruction and students’ school readiness, aiming to pinpoint the optimal blend of child- and teacher-directed instruction in pre-K classrooms. We expected that this relation would follow an inverted U-shape, and an intermediate level of child-initiated instruction would be associated with the greatest school readiness.

2. Examine whether these associations vary based on child age, initial skill level, and gender. We anticipated that the optimal level of child-initiated instruction (the proportion of child-initiated instruction associated with the greatest school readiness) might differ based on these factors. Specifically, we predicted that the optimal proportion of child-initiated instruction would be lower for younger and initially less skilled children.
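
The inverted U-shape hypothesized in the first aim can be illustrated with a simple quadratic model, readiness = b0 + b1·p + b2·p², where p is the proportion of child-initiated instruction. When b2 is negative, the curve peaks at p = -b1 / (2·b2). The sketch below is only an illustration under stated assumptions: the data are synthetic, the function names are hypothetical, and the fit ignores covariates and the nesting of children within classrooms, so it should not be read as the model specification used in the study.

```python
import numpy as np


def fit_quadratic(p, readiness):
    """Least-squares fit of readiness = b0 + b1*p + b2*p**2 (illustrative only)."""
    # np.polyfit returns coefficients from the highest degree down: [b2, b1, b0]
    b2, b1, b0 = np.polyfit(p, readiness, deg=2)
    return b0, b1, b2


def optimal_proportion(b1, b2):
    """Return the p at which the fitted curve peaks, or None if there is no maximum."""
    if b2 >= 0:
        # A non-negative quadratic term means the curve is not an inverted U
        return None
    return -b1 / (2.0 * b2)


# Synthetic, noise-free data constructed to peak at p = 0.4 (not study data)
p = np.linspace(0.0, 1.0, 21)
readiness = 2.0 + 4.0 * p - 5.0 * p**2

b0, b1, b2 = fit_quadratic(p, readiness)
peak = optimal_proportion(b1, b2)
```

A peak that falls outside the observed range of p would indicate that the association is effectively monotonic within the data, in which case the vertex should not be interpreted as an optimal blend.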

2 Methods

Data for the present study come from the Midwest Expansion of the CPC program (MCPC), a longitudinal project that has followed a cohort of students enrolled in CPC and matched comparison programs across four school districts in the Midwestern United States from preschool to third grade. This study focuses on the preschool experience of students in the largest school district—an urban district whose CPC sites serve a largely low-income population—during the first year of the project (2012–2013). The study procedures were approved by the school district and the Institutional Review Board at the University of Minnesota.

2.1 Participants

The full sample of CPC students in the largest MCPC project district comprised 1724 children. Children were included in this sample if they were 3 or 4 years old at the start of the school year, enrolled in a CPC program site no later than January 1st, and spent at least 4 months of the school year in the program. Reflecting the target population of the CPC program and local neighborhood demographics, the sample consisted largely of racial-ethnic minority children from socioeconomically disadvantaged families (Table 1). The analytic sample for the present study comprised the subset of these CPC students (N = 1289) who had at least one valid score on the TS GOLD school readiness assessment at any point during the school year (fall, winter, or spring). The demographics of this sample differed somewhat from the full CPC sample. A higher proportion of children in the analytic sample were Black and a lower proportion were Hispanic than in the full sample; children in the analytic sample were also less likely to be receiving special education services, more likely to be eligible for free/reduced price lunch, more likely to be living in a single parent household, and less likely to be attending a school in which a high proportion of students met state reading norms (Table 1). The children in the sample were distributed across 101 CPC class sessions taught by 62 teachers, some of whom taught multiple part-day sessions. Child demographics were obtained from school administrative records and, for some variables, supplemented with a parent survey.
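
The inclusion rules described above can be expressed as a pair of filters. The following is a minimal sketch rather than the project's actual data-processing code: the record structure and field names are hypothetical, and the cutoff date assumes that "January 1st" refers to January 1 of the 2012–2013 school year.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence

# Assumption: "January 1st" is read as January 1 within the 2012-2013 school year
ENROLLMENT_CUTOFF = date(2013, 1, 1)


@dataclass
class ChildRecord:
    """Hypothetical record; field names are illustrative, not from the study."""
    age_at_start: int                          # age in years at the start of the school year
    enrolled_on: date                          # date of enrollment at a CPC site
    months_in_program: float                   # months of the school year spent in the program
    ts_gold_scores: Sequence[Optional[float]]  # fall, winter, spring; None if no valid score


def in_full_sample(child: ChildRecord) -> bool:
    """Age 3 or 4, enrolled by the cutoff, and at least 4 months in the program."""
    return (
        child.age_at_start in (3, 4)
        and child.enrolled_on <= ENROLLMENT_CUTOFF
        and child.months_in_program >= 4
    )


def in_analytic_sample(child: ChildRecord) -> bool:
    """Full-sample member with at least one valid TS GOLD score at any time point."""
    return in_full_sample(child) and any(s is not None for s in child.ts_gold_scores)
```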

Table 1 Sample demographics

2.2 Measures

2.2.1 School readiness

To measure children’s school readiness, we relied on scores from the Teaching Strategies (TS) GOLD Assessment System—a performance-based, observational assessment of learning and development across nine domains (including math, literacy, and socioemotional development) (Heroman et al., 2010). The TS GOLD has been cited as the most widely used observational school readiness measure in US early childhood education programs (Russo et al., 2019) and was used routinely in the participating school district to help teachers monitor, understand, and support individual students’ progress. Per district procedure, students were assessed by their teachers three times throughout the school year (fall, winter, and spring). The TS GOLD comprises 66 skills, with teachers asked to rate children’s proficiency on each skill using a scale from 0 to 9 (except for the science domain, which uses a 0 to 2 scale). Scores have shown evidence of reliability and validity in preschool samples, including among dual language learners and students with special education needs (Kim et al., 2013; Lambert et al., 2014, 2015).

The TS GOLD has at least two key advantages as a measure of school readiness. First, it is a performance assessment, with teachers observing children for 4 to 6 weeks prior to rating them. This enhances assessment validity; teachers can tap into a broad base of knowledge about children’s behavior across varied settings, as well as an understanding of their family context, which allows for more authentic assessment of the child’s progress in developing school readiness skills (Lambert et al., 2015). Second, TS GOLD items are aligned to district and state standards, thereby optimizing children’s opportunities to master the skills assessed. It is worth noting, however, that the tool also has potential limitations. Concerns have been raised about its ability to discriminate between individual school readiness skills (versus providing a global assessment) and to reveal variation in school readiness between children within the same classroom (Russo et al., 2019). The measure was used for the present study despite these potential drawbacks, based on its convergent validity with direct assessments of school readiness skills (Lambert et al., 2015; Russo et al., 2019) and its routine use in the school district (which allowed the measure to be used without placing unusual demands on teachers and students).

For the present study, six TS GOLD subscales were of interest: language, literacy, math, socioemotional development, cognitive skill, and physical skill. At the fall assessment, scores on these subscales were strongly inter-correlated (Table 2). For parsimony, therefore, we chose to conduct our primary analyses on the total spring assessment score (a sum of the six subscale scores) and to conduct follow-up analyses on individual subscales as a test of model robustness.

Table 2 Correlations among TS GOLD assessment domains in the fall of the school year, based on pairwise complete observations (sample size in parentheses)

2.2.2 Instructional practices

Instructional practices were assessed using the Classroom Activity Report (CAR)—a questionnaire developed for the MCPC project to assess use of class time (Reynolds, Richardson, et al., 2016b; 11.). In the first section of this form, teachers report how their class time during the prior week was allocated across instructional domains—including language, math, and socioemotional learning. In the second section, they report what percentage of instructional time in each of three core domains (language/literacy, math, and science) was dedicated to teacher-directed vs. child-initiated activities. Teachers are asked to report the actual time spent, not the time they planned to spend based on their curriculum. Tables 3 and 4 report descriptive statistics for these two sections of the CAR, respectively, in the present study sample.

Table 3 Average percentage of class time dedicated to individual instructional domains, according to teachers’ reports on the CAR (N = 56)
Table 4 Average percentage of class time dedicated to child-initiated activities, according to teachers’ reports on the CAR (N = 62)

In the first year of the MCPC project, CPC preschool teachers were asked to complete the CAR monthly throughout the school year. The data reported here are for the second half of the year, as the form was modified halfway through the year and the data could not be feasibly combined. One site/school inadvertently reported instructional practices using a prior version of the CAR that lacked the teacher-directed/child-initiated portion. This site was re-contacted in the fifth year of the MCPC project, and a staff member provided a retrospective estimate for the percentage of child-initiated time during the first year. Although these retrospective reports were likely less reliable than those completed concurrently, we determined that it was important to include them. Not only did the inclusion/exclusion of this site’s data affect statistical power, but the site was one of the most academically focused (and relied the most on teacher-directed practices); entirely dropping its data might have biased the findings by failing to represent the full range of instructional experiences in our sample. The principal investigator of the MCPC project also confirmed that the retrospective estimates were reasonable, based on reports from site staff and from research staff who observed and rated program fidelity during the early years of the project (Human Capital Research Collaborative, 2017). Because key findings were generally comparable whether or not data from this site were included, we report results from the full sample. In the few cases where excluding data from this site changed the pattern of results regarding child-initiated instruction, we have noted this in the text or in a table note.

The number of CARs completed by each teacher ranged from one to six. If a teacher failed to submit a CAR for an afternoon session but did submit one for their morning session, the afternoon was assigned the morning values. This was based on the observation that teachers’ reports for their morning and afternoon sessions virtually never differed. Finally, if a teacher’s child-initiated and teacher-directed percentages for a certain domain did not add exactly to 100%, the values were scaled to add to 100 while maintaining their ratio. In rare cases where the values could not be readily scaled to 100% and would have biased average estimates of child-initiated instruction (e.g., reports of 0% child-initiated and 0% teacher-directed time for the same domain), the data point was excluded from analysis. For each instructional domain (language/literacy, math, and science), the reported percentage of child-initiated time was averaged across time points to obtain an overall measure of instructional approach. These values were fairly robustly correlated across instructional domains, so we also chose to average across domains to obtain a single index of child-initiated activity (see Table 5).
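The rescaling and averaging steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the domain keys, and the example percentages are all hypothetical, and the only exclusion rule implemented is the 0%/0% case given as an example in the text.

```python
from statistics import mean

# Hypothetical keys for the three core domains on the CAR.
DOMAINS = ("language_literacy", "math", "science")


def rescale_pair(child_pct, teacher_pct):
    """Rescale a (child-initiated, teacher-directed) pair to sum to 100.

    The ratio between the two values is preserved. Returns None when the
    pair cannot be rescaled (e.g., 0% and 0%), so the caller can drop it.
    """
    total = child_pct + teacher_pct
    if total <= 0:
        return None
    return (100.0 * child_pct / total, 100.0 * teacher_pct / total)


def child_initiated_index(reports):
    """Average child-initiated time across forms, then across domains.

    `reports` is a list with one dict per submitted form, mapping a domain
    key to a (child_pct, teacher_pct) tuple.
    """
    domain_means = {}
    for domain in DOMAINS:
        values = []
        for report in reports:
            if domain not in report:
                continue
            scaled = rescale_pair(*report[domain])
            if scaled is not None:  # unscalable data points are excluded
                values.append(scaled[0])
        if values:
            domain_means[domain] = mean(values)
    overall = mean(list(domain_means.values())) if domain_means else None
    return domain_means, overall


# Made-up reports from one teacher (two forms), for illustration only.
example_reports = [
    {"language_literacy": (30, 60), "math": (40, 60), "science": (0, 0)},
    {"language_literacy": (50, 50), "math": (20, 80), "science": (45, 45)},
]
domain_means, overall = child_initiated_index(example_reports)
print(domain_means, overall)
```

In this made-up example, the first science report (0% and 0%) is dropped, so the science average rests on the second form alone.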

Table 5 Pearson’s correlations among reported child-initiated time across instructional domains (N = 62)

2.3 Analytic approach

2.3.1 Missing data

At the fall assessment, between 23 and 27% of students were missing scores on each TS GOLD subscale. In the spring, the proportion was slightly lower, with 17 to 23% of students missing scores on each subscale. These missing data were imputed using the Expectation-Maximization algorithm (Little & Rubin, 1987; Schafer & Olsen, 1998). This method yields valid estimates under the assumption that data are missing at random; based on examination of baseline demographic data, we judged that this assumption was satisfied. Input variables for the EM procedure included student race, gender, age, free/reduced lunch eligibility, parent demographics (including single parent status, employment status, and education), and all available TS GOLD assessment data.

2.3.2 Analysis

To assess the relation between instructional practices and children’s school readiness at the end of the year, we fitted a series of linear regression models using the “stats” package in R (R Core Team, 2013). Beginning with a model that regressed students’ spring TS GOLD scores on their fall TS GOLD scores, demographics, and classroom characteristics (see “Covariate specification” below), we then systematically added linear and quadratic main effects of child-initiated instructional time. To further assess whether the effect of child-initiated instruction differed based on child characteristics, we fitted models that included interactions between each characteristic of interest (baseline skill level, age at the start of the school year, and gender) and the linear and quadratic effects of child-initiated instructional time.
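The model sequence described above can be written compactly. The notation below is introduced here only for illustration and is an assumption, not notation from the original analysis: $Y_i$ is student $i$'s spring TS GOLD score, $F_i$ the fall score, $\mathbf{x}_i$ the vector of demographic and classroom covariates, $C_{c(i)}$ the average percentage of child-initiated time reported for student $i$'s classroom, and $M_i$ one child characteristic of interest (baseline skill, age, or gender). The last line shows the fullest moderation form, with both interaction terms included.

```latex
\begin{align*}
\text{Baseline:}   \quad Y_i &= \beta_0 + \beta_1 F_i + \boldsymbol{\gamma}^{\top}\mathbf{x}_i + \varepsilon_i \\
\text{Linear:}     \quad Y_i &= \beta_0 + \beta_1 F_i + \boldsymbol{\gamma}^{\top}\mathbf{x}_i
                                + \beta_2 C_{c(i)} + \varepsilon_i \\
\text{Quadratic:}  \quad Y_i &= \beta_0 + \beta_1 F_i + \boldsymbol{\gamma}^{\top}\mathbf{x}_i
                                + \beta_2 C_{c(i)} + \beta_3 C_{c(i)}^{2} + \varepsilon_i \\
\text{Moderation:} \quad Y_i &= \beta_0 + \beta_1 F_i + \boldsymbol{\gamma}^{\top}\mathbf{x}_i
                                + \beta_2 C_{c(i)} + \beta_3 C_{c(i)}^{2}
                                + \delta_1 \, C_{c(i)} M_i + \delta_2 \, C_{c(i)}^{2} M_i + \varepsilon_i
\end{align*}
```

Under this notation, a non-zero $\delta_2$ means that the curvature of the association between child-initiated time and spring scores differs across levels of $M_i$.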

Because students in our dataset were hierarchically nested within schools, teachers, and sessions (morning/afternoon), the independent error assumption of regression was violated, and an ordinary regression analysis was likely to yield mis-estimated standard errors. Intraclass correlations indicated that relatively little of the variance in spring TS GOLD scores (less than 10%) was due to differences between schools, whereas a large proportion (approximately 55%) was due to differences between teachers. This teacher-level variance was not accounted for by differences between class sessions, which explained only about 2% of the variance. We thus chose to estimate standard errors in our models using multi-way standard error clustering at the teacher level, to account for correlations among the scores of students taught and assessed by the same teacher. This method was implemented using the “multiwayvcov” package in R (Graham et al., 2016).
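To make the idea of clustering standard errors concrete, the sketch below computes a basic cluster-robust (sandwich) variance estimate for a regression with an intercept and a single predictor. It is a simplified Python illustration, not a reproduction of the R analysis described above: the function name and data are hypothetical, the study's models included many covariates, and the finite-sample corrections that statistical software often applies are omitted here.

```python
from collections import defaultdict
from math import sqrt


def ols_cluster_se(x, y, cluster):
    """OLS of y on [1, x] with cluster-robust standard errors (no correction).

    Returns ((intercept, slope), (se_intercept, se_slope)).
    """
    n = len(y)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))

    # Inverse of X'X = [[n, sx], [sx, sxx]] (the "bread" of the sandwich).
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]

    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy

    # Sum the score contributions X_i * residual_i within each cluster.
    scores = defaultdict(lambda: [0.0, 0.0])
    for xi, yi, g in zip(x, y, cluster):
        resid = yi - (b0 + b1 * xi)
        scores[g][0] += resid
        scores[g][1] += resid * xi

    # Sandwich: sum over clusters of (inv @ s)(inv @ s)^T.
    vcov = [[0.0, 0.0], [0.0, 0.0]]
    for s in scores.values():
        a = [inv[0][0] * s[0] + inv[0][1] * s[1],
             inv[1][0] * s[0] + inv[1][1] * s[1]]
        for i in range(2):
            for j in range(2):
                vcov[i][j] += a[i] * a[j]

    return (b0, b1), (sqrt(vcov[0][0]), sqrt(vcov[1][1]))


# Made-up data: four students nested in two hypothetical teachers.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 6.0]
teacher = ["a", "a", "b", "b"]
coefs, ses = ols_cluster_se(x, y, teacher)
print(coefs, ses)
```

Because residuals are summed within each cluster before being squared, students who share a teacher are allowed to have correlated errors, which is the property the clustering approach relies on.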

2.3.3 Covariate specification

To account for the fact that students’ initial skill level may affect teachers’ instructional approach (Kiuru et al., 2015; Lerkkanen et al., 2016; Nurmi & Kiuru, 2015), all of our regression models controlled for children’s fall TS GOLD scores. This helped ensure that estimated effects of child-initiated time were more likely due to the effect of instructional practices on skills growth than the effect of students’ initial skill level on instruction. In addition to fall test scores, all models controlled for the following child demographics: racial-ethnic background, gender, special education status, age (3 vs. 4 years old at the start of the school year), and free/reduced lunch eligibility. Demographics were dummy-coded, with the variable label denoting the category coded “1” (e.g., for the “Female” variable, females were “1” and males were the reference group). To account for the fact that repeated measurements are likely to provide a more reliable index of teachers’ practices, models also controlled for the number of CAR forms that each teacher submitted. Finally, all models controlled for two aspects of classroom structure that have been linked to student outcomes—full-day vs. half-day program length, and class size (the number of students in a class session who met inclusion criteria for the MCPC sample)—to help reduce residual variance (Finn & Achilles, 1990; Francis & Barnett, 2019; Lerkkanen et al., 2016; Reynolds et al., 2014).
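The dummy-coding convention described above (the variable label names the category coded “1”) can be illustrated with a short Python sketch. All field names, variable labels other than “Female”, and the example record are hypothetical; the racial-ethnic dummies are omitted because the categories are not listed in this section.

```python
def dummy(value, coded_one):
    """Return 1 if value matches the category named by the label, else 0."""
    return 1 if value == coded_one else 0


def covariate_row(student):
    """Build one row of covariates from a (hypothetical) student record."""
    return {
        "FallScore": student["fall_score"],
        "Female": dummy(student["gender"], "female"),       # males are the reference
        "Age4": dummy(student["age_years"], 4),             # 3-year-olds are the reference
        "SpecialEd": dummy(student["special_ed"], True),
        "FreeReducedLunch": dummy(student["lunch_eligible"], True),
        "FullDay": dummy(student["program"], "full_day"),   # half-day is the reference
        "ClassSize": student["class_size"],
        "NumCARs": student["num_cars"],                      # forms the teacher submitted
    }


# Made-up record, for illustration only.
example_student = {
    "fall_score": 250, "gender": "female", "age_years": 3,
    "special_ed": False, "lunch_eligible": True,
    "program": "half_day", "class_size": 17, "num_cars": 5,
}
row = covariate_row(example_student)
print(row)
```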

3 Results

3.1 Main effects of child-initiated instruction

To examine linear and quadratic main effects of child-initiated instruction on school readiness, we fitted a series of three regression models. The first model regressed students’ spring TS GOLD total score on their fall TS GOLD score, demographic variables, and classroom characteristics. The second and third models added linear and quadratic main effects, respectively, of child-initiated instructional time. The results of this model-building process are summarized in Table 6. The linear effect of child-initiated time (added in Model 2) did not explain significant unique variance in spring TS GOLD scores, suggesting that child-initiated instructional time was not linearly associated with spring school readiness after accounting for key student and classroom characteristics. There was evidence of a marginal quadratic association between child-initiated instructional time and school readiness (Table 6; Model 3) after accounting for covariates, with the positive coefficient indicating that an intermediate level of child-initiated instruction was associated with the lowest school readiness scores. However, this effect was only marginal and became non-significant when the subset of classes for which child-initiated time was reported retrospectively was excluded from analysis.
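The link between the sign of the quadratic coefficient and the shape of the association can be made explicit. For a fitted curve of the form b_lin × C + b_quad × C², the turning point lies at C = −b_lin / (2 × b_quad); it is a minimum when b_quad is positive and a maximum when it is negative. The Python sketch below illustrates this with made-up coefficients. These values are not estimates from Table 6, and the function names are hypothetical. The observed range of about 12 to 70% child-initiated time is taken from the text.

```python
# Approximate observed range of average child-initiated time (percent),
# as reported in the text.
OBSERVED_RANGE = (12.0, 70.0)


def turning_point(b_lin, b_quad):
    """Return (location, kind) of the turning point of b_lin*C + b_quad*C**2."""
    if b_quad == 0:
        raise ValueError("No turning point: the quadratic coefficient is zero.")
    location = -b_lin / (2.0 * b_quad)
    kind = "minimum" if b_quad > 0 else "maximum"
    return location, kind


def within_observed_range(location, observed=OBSERVED_RANGE):
    """Check whether the turning point falls inside the sampled range."""
    return observed[0] <= location <= observed[1]


# Hypothetical coefficients chosen only to illustrate a positive quadratic term.
location, kind = turning_point(b_lin=-0.6, b_quad=0.01)
print(location, kind, within_observed_range(location))
```

Checking whether the turning point falls inside the sampled range matters for interpretation: a turning point outside the range would mean the curve is effectively monotonic over the classrooms actually observed.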

Table 6 Main effects models predicting spring TS GOLD total score from child, classroom, and instructional variables, N = 1289

3.2 Moderation by child characteristics

Despite the lack of robust main effects of child-initiated time on school readiness, it was possible that these null effects masked varying patterns for different sub-groups of children. To examine this possibility, we fitted a series of six regression models that individually added interactions between the linear and quadratic effects of child-initiated time and three child characteristics that have been suggested to moderate the effect of child-initiated instruction. These included child gender as well as two indices of developmental level and baseline skills: age (3 vs. 4 years old at the start of the school year) and fall TS GOLD score. Key coefficients from these models are reported in Table 7. The interaction between baseline (fall) TS GOLD score and child-initiated instruction did not account for significant unique variance, and the same was true for child gender (Table 7; Models 4, 5, 8, and 9). However, the interaction between age and the quadratic effect of child-initiated time did account for significant unique variance (Table 7; Model 7). This indicated that child-initiated instruction was associated with school readiness in a curvilinear pattern, but the shape of this pattern differed for 3- and 4-year-olds. Fig. 1 illustrates the pattern of association predicted by this model for each age group. Three-year-old students showed a U-shaped pattern, with intermediate levels of child-initiated instruction (around 40% of instructional time) associated with the lowest school readiness scores. Four-year-old students, in contrast, showed gradually tapering growth in school readiness skills as child-initiated instructional time increased. Fig. 2 illustrates this effect in terms of standardized TS GOLD score units. It is important to note that the quadratic interaction effect, while statistically significant, was quite small (partial Cohen’s f² = .034; Cohen, 1988). Thus, although interesting, this effect should be interpreted with caution.
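The way an age × quadratic interaction produces differently shaped curves for the two age groups can be sketched as follows. With age dummy-coded (0 for 3-year-olds, 1 for 4-year-olds), each group's effective linear and quadratic coefficients are the main-effect coefficients plus the interaction coefficients multiplied by the dummy. The coefficient values below are entirely made up, chosen only to mimic the shapes described in the text; they are not the estimates from Table 7, and the function names are hypothetical.

```python
def effective_coefficients(b_lin, b_quad, d_lin, d_quad, age4):
    """Combine main effects with age interactions (age4 is 0 or 1)."""
    return b_lin + d_lin * age4, b_quad + d_quad * age4


def predicted_curve(lin, quad, grid):
    """Child-initiated contribution to the predicted score, covariates held constant."""
    return {ci: lin * ci + quad * ci ** 2 for ci in grid}


# Hypothetical coefficients (NOT study estimates).
HYPOTHETICAL = dict(b_lin=-0.8, b_quad=0.01, d_lin=1.5, d_quad=-0.015)

# Evaluate only over the approximate observed range (12 to 70 percent).
grid = range(12, 71)
curves = {}
for age4 in (0, 1):
    lin, quad = effective_coefficients(age4=age4, **HYPOTHETICAL)
    curves[age4] = predicted_curve(lin, quad, grid)

lowest_for_3yo = min(curves[0], key=curves[0].get)
highest_for_4yo = max(curves[1], key=curves[1].get)
print(lowest_for_3yo, highest_for_4yo)
```

With these illustrative values, the curve for 3-year-olds is U-shaped with its lowest point in the middle of the range, while the curve for 4-year-olds rises with diminishing increments and peaks at the upper edge of the range. Restricting the grid to the observed range avoids extrapolating beyond the classrooms sampled.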

Table 7 Moderation models predicting spring TS GOLD total score from child, classroom, and instructional variables, N = 1289
Fig. 1 Predicted school readiness (TS GOLD total score) for 3- and 4-year-old preschool students as a function of child-initiated instructional time. Student and classroom covariates are held constant

Fig. 2 Standardized mean differences in school readiness (TS GOLD total score) predicted from Model 7 (see Table 7) as a function of child-initiated instructional time and student age. Other student and classroom covariates are held constant

3.3 Robustness of findings

To assess the robustness of our findings, we conducted two sets of follow-up analyses. First, we repeated the analyses reported in the previous two sections for three individual subscales of the TS GOLD: literacy, math, and socioemotional development. Of the six domains that comprise the TS GOLD total score, these three sets of skills were of particular interest due to their stability and relevance for successful academic performance in later grades (Aunola et al., 2004; Blair & Razza, 2007; Cameron et al., 2019; Leppänen et al., 2004; Missall et al., 2007; Zins et al., 2007). Second, we repeated the primary analyses using the full CPC sample of 1724 students; TS GOLD scores were imputed for this sample using the same method as for the analytic subsample (Little & Rubin, 1987; Schafer & Olsen, 1998).

Full results of the robustness analyses are reported in the Online Supplementary Material (Tables S1–S8). The primary findings were consistent across the fully imputed and partially imputed samples, and generally stable across all three TS GOLD outcome domains. Consistent with the overall analysis, no domain showed a significant linear main effect of child-initiated instructional time on spring TS GOLD scores. For literacy and socioemotional scores, there was a quadratic main effect of child-initiated time (Tables S1 and S5; Model 3). However, in both cases this was qualified by a significant (for literacy) or marginal (for socioemotional development) interaction between child age and the quadratic effect of child-initiated time (Tables S2 and S6; Model 7). A significant interaction between age and the quadratic effect of child-initiated time was also found for math scores (Table S4; Model 7). Thus, the key finding (a curvilinear relation between child-initiated instructional time and school readiness that differed significantly based on child age) was consistently observed across all three individual outcome domains, although it was less robust for socioemotional learning than for math and literacy. In all cases, the magnitude of this effect remained small (partial Cohen’s f² ranging from .019 to .026; Cohen, 1988).
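For readers less familiar with the effect size reported here, Cohen’s f² for a term added to a regression model is commonly computed from the R² of the models with and without that term: f² = (R²_full − R²_reduced) / (1 − R²_full). Cohen (1988) suggested .02, .15, and .35 as rough benchmarks for small, medium, and large effects. The Python sketch below illustrates the calculation with made-up R² values; these are not taken from the study’s tables, and the function names are hypothetical.

```python
def cohens_f2(r2_full, r2_reduced):
    """Cohen's f-squared for the term(s) added between two nested models."""
    if not 0.0 <= r2_reduced <= r2_full < 1.0:
        raise ValueError("Expected 0 <= r2_reduced <= r2_full < 1.")
    return (r2_full - r2_reduced) / (1.0 - r2_full)


def label_effect(f2):
    """Rough label based on Cohen's (1988) conventional benchmarks."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "below small"


# Hypothetical R-squared values, for illustration only.
f2 = cohens_f2(r2_full=0.52, r2_reduced=0.50)
print(round(f2, 3), label_effect(f2))
```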

4 Discussion

The present study strove to examine whether the balance of time devoted to child-initiated vs. teacher-directed activities in a preschool classroom predicts school readiness in a student population experiencing high levels of sociodemographic risk. We hoped the results could help educators and policymakers evaluate whether instruction is likely to be an effective target for improving preschool program quality and reducing achievement gaps associated with sociodemographic risk. Although observed effect sizes were small, our analyses showed evidence of curvilinear associations between child-initiated instruction and school readiness that varied by child age. Four-year-olds generally seemed to be more “school ready” as the proportion of child-initiated instruction in their classroom increased. Three-year-olds, in contrast, tended to be the most “school ready” in classrooms where one approach (either child-initiated or teacher-directed) was clearly dominant, and less so in classrooms where instruction was more equally balanced between the two. Effects of child-initiated instruction did not appear to vary by children’s gender or baseline skill level.

4.1 Is more child-initiated always better?

For older preschoolers in this sample, our results suggest that the answer to this question is a tentative yes. The quadratic pattern linking child-initiated instructional time and school readiness among 4-year-olds showed tapering growth in school readiness as the average proportion of child-initiated instruction increased, suggesting that a higher ratio of child-initiated to teacher-directed instruction tended to be associated with greater school readiness. This pattern is consistent with theory and research suggesting that autonomy and self-directed exploration support early learning. According to this constructivist view, allowing young children to drive their own learning and discover concepts—rather than explicitly teaching them—reflects their natural propensity for exploration and is well-suited for their developmental level (National Association for the Education of Young Children, 2009). This type of open-ended, child-initiated approach has been argued to help maintain children’s curiosity and intrinsic motivation for learning (Bonawitz et al., 2011; Deci & Ryan, 2000; Stipek et al., 1995), and to promote deeper processing of new ideas (Alfieri et al., 2011; McDaniel & Schlager, 1990).

Although the link observed between child-initiated instruction and 4-year-olds’ school readiness seemed to taper as the proportion of child-initiated time increased, the significance of this is unclear. Because the range of child-initiated instruction in this sample only reached a maximum of 70%, it is impossible to know what the results would look like at higher proportions. Perhaps instruction that is about 60–70% child-initiated is optimal for 4-year-olds from this population, and school readiness would have dipped in classrooms that tended to spend over 70% of time on child-initiated activities. This would be consistent with evidence that both child- and teacher-directed elements can support young children’s learning, and a blend of child and teacher control over the learning process can be optimal (Fisher et al., 2010, 2013; Graue et al., 2004). It is also possible, however, that increasing the child-initiated focus in a preschool classroom past a certain point yields little added benefit for 4-year-olds’ learning. Future research with a wider range of instructional approaches, especially highly child-initiated ones, is needed to tease apart these possibilities.

In contrast, our results suggest that for younger children more child-initiated instruction might only be better after a certain point. Contrary to our hypotheses, 3-year-olds showed a U-shaped quadratic trend: they tended to be the least school ready after attending classrooms that blended on average about 40% child-initiated with 60% teacher-directed instruction, and more school ready after attending classrooms where one type of instruction was more clearly dominant. This pattern is a bit puzzling, given robust theoretical and empirical support for the effectiveness of constructivist, child-initiated instruction in early childhood (e.g., Bonawitz et al., 2011; Graue et al., 2004; Lerkkanen et al., 2016; Stipek et al., 1995). Interestingly, however, it is consistent with Marcon’s (1992, 1999) finding that preschoolers in classrooms that blended academically focused practices with child-centered ones showed the weakest outcomes on several indices of achievement and adjustment. Perhaps, as Marcon suggests, classrooms that blend instructional approaches are less likely to be grounded in a strong theoretical foundation that helps ensure consistency across instructional activities. If this were the case, it could have been especially important for the younger children in our sample because they were new to preschool and might have benefited from the predictability of a consistent approach during this transition. Future research could seek to evaluate this possibility by examining the effects of child-initiated instruction for children who are the same age but vary in school experience (e.g., 4-year-olds in their first year of preschool vs. those who have experienced a year of preschool).

4.2 Key take-aways

It is important to note that the observed child-initiated instruction × age interaction effect was small in magnitude, both in predicting overall school readiness (TS GOLD total score) and in predicting individual domains of school readiness. We thus urge readers to interpret the results with a healthy dose of caution until a similar pattern is replicated. However, we also believe that the present study meaningfully contributes to the literature in several ways.

First, we are not aware of any other studies that have explored curvilinear associations between the proportion of child-initiated preschool instruction and student outcomes. We believe this is a promising approach that has the potential to help the field further explore whether increasing children’s control over the learning context is always beneficial, and to assess whether there is an optimal balance of instruction for supporting school readiness (Lerkkanen et al., 2012). Although our ability to assess this optimal balance in our sample was limited by the observed range of instructional approaches (about 12 to 70% child-initiated, on average), it is interesting that we did not see linear associations but did find evidence of curvilinear ones. This issue deserves further exploration—perhaps with more instructionally diverse classrooms.

Second, our findings help to highlight the importance of individual differences and examining how instructional methods work for different children. Again, given the small effect size, the observed age × instruction interaction should be interpreted with caution. However, as our work joins an existing body of research suggesting that not all children are equally affected by the same instructional practices (e.g., Kiuru et al., 2012; Lerkkanen et al., 2016), it helps make the case that investigating these effects (when they are theoretically justified) can be a fruitful effort. A nuanced understanding of which practices tend to work well for which children has the potential to support ongoing program improvement and efforts to tailor instruction.

Third, our findings overall seem to bolster the body of research suggesting that child-initiated instruction can be beneficial for preschoolers’ learning and preparation to start school (e.g., Clements et al., 2004; Graue et al., 2004; Reynolds, 2000). Although a moderate frequency of child-initiated activities seemed to be sub-optimal for 3-year-olds in our sample, a relatively high proportion of time spent on child-initiated activities (around 60–70%) tended to be associated with relatively good school readiness outcomes for all ages. This suggests to us that preschool programs serving high-risk 3- and 4-year-olds might best support students’ school readiness by encouraging teachers to spend a substantial portion of their instructional time on child-initiated activities. We qualify this statement, however, with two caveats. First, our sample did not include classrooms that spent more than 70% of instructional time, on average, engaged in child-initiated activities, and thus cannot speak to results beyond this range. Second, as discussed further in the limitations below, this correlational study cannot support robust causal inference about the effects of instruction on school readiness.

Finally, although further research is needed to help validate the Classroom Activity Report (CAR), we believe the current study highlights the benefits of this type of measure. The CAR is a concurrent assessment throughout the year, from the perspective of the classroom teacher. This embedded, authentic approach to measurement fills an important gap in the field, given the limits of observational assessments and survey reports. A number of prior studies have relied on teachers’ retrospective reports (Graue et al., 2004) or self-reported instructional beliefs (Marcon, 1992, 1999), which may not fully reflect day-to-day practices. The CAR is also designed as a formative tool for use in ongoing professional development and program improvement. Grade-level and cross-grade-level teacher meetings can draw on the frequent reports of instructional practices provided by this measure to better tailor instruction. This process is also scalable in facilitating professional learning across schools and districts. Observations and other types of reports do not have these advantages, as they can take significant time and effort to complete; for instance, the ECCOM (the observational tool most comparable to the CAR in content) takes several hours (Lerkkanen et al., 2012; Stipek & Byler, 2004).

4.3 Limitations and future directions

The present study also has several limitations. First, although the study strove to account for key confounds by statistically controlling for children’s baseline TS GOLD performance and other aspects of the classroom environment that might influence learning, it is possible that the observed effects of child-initiated instructional time on school readiness were driven by variables other than child-initiated instruction. Classrooms with different levels of child-initiated instruction might have also differed in other characteristics—such as their social and emotional climate (Hyson, 1991; Stipek et al., 1995) and fidelity of CPC program implementation—that could have contributed to the observed associations with child outcomes. Furthermore, the relations between child-initiated instruction and school readiness observed in the present study are unlikely to be purely unidirectional, given that students’ behavior and performance may also lead teachers to adjust their instructional practices (Kiuru et al., 2015; Nurmi & Kiuru, 2015).

Second, it is always possible that the results may be specific to the population of schools and students sampled in the present study. The ethnically diverse sample with high rates of sociodemographic risk factors is representative of the population attending many publicly funded preschool programs in the US—and, as such, provides valuable information about what might work for this population. However, the data come from a single school district and it is possible that the findings may not generalize to other school settings and populations. The classrooms represented in the sample also had a somewhat limited range of instructional approaches (average percent of child-initiated time ranged from about 12 to 70), with relatively few classrooms falling at the far ends of this range. Thus, as discussed above, it is unclear what the findings would look like in classrooms adopting more extreme instructional methods (especially highly child-initiated ones). Further research is needed to examine whether the observed patterns would hold for a different student population and a broader range of instructional approaches.

The short timescale of this study places additional limits on generalizability. Although a temporally proximal outcome—school readiness at the end of the current year—is a logical starting point for examining the effects of preschool instruction, the question of greater practical importance is whether the benefits of a particular instructional approach last beyond the end of that academic year. Especially in light of widespread concerns about the fade-out of preschool program effects (Lipsey et al., 2018), it will be important for future work to explore whether the instructional effects observed in the present study last beyond preschool.

Finally, although the CAR has some distinct advantages as an assessment tool, its practicality and scalability come with trade-offs. In the version of the questionnaire used in the first year of the MCPC project, definitions of “child-initiated” and “teacher-directed” instruction were left open-ended. All teachers had received training on child-initiated instruction as part of the CPC program, but it is still possible that their beliefs about the terms differed slightly and did not capture an identical construct across classrooms. Furthermore, implicitly requiring teachers to classify all activities as either teacher-directed or child-initiated may have obscured important nuances. As noted at the outset, child- and teacher-directed elements can be combined within the same activity by providing scaffolding (teacher-direction) within otherwise open-ended (child-initiated) learning contexts (Fisher et al., 2010, 2013; Weisberg et al., 2013). It would be interesting to know how teachers in the present study classified such activities—which were not clearly child- or teacher-directed but did involve children and teachers sharing control. To increase confidence in our findings and better understand their implications, therefore, future studies should seek to corroborate them using a different, complementary assessment of instructional practices (e.g., an observational measure like the ECCOM; Stipek & Byler, 2004).

5 Conclusion

The present study examined whether balancing child-initiated and teacher-directed practices in preschool classrooms might support optimal school readiness among students experiencing high levels of sociodemographic risk, and whether this pattern might differ based on child characteristics. Our findings tentatively suggest that spending over 60% of instructional time, on average, in child-initiated activities might help support improvements in school readiness for 3- and 4-year-old preschoolers. However, they also draw attention to (1) the possibility that more child-initiated instruction is not always linearly associated with better child outcomes and (2) the fact that instructional approaches might not be uniformly effective for all children (e.g., those at different developmental levels). Based on the small effect sizes, child-initiated learning may not be the most powerful way to increase the effectiveness of preschool programs in preparing children who are experiencing sociodemographic risk to begin school. However, instruction might be a more financially feasible target of intervention than other preschool program characteristics that have been linked to student outcomes (e.g., class size and full-day programming; Francis & Barnett, 2019; Reynolds et al., 2014). As such, we believe the non-linear sub-group effects we observed merit further investigation, as does the use of the CAR, a brief, practical measure that can facilitate the integration of research on instructional practices with program improvement and professional development efforts.