Introduction

Context-based learning approaches are developed and introduced to address several challenges in secondary chemistry education (Childs et al., 2015; Sevian et al., 2018). It has been suggested that students perceive the chemistry curriculum as fragmented, overloaded and as irrelevant for their personal lives (Gilbert, 2006). Moreover, transfer of chemistry concepts to new situations turns out to be difficult (Pilot & Bulte, 2006). In general, a context-based approach starts with a relevant real-world question, stimulating students to use chemistry concepts to explore, experience and evaluate a problem extracted from real life (Taconis et al., 2016). Furthermore, educational research revealed that context-based approaches, in general, have positive effects on both students’ motivation, and their perception of relevance of chemistry for their personal lives (Bennett, 2017). Effects found on students’ achievements are ambiguous, with generally similar or slightly better results than more traditional approaches (Savelsbergh et al., 2016). However, students might experience context-based approaches as rather complex and overwhelming (King & Ritchie, 2012; Quintana et al., 2004). Although real-world questions are usually attractive and motivating for students, their complexity and open-endedness might evoke frustration and confusion. Moreover, the real-world question might conceal the underlying concepts (Parchmann et al., 2006), causing uncertainty among students about what to learn for their end-of-term exams. In addition, answering a complex and ill-structured real-world question sets high demands on self-regulation processes, such as monitoring progress, and adjusting learning strategies to optimize performance (Dori & Avargil, 2015; Panadero, 2017). 
Unfortunately, students do not always monitor and regulate these processes when learning with context-based approaches, which limits the potential and effectiveness of these approaches as an educational method (Azevedo et al., 2012).

Therefore, implementing a scaffold to guide students through their learning process and to monitor their conceptual development might enhance students’ learning. Scaffolding makes complex tasks more manageable and accessible for students (Hmelo-Silver et al., 2007). An interesting approach would be to implement a project management framework in a context-based learning environment and explore whether students then experience a context-based approach as less complex. Project management frameworks are used in business to guide employees working on complex projects. In addition, a project management framework includes features to monitor the project’s progress and to adjust to changing circumstances. It seems reasonable to expect that the implementation of a project management framework might be beneficial for students working on a rather complex real-world question. For that reason, we decided to implement Scrum methodology as a framework to scaffold students’ learning in context-based learning environments. Scrum methodology provides ceremonies, roles and artefacts that invite students to reflect on their learning progress (Vogelzang et al., 2020b) and aims to scaffold students’ self-regulation processes (Pope-Ruark, 2012). As yet, Scrum methodology is seldom used in educational settings, although there are some exceptions (Cook, 2017; Parsons & MacCallum, 2019; Pope-Ruark, 2012). Therefore, insights into the effects of the implementation of Scrum methodology on students’ learning are still lacking. The current study aims to contribute to insights into the effects of the use of Scrum methodology on students’ learning in rather complex learning environments such as context-based chemistry education.

Context-based learning

In general, context-based approaches aim at various student outcomes, including cognitive, metacognitive and affective ones (Bennett, 2017; Dori & Avargil, 2015). First, context-based approaches intend to promote students’ cognitive development and increase their achievement by connecting a recognizable real-life issue to chemistry concepts. Instead of having knowledge transmitted to them by a teacher, students are invited to relate concepts (De Putter-Smits et al., 2013) and to construct meaningful knowledge structures themselves (Krajcik & Shin, 2014). These knowledge structures consist of interconnected concepts rather than isolated facts. Students construct and use these knowledge structures to interpret, analyze, and answer the real-world question (Hmelo-Silver, 2004). There is widespread consensus in the field of learning sciences that such student-centered approaches contribute to deep learning and sound understanding (Sawyer, 2014). Secondly, context-based approaches require students to use and develop their metacognitive skills. Answering an open real-world question demands skilled use of self-regulation processes, including planning, monitoring, reflection, and control of learning strategies (Azevedo et al., 2012; Dori & Avargil, 2015). Self-regulation of a learning process involves analysing the learning context, setting learning goals, determining learning strategies, reflecting on whether the strategies are effective, and monitoring learning progress (Azevedo et al., 2012). Thirdly, context-based approaches intend to improve students’ attitude towards chemistry. This affective aim focuses on how students feel about the chemistry they do (Bennett, 2017). This is of utmost importance given the fact that in many countries students’ attitude towards chemistry is generally negative (Potvin & Hasni, 2014; Savelsbergh et al., 2016). 
By creating a learning environment in which students are rather autonomous in how they arrange their work, the hope is that this might engage students and influence their feelings of ownership (Katz & Assor, 2007). In addition, the context-based approach might contribute to a more realistic picture of chemistry by showing how it is connected to their personal lives.

The nature of context-based approaches matches constructivist learning principles. Therefore, context-based approaches share many similarities with student-centered learning environments, such as problem-based learning environments, in which students: (1) work collaboratively on real-world issues; (2) take ownership of their learning; (3) reflect explicitly on their learning process and (4) use higher order thinking skills (Land et al., 2012). However, educational research revealed that students of all ages often experience difficulties in such complex learning environments and that students should be scaffolded to engage successfully in student-centered learning environments (Azevedo et al., 2012). In a special issue on context-based approaches, Sevian et al. (2018, p. 4) emphasized that there is a need for studies with a focus on how students’ learning can be scaffolded. Moreover, they recommend investigating the conditions that make learning in context-based classrooms advantageous. This study focuses on Scrum methodology as a scaffold for students’ learning in context-based learning environments.

Scrum methodology

Scrum methodology is a project management framework, frequently used in business and industry to manage and monitor complex projects (Schwaber & Sutherland, 2012). It consists of ceremonies, roles and artefacts that provide overview, induce feedback and dialogue systematically, and keep employees on track to achieve the ultimate objectives of the project. According to the developers there are three main principles that underlie Scrum methodology: transparency, inspection and adaptation (Schwaber & Sutherland, 2020). Scrum methodology aims to contribute to a social environment in which all participants take their responsibility and feel accountable for achieving the formulated objective. Moreover, they are actively involved in all phases of the entire project (transparency). Furthermore, participating employees are encouraged to provide feedback on the quality of intermediate products (inspection). In addition, they are invited to propose improvements to enhance the quality of both product and workflow during the project (adaptation). The three Scrum pillars are promoted by ceremonies, including the stand-up in which employees discuss perceived problems as well as how they will contribute to reaching the common objective. In addition, all participating employees have a specific role in the team, including the role of a scrum master, who serves as a linking pin between Scrum team and customer. Furthermore, a scrum board, which can be seen as a graphical organizer, is used as an artefact that reveals what employees have to do, what they are working on, and what has been done. In addition, employees are asked frequently to deliver an intermediate product. During these review ceremonies, the quality of the intermediate product is discussed with both customer and other employees. In this phase, employees and customer propose improvements. 
The review ceremony is followed by the retrospective ceremony in which the participants discuss the quality of their workflow in order to improve it. Thus, Scrum methodology structures a social environment with clear procedures, roles and artefacts. It systematically creates moments of reflection and feedback and visualizes potential challenges and problems at an early stage.

Over time, Scrum methodology was adjusted for educational purposes (Cook, 2017; Parsons & MacCallum, 2019). Cook (2017) introduced Scrum methodology in a project-based learning unit on climate change and found that her students showed initiative, were self-directed and collaborated closely with their team mates. Furthermore, Parsons and MacCallum (2019) edited a book with experiences of the implementation of Scrum methodology in a variety of fields, including software development courses, mathematics and chemistry. In general, the experiences suggest that the use of Scrum methodology might increase metacognitive aspects of students’ learning, such as students’ planning skills and mutual collaboration. In addition, the experiences seem to suggest that cognitive learning outcomes as well as affective aspects of students’ learning, such as attitude towards subject, are influenced positively. However, empirical research into the effects of the implementation of Scrum methodology on cognitive, metacognitive and affective aspects of students’ learning is lacking.

A concise overview of Scrum methodology including a short description in thirteen steps of how this approach can be adapted in a context-based learning environment, is presented in Table 1.

Table 1 Concise overview of the ceremonies, roles and artefacts of Scrum methodology, including how they are applied in a context-based learning environment

As can be seen in Table 1, Scrum methodology aims at stimulating a classroom climate in which students work together, deploy and develop their personal qualities. In addition, their self-regulatory skills, their self-efficacy and their self-reflection, might be scaffolded by the different ceremonies, roles and artefacts of Scrum methodology. Moreover, its features have been connected to motivational scaffolds, including promoting autonomy, establishing task value and promoting emotion regulation, suggesting that Scrum methodology might contribute to students’ learning and motivation (Vogelzang et al., 2019). Therefore, we might hypothesize that Scrum methodology provides a framework that scaffolds students’ ownership, stimulates discussion and reflection, and contributes to a learning environment in which students think more deeply about chemistry concepts, are more engaged with the subject of teaching and regulate their learning process more, compared to regular teaching.

From a theoretical point of view it is clear that Scrum methodology scaffolds at least three challenges students face during their learning process: (1) sense making; (2) process management and (3) reflection (Quintana et al., 1999; Reiser, 2004). Sense making refers to the formulation of hypotheses and the interpretation of data to explain a scientific phenomenon. Process management skills are necessary to plan, to conduct experiments and to complete exercises. Reflection refers to monitoring and evaluating learning progress as well as refining plans to proceed to the next step. Sense making is scaffolded during the sprints, in which students work collaboratively on exercises and assignments to formulate hypotheses, to gather data and to interpret these data. Process management is scaffolded by the stand-up ceremony and the scrum board. Such artefacts and ceremonies are expected to help students to plan, direct and monitor their learning process. Finally, the review and retrospective ceremonies aim to induce reflection both on the learning process and on understanding of subject matter knowledge.

Purpose of the study and research questions

This study can be seen as a response to a call to investigate conditions and circumstances that might enhance students’ learning in context-based approaches (Sevian et al., 2018). It explores whether Scrum methodology might be an appropriate scaffold to direct students’ learning in context-based chemistry education. In the current study, effects of Scrum methodology were examined on students’ learning outcomes and their self-reported affective and metacognitive skills.

Two research questions guided this study:

  1. What is the effect of the use of Scrum methodology on students’ learning outcomes in terms of understanding of relevant concepts? (RQ1)

  2. What is the effect of the use of Scrum methodology on students’ perceptions of affective and metacognitive aspects of their learning? (RQ2)

Method

Design

A pre-test post-test quasi-experimental control group design was used to explore the impact of the implementation of Scrum methodology on both students’ learning outcomes and their perceptions of affective and metacognitive aspects of their learning. As educational context a module on Green Chemistry was used (Jansen-Ligthelm et al., 2010). The experimental group included students whose teachers applied Scrum in their classes when teaching the Green Chemistry module, whereas teachers of the comparison group used their regular teaching styles.

Participants

Ten teachers of secondary schools all over the Netherlands implemented the Green Chemistry module in their classrooms as part of regular chemistry classes. Six of these teachers, from three schools with twelve different classes, used Scrum methodology as scaffold for students’ learning. Their students formed the experimental group. The other four teachers, from three schools and eight different classes, did not use Scrum methodology and their students formed the comparison group (Table 2). All teachers graduated from university with a degree in chemistry as well as chemistry education. Nine teachers had taught chemistry for more than 15 years, whereas one teacher, participating in the experimental group, had 5 years of teaching experience.

Table 2 Number of classes, teachers and schools as well as number of participating students

In total, 320 students (54%, grade 11) answered the questions of a pre-test and post-test on Green Chemistry. In addition, due to time constraints, only 256 students (43%) completed both the pre-test and post-test about their affective and metacognitive perceptions (Table 2).

Materials

A context-based approach was implemented in a module on Green Chemistry in secondary chemistry classrooms (Jansen-Ligthelm et al., 2010). The central theme of this module is the need for new sustainable routes to produce innovative chemicals that do not harm the environment, are less hazardous and use renewable resources (Sjöström et al., 2015). Students become aware that Green Chemistry can be seen as the response of the chemical field to issues concerning our planet (Lozano & Watson, 2013) that impact people’s lives in general and their personal lives in particular. During the module the twelve principles of Green Chemistry (Anastas & Warner, 1998) are presented (Table 3). For instance, students calculate the atom efficiency and reaction enthalpy of chemical reactions (principles 2 and 6, respectively) and discuss the effect of a catalyst as well as the toxicity of reagents (principles 9 and 10, respectively).

Table 3 Green Chemistry principles

The 12 principles of Green Chemistry converge during the final assignment of this module. Collaborative teams are invited to compare two different routes for the synthesis of adipic acid, an important precursor to produce nylon-6,6-polyamide. The facility to produce adipic acid is located in their neighbourhood. Every team produces a written advice in which they substantiate which route is preferable considering the principles of Green Chemistry. Thus, students become aware that the synthesis of chemicals might affect their living environment as well as their personal lives (see also Vogelzang et al. (2020a)). The written advice task had the form of a report and was (virtually) addressed to local authorities to provide them with relevant arguments to take a final decision on the greenest synthesis of adipic acid. It can be characterised as a one-off task inspired by the work of Kolodner et al. (2003), who used similar tasks to stimulate students’ learning.

Clearly, the structure of this context-based module fits within the theory of situated learning (Mandl & Kopp, 2005) and overlaps with characteristics of problem-based learning (Hmelo-Silver, 2004). The central question is embedded in a real-life context, and the assignments, tasks and labs are tailored to the subsequent stages of students’ learning process. Students work collaboratively on exercises that are gradually made more complex. Moreover, students are stimulated to plan and monitor their progress themselves. In addition, they are encouraged to critique and revise their work and invited to apply the chemistry concepts involved to other situations.

Measures

Students’ learning outcomes

Students’ learning outcomes were measured with the Green Chemistry concept test (GCCT, see Appendix 1). The GCCT was based on the concepts covered in the module on Green Chemistry (Jansen-Ligthelm et al., 2010) and consisted of twelve open questions, with scores varying from 1 to 3 points (maximum score = 23 points). The questions assessed students’ understanding of the principles of Green Chemistry (e.g., E-factor, yield, atom efficiency), as well as calculation of reaction energy using heats of formation. Content and face validity of the GCCT were checked by two secondary chemistry teachers, who were familiar with the module. The GCCT was piloted with 25 students, not participating in the present study. Answers from both the pilot students and 25 students participating in the current study were scored independently by two teachers, resulting in an interrater reliability of Cohen’s κ = 0.98. Overall scores on the pre-test as well as the post-test were converted into a percentage, ranging from 0 to 100%.
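
As an illustration of the two computations mentioned above, the following minimal Python sketch converts a raw GCCT score into a percentage and computes Cohen’s κ for two raters. The function names and the example ratings are hypothetical and are not taken from the study data; only the maximum score of 23 points comes from the text.

```python
from collections import Counter

GCCT_MAX_POINTS = 23  # maximum GCCT score reported in the text


def to_percentage(points, max_points=GCCT_MAX_POINTS):
    """Convert a raw test score into a percentage between 0 and 100."""
    return 100.0 * points / max_points


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same answers."""
    n = len(rater_a)
    # Proportion of answers on which both raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's score frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[s] * freq_b[s] for s in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical scores (0 to 3 points) given by two raters to eight answers.
scores_a = [0, 1, 2, 3, 2, 1, 0, 3]
scores_b = [0, 1, 2, 3, 2, 1, 0, 2]
kappa = cohens_kappa(scores_a, scores_b)
```

In this made-up example the raters disagree on one of eight answers; κ corrects the raw agreement for the agreement that would be expected by chance.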

Examples of questions:

  • A manufacturer wants to produce a specific chemical. It turns out that there are two different synthesis routes available. Method 1 has an atom efficiency of 50%, whereas method 2 has an atom efficiency of 75%. Explain which method is preferable (2 points).

  • Methyl-tert-butylether (MTBE, C5H12O) is added to petrol to increase its anti-knock rating. MTBE is synthesized from methylpropene (C4H8) and methanol (CH3OH). Calculate the E-factor. Assume that the yield of the reaction is 100% (3 points).
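
The kind of calculation asked for in these questions can be sketched as follows. This is a minimal Python illustration, not the scoring model used in the study. It assumes that the E-factor is defined as mass of waste per mass of product, that all input mass not ending up in the product counts as waste (solvents and auxiliaries are ignored), and that the reactants combine in a 1:1 molar ratio. The function names are hypothetical.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def molar_mass(formula):
    """Molar mass of a compound given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def atom_efficiency(product, reactants):
    """Mass of desired product as a percentage of the total reactant mass."""
    return 100.0 * molar_mass(product) / sum(molar_mass(r) for r in reactants)


def e_factor(product, reactants, yield_fraction=1.0):
    """Mass of waste per mass of product (assumption: no solvents or auxiliaries)."""
    mass_in = sum(molar_mass(r) for r in reactants)
    mass_product = yield_fraction * molar_mass(product)
    return (mass_in - mass_product) / mass_product


# C4H8 + CH3OH -> C5H12O (an addition reaction without by-products)
methylpropene = {"C": 4, "H": 8}
methanol = {"C": 1, "H": 4, "O": 1}
mtbe = {"C": 5, "H": 12, "O": 1}

efficiency = atom_efficiency(mtbe, [methylpropene, methanol])
waste_ratio = e_factor(mtbe, [methylpropene, methanol], yield_fraction=1.0)
```

Because all reactant atoms end up in MTBE, these assumptions give an atom efficiency of 100% and an E-factor of 0 at 100% yield; a lower yield raises the E-factor.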

The quality of the written advices was not used as an outcome measure because we focused on the learning outcomes of individual students. In another study, we analysed the quality of the advices with the SOLO-taxonomy. It turned out that groups in the experimental condition delivered more sophisticated advices than groups in the control condition (Vogelzang et al., 2020a).

Affective and metacognitive aspects of students’ learning

Students completed a Likert-type Questionnaire with items on their perceptions of Affective and Metacognitive dimensions of their learning (QAM). The questionnaire was developed for this study. Several sources were used to develop appropriate items. QAM consisted of items on students’ attitude about chemistry, inspired by Bennett (2017), other items focused on students’ belief of their capabilities (Ajzen, 2002) to understand chemistry, and yet other items concentrated on whether the students perceived that they developed their personal qualities (Perry et al., 2019). Items concerning metacognitive dimensions intended to measure how students experienced their mutual collaboration (Schraw et al., 2006), and their self-regulation (Panadero, 2017). In addition, QAM consists of items about how students perceived the learning environment in general (Ambrose et al., 2010) and the guidance provided by the teacher in particular (Joseph, 2009). Face validity was checked by two independent teachers, who suggested a few textual improvements. Then QAM was piloted with one class (26 students), not participating in this study, to check if students perceived any problem with regard to language clarity.

A Principal Component Analysis (PCA) with Varimax rotation was performed in SPSS on 34 items of the questionnaire, using the entire dataset. Three additional questions, focusing on teachers’ behaviour, were excluded before executing the PCA, and formed an independent scale. The Kaiser–Meyer–Olkin tests revealed KMO-values of 0.929 and 0.941 for the pre-test and post-test, respectively. Six components had eigenvalues above Kaiser’s criterion of 1.0, explaining a total variance of 61.9% (pre-test) and 64.7% (post-test). These components were selected and fitted in two separate clusters, which could be characterized as an affective dimension and a metacognitive dimension of students’ learning, respectively (Bennett, 2017). All 34 items were included in the six components.

Based on the item factor loadings the individual components were labelled as (1) self-efficacy (pre-test: 31.73%; post-test: 35.86%), (2) self-regulation (pre-test: 10.91%; post-test: 11.03%), (3) classroom climate (pre-test: 6.70%; post-test: 6.28%), (4) personal development (pre-test: 5.42%; post-test: 4.80%), (5) attitude towards chemistry (pre-test: 3.64%; post-test: 3.41%), (6) collaboration (pre-test: 3.48%; post-test: 3.33%). The six components found during the Principal Component Analysis are presented in Table 4, including an example item and the reliability in terms of Cronbach’s α. All analyses were performed in IBM SPSS Statistics version 25.
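
For readers unfamiliar with the reliability coefficient reported per scale, the following minimal Python sketch shows how Cronbach’s α can be computed from item scores. The analyses in this study were run in SPSS; the function name and the example responses below are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    `items` holds one list of scores per item; position i in every
    list belongs to the same respondent.
    """
    k = len(items)
    # Total scale score of each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Hypothetical 5-point Likert responses of five students to three items.
scale_items = [
    [4, 3, 5, 2, 4],
    [4, 2, 5, 3, 4],
    [5, 3, 4, 2, 3],
]
alpha = cronbach_alpha(scale_items)
```

Items that rise and fall together across respondents yield a value close to 1; unrelated items yield a value close to 0.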

Table 4 Scales, their definition and examples of Likert-type questions (score = 1: completely disagree; score = 5: completely agree)

Procedures

Participating teachers and students gave their written consent to use their responses for research purposes. Students were informed that their responses were anonymised and therefore had no influence on their grades. In addition, they were told that their participation was voluntary and that they could opt out at any stage of the study. Research clearance was obtained from the ICLON Research Ethics Committee (IREC) of Leiden University. The teachers of the experimental group followed a professional development program to get familiar with all the Scrum ceremonies, roles and artefacts (Table 1). They attended five sessions of 4 h on Scrum methodology in which: (1) the principles underlying Scrum methodology were explained; (2) they practised with its ceremonies, roles and artefacts; (3) they shared both their positive and negative experiences with Scrum methodology and (4) they provided each other with feedback.

The context-based module Green Chemistry was implemented both in Scrum classes and in classes in which teachers used their regular teaching style. Differences and similarities with regard to the different learning phases are presented in Table 5, inspired by and adapted from the learning phases proposed by Zimmerman (2002) and Panadero (2017).

Table 5 Differences and similarities between the experimental and comparison group

Clearly, the main difference between the experimental and comparison group is found in the systematic attention to executive regulation functions, including planning and monitoring of work. Furthermore, within the experimental group a continuous alternation of performing and evaluating is found, whereas the performing phase and evaluating phase are separated in the comparison group. Before and after the module, a pre-test and a post-test were administered on cognitive (GCCT) and metacognitive (QAM) aspects of students’ learning. An overview of the different stages is presented in Table 6.

Table 6 An overview of the different stages

Implementation of the Green Chemistry module in Scrum classes

At the start of the module, which lasted six weeks with two or three lessons of 50 or 60 min a week, the teacher introduced the central theme of the module as well as its ultimate goal. In addition, students were asked to perform two tests: (1) a pre-test with questions on Green Chemistry (GCCT, Green Chemistry Concept Test) and (2) a questionnaire on their perceptions of metacognitive dimensions of their learning (QAM, Questionnaire on Affective and Metacognitive dimensions of students’ learning) (see Table 6). Thereafter, students entered a comprehensive planning phase, starting with a ceremony in which they formed groups based on their personal qualities (see Table 1). Secondly, the teacher, or in Scrum terms, the product-owner, provided every group with a product backlog in which the learning goals, tasks, and experiments were formulated. However, before the students started to work on the assignment, the teacher asked every group to describe their own definition of fun and their own definition of done. It seems reasonable to expect that formulating your own definition of fun and your own definition of done contributes to a learning environment in which students feel both comfortable and responsible. Subsequently, students discussed the relative weight of all assignments. A small and easy assignment was awarded with 1 point, whereas a difficult task was awarded with 5 points. Clearly, this ceremony helped students to plan all tasks and assignments over the time available. They visualised their planning on a Scrum board, which basically consists of three columns (to do, doing, and done) (Fig. 1). Hence, they wrote their tasks on Post-Its, and redirected them to the right column when they accomplished a task. For example, task 1 in a specific group was “what is a definition of sustainability?”, task 2 was “a lab experiment” and task 3 was “calculate the atom-efficiency of the reaction presented in exercise 2”.

Fig. 1 Simplified example of a Scrum board
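
The three-column structure of the Scrum board described above can also be expressed as a small data structure. The Python sketch below is purely illustrative: the class and method names are hypothetical, and the point values attached to the three example tasks are assumptions, since the text only states that tasks were weighted from 1 (small and easy) to 5 (difficult).

```python
from dataclasses import dataclass, field

COLUMNS = ("to do", "doing", "done")


@dataclass
class Task:
    description: str
    points: int  # relative weight agreed on by the group, 1 (easy) to 5 (difficult)


@dataclass
class ScrumBoard:
    columns: dict = field(default_factory=lambda: {name: [] for name in COLUMNS})

    def add(self, task):
        """New tasks start in the 'to do' column."""
        self.columns["to do"].append(task)

    def move(self, task, target):
        """Move a task from its current column to the target column."""
        for tasks in self.columns.values():
            if task in tasks:
                tasks.remove(task)
        self.columns[target].append(task)

    def points_done(self):
        """Total weight of all finished tasks, as a simple progress indicator."""
        return sum(task.points for task in self.columns["done"])


# Example tasks from the text; the point values are assumed for illustration.
board = ScrumBoard()
definition = Task("What is a definition of sustainability?", 1)
lab = Task("A lab experiment", 5)
calculation = Task("Calculate the atom-efficiency of the reaction presented in exercise 2", 3)
for task in (definition, lab, calculation):
    board.add(task)

board.move(definition, "done")
board.move(lab, "doing")
```

After these two moves, one task remains in each column, which mirrors the overview of progress that the board gives to both teacher and students.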

The Scrum board provided both the teacher and the students with an overview of students’ progress. The planning phase was followed by a sprint in which students performed the assignments and experiments. A sprint took two weeks and consisted of five or six lessons of 60 min. At the start of every lesson each group gathered around their Scrum board and had a small stand-up meeting (5 min), in which they discussed what they had done in the last lesson, what they intended to do in the current lesson, and whether they experienced problems or not. The first sprint focused on the 12 principles of Green Chemistry and was finished with a formative assessment in which students reviewed their knowledge of the chemistry concepts involved. The questions to be answered in the formative assessment were related to the concepts covered during the sprint. Students discussed their answers with teammates and their teacher. Thus, students discovered ‘where they are’ and ‘where they are supposed to be’ with regard to understanding of the chemistry concepts, and got a feeling for the type of questions that would be part of the final, summative assessment. In addition, students discussed the quality of their collaboration and reflected on how they could improve their learning in the next sprint. After this retrospective phase, students studied additional concepts in the second sprint, such as reaction enthalpies and block diagrams. Afterwards, students’ conceptual development was measured again with a formative assessment. In the last sprint, the students transferred the knowledge they had acquired to a new situation. They were asked to develop all necessary tasks themselves, to create a substantiated advice on the greenest synthesis of adipic acid. Clearly, the students were rather autonomous in how they planned and organized their work. 
However, the role of the teacher was still important, although its focus changed from transmitting knowledge to students to a more facilitating role. For instance, the teacher introduced the module, provided Scrum boards, product backlogs and formative assessments, and discussed conceptual issues with students on request. Finally, students were asked to answer a post-test with questions on Green Chemistry, which was the same as the pre-test, as well as the questionnaire on their perceptions of metacognitive dimensions of their learning.

Implementation of Green Chemistry in regular teaching

The four teachers of the comparison group implemented the Green Chemistry module without using Scrum methodology. In their classes, students formed groups themselves. Similar to the way in which the teachers of the experimental group introduced the module, teachers of the comparison group explained the objective of the module. At the start of the module students were invited to answer: (1) a pre-test with questions on Green Chemistry (GCCT) and (2) a questionnaire on their perceptions of metacognitive dimensions of their learning (QAM).

A typical lesson in the comparison group started with a short explanation (e.g., what is atom efficiency and how is its value calculated?), after which students were allowed to work in groups on the tasks, experiments and assignments as presented in the module. In general, teachers had a facilitating role in this phase of the lesson, which means that they stimulated their students, discussed conceptual issues with groups and answered questions on demand.

However, they did not provide scaffolds or special clues to their students with regard to how they might plan and monitor their work. Furthermore, teachers of the comparison group used neither formative assessments to review students’ learning progress nor retrospectives to stimulate their students to reflect on their learning approach. At the end of the lesson series students answered a post-test with questions of Green Chemistry, and the questionnaire on their perceptions of metacognitive dimensions of their learning.

Splitting of the experimental group

Teachers of the experimental group were split into two groups for several reasons. First, it turned out that some teachers reported a straightforward implementation of Scrum methodology in their classes whereas others experienced its implementation as challenging (Vogelzang et al., 2020b). This observation suggested that the experimental group consisted of at least two different teacher groups. In addition, the QAM-questionnaire provided evidence that students experienced the quality of the lessons differently. The scale support by teacher was used as a fidelity check. This scale included three items on how the students evaluated teachers’ guidance and instruction (with 1 = completely disagree and 5 = completely agree; Cronbach’s α = 0.79). Analyses of variance with the six teachers who taught the experimental group as factor showed that the teachers significantly differed in perceived support (F(5, 189) = 22.731; p < 0.001; \(\eta^{2}\) = 0.382). Scheffé post-hoc analyses showed (with p < 0.05) that two of the six participating teachers had considerably higher scores. Their average score was 4.49 (with SD = 0.58), whereas two other teachers showed considerably lower scores. Two teachers did not show a clear difference with either group and were, based on their mean scores, added to the low scoring group of teachers. These four teachers with lower scores showed a mean score of 3.70 (with SD = 0.89). The two teachers with high scores were labelled as ‘top-teachers’ and the other four teachers as ‘growth-teachers’. Teachers of the comparison groups (from three schools) showed similar scores on support by teacher (with a mean of 4.08 and SD of 0.60) and were therefore not split into separate groups. Consequently, the design of this study can be characterised as quasi-experimental with two separate experimental groups and one comparison group.

Analysis

The two teacher groups, top-teachers and growth-teachers, can be understood as two separate conditions. To answer the research questions, differences between the experimental group and the comparison group were analysed separately for top-teachers and growth-teachers. To examine the effects on students’ achievement (RQ1), two univariate analyses of covariance were performed (one with top-teachers vs comparison group, and one with growth-teachers vs comparison group), with teacher group as factor, post-test achievement score as dependent variable, and pre-test achievement score as covariate.
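The logic of such a univariate analysis of covariance can be sketched in a few lines of Python. The sketch below uses the textbook computation with adjusted sums of squares for one factor and one covariate; it is not the software or script used in this study, and the function name, group labels and scores are hypothetical.

```python
def ancova_f(groups):
    """One-way ANCOVA with a single covariate.

    `groups` maps a group label to a list of (pre, post) score pairs.
    Returns (F, df_between, df_error) for the group effect on the
    post-test after adjusting for the pre-test.
    """

    def sums_of_products(pairs):
        # Centered sums of squares and cross-products for one set of pairs.
        n = len(pairs)
        mean_x = sum(x for x, _ in pairs) / n
        mean_y = sum(y for _, y in pairs) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        syy = sum((y - mean_y) ** 2 for _, y in pairs)
        return sxx, sxy, syy

    all_pairs = [pair for pairs in groups.values() for pair in pairs]
    sxx_t, sxy_t, syy_t = sums_of_products(all_pairs)

    # Pooled within-group sums (each group centered on its own means).
    within = [sums_of_products(pairs) for pairs in groups.values()]
    sxx_w = sum(s[0] for s in within)
    sxy_w = sum(s[1] for s in within)
    syy_w = sum(s[2] for s in within)

    # Post-test variation left after removing the pre-test regression.
    ss_total_adj = syy_t - sxy_t ** 2 / sxx_t
    ss_error_adj = syy_w - sxy_w ** 2 / sxx_w
    ss_group_adj = ss_total_adj - ss_error_adj

    df_between = len(groups) - 1
    df_error = len(all_pairs) - len(groups) - 1
    f_value = (ss_group_adj / df_between) / (ss_error_adj / df_error)
    return f_value, df_between, df_error


# Made-up (pre, post) scores for two small groups (not study data).
example_groups = {
    "comparison": [(1, 2), (2, 3), (3, 5), (4, 6)],
    "scrum": [(1, 3), (2, 4), (3, 6), (4, 7)],
}
f_value, df_between, df_error = ancova_f(example_groups)
print(f"F({df_between},{df_error}) = {f_value:.3f}")
```

In this made-up example both groups have the same pre-test scores, so the whole adjusted group effect stems from the one-point difference on the post-test.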

To examine the effects on students’ perceptions of affective and metacognitive dimensions of their learning (RQ2), two multivariate analyses of covariance were performed (one with top-teachers vs comparison group, and one with growth-teachers vs comparison group), with teacher group as factor, the set of six affective and metacognitive outcomes as dependent variables, and their pre-test scores as covariates. The affective and metacognitive outcomes showed significant, though moderate, correlations (see Table 7). All dependent variables showed a normal distribution.

Table 7 Correlations for the affective and metacognitive outcomes (post-test; N = 256; *p < 0.001)

Results

Scrum methodology and students’ achievement

Mean scores and standard deviations of the pre- and post-test of the three groups are presented in Table 8. The three groups did not differ significantly on pre-test scores (F(2,317) = 0.41, p = 0.96). Students from the two top-teachers generally achieved significantly higher scores on the post-test compared to students from the comparison group, after controlling for the pre-test scores (F(1,209) = 56.447, p < 0.001, η² = 0.213). This can be understood as a large effect of the intervention (Cohen, 1988). Students from the four growth-teachers also generally performed significantly better on the post-test compared to students from the comparison group, after controlling for the pre-test scores (F(1,209) = 18.232, p < 0.001, η² = 0.081). This can be understood as a medium effect of the intervention. Furthermore, when the students taught by top-teachers and growth-teachers were combined (experimental group complete) and compared with the comparison group, they again scored significantly higher on the post-test, after controlling for the pre-test scores (F(1,318) = 44.845, p < 0.001, η² = 0.124). This effect-size lies between a medium and a large effect of the intervention.
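The relation between a reported F value and its effect size can be made explicit with a small Python sketch. It assumes that the reported η² values are partial eta squared, which the text does not state explicitly; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for the achievement post-test.
top_vs_comparison = partial_eta_squared(56.447, 1, 209)
combined_vs_comparison = partial_eta_squared(44.845, 1, 318)
print(round(top_vs_comparison, 3), round(combined_vs_comparison, 3))
```

Under this assumption, the formula returns approximately 0.213 for the comparison of top-teachers with the comparison group and approximately 0.124 for the combined experimental group.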

Table 8 Mean scores on test (maximum 100 points) of students taught by respectively top-teachers, growth-teachers, the combined experimental group and students of the comparison group

Scrum methodology and students’ perceptions of affective and metacognitive aspects of their learning

Mean scores and standard deviations on the six dependent variables from QAM are presented in Table 9. Comparison of perception scores of students taught by the two top-teachers and the comparison group revealed that students from the top-teachers showed significantly higher scores on classroom climate (F(1,164) = 17.968, p < 0.001, η² = 0.099), self-regulation (F(1,164) = 6.222, p < 0.05, η² = 0.037), attitude towards chemistry (F(1,164) = 4.894, p < 0.05, η² = 0.029) and collaboration (F(1,164) = 9.895, p < 0.005, η² = 0.057), after controlling for pre-test scores on each relevant variable. The difference on classroom climate can be understood as a medium effect size of the intervention; the other three as small effect sizes. No significant differences were found for self-efficacy and personal development. Comparison of the perception scores of students taught by the growth-teachers and the comparison group, after controlling for pre-test scores on each relevant variable, revealed that students from the four growth-teachers showed significantly higher scores on self-efficacy (F(1,154) = 6.554, p < 0.05, η² = 0.041) and attitude towards chemistry (F(1,154) = 7.001, p < 0.01, η² = 0.044). These effects of the intervention can be understood as small effects. No significant differences were found for the other four variables. Comparison of the perception scores of the complete experimental group and the comparison group, after controlling for pre-test scores on each relevant variable, revealed that students of the experimental group showed significantly higher scores on classroom climate (F(1,254) = 8.918, p < 0.005, η² = 0.034), attitude towards chemistry (F(1,254) = 7.364, p < 0.01, η² = 0.028) and collaboration (F(1,254) = 6.347, p < 0.05, η² = 0.024). These effects can be understood as small effects. No significant differences were found for the other three variables.

Table 9 Comparison of pre-test and post-test scores of 6 affective and metacognitive components of students’ learning for students taught by respectively top-teachers, growth-teachers and students participating in the comparison group (minimum score = 1; maximum score = 5)

In addition, for students taught by top-teachers we found significant increases for self-regulation (t(99) = −2.509, p = 0.014) and collaboration (t(99) = −5.085, p = 0.017). The observed decline for self-efficacy (t(99) = 1.383, p = 0.17) was not significant. For the group of students taught by growth-teachers we found significant increases for self-efficacy (t(89) = −3.865, p < 0.001) and classroom climate (t(89) = −2.051, p < 0.001). Furthermore, for the combined group of all students participating in the experimental group we found significant increases for classroom climate (t(189) = −2.418, p = 0.017) and collaboration (t(189) = −4.086, p < 0.001). In the comparison group we did not find significant increases or decreases.
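The sign of these t values can be illustrated with a short Python sketch. It assumes that the within-group comparisons were paired-samples t-tests on differences defined as pre-test minus post-test, which the text does not state explicitly; under that assumption an increase from pre-test to post-test yields a negative t value. The function name and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre_scores, post_scores):
    """Paired-samples t statistic and degrees of freedom.

    Differences are taken as pre minus post (an assumption), so an
    increase from pre-test to post-test gives a negative t value.
    """
    differences = [pre - post for pre, post in zip(pre_scores, post_scores)]
    n = len(differences)
    t_value = mean(differences) / (stdev(differences) / sqrt(n))
    return t_value, n - 1


# Made-up scale scores of four students (not study data).
example_pre = [3, 2, 3, 2]
example_post = [4, 5, 4, 5]
t_value, df = paired_t(example_pre, example_post)
print(f"t({df}) = {t_value:.3f}")
```

Because the degrees of freedom equal the number of pairs minus one, a statistic such as t(99) would correspond to 100 students with both a pre-test and a post-test score.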

Discussion

This study examined the effects of using Scrum methodology on students’ learning outcomes and on their perceptions of affective and metacognitive aspects of their learning.

Students’ learning outcomes

The results on learning outcomes show that students taught by top-teachers as well as students taught by growth-teachers outperformed students of the comparison group. This finding suggests that the ceremonies, roles and artefacts of Scrum methodology guide students through, and simultaneously scaffold, their learning process. Therefore, implementing Scrum methodology in context-based learning environments might respond to the criticism that unguided approaches often fail (Kirschner et al., 2006). However, providing a single, straightforward explanation for the observed effect is impossible, because Scrum methodology itself comprises a variety of components. Nevertheless, essential in Scrum is that students are invited systematically and explicitly to use, think about and rethink the concepts involved in the context-based course. For example, the recurrent stand-up ceremony explicitly requires students to share what they have done for their team and what problems they encountered. The stand-up ceremony contributes to a learning environment in which students experience that they are mutually interdependent. With this ceremony, each team member is systematically placed in a situation in which he or she has to take responsibility for the team effort. Consequently, it increases the chance that each student engages in the project individually and focuses on its underlying concepts. However, despite these ceremonies, collaboration between students might still be challenging, due to, for example, conflicts of interest between peers (Puurtinen & Mappes, 2009) or ‘free-rider’ issues (Hwang & Wu, 2014).

During the review ceremony, students have to give serious thought to the concepts they worked on during the sprint. Embedding a formative assessment within a review ceremony evokes both individual reflection and additional discussions about (mis)concepts and stimulates mutual peer feedback in a natural way. It also creates opportunities for teachers to adapt their teaching to students’ specific needs (Treagust, 2012). Educational research has shown that applying formative assessments might contribute to students’ achievement (Andrade & Heritage, 2017; Black & Wiliam, 2009; Vogelzang & Admiraal, 2017). Therefore, the review ceremony might explain the observed effects on learning outcomes, although an in-depth empirical follow-up study is necessary to investigate its contribution to the overall effect. In contrast, students of the comparison group were not exposed to systematically organized moments in which they were challenged to rethink the concepts involved, which increases the chance that their learning remained superficial. In conclusion, the review ceremony functions as a focal point and brings students back to the essentials of the project. It allows students to gain an understanding of key concepts and helps them to elucidate the learning objectives.

Remarkably, despite the fact that growth-teachers reported organizational issues, and in some cases even some resistance against Scrum methodology, their students still outperformed students of the comparison group, suggesting that the ceremonies and artefacts of Scrum methodology play a key role in the observed, positive effects on students’ learning outcomes. That said, the findings on learning outcomes clearly show that the teachers themselves have a substantial impact, which is illustrated by the fact that students taught by top-teachers outperform students of growth-teachers as well as students of the comparison group.

Affective and metacognitive aspects of students’ learning

The effects of the use of Scrum methodology on students’ perceptions of metacognitive and affective dimensions of their learning are less pronounced than those on learning outcomes. Nevertheless, the observation that in the experimental groups significant increases were found for several dimensions, whereas within the comparison group no effects were found, suggests that Scrum methodology might support teachers in facilitating students’ learning. However, the results suggest that the teacher still plays a key role, and there are several clues that support this claim. First of all, for self-regulation only a small effect-size is found when students taught by top-teachers are compared with students of the comparison group. Although the recurring parts of Scrum methodology, including stand-up and retrospective ceremonies, are intended to promote students’ self-regulation (see Table 1), no significant effect of teaching with Scrum methodology is found when students taught by growth-teachers are compared with students of the comparison group. Especially noteworthy is that students taught by growth-teachers also did not show an increase in their self-reported perception of their self-regulation. Given the fact that growth-teachers reported resistance towards Scrum ceremonies among their students, whereas top-teachers did not notice resistance, this might be an indication that the way teachers mentored their students plays a crucial role (Perry et al., 2019).

Findings on self-efficacy confirm the importance of the teacher’s role. In general, high levels of self-efficacy are related positively to students’ learning outcomes (Boz et al., 2016). Given the fact that high effect-sizes on students’ learning outcomes were found, comparable effect-sizes on self-efficacy were expected. However, rather unexpectedly, the self-efficacy of students taught by top-teachers decreased slightly, although not significantly. Several reasons might explain this. Students taught by top-teachers appreciated their support already before the experiment, and reported a high average score on self-efficacy compared to students of the other groups, suggesting that they did not see any need to change their learning strategies. Leaving behind rather successful learning strategies, used for a long time, and internalizing a new learning approach might be an inconvenient process and might have a negative impact on students’ self-efficacy (Schunk & Meece, 2006). Strikingly, students taught by growth-teachers reported a significant increase in self-efficacy, culminating in a small effect-size compared to students of the comparison group, suggesting that, despite the reported resistance, they benefited from certain ceremonies of Scrum methodology.

The most important clue that the teacher plays a key role can be drawn from students’ perception of the classroom climate. Students of classes taught by top-teachers appreciated their classroom climate more than students of the comparison group (medium effect-size). Comparing classes taught by growth-teachers with the comparison group revealed no difference. Nevertheless, students taught by growth-teachers reported a significant increase in how they experienced their classroom climate, which suggests that Scrum also plays a role.

Findings on personal development are in line with these results. Compared to students of the comparison group, students taught by top-teachers reported a stronger increase in their perception that they developed the personal qualities they deployed during Scrum lessons. Developing personal qualities requires a learning environment in which students receive adequate feedback from their teacher and team mates. Scrum methodology creates these opportunities systematically (stand-up, review, retrospective). However, solely implementing Scrum methodology in a context-based classroom does not guarantee that students develop their qualities. Again, the teacher who implements Scrum methodology matters.

Findings on students’ attitude towards chemistry revealed small effect-sizes for students taught by both top-teachers and growth-teachers, compared to students of the comparison group. There is empirical evidence that both context-based approaches and teachers’ behaviour are effective in encouraging students to develop more positive attitudes towards science (Lee & Erdogan, 2007). Findings on students’ attitude, as presented in this study, are in line with results reported in the review study by Savelsbergh et al. (2016). They calculated effect-sizes for context-based approaches intended to stimulate students to reflect on chemistry concepts, and to connect these concepts to their personal lives (Marks & Eilks, 2009). Teachers provide feedback, challenge their students to discuss and thus promote a reflective attitude among them. The observed effect-sizes suggest that Scrum methodology contributes to a positive attitude towards chemistry.

The average scores on collaboration of students taught by top-teachers, growth-teachers and students of the comparison group increased substantially during the intervention (Table 9). However, a statistically significant difference was only found when scores of students taught by top-teachers were compared with those of students participating in the comparison group. No difference was found between students taught by growth-teachers and students of the comparison group. This result suggests that the ceremonies (such as forming of groups, stand-up) scaffold mutual collaboration in an appropriate classroom climate. Moreover, this result is an indication that the impact of Scrum methodology is profoundly influenced by teachers who are able to create an atmosphere in which students feel comfortable to work and learn.

Limitations and future directions

Although we distinguished between two qualities of implementation (top-teachers and growth-teachers), the present study did not discriminate between the different components of Scrum methodology. Therefore, it is impossible to explain to what extent the different parts (e.g., the review ceremony, the retrospective or the stand-up ceremony) account for the effects found. A follow-up study might manipulate the various components of Scrum methodology, in particular the review ceremony (with or without a formative assessment). In addition, it might be interesting to use formative assessments, including a close consideration of the enacted formative assessment practice, in both the experimental and control condition, to examine the additional impact of other aspects of Scrum methodology.

Moreover, data on the actual metacognitive processes the students were engaged in were not collected in the current study. A follow-up study might focus on how Scrum supports students’ reflective behavior to find out to what extent Scrum stimulates metacognitive processes.

Furthermore, the Green Chemistry context itself might be changed by using the synthesis of another chemical or a different final assignment such as writing a section of a scientific journal article in argumentative style rationalizing the choice of the synthetic route or writing a technical note in which students use all twelve Green Chemistry principles.

A potential source of bias in the current study is that teachers were assigned to the experimental or control condition based on their motivation for using Scrum methodology. Teachers in the experimental group were motivated to attend a professional development program on Scrum methodology. This might have influenced the impact of Scrum on learning outcomes positively. On the other hand, participating teachers in the experimental group were rather inexperienced: the results on learning outcomes and on affective and metacognitive dimensions of students’ learning were gathered during the first implementation of Scrum in their classes. A next step in research on the implementation of Scrum methodology in secondary education could be to examine the process of teaching and learning with Scrum. This kind of research might also reveal differences between teachers in implementing this methodology and their connection with different student outcomes.

Thus, the findings might in part be due to the inexperience of the teachers and their students. If teachers and students are more familiar with Scrum methodology, this might positively impact students’ self-efficacy, their self-regulation and their perception that they develop their personal qualities. Examining this requires a longitudinal study of the effect of Scrum methodology on students’ learning.

Providing students with the GCCT before exposing them to the concepts and principles covered by the module is not without risk (cf. the productive failure approach; Kapur & Bielaczyc, 2012). Students might learn from the GCCT pre-test itself. However, using the GCCT as pre-test and post-test enabled us to measure students’ progress. Furthermore, students from the experimental groups as well as the comparison group completed the GCCT.

Scrum methodology is a widely used project management framework. However, there are other frameworks, including Kanban (Saltz & Heckman, 2020) and digital project management frameworks such as Trello (Parsons et al., 2018). It would be interesting to compare different frameworks in a single study. Such a comparison might provide a deeper insight into why and how project management frameworks affect students’ learning.

Conclusions and implications

Although participating teachers and their students were unfamiliar with Scrum methodology at the start of the study, results reveal that it can be a powerful and useful tool in rather complex, context-based learning environments. Obviously, effects of the Scrum methodology on students’ learning outcomes depend on the quality of its implementation. Besides, especially in classes taught by top-teachers, positive effects of teaching with Scrum have been found on affective and metacognitive aspects of students’ learning such as their self-regulation, classroom climate, personal development and their attitude towards chemistry.

Yet, several participating teachers, both top-teachers and growth-teachers, mentioned that implementing Scrum methodology in a classroom is not an easy job. At the start, it requires more preparatory work and more planning, and it raises many other organizational issues, which is common when a new instructional approach is introduced in a classroom (Prince & Felder, 2007). Other factors to be taken into consideration are student resistance and collaboration issues, not to mention that teachers themselves might experience feelings of uncertainty.

The present study shows that Scrum methodology might strengthen students’ learning by providing a structure with clear ceremonies, roles and artefacts that guides students in their learning process. However, as always, the teacher matters. Following a professional development program to become familiar with the tenets of Scrum methodology is not enough. Attention should be given to teachers’ behaviour when they implement Scrum and, subsequently, to how they mentor their students during their lessons, for example, by filming and visiting lessons. Participating teachers as well as the course instructor of the professional development program should act as critical friends to strengthen teachers’ mentoring when they guide their students through the Scrum ceremonies. If a teacher is capable of creating a classroom climate in which students feel free to reflect and to provide and receive feedback, the classroom becomes a place where students ‘learn as they Scrum’.