Introduction

Design-based research (DBR) has emerged as a research methodology since the beginning of this century. Situated in real contexts, DBR examines a particular intervention through continuous iteration of design, enactment, analysis, and redesign (Brown 1992; Cobb et al. 2003; Collins 1992). The intervention can be an instructional approach, a type of assessment, a learning activity, or a technological intervention, that is, testing the effectiveness of a particular learning environment or tool (Anderson and Shattuck 2012). With the aim of designing learning environments and developing theories, DBR explicates how designs work in real settings and how to better understand the teaching and learning issues involved (The Design-Based Research Collective 2003). As an emerging paradigm, DBR highlights how design principles evolve through multiple iterations as well as what kinds of intervention can lead to improved outcomes. By linking processes to outcomes in particular contexts, DBR can yield a better understanding of interventions as well as improved theoretical accounts (The Design-Based Research Collective 2003).

DBR has been used increasingly in the educational field, especially in K-12 contexts with technological interventions (Anderson and Shattuck 2012). Although the promising benefits of DBR are acknowledged in the field of education, many critiques have been raised in previous studies. It is doubtful whether researchers can produce reliable and faithful statements in DBR because the researchers themselves are involved in the design, development, and implementation of the interventions (Barab and Squire 2004); it is therefore difficult to achieve high research validity in DBR. Furthermore, it is difficult to replicate an intervention in other settings because DBR is contextually dependent (The Design-Based Research Collective 2003; Fishman et al. 2004; Hoadley 2002). There are thus clear gaps between the expectations and the application of DBR. This phenomenon prompts us to ask how DBR has been adopted and realized in education research in the past decade. Has DBR been most effective in particular learning domains or research settings? What kinds of methods were utilized in DBR? How did researchers design and implement the interventions in DBR? In order to answer these questions, this study conducted a systematic review of existing studies to gain insights into the research issues of DBR and to provide valuable references for educators and practitioners.

Previous studies have attempted to analyze the methodology, progress, and issues of DBR. For example, Anderson and Shattuck (2012) reviewed the characteristics and progress of DBR by analyzing the abstracts of the 47 most-cited papers from 2002 to 2011. McKenney and Reeves (2013) suggested that in-depth analysis of the full text of DBR studies should be conducted in order to provide sufficient evidence for assessing the progress of a decade. However, little research has thoroughly analyzed the demographics, research methodology, interventions, and research outcomes in the field of DBR. Therefore, this study aims to provide an overview of DBR through the systematic analysis of 162 selected studies from the database of 219 social sciences citation index (SSCI) educational journals from 2004 to 2013.

As Noyons and van Raan (1998) reported, separating published papers into two periods can provide insights into the variation of a particular topic. Several studies have analyzed such variation by splitting data into different periods of time. For example, Tsai et al. (2011) examined the variation of science learning by analyzing 228 empirical studies during 2000–2004 and 2005–2009. Kinshuk et al. (2013) analyzed highly cited educational technology papers during 2003–2006 and 2007–2010. Zheng et al. (2014) investigated the research topics of computer-supported collaborative learning by analyzing 706 papers published during 2003–2007 and 2008–2012. Thus, the present study conducts an in-depth review of the demographics, research methodology, interventions, and research outcomes of DBR, comparing the first 5 years (2004–2008) with the second 5 years (2009–2013). The purpose of this review is twofold. First, the authors investigate the status quo of DBR from 2004 to 2013. Second, the variations between the first 5 years (2004–2008) and the second 5 years (2009–2013) in demographics, research methods, intervention characteristics, and research outcomes are explored based on the selected studies. Therefore, the research questions addressed in this study are as follows:

  1. What are the demographics of the selected studies from 2004 to 2013? And what were the demographic variations between the first 5 years and the second 5 years?

  2. What research methodologies were adopted in the selected DBR studies from 2004 to 2013? And what were the methodology variations between the two periods?

  3. What kinds of interventions were adopted in DBR from 2004 to 2013? And what were the intervention variations between the two periods?

  4. What were the measured outcomes in DBR from 2004 to 2013? And what were the measured-outcome variations between the two periods?

Methodology

This study adopted the content analysis method to review research papers on DBR from 2004 to 2013. This section describes the details of the paper selection process, the coding scheme, and the inter-rater reliability.

Paper selection processes

In order to conduct a systematic literature review on DBR, this study selected papers relevant to DBR from the database of 219 education and educational research SSCI-indexed journals from 2004 to 2013. More specifically, the paper identification process proceeded in three stages. In the first stage, 479 papers related to DBR were selected using keyword and paper-title searches within the 219 journals. The search terms included "design research" and its synonyms (viz. “design-based research,” “design based research,” “design experiment,” “design experiments,” “developmental research,” “development research,” and “formative research”). In the second stage, the authors selected papers based on the following six criteria:

  • First, only papers that were categorized as “articles” in the SSCI database were analyzed in this study. Thus, non-research publications such as “book reviews,” “editorials,” and “letters” were excluded from this study.

  • Second, conceptual papers closely related to DBR were included so as to produce a comprehensive understanding of DBR.

  • Third, the studies had to adopt the DBR method to conduct the empirical study.

  • Fourth, the measured outcome variable(s) in the empirical study had to be related to student outcomes (cognitive outcomes, attitude, and psychomotor skills).

  • Fifth, the empirical study had to follow appropriate methodology (Jitendra et al. 2011). The research sample groups, settings, learning domains, data sources, and data analysis procedure had to be specified in the empirical study.

  • Sixth, the paper had to be written in English and published from 2004 to 2013.

Papers failing to satisfy any of these criteria were excluded from the literature review. Finally, the search and identification process resulted in 162 selected articles.

Coding scheme

To answer the aforementioned four research questions, a coding scheme was developed for reviewing DBR in the past decade. To address the first research question, concerning the demographics of DBR, the categories included research sample group, research settings, and research learning domains. To analyze the research methodology in DBR, we focused on the research methods and data sources adopted in the selected studies. To answer the third research question, regarding the intervention characteristics, the categories included intervention type, revision of intervention, iteration frequency, and iteration duration. To explore which measured outcomes were assessed, we adopted the coding scheme proposed by Wang et al. (2014), namely cognitive outcomes, attitude, psychomotor skills, integrated, and others. Some of these categories (research sample groups, research settings, research learning domains, research methods, data sources, and measured outcomes) have also been applied in other reviews (Hsu et al. 2012; Wang et al. 2014). The following sections illustrate the details of each sub-dimension.

Research sample groups

Research sample groups were classified into one of the following sub-categories: (1) preschool, (2) primary school, (3) junior and senior high school, (4) higher education, (5) vocational education, (6) teachers, (7) mixed group, and (8) non-specified.

Research settings

Research settings refer to the contexts in which the research was mainly conducted. Research settings were coded as follows: (1) face-to-face classroom, (2) workplace, (3) distance learning setting, (4) blended learning setting, and (5) non-specified. If a study took place in a workplace mixed with a distance learning setting, it was coded as workplace.

Research learning domains

Research learning domains were classified into the following sub-categories: (1) natural science (including science, mathematics, physics, chemistry, biology, geography, and environment science), (2) social science (including politics, education, psychology, and linguistics), (3) engineering and technological science (including engineering and computer science), (4) medical science, (5) mixed learning domain, and (6) non-specified.

Research method

Research method was coded as follows: (1) qualitative method, (2) quantitative method, and (3) qualitative and quantitative method. Qualitative methods are those in which investigators use narratives, ethnographies, case studies, and so on to develop knowledge. Quantitative methods are those in which investigators adopt experiments, surveys, and so on to develop knowledge (Creswell 2009).

Data sources

Within DBR, multiple data sources can be used to analyze the outcomes of an intervention and to refine it (Cobb et al. 2003; The Design-Based Research Collective 2003; Wang and Hannafin 2005). In this study, the data sources were coded as follows: (1) process data, including video and audio records, log data, think-aloud protocols, (2) outcome data, including test and various kinds of artifacts, (3) miscellaneous data, including questionnaire, interview data, notes (such as field notes, journals, written reflections, observation records), and (4) non-specified.

Intervention type

The intervention type was coded as follows: (1) instructional method (such as collaborative learning or project-based instruction), (2) scaffolding (conceptual scaffolding, procedural scaffolding, and metacognitive scaffolding), (3) integrated teaching models (such as knowledge-building activity), (4) technological intervention (namely testing the effectiveness of the learning environment or a particular tool), and (5) other models or methods (such as a professional development model or heuristic task analysis method).

Revision of intervention

Revision of intervention refers to whether the intervention was revised and whether the revision was specified. Whether the intervention was revised was coded as (1) revised or (2) no revision. Whether the study specified how the intervention was revised was coded as (1) reported or (2) no report.

Iteration frequency

Iteration frequency refers to the number of times the intervention is implemented during the whole research. In this study, the value of iteration frequency was coded as once, twice, thrice, four times, five times, and more than five times.

Iteration duration

Iteration duration refers to how long the intervention is conducted in the whole research. This time span can range from several days to several years.

Measured outcomes

The measured outcomes refer to the crucial variables investigated. Three major domains were selected as measured outcomes for this study, namely cognitive outcomes, attitude, and psychomotor skills. In addition, studies that measured multiple kinds of variables were categorized as “integrated.” Measured outcomes that did not belong to these four categories were classified as “others.” Therefore, the measured outcomes were coded as follows: (1) cognitive outcomes, (2) attitude, (3) psychomotor skills, (4) integrated, and (5) others.

Inter-rater reliability

Three raters manually and independently coded all of the articles based on the aforementioned schemes. Percent agreement was used to calculate the inter-rater reliability. The agreement rate between coders was above 0.9, which is regarded as reliable and stable (Landis and Koch 1977). The three raters resolved all discrepancies through face-to-face discussion.
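The percent agreement used here is simply the proportion of articles on which raters assign the same code, averaged over rater pairs. A minimal sketch of the calculation (the codes and article counts below are hypothetical, for illustration only):

```python
from itertools import combinations

def percent_agreement(ratings):
    """Mean pairwise agreement across raters.

    `ratings` is a list of per-rater code lists, all of the same length:
    one code per article per rater.
    """
    n_items = len(ratings[0])
    pair_scores = []
    for a, b in combinations(ratings, 2):
        # Proportion of articles this pair of raters coded identically
        agree = sum(x == y for x, y in zip(a, b))
        pair_scores.append(agree / n_items)
    return sum(pair_scores) / len(pair_scores)

# Hypothetical codes from three raters for five articles
rater1 = ["qual", "quant", "mixed", "qual", "qual"]
rater2 = ["qual", "quant", "mixed", "qual", "quant"]
rater3 = ["qual", "quant", "mixed", "qual", "qual"]

print(percent_agreement([rater1, rater2, rater3]))  # ~0.867
```

Note that percent agreement does not correct for chance agreement; chance-corrected coefficients such as Cohen's kappa are stricter alternatives.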

Results

Demographics of the selected studies

Table 1 shows the descriptive data for the demographics results of the selected studies in the first 5 years (2004–2008) and the second 5 years (2009–2013).

Table 1 The descriptive data for the results of demographics of the selected studies

Research sample groups

As shown in Table 1, researchers most often selected the higher education group in both periods, while the preschool sample was the least selected group in both periods. Additionally, the most significant increase was found in the vocational education sample group (χ² = 1.97, p < 0.05) and the most significant decrease in the junior and senior high school group (χ² = 2.01, p < 0.05) between the two periods. No significant differences were found in the other sample groups.
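The between-period comparisons reported in this section are chi-square tests on category frequencies. As an illustrative sketch (the counts below are hypothetical, not the study's data), the uncorrected chi-square statistic for a 2 × 2 table of present/absent counts per period can be computed directly; with one degree of freedom, values above 3.84 correspond to p < 0.05:

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected chi-square statistic for a 2x2 contingency table:
        period 1: a studies in the category, b not in it
        period 2: c studies in the category, d not in it
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Hypothetical counts: a category appearing in 2 of 72 studies in the
# first period and 12 of 90 studies in the second period
chi2 = chi_square_2x2(2, 70, 12, 78)
print(chi2 > 3.84)  # True: this shift would be significant at p < 0.05
```

For small expected cell counts, a continuity correction (or Fisher's exact test) is usually preferred over this plain statistic.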

Research settings

Table 1 also shows that most of the research was conducted in face-to-face classrooms. However, there was a significant decrease in face-to-face classroom settings (χ² = 2.51, p < 0.05) between the first 5 years and the second 5 years, and a significant increase in distance learning settings (χ² = 4.73, p < 0.05). The blended learning setting and the workplace both grew across the two periods, but neither difference was significant (blended learning: χ² = 0.53, p > 0.05; workplace: χ² = 1.85, p > 0.05).

Research learning domains

In DBR, researchers selected different learning domains to investigate how interventions function over several cycles. During the past decade, natural science was selected most often and medical science least often. However, no significant differences were found in natural science (χ² = 0.68, p > 0.05), social science (χ² = 1.24, p > 0.05), engineering and technological science (χ² = 0.06, p > 0.05), medical science (χ² = 0.01, p > 0.05), or mixed learning domains (χ² = 1.22, p > 0.05) between the two periods.

Research methodology

Research method

Table 2 shows the descriptive results for research methods and data sources. With respect to research method, the qualitative method was adopted most often and the quantitative method least often. However, there were no significant differences in the qualitative method (χ² = 0.25, p > 0.05), the quantitative method (χ² = 0.006, p > 0.05), or the combined qualitative and quantitative methods (χ² = 0.26, p > 0.05).

Table 2 Descriptive data for the results of research method and data sources

Data sources

With regard to data sources, researchers collected various kinds of data to conduct the research. Miscellaneous data (such as interview data, questionnaires, and various kinds of notes) were utilized most often in the past decade. In addition, process data increased from 11.82 % to 14.54 %, while outcome data decreased from 27.27 % to 18.44 %. However, there were no significant differences in process data (χ² = 0.25, p > 0.05), outcome data (χ² = 0.61, p > 0.05), or miscellaneous data (χ² = 0.50, p > 0.05).

Intervention

Intervention type

Table 3 shows the descriptive data for intervention type, iteration frequency and duration, and revision of intervention. In the past decade, technological interventions were the major type of intervention used in DBR, although there was no significant difference in technological interventions between the two periods (χ² = 0.46, p > 0.05). Among the other intervention types, a significant increase was found in scaffolding (χ² = 5.37, p < 0.05) and a significant decrease in the instructional method (χ² = 2.25, p < 0.05). Although there were increases in the integrated teaching models and other models (from 15.56 % to 16.24 %), no significant differences were found in these two types of interventions.

Table 3 Descriptive data of intervention types, iteration frequency and duration, and revision of intervention

Revision of intervention

In the last decade, 74.07 % of the DBR studies revised the intervention. Across the two 5-year periods, there was a growing tendency towards interventions without revision and a decreasing trend in revised interventions, but neither difference was significant (with revision: χ² = 1.35, p > 0.05; without revision: χ² = 1.08, p > 0.05). Furthermore, we examined whether the studies provided detailed reports of how the intervention was revised. As shown in Table 3, 60.49 % of the DBR studies in the past decade reported what had been revised. However, there was a significant decrease in reporting the revision (χ² = 3.08, p < 0.05) and a significant increase in not reporting it (χ² = 2.02, p < 0.05) between the two periods.

Iteration frequency

In terms of iteration frequency, 50 % of the DBR studies conducted only one cycle in the past decade. There were slight increases in iteration frequencies of once, twice, and four times, and slight decreases in frequencies of three and five times. However, no significant differences were found in any iteration frequency between the two 5-year periods.

Iteration duration

Most DBR studies spent less than 1 year (42.6 %) or exactly 1 year (25.93 %) designing and testing an intervention; 15.43 % spent 2 years examining an intervention, and only a small proportion (4.32 %) were conducted for more than 3 years. As shown in Table 3, iteration durations of 2 years, 3 years, and more than 3 years decreased from the first 5 years to the second 5 years, while short iteration durations (1 month, 6 months, and 1 year) slightly increased. However, no significant difference was found in any iteration duration between the two periods.

Measured outcomes

Among the 162 studies, most focused on the measurement of cognitive outcomes (see Table 4). Some studies examined integrated skills, for example, problem solving and inquiry abilities. Only a few studies measured learners’ attitude and psychomotor skills. However, there was a significant increase in attitude between the first 5 years and the second 5 years (χ² = 2.77, p < 0.05). No significant differences were found in the measurement of cognitive outcomes (χ² = 0.21, p > 0.05), psychomotor skills (χ² = 0.01, p > 0.05), integrated skills (χ² = 0.26, p > 0.05), or others (χ² = 0.11, p > 0.05).

Table 4 Descriptive data of measured outcomes

Discussion

The study presented in this paper describes the status of DBR in the past decade based on 162 selected SSCI papers. The demographics of the selected studies revealed that the higher education sample group was the most commonly used in DBR. In terms of research settings, distance learning settings significantly increased and face-to-face classroom settings significantly decreased between the two periods. Also, DBR was most common in the natural science learning domain.

With regard to research methodology, most researchers selected the qualitative method to conduct DBR. This result is consistent with prior research indicating that DBR can be descriptive and explanatory in nature (McKenney and Reeves 2012). In terms of data sources, miscellaneous data such as interview data, questionnaires, and various kinds of notes were adopted in most DBR studies. This is in line with previous research reporting that DBR is typically conducted using multiple forms of data (Dede 2004; Wang and Hannafin 2005). Furthermore, multivocal analysis in DBR has been called for in order to obtain trustworthy and credible conclusions (Fujita 2013).

Results of the present study indicated that most researchers tested technological interventions through designing, developing, implementing, and revising particular technological tools. Furthermore, although most of the studies revised the intervention, there was a significant decrease in specifying how the intervention was revised, indicating a tendency among researchers not to provide the details of revisions to the design and intervention. In addition, most of the studies tested the intervention in only one cycle, and the iteration duration was 1 year or less in most DBR studies. This can be explained by the finding of Anderson and Shattuck (2012) that multiple iterations and cycles often go beyond the time and resources available to researchers.

With respect to the measured outcomes, the results indicated that the effectiveness of design and intervention was captured by measuring cognitive processes of learners, such as learning achievements, conceptual change, and artifacts. Very few studies measured attitudes and psychomotor skills of learners in DBR.

The theoretical and practical implications for future research are as follows. First, this study suggests that much more effort needs to be put into making DBR sound and reliable, because good research requires objectivity, reliability, and validity (Norris 1997). Second, the findings of the present study suggest that multiple iterations are required in DBR so as to refine the theory, methods, or tools. Third, caution should be exercised when generalizing the results of DBR, because findings are drawn from local contexts (The Design-Based Research Collective 2003). Fourth, the design activities that can yield very interesting outcomes have received less attention in DBR (Reimann 2011); it is suggested that the design itself, and how the design functions, be emphasized in future studies. Finally, educational research is expected to create useful knowledge and provide scientific claims (Lagemann 2002; National Research Council 2002). Therefore, DBR needs to be improved so as to produce useful and replicable knowledge in the future.

Conclusion

This study thoroughly examined the research sample groups, research settings, research learning domains, research methods, data sources, intervention types, revision of intervention, iteration frequency, iteration duration, and measured outcomes in DBR. The main conclusion is that technological interventions dominate DBR studies. However, there is a tendency among researchers not to report the details of how the intervention was revised. Also, only one cycle of iteration was conducted in most studies. In addition, the qualitative approach and miscellaneous data were adopted in most DBR studies.

This study contributes towards a better understanding of the status of DBR. The thorough analysis of variation between the two periods (2004–2008 and 2009–2013) can suggest directions for potential research topics. Furthermore, this study proposes the need for new approaches that emphasize the design process and highlight the value of replicability in research.

Results of this study are subject to several constraints. First, this study only analyzed the demographics, research methods, interventions, and research outcomes; it would be valuable to thoroughly analyze how a design functions and evolves across different cycles. Second, only journal articles published from 2004 to 2013 were examined; future studies should extend the data sources to conduct a more thorough analysis. Finally, it would also be useful to analyze the highly cited papers that have had high influence and made valuable contributions in the field of DBR, which could provide important insights into future directions for educators and researchers.