Introduction

Research and teaching are two of the main tasks of universities. A close link between them is often considered to be at the heart of the institution (Elen and Verburgh 2008). This close link is not only desired by traditional research universities, but is also becoming increasingly important to newer universities and other higher education institutions (Kyvik and Skodvin 2003). Graduate and undergraduate programs typically aim to offer their students educational programs that are linked to academic research, for instance, by having courses taught by academic staff who are involved in research, or by engaging students in research practices.

The academic community has studied the research–teaching nexus for decades with varying emphases. Until the 1990s, attention was primarily paid to the correlation between being a good researcher and being a good teacher, generally measured by citation indices and student satisfaction, respectively. However, a meta-analysis (Hattie and Marsh 1996) showed only a marginal correlation between these measures. Many academics were nonetheless convinced of the importance of the relation (Neumann 1992). They preferred combining research and teaching (Jensen 1988) and considered linking research and teaching beneficial for their students (Elen et al. 2007). Accordingly, a shift occurred towards studies in which academics’ views took central ground. In recent years, students’ perspectives have increasingly come to the foreground. These studies show mostly advantages, but also disadvantages, of research integration in university teaching (Turner et al. 2008; Visser-Wijnveen et al. 2012).

This paper describes the development and validation of the Student Perception of Research Integration Questionnaire (SPRIQ). The questionnaire was developed to measure students’ perceptions of the integration of research in teaching. A better understanding of the way students perceive research integration in university courses is important for academic staff and program managers who aim to strengthen links between research, teaching, and student learning, as it helps them to identify whether the intentions of staff are consistent with students’ experiences.

Previous studies on students’ perceptions of research integration

Many of the previous studies on student experiences and perceptions of research in teaching used data from interviews and focus groups with students (e.g. Buckley 2011; Jenkins et al. 1998; Lindsay et al. 2002; Neumann 1994; Robertson and Blackler 2006). These studies provided a qualitative understanding of how university students experience the complex nature of the relations between research, teaching, and learning. The various studies showed that students, both undergraduate and postgraduate, perceived benefits as well as challenges when links between research and teaching were emphasized. Perceived benefits included increased motivation and interest in the subject, because of the teacher’s enthusiasm and greater credibility (Jenkins et al. 1998; Robertson and Blackler 2006). Furthermore, classes were considered more challenging and intellectually stimulating, especially when research assignments were given to students; interactions with teachers and researchers, including being part of a research community, were especially valued (Neumann 1994; Robertson and Blackler 2006). Students typically appreciated participation in research; however, being used merely as a workforce for their teachers was considered a risk (Buckley 2011). Other challenges included academic staff prioritising research over teaching, leading, among other things, to reduced availability for students, or limiting the curriculum or a course to the teacher–researchers’ interests (Lindsay et al. 2002; Neumann 1994).

More recently, several studies have used a survey methodology to capture students’ experiences of research integration (Breen and Lindsay 1999; Healey et al. 2010; Spronken-Smith et al. 2014; Turner et al. 2008; Verburgh and Elen 2011). For example, Breen and Lindsay (1999) conducted a survey study to analyse the relations between student motivation and student beliefs about academic research. They distinguished three groups of students: Intrinsic Competent, Extrinsic Social, and Independent Impersonal. The first group consists of students who are intrinsically motivated and feel confident about the course requirements; they highly value the research activities of academic staff. The second group consists of externally motivated students whose lives revolve around social interaction with fellow students and staff. The third group of students prefers to study independently and has no interest in communication with staff. The latter two groups are, respectively, indifferent and hostile to the inclusion of research in teaching. Most survey studies focused on students’ perceptions of positive or negative impacts of research and their awareness of research conducted by academic staff in their department or the university as a whole. In the questionnaire designed by Healey and colleagues (Healey et al. 2010; Turner et al. 2008), which was also used by Spronken-Smith et al. (2014) and adapted by Verburgh and Elen (2011), students were asked to identify whether they had experience with various research activities during their studies, whether they were aware of specific research activities taking place at the university and in their department, and to score statements about the positive and negative influence of these research activities on their learning. The findings were consistent with results from previous studies: students reported largely positive influences of research activities, especially increased understanding of the subject and stimulated interest and enthusiasm, but also negative influences, such as teachers’ lack of interest in teaching and lack of availability (Healey et al. 2010; Spronken-Smith et al. 2014; Turner et al. 2008). Verburgh and Elen (2011) found that the integration of research in the classroom was the most important factor in predicting the appreciation of research aspects in the learning environment, next to awareness of their own lecturers’ research, year of study, awareness of research at the university, and whether the discipline was hard (one dominant paradigm is present) or soft (several coexisting paradigms are present) (cf. Biglan 1973).

The above-mentioned studies gave insight into the occurrence of students’ experiences with individual research activities, such as undertaking an independent project, reading a research paper, and attending a research seminar (Healey et al. 2010; Spronken-Smith et al. 2014; Turner et al. 2008), or captured all of their experiences in one measure, called ‘experienced research integration’ (Verburgh and Elen 2011). However, none of these studies identified students’ perceptions of the different ways in which research can be integrated in teaching. This study addresses that gap by empirically building scales that capture the various distinguishable features within students’ perceptions. Furthermore, this study differs from previous survey studies in focusing on the course (i.e., module or course unit) level.

Tangible and intangible nexus between research and teaching

Neumann (1992) presented a categorization of research and teaching relations within universities based on an interview study with academics. She showed that academics conceive of relations between research and teaching in three distinct ways: (1) a global connection, (2) a tangible connection, and (3) an intangible connection. The global connection describes the nexus at the departmental level and relates to the research programs of the department, which may, to some extent, guide the design of university courses. The tangible and intangible connections describe the relations at the student level. The tangible nexus comprises the clearly visible forms of the integration of research and teaching, such as the transmission of advanced knowledge and results from recent research, and the explicit teaching of research skills and methodology. Neumann (1992) portrayed the intangible connection between research and teaching as related to students developing approaches and dispositions towards knowledge development and research. The intangible nexus groups the more tacit, not directly observable forms of integration of research and teaching, such as creating an inquisitive research climate, fostering an innovative atmosphere, or stimulating the development of students’ research dispositions. Intangible elements have often been identified by teachers and educational researchers as relevant elements of learning to do research, but few researchers (McLean and Barker 2004; Elen et al. 2007; Elen and Verburgh 2008) have addressed the relation between these intangible elements of the research–teaching nexus and students’ experiences of courses.

Model on research and teaching

Healey (2005) described a model that distinguishes two dimensions of curricula related to tangible linkages between research and teaching, namely (1) an emphasis on research products versus an emphasis on research processes and problems, and (2) students as participants versus students as audience (Fig. 1).

Fig. 1 Four modes of the research–teaching nexus (adapted from Healey 2005)

In this model, four quadrants can be distinguished, which are interpreted as four distinct ways of integrating research and teaching in university curricula. Research-led teaching can be characterized as teaching with an emphasis on research products or outcomes, without students engaging in inquiry or research activities. In research-oriented teaching, students have no active role in inquiry either, but the learning objectives focus on research problems and processes rather than research products; in this quadrant students thus focus on learning research methods. In research-based teaching, students actively participate in research or inquiry with an emphasis on research processes and problems. In research-tutored teaching, students also play an active role, for instance, by critically analysing and discussing outcomes of academic research, while teaching is mostly oriented towards research products. To illustrate research-tutored ways of teaching, Healey (2005) used the example of the tutor model from Oxford University.

Although this model provides a framework for constructing and evaluating the research integration in curricula from the perspective of the teaching staff, it is not evident that, in their courses, students experience the dimensions described in this model in a similar way. Therefore, in order to evaluate research integration in learning environments from the perspective of the students, we need to explore the factors that capture students’ perceptions of research integration in university teaching.

Research aims

The present paper describes the development and validation of the Student Perception of Research Integration Questionnaire (SPRIQ). The purpose of this study is twofold. Firstly, our aim is to develop a valid questionnaire that measures students’ perceptions of research integration in courses and that can be used to provide feedback to teachers, educational directors, and educational program managers who work towards strengthening linkages between research, teaching, and student learning in their institutions and teaching. Secondly, our aim is to increase our understanding of student perceptions of research in their learning environment. Thus, we focus on the learning environment from the perspective of the students (the attained curriculum; van den Akker 2003); however, we are aware that there are multiple ways to evaluate learning environments in higher education.

Method

A questionnaire to measure student perceptions of research integration in university courses was constructed in several rounds. The initial item bank contained 79 items, including items related to tangible and intangible aspects (Neumann 1992). Items related to tangible aspects were loosely based on Healey et al. (2010) and Verburgh and Elen (2011). Items related to intangible aspects were loosely based on the Postgraduate Research Evaluation Questionnaire (PREQ; Marsh et al. 2002), which included questions on the integration in the research environment, motivation, and disposition of PhD students. Items related to quality were also based on Marsh et al. (2002), while items about beliefs were largely derived from Verburgh and Elen (2011). Two small pilot studies were conducted in which student feedback was solicited, descriptive statistics were reviewed, and initial factor analyses were performed. A major finding was that the use of very specific activities made the questionnaire less applicable to a variety of courses; therefore, some items were rewritten to capture differences in research methods. This resulted in a provisional instrument of 53 items that was administered to 201 students, spread across 24 courses, in two departments of one research-intensive university. Exploratory factor analysis resulted in nine scales, including seven focusing on aspects of research integration, one on quality, and one on beliefs (van der Rijst et al. 2009). This instrument formed the basis for the set of 40 items used in the current study. The items within the most general scale, ‘attention for research’, were rewritten to specifically address either research products or research processes, since that is one of the central dimensions of Healey’s model (2005; see Fig. 1), resulting in two separate scales. The instrument tested in this study therefore contained 10 scales of 4 items each.

Instrument

The questionnaire consists of three constructs: ‘research integration’, ‘quality of the course’, and ‘beliefs about research integration’. The construct ‘research integration’ can be subdivided into both tangible and intangible themes. The tangible themes were derived from Healey’s above-mentioned model (2005), which consists of two dimensions. This resulted in the subscales focusing on ‘research product’, ‘research process’, and ‘students as participants’, and two subscales on students as audience: ‘current research’ and ‘teacher’s own research’. Thus, five subscales were based on tangible themes. Three other subscales focused on the intangible aspects (Neumann 1992): ‘integration in research community’, ‘motivation for research’, and ‘academic disposition’. The scales ‘quality of the course’ and ‘beliefs about research integration’ were also included because students’ opinions on the quality of a course and their beliefs about the importance of research integration in their education in general could influence their scores on ‘research integration’ (cf. Verburgh and Elen 2011). All items were scored on a 5-point Likert scale: 36 questions were scored on a frequency scale, ranging from very rarely to very frequently, while the four questions of the beliefs scale were scored on an agreement scale, ranging from strongly disagree to strongly agree.
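To make the scoring concrete, the sketch below shows one way subscale scores could be computed as item means from the 5-point Likert responses. This is an illustration under assumed item names and an assumed item-to-subscale mapping, not the authors’ scoring procedure or the actual SPRIQ items (which are listed in the appendix, Table 6).

```python
import pandas as pd

# Hypothetical item-to-subscale mapping: 10 intended (sub)scales of 4 items each.
# Column names are placeholders, not the real SPRIQ item labels.
SUBSCALES = {
    "research_product":  ["prod1", "prod2", "prod3", "prod4"],
    "research_process":  ["proc1", "proc2", "proc3", "proc4"],
    "participation":     ["part1", "part2", "part3", "part4"],
    "current_research":  ["curr1", "curr2", "curr3", "curr4"],
    "own_research":      ["own1",  "own2",  "own3",  "own4"],
    "community":         ["comm1", "comm2", "comm3", "comm4"],
    "motivation":        ["mot1",  "mot2",  "mot3",  "mot4"],
    "disposition":       ["disp1", "disp2", "disp3", "disp4"],
    "quality":           ["qual1", "qual2", "qual3", "qual4"],
    "beliefs":           ["bel1",  "bel2",  "bel3",  "bel4"],
}

def scale_scores(responses: pd.DataFrame) -> pd.DataFrame:
    """Average the 1-5 Likert responses per subscale for each student."""
    return pd.DataFrame({name: responses[items].mean(axis=1)
                         for name, items in SUBSCALES.items()})
```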

Procedure

The questionnaire was distributed during the final class of five undergraduate courses within three faculties (Medicine, Science, and Humanities) of one research-intensive university in The Netherlands. These courses covered all three bachelor years (cf. Spronken-Smith et al. 2014; Verburgh and Elen 2011) and represented both hard and soft disciplines (cf. Biglan 1973); this disciplinary distinction was found to be relevant in Verburgh and Elen’s (2011) study on students’ research appreciation. All students present at the final sessions were asked to complete the questionnaire. A total of 221 students completed the questionnaire. Only those students who completed the full questionnaire were included in the analyses; as a result, the final number of respondents was 208. The courses varied in number of hours, type of classes, and the way they included research in the course. Table 1 presents additional information on the courses.

Table 1 Course descriptions

Analysis

In order to arrive at a model with an acceptable fit, and thus a valid and useful questionnaire, exploratory factor analyses were conducted to explore the proposed model and alternative models. Given the expected relatively high correlations between the factors, an oblique rotation was preferred over an orthogonal rotation: Oblimin with Kaiser normalization was applied as the rotation method. Velicer’s (revised) Minimum Average Partial criterion was used to determine the number of factors. To explore alternative models, only items that loaded at least .50 on a factor were included in further analyses. Additionally, to achieve a more economical questionnaire (i.e., the least possible number of items within a scale), items meeting one of the two following criteria were removed: (1) items with the lowest estimates in the largest scale, if internal consistency permitted, in particular if removal of such items resulted in an increase in Cronbach’s alpha; (2) items for which the covariance modification indices suggested error covariances with another scale. Finally, modification indices were examined to identify any error covariates within scales that would considerably improve the fit of the model.
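As an illustration of this exploratory pipeline, the sketch below determines the number of factors with a minimal implementation of Velicer’s MAP criterion (the revised form uses fourth-power partial correlations) and then runs an exploratory factor analysis with an oblimin rotation, retaining items that load at least .50. The paper does not report the software used; the factor_analyzer package, the file name, and the column names are assumptions, and Kaiser normalization is not applied in this sketch.

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

def velicer_map(R: np.ndarray, power: int = 4) -> int:
    """Velicer's Minimum Average Partial criterion for correlation matrix R.
    power=4 gives the revised criterion; power=2 the original (1976) version.
    Returns the suggested number of factors."""
    p = R.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]                    # descending eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    off_diag = ~np.eye(p, dtype=bool)
    averages = [np.mean(np.abs(R[off_diag]) ** power)]   # zero components retained
    for m in range(1, p):
        loadings = eigvecs[:, :m] * np.sqrt(eigvals[:m])
        partial_cov = R - loadings @ loadings.T          # residual after m components
        d = np.sqrt(np.diag(partial_cov))
        partial_corr = partial_cov / np.outer(d, d)
        averages.append(np.mean(np.abs(partial_corr[off_diag]) ** power))
    return int(np.argmin(averages))                      # index = number of factors

responses = pd.read_csv("spriq_items.csv")               # hypothetical: students x items
n_factors = velicer_map(responses.corr().to_numpy())

# Exploratory factor analysis with an oblique (oblimin) rotation
efa = FactorAnalyzer(n_factors=n_factors, rotation="oblimin", method="minres")
efa.fit(responses)
loadings = pd.DataFrame(efa.loadings_, index=responses.columns)

# Retain only items that load at least .50 on some factor, as described in the text
retained = loadings[(loadings.abs() >= 0.50).any(axis=1)]
print(retained.round(2))
```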

The construct validity was tested by a confirmatory factor analysis. A variety of indices was used to check the fit of the confirmatory factor structures. The first index we used was the ratio of χ² to degrees of freedom and the corresponding p value. The p value must be greater than 0.05 in order to conclude a good fit of the data with the assumed model (Hoyle 1995). The ratio should be less than three according to Hoe (2008), although no universally agreed-upon standard exists. Other indices we used to determine the fit were the goodness-of-fit index (GFI) and the adjusted goodness-of-fit index (AGFI), both indices of covariance matrix reproduction; the Tucker–Lewis index (TLI) and the comparative fit index (CFI), both comparative indices measuring the model against a null model; and the root mean square error of approximation (RMSEA), a parsimony-adjusted measure. GFI and AGFI are more sensitive to model misspecification than TLI and CFI, but are also more downward biased with smaller sample sizes, while RMSEA performed best in terms of model misspecification (Fan et al. 1999). A value equal to or greater than 0.90 is considered a good fit in the case of GFI, AGFI, TLI, and CFI (Hoe 2008; Hoyle 1995). An RMSEA value equal to or less than 0.05 is taken as an indication of a good fit of the data with the assumed model, and a value less than .08 is considered an acceptable fit (Hoe 2008). Other structural measures included the internal consistency, or reliability, of each scale, as measured by Cronbach’s alpha, and the correlations between the scales, as measured by the Pearson correlation coefficient.
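A hedged sketch of how the confirmatory step and the reliability estimates could be computed is given below, using semopy (a lavaan-style SEM package for Python) and a hand-rolled Cronbach’s alpha. The measurement model, item names, and file name are illustrative assumptions only; the paper’s final model additionally lets the four research integration subscales load on an overarching research integration factor.

```python
import pandas as pd
import semopy

# Illustrative, simplified measurement model with correlated first-order factors.
# Item names (refl1, part1, ...) are placeholders, not the actual SPRIQ items.
MODEL_DESC = """
reflection       =~ refl1 + refl2 + refl3 + refl4
participation    =~ part1 + part2 + part3 + part4 + part5
current_research =~ curr1 + curr2 + curr3 + curr4 + curr5
motivation       =~ mot1 + mot2 + mot3 + mot4
quality          =~ qual1 + qual2 + qual3
beliefs          =~ bel1 + bel2 + bel3
"""

data = pd.read_csv("spriq_items.csv")        # hypothetical: respondents x items
model = semopy.Model(MODEL_DESC)
model.fit(data)
fit_stats = semopy.calc_stats(model)          # includes chi2, CFI, TLI, GFI, AGFI, RMSEA
print(fit_stats.T)

def cronbach_alpha(scale_items: pd.DataFrame) -> float:
    """Internal consistency of one (sub)scale: columns are items, rows respondents."""
    k = scale_items.shape[1]
    item_var_sum = scale_items.var(axis=0, ddof=1).sum()
    total_var = scale_items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)

print(cronbach_alpha(data[["refl1", "refl2", "refl3", "refl4"]]))
```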

To examine the content validity, we took a closer look at the five courses included in this study as a first exploration of the questionnaire’s potential to distinguish between courses. To this end, we carried out an ANOVA with Tukey’s b post hoc tests. Additionally, we compared the results of each course with the course content to see whether the different scores could be explained by the different characteristics of the courses.
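A course comparison of this kind could be run roughly as follows. Note that the sketch uses Tukey’s HSD (as available in statsmodels) rather than SPSS’s Tukey’s b procedure, and that the file and column names are assumptions.

```python
import pandas as pd
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical long-format table: one row per student with the course label
# and that student's mean score on a given (sub)scale.
scores = pd.read_csv("spriq_scale_scores.csv")   # columns: course, participation, ...

# One-way ANOVA of the participation subscale across the five courses
groups = [g["participation"].to_numpy() for _, g in scores.groupby("course")]
f_stat, p_value = stats.f_oneway(*groups)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")

# Pairwise post hoc comparisons (Tukey HSD, alpha = .05)
tukey = pairwise_tukeyhsd(endog=scores["participation"],
                          groups=scores["course"], alpha=0.05)
print(tukey.summary())
```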

Results

Structure of the SPRIQ

The results of the confirmatory factor analyses are presented in Table 2 for the original model (40 items) and the final model (24 items). The original model, which consisted of eight four-item subscales all contributing to one research integration scale, plus separate quality and beliefs scales, showed a moderate fit. The exploratory factor analyses clearly identified the separate quality and beliefs scales; however, the eight subscales contributing to a research integration scale were not supported. Instead, the exploratory factor analysis suggested four different subscales. None of the items of either ‘integration in research community’ or ‘academic disposition’ were included in these subscales because of low loadings. After removal of low-loading items and reduction of the number of items in the current research subscale, the scores on the various fit indices reported in Table 2 were attained, indicating an acceptable fit for the final model.

Table 2 Results of the confirmatory factor analysis

The final model includes three scales: research integration, which consists of four subscales, quality (3 items), and beliefs (3 items). The four research integration subscales are as follows: reflection (4 items), participation (5 items), current research (5 items), and motivation (4 items). The subscale reflection includes items focusing on the attention paid to the research process leading to research results. The subscale participation includes items on the involvement of students in, and their contribution to, scientific research. Current research combines items concentrating on getting to know current research, both of their teachers and in general. Motivation consists of items concerning an increase in students’ enthusiasm and interest in the domain. Quality deals with items related to elements deemed important for good quality teaching, and beliefs captures students’ beliefs about the importance of research integration for their learning (Fig. 2).

Fig. 2 Structure of the final model of student perceptions of research integration

In Table 3, we present the Cronbach’s alphas, means, and standard deviations of all (sub)scales. A sample item is given for each (sub)scale. The full questionnaire can be found in the appendix (Table 6), including references to the intended (sub)scale and the final (sub)scale. All alphas are above .80; therefore, the internal consistency of each (sub)scale can be considered good. Means vary from 1.88 (participation) to 3.43 (quality).

Table 3 Characteristics of the (sub)scales in the final questionnaire

The final structural characteristic we present is the Pearson correlation coefficients between the scales (see Table 4). All scales correlate significantly with each other at the .01 level. Relatively high correlations can be found between current research, motivation, and participation (.70, .67, and .64, respectively). Reflection, beliefs, and quality show low to moderate correlations with the other (sub)scales. All these (sub)scales correlate highest with motivation (.47, .45, and .53, respectively), although reflection shows comparable correlations with participation and current research.

Table 4 Pearson correlation coefficient between the (sub)scales

Content of the SPRIQ

Considerably different scores on the various scales were found between the courses. The results of the post hoc tests are presented in Table 5.

Table 5 Comparison of mean scores on the (sub)scales between courses

First, the four subscales that make up the research integration scale will be discussed; next, the scales beliefs and quality will be discussed as additional measures.

The subscale reflection includes items that address the way research results are produced. The courses in Medicine and Languages paid significantly more attention to this aspect than both Technology courses. While Informatics 2 concentrated on the current ‘state of the art’ rather than on methodology, Study methods aimed to introduce aspects of research methods. However, this course hardly discussed research content, which may explain its low score on reflection.

Within the subscale participation, Philology 5 stands out. This corresponds with the teacher’s aim to introduce students to research analysis, including practicing with an authentic research case. Hardly any research participation was expected within Informatics 2, Study methods, and Introduction to medicine, which is reflected by their low scores.

On the subscale current research, Introduction to medicine was scored on the low end, together with Study methods. Both courses concentrated on research methods rather than current research. The ‘state-of-the-art’ character of Informatics 2 resulted in a higher score for this course. The Philology courses were scored highest on this subscale, which is consistent with the aims of these courses.

The subscale motivation relates to students’ increased interest in and motivation for research in the discipline of their course. The scores on this subscale can be divided into three groups, with Introduction to medicine and Study methods at the lower end. In these courses, analytical skills are mainly used as a means to an end and do not necessarily contribute to research, so the increase in motivation for research is limited. Moderate scores were obtained by Informatics 2 and Philology 3, which focus, among other things, on research content. Philology 5 was scored highest on motivation, and in fact on all other (sub)scales, and proved to be the most motivating for research.

The scale beliefs is not course specific; nonetheless, students in Study methods, the only Bachelor 1 course, awarded less importance to research for their learning. The four other courses were scored similarly on this scale (i.e., between 3.14 and 3.67).

The scale quality is intended to measure the overall quality of the course and is harder to evaluate based on the course descriptions. All courses were scored relatively high on quality; however, Introduction to medicine, which was the only large-class, entirely lecture-based course, was scored considerably lower.

Discussion and conclusion

This study aimed to improve our understanding of the way in which university students perceive and experience the research–teaching nexus in specific courses. Furthermore, by developing a questionnaire to measure students’ perceptions, the study aimed to create a tool that can be used by academics, for instance, to explore to what extent the intentions of their courses come across to the students. The SPRIQ was based on the literature about the research–teaching nexus, in particular on the distinction between tangible and intangible aspects of the nexus (cf. Neumann 1992) and on the model by Healey (2005) that distinguishes between outcomes of research and the process of research, on the one hand, and between the role of students as either participants or audience, on the other hand. Initially, eight subscales were designed to capture the integration of research in teaching. The SPRIQ was administered in five undergraduate courses that differed in terms of academic content and in their goals with respect to the research–teaching nexus.

Analysis of the data revealed a factor structure which differed from the intended structure. The scale research integration appeared to consist of four subscales, labelled reflection, participation, current research, and motivation. Reflection, participation, and current research concerned tangible aspects, whereas intangible aspects were apparent in the motivation subscale. In this way, the distinction made by Neumann (1992) was confirmed empirically. Interestingly, two of the envisioned three intangible subscales could not be identified in the students’ responses, nor were any of their items included in other subscales, suggesting that intangible aspects, such as the development of an academic disposition, are hard for students to grasp. Furthermore, Healey’s (2005) dimension ‘students as participants versus audience’ was apparent, in particular in the subscale participation on the one hand and the subscales current research and reflection on the other. The latter concern students’ awareness of the research that is currently done in the course domain, or by their teacher, but not necessarily with a contribution on the part of the students; this is in contrast to participation, in which students’ contributions were required. The other dimension in Healey’s (2005) model, that is, emphasis on the outcomes versus the process of research, did not emerge as separate subscales. Instead, items that were initially grouped in the ‘research product’ or ‘research process’ subscales appeared to be combined in the final subscales, in particular in reflection. In other words, in the way students perceive research integration in their courses, the distinction between the process and the outcomes of research is not fundamental.

Students perceive a number of benefits when research is integrated in teaching. Several of these benefits are included in the four subscales making up the research integration scale. Reflection touches upon a better understanding of the discipline (Neumann 1994; Turner et al. 2008). Current research includes becoming familiar with the teacher’s research, making research and the researchers more real (Neumann 1994). Participation is high on students’ priority lists (Robertson and Blackler 2006; Buckley 2011), and motivation relates to the inspiring role that research-integrated teaching can have (Jenkins et al. 1998; Robertson and Blackler 2006).

In addition to the research integration scale, the SPRIQ contained two other scales, one measuring students’ perceptions of the quality of the course, and the other measuring students’ rating of the importance of research integration for their learning (beliefs). As expected, these scales came out as separate factors; however, both were correlated with research integration and its subscales. Thus, it is advised to include these two factors when investigating perceived research integration in courses. If students evaluate the quality of a course as low, or if they would not value research for their learning, this could negatively affect their scores on the research integration scale.

Indications of the content validity of the SPRIQ can be derived from the specific scores of the five different courses. Given the respective objectives of these courses, it makes sense that Study methods scored relatively low on research integration, in particular on participation. On the other side of the spectrum, it is encouraging to see that Philology 5, which aims to be a particularly research-intensive course, received by far the highest scores on all research integration subscales. Interestingly, the quality of both these courses was rated similarly. Furthermore, even courses that rated comparably on the overall research integration scale, for example, Introduction to medicine and Informatics 2, could be distinguished based on the subscales. While Introduction to medicine received higher scores on reflection, Informatics 2 scored higher on current research and motivation. Using subscales therefore clearly adds to the feedback function of the questionnaire, among other things, compared to combining all different research-related activities into one overall research integration score (cf. Verburgh and Elen 2011) or ticking individual research activities (cf. Healey et al. 2010; Spronken-Smith et al. 2014; Turner et al. 2008).

Additionally, the beliefs scale was scored similarly for all courses, except for the Bachelor 1 course (i.e., Study methods). The literature is ambiguous on the influence of year of study. Some studies suggest that belief in the benefit of integrating research for students learning increases with years of study (cf. Lindsay et al. 2002; Neumann 1994), while Verburgh and Elen (2011) found that first-year students indicated more positive aspects. Our small sample did not aim to answer this unresolved question, but given this finding and the ongoing debate, it is recommended to continue including the beliefs scale in future research.

We conclude that the present study has contributed to our understanding of how students perceive the integration of research in specific courses. The factors motivation, reflection, participation, and current research together capture students’ perceptions of research integration accurately. The SPRIQ, in its present form, is a promising tool to provide information about students’ perceptions to teachers and program managers who aim to strengthen links between research, teaching, and student learning in educational practice. Clearly, more studies, including a variety of disciplines and years of study, are needed to further explore the validity of the SPRIQ. Future research may also use this instrument in large-scale studies to explore differences in students’ perceptions of courses in varying disciplines and years of study (cf. Verburgh and Elen 2011), or relate students’ perceptions of research integration to their learning (cf. Spronken-Smith et al. 2012 for inquiry-based learning).