Teachers’ involvement in high-stakes language assessment reforms: The case of Test for English Majors (TEM) in China

https://doi.org/10.1016/j.stueduc.2020.100898

Highlights

  • Teachers had generally positive perceptions of the new Test for English Majors Grade 4 (TEM4).

  • Teachers commented favourably on the various reform measures taken by the TEM Committee.

  • Teachers’ views on the TEM4 reform mostly converged with those of the TEM Committee representatives.

Abstract

Teachers are important stakeholders in language assessment, and their perceptions of language test design and implementation are key factors influencing test impact. This study investigated how English teachers perceived the recent reform of the Test for English Majors Grade 4 (TEM4), a high-stakes English proficiency test targeting English-major undergraduates in China. Furthermore, this study explored the convergence and divergence between teachers’ perceptions of the reform and the intentions of the TEM Committee. The findings indicate that teachers in general commented favourably on the revised TEM4 and felt that most of the reform measures were appropriate; however, some divergence between teachers’ perceptions and the intentions of the test developer was uncovered, particularly in relation to the listening section. This study underscores the importance of soliciting teachers’ perspectives on high-stakes assessment reforms and signals the need for building more effective communication mechanisms between language assessment providers and stakeholders.

Introduction

It is widely recognised that the intended purposes of high-stakes language assessments often extend beyond measuring language abilities, to include promoting positive changes in teaching and learning practices (Cheng, 2005, 2008; Wall, 2012), and improvements in curriculum design (Frederiksen & Collins, 1989; Green, 2007). In mainland China, where large-scale, standardised English tests operate as gatekeeping mechanisms in various high-stakes contexts, including tertiary education, employment, and the allocation of residency permits in certain cities (e.g., Garner & Huang, 2014; Zou, 2012), language testing regimes function as powerful policy instruments for driving changes in practice and curricula, and have been used to promote wide-ranging, centrally driven educational reforms across diverse provinces and local contexts (e.g., Jin & Fan, 2011; Jin, 2014).

The Test for English Majors Grade 4 (TEM4), which is the focus of the current paper, is one example of a high-stakes test used for these broader purposes in China. TEM4 was first introduced in 1992, as part of China’s strategy to increase the number of students graduating from Chinese universities with proficiency in English, and is now administered to over a quarter of a million English-major undergraduates annually. A recently revised version of the test, with a new format and content, was introduced in 2016, with the intention of promoting positive changes to curricula and English teaching and learning practices in Chinese universities (Zou & Xu, 2017). The revised TEM4 was largely aimed at better fostering the development of students’ ability to use English in communicative contexts, by strengthening integrated writing ability (e.g., through listen-to-write tasks), among other things. Changes were prompted by perceptions that the previous version had overly promoted the development of discrete components of linguistic knowledge over communication skills (Zou, 2012). Whether or not the recent changes that were made to the TEM4 will yield the intended positive English language learning outcomes across Chinese universities will depend, to a large degree, on how teachers respond to the reforms. The importance of teacher perceptions in shaping the consequences of large-scale, high-stakes language assessment reforms is well documented in China (e.g., Gu, 2007; Qi, 2004) and elsewhere (e.g., Alderson, Clapham, & Wall, 1995; Cheng, Watanabe, & Curtis, 2004; East, 2015; Norris, 2008; Winke, 2011). As Winke (2011) notes, teachers offer a unique insider perspective on testing practices, ‘they administer tests, know their students and can see how the testing affects them, and they recognize – sometimes even decide – how the tests affect what is taught’ (p. 633).

Nonetheless, in highly centralised testing systems, such as in China, language assessments and assessment reforms are typically implemented top-down (Jin & Fan, 2011), with little input from key stakeholder groups like teachers, who, along with students, tend to be placed at the weaker end of the power relations in language testing processes (Shohamy, 2001). Consequently, in the context of the recent TEM4 reform process, little is yet known about how English language teachers in Chinese universities perceive and evaluate the revised version of the test, and thus it remains unclear whether the test revisions are likely to deliver the intended improvements to university-level English language teaching and learning. This study aims to address this gap by investigating how English teachers at universities in China perceive and evaluate the new TEM4, particularly the changes made to the test format and content. We also examine the perspectives of representatives of the TEM Committee, who were responsible for overseeing the design and implementation of TEM4 and its recent suite of reforms. In comparing the perceptions of teachers and language testers, our study further contributes to the broader need for improved dialogue between these two stakeholder groups, particularly in the Chinese context.

Section snippets

Teachers’ involvement in language assessment

Teachers are important stakeholders in language assessment (Alderson et al., 1995; Bachman & Palmer, 1996; Kim & Isaacs, 2018); very limited research, however, has invited teachers to take a critical stance on language assessment or assessment practices. Three exceptions are East (2015), So (2014), and Winke (2011), each of which will be briefly reviewed next.

Winke (2011) surveyed the views of 267 teachers and test administrators on the English Language Proficiency Assessment (ELPA), an

Research questions

As discussed above, teachers’ perceptions of high-stakes testing play a key role in shaping the consequences of testing for teaching practices and outcomes. In order to explore the likely impact of the most recent changes to the TEM4 on English teaching and learning in universities in China, it is thus crucial to gain insights into how tertiary-level teachers evaluate the revisions made to the test, and to identify whether and how well their perceptions of the new test format and content align with the

Method

This study employed a mixed-methods sequential explanatory design (e.g., Creswell, 2013; Ivankova, Creswell, & Stick, 2006), which comprised a quantitative component and a follow-up qualitative component. Questionnaires were distributed to English teachers to explore their views on the new TEM4 and the reform measures. This was followed by a qualitative study, which included one-on-one interviews with both English teachers and members of the TEM Committee.

EFA and reliability analysis

Principal axis factoring with oblimin rotation was performed on the 22 items in Part I of the questionnaire. Oblimin rotation was employed to enhance the interpretability of factor solutions because this part of the questionnaire was designed to measure teachers’ perceptions of the new TEM4 and its dimensions should be correlated (e.g., Fan & Ji, 2014). If the correlations between the factors were found to be low or negligible, the same procedures were repeated, using varimax rotation, which

Discussion and conclusions

Teachers play a significant role in the development, implementation and reform of language assessments (Kim & Isaacs, 2018). Eliciting teachers’ views on assessment practices offers ‘a window into their own beliefs and understandings about effective language pedagogy and, following from that, what, in their eyes, constitutes effective assessment’ (East, 2015, p. 103). In this study, we explored how English teachers in China evaluated the recent reform of TEM4 – a large-scale high-stakes English

References (43)

  • AERA et al. Standards for educational and psychological testing (2014)
  • J.C. Alderson et al. Language test construction and evaluation (1995)
  • L.F. Bachman et al. Language assessment in practice: Designing and developing useful language tests (1996)
  • L.F. Bachman et al. Language assessment in practice: Developing language assessments and justifying their use in the real world (2010)
  • C.A. Chapelle et al. Building a validity argument for the Test of English as a Foreign Language (2008)
  • L. Cheng. Changing language teaching through language testing: A washback study (2005)
  • L. Cheng. Washback, impact and consequences
  • J.W. Creswell. Research design: Qualitative, quantitative, and mixed methods approaches (2013)
  • A. Cumming. Assessing integrated writing tasks for academic purposes: Promises and perils. Language Assessment Quarterly (2013)
  • M. East. Coming to terms with innovative high-stakes assessment practice: Teachers’ viewpoints on assessment reform. Language Testing (2015)
  • J. Fan. Development and validation of standards in language testing (2018)
  • J. Fan et al. Test candidates’ attitudes and their test performance: The case of the Fudan English Test. University of Sydney Papers in TESOL (2014)
  • J. Fan et al. A survey of English language testing practice in China: The case of six examination boards. Language Testing in Asia (2013)
  • A. Field. Discover statistics using SPSS (2009)
  • J.R. Frederiksen et al. A systems approach to educational testing. Educational Researcher (1989)
  • M. Garner et al. Testing a nation: The social and educational impact of the College English Test in China (2014)
  • A. Green. IELTS washback in context (2007)
  • X. Gu. Positive or negative—An empirical study of CET washback (2007)
  • IBM. IBM SPSS statistics version 21 (2012)
  • N.V. Ivankova et al. Using mixed-methods sequential explanatory design: From theory to practice. Field Methods (2006)