
From Test Performance to Language Use: Using Self-Assessment to Validate a High-Stakes English Proficiency Test

  • Regular Article
  • Published in: The Asia-Pacific Education Researcher

Abstract

Despite the profusion of studies on the factor structure of language tests, little research has examined the relationship between test performance and language use in target language use (TLU) domains. Using a structural equation modeling (SEM) approach, this study investigated the factor structure of a high-stakes university-based English proficiency test, and then modeled the relationship between test takers’ performance and their ability to use language in TLU domains. A self-assessment (SA) inventory was developed and validated to capture test takers’ ability to use language in those domains. The results showed that a higher-order factor model best fit both the test and the SA data. Structural regression analysis indicated a moderately strong relationship between students’ general English-as-a-foreign-language (EFL) ability and the SA latent factor. These results lend empirical support to the construct and predictive validity of the test, both of which are crucial to its overall validity argument. They also shed light on the utility of SA for capturing language use in TLU domains and for test validation purposes.
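The higher-order structure the abstract describes can be illustrated with a small simulation: a general EFL ability factor drives four first-order skill factors, each of which generates an observed test score and an SA rating. This is a conceptual sketch only, not the authors' SEM analysis; the sample size (n = 244) comes from the article, but all loadings and error variances below are hypothetical.

```python
import numpy as np

# Hypothetical higher-order factor structure: a general EFL ability factor
# (g) drives four first-order skill factors, each measured by a test score
# and a self-assessment (SA) rating. All loadings here are illustrative.
rng = np.random.default_rng(42)
n = 244  # sample size reported in the article

g = rng.normal(size=n)  # general EFL ability (second-order factor)
skills = {s: 0.8 * g + 0.6 * rng.normal(size=n)
          for s in ("listening", "reading", "writing", "speaking")}

# Observed indicators: each skill factor generates a test score and an SA
# rating, each contaminated by its own measurement error.
test = {s: 0.7 * f + 0.7 * rng.normal(size=n) for s, f in skills.items()}
sa = {s: 0.7 * f + 0.7 * rng.normal(size=n) for s, f in skills.items()}

test_total = sum(test.values())
sa_total = sum(sa.values())
r = np.corrcoef(test_total, sa_total)[0, 1]
print(f"test-SA correlation: {r:.2f}")
```

Because the shared variance flows through the general factor, the composite test and SA scores correlate moderately strongly even though every indicator carries error, mirroring the pattern of results the abstract reports.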



Acknowledgements

The preparation of this manuscript was supported by the National Planning Office for Philosophy and Social Sciences (NPOPSS) of the People’s Republic of China under the project “The Development and Validation of Standards in Language Testing” (Grant Number: 13CYY032). We would like to thank the two anonymous reviewers for their insightful comments on an earlier draft of this article.

Author information

Correspondence to Jinsong Fan.

Appendices

Appendix 1: English translation of can-do statements in the SA inventory

Listening

  1. Understand short daily conversations.
  2. Understand extended daily conversations.
  3. Understand the main idea of English lectures which do not involve much subject knowledge.
  4. Understand the details of English lectures which do not involve much subject knowledge.
  5. Take necessary notes while attending English lectures.
  6. Understand English news broadcasts such as VOA and BBC.

Reading

  1. Read everyday English materials, such as travel brochures or operation manuals, without the assistance of a dictionary.
  2. Read news and current affairs reports in English newspapers.
  3. Read academic books or articles published in my field.
  4. Understand the author’s views or attitudes toward a particular topic.
  5. Locate the information that I need through expeditious reading.
  6. Compare and analyze different views or opinions in an English article.
  7. Understand the meaning of words based on contextual information.

Writing

  1. Write emails and short letters in English.
  2. Write resumes and personal statements in English.
  3. Use and describe graphs or charts in my English writing.
  4. Write extended comments on a particular subject or topic.
  5. State my views or opinions on a particular phenomenon.
  6. Support or refute a particular view, attitude, or plan in my English writing.
  7. Synthesize views or opinions from different input sources.

Speaking

  1. Speak on everyday topics.
  2. Present my views or opinions on a particular topic in English.
  3. Discuss a particular topic with others in English.
  4. Conduct basic English–Chinese interpretation.
  5. Express my views or opinions on topics in my field.
  6. Use pronunciation and intonation to convey meaning effectively.
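The four subscales above comprise 26 can-do statements (6 listening, 7 reading, 7 writing, 6 speaking). A minimal scoring sketch is given below; it assumes a 5-point Likert response format and subscale means as scores, neither of which is specified in this excerpt, so both are illustrative assumptions.

```python
# Sketch: scoring the SA inventory by skill subscale. Assumes a 5-point
# Likert scale (1 = "cannot do at all" .. 5 = "can do easily"); the actual
# response format is not given in this excerpt.
SUBSCALES = {"listening": 6, "reading": 7, "writing": 7, "speaking": 6}

def score_inventory(responses):
    """responses: dict mapping skill -> list of item ratings (1-5).
    Returns the mean rating per subscale."""
    scores = {}
    for skill, n_items in SUBSCALES.items():
        ratings = responses[skill]
        assert len(ratings) == n_items, f"{skill} expects {n_items} items"
        scores[skill] = sum(ratings) / n_items  # subscale mean
    return scores

# Hypothetical respondent (ratings invented for illustration)
example = {
    "listening": [4, 3, 3, 2, 3, 2],
    "reading":   [4, 4, 3, 3, 3, 2, 4],
    "writing":   [4, 3, 3, 3, 3, 2, 3],
    "speaking":  [3, 3, 3, 2, 3, 3],
}
print(score_inventory(example))
```

Subscale means keep the four skills on a common 1-5 metric, which is convenient if the subscales are later used as indicators of a latent SA factor.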

Appendix 2

Fig. 5 Factor loadings of the higher-order factor model (test data, n = 244)

Appendix 3

Fig. 6 Factor loadings of the higher-order factor model (SA data, n = 244)

Cite this article

Fan, J., Yan, X. From Test Performance to Language Use: Using Self-Assessment to Validate a High-Stakes English Proficiency Test. Asia-Pacific Edu Res 26, 61–73 (2017). https://doi.org/10.1007/s40299-017-0327-4