Abstract
Observed-score equating using the marginal distributions of two tests is not necessarily the universally best approach it has been claimed to be. On the other hand, equating using the conditional distributions given the ability level of the examinee is theoretically ideal. Possible ways of dealing with the requirement of known ability are discussed, including such methods as conditional observed-score equating at point estimates or posterior expected conditional equating. The methods are generalized to the problem of observed-score equating with a multivariate ability structure underlying the scores.
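To make the idea of conditional observed-score equating at a point estimate concrete, the following is a minimal sketch and not the article's own procedure. It assumes a Rasch model for both tests, treats the supplied ability value as a point estimate, and uses a crude discrete equipercentile rule (the smallest score on the second test whose conditional cumulative probability reaches that of the given score on the first test) without any continuization. The function names, the item difficulties, and the tolerance are hypothetical choices made for illustration.

```python
import math


def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def score_distribution(theta, difficulties):
    """Conditional number-correct distribution given theta.

    Built item by item: each item either leaves the score unchanged
    (probability 1 - p) or raises it by one (probability p).
    """
    dist = [1.0]
    for b in difficulties:
        p = rasch_prob(theta, b)
        new = [0.0] * (len(dist) + 1)
        for score, mass in enumerate(dist):
            new[score] += mass * (1.0 - p)
            new[score + 1] += mass * p
        dist = new
    return dist


def conditional_equate(x, theta, diffs_x, diffs_y):
    """Discrete conditional equipercentile equating at a fixed theta.

    Returns the smallest score y on test Y whose conditional cumulative
    probability is at least that of score x on test X. This discrete rule
    is an illustrative simplification, not a definition from the article.
    """
    fx = score_distribution(theta, diffs_x)
    fy = score_distribution(theta, diffs_y)
    target = sum(fx[: x + 1])
    cum = 0.0
    for y, mass in enumerate(fy):
        cum += mass
        if cum >= target - 1e-12:  # tolerance for floating-point round-off
            return y
    return len(fy) - 1


if __name__ == "__main__":
    theta_hat = 0.0            # hypothetical ability point estimate
    test_x = [0.0, 0.0, 0.0]   # hypothetical item difficulties, test X
    test_y = [2.0, 2.0, 2.0]   # hypothetical, harder test Y
    for x in range(len(test_x) + 1):
        print(x, conditional_equate(x, theta_hat, test_x, test_y))
```

Equating a test to itself returns every score unchanged under this rule, and a harder second test never maps a score to a higher value. A posterior expected version would average such conditional transformations over the posterior distribution of ability rather than evaluate them at a single estimate.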
Additional information
This article is based on the author's Presidential Address given on July 7, 2000 at the 65th Annual Meeting of the Psychometric Society held at the University of British Columbia, Vancouver, Canada.
The author is most indebted to Wim M.M. Tielen for his computational assistance and Cees A.W. Glas for his comments on a draft of this paper.
Cite this article
van der Linden, W.J. A test-theoretic approach to observed-score equating. Psychometrika 65, 437–456 (2000). https://doi.org/10.1007/BF02296337