Abstract
A method of IRT observed-score equating that uses chain equating through a third test, without equating coefficients, is presented under the assumption of the three-parameter logistic model. The asymptotic standard errors of the equated scores obtained by this method are derived using the results given by M. Liou and P.E. Cheng. The asymptotic standard errors of the IRT observed-score equating method that uses a synthetic examinee group with equating coefficients, a method currently in use, are also provided. Numerical examples show that the standard errors of these observed-score equating methods are similar to those of the corresponding true-score equating methods, except in the low-score range.
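As background for the abstract, the following is a minimal sketch of how a model-implied number-correct score distribution can be computed under the three-parameter logistic model, using the recursion associated with Lord and Wingersky (1984). It is not the article's chain-equating procedure or its standard-error derivation. The function names, the scaling constant D = 1.7, and the discrete ability distribution (nodes and weights) are all assumptions made for illustration.

```python
import math


def p3pl(theta, a, b, c, D=1.7):
    """Three-parameter logistic probability of a correct response.

    D = 1.7 is an assumed scaling constant, not taken from the article.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def score_dist_given_theta(theta, items):
    """Number-correct score distribution at a fixed ability (recursion over items).

    `items` is a list of (a, b, c) tuples; the result has len(items) + 1 entries.
    """
    dist = [1.0]  # with zero items, score 0 has probability 1
    for a, b, c in items:
        p = p3pl(theta, a, b, c)
        new = [0.0] * (len(dist) + 1)
        for x, f in enumerate(dist):
            new[x] += f * (1.0 - p)  # item answered incorrectly
            new[x + 1] += f * p      # item answered correctly
        dist = new
    return dist


def marginal_score_dist(items, nodes, weights):
    """Weighted average of conditional distributions over a discrete ability
    distribution (hypothetical nodes and weights; weights should sum to 1)."""
    out = [0.0] * (len(items) + 1)
    for theta, w in zip(nodes, weights):
        for x, f in enumerate(score_dist_given_theta(theta, items)):
            out[x] += w * f
    return out
```

In an observed-score equating, distributions of this kind for two tests would then be matched by equipercentile methods; that step and the asymptotic standard errors are omitted here.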
References
Angoff, W.H. (1971). Scales, norms, and equivalent scores. In R.L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 508–600). Washington, DC: American Council on Education.
Bahadur, R.R. (1966). A note on quantiles in large samples. Annals of Mathematical Statistics, 37, 577–580.
Bentler, P.M., & Dudgeon, P. (1996). Covariance structure analysis: Statistical practice, theory, and directions. Annual Review of Psychology, 47, 563–592.
Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443–459.
Bock, R.D., & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179–197.
Braun, H.I., & Holland, P.W. (1982). Observed-score test equating: A mathematical analysis of some ETS equating procedures. In P.W. Holland & D.B. Rubin (Eds.), Test equating (pp. 9–49). New York, NY: Academic Press.
Cox, D.R. (1961). Tests of separate families of hypotheses. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1, 105–123.
Ghosh, J.K. (1971). A new proof of the Bahadur representation of quantiles and an application. Annals of Mathematical Statistics, 42, 1957–1961.
Han, T., Kolen, M.J., & Pohlmann, J. (1997). A comparison among IRT true- and observed-score equatings and traditional equipercentile equating. Applied Measurement in Education, 10, 105–121.
Kolen, M.J. (1981). Comparison of traditional and item response theory methods for equating tests. Journal of Educational Measurement, 18, 1–11.
Kolen, M.J., & Brennan, R.L. (1995). Test equating: Methods and practices. New York, NY: Springer.
Liou, M., & Cheng, P.E. (1995). Asymptotic standard error of equipercentile equating. Journal of Educational and Behavioral Statistics, 20, 259–286.
Liou, M., Cheng, P.E., & Johnson, E. (1997). Standard errors of the kernel equating methods under the common-item design. Applied Psychological Measurement, 21, 349–369.
Lord, F.M. (1977). Practical applications of item characteristic curve theory. Journal of Educational Measurement, 14, 117–138.
Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
Lord, F.M. (1982a). Item response theory and equating: A technical summary. In P.W. Holland & D.B. Rubin (Eds.), Test equating (pp. 141–148). New York, NY: Academic Press.
Lord, F.M. (1982b). Standard errors of an equating by item response theory. Applied Psychological Measurement, 6, 463–472.
Lord, F.M. (1982c). The standard error of equipercentile equating. Journal of Educational Statistics, 7, 165–174.
Lord, F.M., & Wingersky, M.S. (1984). Comparison of IRT true-score and equipercentile observed-score “equatings”. Applied Psychological Measurement, 8, 453–461.
Loyd, B.H., & Hoover, H.D. (1980). Vertical equating using the Rasch model. Journal of Educational Measurement, 17, 179–193.
Ogasawara, H. (2000). Asymptotic standard errors of IRT equating coefficients using moments. Economic Review (Otaru University of Commerce), 51(1), 1–23.
Ogasawara, H. (2001a). Standard errors of item response theory equating/linking by response function methods. Applied Psychological Measurement, 25, 53–67.
Ogasawara, H. (2001b). Item response theory true score equatings and their standard errors. Journal of Educational and Behavioral Statistics, 26, 31–50.
Rubin, D.B. (1982). Discussion of “Observed-score test equating: A mathematical analysis of some ETS equating procedures”. In P.W. Holland & D.B. Rubin (Eds.), Test equating (pp. 51–54). New York, NY: Academic Press.
Stocking, M.L., & Lord, F.M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7, 201–210.
Tsai, T.-H., Hanson, B.A., Kolen, M.J., & Forsyth, R.A. (2001). A comparison of bootstrap standard errors of IRT equating methods for the common item nonequivalent groups design. Applied Measurement in Education, 14, 17–30.
van der Linden, W.J. (2000). A test-theoretic approach to observed-score equating. Psychometrika, 65, 437–456.
van der Linden, W.J., & Luecht, R.M. (1998). Observed-score equating as a test assembly problem. Psychometrika, 63, 401–418.
Zeng, L., & Kolen, M.J. (1995). An alternative approach for IRT observed-score equating of number-correct scores. Applied Psychological Measurement, 19, 231–241.
Additional information
The author is indebted to Michael J. Kolen for access to the real data used in this article and to the anonymous reviewers for their corrections and suggestions on this work.
Cite this article
Ogasawara, H. Asymptotic standard errors of IRT observed-score equating methods. Psychometrika 68, 193–211 (2003). https://doi.org/10.1007/BF02294797
DOI: https://doi.org/10.1007/BF02294797