Psychometrics

Abstract

Psychometrics is the study of the measurement of educational and psychological characteristics such as abilities, aptitudes, achievement, personality traits and knowledge (Everitt, 2006). Psychometric methods address challenges and problems arising in these measurements.

References

  • Adams, R. J., Wilson, M., & Wang, W. (1997). The multidimensional random coefficients multinomial logit model. Applied Psychological Measurement, 21(1), 1–23.

  • Adams, R. J., Wilson, M., & Wu, M. (1997). Multilevel item response models: An approach to errors in variables regression. Journal of Educational and Behavioral Statistics, 22(1), 47–76.

  • American Educational Research Association (AERA), American Psychological Association (APA), and National Council for Measurement in Education (NCME). (1999). Standards for psychological and educational tests. Washington D.C.: AERA, APA, and NCME.

  • Baker, F. B., & Kim, S.-H. (2004). Item response theory: Parameter estimation techniques (2nd ed.). New York: Dekker.

  • Banta, T. W., Lund, J. P., Black, K. E., & Oblander, F. W. (1996). Assessment in practice: Putting principles to work on college campuses. San Francisco: Jossey-Bass.

  • Baxter, J. (1995). Children’s understanding of astronomy and the earth sciences. In S. M. Glynn & R. Duit (Eds.), Learning science in the schools: Research reforming practice (pp. 155–177). Mahwah, NJ: Lawrence Erlbaum Associates.

  • Black, P., Wilson, M., & Yao, S. (2011). Road maps for learning: A guide to the navigation of learning progressions. Measurement: Interdisciplinary Research and Perspectives, 9, 1–52.

  • Bradlow, E. T., Wainer, H., & Wang, X. (1999). A Bayesian random effects model for testlets. Psychometrika, 64, 153–168.

  • Brennan, R. L. (2006). Perspectives on the evolution and future of educational measurement. In R. L. Brennan (Ed.), Educational measurement (4th ed.). Westport, CT: Praeger.

  • Briggs, D., Alonzo, A., Schwab, C., & Wilson, M. (2006). Diagnostic assessment with ordered multiple-choice items. Educational Assessment, 11(1), 33–63.

  • Brown, W. (1910). Some experimental results in the correlation of mental abilities. British Journal of Psychology, 3, 296–322.

  • Campbell, N. R. (1928). An account of the principles of measurement and calculation. London: Longmans, Green & Co.

  • Claesgens, J., Scalise, K., Wilson, M., & Stacy, A. (2009). Mapping student understanding in chemistry: The perspectives of chemists. Science Education, 93(1), 56–85.

  • Cooke, L. (2006). Is the mouse a poor man’s eye tracker? Proceedings of the Society for Technical Communication Conference. Arlington, VA: STC, 252–255.

  • Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.

  • Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: John Wiley.

  • Dahlgren, L. O. (1984a). Outcomes of learning. In F. Marton, D. Hounsell & N. Entwistle (Eds.), The experience of learning. Edinburgh: Scottish Academic Press.

  • De Ayala, R. J. (2009). The theory and practice of item response theory. New York: The Guilford Press.

  • De Boeck, P., & Wilson, M. (Eds.). (2004). Explanatory item response models: A generalized linear and nonlinear approach. New York: Springer-Verlag.

  • Draney, K., & Wilson, M. (2009). Selecting cut scores with a composite of item types: The Construct Mapping procedure. In E. V. Smith, & G. E. Stone (Eds.), Criterion-referenced testing: Practice analysis to score reporting using Rasch measurement (pp. 276–293). Maple Grove, MN: JAM Press.

  • Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum & Associates.

  • Everitt, B. S. (2010). Cambridge dictionary of statistics (3rd ed.). Cambridge: Cambridge University Press.

  • Galton, F. (1883). Inquiries into human faculty and its development. New York: AMS Press.

  • Guttman, L. (1944). A basis for scaling qualitative data. American Sociological Review, 9, 139–150.

  • Guttman, L. A. (1950). The basis for scalogram analysis. In S. A. Stouffer, L. A. Guttman, F. A. Suchman, P. F. Lazarsfeld, S. A. Star, & J. A. Clausen (Eds.), Studies in social psychology in world war two, vol. 4. Measurement and prediction. Princeton: Princeton University Press.

  • Holland, P. W., & Hoskens, M. (2003). Classical test theory as a first-order item response theory: Application to true-score prediction from a possibly-nonparallel test. Psychometrika, 68, 123–149.

  • Ivie, J. L., & Embretson, S. E. (2010). Cognitive process modeling of spatial ability: The assembling objects task. Intelligence, 38(3), 324–335.

  • Janssen, R., Schepers, J., & Peres, D. (2004). Models with item and item-group predictors. In P. De Boeck & M. Wilson (Eds.), Explanatory item response models: A generalized linear and nonlinear approach. New York: Springer-Verlag.

  • Kakkonen, T., Myller, N., Sutinen, E., & Timonen, J. (2008). Comparison of dimension reduction methods for automated essay grading. Educational Technology & Society, 11(3), 275–288.

  • Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed.). Westport, CT: Praeger.

  • Kofsky, E. (1966). A scalogram study of classificatory development. Child Development, 37, 191–204.

  • Kuder, G. F., & Richardson, M. W. (1937). The theory of the estimation of test reliability. Psychometrika, 2, 151–160.

  • Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 140, 52.

  • Longford, N. T., Holland, P. W., & Thayer, D. T. (1993). Stability of the MH D-DIF statistics across populations. In P. W. Holland & H. Wainer (Eds.), Differential item functioning. Hillsdale, NJ: Lawrence Erlbaum.

  • Luce, R. D., & Tukey, J. W. (1964). Simultaneous conjoint measurement: A new type of fundamental measurement. Journal of Mathematical Psychology, 1, 1–27.

  • Magidson, J., & Vermunt, J. K. (2002). A nontechnical introduction to latent class models. Statistical innovations white paper No. 1. Available at: www.statisticalinnovations.com/articles/articles.html.

  • Marton, F. (1981). Phenomenography: Describing conceptions of the world around us. Instructional Science, 10(2), 177–200.

  • Masters, G. N., Adams, R. J., & Wilson, M. (1990). Charting of student progress. In T. Husen & T. N. Postlethwaite (Eds.), International encyclopedia of education: Research and studies. Supplementary, Volume 2 (pp. 628–634). Oxford: Pergamon Press.

  • Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed.). New York: American Council on Education and Macmillan.

  • Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13–23.

  • Meulders, M., & Xie, Y. (2004). Person-by-item predictors. In P. De Boeck & M. Wilson (Eds.), Explanatory item response models: A generalized linear and nonlinear approach. New York: Springer-Verlag.

  • Michell, J. (1990). An introduction to the logic of psychological measurement. Hillsdale, NJ: Lawrence Erlbaum Associates.

  • Mislevy, R. J., Wilson, M., Ercikan, K., & Chudowsky, N. (2003). Psychometric principles in student assessment. In T. Kellaghan & D. L. Stufflebeam (Eds.), International handbook of educational evaluation. Dordrecht, The Netherlands: Kluwer Academic Press.

  • National Research Council. (2001). Knowing what students know: The science and design of educational assessment (Committee on the Foundations of Assessment. J. Pellegrino, N. Chudowsky, & R. Glaser, (Eds.), Division on behavioural and social sciences and education). Washington, DC: National Academy Press.

  • National Research Council. (2008). Early childhood assessment: Why, what, and how? Committee on Developmental Outcomes and Assessments for Young Children, Catherine E. Snow & Susan B. Van Hemel, (Eds.), Board on children, youth and families, board on testing and assessment, division of behavioral and social sciences and education. Washington, DC: The National Academies Press.

  • Nisbet, R. J., Elder, J., & Miner, G. D. (2009). Handbook of statistical analysis and data mining applications. Academic Press.

  • Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw Hill.

  • Paek, I. (2002). Investigation of differential item functioning: Comparisons among approaches, and extension to a multidimensional context. Unpublished doctoral dissertation, University of California, Berkeley.

  • Ramsden, P., Masters, G., Stephanou, A., Walsh, E., Martin, E., Laurillard, D., & Marton, F. (1993). Phenomenographic research and the measurement of understanding: An investigation of students’ conceptions of speed, distance, and time. International Journal of Educational Research, 19(3), 301–316.

  • Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen, Denmark: Danmarks Paedagogiske Institut.

  • Saal, F. E., Downey, R. G., & Lahey, M. A. (1980). Rating the ratings: Assessing the psychometric quality of rating data. Psychological Bulletin, 88(2), 413–428.

  • Scalise, K., & Wilson, M. (2011). The nature of assessment systems to support effective use of evidence through technology. E-Learning and Digital Media, 8(2), 121–132.

  • Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized latent variable modeling: Multilevel, longitudinal and structural equation models. Boca Raton, FL: Chapman & Hall/CRC.

  • Shavelson, R. J., Webb, N. M., & Rowley, G. L. (1989). Generalizability theory. American Psychologist, 44, 922–932.

  • Spearman, C. C. (1904). The proof and measurement of association between two things. American Journal of Psychology, 15, 72–101.

  • Spearman, C. C. (1910). Correlation calculated from faulty data. British Journal of Psychology, 3, 271–295.

  • Takane, Y. (2007). Applications of multidimensional scaling in psychometrics. In C. R. Rao, & S. Sinharay (Eds.), Handbook of statistics, Vol. 26: Psychometrics. Amsterdam: Elsevier.

  • Thorburn, W. M. (1918). The myth of Occam’s razor. Mind, 27(107), 345–353.

  • van der Linden, W. (1992). Fundamental measurement and the fundamentals of Rasch measurement. In M. Wilson (Ed.), Objective measurement: Theory into practice Vol. 2. Norwood, NJ: Ablex Publishing Corp.

  • van der Linden, W. J., & Hambleton, R. K. (Eds.) (1997). Handbook of modern item response theory. New York: Springer.

  • Vosniadou, S., & Brewer, W. F. (1994). Mental models of the day/night cycle. Cognitive Science, 18, 123–183.

  • Wang, W.-C., & Wilson, M. (2005). The Rasch testlet model. Applied Psychological Measurement, 29, 126–149.

  • Wiliam, D. (2011). Embedded formative assessment. Bloomington, IN: Solution Tree Press.

  • Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, NJ: Lawrence Erlbaum Associates.

  • Wilson, M. (2009). Measuring progressions: Assessment structures underlying a learning progression. Journal of Research in Science Teaching, 46(6), 716–730.

  • Wilson, M., & Adams, R. J. (1995). Rasch models for item bundles. Psychometrika, 60(2), 181–198.

  • Wilson, M., & Draney, K. (2002). A technique for setting standards and maintaining them over time. In S. Nishisato, Y. Baba, H. Bozdogan, & K. Kanefugi (Eds.), Measurement and multivariate analysis (Proceedings of the International Conference on Measurement and Multivariate Analysis, Banff, Canada, May 12–14, 2000), pp. 325–332. Tokyo: Springer-Verlag.

  • Wright, B. D. (1968). Sample-free test calibration and person measurement. Proceedings of the 1967 invitational conference on testing (pp. 85–101). Princeton, NJ: Educational Testing Service.

  • Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14, 97–116.

  • Wright, B. D., & Stone, M. H. (1979). Best test design. Chicago: MESA Press.

Copyright information

© 2013 Sense Publishers

About this chapter

Cite this chapter

Wilson, M., Gochyyev, P. (2013). Psychometrics. In: Teo, T. (eds) Handbook of Quantitative Methods for Educational Research. SensePublishers, Rotterdam. https://doi.org/10.1007/978-94-6209-404-8_1
