Defining the correctness of a diagnosis: differential judgments and expert knowledge

  • ORIGINAL PAPER
  • Published in: Advances in Health Sciences Education

Abstract

Approaches that use a simulated patient case to study and assess diagnostic reasoning typically treat the correct diagnosis of the case as the measure of success and as an anchor for other measures. Commonly, the correctness of a diagnosis is determined by the judgment of one or more experts. In this study, the consistency of experts’ judgments of the correctness of a diagnosis, and the structure of knowledge supporting those judgments, were explored using a card sorting task. Seven expert pediatricians were asked to sort into piles the diagnoses proposed by 119 individuals of varying experience levels who had worked through a simulated patient case of Haemophilus influenzae Type B (HIB) meningitis. The experts were asked first to sort the proposed diagnoses by similarity of content, then to order the piles by correctness relative to the known correct diagnosis (HIB meningitis), and finally to judge which piles contained correct or incorrect diagnoses. We found that, contrary to previous studies, the experts shared a common conceptual framework of the diagnostic domain under consideration and were consistent in how they categorized the diagnoses. However, in line with previous studies, the experts differed greatly in their judgments of which diagnoses were correct. These findings have important implications for understanding expert knowledge, for scoring performance on simulated or real patient cases, for providing feedback to learners in the clinical setting, and for establishing criteria that define what is correct in studies of diagnostic error and diagnostic reasoning.
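
The abstract’s finding that the experts “were consistent in how they categorized the diagnoses” is the kind of claim usually quantified with a partition-agreement statistic such as the adjusted Rand index (Hubert & Arabie, 1985). The Python sketch below is not the paper’s reported analysis; the pile labels and example sorts are hypothetical, and it only illustrates how agreement between two experts’ card sorts of the same set of proposed diagnoses could be scored.

    from collections import Counter
    from math import comb

    def adjusted_rand_index(sort_a, sort_b):
        """Agreement between two card sorts of the same diagnoses.

        Each sort is a list of pile labels, one entry per proposed
        diagnosis, so sort_a[i] and sort_b[i] describe the same card.
        Returns 1.0 for identical partitions and ~0.0 for chance-level
        agreement, regardless of how the piles are named.
        """
        assert len(sort_a) == len(sort_b)
        n = len(sort_a)
        # Pair counts within the joint cells, rows, and columns of the
        # contingency table between the two partitions.
        cells = sum(comb(c, 2) for c in Counter(zip(sort_a, sort_b)).values())
        rows = sum(comb(c, 2) for c in Counter(sort_a).values())
        cols = sum(comb(c, 2) for c in Counter(sort_b).values())
        expected = rows * cols / comb(n, 2)
        max_index = (rows + cols) / 2
        if max_index == expected:  # degenerate case: both sorts trivial
            return 1.0
        return (cells - expected) / (max_index - expected)

    # Hypothetical sorts of six proposed diagnoses into labeled piles.
    expert_1 = ["bacterial", "bacterial", "viral", "viral", "other", "other"]
    expert_2 = ["bacterial", "bacterial", "viral", "other", "other", "other"]
    print(adjusted_rand_index(expert_1, expert_2))  # partial agreement, ~0.44

Computed over all pairs of the seven experts, such a score would summarize how far the shared conceptual framework described in the abstract extends, independently of the experts’ later correctness judgments.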

Acknowledgments

We are grateful to the seven expert pediatricians who participated in this study. We also thank the Study Design and Statistical Consultation Service at the University of Pittsburgh’s Office of Clinical Research, Health Sciences for assistance with the statistical aspects of this project. Finally, we extend special recognition to Katie Rossi for her assistance with data management and manuscript preparation.

Author information

Corresponding author

Correspondence to Steven L. Kanter.

Cite this article

Kanter, S.L., Brosenitsch, T.A., Mahoney, J.F. et al. Defining the correctness of a diagnosis: differential judgments and expert knowledge. Adv in Health Sci Educ 15, 65–79 (2010). https://doi.org/10.1007/s10459-009-9168-0
