Variabilität und Generalisierbarkeit von Ratings zur Qualität von Mathematikunterricht zwischen und innerhalb von Unterrichtsstunden

Jentsch, Armin; Casale, Gino; Schlesinger, Lena; Kaiser, Gabriele; König, Johannes; Blömeke, Sigrid

doi:10.1007/s42010-019-00061-8

Variability and generalizability of ratings on the quality of mathematics instruction between and within lessons

General Section
Published: 08 November 2019

Unterrichtswissenschaft, Volume 48, pages 179–197 (2020)

Armin Jentsch ORCID: orcid.org/0000-0002-2423-3955¹,
Gino Casale²,
Lena Schlesinger³,
Gabriele Kaiser⁴,
Johannes König¹ &
Sigrid Blömeke⁵

984 Accesses
9 Citations

Abstract

The measurement of instructional quality with observer ratings has received considerable attention in recent years, and initial evidence is available on how such ratings vary between lessons. The extent to which observer ratings also vary within a lesson, however, has hardly been examined. In addition, few studies have compared the stability of subject-specific and generic characteristics of instructional quality. The study presented here addresses these gaps and investigates the stability of subject-specific and generic characteristics both between and within lessons. Instructional quality was measured in mathematics instruction in 37 classes × 2 double lessons × 4 lesson segments. Trained observers rated five characteristics, three of which represent the basic dimensions (classroom management, student support, and cognitive activation) and two of which are subject-specific (subject-related and teaching-related mathematics educational quality). The results indicate acceptable reliability for four of the five characteristics and meaningful differences in variability between generic and subject-specific characteristics. We discuss the findings against the background of different conceptualisations of instructional quality.
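To make the idea of variability between and within lessons concrete, the following is a minimal sketch of a variance decomposition for a balanced nested design such as 37 classes, 2 lessons per class and 4 segments per lesson. It is not the authors' analysis: it ignores raters, treats segment-level variance as the residual, uses classical ANOVA estimators from expected mean squares, and runs on simulated ratings. The function name `nested_variance_components` and all simulation parameters are assumptions made for illustration.

```python
import random
from statistics import mean


def nested_variance_components(ratings):
    """Estimate variance components for a balanced nested design.

    ratings[c][l][s] is the rating for class c, lesson l, segment s.
    Lessons are nested in classes; segment-level variance is the residual.
    """
    a = len(ratings)        # number of classes
    b = len(ratings[0])     # lessons per class
    n = len(ratings[0][0])  # segments per lesson

    lesson_means = [[mean(lesson) for lesson in cls] for cls in ratings]
    class_means = [mean(lm) for lm in lesson_means]
    grand_mean = mean(class_means)  # valid because the design is balanced

    # Mean squares of the nested ANOVA table
    ms_class = b * n * sum((m - grand_mean) ** 2 for m in class_means) / (a - 1)
    ms_lesson = n * sum(
        (lesson_means[c][l] - class_means[c]) ** 2
        for c in range(a) for l in range(b)
    ) / (a * (b - 1))
    ms_segment = sum(
        (x - lesson_means[c][l]) ** 2
        for c in range(a) for l in range(b) for x in ratings[c][l]
    ) / (a * b * (n - 1))

    # Solve the expected mean squares; truncate negative estimates at zero
    var_segment = ms_segment
    var_lesson = max((ms_lesson - ms_segment) / n, 0.0)
    var_class = max((ms_class - ms_lesson) / (b * n), 0.0)
    total = var_class + var_lesson + var_segment
    return {"class": var_class, "lesson": var_lesson,
            "segment": var_segment, "total": total}


# Simulated ratings (hypothetical effect sizes, not data from the study)
random.seed(0)
simulated = []
for _ in range(37):
    class_effect = random.gauss(0, 0.5)
    lessons = []
    for _ in range(2):
        lesson_effect = random.gauss(0, 0.3)
        lessons.append([2.5 + class_effect + lesson_effect + random.gauss(0, 0.4)
                        for _ in range(4)])
    simulated.append(lessons)

components = nested_variance_components(simulated)
for source in ("class", "lesson", "segment"):
    share = 100 * components[source] / components["total"]
    print(f"{source}: {share:.1f} % of total variance")
```

In such a sketch, a large class-level share would correspond to stable differences between classes, whereas large lesson-level or segment-level shares would point to variability between or within lessons.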

Notes

Correlational approaches to examining temporal stability (e.g., Curby et al. 2011; Patrick and Mantzicopoulos 2016) correspond to the statistical model of classical test theory (CTT) in that temporal variability is confounded with residual variance (Praetorius et al. 2014). Because the aim of this study is precisely to make statements about stable and unstable variance components, we do not discuss correlational approaches further in the literature review.
Following Cohen (1992), the practical significance of variance proportions is defined as a small (<7%), moderate (7–15%), or large (>15%) effect size.
The raters were free, by mutual agreement, to bring a measurement point forward or to delay it by up to three minutes when the classroom situation made this appropriate (e.g., an announced break or a change of teaching method).
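The effect-size thresholds for variance proportions given in the second note (small below 7 %, moderate from 7 to 15 %, large above 15 %) can be written as a small classification rule. This is only an illustrative sketch; the function name `classify_variance_share` is hypothetical and does not come from the article.

```python
def classify_variance_share(share_percent):
    """Label a variance share (in percent of total variance) by effect size.

    Thresholds follow the note above: small (<7), moderate (7-15), large (>15).
    """
    if not 0 <= share_percent <= 100:
        raise ValueError("share_percent must lie between 0 and 100")
    if share_percent < 7:
        return "small"
    if share_percent <= 15:
        return "moderate"
    return "large"


# Hypothetical variance shares, for illustration only
for share in (3.0, 7.0, 15.0, 22.5):
    print(share, classify_variance_share(share))
```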

References

Baumert, J., Kunter, M., Blum, W., Brunner, M., Voss, T., Jordan, A., Tsai, Y.-M., et al. (2010). Teachers’ mathematical knowledge, cognitive activation in the classroom, and student progress. American Educational Research Journal, 47(1), 133–180.
Blömeke, S., Gustafsson, J.-E., & Shavelson, R. J. (2015). Beyond dichotomies. Competence viewed as a continuum. Zeitschrift für Psychologie, 223, 3–13.
Blum, W. (2006). Einführung. In W. Blum, C. Drüke-Noe, R. Hartung & O. Köller (Eds.), Bildungsstandards Mathematik: Konkret. Sekundarstufe 1: Aufgabenbeispiele, Unterrichtsanregungen, Fortbildungsideen (pp. 14–32). Berlin: Cornelsen Scriptor.
Brennan, R. (2001). Generalizability theory. New York: Springer.
Brophy, J. (2000). Teaching. Brussels: International Academy of Education.
Brunner, E. (2017). Qualität von Mathematikunterricht: Eine Frage der Perspektive. Journal für Mathematik-Didaktik. https://doi.org/10.1007/s13138-017-0122-z.
Buchholtz, N., Kaiser, G., & Blömeke, S. (2014). Die Erhebung mathematikdidaktischen Wissens – Konzeptualisierung einer komplexen Domäne. Journal für Mathematik-Didaktik, 35(1), 101–128.
Casabianca, J. M., Lockwood, J. R., & McCaffrey, D. F. (2015). Trends in classroom observation scores. Educational and Psychological Measurement, 75(2), 311–337.
Charalambous, C., & Praetorius, A.-K. (2018). Studying instructional quality in mathematics through different lenses: in search of common ground. ZDM Mathematics Education, 50(3), 355–366.
Clausen, M. (2002). Qualität von Unterricht – Eine Frage der Perspektive? Münster: Waxmann.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155–159.
Common Core State Standards Initiative [CCSSI] (2014). Mathematics standards. http://www.corestandards.org/Math. Accessed: 9 Nov 2018.
Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: theory of generalizability for scores and profiles. New York: Wiley.
Curby, T. W., Stuhlman, M., Grimm, K., Mashburn, A., Chomat-Mooney, L., & Downer, J. (2011). Within-day variability in the quality of classroom interactions during third and fifth grade. The Elementary School Journal, 112(1), 16–37.
Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. Perspectives in social psychology. New York: Plenum.
Ditton, H. (2006). Unterrichtsqualität. In K.-H. Arnold, U. Sandfuchs & J. Wiechmann (Eds.), Handbuch Unterricht (pp. 235–243). Bad Heilbrunn: Klinkhardt.
Drollinger-Vetter, B. (2011). Verstehenselemente und strukturelle Klarheit: Fachdidaktische Qualität der Anleitung von mathematischen Verstehensprozessen im Unterricht. Münster: Waxmann.
Fauth, B., Decristan, J., Rieser, S., Klieme, E., & Büttner, G. (2014). Grundschulunterricht aus Schüler-, Lehrer- und Beobachterperspektive: Zusammenhänge und Vorhersage von Lernerfolg. Zeitschrift für Pädagogische Psychologie, 28(3), 127–137.
Goldstein, H. (2003). Multilevel statistical models. London: Hodder Arnold.
Hage, K., Bischoff, H., Dichanz, H., Eubel, K., Oehlschläger, H., & Schwittmann, D. (1985). Das Methodenrepertoire von Lehrern. Eine Untersuchung zum Unterrichtsalltag in der Sekundarstufe I. Opladen: Leske + Budrich.
Helmke, A. (2012). Unterrichtsqualität und Lehrerprofessionalität: Diagnose, Evaluation und Verbesserung des Unterrichts. Seelze: Klett.
Hiebert, J., Gallimore, R., Garnier, H., Stigler, J., et al. (Eds.). (2003). Teaching mathematics in seven countries. Results from the TIMSS 1999 video study. Washington: National Center for Education Statistics.
Hill, H. C., Charalambous, C., & Kraft, M. A. (2012). When rater reliability is not enough: teacher observation systems and a case for the generalizability study. Educational Researcher, 41(2), 56–64.
Hugener, I., Pauli, C., & Reusser, K. (2007). Inszenierungsmuster, kognitive Aktivierung und Leistung im Mathematikunterricht. Analysen aus der schweizerisch-deutschen Videostudie. In D. Lemmermöhle, M. Rothgangel, S. Bögeholz, M. Hasselhorn & R. Watermann (Eds.), Professionell Lehren – Erfolgreich Lernen (pp. 109–121). Münster: Waxmann.
Kane, M. (2011). The errors of our ways. Journal of Educational Measurement, 48, 12–30.
Kane, T. J., & Staiger, D. O. (2012). Gathering feedback for teaching: combining high-quality observations with student surveys and achievement gains. MET Project Research Paper. Seattle: Bill & Melinda Gates Foundation.
Klieme, E., & Rakoczy, K. (2008). Empirische Unterrichtsforschung und Fachdidaktik. Outcome-orientierte Messung und Prozessqualität des Unterrichts. Zeitschrift für Pädagogik, 54, 222–237.
König, J., & Lebens, M. (2012). Classroom Management Expertise (CME) von Lehrkräften messen: Überlegungen zur Testung mithilfe von Videovignetten und erste empirische Befunde. Lehrerbildung auf dem Prüfstand, 5(1), 3–28.
Kounin, J. S. (1970). Discipline and group management in classrooms. New York: Holt, Rinehart & Winston.
Kuger, S., Kluczniok, K., Kaplan, D., & Rossbach, H.-G. (2016). Stability and patterns of classroom quality in German early childhood education and care. School Effectiveness and School Improvement, 27(3), 418–440.
Learning Mathematics for Teaching Project (2011). Measuring the mathematical quality of instruction. Journal of Mathematics Teacher Education, 14, 25–47.
Linacre, J. M. (2002). Optimizing rating scale category effectiveness. Journal of Applied Measurement, 3(1), 85–106.
Lipowsky, F., Drollinger-Vetter, B., Klieme, E., Pauli, C., & Reusser, K. (2018). Generische und fachdidaktische Dimensionen von Unterrichtsqualität – zwei Seiten einer Medaille? In M. Martens, K. Rabenstein, K. Bräu, M. Fetzer, H. Gresch, I. Hardy & C. Schelle (Eds.), Konstruktionen von Fachlichkeit. Ansätze, Erträge und Diskussionen in der empirischen Unterrichtsforschung (pp. 183–202). Bad Heilbrunn: Klinkhardt.
Malmberg, L.-E., Hagger, H., Burn, K., Mutton, T., & Colls, H. (2010). Observed classroom quality during teacher education and two years of professional practice. Journal of Educational Psychology, 102(4), 916–932.
Mashburn, A. J., Meyer, J. P., Allen, J. P., & Pianta, R. C. (2014). The effect of observation length and presentation order on the reliability and validity of an observational measure of teaching quality. Educational and Psychological Measurement, 74(3), 400–422.
Meyer, J. P., Cash, A. H., & Mashburn, A. (2011). Occasions and the reliability of classroom observations: alternative conceptualizations and methods of analysis. Educational Assessment, 16, 227–243.
Patrick, H., & Mantzicopoulos, P. (2016). Is effective teaching stable? The Journal of Experimental Education, 84(1), 23–47.
Pianta, R. C., & Hamre, B. K. (2009). Conceptualization, measurement, and improvement of classroom processes: standardized observation can leverage capacity. Educational Researcher, 38, 109–119.
Pietsch, M., & Tosana, S. (2008). Beurteilereffekte bei der Messung von Unterrichtsqualität. Das Multifacetten-Rasch-Modell und die Generalisierbarkeitstheorie als Methoden der Qualitätssicherung in der externen Evaluation von Schulen. Zeitschrift für Erziehungswissenschaft, 11(3), 430–452.
Praetorius, A.-K., Klieme, E., Herbert, B., & Pinger, P. (2018). Generic dimensions of teaching quality: the German framework of three basic dimensions. ZDM Mathematics Education, 50(3), 407–426.
Praetorius, A.-K., Lenske, G., & Helmke, A. (2012). Observer ratings of instructional quality: Do they fulfil what they promise? Learning and Instruction, 22(6), 387–400.
Praetorius, A.-K., Pauli, C., Reusser, K., Rakoczy, K., & Klieme, E. (2014). One lesson is all you need? Stability of instructional quality across lessons. Learning and Instruction, 31, 2–12.
Praetorius, A.-K., Vieluf, S., Saß, S., Bernholt, A., & Klieme, E. (2016). The same in German as in English? Investigating the subject-specificity of teaching quality. Zeitschrift für Erziehungswissenschaft, 19(1), 191–209.
Rakoczy, K. (2008). Motivationsunterstützung im Mathematikunterricht – Unterricht aus der Sicht von Lernenden und Beobachtern. Münster: Waxmann.
Rakoczy, K., & Pauli, C. (2006). Hoch inferentes Rating: Beurteilung der Qualität unterrichtlicher Prozesse. In E. Klieme, C. Pauli & K. Reusser (Eds.), Dokumentation der Erhebungs- und Auswertungsinstrumente zur schweizerisch-deutschen Videostudie „Unterrichtsqualität, Lernverhalten und mathematisches Verständnis“ (Teil 3: Hugener, Isabelle; Pauli, Christine & Reusser, Kurt: Videoanalysen) (pp. 189–205). Frankfurt am Main: GFPF.
Reinmann-Rothmeier, G., & Mandl, H. (2006). Unterrichten und Lernumgebungen gestalten. In A. Krapp & B. Weidenmann (Eds.), Pädagogische Psychologie. Ein Lehrbuch (pp. 613–658). Weinheim: Beltz.
Reusser, K. (2009). Von der Bildungs- und Unterrichtsforschung zur Unterrichtsentwicklung. Probleme, Strategien, Werkzeuge und Bedingungen. Beiträge zur Lehrerinnen- und Lehrerbildung, 27(3), 295–312.
Schlesinger, L., & Jentsch, A. (2016). Theoretical and methodological challenges in measuring instructional quality in mathematics education using classroom observations. ZDM Mathematics Education, 48(1), 29–40.
Schlesinger, L., Jentsch, A., Kaiser, G., König, J., & Blömeke, S. (2018). Subject-specific characteristics of instructional quality in mathematics education. ZDM Mathematics Education, 50(3), 475–490.
Seidel, T., & Shavelson, R. (2007). Teaching effectiveness research in the past decade: The role of theory and research design in disentangling meta-analysis results. Review of Educational Research, 77, 454–499.
Sekretariat der Ständigen Konferenz der Kultusminister der Länder in der Bundesrepublik Deutschland (2003). Bildungsstandards im Fach Mathematik für den Mittleren Schulabschluss. https://www.kmk.org/fileadmin/Dateien/veroeffentlichungen_beschluesse/2003/2003_12_04-Bildungsstandards-Mathe-Mittleren-SA.pdf. Accessed: 9 Nov 2018.
Shavelson, R., & Webb, N. (1991). Generalizability theory: a primer. Thousand Oaks: SAGE.
Wirtz, M. A., & Caspar, F. (2002). Beurteilerübereinstimmung und Beurteilerreliabilität. Göttingen: Hogrefe.

Author information

Authors and Affiliations

Empirische Schulforschung, Quantitative Methoden, Department Erziehungs- und Sozialwissenschaften, Humanwissenschaftliche Fakultät, Universität zu Köln, Gronewaldstr. 2a, 50931, Köln, Germany
Armin Jentsch & Johannes König
Institut für Bildungsforschung, Bergische Universität Wuppertal, Gaußstr. 20, 42119, Wuppertal, Germany
Gino Casale
Landesinstitut für Lehrerbildung und Schulentwicklung, Felix-Dahn-Straße 3, 20357, Hamburg, Germany
Lena Schlesinger
Fakultät Erziehungswissenschaft, Universität Hamburg, Von-Melle-Park 8, 20146, Hamburg, Germany
Gabriele Kaiser
Center for Educational Measurement Oslo (CEMO), Gaustadalléen 30d, 0373, Oslo, Norway
Sigrid Blömeke

Corresponding author

Correspondence to Armin Jentsch.

About this article

Cite this article

Jentsch, A., Casale, G., Schlesinger, L. et al. Variabilität und Generalisierbarkeit von Ratings zur Qualität von Mathematikunterricht zwischen und innerhalb von Unterrichtsstunden. Unterrichtswiss 48, 179–197 (2020). https://doi.org/10.1007/s42010-019-00061-8

Published: 08 November 2019
Issue Date: June 2020
DOI: https://doi.org/10.1007/s42010-019-00061-8
