Item wording effects in self-report measures and reading achievement: Does removing careless respondents help?

https://doi.org/10.1016/j.stueduc.2022.101126

Highlights

  • Item wording effects occur when positively and negatively worded items are used together.

  • Item wording effects may distort the factorial structure of self-report measures.

  • The severity of item wording effects is associated with the level of reading ability.

  • Removing careless responses can alleviate the impact of item wording effects.

Abstract

This study examined the impact of item wording effects on the factor structure of the Student Confidence in Reading (SCR) scale in the 2016 Progress in International Reading Literacy Study (PIRLS). First, using a large sample of students from seven English-speaking countries, item wording effects in the SCR scale were investigated for the entire sample and for subgroups of students based on their reading achievement levels in PIRLS 2016. Second, the presence of item wording effects was re-examined after identifying and removing students who answered negatively worded items inconsistently. The results showed that the use of negatively worded items distorted the factor structure of the SCR scale, leading to an unintended multidimensional factor structure. Removing students with careless responses had a negligible impact on the model fit for students at the lowest achievement level, whereas it considerably improved the model fit for students with moderate to high reading achievement.

Section snippets

Item wording effects

In psychological instruments, negatively and positively worded items are often used together for various reasons. In surveys and questionnaires, combining positively and negatively worded items can prevent respondents from systematically selecting only positive (or only negative) responses (e.g., Baumgartner & Steenkamp, 2001; Weijters, Geuens, & Schillewaert, 2009). Furthermore, negatively worded items may deter respondents from adopting an acquiescent response style or force them to
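
A minimal sketch in R of the scoring convention this implies: negatively worded items are typically reverse-coded before scale scores are computed, so that higher values consistently indicate greater confidence. The data frame scr, the item names, and the 1-4 response range are illustrative assumptions, not the actual PIRLS variable names.

    # Reverse-code a Likert item measured on a min_val..max_val scale.
    reverse_code <- function(x, min_val = 1, max_val = 4) {
      (max_val + min_val) - x
    }

    # Hypothetical names for the negatively worded SCR items.
    neg_items <- c("item3", "item5")
    scr[neg_items] <- lapply(scr[neg_items], reverse_code)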

Sample

The data of this study came from the 2016 administration of PIRLS, which is an international large-scale assessment organized by the International Association for the Evaluation of Educational Achievement. PIRLS is administered every five years to assess the reading literacy of fourth graders around the world. Results of PIRLS allow researchers and educational agencies to analyze trends in reading achievement and to make international comparisons. The PIRLS assessment consists of

Comparisons of the hypothesized factor models

Table 3 shows the results of the four CFA models (i.e., one-factor, two-factor, bi-factor, and second-order models) for each country. The one-factor (i.e., unidimensional) model did not fit the SCR scale adequately in any of the countries, suggesting that item wording effects might be present in the scale. Although the CFI values were at or above the suggested cut-off value of 0.95 for some countries (e.g., Australia, Canada, and Ireland), neither the TLI nor the RMSEA indicated a good
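
The reference list suggests the models were run in Mplus through the MplusAutomation R package (Hallquist & Wiley, 2018). As an illustration only, the sketch below specifies an analogous comparison of a one-factor and a bi-factor model in R with lavaan; the six item names, the assignment of the negatively worded items to a method factor, and the MLR estimator are assumptions, not the study's actual setup.

    library(lavaan)

    # One-factor model: all SCR items load on a single confidence factor.
    one_factor <- 'confidence =~ i1 + i2 + i3 + i4 + i5 + i6'

    # Bi-factor model: a general confidence factor for all items plus an
    # orthogonal method factor capturing shared variance among the
    # negatively worded items (assumed here to be i3 and i5).
    bi_factor <- '
      general =~ i1 + i2 + i3 + i4 + i5 + i6
      method  =~ i3 + i5
      general ~~ 0*method
    '

    fit_one <- cfa(one_factor, data = scr, estimator = "MLR")
    fit_bi  <- cfa(bi_factor,  data = scr, estimator = "MLR")

    # Compare the fit indices discussed in the article (Hu & Bentler, 1999).
    fitMeasures(fit_one, c("cfi", "tli", "rmsea"))
    fitMeasures(fit_bi,  c("cfi", "tli", "rmsea"))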

Discussion

Previous research has shown that when item phrasing (i.e., positive or negative) substantially affects how respondents answer the items, item wording effects can pose a major threat to the validity of the instrument (Kam & Meyer, 2015; McKay et al., 2014; Zeng et al., 2020). To shed further light on the causes and treatment of item wording effects in self-report measures, this empirical study investigated the relationship between item wording effects and respondents’ reading
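
For the screening step described in the abstract, one common way to flag respondents who answer negatively worded items inconsistently is the psychometric-antonyms index from the insufficient-effort-responding literature (Hong et al., 2020; Huang et al., 2012). The sketch below uses the R package careless; the flagging rule is an illustrative assumption, not necessarily the study's criterion.

    library(careless)

    # Longest run of identical responses per student; long runs across a
    # mixed-wording scale can signal straight-lining.
    long_runs <- longstring(scr_items)

    # Psychometric antonyms: each student's within-person correlation over
    # item pairs that correlate strongly negatively in the full sample
    # (e.g., positively vs. negatively worded items). Values near +1
    # suggest the respondent ignored item wording.
    antonyms <- psychant(scr_items)

    # Illustrative rule: drop students whose antonym correlation is positive.
    keep <- is.na(antonyms) | antonyms <= 0
    scr_clean <- scr_items[keep, ]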

Data availability

The data used in this study are publicly available at https://timssandpirls.bc.edu/pirls2016/international-database/.
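
The international database distributes the student files in SPSS and SAS formats. A minimal sketch of loading one such file in R, assuming the haven package; the file path is a placeholder, since actual file names vary by country and file type.

    library(haven)

    # Placeholder path: actual PIRLS 2016 file names vary by country.
    students <- read_sav("path/to/pirls2016_student_file.sav")

    # SPSS variable and value labels are preserved as column attributes.
    head(students)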

References (61)

  • H. Baumgartner et al.

    Response styles in marketing research: A cross-national investigation

    Journal of Marketing Research

    (2001)
  • D. Bolt et al.

    An IRT mixture model for rating scale confusion associated with negatively worded items in measures of social-emotional learning

    Applied Measurement in Education

    (2020)
  • H.C. Bulut

    Item wording effects in psychological measures: Do early literacy skills matter?

    Journal of Measurement and Evaluation in Education and Psychology

    (2021)
  • L.J. Cronbach

    Further evidence on response sets and test design

    Educational and Psychological Measurement

    (1950)
  • C. DiStefano et al.

    Further investigating method effects associated with negatively worded items on self-report surveys

    Structural Equation Modeling

    (2006)
  • H. Dodeen

    The effects of positively and negatively worded items on the factor structure of the UCLA Loneliness Scale

    Journal of Psychoeducational Assessment

    (2015)
  • R. Forsterlee et al.

    An examination of the short form of the Need for Cognition Scale applied in an Australian sample

    Educational and Psychological Measurement

    (1999)
  • T. Gnambs et al.

    Cognitive abilities explain wording effects in the Rosenberg Self-Esteem Scale

    Assessment

    (2020)
  • M.N. Hallquist et al.

    MplusAutomation: An R package for facilitating large-scale latent variable analyses in Mplus

    Structural Equation Modeling

    (2018)
  • J. He et al.

    Acquiescent and socially desirable response styles in cross-cultural value surveys

  • A. Hinz et al.

    The acquiescence effect in responding to a questionnaire

    GMS Psycho-Social Medicine

    (2007)
  • M. Hong et al.

    Methods of detecting insufficient effort responding: Comparisons and practical recommendations

    Educational and Psychological Measurement

    (2020)
  • L. Hu et al.

    Cut-off criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives

    Structural Equation Modeling

    (1999)
  • S. Hu et al.

    Difficulties in comprehending affirmative and negative sentences: Evidence from Chinese children with reading difficulties

    Journal of Learning Disabilities

    (2018)
  • J.L. Huang et al.

    Detecting and deterring insufficient effort responding to surveys

    Journal of Business and Psychology

    (2012)
  • J.L. Huang et al.

    Insufficient effort responding: Examining an insidious confound in survey data

    Journal of Applied Psychology

    (2015)
  • A.C. Hutton

    Assessing acquiescence in surveys using positively and negatively worded questions

    (2017)
  • C.C.S. Kam et al.

    How careless responding and acquiescence response bias can influence construct dimensionality: The case of job satisfaction

    Organizational Research Methods

    (2015)
  • N. Kamoen et al.

    Why are negative questions difficult to answer? On the processing of linguistic contrasts in surveys

    Public Opinion Quarterly

    (2017)
  • S. LaRoche et al.

    Sample design in PIRLS 2016
