Introduction

In Latin America, the first diagnosed case of COVID-19 was reported on February 25, 2020 in Brazil (Rodriguez-Morales et al., 2020). A few weeks later, every Latin American country had closed its borders and established measures to stop the spread of the disease within its territory. However, according to the Coronavirus Resource Center (2021), as of April 26, 2021, a total of 28,146,902 diagnosed cases of COVID-19 had been reported in Latin America, with Brazil being the country most affected by the pandemic in the region, followed by Argentina and Mexico. Similarly, most deaths from COVID-19 in Latin America occurred in Brazil, followed by Mexico. Pandemic preparedness varies within the region, and different countries are particularly vulnerable to the disease due to limited resources in their health systems, a higher prevalence of chronic diseases, late responses by populist governments, and high rates of poverty and inequality (Burki, 2020; Pablos-Méndez et al., 2020). These factors affect the transmission dynamics and impact of COVID-19 in Latin America, which also has implications for the dynamics of the pandemic globally (Miller et al., 2020).

In addition to the social, economic and physical health consequences, COVID-19 has become a pandemic that has significantly impacted the mental health and well-being of the population worldwide (Hossain et al., 2020). Different studies have reported the presence of high rates of anxiety symptoms, depression, post-traumatic stress disorder, psychological distress, suicidal behaviors and ideations, and sleep problems, among others (Leaune et al., 2020; Salari et al., 2020; Vindegaard & Benros, 2020). In Latin America, a recent study, which evaluated the psychological impact of the COVID-19 pandemic in 58 countries, reported an increase in anxiety and depression symptoms in 20 countries in South, North and Central America (Alzueta et al., 2021). Taken together, the above information highlights the need to pay more attention to the development of public policies to support mental health.

Regarding anxiety during the COVID-19 pandemic, the worldwide prevalence varies between 6.33% and 50.9% (Pappa et al., 2020; Wu et al., 2020; Xiong et al., 2020) and is higher in women than in men (Moghanibashi-Mansourieh, 2020), while in Latin America the prevalence varies between 5.6% and 81.9% (Alzueta et al., 2021; Goularte et al., 2021; Krüger-Malpartida et al., 2020; Orellana & Orellana, 2020; Paz et al., 2020). However, many of the previous studies have used instruments that assess general anxiety, such as the GAD-7, the Spielberger State-Trait Anxiety Inventory and the DASS-21. Using these types of instruments may generate under- or overestimated results, as they do not aim to identify specific anxiety symptoms associated with COVID-19 (Ransing et al., 2020). Seeking to overcome this limitation, instruments have recently been developed to identify anxiety symptoms related to COVID-19, such as the Coronavirus Anxiety Scale (CAS; Lee, 2020). The CAS is one of the most widely used measures in research studies worldwide (Reis et al., 2020). Recent studies using the CAS have indicated that the prevalence of anxiety related to COVID-19 varies between 4.9% and 28.7% in different populations (Caycho-Rodríguez, 2021; Lee et al., 2020a; Lee et al., 2020b).

In addition to the CAS, other measures of COVID-19 anxiety have been developed, such as the COVID-19 Anxiety Syndrome Scale (C-19ASS; Nikčević & Spada, 2020), the COVID-19 Anxiety Questionnaire (C-19-A; Petzold et al., 2020) and two scales named the COVID-19 Anxiety Scale, one developed in India (Chandu et al., 2020) and another developed in Brazil (Silva et al., 2020). Unlike the CAS, which has 5 items, these measures are longer: the COVID-19 Anxiety Scale has 7 items, the C-19ASS has 9 items, and the C-19-A has 10 items. Although these are all relatively short measures, having a measure with a small number of items saves assessment time and associated costs (Kemper et al., 2019), improves participation rates (Edwards et al., 2004), and decreases fatigue and other negative reactions that could lower data quality (Credé et al., 2012). Despite criticisms related to the psychometric quality of short measures (Credé et al., 2012; Smith et al., 2000), for some years now the development and use of short measures to assess psychological variables in clinical and non-clinical contexts has become widespread (Kruyen et al., 2013).

Furthermore, the CAS measures a set of physiological symptoms of anxiety related to COVID-19 (Lee, 2020); while the other instruments measure the presence of perseverative and avoidance thoughts (Nikčević & Spada, 2020), fear of social interaction and anxiety about illness (Chandu et al., 2020), symptoms related to Generalized Anxiety Disorder such as restlessness or nervousness, tiredness, difficulty concentrating, irritability, muscle tension, sleep disturbance and decreased ability to participate in social activities (Silva et al., 2020), as well as thoughts, feelings and behaviors related to COVID-19 anxiety (Petzold et al., 2020). On the other hand, while the CAS measures COVID-19 anxiety in general contexts, other measures, such as the Pandemic (COVID-19) Anxiety Travel Scale (PATS; Zenker et al., 2021), measure anxiety about specific events such as travel during the pandemic.

The CAS assesses symptoms of dysfunctional anxiety related to COVID-19, also called coronaphobia. Coronaphobia generates increased worry, stress, depression, suicidal ideation, generalized anxiety, safety-seeking behaviors, and impaired daily functioning (Arora et al., 2020; Chakraborty & Chatterjee, 2020; Lee et al., 2020b; Lee et al., 2020c). The CAS was originally developed and validated in English (Lee, 2020); however, the psychometric properties of versions of the CAS have been studied in Turkish (Evren et al., 2020), Bangla (Ahmed et al., 2020), Korean (Choi et al., 2020), Polish (Skalski et al., 2021), Spanish (Caycho-Rodríguez et al., 2021b) and the Portuguese spoken in Brazil (Padovan-Neto et al., 2021) and in Portugal (Magano et al., 2021). Different psychometric studies have indicated that the CAS is comprised of a single factor and provides highly reliable scores, with alpha coefficient values ranging from 0.80 to 0.93 and omega coefficient values between 0.80 and 0.88 (Ahmed et al., 2020; Caycho-Rodríguez et al., 2021a, 2021b; Evren et al., 2020; Magano et al., 2021; Lee, 2020; Lee et al., 2020b; Lee et al., 2020c; Padovan-Neto et al., 2021). However, some studies have suggested the presence of correlations between the errors of items 1 and 4 (Magano et al., 2021), as well as between items 2 and 3 (Padovan-Neto et al., 2021), which could mask a misspecified model due to artificially inflated fit indices (Dominguez-Lara, 2019), while another study carried out in Mexico suggests a four-item model that excludes item 4 (Carrillo-Valdez, 2020). Additionally, it has also been suggested that the unifactorial structure of the CAS is invariant between groups of different sexes and ages (Ahmed et al., 2020; Caycho-Rodríguez et al., 2021a, 2021b; Lee, 2020). In addition, a recent study based on Item Response Theory suggests that the CAS items, especially items 3, 4 and 5, are able to more accurately assess moderate and high levels of dysfunctional anxiety related to COVID-19 (Caycho-Rodríguez et al., 2021b).

Despite these important psychometric findings, as far as is known from the scientific literature, there is no evidence to support the psychometric equivalence of the CAS across different countries. If an instrument was developed in one cultural context and is used in a different one, the comparability of measures cannot be assumed between these cultural groups, because psychological constructs are highly dependent on the cultural context in which the measurement instruments are used (Li et al., 2012). For example, previous studies have indicated different prevalence rates of anxiety symptoms in countries in the Americas, Europe, Africa and Asia during the pandemic (Alzueta et al., 2021) and differences in levels of fear of COVID-19 in Latin American countries (Caycho-Rodríguez et al., 2021a). However, these differences in prevalence rates may not necessarily be a product of cross-cultural differences (Scholten et al., 2017). While some symptoms appear in a specific cultural context and others appear across cultures (Weiss & Somma, 2007), the presence of methodological problems may generate biased conclusions about cross-cultural differences (Bowden & Fox-Rushby, 2003; van de Vijver & Tanzer, 2004). It is often assumed that an instrument translated from one language to another assesses the construct in the same way in different cultures (Byrne, 2016). However, a major challenge in cross-cultural research in psychology is that cultural comparisons are not free of bias unless there is evidence that measures operate in the same way across cultural contexts; not all researchers are aware of this problem and its practical implications (Fischer & Karl, 2019). Therefore, if one wishes to use the CAS to estimate and compare the prevalence of dysfunctional anxiety in different countries, it is important to demonstrate the measurement invariance (MI) of the instrument.

Demonstrating MI is a prerequisite for valid cross-cultural comparisons; however, this type of study still receives little attention despite increased interest in cross-cultural research (Boer et al., 2018). MI is a procedure that seeks to demonstrate the extent to which self-report items express the same meaning, and whether responses to them load on the same factors, across the languages or other groups (e.g., gender and age) in which they are applied (Vandenberg & Lance, 2000). Therefore, the absence of MI may generate biased conclusions about possible cross-cultural differences (Spector et al., 2015). In this sense, MI assesses equivalence as an inherent property of a measurement instrument (Davidov et al., 2014). Generally, there are three basic levels of MI: configural invariance, which seeks to determine whether the factor structure of a scale is similar between the groups compared without imposing equality restrictions on the model parameters; metric invariance, which assesses whether the factor loadings of the items are also similar between groups, allowing variances and correlations between variables to be compared across groups; and scalar invariance, which assesses whether both the factor loadings and the intercepts of the items are equal between groups, allowing latent means, factor variances and covariances to be compared across groups (Meredith, 1993). If items are found not to be invariant at the metric or scalar level, it cannot be concluded that the items of the instrument, in this case the CAS, measure the same construct equally across different groups. The absence of MI may result from construct bias (differences in the construct being measured), method bias (methodological problems such as differential familiarity with the item format), or item bias (item content that leads to cross-cultural differences in interpretation) (Byrne, 2016; van de Vijver & Tanzer, 2004).
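To make the nesting of these levels explicit, they can be summarized with standard multigroup factor-analytic notation (a schematic illustration; the symbols below are generic conventions, not equations taken from the cited sources):

```latex
% Response of person j to item i in group g
x_{ijg} = \nu_{ig} + \lambda_{ig}\,\eta_{jg} + \varepsilon_{ijg}

% Configural invariance: the same pattern of freely estimated loadings in every group g
% Metric invariance:     \lambda_{ig} = \lambda_{i} \quad \text{for all } g
% Scalar invariance:     \lambda_{ig} = \lambda_{i} \quad \text{and} \quad \nu_{ig} = \nu_{i} \quad \text{for all } g
```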

Although MI has been a suggested procedure within cross-cultural studies for several decades (Vandenberg & Lance, 2000), instruments assessing mental health symptoms with evidence of cross-cultural MI are scarce (Scholten et al., 2017). An example of this, within the context of the COVID-19 pandemic, is the Fear of COVID-19 Scale, whose cross-cultural MI was assessed in seven Latin American countries (Caycho-Rodríguez et al., 2021a) and in eleven countries in Asia, the Americas, Europe and Oceania (Lin et al., 2021). Based on the limited information to date, it is not possible to draw a general conclusion on the MI of the CAS across countries. Therefore, in this study we examined the MI of the CAS in 12 Latin American countries. The factor structure of the CAS was explored in depth, comparing one-factor models with five and four items. Given previous findings reporting a unifactorial structure (Ahmed et al., 2020; Caycho-Rodríguez et al., 2021a, 2021b; Evren et al., 2020; Magano et al., 2021; Lee, 2020; Lee et al., 2020b; Lee et al., 2020c; Padovan-Neto et al., 2021) and the cultural uniqueness of the participating countries’ samples (Hofstede, 2001), at least partial invariance was expected. Furthermore, previous psychometric studies of the CAS used procedures based on Classical Test Theory (CTT), and only one previous study, conducted in Peru, combined the use of CTT and Item Response Theory (IRT) models (Caycho-Rodríguez et al., 2021b). In this sense, this study also evaluated the characteristics and performance of the CAS items based on an IRT model. IRT provides information on item difficulty and discrimination, independent of the sample (Crocker & Algina, 1986), thus identifying the items that contribute most to measurement accuracy (Cooper & Petrides, 2010). In recent years, the use of IRT models has been recommended to improve measurement in psychiatry (Adler & Brodin, 2011). Findings based on CTT would facilitate corroborating previous psychometric evidence of the CAS, whereas findings based on IRT would provide a better understanding of its psychometric properties.

Once the MI of the CAS across countries was tested, we proceeded to evaluate the MI of the CAS between sexes. Based on previous results, MI of the CAS between male and female groups was expected (Ahmed et al., 2020; Caycho-Rodríguez et al., 2021b; Lee, 2020). Additionally, CAS scores were compared between countries and between sexes. Due to cultural differences and the different actions and policies carried out by Latin American governments, differences in COVID-19 anxiety levels were expected among the countries assessed (Lin et al., 2021). Additionally, women were expected to present higher CAS scores than men, due to their higher probability of presenting anxiety symptoms during the COVID-19 pandemic (Moghanibashi-Mansourieh, 2020; Pappa et al., 2020).

The assessment of the MI of the CAS is especially important, as cultural differences could affect how individuals understand the self-report items designed to assess dysfunctional anxiety related to COVID-19. In addition, it is hoped that the assessment of CAS MI in 12 Latin American countries will contribute to the cross-cultural applicability of the instrument and expand the still small number of instruments available for cross-cultural comparisons of mental health symptoms during this COVID-19 pandemic, and other future pandemics.

Method

Participants

A total of 5196 people from 12 Latin American countries (Argentina, Bolivia, Chile, Colombia, Cuba, Ecuador, El Salvador, Guatemala, Mexico, Paraguay, Peru and Uruguay) participated, selected by non-probabilistic convenience sampling. To participate in the study, individuals had to be of legal age in their country and to have given informed consent. The minimum number of participants in each country was calculated using the Soper software (Soper, 2021), taking into account the number of observed variables (5 items) and latent variables (1 variable) of the model to be evaluated, the anticipated effect size (λ = 0.3), the significance level (α = 0.05) and the statistical power (1 - β = 0.95), yielding a minimum of 100 participants per country. Each of the 12 countries in the study exceeded this minimum number of participants.

The average number of participants per country was 433 and ranged from 253 (Bolivia) to 880 (Paraguay). Of the total number of participants, 3677 were female, 1509 were male, 9 were transgender, and 1 person did not state their gender; the average age was 34.06 years (SD = 26.54), and 3450 participants reported not having had COVID-19. Country-specific demographic information is shown in Table 1.

Table 1 Demographic information of the participants in each country

Instruments

Demographic Information

An ad hoc questionnaire was developed to collect demographic information from the participants regarding their age, sex, marital status, educational level, type of work and COVID-19 diagnosis.

COVID-19 Anxiety

We used the Coronavirus Anxiety Scale (CAS; Lee, 2020), a self-report instrument that measures the frequency with which a person experiences dysfunctional anxiety related to COVID-19. The validated Spanish version of the CAS was used in this study (Caycho-Rodríguez et al., 2021b). The CAS comprises five items with five response options ranging from 0 (not at all) to 4 (almost every day in the last 2 weeks). The sum of the item scores generates a total score ranging from 0 to 20, where a higher score indicates greater dysfunctional anxiety related to COVID-19. Likewise, a cutoff score ≥ 9 (90% sensitivity and 85% specificity) allows for distinguishing between people with and without dysfunctional anxiety related to COVID-19.
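As an illustration of this scoring rule, the following R sketch computes the total score and applies the ≥ 9 cutoff; the data frame and column names (cas1–cas5) are hypothetical and are not taken from the study's dataset:

```r
# Minimal scoring sketch for the 5-item CAS (response options coded 0-4).
# 'cas_items' is a hypothetical data frame with columns cas1 ... cas5.
cas_items <- data.frame(
  cas1 = c(0, 2, 4), cas2 = c(1, 3, 4), cas3 = c(0, 2, 3),
  cas4 = c(0, 1, 4), cas5 = c(0, 2, 4)
)

cas_total     <- rowSums(cas_items)   # total score, possible range 0-20
dysfunctional <- cas_total >= 9       # cutoff proposed by Lee (2020)

data.frame(cas_total, dysfunctional)
```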

Procedure

This article uses data from the International Collaborative Study on the Mental Health Effects of the COVID-19 Pandemic in 13 Latin American countries. Data were collected simultaneously in all countries between February and March 2021. Most of the participating countries were from South America (Argentina, Bolivia, Chile, Colombia, Ecuador, Paraguay, Peru and Uruguay), with only four from North and Central America (Cuba, El Salvador, Guatemala and Mexico). The data collection procedure was the same in all twelve countries. The invitation to participate was distributed through social networks (Facebook, Twitter, and Instagram) and email, and included a link to a questionnaire hosted on the Google Forms platform. An introductory section of the questionnaire stated the objective of the study, presented the informed consent, and provided contact information in case of any doubts about the research process. In addition, the confidentiality of the data was guaranteed and the freedom to withdraw from the study at any time was assured. The questionnaire took between 15 and 20 min to complete. The project was approved by the Ethics Committee of the Universidad Privada del Norte in Peru (FCS_CEI/547–10-21).

Data Analysis

First, we examined descriptive statistics at the item level. Specifically, we calculated the mean, standard deviation, skewness and kurtosis of each item for each country separately. Next, a series of single-group confirmatory factor analyses (CFAs) were conducted, testing the initial 5-item unidimensional model in each country. Model fit was evaluated using approximate fit indices: the comparative fit index (CFI > .95), the Tucker-Lewis index (TLI > .90), the root-mean-square error of approximation (RMSEA < .08), and the standardized root-mean-squared residual (SRMR < .08) (Hu & Bentler, 1999). The specific CFI, TLI, and RMSEA values reported in this study were robust versions of these indices, since a robust maximum likelihood (MLR) estimator was used (Brosseau-Liard et al., 2012; Brosseau-Liard & Savalei, 2014). Although a nonlinear estimator (e.g., WLSMV) is usually preferred for analyzing ordinal scale variables (Brown, 2015; Li, 2016), in the present study the MLR estimator was chosen for two reasons: 1) simulation studies have shown that, when there are five or more response options, both types of estimators provide very similar results (Rhemtulla et al., 2012); and 2) also through simulations, it has been observed that the WLSMV estimator presents limitations when the objective is to perform invariance analysis, since this method tends to present higher rates of type I and II errors and prevents the use of pragmatic criteria (e.g., ΔCFI) to evaluate the lack of invariance (Sass et al., 2014). Thus, because this study analyzed a scale whose items have 5 response options, and because our main objective was to examine measurement invariance, the MLR estimator was used for the CFAs.
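A minimal lavaan sketch of these single-group CFAs is shown below; the data frame ('df'), the 'country' column and the item names (cas1–cas5) are placeholders rather than the study's actual data objects:

```r
library(lavaan)

# One-factor model with the five CAS items (hypothetical column names)
cas_model <- 'anxiety =~ cas1 + cas2 + cas3 + cas4 + cas5'

# Robust maximum likelihood (MLR), fitted separately within one country
fit_peru <- cfa(cas_model,
                data      = subset(df, country == "Peru"),
                estimator = "MLR")

# Robust versions of the approximate fit indices used in the study
fitMeasures(fit_peru, c("cfi.robust", "tli.robust", "rmsea.robust", "srmr"))
```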

After an acceptable baseline model was selected, measurement invariance was analyzed with respect to country and gender. Since the groups were very unbalanced, a special sub-sampling approach was followed (Yoon & Lai, 2018). This procedure consists of sampling, from each of the other groups, a number of individuals equal to the size of the smallest group and performing the invariance analysis on these groups of equal size. This process is repeated a large number of times (in the present study, 100) and, finally, the values obtained across all replications are averaged (for a detailed description, see Yoon & Lai, 2018). Lack of invariance was judged based on two pragmatic criteria, |ΔCFI| > .01 and |ΔRMSEA| > .015 (in the direction of worse fit), although the more conservative Δχ2 is also reported (Chen, 2007; Cheung & Rensvold, 2002). First, configural invariance was examined with a multigroup confirmatory factor analysis in which no equality restrictions were set between groups (Brown, 2015; Dimitrov, 2010). Then, taking the configural model as the baseline, metric and scalar invariance were examined by imposing increasing restrictions (equal loadings and equal intercepts, respectively). Structural invariance was also examined by restricting the latent means to be equal across groups (Dimitrov, 2010).
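The logic of this subsampling procedure can be sketched in R as follows (a simplified illustration of the Yoon & Lai (2018) approach, not the authors' exact code; 'df', 'country' and 'cas_model' are the placeholder objects from the previous sketch):

```r
library(lavaan)

set.seed(123)
n_rep <- 100
n_min <- min(table(df$country))   # size of the smallest country sample

delta_cfi <- replicate(n_rep, {
  # Draw an equal-sized subsample from every country
  sub <- do.call(rbind, lapply(split(df, df$country),
                               function(d) d[sample(nrow(d), n_min), ]))

  configural <- cfa(cas_model, data = sub, group = "country", estimator = "MLR")
  metric     <- cfa(cas_model, data = sub, group = "country", estimator = "MLR",
                    group.equal = "loadings")
  scalar     <- cfa(cas_model, data = sub, group = "country", estimator = "MLR",
                    group.equal = c("loadings", "intercepts"))

  c(metric = fitMeasures(metric, "cfi.robust") - fitMeasures(configural, "cfi.robust"),
    scalar = fitMeasures(scalar, "cfi.robust") - fitMeasures(metric, "cfi.robust"))
})

# Averaged change in CFI across the 100 replications; |value| > .01 would flag non-invariance
rowMeans(delta_cfi)
```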

In order to examine group differences, composite scores were created by averaging the indicators of the final model. Due to the large sample size, effect sizes were preferred over inferential tests to judge the magnitude of the differences. We chose to examine standardized mean differences as a descriptive approach to group differences, instead of resorting to inferential tests such as t-tests. This methodological decision follows a long tradition that questions the use of statistical significance as the main tool for decision making (Cohen, 1994). Also, we decided to work with composite scores (i.e., observable variables), rather than at the level of latent variables, because the latter would imply taking one reference group against which all other groups would be compared (Sass, 2011). Since it is not plausible to take a single country as the reference group, and it was of interest to examine the differences between all possible pairs of countries, we chose to make comparisons using the composite measures. Specifically, Hedges’s g was used, aided by visual inspection of boxplots.
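For reference, Hedges's g for any pair of countries can be computed directly from the composite scores, as in the following sketch (object, column and country names are placeholders; the four retained items correspond to the final model selected in the Results):

```r
# Hedges's g: Cohen's d with a small-sample bias correction
hedges_g <- function(x, y) {
  nx <- length(x); ny <- length(y)
  sp <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))  # pooled SD
  d  <- (mean(x) - mean(y)) / sp
  d * (1 - 3 / (4 * (nx + ny) - 9))   # small-sample correction factor
}

# Composite score: mean of the retained items (placeholder column names)
df$cas4_mean <- rowMeans(df[, c("cas1", "cas2", "cas3", "cas4")])

hedges_g(df$cas4_mean[df$country == "Bolivia"],
         df$cas4_mean[df$country == "Uruguay"])

# Boxplots of the composite score by country (cf. Fig. 1)
boxplot(cas4_mean ~ country, data = df)
```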

Internal consistency reliability was also examined. While Cronbach’s alpha was estimated, it has been shown that this coefficient may be limited in some situations (Dunn et al., 2014). Thus, the omega coefficient was also calculated.
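Both coefficients can be derived from a fitted CFA of the retained items, for instance with semTools, as sketched below (placeholder objects; the exact output of reliability() may vary across package versions):

```r
library(lavaan)
library(semTools)

# One-factor model with the four retained items (placeholder column names)
cas4_model <- 'anxiety =~ cas1 + cas2 + cas3 + cas4'
fit_cas4   <- cfa(cas4_model, data = df, estimator = "MLR")

reliability(fit_cas4)   # returns alpha, omega, and related coefficients

# Cronbach's alpha computed directly from the raw item scores, for comparison
alpha_manual <- function(items) {
  k <- ncol(items)
  (k / (k - 1)) * (1 - sum(apply(items, 2, var)) / var(rowSums(items)))
}
alpha_manual(df[, c("cas1", "cas2", "cas3", "cas4")])
```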

Finally, in order to analyze the CAS’s psychometric functioning more thoroughly, a graded response model (GRM) was fitted to the data (Samejima, 2016). This model estimates one discrimination parameter (a), as well as k-1 difficulty parameters (b), per indicator, where k is the number of response options (in this case, 5). The a parameters indicate the extent to which an item correctly distinguishes between people with lower and higher levels of the construct. The higher the discrimination parameter of an item, the clearer the relationship between a person’s level on the measured attribute (θ) and his or her response to that item (Hambleton et al., 2010). The b parameters, in turn, indicate the level of the latent variable (θ) at which an individual has a probability of .50 of choosing the response option indicated by the parameter or a higher one. Information functions can also be obtained from the estimated coefficients, and they can be plotted to provide a global representation of the measure’s coverage of the construct of interest (Edelen & Reeve, 2007). Indeed, information can be understood in a way similar to the traditional concept of reliability, so the information curves allow us to observe at which levels of the construct (θ) the test presents higher psychometric quality (Furr, 2018).
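A graded response model of this kind can be fitted with the mirt package cited below; the following sketch uses placeholder data and default settings, so it illustrates the model rather than reproducing the authors' exact analysis:

```r
library(mirt)

# Samejima's graded response model: for item i with thresholds b_ik,
#   P(X_i >= k | theta) = 1 / (1 + exp(-a_i * (theta - b_ik)))
# so each item receives one discrimination (a) and k-1 difficulty (b) parameters.

cas4_items <- df[, c("cas1", "cas2", "cas3", "cas4")]   # placeholder column names

grm_fit <- mirt(cas4_items, model = 1, itemtype = "graded")

coef(grm_fit, IRTpars = TRUE, simplify = TRUE)   # a and b1-b4 for each item (cf. Table 6)
plot(grm_fit, type = "infotrace")                # item information curves (cf. Fig. 2)
plot(grm_fit, type = "info")                     # test information function
```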

All analyses were performed with the R program (version 4.0.3). CFA and reliability estimates were performed with the lavaan (version 0.6–8; Rosseel, 2012) and semTools (0.5–3; Jorgensen et al., 2020) packages, respectively. The graded response model was performed with the mirt package (version 1.33.2; Chalmers, 2012).

Results

Item-Level Descriptive Statistics

Table 2 presents the descriptive statistics of each item separately. It can be seen that all the items were positively skewed and had large standard deviations relative to their means. Notably, items 4 and 5 (related to loss of appetite and stomach discomfort, respectively) were severely skewed and kurtotic in all countries. They were also the items with the lowest standard deviations, indicating limited variability.

Table 2 Item-level descriptive statistics of the coronavirus anxiety scale

Single-Group Confirmatory Factor Analysis

A strictly unidimensional model was tested with CFA. The fit of this model was acceptable for some countries but rather poor for others, especially Bolivia, Cuba, Guatemala, and Uruguay (Table 3). Thus, modification indices were examined, and it was evident that the residuals of items 4 and 5 were not independent. As a result, two new models were tested: one removing item 5 (Model 2) and the other removing item 4 (Model 3). As can be seen in Table 3, Model 2 had excellent fit in all countries except Uruguay, which nonetheless had acceptable fit according to most indices. On the other hand, Model 3 had sub-optimal fit in some countries, especially Mexico and Uruguay. Therefore, we selected Model 2 as the baseline for the following analyses.
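As an illustration of this step, modification indices from the 5-item fit can be inspected as follows (a sketch reusing the hypothetical 'fit_peru' object from the Data Analysis section; a large index for the residual covariance cas4 ~~ cas5 would flag the dependence described above):

```r
library(lavaan)

mi <- modindices(fit_peru)                    # all candidate parameter additions
head(mi[order(mi$mi, decreasing = TRUE), ])   # largest modification indices first
```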

Table 3 Single-group confirmatory factor analyses of the coronavirus anxiety scale

Internal Consistency Reliability

The results from the CFAs reported in Table 3 were used to estimate reliability of the CAS-4 in each country. The standardized factor loadings, as well as coefficients alpha and omega, are presented in Table 4. The scale had good internal consistency reliability in all countries (α ≥ .78, ω ≥ .80).

Table 4 Factor loadings and internal consistency reliability of the coronavirus anxiety scale

Measurement Invariance and Mean Comparison by Country

Next, MI by country was examined. The ΔCFI criterion (but not the ΔRMSEA and the Δχ2) suggested that metric and scalar invariance were met (Table 5). This allowed us to test whether means were equal across groups. As shown in Table 5, model fit did get significantly worse when latent means were constrained to be equal (|ΔCFI| > .01).

Table 5 Measurement and structural invariance of the coronavirus anxiety scale

To examine mean differences, composite variables were created by averaging the 4 final items of the CAS (CAS-4). To ease interpretation, Fig. 1 presents a graphical representation of the CAS scores by country. Most differences were either negligible or small, but some of them were of moderate size. Specifically, Argentina had lower scores than Bolivia (g = 0.52) and El Salvador (g = 0.52). Similarly, Uruguay presented clearly lower scores than Bolivia (g = 0.69), Ecuador (g = 0.62), El Salvador (g = 0.66), Guatemala (g = 0.53), and Peru (g = 0.63).

Fig. 1
figure 1

Boxplots Comparing Scores of the Coronavirus Anxiety Scale by Country

Measurement Invariance and Mean Comparison by Gender

MI by gender was also examined. For this analysis, only two genders (female and male) were included due to sample size limitations. According to the ΔCFI criterion, scalar invariance was found (Table 5). Also, model fit did not worsen to a large degree when latent means were constrained to be equal. This was confirmed by the fact that the mean difference of the composite score between males and females was small (d = 0.21).

Graded Response Model

A graded response model was fitted to the data. As presented in Table 6, item 3 was the most discriminative. While the four retained items reasonably covered part of the construct’s spectrum, it is also clear that the items are “difficult” at the upper end of the response scale: choosing the highest response options requires a high level of the latent variable, so lower options remain more probable even when a respondent’s level of anxiety is not particularly low.

Table 6 Parameters of the graded response models (grm) of the coronavirus anxiety scale

Figure 2 presents the item information curves from the GRM. These confirm what was previously observed when examining the item difficulties: the CAS-4 is more informative at average-to-high levels of the construct than at lower levels.

Fig. 2
figure 2

Item Information Curves of the Coronavirus Anxiety Scale. Note. On the x-axis, the latent variable (θ) level is represented in terms of standard deviations from the zero mean value

Discussion

The results of the confirmatory factor analysis (CFA) indicated that the original single-factor model of the CAS showed an acceptable fit for some countries, but poor fit for others (such as Bolivia, Cuba, Guatemala and Uruguay). Therefore, we can say that the original CAS model was not replicated in these countries. In response to this finding, the modification indices were evaluated, which showed that the residuals of items 4 and 5 were not independent. The presence of dependent residuals in factor models has been previously reviewed in the specialized literature (Dominguez-Lara, 2019) and seems to suggest similarity in item content, task demands, measurement errors, and response style (Brown, 2015). Likewise, it may be generated by factors beyond the construct being assessed, such as irrelevant content factors, item phrasing and/or item proximity in the assessment protocol (Dominguez-Lara & Merino-Soto, 2017). Indeed, both item 4 (“I lost interest in eating when I thought about or was exposed to information about the coronavirus”) and item 5 (“I felt nausea or stomach problems when I thought about or was exposed to information about the coronavirus”) refer to appetite and nausea problems as a consequence of exposure to information about COVID-19. Bodily sensations such as those mentioned in items 4 and 5, as well as their interpretation, can be influenced by external factors, such as information coming from the media (Jungmann & Witthöft, 2020). Also, during a virus outbreak, bodily sensations are more likely to be intensified in accordance with this context (Blakey & Abramowitz, 2017). In this regard, items 4 and 5 could be revised, modified, or deleted in future versions of the CAS.

As a result, two new models were proposed: Model 2, in which item 5 was eliminated, and Model 3, in which item 4 was eliminated. Model 3 did not show an optimal fit in some countries, such as Mexico and Uruguay, while Model 2 had an excellent fit in all countries except Uruguay (which had an acceptable fit according to most of the fit indices). For this reason, it is suggested that Model 2 (CAS-4) provides the best fit for the Latin American countries evaluated. Another study has also indicated that the Spanish version of the five-item CAS does not show adequate performance, suggesting that a four-item model (excluding item 4) presents a better fit (Carrillo-Valdez, 2020). A model with correlated errors between items 4 and 5 was not evaluated because it may produce an over- or underestimation of reliability due to the presence of variance not associated with the construct, and may bias the interpretation of measurement precision (Yang & Green, 2010). On the other hand, the reliability results of the CAS-4 were very good in all countries, with alpha values of .78 or higher and omega values of .80 or higher. This agrees with what was reported for the other four-item version of the CAS (Carrillo-Valdez, 2020), as well as with the findings for the original version in different contexts (Ahmed et al., 2020; Caycho-Rodríguez et al., 2021a, 2021b; Evren et al., 2020; Magano et al., 2021; Lee, 2020; Lee et al., 2020b; Lee et al., 2020c; Padovan-Neto et al., 2021).

Furthermore, the evaluation of the factor structure of the CAS-4 by multigroup analysis supported configural invariance; therefore, dysfunctional anxiety related to COVID-19, as described and evaluated in this study, has a uniform meaning in the twelve countries evaluated. However, this level of invariance does not by itself guarantee the comparability of CAS-4 scores. The findings on metric invariance suggest that all items are interpreted and answered in a similar way, so that a change in dysfunctional anxiety related to COVID-19 produces the same change in CAS-4 scores in the twelve countries evaluated. Finally, scalar invariance indicates that the relationship between the observed and latent CAS-4 scores is invariant across countries. These findings provide additional information on the CAS-4, given that its equivalence across countries had not been evaluated previously, not even in its original five-item version. Thus, it is possible to argue that comparisons between the countries evaluated are valid, since the CAS-4 scores have the same meaning for individuals in the twelve countries.

The presence of scalar equivalence is a necessary condition for comparing CAS-4 means across countries (Steenkamp & Baumgartner, 1998). Thus, when comparing dysfunctional anxiety related to COVID-19 among the twelve Latin American countries included, it was found that most of the differences were negligible or small; however, Uruguay presented the lowest mean score, while Peru was one of the countries with the highest presence of dysfunctional anxiety symptoms related to COVID-19. In the case of Uruguay, the low presence of COVID-19-related dysfunctional anxiety symptoms is likely associated with the government’s successful handling of the pandemic. Also, having a relatively small population (approximately 3.5 million inhabitants) facilitated the control of disease transmission, making Uruguay one of the countries with the lowest rates of COVID-19 diagnoses and deaths (Taylor, 2020). In the case of Peru, the findings can be explained by the fact that, at the time of data collection for this study, the country was in one of the most difficult moments of the so-called “second wave” of the pandemic, which could have generated greater fear and anxiety about the disease. The number of infections tripled in February 2021, going from 1688 cases per day in the last week of December 2020 to 5668; in addition, the number of deaths increased from 51 per day during the last week of December 2020 to 180 in the first weeks of February 2021, while the number of people hospitalized for COVID-19 went from 3900 in mid-December to 11,715 in February 2021. This is consistent with previous studies in Latin America indicating that an increase in COVID-19 severity is associated with a dramatic increase in fear and anxiety levels (Feter et al., 2021). Added to this, the limited resources available to the Peruvian health system, late responses from the government (Herrera-Añazco et al., 2021) and scandals about the misuse of vaccines (Chauvin, 2021) generated greater uncertainty and distrust among the population.

Differences in average scores of dysfunctional anxiety related to COVID-19 may be explained by cultural differences and by the information available about the consequences of COVID-19 in each country (Bäuerle et al., 2020). In addition, people may experience different feelings and perceptions due to the different impacts of COVID-19 in their countries. For example, at the time the study data were collected, some countries, such as Argentina and El Salvador, were in a progressive reopening of their economic activities, causing an increase in the rate of contagion. In Chile, the study was conducted during an explosive increase in the number of new infections, due to the end of the summer period and a false perception of security in the population resulting from the successful start of the vaccination campaign. This led to the tightening of existing sanitary measures that left 96% of the Chilean population in social isolation, while there was a strong recession and an unemployment rate that reached 10.4%. Cuba was in a phase of resurgence (second wave) characterized by a number of people infected with the virus much higher than in the first stage of the disease. Colombia was going through the end of its second wave of infection, which resulted in a decrease in the number of infections and deaths, as well as the economic reopening of most sectors and the gradual and progressive return to on-site classes in private schools and universities under an alternating (hybrid) attendance modality.

Additionally, scalar invariance of the CAS-4 according to gender was found, which supports results previously reported in other countries with the original five-item version of the CAS (Ahmed et al., 2020; Caycho-Rodríguez et al., 2021b; Lee, 2020). Therefore, it is possible to state that men and women from the twelve countries understand the construct of dysfunctional anxiety related to COVID-19 in the same way. This finding indicates that the CAS-4 can be reliably used to assess and compare dysfunctional anxiety related to COVID-19 in men and women in the countries involved. The comparison of dysfunctional anxiety related to COVID-19 indicated a small difference between the sexes, with a higher level found in females. This result is expected according to previous literature (Ahmed et al., 2020; Caycho-Rodríguez et al., 2021b; Evren et al., 2020), in which women have even presented anxiety levels three times higher than men during the current pandemic (Wang et al., 2021). This could be because men and women respond differently to stressors during the pandemic, which may cause women to misinterpret their own feelings and be more vulnerable to experiencing other negative emotions such as depression (Nakhostin-Ansari et al., 2020; Özdin & Bayrak Özdin, 2020). In addition to gender differences in susceptibility to elevated levels of anxiety, it has been suggested that other consequences of the pandemic, such as financial problems, increased childcare and schooling responsibilities, care for sick family members, and decreased job opportunities, may be more detrimental to women than to men (Wenham et al., 2020).

The IRT-based results allow us to evaluate the performance of the CAS-4 items. They suggest that item 3 (“I felt paralyzed or frozen when I was thinking about or exposed to information about the coronavirus”) has the best discrimination capacity, which indicates that this item distinguishes most clearly between individuals with different levels of dysfunctional anxiety related to COVID-19. This result agrees with the Peruvian validation, in which, based on IRT, item 3 was reported as one of the items that allowed for a better and more accurate assessment of individuals with moderate and high levels of dysfunctional anxiety related to COVID-19 (Caycho-Rodríguez et al., 2021b). Previously, it has been shown that each of the CAS items represents physiological symptoms related to clinical symptoms of anxiety and fear (Lee, 2020). However, individuals’ responses to item 3 would provide more information about dysfunctional anxiety related to COVID-19, because motor immobility (expressed in item 3) is an involuntary fear response (Marx et al., 2008) that is typically expressed in people who have gone through traumatic situations, such as the COVID-19 pandemic (Lee, 2020). Thus, people with dysfunctional anxiety related to COVID-19 will endorse higher response options on this item than those who do not present this condition. The other items present similar discrimination parameters; moreover, the difficulty parameters of each of the four items are always ascending, which is to be expected in measures of psychological distress: a higher level of the latent trait (in this case, dysfunctional anxiety related to COVID-19) is necessary to endorse the higher response options. In addition, the item information curves, together with the finding that, on average, expressions of COVID-19 anxiety were at the lower end of the continuum, suggest that the CAS-4 items are most informative at average or high levels of COVID-19-related dysfunctional anxiety.

Although the findings of this study are important, and the CAS-4 showed acceptable psychometric properties in all countries, the study inevitably had some limitations. First, the participating countries were not systematically selected, as their inclusion depended on potential authors’ interest in participating and their ability to meet the study requirements. In addition, the study included more countries from South America and only four countries from North and Central America. Therefore, the results may not be generalizable to countries not included in the study, and future research should replicate the study with samples from more countries in North and Central America. Second, the participants were selected by convenience sampling, which did not allow us to have a fully representative sample of the general population of each of the twelve countries. Likewise, only people with Internet access could be surveyed, which could introduce sampling bias. In addition, although the data collection process was the same in all countries, the distribution of demographic characteristics differed across countries; this could be corrected by using appropriate sampling (Pierce et al., 2020). All of the above limits the generalizability of the findings to the general population of the countries. Third, the number of participants varied across countries, which could further limit the generalizability of the findings. While there is evidence that different sample sizes could bias the results obtained using multigroup factor analyses (Brown, 2015), the recommended subsampling approach (Yoon & Lai, 2018), as used in this study, attempted to mitigate this problem. We also did not assess the effect of the form of administration, which can potentially interact with cultural effects in each of the countries, generating a systematic effect of the data collection method. In this regard, future research should use different forms of survey administration (e.g., pencil and paper, and online) to reliably separate the effect of administration (Żemojtel-Piotrowska et al., 2018). Additionally, it should be noted that, in the invariance analysis, invariance was met according to the pragmatic ΔCFI criterion, but not according to the ΔRMSEA criterion. Although it has been argued that the RMSEA may be biased in very small models such as ours (Kenny et al., 2015), and that the ΔCFI criterion is more robust for deciding about the lack of invariance (Cheung & Rensvold, 2002), this limitation should not be ignored. Evidence of convergent and discriminant validity of the CAS-4 with other constructs was also not assessed. Finally, although a Spanish version of the CAS was used, whose translation followed the technical procedures suggested in the scientific literature (Caycho-Rodríguez et al., 2021b), a specific linguistic analysis of the items in each of the countries was not carried out. In this sense, it would be important to take sociolinguistic variation into account during future CAS adaptations, regardless of whether the countries speak the same language, to ensure the fidelity of the interpretations obtained (Peterson et al., 2017). Thus, despite the value of obtaining pan-dialectal versions, additional linguistic adaptations are necessary for certain cultural contexts (Squires et al., 2013).

Conclusions

Despite these limitations, the findings provide evidence that the CAS-4 is a valid measure of dysfunctional anxiety related to COVID-19 in the general population of 12 Latin American countries. Similarly, the CAS-4 was shown to be invariant between groups of men and women. In addition, it has been shown that the CAS-4 can be used to meaningfully compare scores between countries and genders. The IRT analyses showed that the CAS-4 is most useful for identifying people with average to high levels of dysfunctional anxiety related to COVID-19. Additionally, including a large number of Latin American countries in the study and assessing MI probably provides more generalizable results than previous studies, since greater variability and sensitivity to cultural influences on the CAS-4 items could be detected. Finally, the finding of invariance of the CAS-4 across countries is important for conducting cross-cultural assessments (Milfont & Fischer, 2010). As mentioned, without evidence of invariance it cannot be assumed that the results of cross-cultural comparisons are valid (Chen, 2008). It is hoped that the findings presented here will motivate and guide future studies to ensure measurement invariance of the CAS-4 before comparing dysfunctional anxiety related to COVID-19 across different cultures or countries.