According to national-level data from Sweden, teachers are among those professionals who have the highest prevalence of psychiatric diagnoses and report the poorest psychosocial working conditions (Swedish Social Insurance Agency, 2014). The findings of a recent study illustrated that teachers reported higher job demands and lower decision authority in the Swedish education sector compared to workers in many other sectors; this trend was particularly pronounced among female teachers between 1991 and 2013 (Cérdas et al., 2019). These findings coincide with (a) significant Swedish school reforms, which took place in the early 1990s, where the responsibility for primary and secondary schools was transferred from the state to all municipalities and (b) the introduction of New Public Management, a market-oriented strategy for governing public functions (The state’s public investigations, 2014). One consequence of these reforms has been the rapidly growing administrative burden for teachers. Furthermore, the profession has lost status in society, which is associated with poor salary growth (Statistics Sweden, 2017) and a rise in the number of uncertified teachers. Although Sweden is praised for its work on gender equality in general, salary growth for female primary school teachers is worse than it is for male primary school teachers (Statistics Sweden, 2017). In addition, women spend more time doing unpaid work despite working full time (Moreno-Colom, 2017; Väänänen et al., 2005). Unpaid work such as housework and caring for children and other relatives could be a burden for teachers (Bekker et al., 2005; Drake et al., 2019), whereas leisure activities such as relaxation, socializing with friends, sports, or pursuing a hobby could promote recovery from work stressors (Zijlstra & Sonnentag, 2006).

Thus, conditions in both work and home spheres among female and male teachers need to be scrutinized. In the present study we investigated whether the work environment, as well as conditions in the home sphere, differed between male and female teachers. In a longitudinal study, we investigated the development of demands and resources in paid and unpaid work, whether these trends were related to the development of depressive symptoms, and whether there were gender differences in associations between teachers’ demands and resources and their depressive symptoms.

Theoretical Framework

The Job Demands-Resources (JD-R) model (Bakker et al., 2003; Demerouti et al., 2001) provides a theoretical framework for understanding the relation between the work environment and depressive symptoms. The JD-R model postulates that psychosocial work characteristics, defined as job demands and job resources, have implications for work-related outcomes through two rather independent processes (Bakker et al., 2005). These are 1) the health impairment process, basically stating that high job demands predict burnout, and 2) the motivational process, basically stating that high job resources predict work engagement (Bakker et al., 2014). Thus, high demands may over time lead to negative health consequences (through the health impairment process), whereas resources may buffer the impact of job demands on ill-health (Bakker et al., 2014).

In the present study, we focused on depressive symptoms as a potential mental health consequence. Depressive symptoms, including symptoms such as feeling lethargic or low in energy, lack of interest in things, feeling that everything is an effort, blaming oneself, worrying too much, and feeling depressed, can be diagnosed as clinical depression if they reach a particular threshold of severity (Magnusson Hanson et al., 2014). Many studies have shown associations between burnout and depression, and there is an ongoing discussion among researchers about whether they are the same constructs (Bianchi et al., 2015; Brenninkmeyer et al., 2001). However, most recent meta-analyses on the subject indicate that depression and burnout are distinct constructs (Koutsimani et al., 2019). Burnout is related to workplace stress, whereas depression is pervasive and could pertain to all aspects of life (Maslach et al., 1996). The concepts are related in that burnout has been found to predict depressive symptoms (Hakanen & Schaufeli, 2012; Steinhardt et al., 2011). Additionally, because work engagement also has been found to negatively predict depressive symptoms (Hakanen & Schaufeli, 2012), a direct link between resources and depressive symptoms is worth studying.

According to the JD-R model, job demands and job resources vary depending on the particular occupational setting, meaning that different occupational settings may be best described in terms of different job demands and job resources (Bakker et al., 2003). For example, emotional and quantitative work demands, administrative rigidity, pupil misbehaviour and low student motivation are among the demands that have been identified as stressors among teachers (Skaalvik & Skaalvik, 2018; Tuxford & Bradley, 2014), whereas decision authority, supportive relations with colleagues, and relatedness with students have been identified as resources among teachers (Collie et al., 2020; Corso-de-Zúñiga et al., 2020; Fernet et al., 2013; Skaalvik & Skaalvik, 2018). The job demands and resources variables that were examined in the present study were selected based on these studies and were also derived from the literature on work stress in human service professions (Aronsson et al., 2018; Cérdas et al., 2019), of which teachers constitute a large group.

To target potential gender differences with regard to demands and resources in the work and home spheres, we draw upon social constructivist perspectives and social role theory (Eagly & Wood, 2012). According to social role theory, people are socialized by society to adhere to social and occupational roles that align with masculine and feminine gender-stereotypical traits and behaviours. For example, women are still expected to be caring and to focus more on relationships than men (Eagly et al., 2020), which might have implications in the school context, such that female teachers become more attentive to their students' emotional needs. Power and higher social status are inherent to the masculine gender role, paving the way for men's access to privileges in organizations (Acker, 2006; Eagly & Wood, 2012; Risman, 2004) and increasing their chances of maintaining and gaining more favourable job resources (Dubbelt et al., 2016).

Demands and Resources in the Work Sphere for Teachers

The current study focuses on two types of job demands: quantitative work demands and emotional demands. Quantitative work demands refer to workload, time pressure, and conflicting demands and have frequently been studied in occupational health psychology (cf. the job demands-control [DCM] model; Karasek, 1979; Karasek & Theorell, 1990). Emotional demands refer to “exposure to emotionally demanding situations, emotional labour (use of deep and surface acting) and work focused on the emotional well-being of others” (Tuxford & Bradley, 2014, p. 1006). The job resources investigated in this study refer to control in terms of decision authority, which involves being able to decide for oneself what work tasks to do and how to perform them, as well as work-time control, which is the ability to influence one’s own work time and plan holidays (Ala-Mursula et al., 2002). Decision authority has been studied frequently as a subdimension of the concept of job control in the DCM model (Karasek, 1979; Karasek & Theorell, 1990), whereas work-time control has been less studied. However, a systematic review of work-time control concluded that it appears to be an important aspect of the work context associated with health and well-being (Nijp et al., 2012). Furthermore, work-time control has been found to be of particular interest in human service occupations (including teaching), in which this resource has been shown to be limited compared to other occupations (Aronsson et al., 2018).

Studies scrutinizing gender differences in psychosocial working conditions among teachers are few and inconclusive. For example, one study found that female teachers experienced higher quantitative work demands compared to male teachers (Antoniou et al., 2006), whereas two studies found no significant gender differences with regard to quantitative work demands or job control (Griva & Joekes, 2003; Pisanti et al., 2003). Gender differences in emotional demands among teachers have attracted more attention. For example, a systematic review found inconsistent evidence of gender differences in emotion management strategies among teachers (Olson et al., 2019). Where gender differences were found, they indicated that female teachers, to a greater extent than male teachers, both made themselves emotionally available to their students and used deep acting emotion management strategies (Olson et al., 2019).

Demands and Resources in the Home Sphere for Teachers

Demands and resources could also potentially fall outside the work context, stemming from the home sphere. Thus, multiple hours doing unpaid work during spare time, for example housework or caregiving, may be stressful and may take their toll on individuals’ mental health (Drake et al., 2019). In accordance with social role theory, population-based figures from several European countries show that women spend far more time doing unpaid work (e.g., housework and childcare) than men, despite working full time (Moreno-Colom, 2017; Väänänen et al., 2005). This is also the situation in Sweden, where married or cohabiting women spend more time on unpaid work than men do (Statistics Sweden, 2018).

Spare time could also provide recovery opportunities (Zijlstra & Sonnentag, 2006). Multinational data show that full-time working women spend less time on leisure activities compared to full-time working men (Gimenez-Nadal & Sevilla-Sanz, 2011), giving women fewer recovery opportunities (Sonnentag & Zijlstra, 2006). To the best of our knowledge, no such comparisons have been made for teachers specifically. Thus, in the present study we consider unpaid work and time spent on leisure activities as a demand and a resource, respectively, in the home sphere.

Work Demands and Resources and Depressive Symptoms in Teachers

Several prospective reviews and meta-analyses have demonstrated an association between higher job demands and depressive symptoms (Bonde, 2008; Swedish Council on Health Technology Assessment, 2014; Theorell et al., 2015). However, very few studies have been based on more than two time points, allowing for an examination of developmental trends and associations over time. With regard to teachers, the studies are predominantly cross-sectional, supporting associations between quantitative work demands (Baka, 2015; Borrelli et al., 2014) or emotional demands (Madsen et al., 2010; Yin et al., 2018) and depressive symptoms. However, most studies on job demands and mental ill-health in teachers have focused on emotional exhaustion (one of three dimensions of burnout) in which such associations have also been supported (see e.g., Griva & Joekes, 2003; Näring et al., 2006; Pisanti et al., 2003; Tuxford & Bradley, 2014).

Job control, of which decision authority is a subdimension, is a well-studied concept in relation to depressive symptoms. For example, findings from several prospective reviews and meta-analyses show that low job control is associated with depressive symptoms (Bonde, 2008; Swedish Council on Health Technology Assessment, 2014; Theorell et al., 2015). Furthermore, associations between decision authority and both emotional exhaustion (Näring et al., 2006) and major depressive disorders (Besse et al., 2015) have been found among teachers in cross-sectional studies. Finally, regarding work-time control, a meta-analysis found some evidence of a cross-sectional association between work-time control (in particular flexitime) and health, in which affective well-being was included.

Home Demands and Resources and Depressive Symptoms in Teachers

Unpaid work and caring for others can be emotionally exhausting, but leisure activities, through possibilities for recovery, can help alleviate this exhaustion. Some studies support a link between time spent on caring for children or relatives and emotional exhaustion (Bekker et al., 2005; Drake et al., 2019). However, other studies do not show these patterns (Harryson et al., 2012; Sonnentag & Zijlstra, 2006). Several studies have shown associations between leisure activities and positive health outcomes, such as subjective well-being (Kuykendall et al., 2015). With regard to the link to depressive symptoms, the majority of studies concern the beneficial effects of leisure activities on individuals suffering from depression (Manos et al., 2010). However, to the best of our knowledge, the association between leisure activities and depressive symptoms in non-depressed samples has not been well studied, and teachers have not been the focus. As an exception to the former, a cross-sectional study (Pressman et al., 2009) found that the amount of time spent on enjoyable leisure activities was negatively associated with depressive symptoms.

The Current Study

Overall, research focusing on gender differences in adverse working conditions and mental health among teachers is limited (Collie et al., 2020). At best, studies have controlled for gender (Bradley, 2007; Feuerhahn et al., 2013; Gluschkoff et al., 2016; Näring et al., 2006). An exception is a study that showed that poor work-time control increased the odds of psychological distress among female teachers but not male teachers (Ala-Mursula et al., 2002). The limited work that does exist tends to show inconsistent patterns. The aim of the present study is to clarify existing work by testing a set of nine hypotheses:

Hypothesis 1: Female teachers would report higher levels of depressive symptoms both (a) at baseline and (b) over a ten-year period, compared to male teachers.

Hypothesis 2: Female teachers would report higher levels of quantitative work demands both (a) at baseline and (b) over a ten-year period, compared to male teachers.

Hypothesis 3: Female teachers would report higher levels of emotional demands both (a) at baseline and (b) over a ten-year period, compared to male teachers.

Hypothesis 4: Female teachers would report lower levels of decision authority both (a) at baseline and (b) over time, compared to male teachers.

Hypothesis 5: Female teachers would report lower levels of work-time control both (a) at baseline and (b) over a ten-year period, compared to male teachers.

Hypothesis 6: Female teachers would report doing more hours of unpaid work both (a) at baseline and (b) over an eight-year period, compared to male teachers.

Hypothesis 7: Female teachers would spend less time on leisure activities both (a) at baseline and (b) over an eight-year period, compared to male teachers.

Hypothesis 8: Among teachers, (a) higher quantitative work demands, (b) higher emotional demands, (c) lower decision authority, and (d) lower work-time control would be associated with higher depressive symptoms at baseline. Also, their slopes over time would be correlated.

Hypothesis 9: Among teachers, (a) higher demands in the home sphere (i.e., unpaid work) would be associated with higher levels of depressive symptoms and (b) higher resources in the home sphere (i.e., leisure activities) would be associated with lower levels of depressive symptoms at baseline. Also, their slopes over time would be correlated.

Finally, international systematic reviews and meta-analyses of psychosocial work factors and depressive symptoms have suggested that men and women in general are impacted equally by similar levels of adverse working conditions (Bonde, 2008; Swedish Council on Health Technology Assessment, 2014; Theorell et al., 2015). We examined whether this is also the case in a sample of teachers. Hence, we specified the following research questions: Does gender moderate the association between 1) quantitative work demands, 2) emotional demands, 3) decision authority, 4) work-time control, 5) unpaid work, and 6) leisure activities and teachers’ depressive symptoms?

Method

Participants and Procedure

The present study is based on the Swedish Longitudinal Occupational Survey of Health (SLOSH), which is a large longitudinal cohort study that has collected data via questionnaires every second year since 2006. SLOSH is a follow-up study to the nationwide Swedish Work Environment Surveys (SWES) and is approximately representative of the Swedish working population. For an extensive description of the sample and the data collection procedure, see Magnusson Hanson et al. (2018). The present study makes use of data from SLOSH 2008–2018. The inclusion criterion was that the individual had answered at least two SLOSH questionnaires between 2008 and 2018 and had been working as a teacher on both occasions. Subsequent waves were included if the individual had also been working as a teacher at these time points. The response rate among respondents to SLOSH with ethnic backgrounds other than Swedish is low. Hence, it was not possible to take ethnicity into consideration in the present manuscript. The final study sample comprised 1022 participants, including 248 men (24.3%) and 774 women (75.7%), of whom 395 (38.6%) had participated and worked as teachers on two occasions, 336 (32.9%) on three occasions, 121 (11.8%) on four occasions, 87 (8.5%) on five occasions and 83 (8.1%) on six occasions.

Among men, the mean age in 2008 was 48.1 (SD = 9.9) years and, upon entering the first wave, 216 (87%) had a university degree, 130 (52%) were upper secondary school teachers, 85 (34%) were primary school teachers, and 33 (13%) belonged to the category “other teaching professionals.” A total of 201 (81%) were cohabiting and 117 (47%) had children. Among women, the mean age in 2008 was 46.2 (SD = 9.9) years, and, upon entering the first wave, 735 (95%) had a university degree, 175 (23%) were upper secondary school teachers, 422 (55%) were primary school teachers, and 177 (23%) belonged to the category “other teaching professionals.” A total of 627 (81%) were cohabiting, and 402 (52%) had children.

The participants received written information about the SLOSH study by mail. Responding to and returning the survey indicated informed consent, which is in accordance with Swedish regulation and practice. The Regional Research Ethics Board in Stockholm approved the study (Dnr: 2017/236–3).

Measures

Depressive Symptoms

Depressive symptoms were assessed using SCL-CD6, a shortened 6-item Swedish version of the longer Hopkins Symptom Checklist (Lipman, 1986; Magnusson Hanson et al., 2014). SCL-CD6 has shown good psychometric properties and is suitable for assessments of depression in large surveys (Magnusson Hanson et al., 2014). The items concerned the extent to which, over the last week, the respondents had felt depressed, lethargic or low in energy, disinterested in things, felt like everything was an effort, blamed themselves, or worried too much. Response alternatives ranged from 1 (not at all) to 5 (very much). Cronbach’s alpha ranged from .90 to .92 depending on the wave. A mean score was calculated for those with answers on at least four of the items. See Table S1 in the online supplement for full wording of all survey items.
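The scoring rule described above (a mean score only for respondents who answered at least four of the six items) can be illustrated with a minimal Python sketch. The function name `scale_mean` and the example responses are hypothetical and are not taken from SLOSH; missing answers are assumed to be coded as `None`.

```python
from typing import Optional, Sequence


def scale_mean(responses: Sequence[Optional[float]], min_answered: int) -> Optional[float]:
    """Return the mean of the answered items, or None if too few were answered."""
    answered = [r for r in responses if r is not None]
    if len(answered) < min_answered:
        return None  # too many missing items: no scale score is computed
    return sum(answered) / len(answered)


# Hypothetical respondent with four of six items answered (1-5 response scale).
print(scale_mean([2, 3, None, 1, None, 2], min_answered=4))  # 2.0
# Hypothetical respondent with only three items answered: no score.
print(scale_mean([2, None, None, 1, None, 2], min_answered=4))  # None
```

The same helper would cover the other multi-item scales by changing `min_answered` (for example, three for the five-item quantitative work demands scale).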

Demands and Resources in the Work Sphere

Quantitative work demands. Quantitative work demands were measured with five items (e.g., “Do you have to work very fast?” and “Does your work demand too much effort?”) from the Swedish version of the Demand-Control-Support-Questionnaire (DCSQ) (Chungkham et al., 2013; Theorell et al., 1988). The response alternatives ranged from 1 (never/almost never) to 4 (often). Cronbach’s alpha ranged from .67 to .77 depending on the wave. A mean score was calculated for those with answers on at least three of the items.

Emotional Demands. Emotional demands were assessed with one item from the Copenhagen Psychosocial Questionnaire (COPSOQ) (Pejtersen et al., 2010), “Does your work place you in emotionally-disturbing situations?” with response alternatives ranging from 1 (never/almost never) to 4 (often).

Decision Authority. Two items from the Swedish version of DCSQ (Chungkham et al., 2013; Theorell et al., 1988) were used to assess decision authority (i.e., “Do you have a choice in deciding how you do your work?” and “Do you have a choice in deciding what you do at work?”), with response alternatives ranging from 1 (never/almost never) to 4 (often). Responses to both items were required in order to compute a mean score.

Work-Time Control. Six items from the Swedish version of the scale originally developed by Ala-Mursula et al. (2002) were used to assess workers’ ability to influence their own working time arrangements (i.e., length of working day, start and end times, taking breaks, which days to work, running private errands during work, holidays). The response alternatives ranged from 1 (very low degree) to 5 (very high degree). Cronbach’s alphas ranged from .80 to .86 depending on the wave. A mean score was calculated for those with answers on at least four of the items.

Demands and Resources in the Home Sphere

Unpaid Work. For the purpose of measuring time spent on unpaid work, a modified measure was used (Mårdberg et al., 1991). Unpaid work (measured in hours/week) was calculated as a summed score based on three items that assess time spent on (1) housework (household and maintenance work), (2) childcare (leave, homework, care and supervision) and (3) caring for a dependent. For the last two waves (2016 and 2018), the first question was slightly changed so that it only referred to household work.

Leisure Activities. For the purpose of measuring time spent on leisure activities, a modified measure was used (Mårdberg et al., 1991). Leisure activities (measured in hours/week) was calculated as a summed score based on two items that assess time spent on (1) rest and relaxation and (2) recreational activities (socializing with friends, sports, active in an association, culture).

Covariates

When testing Hypotheses 8–9 (i.e., associations between demands and resources and depressive symptoms), we included covariates that may affect both depressive symptoms and job demands and resources specified in the JD-R theory. According to recent statistics from Statistics Sweden (2018), three out of four primary school teachers are women, while among upper secondary school teachers there is a gender-equal distribution. Because work environment and thus stress levels may differ between different teaching categories, when testing associations between work environment and depressive symptoms, we controlled for teacher category grouped as three categories: 1) upper secondary school, 2) primary school and 3) other teaching professionals. In our sample, the last category included mostly remedial teachers (specialist teachers and special educators, who, for example, teach students in need of special support and support other teachers in doing this work), but also other educators with theoretical specialist competence (for example, teachers who teach at music schools or in aesthetic subjects), and a few guidance counsellors.

Moreover, previous studies have indicated that younger high school teachers report higher levels of depressive symptoms than older ones (Kjellström et al., 2016). Turning to personal life, civil status and parental status can be expected to impact the amount of home and family chores, as well as time spent on leisure activities. Taken together, we therefore controlled for teacher category, educational level, and age when testing Hypothesis 8, and in addition to these, for civil and parental status when testing Hypothesis 9. SLOSH includes register-based information on gender, age, educational level (dichotomized into having a university degree or not), and teacher category, as well as self-reported information on civil status (single or married/cohabiting) and parental status (at least one child/no children living at home). Age was divided into four categories: 1) “under 35 years”, 2) “35–44”, 3) “45–54”, 4) “over 54 years.”

Statistical Analyses

SPSS (version 25.0) was used for descriptive analyses. Model testing was conducted in Mplus (version 8.2) (Muthén & Muthén, 1998–2017) and for all models the robust maximum likelihood (MLR) was specified as the estimator allowing for incomplete data and assuming missing at random.

First, to ensure that the measurements were not interpreted differently among males versus females or over time, we tested a set of measurement invariance models across groups and across time for depressive symptoms, quantitative work demands, and work-time control (see Supplement A and Tables S2–S7 in the online supplement for a more detailed description and results of measurement invariance testing). Based on these analyses, depressive symptoms and quantitative work demands were deemed invariant across both gender and time, which indicates that the meaning of these measures was stable across gender and time. However, with regard to work-time control, in addition to only reaching metric invariance across gender, several fit indices showed poor fit across gender and time. As a result, Hypotheses 5a–b and 8d were not tested.

Next, for hypothesis testing, latent growth curve modelling (LGCM) was used to find the univariate growth curve that best described the longitudinal data for a variable in terms of mean intercept (baseline value) and linear and/or non-linear mean slope (i.e., change over time). These parameter estimates could be constrained to be equal between individuals (fixed) or allowed to vary (random). In the case of the latter, a variance different from zero is provided. The multiple-groups command was specified for all models. That is, women and men were allowed to vary in unconstrained models, providing separate growth curve parameter estimates for each gender. The analyses were conducted sequentially, starting with the most parsimonious model (intercept only) and moving to more complex models (adding a linear slope and then a quadratic term). To evaluate model fit, a combination of 1) absolute fit indices (RMSEA [good ≤ .06; acceptable ≤ .08] and SRMR [good ≤ .08; acceptable ≤ .10]), 2) incremental fit indices (CFI [good ≥ .95; acceptable ≥ .90]), and 3) comparisons of competing nested models (ΔChi-square) was used (West et al., 2012). In combination with evaluating fit indices, more parsimonious models were preferred over more complex ones. When MLR is specified as the estimator, a scaling correction factor is needed for testing the difference between competing models (Muthén & Muthén, 1998–2017). See Table S8 in the online supplement for fit indices for competing unconstrained models.
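As an illustration of the model family described above, the most complex univariate growth curve considered (intercept, linear slope, and quadratic term) can be written as follows. The notation is ours and is not taken from the original analyses: $y_{it}$ is the score of individual $i$ in gender group $g$ at wave $t$, and the time scores $\lambda_t$ are assumed to be equally spaced (e.g., $0, 1, \ldots, 5$ for six biennial waves).

```latex
\begin{aligned}
y_{it} &= \eta_{0i} + \eta_{1i}\,\lambda_t + \eta_{2i}\,\lambda_t^{2} + \varepsilon_{it},\\
\eta_{0i} &= \alpha_{0}^{(g)} + \zeta_{0i},\qquad
\eta_{1i} = \alpha_{1}^{(g)} + \zeta_{1i},\qquad
\eta_{2i} = \alpha_{2}^{(g)} + \zeta_{2i}.
\end{aligned}
```

Here $\alpha_{0}^{(g)}$, $\alpha_{1}^{(g)}$, and $\alpha_{2}^{(g)}$ are the mean intercept, mean linear slope, and mean quadratic term in group $g$. A growth parameter is "random" when the variance of its deviation term $\zeta$ is estimated and "fixed" when that variance is set to zero. The intercept-only model drops both $\lambda_t$ terms, and the linear model drops the quadratic term.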

To test for gender differences (Hypotheses 1–7), the best-fitting unconstrained models (women and men allowed to vary) were compared with the models constrained to be equal for women and men. When the unconstrained model proved to be the most optimal model of the two, Wald’s test was used for significance testing to compare mean intercepts and mean slopes for men and women. This procedure was repeated for every demand and resource variable of this study, as well as for depressive symptoms.
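The comparison of constrained and unconstrained models under MLR relies on a scaled chi-square difference. The text does not spell out the computation; the sketch below assumes the commonly used Satorra-Bentler scaled difference formula, in which each model contributes its chi-square value, scaling correction factor, and degrees of freedom. The function name and the input values are hypothetical and do not come from the study.

```python
from typing import Tuple


def scaled_chi2_difference(
    t0: float, c0: float, d0: int,  # nested (constrained) model: chi-square, scaling factor, df
    t1: float, c1: float, d1: int,  # comparison (unconstrained) model: chi-square, scaling factor, df
) -> Tuple[float, int]:
    """Return the scaled chi-square difference and its degrees of freedom."""
    if d0 <= d1:
        raise ValueError("the nested model must have more degrees of freedom")
    # Scaling correction for the difference test.
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    if cd <= 0:
        raise ValueError("non-positive difference scaling; the test is not defined")
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1


# Hypothetical values, for illustration only.
trd, df = scaled_chi2_difference(t0=120.0, c0=1.2, d0=50, t1=100.0, c1=1.1, d1=46)
print(round(trd, 2), df)  # 14.47 4
```

The resulting statistic would then be referred to a chi-square distribution with `df` degrees of freedom to obtain a p-value.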

In order to test associations between each demand or resource and depressive symptoms, parallel process growth modelling was conducted (see Fig. 1). Associations between intercepts as well as between slopes were tested. In this step of the analyses, the univariate model that showed the best fit statistics in the previous step was used. We started out by testing for inclusion of covariates in the following competing models: 1) no covariates, 2) covariates on intercepts, 3) covariates on intercepts and slopes. Based on the recommendations for fit indices (presented above), the third alternative (covariates on both intercepts and slopes) was chosen for all parallel models. Secondly, multiple groups with gender were specified and covariate effects were allowed to vary between women and men. With regard to Hypotheses 8–9, competing models were tested in a sequence: 1) no associations allowed, 2) constrained equal associations for women and men and 3) unconstrained (allowed to vary) associations for women and men.

Fig. 1 Conceptual diagram for parallel process growth models. Note. Idep and Sdep denote the intercept and slope of depressive symptoms. IA and SA denote the intercept and slope of the independent variable (A), which is either quantitative work demands, emotional demands, decision authority, unpaid work or leisure activities. Unpaid work and leisure activities were only measured at T2–T6 and were therefore studied together with T2–T6 of depressive symptoms. Moreover, the covariates married/cohabiting and having children were only controlled for in the models with unpaid work or leisure activities

Dropout analyses were conducted, comparing the study sample (N = 1022) at their first wave with those teachers who answered at only one time point (T1, T2, T3, T4, or T5; N = 448). Table 1 shows that dropouts were more often men, slightly older (2.3 years), and lower-educated compared to the baseline study sample. Dropouts more often belonged to the category “other teaching professionals”; fewer of them were primary teachers and somewhat fewer secondary teachers. Independent two-way analyses of variance (ANOVA), with the independent factors gender and dropout, indicated that the dropouts reported somewhat lower levels of unpaid work (F(1,1) = 4.796 (p < .05); partial η2 = .004) and leisure activities (F(1,1) = 5.056 (p < .05); partial η2 = .004) compared to the study sample, but there were no differences with regard to depressive symptoms, decision authority, or emotional and quantitative work demands. Importantly, the analyses showed no interaction effect between gender and dropout on any study variable, which implies that dropout should not impact the results of the tested hypotheses. The criterion of calculating a mean score only for persons with answers on at least 60% of the items on a particular scale at a particular time point resulted in low internal attrition, ranging from 9.5% to 14.6% (for any scale at any time point).

Table 1 Dropout analyses. Percentages and Means (SD). Chi-square, independent t-test and two-way ANOVA and effect sizes

Results

Table 2 presents the descriptive statistics for the study variables at all time points for men and women. See Table S9 in the online supplement for Pearson’s correlations of the study variables at baseline (first time responses). Table 5 shows fit indices and unstandardized parameter estimates of the most optimal univariate LGCMs for the study variables. All univariate growth models were unconditional (i.e., without covariates).

Table 2 Descriptives for the study variables for men and women

Depressive Symptoms

For depressive symptoms, the final univariate LGCM was a model with a random intercept and fixed linear slope, meaning that the intercepts were allowed to vary between individuals, whereas the slopes were only allowed to vary between women and men. According to the fit indices, the unconstrained model showed significantly better fit to the data (Δχ2(4) = 21.48 [p < .001]) than the constrained model, which indicates that there were gender differences in the model. The fit indices for the unconstrained model showed good fit (χ2(46) = 57.19, RMSEA = .02, CFI = .98; SRMR = .07). There was a significant difference in mean intercepts between women (b = 2.19) and men (b = 1.85), with women showing higher levels of depressive symptoms at baseline than men (corresponding to the effect size Cohen’s d = .56), thus supporting Hypothesis 1a. With regards to mean linear slopes, women had a negative significant slope (b = -.05, p < .001), while the slope for men was not significantly different from zero (see Fig. 2a). The trajectory was significantly higher for women compared to men, thus supporting Hypothesis 1b.

Fig. 2 Changes in depressive symptoms (a), quantitative work demands (b), emotional demands (c), decision authority (d), unpaid work (e), and leisure activities (f) over time

Demands and Resources in the Work Sphere

For quantitative work demands, the final univariate LGCM was a model with a random intercept and random linear slope. According to the fit indices, the unconstrained model showed significantly better fit to the data (Δχ2(6) = 33.44 [p < .001]) than the constrained model, which indicates that there were gender differences in the model. The fit indices for the unconstrained model showed good fit (χ2(30) = 43.69, RMSEA = .03, CFI = .99; SRMR = .08). Mean intercepts and mean slopes were significantly different between women and men (Fig. 2b), to the extent that women (b = 2.93) reported significantly higher quantitative work demands at baseline than men (b = 2.70) (corresponding to Cohen’s d = .63), thus supporting Hypothesis 2a. Women had a small but significant negative mean slope (b = -.03, p < .001), while the slope for men was not significantly different from zero. The trajectory was higher for women compared to men, thus supporting Hypothesis 2b.

For emotional demands, the final univariate LGCM was a model with a random intercept, random linear slope and a fixed quadratic term. According to the fit indices, the unconstrained model showed significantly better fit to the data (Δχ2(4) = 81.34 [p < .001]) than the constrained model, which indicates that there were gender differences in the model. Fit indices showed good or acceptable fit to the data, although SRMR was a bit high (χ2(28) = 66.16, RMSEA = .05, CFI = .92; SRMR = .13). Women reported significantly higher (b = 3.31) emotional demands than men (b = 3.05) (Cohen’s d = .67), thus supporting Hypothesis 3a. In addition, the mean linear slopes and the quadratic slopes did not differ statistically between genders, meaning that women followed a trajectory over time of higher levels of emotional demands than men, thus supporting Hypothesis 3b (Fig. 2c).

For decision authority, the final univariate LGCM was a model with a random intercept and random linear slope. According to the fit indices, the unconstrained model did not show significantly better fit to the data (Δχ2(6) = 5.96 [p = .43]) than the constrained model, which indicates that there were no gender differences in the model. The fit indices for the constrained model showed acceptable fit regarding most fit indices, although SRMR was a bit high (χ2(36) = 76.63, RMSEA = .05, CFI = .93; SRMR = .12). The mean intercept of decision authority for both men and women was b = 3.42, and there was a small negative but significant mean slope (b = -.01, p < .05) (see Fig. 2d). Thus, Hypotheses 4a and 4b were not supported.

Demands and Resources in the Home Sphere

For time spent on unpaid work, the final univariate LGCM was a model with a random intercept and random linear slope. According to the fit indices, the unconstrained model showed significantly better fit to the data (Δχ2(6) = 41.94 [p < .001]) than the constrained model, which indicates that there were gender differences in the model. The fit indices for the unconstrained model showed good fit (χ2(28) = 52.56, RMSEA = .04, CFI = .96; SRMR = .07). As shown in Tables 3 and 4, women reported doing significantly more hours of unpaid work (b = 14.58) at baseline compared to men (b = 12.22) (Cohen’s d = .29), thus supporting Hypothesis 6a. Moreover, the mean slopes were negative and significant for both genders, following parallel trajectories (Fig. 2e). This means that unpaid work decreased in a similar way over time for men and women. However, women reported doing higher levels of unpaid work than men over time. Thus, Hypothesis 6b was supported.

Table 3 LGCM. Fit indices and parameter estimates for the ‘best’ unconditional models (without covariates)
Table 4 LGCM. Parameter estimates (b) for the ‘best’ unconditional models (without covariates)

With regard to leisure activities, the final univariate LGCM was a model with a random intercept and random linear slope. According to the fit indices, the unconstrained model showed significantly better fit to the data (Δχ2(6) = 18.46 [p < .01]) than the constrained model, which indicates that there were gender differences in the model. The fit indices for the unconstrained model showed good fit (χ2(28) = 26.49, RMSEA = .00, CFI = 1.00; SRMR = .07). Women reported spending fewer hours on leisure activities (b = 13.92) at baseline than men (b = 15.16; p = .056; Cohen’s d = .22). Over time, the mean slope was significant and negative for women only (b = -.32, p < .001), resulting in the gender difference in time spent on leisure activities increasing over time (Fig. 2f). In fact, at the second time point, the gender difference between the trajectories had already become significant (Wald’s test 8.34, p < .01), with women reporting spending less time on leisure activities compared to men. However, because the gender difference in hours spent on leisure activities was only marginally statistically significant at baseline, Hypothesis 7a was not fully supported, while Hypothesis 7b was.
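
The RMSEA of .00 reported for the leisure model follows from its χ2 being smaller than its degrees of freedom. The Python sketch below uses the common point-estimate formula for RMSEA to show this. It rests on two assumptions that the text does not confirm: that the effective sample size is the study sample of N = 1022, and that a √(number of groups) multiplier, which some SEM software applies to multi-group models, was used for the two-group (women and men) models.

```python
import math


def rmsea(chi2: float, df: int, n: int, groups: int = 1) -> float:
    """Point estimate of RMSEA, truncated at zero when chi2 <= df.

    `groups` applies a sqrt(G) multi-group adjustment (an assumption here).
    """
    excess = max(chi2 - df, 0.0)
    return math.sqrt(groups) * math.sqrt(excess / (df * (n - 1)))


# Leisure activities: chi2(28) = 26.49 is below its df, so RMSEA is zero
print(rmsea(26.49, 28, 1022, groups=2))            # 0.0
# Unpaid work: chi2(28) = 52.56
print(round(rmsea(52.56, 28, 1022, groups=2), 2))  # 0.04
```

Under these assumptions the sketch agrees with the rounded RMSEA values reported for both home-sphere models, but the exact N and formula used in the original analyses may differ.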

Parallel Growth Models for Demands, Resources, and Depressive Symptoms

Fit indices and unstandardized parameter estimates are presented in Table 5. First, the relationship between quantitative work demands and depressive symptoms was scrutinized. A conditional parallel growth model – constrained by equally-sized associations for women and men – showed good fit to the data (χ2(217) = 308.88, RMSEA = .03, CFI = .96; SRMR = .07). A positive and significant association was found between the intercepts of quantitative work demands and depressive symptoms (b = .93, p < .001). Also, the slopes of quantitative work demands and depressive symptoms were significantly associated (b = .97, p < .001) to the extent that a more rapid decrease in quantitative work demands was associated with a faster decrease in depressive symptoms. Thus, Hypothesis 8a was fully supported. The model did not improve significantly (Δχ2(3) = .66, p = .88) in an unconstrained model, implying no gender difference in this relationship.

Table 5 Parallel process growth models

For the relationship between emotional demands and depressive symptoms, a conditional parallel growth model – constrained by equally-sized associations for women and men – showed good fit to the data (χ2(215) = 293.09, RMSEA = .03, CFI = .95; SRMR = .08). The intercept of emotional demands was positively and significantly associated with the intercept of depressive symptoms (b = .68, p < .001). The slopes of emotional demands and depressive symptoms were also associated (b = .62, p < .001) to the extent that a more rapid increase in emotional demands was associated with a slower decrease in depressive symptoms. Taken together, Hypothesis 8b was fully supported. The model did not improve significantly (Δχ2(3) = 1.70, p = .64) in an unconstrained model, implying that there was no gender difference in this association.

For the association between decision authority and depressive symptoms, the conditional parallel growth model showed good fit (χ2(217) = 299.22, RMSEA = .03, CFI = .95; SRMR = .08). However, whereas fit for this model was significantly better than it was in the model with no associations (Δχ2(3) = 18.81, p < .001), the associations found between decision authority and depressive symptoms were only marginally significant. Thus, Hypothesis 8c was not supported. The model did not improve significantly (Δχ2(3) = 1.46, p = .69) in an unconstrained model, implying that there were no gender differences in this association.
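
The chi-square difference tests used throughout these analyses compare nested models by referring the difference in χ2 to a chi-square distribution with the difference in degrees of freedom. The following Python sketch computes the corresponding p-values with the standard library only, using the closed-form upper-tail probabilities for integer degrees of freedom. It is an illustration of the ordinary (unscaled) difference test and is not necessarily the exact procedure or software used in the study; the function name is hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variate, integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series of correction terms
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)  # first series term
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return total


# Difference tests reported above: (delta chi-square, delta df)
for delta_chi2, delta_df in [(21.48, 4), (5.96, 6), (0.66, 3), (1.70, 3), (1.46, 3)]:
    print(delta_chi2, delta_df, round(chi2_sf(delta_chi2, delta_df), 3))
```

Rounded to two decimals, the non-significant tests come out at about .43, .88, .64 and .69, in line with the p-values reported above, and the test for depressive symptoms falls below .001.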

For the associations among the demands and resources in the home sphere and depressive symptoms, the conditional parallel growth model – constrained by no associations between unpaid work and depressive symptoms – showed good fit (χ2(194) = 229.15, RMSEA = .02, CFI = .98; SRMR = .05). No significant associations were found between the growth process of unpaid work and the growth process of depressive symptoms in terms of intercepts or slopes. Hence, Hypothesis 9a was not supported.

For the association between leisure activities and depressive symptoms, the conditional parallel growth model – constrained by equally-sized associations for women and men – showed good fit (χ2(191) = 112.21, RMSEA = .00, CFI = 1.0; SRMR = .06). A negative and significant association was found between the intercept of leisure activities and the intercept of depressive symptoms (b = -.05, p < .001). No significant association was found between the slopes. Taken together, Hypothesis 9b was partly supported. An unconstrained model showed no significant gender differences according to Wald’s tests.

Overall, there were no gender differences in the associations between demands or resources on the one hand and depressive symptoms on the other.

Discussion

The present study investigated gender differences in teachers’ demands and resources in the work and home spheres, as well as gender differences in associations with depressive symptoms. We examined associations between both demands and resources, respectively, and depressive symptoms at baseline and their parallel development over time. The results show that female teachers had more depressive symptoms, higher demands across life domains and poorer resources in the home sphere. Emotional and quantitative work demands as well as less time on leisure activities were associated with depressive symptoms. There were no gender differences in the association between high demands and poor resources on the one hand, and depressive symptoms on the other. This indicates that the higher levels of depressive symptoms among female teachers can be explained by their greater exposure to challenging conditions in both the work and home spheres.

With regard to depressive symptoms, as hypothesized, female teachers reported higher levels compared to male teachers, which supports the premise that women generally receive more psychiatric diagnoses such as depression, anxiety, and adjustment disorders (Lidwall & Olsson-Bohlin, 2016). Regarding job demands, as hypothesized, compared to male teachers, female teachers reported worse psychosocial working conditions in terms of both higher emotional and quantitative work demands. Our findings are in line with a previous study showing that female teachers experienced higher quantitative work demands compared to male teachers (Antoniou et al., 2006), but contradict the results of other studies in which no gender differences in reported quantitative work demands among teachers were found (Griva & Joekes, 2003; Pisanti et al., 2003).

The differences in reported job demands between male and female teachers may be understood in several ways. Despite similar formal work descriptions, women may feel less able to limit their engagement and responsibilities towards colleagues, supervisors, students and parents, which aligns with gender-role expectations (Eagly & Wood, 2012). For example, consistent with social role theory, a recent systematic review suggests that female teachers and male teachers differed in the way they managed their work-related emotions (Olson et al., 2019). Specifically, female teachers felt the need to hide their anger and tended to care more and make themselves more emotionally available to their students, whereas male teachers to a larger extent kept their distance (Olson et al., 2019).

The absence of gender differences in decision authority found in the present study supports the results of previous studies (Griva & Joekes, 2003; Pisanti et al., 2003), showing no gender differences in job control among teachers. In light of a more recent study investigating gender differences in levels of perceived autonomy support, referring to the extent to which the principal supports high decision authority among their teachers, no difference between male and female teachers was found (Collie et al., 2020). In the present study we did not explore the role of the supervisor, but relied solely on reports of the teachers’ experiences of decision authority and work-time control. The role of the principal in creating equal working conditions for male and female teachers is an important question that deserves further attention in future research.

Our results regarding demands and resources in the home sphere confirm the results of several previous studies, which show that women spend more time on childcare and housework (Bekker et al., 2005; Harryson et al., 2012; Moreno-Colom, 2017) and less time on leisure activities (Gimenez-Nadal & Sevilla-Sanz, 2011). This also supports social role theory (Eagly & Wood, 2012), suggesting that women, in line with their gender role, care more for their home and family, leaving less time for leisure activities.

The Role of Work and Home Spheres in Depressive Symptoms

As expected, there were no gender differences in the association between psychosocial working conditions and depressive symptoms. This lends support to the conclusions of other researchers (Swedish Council on Health Technology Assessment, 2014; Theorell et al., 2015), arguing that it is women’s higher exposure to poor working conditions that may explain their higher levels of depressive symptoms, rather than a greater vulnerability to such exposure. In accordance with the JD-R model (Bakker et al., 2003), we found that both emotional and quantitative work demands were associated with depressive symptoms at baseline and that their development over time was associated with the development of depressive symptoms. These findings are in line with previous studies on teachers (Baka, 2015; Borrelli et al., 2014; Madsen et al., 2010; Skaalvik & Skaalvik, 2018) as well as with several reviews and meta-analyses, which conclude that job demands are related to depressive symptoms among workers in general (Bonde, 2008; Swedish Council on Health Technology Assessment, 2014; Theorell et al., 2015). Furthermore, lack of decision authority was not associated with depressive symptoms, which contradicts previous research showing that decision authority among teachers in particular (Besse et al., 2015) and job control in general (Bonde, 2008; Swedish Council on Health Technology Assessment, 2014; Theorell et al., 2015) were negatively related to depressive symptoms. However, in accordance with the JD-R model (Bakker et al., 2003), job resources other than decision authority might be more influential with regard to the mental health of teachers and potential gender differences. For example, as a consequence of the growing administrative burden for teachers (The state’s public investigations, 2014), administrative support might be a relevant job resource to study in future research on teachers and depressive symptoms (American Federation of Teachers, 2015; Tickle et al., 2011). Moreover, as teachers often face high emotional demands, emotional support is another resource that has been shown to be beneficial and deserves more attention (Feuerhahn et al., 2013).

Broadening the perspective of demands and resources falling outside of the JD-R model to factors in the home sphere, time spent doing unpaid work was not associated with depressive symptoms. Although this finding was unexpected, it was in line with many previous studies showing no association between unpaid work and psychological distress and fatigue (Harryson et al., 2012; Sonnentag & Zijlstra, 2006). Thus, despite an unequal burden of unpaid work between male and female teachers, time spent doing unpaid work does not appear to be a factor that could explain higher levels of depressive symptoms in female teachers. However, it has been suggested that it is the responsibility for planning and managing unpaid work among women, rather than the hours spent doing the work that may be linked to psychological distress (Ciciolla & Luthar, 2019; Harryson et al., 2012). This could be an interesting avenue for future research. Time spent on leisure activities, on the other hand, had a small negative association with depressive symptoms, supporting the results from previous studies (Pressman et al., 2009). However, a meta-analysis stresses the importance of engaging in a wide range of activities (i.e., not just passive activities such as watching TV) as well as in enjoyable activities for optimal health outcomes (Kuykendall et al., 2015). Thus, it would be fruitful if future studies could explore this in more detail by separating passive activities from active activities, and targeting activity satisfaction.

Regarding development over time, our expectation was that teachers’ job demands had increased and their resources decreased. However, the expected trends were not found, which may be either because there were in fact no such trends or because individual development, such as growing older and gaining more work experience, may have had a larger positive impact on individuals’ development. Our finding that depressive symptoms slightly decreased over time for female teachers, besides the observed link to the development in both emotional and quantitative work demands over time, may be explained by the fact that the respondents grew older, whereas psychiatric diagnoses are most common (and are growing the fastest) among the “younger” age group of 30–49 years, and particularly between 30–39 years (Lidwall & Olsson-Bohlin, 2016). This may have been reflected in the study sample, because 1) we found that older female teachers compared to younger peers reported lower depressive symptoms and 2) during the study period the mean age changed from 46 to 56 years for female teachers, indicating that a change towards lower depressive symptoms across time may be associated with the increasing age of the study population. Another potential reason for decreased depressive symptoms may be the healthy worker effect (McMichael, 1976), which means that those with poor health may leave the labour market prematurely (e.g., through long-term sick leave). Alternatively, they may simply drop out of the study, although the dropout analysis did not support such an effect.

An important contribution of the present study is that within the same occupation (i.e., teaching), women reported worse psychosocial working conditions than men. One question that arises is whether this finding could partly be explained by the fact that formal work roles differed significantly between female and male teachers in our study. For example, as reflected in national statistics, most primary school teachers and “other teaching professionals” in the present study were women, while the gender distribution among upper secondary school teachers was equal. To ensure that the observed differences between men and women were not only attributable to different psychosocial working conditions for different kinds of teachers, we conducted sensitivity analyses, in which data were stratified according to teaching level. The results indicate that our finding that female teachers report poorer working conditions than male teachers also applies, in several cases, to men and women who work in the same teaching category. In addition, the results suggest that gender differences in demands and resources can partly be explained by female teachers specializing in teaching fields with poorer working conditions than those of male teachers. It is noteworthy that primary school teachers as a rule seemed to report poorer working conditions than the other two teacher categories, which might be a consequence of the poorer working conditions that are often found in female-dominated workplaces. Also, there might be gender segregation within teaching categories, for example, among upper secondary school teachers, whereby more women may be teaching female-coded subjects (e.g., languages) and more men may be teaching male-coded subjects (e.g., science), which may be associated with different demands and resources.

The purpose of the present study, however, was to get a broader picture of gender differences in the teaching profession in general, which is reflected in the choice of data. The study is based on data from a representative sample of the Swedish working population and thus includes teachers working in all parts of the country and in schools, public and private, with a variety of socioeconomic backgrounds. It should therefore reflect the numbers of teachers in different teaching categories, as well as the different numbers of men and women in each category. Hence, the results are likely generalisable to the general Swedish teaching workforce. However, gender equality is higher in Sweden than in many other countries, and to what extent gender differences in demands and resources are generalisable to countries in which the division of labour between men and women is more traditional is difficult to say. It is, however, evident from the present results that even in Sweden, a relatively gender-equal country, demands and resources across life domains in teachers are unequal, and to women’s disadvantage. Finally, there are probably a variety of reasons for gender differences in reported depressive symptoms, such as different response patterns for men and women as well as biological differences.

Limitations and Future Research Directions

Despite a number of strengths of this study, including the large, representative sample and the longitudinal approach with six time points and validated scales, this study has some limitations that should be taken into account. First, only self-report data were used in the focal analyses (several control variables utilized register data). This poses a risk of inflated associations due to common-method bias (Podsakoff et al., 2003). However, this risk is less of an issue if longitudinal data is used (Podsakoff et al., 2003; Spector, 1994), and at least five measurement points for each parallel process analysis were used in the present study.

Second, emotional demands were measured with a single item, which might limit the validity of its association with depressive symptoms. Here also, a multi-faceted measure could perhaps shed more light on gender differences. In the measurement invariance testing, quantitative work demands, although it is a validated and widely used scale, showed fit indices that are on the borderline of being acceptable (RMSEA around .10). One should in general not rely on specific RMSEA cut-off values or on single fit indices when deciding on accepting or rejecting a model (Chen et al., 2008); yet, our results indicate that the measure for quantitative work demands may need to be developed to better capture teachers’ work situation. Moreover, in the last two waves, the question on hours spent on housework was slightly changed to only refer to household work. This may have somewhat increased the gender differences, as maintenance work was not included in these two waves. However, sensitivity analyses showed that excluding these two waves did not change the conclusions: there was still a significant gender difference, and no associations between unpaid work and depressive symptoms were found.

In our sample, work-time control showed poor measurement invariance, both across gender and across time, indicating that the construct is interpreted differently by female and male teachers and at different time points, making it unsuitable for comparing levels between men and women. Thus, we did not present results of hypothesis testing for work-time control. A possible reason for this measurement problem might be that work-time control is rather limited for teachers in general; they may not expect to have the opportunity to use flexitime or take any other holidays than the regular school holidays. The other study variables were satisfactory in the preliminary analyses: Although more men dropped out, the dropout analyses suggest that this should not impact our conclusions as there were no interaction effects between gender and dropout on any study variable.

Third, despite the benefits of tracking individuals for up to six waves, longitudinal cohort data often (as in our study) results in large amounts of missing data, which is a potential bias. Maximum likelihood methods assume normal distribution and missing at random, which is impossible to ensure (Yuan et al., 2012). However, only a few small differences were found (see Table S10 in the online supplement) when comparing those who participated only two times (the minimum inclusion criteria) with those who participated five or six times. The ratio between the proportion of male and female teachers did not differ between the two groups and in very few cases did the attrition patterns differ between men and women. This suggests that attrition should have little impact on our conclusions.

Fourth, we implicitly assumed that psychosocial factors had implications for depressive symptoms, but when testing causal associations, three properties should be met (Bollen, 1989). These are: 1) there are associations, 2) the independent variable precedes the dependent variable in time and 3) no confounding variables cause the association. Associations were supported and the development of both psychosocial factors and depressive symptoms over at least five time points were studied. Nevertheless, we cannot entirely rule out that confounding variables explain some of these associations even though we controlled for several demographic covariates. Also, there may be an additional association in the direction opposite to that hypothesized (from depressive symptoms to reported psychosocial factors).

It should be noted that the main aim of this study was to investigate gender differences in demands and resources and their associations with depressive symptoms, and associations between their developments over time. This method captures rather stable trends over time (mostly linear and quadratic), and not fluctuations (other methods are better suited to studying those). Recognizing these limitations, this study sheds light on how demands and resources in the work and home spheres develop in parallel with depressive symptoms over time for male and female teachers.

Practice Implications

The results of the present study indicate that psychosocial factors such as higher demands and poorer resources in both the work and home spheres are important to consider in future studies and policy development to support the well-being of female teachers. It is important to underscore that our results do not suggest that women are more vulnerable than men to psychosocial factors in the work and home spheres with regard to mental health, but rather that gender differences in depression are attributable to higher demands and poorer resources among women. Hence, implications for practice may include reducing the higher demands and providing the resources that are lacking (time, especially leisure time) needed to balance work and home life and mental health. Women experience more stress and less stress relief, which is harmful. Therefore, working conditions for female teachers need careful scrutiny and rebalancing. Future studies should investigate gender differences including a wider range of demands and resources in relation to depressive symptoms as well as in specific teacher categories in order to advance the knowledge of gender differences in teaching. For example, the role of the principal in creating good and equal working conditions for male and female teachers, in terms of support regarding both administrative tasks and autonomy, deserves further attention in future research.

Conclusion

The present study focused on a group in the labour market that is highly exposed to occupational stress – teachers – and contributes new knowledge on how more demands and fewer resources in the work and home spheres occur over time in parallel with depressive symptoms. The findings demonstrated that female teachers reported worse conditions than male teachers regarding both demands and resources, but these demands and resources were associated with depressive symptoms in both women and men. Thus, the higher levels of depressive symptoms in female teachers may be explained by their exposure to more challenging conditions in both the work and home spheres compared to male teachers.

Authors’ contribution statements

All authors participated actively and made substantial intellectual contributions to the entire research and review process, including interpretation of results and revising manuscript drafts. JS was the main contributor of the study and conducted the statistical analyses and drafted the manuscript. CM and ST particularly contributed to the study aim. AN made specific contributions to the study aim, research design, choice of statistical methods and funded the study. All authors approved the final version of the manuscript.