Introduction

Bipolar disorder (BD) is a life-course illness characterized by periods of euthymia alternating with depressive, manic, and mixed episodes. Recently, it has been reconceptualized as a changing disorder whose clinical manifestations differ over the course of its development, from an at-risk or latent stage to a late or end stage, which supports the assumption that it would benefit from a staging model1. Clinical manifestations consistent with staging models include cognitive deterioration and functional decline2,3, changes in inflammatory and neuroanatomical biomarkers3,4,5,6,7,8,9, poorer response to treatment10, and worse self-reported quality of life (QoL)11 linked to disorder progression.

Although different staging models have been proposed from a theoretical perspective12,13,14,15,16, studies on BD that develop staging empirically and use longitudinal data are scarce17. So far, only one study18 has tested the applicability of a theoretical clinical staging model for BD progressively developed by different authors12,13,14.

Recent research showed that BD could be fitted to a mathematical model19,20. In a previous study, we developed a comprehensive, evidence-based k-means clustering model for BD that distinguishes five clusters ranging from the least severe (stage 1) to the most severe (stage 5) based on clinical characteristics, physical health, cognitive performance, real-world functioning, and health-related QoL21. In this study, we aim to use a different sample to test the construct validity of our model as well as its longitudinal validity, thereby providing proof of its validity as a staging model for use in patients with BD. We hypothesized that our model would behave properly with a different sample and that, at 3-year follow-up, the majority of the patients would remain at the same stage or would progress or regress one stage, while only a small proportion of patients would progress or regress two or more stages.

Materials and methods

This is a prospective, 3-year follow-up, multicenter study conducted at four sites in Spain (Oviedo, Barcelona, and Valencia) with the aim of developing and validating an empirical staging model for use in patients with BD.

The baseline study was conducted between April 2012 and December 2014 (ref. PI11/02493), and the 3-year follow-up was conducted between April 2015 and July 2018 (ref. PI14/02037). The Clinical Research Ethics Committee of Hospital Universitario Central de Asturias in Oviedo approved the study protocol (refs. 36/12 and 142/15). Written informed consent was obtained from all participants prior to enrollment.

Participants

Of the 224 patients enrolled at baseline, 129 (57.6%) completed the 3-year follow-up assessment. Inclusion criteria at baseline were: (1) outpatients with a SCID-I-confirmed diagnosis of BD according to DSM-IV-TR22 in treatment at any of the four participating sites; (2) age ≥ 18 years; and (3) written informed consent to participate in the study. Exclusion criteria consisted only of refusal to participate in the study.

Assessments

Assessments were identical at baseline and at 3-year follow-up and included: (1) demographic and clinical information obtained from the patients' clinical records (clinical course and specific characteristics of BD, psychiatric and physical comorbidities, officially recognized disability, and psychopharmacological treatments); (2) psychometric assessment: (2a) clinician-rated outcome measures (CROMs): Spanish versions of the Hamilton Depression Rating Scale (HDRS)23, Hamilton Anxiety Rating Scale (HARS)24, Young Mania Rating Scale (YMRS)25, Clinical Global Impression (CGI)26, Oviedo Sleep Questionnaire (OSQ)27, Changes in Sexual Functioning Questionnaire (CSFQ)28, Screen for Cognitive Impairment in Psychiatry (SCIP)29, Global Assessment of Functioning (GAF)30, and Functioning Assessment Short Test (FAST)31; (2b) patient-reported outcome measures (PROMs): the Spanish version of the MOS 36-item Short-Form Health Survey (SF-36)32; and (3) anthropometry (height, weight, waist circumference, and body mass index), vital signs (heart rate and blood pressure), and lab results [hematology (erythrocytes, hemoglobin, leukocytes, platelets), lipid profile (cholesterol, LDL cholesterol, HDL cholesterol, triglycerides), glucose, hepatic function (GPT, GOT, GGT, bilirubin), renal function (creatinine, BUN), hormones (PRL, TSH), and inflammatory and oxidative biomarkers (CRP, homocysteine)] (for further detail, see Fuente-Tomas et al.21).

Our staging model

The first step in the development of our staging model was to create a cluster-based method to classify patients with BD using a cross-sectional sample21. We reduced the data using k-means clustering. This technique aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. Between-group comparisons of variables were then performed with chi-square tests and univariate ANOVA followed by Tukey’s honestly significant difference post-hoc testing. Variables showing statistically significant differences between groups were selected to be part of the model, along with other variables added on the basis of expert criteria. We used all these variables, hereafter called profilers, to calculate a global severity formula.
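To make the clustering step concrete, the following is a minimal sketch of the standard k-means iteration (assign each observation to the nearest mean, then recompute the means). It is an illustration only, not the implementation used in the original study: the function name `kmeans`, the explicit `initial_centroids` argument, and the toy two-dimensional data are all assumptions introduced here.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, max_iterations=100):
    """Plain k-means iteration (hypothetical helper, not the study's code).

    Returns (centroids, labels), where labels[i] is the index of the
    cluster whose mean is nearest to points[i].
    """
    centroids = [list(c) for c in initial_centroids]
    k = len(centroids)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each observation joins the cluster with the nearest mean.
        new_labels = [
            min(range(k), key=lambda c, p=p: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the means will not change either
        labels = new_labels
        # Update step: each centroid becomes the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Toy data (invented for illustration): two well-separated groups.
toy_points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
toy_centroids, toy_labels = kmeans(toy_points, [(0.0, 0.0), (1.0, 1.0)])
```

In practice the starting centroids are usually chosen at random and the procedure is repeated several times, because k-means can converge to different partitions from different starting points.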

Using the severity formula shown below, we obtain a global severity score for each patient which allows us to assign that patient to one of the five clusters of the staging model.

$$\mathrm{Severity} = \frac{10}{12} \cdot \left( \mathrm{PD} \times \mathrm{BD} + \mathrm{MetS} + \mathrm{ComPD} + \mathrm{SCIP}_{\mathrm{Tr4}} + \mathrm{IllnessN} + \mathrm{SFPF} + \mathrm{SFMH} + \mathrm{FAST}_{\mathrm{T}} + \mathrm{FAST}_{\mathrm{leisure}} + \mathrm{BMI} + \mathrm{HospN} + \mathrm{SuicAttN} \right)$$

The formula includes 12 profilers from the following five life domains: (1) Clinical characteristics of the BD: three profilers: Number of hospitalizations (HospN), Number of suicide attempts (SuicAttN), and Comorbid personality disorder (ComPD); (2) Physical health: three profilers: Body Mass Index (BMI), Metabolic Syndrome (MetS), and Number of comorbid physical illnesses (IllnessN); (3) Cognition: one profiler: Screen for Cognitive Impairment in Psychiatry score (SCIPTr4); (4) Real-world functioning: three profilers: permanently disabled due to BD (PD × BD), Functioning Assessment Short Test total score (FASTT), and Functioning Assessment Short Test leisure time subscale score (FASTleisure); and (5) Health-related QoL: two profilers: SF-36 Physical Functioning Scale score (SFPF) and SF-36 Mental Health Scale score (SFMH). All profilers have the same weight and may take values between 0 and 1, so the severity score ranges from 0 to 10. Based on this score, we proposed the cut-offs for delimiting the five clusters using the scores corresponding to the 5th, 25th, 50th, 75th, and 95th percentiles (1.70, 2.50, 4.50, 6.10, and ≥6.11, respectively).
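As a worked illustration of the formula and cut-offs, the sketch below computes a global severity score from 12 profiler values already scaled to the 0 to 1 range and maps it to a stage. Several points are assumptions made here rather than details given in the text: the dictionary keys, the function names, and the reading of 1.70, 2.50, 4.50, and 6.10 as inclusive upper bounds for stages 1 to 4, with scores of 6.11 or more assigned to stage 5. How each raw measure is rescaled to 0 to 1 is not described in this section and is not attempted.

```python
# Profiler names follow the abbreviations in the severity formula;
# the exact key spellings are a choice made for this sketch.
PROFILERS = (
    "PDxBD", "MetS", "ComPD", "SCIP_Tr4", "IllnessN", "SFPF",
    "SFMH", "FAST_T", "FAST_leisure", "BMI", "HospN", "SuicAttN",
)

# Assumed reading of the reported cut-offs: inclusive upper bounds for
# stages 1-4; anything above the last bound falls in stage 5.
STAGE_UPPER_BOUNDS = (1.70, 2.50, 4.50, 6.10)


def severity_score(profilers):
    """Return 10/12 times the sum of the 12 equally weighted profilers."""
    missing = [name for name in PROFILERS if name not in profilers]
    if missing:
        raise ValueError(f"missing profilers: {missing}")
    for name in PROFILERS:
        if not 0.0 <= profilers[name] <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    total = sum(profilers[name] for name in PROFILERS)
    return 10.0 * total / 12.0


def stage_from_score(score):
    """Map a 0-10 severity score to a stage from 1 to 5."""
    for stage, upper in enumerate(STAGE_UPPER_BOUNDS, start=1):
        if score <= upper:
            return stage
    return 5


# Hypothetical patient: every profiler at 0.5 gives a mid-range score.
example_patient = {name: 0.5 for name in PROFILERS}
example_score = severity_score(example_patient)
example_stage = stage_from_score(example_score)
```

Because every profiler is bounded by 0 and 1 and there are 12 of them, the factor 10/12 is what rescales the raw sum (0 to 12) to the 0 to 10 range stated in the text.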

The second step, described in this paper, was to further validate our classification model as regards construct validity and longitudinal validity with the original sample at 3-year follow-up.

Statistical analysis

Analyses were conducted using IBM SPSS Statistics for Windows, Version 22.0. The significance level was set at p < 0.05. We used a chi-squared test, paired t-test, and ANOVA with Tukey post-hoc test to identify associations between variables.
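For readers unfamiliar with the paired t-test named above, the following minimal sketch computes the test statistic from two sets of measurements taken on the same patients. The analyses in this study were run in SPSS; this Python function, its name, and the sign convention (first measurement minus second) are assumptions made purely for illustration, and the p-value lookup is deliberately left out.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for paired measurements.

    t is the mean of the pairwise differences divided by its standard
    error; differences are computed as first minus second.
    """
    if len(first) != len(second):
        raise ValueError("both measurements must cover the same patients")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Invented scores for five hypothetical patients at two time points.
t_value, dof = paired_t_statistic([5, 6, 7, 8, 9], [4, 4, 4, 4, 4])
```

With this sign convention, a positive t indicates that scores were lower at the second measurement than at the first.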

We tested for construct validity of our staging model by examining if: (1) all the profilers included in the model behave properly, that is, if patients get more severe scores on each profiler in late stages than in early stages and (2) our proposed external validators (GAF scores and pharmacological treatment patterns) also behave properly. We hypothesized that, in late stages, the global level of functioning would be more impaired and the prescribed pharmacological treatment more complex.

Concerning longitudinal validity, we analyzed the shift of patients throughout the model from baseline to 3-year follow-up. Here, we expected patients to move slightly forward or backward along the model, with a very small percentage presenting greater changes (two or more stages). Furthermore, we expected a large proportion (more than 50%) of patients who stayed euthymic during the 3-year follow-up period to remain in the same stage.
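The shift analysis described above can be sketched as a simple tabulation of each patient's stage at baseline against the stage at follow-up. The function names and the example stage lists below are hypothetical and are not study data; the sketch only shows how the proportions of patients who stay, move one stage, or move two or more stages could be derived.

```python
from collections import Counter


def transition_counts(baseline_stages, followup_stages):
    """Count patients for each (baseline stage, follow-up stage) pair."""
    if len(baseline_stages) != len(followup_stages):
        raise ValueError("both lists must describe the same patients")
    return Counter(zip(baseline_stages, followup_stages))


def shift_summary(baseline_stages, followup_stages):
    """Return the proportion of patients by size of stage shift."""
    if len(baseline_stages) != len(followup_stages):
        raise ValueError("both lists must describe the same patients")
    n = len(baseline_stages)
    if n == 0:
        raise ValueError("at least one patient is required")
    counts = Counter()
    for before, after in zip(baseline_stages, followup_stages):
        size = abs(after - before)  # direction is ignored, only magnitude
        if size == 0:
            counts["same_stage"] += 1
        elif size == 1:
            counts["one_stage"] += 1
        else:
            counts["two_or_more_stages"] += 1
    keys = ("same_stage", "one_stage", "two_or_more_stages")
    return {key: counts[key] / n for key in keys}


# Invented stages for eight hypothetical patients.
baseline = [1, 2, 3, 3, 4, 5, 2, 3]
followup = [2, 2, 3, 4, 3, 3, 1, 3]
summary = shift_summary(baseline, followup)
table = transition_counts(baseline, followup)
```

The transition table keeps the direction of change (progression versus regression), while the summary collapses it to the magnitude of the shift, which is the quantity the hypothesis is stated in.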

Results

The mean follow-up time was 37.9 (SD = 2.1) months. At 3-year follow-up, 129 (57.6%) patients were reassessed.

Demographic and clinical characteristics

Table 1 shows participant demographic and clinical characteristics, including the profilers of the model. Patients had a mean age of 50.3 (SD = 12.0), and the majority were female (65.2%) and Caucasian (96.2%). Diagnoses were as follows: 73% had BD I, 23 (17.4%) a comorbid personality disorder, and 9 (7%) a substance use disorder. Furthermore, 38 (32.2%) patients remained in a euthymic state throughout the follow-up period.

Table 1 Patient demographic and clinical characteristics.

The mean CGI-S score was 3.27 (SD = 1.4). Regarding psychopathology, 62 (47%) had a score consistent with bipolar depression according to the HDRS (≥7)33, 12 (9.1%) with a mixed episode (YMRS 7–20), and 4 (3%) with a manic episode according to the YMRS (>20). Concerning the cognitive assessment, 25 (19.4%) had mild, 27 (20.9%) moderate, and 26 (20.2%) severe impairment. On average, patients were receiving 3.2 (SD = 1.4) prescribed drugs. One hundred seventeen (90.7%) patients were taking one classic mood stabilizer, 29 (22.5%) were taking a combination of two, 78 (60.5%) at least one antipsychotic, 19 (14.7%) a combination of two, 51 (39.5%) antidepressants, and 65 (49.6%) benzodiazepines.

Classification of the patients in the staging model

Of the 129 patients followed at 3 years, 14 (10.9%) were classified as stage 1, 20 (15.5%) as stage 2, 61 (47.3%) as stage 3, 24 (18.6%) as stage 4, and 10 (7.8%) as stage 5. Their mean global severity score was 3.6 (SD = 1.6), with a minimum of 0.9 and a maximum of 8.4. At baseline, their mean global severity score was 3.6 (SD = 1.4), with a minimum of 0.8 and a maximum of 8.0 (see Fig. 1). The mean global severity score of patients who did not complete follow-up was 3.5 (SD = 1.3). We did not find significant differences in age, gender, bipolar type, age at onset, or total FAST score between followed and lost patients.

Fig. 1: Distribution of patients according to staging model. a Distribution according to global severity formula scores and b distribution according to stages.

Construct validity

As can be seen in Table 2, except for SF-36 mental health, all profilers were significantly worse at later stages than at earlier ones, thus providing proof of construct validity. Furthermore, evidence of construct validity was also provided by the external validators. Concerning GAF scores, significant worsening was seen as the stages progressed, ranging from 81.1 (SD = 11.9) in stage 1 to 49.5 (SD = 13.4) in stage 5 (see Table 3). Finally, regarding pharmacological treatment patterns, early stages (1 and 2) were associated with monotherapy or use of two-drug combinations, while late stages (4 and 5) were associated with combinations of four or more drugs (p = 0.002). Also, patients in late stages more frequently received antidepressants and benzodiazepines (see Table 3).

Table 2 Construct validity: values and distribution of profilers throughout the model.
Table 3 Construct validity, external validators: GAF scores and pharmacological treatment patterns throughout the model.

Longitudinal validity

Figure 2 shows the shift of patients throughout the model at 3-year follow-up. Specifically, 50% of patients at stage 1 progressed to stage 2 and 16.7% to stage 3. Regarding stage 2, 27.6% of patients regressed one stage, while 37.9% progressed to stage 3 and only one (3.4%) advanced to stage 4. The majority of those at stage 3 remained at that stage (63.3%), while 18.2% regressed or progressed one stage. Regarding stage 4, 32% regressed to stage 3 and 26.3% progressed to stage 5. Finally, one-third of patients at stage 5 remained at that stage, while 55.6% regressed to stage 4 and 1 (11.1%) to stage 3.

Fig. 2: Shift throughout the model at 3 years of follow-up.

When looking at the shifts in patients who stayed euthymic during the 3-year follow-up period, almost all remained at the same stage (55.3%) or regressed (23.7%) or advanced (15.8%) one stage. Two patients (5.3%) regressed two stages. In those patients who remained at the same stage or regressed to previous ones, there were statistically significant improvements in the clinical (t = 3.732, p = 0.001 and t = 5.090, p < 0.001, respectively), functioning (t = 2.626, p = 0.016 and t = 3.705, p = 0.004, respectively), and QoL dimensions (t = 8.000, p < 0.001 and t = 3.184, p = 0.010, respectively).

Discussion

Our results demonstrate that our staging model has good construct and longitudinal validity, thus supporting its use in daily clinical practice. Regarding construct validity, with the exception of the mental health-related QoL profiler, all profilers behaved properly, showing significant worsening through the stages. Furthermore, the proposed external validators (GAF scores and pharmacological treatment pattern) also behaved properly, that is, there was a functional decline across stages, and pharmacological treatment patterns were more complex at late stages than at early ones. Concerning longitudinal validity, at 3-year follow-up, the shift of patients throughout the model was as expected considering the short follow-up period, with half remaining at the same stage, 40% progressing or regressing one stage, and fewer than 10% progressing or regressing two or more stages.

Notwithstanding the fact that the course of BD is heterogeneous, there is evidence for clinical progression34, and accordingly, the five life domains of our staging model showed this progression. Regarding the clinical characteristics of BD, patients at late stages experienced more hospitalizations and suicide attempts and more frequently had a comorbid personality disorder. Consistent with these data, one study reported the same results between patients with first and multiple mood episodes35. However, two other studies2,36 did not find this clinical pattern among patients in different stages. This discrepancy may be due to the criteria used to classify patients into stages. In both of those studies, patients were assigned to the different stages based on functional impairment only, and not multiple domains of life. In addition, some of our profilers in this dimension were not used in those studies. Physical health, cognition, and functioning were the domains in which patients showed the most remarkable progressive worsening, thus identifying a score-dependent pattern. These findings comport with previous reports showing cognitive and functional decline along with the progression of BD2,36,37 and with theoretical models proposed by Kapczinski et al. (2009)12 and Cosci and Fava (2013)38. Concerning self-reported QoL, as the disorder progresses, physical QoL seems to worsen. In agreement with our results, a recent study by Tatay-Manteiga et al. (2019)11 showed that BD patients reported poorer QoL in late than early stages in physical, psychological, social, and environment domains. However, again, that study used a different criterion to classify patients based solely on FAST scores.

We have identified only one study that examined a staging model for BD from a longitudinal perspective18, although its aim was to find the patient characteristics that define their progression throughout the model. In our study, over the 3 years of follow-up, patients shifted across stages as expected, that is, in that very short period of time, only five patients had strong shifts (progressing or regressing two or more stages). Unfortunately, we did not find standardized patterns for transition over time in BD staging models to contrast with our results. Further proof of the longitudinal validity of our model is that most patients who were in a euthymic state remained at the same stage or had regressed to a previous one at follow-up. Although functional39 and cognitive40 impairments have been associated with subsyndromal depressive symptoms in cross-sectional studies, our longitudinal results demonstrate a statistically significant improvement in functioning and QoL dimensions in those patients who remained euthymic for 36 months and who remained at the same stage or regressed to a previous stage, thus calling into question the association reported in the literature.

All these findings support the construct and longitudinal validity of our model for patients with BD21 and provide further support for using this clinical staging model in clinical practice, given that the profilers are easy to obtain in any clinical environment. We would like to highlight that our model is disorder-specific, which contributes to a better understanding of BD. We did not include prodromal phases of the disorder because transdiagnostic staging models are probably better suited to the study of at-risk and prodromal phases, while disorder-specific models are more appropriate once the disorder has been diagnosed41. However, the present results must be interpreted in light of one main limitation. Given that patients had to give signed informed consent prior to inclusion in the study, we were unable to include extremely severe or agitated patients, which led to underrepresentation of such patients in the model. Nevertheless, one of the main strengths of our study is its empirical approach and longitudinal prospective design. This is the first study to follow an entire range of adult patients representing different clinical stages of BD. Previous studies that validated the proposed models focused only on comparison of early vs. late stages, rather than on the full clinical course. Furthermore, our model considers BD a multidimensional disorder requiring five different life domains to classify patients, and the proposed severity classification formula is easy to implement in daily clinical practice.

In conclusion, this proposed staging model conforms to the conceptualization of BD as a progressive disorder that develops from mild to severe presentations. In this sense, it could help clinicians and researchers to better understand the disorder and, at the same time, to design more accurate and personalized treatment plans.