Elsevier

Journal of Affective Disorders

Volume 241, 1 December 2018, Pages 519-532

Review article
Applications of machine learning algorithms to predict therapeutic outcomes in depression: A meta-analysis and systematic review

https://doi.org/10.1016/j.jad.2018.08.073

Highlights

  • We surveyed the use of machine learning to inform predictive models in mood disorders.

  • We include studies that use machine learning algorithms to identify predictors of therapeutic outcomes in uni/bipolar depression.

  • Classification algorithms informed by neuroimaging, phenomenological, and genetic data were able to predict therapeutic outcomes with an overall accuracy of 0.82.

  • Predictive models integrating multiple data types performed better when compared to models with single lower-dimension data types (p < 0.01).

  • Machine learning provides an opportunity to parse clinical heterogeneity and characterize moderators of disease risk and trajectory.

Abstract

Background

No previous study has comprehensively reviewed the application of machine learning algorithms in mood disorders populations. Herein, we qualitatively and quantitatively evaluate previous studies of machine learning-devised models that predict therapeutic outcomes in mood disorders populations.

Methods

We searched Ovid MEDLINE/PubMed from inception to February 8, 2018 for relevant studies that included adults with bipolar or unipolar depression; assessed therapeutic outcomes with a pharmacological, neuromodulatory, or manual-based psychotherapeutic intervention for depression; applied a machine learning algorithm; and reported predictors of therapeutic response. A random-effects meta-analysis of proportions and meta-regression analyses were conducted.
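A random-effects meta-analysis of proportions can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the source does not name the estimator or transformation, so the DerSimonian-Laird estimator and a plain arcsine square-root transform are assumed here as common choices. The function name `pool_proportions` and the input data are hypothetical.

```python
import math

# Hypothetical (correct predictions, sample size) pairs, invented for
# illustration only; they are NOT data from the reviewed studies.
HYPOTHETICAL_STUDIES = [(45, 60), (70, 80), (30, 50), (90, 100)]


def pool_proportions(studies, z=1.96):
    """Random-effects pooled proportion (DerSimonian-Laird, arcsine scale).

    Returns (pooled, ci_low, ci_high, tau2); tau2 is on the transformed scale.
    The estimator and transform are assumptions, not taken from the source.
    """
    # Variance-stabilizing transform: y = asin(sqrt(p)), var(y) ~ 1 / (4n).
    y = [math.asin(math.sqrt(x / n)) for x, n in studies]
    v = [1.0 / (4.0 * n) for _, n in studies]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird moment estimate of the between-study variance.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(studies) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1.0 / (vi + tau2) for vi in v]
    pooled_y = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def back(t):
        # Invert the transform, clamping to the valid range [0, pi/2].
        return math.sin(min(max(t, 0.0), math.pi / 2)) ** 2

    return back(pooled_y), back(pooled_y - z * se), back(pooled_y + z * se), tau2


pooled, ci_low, ci_high, tau2 = pool_proportions(HYPOTHETICAL_STUDIES)
print(f"pooled accuracy = {pooled:.2f} [{ci_low:.2f}, {ci_high:.2f}]")
```

Pooling on a transformed scale keeps the interval inside [0, 1] after back-transformation, and the between-study variance term widens the interval when studies disagree more than sampling error alone would explain.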

Results

We identified 639 records: 75 full-text publications were assessed for eligibility; 26 studies (n = 17,499) and 20 studies (n = 6325) were included in qualitative and quantitative review, respectively. Classification algorithms were able to predict therapeutic outcomes with an overall accuracy of 0.82 (95% confidence interval [CI] of [0.77, 0.87]). Pooled estimates of classification accuracy were significantly greater (p < 0.01) in models informed by multiple data types (e.g., composite of phenomenological patient features and neuroimaging or peripheral gene expression data; pooled proportion [95% CI] = 0.93 [0.86, 0.97]) when compared to models with lower-dimension data types (pooled proportion = 0.68 [0.62, 0.74] to 0.85 [0.81, 0.88]).

Limitations

Most studies were retrospective; machine learning algorithms and their implementation (e.g., cross-validation, hyperparameter tuning) differed across studies; and the importance of individual variables fed into the learning algorithms could not be inferred.

Conclusions

Machine learning algorithms provide a powerful conceptual and analytic framework capable of integrating multiple data types and sources. An integrative approach may more effectively model neurobiological components as functional modules of pathophysiology embedded within the complex, social dynamics that influence the phenomenology of mental disorders.

Introduction

The functional and psychosocial deficits associated with depression are pervasive, often chronic, progressive, and highly disabling (Vigo et al., 2016). The Sequenced Treatment Alternatives to Relieve Depression (STAR*D) trial reported that patients who remit with the first trial of an antidepressant experience significant reductions in work-related disability; in contrast, patients who remit with subsequent treatment trials or strategies exhibit residual functional impairments (Trivedi et al., 2013). Notwithstanding the opportunity afforded by intervening earlier along the disease trajectory, most individuals with mood disorders do not receive timely and accurate diagnoses (Trivedi et al., 2013). Moreover, despite the availability (and, in many cases, accessibility) of effective and multi-modal treatments for mood disorders (e.g., pharmacotherapy, psychotherapy, neuromodulation), current treatment selection and sequencing methods are largely a process of trial-and-error, further delaying effective treatment (Trivedi et al., 2006).

Precision-based approaches that integrate quantifiable neurobiological substrates of psychopathology have been proposed to inform treatment selection and refine disease models in psychiatric disorders (Insel and Cuthbert, 2015). The National Institute of Mental Health’s Research Domain Criteria (RDoC) proffers a dimensional approach to parse clinical heterogeneity across current diagnostic classifications into homogeneous, data-driven categories that are more proximate to the etiology of psychiatric disorders. For example, a recent study characterized behavioural phenotypes along a continuum, ranging from functionally adaptive to maladaptive groups, informed by a data-driven analysis of behavioural and psychiatric assessments and resting-state functional magnetic resonance imaging (rs-fMRI) measures in a community-based sample (Van Dam et al., 2017). In addition to examining evidence from populations across symptom-based diagnostic categories, there is a need to integrate findings using a “systems biology” approach across units of analyses (e.g., genetic, molecular, cellular, circuits, behaviour) and data sources (e.g., pre-clinical, clinical, epidemiological) (Barabási, 2007; McIntosh, 2013).

Recent advances in information processing and computational resources present a novel opportunity to conceptualize and analyze human behaviour (Huys et al., 2016). While conventional statistical methods sequentially analyze the relationship between variables, machine learning approaches can iteratively and contemporaneously analyze multiple interacting associations between variables and/or variable sets (Orrù et al., 2012). Many machine learning methods are capable of identifying which variables in a given dataset are relevant or irrelevant to an outcome of interest, whereas conventional statistical methods rely on investigator input to specify variables relevant to a particular analysis (Bzdok and Meyer-Lindenberg, 2017). Thus, machine learning approaches are capable of agnostically devising data-driven models and algorithms with predictive capability. Moreover, the precision and accuracy of the models’ predictions (i.e., classifications) can be quantified and compared across models and/or iterations within-sample (i.e., cross-validation), as well as with external validation datasets, for greater out-of-sample generalizability (Goodfellow et al., 2016). Machine learning methods are additionally capable of integrating multiple data types (e.g., neuroimaging, behavioural) from multiple sources while correcting for heterogeneity within pooled datasets (Bzdok and Meyer-Lindenberg, 2017; Friston et al., 2017; Rohart et al., 2017).
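The within-sample validation described above can be illustrated with a small k-fold cross-validation sketch. Everything here is an assumption made for illustration: the nearest-centroid classifier stands in for whatever algorithm a given study used, the helper names (`k_fold_indices`, `fit_centroids`, `predict`, `cross_validated_accuracy`) are hypothetical, and the two-feature "responder" data are synthetic, not drawn from any reviewed study.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Toy classifier: store the mean feature vector of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared Euclidean)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))
    return min(centroids, key=dist)


def cross_validated_accuracy(X, y, k=5, seed=0):
    """Fit on k-1 folds, score on the held-out fold, and pool the hits."""
    correct = 0
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        model = fit_centroids(train_X, train_y)
        correct += sum(predict(model, X[i]) == y[i] for i in held_out)
    return correct / len(X)


# Synthetic, well-separated two-feature data (illustration only).
rng = random.Random(1)
X = [[rng.gauss(2, 0.5), rng.gauss(2, 0.5)] for _ in range(20)]
X += [[rng.gauss(-2, 0.5), rng.gauss(-2, 0.5)] for _ in range(20)]
y = ["responder"] * 20 + ["non-responder"] * 20

print("cross-validated accuracy:", cross_validated_accuracy(X, y, k=5))
```

Because every sample is scored by a model that never saw it during fitting, the pooled accuracy is a less optimistic estimate than accuracy on the training data. It still does not replace testing on an external validation dataset.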

Toward the aim of improving treatment outcomes in mood disorders, recent studies have attempted to identify replicable and quantifiable predictors of therapeutic outcomes. Attempts to characterize biological and phenomenological characteristics capable of predicting illness trajectory and therapeutic recovery have shown promise; however, no single variable or modality has emerged as a robust predictor of therapeutic outcome (Rosenblat et al., 2017; Trivedi et al., 2005). The successful application of machine learning algorithms in consumer data analytics and other areas of medicine (e.g., oncology) demonstrates their predictive capability and the potential for machine learning approaches to inform predictive models in mood disorders (Huys et al., 2016; Rakesh, 2017; Ray et al., 2007; Rumshisky et al., 2016).

Notwithstanding the growing interest in, and opportunities afforded by, the application of machine learning algorithms in mechanistic and therapeutic research, no previous study has comprehensively reviewed the extant literature reporting on the use and application of machine learning algorithms to predict therapeutic outcomes in populations with depression. Herein, we aim to explore how, and to what extent, previous studies have applied machine learning algorithms to inform treatment selection and personalize therapy in clinical populations with depression.

Section snippets

Methods

We conducted a meta-analytic and scoping systematic review informed by recommendations from the Cochrane Handbook for Systematic Reviews of Interventions, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement, and the Joanna Briggs Institute’s Manual for Scoping Reviews (Cochrane Collaboration, 2011; Moher et al., 2009; Peters et al., 2015). We evaluate extant studies that have applied machine

Results

The initial search yielded a total of 639 unique records (Table S1). After title and abstract review, 564 records were excluded; 75 full-text publications were assessed for eligibility and after full-text review, 44 records were excluded (Fig. 1). Twenty-seven studies investigated predictors of a therapeutic outcome in depression but did not apply a machine learning algorithm. Six studies applied a machine learning algorithm to differentiate depressed subjects from controls but did not report

Evidence source

Approximately 96% of studies included in the qualitative review were published in peer-reviewed medical journals, with a recent trend towards higher impact factor journals. Results from Egger’s test and funnel plot asymmetry prior to trim and fill additionally support the presence of publication bias. Although our search included grey literature sources, studies that applied machine learning algorithms with negative results may have been less likely to be published, abstracted, or made

Conclusions and future directions

We conducted a comprehensive systematic review and meta-analysis surveying the use of machine learning algorithms to inform predictive models in mood disorders. Classification algorithms were able to predict therapeutic outcomes among subjects of previously published prospective interventional trials (k = 20, n = 6325) with an overall accuracy of 0.82. Pooled estimates of classification accuracy were significantly different between models informed by a single data type (i.e., neuroimaging,

Disclosures

All authors contributed to the development of the research hypothesis and study design. YL conducted the search, data extraction, and data analysis and wrote the first draft of the manuscript. All authors contributed to the interpretation of results and manuscript writing. This work did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. The authors have no conflicts of interest to declare.

References (72)

  • G.E. Miller et al.

    Clustering of depression and inflammation in adolescents previously exposed to childhood adversity

    Biol. Psychiatry

    (2012)
  • D. Moher et al.

    Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement

    J. Clin. Epidemiol.

    (2009)
  • G. Orrù et al.

    Using support vector machine to identify imaging biomarkers of neurological and psychiatric disease: a critical review

    Neurosci. Biobehav. Rev.

    (2012)
  • A. Serretti et al.

    A neural network model for combining clinical predictors of antidepressant response in mood disorders

    J. Affect. Disord.

    (2007)
  • R. Strawbridge et al.

    Inflammation and clinical response to treatment in depression: a meta-analysis

    Eur. Neuropsychopharmacol.

    (2015)
  • N.T. Van Dam et al.

    Data-driven phenotypic categorization for neurobiological analyses: beyond DSM-5 labels

    Biol. Psychiatry

    (2017)
  • D. Vigo et al.

    Estimating the true global burden of mental illness

    Lancet Psychiatry

    (2016)
  • A.-L. Barabási

    Network medicine–from obesity to the “diseasome”

    N. Engl. J. Med.

    (2007)
  • C.B. Begg et al.

    Operating characteristics of a rank correlation test for publication bias

    Biometrics

    (1994)
  • D. Bzdok et al.

    Machine learning for precision psychiatry: opportunities and challenges

    Biol. Psychiatry

    (2017)
  • A.M. Chekroud et al.

    Reevaluating the efficacy and predictability of antidepressant treatments: a symptom clustering approach

    JAMA Psychiatry

    (2017)
  • C.J. Clopper et al.

    The use of confidence or fiducial limits illustrated in the case of the binomial

    Biometrika

    (1934)
  • Cochrane Collaboration, T., 2011. Cochrane Handbook for Systematic Reviews of...
  • S.G. Costafreda et al.

    Neural correlates of sad faces predict clinical remission to cognitive behavioural therapy in depression

    Neuroreport

    (2009)
  • A.T. Drysdale et al.

    Resting-state connectivity biomarkers define neurophysiological subtypes of depression

    Nat. Med.

    (2017)
  • S. Duval et al.

    A nonparametric “trim and fill” method of accounting for publication bias in meta-analysis

    J. Am. Stat. Assoc.

    (2000)
  • S. Duval et al.

    Trim and fill: a simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis

    Biometrics

    (2000)
  • M. Egger et al.

    Bias in meta-analysis detected by a simple, graphical test

    BMJ

    (1997)
  • A. Etkin et al.

    A cognitive-emotional biomarker for predicting remission with antidepressant medications: a report from the iSPOT-D trial

    Neuropsychopharmacology

    (2015)
  • M.F. Freeman et al.

    Transformations related to the angular and the square root

    Ann. Math. Stat.

    (1950)
  • K.J. Friston et al.

    Computational nosology and precision psychiatry

    Comput. Psychiatry

    (2017)
  • B.S. Gadad et al.

    Peripheral biomarkers of major depression and antidepressant treatment response: current knowledge and future outlooks

    J. Affect. Disord.

    (2017)
  • I. Goodfellow et al.

    Deep Learning

    (2016)
  • J.-P. Guilloux et al.

    Testing the predictive value of peripheral gene expression for nonremission following citalopram treatment for major depression

    Neuropsychopharmacology

    (2015)
  • Q.J.M. Huys et al.

    Computational psychiatry as a bridge from neuroscience to clinical applications

    Nat. Neurosci.

    (2016)
  • T.R. Insel et al.

    Medicine. Brain disorders? Precisely

    Science

    (2015)
    1

    University Health Network, 399 Bathurst Street, MP 9–325, Toronto, ON M5T 2S8 Canada.