Text to Brain: Predicting the Spatial Distribution of Neuroimaging Observations from Text Reports

Dockès, Jérôme; Wassermann, Demian; Poldrack, Russell; Suchanek, Fabian; Thirion, Bertrand; Varoquaux, Gaël

doi:10.1007/978-3-030-00931-1_67

Jérôme Dockès¹⁸,
Demian Wassermann¹⁸,
Russell Poldrack¹⁹,
Fabian Suchanek²⁰,
Bertrand Thirion¹⁸ &
…
Gaël Varoquaux¹⁸

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 11072))

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

8377 Accesses

Abstract

Despite the digital nature of magnetic resonance imaging, the resulting observations are most frequently reported and stored in text documents. There is a trove of information untapped in medical health records, case reports, and medical publications. In this paper, we propose to mine brain medical publications to learn the spatial distribution associated with anatomical terms. The problem is formulated in terms of minimization of a risk on distributions which leads to a least-deviation cost function. An efficient algorithm in the dual then learns the mapping from documents to brain structures. Empirical results using coordinates extracted from the brain-imaging literature show that (i) models must adapt to semantic variation in the terms used to describe a given anatomical structure, (ii) voxel-wise parameterization leads to higher likelihood of locations reported in unseen documents, (iii) least-deviation cost outperforms least-square. As a proof of concept for our method, we use our model of spatial distributions to predict the distribution of specific neurological conditions from text-only reports.

You have full access to this open access chapter, Download conference paper PDF

Medical Document Mining Combining Image Exploration and Text Characterization

Text mining brain imaging reports

Article Open access 12 November 2019

Agreement between neuroimages and reports for natural language processing-based detection of silent brain infarcts and white matter disease

Article Open access 11 May 2021

1 Introduction

Hundreds of thousands of studies, case reports, or patient records, capture observations in human neuroscience, basic or clinical. Statistical analysis of this large amount of data could provide new insights. Unfortunately, most of the spatial information that these data contain is difficult to extract automatically, because it is hidden in unstructured text, in sentences such as: “[...] in the anterolateral temporal cortex, especially the temporal pole and inferior and middle temporal gyri” [1].

This data cannot be processed easily by a machine, as a machine does not know where the temporal cortex is. As we will show, simply looking up such terms in atlases does not suffice. Indeed, even atlases disagree [2]. Furthermore, joint processing of many reports faces varying terminologies, with regions represented in different atlases that differ and overlap. Finally, not all terms in a report carry the same importance, and practitioners use terms that are not the exact labels of any atlas. Coordinate-based meta-analyses capture the spatial distribution of a term from the literature [3, 4], but they also lack a model to combine terms.

Here, we propose to map case reports automatically to the brain locations that they discuss: we learn mappings of anatomical terms to brain regions from medical publications. We propose a new learning framework for translating anatomical terms to brain images – a process that we call “encoding”. We learn such a mapping, quantify its performance, and compare possible choices of representation of spatial data. We then show in a proof of concept that our model can predict the brain area for textual case reports.

2 Methods: Formalizing Text-to-Brain-Map Translation

2.1 Problem Setting: From Text to Spatial Distributions

We want to predict the likelihood of the location of relevant brain structures described in a document. For this purpose, we perform supervised learning on a corpus of brain-imaging studies, each containing: (i) a text, and (ii) the locations – i.e. the stereotactic coordinates – of its observations. Indeed, Functional Magnetic Resonance Imaging (fMRI) studies report the coordinates of activation peaks (e.g., [5, Table 1]), and Voxel Based Morphometry (VBM) analyses report the location of differences in gray matter density (e.g., [1, Table 2]). Following neuroimaging meta-analyses [3], we frame the problem in terms of spatial distributions of observations in the brain. In a document, observed locations $\mathcal {L} = \{l_a \in \mathbb {R}^3, a = 1 \dots c\}$ are sampled from a probability density function (pdf) p over the brain. Our goal is to predict this pdf p from the text $\mathcal {T}$. We denote q our predicted pdf. A predicted pdf q should be close to p, or take high values at the coordinates actually reported in the study: $\prod _{l \in \mathcal {L}} q(l)$ must be large. In a supervised learning setting, we start from a collection of studies $\mathcal {S} = (\mathcal {T}, \mathcal {L})$, with $\mathcal {T}$ the text and $\mathcal {L}$ the locations. Building the prediction engine then entails the choice of a model relating the predicted pdf p to the text $\mathcal {T}$, the choice of a loss, or data-fit term, and some regularization on the model parameters. We now detail how we make each of these choices to construct a prediction.

Model. We start by modelling the dependency of our spatial pdf q on the study text $\mathcal {T}$. This entails both choosing a representation for q and writing it as a function of the text. While q is defined on a subvolume of $\mathbb {R}^3$, the brain volume, we build it using a partition to work on a finite probably space: this can be either a regular grid of voxels or a set of anatomical regions (i.e. an atlas) $\mathcal {R} = \{\mathcal {R}_k, k=0\dots m\}$. As such a partitioning imposes on each region to be homogeneous, q is then formally written on $\mathbb {R}^3$ in terms of the indicator functions of the parts^{Footnote 1}: $\{r_k = \frac{\mathbb {I}_k}{\Vert \mathbb {I}_k\Vert _1}, ~ k = 1 \dots m\}$. Importantly, the volume of each part $\Vert \mathbb {I}_k\Vert _1$ appears as a normalization constant.

To link q to the text $\mathcal {T}$ of the study, we start by building a term-frequency vector representation of $\mathcal {T}$, which we denote $\varvec{x} \in \mathbb {R}^d$. d is the size of our vocabulary of English words $\mathcal {W} = \{w_t\}$, and $\varvec{x}_t$ is the frequency of word $w_t$ in the text. We assign to each atlas region a weight that depends linearly on $\varvec{x}$:

$$\begin{aligned} q(z) = \sum _{t=1}^d \sum _{k=1}^m \varvec{x}_t \varvec{\beta }_{t,k}r_k(z) \quad \forall z \in \mathbb {R}^3 \end{aligned}$$

(1)

where $\varvec{\beta } \in \mathbb {R}^{d \times m}$ are model parameters, which we will learn.

Using an atlas is a form of regularization: constraining the prediction to be in the span of $\{r_k\}$ reduces the size of the search space. Fine partitions, e.g. atlases with many regions or voxel grids, yield models with more expressive power, but more likely to overfit. Choosing an atlas thus amounts to a bias-variance tradeoff.

Label-Constrained Encoder. A simple heuristic to turn a text into a brain map is to use atlas labels and ignore interactions between terms. The probability of a region is taken to be proportional to the frequency of its label in the text. The vocabulary is then the set of labels: $d = m$. As the word $w_k$ is the label of $\mathcal {R}_k$, $\varvec{\beta }$ is diagonal. For example, for a region $\mathcal {R}_k$ in the atlas labelled “parietal lobe”, the probability on $\mathcal {R}_k$ depends only on the frequency of the phrase “parietal lobe” in the text. We call this model label-constrained encoder.

2.2 Loss Function: Measuring Errors on Spatial Distributions

Strategy. We will fit the coefficients $\varvec{\beta }$ of our model, see Eq. (1), by minimizing a risk $\mathcal {E}(p, q)$: the expectation of a distance between p and q.

A Plugin Estimator of p . We do not have access to the true pdf, p; we need a plugin estimator, which we denote $\hat{p}$. By construction of our prediction q, the best approximation of p we can hope for belongs to the span of our regions $\{r_k\}$. Hence, we build our estimator $\hat{p}$ in this space, setting the probability of a region to be proportional to the number of coordinates that fell inside it:

$$\begin{aligned} \hat{p} = \sum _{k=1}^m \frac{|\{a, \mathbb {I}_k(l_a) = 1\}|}{c} r_k ~ = ~ \sum _{k=1}^m \frac{1}{c} \sum _{a=1}^c\mathbb {I}_k(l_a) r_k ~ \triangleq ~ \sum _{k=1}^m \hat{\varvec{y}}_k r_k. \end{aligned}$$

(2)

When regions are voxels, there are too many regions and too few coordinates. Hence we use Gaussian Kernel Density Estimation (KDE) to smooth the estimated pdf^{Footnote 2}. Our supplementary material details a fast KDE implementation.

Choice of $\mathcal{E}$. We use two common distance functions for our loss. The first is Total Variation (TV), a common distance for distributions. Note that p defines a probability measure on the finite sample space $\mathcal {R}$, $\mathcal {P}(\mathcal {R}_k)= \int _{\mathcal {R}_k} p(z) dz$, where $\mathcal {R} = \{\mathcal {R}_k, k=1 \ldots m\}$ and $\mathcal {R}_k = \mathrm{supp}(r_k)$. q defines Q in the same way. Then,

$$\begin{aligned} \mathrm{TV}(\mathcal {P}, \mathcal {Q}) = \sup _{\mathcal {A} \subset \mathcal {R}}|\mathcal {P}(\mathcal {A}) - \mathcal {Q}(\mathcal {A})|. \end{aligned}$$

(3)

Since $\mathcal {R}$ is finite, a classical result (see [6]) shows that this supremum is attained by taking $\mathcal {A} = \{\mathcal {R}_k | \mathcal {P}(\mathcal {R}_k) > \mathcal {Q}(\mathcal {R}_k)\} $ (or its complementary) and:

$$\begin{aligned} \mathrm{TV}(\mathcal {P}, \mathcal {Q}) = \frac{1}{2}\sum _{k=1}^m|\mathcal {P}(\mathcal {R}_k) - \mathcal {Q}(\mathcal {R}_k)| = \frac{1}{2}\int _{\mathbb {R}^3}|p(z) - q(z)| dz. \end{aligned}$$

(4)

The TV is half of the $\ell _1$ distance between the pdfs. $\Vert \hat{p} - q\Vert _1$ is therefore a natural choice for our loss. The second choice is $\Vert \hat{p} - q\Vert _2^2$, which is a popular distance and has the appeal of being differentiable everywhere.

Factorizing the Loss. Let us call $v_k$ the volume of $r_k$, i.e. the size of its support: $v_k \triangleq \Vert \mathbb {I}_k\Vert _1, ~ k = 1 \dots m$. Remember that $r_k = \frac{1}{v_k}\mathbb {I}_k$. Our loss can now be factorized (see supplementary material for details):

$$\begin{aligned} \int _{\mathbb {R}^3} \delta (\hat{p}(z) - q(z))dz= & {} \sum _{k=1}^m v_k \delta \left( \frac{\hat{\varvec{y}}_k}{v_k} - \frac{\sum _{t=1}^d\varvec{x}_{t}\varvec{\beta }_{t,k}}{v_k} \right) \end{aligned}$$

(5)

Here, $\delta $ is either the absolute value of the difference or the squared difference.

2.3 Training the Model: Efficient Minimization Approaches

To set the model parameters $\varvec{\beta }$, we used n example studies $\{\mathcal {S}_i = (\mathcal {T}_i, \mathcal {L}_i), ~ i = 1 \dots n \}$. We learn $\varvec{\beta }$ by minimizing the empirical risk on $\{S_i\}$ and an $\ell _2$ penalty on $\varvec{\beta }$. We add to the previous notations the index i of each example: $p_i$, $q_i$, $\hat{\varvec{y}}_i$, $\varvec{x}_i$. $\hat{\varvec{Y}} \in \mathbb {R}^{n \times m}$ is the matrix such that $\hat{\varvec{Y}}_{i} = \hat{\varvec{y}}_i$, and $\varvec{X} \in \mathbb {R}^{n \times d}$ such that $\varvec{X}_i = \varvec{x}_i$.

Case $\delta = \ell _2^2$. The empirical risk is

$$\begin{aligned} \sum _{i=1}^n \sum _{k=1}^m \left( \frac{\hat{\varvec{Y}}_{i,k}}{\sqrt{v_k}} - \sum _{t=1}^d\frac{1}{\sqrt{v_k}}\varvec{X}_{i,t}\varvec{\beta }_{t,k} \right) ^2. \end{aligned}$$

(6)

Defining $\varvec{Y}'_{:,k} = \frac{\hat{\varvec{Y}}_{:,k}}{\sqrt{(}v_k)}$ and $\varvec{\beta }'_{:,k} = \frac{\varvec{\beta }_{:,k}}{\sqrt{(}v_k)}$, with an $\ell _2$ penalty, the problem is:

$$\begin{aligned} {\mathop {\hbox {argmin}}\limits _{\varvec{\beta }'}}\left( \Vert \varvec{Y}' - \varvec{\beta }'\varvec{X}\Vert _2^2 + \lambda \Vert \varvec{\beta }'\Vert _2^2 \right) \end{aligned}$$

(7)

where $\lambda \in \mathbb {R}_+$. This is the least-squares ridge regression predicting $\hat{p}$ expressed in the orthonormal basis of our search space $\{\frac{r_k}{\Vert r_k\Vert _2}\}$.

Case $\delta = \ell _1$. The empirical risk becomes

$$\begin{aligned} \sum _{i=1}^n \sum _{k=1}^m|\hat{\varvec{Y}}_{i,k} - \sum _{t=1}^d\varvec{X}_{i,t}\varvec{\beta }_{t,k}| \end{aligned}$$

(8)

This problem is also known as a least-deviations regression, a particular case of quantile regression [7, 8]. Unlike $\ell _2$ regression, which provides an estimate of the conditional mean of the target variable, $\ell _1$ provides an estimate of the median. Quantile regression has been studied (e.g. by economists), as it is more robust to outliers and better-suited than least-squares when the noise is heteroscedastic [7]. Adding an $\ell _2$ penalty, we have the minimization problem:

$$\begin{aligned} \hat{\varvec{\beta }} = {\mathop {\hbox {argmin}}\limits _{\varvec{\beta }}}\left( \Vert \hat{\varvec{Y}} - \varvec{X}\varvec{\beta }\Vert _1 + \lambda \Vert \varvec{\beta }\Vert _2^2 \right) \end{aligned}$$

(9)

Unpenalized quantile regression is often written as a linear program and solved with the simplex algorithm [9], iteratively reweighted least squares, or interior point methods [10]. [11] uses a coordinate-descent to solve a differentiable approximation of the quantile loss (the Huber loss) with elastic-net penalty. Here, we minimize Eq. (9) via its dual formulation (c.f. supplementary material):

$$\begin{aligned} \hat{\varvec{\nu }} = {\mathop {\hbox {argmax}}\limits _{\varvec{\nu }}}\left( Tr(\varvec{\nu }^T\hat{\varvec{Y}} - \frac{1}{4\lambda }\varvec{\nu }^T\varvec{X}\varvec{X}^T\varvec{\nu }) \right) \quad \text { s.t. } \Vert \varvec{\nu }\Vert _\infty \le 1, \end{aligned}$$

(10)

where $\varvec{\nu } \in \mathbb {R}^{n \times m}$. The primal solution is given by $\hat{\varvec{\beta }} = \frac{\varvec{X}^T\hat{\varvec{\nu }}}{2\lambda }$. As the dual loss g is differentiable and the constraints are bound constraints, we can use an efficient quasi-Newton method (L-BFGS, [12]). g and its gradient are fast to compute as $\varvec{X}$ is sparse. $\lambda $ is set by cross-validation on the training set. We use warm-start on the regularization path (decreasing values for $\lambda $) to initialize each problem.

Training the Label-Constrained Encoder. The columns of $\varvec{\beta }$ can be fitted independently from each other. If we want $\varvec{\beta }$ to be diagonal, we only include one feature in each regression: we fit m univariate regressions $\varvec{\hat{y}}_{:,k} \simeq \varvec{X}_{:,k}\varvec{\beta }_{k,k}$.

2.4 Evaluation: A Natural Model-Comparison Metric

Our metric is the mean log-likelihood of an article’s coordinates in the predicted distribution, which diverges wherever $q=0$. we add a uniform background to the prediction, to ensure that it is non-zero everywhere:

$$\begin{aligned} \text {the predicted pdf is written}&q' = \frac{1}{2} (\sum _{k=1}^m\frac{\mathbb {I}_k}{v_k} + q)&\end{aligned}$$

(11)

$$\begin{aligned} \text {the score for a study }\mathcal {S}_i=(\mathcal {T}_i, \mathcal {L}_i), \mathcal {L}_i = \{l_{i,a}\} \text { is}&\frac{1}{c_i}\sum _{a=1}^{c_i}\log (q'_i(l_{i, a})) \end{aligned}$$

(12)

3 Empirical Study

3.1 Data: Mining Neuroimaging Publications

We downloaded roughly 140K neuroimaging articles from online sources including Pubmed Central and commercial publishers. About 14K of these contain coordinates, which we extracted, as in [4]. We built a vocabulary of around 1000 anatomical region names by grouping the labels of several atlases and the Wikipedia page “List of regions in the human brain”^{Footnote 3}. So in practice, $n \approx 14\cdot 10^3$ and $d \approx 1000$. m depends on the atlas (or voxel grid) and ranges from 20 to 30K.

3.2 Text-to-Brain Encoding Performance

Comparison of Atlases and Models. We perform 100 folds of shuffle-split cross-validation (10% in test set). As choices of $\{\mathcal {R}_k\}$, we compare several atlases and a grid of cubic 4-mm voxels. We also compare $\ell _1$ and $\ell _2$ regression, and label-constrained $\ell _2$. The label-constrained encoder is not used for the voxel grid, as it does not have labels. As a baseline, we include a prediction based on the average of the brain maps seen during training (i.e. independent of the text).

Figure 1 gives the results: for all models, voxel-wise encoding performs better than any atlas. Large atlas regions regularize too much. Despite its higher dimensionality, voxel-wise encoding learns better representations of anatomical terms. The label-constrained model performs poorly, sometimes below chance, as the labels of a single atlas do not cover enough words and interactions between terms are important. For voxel-wise encoding, $\ell _1$ regression outperforms $\ell _2$. The best encoder is therefore learned using a $\ell _1$ loss and a voxel partition.

Prediction Examples. Figure 2 shows the true pdf (estimated with KDE) and the prediction for the articles which obtained respectively the best and the first-quartile scores. The median is shown in the supplementary material.

Examples of Coefficients Learned by the Linear Regression. The coefficients of the linear regression (rows of $\varvec{\beta }$) are the brain maps that the model associates with each anatomical term. For frequent terms, they are close to what experts would expect (see for example Figs. 3 and 4).

3.3 Leveraging Text Without Coordinates: Neurological Examples

Our framework can leverage unstructured spatial information contained in a large corpus of unannotated text. To showcase this, assume that we want to know which parts of the brain are associated with Huntington’s disease. Our labelled corpus by itself is insufficient: only 21 documents mention the term “huntington”. But we use it to learn associations between anatomical terms and locations in the brain (Sect. 2). This gives us access to the spatial information contained in the unlabelled corpus, which was out of reach before (Sect. 3.2). We contrast the mean encoding of articles which mention “huntington” against the mean distribution (taking the difference of their log). Since the large corpus contains more information about Huntington’s disease (over 400 articles mention it), this is sufficient to see the striatum highlighted in the resulting map (Fig. 5, left). Figure 5 (right) shows the experiment for Parkinson, and Fig. 6 for Aphasia.

4 Conclusion

We have introduced a theoretical framework to translate textual description of studies into spatial distributions over the brain. Such a translation enables pooling together many studies which only provide text (no images or coordinates), for statistical analysis of their results in brain space. The statistical model gives a natural metric to validate. This metric enables comparing representations, showing that voxel-wise encoding is a better approach than relying on atlases. Building prediction models tailored to our task leads to a linear regression with an $\ell _1$ loss (least absolute deviation), the total-variation distance between the true and the predicted spatial distributions. Such a model can be trained efficiently on dozens of thousands of data points and outperforms simpler approaches.

Applied to descriptions of pathologies that lack spatial information, our model synthesizes accurate brain maps that reflect the domain knowledge. Predicting spatial distributions of medical observations from text opens new alleys for clinical research from patient health records and case reports.

Notes

1.
$\mathcal {R}_0$ denotes the volume outside of the brain, or background, on which q is 0.
2.
Using an atlas is also a form of KDE, with kernel $(z, z') \mapsto 1 / \Vert \mathbb {I}_k\Vert _1$ if z and $z'$ belong to the same region $\mathcal {R}_k, k \in \{1,\dots m\}$, 0 otherwise.
3.
https://en.wikipedia.org/wiki/List_of_regions_in_the_human_brain.

References

Mummery, C.J., Patterson, K., Price, C.J., et al.: A voxel-based morphometry study of semantic dementia: relationship between temporal lobe atrophy and semantic memory. Ann. Neurol. 47(1), 36–45 (2000)
Article Google Scholar
Bohland, J., Bokil, H., Allen, C., Mitra, P.: The brain atlas concordance problem: quantitative comparison of anatomical parcellations. PloS one 4(9), e7200 (2009)
Article Google Scholar
Laird, A.R., Fox, P.M., Price, C.J., et al.: ALE meta-analysis: controlling the false discovery rate and performing statistical contrasts. Hum. Brain Mapp. 25, 155 (2005)
Article Google Scholar
Yarkoni, T., Poldrack, R.A., Nichols, T.E., et al.: Large-scale automated synthesis of human functional neuroimaging data. Nat. Methods 8, 665 (2011)
Article Google Scholar
Van der Zwaag, W., Gentile, G., et al.: Where sound position influences sound object representations: a 7-T fMRI study. Neuroimage 54(3), 1803–1811 (2011)
Article Google Scholar
Gibbs, A.L., Su, F.E.: On choosing and bounding probability metrics. Int. Stat. Rev. 70(3), 419–435 (2002)
Article Google Scholar
Koenker, R., Bassett Jr., G.: Regression quantiles. Econom.: J. Econom. Soc. 46, 33–50 (1978)
Article MathSciNet Google Scholar
Chen, C., Wei, Y.: Computational issues for quantile regression. Sankhyā: Indian J. Stat. 67, 399–417 (2005)
MathSciNet MATH Google Scholar
Koenker, R., d’Orey, V.: Remark AS R92: a remark on algorithm as 229: computing dual regression quantiles and regression rank scores. J. R. Stat. Soc. Ser. C 43(2), 410–414 (1994)
Google Scholar
Portnoy, S., Koenker, R., et al.: The gaussian hare and the laplacian tortoise: computability of squared-error versus absolute-error estimators. Stat. Sci. 12(4), 279–300 (1997)
Article MathSciNet Google Scholar
Yi, C., Huang, J.: Semismooth newton coordinate descent algorithm for elastic-net penalized huber loss regression and quantile regression. J. Comput. Graph. Stat. 26, 547 (2017)
Article MathSciNet Google Scholar
Byrd, R.H., Lu, P., Nocedal, J., Zhu, C.: A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 16(5), 1190–1208 (1995)
Article MathSciNet Google Scholar
Walker, F.O.: Huntington’s disease. Lancet 369(9557), 218–228 (2007)
Article Google Scholar
Davie, C.A.: A review of parkinson’s disease. Br. Med. Bull. 86, 109 (2008)
Article Google Scholar
Damasio, A.R.: Aphasia. N. Engl. J. Med. 326(8), 531–539 (1992)
Article Google Scholar
Surguladze, S., et al.: Interaction of catechol O-methyltransferase and serotonin transporter genes modulates effective connectivity in a facial emotion-processing circuitry. Transl. Psychiatry 2(1), e70 (2012)
Article Google Scholar
Van Dam, W.O., Rueschemeyer, S.A., Bekkering, H.: How specifically are action verbs represented in the neural motor system: an fMRI study. Neuroimage 53(4), 1318–1325 (2010)
Article Google Scholar
Silverman, B.W.: Density Estimation for Statistics and Data Analysis, vol. 26. CRC Press, Boca Raton (1986)
Book Google Scholar
Simonoff, J.S.: Smoothing Methods in Statistics. Springer Science & Business Media, New York (2012). https://doi.org/10.1007/978-1-4612-4026-6
Book MATH Google Scholar
Wand, M.: Fast computation of multivariate kernel estimators. J. Comput. Graph. Stat. 3(4), 433–445 (1994)
MathSciNet Google Scholar
Gramacki, A., Gramacki, J.: FFT-based fast computation of multivariate kernel density estimators with unconstrained bandwidth matrices. J. Comput. Graph. Stat. 26, 459–462 (2016)
Article MathSciNet Google Scholar

Download references

Acknowledgements

This project received funding from: the European Union’s H2020 Research Programme under Grant Agreement No. 785907 (HBP SGA2), the Metacog Digiteo project, the MetaMRI associate team, and ERC NeuroLang.

Author information

Authors and Affiliations

INRIA, CEA, Université Paris-Saclay, Paris, France
Jérôme Dockès, Demian Wassermann, Bertrand Thirion & Gaël Varoquaux
Stanford University, Stanford, USA
Russell Poldrack
Télécom ParisTech, Paris, France
Fabian Suchanek

Authors

Jérôme Dockès
View author publications
You can also search for this author in PubMed Google Scholar
Demian Wassermann
View author publications
You can also search for this author in PubMed Google Scholar
Russell Poldrack
View author publications
You can also search for this author in PubMed Google Scholar
Fabian Suchanek
View author publications
You can also search for this author in PubMed Google Scholar
Bertrand Thirion
View author publications
You can also search for this author in PubMed Google Scholar
Gaël Varoquaux
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jérôme Dockès .

Editor information

Editors and Affiliations

University of Leeds, Leeds, UK
Alejandro F. Frangi
King’s College London, London, UK
Julia A. Schnabel
University of Pennsylvania, Philadelphia, PA, USA
Christos Davatzikos
Universidad de Valladolid, Valladolid, Spain
Carlos Alberola-López
Queen’s University, Kingston, ON, Canada
Gabor Fichtinger

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 362 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Dockès, J., Wassermann, D., Poldrack, R., Suchanek, F., Thirion, B., Varoquaux, G. (2018). Text to Brain: Predicting the Spatial Distribution of Neuroimaging Observations from Text Reports. In: Frangi, A., Schnabel, J., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2018. MICCAI 2018. Lecture Notes in Computer Science(), vol 11072. Springer, Cham. https://doi.org/10.1007/978-3-030-00931-1_67

Download citation

DOI: https://doi.org/10.1007/978-3-030-00931-1_67
Published: 13 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00930-4
Online ISBN: 978-3-030-00931-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Text to Brain: Predicting the Spatial Distribution of Neuroimaging Observations from Text Reports

Abstract

Similar content being viewed by others