
Deep learning for undersampled MRI reconstruction


Published 25 June 2018 © 2018 Institute of Physics and Engineering in Medicine
Citation: Chang Min Hyun et al 2018 Phys. Med. Biol. 63 135007. DOI 10.1088/1361-6560/aac71a


Abstract

This paper presents a deep learning method for faster magnetic resonance imaging (MRI) by reducing k-space data with sub-Nyquist sampling strategies and provides a rationale for why the proposed approach works well. Uniform subsampling is used in the time-consuming phase-encoding direction to capture high-resolution image information, while permitting the image-folding problem dictated by the Poisson summation formula. To deal with the localization uncertainty due to image folding, a small number of low-frequency k-space data are added. Training the deep learning net involves input and output images that are pairs of the Fourier transforms of the subsampled and fully sampled k-space data. Our experiments show the remarkable performance of the proposed method; only 29% of the k-space data can generate images of high quality as effectively as standard MRI reconstruction with the fully sampled data.


Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Magnetic resonance imaging (MRI) produces cross-sectional images with high spatial resolution using strong nuclear magnetic resonance, gradient fields, and the hydrogen atoms inside the human body (Lauterbur 1973, Seo et al 2014). MRI does not use damaging ionizing radiation like x-rays, but the scan takes a long time (Sodickson et al 1997, Haacke et al 1999) and involves confining the subject in an uncomfortable narrow bore. Shortening the MRI scan time might help increase patient satisfaction, reduce motion artifacts from patient movement, and reduce the medical cost. The MRI scan time is roughly proportional to the number of time-consuming phase-encoding steps in k-space. Many efforts have been made to expedite MRI scans by skipping phase-encoding lines in k-space while eliminating aliasing, a serious consequence of violating the Nyquist criterion (Nyquist 1928) by skipping. Compressed sensing MRI and parallel MRI are among the techniques used to deal with these aliasing artifacts. Compressed sensing MRI uses prior information on MR images to compensate for the unmeasured k-space data and thereby eliminate or reduce aliasing artifacts. Parallel MRI installs multiple receiver coils and uses the space-dependent properties of the receiver coils to reduce aliasing artifacts (Sodickson et al 1997, Pruessmann et al 1999, Larkman et al 2001). This paper focuses solely on single-channel MRI for simplicity; hence, parallel MRI is not discussed.

In undersampled MRI, we attempt to find an optimal reconstruction function $f: \boldsymbol{x} \mapsto \boldsymbol{y}$, which maps highly undersampled k-space data ($\boldsymbol{x}$) to an image ($\boldsymbol{y}$) close to the MR image corresponding to fully sampled data. Undersampled MRI consists of two parts, subsampling and reconstruction, as shown in figure 1.


Figure 1. General strategy for the undersampled MRI reconstruction problem. The inverse Fourier transform of fully sampled k-space data $\boldsymbol{x}_{\rm full}$ produces a reconstructed MR image $\boldsymbol{y}$. The goal is to find a subsampling function $\mathcal{S}$ and learn an undersampled MRI reconstruction $f$ from the training dataset. Here, $\boldsymbol{y}_{\mathcal{S}} = |\mathcal{F}^{-1}| \circ \mathcal{P}(\boldsymbol{x})$ is an aliased image caused by the violation of the Nyquist criterion. We use the U-net to find the function $g$ that provides the mapping from the aliased image $\boldsymbol{y}_{\mathcal{S}}$ to an anti-aliased image $\boldsymbol{y}$.


Compressive sensing (CS) MRI can be viewed as a sub-Nyquist sampling method in which image sparsity is enforced to compensate for the undersampled data (Donoho 2004, 2006, Candes et al 2006, Lustig et al 2007). CS-MRI can be described roughly as a model-fitting method that reconstructs the MR image $\boldsymbol{y}$ by adding a regularization term enforcing a sparsity-inducing prior on $\boldsymbol{y}$. It aims to reconstruct an image given by

Equation (1): $\boldsymbol{y}_{\rm rec} = \underset{\boldsymbol{y}}{\operatorname{argmin}} \ \frac{1}{2}\| \boldsymbol{x} - \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}) \|_{\ell_2}^2 + \lambda \| \mathcal{T}(\boldsymbol{y}) \|_{\ell_1}$

where $\mathcal{F}$ denotes the Fourier transform, $\mathcal{S}$ is a subsampling, $\mathcal{T}(\boldsymbol{y})$ represents a transformation capturing the sparsity pattern of $\boldsymbol{y}$, $\circ$ is the symbol of composition, and $\lambda$ is the regularization parameter controlling the trade-off between the residual norm and regularity. Here, the term $\| \boldsymbol{x} - \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}) \|_{\ell_2}$ forces the residual $\boldsymbol{x} - \mathcal{S} \circ \mathcal{F}(\boldsymbol{y})$ to be small, whereas $\| \mathcal{T}(\boldsymbol{y}) \|_{\ell_1}$ enforces the sparsity of $\mathcal{T}(\boldsymbol{y})$. In CS-MRI, a priori knowledge of MR images is converted into sparsity of $\mathcal{T}(\boldsymbol{y})$ with a suitable choice of $\mathcal{T}$. The most widely used CS method is total variation denoising (i.e. $\| \nabla \boldsymbol{y} \|_{\ell_1}$), which enforces piecewise constant images by uniformly penalizing image gradients. Although CS-MRI with random sampling has attracted a large amount of attention over the past decade, it has some limitations in preserving the fine-scale details and noise-like textures that hold diagnostically important information in MR images.
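As a concrete illustration of the model-fitting formulation (1), the following minimal sketch solves it with the iterative soft-thresholding algorithm (ISTA) for the simplest choice of sparsifying transform, $\mathcal{T}$ equal to the identity, on a single-channel Cartesian acquisition with a binary sampling mask. The function names, step size, and choice of $\mathcal{T}$ are illustrative assumptions, not the algorithm used in the cited works.

```python
import numpy as np

def ista_cs_mri(x_under, mask, lam=0.01, step=1.0, n_iter=200):
    """Sketch of the CS-MRI model (1) solved by ISTA with T = identity.

    x_under : zero-filled undersampled k-space data (complex, N x N)
    mask    : binary array, 1 where k-space was measured
    """
    # Forward model S o F realized as a mask applied to the unitary 2D DFT.
    A = lambda y: mask * np.fft.fft2(y, norm="ortho")
    AH = lambda k: np.fft.ifft2(mask * k, norm="ortho")  # adjoint of S o F

    y = AH(x_under)                      # zero-filled starting point
    for _ in range(n_iter):
        grad = AH(A(y) - x_under)        # gradient of 0.5*||x - S o F(y)||^2
        z = y - step * grad
        # Proximal step: complex soft-thresholding enforcing the l1 term.
        y = np.maximum(np.abs(z) - step * lam, 0) * np.exp(1j * np.angle(z))
    return y
```

Because the FFT here is unitary and the mask is a projection, the data-fidelity gradient is 1-Lipschitz, so a unit step size is safe.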

In contrast to the regularized least-squares approach (1), our deep learning approach follows a completely reversed paradigm. It aims to learn a function $f: \boldsymbol{x} \mapsto \boldsymbol{y}$ using many training data $\{(\boldsymbol{x}^{(i)}, \boldsymbol{y}^{(i)}): i = 1, \cdots, N\}$. Roughly speaking, $f$ is obtained by

Equation (2): $f = \underset{f \in \Bbb{U}_{\rm net}}{\operatorname{argmin}} \ \sum_{i=1}^{N} \| f(\boldsymbol{x}^{(i)}) - \boldsymbol{y}^{(i)} \|_{\ell_2}^2$

where $\Bbb{U}_{\rm net}$ is a deep convolutional neural network with some domain (or prior) knowledge determined by a training dataset consisting of pairs of fully sampled MR images and folded images. A U-net can provide a low-dimensional latent representation and preserve high-resolution features through concatenation in the upsampling process (Ronneberger et al 2015). This reconstruction function $f$ can be viewed as the inverse mapping of the forward model $\mathcal{S} \circ \mathcal{F}$ subject to the constraint that MR images are assumed to lie on a low-dimensional manifold. In the conventional regularized least-squares framework (1), it is very difficult to incorporate the very complicated MR image manifold into the regularization term. In the deep learning framework, however, the manifold constraint learned from the training set acts as highly nonlinear compressed sensing, yielding a useful reconstruction $f(\boldsymbol{x})$ by leveraging complex prior knowledge of $\boldsymbol{y}$.

There are several recent machine-learning-based methods for undersampled MRI (Hammernik et al 2017, Kwon et al 2017, Lee et al 2017) that were developed around the same time as our method. Hammernik et al developed an efficient trainable formulation for accelerated parallel imaging (PI): a learned variational framework that reconstructs MR images from accelerated multicoil MR data. The method is designed to learn a complete reconstruction procedure for multichannel MR data in the regularized least-squares framework; their aim is to learn a set of parameters associated with the gradient of the regularization term in a gradient descent scheme. Kwon et al applied the multilayer perceptron algorithm to reconstruct MR images from subsampled multicoil data. They reconstruct the image by using information from multiple receiver coils with different spatial sensitivities. In their method, the acceleration factor cannot be larger than the number of coils. Finally, Lee et al used a residual learning method to estimate the aliasing artifacts in distorted images obtained from undersampled data.

In this paper, a subsampling strategy for deep learning is explained using a separability condition in order to produce MR images whose quality is as high as that of a regular MR image reconstructed from fully sampled k-space data. The subsampling strategy is to preserve the information in $\boldsymbol{x}_{\rm full}$ as much as possible while maximizing the skipping rate. To be precise, we use uniform subsampling in the phase-encoding direction so that, according to the Poisson summation formula, the Fourier transform contains all detailed features in a folded image.

We include a few low-frequency samples to learn the overall structure of MR images and to deal with the anomaly location uncertainty inherent in uniform sampling. The experiments show the high performance of the proposed method.

2. Method

Let $\boldsymbol{y} \in \Bbb{C}^{N \times N}$ be the MR image to be reconstructed, where $N^2$ is the number of pixels and $\Bbb{C}$ is the set of complex numbers. In 2D Fourier imaging with Cartesian k-space sampling, the MR image $\boldsymbol{y}$ can be reconstructed from the corresponding k-space data $\boldsymbol{x}_{\rm full} \in \Bbb{C}^{N \times N}$: for $n, m = 1-N/2, \cdots, 0, \cdots, N/2$,

Equation (3): $\boldsymbol{y}(n, m) = \frac{1}{N^2} \sum_{a=1-N/2}^{N/2} \sum_{b=1-N/2}^{N/2} \boldsymbol{x}_{\rm full}(a, b)\, e^{\frac{2\pi i}{N}(an + bm)}$

where $\boldsymbol{x}_{\rm full}(a, b)$ is the MR signal received at k-space position $(2\pi a/N, 2\pi b/N)$. In our convention, the frequency encoding is along the a-axis and the phase encoding is along the b-axis in k-space.
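The following short NumPy check illustrates the Fourier relation (3): the MR image is recovered exactly from fully sampled Cartesian k-space data by the inverse 2D discrete Fourier transform. The centered k-space layout via fftshift and the unitary normalization are implementation assumptions.

```python
import numpy as np

N = 256
y_true = np.random.rand(N, N)                      # stand-in for an MR image

# Simulate fully sampled k-space data x_full with a centered layout, as in (3).
x_full = np.fft.fftshift(np.fft.fft2(y_true, norm="ortho"))

# Reconstruction from fully sampled data: the inverse 2D DFT of equation (3).
y_rec = np.fft.ifft2(np.fft.ifftshift(x_full), norm="ortho")
assert np.allclose(y_rec, y_true)                  # exact up to round-off
```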

In undersampled MRI, we violate the Nyquist criterion and skip phase-encoding lines during the acquisition to speed up the time-consuming phase encoding. However, sub-Nyquist k-space data yield aliasing artifacts in the image space. For example, suppose that we obtain an acceleration factor of 2 by sampling every other phase-encoding line and skipping the rest. Then, the k-space data with zero padding are given by

Equation (4): $\mathcal{P} \circ \mathcal{S}(\boldsymbol{x}_{\rm full})(a, b) = \begin{cases} \boldsymbol{x}_{\rm full}(a, b) & \text{if } b \text{ is even} \\ 0 & \text{otherwise.} \end{cases}$

According to the Poisson summation formula, the discrete Fourier transform of the above uniformly subsampled data with factor 2 produces the following two-folded image (Seo et al 2012):

Equation (5): $\mathcal{F}^{-1} \circ \mathcal{P} \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y})(n, m) = \frac{1}{2}\left( \boldsymbol{y}(n, m) + \boldsymbol{y}(n, m + N/2) \right) =: \boldsymbol{y}_{2\text{-fold}}(n, m)$

If the deep learning approach were able to find an unfolding map $\boldsymbol{y}_{2\text{-fold}} \mapsto \boldsymbol{y}$, we could accelerate the data acquisition in this way. However, it is impossible to obtain this unfolding map, even with sophisticated manifold learning for MR images. In the left panel of figure 2, we consider two different MR images $\boldsymbol{y}_1$ and $\boldsymbol{y}_2$ with small anomalies at the bottom $(n, m)$ and top $(n, m + N/2)$, respectively. The corresponding k-space data $\mathcal{F}(\boldsymbol{y}_1)$ and $\mathcal{F}(\boldsymbol{y}_2)$ are different. However, the corresponding uniformly subsampled k-space data with factor 2, $\mathcal{P} \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}_1)$ and $\mathcal{P} \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}_2)$, are completely identical, and hence $\mathcal{F}^{-1} \circ \mathcal{P} \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}_1) = \mathcal{F}^{-1} \circ \mathcal{P} \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}_2)$. Here, $\mathcal{S}$ and $\mathcal{P}$ are the sampling and zero-padding operators, respectively, so that $\mathcal{P} \circ \mathcal{S}(\boldsymbol{x}_{\rm full})$ is the subsampled k-space data with zero padding given in (4). It is not possible to identify whether the anomaly is at the top or the bottom. Deep learning cannot solve this unsolvable problem. We now explain our undersampling strategy for deep learning.
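The folding predicted by the Poisson summation formula in (5) can be checked numerically. The sketch below keeps every other k-space line along the phase-encoding axis (here axis 1), zero-fills the rest, and compares the zero-filled reconstruction against the two-folded image; the even-line convention and axis choice are assumptions.

```python
import numpy as np

N = 256
y = np.random.rand(N, N) + 1j * np.random.rand(N, N)   # arbitrary test image
x_full = np.fft.fft2(y, norm="ortho")                   # fully sampled k-space

# P o S for factor 2: keep every other phase-encoding line, zero the rest.
mask = np.zeros((N, N))
mask[:, ::2] = 1

# Zero-filled reconstruction F^-1 o P o S o F (y).
y_sub = np.fft.ifft2(mask * x_full, norm="ortho")

# Two-folded image predicted by equation (5).
y_2fold = 0.5 * (y + np.roll(y, N // 2, axis=1))
print(np.allclose(y_sub, y_2fold))                      # True
```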


Figure 2. Feasibility of deep learning methods. Learning $f$ requires separability: $\boldsymbol{y}_1 \neq \boldsymbol{y}_2$ implies $|\mathcal{F}^{-1}| \circ \mathcal{P} \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}_1) \neq |\mathcal{F}^{-1}| \circ \mathcal{P} \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}_2)$. The figure on the left shows why uniform subsampling does not satisfy the separability condition. We consider two different MR images with small anomalies at positions $(n, m)$ and $(n, m + N/2)$, respectively. The corresponding k-space data are different, but the corresponding uniformly subsampled k-space data with factor 2 are completely identical. It is hence not possible to identify whether the anomaly is at the top or the bottom. In contrast, the figure on the right shows why separability can be achieved by adding low-frequency data. The additional low-frequency lines in the yellow box provide the location information of small anomalies.


Remark 2.1. Given the undersampled data $\boldsymbol{x}$, let $\boldsymbol{y}_\flat$ be the minimum-norm solution, that is,

$\boldsymbol{y}_\flat = \underset{\boldsymbol{y}}{\operatorname{argmin}} \ \|\boldsymbol{y}\|_{\ell^2}$ subject to $\mathcal{S} \circ \mathcal{F}(\boldsymbol{y}) = \boldsymbol{x}$.

This $\boldsymbol{y}_\flat$ is $\mathcal{F}^{-1}(\mathcal{P}(\boldsymbol{x}))$, the inverse Fourier transform of the data $\boldsymbol{x}$ padded with zeros. This is because $\|\mathcal{P}(\boldsymbol{x})\|_{\ell^2} \leqslant \|\boldsymbol{x}'\|_{\ell^2}$ for all $\boldsymbol{x}'$ satisfying $\mathcal{S}(\boldsymbol{x}') = \boldsymbol{x}$, and the Fourier transform is an isometry with respect to the $\ell^2$ norm. Unfortunately, this minimum-norm solution $\boldsymbol{y}_\flat$ is undesirable in most cases. See appendix A.

2.1. Subsampling strategy

Let $\{(\boldsymbol{x}^{(j)}, \boldsymbol{y}^{(j)})\}_{j=1}^M$ be a training set of undersampled k-space data and ground-truth MR images. The vectors $\boldsymbol{x}^{(j)}$ and $\boldsymbol{y}^{(j)}$ are in the space $\Bbb{C}^{N \times N}$. Figure 1 shows a schematic diagram of our undersampled reconstruction method, where the corresponding inverse problem is to solve the underdetermined linear system

Equation (6): $\mathcal{S} \circ \mathcal{F}(\boldsymbol{y}) = \boldsymbol{x}.$

Given undersampled data $\boldsymbol{x}$, there are infinitely many solutions $\boldsymbol{y}$ of (6) in $\Bbb{C}^{N \times N}$. It is impossible to invert the ill-conditioned system $\mathcal{S} \circ \mathcal{F}: \Bbb{C}^{N \times N} \to \mathfrak{R}_{\mathcal{S} \circ \mathcal{F}}$, where $\mathfrak{R}_{\mathcal{S} \circ \mathcal{F}}$ is the range space of the operator $\mathcal{S} \circ \mathcal{F}$ and its dimension is much lower than $N^2$. We use the fact that the MR images of humans lie on a much lower-dimensional manifold $\mathcal{M}$ embedded in the space $\Bbb{C}^{N \times N}$. With this (unknown) constraint $\mathcal{M}$, there is the possibility that a practically meaningful inverse $f$ exists in the sense that

Equation (7): $f \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}) \approx \boldsymbol{y} \quad \text{for all } \boldsymbol{y} \in \mathcal{M}.$

On the left of figure 2, we consider the case in which $\mathcal{S}$ is the uniform subsampling of factor 2. With this choice of $\mathcal{S}$, two different images $\boldsymbol{y}_1 \neq \boldsymbol{y}_2$ produce identical folded images $|\mathcal{F}^{-1}| \circ \mathcal{P} \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}_1) = |\mathcal{F}^{-1}| \circ \mathcal{P} \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}_2)$. This means that the uniform subsampling of factor 2 is inappropriate for learning an $f$ satisfying (7). Here, $\boldsymbol{y}_1$ is the standard Shepp-Logan phantom image and $\boldsymbol{y}_2$ is a modified image of $\boldsymbol{y}_1$ obtained by moving three small anomalies to their symmetric positions with respect to the middle horizontal line. In contrast, if we add a few low frequencies to the uniform subsampling of factor 2, as shown in the image on the right of figure 2, the situation changes dramatically and the separability condition (8) may be achieved:

Equation (8): $\boldsymbol{y}_1 \neq \boldsymbol{y}_2 \ \Longrightarrow\ |\mathcal{F}^{-1}| \circ \mathcal{P} \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}_1) \neq |\mathcal{F}^{-1}| \circ \mathcal{P} \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}_2).$

In figure 3, we demonstrate the separability condition again using patient data. Figure 3(a) is the ground truth, where the tumor is at the bottom. Figures 3(b) and (d) are the reconstructed images using uniform subsampling of factors 2 and 4, respectively; tumor-like lesions appear at both the top and the bottom, so uniform subsampling of factor 2 or 4 alone is not separable. However, in the reconstructed images in figures 3(c) and (e), which use uniform subsampling of factors 2 and 4 with added low frequencies, the tumors are clearly located at the bottom and separability (8) may be achieved. This crucial observation is validated by various numerical simulations, as shown in figure 5.


Figure 3. MR images of a human brain with a tumor at the bottom. Images (a)–(e) are reconstructed from (f) full sampling, (g) uniform subsampling of factor 2, (h) uniform subsampling of factor 2 with some added low frequencies, (i) uniform subsampling of factor 4, and (j) uniform subsampling of factor 4 with added low frequencies, respectively. In (b) and (d), tumor-like lesions are found at both the top and the bottom; one is a copy of the other. Hence, a location uncertainty exists with uniform sampling. However, in the reconstructed images (c) and (e), which use uniform subsampling of factors 2 and 4 with added low frequencies, the tumors are clearly located at the bottom. The location uncertainty can hence be addressed by adding a few low frequencies in k-space.


In the subsampling strategy, we use a uniform subsampling of factor 4 (25% of the k-space data, 64 of a total of 256 lines) together with a few low frequencies (about 4% of the k-space data, 12 of a total of 256 lines). Owing to the Poisson summation formula, the uniformly subsampled data with factor 4 provide the detailed structure of the folded image of $\boldsymbol{y}$ as

Equation (9): $\mathcal{F}^{-1} \circ \mathcal{P} \circ \mathcal{S} \circ \mathcal{F}(\boldsymbol{y})(n, m) = \frac{1}{4}\left( \boldsymbol{y}(n, m) + \boldsymbol{y}(n, m + \tfrac{N}{4}) + \boldsymbol{y}(n, m + \tfrac{N}{2}) + \boldsymbol{y}(n, m + \tfrac{3N}{4}) \right).$

However, the folded image may not contain the location information of small anomalies. We resolve this anomaly location uncertainty by adding a small amount of low-frequency k-space data (see appendix B for details).
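A sketch of how such a sampling pattern could be generated is shown below, assuming 256 phase-encoding lines, a uniform factor of 4, and 12 centered low-frequency lines (the parametrization mirrors appendix B). The exact indices of the low-frequency lines are not specified in the text, so the centering is an assumption.

```python
import numpy as np

def subsampling_mask(N=256, rho=4, num_low_freq=12):
    """Binary k-space mask: uniform subsampling of factor rho along the
    phase-encoding direction plus a centered band of low-frequency lines.
    Assumes a centered k-space layout (DC line at column N // 2)."""
    mask = np.zeros((N, N))
    mask[:, ::rho] = 1                                          # uniform lines
    c = N // 2
    mask[:, c - num_low_freq // 2: c + num_low_freq // 2] = 1   # low frequencies
    return mask

mask = subsampling_mask()
print(mask[0].sum() / 256)   # fraction of sampled lines, roughly 0.29
```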

2.2. Image reconstruction function

In this subsection, we describe the image reconstruction function $f$, which is schematically illustrated in figure 4. When we have undersampled data $\boldsymbol{x}$ as an input of $f$, about 70% of $\boldsymbol{x}$ is not measured and not recorded. The first step of $f$ is to fill in zeros for the unmeasured region of $\boldsymbol{x}$ to obtain $\mathcal{P}(\boldsymbol{x})$. After the zero padding has been added, we take the inverse Fourier transform of $\mathcal{P}(\boldsymbol{x})$, take its absolute value, and obtain the folded image $\boldsymbol{y}_{\mathcal{S}}$. We feed this folded image $\boldsymbol{y}_{\mathcal{S}}$ into the trained U-net and obtain the U-net output image $\tilde{\boldsymbol{y}}$. We apply the Fourier transform to $\tilde{\boldsymbol{y}}$, which yields the k-space data $\mathcal{F}(\tilde{\boldsymbol{y}})$. The U-net recovers the zero-padded part of the k-space information. However, during this recovery, the unpadded parts of the data are distorted. We fix this unwanted distortion by placing the original values of $\boldsymbol{x}$ in their corresponding positions in the k-space data $\mathcal{F}(\tilde{\boldsymbol{y}})$. We call this k-space correction $f_{cor}$ and set $\hat{\boldsymbol{x}} = f_{cor}(\mathcal{F}(\tilde{\boldsymbol{y}}))$. Because the original input data are preserved, we expect to obtain a more satisfactory reconstructed image and, indeed, our experiments show that the k-space correction is very effective. Finally, we apply the inverse Fourier transform to $\hat{\boldsymbol{x}}$, take the absolute value, and obtain our reconstructed image $|\mathcal{F}^{-1}(\hat{\boldsymbol{x}})|$.
In summary, our image reconstruction function $f: \boldsymbol{x} \mapsto \boldsymbol{y}$ is given by

Equation (10): $f = |\mathcal{F}^{-1}| \circ f_{cor} \circ \mathcal{F} \circ f_d \circ |\mathcal{F}^{-1}| \circ \mathcal{P}$

where $f_d$ is the trained U-net and $f_{cor}$ indicates the k-space correction. Here, $f_d$ is determined by the following training process.
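The reconstruction function (10) can be written down directly from the description above. In the sketch below, f_d is a placeholder for the trained U-net acting on magnitude images, and the k-space correction simply overwrites the network's k-space output at the measured locations; the mask argument and the assumption that the input k-space data are already zero-filled are implementation choices, not details given in the text.

```python
import numpy as np

def k_space_correction(k_net, x_measured, mask):
    """f_cor: put the originally measured k-space values back in place."""
    return np.where(mask > 0, x_measured, k_net)

def reconstruct(x_measured, mask, f_d):
    """Image reconstruction function f of equation (10).

    x_measured : zero-filled undersampled k-space data (complex, N x N)
    mask       : binary sampling mask used for the acquisition
    f_d        : trained U-net mapping a folded magnitude image to an
                 unfolded magnitude image (placeholder)
    """
    y_folded = np.abs(np.fft.ifft2(x_measured, norm="ortho"))  # |F^-1| o P
    y_tilde = f_d(y_folded)                                    # U-net output
    k_net = np.fft.fft2(y_tilde, norm="ortho")                 # back to k-space
    x_hat = k_space_correction(k_net, x_measured, mask)        # f_cor
    return np.abs(np.fft.ifft2(x_hat, norm="ortho"))           # final image
```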


Figure 4. The proposed method consists of two major components: deep learning using a U-net and k-space correction. As preprocessing, we first fill in zeros for the unmeasured region of the undersampled data to get the zero-padded data. Then, we take the inverse Fourier transform, take its absolute value, and obtain the folded image. After this preprocessing, we feed the folded image into the trained U-net to produce the U-net output. The U-net recovers the zero-padded part of the k-space data. We take the Fourier transform and replace the unpadded parts with the original k-space data to preserve the original measured data. Finally, we obtain the final output image by applying the inverse Fourier transform and taking the absolute value.


To train and test the U-net $f_d$, we generate the training and test sets as follows. Given ground-truth MR images $\{\boldsymbol{y}^{(j)}\}_{j=1}^N$, we take the Fourier transform of each $\boldsymbol{y}^{(j)}$ and apply our subsampling strategy $\mathcal{S}$, which yields $\boldsymbol{x}^{(j)}$. This provides a dataset $\{(\boldsymbol{x}^{(j)}, \boldsymbol{y}^{(j)})\}_{j=1}^N$ of subsampled k-space data and ground-truth MR images. The dataset is divided into two subsets: a training set $\{(\boldsymbol{x}^{(j)}, \boldsymbol{y}^{(j)})\}_{j=1}^M$ and a test set $\{(\boldsymbol{x}^{(j)}, \boldsymbol{y}^{(j)})\}_{j=M+1}^N$. The input $\boldsymbol{x}^{(j)}$ of the image reconstruction function $f$ is undersampled k-space data and the output $\boldsymbol{y}^{(j)}$ is the ground-truth image. Using the zero-padding operator, the inverse Fourier transform, and the absolute value, we obtain the folded images $\boldsymbol{y}_{\mathcal{S}}^{(j)}$. Our training goal is then to recover the ground-truth images $\boldsymbol{y}^{(j)}$ from the folded images $\boldsymbol{y}_{\mathcal{S}}^{(j)}$. Note that $\{(\boldsymbol{y}_{\mathcal{S}}^{(j)}, \boldsymbol{y}^{(j)})\}_{j=1}^{M}$ is the set of pairs used to train $f_d$.
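A sketch of this training-pair generation, reusing a binary mask of the kind sketched in section 2.1 (an assumption about the implementation), could look as follows.

```python
import numpy as np

def make_training_pair(y_true, mask):
    """From a ground-truth image y^(j), produce (folded input, ground truth)."""
    x_full = np.fft.fft2(y_true, norm="ortho")              # fully sampled k-space
    x_sub = mask * x_full                                   # subsampling S with zero padding P
    y_folded = np.abs(np.fft.ifft2(x_sub, norm="ortho"))    # |F^-1| o P o S o F
    return y_folded, np.abs(y_true)

# Example (images is a hypothetical list of 256 x 256 ground-truth MR images):
# pairs = [make_training_pair(y, mask) for y in images]
```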

The architecture of our U-net is illustrated in figure 4. The first half of the network is the contracting path and the second half is the expansive path. The size of the input and output images is 256 × 256. In the contracting path, we first apply 3 × 3 convolutions with zero padding so that the image size does not decrease after convolution. The convolution layers improve the performance of machine learning systems by extracting useful features, sharing parameters, and introducing sparse interactions and equivariant representations (Bengio et al 2015). After each convolution, we use a rectified linear unit (ReLU) as the activation function to mitigate the vanishing gradient problem (Glorot et al 2011). Then, we apply 2 × 2 max pooling with a stride of 2. Max pooling helps to make the representation approximately invariant to small translations of the input (Bengio et al 2015). In the expansive path, we use average unpooling instead of max pooling to restore the size of the output. To localize more precisely, the upsampled output is concatenated with the corresponding feature map from the contracting path. At the last layer, a 1 × 1 convolution is used to combine the 64 feature maps into a single output image (Ronneberger et al 2015).
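A minimal Keras sketch of a U-net of this type is given below. The channel widths, the depth, and the use of UpSampling2D as a stand-in for the average unpooling described above are assumptions for illustration; the paper's exact configuration was chosen empirically.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    # Two 3x3 convolutions with zero padding ("same") and ReLU activations.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(input_shape=(256, 256, 1), base_filters=64, depth=4):
    inputs = layers.Input(shape=input_shape)
    skips, x = [], inputs

    # Contracting path: conv blocks followed by 2x2 max pooling (stride 2).
    for d in range(depth):
        x = conv_block(x, base_filters * 2 ** d)
        skips.append(x)                           # saved for skip concatenation
        x = layers.MaxPooling2D(2)(x)

    x = conv_block(x, base_filters * 2 ** depth)  # bottom of the "U"

    # Expansive path: upsampling (standing in for average unpooling) and
    # concatenation with the corresponding contracting-path feature maps.
    for d in reversed(range(depth)):
        x = layers.UpSampling2D(2)(x)
        x = layers.Concatenate()([x, skips[d]])
        x = conv_block(x, base_filters * 2 ** d)

    # Final 1x1 convolution combining the 64 feature maps into one image.
    outputs = layers.Conv2D(1, 1)(x)
    return tf.keras.Model(inputs, outputs)

unet = build_unet()
```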

The input of the net is $\boldsymbol{y}_{\mathcal{S}}^{(j)}$, the weights are $W$, the net, as a function of the weights $W$, is $f_{net}(\cdot, W)$, and the output is denoted by $f_{net}(\boldsymbol{y}_{\mathcal{S}}^{(j)}, W)$. To train the net, we use the $\ell^2$ loss and find the optimal weight set $W_0$ with

Equation (11): $W_0 = \underset{W}{\operatorname{argmin}} \ \sum_{j=1}^{M} \| f_{net}(\boldsymbol{y}_{\mathcal{S}}^{(j)}, W) - \boldsymbol{y}^{(j)} \|_{\ell^2}^2$

Once the optimal weight W0 is found, we stop the training and denote the trained U-net as $f_d=f_{net}(\cdot, W_0)$ .

In our experiment, the ground-truth MR image $\boldsymbol{y}$ was normalized to the range $[0, 1]$ and the undersampled data $\boldsymbol{x}$ comprised 29% of the k-space data, as described in section 2. We trained our model using a training set of 1400 images from 30 patients. The MR images were obtained using a T2-weighted turbo spin-echo pulse sequence (repetition time = 4408 ms, echo time = 100 ms, echo spacing = 10.8 ms) (Loizou et al 2011). To train our deep neural network, all weights were initialized from a zero-centered normal distribution with standard deviation 0.01, without a bias term. The loss function was minimized using the RMSPropOptimizer with learning rate 0.001, weight decay 0.9, mini-batch size 32, and 2000 epochs. RMSProp, an adaptive gradient method, was proposed by Tieleman and Hinton to overcome difficulties in the optimization process in practical machine learning implementations (Tieleman et al 2012). Training was implemented using TensorFlow (Google 2015) on a system with an Intel(R) Core(TM) i7-6850K 3.60 GHz CPU and four NVIDIA GTX-1080 8 GB GPUs. The network required approximately six hours for training.
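The training configuration described here might be written with Keras roughly as follows. Interpreting 'weight decay 0.9' as the RMSProp decay parameter rho, and wiring the normal initializer into each convolution, are assumptions; `unet` refers to the model built in the earlier sketch.

```python
import tensorflow as tf

# Weights drawn from a zero-centered normal (std 0.01), no bias term; this
# initializer would be passed to each Conv2D (kernel_initializer=init,
# use_bias=False) when building the network.
init = tf.keras.initializers.RandomNormal(mean=0.0, stddev=0.01)

# unet: the Keras model built in the previous sketch.
unet.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001, rho=0.9),
    loss="mean_squared_error",           # the l2 loss of equation (11)
)

# folded_images, truth_images: arrays of shape (M, 256, 256, 1), scaled to [0, 1]
# unet.fit(folded_images, truth_images, batch_size=32, epochs=2000)
```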

3. Results

Figure 5 shows the performance of the proposed method for five different brain images in the test set. The first, second, and third columns show the ground-truth, aliased, and corrected images, respectively. The aliased images are folded four times. The proposed method suppresses these artifacts and provides surprisingly sharp and natural-looking images.


Figure 5. Numerical simulation results of five different brain MR images. The first, second and third columns show the ground-truth, aliased and corrected images, respectively. The proposed method significantly reduces the undersampling artifacts while preserving morphological information.


Figure 6 displays the impact of the k-space correction. The four images in the first row are the ground truth (figure 6(a)), the input (figure 6(b)) and output (figure 6(c)) of the U-net, and the final output after the k-space correction (figure 6(d)). In the second row, we subtract the ground truth from the images in the first row. The images before (figure 6(c)) and after (figure 6(d)) the k-space correction are visually indistinguishable. However, figures 6(g) and (h) reveal the impact of the k-space correction: the U-net removes most of the folding artifacts, but a few remain, and the k-space correction removes these remaining artifacts.


Figure 6. Simulation result using the proposed method: (a) ground-truth image, (b) aliased image, (c) output from the trained network, (d) k-space corrected image; (e)–(h) depict the difference images with respect to the image in (a).


All our qualitative observations are supported by quantitative evaluation. After training our model using 1400 images from 30 patients, we used a test set of 400 images from 8 other patients and measured the mean-squared error (MSE) and structural similarity index (SSIM), reported in table 1.

Table 1. Quantitative evaluation results in terms of MSE and SSIM using the test set of 400 images. MSE is computed as $\frac{1}{400 \times 256^2} \sum_{i=1}^{400} \sum_{n=1}^{256} \sum_{m=1}^{256} \left( \boldsymbol{y}^{(i)}_{proposed}(n, m) - \boldsymbol{y}^{(i)}(n, m) \right)^2$, where $\boldsymbol{y}^{(i)}$ is normalized to the range $[0, 1]$. See Wang et al (2004) for the definition of SSIM. Outputs are closer to the labels as MSE approaches 0 or SSIM approaches 1.

        Aliased            U-net              U-net with k-space correction
MSE     0.0043 ± 0.0016    0.0012 ± 0.0006    0.0004 ± 0.0002
SSIM    0.6516 ± 0.0815    0.8782 ± 0.0411    0.9039 ± 0.0431

The results for these metrics support the effectiveness of both the U-net and k-space correction. In particular, the effectiveness of k-space correction is demonstrated.
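The two metrics in table 1 can be computed as sketched below, using the MSE formula from the table caption and scikit-image's SSIM implementation; the array shapes and the use of scikit-image are assumptions, since the paper does not specify its evaluation code.

```python
import numpy as np
from skimage.metrics import structural_similarity

def evaluate(recons, truths):
    """recons, truths: arrays of shape (num_images, 256, 256), scaled to [0, 1]."""
    mse = np.mean((recons - truths) ** 2)        # formula in the table 1 caption
    ssim = np.mean([structural_similarity(r, t, data_range=1.0)
                    for r, t in zip(recons, truths)])
    return mse, ssim
```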

4. Discussion and conclusion

Deep learning techniques exhibit surprisingly good performance in various challenging fields, and our case is no exception. In this study, deep learning generates the reconstruction function f using the U-net and provides better performance than the existing methods.

Our inverse problem of undersampled MRI reconstruction is ill-posed in the sense that there are fewer equations than unknowns. The underdetermined system in section 3 has $256 \times 256$ unknowns and $76 \times 256$ equations. The dimension of the set $\{\boldsymbol{y} \in \Bbb{R}^{256 \times 256}: \mathcal{S} \circ \mathcal{F}(\boldsymbol{y}) = \mathbf{0}\}$ is $(256 - 76) \times 256$, and therefore it is impossible to have an explicit reconstruction formula for solving (6) without imposing a strong constraint on the solution manifold. For uniqueness, the Hausdorff dimension of the solution manifold must be less than the number of equations (i.e. $76 \times 256$). Unfortunately, it is extremely hard to find a mathematical expression for the complex structure of MR images in terms of $76 \times 256$ parameters because of its highly nonlinear characteristics. The deep learning approach is a feasible way to capture the structure of MR images as a form of dimensionality reduction.

It took extensive effort to learn what kind of subsampling strategy is necessary for an effective image reconstruction function. Initially, we used regular subsampling with factor 4, but realized that it could not satisfy the separability condition. Because of the wrap-around artifact (a portion of the image is folded over onto some other portion of the image), it is impossible to specify the locations of small objects. We added low frequencies hoping to satisfy separability, and this turned out to guarantee separability in a practical sense.

Once the dataset satisfies the separability condition, many deep learning tools are available to recover the images from the folded images. We chose the U-net. The optimal choice may depend on the input image size, the number of training data, computer capacity, and so on, and determining the optimal choice is difficult. Therefore, we empirically chose the number of layers, the number of convolution filters, and the filter sizes. The trained U-net successfully unfolded and recovered the images from the folded images. The U-net removes most of the folding artifacts; however, some remain, and the k-space correction is used to further reduce them.

The experiments show that our learned function f appears to have a highly expressive representation capturing anatomical geometry as well as small anomalies. We also tested the flexibility of the proposed method by applying it to CT images that were never used in training; it worked well for these different types of images. Our future research direction is to provide a more rigorous and detailed theoretical analysis to understand why our method performs well. The proposed method can be extended to multi-channel complex data for parallel imaging, with suitable modifications to the sampling pattern and learning network; this is our ongoing research topic. In practice, owing to the large size of the input data available for deep learning, we may face an 'out of memory' problem. Indeed, we experienced such a problem when using input images of size $512 \times 512$ with a four-GPU (NVIDIA GTX-1080, 8 GB) system. This memory limitation was the primary reason to use $256 \times 256$ images, which were obtained by resizing $512 \times 512$ images. It may be possible to develop more efficient and effective learning procedures to address the out-of-memory problem.

Acknowledgments

This research was supported by the National Research Foundation of Korea No. NRF-2017R1A2B20005661. Hyun, Lee and Seo were supported by the Samsung Science & Technology Foundation (No. SSTF-BA1402-01).

Appendix A. Minimum-norm solution of the underdetermined system

The minimum-norm solution of the underdetermined system $\mathcal{S} \circ \mathcal{F}(\boldsymbol{y}) = \boldsymbol{x}$ in remark 2.1 is the solution of the following optimization problem: minimize $\|\boldsymbol{y}\|_{\ell^2}$ subject to the constraint $\mathcal{S} \circ \mathcal{F}(\boldsymbol{y}) = \boldsymbol{x}$. This underdetermined system has infinitely many solutions. For example, the images below are solutions of $\mathcal{S} \circ \mathcal{F}(\boldsymbol{y}) = \boldsymbol{x}$, where $\boldsymbol{x}$ is undersampled data with a reduction factor of 3.37.

The first image is the minimum-norm solution, i.e. $\boldsymbol{y}_\flat = \mathcal{F}^{-1}(\mathcal{P}(\boldsymbol{x}))$, as described in remark 2.1.

This minimum-norm solution is improperly chosen; it does not look like a head MR image. Can we then deal with the complicated constrained problem: solve $\mathcal{S} \circ \mathcal{F}(\boldsymbol{y}) = \boldsymbol{x}$ subject to the constraint that $\boldsymbol{y}$ looks like a head MR image? It seems very difficult to express this constraint in classical logic formalisms.

Appendix B. Performance of the proposed method with different reduction factors

We tested the proposed method with different reduction factors from R = 3.37 to R = 5.81. We performed two experiments by varying the two factors ρ and L, where ρ denotes the uniform subsampling factor along the phase-encoding direction (vertical direction) and L denotes the number of low-frequency phase-encoding lines added in our subsampling strategy.

In figure B1, we fix L = 12 and vary ρ from $\rho=4$ to $\rho=8$. The proposed method provides a good reconstructed image even when ρ is large ($\rho=8$); see the last row of figure B1.


Figure B1. In this experiment, we fix L = 12 and vary ρ: $\rho = 1, 4, 5, 6, 8$.


In figure B2, we fix $\rho=4$ and vary L from L = 0 to L = 12. When L = 0, the separability condition is violated and the proposed method fails (as shown in the first row of figure B2). When L = 1, our network starts to learn unfolding dramatically. The proposed method with L = 12 provides excellent reconstruction capability.


Figure B2. In this experiment, we fix $\rho=4$ and vary L: $L = 0, 1, 6, 8, 12$.


Appendix C. The reconstruction process

This appendix presents the reconstruction process intuitively, using a simplified version of the U-net. See figures C1, C2 and C3 for the detailed reconstruction process.


Figure C1. Reconstruction process (part 1).


Figure C2. Reconstruction process (part 2).


Figure C3. Reconstruction process (part 3).
