# Benchmarking MRI Reconstruction Neural Networks on Large Public Datasets


*Keywords:* image reconstruction; neural networks; deep learning; fastMRI; OASIS; MRI


CEA/NeuroSpin, Bât 145, F-91191 Gif-sur-Yvette, France

Inria Saclay Ile-de-France, Parietal team, Univ. Paris-Saclay, 91120 Palaiseau, France

AIM, CEA, CNRS, Université Paris-Saclay, Université Paris Diderot, Sorbonne Paris Cité, F-91191 Gif-sur-Yvette, France

Author to whom correspondence should be addressed.

Received: 8 February 2020 / Revised: 23 February 2020 / Accepted: 26 February 2020 / Published: 6 March 2020

(This article belongs to the Special Issue Signal Processing and Machine Learning for Biomedical Data)

Deep learning is starting to offer promising results for reconstruction in Magnetic Resonance Imaging (MRI). Many networks are being developed, but comparisons remain difficult because the studies use different frameworks, the networks are not properly re-trained, and the datasets differ from one comparison to the next. The recent release of a public dataset of raw k-space data, fastMRI, encouraged us to write a consistent benchmark of several deep neural networks for MR image reconstruction. This paper presents the results of this benchmark, allowing the networks to be compared, and links to the open-source Keras implementation of all these networks. The main finding of this benchmark is that it is beneficial to perform more iterations between the image and the measurement spaces than to have a deeper per-space network.

A short version of this work has been accepted to the 17th International Symposium on Biomedical Imaging (ISBI 2020), 3–7 April 2020, Iowa City, IA, USA [1]. Magnetic Resonance Imaging (MRI) is an imaging modality used to probe the soft tissues of the human body. As it is non-invasive and non-ionizing (contrary to X-rays, for example), its popularity has grown over the years, tripling, for example, between 1997 and 2006 according to the authors of [2]. This is attributed in part to technical improvements such as higher-field magnets (3 Teslas instead of 1.5), parallel imaging [3], or compressed sensing MRI [4] (CS-MRI). These improvements allow for better image quality and shorter acquisitions.

There is, however, still room for improvement. Indeed, an MRI scan may last up to 90 min according to the NHS website [5], making it impractical for some people, because one needs to lie still for this long period. Typically, babies or people suffering from Parkinson’s disease or claustrophobia cannot stay that long in a scanner without undergoing general anesthesia, which is a heavy process, making the overall exam less accessible. To extend accessibility to more people, we should, therefore, either increase the robustness to motion artifacts, or reduce the acquisition time at the same image quality. On top of that, we should also reduce the reconstruction time at the same image quality, to increase MRI scanner throughput and reduce the total exam time. Indeed, the reconstructed image might show motion artifacts, in which case the whole acquisition needs to be re-done [6]. In other cases, based on the first images seen, the physician may decide to prescribe complementary pulse sequences to clarify the image-based diagnosis.

When working in the framework of CS-MRI, the classical methods generally involve solving a convex non-smooth optimization problem. This problem often involves a data-fitting term and a regularization term reflecting our prior on the data. The need for regularization comes from the fact that the problem is ill-posed since the sampling in the Fourier space, called k-space, is under the Nyquist–Shannon limit. However, these classical reconstruction methods exhibit two shortcomings.

- They are usually iterative, involving the computation of transforms on large data, and therefore take a long time (2 min for a $512\times 512$ slice with 500 µm in-plane resolution [7], on a machine with 8 cores).
- The regularization is usually not perfectly suited to MRI data (it is indeed very difficult to come up with a prior that perfectly reflects MR images).

This is where learning, and in particular deep learning, comes into play. The promise is that it will solve both of the aforementioned problems.

- Because they are implemented efficiently on GPU and do not use an iterative algorithm, the deep learning algorithms run extremely fast.
- If they have enough capacity, they can learn a better prior of the MR images from the training set.

One of the first neural networks to gain attention for its use in MRI reconstruction was AUTOMAP [8]. This network did not exploit any problem-specific property, except the fact that the output was supposed to be an image. More recent works [9,10,11] have drawn inspiration from existing classical methods in order to leverage problem-specific properties, as well as expertise gained in the field. However, these networks have not been compared against each other on a large dataset containing complex-valued raw data.

A recently published dataset, fastMRI [12], enables this comparison, although it has yet to be carried out and requires implementing the different networks in the same framework to allow for a fairer comparison in terms of, for example, runtime.

Our contribution is exactly this, that is:

- Benchmark different neural networks for MRI reconstruction on two datasets: the fastMRI dataset, containing raw complex-valued knee data, and the OASIS dataset [13] containing DICOM real-valued brain data.
- Provide reproducible code and the networks’ weights (https://github.com/zaccharieramzi/fastmri-reproducible-benchmark), using Keras [14] with a TensorFlow backend [15].

While our work focuses on the reconstruction of classical MRI modalities, note that other works have applied deep learning to other modalities like MR fingerprinting [16] or diffusion MRI [17]. The networks studied here could be applied to those modalities, but would not benefit from some invariants of the problem, especially in the fourth (contrast-related) dimension introduced.

In this section, we briefly discuss other works presenting benchmarks of several different reconstruction neural networks.

In [18], the authors benchmark their (adversarial-training-based) algorithms against classical methods and against the Cascade-net (which they call Deep Cascade) [11] and ADMM-net (which they call DeepADMM) [19]. They train and evaluate the networks quantitatively on two datasets, selecting each time 100 images for training and 100 images for testing:

- The IXI database (http://brain-development.org/ixi-dataset/) (brains),
- The Data Science Bowl challenge (https://www.kaggle.com/c/second-annual-data-science-bowl/data) (chests).

While both these datasets provide enough samples for a trustworthy estimate of the networks’ performance, they are composed not of raw complex-valued data, but of real-valued DICOM data. Still, in [18], the authors do evaluate their algorithms on a raw complex-valued dataset (http://mridata.org/list?project=Stanford%20Fullysampled%203D%20FSE%20Knees), but it only features 20 acquisitions, and therefore the comparison is only done qualitatively.

In [10], they benchmark their algorithm against classical methods. They train and evaluate their network on three different datasets:

- The brain real-valued data set provided by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) [20],
- Two proprietary datasets with raw complex-valued brain data.

Again, the only public dataset they use features real-valued data. It is also worth noting that their code cannot be found online.

In this section, we first introduce what we call the classical models for reconstruction in CS-MRI. The models we chose to discuss are in no way an exhaustive list of all the models that can be used without learning for reconstruction in MRI (think of LORAKS [21], for example, just to name this one), but they allow us to justify how the subsequent neural networks are built. These models are only briefly introduced.

In anatomical MRI, the image is encoded as its Fourier transform, and the data acquisition is segmented in time into multiple shots or trajectories. This does not take possible gradient errors or $B_0$-field inhomogeneities into account. Because each Fourier coefficient trajectory takes time to acquire, the time separating two consecutive shots, namely the repetition time (TR), being potentially rather long, the idea of CS-MRI is to acquire fewer of them. We, therefore, have the following idealized inverse problem in the case of single-coil CS-MRI:
$$\mathit{y}={\mathit{F}}_{\mathsf{\Omega}}\mathit{x}$$

where **y** is the acquired Fourier coefficients, also called the k-space data, $\mathsf{\Omega}$ is the sub-sampling pattern or mask, ${\mathit{F}}_{\mathsf{\Omega}}$ is the non-uniform Fourier transform (or masked Fourier transform in the case of Cartesian under-sampling), and **x** is the real anatomical image. Here, we will only deal with Cartesian under-sampling, in which case ${\mathit{F}}_{\mathsf{\Omega}}={M}_{\mathsf{\Omega}}\mathit{F}$, where ${M}_{\mathsf{\Omega}}$ is a mask, and **F** is the classical Fourier transform. This model is also valid for 3D (volumewise) imaging, but in the following, we only consider 2D (slicewise) imaging.

The first (although unsatisfactory) model that can be used to reconstruct an MR image from an under-sampled k-space is to simply apply the inverse Fourier transform with the unknown Fourier coefficients replaced by zeros. This method is called zero-filled reconstruction, and we have:

$${\widehat{\mathit{x}}}_{zf}={\mathit{F}}^{-1}\mathit{y}$$
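As an illustration, the forward model of Equation (1) and the zero-filled reconstruction of Equation (2) can be simulated in a few lines of NumPy; the synthetic image, mask layout, and variable names below are ours, not part of any of the benchmarked implementations:

```python
import numpy as np

rng = np.random.default_rng(0)
# A synthetic complex "image" standing in for x; real data would come from fastMRI.
x = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))

# Cartesian mask on the phase-encoding direction: a fully sampled central
# (low-frequency) band plus one line out of four elsewhere.
mask = np.zeros(64, dtype=bool)
mask[::4] = True
mask[28:36] = True

kspace = np.fft.fftshift(np.fft.fft2(x))   # F x, with low frequencies at the center
y = kspace * mask[:, None]                 # y = M_Omega F x: unsampled lines are zeroed

x_zf = np.fft.ifft2(np.fft.ifftshift(y))   # zero-filled reconstruction, Equation (2)
```

The zero-filled image is exactly consistent with the measured lines, but the missing lines translate into aliasing artifacts in the image domain.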

The second model we want to introduce makes use of the fact that MR images can be represented in a wavelet basis with only a few non-zero coefficients [4] according to the sparsity principle. The reconstruction is, therefore, done by solving the following optimization problem:
$${\widehat{\mathit{x}}}_{wav}=\underset{\mathit{x}\phantom{\rule{0.166667em}{0ex}}\in {\mathbb{C}}^{n\times n}}{\mathrm{arg}\phantom{\rule{0.166667em}{0ex}}\mathrm{min}}\frac{1}{2}\parallel \mathit{y}-{\mathit{F}}_{\mathsf{\Omega}}{\mathit{x}\parallel}_{2}^{2}+\lambda {\parallel \mathsf{\Psi}\mathit{x}\parallel}_{1}$$

where the notations are the same as in Equation (1), $\lambda $ is a hyper-parameter to be tuned, and $\mathsf{\Psi}$ is a chosen wavelet transform. This problem can be solved iteratively using a primal-dual optimization algorithm like Condat-Vù [22] or the Primal-Dual Hybrid Gradient (PDHG) [23] (also known as the Chambolle-Pock algorithm) or, if the wavelet transform is invertible (i.e., non-redundant or decimated), using a proximal algorithm like the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) [24] or the Proximal Optimal Gradient Method (POGM) [25]. Since the problem is convex, all these algorithms converge to the same solution, only at different speeds.
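To make the resolution of this problem concrete, here is a minimal FISTA sketch in NumPy where, for brevity, the wavelet transform $\mathsf{\Psi}$ is replaced by the identity (i.e., sparsity is assumed directly in the image domain); this simplification and all names are ours, not the benchmarked code:

```python
import numpy as np

def soft_thresh(z, t):
    """Complex soft-thresholding, the proximity operator of t * ||.||_1."""
    mag = np.abs(z)
    return z * np.maximum(1 - t / np.maximum(mag, 1e-12), 0)

def fista(y, mask, lam=1e-3, n_iter=50):
    """FISTA sketch for min_x 0.5 ||y - M F x||_2^2 + lam ||x||_1,
    i.e., Problem (3) with the wavelet transform Psi replaced by the identity.
    With the orthonormal FFT, the gradient of the data-fitting term is
    F^{-1}(M F x - y) and its Lipschitz constant is 1 (unit step size)."""
    F = lambda im: np.fft.fft2(im, norm="ortho")
    Finv = lambda ks: np.fft.ifft2(ks, norm="ortho")
    x = Finv(y)          # zero-filled initial guess
    z, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = Finv(mask * F(z) - y)        # gradient step on the smooth term
        x_new = soft_thresh(z - grad, lam)  # proximal step on the l1 term
        t_new = (1 + np.sqrt(1 + 4 * t ** 2)) / 2
        z = x_new + (t - 1) / t_new * (x_new - x)  # Nesterov momentum step
        x, t = x_new, t_new
    return x
```

With a real wavelet transform, the soft-thresholding would be applied to the wavelet coefficients $\mathsf{\Psi}\mathit{x}$ rather than to the image itself.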

The last model we choose to introduce is the dictionary learning model [26,27]. Its assumption is that the patches an MR image is composed of can be expressed sparsely in a corresponding dictionary. This dictionary can be learned per-image, leading to the following optimization problem:
$$\begin{array}{cccc}\hfill {\widehat{\mathit{x}}}_{dl}=& \underset{\mathit{x},D,{\left\{{\alpha}_{ij}\right\}}_{(i,j)\in I}}{\mathrm{arg}\phantom{\rule{0.166667em}{0ex}}\mathrm{min}}\hfill & \hfill \phantom{\rule{1.em}{0ex}}& \frac{1}{2}\parallel \mathit{y}-{\mathit{F}}_{\mathsf{\Omega}}{\mathit{x}\parallel}_{2}^{2}+\lambda \sum _{(i,j)\in I}{\parallel {R}_{ij}\mathit{x}-D{\alpha}_{ij}\parallel}_{2}^{2}\hfill \\ \hfill \phantom{\rule{1.em}{0ex}}& \mathrm{subject}\phantom{\rule{4.pt}{0ex}}\mathrm{to}\hfill & \hfill \phantom{\rule{1.em}{0ex}}& \forall (i,j)\in I,\phantom{\rule{4.pt}{0ex}}\parallel {\alpha}_{ij}{\parallel}_{0}\le {T}_{0}\hfill \end{array}$$

where the notations are the same as in Equation (3), I is the fixed set of patch locations, D is the dictionary, $\lambda $ and ${T}_{0}$ are hyper-parameters to be set, and ${R}_{ij}$ is the linear operator extracting the patch at location $(i,j)$. This problem is solved in two steps:

- The dictionary learning step, where both the dictionary D and the sparse codes ${\alpha}_{ij}$ are updated alternatively.
- The reconstruction step, where x is updated. Since this subproblem is quadratic, it admits an analytical solution, which amounts to averaging patches and then performing a data consistency in which the sampled frequencies are replaced in the patch-average result.
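The reconstruction step can be sketched as follows, assuming (for brevity, and contrary to the original overlapping formulation) non-overlapping patches, so that the patch averaging reduces to tiling; the denoised patches $D{\alpha}_{ij}$ are taken as given, and all names are illustrative:

```python
import numpy as np

def patch_average(patches, shape, patch_size):
    """Place the denoised patches back into an image. With non-overlapping
    patches (an illustrative simplification), the averaging reduces to tiling;
    overlapping patches would be summed and divided by their multiplicity."""
    img = np.zeros(shape, dtype=patches.dtype)
    p, idx = patch_size, 0
    for i in range(0, shape[0], p):
        for j in range(0, shape[1], p):
            img[i:i + p, j:j + p] = patches[idx]
            idx += 1
    return img

def data_consistency(img, y, mask):
    """Replace the sampled frequencies of the patch-averaged image by the
    measured ones, which is the closed-form solution of the quadratic
    subproblem in the noiseless case."""
    kspace = np.fft.fft2(img, norm="ortho")
    kspace = np.where(mask, y, kspace)
    return np.fft.ifft2(kspace, norm="ortho")
```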

The neural networks introduced here are all derived in a certain way from the classical models introduced before.

What we term single-domain networks are networks that act only in the k-space or only in the image (direct) space. They make use of the fact that we have a pseudo-inverse as in Equation (2). They usually use a U-net-like [28] architecture. This network was originally built for image segmentation but has since been used for a wide variety of image-to-image tasks, mainly as a strong baseline. In [29], a U-net is applied to the under-sampled k-space measurements before the inverse Fourier transform is performed. In [30], a U-net is applied to the zero-filled reconstruction, and its output is corrected with a data consistency step (where sampled values are replaced in the k-space). The network we implemented was, however, vanilla, without this extra data-consistency step. Our implementation also only features the following cascade of numbers of filters: $16,32,64,128$. The original U-net is illustrated in Figure 1, where the number of filters used in each layer is four times what we used.

The second class of networks we introduce, we term cross-domain networks. The key intuitive idea is that they correct the data in both the k-space and the image space alternatively, using the Fourier transform to go from one space to the other. They are derived from the optimization algorithms used to solve the optimization problems introduced before, using the idea of “unrolling” introduced in [31]. An illustration of this class of networks is presented in Figure 2.
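The alternating structure of this class of networks can be sketched as follows, with the trainable sub-networks stubbed out by identity placeholders (an assumption for illustration; in an actual cross-domain network they are CNNs):

```python
import numpy as np

def cross_domain_sketch(y, mask, n_iter=10):
    """Skeleton of an unrolled cross-domain network: alternate corrections in
    the k-space and in the image space, linked by (inverse) Fourier
    transforms, with a data consistency step on the sampled lines."""
    kspace_net = lambda k: k   # placeholder for the learned k-space correction
    image_net = lambda im: im  # placeholder for the learned image correction
    kspace = y.copy()
    for _ in range(n_iter):
        kspace = kspace_net(kspace)
        kspace = np.where(mask, y, kspace)          # data consistency
        image = image_net(np.fft.ifft2(kspace, norm="ortho"))
        kspace = np.fft.fft2(image, norm="ortho")
    return np.abs(image)                            # magnitude output
```

With identity sub-networks this degenerates to the zero-filled reconstruction; the networks' expressiveness comes entirely from replacing the placeholders with learned corrections.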

Because these networks work directly on the input data (and not on a preliminary reconstruction of it), they need to handle complex-valued data. In particular, the classical deep learning frameworks (TensorFlow and PyTorch) do not feature complex convolutions off-the-shelf. The way convolution is performed in the original papers is, therefore, to concatenate the real and imaginary parts of the image (respectively, the k-space), making it a two-channel image, perform the series of convolutions, and have the two-channel output transformed back into a complex image (respectively, k-space).
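This real/imaginary channel handling can be sketched as a pair of helper functions (the names are ours):

```python
import numpy as np

def to_channels(z):
    """Stack real and imaginary parts as two channels: (H, W) -> (H, W, 2),
    so that real-valued convolutions can be applied."""
    return np.stack([z.real, z.imag], axis=-1)

def from_channels(c):
    """Recombine the two channels into a complex array: (H, W, 2) -> (H, W)."""
    return c[..., 0] + 1j * c[..., 1]
```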

The **Cascade-net** [11] is based on the dictionary learning optimization Problem (4). The idea is to replace the dictionary learning step by convolutional neural networks and still keep the data consistency step in the k-space. The optimization algorithm is then unrolled to allow us to perform back-propagation. The authors of [11] show that we can perform back-propagation through the data consistency step (which is linear) and derive the corresponding Jacobian. The parameters used here for the implementation are the same as those in the original paper, except the number of filters, which was decreased from 64 to 48 to fit on a single GPU. This network is illustrated in Figure 3.

The **KIKI-net** [10] is an extension of the Cascade-net where they additionally perform convolutions after the data consistency step in the k-space. The parameters used here for the implementation are the same as those in the original paper. This network is illustrated in Figure 4.

The **Primal-Dual-net (PD-net)**, introduced in [9] and applied to MRI in [32], is based on the wavelet-based denoising Problem (3) and, in particular, on the resolution of the corresponding optimization problem with the PDHG [23] algorithm. Here, the algorithm is unrolled, and the proximity operators (present in the general case of PDHG) are replaced by convolutional neural networks. For our implementation, for a fairer comparison with the Cascade-net and the U-net, we used a ReLU non-linearity instead of a PReLU [33]. This network is illustrated in Figure 5.

The training was done with the same parameters for all the networks. The optimizer used was Adam [34], with a learning rate of ${10}^{-3}$ and the default parameters of Keras (${\beta}_{1}=0.9$ and ${\beta}_{2}=0.999$, the exponential decay rates for the moment estimates). The gradient norm was clipped to one to avoid exploding gradient problems [35]. The batch size was one (i.e., one slice) for every network except the U-net, where the whole volume was used for each step. For all networks, to maximize the efficiency of the training, the slices were selected among the eight innermost slices of the volumes, because the outer slices do not have much signal. No early stopping or learning rate schedule was used (except for KIKI-net, to allow for a stable training, where we used the learning rate schedule proposed by the authors in the supporting information of [10]). The number of epochs was 300 for all networks trained end-to-end. For the iterative training of the KIKI-net, the total number of epochs was 200 (50 per sub-training). Batch normalization was not used; however, in order to have the networks learn more efficiently, the input data was scaled. Both the k-space and the image were multiplied by ${10}^{6}$ for fastMRI and by ${10}^{2}$ for OASIS, because the k-space measurements had values of mean ${10}^{-7}$ (looking separately at the real and imaginary parts) for fastMRI and of mean ${10}^{-3}$ for OASIS. Without this scaling operation, the training proved impossible with bias in the convolutions, and very inefficient without bias in the convolutions.

The under-sampling was done retrospectively using a Cartesian mask described in the data set paper [12] and an acceleration factor of four (i.e., only 25% of the k-space was kept). It contains a fully-sampled region in the lower frequencies, and randomly selects phase encoding lines in the higher frequencies.
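A mask of this kind can be sketched as follows; the center fraction value and the function signature are illustrative assumptions, not the exact fastMRI implementation:

```python
import numpy as np

def cartesian_mask(n_lines, accel=4, center_fraction=0.08, seed=0):
    """Sketch of a fastMRI-style 1D Cartesian mask: a fully sampled
    low-frequency band plus randomly selected phase-encoding lines, for a
    total of roughly n_lines / accel sampled lines."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(n_lines, dtype=bool)
    # Fully sampled central (low-frequency) band.
    n_center = int(round(center_fraction * n_lines))
    center_start = (n_lines - n_center) // 2
    mask[center_start:center_start + n_center] = True
    # Random high-frequency lines to reach the target acceleration factor.
    n_remaining = n_lines // accel - n_center
    candidates = np.flatnonzero(~mask)
    mask[rng.choice(candidates, size=n_remaining, replace=False)] = True
    return mask
```

For an acceleration factor of four on 320 phase-encoding lines, the mask keeps 80 lines in total.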

It is to be noted that different under-sampling strategies exist in CS-MRI. Some of them are listed in [36], for example, spiral or radial. These strategies allow for a higher image quality at the same acceleration factor, or the same image quality at a higher acceleration factor. Typically, the spiral under-sampling scheme was designed to allow fast coronary imaging [37,38]. These under-sampling strategies must take into account kinematic constraints (both physical and safety-based) but should also have variable density [36]. Recent works even try to optimize the under-sampling strategy under these kinematic constraints [39]. Others have tried to learn the under-sampling strategy in a supervised way. In [40], the under-sampling strategy is learned with a greedy optimization. In [41], a gradient descent optimization is used. Some approaches [42,43,44] even try to jointly learn the optimal under-sampling strategy along with the reconstruction.

The data used for this benchmark is the emulated single-coil k-space data of the fastMRI dataset [12], along with the corresponding ground truth images. The acquisition was done with a 15-channel phased array coil, using a Cartesian 2D Turbo Spin Echo (TSE) sequence. The pulse sequences were proton-density weighted, half with fat suppression, half without, some at 3.0 Teslas (T), others at 1.5 T. The sequence parameters were as follows: echo train length 4, matrix size $320\times 320$, in-plane resolution 0.5 × 0.5 mm, slice thickness 3 mm, no gap between slices. In total, there are 973 volumes (34,742 slices) in the training subset and 199 volumes (7135 slices) in the validation subset.

Since the k-spaces are of different sizes, resulting in images of different sizes, the outputs of the cross-domain networks were cropped to a central $320\times 320$ region. For the U-net, the input of the network was cropped.
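The central cropping can be sketched as (function name and default are ours):

```python
import numpy as np

def center_crop(img, crop=(320, 320)):
    """Extract the central crop[0] x crop[1] region of a 2D array."""
    h, w = img.shape
    top = (h - crop[0]) // 2
    left = (w - crop[1]) // 2
    return img[top:top + crop[0], left:left + crop[1]]
```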

The Open Access Series of Imaging Studies (OASIS) brain database [13] includes MRI scans of 1068 participants, yielding 2168 MR sessions. Of these 2168, we select only the 2164 sessions which feature T1-weighted sequences. Of these, 1878 were acquired at 3.0 T, 236 at 1.5 T, and the field strength of the remaining 50 is undisclosed. The slice size is mostly $256\times 256$, sometimes $240\times 256$ (rarely some other size). The number of slices per scan is mostly 176, sometimes 160 (rarely smaller).

The data was then separated into a training and a validation set. The split was participant-based, that is, a participant cannot have a scan in both sets. The split was 90% for the training set and 10% for the validation set. We further reduced the data to make it comparable to fastMRI: 1000 scans randomly selected for the training subset and 200 scans randomly selected for the validation subset.
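A participant-based split can be sketched as follows (the data structure mapping scan ids to participant ids, and the function name, are assumptions for illustration):

```python
import random

def participant_split(scans, val_fraction=0.1, seed=0):
    """Split scans into train/validation sets so that all scans of a given
    participant land in the same set. `scans` maps scan id -> participant id."""
    participants = sorted(set(scans.values()))
    random.Random(seed).shuffle(participants)
    n_val = max(1, int(len(participants) * val_fraction))
    val_participants = set(participants[:n_val])
    train = [s for s, p in scans.items() if p not in val_participants]
    val = [s for s, p in scans.items() if p in val_participants]
    return train, val
```

Splitting by participant rather than by scan avoids leaking a participant's anatomy from the training set into the validation set.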

Contrary to fastMRI, the OASIS data is available only in magnitude and is, therefore, only real-valued. The k-space is computed as the Fourier transform of the magnitude image.

The metrics we used to benchmark the different networks are the following:

- The Peak Signal-to-Noise Ratio (PSNR);
- The Structural SIMilarity index (SSIM) [45];
- The number of trainable parameters in the network;
- The runtime in seconds of the neural network on a single volume.

The PSNR is computed as follows, on whole magnitude volumes:

$$PSNR(x,\widehat{x})=10{log}_{10}\left(\frac{max{\left(x\right)}^{2}}{\frac{1}{n}{\sum}_{i,j,k}{({x}_{i,j,k}-{\widehat{x}}_{i,j,k})}^{2}}\right)$$

where x is the ground truth volume, $\widehat{x}$ is the predicted volume (magnitude image), and n is the total number of points in the ground truth volume (the same as in the predicted volume). Since this metric compares very local differences, it does not necessarily reflect the global visual comparison of the images. The SSIM was introduced in [45] exactly to take more structural differences or similarities between images into account. It is computed as in the original paper, per slice, then averaged over the volume (the range, however, is computed volume-wise):

$$SSIM(x,\widehat{x})=\frac{(2{\mu}_{x}{\mu}_{\widehat{x}}+{c}_{1})(2{\sigma}_{x}{\sigma}_{\widehat{x}}+{c}_{2})(co{v}_{x\widehat{x}}+{c}_{3})}{({\mu}_{x}^{2}+{\mu}_{\widehat{x}}^{2}+{c}_{1})({\sigma}_{x}^{2}+{\sigma}_{\widehat{x}}^{2}+{c}_{2})({\sigma}_{x}{\sigma}_{\widehat{x}}+{c}_{3})}$$

where x is the ground truth slice, $\widehat{x}$ is the predicted slice, ${\mu}_{i}$ is the mean of i, ${\sigma}_{i}^{2}$ is the variance of i, $co{v}_{ij}$ is the covariance between i and j, ${c}_{1}={\left({k}_{1}L\right)}^{2}$, ${c}_{2}={\left({k}_{2}L\right)}^{2}$, ${c}_{3}=\frac{{c}_{2}}{2}$, L is the range of the values of the data (given, because computed over the whole volume), ${k}_{1}=0.01$, and ${k}_{2}=0.03$.
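The PSNR formula translates directly into NumPy (a sketch; the actual evaluation code may differ in details such as the peak value convention):

```python
import numpy as np

def psnr(x, x_hat):
    """PSNR on whole magnitude volumes: peak value taken from the ground
    truth volume, mean squared error over all voxels."""
    mse = np.mean(np.abs(x - x_hat) ** 2)
    return 10 * np.log10(x.max() ** 2 / mse)
```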

While the two aforementioned metrics control the reconstruction quality, it is important to note that this is not the only factor to take into account when designing reconstruction techniques. Because the reconstruction has to happen fast enough for the MR physician to decide whether to re-conduct the exam or not, it is important for the proposed technique to have a reasonable reconstruction speed. For real-time MRI applications or dynamic MRI (e.g., cardiac imaging), it is even more important (for example, in the context of monitoring surgical operations [46]). The runtimes were measured on a computer equipped with a single GPU Quadro P5000 with 16 GB of RAM.

Concurrently, the number of parameters has to stay relatively low to allow the implementation on the different machines with potentially limited memory, which will probably need to have multiple models (for different contrasts, different organs, or different undersampling schemes including different acceleration factors).

The quantitative results in Table 1, Table 2, Table 3 and Table 4 show that the PD-net [9] outperforms its competitors in terms of image quality metrics while also having the smallest number of trainable parameters. It is slightly slower than the Cascade-net [11] though, which can be explained by its higher number of iterations, therefore involving more costly Fourier transform (inverse or direct) operations. These results hold true on the two data sets, fastMRI [12] and OASIS [13]. The only exception is that KIKI-net [10] is slightly better than the U-net [28] on the OASIS data set, but still far from the best performers. We can also note that the standard deviation of the image quality metrics is much higher on the fastMRI data set than on the OASIS data set. This higher standard deviation is explained by the fact that the two contrasts present in the fastMRI dataset, Proton Density with and without Fat Suppression (PD/PDFS), have widely different image metric values. The standard deviations when we compute the metrics for each contrast separately are more in line with the OASIS ones. The range of the image quality metrics is also much higher in the OASIS results.

The qualitative results shown in Figure 6 and Figure 7 confirm the quantitative ones on the image quality aspect. The PD-net [9] is much better at conserving the high-frequency parts of the original image, as can be seen when looking at the reconstruction error, which is quite flat over the whole image.

In this work, we only considered one scheme of under-sampling. However, it should be interesting to see if the performance obtained on one type of under-sampling generalizes to other types of under-sampling, especially if we do a re-gridding step for non-Cartesian under-sampling schemes. On that specific point, the extension of the networks towards non-Cartesian sampling schemes is not easy because the data consistency cannot be performed in the same way, and the measurement space is no longer similar to an image (except if we re-grid). In a recent work [47], some of the authors of the Cascade-net [11] propose a way to extend their approach to the non-Cartesian case, using a re-gridding step. The PD-net [9] also has a straightforward implementation for the non-Cartesian case even without re-gridding, in what is called the learned Primal. In this case, the network in the k-space is just computing the difference (residual) between the current k-space measurements and the initial k-space measurements. Therefore, there are no parameters to learn, which alleviates the problem of how to learn them.

We also only considered a single-coil acquisition setting. As parallel imaging is primarily used in CS-MRI to allow higher image quality [3], it is important to see how these networks will behave in the multi-coil setting. The difficult part in the extension of these works to the multi-coil setting will be to understand how to best involve the sensitivity maps (or even not involve them [48]).

Regarding the networks themselves, the results seem to suggest that for cross-domain networks, the trade-off between a high number of iterations and a richer correction in a given domain (by having deeper networks) is in favor of having more iterations (i.e., alternating more between domains). It is, however, unclear how to best tackle the reconstruction in the k-space, since convolutional networks make a shift-invariance hypothesis, which does not hold in the Fourier space, where the coefficients corresponding to the high frequencies should probably not be treated in the same way as the low frequencies. This leaves room for improvement in the near future.

Conceptualization, Z.R. and P.C.; methodology, Z.R.; software, Z.R.; validation, Z.R., P.C. and J.-L.S.; formal analysis, Z.R.; investigation, Z.R.; resources, P.C.; data curation, Z.R.; writing—original draft preparation, Z.R.; writing—review and editing, P.C. and J.-L.S.; visualization, Z.R.; supervision, P.C. and J.-L.S.; project administration, P.C. and J.-L.S.; funding acquisition, P.C. and J.-L.S. All authors have read and agree to the published version of the manuscript.

This research was funded by the Cross-Disciplinary Program on Numerical Simulation (SILICOSMIC project) of CEA, the French Alternative Energies and Atomic Energy Commission.

We want to thank Jonas Adler, Justin Haldar, and Jo Schlemper for the very useful and kind remarks and answers they gave to our questions about their work.

The authors declare no conflict of interest.

The following abbreviations are used in this manuscript:

| Abbreviation | Meaning |
| --- | --- |
| MRI | Magnetic Resonance Imaging |
| CS-MRI | Compressed Sensing MRI |
| GPU | Graphical Processing Unit |
| ReLU | Rectified Linear Unit |
| PReLU | Parametrized ReLU |
| PSNR | Peak Signal-to-Noise Ratio |
| SSIM | Structural SIMilarity index |

- Ramzi, Z.; Ciuciu, P.; Starck, J.-L. Benchmarking Deep Nets MRI Reconstruction Models on the FastMRI Publicly Available Dataset. In Proceedings of the ISBI 2020—International Symposium on Biomedical Imaging, Iowa City, IA, USA, 3–7 April 2020.
- Smith-Bindman, R.; Miglioretti, D.L.; Larson, E.B. Rising use of diagnostic medical imaging in a large integrated health system. Health Aff. 2008, 27, 1491–1502.
- Roemer, P.B.; Edelstein, W.A.; Hayes, C.E.; Souza, S.P.; Mueller, O.M. The NMR phased array. Magn. Reson. Med. 1990, 16, 192–225.
- Lustig, M.; Donoho, D.; Pauly, J.M. Sparse MRI: The Application of Compressed Sensing for Rapid MR Imaging. Magn. Reson. Med. 2007.
- NHS. NHS website. Available online: https://www.nhs.uk/conditions/mri-scan/what-happens/ (accessed on 4 March 2020).
- AIM Specialty Health. Clinical Appropriateness Guidelines: Advanced Imaging. 2017. Available online: https://www.aimspecialtyhealth.com/PDF/Guidelines/2017/Sept05/AIM_Guidelines.pdf (accessed on 4 March 2020).
- Ramzi, Z.; Ciuciu, P.; Starck, J.-L. Benchmarking proximal methods acceleration enhancements for CS-acquired MR image analysis reconstruction. In Proceedings of the SPARS 2019—Signal Processing with Adaptive Sparse Structured Representations Workshop, Toulouse, France, 1–4 July 2019.
- Zhu, B.; Liu, J.Z.; Cauley, S.F.; Rosen, B.R.; Rosen, M.S. Image reconstruction by domain-transform manifold learning. Nature 2018, 555, 487–492.
- Adler, J.; Öktem, O. Learned Primal-Dual Reconstruction. IEEE Trans. Med. Imaging 2018, 37, 1322–1332.
- Eo, T.; Jun, Y.; Kim, T.; Jang, J.; Lee, H.J.; Hwang, D. KIKI-net: Cross-domain convolutional neural networks for reconstructing undersampled magnetic resonance images. Magn. Reson. Med. 2018, 80, 2188–2201.
- Schlemper, J.; Caballero, J.; Hajnal, J.V.; Price, A.; Rueckert, D. A Deep Cascade of Convolutional Neural Networks for MR Image Reconstruction. IEEE Trans. Med. Imaging 2018, 37, 491–503.
- Zbontar, J.; Knoll, F.; Sriram, A.; Muckley, M.J.; Bruno, M.; Defazio, A.; Parente, M.; Geras, K.J.; Katsnelson, J.; Chandarana, H.; et al. fastMRI: An Open Dataset and Benchmarks for Accelerated MRI. arXiv 2018, arXiv:1811.08839.
- LaMontagne, P.J.; Keefe, S.; Lauren, W.; Xiong, C.; Grant, E.A.; Moulder, K.L.; Morris, J.C.; Benzinger, T.L.; Marcus, D.S. OASIS-3: Longitudinal neuroimaging, clinical, and cognitive dataset for normal aging and Alzheimer’s disease. Alzheimer’s Dementia J. Alzheimer’s Assoc. 2018, 14, P1097.
- Chollet, F. Keras. 2015. Available online: https://keras.io (accessed on 4 March 2020).
- Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 2015. Available online: tensorflow.org (accessed on 4 March 2020).
- Virtue, P.; Stella, X.Y.; Lustig, M. Better than real: Complex-valued neural nets for MRI fingerprinting. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 3953–3957.
- Aggarwal, H.K.; Mani, M.P.; Jacob, M. Multi-Shot Sensitivity-Encoded Diffusion MRI Using Model-Based Deep Learning (MoDL-MUSSELS). In Proceedings of the 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), Venice, Italy, 8–11 April 2019; pp. 1541–1544.
- Minh Quan, T.; Nguyen-Duc, T.; Jeong, W.K. Compressed Sensing MRI Reconstruction using a Generative Adversarial Network with a Cyclic Loss. IEEE Trans. Med. Imaging 2018, 37, 1488–1497.
- Yang, Y.; Sun, J.; Li, H.; Xu, Z. Deep ADMM-net for compressive sensing MRI. In Proceedings of the Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 10–18.
- Petersen, R.C.; Aisen, P.; Beckett, L.A.; Donohue, M.; Gamst, A.; Harvey, D.J.; Jack, C.; Jagust, W.; Shaw, L.; Toga, A.; et al. Alzheimer’s disease neuroimaging initiative (ADNI): Clinical characterization. Neurology 2010, 74, 201–209.
- Haldar, J.P. Low-Rank Modeling of Local k-Space Neighborhoods (LORAKS) for Constrained MRI. IEEE Trans. Med. Imaging 2014, 33, 668–681.
- Condat, L. A Primal–Dual Splitting Method for Convex Optimization Involving Lipschitzian, Proximable and Linear Composite Terms. J. Optim. Theory Appl.
**2013**, 158, 460–479. [Google Scholar] [CrossRef] - Chambolle, A.; Pock, T. A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging. J. Math. Imaging Vis.
**2011**, 40, 120–145. [Google Scholar] [CrossRef] - Beck, A.; Teboulle, M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci.
**2009**, 2, 183–202. [Google Scholar] [CrossRef] - Kim, D.; Fessler, J.A. Adaptive restart of the optimized gradient method for convex optimization. J. Optim. Theory Appl.
**2018**, 178, 240–263. [Google Scholar] [CrossRef] - Ravishankar, S.; Bresler, Y. Magnetic Resonance Image Reconstruction from Highly Undersampled K-Space Data Using Dictionary Learning. IEEE Trans. Med. Imaging
**2011**, 30, 1028–1041. [Google Scholar] [CrossRef] - Caballero, J.; Price, A.N.; Rueckert, D.; Hajnal, J.V. Dictionary learning and time sparsity for dynamic MR data reconstruction. IEEE Trans. Med. Imaging
**2014**, 33, 979–994. [Google Scholar] [CrossRef] - Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv
**2015**, arXiv:1505.04597v1. [Google Scholar] - Han, Y.; Sunwoo, L.; Chul Ye, J. k-Space Deep Learning for Accelerated MRI. arXiv
**2019**, arXiv:1805.03779v2. [Google Scholar] [CrossRef] - Min Hyun, C.; Pyung Kim, H.; Min Lee, S.; Lee, S.; Keun Seo, J. Deep learning for undersampled MRI reconstruction. arXiv
**2019**, arXiv:1709.02576v3. [Google Scholar] - Gregor, K.; LeCun, Y. Learning Fast Approximations of Sparse Coding. In Proceedings of the 27th International Conference on Machine Learning, Haifa, Israel, 13–15 June 2010. [Google Scholar]
- Cheng, J.; Wang, H.; Ying, L.; Liang, D. Model Learning: Primal Dual Networks for Fast MR imaging. arXiv
**2019**, arXiv:1908.02426. [Google Scholar] - He, K.; Zhang, X.; Ren, S.; Sun, J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. arXiv
**2015**, arXiv:1502.01852. [Google Scholar] - Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv
**2014**, arXiv:1412.6980. [Google Scholar] - Pascanu, R.; Mikolov, T.; Bengio, Y. On the difficulty of training Recurrent Neural Networks. arXiv
**2012**, arXiv:1211.5063. [Google Scholar] - Chauffert, N.; Ciuciu, P.; Kahn, J.; Weiss, P. Variable density sampling with continuous trajectories. SIAM J. Imaging Sci.
**2014**, 7, 1962–1992. [Google Scholar] [CrossRef] - Irarrazabal, P.; Nishimura, D.G. Fast Three Dimensional Magnetic Resonance Imaging. Magn. Reson. Med.
**1995**, 33, 656–662. [Google Scholar] [CrossRef] - Meyer, C.H.; Hu, B.S.; Nishimura, D.G.; Macovski, A. Fast Spiral Coronary Artery Imaging. Magn. Reson. Med.
**1992**, 28, 202–213. [Google Scholar] [CrossRef] - Lazarus, C.; Weiss, P.; Chauffert, N.; Mauconduit, F.; El Gueddari, L.; Destrieux, C.; Zemmoura, I.; Vignaud, A.; Ciuciu, P. SPARKLING: Variable-density k-space filling curves for accelerated T2*-weighted MRI. Magn. Reson. Med.
**2018**, 1–19. [Google Scholar] [CrossRef] - Sanchez, T.; Gözcü, B.; van Heeswijk, R.B.; Eftekhari, A.; Ilıcak, E.; Çukur, T.; Cevher, V. Scalable learning-based sampling optimization for compressive dynamic MRI. arXiv
**2019**, arXiv:1902.00386. [Google Scholar] - Sherry, F.; Benning, M.; Reyes, J.C.D.l.; Graves, M.J.; Maierhofer, G.; Williams, G.; Schönlieb, C.B.; Ehrhardt, M.J. Learning the sampling pattern for MRI. arXiv
**2019**, arXiv:1906.08754. [Google Scholar] - Aggarwal, H.K.; Jacob, M. J-MoDL: Joint Model-Based Deep Learning for Optimized Sampling and Reconstruction. arXiv
**2019**, arXiv:1911.02945. [Google Scholar] - Wu, Y.; Rosca, M.; Lillicrap, T. Deep compressed sensing. arXiv
**2019**, arXiv:1905.06723. [Google Scholar] - Weiss, T.; Senouf, O.; Vedula, S.; Michailovich, O.; Zibulevsky, M.; Bronstein, A. PILOT: Physics-Informed Learned Optimal Trajectories for Accelerated MRI. arXiv
**2019**, arXiv:1909.05773. [Google Scholar] - Wang, Z.; Bovik, A.C.; Rahim Sheikh, H.; Simoncelli, E.P. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans. Image Process.
**2004**, 13, 600–612. [Google Scholar] [CrossRef] - Horvath, K.A.; Li, M.; Mazilu, D.; Guttman, M.A.; McVeigh, E.R. Real-time magnetic resonance imaging guidance for cardiovascular procedures. In Seminars in Thoracic and Cardiovascular Surgery; Elsevier: New York, NY, USA, 2007; Volume 19, pp. 330–335. [Google Scholar]
- Schlemper, J.; Sadegh, S.; Salehi, M.; Kundu, P.; Lazarus, C.; Dyvorne, H.; Rueckert, D.; Sofka, M. Nonuniform Variational Network: Deep Learning for Accelerated Nonuniform MR Image Reconstruction. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Shenzhen, China, 13–17 October 2019. [Google Scholar]
- El Gueddari, L.; Ciuciu, P.; Chouzenoux, E.; Vignaud, A.; Pesquet, J.C. Calibrationless OSCAR-based image reconstruction in compressed sensing parallel MRI. In Proceedings of the 16th IEEE International Symposium on Biomedical Imaging, Venice, Italy, 8–11 April 2019; pp. 1532–1536. [Google Scholar]
- Dragotti, P.L.; Dong, H.; Yang, G.; Guo, Y.; Firmin, D.; Slabaugh, G.; Yu, S.; Keegan, J.; Ye, X.; Liu, F.; et al. DAGAN: Deep De-Aliasing Generative Adversarial Networks for Fast Compressed Sensing MRI Reconstruction. IEEE Trans. Med. Imaging
**2017**, 37, 1310–1321. [Google Scholar] [CrossRef]

Network | PSNR-mean (std) (dB) | SSIM-mean (std) | # params | Runtime (s) |
---|---|---|---|---|
Zero-filled | 29.61 (5.28) | 0.657 (0.23) | 0 | 0.68 |
KIKI-net | 31.38 (3.02) | 0.712 (0.13) | 1.25M | 8.22 |
U-net | 31.78 (6.53) | 0.720 (0.25) | 482k | 0.61 |
Cascade-net | 31.97 (6.95) | 0.719 (0.27) | 425k | 3.58 |
PD-net | 32.15 (6.90) | 0.729 (0.26) | 318k | 5.55 |

Network | PSNR-mean (std) (dB) | SSIM-mean (std) | # params | Runtime (s) |
---|---|---|---|---|
Zero-filled | 28.44 (2.62) | 0.578 (0.095) | 0 | 0.41 |
KIKI-net | 29.57 (2.64) | 0.6271 (0.10) | 1.25M | 8.88 |
Cascade-net | 29.88 (2.82) | 0.6251 (0.11) | 425k | 3.57 |
U-net | 29.89 (2.74) | 0.6334 (0.10) | 482k | 1.34 |
PD-net | 30.06 (2.82) | 0.6394 (0.10) | 318k | 5.38 |

Network | PSNR-mean (std) (dB) | SSIM-mean (std) | # params | Runtime (s) |
---|---|---|---|---|
Zero-filled | 30.63 (2.1) | 0.727 (0.087) | 0 | 0.52 |
KIKI-net | 32.86 (2.4) | 0.797 (0.082) | 1.25M | 11.83 |
U-net | 33.64 (2.6) | 0.807 (0.084) | 482k | 1.07 |
Cascade-net | 33.98 (2.7) | 0.811 (0.086) | 425k | 4.22 |
PD-net | 34.2 (2.7) | 0.818 (0.084) | 318k | 6.08 |

Network | PSNR-mean (std) (dB) | SSIM-mean (std) | # params | Runtime (s) |
---|---|---|---|---|
Zero-filled | 26.11 (1.45) | 0.672 (0.0307) | 0 | 0.165 |
U-net | 29.8 (1.39) | 0.847 (0.0398) | 482k | 1.202 |
KIKI-net | 30.08 (1.43) | 0.853 (0.0336) | 1.25M | 3.567 |
Cascade-net | 32.0 (1.731) | 0.887 (0.0327) | 425k | 2.234 |
PD-net | 33.22 (1.912) | 0.910 (0.0358) | 318k | 2.758 |
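The PSNR figures reported in the tables above follow the standard definition, 10 log10(data_range² / MSE). As a minimal sketch (not the paper's evaluation code; the helper name `psnr` and the toy arrays are illustrative), the metric can be computed with NumPy as follows:

```python
import numpy as np

def psnr(gt, pred, data_range=None):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    gt = np.asarray(gt, dtype=np.float64)
    pred = np.asarray(pred, dtype=np.float64)
    if data_range is None:
        # Default to the dynamic range of the reference image
        data_range = gt.max() - gt.min()
    mse = np.mean((gt - pred) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy check: a uniform error of 0.1 on an image with dynamic range 1
# gives MSE = 0.01, hence PSNR = 10 * log10(1 / 0.01) = 20 dB.
gt = np.array([[0.0, 1.0], [0.0, 1.0]])
pred = gt + 0.1
print(round(psnr(gt, pred), 2))  # 20.0
```

The per-network means and standard deviations in the tables are then obtained by averaging this quantity over all reconstructed slices of the test set.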

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).