Benchmarking MRI Reconstruction Neural Networks on Large Public Datasets

Ramzi, Zaccharie; Ciuciu, Philippe; Starck, Jean-Luc

doi:10.3390/app10051816

Open AccessBenchmark

Benchmarking MRI Reconstruction Neural Networks on Large Public Datasets

by

Zaccharie Ramzi

^1,2,3,*,

Philippe Ciuciu

^1,2

and

Jean-Luc Starck

³

¹

CEA/NeuroSpin, Bât 145, F-91191 Gif-sur Yvette, France

²

Inria Saclay Ile-de-France, Parietal team, Univ. Paris-Saclay, 91120 Palaiseau, France

³

AIM, CEA, CNRS, Université Paris-Saclay, Université Paris Diderot, Sorbonne Paris Cité, F-91191 Gif-sur-Yvette, France

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2020, 10(5), 1816; https://doi.org/10.3390/app10051816

Submission received: 8 February 2020 / Revised: 23 February 2020 / Accepted: 26 February 2020 / Published: 6 March 2020

(This article belongs to the Special Issue Signal Processing and Machine Learning for Biomedical Data)

Download

Browse Figures

Review Reports Versions Notes

Abstract

:

Deep learning is starting to offer promising results for reconstruction in Magnetic Resonance Imaging (MRI). A lot of networks are being developed, but the comparisons remain hard because the frameworks used are not the same among studies, the networks are not properly re-trained, and the datasets used are not the same among comparisons. The recent release of a public dataset, fastMRI, consisting of raw k-space data, encouraged us to write a consistent benchmark of several deep neural networks for MR image reconstruction. This paper shows the results obtained for this benchmark, allowing to compare the networks, and links the open source implementation of all these networks in Keras. The main finding of this benchmark is that it is beneficial to perform more iterations between the image and the measurement spaces compared to having a deeper per-space network.

Keywords:

image reconstruction; neural networks; deep learning; fastMRI; OASIS; MRI

1. Introduction

A short version of this work has been accepted to the 17th International Symposium on Biomedical Imaging (ISBI 2020), 3–7 April 2020, Iowa City, IO, USA [1]. Magnetic Resonance Imaging (MRI) is an imaging modality used to probe the soft tissues of the human body. As it is non-invasive and non-ionizing (contrary to X-Rays, for example), its popularity has grown over the years, for example, tripling between 1997 and 2006, according to the authors of [2]. This is attributed in part to the technical improvements of this technique. We can, for example, mention higher field magnets (3 Teslas instead of 1.5), parallel imaging [3], or compressed sensing MRI [4] (CS-MRI). These improvements allow for better image quality and lower acquisition duration.

There is, however, still room for improvement. Indeed, an MRI scan may last up to 90 min according to the NHS website [5], making it unpractical for some people because you need to lay still for this long period. Typically, babies or people suffering from Parkinson’s disease or claustrophobia could not stay that long in a scanner without undergoing general anesthesia, which is a heavy process, making the overall exam less accessible. To extend the accessibility to more people, we should, therefore, either increase the robustness to motion artifacts, or reduce the acquisition time with the same image quality. On top of that, we should also reduce the reconstruction time with the same image quality to increase the MRI scanners throughput and the total exam time. Indeed, the reconstructed image might show some motion artifacts, and the whole acquisition would need to be re-done [6]. Some other times, based on the first images seen by the physician, they may decide to prescribe complementary pulse sequences if necessary to clarify the image-based diagnosis.

When working in the framework of CS-MRI, the classical methods generally involve solving a convex non-smooth optimization problem. This problem often involves a data-fitting term and a regularization term reflecting our prior on the data. The need for regularization comes from the fact that the problem is ill-posed since the sampling in the Fourier space, called k-space, is under the Nyquist–Shannon limit. However, these classical reconstruction methods exhibit two shortcomings.

They are usually iterative involving the computation of transforms on large data, and therefore, take a lot of time (2 min for a $512 \times 512$ − 500 µm in plane resolution slice [7], on a machine with 8 cores).
The regularization is usually not perfectly suited to MRI data (it is indeed very difficult to come up with a prior that perfectly reflects MR images).

This is where learning comes in to play, and in particular, deep learning. The promise is that it will solve both the aforementioned problems.

Because they are implemented efficiently on GPU and do not use an iterative algorithm, the deep learning algorithms run extremely fast.
If they have enough capacity, they can learn a better prior of the MR images from the training set.

One of the first neural networks to gain attention for its use in MRI reconstruction was AUTOMAP [8]. This network did not exploit a problem-specific property except the fact that the outcome was supposed to be an image. Some more recent works [9,10,11] have tried to inspire themselves from existing classical methods in order to leverage problem specific properties but also expertise gained in the field. However, they have not been compared against each other on a large dataset containing complex-valued raw data.

A recently published dataset, fastMRI [12], allows this comparison, although it is still to be done and requires an implementation of the different networks in the same framework to allow for a fairer comparison in terms of, for example, runtime.

Our contribution is exactly this, that is:

Benchmark different neural networks for MRI reconstruction on two datasets: the fastMRI dataset, containing raw complex-valued knee data, and the OASIS dataset [13] containing DICOM real-valued brain data.
Provide reproducible code and the networks’ weights (https://github.com/zaccharieramzi/fastmri-reproducible-benchmark), using Keras [14] with a TensorFlow backend [15].

While our work focuses on classical MRI modalities reconstruction, note that other works have applied deep learning to other modalities like MR fingerprinting [16] or diffusion MRI [17]. The networks studied here could be applied but would not benefit from some invariants of the problem, especially in the fourth (contrast-related) dimension introduced.

2. Related Works

In this section, we briefly discuss other works presenting benchmarks on many different reconstruction neural networks.

In [18], they benchmark their (adversarial training based) algorithms against classical methods and against Cascade-net (which they call Deep Cascade) [11] and ADMM-net (which they call DeepADMM) [19]. They train and evaluate the networks quantitatively on two datasets, selecting each time 100 images for train and 100 images for test:

The IXI database (http://brain-development.org/ixi-dataset/) (brains),
The Data Science Bowl challenge (https://www.kaggle.com/c/second-annual-data-science-bowl/data) (chests).

While both these datasets provide a sufficient number of samples to have a trustworthy estimate of the performance of the networks, they are not composed of raw complex-valued data, but of DICOM real-valued data. Still, in [18], they do evaluate their algorithms on a raw complex-valued dataset (http://mridata.org/list?project=Stanford%20Fullysampled%203D%20FSE%20Knees), but it only features 20 acquisitions, and therefore the comparison is only done qualitatively.

In [10], they benchmark their algorithm against classical methods. They train and evaluate their network on three different datasets:

The brain real-valued data set provided by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) [20],
Two proprietary datasets with raw complex-valued data of brain data.

Again, the only public dataset they use features real-valued data. It is also to be noted that their code cannot be found online.

3. Models

In this section, we will first introduce what we call the classical models to do reconstruction in CS-MRI. The models we chose to discuss are in no way an exhaustive list of all the models that can be used without learning for reconstruction in MRI (think of LORAKS [21], for example, just to name this one), but they allow us to justify how the subsequent neural networks are built. These models are introduced shortly.

3.1. Idealized Inverse Problem

In anatomical MRI, the image is encoded as its Fourier transform, and the data acquisition is segmented in time in multiple shots or trajectories. This does not take possible gradient errors or

B 0

-field inhomogeneities into account. Because each Fourier coefficient trajectory takes time to acquire, the time separating two consecutive shots, namely the TR or time of repetition, being potentially pretty long, the idea of CS-MRI is to acquire less of them. We, therefore, have the following idealized inverse problem in the case of single-coil CS-MRI:

y = F_{Ω} x

(1)

where y is the acquired Fourier coefficients, also called the k-space data,

Ω

is the sub-sampling pattern or mask,

F_{Ω}

is the non-uniform Fourier transform (or masked Fourier transform in the case of Cartesian under-sampling), and x is the real anatomical image. Here, we will only deal with Cartesian under-sampling, and there we have

F_{Ω} = M_{Ω} F

, where

M_{Ω}

is a mask, and F is the classical Fourier transform. This model is also valid for 3D (volumewise) imaging, but in the following, we only consider 2D (slicewise) imaging.

3.2. Classic Models

The first (although unsatisfactory model) that can be used to perform the reconstruction of an MR image with an under-sampled k-space, is to simply use the inverse Fourier transform with the unknown Fourier coefficients replaced by zeros (zero-filled inverse Fourier transform). This method is called zero-filled reconstruction and we have:

{\hat{x}}_{z f} = F^{- 1} y

(2)

The second model we want to introduce makes use of the fact that MR images can be represented in a wavelet basis with only a few non-zero coefficients [4] according to the sparsity principle. The reconstruction is, therefore, done by solving the following optimization problem:

{\hat{x}}_{w a v} = \underset{x \in C^{n \times n}}{\arg \min} \frac{1}{2} ∥ y - F_{Ω} {x ∥}_{2}^{2} + λ {∥ Ψ x ∥}_{1}

(3)

where the notations are the same as in Equation (1), and

λ

is a hyper-parameter to be tuned, and

Ψ

is a chosen wavelet transform. This problem can be solved iteratively using a primal-dual optimization algorithm like Condat-Vù [22] or Primal Dual Hybrid Gradient (PDHG) [23] (also known as Chambolle-Pock algorithm) or, if the wavelet transform is invertible (i.e., non-redundant or decimated), using a proximal algorithm like the Fast Iterative Shrinkage Algorithm (FISTA) [24] or the Proximal Optimal Gradient Method (POGM) [25]. Since the problem is convex, all these algorithms converge to the same solution, only at different speeds.

The last model we choose to introduce is the dictionary learning model [26,27]. Its assumption is that an MR image is only composed of a few patches, and can therefore be expressed sparsely in a corresponding dictionary. This dictionary can be learned per-image, leading to the following optimization problem:

\begin{matrix} {\hat{x}}_{d l} = & \underset{x, D, {α_{i j}}_{(i, j) \in I}}{\arg \min} & \frac{1}{2} ∥ y - F_{Ω} {x ∥}_{2}^{2} + λ \sum_{(i, j) \in I} {∥ R_{i j} x - D α_{i j} ∥}_{2}^{2} \\ subject to & \forall (i, j) \in I, ∥ α_{i j} ∥_{0} \leq T_{0} \end{matrix}

(4)

where the notations are the same as in Equation (3), and I is the fixed set of patches locations, D is the dictionary,

λ

and

T_{0}

are hyper-parameters to be set, and

R_{i j}

is the linear operator extracting the patch at location

(i, j)

. This problem is solved in two steps:

The dictionary learning step, where both the dictionary D and the sparse codes $α_{i j}$ are updated alternatively.
The reconstruction step, where x is updated. Since this subproblem is quadratic, it admits an analytical solution, which amounts to averaging patches and then performing a data consistency in which the sampled frequencies are replaced in the patch-average result.

3.3. Neural Networks

The neural networks introduced here are all derived in a certain way from the classical models introduced before.

3.3.1. Single-Domain Networks

What we term single-domain networks are networks which only act either in the k-space or in the image (direct) space. They make use of the fact that we have a pseudo-inverse like in Equation (2). They usually use a U-net-like [28] architecture. This network was originally built for image segmentation but has since been used for a wide-variety of image-to-image tasks, mainly as a strong baseline. In [29], they used a U-net to apply on the under-sampled k-space measurements before performing the inverse Fourier transform. In [30], they used a U-net to apply on the zero-filled reconstruction and correct the output of the U-net with a data consistency step (where they replace sampled values in the k-space). The network we implemented was, however, vanilla, without this extra data-consistency step. Our implementation, however, only features the following cascade of number of filters:

16, 32, 64, 128

. The original U-net is illustrated in Figure 1, where the number of filters used in each layer is four times what we used.

3.3.2. Cross-Domains Networks

The second class of networks we introduce, we term cross-domain networks. The key intuitive idea is that they correct the data in both the k-space and the image space alternatively, using the Fourier transform to go from one space to the other. They are derived from the optimization algorithms used to solve the optimization problems introduced before, using the idea of “unrolling” introduced in [31]. An illustration of this class of networks is presented in Figure 2.

Because these networks work directly on the input data (and not on a primarily reconstructed version of it), they need to handle complex-valued data. In particular, the classical deep learning frameworks (TensorFlow and Pytorch) do not feature the ability to perform complex convolutions off-the-shelf. The way convolution is performed in the original papers is, therefore, to concatenate the real and imaginary of the image (respectively the k-space), making it a two-channel image, performing the series of convolutions, and having the output be a two-channel image then transformed back in a complex image (respectively k-space).

The Cascade-net [11] is based on the dictionary learning optimization Problem (4). The idea is to replace the dictionary learning step by convolutional neural networks and still keep the data consistency step in the k-space. The optimization algorithm is then unrolled to allow us to perform back-propagation. The authors of [11] show that we can perform back-propagation through the data consistency step (which is linear) and derive the corresponding Jacobian. The parameters used here for the implementation are the same as those in the original paper, except the number of filters, which was decreased from 64 to 48 to fit on a single GPU. This network is illustrated in Figure 3.

The KIKI-net [10] is an extension of the Cascade-net where they additionally perform convolutions after the data consistency step in the k-space. The parameters used here for the implementation are the same as those in the original paper. This network is illustrated in Figure 4.

The Primal-Dual-net (PD-net) was introduced by [9] and applied to MRI by [32], is based on the wavelet-based denoising (3), and in particular, the resolution of the corresponding optimization problem with the PDHG [23] algorithm. Here, the algorithm is unrolled, and the proximity operators (present in the general case of PDHG) are replaced by convolutional neural networks. For our implementation, for a fairer comparison with Cascade-net and the U-net, we used a ReLU non-linearity instead of a PReLU [33]. This network is illustrated in Figure 5.

3.4. Training

The training was done with the same parameters for all the networks. The optimizer used was from Adam [34], with a learning rate of

10^{- 3}

and default parameters of Keras (

β_{1} = 0.9

,

β_{2} = 0.999

, the exponential decay rates for the moment estimates). The gradient norm was clipped to one to avoid the exploding gradient problems [35]. The batch size was one (i.e., one slice) for every network except the U-Net, where the whole volume was used for each step. For all networks, to maximize the efficiency of the training, the slices were selected in the eight innermost slices of the volumes, because the outer slices do not have much signal. No early stopping or learning rate schedule was used (except for KIKI-net to allow for a stable training where we used the learning rate schedule proposed by the authors in the supporting information of [10]). The number of epochs used was 300 for all networks trained end-to-end. For the iterative training of the KIKI-net, the total number of epochs was 200 (50 per sub-training). Batch normalization was not used; however, in order to have the network learn more efficiently, a scaling of the input data was done. Both the k-space and the image were multiplied by

10^{6}

for fastMRI and by

10^{2}

for OASIS because the k-space measurements had values of mean

10^{- 7}

(looking separately at the real and imaginary parts) for fastMRI and of mean

10^{- 3}

for OASIS. Without this scaling operation, the training proved to be impossible with bias in the convolutions and very inefficient without bias in the convolutions.

4. Data

4.1. Under-Sampling

The under-sampling was done retrospectively using a Cartesian mask described in the data set paper [12] and an acceleration factor of four (i.e., only 25% of the k-space was kept). It contains a fully-sampled region in the lower frequencies, and randomly selects phase encoding lines in the higher frequencies.

It is to be noted that different under-sampling strategies exist in CS-MRI. Some of them are listed in [36], for example, spiral or radial. These strategies allow for a higher image quality while having the same acceleration factor or the same image quality with a higher acceleration factor. Typically, the spiral under-sampling scheme was designed to allow fast coronary imaging [37,38]. These under-sampling strategies must take into account kinematic constraints (both physically and safety based) but should also be with variable density [36]. Recent works even try to optimize the under-sampling strategy under these kinematic constraints [39]. Others have tried to learn the under-sampling strategy in a supervised way. In [40], the under-sampling strategy is learned with a greedy optimization. In [41], a gradient descent optimization is used. Some approaches ([42,43,44] even try to jointly learn the optimal under-sampling strategy along with the reconstruction.

4.2. FastMRI

The data used for this benchmark is the emulated single-coil k-space data of the fastMRI dataset [12], along with the corresponding ground truth images. The acquisition was done with a 15-channel phased array coil, in Cartesian 2D Turbin Spin Echo (TSE). The pulse sequences were proton-density weighting, half with fat suppression, half without, some at 3.0 Teslas (T) others at 1.5 T. The sequence parameters were as follows: Echo train length 4, matrix size

320 \times 320

, in-plane resolution 0.5 × 0.5 mm, slice thickness 3 mm, no gap between slices. In total, there are 973 volumes (34, 742 slices) for the training subset and 199 volumes (7135 slices) for the validation subset.

Since the k-spaces are of different sizes, therefore resulting in images of different sizes, the outputs of the cross-domain networks were cropped to a central

320 \times 320

region. For the U-net, the input of the network was cropped.

4.3. OASIS

The Open Access Series of Imaging Studies (OASIS) brain database [13] is a database including MRI scans of 1068 participants, yielding 2168 MR sessions. Of these 2168, we select only 2164 sessions, which feature T1-weighted sequences. Of these,1878 were acquired on a 3.0 T, 236 at 1.5 T, and the remaining are undisclosed (50). The slice size is majorly

256 \times 256

, and sometimes

240 \times 256

(rarely it can be some other sizes). The number of slices per scan is majorly 176, and sometimes 160 (rarely it can be smaller).

The data was then separated into a training and validation set. The split was participant-based, that is, a participant cannot have a scan in both sets. The split was of 90% for the training set and 10% for the validation set. We further reduced the training data to make it comparable to fastMRI, to 1000 scans randomly selected for the training subset and 200 scans randomly selected for the validation subset.

Contrarily to fastMRI, the OASIS data is available only in magnitude and, therefore, is only real-valued. The k-space is computed as the inverse Fourier transform of the magnitude image.

5. Results

5.1. Metrics

The metrics we used to benchmark the different networks are the following:

The Peak Signal-to-Noise Ratio (PSNR);
The Structural SIMilarity index (SSIM) [45];
The number of trainable parameters in the network;
The runtime in seconds of the neural network on a single volume.

The PSNR is computed as follows, on whole magnitude volumes:

P S N R (x, \hat{x}) = 10 {log}_{10} (\frac{max {(x)}^{2}}{\frac{1}{n} \sum_{i, j, k} {(x_{i, j, k} - {\hat{x}}_{i, j, k})}^{2}})

(5)

where x is the ground truth volume,

\hat{x}

is the predicted volume (magnitude image), and n is the total number of points in the ground truth volume (same as the predicted volume). Since this metric compares very local differences, it does not necessarily reflect the global visual comparison of the images. The SSIM was introduced in [45] exactly to take more structural differences or similarities between images into account. It is computed as in the original paper, per slice, then averaged over the volume (the range, however, is computed volume-wise):

S S I M (x, \hat{x}) = \frac{(2 μ_{x} μ_{\hat{x}} + c_{1}) (2 σ_{x} σ_{\hat{x}} + c_{2}) (c o v_{x \hat{x}} + c_{3})}{(μ_{x}^{2} + μ_{\hat{x}}^{2} + c_{1}) (σ_{x}^{2} + σ_{\hat{x}}^{2} + c_{2}) (σ_{x} σ_{\hat{x}} + c_{3})}

(6)

where x is the ground truth slice,

\hat{x}

is the predicted slice,

μ_{i}

is the mean of i,

σ_{i}^{2}

is the variance of i,

c o v_{i j}

is the covariance between i and j,

c_{1} = {(k_{1} L)}^{2}

,

c_{2} = {(k_{2} L)}^{2}

,

c_{3} = \frac{c_{2}}{2}

, L is the range of the values of the data (given because computed over the whole volume), and

k_{1} = 0.01

and

k_{2} = 0.03

.

While the two aforementioned metrics control the reconstruction quality, it is important to note that this is not the only factor to take into account when designing reconstruction techniques. Because the reconstruction has to happen fast enough for the MR physician to decide whether to re-conduct the exam or not, it is important for the proposed technique to have a reasonable reconstruction speed. For real-time MRI applications or dynamic MRI (e.g., cardiac imaging), it is even more important (for example, in the context of monitoring surgical operations [46]). The runtimes were measured on a computer equipped with a single GPU Quadro P5000 with 16 GB of RAM.

Concurrently, the number of parameters has to stay relatively low to allow the implementation on the different machines with potentially limited memory, which will probably need to have multiple models (for different contrasts, different organs, or different undersampling schemes including different acceleration factors).

5.2. Quantitative Results

The quantitative results in Table 1, Table 2, Table 3 and Table 4 show that the PD-net [9] outperforms its competitors in terms of image quality metrics but also has the least amount of trainable parameters. It is slightly slower than the Cascade-net [11] though, which can be explained by its higher number of iterations, therefore involving more costly Fourier transform (inverse or direct) operations. These results hold true on the two data sets, fastMRI [12] and OASIS [13]. The only exception is that KIKI-net [10] is slightly better than the U-net [28] on the OASIS data set, but still far from the best performers. We can also note that the standard deviation of the image quality metrics is way higher in the fastMRI data set than in the OASIS data set. This higher standard deviation is explained by the fact that the two contrasts present in the fastMRI dataset, Proton Density with and without Fat Suppression (PD/PDFS), have widely different image metrics values. The standard deviations when we compute the metrics for each contrast separately are more in-line with the OASIS ones. The range of the image quality metrics is also much higher in the OASIS results.

5.3. Qualitative Results

The qualitative results shown in Figure 6 and Figure 7 confirm the quantitative ones on the image quality aspect. The PD-net [9] is much better at conserving the high-frequency parts of the original image, as can be seen when looking at the reconstruction error, which is quite flat over the whole image.

6. Discussion

In this work, we only considered one scheme of under-sampling. However, it should be interesting to see if the performance obtained on one type of under-sampling generalizes to other types of under-sampling, especially if we do a re-gridding step for non-Cartesian under-sampling schemes. On that specific point, the extension of the networks towards non-Cartesian sampling schemes is not easy because the data consistency cannot be performed in the same way, and the measurement space is no longer similar to an image (except if we re-grid). In a recent work [47], some of the authors of the Cascade-net [11] propose a way to extend their approach to the non-Cartesian case, using a re-gridding step. The PD-net [9] also has a straightforward implementation for the non-Cartesian case even without re-gridding, in what is called the learned Primal. In this case, the network in the k-space is just computing the difference (residual) between the current k-space measurements and the initial k-space measurements. Therefore, there are no parameters to learn, which alleviates the problem of how to learn them.

We also only considered a single-coil acquisition setting. As parallel imaging is primarily used in CS-MRI to allow higher image quality [3], it is important to see how these networks will behave in the multi-coil setting. The difficult part in the extension of these works to the multi-coil setting will be to understand how to best involve the sensitivity maps (or even not involve them [48]).

Regarding the networks themselves, the results seem to suggest that for cross-domain networks, the trade-off between a high number of iterations and a richer correction in a certain domain (by having deeper networks) is in favor of having more iterations (i.e., alternating more between domains). It is, however, unclear how to best tackle the reconstruction in the k-space, since the convolutional networks make a shift invariance hypothesis, which is not true in the Fourier space where the coefficients corresponding to the high frequencies should probably not be treated in the same way as with the low frequencies. This leaves room for improvement in the near future.

Finally, this work has not dealt with recent approaches involving adversarial training for MRI reconstruction networks [18,49]. It would be very interesting to see how the adversarial training could improve each of the proposed networks.

Author Contributions

Conceptualization, Z.R. and P.C.; methodology, Z.R.; software, Z.R.; validation, Z.R., P.C. and J.-L.S.; formal analysis, Z.R.; investigation, Z.R.; resources, P.C.; data curation, Z.R.; writing—original draft preparation, Z.R.; writing—review and editing, P.C. and J.-L.S.; visualization, Z.R.; supervision, P.C. and J.-L.S.; project administration, P.C. and J.-L.S.; funding acquisition, P.C. and J.-L.S. All authors have read and agree to the published version of the manuscript.

Funding

This research was funded by the Cross-Disciplinary Program on Numerical Simulation (SILICOSMIC project) of CEA, the French Alternative Energies and Atomic Energy Commission.

Acknowledgments

We want to thank Jonas Adler, Justin Haldar, and Jo Schlemper for their very useful and benevolent remarks and answers they gave when asked questions about their works.

Conflicts of Interest

The authors declare no conflict of interest

Abbreviations

The following abbreviations are used in this manuscript:

MRI	Magnetic Resonance Imaging
CS-MRI	Compressed Sensing MRI
GPU	Graphical Processing Unit
ReLU	Rectified Linear Unit
PReLU	Parametrized ReLU
PSNR	Peak Signal-to-Noise Ratio
SSIM	Structural SIMilarity index

References

Ramzi, Z.; Ciuciu, P.; Starck, J.L. Benchmarking Deep Nets MRI Reconstruction Models on the FastMRI Publicly Available Dataset. In Proceedings of the ISBI 2020—International Symposium on Biomedical Imaging, Iowa City, IA, USA, 3–7 April 2020. [Google Scholar]
Smith-Bindman, R.; Miglioretti, D.L.; Larson, E.B. Rising use of diagnostic medical imaging in a large integrated health system. Health Aff. 2008, 27, 1491–1502. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Roemer, P.B.; Edelstein, W.A.; Hayes, C.E.; Souza, S.P.; Mueller, O.M. The NMR phased array. Magn. Reson. Med. 1990, 16, 192–225. [Google Scholar] [CrossRef] [PubMed]
Lustig, M.; Donoho, D.; Pauly, J.M. Sparse MRI: The Application of Compressed Sensing for Rapid MR Imaging Michael. Magn. Reson. Med. 2007. [Google Scholar] [CrossRef] [PubMed]
NHS. NHS Eebsite. Available online: https://www.nhs.uk/conditions/mri-scan/what-happens/ (accessed on 4 March 2020).
AIM Specialty Health. Clinical Appropriateness Guidelines: Advanced Imaging. 2017. Available online: https://www.aimspecialtyhealth.com/PDF/Guidelines/2017/Sept05/AIM_Guidelines.pdf (accessed on 4 March 2020).
Ramzi, Z.; Ciuciu, P.; Starck, J.L. Benchmarking proximal methods acceleration enhancements for CS-acquired MR image analysis reconstruction. In Proceedings of the SPARS 2019—Signal Processing with Adaptive Sparse Structured Representations Workshop, Toulouse, France, 1–4 July 2019. [Google Scholar]
Zhu, B.; Liu, J.Z.; Cauley, S.F.; Rosen, B.R.; Rosen, M.S. Image reconstruction by domain-transform manifold learning. Nature 2018, 555, 487–492. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Adler, J.; Öktem, O. Learned Primal-Dual Reconstruction. IEEE Trans. Med. Imaging 2018, 37, 1322–1332. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Eo, T.; Jun, Y.; Kim, T.; Jang, J.; Lee, H.J.; Hwang, D. KIKI-net: Cross-domain convolutional neural networks for reconstructing undersampled magnetic resonance images. Magn. Reson. Med. 2018, 80, 2188–2201. [Google Scholar] [CrossRef]
Schlemper, J.; Caballero, J.; Hajnal, J.V.; Price, A.; Rueckert, D. A Deep Cascade of Convolutional Neural Networks for MR Image Reconstruction. IEEE Trans. Med. Imaging 2018, 37, 491–503. [Google Scholar] [CrossRef] [Green Version]
Zbontar, J.; Knoll, F.; Sriram, A.; Muckley, M.J.; Bruno, M.; Defazio, A.; Parente, M.; Geras, K.J.; Katsnelson, J.; Chandarana, H.; et al. fastMRI: An Open Dataset and Benchmarks for Accelerated MRI. arXiv 2018, arXiv:1811.08839. [Google Scholar]
LaMontagne, P.J.; Keefe, S.; Lauren, W.; Xiong, C.; Grant, E.A.; Moulder, K.L.; Morris, J.C.; Benzinger, T.L.; Marcus, D.S. OASIS-3: Longitudinal neuroimaging, clinical, and cognitive dataset for normal aging and Alzheimer’s disease. Alzheimer’s Dementia J. Alzheimer’s Assoc. 2018, 14, P1097. [Google Scholar] [CrossRef]
Chollet, F. Keras. 2015. Available online: https://keras.io (accessed on 4 March 2020).
Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 2015. Available online: tensorflow.org (accessed on 4 March 2020).
Virtue, P.; Stella, X.Y.; Lustig, M. Better than real: Complex-valued neural nets for MRI fingerprinting. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 3953–3957. [Google Scholar]
Aggarwal, H.K.; Mani, M.P.; Jacob, M. Multi-Shot Sensitivity-Encoded Diffusion MRI Using Model-Based Deep Learning (Modl-Mussels). In Proceedings of the 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), Venice, Italy, 8–11 April 2019; pp. 1541–1544. [Google Scholar]
Minh Quan, T.; Nguyen-Duc, T.; Jeong, W.K. Compressed Sensing MRI Reconstruction using a Generative Adversarial Network with a Cyclic Loss. IEEE Trans. Med. Imaging 2018, 37, 1488–1497. [Google Scholar] [CrossRef] [Green Version]
Yang, Y.; Sun, J.; Li, H.; Xu, Z.; Sun, J.; Xu, Z. Deep ADMM-net for compressive sensing MRI. In Proceedings of the Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 10–18. [Google Scholar] [CrossRef]
Petersen, R.C.; Aisen, P.; Beckett, L.A.; Donohue, M.; Gamst, A.; Harvey, D.J.; Jack, C.; Jagust, W.; Shaw, L.; Toga, A.; et al. Alzheimer’s disease neuroimaging initiative (ADNI): Clinical characterization. Neurology 2010, 74, 201–209. [Google Scholar] [CrossRef] [Green Version]
Haldar, J.P. Low-Rank Modeling of Local k-Space Neighborhoods (LORAKS) for Constrained MRI. IEEE Trans. Med. Imaging 2014, 33, 668–681. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Condat, L. A Primal–Dual Splitting Method for Convex Optimization Involving Lipschitzian, Proximable and Linear Composite Terms. J. Optim. Theory Appl. 2013, 158, 460–479. [Google Scholar] [CrossRef] [Green Version]
Chambolle, A.; Pock, T. A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging. J. Math. Imaging Vis. 2011, 120–145. [Google Scholar] [CrossRef] [Green Version]
Beck, A.; Teboulle, M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2009, 2, 183–202. [Google Scholar] [CrossRef] [Green Version]
Kim, D.; Fessler, J.A. Adaptive restart of the optimized gradient method for convex optimization. J. Optim. Theory Appl. 2018, 178, 240–263. [Google Scholar] [CrossRef] [Green Version]
Ravishankar, S.; Bresler, Y. Magnetic Resonance Image Reconstruction from Highly Undersampled K-Space Data Using Dictionary Learning. IEEE Trans. Med. Imaging 2011, 30, 1028–1041. [Google Scholar] [CrossRef]
Caballero, J.; Price, A.N.; Rueckert, D.; Hajnal, J.V. Dictionary learning and time sparsity for dynamic MR data reconstruction. IEEE Trans. Med. Imaging 2014, 33, 979–994. [Google Scholar] [CrossRef] [Green Version]
Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv 2015, arXiv:1505.04597v1. [Google Scholar]
Han, Y.; Sunwoo, L.; Chul Ye, J. k-Space Deep Learning for Accelerated MRI. arXiv 2019, arXiv:1805.03779v2. [Google Scholar] [CrossRef] [Green Version]
Min Hyun, C.; Pyung Kim, H.; Min Lee, S.; Lee, S.; Keun Seo, J. Deep learning for undersampled MRI reconstruction. arXiv 2019, arXiv:1709.02576v3. [Google Scholar]
Gregor, K.; Lecun, Y. Learning Fast Approximations of Sparse Coding. In Proceedings of the 27thInternational Confer-ence on Machine Learning, Haifa, Israel, 13–15 June 2010. [Google Scholar]
Cheng, J.; Wang, H.; Ying, L.; Liang, D. Model Learning: Primal Dual Networks for Fast MR imaging. arXiv 2019, arXiv:1908.02426. [Google Scholar]
He, K.; Zhang, X.; Ren, S.; Sun, J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. arXiv 2015, arXiv:1502.01852. [Google Scholar]
Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
Pascanu, R.; Mikolov, T.; Bengio, Y. On the difficulty of training Recurrent Neural Networks. arXiv 2012, arXiv:1211.5063. [Google Scholar]
Chauffert, N.; Ciuciu, P.; Kahn, J.; Weiss, P. Variable density sampling with continuous trajectories. SIAM J. Imaging Sci. 2014, 7, 1962–1992. [Google Scholar] [CrossRef] [Green Version]
Irarrazabal, P.; Nishimura, D.G. Fast Three Dimensional Magnetic Resonance Imaging. Magn. Reson. Med. 1995, 33, 656–662. [Google Scholar] [CrossRef]
Meyer, C.H.; Hu, B.S.; Nishimura, D.G.; Macovski, A. Fast Spiral Coronary Artery Imaging. Magn. Reson. Med. 1992, 28, 202–213. [Google Scholar] [CrossRef]
Lazarus, C.; Weiss, P.; Chauffert, N.; Mauconduit, F.; El Gueddari, L.; Destrieux, C.; Zemmoura, I.; Vignaud, A.; Ciuciu, P. SPARKLING: Variable-density k-space filling curves for accelerated T 2 * -weighted MRI. Magn. Reson. Med. 2018, 1–19. [Google Scholar] [CrossRef] [Green Version]
Sanchez, T.; Gözcü, B.; van Heeswijk, R.B.; Eftekhari, A.; Ilıcak, E.; Çukur, T.; Cevher, V. Scalable learning-based sampling optimization for compressive dynamic MRI. arXiv 2019, arXiv:1902.00386. [Google Scholar]
Sherry, F.; Benning, M.; Reyes, J.C.D.l.; Graves, M.J.; Maierhofer, G.; Williams, G.; Schönlieb, C.B.; Ehrhardt, M.J. Learning the sampling pattern for MRI. arXiv 2019, arXiv:1906.08754. [Google Scholar]
Aggarwal, H.K.; Jacob, M. J-MoDL: Joint Model-Based Deep Learning for Optimized Sampling and Reconstruction. arXiv 2019, arXiv:1911.02945. [Google Scholar]
Wu, Y.; Rosca, M.; Lillicrap, T. Deep compressed sensing. arXiv 2019, arXiv:1905.06723. [Google Scholar]
Weiss, T.; Senouf, O.; Vedula, S.; Michailovich, O.; Zibulevsky, M.; Bronstein, A. PILOT: Physics-Informed Learned Optimal Trajectories for Accelerated MRI. arXiv 2019, arXiv:1909.05773. [Google Scholar]
Wang, Z.; Bovik, A.C.; Rahim Sheikh, H.; Simoncelli, E.P. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans. Image Process. 2004, 13. [Google Scholar] [CrossRef] [Green Version]
Horvath, K.A.; Li, M.; Mazilu, D.; Guttman, M.A.; McVeigh, E.R. Real-time magnetic resonance imaging guidance for cardiovascular procedures. In Seminars in Thoracic and Cardiovascular Surgery; Elsevier: New York, NY, USA, 2007; Volume 19, pp. 330–335. [Google Scholar]
Schlemper, J.; Sadegh, S.; Salehi, M.; Kundu, P.; Lazarus, C.; Dyvorne, H.; Rueckert, D.; Sofka, M. Nonuniform Variational Network: Deep Learning for Accelerated Nonuniform MR Image Reconstruction. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Shenzhen, China, 13–17 October 2019. [Google Scholar]
El Gueddari, L.; Ciuciu, P.; Chouzenoux, E.; Vignaud, A.; Pesquet, J.C. Calibrationless OSCAR-based image reconstruction in compressed sensing parallel MRI. In Proceedings of the 16th IEEE International Symposium on Biomedical Imaging, Venice, Italy, 8–11 April 2019; pp. 1532–1536. [Google Scholar]
Dragotti, P.L.; Dong, H.; Yang, G.; Guo, Y.; Firmin, D.; Slabaugh, G.; Yu, S.; Keegan, J.; Ye, X.; Liu, F.; et al. DAGAN: Deep De-Aliasing Generative Adversarial Networks for Fast Compressed Sensing MRI Reconstruction. IEEE Trans. Med. Imaging 2017, 37, 1310–1321. [Google Scholar] [CrossRef] [Green Version]

Figure 1. Illustration of the U-net from [28]. In our case, the output is not a segmentation map but a reconstructed image of the same size (we perform zero-padding to prevent decreasing sizes in convolutions).

Figure 2. The common backbone between the Cascade net, the KIKI-net, and the PD-net. US mask stands for under-sampling mask. DC stands for data consistency. (I)FFT stands for (Inverse) Fast Fourier Transform.

N_{k, d}

is the number of convolution layers applied in the k-space.

N_{i, d}

is the number of convolution layers applied in the image space.

N_{C}

is the total number of alternations between the k-space and the image-space. It is to be noted that in the case of PD-net, the data consistency step is not performed, the Fourier operators are performed with the original under-sampling mask, and a buffer is concatenated along with the current iteration to allow for some memory between iterations and to learn the acceleration (in the k-space net—dual net—it is also concatenated with the original k-space input). In the case of the Cascade net,

N_{k, d} = 0

, only the data consistency is performed in the k-space. In the case of the KIKI-net, there is no residual connection in the k-space. However, the k-space nets and image space nets could potentially be any kind of image-to-image neural network.

Figure 2. The common backbone between the Cascade net, the KIKI-net, and the PD-net. US mask stands for under-sampling mask. DC stands for data consistency. (I)FFT stands for (Inverse) Fast Fourier Transform.

N_{k, d}

is the number of convolution layers applied in the k-space.

N_{i, d}

is the number of convolution layers applied in the image space.

N_{C}

is the total number of alternations between the k-space and the image-space. It is to be noted that in the case of PD-net, the data consistency step is not performed, the Fourier operators are performed with the original under-sampling mask, and a buffer is concatenated along with the current iteration to allow for some memory between iterations and to learn the acceleration (in the k-space net—dual net—it is also concatenated with the original k-space input). In the case of the Cascade net,

N_{k, d} = 0

, only the data consistency is performed in the k-space. In the case of the KIKI-net, there is no residual connection in the k-space. However, the k-space nets and image space nets could potentially be any kind of image-to-image neural network.

Figure 3. Illustration of the Cascade-net from [11]. Here, each

C_{i}

is a convolutional block of 64 filters (48 in our implementation) followed by a ReLU non-linearity,

n_{d}

is the number of such convolutional blocks forming a convolutional subnetwork between each data consistency layer

D C

, and

n_{c}

is the number of convolutional subnetworks.

Figure 3. Illustration of the Cascade-net from [11]. Here, each

C_{i}

is a convolutional block of 64 filters (48 in our implementation) followed by a ReLU non-linearity,

n_{d}

is the number of such convolutional blocks forming a convolutional subnetwork between each data consistency layer

D C

, and

n_{c}

is the number of convolutional subnetworks.

Figure 4. Illustration of the KIKI-net from [10]. The KCNN and ICNN are convolutional neural networks composed of a number of convolutional blocks varying between 5 and 25 (we implemented 25 blocks for both KCNN and ICNN), each followed by a ReLU non-linearity and featuring between 8 and 64 filters (we implemented 32 filters). For both the varying numbers, the supplementary material of [10] shows that the higher the better. The ICNN also features a residual connection.

Figure 5. Illustration of the PD-net from [9]. Here,

T

denotes the measurement operator, which in our case is the under-sampled Fourier transform,

T^{*}

its adjoint, g is the measurements, which in our case are the undersampled k-space measurements, and

f_{0}

and

h_{0}

are the initial guesses for the direct and measurement spaces (the image and k-space in our case). The initial guesses are zero tensors. Because we transform complex-valued data into 2-channel real-valued data, the number of channels at the input and the output of the convolutional subnetworks are multiplied by 2 in our implementation.

Figure 5. Illustration of the PD-net from [9]. Here,

T

denotes the measurement operator, which in our case is the under-sampled Fourier transform,

T^{*}

its adjoint, g is the measurements, which in our case are the undersampled k-space measurements, and

f_{0}

and

h_{0}

are the initial guesses for the direct and measurement spaces (the image and k-space in our case). The initial guesses are zero tensors. Because we transform complex-valued data into 2-channel real-valued data, the number of channels at the input and the output of the convolutional subnetworks are multiplied by 2 in our implementation.

Figure 6. Reconstruction results for a specific slice (16th slice of file1000196, part of the validation set). The first row represents the reconstruction using the different methods, while the second represents the absolute error when compared to the reference.

Figure 7. Reconstruction results for a specific slice (15th slice of sub-OAS30367_ses-d3396_T1w.nii.gz, part of the validation set). The top row represents the reconstruction using the different methods, while the bottom row represents the absolute error when compared to the reference.

Table 1. Quantitative results for the fastMRI dataset. Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity index (SSIM) mean and standard deviations are computed over the 200 validation volumes. Runtimes are given for the reconstruction of a volume with 35 slices.

Network	PSNR-mean (std) (dB)	SSIM-mean (std)	#params	Runtime (s)
Zero-filled	29.61 ( 5.28)	0.657 ( 0.23)	0	0.68
KIKI-net	31.38 (3.02)	0.712 (0.13)	1.25M	8.22
U-net	31.78 ( 6.53)	0.720 ( 0.25)	482k	0.61
Cascade net	31.97 ( 6.95)	0.719 ( 0.27)	425k	3.58
PD-net	32.15 ( 6.90)	0.729 ( 0.26)	318k	5.55

Table 2. Quantitative results for the fastMRI dataset with the Proton density fat suppression (PDFS) contrast. PSNR and SSIM mean and standard deviations are computed over the 99 validation volumes. Runtimes are given for the reconstruction of a volume with 35 slices.

Network	PSNR-mean (std) (dB)	SSIM-mean (std)	# params	Runtime (s)
Zero-filled	28.44 (2.62)	0.578 (0.095)	0	0.41
KIKI-net	29.57 (2.64)	0.6271 (0.10)	1.25M	8.88
Cascade-net	29.88 (2.82)	0.6251 (0.11)	425K	3.57
U-net	29.89 (2.74)	0.6334 (0.10)	482K	1.34
PD-net	30.06 (2.82)	0.6394 (0.10)	318K	5.38

Table 3. Quantitative results for the fastMRI dataset with the Proton density (PD) contrast. PSNR and SSIM mean and standard deviations are computed over the 100 validation volumes. Runtimes are given for the reconstruction of a volume with 40 slices.

Network	PSNR-mean (std) (dB)	SSIM-mean (std)	# params	Runtime (s)
Zero-filled	30.63 (2.1)	0.727 (0.087)	0	0.52
KIKI-net	32.86 (2.4)	0.797 (0.082)	1.25M	11.83
U-net	33.64 (2.6)	0.807 (0.084)	482K	1.07
Cascade-net	33.98 (2.7)	0.811 (0.086)	425K	4.22
PD-net	34.2 (2.7)	0.818 (0.084)	318280	6.08

Table 4. Quantitative results for the OASIS dataset. PSNR and SSIM mean and standard deviations are computed over the 200 validation volumes. Runtimes are given for the reconstruction of a volume with 32 slices.

Network	PSNR-mean (std) (dB)	SSIM-mean (std)	# params	Runtime (s)
Zero-filled	26.11 (1.45)	0.672 (0.0307)	0	0.165
U-net	29.8 (1.39)	0.847 (0.0398)	482k	1.202
KIKI-net	30.08 (1.43)	0.853 (0.0336)	1.25M	3.567
Cascade-net	32.0 (1.731)	0.887 (0.0327)	425k	2.234
PD-net	33.22 (1.912)	0.910 (0.0358)	318k	2.758

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Ramzi, Z.; Ciuciu, P.; Starck, J.-L. Benchmarking MRI Reconstruction Neural Networks on Large Public Datasets. Appl. Sci. 2020, 10, 1816. https://doi.org/10.3390/app10051816

AMA Style

Ramzi Z, Ciuciu P, Starck J-L. Benchmarking MRI Reconstruction Neural Networks on Large Public Datasets. Applied Sciences. 2020; 10(5):1816. https://doi.org/10.3390/app10051816

Chicago/Turabian Style

Ramzi, Zaccharie, Philippe Ciuciu, and Jean-Luc Starck. 2020. "Benchmarking MRI Reconstruction Neural Networks on Large Public Datasets" Applied Sciences 10, no. 5: 1816. https://doi.org/10.3390/app10051816

APA Style

Ramzi, Z., Ciuciu, P., & Starck, J.-L. (2020). Benchmarking MRI Reconstruction Neural Networks on Large Public Datasets. Applied Sciences, 10(5), 1816. https://doi.org/10.3390/app10051816

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Benchmarking MRI Reconstruction Neural Networks on Large Public Datasets

Abstract

1. Introduction

2. Related Works

3. Models

3.1. Idealized Inverse Problem

3.2. Classic Models

3.3. Neural Networks

3.3.1. Single-Domain Networks

3.3.2. Cross-Domains Networks

3.4. Training

4. Data

4.1. Under-Sampling

4.2. FastMRI

4.3. OASIS

5. Results

5.1. Metrics

5.2. Quantitative Results

5.3. Qualitative Results

6. Discussion

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

Abbreviations

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI