Review

Deep Learning-Based Reconstruction for Cardiac MRI: A Review

1 Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
2 Department of Radiology, Stanford University, Stanford, CA 94305, USA
3 Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA
* Author to whom correspondence should be addressed.
Bioengineering 2023, 10(3), 334; https://doi.org/10.3390/bioengineering10030334
Submission received: 2 February 2023 / Revised: 3 March 2023 / Accepted: 3 March 2023 / Published: 6 March 2023
(This article belongs to the Special Issue AI in MRI: Frontiers and Applications)

Abstract

Cardiac magnetic resonance (CMR) is an essential clinical tool for the assessment of cardiovascular disease. Deep learning (DL) has recently revolutionized the field through image reconstruction techniques that allow unprecedented data undersampling rates. These fast acquisitions have the potential to considerably impact the diagnosis and treatment of cardiovascular disease. Herein, we provide a comprehensive review of DL-based reconstruction methods for CMR. We place special emphasis on state-of-the-art unrolled networks, which are heavily based on a conventional image reconstruction framework. We review the main DL-based methods and connect them to the relevant conventional reconstruction theory. Next, we review several methods developed to tackle specific challenges that arise from the characteristics of CMR data. Then, we focus on DL-based methods developed for specific CMR applications, including flow imaging, late gadolinium enhancement, and quantitative tissue characterization. Finally, we discuss the pitfalls and future outlook of DL-based reconstructions in CMR, focusing on the robustness, interpretability, clinical deployment, and potential for new methods.

1. Introduction

Over the last few decades, cardiac magnetic resonance (CMR) has emerged as an essential clinical tool to assess cardiovascular disease. CMR is non-invasive, free of ionizing radiation, and highly versatile, which enables a comprehensive assessment of cardiovascular structure, function, flow, perfusion, and viability, as well as tissue characterization [1,2]. For example, CMR is the gold standard for measuring the left ventricular (LV) ejection fraction, a cornerstone parameter for the diagnosis of heart failure and the timing of surgical interventions [3]. Additionally, late gadolinium enhancement (LGE) and T1/T2 mapping are widely used to assess cardiac viability and to detect pathologies, such as diffuse myocardial fibrosis, fatty infiltration, or infarcted tissue [4,5,6,7,8,9,10,11]. Similarly, CMR flow imaging has revolutionized the care of patients with congenital heart disease by providing blood velocity and flow measurements in multiple planes, which allows for the detection and evaluation of intracardiac shunts and right- and left-lung flow splits [12,13,14].
However, one of the main drawbacks of CMR is the prolonged scan time. Magnetic resonance imaging (MRI) uses an inherently slow acquisition process, a disadvantage that compromises patient comfort and image quality, while limiting the achievable spatiotemporal resolution. CMR is especially challenging because clinicians require a high spatial and temporal resolution for accurate diagnoses. Additionally, CMR acquisitions must account for the inherent cardiac and respiratory motion to avoid motion artifacts [15,16,17,18].
To date, the most successful methods accelerate CMR data acquisition by reducing the amount of data acquired. Parallel imaging methods [19,20,21] exploit the redundancy of the data acquired through multi-channel coil arrays. Similarly, compressed sensing techniques [22] leverage the inherent compressibility (or sparsity) of MRI images. Both methods introduce aliasing artifacts that must be suppressed via a reconstruction algorithm to recover accurate images. For CMR, the combination of parallel imaging and compressed sensing (PICS) has achieved about 2–3x undersampling for 2D applications and 2–4x for 3D applications by exploiting both the multi-channel acquisition and the redundancy of spatiotemporal data, without compromising the accuracy or precision in the estimation of the cardiovascular parameters of clinical interest [23,24,25,26].
More recently, machine learning has disrupted the image reconstruction field and is rapidly becoming the state of the art. Deep learning (DL)-based reconstruction algorithms have outperformed conventional algorithms by yielding more accurate image reconstructions and enabling higher undersampling factors [27,28,29,30]. In compressed sensing, reconstruction algorithms remove aliasing artifacts by enforcing a prior image model based on sparsity [22,31,32]. DL-based methods are able to outperform these methods by learning the prior image model directly from historical data [33,34,35,36].
For CMR, DL-based reconstruction methods have rapidly surpassed PICS with increased undersampling rates and substantially reduced image reconstruction times. This advance has the potential to translate clinically into improved image quality [37], increased spatial and temporal resolution, and significantly reduced breath-hold durations, which is especially relevant for pediatric patients and adult patients with respiratory impairment [38,39]. The adaptation of DL-based reconstruction methods to CMR has brought significant technical developments. For example, multiple works have aimed to develop novel network architectures that are able to efficiently learn spatiotemporal priors [37,40,41,42]. Additionally, some works have been developed to tackle the specific challenges of CMR data, such as increased computational complexity [43,44] and a shortage of fully sampled ground-truth data for training [45,46].
In this work, we review the most current and central developments in DL-based reconstruction for CMR applications. In the first section, we overview the relevant image reconstruction theory. Then, we formally present several DL-based reconstruction methods. We first focus on the technical developments to adapt DL-based reconstruction to CMR data and then on the works developed for specific applications. Finally, we discuss the pitfalls and future outlook of the field and present our conclusions.

2. Image Reconstruction Theory

DL-based methods outperform conventional PICS image reconstruction by learning the prior image model directly from historical data. However, conventional image reconstruction is still relevant because DL-based reconstruction methods are heavily based on traditional PICS algorithms, as shown in Figure 1. We review the relevant image reconstruction theory, including the MR image models and image reconstruction algorithms, in order to contextualize the DL-based reconstruction methods in Section 3.

2.1. General Model

Compressed sensing [31,32] enables accelerated data acquisition by reducing the number of k-space samples required to obtain high-quality images via a pseudo-random undersampling scheme. This random undersampling strategy introduces incoherent aliasing artifacts into the images, which must be removed via a non-linear reconstruction algorithm to retrieve the high-quality image.
Image reconstruction can be formulated as an inverse problem [22] that aims to retrieve the image $x \in \mathbb{C}^{NT}$ that is consistent with the acquired k-space data $y \in \mathbb{C}^{M}$ and best matches a prior image model (e.g., sparsity or low rank). In CMR, $x$ is a time series of images, with $N$ voxels per time frame and $T$ temporal frames. Additionally, $M$ is the number of k-space locations acquired. Equation (1) formulates image reconstruction as an optimization problem.
$x^* = \arg\min_x \frac{1}{2} \| A x - y \|_2^2 + \lambda R(x),$
where $A \in \mathbb{C}^{M \times NT}$ is the MRI forward operator that includes the Fourier transform and the coil sensitivity maps from the multi-channel coil array, $R: \mathbb{C}^{NT} \rightarrow \mathbb{R}$ is a regularization function that enforces the prior, and $\lambda$ is a tunable hyperparameter. The first least-squares term is denoted as the data consistency term because it enforces consistency with the acquired data, while the second term is the regularization term that enforces the prior image model.
For CMR, multiple choices of R that exploit the spatiotemporal redundancy have been proposed. Some models exploit sparsity in a sparsifying domain, such as temporal Fourier transform [24,50,51], wavelets [52], and total variation [53,54,55]. Likewise, other methods exploit low-rank structures [47,56,57] or a combination of both sparsity and low-rank models [58,59,60,61].
There are multiple algorithm options to solve the reconstruction problem described in Equation (1). If the proximal operator of $R$ has a closed-form solution, e.g., $\ell_1$-wavelets [22], then an efficient option is to use a proximal gradient-type method [62] (e.g., the fast iterative shrinkage-thresholding algorithm [63]). The simplest proximal gradient algorithm alternates between a gradient step on the data consistency term and a proximal step on the regularization, as described in Equation (2) and Figure 1A.
$z^{k+1} = x^k - \alpha A^H (A x^k - y),$
$x^{k+1} = \mathrm{prox}_{\alpha \lambda R}(z^{k+1}),$
where $x^k$ is the estimate of $x$ at iteration $k$, $z^k$ is an auxiliary variable at iteration $k$, $\alpha$ is the step size, and $\mathrm{prox}_{R}$ is the proximal operator [62] of the regularization function $R$.
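As an illustration, the following is a minimal sketch of the proximal gradient iteration in Equation (2), assuming a toy dense sensing matrix in place of the multicoil MRI forward operator and an $\ell_1$ prior whose proximal operator is soft-thresholding; the variable names, problem sizes, and regularization weight are illustrative only.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 for complex-valued inputs."""
    mag = np.abs(z)
    return np.where(mag > tau, (1.0 - tau / np.maximum(mag, 1e-12)) * z, 0.0 + 0.0j)

def proximal_gradient(A, y, lam, num_iters=300):
    """Alternate gradient steps on the data term with proximal steps on the l1 prior."""
    alpha = 1.0 / np.linalg.norm(A, ord=2) ** 2       # step size from the spectral norm of A
    x = np.zeros(A.shape[1], dtype=complex)
    for _ in range(num_iters):
        z = x - alpha * A.conj().T @ (A @ x - y)      # gradient step (data consistency)
        x = soft_threshold(z, alpha * lam)            # proximal step (regularization)
    return x

# Toy example: recover a sparse complex vector from undersampled linear measurements.
rng = np.random.default_rng(0)
N, M = 256, 128
A = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2 * M)
x_true = np.zeros(N, dtype=complex)
x_true[rng.choice(N, 10, replace=False)] = rng.standard_normal(10) + 1j * rng.standard_normal(10)
y = A @ x_true
x_hat = proximal_gradient(A, y, lam=0.01)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

In an actual CMR reconstruction, $A$ would be the multicoil Fourier operator and the proximal step would act on a spatiotemporal transform of the image series.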
Conversely, if the proximal operator does not have a closed-form solution, then more efficient methods would be the primal-dual hybrid gradient [64] or the alternating direction method of multipliers [65]. We describe the method denoted as half quadratic splitting [66], which is relevant for DL-based reconstruction, in Equation (3).
$x^{k+1} = \arg\min_x \frac{1}{2} \| A x - y \|_2^2 + \frac{\rho}{2} \| T x - z^k \|_2^2,$
$z^{k+1} = \arg\min_z \lambda Q(z) + \frac{\rho}{2} \| z - T x^{k+1} \|_2^2,$
where we consider a regularization function $R$ that can be expressed as $R(x) = Q(Tx)$, with $Q$ a convex function (usually a norm) and $T$ a linear operator that represents the sparsifying transform (e.g., total variation).

2.2. Low Rank plus Sparse Model

A special and relevant formulation is the low rank plus sparse (L+S) decomposition model [26]. This model decomposes the image $x$ into two components: a low-rank component $x_L \in \mathbb{C}^{NT}$ and a sparse component $x_S \in \mathbb{C}^{NT}$. Equation (4) presents the L+S reconstruction problem.
$x^* = \arg\min_x \frac{1}{2} \| A(x_L + x_S) - y \|_2^2 + \lambda_L \| x_L \|_* + \lambda_S \| T x_S \|_1, \quad \text{s.t. } x = x_L + x_S,$
where $\| x_L \|_*$ is the nuclear norm of $x_L$ and $\| T x_S \|_1$ is the $\ell_1$-norm of $x_S$ in the sparsifying transform domain defined by $T$. With this formulation, the aim is for $x_L$ to capture the temporally correlated background, whereas $x_S$ captures the dynamic information on top of the background.
Equation (4) can be solved using an extension of the proximal gradient (Equation (2)) as described in Equation (5).
$x^{k+1} = x^k - \alpha A^H (A(x_L^k + x_S^k) - y),$
$x_L^{k+1} = \mathrm{prox}_{\alpha \lambda_L \| \cdot \|_*}(x^{k+1} - x_S^k),$
$x_S^{k+1} = T^H \mathrm{prox}_{\alpha \lambda_S \| \cdot \|_1}(T(x^{k+1} - x_L^k)),$
where $\mathrm{prox}_{\alpha \lambda_L \| \cdot \|_*}$ is the proximal operator of the nuclear norm, i.e., the singular value soft-thresholding operation, and $\mathrm{prox}_{\alpha \lambda_S \| \cdot \|_1}$ is the proximal operator of the $\ell_1$-norm, i.e., the soft-thresholding operation.
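The following is a minimal sketch of the L+S iterations in Equation (5), assuming a toy single-coil Cartesian forward operator (orthonormal 2D FFT plus a per-frame sampling mask) and a temporal FFT as the sparsifying transform $T$; the shapes, regularization weights, and simplified operator are illustrative assumptions rather than any published implementation.

```python
import numpy as np

def A_fwd(x, mask):
    """Per-frame 2D FFT followed by k-space undersampling; x has shape (nx, ny, T)."""
    return mask * np.fft.fft2(x, axes=(0, 1), norm="ortho")

def A_adj(y, mask):
    """Adjoint of A_fwd: mask the k-space, then inverse 2D FFT per frame."""
    return np.fft.ifft2(mask * y, axes=(0, 1), norm="ortho")

def svt(X, tau):
    """Singular value soft-thresholding (proximal operator of the nuclear norm)."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vh

def soft(z, tau):
    """Complex soft-thresholding (proximal operator of the l1 norm)."""
    mag = np.abs(z)
    return np.where(mag > tau, (1.0 - tau / np.maximum(mag, 1e-12)) * z, 0.0 + 0.0j)

def lps_recon(y, mask, lam_L=0.01, lam_S=0.001, alpha=1.0, num_iters=50):
    """Proximal gradient iterations of Equation (5) for the L+S model."""
    nx, ny, T = y.shape
    xL = np.zeros((nx, ny, T), dtype=complex)
    xS = np.zeros((nx, ny, T), dtype=complex)
    for _ in range(num_iters):
        # Gradient step on the data-consistency term for x = xL + xS.
        x = (xL + xS) - alpha * A_adj(A_fwd(xL + xS, mask) - y, mask)
        # Low-rank update: SVT on the Casorati matrix of (x - xS).
        xL_new = svt((x - xS).reshape(nx * ny, T), alpha * lam_L).reshape(nx, ny, T)
        # Sparse update: soft-threshold (x - xL^k) in the temporal Fourier domain,
        # using the previous low-rank estimate as written in Equation (5).
        xS = np.fft.ifft(soft(np.fft.fft(x - xL, axis=-1, norm="ortho"), alpha * lam_S),
                         axis=-1, norm="ortho")
        xL = xL_new
    return xL, xS
```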

2.3. Partial Separability Model

Lastly, another formulation of interest uses the partial separability model [67]. This model assumes that the temporal images can be represented as linear combinations of a smaller number of temporal components. Therefore, the time series of images $x \in \mathbb{C}^{NT}$ can be stacked to form a Casorati matrix $X \in \mathbb{C}^{N \times T}$, which will be rank deficient with rank $R < T \ll N$. Then, $X$ can be expressed as the product of two matrices $U \in \mathbb{C}^{N \times R}$ and $V \in \mathbb{C}^{T \times R}$, where $U$ contains the image-domain basis and $V$ the temporal basis, as shown in Equation (6):
$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,T} \\ \vdots & \ddots & \vdots \\ x_{N,1} & \cdots & x_{N,T} \end{bmatrix} = U V^H = \begin{bmatrix} U_{1,1} & \cdots & U_{1,R} \\ \vdots & \ddots & \vdots \\ U_{N,1} & \cdots & U_{N,R} \end{bmatrix} \begin{bmatrix} V_{1,1} & \cdots & V_{1,T} \\ \vdots & \ddots & \vdots \\ V_{R,1} & \cdots & V_{R,T} \end{bmatrix}.$
The reconstruction problem becomes
$X^* = \arg\min_X \frac{1}{2} \| A(U V^H) - y \|_2^2 + \lambda R(U, V), \quad \text{s.t. } X = U V^H,$
where $A: \mathbb{C}^{N \times T} \rightarrow \mathbb{C}^{M}$ is the MRI linear measurement operator, and $R$ is the regularization function for $U$ and $V$. There are multiple options for $R$, including sparse [60] or low-rank [47,48] regularizations.
Regarding the solver for Equation (7), some works pre-compute the temporal basis V using a calibration procedure and solve only for U [56,60]. A relevant case is when we simultaneously estimate U and V, where we can use the alternating minimization procedure described in Equation (8) [47,48] and Figure 1B.
$U^{(k+1)} = \arg\min_U \frac{1}{2} \| A(U V^{(k)H}) - y \|_2^2 + \lambda R(U, V^{(k)}),$
$V^{(k+1)} = \arg\min_V \frac{1}{2} \| A(U^{(k)} V^H) - y \|_2^2 + \lambda R(U^{(k)}, V).$
One important advantage of the partial separability model is its computational efficiency. Because the rank $R$ is usually significantly lower than the dimensions of $X$, reconstructing the bases $U$ and $V$ directly is more computationally efficient, especially when the regularization function $R$ does not require the explicit formation of $X$, as demonstrated by Ong et al. [48].
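The following is a minimal sketch of the alternating structure in Equation (8), assuming a toy single-coil operator (orthonormal 2D FFT plus mask), no regularization, and plain gradient steps for each subproblem; the full Casorati matrix $X = UV^H$ is never formed explicitly, which is where the memory savings come from. All names and sizes are illustrative.

```python
import numpy as np

def A_fwd(X, mask, nx, ny):
    """Map a Casorati matrix X (N x T) to masked per-frame k-space of shape (nx, ny, T)."""
    return mask * np.fft.fft2(X.reshape(nx, ny, -1), axes=(0, 1), norm="ortho")

def A_adj(y, mask, nx, ny):
    """Adjoint of A_fwd: masked inverse FFT, returned as a Casorati matrix (N x T)."""
    return np.fft.ifft2(mask * y, axes=(0, 1), norm="ortho").reshape(nx * ny, -1)

def alternating_min(y, mask, nx, ny, rank=8, num_iters=100):
    """Unregularized alternating gradient updates for the spatial and temporal bases."""
    T = y.shape[-1]
    rng = np.random.default_rng(0)
    U = 1e-3 * (rng.standard_normal((nx * ny, rank)) + 0j)   # spatial basis
    V = 1e-3 * (rng.standard_normal((T, rank)) + 0j)         # temporal basis
    for _ in range(num_iters):
        # Update U with V fixed (gradient step; step size from the spectral norm of V).
        resid = A_adj(A_fwd(U @ V.conj().T, mask, nx, ny) - y, mask, nx, ny)
        U = U - resid @ V / (np.linalg.norm(V, 2) ** 2 + 1e-9)
        # Update V with U fixed.
        resid = A_adj(A_fwd(U @ V.conj().T, mask, nx, ny) - y, mask, nx, ny)
        V = V - resid.conj().T @ U / (np.linalg.norm(U, 2) ** 2 + 1e-9)
    return U, V

# Any single frame can be synthesized on demand without forming the full image series:
# frame_t = (U @ V[t].conj()).reshape(nx, ny)
```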

3. Deep Learning-Based Reconstruction

With the rapid adoption of machine learning in multiple areas of science, DL rapidly became a powerful tool for MR image reconstruction. While conventional compressed sensing assumes a generic sparse image prior, DL-based reconstruction methods enforce a data-driven image prior tailored for each application by learning the prior directly from historical data. Due to their success in tasks such as image segmentation and classification, the first DL-based reconstruction methods applied common architectures, such as U-Nets and residual networks [68,69,70,71]. However, soon more specialized architectures were developed.
In this section, we describe the general learning and evaluation procedures. Additionally, we review the main network architectures developed for CMR reconstruction, with the main focus on applications to cardiac cine MRI. Cardiac cine is the most widely used CMR technique in the clinic and, thus, the most commonly tackled CMR reconstruction problem [1]. We leverage the extensive bibliography of works developed for this particular application to elaborate on the most relevant network architectures. Furthermore, we emphasize a specialized type of framework denoted as unrolled networks, which are heavily based on conventional reconstructions.

3.1. Learning and Evaluation Procedure

The most common approach to learn a reconstruction network is through supervised learning. In this approach, it is ideal to have a set of fully-sampled datasets that will be used as ground truth for training the network. These datasets are split into three subsets: (1) training; (2) validation; and (3) testing, e.g., a standard split would be 70/20/10% [28]. The training subset is used to train the network (i.e., to calculate the gradients and update the network weights). The validation subset is used for hyperparameter tuning and to assess generalization. The test subset is used after training to properly assess the performance of the network and report image quality metrics relative to the fully sampled ground-truth datasets.
The training procedure aims to minimize a loss function $\mathcal{L}$ such as the one in Equation (9).
$\mathcal{L}(\Phi) = \sum_i \| G_\Phi(x_i) - m_i \|_2^2,$
where $m_i$ is the $i$-th sample of the fully sampled training subset, $x_i$ is the corresponding retrospectively undersampled image ($x_i = A^H M_i A m_i$, where $M_i$ is a k-space undersampling mask), and $G_\Phi$ is the reconstruction network parameterized by the weights $\Phi$. In this example, we used an $\ell_2$-norm loss function, but other options include the $\ell_1$-norm, SSIM (structural similarity index), and perceptual loss functions [28,72]. The loss function can be minimized using a stochastic gradient descent algorithm, such as Adam [73].
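The following is a minimal sketch of this supervised training loop, assuming single-coil data (so $A$ reduces to an orthonormal FFT), a small placeholder CNN standing in for $G_\Phi$, and synthetic ground-truth images; the dataset, network size, and undersampling pattern are illustrative assumptions.

```python
import torch
import torch.nn as nn

def undersample(m, mask):
    """Retrospective undersampling x = A^H M A m, with A an orthonormal 2D FFT."""
    return torch.fft.ifft2(mask * torch.fft.fft2(m, norm="ortho"), norm="ortho")

# A small CNN on 2-channel (real/imag) images stands in for the reconstruction network G_phi.
g_phi = nn.Sequential(
    nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 2, 3, padding=1),
)
optimizer = torch.optim.Adam(g_phi.parameters(), lr=1e-3)

for step in range(100):
    # Synthetic "fully sampled" batch m_i and a random undersampling mask M_i per sample.
    m = torch.randn(4, 128, 128, dtype=torch.complex64)
    mask = (torch.rand(4, 128, 128) < 0.3).to(torch.complex64)
    x = undersample(m, mask)

    # Stack real/imag as channels and penalize the l2 distance to the ground truth.
    x_ri = torch.stack([x.real, x.imag], dim=1)
    m_ri = torch.stack([m.real, m.imag], dim=1)
    loss = ((g_phi(x_ri) - m_ri) ** 2).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```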
The success of the training procedure and, thus, the learned reconstruction network depends highly on the quality, authenticity, and diversity of the training subset [36]. First, although k-space data are typically not readily available, advanced reconstruction networks need access to the raw k-space measurements, which provide information about the phase of the image and the coil sensitivity maps. Neglecting or simulating (e.g., from DICOM magnitude-only images) this information has led to artificially over-optimistic results [74]. Furthermore, data pre-processing steps such as normalization can help to greatly improve the training convergence. Data can be normalized using the $\ell_2$-norm of the central k-space measurements, such that all data have approximately the same scale [30,36]. Finally, a common approach to improve diversity is data augmentation. Different undersampling masks $M_i$, flipping, and transposing the images are some common transformations used for data augmentation [75].
The validation subset is used for the generalization assessment and hyperparameter tuning. During training, a validation loss can be estimated using Equation (9) on the validation subset at regular intervals. The aim is to assess how well the network has generalized and performed with unseen data. Furthermore, multiple training runs with different hyperparameter values can be performed. The performance on the validation subset between these runs can be compared to select the best hyperparameter configuration.
Finally, after training, the evaluation procedure assesses the network performance on the testing subset. In general, the performance is evaluated by comparing the reconstructed images with the fully sampled ground-truth images in the datasets through an image quality metric, such as the NRMSE (normalized root mean squared error), pSNR (peak signal-to-noise ratio), and SSIM [30].
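As a reference, the following is a minimal sketch of two of these metrics computed directly with NumPy, assuming magnitude images; SSIM is typically taken from an off-the-shelf implementation (e.g., scikit-image) and is omitted here.

```python
import numpy as np

def nrmse(recon, ref):
    """Normalized root mean squared error relative to the fully sampled reference."""
    return np.linalg.norm(recon - ref) / np.linalg.norm(ref)

def psnr(recon, ref):
    """Peak signal-to-noise ratio in dB, using the reference maximum as the peak value."""
    mse = np.mean(np.abs(recon - ref) ** 2)
    return 10.0 * np.log10(np.max(np.abs(ref)) ** 2 / mse)
```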

3.2. Unrolled Networks

Currently, the state-of-the-art methods use the unrolled network [33,34,35] framework, which is heavily based on conventional image reconstruction. These networks resemble the traditional iterative optimization algorithms by alternating between data consistency and regularization (Figure 1). The difference lies in the regularization step, where the algorithms enforce data-driven priors via a deep neural network. Next, we review some relevant CMR works that use this framework.

3.2.1. Neural Proximal Gradient Descent

The neural proximal gradient descent [35] or variational network [33] resembles the proximal gradient algorithm in Equation (2) and is described in Equation (10) and Figure 1A.
$x^{t+1} = z^t - \alpha A^H (A z^t - y),$
$z^{t+1} = D_\Phi(x^{t+1}),$
where $D_\Phi$ is a regularization step performed by a deep neural network parameterized by $\Phi$. This architecture resembles proximal gradient descent (Equation (2)) by alternating gradient steps on the data consistency term with regularization steps. In this case, instead of a proximal step, the regularization is performed by the neural network $D_\Phi$, which enforces a data-driven prior learned directly from historical data through the learning procedure (Section 3.1).
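The following is a minimal sketch of such an unrolled network, assuming a single-coil forward operator (orthonormal FFT plus mask) and a small CNN as the learned regularizer; the architecture, number of unrolls, and shapes are illustrative assumptions, not the DL-ESPIRiT or CINENet implementations discussed below.

```python
import torch
import torch.nn as nn

class RegCNN(nn.Module):
    """Learned regularization step D_phi operating on 2-channel (real/imag) images."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 2, 3, padding=1),
        )

    def forward(self, x):  # x: complex tensor of shape (B, H, W)
        out = self.net(torch.stack([x.real, x.imag], dim=1))
        return torch.complex(out[:, 0], out[:, 1])

class UnrolledProxGrad(nn.Module):
    """Alternates gradient steps on the data term with learned regularization steps."""
    def __init__(self, num_unrolls=5):
        super().__init__()
        self.regularizers = nn.ModuleList(RegCNN() for _ in range(num_unrolls))
        self.step = nn.Parameter(torch.tensor(1.0))    # learnable step size alpha

    def forward(self, y, mask):
        z = torch.fft.ifft2(mask * y, norm="ortho")    # zero-filled initialization
        for reg in self.regularizers:
            grad = torch.fft.ifft2(mask * (mask * torch.fft.fft2(z, norm="ortho") - y),
                                   norm="ortho")        # A^H (A z - y)
            x = z - self.step * grad                    # data-consistency (gradient) step
            z = reg(x)                                  # learned regularization step
        return z

# Usage sketch: recon = UnrolledProxGrad()(undersampled_kspace, sampling_mask)
```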
For CMR, Sandino et al. [37] proposed DL-ESPIRiT, an unrolled network with the neural proximal gradient architecture and an extended coil sensitivity map operator to address field-of-view (FOV) limitations. In the data consistency step, DL-ESPIRiT used an augmented forward model with multiple sets of sensitivity maps, based on ESPIRiT, to flexibly represent overlapping tissue in reduced FOV acquisitions, which are common in cine imaging. Additionally, in the regularization step, the algorithm used a novel CNN architecture with separable 2D spatial + 1D temporal convolutions to learn spatiotemporal priors more efficiently. DL-ESPIRiT showed improved agreement compared to standard PICS and conventional parallel imaging, both qualitatively (sharpness) and quantitatively through cardiac function indices (stroke volume and ejection fraction). Furthermore, in the follow-up clinical study [38], DL-ESPIRiT yielded an approximate three-fold reduced scan time ( 0.9 ± 0.3 min versus 3.0 ± 1.9 min) with a slightly lower image quality (only 0.5 less on a five-point Likert scale) than the clinical standard.
Similarly, Kustner et al. [40] used the neural proximal gradient architecture for their CINENet, a network focused on 3D cardiac cine imaging. In the regularization step, CINENet used 4D U-Nets with cascaded separable 3D spatial + 1D temporal convolutions. This novel architecture enabled the acquisition of 3D cine imaging in a single breath-hold (less than 10 s). For comparison, they performed a standard 2D multi-slice acquisition, which required multiple breath-holds and a total acquisition time of 260 s. Furthermore, the authors showed an improved image quality and contrast compared to PICS and good agreement in the LV functional parameters (end-systolic volume, end-diastolic volume, and ejection fraction) with the clinical gold-standard 2D cine imaging.

3.2.2. Model-Based Reconstruction Using Deep-Learned Priors

Another relevant method is the MOdel-based reconstruction using Deep-Learned priors (MoDL) [34], which resembles the half quadratic splitting (Equation (3)), and is described in Equation (11):
$x^{t+1} = \arg\min_x \frac{1}{2} \| A x - y \|_2^2 + \frac{\rho}{2} \| \Psi x - z^t \|_2^2,$
$z^{t+1} = D_\Phi(x^{t+1}),$
where $D_\Phi$ is, likewise, a regularization function implemented as a deep neural network. The MoDL architecture is similar to the neural proximal gradient descent, with the principal difference being the data consistency step, where a quadratic optimization problem is solved instead of performing a simple gradient step.
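The following is a minimal sketch of that data-consistency step, assuming $\Psi$ is the identity (the common MoDL formulation) and a toy single-coil operator: the quadratic subproblem $(A^H A + \rho I)x = A^H y + \rho z$ is solved with a few conjugate gradient iterations. The operator, shapes, and iteration count are illustrative.

```python
import numpy as np

def normal_op(x, mask, rho):
    """Apply A^H A + rho I for a masked orthonormal FFT forward operator."""
    return np.fft.ifft2(mask * np.fft.fft2(x, norm="ortho"), norm="ortho") + rho * x

def dc_step(y, z, mask, rho, num_cg=10):
    """Solve (A^H A + rho I) x = A^H y + rho z with conjugate gradient iterations."""
    b = np.fft.ifft2(mask * y, norm="ortho") + rho * z
    x = np.zeros_like(b)
    r = b - normal_op(x, mask, rho)
    p = r.copy()
    rs_old = np.vdot(r, r).real
    for _ in range(num_cg):
        Ap = normal_op(p, mask, rho)
        step = rs_old / np.vdot(p, Ap).real
        x = x + step * p
        r = r - step * Ap
        rs_new = np.vdot(r, r).real
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# One MoDL unroll then alternates: x = dc_step(y, z, mask, rho); z = D_phi(x)
```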
For CMR, Biswas et al. [41] introduced a framework (MoDL-SToRM) that combines the MoDL architecture with conventional image regularization penalties to reconstruct free-breathing and ungated cardiac MRI data from highly undersampled (R = 50x) multi-channel measurements. Their method incorporates the MoDL-based [34] reconstruction shown in Equation (11) into the SmooThness Regularization on Manifolds (SToRM) [76] framework, which assumes that the images in the free-breathing acquisitions lie on a smooth, low-dimensional manifold parameterized by the cardiac and respiratory phases, and which uses navigator radial spokes to capture the structure of the manifold. Consequently, the proposed training loss function synergistically combines the data consistency term, convolutional neural network (CNN) regularization prior, and SToRM prior. In particular, the CNN prior captures the local and population-generalizable dependencies, and the SToRM prior utilizes patient-specific information, including cardiac and respiratory patterns. On highly undersampled (R = 50x) simulated and prospectively acquired 2D+time datasets, the authors showed that the MoDL-SToRM framework provides more accurate reconstructions than MoDL-only and SToRM-only baselines with the same number of frames. In terms of the scan time, MoDL-SToRM reconstructions from 8.2 s of MRI data per slice are comparable to SToRM reconstructions from 42 s of MRI data per slice, resulting in a 5x reduction in scan time for the same image quality. In addition, the inference time for the proposed method is roughly 30 s for 200 frames on a P100 GPU, which is significantly faster than CS reconstructions.

3.2.3. Deep Low Rank plus Sparse

Huang et al. [42] adapted the low rank plus sparse (L+S) image model (Section 2.2) and incorporated a deep learning prior for the sparse component. The method is described in Equation (12):
$x^{k+1} = x^k - \alpha A^H (A(x_L^k + x_S^k) - y),$
$x_L^{k+1} = \mathrm{prox}_{\alpha \lambda_L \| \cdot \|_*}(x^{k+1} - x_S^k),$
$x_S^{k+1} = D_\Phi(x^{k+1}, x_L^k),$
where $D_\Phi(x^{k+1}, x_L^k)$ is a regularization deep neural network. Similarly to the original method, the image is decomposed into two components: one low rank ($x_L$) and another sparse ($x_S$). The difference is in the regularization step of the sparse component where, instead of a proximal step, a deep convolutional neural network $D_\Phi$ applies a data-driven regularization.
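The following is a minimal sketch of one such iteration, assuming a toy single-coil operator, a plain singular-value thresholding step for the low-rank component, and a small CNN for the learned sparse update; it illustrates only the structure of Equation (12) and differs from the published L+S-Net in its architectural details.

```python
import torch
import torch.nn as nn

def svt(X, tau):
    """Singular value soft-thresholding of a Casorati matrix (N x T)."""
    U, s, Vh = torch.linalg.svd(X, full_matrices=False)
    return U @ torch.diag(torch.clamp(s - tau, min=0.0)).to(X.dtype) @ Vh

class SparseCNN(nn.Module):
    """D_phi: predicts the sparse component from (x^{k+1}, x_L^k), stacked as channels."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(4, channels, 3, padding=1), nn.ReLU(),
            nn.Conv3d(channels, 2, 3, padding=1),
        )

    def forward(self, x, xL):  # complex tensors of shape (nx, ny, T)
        inp = torch.stack([x.real, x.imag, xL.real, xL.imag]).unsqueeze(0)
        out = self.net(inp)[0]
        return torch.complex(out[0], out[1])

def deep_lps_iteration(x, xL, xS, y, mask, d_phi, alpha=1.0, lam_L=0.01):
    """One unrolled iteration following the structure of Equation (12)."""
    nx, ny, T = x.shape
    # Gradient step on the data-consistency term.
    resid_k = mask * torch.fft.fft2(xL + xS, dim=(0, 1), norm="ortho") - y
    x = x - alpha * torch.fft.ifft2(mask * resid_k, dim=(0, 1), norm="ortho")
    # Low-rank update via singular value thresholding of the Casorati matrix.
    xL_new = svt((x - xS).reshape(nx * ny, T), alpha * lam_L).reshape(nx, ny, T)
    # Learned sparse update, which takes x^{k+1} and the previous x_L^k as inputs.
    xS = d_phi(x, xL)
    return x, xL_new, xS
```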
The resulting network, L+S-Net, achieved improved image quality in terms of the NRMSE, pSNR, and SSIM when retrospectively evaluated against the conventional L+S method. Additionally, L+S-Net showed improved image quality, i.e., reduced streaking artifacts and spatial blurring, when prospectively evaluated.

3.2.4. Deep Subspace Learning

Sandino et al. [49] leveraged the computational efficiency of the partial separability model (Section 2.3 and Figure 1B) to propose DL-Subspace, an unrolled network focused on reducing memory usage and reconstruction time. In the partial separability model, instead of solving directly for image x, we aim to solve for the lower-dimensional subspace of image x, i.e., the spatial basis U and temporal basis V. Sandino et al. [49] incorporated deep learning priors to this model by using the MoDL framework. The simplified algorithm is presented in Equation (13).
$U^{(k+0.5)} = \arg\min_U \| y - A(U V^{(k)H}) \|_F^2 + \lambda_u \| U^{(k)} - U \|_F^2,$
$U^{(k+1)} = D_{\Phi_u}(U^{(k+0.5)}),$
$V^{(k+0.5)} = \arg\min_V \| y - A(U^{(k+1)} V^H) \|_F^2 + \lambda_v \| V^{(k)} - V \|_F^2,$
$V^{(k+1)} = D_{\Phi_v}(V^{(k+0.5)}),$
where $D_{\Phi_u}$ and $D_{\Phi_v}$ are 2D and 1D CNNs that denoise the spatial and temporal basis, respectively. Equation (8a) has been divided into a MoDL data consistency step (Equation (13a)) and a data-driven regularization step (Equation (13b)). Similarly, Equation (8b) has been divided into data consistency (Equation (13c)) and regularization (Equation (13d)). The DL-Subspace network showed improved computational efficiency in terms of the memory usage and inference speed versus MoDL [34] and DL-ESPIRiT [77]. Additionally, DL-Subspace outperformed the other networks in terms of the SSIM and PSNR with respect to the fully sampled reference.
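The following is a minimal sketch of this alternating structure, assuming a toy single-coil operator, a few gradient steps as an approximation of each quadratic data-consistency subproblem (rather than an exact least-squares solve), and arbitrary callables standing in for the denoisers $D_{\Phi_u}$ and $D_{\Phi_v}$; everything here is illustrative.

```python
import numpy as np

def A_fwd(X, mask, nx, ny):
    """Casorati matrix X (N x T) -> masked per-frame k-space of shape (nx, ny, T)."""
    return mask * np.fft.fft2(X.reshape(nx, ny, -1), axes=(0, 1), norm="ortho")

def A_adj(y, mask, nx, ny):
    """Adjoint of A_fwd, returned as a Casorati matrix (N x T)."""
    return np.fft.ifft2(mask * y, axes=(0, 1), norm="ortho").reshape(nx * ny, -1)

def subspace_unroll(y, mask, nx, ny, U, V, D_u, D_v,
                    lam_u=0.1, lam_v=0.1, num_unrolls=5, num_grad=5):
    """Alternate approximate data-consistency solves with learned denoising of U and V."""
    for _ in range(num_unrolls):
        U_prev, V_prev = U, V
        for _ in range(num_grad):   # approximate Equation (13a) with gradient steps
            resid = A_adj(A_fwd(U @ V.conj().T, mask, nx, ny) - y, mask, nx, ny)
            grad = resid @ V + lam_u * (U - U_prev)
            U = U - grad / (np.linalg.norm(V, 2) ** 2 + lam_u)
        U = D_u(U)                  # Equation (13b): spatial-basis denoiser
        for _ in range(num_grad):   # approximate Equation (13c) with gradient steps
            resid = A_adj(A_fwd(U @ V.conj().T, mask, nx, ny) - y, mask, nx, ny)
            grad = resid.conj().T @ U + lam_v * (V - V_prev)
            V = V - grad / (np.linalg.norm(U, 2) ** 2 + lam_v)
        V = D_v(V)                  # Equation (13d): temporal-basis denoiser
    return U, V

# For a quick test, identity "denoisers" can be passed in: D_u = D_v = lambda B: B
```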

3.3. Other Networks

Exploiting the spatiotemporal correlations of dynamic signals is crucial for achieving higher accelerations in cardiac imaging. Manifold learning is a useful technique for leveraging such correlations; it assumes that dynamic signals in a high-dimensional space are neighboring points on smooth, low-dimensional manifolds. Manifold-Net [78] is a method based on manifold learning that introduces a deep learning model to capture the temporal redundancy in reconstructing dynamic cardiac data from undersampled measurements. Specifically, it introduces a fixed, low-rank tensor manifold for temporal correlations, then models the dynamic reconstruction as a general CS-based optimization problem on the manifold. This is solved via Riemannian optimization, and the iterations of the optimization are unrolled into a neural network. This unrolled implementation is useful for reducing the computation time and eliminating the need for ad hoc parameter selection. Manifold-Net was demonstrated to yield improved quantitative measurements in terms of the mean squared error (MSE), PSNR, and SSIM for up to a 12-fold acceleration of cardiac data acquired with retrospectively electrocardiogram-gated, segmented, breath-held balanced steady-state free precession (bSSFP). The authors also report that Manifold-Net preserves its high reconstruction accuracy over different sampling masks based on pseudo-radial, pseudo-spiral, and VISTA [79] patterns, and over limited training data.
Another approach that allows spatial resolution improvements without lengthening scan times is super resolution. Super resolution (SR) refers to the recovery of high-resolution images from their low-resolution counterparts. Traditional image upscaling approaches include bicubic interpolation and zero padding in the Fourier domain, which achieves sinc interpolation [80,81]. Recent advances in DL have also shown improvements in SR techniques. In particular, convolutional neural network-based SR has demonstrated a superior quality over traditional methods in computer vision-related fields [82,83,84,85]. SR has the potential to reduce the scan time and/or increase the temporal resolution for CMR.
Evan et al. proposed a DL SR method [86] to enhance the spatial detail of cardiac MRI from low-resolution acquisitions that are faster to acquire. For training, the authors used retrospectively collected short-axis bSSFP cine images from 400 examinations, from both 1.5T and 3T scanners. These data were treated as the ground truth, and only a portion of the central k-space was retained to simulate low-resolution acquisitions. They used a shallow (SRNet) and a deeper (modified U-Net [87]) neural network in single-frame (k) and multi-frame (kt) SR settings with an L1 + SSIM hybrid loss function. The single-frame networks had 2D convolutions, whereas the multi-frame networks consisted of 3D convolutions with the temporal domain in the third dimension. Retrospective evaluations across a wide range of upsampling factors demonstrated that all of their CNN methods quantitatively and qualitatively outperformed the conventional zero-padding and bicubic interpolation strategies. The authors also tested their SR approach on prospectively acquired scans and reported LV end-diastolic volume, LV end-systolic volume, LV stroke volume, and LV ejection fraction values comparable to those from full-resolution images, while visually improving anatomic details.
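As an illustration of this training-pair generation step, the following is a minimal sketch of simulating a low-resolution input from a high-resolution cine frame by retaining only the central portion of k-space; the crop fraction and the choice to return a magnitude image are illustrative assumptions, not the choices of the cited work.

```python
import numpy as np

def simulate_low_res(image, keep_fraction=0.5):
    """Zero out peripheral k-space and return the resulting low-resolution image."""
    k = np.fft.fftshift(np.fft.fft2(image, norm="ortho"))
    ny, nx = k.shape
    cy, cx = int(ny * keep_fraction / 2), int(nx * keep_fraction / 2)
    mask = np.zeros_like(k)
    mask[ny // 2 - cy: ny // 2 + cy, nx // 2 - cx: nx // 2 + cx] = 1.0
    return np.abs(np.fft.ifft2(np.fft.ifftshift(mask * k), norm="ortho"))

# Training pairs: (simulate_low_res(frame), frame) for each ground-truth cine frame.
```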
Baty and Grau presented another SR approach [88] that learns the mapping between low-resolution dynamic short-axis images and high-resolution long-axis views. They used a modified U-Net [87] architecture with convolutional long short-term memory layers at the first and second levels to take the dynamic aspect of the cine sequence into account via the high spatial frequency features across multiple time steps. Their results demonstrated that their proposed network yields improved quantitative and qualitative results compared to the standard static U-Net baseline and the traditional interpolation approach, highlighting the importance of temporal context for the super resolution of cardiac cine MRI sequences.
An alternative approach for DL SR has been proposed by Zhao et al., denoted as Synthetic Multi-Orientation Resolution Enhancement (SMORE) [89]. SMORE is a self-supervised model for MRI acquisitions with thick slices. This corresponds to images acquired with a high in-plane resolution but a poor through-plane resolution, which generally enables a reasonable acquisition time. The SMORE model is based on the idea that high-resolution in-plane slices can be used to compile paired low-resolution/high-resolution training data by down-sampling in a self-supervised fashion. For this purpose, SMORE applies a k-space filter h(x), consisting of a rect filter combined with an anti-ringing filter, to the in-plane slices to simulate the low through-plane resolution. The generated low-resolution/high-resolution data are then used to train the self super-resolution network. The authors demonstrate SMORE on cardiac MRI to perform super resolution from a 1.1 × 1.1 × 5 mm³ to an isotropic 1.1 mm³ resolution and to explore the visualization of myocardial scarring from cardiac LV remodeling after myocardial infarction. The SMORE method yields enhanced scar visualization with improved image clarity.

4. CMR-Specific Challenges

Due to the particular characteristics of CMR, DL-based reconstructions present some specific challenges. For example, CMR images have higher dimensionality (spatial plus temporal dimensions), which increases the computational cost of the reconstruction. Additionally, due to the same increased dimensionality, some CMR techniques have scarce or unavailable access to fully sampled ground-truth datasets for training. In this section, we review some works specifically developed to tackle these problems.

4.1. Increased Dimensionality

Deep unrolling algorithms for reconstruction involve multiple iterations of a neural network that is typically trained end-to-end using backpropagation. However, this approach requires substantial memory and computational power to calculate gradients and store intermediate activations. The requirements also increase linearly with the measurement count and the number of network layers. High-dimensional reconstruction tasks are particularly challenging: their exceptionally large memory requirements force a reduction in the number of unrolled iterations at the expense of accuracy. Memory-efficient training frameworks aim to address the computational challenges of deep unrolling algorithms by exploring the trade-offs between memory usage and performance. These methods can take various forms, such as constrained network families that enable reduced memory usage or modified training strategies that limit the amount of memory occupied by the network at any given time.
An important approach to address the limitations of large-scale DL reconstruction is Memory-Efficient Learning (MEL) [44]. MEL treats the full graph of the unrolled implementation as a series of smaller sequential graphs, which allows the network to be forward propagated to obtain the final output without computing the network gradients. MEL then uses invertible CNN models across the unrolls to recompute each smaller auto-differentiation graph from the network’s output in the reverse order. This approach allows MEL to only store a single layer in memory at a time, reducing the required memory by a factor of N. Notably, the computation required to invert each layer only marginally increases the backpropagation run time. MEL has been trained and evaluated on retrospectively and prospectively undersampled 3D knee and 2D+time cardiac cine MRI. It has been shown to improve quantitative metrics and reconstruct finer details with sharper edges and more continuous textures with a higher overall image quality for both 2D+time and 3D+time cardiac cine MRI. These performance improvements were achieved using a single 12 GB GPU.
Greedy LEarning for Accelerated MRI (GLEAM) [43] is another memory-efficient learning algorithm that splits the reconstruction network into independent modules and performs gradient-based optimization for each decoupled module in a greedy manner. Each module computes its own forward pass, updates its gradients based on a local loss, and passes the output to the next module. The intermediate activations are then discarded from the GPU memory. This allows GLEAM to only store a single module in memory at a time, reducing the GPU memory requirements to the amount required by a single module. Additionally, GLEAM can be extended to a distributed setting, enabling gradient updates for each module to be performed on different GPUs in parallel to speed up training. Unlike invertible networks [44,90,91], GLEAM does not suffer from numerical instabilities when there are a large number of layers because it decouples each module and relies on backpropagation instead of inverting individual modules. GLEAM has been evaluated on multi-coil knee, brain, and dynamic cardiac cine MRI datasets with multiple reconstruction network architectures. It has been shown to improve both qualitative and quantitative metrics over end-to-end learning baselines while also reducing the memory footprint and speeding up training compared to existing memory-efficient learning methods. On a 2D+time cardiac cine dataset, GLEAM was able to achieve a 1.8 dB pSNR improvement compared to traditional end-to-end learning with backpropagation by allowing three times as many unrolled iterations.
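The following is a minimal sketch of greedy module-wise training in the spirit of GLEAM, assuming placeholder modules, synthetic data, and a local $\ell_2$ loss: each module has its own optimizer, and activations are detached between modules so that only one module's computation graph is held in memory at a time. It is not the published implementation.

```python
import torch
import torch.nn as nn

# Placeholder reconstruction modules; a real network would use unrolled blocks with
# data-consistency layers rather than plain convolutions.
modules = nn.ModuleList(
    nn.Sequential(nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(32, 2, 3, padding=1))
    for _ in range(3)
)
optimizers = [torch.optim.Adam(m.parameters(), lr=1e-3) for m in modules]

for step in range(100):
    x = torch.randn(4, 2, 64, 64)        # stand-in for a zero-filled input batch
    target = torch.randn(4, 2, 64, 64)   # stand-in for the fully sampled reference

    for module, opt in zip(modules, optimizers):
        out = module(x)
        loss = ((out - target) ** 2).mean()   # local loss for this module only
        opt.zero_grad()
        loss.backward()
        opt.step()
        x = out.detach()                      # free this module's graph before moving on
```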

4.2. Limited Training Data

As described in the previous sections, supervised learning approaches have been successfully employed for the fast and high-quality reconstruction of undersampled MRI. One drawback of these methods is the reliance on fully sampled ground-truth data in the training datasets. This dependence poses a bigger problem for dynamic MRI as fully sampled data are typically unavailable in settings like cardiac imaging due to scan time constraints. Therefore, machine learning techniques that remove or reduce the requirement for fully sampled data are crucial for the success of DL-based reconstruction in dynamic MRI applications.
Deep Image Prior (DIP) [45] is an approach that solves inverse problems such as image denoising, inpainting, and SR in an unsupervised fashion (i.e., without the need of labeled training data). DIP optimizes the parameters of an untrained neural network that maps a fixed latent vector to the measurements. Note that DIP does not need a pre-trained network or an image database [45] and relies on the implicit bias of the CNN architecture. Yoo et al. proposed a Time-Dependent Deep Image Prior (TD-DIP) [92] approach that adapts the DIP method to dynamic MRI by optimizing an untrained neural network that maps the sequence of latent vectors to radial spoke-shared measurements. In addition, the authors constrain the latent vectors to a one-dimensional manifold so that the CNN learns to encode the temporal variations in the cine images into the spatial closeness of the latent samples. In this way, the temporal dependencies of the dynamic measurements, such as the nearly periodic cardiac motion, can be encoded more effectively. In order to provide more flexibility, the authors also include a Mapping Network (MapNet) that maps the fixed manifold into a more expressive latent space. They demonstrated that the TD-DIP method qualitatively and quantitatively outperforms the compressed sensing baselines with experiments on retrospective and prospective cardiac datasets. Because the TD-DIP approach captures the temporal dynamics with the manifold design, it also enables continuous dynamic reconstruction in the sense that intra-frame images can be recovered by traversing between consecutive latent vectors in the manifold.
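The following is a minimal sketch of the underlying DIP idea for reconstruction, assuming a toy single-coil operator and a small placeholder CNN: the weights of an untrained network mapping a fixed latent code to an image are optimized so that the simulated k-space matches the acquired measurements. TD-DIP additionally replaces the single latent code with a sequence of codes on a learned temporal manifold, which is not shown here.

```python
import torch
import torch.nn as nn

net = nn.Sequential(                      # untrained generator network
    nn.Conv2d(16, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 2, 3, padding=1),
)
z = torch.randn(1, 16, 128, 128)                   # fixed latent code (not optimized)
y = torch.randn(128, 128, dtype=torch.complex64)   # stand-in for acquired undersampled k-space
mask = (torch.rand(128, 128) < 0.3).to(torch.complex64)

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
for it in range(500):
    out = net(z)[0]
    image = torch.complex(out[0], out[1])
    k_pred = mask * torch.fft.fft2(image, norm="ortho")
    # Only a data-consistency loss is used; the network architecture acts as the prior.
    loss = (torch.abs(k_pred - mask * y) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```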
Based on the success of recent self-supervised techniques for static MRI reconstruction without fully sampled ground-truth data, Acar et al. introduced a self-supervised learning framework for training deep architectures for dynamic cardiac MRI reconstruction [46]. The framework is depicted in Figure 2. Given that the model has no access to a fully sampled k-space, this framework splits the undersampled k-space data into two non-overlapping segments and uses the first segment to enforce data consistency and the second segment to learn the network weights. Their systematic assessment of models of varying complexity on an open-access multi-coil k-space dataset for cardiovascular MRI demonstrated that a compact unrolled CNN architecture with bidirectional recurrent connections exhibits enhanced robustness.
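The following is a minimal sketch of the k-space splitting step described above, assuming a Boolean sampling mask and an arbitrary split ratio; the reconstruction model and loss are only indicated in the comments.

```python
import torch

def split_mask(mask, holdout_fraction=0.4, seed=0):
    """Partition a Boolean undersampling mask into disjoint data-consistency and loss masks."""
    g = torch.Generator().manual_seed(seed)
    coin = torch.rand(mask.shape, generator=g) < holdout_fraction
    loss_mask = mask & coin        # held-out locations used only to compute the loss
    dc_mask = mask & ~coin         # remaining locations fed to the network
    return dc_mask, loss_mask

# Self-supervised training then proceeds roughly as:
#   recon  = model(dc_mask * y, dc_mask)                     # data consistency on the first set
#   k_pred = forward_operator(recon)
#   loss   = ((loss_mask * (k_pred - y)).abs() ** 2).sum()   # error only on the held-out set
```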

5. Application-Specific Methods

5.1. Cardiovascular Blood Velocity and Flow Quantification

Phase contrast MRI (PC-MRI) provides quantitative measurements of the blood velocity and flow throughout the cardiovascular system. Two-dimensional PC-MRI (2D-Flow) is widely considered to be the clinical gold standard [93,94] for the non-invasive and time-resolved quantification of the blood velocity (cm/s) and flow (mL/beat) at a single slice location, whereas 3D+time PC-MRI (4D-Flow) is an increasingly impactful clinical tool used to visualize the blood velocity and flow patterns in a time-resolved 3D volume encompassing the entire chest. Both 2D-Flow and 4D-Flow provide clinicians with important hemodynamic parameters that aid in the diagnosis of many cardiovascular diseases, including valvular stenosis and pressure gradients in addition to regurgitation, right/left-lung flow splits, and shunts (flow). However, the spatiotemporal requirements for the accurate and precise measurements of the blood velocity and flow place an upper bound on the scan time [95,96].
DL-based reconstruction methods for both 2D- and 4D-Flow have recently emerged as a promising alternative; they aim to drastically reduce the scan time without the long reconstruction times, typical of most CS techniques, that challenge the clinical workflow. Table 1 provides an overview of the currently published literature on DL-based reconstruction methods for 2D- and 4D-Flow.
Vishnevskiy et al. were the first to describe a DL-based reconstruction for the undersampled 4D-Flow [97]. They trained a deep variational neural network [33] using healthy subjects (n = 11). They retrospectively undersampled one patient dataset and prospectively undersampled seven volunteer datasets using a Cartesian pseudo-radial golden-angle sampling pattern with 12.4 ≤ R ≤ 13.8 [104]. Compared to R = 2 PI, their prospectively deployed DL-based reconstruction provided comparable measurements of peak velocity (bias = 1.6%, limits of agreement (LOA) = [−11.2, 8.1]%), and peak flow (0.05%, [−9.8, 9.7]%) in the ascending aorta (aAo) while also demonstrating a reduced bias and tighter limits of agreement (LOA) compared to the CS [105]. A representative patient dataset is included in Figure 3, highlighting the improvement in their network over the CS.
Haji-Valizadeh et al. were the first to describe a DL-based reconstruction for the real-time 2D-Flow [98]. They trained a 3D U-Net using synthetic real-time 2D-Flow datasets, generated from 510 ECG-gated breath-hold patient datasets, to separately filter aliasing artifacts from radial real-time velocity-compensated and complex-difference images. They prospectively undersampled 21 patient datasets and compared the ECG-gated 2D-Flow and the proposed real-time 2D-Flow sequence with radial k-space sampling for R = 28.8. Compared to the ECG-gated 2D-Flow, their prospectively deployed real-time radial 2D-Flow sequence with a DL-based reconstruction provided comparable measurements of the cardiac output (−0.8 [−1.7, 0.1] L/min), stroke volume (−9.5 [−24.2, 5.3] mL), and peak mean velocity (−4.4 [−19.2, 10.4] cm/s) in the aAo while also demonstrating a reduced bias and tighter LOA compared to the CS [106].
Cole et al. were the first to investigate the impact of complex-valued CNNs for the reconstruction of an undersampled 2D-Flow [99]. They trained two U-Net networks using end-to-end complex-valued CNNs as well as real-valued CNNs, using 180 fully sampled 2D-Flow pediatric patient datasets. They retrospectively undersampled 44 patient datasets, using variable density pseudo-random Poisson-disc undersampling for R ≤ 6. Compared to the real-valued CNN, the complex-valued CNN provided a higher PSNR (33.5 ± 5.9 vs. 32.3 ± 5.5), lower NRMSE (0.21 ± 0.1 vs. 0.22 ± 0.08), and higher SSIM (0.88 ± 0.08 vs. 0.86 ± 0.08) in addition to reducing errors in the measurement of the flow (6.3% vs. 7.2%) and peak velocity (5.3% vs. 28.2%).
Jaubert et al. proposed a DL-based reconstruction using complex-valued images for a spiral real-time 2D-Flow [100]. They trained a 3D U-Net using synthetic real-time spiral 2D-Flow datasets, generated from 520 ECG-gated breath-hold patient datasets. They prospectively undersampled 20 patient datasets and compared the ECG-gated 2D-Flow and the proposed real-time 2D-Flow sequence with uniform density spiral undersampling for R = 18. Compared to the ECG-gated 2D-Flow with R = 2 PI, their prospectively deployed real-time spiral 2D-Flow sequence with DL-based reconstruction provided comparable measurements of the stroke volume (−0.8 [−10.5, 8.9] mL) and peak mean velocity (−2.5 [−13.7, 8.7] cm/s) in the aAo while also demonstrating a reduced bias and tighter LOA for the stroke volume, but not the peak mean velocity, compared to the CS [107]. They later expanded their original work to demonstrate real-time 2D-Flow measurements, including visualization on the scanner of the beat-to-beat heart rate, stroke volume, and cardiac output with a mean latency of 622 ms [101]. Importantly, no significant differences were observed relative to the reference R = 2 PI method at rest (−0.21 ± 0.50 L/min, p = 0.25) or to the CS at peak exercise (0.12 ± 0.48 L/min, p = 0.46). A representative patient dataset is included in Figure 4, highlighting the improvement in their U-Net over CS.
Oscanoa et al. demonstrated a DL-based reconstruction for a highly undersampled 2D-Flow, which made use of a complex difference estimation to promote image sparsity [39]. They trained an unrolled network using 155 fully sampled 2D-Flow pediatric patient datasets. They prospectively undersampled five patient datasets and compared R = 2 PI with variable density pseudo-random Poisson-disc undersampling for R ≤ 8. Compared to R = 2 PI, their prospectively deployed sequence with DL-based reconstruction provided comparable measurements of the peak velocity (−3.6 [−5.6, 4.5]%) and total flow (6.1 [−11.6, 19.4]%) in the aAo and pulmonary artery (PA) while also demonstrating a reduced bias and tighter LOA for acceleration rates up to 8x, compared to the CS [108].
Kim et al. demonstrated a DL-based reconstruction for the 4D-Flow that produced three directional velocities from only three acquired flow encodings, without requiring the conventional fourth reference scan to measure the background phase [102]. They used the complex images from four velocity encodings (flow-compensated reference plus x, y, and z velocities) of 144 neurovascular patient datasets to train a 3D U-Net to predict the x, y, and z velocities. They took 14 3D radial undersampled 4D-Flow patient datasets and compared a conventional 4-point to the proposed DL-based 3-point reconstruction with the native undersampling and also with further undersampling applied retrospectively using R ≤ 6. Without additional undersampling, the DL-based 3-point reconstruction provided comparable measurements of the total flow (R² = 0.98, slope = 0.95, p < 0.001, 95%–CIs = [0.93, 0.97]) and peak velocity (R² = 0.98, slope = 0.96, p < 0.001, 95%–CIs = [0.93, 0.99]) compared to the 4-point reconstruction, across several notable neurovascular arteries. With R = 6 additional undersampling, the 3-point DL-based recon underestimated the velocities, relative to the 4-point reconstruction, by about 3.1% of the prescribed velocity encoding strength.
Nath et al. demonstrated DL-based residual attention reconstruction for the 4D-Flow [103]. They trained 2D U-Nets separately for each 4D-Flow velocity encoding direction using 2D complex image slices extracted from the 4D-Flow volume at each temporal frame and slice position, using 18 fully sampled volunteer datasets. They retrospectively undersampled two fully sampled patient datasets using variable density pseudo-random Poisson-disc undersampling for R = 2.5, 3.3, and 5. Compared to the fully sampled datasets, their proposed DL-based reconstruction with R = 2.5 provided comparable measurements of the peak velocity (10.3 ± 5.1 cm/s) and total flow (18.2 ± 4.1 mL/s), with larger errors for R = 3.3 and R = 5, but did demonstrate a reduced bias and tighter LOA compared to the three state-of-the-art DL-based reconstructions and CS [109] across all undersampling factors.

5.2. Late Gadolinium Enhancement

Late gadolinium enhancement (LGE) imaging is a technique that measures contrast agent extravasation into the myocardium and is considered the gold standard for diagnosing myocardial scar tissue. Accelerating LGE acquisitions is particularly important, as there is a limited time window to capture an image before the contrast dynamics change and decrease the diagnostic quality of the images acquired. Recent DL reconstruction works have focused on providing whole heart 3D volumes with significant undersampling to provide full-heart high-resolution scar characterization in a clinically viable scan time.
El-Rewaidy et al. developed a U-Net-based method for reconstructing 3D LGE volumes from Cartesian undersampled datasets [110]. The network presented used fully complex convolutions rather than real-valued convolutions with real and imaginary channels. Additionally, a batch normalization layer was used that better preserved the magnitude and phase relationships in the complex data. The image quality was scored by expert reviewers, where the differences between the complex-valued network and a reference CS reconstruction were insignificant. In the same comparison, the real-valued network was found to score significantly worse. The scar extent was also compared between the reference CS reconstruction and the complex-valued network, where a very good correlation (R² > 0.97) and very small bias were seen between the methods. The neural networks reconstructed images over 300× faster than the reference CS reconstruction. The work showed that DL reconstruction networks could be applied to significantly speed up undersampled reconstructions without loss in the image quality or diagnostic applicability (Figure 5).
Yaman et al. designed a DL reconstruction method using physics-based networks to reconstruct 3D LGE images with random Cartesian undersampling [111]. The method used an unrolled network that combined data consistency blocks and 3D convolutional blocks for regularization. The network was trained in a self-supervised manner, so no fully sampled data were needed. This was achieved by splitting the k-space data into two sets and using one set for the data consistency and the other for a loss function requiring the k-space data to match the left-out set. Based on clinical reviewer scoring, the method allowed for better image quality, with undersampling ratios up to R = 6 when compared to a standard CS reconstruction.
Van der Velde et al. investigated the effect of a DL reconstruction on LGE image quality, as well as on the performance of automatic scar segmentation methods [112]. A CNN-based network architecture was used with a tunable denoising parameter, from 0 to 100%. The method was tested on 2D LGE images with different noise reduction factors, where it was seen that the DL recon significantly improved the image quality based on expert scoring. The scar size was also quantified, where it was observed that standard deviation (SD)-based automatic segmentation methods overestimated the scar size when compared to manual segmentation. This was attributed to the new noise characteristics from the recon changing the thresholding levels when based on the SD. Different segmentation algorithms, such as a full width at half maximum method, did not show these same biases with the DL recon data.

5.3. Tissue Characterization

Cardiac tissue characterization generally involves dynamic imaging in combination with RF events to characterize tissue by T1 and T2. The resulting T1 and T2 values have been shown to be clinically important biomarkers for a range of cardiac pathologies. Acceleration is important to obtain the additional temporal information needed to fit T1 and T2 and to obtain whole heart coverage in a reasonable amount of time. Because these techniques can span three independent temporal domains (cardiac, respiratory, and T1/T2), the datasets can be quite large and difficult to train networks on properly. Different groups have also experimented with directly estimating T1 (or T2) values from the DL network, rather than using the network to reconstruct images followed by conventional T1 (or T2) fitting.
In a study by Chen et al., DL reconstructions were used to reconstruct large high-dimensional (5D) datasets containing 2D images with cardiac, respiratory, and inversion time dimensions [113]. To deal with the large data size and GPU memory limitations, the network operates on a lower-dimensional temporal feature space, using techniques similar to those in low-rank CS models. A densely connected network was used in this work and compared to a U-Net architecture and a traditional CS-based reconstruction that was considered the reference standard. The proposed network showed very little bias relative to the CS reconstruction, with limits of agreement in the range of 50 ms.
Jeelani et al. designed a reconstruction method called DeepT1 that contains a recurrent CNN for reconstructing images, followed by a U-Net applied to the output images to obtain T1 maps rather than using traditional fitting [114]. This was tested using a 2D gated sequence with eight modified Look-Locker inversion recovery times, in pre- and post-contrast acquisitions. The DL methods were compared to fully sampled images, where good agreement was seen with the recurrent CNN image reconstruction and even better agreement when using the U-Net to generate T1 maps from the images (Figure 6).
MR fingerprinting is another technique used to measure T1 and T2 in the heart, which uses a series of highly undersampled images with pseudo-random flip angles to solve for relaxation times. Hamilton et al. designed a method for reconstructing T1 and T2 values from fingerprinting images to speed up the solving process [115]. They used a densely connected network with pixel-wise signal curves as input in order to output both T1 and T2. The method was trained on Bloch simulations and tested against a standard dictionary-based solver. They found that both methods achieved similar results, while the DL method ran in less than a second, compared to minutes for the conventional solver.

6. Pitfalls and Future Outlook

DL-based reconstructions have demonstrated the capacity to accelerate MRI acquisitions, either with higher acceleration rates or with higher accuracy than CS. DL-based MRI reconstructions have shown the potential to enable more comprehensive and detailed examinations. However, these techniques present several pitfalls and challenges that need to be considered prior to their widespread clinical adoption. We highlight challenges related to robustness and generalizability (instabilities and limited interpretability) and to clinical deployment (performance gaps, downstream tasks, and increased computational costs).

6.1. Instabilities and Hallucinations

DL-based reconstruction networks are potentially subject to instabilities and hallucinations. Antun et al. [116] showed that adding small, almost undetectable perturbations in the acquired k-space could introduce severe artifacts in the reconstructed images. Furthermore, the authors showed that small structural changes in the image, such as those arising from tumors, may not be captured in the reconstructed image. Similarly, Darestani et al. [117] studied instabilities and hallucinations and showed that even though the reconstruction networks are prone to instabilities and hallucinations, more traditional methods such as $\ell_1$-norm minimization are sensitive to such problems as well. In fact, the authors showed that the reconstruction networks are actually more robust to these phenomena than traditional methods. Approaches to studying, identifying, and minimizing instabilities and hallucinations are a very active and critical area of research that will require collaboration between scientists and clinicians.

6.2. Interpretability

Instabilities and hallucinations are a major concern for deep learning methods, mainly because their inner workings are still not fully understood. While CS has a solid mathematical basis that assures robustness [31,32], the interpretability of deep neural networks is currently an active area of research. An interesting approach has used convex duality to assess shallow networks [118,119]. Additionally, Sahiner et al. [120] extended this framework to assess CNN-based image reconstruction networks. These works use convex duality to reformulate the non-convex training problem as convex in a higher-dimensional space. The alternative formulation offers convergence guarantees to the global optimum and, thus, improved robustness. Furthermore, this formulation provides further insight and interpretability. In the convex formulation, the authors found that weight decay regularization during training induced path sparsity regularization (i.e., the network enforces that the least possible number of filters are used). Additionally, once trained, the authors found that the network performs inference as a piece-wise linear filtering procedure. In other words, based on the input, the trained network will choose which filters to use (a non-linear procedure) and then perform conventional filtering (a linear procedure). Interpretability of reconstruction networks is an important area of research that should be given further attention. Understanding the inner workings of reconstruction networks could potentially provide insight into the cause of instabilities and hallucinations, and into possible ways to prevent them.

6.3. Performance Gaps

DL-based reconstruction methods are able to outperform conventional methods by learning tailored priors directly from data. Nevertheless, once deployed in the clinic, these methods can sometimes present a performance drop. Researchers have assessed this phenomenon and reported two main reasons: (1) the mishandling of publicly available data and (2) distribution shifts in the data. Next, we elaborate on both topics.
Training datasets that are made publicly available are vital for continued DL-based reconstruction research. However, the growing availability of these public datasets may sometimes lead to “off-label” usage, where data published for one task are used for another. Shimron et al. [74] assessed two cases where data mishandling resulted in biased or overly optimistic results, namely when the k-space data have been zero-padded or when the data have been JPEG-compressed.
Likewise, DL-based methods are prone to performance drops due to distribution shifts. These methods are usually evaluated on in-distribution data: researchers divide a single dataset into training and test subsets, train the network on the former, and evaluate performance on the latter. In clinical practice, however, performance is often evaluated on data from a different distribution. For example, the training and test datasets may come from different scanners or different sequences, or may simply have different noise levels [117]. Any of these distribution shifts may degrade the performance of the reconstruction network.
Darestani et al. [121] denoted this performance drop the “distribution shift performance gap”. Furthermore, the authors proposed a domain adaptation method that reduced the performance gap by 87–99%. The method has two steps: first, during training, a self-supervised loss is added to the conventional supervised loss; second, during inference, extra self-supervised optimization, denoted “test-time training”, is performed on the test measurement.
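A minimal sketch of the test-time training idea is shown below, assuming a pretrained network recon_net, an undersampling mask, and a forward operator forward_op that maps an image back to multicoil k-space; the names, the loss, and the optimizer settings are our own illustrative choices rather than the exact procedure of [121].

import torch

def test_time_training(recon_net, kspace, mask, forward_op, steps=50, lr=1e-5):
    """Adapt a pretrained reconstruction network to a single test scan using only a
    self-supervised data-consistency loss (no ground-truth image is required)."""
    recon_net.train()
    optimizer = torch.optim.Adam(recon_net.parameters(), lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        image = recon_net(kspace)                                 # current reconstruction
        loss = torch.norm(mask * (forward_op(image) - kspace))    # consistency with measured samples
        loss.backward()
        optimizer.step()
    recon_net.eval()
    with torch.no_grad():
        return recon_net(kspace)                                  # reconstruction after adaptation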

6.4. Downstream Tasks

Advanced DL-based reconstruction techniques that enable fast acquisitions and reconstructions will have a significant clinical impact on the diagnosis and treatment of numerous cardiovascular diseases. However, these reconstruction methods should not overlook the downstream clinical processes. For example, when the downstream task is a quantitative analysis, such as in flow imaging, the need for speed should not come at the expense of compromised accuracy and precision in the estimation of the hemodynamic parameters of interest. Novel undersampling and DL-based reconstruction strategies should not select the undersampling rate arbitrarily, but rather choose the maximum achievable undersampling rate that still maintains measurement accuracy and precision. Likewise, in the case of a downstream segmentation task, researchers should ensure compatibility with the desired segmentation method (i.e., verify whether the DL-based reconstruction produces images that affect the segmentation process). Furthermore, some works [122,123] have developed DL methods that synergistically combine the reconstruction and segmentation tasks to the benefit of both.
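As a concrete illustration of the rate-selection principle, the sketch below sweeps candidate undersampling rates and retains the highest rate whose downstream flow measurement stays within a relative-error tolerance of the fully sampled reference. The simple sampling pattern, the 5% tolerance, and the reconstruct and measure_flow placeholders are illustrative assumptions that would be replaced by a site-specific pipeline.

import numpy as np

def uniform_mask(shape, r):
    """Keep every r-th phase-encode line (a deliberately simple sampling pattern)."""
    mask = np.zeros(shape)
    mask[::r, ...] = 1.0
    return mask

def select_max_undersampling(kspace_full, reference_flow, reconstruct, measure_flow,
                             rates=(2, 4, 6, 8, 10, 12), tolerance=0.05):
    """Return the highest undersampling rate whose flow measurement stays within a
    relative-error tolerance of the fully sampled reference."""
    best = None
    for r in rates:
        image = reconstruct(kspace_full * uniform_mask(kspace_full.shape, r))
        rel_err = abs(measure_flow(image) - reference_flow) / abs(reference_flow)
        if rel_err <= tolerance:
            best = r        # measurement accuracy is preserved at this rate
        else:
            break           # accuracy is lost; keep the previously accepted rate
    return best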

6.5. Increased Computational Cost

A crucial component of deployment is ensuring that users’ institutions have sufficient computational power to perform the reconstructions. While most DL-based reconstruction methods can outperform conventional reconstruction methods in terms of image quality and computation time, they often come at the expense of an increased computational cost, requiring expensive GPUs for both training and inference. This limitation is a clinical bottleneck, especially for 3D non-Cartesian applications [124], such as 4D-Flow [77], dynamic contrast-enhanced (DCE) MRI [48], and MR fingerprinting [125]. Possible solutions to reduce the computational cost of these reconstructions include coil compression [126,127,128], stochastic gradient descent [48,129], and randomized sketching algorithms [130,131,132,133].
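As an example of the first strategy, coil compression projects the data from many physical coils onto a few dominant virtual coils before reconstruction, which directly reduces memory and compute. The snippet below is a generic SVD-based sketch in the spirit of [126,127,128], not a reproduction of any specific method.

import numpy as np

def svd_coil_compression(kspace, n_virtual):
    """Compress multicoil k-space (coils x kx x ky) to fewer virtual coils by keeping
    the dominant left singular vectors of the coil dimension."""
    n_coils = kspace.shape[0]
    data = kspace.reshape(n_coils, -1)                  # coils x samples
    u, _, _ = np.linalg.svd(data, full_matrices=False)
    compression = u[:, :n_virtual].conj().T             # n_virtual x coils
    return (compression @ data).reshape((n_virtual,) + kspace.shape[1:])

# Example: compress 32 physical coils to 8 virtual coils.
kspace = np.random.randn(32, 128, 128) + 1j * np.random.randn(32, 128, 128)
print(svd_coil_compression(kspace, 8).shape)            # (8, 128, 128)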

6.6. Improved Models and Losses

The field of machine learning moves incredibly fast, with new techniques that outperform the previous state of the art published routinely. We expect achievable MRI data undersampling rates to increase in the upcoming years thanks to novel and more sophisticated reconstruction methods. DL-based reconstruction performance depends heavily on the loss function used; commonly used losses such as ℓ1, ℓ2, or SSIM do not capture the perceptual information of fine structures, which results in reduced perceptual quality and blurring. For example, Wang et al. [134] showed that incorporating a CNN-based perceptual loss can yield sharper edges and a more faithful image contrast, improving reconstructed image quality in knee datasets.
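A generic feature-space loss of this kind can be sketched as follows, here using a frozen, ImageNet-pretrained VGG16 from torchvision as the feature extractor; this is only an illustration of the perceptual-loss idea, not the instance-wise discriminative feature matching loss of Wang et al. [134].

import torch
import torchvision

class PerceptualLoss(torch.nn.Module):
    """Compare reconstruction and reference in the feature space of a frozen CNN."""
    def __init__(self, n_layers=16):
        super().__init__()
        vgg = torchvision.models.vgg16(weights="IMAGENET1K_V1")
        self.features = vgg.features[:n_layers].eval()
        for p in self.features.parameters():
            p.requires_grad_(False)                     # the loss network is not trained

    def forward(self, recon, target):
        # Single-channel MR images are repeated to three channels for the ImageNet model.
        recon3, target3 = recon.repeat(1, 3, 1, 1), target.repeat(1, 3, 1, 1)
        return torch.nn.functional.l1_loss(self.features(recon3), self.features(target3))

# Typical use: total_loss = pixel_l1 + lambda_p * PerceptualLoss()(recon, target)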
Additionally, transformer networks are a DL-based architecture with the potential to improve reconstruction in CMR. Since their introduction in 2017 [135], transformers have rapidly become the state of the art in natural language processing and have recently outperformed CNNs in image and video classification tasks [136,137]. These methods split the data into patches that are fed to self-attention layers, which extract features and learn long-range dependencies. This architecture has already been successfully applied to MR image reconstruction [138,139], and applications to CMR are expected soon.
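The patch-and-attend mechanism can be sketched in a few lines: the image is split into non-overlapping patches, each patch is embedded as a token, and multi-head self-attention lets every patch attend to every other patch. The example below is purely illustrative and does not reproduce the architectures of [138,139].

import torch

def patch_self_attention(image, patch=8, dim=64, heads=4):
    """Split an image into patch tokens and apply one multi-head self-attention layer."""
    b, c, h, w = image.shape
    tokens = image.unfold(2, patch, patch).unfold(3, patch, patch)   # b, c, h/p, w/p, p, p
    tokens = tokens.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * patch * patch)
    embed = torch.nn.Linear(c * patch * patch, dim)
    attention = torch.nn.MultiheadAttention(dim, heads, batch_first=True)
    x = embed(tokens)                              # b, n_patches, dim
    out, weights = attention(x, x, x)              # every patch attends to all others
    return out, weights

out, weights = patch_self_attention(torch.randn(1, 1, 64, 64))
print(out.shape, weights.shape)                    # (1, 64, 64) and (1, 64, 64)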

7. Conclusions

Machine learning has disrupted the CMR field with novel reconstruction methods that largely outperform conventional methods in both image quality and reconstruction speed. DL-based reconstruction methods have the capacity to enable unprecedentedly fast acquisitions, with a significant potential impact on the diagnosis and treatment of numerous diseases. For CMR, DL-based methods have been developed for virtually all applications, including cine, flow imaging, late gadolinium enhancement, and tissue characterization. However, there are still pitfalls and limitations, such as instabilities, limited interpretability, and increased computational cost. Finally, the machine learning field moves incredibly fast, and new techniques that can potentially be applied to CMR image reconstruction are constantly being published. We therefore expect the pitfalls and challenges discussed here to be overcome in the near future by novel DL-based reconstruction techniques, which will be essential to fully translate these methods into the clinic.

Author Contributions

Contributions to this work were conducted as follows: conceptualization, J.A.O., M.J.M., C.A., M.Y. and M.L.; methodology, J.A.O., M.J.M., C.A., M.Y. and M.L.; writing—original draft preparation, J.A.O., M.J.M., C.A., M.Y. and M.L.; writing—review and editing, J.A.O., M.J.M., C.A., M.Y., M.L., S.S.V. and D.B.E. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

This review article includes several works that were written by one or more of the authors of this manuscript.

Abbreviations

The following abbreviations are used in this manuscript:
CMR: Cardiac Magnetic Resonance
DL: Deep Learning
LV: Left Ventricle
MRI: Magnetic Resonance Imaging
L+S: Low Rank plus Sparse
FOV: Field of View
1D: One-Dimensional
2D: Two-Dimensional
3D: Three-Dimensional
MoDL: Model-Based Deep Learning
CNN: Convolutional Neural Network
SToRM: SmooThness Regularization on Manifolds
CS: Compressed Sensing
GPU: Graphics Processing Unit
NRMSE: Normalized Root Mean Squared Error
SSIM: Structural Similarity Index Measure
PSNR: Peak Signal-to-Noise Ratio
MSE: Mean Squared Error
SR: Super Resolution
bSSFP: Balanced Steady-State Free Precession
SMORE: Synthetic Multi-Orientation Resolution Enhancement
MEL: Memory-Efficient Learning
GLEAM: Greedy LEarning for Accelerated MRI
DIP: Deep Image Prior
MapNet: Mapping Network
TD-DIP: Time-Dependent Deep Image Prior
PC-MRI: Phase-Contrast Magnetic Resonance Imaging
PI: Parallel Imaging
LOA: Limits of Agreement
aAO: Ascending Aorta
ECG: Electrocardiogram
RelErr: Relative Error
95%-CIs: 95 Percent Confidence Intervals
LGE: Late Gadolinium Enhancement
SD: Standard Deviation
T1: Longitudinal Relaxation Time
T2: Transverse Relaxation Time
DCE: Dynamic Contrast-Enhanced

References

  1. Bustin, A.; Fuin, N.; Botnar, R.M.; Prieto, C. From compressed-sensing to artificial intelligence-based cardiac MRI reconstruction. Front. Cardiovasc. Med. 2020, 7, 17. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  2. Ismail, T.F.; Strugnell, W.; Coletti, C.; Božić-Iven, M.; Weingärtner, S.; Hammernik, K.; Correia, T.; Küstner, T. Cardiac MR: From theory to practice. Front. Cardiovasc. Med. 2022, 9, 826283. [Google Scholar] [CrossRef] [PubMed]
  3. McDonagh, T.A.; Metra, M.; Adamo, M.; Gardner, R.S.; Baumbach, A.; Böhm, M.; Burri, H.; Butler, J.; Čelutkienė, J.; Chioncel, O.; et al. 2021 ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure. Eur. Heart J. 2021, 42, 3599–3726. [Google Scholar] [CrossRef] [PubMed]
  4. Amano, Y.; Takayama, M.; Kumita, S. Contrast-enhanced myocardial T1-weighted scout (Look–Locker) imaging for the detection of myocardial damages in hypertrophic cardiomyopathy. J. Magn. Reson. 2009, 30, 778–784. [Google Scholar] [CrossRef]
  5. Kellman, P.; Arai, A.E.; McVeigh, E.R.; Aletras, A.H. Phase-sensitive inversion recovery for detecting myocardial infarction using gadolinium-delayed hyperenhancement. Magn. Reson. Med. 2002, 47, 372–383. [Google Scholar] [CrossRef] [Green Version]
  6. Haaf, P.; Garg, P.; Messroghli, D.R.; Broadbent, D.A.; Greenwood, J.P.; Plein, S. Cardiac T1 mapping and extracellular volume (ECV) in clinical practice: A comprehensive review. J. Cardiovasc. Magn. Reson. 2017, 18, 1–12. [Google Scholar] [CrossRef] [Green Version]
  7. Messroghli, D.R.; Radjenovic, A.; Kozerke, S.; Higgins, D.M.; Sivananthan, M.U.; Ridgway, J.P. Modified Look-Locker inversion recovery (MOLLI) for high-resolution T1 mapping of the heart. Magn. Reson. Med. 2004, 52, 141–146. [Google Scholar] [CrossRef]
  8. Kellman, P.; Hansen, M.S. T1-mapping in the heart: Accuracy and precision. J. Cardiovasc. Magn. Reson. 2014, 16, 1–20. [Google Scholar] [CrossRef] [Green Version]
  9. Chow, K.; Flewitt, J.A.; Green, J.D.; Pagano, J.J.; Friedrich, M.G.; Thompson, R.B. Saturation recovery single-shot acquisition (SASHA) for myocardial T1 mapping. Magn. Reson. Med. 2014, 71, 2082–2095. [Google Scholar] [CrossRef]
  10. Huang, T.Y.; Liu, Y.J.; Stemmer, A.; Poncelet, B.P. T2 measurement of the human myocardium using a T2-prepared transient-state TrueFISP sequence. Magn. Reson. Med. 2007, 57, 960–966. [Google Scholar] [CrossRef]
  11. Anderson, L.; Holden, S.; Davis, B.; Prescott, E.; Charrier, C.; Bunce, N.; Firmin, D.; Wonke, B.; Porter, J.; Walker, J.; et al. Cardiovascular T2-star (T2*) magnetic resonance for the early diagnosis of myocardial iron overload. Eur. Heart J. 2001, 22, 2171–2179. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  12. Babu-Narayan, S.V.; Giannakoulas, G.; Valente, A.M.; Li, W.; Gatzoulis, M.A. Imaging of congenital heart disease in adults. Eur. Heart J. 2016, 37, 1182–1195. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  13. Marelli, A.J.; Mackie, A.S.; Ionescu-Ittu, R.; Rahme, E.; Pilote, L. Congenital heart disease in the general population: Changing prevalence and age distribution. Circ 2007, 115, 163–172. [Google Scholar] [CrossRef] [PubMed]
  14. Ntsinjana, H.N.; Hughes, M.L.; Taylor, A.M. The role of cardiovascular magnetic resonance in pediatric congenital heart disease. J. Cardiovasc. Magn. Reson. 2011, 13, 1–20. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  15. Cheng, J.Y.; Hanneman, K.; Zhang, T.; Alley, M.T.; Lai, P.; Tamir, J.I.; Uecker, M.; Pauly, J.M.; Lustig, M.; Vasanawala, S.S. Comprehensive motion-compensated highly accelerated 4D flow MRI with ferumoxytol enhancement for pediatric congenital heart disease. J. Magn. Reson. 2016, 43, 1355–1368. [Google Scholar] [CrossRef] [Green Version]
  16. Wang, Y.; Watts, R.; Mitchell, I.R.; Nguyen, T.D.; Bezanson, J.W.; Bergman, G.W.; Prince, M.R. Coronary MR angiography: Selection of acquisition window of minimal cardiac motion with electrocardiography-triggered navigator cardiac motion prescanning—Initial results. Radiology 2001, 218, 580–585. [Google Scholar] [CrossRef]
  17. Bluemke, D.A.; Boxerman, J.L.; Atalar, E.; McVeigh, E.R. Segmented K-space cine breath-hold cardiovascular MR imaging: Part 1. Principles and technique. AJR Am. J. Roentgenol. 1997, 169, 395. [Google Scholar] [CrossRef]
  18. Weiger, M.; Börnert, P.; Proksa, R.; Schäffter, T.; Haase, A. Motion-adapted gating based on k-space weighting for reduction of respiratory motion artifacts. Magn. Reson. Med. 1997, 38, 322–333. [Google Scholar] [CrossRef]
  19. Pruessmann, K.P.; Weiger, M.; Scheidegger, M.B.; Boesiger, P. SENSE: Sensitivity encoding for fast MRI. Magn. Reson. Med. 1999, 42, 952–962. [Google Scholar] [CrossRef]
  20. Griswold, M.A.; Jakob, P.M.; Heidemann, R.M.; Nittka, M.; Jellus, V.; Wang, J.; Kiefer, B.; Haase, A. Generalized autocalibrating partially parallel acquisitions (GRAPPA). Magn. Reson. Med. 2002, 47, 1202–1210. [Google Scholar] [CrossRef] [Green Version]
  21. Sodickson, D.K.; Manning, W.J. Simultaneous acquisition of spatial harmonics (SMASH): Fast imaging with radiofrequency coil arrays. Magn. Reson. Med. 1997, 38, 591–603. [Google Scholar] [CrossRef] [PubMed]
  22. Lustig, M.; Donoho, D.; Pauly, J.M. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magn. Reson. Med. 2007, 58, 1182–1195. [Google Scholar] [CrossRef]
  23. Tariq, U.; Hsiao, A.; Alley, M.; Zhang, T.; Lustig, M.; Vasanawala, S.S. Venous and arterial flow quantification are equally accurate and precise with parallel imaging compressed sensing 4D phase contrast MRI. J. Magn. Reson. 2013, 37, 1419–1426. [Google Scholar] [CrossRef] [Green Version]
  24. Otazo, R.; Kim, D.; Axel, L.; Sodickson, D.K. Combination of compressed sensing and parallel imaging for highly accelerated first-pass cardiac perfusion MRI. Magn. Reson. Med. 2010, 64, 767–776. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  25. Hsiao, A.; Lustig, M.; Alley, M.T.; Murphy, M.; Chan, F.P.; Herfkens, R.J.; Vasanawala, S.S. Rapid pediatric cardiac assessment of flow and ventricular volume with compressed sensing parallel imaging volumetric cine phase-contrast MRI. AJR Am. J. Roentgenol. 2012, 198, W250. [Google Scholar] [CrossRef] [Green Version]
  26. Otazo, R.; Candes, E.; Sodickson, D.K. Low-rank plus sparse matrix decomposition for accelerated dynamic MRI with separation of background and dynamic components. Magn. Reson. Med. 2015, 73, 1125–1136. [Google Scholar] [CrossRef] [Green Version]
  27. Knoll, F.; Murrell, T.; Sriram, A.; Yakubova, N.; Zbontar, J.; Rabbat, M.; Defazio, A.; Muckley, M.J.; Sodickson, D.K.; Zitnick, C.L.; et al. Advancing machine learning for MR image reconstruction with an open competition: Overview of the 2019 fastMRI challenge. Magn. Reson. Med. 2020, 84, 3054–3070. [Google Scholar] [CrossRef] [PubMed]
  28. Muckley, M.J.; Riemenschneider, B.; Radmanesh, A.; Kim, S.; Jeong, G.; Ko, J.; Jun, Y.; Shin, H.; Hwang, D.; Mostapha, M.; et al. Results of the 2020 fastmri challenge for machine learning mr image reconstruction. IEEE Trans. Med. Imaging 2021, 40, 2306–2317. [Google Scholar] [CrossRef] [PubMed]
  29. Knoll, F.; Hammernik, K.; Zhang, C.; Moeller, S.; Pock, T.; Sodickson, D.K.; Akcakaya, M. Deep-learning methods for parallel magnetic resonance imaging reconstruction: A survey of the current approaches, trends, and issues. IEEE Signal Process. Mag. 2020, 37, 128–140. [Google Scholar] [CrossRef]
  30. Hammernik, K.; Schlemper, J.; Qin, C.; Duan, J.; Summers, R.M.; Rueckert, D. Systematic evaluation of iterative deep neural networks for fast parallel MRI reconstruction with sensitivity-weighted coil combination. Magn. Reson. Med. 2021, 86, 1859–1872. [Google Scholar] [CrossRef]
  31. Donoho, D.L. Compressed sensing. IEEE Trans. Inf. Theory 2006, 52, 1289–1306. [Google Scholar] [CrossRef]
  32. Candès, E.J.; Romberg, J.; Tao, T. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 2006, 52, 489–509. [Google Scholar] [CrossRef] [Green Version]
  33. Hammernik, K.; Klatzer, T.; Kobler, E.; Recht, M.P.; Sodickson, D.K.; Pock, T.; Knoll, F. Learning a variational network for reconstruction of accelerated MRI data. Mag. Reson. Med. 2018, 79, 3055–3071. [Google Scholar] [CrossRef] [PubMed]
  34. Aggarwal, H.K.; Mani, M.P.; Jacob, M. MoDL: Model-based deep learning architecture for inverse problems. IEEE Trans. Med. Imaging 2018, 38, 394–405. [Google Scholar] [CrossRef]
  35. Mardani, M.; Sun, Q.; Donoho, D.; Papyan, V.; Monajemi, H.; Vasanawala, S.; Pauly, J. Neural proximal gradient descent for compressive imaging. Adv. Neural. Inf. Process. Syst. 2018, 31. Available online: https://proceedings.neurips.cc/paper/2018/hash/61d009da208a34ae155420e55f97abc7-Abstract.html (accessed on 2 March 2023).
  36. Sandino, C.M.; Cheng, J.Y.; Chen, F.; Mardani, M.; Pauly, J.M.; Vasanawala, S.S. Compressed sensing: From research to clinical practice with deep neural networks: Shortening scan times for magnetic resonance imaging. IEEE Signal Process. Mag. 2020, 37, 117–127. [Google Scholar] [CrossRef]
  37. Sandino, C.M.; Lai, P.; Vasanawala, S.S.; Cheng, J.Y. Accelerating cardiac cine MRI using a deep learning-based ESPIRiT reconstruction. Magn. Reson. Med. 2021, 85, 152–167. [Google Scholar] [CrossRef]
  38. Zucker, E.J.; Sandino, C.M.; Kino, A.; Lai, P.; Vasanawala, S.S. Free-breathing Accelerated Cardiac MRI Using Deep Learning: Validation in Children and Young Adults. Radiology 2021, 300, 539–548. [Google Scholar] [CrossRef]
  39. Oscanoa, J.A.; Middione, M.J.; Syed, A.B.; Sandino, C.M.; Vasanawala, S.S.; Ennis, D.B. Accelerated two-dimensional phase-contrast for cardiovascular MRI using deep learning-based reconstruction with complex difference estimation. Mag. Reson. Med. 2023, 89, 356–369. [Google Scholar] [CrossRef] [PubMed]
  40. Küstner, T.; Fuin, N.; Hammernik, K.; Bustin, A.; Qi, H.; Hajhosseiny, R.; Masci, P.G.; Neji, R.; Rueckert, D.; Botnar, R.M.; et al. CINENet: Deep learning-based 3D cardiac CINE MRI reconstruction with multi-coil complex-valued 4D spatio-temporal convolutions. Sci. Rep. 2020, 10, 1–13. [Google Scholar] [CrossRef] [PubMed]
  41. Biswas, S.; Aggarwal, H.K.; Jacob, M. Dynamic MRI using model-based deep learning and SToRM priors: MoDL-SToRM. Magn. Reson. Med. 2019, 82, 485–494. [Google Scholar] [CrossRef] [PubMed]
  42. Huang, W.; Ke, Z.; Cui, Z.X.; Cheng, J.; Qiu, Z.; Jia, S.; Ying, L.; Zhu, Y.; Liang, D. Deep low-rank plus sparse network for dynamic MR imaging. Med. Image Anal. 2021, 73, 102190. [Google Scholar] [CrossRef]
  43. Ozturkler, B.; Sahiner, A.; Ergen, T.; Desai, A.D.; Sandino, C.M.; Vasanawala, S.; Pauly, J.M.; Mardani, M.; Pilanci, M. GLEAM: Greedy Learning for Large-Scale Accelerated MRI Reconstruction. arXiv 2022, arXiv:2207.08393. [Google Scholar]
  44. Wang, K.; Kellman, M.; Sandino, C.M.; Zhang, K.; Vasanawala, S.S.; Tamir, J.I.; Yu, S.X.; Lustig, M. Memory-efficient Learning for High-Dimensional MRI Reconstruction. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Strasbourg, France, 27 September–1 October 2021; pp. 461–470. [Google Scholar]
  45. Ulyanov, D.; Vedaldi, A.; Lempitsky, V. Deep image prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18 June–23 June 2018; pp. 9446–9454. [Google Scholar]
  46. Acar, M.; Çukur, T.; Öksüz, İ. Self-supervised dynamic mri reconstruction. In Proceedings of the Machine Learning for Medical Image Reconstruction: 4th International Workshop, MLMIR 2021, Held in Conjunction with MICCAI 2021, Strasbourg, France, 1 October 2021; pp. 35–44. [Google Scholar]
  47. Haldar, J.P.; Liang, Z.P. Spatiotemporal imaging with partially separable functions: A matrix recovery approach. In Proceedings of the 2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, Rotterdam, The Netherlands, 14–17 April 2010; pp. 716–719. [Google Scholar]
  48. Ong, F.; Zhu, X.; Cheng, J.Y.; Johnson, K.M.; Larson, P.E.; Vasanawala, S.S.; Lustig, M. Extreme MRI: Large-scale volumetric dynamic imaging from continuous non-gated acquisitions. Magn. Reson. Med. 2020, 84, 1763–1780. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  49. Sandino, C.M.; Ong, F.; Iyer, S.S.; Bush, A.; Vasanawala, S. Deep subspace learning for efficient reconstruction of spatiotemporal imaging data. In Proceedings of the NeurIPS 2021 Workshop on Deep Learning and Inverse Problems, 2021. Available online: https://openreview.net/forum?id=pjeFySy4240 (accessed on 2 March 2023).
  50. Lustig, M.; Santos, J.M.; Donoho, D.L.; Pauly, J.M. SPARSE: High frame rate dynamic MRI exploiting spatio-temporal sparsity. In Proceedings of the 13th Annual Meeting of the International Society for Magnetic Resonance in Medicine, 2006; Volume 2420. Available online: https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=be7217aac865de3801e3f35ee13520888aad597d (accessed on 2 March 2023).
  51. Usman, M.; Atkinson, D.; Odille, F.; Kolbitsch, C.; Vaillant, G.; Schaeffter, T.; Batchelor, P.G.; Prieto, C. Motion corrected compressed sensing for free-breathing dynamic cardiac MRI. Magn. Reson. Med. 2013, 70, 504–516. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  52. Wetzl, J.; Schmidt, M.; Pontana, F.; Longère, B.; Lugauer, F.; Maier, A.; Hornegger, J.; Forman, C. Single-breath-hold 3-D CINE imaging of the left ventricle using Cartesian sampling. Magn. Reson. Mater. Phys. 2018, 31, 19–31. [Google Scholar] [CrossRef] [PubMed]
  53. Caballero, J.; Price, A.N.; Rueckert, D.; Hajnal, J.V. Dictionary learning and time sparsity for dynamic MR data reconstruction. IEEE Trans. Med. Imaging 2014, 33, 979–994. [Google Scholar] [CrossRef] [Green Version]
  54. Montefusco, L.B.; Lazzaro, D.; Papi, S.; Guerrini, C. A fast compressed sensing approach to 3D MR image reconstruction. IEEE Trans. Med. Imaging 2010, 30, 1064–1075. [Google Scholar] [CrossRef]
  55. Chen, C.; Li, Y.; Axel, L.; Huang, J. Real time dynamic MRI by exploiting spatial and temporal sparsity. Magn. Reson. Imaging 2016, 34, 473–482. [Google Scholar] [CrossRef]
  56. Pedersen, H.; Kozerke, S.; Ringgaard, S.; Nehrke, K.; Kim, W.Y. k-t PCA: Temporally constrained k-t BLAST reconstruction using principal component analysis. Magn. Reson. Med. 2009, 62, 706–716. [Google Scholar] [CrossRef]
  57. Zhao, B.; Haldar, J.P.; Brinegar, C.; Liang, Z.P. Low rank matrix recovery for real-time cardiac MRI. In Proceedings of the 2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, Rotterdam, The Netherlands, 14–17 April 2010; pp. 996–999. [Google Scholar]
  58. Goud, S.; Hu, Y.; Jacob, M. Real-time cardiac MRI using low-rank and sparsity penalties. In Proceedings of the 2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, Rotterdam, The Netherlands, 14–17 April 2010; pp. 988–991. [Google Scholar]
  59. Lingala, S.G.; Hu, Y.; DiBella, E.; Jacob, M. Accelerated dynamic MRI exploiting sparsity and low-rank structure: kt SLR. IEEE Trans. Med. Imaging 2011, 30, 1042–1054. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  60. Zhao, B.; Haldar, J.P.; Christodoulou, A.G.; Liang, Z.P. Image reconstruction from highly undersampled (k, t)-space data with joint partial separability and sparsity constraints. IEEE Trans. Med. Imaging 2012, 31, 1809–1820. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  61. Miao, X.; Lingala, S.G.; Guo, Y.; Jao, T.; Usman, M.; Prieto, C.; Nayak, K.S. Accelerated cardiac cine MRI using locally low rank and finite difference constraints. Magn. Reson. Imaging 2016, 34, 707–714. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  62. Parikh, N.; Boyd, S. Proximal algorithms. Found. Trends Opt. 2014, 1, 127–239. [Google Scholar] [CrossRef]
  63. Beck, A.; Teboulle, M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2009, 2, 183–202. [Google Scholar] [CrossRef] [Green Version]
  64. Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 2011, 3, 1–122. [Google Scholar] [CrossRef]
  65. Chambolle, A.; Pock, T. A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 2011, 40, 120–145. [Google Scholar] [CrossRef] [Green Version]
  66. Geman, D.; Yang, C. Nonlinear image recovery with half-quadratic regularization. IEEE Trans. Image Process. 1995, 4, 932–946. [Google Scholar] [CrossRef] [Green Version]
  67. Liang, Z.P. Spatiotemporal imaging with partially separable functions. In Proceedings of the 2007 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, Washington, DC, USA, 12–16 April 2007; pp. 988–991. [Google Scholar]
  68. Wang, S.; Su, Z.; Ying, L.; Peng, X.; Zhu, S.; Liang, F.; Feng, D.; Liang, D. Accelerating magnetic resonance imaging via deep learning. In Proceedings of the 2016 IEEE International Symposium on Biomedical Imaging (ISBI), Prague, Czech Republic, 13–16 April 2016; pp. 514–517. [Google Scholar]
  69. Zhu, B.; Liu, J.Z.; Cauley, S.F.; Rosen, B.R.; Rosen, M.S. Image reconstruction by domain-transform manifold learning. Nature 2018, 555, 487–492. [Google Scholar] [CrossRef] [Green Version]
  70. Yang, G.; Yu, S.; Dong, H.; Slabaugh, G.; Dragotti, P.L.; Ye, X.; Liu, F.; Arridge, S.; Keegan, J.; Guo, Y.; et al. DAGAN: Deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction. IEEE Trans. Med. Imaging 2017, 37, 1310–1321. [Google Scholar] [CrossRef] [Green Version]
  71. Akçakaya, M.; Moeller, S.; Weingärtner, S.; Uğurbil, K. Scan-specific robust artificial-neural-networks for k-space interpolation (RAKI) reconstruction: Database-free deep learning for fast imaging. Magn. Reson. Med. 2019, 81, 439–453. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  72. Ghodrati, V.; Shao, J.; Bydder, M.; Zhou, Z.; Yin, W.; Nguyen, K.L.; Yang, Y.; Hu, P. MR image reconstruction using deep learning: Evaluation of network structure and loss functions. Quant. Imaging Med. Surg. 2019, 9, 1516. [Google Scholar] [CrossRef]
  73. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015; pp. 1–15. [Google Scholar]
  74. Shimron, E.; Tamir, J.I.; Wang, K.; Lustig, M. Implicit data crimes: Machine learning bias arising from misuse of public data. Proc. Natl. Acad. Sci. USA 2022, 119, e2117203119. [Google Scholar] [CrossRef] [PubMed]
  75. Fabian, Z.; Heckel, R.; Soltanolkotabi, M. Data augmentation for deep learning based accelerated MRI reconstruction with limited data. Proc. Mach. Learn. Res. 2021, 139, 3057–3067. [Google Scholar]
  76. Poddar, S.; Jacob, M. Dynamic MRI using smoothness regularization on manifolds (SToRM). IEEE Trans. Med. Imaging 2015, 35, 1106–1115. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  77. Sandino, C.M.; Cheng, J.Y.; Alley, M.T.; Carl, M.; Vasanawala, S.S. Accelerated abdominal 4D flow MRI using 3D golden-angle cones trajectory. In Proceedings of the Proc Ann Mtg ISMRM, Honolulu, HI, USA, 22–27 April 2017. [Google Scholar]
  78. Ke, Z.; Cui, Z.X.; Huang, W.; Cheng, J.; Jia, S.; Ying, L.; Zhu, Y.; Liang, D. Deep Manifold Learning for Dynamic MR Imaging. IEEE Trans. Comput. Imaging 2021, 7, 1314–1327. [Google Scholar] [CrossRef]
  79. Ahmad, R.; Xue, H.; Giri, S.; Ding, Y.; Craft, J.; Simonetti, O.P. Variable density incoherent spatiotemporal acquisition (VISTA) for highly accelerated cardiac MRI. Magn. Reson. Med. 2015, 74, 1266–1278. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  80. Bernstein, M.A.; Fain, S.B.; Riederer, S.J. Effect of windowing and zero-filled reconstruction of MRI data on spatial resolution and acquisition strategy. J. Magn. Reson. 2001, 14, 270–280. [Google Scholar] [CrossRef] [Green Version]
  81. Ashikaga, H.; Estner, H.L.; Herzka, D.A.; Mcveigh, E.R.; Halperin, H.R. Quantitative assessment of single-image super-resolution in myocardial scar imaging. IEEE J. Transl. Eng. Health Med. 2014, 2, 1–12. [Google Scholar] [CrossRef]
  82. Dong, C.; Loy, C.C.; He, K.; Tang, X. Learning a deep convolutional network for image super-resolution. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 184–199. [Google Scholar]
  83. Yang, W.; Zhang, X.; Tian, Y.; Wang, W.; Xue, J.H.; Liao, Q. Deep learning for single image super-resolution: A brief review. IEEE Trans. Multimed. 2019, 21, 3106–3121. [Google Scholar] [CrossRef] [Green Version]
  84. Wang, Z.; Chen, J.; Hoi, S.C. Deep learning for image super-resolution: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3365–3387. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  85. Li, Y.; Sixou, B.; Peyrin, F. A review of the deep learning methods for medical images super resolution problems. IRBM 2021, 42, 120–133. [Google Scholar] [CrossRef]
  86. Masutani, E.M.; Bahrami, N.; Hsiao, A. Deep learning single-frame and multiframe super-resolution for cardiac MRI. Radiology 2020, 295, 552. [Google Scholar] [CrossRef] [PubMed]
  87. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
  88. Basty, N.; Grau, V. Super resolution of cardiac cine MRI sequences using deep learning. In Image Analysis for Moving Organ, Breast, and Thoracic Images; Springer: Berlin/Heidelberg, Germany, 2018; pp. 23–31. [Google Scholar]
  89. Zhao, C.; Dewey, B.E.; Pham, D.L.; Calabresi, P.A.; Reich, D.S.; Prince, J.L. SMORE: A self-supervised anti-aliasing and super-resolution algorithm for MRI using deep learning. IEEE Trans. Med. Imaging 2020, 40, 805–817. [Google Scholar] [CrossRef] [PubMed]
  90. Putzky, P.; Welling, M. Invert to learn to invert. Adv. Neural. Inf. Process. Syst. 2019, 32. Available online: https://proceedings.neurips.cc/paper/2019/hash/ac1dd209cbcc5e5d1c6e28598e8cbbe8-Abstract.html (accessed on 2 March 2023).
  91. Kellman, M.; Zhang, K.; Markley, E.; Tamir, J.; Bostan, E.; Lustig, M.; Waller, L. Memory-efficient learning for large-scale computational imaging. IEEE Trans. Comput. Imaging 2020, 6, 1403–1414. [Google Scholar] [CrossRef]
  92. Yoo, J.; Jin, K.H.; Gupta, H.; Yerly, J.; Stuber, M.; Unser, M. Time-dependent deep image prior for dynamic MRI. IEEE Trans. Med. Imaging 2021, 40, 3337–3348. [Google Scholar] [CrossRef]
  93. Firmin, D.; Nayler, G.; Klipstein, R.; Underwood, S.; Rees, R.; Longmore, D. In vivo validation of MR velocity imaging. J. Comput. Assist. Tomogr. 1987, 11, 751–756. [Google Scholar] [CrossRef]
  94. Attili, A.K.; Parish, V.; Valverde, I.; Greil, G.; Baker, E.; Beerbaum, P. Cardiovascular MRI in childhood. Arch. Dis. Child. 2011, 96, 1147–1155. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  95. Nayak, K.S.; Nielsen, J.F.; Bernstein, M.A.; Markl, M.; Gatehouse, P.D.; Botnar, R.M.; Saloner, D.; Lorenz, C.; Wen, H.; Hu, B.S.; et al. Cardiovascular magnetic resonance phase contrast imaging. J. Cardiovasc. Magn. Reson. 2015, 17, 71. [Google Scholar] [CrossRef] [Green Version]
  96. Markl, M.; Frydrychowicz, A.; Kozerke, S.; Hope, M.; Wieben, O. 4D Flow MRI. J. Magn. Reson. Imaging 2012, 36, 1015–1036. [Google Scholar] [CrossRef]
  97. Vishnevskiy, V.; Walheim, J.; Kozerke, S. Deep variational network for rapid 4D flow MRI reconstruction. Nat. Mach. Intell. 2020, 2, 228–235. [Google Scholar] [CrossRef]
  98. Haji-Valizadeh, H.; Guo, R.; Kucukseymen, S.; Paskavitz, A.; Cai, X.; Rodriguez, J.; Pierce, P.; Goddu, B.; Kim, D.; Manning, W.; et al. Highly accelerated free-breathing real-time phase contrast cardiovascular MRI via complex-difference deep learning. Mag. Reson. Med. 2021, 86, 804–819. [Google Scholar] [CrossRef] [PubMed]
  99. Cole, E.; Cheng, J.; Pauly, J.; Vasanawala, S. Analysis of deep complex-valued convolutional neural networks for MRI reconstruction and phase-focused applications. Mag. Reson. Med. 2021, 86, 1093–1109. [Google Scholar] [CrossRef]
  100. Jaubert, O.; Steeden, J.; Montalt-Tordera, J.; Arridge, S.; Kowalik, G.T.; Muthurangu, V. Deep artifact suppression for spiral real-time phase contrast cardiac magnetic resonance imaging in congenital heart disease. Mag. Reson. Med. 2021, 83, 125–132. [Google Scholar] [CrossRef]
  101. Jaubert, O.; Montalt-Tordera, J.; Brown, J.; Knight, D.; Arridge, S.; Steeden, J.; Muthurangu, V. FReSCO: Flow Reconstruction and Segmentation for low latency Cardiac Output monitoring using deep artifact suppression and segmentation. arXiv 2022, arXiv:2203.13729. [Google Scholar] [CrossRef] [PubMed]
  102. Kim, D.; Jen, M.L.; Eisenmenger, L.B.; Johnson, K.M. Accelerated 4D-flow MRI with 3-point encoding enabled by machine learning. Mag. Reson. Med. 2022, 89, 800–811. [Google Scholar] [CrossRef] [PubMed]
  103. Nath, R.; Callahan, S.; Stoddard, M.; Amini, A.A. FlowRAU-Net: Accelerated 4D Flow MRI of Aortic Valvular Flows with a Deep 2D Residual Attention Network. IEEE Trans. Biomed. Eng. 2022, 69, 3812–3824. [Google Scholar] [CrossRef]
  104. Winkelmann, S.; Schaeffter, T.; Koehler, T.; Eggers, H.; Doessel, O. An optimal radial profile order based on the Golden Ratio for time-resolved MRI. IEEE Trans. Med. Imaging 2006, 26, 68–76. [Google Scholar] [CrossRef]
  105. Zhang, T.; Pauly, J.M.; Levesque, I.R. Accelerating parameter mapping with a locally low rank constraint. Mag. Reson. Med. 2015, 73, 655–661. [Google Scholar] [CrossRef] [Green Version]
  106. Benkert, T.; Tian, Y.; Huang, C.; DiBella, E.V.; Chandarana, H.; Feng, L. Optimization and validation of accelerated golden-angle radial sparse MRI reconstruction with self-calibrating GRAPPA operator gridding. Mag. Reson. Med. 2018, 80, 286–293. [Google Scholar] [CrossRef]
  107. Kowalik, G.T.; Knight, D.; Steeden, J.A.; Muthurangu, V. Perturbed spiral real-time phase-contrast MR with compressive sensing reconstruction for assessment of flow in children. Mag. Reson. Med. 2020, 83, 2077–2091. [Google Scholar] [CrossRef] [PubMed]
  108. Uecker, M.; Lai, P.; Murphy, M.J.; Virtue, P.; Elad, M.; Pauly, J.M.; Vasanawala, S.S.; Lustig, M. ESPIRiT—An eigenvalue approach to autocalibrating parallel MRI: Where SENSE meets GRAPPA. Mag. Reson. Med. 2014, 71, 990–1001. [Google Scholar] [CrossRef] [Green Version]
  109. Montesinos, P.; Abascal, J.F.P.; Cussó, L.; Vaquero, J.J.; Desco, M. Application of the compressed sensing technique to self-gated cardiac cine sequences in small animals. Mag. Reson. Med. 2014, 72, 369–380. [Google Scholar] [CrossRef]
  110. El-Rewaidy, H.; Neisius, U.; Mancio, J.; Kucukseymen, S.; Rodriguez, J.; Paskavitz, A.; Menze, B.; Nezafat, R. Deep complex convolutional network for fast reconstruction of 3D late gadolinium enhancement cardiac MRI. NMR Biomed. 2020, 33, e4312. [Google Scholar] [CrossRef]
  111. Yaman, B.; Shenoy, C.; Deng, Z.; Moeller, S.; El-Rewaidy, H.; Nezafat, R.; Akçakaya, M. Self-supervised physics-guided deep learning reconstruction for high-resolution 3d lge cmr. In Proceedings of the 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), Nice, France, 13–16 April 2021; pp. 100–104. [Google Scholar]
  112. van der Velde, N.; Hassing, H.C.; Bakker, B.J.; Wielopolski, P.A.; Lebel, R.M.; Janich, M.A.; Kardys, I.; Budde, R.P.; Hirsch, A. Improvement of late gadolinium enhancement image quality using a deep learning—Based reconstruction algorithm and its influence on myocardial scar quantification. Eur. Radiol. 2021, 31, 3846–3855. [Google Scholar] [CrossRef]
  113. Chen, Y.; Shaw, J.L.; Xie, Y.; Li, D.; Christodoulou, A.G. Deep learning within a priori temporal feature spaces for large-scale dynamic MR image reconstruction: Application to 5-D cardiac MR Multitasking. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Shenzhen, China, 13–17 October 2019; pp. 495–504. [Google Scholar]
  114. Jeelani, H.; Yang, Y.; Zhou, R.; Kramer, C.M.; Salerno, M.; Weller, D.S. A myocardial T1-mapping framework with recurrent and U-Net convolutional neural networks. In Proceedings of the 2020 IEEE International Symposium on Biomedical Imaging (ISBI), Iowa City, IA, USA, 3–7 April 2020; pp. 1941–1944. [Google Scholar]
  115. Hamilton, J.I.; Currey, D.; Rajagopalan, S.; Seiberlich, N. Deep learning reconstruction for cardiac magnetic resonance fingerprinting T1 and T2 mapping. Magn. Reson. Med. 2021, 85, 2127–2135. [Google Scholar] [CrossRef]
  116. Antun, V.; Renna, F.; Poon, C.; Adcock, B.; Hansen, A.C. On instabilities of deep learning in image reconstruction and the potential costs of AI. Proc. Natl. Acad. Sci. USA 2020, 117, 30088–30095. [Google Scholar] [CrossRef]
  117. Darestani, M.Z.; Chaudhari, A.S.; Heckel, R. Measuring robustness in deep learning based compressive sensing. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 2433–2444. [Google Scholar]
  118. Ergen, T.; Pilanci, M. Convex duality of deep neural networks. arXiv 2020, arXiv:2002.09773. [Google Scholar]
  119. Pilanci, M.; Ergen, T. Neural networks are convex regularizers: Exact polynomial-time convex optimization formulations for two-layer networks. In Proceedings of the International Conference on Machine Learning, Virtual, 13–18 July 2020; pp. 7695–7705. [Google Scholar]
  120. Sahiner, A.; Mardani, M.; Ozturkler, B.; Pilanci, M.; Pauly, J. Convex regularization behind neural reconstruction. arXiv 2020, arXiv:2012.05169. [Google Scholar]
  121. Darestani, M.Z.; Liu, J.; Heckel, R. Test-Time Training Can Close the Natural Distribution Shift Performance Gap in Deep Learning Based Compressed Sensing. arXiv 2022, arXiv:2204.07204. [Google Scholar]
  122. Sun, L.; Fan, Z.; Ding, X.; Huang, Y.; Paisley, J. Joint CS-MRI reconstruction and segmentation with a unified deep network. In Proceedings of the International Conference on Information Processing in Medical Imaging, Hong Kong, China, 2–7 June 2019; pp. 492–504. [Google Scholar]
  123. Huang, Q.; Yang, D.; Yi, J.; Axel, L.; Metaxas, D. FR-Net: Joint reconstruction and segmentation in compressed sensing cardiac MRI. In Proceedings of the International Conference on Functional Imaging and Modeling of the Heart, Bordeaux, France, 6–8 June 2019; pp. 352–360. [Google Scholar]
  124. Gurney, P.T.; Hargreaves, B.A.; Nishimura, D.G. Design and analysis of a practical 3D cones trajectory. Magn. Reson. Med. 2006, 55, 575–582. [Google Scholar] [CrossRef] [PubMed]
  125. Cao, X.; Liao, C.; Iyer, S.S.; Wang, Z.; Zhou, Z.; Dai, E.; Liberman, G.; Dong, Z.; Gong, T.; He, H.; et al. Optimized multi-axis spiral projection MR fingerprinting with subspace reconstruction for rapid whole-brain high-isotropic-resolution quantitative imaging. Magn. Reson. Med. 2022, 88, 133–150. [Google Scholar] [CrossRef]
  126. Buehrer, M.; Pruessmann, K.P.; Boesiger, P.; Kozerke, S. Array compression for MRI with large coil arrays. Magn. Reson. Med. 2007, 57, 1131–1139. [Google Scholar] [CrossRef] [PubMed]
  127. Zhang, T.; Pauly, J.M.; Vasanawala, S.S.; Lustig, M. Coil compression for accelerated imaging with Cartesian sampling. Magn. Reson. Med. 2013, 69, 571–582. [Google Scholar] [CrossRef] [Green Version]
  128. Huang, F.; Vijayakumar, S.; Li, Y.; Hertel, S.; Duensing, G.R. A software channel compression technique for faster reconstruction with many channels. Magn. Reson. Imaging 2008, 26, 133–141. [Google Scholar] [CrossRef]
  129. Muckley, M.; Noll, D.C.; Fessler, J.A. Accelerating SENSE-type MR image reconstruction algorithms with incremental gradients. In Proceedings of the Proc Ann Mtg ISMRM, Milan, Italy, 10–16 May 2014; p. 4400. [Google Scholar]
  130. Pilanci, M.; Wainwright, M.J. Randomized sketches of convex programs with sharp guarantees. IEEE Trans. Inf. Theory 2015, 61, 5096–5115. [Google Scholar] [CrossRef]
  131. Pilanci, M.; Wainwright, M.J. Iterative Hessian sketch: Fast and accurate solution approximation for constrained least-squares. J. Mach. Learn. Res. 2016, 17, 1842–1879. [Google Scholar]
  132. Pilanci, M.; Wainwright, M.J. Newton sketch: A near linear-time optimization algorithm with linear-quadratic convergence. SIAM J. Optim. 2017, 27, 205–245. [Google Scholar] [CrossRef]
  133. Tang, J.; Golbabaee, M.; Davies, M.E. Gradient projection iterative sketch for large-scale constrained least-squares. In Proceedings of the Int Conf Mach Learn, Sydney, Australia, 6–11 August 2017; pp. 3377–3386. [Google Scholar]
  134. Wang, K.; Tamir, J.I.; De Goyeneche, A.; Wollner, U.; Brada, R.; Yu, S.X.; Lustig, M. High fidelity deep learning-based MRI reconstruction with instance-wise discriminative feature matching loss. Magn. Reson. Med. 2022, 88, 476–491. [Google Scholar] [CrossRef]
  135. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural. Inf. Process. Syst. 2017, 30. Available online: https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html (accessed on 2 March 2023).
  136. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
  137. Arnab, A.; Dehghani, M.; Heigold, G.; Sun, C.; Lučić, M.; Schmid, C. Vivit: A video vision transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 6836–6846. [Google Scholar]
  138. Lin, K.; Heckel, R. Vision Transformers Enable Fast and Robust Accelerated MRI. Proc. Mach. Learn. Res. 2022, 172, 774–795. [Google Scholar]
  139. Korkmaz, Y.; Dar, S.U.H.; Yurt, M.; Özbey, M.; Çukur, T. Unsupervised MRI Reconstruction via Zero-Shot Learned Adversarial Transformers. IEEE Trans. Med. Imaging 2022, 41, 1747–1763. [Google Scholar] [CrossRef] [PubMed]
Figure 1. (A) A generalized image reconstruction algorithm iteratively removes aliasing artifacts produced by undersampling. The methods alternate between data consistency and regularization steps. A common conventional approach (e.g., [22]) would apply regularization via a proximal step. DL-based reconstruction methods regularize via a neural network. (B) An alternative reconstruction algorithm uses the partial separability model. Image reconstruction is performed on spatial and temporal basis rather than directly on the images. A conventional method (e.g., [47,48]) alternately updates the basis U and V via gradient steps. Similarly, a DL-based reconstruction method would update via spatial and temporal networks. Diagrams and pictures modified from Sandino et al. [49].
Figure 2. Self-supervised learning diagram for CMR image reconstruction from Acar et al. [46]. Acquired k-space Ω is split into two sets: training set Θ and loss set Λ . The training set Θ is used to enforce data consistency and the loss set Λ is used in the loss function to learn the network weights.
Figure 3. A representative retrospectively undersampled (R = 10) dataset from Vishnevskiy et al. showing a patient with stenotic aortic flow at peak systole. The quantitative comparison highlights their FlowVN reconstruction, reporting the normalized root mean squared error (nRMSE) and relative error (RelErr) of both the magnitude and velocity images relative to a fully sampled reference and a CS reconstruction.
Figure 4. A representative prospectively undersampled (R = 18) subject from Jaubert et al. showing the Cartesian reference as well as the CS and U-Net reconstructions at peak systole. The corresponding real-time blood flow curves, averaged over 9 s, show more accurate peak blood flow rates for the U-Net compared to CS, while stroke volumes are accurate for both the CS and U-Net reconstructions.
Figure 5. Representative LGE images from El-Rewaidy et al.: prospectively undersampled data (R = 5) reconstructed with the reference method, and retrospectively undersampled data (R = 6, 7, and 8) reconstructed with zero-filling and with the proposed DL-based reconstruction, C Net. Minimal noise and blurring artifacts are seen in the C Net images, and the image quality of the R = 8 C Net images is comparable to that of the R = 5 reference images while maintaining scar–blood contrast.
Figure 6. Representative T1 maps from Jeelani et al. for two post-contrast and one pre-contrast patient cases, comparing results between CS, DeepT1, and a fully sampled ground truth. DeepT1 shows better results than CS (higher SSIM relative to the fully sampled reference).
Table 1. Overview of 2D- and 4D-Flow deep learning-based reconstructions.
Manuscript | Imaging | Network | Training | Undersampling | Recon Reduction
Vishnevskiy [97] | 4D-Flow | Unrolled | n = 11 | 12.4 ≤ R ≤ 13.8 | 30×
Haji-Valizadeh [98] | 2D-Flow | 3D U-Net | n = 510 | R = 28.8 | 4.6×
Cole [99] | 2D-Flow | 2D U-Net | n = 180 | R ≤ 6 | N/A
Jaubert [100,101] | 2D-Flow | 3D U-Net | n = 520 | R = 18 | 15×
Oscanoa [39] | 2D-Flow | 2D U-Net | n = 155 | R = 8 | N/A
Kim [102] | 4D-Flow | Unrolled | n = 140 | R ≤ 6 | N/A
Nath [103] | 4D-Flow | 2D U-Net | n = 18 | R = 2.5, 3.3, 5 | –
Reconstruction time improvement is relative to compressed sensing (CS).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
