Compressed-Sensing Magnetic Resonance Image Reconstruction Using an Iterative Convolutional Neural Network Approach

Abstract: Convolutional neural networks (CNNs) demonstrate excellent performance when employed to reconstruct images obtained by compressed-sensing magnetic resonance imaging (CS-MRI). Our study aimed to enhance image quality by developing a novel iterative reconstruction approach that combines image-based CNNs with k-space correction to preserve the original k-space data. In the proposed method, the CNNs represent a priori information concerning the image space. First, the CNNs are trained to map zero-filling images onto the corresponding fully sampled images, thereby recovering the zero-filled part of the k-space data. Subsequently, k-space correction, which restores the originally sampled k-space data, is applied to preserve the original measurements. These processes are repeated iteratively. The performance of the proposed method was validated using a T2-weighted brain-image dataset, and experiments were conducted with several sampling masks. Finally, the proposed method was compared with other noniterative approaches to demonstrate its effectiveness. The aliasing artifacts in the images reconstructed using the proposed approach were reduced compared with those of other state-of-the-art techniques. In addition, quantitative results in the form of the peak signal-to-noise ratio and structural similarity index demonstrated the effectiveness of the proposed method. The proposed CS-MRI method enhances MR image quality while enabling high-throughput examinations.


Introduction
Magnetic resonance imaging (MRI) is a noninvasive imaging modality for acquiring biological information at high spatial resolution. Compared to X-ray computed tomography, MRI requires longer scan times owing to its data-acquisition scheme, which samples sequentially over the Fourier domain, also referred to as the k-space. This shortcoming has motivated several hardware- and software-based techniques, such as asymmetric Fourier imaging [1], parallel imaging [2][3][4], and echo-planar imaging [5,6], that reduce the time required to obtain an MRI scan. However, in clinical diagnosis, scanning speed must be improved without image degradation to reduce patient burden and the motion artifacts caused by patient movement. A possible strategy for reducing MRI scan time is to reduce the number of data acquisitions in the k-space instead of using full sampling. However, undersampled data give rise to aliasing artifacts in the reconstructed images. Compressed sensing [7] can be employed for MRI reconstruction [8][9][10][11][12] using techniques that exploit sparsity in specific transform domains, such as the wavelet [8,10] and curvelet [11] transforms and dictionary learning [12], all of which have been incorporated in recently developed MRI scanners.
Alongside the continued development of conventional image-processing methods, the use of convolutional neural networks (CNNs) to enhance the quality of medical images has increasingly attracted researchers' attention [13][14][15][16][17]. With respect to compressed-sensing MRI (CS-MRI) reconstruction, several studies have reported the possibility of obtaining high-quality images from undersampled data using CNNs [18][19][20][21][22][23][24] by training them to map undersampled images onto corresponding fully sampled images [18,19]. Alternatively, a few studies have introduced hybrid approaches that operate on both the k-space and the image space to enhance image quality [22,23]. For example, Wang et al. reported the first trial of a CNN-based CS-MRI approach that can restore brain structures from zero-filling MR images [18]. Quan et al. proposed a generative adversarial network (GAN)-based CS-MRI algorithm; however, GANs are difficult to train stably [21]. In addition, Eo et al. proposed KIKI-net, in which CNNs operate on both the k-space and the image space [22]. This approach separately minimizes the loss functions in both spaces and improves image quality by restoring tissue structures and eliminating aliasing artifacts. Hyun et al. proposed a hybrid approach that employs CNNs and k-space correction, wherein the unfilled parts of the k-space data are replaced by CNN estimates while the sampled parts retain the original k-space data [23]. This approach outperforms image-based CNNs; however, aliasing artifacts remain even with the hybrid approach. Therefore, a more effective suppression of aliasing artifacts is greatly needed.
Our work, inspired by the above-mentioned studies, develops a novel iterative CNN-based method for CS-MRI reconstruction, which is presented in this paper. The method combines the operation of image-based CNNs with k-space correction, and it demonstrates superior performance compared to standalone image-based CNN and noniterative k-space correction methods because the iterative processing suppresses aliasing artifacts. In this study, experiments were performed to analyze the quality of T2-weighted brain images reconstructed using the proposed method, and image quality is expressed in terms of the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM).

Proposed Method

Figure 1 presents a sequential schematic of the proposed method, which iteratively alternates between image-based CNNs and k-space correction, with the CNNs representing a priori image-space information. The method aims to preserve information pertaining to tissue structures and to suppress aliasing artifacts more strongly than standalone image-based CNN and noniterative k-space correction methods. As indicated in Figure 1, first, undersampled k-space data, y_0, are extracted from the corresponding fully sampled data using a binary sampling mask, R, and zero-filling images, x_0, are obtained using the inverse Fourier transformation, F^{-1}. Subsequently, the CNNs are trained to map these zero-filling images onto the corresponding fully sampled images, and k-space correction is performed to restore the originally sampled k-space data while retaining the CNN estimates at the unsampled positions [23]. Finally, F^{-1} is applied to obtain the corrected images from the corrected k-space data. These processes are performed iteratively; note that in all iterations except the first, the CNNs are trained to map the ith output onto the corresponding fully sampled image. The calculations performed by the proposed method can be expressed by the following:

x_{i+1} = F^{-1}(R ∘ y_0 + R̄ ∘ F(f_θ(x_i))),  i = 0, 1, 2, …

Here, x_i denotes the ith output of a reconstructed image, R̄ denotes the logical negation of the binary sampling mask R, ∘ denotes element-wise multiplication, f_θ represents the CNN with network weights θ, and F denotes the Fourier operator.
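The iterative update above can be sketched in a few lines of NumPy. Note that the `cnn` argument below is a placeholder callable standing in for the trained U-net f_θ, not the network used in the study:

```python
import numpy as np

def kspace_correct(x_cnn, y0, mask):
    """One k-space correction step: keep the originally sampled k-space
    values (mask = 1) and take the CNN estimate at unsampled positions."""
    k_cnn = np.fft.fft2(x_cnn)                  # F(f_theta(x_i))
    k_corr = mask * y0 + (1 - mask) * k_cnn     # R ∘ y_0 + R̄ ∘ F(f_theta(x_i))
    return np.fft.ifft2(k_corr).real            # x_{i+1} = F^{-1}(...)

def iterative_reconstruct(y0, mask, cnn, n_iter=10):
    """Alternate CNN-based image restoration and k-space correction."""
    x = np.fft.ifft2(mask * y0).real            # zero-filling image x_0
    for _ in range(n_iter):
        x = kspace_correct(cnn(x), y0, mask)
    return x
```

With a fully sampled mask the correction leaves the measured data untouched, so the reconstruction reproduces the original image regardless of the CNN, which illustrates the data-preservation property of the k-space correction step.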

Network Architecture
Several state-of-the-art CNN architectures have been introduced to realize efficient image-to-image transformation. The proposed approach employs a network architecture based on the 2D U-net, a detailed description of which was previously reported [25,26]. This architecture was selected owing to its known superior performance in the medical imaging field. Figure 2 depicts a schematic of the 2D U-net architecture, which comprises encoding (left) and decoding (right) paths. The encoding path corresponds to a typical CNN architecture and is constructed by repeatedly applying two 3 × 3 2D convolutional layers, each followed by batch normalization (BN) and a leaky rectified linear unit (LReLU), with 2 × 2 max pooling for downsampling; the number of feature channels is doubled at each downsampling step. In contrast, each step of the decoding path comprises a 3 × 3 2D deconvolutional layer followed by BN, an LReLU, and upsampling; a skip connection that concatenates the corresponding feature maps from the encoding path; and two 3 × 3 2D convolutional layers, each again followed by BN and an LReLU. A linear activation function was employed for the output layer.
During each iteration, the above-described U-net architecture was trained using the mean-squared-error loss function, expressed by the following:

L(θ) = (1/N) Σ_{j=1}^{N} ||f_θ(x_j) − x_true,j||²

Here, x_true,j denotes the fully sampled image corresponding to the jth zero-filling image, and N denotes the number of training images. The loss function was minimized using the Adam optimizer [27] at a learning rate of 0.0001, and the number of epochs was 100. A mini-batch of 32 images was used for training. These hyperparameters were selected empirically. All U-net processing was performed on a computer running the Ubuntu 16.04 operating system with an NVIDIA Quadro P6000 (Santa Clara, CA, USA) graphics processing unit with 24 GB of memory, TensorFlow 1.9 [28], and Keras 2.2.4 [29].
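As an illustration of the operations named above, the following NumPy sketch implements the LReLU activation, the 2 × 2 max pooling used for downsampling, and the mean-squared-error loss. It is a didactic stand-in, not the TensorFlow/Keras implementation used in the study, and the slope `alpha` is an illustrative choice:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """Leaky rectified linear unit: passes positives, scales negatives."""
    return np.where(x > 0, x, alpha * x)

def max_pool_2x2(x):
    """2x2 max pooling over an (H, W, C) feature map; halves H and W."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def mse_loss(pred, target):
    """Mean squared error between the network output and the target."""
    return np.mean((pred - target) ** 2)
```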


Experimental Setup
The performance of the proposed CS-MRI reconstruction approach was evaluated using a dataset comprising T2-weighted brain MR images extracted from the IXI database [30]. In this study, 7427 and 7460 images (256 × 256 resolution), each set drawn from 57 subjects, were randomly selected for CNN training and testing, respectively. Experiments were performed with Cartesian and radial undersampling masks at sampling rates of 10%, 20%, 30%, and 40%, as indicated in Figure 3 [21].
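A Cartesian undersampling mask of the kind described can be simulated as follows. The fully sampled central band (`n_center`) and the random selection of the remaining phase-encoding lines are illustrative assumptions; the study takes its masks from reference [21], whose exact generation procedure is not reproduced here:

```python
import numpy as np

def cartesian_mask(shape, rate, n_center=8, seed=0):
    """Binary Cartesian mask: fully sample a band of central
    phase-encoding lines, then pick remaining lines at random
    until the requested sampling rate is reached."""
    h, w = shape
    n_lines = int(round(rate * h))
    rng = np.random.default_rng(seed)
    center = np.arange(h // 2 - n_center // 2, h // 2 + n_center // 2)
    rest = np.setdiff1d(np.arange(h), center)
    chosen = rng.choice(rest, size=max(n_lines - len(center), 0), replace=False)
    mask = np.zeros(shape)
    mask[np.concatenate([center, chosen]), :] = 1.0
    return mask

def zero_filling_image(img, mask):
    """Undersample a fully sampled image in k-space and reconstruct
    by zero-filling: x_0 = |F^{-1}(R ∘ F(img))|."""
    return np.abs(np.fft.ifft2(mask * np.fft.fft2(img)))
```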
To evaluate the performance of the proposed iterative method, (1) the standalone image-based U-net and (2) the noniterative k-space correction [23] methods were adopted for comparison.

1. The standalone image-based U-net was trained using the same architecture as that of this study and is expressed by the following:

x = f_θ(x_0)

2. The noniterative k-space correction method was implemented based on Hyun's method [23] and is expressed by the following:

x = F^{-1}(R ∘ y_0 + R̄ ∘ F(f_θ(x_0)))

Here, x denotes the output of the noniterative k-space correction method, which corresponds to the first output of the proposed iterative method.
To ensure a fair comparison, the architectures and hyperparameters of the CNNs in the other methods were set to be identical to those of the proposed approach. The PSNR and SSIM values were calculated to evaluate the performance of the methods for Cartesian and radial sampling masks at the different sampling rates (10%, 20%, 30%, and 40%) [31]. Statistical significance was assessed using a paired t-test.
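The two evaluation metrics can be computed as follows. The `global_ssim` below is a simplified single-window variant included for illustration only; standard SSIM evaluations [31] use a sliding local window with Gaussian weighting:

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB relative to a reference image."""
    mse = np.mean((ref - test) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def global_ssim(x, y, data_range=1.0):
    """SSIM computed over the whole image as a single window
    (a simplification of the usual locally windowed SSIM)."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

For example, a uniform error of 0.1 on a unit-range image gives an MSE of 0.01 and hence a PSNR of 20 dB, and an image compared against itself gives an SSIM of 1.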


Results
We compared the proposed method with the image-based U-net and noniterative k-space correction methods. Figure 4 presents the reconstruction results obtained after ten iterations of the proposed approach using Cartesian and radial sampling masks of 10%. The first to fifth columns show the ground truth and the results of the zero-filling, image-based U-net, noniterative, and proposed methods, respectively. The proposed method reduces aliasing artifacts more effectively than the other methods. The bottom rows contain the error maps of the different methods with respect to the fully sampled image. Although the image-based U-net and noniterative methods substantially reduce artifacts, they cannot remove the artifacts at image edges, such as those at the boundaries of structures; the proposed iterative method reduces these artifacts as well. Table 1 lists the mean and standard deviation (SD) of the PSNR and SSIM values obtained using the different methods with Cartesian and radial sampling masks of 10%. Ten iterations of the proposed method yielded statistically significant improvements (p < 0.001) in both the PSNR and SSIM compared with the other methods and with fewer iterations of the proposed method.

Figures 5 and 6 show box plots of the PSNR and SSIM values obtained with Cartesian and radial sampling masks of 10%, 20%, 30%, and 40%. The columns correspond to the zero-filling, image-based U-net, and noniterative methods and to each iteration of the proposed method (left to right). In each plot, the yellow line within the box represents the median; the lower and upper edges of the box represent the 25th and 75th percentiles, respectively; and the whiskers represent the minimum and maximum values. Ten iterations of the proposed method outperformed the image-based U-net and noniterative methods for all Cartesian and radial sampling masks. In addition, the SSIMs of the noniterative method with a Cartesian sampling mask of 10% (Figure 5) and radial sampling masks of 10% and 20% (Figure 6) are lower than those of the image-based U-net and proposed methods. During training, the proposed method converged within ten iterations in terms of the PSNR and SSIM. These findings demonstrate the effectiveness of the proposed iterative method.

Figure 3. Cartesian and radial sampling masks at the undersampling rates of 10%, 20%, 30%, and 40% considered in this study.

Figure 4. Image reconstruction results obtained with a sampling mask of 10%. Columns correspond to the ground truth and the results obtained using the zero-filling, image-based U-net, noniterative, and proposed iterative methods (left to right). Rows correspond to the reconstructed images and error maps (relative to the fully sampled images) for the Cartesian and radial sampling masks.

Table 1. Quantitative results (mean ± SD) in terms of the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) obtained using the different methods with Cartesian and radial sampling masks of 10%.

Discussion
We have proposed herein an iterative CNN-based method for CS-MRI reconstruction. The method combines image-based CNNs and k-space correction, with the CNNs representing a priori image-space information. The proposed method performs iterative calculations in the image and k-space domains through image restoration using the CNNs followed by k-space correction; in this manner, it can reduce the error introduced by k-space correction. Compared to other techniques, the images reconstructed using the proposed approach demonstrate enhanced quality with fewer aliasing artifacts. These results demonstrate that the proposed method outperforms the U-net trained on the image space and the other noniterative methods by suppressing aliasing artifacts more effectively. For example, Figure 4 reveals reconstruction errors in the ventricles of the brain for a radial sampling mask of 10% when the U-net and noniterative methods are used; in contrast, the proposed method suppresses these reconstruction errors. In addition, the proposed method improved upon the other methods in terms of the PSNR and SSIM for all sampling rates of the Cartesian and radial sampling masks, indicating that it can enhance image quality and decrease acquisition time. Notably, at radial sampling masks of 10% and 20%, the noniterative method does not improve image quality in terms of the SSIM relative to the U-net and proposed methods. These results indicate that the proposed method reduces the error in k-space correction through its iterative application.
Several studies have already compared deep-learning-based methods with conventional CS-MRI algorithms. For example, Eo et al. [22] reported that a three-layered CNN based on Wang's method [18] can outperform conventional CS-MRI algorithms that exploit sparsity in the wavelet transform [8] or dictionary learning [12]. Furthermore, the fastMRI project [32] showed that the U-net performs substantially better than total-variation-based CS-MRI [33]. Considering these reports, we believe that it is sufficient to compare the proposed method with the U-net and noniterative methods.
A major limitation of the present study is that the CNN hyperparameters, i.e., the number of layers, filters, and epochs and the batch size, were determined empirically; an objective optimization of these parameters is still needed. In addition, the experimental data used in this study are real-valued MR images, which differ from the complex-valued k-space data obtained from an actual MRI scanner. Hence, an imaginary channel would have to be added to the input and output CNN layers, or the complex-valued data would have to be preprocessed. In future work, we will focus on evaluations of clinical usefulness by radiologists and on testing with other MR contrasts, such as T1-weighted images.

Conclusions
This paper presented a CS-MRI reconstruction approach in which image-based CNNs and k-space correction are implemented iteratively, with the CNNs representing a priori image-space information. The aliasing artifacts in the images reconstructed using the proposed approach were reduced compared with those of other state-of-the-art techniques. In addition, the quantitative results obtained in the form of the PSNR and SSIM demonstrate the effectiveness of the proposed method. These results indicate that the proposed CS-MRI method enhances MR image quality while enabling high-throughput examinations.