Dual-Channel Reconstruction Network for Image Compressive Sensing

The existing compressive sensing (CS) reconstruction algorithms require enormous computation, and their reconstruction quality is often unsatisfactory. In this paper, we propose a novel Dual-Channel Reconstruction Network (DC-Net) module to build two CS reconstruction networks: the first recovers an image from its traditional random under-sampling measurements (RDC-Net); the second recovers an image from CS measurements acquired by a fully connected measurement matrix (FDC-Net). In particular, the fully connected under-sampling method makes the CS measurements represent the original images more effectively. For both proposed networks, we use a fully connected layer to recover a preliminary reconstructed image, i.e., a linear mapping from the CS measurements to the preliminary reconstructed image. The DC-Net module is then used to further improve the quality of the preliminary reconstructed image. In the DC-Net module, a residual block channel improves reconstruction quality and a dense block channel expedites calculation; their fusion improves reconstruction performance and reduces runtime simultaneously. Extensive experiments show that the two proposed networks outperform state-of-the-art CS reconstruction methods in PSNR and have excellent visual reconstruction effects.


Introduction
In the past decade, compressive sensing [1] theory has achieved great success in signal sampling paradigm because it can obtain high-quality recovery from CS measurements. Based on CS, several new imaging systems have been developed, such as single-pixel camera [2], compressive spectral imaging system [3], Hyperspectral imaging [4], high-speed video camera [5] and fast Magnetic Resonance Imaging (MRI) system [6].
For a given image x ∈ ℝ^N, the CS linear measurements are y = Φx ∈ ℝ^M, where Φ is an M × N measurement matrix and M ≪ N. The original image has a sparse representation x = Ψs, where Ψ is an N × N basis matrix. Compressive sensing is mainly concerned with the problem of recovering the original image x from the CS measurements y, for which there are two kinds of methods: conventional iterative optimization strategies [7][8][9][10][11][12][13][14] and deep learning-based methods [15][16][17][18].
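As a toy illustration of this sampling model, the sketch below (assuming a random Gaussian Φ and a vectorized 33 × 33 image block, the block size used later in the paper) forms CS measurements y = Φx with M ≪ N:

```python
import numpy as np

rng = np.random.default_rng(0)

N = 33 * 33          # vectorized 33x33 image block
M = int(0.25 * N)    # measurement rate MR = 0.25, so M << N

# Random Gaussian measurement matrix Phi (M x N), a common CS choice.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

x = rng.standard_normal(N)   # stand-in for a vectorized image block
y = Phi @ x                  # CS measurements y = Phi x

assert y.shape == (M,) and M < N
```

Recovering x from the M-dimensional y is the ill-posed inverse problem the rest of the paper addresses.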
Early researchers proposed iterative algorithms such as matching pursuit [7], orthogonal matching pursuit (OMP) [8,9], iterative hard thresholding [11], iterative soft thresholding [12] and approximate message passing (AMP) [13,19]. However, these iterative algorithms are usually very slow to converge. To alleviate this difficulty, block-based CS methods have been proposed [20,21], although they still require expensive computation. Inspired by the great success of deep neural networks in computer vision tasks [22,23], learning-based CS reconstruction methods have been developed [15][16][17][18][24]. Compared with the traditional CS methods, deep learning-based methods require an additional training process, which brings the need for a training set. The main contributions of this paper are summarized as follows:

• Unlike deep-learning networks with a very deep single channel, we propose a novel shallow dual-channel reconstruction module for image compressive sensing, in which each channel extracts features at a different level. This brings better reconstruction quality.

• The proposed DC-Net module has two residual blocks and one dense block. Because the dense block has fewer parameters than a residual block, the time complexity of the proposed method is lower than that of DR²-Net with four residual blocks.

• In our method, the two residual blocks in one channel obtain high-level features and the dense block in the other channel obtains low-level features. Experimental results show that both RDC-Net and FDC-Net have better robustness than DR²-Net.

Related Work
There are many traditional optimization algorithms [7][8][9][10][20][26][27] for solving the CS reconstruction problem. Amit Satish Unde et al. proposed a reconstruction algorithm based on iterative re-weighted ℓ1-norm minimization [20]. Metzler et al. proposed a denoising-based AMP framework (D-AMP), which integrated a wide class of denoisers within its iterations [14]. Metzler et al. also developed a novel neural network architecture that mimics the behavior of the denoising-based approximate message passing algorithm (LDAMP) [19]. Jin Tan et al. employed an adaptive Wiener filter as the image denoiser in the AMP framework, called "AMP-Wiener", and extended it to three dimensions, called "AMP-3D-Wiener", for the compressive hyperspectral imaging reconstruction problem [28]. Philip Schniter et al. integrated D-AMP into an auto-tuning method to form D-VAMP [29]. Tipping et al. presented an accelerated training algorithm for sparse Bayesian models, exploiting a recent result concerning the properties of the marginal likelihood function to derive a 'constructive' method for its maximisation [30]. Jiao Wu et al. proposed a stage-wise fast ℓp-sparse Bayesian learning algorithm by integrating a fast sequential learning scheme with a stage-wise strategy for CS reconstruction [31]. Blumensath et al. proposed iterative hard thresholding for compressed sensing [11]. Xiangming Meng et al. presented a unified Bayesian inference framework for generalized linear models (GLMs), which iteratively reduces the GLM problem to a sequence of standard linear model (SLM) problems [32]. Jiang Zhu et al. proposed an approximate message passing-based generalized sparse Bayesian learning (AMP-Gr-SBL) algorithm to reduce the computational complexity of the Gr-SBL algorithm [33]. Jun Fang et al. proposed a 2D pattern-coupled hierarchical Gaussian prior model to exploit the underlying block-sparse structure.
This pattern-coupled hierarchical Gaussian prior model imposes a soft coupling mechanism among neighboring coefficients through their shared hyperparameters [34]. Mohammad Shekaramiz et al. proposed a new sparse Bayesian learning (SBL) method that incorporates a total variation-like prior as a measure of the overall clustering pattern in the solution [35]. Saman et al. presented a generative iterative thresholding algorithm for linear inverse problems with multiple constraints, together with its applications [26]. Bin Kang et al. proposed an efficient image fusion framework for multi-focus images based on compressed sensing. The fusion framework consists of three parts: image sampling, measurement fusion and image reconstruction; it saves computational resources, enhances the fusion result and is easy to implement [36]. Kezhi Li et al. proposed a new class of orthogonal circulant matrices built from deterministic sequences for convolution-based compressed sensing [37]. Nam Yul Yu et al. proposed constructing a filter with real-valued coefficients by taking the discrete Fourier transform of a decimated binary Sidelnikov sequence [38]. Weisheng Dong, Guangming Shi et al. presented a learning method for compressive image recovery, in which PAR models are first learned from a training set and then used to regularize the compressive image recovery process [39]. However, the above algorithms are seriously time-consuming, which has become the bottleneck for the application of image compressive sensing.
In recent years, deep learning-based methods have shown promising performance in compressive image recovery [15][16][17][18][40]. Yu Simiao et al. proposed a conditional Generative Adversarial Network-based deep learning framework for de-aliasing and reconstructing MRI images from highly undersampled data, with great promise for accelerating the data acquisition process [41]. Guang Yang et al. provided a deep learning-based strategy for the reconstruction of CS-MRI, bridging a substantial gap between conventional non-learning methods, which work only on data from a single image, and prior knowledge from large training data sets [40]. Seitzer, Maximilian et al. proposed a hybrid method in which a visual refinement component is learnt on top of an MSE loss-based reconstruction network [42]. Schlemper, Jo et al. proposed a novel cascaded convolutional neural network based on the compressive sensing technique and explored its applicability to improving DT-CMR acquisitions [43]. The stacked denoising autoencoder (SDA) [15] considered the mapping from the original signal to its measurement vector as one layer of the SDA. This measurement method allowed the SDA to adapt its structure to the training set; however, its computational complexity increased with the size of the input image. Kulkarni et al. [16] proposed a block-based network for non-iterative image recovery, which took the CS measurements of an image block as input and output the corresponding reconstructed image block. DR²-Net [17] contained a linear mapping to recover a preliminary reconstructed image, whose quality residual blocks [23] could further improve. Xiaotong Lu, Weisheng Dong et al. [24] proposed a novel convolutional compressive sensing framework (ConvCSNet) based on a deep convolutional neural network, which captures the image measurements by a convolutional operation.
Deep Residual Network: Lately, the deep residual network (ResNet) [23], which consists of many residual blocks, has achieved promising performance on many computer vision tasks such as image recognition [23] and image denoising [44]. Compared with the traditional convolutional network, ResNet introduces identity shortcut connections that directly pass the data flow to later layers. Therefore, we use residual blocks to avoid the loss attenuation caused by multiple non-linear transformations.
Densely Connected Network: Recently, the densely connected network (DenseNet) [22], which consists of many dense blocks, has also obtained enormous success in image detection, classification and semantic segmentation. Compared with the deep residual network [23], DenseNet introduces identity shortcuts to all layers, which makes better use of the information from all features. Especially in reconstruction tasks, the architecture of DenseNet makes comprehensive use of shallow detailed features to recover the original image.
To further improve reconstruction quality and reduce runtime in CS reconstruction, in this paper we use two residual blocks and one dense block to build a dual-channel reconstruction network module. This module improves image reconstruction quality and reduces time complexity simultaneously, and it is used to build two CS reconstruction networks: the first recovers the original image from CS measurements acquired by a random Gaussian measurement matrix (RDC-Net) and the second recovers the original image from CS measurements acquired by a fully connected measurement matrix (FDC-Net).
The remainder of this paper is organized as follows: In Section 3, we introduce dual-channel reconstruction network module and two kinds of reconstruction networks. In Section 4, we design extensive experiments to evaluate our proposed reconstruction networks. Finally, we conclude the paper in Section 5.

Network Architecture
As shown in Figure 1, we propose two kinds of reconstruction networks: RDC-Net and FDC-Net. Firstly, we introduce the traditional random under-sampling approach, the fully connected under-sampling approach and the preliminary reconstruction module. Then we discuss the dual-channel reconstruction network module.

Under-Sampling and Preliminary Reconstruction
In compressive sensing theory [1], there are several under-sampling approaches, such as random Gaussian measurement [45], random Fourier measurement [46] and random Bernoulli measurement [47]. The random Gaussian measurement matrix is the most widely used in CS theory, and we also use a Gaussian measurement matrix in RDC-Net.
In the FDC-Net, we use a fully connected layer (Figure 2b) as the measurement matrix to imitate the traditional under-sampling method in Figure 2a. In particular, this fully connected layer has no bias or activation function, and it learns a linear transformation from the original image to the CS measurements. The random Gaussian measurement matrix and the learned measurement matrix share the same linear form, and we expect the learned measurement matrix to be well adapted to the distribution of the original images. We denote the learned measurement layer as

y_f = W_1 · x,

while the traditional Gaussian measurement method can be expressed as

y_r = W_2 · x,

where W_2 conforms to the Gaussian distribution, x is the original image and y_r, y_f are the corresponding CS measurements of RDC-Net and FDC-Net, respectively. Afterwards, we use a fully connected layer to recover a preliminary reconstructed image x*_c. We denote the preliminary reconstruction module and its parameters as f(·) and Ω^p_c respectively, where c ∈ {RDC-Net, FDC-Net} and the superscript p indicates the preliminary reconstruction module. The preliminary reconstructed image can be expressed as

x*_c = f(y_c, Ω^p_c),

and the mean squared error (MSE) over the training set is used as the loss function:

L(Ω^p_c, W_1) = (1/M) Σ_{i=1}^{M} ||f(y_i, Ω^p_c) − x_i||²,

where M and W_1 represent the number of training samples and the learned measurement matrix, respectively. The back-propagation [48] algorithm is used to minimize this loss function.
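The two linear layers above can be sketched as follows. The matrices `W1` and `Wp` here are toy random initializations (in the actual networks they are learned by back-propagation), so this only illustrates the shapes involved and the MSE computation, not trained behavior:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 1089, 272     # 33*33 pixels, MR close to 0.25

# Learned sampling layer (no bias, no activation): y_f = W1 x.
W1 = rng.standard_normal((M, N)) * 0.01
# Fully connected preliminary-reconstruction layer: x* = Wp y.
Wp = rng.standard_normal((N, M)) * 0.01

x = rng.standard_normal(N)
y_f = W1 @ x            # CS measurements of one image block
x_star = Wp @ y_f       # preliminary reconstructed image block

# MSE training loss over a toy batch of 8 samples.
X = rng.standard_normal((8, N))
recon = (Wp @ (W1 @ X.T)).T
mse = np.mean(np.sum((recon - X) ** 2, axis=1))

assert x_star.shape == (N,) and mse > 0
```

Training both `W1` and `Wp` jointly against this loss is what lets the learned measurement matrix adapt to the image distribution.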

Dual-Channel Network Module
In Section 3.1, we only obtain a preliminary reconstructed image, because the preliminary reconstruction module cannot easily produce an exact solution. The dual-channel network module is then used to further improve the reconstruction quality. In this paper, two residual blocks form one channel and one dense block forms the other, and the two channels are fused to build the dual-channel network module. We first give a brief introduction to the residual block and the dense block.
Compared with the traditional convolutional network, the main difference of the residual network is that it introduces identity connections that directly pass the data flow to later layers. Given an input χ, we expect the output of a few stacked layers to be T(χ). However, it is very expensive to optimize T(·) directly in a traditional convolutional network. In [23], K. He et al. proposed to approximate the residual between T(χ) and χ with the stacked layers, so the residual block (Figure 1c) can be expressed as

T(χ) = F(χ) + χ,

where F(χ) denotes the residual mapping learned by the stacked layers. In [22], Gao Huang et al. proposed the Dense Convolutional Network (DenseNet) for many computer vision tasks. A traditional convolutional network with L layers has L connections, while the DenseNet has L(L+1)/2 direct connections, which strengthens feature propagation, encourages feature reuse and greatly reduces the number of parameters. This kind of network is very useful in the compressive sensing field. In a dense block (Figure 1d), each layer uses all preceding feature maps as its inputs, and its own feature maps are used as inputs to all subsequent layers. In other words, the mth layer takes the feature maps of all preceding layers χ_0, χ_1, ..., χ_{m−1} as inputs:

χ_m = Γ_m([χ_0, χ_1, ..., χ_{m−1}]),

where [χ_0, χ_1, ..., χ_{m−1}] denotes the concatenation of the feature maps produced in layers 0, 1, ..., m − 1, and Γ_m(·) can be regarded as a composite function of four consecutive operations: batch normalization (BN), scale layer, rectified linear unit (ReLU) and convolution (Conv). We denote the dual-channel network module, which contains two residual blocks and one dense block, as H(χ); it can be expressed as

H(χ) = (T_1 ⊗ T_2)(χ) ⊕ D(χ),

where the symbol ⊗ represents the cascaded operation between the two residual blocks T_1, T_2 and ⊕ represents the parallel operation between the residual block channel and the dense block channel D.
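A minimal functional sketch of the two channels follows. The ReLU-only blocks and the elementwise summation used to fuse the channels are simplifying assumptions made here for illustration; the actual DC-Net uses convolutions, batch normalization and a learned fusion:

```python
import numpy as np

rng = np.random.default_rng(2)

def residual_block(x, weight):
    # Stacked layers approximate the residual F(x); output is F(x) + x.
    return np.maximum(weight @ x, 0.0) + x

def dense_block(x, weights):
    # Layer m takes the concatenation of all preceding feature maps.
    feats = [x]
    for W in weights:
        feats.append(np.maximum(W @ np.concatenate(feats), 0.0))
    return feats[-1]

d = 16
x = rng.standard_normal(d)

# Channel 1: two residual blocks in cascade (the "⊗" operation).
W_r1, W_r2 = rng.standard_normal((d, d)), rng.standard_normal((d, d))
chan_res = residual_block(residual_block(x, W_r1), W_r2)

# Channel 2: one dense block; layer m sees features 0..m-1 concatenated.
W_d = [rng.standard_normal((d, d * (i + 1))) for i in range(3)]
chan_dense = dense_block(x, W_d)

# Fuse the two channels in parallel (the "⊕" operation), here by summation.
out = chan_res + chan_dense
assert out.shape == (d,)
```

Note how the dense block's weight matrices grow with depth, reflecting the L(L+1)/2 connection pattern, while each residual block keeps a fixed-size weight.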
In this paper, H(χ) takes x*_c as input and outputs the final reconstruction result, which can be represented as

x̂_c = H(x*_c, Ω^d_c),

where the superscript d indicates the dual-channel network module and Ω^d_c represents its parameters. The loss function of the proposed networks can be expressed as

L(Ω^p_c, Ω^d_c) = (1/M) Σ_{i=1}^{M} ||H(f(y_i, Ω^p_c), Ω^d_c) − x_i||².

Architecture
The architectures of the proposed networks are shown in Figure 1. In the RDC-Net (Figure 1a), we take a 33 × 33 image block as input and acquire CS measurements with the traditional random measurement matrix. In the FDC-Net (Figure 1b), we take the same sized image block as input and acquire CS measurements with the fully connected measurement matrix. From the CS measurements, a preliminary reconstructed image is recovered via a fully connected layer. Then, the dual-channel reconstruction network module H(χ) takes the preliminary reconstructed image as input and outputs a corresponding higher-quality image. Finally, BM3D [49] is used to remove the artifacts caused by block-wise processing.
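The block-wise pipeline can be sketched as below. The `blockify`/`unblockify` helpers are hypothetical names, and edge padding is our assumption for images whose sides are not multiples of 33; each 33 × 33 block would be measured and reconstructed independently before reassembly:

```python
import numpy as np

def blockify(img, b=33):
    # Split an image (H, W) into non-overlapping b x b blocks, edge-padding
    # so both dimensions become multiples of b.
    H, W = img.shape
    ph, pw = (-H) % b, (-W) % b
    img = np.pad(img, ((0, ph), (0, pw)), mode="edge")
    H2, W2 = img.shape
    blocks = (img.reshape(H2 // b, b, W2 // b, b)
                 .transpose(0, 2, 1, 3)
                 .reshape(-1, b, b))
    return blocks, (H, W), (H2, W2)

def unblockify(blocks, orig_shape, padded_shape, b=33):
    # Reassemble blocks into the padded image, then crop to original size.
    H, W = orig_shape
    H2, W2 = padded_shape
    img = (blocks.reshape(H2 // b, W2 // b, b, b)
                 .transpose(0, 2, 1, 3)
                 .reshape(H2, W2))
    return img[:H, :W]

img = np.arange(100 * 70, dtype=float).reshape(100, 70)
blocks, orig, padded = blockify(img)
assert np.allclose(unblockify(blocks, orig, padded), img)
```

Seams between independently reconstructed blocks are exactly the artifacts that the final BM3D step is meant to suppress.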

Experiments
In this section, we perform a multitude of experiments to test the performance of the proposed networks on the Caffe [50] platform. Our computer is equipped with an Intel Core i7-6700 CPU with a frequency of 3.4 GHz and an Nvidia GeForce GTX 1080Ti GPU, and the network framework runs on the Ubuntu system.

Training Data
For a fair comparison, the same dataset [16] is used to generate the training data and test data. We use the luminance component of the images and extract 33 × 33 image patches with stride 14 from 91 images [16] as the training set, and likewise from 5 images [16] as test images. Both RDC-Net and FDC-Net use the same dataset and are trained at MRs = 0.01, 0.04, 0.10 and 0.25. It takes about 8 h to train the proposed networks.
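The patch extraction described above can be sketched like this (the `extract_patches` helper is illustrative, not the authors' code):

```python
import numpy as np

def extract_patches(img, size=33, stride=14):
    # Slide a size x size window with the given stride over a luminance image.
    H, W = img.shape
    patches = [img[i:i + size, j:j + size]
               for i in range(0, H - size + 1, stride)
               for j in range(0, W - size + 1, stride)]
    return np.stack(patches)

img = np.random.default_rng(3).random((128, 128))
patches = extract_patches(img)
# Window positions per axis: floor((128 - 33) / 14) + 1 = 7, so 49 patches.
assert patches.shape == (49, 33, 33)
```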

Training Strategy
The training procedure of RDC-Net and FDC-Net consists of two steps. In the first step, we train the preliminary reconstruction module with a relatively large learning rate to obtain the preliminary reconstructed image and the parameters Ω^p_c. The maximum number of iterations, the learning rate, the step size, the batch size and the gamma are set as 800,000, 0.001, 200,000, 128 and 0.5, respectively. The second step optimizes the preliminary reconstruction module and the DC-Net module with a gradually declining learning rate, updating the parameters Ω^p_c and Ω^d_c. Here, the maximum number of iterations, the learning rate, the decay rate, the decay steps and the batch size are set as 200,000, 0.0001, 0.98, 1000 and 64, respectively.
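With those hyperparameters, the two learning-rate policies can be sketched as simple step functions of the iteration count (a Caffe-style "step" policy for stage one and a 0.98-per-1000-iterations decay for stage two; the function names are ours):

```python
def step_lr(iteration, base=0.001, step=200_000, gamma=0.5):
    # Stage 1: lr is halved (gamma = 0.5) every 200k iterations.
    return base * gamma ** (iteration // step)

def exp_lr(iteration, base=0.0001, decay=0.98, decay_steps=1_000):
    # Stage 2: lr is multiplied by 0.98 every 1k iterations.
    return base * decay ** (iteration // decay_steps)

assert step_lr(0) == 0.001
assert abs(step_lr(400_000) - 0.00025) < 1e-12
assert abs(exp_lr(2_000) - 0.0001 * 0.98 ** 2) < 1e-12
```

The staged schedule lets the linear mapping settle quickly first, then fine-tunes the whole network with small, slowly decaying steps.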

Comparison with Other Methods
In this part, we compare the two proposed networks with existing methods such as NLR-CS [51], D-AMP [14], TVAL3 [10], ReconNet [16], SDA [15], DR²-Net [17] and ConvCSNet [24]. In particular, NLR-CS, TVAL3, D-AMP, ReconNet, DR²-Net, CSRNet and RDC-Net obtain the CS measurements with the traditional random measurement matrix, while SDA [15], ConvCSNet [24], ASRNet and FDC-Net obtain CS measurements with learning-based approaches. The results of TVAL3, NLR-CS, D-AMP, ReconNet and DR²-Net are obtained with the code released by the respective authors; the results of SDA come from our own reproduction; and the results of CSRNet and ASRNet are taken from the paper [25]. In the training stage, we use the default parameters to train these networks several times to obtain multiple test models, which are then used to produce the reconstruction results. We choose PSNR and SSIM as the evaluation criteria. The experimental results are summarized in Tables 1 and 2, where the best results are highlighted in bold.

As shown in Table 1, RDC-Net obtains higher mean PSNR values than the other methods at MRs = 0.10 and 0.25, although on some test images (e.g., Barbara, Fingerprint, Flinstones) other reconstruction methods (NLR-CS or DR²-Net) obtain slightly higher reconstruction quality. We also compare the reconstruction performance of FDC-Net, SDA, ConvCSNet and ASRNet in Table 2. It is obvious that FDC-Net outperforms the other methods at measurement rates 0.01, 0.04, 0.10 and 0.25; especially at MR = 0.25, FDC-Net obtains a 2.3 dB improvement over the second highest value. In Figures 3-5, we compare the visual reconstruction results of FDC-Net, RDC-Net and DR²-Net. Our reconstruction results clearly have better visual effects. For example, Figure 4 is a fingerprint image: in the enlarged patches at all four MRs, our reconstructions have clearer textures, cleaner areas and sharper edges than DR²-Net, whose results show blurred textures and confusing areas.
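The PSNR criterion used throughout these comparisons can be computed as below (assuming 8-bit images with peak value 255):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    # PSNR in dB between a reference image and its reconstruction.
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((8, 8))
noisy = ref + 16.0          # constant error of 16 gray levels
assert abs(psnr(ref, noisy) - 20 * np.log10(255 / 16)) < 1e-9
```

A gain of even 1 dB in mean PSNR corresponds to a noticeable reduction in reconstruction error, which is why the 2.3 dB margin at MR = 0.25 is significant.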

Evaluation on Different Network Architectures
In order to evaluate the effectiveness of our main model, FDC-Net, we design several alternative network architectures: single-channel networks (one-densblock and two-resblocks) and dual-channel networks (one-resblock + one-densblock, two-resblocks + two-densblocks, three-resblocks + one-densblock). "One-densblock" means that we use only the dense block channel (Figure 1d) to recover the image from CS measurements, and "two-resblocks" means that we use only the residual block channel (Figure 1c). "One-resblock + one-densblock", "two-resblocks + two-densblocks" and "three-resblocks + one-densblock" mean that we use one residual block and one dense block, two residual blocks and two dense blocks, and three residual blocks and one dense block, respectively, to improve the preliminary reconstructed image quality. The relevant results are summarized in Table 3, where the best results are highlighted in bold. As shown in Table 3, FDC-Net clearly outperforms the other networks at MRs = 0.04, 0.10 and 0.25. When we use only one channel module (one-densblock or two-resblocks) to recover the original image from its CS measurements, the reconstruction results are good; but when we combine the two channel modules, FDC-Net obtains obviously better performance, probably because the residual block channel improves reconstruction quality while the dense block channel expedites calculation. One-resblock + one-densblock, three-resblocks + one-densblock and two-resblocks + two-densblocks all obtain outstanding performance. Although three-resblocks + one-densblock obtains a higher PSNR than FDC-Net at MR = 0.01, it increases the time complexity and has a lower PSNR than FDC-Net at MRs = 0.04, 0.10 and 0.25. Therefore, we use two residual blocks and one dense block to build the dual-channel reconstruction module.

Robustness to Noise
To show the robustness of the proposed networks to noise, we perform reconstruction experiments in the presence of measurement noise. Standard Gaussian noise is added to the CS measurements of the test set at five levels, δ = 0.01, 0.05, 0.10, 0.25 and 0.5, where δ is the standard deviation of the Gaussian noise. The two proposed networks, trained on noiseless CS measurements, then take the noisy CS measurements as input and output the reconstructed images. Here, we mainly compare three algorithms: DR²-Net, RDC-Net and FDC-Net. The related results are summarized in Figure 6.
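The noise injection protocol can be sketched as follows (the measurement vector here is synthetic; only the noise model matches the experiment):

```python
import numpy as np

rng = np.random.default_rng(4)
M = 272
y = rng.standard_normal(M)          # stand-in for clean CS measurements

for delta in (0.01, 0.05, 0.10, 0.25, 0.5):
    # Additive Gaussian noise with standard deviation delta.
    y_noisy = y + delta * rng.standard_normal(M)
    # A network trained on noiseless measurements would take y_noisy as input.
    rms_err = np.linalg.norm(y_noisy - y) / np.sqrt(M)
    assert rms_err < 3 * delta      # perturbation scale tracks delta
```

Because the networks never see noise during training, how gracefully PSNR degrades as δ grows is a direct measure of robustness.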
From Figure 6, it is obvious that the two proposed networks mostly outperform DR²-Net for δ = 0.01, 0.05, 0.10, 0.25 and 0.5 at all four MRs. In particular, the performance of FDC-Net decays more slowly than that of DR²-Net at MRs = 0.01 and 0.04, which indicates that FDC-Net has outstanding robustness at low measurement rates.

Evaluation on ImageNet Val Dataset
To verify the scalability of the proposed networks, we also compare the two proposed networks with DR²-Net on the large-scale ImageNet val dataset [52], which includes 50,000 images from 1000 classes. The experimental results are shown in Table 4, where the best results are highlighted in bold.
As shown in Table 4, the two proposed networks obtain better performance than DR²-Net at all four MRs. Especially at MR = 0.25, RDC-Net and FDC-Net achieve nearly 3 dB and 5 dB improvements over DR²-Net, respectively, which indicates that our proposed networks have better generalization ability than DR²-Net.

Time Complexity and Network Convergence
We also compare the time complexity of the two proposed networks and DR²-Net. The related results are shown in Table 5, where the best results are highlighted in bold.
From Table 5, we can observe that the two proposed networks have slightly less runtime than DR²-Net, with FDC-Net giving the best results, which is helpful for real-time CS applications. To further demonstrate that our proposed networks have better convergence behavior than DR²-Net, we perform a convergence experiment between FDC-Net and DR²-Net at MR = 0.04. Figure 7 shows that the training error and test error of FDC-Net are smaller than those of DR²-Net, which demonstrates that our network converges more easily.

Conclusions
Inspired by the fact that deep learning-based methods can improve reconstruction performance and enormously reduce computation compared with traditional iterative reconstruction algorithms, we propose a novel dual-channel reconstruction network module (DC-Net module) to build two CS reconstruction networks: the first recovers an image from traditional random under-sampling measurements (RDC-Net); the second recovers an image from CS measurements acquired by a fully connected measurement matrix (FDC-Net). The DC-Net module consists of one dense block and two residual blocks. We use a fully connected layer to obtain a preliminary reconstructed image, and the DC-Net module further improves its quality. Extensive experiments show that our networks outperform state-of-the-art CS algorithms in both PSNR and visual quality. Moreover, our networks also exhibit outstanding robustness and lower time complexity.