In this part, we use the designed bit-rate model and the relative PSNR model to jointly optimize the sampling rate and the bit-depth.
5.1. Rate-Distortion Optimization Algorithm
We substitute the relative PSNR for the distortion in problem (5). The joint optimization problem over the sampling rate
and bit-depth
can then be expressed as follows:
From bit-rate model (17), let
; there is then a correspondence between the sampling rate and the quantization bit-depth, as follows:
The number of candidate bit-depths is smaller than the number of candidate sampling rates , and much smaller than the number of bit-depth/sampling-rate combinations. According to Equation (25), the number of candidate parameters for problem (5) can therefore be reduced to the number of bit-depths.
Therefore, the proposed adaptive CS image coding framework with rate-distortion optimization follows the main steps below:
- (1)
Input: .
- (2)
First sampling.
The sampling rate is , and the original image is measured to obtain the partial measurements .
- (3)
Extracting features.
Calculate the mean , the variance , the maximum , and the minimum of .
- (4)
Reducing the candidate set.
Calculate the sampling rate corresponding to each bit-depth based on Equation (25), obtaining a candidate parameter set , where represents the number of quantization depths.
- (5)
Estimating the optimal parameters.
Estimate the relative PSNR of all candidate parameters with the four-layered feedforward neural network, and select the parameter pair whose relative PSNR is best: is the optimized sampling rate and is the optimized bit-depth.
- (6)
Second sampling.
The sampling rate is , and the original image is measured to obtain the remaining measurements.
- (7)
Quantization and entropy coding.
The measurements from the two samplings are quantized with the bit-depth , and then entropy encoded.
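The parameter-selection core of these steps can be sketched as follows. This is a minimal sketch: `rate_for_depth` (standing in for Equation (25)) and `predict_rel_psnr` (standing in for the trained four-layered network) are hypothetical callables, not the paper's exact implementations.

```python
import numpy as np

def rd_optimize(y1, bit_depths, rate_for_depth, predict_rel_psnr):
    """Pick the (sampling rate, bit-depth) pair with the best predicted
    relative PSNR.  `rate_for_depth` stands in for Equation (25) and
    `predict_rel_psnr` for the trained feedforward network."""
    # Step (3): image features from the first-sampling measurements
    feats = (y1.mean(), y1.var(), y1.max(), y1.min())
    # Step (4): one candidate sampling rate per bit-depth
    candidates = [(rate_for_depth(d, feats), d) for d in bit_depths]
    # Step (5): keep the candidate with the best predicted relative PSNR
    return max(candidates, key=lambda sd: predict_rel_psnr(sd[0], sd[1], feats))
```

The key point of the reduction is visible here: the search is linear in the number of bit-depths rather than quadratic over all rate/depth combinations.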
5.2. Model Parameter Estimation for the Bit-Rate Model and the Relative PSNR Model
In order to estimate the model parameters of the proposed average codeword length model and the relative PSNR model, 100 images in the BSDS500 dataset [
35] were randomly selected for training, and the BSD68 dataset [
36] was used for testing; each image was cropped to 256 × 256. During training, the quantization bit-depth took eight values in {3, 4, …, 10}, and the sampling rate took 49 values: 40 values in {0.01, 0.02, …, 0.4} and 9 values in {1/30, 1/35, 1/40, …, 1/80}. From each image, 392 samples were collected, comprising the average codeword length, the relative PSNR, and their affecting factors, for a total of 39,200 training samples. At the encoder, the same orthogonal Gaussian measurement matrix was first used for block CS sampling with a block size of 32 × 32 (the measurements still obey an approximately Gaussian distribution [
26]), and then uniform quantization and arithmetic coding were performed. At the decoder, arithmetic decoding and inverse quantization were first performed, and then CS reconstruction was performed using a non-local low-rank algorithm (NLR-CS) [
23], in which the initial image was reconstructed with the total variation iterative threshold regularization image reconstruction algorithm (BCS-TVIT) [
37].
The initial sampling rate
determines the accuracy of the image features estimated by
and
. The larger it is, the more accurately the bit rate and PSNR can be estimated; however, if it is too large, there may be unnecessary measurements and calculations. When a Gaussian random matrix is used, the number of measurements required to reconstruct a high-quality signal is at least
[
21], so the best choice of the initial sampling rate m should be
, which is difficult to estimate accurately. We analyzed the sample data of the training set and found that when the sampling rate was lower than 0.013, the visual quality of all reconstructed images was poor, and the PSNR did not exceed 15 dB. Therefore, we used
.
As shown in
Table 1, the parameters of our model (16) were obtained by least-squares fitting with the
in the training set. To quantify the accuracy of the fit, we also measured the mean squared error (MSE), the Pearson correlation coefficient (PCC), and R-squared (
) [
38] between actual
and predicted
in the test set. The closer
and PCC are to 1, the better the model fits.
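For reference, the three fit metrics can be computed from paired actual and predicted values as follows; this is a generic sketch, not tied to the paper's data.

```python
import numpy as np

def fit_metrics(actual, predicted):
    """MSE, Pearson correlation coefficient, and R-squared between
    actual and predicted values (e.g. average codeword lengths)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    mse = np.mean((actual - predicted) ** 2)
    pcc = np.corrcoef(actual, predicted)[0, 1]      # Pearson correlation
    ss_res = np.sum((actual - predicted) ** 2)       # residual sum of squares
    ss_tot = np.sum((actual - actual.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mse, pcc, r2
```

A perfect predictor yields MSE = 0 and PCC = R² = 1, which is why values near 1 indicate a good fit.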
As can be seen from
Table 1, all parameters are non-zero except for the value of
, which verifies the mapping relationship between the sampling rate
, bit-depth
, variance
, interval
, and average codeword length
.
is the coefficient of , and its value is very small. When the fifth term is included, the correlation between and is very weak, indicating that the influence of can be ignored in model (16).
In
Table 2, the R-squared of model (12) reaches 0.9809 and the PCC reaches 0.9904, while the R-squared of model (16) reaches 0.9903 and the PCC reaches 0.9952, a better estimate than that of model (12). The results show that both model (12) and model (16) describe well the relationship between the sampling rate
, bit-depth
, variance
, mean
, and the average codeword length
, and that model (16) is better than model (12). Moreover, bit-rate model (17) based on model (16) has no logarithmic operation, and can quickly calculate the sampling rate based on the bit-depth
and the
to narrow the parameter candidate set, which makes it better suited to practical application.
When collecting data about the relative PSNR, we took
,
for
. We used the “newff” function in MATLAB 2018b to train networks for PSNR, , , , and , respectively, using the same inputs and the same four-layered feedforward network structure for each. The training and testing performances are shown in
Table 3.
Table 3 shows that fitting the PSNR directly, using the same input variables and network structure, performs worst, because PSNR is calculated from the difference between the original image and the reconstructed image. Besides being related to the sampling rate
, quantized bit-depth
, and the variance
and average
of some measurements, PSNR is also closely related to other factors. Compared with the estimated PSNR, the performance of the estimated
,
,
, and
is improved. Among them, the effect of estimating
is the best, which shows that the mapping relationship between sampling rate
, bit-depth
, variance
, mean
, and
is closer than that with
,
, and
. Therefore, we chose
to evaluate distortion.
5.3. Computational Complexity of the Rate-Distortion Optimization Algorithm
The additional computational complexity of the rate-distortion optimization for sampling rate and the bit-depth is mainly derived from feature extraction, rate estimation, and relative PSNR estimation.
The computation for extracting features mainly comes from the , , , and values. Assuming the image size is and the block size is 32 × 32, the number of measurements obtained by the first sampling is . Calculating requires additions and one multiplication; calculating requires additions and multiplications. The and require a total of up to comparisons; assuming that a comparison requires two subtractions, a total of subtractions are required. The first sampling itself requires additions and multiplications. Assuming subtraction and addition have the same cost, extracting the features requires a total of additions and multiplications. Compared with the first sampling, feature extraction thus adds only 0.11% more multiplications and 0.59% more additions.
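The order of magnitude of this overhead can be tallied in a few lines. The per-feature costs below are our own assumptions (the exact expressions are elided in the text); the point is only that the overhead is well under 1% of the first-sampling cost.

```python
# Rough operation tally for feature extraction, under assumed per-feature
# costs: mean = M - 1 adds + 1 division; variance ~ 2M adds + M + 1 mults;
# min/max = up to 2(M - 1) comparisons, each counted as 2 subtractions.
N = 256 * 256                      # image size
M = round(0.013 * N)               # first-sampling measurements (852)
meas_mul, meas_add = 1024, 1023    # cost of one 32 x 32-block measurement

samp_mul, samp_add = M * meas_mul, M * meas_add    # first-sampling cost
feat_mul = (M + 1) + 1 + 1                         # variance mults + divisions
feat_add = (M - 1) + 2 * M + 2 * 2 * (M - 1)       # mean, variance, min/max

mul_pct = 100 * feat_mul / samp_mul
add_pct = 100 * feat_add / samp_add
```

Under these assumed counts, both percentages come out below 1%, consistent in magnitude with the 0.11% and 0.59% figures above.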
The computation of the rate estimation process mainly comes from evaluating Equation (25). Since the bit-depth takes only a finite set of discrete values, in the equation can be obtained from a lookup table. Evaluating Equation (25) once then requires seven additions and seven multiplications; with seven candidate bit-depths, Equation (25) requires a total of 49 additions and 49 multiplications.
The computation of the relative PSNR estimation mainly comes from evaluating the neural network model (23). The network input layer has four neurons and the output layer has one neuron; the network has two hidden layers, each with six neurons. The number of network parameters is 4 × 6 + 6 + 6 × 6 + 6 + 6 × 1 + 1 = 79. Excluding activation functions, the network performs 4 × 6 + 6 × 6 + 6 × 1 = 66 multiplications and 3 × 6 + 6 + 5 × 6 + 6 + 5 + 1 = 66 additions. The hidden layers use the sigmoid activation function. Assuming the exponential is computed by series approximation, at a precision of it takes about 60 multiplications and 10 additions to evaluate one activation function; evaluating 12 activation functions requires 720 multiplications and 120 additions. One evaluation of the network model therefore takes about 782 multiplications and 182 additions. If we select seven bit-depths as candidate values, the relative PSNR of seven candidate parameters must be estimated, for a total of 5474 multiplications and 1274 additions.
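The parameter and multiply/add counts above (excluding activations) can be checked with a small helper; the layer widths [4, 6, 6, 1] follow the description of model (23).

```python
def layer_counts(sizes):
    """Parameters and raw multiply/add counts of a fully connected
    network with layer widths `sizes`, ignoring activation functions.
    Each neuron with n inputs costs n multiplications and n additions
    (n - 1 for the dot product plus 1 for the bias)."""
    pairs = list(zip(sizes, sizes[1:]))
    params = sum((n_in + 1) * n_out for n_in, n_out in pairs)  # weights + biases
    muls = sum(n_in * n_out for n_in, n_out in pairs)
    adds = muls
    return params, muls, adds
```

For the 4-6-6-1 network this gives 79 parameters and 66 multiplications and 66 additions per forward pass, matching the tallies above before the activation-function cost is added.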
A measurement requires 1024 multiplications and 1023 additions, so the computation of the bit-rate and relative-PSNR estimates does not exceed the multiplications of six measurements and the additions of two measurements. When compressing an image of size 256 × 256, the first sampling obtains 852 measurements; the estimation of the bit rate and relative PSNR increases the multiplications by and the additions by . Compared with the computation of the first sampling, the entire rate-distortion optimization process therefore adds only 0.81% more multiplications and 0.82% more additions.