A Novel Reconstruction Algorithm with High Performance for Compressed Ultrafast Imaging

Compressed ultrafast photography (CUP) is a two-dimensional (2D) imaging technique for observing ultrafast processes. The reconstruction method largely determines the imaging quality and is an essential part of a CUP system. However, existing reconstruction algorithms mostly rely on hand-crafted image priors and complex, manually tuned parameter spaces, so obtaining acceptable reconstruction results usually takes a long time, which limits the practical application of CUP. In this paper, we propose a novel reconstruction algorithm named PnP-FFDNet, which provides high quality and high efficiency compared with previous methods. First, we built a forward model of CUP, obtained three sub-optimization problems using the alternating direction method of multipliers (ADMM), and derived the closed-form solution of the first sub-optimization problem. Second, inspired by the PnP-ADMM framework, we used an advanced neural-network-based denoising algorithm named FFDNet to solve the second sub-optimization problem. On real CUP data, PSNR and SSIM are improved by an average of 3 dB and 0.06, respectively, compared with traditional algorithms. On both the benchmark dataset and real CUP data, the proposed method reduces the running time by an average of about 96% compared with state-of-the-art algorithms, while achieving comparable visual results.


Introduction
Compressed ultrafast photography (CUP) is a new ultrafast computational imaging method that can achieve an imaging speed of 10^11 frames per second within a single shot. This technology introduces the framework of compressed sensing [1] into the imaging process of streak cameras, extending the imaging capability of streak cameras from one-dimensional (1D) to two-dimensional (2D). This emerging 2D ultrafast imaging technique is of great significance in revealing the fundamental mechanisms of physics, chemistry, and biomolecules [2]. It has a wide range of applications in fluorescence lifetime detection [3], real-time visualization of laser dynamics [4,5], and wide-field time-of-flight volume imaging [6], among others.
CUP reconstruction refers to establishing an inverse problem model based on the compressed sensing framework and using an iterative algorithm to reconstruct 3D video data from the compressed 2D image captured by CUP. The reconstruction algorithm used in previous CUP works is the two-step iterative shrinkage/thresholding algorithm (TwIST) [7]. Using total variation (TV) regularization in the TwIST reconstruction algorithm easily induces artifacts [8], which limits the spatial-temporal resolution of CUP imaging and the reconstruction efficiency. Ref. [9] introduced spatial and intensity constraints into the original TwIST algorithm by adding a charge-coupled device (CCD) camera to the experimental system, reducing low-intensity artifacts. Ref. [10] proposed applying the plug-and-play alternating direction method of multipliers (PnP-ADMM) [11] to the CUP reconstruction problem, with the block-matching 3D filtering (BM3D) [12] denoising algorithm applied to the solution of one sub-problem; this improves reconstruction quality and effectively suppresses resolution anisotropy and artifacts. However, the variable separation strategy adopted there does not take full advantage of the fact that the sensing matrix is block-diagonal, and the convergence of PnP-ADMM applied to the CUP reconstruction problem is not explained. Moreover, applying BM3D to video data requires denoising each frame separately, which is very time-consuming, whereas neural-network-based denoising algorithms offer much faster computation.
The alternating direction method of multipliers (ADMM) is widely used for constrained optimization problems in image restoration. Built on ADMM, the PnP-ADMM algorithm has a modular structure. Its biggest advantage is that it allows state-of-the-art denoising algorithms to be applied to the solution of the sub-optimization problems without specifying an explicit prior, which greatly improves the flexibility of the algorithm. Therefore, an excellent denoising algorithm can be inserted into the PnP-ADMM framework to improve reconstruction quality. Existing image denoising algorithms can be divided into two categories: model-based methods and discriminative-learning-based methods. Model-based algorithms such as total variation denoising (TVD) [13], BM3D, and weighted nuclear norm minimization (WNNM) [14] are flexible in dealing with different noise levels, but they have some drawbacks: they are generally time-consuming and have many parameters that need manual tuning. Furthermore, they usually rely on hand-crafted priors, such as sparsity [15,16] and non-local self-similarity [17][18][19], which are limited in describing complex image structures. Discriminative learning has been widely studied in image denoising due to its fast inference speed and good performance. Examples include the deep CNN denoiser prior for image restoration (IRCnn) [20], the deep CNN for image denoising (DnCnn) [21], and the fast and flexible solution for CNN-based image denoising (FFDNet) [22]. Their non-linear mapping layers are collections of "Convolution + Batch Normalization + Rectified Linear Units" layers with filters of spatial size 3 × 3. Among them, FFDNet has several desirable properties that make it very suitable as a denoiser within the PnP-ADMM framework.
First, FFDNet takes a noise map as an additional input channel, so a single model can handle a wide range of noise levels and performs well on both synthetic images corrupted by additive white Gaussian noise (AWGN) and real-world noisy images [22,23]. Second, FFDNet reduces the size of the input by down-sampling, which speeds up forward inference. Therefore, compared with the model-based algorithm BM3D, which has excellent denoising performance even on a CPU, FFDNet is faster without sacrificing denoising performance. These properties make FFDNet very suitable as the denoiser in the PnP-ADMM framework. In this paper, we propose a novel reconstruction algorithm based on PnP-ADMM and FFDNet for CUP reconstruction. FFDNet can learn the noise model of the CUP system well. Results on both simulated datasets and real data show that our method performs well in both visual quality and metric evaluation, while greatly reducing the reconstruction time.

Principle of Streak Camera
A streak camera is an ultrafast imaging device that can capture dynamic events occurring on picosecond or even femtosecond timescales. As shown in Figure 1, a long slit with a width of several microns is usually placed in front of the streak camera. The optical signal is converted into an electrical signal in the streak tube, and space-time mapping is performed by an ultrafast scanning unit. A microchannel plate (MCP) multiplies the photoelectrons, and the phosphor screen converts the photoelectrons back into optical signals.

Figure 2a shows a schematic diagram of the CUP imaging system. The experimental system is mainly composed of a streak camera (with integrated CCD), a random binary mask, and a main camera lens. The random binary mask performs intensity modulation on the image of the detection target passing through the main lens: the light intensity modulation factor is 1 in the transparent areas of the mask and 0 in the opaque areas. The difference between CUP and a traditional streak camera is that CUP requires the slit of the streak camera to be fully opened; this is the key to CUP's ability to perform 2D ultrafast imaging. Figure 2b shows how the streak camera works in CUP. Since there is no slit limitation, frames one to four of the encoded images enter the streak camera and are scanned by an ultrafast electric field. The second frame is sheared by one pixel relative to the first frame along the scanning axis, and likewise for the third, fourth frames, etc. Finally, all the images are accumulated into a single compressed image and recorded by the CCD camera. Reconstruction methods are then applied to recover the dynamic ultrafast video from the compressed image. Figure 3 describes the basic workflow of the CUP system.
Four frames are selected from the benchmark dataset runner to simulate the dynamic scene I(x, y, t), with different frames representing different moments of the dynamic scene. Each frame is of size 256 × 256 and is encoded by a random matrix, a {0, 1} randomly distributed binary code. The symbol ⊙ denotes the Hadamard (element-wise) product. The data compression of CUP follows these three steps:

Design of Compressed Ultrafast Photography
Step 1: Encoding. Each frame of the dynamic scene I(x, y, t) is encoded with the same mask, and the encoding operation is denoted as C. The encoded dynamic scene is CI(x, y, t);
Step 2: Shift. A deflection electrode inside the streak camera provides a deflection voltage in the vertical direction, so frames arriving at different times are shifted to different vertical positions. The direction of the offset of each frame is marked in Figure 3. For convenience of later mathematical simplification, it is assumed that each frame is offset by s_0 pixels; the shift operation is denoted as S, and the shifted encoded scene is SCI(x, y, t);
Step 3: Overlay. The receiver of the streak camera is an internal CCD. During the exposure time of the CCD, photons arriving at different times are accumulated. The SCI(x, y, t) from Step 2 is superimposed along the time axis, an operation denoted as T, and the two-dimensional observation Y(m, n) = TSCI(x, y, t) is finally obtained.
The goal of CUP reconstruction is to recover the original three-dimensional dynamic scene I(x, y, t) from the obtained two-dimensional observation Y(m, n). Without loss of generality, the dynamic scene can be viewed as video data with N frames; below, we mathematically describe the CUP imaging process and build a classical inverse problem model.
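As a rough illustration (not the authors' code), the three steps above can be simulated in a few lines of NumPy; the mask, frame sizes, and shift step here are toy values:

```python
import numpy as np

def cup_forward(frames, mask, s0=1):
    """Simulate CUP compression: encode each frame with the mask (Step 1),
    shift frame i down by i*s0 pixels (Step 2), and sum over time (Step 3).
    frames: (N, nx, ny) dynamic scene; mask: (nx, ny) binary code."""
    N, nx, ny = frames.shape
    L = (N - 1) * s0 + nx                      # height of the streak image
    y = np.zeros((L, ny))
    for i in range(N):
        encoded = mask * frames[i]             # Step 1: Hadamard product
        y[i * s0 : i * s0 + nx, :] += encoded  # Steps 2-3: shift + accumulate
    return y

# Toy example: 4 frames of size 8x8, 50% random binary mask
rng = np.random.default_rng(0)
frames = rng.random((4, 8, 8))
mask = (rng.random((8, 8)) > 0.5).astype(float)
y = cup_forward(frames, mask)
print(y.shape)  # (11, 8): L = (4-1)*1 + 8 = 11
```

Because the shift-and-accumulate step only relocates encoded pixels, the total intensity of the observation equals that of the encoded scene.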
The video data X ∈ R^(n_x×n_y×N), the matrix representation of the dynamic scene I(x, y, t) with N frames, are compressed into one frame of two-dimensional observation data Y ∈ R^(L×n_y), the matrix representation of the observation Y(m, n) by the CUP system. Figure 4 shows the relationship between L and n_x: L = (N − 1)s_0 + n_x. The coding matrix C ∈ R^(n_x×n_y) used by the CUP system is a random binary code. The data compression observation model of CUP [10] can be expressed as:

Y = TSCX + Z, (1)

where Z ∈ R^(L×n_y) represents the noise in the observation process, X ∈ R^(n_x×n_y×N) can be expressed as [X_1, X_2, ..., X_N]^T, C represents the encoding process, S represents the shift process, and T represents the accumulating process. T can be represented by the summation notation:

Y = Σ_{i=1}^{N} CircShift(C ⊙ X_i, (i − 1)s_0) + Z, (2)

where ⊙ denotes the Hadamard (element-wise) product and CircShift(A, l) represents the circular translation of l pixels along the vertical direction of the matrix A. To convert the Hadamard product to matrix multiplication, Y, X, and Z are transformed into 1D column vectors:

y = [H_1, H_2, ..., H_N][x_1^T, x_2^T, ..., x_N^T]^T + z, (3)

where x_i = Vec(X_i) ∈ R^(n_x n_y×1), y = Vec(Y) ∈ R^(Ln_y×1), z = Vec(Z) ∈ R^(Ln_y×1), and C_diag = diag(Vec(C)) ∈ R^(n_x n_y×n_x n_y). Figure 5 shows a simple example when the mask is 3 × 3. Equation (3) can be mathematically described as a classic inverse problem model:

y = Hx + z, (4)

where x = [x_1^T, x_2^T, ..., x_N^T]^T ∈ R^(n_x n_y N×1), and the sensing matrix H ∈ R^(Ln_y×n_x n_y N) is a block-diagonal matrix. H can be expressed as:

H = [H_1, H_2, ..., H_N] = [S_1 C_diag, S_2 C_diag, ..., S_N C_diag], (5)

where S_i = CircShift(I_0, (i − 1)s_0 n_y) ∈ R^(Ln_y×n_x n_y), i = 1, 2, ..., N, and s_0 indicates the number of pixels per shift. I_0 ∈ R^(Ln_y×n_x n_y) can be expressed as:

I_0 = [I; O], (6)

where I ∈ R^(n_x n_y×n_x n_y) represents the identity matrix and O represents the zero matrix. A simple case of H_i in Equation (5) when the number of frames is four is shown in Figure 6, and H_i can be written as:

H_i = CircShift(C_0, (i − 1)s_0 n_y), i = 1, 2, ..., N, (7)

where C_0 = I_0 C_diag ∈ R^(Ln_y×n_x n_y). An important property of the sensing matrix is that HH^T is a diagonal matrix, which is used in Section 3.1 to obtain the closed-form solution of the sub-optimization problem. HH^T can be written as HH^T = Σ_{i=1}^{N} H_i H_i^T, and when i = 1, the result of H_1 H_1^T is:

H_1 H_1^T = I_0 C_diag C_diag^T I_0^T = [C_diag C_diag^T, O; O, O], (8)

where C_diag = diag(Vec(C)) ∈ R^(n_x n_y×n_x n_y) is a diagonal matrix and O represents the zero matrix, so H_1 H_1^T is a diagonal matrix. Generalizing to i = 2, ..., N, each H_i H_i^T is still a diagonal matrix, so their sum, and thus HH^T, is also a diagonal matrix.
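The construction of the sensing matrix and the diagonality of HH^T can be checked numerically on a toy example; this sketch assumes column-major vectorization and is not tied to the authors' implementation:

```python
import numpy as np

def build_H(mask, N, s0=1):
    """Build a toy CUP sensing matrix H = [H_1, ..., H_N], where each block
    is the diagonalized mask shifted down by (i-1)*s0 rows of pixels."""
    nx, ny = mask.shape
    L = (N - 1) * s0 + nx
    C_diag = np.diag(mask.flatten(order="F"))     # diag(Vec(C))
    I0 = np.vstack([np.eye(nx * ny),              # identity on top,
                    np.zeros(((L - nx) * ny, nx * ny))])  # zeros below
    blocks = []
    for i in range(N):
        Si = np.roll(I0, i * s0 * ny, axis=0)     # circular vertical shift
        blocks.append(Si @ C_diag)                # H_i = S_i C_diag
    return np.hstack(blocks)

rng = np.random.default_rng(1)
mask = (rng.random((3, 3)) > 0.5).astype(float)
H = build_H(mask, N=4)                            # shape (18, 36)
HHt = H @ H.T
# HH^T should be diagonal: its off-diagonal part vanishes
print(np.allclose(HHt, np.diag(np.diag(HHt))))    # True
```

Each block contributes a diagonal H_i H_i^T, so the sum HH^T stays diagonal, matching the property used for the closed-form solution.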

Algorithm Framework of PnP-ADMM for CUP
According to the forward model of CUP data compression established in Section 2.2, the inverse problem model of Equation (4) can be described as an unconstrained optimization problem:

x̂ = argmin_x f(x) + λg(x), (9)

where f(x) = (1/2)||y − Hx||_2^2 represents the CUP forward imaging model and g(x) represents a certain image prior. ADMM transforms the unconstrained optimization problem (9) into a constrained optimization problem by introducing an auxiliary variable v:

(x̂, v̂) = argmin_{x,v} f(x) + λg(v), subject to x = v. (10)

The minimization problem (10) can be solved by iteratively solving the following three sub-optimization problems [24]:

x^(k+1) = argmin_x f(x) + (ρ/2)||x − (v^(k) − (1/ρ)u^(k))||_2^2, (11)
v^(k+1) = argmin_v λg(v) + (ρ/2)||v − (x^(k+1) + (1/ρ)u^(k))||_2^2, (12)
u^(k+1) = u^(k) + ρ(x^(k+1) − v^(k+1)), (13)

where k represents the number of iterations.
In the CUP reconstruction problem, f(x) = (1/2)||y − Hx||_2^2 represents the forward model of CUP imaging. For convenience of representation, sub-problem (11) is rewritten without the iteration index k:

x̂ = argmin_x (1/2)||y − Hx||_2^2 + (ρ/2)||x − (v − (1/ρ)u)||_2^2. (14)

For determined v, u, H, and y, sub-problem (14) has a closed-form solution:

x̂ = (H^T H + ρI)^{−1}[H^T y + ρ(v − (1/ρ)u)]. (15)

For a large matrix H, inverting H^T H + ρI consumes a lot of computer memory and time. For original data X of size 256 × 256 × 8, the size of H is 67,328 × 524,288, and the size of H^T H is 524,288 × 524,288; it is currently difficult for computers to handle such a large-scale matrix inversion. Inspired by [25], when HH^T is a diagonal matrix, Equation (15) admits an efficient closed-form solution. It was shown in Section 2.2 that H is a block-diagonal matrix and HH^T is diagonal. This feature of H simplifies the inversion of (H^T H + ρI) via the matrix inversion lemma:

(H^T H + ρI)^{−1} = ρ^{−1}I − ρ^{−1}H^T(I + ρ^{−1}HH^T)^{−1}Hρ^{−1}. (16)

Substituting Equation (16) into Equation (15) and letting θ = v − (1/ρ)u, we obtain [25]:

x̂ = θ + H^T(HH^T + ρI)^{−1}(y − Hθ), (17)

where HH^T is a diagonal matrix:

HH^T = diag(ψ_1, ψ_2, ..., ψ_{Ln_y}), (18)

then we can obtain:

x̂ = θ + H^T diag(1/(ψ_1 + ρ), ..., 1/(ψ_{Ln_y} + ρ))(y − Hθ). (21)

Using element-wise division, Equation (21) can be simplified to:

x̂ = θ + H^T[(y − Hθ) ⊘ (s + ρ)], (22)

where s = [ψ_1, ..., ψ_{Ln_y}]^T and the element-wise division ⊘ takes priority in Equation (22). By utilizing the property that HH^T is a diagonal matrix, in each iteration the solution of sub-optimization problem (11) can be completed with a single computation, reducing the computer memory load and improving the efficiency of the algorithm.
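The fast update of Equation (22) can be sketched as follows; the tiny diagonal H is used only so that the fast update can be checked against the direct (memory-hungry) inverse, and the function name is illustrative:

```python
import numpy as np

def x_update(y, H, v, u, rho):
    """Closed-form solution of sub-problem (11), exploiting the fact that
    HH^T is diagonal; a sketch, not the authors' code."""
    theta = v - u / rho                     # noisy estimate v - (1/rho)u
    s = np.einsum("ij,ij->i", H, H)         # diagonal of HH^T (row sums of squares)
    return theta + H.T @ ((y - H @ theta) / (s + rho))

# Sanity check on a tiny random problem: the fast update must match the
# direct solution of (H^T H + rho*I) x = H^T y + rho*theta
rng = np.random.default_rng(2)
H = np.diag(rng.random(6))                  # any H with diagonal HH^T works
y, v, u, rho = rng.random(6), rng.random(6), rng.random(6), 0.5
x_fast = x_update(y, H, v, u, rho)
theta = v - u / rho
x_direct = np.linalg.solve(H.T @ H + rho * np.eye(6), H.T @ y + rho * theta)
print(np.allclose(x_fast, x_direct))        # True
```

The fast path never forms or inverts the huge matrix H^T H + ρI; it only needs the diagonal of HH^T and one matrix-vector product per iteration.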
One of the most important features of the ADMM iteration is its modular structure: problem (11) can be seen as an inversion step, since it includes the forward imaging model f(x), and problem (12) can be seen as a denoising step, because it includes the image prior g(v). If σ² = λ/ρ, problem (12) can be rewritten as [24]:

v^(k+1) = argmin_v g(v) + (1/(2σ²))||v − (x^(k+1) + (1/ρ)u^(k))||_2^2. (23)

Considering x^(k+1) + (1/ρ)u^(k) as a noisy image, problem (23) minimizes the two-norm distance between the noise-free image v and the noisy image x^(k+1) + (1/ρ)u^(k), with the image prior g(v) acting as the regularization term. If g(v) = ||v||_TV, where ||·||_TV represents the total variation norm, it can be calculated by Equation (24) [24]:

||v||_TV = Σ_{i,j} √((v_{i+1,j} − v_{i,j})² + (v_{i,j+1} − v_{i,j})²), (24)

then problem (23) becomes the standard total variation norm denoising problem, namely the TV denoising problem. Based on this intuition, [11] proposed the PnP-ADMM method, which does not specify the image prior g(v) and simply replaces step (12) with a state-of-the-art image denoising algorithm [24]:

v^(k+1) = D_σ(x^(k+1) + (1/ρ)u^(k)), (25)

where D_σ(·) represents some image denoising algorithm. Although it is unclear which image prior the denoising algorithm D_σ(·) corresponds to, the performance of the PnP-ADMM method on image reconstruction problems surpasses other popular image reconstruction algorithms [26][27][28][29].
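Putting the inversion step and the plug-in denoising step (12) together, a minimal PnP-ADMM loop might look like the following sketch; the identity "denoiser" is a stand-in used only to make the loop self-checking, and in practice a denoiser such as FFDNet would be plugged in:

```python
import numpy as np

def pnp_admm(y, H, denoise, rho=1.0, sigma=0.05, iters=30):
    """PnP-ADMM skeleton for y = Hx + z: the x-update uses the closed-form
    inversion step and the v-update plugs in an arbitrary denoiser D_sigma.
    A sketch, not the authors' implementation."""
    n = H.shape[1]
    x, v, u = np.zeros(n), np.zeros(n), np.zeros(n)
    s = np.einsum("ij,ij->i", H, H)                    # diagonal of HH^T
    for _ in range(iters):
        theta = v - u / rho
        x = theta + H.T @ ((y - H @ theta) / (s + rho))  # inversion step (11)
        v = denoise(x + u / rho, sigma)                  # plug-in denoising step
        u = u + rho * (x - v)                            # multiplier update
    return v

# With an identity "denoiser" and H = I, the loop converges to y itself
x_hat = pnp_admm(np.ones(4), np.eye(4), lambda im, s: im)
print(np.round(x_hat, 3))  # [1. 1. 1. 1.]
```

Swapping the lambda for any callable of the form `denoise(image, sigma)` reproduces the modularity that PnP-ADMM is valued for.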

The Architecture of FFDNet
Proposed by Zhang et al. in [22], FFDNet is a single discriminative CNN model. Figure 7 shows the architecture of FFDNet, which consists of a down-sampling layer, a non-linear mapping layer, and an up-sampling layer. The down-sampling layer is a reversible down-sampling operator that reshapes a noisy image into four down-sampled sub-images. FFDNet then concatenates a tunable noise map with the down-sampled sub-images to form a tensor that is the input to the non-linear mapping layer. In the non-linear mapping layer, each sub-layer is a specific combination of three types of operations: convolution (Conv) with filter size 3 × 3, rectified linear units (ReLU), and batch normalization (BN). For the grayscale model, the number of Conv layers is 15 and the number of channels is 64; the noise level ranges from 0 to 75 [23]. After the non-linear mapping layer, an up-scaling operation is applied in the up-sampling layer as the reverse of the down-sampling operator applied at the input stage, producing the estimated clean image with the same shape as the input noisy image. The training dataset is composed of input-output pairs {(Ĩ_i, M_i), I_i}, generated by adding AWGN of level σ_i to clean patches I_i and building the corresponding noise maps M_i. FFDNet, without a residual learning estimate, can denoise the image directly [23]:

F(Ĩ, M; θ) = Î, (26)

thus, the corresponding loss function is [23]:

L(θ) = (1/(2N)) Σ_{i=1}^{N} ||F(Ĩ_i, M_i; θ) − I_i||², (27)

where θ is the collection of all learnable parameters. This architecture and these additional techniques render the algorithm faster, more efficient, and more versatile than other denoising algorithms.
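The reversible down-sampling and noise-map layout described above can be sketched as follows; this reproduces only the input/output tensor shapes of an FFDNet-style model, not the trained network itself, and the function names are ours:

```python
import numpy as np

def ffdnet_input(noisy, sigma):
    """Form the input tensor of an FFDNet-style denoiser: reversible 2x2
    down-sampling into four sub-images plus a uniform noise-level map."""
    h, w = noisy.shape
    subs = [noisy[i::2, j::2] for i in (0, 1) for j in (0, 1)]  # 4 sub-images
    noise_map = np.full((h // 2, w // 2), sigma)                # tunable map
    return np.stack(subs + [noise_map])                         # (5, h/2, w/2)

def ffdnet_output(tensor):
    """Reverse operator: reassemble the four (denoised) sub-images; the
    noise-map channel is consumed by the network and dropped here."""
    _, hh, ww = tensor.shape
    out = np.empty((2 * hh, 2 * ww))
    for k, (i, j) in enumerate([(0, 0), (0, 1), (1, 0), (1, 1)]):
        out[i::2, j::2] = tensor[k]
    return out

img = np.arange(16.0).reshape(4, 4)
t = ffdnet_input(img, sigma=0.1)
print(t.shape, np.allclose(ffdnet_output(t), img))  # (5, 2, 2) True
```

Because the down-sampling is exactly invertible, the network works on quarter-size feature maps without losing pixels, which is the source of FFDNet's speed advantage.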


PnP-ADMM Fixed-Point Convergence for CUP Reconstruction
Ref. [24] demonstrates the fixed-point convergence of the PnP-ADMM algorithm based on the definition of a bounded denoiser and the assumption of bounded gradients.

Definition 1 (Bounded denoiser). A bounded denoiser with a parameter σ is a function D_σ: R^n → R^n such that for any input x ∈ R^n [24]:

(1/n)||D_σ(x) − x||_2^2 ≤ σ²C, (28)

for some universal constant C, independent of n and σ.

Boundedness is a weak condition that we expect most denoisers to satisfy. Next, we show that the bounded-gradient assumption also holds in the CUP reconstruction problem. In the CUP reconstruction problem, the gradient of f(x) is:

∇f(x) = H^T(Hx − y),

where H is a block-diagonal matrix with elements distributed in {0, 1}. All elements of the observation y are non-negative, so H^T y is non-negative. H^T Hx can be viewed as a weighted sum of x, so ||H^T Hx||_2 ≤ n_x n_y N ||x||_2; since all elements in x are normalized to be between 0 and 1, the inequality can be simplified to ||H^T Hx||_2 ≤ n_x n_y N. Therefore, there exists a constant M such that ||∇f(x)||_2 ≤ M, i.e., the bounded-gradient assumption holds in the CUP reconstruction problem. According to the proof of [24], the CUP reconstruction algorithm based on PnP-ADMM has fixed-point convergence; that is, there exists (x*, v*, u*) such that when k → ∞, we have:

x^(k) → x*, v^(k) → v*, u^(k) → u*.
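The bounded-gradient argument can be spot-checked numerically on a toy sensing matrix; this is a sanity check under the normalization assumption (entries of x in [0, 1]), not a substitute for the proof:

```python
import numpy as np

# Toy CUP sensing matrix with entries in {0, 1} (scaled by the binary mask):
# for any x with entries in [0, 1], ||H^T H x||_2 should stay below nx*ny*N.
rng = np.random.default_rng(3)
nx = ny = 3; N = 4; s0 = 1
L = (N - 1) * s0 + nx
C = np.diag((rng.random(nx * ny) > 0.5).astype(float))   # diag(Vec(C))
I0 = np.vstack([np.eye(nx * ny), np.zeros(((L - nx) * ny, nx * ny))])
H = np.hstack([np.roll(I0, i * s0 * ny, axis=0) @ C for i in range(N)])

bound = nx * ny * N                                       # claimed bound
ok = all(np.linalg.norm(H.T @ H @ rng.random(nx * ny * N)) <= bound
         for _ in range(100))
print(ok)  # True
```

For this toy size the bound is loose (each row of H has at most N ones), consistent with the weaker claim that some finite M bounds the gradient.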

PSNR and SSIM on Simulation Datasets
In order to test the reconstruction ability of PnP-FFDNet, eight frames were selected from each of the benchmark datasets runner, kobe, traffic, drop, and crash [25], compressed and encoded following the CUP data compression model, and then reconstructed with PnP-ADMM. A comparison is made with the CUP reconstruction algorithm TwIST used in [4] and with other denoising algorithms used within the PnP-ADMM framework: TVD/BM3D/IRCnn/DnCnn. The computing platform configuration is as follows: the CPU is a 12th Gen Intel(R) Core(TM) i7-12700H at 2.70 GHz, and the GPU is an NVIDIA GeForce RTX 3060 laptop GPU.
The benchmark datasets used in the experiment are all 256 × 256 × 8: the size of each frame is 256 × 256, and the total number of frames is eight. In all subsequent experiments, the offset s_0 of each frame is 1. The dimension of the simulated observation data obtained through the data compression model of CUP is 263 × 256. The encoding mask size is 256 × 256, its elements {0, 1} are randomly distributed, and the sampling rate is 50%. The regularization term of the TwIST algorithm is the total variation (TV) norm of the image, and the denoising algorithm chosen is TVD. Based on practical testing experience, a good CUP reconstruction effect is achieved with the following manually tuned parameters. For the TwIST algorithm, the regularization parameter is set to 0.05, and the loop is exited when the error of the objective function between two adjacent loops is less than 10^−5. The regularization parameter ρ in the PnP-ADMM algorithm is set to 1, and the iteration is exited when ∆_{k+1} is less than or equal to 10^−3, where ∆_{k+1} is [24]:

∆_{k+1} = (1/√n)(||x^(k+1) − x^(k)||_2 + ||v^(k+1) − v^(k)||_2 + ||u^(k+1) − u^(k)||_2).

The experiment uses PSNR and SSIM to evaluate the reconstruction performance. PSNR is based on the error between the reconstructed image and the corresponding pixels of the original image. SSIM measures the structural similarity between the reconstructed image and the original image in three aspects: brightness, contrast, and structure. Tables 1-3 summarize the PSNR, SSIM, and execution time, respectively, of the reconstruction results of the six algorithms. As the learning-based denoising algorithms are implemented in the open-source framework PyTorch, these denoisers can use GPUs to accelerate forward inference; "use GPU" in Table 3 denotes results obtained with GPU-accelerated computing. Figure 8 shows the reconstruction performance of different algorithms on the benchmark dataset.
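For reference, the PSNR metric used here can be computed as follows (a standard definition; the SSIM computation is omitted for brevity):

```python
import numpy as np

def psnr(ref, rec, peak=1.0):
    """Peak signal-to-noise ratio in dB between a reference frame and its
    reconstruction, for images scaled to [0, peak]."""
    mse = np.mean((ref - rec) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

ref = np.zeros((4, 4))
rec = np.full((4, 4), 0.1)        # uniform error of 0.1 -> MSE = 0.01
print(round(psnr(ref, rec), 1))   # 20.0
```

An improvement of 3 dB, as reported on the real CUP data, corresponds to halving the mean squared error.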

From the statistical results in Tables 1-3, it can be seen intuitively that PnP-FFDNet and PnP-BM3D achieve the best reconstruction performance. PnP-FFDNet greatly reduces the time required for reconstruction without losing reconstruction quality. Although other algorithms (such as PnP-TV/PnP-IRCnn/PnP-DnCnn) have advantages in reconstruction time, they sacrifice reconstruction quality. Compared with PnP-BM3D, the execution time of PnP-FFDNet is reduced by an average of 93% on the CPU and 96% on the GPU, while the PSNR and SSIM metrics are very similar. Figure 8 shows the reconstruction performance of the different reconstruction algorithms; one frame out of the eight was selected for comparison. Among them, the drop data have fewer details than the other four datasets, so all six algorithms reconstruct them well. For the traffic data, which contain more image texture, the performance of all six reconstruction algorithms is relatively poor, but the reconstruction results of the PnP-FFDNet and PnP-BM3D algorithms have clearer contours and less noise.

The Performance of PnP-FFDNet on Data with Different Compression Ratios
In order to test the reconstruction performance of the algorithm at different compression ratios, different numbers of frames were intercepted from the drop dataset for CUP data compression encoding, and the PnP-FFDNet and PnP-BM3D algorithms were selected for comparison. For a scene of N_x × N_y pixels per frame and N_t frames, the data compression ratio R is defined as the ratio of the original data size to the size of the sheared observation, R = (N_x · N_y · N_t) / (N_x · (N_y + (N_t − 1) · s_0)) = N_y · N_t / (N_y + (N_t − 1) · s_0). Since each frame in the drop dataset is 256 × 256 and s_0 is 1, the CUP data compression ratio simplifies to R = 256 N_t / (N_t + 255). The larger the number of frames selected for compression coding, the larger the data compression ratio.
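The simplified ratio is consistent with the benchmark setup, where a 256 × 256 × 8 scene with s_0 = 1 yields a 263 × 256 observation. A minimal sketch (the function name is ours):

```python
def cup_compression_ratio(ny, nt, s0=1):
    """Ratio of original data size (nx*ny*nt) to the sheared CUP
    observation size (nx*(ny + (nt-1)*s0)); the width nx cancels out."""
    return ny * nt / (ny + (nt - 1) * s0)

# Drop dataset: 256 x 256 frames, shearing offset s0 = 1
print(cup_compression_ratio(256, 8))   # 8 frames  -> 2048/263 ~ 7.8
print(cup_compression_ratio(256, 39))  # 39 frames -> 9984/294 ~ 34.0
```

As the number of encoded frames grows from 8 to 39, R rises from about 7.8 to about 34, which is the range swept in Figures 9-11.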
As shown in Figures 9-11, as the data compression ratio R increases, both reconstruction metrics, PSNR and SSIM, decrease. When the compression ratio is small, the reconstruction of PnP-BM3D is slightly better than that of PnP-FFDNet; as R increases, the two algorithms become close in PSNR and SSIM, and at some compression ratios PnP-FFDNet even outperforms PnP-BM3D on both metrics. However, as the compression ratio R increases, the time consumption of the PnP-BM3D algorithm grows linearly: when the number of compressed frames is 39, PnP-BM3D takes half an hour to complete the reconstruction, while PnP-FFDNet finishes within one minute, owing to the efficiency of the algorithm itself and the parallel acceleration of the GPU. The inference of FFDNet is greatly accelerated by its use of down-sampling to reduce the computational load, so even on the CPU, PnP-FFDNet is significantly faster than PnP-BM3D. Furthermore, accelerating the inference phase of the network with GPU parallel computing is another reason why PnP-FFDNet executes faster.
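The structure that makes the denoiser swappable (TV, BM3D, IRCnn, DnCnn, FFDNet) is the PnP-ADMM loop itself. The skeleton below is a generic sketch, not the paper's exact derivation: the paper's closed-form x-update is replaced here by a single gradient-style data-fidelity step, and `pnp_admm` and its arguments are illustrative names. The point is that `denoise` is a black box, so any denoiser can be plugged into the v-update.

```python
import numpy as np

def pnp_admm(y, A, At, denoise, rho=1.0, iters=50, tol=1e-3):
    """Generic PnP-ADMM sketch for y = A(x): alternate a data-fidelity
    update with a plug-in denoiser acting as the image prior."""
    x = At(y)
    v = x.copy()
    u = np.zeros_like(x)
    for _ in range(iters):
        x_old, v_old, u_old = x, v, u
        # x-update: one gradient-style step toward fitting y = A(x)
        # (stands in for the closed-form solution derived in the paper)
        x = v - u + At(y - A(v - u)) / rho
        # v-update: denoising of x + u plays the role of the prior
        v = denoise(x + u)
        # u-update: scaled dual variable
        u = u + x - v
        delta = (np.linalg.norm(x - x_old) + np.linalg.norm(v - v_old)
                 + np.linalg.norm(u - u_old)) / np.sqrt(x.size)
        if delta <= tol:
            break
    return v
```

With an FFDNet-style network as `denoise`, the v-update is a single forward pass, which is why GPU acceleration of inference translates directly into faster reconstruction.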

Performance of PnP-FFDNet on Real Data
In order to test the performance of the PnP-FFDNet algorithm on real CUP experimental data, key frames were extracted from the video of the laser pulse reflection process in the Supplementary Material of [4], and the RGB images were converted into grayscale images to form 320 × 320 × 16 original image data; 335 × 320 observation data were then obtained through the CUP data compression model. Reconstruction experiments were performed with TwIST and PnP-TV/BM3D/IRCnn/DnCnn/FFDNet. Figure 12 shows the reconstruction results of the six algorithms on the real CUP experimental data. Each algorithm reconstructs a total of 16 frames, from which 12 consecutive frames with clear pulsed-laser patterns were selected for comparison. Because compressive sampling captures only part of the information, the spatial resolution of the reconstructed images is low, so the text in the image, which contains fine details, cannot be reconstructed well. From the visual comparison of the reconstruction results, the laser reflection process reconstructed by PnP-FFDNet is the closest to the ground truth, and its visual quality, evaluation metrics (PSNR, SSIM), and operating efficiency all exceed those of the other algorithms.

Table 4 shows the reconstruction metrics. Compared with TwIST, PnP-TV, and PnP-BM3D, PSNR is improved by 3.95 dB, 4.61 dB, and 1.85 dB, respectively, and SSIM is improved by 0.07, 0.1, and 0.02, respectively. While achieving a better reconstruction effect, PnP-IRCnn/PnP-DnCnn/PnP-FFDNet require relatively less reconstruction time. The algorithms using learning-based denoisers perform similarly on PSNR, and PnP-FFDNet is the best on SSIM. Although the running time of the PnP-TV algorithm is the shortest, its reconstruction quality is similar to that of TwIST.
The performance on real datasets shows that PnP-FFDNet has excellent performance in reconstruction effect and running efficiency.
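The 335 × 320 observation size above follows directly from the CUP forward model: each frame is multiplied by the coding mask, sheared down by s_0 rows per frame by the streak-camera sweep, and the sheared frames are summed on the detector, so 16 frames of 320 rows with s_0 = 1 occupy 320 + 15 = 335 rows. A minimal sketch (function name and random test data are ours):

```python
import numpy as np

def cup_forward(scene, mask, s0=1):
    """CUP forward model sketch: mask each frame, shear it down by
    s0 rows per frame index, and sum all sheared frames."""
    ny, nx, nt = scene.shape
    y = np.zeros((ny + (nt - 1) * s0, nx))
    for t in range(nt):
        y[t * s0 : t * s0 + ny, :] += mask * scene[:, :, t]
    return y

# Sizes from the real-data experiment, with synthetic content:
rng = np.random.default_rng(0)
scene = rng.random((320, 320, 16))
mask = rng.integers(0, 2, size=(320, 320)).astype(float)
print(cup_forward(scene, mask).shape)  # (335, 320)
```

Reconstruction is the inverse problem: recover the 320 × 320 × 16 scene from this single 335 × 320 measurement and the known mask.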


Conclusions
This paper proposes a CUP reconstruction algorithm that combines the PnP-ADMM framework with the convolutional neural network denoising algorithm FFDNet. Reconstruction experiments were performed on the benchmark dataset and on real CUP experimental data. The results show that the proposed algorithm performs better on PSNR and SSIM. Since the inference of the convolutional neural network can be parallelized on the GPU, the execution time of PnP-FFDNet is greatly reduced compared with PnP-BM3D, without losing reconstruction quality. In the popular CUP experimental configuration, the data depth is often between 150 and 1500; the proposed method can greatly speed up the reconstruction process and is a practical and efficient CUP reconstruction algorithm.
Author Contributions: Formal analysis, Q.S.; methodology, Q.S.; supervision, J.T.; writing-original draft, Q.S.; writing-review and editing, J.T. and C.P. All authors have read and agreed to the published version of the manuscript.