Three-Dimensional Digital Zooming of Integral Imaging under Photon-Starved Conditions

In this paper, we propose a new three-dimensional (3D) visualization method for objects at long distance under photon-starved conditions. In conventional three-dimensional imaging techniques, the visual quality of three-dimensional images may be degraded because object images at long distances have low resolution. Thus, our proposed method utilizes digital zooming, which crops and interpolates the region of interest in the image, to improve the visual quality of three-dimensional images at long distances. Under photon-starved conditions, three-dimensional images at long distances may not be visualized at all due to the lack of photons. Photon counting integral imaging can be used to solve this problem, but objects at long distance may still yield only a small number of photons. In our method, a three-dimensional image can nevertheless be reconstructed, since photon counting integral imaging is combined with digital zooming. In addition, to estimate a more accurate three-dimensional image at long distance under photon-starved conditions, multiple-observation photon counting integral imaging (i.e., N observation photon counting integral imaging) is used. To show the feasibility of our proposed method, we carry out optical experiments and calculate performance metrics such as the peak sidelobe ratio. The results show that our method can improve the visualization of three-dimensional objects at long distances under photon-starved conditions.


Introduction
Three-dimensional (3D) visualization of objects at long distances under photon-starved conditions has been a great challenge in many applications, such as the military, astronomy, and the observation of wild animals. In the military case, defense or reconnaissance operations that search for enemies at long distances by day or night are required. In astronomy, observing stars billions of light years away is a critical problem. In addition, observing wild animals, which are often nocturnal and wary, also requires such capabilities.
However, it is difficult to visualize three-dimensional objects located at long distances with conventional imaging methods, since the lateral and longitudinal resolutions of the image at long distance may be reduced by the limitations of the optical devices and the image sensor. When a camera takes a picture, an object at long distance occupies fewer pixels than a close one. Therefore, the lateral and longitudinal resolutions (i.e., the three-dimensional resolution) of the image are reduced for objects at long distance. To visualize three-dimensional objects at long distance, integral imaging [1][2][3], first proposed by G. Lippmann, can be utilized. It uses two-dimensional (2D) images with different perspectives captured by a lenslet array or a camera array, where these images are referred to as elemental images. Integral imaging can provide full parallax and continuous viewing points of three-dimensional objects without any viewing glasses or coherent light sources [1][2][3][4][5][6][7][8]. However, due to the limited three-dimensional resolution for objects at long distances, the visual quality of three-dimensional images at long distances may be degraded. This resolution problem becomes critical under photon-starved conditions: because the image sensor detects fewer photons carrying the information of an object at long distance, the elemental images may not capture the object at all. That is, the visual quality of three-dimensional images is degraded even further under photon-starved conditions.
To visualize three-dimensional objects under photon-starved conditions, photon counting integral imaging [9][10][11] has been proposed. It models a photon detector computationally by a statistical distribution, such as the Poisson distribution, because photons occur rarely in unit time and space [11]. In addition, for three-dimensional image reconstruction, statistical estimation methods, such as maximum likelihood estimation (MLE) [9][10][11] or Bayesian approaches [12][13][14], are utilized. However, photon counting integral imaging may not estimate accurate three-dimensional images of objects at long distances under photon-starved conditions, since the object images may have low resolution and an insufficient number of photons. Therefore, to solve these problems, a new three-dimensional image visualization technique is required.
In this paper, we propose three-dimensional digital zooming of integral imaging under photon-starved conditions. It magnifies the region of interest (ROI) in elemental images captured by synthetic aperture integral imaging (SAII) [15]. Then, three-dimensional images at long distances are obtained by volumetric computational reconstruction (VCR) [16][17][18][19][20][21][22][23] and photon counting integral imaging [9][10][11][12][13][14]. Under photon-starved conditions, computational photon counting imaging detects photons throughout the scene, which may degrade the resolution of objects due to the lack of photons. In our method, however, photon counting imaging is applied only to the ROI of the elemental images to visualize three-dimensional images at long distances. Therefore, more photons can be extracted from the ROI of the elemental images. In addition, multiple observations of photon counting imaging are considered in our method, which we call "N observation photon counting imaging". It improves the visual quality of the images under photon-starved conditions, since photons are detected randomly in each observation and multiple observations increase the number of samples. Additionally, to estimate more accurate three-dimensional images under photon-starved conditions, statistical estimation methods, such as maximum likelihood estimation (MLE), are used in our method. This paper is organized as follows. We describe the basic concepts of optical and digital zooming and of integral imaging in Section 2. Then, we introduce the computational photon counting method and our proposed method in Section 3. To show the feasibility of our proposed method, we present experimental results in Section 4. Finally, we conclude with a summary in Section 5.

Related Work
In this section, we briefly present the concept of optical and digital zooming and integral imaging.

Zooming
In general, to visualize objects at a long distance, two zooming methods, optical and digital zooming, can be applied. In optical zooming, objects at long distances are magnified by changing the positions of lenses (i.e., using a zoom lens). This obtains the best visual quality without any interpolation, but it requires more complicated optical devices and is more expensive. In digital zooming, on the other hand, objects at long distances are easily visualized using various interpolation methods, so it is more cost-effective and simpler; however, the visual quality is worse than that of optical zooming. Recently, two kinds of digital zooming methods have been utilized: interpolation-based and deep learning-based methods. Interpolation-based methods magnify the image by digital image processing.
Common interpolation methods include the "nearest", "bilinear", and "bicubic" methods. These methods use nearby pixels to interpolate a pixel value at a new location [24]; they differ in the number of pixels used. The "nearest" method uses the single nearest pixel, the "bilinear" method uses the four nearest pixels weighted by distance, and the "bicubic" method interpolates the new pixel from 16 weighted pixels. The "nearest" method is the fastest, but it degrades resolution. The "bicubic" method gives the best interpolation result, but it is slow. The "bilinear" method offers a good compromise between quality and speed. Recently, several novel interpolation methods have been used, such as the "Lagrange-Chebyshev" and "de la Vallée-Poussin" methods [25,26]. In addition, a deep learning-based method, which estimates the magnified image with a neural network [27], has been reported. It can recover more detail in the zoomed region with a high magnification ratio in real time, but a huge amount of data and time is required to train the network.
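To illustrate the trade-off described above, the following minimal sketch (not the authors' code; pure Python with hypothetical function names) implements bilinear interpolation by weighting the four nearest pixels by their distances to the new location:

```python
def bilinear_resize(img, new_h, new_w):
    """Resize a 2D grayscale image (list of lists) with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = [[0.0] * new_w for _ in range(new_h)]
    for i in range(new_h):
        for j in range(new_w):
            # map the output pixel back to fractional source coordinates
            y = i * (h - 1) / (new_h - 1) if new_h > 1 else 0.0
            x = j * (w - 1) / (new_w - 1) if new_w > 1 else 0.0
            y0, x0 = int(y), int(x)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            dy, dx = y - y0, x - x0
            # weight the four nearest pixels by distance
            out[i][j] = (img[y0][x0] * (1 - dy) * (1 - dx)
                         + img[y0][x1] * (1 - dy) * dx
                         + img[y1][x0] * dy * (1 - dx)
                         + img[y1][x1] * dy * dx)
    return out
```

The "nearest" method would instead copy the single closest pixel, and "bicubic" would extend the same idea to a weighted 4 × 4 neighbourhood.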
Cameras in state-of-the-art smartphones combine interpolation with deep learning to obtain the best quality for images at long distances. In this paper, we use an interpolation method for digital zooming of objects at long distance because it is easy and fast to implement. To visualize three-dimensional objects at long distances, we use integral imaging, which is described in the next section.

Integral Imaging
Integral imaging is a passive three-dimensional visualization technique. Figure 1 illustrates the basic concept of integral imaging with a lenslet array or a camera array. In the pickup stage, multiple two-dimensional images with different perspectives (i.e., elemental images) are recorded from the scene by the lenslet array or camera array. In optical reconstruction (display), a three-dimensional image is displayed by projecting the elemental images through the same kind of lenslet array as used in the pickup stage. Alternatively, a three-dimensional image can be reconstructed computationally by back-projecting the elemental images through a virtual pinhole array onto the reconstruction plane; this is called volumetric computational reconstruction (VCR) [16][17][18][19][20][21][22][23]. In integral imaging, two different pickup methods can be used: a lenslet array or a camera array. Lenslet array-based integral imaging, shown in Figure 1, records the elemental images in a single shot, but the resolution of these elemental images is low since the resolution of the image sensor is divided by the number of lenslets. Camera array-based integral imaging, also shown in Figure 1, records elemental images with the same resolution as the image sensor; it is called synthetic aperture integral imaging (SAII) [15]. However, it is difficult to align and synchronize the camera array, and since moving a single camera to avoid these problems makes pickup slow, capturing dynamic scenes may be impossible.
To visualize three-dimensional images computationally, the elemental images are back-projected through the virtual pinhole array onto the reconstruction plane, as shown in Figure 2. Reconstruction is the reverse of the pickup stage. When the elemental images are back-projected through the virtual pinhole array onto the reconstruction plane, they are shifted by different numbers of pixels at different reconstruction depths. The number of shifting pixels can be found as follows [23]:

\Delta S_x = \frac{N_x p_x f}{c_x z}, \qquad \Delta S_y = \frac{N_y p_y f}{c_y z}   (1)

where \Delta S_x, \Delta S_y are the numbers of shifting pixels for the elemental images in the x and y directions, c_x, c_y are the sensor sizes, N_x, N_y are the numbers of pixels of each elemental image, p_x, p_y are the pitches between elemental images, f is the focal length of the virtual pinhole, and z is the reconstruction depth. Using the shifting pixel values calculated by Equation (1), the elemental images are overlapped with each other on the reconstruction plane. Then, three-dimensional images are obtained by [23]

\Delta S_x^k = \mathrm{round}(k \times \Delta S_x), \quad k = 0, 1, \ldots, K-1   (2)
\Delta S_y^l = \mathrm{round}(l \times \Delta S_y), \quad l = 0, 1, \ldots, L-1   (3)
O(x, y, z) = \sum_{k=0}^{K-1} \sum_{l=0}^{L-1} \mathbf{1}\big(x + \Delta S_x^k,\; y + \Delta S_y^l\big)   (4)
I(x, y, z) = \frac{1}{O(x, y, z)} \sum_{k=0}^{K-1} \sum_{l=0}^{L-1} E_{kl}\big(x + \Delta S_x^k,\; y + \Delta S_y^l\big)   (5)

where K, L are the numbers of elemental images in the x and y directions, \Delta S_x^k, \Delta S_y^l are the rounded shifting pixels of the kth-column, lth-row elemental image, O(x, y, z) is the overlapping matrix for the reconstructed three-dimensional image at reconstruction depth z, \mathbf{1} is the ones matrix, E_{kl} is the kth-column, lth-row elemental image, and I(x, y, z) is the reconstructed three-dimensional image. Equations (1)-(5) constitute the VCR [23]. N_x, N_y, p_x, p_y, f, c_x, and c_y are constants, and only z is a variable. Thus, as the reconstruction depth increases, the shifting pixel value and the depth resolution change. Our method uses digital zooming to enhance the shifting pixel values and the depth resolution in the computational reconstruction of integral imaging.
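The shift-and-overlap procedure of VCR described above can be sketched as follows (a simplified illustration under the stated geometry; parameter and function names are assumptions, not the authors' code):

```python
import numpy as np

def reconstruct_plane(elemental, Nx, Ny, cx, cy, px, py, f, z):
    """VCR sketch: back-project a K x L set of elemental images onto the
    reconstruction plane at depth z. elemental[k][l] is an Ny x Nx array."""
    K, L = len(elemental), len(elemental[0])
    dSx = (Nx * px * f) / (cx * z)   # shift per camera step in x (Eq. 1)
    dSy = (Ny * py * f) / (cy * z)   # shift per camera step in y
    # output canvas large enough to hold the maximum shift
    H = Ny + round((L - 1) * dSy)
    W = Nx + round((K - 1) * dSx)
    acc = np.zeros((H, W))           # accumulated intensities
    ovl = np.zeros((H, W))           # overlap count O(x, y, z)
    for k in range(K):
        for l in range(L):
            sx, sy = round(k * dSx), round(l * dSy)   # rounded shifts
            acc[sy:sy + Ny, sx:sx + Nx] += elemental[k][l]
            ovl[sy:sy + Ny, sx:sx + Nx] += 1.0
    # I(x, y, z) = accumulated sum divided by the overlap matrix
    return acc / np.maximum(ovl, 1)
```

Pixels covered by more elemental images are averaged over more samples, which is why objects on the chosen depth plane appear in focus while others blur.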
However, under photon-starved conditions, the detailed information in the ROI of objects at long distance may be lost due to the lack of photons. To enhance the visual quality of three-dimensional images at long distances, we utilize computational photon counting integral imaging, as described in the next subsection.

Computational Photon Counting Imaging
Under photon-starved conditions, it is difficult to record the information of objects in the scene with conventional imaging methods due to the lack of photons. To overcome this problem, computational photon counting imaging is utilized in our method. Photons can be modeled by a Poisson random process under these conditions because photons occur rarely in unit time and space [11]. Computational photon counting imaging can be described as follows [9][10][11][12][13][14]:

\lambda(x, y) = \frac{I(x, y)}{\sum_{x}\sum_{y} I(x, y)}   (6)
C(x, y) \,\big|\, \lambda(x, y) \sim \mathrm{Poisson}\big(N_p\, \lambda(x, y)\big)   (7)

where \lambda(x, y) is the normalized irradiance of the elemental image, which has unit energy, I(x, y) is the recorded two-dimensional image, C(x, y) is the photon counting image, and N_p is the expected number of photons extracted from the elemental image. Since \lambda has unit energy by Equation (6), N_p photons can be extracted randomly from the recorded two-dimensional image by Equation (7). To obtain a three-dimensional image under photon-starved conditions, photon counting integral imaging [11] can be used. First, to estimate the recorded two-dimensional image from the photon counting image, maximum likelihood estimation (MLE) is applied: maximizing the likelihood L(\lambda_{kl}) of the Poisson observations through the log-likelihood l(\lambda_{kl}) yields [9][10][11]

\hat{\lambda}_{kl}(x, y) = \frac{C_{kl}(x, y)}{N_p}   (11)

where \lambda_{kl} is the normalized irradiance of the kth-column, lth-row elemental image, L(\cdot) and l(\cdot) are the likelihood and log-likelihood functions, and \hat{\lambda}_{kl} is the estimated two-dimensional image of the scene for the kth-column, lth-row elemental image. By using MLE and VCR, a three-dimensional image can be visualized under photon-starved conditions as [9][10][11]

I_p(x, y, z) = \frac{1}{O(x, y, z)} \sum_{k=0}^{K-1} \sum_{l=0}^{L-1} \hat{\lambda}_{kl}\big(x + \Delta S_x^k,\; y + \Delta S_y^l\big)   (12)

where I_p(x, y, z) is the three-dimensional image under photon-starved conditions obtained by photon counting integral imaging. However, three-dimensional objects at long distance under photon-starved conditions may still not be visualized due to the lack of photons in the ROI of the scene.
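The Poisson photon counting model and its MLE can be sketched in a few lines (an illustrative implementation of the stated model; function names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility

def photon_count(image, n_photons, rng=rng):
    """Simulate a photon counting image: normalize the irradiance to unit
    energy, then draw Poisson counts with mean Np * lambda at each pixel."""
    lam = image / image.sum()            # normalized irradiance (unit energy)
    return rng.poisson(n_photons * lam)  # photon counting image C(x, y)

def mle_estimate(counts, n_photons):
    """MLE of the normalized irradiance for Poisson counts: lambda_hat = C / Np."""
    return counts / n_photons
```

The estimated irradiance maps would then be fed into VCR in place of the original elemental images to reconstruct the photon-starved 3D scene.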
Therefore, in this paper, we present digital zooming and photon counting integral imaging for three-dimensional objects at long distances under photon-starved conditions, as described in the next section. In our method, we use a modified VCR for digital zooming, in which the shifting pixels are recalculated because the elemental images are cropped to the ROI of the scene. In addition, to improve the visual quality of the photon counting images, N observation photon counting imaging is proposed.

Three-Dimensional Digital Zooming of Integral Imaging and N Observation Photon Counting Integral Imaging under Photon-Starved Conditions
As mentioned earlier, conventional integral imaging suffers from degraded three-dimensional image resolution and depth resolution for objects at long distance. Under photon-starved conditions, the resolution of three-dimensional images of objects at long distances is degraded even further. To solve these resolution problems, our method utilizes digital zooming and VCR to visualize three-dimensional images at long distances, as illustrated in Figure 3. Digital zooming is applied to obtain new elemental images by cropping the ROI from the original elemental images before overlapping them with shifting pixels on the reconstruction plane. When the ROI is cropped from an original elemental image, the aspect ratio of the original elemental image is preserved. The new elemental images are then interpolated to the same size as the original elemental images. The new VCR for three-dimensional objects at long distances, taking digital zooming into account, can be written as follows:

\tilde{N}_x = N_x / m, \qquad \tilde{N}_y = N_y / m   (13)
\tilde{z} = z / m   (14)
\Delta\tilde{S}_x = \frac{N_x p_x f}{c_x \tilde{z}}   (15)
\Delta\tilde{S}_y = \frac{N_y p_y f}{c_y \tilde{z}}   (16)
\Delta\tilde{S}_x^k = \mathrm{round}(k \times \Delta\tilde{S}_x), \quad k = 0, 1, \ldots, K-1   (17)
\Delta\tilde{S}_y^l = \mathrm{round}(l \times \Delta\tilde{S}_y), \quad l = 0, 1, \ldots, L-1   (18)
\tilde{O}(x, y, \tilde{z}) = \sum_{k=0}^{K-1} \sum_{l=0}^{L-1} \mathbf{1}\big(x + \Delta\tilde{S}_x^k,\; y + \Delta\tilde{S}_y^l\big)   (19)
\tilde{I}(x, y, \tilde{z}) = \frac{1}{\tilde{O}(x, y, \tilde{z})} \sum_{k=0}^{K-1} \sum_{l=0}^{L-1} \tilde{E}_{kl}\big(x + \Delta\tilde{S}_x^k,\; y + \Delta\tilde{S}_y^l\big)   (20)

where \tilde{N}_x, \tilde{N}_y are the width and height of the ROI cropped from the original elemental image, m is the real-valued zooming ratio, which is greater than one, \tilde{z} is the zooming distance obtained by digital zooming, \Delta\tilde{S}_x, \Delta\tilde{S}_y are the numbers of shifting pixels for the new elemental images, \Delta\tilde{S}_x^k, \Delta\tilde{S}_y^l are the rounded numbers of shifting pixels of the kth-column, lth-row new elemental image, \tilde{O}(x, y, \tilde{z}) is the new overlapping matrix, \tilde{E}_{kl} is the kth-column, lth-row new elemental image, and \tilde{I}(x, y, \tilde{z}) is the three-dimensional image obtained by digital zooming. The size of the ROI is always smaller than that of the original elemental images, and the new elemental images are interpolated to the same size as the original elemental images.
Thus, through Equations (13)-(20), the shifting pixels are recalculated, and a three-dimensional image by digital zooming can be obtained.
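The cropping, interpolation, and depth rescaling steps above can be sketched as follows (nearest-neighbour upsampling is used here only for brevity, although the paper favours bilinear; all names are illustrative assumptions):

```python
import numpy as np

def digital_zoom(elem, m):
    """Crop the central ROI (1/m of each dimension, preserving aspect ratio)
    and interpolate it back to the original elemental-image size."""
    H, W = elem.shape
    rh, rw = int(H / m), int(W / m)          # ROI size (Eq. 13)
    y0, x0 = (H - rh) // 2, (W - rw) // 2    # centered crop
    roi = elem[y0:y0 + rh, x0:x0 + rw]
    # nearest-neighbour upsampling back to H x W
    yi = (np.arange(H) * rh / H).astype(int)
    xi = (np.arange(W) * rw / W).astype(int)
    return roi[np.ix_(yi, xi)]

def zoomed_depth(z, m):
    """Eq. (14): reconstruction depth scales down by the magnification ratio."""
    return z / m
```

For the simulation geometry in the paper, an object at 1600 mm with magnification ratio 8 reconstructs at the zoomed depth 1600 / 8 = 200 mm.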
Three-dimensional objects at long distances under photon-starved conditions can be visualized by combining our digital zooming with computational photon counting integral imaging. However, since the new elemental images have limited resolution, photon counting integral imaging alone may not produce three-dimensional images of sufficient visual quality. Therefore, in this paper, we propose N observation photon counting imaging, as depicted in Figure 4. In this method, photon counting images are generated and estimated N times; the estimates are then accumulated and averaged. The resulting estimated images have better visual quality, since the number of samples for each photon counting image increases. To verify the feasibility of our method, we describe our experimental setup and results in the next section.
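A minimal sketch of N observation photon counting (averaging N independent Poisson observations; illustrative names, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(1)  # seeded for reproducibility

def n_observation_estimate(image, n_photons, N):
    """Average N independent photon counting observations. Each observation
    is an independent Poisson draw, so the variance of the averaged MLE
    shrinks roughly as 1/N, improving visual quality."""
    lam = image / image.sum()        # normalized irradiance (unit energy)
    est = np.zeros_like(lam)
    for _ in range(N):
        counts = rng.poisson(n_photons * lam)
        est += counts / n_photons    # per-observation MLE
    return est / N                   # accumulate and average
```

This mirrors the scheme in Figure 4: generate, estimate, accumulate, and average, so that rarely illuminated ROI pixels still receive a usable estimate.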

Simulation and Experimental Results
In this section, we present our simulation and experimental setups for obtaining the elemental images by SAII. Then, we show the simulation and experimental results to demonstrate the feasibility of our method.

Simulation and Experimental Setup
Before implementing the optical experiment, we performed a computer simulation in "Blender". The simulation environment is depicted in Figure 5. The object is an 'ISO-12233' test chart located 1600 mm from the camera array. In this setup, the focal length and sensor size of the virtual camera are 50 mm and 36 mm (H) × 24 mm (V), respectively. The size of each elemental image is 1080 (H) × 720 (V) pixels, and the pitch between cameras is 2 mm. The total number of elemental images is 5 (H) × 5 (V). With this setup, we evaluate the performance of the "nearest", "bilinear", and "bicubic" interpolation methods.
The experimental setup for recording the elemental images by SAII is illustrated in Figure 6. A yellow helicopter and a fire truck are used as three-dimensional objects at close distance (40 mm) and long distance (520 mm), respectively, because two objects at different distances are needed to demonstrate the zooming ability. In this setup, a Nikon D850 is used as the image sensor, and a Nikon DX AF-S Nikkor 18-55 mm lens is used as the camera lens. The sensor size is 36 mm (H) × 24 mm (V), and the focal length of the camera lens is set to 18 mm. The total number of recorded elemental images is 5 (H) × 5 (V), each elemental image has 5408 (H) × 3600 (V) pixels, and the pitch between cameras is 0.5 mm. For digital zooming, the size of the ROI is set to 676 (H) × 450 (V) pixels, which is eight times smaller than the original elemental image; that is, the magnification ratio is eight. Thus, when a three-dimensional image is reconstructed by VCR with digital zooming, the distance between the objects and the camera is reduced by the magnification ratio, i.e., the ratio between the size of the elemental image and the cropped ROI. Figure 7a shows the center image among the elemental images captured by our experimental setup, and Figure 7b shows the new elemental image for digital zooming, which is the cropped and interpolated ROI of Figure 7a.
To show the feasibility of visualization under photon-starved conditions, we also captured the elemental images under such conditions, as shown in Figure 8. The elemental image under photon-starved conditions is shown in Figure 8a. Figure 9 shows the three-dimensional images from the simulation. Figure 9a-c are the three-dimensional images reconstructed by digital zooming VCR with the "bicubic", "bilinear", and "nearest" interpolation methods, respectively. All are reconstructed at the same depth of 200 mm and digitally zoomed with a magnification ratio of eight. To verify our method, we calculate the peak sidelobe ratio (PSR) of the correlation at different depths as the performance metric. The PSR of the correlation peak is defined as the number of standard deviations by which the peak exceeds the mean value of the correlation surface. It can be calculated by [28]

\mathrm{PSR} = \frac{c_{max} - \mu_c}{\sigma_c}

where c_{max} is the peak value of the correlation surface, \mu_c is the mean of the correlation, and \sigma_c is the standard deviation of the correlation. The higher the PSR value, the better the recognition performance.

Experimental Results
To calculate the PSR values, the image reconstructed at 200 mm is used as the reference. Since the magnification is eight, 1600 mm ÷ 8 = 200 mm by Equation (14). Table 1 shows the CPU time for obtaining three-dimensional images at a depth of 200 mm and the corresponding PSR values. The computer used in this simulation has an AMD Ryzen 7 1700X eight-core processor at 3.40 GHz and 16 GB of RAM. The CPU times of the interpolation methods are almost the same. The "nearest" method is the fastest, but its PSR value at the focusing depth is the lowest. As shown in Figure 10, the "nearest" method has the worst PSR value, and the "bicubic" method has the best. The PSR curves of the "bicubic" and "bilinear" methods over the reconstruction depths are very similar to each other; only the "nearest" method fails to peak at the focusing depth (200 mm). In Table 1, the speed of the "bilinear" method is almost the same as that of the "nearest" method, while its PSR value is almost the same as that of the "bicubic" method. The "bicubic" method is the slowest, but its PSR value is the highest of all the interpolation methods. Since the "bilinear" method uses the four nearest pixels weighted by distance, its PSR value is better than that of the "nearest" method and worse than that of the "bicubic" method, which uses 16 weighted pixels. Overall, from Figure 9, Table 1, and Figure 10, the "nearest" interpolation method gives the worst result of the three, the "bicubic" method gives the best, and the "bilinear" method is nearly as fast as the "nearest" method while performing almost as well as the "bicubic" method. Figure 10. PSR results of simulation by digital zooming VCR with "bicubic" (blue line), "bilinear" (orange line), and "nearest" (magenta line) interpolation methods.
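As a minimal sketch, the PSR statistic used above (peak minus mean, in units of the standard deviation of the correlation surface) can be computed as:

```python
import numpy as np

def psr(correlation):
    """Peak sidelobe ratio: the number of standard deviations by which the
    correlation peak exceeds the mean of the correlation surface."""
    c = np.asarray(correlation, dtype=float)
    return (c.max() - c.mean()) / c.std()
```

A sharp, correctly located correlation peak gives a high PSR; a flat or noisy correlation surface gives a low one, which is why PSR serves here as a depth-recognition metric.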
Table 2 shows the PSR values of the reconstructed three-dimensional images over the reconstruction depths using digital zooming with the "nearest", "bilinear", and "bicubic" interpolation methods, where the magnification ratio is 8.8. The reconstruction depth of the focused object is therefore 181.82 mm by Equation (14). However, in Table 2, the peak PSR values of the interpolation methods (727.47, 720.57, and 705.19 for "bicubic", "bilinear", and "nearest") are located at 186.36 mm, 196.36 mm, and 186.36 mm, respectively; that is, they do not occur at the focusing depth. Thus, the location of the peak PSR value has an error for each interpolation method. Figure 11 shows the three-dimensional images of our experimental results under conventional illumination conditions. Figure 11a is the three-dimensional image reconstructed by VCR without digital zooming; here, the fire truck can hardly be recognized because of low resolution. Figure 11b is the three-dimensional image reconstructed by VCR with digital zooming, where the fire truck can be recognized. To show that our method outperforms the conventional method, we cropped the "911" characters at various reconstruction depths. Figure 11c,d were reconstructed at 344 mm and 392 mm by the conventional method without digital zooming, and Figure 11e,f were reconstructed at 43 mm and 49 mm by our method, respectively. The results of our method clearly have better visual quality than those of the conventional method. In Table 3, the PSR values of the "bilinear" method are better overall than those of the other two interpolation methods ("nearest" and "bicubic"). In addition, its peak value (60.997) is located at 43 mm, which is the object depth, whereas the other two methods have their peak PSR values at wrong object depths. Moreover, since pixel values produced by the "bicubic" method may be negative, the "bilinear" method gives the best result here.
Based on this result, we use the "bilinear" interpolation method in our proposed method under photon-starved conditions. Figure 12 shows the reconstructed three-dimensional images under photon-starved conditions. The fire truck cannot be visualized by the conventional method due to low resolution and the lack of photons, as shown in Figure 12a-c. In contrast, the three-dimensional image reconstructed by photon counting integral imaging with digital zooming, shown in Figure 12d-f, has better visual quality. However, its visual quality is still not sufficient for recognition. Therefore, we use N observation photon counting integral imaging to improve the visual quality of the reconstructed three-dimensional image under photon-starved conditions. As shown in Figure 12g-i, our method can visualize the reconstructed three-dimensional image at long distance under photon-starved conditions and improve its visual quality enough for recognition. For the PSR graph in Figure 13, the three-dimensional image reconstructed at 43 mm is used as the reference. The distance of 43 mm follows from Equation (14), because the object is focused at 344 mm and m is 8. The red line in Figure 13 is the PSR result of the conventional method over different depths. The magenta and blue lines in Figure 13 are the PSR results of our method with and without N observation photon counting integral imaging, respectively. In Figure 13, the reconstruction depths of our method are eight times smaller than those of the conventional method because of the digital zooming; thus, they are rescaled by multiplying the reconstruction depths of our method by 8. Table 4 shows the PSR values of the conventional method, digital zooming, and our proposed method, where the conventional method uses integral imaging and computational photon counting imaging only.
The PSR curve of the conventional method (red line) is flat and peaks at the wrong reconstruction depth of the object. In contrast, the PSR curve of our method (magenta line) has a sharp peak at the correct reconstruction depth. The peak PSR value of the conventional method is 32.1834 at 356 mm and 360 mm. On the other hand, the peak PSR values of our method with and without N observation photon counting integral imaging are 60.9974 and 41.7063, respectively, at 43 mm (i.e., 43 × 8 = 344 mm). Therefore, from Figure 13 and Table 4, our proposed method provides a more accurate position of three-dimensional objects at long distance with improved visual quality under photon-starved conditions.

Conclusions
In this paper, we have presented three-dimensional visualization of objects at long distances under photon-starved conditions using digital zooming and N observation photon counting integral imaging. The conventional method has lower lateral and longitudinal resolutions for objects at long distances; our method solves this resolution problem with digital zooming. In addition, under photon-starved conditions, the three-dimensional information of objects at long distances can be obtained more accurately via digital zooming VCR with N observation photon counting integral imaging. As the experimental results show, our method yields better visual quality for three-dimensional objects at long distances under photon-starved conditions than the conventional method. Therefore, we believe that our method may be used in many applications, such as the military and astronomy. However, our method has a drawback. Because digital zooming relies on interpolation algorithms to improve the visual quality, it is difficult to zoom in on objects at very long distances or on fine details of an object. That is, when the object is too far away and the required magnification is too large, interpolation fails because the nearest pixels no longer carry useful information. In this case, additional optical zooming is required. We will continue to examine these problems in future work.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

3D    Three-dimensional
2D    Two-dimensional
ROI   Region of interest
VCR   Volumetric computational reconstruction
SAII  Synthetic aperture integral imaging
MLE   Maximum likelihood estimation
PSR   Peak sidelobe ratio