Adaptive Content Frame Skipping for Wyner–Ziv-Based Light Field Image Compression

Abstract: Light field (LF) imaging introduces attractive possibilities for digital imaging, such as digital focusing, post-capture changing of the focal plane or view point, and scene depth estimation, by capturing both spatial and angular information of incident light rays. However, LF image compression is still a great challenge, not only due to light field imagery requiring a large amount of storage space and a large transmission bandwidth, but also due to the complexity requirements of various applications. In this paper, we propose a novel LF adaptive content frame skipping compression solution by following a Wyner–Ziv (WZ) coding approach. In the proposed coding approach, the LF image is firstly converted into a four-dimensional LF (4D-LF) data format. To achieve good compression performance, we select an efficient scanning mechanism to generate a 4D-LF pseudo-sequence by analyzing the content of the LF image with different scanning methods. In addition, to further explore the high frame correlation of the 4D-LF pseudo-sequence, we introduce an adaptive frame skipping algorithm followed by decision tree techniques based on the LF characteristics, e.g., the depth of field and angular information. The experimental results show that the proposed WZ-LF coding solution achieves outstanding rate distortion (RD) performance while having less computational complexity. Notably, a bit rate saving of 53% is achieved compared to the standard high-efficiency video coding (HEVC) Intra codec.


Context and Motivations
Light field (LF) rendering is known as an attractive form of image-based rendering (IBR) [1,2], which collects immense amounts of image data because the intensity of light rays traveling in every direction at every point in 3D space is captured [3]. Thus, the LF image data include information such as the location or point $(x, y, z)$, the angle or direction $(\theta, \phi)$, the wavelength $(\lambda)$, and the time $(t)$ for light rays captured in a scene. This representation is defined as the Plenoptic function, $P$, and explains the huge amount of data stored in each LF image, as an LF image can include 7D information $(x, y, z, \theta, \phi, \lambda, t)$ [3]. A raw LF image is composed of micro-images (MIs), and a set of sub-aperture images (SAIs) is obtained by rearranging the co-located pixels from each MI. Each SAI corresponds to a captured image of a scene from a particular point of view, which can vary slightly between two different SAIs [4]. In addition, information about the parallax and depth of an image scene can be provided by comparing SAIs. In practice, a set of constraints is introduced to the Plenoptic function to reduce the complexity of the LF information, which is reduced to a 4D function, as below:

$$L = L(u, v, x, y)$$

Here, the light intensity $L$ is indexed by the sub-aperture image (viewpoint) $(u, v)$ and the position (angle) within the sub-aperture image $(x, y)$.
As an example of an LF imaging technology, LF cameras have become a promising tool for various research areas, e.g., richer photography using Lytro Illum [5], material analysis using Raytrix [6], medical imaging [7], and biometric recognition [8]. Because of the enormous size of photo-realistic LF images (typically 1 GB [9]), data compression is a challenge in terms of storage, processing, and transmission. Recently, the Joint Photographic Experts Group (JPEG) committee created a standardization process called JPEG Pleno [10], which covers LF, point cloud, and holography [11]. The proposal provides an LF representation and coding with optimized viewing and resolution for a huge amount of data; thus, an efficient coding solution with high compression performance is of the utmost importance.
In the literature, various techniques and methods for LF compression have been introduced, especially for LF lenslet coding and four-dimensional LF (4D-LF) coding. The LF lenslet format is a compact version of the LF data, which represents the LF data as a massive hexagonal array of lenslets (MIs) and requires additional camera metadata in order to render images of a scene. In [12][13][14][15][16], exploiting the LF lenslet compression, most of the conventional image and video coding methods were applied to exploit the existing spatial redundancy of MIs within a raw LF lenslet, such as JPEG, JPEG2000, or high-efficiency video coding (HEVC) intra coding. This idea is based on the concept of self-similarity compensated prediction [12]. A block-based matching algorithm is utilized to manage the most suitable predictor block for the current block, which is compared to the previously coded and reconstructed range of the current image. With two different candidate blocks, the predictor block can be generated. Additionally, [13,14] proposed adding new coding modes to the HEVC coding tools (i.e., locally linear embedding-based prediction) and adapting the intra prediction scheme in HEVC coding tools. In addition, to exploit the data geometry for dimensionality reduction of LF, [15,16] presented coding schemes for LF based on low rank approximation. Likewise, in [17], the author used the disparity compensated prediction method to take advantage of the existing spatial redundancy. In addition, the high-order prediction (HOP) model has also been considered as a method to achieve compression, such as in [18]. Based on a geometric transformation between the current block and the reference region, this method provides a high-order intra-block prediction method by adding HOP to HEVC intra prediction modes. Moreover, in recent works, an objective performance assessment of LF lenslet representation was investigated in [19]. 
The LF lenslet is used with YUV 4:4:4 encoding at 10 bit/sample, which performs well in terms of coding efficiency for different colored sub-sampling formats. In regard to the repeating patterns of lenslets in this representation, screen content coding (SCC) [20] is an efficient encoder for LF image compression. The work in [21] presented an efficient lenslet image coding model, which applies SCC to encode LF lenslets. Based on the plentiful repeating patterns of the LF lenslet representation, this approach is faster and more powerful than the SCC standard, with an even faster decoding time.
On the other hand, 4D-LF represents the LF data as a stack of sub-aperture images (SAIs) generated from lenslets of an LF camera. In the 4D-LF coding approach, generating the 4D-LF pseudo-sequence is a well-known approach for LF compression. This approach involves shifting LF data from the still image coding aspect into the video coding aspect. The sub-aperture array is defined as a pseudo-sequence of different views of LF images and is compressed as a video sequence. Since the first exploration of the LF scanning order in [22,23], several approaches and a variety of scanning orders have been examined, seeking a higher redundancy among SAIs and increased compression efficiency [24][25][26]. For the inter-frame coding mode of a video codec, the similarity between SAIs is a significant parameter in the compression performance. In [24], a 4D-LF pseudo-sequence was created by organizing SAIs from the lenslet array structure. Nevertheless, the coding order and reference frame management are implemented coarsely in a way that does not adapt to specific scenarios. In [25], the author presented a solution to fully exploit information among different views. A hierarchical coding order is applied to encode the 2-D coding structure with the selected number of frames used. Based on different scanning orders in [26], the greater the viewpoint distance between SAIs, the less similarity between SAIs. Additionally, [27] recently presented an efficient coding strategy to convert the model parameters into a bitstream, which is well suited for 4D-LF compression.
According to the literature, LF coding can achieve encouraging results with predictive video coding methods (i.e., H.264/AVC, H.265/HEVC). However, the conventional predictive video coding paradigm mostly targets one-to-many applications, which results in complex encoders but simple decoders; it is therefore not suitable for emerging applications that require simpler encoders, such as visual sensor networks, remote sensing, or the visual-based Internet of Things (IoT). In regard to the other alternative coding possibilities, three-dimensional discrete wavelet transform-based video coding (3-D DWT) [28] and compressive sensing (CS)-based video coding [29] may also be selected for emerging video applications due to their low encoding complexity requirements. However, in spite of the fast video coding provided by these techniques, 3-D DWT and CS-based video coding approaches still require a large amount of encoding memory and have inferior rate distortion performance when compared to the relevant intra-frame encoding codecs (e.g., H.265/HEVC). In this context, Wyner-Ziv (WZ) coding [30], a lossy distributed coding paradigm [31], introduces a low encoding complexity capability, whereby the motion estimation part on the encoder side is shifted to the decoder side. This coding approach has successfully been applied to many different forms of video and emerging applications, such as natural image analysis, hyperspectral images, sensor networks, and wireless video. WZ coding provides different coding techniques compared to conventional video coding, and notably provides a flexible distribution of the codec complexity, high compression, and inherent error robustness [32].
This type of coding manages to separately encode individual frames, which are in turn decoded conditionally to achieve similar efficiency to standard coding. The first WZ coding approach in [33,34] was applied to video signals in the real world, giving improved error resilience. Regarding WZ coding with LF images, several LF image compression approaches have been proposed [35][36][37][38]. In particular, the performance of distributed video coding for light field content was analyzed in [39]. In [40], the LF images were compressed by WZ coding for random access. Taking advantage of the WZ coding structure, the images are independently encoded by a WZ encoder while previously reconstructed images are applied as Side Information (SI) at the receiver to exploit the similarities among LF images. The results show significant compression performance compared to intra coding while maintaining the random access capabilities. Hence, this is a promising coding solution for LF images.

Contributions and Paper Organization
Regarding LF image coding requirements and WZ coding, the biggest challenge is the transmission of LF content to multiple end users with different display devices and applications while controlling and retaining the quality of an immense amount of data. In this sense, an efficient LF coding architecture is of utmost importance. Thus, extending and improving the work in [41], we propose a novel adaptive content frame skipping approach for LF image compression by following the distributed coding approach in order to achieve efficient compression performance for LF data with low encoding complexity. The contributions of this paper are summarized below.

• An advanced WZ-based LF image compression solution: The well-known WZ coding approach is enhanced by improving the compression performance at the key frame encoder-decoder with state-of-the-art video compression using H.265/HEVC [42], while the advantage of the low complexity of the WZ procedure is utilized on the side of the WZ frame encoder-decoder. Additionally, an advanced channel codec (i.e., LDPC codec [43]) is applied in this WZ coding approach to achieve capacity-approaching performance and flexible code designs using density evolution [44];
• An efficient content-driven LF image reordering mechanism: Different scanning methods may affect the results depending on the video content and characteristics. Based on the high correlation of SAIs and the different content types of LF images, four scanning methods (i.e., spiral scan, hybrid scan, U-shape scan, and raster scan) are evaluated thoroughly in order to select the most efficient scanning method for LF images, and also to further improve the performance of our WZ coding solution;
• An adaptive skip mode decision algorithm: To further improve the proposed WZ-LF image coding paradigm, an adaptive skip mode decision is introduced using a decision tree rule-based method, which is based on the changes in the spatial and temporal features of the LF content sequences. The associated side information is used as the final reconstructed frame when the skip mode is applied to WZ frames.
The remainder of this paper is organized as follows. Section 2 gives an overview of the proposed LF coding architecture. Section 3 presents the novel adaptive content frame skipping algorithm. Afterward, Section 4 analyzes the experimental results, while Section 5 presents the conclusions and describes directions for future work.

Overall Wyner-Ziv-Based Light Field Image Compression
This section presents the WZ-LF image compression solution in detail. In order to achieve the best performance for the solution, an efficient scanning order based on LF content is analyzed and a content skipping algorithm is introduced.

Proposed WZ-LF Architecture
To achieve efficient compression performance for the transmission and storage of LF images, we propose the WZ coding-based LF image compression architecture illustrated in Figure 1. The proposed WZ-LF coding method strengthens the original WZ architecture proposed by Girod [31] by improving the compression performance at the key frame encoder-decoder using the state-of-the-art video compression codec H.265/HEVC Intra. As shown, the LF image is processed in the following steps.

• At the encoder: The LF data are first unpacked and decoded into the 4D-LF representation. The SAIs within the 4D-LF are then grouped into a pseudo-sequence using an efficient scanning order, which is described in the next sub-section. The LF image compression problem is then cast as a common video coding problem. The first frame of every group of pictures (GOP), called a key frame, is encoded using the recent H.265/HEVC intra coding approach [42], with only the spatial correlation employed; thus, low complexity and error robustness can be achieved. For the remaining WZ frames, the following steps are performed:

Skip mode decision:
In this module, the skipping decision is made by a decision tree algorithm [45]. The key frames and WZ frames are used to classify each WZ frame as skip or non-skip by identifying texture information and motion activity in the 4D-LF pseudo-sequence. Several features are computed to detect changes in the spatial and temporal characteristics of the video sequence, e.g., the sum of absolute differences (SAD), gradient magnitude similarity (GMS), and variance of block (VAR). These features are explained in the next sub-section. A rule-based method with a decision tree evaluates the decision nodes based on these features in order to make the skipping decision. When the skip mode is activated, the WZ frame bypasses the normal WZ encoding and decoding procedure and the associated side information is used as the final reconstructed frame. This process is explained in detail in the next sub-section.

Discrete cosine transform (DCT):
For WZ frames, the discrete cosine transform (DCT) is used to exploit the statistical dependencies within a frame. Each WZ frame is divided into 4 × 4 blocks of pixels, taken from left to right and top to bottom, and a 4 × 4 DCT is applied to each block. After the DCT has been applied, the coefficients of each 4 × 4 block are read in the standard zig-zag scan order [46], and the coefficients at the same scan position are grouped together into 16 DCT coefficient bands. The first band is the direct current (DC) band, which carries the low-frequency information, while the remaining bands are alternating current (AC) bands, which carry the higher-frequency information.
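The block transform and band grouping described above can be sketched as follows. This is a minimal illustration in plain Python, assuming a floating-point orthonormal DCT-II and frames stored as nested lists of luma samples; the function names (`dct_matrix`, `dct2_block`, `frame_to_bands`) are hypothetical and are not taken from the codec implementation, which may well use an integer transform instead.

```python
import math

N = 4  # block size used for WZ frames

# Standard zig-zag order for a 4 x 4 block, as (row, col) pairs.
ZIGZAG_4X4 = [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), (0, 3), (1, 2),
              (2, 1), (3, 0), (3, 1), (2, 2), (1, 3), (2, 3), (3, 2), (3, 3)]


def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix (n x n) as nested lists."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def dct2_block(block, c=None):
    """2-D DCT of one N x N block, computed as C * block * C^T."""
    c = c or dct_matrix()
    tmp = [[sum(c[u][i] * block[i][j] for i in range(N)) for j in range(N)]
           for u in range(N)]
    return [[sum(tmp[u][j] * c[v][j] for j in range(N)) for v in range(N)]
            for u in range(N)]


def frame_to_bands(frame):
    """Group the DCT coefficients of all 4 x 4 blocks into 16 bands.

    Blocks are visited left to right, top to bottom; bands[0] is the DC band.
    Rows and columns that do not fill a complete block are ignored here.
    """
    h, w = len(frame), len(frame[0])
    c = dct_matrix()
    bands = [[] for _ in range(N * N)]
    for r in range(0, h - h % N, N):
        for col in range(0, w - w % N, N):
            block = [row[col:col + N] for row in frame[r:r + N]]
            coeffs = dct2_block(block, c)
            for b, (u, v) in enumerate(ZIGZAG_4X4):
                bands[b].append(coeffs[u][v])
    return bands
```

For a flat frame, all energy ends up in the DC band and every AC band is zero, which is the property that later allows the high-frequency bands to be quantized coarsely.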

Uniform quantization:
In order to encode WZ frames, a quantizer is then applied to each DCT band individually, using a predefined number of levels that depends on the target quality for the WZ frame. The lower spatial frequency coefficients are quantized with a uniform scalar quantizer using a greater number of levels (i.e., smaller step sizes). Meanwhile, the higher-frequency coefficients are quantized more coarsely, with fewer levels, without significant degradation of the visual quality of the decoded image. Similar to [47], eight different quantization matrices are adopted in the proposed LF compression scheme to target various quality levels and data rates.
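A uniform scalar quantizer with a per-band number of levels can be sketched as below. The value ranges and level counts passed in are placeholders chosen for illustration; they are assumptions and do not reproduce the eight quantization matrices adopted from [47]. A level count of 0 is used here, also as an assumption, to mark a band that is not transmitted.

```python
import math


def uniform_quantize(coeffs, levels, lo, hi):
    """Map each coefficient in [lo, hi) to one of `levels` equal-width bins.

    Returns the list of bin indices and the step size that was used.
    """
    if levels < 1 or hi <= lo:
        raise ValueError("need levels >= 1 and hi > lo")
    step = (hi - lo) / levels
    indices = []
    for x in coeffs:
        q = math.floor((x - lo) / step)
        indices.append(min(max(q, 0), levels - 1))  # clamp out-of-range values
    return indices, step


def quantize_bands(bands, levels_per_band, ranges):
    """Quantize each DCT band with its own number of levels.

    A band with 0 levels is treated as not transmitted and returned as None.
    """
    out = []
    for band, levels, (lo, hi) in zip(bands, levels_per_band, ranges):
        out.append(uniform_quantize(band, levels, lo, hi)[0] if levels else None)
    return out
```

More levels over the same range give a smaller step, so assigning many levels to the DC band and few (or none) to the last AC bands yields the coarse-to-fine behaviour described in the text.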

Low-density parity check (LDPC) encoding:
In this work, to achieve lower complexity than turbo codes [48], we employ a known low-density parity-check accumulate (LDPCA) channel encoder as the WZ encoder. An LDPCA encoder comprises an LDPC syndrome former concatenated with an accumulator. For every bit plane, syndrome bits are formed using the LDPC code and then accumulated modulo 2 to produce the accumulated syndrome. The accumulated syndromes are stored in a buffer at the encoder, and only a few of them are transmitted initially, in chunks. If decoding fails, the decoder uses a feedback channel to request more accumulated syndromes from the encoder buffer. The encoder also transmits an 8-bit cyclic redundancy check (CRC) sum of the encoded bit plane, which allows the decoder to detect residual errors.
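The syndrome former and accumulator can be illustrated with a toy example. The sketch below assumes a tiny hand-written parity-check matrix, given as lists of checked bit positions, and releases accumulated syndromes as simple prefixes of the buffer; practical LDPCA codes use much larger matrices and a specific transmission schedule, and the CRC step is omitted. All function names are hypothetical.

```python
def syndrome(parity_check_rows, bits):
    """Syndrome s = H * x (mod 2); each row lists the bit positions it checks."""
    return [sum(bits[i] for i in row) % 2 for row in parity_check_rows]


def accumulate(syndrome_bits):
    """Running modulo-2 sum of the syndrome bits (the accumulator stage)."""
    acc, total = [], 0
    for s in syndrome_bits:
        total ^= s
        acc.append(total)
    return acc


def deaccumulate(accumulated):
    """Invert the accumulator at the decoder: s[k] = a[k] XOR a[k-1]."""
    return [a ^ b for a, b in zip(accumulated, [0] + accumulated[:-1])]


def released_after(accumulated, n_requests, chunk_size):
    """Accumulated syndromes sent so far, after `n_requests` feedback requests.

    Simplification: chunks are released in buffer order.
    """
    return accumulated[: (n_requests + 1) * chunk_size]
```

Because accumulation is a running XOR, the decoder can recover the original syndrome bits from the accumulated values it has received, and each feedback request simply extends the portion of the buffer that has been sent.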

• At the decoder:

SI generation:
The SI is an estimate of the WZ frame and is generated at the decoder side by a frame interpolation algorithm [49] from two consecutive decoded key frames. The SI can also be considered a noisy version of the original WZ frame, with an inverse relationship between the number of parity bits (or bit rate) and the quality of the estimation, i.e., the better the quality of the estimation, the smaller the received bit rate. By estimating the correlation between the original WZ frame and the SI correctly, the decoding performance can be greatly improved. The better the quality of the interpolated SI, the better the quality of the final reconstructed WZ frame. Regarding correlations between frames, the 4D-LF pseudo-sequence is a series of frames with high correlation due to the characteristics of the LF image. Thus, achieving the best quality for the SI gives a large advantage in decoding performance.
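As a rough illustration of SI generation, the sketch below estimates a WZ frame as the rounded pixel-wise mean of its two neighbouring decoded key frames. This is a deliberate simplification and an assumption: the interpolation algorithm in [49] is not reproduced here, and plain averaging is only a reasonable stand-in because adjacent SAIs differ very little. The helper names are hypothetical.

```python
def interpolate_side_information(prev_key, next_key):
    """Estimate the WZ frame as the rounded pixel-wise mean of two key frames."""
    if len(prev_key) != len(next_key) or len(prev_key[0]) != len(next_key[0]):
        raise ValueError("key frames must have the same size")
    return [[(a + b + 1) // 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(prev_key, next_key)]


def mean_squared_error(frame_a, frame_b):
    """MSE between two equally sized frames; a lower value means better SI."""
    n = len(frame_a) * len(frame_a[0])
    return sum((a - b) ** 2
               for row_a, row_b in zip(frame_a, frame_b)
               for a, b in zip(row_a, row_b)) / n
```

Comparing `mean_squared_error(si, original_wz)` across sequences gives a quick sense of how much correction the channel decoder will have to supply: the smaller the error, the fewer parity bits are needed.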

LDPC decoding:
This module decodes each bit plane given the soft input estimates derived from the SI and the parity bits transmitted from the encoder. Whenever the decoder receives additional parity bits, the decoding procedure is repeated. To invert the accumulation performed at the encoder, the syndrome bits are first extracted from the received parity bits before decoding begins. A sum-product decoding operation is then performed on these syndrome bits. This is a soft decision algorithm, in which the probability of each received bit is treated as an input. Decoding is considered successful when the CRC sum of the decoded bit plane matches the value received from the encoder. The decoded bit plane is then sent to the inverse quantization and reconstruction module.

WZ frame reconstruction:
In WZ frame reconstruction, the decoded quantized symbol stream of each DCT band is formed from all the bit planes related to that band. Once all decoded quantized symbols are available, the DCT coefficients are reconstructed with the support of the corresponding SI coefficients and the estimated correlation information between the original WZ and SI frames. It should be noted that in the proposed scheme, a correlation noise estimation process is performed at the decoder side and used as a decoder rate control mechanism. For DCT coefficient bands for which no parity bits are transmitted, the corresponding DCT SI bands are used instead. Finally, the reconstruction function is applied to bound the reconstruction error.
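One common way to bound the reconstruction error is to clamp each SI coefficient to the quantization bin indicated by the decoded symbol. The sketch below shows that simplified rule, assuming a uniform quantizer with bin width `step` starting at `lo`; it is an assumption and does not use the estimated correlation noise model that the proposed scheme relies on. Bands without transmitted bits (represented here as `None`) fall back to the SI band.

```python
def reconstruct_coefficient(si_value, bin_index, step, lo=0.0):
    """Clamp the SI coefficient to the decoded quantization bin [low, high]."""
    low = lo + bin_index * step
    high = low + step
    return min(max(si_value, low), high)


def reconstruct_band(si_band, decoded_indices, step, lo=0.0):
    """Reconstruct one DCT band from its SI band and decoded bin indices.

    If no bits were sent for the band (decoded_indices is None), the SI band
    is copied unchanged.
    """
    if decoded_indices is None:
        return list(si_band)
    return [reconstruct_coefficient(y, q, step, lo)
            for y, q in zip(si_band, decoded_indices)]
```

With this rule, the reconstructed value always lies in the same bin as the original coefficient, so the error per coefficient cannot exceed one step, however poor the SI is.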

Efficient Sub-Aperture Image Arrangement
Recently, a scanning order method was developed based on optimized reference picture selection for LF image coding using a low-delay configuration with H.265/HEVC [50]. However, this method is not suitable for our proposed WZ-LF codec, which encodes and decodes KEY and WZ frames with an intra coding approach. Therefore, in order to select an efficient SAI arrangement, several scan paths of sub-aperture images are examined, such as spiral, raster scan, U-shape, and hybrid scanning approaches [26], as shown in Figure 2. Combining the raster and U-shape scanning orders, the hybrid scanning order takes advantage of the similarity of adjacent views, both horizontally and vertically. However, due to varying angles between SAIs, the temporal correlation along SAIs may be changed by different scanning orders and with different LF content. Moreover, the compression performance of the 4D-LF pseudo-sequence can be affected by specific content. Therefore, in this section we thoroughly evaluate the scanning order to verify the most effective order for LF images.
Beginning with content-driven considerations, a set of LF data is collected from [51] containing different content types and categorized into two types: wide and narrow. The wide LF content type includes wide depth-of-field (WDOF), wide depth of field with subject layer (WDOF-L), and blurry content (BC), while the narrow type includes narrow depth of field (NDOF), narrow depth of field with a focus on one main subject (NDOF-1), and narrow depth of field with a focus on more than two subjects (NDOF-2).
Regarding scanning methods, the four types of scanning orders (i.e., spiral scan, hybrid scan, U-shape scan, and raster scan) are applied and computed to determine the most efficient scanning order for LF images. The three following LF images, i.e., spear fence 2 (NDOF), stairs (NDOF-1), and swan 1 (WDOF-L), are selected for evaluation with a temporal frequency of 15 Hz, 193 frames, and encoded by the H.265/HEVC codec.
From the RD performance results in Figure 3, the spiral scanning method may be considered the most suitable for LF images, as it achieves better results than the other scanning methods. Therefore, to achieve the best performance with our proposed WZ-LF coding solution, the spiral scanning method is chosen. Its performance is evaluated in detail in the next section.
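To make the idea of a scanning order concrete, the sketch below generates a raster order and a spiral order over a square grid of SAIs and uses either one to build a pseudo-sequence. The spiral here starts at the central view and winds outwards, moving right first; the starting point and direction are assumptions, since the exact traversal of Figure 2 is not reproduced, and the function names are hypothetical.

```python
def raster_scan(rows, cols):
    """Row-by-row order, left to right, top to bottom."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def spiral_scan(n):
    """Spiral over an n x n grid (n odd), from the central view outwards."""
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number")
    r = c = n // 2
    order = [(r, c)]
    dr, dc = 0, 1  # first move: one step to the right
    run = 1
    while len(order) < n * n:
        for _ in range(2):  # each run length is used for two directions
            for _ in range(run):
                r, c = r + dr, c + dc
                if 0 <= r < n and 0 <= c < n:
                    order.append((r, c))
            dr, dc = dc, -dr  # turn by 90 degrees
        run += 1
    return order


def build_pseudo_sequence(sai_grid, order):
    """Arrange the SAIs of a 2-D grid into a 1-D pseudo-sequence."""
    return [sai_grid[r][c] for r, c in order]
```

In the spiral order every frame is a direct horizontal or vertical neighbour of the previous one, whereas the raster order jumps across the whole grid at the end of each row, which is one intuitive reason why the two orders can differ in inter-frame similarity.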

Observation
Distributed video coding is well known for having low encoding complexity and for providing various advantageous coding techniques, i.e., flexible distribution of the codec complexity, high compression, and inherent error robustness [32]. This coding method is suitable for many different forms of video in emerging applications, e.g., sensor networks, wireless video, and surveillance video.
Different video sequence types (i.e., low-motion and high-motion sequences) affect the compression performance of the codec. Low-motion and high-motion sequences correspond to high correlation and low correlation between frames, respectively. For common low-motion sequences (e.g., Hall Monitor, Akiyo), the distributed video coding approach achieves better compression performance than traditional codecs, while its compression performance declines for high-motion sequences (e.g., Soccer, Foreman) [47].
According to the LF characteristic, adjacent views in the 4D-LF pseudo-sequences both horizontally and vertically exhibit higher similarity with each other. Therefore, the 4D-LF pseudo-sequences are mostly considered low-motion sequences compared to natural videos according to the SAD values shown in Figure 4.
It is noted that the frames of 4D-LF pseudo-sequence are extremely highly correlated, as shown in Figure 4, so skipping the most similar frames may achieve efficient compression performance. Therefore, an adaptive frame skipping mechanism based on a decision tree is introduced in our WZ-LF coding solution and is described in detail in the following section.

Decision Tree Based Adaptive Frame Skipping
Following the analysis of LF data types in the previous sub-section, different data types and different scanning orders can lead to different values of these features, because each SAI represents a different perspective. In this work, we apply the iterative dichotomiser 3 (ID3) algorithm [45] to the frame skipping decision based on an offline training model with spatial and temporal features of the 4D-LF pseudo-sequence.
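ID3 grows a tree by repeatedly choosing the split with the highest information gain. The sketch below shows that criterion for a single numeric feature and binary skip / non-skip labels. Splitting a continuous feature on a threshold is an assumption made for illustration, since classic ID3 works on categorical attributes and the way the features are discretized is not stated here; the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting on `value <= threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0
    weighted = (len(left) * entropy(left)
                + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


def best_threshold(values, labels):
    """Midpoint between sorted feature values that maximizes the gain."""
    points = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return max(candidates,
               key=lambda t: information_gain(values, labels, t),
               default=None)
```

During offline training, this selection would be repeated for each feature at each node, and the feature and threshold with the largest gain would become the decision rule at that node.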
Based on the high correlation of SAIs and the WZ-LF architecture, it is important to identify the motion activity of the key frames. Thus, two discriminative temporal features are utilized to detect changes in the motion of the key SAI frames, i.e., $SAD_{K}$, the sum of absolute differences of the SAI key frames, and $GMS_{K}$, the gradient magnitude similarity computed with the Scharr operator [52]. The temporal features are computed as follows:

$$SAD_{K} = \sum_{i=1}^{N} \sum_{j=1}^{M} \left| K_{1}(i, j) - K_{2}(i, j) \right|$$

where $K_{1}$ and $K_{2}$ are two consecutive SAI key frames and $(i, j)$ is the pixel location in the SAI key frames of size N × M.

$$GMS_{K}(p) = \frac{2\, m_{K_{1}}(p)\, m_{K_{2}}(p) + c}{m_{K_{1}}^{2}(p) + m_{K_{2}}^{2}(p) + c}$$

where $m_{K_{1}}(p)$ and $m_{K_{2}}(p)$ are the gradient magnitudes of the two consecutive SAI key frames at pixel location $p$ and $c$ is a positive constant for equation stability. $m_{K_{1}}(p)$ and $m_{K_{2}}(p)$ employ the convolution operation ⊗ in the horizontal and vertical directions following the Scharr filter, computed as:

$$m_{K}(p) = \sqrt{(K \otimes h_{x})^{2}(p) + (K \otimes h_{y})^{2}(p)}, \quad K \in \{K_{1}, K_{2}\}$$

where $h_{x}$ and $h_{y}$ denote the horizontal and vertical Scharr kernels.

Regarding texture information, the spatial feature is also an essential element in order to identify flat and non-flat regions in the SAI WZ frames of the 4D-LF pseudo-sequence. To capture the difference in texture information, the block variance is selected for content image assessment, i.e., $VAR_{WZ}$, and is computed as

$$VAR_{WZ} = \sigma_{WZ}^{2}$$

where $\sigma_{WZ}^{2}$ is the variance of the SAI WZ frames in the 4D-LF pseudo-sequence.

Figure 5 shows the discriminative spatial and temporal features of the SAI key frames and WZ frames. Notably, the value of the spatial feature covers most of the flat regions (i.e., blurred regions or regions with low texture), while the values of the temporal features cover non-flat regions (i.e., regions with depth, contrast, and saturation complexity).

The frame skipping mechanism uses the texture and motion activity of the two consecutive key frames and the neighboring WZ frames of a 4D-LF pseudo-sequence to select the frames to be skipped through a decision tree rule [45]. In order to establish the skip and non-skip rules from the tree structure, an offline trained model is applied to the binary decision tree. The optimal weights for the offline model are determined by computing all temporal and spatial features for each LF content type, as described in Algorithm 1. Based on this, the skip mode decision is activated or not. It should be noted that neither the sample data nor the weights are updated for the offline model; thus, the offline model should be maintained continuously for the best accuracy.
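The three features (SAD, GMS, and VAR) can be computed with a few lines of plain Python, as sketched below for frames stored as nested lists of luma samples. The un-normalized 3 × 3 Scharr kernels, the default value of the stabilizing constant `c`, the skipping of border pixels, and the use of a whole-frame variance are all assumptions made for this illustration rather than settings taken from the codec.

```python
import math

# Un-normalized 3 x 3 Scharr kernels (horizontal and vertical derivatives).
SCHARR_X = [[3, 0, -3], [10, 0, -10], [3, 0, -3]]
SCHARR_Y = [[3, 10, 3], [0, 0, 0], [-3, -10, -3]]


def sad(frame_a, frame_b):
    """Sum of absolute differences between two equally sized frames."""
    return sum(abs(a - b)
               for row_a, row_b in zip(frame_a, frame_b)
               for a, b in zip(row_a, row_b))


def gradient_magnitude(frame):
    """Scharr gradient magnitude at interior pixels (borders are skipped).

    The kernels are applied without flipping; for Scharr kernels this only
    changes the sign of the response, so the magnitude is unaffected.
    """
    h, w = len(frame), len(frame[0])
    out = []
    for r in range(1, h - 1):
        row = []
        for c in range(1, w - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    p = frame[r + i - 1][c + j - 1]
                    gx += SCHARR_X[i][j] * p
                    gy += SCHARR_Y[i][j] * p
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out


def gms_map(frame_a, frame_b, c=1e-3):
    """Per-pixel gradient magnitude similarity; 1.0 means identical gradients."""
    ga, gb = gradient_magnitude(frame_a), gradient_magnitude(frame_b)
    return [[(2 * a * b + c) / (a * a + b * b + c) for a, b in zip(ra, rb)]
            for ra, rb in zip(ga, gb)]


def variance(frame):
    """Population variance of all pixels in a frame."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)
```

For two nearly identical key frames, `sad` is close to zero and the GMS map is close to 1.0 everywhere, while a low `variance` indicates a flat WZ frame; these are the conditions under which skipping is most likely to be harmless.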

The proposed algorithm is constructed as below

Algorithm 1. The decision-tree-based adaptive frame skipping
Input: 4D-LF pseudo-sequence
Output: Skip mode decision (i.e., skip or non-skip)
Initialize the data partitioning with WZ frames and two consecutive KEY frames.
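Once the tree has been trained offline, the skip decision reduces to a few threshold comparisons on the features. The sketch below shows one possible shape of such a rule. The tree structure, the threshold names in `SkipRule`, and the use of a single scalar GMS value per frame pair are all hypothetical; in the proposed method the actual rules come from the offline ID3 training and are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkipRule:
    """Hypothetical thresholds standing in for the offline-trained tree."""
    max_sad: float   # largest key-frame SAD that still counts as low motion
    min_gms: float   # smallest GMS that still counts as similar structure
    max_var: float   # largest WZ-frame variance that still counts as flat


def skip_mode_decision(sad_k, gms_k, var_wz, rule):
    """Return True when the WZ frame should be skipped."""
    if sad_k > rule.max_sad:        # strong change between key frames
        return False
    if gms_k < rule.min_gms:        # gradient structure differs
        return False
    return var_wz <= rule.max_var   # flat content: reuse the side information


def label_wz_frames(features, rule):
    """Label each (sad_k, gms_k, var_wz) tuple as 'skip' or 'wz'."""
    return ["skip" if skip_mode_decision(*f, rule) else "wz" for f in features]
```

A frame labelled `"skip"` bypasses the DCT, quantization, and LDPCA stages entirely, and the decoder uses its side information as the reconstructed frame.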

Test Conditions
For emerging application scenarios such as visual sensor networks, remote sensing, or camera surveillance, low-resolution imagery is more common than high-resolution imagery; thus, in this paper we examine low-resolution versions of 12 common LF images (shown in Figure 6), downsampled to Quarter Common Intermediate Format (QCIF) resolution with a temporal frequency of 15 Hz. Based on the high correlation of SAIs in the 4D-LF, the proposed WZ-LF coding solution is especially suitable for these emerging applications. Similarly, the datasets used for training are presented in Table 1 with 16 LF training samples. This dataset was collected from [48] and covers different categories and content types. To assess the performance of the proposed LF compression solution, these LF images are examined with the relevant coding benchmark H.265/HEVC [42] and HEVC-based DVC codecs [53]. The comparison covers two parts, i.e., the overall rate distortion (RD) performance and the specific coding tool performance. Regarding the development environment, the proposed WZ-LF coding solution is developed in the C language using Visual Studio 2015 and integrated with the state-of-the-art H.265/HEVC Intra.

Overall WZ-LF Compression Performance Evaluation
Regarding the compression performance, the RD performance is widely used to assess video coding schemes through the Bjøntegaard delta peak signal-to-noise ratio (BD-PSNR) and the Bjøntegaard delta rate (BD rate) [54]. Figure 7 presents the RD curve comparison for the proposed WZ-LF coding solution and the other relevant benchmarks, i.e., HEVC inter and intra coding [42] and HEVC-based distributed video coding, labeled as DVC-H.265/HEVC [53], while the BD rate and BD-PSNR are reported in Table 2. Some conclusions can be derived from the observed results, as shown below.

• WZ-LF versus H.265/HEVC Intra: The proposed coding solution applies intra coding at the encoder side and inter coding at the decoder side. Thus, the proposed WZ-LF can significantly improve the RD performance for all 4D-LF pseudo-sequences with a variety of content types. As shown, the average BD rate reductions are 53.14%, 52.53%, 53.22%, and 53.18% for the proposed WZ-LF solution with WDOF, NDOF, NDOF-1, and NDOF-2, respectively. Hence, the obtained performance improvement confirms the efficiency of the proposed skip mode decision in the WZ-LF architecture.

Four scanning order types, i.e., spiral, hybrid, U-shape, and raster scanning, are evaluated based on the BD rate [54]. As raster scanning is the common scanning method used for video coding, it is utilized as the anchor in order to compute the BD rate. Broken down into different data types, the BD rate results for the scanning methods are shown in Table 4. The hybrid and U-shape scanning orders achieve bit rate savings of approximately 3% compared to the raster scan for most content types; however, for the NDOF and NDOF-1 types, the BD rate performance changes compared to the raster scan, with bit rate savings of approximately 3% and 24%, respectively.
Regarding the spiral scan, this method achieves an outstanding result, with an average bit rate saving of 10% for all data types compared to the raster scan. In particular, for the NDOF and NDOF-1 data types, this method still achieves impressive performance, with bit rate savings of approximately 11% and 9%, respectively. Thus, we can tentatively conclude that the spiral scan is the most efficient scanning method for LF images.

Considering the high correlation of SAIs in the 4D-LF pseudo-sequences, the decision tree method is applied to determine the skipping process at the encoder side of the WZ-LF architecture in order to enhance the compression efficiency of the WZ-LF coding solution. The spatial-temporal features of the 4D-LF pseudo-sequences are selected based on the depth of field changes in the content. According to the rules created by the offline trained model, the skip mode decision determines whether or not to skip the WZ encoding procedure; otherwise, the frames are encoded as in the normal 4D-LF pseudo-sequence. Table 5 and Figure 8 show comparisons of the WZ-LF coding solution with and without the skip mode decision. Examining the RD performance results, it is clear that WZ-LF with skip mode outperforms WZ-LF without skip mode, with an average bit rate saving of about 25%. Notably, the NDOF content type shows a significant improvement with a bit rate saving of 29.3%, while WDOF, NDOF-1, and NDOF-2 achieve BD rate reductions of 26.6%, 20.2%, and 23.8%, respectively. We can also observe that the skip mode performs outstandingly with blurry content (34% bit rate saving) or content with a narrow depth of field (48% bit rate saving for the game board sequences).

Examining the compression complexity is an essential part of the performance evaluation. For this evaluation, the coding solutions are tested on the same PC with an Intel Core i7-7700HQ (2.8 GHz) processor, 16 GB RAM, and the Windows 10 Home OS.
The results are shown in Figures 9 and 10 for quantization parameters (QPs) of 40 and 25, respectively, with and without the skip mode decision. To avoid the effect of multi-thread processing during the test, the results of five repetitions of the same compression setting are averaged. Additionally, the time saving (%) is measured as:

$$\Delta T = \frac{T_{no\_skip} - T_{skip}}{T_{no\_skip}} \times 100\%$$

where $T_{skip}$ and $T_{no\_skip}$ are the processing times of the WZ-LF codec with and without the skip mode decision, respectively.

From these complexity results, it can be observed that the WZ-LF codec with the skip mode decision saves a significant amount of encoding time compared to the WZ-LF codec without the skip mode decision. The WZ-LF with the skip mode can encode approximately 46% and 74% faster on average than the WZ-LF without skip mode at QP40 and QP25, respectively.

Conclusions
This paper introduces an LF adaptive content frame skipping compression solution following the WZ coding approach by analyzing the spatial and temporal correlation between sub-aperture pictures. The proposed WZ-LF coding paradigm combines the state-of-the-art H.265/HEVC codec with an adaptive frame skipping mechanism, along with an efficient scanning order. The proposed LF compression architecture provides an efficient scanning order that adapts to LF content. This provides optimized performance for almost all LF content data types. In addition, the up-to-date WZ coding solution based on embedded adaptive frame skipping decisions significantly outperforms the relevant H.265/HEVC Intra and DVC-H.265/HEVC codecs. In particular, the proposed coding solution improves the compression performance and has lower computational complexity than both of the relevant benchmarks. Hence, the proposed WZ-LF coding solution meets the requirements for many emerging applications, e.g., visual sensor networks, video surveillance, and remote space transmission.
In future research, other LF image components, i.e., noise and depth maps, could be analyzed in order to provide better quality LF reconstruction. Thus, the proposed WZ-LF coding solution may be further improved.