Analysis of Variable-Length Codes for Integer Encoding in Hyperspectral Data Compression with the k2-Raster Compact Data Structure

This paper examines several variable-length encoders that provide integer encoding for hyperspectral scene data within a k2-raster compact data structure. This compact data structure yields a compression ratio similar to that produced by some classical compression techniques, while also providing direct query access to its data elements without requiring any decompression. The selection of the integer encoder is critical for obtaining competitive performance in terms of both compression ratio and access time. In this research, we present experimental results for different integer encoders such as Rice, Simple9, Simple16, PForDelta codes, and DACs. Further, a method to determine an appropriate k value for building a k2-raster compact data structure with competitive performance is discussed.


Introduction
Hyperspectral scenes [1][2][3][4][5][6][7][8][9][10] are data taken from the air by sensors such as AVIRIS (Airborne Visible/Infrared Imaging Spectrometer) or by satellite instruments such as Hyperion and IASI (Infrared Atmospheric Sounding Interferometer). These scenes are made up of multiple bands from across the electromagnetic spectrum, and data extracted from certain bands are helpful in finding objects such as oil fields [11] or minerals [12]. Other applications include weather prediction [13] and wildfire soil studies [14], to name a few. Due to their sizes, hyperspectral scenes are usually compressed to facilitate their transmission and reduce storage size.
Compact data structures [15] are a type of data structure where data are stored efficiently while at the same time allowing real-time processing and compression of the data. They can be loaded into main memory and accessed directly by means of the rank and select functions [16] defined on the structures. Compressed data provide reduced space usage and query time, i.e., they allow more efficient transmission through limited communication channels, as well as faster data access. There is no need to decompress a large portion of the structure to access and query individual data, as is the case with data compressed by classical compression algorithms such as gzip or bzip2 and by specialized algorithms such as CCSDS123.0-B-1 [17] or KLT+JPEG 2000 [18,19]. In this paper, we are interested in lossless compression of hyperspectral scenes through compact data structures. Therefore, reconstructed scenes are identical to the originals before compression, and any deterministic analysis process will necessarily yield the same results. Figure 1 shows several images from our datasets. The compact data structure used in this paper is called k2-raster. It is a tree structure developed from another compact data structure called k2-tree. A k2-raster is built from a raster matrix whose cells hold integer values, while a k2-tree is built from a bitmap matrix of zero and one values. During the construction of the k2-raster tree, if neighboring pixels have equal values, as in clusters (spatial correlation), the number of nodes that need to be saved in the tree is reduced. If the values are merely similar, the stored values, as discussed later in this paper, are still made smaller. They are then compressed or packed in a more compact form by the integer encoders, and with these small integers, the compression results are even better. Moreover, when it comes to querying cells, the tree structure speeds up the search, saving access time.
Another added advantage of some of the integer encoders is that they provide direct random access to the cells without any need for full decompression.
Currently, huge amounts of remote sensing data are being produced, transmitted, and archived, and we can foresee that the volume of these datasets will keep growing at a fast rate. The need for their compression is becoming more pressing and critical. In view of this trend, we take on the task of remote sensing compression and make it one of our main objectives. In this research work, we reduce hyperspectral data sizes by using compact data structures to produce lossless compression. Early on, we began by examining the possibility of taking advantage of the spatial and spectral correlation in the data. In our previous paper [20], we presented a predictive method and a differential method that made use of these correlations in hyperspectral data with favorable results. In this paper, however, we focus on selecting a suitable integer encoder to be employed in the k2-raster compact data structure, as that is also a major factor in providing competitive compression ratios.
Compressing integer data effectively and efficiently, in relation to compact data structures, has been the focus of many studies over the past several decades. Examples include Elias [21][22][23], Rice [24][25][26], PForDelta [27][28][29], and Directly Addressable Codes (DACs) [30][31][32]. In our case, we need to store non-negative, typically small integers in the k2-raster structure. This structure is a tree built in such a way that the nodes are not connected by pointers, but can still be reached with the use of the rank function provided by the compact data structure. When the data are saved, no pointers need to be stored, thus keeping the size of the structure small. Additionally, we use a fixed code ([15], §2.7) to help us save even more space. In what follows, we investigate the effectiveness of some of these integer encoders.
The rest of the paper is organized as follows: In Section 2, we describe the k2-raster structure, followed by the various variable-length integer encoders such as Elias, Rice, PForDelta, and DACs. Section 3 presents experimental results for finding the best and optimal values of k and explores the different integer encoders for k2-raster, in comparison with classical compression techniques. Lastly, some conclusions and thoughts on future work are put forth in Section 4.

Materials and Methods
In this section, we describe k2-raster [33] and the integer encoders Elias, Rice, Simple9, Simple16, PForDelta codes, and DACs. This is followed by a discussion on how to obtain the best value of k and two related works on raster compression: heuristic k2-raster [33] and 3D-2D mapping [34].

K2-Raster
The k2-tree structure was originally proposed by Ladra et al. [35] as a compact representation of the adjacency matrix of a directed graph. Its applications include web graphs and social networks. Based on k2-tree, the same authors also proposed k2-raster [33], which is specifically designed for raster data, including images. A k2-raster is built from a matrix of width w and height h. If the matrix can be partitioned into k^2 square subquadrants of equal size, it can be used directly. Otherwise, it is necessary to enlarge the matrix to size s × s, where s is computed as:

s = k^⌈log_k max(w, h)⌉,     (1)

setting the new elements to 0. This extended matrix is then recursively partitioned into k^2 submatrices of identical size, referred to as quadrants. This process is repeated until all cells in a quadrant have the same value, or until the submatrix has size 1 × 1 and cannot be further subdivided. This partitioning induces a tree topology, which is represented in a bitmap T. Elements can then be accessed via a rank function. At each tree level, the maximum and minimum values of each quadrant are computed. These are then compared with the corresponding maximum and minimum values of the parent, and the differences are stored in the Vmax and Vmin arrays of each level. Saving the differences instead of the original values results in lower values for each node, which in turn allows a better compression with DACs or other integer encoders such as Simple9, PForDelta, etc. An example of a simple 8 × 8 matrix is given in Figure 2 to illustrate this process. A k2-raster is constructed from this matrix with maximum and minimum values as given in Figure 3. Differences from the parents' extrema are then computed as explained above, resulting in the structure shown in Figure 4. Next, with the exception of the root node at the top level, the Vmax and Vmin arrays at all levels are concatenated to form Lmax and Lmin, respectively. Both arrays are then compressed by an integer encoder such as DACs.
The root's maximum (rMax) and minimum (rMin) values remain uncompressed. The resulting elements, which fully describe this k 2 -raster structure, are given in Table 1.

Unary Codes and Notation
We denote x as a non-negative integer. The expression |x| gives the minimum bit length needed to express x, i.e., |x| = ⌊log2 x⌋ + 1.
Unary codes are generally used for small integers. They have the following form: u(x) = 0^x 1, where the superscript x indicates the number of consecutive 0 bits in the code. For example, u(1_d) = 0^1 1 = 01_b, u(6_d) = 0^6 1 = 0000001_b, u(9_d) = 0^9 1 = 0000000001_b. Here, bits are denoted by a subscript b and decimal numbers by a subscript d. Furthermore, when codes are composed of two parts, they are spaced apart for readability purposes. In general, the notation used in [15] is adopted in this paper.
Figure 4. From the tree in Figure 3, the maximum value of each node is subtracted from that of its parent, while the minimum value of the parent is subtracted from the node's minimum value. These differences then replace the corresponding values in the node. The maximum and minimum values of the root remain the same.

Elias Codes
Elias codes include Gamma (γ) codes and Delta (δ) codes. They were developed by Peter Elias [21] to encode natural numbers, and in general, they work well with sequences of small numbers.
Gamma codes have the following form: γ(x) = u(|x| − 1) : [x]_{|x|−1}, where [x]_l represents the l least significant bits of x; in other words, γ(x) consists of |x| − 1 zeros followed by the binary representation of x. For example, γ(5_d) = 001 01 = 00101_b. Delta codes have the following form: δ(x) = γ(|x|) : [x]_{|x|−1}. For values that are larger than 31, Delta codes produce shorter codewords than Gamma codes. This is due to the use of Gamma codes in forming the first part of their codes, which provides a shorter code length for Delta codes as the numbers become larger. Some examples are: δ(9_d) = 00100 001_b and δ(33_d) = 00110 00001_b.

Rice Codes
Rice codes [25] are a special case of Golomb codes. Let x be an integer value in the sequence, and let y = ⌊x/2^l⌋, where l is a non-negative integer parameter. The Rice code for this parameter is defined as: R_l(x) = u(y) : [x]_l, i.e., the unary code of the quotient y followed by the l least significant bits of x. Some examples for different values of l are shown in Table 2.
Table 2. Some examples of Rice codes.
To obtain optimal performance among Rice codes, l should be selected so that 2^l is close to the mean of the input integers. In general, Rice codes give better compression performance than Elias γ and δ codes.

Simple9, Simple16, and PForDelta
Apart from Elias codes and Rice codes, the codes in this section store the integers in single or multiple word-sized elements to achieve data compression. They have been shown to have good compression ratios [30].
Simple9 [36] packs as many integers as possible, each of a certain bit length, into the 28-bit packing space of a 32-bit word. The other 4 bits contain a selector that has a value ranging from 0 to 8. Each selector indicates how the integers are stored, namely the number of integers and the maximum number of bits that each integer is allowed in this packing space. For example, Selector 0 tests whether the first 28 integers in the data have a value of 0 or 1, i.e., a bit length of 1. If they do, then they are stored in this 28-bit segment. Otherwise, Selector 1 tests whether it can pack 14 integers into the segment with a maximum bit length of 2 bits each. If this still does not work, Selector 2 tests whether 9 integers can each be packed with a maximum bit length of 3 bits. This testing goes on until a number of integers is found that can be stored in these 28 bits. Table 3 shows the 9 different ways of using 28 bits in a word of 32 bits in Simple9.
Simple16 [37] is a variant of Simple9 and uses all 16 combinations in the selector bits. Their values range from 0 to 15. Table 4 shows the 16 different ways of packing integers into the 28-bit segment in Simple16.
PForDelta [27] is also similar to both Simple9 and Simple16, but encodes a fixed group of numbers at a time, typically in blocks of 128 or 256 integers.
Due to its relative simplicity, Simple9 is used here as an example to illustrate how an integer sequence is stored by the encoders described in this section. The sequence <3591 25 13 12 15 12 11 26 20 8 13 8 9 7 13 10 12 0 10>_d is taken from the Lmax array of one of our data scenes, AG9, and the bit-packing is shown in Table 5. There are 19 integers in the sequence. Assuming each integer occupies 16 bits, the sequence has a total size of 38 bytes. After packing into the array, the sequence occupies only 16 bytes.
Table 3. Nine different ways of encoding numbers in the 28-bit packing space in Simple9.

Directly Addressable Codes
Directly Addressable Codes (DACs) can be used to compress k 2 -raster and provide access to variable-length codes. Based on the concept of compact data structures, DACs were proposed in the papers published by Brisaboa et al. in 2009 [30] and 2013 [31]. This structure is proven to yield good compression ratios for variable-length integer sequences. By means of the rank function, it gains fast direct access to any position of the sequence in a very compact space. The original authors also asserted that it was best suited for a sequence of integers with a skewed frequency distribution toward smaller integer values.
Different types of encoding can be used for DACs, and the one that we are interested in for k2-raster is called VByte coding. Consider a sequence of integers x. Each integer x_i, which is represented by ⌊log2 x_i⌋ + 1 bits, is broken into chunks of size C_S bits. Each chunk is stored in a block of size C_S + 1, with the additional bit used as a control bit. The chunk occupies the lower bits of the block and the control bit the highest bit. The block that holds the most significant bits of the integer has its control bit set to 0, while the others have it set to 1. For example, if we have the integer 41_d (101001_b), which is 6 bits long, and the chunk size is C_S = 3, then we have 2 blocks: 0101 1001_b, where the first bit of each block is the control bit. To show how the blocks are organized and stored, we again illustrate with an example. Given five integers of variable length and a chunk size of 3 (the block size is 4), their representations are listed in Table 6. We store them in three blocks of arrays A and control bitmaps B, as depicted in Figure 5. To retrieve the values in the arrays A, we make use of the corresponding bitmaps B with the rank function. This function returns the number of bits set to 1 from the beginning of the control bitmap B_i up to the queried position. An example of how the function is used follows: If we want to access the third integer (100_d) in the sequence in Figure 5, we start by looking for the third element in the array A_1 in Block 1 and find A_3,1 with its corresponding control bit B_3,1. The function rank(B_3,1) then gives a result of 2, which means that the second element A_3,2 of the array A_2 in Block 2 contains the next block. With the control bit in B_3,2, we compute the function rank(B_3,2) and obtain a result of 1. This means the next block can be found in the first element A_3,3 of Block 3. Since its corresponding control bit B_3,3 is set to 0, the search ends here.
All the blocks found are finally concatenated to form the third integer in the sequence.
More information on DACs and the software code can be found in the papers [30,31] by Brisaboa et al.

Selection of the k Value
Following the description in Section 2.1, using different k values leads to the creation of Lmax and Lmin arrays of different lengths. This, in turn, affects the final size of the k2-raster. With this in mind, we present a heuristic approach that can be used to determine the best k value for obtaining the smallest storage size. First, we compute the sizes of the extended matrix for different values of k within a suitable range using Equation (1). Then, we find the k value that corresponds to the matrix with the smallest size, and this value can be considered the best k value. Before the start of the k2-raster building process, the program can find the best k value and use it as the default.

Heuristic k2-Raster
In the k2-raster paper by Ladra et al. [33], a variant of this structure was also proposed whereby the elements at the last level of the tree are stored using an entropy-based heuristic approach. This variant is denoted by k2H-raster. For example, for k = 2, each set of 4 nodes from the same parent forms a codeword. At this level of the tree, codewords may be repeated, and their frequencies of occurrence can be computed. These sets of codewords and their frequencies are then compressed and saved. In effect, the more often a codeword is repeated, the less storage space it takes up. An example of codeword frequency based on the k2-raster discussed in Section 2.1 is shown in Table 7. According to experiments conducted by the authors of [33], this variant saves space in the final representation.

3D-2D Mapping
A compact representation of raster images in a time series was proposed by Cruces et al. in [34]. This method is based on a 3D-to-2D mapping of a raster, where 3D tuples <x, y, z> are mapped into a 2D binary grid. That is, a raster of size w × h with values in a certain range, between 0 and v inclusive, is represented by a binary matrix of w × h columns and v + 1 rows. All the rasters are then concatenated into a 3D matrix and stored as a 3D-k2-tree.

Experimental Results
In this section, we present an exhaustive comparison of the different integer encoders for use with k2-raster. First, though, we report results from experiments for finding the best k value. We also report experiments to determine whether the heuristic k2-raster and 3D-2D mapping give better storage sizes. All storage sizes in this section are expressed as bits per pixel per band (bpppb).
The implementations of k2-raster and k2H-raster were based on the algorithms presented in the paper by Ladra et al. [33]. The sdsl-lite implementation of k2-tree by Simon Gog [38] (https://github.com/simongog/sdsl-lite/blob/master/include/sdsl/k2_tree.hpp) was used for testing the 3D-2D mapping described in the paper by Cruces et al. [34]. The DACs software was downloaded from a package called "DACs, optimization with no further restrictions" at the Universidade da Coruña's Database Laboratory website (http://lbd.udc.es/research/DACS/). The programming code for the Rice, PForDelta, Simple9, and Simple16 codes was written by the programmers Diego Caro, Michael Dipperstein, and Christopher Hoobin and was downloaded from these authors' GitHub web pages. Slight modifications were made to the code to meet the requirements of our experiments. All programs for this paper were written in C and C++ and compiled with gnu g++ 5.4.0 20160609 with -Ofast optimization. The experiments were carried out on an Intel Core 2 Duo CPU E7400 @2.80GHz with 3072KB of cache and 3GB of RAM. The operating system was Ubuntu 16.04.5 LTS with kernel 4.15.0-47-generic (64 bits). The software code is available at http://gici.uab.cat/GiciWebPage/downloads.php.
Table 8. Hyperspectral scenes used in our experiments. Also shown are the bit rate and bit rate reduction using k2-raster. x is the scene width, y the scene height, and z the number of spectral bands. bpppb, bits per pixel per band; CRISM, Compact Reconnaissance Imaging Spectrometer for Mars; IASI, Infrared Atmospheric Sounding Interferometer. *: Calibrated (C) or Uncalibrated (U).

Best k Value Selection
From our previous research [20], the selection of the k value when building a k2-raster was shown to have a great effect on the resulting size of the structure, as well as on the access time to query its elements. In order to investigate this further, we extended our research to finding ways of choosing the best k value. One way was to build the k2-raster structure with different k values for scene data from each sensor to see how the matrix size affected the choice of the k value. Additionally, we measured the time it took to build the k2-raster and the size of the structure. The results are shown in Table 9. For most tested data, the k value leading to the smallest extended matrix size (attribute S in the table) usually provided the fastest build time and the smallest storage size. From these results we can say that, in general, when k = 2, the compressed data size was large, sometimes even larger than the size of the original scene. As the value of k became larger, beginning with k = 3, the compressed data size was reduced. As far as the compressed size was concerned, the best value was in the range from 3 to 10 for matrices with a small raster size (i.e., if both the original width and original height were less than 1000), such as the ones for the AIRS Granule or AVIRIS Yellowstone scenes. If at least one dimension was larger than 1000, as for the Hyperion calibrated or uncalibrated scenes, a larger range, typically between 3 and 20, needed to be considered.
Table 9. Results for different k values using the scene data from each sensor for the following attributes: (S) the extended matrix Size (pixels), (C) the k2-raster Compressed storage data rate (bpppb), and (B) the time to Build the k2-raster (seconds). The original scene width and height are shown in the first column. The best results are highlighted in blue.
The above experiments were repeated to compare the access time for the different k values.
For each scene, the average time over 100,000 consecutive queries is reported. Results are shown in Table 10, and Figure 6 shows how the access time and the size varied depending on the k value. As can be observed, access time became smaller and smaller as the value of k became larger. The plotted data suggested that there was a trade-off between access time and size with respect to the k value. We considered the optimal k value to be the one that created a relatively small size with a minimal access time. For example in AG9, when comparing the results between k = 6 and k = 15, the difference in bits per pixel per band for storage size was not very significant, but the reduction in access time was. Therefore, for this scene, k = 15 was considered an optimal value. For AIRS Granule 9, the best value is marked with a red circle, and the optimal value is marked with a blue square.

Heuristic k2-Raster
In this section, we present the results of experiments using the heuristic k2H-raster proposed by Ladra et al. [33] on some of our datasets. Table 11 reports results for two hyperspectral scene datasets: AIRS Granule and AVIRIS Uncalibrated Yellowstone. In the experiments, we found that only when k = 2 were there enough repeated sets of codewords in the last level of nodes to save space. When k ≥ 3, there were no repeated sets of codewords. From the table, it can be seen that there was not much size reduction with k2H-raster in most cases. However, if we built a k2-raster using the best or optimal k value, the size was considerably smaller. Therefore, the k2H-raster structure did not produce a better size.
Table 11. Comparison of the structure size (bpppb) built from k2-raster and k2H-raster where k = 2. The sizes for k2-raster using the best k value and the optimal k value are also shown. The best results are highlighted in blue.

3D-2D Mapping
As discussed earlier, Cruces et al. [34] proposed a 3D-to-2D mapping of raster images using k2-tree as an alternative way to achieve a better compression ratio. We used the k2-tree implementation in the sdsl-lite software to obtain the sizes for one of our datasets (AG9) from k = 2 to k = 4. Note that, similar to k2-raster, if the 2D binary matrix cannot be partitioned into square subquadrants of equal size, it needs to be expanded using Equation (1), with the extra elements set to zero. The results are presented in Table 12. The sizes for a range of bands (1481 to 1500) of the scene are also given for comparison.
From the results for AG9, we can see that the 3D-2D mapping did not make the size smaller. Instead, it became larger when the k value increased, and therefore, the method did not produce competitive results.

Comparison of Integer Encoders for k2-Raster
Experiments were conducted to determine whether other variable-length integer encoders might serve as a better substitute for DACs, the original choice in the k2-raster structure proposed by Ladra et al. [33]. The performance of DACs was compared to that of other encoders, namely the Rice, Simple9, Simple16, and PForDelta codes, and gzip. In these experiments, the Lmax and Lmin arrays were encoded using these codes, and the results are shown in Table 13. For Rice codes, the l value, as explained in Section 2.4, produced different results depending on the mean of the raster's elements, and only the results with the best l value are shown.
Table 13. A comparison of the storage size (in bpppb) using different integer encoders on Lmax and Lmin from the k2-raster built from our datasets. The combined entropies for Lmax and Lmin are listed as a reference. The l value that was used in Rice codes is enclosed in brackets. The best and optimal k values for DACs are also enclosed in brackets. Except for the entropy, the best rates for each scene's data are highlighted in blue.
The results show that, in most cases, DACs still provided the best storage size compared to the other encoders for our datasets. DACs also have the added advantage of direct random access to individual elements of the matrix, whereas the other encoders would need to decompress each raster in order to retrieve an element, thus requiring much longer access times. When DACs did not yield the best performance, their results were usually worse by less than 0.1 bpppb, and in the worst cases lagged behind by at most 0.4 bpppb.

Conclusions
In this research, we examined the possibility of using different integer coding methods for k2-raster and concluded that this compact data structure worked best when used in tandem with DACs encoding. The other variable-length encoders, though achieving competitive compression ratios, lacked the ability to provide users with direct access to the data. We also studied a method whereby we could obtain a k value that gives a competitive storage size and, in most cases, a suitable access time as well.
For future work, we are interested in investigating the feasibility of modifying elements in a k2-raster structure, facilitating data replacements without having to go through cycles of decompression and compression for the entire compact data structure.