Lossless and Near-Lossless Compression Algorithms for Remotely Sensed Hyperspectral Images

Rapid and continuous advancements in remote sensing technology have resulted in finer resolutions and higher acquisition rates of hyperspectral images (HSIs). These developments have triggered a need for new processing techniques brought about by the confined power and constrained hardware resources aboard satellites. This article proposes two novel lossless and near-lossless compression methods, employing our recent seed generation and quadrature-based square rooting algorithms, respectively. The main advantage of the former method lies in its acceptable complexity utilizing simple arithmetic operations, making it suitable for real-time onboard compression. In addition, the near-lossless compressor could be incorporated for hard-to-compress images, offering a stable reduction of nearly 40% with a maximum relative error of 0.33 and a maximum absolute error of 30. Our results also show that a lossless compression performance, in terms of compression ratio, of up to 2.6 is achieved when testing with hyperspectral images from the Corpus dataset. Further, an improvement in the compression rate over the state-of-the-art k²-raster technique is realized for most of these HSIs by all four variations of our proposed lossless compression method. In particular, a data reduction enhancement of up to 29.89% is realized when comparing their respective geometric mean values.


Introduction
The wealth of information in hyperspectral images (HSIs) and increases in sensor performance have opened perspectives for a variety of applications, including space exploration, remote sensing, medical imaging, environmental monitoring, industrial quality control, and forensic science, among many others [1,2]. While the full potential of HSI techniques has not been fully explored, there is an increasing demand for this technology in the marketplace and in various aspects of life. According to Business Communications Company (BCC) research [3], the global market for HSI has grown at a compound annual growth rate (CAGR) of 15.1% for the period from 2018 to 2023.
A major problem with hyperspectral images is their immense size, collected by hundreds of contiguous spectral bands, where the size of each HSI can easily reach hundreds of megabytes. This would engender logistical problems in terms of storage, transmission, and processing. According to the Consultative Committee for Space Data Systems (CCSDS), the large data volume associated with hyperspectral images poses significant challenges for the myriad resources utilized for data processing in both onboard satellites and on-ground stations [4]. As a result, the use of efficient compression methods to decrease the size of these images without compromising their valuable information becomes mandatory. The availability of such compression techniques for hyperspectral images is a key enabler for unlocking the full potential of this powerful technology. It would pave the way for exploiting its full capability and enhancing our understanding of the world around us in meaningful ways. This motivated us further to undertake this research.
Image compression methods are mainly categorized into lossless and lossy compression techniques. Lossless compression allows for the exact reconstruction of the image at the expense of realizing modest compression ratios when compared to lossy compression methods. This is due to the theoretical boundary imposed by entropy on lossless compression [5]. Entropy depends on the statistical nature of the source data and is defined as the average Shannon information content. Let X be the set of all possible outcomes x_i of these source data. The entropy of the set X, denoted by H(X), is formulated as follows:

H(X) = −∑_(i=1)^M p(x_i) log_2 p(x_i)    (1)

where M is the cardinality of the set X; that is, M = |X|, and p(x_i) corresponds to the probability of the outcome x_i [6]. In this work, the unit of the calculated entropy is bits, though the calculated entropy can be expressed in various units depending on the application. Accordingly, the range of H(X) is bound within [0, log_2 |X|].
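For concreteness, the empirical entropy of Equation (1) can be computed from symbol frequencies. The following minimal Python sketch is our own illustration (not part of any cited codec):

```python
import numpy as np

def entropy_bits(data):
    """Shannon entropy H(X), in bits, of the empirical distribution of data."""
    _, counts = np.unique(np.asarray(data), return_counts=True)
    p = counts / counts.sum()          # empirical probabilities p(x_i)
    return float(-np.sum(p * np.log2(p)))
```

For a source with four equiprobable symbols this returns 2 bits, the upper bound log_2 |X| for M = 4, while a constant source yields 0 bits.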
On the other hand, the implementation of lossy compression techniques relies heavily on employing transform-based methods such as principal component analysis (PCA) [16,17], discrete cosine transform (DCT) [18,19], discrete wavelet transform (DWT) [20], pairwise orthogonal transform (POT) [21], and the lossy compression algorithm for hyperspectral images (HyperLCA) [22]. The integer versions of the transform-based techniques are also applied for lossless compression, with limited results in terms of compression ratios. In this category, the integer version of the Karhunen-Loève transform (KLT) is utilized to achieve lossless compression [23]. Similarly, the integer-based DWT is adopted in lossless compression as a part of the JPEG 2000 compression algorithm [24].
A third category, called near-lossless, achieves higher compression ratios than lossless compression while limiting the pixel distortion to a pre-defined absolute or relative error [25-29]. A stricter definition of near-lossless limits the maximum error to the intrinsic noise of the original data, produced by the instrument used or by other sources, such as atmospheric correction [30]. Generally, near-lossless compression is achieved by one of the following means: (1) losslessly coding the quantized prediction error; (2) applying pre-quantization of the original image followed by lossless coding; or (3) a two-stage near-lossless coding.
The first category is the most popular due to its low complexity. Here, the typical prediction technique is the context-based, adaptive, lossless image codec (CALIC) method [31]. It can be extended to 3D-CALIC [32] and M-CALIC [33] by exploiting the correlation present in the hyperspectral data. However, CALIC-based algorithms are not hardware-friendly, as reported by Zheng et al. [25]. Nonetheless, two compression techniques tailored to hardware implementations, promulgated by the Consultative Committee for Space Data Systems, are the near-lossless version of the CCSDS-123 algorithm (NL-CCSDS-123) [34] and the CCSDS-123-AC, which employs a context-based arithmetic coder (AC) and offers lower computational complexity [35]. Both of these methods use CCSDS-123 as a predictor; the former employs a range encoder, whereas the latter employs a context-based arithmetic encoder. The range encoder is a simplified version of the arithmetic encoder, though it yields suboptimal results [34,35].
In the second category, lossless compression is performed on a pre-quantized image. Such techniques yield poor compression performance with increasing tolerance values [27]. To bound the tolerance to a certain δ value, the quantizer step size is set to 2δ + 1 [33]. The technique proposed in [36] falls under this category. It extends the existing CCSDS algorithm for Image Data Compression (CCSDS-IDC) with a pre-quantizer to increase the compression rate. The quantization in this study is carried out by employing a quantization table instead of a scalar quantizer.
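The role of the 2δ + 1 step size can be illustrated with a short sketch. The uniform scalar pre-quantizer below is a simplified illustration of our own (not the table-based quantizer of [36]); it guarantees a maximum absolute error of δ upon reconstruction:

```python
def quantize(x, delta):
    """Uniform pre-quantization with step size 2*delta + 1."""
    return (x + delta) // (2 * delta + 1)   # index of the quantization bin

def dequantize(q, delta):
    """Reconstruct the bin representative; the absolute error is at most delta."""
    return q * (2 * delta + 1)
```

For example, with δ = 2 every sample is reconstructed within ±2 of its original value, so the tolerance bound is met by construction.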
The third category combines both lossy and lossless compression. The residual image resulting from the difference between the original image and the lossy compressed image is quantized and then losslessly encoded. A two-stage coder is proposed by Beerten et al., with JPEG 2000 being the lossy layer [27]. The residual image is then encoded bit-plane by bit-plane using binary arithmetic coding. This allows the encoder to terminate the coding process at any arbitrary bit rate. Other studies propose pairing the lossless compression technique CALIC with a wavelet-based approach [26,37] or with JPEG 2000 [38] as the lossy layer.
In our previous work reviewing hardware-accelerated compression of remotely sensed hyperspectral images, results show that lossless compression started to gain the attention of the research community as early as 2009 [9]. Interest increased thereafter, perhaps due to the growing demand for loss-free hyperspectral images by a myriad of research and development projects for various analysis tasks. Our investigation also shows that there is a limited number of studies on the development of near-lossless compression when compared to the rich literature on lossless and lossy compression methods [9].
Overall, the work described herein makes the following main research contributions:
• A novel lossless compression technique for remotely sensed hyperspectral images is proposed by employing our recent method of seed generation based on bit manipulation techniques [39]. Four variations are employed in our experiments using the Corpus dataset of HSIs. Our performance results yield an enhancement in data reduction that reaches 29.89% when comparing the corresponding geometric mean value with that obtained by the state-of-the-art k²-raster method [40].
• A novel near-lossless compression of HSIs is also proposed by incorporating our published quadrature-based square rooting method [39]. A data reduction that varies from 38.90% to 39.73% is realized with a maximum relative error of 0.33 and a maximum absolute error of only 30. Since hyperspectral images with high entropies are hard to losslessly compress due to their reduced correlation, this approach can be applied with a small to negligible impact on the accuracy of the decompressed data.
The rest of the paper is structured as follows: Section 2 gives a short review of some recent works related to the compression of HSIs. Then, Section 3 describes the lossless compression of remotely sensed hyperspectral images utilizing our seed generation approach. This is followed by the proposed near-lossless compression employing our quadrature-based square rooting method in Section 4. In Section 5, we present the experimental results and compression performance obtained by both proposed compression systems. Finally, our concluding remarks and future work are provided in Section 6.

Related Work
In this section, we review some recent studies pertaining to this research focus, with a particular emphasis on works published since 2018. Some of the reviewed papers span HSI compression systems that are transform based and tensor based. Others rely on a multitude of computational techniques that include machine learning, compressive sensing, and recursive least-squares.
In the realm of machine learning and, in particular, deep learning, several algorithms have been proposed [41-43]. In 2019, a lossy hyperspectral image compression system that combines an onboard predictive compressor and a ground-based convolutional neural network (CNN) for image reconstruction was described in [41]. Results show that the quantization of hyperspectral data followed by the lossless mode of CCSDS-123.0-B-2 outperforms the lossy mode of the standard in terms of speed. Furthermore, a comparable rate-distortion performance is achieved by incorporating the ground-based CNN. Deep learning is also employed in lossless compression, as proposed in [42], by leveraging deep recurrent neural networks (RNNs) to improve the prediction accuracy of the traditional DPCM approach. Another approach based on machine learning involves the application of support vector regression (SVR) [43]. First, a 3D wavelet transform is used to simultaneously capture spatial and spectral features. Then, SVR is used to predict the behavior of the 3D wavelet coefficients. Entropy encoding, including run-length encoding and arithmetic encoding, is used to achieve a high rate-distortion performance of 40.65 dB that yields a classification rate baseline of 75.8% [43]. A lossy hyperspectral image compression algorithm, leveraging autoencoders and deep learning techniques, is described in [44]. It yields significant improvements in compression ratio and a 28% enhancement in peak signal-to-noise ratio (PSNR). The classification accuracy remains largely unaffected by the compression process, demonstrating effectiveness in preserving image quality. In 2023, a novel hyperspectral compression network via contrastive learning (HCCNet) was proposed to enhance feature distinction [45]. Contrastive learning aims to learn useful data representations by comparing similar and dissimilar pairs of examples [46,47]. Innovative modules like contrastive informative feature encoding (CIFE) and contrastive invariant feature recovery (CIFR) preserve informative attributes, resulting in significant improvements. The former, CIFE, is designed to extract and organize discriminative attributes, while the latter, CIFR, is intended to recover lost attributes caused by compression. However, a semantic gap between compressed and original images remains, indicating areas for future research.
Multiple studies have explored the application of transform-based methods in the compression of hyperspectral images. In a recent study by Melián et al. [48], the information extracted from subsequent frames is reused to speed up the compression performance of the lossy compressor known as HyperLCA. Results show a 27% reduction in floating-point operations (FLOPs) while maintaining a comparable signal-to-noise ratio (SNR). Another study focuses on the hardware aspect of the algorithm through the introduction of HW-HyperLCA [22]. The latter utilizes integer arithmetic for high compression ratios (CRs) and compression-distortion ratios (CDRs). Further, the method supports the development of different integer versions that can be implemented for specific applications. A lossy compression scheme using three-dimensional wavelet block tree coding (3D-WBTC), which exploits intra- and inter-sub-band correlations, is disclosed in [49]. The proposed method outperforms the three-dimensional set partition embedded block coding (3D-SPECK) strategy at low bit rates and exhibits faster decoding and encoding times. The work in [50] introduces a method to enhance three-dimensional discrete cosine transform (3D-DCT)-based compression of hyperspectral images by using a luminance transform. The approach involves two steps: initially applying the luminance transform to reduce brightness and contrast differences within spectral band groups, followed by compression using 3D-DCT and entropy encoding. Compared to the traditional 3D-DCT, this method shows improved results [50]. Tzamarias et al. introduce integer-to-integer transforms within bi-orthogonal graph filter banks to enable lossless compression [51]. Such filter banks decompose hyperspectral images into different frequency sub-bands using graph-based operations, which take into account both the spatial and spectral correlations in the data. This decomposition enables efficient compression by focusing on capturing the most relevant information in the image while discarding redundant or less critical components.
Tensor-based compression methods have gained interest for effectively managing complex multi-dimensional data, especially in hyperspectral image compression. A new approach called tensor robust CUR (TRCUR), aimed at addressing the challenges of compressing and denoising hyperspectral data, which frequently suffer from quality degradation due to noise, is provided in [52]. While the reported method, tensor robust principal component analysis (TRPCA), is effective, it imposes significant computational demands, especially for large datasets that may exceed memory capacity limits. To overcome this constraint, TRCUR adopts a divide-and-conquer strategy by heavily downsampling the input data to create smaller subtensors. TRPCA is then applied to these subtensors to obtain low-rank solutions. Finally, the desired hyperspectral image is reconstructed by combining these low-rank solutions using tensor CUR reconstruction. Expanding upon the concept of CUR decomposition, it is a technique where a given matrix A is broken down into three matrices: C, U, and R. Matrix C contains a small subset of actual columns from A, while R consists of a small subset of actual rows from A. Matrix U is meticulously constructed to ensure that the product C × U × R closely approximates the original matrix A [53]. Another study presents a compression method based on tensor decomposition for hyperspectral images [54]. It involves obtaining a differential representation of the data between consecutive spectral bands. Then, the first band is compressed using JPEG, while the differential data are compressed using a special mathematical operation called sparse Tucker tensor decomposition. During decoding, the compressed first band and differential data are combined to reconstruct the hyperspectral image [54].
Compressed sensing (CS) has emerged as a powerful tool for hyperspectral image compression, offering efficient reconstruction from sparse measurements [55]. Compressed sensing, combined with dictionary learning, is employed for lossless compression and reconstruction of hyperspectral images in [56]. The method involves training a block-sparse dictionary without prior knowledge about how the training data are grouped. Then, a measurement matrix is used to compress the hyperspectral data. The final step involves reconstructing the image by using the trained dictionary incorporating classification features of the hyperspectral data. CS is also employed in [57], where sparsification of hyperspectral image and reconstruction (SHSIR) is introduced. The adopted method incorporates the robust minimum volume simplex algorithm (RMVSA) to improve the accuracy of endmember extraction, while the Bregman solver is employed to boost the reconstruction accuracy. Another method, called context-aware compressed sensing (CACS), is presented in [58]. This method incorporates contextual information into the process of learning the dictionary and reconstructing the hyperspectral images. Further, the study in [59] employs compressed sensing in the spectral dimension by sparsely expressing this dimension as a column vector and utilizing a small dictionary. This approach avoids repeated calculations and eliminates the effects of blocking.
The recursive least-squares method stands out as a commonly employed technique for hyperspectral image compression. In this regard, the bimodal conventional recursive least-squares (B-CRLS) method is designed to address issues like misalignment-induced boresight effects and blurry band predictions in hyperspectral images [60]. Due to the improved band prediction, the method achieves a comparable lossless compression performance with respect to adaptive-length and fixed-length CRLS while providing relatively lower computational times. Additionally, two parallel methods, SuperRLS and BSuperRLS, are proposed for lossless compression of hyperspectral images in [61]. These methods involve superpixel segmentation, parallel prediction using recursive least-squares, and encoding residuals with an arithmetic encoder. SuperRLS offers advantages such as parallelizability, competitive compression ratios, and reconstruction of pixels under selected superpixels. BSuperRLS, having a similar structure, achieves the best compression performance with lower computational times.
Finally, the works by Chow et al. describe the compression of hyperspectral images by utilizing a k²-raster compact data structure that achieves a compression ratio similar to classical techniques while enabling direct access without decompression [40,62]. The k²-tree structure serves as a compact representation of the adjacency matrix of a directed graph [63]. Chow et al. also suggest that the k²-raster structure works best when combined with directly addressable codes (DACs) [40,62]. We conclude this review of recent research works on HSI compression by summarizing the key aspects of the above-mentioned studies in Table 1.

Lossless Compression
We employ our recent method for seed generation to achieve lossless compression [39]. The main advantage of this approach lies in its reasonable time complexity based on using simple arithmetic operations, potentially making it suitable for real-time compression onboard satellites. Reduction is realized by utilizing the fact that the integer part of the square root of x requires ⌈n/2⌉ bits, where n is the number of bits required to obtain the binary representation of x [64]. In addition, for a uniformly distributed x ∈ N, the distribution of the values of ⌊√x⌋, given by f(s) ∝ 2s + 1, is skewed to the left, since each integer square root value s is shared by exactly 2s + 1 consecutive integers. This means, for example, that there are more integers mapped to the square root value of 9 (nineteen values comprising the integers from 81 to 99) than integers mapped to the square root value of 2 (five values comprising the integers from 4 to 8). This fact yields a reduced entropy of the integer square roots when compared to the corresponding values of x. As a result, further reduction can be realized by employing any simple entropy encoder.
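Both facts are easy to verify numerically: the integer square root of an n-bit value fits in ⌈n/2⌉ bits, and exactly 2s + 1 consecutive integers share the integer square root s. A small illustrative Python check (our own, for exposition only):

```python
import math

n = 16                                   # bit depth of the HSI samples
max_root = math.isqrt(2**n - 1)          # largest integer square root: 255
assert max_root.bit_length() == math.ceil(n / 2)

# the number of integers x with isqrt(x) == s is 2s + 1:
# five values for s = 2 (x in 4..8), nineteen for s = 9 (x in 81..99)
counts = {s: 2 * s + 1 for s in (2, 9)}
```

This skew toward larger roots is what lowers the entropy of the integer square roots relative to the raw samples.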
To losslessly achieve this reduction, we also need to preserve the fractional part of the square root for accurate retrieval of the original value of x. In the following sections, we provide details on the computation and encoding scheme of both the integral and fractional parts of the square root to achieve lossless compression.

Computation of the Integral Part
The initial estimation of the square root of x is given by the seed s_0. This seed can be obtained by averaging the value of the leftmost ⌈n/2⌉ bits of the binary representation of x, denoted as the most significant half (MSH), and the quantity 2^⌊n/2⌋, where n is as defined previously. This is formulated as follows:

s_0 = (MSH + 2^⌊n/2⌋)/2    (2)

We observe from the above equation that the seed s_0 estimates the integer square root of an unsigned integer x using only two operations: one addition followed by a single-bit right shift. This generated seed is an accurate estimate of the integer square root of unsigned integers up to eight bits. Beyond this size, a deviation from the correct square root begins to follow a pattern of connected parabola-like curves (PLCs) with a different focal point for each side of the curve (see Figure 1a). The vertices (v) of the PLCs are located at the even powers of two on the x-axis, whereas the peaks, denoted by x_peak, are seen at the odd powers of two [39].
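A minimal Python sketch of this seed computation follows (our reading of Equation (2), with n taken as the bit length of x; illustrative only):

```python
def seed(x):
    """Estimate the integer square root of x as the average of the most
    significant half (MSH) of x and 2**(n//2): one addition followed by
    a single-bit right shift."""
    n = x.bit_length()
    msh = x >> (n // 2)                  # leftmost ceil(n/2) bits of x
    return (msh + (1 << (n // 2))) >> 1
```

For example, seed(2**23) returns 3072, matching the peak deviation example discussed in this section, and seed(8464) returns 97 as in the rollback example.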
As noted from Figure 1, the generated error is always positive.This means that the estimated integer square root is always equal to or greater than the integer square root of  ( ≥ ⌊√⌋).We compensate for the generated error by utilizing the following observation.There are 2 non-square integers between every two consecutive square integers  and  .This is derived as follows: The magnitude of each one of the peaks, given by y peak , follows a first-degree polynomial provided by the equation below: where s peak is the calculated seed of x peak using Equation (2).For instance, the calculated seed of x peak = 2 23 is equal to 3072 with a deviation of +175.7 from the correct square root value of 2896.3.This error can be estimated using Equation (3), producing the same value of 175.7 = 0.0572 × 3072.It follows that by knowing both x peak and y peak , the focal point ( f ) for each side of the PLC is calculated as follows: Thus, by incorporating the equation of the vertical parabola, the deviation of the estimated square root (denoted by err) of an unsigned integer x is obtained as next: Since the bit depth of the selected HSIs does not exceed 16 bits, we consider two approaches to handling this error [65].We can either compensate for the error over the entire bit depth or completely avoid it by processing the acquired data as two distinct bytes.

Error Compensation
As noted from Figure 1, the generated error is always positive. This means that the estimated integer square root is always equal to or greater than the integer square root of x (s_0 ≥ ⌊√x⌋). We compensate for the generated error by utilizing the following observation. There are 2s_i non-square integers between every two consecutive square integers s_i^2 and s_(i+1)^2. This is derived as follows:

s_(i+1)^2 − s_i^2 − 1 = (s_i + 1)^2 − s_i^2 − 1 = 2s_i    (6)

In Figure 2, we illustrate this increasing distance between consecutive square integers relative to their corresponding square roots. According to the figure, a single step from the integer square root s_i to s_(i+1) translates to a distance of 2s_i + 1 between their corresponding square integers. Therefore, a deviation of d in the plane of integer square roots translates to a distance of ∑_(j=i)^(i+d−1) (2s_j + 1) in the plane of unsigned integers.
In order to find the desired square root s_i, we roll back from the deviated value s_(i+d) while knowing the value of x. This is achieved by first computing the difference between s_(i+d)^2 and x. Then, from the squared value of s_(i+d), we roll back a distance of ∑_(j=i)^(i+d−1) (2s_j + 1) towards x, one leap at a time, until the difference is no longer positive. For instance, the calculated seed for the square number x = 8464 is 97. This value deviates by +5 from the correct integer square root s = 92. Knowing the distance between x and 97^2, we can compensate for this deviation by first rolling back a distance of 2 × 96 + 1. If the distance remains positive, we subtract one from the estimated square root and repeat the process. This is performed multiple times until the correct integer square root is reached. This example is detailed below by showing the roll-back of a distance of 945 (= 97^2 − 8464) one leap at a time:

Step 1: 945 − (2 × 96 + 1) = 752,
Step 2: 752 − (2 × 95 + 1) = 561,
Step 3: 561 − (2 × 94 + 1) = 372,
Step 4: 372 − (2 × 93 + 1) = 185, and
Step 5: 185 − (2 × 92 + 1) = 0.

Thus, the correct integer square root value of 92 is reached when the remainder is equal to zero. However, the remainder reaches zero here because x is a perfect square. For non-square integers, the correct integer square root is reached when the remainder is less than zero. The example below shows the same procedure employing the non-square integer x = 24576. The generated seed s_0 for x = 24576 is 160. As a result, the distance between s_0^2 and x is 160^2 − 24576 = 1024. That is, we can apply the rollback as follows:

Step 1: 1024 − (2 × 159 + 1) = 705,
Step 2: 705 − (2 × 158 + 1) = 388,
Step 3: 388 − (2 × 157 + 1) = 73, and
Step 4: 73 − (2 × 156 + 1) = −240.

The iterative subtraction is stopped when the remainder is less than zero, giving the correct integer square root value of 156. We can generalize the above by stating that the correct integer square root is reached when the remainder is less than or equal to zero. We formulate the procedure of calculating the integer square root in Algorithm 1, for n > 8 bits, as given below:

Algorithm 1. Integer square root via rollback (n > 8 bits). Input: x, s_0. Output: s, the integer square root of x.

To reduce the computational complexity of finding the correct integer square root, we suggest computing the distances (2 × s_i + 1) in parallel. As illustrated in Figure 1b, the maximum deviation from the correct integer square root of a 16-bit unsigned integer is +11, located at x = 2^15. Therefore, we suggest computing the worst-case scenario (11 distances) at once and then serializing the jump using subtraction. As per [66], the numbers of cycles required by the integer instructions of the Intel Pentium and Pentium MMX are as follows: one clock cycle for addition or subtraction, one clock cycle for comparison, one cycle for a shift operation, and 11 cycles for multiplication. As a result, computing the total distance s_i^2 − x requires 12 cycles. Then, the eleven instances of s_i − 1 are unrolled and calculated in parallel within one cycle. This is followed by two cycles to calculate the jumps, (s_i ≪ 1) + 1, simultaneously. Finally, the conditional update of the distance D requires 11 cycles for subtraction and 12 cycles for comparison. In addition to the cost of two cycles incurred in calculating the seed s_0, we obtain a total of 40 clock cycles for calculating the correct integer square root of a 16-bit unsigned integer.

Error Avoidance
The generated error can be completely avoided by processing the 16-bit values of the hyperspectral data byte by byte. This produces an accurate estimate of the integer square root directly by using Equation (2). We note here that the sparsity of the hyperspectral data is expected to increase after their partitioning into 8-bit chunks, especially for decorrelated data. We define the sparsity of hyperspectral data to mean the percentage of zero elements comprised within the image [67]. Handling zeros before square rooting avoids unnecessary computations. Further, it improves the compression ratio by shortening long streams of zeros. This is achieved by removing all the zero bytes, leaving a single-bit indicator that instructs the decoder on how to interpret the forthcoming bits. A compression could be obtained only when the following condition related to sparsity is satisfied. Let p be the percentage of all-zero bytes in the partitioned data. Removing the zero elements reduces the average number of bits to (1 − p) × 9 + p × 1 = 9 − 8p. In order to obtain any reduction in this value, it is desired to have:

9 − 8p < 8, i.e., p > 1/8.

This means that the number of zero elements must occupy more than 12.50% for any reduction to be obtained using the aforementioned approach. A similar argument applies to unpartitioned hyperspectral data in the following way:

(1 − p) × 17 + p × 1 = 17 − 16p < 16, i.e., p > 1/16.

In this case, the percentage of zero elements is halved to 6.25% of the 16-bit values for any compression to be realized. Further reduction can be obtained by employing the run-length encoder (RLE), as the occurrence of the zero elements becomes significant after data decorrelation and partitioning. To maintain longer streams of zeros, we suggest processing the odd bytes first, as they are mostly zeros after the preprocessing stage. This processing scheme yields a greatly varying length of zero streaks. Therefore, to efficiently encode these lengths, the first vector of the RLE holds the unary code that indicates the number of bits required to represent each
length.The second vector holds the binary representation of these lengths.For example, the numbers: 365,698; 2; 356; 2; and 5254 represent the first five lengths of the alternating zero and non-zero streaks of the Mt.St. Helens image after preprocessing.The number of bits required to represent each length is 19, 1, 9, 1, and 13, respectively.We clarify here that the use of a single bit is sufficient to indicate two possible lengths (1 and 2) since the length value cannot be zero in this case.This also impacts the rest of the length values as their corresponding binary representations are decremented by one.Thus, the first vector contains the unary codes: 1111111111111111110, 0, 111111110, 0, 1111111111110.
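The unary/binary split of the run lengths described above can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation; the function name `length_code` is ours.

```python
def length_code(length):
    """Encode one run length as (unary bit-count code, binary value).

    Lengths are decremented by one so that a single bit suffices to
    distinguish the two shortest runs (1 and 2), as noted in the text.
    """
    v = length - 1                      # run lengths are >= 1, so v >= 0
    n = max(1, v.bit_length())          # bits needed for the decremented value
    unary = "1" * (n - 1) + "0"         # unary code for the bit count
    binary = format(v, f"0{n}b")        # n-bit binary representation
    return unary, binary

# First five run lengths of the Mt. St. Helens example
lengths = [365698, 2, 356, 2, 5254]
print([len(length_code(L)[1]) for L in lengths])  # bits per length
```

Running this on the five example lengths reproduces the bit counts 19, 1, 9, 1, and 13 quoted in the text.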

Computation of the Fractional Part
To encode the fractional part, we utilize the same observation formulated in Equation (6). As presented in Figure 2, it is evident that the integer numbers s_i^2 ≤ x < s_{i+1}^2 share the same integer square root s_i. However, to differentiate between the fractions that correspond to each number in this range, we need ⌈log₂(2s_i)⌉ bits. For instance, there are eight non-square integers (2 × 4) between the square integers 16 and 25 (4² and 5²). These eight values share the same integer square root value of 4, and we can differentiate between them by using three bits to indicate eight distinct fractions. In other words, the fractional part of the square root of x can be encoded as the distance of x from the nearest square number s_i^2, where s_i^2 ≤ x. Calculating the distance costs one multiplication (s_i × s_i) and one subtraction (x − s_i^2). Note that the distance between every two consecutive square integers decreases as these integers become smaller; hence, more compression can be realized. This can be achieved by utilizing the correlation presented in the hyperspectral data. The number of bits required to represent the fractional part according to the value of the integer square root s_i is depicted in Table 2. There are cases where the number of non-square integers bound between two square integers s_i^2 and s_{i+1}^2 exactly fills the ⌈log₂(2s_i)⌉-bit code space, which leaves no room to include the zero fraction that corresponds to the perfect square root s_i. For instance, all the integers x within the set {4, 5, 6, 7, 8} map to an integer square root value of 2. To differentiate between the fractions that correspond to each integer, we need more than two bits to include the zero fraction that corresponds to the integer number 4. This occurs consistently when the integer square roots are powers of two, e.g., s_i = 1, 2, 4, 8, etc.
Here, we demonstrate another advantage of eliminating the zero bytes beforehand. To avoid adding an extra bit, perfect squares that are powers of two are encoded with four zeros in the integer part. This instructs the decoder to interpret the next fractional bits as a unary code. Therefore, the cost of the fractional part, represented by the number of bits, remains consistent within each range. That is, the number of bits allocated for the fraction is always ⌈log₂(2s_i)⌉ for all s_i values. Table 3 below presents the codewords of the proposed lossless square root-based encoder. For s_i in the form 2^m, m represents the number of ones in the unary code of the fractional part. For example, when s_i = 4 = 2², we have m = 2 and the corresponding unary code of the fractional part is 110.
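The integer/fraction split described above can be sketched as follows. This is a minimal Python illustration of the scheme, with `math.isqrt` standing in for the paper's seed-generation step; the power-of-two special case handled by the unary codes is omitted.

```python
import math

def encode_sqrt(x):
    """Split x into its integer square root and a 'fraction': the
    distance of x from the nearest square below or equal to it.
    The fraction is representable in ceil(log2(2s)) bits except
    for the power-of-two seeds treated specially in the text."""
    s = math.isqrt(x)                # integer square root
    frac = x - s * s                 # distance from the nearest square s^2
    nbits = math.ceil(math.log2(2 * s)) if s > 0 else 0
    return s, frac, nbits

print(encode_sqrt(20))  # x in [16, 25) shares the root 4
```

For x = 20 this yields the root 4, the fraction 4, and a 3-bit fraction field, matching the 16-to-25 example in the text.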

Preprocessing
A typical lossless compressor consists of a preprocessor followed by an entropy encoder [68]. The preprocessor's main function is to decorrelate the input data, which is then passed to the entropy encoder. Decorrelation can be achieved by means of prediction [25,62,69–72] or transform-based techniques [51,73–76]. According to [69], techniques that use lookup tables and vector quantization are also categorized as prediction-based since both types are used to generate a prediction of the data. The best predictor is the one that yields the lowest entropy and is easy to implement. Predictive techniques offer low computational complexity and moderate memory requirements [25]. In general, the transform-based techniques are more successful in lossy compression. This is due to the fact that these methods must be integer-based to achieve lossless compression, which in turn may compromise their ability to decorrelate [69].
As is well known, the use of prediction techniques may produce negative residuals. Thus, to avoid passing such residuals to the square root computation step, a simple mapping technique, reported in [70], is employed to maintain positive values. This technique maps a residual r into an unsigned integer M(r) as follows:

M(r) = 2r for r ≥ 0, and M(r) = −2r − 1 for r < 0.    (9)

We note here that the use of Equation (9) increases the range of the values by one bit. In particular, the maximum residual for a 16-bit hyperspectral image of (2^16 − 1) maps to (2^17 − 2), with the minimum being equal to zero. Since 17 is not a multiple of 8, partitioning it into 8-bit blocks leaves an extra bit that requires additional handling. A better solution is to employ a bitwise exclusive-or (XOR) operation to decorrelate the hyperspectral data.
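The overlap-free mapping can be sketched as follows; this is our Python illustration, consistent with the example in the text that the maximum 16-bit residual 2^16 − 1 maps to 2^17 − 2 and the minimum maps to zero.

```python
def map_residual(r):
    """Map a signed prediction residual to an unsigned integer:
    non-negative residuals go to even values, negative ones to odd."""
    return 2 * r if r >= 0 else -2 * r - 1

def unmap_residual(m):
    """Inverse mapping applied at the decoder."""
    return m // 2 if m % 2 == 0 else -(m + 1) // 2

print(map_residual(2**16 - 1))  # 131070 == 2**17 - 2
```

The mapping is a bijection, so the decoder recovers every residual exactly.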
The application of the XOR operation as a decorrelation technique for improved lossless compression is detailed by Cooper in [77]. Generally, correlated data tend to have similar values in their most significant bits. Therefore, the use of the XOR operation sets the most significant bits to zeros, resulting in a lower data entropy while maintaining positive values within 16 bits. The goal is to XOR the adjacent bands B_i of each line of the acquired scene, except for the first band B_0, as indicated by Equation (10) next. Then, the original data can be retrieved at the decoder by repeating the XOR operation starting from the first band. This is similar to the use of the XOR operation in data encryption [78], where the original message is XORed with the key to generate the ciphered message. Afterward, the original message is recovered by XORing the ciphered message with the same shared key. In our case, the first band B_0 corresponds to our key. Hence, we have:

B̃_i = B_i ⊕ B_{i−1}, for i ≥ 1,    (10)

where B̃_i and ⊕ represent the decorrelated band and the XOR operation, respectively. Consequently, this procedure offers a layer of security for critical data, which can be achieved by ciphering the first band of each acquired line of the scene. Hyperspectral data tend to have stronger spectral than spatial correlation [79]. Nonetheless, we will investigate the impact of both spectral and spatial decorrelation by employing the bitwise XOR operation in Section 5.
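The band-wise XOR decorrelation and its inverse can be sketched as follows. The function names are ours, and a single pixel traced across bands stands in for the 2-D bands of a real scene.

```python
def xor_decorrelate(bands):
    """Spectral decorrelation per Eq. (10): every band except the
    first is XORed with its predecessor; B_0 is kept as the 'key'."""
    out = [bands[0]]
    for prev, cur in zip(bands, bands[1:]):
        out.append(cur ^ prev)          # decorrelated band = B_i XOR B_(i-1)
    return out

def xor_recorrelate(dec):
    """Decoder: recover the original bands by cumulative XOR,
    starting from the first band."""
    out = [dec[0]]
    for d in dec[1:]:
        out.append(d ^ out[-1])
    return out

bands = [40123, 40120, 40100, 39990]    # one pixel across four bands
print(xor_decorrelate(bands))
```

Because adjacent bands share most of their significant bits, the XORed values are small, while the round trip through the decoder is exact.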

Postprocessing
As mentioned earlier, the entropy of the integer square roots √x is significantly smaller than the entropy of the x values. A simple entropy encoder can be utilized to further reduce the total number of bits. We suggest using the well-known Rice coding technique [80] to map the most frequent values into a smaller number of bits. Rice coding is preferable when the data to be compressed follow a geometric statistical distribution [68]. This distribution is usually determined after the data have been processed in an earlier stage. Typically, a Rice code of the binary representation of x is obtained by concatenating the unary representation of the quotient q = ⌊x/2^k⌋ and the least significant k bits of x [81].
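The Rice code just described can be sketched in a few lines; this is our illustration, using the common unary convention of q ones terminated by a zero.

```python
def rice_encode(x, k):
    """Rice code of x with parameter k: the unary code of the
    quotient q = floor(x / 2^k) followed by the k least-significant
    bits of x."""
    q = x >> k                                        # quotient
    lsb = format(x & ((1 << k) - 1), f"0{k}b") if k else ""
    return "1" * q + "0" + lsb

print(rice_encode(9, 2))  # quotient 2 -> '110', remainder 1 -> '01'
```

Small, frequent values thus receive short codes, which is why the scheme suits geometrically distributed data.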

Lossless Encoder/Decoder
The flowchart of the proposed lossless compressor is depicted in Figure 3a. The diagram highlights its three main stages: (1) preprocessing by employing a one-dimensional XOR operation; (2) square rooting by means of seed generation; and (3) postprocessing by utilizing Rice codes. The decoder, presented in Figure 3b, receives the encoded data in a structure of three vectors: (1) the indicator vector (to be decoded bit by bit); (2) the Rice codes vector (to be decoded code by code); and (3) the variable-length vector, where the number of bits to be processed is determined by the deciphered value of the Rice code, i.e., the value of s_0. Next, we provide in Algorithms 2 and 3 the pseudocodes for the lossless encoder and decoder, respectively.
The sequential time complexity of the lossless compression algorithm is primarily dictated by the corresponding time complexities of the preprocessing, seed generation, fraction calculation, and postprocessing stages. The total time complexity can be approximated as O(n^3), where n represents the number of values along each of the three dimensions of the input hyperspectral image. We assume, for the sake of simplicity, that each HSI is represented by a cube. The algorithm begins by XORing each value of the input image with its adjacent one, resulting in a complexity of O(n^3) as a result of traversing the entire input data. Subsequently, the XORed values undergo seed generation, where single shift and add operations are applied. This requires a constant complexity for each pixel value along all n bands. Thus, a total of O(n^3) operations are needed to complete this step. The calculation of the corresponding fractional part also entails a cubic complexity, involving squaring the generated seed and subtracting it from the XORed value. For the postprocessing stage, the worst-case time complexity of run-length encoding is O(n^3); in the worst case, every element in the input data would be unique, requiring each element to be encoded separately. Finally, the direct mapping of Rice codes using lookup tables exhibits a time complexity of O(n^3) since it operates on each generated seed value.
Algorithm 2. Pseudocode for the compressor part of the proposed lossless compression.
Input: x // hyperspectral data
Outputs:
  vecRice, // a vector that stores the calculated seed values
  vecFrac, // a vector that stores the calculated fractions
  vecRLE, // a vector that holds the counts of consecutive runs of zero and nonzero values
  vecUnary // a vector that holds the variable unary codes corresponding to the number of bits of each count

Initializations:
  x_0 ← 0 // the initial value to be XORed with the first element of x
  nZ_0 ← 0 // initialize the first nonzero value with 0
  cnt ← 1 // counts the number of consecutive runs
  PO2 ← 2 // to calculate the required number of bits for each run
  nVar ← 11 // the required number of bits for each run
  n ← the number of bits required to represent x
  done ← 0
For all x_i in x Do
  1. Preprocessing
     xored ← x_i ⊕ x_{i−1} // perform the exclusive-or operation
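One iteration of the encoder loop can be sketched for a single 16-bit sample as follows. This is an illustrative Python sketch, with `math.isqrt` standing in for the paper's seed-generation step; the RLE and Rice-coding stages of Algorithm 2 are omitted.

```python
import math

def encode_value(x, prev):
    """XOR preprocessing, byte partitioning, zero-byte flagging,
    and square-root/fraction splitting for one 16-bit sample.
    Each tuple is (indicator bit, integer root, fraction)."""
    xored = x ^ prev                          # 1. preprocessing
    out = []
    for byte in (xored >> 8, xored & 0xFF):   # high (odd) byte first
        if byte == 0:
            out.append((0, None, None))       # zero byte: 1-bit indicator only
        else:
            s = math.isqrt(byte)              # 2. integer square root
            frac = byte - s * s               # distance from the nearest square
            out.append((1, s, frac))
    return out

print(encode_value(0x0234, 0x0210))
```

After decorrelation the high byte is frequently zero, so it collapses to the 1-bit indicator while the nonzero low byte is split into a 4-bit root and its fraction.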

Near-Lossless Compression
We aim in this part of our research to utilize our method for computing the square root value in order to achieve near-lossless compression of remotely sensed hyperspectral images [39]. Square rooting depends on exploiting the concept of quadrature. In this regard, the quadrature problem of a plane figure involves geometrically constructing a square of the same area, hence the name "quadrature" [82].
Let x be the area of the rectangle ABCD (see Figure 4). The generated seed s_0 represents one side of the rectangle (segment BC). The other side is simply obtained by dividing x by s_0. The average of both sides of the rectangle produces the hypotenuse of the right triangle MCF. The goal is to find the length of the adjacent segment, which represents the square root value. The opposite segment is calculated as the difference between s_0 and the hypotenuse. Given both the hypotenuse and the opposite segments of this right triangle, we can calculate sin θ, from which we can obtain the angle θ. Ultimately, the length of the adjacent segment is obtained by multiplying the hypotenuse by the cosine value of the angle θ. The pseudocode of the quadrature-based square rooting method is described next in Algorithm 4 [39].

Algorithm 4. Quadrature-based square rooting.
Inputs: x, s_0
Output: s // square root of x
  M ← (s_0 + x/s_0)/2 // hypotenuse: the average of the two sides
  sin θ ← (s_0 − M)/M // opposite segment over hypotenuse
  θ ← arcsin(sin θ)
  s ← cos θ × M
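The geometric construction described above can be verified with a short sketch. This is our Python illustration of the steps given in the prose (the seed s_0 is passed in rather than generated); before any quantization of sin θ, the construction returns the square root exactly, since M·cos θ = √((2M − s_0)·s_0) = √x.

```python
import math

def quad_sqrt(x, s0):
    """Quadrature-based square root: rectangle of area x with side
    s0, averaged side M (the hypotenuse), angle theta, and the
    adjacent segment M*cos(theta) as the root."""
    other = x / s0                 # the other side of the rectangle
    M = (s0 + other) / 2           # hypotenuse: average of the two sides
    opp = s0 - M                   # opposite segment
    theta = math.asin(opp / M)     # sin(theta) = opposite / hypotenuse
    return M * math.cos(theta)     # adjacent segment = square root of x

print(quad_sqrt(100, 8))
```

The near-lossless error therefore stems from quantizing sin θ with the 0.01 step described next, not from the construction itself.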
The angle θ is equal to zero when the hypotenuse and adjacent segments are coincident; i.e., both segments are of equal lengths. This means that the shape with which we started is a perfect square and s_0 is an accurate estimate of the square root. Less accuracy of s_0 translates to a wider deviation of θ from 0°. The worst-case scenario is when the hypotenuse is almost perpendicular to the adjacent side with an angle of nearly ±90°. Simulation results, presented in Figure 5a, show that by employing the seed s_0, the worst-case scenario of θ is limited to −30° [39]. Since both θ and −θ map to the same cosine value, the polarity of the angle θ has no impact on the calculated length of the adjacent segment, which corresponds to the square root value.
The magnitude of sin θ ranges between 0 and 0.5 since θ ∈ [−30°, 4°). Using a step size of 0.01 quantizes this range into 51 values. Our objective is to use these values to directly address the corresponding cosine value that is required to calculate the square root. Based on our study in [39], we found that only 24 values are actually accessed via the lookup table, resulting in a corresponding utilization rate of 37.5%. These 24 variations of the index can be represented using 5 bits. However, better reduction can be achieved when using unary coding for the index part of the code. This is demonstrated by Figure 5b, since most of the x values revisit the narrow range of θ ∈ (−4°, 4°) and only a few values of x extend towards θ = −30°. An extra reduction could be obtained by utilizing the fact that some combinations of the indices and s_0 never occur. For instance, index 6 is only reached by seven variations of s_0 and index 10 is only reached by three. Table 4 shows the possible number of variations of s_0 that can access each of these 24 indices. As a result, instead of encoding s_0 with fixed-length 8-bit codes, variable-length codes are used that do not exceed eight bits in the worst case. For indices ≥ 14, the decoder directly infers the value of the integer square root; the length of the concatenated code herein is equal to zero. Mapping s_0 values to the reduced codes is achieved using lookup tables. Since the number of bits remains at eight for the index value of zero, mapping in this case would only translate to an extra lookup table. Accordingly, the mapping requires a structure of lookup tables limited to only the indices in the range between 0 and 14 exclusively (0 < index < 14). These lookup tables are distributed as follows: (1) five lookup tables of size 256 bytes each for the range 1 ≤ index ≤ 5; (2) another five lookup tables of 16 bytes each for 6 ≤ index ≤ 10; (3) and for the range 11 ≤ index ≤ 13, we need only three lookup tables of eight bytes each. This results in a total of 1.4 KB for the entire structure of lookup tables. Table A1, provided in Appendix A, shows the values of s_0 that map to each index. This is followed
by Table A2, which encloses the MATLAB code used to generate Table A1. Table 5 below gives an example that illustrates the encoding of a sequence of ten 16-bit unsigned integers x. Therefore, the corresponding codewords that employ the quadrature-based square rooting method are: 11111110000, 001001011, 100111110, 11110000110, 001000000, 11100010110, 001101101, 010010110, 001010110, 1100111000.
The unary code for the resulting indices is underlined. The binary representation that follows is the order of s_0 within a given index, as provided in Table A1. The number of bits required to represent the order is determined in advance by utilizing Table 4. For instance, the unary code of index 7, which is equal to 11111110, instructs the decoder to parse the next three bits to capture the value of s_0, which happens to be the first element for index 7, as depicted in Table A1.

Near-Lossless Encoder/Decoder
The flowchart of the quadrature-based encoder used to achieve near-lossless compression is exhibited in Figure 6. Starting with the hyperspectral image as an input, we apply seed generation to obtain the value of s_0. Next, the quadrature-based square rooting method is employed to compute the index value. Both the index and s_0 values are then used to obtain the encoded value of s_0 from the lookup table structure described earlier. Finally, the unary code of the index value and the code obtained from the corresponding lookup table are concatenated to generate the compressed stream. In our case, we note that the use of decorrelation by means of either prediction or XORing may hinder the compression performance. This is due to the fact that decorrelation would reduce the majority of s_0 values, which yields smaller values of θ and thus limits the indices to the first few. Consequently, the quadrature-based encoder would not be able to benefit from the distribution presented in Table A1.

As described in Algorithm 4, the square root value is obtained by multiplying the hypotenuse by the cosine value of θ. The hypotenuse length, denoted by M, is a floating-point number, yet s_0 is an integer. Therefore, by employing the concatenation of s_0 with its corresponding index value, we would obtain better compression performance and maintain integer-based computations for the encoder. This is instead of relying on the floating-point value provided by the hypotenuse. Consequently, the last two steps of Algorithm 4 shall be assigned to the decoder on the ground station. The value of M is generated at the decoder by dividing the value of s_0 by (1 + sin θ). This is derived as follows: since sin θ = (s_0 − M)/M, we have M sin θ = s_0 − M, so that M(1 + sin θ) = s_0 and hence M = s_0/(1 + sin θ). Then, the square root value of x is reconstructed at the decoder by multiplying cos θ with M. Squaring the resulting product gives the desired value of x. Next, we disclose the pseudocodes for the near-lossless encoder and decoder in Algorithms 5 and 6, respectively.
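The decoder-side reconstruction can be sketched as follows; this is our Python illustration of recovering M from s_0 and sin θ, with the exact (unquantized) sin θ used for demonstration.

```python
import math

def reconstruct(s0, sin_theta):
    """Ground-station decoder step: s0 - M = M*sin(theta) implies
    M = s0 / (1 + sin(theta)); the square root is M*cos(theta),
    and squaring it recovers x."""
    M = s0 / (1 + sin_theta)                   # hypotenuse
    root = M * math.cos(math.asin(sin_theta))  # adjacent segment
    return root * root                         # reconstructed x

# Exact sin(theta) for x = 100 and s0 = 8: M = 10.25, opposite = -2.25
print(reconstruct(8, -2.25 / 10.25))
```

With the exact sin θ the original value is recovered; quantizing sin θ with the 0.01 step is what introduces the near-lossless error.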
The sequential time complexity of the near-lossless algorithm is dominated by the seed generation and quadrature-based square rooting steps, as both involve basic arithmetic operations that contribute to a worst-case time complexity of O(n^3). Unary encoding, which typically involves appending a sequence of 1s followed by a single 0, also plays a role in the algorithm's time complexity. However, the use of a precomputed lookup table for the 24 indices and their corresponding unary codes requires only a constant time complexity, thus maintaining the overall complexity at O(n^3). Overall, the entire algorithm can be considered to have a worst-case sequential time complexity of O(n^3).
Algorithm 5. Pseudocode for the compressor part of the proposed near-lossless compression.
Input: x // hyperspectral data
Outputs: vecSeed, vecUnary
Initialization: n ← the number of bits required to represent x
For all x_i in x Do
  1. Seed generation
     MSH ← the leftmost ⌈n/2⌉ bits of x_i
  2. Preparing the compressed stream
     index ← 10² · sin θ
     order ← the corresponding order of the seed value within the lookup table of the index (Table A1)
     m ← the number of bits that correspond to the index (Table 4)
     varCode ← the least significant m bits of order
     vecSeed.add(varCode) // add the encoded seed to the vecSeed vector
     unary ← the corresponding unary code of the index value
     vecUnary.add(unary) // add the unary code to the vecUnary vector
End Do
Algorithm 6. Pseudocode for the decompressor part of the proposed near-lossless compression.

Inputs: vecSeed, vecUnary
Output: x // reconstructed hyperspectral data
Initialization: index ← the index value obtained by interpreting the next unary code in vecUnary
For all index Do
  m ← the number of bits to be read from vecSeed based on the index value (Table 4)
  order ← get the next m bits from vecSeed
  seed ← get the seed value given the index and order (Table A1)
  cos θ ← given the index value (which corresponds to the sin θ value), retrieve the cosine value from the lookup table
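The index computation at the heart of the encoder loop above can be sketched as follows. This is a minimal runnable Python illustration: the seed s_0 is assumed given (the paper's seed-generation details are not reproduced), and flooring |sin θ| × 100 is our assumption for the "index ← 10² · sin θ" step.

```python
def encode_sample(x, s0):
    """Compute the quantized-angle index and its unary code for
    one sample, per the quadrature construction."""
    M = (s0 + x / s0) / 2          # hypotenuse
    sin_t = (s0 - M) / M           # sin(theta), roughly in [-0.5, 0.07)
    index = int(abs(sin_t) * 100)  # 0.01-step quantization of |sin(theta)|
    unary = "1" * index + "0"      # unary code of the index
    return index, unary

print(encode_sample(100, 8))
```

For x = 100 with s_0 = 8, sin θ ≈ −0.2195 and the index is 21; together with the lookup-table code for s_0, this forms one codeword of the compressed stream.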

Experimental Results and Discussion
Our simulation results were obtained using the MATLAB computing environment R2020 installed on a MacBook Pro machine with a 2.4-GHz Apple M2 Max processor. This system has 32 GB of RAM and runs macOS Ventura 13.3.1.

Dataset Description
To ensure the reproducibility of the results and allow for comparison with other state-of-the-art works, the Corpus dataset is used for analysis [83]. Since this dataset represents a collection of both multispectral and hyperspectral images, only the latter type will be utilized in this work. Further, the Corpus dataset is heavily employed by the research community, including CCSDS, to evaluate the performance of compression algorithms [84]. The description of the 32 selected hyperspectral images is provided in Table 6. All files are stored in band sequential (BSQ) format. The data types of these images are either u16be (unsigned 16-bit big-endian) or s16be (signed 16-bit big-endian) integers. These integers are represented using the two's complement format [85]. In our study, the BSQ format is transformed into band interleaved by line (BIL) format to simulate the line-by-line acquisition of the underlying scene in real time. In Figure 8, we display three examples of hyperspectral scenes used in our experiments. The first two originate from the AVIRIS imager and the third belongs to CASI.
Without information from the dataset provider regarding the former factor, it is challenging to pinpoint the precise reasons involved. Moreover, the presence of noise or artifacts in the two images could also affect their sparsity levels, as they may introduce additional variations in pixel values. This behavior did not manifest itself again when the images are blocked into bytes, where the sparsity increased significantly. This suggests that while the values across bands may exhibit similarities in their most significant bits (MSBs), the differences are significant enough that, when considering the entire 16 bits, they influence the resulting sparsity values after decorrelation. The average bit rate used in finding the compression ratio is calculated using Equations (7) and (8). For instance, the percentage of zero elements of the Maine scene is 53.51% after decorrelation and blocking into eight bits. The calculated average bit rate, using Equation (8), is reduced from 8 to 4.7192, that is, 0.5351 × 1 + 0.4649 × 9. As a result, the obtained compression ratio has reached a promising value of 8/4.7192 = 1.6952 by using preprocessing techniques for this specific image. As revealed in Tables 7 and 8, the obtained compression ratio of the preprocessing step is higher with the adoption of byte-wise processing. The geometric mean of the compression ratio for all the selected hyperspectral images is 1.2 for the decorrelated 16-bit values and 1.5 for the decorrelated 8-bit values.
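The Maine-scene arithmetic above can be reproduced with a short sketch; the function name `avg_bits` is ours, and the formula is assumed to be the one quoted in the text for Equation (8).

```python
def avg_bits(p, word_bits=8):
    """Average bits per value after zero removal: a zero byte costs
    only the 1-bit indicator, while a nonzero byte costs
    word_bits + 1 (indicator plus payload)."""
    return p * 1 + (1 - p) * (word_bits + 1)

p_maine = 0.5351               # zero-byte fraction of the Maine scene
rate = avg_bits(p_maine)       # average bit rate after preprocessing
ratio = 8 / rate               # compression ratio of the preprocessing step
print(round(rate, 4), round(ratio, 4))
```

The same function also confirms the 12.50% break-even sparsity derived earlier: at p = 0.125 the average rate equals the original 8 bits exactly.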
The decorrelated 8-bit values are passed next to the seed generator to produce an accurate estimate of the integer square root. We note that by partitioning the data into bytes, the generated error is completely avoided, and thus, no error compensation is required. Since square rooting reduces the number of bits to four, the value of s_0 ∈ [1, 15]. This is because all zero elements are removed in the preprocessing stage. The zero value is used to encode the integer square roots of the perfect square integers. When employing our seed generation approach, we observe that the distribution of the obtained integer square roots follows a certain pattern, especially for the same imager. Although the observed pattern is not geometrically distributed per se, its consistent shape allows for offline mapping between the Rice codes and the integer square roots. In this regard, Figure 9 displays the distribution of the integer square roots for nine images of the AIRS and AVIRIS instruments, respectively.
To maximize the benefit of Rice coding, we suggest following the descending order of the most frequent values of s_0 given in the histogram data. This order is obtained by finding the integer square root value that appears most often, known as the mode, for the dataset within the same instrument. For analysis purposes, we consider both one-dimensional and two-dimensional XORing for the preprocessing stage. One-dimensional XORing involves computing the bitwise XOR operation between consecutive spectral values, that is, across bands only. On the other hand, two-dimensional XORing is applied by calculating the bitwise XOR operation along both the spectral and spatial dimensions, that is, across bands and along the line of the scene. Table 9 shows the mapping of Rice codes to the integer square roots according to the data distributions of the AIRS and AVIRIS imagers. The goal is to directly access Rice codes of
the obtained integer square root in a single clock cycle utilizing a small-sized lookup table.Direct addressing into lookup tables locates the desired entry in O(1) time [86].Nonetheless, smaller lookup tables are preferable for area and power considerations.The required size of such a lookup table is 16 bytes only, including the 16 variations from 0 to 15, where each code has a maximum length of 8 bits.The value obtained from the lookup table is then concatenated to the calculated distance between x and s 2 0 .To find the reduction percentage, the total number of bits after compression is divided by the total number of bits of the original hyperspectral image.Then, the obtained value is subtracted from one.These results are compared to those yielded by the state-of-theart k 2 -raster method, as reported in [40].This comparison is displayed in Table 10.We clearly observe that an improved reduction is realized when employing our proposed seed generation technique by the lossless compressor, especially with direct addressing of Rice codes.A compression ratio of up to 2.6 is achieved while maintaining a reasonable computational complexity.As observed from the table, the preprocessing of hyperspectral data using two-dimensional XORing is sometimes detrimental, although it shows the highest compression performance for a few images.This is confirmed by the geometric mean values of the results encoded using Rice codes.These values are 34.45% and 33.13% for the images decorrelated by employing 1D and 2D XORing, respectively.Similarly, the geometric mean values for the images encoded using mapped Rice are 36.89%and 35.72% for the images employing the same respective decorrelations.Furthermore, these values show that the best results are obtained when combining both 1D XORing and mapped Rice codes.Overall, the generated results by the four variations of our proposed method for lossless compression outperform those produced by the k 2 -raster method [40].Specifically, 
reduction enhancements ranging from 16.65% to 29.89% are achieved by all these variations when compared, in terms of their geometric mean values, with that obtained by the k 2 -raster technique. Our proposed algorithm of lossless compression can be applied to power-of-two precisions of the resolution values starting from 8 bits, then 16 bits, followed by 32 bits, and so on. All considered images in the dataset are stored using 16 bits, with a 4-bit padding for those produced by a 12-bit instrument such as the AIRS. Although the hyperspectral data are processed using a word length, we consider the actual bit rate of the imager when calculating the reduction percentage to maintain a fair comparison. For example, the total number of bits for the hyperspectral data produced by each AIRS scene is calculated as 1501 × 135 × 90 × 12 bits. The entropy results provided in the said table are calculated using the entropy model given by Equation (1). In Figure 10, we exhibit the original and the reconstructed images of the AVIRIS Yellowstone scene 10, calibrated (band 106), after applying our lossless compression and decompression algorithms. We note that the visual appearance of the two images is extremely similar.

Comparison with Other Lossless Methods
We provide in Table 11 a comparison of our HSI lossless compression algorithm with other lossless methods in terms of bit rate (bpp). In addition to the four variations of our algorithm, we include in this table results from gzip (GNU Zip), bzip2, xz [62], and k 2 -raster. The results show that our algorithm outperforms k 2 -raster and gzip, while its bit rate values are higher than those obtained by the two methods bzip2 and xz. The latter two techniques include transform-based algorithmic components, which may enable them to yield lower bit rate values. In general, transform-based compression methods can offer superior compression ratios but can be computationally expensive, making them less suitable for real-time applications, where minimizing computational complexity and execution time is essential [87]. We have only included in the mentioned table the common scenes from the Corpus dataset having available bit rate values among all disclosed methods [62].
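As an illustration of the Rice postprocessing and the direct-addressing idea discussed above, the following sketch encodes an integer square root with a plain Rice code and with a frequency-ranked mapping. The ordering shown is a made-up stand-in for the instrument-specific order of Table 9, and the helper names are ours:

```python
def rice_encode(value: int, k: int) -> str:
    """Rice code: unary-coded quotient terminated by 0, then k remainder bits."""
    q = value >> k
    code = '1' * q + '0'                      # unary part
    if k > 0:
        code += format(value & ((1 << k) - 1), '0' + str(k) + 'b')
    return code

# Hypothetical frequency-descending order of s0 values (stand-in for Table 9):
ORDER = [11, 10, 12, 9, 13, 8, 14, 7, 15, 6, 5, 4, 3, 2, 1, 0]
RANK = {v: i for i, v in enumerate(ORDER)}    # 16-entry lookup table (one byte/code)

def mapped_rice_encode(s0: int, k: int = 1) -> str:
    """Map s0 to its frequency rank before Rice coding, so the most
    frequent integer square roots receive the shortest codewords."""
    return rice_encode(RANK[s0], k)

print(rice_encode(5, 2))        # '1001'
print(mapped_rice_encode(11))   # rank 0 -> shortest code '00'
```

Because the table has only 16 entries and each codeword fits in a byte, the mapping can be realized as the O(1) direct-addressed lookup described in the text.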

Results of Near-Lossless Compression
For high-entropy images that are hard to compress, we suggest employing our proposed quadrature-based square rooting method to realize acceptable compression ratios while maintaining highly accurate data in the decompressed stream. The compression performance based on this method is stabilized at a reduction of nearly 40% for real-world hyperspectral data. The near-lossless feature discussed in this work has a small to negligible impact on the accuracy of the decompressed data, as the nature of the produced error values reflects those generated by the square rooting method. This has a direct impact on preserving the shape of the original spectral signature of the image, as clearly seen in Figure 11 for the Yellowstone (sc18, C) image. The near-lossless compressor eventually produces the concatenation of the index value with the generated seed s 0 . To further reduce the output stream, we replace the binary code of the index value with the corresponding unary code. As illustrated in Figure 12, the distributions of the indices are significantly skewed to the right for all AVIRIS test images. Statistically, the occurrence of the first few
indices out of a total of 24 is more frequent than that of the remaining indices. Therefore, instead of concatenating five bits to represent the index, we can use the unary code and improve the average bit rate of the compressed stream. Moreover, since the values of s 0 are unevenly distributed across the indices (see Table A1), we encode the 8-bit values of s 0 with their order within the index. This translates to a reduction in the number of bits required to represent s 0 , given by n, where 0 ≤ n ≤ 8. For instance, n is equal to zero for indices greater than or equal to 14, indicating a single occurrence of s 0 . By applying the previous procedure, we disclose in Table 12 the data reduction percentage as well as the Maximum Relative Error (MRE) and the Maximum Absolute Error (MAE) for the selected dataset. All hyperspectral images that show a reduction percentage of less than 40% when using lossless compression are selected for processing by the near-lossless compressor. We calculate the maximum absolute error for the reconstructed values x̂ as MAE = max_i |x_i − x̂_i|. Therefore, the maximum relative error is formulated as MRE = max_i (|x_i − x̂_i| / x_i). In Figures 13 and 14 below, we reveal the original and reconstructed images of the CASI uncalibrated image t0477f06 (band 70) and the AIRS uncalibrated granule 16 image (band 208), respectively, obtained after applying our near-lossless compressor. We can state that the visual assessment of each decompressed image with respect to its original is highly similar.

Comparison with Other Near-Lossless Methods
When comparing to other works, vector quantization techniques were used previously in [30] to achieve near-lossless compression. That work shows high compression ratios of 20:1. The details of the generated error for the full dataset used in the study are not reported. However, the maximum absolute error reaches 127 when evaluating selected images in the AVIRIS dataset. Compared to this, our quadrature-based method produces results that are roughly four times more accurate. Furthermore, a recent study by Zheng et al. shows a significant reduction of at least 53%, where the authors evaluated the compression performance using three hyperspectral images, of which only one is included in our dataset [25]. Nonetheless, the study relies on a recursive least-squares (RLS) filter with a loop quantizer, where the weight matrix is updated iteratively. RLS has a fast convergence rate, yet its computational complexity is on the order of O(N 2 ), which implies major limitations to its applications [88,89]. This is particularly true for power-critical systems, such as onboard computers required to process large volumes of hyperspectral data. Accordingly, our proposed quadrature-based method may be more feasible for onboard compression, as it is highly parallelizable and needs only a single pass for data processing while exhibiting minimum memory requirements.
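The MAE and MRE figures reported in Table 12 follow directly from their definitions; a minimal NumPy version, assuming integer samples and nonzero originals for the relative error (zeros are removed in the preprocessing stage):

```python
import numpy as np

def max_abs_error(x: np.ndarray, x_rec: np.ndarray) -> int:
    """MAE: largest |x - x_rec| over all samples (cast to avoid uint wraparound)."""
    diff = np.abs(x.astype(np.int64) - x_rec.astype(np.int64))
    return int(diff.max())

def max_rel_error(x: np.ndarray, x_rec: np.ndarray) -> float:
    """MRE: largest |x - x_rec| / x over all samples, x > 0 assumed."""
    xf = x.astype(np.float64)
    return float(np.max(np.abs(xf - x_rec.astype(np.float64)) / xf))

a = np.array([100, 50, 9], dtype=np.uint16)
b = np.array([98, 52, 12], dtype=np.uint16)
print(max_abs_error(a, b))   # 3
print(max_rel_error(a, b))   # ~0.333
```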

Conclusions
Two new methods for lossless and near-lossless compression of remotely sensed hyperspectral images are reported in this article. We achieved lossless compression by employing our recent method of seed generation based on bit-manipulation techniques. Due to its acceptable complexity, the method could prove to be very efficient for lossless compression of hyperspectral images, where power and computational resources could be confined onboard satellites. Four variations of this technique are considered, yielding a compression ratio of up to 2.6 while outperforming a state-of-the-art method, known as k 2 -raster, by as much as 29.89% in terms of data reduction obtained using HSIs from the Corpus dataset.
For those instances when hyperspectral images are hard to losslessly compress due to their reduced correlation, we suggest employing our near-lossless compression technique, which relies on our previously described quadrature-based square rooting method.Its obtained compression performance is stabilized at nearly 40% (from 38.9011% to 39.7314%) reduction of the original data within a maximum relative error of 0.33 and a maximum absolute error of only 30.
For hyperspectral data compression onboard satellites, we recommend the utilization of an adaptive lossless and near-lossless compression scheme, whereby a selection criterion is adopted based on a threshold value set by the user to indicate the acceptable level of compression performance.As part of our future work, we intend to implement both compression techniques on hardware accelerators, such as FPGA boards, commonly used for image processing.

Figure 1 .
Figure 1.(a) A scaled plot showing the deviation of the generated seed from the correct square root for square numbers up to 2 32 − 1; (b) a discrete plot showing the maximum deviation of s 0 from the integer square root for numbers up to 2 16 − 1.

Figure 2.
Figure 2. The distance between consecutive integer square numbers increases relative to their corresponding square roots. In order to find the desired square root s, we roll back from the deviated value s 0 while knowing the value of x. This is achieved by first computing the difference between s 0 2 and x. Then, from the squared value of s 0 , we roll back a distance of ∑(2s + 1) towards x one leap at a time until the difference is no longer positive. For instance, the calculated seed for the square number x = 8464 is 97. This value is deviated by +5 from the correct integer square root s = 92. Knowing the distance between x and 97 2 , we can compensate for this deviation by first rolling back a distance of 2 × 96 + 1. If the distance remains positive, we subtract one from the estimated square root and repeat the process. This is performed multiple times until the correct integer square root is reached. This example is detailed below by showing the roll-back of a distance of 945 (= 97 2 − 8464) one leap at a time:
Step 1: 945 − (2 × 96 + 1) = 752,
Step 2: 752 − (2 × 95 + 1) = 561,
Step 3: 561 − (2 × 94 + 1) = 372,
Step 4: 372 − (2 × 93 + 1) = 185, and
Step 5: 185 − (2 × 92 + 1) = 0.

Thus, the correct integer square root value of 92 is reached when the remainder is equal to zero. However, the remainder zero is reached here because x is a perfect square. For non-square integers, the correct integer square root is reached when the remainder is less than zero. The example below shows the same procedure employing the non-square integer x = 24576. The generated seed s 0 for x = 24576 is 160. As a result, the distance between s 0 2 and x is 160 2 − 24576 = 1024. That is, we can apply the rollback as follows:
Step 1: 1024 − (2 × 159 + 1) = 705,
Step 2: 705 − (2 × 158 + 1) = 388,
Step 3: 388 − (2 × 157 + 1) = 73, and
Step 4: 73 − (2 × 156 + 1) = −240.
The iterative subtraction is stopped when the remainder is less than zero, giving the correct integer square root value of 156. We can generalize the above by stating that the correct integer square root is reached when the remainder is less than or equal to zero. We formulate the procedure of calculating the integer square root in Algorithm 1, for n > 8 bits, as given below:


Algorithm 1.
Calculation of the integer square root of x by rolling back from s 0 .
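Algorithm 1's rollback can be sketched in a few lines of Python. This version takes the overestimated seed as an input rather than generating it, since seed generation is a separate step of the method:

```python
def isqrt_rollback(x: int, seed: int) -> int:
    """Roll back from an overestimated seed to the integer square root of x
    by subtracting successive odd gaps 2(s - 1) + 1 until the remainder
    (the distance seed^2 - x) is no longer positive."""
    s = seed
    r = seed * seed - x           # distance between seed^2 and x
    while r > 0:
        r -= 2 * (s - 1) + 1      # one leap back toward x
        s -= 1
    return s

# Worked examples from the text:
print(isqrt_rollback(8464, 97))    # 92 (perfect square: remainder reaches 0)
print(isqrt_rollback(24576, 160))  # 156 (non-square: remainder goes negative)
```

The loop mirrors the Step 1..Step 5 arithmetic shown above: only additions, subtractions, and a shift-free multiply by 2 are needed, consistent with the low-complexity goal.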

Figure 3 .
Figure 3. Proposed lossless compression system: (a) The lossless compressor contains three main stages: (1) preprocessing using one-dimensional XORing; (2) computing s 0 based on the seed generation method; and (3) postprocessing of the fixed length integral part using Rice codes.(b) Steps performed by the decoder of the proposed lossless compressor to reconstruct the original data stream.

Algorithm 4 .
The quadrature-based method to compute the square root value of x.

Algorithm 4.
Inputs: x, s 0
Output: s // square root of x
Initialization:
BC ← s 0
AB ← x ⁄ s 0
M ← 0.5 × (AB + BC)
d ← BC − M // segment BC is equal to s 0
sin θ ← d ⁄ M
cos θ ← retrieved from a lookup table utilizing sin θ
s ← cos θ × M
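The quadrature construction reduces to the identity √(ab) = M·cos θ, where M = (a + b)/2 and sin θ = (a − b)/(a + b). A minimal sketch follows, substituting math.sqrt for the hardware cosine lookup table, so it shows the exact geometry rather than the lookup-induced near-lossless error:

```python
import math

def quadrature_sqrt(x: float, s0: float) -> float:
    """Approximate sqrt(x) as the geometric mean of a rectangle with
    sides a = s0 and b = x / s0 (area x), recovered from the arithmetic
    mean M and the angle theta with sin(theta) = (a - M) / M."""
    a, b = s0, x / s0
    m = 0.5 * (a + b)                        # arithmetic mean, m >= sqrt(x)
    sin_t = (a - m) / m                      # equals (a - b) / (a + b)
    cos_t = math.sqrt(1.0 - sin_t * sin_t)   # stand-in for the lookup table
    return cos_t * m

print(quadrature_sqrt(24576, 160))  # matches math.sqrt(24576) to float precision
```

In the actual compressor, the finite-resolution lookup table quantizes cos θ, which is what bounds the error rather than the identity itself.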

Figure 4 .
Figure 4.The quadrature of a plane rectangle ABCD: (a) The case when s 0 is the longer side of the rectangle; (b) The case when s 0 is the shorter side of the rectangle.In both cases, segment BC is equal to s 0 .

Figure 5 .
Figure 5. (a) Zoomed plot of angle θ versus x showing that the worst-case scenario occurs within the first 95 values of x, with θ ∈ [−30 • , 4 • ); (b) The range of θ as a function of x for unsigned numbers up to 2 16 − 1.

Figure 6 .
Figure 6.Flowchart of the proposed near-lossless compression employing our method of quadrature-based square rooting.

Figure 7
displays the different steps involved in the reconstruction of the original stream by the decoder.


Figure 7 .
Figure 7. Steps representing the decoder side of the proposed near-lossless compression system.



Figure 9 .
Figure 9. Histograms exhibiting the similarity in the distributions of the integer square roots, s 0 , for multiple hyperspectral images generated by the two imagers: (a) AIRS and; (b) AVIRIS, respectively.

Figure 11 .
Figure 11.Original spectral data of the spatial location (100,100) of Yellowstone (sc18, C) image versus its reconstructed spectrum: (a) The spectral data showing selected bands from 100 to 224; (b) An enlarged plot showing the difference in magnitude between the original and the reconstructed value of band 147 at the same spatial location as indicated by the selection in (a) using a square shape.


Entropy 2024, 26, x FOR PEER REVIEW

Figure 12 .
Figure 12.Histograms showing the number of accesses for each of the 24 indices, ranging from 0 to 50, of all 9 images produced by the AVIRIS imager.


Table 1 .
Overview of recent studies related to HSI compression.

Table 2 .
Number of bits required to represent the fractional part relative to the integer square root value.

Table 3 .
Codewords of the lossless square root-based encoder.

Run length encoding of nZ:
If nZ i = nZ i−1 Then cnt ← cnt + 1
The fraction encoded as the distance between the XORed value and the squared value of the seed:
If Fr > 0 Then
Fr ← Fr − 1
Else
Fr ← Unary(seed)
seed ← 0
End If
Fr out ← the rightmost m bits of Fr
vecFr.add(Fr out)

Pseudocode for the decompressor part of the proposed lossless compression:
nVar ← the number of bits derived from the next unary code in vecUnary
cnt ← the run length obtained by interpreting the next nVar bits from vecRLE
Else
riceCode ← get the next Rice code from vecRice
seed ← decode(riceCode)
If seed = 0 Then
seed ← the value of the next unary code from vecFrac
Fr ← 0
Else
m ← ⌈log 2 (2 · seed)⌉
Fr ← get the next m bits from vecFrac
Fr ← Fr + 1
End If
x i ← seed 2 + Fr
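A hedged Python reconstruction of the per-sample codeword logic in Table 3 (the run-length and Rice stages are omitted, and the helper names are ours): a zero seed field flags a perfect square whose root is sent in unary, mirroring the Else branch of the pseudocode:

```python
import math

def encode_sample(x: int) -> tuple[int, str]:
    """Split x >= 1 into a seed field and fraction bits.
    Non-squares: store Fr - 1 in m = ceil(log2(2*seed)) bits.
    Perfect squares: seed field is 0 and the root is unary-coded."""
    seed = math.isqrt(x)
    fr = x - seed * seed                     # 0 <= fr <= 2*seed
    if fr > 0:
        m = math.ceil(math.log2(2 * seed))
        return seed, format(fr - 1, '0' + str(m) + 'b')
    return 0, '1' * seed + '0'               # unary code of the root

def decode_sample(seed: int, bits: str) -> int:
    """Invert encode_sample: x = seed^2 + Fr."""
    if seed == 0:
        return (len(bits) - 1) ** 2          # count of 1s before the terminator
    return seed * seed + int(bits, 2) + 1
```

For example, x = 24576 encodes as seed 156 with a 9-bit fraction (Fr = 240), while the perfect square x = 8464 encodes as a zero seed field followed by the unary code of 92.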


Table 4 .
Number of bits required to represent the variations of s 0 for each of the 24 index values.

Table 5 .
Sequence of ten 16-bit integers x with both their s 0 values and corresponding indices.

Table 6 .
Description of the hyperspectral images within the Corpus dataset, selected for the evaluation of compression performance.

Table 7 .
The impact of data decorrelation on the sparsity of the hyperspectral data, represented by the percentage of zero elements, when using 16 bits for the AVIRIS dataset.

Table 8 .
The impact of data decorrelation on the sparsity of the hyperspectral data, represented by the percentage of zero elements, when using 8 bits for the AVIRIS dataset.
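The 1D and 2D XOR decorrelation whose sparsity effect Tables 7 and 8 measure can be sketched with NumPy; the (bands, lines, samples) axis layout is our assumption:

```python
import numpy as np

def xor_1d(cube: np.ndarray) -> np.ndarray:
    """XOR each band with the previous band; band 0 is kept as the reference."""
    out = cube.copy()
    out[1:] ^= cube[:-1]
    return out

def undo_xor_1d(dec: np.ndarray) -> np.ndarray:
    """Invert the spectral XOR with a cumulative pass over bands."""
    out = dec.copy()
    for i in range(1, out.shape[0]):
        out[i] ^= out[i - 1]
    return out

def xor_2d(cube: np.ndarray) -> np.ndarray:
    """Spectral XOR followed by XOR along the line (second) axis."""
    out = xor_1d(cube)
    prev = out[:, :-1, :].copy()   # snapshot to avoid in-place aliasing
    out[:, 1:, :] ^= prev
    return out
```

Identical neighboring samples XOR to zero, which is what raises the percentage of zero elements reported in the tables.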

Table 9 .
Suggested order of mapping the integer square roots to Rice codes using 1D and 2D XORing in the preprocessing stage.

Table 10 .
Reduction percentage of four variations of our proposed method for lossless compression and their comparison with results obtained using k 2 -raster [40]. Higher reduction values are displayed in boldface. The geometric mean values and the achieved enhancements in data reduction of the four variations are displayed in the last two rows.

Table 11 .
Bit rate values of four variations of our proposed method for HSI lossless compression and their comparison with results obtained using k 2 -raster [40], gzip, bzip2, and xz [62] for a subset of the Corpus dataset.

Table 12 .
Compression performance of the quadrature-based near-lossless HSI compressor, characterized by the data reduction percentage, MRE, and MAE values.


Table A2 .
MATLAB code for generating the values displayed in Table A1.