On the Lossless Compression of HyperHeight LiDAR Forested Landscape Data
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This manuscript investigated lossless compression techniques for a novel representation of LiDAR data: the Hyper Height Data Cube. However, the conclusions do not summarize the results concisely. In other words, I am still confused after reading the paper. Results from the research need to be summarized at a higher level. The structure of the manuscript is a little confusing; sometimes I do not know whether I am reading the methods, introduction, or results section.
Line 102: "each voxel..." Can you explain the difference between the voxel-based representation of LiDAR points and the HHDC? They seem very similar except for the difference in resolution/voxel size.
Line 105-107 & Figure 2: please name the three panels in Figure 2 using A, B, C and update the relevant text by calling the panels Figure 2A, 2B, or 2C
Line 125: can you add an equation to show how compression ratio is calculated?
Could you define "lossless"? The definition of lossless seems to be unclear to me.
Line 313, "represent the same forest area": Where is this area? Can you provide some figures showing what the data look like or where the data are located?
Line 362: I think here in the discussion you should also discuss what a larger CR means.
Line 370: you mention the octree method a lot, and it would be helpful to add one or two sentences briefly summarizing what the method is and how it works.
Author Response
Thank you very much for taking the time to review this manuscript. Please find the detailed responses below; the corresponding revisions and corrections are highlighted in the re-submitted files.
General comment: This manuscript investigated lossless compression techniques for a novel representation of LiDAR data: the Hyper Height Data Cube. However, the conclusions do not summarize the results concisely. In other words, I am still confused after reading the paper. Results from the research need to be summarized at a higher level. The structure of the manuscript is a little confusing; sometimes I do not know whether I am reading the methods, introduction, or results section.
Response: Thank you for this comment. Our paper addresses a relatively new topic: satellite LiDAR remote sensing and data compression for such systems. Because of this, knowledge from different disciplines needs to be presented and combined. This might be slightly confusing, but, in our opinion, information about the considered LiDAR systems and their principles of operation, as well as about the data properties incorporated in the coders' design, has to be presented.
We have rewritten the Conclusions by adding several sentences. In particular, at the beginning of the Conclusions it is now stated: “Specific properties of HHDCs are shown, including the high sparsity of the data and a limited range of values. This leads to the necessity and possibility of exploiting these properties in the design of modified methods of lossless compression that allow obtaining compression ratios considerably larger than in other typical practical situations involving 3D data compression (such as hyperspectral images).”
Comments 1: Line 102: "each voxel..." Can you explain the difference between the voxel-based representation of LiDAR points and the HHDC? They seem very similar except for the difference in resolution/voxel size.
Response 1: Thank you for this remark. We have replaced “each voxel (i,j,k)” with “each element (i,j,k) of X”, since the latter term is more appropriate.
Comments 2: Line 105-107 & Figure 2: please name the three panels in Figure 2 using A, B, C and update the relevant text by calling the panels Figure 2A, 2B, or 2C.
Response 2: We have divided Figure 2 into three parts – (a), (b) and (c) – and accordingly changed the references to them in the text.
Comments 3: Line 125: can you add an equation to show how compression ratio is calculated?
Response 3: Thank you for this remark. In the revised paper, we have moved this equation from a later part of the text (formerly it was in Section 3.5) to the suggested place, where indeed it belongs.
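For reference, the compression ratio referred to here is conventionally defined as the ratio of the original data size to the compressed data size (the exact notation of the equation in the revised paper may differ):

```latex
\mathrm{CR} = \frac{S_{\text{original}}}{S_{\text{compressed}}}
```

so that CR > 1 indicates a size reduction, and a larger CR corresponds to a smaller compressed file for the same input.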
Comments 4: Could you define "lossless"? The definition of lossless seems to be unclear to me.
Response 4: We have added the following definition at the beginning of Section 3: “Classically, lossless methods are defined as data compression algorithms that reduce data size without loss of information; therefore, the decompressed file can be restored bit for bit, in a form identical to the original file.”
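As a minimal illustration of this bit-for-bit criterion (a sketch only: the toy tensor, the zlib/DEFLATE codec, and all names below are our assumptions, not code from the paper):

```python
import zlib

import numpy as np

# Toy HHDC-like tensor: small integer counts, mostly zeros (high sparsity).
rng = np.random.default_rng(0)
hhdc = (rng.random((32, 32, 16)) > 0.95).astype(np.uint8)

raw = hhdc.tobytes()
compressed = zlib.compress(raw, 9)  # DEFLATE (LZ77 + Huffman), a lossless codec
restored = np.frombuffer(zlib.decompress(compressed), dtype=np.uint8).reshape(hhdc.shape)

# "Lossless" means the round trip restores the data exactly, bit for bit.
assert np.array_equal(hhdc, restored)
print(f"CR = {len(raw) / len(compressed):.2f}")
```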
Comments 5: Line 313, "represent the same forest area": Where is this area? Can you provide some figures showing what the data look like or where the data are located?
Response 5: In our experiments, we have utilized emulated HHDCs (3D tensors) generated using data originating from a forested area at the Smithsonian Environmental Research Center (SERC) in the state of Maryland (latitude 38.88° N, longitude 76.56° W). The raw LiDAR data for the SERC area are available on the NEON (National Ecological Observatory Network) website: Discrete return LiDAR point cloud (DP1.30003.001), RELEASE-2025, DOI: 10.48443/jyb5-g533, listed in the bibliography. We have added more details regarding the location of the considered forest area to the text; however, the data are difficult to visualize in figures due to their sparsity (a high proportion of zeros).
Comments 6: Line 362: I think here in the discussion you should also discuss what a larger CR means.
Response 6: We have added a brief explanation to avoid potential misunderstandings: “The obtained results show that the provided CR values are significantly larger than for HSI compression; hence, for the same input files, the resulting compressed file size is smaller than that obtained with the HSI methods.”
Comments 7: Line 370: you mention the octree method a lot, and it would be helpful to add one or two sentences briefly summarizing what the method is and how it works.
Response 7: We have added the following explanation in the paper: “The octree method has been used primarily in computer graphics for the compression of 3D data such as voxels or point clouds. Its operating principle involves dividing a cube into eight equal parts (octants) and then selecting those with the most non-zero elements for further division according to the same principle. In the case of color reduction, these can be cubes representing the largest clusters of pixels with similar colors. Lossless data compression takes advantage of the fact that, for sparse data such as LiDAR data, many of the sub-octants obtained from successive divisions contain no data at all and can therefore be encoded very compactly. Only non-empty or non-uniform regions are further subdivided.”
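As a minimal sketch of this subdivision principle (illustration only: the function name, the leaf labels, and the power-of-two cube assumption are ours, and this is not the coder evaluated in the paper):

```python
import numpy as np

def octree_leaves(block):
    """Recursively split a cubic array (side a power of two) into eight octants.
    Empty octants terminate immediately as single leaves; only non-empty,
    non-uniform octants are subdivided further."""
    if not block.any():                                # entirely zeros: cheap leaf
        return [("empty", block.shape)]
    if block.size == 1 or np.all(block == block.flat[0]):
        return [("uniform", block.shape, int(block.flat[0]))]
    h = block.shape[0] // 2                            # half the cube side
    leaves = []
    for z in (0, h):
        for y in (0, h):
            for x in (0, h):
                leaves.extend(octree_leaves(block[z:z + h, y:y + h, x:x + h]))
    return leaves

# A sparse 16x16x16 toy volume: almost all octants are empty leaves.
vol = np.zeros((16, 16, 16), dtype=np.uint8)
vol[2, 3, 4] = 5
print(len(octree_leaves(vol)), "leaves")
```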
Response to Comments on the Quality of English Language
Point 1: The English is fine and does not require any improvement.
Response 1: Thank you. We have nevertheless carefully checked the revised paper to avoid any mistakes.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
See file attached.
Comments for author File:
Comments.pdf
The introduction is too wordy, pretentious and promotional.
Author Response
Thank you very much for taking the time to review this manuscript. Please find the detailed responses below; the corresponding revisions and corrections are highlighted in the re-submitted files.
Comment 1: Line 304 ‘we use compression ratio (CR) as the main performance indicator.’ Section 4 shows that computation time is also an important performance parameter.
Response 1: Thank you for this remark. In fact, our work was aimed at finding lossless compression methods able to reach as high a compression ratio as possible, as well as at modifying the known methods to reach this goal. At the same time, since the final goal is the practical application of these techniques, we also need to consider such practically important characteristics as computation time and memory requirements. In the Discussion section of the revised version, we state: “The complexity analysis of the proposed methods is presented in the form of asymptotic relations. From these results, it is not possible to evaluate performance, including processing time and system load, on a specific hardware platform. Such metrics strongly depend on the software implementation of the algorithms. This implementation should take into account the hardware capabilities. This topic will be addressed in a separate study.”
Comment 2: Line 306 ‘It is proposed to compute the minimum, maximum, mean, and median of’ CR. Provide an underpinning of the choice of these statistical measures.
Response 2: In our opinion, these four parameters adequately describe the distributions of CR values, which are all non-Gaussian and asymmetric. Depending on the particular conditions and restrictions of the compression, each of these four parameters might be the most important one.
Comment 3: Lines 331-332 ‘the mean value of CR is relatively high and cannot be considered a reliable indicator. For this reason, in the following analysis, we rely on the median value of CR.’ This seems to be an occasional argument, without a thorough statistical underpinning.
Response 3: Thank you for pointing this out. We have changed this sentence to “The mean CR can be relatively high due to compressing several very sparse arrays. Therefore, we also rely on the median value of CR in our analysis”.
Comment 4: An important parameter in the compression of LiDAR data is the sparsity of the data. This parameter can be quantified by computing the area in the data set with LiDAR measurements (let us call this area AM) and the area in the data set without LiDAR measurements (let us call this area A0). To express sparsity by one metric, the ratio between A0 and AM, A0/AM, can be computed. The larger this value, the higher the sparsity of the data in the data set. Instead of AM, the total area covered by the data set (AT) can be used, which provides a normalized measure lying between 0 and 1. Of course, in practical situations A0 will be unknown, but in this research the above sparsity measure provides insight into the performance properties of the diverse compression techniques investigated in your research. The sparsity measure will depend on the type of landscape and can thus be estimated a priori in practical circumstances. Sparsity should be an important parameter in the evaluation of the performance of the diverse compression techniques. Furthermore, you should verify whether the compression techniques investigated in your research are really lossless.
Response 4: Thank you for these comments. We have introduced a measure of sparsity that can be applied to the considered 3D tensors (see formula (1)). Also, we have added the evaluation of this measure (see Table 1 and Figures 6 and 7). In addition, we have added the following phrases concerning lossless data reconstruction:
“To make the proposed approach lossless, i.e., to allow exact reconstruction of the original data, two components are required:
a specification of the block order;
a specification of which blocks consist entirely of zeros.
To achieve this, we additionally propose using row-wise numbering of the blocks. Next, an array with block indices is added to the header of each compressed data file. Furthermore, an array of bits indicating whether a block consists entirely of zeros is also added to the header. This information ensures an exact reconstruction of the original data; however, it leads to an increased size of the compressed file: the more blocks, the larger this additional information. Therefore, in practice, dividing the HHDCs into very small blocks may be inefficient.” (see the Block-Splitting subsection)
and
“We note that, when evaluating the compression ratio, the information required for lossless reconstruction of the original tensor is also taken into account.” (see the Test Data Compression subsection).
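For illustration, such a sparsity measure can be computed directly from an HHDC tensor. The snippet below assumes the simple fraction-of-zeros definition; formula (1) in the revised paper may be normalized differently (e.g., via the A0/AM ratio suggested by the reviewer):

```python
import numpy as np

def sparsity(hhdc):
    """Fraction of zero-valued elements: 0 means fully dense, 1 means all zeros."""
    return float(np.count_nonzero(hhdc == 0)) / hhdc.size

# Toy example: a 64x64x32 tensor with only five non-zero entries.
x = np.zeros((64, 64, 32), dtype=np.uint16)
x[10, 10, :5] = 1
print(f"sparsity = {sparsity(x):.4f}")
```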
Comment 5: Figure 6 is a graphical presentation of the median column in Table 2. The same is true for Figure 7 and Table 3, up to Figure 10 and Table 6, and for Figure 12 and Table 7, up to Figure 16 and Table 11. The figures are superfluous and should be removed.
Response 5: Thanks for this comment. We have removed superfluous figures.
Response to Comments on the Quality of English Language
Point 1: The English could be improved to more clearly express the research. The introduction is too wordy, pretentious and promotional.
Response 1: Thank you. We have carefully checked the revised paper to avoid any mistakes and have improved the readability of the introduction as well as of the whole paper.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
- The paper mentions that “RC & RLE (skip zeros)” performs poorly in certain scenarios. It is recommended to analyze the causes of these failure cases and investigate whether they are related to the distribution pattern of zero values. Additionally, a discussion on the basis for selecting different block sizes should be added, exploring whether it is related to data dimension or sparsity.
- While time and space complexities are mentioned, empirical data on actual runtime or processor load is lacking. We recommend adding runtime comparisons on typical hardware platforms, particularly highlighting performance under constrained satellite computing resources.
- The paper primarily compares with NPZ (LZ77 + Huffman) and bit packing. We suggest adding comparisons with satellite remote sensing image compression standards like CCSDS-123.0 to enhance persuasiveness.
- Some figures (e.g., Figures 6–17) omit the horizontal axis labels in the text. We recommend uniformly correcting this in the final version.
- The fundamental differences between this work and HSI compression (e.g., zero-value ratio, data range) could be further emphasized in the introduction or conclusion to highlight the innovation.
- The discussion of limitations is rather brief. It is recommended to supplement that the current method primarily targets forest scenes (highly sparse) and whether its performance may degrade in urban or complex terrain.
- Some sentences are verbose or complex. It is recommended to simplify them to improve readability.
Author Response
Thank you very much for taking the time to review this manuscript. Please find the detailed responses below; the corresponding revisions and corrections are highlighted in the re-submitted files.
Point-by-point response to Comments and Suggestions for Authors
Comments 1: The paper mentions that “RC & RLE (skip zeros)” performs poorly in certain scenarios. It is recommended to analyze the causes of these failure cases and investigate whether they are related to the distribution pattern of zero values. Additionally, a discussion on the basis for selecting different block sizes should be added, exploring whether it is related to data dimension or sparsity. 
Response 1: Thank you for this comment. We have added the following to the discussion: “When comparing two similar compression methods, RC & RLE (skip zeros) and RC & RLE (repeat zeros), we observe that the performance of RC & RLE (skip zeros) is lower than that of RC & RLE (repeat zeros). This difference is mainly due to the design of the algorithms. Both methods use zero-sequence packing. However, RC & RLE (repeat zeros) allows splitting zero chains into segments and compressing them separately, while RC & RLE (skip zeros) does not provide such flexibility. In RC & RLE (skip zeros), each nonzero value is represented with fewer bits than in RC & RLE (repeat zeros), but additional k bits are required to encode subsequent zeros. These k bits must be sufficient to encode the longest possible sequence. In other words, RC & RLE (skip zeros) does not allow adjusting k, unlike RC & RLE (repeat zeros). This lack of flexibility is reflected in the compression results on the test data.”
and
“Next, the optimal value of the block size s depends on the sparsity of the compressed HHDC and the applied method. To achieve maximum memory efficiency, this parameter should be determined individually for each object. However, this requires additional computational resources, especially when the size of the compressed data cubes is very large. In cases where processing is performed on a standalone device with limited computational power or when high speed is required, it is reasonable to use a predefined s. The recommended values of this parameter for the considered data were obtained in the previous section. In a more general case, however, it would be more practical to use a trained model that suggests the optimal s for a given sparsity level. The study of such an approach will be the subject of future research.”
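To illustrate the role of the k-bit run counter discussed above, the sketch below encodes zero runs with a counter that saturates and splits long runs into segments. It is a simplified stand-in for the idea only; the names, the pair-based format, and the saturation convention are our assumptions, not the actual RC & RLE coders:

```python
def rle_encode(values, k=4):
    """Encode a 1-D sequence as (zero_run, value) pairs with a k-bit run field.
    A run equal to 2**k - 1 means "the zero run continues"; shorter runs are
    followed by the next non-zero value."""
    max_run = (1 << k) - 1
    out, run = [], 0
    for v in values:
        if v == 0:
            run += 1
            if run == max_run:          # counter saturated: flush a segment
                out.append((max_run, 0))
                run = 0
        else:
            out.append((run, v))
            run = 0
    if run:
        out.append((run, 0))            # trailing zeros (decoder knows the length)
    return out

def rle_decode(pairs, length, k=4):
    max_run = (1 << k) - 1
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        if run < max_run and len(out) < length:
            out.append(v)
    return (out + [0] * length)[:length]

data = [0] * 40 + [3] + [0] * 5 + [9] + [0] * 7
encoded = rle_encode(data, k=4)         # the 40-zero run is split into segments
assert rle_decode(encoded, len(data), k=4) == data
```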
Comments 2: While time and space complexities are mentioned, empirical data on actual runtime or processor load is lacking. We recommend adding runtime comparisons on typical hardware platforms, particularly highlighting performance under constrained satellite computing resources.
Response 2: Thank you for pointing this out. In our opinion, this requires additional exploration. We have noted this limitation in the discussion: “The complexity analysis of the proposed methods is presented in the form of asymptotic relations. From these results, it is not possible to evaluate performance, including processing time and system load, on a specific hardware platform. Such metrics strongly depend on the software implementation of the algorithms. This implementation should take into account the hardware capabilities. This topic will be addressed in a separate study.”
Comments 3: The paper primarily compares with NPZ (LZ77 + Huffman) and bit packing. We suggest adding comparisons with satellite remote sensing image compression standards like CCSDS-123.0 to enhance persuasiveness.
Response 3: Thank you for pointing this out. Such an analysis requires a separate study. We noted its necessity in our conclusions: “In addition, the suggested methods have been explored in the context of comparison with a limited set of techniques. The effectiveness of other approaches, especially those based on constructive tools such as trigonometric polynomials, wavelets, and atomic functions, remains unstudied. In our next study, we will focus on these methods and perform a comparative analysis with existing industry standards, such as CCSDS-123.0.”
Comments 4: Some figures (e.g., Figures 6–17) omit the horizontal axis labels in the text. We recommend uniformly correcting this in the final version.
Response 4: Thanks for this comment. We have added the horizontal axis labels. Also, we have removed superfluous figures that repeat the contents of the tables.
Comments 5: The fundamental differences between this work and HSI compression (e.g., zero-value ratio, data range) could be further emphasized in the introduction or conclusion to highlight the innovation.
Response 5: Thanks for this remark. We have supplemented the conclusions with the statement: “In general, these methods provide a high compression ratio. Moreover, this value exceeds that achieved by lossless HSI compression methods, which rarely exceeds 6.”
Comments 6: The discussion of limitations is rather brief. It is recommended to supplement that the current method primarily targets forest scenes (highly sparse) and whether its performance may degrade in urban or complex terrain.
Response 6: Thanks for this comment. We have supplemented the discussion with the statements: “The suggested methods are targeted mainly for forest environments characterized by high sparsity. This design choice may limit their generalizability to other landscape types. The algorithm's performance is anticipated to deteriorate in densely structured urban areas. Similar degradation may occur in topographically complex terrain.”
Comments 7: Some sentences are verbose or complex. It is recommended to simplify them to improve readability.
Response 7: We revised the paper and checked the language of the whole manuscript once again. Thank you for pointing this out.
Response to Comments on the Quality of English Language
Point 1: The English could be improved to more clearly express the research.
Response 1: Thank you. We have carefully checked the revised paper to avoid any mistakes and have improved its readability throughout.
Author Response File:
Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The article has been sufficiently improved.

