A Tile-Based Multi-Core Hardware Architecture for Lossless Image Compression and Decompression
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The title and objectives stated by the authors in the abstract of the paper correspond to its content.
The objectives proposed by the authors were covered, the work being well structured.
In the paper, the authors present the implementation of a lossless compression and decompression system for images on the Xilinx ZC706 FPGA platform. The hardware architecture proposed by the authors is a high-performance one, oriented towards modern applications that require fast and efficient data processing.
The proposed system delivers high-performance lossless image compression and is suitable for real-time applications. According to the authors' results, the proposed solution outperforms other existing methods in compression ratio, speed and efficient use of hardware resources, provided that an optimal tile size (≥8 x 8) is used.
Some questions for the authors:
- You used 4 x 4, 8 x 8, 16 x 16 and full-frame tile sizes. Have you also tested intermediate sizes (e.g. 32 x 32, 64 x 64)?
- Have you also evaluated other characteristics, such as execution time, energy consumption, latency, or the use of other resources?
- Was the proposed system tested only on the ZC706? Do you think it could be ported to other FPGA platforms (e.g. ZCU104, UltraScale+) or to an ASIC?
The scientific content is good, the conclusions are presented correctly, and the results obtained are relevant.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Since the main goal of the paper is to compare embedded hardware image processing techniques using FPGAs, the results section would be expected to present real and synthetic images before and after processing, so that readers can visually inspect the quality of the results. Please add such a subsection and resubmit the paper for further analysis.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
This study targets the problem of lossless image compression with high speed and good quality, especially for real-time tasks. The solution involves a tile-based multi-core hardware architecture on an FPGA, using parallel processing and hybrid compression methods like run-length coding, predictive coding, and non-coding mechanisms. The claimed contribution is the use of a four-stage pipeline for compression and dynamic state machines for decompression. Tests on the Xilinx Zynq-706 evaluation board show a compression speed of 480 Msubpixels/s and a decompression speed of 372 Msubpixels/s, surpassing existing methods.
The following issues must be addressed:
- To improve the abstract, clearly state the problem, solution, and results in a simple way. Highlight the novel aspects, especially the improvements in speed and efficiency. Make sure the abstract flows smoothly, connecting the research problem, method, results, and real-world benefits.
- The introduction should clearly explain why current compression methods are inefficient, making the need for a better solution obvious. It should briefly mention the unique aspects of the proposed approach, like the tile-based multi-core architecture and hybrid strategy, describe how the system was tested, and summarize the results. Lastly, it should mention real-world applications of the method, such as medical imaging, multimedia, and surveillance, to highlight its importance.
- Section 1 reviews compression and decompression using a mix of strategies. It explains how run-length coding, predictive coding, and non-coding work together to improve image compression. Since this part builds the foundation for the multi-core FPGA-based system, the authors should improve content structure with clear subheadings and concise explanations for better readability.
- The paper lacks a Related Work section, which is important to compare existing lossless compression techniques with the proposed approach. A dedicated section should summarize prior research, including FPGA-based methods, hybrid compression, and multi-core systems, discussing their weaknesses and showing how this study provides a better solution.
- The performance comparison section can be improved by including more benchmark data, such as compression ratios, processing speeds, and memory usage for different datasets. Also, comparing against modern AI-based compression techniques would strengthen validation. Adding tables, graphs, or charts can help readers understand the results more easily.
- How well does the system handle larger images and datasets?
- How does the FPGA-based approach compare to GPU or ASIC implementations in terms of power use and cost?
- Can tile sizes be adjusted dynamically for better compression results?
- Does the dynamic state machine increase latency, and can it be made even faster?
- Are there cases where compression or decompression might harm image quality?
- Could this system work for satellite imaging or medical diagnostics?
- Does the pipeline structure create problems for FPGA development, and what are the trade-offs?
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
Pros
- The overall presentation is well-structured. The introduction effectively outlines the necessary background and clearly states the paper’s contributions.
- The paper revisits a previously proposed lossless image compression and decompression algorithm that uses hybrid strategies. It aims to improve processing efficiency by implementing the algorithm on an FPGA platform.
- A notable contribution is the introduction of a multi-core system architecture to accelerate both compression and decompression.
- The design leverages a division of tasks between the Processing System (PS) and Programmable Logic (PL), where the PS supports the PL in executing the core algorithms. The PL’s multi-core modules can process up to eight image tiles simultaneously, outperforming existing architectures.
- The compression pipeline consists of four stages, while the decompression is managed by a dynamic state machine, leading to enhanced performance and operation efficiency.
Cons
- The proposed work primarily integrates existing techniques, such as hybrid compression, to improve throughput and compression ratios. It lacks a novel research contribution.
- While the use of FPGA for parallel processing and efficient hardware utilization is practical, it is a well-established approach and does not introduce any unique innovation.
- The paper claims the use of a dynamic state machine for decompression to achieve adaptive compression. However, the rationale for calling it "dynamic" is unclear, as there appears to be no observable dynamic behavior.
- The use of a four-stage pipeline for compression is a standard technique aimed at increasing clock frequency and throughput. It doesn't add new research value.
- The proposed parallel processing-based architecture is primarily a combination of existing methods aimed at improving performance, rather than introducing a fundamentally new idea.
- Although the paper presents performance improvements in Tables 1 and 2 and Section 3.2, these enhancements are relatively marginal and do not strongly support the claim of a novel research contribution.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
All requirements were fulfilled.
Reviewer 4 Report
Comments and Suggestions for Authors
- The author has revised the abstract, introduction, and literature review sections as requested.
- The write-up for the compression and decompression algorithms has also been improved for better readability.
- The author has thoroughly addressed all of my comments, including the use of the term "dynamic," the inclusion of additional results, and the explanation of parallel processing.