Flight-Safe Inference: SVD-Compressed LSTM Acceleration for Real-Time UAV Engine Monitoring Using Custom FPGA Hardware Architecture
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper addresses a critical and relevant problem in UAV safety and maintenance. The proposed approach of using SVD-compressed LSTMs on a custom FPGA is a valid and promising direction for achieving real-time onboard prognostics. The experimental results regarding latency and power reduction on the FPGA are noteworthy. However, several major concerns need to be addressed.
- Novelty Articulation and Positioning:
- While the application of SVD to LSTMs and implementing LSTMs on FPGAs are established research areas, the paper needs to more explicitly highlight its unique contributions. Is the novelty in a specific SVD compression strategy tailored for this problem, unique aspects of the FPGA dataflow/architecture that offer advantages over prior LSTM-on-FPGA works, the specific co-design synergy for UAV engine RUL prediction, or the extent of performance gains achieved?
- The related work section should be leveraged more effectively to position the proposed work against existing SVD-LSTM compression techniques and other FPGA-based LSTM accelerators, clearly delineating the advancements made by this paper.
- Justification and Substantiation of "Flight-Safe" Claim:
- The term "Flight-Safe" in the title and throughout the manuscript implies a high degree of reliability, fault tolerance, and potentially adherence to safety standards, which are critical for aerospace applications. Currently, the justification for "flight-safe" primarily seems to stem from achieving real-time performance and efficiency.
- This claim needs to be either substantially supported by discussing specific design aspects related to safety, reliability, fault detection/tolerance in the FPGA architecture, or adherence to any relevant aviation guidelines for software/hardware. Alternatively, the claim should be toned down to more accurately reflect "suitability for real-time in-flight processing" or "enhanced operational safety through timely prognostics."
- Fundamental Inconsistency in Problem Formulation:
- The Abstract and Introduction suggest the system "effectively predicts aircraft engine’s normalized Remaining Useful Life (RUL)" and aims for "accurate normalized Remaining Useful Life (RUL) estimation."
- However, the Methodology (Section 4.1.2, "Sliding Window," detailing the failure_within_w1 binary label) and the Evaluation Metrics (Section 4.3, listing Accuracy, Precision, Recall, F1-score, and Binary Cross-Entropy Loss) clearly frame the actual implemented task as binary classification (predicting failure within a window). In addition, how long is a time step in the demonstrated case? Is it a millisecond, a second, a minute, a day, or a week?
- This is a critical contradiction. The paper must consistently define and address one problem (either RUL regression or failure classification) and ensure the methodology, results, and discussion align. The abstract's "98% prediction accuracy" also points to a classification task. (See the sketch below for the distinction we have in mind.)
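To make the distinction concrete, here is a toy sketch of the two possible targets; only the failure_within_w1 column name is taken from the manuscript, while the RUL column, the window value, and the data are our own illustration:

```python
import pandas as pd

# Toy run-to-failure data: two engine units observed for a few cycles each.
df = pd.DataFrame({"unit":  [1, 1, 1, 1, 1, 2, 2, 2, 2],
                   "cycle": [1, 2, 3, 4, 5, 1, 2, 3, 4]})

w1 = 2  # hypothetical alert window, in cycles

# Regression target: cycles remaining until the unit's last recorded cycle.
df["RUL"] = df.groupby("unit")["cycle"].transform("max") - df["cycle"]

# Classification target: does failure occur within the next w1 cycles?
df["failure_within_w1"] = (df["RUL"] <= w1).astype(int)
print(df)
```

A model trained on RUL solves a regression problem; a model trained on failure_within_w1 solves a classification problem, and accuracy/precision/recall are only meaningful for the latter.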
- Details of FPGA Architecture and Implementation:
- While the figures provide block diagrams, more detailed information on the custom FPGA architecture is needed. This includes:
- Data Quantization: The bit-widths used for weights, activations, and intermediate computations at various stages of the SVD-LSTM on the FPGA should be explicitly stated. The impact of this quantization on RUL prediction accuracy (distinct from SVD compression effects) needs to be analyzed and presented.
- Memory Hierarchy and Bandwidth: Details on how weights and activations are stored and accessed (e.g., BRAM usage, off-chip memory interface if any, data movement strategies) and whether memory bandwidth is a bottleneck would be valuable.
- Control Logic: More information on the controller's design and how it orchestrates the pipelined and parallel operations within the SVD-LSTM core.
- SVD-Specific Mapping: How the decomposed SVD matrices (U, S, V) are specifically handled and multiplied within the FPGA core (a sketch of the mapping we have in mind is given below).
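As an illustration of this question, a minimal NumPy sketch of how a truncated decomposition replaces one large matrix-vector product with two thin ones; the matrix sizes and rank here are placeholders, not values from the paper:

```python
import numpy as np

# Hypothetical stacked LSTM gate weight matrix: 4 gates x 128 hidden units, 24 input features.
W = np.random.randn(512, 24).astype(np.float32)
r = 8  # assumed truncation rank

U, S, Vt = np.linalg.svd(W, full_matrices=False)
U_r, S_r, Vt_r = U[:, :r], S[:r], Vt[:r, :]

# On the accelerator, W @ x would be replaced by U_r @ (diag(S_r) @ (Vt_r @ x)),
# reducing multiply-accumulates from 512*24 to roughly r*(512 + 24) plus r scalings.
x = np.random.randn(24).astype(np.float32)
y_full = W @ x
y_svd = U_r @ (S_r * (Vt_r @ x))
print("relative error:", np.linalg.norm(y_full - y_svd) / np.linalg.norm(y_full))
```

The paper should state whether the three factors are stored and multiplied in this order, whether S is folded into U or V offline, and how the intermediate r-dimensional vector is buffered on chip.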
- Lack of Clarity and Detail on Performance Baselines and Results:
- The abstract claims a "24% drop in latency" and resource savings "relative to floating-point designs." It is unclear what this "floating-point design" baseline refers to: Is it a software implementation on a CPU? A floating-point version implemented on the same FPGA? An uncompressed model?
- Specific details of the baseline(s) used for comparison (hardware platform, model version, precision) are necessary to interpret the claimed improvements. The text lacks the detailed comparative tables (e.g., for latency and power against other platforms) that would be expected.
- As a matter of fact, the comparison to a CPU device is not that impressive when dealing with machine learning models. How does the FPGA compare against a commonly available GPU device, such as an NVIDIA Jetson, for example?
- Grammar and editing problems:
- The following are some examples of the grammar and editing problems. Please perform a thorough proofreading.
- Abstract: “Filed Programmable Gate Array(FPGA)” -> “Field Programmable Gate Array (FPGA)”
- Abstract: “enhances safety minimizes unplanned downtime” -> “enhances safety, minimizes unplanned downtime”
- Section 4.1: "This is illustrated in the Table 2” -> “This is illustrated in Table 2”
- Section 4.2: “evaluate the training model” -> “evaluate the trained model”
- Section 5: “… throughput deep stacked LSTM networks like an FPGA.” An FPGA is not a network.
Please see the main comments.
Author Response
Please refer to the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This paper, "Flight-Safe Inference: SVD-Compressed LSTM Acceleration for Real-Time UAV Engine Monitoring Using Custom FPGA Hardware Architecture," introduces a hardware-accelerated predictive maintenance framework for UAV engines using an SVD-compressed LSTM model deployed on an FPGA platform. This approach enables accurate real-time prediction of engine remaining useful life with 98% accuracy while significantly reducing computational latency and resource usage. There are some major issues related to the paper that need to be addressed before it can be considered for publication.
The authors use SVD for model compression, but they do not provide empirical comparisons with other modern compression methods such as pruning, quantization-aware training, or knowledge distillation. Would it be possible to add such a comparison?
The study is based on the NASA Turbofan Engine Degradation dataset, which is a simulation rather than real UAV data. It would be good if the framework were also tested on real data.
The reported 98% prediction accuracy is presented as a strong result, but there is limited discussion of robustness, calibration, or error tolerance. For a critical application like UAV engine monitoring, metrics such as false negatives, time-to-failure predictions, or confidence intervals should be addressed more comprehensively. Please add a more detailed explanation of these points.
Although the FPGA implementation is detailed, the article lacks real-world deployment scenarios, such as temperature tolerance, power fluctuation, airborne interference, or UAV-specific integration issues. These are critical in aviation environments, so a detailed discussion is expected.
There is no mention of k-fold cross-validation or similar techniques, which would provide better confidence in the reported performance and reduce the risk of overfitting to a specific train/test split. Please add such an analysis or explain your validation strategy.
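For example, a grouped k-fold split by engine unit would avoid leakage between windows from the same engine; this is only a sketch under assumed array shapes and names, not code from the paper:

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Hypothetical arrays: X holds sliding-window features, y the binary labels,
# and units records which engine each window came from.
X = np.random.rand(1000, 30, 24).astype(np.float32)
y = np.random.randint(0, 2, size=1000)
units = np.random.randint(1, 101, size=1000)

# Keeping all windows from one engine in the same fold prevents train/validation leakage.
gkf = GroupKFold(n_splits=5)
for fold, (train_idx, val_idx) in enumerate(gkf.split(X, y, groups=units)):
    print(f"fold {fold}: {len(train_idx)} train windows, {len(val_idx)} validation windows")
```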
In Figure 5, the legend should be larger.
The manuscript contains some grammatical errors; I therefore recommend proofreading by a native speaker.
Author Response
Please refer to the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
This study presents a robust Field Programmable Gate Array (FPGA)-accelerated predictive maintenance framework for UAV engines using a Singular Value Decomposition (SVD)-optimized Long Short-Term Memory (LSTM) model. The topic and the results are interesting. However, some comments should be addressed, as follows.
1. This paper is too long, and its structure should be improved. In addition, the written English should be polished.
2. The contributions and originality of this paper should be highlighted in the abstract or introduction sections.
3. There are many figures in this paper; the number of figures should be reduced. The presentation of Figures 5, 13, 15, 16, and 17 needs to be modified for better readability according to the relevant drawing standards.
4. In Section 7, some comparative simulations or experiments involving existing algorithms should be added to verify the superiority of the proposed schemes.
Author Response
Please refer to the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Thank you for the revision and response. The reviewer's previous comments have been mostly addressed.
One remaining concern: the 70 ms GPU latency vs. 90 ms CPU time does not intuitively make sense. Normally, a model running on a GPU will be significantly faster than on a CPU. Could it be that many operations are still performed on the CPU in your Colab experiment, even with a GPU available? Could you also run the test in Colab on the CPU only (with the GPU explicitly disabled) for comparison? Please carefully check the experiment's soundness and provide reasoning for the results.
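For instance, something along these lines would force TensorFlow in Colab to run entirely on the CPU; the model path, window length, and feature count below are placeholders, not values from your paper:

```python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # hide the GPU before TensorFlow is imported

import time
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("lstm_model.h5")  # hypothetical model file
x = np.random.rand(1, 30, 24).astype("float32")      # assumed (batch, window, features)

model.predict(x, verbose=0)                          # warm-up call
t0 = time.perf_counter()
for _ in range(100):
    model.predict(x, verbose=0)
print((time.perf_counter() - t0) / 100 * 1000, "ms per inference, CPU only")
```

Comparing this CPU-only number with the GPU run (and checking per-operation device placement) should clarify whether the 70 ms GPU figure is dominated by host-side overhead.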
Author Response
Please read the attached file.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Can be accepted.
Author Response
We appreciate the reviewer's time, effort, and dedication in providing valuable feedback on our manuscript.
Round 3
Reviewer 1 Report
Comments and Suggestions for Authors
Thank you for the response. Note that the demonstration of the impact of this work is still limited without a comparison to a comparable GPU-equipped edge computing device. Readily available edge computing devices could greatly reduce development costs, with mature infrastructure support for model integration, etc. The reliability of these mass-production devices is often better guaranteed as well. The authors might want to include a brief discussion of the tradeoffs here.
Author Response
Please refer to the attached file.
Author Response File: Author Response.pdf