Comparative Study of RNN-Based Deep Learning Models for Practical 6-DOF Ship Motion Prediction
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Summary: This manuscript presents a comparative analysis of RNN-based deep learning models (RNN, LSTM, GRU, Bi-LSTM) for six degrees-of-freedom (6-DOF) ship motion prediction under varying sea states. The study systematically evaluates factors such as input sequence length, downsampling intervals, input dimensionality, and model complexity. Importantly, the authors introduce Peak Matching as an additional evaluation metric to complement standard error measures, emphasizing the importance of accurate prediction of extrema in maritime safety contexts. Experiments are conducted on a simulated dataset of ship motions across multiple environments, and results indicate that Bi-LSTM consistently outperforms unidirectional models, particularly under complex conditions.
Strengths:
- The paper addresses a practically significant and safety-critical problem in maritime engineering—short-term ship motion prediction, essential for autonomous navigation and operational safety.
- The comparative evaluation is comprehensive, exploring not only multiple RNN variants but also the impact of sequence length, downsampling, and different input configurations (single DOF, 6-DOF, with/without wave data).
- The introduction of Peak Matching and Overestimation Ratio as performance indicators is a valuable contribution, reflecting realistic maritime decision-making needs that go beyond traditional error metrics.
- The results are well-documented with figures, sensitivity analyses, and detailed discussions. The findings (e.g., optimal sequence length ≈200 timesteps, downsampling interval n=5, Bi-LSTM robustness) are practical and actionable for real-world implementations.
- The paper is well-written, logically structured, and supported by a thorough literature review that situates the contribution within existing work.
Areas for Improvement:
- Dataset limitations: The study relies on a simulated dataset (KCS model under nine sea states). While this is appropriate for controlled experimentation, real-world validation is essential to confirm generalization. The authors acknowledge this but should emphasize it more strongly in the limitations.
- Scope of architectures: Only RNN-based models are compared. Including or at least discussing more recent architectures (e.g., CNN-RNN hybrids, attention-based models, or Transformers) would strengthen positioning.
- Computational performance: While model accuracy is well-covered, reporting training/inference times and resource usage would make the results more actionable for deployment scenarios.
- Peak Matching metric validation: The new metric is promising, but additional evidence (e.g., correlation with real operational outcomes or comparison with other event-based metrics) would help justify its adoption.
- Figures: While informative, some figures (e.g., prediction curves across multiple models) are crowded and could be simplified or moved to supplementary material for readability.
Conclusion and Recommendation: This is a strong and valuable manuscript that makes both methodological and practical contributions to ship motion prediction research. The comparative evaluation, coupled with the introduction of physically meaningful performance metrics, provides actionable insights for engineers and researchers. However, to maximize impact, the paper should strengthen the discussion on generalization beyond simulations, expand context regarding newer deep learning methods, and report computational efficiency. I recommend acceptance with major revisions.
Author Response
Response to the Reviewer #1
Title: Study of RNN-Based Deep Learning Models for Practical 6-DOF Ship Motion Prediction
Authors: HaEun Lee, Yangjun Ahn
All your comments are highly appreciated. Your comments have been extremely valuable in helping us refine the manuscript and strengthen its contributions. The manuscript has been revised based on your valuable comments, and our replies to your comments are provided below.
- Dataset limitations: The study relies on a simulated dataset (KCS model under nine sea states). While appropriate for controlled experimentation, real-world validation is essential to confirm generalization. The authors acknowledge this but should emphasize it more strongly in the limitations.
- The authors deeply agree with your consideration and suggestions.
- We fully agree with the reviewer that real-world validation is indispensable for confirming the generalizability of the proposed models. While the present study focused on controlled simulations to enable systematic comparisons, we have revised the Discussion section (p.25) to acknowledge this limitation more explicitly. In particular, we now highlight the necessity of validating the models using full-scale measurement data from sea trials or operational environments, and we outline this as a key direction for our future work.
- The authors have revised the DISCUSSION section by emphasizing limitations and adding future work at the end.
(DISCUSSION)
Several limitations of this study should be acknowledged. First, the experiments were conducted on a simulated dataset of a specific vessel (KCS model) under nine sea states. Although this setting considers a range of wave environments, it inherently limits the generalizability of the findings to different vessel types and real-world maritime conditions. Therefore, future validation with measurement data from sea trials or operational environments will be essential to confirm the robustness of the models. Second, ...
- Scope of architectures: Only RNN-based models are compared. Including or at least discussing more recent architectures (e.g., CNN-RNN hybrids, attention-based models, or Transformers) would strengthen positioning.
- We sincerely thank the reviewer for this insightful suggestion.
- Our study intentionally focused on RNN-based architectures to establish a systematic and controlled comparison among widely used baseline models. These models were selected for their historical relevance and accessibility to a wide community of researchers and practitioners through standard PyTorch libraries. Nevertheless, we fully agree that more recent architectures, such as CNN-RNN hybrids, attention mechanisms, Transformer-based models, and physics-informed neural networks, represent promising directions for enhancing generalization and capturing nonlinear vessel–wave interactions. Accordingly, we have revised the Discussion section to explicitly acknowledge this limitation and highlight these models as important avenues for future research.
- (DISCUSSION)
Second, the scope of architectures was intentionally restricted to RNN-based variants to establish a clear comparative baseline. Nevertheless, recent advances such as attention mechanisms, Transformer-based networks, and physics-informed neural architectures may offer superior capabilities for capturing nonlinear vessel–wave interactions.
- Computational performance: While model accuracy is well-covered, reporting training/inference times and resource usage would make the results more actionable for deployment scenarios.
- We appreciate the reviewer’s constructive suggestions.
- In addition to specifying the computational environment, we conducted a brief experiment to measure the average training time per epoch on an NVIDIA GeForce RTX 3090 GPU. The results show that model training is computationally feasible:
- RNN: mean = 1.71 s, median = 1.73 s
- LSTM: mean = 2.01 s, median = 2.09 s
- GRU: mean = 1.90 s, median = 1.97 s
- Bi-LSTM: mean = 2.60 s, median = 2.68 s
Inference latency remained negligible (on the order of milliseconds per sequence). These results indicate that all proposed models can be trained and deployed without excessive computational cost. This clarification has been added to the Materials and Methods: Training section (p.12).
(MATERIALS AND METHODS: TRAINING)
All experiments were conducted on a workstation with an NVIDIA GeForce RTX 3090 GPU. On average, training time per epoch was approximately 1.7 s (RNN), 2.0 s (LSTM), 1.9 s (GRU), and 2.6 s (Bi-LSTM), confirming that training was computationally feasible under this setting.
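As an illustration only (this is not the authors' code), per-epoch mean and median training times like those reported above could be collected with a small timing wrapper. The names `time_epochs`, `summarize`, and `dummy_epoch` are hypothetical, and `dummy_epoch` merely stands in for a real training epoch.

```python
import statistics
import time


def time_epochs(train_one_epoch, n_epochs):
    """Return the wall-clock duration of each epoch in seconds."""
    durations = []
    for _ in range(n_epochs):
        start = time.perf_counter()
        train_one_epoch()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Mean and median, the two statistics quoted per model."""
    return {
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
    }


def dummy_epoch():
    # Hypothetical stand-in for one pass over the training set.
    sum(i * i for i in range(10_000))


durations = time_epochs(dummy_epoch, n_epochs=5)
summary = summarize(durations)
```

When the work runs on a GPU, the device usually has to be synchronized before the timer is read (for example with `torch.cuda.synchronize()` in PyTorch); otherwise asynchronous kernel launches can make epochs look shorter than they are.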
- Peak Matching metric validation: The new metric is promising, but additional evidence (e.g., correlation with real operational outcomes or comparison with other event-based metrics) would help justify its adoption.
- We sincerely thank the reviewer for this constructive comment.
- The Peak Matching metric was introduced in this study as a simple yet intuitive way to evaluate the accurate prediction of extrema, which are particularly critical in maritime operations such as cargo handling, collision avoidance, and helicopter landing. To the best of our knowledge, no established reference has formalized this metric in the context of ship motion prediction. Nevertheless, its strength lies in its straightforward interpretability and ability to highlight safety-relevant performance aspects that conventional error-based measures (e.g., RMSE, MAE) may overlook. We have revised the Discussion section to clarify this motivation and to emphasize that future work will validate the metric more rigorously using operational data.
- (DISCUSSION)
The introduction of Peak Matching as an evaluation metric addresses a specific limitation of MSE-based assessments in the context of ship motion prediction. While MSE provides an average measure of error, it does not capture the accuracy of extrema prediction, which is often more important for safety-critical operational decisions such as cargo handling, collision avoidance, helicopter operations, or structural load assessments. The strength of the Peak Matching metric lies in its intuitive interpretability and ability to highlight dynamics directly relevant to maritime safety. Nevertheless, its effectiveness may depend on the specific application context, and future work will need to validate the metric rigorously against real-world operational datasets to establish its practical relevance.
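The exact definitions of Peak Matching and the Overestimation Ratio are given in the manuscript and are not reproduced in this record. The sketch below is therefore only one plausible reading, under stated assumptions: a peak is a strict-left local maximum, a true peak is "matched" when an unused predicted peak lies within `max_shift` samples, and the overestimation ratio is the share of matched peaks whose predicted value is at least the true value. The function names and the tolerance parameter are hypothetical.

```python
def local_maxima(x):
    """Indices i with x[i-1] < x[i] >= x[i+1] (assumed peak definition)."""
    return [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]


def peak_matching(truth, pred, max_shift=1):
    """Return (match_ratio, overestimation_ratio) under the assumptions above."""
    true_peaks = local_maxima(truth)
    pred_peaks = local_maxima(pred)
    used = set()
    pairs = []
    for t in true_peaks:
        # Candidate predicted peaks close enough in time and not yet paired.
        candidates = [p for p in pred_peaks
                      if abs(p - t) <= max_shift and p not in used]
        if candidates:
            best = min(candidates, key=lambda p: abs(p - t))
            used.add(best)
            pairs.append((t, best))
    match_ratio = len(pairs) / len(true_peaks) if true_peaks else 0.0
    over = sum(1 for t, p in pairs if pred[p] >= truth[t])
    over_ratio = over / len(pairs) if pairs else 0.0
    return match_ratio, over_ratio


truth = [0.0, 1.0, 0.0, 2.0, 0.0, 1.5, 0.0, 0.0]
pred = [0.0, 0.0, 1.2, 0.0, 1.8, 0.0, 0.0, 0.0]
match_ratio, over_ratio = peak_matching(truth, pred, max_shift=1)
```

In this toy case two of the three true peaks find a predicted partner, and one of those two partners overshoots the true amplitude. A low MSE would not necessarily reveal either fact, which is the motivation stated in the paragraph above.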
- Figures: While informative, some figures (e.g., prediction curves across multiple models) are crowded and could be simplified or moved to supplementary material for readability.
- The authors appreciate the reviewer’s suggestion to improve figure readability.
- In the revised manuscript, we simplified several main text figures and adjusted the line styles for clarity. For example, Figure 16(b), which compares five model configurations, has been revised by modifying the line styles to ensure that each curve is visually distinguishable. In addition, detailed prediction curves that were previously crowding the main text have been moved to Appendix A (e.g., Figures A1–A3), thereby improving the overall readability of the paper while still making the complete set of results available for reference.
- (FIGURE 16B)
- (APPENDIX)
- Appendix A
Appendix A.1 Single Environment: Effect of Trained Sequence Length
This appendix provides the full prediction curves for all four RNN-based models (RNN, LSTM, GRU, Bi-LSTM) across different sequence lengths. These figures complement the representative results shown in the main text (Section 5.1.1).
(a) Sequence length = 50 timesteps
(b) Sequence length = 100 timesteps
(c) Sequence length = 200 timesteps
(d) Sequence length = 300 timesteps
(e) Sequence length = 400 timesteps
Figure A1. Model Prediction Results for Varying Sequence Lengths: Predicted time series outputs for different sequence lengths of 50, 100, 200, 300, and 400 timesteps.
Appendix A.2 Single Environment: Effect of Downsampling Interval
This appendix provides the prediction curves for all four RNN-based models (RNN, LSTM, GRU, Bi-LSTM) under different downsampling intervals at sequence length 200. These figures complement the representative results shown in the main text (Section 5.1.1).
(a) n=1
(b) n=2
(c) n=5
(d) n=10
Figure A2. Example of sequential visualization of prediction results for sequence length 200 under varying downsampling intervals
Appendix A.3 Multi-Environment: Sensitivity to Sequence Length
This appendix provides the prediction results for all four RNN-based models (RNN, LSTM, GRU, Bi-LSTM) across different sequence lengths in multi-environment scenarios, complementing Section 5.2.2 in the main text.
(a) Sequence length = 50 timesteps
(b) Sequence length = 100 timesteps
(c) Sequence length = 200 timesteps
(d) Sequence length = 300 timesteps
Figure A3. Example of sequential visualization of prediction results for different sequence lengths of 50, 100, 200, and 300 timesteps.
- English Editing
The authors have revised the English sentences and words.
Reviewer 2 Report
Comments and Suggestions for Authors
Summary:
The manuscript presents a comparative study of four RNN-based DL models for the prediction of 6-DOF ship motion. The authors evaluate RNN, LSTM, GRU, and Bi-LSTM models using a simulated dataset with one and nine distinct conditions. The study systematically investigates the effect of various parameters, such as dimensionality of the system, window size, downsampling rate, and model complexity. The results indicate that the Bi-LSTM architecture outperforms its unidirectional counterparts, the use of multi-DOF motion data enhances performance, while the inclusion of wave data hinders the prediction. The shorter the window size, the better, and a downsampling interval of 5 improves the prediction accuracy. In addition, the manuscript introduces a "Peak Matching" method that complements the MSE by assessing the model's ability to predict motion extrema.
General comments:
i) The article is very well written and was a joy to read. It is well structured with flow, all experiments are motivated appropriately, and all design choices are supported.
ii) The actual content of this work does not include any new methods or unexpected findings, which limits its potential impact and significance. Yet, I believe it would be a valuable read for the interdisciplinary research community.
Specific comments:
Please find a list of comments below that would hopefully assist the authors to further enhance the impact of this work.
1) Please clarify the key contributions of this work in the introduction section, using bulletpoint structure. As it reads now (the introduction), it might not be particularly attractive for the reader.
2) I think the authors should clarify the intended use of the predictions.
Do they need to be online, or are the authors also considering offline use cases?
This is relevant to the Bi-LSTMs, as they are mostly useful for offline or delayed online processing.
Please provide more details about the backward pass of the Bi-LSTMs.
Do you actually feed future measurements? If yes, is the comparison with the other unidirectional sequence models fair, in particular for the case of Non-Representative Inputs?
3) Please provide more information on how you actually solve eq 15-17 in order to generate the synthetic data.
4) Both the Peak Matching Algorithm and the Overestimated Peak Ratio are indeed interesting.
5) There is a bit of repetition in the last paragraph of the introduction about what this work is about. It's best to delineate this.
6) Typo in fig 3. "Blue" lines indicate the original... it should be grey line
7) In lines 493-494, there is a statement "paradoxically suggesting that the models have achieved good generalization by not overfitting to specific patterns" with which I am not sure I agree and especially if it can be concluded just from this one experiment.
8) Similarly, in lines 562-563, "utilizing exogenous variables effectively requires advanced model structures or the introduction of separate mechanism", I am not sure if this can be concluded just from this one experiment.
9) In the experiments with all the 6-DOF predictions, such as Fig 10 - 14, please include and comment on the timewise plots of the prediction for the other dimensions too. Only heave is presented, which is quite limiting demonstration of the prediction capabilities of a 6-DOF model.
10) I wonder if both (second part of 5.1.1 and 5.2.3) down-sampling investigations are really needed in the manuscript? Maybe remove one of the two to save some space and add plots with the rest of the dimensions of the 6-DOF model.
11) In Figure 11b, the down-right plot shows that all the predictions are inaccurate. Can you please discuss this? Why aren't the models able to predict with decent accuracy the first 50-100 steps and then degrade?
Author Response
Response to the Reviewer #2
Title: Study of RNN-Based Deep Learning Models for Practical 6-DOF Ship Motion Prediction
Authors: HaEun Lee, Yangjun Ahn
All your comments are highly appreciated. Your comments have been extremely valuable in helping us refine the manuscript and strengthen its contributions. The manuscript has been revised based on your valuable comments, and our replies to your comments are provided below.
1) Please clarify the key contributions of this work in the introduction section, using bulletpoint structure. As it reads now (the introduction), it might not be particularly attractive for the reader.
- The authors deeply agree with your consideration and suggestions.
- To address this, we have revised the Introduction Section (p.3) to explicitly highlight the key contributions of the study in a bullet-point structure. This addition improves the readability and attractiveness of the paper by clearly presenting the novelty and practical significance of the work. The revised section now reads as follows.
- (INTRODUCTION)
…
The contributions of this paper can be summarized as follows:
- Unified baseline for 6-DOF prediction. Four representative RNN-based models were systematically compared under consistent experimental conditions, establishing a practical baseline for ship motion prediction across maritime environments.
- Basic RNN model design insights. Practical guidelines on sequence length, downsampling strategy, input dimensionality, and model capacity were derived using standard PyTorch libraries, highlighting the conditions under which RNN-based models can achieve stable and reliable performance.
- Safety-relevant evaluation. Peak Matching and the Overestimation Ratio were proposed as complementary evaluation metrics to traditional error measures, emphasizing the importance of extrema prediction for safety-critical maritime operations.
2) I think the authors should clarify the intended use of the predictions. Do they need to be online? or they are also considering offline use cases too ? This is relevant to the Bi-LSTMs, as they are mostly useful for offline or delayed online processing. Please provide more details about the backward pass of the Bi-LSTMs.
Do you actually feed future measurements? if yes, is the comparison with the other unidirectional sequence models fair? in particular for the case of Non-Representative Inputs.
- We thank the reviewer for raising this important point.
- The intended scope of our study is primarily offline or near-real-time scenarios where a short delay in prediction is acceptable, such as decision support, risk assessment, or operational planning. While online real-time control applications remain highly important, they typically favor unidirectional models due to their causal processing. In this work, Bi-LSTM was included to establish a comprehensive baseline and to explore its potential advantages in offline or delayed-online maritime applications.
- Regarding the backward pass of the Bi-LSTM, we confirm that no future measurements beyond the given input window were fed to the model. The same input sequence was processed in forward and reverse order, and the hidden states were concatenated to form the final representation. This ensures that all models—including RNN, LSTM, GRU, and Bi-LSTM—were trained and evaluated under identical input conditions, preserving fairness in the comparisons.
- The authors have clarified these points in the revised manuscript (Section 3.4 and Section 4.2.1).
(3.4. BIDIRECTIONAL LSTM)
… Furthermore, the backward pass of Bi-LSTM was implemented without feeding future measurements beyond the given input window. The same input sequence was processed forward and reverse, ensuring a fair comparison with unidirectional models.
(4.2.1 MODEL ARCHITECTURE)
… The intended scope of the present study is primarily offline or near-real-time prediction scenarios where a short delay is acceptable (e.g., decision support, risk assessment, or planning). Unidirectional models would be more appropriate for strictly real-time control applications, but Bi-LSTM was included here to establish comparative baselines and explore their potential for operational analysis.
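The windowed bidirectional processing described in this response (the same input window read forward and in reverse, with the final states concatenated) can be sketched in plain Python. This is a deliberately simplified illustration, not the authors' implementation: a scalar tanh recurrence replaces the LSTM cell, and the weights, function names, and window indices are all hypothetical.

```python
import math


def run_rnn(window, w_in, w_rec):
    """Run a scalar tanh recurrence over the window; return the final state."""
    h = 0.0
    for x in window:
        h = math.tanh(w_in * x + w_rec * h)
    return h


def bidirectional_features(series, end, length, fwd=(0.8, 0.5), bwd=(0.6, 0.4)):
    """Concatenate forward and backward final states for series[end-length:end]."""
    window = series[end - length:end]
    h_fwd = run_rnn(window, *fwd)                   # reads oldest -> newest
    h_bwd = run_rnn(list(reversed(window)), *bwd)   # reads newest -> oldest
    return (h_fwd, h_bwd)


series = [math.sin(0.3 * k) for k in range(50)]
features = bidirectional_features(series, end=30, length=10)

# Overwrite every sample after the window: the features must not change,
# because the backward pass only ever sees the window itself.
tampered_future = series[:30] + [99.0] * 20
features_future = bidirectional_features(tampered_future, end=30, length=10)

# Overwrite one sample inside the window: the features do change.
tampered_inside = series[:25] + [99.0] + series[26:]
features_inside = bidirectional_features(tampered_inside, end=30, length=10)
```

The point of the two tampering checks is the fairness argument made above: the backward direction adds a second reading of the same window, not access to measurements beyond it. In PyTorch the corresponding layer would typically be `torch.nn.LSTM` with `bidirectional=True`.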
3) Please provide more information on how you actually solve the eq 15-17, in order to generate the synthetic data.
- The authors appreciate the reviewer’s constructive suggestions. The following paragraph regarding the dataset has been modified in the 4.1.1 Dataset section.
(4.1.1 Dataset)
To develop the training dataset for the neural network models, the KRISO Container Ship (KCS) was chosen as the representative vessel, with its principal particulars provided in Table 2. Assuming operation at the design cruising speed, simulations were carried out under multiple sea conditions, as detailed in Table 3, by combining various sea states with different wave heading angles. A long-crested irregular wave field was generated by decomposing the ITTC wave spectrum into 100 sinusoidal components with random phase superposition to mimic natural irregularity. For each condition, time-series data of the 6-DOF ship motions were produced over 10,000 seconds with a sampling interval of 0.01 seconds, resulting in 90,000 seconds of simulation and about 9 million data points in total. This dataset was balanced across all environmental conditions to ensure fair model training. To capture the nonlinear dependence of ship responses on sea-state severity, a total of eighteen sea states with different significant wave heights and mean periods were considered, restricted to long-crested irregular head seas. The motion databases were generated through weakly nonlinear IRF-based simulations. Independent test sets were configured for each operating scenario to evaluate the generalization performance of the models.
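The wave-generation step in this paragraph (a spectrum split into 100 sinusoidal components with random phases) can be sketched as follows. This is an assumption-laden illustration rather than the authors' simulation code: the response does not state which ITTC form was used, so the two-parameter form S(ω) = A/ω⁵ · exp(−B/ω⁴) with A = 173·H²/T₁⁴ and B = 691/T₁⁴ is assumed, and the frequency range, equal spacing, seed, and sea-state values are hypothetical.

```python
import math
import random


def ittc_spectrum(omega, hs, t1):
    """Assumed two-parameter ITTC form: S = A / w^5 * exp(-B / w^4)."""
    a = 173.0 * hs ** 2 / t1 ** 4
    b = 691.0 / t1 ** 4
    return a / omega ** 5 * math.exp(-b / omega ** 4)


def wave_components(hs, t1, n=100, w_min=0.2, w_max=2.5, seed=0):
    """Split the spectrum into n bins; one sinusoid per bin, random phase."""
    rng = random.Random(seed)
    dw = (w_max - w_min) / n
    comps = []
    for i in range(n):
        w = w_min + (i + 0.5) * dw                      # bin-centre frequency
        amp = math.sqrt(2.0 * ittc_spectrum(w, hs, t1) * dw)
        phase = rng.uniform(0.0, 2.0 * math.pi)
        comps.append((amp, w, phase))
    return comps


def elevation(comps, t):
    """Long-crested wave elevation at a fixed point, time t in seconds."""
    return sum(a * math.cos(w * t + p) for a, w, p in comps)


comps = wave_components(hs=3.0, t1=8.0)
# 10 s of elevation at the 0.01 s sampling interval quoted above.
eta = [elevation(comps, 0.01 * k) for k in range(1000)]

# Sanity check: zeroth spectral moment recovered from the amplitudes.
m0 = sum(a * a / 2.0 for a, _, _ in comps)
hs_estimate = 4.0 * math.sqrt(m0)
```

The amplitude rule a = sqrt(2·S(ω)·Δω) makes each component carry the energy of its frequency bin, so 4·sqrt(m0) should land close to the target significant wave height as long as the chosen frequency range covers most of the spectrum.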
4) Both the Peak Matching Algorithm and the Overestimated Peak Ratio are indeed interesting.
- The authors sincerely thank the reviewer for recognizing the contribution of the Peak Matching Algorithm and the Overestimated Peak Ratio. We are pleased that these event-based evaluation metrics were found interesting. We greatly appreciate your positive feedback.
5) There is a bit of repetition in the last paragraph of the introduction about what this work is about. It's best to delineate this.
- We sincerely thank the reviewer for this constructive comment.
- We agree that the final paragraph of the Introduction contains some repetitions regarding the objectives and contributions of this study. The authors have revised the text to reduce redundancy and improve clarity. The revised version now reads as follows to address this issue:
(INTRODUCTION)
…
The present study systematically investigates the prediction performance of four representative RNN-based architectures (RNN, LSTM, GRU, Bi-LSTM) under single- and multi-sea-state environments (Sections 3–5). It quantifies the effects of input sequence length, downsampling interval, model scale, and input dimensionality. In addition, using both standard error metrics (MSE/MAE) and peak-related evaluation indicators enables the formulation of practical design guidelines for real-world implementation (Figures 5–14). Given the growing interest and adoption of deep learning models in maritime applications, there is a critical need for comprehensive comparative studies that can guide researchers and engineers in selecting appropriate architectures.
…
6) Typo in fig 3. "Blue" lines indicate the original... it should be grey line
- We sincerely thank the reviewer for pointing out this oversight. This correction ensures consistency between the figure legend and the actual visualization.
- The figure caption has been corrected accordingly.
(FIGURE 3)
Example of peak-preserving down-sampling (n = 10). Grey lines indicate the original signal, blue dots represent sampled points, and red stars indicate the physical peaks preserved through correction.
7) In lines 493-494, there is a statement "paradoxically suggesting that the models have achieved good generalization by not overfitting to specific patterns" with which I am not sure I agree and especially if it can be concluded just from this one experiment.
- The authors sincerely appreciate the reviewer’s careful remark.
- Our original intention in this section was not to make a definitive claim about generalization, but to describe an observed phenomenon: while the models performed well on representative inputs, their accuracy consistently degraded in multiple experiments with non-representative intervals (e.g., sudden wave changes, atypical motion patterns). This seemed counterintuitive compared to typical overfitting behavior, where memorizing training-specific features would usually cause degradation across representative and non-representative conditions.
- However, the authors fully agree with your concern. To avoid overstating this implication, we have revised the manuscript in Section 5.1.2 (Performance under Non-Representative Inputs) to clarify this as an empirical observation only. The revised text now emphasizes that while models may not have memorized training patterns, such degradation alone does not establish evidence of generalization.
(5.1.2. PERFORMANCE UNDER NON-REPRESENTATIVE INPUTS)
Figure 6 illustrates time series prediction results for atypical input intervals characterized by sudden wave changes. Across this and other non-representative cases, the experimental results consistently reveal significant accuracy degradation across most models. This pattern may suggest that the models did not simply memorize training-specific features, but instead avoided strong overfitting to particular patterns. However, the findings also indicate that uncertainty arising from input variations remains, suggesting the need for training strategies and model interpretations that explicitly account for such uncertainty.
8) Similarly, in lines 562-563, "utilizing exogenous variables effectively requires advanced model structures or the introduction of separate mechanism", I am not sure if this can be concluded just from this one experiment.
- The authors sincerely appreciate the reviewer’s careful remark.
- Our intention in this section was not to make a broad or definitive claim, but rather to describe an observed tendency from our experiments. While multi-DOF inputs consistently improved accuracy, including wave elevation sometimes led to degraded performance. To illustrate this, we presented Figure 8 as a representative example.
- We agree with the reviewer that the original wording (“utilizing exogenous variables effectively requires advanced model structures or the introduction of separate mechanisms”) may have sounded stronger than intended. In the revised manuscript, we have rephrased this sentence more cautiously (Section 5.2.1, p.17) to clarify that this is an empirical observation based on our tested cases. We now state that incorporating exogenous variables such as wave elevation may require more advanced model structures. Further work with broader datasets and architectures will be needed to draw firm conclusions.
(5.2.1 PERFORMANCE COMPARISON BY INPUT FEATURES: DOF COMBINATIONS AND EXTERNAL VARIABLE EFFECTS)
These results suggest that input configurations with direct and explicit correlations are advantageous for prediction performance improvement in simple time series models. By contrast, incorporating exogenous variables such as wave elevation introduced mixed effects in our tested cases, possibly due to time-lagged or nonlinear relationships. This observation indicates that effectively leveraging such variables may require more advanced model structures or additional mechanisms, and further investigation is needed to confirm this across diverse settings.
9) In the experiments with all the 6-DOF predictions, such as Fig 10 - 14, please include and comment on the timewise plots of the prediction for the other dimensions too. Only heave is presented, which is quite limiting demonstration of the prediction capabilities of a 6-DOF model.
- You are absolutely right. Although the present study considered all 6-DOF motions, and the reported quantitative metrics (MSE/MAE/Peak Matching) reflect the performance across all DOFs, only the heave motion was plotted in the manuscript. For visualization, we selected heave as the representative case because it is the most widely studied motion in ship dynamics, and including all DOFs in the main text would have significantly reduced readability due to figure complexity. Importantly, we also examined the other motion components (surge, sway, roll, pitch, and yaw), and confirm that the trends observed for heave (e.g., the relative stability of Bi-LSTM and the impact of sequence length or downsampling) were consistent across all motions. For this reason, we did not include separate plots for the other DOFs in the main text. However, we will gladly provide DOF-specific timewise plots as supplementary material if the editor or reviewers consider it necessary.
- As the reviewer instructed, the following paragraph has been added to the Discussion section.
In addition to the heave motion, the other five motion components (surge, sway, roll, pitch, and yaw) were analyzed. The results demonstrated that the trends observed for heave, such as the relative stability of Bi-LSTM and the effects of sequence length and downsampling, were consistently reproduced across all 6DOF motions. Therefore, only heave is shown in the main text for clarity, while the findings represent all motion components.
10) I wonder if both (second part of 5.1.1 and 5.2.3) down-sampling investigations are really needed in the manuscript? Maybe remove one of the two to save some space and add plots with the rest of the dimensions of the 6-DOF model.
- As this study aims to emphasize its practical relevance, we specifically examined parameters that directly affect the input length of RNN models, since this is one of the most common challenges faced by AI model users. Even with the same physical input duration, the effective input length of the neural network changes depending on the downsampling strategy. For example, a 10-second input corresponds to a sequence length of 100 when sampled at 10 Hz, but increases to 1000 when sampled at 100 Hz. We kindly ask the reviewers to recognize that this analysis was conducted to highlight the practical considerations underlying our study.
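The relationship between physical input duration, sampling rate, and network sequence length in the example above is simple arithmetic, sketched here for concreteness. The function name and the `downsample_n` parameter are illustrative and not taken from the manuscript.

```python
def sequence_length(duration_s, base_rate_hz, downsample_n=1):
    """Number of timesteps fed to the network for a given physical duration."""
    effective_rate = base_rate_hz / downsample_n
    return round(duration_s * effective_rate)


len_10hz = sequence_length(10, 10)                       # 10 s sampled at 10 Hz
len_100hz = sequence_length(10, 100)                     # 10 s sampled at 100 Hz
len_100hz_n5 = sequence_length(10, 100, downsample_n=5)  # 100 Hz, every 5th sample
```

With 100 Hz data (a 0.01 s sampling interval), keeping every fifth sample gives an effective 20 Hz, so the same 10-second input shrinks from 1000 to 200 timesteps.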
11) In Figure 11b, the down-right plot shows that all the predictions are inaccurate. Can you please discuss this? Why aren't the models able to predict with decent accuracy the first 50-100 steps and then degrade?
- We sincerely thank the reviewer for this helpful comment.
- We believe that this point has been addressed in Section 5.2.2, where we explain that excessively long sequence lengths introduce structural limitations in RNN-based models—such as vanishing gradients, information dilution, and noise accumulation—that hinder the effective capture of long-term dependencies (Figure 11a). These limitations account for the overall degradation and the inaccuracy observed in the early steps of long-horizon predictions in Figure 11b. While no additional changes were made, we appreciate the reviewer’s comment, which helped us reconfirm that this discussion was necessary and appropriately placed.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The manuscript presents an extensive study on neural network–based time series models for ship motion forecasting, focusing on recurrent neural networks (RNN, LSTM, GRU, and Bi-LSTM) under varying environmental conditions and input configurations. The manuscript is motivated by the need for accurate short-term motion prediction to enhance safety and efficiency in maritime operations, and the authors address the inherent challenges of nonlinear, non-stationary, and environment-dependent ship dynamics. The manuscript presents a systematic evaluation across multiple factors, including input sequence length, downsampling intervals, model size, input dimensionality, and environmental variability. The results demonstrate that Bi-LSTM consistently achieves superior performance, particularly in multi-environment scenarios, while multi-degree-of-freedom inputs enhance predictive accuracy by capturing the coupled nature of ship dynamics. The introduction of peak matching as a complementary evaluation metric is a valuable contribution, as it emphasises the importance of capturing motion extrema in real-world maritime contexts. The authors do not mention the use of quaternions, the most widely used format for tracking orientation. Although, owing to their simple representation, tracking orientations via quaternions does not require a neural network setting, as pointed out in "Quaternion-Valued Distributed Filtering and Control," IEEE Transactions on Automatic Control, doi: 10.1109/TAC.2020.3007332, I think the studies on quaternions should at least be pointed to in the introduction. Despite skipping the work on quaternions, the manuscript is well-motivated, extensive in scope, and offers practical insights, such as diminishing returns from model scaling and the benefits of moderate downsampling. I would like to draw attention to the following issues, where the paper could be improved.
First, while the simulated dataset allows for controlled experimentation, the absence of validation on real-world data limits the immediate applicability of the findings; this should be emphasised more directly as a current limitation. Second, while the paper identifies useful design guidelines, the exposition at times feels dense and could be streamlined to improve accessibility for a broader readership. More detailed comparisons to alternative forecasting approaches beyond RNN variants would also help contextualise the contribution within the broader literature on time series prediction.
Author Response
Response to the Reviewer #3
Title: Study of RNN-Based Deep Learning Models for Practical 6-DOF Ship Motion Prediction
Authors: HaEun Lee, Yangjun Ahn
All your comments are highly appreciated. Your comments have been extremely valuable in helping us refine the manuscript and strengthen its contributions. The manuscript has been revised based on your valuable comments, and our replies to your comments are provided below.
The authors do not mention the use of quaternions, the most widely used format for tracking orientation. Although, owing to their simple representation, tracking orientations via quaternions does not require a neural network setting, as pointed out in "Quaternion-Valued Distributed Filtering and Control," IEEE Transactions on Automatic Control, doi: 10.1109/TAC.2020.3007332, I think the studies on quaternions should at least be pointed to in the introduction.
- The authors sincerely appreciate the reviewer’s insightful comment regarding the role of quaternions in orientation representation.
- We fully agree that quaternions are an essential and widely used format in motion tracking, and we are grateful that the reviewer has already recognized the distinction between such approaches and our work's neural network–based modeling focus.
- We have revised the INTRODUCTION (p. 2) to acknowledge related studies in this area while clarifying that our research centers on data-driven sequence modeling of ship motions under diverse environmental conditions.
- The authors have revised the INTRODUCTION section by briefly discussing quaternion-based approaches.
(INTRODUCTION)
…Quaternion-based Kalman filtering has also been explored for distributed orientation estimation [11]. Still, such methods are primarily suited for sensor fusion and attitude tracking….
While the simulated dataset allows for controlled experimentation, the absence of validation on real-world data limits the immediate applicability of the findings; this should be emphasised more directly as a current limitation
- The authors deeply agree with your consideration and suggestions.
- We acknowledge the reviewer’s point that the absence of validation on real-world data is a current limitation. While this study focused on controlled simulations for systematic comparisons, we have revised the Discussion (p. 24) to note this limitation more explicitly and to emphasize validation with full-scale measurement data from sea trials as an important direction for future work.
- The authors have revised the DISCUSSION section by emphasizing limitations and adding future work at the end.
(DISCUSSION)
…
Several limitations of this study should be acknowledged. First, the experiments were conducted on a simulated dataset of a specific vessel (KCS model) under nine sea states. Although this setting considers a range of wave environments, it inherently limits the generalizability of the findings to different vessel types and real-world maritime conditions. Therefore, future validation with measurement data from sea trials or operational environments will be essential to confirm the robustness of the models. Second, the scope of model architectures was deliberately limited to RNN-based variants to establish a clear and consistent comparative baseline. This choice reflects the recent surge of research activity in artificial intelligence–driven approaches, and the study focused exclusively on fundamental RNN-based architectures. Traditional methods such as ARIMA, Kalman filtering, and physics-based models were not considered within the present scope. Nonetheless, emerging techniques—such as attention mechanisms, Transformer-based networks, and physics-informed neural architectures—may provide enhanced capabilities for capturing complex nonlinear vessel–wave interactions. Third, the analysis was restricted to deterministic predictions without explicit quantification of predictive uncertainty, which represents a valuable direction for future work in support of risk-informed decision-making.
…
The exposition at times feels dense and could be streamlined to improve accessibility for a broader readership
- The authors thank the reviewer for this stylistic suggestion.
- The authors agree that certain sections of the manuscript appeared dense, partly due to multiple subfigures (e.g., panels a–d) being included in the main text. To address this, we have streamlined the presentation by moving figures with more than four subpanels (e.g., Figures A1–A3) to Appendix A. This adjustment reduces visual crowding in the main text and improves readability while making the full set of detailed results available for reference.
- (APPENDIX A)
Appendix A.1 Single Environment: Effect of Trained Sequence Length
This appendix provides the full prediction curves for all four RNN-based models (RNN, LSTM, GRU, Bi-LSTM) across different sequence lengths. These figures complement the representative results shown in the main text (Section 5.1.1).
[Figure A1 panels: (a) sequence length = 50 timesteps; (b) 100 timesteps; (c) 200 timesteps; (d) 300 timesteps; (e) 400 timesteps]
Figure A1. Model Prediction Results for Varying Sequence Lengths: Predicted time series outputs for different sequence lengths of 50, 100, 200, 300, and 400 timesteps.
Appendix A.2 Single Environment: Effect of Downsampling Interval
This appendix provides the prediction curves for all four RNN-based models (RNN, LSTM, GRU, Bi-LSTM) under different downsampling intervals at sequence length 200. These figures complement the representative results shown in the main text (Section 5.1.1).
[Figure A2 panels: (a) n=1; (b) n=2; (c) n=5; (d) n=10]
Figure A2. Example of sequential visualization of prediction results for sequence length 200 under varying downsampling intervals.
Appendix A.3 Multi-Environment: Sensitivity to Sequence Length
This appendix provides the prediction results for all four RNN-based models (RNN, LSTM, GRU, Bi-LSTM) across different sequence lengths in multi-environment scenarios, complementing Section 5.2.2 in the main text.
[Figure A3 panels: (a) sequence length = 50 timesteps; (b) 100 timesteps; (c) 200 timesteps; (d) 300 timesteps]
Figure A3. Example of sequential visualization of prediction results for different sequence lengths of 50, 100, 200, and 300 timesteps.
More detailed comparisons to alternative forecasting approaches beyond RNN variants would also help contextualise the contribution within the broader literature on time series prediction.
- We sincerely thank the reviewer for this insightful suggestion.
- Our study intentionally focused on RNN-based architectures to establish a systematic and controlled comparison among widely used baseline models. These models were selected not only for their historical relevance but also for their accessibility to a wide community of researchers and practitioners through standard PyTorch libraries.
- Nevertheless, we fully agree that recent architectures, such as CNN-RNN hybrids, attention mechanisms, Transformer-based models, and physics-informed neural networks, represent promising directions for enhancing generalization and capturing nonlinear vessel–wave interactions. Accordingly, we have revised the Discussion section (p.24) to explicitly acknowledge this limitation and to highlight these models as important avenues for future research.
- (DISCUSSION)
… Second, the scope of model architectures was deliberately limited to RNN-based variants to establish a clear and consistent comparative baseline. This choice reflects the recent surge of research activity in artificial intelligence–driven approaches, and the study focused exclusively on fundamental RNN-based architectures. Traditional methods such as ARIMA, Kalman filtering, and physics-based models were not considered within the present scope. Nonetheless, emerging techniques, such as attention mechanisms, Transformer-based networks, and physics-informed neural architectures, may provide enhanced capabilities for capturing complex nonlinear vessel–wave interactions. Third,…
Author Response File: Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
This paper focuses on six degrees-of-freedom (6-DOF) ship motion prediction and systematically compares the performance of RNN, LSTM, GRU, and Bi-LSTM deep learning models under single and multiple environmental conditions. It also proposes the introduction of a peak-matching evaluation metric. The research topic has high engineering application value, and the overall structure of the paper is relatively complete. However, improvements are still needed in terms of research logic, methodological details, and depth of results analysis. Specific comments are as follows:
- Although the Introduction reviews the applications of RNN, LSTM, GRU, and Bi-LSTM in time-series prediction, it lacks emphasis on the innovative contributions of this study. For example, after introducing the “peak-matching” metric, the authors did not explicitly explain its unique value and application scenarios compared with existing metrics. It is recommended that the Introduction or Methodology section clearly highlight the contributions of this paper: (i) the introduction of a peak-matching metric to capture extreme ship motions; (ii) comparative analysis of different deep learning models under multiple sea states; and (iii) clarification of distinctions from existing studies.
- The authors trained and tested their models solely on CFD simulation data of the KCS hull form, without validation from full-scale ship trials or towing-tank experiments, limiting the generalizability of the models. Recommendation: include at least one set of publicly available real-world measurement data (such as the SeaTrial dataset) or explicitly state the limitations of using only simulation data in the discussion.
- The study lacks comparisons with benchmark models. Although the deep learning models are compared with one another, there is no comparison with traditional approaches such as ARIMA, Kalman filtering, or physics-based models (e.g., the MMG model), which makes it difficult to demonstrate the relative advantages of deep learning methods.
- The descriptions of the hyperparameter settings for each neural network (e.g., learning rate, number of hidden layers, time steps, batch size, activation functions) are incomplete. It is recommended to provide a table listing the model configurations and to present the mathematical formula or pseudocode for the peak-matching metric to enhance reproducibility.
- The results are largely limited to numerical comparisons and lack statistical significance tests (e.g., t-tests or ANOVA). Recommendation: analyze the causes of prediction accuracy differences under different sea states, such as the influence of wave spectrum characteristics, noise levels, and nonlinear disturbances.
- The Conclusion section mainly reiterates experimental results and fails to extract implications for ship motion prediction or navigation safety. It is recommended to add summaries of the “academic contribution” and “engineering significance,” such as the role of peak prediction in safety warnings for extreme ship motions. The authors should also highlight future research directions, such as combining physical models (MMG, CFD) with deep learning for hybrid modeling.
- To strengthen the academic depth of the paper, it is suggested that the authors add the following categories of references:
(1) Motion prediction
- Teodoro, M. F., Pereira, C., Henriques, P., & Canas, A. (2024). Prediction of ship movement using a Kalman filter algorithm. Advances in Science and Technology, 144, 93–100.
- Li, W., Chen, Y., & Pan, Y. (2025). Urban signalized intersection traffic state prediction: a spatial-temporal graph model integrating the cell transmission model and transformer.
(2) Deep learning for motion prediction
- Zaman, U., Khan, J., Hussain, S., Balobaid, A. S., & Aburasain, R. Y. (2024). An Efficient Long Short-Term Memory and Gated Recurrent Unit Based Smart Vessel Trajectory Prediction Using Automatic Identification System Data. Computers, Materials & Continua, 81(1).
- Pan, Y. A., Li, F., Li, A., Niu, Z., & Liu, Z. (2025). Urban intersection traffic flow prediction: A physics-guided stepwise framework utilizing spatio-temporal graph neural network algorithms. Multimodal Transportation, 4(2), 100207.
(3) Hybrid methods combining physical and data-driven models
- Coraddu, A., Oneto, L., Cipollini, F., Kalikatzarakis, M., Meijn, G. J., & Geertsma, R. (2022). Physical, data-driven and hybrid approaches to model engine exhaust gas temperatures in operational conditions. Ships and Offshore Structures, 17(6), 1360–1381.
- Schirmann, M. L., Gose, J. W., & Collette, M. D. (2023). A comparison of physics-informed data-driven modeling architectures for ship motion predictions. Ocean Engineering, 286, 115608.
- Pan, Y. A., Guo, J., Chen, Y., Cheng, Q., Li, W., & Liu, Y. (2024). A fundamental diagram based hybrid framework for traffic flow estimation and prediction by combining a Markovian model with deep learning. Expert Systems with Applications, 238, 122219.
- Pan, Y. A., Guo, J., Chen, Y., Li, S., & Li, W. (2022). Incorporating traffic flow model into a deep learning method for traffic state estimation: a hybrid stepwise modeling framework. Journal of Advanced Transportation, 2022(1), 5926663.
Overall, the research idea of this paper is reasonable, but it lacks comparisons with traditional models and validation with empirical data. Methodological descriptions are not sufficiently transparent, and the results analysis lacks depth. It is recommended that the authors supplement comparisons with baseline models, improve the description of methodological details, add statistical tests, and cite the above references to enhance the scholarly rigor of the paper. The Conclusion should also be elevated from the perspectives of “engineering significance” and “academic contribution.”
Author Response
Response to the Reviewer #4
Title: Study of RNN-Based Deep Learning Models for Practical 6-DOF Ship Motion Prediction
Authors: HaEun Lee, Yangjun Ahn
All your comments are highly appreciated. Your comments have been extremely valuable in helping us refine the manuscript and strengthen its contributions. The manuscript has been revised based on your valuable comments, and our replies to your comments are provided below.
- Although the Introduction reviews the applications of RNN, LSTM, GRU, and Bi-LSTM in time-series prediction, it lacks emphasis on the innovative contributions of this study. For example, after introducing the “peak-matching” metric, the authors did not explicitly explain its unique value and application scenarios compared with existing metrics. It is recommended that the Introduction or Methodology section clearly highlight the contributions of this paper: (i) the introduction of a peak-matching metric to capture extreme ship motions; (ii) comparative analysis of different deep learning models under multiple sea states; and (iii) clarification of distinctions from existing studies.
- The authors sincerely appreciate the reviewer’s insightful comment on the need to emphasize the innovative contributions of this study.
- We fully agree that the contributions should be more clearly highlighted, particularly regarding the role and unique value of the Peak Matching metric, the comparative analysis across multiple sea states, and the distinctions from prior studies.
- In response, we have revised the INTRODUCTION (p. 3) to explicitly summarize the key contributions in bullet-point format. These now include:
(INTRODUCTION)
Contributions of this paper can be summarized as follows:
- Unified baseline for 6-DOF prediction. Four representative RNN-based models were systematically compared under consistent experimental conditions, establishing a practical baseline for ship motion prediction across maritime environments.
- Basic RNN model design insights. Practical guidelines on sequence length, downsampling strategy, input dimensionality, and model capacity were derived using basic PyTorch libraries, highlighting conditions under which RNN-based models can achieve stable and reliable performance.
- Safety-relevant evaluation. Peak Matching and the Overestimation Ratio were proposed as complementary evaluation metrics to traditional error measures, emphasizing the importance of extrema prediction for safety-critical maritime operations.
- The authors trained and tested their models solely on CFD simulation data of the KCS hull form, without validation from full-scale ship trials or towing-tank experiments, limiting the generalizability of the models. Recommendation: include at least one set of publicly available real-world measurement data (such as the SeaTrial dataset) or explicitly state the limitations of using only simulation data in the discussion.
- The authors sincerely appreciate the reviewer’s valuable comment.
- We fully acknowledge that relying solely on CFD simulation data of the KCS hull form is a limitation of the present study, as it constrains the generalizability of the results to real-world maritime conditions. While this work intentionally focused on controlled simulation settings to enable systematic and reproducible comparisons across models, we agree that validation with real-world measurement data (e.g., sea trial or towing-tank datasets) is essential to confirm the robustness and applicability of the proposed approaches.
- The authors have revised the DISCUSSION section by emphasizing limitations and adding future work at the end.
(DISCUSSION)
Several limitations of this study should be acknowledged. First, the experiments were conducted on a simulated dataset of a specific vessel (KCS model) under nine sea states. Although this setting considers a range of wave environments, it inherently limits the generalizability of the findings to different vessel types and real-world maritime conditions. Therefore, future validation with measurement data from sea trials or operational environments will be essential to confirm the robustness of the models. Second,…
- The study lacks comparisons with benchmark models. Although the deep learning models are compared with one another, there is no comparison with traditional approaches such as ARIMA, Kalman filtering, or physics-based models (e.g., the MMG model), which makes it difficult to demonstrate the relative advantages of deep learning methods.
- We sincerely thank the reviewer for this insightful suggestion.
- We fully acknowledge that the present study does not include comparisons with traditional benchmark models such as ARIMA, Kalman filtering, or physics-based models (e.g., the MMG model). Our study intentionally focused on RNN-based architectures to establish a systematic and controlled baseline among widely used deep learning variants, which can be readily reproduced using standard libraries such as PyTorch.
- Nevertheless, we agree that incorporating traditional statistical and physics-based approaches would further contextualize the relative advantages and limitations of deep learning methods. To address this point, we have revised the DISCUSSION (p. 24) to explicitly acknowledge this limitation and to highlight integration with such benchmarks as an important direction for future work.
- (DISCUSSION)
… Second, the scope of model architectures was deliberately limited to RNN-based variants to establish a clear and consistent comparative baseline. This choice reflects the recent surge of research activity in artificial intelligence–driven approaches, and the study focused exclusively on fundamental RNN-based architectures. Traditional methods such as ARIMA, Kalman filtering, and physics-based models were not considered within the present scope. Nonetheless, emerging techniques—such as attention mechanisms, Transformer-based networks, and physics-informed neural architectures—may provide enhanced capabilities for capturing complex nonlinear vessel–wave interactions. Third,…
- The descriptions of the hyperparameter settings for each neural network (e.g., learning rate, number of hidden layers, time steps, batch size, activation functions) are incomplete. It is recommended to provide a table listing the model configurations and to present the mathematical formula or pseudocode for the peak-matching metric to enhance reproducibility.
- We sincerely thank the reviewer for this constructive suggestion.
- We acknowledge that the descriptions of hyperparameter settings and the formalization of the Peak Matching metric were not sufficiently detailed in the original submission.
- To address this, we have expanded Section 4.2.2 TRAINING to clearly describe the hyperparameter configurations used across all experiments, including hidden units, number of layers, batch size, learning rate schedules, and training strategies. We now provide Table 4, which summarizes the baseline hyperparameter settings for all models to ensure transparency and reproducibility.
- Also, in Section 4.3 EVALUATION: METRICS AND PEAK MATCHING ALGORITHM, we have included a formal mathematical definition of the Peak Matching (PM) score (Equation 18). This explicitly defines how the proportion of correctly matched peaks is calculated within a tolerance margin ϵ, which is determined in a data-driven manner from the error distribution. This addition clarifies the reproducibility and unique role of the metric in safety-critical ship motion prediction.
- We believe these revisions address the reviewer’s concern by providing greater methodological clarity, ensuring reproducibility, and highlighting the distinct contribution of the proposed evaluation metric.
- (4.2.2. TRAINING)
All experiments were conducted on a workstation with an NVIDIA GeForce RTX 3090 GPU. On average, training time per epoch was approximately 1.7 s (RNN), 2.0 s (LSTM), 1.9 s (GRU), and 2.6 s (Bi-LSTM), confirming that training was computationally feasible under this setting.
In addition, the core hyperparameters—including hidden units, number of layers, batch size, and input sequence configurations—were kept consistent across experiments. These settings are summarized in Table 4 to ensure reproducibility and provide a clear reference for model configurations. The number of layers was set to a minimum of 2 and increased stepwise (up to 20 for Bi-LSTM) to evaluate performance scalability depending on problem complexity. Similarly, the batch size was set to 64 as the most stable configuration, with additional tests conducted using larger batch sizes for sensitivity analysis.
Mean Squared Error (MSE) was used as the loss function with the Adam optimizer. Early stopping was implemented to prevent overfitting and enhance generalization, terminating training when validation loss showed no improvement for a specified number of epochs. The initial learning rate was set to 0.01, with adaptive scheduling applied to reduce the rate by a factor of 0.5 when validation performance plateaued, down to a minimum of 1e-6 for stability and convergence performance.
All models were trained under identical preprocessing conditions: Gaussian normalization-based standardization and downsampling intervals n={1,2,5,10}. Performance evaluation was conducted using quantitative metrics, including MSE, MAE, and peak-matching-based indicators. This experimental design enabled fair and consistent performance comparison between architectures.
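The learning-rate reduction and early-stopping logic described above can be illustrated with a small framework-free sketch. The initial rate (0.01), reduction factor (0.5), and floor (1e-6) come from the text; the two patience values and the function name `run_schedule` are assumptions, since the excerpt only mentions "a specified number of epochs". In PyTorch, comparable behavior is commonly obtained with `torch.optim.lr_scheduler.ReduceLROnPlateau`, but the excerpt does not state which scheduler was used, and the counting rule below is a simplification rather than a reproduction of that class.

```python
def run_schedule(val_losses, lr=0.01, factor=0.5, min_lr=1e-6,
                 lr_patience=2, stop_patience=5):
    """Replay a validation-loss history.

    Returns (learning rate used after each epoch, epoch index at which
    early stopping fires, or None if it never fires).
    lr_patience and stop_patience are illustrative assumptions.
    """
    best = float("inf")
    since_best = 0  # epochs since the last improvement (early stopping)
    since_cut = 0   # non-improving epochs since the last LR reduction
    lrs = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            since_best = 0
            since_cut = 0
        else:
            since_best += 1
            since_cut += 1
            if since_cut >= lr_patience:
                # Halve the rate on a plateau, but never go below min_lr.
                lr = max(lr * factor, min_lr)
                since_cut = 0
        lrs.append(lr)
        if since_best >= stop_patience:
            return lrs, epoch
    return lrs, None


history = [1.0, 0.9, 0.95, 0.96, 0.97, 0.98, 0.99, 1.0]
rates, stop_epoch = run_schedule(history)
print(rates, stop_epoch)
```

With this synthetic history the rate is halved twice before training stops, which mirrors the intent of the described setup: decay the step size while validation loss stalls, then terminate once no improvement is seen for long enough.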
Table 4. Summary of the baseline hyperparameter settings for all models.
Model | Sequence Length | Downsampling | Hidden Units | Layers | Batch Size | Learning Rate (initial → minimum) | Loss Function
RNN | {50, 100, 200, 300, 400, 500} | {1,2,5,10} | 64 | 2–10 | ≥64 | 0.01 → 1e-6 | MSE
LSTM | {50, 100, 200, 300, 400, 500} | {1,2,5,10} | 64 | 2–10 | ≥64 | 0.01 → 1e-6 | MSE
GRU | {50, 100, 200, 300, 400, 500} | {1,2,5,10} | 64 | 2–10 | ≥64 | 0.01 → 1e-6 | MSE
Bi-LSTM | {50, 100, 200, 300, 400, 500} | {1,2,5,10} | 64 | 2–20 | ≥64 | 0.01 → 1e-6 | MSE
- (4.3. EVALUATION: METRICS AND PEAK MATCHING ALGORITHM)
Formally, the Peak Matching score is defined as:
$$\mathrm{PM} = \frac{1}{N}\sum_{i=1}^{N}\mathbf{1}\left(\left|y_i - \hat{y}_i\right| \le \epsilon\right) \qquad (18)$$
where $y_i$ denotes the actual peak value, $\hat{y}_i$ represents the corresponding predicted peak value, ϵ is the allowable error tolerance, and $N$ is the total number of successfully matched peaks. The indicator function 1(⋅) equals 1 if the condition is satisfied and 0 otherwise. The tolerance ϵ was determined from the distribution of validation peak errors (e.g., percentile- or MAD/IQR-based thresholds), ensuring robustness against heavy-tailed noise and avoiding arbitrary hand-tuning. This formulation quantifies the proportion of correctly matched peaks within a predefined tolerance, complementing conventional error metrics (MSE, MAE) by focusing on local extrema that are critical in safety-sensitive ship motion prediction.
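As a rough illustration of the Peak Matching definition, the proportion of paired peaks whose absolute error falls within ϵ can be computed as below. This is a sketch under stated assumptions: the peaks are taken to be already detected and paired one-to-one, and the nearest-rank percentile rule in `percentile_tolerance` is only one possible reading of the "percentile-based threshold" mentioned in the text. Both function names are hypothetical.

```python
import math


def peak_matching_score(actual_peaks, predicted_peaks, eps):
    """Fraction of paired peaks with |actual - predicted| <= eps.

    Assumes the two sequences are already paired one-to-one.
    """
    if len(actual_peaks) != len(predicted_peaks):
        raise ValueError("peak sequences must have the same length")
    if not actual_peaks:
        raise ValueError("at least one peak pair is required")
    hits = sum(1 for y, y_hat in zip(actual_peaks, predicted_peaks)
               if abs(y - y_hat) <= eps)
    return hits / len(actual_peaks)


def percentile_tolerance(abs_errors, q=0.9):
    """Nearest-rank q-quantile of absolute peak errors (assumed rule)."""
    if not abs_errors:
        raise ValueError("at least one error value is required")
    ordered = sorted(abs_errors)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


actual = [1.0, 2.0, 1.5, 0.8]
predicted = [1.1, 1.5, 1.45, 0.8]
print(peak_matching_score(actual, predicted, eps=0.2))  # 0.75
```

In this toy case three of the four peak pairs lie within the tolerance, so the score is 0.75. In practice ϵ would be derived from validation errors rather than from the data being scored.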
- The results are largely limited to numerical comparisons and lack statistical significance tests (e.g., t-tests or ANOVA). Recommendation: analyze the causes of prediction accuracy differences under different sea states, such as the influence of wave spectrum characteristics, noise levels, and nonlinear disturbances.
- We sincerely thank the reviewer for this insightful suggestion regarding statistical significance tests.
- In response, we have incorporated paired t-tests into the Results section to quantitatively evaluate whether predicted peak values differ significantly from the ground truth.
- (5.2.3 SENSITIVITY TO DOWN-SAMPLING – p.20)
As shown in Table 6, a paired t-test between ground truth and predicted peak values under the optimal downsampling condition (n=5) yielded a non-significant result (t=0.363, p=0.717 > 0.05). This indicates no statistically significant difference between the mean predicted and actual peak values, which is consistent with the model reproducing peak dynamics without systematic bias.
Table 6. t-test results comparing ground truth and predicted peak values (Bi-LSTM, n=5)
Statistic | Value
Mean (Ground Truth) | 0.0057
Mean (Prediction) | -0.0106
T-Statistic | 0.3629
P-value | 0.7169
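For readers who want to see how a paired t-statistic of this kind is obtained, a minimal standard-library sketch follows. The function name `paired_t_statistic` and the sample values are illustrative only and are not the study's data. The p-value additionally requires the Student t distribution with n − 1 degrees of freedom, for which a routine such as `scipy.stats.ttest_rel` could be used; the excerpt does not say which tool the authors used.

```python
import math
import statistics


def paired_t_statistic(truth, pred):
    """Return (t, degrees of freedom) for a paired t-test.

    t = mean(d) / (stdev(d) / sqrt(n)), with d_i = truth_i - pred_i.
    """
    if len(truth) != len(pred):
        raise ValueError("samples must be paired")
    diffs = [a - b for a, b in zip(truth, pred)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_d == 0:
        raise ValueError("differences have zero variance")
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Illustrative values only (not taken from the manuscript).
truth_peaks = [1.0, 2.0, 3.0, 4.0]
pred_peaks = [0.5, 2.5, 2.0, 4.0]
t_stat, dof = paired_t_statistic(truth_peaks, pred_peaks)
print(round(t_stat, 4), dof)
```

A small |t| means the mean of the paired differences is small relative to its standard error. It does not by itself show that individual peaks are predicted accurately, which is why the test complements rather than replaces the Peak Matching score.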
- (5.2.4 EFFECT OF MODEL UPSIZING ON MULTI-ENVIRONMENT GENERALIZATION - p.22)
As shown in Table 7, the enlarged Bi-LSTM architecture (15 layers, 512 nodes) yielded a paired t-test result that was statistically non-significant (t = 0.216, p = 0.829 > 0.05), consistent with the smaller model in Table 6. This finding indicates that increasing model capacity does not introduce systematic bias, and peak predictions remain statistically aligned with the ground truth.
Table 7. t-test results comparing ground truth and predicted peak values (Bi-LSTM, n=5, layers = 15, nodes = 512)
Statistic | Value
Mean (Ground Truth) | 0.0078
Mean (Prediction) | -0.0002
T-Statistic | 0.2156
P-value | 0.8294
- The Conclusion section mainly reiterates experimental results and fails to extract implications for ship motion prediction or navigation safety. It is recommended to add summaries of the “academic contribution” and “engineering significance,” such as the role of peak prediction in safety warnings for extreme ship motions. The authors should also highlight future research directions, such as combining physical models (MMG, CFD) with deep learning for hybrid modeling.
- We sincerely thank the reviewer for this valuable suggestion.
- We agree that the original Conclusion section primarily reiterated experimental results and did not sufficiently emphasize the broader implications of our study. In the revised manuscript, we have restructured the Concluding Remarks (Section 7) to explicitly highlight the Academic contributions, Engineering significance, Key findings from experiments, and Future research directions. These changes ensure that the Conclusion extends beyond a summary of results and explicitly draws out academic and engineering contributions, aligning with the reviewer’s recommendations.
- (7. CONCLUDING REMARKS)
This study investigated neural network–based time series prediction models for ship motion forecasting under varying environmental and input configurations. Several academic contributions and engineering implications can be articulated based on a comprehensive set of experiments.
First, the study established the first unified baseline for full 6-DOF ship motion prediction by systematically comparing four representative RNN-based architectures (RNN, LSTM, GRU, Bi-LSTM) under both single- and multi-environmental conditions. In addition, two complementary evaluation metrics, Peak Matching and the Overestimation Ratio, were proposed to extend conventional error measures such as MSE and MAE by explicitly capturing the accuracy and tendencies of extrema prediction, which are critical in safety-sensitive maritime contexts. Furthermore, practical design guidelines were derived regarding sequence length, downsampling strategies, model capacity, and input dimensionality, providing actionable insights for selecting and configuring models in different operational scenarios.
From an engineering perspective, accurately predicting peak motions directly affects navigation safety, including cargo handling, collision avoidance, helicopter operations, and structural load management. The proposed evaluation framework underscores the importance of conservative forecasting, where overestimation may be preferable to underestimation for risk mitigation. Moreover, the trade-offs identified in sequence length, downsampling, and model scaling offer practical references for implementing ship motion forecasting systems that balance accuracy, computational efficiency, and robustness.
The experimental results further highlight several key findings. Prediction horizons varied between approximately 40 seconds in single-environment settings and about 20 seconds in multi-environment settings, illustrating the increased complexity of generalized forecasting. Multi-DOF inputs generally enhanced prediction performance relative to single-DOF inputs, while including wave data yielded mixed results. The Bi-LSTM architecture demonstrated relatively stable performance across diverse conditions, and moderate downsampling (n=5) improved predictive accuracy. At the same time, model scaling exhibited diminishing returns beyond specific architecture sizes, with optimal configurations appearing to be task dependent.
Looking ahead, validation with full-scale measurement data will be essential to confirm model robustness in real-world applications. Hybrid modeling frameworks that combine physical models with data-driven learning offer a promising direction for better capturing nonlinear vessel–wave interactions. Furthermore, incorporating uncertainty quantification into predictive outputs could strengthen risk-informed decision-making in safety-critical maritime operations.
This study advances methodological and practical insights into ship motion forecasting. Emphasizing peak-based evaluation and systematically comparing RNN-based models under diverse conditions provides a rigorous foundation for future developments in hybrid, data-driven, and risk-aware prediction frameworks that enhance maritime safety and operational reliability.
To strengthen the academic depth of the paper, it is suggested that the authors add the following categories of references:
(1) Motion prediction
- Teodoro, M. F., Pereira, C., Henriques, P., & Canas, A. (2024). Prediction of ship movement using a Kalman filter algorithm. Advances in Science and Technology, 144, 93–100.
- Li, W., Chen, Y., & Pan, Y. (2025). Urban signalized intersection traffic state prediction: a spatial-temporal graph model integrating the cell transmission model and transformer.
(2) Deep learning for motion prediction
- Zaman, U., Khan, J., Hussain, S., Balobaid, A. S., & Aburasain, R. Y. (2024). An Efficient Long Short-Term Memory and Gated Recurrent Unit Based Smart Vessel Trajectory Prediction Using Automatic Identification System Data. Computers, Materials & Continua, 81(1).
- Pan, Y. A., Li, F., Li, A., Niu, Z., & Liu, Z. (2025). Urban intersection traffic flow prediction: A physics-guided stepwise framework utilizing spatio-temporal graph neural network algorithms. Multimodal Transportation, 4(2), 100207.
(3) Hybrid methods combining physical and data-driven models
- Coraddu, A., Oneto, L., Cipollini, F., Kalikatzarakis, M., Meijn, G. J., & Geertsma, R. (2022). Physical, data-driven and hybrid approaches to model engine exhaust gas temperatures in operational conditions. Ships and Offshore Structures, 17(6), 1360–1381.
- Schirmann, M. L., Gose, J. W., & Collette, M. D. (2023). A comparison of physics-informed data-driven modeling architectures for ship motion predictions. Ocean Engineering, 286, 115608.
- Pan, Y. A., Guo, J., Chen, Y., Cheng, Q., Li, W., & Liu, Y. (2024). A fundamental diagram based hybrid framework for traffic flow estimation and prediction by combining a Markovian model with deep learning. Expert Systems with Applications, 238, 122219.
- Pan, Y. A., Guo, J., Chen, Y., Li, S., & Li, W. (2022). Incorporating traffic flow model into a deep learning method for traffic state estimation: a hybrid stepwise modeling framework. Journal of Advanced Transportation, 2022(1), 5926663.
- We sincerely thank the reviewer for this valuable suggestion.
- Following the recommendation, we have incorporated the suggested references into the Introduction (p. 2) to provide a more comprehensive overview of existing work in motion prediction, deep learning for motion prediction, and hybrid modeling approaches. Specifically:
Motion prediction: We added references validating Kalman filter–based approaches for ship-movement prediction using AIS data (Teodoro et al. [57]) and extended the discussion to highlight the relevance of traffic state prediction models based on cell transmission and transformer architectures (Li et al. [58]).
Deep learning for motion prediction: We included studies on AIS-based vessel trajectory forecasting using LSTM/GRU (Zaman et al. [59]) and traffic flow prediction with physics-guided spatio-temporal GNNs (Pan et al. [60]), situating our work within broader deep learning applications.
Hybrid methods: We expanded the introduction to cover hybrid models that combine physical knowledge with data-driven learning. This includes ship engine exhaust temperature modeling (Coraddu et al. [61]), physics-informed vs. data-driven architectures for ship motions (Schirmann et al. [62]), and hybrid traffic flow estimation frameworks integrating fundamental diagrams, Markovian dynamics, and deep learning (Pan et al. [63,64]).
- (INTRODUCTION)
Accurate prediction of wave-induced ship motions has long been recognized as a fundamental challenge in marine engineering and autonomous navigation system development [1,2,3]. Ships exhibit complex six degrees-of-freedom (6-DOF: surge, sway, heave, roll, pitch, yaw) dynamics under varying sea conditions and operational speeds [4]. Particularly in challenging sea conditions with high wave variability, precise prediction of dynamic responses to wave excitation plays a crucial role in ensuring autonomous system stability and expanding operational envelopes [5]. Consequently, extensive research has been directed toward developing methods for predicting short-term ship motion responses faster than in real time.
Early research primarily focused on numerical analysis-based models such as Kalman filters [6], followed by the adoption of statistical time series models based on Auto-Regression (AR) and Auto-Correlation Function (ACF) for predicting roll, pitch, and heave motions [7,8]. While these models offer advantages in terms of simplicity and interpretability, they face significant limitations in adequately capturing the highly nonlinear characteristics of ship motions [9]. Recent work has also validated Kalman filtering for ship-movement prediction using operational AIS-type inputs, underscoring the practicality of classical estimators in maritime settings [10]. Quaternion-based Kalman filtering has also been explored for distributed orientation estimation [11]. Such methods are primarily suited for sensor fusion and attitude tracking. Traditional AR models depend on predefined parameters and exhibit poor adaptability to diverse operational scenarios, particularly when dealing with non-stationary ship motion time series [12]. Furthermore, ACF-based methods suffer from significant lag time errors in instantaneous sample calculations, limiting their real-time prediction capabilities [13].
In recent years, deep learning-based time series prediction models have been increasingly adopted to address these limitations [14,15]. RNN-based models, particularly LSTM and GRU architectures, have demonstrated superior capability in learning complex ship dynamics and promise for full 6-DOF prediction [16,17]. For maritime trajectory prediction using AIS data, recurrent models (LSTM/GRU) outperform classical baselines when combined with rigorous preprocessing and spatiotemporal handling [18]. Various approaches have been explored to enhance prediction accuracy and generalization performance, including encoder-decoder structures with attention mechanisms [19], input vector optimization, wavelet-based multi-scale processing [20], and hybrid architectures incorporating CNN components [21]. Beyond maritime applications, physics-guided GNNs and transformers improve forecasting by combining topology, signals, and temporal embeddings [22,23], suggesting that such designs could also advance maritime motion prediction.
Recent studies have incorporated advanced optimization strategies to improve model performance. Particle Swarm Optimization (PSO) algorithms have been successfully applied to optimize bidirectional LSTM networks for ship motion attitude prediction, demonstrating enhanced accuracy compared to conventional approaches [24]. Similarly, Binary System Optimization (BSO) algorithms have optimized complex hybrid architectures combining Temporal Convolutional Networks with BiGRU and attention mechanisms [25]. These optimization strategies have shown meaningful progress in real-time capability and prediction accuracy through systematic hyperparameter tuning and architectural refinement. Beyond pure deep models, hybrid approaches combining physics and learning are increasingly applied in transport and marine domains [26,27]. For ship motions, comparisons of physics-informed and data-driven models reveal an accuracy–interpretability trade-off, underscoring the need for clear baselines [28,29].
- Also, updated references are listed below:
References
- Perera, L.P.; Oliveira, P.; Soares, C.G. Maritime Traffic Monitoring Based on Vessel Detection, Tracking, State Estimation, and Trajectory Prediction. IEEE Trans. Intell. Transp. Syst. 2012, 13, 1188–1200. https://doi.org/10.1109/TITS.2012.2187282
- Yumori, I. Real Time Prediction of Ship Response to Ocean Waves Using Time Series Analysis. In Proceedings of the OCEANS 81, Boston, MA, USA, 15–18 September 1981; 1082–1089. https://doi.org/10.1109/OCEANS.1981.1151574
- Zhao, X.; Xu, R.; Kwan, C. Ship-Motion Prediction: Algorithms and Simulation Results. In Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2004), Montreal, QC, Canada, 17–21 May 2004; V-125. https://doi.org/10.1109/ICASSP.2004.1327063
- Sandaruwan, D.; Kodikara, N.; Rosa, R.; Rupasinghe, N.; Tharaka, K. Modeling and Simulation of Environmental Disturbances for Six Degrees of Freedom Ocean Surface Vehicle. Sri Lankan J. Phys. 2009, 10, 147–157. https://doi.org/10.4038/sljp.v10i0.3834
- Brodtkorb, A.H.; Nielsen, U.D.; Sørensen, A.J. Sea State Estimation Using Vessel Response in Dynamic Positioning. Appl. Ocean Res. 2018, 70, 76–86. https://doi.org/10.1016/j.apor.2017.09.005
- He, W.; Zhong, C.; Sotelo, M.A.; Chu, X.; Liu, X.; Li, Z. Short-Term Vessel Traffic Flow Forecasting by Using an Improved Kalman Model. Cluster Comput. 2019, 22, 7907–7918. https://doi.org/10.1007/s10586-017-1491-2
- Takami, T.; Nielsen, U.D.; Jensen, J.J. Real-Time Deterministic Prediction of Wave-Induced Ship Responses Based on Short-Time Measurements. Ocean Eng. 2021, 221, 108503. https://doi.org/10.1016/j.oceaneng.2020.108503
- Zafeiraki, M. A Comparison of ARIMA and SVR in Short-Term Ship Motion Prediction. Master's Thesis, Utrecht University, Utrecht, The Netherlands, 2022. https://doi.org/10.33612/diss.118849326
- Li, G.; Kawan, B.; Wang, H.; Zhang, H. Neural-Network-Based Modelling and Analysis for Time Series Prediction of Ship Motion. Ship Technol. Res. 2017, 64, 30–39. https://doi.org/10.1080/09377255.2017.1309786
- Teodoro, M.F.; Pereira, C.; Henriques, P.; Canas, A. Prediction of Ship Movement Using a Kalman Filter Algorithm. Proceedings of the 6th International Conference on Numerical Modelling in Engineering, Trans Tech Publications Ltd.: 27 March 2024. https://doi.org/10.4028/p-ipm9w5
- Talebi, P.; Werner, S.; Mandic, D.P. Quaternion-Valued Distributed Filtering and Control. IEEE Trans. Autom. Control 2020, 65, 4246–4257. https://doi.org/10.1109/TAC.2020.3007332
- Wang, J.; Zhou, Y.; Zhuang, L.; Shi, L.; Zhang, P. A Model of Maritime Accidents Prediction Based on Multi-Factor Time Series Analysis. J. Mar. Eng. Technol. 2023, 22, 89–102. https://doi.org/10.1080/20464177.2023.2167269
- Capobianco, S.; Millefiori, L.M.; Forti, N.; Braca, P.; Willett, P. Deep Learning Methods for Vessel Trajectory Prediction Based on Recurrent Neural Networks. IEEE Aerosp. Electron. Syst. Mag. 2021, 36, 22–31. https://doi.org/10.1109/MAES.2021.3096591
- Zhang, M.; Taimuri, G.; Zhang, J.; Hirdaris, S. A Deep Learning Method for the Prediction of 6-DOF Ship Motions in Real Conditions. Proc. Inst. Mech. Eng. Part M J. Eng. Marit. Environ. 2023, 237, 887–905. https://doi.org/10.1177/14750902231157852
- Hu, X.; Zhang, B.; Tang, G. Research on Ship Motion Prediction Algorithm Based on Dual-Pass Long Short-Term Memory Neural Network. IEEE Access 2021, 9, 4543–4552. https://doi.org/10.1109/ACCESS.2020.3047893
- del Águila Ferrandis, J.; Triantafyllou, M.S.; Chryssostomidis, C.; Karniadakis, G.E. Learning Functionals via LSTM Neural Networks for Predicting Vessel Dynamics in Extreme Sea States. Proc. R. Soc. A 2021, 477, 20190897. https://doi.org/10.1098/rspa.2019.0897
- Wang, Y.; Zhang, B.; Ding, F.; Ren, H. Estimating Dynamic Motion Parameters with an Improved Wavelet Thresholding and Inter-Scale Correlation. IEEE Access 2018, 6, 36475–36487. https://doi.org/10.1109/ACCESS.2018.2850915
- Zaman, U.; Khan, J.; Lee, E.; Hussain, S.; Balobaid, A.S.; Aburasain, R.Y. An Efficient Long Short-Term Memory and Gated Recurrent Unit Based Smart Vessel Trajectory Prediction Using Automatic Identification System Data. Comput. Mater. Contin. 2024, 81(1). DOI: not available
- Qiang, H.; Guo, Z.; Xie, S.; Peng, X. MSTformer: Motion Inspired Spatial-Temporal Transformer with Dynamic-Aware Attention for Long-Term Vessel Trajectory Prediction. arXiv preprint 2023, arXiv:2303.11540. https://doi.org/10.48550/arXiv.2303.11540
- Zhang, G.; Tan, F.; Wu, Y. Ship Motion Attitude Prediction Based on an Adaptive Dynamic Particle Swarm Optimization Algorithm and Bidirectional LSTM Neural Network. IEEE Access 2020, 8, 90087–90098. https://doi.org/10.1109/ACCESS.2020.2993909
- Wang, H.; Yin, J.; Wang, N.; Wang, L. A Multi-Dimensional Data-Driven Ship Roll Prediction Model Based on VMD-PCA and IDBO-TCN-BiGRU-Attention. Front. Mar. Sci. 2025, 12, 1547933. https://doi.org/10.3389/fmars.2025.1547933
- Li, A.; Xu, Z.; Li, W.; Chen, Y.; Pan, Y. Urban Signalized Intersection Traffic State Prediction: A Spatial-Temporal Graph Model Integrating the Cell Transmission Model and Transformer. SSRN, 2025. https://doi.org/10.2139/ssrn.5189471
- Pan, Y.A.; Li, F.; Li, A.; Niu, Z.; Liu, Z. Urban Intersection Traffic Flow Prediction: A Physics-Guided Stepwise Framework Utilizing Spatio-Temporal Graph Neural Network Algorithms. Multimodal Transp. 2025, 4(2), 100207. https://doi.org/10.1016/j.mtrs.2025.100207.
- Hygen, J.E. Deterministic Response Prediction of Wave-Induced Vessel Motions. Master's Thesis, NTNU, Trondheim, Norway, 2023. https://hdl.handle.net/11250/3095040
- Liu, S.; Xu, R.; Papanikolaou, A. Prediction of the Motion of a Ship in Regular Head Waves Using Artificial Neural Networks. In Proc. of the ISOPE Int. Ocean and Polar Eng. Conf., Rhodes, Greece, 20–25 June 2021; pp. 464–470. https://doi.org/10.5957/IJOPE-2021-th12
- Coraddu, A.; Oneto, L.; Cipollini, F.; Kalikatzarakis, M.; Meijn, G.J.; Geertsma, R. Physical, Data-Driven and Hybrid Approaches to Model Engine Exhaust Gas Temperatures in Operational Conditions. Ships Offshore Struct. 2022, 17(6), 1360–1381. https://doi.org/10.1080/17445302.2021.1939989.
- Schirmann, M.L.; Gose, J.W.; Collette, M.D. A Comparison of Physics-Informed Data-Driven Modeling Architectures for Ship Motion Predictions. Ocean Eng. 2023, 286, 115608. https://doi.org/10.1016/j.oceaneng.2023.115608.
- Pan, Y.A.; Guo, J.; Chen, Y.; Cheng, Q.; Li, W.; Liu, Y. A Fundamental Diagram Based Hybrid Framework for Traffic Flow Estimation and Prediction by Combining a Markovian Model with Deep Learning. Expert Syst. Appl. 2024, 238, 122219. https://doi.org/10.1016/j.eswa.2023.122219
- Pan, Y.A.; Guo, J.; Chen, Y.; Li, S.; Li, W. Incorporating Traffic Flow Model into a Deep Learning Method for Traffic State Estimation: A Hybrid Stepwise Modeling Framework. J. Adv. Transp. 2022, 2022, 5926663. https://doi.org/10.1155/2022/5926663
- Han, P.; Li, G.; Cheng, X.; Skjong, S.; Merz, M.; Æsøy, V.; Zhang, H. An Uncertainty-Aware Hybrid Approach for Sea State Estimation Using Ship Motion Responses. IEEE Trans. Ind. Inform. 2021, 17, 5582–5592. https://doi.org/10.1109/TII.2020.3003095
- Duan, W.; Huang, L.; Han, Y.; Guo, W.; Liu, Y. A Hybrid AR-EMD-SVR Model for the Short-Term Prediction of Nonlinear and Non-Stationary Ship Motion. J. Zhejiang Univ. Sci. A 2015, 16, 562–576. https://doi.org/10.1631/jzus.A1500040
- Nielsen, U.D.; Jensen, J.J. Deterministic Predictions of Vessel Responses Based on Past Measurements. In Proc. of the 27th Int. Ocean and Polar Eng. Conf. (ISOPE 2017), San Francisco, CA, USA, 25–30 June 2017. https://doi.org/10.5957/IJOPE-2017-TG034
- Nielsen, U.D.; Brodtkorb, A.H.; Jensen, J.J. Response Predictions Using the Observed Autocorrelation Function. Mar. Struct. 2018, 58, 31–52. https://doi.org/10.1016/j.marstruc.2017.10.012
- Jiang, H.; Duan, S.; Huang, L.; Han, Y.; Yang, H.; Ma, Q. Scale Effects in AR Model Real-Time Ship Motion Prediction. Ocean Eng. 2020, 203, 107202. https://doi.org/10.1016/j.oceaneng.2020.107202
- Zhang, K.; Huang, L.; He, Y.; Wang, B.; Chen, J.; Tian, Y.; Zhao, X. A Real-Time Multi-Ship Collision Avoidance Decision-Making System for Autonomous Ships Considering Ship Motion Uncertainty. Ocean Eng. 2023, 278, 114205. https://doi.org/10.1016/j.oceaneng.2023.114205
- Yin, J.; Zou, Z.; Xu, F. On-Line Prediction of Ship Roll Motion during Maneuvering Using Sequential Learning RBF Neural Networks. Ocean Eng. 2013, 61, 139–147. https://doi.org/10.1016/j.oceaneng.2013.01.005
- Zhang, W.; Liu, Z. Real-Time Ship Motion Prediction Based on Time Delay Wavelet Neural Network. J. Appl. Math. 2014, 176297. https://doi.org/10.1155/2014/176297
- Li, X.; Lv, X.; Yu, J.; Li, J. Neural Network Application on Ship Motion Prediction. In Proc. of the 2017 9th Int. Conf. on Intelligent Human-Machine Systems and Cybernetics (IHMSC), Hangzhou, China, 26–27 August 2017; 414–417. https://doi.org/10.1109/IHMSC.2017.101
- Skulstad, R.; Li, G.; Fossen, T.I.; Wang, T.; Zhang, H. A Cooperative Hybrid Model for Ship Motion Prediction. Model. Ident. Control 2021, 42, 17–26. https://doi.org/10.4173/mic.2021.1.2
- Silva, K.M.; Maki, K.J. Data-Driven System Identification of 6-DOF Ship Motion in Waves with Neural Networks. Appl. Ocean Res. 2022, 125, 103222. https://doi.org/10.1016/j.apor.2022.103222
- Lee, J.H.; Lee, J.; Kim, Y.; Ahn, Y. Application of Machine Learning for Prediction of Wave-Induced Ship Motion. Int. J. Offshore Polar Eng. 2023, 33, 164–173. https://doi.org/10.17736/ijope.2023.jc881
- Tian, X.; Song, Y. Machine Learning for Short-Term Prediction of Ship Motion Combined with Wave Input. Appl. Sci. 2023, 13, 5298. https://doi.org/10.3390/app13095298
- Lee, J.; Kim, Y.; Lee, J.H.; Ahn, Y. Prediction of Wave-Induced Nonlinear Ship Motions Based on an IRF-LSTM Hybrid Approach. Int. J. Offshore Polar Eng. 2024, 34, 164–173. https://doi.org/10.17736/ijope.2024.jc892
- D'Agostino, D.; Serani, A.; Stern, F.; Diez, M. Time-Series Forecasting for Ships Maneuvering in Waves via Recurrent-Type Neural Networks. J. Ocean Eng. Mar. Energy 2022, 8, 479–487. https://doi.org/10.1007/s40722-022-00255-w
- Liu, Y.; Duan, W.; Huang, L.; Duan, S.; Ma, X. The Input Vector Space Optimization for LSTM Deep Learning Model in Real-Time Prediction of Ship Motions. Ocean Eng. 2020, 213, 107681. https://doi.org/10.1016/j.oceaneng.2020.107681
- Zhou, T.; Yang, X.; Ren, H.; Li, C.; Han, J. The Prediction of Ship Motion Attitude in Seaway Based on BSO-VMD-GRU Combination Model. Ocean Eng. 2023, 288, 115977. https://doi.org/10.1016/j.oceaneng.2023.115977
- Zhang, T.; Zheng, X.Q.; Liu, M.X. Multiscale Attention-Based LSTM for Ship Motion Prediction. Ocean Eng. 2021, 230, 109066. https://doi.org/10.1016/j.oceaneng.2021.109066
- Gao, N.; Hu, A.; Hou, L.; Chang, X. Real-Time Ship Motion Prediction Based on Adaptive Wavelet Transform and Dynamic Neural Network. Ocean Eng. 2023, 280, 114466. https://doi.org/10.1016/j.oceaneng.2023.114466
- Gong, J.; Xu, J.; Xu, L.; Hong, Z. Enhancing Motion Forecasting of Ship Sailing in Irregular Waves Based on Optimized LSTM Model and Principal Component of Wave Height. Front. Mar. Sci. 2025, 12, 1497956. https://doi.org/10.3389/fmars.2025.1497956
- Xu, D.; Yin, J. An Enhanced Hybrid Scheme for Ship Roll Prediction Using Support Vector Regression and TVF-EMD. Ocean Eng. 2024, 307, 117951. https://doi.org/10.1016/j.oceaneng.2024.117951
- Mak, B.; Düz, B. Ship as a Wave Buoy: Estimating Relative Wave Direction from In-Service Ship Motion Measurements Using Machine Learning. In Proc. of the ASME 2019 38th Int. Conf. on Ocean, Offshore and Arctic Eng. (OMAE 2019), Glasgow, UK, 9–14 June 2019; Volume 9, V009T13A043. https://doi.org/10.1115/OMAE2019-96201
- Shi, W.; Guo, Z.; Chen, M.; Li, S.; Hu, J.; Dai, Z. Multi-Step Prediction of Ship Heave Motion Using Transformer-Enhanced Multi-Scale CNN. Measurement 2025, 242, 115787. https://doi.org/10.1016/j.measurement.2024.115787
- Zhang, B.; Wang, S.; Deng, L.; Jia, M.; Xu, J. Ship Motion Attitude Prediction Model Based on IWOA-TCN-Attention. Ocean Eng. 2023, 272, 113911. https://doi.org/10.1016/j.oceaneng.2023.113911
- Zhang, L.; Feng, X.; Wang, L.; Gong, B.; Ai, J. A Hybrid Ship-Motion Prediction Model Based on CNN–MRNN and IADPSO. Ocean Eng. 2024, 299, 117428. https://doi.org/10.1016/j.oceaneng.2024.117428
- Lee, J.H.; Lee, J.; Kim, Y.; Ahn, Y. Prediction of Wave-Induced Ship Motions Based on Integrated Neural Network System and Spatiotemporal Wave-Field Data. Phys. Fluids 2023, 35, 097109. https://doi.org/10.1063/5.0163795
- Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning Internal Representations by Error Propagation. Technical Report, Defense Technical Information Center (DTIC), 1985. https://doi.org/10.21236/ada164453
- Bengio, Y.; Simard, P.; Frasconi, P. Learning Long-Term Dependencies with Gradient Descent Is Difficult. IEEE Trans. Neural Netw. 1994, 5, 157–166. https://doi.org/10.1109/72.279181
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
- Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation. arXiv 2014, arXiv:1406.1078. https://arxiv.org/abs/1406.1078
- Schuster, M.; Paliwal, K.K. Bidirectional Recurrent Neural Networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681. https://doi.org/10.1109/78.650093
- Siami-Namini, S.; Tavakoli, N.; Namin, A.S. The Performance of LSTM and BiLSTM in Forecasting Time Series. In Proc. of the 2019 IEEE Int. Conf. on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019; pp. 3285–3292. https://doi.org/10.1109/BigData47090.2019.9005997
- Cummins, W.E. The Impulse Response Function and Ship Motions. Schiffstechnik 1962, 9, 101–109. DOI: not available
- Fonseca, N.; Soares, C.G. Time-Domain Analysis of Large-Amplitude Vertical Ship Motions and Wave Loads. J. Ship Res. 1998, 42(2), 139–153. DOI: not available
- Salvesen, N.; Tuck, E.O.; Faltinsen, O. Ship Motions and Sea Loads. Trans. Soc. Nav. Archit. Mar. Eng. 1970, 78, 250–279. DOI: not available
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
I have carefully reviewed the authors’ responses and the revised manuscript. The authors have addressed all of the reviewers’ comments thoroughly and constructively. Dataset limitations are now clearly acknowledged, the scope of architectures has been better contextualized, computational performance metrics have been added, the new Peak Matching metric is more clearly justified, and figure readability has been improved. In addition, the language has been polished throughout.
In my view, the revision satisfactorily resolves the issues raised in the first review round, and the manuscript is now suitable for publication.
Author Response
The authors sincerely appreciate your thorough review and positive evaluation of our revised manuscript. Thank you for your recommendation for publication.
Reviewer 2 Report
Comments and Suggestions for Authors
Thank you for the revised manuscript and the rebuttal.
I am happy with almost all the improvements and the status of the manuscript, yet I have one remaining request/suggestion/comment.
Regarding my comment:
9) In the experiments with all the 6-DOF predictions, such as Fig 10 - 14, please include and comment on the timewise plots of the prediction for the other dimensions too. Only heave is presented, which is a quite limiting demonstration of the prediction capabilities of a 6-DOF model.
The authors replied: " ... However, we will gladly provide DOF-specific timewise plots as supplementary material if the editor or reviewers consider it necessary."
My suggestion is that the plots with the rest of the DOFs are necessary, so yes please add them to the supplementary material. If it was me, I would add one example/case, in the actual manuscript, but I leave this up to the authors to decide.
Author Response
We sincerely appreciate your thorough review and evaluation of our revised manuscript. Please kindly refer to the attached response document for our detailed reply regarding your comment. The authors have addressed your suggestion accordingly, and the relevant explanations and supplementary additions are provided therein.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
Having considered the revised manuscript and author response, I am happy to recommend the manuscript for publication without further changes.
Author Response
The authors thank the reviewer for the kind and encouraging comments. The authors appreciate your positive assessment and recommendation for publication.
Reviewer 4 Report
Comments and Suggestions for Authors
All my questions have been well addressed. I believe the current version could be accepted by this journal. Thanks.
Author Response
Thank you for your time and valuable feedback. The authors are happy that the revised manuscript meets your expectations and appreciate your recommendation for acceptance.