This section reports the findings from experiments on technical modifications (Original, Paint, Advanced Propeller, Fin, Bulbous Bow, Combined) for reducing FC and CO2 emissions in a sample oil tanker (Appendix A, Table A1), and from the development of a reliable hybrid physics-based ML model for accurate predictions that is extensible to other oil tankers. The primary objectives were to identify the optimal scenario for FC and CO2 reduction and to develop an ML model that predicts accurately under various operational conditions. The experiments covered a speed range of 9 to 15 knots and involved physics-based calculations of FC and CO2 emissions (Appendix C) and training of ML models (SVR, GPR, RF, XGB, SNN) using an 80–20 train–test split, as described in Section 2.2.
The results were validated by comparing ML predictions with physics-based calculations, using Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) to assess accuracy across all scenarios, as detailed in Section 3.5. The XGB model demonstrated the best predictive accuracy, while the Advanced Propeller scenario exhibited the most substantial FC reduction, followed by the Combined scenario, indicating significant potential for CO2 emission reductions. The MC data augmentation improved the model’s generalizability, as discussed in Section 2.2. The extrapolation performance at 8 knots, relevant for minimal FC, is presented in Appendix B, Table A6, further supporting the model’s robustness. The detailed physics-based calculations, including resistance coefficients and effective power, are provided in Appendix C.
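The 80–20 split and MC augmentation mentioned above can be pictured with a small sketch. The sampling scheme of Section 2.2 is not reproduced in this section, so the Gaussian jitter, the `n_copies` and `rel_sigma` parameters, the function names, and the toy FC values below are all illustrative assumptions rather than the study's actual procedure.

```python
import random

def augment_with_noise(rows, n_copies=5, rel_sigma=0.01, seed=0):
    """Return the original rows plus jittered copies (assumed MC scheme)."""
    rng = random.Random(seed)
    augmented = [dict(r) for r in rows]
    for r in rows:
        for _ in range(n_copies):
            # Perturb every feature by a small relative Gaussian error.
            augmented.append({k: v * (1.0 + rng.gauss(0.0, rel_sigma))
                              for k, v in r.items()})
    return augmented

def split_80_20(rows, seed=0):
    """Shuffle the rows and split them into 80% training and 20% test subsets."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Toy data: speeds from 9 to 15 knots in 0.5-knot steps. The FC values are
# placeholders from an arbitrary cubic law, not results from the study.
speeds = [9 + 0.5 * i for i in range(13)]
rows = [{"speed_kn": v, "fc_kg_h": 0.24 * v ** 3} for v in speeds]

augmented = augment_with_noise(rows)
train, test = split_80_20(augmented)
print(len(rows), len(augmented), len(train), len(test))
```

With 13 original rows and five jittered copies each, the augmented set holds 78 rows, which the split divides into 62 training and 16 test rows.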
3.3. Graphical Results
Graphical analysis, shown in Figure 2a, revealed that total resistance increases with ship speed across all scenarios, driven by hydrodynamic drag. Efficiency modifications showed notable benefits: the Paint (5%) scenario achieved the lowest resistance, highlighting reduced drag from enhanced hull coatings.
The Advanced Propeller and Fin (2–4%) scenarios followed similar resistance trends to the baseline but with noticeable reductions. The Bulbous Bow scenario showed higher resistance at low speeds but improved efficiency at higher speeds, aligning with its established performance advantages.
Figure 2b highlights effective power (Pe) requirements across scenarios, with the Original Scenario demanding the most power. The Paint (5%) and Advanced Propeller Scenarios demonstrated substantial reductions in power needs. Although the Bulbous Bow Scenario initially required more power due to increased resistance, it improved at higher speeds as wave resistance decreased.
Figure 3 illustrates the thrust required (Treq) across the scenarios, which follows the trends in resistance. The Paint (5%) Scenario showed the lowest Treq due to reduced hull resistance, while the Advanced Propeller Scenario minimized thrust needs through optimized performance. The Fin (2–4%) Scenario also reduced Treq, enhancing propulsion efficiency. The Bulbous Bow Scenario required the highest thrust at lower speeds due to its increased wetted surface area. However, its thrust demand decreased relative to the Original Scenario at higher speeds, reflecting the reduction in wave resistance.
These graphical results indicate that the modifications have a significant impact on the ship’s hydrodynamic performance. Scenarios focusing on drag reduction (e.g., Paint (5%) and Advanced Propeller) provided the most direct improvements in resistance, power, and thrust requirements. In contrast, the Bulbous Bow scenario shows benefits primarily at higher speeds.
The comparison between Rt, Pe, and Treq for the Original, Paint (5%), Bulbous Bow, and Combined (Paint and Bulbous Bow) Scenarios is presented in Figure A5, Figure A6, and Figure A7 of Appendix B. The Combined Scenario had the lowest Rt, Pe, and Treq at all speeds (e.g., at 12 knots, Rt = 116.21 kN, Pe = 717.23 kW, Treq = 153.07 kN, compared to 121.61 kN, 750.69 kW, 160.02 kN for the Original). This synergy, combining Paint’s frictional resistance reduction with Bulbous Bow’s wave resistance optimization, highlights the potential for integrated modifications to enhance efficiency, particularly at higher speeds where wave-making resistance dominates.
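The 12-knot values quoted above can be cross-checked with the standard definition of effective power, Pe = Rt × V. The sketch below assumes that this definition underlies the reported Pe values (the full calculation is in Appendix C, which is not reproduced here); the function and constant names are illustrative.

```python
KNOT_TO_M_PER_S = 1852.0 / 3600.0  # one knot is one nautical mile per hour

def effective_power_kw(rt_kn, speed_knots):
    """Effective power in kW from total resistance in kN and speed in knots."""
    return rt_kn * speed_knots * KNOT_TO_M_PER_S

# (Rt in kN, reported Pe in kW) at 12 knots, as quoted in the text.
reported = {"Original": (121.61, 750.69), "Combined": (116.21, 717.23)}

for name, (rt, pe_reported) in reported.items():
    pe = effective_power_kw(rt, 12.0)
    print(f"{name}: computed {pe:.2f} kW, reported {pe_reported:.2f} kW")
```

Both computed values agree with the reported ones to within a fraction of a kilowatt, the small gap being consistent with rounding of Rt and of the knot-to-m/s conversion.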
While the Combined scenario consistently reduces fuel consumption across all speeds, the relative gain at 15 knots is modest. This indicates that although wave resistance becomes more significant at higher speeds, the synergistic effect of the combined modifications produces steady rather than sharply increasing benefits.
3.6. Scenario-Specific Performance Insights
The efficiency gains across ship speeds (9 to 15 knots) are illustrated in Figure 5a,b, which show the percentage reduction in FC and thrust required (Treq) relative to the Original Scenario. Among all configurations, the Advanced Propeller achieved the largest FC reduction, ranging from 4.9% to 6.2%, due to its higher open water efficiency (ηₒ = 0.579), which represents a 24.78% improvement over the baseline propeller (ηₒ = 0.464). This improvement is calculated as (0.579 − 0.464)/0.464 × 100 ≈ 24.78%.
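The relative improvement in open water efficiency can be reproduced in a few lines. The helper name `relative_change_percent` is illustrative; the two efficiency values are those quoted above.

```python
def relative_change_percent(new, baseline):
    """Percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100.0

eta_o_baseline = 0.464  # baseline propeller
eta_o_advanced = 0.579  # Advanced Propeller scenario

gain = relative_change_percent(eta_o_advanced, eta_o_baseline)
print(f"{gain:.2f}%")
```

The same helper applies to any of the percentage reductions reported relative to the Original Scenario, with the sign reversed for reductions.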
The Combined Scenario (Paint + Bulbous Bow) followed, yielding consistent FC reductions of 3.3% to 4.9%, while the Paint (5%) Scenario achieved reductions of 10.3% to 10.5% across all speeds. Although the Paint Scenario models only a 5% reduction in the frictional resistance coefficient (CF), its impact on fuel consumption is amplified at higher speeds where resistance increases non-linearly.
In contrast, the Fin Installation Scenario produced negligible FC savings (0.06% to 0.26%). The Bulbous Bow Scenario exhibited mixed results: at low speeds (e.g., 9 knots), fuel consumption increased by 3.36% due to greater wetted surface area and added resistance. At mid-range speeds (e.g., 13.5 knots), it reduced FC by 2.17%, demonstrating its benefit in mitigating wave resistance. At 15 knots, fuel consumption increased again, by 5.93%, indicating diminishing returns at higher speeds.
Moreover, Figure 5b presents the Treq changes. The Combined Scenario showed the greatest Treq reduction (2.7% to 4.2%), followed by the Advanced Propeller (3.0% to 4.8%). The Paint Scenario yielded no change in Treq (0%), while the Fin Scenario slightly increased Treq by 0.58% to 0.65%. The Bulbous Bow Scenario increased Treq at lower speeds (e.g., 2.81% increase at 9 knots) but produced reductions at higher speeds (e.g., 1.83% decrease at 13.5 knots and 4.83% decrease at 15 knots), reflecting its speed-dependent hydrodynamic performance.
The open water efficiency (η_O) plays a crucial role in the Advanced Propeller Scenario, where a higher η_O reduces Pd and FC, especially at higher speeds. However, η_O remains constant for the Paint, Bulbous Bow, Fin, and Combined Scenarios (η_O = 0.464), so the η_O increase is excluded from Figure 5a,b to avoid uninformative zero bars; the FC and Treq reductions are more informative. These modifications show that hydrodynamic improvements can reduce FC and CO2 emissions, but their effects depend on operating conditions. The Combined Scenario performs better than individual modifications, as explained in Section 3.2.
3.7. Comparative Analysis of Machine Learning Model Performance
The ML models, namely Support Vector Regression (SVR), Gaussian Process Regression (GPR), Random Forest (RF), Extreme Gradient Boosting (XGB), and Neural Networks (NN), were evaluated before and after MC simulations to assess their prediction capabilities for FC and CO2 emissions across ship modification scenarios, including Paint, Advanced Propeller, and the Combined Scenario (Paint + Bulbous Bow). The selected graphs present the most important findings, while Table A12 in Appendix D provides detailed results for the Original and Paint scenarios and the ML models. The following observations describe the effects of applying MC simulation for dataset augmentation and its impact on model accuracy.
Figure 6 compares the XGB FC predictions for the Combined Scenario with the Original Scenario. Even without MC simulation, XGB predicted an FC of 771.04 kg/h at 15 knots, matching the actual value of 771.04 kg/h. After MC simulation, the model maintained this near-perfect alignment with a prediction of 771.04 kg/h, which reflects better fuel efficiency than the Original Scenario’s 814.84 kg/h.
The GPR results for FC in the Advanced Propeller Scenario are shown in Figure 7. Before MC simulation, GPR aligned with the original calculations better than SVR did (e.g., FC of 638.46 kg/h at 15 knots vs. actual 729.31 kg/h). The accuracy of GPR improved substantially after MC simulation (prediction of 633.07 kg/h at 15 knots), especially in cases such as the Advanced Propeller coefficients, demonstrating GPR’s capability to handle intricate data patterns.
The FC predictions of RF are shown in Figure 8. The predictive performance of RF was acceptable before MC simulation, but it performed worse than GPR and XGB (e.g., 772.47 kg/h at 15 knots vs. 787.84 kg/h in Paint). MC simulation yielded moderate improvements in RF, resulting in predictions of 682.38 kg/h, indicating its potential for more straightforward applications.
The Paint Modification Scenario in Figure 9 shows XGB as the top-performing model. XGB produced highly accurate predictions without MC simulation, with results that closely matched the original values (e.g., 788.36 kg/h at 15 knots vs. 787.84 kg/h). The model achieved near-perfect alignment in Paint Modifications and the Combined Scenario after undergoing MC simulation.
The performance of NN is shown in Figure 10. NN showed some improvement after MC simulation, but its predictions were not consistent across different scenarios (e.g., FC of 806.87 kg/h at 15 knots vs. actual 771.04 kg/h in Combined). This variability indicates the need for more optimization and hyperparameter tuning when using NN for this application.
Figure 11 shows the predicted FC using SVR. Without MC simulation, SVR predictions were far from the original physics-based calculations, especially at higher speeds (e.g., FC of 327.78 kg/h at 15 knots vs. actual 771.04 kg/h in Combined). After MC simulation, the augmented dataset reduced these discrepancies only to a limited extent (e.g., FC of 334.31 kg/h).
a. CO2 Emissions Analysis
The GPR predictions of CO2 emissions for the Advanced Propeller Scenario are presented in Figure 12. The model provided accurate results before MC simulation (e.g., CO2 of 1949.72 kg at 15 knots vs. actual 2268.14 kg) and achieved major improvements after MC (e.g., CO2 of 1968.83 kg), resulting in reliable emission predictions.
The CO2 emission predictions of RF appear in Figure 13. The model achieved better results after MC simulation, but its performance remained less reliable than GPR and XGB (e.g., CO2 of 2122.20 kg at 15 knots vs. actual 2450.20 kg in Paint).
The XGB model demonstrates outstanding accuracy in its CO2 predictions for the Combined Scenario, as shown in Figure 14 (e.g., CO2 of 2397.93 kg at 15 knots vs. actual 2397.93 kg), representing a substantial decrease from the Original Scenario’s 2534.16 kg. The model delivered superior results across all scenarios, including near-exact predictions for Paint Modifications (e.g., CO2 of 2454.72 kg at 15 knots vs. actual 2450.20 kg post-MC).
The SVR model failed to reach acceptable accuracy levels at both pre- and post-MC simulation stages (e.g., CO2 of 1016.35 kg at 15 knots vs. actual 2397.93 kg before MC, improving to 1022.88 kg post-MC in Combined), indicating its unsuitability for such applications. NN’s CO2 predictions improved after MC simulation but remained inconsistent (e.g., CO2 of 2510.53 kg at 15 knots vs. actual 2397.93 kg in Combined), indicating a need for further enhancements to achieve robust predictions. The detailed results of all simulations, including all scenarios (Original, Paint, Fin, Advanced Propeller, Bulbous Bow, and Combined) and ML models after MC sampling, are presented in Table A14 of Appendix D, providing a comprehensive reference for model performance.
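The physics-based FC and CO2 values quoted at 15 knots in this section imply a nearly constant CO2-to-fuel ratio, which offers a quick consistency check. The factor of 3.11 in the sketch below is inferred from these reported pairs and is an assumption, not a value quoted from Appendix C; the names are illustrative.

```python
# (FC in kg/h, CO2 in kg) physics-based values at 15 knots quoted in the text.
physics_15kn = {
    "Original": (814.84, 2534.16),
    "Paint (5%)": (787.84, 2450.20),
    "Advanced Propeller": (729.31, 2268.14),
    "Combined": (771.04, 2397.93),
}

# Ratio of CO2 to fuel for each scenario.
ratios = {name: co2 / fc for name, (fc, co2) in physics_15kn.items()}

INFERRED_FACTOR = 3.11  # assumption: inferred from the ratios above

def co2_from_fc(fc_kg_h, factor=INFERRED_FACTOR):
    """Estimate CO2 emissions from fuel consumption with a constant factor."""
    return fc_kg_h * factor

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f}")
```

Under this assumption, an ML prediction of FC can be converted to a CO2 estimate and compared with the model's separate CO2 prediction as a sanity check.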
MC simulation produced positive effects across all models, improving the match between predicted values and the original physics-based calculations. The XGB and GPR models proved to be the most reliable, producing accurate predictions for FC and CO2 emissions, particularly in the Combined Scenario, demonstrating the potential of integrated ship modifications to enhance efficiency and reduce emissions. The strength of ML methods lies in their ability to capture non-linear relationships and adapt to dynamic operational conditions, as demonstrated by real-time input simulations (e.g., engine power: 870 kW, ship speed: 12.7 knots, Section 4.3), which physics-based models alone struggle to model accurately due to their reliance on static assumptions. In contrast, SVR and NN demonstrated limited effectiveness, even after the dataset was augmented through MC simulation. The following section delves deeper into quantitative evaluations using Mean Squared Error (MSE) plots to comprehensively assess the models’ predictive accuracy and identify the most effective approach for specific scenarios.
The synthetic data points produced by MC simulations were validated through physics-based calculations on a selected subset of augmented samples. The SVR model evaluated one synthetic sample for each original data point to verify that the generated values for thrust (T), Pd, and FC stayed within the oil tanker specifications (Appendix A, Table A1). The Paint scenario at 14.5 knots is validated in Table 2, which compares the original FC value with the SVR model predictions before and after MC augmentation. The MC-augmented prediction shows substantial improvement, which supports the physical accuracy of the synthetic data. The validation samples for the Advanced Propeller, Bulbous Bow, Combined, and Fin scenarios are presented in Appendix B, Table A5. The SVR model trained on the augmented dataset produced FC predictions with an R² score of 0.95 across all scenarios, demonstrating substantial predictive accuracy and low overfitting, as the model performs well on new test data.
b. Model Evaluation Based on MSE Metrics
SVR consistently showed high MSE and RMSE values before and after MC simulations. Although its errors decreased significantly post-simulation, they remained high compared to other models, indicating its limited suitability for accurate predictions in this application. Specifically, in the “Bulbous Bow” scenario, RMSE for FC decreased from approximately 135.70 before MC to 33.99 after MC, demonstrating improvement but still lagging behind other models in terms of accuracy. The Combined Scenario showed similar patterns for SVR, with high MSE values before MC simulation that decreased significantly after MC but remained less competitive than the XGB and GPR models. This further underscores SVR’s challenges in achieving high precision across diverse scenarios. Table 3 shows SVR’s performance across all scenarios, emphasizing its relative lack of reliability. Mean MSE and RMSE values for FC predictions across all models are provided in Appendix D, Table A13.
XGB and GPR emerged as the most accurate models. XGB demonstrated low MSE values after MC simulations across all scenarios, even reaching zero MSE in some cases, indicating a perfect fit for those conditions. In scenarios such as “Paint (5%)” and “Bulbous Bow,” XGB’s RMSE values effectively became zero after MC simulations, highlighting its robustness and ability to handle variability. This is illustrated in Figures 15 and 16, which compare XGB’s performance against GPR, RF, and SNN. GPR also demonstrated strong predictive capabilities, particularly in the “Advanced Propeller” scenario, where it outperformed XGB in terms of MSE and RMSE. This suggests that GPR can be advantageous in specialized contexts with more complex underlying dynamics. Overall, GPR maintained low RMSE values, particularly in scenarios with increased complexity, demonstrating its flexibility and adaptability.
RF and SNN models performed moderately well but did not achieve the same levels of accuracy as XGB or GPR. While demonstrating error reduction after MC simulations, the RF model still had higher MSE values than the top-performing models. For instance, in the “Bulbous Bow” scenario, MSE for RF dropped from approximately 2512.84 to 3.63 after MC simulations. Yet, RMSE remained comparatively high at 1.90, indicating a need for more generalizability. This trend was evident across multiple scenarios, suggesting that RF is less consistent in reducing prediction errors when compared to GPR and XGB. SNN improved after MC simulations, with reduced MSE values across many scenarios, but its performance remained limited. In particular, in the “Fin (2–4%)” scenario, RMSE for FC decreased from approximately 10.92 to 8.04, indicating improvement; however, it still falls short of the results achieved with GPR or XGB. The inconsistencies in SNN performance render it less reliable for predictions that require high accuracy across varying conditions. Figure 15 highlights the comparative performance of XGB, GPR, RF, and SNN models, illustrating SNN’s moderate yet inconsistent improvements. Detailed MSE performance for the other models is available in Appendix D, Table A14.
It is relevant to note that the MSE values shown may reflect differences in internal scaling between models. While the relative performance trends are valid, the absolute values should be interpreted with caution.
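For reference, the two error metrics used in this subsection are related by RMSE = √MSE. The sketch below implements both with the standard definitions and applies them to values quoted in this section; the single-point examples are absolute errors at 15 knots only, not the dataset-level metrics reported in the tables, and the function names are illustrative.

```python
import math

def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(mse(actual, predicted))

# RF, Bulbous Bow, after MC: the reported MSE of about 3.63 corresponds
# to an RMSE of roughly 1.9, consistent with the value quoted above.
rf_rmse = math.sqrt(3.63)

# Single-point FC errors at 15 knots, using (actual, predicted) values
# quoted in Section 3.7 before MC augmentation.
xgb_paint_err = rmse([787.84], [788.36])     # XGB, Paint
svr_combined_err = rmse([771.04], [327.78])  # SVR, Combined

print(round(rf_rmse, 3), round(xgb_paint_err, 2), round(svr_combined_err, 2))
```

Because RMSE is expressed in the units of the target (kg/h for FC), it is easier to compare against the reported FC values than MSE.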
c. Detailed Scenario-Based Analysis
Evaluating different scenarios provides a more granular view of each model’s behavior under varying conditions. Below is a detailed discussion of model performance for specific scenarios:
- i. Original Scenario: This scenario served as a baseline for comparison. XGB and GPR both demonstrated the best performance, achieving low MSE and RMSE values, especially after MC simulations. While showing error reductions post-simulation, the SVR model still had high RMSE values of approximately 27.68, indicating its lack of precision.
- ii. Paint (5%) Scenario: In this scenario, XGB continued to demonstrate excellent performance, with an MSE value of zero post-simulation, indicating that the model effectively generalized the new conditions introduced by the paint modification. GPR also showed strong performance, while RF and SNN showed moderate levels of error reduction.
- iii. Advanced Propeller Scenario: The Advanced Propeller scenario demonstrated GPR’s ability to handle complex modifications. GPR achieved the lowest MSE and RMSE compared to other models, making it the best performer in this context. XGB also performed well, but not as consistently as GPR, particularly before the MC simulation.
- iv. Fin (2–4%) Scenario: In this scenario, XGB and GPR were again the most effective, while SNN showed improvements but still lagged. The RF model exhibited moderate improvements but failed to match the performance levels of XGB and GPR. This scenario highlighted the limitations of RF and SNN in scenarios with minor, more nuanced changes.
- v. Bulbous Bow Scenario: In the Bulbous Bow scenario, XGB and GPR maintained high performance, with XGB achieving perfect accuracy (MSE of zero post-simulation). SVR showed improvements but had higher error levels, making it the least effective for this scenario. SNN performed inconsistently, suggesting sensitivity to the modifications in this particular scenario.
- vi. Combined Scenario: The Combined Scenario tested the models’ ability to generalize across diverse conditions by integrating multiple modifications. XGB achieved near-zero MSE, reinforcing its robustness, while GPR maintained low MSE values, though higher than XGB’s. RF performed moderately, with MSE values lower than those of SNN, which showed higher errors, indicating inconsistent performance in complex scenarios.
3.8. Implementation of the XGB Model and Summary of Results
Building upon the comparative analysis of ML models, where XGB demonstrated superior predictive performance, this section details the implementation of the XGB model and summarizes the key results obtained. By integrating the XGB model into the hybrid framework, the authors aimed to enhance predictive accuracy for FC and CO2 emissions under various operational scenarios. A single XGB model was trained on a combined dataset from all scenarios (Original, Paint, Advanced Propeller, Fin, Bulbous Bow, Combined) to enable flexible predictions for diverse ship configurations.
The following subsections discuss the user interface development, visualization, and comparative analysis of predictions across different scenarios; key insights derived from the results; quantitative evaluation of the model’s predictions; and an assessment of XGB’s overall performance in capturing complex maritime operational dynamics.
- i. Visualization and Comparative Analysis
Predictions were visualized across a range of ship speeds (9 to 15 knots) using a single model trained on all scenarios. Graphs illustrated the relationship between ship speed and predicted metrics:
Fuel Consumption: The bar chart demonstrates how the combined model predicts FC (after MC augmentation) at different speeds, exhibiting similar FC reduction patterns to efficiency-based scenarios such as “Advanced Propeller” and “Bulbous Bow” at higher speeds.
CO2 Emissions: The bar chart of CO2 emissions mirrored the FC pattern, showing reduced emissions in configurations that matched the “Advanced Propeller” and “Fin (2–4%)” scenarios.
MC-augmented: The MC-augmented predictions matched the scenario-specific trends, demonstrating the model’s ability to capture parameter effects accurately. The visualization illustrates how different configurations align with the aggregated scenario trends, enabling practical interpretation.
- ii. Key Insights
Fuel Consumption: The combined model captured efficiency gains consistent with scenarios like “Advanced Propeller” at higher speeds, where propeller efficiency improvements reduce FC. The “Bulbous Bow” scenario also improved FC significantly at high speeds by reducing wave resistance.
CO2 Emissions: The model predicted CO2 emissions through efficiency-driven scenarios such as “Advanced Propeller” and “Fin (2–4%)”, which matched expected hydrodynamic improvements.
The XGB model demonstrated its ability to handle operational condition variability, thus proving its effectiveness for real-world deployment. The combined FC and CO2 prediction trends are shown in Figure 16, with user input overlay.
- iii. Quantitative Evaluation of Predictions
The XGB model predicted an FC of 237.10 kg/h and CO2 emissions of 735.74 kg after MC augmentation, using the following sample operational inputs: engine power (Pe: 870 kW), ship speed (12.7 knots), hull efficiency (ηH: 1.054), and propeller efficiency (ηO: 0.590). The configuration shows the closest match to the “Advanced Propeller” scenario (FC: 286.21 kg/h, CO2: 890.11 kg), followed by the “Paint (5%)” scenario (FC: 304.59 kg/h, CO2: 947.26 kg). With its high propeller efficiency (ηO = 0.590), the setup benefits from efficiency improvements similar to those in the “Advanced Propeller” scenario, which focuses on enhanced propeller design. The predicted values are lower than the expected values (FC: 284.26 kg/h, CO2: 884.04 kg), which may indicate underestimation, possibly due to the model’s generalization across diverse scenarios. To achieve further reductions in FC and CO2, the model suggests improvements similar to the “Advanced Propeller” scenario, such as adopting advanced propeller technologies to enhance ηO.
- iv. Evaluation of XGB Model Performance
XGB demonstrated superior performance to other ML models, such as RF and GPR, which struggled with overfitting or underfitting in specific scenarios. Its ability to capture non-linear relationships between features and robustness to scenario variations made it the most effective model for predicting FC and CO2 emissions. With hyperparameters optimized (e.g., 100 estimators for a balance of efficiency and accuracy), XGB effectively generalized across operational scenarios, providing reliable predictions.
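The size of the underestimation noted in the quantitative evaluation above can be made explicit with a short calculation. The sketch below uses only the values quoted there. Ranking scenarios by absolute FC difference is an assumed criterion for "closest match", since the text does not state how the match was determined, and the names are illustrative.

```python
# Sample prediction and reference values quoted in the quantitative evaluation.
predicted = {"fc": 237.10, "co2": 735.74}
expected = {"fc": 284.26, "co2": 884.04}
scenarios = {
    "Advanced Propeller": {"fc": 286.21, "co2": 890.11},
    "Paint (5%)": {"fc": 304.59, "co2": 947.26},
}

def percent_deviation(pred, ref):
    """Signed deviation of a prediction from its reference, in percent."""
    return (pred - ref) / ref * 100.0

deviation = {k: percent_deviation(predicted[k], expected[k]) for k in predicted}

# Assumed matching rule: smallest absolute difference in FC.
ranked = sorted(scenarios,
                key=lambda name: abs(scenarios[name]["fc"] - predicted["fc"]))

print({k: round(v, 2) for k, v in deviation.items()}, ranked)
```

Under these inputs the prediction sits roughly 16.6% below the expected FC and 16.8% below the expected CO2, and the assumed rule ranks “Advanced Propeller” ahead of “Paint (5%)”, in line with the ordering stated in the text.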