Review

Recent Advancements and Challenges in Artificial Intelligence for Digital Twins of the Ocean

by Vassiliki Metheniti 1,*, Antonios Parasyris 1,*, Ricardo Santos Pereira 2,3 and Garabet Kazanjian 4
1 Foundation for Research and Technology-Hellas, Institute of Applied and Computational Mathematics, 70013 Heraklion, Greece
2 WavEC Offshore Renewables, Edifício Diogo Cão, Doca de Alcântara Norte, 1350-352 Lisboa, Portugal
3 Instituto Superior Técnico, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal
4 Acopian Center for the Environment, American University of Armenia, 40 Baghramyan Avenue, Yerevan 0019, Armenia
* Authors to whom correspondence should be addressed.
Climate 2026, 14(1), 3; https://doi.org/10.3390/cli14010003
Submission received: 31 October 2025 / Revised: 15 December 2025 / Accepted: 19 December 2025 / Published: 23 December 2025

Abstract

The Digital Twins of the Ocean (DTOs) represent an emerging framework for monitoring, simulating, and predicting ocean dynamics, supporting a range of applications relevant to understanding and responding to the global climate system. By integrating large-scale, multi-sourced datasets with advanced numerical models, DTOs provide a powerful tool for climate science. This review examines the role of machine learning (ML) in advancing DTOs applications, addressing the limitations of traditional methodologies under current conditions of increasing data availability from satellites, in situ sensors, and high-resolution numerical models. We highlight how ML serves as a versatile tool for enhancing DTOs capabilities, including real-time forecasting, correcting model biases, and filling data gaps where conventional approaches fall short. Furthermore, we review surrogate models that aim to complement or replace traditional physical models, offering increasing accuracy and the appeal of much faster inference for forecasts, and the insertion of hybrid models, which couple physics-based simulations with ML algorithms and are proving to be continuously improving in accuracy for complex oceanographic tasks as bigger datasets become available and methodologies evolve. This paper provides a comprehensive review of ML applications within DTOs, focusing on key areas such as water quality and marine biodiversity, ports, marine pollution, fisheries, and renewable energy. The review concludes with a discussion of future research directions and the potential of ML to foster more robust and practical DTOs, ultimately supporting informed decision-making for sustainable ocean management.

1. Introduction

The ocean plays a pivotal role in regulating the global climate system, and accurate, timely monitoring of its state is essential for understanding climate dynamics. Digital Twins of the Ocean (DTOs) represent an approach to this challenge. A DTO is a dynamic, continuously updated system that maintains a two-way connection with the physical ocean, thereby surpassing the limitations of a conventional simulation by providing a detailed virtual representation of the real ocean system. By synthesizing observational data streams and numerical models, it provides a comprehensive, near-real-time view of the ocean’s physical, chemical, and biological states [1]. The power of a DTO lies in its ability not only to reflect the current state but also to serve as a platform for predictive analysis and decision support, allowing researchers and policymakers to test hypotheses and assess potential outcomes without risk to the real ocean. In the context of global climate regulation, DTOs are well-suited to aid our understanding and response. The ocean is the primary regulator of Earth’s climate, absorbing, storing, and redistributing vast amounts of solar energy [2]. This process drives key phenomena like global ocean currents and the water cycle, which in turn influence weather patterns and the dynamics of extreme events such as marine heatwaves [3]. The DTO’s ability to model these complex interactions with high accuracy and resolution makes it a valuable tool for exploring these processes.
The capabilities of a DTO are greatly improved by using machine learning (ML), a versatile tool that can handle the large, multi-sourced datasets now available due to advancements in high-resolution numerical models, satellite remote sensing, and autonomous in situ platforms [4,5]. ML has become an effective tool for developing specific applications such as forecasting, monitoring environmental quality, assessing coastal hazards, and managing the sustainable blue economy. Traditional methods often struggle to process these large amounts of data efficiently. ML, however, allows for the full utilization of these datasets, creating efficient tools that can be used without requiring great computational resources during operation [4,6]. A key benefit of ML models over traditional methods is that they can achieve similar forecasting skill, often at a much lower computational cost during inference, making them suitable for near-real-time forecasting where speed is critical [4,7]. A growing number of studies are exploring hybrid models that combine the strengths of both approaches, using physics-based numerical models as a foundation and applying ML to correct for biases, improve parameterizations, or downscale predictions [8,9,10]. ML is also capable of categorizing and separating data, as ML models can learn non-linear relationships and find hidden patterns within large datasets. This helps provide early warnings for extreme events like marine heatwaves and storms by identifying subtle patterns that are otherwise hard to detect [7,11]. Additionally, ML is well suited to filling gaps in areas with limited data coverage or sampling frequency [12,13,14], where it can generate realistic information to make the DTO more complete.
In this paper, we first review the methodologies and technical components of different ML techniques for readers unfamiliar with them. Thereafter, we present a range of DTO applications used for ocean and weather forecasting, marine environment and ecosystem health, port operations, renewable energy, aquaculture, and fisheries. Finally, we conclude with a discussion of further capabilities and methods of ML in DTO applications, along with concluding remarks on the identified present and future challenges.

2. Digital Twins and AI Methodologies

2.1. Digital Twin Terminology

The term “Digital Twin” has become prevalent in recent years, though its application is often inconsistent, leading to various, sometimes conflicting, interpretations. Additionally, several underlying technological concepts require clear definitions to distinguish between various applications and ontologies. As defined by Fraunhofer (https://www.ipk.fraunhofer.de/en/expertise-and-technologies/industry-trends/digital-twins.html, last accessed online 10 October 2025), and adopted hereinafter, the level of integration with the real world differentiates these concepts. While all are digital representations of a physical asset, they are distinguished by their data exchange capabilities. This classification includes three concepts: a digital model, a digital shadow, and a digital twin (Figure 1). In a digital model, data exchange between the physical asset and its digital counterpart is manual, so a change in one has no automatic effect on the other. A digital shadow, however, offers automatic data flow from the physical to the digital, allowing physical variations to influence the digital, but not vice versa. A fully fledged digital twin facilitates bi-directional data flow, where changes in the physical asset influence the digital model, and the digital model can, in turn, interact with and potentially control the physical asset [15,16]. Especially for non-specialist stakeholders, this last interaction is usually aided by interactive visual front ends that accompany the DTO (see e.g., https://ocean-twin.eu/marketplace/product/oil-spills, last accessed online 14 December 2025). Such a front end is not a requirement, but it is an important step toward the tool’s wide adoption and usability.
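The three-level classification above can be expressed as a simple decision rule. The following Python sketch is only an illustration of that rule; the names `Representation` and `classify` are hypothetical, and the handling of a system with an automated digital-to-physical flow alone (a case the classification does not describe) is an assumption of the sketch.

```python
from enum import Enum


class Representation(Enum):
    DIGITAL_MODEL = "digital model"
    DIGITAL_SHADOW = "digital shadow"
    DIGITAL_TWIN = "digital twin"


def classify(physical_to_digital_automatic: bool,
             digital_to_physical_automatic: bool) -> Representation:
    """Classify a digital representation by which data flows are automated."""
    if physical_to_digital_automatic and digital_to_physical_automatic:
        # Bi-directional automatic flow: fully fledged digital twin.
        return Representation.DIGITAL_TWIN
    if physical_to_digital_automatic:
        # Automatic flow from the physical asset to the digital side only.
        return Representation.DIGITAL_SHADOW
    # Manual exchange. An automated digital-to-physical flow on its own is
    # not covered by the classification; this sketch assumes it falls here.
    return Representation.DIGITAL_MODEL
```

For example, a trained forecasting model that ingests observations automatically but never feeds back to the physical system would be labelled a digital shadow by `classify(True, False)`.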
The core functionalities provided by these digital representations, from the basic analytical power of a digital model to the real-time, bi-directional control enabled by a full digital twin, establish the necessary computational foundation for operational oceanography. Specifically, the ability to integrate observational and real-time data with dynamic models drives the practical applications of DTOs outlined in Section 3, including ocean forecasting, hazard mitigation, and optimized resource management. As DTOs are an emerging field, originating in 2020 as digital assets designed to support ocean protection initiatives under the Ocean Decade program (https://oceandecade.org/actions/digital-twins-of-the-ocean-ditto/, last accessed online 14 December 2025), to the authors’ knowledge, only a few fully fledged DTOs are currently available (https://ocean-twin.eu/news, https://www.edito.eu/, https://oceandecade.org/actions/digital-twins-of-the-ocean-ditto/, last accessed online 14 December 2025).

2.2. AI Techniques

The integration of various Artificial Intelligence (AI) methodologies is fundamental to the advanced capabilities of DTOs. The AI models relevant to this review are classified into four core categories: classical ML, Deep Learning (DL), Hybrid physics/ML models combination (hereinafter Hybrid Models), and Computer Vision (CV), as summarized in Table 1. These techniques are differentiated by their fundamental learning models, data requirements, and inherent complexity, collectively forming a versatile computational toolkit for addressing complex, data-driven challenges in oceanography [6].
Traditional ML models are broadly categorized by their approach to data processing, primarily into supervised, unsupervised, and semi-supervised models or frameworks. Supervised Learning involves training a model using a dataset of labeled input-output pairs to learn a mapping function capable of predicting an output for new, unseen inputs [17]. Models such as Random Forest (RF) construct an ensemble of Decision Trees using bagging and feature randomness to handle classification and regression tasks across diverse inputs like tabular data, gridded fields, and time series [18]. Extreme Gradient Boosting (XGBoost) is an efficient, scalable implementation of the gradient boosting algorithm that sequentially builds weak prediction models, typically Decision Trees, to correct the errors of previous models for highly accurate classification and regression on structured data [19]. Another supervised method, Gaussian Process Regression (GPR), is a nonparametric, kernel-based probabilistic model that approximates an unknown target function and provides prediction intervals to quantify the estimate’s uncertainty, accepting gridded field and point data [20]. Additionally, Support Vector Regression (SVR) is a regression method that finds the hyperplane best fitting the data while minimizing error within a specified margin; it operates on gridded field data [21]. Conversely, Unsupervised Learning infers underlying patterns or groupings in unlabeled data [22]. K-Means Clustering achieves this by iteratively partitioning n observations into k clusters to minimize the variance within each group, and it is frequently used with imagery and video data [23].
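The iterative partitioning performed by K-Means can be made concrete with a short sketch. The Python function below applies the standard assign-and-update loop (Lloyd's algorithm) to one-dimensional values, for instance pixel brightness; the function name, the initial centres, and the sample values are hypothetical choices for illustration, not taken from the cited works.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Lloyd's algorithm for one-dimensional data.

    `centroids` holds the initial cluster centres (hypothetical starting
    guesses); returns the final centres and a cluster label per value.
    """
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each value to its nearest centre.
        labels = [
            min(range(len(centroids)), key=lambda k: (v - centroids[k]) ** 2)
            for v in values
        ]
        # Update step: move each centre to the mean of its members;
        # an empty cluster keeps its previous centre.
        updated = []
        for k in range(len(centroids)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            updated.append(sum(members) / len(members) if members else centroids[k])
        if updated == centroids:
            break  # centres stopped moving, so the partition is stable
        centroids = updated
    return centroids, labels


# Hypothetical brightness values forming a dark and a bright group.
centres, labels = kmeans_1d([0.1, 0.2, 0.15, 0.9, 1.0, 0.95], [0.0, 1.0])
```

Each pass reduces (or leaves unchanged) the within-cluster variance, which is the quantity the method minimizes.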
DL is an advanced subset of ML that utilizes complex, multi-layered Neural Networks (NNs) to model high-level abstractions and learn non-linear relationships in data [24]. Convolutional NNs (CNNs) are especially well-suited for processing spatially correlated data like images, videos, and gridded fields, employing convolutional and pooling layers to automatically extract hierarchical features and capture spatial patterns [25]. Recurrent NNs (RNNs), such as Long Short-Term Memory (LSTM) networks, are specialized to process sequential data like time series. LSTMs use a system of input, forget, and output gates that control the flow of information into a cell state, allowing them to capture long-term dependencies and overcome the vanishing gradient problem in time series forecasting [26]. The Multi-Layer Perceptron (MLP) is a foundational, fully connected NN that uses non-linear activation functions to model complex relationships in tabular data and gridded fields [27]. For data synthesis and completion, Generative Adversarial Networks (GANs) use two competing networks, a generator and a discriminator, to create high-quality, realistic synthetic data, typically imagery or gridded fields [28]. Similarly, autoencoders are unsupervised networks that learn efficient data encodings by compressing input data, like sparse time series, into a latent space and then reconstructing the original data, which is useful for dimensionality reduction and data imputation [29].
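The gating mechanism of an LSTM described above can be illustrated with a single time step of a scalar cell. This is a minimal sketch of the standard LSTM update equations in plain Python; the parameter layout `p` and the function names are assumptions made for the example, and practical models use vector-valued states and learned weights.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell.

    `p` maps each gate name ('i', 'f', 'o', 'g') to a tuple
    (input weight, recurrent weight, bias).
    """
    def pre(name):
        w, u, b = p[name]
        return w * x + u * h_prev + b

    i = sigmoid(pre("i"))      # input gate: how much new information enters
    f = sigmoid(pre("f"))      # forget gate: how much old cell state is kept
    o = sigmoid(pre("o"))      # output gate: how much of the cell is exposed
    g = math.tanh(pre("g"))    # candidate update for the cell state
    c = f * c_prev + i * g     # new cell state
    h = o * math.tanh(c)       # new hidden state
    return h, c
```

When the forget gate is close to one and the input gate close to zero, the cell state is carried forward almost unchanged, which is how the cell retains information over many steps.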
Beyond standalone architectures, Hybrid Models are strategically employed to combine different AI techniques to enhance performance and efficiency. Ensemble Models (Stacking) is a technique that trains a second-level meta-learner to synthesize and optimally blend the predictions of multiple diverse base learners (e.g., RF, SVM) for tasks that utilize gridded field and point data [30]. Deep Deterministic Policy Gradient (DDPG) is a Reinforcement Learning algorithm that concurrently learns a Q-function (critic) and a deterministic policy (actor) network using off-policy data to optimize decisions in environments with continuous action spaces [31]. It accepts multivariate time series representing the system state as input. Lastly, CV utilizes specialized DL models to enable automated interpretation of imagery and video. YOLO (You Only Look Once) is a family of single-stage object detection models that process an entire image in one pass to simultaneously predict bounding boxes, confidence scores, and class probabilities, making it exceptionally fast for real-time applications [32]. Mask R-CNN extends this capability by adding a parallel branch to the detection framework to generate a high-quality segmentation mask for each detected object instance, thereby delineating the object’s shape at the pixel level in imagery and video.
The methodologies detailed in this section form the foundational computational layer of modern DTOs. These AI techniques are precisely what enable DTOs to process the large, multi-sourced datasets available today and to deliver on their core promise: to provide computationally efficient, highly accurate, and near-real-time insights for monitoring, forecasting, and informed decision-making. Section 3 details the specific and practical application of these AI categories across key DTO operational domains, showcasing how they overcome existing limitations to support a sustainable blue economy.
Table 1. Classification of the AI models relevant to this review in four core categories: classical ML, Deep Learning (DL), Hybrid physics/ML models combination (hereafter Hybrid Models), and Computer Vision (CV).
| Model Category | Model Type | Examples | Data Types | Applications |
|---|---|---|---|---|
| Classical Machine Learning | Supervised | Random Forest, XGBoost, GPR, Support Vector Regression (SVR) | Inputs: Gridded field data, time series, point data. Outputs: Gridded field data, time series, point data, categorical data | Chl-a distribution [33,34,35]; Predicting microplastic accumulation [36]; Forecasting HABs [37]; Fish species distribution [38]; Classifying fish reproductive condition [39]; Estimating fishing operations [40]; Ocean surface dissolved oxygen and macronutrients estimation [41] |
| | Unsupervised | K-Means Clustering | Inputs: Imagery, video. Outputs: Imagery, video (segmentation mask) | Floating object detection (hybrid approach) [42] |
| Deep Learning | Supervised | Convolutional Neural Networks (CNNs) | Inputs: Gridded/field data (forecasts), imagery/video (segmentation/detection), time series/point data (counts) | Ocean state prediction [43]; Multi-source data integration for Chl-a estimation [34]; Oil spill segmentation [44]; Marine heatwaves forecasting [7]; Counting fish [45]; Detecting/measuring litter [46,47] |
| | Supervised | Multi-Layer Perceptron (MLP) | Inputs: Gridded field data, point data. Outputs: Gridded field data, point data | Ocean 3D reconstruction from surface-only satellite data [48]; Kd retrieval from remote sensing data [27] |
| | Supervised | Long Short-Term Memory (LSTM) | Inputs: Time series, point data. Outputs: Time series, point data (forecasts) | Forecasting water quality; Point-wise time series forecasting, predicting atmospheric variables [49]; HABs forecasting [50] |
| | Unsupervised | Generative Adversarial Network (GAN) | Inputs: Gridded field data. Outputs: Imagery, video (synthesized samples) | Synthesizing macroplastic samples to overcome data scarcity (hybrid GAN-RF) [51] |
| | Unsupervised | Autoencoders (used in the HIDRA family of models, specifically HIDRA-3) | Inputs: Sparse time series. Outputs: Reconstructed continuous gridded field, time series | Reconstructing missing sea surface height signals in latent space [52] |
| Hybrid physics/ML model combination | Ensemble model, Supervised | Stacking Ensemble Learning (STK) (e.g., combining RF, SVM, KNN, XGBoost, GP) | Inputs: Gridded field data, point data. Outputs: Gridded field data (probability maps) | Predicting fishing grounds [53,54,55] |
| | Supervised | Hydro-Biogeochemical-CNN (HBGC-CNN) | Inputs: Gridded field data. Outputs: Gridded field data | Daily, spatial prediction of Chl-a and HABs [56] |
| | Reinforcement Learning | Deep Deterministic Policy Gradient (DDPG) (hybrid with LSTM) | Inputs: Multivariate time series. Outputs: Optimal control actions (optimization) | Optimizing energy consumption in large-scale recirculating aquaculture systems [57] |
| Computer Vision | Supervised | YOLOv8 (modified version, deep CNN) | Inputs: Imagery, video. Outputs: Imagery, video (bounding boxes, segmentation masks), time series, point data (classification labels) | Object detection for biomass estimation in turbid waters [58]; Distinguishing plastic from tourists on coastlines [59] |
| | | Mask R-CNN (Region-based CNN) | Inputs: Imagery, video. Outputs: Imagery/video (segmentation masks), time series/point data (measurements) | Detecting and measuring fish size [60,61]; Segmenting fish body parts for disease detection [62]; Detecting seafloor marine litter [63] |

3. DTO Applications

Building upon the foundation of AI techniques (Section 2.2), ranging from supervised algorithms like RF and XGBoost, to advanced DL architectures such as CNNs, LSTMs, and Hybrid models, this section presents the realization of these computational capabilities in operational DTOs. These AI-powered solutions overcome the data and computational limitations of traditional methodologies, providing the requisite speed, scalability, and accuracy for real-time monitoring, forecasting, and decision-making. The following subsections are structured to showcase the application of these models across two primary domains: first, in the creation of AI-powered surrogate models for general ocean state prediction, and second, in the deployment of specialized DTOs across key sectors of the blue economy, including environmental health, maritime logistics, renewable energy, and sustainable resource management.

3.1. Surrogate Models in DTOs

Advances in AI and surrogate modeling support the rapid development of DTOs. These technologies enable the construction of computationally efficient and accurate representations of complex ocean processes, supporting improved forecasting, scenario testing, and operational decision-making. Surrogate models, which are simplified, data-driven emulators of physics-based models, are especially valuable in DTOs. By learning from high-fidelity simulations or observations, they drastically reduce computational costs while maintaining sufficient accuracy. This allows rapid model runs in real time, enabling interactive scenario exploration for both scientific and policy applications. For instance, GLONET, a CNN trained on the GLORYS12 reanalysis dataset, demonstrated the ability to predict temperature, salinity, sea surface height, and ocean currents at 1/12° resolution, with experimental daily forecasts now integrated into the EDITO platform (European Digital Twin Ocean; [43], https://www.edito.eu/, last accessed online 10 October 2025). In addition to traditional metrics, GLONET is assessed with a dedicated NN-specific evaluation criterion described in [43], to meet the high standards required for operational deployment. It is also benchmarked against GLO12, the current Mercator Ocean high-resolution global physical analysis and forecasting system [64], and against one of the first neural global ocean forecasting systems [65], which was trained on the same GLORYS12 ocean reanalysis; GLONET shows comparable performance overall and improved performance for some variables. Similarly, AURORA, a foundation model for the Earth system [66], highlights the versatility of large-scale pretrained models that can be adapted to a variety of downstream ocean and climate prediction tasks.
At the coastal scale, accurate forecasting requires high-resolution numerical models capable of resolving mesoscale to sub-mesoscale processes and capturing non-linear interactions between tidal currents, waves, and local meteorology [67]. Assimilation of new observational sources, such as HF radar and SWOT satellites, further increases fidelity but also heightens computational demand. Here, ML-based surrogates provide an effective approach for addressing these applications. Recent advances with Fourier Neural Operators (FNOs) show that it is possible to emulate complex circulation models such as NEMO [68] at a fraction of the computational cost, while retaining skill for flood prediction and hazard assessment [69]. DTO forecasting, however, is not limited to spatially distributed ocean states. Many operational needs, such as port operations, offshore energy, or aquaculture, require pointwise time series forecasting of variables like sea surface temperature, wave height, or sea surface level. RNNs, and particularly LSTM models, have proven effective at capturing long-range dependencies in temporal datasets, with successful applications in meteorology [49]. These methods can be directly applied to oceanographic variables by training on buoy or grid-point data, producing site-specific forecasts that support operational planning. Sea level prediction provides a striking example of this evolution. The HIDRA family of models extends DL to tide gauge forecasting under sparse or missing data conditions. HIDRA3 [52] predicts sea surface height across multiple stations simultaneously, reconstructing missing signals in latent space to increase robustness. HIDRA-D [70] advances further by generating gridded sea level fields from sparse tide gauge inputs, offering spatially continuous predictions that can be seamlessly integrated into digital twins.
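Training a recurrent model on buoy or grid-point data, as described above, typically starts by slicing the record into fixed-length input windows paired with a future target value. The following Python sketch shows one such preparation step; the function name `make_windows`, the window length, and the sample values are hypothetical, and the cited studies may prepare their data differently.

```python
def make_windows(series, n_lags, horizon=1):
    """Split a time series into (input window, target) training pairs.

    Each input holds `n_lags` consecutive values; the target is the value
    `horizon` steps after the end of the window.
    """
    pairs = []
    last_start = len(series) - n_lags - horizon
    for start in range(last_start + 1):
        window = series[start:start + n_lags]
        target = series[start + n_lags + horizon - 1]
        pairs.append((window, target))
    return pairs


# Hypothetical hourly significant wave heights (m) from a single buoy.
pairs = make_windows([1.0, 1.2, 1.5, 1.4, 1.1], n_lags=3)
# pairs == [([1.0, 1.2, 1.5], 1.4), ([1.2, 1.5, 1.4], 1.1)]
```

The resulting pairs can be fed to an LSTM or any other supervised regressor; a record shorter than `n_lags + horizon` simply yields no pairs.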
Complementary innovations in data assimilation are also accelerating DTO development. A promising avenue is the use of NNs for super-resolution data assimilation, which addresses the common mismatch between high-resolution observations and lower-resolution models. Within EDITO, networks have been trained to upscale coarse fields, assimilate high-resolution observations, and then project results back into low-resolution model space with far greater accuracy than classical upsampling (https://github.com/AntoineBernigaud/Super_Resolution_Data_Assimilation, last accessed online 10 October 2025). This allows DTOs to capture finer details without the prohibitive cost of continuously running high-resolution simulations.
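The three stages of super-resolution data assimilation (upscale, assimilate, project back) can be sketched on a one-dimensional toy field. In the Python example below, nearest-neighbour upsampling stands in for the trained network and a simple nudging step stands in for the assimilation scheme; both substitutions, along with all function names and values, are assumptions for illustration and do not reproduce the EDITO implementation.

```python
def upsample(coarse, factor):
    """Nearest-neighbour upsampling; a stand-in for the trained network."""
    return [v for v in coarse for _ in range(factor)]


def assimilate(fine, observations, gain=0.5):
    """Nudge the fine field toward observations given as {index: value}."""
    analysis = list(fine)
    for idx, obs in observations.items():
        analysis[idx] += gain * (obs - analysis[idx])
    return analysis


def downsample(fine, factor):
    """Project back to the coarse grid by block averaging."""
    return [sum(fine[i:i + factor]) / factor for i in range(0, len(fine), factor)]


def super_resolution_da(coarse, observations, factor=2, gain=0.5):
    fine = upsample(coarse, factor)                  # 1. upscale coarse field
    analysis = assimilate(fine, observations, gain)  # 2. assimilate on fine grid
    return downsample(analysis, factor)              # 3. return to model space


# Two coarse cells, one high-resolution observation at fine index 1.
result = super_resolution_da([10.0, 20.0], {1: 12.0})
# result == [10.5, 20.0]
```

The point of the design is that observations are compared with the field at their own resolution, so fine-scale information reaches the coarse model state without the model itself running on the fine grid.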
Beyond day-to-day forecasting, DTOs are increasingly applied to extreme event prediction. Ref. [7] demonstrated the use of CNNs for forecasting marine heatwaves in the Mediterranean, showing the potential of DL for high-impact extremes. A comprehensive review article on AI for modeling and understanding extreme weather and climate events can be found in [71]. These advances are framed within a growing international context. The Digital Twins of the Ocean (DITTO) program, endorsed by the UN Decade of Ocean Science for Sustainable Development, is establishing common methodologies and best practices, ensuring interoperability between national and regional DTO efforts (https://ditto-oceandecade.org/, last accessed online 10 October 2025). Together with EDITO in Europe and other regional platforms, such initiatives are shaping a new era of ocean science, in which high-resolution observations, physics-based models, and ML converge to deliver timely and actionable insights for a sustainable blue economy. As a final remark, not all of these models can automatically be considered DTOs, as some lack the front end and the automatic two-way connection with the physical object, which here is the ocean. Most of the standalone trained AI models are therefore better regarded as Digital Shadows, because data flow automatically only from the physical object to the digital object (see Figure 1), whereas most of the models that belong to a DTO project such as EDITO, DITTO, or Iliad can be considered fully fledged DTOs with established two-way data flow. A schematic categorization of the models in this section can be seen in Figure 2.

3.2. Marine Environment and Ecosystem Health

Efforts to protect the marine environment (see UN Sustainable Development Goal 14 https://www.un.org/sustainabledevelopment/oceans/, last accessed 10 October 2025, and EU’s zero pollution ambition https://environment.ec.europa.eu/strategy/zero-pollution-action-plan_en, last accessed 10 October 2025) require continuous, high-fidelity monitoring and predictive modeling capabilities that surpass traditional methodologies. DTOs provide the necessary framework to integrate large-scale data with ML to track progress toward achieving Good Environmental Status (GES) across marine waters. This section details how ML is strategically deployed across three domains of environmental health: monitoring Water Quality and Biodiversity by inverting complex optical properties; ensuring timely response capabilities through automated oil spill detection and forecasting; and tackling pervasive anthropogenic waste via marine litter monitoring and prediction. Collectively, these applications demonstrate the DTO’s capacity to synthesize diverse data streams into actionable intelligence for effective policy implementation and sustainable management.

3.2.1. Water Quality and Biodiversity Monitoring

Despite nearly two decades of dedicated efforts to curb marine pollution, oceans worldwide continue to face the threat of irreversible damage. The UN Sustainable Development Goal 14 (specifically SDG 14.1) calls for urgent action to reduce pollution, including nutrient runoff and plastic waste, to safeguard marine life and human well-being. In line with this, the EU’s zero pollution ambition for 2050 sets concrete targets: halving marine plastic litter and reducing microplastic emissions by 30% by 2030. Furthermore, the Marine Strategy Framework Directive (MSFD) strives to achieve good environmental status (GES) across European marine waters. Meeting these ambitious objectives in Europe, and contributing to global progress, will require countries to adopt fundamentally different and more transformative approaches to policy implementation. To quantitatively assess progress against these regulatory targets and safeguard ecosystems, a standardized metric is required.
The Water Quality Index (WQI) is a widely used metric to quantitatively assess water quality [72], particularly in ecosystems subject to anthropogenic and environmental pressures [73]. This is directly linked to the assessment of marine biodiversity, as water quality is an important factor for the health and survival of aquatic ecosystems [41,74,75]. A DTO relies on a range of indicators for WQI, including physical and biogeochemical parameters such as temperature, salinity, pH, dissolved oxygen, chlorophyll-a, ammonia, nitrogen, optical depth, and suspended sediments. There are different methods and approaches in deriving these parameters, depending on data accessibility and the study’s objective. Measuring these indicators on an operational basis can be achieved utilizing three primary data sources: in situ monitoring, remote sensing, and environmental modeling. For the scope of this review, the emphasis is placed on applications incorporating satellite remote sensing and in situ measurements, owing to the computational demands of environmental modeling and the inherent requirement for physical validation.
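To show how several indicators can be combined into a single index, the following Python sketch uses a weighted arithmetic mean of sub-index scores, which is one common way of aggregating a WQI. The text above does not specify an aggregation formula, so this choice, the 0-100 scoring scale, the parameter names, and the weights are all assumptions for illustration.

```python
def weighted_wqi(sub_indices, weights):
    """Weighted arithmetic mean of sub-index scores (each on a 0-100 scale).

    `sub_indices` and `weights` are dicts keyed by parameter name.
    """
    if sub_indices.keys() != weights.keys():
        raise ValueError("sub-indices and weights must cover the same parameters")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[k] * sub_indices[k] for k in sub_indices) / total_weight


# Hypothetical scores and weights for two indicators.
scores = {"dissolved_oxygen": 80.0, "chlorophyll_a": 60.0}
weights = {"dissolved_oxygen": 3.0, "chlorophyll_a": 1.0}
wqi = weighted_wqi(scores, weights)  # (3*80 + 1*60) / 4 = 75.0
```

In practice, each raw measurement (for example, dissolved oxygen in mg/L) must first be converted to a sub-index score by a rating curve, a step omitted here.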
DTOs can leverage existing in situ time series datasets through ML applications to directly forecast water quality variables. For example, robust pipelines using models like Long Short-Term Memory (LSTM) and Back Propagation Neural Networks (BPNN) [75] have been developed to forecast Chlorophyll-a for early warning of algal blooms. However, although in situ monitoring with commercial IoT-enabled sensors provides direct measurements, it can be expensive to acquire and maintain, labor-intensive, and geographically restricted.
Ocean color (OC) remote sensing measurements offer a cost-effective alternative for monitoring water quality variables and marine biodiversity across wider areas, by providing high-frequency synoptic views of spectral water-leaving reflectances (Rrs(λ)), an ocean Apparent Optical Property (AOP). An AOP is a property that depends on the ambient light field. OC is intrinsically linked to the ocean’s Inherent Optical Properties (IOPs): the absorption (a(λ)) and scattering (b(λ)) properties of dissolved and particulate water constituents. These IOPs are independent of the light field and define the water’s optical characteristics, so that reflectance can be written as a function of them, Rrs(λ) = f(a(λ), b(λ)). This relationship allows OC data to be used for the estimation of both IOPs and other AOPs, such as the diffuse attenuation coefficient (Kd(λ)), which is used for solar penetration depth estimation, as well as biogeochemical parameters like Chlorophyll-a, Dissolved Oxygen, and Nitrates [76]. These properties provide quantitative information about the upper ocean’s biogeochemical composition, supporting the estimation of the WQI [73,76].
The transition to ML is driven by the intrinsic limitations of traditional approaches. Traditional methods for deriving these variables, include empirical relationships between the target variable and AOPs [77,78,79,80,81,82], other water constituents [83,84,85], or IOPs [86] and semi-analytical approaches based on radiative transfer models [87,88,89,90,91]. Empirical algorithms are primarily used for regional studies, but are prone to errors due to spatiotemporal limitations of their derivation datasets and their difficulty in differentiating between various water components that do not co-vary with the target variable [92]. Furthermore, semi-analytical models like the QAAs can be effective for open ocean applications, but do not always perform well on turbid coastal environments, particularly where high concentrations of colored dissolved organic matter (CDOM) are found, which affects the IOPs in the QAA relationships and hence, their results [89,93].
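An empirical relationship between a target variable and an AOP can be illustrated with the polynomial band-ratio form commonly used for Chl-a retrieval, in which the logarithm of Chl-a is a polynomial in the logarithm of a blue-to-green reflectance ratio. The Python sketch below implements that general form only; the coefficient values and reflectances are hypothetical placeholders, not those of any operational algorithm or of the cited studies.

```python
import math


def band_ratio_chl(rrs_blue_bands, rrs_green, coeffs):
    """Polynomial band-ratio estimate of Chl-a (mg/m^3).

    x = log10(max(blue Rrs) / green Rrs)
    log10(Chl-a) = sum(coeffs[i] * x**i)
    """
    if rrs_green <= 0 or max(rrs_blue_bands) <= 0:
        raise ValueError("reflectances must be positive")
    x = math.log10(max(rrs_blue_bands) / rrs_green)
    log_chl = sum(c * x ** i for i, c in enumerate(coeffs))
    return 10.0 ** log_chl


# Hypothetical two-term fit: log10(Chl-a) = 0.0 - 2.0 * x.
chl = band_ratio_chl([0.004, 0.005], 0.005, (0.0, -2.0))  # blue/green = 1
```

Because the coefficients are fitted to a particular dataset, such a relationship inherits the spatiotemporal limits of that dataset, which is the weakness of empirical algorithms noted above.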
ML models provide the necessary solution to overcome these constraints. ML applications are increasingly utilized to estimate IOPs, AOPs and bio-optical variables from remote sensing measurements providing a scalable and efficient solution for comprehensive WQI monitoring, overcoming the constraints of limited spatial and temporal coverage inherent in traditional methods (Figure 3). ML approaches can also represent more complex biophysical interactions and non-linear relationships, often making them more suitable than parametric models [27,94,95]. Additionally, recent research has demonstrated the utility of ML in addressing specific limitations of conventional monitoring [9,95,96,97,98].
ML methods are applied to Rrs data for deriving Chl-a products. Ref. [35] developed an ML approach using SVR to estimate chlorophyll-a (Chl-a) concentrations from satellite measurements at a global scale. This study addresses the limitations of traditional band-ratio algorithms (OCx) and the hybrid Ocean Color Index (OCI) approach by demonstrating that the SVR model provides a single, unified algorithm that performs well across a wide range of water types (0.01 to 1 mg/m³). Notably, the SVR approach was found to reduce image noise and improve consistency across multiple satellite sensors, including SeaWiFS, MODIS/Aqua, and VIIRS. This cross-sensor consistency is an important capability for DTOs that rely on long-term, multi-decadal data records. Addressing the limited spatial and temporal coverage of standard satellite Chl-a products, particularly in dynamic coastal regions, ref. [33] introduced a scene-specific RF-based Regression Ensemble (RFRE) model to increase Chl-a data availability. This approach leveraged the high spatial coverage of the non-saturating Color Index (CI) and Rayleigh-corrected reflectance (Rrc) data (from land bands) to accurately recover missing Chl-a pixels. By training an independent RFRE model for each MODIS scene, the method converted the CI and Rrc inputs into Chl-a values over pixels typically masked by sun glint, thin clouds, or thick aerosols. Tested over the Yellow Sea and East China Sea, this ML technique increased Chl-a coverage without compromising data quality, providing a method for filling gaps in time-series analysis and bloom monitoring. For complex coastal environments where physical and biogeochemical factors interact strongly, ref. [34] demonstrated a DTO-relevant approach by employing a CNN that integrates satellite ocean color data (Chl-a, CDOM, TSS) with verified hydrodynamic model outputs (currents, temperature, salinity) to estimate Chl-a distribution.
By using a segmented image approach to increase training data 300-fold, their optimal model achieved high accuracy, showcasing the efficacy of combining remote sensing and physical modeling inputs via DL for coastal water quality. A further evolution of this synergistic approach was presented by [56], who developed a coupled Hydro-Biogeochemical-CNN (HBGC-CNN) model specifically for daily, spatial prediction of Chl-a in the turbid Bohai Sea. This method utilized the process-based HBGC model to provide key environmental and nutrient variables (e.g., temperature, salinity, Dissolved Inorganic Nitrogen, zooplankton) as input features to the CNN, which was then trained using satellite-derived Chl-a products. The HBGC-CNN successfully reproduced daily and seasonal Chl-a variations, achieving the best performance metrics among tested ML models (ANN, SVR, RF). This hybrid method was shown to reconstruct gap-free Chl-a products for historical reanalysis and accurately predict the spatiotemporal distribution of harmful algal blooms (HABs) triggered by a typhoon, highlighting its utility for operational forecasting in coastal environments with frequent cloud cover and high turbidity.
Considering the derivation of IOPs through ML models, ref. [95] developed an MLP NN to derive IOPs from total light absorption and particulate absorption coefficients using data from six common ocean color sensor wavelengths, addressing the computational limitation of existing methods and their reliance on ancillary inputs or assumptions about the spectral shapes of subcomponents. The NNs were designed to retrieve the absorption coefficients for phytoplankton (aph) and a combination of gelbstoff and detritus (adg), which are key water quality parameters. The models were trained using a synthetic dataset and validated using publicly available datasets, demonstrating good accuracy. To improve performance in turbid coastal waters, they developed a hybrid algorithm (QAANN) by integrating their NNs with the Quasi-Analytical Algorithm (QAA). This hybrid model achieved a reduction in mean absolute percentage error (MAPE) for retrieving both aph(443) and adg(490) when compared to existing semi-analytical algorithms. A different strategy to enhance analytical methods in challenging waters was proposed by [99], who introduced a DL-assisted quasi-analytical algorithm (QAA-DL) to improve IOP retrievals over inland and nearshore coastal waters. The QAA-DL integrates an artificial NN (ANN) into the traditional QAA framework to re-parameterize the algorithm specifically for extremely turbid waters, where conventional models struggle or fail to produce valid results (often yielding negative values). By developing two ANN models to predict the total absorption coefficient, the method increased valid data coverage in turbid regions, providing more than double the usable observations compared to standard algorithms (QAAv6 and GIOP). This synergistic approach thus effectively balances the physical interpretability of QAA with the nonlinear modeling power of DL to overcome the long-standing challenge of IOP estimation in optically complex environments.
The Kd(λ) is an AOP related to the solar penetration depth and hence to light availability in the upper ocean layer and its ecosystems. Refs. [27,100] each developed an Artificial Neural Network (ANN) to predict the Kd(λ). Ref. [100] developed a deep learning model, termed DLKPAR, that directly estimates the Kd of Photosynthetically Active Radiation, Kd(PAR), using remote sensing reflectance Rrs(λ) at seven specific wavelengths, thereby circumventing the requirement for additional empirical conversion formulas. Ref. [27], conversely, developed a Multi-Layer Perceptron (MLP) regression model that utilizes the routinely measured in situ beam attenuation coefficient profiles, alongside remote sensing data, to derive a gridded Kd(490) field for the Eastern Mediterranean Sea.
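The link between Kd and solar penetration depth can be illustrated with the standard exponential attenuation law, E(z) = E(0)·exp(−Kd·z). The sketch below is a simplification that assumes a vertically uniform Kd; the function names and the sample Kd value are hypothetical and are not taken from the cited studies.

```python
import math

def irradiance_fraction(kd, depth_m):
    """Fraction of surface downwelling irradiance remaining at depth_m,
    assuming a depth-independent attenuation coefficient kd (m^-1)."""
    return math.exp(-kd * depth_m)

def first_optical_depth(kd):
    """Depth (m) at which irradiance falls to 1/e of its surface value."""
    return 1.0 / kd

def euphotic_depth(kd_par, light_fraction=0.01):
    """Depth (m) at which PAR falls to light_fraction of its surface value.
    The conventional 1% level gives roughly 4.6 / Kd(PAR)."""
    return -math.log(light_fraction) / kd_par

kd_par = 0.10  # hypothetical Kd(PAR) in m^-1
print(f"first optical depth: {first_optical_depth(kd_par):.1f} m")
print(f"euphotic depth (1% PAR): {euphotic_depth(kd_par):.1f} m")
print(f"PAR fraction at 20 m: {irradiance_fraction(kd_par, 20.0):.3f}")
```

In this simplified picture, a gridded Kd product such as those produced by the ML models above translates directly into maps of light availability for the upper ocean.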
The authors of [41] developed a series of robust GPR models to estimate concentrations of global ocean surface dissolved oxygen and macronutrients (nitrate, phosphate, and silicate) from satellite data. Their methodology identified optimal input parameters as sea surface temperature (SST), sea surface salinity (SSS), and latitude/longitude, and demonstrated that including dissolved oxygen further improved macronutrient estimates. The study concluded that GPR models provide highly accurate and reliable satellite-based products for monitoring these non-optically active variables, which are otherwise difficult to observe. Furthermore, ANNs have been employed to infer the three-dimensional structure of the ocean from surface-only satellite data, providing a viable method to reconstruct subsurface chlorophyll and temperature fields where in situ data are sparse [48]. In another example of remote sensing-based applications, ref. [101] developed a high-resolution method to monitor nutrient concentrations in coastal waters. The study addressed the challenge of monitoring non-optically active nutrients like dissolved inorganic nitrogen (DIN) and orthophosphate-phosphorus (PO4) by creating a matched dataset between in situ measurements and satellite data from both Sentinel-2 and Sentinel-3. Their methodology utilized a ML inversion model, with results indicating that a quadratic SVM performed best for DIN concentration and an exponential GPR performed best for PO4 concentration, demonstrating high applicability for monitoring aquaculture areas. Extending ML application to biogeochemical components, ref. [102] successfully estimated sea-surface Particulate Organic Carbon (POC) concentrations in the Mediterranean Sea using a highly optimized ML model. To select optimal input features from a pool of 47 environmental factors (including IOPs, AOPs, nutrients, and physical parameters), they employed the Geographic Detector method. 
The overall best performance was achieved by the tuneRanger RF (TRRF) model, which was optimized using Bayesian optimization, yielding high accuracy. This ML approach utilized multiple satellite-derived physical and biogeochemical inputs to accurately map POC distribution at the basin scale, providing a useful tool for monitoring the ocean carbon cycle in semi-enclosed, human-impacted environments. Ref. [103] developed an innovative Deep NN (DNN) approach to retrieve the particulate backscatter coefficient (bbp) and POC from the spaceborne lidar CALIOP (Cloud-Aerosol Lidar with Orthogonal Polarization), utilizing active remote sensing data for global carbon cycle monitoring. This method addresses the limitations of passive ocean color by utilizing lidar’s depth-resolving capabilities and its independence from sunlight, enabling measurements at night and in polar regions where ice and cloud cover are prevalent. The DNN was trained by pairing lidar signals with collocated daytime MODIS products, effectively learning the non-linear relationship between CALIOP data and bbp/POC without relying on restrictive, error-prone physics-based assumptions about conversion factors. The results showed that the DNN-derived bbp and POC were in better agreement with independent Argo float data than those derived from traditional methods. This work showcases the potential of combining active remote sensing with advanced ML to generate gap-filled ocean data products for DTOs globally. In a non-active sensing effort to address this data gap challenge, ref. [96] focused on the polar regions by employing ensemble ML models (RF and Extremely Randomized Tree) for the spatial and temporal reconstruction of Chl-a data. To circumvent cloud cover limitations, their models were trained exclusively using cloud-free microwave and reanalysis data (including SST, sea ice, atmospheric temperature, wind, and photosynthetically active radiation).
This purely predictive approach successfully generated high-resolution (4 km) daily Chl-a data, with reconstructed pixels showing strong agreement with in situ measurements in areas missed by satellites. The study demonstrated the robustness of using readily available environmental predictors and ML to drastically increase ocean color data coverage in perpetually cloudy environments.
The data from these methods are then used to calculate a single, quantitative WQI through a process involving indicator selection, weighting, and aggregation [73]. However, these traditional WQI methods are often criticized for their inherent uncertainty and inconsistency due to issues like “eclipsing” (overestimation of the WQI) and “ambiguity” (underestimation or overestimation of the WQI values) [74]. A recent study by [74] provides an example of how ML can be applied to address the issues of traditional WQI models. The researchers developed a data-driven methodology for coastal WQI assessment, training and validating eight different ML algorithms to directly predict WQI values and bypass the problematic sub-index and weighting components of conventional models. The study found that tree-based and ensemble models, including DT, Extra Tree (ExT), Extreme Gradient Boosting (XGB), and RF, consistently demonstrated superior performance, providing a more reliable and consistent method for WQI assessment. This approach is highly applicable for operational uses, as the trained ML models can be integrated into real-time monitoring systems to provide rapid and accurate WQI predictions, eliminating the need for manual calculations. However, for continuous operational use, this methodology has disadvantages. The primary constraint is the significant data requirement, needing an extensive and high-quality time series dataset for robust results. A prerequisite for applying this method is the operational availability of commercial WQ IoT-enabled sensors or satellite sensors to ensure a continuous supply of data for the ML prediction models. Moreover, as the models are trained on specific regional data, they may not be directly transferable to other coastal areas without retraining. The risk of overfitting is also a key challenge, as models trained on insufficient or unrepresentative data may fail to generalize to new, unseen operational data.
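The "eclipsing" problem can be illustrated with a small numerical sketch. It assumes a weighted arithmetic mean as the aggregation function, which is only one of several options used in WQI models, and the indicator names, sub-index scores (0 to 100, higher is better), and equal weights are hypothetical values chosen for illustration.

```python
def weighted_mean_wqi(sub_indices, weights):
    """Aggregate sub-index scores into one WQI using a weighted arithmetic mean."""
    total_weight = sum(weights[name] for name in sub_indices)
    return sum(sub_indices[name] * weights[name] for name in sub_indices) / total_weight

def minimum_operator_wqi(sub_indices):
    """Alternative aggregation: the WQI equals the worst sub-index score."""
    return min(sub_indices.values())

# Hypothetical sub-index scores: three good indicators and one severely degraded one.
sub_indices = {
    "chlorophyll_a": 90.0,
    "dissolved_oxygen": 15.0,
    "nitrate": 92.0,
    "turbidity": 88.0,
}
weights = {name: 0.25 for name in sub_indices}

wqi_mean = weighted_mean_wqi(sub_indices, weights)
wqi_min = minimum_operator_wqi(sub_indices)
print(f"weighted-mean WQI: {wqi_mean:.2f}")
print(f"minimum-operator WQI: {wqi_min:.2f}")
```

With these inputs the weighted mean is 71.25 even though dissolved oxygen scores only 15, so the aggregate overstates water quality. The minimum operator is shown only for contrast; it avoids eclipsing at the cost of ignoring every indicator except the worst one.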

3.2.2. Oil Spill Detection and Forecasting

Oil spills can severely damage ecosystems and cleanup alone for medium-to-large spills costs an estimated $2.4–9.4 billion [104]. Thus, the International Convention on Oil Pollution Preparedness, Response and Co-operation (OPRC) requires signatory states to establish emergency plans, response capacity, and international cooperation for oil spills. On a European level, the use of DTOs for oil spill mapping and mitigation increases compliance with Directive 2013/30/EU, which requires minimization of spill risk through integrated emergency planning, comprehensive risk assessment, and effective cross-border collaboration. Nonetheless, without proper real-time hazard mapping and forecasting capabilities, authorities struggle to identify oil spills. In the event of a spill, the absence of harmonized frameworks and coordination across jurisdictions results in delayed responses, particularly in cross-border incidents. Digital twin technologies can enhance oil spill preparedness and response by integrating real-time satellite data, oceanographic models, and ML algorithms into a cohesive operational framework [44,105]. These systems provide authorities with advanced tools for rapid detection, tracking, and forecasting of spill trajectories (Figure 4), enabling more effective and timely mitigation efforts.
The implementation of this enhanced response capability is predicated on the automated identification of oil spills from satellite imagery using advanced ML techniques. Foundational work in this area was presented by [106], who developed a large dataset of satellite images and trained a deep NN based on the U-Net architecture for oil spill segmentation. This approach demonstrated the high potential of DL for reliable, automated spill detection, forming a basis for operational monitoring systems. While DL has shown great promise, other ML classifiers are also highly effective. For example, ref. [107] developed a detection system using RF classifiers on Synthetic Aperture Radar (SAR) imagery, demonstrating another robust methodology for automated spill identification.
Building on this, the detection concept has been adapted and operationalized within DTO frameworks, moving the focus from detection to trajectory prediction and response. For instance, ref. [44] describes the development of a DTO for Coastal Crete where the AI-based identification model is containerized into an application package. This DTO integrates the ML detection module with higher-resolution wind and hydrodynamic forecasting models, providing a complete tool for both identifying a spill in near real time and predicting its fate within 2 days. Similarly, ref. [105] developed a digital twin for managing oil spill incidents in the North Aegean, showcasing the regional applicability and scalability of this approach. Further advancements focus on improving detection accuracy and providing more detailed characterization of spills using more sophisticated data and models. Several studies have explored the use of hyperspectral remote sensing, which provides richer spectral information than traditional satellite imagery. Ref. [108] utilized a Deep CNN (DCNN) with multi-scale features for hyperspectral detection, while [109] employed an adaptive optimizer to enhance detection performance. More recently, research has combined hyperspectral with thermal infrared data to further improve the identification of marine oil pollution [110]. Beyond detection, ML models are also being developed for more specific tasks, such as identifying the type of oil spill [45], which supports planning the appropriate response strategy.
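The coupling of a detected spill position with wind and current forecasts can be sketched as a simple Lagrangian drift calculation. This is not the method of the cited DTOs: the 3% wind drift factor is a common rule of thumb assumed here, the forcing fields are held constant, spreading, weathering, and diffusion are ignored, and the starting position and velocities are hypothetical.

```python
import math

METERS_PER_DEG_LAT = 111_320.0  # approximate length of one degree of latitude
WINDAGE = 0.03                  # assumed fraction of wind speed transferred to the slick

def drift(lon, lat, current_uv, wind_uv, hours, dt_s=600.0):
    """Advect one surface particle with forward Euler steps.

    current_uv and wind_uv are (eastward, northward) velocities in m/s,
    held constant over the whole forecast for simplicity.
    """
    steps = int(hours * 3600 / dt_s)
    for _ in range(steps):
        u = current_uv[0] + WINDAGE * wind_uv[0]
        v = current_uv[1] + WINDAGE * wind_uv[1]
        lat += v * dt_s / METERS_PER_DEG_LAT
        lon += u * dt_s / (METERS_PER_DEG_LAT * math.cos(math.radians(lat)))
    return lon, lat

# Hypothetical detection north of Crete with a 48 h forecast horizon.
start_lon, start_lat = 25.10, 35.40
end_lon, end_lat = drift(start_lon, start_lat,
                         current_uv=(0.10, -0.05),
                         wind_uv=(6.0, -2.0),
                         hours=48)
print(f"start: ({start_lon:.3f}, {start_lat:.3f})  end: ({end_lon:.3f}, {end_lat:.3f})")
```

An operational system would replace the constant velocities with time-varying, gridded forecast fields and release many particles from the detected slick outline rather than a single point.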

3.2.3. Marine Litter Monitoring

UN Ocean Decade Challenge 1, entitled ‘Understand and Beat Ocean Pollution’, calls on countries to map ocean pollution, understand its impacts, and develop solutions to mitigate and reduce it. Furthermore, SDG target 14.1 aims to prevent and reduce marine pollution by 2030. Anthropogenic marine litter, predominantly plastics [111,112], poses a significant and increasing threat to global marine ecosystems, biodiversity, and human health [113,114,115,116]. The urgency of monitoring comes from the vast quantities discharged annually [112], the long-term persistence of plastics, and their inevitable fragmentation into microplastics (MPs, particles < 5 mm) [117] that infiltrate the food web [118]. Historically, monitoring relied on manual, labor-intensive, and costly methods such as beach surveys, net trawl expeditions, and underwater video inspections [119,120,121]. These conventional approaches are severely limited in spatial and temporal coverage, often lacking the consistency and scale needed for effective large-scale assessment and mitigation policy formulation [122]. ML, combined with RS technologies, offers a transformative solution [42,46]. It provides the necessary speed, scalability, and high accuracy to automatically identify, classify, quantify, and predict the distribution of debris across highly complex and varying marine and coastal environments, enabling a shift toward continuous, cost-effective monitoring strategies. The integration of multi-sourced data and models across different spatial domains, from the open ocean surface and dynamic coastlines to the ocean’s bottom, defines the scope of a comprehensive ML framework, applicable to DTOs addressing marine litter (Table 2).
The initial focus for ML application lies in automatically detecting and classifying floating plastic debris on the sea surface using satellite data and vessel-based imagery across macro- and microplastic scales. Initial efforts employing Sentinel-2 multi-spectral imagery addressed the limitation of small training datasets for macroplastic detection. For instance, ref. [51] achieved high accuracy using a GAN-RF model, where the GAN synthesized needed plastic samples to overcome data scarcity. Similarly, ref. [36] confirmed the strong performance of RF when bolstered by spectral feature engineering, validating the use of indices like the Floating Debris Index (FDI) and the non-linear kernel Normalized Difference Vegetation Index (kNDVI) for robust classification. These methods establish the core techniques for generating high-fidelity, large-scale maps of surface plastic distribution, which serve as inputs for initializing and validating marine plastic transport models within a DTO. To enhance spectral discrimination beyond Sentinel-2’s capabilities, multiple studies have explored advanced sensors and improved classification techniques. Ref. [42] successfully leveraged hyperspectral PRISMA data combined with a hybrid unsupervised (K-Means) and supervised (LGBM) approach to detect floating objects down to 5 m resolution. Reinforcing the utility of spectral information, ref. [123] demonstrated the efficacy of the Naive Bayes algorithm using the FDI for discriminating macro-plastics from natural debris (seaweed/sea foam) in complex Brazilian coastal waters. Meanwhile, ref. [124] validated the feasibility of automated classification for macro-litter categories (bottles, buckets, and straws) directly from vessel-based imagery using the VGG16 CNN architecture.
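The spectral indices mentioned above can be computed per pixel from Sentinel-2 band reflectances. The sketch below follows the FDI formulation commonly attributed to Biermann et al. (2020) and the simplified kNDVI form tanh(NDVI²); both formulas, the band assignments, and the approximate central wavelengths are assumptions brought in for illustration and should be checked against the cited studies before use. The sample reflectances are hypothetical.

```python
import math

# Approximate Sentinel-2 central wavelengths in nm (assumed values):
# B4 (red), B8 (NIR), B11 (SWIR1). The red-edge band used below is B6.
LAMBDA_RED = 664.6
LAMBDA_NIR = 832.8
LAMBDA_SWIR1 = 1613.7

def fdi(r_re2, r_nir, r_swir1):
    """Floating Debris Index: NIR reflectance minus a baseline
    interpolated between the red-edge (B6) and SWIR1 (B11) bands."""
    scale = (LAMBDA_NIR - LAMBDA_RED) / (LAMBDA_SWIR1 - LAMBDA_RED) * 10.0
    baseline = r_re2 + (r_swir1 - r_re2) * scale
    return r_nir - baseline

def kndvi(r_red, r_nir):
    """Kernel NDVI in its simplified form, tanh(NDVI^2)."""
    ndvi = (r_nir - r_red) / (r_nir + r_red)
    return math.tanh(ndvi ** 2)

# Hypothetical reflectances for an open-water pixel and a pixel with floating material.
water = {"red": 0.020, "re2": 0.015, "nir": 0.012, "swir1": 0.008}
debris = {"red": 0.040, "re2": 0.050, "nir": 0.080, "swir1": 0.030}

for name, px in (("water", water), ("debris", debris)):
    print(name,
          f"FDI = {fdi(px['re2'], px['nir'], px['swir1']):.4f}",
          f"kNDVI = {kndvi(px['red'], px['nir']):.4f}")
```

Index values such as these would typically be stacked with the raw bands as input features for a classifier such as RF or Naive Bayes, rather than thresholded on their own.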
While these spectral methods support large-scale maritime surveillance, effective debris management necessitates high-resolution detection and quantification in dynamic, localized coastal environments. This operational requirement is met by applications focusing on the practical monitoring of plastics accumulated on coastlines, beaches, and artificial shore structures, typically utilizing high-resolution drone or aerial imagery to manage complex, site-specific backgrounds. Research in direct shoreline monitoring employs sophisticated DL models and high-resolution aerial imagery to accurately detect and quantify debris, even in complex, human-modified environments. Ref. [46] developed a two-stage CNN system, APLASTIC-Q, that performs initial detection before quantifying items into fourteen specific classes, establishing high performance benchmarks for debris counting in diverse riverine and beach settings. Building on this, ref. [59] enhanced the YOLOv8 DL model to address the challenge of false positives by successfully training the model to distinguish plastic from tourists on tourism-intensive artificial coastlines. This combination of high-fidelity detection and robust environmental discrimination provides DTOs with near-real-time, item-level ground truth data necessary for localized cleanup and waste management modeling. Complementary to direct quantification, ML models are used to predict microplastic accumulation hotspots by incorporating coastal morphometric data. Ref. [125] utilized geotechnologies and a GB model to successfully predict microplastic distribution based on influential physical parameters like beach face slope and orientation. Integrating such predictive models allows DTOs to accurately forecast beach susceptibility and dynamically adapt coastal pollution mitigation and sampling strategies based on environmental factors.
However, to deliver a complete operational picture, the ability to detect surface and coastal macro-litter must be computationally coupled with models to address the three-dimensional fate of microplastics and monitor the inaccessible subsurface environment. This final layer of DTO capability focuses on computational models for predicting microplastic fate in the water column and the necessary technologies for monitoring hard-to-reach subsurface environments, along with the automation of microplastic laboratory analysis. Predictive modeling and laboratory automation offer scalable solutions for understanding and characterizing microplastic distribution and hazards. Ref. [126] successfully modeled the global abundance of MPs using the XGBoost algorithm, utilizing SHAP analysis to identify key drivers like the distance from the coast and human development index. Complementing this, ref. [127] introduced a highly accessible lab analysis method using DL (U-Net for segmentation and VGG16 for classification) to automatically count and classify MPs from images taken with a digital camera or mobile phone. These integrated computational tools provide DTOs with predictive capabilities for microplastic pathway analysis and support the automated processing of validation data needed for model calibration. Dedicated DL systems support the automation of litter detection in hard-to-access subsurface environments. Ref. [47] developed MLDet, a model specifically tailored for underwater marine litter detection using Autonomous Underwater Vehicles (AUVs), which utilized enhanced features like Deformable Convolutional Networks (DCN) for robust shape recognition. Similarly, ref. [63] applied the Mask R-CNN to successfully detect seafloor marine litter in towed camera images, establishing the efficacy of DL for handling degraded and obscured debris.
Automating image processing for AUV and towed systems allows DTOs to assimilate large volumes of continuous, three-dimensional data on submerged debris locations, facilitating optimized robotic cleanup missions.
Table 2. Overview of AI Techniques for Marine Litter Monitoring across Diverse Domains.
| Monitoring Domain | AI Technique | Data Source | Primary Function | Key Advantage/Context | Reference |
|---|---|---|---|---|---|
| Surface (Wide-Area) | GAN-RF | Sentinel-2 Multispectral | Macroplastic Detection & Classification | Overcomes data scarcity; large-scale mapping. | [51] |
| | Random Forest | Sentinel-2 (Spectral Indices) | Macroplastic Discrimination | Robust with feature engineering (FDI). | [36] |
| | K-Means/LGBM (Hybrid) | Hyperspectral (PRISMA) | Floating Object Detection | Enhanced spectral discrimination. | [42] |
| | Naive Bayes | Spectral Indices (FDI) | Macro-plastic vs. Natural Debris | Effective for discrimination. | [123] |
| Coastal/Shoreline (Item-Level) | APLASTIC-Q (CNN) | Drone/Aerial Imagery | Debris Detection & Quantification | High-resolution item counting; multi-class. | [46] |
| | YOLOv8 (DL) | Drone/Aerial Imagery | Plastic vs. Non-Plastic Discrimination | Reduced false positives on complex backgrounds. | [59] |
| | Gradient Boosting | Coastal Morphometric Data | Microplastic Hotspot Prediction | Identifies accumulation areas. | [125] |
| Subsurface & Prediction (3D) | XGBoost | Global Environmental Data | Microplastic Global Abundance Modeling | Identifies key drivers; large-scale prediction. | [126] |
| | U-Net/VGG16 (DL) | Mobile Phone Images | Microplastic Lab Analysis (Count/Class) | Accessible, automated lab analysis. | [127] |
| | MLDet (DCNN) | AUV Imagery | Underwater Litter Detection | Robust shape recognition; AUV-specific. | [47] |
| | Mask R-CNN (CNN) | Towed Camera Images | Seafloor Litter Detection | Accurate object segmentation. | [63] |

3.3. Port Safety and Ship Routing

Transitioning from environmental monitoring to operational optimization, the Digital Twin concept is important for global logistics and maritime safety, offering advanced solutions to escalating challenges [128]. Ports play a central role in the global economy: over 80% of world trade by volume (and a substantial share by value) is carried by sea, and global seaborne trade reached about 12.3 billion tonnes in 2023 [129]. If current UNCTAD (United Nations Trade and Development [129]) growth rates persist (≈2–2.4% p.a.), global cargo volumes could increase roughly 18–20% by 2030 compared with 2023, placing heavy additional demand on port infrastructure and operations. At the same time, ports face growing climate and hazard exposure: a recent global hazard-level assessment finds that many ports are exposed to multiple hazards (≈86% exposed to three or more hazards) and estimates port-specific risks of physical damage and service disruption at roughly US$7.5 billion per year [130]. Many ports globally are already struggling to cope with a combination of rising volumes, increased frequency and severity of extreme weather events, sea-level rise, and other related hazards. In Europe, 74% of all goods entering or leaving the EU are transported by sea (EU Blue Economy Report 2025 [131]). Additionally, in 2023, European seaports handled around 3.4 billion tonnes of freight (Eurostat, data on maritime transport of goods). With cargo volumes expected to surge by 20% by 2030 and with more frequent disruptions from extreme weather events, better risk assessment and mitigation solutions are increasingly needed.
These persistent and escalating challenges, coupled with the critical need for safety, efficiency, and environmental protection, make port and ship routing operations a suitable domain for DTOs. Digital twin technologies in ports and maritime operations create real-time virtual replicas of physical assets, such as terminals, vessels, yard equipment, and environmental conditions, by integrating IoT, sensor networks, and advanced analytics. These DTOs enable situational awareness, predictive simulation, decision support, and operational optimization with strong implications for sustainability and efficiency [128]. ML and AI are increasingly integrated within these DTO ecosystems: predictive models forecast container dwell time, truck movements, vessel arrival and turnaround, and even hazardous cargo container status, enhancing throughput and reducing emissions in container terminals and port logistics [132].
A prime application of ML-enhanced DTOs is the computation of optimal vessel trajectories for low-carbon maritime operations. The VISIR-II framework exemplifies such a DTO component, designed to minimize CO2 emissions, fuel consumption, and transit time under dynamic oceanographic and meteorological conditions through the employment of Dijkstra’s algorithm for minimum-cost path determination [133,134]. This approach has been applied within broader DTO initiatives, such as the ILIAD Digital Twin of the Ocean framework, where traditional graph-based optimization was applied for route planning ([135], https://ocean-twin.eu/marketplace/product/ship-routing, last accessed online 10 October 2025). The v6 iteration of VISIR-II has evolved into a hybrid system through the strategic integration of ML. This hybrid structure incorporates pre-trained NN models, specifically MLP regressors, which function as computationally efficient surrogate estimators for vessel performance metrics under dynamic conditions [136]. This capacity for rapid emulation is used to generate synthetic datasets of optimal routing solutions, which can be employed to train next-generation AI systems for autonomous vessels, thereby mitigating data scarcity limitations inherent in AIS-based data [133].
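The minimum-cost path step can be illustrated with a generic implementation of Dijkstra's algorithm. This is a minimal sketch and not the VISIR-II code: the waypoint names and edge costs are hypothetical, and the costs are static, whereas an operational routing model would derive them from time-varying wave, wind, and current fields.

```python
import heapq

def dijkstra(graph, source, target):
    """Return (total_cost, path) for the cheapest route from source to target.

    graph maps each node to a dict {neighbor: non-negative edge cost}.
    """
    best = {source: 0.0}     # cheapest known cost to reach each node
    previous = {}            # back-pointers used to rebuild the route
    queue = [(0.0, source)]  # priority queue ordered by accumulated cost
    visited = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, edge_cost in graph.get(node, {}).items():
            new_cost = cost + edge_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return best[target], path

# Hypothetical waypoint graph; edge costs could stand for tonnes of CO2 per leg.
graph = {
    "port_A": {"wp1": 4.0, "wp2": 2.5},
    "wp1": {"port_B": 3.0},
    "wp2": {"wp1": 1.0, "wp3": 4.5},
    "wp3": {"port_B": 1.5},
}
cost, route = dijkstra(graph, "port_A", "port_B")
print(cost, route)
```

Changing what the edge cost represents (emissions, fuel, or transit time) changes the optimal route without changing the algorithm, which is one reason graph-based formulations suit multi-objective routing.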
Complementing macro-scale routing intelligence, ML within DTOs also extends to localized time series forecasting, which contributes to dynamic harbor safety and efficient resource allocation. These predictive capabilities enable port authorities to make proactive decisions regarding berth availability and operational continuity, particularly in anticipation of extreme meteorological and oceanographic events. Recent studies have demonstrated the effectiveness of lightweight NNs and statistical models at local scales for this purpose. For example, ref. [49] demonstrated the effectiveness of SARIMA, LSTM, and hybrid models in predicting key atmospheric variables relevant to coastal and harbor operations. Similarly, ref. [137] performed a comprehensive evaluation of time series and regression models for environmental forecasting, emphasizing practical deployment for univariate forecasting tasks. These methods are highly applicable for short-term wind, wave, or current forecasting, where ML is used for gap-filling, noise reduction, and forecast enhancement using sparse in situ sensor data. The seamless integration of these advanced forecasting tools into Digital Twin platforms further enhances situational awareness by enabling real-time or near-real-time simulation and alert systems, facilitating responses to increasing disruptions associated with extreme weather events.
As DTOs evolve to encapsulate both routing intelligence and harbor-level environmental forecasting, the convergence of physics-based and data-driven methods offers a scalable path toward resilient, low-emission, and adaptive port operations.

3.4. Marine Renewable Energy

The need for a sustainable and swift transition towards renewable energy sources for electricity production is well established and is a central component of the EU’s energy roadmap [138]. Among different energy sources, marine renewable energy is becoming increasingly interesting, and relevant, owing not only to the limited space available onshore but also to the high energetic potential offered by marine renewable sources [139,142]. According to the EU Strategy on Offshore Renewable Energy (2020), the EU Commission has set the objective to have an installed capacity of at least 60 GW of offshore wind and at least 1 GW of ocean energy by 2030 and 340 GW by 2050 from offshore wind and ocean energy [140]. At the current rate of production, the EU is not on track to meet these goals. To achieve them, deployment of Offshore Renewable Energy (ORE) infrastructure must scale rapidly, by a factor of five by 2030 and 25 by 2050. However, without proper planning and data, ORE deployment can cause environmental damage or be limited by uncertainty and lack of capacity [140]. Furthermore, by facilitating increased deployment of offshore renewable energy, DTOs can support the EU targets of 55% GHG reduction by 2030 and carbon neutrality by 2050. DTOs also help achieve the targets of 61 GW by 2030 and 340 GW by 2050 of offshore wind and ocean energy as well as UN Ocean Decade Challenge 5: ‘Unlock Ocean-Based Solutions to Climate Change’ [141].
The development of DTO applications for marine renewables leverages advanced AI solutions. These predictive capabilities contribute to improving the level of reliability, traceability, and efficiency obtained from energy harvesting devices, spanning resource forecasting, structural integrity assessment, and design optimization, which ultimately contributes to their dissemination and widespread implementation [143]. A brief overview is now provided, wherein the description is divided by the type of renewable energy, discussed in ascending order of maturity.

3.4.1. Wave Energy

Wave energy conversion has been a topic of research for decades, yet the industry remains relatively immature despite the very large number of prototypes and concepts that have been proposed and tested [144]. This limited penetration is related to the difficulty of ensuring the survivability of wave energy devices, owing partly to the demanding conditions such equipment experiences. To understand the causes of failure and increase the future robustness of these technologies, a few DTO applications addressing different aspects of wave energy conversion systems have been developed. A central component of these applications is the accurate prediction of the wave field and its spatial–temporal variability. Beyond deterministic forecasts, this requires quantifying the uncertainty associated with predicted wave parameters, particularly significant wave height. Recent developments employ deep-learning surrogate models to approximate the underlying wave climate and propagate uncertainties more efficiently than traditional numerical simulations. Such approaches, exemplified by DELWAVE 1.0 for the Adriatic Sea [145], provide rapid, data-driven estimates of wave conditions while enabling probabilistic assessments, which are important for evaluating device survivability.
The DTO framework utilizes ML to address the primary challenge of resource assessment and power forecasting required for successful grid integration and planning. Ref. [146] initiated this by demonstrating how optimized Deep Neural Networks (DNN) and the Moth-Flame Optimization (MFO) algorithm achieve superior short-term wave energy flux forecasting compared to traditional methods. Ref. [147] scaled this predictive capability for regional planning by developing an AI-driven stacking ensemble model (integrating XGBoost, LightGBM, and CatBoost) to accurately predict the total power output of Wave Energy Converter (WEC) arrays across multiple coastal locations, demonstrating high scalability. Furthermore, ref. [148] enhanced accessibility for preliminary site selection by integrating a CatBoost model into a user-friendly Graphical User Interface (GUI) capable of forecasting significant wave height with high accuracy even with limited data.
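The two forecast targets named above, significant wave height and wave energy flux, are linked by a standard deep-water relation, P = ρg²Hs²Te/(64π). The following is a minimal sketch of that conversion and is not taken from the cited studies; the function name, the nominal seawater density, and the sample values are assumptions made for illustration, and the formula only holds where the water is deep relative to the wavelength.

```python
import math

RHO_SEAWATER = 1025.0  # kg/m^3, assumed nominal seawater density
G = 9.81               # m/s^2, gravitational acceleration


def wave_energy_flux_kw_per_m(hs_m, te_s):
    """Deep-water wave energy flux per metre of wave crest, in kW/m.

    P = rho * g^2 * Hs^2 * Te / (64 * pi), with Hs the significant wave
    height (m) and Te the energy period (s).
    """
    if hs_m < 0 or te_s <= 0:
        raise ValueError("Hs must be non-negative and Te must be positive")
    watts_per_m = RHO_SEAWATER * G ** 2 * hs_m ** 2 * te_s / (64.0 * math.pi)
    return watts_per_m / 1000.0


if __name__ == "__main__":
    # Hypothetical forecast pairs (Hs, Te), for illustration only.
    for hs, te in [(1.0, 6.0), (2.0, 8.0), (4.0, 10.0)]:
        flux = wave_energy_flux_kw_per_m(hs, te)
        print(f"Hs={hs} m, Te={te} s -> {flux:.1f} kW/m")
```

Because the flux scales with the square of Hs, a 10% error in forecast wave height translates into an error of about 21% in flux (1.1² = 1.21), which is one reason the accuracy of wave height forecasts matters for resource assessment.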
This necessity for accuracy extends to the real-time control systems that maximize WEC efficiency and prevent damage. Ref. [149] introduced a fundamental hybrid model, the LWT-PINN, which combines linear wave theory with Physics-Informed Neural Networks (PINNs) to perform deterministic phase-resolved wave prediction in near real time, an important input for control strategies. Ref. [150] reinforced the operational viability of this process by integrating LSTM networks with Deep Ensembles (DE) for robust Uncertainty Quantification (UQ) of wave height, improving prediction reliability. The control loop is then closed by algorithms such as the LSTM Recurrent Neural Network (RNN), which ref. [151] demonstrated can accurately and rapidly predict the non-linear wave excitation forces on WECs directly from wave elevations, enabling real-time actuation and filtering out unwanted signal noise.
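To make concrete the kind of physical constraint that linear wave theory supplies to a hybrid model, the sketch below solves the linear dispersion relation ω² = gk·tanh(kh) for the wavenumber k. It is not the LWT-PINN of [149]; it only illustrates the relation between wave period, water depth, and wavelength that any phase-resolved prediction based on linear wave theory has to respect. The function names and the Newton iteration settings are illustrative assumptions.

```python
import math

G = 9.81  # m/s^2, gravitational acceleration


def wavenumber(period_s, depth_m, tol=1e-12, max_iter=100):
    """Solve omega^2 = g * k * tanh(k * h) for k (rad/m) with Newton's method."""
    if period_s <= 0 or depth_m <= 0:
        raise ValueError("period and depth must be positive")
    omega = 2.0 * math.pi / period_s
    k = omega ** 2 / G  # deep-water value, used as the starting guess
    for _ in range(max_iter):
        t = math.tanh(k * depth_m)
        residual = G * k * t - omega ** 2
        slope = G * t + G * k * depth_m * (1.0 - t ** 2)
        step = residual / slope
        k -= step
        if abs(step) <= tol * k:
            break
    return k


def phase_speed(period_s, depth_m):
    """Phase speed c = omega / k, in m/s."""
    return (2.0 * math.pi / period_s) / wavenumber(period_s, depth_m)


if __name__ == "__main__":
    # Hypothetical 10 s swell over two depths, for illustration only.
    for depth in (5.0, 1000.0):
        print(f"h={depth} m -> c={phase_speed(10.0, depth):.2f} m/s")
```

In deep water the solution reduces to k = ω²/g, whereas in shallow water the same wave period corresponds to a shorter, slower wave.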
The second key application is employing the DTO for physical modeling, which entails predicting the hydrodynamic and mechanical loads on the structures from given wave characteristics. The emphasis is on extreme conditions, since these typically produce the largest loads over the operational envelope and can thus be responsible for equipment failures [152]. This need drives the application of ML as a computationally efficient surrogate for the expensive numerical models used in design and structural integrity assessment. Ref. [153] exemplified this by applying supervised regression ML models, notably XGBoost, to optimize the geometric design of WECs (e.g., ballast weight and position) by rapidly and accurately predicting performance metrics. In a complementary effort, ref. [154] utilized deep learning to create Non-Intrusive Reduced Order Models (NIROMs) for Wave-Structure Interaction (WSI), coupling LSTM with Physics-Informed Neural Networks (LSTM-PINN) to efficiently model the full hydrodynamic and rigid body dynamics. This capability enables the prediction of the extreme mechanical loads that cause equipment failures.

3.4.2. Tidal Energy

Tidal energy technology is more mature than wave energy, owing mostly to greater resource predictability, relatively practical installation, and economic profitability. However, its application is limited to specific geographic locations that combine high energetic potential with ease of installation and maintenance, typically estuaries and natural channels [144]. The topology of tidal devices and their physical working principles mean that the risk of very large structural loads arising from extreme sea states is small compared to wave energy technologies. As such, DTO applications for tidal devices usually address different aspects than those for wave energy, focusing on the estimation of power production under different environmental conditions, and particularly on the complex flows that arise in channels and barrages and on the hydrodynamic interaction between machines, referred to as wake effects. Methodologically, this challenge is being addressed through techniques such as surrogate-based optimization (SBO), which uses high-fidelity numerical data to train efficient surrogate models for optimizing turbine array layouts [155], as well as through ML-based surrogates for real-time condition monitoring, such as calculating component loads from sensor data [156]. At an industrial level, DTOs are being developed using SCADA (Supervisory Control and Data Acquisition) operational data [157] to monitor and improve performance, to plan maintenance, and to predict component failures. Specific operational challenges being tackled by ML-driven DTOs include the automated detection and estimation of biofouling [158], with recent work focusing on the integration of physics-informed ML [159,160].
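The sensitivity of array power to wake effects can be illustrated with a simple calculation. The sketch below combines the hydrokinetic power law P = ½ρCpAU³ with a Jensen-type top-hat wake deficit, a simplified engineering model borrowed from wind energy. It is an assumption made purely for illustration and is not the high-fidelity model or the surrogate used in [155]; the power coefficient, thrust coefficient, wake decay constant, and turbine dimensions are hypothetical.

```python
import math

RHO_SEAWATER = 1025.0  # kg/m^3, assumed nominal seawater density


def turbine_power_kw(flow_speed, rotor_diameter, cp=0.4):
    """Hydrokinetic power P = 0.5 * rho * Cp * A * U^3, in kW."""
    area = math.pi * (rotor_diameter / 2.0) ** 2
    return 0.5 * RHO_SEAWATER * cp * area * flow_speed ** 3 / 1000.0


def waked_speed(free_speed, distance, rotor_diameter, ct=0.8, k_wake=0.05):
    """Flow speed directly downstream of a turbine (Jensen-type wake).

    deficit = (1 - sqrt(1 - Ct)) / (1 + k * x / r0)^2
    """
    r0 = rotor_diameter / 2.0
    deficit = (1.0 - math.sqrt(1.0 - ct)) / (1.0 + k_wake * distance / r0) ** 2
    return free_speed * (1.0 - deficit)


def two_turbine_power_kw(free_speed, spacing, rotor_diameter):
    """Total power of an upstream turbine and one turbine in its wake."""
    upstream = turbine_power_kw(free_speed, rotor_diameter)
    downstream_speed = waked_speed(free_speed, spacing, rotor_diameter)
    return upstream + turbine_power_kw(downstream_speed, rotor_diameter)


if __name__ == "__main__":
    # Hypothetical 20 m rotors in a 2 m/s current, for illustration only.
    for spacing in (100.0, 200.0, 400.0):
        total = two_turbine_power_kw(2.0, spacing, 20.0)
        print(f"spacing={spacing} m -> {total:.0f} kW")
```

Because power depends on the cube of the flow speed, a 10% velocity deficit at the downstream rotor removes about 27% of its power (0.9³ ≈ 0.73), which is why layout optimization benefits from fast surrogates that can be evaluated many times.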

3.4.3. Offshore Wind Energy

Finally, Offshore Wind Energy (OWE) may be considered more mature than wave and tidal energy, as the installed capacity and lifespan of wind farms are rather extensive, which means there are several fields in which DTOs are already implemented [161]. Uncertainties related to, e.g., the increasing size of the machines and the associated structural behavior pose a challenge, which can be tackled by coupling surrogate modeling with in situ accelerometer data to construct structural DTs of the turbine’s rotor and tower, ultimately helping to predict their dynamic behavior. Methodologically, this involves using machine learning algorithms, such as ANNs or GPR, to create computationally inexpensive surrogate models [162]. These models are trained on data from high-fidelity physics-based simulations or operational sensor streams. Their primary advantage is the ability to execute in real time, allowing live prediction of complex dynamic behaviors and structural loads [163], which is computationally prohibitive for the original high-fidelity models. Another trend in OWE is the use of floating platforms, as more remote locations are considered. A floating turbine necessarily has added degrees of freedom and hence increased motion compared to bottom-fixed structures, which favors the creation of DTOs for monitoring the system’s movements and eventually controlling the motion amplitude for enhanced performance. DTO applications for OWE have addressed several other aspects, including the construction and assembly phase [164], power production forecasting (WinDTwin project https://www.engineering.com/achieving-real-time-predictive-maintenance-for-wind-farms-with-hybrid-twin/, last accessed online 10 October 2025), load estimation [165], and Operation and Maintenance (O&M) [166]. In the O&M domain, DTOs are increasingly functioning as dynamic systems. 
A key methodology is the use of physics-based hybrid analytics, which merges ML-driven models with physical principles to accurately monitor asset health, increase facility uptime, and enable predictive maintenance strategies [167].

3.5. Aquaculture and Fisheries

Finally, the applications of DTOs and ML models extend to aquaculture and fisheries, sectors that contribute to global food security and economic stability. These industries face significant, intersecting challenges, including climate variability, disease outbreaks, and overfishing. By integrating advanced ML capabilities with vast, multi-sourced datasets, DTO frameworks provide the comprehensive, real-time insights and decision support required for sustainable management. A detailed summary of these diverse ML applications is provided in Table 3.
Within this DTO framework, the initial application of ML focuses on monitoring at the ecosystem scale, encompassing everything from habitat health to ecological surveys, to inform conservation and management strategies. ML models for habitat and ecological health are being developed using both in situ and satellite data. Ref. [168] utilized a hybrid model combining an RF with a Sinusoidal Chaos Map Whale Optimization Algorithm to accurately map suitable seagrass habitats, training the model on both in situ and satellite data. Similarly, ref. [169] used an SVM and a MaxEnt (Maximum Entropy) model [170] to map suitable coral reef habitats, reinforcing the importance of protected marine areas by correlating habitat suitability with human activity. On a more localized scale, ref. [171] developed a LASSO regression and an RF model to predict cyanobacteria abundance in aquaculture ponds, identifying key organic carbon indicators for effective management. These applications are foundational for DTOs, enabling the creation of virtual representations of marine ecosystems to forecast changes and simulate the effects of management actions.
Building upon ecosystem modeling, ML provides the tools for non-invasive, item-level monitoring of the fish stocks, supporting biomass and health assessments. CV and DL are heavily employed for the automated monitoring of fish themselves. DL-based fish counting typically employs either density estimation or object detection approaches. Ref. [172] developed a DL model called MAN to accurately count cultured fish in crowded tanks with a high degree of occlusion by using a multi-branch attention mechanism and density maps. The advantage of this approach is its ability to count fish in highly dense environments, but its disadvantage is that it can have a larger error in areas with high fish density. Similarly, ref. [58] developed a non-invasive system for biomass estimation in turbid waters using a modified YOLOv8 model for object detection followed by regression models. The advantage of this approach is its high accuracy in turbid environments. However, it might falter under extreme conditions such as low light or extreme salinity. These vision-based methods are also used for fish sizing and measurement in wild fisheries. Refs. [60,61] developed systems using a Mask R-CNN and R-CNNs, respectively, to measure fish size in trawls or from non-specialist cameras. A related effort by [173] used enhanced deformable geometric models to fit the silhouettes of swimming tuna from stereo video footage, enabling highly accurate length estimations for growth monitoring in cages. Furthermore, ref. [62] presented a comprehensive DL pipeline called “Fish-Sense” that performs a dual function: it uses a Mask R-CNN and VGG-16 for biomass estimation from out-of-water images and also includes a dedicated disease detection module. This module uses an Inception V3 model to classify fish health and then employs Mask R-CNN again to segment body parts for identifying signs of inflammation. 
A major advantage of this pipeline is its ability to automate both biomass and disease monitoring, but it requires out-of-water images, which can be stressful for the fish. These non-invasive data streams are foundational for a DTO’s digital twin of fish populations, providing the real-time data needed for a dynamic virtual representation of fish health, size, and biomass within their physical environment.
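The density-estimation route to fish counting mentioned above rests on a simple idea: each annotated fish contributes a small blob of unit mass to a density map, so the count is the sum over the map. The sketch below builds such a ground-truth map from point annotations and recovers the count. It is an illustrative assumption and not the MAN architecture; in practice a network would be trained to regress the density map from the image. The image size, kernel width, and coordinates are hypothetical.

```python
import math


def gaussian_density_map(points, height, width, sigma=2.0):
    """Build a density map with one unit-mass Gaussian per annotated fish.

    Each kernel is normalized over the image grid, so every fish adds
    exactly 1 to the total even when it lies near the border.
    """
    density = [[0.0] * width for _ in range(height)]
    for py, px in points:
        kernel = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2.0 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        total = sum(sum(row) for row in kernel)
        for y in range(height):
            for x in range(width):
                density[y][x] += kernel[y][x] / total
    return density


def count_from_density(density):
    """The estimated count is the integral (sum) of the density map."""
    return sum(sum(row) for row in density)


if __name__ == "__main__":
    # Hypothetical fish centres (row, column); two of them overlap heavily.
    fish = [(5, 5), (6, 6), (20, 30)]
    dmap = gaussian_density_map(fish, height=24, width=32)
    print(round(count_from_density(dmap), 6))
```

Overlapping fish still contribute their full mass to the map, which is why density-based counting copes with occlusion better than approaches that must separate every individual.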
Moving from real-time monitoring to predictive capacity, ML models are used to forecast key biological and environmental states to support proactive management. This includes predicting outbreaks of harmful algal blooms (HABs), which can be fatal to fish populations, and forecasting the migration patterns of wild fish stocks. Refs. [37,174] leveraged remote sensing and numerical models in their ML frameworks. Ref. [37] used XGBoost, RF, and SVM on three consecutive days of MODIS satellite data to predict HABs up to nine days in advance, with the XGBoost model proving most effective. Ref. [174] utilized an RF model and numerical models to predict future HAB events in the East China and Yellow Seas under different climate change scenarios. Ref. [175] proposed an ML architecture for detecting and predicting HAB events by combining Convolutional Neural Networks (CNNs) for spatial pattern recognition with Long Short-Term Memory (LSTM) networks for temporal analysis, trained on a large dataset of Karenia brevis algae events to predict occurrences up to eight days in advance. Ref. [50] focused on a local spatio-temporal HAB forecasting model based on maritime station monitoring (MSM) data, which are more accurate than satellite data but have spatial and temporal gaps. They used principal component analysis to select the main environmental factors, spatio-temporal clustering to identify warning levels, and an ARIMA-LSTM network for forecasting. Separately, ML-based Species Distribution Models (SDMs) are frequently used to predict the spatial distribution of migratory fish stocks. Ref. [176] used an ensemble modeling approach with 17 different ML algorithms to predict skipjack tuna habitats in the Western North Pacific based on satellite-derived data. Ref. [55] used an Ensemble Learning Model (ELM) composed of five base algorithms (RF, SVM, KNN, XGBoost, and GP) to forecast high-yield albacore fishing grounds in the South Pacific, finding that latitude was the most influential factor on the model’s accuracy. Furthermore, a Stacking Ensemble Learning (STK) model was found to be highly effective for predicting fishing grounds for bigeye tuna and albacore [53,54]. Similarly, ref. [38] developed an SDM for eight commercial fish species in the Mediterranean using XGBoost on a large, refined dataset, successfully predicting high-resolution habitat maps. This approach is a key component of a DTO for fisheries, as it can model complex environmental-biological interactions to optimize fishing strategies and manage resources dynamically. Extending this predictive capacity to individual organisms, ref. [39] applied an RF model to classify the reproductive condition of Chilean hake, providing a reliable and cost-effective alternative to traditional histological methods. In a genetic context, ref. [177] used XGBoost on genomic data to predict disease resistance in aquaculture species, demonstrating a performance advantage over traditional genomic prediction models and reducing computational time. These predictive models can be integrated into a DTO to simulate future biological states of a fish population under different environmental scenarios, enabling proactive and data-driven management and policy formulation.
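The windowing behind the satellite-based HAB forecasts described above (three consecutive days of input, a lead time of up to nine days) can be sketched as a conversion of a daily series into supervised learning pairs. The example is a simplification under stated assumptions: it uses a single scalar series, whereas the cited work operates on gridded imagery with several variables, and the function name and defaults are hypothetical.

```python
def make_supervised_pairs(series, n_lags=3, lead=9):
    """Turn a daily series into (input window, future target) pairs.

    For each day t, the input is the last `n_lags` values up to and
    including day t, and the target is the value `lead` days later.
    """
    if n_lags < 1 or lead < 1:
        raise ValueError("n_lags and lead must be at least 1")
    pairs = []
    for t in range(n_lags - 1, len(series) - lead):
        window = list(series[t - n_lags + 1:t + 1])
        pairs.append((window, series[t + lead]))
    return pairs


if __name__ == "__main__":
    # Hypothetical daily index values, for illustration only.
    daily_index = list(range(15))
    for features, target in make_supervised_pairs(daily_index):
        print(features, "->", target)
```

Any regressor or classifier, including the tree-based models named above, can then be trained on these pairs; longer lead times reduce the number of usable pairs from a record of fixed length.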
Finally, ML models are integrated into DTOs to provide actionable intelligence for day-to-day operations through automation and optimization. ML algorithms are used to analyze vessel surveillance data and identify fishing activity to inform dynamic management strategies, such as using an RF model to estimate the spatiotemporal distribution of fishing grounds from Vessel Monitoring System (VMS) data [40]. For aquaculture optimization, a recent development involves the creation of AI-powered digital assistants to provide real-time guidance and knowledge. Ref. [178] developed a smart aquaculture assistant system that integrates an IoT layer with sensors for monitoring water quality (turbidity, pH, and temperature) with an assistant layer. The assistant layer utilizes a fine-tuned GPT 3.5 model, trained on aquaculture FAQs and a comprehensive knowledge base, to provide farmers with logical and natural-language advice via mobile and desktop applications. This system offers visualizations of pond and weather data and can automate actuator controls, such as pumps and heaters, based on sensor readings. This advancement is invaluable for a DTO, as it provides a user-friendly interface that connects farmers directly to the digital twin, translating complex data and predictive models into actionable, real-time advice for day-to-day management. ML models are also used for optimizing fish farm operations and providing decision support. Ref. [179] developed a hybrid DT model that combines a DT with a Naive Bayes method to classify Potential Fishing Zones (PFZs) in the Indian Ocean. The model classifies PFZs into low, medium, or high catch categories based on sea surface temperature (SST) gradients, their persistence, and chlorophyll-a concentration. Ref. [180] proposed an optimization scheme for water pump control in smart fish farms using the Internet of Things (IoT) and a Kalman filter to maintain water levels with minimal energy consumption. 
These methods can be integrated into a DTO’s control systems to automate and optimize farm operations, leading to increased resource efficiency and reduced environmental impact. Regarding energy optimization, a hybrid DL approach combining LSTM networks and DDPG has been developed for large-scale recirculating aquaculture systems [57]. The model was trained on a full year of hourly data from a commercial facility, using environmental and operational parameters to predict and optimize energy consumption. This approach resulted in a 15–20% reduction in daily energy use while maintaining stable water quality and a 17% decrease in energy costs per kilogram of fish produced. The system’s robust performance under varying fish biomass densities and seasonal temperature profiles demonstrates its potential for enhancing the economic and environmental sustainability of intensive aquaculture operations. These findings are highly relevant to a DTO, as they show how deep reinforcement learning can be used to dynamically optimize complex, real-world biological-mechanical systems with multiple objectives.
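The role of a Kalman filter in the water-level control scheme mentioned above can be illustrated with a scalar filter that smooths noisy level readings before they are passed to pump logic. This is a minimal sketch under stated assumptions and not the implementation of [180]: the level is modeled as a slowly varying random walk, and the noise variances, initial state, and readings are hypothetical.

```python
def kalman_1d(measurements, process_var=1e-4, meas_var=0.04, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a slowly varying quantity (random-walk model).

    Returns the filtered estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the level is assumed unchanged, but uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    # Hypothetical sensor readings around a true level of 1.5 m.
    readings = [1.5 + (0.2 if i % 2 == 0 else -0.2) for i in range(20)]
    for raw, est in zip(readings, kalman_1d(readings)):
        print(f"raw={raw:.2f} m  filtered={est:.3f} m")
```

Driving the pump from the filtered level rather than from raw readings avoids switching on sensor noise, which is one way such a scheme can reduce unnecessary pump cycles and energy use.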
Table 3. Summary of Machine Learning Applications and Techniques within Digital Twins of the Ocean (DTOs) for Aquaculture and Fisheries.
| Application Area | Specific Task/Objective | Key ML Model/Technique | Data Source | Primary Output | Reference |
| --- | --- | --- | --- | --- | --- |
| Ecosystem Health & Habitat Modeling | Seagrass Habitat Mapping | Random Forest (Hybrid with WOA) | Satellite Imagery (spectral), Environmental Parameters | Accurate habitat maps, conservation planning | [168] |
| Ecosystem Health & Habitat Modeling | Coral Reef Habitat Mapping | SVM, MaxEnt | Satellite Imagery, Environmental Parameters | Identification of suitable reef areas, protected zone reinforcement | [169] |
| Ecosystem Health & Habitat Modeling | Cyanobacteria Abundance Prediction | LASSO Regression, Random Forest | Aquaculture Pond Data (organic carbon indicators) | Early warning of harmful blooms in ponds, management optimization | [171] |
| Automated Monitoring of Fish Populations (Computer Vision) | Fish Counting (Cultured Fish) | MAN (Deep Learning) | Underwater Camera Imagery | Accurate count in crowded tanks | [45] |
| Automated Monitoring of Fish Populations (Computer Vision) | Fish Biomass Estimation (Turbid Water) | YOLOv8 (modified DL), Regression | Underwater Camera Imagery | Non-invasive biomass estimation in challenging conditions | [58] |
| Automated Monitoring of Fish Populations (Computer Vision) | Fish Sizing & Measurement (Wild Fish) | Mask R-CNN, R-CNN | Underwater/Towed Camera Imagery | Accurate size data for stock assessment | [60,61] |
| Automated Monitoring of Fish Populations (Computer Vision) | Fish Health & Disease Detection | Mask R-CNN, Inception V3 | Underwater Camera Imagery | Biomass estimation, health classification, disease segmentation | [62] |
| Environmental & Stock State Forecasting | Harmful Algal Bloom (HAB) Forecasting | XGBoost | Remote Sensing, Numerical Model Outputs | Early warning (up to 9 days) for HABs | [37,174] |
| Environmental & Stock State Forecasting | Species Distribution Modeling (Tuna) | Ensemble Models | Environmental Parameters (SST, Chl-a, currents) | Prediction of optimal migratory fish habitats | [176] |
| Environmental & Stock State Forecasting | Fishing Ground Prediction (Tuna) | Stacking Ensemble Learning (STK) | Environmental Parameters, Catch Data | Identification of high-yield fishing zones | [53,56] |
| Environmental & Stock State Forecasting | Fish Reproductive Condition Classification | Random Forest (RF) | Biological Samples (Image/Genomic) | Reproductive status for stock management | [39] |
| Environmental & Stock State Forecasting | Disease Resistance Prediction (Aquaculture) | Extreme Gradient Boosting (XGB) | Genomic Data | Prediction of disease susceptibility in species | [177] |
| Operational Optimization & Reinforcement Learning | Fishing Activity/Ground Estimation | Random Forest (RF) | Vessel Monitoring System (VMS) Data | Dynamic management strategies, resource allocation | [40] |
| Operational Optimization & Reinforcement Learning | AI-Powered Digital Assistant (Aquaculture) | GPT 3.5 (fine-tuned) | Sensor Readings, Operational Data | Real-time guidance, automated actuator control | [178] |
| Operational Optimization & Reinforcement Learning | Energy Optimization (RAS) | Hybrid DL (LSTM + DDPG − RL) | Hourly Operational Data (RAS) | 15–20% energy reduction, stable water quality | [57] |

4. Challenges and Future Perspectives

Collectively, the applications presented in Section 3 establish that ML is not merely an auxiliary tool but an indispensable computational layer for operational DTOs. The use of specialized CNNs (e.g., GLONET for global forecasting [43] and U-Net for oil spill detection [44]) and of sequence-aware networks such as LSTMs for time-series forecasting (e.g., sea level and water quality) demonstrates the capability to provide the speed required for near-real-time operations. The increasing adoption of hybrid models, such as the Hydro-Biogeochemical-CNN (HBGC-CNN) for Chl-a prediction or the integration of MLP regressors into trajectory optimization with VISIR-II [136], affirms a strategic shift towards leveraging the computational efficiency of AI while retaining the interpretability and constraint-adherence of physics-based models. Furthermore, specialized CV models such as YOLOv8 and Mask R-CNN are being applied to automate monitoring tasks in high-stakes domains, from fish health and biomass estimation to the detection of seafloor marine litter. This comprehensive integration across forecasting, detection, and optimization validates the DTO framework’s core promise: to translate massive, heterogeneous data streams into computationally efficient, highly accurate, and actionable insights necessary for informed decision-making and sustainable management of the blue economy.
The success demonstrated across these applications confirms that Digital Twins can serve as intelligent, data-based decision-making tools [181] in various sectors, including cities and urban planning [182,183], industry and manufacturing [184], ecological understanding [185], and resource allocation (e.g., for water supply [186]). ML offers unprecedented capabilities to extract patterns from large, heterogeneous environmental datasets and to produce fast surrogates for complex process models. In the context of DTOs, ML is being applied for monitoring, short- and long-term forecasting, anomaly detection, and “what-if” scenario evaluations. Yet, the efficacy of ML in environmental and DTO contexts is subject to multiple, interacting challenges that differ in important ways from the constraints of physical or mathematical (process-based) models and must be addressed if ML-enabled DTOs are to become reliable decision-support tools [187].
Foremost among these challenges is the dependency of DL and many ML approaches on large, representative training datasets to learn robust mappings. However, environmental observations, especially in the open ocean and in under-instrumented coastal areas, are often spatially sparse and temporally irregular, which can produce models that overfit localized characteristics or fail when extrapolated (e.g., to new localities) [188]. Methods designed to mitigate data scarcity exist, such as data augmentation, synthetic data, and transfer learning, but their effectiveness depends on the similarity between source and target domains and on careful validation; the literature documents both successes and persistent limitations when training data are limited or biased [189].
In addition to data volume constraints, the operational utility of DTOs is complicated by issues of accessibility and governance, as DTOs rely on continuous streams of heterogeneous observations, such as satellite remote sensing, autonomous platforms, in situ sensors, vessel tracking, and even citizen-generated data. However, many high-value datasets are restricted for legal, commercial, or privacy reasons. Fisheries catch and vessel activity datasets are a prominent example where concerns about confidentiality, economic impact, or enforcement have led stakeholders to limit sharing [190]. Such restrictions fragment the data landscape and inhibit model training, integration, and operationalization of DTOs for management purposes. Moreover, real-time feeds may be costly, subject to institutional licenses, or intermittent.
These data constraints are compounded by a lack of standardization in data production. Environmental data are produced by diverse platforms and communities using different formats, conventions, and metadata practices. The adoption of community standards (e.g., CF/NetCDF conventions, FAIR principles, and interoperability frameworks) improves reusability, but uneven uptake and inconsistent semantics and metadata quality remain barriers to seamless DTO construction and ML model reuse. These challenges render ML pipelines brittle: pre-processing becomes time-consuming and error-prone, and automated model deployment across datasets is hampered. Institutional investments in data curation and shared ontologies can reduce the engineering overhead of DTOs. To this end, the Ocean Best Practices Community has undertaken the development of IEEE P3501: Recommended Practice for the Development of Digital Twins of the Earth (https://ieeeoes.org/key-activities/oes-standards/, last accessed 10 October 2025).
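As a small illustration of how inconsistent metadata can be caught before it reaches an ML pipeline, the sketch below screens variable attributes for a `units` entry and for either a `standard_name` or a `long_name`, three attribute names used by the CF conventions. It is a deliberately simplified check on plain dictionaries, not a CF compliance checker, and the variable names and attribute values are hypothetical.

```python
def metadata_problems(variables):
    """Return {variable name: [issues]} for a simplified, CF-inspired check.

    `variables` maps each variable name to a dict of its attributes.
    Only two rules are applied here (an assumption of this sketch):
    a non-empty `units`, and a `standard_name` or a `long_name`.
    """
    problems = {}
    for name, attrs in variables.items():
        issues = []
        if not attrs.get("units"):
            issues.append("missing units")
        if not (attrs.get("standard_name") or attrs.get("long_name")):
            issues.append("missing standard_name or long_name")
        if issues:
            problems[name] = issues
    return problems


if __name__ == "__main__":
    # Hypothetical attribute dictionaries, for illustration only.
    dataset = {
        "sst": {"units": "K", "standard_name": "sea_surface_temperature"},
        "hs": {"long_name": "significant wave height"},
        "chl": {"units": "mg m-3"},
    }
    for var, issues in metadata_problems(dataset).items():
        print(var, "->", ", ".join(issues))
```

Running such a screen at ingestion time turns silent pre-processing errors into explicit, fixable reports, which is the kind of engineering overhead that shared standards aim to reduce.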
Moving beyond data-related issues, another operational challenge is adapting ML models and DTOs that were initially developed for one region to function effectively in another. Environmental systems are diverse, due to variations in physical drivers, boundary conditions, ecological processes, and human pressures across different locations. Often a model must be re-trained or re-calibrated with new local data to regain acceptable accuracy. The requirement for retraining presents a substantial challenge, especially for local authorities and resource managers who may lack capacity to run complex ML pipelines, undermining the scalability of DTOs across jurisdictions. In contrast, well-constructed process-based models, grounded in conserved physical laws, might offer more straightforward portability because their governing equations hold across sites, albeit still requiring local boundary conditions and parameter tuning. New hybrid and physics-informed ML approaches (e.g., [69,191]) explicitly embed physical constraints to improve generalization and reduce the dependence on extensive regional training data, and have shown encouraging results as surrogates for computationally expensive numerical models.
This acceleration in model efficiency is important because building and maintaining a DTO, particularly one that integrates ML components and real-time assimilation, has traditionally been resource-intensive in terms of human capital and computing power. However, the rapid development of AI and NN architectures can reduce the runtime of expensive ocean and coastal models by orders of magnitude, enabling real-time inference, what-if scenario exploration, and uncertainty propagation that would otherwise be impractical [109]. Furthermore, AI models can be set up to assimilate sensor, glider, and satellite data, enabling DTOs that update their predictive behavior as new observations arrive and that can learn site-specific patterns without handcrafted rules. This capacity improves responsiveness for operational decision support [188].
Despite these clear technological advantages, the integration of AI introduces specific challenges and risks that require mitigation. Generative AI systems and over-confident model outputs can produce plausible but incorrect predictions or explanations or, in the worst case, fabricated content (AI hallucinations). Such inaccuracies are particularly hazardous when translated into management actions. The problem is compounded by the fact that many high-performing AI models effectively function as black boxes to non-specialists [192]. Stakeholders and regulators may not understand how an output was produced, what input data were used, or which assumptions were embedded, thereby complicating accountability for decisions based on DTO outputs [193]. Even the selection of appropriate AI models for operationalization is nontrivial and can create institutional fragmentation and inconsistent advice to decision-makers. Furthermore, because some AI models are autoregressive, error propagation has been observed to hinder long-term forecasting and stability in ocean simulations. To mitigate this effect, ref. [43] proposes an initial pre-training phase that focuses on generating 1-day-ahead forecasts, followed by a subsequent phase in which GLONET extends forecasts up to 4 days; compared to the non-autoregressive AI approach, this enables accurate forecasts over longer horizons.
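The error propagation that affects autoregressive forecasts can be made concrete with a toy rollout in which a learned one-step model with a small relative error is fed its own output repeatedly. The example below is a deliberately simple linear system chosen for illustration and has no connection to GLONET; the dynamics and the 1% one-step error are assumptions.

```python
def rollout(x0, step, n_steps):
    """Apply a one-step model repeatedly, feeding each output back in."""
    states = [x0]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states


def true_step(x):
    """Hypothetical true dynamics: slow exponential decay."""
    return 0.98 * x


def learned_step(x):
    """Hypothetical learned model with a 1% relative error per step."""
    return 0.98 * x * 1.01


def relative_errors(x0, n_steps):
    """Relative error of the autoregressive forecast at each lead time."""
    truth = rollout(x0, true_step, n_steps)
    forecast = rollout(x0, learned_step, n_steps)
    return [abs(f - t) / abs(t) for f, t in zip(forecast, truth)]


if __name__ == "__main__":
    errors = relative_errors(10.0, 30)
    for lead in (1, 5, 10, 30):
        print(f"lead {lead:2d}: relative error {errors[lead]:.3f}")
```

In this toy case the relative error after n steps is 1.01ⁿ − 1, so a 1% one-step error exceeds 10% by the tenth step. Training on multi-step rollouts, as in the staged approach described above, targets this compounding directly rather than only the one-step accuracy.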
Despite these technical and governance challenges, ML remains an indispensable part of the toolkit for DTOs and environmental decision support; however, realizing its full potential requires concerted, multi-faceted action on several key fronts, as summarized in Table 4. First, the community must commit to investment in open, well-documented observational infrastructure and metadata standards to reduce preprocessing friction. The European Digital Twin of the Ocean (EDITO Infra) can facilitate the former, at least within a Europe-centric framework, whereas the aforementioned IEEE ocean best practices standards could help with the latter. Second, the field requires technical solutions for selective data sharing and privacy-preserving analytics to make sensitive datasets usable without breaching confidentiality. Third, there must be broader adoption of physics-informed and hybrid modeling strategies to improve transferability and consequently reduce the need for continual retraining. Finally, DTOs must mandate transparent reporting of uncertainties for every operational output so that users can see both central estimates and uncertainty ranges. This transparency must extend to distinguishing which part of the uncertainty comes from model structure or limited data (epistemic uncertainty) and which is irreducible variability (aleatoric uncertainty). For scenarios characterized by high epistemic uncertainty, human review should be a prerequisite for any automated action. The IPCC offers best-practice language and calibrated phrasing for communicating such distinctions to decision-makers [194]. Significant efforts have also been made to explain why an NN reaches a particular decision, since such networks primarily operate as black boxes. 
This field of eXplainable AI (XAI) is rapidly advancing with methodologies like perturbation-based crossvalidation [195], LIME (Local Interpretable Model-agnostic Explanations) technique [196], and SHAP-based feature engineering [197], that aid towards interpretability of such models.
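The separation of epistemic and aleatoric uncertainty, and the human-review prerequisite described above, can be sketched as follows. This minimal Python example assumes a hypothetical ensemble in which each member returns a predictive mean and a predictive variance for one quantity (e.g., sea surface temperature at a single grid point). By the law of total variance, the average of the member variances approximates the aleatoric part and the spread of the member means approximates the epistemic part. The function names, the sample values, and the review threshold are illustrative assumptions and are not part of any DTO specification.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(member_means, member_variances):
    """Split ensemble predictive variance into aleatoric and epistemic parts.

    member_means / member_variances: one predictive mean and variance per
    ensemble member (hypothetical inputs for this sketch).
    """
    if len(member_means) != len(member_variances) or not member_means:
        raise ValueError("need one mean and one variance per ensemble member")
    aleatoric = fmean(member_variances)  # average noise the members report
    epistemic = pvariance(member_means)  # disagreement between the members
    return {
        "mean": fmean(member_means),
        "aleatoric": aleatoric,
        "epistemic": epistemic,
        "total": aleatoric + epistemic,
    }


def needs_human_review(report, epistemic_threshold):
    """Gate automated action when the model-related uncertainty is too high."""
    return report["epistemic"] > epistemic_threshold


if __name__ == "__main__":
    THRESHOLD = 0.5  # assumed, application-specific threshold (variance units)

    agreeing = decompose_uncertainty([18.0, 18.2, 17.8, 18.0], [0.25] * 4)
    disagreeing = decompose_uncertainty([16.0, 20.0, 18.0, 18.0], [0.25] * 4)

    for name, report in (("agreeing", agreeing), ("disagreeing", disagreeing)):
        print(name, report, "review:", needs_human_review(report, THRESHOLD))
```

Both example ensembles report the same central estimate and the same aleatoric variance; only the second, whose members disagree, exceeds the assumed epistemic threshold and would be routed to a human reviewer before any automated action.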
The convergence of ocean science, high-resolution observations, and ML represents a transformative inflection point for climate science and sustainable management. The successful mitigation of the challenges outlined, particularly enhancing data governance, standardizing interoperability, and ensuring model transparency through robust uncertainty quantification, is not merely an academic exercise but a prerequisite for establishing the trust required for DTOs to influence policy and operational decision-making. By strategically investing in these foundational elements, the research community can ensure that ML-enabled DTOs fulfill their promise as computationally efficient, accurate, and ethical platforms, ultimately supporting a resilient and sustainable blue economy in the face of escalating global climate pressures.

Author Contributions

Conceptualization, V.M. and A.P.; methodology, V.M. and A.P.; formal analysis, V.M. and A.P.; investigation, V.M., A.P., R.S.P. and G.K.; writing—original draft preparation, V.M., A.P., R.S.P. and G.K.; visualization, V.M., A.P. and R.S.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data is contained within the article.

Acknowledgments

We thank the two anonymous reviewers for their constructive suggestions that helped shape the final version of this review.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Chen, G.; Yang, J.; Huang, B.; Ma, C.; Tian, F.; Ge, L.; Xia, L.; Li, J. Toward digital twin of the ocean: From digitalization to cloning. Intell. Mar. Technol. Syst. 2023, 1, 3. [Google Scholar] [CrossRef]
  2. Huguenin, M.F.; Holmes, R.M.; England, M.H. Drivers and distribution of global ocean heat uptake over the last half century. Nat. Commun. 2022, 13, 1–11. [Google Scholar] [CrossRef] [PubMed]
  3. Oliver, E.C.J.; Benthuysen, J.A.; Darmaraki, S.; Donat, M.G.; Hobday, A.J.; Holbrook, N.J.; Schlegel, R.W.; Sen Gupta, A. Marine Heatwaves. Ann. Rev. Mar. Sci. 2021, 13, 313–342. [Google Scholar] [CrossRef] [PubMed]
  4. Heimbach, P.; O’Donncha, F.; Smith, T.A.; Garcia-Valdecasas, J.M.; Arnaud, A.; Wan, L. Crafting the Future: Machine learning for ocean forecasting. State Planet 2025, 5, 22. [Google Scholar] [CrossRef]
  5. Sagi, T.; Lehahn, Y.; Bar, K. Artificial intelligence for ocean science data integration: Current state, gaps, and way forward. Elementa 2020, 8. [Google Scholar] [CrossRef]
  6. Zhao, Q.; Peng, S.; Wang, J.; Li, S.; Hou, Z.; Zhong, G. Applications of deep learning in physical oceanography: A comprehensive review. Front. Mar. Sci. 2024, 11, 1396322. [Google Scholar] [CrossRef]
  7. Parasyris, A.; Metheniti, V.; Kampanis, N.; Darmaraki, S. Marine heatwaves in the Mediterranean Sea: A convolutional neural network study for extreme event prediction. Ocean Sci. 2025, 21, 897–912. [Google Scholar] [CrossRef]
  8. Sonnewald, M.; Lguensat, R.; Jones, D.C.; Dueben, P.D.; Brajard, J.; Balaji, V. Bridging observations, theory and numerical simulation of the ocean using machine learning. Environ. Res. Lett. 2021, 16, 073008. [Google Scholar] [CrossRef]
  9. Liu, G.; Bracco, A.; Brajard, J. Systematic Bias Correction in Ocean Mesoscale Forecasting Using Machine Learning. J. Adv. Model. Earth Syst. 2023, 15, e2022MS003426. [Google Scholar] [CrossRef]
  10. Cho, D.; Yoo, C.; Im, J.; Cha, D.H. Comparative Assessment of Various Machine Learning-Based Bias Correction Methods for Numerical Weather Prediction Model Forecasts of Extreme Air Temperatures in Urban Areas. Earth Space Sci. 2020, 7, e2019EA000740. [Google Scholar] [CrossRef]
  11. Liu, H.-Y.; Tan, Z.-M.; Wang, Y.; Tang, J.; Satoh, M.; Lei, L.; Gu, J.-F.; Zhang, Y.; Nie, G.-Z.; Chen, Q.-Z. A Hybrid Machine Learning/Physics-Based Modeling Framework for 2-Week Extended Prediction of Tropical Cyclones. J. Geophys. Res. Mach. Learn. Comput. 2024, 1, e2024JH000207. [Google Scholar] [CrossRef]
  12. Kadow, C.; Hall, D.M.; Ulbrich, U. Artificial intelligence reconstructs missing climate information. Nat. Geosci. 2020, 13, 408–413. [Google Scholar] [CrossRef]
  13. Smith, P.A.H.; Sørensen, K.A.; Buongiorno Nardelli, B.; Chauhan, A.; Christensen, A.; St. John, M.; Rodrigues, F.; Mariani, P. Reconstruction of subsurface ocean state variables using Convolutional Neural Networks with combined satellite and in situ data. Front. Mar. Sci. 2023, 10, 1218514. [Google Scholar] [CrossRef]
  14. Sloyan, B.M.; Chapman, C.C.; Cowley, R.; Charantonis, A.A. Application of Machine Learning Techniques to Ocean Mooring Time Series Data. J. Atmos. Ocean. Technol. 2023, 40, 241–260. [Google Scholar] [CrossRef]
  15. AboElHassan, A.; Sakr, A.H.; Yacout, S. General purpose digital twin framework using digital shadow and distributed system concepts. Comput. Ind. Eng. 2023, 183, 109534. [Google Scholar] [CrossRef]
  16. Brarda, P.G.; Fernandez, G.; Ayala, N.F. Digital Twin, Digital Shadow or Digital Model? A Systematic Literature Review. IFIP Adv. Inf. Commun. Technol. 2026, 766, 306–319. [Google Scholar] [CrossRef]
  17. Chapelle, O.; Scholkopf, B.; Zien, A. (Eds.) Semi-Supervised Learning. IEEE Trans. Neural Netw. 2009, 20, 542. [Google Scholar] [CrossRef]
  18. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  19. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar] [CrossRef]
  20. Rasmussen, C.E. Gaussian Processes in Machine Learning; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2004; Volume 3176, pp. 63–71. [Google Scholar] [CrossRef]
  21. Smola, A.J.; Schölkopf, B. A tutorial on support vector regression. Stat. Comput. 2004, 14, 199–222. [Google Scholar] [CrossRef]
  22. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer: Berlin/Heidelberg, Germany, 2009. [Google Scholar] [CrossRef]
  23. Chen, W.; He, C.; Ji, C.; Zhang, M.; Chen, S. An improved K-means algorithm for underwater image background segmentation. Multimed. Tools Appl. 2021, 80, 21059–21083. [Google Scholar] [CrossRef]
  24. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  25. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2323. [Google Scholar] [CrossRef]
  26. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  27. Metheniti, V.; Karageorgis, A.P.; Drakopoulos, P.; Kampanis, N.; Sofianos, S. Deriving the diffuse attenuation coefficient in the Eastern Mediterranean Sea, using observational optical measurements and a multi-layer perceptron regression model. Deep Sea Res. Part I Oceanogr. Res. Pap. 2023, 199, 104105. [Google Scholar] [CrossRef]
  28. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144. [Google Scholar] [CrossRef]
  29. Hinton, G.E.; Salakhutdinov, R.R. Reducing the Dimensionality of Data with Neural Networks. Science 2006, 313, 504–507. [Google Scholar] [CrossRef]
  30. Wolpert, D.H. Stacked generalization. Neural Netw. 1992, 5, 241–259. [Google Scholar] [CrossRef]
  31. Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. In Proceedings of the 4th International Conference on Learning Representations, ICLR 2016, Conference Track Proceedings, San Juan, Puerto Rico, 2–4 May 2016; Available online: https://arxiv.org/pdf/1509.02971 (accessed on 17 October 2025).
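  32. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar] [CrossRef]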
  33. Chen, S.; Hu, C.; Barnes, B.B.; Xie, Y.; Lin, G.; Qiu, Z. Improving ocean color data coverage through machine learning. Remote Sens. Environ. 2019, 222, 286–302. [Google Scholar] [CrossRef]
  34. Jin, D.; Lee, E.; Kwon, K.; Kim, T. A Deep Learning Model Using Satellite Ocean Color and Hydrodynamic Model to Estimate Chlorophyll-a Concentration. Remote Sens. 2021, 13, 2003. [Google Scholar] [CrossRef]
  35. Hu, C.; Feng, L.; Guan, Q. A Machine Learning Approach to Estimate Surface Chlorophyll a Concentrations in Global Oceans From Satellite Measurements. IEEE Trans. Geosci. Remote Sens. 2021, 59, 4590–4607. [Google Scholar] [CrossRef]
  36. Sannigrahi, S.; Basu, B.; Basu, A.S.; Pilla, F. Development of automated marine floating plastic detection system using Sentinel-2 imagery and machine learning models. Mar. Pollut. Bull. 2022, 178, 113527. [Google Scholar] [CrossRef]
  37. Izadi, M.; Sultan, M.; El Kadiri, R.; Ghannadi, A.; Abdelmohsen, K. A Remote Sensing and Machine Learning-Based Approach to Forecast the Onset of Harmful Algal Bloom. Remote Sens. 2021, 13, 3863. [Google Scholar] [CrossRef]
  38. Effrosynidis, D.; Tsikliras, A.; Arampatzis, A.; Sylaios, G. Species Distribution Modelling via Feature Engineering and Machine Learning for Pelagic Fishes in the Mediterranean Sea. Appl. Sci. 2020, 10, 8900. [Google Scholar] [CrossRef]
  39. Flores, A.; Wiff, R.; Donovan, C.R.; Gálvez, P. Applying machine learning to predict reproductive condition in fish. Ecol. Inform. 2024, 80, 102481. [Google Scholar] [CrossRef]
  40. Meeanan, C.; Noranarttragoon, P.; Sinanun, P.; Takahashi, Y.; Kaewnern, M.; Matsuishi, T.F. Estimation of the spatiotemporal distribution of fish and fishing grounds from surveillance information using machine learning: The case of short mackerel (Rastrelliger brachysoma) in the Andaman Sea, Thailand. Reg. Stud. Mar. Sci. 2023, 62, 102914. [Google Scholar] [CrossRef]
  41. Sundararaman, H.K.K.; Shanmugam, P. Estimates of the global ocean surface dissolved oxygen and macronutrients from satellite data. Remote Sens. Environ. 2024, 311, 114243. [Google Scholar] [CrossRef]
  42. Taggio, N.; Aiello, A.; Ceriola, G.; Kremezi, M.; Kristollari, V.; Kolokoussis, P.; Karathanassi, V.; Barbone, E. A Combination of Machine Learning Algorithms for Marine Plastic Litter Detection Exploiting Hyperspectral PRISMA Data. Remote Sens. 2022, 14, 3606. [Google Scholar] [CrossRef]
  43. Aouni, A.E.; Gaudel, Q.; Regnier, C.; Van Gennip, S.; Le Galloudec, O.; Drevillon, M.; Drillet, Y.; Lellouche, J.-M. GLONET: Mercator’s End-to-End Neural Global Ocean Forecasting System. December 2024. Available online: https://arxiv.org/pdf/2412.05454 (accessed on 10 September 2025).
  44. Metheniti, V.; Parasyris, A.; Fazzini, N.; Outmani, S.; Correia, M.; Goddard, J.; Alexandrakis, G.; Kozyrakis, G.V.; Vettorello, L.; Keeble, S.; et al. Coastal Crete: A Digital Twin of the Ocean for Oil Spill Identification and Forecasting. In Proceedings of the OCEANS 2025 Brest, Brest, France, 16–19 June 2025; IEEE: New York, NY, USA, 2025; pp. 1–8. [Google Scholar] [CrossRef]
  45. Li, Y.; Yu, Q.; Xie, M.; Zhang, Z.; Ma, Z.; Cao, K. Identifying oil spill types based on remotely sensed reflectance spectra and multiple machine learning algorithms. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 9071–9078. [Google Scholar] [CrossRef]
  46. Wolf, M.; Van Den Berg, K.; Garaba, S.P.; Gnann, N.; Sattler, K.; Stahl, F.; Zielinski, O. Machine learning for aquatic plastic litter detection, classification and quantification (APLASTIC-Q). Environ. Res. Lett. 2020, 15, 114042. [Google Scholar] [CrossRef]
  47. Ma, D.; Wei, J.; Li, Y.; Zhao, F.; Chen, X.; Hu, Y.; Yu, S.; He, T.; Jin, R.; Li, Z.; et al. MLDet: Towards efficient and accurate deep learning method for Marine Litter Detection. Ocean Coast. Manag. 2023, 243, 106765. [Google Scholar] [CrossRef]
  48. Sammartino, M.; Nardelli, B.B.; Marullo, S.; Santoleri, R. An artificial neural network to infer the mediterranean 3d chlorophyll-a and temperature fields from remote sensing observations. Remote Sens. 2020, 12, 4123. [Google Scholar] [CrossRef]
  49. Parasyris, A.; Alexandrakis, G.; Kozyrakis, G.V.; Spanoudaki, K.; Kampanis, N.A. Predicting Meteorological Variables on Local Level with SARIMA, LSTM and Hybrid Techniques. Atmosphere 2022, 13, 878. [Google Scholar] [CrossRef]
  50. Wen, J.; Yang, J.; Li, Y.; Gao, L. Harmful algal bloom warning based on machine learning in maritime site monitoring. Knowl. Based Syst. 2022, 245, 108569. [Google Scholar] [CrossRef]
  51. Jamali, A.; Mahdianpari, M. A Cloud-Based Framework for Large-Scale Monitoring of Ocean Plastics Using Multi-Spectral Satellite Imagery and Generative Adversarial Network. Water 2021, 13, 2553. [Google Scholar] [CrossRef]
  52. Rus, M.; Mihanović, H.; Ličer, M.; Kristan, M. HIDRA3: A deep-learning model for multipoint ensemble sea level forecasting in the presence of tide gauge sensor failures. Geosci. Model Dev. 2025, 18, 605–620. [Google Scholar] [CrossRef]
  53. Song, L.; Li, T.; Zhang, T.; Sui, H.; Li, B.; Zhang, M. Comparison of machine learning models within different spatial resolutions for predicting the bigeye tuna fishing grounds in tropical waters of the Atlantic Ocean. Fish. Oceanogr. 2023, 32, 509–526. [Google Scholar] [CrossRef]
  54. Li, J.; Chen, F.; Dai, Q.; Zhu, W.; Li, D.; Yu, W.; Zhou, W. Construction and Comparison of Machine-Learning Forecast Models of Albacore Thunnus alalunga Fishing Grounds in the South Pacific Ocean. Fishes 2024, 9, 375. [Google Scholar] [CrossRef]
  55. Zhang, J.; Fan, D.; He, H.; Xiao, B.; Xiong, Y.; Shi, J. Forecasting Albacore (Thunnus alalunga) Fishing Grounds in the South Pacific Based on Machine Learning Algorithms and Ensemble Learning Model. Appl. Sci. 2023, 13, 5485. [Google Scholar] [CrossRef]
  56. Li, H.; Li, X.; Song, D.; Nie, J.; Liang, S. Prediction on daily spatial distribution of chlorophyll-a in coastal seas using a synthetic method of remote sensing, machine learning and numerical modeling. Sci. Total Environ. 2024, 910, 168642. [Google Scholar] [CrossRef]
  57. Alnemari, A.M.; Elmessery, W.M.; Moghanm, F.S.; Espinosa, V.; Shams, M.Y.; Elwakeel, A.E.; Saeed, O.; Eid, M.H.; Alhag, S.K.; Al-Shuraym, L.A.; et al. Energy optimization in large-scale recirculating aquaculture systems: Implementation and performance analysis of a hybrid deep learning approach. Aquac. Eng. 2025, 111, 102561. [Google Scholar] [CrossRef]
  58. Rani, S.V.J.; Ioannou, I.; Swetha, R.; Lakshmi, R.M.D.; Vassiliou, V. A novel automated approach for fish biomass estimation in turbid environments through deep learning, object detection, and regression. Ecol. Inform. 2024, 81, 102663. [Google Scholar] [CrossRef]
  59. Zhao, H.; Wang, X.; Yu, X.; Peng, S.; Hu, J.; Deng, M.; Ren, L.; Zhang, X.; Duan, Z. Application of improved machine learning in large-scale investigation of plastic waste distribution in tourism Intensive artificial coastlines. Environ. Pollut. 2024, 356, 124292. [Google Scholar] [CrossRef] [PubMed]
  60. Monkman, G.G.; Hyder, K.; Kaiser, M.J.; Vidal, F.P. Using machine vision to estimate fish length from images using regional convolutional neural networks. Methods Ecol. Evol. 2019, 10, 2045–2056. [Google Scholar] [CrossRef]
  61. Garcia, R.; Prados, R.; Quintana, J.; Tempelaar, A.; Gracias, N.; Rosen, S.; Vågstøl, H.; Løvall, K. Automatic segmentation of fish using deep learning with application to fish size measurement. ICES J. Mar. Sci. 2020, 77, 1354–1366. [Google Scholar] [CrossRef]
  62. Aftab, K.; Tschirren, L.; Pasini, B.; Zeller, P.; Khan, B.; Fraz, M.M. Intelligent Fisheries: Cognitive Solutions for Improving Aquaculture Commercial Efficiency Through Enhanced Biomass Estimation and Early Disease Detection. Cognit. Comput. 2024, 16, 2241–2263. [Google Scholar] [CrossRef]
  63. Politikos, D.V.; Fakiris, E.; Davvetas, A.; Klampanos, I.A.; Papatheodorou, G. Automatic detection of seafloor marine litter using towed camera images and deep learning. Mar. Pollut. Bull. 2021, 164, 111974. [Google Scholar] [CrossRef]
  64. Lellouche, J.-M.; Greiner, E.; Le Galloudec, O.; Régnier, C.; Benkiran, M.; Testut, C.-E.; Bourdallé-Badie, R.; Drévillon, M.; Garric, G.; Drillet, Y. The Mercator Ocean Global High-Resolution Monitoring and Forecasting System. New Front. Oper. Oceanogr. 2018, 563–592. Available online: http://purl.flvc.org/fsu/fd/FSU_libsubv1_scholarship_submission_1536245264_041d5712 (accessed on 10 October 2025).
  65. Wang, K.; Xu, H.; Wang, H.; Qiu, R.; Hu, Q.; Liu, X. Digital twin-driven safety management and decision support approach for port operations and logistics. Front. Mar. Sci. 2024, 11, 1455522. [Google Scholar] [CrossRef]
  66. Bodnar, C.; Bruinsma, W.P.; Lucic, A.; Stanley, M.; Allen, A.; Brandstetter, J.; Garvan, P.; Riechert, M.; Weyn, J.A.; Dong, H.; et al. A foundation model for the Earth system. Nature 2025, 641, 8065. [Google Scholar] [CrossRef]
  67. Staneva, J.; Melet, A.; Veitch, J.; Matte, P. Solving coastal dynamics: Introduction to high-resolution ocean forecasting services. State Planet Discuss. 2025, 2025, 1–17. [Google Scholar] [CrossRef]
  68. Madec, G.; NEMO Team. NEMO Ocean Engine Reference Manual; NEMO Team: Dover, NH, USA, 2024. [Google Scholar]
  69. Jordão, H.; Jiang, P.; Weisser, C.; Meinert, N.; Lavin, A.; Walker, C.C.; Wainwright, H.M.; Holgate, S.; Lütjens, B.; Barnard, P. Coastal Digital Twin: Learning a Fast and Physics-Informed Surrogate Model for Coastal Floods Via Neural Operators. 13 December 2021, AGU. Available online: https://agu.confex.com/agu/fm21/meetingapp.cgi/Paper/887845 (accessed on 10 September 2025).
  70. Rus, M.; Ličer, M.; Kristan, M. HIDRA-D: Deep-learning model for dense sea level forecasting using sparse altimetry and tide gauge data. EGUsphere 2025, 2025, 1–25. [Google Scholar] [CrossRef]
  71. Camps-Valls, G.; Fernández-Torres, M.Á.; Cohrs, K.H.; Höhl, A.; Castelletti, A.; Pacal, A.; Robin, C.; Martinuzzi, F.; Papoutsis, I.; Prapas, I.; et al. Artificial intelligence for modeling and understanding extreme weather and climate events. Nat. Commun. 2025, 16, 1–14. [Google Scholar] [CrossRef] [PubMed]
  72. Horton, R. An index number system for rating water quality. J. Water Pollut. Control Fed. 1965, 37, 300–306. [Google Scholar]
  73. Uddin, M.G.; Nash, S.; Olbert, A.I. A review of water quality index models and their use for assessing surface water quality. Ecol. Indic. 2021, 122, 107218. [Google Scholar] [CrossRef]
  74. Uddin, M.G.; Nash, S.; Diganta, M.T.M.; Rahman, A.; Olbert, A.I. Robust machine learning algorithms for predicting coastal water quality index. J. Environ. Manag. 2022, 321, 115923. [Google Scholar] [CrossRef]
  75. Lin, J.; Liu, Q.; Song, Y.; Liu, J.; Yin, Y.; Hall, N.S. Temporal Prediction of Coastal Water Quality Based on Environmental Factors with Machine Learning. J. Mar. Sci. Eng. 2023, 11, 1608. [Google Scholar] [CrossRef]
  76. Mobley, C.D. Radiative Transfer in the Ocean. In Encyclopedia of Ocean Sciences; Elsevier: Amsterdam, The Netherlands, 2001; pp. 2321–2330. [Google Scholar] [CrossRef]
  77. Hu, C.; Lee, Z.; Franz, B. Chlorophyll a algorithms for oligotrophic oceans: A novel approach based on three-band reflectance difference. J. Geophys. Res. Oceans 2012, 117, 1011. [Google Scholar] [CrossRef]
  78. O’Reilly, J.E.; Werdell, P.J. Chlorophyll algorithms for ocean color sensors—OC4, OC5 & OC6. Remote Sens. Environ. 2019, 229, 32–47. [Google Scholar] [CrossRef]
  79. Austin, R.W.; Petzold, T.J. Spectral Dependence of the Diffuse Attenuation Coefficient of Light in Ocean Waters. Opt. Eng. 1986, 25, 253471. [Google Scholar] [CrossRef]
  80. Werdell, P.J.; Bailey, S.W. An improved in-situ bio-optical data set for ocean color algorithm development and satellite data product validation. Remote Sens. Environ. 2005, 98, 122–140. [Google Scholar] [CrossRef]
  81. Zhang, T.; Fell, F. An empirical algorithm for determining the diffuse attenuation coefficient Kd in clear and turbid waters from spectral remote sensing reflectance. Limnol. Oceanogr. Methods 2007, 5, 457–462. [Google Scholar] [CrossRef]
  82. Wang, M.; Son, S.H.; Harding, L.W. Retrieval of diffuse attenuation coefficient in the Chesapeake Bay and turbid ocean regions for satellite ocean color applications. J. Geophys. Res. Oceans 2009, 114, 10011. [Google Scholar] [CrossRef]
  83. Morel, A. Optical modeling of the upper ocean in relation to its biogenous matter content (case I waters). J. Geophys. Res. 1988, 93, 10749. [Google Scholar] [CrossRef]
  84. Morel, A.; Maritorena, S. Bio-optical properties of oceanic waters: A reappraisal. J. Geophys. Res. Oceans 2001, 106, 7163–7180. [Google Scholar] [CrossRef]
  85. Morel, A.; Claustre, H.; Antoine, D.; Gentili, B. Natural Variability of Bio-Optical Properties in Case 1 Waters: Attenuation and Reflectance Within the Visible and Near-UV Spectral Domains, as Observed in South Pacific and Mediterranean Waters. 2007. Available online: www.biogeosciences.net/4/913/2007/ (accessed on 20 January 2021).
  86. Arst, H.; Erm, A.; Reinart, A.; Sipelgas, L.; Herlevi, A. Calculating irradiance penetration into water bodies from the measured beam attenuation coefficient, II: Application of the improved model to different types of lakes. Nord. Hydrol. 2002, 33, 227–240. [Google Scholar] [CrossRef]
  87. Gordon, H.R.; Brown, O.B.; Jacobs, M.M. Computed Relationships Between the Inherent and Apparent Optical Properties of a Flat Homogeneous Ocean. Appl. Opt. 1975, 14, 417. [Google Scholar] [CrossRef]
  88. Sathyendranath, S.; Platt, T. The spectral irradiance field at the surface and in the interior of the ocean: A model for applications in oceanography and remote sensing. J. Geophys. Res. Oceans 1988, 93, 9270–9280. [Google Scholar] [CrossRef]
  89. Lee, Z.; Carder, K.L.; Arnone, R.A. Deriving inherent optical properties from water color: A multiband quasi-analytical algorithm for optically deep waters. Appl. Opt. 2002, 41, 5755. [Google Scholar] [CrossRef]
  90. Lee, Z.P.; Du, K.P.; Arnone, R. A model for the diffuse attenuation coefficient of downwelling irradiance. J. Geophys. Res. C Ocean. 2005, 110, 1–10. [Google Scholar] [CrossRef]
  91. Lee, Z.; Weidemann, A.; Kindle, J.; Arnone, R.; Carder, K.L.; Davis, C. Euphotic zone depth: Its derivation and implication to ocean-color remote sensing. J. Geophys. Res. 2007, 112, C03009. [Google Scholar] [CrossRef]
  92. IOCCG. Uncertainties in Ocean Colour Remote Sensing; IOCCG: Dartmouth, NS, Canada, 2019. [Google Scholar] [CrossRef]
  93. Zoffoli, M.L.; Lee, Z.; Ondrusek, M.; Lin, J.; Kovach, C.; Wei, J.; Lewis, M. Estimation of Transmittance of Solar Radiation in the Visible Domain Based on Remote Sensing: Evaluation of Models Using In Situ Data. J. Geophys. Res. Oceans 2017, 122, 9176–9188. [Google Scholar] [CrossRef]
  94. Metheniti, V.; Vervatis, V.; Kampanis, N.; Sofianos, S. Turbidity effects on the Aegean sea surface properties using numerical simulations. Ocean Dyn. 2025, 75, 1–15. [Google Scholar] [CrossRef]
  95. Kolluru, S.; Gedam, S.S.; Inamdar, A.B. A neural network approach for deriving absorption coefficients of ocean water constituents from total light absorption and particulate absorption coefficients. Comput. Geosci. 2021, 147, 104678. [Google Scholar] [CrossRef]
  96. Park, J.; Kim, J.H.; Kim, H.C.; Kim, B.K.; Bae, D.; Jo, Y.H.; Jo, N.; Lee, S.H. Reconstruction of Ocean Color Data Using Machine Learning Techniques in Polar Regions: Focusing on Off Cape Hallett, Ross Sea. Remote Sens. 2019, 11, 1366. [Google Scholar] [CrossRef]
  97. Liu, H.; Li, Q.; Bai, Y.; Yang, C.; Wang, J.; Zhou, Q.; Hu, S.; Shi, T.; Liao, X.; Wu, G. Improving satellite retrieval of oceanic particulate organic carbon concentrations using machine learning methods. Remote Sens. Environ. 2021, 256, 112316. [Google Scholar] [CrossRef]
  98. Morel, A.; Bélanger, S. Improved detection of turbid waters from ocean color sensors information. Remote Sens. Environ. 2006, 102, 237–249. [Google Scholar] [CrossRef]
  99. Zhao, D.; Feng, L.; Yang, Z.; Yu, X.; Wang, M. A Deep Learning-Assisted Algorithm to Improve Inherent Optical Properties Estimations Over Inland and Nearshore Coastal Waters. IEEE Trans. Geosci. Remote Sens. 2025, 63, 1–14. [Google Scholar] [CrossRef]
  100. Chen, L.; Chen, L.; Chen, L.; Pan, X.; Zhang, J.; Zhang, J.; Demeaux, C.B.; Wang, Y.; Wang, Y. Inversion diffuse attenuation coefficient of photosynthetically active radiation based on deep learning. Opt. Express 2023, 31, 37365–37380. [Google Scholar] [CrossRef]
  101. Huang, J.; Wang, D.; Pan, S.; Li, H.; Gong, F.; Hu, H.; He, X.; Bai, Y.; Zheng, Z. A New High-Resolution Remote Sensing Monitoring Method for Nutrients in Coastal Waters. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–15. [Google Scholar] [CrossRef]
  102. Li, C.; Wu, H.; Yang, C.; Cui, L.; Ma, Z.; Wang, L. Advanced Machine Learning Models for Estimating the Distribution of Sea-Surface Particulate Organic Carbon (POC) Concentrations Using Satellite Remote Sensing Data: The Mediterranean as an Example. Sensors 2024, 24, 5669. [Google Scholar] [CrossRef]
  103. Zhang, Z.; Chen, P.; Jamet, C.; Dionisi, D.; Hu, Y.; Lu, X.; Pan, D. Retrieving bbp and POC from CALIOP: A deep neural network approach. Remote Sens. Environ. 2023, 287, 113482. [Google Scholar] [CrossRef]
  104. Hotte, N.; Sumaila, U.R. Potential Economic Impact of a Tanker Spill on Ocean-Based Industries in British Columbia; The University of British Columbia: Vancouver, BC, Canada, 2012. [Google Scholar] [CrossRef]
  105. Kokkos, N.; Petalas, S.; Keramea, P.; Psomouli, F.; Sylaios, G. A Digital Twin for Oil Spill Incidents in the North Aegean. Available online: https://gkhub.earthobservations.org/packages/e03x7-55x27 (accessed on 5 August 2025).
  106. Krestenitis, M.; Orfanidis, G.; Ioannidis, K.; Avgerinakis, K.; Vrochidis, S.; Kompatsiaris, I. Oil Spill Identification from Satellite Images Using Deep Neural Networks. Remote Sens. 2019, 11, 1762. [Google Scholar] [CrossRef]
  107. Conceição, M.R.A.; Mendonça, L.F.F.; Lentini, C.A.D.; Lima, A.T.C.; Lopes, J.M.; Vasconcelos, R.N.; Gouveia, M.B.; Porsani, M.J. SAR Oil Spill Detection System through Random Forest Classifiers. Remote Sens. 2021, 13, 2044. [Google Scholar] [CrossRef]
  108. Yang, J.F.; Wan, J.H.; Ma, Y.; Zhang, J.; Hu, Y.B.; Jiang, Z.C. Oil spill hyperspectral remote sensing detection based on DCNN with multi-scale features. J. Coast. Res. 2019, 90, 332–339. [Google Scholar] [CrossRef]
  109. Jiang, Z.; Zhang, J.; Ma, Y.; Mao, X. Hyperspectral Remote Sensing Detection of Marine Oil Spills Using an Adaptive Long-Term Moment Estimation Optimizer. Remote Sens. 2022, 14, 157. [Google Scholar] [CrossRef]
  110. Yang, J.; Hu, Y.; Zhang, J.; Ma, Y.; Li, Z.; Jiang, Z. Identification of marine oil spill pollution using hyperspectral combined with thermal infrared remote sensing. Front. Mar. Sci. 2023, 10, 1135356. [Google Scholar] [CrossRef]
  111. Cózar, A.; Echevarría, F.; González-Gordillo, J.I.; Irigoien, X.; Úbeda, B.; Hernández-León, S.; Palma, Á.T.; Navarro, S.; García-de-Lomas, J.; Ruiz, A.; et al. Plastic debris in the open ocean. Proc. Natl. Acad. Sci. USA 2014, 111, 10239–10244. [Google Scholar] [CrossRef]
  112. Zhang, Y.; Wu, P.; Xu, R.; Wang, X.; Lei, L.; Schartup, A.T.; Peng, Y.; Pang, Q.; Wang, X.; Mai, L.; et al. Plastic waste discharge to the global ocean constrained by seawater observations. Nat. Commun. 2023, 14, 1–12. [Google Scholar] [CrossRef]
  113. Deudero, S.; Alomar, C. Mediterranean marine biodiversity under threat: Reviewing influence of marine litter on species. Mar. Pollut. Bull. 2015, 98, 58–68. [Google Scholar] [CrossRef]
  114. Landrigan, P.J.; Stegeman, J.J.; Fleming, L.E.; Allemand, D.; Anderson, D.M.; Backer, L.C.; Brucker-Davis, F.; Chevalier, N.; Corra, L.; Czerucka, D.; et al. Human Health and Ocean Pollution. Ann. Glob. Health 2020, 86, 151. [Google Scholar] [CrossRef]
  115. Gall, S.C.; Thompson, R.C. The impact of debris on marine life. Mar. Pollut. Bull. 2015, 92, 170–179. [Google Scholar] [CrossRef]
  116. Thushari, G.G.N.; Senevirathna, J.D.M. Plastic pollution in the marine environment. Heliyon 2020, 6, e04709. [Google Scholar] [CrossRef]
  117. Andrady, A.L. Weathering and fragmentation of plastic debris in the ocean environment. Mar. Pollut. Bull. 2022, 180, 113761. [Google Scholar] [CrossRef] [PubMed]
  118. Smith, M.; Love, D.C.; Rochman, C.M.; Neff, R.A. Microplastics in Seafood and the Implications for Human Health. Curr. Environ. Health Rep. 2018, 5, 375–386. [Google Scholar] [CrossRef] [PubMed]
  119. Edyvane, K.S.; Dalgetty, A.; Hone, P.W.; Higham, J.S.; Wace, N.M. Long-term marine litter monitoring in the remote Great Australian Bight, South Australia. Mar. Pollut. Bull. 2004, 48, 1060–1075. [Google Scholar] [CrossRef] [PubMed]
  120. Zablotski, Y.; Kraak, S.B.M. Marine litter on the Baltic seafloor collected by the international fish-trawl survey. Mar. Pollut. Bull. 2019, 141, 448–461. [Google Scholar] [CrossRef]
  121. Fakiris, E.; Papatheodorou, G.; Kordella, S.; Christodoulou, D.; Galgani, F.; Geraga, M. Insights into seafloor litter spatiotemporal dynamics in urbanized shallow Mediterranean bays. An optimized monitoring protocol using towed underwater cameras. J. Environ. Manag. 2022, 308, 114647. [Google Scholar] [CrossRef]
  122. Serra-Gonçalves, C.; Lavers, J.L.; Bond, A.L. Global Review of Beach Debris Monitoring and Future Recommendations. Environ. Sci. Technol. 2019, 53, 12158–12167. [Google Scholar] [CrossRef]
  123. Nivedita, V.; Begum, S.S.; Aldehim, G.; Alashjaee, A.M.; Arasi, M.A.; Sikkandar, M.Y.; Jayasankar, T.; Vivek, S. Plastic debris detection along coastal waters using Sentinel-2 satellite data and machine learning techniques. Mar. Pollut. Bull. 2024, 209, 117106. [Google Scholar] [CrossRef]
  124. Kylili, K.; Kyriakides, I.; Artusi, A.; Hadjistassou, C. Identifying floating plastic marine debris using a deep learning approach. Environ. Sci. Pollut. Res. 2019, 26, 17091–17099. [Google Scholar] [CrossRef]
  125. da Silva Ferreira, A.T.; de Oliveira, R.C.; Ribeiro, M.C.H.; de Freitas Sousa, P.S.; de Paula Miranda, L.; de Oliveira Folharini, S.; Siegle, E. Microplastic Deposit Predictions on Sandy Beaches by Geotechnologies and Machine Learning Models. Coasts 2025, 5, 4. [Google Scholar] [CrossRef]
  126. Zhen, Y.; Wang, L.; Sun, H.; Liu, C. Prediction of microplastic abundance in surface water of the ocean and influencing factors based on ensemble learning. Environ. Pollut. 2023, 331, 121834. [Google Scholar] [CrossRef]
  127. Lorenzo-Navarro, J.; Castrillón-Santana, M.; Sánchez-Nielsen, E.; Zarco, B.; Herrera, A.; Martínez, I.; Gómez, M. Deep learning approach for automatic microplastics counting and classification. Sci. Total Environ. 2021, 765, 142728. [Google Scholar] [CrossRef] [PubMed]
  128. Neugebauer, J.; Heilig, L.; Voß, S. Digital Twins in the Context of Seaports and Terminal Facilities. Flex. Serv. Manuf. J. 2024, 36, 821–917. [Google Scholar] [CrossRef]
  129. United Nations Conference on Trade and Development (UNCTAD). Review of Maritime Transport 2024: Navigating Maritime Chokepoints; United Nations Publication: New York, NY, USA, 2024. [Google Scholar]
  130. Verschuur, J.; Koks, E.E.; Li, S.; Hall, J.W. Multi-hazard risk to global port infrastructure and resulting trade and logistics losses. Commun. Earth Environ. 2023, 4, 1–12. [Google Scholar] [CrossRef]
  131. Borriello, A.; Calvo Santos, A.; Feyen, L.; Ghiani, M.; Guillén, J.; McGovern, L.; Petrucco, G.; Pistocchi, A.; Pleguezuelo Alonso, M.; Politiek, H.D.; et al. The EU Blue Economy Report 2025; Publications Office of the European Union: Luxembourg, 2025. [Google Scholar] [CrossRef]
  132. Jahangard, M.; Xie, Y.; Feng, Y. Leveraging machine learning and optimization models for enhanced seaport efficiency. Marit. Econ. Logist. 2025, 27, 1–42. [Google Scholar] [CrossRef]
  133. Mannarini, G.; Salinas, M.L.; Carelli, L.; Petacco, N.; Orović, J. VISIR-2: Ship weather routing in Python. Geosci. Model. Dev. 2024, 17, 4355–4382. [Google Scholar] [CrossRef]
  134. Madusanka, N.S.; Fan, Y.; Yang, S.; Xiang, X. Digital Twin in the Maritime Domain: A Review and Emerging Trends. J. Mar. Sci. Eng. 2023, 11, 1021. [Google Scholar] [CrossRef]
  135. Parasyris, A.; Metheniti, V.; Vettorello, L.; Ribeiro, J.; Delgado, M.; Goddard, J.; Rebane, J.; Egerer, M.; Alexandrakis, G.; Kozyrakis, G.V.; et al. Coastal Crete: The Ship Routing/Harbor Safety Digital Twin of the Ocean. In Proceedings of the IEEE OCEANS 2025 Brest, Brest, France, 16–19 June 2025; IEEE: New York, NY, USA, 2025; pp. 1–6. [Google Scholar] [CrossRef]
  136. Salinas, M.L.; Mannarini, G. [VISIR-2 Ship Weather Routing Model] Source Code (Python). 2025. Available online: https://zenodo.org/records/15497175 (accessed on 15 December 2025).
  137. Effrosynidis, D.; Spiliotis, E.; Sylaios, G.; Arampatzis, A. Time series and regression methods for univariate environmental forecasting: An empirical evaluation. Sci. Total Environ. 2023, 875, 162580. [Google Scholar] [CrossRef]
  138. European Commission. Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions: Roadmap Towards Ending Russian Energy Imports. 2025. Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A52025DC0440R%2801%29&qid=1747125158211 (accessed on 15 December 2025).
  139. Aderinto, T.; Li, H. Ocean Wave Energy Converters: Status and Challenges. Energies 2018, 11, 1250. [Google Scholar] [CrossRef]
  140. European Commission. Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions: ‘An EU Strategy to Harness the Potential of Offshore Renewable Energy for a Climate-Neutral Future’. 2020, Brussels. Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=COM:2020:741:FIN (accessed on 10 October 2025).
  141. Sabine, C.; Robinson, C.; Isensee, K.; Bastian, L.; Batten, S.; Bellerby, R.; Blasiak, R.; Laarissa, S.; Lira Loarca, A.; McGeachy, C.; et al. Ocean Decade Vision 2030 White Papers Challenge 5: Unlock ocean-based solutions to climate change. In The United Nations Decade of Ocean Science for Sustainable Development (2021–2030); UNESCO-IOC: Paris, France, 2024. [Google Scholar] [CrossRef]
  142. UNESCO-IOC. MSPglobal Policy Brief: Marine Spatial Planning and the Sustainable Blue Economy. Paris, France. 2021. Available online: https://www.researchgate.net/publication/350956285_UNESCO-IOC_2021_MSPglobal_Policy_Brief_Marine_Spatial_Planning_and_the_Sustainable_Blue_Economy_Paris_UNESCO_IOC_Policy_Brief_no_2 (accessed on 15 December 2025).
  143. Zhou, Y. Ocean energy applications for coastal communities with artificial intelligence – a state-of-the-art review. Energy AI 2022, 10, 100189. [Google Scholar] [CrossRef]
  144. Khojasteh, D.; Shamsipour, A.; Huang, L.; Tavakoli, S.; Haghani, M.; Flocard, F.; Farzadkhoo, M.; Iglesias, G.; Hemer, M.; Lewis, M.; et al. A large-scale review of wave and tidal energy research over the last 20 years. Ocean Eng. 2023, 282, 114995. [Google Scholar] [CrossRef]
  145. Mlakar, P.; Ricchi, A.; Carniel, S.; Bonaldo, D.; Ličer, M. DELWAVE 1.0: Deep learning surrogate model of surface wave climate in the Adriatic Basin. Geosci. Model. Dev. 2024, 17, 4705–4725. [Google Scholar] [CrossRef]
  146. Bento, P.M.R.; Pombo, J.A.N.; Mendes, R.P.G.; Calado, M.R.A.; Mariano, S.J.P.S. Ocean wave energy forecasting using optimised deep learning neural networks. Ocean Eng. 2021, 219, 108372. [Google Scholar] [CrossRef]
  147. Muthamizhan, T.; Karthick, K.; Aruna, S.K.; Velmurugan, P. AI-Driven Stacking Ensemble for Predicting Total Power Output of Wave Energy Converters: A Data-Driven Approach to Renewable Energy Processes. Processes 2025, 13, 961. [Google Scholar] [CrossRef]
  148. Shahbazbegian, A.; Ghiasi, M. Developing a machine learning (ML) based graphical user interface (GUI) for significant wave height (SWH) forecasting to support wave energy converters (WECs) operations planning. Renew. Energy 2026, 256, 124490. [Google Scholar] [CrossRef]
  149. Liu, Y.; Zhang, X.; Dong, Q.; Chen, G.; Li, X. Phase-resolved wave prediction with linear wave theory and physics-informed neural networks. Appl. Energy 2024, 355, 121602. [Google Scholar] [CrossRef]
  150. Lee, D.; Yang, S.; Oh, J.-W.; Cho, S.-G.; Kim, S.; Kang, N. AI-Powered Digital Twin of the Ocean: Reliable Uncertainty Quantification for Real-Time Wave Height Prediction with Deep Ensemble. December 2024. Available online: https://arxiv.org/pdf/2412.05475 (accessed on 17 October 2025).
  151. Zhang, M.; Yuan, Z.M.; Dai, S.S.; Chen, M.L.; Incecik, A. LSTM RNN-based excitation force prediction for the real-time control of wave energy converters. Ocean Eng. 2024, 306, 118023. [Google Scholar] [CrossRef]
  152. Katsidoniotaki, E.; Psarommatis, F.; Göteman, M. Digital Twin for the Prediction of Extreme Loads on a Wave Energy Conversion System. Energies 2022, 15, 5464. [Google Scholar] [CrossRef]
  153. Poguluri, S.K.; Bae, Y.H. Enhancing Wave Energy Conversion Efficiency through Supervised Regression Machine Learning Models. J. Mar. Sci. Eng. 2024, 12, 153. [Google Scholar] [CrossRef]
  154. Halder, R.; Damodaran, M.; Khoo, B.C. Deep learning-driven nonlinear reduced-order models for predicting wave-structure interaction. Ocean Eng. 2023, 280, 114511. [Google Scholar] [CrossRef]
  155. González-Gorbeña, E.; Pacheco, A.; Plomaritis, T.A.; Ferreira, Ó.; Sequeira, C.; Moura, T. Surrogate-Based Optimization of Tidal Turbine Arrays: A Case Study for the Faro-Olhão Inlet. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2019; Volume 11538, pp. 548–561. [Google Scholar] [CrossRef]
  156. Roth, J.; Jacobs, G.; Röder, J.; Zweiffel, M.; Bauer, T.; Amos, L. Regression-based hub load calculation of a tidal turbine with blade root sensor data. Proc. Eur. Wave Tidal Energy Conf. 2025, 16. [Google Scholar] [CrossRef]
  157. Upadhyay, D.; Sampalli, S. SCADA (Supervisory Control and Data Acquisition) systems: Vulnerability assessment and security recommendations. Comput. Secur. 2020, 89, 101666. [Google Scholar] [CrossRef]
  158. Rashid, H.; Benbouzid, M.; Titah-Benbouzid, H.; Amirat, Y.; Berghout, T.; Mamoune, A. Mapping a Machine Learning Path Forward for Tidal Stream Turbines Biofouling Detection and Estimation. In IECON Proceedings (Industrial Electronics Conference); IEEE: New York, NY, USA, 2023. [Google Scholar] [CrossRef]
  159. Thanthirige, T.R.M.; Flanagan, M.; Kennedy, C.; Goggins, J.; Finnegan, W. Advancements in Structural Testing and Life Predictions of Tidal Turbine Blades. Proc. Eur. Wave Tidal Energy Conf. 2025, 16. [Google Scholar] [CrossRef]
  160. Finnegan, W.; Kennedy, C.; Flanagan, M.; Goggins, J. Strain failure limits of tidal turbine blades based on full-scale structural testing. Proc. Eur. Wave Tidal Energy Conf. 2025, 16. [Google Scholar] [CrossRef]
  161. Mbasso, W.F.; Harrison, A.; Dagal, I.; Jangir, P.; Khishe, M.; Kotb, H.; Shaikh, M.S.; Smerat, A.; Fendzi Donfack, E.; Kumar, R. Digital twins in renewable energy systems: A comprehensive review of concepts, applications, and future directions. Energy Strategy Rev. 2025, 61, 101814. [Google Scholar] [CrossRef]
  162. Singh, D.; Dwight, R.P.; Laugesen, K.; Beaudet, L.; Viré, A. Probabilistic surrogate modeling of offshore wind-turbine loads with chained Gaussian processes. J. Phys. Conf. Ser. 2022, 2265, 032070. [Google Scholar] [CrossRef]
  163. Hasan, A.; Hu, Z.; Haghshenas, A.; Karlsen, A.; Alaliyat, S.; Cali, U. An Interactive Digital Twin Platform for Offshore Wind Farms’ Development. Digit. Twin Driven Intell. Syst. Emerg. Metaverse 2023, 269–281. [Google Scholar] [CrossRef]
  164. Vasconcelos, D.; Vieira, M.; Dias, D.; Reis, L. Structural evaluation of the deepCWind offshore wind platform. Frat. Integrita Strutt. 2020, 14, 24–44. [Google Scholar] [CrossRef]
  165. Mehlan, F.C.; Keller, J.; Nejad, A.R. Virtual sensing of wind turbine hub loads and drivetrain fatigue damage. Forsch. Ingenieurwesen/Eng. Res. 2023, 87, 207–218. [Google Scholar] [CrossRef]
  166. Pereira, R.S.; Filipe, L.; Meda, R. Short Term Operation & Maintenance Planning of Offshore Wind Farms: Holistic Development of a Digital Twin. In Proceedings of the Oceans Conference Record (IEEE), Brest, France, 16–19 June 2025. [Google Scholar] [CrossRef]
  167. Srikonda, R.; Rastogi, A.; Oestensen, H. Increasing facility uptime using machine learning and physics-based hybrid analytics in a dynamic digital twin. In Proceedings of the Annual Offshore Technology Conference 2020, Houston, TX, USA, 4 May 2020. [Google Scholar] [CrossRef]
  168. He, B.; Zhao, Y.; Liu, S.; Ahmad, S.; Mao, W. Mapping seagrass habitats of potential suitability using a hybrid machine learning model. Front. Ecol. Evol. 2023, 11, 1116083. [Google Scholar] [CrossRef]
  169. da Silveira, C.B.L.; Strenzel, G.M.R.; Maida, M.; Gaspar, A.L.B.; Ferreira, B.P. Coral reef mapping with remote sensing and machine learning: A nurture and nature analysis in marine protected areas. Remote Sens. 2021, 13, 2907. [Google Scholar] [CrossRef]
  170. Phillips, S.J.; Dudík, M. Modeling of species distributions with Maxent: New extensions and a comprehensive evaluation. Ecography 2008, 31, 161–175. [Google Scholar] [CrossRef]
  171. Zhang, M.; Zhang, Y.; Yu, S.; Gao, Y.; Dong, J.; Zhu, W.; Wang, X.; Li, X.; Li, J.; Xiong, J. Two machine learning approaches for predicting cyanobacteria abundance in aquaculture ponds. Ecotoxicol. Environ. Saf. 2023, 258, 114944. [Google Scholar] [CrossRef] [PubMed]
  172. Yu, X.; Wang, Y.; An, D.; Wei, Y. Counting method for cultured fishes based on multi-modules and attention mechanism. Aquac. Eng. 2022, 96, 102215. [Google Scholar] [CrossRef]
  173. Muñoz-Benavent, P.; Andreu-García, G.; Valiente-González, J.M.; Atienza-Vanacloig, V.; Puig-Pons, V.; Espinosa, V. Enhanced fish bending model for automatic tuna sizing using computer vision. Comput. Electron. Agric. 2018, 150, 52–61. [Google Scholar] [CrossRef]
  174. Jang, J.; Baek, S.S.; Kang, D.; Park, Y.; Ligaray, M.; Baek, S.H.; Choi, J.Y.; Park, B.S.; Lee, M.I.; Cho, K.H. Insights and machine learning predictions of harmful algal bloom in the East China Sea and Yellow Sea. J. Clean. Prod. 2024, 459, 142515. [Google Scholar] [CrossRef]
  175. Hill, P.R.; Kumar, A.; Temimi, M.; Bull, D.R. HABNet: Machine Learning, Remote Sensing-Based Detection of Harmful Algal Blooms. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 3229–3239. [Google Scholar] [CrossRef]
  176. Mugo, R.; Saitoh, S.I. Ensemble Modelling of Skipjack Tuna (Katsuwonus pelamis) Habitats in the Western North Pacific Using Satellite Remotely Sensed Data; a Comparative Analysis Using Machine-Learning Models. Remote Sens. 2020, 12, 2591. [Google Scholar] [CrossRef]
  177. Palaiokostas, C. Predicting for disease resistance in aquaculture species using machine learning models. Aquac. Rep. 2021, 20, 100660. [Google Scholar] [CrossRef]
  178. Le, N.-B.-V.; Huh, J.-H. AgTech: Building Smart Aquaculture Assistant System Integrated IoT and Big Data Analysis. IEEE Trans. AgriFood Electron. 2024, 2, 471–482. [Google Scholar] [CrossRef]
  179. Majumder, S.; Maity, S.; Balakrishnan Nair, T.M.; Bright, R.P.; Nagaraja Kumar, M.; Shwetha, N.; Kumar, N. Potential Fishing Zone Characterization in the Indian Ocean by Machine Learning Approach. Adv. Intell. Syst. Comput. 2021, 2, 43–54. [Google Scholar] [CrossRef]
  180. Ullah, I.; Kim, D.H. An Optimization Scheme for Water Pump Control in Smart Fish Farm with Efficient Energy Consumption. Processes 2018, 6, 65. [Google Scholar] [CrossRef]
  181. Chaplin, J.C.; Martinez-Arellano, G.; Mazzoleni, A. Digital Twins and Intelligent Decision Making. In Digital Manufacturing for SMEs: An Introduction; University of Nottingham: Nottingham, UK, 2020. [Google Scholar] [CrossRef]
  182. Raes, L.; McAleer, S.R.; Croket, I.; Kogut, P.; Brynskov, M.; Lefever, S. Decide Better: Open and Interoperable Local Digital Twins; Springer: Berlin/Heidelberg, Germany, 2025. [Google Scholar] [CrossRef]
  183. Ricciardi, G.; Callegari, G. Digital Twins for Climate-Neutral and Resilient Cities. State of the Art and Future Development as Tools to Support Urban Decision-Making. Urban Book Ser. 2023, F813, 617–626. [Google Scholar] [CrossRef]
  184. Lu, Y.; Liu, C.; Wang, K.I.-K.; Huang, H.; Xu, X. Digital Twin-driven smart manufacturing: Connotation, reference model, applications and research issues. Robot. Comput. Integr. Manuf. 2020, 61, 101837. [Google Scholar] [CrossRef]
  185. Miedtank, A.; Schneider, J.; Manss, C.; Zielinski, O. Marine digital twins for enhanced ocean understanding. Remote Sens. Appl. 2024, 36, 101268. [Google Scholar] [CrossRef]
  186. Cavalieri, S.; Gambadoro, S. Digital Twin of a Water Supply System Using the Asset Administration Shell. Sensors 2024, 24, 1360. [Google Scholar] [CrossRef]
  187. Zhao, T.; Wang, S.; Ouyang, C.; Chen, M.; Liu, C.; Zhang, J.; Yu, L.; Wang, F.; Xie, Y.; Li, J.; et al. Artificial intelligence for geoscience: Progress, challenges, and perspectives. Innovation 2024, 5, 100691. [Google Scholar] [CrossRef]
  188. Tzachor, A.; Hendel, O.; Richards, C.E. Digital twins: A stepping stone to achieve ocean sustainability? NPJ Ocean Sustain. 2023, 2, 16. [Google Scholar] [CrossRef]
  189. Alzubaidi, L.; Bai, J.; Al-Sabaawi, A.; Santamaría, J.; Albahri, A.S.; Al-dabbagh, B.S.N.; Fadhel, M.A.; Manoufali, M.; Zhang, J.; Al-Timemy, A.H.; et al. A survey on deep learning tools dealing with data scarcity: Definitions, challenges, solutions, tips, and applications. J. Big Data 2023, 10, 46. [Google Scholar] [CrossRef]
  190. Orofino, S.; McDonald, G.; Mayorga, J.; Costello, C.; Bradley, D. Opportunities and challenges for improving fisheries management through greater transparency in vessel tracking. ICES J. Mar. Sci. 2023, 80, 675–689. [Google Scholar] [CrossRef]
  191. Wesselkamp, M.; Moser, N.; Kalweit, M.; Boedecker, J.; Dormann, C.F. Process-Informed Neural Networks: A Hybrid Modelling Approach to Improve Predictive Performance and Inference of Neural Networks in Ecology and Beyond. Ecol. Lett. 2024, 27, e70012. [Google Scholar] [CrossRef]
  192. Ryo, M.; Angelov, B.; Mammola, S.; Kass, J.M.; Benito, B.M.; Hartig, F. Explainable artificial intelligence enhances the ecological interpretability of black-box species distribution models. Ecography 2021, 44, 199–205. [Google Scholar] [CrossRef]
  193. Linardatos, P.; Papastefanopoulos, V.; Kotsiantis, S. Explainable AI: A review of machine learning interpretability methods. Entropy 2021, 23, 18. [Google Scholar] [CrossRef]
  194. Mastrandrea, M.D.; Mach, K.J.; Plattner, G.K.; Edenhofer, O.; Stocker, T.F.; Field, C.B.; Ebi, K.L.; Matschoss, P.R. The IPCC AR5 guidance note on consistent treatment of uncertainties: A common approach across the working groups. Clim. Chang. 2011, 108, 675–691. [Google Scholar] [CrossRef]
  195. Gómez-Talal, I.; Azizsoltani, M.; Bote-Curiel, L.; Rojo-Álvarez, J.L.; Singh, A. Towards Explainable Artificial Intelligence in Machine Learning: A study on efficient Perturbation-Based Explanations. Eng. Appl. Artif. Intell. 2025, 155, 110664. [Google Scholar] [CrossRef]
  196. Darvishvand, L.; Kamkari, B.; Huang, M.J.; Hewitt, N.J. A systematic review of explainable artificial intelligence in urban building energy modeling: Methods, applications, and future directions. Sustain. Cities Soc. 2025, 128, 106492. [Google Scholar] [CrossRef]
  197. Chakraborty, D.; Başağaoğlu, H.; Gutierrez, L.; Mirchi, A. Explainable AI reveals new hydroclimatic insights for ecosystem-centric groundwater management. Environ. Res. Lett. 2021, 16, 114024. [Google Scholar] [CrossRef]
Figure 1. Schematic distinction between a digital model, a digital shadow, and a digital twin, according to the data flow between the physical and virtual systems.
Figure 2. Categorization of AI-powered surrogate models utilized in DTO applications.
Figure 3. Comparative Flow Chart of Traditional and AI-Based Methodologies for Water Quality Index (WQI) Parameter Estimation in Digital Twins of the Ocean (DTOs).
Figure 4. Oil spill DTO example in the Cretan Sea, visualized through Geomachine (https://geomachine.com, last accessed 10 October 2025), showing forecasts produced after spill masks were identified through satellite imagery and ML: (a) 1 h forecast; (b) 23 h forecast.
Table 4. Synthesis of Major Challenges and Future Research Directions for Operational Machine Learning in DTOs.

| Challenge Category | Specific Constraint | Future Research Direction/Solution | Supporting ML Concept |
| --- | --- | --- | --- |
| Data Scarcity & Quality | Spatially sparse/temporally irregular data leading to overfitting. | Investment in open observational infrastructure and metadata standards. | Transfer Learning, Data Augmentation |
| Data Governance | High-value datasets restricted by commercial/privacy concerns (e.g., fisheries). | Technical solutions for selective data sharing and privacy-preserving analytics. | Differential Privacy |
| Model Portability | High cost of re-training ML models for new regions. | Broader adoption of physics-informed and hybrid modeling strategies. | Physics-Informed Neural Networks |
| Model Credibility & Risk | Opaque systems (black boxes) and risk of model fabrication (hallucinations). | Transparent reporting of uncertainties (epistemic/aleatoric). | Uncertainty Quantification |
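As an illustration of the "Model Credibility & Risk" row of Table 4, the following minimal sketch shows one common way to report epistemic and aleatoric uncertainty separately from an ensemble of probabilistic predictors. It is not taken from any of the reviewed studies. It assumes each ensemble member returns a predictive mean and variance for the same target; the function name `decompose_uncertainty` and the sample values (loosely thought of as significant wave height in metres) are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split ensemble predictive variance into aleatoric and epistemic parts.

    means[i] and variances[i] are the predictive mean and variance reported
    by ensemble member i (hypothetical inputs, assumed for illustration).
    """
    if not means or len(means) != len(variances):
        raise ValueError("need one (mean, variance) pair per ensemble member")
    # Aleatoric part: average of the noise levels the members attribute to the data.
    aleatoric = fmean(variances)
    # Epistemic part: spread of the member means, i.e. disagreement between models.
    epistemic = pvariance(means)
    return {
        "mean": fmean(means),
        "aleatoric": aleatoric,
        "epistemic": epistemic,
        "total": aleatoric + epistemic,
    }


# Three hypothetical ensemble members predicting the same quantity.
result = decompose_uncertainty([2.0, 2.2, 1.8], [0.10, 0.12, 0.08])
print(result)
```

Reporting the two terms separately can help users tell whether a wide prediction interval stems from noisy observations (aleatoric) or from model disagreement that more training data might reduce (epistemic).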
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

