Design of an Observing System Simulation Experiment for the Operational Model of the Southwestern Coast of Iberia (SOMA)

Mendonça, Fernando; Martins, Flávio; Bertino, Laurent

doi:10.3390/jmse13091830

by Fernando Mendonça ¹,²,*, Flávio Martins ¹,³ and Laurent Bertino ⁴

¹ Centre for Marine and Environmental Research, University of Algarve, 8005-139 Faro, Portugal

² Faculty of Science and Technology, University of Algarve, 8005-139 Faro, Portugal

³ Superior Institute of Engineering, University of Algarve, 8005-139 Faro, Portugal

⁴ Nansen Environmental and Remote Sensing Center, 5007 Bergen, Norway

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2025, 13(9), 1830; https://doi.org/10.3390/jmse13091830

Submission received: 19 August 2025 / Revised: 16 September 2025 / Accepted: 18 September 2025 / Published: 21 September 2025

(This article belongs to the Special Issue Monitoring of Ocean Surface Currents and Circulation)

Abstract

Observing System Simulation Experiments (OSSEs) provide a framework in which to evaluate the impact of prospective ocean-observation networks on model forecasting performance prior to their actual deployment. This study presents the design and validation of an OSSE tailored for the operational coastal model of southern Portugal, SOMA. The system adopts the fraternal twins approach and a univariate data-assimilation scheme based on Ensemble Optimal Interpolation to update the model’s 3D temperature structure with SST. The methodology provides a flexible framework that preserves the statistical structure of real observation errors while remaining independent of SOMA. This allows straightforward transfer to other applications, thereby broadening its applicability and making it useful as a starting point in the design of observation networks beyond that presented in this case study. The OSSE experiments were compared against corresponding Observing System Experiments (OSEs) using real satellite SST products. Results show that the designed OSSE is internally consistent, sensitive to observation density, and capable of reproducing realistic correction patterns that closely match those obtained in the OSEs. These findings provide strong evidence that the SOMA OSSE system is a reliable tool for assessing the potential impact of future surface-observation strategies.

Keywords: coastal model; ocean observation; data assimilation; ensemble optimal interpolation; observation experiments

1. Introduction

Numerical modelling is one of the primary tools used for understanding and predicting marine system dynamics. Hydrodynamic models simulate the ocean state by translating the physics of fluid motion into a computational framework that approximates the solution of discretized primitive equations using numerical methods [1,2]. These methods are implemented as algorithms that advance the ocean state forward in time over a defined spatial domain. Through this process, models enable the simulation of oceanic conditions, providing valuable insights into the structure and evolution of marine systems under both natural variability and anthropogenic influence.

While numerical models are essential in ocean prediction, they are still only one element of the broader framework used to understand marine systems in operational oceanography. Observations lie at the centre of the scientific method and provide the factual basis needed to develop and validate theoretical formulations and numerical models. It is primarily through the continuous and systematic observation of ocean phenomena via in-situ measurements, remote sensing, or other monitoring technologies [3] that empirical knowledge of the ocean’s dynamics is established. Historical data collected on water variables provide critical information about climate evolution, such as changes in sea surface height (SSH) [4,5] and variations in water temperature [6,7].

The observation network is a cornerstone of operational oceanography and has driven increasing demand for high-quality oceanographic data. As a result, the evolution of observation-system technologies has enabled real-time measurements of various ocean variables [8,9]. Despite this progress, observational data often present significant spatial and temporal gaps, particularly in remote or logistically challenging regions. Because the implementation of new ocean-observation systems typically involves substantial financial investment and technical complexity, Observing System Simulation Experiments (OSSEs) provide a framework in which to evaluate the potential impact of additional observations before their actual deployment by exploring hydrodynamic models and data assimilation (DA) systems [10].

The purpose of an OSSE system is to ensure that the additional observations will meaningfully contribute to scientific and operational goals. However, since it relies on synthetic data (Section 3), the development of such a system requires an Observing System Experiment (OSE) to serve as a benchmark against which to validate the OSSE [11]. While both approaches operate in a similar manner, OSEs rely on real-world, existing observations to evaluate their impact through data denial experiments, comparing model performance with and without specific datasets. By demonstrating that an OSSE system produces impact assessments consistent with those of an OSE, researchers confirm the credibility of the framework for evaluating future observing systems. This validation step is critical to ensuring the system produces realistic and reliable predictions before resources are committed to new observational deployments.

Within this context, the NAUTILOS project [12] is focused on developing cost-effective next-generation technologies for ocean observation. Funded by the Horizon 2020 program of the EU’s Future of Seas and Oceans Flagship Initiative, the project addresses critical gaps in marine monitoring, particularly for chemical, biological, and deep-ocean physical variables. By integrating innovative sensors and samplers into a range of ocean platforms, such as autonomous vehicles, mooring buoys, and fisheries-observation systems, the project contributes to a standardized, interoperable system for marine data collection, enhancing global efforts to understand and manage ocean ecosystems. Aligned with the broader goals of the NAUTILOS project, the present work focuses on the development and validation of an OSSE system tailored to an operational hydrodynamic model of the southern coast of Portugal. The primary objective is to establish and verify the functionality of the OSSE system itself, ensuring that it is capable of supporting future assessments of observational strategies.

The subsequent sections are structured to present the key components and methodologies adopted in the development of the proposed OSSE system. Section 2 outlines the main oceanographic features of the Algarve region, the focus of this study. Next, Section 3 provides a comprehensive overview of the OSSE framework, while Section 4 presents the general formulation of the Ensemble Optimal Interpolation (EnOI) scheme, which was used as the DA method in the designed system. Building on this context, the OSSE system was implemented for the Algarve Operational Modeling and Monitoring System (SOMA), a hydrodynamic model based on MOHID, as described in the first part of the methodology (Section 5). The second part details the design choices made for the system, including a fraternal twins configuration of SOMA, which was used to represent two distinct model realizations for comparative assessment. It also describes how observations are extracted from the system and the implementation of the EnOI scheme in SOMA. Finally, Section 6 and Section 7 present the results of the system’s implementation and the main conclusions of this work.

2. Algarve General Circulation

The south of Iberia is characterized by a Mediterranean climate, which features distinct seasonal variations with hot, dry summers and mild, humid winters. The average annual temperature in this region is approximately 17 °C, with summer temperatures frequently exceeding 22 °C and reaching maximums around 40 °C. In contrast, winter temperatures typically remain below 18 °C, rarely dropping below freezing [13]. The climate and especially the wind regime on the Algarve coast are intimately related to the North Atlantic Oscillation (NAO) [14,15]. During the warmest months, the effects of NAO combined with the low precipitation rate can even cause extreme droughts [16,17]. In the summer, it is common for the NAO pressure gradient to be large, favouring strong westerly winds in northern Europe. However, as the centre of the high-pressure field moves towards Portugal, the winds change direction, becoming northerly on the west coast. In winter, on the other hand, this gradient is not so noticeable and the wind patterns are more variable, and northerly winds are generally weaker or even replaced by southerly components [18].

The observed oceanographic patterns and circulation in the Algarve are highly dynamic and are influenced by a combination of upwelling, poleward currents, and mesoscale eddies [19]. Based on the cyclical wind regimes associated with the pressure-gradient variability of NAO, the southward winds in the summer drive upwelling currents along the west coast, bringing cold, nutrient-rich waters to the surface [20]. During that season, the temperature difference between land and ocean waters creates a pressure gradient that channels the northerly winds along the south coast, creating conditions that favour upwelling currents in this region as well [14]. As a result of the upwelling, the flow of colder waters along the Algarve’s southern coastline is typically eastward, while warmer oceanic waters lying offshore of the continental shelf flow westward [21].

When the NAO pressure gradient weakens, westerly winds become predominant, a typical condition of winter months. This change reduces the intensity of upwelling events, thereby allowing the warm countercurrent from the Gulf of Cadiz to strengthen and propagate westward along the south coast [22,23]. This current continues in a poleward course along the west coast, carrying warm and saline waters that interact with other equatorward currents and thus creating complex circulation patterns along the Portuguese coastline [24,25]. Although upwelling is possible during the cold season, it is driven by this poleward flow and differs from the upwelling seen in summer, which is characterized by colder and nutrient-rich waters [26]. In this context, the most relevant variable to update in the model for this region would be sea surface temperature (SST), as its inclusion would help to constrain and correct the vertical structure of the coastal water masses, thereby improving the representation of subsurface temperature fields.

3. Observing System Simulation Experiment

Driven by the need to mitigate spatial and temporal gaps in ocean data coverage, observation networks are continuously being developed and improved. As technological advancements enable the deployment of increasingly sophisticated sensors and platforms, it becomes essential to evaluate their potential impact to ensure they will bring benefits to ocean forecasting once operational. This is the exact purpose of an OSSE system, which provides a framework in which to assess observation networks prior to their real-world deployment, helping to guide resource allocation and optimize the network design.

OSSE systems have been employed in numerical weather prediction since the early 1980s [27,28]. This methodology was adopted by the ocean-modeling community more recently [29,30], and due to this, oceanic OSSEs remain in a relatively early stage of development compared to their atmospheric counterparts. As discussed in [11], this highlights the importance of adopting methodical and robust design strategies, as well as rigorous validation techniques. Without proper calibration, OSSE systems risk producing misleading results, potentially overestimating or underestimating the value of proposed observational systems and thus supporting biased decision-making in sensor deployment.

The essential concept behind an OSSE is the substitution of reality, which is difficult and expensive to observe, with a high-quality numerical simulation, commonly referred to as the Nature Run (NR). The NR serves as a proxy for the true state of the ocean, and synthetic observations are extracted from the simulation. In parallel, a Forecast Model (FM) is integrated and has its state updated with the synthetic observations using a DA system. This process constitutes an OSSE. The performance of the FM and OSSE forecasts is then assessed against the NR using a set of defined metrics and parameters. Essentially, OSSEs function as “data denial” experiments, creating conditions in which to assess the effectiveness of different observation strategies by selectively manipulating synthetic data.

As highlighted in [11,31], the numerical frameworks behind NR and FM should be as distinct as possible; i.e., the two models should have different numerical implementations, resolution grids, parameterisations, and boundary conditions. However, achieving this level of difference between the NR and FM can be challenging. Consequently, one might want to create an OSSE system in which NR and FM are based on the same numerical model (e.g., [32,33]) but adjust other factors to the greatest extent possible. These systems are referred to as fraternal twins experiments: the same model is used, but with different setups. In contrast, in an identical twins approach, NR and FM are based on the same model and the differences between them are too limited. Such experiments lead to unrealistic error growth between the models and therefore yield biased impact assessments in ocean DA systems [34].

As mentioned in Section 1, prior to employing an OSSE system for the evaluation of a prospective observation system, it is of paramount importance to validate the system with a corresponding OSE. An OSE functions in a manner analogous to an OSSE, but it employs actual observations instead of the synthetic data generated by the NR. Thus, the FM is assessed against actual observations in the OSE, whereas in the OSSE, it is compared to the NR. When designing an OSSE, it is crucial to ensure that the differences between the NR and FM are substantial but not so extensive that the latter becomes unrealistic. Furthermore, special attention must be paid to avoid assigning unrealistic errors to the synthetic observations, as this could skew impact assessments. Once these steps are completed, the consistency of observation impacts between the two systems is compared. The common metrics used to assess observations in both the OSE and OSSE, such as averages, root mean square errors (RMSE), and skill scores, are detailed in Appendix A.
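As an illustration of the kind of metrics mentioned above, the following minimal sketch computes an RMSE and an MSE-based skill score for a toy temperature field. The definitions used here are common textbook forms and are assumptions; the exact expressions adopted in this work are those of Appendix A. The function names (`rmse`, `skill_score`) and the toy arrays are hypothetical.

```python
import numpy as np


def rmse(model, reference):
    """Root mean square error between two fields, ignoring NaN (e.g., land) cells."""
    diff = np.asarray(model, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.nanmean(diff ** 2)))


def skill_score(rmse_assim, rmse_free):
    """MSE-based skill score (assumed form): 1 is a perfect match with the
    reference, 0 means no gain over the free run, negative means degradation."""
    return 1.0 - (rmse_assim ** 2) / (rmse_free ** 2)


# Toy example: the reference plays the role of the NR in an OSSE
# (or of real observations in an OSE); NaN marks a land cell.
truth = np.array([[15.0, 16.0], [17.0, np.nan]])
free_run = truth + 1.0   # run without assimilation
assim_run = truth + 0.5  # run with assimilation

rmse_free = rmse(free_run, truth)
rmse_assim = rmse(assim_run, truth)
print(rmse_free, rmse_assim, skill_score(rmse_assim, rmse_free))
```

In this toy case the assimilation halves the error, which corresponds to a skill score of 0.75 under the assumed definition.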

4. Data-Assimilation Scheme

An ocean model is a representation of reality and is therefore prone to imperfection. The states generated by a model are subject to various sources of uncertainty, including mathematical model simplifications, model parameterisation, limited temporal and spatial resolution, truncation and rounding errors, and boundary and initial conditions. DA methodologies provide a set of processes to reduce these uncertainties by combining real-world measurements with model results. They aim to estimate the best state of the represented system by statistically evaluating the errors of both the observations and the model.

Advanced data-assimilation methods, such as the four-dimensional variational (4DVAR) or the Ensemble Kalman Filter (EnKF) methods, have been proven optimal for linear systems but are computationally expensive. In practice, most operational forecasting systems rely on computationally inexpensive variants like 3DVAR and the EnOI [35]. Regardless, most DA problems can be traced back to the Bayesian state-estimation problem, meaning each method derives from this common root but adopts different strategies and assumptions. Since a comprehensive review of DA methods lies beyond the scope of this paper, readers are directed to the formulations presented in [36,37], both of which start from the Bayesian formulation.

Given its similarities to methods employed in operational systems, the EnOI was selected for this work. This method builds on the EnKF framework, which estimates the background error covariance matrix from the spread and correlations among a finite ensemble of model realizations [38]. During the analysis update, this approximation provides a representation of the error statistics for the ensemble average. However, ensemble-based DA methods like EnKF can be computationally prohibitive due to the cost of generating and integrating large ensembles [39], a limitation that EnOI mitigates by using a static (time-invariant) ensemble.

The EnOI scheme can be regarded as a simplification of the EnKF [40,41]. The main distinction lies in the assumption of the time-invariant error statistics: in EnOI, the background error covariance matrix is static in time and is derived from a fixed ensemble of model anomalies. As noted in [39], because the computational cost of ensemble-based methods scales with the number of ensemble members, EnOI can be approximately N times less expensive than the EnKF, where N is the number of ensemble members used in both schemes. Moreover, in EnOI, the analysis update is computed only once for the background state, whereas in the EnKF, each ensemble member requires a separate update step. The general formulation of EnOI, including the standard analysis equation of a DA problem, is presented in Appendix B.
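To make the single-update character of EnOI concrete, the sketch below implements a widely used form of the analysis step, x_a = x_b + αPHᵀ(αHPHᵀ + R)⁻¹(y − Hx_b), with P estimated from static ensemble anomalies. This is an assumed textbook form without localization, not necessarily the exact formulation of Appendix B or of the algorithm adopted in this work. The function `enoi_update`, its arguments, and the toy three-level temperature column are all hypothetical.

```python
import numpy as np


def enoi_update(x_b, ensemble, y, obs_idx, obs_err_std, alpha=1.0):
    """One EnOI analysis step (assumed textbook form, no localization).

    x_b         : (n,) background state
    ensemble    : (n, N) static ensemble, one model state per column
    y           : (m,) observations
    obs_idx     : (m,) indices of the observed state elements (plays the role of H)
    obs_err_std : (m,) observation error standard deviations
    alpha       : scalar weight applied to the ensemble covariance
    """
    n_members = ensemble.shape[1]
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)  # A'
    h_anom = anomalies[obs_idx, :]                               # H A'
    # alpha * P H^T and alpha * H P H^T, with P = A' A'^T / (N - 1)
    p_ht = alpha * anomalies @ h_anom.T / (n_members - 1)
    h_p_ht = alpha * h_anom @ h_anom.T / (n_members - 1)
    r = np.diag(np.asarray(obs_err_std, dtype=float) ** 2)       # diagonal R
    innovation = y - x_b[obs_idx]
    # Solve the linear system rather than forming an explicit inverse
    weights = np.linalg.solve(h_p_ht + r, innovation)
    return x_b + p_ht @ weights


# Toy column: surface, mid-depth, and deep temperature (degrees C).
ensemble = np.array([[14.0, 15.0, 16.0, 17.0],
                     [13.0, 13.5, 14.0, 14.5],
                     [12.0, 12.0, 12.0, 12.0]])
x_b = np.array([15.0, 13.5, 12.0])
x_a = enoi_update(x_b, ensemble, y=np.array([16.0]), obs_idx=[0],
                  obs_err_std=[0.5], alpha=1.0)
print(x_a)  # approximately [15.87, 13.93, 12.0]
```

In this toy case a single surface observation warms the correlated mid-depth level by half the surface increment and leaves the uncorrelated deep level unchanged. This is how the ensemble covariance spreads surface information through the water column.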

Although EnOI provides a computationally efficient alternative to EnKF by assuming a static background error covariance, it retains a key strength of ensemble-based methods: its multivariate assimilation capability. The background error covariance matrix encodes statistical relationships between different state variables, thereby allowing the update to propagate across them [42]. This feature enables the observation of a variable (e.g., SSH) to influence and modify other variables that are correlated (e.g., temperature and salinity), which is crucial for preserving dynamical consistency in ocean models. Due to its balance between computational efficiency and robust assimilation, EnOI has been widely adopted in various ocean applications, as seen in the above-mentioned studies: for shelf circulation off the Oregon (US) coast [41], using high-frequency (HF) radar-derived surface velocity measurements; for the Gulf of Mexico [42], using satellite altimetry data; and in the Australian region [39], using a combination of remote and in situ observations. It has also been used in other works, such as for the South Atlantic off the Brazilian coast [43], using satellite altimetry; for the western Mediterranean [44], using CTD probes and underwater glider datasets; and in the Hong Kong area [45], using temperature and salinity profiles.

5. Methodology

5.1. The Operational Forecasting System of the Algarve

To achieve the proposed objective of this work, the OSSE system was developed and integrated with the Algarve Operational Modelling and Monitoring System (SOMA) [46]. The aim is to provide a system that can be used to assess the impact of new observation networks launched in the region where the model is implemented and determine how the model’s forecasting performance would benefit from these networks. The model itself is a realization of the MOHID Modelling System [47], which is a robust numerical tool programmed in ANSI FORTRAN. It is designed in a modular architecture in which each module represents a different marine process that may be physical, chemical, or biological. Although the programming language is not specifically object-oriented, its architecture allows the model to work in this paradigm. This makes it possible to run several modules simultaneously and renders the model suitable for downscaling applications [48] such as SOMA.

MOHID applies an Eulerian approach with a finite volume method for spatial discretization to solve the primitive Navier–Stokes equations of fluid motion. These equations, applied at a macroscopic level to the control volume of each grid cell, consist of the conservation of mass, the conservation of momentum, and a thermal energy equation. The model considers the assumptions of hydrostatic equilibrium and the Boussinesq approximation [49]. A notable feature of MOHID is its ability to use generic vertical coordinates, enabling the simulation of complex geometries across multiple vertical subdomains [50]. Temporal discretization is performed using a semi-implicit alternating-direction implicit (ADI) algorithm with two time levels per iteration [51,52]. Additionally, the model includes modules for Lagrangian transport, which are particularly relevant for oil-spill applications [53,54].

According to [55], SOMA was initially developed as a high-resolution model designed to predict oil-spill trajectories off the Algarve coast. Subsequently, as demonstrated in [46], it was further employed to identify sources of oil leakage by combining backtracking simulations with trajectory data from ships. Such studies confirmed SOMA’s capability to reproduce the general ocean circulation in the Algarve region. As a result, the model was finally adjusted to render it operational; it has provided daily four-day forecasts since 2019, ensuring continuous oceanographic predictions of water conditions. The model has been evolving into an observatory for the region and features a visualization interface, with open access to the data available via a THREDDS server [56].

The model bathymetry is based on data from the European Marine Observation and Data Network (EMODNET) and is represented by two levels of increasing spatial resolution. The first level employs a horizontal grid with a resolution of 2 km, while the second level refines the grid to 1 km and is implemented in a smaller area. In both configurations, the vertical grid is discretized into 50 Cartesian layers, extending to a depth of about 4500 m. At the open boundary, a water-level condition based on the method proposed by [57] is applied, while a Flow Relaxation Scheme (FRS) [58] is used for the velocity, salinity, and temperature. FRS is also used to manage the communication between the two levels. Further details of the model implementation can be found in [46]. SOMA’s implementation area is shown in Figure 1.

External boundary fields for current velocity, temperature, and salinity are provided by the Global Ocean Physics Analysis and Forecast, a product of Mercator Ocean International through the Copernicus Marine Environment Monitoring Service (CMEMS) [59]. For the tide field, SOMA has an additional grid, which is a simple 2D hydrodynamic domain, forced by the FES2014 global tidal solution [60]. This domain has the single purpose of generating and supplying the tidal conditions to the first level of SOMA, and thus it is referred to as Level 0. At the atmospheric boundary, the model integrates data from the regional weather-forecast system SKIRON [61,62], which was developed by the Atmospheric Modeling and Weather Forecasting Group of the National and Kapodistrian University of Athens. The system provides hourly inputs on a 5 km resolution grid, including variables for wind-velocity components, air temperature, sea-level pressure, and other factors required to compute heat and momentum fluxes through bulk formulae.

In the operational cycle of SOMA, Level 0 is used exclusively to generate tidal boundary conditions and does not produce valid forecast data. In contrast, Levels 1 and 2 provide four-day forecasts of SSH, as well as three-dimensional fields of temperature, salinity, and ocean currents. A noteworthy aspect, as highlighted in [63], is SOMA’s operational strategy of performing weekly restart simulations based on CMEMS boundary conditions. This operational configuration serves to update the initial conditions for each new forecast cycle and is explored in the OSSE system implementation for SOMA, as detailed in the following section.

5.2. SOMA OSSE System Design

5.2.1. Nature-Run and Free-Run Configurations

As outlined in the objectives of this study, the OSSE system was specifically developed to operate using SOMA as the background model. Given this premise, the ocean-model component of the OSSE is not a variable under investigation but is rather assumed to be a fixed and fully operational system. However, the experiment requires simulation of two different ocean states: one representing the true state (the NR) and another emulating a typical forecast scenario (the FM). In this work, the distinction between NR and FM was achieved by applying the fraternal twins approach, which resulted in using two different configurations of SOMA.

To represent the NR, this study relied on the existing SOMA forecast database for the year 2023. Accordingly, the key features of the NR within the OSSE system are the use of the Level 2 grid, characterized by a horizontal resolution of 1 km, and the incorporation of weekly restart cycles based on CMEMS boundary conditions. In contrast, configuring the FM required the introduction of controlled perturbations to SOMA to ensure that its evolution would diverge sufficiently from that of the NR while remaining physically realistic. The FM was initialized with the same initial conditions used for the NR at the start of 2023, as generated by a restart simulation. However, the main structural change involved modifying the Level 2 grid, which was applied over the area illustrated in Figure 1, panel 1b, by reducing its resolution to 2 km, thereby matching the coarser Level 1 grid shown in panel 2a of the same figure. Furthermore, unlike the NR, the FM was executed continuously throughout 2023 without any weekly updates from CMEMS. As a result, this configuration is referred to as the Free Run (FR), and it constitutes the second instance of SOMA used for comparison against the NR according to the metrics described in Appendix A.

5.2.2. Data-Assimilation System

The next fundamental component of the OSSE is the DA system, responsible for incorporating synthetic observations into the FR. The assimilation algorithm employed was based on the formulation presented in [64]. Modifications were made to ensure compatibility with the SOMA model grid and to enable the reading of synthetic observation data as input, as described in Section 5.2.3.

The ensemble used to estimate the background error covariance in the EnOI scheme was sampled from the SOMA FR integration, the same model that was updated by the synthetic observations. Throughout the full-year integration of 2023, model states were extracted at regular intervals, specifically at a rate of one state every 25 h, in order to avoid tidal aliasing; the resulting total comprised 354 ensemble members. This sampling strategy was designed to capture the system’s natural variability across the range of oceanographic conditions of the implemented region. Given the strong seasonal behaviour of the Algarve’s circulation, as discussed in Section 2, the ensemble was considered sufficient to represent background error structures.

In this implementation, the weighting coefficient α, which regulates the balance between the background ensemble variance and the observation error covariance in the analysis update (Appendix B), was set to the default value of 1.0. This choice effectively gives greater weight to the observations. While previous sensitivity studies have shown that smaller values of α may reduce spurious noise at the cost of weaker updates [65], higher values improve the initial agreement with observations, albeit at the expense of forecast accuracy at longer lead times. Because the present study was dedicated to the development and demonstration of the OSSE framework rather than to parameter optimization, a sensitivity analysis was not performed here. This represents a limitation of the current implementation; therefore, a systematic evaluation of α will be addressed in a future study.

In DA systems, localization functions are commonly employed to reduce the impact of spurious long-range correlations, which arise in ensemble-based DA methods due to the finite ensemble size [36]. For this reason, the localization algorithm available in the adopted EnOI scheme was used with the Gaspari–Cohn correlation function [66]. This is a quasi-Gaussian function that decreases the correlation values gradually and continuously as the distance increases (smooth tapering) and becomes exactly zero beyond a certain distance (compact support). The transition from full to zero correlation is therefore smooth, which avoids the introduction of artificial discontinuities into the assimilation process while damping the spurious long-range correlations. The localization radius was set to 10 grid cells, which corresponds to a physical scale of 20 km in the SOMA FR configuration.
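For reference, the fifth-order piecewise rational function of Gaspari and Cohn can be sketched as below. The function name `gaspari_cohn` and its arguments are hypothetical. The mapping between the 20 km localization radius and the function's half-width parameter is also an assumption: here the radius is treated as the cutoff distance at which the taper reaches zero, so the half-width is 10 km. The convention used in the adopted EnOI code may differ.

```python
def gaspari_cohn(distance, half_width):
    """Gaspari-Cohn fifth-order taper: 1 at zero distance, exactly 0 beyond
    twice the half-width (compact support)."""
    r = abs(distance) / half_width
    if r <= 1.0:
        return (-0.25 * r**5 + 0.5 * r**4 + 0.625 * r**3
                - (5.0 / 3.0) * r**2 + 1.0)
    if r < 2.0:
        return (r**5 / 12.0 - 0.5 * r**4 + 0.625 * r**3
                + (5.0 / 3.0) * r**2 - 5.0 * r + 4.0 - 2.0 / (3.0 * r))
    return 0.0


# Assumption: the 20 km radius is the cutoff distance, so half_width = 10 km.
distances_km = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
weights = [gaspari_cohn(d, half_width=10.0) for d in distances_km]
print([round(w, 4) for w in weights])
```

In a localized analysis, each ensemble-derived covariance between an observation and a model grid point would be multiplied by the taper value for their separation, so that observations have no influence beyond the cutoff distance.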

5.2.3. Synthetic Observations and the Reference OSE

The final component of the OSSE system is the process of extracting synthetic observations from the NR integration. As previously described, those observations are not actual measurements obtained from external instruments. Instead, they are values extracted directly from the model’s state vector during the NR simulation. These values are then treated as if they were observational data and assimilated into the FR. To achieve this, a specific procedure was implemented to extract data from the NR outputs at predefined spatial and temporal locations chosen to mimic the structure of a realistic observation network.

Although the assimilation update is applied to the full three-dimensional model domain, the current implementation of the DA system in SOMA was restricted to surface observations. Extending the system to incorporate subsurface data remains a technical challenge due to the complexities of vertical localization and observation operator design. Therefore, it was left for future development. Within this constraint, the experiments employed a univariate DA approach in which SST observations were assimilated to update the complete 3D temperature structure of the model. In other words, temperature was the only variable included in the model state vector of the DA system. This variable has a central role in regulating key oceanic processes such as vertical mixing, stratification, and density-driven circulation. Thus, even when it is assimilated as a single variable, it can exert a wider influence on the modelled ocean state, indirectly improving other variables through the dynamical balances represented in SOMA.

In this context, the synthetic SST observations were extracted from the surface layer of SOMA’s Level 2 grid in the NR solution. A hierarchical random strategy based on a spatial sampling approach was implemented. In this approach, the horizontal domain was divided into a regular grid of subareas, each one defined by a fixed number of model grid cells forming a square. Within each subarea, a single observation point was randomly selected using the randint function from Python’s (version 3.12) random module, which generates a pseudo-random integer corresponding to a specific grid cell location. This procedure was applied iteratively across the entire layer, ensuring an irregular but homogeneously distributed set of observations. Consequently, the density of observations depends on the size of the subareas: smaller subareas result in a larger number of observation points, while larger subareas yield fewer but more sparsely distributed observations.
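The sampling strategy described above can be sketched as follows. This is a minimal illustration, not the code used in this work. The function name `sample_observation_points`, the grid dimensions, and the subarea sizes are hypothetical. A seeded `random.Random` instance is used so that the example is reproducible, whereas the text refers to the module-level `randint` function. The handling of land cells is not described in the text and is omitted here.

```python
import random


def sample_observation_points(n_rows, n_cols, block_size, seed=None):
    """Pick one random grid cell inside each square subarea of the domain.

    The domain of n_rows x n_cols cells is tiled with subareas of
    block_size x block_size cells (edge subareas may be smaller).
    Returns a list of (row, col) indices, one per subarea.
    """
    rng = random.Random(seed)
    points = []
    for row0 in range(0, n_rows, block_size):
        for col0 in range(0, n_cols, block_size):
            # randint is inclusive at both ends, so clip to the last valid index
            row_max = min(row0 + block_size, n_rows) - 1
            col_max = min(col0 + block_size, n_cols) - 1
            points.append((rng.randint(row0, row_max),
                           rng.randint(col0, col_max)))
    return points


# Hypothetical 100 x 120 cell surface layer with two observation densities.
dense = sample_observation_points(100, 120, block_size=10, seed=42)
sparse = sample_observation_points(100, 120, block_size=20, seed=42)
print(len(dense), len(sparse))  # 120 30
```

Halving the subarea side length quadruples the number of observation points, which is how the subarea size controls the observation density in the experiments.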

It is crucial to emphasize that for an OSSE system to be reliable, the errors associated with the synthetic observations should not be derived from the NR itself. Instead, they should realistically replicate the kinds of uncertainties present in actual observational systems [11]. This requires additional assumptions, as synthetic observations do not originate from physical instruments and thus lack the natural noise introduced by sensors, platforms, or environmental conditions. To guide the estimation of the errors and establish a realistic reference for comparison, two operational satellite-based products covering the SOMA region in 2023 were considered.

CMEMS SST NRT ODYSSEA L4 Product [67]: an operational monitoring product providing near-real-time (NRT) SST fields. Updated daily, it is optimized for operational forecasting and rapid environmental monitoring. This product has a finer spatial resolution of $0.02^{\circ}$ and is designed to capture short-term variability.
CMEMS SST Reprocessed L4 Product [68]: a historical, consistently reprocessed climate product that has been available since 1 January 1982. It offers high-quality SST fields produced with rigorous reanalysis methods and quality-control protocols. The product is intended for climate studies and long-term analyses, with a horizontal resolution of $0.05^{\circ}$.

Both satellite products share similar characteristics, providing SST data along with the corresponding estimated observation error standard deviation. Using more than one dataset of the same type ensures that the OSSE system is evaluated under consistent conditions, so that the experiments are expected to produce comparable responses rather than to behave differently depending on the chosen dataset. In addition to their role in defining the OSSE observation errors, these products also served as the observational sources for the reference OSE system. For this purpose, the extraction of data points was restricted to the spatial extent of the NR domain (SOMA Level 2). Within this area, observations were selected using the same pseudo-random sampling strategy previously described for the synthetic data, and for each point, the error value provided by the satellite product was directly applied in the assimilation process of the OSE experiments.

In the OSSE experiments, the procedure to define the observation errors began with identification of the base error value at each observation point from the corresponding satellite product used in the comparative OSE experiment. In other words, if the OSSE configuration was designed to be evaluated against an OSE using the NRT ODYSSEA dataset, then the base error at each synthetic observation point in the OSSE was initially taken from the same spatial location in that product. However, direct assignment of these values would result in an unrealistic equivalence of errors between the systems, contradicting the principle that synthetic observations must emulate, not replicate, real observational noise. To overcome this limitation and preserve the integrity of the OSSE system, a spatially correlated random perturbation field was applied to the base error values. This field was generated using a FORTRAN algorithm based on the methodology described in Appendix E of [40]. The output is a two-dimensional pseudo-random field with horizontal correlation that smoothly varies the values across the model grid to ensure that neighbouring points are not perturbed independently.
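As an illustration of what such a field looks like, the sketch below smooths white noise with a Gaussian filter in Fourier space. This is a stand-in technique, not the FORTRAN algorithm of [40]. The function names, the periodic boundaries, the use of the kernel width as a proxy for the decorrelation length, and the multiplicative form and `amplitude` parameter in `perturb_errors` are all assumptions.

```python
import numpy as np

def correlated_field(n_rows, n_cols, decorr_cells, seed=None):
    """Zero-mean, unit-variance random field with horizontal correlation.

    White noise is smoothed by a Gaussian kernel (standard deviation
    decorr_cells, in grid cells) applied in Fourier space, which implies
    periodic boundaries. Stand-in for the algorithm referenced in the text.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_rows, n_cols))
    # Wavenumbers in cycles per grid cell.
    ky = np.fft.fftfreq(n_rows)[:, None]
    kx = np.fft.fftfreq(n_cols)[None, :]
    # Fourier transform of the Gaussian smoothing kernel.
    kernel_hat = np.exp(-2.0 * (np.pi * decorr_cells) ** 2 * (kx**2 + ky**2))
    field = np.fft.ifft2(np.fft.fft2(noise) * kernel_hat).real
    field -= field.mean()
    return field / field.std()

def perturb_errors(base_error, field, amplitude=0.5):
    """Inflate base error values with a factor >= 1 (assumed functional form)."""
    return base_error * (1.0 + amplitude * np.abs(field))
```

Passing a different `seed` (or none) for each observation set gives each experiment its own spatial pattern of error.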

Following the methodology described above, each OSE and OSSE experiment consisted of the assimilation of a single set of surface observations into the FR. The experimental design involved a structured sequence of steps. First, a set of potential observation points was extracted from the NR grid using the pseudo-random spatial sampling procedure described earlier. From this initial set, a subset was randomly selected, also using the random module from Python, to define the number of observations to be assimilated. This reduction process was repeated multiple times using the same base set to generate different observation scenarios containing 10, 50, 100, 200, and 500 data points. Each observation in these sets was associated with an error estimate derived from the corresponding satellite product and perturbed using the spatially correlated random field previously described. Importantly, a different perturbation field was applied to each observation set, ensuring independent spatial patterns of error for each experiment. This is possible because the FORTRAN program used for generating these fields initializes the random number generator with a seed derived from the system clock, which ensures that a distinct perturbation is produced at each execution over the same model grid. This experimental design is summarized in Table 1.
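The reduction step can be sketched with Python's `random.sample`. The helper name `build_scenarios` is hypothetical, and drawing each scenario independently from the base set (rather than nesting the smaller sets inside the larger ones) is an assumption about how the repetition was done.

```python
import random

def build_scenarios(base_points, sizes=(10, 50, 100, 200, 500), seed=None):
    """Draw one random subset of the base observation set per scenario size."""
    rng = random.Random(seed)
    base = list(base_points)
    # Each scenario is sampled without replacement from the same base set.
    return {n: rng.sample(base, n) for n in sizes}
```

The base set must contain at least as many points as the largest scenario, otherwise `random.sample` raises a `ValueError`.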

6. Results and Discussion

This section presents the outcomes of the experiments conducted with the OSSE system configured for the SOMA model. The results are structured into two parts: the first subsection focuses on the analysis of FR integration, which served as the forecast model in the OSSE. It includes a comparison of its temperature fields with the NR, along with an assessment of variability among ensemble members and spatial correlation. The second subsection addresses the core objective of this work: the comparative evaluation of the OSSE and OSE experiments using the various observational scenarios described in the previous section.

6.1. SOMA Free Run Integration

The FR simulation was integrated continuously throughout the year 2023, following the configurations and conditions detailed in Section 5.2.1. Since SST is the common variable provided by both satellite products selected as observational sources in the reference OSE system, it was also adopted as the basis for assessing the performance of the FR against the NR. Accordingly, the annual evolution of the instantaneous spatial averages (as defined by Equation (A1)) and the corresponding RMSEs (Equation (A3)) were computed over the Level 2 domain for both integrations, and the results are presented in the chart in Figure 2.

As shown in Figure 2, the NR and FR exhibit similar seasonal patterns, but their magnitudes vary throughout the year. The RMSE values fluctuate, with higher errors occurring during three distinct periods: early in the year (January–February), in mid-summer (July–August), and late in the year, as the cold season begins. The largest discrepancies align with the transitions between warm and cold seasons, suggesting that the differences between NR and FR are more pronounced during these periods. Despite these differences, the FR solution consistently followed the same seasonal trend as the NR solution did, reflecting a coherent thermal cycle over the year. In certain periods, particularly during spring and early autumn, the SST fields of both solutions were notably close, likely due to the constraint of the shared boundary conditions used in both simulations. Even then, this alignment is seen as a positive result because it reinforces the realism of the FR, while it still represents a degraded solution.

A closer inspection of the NR SST curve reveals more pronounced discontinuities, visible as sharp jumps in temperature that are not present in the FR. These abrupt changes are a clear visual indication of the weekly restart simulations implemented in the NR, which force SOMA state variables directly from the updated data of the CMEMS Global solution product. This approach is inherited from the standard SOMA operational forecast framework. Since such updates were deliberately excluded from the FR solution to simulate an unconstrained forecast, these differences further highlight the structural divergence between the two integrations.

It can be inferred that the most significant discrepancies between NR and FR align with the moments of the weekly adjustments, highlighting their influence on the thermal evolution of the NR. This behaviour is particularly evident in the first months of 2023, where the NR solution exhibits a sharper decline in SST compared to the FR in late January and the beginning of February. The influence of the restart mechanism forces the solution to more closely follow the cooling patterns present in the CMEMS data. In contrast, the FR retains more thermal inertia and cools more gradually. At the end of the cold season, the FR lags behind the NR, which begins to recover earlier in response to the updated forcing fields. This example illustrates how the absence of regular external updates in the FR leads to smoother but slower thermal transitions, further supporting the appropriateness of the FR’s role as a realistically imperfect forecast scenario within the OSSE framework.

While it is reassuring that the FR solution remained stable and physically consistent, it is nevertheless desirable for it to diverge as much as possible from the NR solution. The larger the discrepancies between them, the greater the system’s sensitivity to the observation inputs. This increased sensitivity makes it easier to quantify the added value of the synthetic observations. With this in mind, a specific time frame was selected for conducting both the OSSE and OSE experiments. The selection was guided by the temporal evolution of the RMSE (red curve in Figure 2), which pointed to early February as one of the periods of maximum disagreement between NR and FR. At the same time, it was necessary to consider the availability and quality of satellite observations, which are often affected by cloud cover. Consequently, 3 February was identified as an optimal date, as it balanced a high RMSE with favourable observational conditions.

Before the observation experiments could begin, an initial assessment of the correlation structure among the ensemble variables in the FR was performed. Given that SST is the primary variable updated through the assimilation process in this system, it was important to examine how it correlates with other physical fields across the model domain. To this end, correlation coefficients were computed between the SST time series at a fixed surface grid cell and the time series of all other variables at each cell within the same horizontal layer. The time series were derived from the FR ensemble members; that is, there was one value for each 25 h period of 2023. Two reference points were selected for this analysis: one located along the southern coast (36.95° N, 7.73° W), an area influenced by the countercurrent system, and another on the western coast (37.29° N, 9.09° W), where coastal upwelling plays a dominant role. The resulting correlation maps, presented in Figure 3 and Figure 4, provide valuable insight into the spatial coherence and dynamical relationships represented by the ensemble.
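A correlation map of this kind can be computed along the following lines. The sketch assumes the ensemble is stored as an array of shape (members, ny, nx) for one horizontal layer. The function name `correlation_map` and the array layout are illustrative choices, not details given in the text.

```python
import numpy as np

def correlation_map(ref_series, field_series):
    """Pearson correlation between a reference series and every grid cell.

    ref_series   : (m,) SST values at the reference cell, one per ensemble member
    field_series : (m, ny, nx) values of another variable on one horizontal layer
    """
    ref = ref_series - ref_series.mean()
    anom = field_series - field_series.mean(axis=0)
    # Covariance between the reference series and each cell.
    cov = np.tensordot(ref, anom, axes=(0, 0)) / ref.size
    denom = ref.std() * anom.std(axis=0)
    # Cells with zero variance (e.g. masked land) are returned as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(denom > 0, cov / denom, np.nan)
```

Repeating the call with salinity or current speed at the surface and at 100 m would give the four panels of each figure. Looping over all vertical levels at the reference cell gives a vertical correlation profile.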

No correlation plots for SST and SSH are presented here, as the analysis did not reveal any significant relationship between these variables in the FR ensemble. Figure 3 depicts the correlation maps for the southern reference point. Panels 1.A and 2.A show the correlation between SST and salinity at the surface and at a depth of 100 m, respectively, while panels 1.B and 2.B depict the correlation between SST and horizontal currents at the same two depths. A clear negative correlation is observed between SST and salinity in the ensemble, and this becomes more pronounced with depth. This result indicates that, within the variability represented by the ensemble, colder waters at the surface of the south coast tend to be associated with higher salinity in this region, potentially reflecting the vertical mixing process simulated by the model. In contrast, the correlations between SST and current velocity are weakly positive near the surface and become even less significant at greater depths.

A similar pattern is observed in the maps shown in Figure 4, which correspond to the western reference point influenced by upwelling. Again, panels 1.A and 2.A show the correlation with salinity at the surface and 100 m, while 1.B and 2.B refer to the correlation with horizontal currents. As in the southern region, SST is negatively correlated with salinity in the ensemble, and this correlation intensifies with depth. The relationship with currents shows a limited positive correlation near the surface that diminishes with depth, indicating that there is only a modest association between thermal variability and current patterns within the ensemble in this area as well.

Given the consistent negative correlation patterns between SST and salinity observed in Figure 3 and Figure 4, a vertical profile analysis was conducted to further investigate this relationship throughout the water column. Using the same two reference locations, the correlation between the surface SST and salinity values was computed at increasing depths at each respective point. Hence, for each spot, the correlation was computed over time across all ensemble members and repeated for all vertical salinity levels in the model. The results (Figure 5) reveal a clear intensification of the negative correlation with depth. The red curve represents the profile for the southern coast reference point, while the blue curve corresponds to the western location. In both cases, the strength of the inverse relationship increases steadily with distance below the surface, confirming that the inverse relationship between temperature and salinity becomes more pronounced with depth in the ensemble.

The results presented in Figure 3, Figure 4 and Figure 5 suggest a negative correlation between SST and salinity within the FR ensemble that becomes particularly pronounced at greater depths. Given that the OSSE system is based on an EnOI data-assimilation scheme, such a statistical relationship could, in principle, be leveraged in a multivariate assimilation framework [42,69]. This would allow SST observations to contribute indirectly to salinity updates, potentially improving the model’s representation of water mass properties and thermohaline structure. Such an approach also carries practical implications for the design of observing systems, as exploiting multivariate updates from SST could lessen the reliance on dedicated salinity measurements and thereby enable the deployment of lower-cost observing technologies and more extensive large-scale monitoring networks.

Although the ensemble reveals a statistical relationship between SST and salinity, the physical basis of this negative correlation remains uncertain. Specifically, in the southern region of the domain, a positive correlation between SST and salinity was initially anticipated, considering the influence of easterly winds that occur offshore of the south coast. These winds transport warm, saline Mediterranean waters toward the Algarve shelf and often alternate with westerly winds along the coastline (Section 2). The dynamics of these alternating wind regimes and their impact on surface and subsurface water properties may not be fully captured by the FR ensemble, especially if its sampling fails to represent the full range of seasonal variability. For this reason, and to preserve methodological clarity, the current implementation of the OSSE system was limited to temperature updates only, leaving multivariate approaches for future development once the underlying mechanisms are better understood.

6.2. SOMA OSSE System Assessment

The performance and internal structure of the FR having been established, this section presents how it responded to the updates derived from the synthetic observations generated in the NR, as well as a comparative evaluation of the OSSE experiments and their corresponding OSE configurations. These analyses aimed to quantify the behaviour of the OSSE system to ensure that, when applied to assess new observation strategies, it would not systematically overestimate or underestimate their value.

Regarding the pseudo-random perturbation fields introduced in Section 5.2.3, a horizontal decorrelation length had to be defined to determine the spatial scale over which values in a generated field remained correlated. This distance reflected the approximate range at which SST values in the L3 satellite products appear spatially coherent prior to their processing into the corresponding L4 gridded datasets. For this reason, a value of 25 km was selected, corresponding to 25 cells in the 1 km NR grid.

Since 3 February 2023 was selected as the date for the updates, each experiment was subsequently integrated for an additional seven days to evaluate the temporal evolution of the forecast after the assimilation step. Figure 6 presents the results of this post-assimilation period for the experiments based on the NRT ODYSSEA product. In the left-hand panel, the daily evolution of the SST RMSE is shown for each OSE experiment relative to the satellite observations, with the black curve representing the baseline FR error. The remaining curves correspond to OSEs using an increasing number of assimilated observation points, from 10 in OSE1 to 500 in OSE5, as listed in Table 1. The right-hand panel displays the corresponding SST RMSE for the OSSE experiments, which assimilated synthetic observations and were evaluated against the NR. As expected, the experiments using the Reprocessed product behaved very similarly to those that used the NRT ODYSSEA data. Therefore, the RMSE results from the Reprocessed-based experiments are presented in the Supplementary Material (Figure S1).


A detailed analysis of Figure 6 provides important insights into the OSSE system’s behaviour following the updates. Most likely due to the random multiplicative factor that increases the observation errors, the RMSE of the FR with respect to the NR is consistently higher than the RMSE relative to the satellite observations. However, the emphasis here is not on this absolute difference, but on the relative consistency across experiments. Both OSE and OSSE configurations exhibit a clear, monotonic reduction in RMSE as the number of assimilated observations increases. The temporal evolution also shows that, as expected, assimilating a large number of observations produces a gradual increase in RMSE in the days following the update, whereas assimilating only a few points tends to increase the RMSE at analysis time, with the resulting dynamical inconsistencies causing more harm than benefit.

Notably, the OSSE experiments closely mirror the behaviours of their corresponding OSE counterparts over time, despite relying solely on synthetic data. For each observation count, the RMSE curves of the OSSEs follow trends that are closely aligned with those of the OSEs, indicating that the synthetic updates generate corrections in the model state that are dynamically comparable to those achieved through the assimilation of real satellite observations. This agreement is particularly clear when comparing experiments configured with the same number of assimilated observations (e.g., OSE3 versus OSSE3). Comparable consistency was also found when repeating the experiments with the Reprocessed satellite product, reinforcing that the OSSE system responds similarly when assimilating equivalent datasets. Had the results from the Reprocessed product diverged substantially from those using the NRT ODYSSEA data, it could have indicated shortcomings in the system’s design and the need for further refinement.

In addition to the temporal RMSE analysis, Taylor diagrams were used to evaluate the similarity between the forecast fields and the reference data. These were generated for two instants: the day of the EnOI update step (3 February) and seven days afterward (10 February). Figure 7 presents the results for the experiments based on the NRT ODYSSEA satellite product. In line with previous RMSE-based assessment, the overall patterns in the diagrams are consistent. The most relevant aspect lies not in the precise values, but in the comparative behaviour of OSSEs relative to OSEs. The satellite data (SAT_NRT in the figure) and the NR serve as reference anchors, with SAT_NRT exhibiting a higher standard deviation, reflecting its origin as a gridded observational product. As observations are progressively assimilated into the FR, the forecast state converges toward the reference: the OSE experiments (red markers) move closer to SAT_NRT, while the OSSE experiments (yellow markers) approach the NR. Although the OSSEs show slightly higher root mean square differences (RMSD) than the OSEs, consistent with the RMSE trends, both follow remarkably similar trajectories in terms of correlation and RMSD reduction. This convergence was also evident in the experiments using the Reprocessed satellite product.

Finally, an additional metric was used to further assess the system’s behaviour: the Murphy Skill Score (MSS), as defined in Equation (A7). For the OSE experiments, the NR values in the equation were replaced by the corresponding satellite product, which served as the observational reference. In the OSSE experiments, the NR values remained the reference dataset, reflecting the synthetic nature of these experiments. The results for the experiments based on the NRT ODYSSEA satellite product are shown in Figure 8. Each panel illustrates how the MSS varies with the number of assimilated observation points, with the left panel representing the moment immediately after the EnOI update (3 February) and the right panel representing the results from the sampling seven days later (10 February).
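For reference, the MSS of Equation (A7) can be evaluated with a few lines of NumPy. The function below is a sketch with hypothetical names. The `reference` argument stands for the NR in the OSSE case or the satellite product in the OSE case, and both fields are assumed to be on the same grid with non-finite values marking land or missing data.

```python
import numpy as np

def murphy_skill_score(forecast, reference):
    """MSS = rho^2 - (rho - s_f/s_r)^2 - ((mean_f - mean_r)/s_r)^2."""
    f = np.asarray(forecast, dtype=float).ravel()
    r = np.asarray(reference, dtype=float).ravel()
    # Keep only cells where both fields are defined.
    valid = np.isfinite(f) & np.isfinite(r)
    f, r = f[valid], r[valid]
    rho = np.corrcoef(f, r)[0, 1]
    return (rho**2
            - (rho - f.std() / r.std()) ** 2
            - ((f.mean() - r.mean()) / r.std()) ** 2)
```

A perfect forecast gives a score of 1. Algebraically, the three terms combine to one minus the mean square error normalized by the reference variance, so negative values indicate a forecast worse than the reference mean.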

As expected, the skill score generally increases as more observations are assimilated, indicating enhanced forecast accuracy. More importantly, the OSSE curves follow the same upward trend as their OSE counterparts, reaffirming the system’s consistency. The analysis MSS increases even when a large number of data points is assimilated, whereas the forecast MSS is less sensitive to the number of observations, possibly being hampered by the quality of other inputs, such as lateral and surface boundary conditions. No specific benchmark value for the MSS was sought; rather, the objective was to confirm that OSSE experiments respond similarly to OSEs in terms of performance improvement. Overall, the results demonstrate that the OSSE system developed in this study is internally consistent, sensitive to observation density, and capable of realistically reproducing the correction patterns expected from real-world assimilation. This provides strong evidence of its suitability for assessing the potential impact of future observational strategies, such as those envisioned under the NAUTILOS initiative.

7. Conclusions

Observing System Simulation Experiments (OSSEs) provide a practical framework for assessing the possible influence of new observation systems on model forecasts before their actual deployment. Within this context, the main objective of this study was to design and validate an OSSE system tailored to the operational model of the Algarve coast, SOMA. The system was developed using the fraternal twins approach, in which two configurations of the same model, MOHID, were used to simulate the Nature Run (NR) and the Forecast Model (FM).

In the NR, SOMA’s original operational configuration was used. In contrast, the FM was modified to a lower-resolution grid and did not include weekly adjustments from outer conditions. For this reason, the FM was also referred to as the Free Run (FR) in the proposed OSSE system. Data assimilation was carried out using the EnOI scheme, which was chosen for its balance between computational efficiency and representativeness of ensemble-based statistics. At this stage, assimilation was limited to surface observations, both due to the current configuration of the DA system, which still requires further adaptation to support 3D assimilation with SOMA, and to ensure consistency with the operational satellite-based products used in the Observing System Experiments (OSEs).

As recommended in the literature, an OSSE system should be evaluated against a comparable OSE system in order to ensure that the former does not produce unrealistic results when assessing new observations. In this study, such a comparison was made using two distinct L4 satellite products for SST from CMEMS: the NRT ODYSSEA product and the Reprocessed dataset. These products served as references for the OSE experiments, while the corresponding OSSEs assimilated synthetic SST observations derived from the NR. The results confirmed that the OSSE system responded consistently across both satellite references, as demonstrated by the RMSE evolution over time, Taylor diagrams, and Murphy Skill Score; these results support its robustness and reliability for future impact assessments.

A key aspect of the methodology is the emulation of realistic observational errors in the synthetic data, which ensures that the assessment of new data streams reflects the uncertainties of real-world applications. This was achieved by generating pseudo-random fields, using a spatial correlation factor across the NR grid, and scaling by satellite-derived error data. The result is a flexible framework that preserves the statistical structure of real observation errors and can be used to test different error magnitudes in the future. Although the experiments presented were tailored to SOMA and the Algarve region, the methodological framework is not tied to SOMA specifically. The generation of synthetic observations and their associated error fields allows for straightforward transfer to other coastal models and regional contexts, broadening the framework’s applicability and making it useful as a starting point for evaluating emerging observing systems beyond that presented in this case study.

The results presented here indicate that the SOMA OSSE system is consistent, sensitive to the number of observations used in the update step, and capable of realistically emulating synthetic observation errors. As such, it provides a solid foundation for future studies aimed at assessing the impact of emerging observation strategies in the Algarve region, including those being developed under the NAUTILOS project. Beyond its immediate application, the system also paves the way to broader developments, such as more in-depth research into the correlation between the SST and salinity state variables of the model, which could enhance the system’s robustness by enabling more advanced multivariate assimilation strategies.

This work also constitutes a very significant milestone in SOMA’s operational forecasts: the adaptation and integration of a DA system to improve the model state variables, including real satellite observations. However, the restriction to surface-only assimilation remains a critical limitation of the current system. Future developments should prioritize the integration of the EnOI scheme with real-time forecasts and the expansion of its scope to encompass additional observation types, such as satellite altimetry and 3D in situ fields like those obtained by autonomous platforms, including the AUV missions already operational in the region [56]. In addition, a systematic sensitivity analysis of the weighting coefficient $\alpha$ will be an essential step in optimizing the balance between background error structures and observational impact, ensuring a more robust assimilation framework. In this sense, the present study should be seen as a first step in the development of a more comprehensive DA framework that will culminate in a significant enhancement of the operational monitoring capacity along the Algarve coast.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jmse13091830/s1, Figure S1: RMSE-Reprocessed_Based.

Author Contributions

Conceptualization, F.M. (Fernando Mendonça) and F.M. (Flávio Martins); methodology, F.M. (Fernando Mendonça), F.M. (Flávio Martins) and L.B.; software, F.M. (Fernando Mendonça) and L.B.; validation, F.M. (Flávio Martins) and L.B.; formal analysis, F.M. (Fernando Mendonça), F.M. (Flávio Martins) and L.B.; investigation, F.M. (Fernando Mendonça); data curation, F.M. (Fernando Mendonça); writing—original draft preparation, F.M. (Fernando Mendonça); writing—review and editing, F.M. (Flávio Martins), L.B. and F.M. (Fernando Mendonça); visualization, F.M. (Fernando Mendonça); supervision, F.M. (Flávio Martins) and L.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Portuguese Foundation of Science and Technology (FCT) to CIMA [grant number UID/00350/2020 CIMA]; ARNET [grant number LA/P/0069/2020]; H. Europe NAUTILOS project [grant number 101000825]; European Research Executive Agency THETIDA project [grant number 101095253]. Fernando Mendonça is a PhD student funded through the FCT Research Scholarships Programme [DOI reference https://doi.org/10.54499/UI/BD/153357/2022].

Data Availability Statement

The operational data from the Algarve modelling system (SOMA) are openly available at https://doi.org/10.34623/tx0z-bb23.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Statistical Metrics

This appendix details the core metrics used to evaluate the Forecast Model (FM) against the Nature Run (NR). The same equations are applied to assess (1) the FM against actual observation data, (2) OSSE experiments against the NR, and (3) OSE experiments against observational data.

Appendix A.1. Spatially Integrated Averages and Standard Deviations

A straightforward comparison between the models can be accomplished by calculating the instantaneous spatial averages (AVG) and standard deviations (STD) of the NR and FM. The resulting data can then be utilized to construct a time series of model evolution. The parameters are defined as follows:

$$\mathrm{AVG}(t_{i}) = \frac{\sum_{\mathrm{domain}} \theta(t_{i})}{N}, \tag{A1}$$

$$\mathrm{STD}(t_{i}) = \sqrt{\frac{\sum_{\mathrm{domain}} \left(\theta(t_{i}) - \mathrm{AVG}(t_{i})\right)^{2}}{N}}. \tag{A2}$$

In the previous equations, $\theta$ is the analyzed variable at time $t_{i}$, and $N$ is the number of grid cells. Values can represent instantaneous outputs or averages within a time window centered on $t_{i}$, with both the window length and sampling frequency (e.g., daily or weekly) depending on the process and region. Spatial sums may cover the full 3D field or a chosen 2D layer, such as the surface. The time series obtained from AVG (Equation (A1)) and STD (Equation (A2)) offer a straightforward means of comparing the FM against the NR through time. The same diagnostics can be extended to observational datasets (e.g., L4 SST or SSH data products), ensuring that the model’s statistical behaviour is consistent with real-world conditions.

Appendix A.2. Spatially Integrated RMSE

The time series of the instantaneous spatially integrated root mean square error (RMSE) is an effective preliminary indicator of the forecast evolution in relation to the NR and observations. This parameter is defined by the following equation:

$$\mathrm{RMSE}(t_{i}) = \sqrt{\frac{\sum_{\mathrm{domain}} \left(\theta_{FM}(t_{i}) - \theta_{NR}(t_{i})\right)^{2}}{N}}. \tag{A3}$$

In the last equation, $\theta_{NR}$ and $\theta_{FM}$ represent the variables from the Nature Run and the Forecast Model, respectively, at time $t_{i}$, while $N$ is the number of grid cells considered. As with Equations (A1) and (A2), $\theta$ may correspond either to instantaneous outputs or to averages within a time interval centred on $t_{i}$, with the choice of interval and sampling frequency (e.g., daily or weekly) depending on the process and region under study. The spatial integration can also be carried out over the full 3D domain or restricted to a 2D layer. Since the NR often employs a finer spatial resolution than the FM does to reduce the risk of the identical twins problem, it is standard practice to average the NR fields onto the FM grid prior to computing RMSE.
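The spatially integrated diagnostics of Equations (A1)–(A3), together with the averaging of the NR onto the FM grid, can be sketched as follows. The function names are hypothetical. The block averaging assumes that the FM cell size is an integer multiple of the NR cell size and that the two grids are aligned, which may not hold for a given configuration. NaN values are assumed to mark land cells.

```python
import numpy as np

def spatial_avg_std(field):
    """Equations (A1) and (A2): spatial mean and standard deviation, ignoring NaN."""
    return np.nanmean(field), np.nanstd(field)

def block_average(fine, factor):
    """Average a fine 2D field onto a grid that is `factor` times coarser."""
    ny, nx = fine.shape
    ny_c, nx_c = ny // factor, nx // factor
    # Drop trailing rows/columns that do not fill a complete coarse cell.
    trimmed = fine[: ny_c * factor, : nx_c * factor]
    return trimmed.reshape(ny_c, factor, nx_c, factor).mean(axis=(1, 3))

def spatial_rmse(theta_fm, theta_nr):
    """Equation (A3): RMSE over all cells where both fields are defined."""
    return np.sqrt(np.nanmean((theta_fm - theta_nr) ** 2))
```

Calling `spatial_rmse(fm_sst, block_average(nr_sst, factor))` at each output time yields an RMSE time series. A coarse cell containing any NaN in the fine field becomes NaN and is then skipped by `np.nanmean`.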

Appendix A.3. Temporally Integrated Averages and Standard Deviations

The previous indicators provide a global overview of the temporal evolution of the solutions, yet they do not allow a spatial evaluation. To address this limitation, temporally integrated indicators for each grid point can be employed. The temporally integrated average is computed as follows:

$$\mathrm{AVG}(i,j,k) = \frac{\sum_{t_i=1}^{T} \theta(i,j,k,t_i)}{T}, \tag{A4}$$

where $\theta$ denotes the analysed variable, either from the Forecast Model or the NR, evaluated at grid point $(i,j,k)$ and time $t_i$, while $T$ is the number of time steps included. The values of $\theta$ may correspond to instantaneous outputs or to averages over a time interval centred on $t_i$. Using the same approach, the standard deviation can be derived according to Equation (A5):

$$\mathrm{STD}(i,j,k) = \sqrt{\frac{\sum_{t_i=1}^{T} \left(\theta(i,j,k,t_i) - \mathrm{AVG}(i,j,k)\right)^{2}}{T}}. \tag{A5}$$

Horizontal and vertical maps can be plotted from the indicators in Equations (A4) and (A5), which provide a general idea of the spatial behaviour of the model statistics.
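A minimal sketch of the temporally integrated average and standard deviation is given below. It assumes the fields are stacked in an array whose first axis is time and divides by the number of time steps $T$; the function name and array layout are illustrative.

```python
import numpy as np

def temporal_avg_std(theta):
    """Temporally integrated AVG and STD at every grid point.

    theta: array of shape (T, ...) where T is the number of time steps.
    Returns two arrays with the spatial shape of a single time step.
    """
    theta = np.asarray(theta, dtype=float)
    avg = theta.mean(axis=0)                             # average over time
    std = np.sqrt(((theta - avg) ** 2).mean(axis=0))     # population STD over time
    return avg, std
```

Each returned array can be plotted directly as a horizontal map (one layer) or sliced along a section for a vertical map.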

Appendix A.4. Temporally Integrated RMSE

The temporally integrated RMSE between the forecast and the reference NR can be calculated using the same logic employed for the previous indicators. This allows the plotting of horizontal maps of each layer or vertical cuts, which provide a general overview of the spatial distribution of forecast deviations relative to the NR. It is computed by the following:

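$$\mathrm{RMSE}(i,j,k) = \sqrt{\frac{\sum_{t_i=1}^{T} \left(\theta_{FM}(i,j,k,t_i) - \theta_{NR}(i,j,k,t_i)\right)^{2}}{T}}, \tag{A6}$$

where $T$ is the number of time steps included, so that one RMSE value is obtained for each grid point $(i,j,k)$.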
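The per-grid-point RMSE map can be sketched in a few lines. The sketch assumes that the FM and NR fields are already on the same grid and stacked along a leading time axis; the helper that reports the grid point with the largest error is an illustrative addition, not something described in the text.

```python
import numpy as np

def temporal_rmse(theta_fm, theta_nr):
    """RMSE over time at every grid point; inputs have shape (T, ...)."""
    diff = np.asarray(theta_fm, dtype=float) - np.asarray(theta_nr, dtype=float)
    return np.sqrt(np.mean(diff ** 2, axis=0))

def worst_grid_point(rmse_map):
    """Index of the grid point with the largest RMSE, ignoring NaN cells."""
    return np.unravel_index(np.nanargmax(rmse_map), rmse_map.shape)
```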

Appendix A.5. Murphy Skill Score

The metrics described in the preceding subsections provide a general characterization of the NR and FM and a simple way to compare them. However, the literature also recommends the use of more sophisticated statistical metrics, such as the Murphy Skill Score (MSS):

$$\mathrm{MSS} = \rho^{2} - \left[\rho - \frac{\sigma_{FM}}{\sigma_{NR}}\right]^{2} - \left[\frac{\bar{\theta}_{FM} - \bar{\theta}_{NR}}{\sigma_{NR}}\right]^{2}. \tag{A7}$$

The MSS can be evaluated either over the full grid or individually at each grid cell for a selected time period, thereby yielding a spatial distribution of predictive performance. In Equation (A7), $\rho$ is the correlation between the NR and FM, $\sigma$ is the standard deviation, and $\bar{\theta}$ represents the average of the variable analysed. Following [70], MSS values vary between $-\infty$ and 1: a score of 1 corresponds to a perfect forecast, 0 indicates no improvement relative to the reference, and negative values denote performance worse than that of the reference. Intermediate values ($0 < \mathrm{MSS} < 1$) indicate increasing levels of forecast improvement.
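Equation (A7) can be sketched as below, assuming the FM and NR values have been collocated and flattened into arrays of equal length and that population standard deviations are used. With those conventions the three-term form is algebraically equal to one minus the mean square error divided by the NR variance, which provides a convenient consistency check. The function name is illustrative.

```python
import numpy as np

def murphy_skill_score(fm, nr):
    """Murphy Skill Score of the forecast (fm) against the reference (nr)."""
    fm = np.ravel(np.asarray(fm, dtype=float))
    nr = np.ravel(np.asarray(nr, dtype=float))
    rho = np.corrcoef(fm, nr)[0, 1]          # correlation between FM and NR
    sigma_fm, sigma_nr = fm.std(), nr.std()  # population standard deviations
    bias = fm.mean() - nr.mean()
    return rho ** 2 - (rho - sigma_fm / sigma_nr) ** 2 - (bias / sigma_nr) ** 2
```

The three terms separate the contributions of phase agreement (correlation), amplitude mismatch, and mean bias, so printing them individually shows which one limits the score.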

Appendix A.6. Taylor Diagrams

The Taylor Diagram [71] is a graphical tool developed to summarize the performance of models by combining three statistical metrics—correlation, root mean square difference (RMSD), and standard deviation—into a single and interpretable plot. It provides a compact way to visualize and compare the agreement between the forecast output and the reference NR, where each comparison (or each OSSE) is represented as a point on a polar coordinate system. The azimuthal position of each point reflects the correlation between FM and NR, while the radial distance from the origin represents the pattern’s standard deviation, and the distance from the reference point shows the RMSD. Thus, the diagram offers a rapid method of evaluating the accuracy of the forecast model.

The distinction between RMSE and RMSD lies in their respective focuses. RMSE is a broader measure of overall accuracy, capturing both the mean difference (bias) and variability between the forecast and NR. In contrast, RMSD removes the bias first, allowing it to emphasize discrepancies in the shape and phase of the spatial or temporal structure and variability of the dataset. Thus, it is defined by the following equation:

$$\mathrm{RMSD} = \sqrt{\frac{\sum_{i=1}^{N} \left(\left(\theta_{FM,i} - \bar{\theta}_{FM}\right) - \left(\theta_{NR,i} - \bar{\theta}_{NR}\right)\right)^{2}}{N}}, \tag{A8}$$

where $\theta$ is the analysed variable in FM and NR; $\bar{\theta}$ is the average of that variable; and $N$ is the number of grid cells used. In general, the closer the forecast result is to the NR, the better the model performance.
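The three quantities plotted in a Taylor diagram can be computed as in the sketch below, which assumes collocated, flattened FM and NR arrays and population statistics. The diagram's geometry rests on the law-of-cosines relation $\mathrm{RMSD}^{2} = \sigma_{FM}^{2} + \sigma_{NR}^{2} - 2\sigma_{FM}\sigma_{NR}\rho$ [71], which also serves as a check on the computation. The function name is illustrative.

```python
import numpy as np

def taylor_statistics(fm, nr):
    """Return (sigma_fm, sigma_nr, rho, rmsd) for a Taylor diagram."""
    fm = np.ravel(np.asarray(fm, dtype=float))
    nr = np.ravel(np.asarray(nr, dtype=float))
    fm_anom = fm - fm.mean()                 # bias removed from the forecast
    nr_anom = nr - nr.mean()                 # bias removed from the reference
    sigma_fm = np.sqrt(np.mean(fm_anom ** 2))
    sigma_nr = np.sqrt(np.mean(nr_anom ** 2))
    rho = np.mean(fm_anom * nr_anom) / (sigma_fm * sigma_nr)
    rmsd = np.sqrt(np.mean((fm_anom - nr_anom) ** 2))   # Equation (A8)
    return sigma_fm, sigma_nr, rho, rmsd
```

Because the means are removed first, adding a constant offset to the forecast leaves all four values unchanged, which is exactly the difference between RMSD and RMSE described above.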

Appendix B. EnOI General Formulation

The general formulation of the Ensemble Optimal Interpolation (EnOI) can be introduced through the standard analysis equation of a Data Assimilation (DA) problem, expressed in Equations (A9) and (A10), where the time index is omitted for clarity:

$$x^{a} = x^{b} + K\left(y - Hx^{b}\right), \tag{A9}$$

$$K = PH^{T}\left[HPH^{T} + R\right]^{-1}. \tag{A10}$$

Equation (A9) gives the updated model state (the analysis), $x^{a}$. The current state of the model, or the background, is denoted by $x^{b}$; both the background and the analysis are $m$-dimensional real vectors, i.e., $x^{a}, x^{b} \in \mathbb{R}^{m}$. The state vectors typically contain the values at all grid cells of all variables being updated in the assimilation process, such as velocity components, temperature, salinity, or SSH, although the specific variables included depend on the configuration and objectives of the system. The observation vector is given by $y \in \mathbb{R}^{d}$, and $H \in \mathbb{R}^{d \times m}$ is a matrix that serves as an operator mapping the background vector to the observation space.

In Equation (A10), the superscript $T$ denotes the matrix transpose, $K \in \mathbb{R}^{m \times d}$ is the matrix of weights, akin to the Kalman gain in the Ensemble Kalman Filter (EnKF), $P \in \mathbb{R}^{m \times m}$ is the background error covariance matrix, and $R \in \mathbb{R}^{d \times d}$ is the observation error covariance matrix. It follows from the two equations that when the uncertainty associated with the model is much smaller than that associated with the observations (i.e., $P \ll R$), the weights approach zero and the analysis in Equation (A9) is practically equal to the background. Conversely, if the observations are more reliable than the model (i.e., $R \ll P$), then more weight is given to the observations.
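The behaviour of Equations (A9) and (A10) in these two limits can be checked with a small sketch. The two-element state, the covariance values, and the function name below are made up for illustration; only the first state element is observed, and the second is updated solely through its background covariance with the first.

```python
import numpy as np

def analysis_update(xb, P, H, R, y):
    """Return the analysis x^a and the gain K of Equations (A9) and (A10)."""
    S = H @ P @ H.T + R                      # innovation covariance, shape (d, d)
    # K = P H^T S^{-1}; since P and S are symmetric, K^T solves S K^T = H P.
    K = np.linalg.solve(S, H @ P).T
    xa = xb + K @ (y - H @ xb)
    return xa, K

xb = np.array([10.0, 12.0])                  # background (hypothetical values)
P = np.array([[1.0, 0.5], [0.5, 1.0]])       # background error covariance
H = np.array([[1.0, 0.0]])                   # observe the first element only
y = np.array([11.0])                         # one observation

xa, K = analysis_update(xb, P, H, np.array([[1.0]]), y)
```

With equal background and observation variances the observed element moves halfway to the observation, and the unobserved element receives half of that increment because its covariance with the observed one is 0.5. Increasing $R$ by several orders of magnitude leaves the background essentially unchanged, while decreasing it pulls the observed element onto the observation.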

Solving the analysis equation requires some definitions. Firstly, the matrix holding the static ensemble members of the background is given by $E$ in Equation (A11):

$$E = (x_{1}, x_{2}, \dots, x_{N}) \in \mathbb{R}^{m \times N}, \tag{A11}$$

where $N$ is now the number of ensemble members (not the number of grid cells used in Appendix A), each one an $m$-dimensional state vector. With that, the ensemble anomaly ($X'$) is defined as follows:

$$X' = E - \bar{E}, \tag{A12}$$

where $\bar{E}$ is the ensemble mean, repeated in each of its $N$ columns. Then, the ensemble covariance matrix $P$ can be defined as

$$P = \frac{X'X'^{T}}{N - 1}. \tag{A13}$$
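Equations (A11) to (A13) translate directly into array operations, as in the sketch below, where the ensemble is stored with one member per column. The function name is illustrative. The result agrees with NumPy's own sample covariance, which also normalises by $N - 1$.

```python
import numpy as np

def ensemble_covariance(E):
    """Ensemble anomalies X' and covariance P from an (m, N) ensemble matrix."""
    E = np.asarray(E, dtype=float)
    n_members = E.shape[1]
    anomalies = E - E.mean(axis=1, keepdims=True)    # X' = E - E_bar
    P = anomalies @ anomalies.T / (n_members - 1)    # (m, m) covariance
    return anomalies, P
```

Forming $P$ explicitly is only practical for small $m$; for a full model state the products $HX'$ and $X'(HX')^{T}$ are usually evaluated instead, so that no $m \times m$ matrix has to be stored.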

Moving to the measurement vector $y$ in Equation (A9), an ensemble of perturbed observations must be introduced into the system to ensure that the ensemble analysis properly accounts for the measurement uncertainty and maintains consistency with the computation of the Kalman gain. The perturbation is defined by Equation (A14):

$$y_{i} = y + \epsilon_{i}, \quad 1 \leq i \leq N, \tag{A14}$$

where $\epsilon_{i}$ is the simulated random measurement error, drawn from a distribution with zero mean and a covariance that matches the observation error covariance matrix $R$ in Equation (A10). In this way, the perturbed observations can be stored in the following observation matrix:

$$Y = (y_{1}, y_{2}, \dots, y_{N}) \in \mathbb{R}^{d \times N}. \tag{A15}$$
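A minimal sketch of Equations (A14) and (A15) follows. The text only requires zero-mean errors with covariance $R$; drawing them from a Gaussian distribution is an assumption made here, as are the function name and the numerical values in the usage lines.

```python
import numpy as np

def perturb_observations(y, R, n_members, rng):
    """Build the (d, N) matrix Y of perturbed observations.

    Errors are drawn from a zero-mean Gaussian with covariance R
    (the Gaussian shape is an assumption of this sketch).
    """
    y = np.asarray(y, dtype=float)
    eps = rng.multivariate_normal(np.zeros(y.size), R, size=n_members)  # (N, d)
    return y[:, None] + eps.T                                           # (d, N)

rng = np.random.default_rng(42)              # fixed seed for reproducibility
R = np.array([[0.04, 0.0], [0.0, 0.09]])     # hypothetical error covariance
Y = perturb_observations([15.0, 16.0], R, 20000, rng)
```

For a large number of members the sample covariance of the columns of `Y` approaches $R$ and their mean approaches $y$, which is the consistency requirement stated above.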

With the previous statements, it is possible to replace the error covariance matrices in the analysis $x^{a}$ and in $K$ by their ensemble representation to obtain the ensemble of updated states ($E^{a}$):

$$E^{a} = E + X'X'^{T}H^{T}\left[HX'X'^{T}H^{T} + (N - 1)R\right]^{-1}(Y - HE). \tag{A16}$$
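The ensemble update in Equation (A16) can be sketched as follows for a small problem. The sketch applies the equation literally, without the localisation or numerical safeguards that an operational implementation would need, and the function name is illustrative.

```python
import numpy as np

def enkf_analysis(E, H, R, Y):
    """Update every ensemble member following Equation (A16).

    E: (m, N) background ensemble, H: (d, m), R: (d, d),
    Y: (d, N) perturbed observations.
    """
    n_members = E.shape[1]
    A = E - E.mean(axis=1, keepdims=True)        # ensemble anomalies X'
    HA = H @ A                                   # anomalies in observation space
    S = HA @ HA.T + (n_members - 1) * R          # bracketed term of (A16)
    return E + A @ HA.T @ np.linalg.solve(S, Y - H @ E)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly; only a $d \times d$ system is involved, so the cost is governed by the number of observations rather than by the state dimension.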

Equation (A16) shows the general formulation of the EnKF analysis scheme, which requires updating each member of the ensemble and integrating the model $N$ times to calculate the error covariance matrix $P$. In the EnOI, however, the analysis is computed for a single model state, so Equation (A16) can be rewritten as follows:

$$x^{a} = x^{b} + \alpha X'X'^{T}H^{T}\left[\alpha HX'X'^{T}H^{T} + (N - 1)R\right]^{-1}\left(y - Hx^{b}\right). \tag{A17}$$

In Equation (A17), the parameter $\alpha \in (0, 1]$ is introduced to tune the weight given to the ensemble background error covariance relative to the observation error covariance [40]. Since EnOI employs a stationary ensemble to approximate background error statistics, the background variance can sometimes be overestimated due to the long-term sampling of model states. The scaling factor $\alpha$ mitigates this issue by reducing the ensemble-based covariance contribution to a level that better represents the actual forecast uncertainty at the time of assimilation. When $\alpha$ approaches zero, the effective observation error variance is inflated, leading the analysis to converge toward the background state ($x^{a} \approx x^{b}$). Conversely, as it approaches one, the ensemble covariance is fully retained, allowing the observations to exert a stronger influence on the analysis update.
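The role of $\alpha$ in Equation (A17) can be illustrated with the sketch below. The static anomalies are random numbers standing in for snapshots of a long model run, and the state, observation, and error values are hypothetical; the function name is illustrative, and no localisation is applied.

```python
import numpy as np

def enoi_analysis(xb, anomalies, H, R, y, alpha=1.0):
    """Single-state EnOI update following Equation (A17).

    anomalies: (m, N) static ensemble anomalies X'; alpha in (0, 1].
    """
    n_members = anomalies.shape[1]
    HA = H @ anomalies                                   # H X'
    S = alpha * (HA @ HA.T) + (n_members - 1) * R        # bracketed term of (A17)
    return xb + alpha * anomalies @ HA.T @ np.linalg.solve(S, y - H @ xb)

rng = np.random.default_rng(2)
members = rng.normal(size=(3, 20))                       # stand-in static ensemble
A = members - members.mean(axis=1, keepdims=True)
H = np.array([[1.0, 0.0, 0.0]])                          # observe the first element
R = np.array([[0.25]])
xb = np.array([14.0, 13.5, 13.0])
y = np.array([15.0])

xa_small = enoi_analysis(xb, A, H, R, y, alpha=1e-9)     # practically the background
xa_mid = enoi_analysis(xb, A, H, R, y, alpha=0.3)
xa_full = enoi_analysis(xb, A, H, R, y, alpha=1.0)       # strongest correction
```

Comparing the three analyses shows the behaviour described above: the misfit to the observation shrinks monotonically as $\alpha$ grows from near zero to one.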

References

1. Versteeg, H.; Malalasekera, W. An Introduction to Computational Fluid Dynamics, 2nd ed.; Prentice Hall: Philadelphia, PA, USA, 2007.
2. Griffies, S.M. Some Ocean Model Fundamentals. In Ocean Weather Forecasting: An Integrated View of Oceanography; Springer: Dordrecht, The Netherlands, 2006; pp. 19–73.
3. Le Traon, P.Y. Satellites and Operational Oceanography. In Operational Oceanography in the 21st Century; Springer: Dordrecht, The Netherlands, 2011; pp. 29–54.
4. Frederikse, T.; Landerer, F.; Caron, L.; Adhikari, S.; Parkes, D.; Humphrey, V.W.; Dangendorf, S.; Hogarth, P.; Zanna, L.; Cheng, L.; et al. The causes of sea-level rise since 1900. Nature 2020, 584, 393–397.
5. Gómez-Enri, J.; Aldarias, A.; Mulero-Martínez, R.; Vignudelli, S.; Bruno, M.; Mañanes, R.; Izquierdo, A.; Fernández-Barba, M. Satellite Radar Altimetry Supporting Coastal Hydrology: Case Studies of Guadalquivir River Estuary and Ebro River Delta (Spain). IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 3587–3599.
6. Mills, L.; Janeiro, J.; Martins, F. Baseline climatology of the Canary Current Upwelling System and evolution of sea surface temperature. Remote Sens. 2024, 16, 504.
7. Liu, Z.; Xing, X.; Chen, Z.; Lu, S.; Wu, X.; Li, H.; Zhang, C.; Cheng, L.; Li, Z.; Sun, C.; et al. Twenty years of ocean observations with China Argo. Acta Oceanol. Sin. 2023, 42, 1–16.
8. Legler, D.; Freeland, H.; Lumpkin, R.; Ball, G.; McPhaden, M.; North, S.; Crowley, R.; Goni, G.; Send, U.; Merrifield, M. The current status of the real-time in situ Global Ocean Observing System for operational oceanography. J. Oper. Oceanogr. 2015, 8, s189–s200.
9. Le Traon, P.; Antoine, D.; Bentamy, A.; Bonekamp, H.; Breivik, L.; Chapron, B.; Corlett, G.; Dibarboure, G.; DiGiacomo, P.; Donlon, C.; et al. Use of satellite observations for operational oceanography: Recent achievements and future prospects. J. Oper. Oceanogr. 2015, 8, s12–s27.
10. Privé, N.C.; McGrath-Spangler, E.L.; Carvalho, D.; Karpowicz, B.M.; Moradi, I. Robustness of Observing System Simulation Experiments. Tellus A Dyn. Meteorol. Oceanogr. 2023, 75, 309–333.
11. Halliwell, G., Jr.; Srinivasan, A.; Kourafalou, V.; Yang, H.; Willey, D.; Le Hénaff, M.; Atlas, R. Rigorous evaluation of a fraternal twin ocean OSSE system for the open Gulf of Mexico. J. Atmos. Ocean. Technol. 2014, 31, 105–130.
12. Novellino, A.; Martins, F.A.; Pieri, G.; Misurale, F. New approach to underwater technologies for innovative, low-cost ocean observation (NAUTILOS): Operational field primary capture system. In Proceedings of the 2022 IEEE International Workshop on Metrology for the Sea; Learning to Measure Sea Health Parameters (MetroSea), Milazzo, Italy, 3–5 October 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 203–208.
13. Beck, H.E.; Zimmermann, N.E.; McVicar, T.R.; Vergopolan, N.; Berg, A.; Wood, E.F. Present and future Köppen-Geiger climate classification maps at 1-km resolution. Sci. Data 2018, 5, 180214.
14. Leitão, F.; Relvas, P.; Cánovas, F.; Baptista, V.; Teodósio, A. Northerly wind trends along the Portuguese marine coast since 1950. Theor. Appl. Climatol. 2019, 137, 1–19.
15. Hurrell, J.W.; Deser, C. North Atlantic climate variability: The role of the North Atlantic Oscillation. J. Mar. Syst. 2010, 79, 231–244.
16. Dias, L.F.; Aparício, B.A.; Nunes, J.P.; Morais, I.; Fonseca, A.L.; Pastor, A.V.; Santos, F.D. Integrating a hydrological model into regional water policies: Co-creation of climate change dynamic adaptive policy pathways for water resources in southern Portugal. Environ. Sci. Policy 2020, 114, 519–532.
17. Santos, J.F.; Pulido-Calvo, I.; Portela, M.M. Spatial and temporal variability of droughts in Portugal. Water Resour. Res. 2010, 46, W03503.
18. Fiúza, A.F.; Demacedo, M.; Guerreiro, M. Climatological space and time-variation of the Portuguese coastal upwelling. Oceanol. Acta 1982, 5, 31–40.
19. Relvas, P.; Barton, E.D. Mesoscale patterns in the Cape Sao Vicente (Iberian peninsula) upwelling region. J. Geophys. Res. Ocean. 2002, 107, 28-1–28-23.
20. Relvas, P.; Barton, E.D.; Dubert, J.; Oliveira, P.B.; Peliz, A.; da Silva, J.C.; Santos, A.M.P. Physical oceanography of the western Iberia ecosystem: Latest views and challenges. Prog. Oceanogr. 2007, 74, 149–173.
21. Fiúza, A.F. Upwelling patterns off Portugal. In Coastal Upwelling Its Sediment Record: Part A: Responses of the Sedimentary Regime to Present Coastal Upwelling; Springer: Berlin/Heidelberg, Germany, 1983; pp. 85–98.
22. Teles-Machado, A.; Peliz, A.; Dubert, J.; Sánchez, R.F. On the onset of the Gulf of Cadiz Coastal Countercurrent. Geophys. Res. Lett. 2007, 34, L12601.
23. Garel, E.; Laiz, I.; Drago, T.; Relvas, P. Characterisation of coastal counter-currents on the inner shelf of the Gulf of Cadiz. J. Mar. Syst. 2016, 155, 19–34.
24. Haynes, R.; Barton, E.D. A poleward flow along the Atlantic coast of the Iberian Peninsula. J. Geophys. Res. Ocean. 1990, 95, 11425–11441.
25. Frouin, R.; Fiúza, A.F.; Ambar, I.; Boyd, T.J. Observations of a poleward surface current off the coasts of Portugal and Spain during winter. J. Geophys. Res. Ocean. 1990, 95, 679–691.
26. Relvas, P.; Barton, E.D. A separated jet and coastal counterflow during upwelling relaxation off Cape São Vicente (Iberian Peninsula). Cont. Shelf Res. 2005, 25, 29–49.
27. Atlas, R.; Kalnay, E.; Halem, M. Impact of satellite temperature sounding and wind data on numerical weather prediction. Opt. Eng. 1985, 24, 341–346.
28. Atlas, R. Atmospheric observations and experiments to assess their usefulness in data assimilation (Special Issue: Data assimilation in Meteorology and oceanography: Theory and practice). J. Meteorol. Soc. Jpn. Ser. II 1997, 75, 111–130.
29. Mourre, B.; De Mey, P.; Ménard, Y.; Lyard, F.; Le Provost, C. Relative performance of future altimeter systems and tide gauges in constraining a model of North Sea high-frequency barotropic dynamics. Ocean Dyn. 2006, 56, 473–486.
30. Oke, P.R.; Schiller, A. Impact of Argo, SST, and altimeter data on an eddy-resolving ocean reanalysis. Geophys. Res. Lett. 2007, 34, L19601.
31. Halliwell, G.R., Jr.; Kourafalou, V.; Le Hénaff, M.; Shay, L.K.; Atlas, R. OSSE impact analysis of airborne ocean surveys for improving upper-ocean dynamical and thermodynamical forecasts in the Gulf of Mexico. Prog. Oceanogr. 2015, 130, 32–46.
32. Ford, D. Assimilating synthetic Biogeochemical-Argo and ocean colour observations into a global ocean model to inform observing system design. Biogeosciences 2021, 18, 509–534.
33. Fennel, K.; Mattern, J.P.; Doney, S.C.; Bopp, L.; Moore, A.M.; Wang, B.; Yu, L. Ocean biogeochemical modelling. Nat. Rev. Methods Prim. 2022, 2, 76.
34. Yu, L.; Fennel, K.; Wang, B.; Laurent, A.; Thompson, K.R.; Shay, L.K. Evaluation of nonidentical versus identical twin approaches for observation impact assessments: An ensemble-Kalman-filter-based ocean assimilation application for the Gulf of Mexico. Ocean Sci. 2019, 15, 1801–1814.
35. Martin, M.J.; Hoteit, I.; Bertino, L.; Moore, A.M. Data assimilation schemes for ocean forecasting: State of the art. State Planet 2025, 5-opsr, 9.
36. Carrassi, A.; Bocquet, M.; Bertino, L.; Evensen, G. Data assimilation in the geosciences: An overview of methods, issues, and perspectives. Wiley Interdiscip. Rev. Clim. Change 2018, 9, e535.
37. Evensen, G.; Vossepoel, F.C.; Van Leeuwen, P.J. Data Assimilation Fundamentals: A Unified Formulation of the State and Parameter Estimation Problem; Springer Nature: Berlin/Heidelberg, Germany, 2022.
38. Evensen, G. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res. Ocean. 1994, 99, 10143–10162.
39. Oke, P.; Brassington, G.; Griffin, D.; Schiller, A. Ocean data assimilation: A case for ensemble optimal interpolation. Aust. Meteorol. Oceanogr. J. 2010, 59, 67–76.
40. Evensen, G. The Ensemble Kalman Filter: Theoretical formulation and practical implementation. Ocean Dyn. 2003, 53, 343–367.
41. Oke, P.R.; Allen, J.S.; Miller, R.N.; Egbert, G.D.; Kosro, P.M. Assimilation of surface velocity data into a primitive equation coastal ocean model. J. Geophys. Res. Ocean. 2002, 107, 3122.
42. Counillon, F.; Bertino, L. Ensemble Optimal Interpolation: Multivariate properties in the Gulf of Mexico. Tellus A Dyn. Meteorol. Oceanogr. 2009, 61, 296–308.
43. Tanajura, C.; Costa, F.; Silva, R.; Ruggiero, G.; Daher, V. Assimilation of Sea Surface Height Anomalies into Hycom with an Optimal Interpolation Scheme over the Atlantic Ocean Metarea V. Brazilian J. Geophys. 2013, 31, 257–270.
44. Hernandez-Lasheras, J.; Mourre, B. Dense CTD survey versus glider fleet sampling: Comparing data assimilation performance in a regional ocean model west of Sardinia. Ocean Sci. 2018, 14, 1069–1084.
45. Liu, Y.; Xie, J.; Liu, Z.; Gan, J.; Zhu, J. The Assimilation of Temperature and Salinity Profile Observations for Forecasting the River–Estuary–Shelf Waters. J. Geophys. Res. Ocean. 2021, 126, e2020JC017043.
46. Janeiro, J.; Neves, A.; Martins, F.; Relvas, P. Integrating technologies for oil spill response in the SW Iberian coast. J. Mar. Syst. 2017, 173, 31–42.
47. Sobrinho, J.; de Pablo, H.; Pinto, L.; Neves, R. Improving 3D-MOHID water model with an upscaling algorithm. Environ. Model. Softw. 2021, 135, 104920.
48. Braunschweig, F.; Leitao, P.; Fernandes, L.; Pina, P.; Neves, R. The object-oriented design of the integrated water modelling system MOHID. In Developments in Water Science; Elsevier: Amsterdam, The Netherlands, 2004; Volume 55, pp. 1079–1090.
49. Mayeli, P.; Sheard, G.J. Buoyancy-driven flows beyond the Boussinesq approximation: A brief review. Int. Commun. Heat Mass Transf. 2021, 125, 105316.
50. Martins, F.; Leitão, P.; Silva, A.; Neves, R. 3D modelling in the Sado estuary using a new generic vertical discretization approach. Oceanol. Acta 2001, 24, 51–62.
51. Martins, F.A.B.d.C.; Neves, R.; Leitão, P.C. A three-dimensional hydrodynamic model with generic vertical coordinate. In Proceedings of the 3rd International Conference on Hydroinformatics, Copenhagen, Denmark, 24–26 August 1998; pp. 1403–1410.
52. Martins, F. Modelação Matemática Tridimensional de Escoamentos Costeiros e Estuarinos Usando Uma Abordagem de Coordenada Vertical Genérica. Ph.D. Thesis, Universidade Técnica de Lisboa (Portugal), Lisbon, Portugal, 1999.
53. Janeiro, J.; Zacharioudaki, A.; Sarhadi, E.; Neves, A.; Martins, F. Enhancing the management response to oil spills in the Tuscany Archipelago through operational modelling. Mar. Pollut. Bull. 2014, 85, 574–589.
54. Moreira, D.; Janeiro, J.; Tosic, M.; Martins, F. A Generic Operational Tool for Early Warning Oil Spills – Application to Cartagena Bay and the Algarve Coast. In Proceedings of the INCREaSE 2023, Faro, Portugal, 5–7 July 2023; Semião, J.F.L.C., Sousa, N.M.S., da Cruz, R.M.S., Prates, G.N.D., Eds.; Springer: Cham, Switzerland, 2023; pp. 109–120.
55. Janeiro, J.; Martins, F.; Relvas, P. Towards the development of an operational tool for oil spills management in the Algarve coast. J. Coast. Conserv. 2012, 16, 449–460.
56. CIMA. Operational Modelling and Monitoring System of the Algarve—SOMA; CIMA: London, UK, 2024.
57. Blumberg, A.F.; Kantha, L.H. Open boundary condition for circulation models. J. Hydraul. Eng. 1985, 111, 237–255.
58. Martinsen, E.A.; Engedahl, H. Implementation and testing of a lateral boundary scheme as an open boundary condition in a barotropic ocean model. Coast. Eng. 1987, 11, 603–627.
59. EU Copernicus Marine Service. Global Ocean 1/12° Physics Analysis and Forecast Updated Daily; EU Copernicus Marine Service: Toulouse, France, 2016.
60. Carrere, L.; Lyard, F.; Cancet, M.; Guillot, A.; Roblou, L. FES 2012: A New Global Tidal Model Taking Advantage of Nearly 20 Years of Altimetry. In 20 Years of Progress in Radar Altimetry; Ouwehand, L., Ed.; ESA Special Publication; ESA: Paris, France, 2013; Volume 710, p. 13.
61. Kallos, G.; Nickovic, S.; Papadopoulos, A.; Jovic, D.; Kakaliagou, O.; Misirlis, N.; Boukas, L.; Mimikou, N.; Sakellaridis, G.; Papageorgiou, J.; et al. The regional weather forecasting system SKIRON: An overview. In Symposium on Regional Weather Prediction on Parallel Computer Environments; University of Athens: Athens, Greece, 1997; Volume 15, p. 17.
62. Papadopoulos, A.; Katsafados, P.; Kallos, G.; Nickovic, S. The Weather Forecasting System for Poseidon—An Overview. J. Atmos. Ocean. Sci. 2002, 8, 219–237.
63. Mendonça, F.; Martins, F.; Janeiro, J. SMS-Coastal, a New Python Tool to Manage MOHID-Based Coastal Operational Models. J. Mar. Sci. Eng. 2023, 11, 1606.
64. Counillon, F.; Keenlyside, N.; Wang, S.; Devilliers, M.; Gupta, A.; Koseki, S.; Shen, M.L. Framework for an Ocean-Connected Supermodel of the Earth System. J. Adv. Model. Earth Syst. 2023, 15, e2022MS003310.
65. Counillon, F.; Bertino, L. High-resolution ensemble forecasting for the Gulf of Mexico eddies and fronts. Ocean Dyn. 2009, 59, 83–95.
66. Gaspari, G.; Cohn, S.E. Construction of correlation functions in two and three dimensions. Q. J. R. Meteorol. Soc. 1999, 125, 723–757.
67. EU-CMEMS. European North West Shelf/Iberia Biscay Irish Seas—High Resolution ODYSSEA L4 Sea Surface Temperature Analysis; EU-CMEMS: Luxembourg, 2019.
68. EU-CMEMS. European North West Shelf/Iberia Biscay Irish Seas—High Resolution L4 Sea Surface Temperature Reprocessed; EU-CMEMS: Luxembourg, 2019.
69. Turner, M.R.J.; Walker, J.P.; Oke, P.R.; Grayson, R.B. Assimilation of Sea-Surface Temperature into a Hydrodynamic Model of Port Phillip Bay, Australia. In Proceedings of the Coasts and Ports 2005: Coastal Living—Living Coast, Adelaide, Australia, 20–23 September 2005; pp. 511–516.
70. Murphy, A.H. Skill scores based on the mean square error and their relationships to the correlation coefficient. Mon. Weather. Rev. 1988, 116, 2417–2424.
71. Taylor, K.E. Summarizing multiple aspects of model performance in a single diagram. J. Geophys. Res. Atmos. 2001, 106, 7183–7192.

Figure 1. Spatial domains of the Algarve operational model (SOMA) used in this study. Panels (1a,1b) display the full extent of the SOMA Level 1 and Level 2 grids, respectively. Panels (2a,2b) present a zoomed-in view of the region surrounding Cape St. Vincent, located at the southwestern tip of the Algarve, to highlight the difference in spatial resolution between the two grid levels.

Figure 2. Comparison of sea surface temperature (SST) between the Nature Run (NR, green line) and the Free Run (FR, black line) over the year 2023. The root mean square error (RMSE) between the temperature fields is shown by the red line, with its scale indicated on the secondary y-axis (right).

Figure 3. Correlation maps computed using a reference point on the surface (asterisk on the maps, 36.95° N, 7.73° W). The correlation was calculated over time across all FR ensemble members, relating the surface temperature at this fixed point to salinity and current fields in the rest of the model grid. Maps (1.A,1.B) show the correlations at the surface for salinity and currents, respectively. Maps (2.A,2.B) depict the same properties at a depth of 100 m.

Figure 4. Correlation maps computed using a reference point on the surface (asterisk on the maps, 37.29° N, 9.09° W) on the western coast of the domain. The correlation was calculated over time across all FR ensemble members, relating the surface temperature at this fixed point to salinity and current fields in the rest of the model grid, following the same methodology as in Figure 3. Maps (1.A,1.B) show the correlation at the surface for salinity and currents, respectively, while Maps (2.A,2.B) depict the same properties at a depth of 100 m.

Figure 5. Depth-dependent correlation between SST and salinity over time, calculated using the FR ensemble members. Each curve corresponds to a reference location within the model domain: the southern coast of Portugal (36.95° N, 7.73° W; red curve, same as Figure 3) and the western coast (37.29° N, 9.09° W; blue curve, same as Figure 4).

Figure 6. On the left, the SST error over time of the FR and experiments OSE1 to OSE5, referenced in Table 1, with respect to the NRT ODYSSEA satellite product. On the right, the SST error of the FR and experiments OSSE1 to OSSE5 with respect to the NR. The experiments were integrated for an additional seven days following the update on 3 February.

Figure 7. The Taylor diagrams present the statistical properties (standard deviation, correlation, and root mean square difference, RMSD) for OSE1–OSE5 and OSSE1–OSSE5, as well as the Free Run (FR_SAT) relative to the NRT ODYSSEA satellite data (SAT) and the Free Run (FR_NR) relative to the Nature Run (NR). Panel (A) shows the results immediately after the EnOI update, and Panel (B) shows the results seven days later.

Figure 8. Murphy Skill Score (MSS) for experiments based on the NRT ODYSSEA satellite product. Blue lines indicate OSE experiments, with skill scores computed relative to their satellite-based reference data; red lines indicate OSSE experiments, computed relative to the NR. The left panel shows MSS immediately after the EnOI update (3 February), and the right panel shows MSS seven days later (10 February).

Table 1. Summary of the OSE and OSSE experiments. Each experiment is defined by the number of surface observation points assimilated (column Number of Points), the data source from which SST values were extracted (Data Source), and the method used to obtain or estimate the observation errors (Observation Error).

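Experiment	Number of Points	Data Source	Observation Error
OSE₁	10	CMEMS NRT ODYSSEA	from data source
OSE₂	50	CMEMS NRT ODYSSEA	from data source
OSE₃	100	CMEMS NRT ODYSSEA	from data source
OSE₄	200	CMEMS NRT ODYSSEA	from data source
OSE₅	500	CMEMS NRT ODYSSEA	from data source
OSE₆	10	CMEMS Reprocessed	from data source
OSE₇	50	CMEMS Reprocessed	from data source
OSE₈	100	CMEMS Reprocessed	from data source
OSE₉	200	CMEMS Reprocessed	from data source
OSE₁₀	500	CMEMS Reprocessed	from data source
OSSE₁	10	SOMA Nature Run	NRT_ODYSSEA × random_field₁
OSSE₂	50	SOMA Nature Run	NRT_ODYSSEA × random_field₂
OSSE₃	100	SOMA Nature Run	NRT_ODYSSEA × random_field₃
OSSE₄	200	SOMA Nature Run	NRT_ODYSSEA × random_field₄
OSSE₅	500	SOMA Nature Run	NRT_ODYSSEA × random_field₅
OSSE₆	10	SOMA Nature Run	Reprocessed × random_field₆
OSSE₇	50	SOMA Nature Run	Reprocessed × random_field₇
OSSE₈	100	SOMA Nature Run	Reprocessed × random_field₈
OSSE₉	200	SOMA Nature Run	Reprocessed × random_field₉
OSSE₁₀	500	SOMA Nature Run	Reprocessed × random_field₁₀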

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
