3.1. Data Sets
For the evaluation of satellite and model-derived chlorophyll concentration (Chl) and sea surface temperature (SST), we used data from in situ observations. In situ Chl and surface water temperature data were obtained from public databases of the International Council for the Exploration of the Sea (ICES Dataset on Ocean Hydrography, https://ocean.ices.dk, accessed on 1 November 2020). Data were collected as part of the HELCOM (The Baltic Marine Environment Protection Commission) program according to their protocols (https://helcom.fi/helcom-at-work/publications/manuals-and-guidelines/, accessed on 1 November 2020). For our analyses, we selected eight stations located in the open sea and representative of the Baltic Proper waters. The geographical positions of these stations are listed in Table 1 and shown in Figure 1.
Satellite Chl data used in this paper include four data products (Table 2). Three products [34] were downloaded from the GlobColour website on 3 November 2020 (http://globcolour.info, accessed on 1 November 2020). These are: (1) the weighted-average surface chlorophyll concentration, listed on the GlobColour website as CHL1 and referred to in this paper as Chl_1, with estimates derived using the reflectance band ratio algorithm [4]; (2) the surface chlorophyll concentration obtained with the GSM algorithm (Chl_gsm) [35,36]; and (3) the surface chlorophyll concentration derived from a neural network algorithm [37], referred to on the GlobColour website as Chl2 and in this paper as Chl_2. Finally, the fourth satellite product was the surface Chl concentration estimated with the regional SatBaltic algorithm (Chl_Satb) [38,39]. In the SatBaltic system, the chlorophyll algorithm has been frequently updated (see http://www.satbaltyk.pl, accessed on 1 November 2020). We used data processed with the most recent algorithm at the time of manuscript preparation (version 2020). We did not include data from the Copernicus Marine Environment Monitoring Service (CMEMS) in our analysis, as these data have already been evaluated by several authors, including service providers [40]. Full details of the approach used by GlobColour in the standard processing of satellite ocean color data are provided in the manual (GlobColour Product User Guide GC-UM-ACR-PUG-01 version 4.2.1, March 2020) [41]. The spatial resolution of the satellite data is about 1 km.
Both the Chl_1 and Chl_gsm data products are surface Chl concentration determinations based on the standard global ocean algorithms [4,35,36] and use, as input, merged data from several ocean color sensors observing the Earth on the day for which surface Chl was estimated (Table 2). The time series of these two Chl products began when SeaWiFS started to deliver data in September 1997. For our work, we used data from the years 1998–2019. The SatBaltic data product is based on regionally modified remote sensing reflectance band ratio relationships [38,39]. It was created from the MODIS Aqua observations that started in 2002. The Chl_2 data product was derived using the neural network algorithm established to estimate surface Chl in coastal regions [37]. The Chl_2 data set available at GlobColour was generated only from MERIS and OLCI-A. This is why it includes a significantly smaller number of data points coincident with in situ measurements than the other Chl products used in this study.
Table 3, Table 4, Table 5 and Table 6 list the exact numbers of coincident satellite/in situ data pairs available at each station for each Chl product. In these comparisons, we used satellite data from the pixels containing the in situ stations. Only data pairs with in situ measurements and satellite determinations for the same day were used in comparisons. This match-up criterion allowed us to build a data set with a sufficiently large number of observations, but it is not as strict as that used in [8] for algorithm development and validation. We consider our approach justified, since in most situations in the open Baltic Sea we do not expect large changes in Chl concentrations over the course of a day. In addition, we do not propose to establish new remote sensing relationships, but only to assess different satellite data sets. Our goal was to show potential users of these data what differences between in situ and remote sensing data can be expected.
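The same-day match-up criterion described above can be illustrated with a minimal sketch. The record layout (station, day, value), the station labels, and the function name `same_day_matchups` are assumptions made for this illustration and do not come from the data providers; the extraction of the satellite pixel containing each station is assumed to have been done beforehand.

```python
from datetime import date


def same_day_matchups(in_situ, satellite):
    """Pair each in situ sample with the satellite value for the same station and day.

    Both inputs are iterables of (station, day, value) tuples, where `day`
    is a datetime.date. Samples without a same-day satellite value are dropped.
    """
    satellite_by_key = {(station, day): value for station, day, value in satellite}
    pairs = []
    for station, day, observed in in_situ:
        key = (station, day)
        if key in satellite_by_key:
            pairs.append((station, day, observed, satellite_by_key[key]))
    return pairs


# Hypothetical records, for illustration only (not real measurements).
in_situ = [
    ("S1", date(2010, 5, 1), 2.0),
    ("S1", date(2010, 5, 2), 3.0),  # no satellite value on this day
    ("S2", date(2010, 5, 1), 1.2),
]
satellite = [
    ("S1", date(2010, 5, 1), 2.5),
    ("S2", date(2010, 5, 1), 1.0),
    ("S2", date(2010, 5, 3), 0.9),  # no in situ sample on this day
]
matchups = same_day_matchups(in_situ, satellite)
```

In this sketch a pair is kept only when both sources report a value for the same station on the same calendar day, which is why a less strict time window yields more pairs than a criterion based on hours.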
There are several satellite SST data products available as open-access data for the Baltic Sea. We decided to use a data series known as the National Oceanic and Atmospheric Administration (NOAA) daily Optimum Interpolation SST Version 2 data set (dOISST.v2). The dOISST.v2 data set is available at the National Centers for Environmental Information (NCEI) website, under the name “NOAA Optimum Interpolation 1/4 Degree Daily Sea Surface Temperature (OISST) Analysis, Version 2” (with doi:10.7289/V5SQ8XB5). The same data are distributed elsewhere, for example, at the Physical Oceanography Distributed Active Archive Center (PODAAC) of the Jet Propulsion Laboratory, NASA, as the GHRSST (Group for High Resolution SST) Level 4 AVHRR_OI Global Blended Sea Surface Temperature Analysis (with doi:10.5067/GHAAO-4BC01). The dOISST.v2 data set has been approved by the NOAA Climate Data Record (CDR) program as an operational CDR. It meets the definition of a CDR put forward by the National Research Council (2004), as it is of sufficient length, consistency, and continuity to determine climate variability. These global daily SST records (one daily value for each pixel), with a spatial resolution of 0.25° by 0.25°, are based on the Advanced Very High Resolution Radiometer (AVHRR) infrared satellite measurements (Pathfinder from September 1981 through December 2005, operational AVHRR from January 2006). The final global data set was derived by combining satellite SST retrievals with SST observations from ships and buoys, and proxy SSTs generated from sea ice concentrations. The full description of data processing methods and comparisons between the NOAA dOISST.v2 and in situ data can be found in [42,43]. Note that the infrared satellite remote sensing SST algorithms can provide either a skin SST, if they are based on radiative transfer models, or a subskin SST, if in situ observations have been used to adjust satellite retrievals. In the dOISST.v2 data set, the bias correction of the satellite data is based on data from ships and buoys, and therefore the resulting SST should be interpreted as the bulk SST [7]. In order to apply the correction for bias, the satellite data were classified into daytime and nighttime bins and corrected separately using in situ data. Then, all the data were reanalyzed jointly using the optimum interpolation (OI) procedure. The final data represent the daily mean bulk SST values representative of the top 1 m surface water layer.
The modeled SST and surface Chl data (indicated as SST_model and Chl_model) used in our comparisons were extracted from the Baltic Sea biogeochemical reanalysis data set (BALTICSEA_REANALYSIS_BIO_003_012) provided by the Baltic Monitoring and Forecasting Centre (BAL MFC) of the Copernicus Marine Environment Monitoring Service (CMEMS). These reanalysis data were derived using the ice-ocean NEMO (Nucleus for European Modelling of the Ocean) model [44]. NEMO was coupled with the biogeochemical model SCOBI (Swedish Coastal and Ocean Biogeochemical model). The horizontal grid resolution is approximately 2 nautical miles (latitude 0.03333 degrees; longitude 0.05556 degrees), and there are 56 water depth levels. The reanalysis applied the Localized Singular Evolutive Interpolated Kalman (LSEIK) filter for data assimilation [45]. The observational data used for data assimilation included SST, nitrate, phosphate, ammonium, and dissolved oxygen concentrations. For comparison with satellite and in situ data, we selected data from 1998–2019 at the grid points nearest to the in situ stations and from the same day as the in situ observations. Data were taken from the uppermost available model depth level (~1.5 m). More details on the model setup can be found in the PRODUCT USER MANUAL Baltic Sea Biogeochemical Reanalysis Product (BALTICSEA_REANALYSIS_BIO_003_012, CMEMS-BAL-PUM-003-012 version 2) [46]. Table 7 lists the number of coincident model/in situ data pairs available at each station for Chl, while Table 8 and Table 9 list the number of data pairs available for satellite/in situ and model/in situ SST comparisons, respectively.
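The selection of the model grid point closest to an in situ station can be sketched as follows. This is only an illustration under stated assumptions: the grid origin, the station coordinates, and the function names are hypothetical, the grid is taken to be regular in latitude and longitude with the spacing quoted above, distances are great-circle (haversine) distances on a spherical Earth, and land-masked grid points are not handled.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_grid_point(station_lat, station_lon, grid_lats, grid_lons):
    """Return (i, j, distance_km) of the grid node closest to the station."""
    distance, i, j = min(
        (haversine_km(station_lat, station_lon, lat, lon), i, j)
        for i, lat in enumerate(grid_lats)
        for j, lon in enumerate(grid_lons)
    )
    return i, j, distance


# Hypothetical grid patch with the spacing quoted in the text
# (0.03333 degrees in latitude, 0.05556 degrees in longitude).
grid_lats = [55.0 + 0.03333 * k for k in range(10)]
grid_lons = [18.0 + 0.05556 * k for k in range(10)]

# Hypothetical station position, for illustration only.
i, j, distance_km = nearest_grid_point(55.07, 18.12, grid_lats, grid_lons)
```

The brute-force search is adequate for a handful of stations; for many stations a spatial index would be a more efficient choice.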
3.2. Methods
The differences between in situ and satellite (or model) derived data were quantified by standard statistical methods. First, comparisons between the data sets in linear space included the root mean square error (RMSE), the bias (B), the mean absolute error (MAE), and the Pearson’s correlation coefficient (R) for all types of data pairs. The bias (B) was defined as the mean difference between the estimated data value (from model or satellite algorithms) and the in situ measurement:
$$B = \frac{1}{N} \sum_{n=1}^{N} \left( P_n - O_n \right)$$
where N is the total number of measurements, O_n is the value measured in situ, and P_n is the predicted value (satellite or model, Chl or SST determination). The root mean square error (RMSE) was calculated as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left( P_n - O_n \right)^{2}}$$
The mean absolute error (MAE) was calculated using the following formula:
$$\mathrm{MAE} = \frac{1}{N} \sum_{n=1}^{N} \left| P_n - O_n \right|$$
Additionally, Chl data were log-transformed. In this case, O_n and P_n in the formulas listed above were replaced by log10(O_n) and log10(P_n). Following [47], for Chl log-transformed data, the bias (B_L), the mean absolute error (MAE_L), and the Pearson’s correlation coefficient were calculated. Ocean color Chl determinations are usually log-transformed prior to calculation of error metrics, because the data values frequently span multiple orders of magnitude [47]. This log-transformation results in a conversion of the statistical metrics from linear to multiplicative space. In general, the choice of either linear or multiplicative metrics depends on the properties of the variable of interest. For example, water temperature is always evaluated with linear metrics. Data such as Chl, spanning multiple orders of magnitude and with an uncertainty that varies proportionally with the data value, are better assessed using multiplicative metrics [48]. The bias (B_L) and the mean absolute error (MAE_L) listed in the Tables were calculated as:
$$V_{\mathrm{conv}} = 10^{V_{\mathrm{log}}}$$
where V_conv is either B_L or MAE_L converted out of the log-space and V_log is its value calculated on log-transformed data. After this back-transformation, the errors can be interpreted as a percentage. For example, a B_L of 1.5 indicates that the estimated value is on average 1.5 times the observed value (i.e., 50% greater), while a MAE_L of 1.6 indicates an average relative error of 60%.
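As an illustration of the metrics defined in this section, the following minimal sketch computes the linear metrics (B, RMSE, MAE, R) and the multiplicative metrics (B_L, MAE_L) for paired data. The function names and the sample values are hypothetical, and the use of base-10 logarithms is an assumption of this sketch, consistent with the back-transformation by a power of 10.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def linear_metrics(observed, predicted):
    """Bias, RMSE, MAE and R computed on untransformed data."""
    n = len(observed)
    diffs = [p - o for o, p in zip(observed, predicted)]
    return {
        "B": sum(diffs) / n,
        "RMSE": math.sqrt(sum(d * d for d in diffs) / n),
        "MAE": sum(abs(d) for d in diffs) / n,
        "R": pearson_r(observed, predicted),
    }


def multiplicative_metrics(observed, predicted):
    """Bias and MAE computed in log10 space and converted back (values must be > 0)."""
    n = len(observed)
    log_o = [math.log10(v) for v in observed]
    log_p = [math.log10(v) for v in predicted]
    diffs = [p - o for o, p in zip(log_o, log_p)]
    return {
        "B_L": 10 ** (sum(diffs) / n),
        "MAE_L": 10 ** (sum(abs(d) for d in diffs) / n),
        "R_log": pearson_r(log_o, log_p),
    }


# Hypothetical Chl pairs in which every estimate is twice the observation.
observed = [1.0, 2.0, 4.0]
predicted = [2.0, 4.0, 8.0]
lin = linear_metrics(observed, predicted)
mult = multiplicative_metrics(observed, predicted)
```

For these sample values the multiplicative bias is 2, that is, the estimates are on average twice the observations, whereas the linear bias depends on the magnitude of the data, which is why multiplicative metrics are preferred for variables spanning several orders of magnitude.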
When comparing the in situ and satellite-derived data, one has to remember that both kinds of data are subject to errors. For example, in situ data include errors due to the limited precision of the instrumentation used in the experiments. It is reasonable to expect that errors in satellite and model data can be larger than those in in situ measurements. These errors are due to the assumptions used to calculate the values and the approximate nature of the algorithms. However, even though all data used in this paper are expected to be subject to errors, we refer to in situ data as ‘measured’ and to the differences between satellite- or model-derived and in situ estimates as ‘errors’. Regressions listed in this paper represent Model II major-axis reduced regression [49], as this type of regression model is suitable when both variables in the regression equation contain errors.
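A Model II regression can be sketched as follows. The sketch assumes the reduced major axis (geometric mean) form, in which the slope is the ratio of the standard deviations of the two variables with the sign of the correlation coefficient; whether this matches the exact variant described in [49] is an assumption, and the function name and sample values are hypothetical.

```python
import math


def reduced_major_axis(x, y):
    """Model II (reduced major axis) regression of y on x.

    Returns (slope, intercept, r). Unlike ordinary least squares, the slope
    does not assume that x is free of error.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    # Slope magnitude is sd(y) / sd(x); its sign follows the correlation.
    slope = math.copysign(math.sqrt(syy / sxx), r)
    intercept = mean_y - slope * mean_x
    return slope, intercept, r


# Hypothetical scattered data, for illustration only.
slope, intercept, r = reduced_major_axis([0.0, 1.0, 2.0], [0.0, 2.0, 1.0])
```

For these sample values the ordinary least squares slope would be 0.5, while the reduced major axis slope is 1, which illustrates how the two approaches diverge when the correlation is weak.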