Partition of Marine Environment Dynamics According to Remote Sensing Reﬂectance and Relations of Dynamics to Physical Factors

: Seawaters exhibit various types of cyclic and trend-like temporal alterations in their biological, physical, and chemical processes. Surface water dynamics may vary, for instance, when the timings, durations, or amplitudes of seasonal developments of water properties alter between years and locations. We introduce a workﬂow using remote sensing to identify surface waters undergoing similar dynamics. The method, called ocean surface dynamics partitioning, classiﬁes pixels based on their temporal change patterns instead of their properties at successive time snapshots. We apply an efﬁcient parallel computing method to calculate Dynamic Time Warping (DTW) time series distances of large datasets of Earth Observation MERIS-instrument reﬂectance data R rs (510 nm) and R rs (620 nm), and produce a matrix of time series distances between 12,252 locations/time series in the Baltic Sea, for both wavelengths. We deﬁne cluster prototypes by hierarchical clustering of distance matrices and use them as initial prototypes for an iterative process of partitional clustering in order to identify areas that have similar reﬂectance dynamics. Lastly, we compute distances from the time series of the reﬂectance data to selected physical factors (wind, precipitation, and changes in sea surface temperature) obtained from Copernicus data archives. The workﬂow is reproducible and capable of managing large datasets in reasonable computation times and identifying areas of distinctive dynamics. The results show spatially coherent and logical areas without a priori information about the locations of the satellite image time series. The alignments of the reﬂectance time series vs. the observational time series of the physical environment clarify the causalities behind the cluster formation. We conclude that following the changes in an aquatic realm by biogeochemical observations at certain temporal intervals alone is not sufﬁcient to identify environmental shifts. We foresee that the changes in dynamics are a sensitive measure of environmental threats and therefore they will be important to follow in the future.


Introduction
Many physical, chemical, and biological processes in oceans and seas proceed in temporal cycles of different lengths, from diurnal and seasonal [1][2][3][4][5] to longer periods, such as the El Niño cycle in the Pacific Ocean [6,7]. Compared to trend-like changes, such as eutrophication [8][9][10] and seawater acidification [11,12], cyclic processes show a degree of recurrence, yet they may be difficult to observe if embedded within other dynamic processes. In shallow coastal seas, for example, multiple factors like seawater stratification, river outlets, and local geographical features are present. The momentary state of a given point of the sea surface reflects also recent weather conditions and many observation at another location at the same time step. The units to be classified are the time series where the observations lie, not the observations themselves. We argue that if areas share similar dynamics, the factors behind the changes of the water properties are the same. We also consider that the changes in sea surface layer dynamics are gradual, as are most of the properties and phenomena in the sea. This approach has been applied earlier by Suominen [38] and by Mantas et al. [39]. For this research we develop computational methods to perform k-means clustering of large amounts of time series efficiently, enabling the use of the presented method up to a global scale. Secondly, after differentiating areas of distinctive dynamics we go further and study how physical factors are linked to the reflectance dynamics-annually regular or not.
Alternative approaches are available for assessing similarities between time series data of different origins [21,40,41]. The mining of long time series data is typically computationally demanding because of the massive and occasionally multi-dimensional characteristics of the observations. The datasets can be noisy and include spatial and temporal gaps and temporal shifts, or the time series may have varying lengths. Dynamic time warping (DTW) [42] is widely applied in land use classifications and studies of terrestrial cycles [19,24,43], and in recent years it has been used in the aquatic realm by Suominen [38] and Mantas et al. [39]. A crucial step in our approach is the forming of a prototype of a set of time series. These prototypes are used for the time series k-means clustering but also to differentiate and visualize the areas of distinctive dynamics. DTW barycentric averaging (DBA) [44] is a competitive method for prototyping time series and we selected it together with DTW as the distance measure to our k-means clustering algorithm.
The Baltic Sea was chosen for our approach as it contains several known nontrivial spatial and temporal features and therefore forms a suitable area for the testing of an unsupervised method, such as k-means clustering. We use DTW to define (dis)similarity of satellite observation time series at the level of individual bins, i.e., over 12,000 point locations, and apply cluster partitioning to group the series. Typical dynamics in different parts of the Baltic Sea were defined as the prototypes of the time series clusters. The areas where each dynamics prevails were visualized in cartographic form. The raw data consists of the MERIS-instrument observations from an 8-year study period (open water periods 1 June- 15 September in 200415 September in -2011. This period is practical as it is used also in many status assessments of the Baltic Sea and is therefore of high interest. We perform the procedure equally in two different spectral bands, R rs (510) and R rs (620), and expect to see spatially coherent groups of bins with analogous dynamics. In order to analyze the causalities behind the cluster formation, we also compute distances from the time series of selected physical factors (wind, precipitation, and changes in sea surface temperature) to the time series of R rs (λ).

Study Area
The Baltic Sea ( Figure 1) is a semi-closed marginal sea of the North Sea. As a shallow and brackish water sea (mean depth 54 m), it is ecologically sensitive [45,46]. The northern location (approx. 53-66 • N) induces strong seasonality in the physical, chemical, and biological properties of the Baltic Sea surface waters.
Horizontally, the surface layer salinity gradually decreases from the oceanic level at the inlet to almost zero in the distal parts of the northernmost basin. Due to high freshwater input from land and highly saline but low in volume input of oceanic waters, the Baltic Sea has a strong halocline at the depth of 40-70 m. The northernmost areas freeze by mid-January and the ice cover may last until May. In the summer months, temperatures in the open sea rise to 15-20 • C, creating a thermocline at the depth of 15-20 m. Vertical mixing of waters takes place again in autumn. There is a relatively persistent counter-clockwise upper layer circulation pattern in the Baltic Sea with an average speed of approximately 5 cm s −1 [47]. Climate models suggest increasing precipitation and Remote Sens. 2021, 13, 2104 4 of 28 discharges, which can lead to decreased salinity, increased nutrient loads, and consequent changes in biogeochemical cycles in the Baltic Sea [48][49][50].
Annual successions of phytoplankton and cyanobacteria follow the same pattern each year. A spring bloom of diatoms and dinoflagellates follows increased light and rising temperatures in April-May. Another bloom, dominated by cyanobacteria, takes place in late summer in July-August [51][52][53], and these blooms notably affect the optical properties of the seawater. In calm weather cyanobacteria accumulate on the surface, but at higher winds they mix with the water column above the thermocline.
In addition to cyanobacteria and abundant phytoplankton, the optical properties of the Baltic Sea waters are driven by colored dissolved organic matter (CDOM) and suspended particulate matter (SPM) [54,55]. High concentrations of SPM occur near the river mouths, but the inorganic fraction of SPM travels a relatively short distance from the shore [56]. In the open sea, SPM mainly consists of phytoplankton and cyanobacteria [57]. The bottom shear stress in the SW Baltic Sea is dominated by the wave contribution in the shallower areas due to the lack of tides [56]. Transport of fine sand was modeled to occur at wind speed > 15 m s −1 , whereas finer materials are rapidly eroded from shallower areas already at wind speeds of 10 m s −1 .

R rs (λ) Time Series
EO imagery possesses a particularly high potential for the study of seawater change. The Copernicus Marine Environment Monitoring Service [58] of the European Union Earth Observation Program provides global and regional data for oceanic bio-geochemical monitoring and modelling. Its archives facilitate long-term time series analysis without the laborious stage of compiling data from multiple EO sources. Yet it is a demanding task to transform remote sensing and in situ observational data into useful forms supporting practical decision-making like ecosystem-based management or marine spatial planning [59,60].
In the semi-enclosed Baltic Sea in northern Europe, sea surface waters are optically complex [54,56,61,62] and varying concentrations of CDOM hinder the retrieval of information on other water properties [54,55,63]. Accordingly, instead of utilizing biogeochemical data products, we concentrate on analyzing reflectance data and thereby avoid the potential error sources identified in bio-geochemical products and consequent artificial spatial patterns. We use the same daily MERIS instrument normalized remote sensing reflectance R rs (λ) data as in Suominen [38], the wavelengths λ being 510 nm and 620 nm in this study. The rationale for selecting these wavelengths is further justified in Section 3.2. The data originate from MERIS 3rd data reprocessing with MERIS Ground Segment (MEGS) Processor Version 8.0. The reflectance data was produced to correspond to Level-2 for-Remote Sens. 2021, 13, 2104 5 of 28 mat with Case 2 Regional CoastColour processor (C2RCC, v0.15) [64][65][66]. The chosen reflectance processor is the same as the one provided by CMEMS for the MERIS timeline in the Baltic Sea [67]. During winter months, seasonal ice covers and low sun angles prohibit a year-round analysis of water reflectance changes.

Sampling of R rs (λ) Data
We binned the data using the level 3 binning operator tool [68] in Sentinel Application Platform [69] software into bins of size 1.25 by 1.25 , that is, 2315 m in the north-south direction and from 950 m to 1350 m in the east-west direction. Each bin consisted of approx. 24-32 original pixels for which the median value was computed. Every second bin in the north-south direction and every fourth bin in the east-west direction were removed. Bins that were located nearer than 3300 m from a land object were removed to avoid so called mixed pixels containing partially water and land. As a result, the data contained 12,827 bins/time series. In the northernmost parts of the study area, the last annual sea ice often does not melt until May. Towards autumn, in turn, the shortening days with frequent cloud covers diminish the amount of qualified imagery. In order to obtain comparable data throughout the area, the study period was limited to the summer season between 1 June and 15 September. During the analyzed period, the temporal coverage of the imagery was 15-25% in the southern parts of the Baltic Sea and, due to more frequent overflights and clearer skies, 25-40% in the northern parts ( Figure 2). A small number of images with too few cloud-free observations were excluded from the final dataset. As a result, the final number of included bins in the times series data was 12,252. Missing values in the time series were filled in by using linear interpolation between preceding and subsequent values to make the dataset consistent for the DTW computations. The point-wise reflectance R rs (λ) time series were standardized (Equation (1)) to ensure that the (dis)similarity measure and clustering are based on the shape of temporal patterns and that the analysis ignores the effect of the absolute reflectance level, where R rs (λ) is the reflectance at wavelength λ (nm), µ and σ are the annual mean and the standard deviation of the observed R rs (λ), and R z (λ) is the standardized R rs (λ). As a result, the linearly interpolated, standardized reflectance datasets over the 8-year period 2004-2001 for 1 June-15 September compose a general view of reflectance dynamics in the study area.

Time Series of Physical Factors
Data about precipitation, wind direction and wind speed 10 m above the ground (UERRA regional reanalysis for Europe on single levels from 1961 to 2019 re-analysis MESCAN-SURFEX) [70] were obtained from Copernicus Atmosphere Monitoring Service [71]. The spatial resolution of the gridded precipitation and the wind data was 5.5 km. The NetCDF data was sampled and further processed using R software.
Zonal and meridional wind components were computed from daily wind speeds and direction data at 06:00. They were further divided into four wind component time series (eastward, southward, westward, and northward), where a positive value is the wind speed (m/s) towards the cardinal point, the lowest value being zero. Daily precipitation data consist of the accumulated amount of water falling onto the ground/water surface over 24 h. A pointwise precipitation was not considered to represent adequately the precipitation affecting the discharges and reflectance dynamics of the sea area, instead we used the mean precipitation within a 50 km radius around the locations of reflectance time series. Data on sea surface temperature (ESA SST CCI reprocessed sea surface temperature analyses) [72,73] were downloaded from Copernicus Marine Environment Monitoring Service [74]. The spatial resolution of daily gridded SST data was 0.05 • × 0.05 • . SST change was computed by subtracting the median of SST of the preceding five days from the daily SST. By this, we wanted to identify rapid changes in SST but ignore seasonal changes (∆SST).
To make the precipitation, wind components and SST comparable with the R rs (λ) time series they were standardized (Equation (2)) in a manner similar to the R rs (λ) data, with the exception that the standardization was done over all years, not annually, where P is the observed value of the physical factor, µ P and σ P are the mean and the standard deviation of the observed P over all annual periods, and P z is the standardized P.

Description of Dynamic Time Warping DTW
Let X and Y, called query and reference, be two time series having lengths n and m: X = x 1 , x 2 ,...x i ...x n and Y = y 1 ,y 2 ...y i ...y m (Figure 3 left). Dynamic time warping is an algorithm that finds the optimal alignment of the two time series. An alignment is given by a collection of assignments x i~yj such that x 1~y1 and x n~ym , any x i must be assigned to one or multiple y j and vice versa, and fulfilling the constraint of monotonicity: if x ĩ y j then x i+1 must be assigned to either y j or y j+1 and vice versa for y with respect to x. Optimality is established by finding the alignment for which an objective function attains its minimum, e.g., the sum of the dissimilarity or absolute distance of the aligned time series values |x i -y j |.
To find the optimal alignment, a local cost matrix LCM is formed (Figure 3 middle). The distance between observations x i and y j are shown in the elements LCM ij . The algorithm finds the least cumulative cost path from (x 1 ,y 1 ) to (x n ,y m ) using a predefined step pattern through the LCM. The step pattern determines the step directions, step lengths, and how the different directions are weighted. The cumulative cost matrix, Figure 3 right, illustrates the aggregation of the dissimilarity measure of the optimally aligned time series. A window constraint determines the longest possible time distance |i -j| between two observations x i ,y j that are allowed to align. Descriptions of DTW, constraints, and step patterns are given e.g. by Giorgino [75], Keogh and Ratanamahatana [76], Petitjean et al. [44] and Suominen [38]. The workflow presented in Sections 2 and 3 is summarized in Figure 4.

Rationale for Selecting the Wavelengths R rs (510) and R rs (620)
In Suominen [38] the wavelengths R rs (443), R rs (560) and R rs (665) were used as they are more closely connected to CDOM, phytoplankton and SPM. In addition, the algorithm for the k-means clustering of time series was essentially the same as here, with the difference that the algorithm was implemented in another software platform, R. The R implementation was not efficient enough to handle the number of time series used in this study within a Remote Sens. 2021, 13, 2104 8 of 28 reasonable computation time. This limitation restricted the spatial coverage of the previous study to the northernmost areas of the Baltic Sea. Likewise, it was not possible to form the DTW distance matrix for the preliminary hierarchical clustering, as we have done in this study. Therefore, this present study provides new insights both spatially and methodologically.
It is essential to note that we are not clustering or classifying a remote sensing reflectance value of a certain time step in relation to another reflectance observation at the same time. Instead, the units to be clustered are time series, which are clustered together according to the similarities in their shapes. The selection of the wavelengths is thus mainly based on their ability to differentiate temporal patterns and changes in time and to a lesser extent on their ability to observe bio-geophysical characteristics of water surface in a conventional manner.
When selecting the wavelengths for this study we emphasized how well the reflectance time series were clustered. In Suominen [38] repeated k-means clusterings were done with random initial cluster prototypes. This way we determined which window constraints, DTW step patterns, and wavelengths ended up with the most valid results, in other words, by which preliminary selections the clusters were internally most coherent and by which selections the repetitions of the non-deterministic k-means method produced mostly labile clusters on a map.
The time series clustering produced mostly labile clusters for R z (560), while R z (443) and R z (665) performed worse. Clusters based on R z (443) were internally less coherent than the clusters based on R z (560) and R z (665). In order to select the bands for the present study, we also computed the correlation coefficients between simultaneous observations at the chosen wavelengths (R rs (443), R rs (510), R rs (620), and R rs (665)) and their standardized counterparts (Table 1). Between the observations of R rs (443) and R rs (510) the coefficient was high r = 0.92, and between R rs (620) and R rs (665) the coefficient was even higher, r = 0.99. Between R rs (510) and R rs (620) the coefficient was lower, r = 0.68. The high mutual correspondence of the two analyzed shorter wavelengths are the result of their very similar reflectance signals: they are more influenced by atmospheric disturbance [65,77,78]. Furthermore, the time series distance measures are susceptible to noise in the observational data and in the atmospheric correction phase [65,66,79]. Since the temporal pattern of R rs (443) and R rs (510) had similarities in their dynamics, we chose the wavelength from the green part of the spectrum R rs (510). R rs (510) and R rs (620) did not share similar dynamics, whereas R rs (620) and R rs (665) did, although their levels differed. However, we standardized the reflectance data, and ended up with the time series R z (620) and R z (665), which had very similar shapes. Thus, the wavelength R rs (620) was chosen as a second band. Table 1. Correlation coefficients between simultaneous observations of R rs (443), R rs (510), R rs (620), and R rs (665) (right) and their standardized counterparts R z (443), R z (510), R z (620) and R z (665) (left).

Constructing Cost Matrix between Two R rs (λ) Time Series by Applying DTW
There are several available implementations of DTW computations and time series clustering, e.g., in R, but some of these implementations quickly become time consuming if there is a large number of time series and we have many clusters to compare against. In k-means clustering, for p clusters of length m and N time series of length n and no time window constraint, one k-means iteration step requires O(pmNn) floating point operations. A closer inspection of the calculations reveals that DTW can be performed in parallel for each time series and each cluster. Also, in order to maximize computer memory cache behavior, it is advantageous to use two loops, an outer loop for the time series, and an Remote Sens. 2021, 13, 2104 9 of 28 inner loop for the cluster prototypes. We decided to implement DTW in C using Advanced Vector Extensions (AVX) vector instructions in order to fully utilize the parallelism of the task. The AVX registers are 256 bits wide, accommodating 8 single precision values. On a machine with N cores running OpenMP we can perform 8N DTW calculations in parallel.
To perform k-means and hierarchical clustering of the satellite time series we use dynamic time warping as the metric to measure the distance between two time series X = {x 1 , x 2 . . . x i . . . x n } and Y = {y 1 , y 2 . . . y j . . . y n }. We do not align time values from different years, but instead the full alignment over 2004-2011 is the sum of the alignments over the individual years. The similarity measure is now the alignment, which has either the smallest sum of the absolute difference (L1 norm) or, as in our case, the square root of the sum of the squared differences (L2 norm) of the aligned points x i , y j . This alignment process is similar to the Needleman-Wunsch global alignment algorithm [80] but possibly with different step patterns. In our case we supplemented the algorithm with a time window constraint T, such that the indices of aligned points x i , y j may differ at most by T: |i-j| <= T. Any alignment beyond T days receives a very large, essentially infinite, cost. The impact of different values of the window constraint T was studied in Suominen [38] and in the present study T is kept relatively short at +-10 days. For N time series {X i }, the N 2 cost values may be calculated once and stored as a distance matrix D ij between the time series X i , X j . In our case with N = 12,252, this was feasible as symmetry D ji = D ij allows us to store only N 2 /2 values, requiring roughly 1 GB of storage. For substantially larger sets of time series storing distances becomes rapidly infeasible, and the DTW distance must be calculated online. The calculation of the D ij s can be done in parallel as the evaluation of any D ij can be performed independently of other distances.

Clustering of R z (λ) Time Series
Time series clustering is an intermediate phase while searching for areas of divergent R rs (λ) dynamics. The ultimate objective of clustering is to define a set of typical dynamics, i.e., prototypes for the clusters. We analyze similarities and dissimilarities between R z (λ) time series using unsupervised k-means clustering and dynamic time warping. K-means clustering is an iterative procedure where each cluster is represented by a prototype time series, and in our case dynamic time warping is used to determine which cluster any given bin time series belongs to by comparing the bin time series against the cluster prototypes. Once all bins have been assigned to a cluster, the cluster prototype is updated using the time series belonging to that cluster, and the iterative process is repeated until very few bins, say less than one per mille, move between the clusters during one iteration.
K-means clustering of the time series starts with the choice of k initial cluster prototypes, where k is the number of clusters sought. If the initial prototypes were selected randomly, the results of the partitions varied, although the overall picture of the geographical borders of different clusters remained alike. The distinctions between the dynamics in the Baltic Sea area are small and cluster validity indices did not show clear evidence of how to select the initial prototypes [38]. In order to make the k-means clustering deterministic, the DTW distance matrix, which was computed in the earlier step, was first clustered by agglomerative hierarchical clustering (complete linkage) in order to form 6-12 clusters. The prototypes of these clusters were computed in our case with DTW barycentric averaging (DBA) [44], in which each time value of the cluster prototype is updated according to the barycenter of the values of the time series belonging to the cluster. The satellite image time series having the shortest DTW distance to the cluster prototypes were identified and these time series were used as initial prototypes in k-means clustering. Hierarchical clustering of the distance matrix and computing the prototypes were conducted in R software with the dtwclust package [81].
The DTW distances between each bin time series and the initial prototypes were then computed. Each time series is associated to the closest prototype as based on the DTW distance and then a new, refined cluster prototype is computed by applying DBA. For each iteration the k-means clustering calculates the DTW distance to the cluster prototypes and reassigns the time series among the clusters, according to the DTW similarity measure. Again, this calculation can be performed in a highly parallel manner as all DTW calculations are independent of each other. The updating of the DBAs, however, requires atomic operations due to data races when adding the contribution from each time series to the DBA. The parallelization and the associated atomic operations were implemented in C using OpenMP for parallelization and OpenMP reduction clauses for the updating of the DBAs.

Selecting the Value of k
The clusterwise mean and total mean Silhouette Indices [82] for k clusters of R z (510) and R z (620) illustrate the cluster characteristics ( Table 2). For each observation bin i, the Silhouette index s(i) is defined as where a(i) is the average distance between i and all other points in the same cluster C to which bin i belongs, and b(i) is given by b is the average distance of i to all observations in cluster C, where C runs over all clusters other than C.
Higher values of the clusterwise mean indicate that the cluster is internally more coherent and better separated from other clusters. In all partitions with k = 6-12 and for both R z (λ) the clusters located in the Gulf of Riga, in the eastern coast of the Gotland Basin and the Gulf of Finland were among the weakest clusters (Table 2). According to the Silhouette index, the optimal value of k was 7 for R z (510), although the indices for other values of k were close to it. For R z (620) the optimal value of k would be 6 or 7, whereas the index value dropped with higher values of k. In case the aforementioned weakest clusters are neglected, the rest of the clusters are relatively cohesive also for higher values of k. A higher k was considered to give better insight to spatial divergences of dynamism and k = 11 was chosen to be the best option.
A number of cluster validity indices (CVI) are available to define the proper number of clusters for analysis. The selection of k is typically made by applying multiple cluster validity indices and the majority rule is used to select the appropriate number of clusters. CVIs give preliminary information about the proper number of k, but the differences can be negligible and therefore their interpretation may end up being artificial. In our study area and with the aforementioned time series similarity measures, the distinctions between the clusters are small, i.e., the clusters are not clearly deviating from each other (see also Suominen [38]). Still, the time series form spatially distinctive pixel group distributions without any information about the locations of the time series.

Areas of Distinctive R rs (λ) Dynamics
Based on the calculated distance, clustering methods assign objects to the nearest cluster prototype giving the impression that an object clearly belongs to one specific cluster while being far away from other clusters. However, such hard partitions between clusters may be the result of only marginal differences in distances to neighboring cluster prototypes. Clusters with hard partitions do not describe the nature of open and coastal seas adequately [38,39]. Spatially bio-geophysical and optical properties are typically changing gradually and the borders of the partitions are changing over time. Moreover, hard partitions give a black-and-white impression about the major groups of dynamics without essential information about the cohesion of the clusters. To avoid visual misinterpretations on the spatiality of reflectance dynamics, we computed DTW distances from each cluster prototype to all other point-wise R z (λ) time series. This way we visualize the cluster memberships on a continuous scale that facilitates a two-sided evaluation of the interlinkages of dynamics: areas having the most similar dynamics and areas where the dynamics differ most. This approach also diminishes the importance of selecting the optimal value of k when interpreting the results.

Relations between Physical Factors and R z (λ)
The distances between the time series of the chosen physical factors (four wind components, precipitation, ∆SST) against R z (λ) were computed for each year separately (107 observations per year) and the annual distances were summed. In the R z (λ) clustering, the alignments were allowed within a +-10 day time window. In the cases of physical factors and R z (λ), only forward-looking alignments were allowed, i.e., a time step in the physical factor time series was allowed to be aligned only with time steps in the future in the R z (λ) time series. The wind components and the precipitation were allowed to align with R z (λ) observations within the next 10 days and ∆SST with R z (λ) observations within the next 5 days. The DTW distances were normalized by dividing them by the sum of the time series lengths (i.e., by 856+856). The relations between the physical factors and the R z (λ) were studied with the R software packages sf [83], geosphere [84], raster [85], ncdf4 [86], rgdal [87] and dtw [75].
The relations between the reflectance time series and the physical factors are demonstrated in Figures 5 and 6. At the earlier date there are cyanobacterial blooms in the Gotland Basins inside the ellipses in Figure 5. Blooms are not extensively occurring in the Bothnian Sea but there is resuspended material in the seawater within a narrow coastal zone of the eastern side of the Bothnian Sea (A) around the red dot in Figure 5. Resuspension is a consequence of preceding rainy and windy periods ( Figure 6). Also, the SST has changed to colder at the same time with the elevated resuspension. A week later, calm and dry weather prevails in the Baltic Sea region and a cyanobacterial bloom forms clear spatial patterns in the Gotland Basins (B). Cyanobacteria has also accumulated on the sea surface in the eastern Gulf of Finland (C) and in the open Bothnian Sea (D), due to lower winds. The resuspension has faded in the coastal zone of the eastern side of the Bothnian Sea around the red dot. SST has changed to warmer and the reflectance has decreased (Figure 6). strated in Figures 5 and 6. At the earlier date there are cyanobacterial blooms in the Gotland Basins inside the ellipses in Figure 5. Blooms are not extensively occurring in the Bothnian Sea but there is resuspended material in the seawater within a narrow coastal zone of the eastern side of the Bothnian Sea (A) around the red dot in Figure 5. Resuspension is a consequence of preceding rainy and windy periods ( Figure 6). Also, the SST has changed to colder at the same time with the elevated resuspension. A week later, calm and dry weather prevails in the Baltic Sea region and a cyanobacterial bloom forms clear spatial patterns in the Gotland Basins (B). Cyanobacteria has also accumulated on the sea surface in the eastern Gulf of Finland (C) and in the open Bothnian Sea (D), due to lower winds. The resuspension has faded in the coastal zone of the eastern side of the Bothnian Sea around the red dot. SST has changed to warmer and the reflectance has decreased ( Figure 6).

Areas of Distinctive R rs (λ) Dynamics in Two Spectral Channels
The k-means clustering of reflectance times series based on their DTW distances resulted in spatially coherent clusters, i.e., only a few bins were not spatially connected to some larger cluster entity. Sharply edged clusters are desirable compared to diffuse clusters as this reflects the method's ability to aggregate areas with similar dynamics without any information about the bin locations. According to the binwise time series distances from the prototypes of the cluster in which the bin is located, the binwise time series distances to the cluster prototypes in question are typically shorter in the northern sea areas than in the southern and eastern basins.
The Bothnian Bay and the Bothnian Sea (Figure 1, locations 1 and 2) had similar spatial clusters with normalized reflectance of R rs (510) and R rs (620), although with R rs (620) the eastern coasts formed two separate clusters whereas in R rs (510) they belonged to the same cluster (Figures 7-10). In the next characterization of the areas of distinctive R rs (λ) dynamics in Figures 7 and 9, we use the signs (-) for values less than the average, (o) for values near the average, and (+) for values higher than the average for the most typical temporal patterns during the early, mid, and late summer.
The eastern coasts of the Bothnian Sea and the Bothnian Bay did not show clear annually repeated temporal patterns neither with R z (510) nor with R rs (620). The reflectance dynamics of the open and western Bothnian Bay clearly deviated from the Bothnian Bay eastern coast. The dynamics of R z (510) in the western and open Bothnian Bay showed weak annually repeated temporal patterns (oo-). The R rs (620) in the open and western Bothnian Bay was declining through the summer, resulting in the temporal pattern (+o-). The dynamics of R rs (510) and R rs (620) in the open Bothnian Sea followed the pattern (-+-). The levels of R z (510) and R z (620) were lower in the western Bothnian Sea (Figure 2) and there were no recognizable temporal patterns neither with R z (510) nor with R z (620).  The pelagic and western areas of the Baltic Sea main basin (Figure 1, locations 3, 4, and 5) share similar dynamics (-+-). The temporal pattern is more obvious with R rs (620), but it is also seen with R rs (510). The main basin is divided into three clusters with R rs (620), but all cluster prototypes resemble each other. The reflectance peak in the main basin occurred a few weeks earlier than in the open Bothnian Sea. In computations of DTW alignments the time window was set to +-10 days, so despite the concurrences of the prototypes, some temporal divergence might still occur.
The near coast areas in the coastal zone of Eastern Gotland Basin (Figure 1, location 5), the coasts of Gulf of Riga (location 7), and the eastern Gulf of Finland (location 6) were clustered together with both spectral bands. The weak temporal pattern of the R rs (510) on the coasts of Gulf of Riga and in the eastern Gulf of Finland was (-oo). Internal cohesion and separation from other clusters were the weakest (Table 2) and the cluster was not clearly separated from surrounding areas. The cluster was however repeatedly formed also with other values of k, indicating that it forms a functional area. Also for R rs (620) the prototype showed a relatively even variation through the summer period (-oo).The outer zone of the coasts of Eastern Gotland Basin (Figure 1, location 5) and central parts of the Gulf of Riga (location 7) formed a cluster with R rs (620). With R rs (510) the Gulf of Finland (location 6) belonged to the same cluster, whereas with R rs (620) the Gulf of Finland was separated to a cluster of its own. The distances from the cluster prototypes to the time series of the main basin and the Gulf of Finland are short in both spectral bands, indicating that these areas share common features in their dynamics. The dynamics of R rs (510) followed the temporal pattern (-+-). Despite the annually repeated temporal pattern, the cluster was relatively weak (Table 2). Also with R rs (620) the cluster had the temporal pattern (-+-), but it was the least cohesive (Table 2). According to the prototype, the peak in midsummer was not as clear as in the open sea. The temporal pattern of the Gulf of Finland was (-+-). With R rs (620) the period of elevated reflectance is longer than in the southern and central parts of the main basin. Figure 9. Areas of distinctive R rs (620) dynamics Cn where n = 1 to 11 based on R z (620) and the binwise time series DTW distances from the prototypes of the clusters. The color coding of the DTW distances in C1 to C11 is the same as in Figure 7. The clustering result is given in the image "C1-C11" with a specific color for each cluster. Figure 10. The R z (620) prototypes of the clusters, with the colors from the image "C1-C11". A moving average of ten days was used for a clearer presentation.

Relations between Physical Factors and Reflectance
East-and northward winds raised the reflectance of both spectral bands on the eastern sides of the basins ( Figure 11A,B and Figure 12A,B). The effects of south-and westward winds were less evident or sharp-edged although they were followed by increased reflectance in large open sea areas in the western side of the basins ( Figure 11C,D and Figure 12C,D). Precipitation had minor local effects on reflectance dynamics in some nearcoast areas ( Figure 13A,B). Precipitation had no effect on reflectance dynamics in the open sea, only in the Bothnian Sea there was a slight link between precipitation and R rs (620) ( Figure 13B). An elevated sea surface temperature was followed by elevated reflectance and vice versa in the open sea and western parts of the basins (Figure 14A,B). An opposite case where a change in SST to colder was followed by elevated reflectance (and vice versa) was typical in the coastal zones on the eastern sides of the basins (Figure 14C,D).    Upper: Time series distances from (standardized) SST change ∆SST z to R z (λ). In reddish areas a change towards warmer SST was more closely followed by an elevated reflectance than in blueish areas. Lower: Time series distances from (standardized) opposite SST change -(∆SST z ) to R z (λ). In reddish areas a change towards colder SST was more closely followed by elevated reflectance than in blueish areas. (A) north wind, (B) east wind, (C) south wind, (D) west wind.

Areas of Distinctive R rs (λ) Dynamics and Their Relations to Physical Factors
The primary factor behind the open water season reflectance dynamics of the open Baltic Sea is the abundance of phytoplankton. In the near coast areas, the rivers and other point sources carry varying amounts of natural and anthropogenic optically active substances, which disturb the regular dynamics. Other factors changing the dynamics include the waves, the resuspension of bottom material, and the exchange of waters with diverging optical properties between major basins. The anti-clockwise circulation pattern of the basins can be identified from the shapes of the clusters. According to the binwise time series DTW distances from the prototypes of the clusters, the edges of diverging dynamics are relatively sharp in the northernmost basin, whereas in the main basin and in the basins on the eastern side, the change in dynamics is more gradual. The proper methodological choices and the dynamics of the northernmost basins were studied with different DTW algorithms and partitioning methods in Suominen [38], and on a general level the spatial patterns were similar to the patterns observed in this study.
In addition to abundant CDOM [88], the optical characteristics of the Baltic Sea are strongly influenced by SPM, particularly in close proximity to the coast where SPM originates from river discharges and coastal erosion [56]. After the deposition of inorganic particles near the shore, the open sea SPM consists almost solely of phytoplankton and/or cyanobacteria [57]. Thus, there are two different water types in the Baltic Sea, open sea vs. coastal waters, indicating distinct optical regimes that may require different ocean color parametrization. This is needed not only when utilizing MERIS archives but also to improve the retrieval from Sentinel-3 OLCI data [88].
The optical characteristics of the Bothnian Bay (Figure 15 right, brown and light brown) are primarily caused by the large rivers of the northern and western coasts. Inflows of the large rivers of the northern Bothnian Bay coasts are humic [89,90] but they carry less suspended matter. Towards south along the coast, the rivers flow through flat agricultural landscapes and the suspended matter is more abundant. The reflectance in the open sea (dark brown) decreases from the early summer towards the autumn. The waters are relatively oligotrophic compared to the other regions of the Baltic Sea and the phytoplankton does not form extensive blooms. The eastern coast of the Bothnian Bay (light brown) is shallow, resulting in resuspension of sediments and elevated reflectance when northward and eastward winds prevail ( Figure 11A,B and Figure 12A,B). On the western coasts, westward winds elevate reflectance in places, but resuspension is lower than on the eastern coasts ( Figures 11D and 12D). Precipitation affects especially R rs (510) on both sides of the Bay of Bothnia ( Figure 13A). The anti-clockwise surface layer circulation of the basin circulates the river inflows and the resuspended matter. In the open sea and western coasts the positive change in SST is followed by elevated reflectance and vice versa ( Figure 14A,B). Again, the eastern coast behaves differently and a positive change in SST is followed by decreasing reflectance, while a change to colder is followed by increased reflectance (Figure 14C,D). In the former case the elevated reflectance may be caused by increased primary production during sunny periods whereas on the eastern coast the elevated reflectance may follow from a vertical circulation of waters during storm events, resulting in colder surface layer waters and the occurrence of resuspended seabed material.
A dominant feature of the Bothnian Sea (Figure 15 right, blue) reflectance dynamics is the divergence of the eastern coasts and the pelagic. The open Bothnian Sea (midblue) originates from the northward surface layer water exchange with the Baltic Sea main basin in the south. The northern Baltic Proper suffers from higher concentrations of inorganic nutrients and excessive phytoplankton blooms in late summers. The open Bothnian Sea cluster reflects the inflow of waters with divergent optical properties, the drifting of phytoplankton towards north, and autochthonous primary production under favorable conditions in the open Bothnian Sea. The increased primary production in mid and late summer is seen in the temporal pattern (-+-). The western Bothnian Sea (light blue) is affected by the water exchange with the Bothnian Bay in the north, which is seen especially in R rs (510) (Figure 2). The region is characterized by clearer waters than the pelagic and the eastern coast, and there are no recognizable temporal patterns in the region. Eastward and northward winds are followed by elevated reflectance on the eastern coasts ( Figure 11A,B and Figure 12A,B). According to Kratzer et al. [56], there are elevated concentrations of inorganic suspended particulate matter in the near shore zone of the eastern Bothnian Sea. After westward and southward winds an elevated reflectance occurs in the open and western side of the Bothnian Sea ( Figure 11C,D and Figure 12C,D). It may indicate the drifting of more turbid coastal waters towards west, increased sun glint due to waves or a resuspension of seabed material of shallower regions on the western side of the Bothnian Sea, but they are not connected to coastal processes as they seem to be on the eastern side. An SST change to colder is followed by elevated reflectance on the eastern coast ( Figure 14C,D). Upwelling may cause a phosphate enrichment of the surficial waters and intensify the cyanobacterial bloom [91,92]. In this research the SST was allowed to align with reflectance observations within the next 5 days, which is a short response time for elevated primary production, due to upwelled waters. On shorter time scales the upwelling decreases the amount of cyanobacteria as it removes the surface waters [92], but this was not seen in our data. The biological responses are probably too slow to be identified with reflectance data only before they are buried under other coastal processes, and the change to lower temperature/elevated reflectance is caused by the vertical mixing and resuspension during storm events. Precipitation is followed by elevated R rs (510) in some limited coastal areas, but interestingly precipitation is followed by increased R rs (620) also in the open Bothnian Sea ( Figure 13B). The linkages are relatively weak, but one factor behind the increased R rs (620) may be the atmospheric deposition of inorganic nutrients. Although atmospheric deposition alone is not sufficient to initiate a bloom [93], it may cause a slight increase in primary production.  (Figure 2) within the area, enabling resuspension of seabed material, but other explanations may be long fetches, wave crests and sun glint. In the northern parts of the main basin the increased SST is followed by increased R rs (510) and R rs (620) and vice versa ( Figure 14A,B), which is probably connected to the accumulation of cyanobacteria to the surface layer during and after less windy and sunny periods.
The coastal zone of Eastern Gotland Basin, the coasts of Gulf of Riga and the eastern Gulf of Finland (Figure 15 right, yellow) are influenced by large rivers. Shallow and sandy coasts are extensive in the S and SE part of the Baltic Sea. They are unstable environments due to, e.g., erosion during winter and deposition of sand on the beaches in the summer, and to the constant shifting of the substrate by winds and currents [94]. Although the annual succession of phytoplankton inevitably affects the optical properties of the waters, the reflectance dynamics is governed by coastal processes like seabed resuspension after east and northward winds ( Figure 11A,B and Figure 12A,B). A removal of turbid (resuspension, primary production) coastal waters after southward winds with the consequent upwelling and the cooling of the surface layer [95] was not recognized from reflectance data. Reflectance was increased in a narrow coastal zone of western Latvia and Lithuania, and in the shallow sea areas in northern Gulf of Riga and eastern Gulf of Finland after a cooling of the surface layer ( Figure 14C,D), which is probably linked to a resuspension of fine material [96] and a vertical mixing of waters during periods of strong winds. Also strong precipitation increases the level of R rs (510) and R rs (620) in narrow coastal strips of western Latvia and Lithuania ( Figure 13A,B) due to discharges and washed out material from land, as reported also in [56]. In this sense the eastern coasts of the main basin and the Gulf of Finland have characteristics similar to the eastern coasts of the Bothnian Bay and the Bothnian Sea.
The reflectance dynamics (-+-) in the outer zone of the coasts of Eastern Gotland Basin, the central parts of the Gulf of Riga (Figure 15 right, red) and the Gulf of Finland (Table 2 chart, dark green) is typical to areas where the annual succession of phytoplankton dominates the temporal pattern. The harmful algae blooms are frequent both in the main basin of the Baltic Sea and the Gulf of Finland. In the Gulf of Riga and in the zone immediately outside the influence of coastal processes, a continuous flow of nutrients and reduced water exchange maintain eutrophication and excessive primary production through the summer, whereas in the less altered regions the inorganic nutrients are consumed more rapidly.
Comparing the dynamics of the reflectance data and the selected physical factors reveals linkages between the physical environment and the EO data. Especially the wind has a clear effect on reflectance, mainly because it regulates the surface layer mixing. In water areas near the coastline, this was seen as increased resuspended material. Also the linkage between SST and reflectance was connected to both the vertical mixing of colder waters and the behavior of phytoplankton under different weather conditions.

Applicability of the Proposed Workflow to Define Areas of Distinctive R rs (λ) Dynamics
Although the clusters were not clearly distinctive according to the selected cluster validity index, they formed spatially coherent and logical entities. This was achieved without a priori information about the locations of the time series. In our approach the clustering is a preprocessing step which helps to determine prototype time series and areas of distinctive reflectance dynamics as gradually changing surfaces. Clusters with hard borders do not provide an objective or illustrative view on the gradual nature of the sea environment.
We used hierarchical clustering to predetermine the initial prototypes for partitional clustering since the results of the partitional clustering were not fully deterministic if the initial prototypes were selected randomly. In general, the shapes and the locations of the clusters remained alike also with random initial prototypes. In addition the clusters were divided logically even if the value of k was changed. The method is able to manage large datasets in reasonable computation times (minutes and hours, not days) if k-means clustering alone is used. Hierarchical clustering may still be used to predetermine initial prototypes. However, in order to limit the dimensions of the DTW distance matrix it might be necessary to sample the full data set first.
Dynamic time warping, and other techniques that allow flexible alignments between non-simultaneous observations of time series, has a clear advantage in dynamic marine environments compared to techniques in which the lag between the aligned observations is rigid. Regardless of the partition method, defining areas of distinctive dynamics requires good quality EO data and long enough time series. In our case the data consisted of eight years, but the annual periods were limited to three and a half months. The compiled EO data products originating from multiple EO sensors offer an interesting source for future research also on a global scale.

Conclusions
We define our approach as ocean surface dynamics partitioning as the unit of partitioning is dynamics as a whole. The surface layer dynamics partitioning succeeded to identify diversity in a marginal sea, where the dynamics is unclear at times. Although we found areas with regularly repeated temporal patterns, random and annually unrepeated dynamics prevail in some areas. The reasons behind their seemingly stochastic variations are multifold (e.g., currents, discharges, resuspension, upwelling), but they are still regarded in this study as functional areas of their own together with spatially coherent areas having more traceable causalities. Functional areas are affected by the same dominant environmental factors even if the cycles of the water properties are not annually repetitive or the value of the observed parameter is not on the same level throughout the group [38]. Thus, identifying areas with similar dynamics offers important new insights to practical water management. The importance of partitioning oceans by their dynamics or aligning different time series (e.g., EO, atmospheric, marine) in order to study their causalities may be emphasized when biological and ecological information is needed as background data for models. They may work as an aid to interpret the interlinkages between biological processes and environmental forcing [29]. This may be especially useful in boreal environments where the seasonal changes of physical factors are large. EO data and partitioning by dynamics offer a more holistic view of environmental forcing than sparsely observed in situ data, mainly obtained during the most intensive periods of primary production. We applied our approach to a marginal sea in northern Europe, but any area with temporally changing dynamics or processes on different scales can be analyzed using the approach demonstrated here.
Typically the cycles indicate the natural state of the system as seasonal climatic conditions are reflected in the properties of ocean water. The degradation of the aquatic environment is frequently seen as a trend as an anthropogenic influence is gradually changing the aquatic environment. Sometimes the water properties may alter seemingly randomly since they are outcomes of multiple simultaneous processes. This contingency may result from natural causes (e.g., the effect of a river at a given point in the sea is dependent on the discharge and the sea currents of the day), but sometimes a natural cycle may be disturbed by some external factor which makes the system unstable. We conclude that following the changes in aquatic realms by biogeochemical observations at certain temporal intervals alone is not sufficient for identifying environmental shifts. We foresee that the changes in dynamics are sensitive measure of environmental threats and therefore they will be important to follow in the future.