Synergy between Ocean Variables: Remotely Sensed Surface Temperature and Chlorophyll Concentration Coherence

: The similarity of mesoscale and submesoscale features observed in different ocean scalars indicates that they undergo some common non-linear processes. As a result of quasi-2D turbulence, complicated patterns of ﬁlaments, meanders, and eddies are recognized in remote sensing images. A data fusion method used to improve the quality of one ocean variable using another variable as a template is used here as an extrapolation technique to improve the coverage of daily Aqua MODIS Level-3 chlorophyll maps by using MODIS SST maps as a template. The local correspondence of SST and Chl-a multifractal singularities is granted due to the existence of a common cascade process which makes it possible to use SST data to infer Chl-a concentration where data are lacking. The quality of the inference of Level-4 Chl-a maps is assessed by simulating artiﬁcial clouds and comparing reconstructed and original data.


Introduction
Global maps of Chl-a concentration at sea surfaces, similar to what happens with many other remote sensing products, suffer from data gaps. This is especially true for high-spatial resolution (few kilometers) daily maps, where orbital voids add to the data loss due to clouds, aerosols or sunglint. For this kind of products, gap-filling and noise-reduction techniques are required to provide uniform products suitable for environmental monitoring or the initialization of ecosystem models. To achieve this goal, data interpolation and noise reduction are commonly performed using univariate methods as, for example, [1]. Multivariate methods have also been applied to generate uniform products that combine information from different sensors; examples of multivariate approaches include optimal interpolation [2], empirical orthogonal functions [3,4], classification methods [5] and more recently using data-diven methods as analog data assimilation [6,7]. The method discussed in this paper is based on the empirical perception that fronts, eddies, and filaments are identifiable in the satellite imagery of different ocean variables (sea level, ocean color, sea surface temperature, sea surface salinity) and, sometimes, even in the raw radiance recorded by the satellites [8,9]. Remote sensing and drifter data have shown that the statistical distribution of surface ocean fields has characteristics of a flow in fully developed quasi-2D turbulence. Owing to the underlying turbulence of the ocean, the application of a technique known as singularity analysis [10] allows to retrieving information about ocean currents from images of any advected scalar.
The analysis of the spatial variability of pigment images found no difference between the wavenumber spectra of SST and ocean color maps, leading to the conclusion that phytoplankton cells in dynamic areas effectively behave as passive scalars [11,12]. In [13] it was shown that sea surface temperature and chlorophyll maps have the same multifractal structure. This fact can be interpreted as a consequence of the turbulent advection at the scales of observation and due to the existence of a common cascade process [14]. The multifractal structure of any scalar can be characterized through singularity analysis [15]. Singularity analysis is any technique capable of calculating a map of the singularity exponents associated with a given scalar. A singularity exponent is a non-dimensional number that characterizes the sharpness, or regularity, of the scalar around a given point [10]. Nieves et al. [13] showed that the singularity exponents extracted from SST, Chl-a or brightness temperature maps are very similar, a fact that was interpreted as an effect of the advection term. Notice that the equivalence among singularity exponents does not require advection to be the largest term in the equation of evolution. Advection is a non-linear term in the equation of evolution, and thus it is at the origin of singularities.
That the common singular structure of variables advected by the turbulent ocean has been exploited to reduce the noise in SMOS sea surface salinity maps using data from auxiliary, high-quality SST maps [16,17]. The same algorithm can be used not only to reduce noise but also to extrapolate provided that the template is defined at the missing points. Although the theoretical foundation of the data fusion method is relatively complex, its practical application is rather simple: a non-parametric, weighted local linear regression between Chl-a and SST maps are calculated at each point, then the regression parameters obtained at each point are applied to the value of SST at that point to infer a new variable of Chl-a.
The goal of this work is to assess the ability of a local linear regression approach to fill remote sensing chlorophyll concentration data gaps. This local relation between SST and Chl-a can be built from a single snapshot of both variables, so the implementation of the method is simple, computationally fast, and no training set is required; moreover, the method is well fit to deal with strong, rapid changes in the dynamics of the flow as far as it is always turbulent. Additionally, the analysis of the auxiliary parameters of the fusion algorithm allows characterizing the modulation of the seasonal correlation modulation between Chl-a and SST variables.

Satellite data
Two Aqua-MODIS ocean products from the NASA Aqua spacecraft are used in this study: Chl-a concentration Level-3 daily product, and standard MODIS Aqua Level 3 SST Thermal IR Daily 4 km Nighttime, both at 4 km × 4 km grid [18]. Our dataset period goes from January to December 2006. The data were downloaded from the Ocean Color web portal, http://oceancolor.gsfc.nasa.gov/. According to many studies, it is assumed that Chl-a follows a lognormal distribution [19] and so we work with the logarithm of chlorophyll concentrations for ease of use (theoretically the singularity exponents should not change if a monotonic function is applied to the data [10], but it is numerically more stable to work with a dynamic range comprising fewer orders of magnitude).

Singularity analysis
Singularity analysis is the key stone of the so-called Microcanonical Multifractal Formalism (MMF) [10]; where the fluid is understood as a hierarchical arrangement of different fractal components, each own with its own fractal dimension and characterized by a particular value of the singularity exponent. Singularity exponents are related to the ocean circulation and thus they are not specific to any particular scalar under study. In fact, it has been verified that with a good approximation singularity lines coincide with the streamlines of the flow [20], confirming that singularity exponents are characterized by the flow and not scalar-specific. The emergence of such a singular structure is the result of the advective forces acting on a quasi-2D turbulence regime, and can be thus observed for scales ranging from kilometers to the planetary scale.
The calculation of singularity exponents in a given noisy, discretized signal requires the use of an appropriate interpolation scheme, the most usual one being wavelet projections. Let s be an arbitrary 2D scalar signal. It will be said that s has a singularity exponent h(x) at the point x if, for any wavelet function Ψ the following relation holds: where r stands for an arbitrary scale parameter (normalized by the integral scale so it is dimensionless and smaller than 1) and o r h(x) is a term that becomes negligible compared to r h(x) when r goes to zero. The amplitude function α(x) does not depend on the particular scale at which the wavelet projection is calculated and has the same units as the gradient of the scalar. The left hand side, i.e., T Ψ |∇s|(x, r), is called the wavelet projection of the gradient modulus of s over the wavelet Ψ, and represents a local zoom of variable size around the point x. What is important in Equation (1) is the function h(x), which is called the singularity exponent of the function s at the point x. The singularity exponent is, by construction, a dimensionless measure of the regularity or irregularity of the function s around the point x.

Fusion method
As discussed in Turiel et al. [10], ocean scalars as chlorophyll and temperature (denoted respectively by c and θ) are multifractal in the microcanonical sense. This implies that a singularity exponent can be assigned at each point and that they are arranged in a particular way, which in turn implies that they are related to a property of the flow (its circulation) rather than to any specificity of the associated scalar.
Assuming that the singularity lines of Chl-a and SST coincide as it was shown in Nieves et al. [13], it follows that their gradients must be related by a smooth 2 × 2 matrix ρ [17] : Although a large number of ρ( x) verifying Equation (2) exist, the indetermination is considerably reduced by imposing that the matrix ρ( x) must be smooth: a non-smooth ρ( x) would be a source of singularities, and hence the singularity lines of c and θ would differ. If we further assume that the gradients of c and θ are aligned, then Equation (2) leads to: with smooth functions a and b (i.e., they have small gradients as compared to those of c and θ).
Let us now suppose that the function c is contaminated by some source of noise n, so in fact our data sets consists of θ( x) (assumed noiseless) and c ( x) = c( x) + n( x). We can construct a filtered version of c, denoted by c f , as: whereâ( x) andb( x) are estimates of the actual parameters a( x) and b( x) in Equation ( The total weight of a point x, N( x), is defined as follows: The weighted regression uses all available data of each field, using the function N( x) that decay as a power-law with the distance (Equation (5)); thus it is scale-invariant, as it does not introduce any preferred scale [16,17].
Withâ( x) andb( x), an estimation of the chlorophyll c f may be obtained applying Equation (4) as soon as the value of θ at that location x is available. This algorithm was shown in Umbert et al. [17] to lead to a large increase in the quality of SMOS SSS maps using OSTIA SST maps as a template; besides, the method leads to significant restoration of the structure of singularity exponents. The values of Chl-a are derived from measurements in the visible part of the spectrum, which may be affected by artifacts like aerosols, sun glint and high turbidity of the water. Several flags are introduced to characterize Chl-a data quality and a result, maps of Chl-a usually suffer from a larger incompleteness than those of SST, which are derived from infrared measurements in the 4 µm range. An example of the application of the fusion algorithm in the global ocean for 1 January 2006 is shown in the bottom panel of Figure 1. As stated before, the algorithm allows estimating the local Chl-a-SST regression by taking into account all possible couples of data (SST,Chl-a) weighted using function N( x) in Equation (5). Fused Chl-a maps integrate the relation between structures present in the SST and Chl-a specific structures. In fused daily products we recognize some expected Chl-a global patterns, near the ocean surface, where availability of sunlight is not limiting, phytoplankton growth depends on temperature and nutrient levels. High chlorophyll concentrations are found in nutrient-rich, cold polar waters and where ocean currents cause upwelling, which brings nutrient-rich deep-cold water to the surface.

Scalar synergy
We will focus in the region of the Gulf of California, a narrow sea between mainland Mexico and the Baja California peninsula, where high primary productivity levels are found as a result of an efficient nutrient transport of waters from under a shallow pycnocline into the euphotic zone [21]. Figure 2 shows the input variables of our algorithm (top panel: Chl-a, middle panel: SST) and the output fused Chl-a (bottom panel). The corresponding singularity exponents are shown in the right column. Rich singularity structures can be recognized in the original chlorophyll concentration maps associated with the fronts mainly caused by the horizontal transport from high primary production areas onto less productive ones. The SST field exhibits also the richness of patterns associated with the circulation in that area. Where both images are not affected by data gaps, the singularity structure (although not the magnitude) is similar between them. Places where the correspondence of both singularity images fails may identify places at which intrinsic dynamics of the variable competes with flow advection.
Once the data fusion is applied, the L4 Chl-a is extrapolated to all the pixels where SST was available. The structure of Chl-a is well represented (compared to the original image), although a smoothing of the original variable degrades the dynamic range of the variable. The singularity exponents of the fused Chl-a data reveal that most of the structures present in the SST map are re-integrated in the chlorophyll map at the same time that Chl-a specific magnitude and structures are maintained; for instance, the strong gradient associated with the biological activity present in the Gulf of California, appear delineated in the fused Chl-a. This implies that the fusion algorithm can partially integrate SST structures without destroying Chl-a ones.

Interpretation of auxiliary parameters
As shown in Equation (4), the functionsâ( x) andb( x) provide information about the local functional dependence between SST and Chl-a. The local slope,â( x), will be negative at those places where SST decreases as Chl-a increases in the neighborhood, and the converse. Considering that cold waters tend to have more nutrients than warm waters, phytoplankton is more abundant where surface waters are cold. So, as we move from one given point to another with colder water, Chl-a would usually increase and thus the slopeâ( x) will be negative. However, the relationship changes from point to point and it should be expected to change from one image to the next one. Figure 3 shows the seasonal average of the slope and intercept (considering winter as January-February-March, spring as April-May-June, summer as July-August-Septemebr and fall as October-November-December).
The coherent patterns of the auxiliary parameters of the method delineate areas with different relation between Chl-a and SST. In the same spirit, Longhurst [22] introduced the concept of ocean biogeochemical provinces, characterized by their particular physical and biological behavior. Longhurst definition is based on the mixed layer depth lying close to the ocean-atmosphere interface. Specific provinces have common characteristics and can generally be classified as four general biomes: the coastal, polar, westerly and trade winds biomes. A visual comparison between local functions of the fusion method and the Longhurst definition is shown in Figure 3.
As expected, negative slopes are present in the upwelling areas associated with the easternmost currents of the great anticyclonic gyres, corresponding to the Benguela and Canary currents in the Atlantic Ocean and the Peru and California currents in the Pacific Ocean. Notice that this negative slope is present all year long. These oceanic areas, as well as the upwelling zones and continental margins, are rich in Chl-a as a result of the proximity to areas where the resurgence of nutrients take place and the local circulation is favorable to nutrient accumulation. A band of cool, chlorophyll-rich water is also apparent all along the equator; the strongest signal at the Atlantic Ocean and the open waters of the Pacific Ocean also leads to negative values of the local interceptb( x). Negative values ofâ( x) are also found in areas where Chl-a concentration decreases as SST increases; this situation, which is typically found in the (oligotrophic) subtropical gyres, intensifies in the Atlantic Ocean during the northern hemisphere winter and spring. The Pacific Ocean exhibits an intensified negative pattern in the Northern subtropical gyre during boreal spring and a negative pattern in the southern hemisphere during austral spring. In both cases, such intensification in the oligotrophic subtropical gyres is driven by the seasonal cycle of sea surface temperature.
Subpolar gyres are also characterized by high Chl-a concentrations linked to nutrient accumulation during winter, when the mixing layer reaches the deep ocean followed by the stratification of the water column during spring. During the dark winter months, the local slope between SST and Chl-a is positive. However, when sunlight returns and nutrients are trapped near the surface during spring and summer, the phytoplankton flourishes in high concentration, seen as negative values ofâ( x).
The local regression coefficient (not shown) has small values along the Equatorial Pacific, and in the Southern and Indian Ocean indicating that horizontal advection of Chl-a cannot be locally explained by SST variability in these regions only. Therefore, either additional variables should be taken into account, or a more sophisticated relation between Chl-a and SST should be used. For example, in ocean regions of High Nutrient Low Chlorophyll (HNLC) as the equatorial Pacific and the Southern Ocean, low Chl-a concentrations are due to a stoichiometric imbalance of iron which have no link to SST. Another possible cause for the low values of the local regression coefficient in some regions is the lack of enough points to provide a quality reconstruction. For instance, 85% of the data points are missing in the Equatorial Pacific due to the large cloudiness.

Validation of reconstruction
To assess the quality of the extrapolation resulting from our data fusion method, we validate it by analyzing 5 different regions (as shown in Figure 4); each region has at least an area of 8 × 8 degrees (200 × 200 pixels). For each of these areas a mask representing a cloud structure (of the typical size and shape found in chlorophyll images) is defined (the clouds are generated using a real MODIS SST 9-km daily image of 1 January 2006, and are kept fixed in time). Those masks have a surface of about 30% of the region on which they will be applied. To assess the quality of the fusion algorithm, we proceed in three steps. In the first step, points lying in the masked area of a daily Chl-a map are removed (notice that there will be additional missing points in the Chl-a map because of the gaps in the original remote sensing product; for instance, in the EP area ( Figure 4) the average percentage of missing points is 85%). In the second step, the fusion algorithm is applied to the masked Chl-a map using the corresponding daily SST map as a template, extrapolating to the missing values. Finally, the extrapolated values are compared to the available original ones on the masked area for each image during the entire year 2006 (365 daily images). We require a minimum of 5% of original Chl-a values existed inside the masked area to perform the cross validation. This strategy allows comparing the original Chl-a and the retrieved one at the available masked points, and therefore the extrapolation ability of the fusion method.
The performance of the algorithm is studied in different Chl-a regimes: oligotrophic regions (Central Atlantic, CA and Pacific in front of California, CL) and eutrophic regions in higher latitudes (North Atlantic, NA), coastal upwelling areas (Benguela upwelling, BG) and the Equator (Equatorial Pacific, EP). NA region is defined by [19. Examples of one daily image validation are shown in Figure 4 for each region. A quantitative measure of the quality of the reconstruction of the chlorophyll is given in terms of four parameters: the root mean square (rms), bias (mean) and standard deviation (std) of the reconstructed error and the correlation coefficient (r) between the masked and reconstructed values. In the case of the central Atlantic and California regions, the concentration of chlorophyll is smaller, followed by the values found in the North Atlantic, the Equatorial Pacific and the Benguela upwelling where highest values of Chl-a concentration are found. Correlations coefficients between L4 and original Chl-a decrease for the regions where lower pigment concentrations are found. Better statistics are associated to areas of high primary production.
The validation for the entire year 2006 is summarized in Table 1; the mean seasonal values for each one of the statistical parameters and clouds are included. Our data fusion method succeeds in filling gaps with mean annual correlation coefficients ranging from 0.58 to 0.81 for the studied period and artificial clouds. Oligotrophic regions (North Atlantic and California) have mean regression coefficients which are significantly lower than the mean correlation coefficients values for the equatorial Pacific, the North Atlantic and the Benguela upwelling regions respectively. The smaller performance in oligotrophic regions is probably due to the small horizontal gradients of Chl-a there together with a larger signal-to-noise ratio. However, the absolute error of the reconstruction is always moderate. A slightly mean positive bias is systematically found, meaning that the L4 estimates are smaller than the original Chl-a values, probably due to the smoothing generated by the weighting function used in the fusion algorithm [16].
The seasonal segregation of validation results highlights a worse performance during the January-February-March period, with mean correlation coefficients ranging from 0.15 to 0.47 (Central Atlantic and Benguela upwelling, respectively). During the rest of seasons, the validation results greatly improve with correlation coefficients ranging from 0.67 to 0.94 in spring (AMJ), from 0.77 to 0.94 in summer (JAS) and from 0.74 to 0.94 in fall season (OND). In the winter season, the strengthening of winds and deepening of the mixed layer in mid and high latitudes create a large supply of nutrients from the deep ocean to the surface, which will be available for phytoplankton to proliferate in the following seasons. In equatorial and coastal upwelling regions (Equatorial Pacific and Benguela, respectively), the upwelling intensity also varies seasonally depending on wind strength and direction and the vertical structure of the water column. Under these circumstances, nutrient availability, vertical mixing and the depth of the mixed layer play crucial roles in the distribution of Chl-a, which could not be only explained by horizontal advection and the distribution of surface temperature.
The power density spectra (PDS) is computed as in [23] inside the boxes shown in in

Discussion and Conclusions
Satellite missions that measure chlorophyll concentration offer limited coverage due to orbital gaps, the presence of clouds and aerosols, sun glint, poor retrieval of the geophysical variables, etc. Combining data from several satellites can considerably increase the spatial coverage of daily maps. In this paper, we have used a method to merge the information coming from two different ocean scalars that share common multifractal characteristics. One of the two scalars, denoted as template, is assumed to have higher spatial coverage and signal-to-noise ratios than the other one, denoted as signal. The application presented here uses MODIS SST as template and MODIS Chl-a as signal. The template is used to extrapolate the signal by estimating local linear regression coefficients between Chl-a and SST, then applying them to the template to restore the signal.
The regression coefficients of the data fusion method exhibit geographical patterns that contain information about the relation between Chl-a and SST. The linear approximation, based on the hypothesis that the gradients of the regression coefficients are negligible as compared with the gradients of the variables Chl-a and SST, is less accurate on equatorial Pacific, Indian oceans, and during the winter season. In all these cases the present scheme needs to be extended. Besides, auxiliary environmental information, such as surface winds, mixed layer depth or nutrient concentration must be taken into account to fully understand the Chl-a behavior.
By exploiting the synergy between Chl-a and SST, it has been possible to increase the daily spatial coverage of MODIS Chl-a. The extrapolated fields have proved to be consistent with the observed ocean structures. A cross validation of the methodology, which consisted on create artificial gaps, apply the methodology and compare to the original Chl-a data, indicates correlation coefficients ranging from 0.67 to 0.94, except in winter season, when the correlation coefficients range from 0.15 to 0.47.
In conclusion, we have demonstrated the efficiency of blending SST and Chl-a to improve spatial coverage of chlorophyll products and to study the relation between ocean scalars. The resulting information can help to further characterize the spatio-temporal correlation between SST and Chl-a, and the temporal variation and long-term evolution of the biogeochemical provinces.