Neural Network Approaches to Reconstruct Phytoplankton Time-Series in the Global Ocean

: Phytoplankton plays a key role in the carbon cycle and supports the oceanic food web. While its seasonal and interannual cycles are rather well characterized owing to the modern satellite ocean color era, its longer time variability remains largely unknown due to the short time-period covered by observations on a global scale. With the aim of reconstructing this longer-term phytoplankton variability, a support vector regression (SVR) approach was recently considered to derive surface Chlorophyll-a concentration (Chl, a proxy of phytoplankton biomass) from physical oceanic model outputs and atmospheric reanalysis. However, those early e ﬀ orts relied on one particular algorithm, putting aside the question of whether di ﬀ erent algorithms may have speciﬁc behaviors. Here, we show that this approach can also be applied on satellite observations and can even be further improved by testing performances of di ﬀ erent machine learning algorithms, the SVR and a neural network with dense layers (a multi-layer perceptron, MLP). The MLP outperforms the SVR to capture satellite Chl (correlation of 0.6 vs. 0.17 on a global scale, respectively) along with its seasonal and interannual variability, despite an underestimated amplitude. Among deep learning algorithms, neural network such as MLP models appear to be promising tools to investigate phytoplankton long-term time-series. model (vs. satellite observations and numerical atmospheric reanalysis here). The SVR accurately reproduced most aspects of the satellite Chl variability (although underestimated by half) and spatial patterns. Here, we show that the SVR, trained on satellite data and atmospheric reanalysis, also encounters di ﬃ culties to well reproduce the Chl signal. The MLP demonstrates the ability of deep learning schemes to reproduce satellite Chl with far better skill than the SVR, not only to capture the general patterns of Chl seasonal and interannual signals and trends, but also their amplitude.


Introduction
Phytoplankton-the microalgae that populates the upper sunlit layers of the ocean-fuels the oceanic food web and regulates oceanic and atmospheric carbon dioxide levels through photosynthetic carbon fixation ( [1,2]). Seasonal and inter-annual cycles of phytoplankton biomass are now relatively well characterized, thanks to the large amount of studies based on radiometric satellite observations (e.g., [3,4]). Since the launch of the Sea-viewing Wide Field-of-View Sensor (SeaWiFS) in late 1997, satellite radiometric observations are continuous. However, 20 years of observations is still too short to thoroughly investigate decadal Chlorophyll-a concentration (Chl, a proxy of phytoplankton biomass) variations. The unavailability of global scale observations over a continuous time-series longer than two decades led the scientific community to rely on coupled physical-biogeochemical ocean modeling to investigate phytoplankton biomass decadal variability. While models are able to resolve seasonal to interannual biogeochemical variability to an ever-improving degree (e.g., [5,6]), they diverge in reproducing decadal observations, in particular phytoplankton regime shifts [7][8][9]. Consequently, it is still not possible on a global scale to clearly separate the phytoplankton long-term response to climate change from natural variability.
However, well-characterized decadal cycles of phytoplankton (in terms of biomass, community composition, and carbon fluxes) are crucial as (1) They can accentuate, weaken, or even mask the climate-related trends (the recent debate about the observed North Atlantic regional cooling in the context of climate change illustrates the crucial need for better understanding decadal variability [10]; (2) The observed changes in phytoplankton during decadal cycle warm phases may provide insights into how future climate warming-induced changes will alter carbon cycle and the marine food web.
The distribution of phytoplankton is strongly controlled by physical processes over a large part of the global ocean (e.g., [11][12][13][14]). Consequently, past Chl variations may be reconstructed from past physical environmental factors. To our knowledge only two studies have been performed to reconstruct surface Chl in order to investigate its decadal variability. The first one allowed the derivation of spatio-temporal surface Chl variations over the 1958-2008 period in the tropical Pacific [15]. This reconstruction used a linear canonical correlation analysis on sea surface temperature (SST) and sea surface height to improve the description of the Chl response to the diversity of observed El Niño events and decadal climate variations in the tropical Pacific. The second one investigated the ability of a non-linear statistical approach based on support vector regression (SVR) to reconstruct historical Chl variations on a global scale using selected surface oceanic and atmospheric physical variables from a numerical model as predictors [16]. The SVR method was able to reproduce trends as well as the main modes of the interannual Chl variability depicted by satellite observations in most regions. Changes observed by satellite Chl between the 1980s and 2000s were also qualitatively captured by the SVR. The main bias of this approach was to underestimate the amplitude of the Chl variations by a factor of two.
Here, we investigate how a multi-layer perceptron (MLP) may be more skillful than this SVR approach to reconstruct satellite Chl on a global scale. While, in [16], we used physical outputs from an ocean forced model to train the SVR and reconstruct surface Chl, here we choose to only use physical predictors from satellite observations and numerical atmospheric reanalysis. This choice is motivated by (i) our ultimate objective, which is to reconstruct Chl from physical observations (i.e., not relying on biogeochemical numerical models); and (ii) the use of the most realistic environmental conditions that those observations allow. However, those observations are mainly available through remotely sensed surface data (oceanic observations below the surface are indeed usually not accessible at large spatial-scales or interannual time-scales) and predictors are then limited to surface variables. With such limited 2D sampling, we are aiming at building a statistical model that may challenge more complex numerical models which simulate complex three-dimensional processes. Indeed, such models may not only strongly diverge in capturing Chl variations at a timescale of a decade but they are also not straightforward to run and require large computing resources.

Oceanic and Atmospheric Datasets
Phytoplankton needs light and nutrients to grow. Physical processes strongly control spatial distribution and time-variability of nutrient inputs in the upper-sunlit layer, and thus of phytoplankton over a large part of the global ocean. Thus, we make use of this physical (bottom-up) control to derive statistical models that relate several physical variables (predictors) to satellite Chl (output) as detailed in Table 1.  Chl was retrieved from the Ocean Colour-Climate Change Initiative (OC-CCI) from the European Space Agency (ESA, [32]). This product has generated global, ocean-colour products for climate research by merging observations from different sensors while attempting to reduce inter-sensor biases [33]. The V4.2 product was extracted on a 1 • grid and with a monthly temporal resolution over 1998-2015. It is referred hereafter to as Chl OC-CCI .
Predictors have been extracted over the same time period than Chl OC-CCI . Those with a higher resolution than 1 • and a month have been averaged to match the Chl grid. The 2 • × 2 • bins of the short-wave radiation product have been divided to match the Chl grid.
The Multivariate El Niño Southern Oscillation Index (MEI) has been provided by the National Oceanic and Atmospheric Administration (NOAA) website [34].

Support Vector Regression (SVR)
Support vector machine is a kernel-based supervised learning method [35] developed for classification purposes in the early 1990s and then extended for regression by [36]. The basic idea behind SVR is to map the variables into a new space, possibly in a non-linear way using the so-called kernel function, so that the regression task hopefully becomes linear in this space. Because SVR can efficiently capture complex non-linear relationships, it has been used in a variety of fields, and more specifically for oceanographic, meteorological and climate impact studies [37][38][39], as well as in marine bio-optics [40][41][42]. Considering a Gaussian kernel, SVR only involves the selection of two hyperparameters: the penalty parameter C of the error term and the kernel band with gamma. Following [16], (i) these two parameters have been set to 2 and 0.3, respectively; and (ii) the SVR was trained on only 9% of the database (randomly selected) due to computational limitations. The SVR is set up with python and the Scikit-learn library. Reconstructed Chl is hereafter referred to as Chl SVR .

Neural Networks
Deep learning models and neural networks (NNs) are at the core of the state-of-the-art in machine learning and artificial intelligence for a large range of applications [43]. NNs are particularly appealing due to their capacity to learn complex relationships from raw data better than other models, when data are abundant. Though the concept of NNs has been around for a long time, these state-of-the-art approaches recently obtained impressive results for many supervised or unsupervised learning problems thanks to the availability of very large datasets, and the increase in computational power in the last few decades [44]. NNs learn complex patterns through the composition of simple elementary operations forming a global highly-nonlinear input/output relationships, whose parameters can be efficiently trained using the back-propagation algorithm. These recent advances motivate an increasing number of applications in spatial oceanography, (e.g., [45,46]). In this context, physics-informed and theory-guided NNs [47] are of key interest as new means to exploit the computational and learning efficiency of NNs, while exploiting prior knowledge and making easier model interpretation. Here, we follow such an approach with a MLP architecture, which uses the same physical features and geospatial information as inputs as the SVR.
NN frameworks exploit mini-batch gradient descent schemes during the training phase. This can allow us to exploit the entire dataset contrary to the SVR model. Indeed, SVR is known not to scale up well with large training datasets. The considered MLP architecture exploits LeakyReLU activation functions after each dense layer ( Figure S1). Dropout layers [48] are added to the last 3 layers to reduce overfitting. Since we are dealing with a regression task, the last layer is linear. Thanks to its benefit of penalizing large errors, MSE is used as the loss function for the MLP. The MLP is set up with python and the Keras library. Configurations details are provided in Table S1.
As for the SVR, we first train a MLP on 9% of the training dataset, randomly selected. The reconstructed Chl is hereafter referred to as Chl MLP-9% . The aim is to provide a first consistent comparison with Remote Sens. 2020, 12, 4156

of 14
Chl SVR using the same training setting. Then, we take advantage of the MLP ability to be trained on larger database and a second MLP is trained on 80% of the dataset. Reconstructed Chl is referred to as Chl MLP . The two learning curves are provided in Figure S2.

Data Preprocessing and Procedure
Predictors and log(Chl) are normalized by removing their respective average and dividing them by their standard deviations. The SVR and the two MLP are trained from 1998 (the first complete year of the satellite Chl OC-CCI time-series) to 2015 between 50 • S and 50 • N (Step 1 in Figure 1). Thus, the resulting SVR and NN schemes are applied on the physical predictors over 1998-2015, and the annual means and standard deviations initially removed are applied to perform the back transformation and reconstruct Chl values, namely either Chl SVR , Chl MLP-9% and Chl MLP outputs (Step 2 in Figure 1). These three Chl reconstructed whole datasets are then compared to Chl OC-CCI to evaluate their skills in reconstructing satellite observations (Step 3 in Figure 1). scale up well with large training datasets. The considered MLP architecture exploits LeakyReLU activation functions after each dense layer ( Figure S1). Dropout layers [48] are added to the last 3 layers to reduce overfitting. Since we are dealing with a regression task, the last layer is linear. Thanks to its benefit of penalizing large errors, MSE is used as the loss function for the MLP. The MLP is set up with python and the Keras library. Configurations details are provided in Table S1.
As for the SVR, we first train a MLP on 9% of the training dataset, randomly selected. The reconstructed Chl is hereafter referred to as ChlMLP-9%. The aim is to provide a first consistent comparison with ChlSVR using the same training setting. Then, we take advantage of the MLP ability to be trained on larger database and a second MLP is trained on 80% of the dataset. Reconstructed Chl is referred to as ChlMLP. The two learning curves are provided in Figure S2.

Data Preprocessing and Procedure
Predictors and log(Chl) are normalized by removing their respective average and dividing them by their standard deviations. The SVR and the two MLP are trained from 1998 (the first complete year of the satellite ChlOC-CCI time-series) to 2015 between 50° S and 50° N (Step 1 in Figure 1). Thus, the resulting SVR and NN schemes are applied on the physical predictors over 1998-2015, and the annual means and standard deviations initially removed are applied to perform the back transformation and reconstruct Chl values, namely either ChlSVR, ChlMLP-9% and ChlMLP outputs (Step 2 in Figure 1). These three Chl reconstructed whole datasets are then compared to ChlOC-CCI to evaluate their skills in reconstructing satellite observations (Step 3 in Figure 1).   Empirical orthogonal function (EOF) analysis is performed to investigate the model's ability to reconstruct Chl seasonal and interannual variability. First, centered seasonal and interannual Chl OC-CCI are obtained by removing their annual and monthly means over 1998-2015, respectively, and by dividing them by their standard deviations. Then, the so-called reference EOF analysis is performed on these centered seasonal and interannual Chl OC-CCI anomalies to avoid an overly dominant contribution of high values on the analysis [49]. Thus, Chl SVR and Chl MLP outputs are projected onto the seasonal and interannual Chl OC-CCI spatial patterns to obtain their associated time components (i.e., principal component-PC). Seasonal and interannual PCs for each dataset are then compared.

Statistical Performances
A first evaluation of the Chl SVR vs. Chl OC-CCI is provided over 1998-2015 at basin scales and for the whole dataset ( Figure 2, upper row). Determination coefficients between both datasets are below 0.5 and even get down to 0.26 in the Indian Ocean, while RMSE is about 0.6 in the three basins. The MLP trained on the same amount of data than the SVR is more skillful than the SVR to reconstruct Chl OC-CCI (Figure 2, middle row). The regression lines between the log of Chl MLP-9% vs. Chl OC-CCI are closer to the 1:1 line for each oceanic basin with determination coefficients higher than 0.63 and Remote Sens. 2020, 12, 4156 6 of 14 up to 0.73 in the Atlantic Ocean. Increasing from 9% to a usual 80% the MLP training dataset further increase the skills of the NN approach to reconstruct Chl OC-CCI with a relative gain of about 10% (Figure 2, lower row). Nevertheless, Chl MLP still underestimates Chl OC-CCI , more specifically in the Pacific. Some of these differences may be related to changes in Chl, which may due for instance to photo acclimation (e.g., [50,51]) or by other components that are not Chl such as suspended particulate matter (SPM) or colored dissolved organic matter (CDOM; [52]). Interestingly, the use of a MLP not only removes computational restrictions imposed by the SVR (i.e., size of the training samples), but it also appears to be more efficient in reconstructing surface Chl from oceanic and atmospheric variables. Thereafter, Chl OC-CCI and Chl SVR are compared to the best NN product, Chl MLP (i.e., trained on 80% of the dataset).   Consistently with the scatterplots, Chl OC-CCI correlations with Chl SVR are significantly lower than with Chl MLP (Figure 3A,C; r = 0.17 vs. 0.6 on a global scale, respectively). Chl OC-CCI -Chl SVR correlations are higher than 0.7 (p < 0.001) over limited regions such as the Atlantic, Indian, and Pacific subtropical areas ( Figure 3A). The MLP allows a significant improvement in the correlation with Chl OC-CCI with values higher than 0.75 over most of the global ocean ( Figure 3C). Areas of high and low NRMSE are similarly distributed for Chl SVR and Chl MLP (Figure 3B,D). For instance, in both cases NRMSE is higher at the highest latitudes and in the tropical north-western and south-eastern Pacific. Although the MLP reduces these NRMSE by 50% compared to the SVR, biases in reference to Chl OC-CCI still remain in these regions. High NRMSE can reflect the influence of other components than phytoplankton biomass on the Chl signal as mentioned above, or/and the impact of other predictors not considered to train the neural network models. This is, for instance, suggested from the Amazon Remote Sens. 2020, 12, 4156 7 of 14 plume where the Chl signal is known to be influenced by river flow and may thus rather be associated with CDOM or SPM than phytoplankton biomass [53].

EOF analysis provides complementary insights into the MLP and SVR ability in reconstructing
Chl spatio-temporal variability. Seasonal and interannual ChlOC-CCI spatial variability are illustrated through their respective EOF first modes with a total variance of 27.6% and 12.6%, respectively ( Figure 4A,C). The well-known Chl seasonal patterns are highlighted with a variability out of phase between the northern vs. southern hemisphere due to the reversal of the season order ( Figure 4A,B). It is also out of phase between high latitudes (light limited) vs. low and mid-latitudes (rather nutrient limited). ChlMLP and ChlSVR are then projected on this reference ChlOC-CCI spatial seasonal pattern. Correlations between ChlOC-CCI and ChlSVR or ChlMLP PCs are of 0.64 and 0.99, respectively ( Figure 4B).

Seasonal to Interannual Variability and Trends
EOF analysis provides complementary insights into the MLP and SVR ability in reconstructing Chl spatio-temporal variability. Seasonal and interannual Chl OC-CCI spatial variability are illustrated through their respective EOF first modes with a total variance of 27.6% and 12.6%, respectively ( Figure 4A,C). The well-known Chl seasonal patterns are highlighted with a variability out of phase between the northern vs. southern hemisphere due to the reversal of the season order ( Figure 4A,B). It is also out of phase between high latitudes (light limited) vs. low and mid-latitudes (rather nutrient limited). Chl MLP and Chl SVR are then projected on this reference Chl OC-CCI spatial seasonal pattern. Correlations between Chl OC-CCI and Chl SVR or Chl MLP PCs are of 0.64 and 0.99, respectively ( Figure 4B). Chl SVR PC shows a fictive double peak in the seasonal cycle and a largely underestimated amplitude. Despite the Chl OC-CCI -Chl MLP high correlation, the amplitude of Chl MLP seasonal variability is also underestimated.
The first EOF mode performed on interannual Chl OC-CCI illustrates the largely reported impact of El Nino Southern Oscillation (ENSO) [4,17,[54][55][56] (Figure 4C). Here again, the MLP results in a significant improvement compared with the SVR to reconstruct Chl OC-CCI variability, with its first PC correlation with Chl OC-CCI of 0.95 vs. 0.63 for Chl SVR ( Figure 4D).
Over the last 18 years, observed Chl OC-CCI trends have increased over most of the global ocean ( Figure 5A). Regionally, some decreases are observed such as in the Indian Ocean, the equatorial Pacific and the Atlantic and Pacific oligotrophic subtropical gyres. While most of these trends are captured by Chl SVR , inverse trends occur in the South Indian and Atlantic Oceans ( Figure 5B). In addition, the trend estimation for Chl SVR reveals an unrealistic high-frequency pattern, which may relate to the support of the Gaussian kernels implemented by the SVR model. On its side, Chl MLP better reproduces Chl OC-CCI trends in terms of spatial distribution, although their amplitude remain underestimated, especially in the Indian Ocean ( Figure 5C).
Despite the ChlOC-CCI -ChlMLP high correlation, the amplitude of ChlMLP seasonal variability is also underestimated.
The first EOF mode performed on interannual ChlOC-CCI illustrates the largely reported impact of El Nino Southern Oscillation (ENSO) [4,17,[54][55][56] (Figure 4C). Here again, the MLP results in a significant improvement compared with the SVR to reconstruct ChlOC-CCI variability, with its first PC correlation with ChlOC-CCI of 0.95 vs. 0.63 for ChlSVR ( Figure 4D). Over the last 18 years, observed ChlOC-CCI trends have increased over most of the global ocean ( Figure 5A). Regionally, some decreases are observed such as in the Indian Ocean, the equatorial Pacific and the Atlantic and Pacific oligotrophic subtropical gyres. While most of these trends are captured by ChlSVR, inverse trends occur in the South Indian and Atlantic Oceans ( Figure 5B). In addition, the trend estimation for ChlSVR reveals an unrealistic high-frequency pattern, which may relate to the support of the Gaussian kernels implemented by the SVR model. On its side, ChlMLP better reproduces ChlOC-CCI trends in terms of spatial distribution, although their amplitude remain underestimated, especially in the Indian Ocean ( Figure 5C).   Over the last 18 years, observed ChlOC-CCI trends have increased over most of the global ocean ( Figure 5A). Regionally, some decreases are observed such as in the Indian Ocean, the equatorial Pacific and the Atlantic and Pacific oligotrophic subtropical gyres. While most of these trends are captured by ChlSVR, inverse trends occur in the South Indian and Atlantic Oceans ( Figure 5B). In addition, the trend estimation for ChlSVR reveals an unrealistic high-frequency pattern, which may relate to the support of the Gaussian kernels implemented by the SVR model. On its side, ChlMLP better reproduces ChlOC-CCI trends in terms of spatial distribution, although their amplitude remain underestimated, especially in the Indian Ocean ( Figure 5C).  A similar conclusion can be done when considering Chl trends in oligotrophic gyres (with surface Chl < 0.07 mg.m −3 as in [18]) over the last 15 years. Such trends can be weak and not sufficiently well resolved by the MLP (nor the SVR) in terms of sign or amplitude ( Figure 6). While it is crucial to improve the reconstruction of the signal amplitude on a global scale, it is particularly appealing in those oceanic deserts which would tend to expand in the context of climate warming and increasing stratification [18,57,58]. As expected, the spread of these oligotrophic areas is different here from Chl OC-CCI than those observed by [18] based on a single-sensor Chl product over 1998-2006 [59]. Thus, training deep learning schemes on Chl products from both single and multi-sensors could provide a more synoptic view on the impact and interpretation of reconstructed Chl long time-series. is crucial to improve the reconstruction of the signal amplitude on a global scale, it is particularly appealing in those oceanic deserts which would tend to expand in the context of climate warming and increasing stratification [18,57,58]. As expected, the spread of these oligotrophic areas is different here from ChlOC-CCI than those observed by [18] based on a single-sensor Chl product over 1998-2006 [59]. Thus, training deep learning schemes on Chl products from both single and multi-sensors could provide a more synoptic view on the impact and interpretation of reconstructed Chl long time-series. Finally, a first attempt to investigate predictors' relative importance is performed with the MLP approach and the mean decrease accuracy method [60]. As expected, the SST and short-wave radiations are the most important predictors (Table 2). Interestingly, the zonal surface wind stress component seems of particular importance while sea level anomaly is far behind. Indeed, redundant indirect information's about the ocean circulation can be derived from those predictors, with Finally, a first attempt to investigate predictors' relative importance is performed with the MLP approach and the mean decrease accuracy method [60]. As expected, the SST and short-wave radiations are the most important predictors (Table 2). Interestingly, the zonal surface wind stress component seems of particular importance while sea level anomaly is far behind. Indeed, redundant indirect information's about the ocean circulation can be derived from those predictors, with potentially more information "embedded" within the wind stress that may also be linked to a meaningful parameter for phytoplankton growth: the mixed layer depth. Spatial coordinates (latitude and longitude) are less important for the reconstructions suggesting that the MLP can extract geospatially-dependent features from other predictors than the coordinates themselves. Thus, a complementary run was performed, still using 80% of the dataset to train the MLP but based only on the five main predictors with a relative importance higher than 0.1 in Table 2. As expected, removing spatial information's from the MLP training still allows to reconstruct realistic Chl (unlike the SVR, which produces unrealistic concentrations). Interestingly, although the averaged relative importance of seven of the eight removed predictors are lower than 0.033, the determination coefficients between this specific Chl MLP vs. Chl OC-CCI decrease by more than 0.2 while the RMSE increase to more than 0.12 in each oceanic basin when compared to the MLP using all the predictors ( Figure S3 vs. Figure 2). This result illustrates that the use of these predictors, apparently weakly significant in average, does not change significantly the Chl geographical distribution pattern, but can modulate its regional intensity ( Figure S4). The impact of the different predictors on Chl reconstruction according to the oceanic regions and/or the climate cycles should therefore be specifically considered and investigated in a dedicated study, once a deep learning scheme will have been considered as sufficiently satisfying.

Discussion
To our knowledge, the only study that has been performed to reconstruct satellite surface Chl on a global scale used a SVR approach [16]. In this former study, we also used Chl OC-CCI and the same physical predictors than in the present study, but the predictors were originating from a numerical model (vs. satellite observations and numerical atmospheric reanalysis here). The SVR accurately reproduced most aspects of the satellite Chl variability (although underestimated by half) and spatial patterns. Here, we show that the SVR, trained on satellite data and atmospheric reanalysis, also encounters difficulties to well reproduce the Chl signal. The MLP demonstrates the ability of deep learning schemes to reproduce satellite Chl with far better skill than the SVR, not only to capture the general patterns of Chl seasonal and interannual signals and trends, but also their amplitude. Neither the training of the MLP nor that of the SVR involve time information, as the training loss only involves a grid point-wise reconstruction error criterion. Thus, our results support the greater ability of the MLP to generalize time patterns than the SVR and the relevance of neural network approaches compared with kernel methods.
However, further efforts still remain to alleviate the issue of Chl underestimation. One plausible hypothesis would be that complementary predictors may provide additional insights to this issue. For instance, precipitation is among a key driver of coastal run-off and river discharge could be considered. Thus, applying similar learning-based schemes to CDOM and SPM, especially jointly to Chl, would be note-worthy. Such combined approaches could help improving both reconstruction and interpretation of Chl in regions where satellite Chl products from case 1 waters may reflect changes from other components than phytoplankton biomass. Particulate backscattering (as a proxy of the Particulate Organic Carbon) retrieved from satellite also deserves to be considered in addition to Chl in the training/reconstruction process. Indeed, it would allow us to investigate the extent to which the Chl variability reflects changes in phytoplankton biomass vs. cellular changes in response to light [51,61,62], which is of particular importance in oligotrophic gyres.
If, in this study, we demonstrate the better potential of NNs to accurately represent Chl seasonal and interannual signals, compared to the SVR, so far, only a MLP was used. MLP is known to not explicitly consider the spatial and temporal correlations in the dataset. Specific architectures to handle spatially or temporally structured data, i.e., convolutional neural networks and recurrent neural networks (such as long short-term memory networks) are currently under investigation and are expected to further improve the Chl reconstruction performance, in particular for the Chl seasonal cycle amplitude. Interestingly, such NN architectures would provide additional insights on the physical variables of interest through sensitivity tests, which could drive the reconstruction of Chl beyond the set of predictors considered in this study. Recent advances in deep learning for irregularly-sampled datasets also suggest that future work could learn such NN representations from datasets involving missing data patterns, which may be of key interest for Chl patterns in some regions [63,64].

Conclusions
The present study investigates two statistical approaches to derive from satellite observations the Chl seasonal to interannual variability and trends, as well as their potential in reconstructing biological past long-term time-series. The MLP is more skillful than the SVR to capture both the spatio-temporal patterns and amplitude of Chl OC-CCI on a global scale over 1998-2015. Chl MLP and Chl OC-CCI seasonal and interannual first mode of variability are highly correlated (r-PCs = 0.99 and 0.95, p < 0.001), suggesting that the MLP is reliable to reproduce Chl OC-CCI , at least over the last 18 years. However, some underestimation in Chl MLP amplitudes as well as regional bias need to be fixed in future studies and the predictors' importance in Chl reconstruction deserve to be more deeply investigated.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-4292/12/24/4156/s1, Figure S1: MLP architecture. Table 1: MLP configuration. Figure S2: Learning curves of Chl MLP-9% and Chl MLP . Figure S3: Scatter plots of Chl MLP trained only for predictors with a relative importance higher than 0.1 in Table 2. Figure S4: Correlation and NRMSE of Chl OC-CCI vs. Chl MLP trained only for predictors with a relative importance higher than 0.1 in Table 2. Funding: Funding for the internship of A. Brini and GPU was provided by the CNES/TOSCA call (Grant n • 160515/00) PhytoDeV project. The J. Roussillon PhD grant was funded by the ISblue project, Interdisciplinary graduate school for the blue planet (ANR-17-EURE-0015) and co-funded by a grant from the French government under the program "Investissements d'Avenir".