Spatial downscaling of MODIS Chlorophyll-a with Genetic Programming in South Korea

: Chlorophyll-a (Chl-a) is one of the major indicators for water quality assessment and recent developments in ocean color remote sensing have greatly improved the ability to monitor Chl-a on a global scale. The coarse spatial resolution is one of the major limitations for most ocean color sensors including Moderate Resolution Imaging Spectroradiometer (MODIS), especially in monitoring the Chl-a concentrations in coastal regions. To improve its spatial resolution, downscaling techniques have been suggested with polynomial regression models. Nevertheless, polynomial regression has some restrictions, including sensitivity to outliers and fixed mathematical forms. Therefore, the current study applied genetic programming (GP) for downscaling Chl-a. The proposed GP model in the current study was compared with multiple polynomial regression (MPR) to different degrees (2 nd -, 3 rd -, and 4 th -degree) to illustrate their performances for downscaling MODIS Chl-a. The obtained results indicate that GP with R 2 = 0.927 and RMSE = 0.1642 on the winter day and R 2 = 0.763 and RMSE = 0.5274 on the summer day provides higher accuracy on both winter and summer days than all the applied MPR models because the GP model can automatically produce appropriate mathematical equations without any restrictions. In addition, the GP model is the least sensitive model to the changes in the input parameters. The improved downscaling data provide better information to monitor the status of oceanic and coastal marine ecosystems that are also critical for fisheries and fishing farming.


Introduction
Coastal marine ecosystems are the most important habitats for species that live in the world's most productive ecosystems, such as fish and marine mammals [1]. The influences of the proximity to land, large quantities of nutrients delivered via streams, and sewage discharge lead to increased susceptibility of these ecosystems to rapid changes in water quality through both anthropogenic and natural mechanisms. Therefore, it is essential to monitor water quality in coastal ecosystems to mitigate the adverse impacts of human-related activities in these environments [1,2].
A phytoplankton cell is a planktonic photosynthesizing organism [3], and phytoplankton biomass can serve an index to provide information about marine ecosystem health. Coastal ecosystems throughout the world are affected by the fast growth of the phytoplankton population, often resulting from water column stratification or increases in nutrients [2]. Harmful algal blooms, like dinoflagellate, Gymnodinium breve (commonly referred to as "red tide") produce neurotoxins such as saxitoxin and gonyautotoxin that cause water quality degradation, which have considerable consequences for marine environments such as fish death [4][5][6][7]. The chlorophyll-a (Chl-a) concentration has been recognized as a direct indicator of phytoplankton biomass because all phytoplanktons contain Chl-a and high Chl-a concentrations show more desirable environmental conditions for phytoplankton growth [3]. Therefore, by monitoring changes in the Chl-a concentration distribution, long-term trends in the water quality of coastal and oceanic systems can be assessed to a point where the negative effects can be mitigated [2,8,9]. However, traditional techniques such as in situ field sampling, moored instruments, drifting instruments, and fluorometry used for Chl-a measurement are expensive laboratory-based instruments and have some spatiotemporal limitations.
Over the last two decades, there has been an increase in remote sensing applications as a substitute for traditional techniques for near-real-time measurements of global phytoplankton biomass, including both qualitative and quantitative estimates [10,11]. However, there are two major challenges associated with extracting information from remote sensing data: 1) the sheer amount of data, and 2) variable precision and continuity among remote sensing-derived products. [12]. To solve such issues, many techniques have been developed, such as reflectance-based classification algorithms [13], spectral band ratios [14][15][16], spectral band-difference algorithms [17][18][19], bio-optical models [20,21], and analytical techniques [22,23].
The retrieval of Chl-a concentrations in coastal areas by the abovementioned techniques is performed using coarse spatial resolution ocean color sensors such as Moderate Resolution Imaging Spectroradiometer (MODIS), Coastal Zone Color Scanner (CZCS), MEdium Resolution Imaging Spectrometer (MERIS), and Sea-viewing Wide Field-of-view Sensor (SeaWiFS). Although the high temporal resolution of these sensors (e.g., 1-2 days for MODIS, 3 days for MERIS, and 1 day for SeaWiFS) makes them suitable for continuous monitoring, their spatial resolution (e.g., 4 km for MODIS, 300 m for MERIS, and 1.1 km for SeaWiFS) is not satisfactory due to their orbital characteristics and technical configurations.
Recent studies have introduced spatial downscaling algorithms as an alternative solution to the coarse spatial resolution of ocean color sensors. Spatial downscaling has been widely used for downscaling coarse spatial resolution data by utilizing the high-resolution remote sensing reflectance measurements, for instance for land surface temperature [24][25][26], precipitation [27][28][29][30], and soil moisture [31,32]. Fu, et al. [33] combined coarse spatial resolution MODIS Chl-a measurements with high spatial resolution Landsat 8 OLI band combinations using a polynomial regression model (fourth-order polynomial regression) to downscale MODIS Chl-a maps from 4 km to 30 m spatial resolution. However, polynomial regression models have some restrictions, such as sensitivity to outliers and the use of fixed mathematical forms to define the relationship between the predictor and predictand variables. Machine learning (ML) approaches have been suggested to deal with these restrictions and have received increasing attention for downscaling studies as powerful alternative tools, but only limited applications for downscaling of Chl-a [34][35][36].
Among ML models, genetic programming (GP) have recently received much attention in a number of fields including water resource management studies [37]. The idea behind the GP was inspired by biological evolution that makes it a collection of techniques for finding the best solution in the space of possible solutions. This unique feature of GP made it a suitable technique for various water resource management applications, including ocean engineering and hydrology, hydrological forecasts, and groundwater modeling [37]. Therefore, the current study assessed the accuracy of GP for Chl-a downscaling and compared its results with the results of three multiple polynomial regression (MPR) models, including second-order (2 nd ), third-order (3 rd ), and fourth-order (4 th ) polynomials. The developed models were utilized for Chl-a downscaling over the western coast of South Korea.

Study area
The study area was part of the Korean West Sea, which is located in the eastern part of the Yellow Sea ( 35 (Figure 1). The Yellow Sea is a shallow marine ecosystem with the average and maximum water depths of 44 m and 103 m, respectively [38]. There is clear seasonality in sea surface temperature (SST) over the Yellow Sea, where January is the coldest month with an average SST of 4-7 o C and July is the warmest, with an average SST of 26-27 o C [39]. There are a total of 339 fish species in the Korean West Sea [40]. Over the past few years, some fish species, such as small yellow croaker, hairtail, large yellow croaker, and flatfish have exhibited continuous declines due to overharvesting, degraded marine ecosystem quality, and several unknown factors [41]. The increase in the level of the eutrophication, as a result of human activities such as dense agricultural practices along the coastal area, is another reason for environmental pollution over the study area that have significant negative effects on marine ecosystems, such as fish death, and a loss of important protein for the people dependent upon them [1]. Furthermore, in the Yellow Sea waters, there are other constituents than phytoplankton, such as inorganic particles and dissolved organic matter that are the major obstacle for simple empirical algorithms to determine the statistical relationship between Chl-a concentration and spectral bands [42]. Additionally, a very limited number of studies have attempted to investigate the optical properties of the Yellow Sea, such as phytoplankton, from the ocean color images. As a result, monitoring the Chl-a concentration, as an important index to evaluate the extent of eutrophication, at fine resolution is a crucial task in this area.

Materials and methods
The main objective of this research was to develop an approach for MODIS Chl-a downscaling and produce Chl-a concentration maps for complex coastal regions. Figure 2 displays the detailed explanation of the procedure used. The downscaling approach was accomplished in four steps: (1) remote sensing data, including MODIS Chl-a at 4 km (defined as Y4k) and S-2A at 10 m (defined as X10), were acquired, and S-2A data were resampled to 4 km MODIS resolution (denoted as X4k); (2) the most important S-2A band combinations (X4k) were chosen by utilizing the support vector machine recursive feature elimination (SVM-RFE) method; (3) MODIS Chl-a downscaling from 4 km to 10 m was performed by regressing X4k to Y4k, calculating the residual at 4 km ( 4k ε ), and adding the interpolated residual ( 10 ε ) to the estimated fine-resolution Chl-a ( 10 Ŷ ); (4) the obtained downscaled maps were compared with visual comparison, validated with in situ data, and all the applied methods were assessed using sensitivity analysis. A complete explanation of the aforementioned steps is described in Sections 3.1 to 3.4.

Data Description
In the current study, a downscaling approach was proposed based on GP and MPR techniques to produce high-resolution MODIS Chl-a data by relating coarse-resolution MODIS Chl-a data to high-resolution Sentinel-2A MSI (S-2A) data. There are some challenges associated with downscaling Chl-a: 1) MODIS Chl-a and S-2A measurements have different revisit times of 8 and 10 days, respectively; 2) in situ data used for validation of the downscaling model are irregularly distributed in space and rarely accessible; 3) to obtain reasonable results for downscaling Chl-a, a remote sensing image should have cloud coverage less than 10%. It has been reported that sea fog is frequently observed around the Korean peninsula with a maximum occurrence in the West Sea in summer with a mean frequency of 5.3 days in July [43]. The frequency is less than 1 day during fall and winter. Therefore, the above-mentioned challenges imposed some limitations for selection of images on every season. Hence, two MODIS and S-2A Level-1C (hereafter S-2A) datasets were acquired on February 2 (winter) and August 4 (summer) in 2016. All images had less than 10% cloud coverage and were almost concurrent with in situ measurements (±5 days).
The Chl-a concentration maps used in the current study were used from MODIS Chl-a concentration standard mapped images (SMIs) with a 4 km resolution. The SMIs were created from bands 13 (0.653 µm),14 (0.681 µm), and 15 (0.750 µm) of the MODIS sensor (hereafter referred to as MODIS Chl-a). Due to strong fluorescence signals from Chl-a at these spectral bands, these bands were suitable for the detection of the Chl-a concentration [44,45]. For retrieval of Chl-a concentration, the standard OC3/OC4 (OCx) band ratio algorithm combined with color index (CI) introduced by Reference [46] was implemented. The MODIS Chl-a data were obtained from the Giovanni website (https://giovanni.gsfc.nasa.gov/giovanni/).
S-2A band images (Table 1) for high-resolution data were acquired from the Copernicus Open Access Hub (https://scihub.copernicus.eu/). S-2A products include 13 bands with the top-ofatmosphere (TOA) Reflectance. All bands were processed by radiometric and geometric correction using a UTM/WGS84 projection. Recent studies have reported that S-2A is more powerful than Landsat 8 OLI in distinguishing areas affected by algal blooms because S-2A has three near-infrared (NIR) bands and Landsat 8 OLI has only one NIR band [47][48][49][50][51]. The three additional NIR bands of S-2A make it feasible to develop more suitable algorithms for the retrieval of the Chl-a concentration in intense bloom conditions [52].  Figure 1. Multidepth sampling was used to collect water samples from the surface layer (0-15 cm depth). A fairly large water sample was collected and the sample was filtered to concentrate the chlorophyll-containing organisms followed by mechanical rupturing of the collected cells, and extraction of the chlorophyll from the disrupted cells into the organic solvent acetone. The extract was then analyzed by a spectrophotometric method using fluorescence. Therefore, to validate the downscaling models, a total of 14 samples at two depths (Table 2) were employed. The remote sensing and in situ data acquisition dates are shown in Table 2, with in situ data from February 1, 2016, corresponding to remote sensing data from February 2, 2016, and in situ data from August 1, 2016, corresponding to remote sensing data from August 4, 2016.

Preprocessing
The image preprocessing methods the same as (1) MODIS gap filling, (2) resampling, (3) landwater separation, and (4) subsetting were used to prepare all satellite images for further processes. Then, feature selection was employed to determine the most important band combinations for Chl-a prediction. First, the grid-fill method [53] was used to fill the missing values in the MODIS Chl-a images. This software has high computational efficiency and fills missing values in an iterative relaxation scheme using Poisson's equation. For validation, random removal of the original Chl-a pixel values was conducted to provide data to validate the grid-fill method. For this purpose, 5% of the original data were deleted from the outside boundary of the study region and preserved for crossvalidation. Then, the corresponding values of the deleted data were estimated by utilizing the gridfill method, and the difference between the original pixel values and the reconstructed values was evaluated. Consequently, the spatial distribution of Chl-a concentrations was produced.
At the second step, all S-2A images were mosaicked and then resampled to a resolution of 10 m (X10). Since the study region had a remarkable number of islands (Figure 1), the reflectance caused by islands could decrease the accuracy of the applied models. Therefore, the third step of preprocessing aimed to separate water from land pixels to improve the downscaling results. The most commonly used indices, namely, the normalized difference vegetation index (NDVI) and normalized difference water index (NDWI), have already been employed for land-water separation purposes [54][55][56][57][58]. In the current study, land-water separation was performed by utilizing the NDWI index. The formula of this index is as follows: At the fourth step, the study area boundaries were used to produce a spatial subset of both MODIS Chl-a and S-2A images, and the S-2A dataset was resampled from 10 m (X10) to 4 km resolution of MODIS Chl-a (X4k) using the nearest neighbor method [59]. Then, MODIS Chl-a and S-2A images at 4 km (Y4k and X4k) were used as dependent and independent variables, respectively, to develop GP and MPR models. Additionally, the developed models were applied to S-2A images at 10 m (X10) to estimate downscaled Chl-a at 10 m spatial resolution.
At the fifth step, the SVM-RFE feature selection method was employed to specify the most important bands of the S-2A dataset (X4k). This method is a widely used technique for feature selection that has been utilized in a wide variety of remote sensing research studies [60][61][62]. To perform feature selection, multiple mathematical operations, including multiplication, addition, subtraction, rationing, averaging, and square transformation, were used to calculate 597 various combinations of the resampled S-2A bands at 4 km, and the computed combinations served as the input for the feature selection method. Then, SVM-RFE was trained on the calculated dataset using the MODIS Chl-a concentration (Y4k) and S-2A band combinations (X4k) as the predictand and predictor variables, respectively. As a result, relevant combinations for the downscaling procedure were chosen.

Downscaling
The downscaling model used in the current study was a regression-based approach considering the statistical relationship between MODIS Chl-a and S-2A bands at 4 km and 10 m spatial resolution. The straightforward formulation of this relationship is expressed as in Equation (2).
where X is the selected S-2A bands (predictor variables) at 4 km (X4k) and 10 m (X10), Ŷ is the estimated MODIS Chl-a at 4 km ( 4k Y ) and 10 m ( 10 Y ), and f represents a nonlinear regression function developed by GP and MPR techniques. First, a relationship between Y4k and X4k was established using all the applied methods as 2 nddegree MPR, 3 rd -degree MPR, 4 th -degree MPR, and GP. From the 636 pixels of the independent (X4k) and dependent (Y4k) variables, 445 pixels were used for training, and the remaining 191 pixels were reserved for the validation of the models. Standardization was a prerequisite step before training all the applied methods to ensure that all variables remained on the same scale. Therefore, the standardization of all independent variables (X4k) in the training and validation set was performed by subtracting the mean and dividing by the standard deviation of the training set. This preprocessing sped up the convergence and allowed efficient training of the network. Then, the estimated Chl-a at 4 km ( 4k Y ) was subtracted from the original MODIS Chl-a at 4 km (Y4k) as follows: where 4 ε k is the low-resolution residual at 4 km.

Sensitivity Analysis
For a given mathematical model, a sensitivity analysis (SA) measures how much the uncertainty and fluctuations of the input variable contribute to the outputs or performance of the system. In general, SA may be performed by two different techniques, local and global SA, with the former exploring the important model factors for a given set of factor values, and the latter apportioning the uncertainty in outputs to the uncertainty in each input factor to identify the important factors [65]. In the current study, the Morris method [66], as the most widely used global SA method, was employed to quantify the sensitivity of the GP and MPR models. This method is also called the once-at-a-time (OAT) method because, in each run, a new value is assigned to only one input variable. To carry out SA, the Morris sensitivity measure index ( μ * ) was used as in Equation (5): where i is the number of input variables, r is the number of sample points in the parameter space (indexed n) and EEi,n is the elementary effects (EEs) assessed for the i-th input variable using the n-th sample point. EEs are employed to specify noninfluential inputs for a computationally costly mathematical model or for a model with a great number of input parameters, where the costs of evaluating other SA methods such as variance-based methods are not reasonably priced.

Multiple Polynomial Regression
The polynomial model is a form of a regression method that can be used when the relationship between an independent and dependent variable is curvilinear [67]. The n th order polynomial model with one variable Equation (6) is the general form of the polynomial model that indicates the nonlinear relationship between one predictand and one predictor variable.
Furthermore, MPR can also be defined with different degrees. For instance, a quadratic (secondorder, n = 2) polynomial model can be given as in Equation (7).
To select the best model between different MPR degrees for MODIS Chl-a downscaling, three degrees of models, including second-order (2 nd ), third-order (3 rd ), and fourth-order (4 th ), were trained and compared.

Genetic Programming
GP is one of the most recent data-driven techniques developed by Koza [68] and is a collection of techniques for finding a highly fit individual in the space of possible solutions. In GP, individuals are mathematical formulas created by combinations of functions (e.g., sin, cos, ÷,×, +) and variables (e.g., x, y, 6). Each individual takes the role of a possible solution for a given problem. Figure 3 shows four simple formulas (individuals) created by functions and variables as an example. Each individual has its fitness value to optimize. GP applies evolutionary computation to find the best individual for optimized fitness values.
Generally, the GP technique follows four steps to find the fittest individual: (1) an initial random population of individuals composed of functions and variables is created; (2) the fitness of each individual in the population is validated with a problem-specific fitness function, and the most appropriate individuals are selected to survive in the new population as parents; (3) once parents are selected, they create better types known as offspring or new generations by producing algorithms known as genetic operators (i.e., crossover, mutation, and duplication); (4) then, the individuals are assessed for fitness; (5) the process from (2) to (4) is repeated over many generations until an individual satisfies a given success criterion (e.g., the number of generations exceeds the specified number of iterations). Figure 3 illustrates the crossover and mutation operations in GP. Individuals in GP are shown in tree form with different characteristics, such as size, shape, and content. The crossover and mutation operations are performed to produce new offspring (the panels in the lower row) from the parent (the panels in the upper row). Additionally, Figure 3 displays how crossover and mutation change the final functions and eliminate an input variable y at the second offspring (the lower right panel).
In the current study, the gplearn package [69] in Python software was used for GP implementation. In general, GP has two major parameters (population size and generation size) that should be optimized to generate high performance. The optimum values of the mentioned parameters were computed by utilizing the harmony search (HS) algorithm introduced by Geem, et al. [70] using 10-fold cross-validation. For more information about the HS method, readers are referred to Geem, Kim and Loganathan [70] and Lee and Singh [59].

Preprocessing
The scatter plots between the original MODIS Chl-a and their reconstructed values with the gridfill method are shown in Figure 4. According to the results, a good match between the original MODIS Chl-a and their filled values can be seen with R 2 = 0.996 for winter (panel (a)) and R 2 = 0.939 for summer (panel (b)). Therefore, it can be concluded that the grid-fill method exhibits good performance for the reconstruction of MODIS Chl-a values. Note that the reason for the better performance of the method on the winter day than on the summer day might be related to the presence of more missing values in the selected scene of the summer day that occurs due to high cloudiness on the summer day.

Downscaling Results
For downscaling, MPR with three degrees (2 nd , 3 rd , and 4 th ) and GP were trained with the four determined band combinations as predictors and MODIS Chl-a as a predictand at low-resolution (4 km). Then, a residual correction was performed utilizing the described methodology in the downscaling section (Section 3.3), and Equations (3)-(4) to produce high-resolution Chl-a maps (Y10). To assess the accuracy of the downscaling technique, the final downscaled maps (i.e., Y10) were validated with the original MODIS Chl-a maps at a pixel size of 4 km. For this purpose, sample values were extracted from the downscaled maps (Y10) within a 3×3 window around the center of each MODIS Chl-a pixel, and the mean of each sample was calculated and compared with the original MODIS Chl-a ( Figure 5 and Table 3).  On the winter day, as shown in Table 3, the performance measurements of the GP model were slightly better than those of the 2 nd -degree MPR. According to the performance indices, the accuracy of the models was ranked as GP > 2 nd -degree MPR > 3 rd -degree MPR > 4 th -degree MPR for the winter day. For the summer day, the rank was GP > 3 rd -degree MPR > 2 nd -degree MPR > 4 th -degree MPR.
Overall, the GP exhibited the best performance for the winter and summer days.
This finding shows the superiority of GP over the MPR method with different degrees of Chl-a downscaling. The possible reason for this result might be related to the flexible structure of the GP model compared with the fixed formulation of MPR models. While MPR models use a fixed form to define the relationship between the predictor and predictand variables, the evolution process in GP allows its function to take any feasible formulation. This flexibility gives GP the ability to adopt any form of functions to capture various relationships between the predictor and predictand variables, even highly nonlinear relations. Therefore, this unique feature of GP increases the probability of finding the best relationship in the downscaling procedure, resulting in a better prediction than the MPR models.
The accuracy of the models is presented in Figure 5. The best match between the MODIS Chl-a and simulated values can be seen in the GP model (see the panels in the fourth column in Figure 5), with R 2 = 0.927 and RMSE = 0.1642 on the winter day, compared to the MPR models (2 nd -, 3 rd -, and 4 th -degree). The same result can be seen on the summer day (the best performance in the GP model), as shown in the bottom panels of Figure 5. Furthermore, it can be seen that all the applied models estimate Chl-a better at lower concentrations than at higher concentrations, particularly in the range of 1.5-3.5 mg m −3 , as presented in Figure 5.

Visual Comparison
The detailed maps of the downscaling approach for the winter and summer days are shown in Figures 6 and 7, respectively. The good agreement between the original MODIS Chl-a (the panels in the first column in Figure 6) and the estimated Chl-a maps (the panels in the last column in Figure 6) can be seen for the GP and 2 nd -degree MPR. From Figure 6, the major trend of Chl-a is fairly modeled with GP and 2 nd -degree MPR but specific and substantially high values are not captured in both models such as the values in the near coastal area. However, the residual model can additionally capture this high variability and produce a reliable estimate in the final stage, as seen in the last column of Figure 6. According to the results illustrated in Table 3, and the panels in the last column of Figure 6, 4 th -degree MPR cannot estimate the Chl-a values at 10 m resolution as accurately as the other models, especially in coastal areas. This phenomenon indicates that the GP and 2 nd -degree MPR can capture the most variation in high Chl-a concentrations at 4 km resolution.  By estimating the Chl-a concentrations on the summer day, similar model behaviors became obvious on the summer day compared to the winter day (Figure 7). Among all the applied models, GP estimations in coastal areas showed better agreement with the original MODIS Chl-a on both the winter and summer days than all the applied MPR models. Since coastal areas play a vital role in marine ecosystems and human health, GP estimations are considerable for water quality monitoring in coastal areas.
From Figure 7, the simulation results became worse in the sea (the second and last column panels in Figure 7), and all models tended to underestimate some high Chl-a values. The possible reason for this result might be the higher spatial variability of the MODIS Chl-a concentration on the summer day than on the winter day (Figure 8), which is different from the normal distribution ( 0.522 δ = , = 0.733 θ ). Therefore, all the applied models did not fairly estimate the Chl-a values in the sea, and their accuracy decreased on the summer day compared to that on the winter day ( Figure 5 and Table  3). From Figures 6 and 7, the Chl-a concentration gradient follows a similar pattern with water depth in all maps, and its concentration decreases as water depth increases. Therefore, the region close to the seashore shows a high Chl-a concentration, while the region in the sea presents a low concentration of Chl-a.

In Situ Validation
The results of the downscaling models were assessed with the in situ Chl-a data of seven stations along the study area ( Figure 1) to validate model performances. For this purpose, sample values were extracted within a 3×3 window around the center of each station point, and the mean of each sample was calculated and compared with the in situ data. The results in terms of the R 2 and RMSE are presented in Figure 9 for the winter day (left panel) and the summer day (right panel). The computed p-values ( Figure 9) shows that R 2 of all models is statistically significant (p-values lower than 0.05). In general, the R 2 of the GP model was higher than that of the other models for the winter and summer days, which were 0.59 and 0.47, respectively, as shown in Figure 9. Additionally, the RMSE of the GP model was smaller than that of the other models for the winter and summer days, at 0.766 and 0.483, respectively. These results indicate that the GP model can estimate Chl-a concentrations more accurately than the other models on both winter and summer days. From Figures 5 and 9, one can see that the validation of the downscaling models with the original MODIS Chl-a and the in situ data provides different levels of accuracy. This discrepancy shows the uncertainties in remote sensing data, indicating that the performance of the downscaling model is greatly affected by the accuracy of the remote sensing reflectance.

Sensitivity Analysis
To explore the sensitivity of the applied models to the changes in input parameters, the sensitivity measure index ( μ * ) was calculated, as shown in Equation (5), for all predictor variables on both winter and summer days (Figures 10 and 11). The results of SA indicate that all the applied models are sensitive to the surface reflectance changes ( μ * values range from 0 to 1.45 for the winter and 0 to 2.42 for the summer day), and this sensitivity varies between the applied models. Figures 10   and 11 show that the 2 nd -degree MPR model is the most sensitive model ( μ * values ranged from 0.09 to 1.24 for the winter and 0.94 to 2.42 for the summer day), while the GP model is the least sensitive model ( μ * = 1.45 and 1.4 for B2/B3 band combination in winter and summer days, respectively) to the changes in the predictor variables. In addition, the sensitivity of the MPR models increases on the summer day compared to that of the winter day, while the GP model shows a slight decrease to the changes of the B2/B3 band combination on the summer day compared to that of the winter day.
It can be concluded that although GP and 2 nd -degree MPR show comparable accuracy ( Figure 5 and Table 3) on both winter and summer days, the GP model with the least sensitivity to the changes in the input parameters is more effective than the 2 nd -degree MPR for downscaling Chl-a. Additionally, SA results show that the accuracy of all models is strongly related to the accuracy of the remote sensing data; this further confirms the need for atmospheric correction as an essential task in the downscaling procedure [71].

Discussion and Conclusions
In the current study, a downscaling framework composed of GP and MPR models was developed to downscale MODIS Chl-a from 4 km to 10 m pixel size using S-2A band combinations at 10 m as predictor variables. The MODIS Chl-a at 4 km was downscaled in four main steps: (i) acquiring MODIS Chl-a and S-2A images at 4 km and 10 m, respectively; (ii) applying feature selection to select the most important S-2A band combinations as predictor variables; (iii) employing the trained MPR with three degrees (2 nd , 3 rd , and 4 th ) and GP model to downscale MODIS Chl-a; (iv) assessment of the results with original MODIS Chl-a maps at a pixel size of 4 km and in situ measurements.
By comparing the performance of all the models applied for downscaling, the GP model presents more accurate results and less sensitivity to the changes in all variables for downscaling Chl-a. The superiority of the GP model over MPR models is related to the flexible formulation structure of the GP compared to the fixed formulation of the MPR models. The flexible structure of the GP allows its function to take any feasible formulation so that it increases the probability of finding the best inputoutput relationship between thousands of formula. Visual comparison of the applied models showed that although the major trend of Chl-a is fairly modeled with GP and 2 nd -degree MPR, but specific and substantially high values are not captured with both models such as the values in near coastal areas. This drawback of the models can be solved with the residual correction that makes it an essential procedure to improve the accuracy of the spatial downscaling model. Among all the applied models, the GP model provided better estimations in coastal areas than all the MPR models. Therefore, GP can serve as a feasible alternative model to estimate Chl-a concentrations in coastal areas with complex characteristics, where water quality monitoring plays a vital role in the protection of marine ecosystems and human health. Moreover, the results indicate that the performance of the models greatly depends on the spatial variability of the MODIS Chl-a concentration. Its distributional characteristics (e.g., normal or skewed) can be a good option for model selection criteria. Therefore, to ensure the validity of the GP model in other coastal areas, it is recommended to assess the normality to select the GP model for spatial downscaling of the MODIS Chl-a.
Although the downscaling procedure used in the current study was applied on two selected days, one in winter and one in summer based on data availability, it produced reasonable results due to using the S-2A dataset with 10 m spatial resolution. Future studies can be focused on extending this procedure at a higher temporal resolution. In addition, more research can be conducted with deep learning techniques for downscaling Chl-a concertation. In addition to the surface reflectance, more predictors such as NDVI and NDWI, due to the strong Chl-a absorption at red-NIR spectral regions, may be used in determining Chl-a concentration.