Mapping Freshwater Chlorophyll-a Concentrations at a Regional Scale Integrating Multi-Sensor Satellite Observations with Google Earth Engine

Monitoring harmful algal blooms (HABs) in freshwater over regional scales has been implemented through mapping chlorophyll-a (Chl-a) concentrations using multi-sensor satellite remote sensing data. Cloud-free satellite measurements and a sufficient number of matched-up ground samples are critical for constructing a predictive model for Chl-a concentration. This paper presents a methodological framework for automatically pairing surface reflectance values from multi-sensor satellite observations with ground water quality samples in time and space to form match-up points, using the Google Earth Engine cloud computing platform. A support vector machine model was then trained using the match-up points, and the prediction accuracy of the model was evaluated and compared with traditional image processing results. This research demonstrates that the integration of multi-sensor satellite observations through Google Earth Engine enables accurate and fast Chl-a prediction at a large regional scale over multiple years. The challenges and limitations of using and calibrating multi-sensor satellite image data and current and potential solutions are discussed.


Introduction
The occurrence of harmful algal blooms (HABs) has increased in U.S. freshwater ecosystems in recent years [1][2][3][4]. Many cyanobacteria species can produce toxins that affect the nerve system, liver, and skin and cause harmful impacts on humans and animals using them for drinking water or recreation [5][6][7]. HABs can also damage freshwater ecosystems, such as polluting beaches, causing taste and odor problems for drinking waters, lowering the ambient light required for submerged aquatic vegetation, and depleting oxygen levels and hence killing fishes [8]. HABs have become one of the major water quality issues for inland waters in some states [9]. The cost of water treatment has been an economic burden in recent decades [1]. Despite the significant negative impacts of HABs on ecosystems, the economy, and public health, they are not monitored and assessed on a regular basis due to the high cost and the sparsity of ground water quality sampling data [1]. Remote sensing has been increasingly used for monitoring and mapping HABs in aquatic systems, as it is capable of collecting synoptic data over multiple spatial and temporal scales [10][11][12][13][14][15][16][17][18][19].
It has been demonstrated that satellite and airborne optical remote sensing can estimate concentrations of, and changes in, parameters such as chlorophyll-a (Chl-a), phycocyanin, and turbidity, which are common indicators used to estimate the presence and intensity of HABs [9, [20][21][22][23]. In recent years, remote sensing has been adopted as a complementary approach to monitoring inland water quality in many applications [1,[24][25][26][27][28][29][30][31]. Although airborne hyperspectral images (e.g., AVIRIS (Airborne Visible-Infrared Imaging Spectrometer), CASI (Compact Airborne Spectrographic Imager)) are generally considered more effective in detecting and mapping water quality at local scales [31,32], their applications are still limited due to the cost, data availability, and processing difficulty due to high dimensionality [33]. For long-term monitoring of HABs at a state or multiple state scales in the U.S., it is advantageous to use images collected by multiple satellite sensors to increase temporal resolution and mitigate cloud cover. Two types of satellite sensors have been used for water quality mapping. The relatively coarse spatial resolution sensors capable of monitoring ocean color, like MERIS (MEdium Resolution Imaging Spectrometer) and MODIS (Moderate Resolution Imaging Spectroradiometer), have the spectral bands needed for detecting HABs in oceans or large lakes. However, their large instantaneous fields of view are not suitable for water quality mapping in small inland water bodies due to the large footprints (300~500 m). In contrast, the finer spatial resolutions (10~60 m) of Landsat, Sentinel-2, and ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) multispectral images enable them to resolve small freshwater lakes and rivers more than a few hundred meters wide. Therefore, the application of multispectral Landsat, ASTER, and Sentinel-2 images has been preferred for freshwater lake mapping projects [24][25][26][27][34][35][36][37].
Several satellites have collected and will continue to collect high resolution (10~30m) multispectral images over the Earth's surface. The United State Geological Survey (USGS), National Aeronautics and Space Administration (NASA), Japan Ministry of International Trade and Industry (MITI), and European Space Agency (ESA) have worked with Google Inc. to make these satellite data available online through the Google Earth Engine (GEE) cloud platform. The data repository of GEE has already incorporated several fine resolution satellite image data assets that have global spatial coverage and span several decades of time since 1984. These include the entire datasets collected by Landsat 4/5/7/8, Sentinel 1/2, and ASTER. The GEE updates its repository on a daily basis with around 6000 new image scenes from current active satellite sensors, making it a near real-time image repository. The cloud storage of satellite image data makes it possible to avoid the time-consuming upload/download procedure. The GEE has an intrinsically parallel computation capability that divides massive tasks into small ones and utilizes many processors to process them individually and in parallel, hence dramatically speeding up the intensive computation required for large-scale mapping applications. The satellite images in the GEE repository are preprocessed to a variety of processing levels and products, such as surface reflectance, top of atmospheric reflectance (TOA), and vegetation indices such as Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI). Google Earth Engine (GEE) is a comprehensive web portal that integrates multi-source satellite image data, high-performance cloud-based computing, and open-source software and algorithms [38,39]. Recently, GEE has enabled a number of long-term large-scale inland water investigations. These include inland surface water dynamics [40,41], analysis of sediments in crater lakes [42] and rivers [43], quantification of colored dissolved organic matter (CDOM) and dissolved organic matter (DOC) in Arctic rivers [44], retrieval of river water Chl-a and turbidity [45], monitoring of HABs in lakes [46], and risk estimation of HABs [47], etc. Despite the potential of the public GEE data repository for water quality mapping over multiple scales, many technical challenges remain. The possible inconsistency between different sensors in their spectral bandwidth specifications, spatial resolutions, and atmospheric correction algorithms may cause difficulty in the integration of Chl-a mapping in freshwater lakes. Thus, it is essential to (1) evaluate whether the public data in the Google Earth Engine data repository have sufficient information to support regional water quality mapping, (2) examine and optimize the spatial and temporal windows/criteria used to search and identify the match-up points between ground water quality samples and corresponding satellite observations to ensure both the quantity and quality of the selected match-up points for the prediction model calibration, and (3) assess if such match-up points improve the predictive model as compared with the traditional approach consisting of data download to a workstation with linear processing and mosaicking before the water quality image maps are uploaded to servers. In this research, we used GEE to automatically search matching image pixels in space and time across multiple sensors for the water quality samples. For the match-up points from different satellite sensors, satellite observations were calibrated and normalized across sensors. A support vector machine (SVM) machine learning model was then calibrated and validated with the match-up points that were automatically identified within the GEE image data repository before its prediction power was evaluated by comparison with in situ samples analyzed in the laboratory. Figure 1 shows the extent of the study area and the locations of water quality samples obtained by the U.S. Army Corps of Engineers (USACE) Louisville District Water Quality Team [48]. In situ water quality data have been consistently collected for 12 USACE lakes in the tri-state region, including 5 in Kentucky (Barren River Lake, Green River Lake, Nolin River Lake, Rough River Lake, and Taylorsville Lake), 4 in Indiana (Brookville Lake, Cagles Mill Lake, Monroe Lake, and Patoka Lake), and 3 in Ohio (Caesar Creek Lake, West Fork Lake, and Harsha Lake) (https://www.lrl.usace.army.mil/ Missions/CivilWorks/Water-Information/Water-Quality/). USACE water quality data provide in situ measurements of surface water temperature ( • C), dissolved oxygen (milligrams per liter), pH, turbidity (nephelometric turbidity units (NTU)), Secchi depth (meter), and surface chlorophyll concentration (micrograms per liter). This water quality dataset covers the time period between May 2013 and November 2017. Chl-a concentration measurements are used in the present study for the development and validation of the SVM predictive models.

Study Area and Data
The USACE water quality data were collected in the months of May, June, July, August, September, and October. The Chl-a concentrations varied from 1.3 to 63.1 µg/L, with an average value of 9.9 µg/L and a standard deviation of 7.8 µg/L. There are seasonal changes in the mean values of the Chl-a samples, indicating the role of climate and some other seasonal factors such as agriculture activities in shaping Chl-a concentrations in these lakes. The average Chl-a values (µg/L) gradually increase from May (7.7) to June (8.2), July (8.4), August (9.1), and reach the peak in September (11.7). It then drops in October (8.4) to a similar level as in June and July.

Multi-Source Data Inquiry Implemented on Google Earth Engine
This study focused on multispectral satellite datasets available in GEE that correspond to the USACE water quality dataset for the Louisville region, including Landsat 7 and 8 and Sentinel-2A and 2B (Table 1). For remote sensing applications, the synchronization between ground sampling and satellite overpasses is important to ensure that the ground sample data and satellite images are from the same water quality status. At the developing stage of algal blooms, the water spectral characteristics could change within several days or even hours. Ideally, the water quality sampling can be planned in advance to take place on a cloud-free day coincident with satellite overpass. However, in reality, many water quality samples do not exactly match cloud-free satellite observations in time due to survey logistics constraints and the uncertainty of weather conditions. The use of multi-sensor images from a large number of satellites orbiting Earth greatly increases the opportunity to find cloud-free satellite observations for a given ground water quality sample survey.
The GEE satellite data repository and computing algorithms make it possible to automatically locate cloud-free image pixels that match the water quality samples in location and time. The search and inquiry script was written in Python based on the "geemap" package that wraps the GEE algorithms (https://github.com/giswqs/geemap). To ensure a sufficient number of match-up points for model development between the ground samples and satellites, a time window is allowed in the temporal matching. Namely, if no corresponding image pixels can be found on the same day of ground water quality data acquisition, the temporally closest image among those images within a time window of the water quality sampling date (i.e., 2 days) is acquired. Table 1 shows the

Multi-Source Data Inquiry Implemented on Google Earth Engine
This study focused on multispectral satellite datasets available in GEE that correspond to the USACE water quality dataset for the Louisville region, including Landsat 7 and 8 and Sentinel-2A and 2B (Table 1). For remote sensing applications, the synchronization between ground sampling and satellite overpasses is important to ensure that the ground sample data and satellite images are from the same water quality status. At the developing stage of algal blooms, the water spectral characteristics could change within several days or even hours. Ideally, the water quality sampling can be planned in advance to take place on a cloud-free day coincident with satellite overpass. However, in reality, many water quality samples do not exactly match cloud-free satellite observations in time due to survey logistics constraints and the uncertainty of weather conditions. The use of multi-sensor images from a large number of satellites orbiting Earth greatly increases the opportunity to find cloud-free satellite observations for a given ground water quality sample survey.  The GEE satellite data repository and computing algorithms make it possible to automatically locate cloud-free image pixels that match the water quality samples in location and time. The search and inquiry script was written in Python based on the "geemap" package that wraps the GEE algorithms (https://github.com/giswqs/geemap). To ensure a sufficient number of match-up points for model development between the ground samples and satellites, a time window is allowed in the temporal matching. Namely, if no corresponding image pixels can be found on the same day of ground water quality data acquisition, the temporally closest image among those images within a time window of the water quality sampling date (i.e., 2 days) is acquired. Table 1 shows the availability of the major GEE satellite image assets and their bands corresponding to Landsat TM. A Python script was written to automatically search in the data archives with given date and coordinates ( Figure S1). We examined cloud-free image vs. sample location/time searches using 2-day and 10-day time windows. No match-up points were found with Landsat-5 images for the time period in which water samples were acquired. Optical satellite images are often affected by clouds and hazes. A simple haze and cloud detection algorithm described in the next section was used to identify image pixels contaminated by clouds and hazes. After excluding these contaminated pixels ( Figure 2), 56 match-up points were obtained for calibrating and evaluating the predictive model. If the temporal search window is expanded from 2 days to 10 days as in [48], all 97 USACE water samples have matching cloud-free image pixels. However, there is concern that the low synchronization between match-up points may not be adequate to support the model construction since water quality conditions can change significantly within 10 days. The use of the multi-sensor satellite image sources in GEE repository increases the number of match-up points for model development, but it is important to understand to what extent the inconsistency between different satellite sources affects the accuracy of the predictive model. for calibrating and evaluating the predictive model. If the temporal search window is expanded from 2 days to 10 days as in [48], all 97 USACE water samples have matching cloud-free image pixels. However, there is concern that the low synchronization between match-up points may not be adequate to support the model construction since water quality conditions can change significantly within 10 days. The use of the multi-sensor satellite image sources in GEE repository increases the number of match-up points for model development, but it is important to understand to what extent the inconsistency between different satellite sources affects the accuracy of the predictive model.

Cloud Masking and Haze Detection
Cloudy pixels were initially masked using the data quality assessment (QA) layer of each satellite data product to ensure that only cloud-free surface reflectance values were used to form match-up points. For example, Sentinel-2 surface reflectance Tier-1 product on GEE has a data layer "QA60" indicating the cloud condition. Those pixels that are indicated to be part of a cloud or cloud shadow are replaced with "NoData" or null. The cloud mask worked for most cloud conditions. However, it cannot detect haze caused by dust and air pollution or thin clouds. As illustrated in Figure 2, the haze did not block ground reflectance measurements of the satellite sensor but still contaminated the ground reflectance by adding path radiance from the haze to all bands. The haze effect is particularly detrimental for water quality mapping because of the low-albedo nature of water pixels. Therefore, a haze detection algorithm should be used to remove the pixels with serious haze contamination. Most haze/cloud detection algorithms, such as the Normalized Difference Vegetation Index (NDVI), Normalized Difference Snow Index (NDSI), Whiteness, and B4/B5 ratio [49], were developed for land features, and they are not suitable to detect the haze over water pixels. Our extensive experiments show that the blue band is most sensitive to the haze contamination and it is most likely to be contaminated if surface reflectance in the blue band is greater than 0.15. Therefore, we used a simple histogram slicing algorithm for haze detection and assigned a haze score of 0 for pixels with R blue ≤ 0.15 and a haze score of 1.0 for pixels with R blue > 0.15, where R blue is the surface reflectance at the blue band.

GEE Surface Reflectance Data Validation
Removal of atmospheric interference is vital for water quality mapping because the majority of the total upwelling radiance over a water body received by the satellites originates from the atmospheric scattering [50,51]. The surface reflectance products in the GEE data repository were processed using different atmospheric correction algorithms. The Landsat 8 OLI images were corrected using the Land Surface Reflectance Code (LaSRC), which is a heritage of the MODIS and Landsat TM and ETM+ products [52]. To ensure the quality of satellite data, we compared the GEE OLI surface reflectance product against water surface reflectance collected by the ASD (Analytical Spectral Devices, Inc.) spectroradiometer in the field.
In total, 56 water samples on 21 September 2015 and 20 water samples on 9 October 2016 were collected with an ASD spectroradiometer in the wavelength range 350-1025 nm at 1 nm spectral resolution when Landsat 8 passed over Harsha Lake and Caesar Creek Lake in Ohio [46]. At each water sampling site, 40 spectral reflectance readings were collected by the ASD spectroradiometer and then processed to generate one representative reference spectrum. To evaluate the GEE Landsat 8 surface reflectance data, in situ ASD reference spectra collected at water sampling points were integrated to the five Landsat 8 bands (B1-B5) as ground references free of atmospheric influence, based on the Landsat 8 Spectral Response Functions (https://landsat.gsfc.nasa.gov/preliminary-spectral-responseof-the-operational-land-imager-in-band-band-average-relative-spectral-response/). Bands 6 and 7 of Landsat 8 images were not included because their wavelengths exceeded the ASD spectral range. The average RMS% (relative root mean square) was used as the criterion for quality assessment: where R i (ASD) is the in situ ASD reference reflectance for band i and R i (image) is the reflectance in band i of the Landsat surface reflectance image.

Cross-Sensor Calibration between MSI and OLI over Water Bodies
In order to establish a cross-sensor calibration of at-surface reflectance for Sentinel-2 MSI and Landsat 8 OLI multispectral imagers, we examined their differences and then developed an empirical model to normalize them. The multispectral optical image products from Sentinel-2 MSI and Landsat 8 OLI were processed using different atmospheric correction algorithms. It was reported that the Sentinel-2 MSI level-2 surface reflectance product did not work well with the water quality data due to the algorithm used for atmospheric correction [48]. The Landsat 8 OLI images in GEE were corrected using the Land Surface Reflectance Code (LaSRC), which is a heritage of the MODIS and Landsat TM and ETM+ products [52]. In contrast, Sentinel-2 products used the Sentinel 2 Correction (Sen2Cor) algorithm. As shown in Figure 3, Landsat 7 ETM+ and Landsat 8 OLI images are statistically comparable and similar over water quality sample sites but are quite different from Sentinel-2 MSI surface reflectance values ( Figure 3). As marked in Figure 3, the mean value of Landsat 8 OLI is 0.284, while the mean value of Sentinel-2 MSI is 0.969. The differences can be attributed to different band radiometric characteristics (0.48 vs. 0.49 um), spatial resolutions (30 vs. 10 m), and atmospheric correction algorithms (LaSRC vs. Sen2Cor). Therefore, direct use of the two surface reflectance products from Landsat and Sentinel-2 sensors without cross-sensor calibration may bias the predictive models. To deal with the cross-sensor difference between these two products, the United States National Aeronautics and Space Administration (NASA) simulated the two instruments from hyperspectral data collected by the Hyperion instrument over 500 individual sites around the world and fitted linear equations to adjust the surface reflectance of each Sentinel-2 band to match the corresponding Landsat-8 band. NASA has released a consistently combined product for Landsat 8 and Sentinel-2 imagery [53,54]. Barsi et al. [55] used pseudo-invariant calibration sites (PICS) to calibrate an empirical equation to reconcile the cross-sensor differences between Sentinel-2 MSI top-ofatmosphere reflectance with Landsat 8 OLI. With a similar approach, we used a linear regression model to remove the systematic bias between Sentinel-2 MSI and Landsat 8 OLI images.
From the multi-sensor inquiry on GEE, overlapping Sentinel-2 MSI images and Landsat images (either OLI or ETM+) at 16 water sample sites were identified and used as radiometric tie points. After removal of one outlier from these radiometric tie points, a linear regression equation was fitted for each surface reflectance band ( Figure 4). All regression equations were statistically significant, with coefficients of determination (R 2 ) above 0.45. Bands 4 and 6 have slightly lower R 2 values compared to other bands. Band 7 has the highest R 2 value. Using these linear equations, the Sentinel-2 MSI surface reflectance values were converted and normalized to Landsat reflectance values and merged with other Landsat OLI or ETM+ samples for training the predictive model. To deal with the cross-sensor difference between these two products, the United States National Aeronautics and Space Administration (NASA) simulated the two instruments from hyperspectral data collected by the Hyperion instrument over 500 individual sites around the world and fitted linear equations to adjust the surface reflectance of each Sentinel-2 band to match the corresponding Landsat-8 band. NASA has released a consistently combined product for Landsat 8 and Sentinel-2 imagery [53,54]. Barsi et al. [55] used pseudo-invariant calibration sites (PICS) to calibrate an empirical equation to reconcile the cross-sensor differences between Sentinel-2 MSI top-of-atmosphere reflectance with Landsat 8 OLI. With a similar approach, we used a linear regression model to remove the systematic bias between Sentinel-2 MSI and Landsat 8 OLI images.
From the multi-sensor inquiry on GEE, overlapping Sentinel-2 MSI images and Landsat images (either OLI or ETM+) at 16 water sample sites were identified and used as radiometric tie points. After removal of one outlier from these radiometric tie points, a linear regression equation was fitted for each surface reflectance band ( Figure 4). All regression equations were statistically significant, with coefficients of determination (R 2 ) above 0.45. Bands 4 and 6 have slightly lower R 2 values compared to other bands. Band 7 has the highest R 2 value. Using these linear equations, the Sentinel-2 MSI surface reflectance values were converted and normalized to Landsat reflectance values and merged with other Landsat OLI or ETM+ samples for training the predictive model.

Machine Learning Model for Chl-a Mapping
Previously, various empirical models have been tested to map Chl-a concentrations [17,24,34,56,57]. Tebbs et al. [24] developed a linear regression model for mapping Chl-a in Lake Bogoria, Kenya, based on time-series Landsat ETM+ images and monthly in situ measurements of Chl-a for the period from November 2003 to February 2005. They found that the performance of their model was sensitive to the selection of bands or band ratio combinations. In general, blue to green spectral bands are better for estimating Chl-a concentrations in clear (oligotrophic) water conditions [16,58], while red and near-infrared bands are preferable in high-biomass and turbid coastal and inland waters [19,58,59]. The major criticism for the empirical regression models is that the assumption of a linear relationship between spectral reflectance and Chl-a may not hold for waters with a complicated optical condition, particularly for inland and coastal waters with relatively high biomass (Chl-a > 20 μg/L) [7].
Machine learning algorithms (e.g., support vector machines, neural networks, decision trees) have been increasingly used in mapping Chl-a in the past two decades, due to their robustness to image noise and lack of strong assumptions about data samples (e.g., normal distribution) [60][61][62][63][64][65][66]. Among machine learning algorithms, the support vector machine (SVM) method has its merit in requiring fewer samples because its learning power is determined by the data samples at cluster borders to define the "hyperplane" [67]. It has been demonstrated that the SVM has better performance than other machine learning models under small-sample scenarios [64,68]. In this case study, the 2-day search window resulted in a relatively small sample set consisting of 56 match-up points. Therefore, the SVM algorithm was selected as the predictive model because of its superior performance in a small-sample condition [68], which represents a major challenge for most water quality mapping projects. The SVM regression model was set with a radial basis function (RBF) kernel using the R "e1071" package. All of the variables were standardized to avoid any bias caused by unit differences. The model parameters, cost and gamma, were fine-tuned using a genetic algorithm (GA) by setting the number of populations to be 50 and the number of generations to be 100. The crossover probability was set to 0.8, and the mutation probability was set to 0.1. The prediction accuracy of the fine-tuned SVM model was assessed using an independent validation dataset. The match-up points were randomly split into training and validation sets with 75% and 25% of all samples. The SVM model

Machine Learning Model for Chl-a Mapping
Previously, various empirical models have been tested to map Chl-a concentrations [17,24,34,56,57]. Tebbs et al. [24] developed a linear regression model for mapping Chl-a in Lake Bogoria, Kenya, based on time-series Landsat ETM+ images and monthly in situ measurements of Chl-a for the period from November 2003 to February 2005. They found that the performance of their model was sensitive to the selection of bands or band ratio combinations. In general, blue to green spectral bands are better for estimating Chl-a concentrations in clear (oligotrophic) water conditions [16,58], while red and near-infrared bands are preferable in high-biomass and turbid coastal and inland waters [19,58,59]. The major criticism for the empirical regression models is that the assumption of a linear relationship between spectral reflectance and Chl-a may not hold for waters with a complicated optical condition, particularly for inland and coastal waters with relatively high biomass (Chl-a > 20 µg/L) [7].
Machine learning algorithms (e.g., support vector machines, neural networks, decision trees) have been increasingly used in mapping Chl-a in the past two decades, due to their robustness to image noise and lack of strong assumptions about data samples (e.g., normal distribution) [60][61][62][63][64][65][66]. Among machine learning algorithms, the support vector machine (SVM) method has its merit in requiring fewer samples because its learning power is determined by the data samples at cluster borders to define the "hyperplane" [67]. It has been demonstrated that the SVM has better performance than other machine learning models under small-sample scenarios [64,68]. In this case study, the 2-day search window resulted in a relatively small sample set consisting of 56 match-up points. Therefore, the SVM algorithm was selected as the predictive model because of its superior performance in a small-sample condition [68], which represents a major challenge for most water quality mapping projects. The SVM regression model was set with a radial basis function (RBF) kernel using the R "e1071" package. All of the variables were standardized to avoid any bias caused by unit differences. The model parameters, cost and gamma, were fine-tuned using a genetic algorithm (GA) by setting the number of populations to be 50 and the number of generations to be 100. The crossover probability was set to 0.8, and the mutation probability was set to 0.1. The prediction accuracy of the fine-tuned SVM model was assessed using an independent validation dataset. The match-up points were randomly split into training and validation sets with 75% and 25% of all samples. The SVM model trained with the training set was assessed by the validation set. The root mean squared error (RMSE) and the mean absolute percentage error (MAPE) defined below were used to evaluate the model performance under different data scenarios.
where y i is the observed value andŷ i is the predicted value. The model is calibrated by minimizing the RMSE of the training data. The minimization might overfit the model; that is, even though the RMSE of the training data is small, that of the validation data remains large. Therefore, both RMSEs of the training data and the validation data are reported.

Scenario Tests of the Multi-Sensor Approach with Different Search Windows
Xu et al. [69] used a much larger temporal window to form match-up points for constructing empirical models for predicting Chl-a concentration. For 97 water samples, Landsat 8 OLI images were manually searched to identify images closest in time to the samples. Even though radiometric calibration and atmospheric correction were carefully applied, the RMSE of the prediction model (6.17 µg/L) is higher than the result in this research. Xu et al. [69] only used one satellite sensor (Sentinel-2 MSI) to form match-up points for model construction. Because the search was constrained to a single sensor, the water samples and the paired satellite reflectance measurements may not capture the same water conditions.
Using multi-sensor data can shorten the temporal gap between satellite overpass and ground sampling. However, the sensor configurations and spectral responses of different sensors might introduce uncertainty to the data as well. Therefore, it is important to check which way might bring more uncertainty-adding more sensors or allowing for larger time gaps. To answer this question, we compared the model prediction performance under five different data scenarios (S#) composed by different sensor and search window combinations ( Table 2): 2-10 OLI, ETM+, and MSI 32 S1: Use 97 match-up points that are formed by searching the readily available multi-sensor (OLI, ETM+, and MSI) surface reflectance products in GEE with a 10-day window to identify the paired image pixels for ground water samples.
S2: Use the 56 match-up points that are formed by only searching the OLI surface reflectance products in GEE with a 10-day window.
S3: Use the 56 match-up points that are formed by searching the multi-sensor (OLI, ETM+, and MSI) surface reflectance products in GEE with a 2-day window.
S4: Use the 32 match-up points that are formed by searching the OLI surface reflectance products in GEE with a time window between 2 days and 10 days from the corresponding ground sample dates.
S5: Use the 32 match-up points that are formed by searching the multi-sensor (OLI, ETM+, and MSI) surface reflectance products in GEE with a time window between 2 days and 10 days from the corresponding ground sample dates.
Among these data scenarios, S1 is the baseline scenario because it uses cross-sensor data and a comparable time window with Xu et al. [69]. The comparisons between S2 and S3 and between S4 and S5 scenarios show the effect of sensor differences on the performance of the prediction model, and the comparisons between S2 and S4 and between S3 and S5 show the effect of larger time gaps (> 2 days but up to 10 days) for pairing match-up points on the model performance. These scenario studies are helpful to answer the research questions of the paper.

Surface Reflectance Validation
The datasets on GEE were provided as a bulk. Therefore, there might be some uncertainty in the data products, which could affect the model prediction of Chl-a. The uncertainty of some images used in this research was measured by using the field data collected from a spectrometer. The averaged RMS% was computed for each spectral band from field sampling points (Table 3). Among the 20 sampling sites across the lake, half of them have a turbidity value close to 0 NTU and, at those sites, GEE Landsat 8 reflectance data generated extremely large average RMS% values at bands 1-5, which were 84.17%, 70.07%, 56.85%, 52.97%, and 61.23%, respectively. If excluding these sites, the GEE Landsat 8 reflectance at the remaining sites matches well with the in situ ASD reflectance measurements at bands 1-4, with average RMS% less than 25%, as shown in Table 3. This indicates that the GEE Landsat 8 reflectance data could be problematic for the water quality monitoring of very clear waters.
Harmful algal blooms frequently occur on Harsha Lake. GEE Landsat 8 reflectance at sampling sites of Harsha Lake yielded average RMS% less than 30% at bands 2-4. In both cases, the quality of GEE Landsat 8 reflectance data (bands 1-4) is acceptable. In particular, reflectance at bands 2-4 is often more accurate than reflectance at band 1, and band 5 reflectance is the least accurate, indicated by the very large average RMS% (greater than 60%). Atmospheric correction over the water surface is particularly more difficult than other land cover types because of the low signal-to-noise ratio of satellite images over water bodies. The validation of the OLI surface reflectance product using ASD surface reflectance indicates that the quality of Landsat 8 OLI surface reflectance data from GEE is sufficient for modeling Chl-a. These results show the necessity of normalizing Sentinel-2 MSI to Landsat 7 ETM+ and Landsat 8 OLI surface reflectance using the cross-sensor calibration.

SVM Model Performance under Different Data Scenarios
The SVM model parameters and accuracy reports under the five data scenarios discussed in the Methods section (Table 2) are listed in Table 4 below. The following facts can be observed from the table.
First, the SVM model accuracy under scenario S1 is better than that reported in [69]. For the match-up point formation, scenario S1 uses the same length of temporal search window (10 days) as in [64]. S1 uses the GEE image assets to find the paired image pixels, instead of conducting user-specified atmospheric corrections as in [64]. In addition, S1 used multiple sensors for finding the best match-up points. This result indicates that the direct use of GEE image assets can skip atmospheric correction-related image preprocessing tasks and make it much more convenient to establish the prediction model for multi-site and multi-date Chl-a mapping. The multi-sensor approach reduces the time gaps between the ground samples and satellite overpassing and therefore improves the data quality for training the predictive models. At the same time, the use of an SVM machine learning model may have contributed to the prediction accuracy improvement. It is worth noting that S1 has the lowest MAPE value, indicating that even if it does not produce the smallest RMSE, it has the best accuracy relative to the magnitude and variance of Chl-a. Especially, the RMSEs of the training and validation datasets are similar in S1, which suggests that the model was not overfitted. Second, S3 has lower RMSE and MAPE than S2. These two scenarios have the same number of samples. S3 used the multi-sensor search, while S2 only searched OLI images. This indicates that the reduction in temporal gaps between ground samples and satellite observations for the paired match-up points by using the multi-sensor approach improved the predictive model performance.
Third, the overfitting problem is present in S2-S5, but not in S1. The overfitting problem is indicated by the significantly larger RMSE value for the validation than that for the training set ( Table 4). The overfitting problem may be caused by the small number of match-up points in S2-S5 (less than 60), although our use of the GA algorithm in assisting SVM parameter calibration mitigates the overfitting. It is worth noting the water samples used in this research were collected before Sentinel-2 satellites were launched in March 2017. It is expected that the chance of pairing future ground samples with satellite data could be greatly improved by the constellation of Sentinel-2 satellites.
Lastly, in S1, all ground water samples are paired with cloud-free multi-sensor satellite observations within the 10-day window, and 97 match-up points are formed for the model development. Although the quality of match-up points could be compromised by the larger temporal time window, the model prediction accuracy in S1 is still the best among the data scenarios. This result suggests that it is a valid option to use a large temporal window (e.g., 10 days) to ensure a sufficient number of match-up points (e.g., > 80) for the prediction model development when the match-up rate of the ground samples is low, at the same time using multi-sensor data to minimize the time gaps. In short, S1 provides enough training and evaluation match-ups for our SVM/GA algorithm to effectively map Chl-a concentrations at a regional scale with reasonable accuracy in this study.

Predicting Chl-a in an Unsampled Lake within the Study Area
The SVM model was trained using the sample data to predict Chl-a concentration in the study area with the images obtained from the GEE image repositories. One of the unsampled lakes, Green River Lake in Kentucky, is displayed in Figure 5 to show the predicted Chl-a concentration from a Sentinel-2A image acquired on 28 August 2016. The map shows a spatially coherent distribution of Chl-a concentration in the lake. Such spatial coherence is expected and follows the First Law of Geography [70] regarding the quality of spatial data.

Comparison with Other Studies
Recent research by Zhang et al. [71] applied an SVM model with an RBF kernel on Landsat 8 OLI images to estimate the chlorophyll-a concentrations of multiple lakes in China. Their 90 matchup samples were formed using Landsat OLI images acquired on two cloud-free days, respectively, in December 2017 and in March 2018, over five lakes. Water samples acquired on the exact same days of two Landsat OLI overpasses were used to form their match-up points, and six spectral bands were used in the model construction. They reported an RMSE of 22.636 μg/L. Although we have a similar number of samples and the similar SVM with an RBF kernel to theirs, the RMSE of our SVM model (4.424 μg/L) is considerably smaller than theirs. The high absolute RMSE in Zhang et al. can be attributed to the "extremely high" Chl-a concentration (over 200 μg/L) in some lakes of their study area. In a water quality assessment project in the U.S., Xu et al. [69] manually corrected Landsat 8 OLI images for calibrating empirical models with the same set of water samples as in our study, and they reported a comparable RMSE of 6.17 μg/L. Therefore, our model evaluation shows that the quality of GEE public data assets is sufficient for mapping freshwater Chl-a in a large geographical region. In other words, the readily available surface reflectance products in the GEE repository can be used for large-scale water quality mapping applications, and users do not have to go through the complicated atmospheric correction of raw image products on their own. The direct use of GEE satellite data products for water quality applications not only saves a lot of image processing time associated with atmospheric correction but also makes the model predictions reproducible and comparable and makes effective and routine water quality monitoring feasible and affordable. Cross-sensor

Comparison with Other Studies
Recent research by Zhang et al. [71] applied an SVM model with an RBF kernel on Landsat 8 OLI images to estimate the chlorophyll-a concentrations of multiple lakes in China. Their 90 match-up samples were formed using Landsat OLI images acquired on two cloud-free days, respectively, in December 2017 and in March 2018, over five lakes. Water samples acquired on the exact same days of two Landsat OLI overpasses were used to form their match-up points, and six spectral bands were used in the model construction. They reported an RMSE of 22.636 µg/L. Although we have a similar number of samples and the similar SVM with an RBF kernel to theirs, the RMSE of our SVM model (4.424 µg/L) is considerably smaller than theirs. The high absolute RMSE in Zhang et al. can be attributed to the "extremely high" Chl-a concentration (over 200 µg/L) in some lakes of their study area. In a water quality assessment project in the U.S., Xu et al. [69] manually corrected Landsat 8 OLI images for calibrating empirical models with the same set of water samples as in our study, and they reported a comparable RMSE of 6.17 µg/L. Therefore, our model evaluation shows that the quality of GEE public data assets is sufficient for mapping freshwater Chl-a in a large geographical region. In other words, the readily available surface reflectance products in the GEE repository can be used for large-scale water quality mapping applications, and users do not have to go through the complicated atmospheric correction of raw image products on their own. The direct use of GEE satellite data products for water quality applications not only saves a lot of image processing time associated with atmospheric correction but also makes the model predictions reproducible and comparable and makes effective and routine water quality monitoring feasible and affordable. Cross-sensor calibration is still necessary, but it does not add much complexity if the linear regression in Section 3.4 is used.

Chl-a Sample Data Variability with Various Time Intervals
We conducted some tests of the change in Chl-a samples with different time windows using the sites sampled on multiple dates in the same year. The sampling time intervals include 1, 2, 3, 28,30,39,40,48,49,55,58, or 67 days based on the USACE reference data. The average change in Chl-a (µg/L) and percentage of change are displayed in Table 5. The sampling interval up to three days had little impact (less than 10%) on the sample values. However, when the interval was too long (30 days), the percent difference could be as large as 42.3%. Unfortunately, our samples did not have the 10-day interval measurement, and therefore the data could not show the impact of the 10-day interval on the Chl-a samples. The data scenario S1 used the 10-day window to search the multi-sensor datasets, and the accuracy was the highest among all scenarios. This indicates that a 10-day window is a valid option if the number of samples is too low.

Summary and Conclusions
Google Earth Engine (GEE) provides data repositories and scripting tools for users to conduct cloud-based image processing and inquiry. Mapping Chl-a in freshwater lakes often faces the challenge of a small number of samples that can be paired with satellite images. This research demonstrates a workable solution to maximize match-up points by searching cloud-free and haze-free image pixels from the GEE data repositories using online scripts. The support vector machine (SVM) model trained using the match-up points, consisting of ground water quality samples paired with GEE image pixels returned from the inquiries, can predict Chl-a concentration with reasonable accuracy. The major findings are summarized below:

1.
Google Earth Engine greatly facilitates the pairing of satellite surface reflectance image pixels with corresponding field water quality samples to form match-up points for predictive model development. The cloud-based inquiry supported by GEE makes it much more efficient to use Landsat 7 ETM+ for land resource mapping. In our case, we found 22 match-up points by pairing Landsat 7 ETM+ pixels with the water quality samples, which was around one-third of the 56 match-up points used for training the SVM model.

2.
The RMSE of Chl-a of the SVM model trained by the data obtained from single-source (OLI only) imagery was 4.42 µg/L (compared with 6.17 µg/L in a previous project report using the same sample data). It is evident that the GEE image product is reliable for water quality mapping.

3.
A smaller temporal search window (two-day window) for pairing field water samples with the multi-sensor satellite images in GEE data repositories improves the model prediction accuracy, but the improvement is not significant and reduces the number of match-up training and validation sample/image pixel pairs, which introduces model overfitting.

4.
The use of multi-sensor image data from GEE improves the data match-up between ground samples and satellite images and therefore improves the model prediction accuracy.

5.
For mapping water quality parameters over a multistate region, the number of match-up points needs to be large enough to avoid the model overfitting bias. In our case, the number of match-up points from three states and 12 lakes should be in the order of 90 to avoid overfitting. Models with less than 60 match-up points may suffer from the overfitting problem.
Overall, GEE with reflectance normalization between satellites and sufficient water quality sample data in combination with our new SVM/GA model results in an effective, affordable, and feasible satellite water quality monitoring system for inland waters at regional scales. This research can serve as a case study and example for future water quality mapping using GEE. It is worth noting that although Landsat 5 TM surface reflectance was not included in the experiments of this research, it remains comparable with Landsat 7 ETM+, Landsat 8 OLI, and Sentinel-2A and B. A slight modification of the Python codes developed in this research will allow similar queries to be performed on Landsat 5 TM datasets.