Evaluation of MODIS-Aqua Atmospheric Correction and Chlorophyll Products of Western North American Coastal Waters Based on 13 Years of Data

There is an increasing need for satellite-derived accurate chlorophyll-a concentration (chla) products to improve fisheries management in coastal regions. However, the methods used to derive these products have to be evaluated, so the associated uncertainties are known. The performance of three atmospheric correction methods, the near infrared (NIR), the shortwave infrared (SWIR), and the Management Unit of the North Seas Mathematical Models with an additional modification (MUMM + SWIR), and derived chla products based on the Moderate Resolution Imaging Spectroradiometer AQUA (MODIS) images acquired from 2002 to 2014 over the west coast of Canada and the United States were evaluated. The atmospherically corrected products and above-water reflectance were compared with in situ AERONET (N ~ 650) and above-water reflectance (N ~ 34) data, and the Ocean Color 3 MODIS (OC3M)-derived chla were compared with in situ chla measurements (N ~ 82). The statistical analysis indicated that the MUMM + SWIR method was the most appropriate for this region, with relatively good retrievals of the atmospheric products, improved retrieval of remote sensing reflectance with bias lower than 20% for the OC3M bands, and improved retrievals of chla (r = 0.83, slope = 0.89, logRMSE = 0.33 mg m−3 for ±1 h). The poorest chla retrievals were achieved with the SWIR and NIR methods. These results represent the most comprehensive satellite data analysis of MODIS retrievals for this region and provide a framework for the MUMM + SWIR method that can be further tested in other coastal regions of the world.


Introduction
There is a need for improved monitoring of dynamic coastal processes including productivity, critical habitats, and fisheries given the effects of increasing human pressures and a changing climate [1].Traditional methods for monitoring coastal water properties typically rely on in situ sampling from a ship or buoy based systems, which are often prohibitively costly and spatio-temporally limited.Inability to effectively monitor and characterize dynamic zones poses a significant barrier, for instance, to fisheries management.As an example, in the west coast of Canada, improved understanding of the impacts of bottom-up forcing on fish populations requires long-term spatio-temporal productivity data [1].Long-term data derived from ocean colour satellites, for instance, MODIS-Aqua, offer an unparalleled tool for synoptic surface chlorophyll-a concentration (chla) associated with high (near daily) sampling frequencies, thus providing data at the required resolution for improving ecosystem-based fisheries management [2,3].However, effective use of these data has some challenges, including effective removal of atmospheric interference from at-sensor signals and development of robust satellite-based chla algorithms [4][5][6].Many models are available for atmospheric correction (e.g., [7][8][9][10][11]) and chla retrieval (e.g., [4] for a review of several methods) from ocean colour images.However, a combination of the most appropriate models is required for robust products desired by the user community [12].
The objective of this research is to evaluate three atmospheric correction methods for MODIS-Aqua (hereafter MODIS) and derived chla products for the estuarine system, the Salish Sea, located on the west coast of Canada and the United States.To accomplish this, the first step was to process MODIS imagery acquired from 2002 to 2014 based on three different atmospheric correction methods: (i) the NASA algorithm using a 2-Band relative humidity based model selection and a iterative NIR correction (hereafter, "NIR" method); (ii) the NASA algorithm with a substitution of SWIR bands (1240 nm, 2130 nm) ("SWIR" method); and (iii) the GW94-based algorithm that assumes spatial homogeneity of the atmosphere over the region of interest [10], but with an additional modification using advantages of the SWIR bands to estimate NIR aerosol contributions ("MUMM + SWIR" method).The different methods were evaluated in comparison to AERONET data and in situ above-water spectral data.Subsequently, atmospherically corrected images were subjected to the OC3M chlorophyll algorithm and retrievals assessed in comparison to in situ chla at the different match up time windows.

Study Area
The Salish Sea, composed of the Strait of Georgia, Puget Sound, and Strait of Juan de Fuca systems, is a large, deep (max ~400 m, mean ~150 m), estuarine-forced temperate sea, located on the Pacific southwest coast of Canada and northwest coast of the US, and it is connected to the Pacific Ocean by outlets at its north and south extremities (Figure 1).A key feature of this region is inputs from the Fraser River, which discharges ~140 km 3 of fresh water and ~20 megatons of sediment annually [13], providing up to 80% of its freshwater inputs [14].These inputs drive southward estuarine circulation, forming brackish surface plumes that dominate the southern and central portions of the Strait of Georgia [15].
High particulate inputs in the Salish Sea produce optically complex waters [16] with the highest light attenuation due to suspended matter occurring particularly in the spring and summer [16,17].Sutton et al. [18] demonstrated spatial variability in the suspended matter, in that northern waters are dominated by phytoplankton when compared to the central and southern regions; this is similarly observed with bio-optical data [16].Biologically, this region exhibits typical temperate diatom-dominated spring blooms followed by weaker fall bloom events [19], where primary assemblages are Thalassiosira spp., Skeletonema costatum, and Chaetoceros spp.[20].The timing of the spring bloom is observed to vary interannually [19, 21,22], mediated by light availability [21], wind events controlling the mixed-layer depth, and timing of outflow from the spring freshet [19].Dinoflagellates are the second most abundant group of phytoplankton, dominating the total biomass in the summer and early fall [23] when productivity is nitrate-limited [24].During winter months, nutrients tend to be above limitation level, creating a situation where phytoplankton growth is light-limited [19].
The aerosol optical depth distribution in the northeastern Pacific is mostly due to biogenic and sea salt sulphate emissions [25].In the Salish Sea, significant loads of aerosols are emitted from major urban sites in the southern and central regions [26,27].These tend to be carbonaceous and roughly divided into either "organic" or "black" groups.While both are emitted from biomass and fossil fuel combustion, organics tend to scatter while black aerosols strongly absorb radiation [28].

Satellite Data and Image Processing
All available Level-1A files of the Salish Sea (June 2002-2014) were downloaded from the Ocean Biology Processing Group (OBPG) at ~1 km 2 nadir spatial resolution.A total of 3396 L1A files were evaluated and selected for further processing.The L2GEN program in the SeaWiFS Data Analysis System 6.4 (SeaDAS) was used to generate the standard Level-2 files containing water-leaving reflectance, spectral aerosol optical thickness, and Å ngstrom exponents, required for the evaluation of the atmospheric correction methods.

Atmospheric Correction Approach
The MODIS sensor-measured reflectance at top of atmosphere (ρ  ) is expressed as the sum of atmospheric and water contributions [7]: Here, ρ  (λ) is water-leaving reflectance containing the useful water properties information to be isolated from ρ  (λ) through atmospheric correction [7,10]; t(λ) is the diffuse transmittance of the atmosphere, available through lookup tables; and ρ  represents reflectance from sun glint and whitecaps, and is estimated by using ancillary data, including windspeed and solar and sensor geometry, and recommended flags [29].Confounding signal contributions detected by the satellite originate from absorption and scattering by gases (ρ  (λ)) and aerosols (ρ  (λ)) [30], as well as gas and aerosol interactions (ρ  (λ)) [31].The ρ  (λ) term is accurately derived from lookup tables computed for different solar and viewing geometries, atmospheric pressure, and wind speed [32].The ρ  (λ) + ρ  (λ) terms, however, are highly variable and cannot be predicted a priori.
The Gordon and Wang's approach [7] solves Equation (1) by quantifying aerosol reflectance, ρ  (λ) at two NIR bands where water-leaving reflectance is assumed to be zero (t(λ)ρ  (λ) ≈ 0) due to high water absorption.One band is used to evaluate magnitude of aerosol contribution and the second for evaluating wavelength dependence [7].Any detected signal at these bands is assumed to

Satellite Data and Image Processing
All available Level-1A files of the Salish Sea (June 2002-2014) were downloaded from the Ocean Biology Processing Group (OBPG) at ~1 km 2 nadir spatial resolution.A total of 3396 L1A files were evaluated and selected for further processing.The L2GEN program in the SeaWiFS Data Analysis System 6.4 (SeaDAS) was used to generate the standard Level-2 files containing water-leaving reflectance, spectral aerosol optical thickness, and Ångstrom exponents, required for the evaluation of the atmospheric correction methods.

Atmospheric Correction Approach
The MODIS sensor-measured reflectance at top of atmosphere (ρ t ) is expressed as the sum of atmospheric and water contributions [7]: Here, ρ w (λ) is water-leaving reflectance containing the useful water properties information to be isolated from ρ t (λ) through atmospheric correction [7,10]; t(λ) is the diffuse transmittance of the atmosphere, available through lookup tables; and ρ g represents reflectance from sun glint and whitecaps, and is estimated by using ancillary data, including windspeed and solar and sensor geometry, and recommended flags [29].Confounding signal contributions detected by the satellite originate from absorption and scattering by gases (ρ r (λ)) and aerosols (ρ a (λ)) [30], as well as gas and aerosol interactions (ρ ra (λ)) [31].The ρ r (λ) term is accurately derived from lookup tables computed for different solar and viewing geometries, atmospheric pressure, and wind speed [32].The ρ a (λ) + ρ ra (λ) terms, however, are highly variable and cannot be predicted a priori.
The Gordon and Wang's approach [7] solves Equation (1) by quantifying aerosol reflectance, ρ a (λ) at two NIR bands where water-leaving reflectance is assumed to be zero (t(λ)ρ w (λ) ≈ 0) due to high water absorption.One band is used to evaluate magnitude of aerosol contribution and the second for evaluating wavelength dependence [7].Any detected signal at these bands is assumed to correspond to atmospheric contributions to total signal.The effects of aerosols and Rayleigh-aerosol interactions, ρ a (λ) + ρ ra (λ), are then estimated at the two NIR bands from sensor-measured radiances, computed Rayleigh scattering, and estimated whitecap contributions.ρ ra (λ) is zero where radiation is only scattered once by either aerosols or air.This is true for low aerosol optical thicknesses (sensor near nadir); therefore in Equation ( 1), ρ a (λ) + ρ ra (λ) are replaced by aerosol single scattering reflectance, ρ as (λ) [33,34]: ρ as (λ) is used to calculate the single scattering epsilon, ε(λi,λj), which is in turn used to define the aerosol model to correct the visible wavelengths.ε(λi,λj) is defined as: where (λi,λj) represent the shorter and longer wavelengths, respectively.The value of ε(λi,λj) characterizes the spectral variation of the aerosol extinction coefficient, which includes the aerosol optical thickness, single scattering albedo, and aerosol phase function.It is then used to retrieve appropriate atmospheric optical properties in the visible wavelengths through predefined lookup tables [8].Three atmospheric correction methods were investigated for their ability to characterize and correct for atmospheric aerosols: (1) the 2-Band iterative NIR, which is the standard NASA correction method; (2) the SWIR, which has shown improved results in highly turbid waters; and (3) the MUMM + SWIR, which is a modified method and, in its original form, has also shown improved results for turbid waters.
Method 1 (NIR): The 2-Band iterative NIR correction algorithm builds on the black-pixel assumption approach [7], and is designed to address algorithm failure occurring in high particulate backscattering waters [35,36].The method uses an iterative bio-optical model considering convergence to NIR reflectance to quantify particulate contributions to the R rs (NIR).From that, the effects of aerosols at the two NIR bands is then extrapolated and removed in the visible spectra through predefined lookup tables; further details of this method can be found in the literature [8].
Method 2 (SWIR): First proposed by [9], this method replaces the assumption of black pixels in NIR with the SWIR bands, 1240 and 2130 nm, in highly turbid waters where the NIR method fails.This is generally valid, as water absorption is several orders of magnitude larger at 2130 nm (2200 m −1 ) than in the NIR (5 m −1 ) [37].
Method 3 (MUMM + SWIR): The MUMM method uses an approach whereby the assumption of zero water-leaving reflectance in the NIR bands is replaced by the assumption of spatial homogeneity of the aerosol type in the region of interest [10].The method requires a priori knowledge of the normalized reflectance ratios for water at two NIR bands, 748 nm and 869 nm, α(λi,λj), and aerosol, ε(λi,λj), then the reflectance from aerosols and water can be separated in the Rayleigh-corrected reflectance [10].
Using an estimated value of ε(748, 869) and α, the following equations can be solved to separate the reflectance from aerosols and water in the Rayleigh-corrected reflectance, ρ rc [10].
The method also accounts for the effect of diffuse atmospheric transmittance, t(λ), by multiplying α by γ, which is defined as [10]: where t v (λ) and t 0 (λ) are the sensor diffuse transmittance and the solar diffuse transmittance, respectively.Following the separation of ρ w and ρ A from ρ rc , ρ A (748) and ρ A (869) are passed to the standard NIR atmospheric correction.With the iterative fit of the standard NIR atmospheric correction [10], some degree of variability in the aerosol type and optical thickness is allowed.This approach is commonly referred to as the Management Unit of the North Sea Mathematical Models (MUMM) atmospheric correction.
Successful results from the MUMM method requires the accurate definition of α(λi, λj) and ε(λi, λj) representing clear waters in the image to apply to areas of turbid waters in the image.However, in the Salish Sea, high surface chla and terrestrial particulates [16,38,39] can contribute to the water-leaving reflectance at the NIR bands, especially during spring and summer conditions.For this reason, the spatially averaged aerosol property, ε(748, 869), was estimated from previous corrected images according to the SWIR method.In the Salish Sea, the mean ε(748, 869) was derived from a region farther from the coastline and away from known areas of high scattering [16] to limit potential land adjacency and turbid water effects, respectively.Specifically, a 5 × 5 pixel-box centred at 49.404 • N/−123.965• W was used to produce an average α and ε for input into the MUMM + SWIR method.For images with cloud contamination at this specific location, the nearest location with no clouds was used.A similar approach using spatial averaging of aerosol properties estimated by the SWIR atmospheric correction has also been applied in the atmospheric correction used by [40].In that study, the Ångström exponent value was spatially averaged over an approximate 4 km × 4 km area in the central portion of Lake Taihu.However, the average Ångström exponent value was used to fix the model type at 2130 nm, and an iterative correction method was applied to the single band.In our study, a slightly different approach is applied, where the aerosol model is captured for the NIR bands using the SWIR correction estimated ε(748, 869) value.
chla Retrievals For this study, chlorophyll concentrations were derived from all atmospherically corrected images using the standard OBPG chlorophyll algorithm, OC3M [41] and a possible switch to the OCI [42].OC3M is an empirically derived algorithm developed as an extension of the OC4 and OC2 SeaWiFS algorithms adapted for the specific spectral bands of the MODIS sensor.It was statistically derived from chlorophyll concentrations ranging from 0.0008 to 90 mg m −3 , with the majority of concentrations ranging between 0.08 to 3.0 mg m −3 using the SeaWiFS Bio-optical Algorithm Mini-Workshop (SeaBAM) dataset [43].The OC3M algorithm is stated as: where The coefficients a 0 , a 1 , a 2 , a 3 , and a 4 are 0.2424, −2.7423, 1.8017, 0.0015, and −1.2280, respectively.Here, the larger observed value of R rs (443) and R rs (488) nm bands is chosen as dividend to the R rs (547) value in calculating reflectance ratio X.
Previous research in the Salish Sea has shown that the simple band ratio OC3M algorithm generally performs better than, for example, the GSM1 algorithm, especially when the heavily turbid river waters are masked from the analysis [39].All OC3M MODIS-derived surface chlorophyll values were extracted using a 3 × 3 km average pixel-box from the three different atmospheric correction products.Subsequently, we investigated the efficacy of the atmospheric correction methods and associated OC3M chla products with decreased temporal match-up windows of ±8, ±6, ±4, ±2, and ±1 h intervals.

Processing Flags
Processing flags were used to exclude data with a high solar zenith angle, land, clouds, and straylight to prevent chla overestimation due to increased path length [44].To account for the size of our study area, we adopted a 3 × 3 pixel straylight flag mask, which captures 99.6% of the intensity of the theoretical PSF [45].The 3 × 3 mask was found to conserve good quality data while eliminating sharply different reflectance values present near cloud and land.A final step in the flagging criteria excludes any remaining pixels with negative reflectance values in the blue wavelength bands (412, 443 nm) due to obvious atmospheric correction failure [46].

In Situ Data for Method Validation
In Situ Radiometric Measurements Two in situ radiometric datasets were used to evaluate satellite atmospheric correction methods: Aerosol Robotic Network (AERONET) Ångstrom exponent and aerosol optical thickness, and in situ above-water reflectance.The first dataset was downloaded, cloud-screened, and quality controlled (level 2.0) from AERONET, located on Saturna Island and maintained by AEROCAN [25].Data used included of the Ångstrom exponent, Å (440 nm/870 nm), and aerosol optical thickness, τ a (440, 675, 870 nm), and were extracted to match up with coincident atmospheric corrected imagery (±15 min) from 2002-2014, for a total of 684 samples corresponding to the number of good quality MODIS images.The Ångstrom exponent describes the wavelength dependence of τ a , thus providing basic information of prevailing aerosol size as spectral shape is directly related to particle size [47].Values of Åaround 1.0 indicate aerosols typically associated with a mixture of sea salts (marine aerosols), and values greater than 2.0 generally indicate finer aerosols associated with urban influence of by-products of organic combustion [8].MODIS-AERONET match-up locations were extracted from a 3 × 3 averaged pixel-box representing waters centred on 48.7327 • N, −123.1181• W, close to the AERONET site, and away from direct influence of the Fraser River plume.
The second set of radiometric data included in situ above-water R rs (λ) collected during cruises using a manufacture calibrated Hyperspectral Surface Acquisition System (HyperSAS).This system features sensors for measuring sea surface radiance (L t (λ)), sky radiance (L s (λ)), and sky irradiance (E s (λ)) at 136 channels in the 350-800 nm range with a 3.0 • half-angle field of view (FOV) for the L t and L s sensors.Data was acquired at optimum acquisition geometry [48] and processed following the protocols by [49,50], converted to above-water remote sensing reflectance [49], and convolved with the MODIS Spectral Response Functions [50].A total of 194 spectra were acquired for this study, including 15 spectra from [39] that were acquired and processed using the same methods (Table 1).For the comparison of the MODIS-derived R rs (λ) and in situ HyperSAS R rs (λ) match-ups corresponding to 3 × 3 pixel-box averaged R rs (λ) were extracted from the images at HyperSAS sampling locations (±3 h).Only match-ups with data for all methods were used for a direct comparison to HyperSAS R rs (λ) since the different approaches to atmospheric correction created disparate valid-pixel distributions for the same daily acquisitions.Out of the 194 HyperSAS spectra acquired, only 16 samples were available for direct comparison to imagery with the given constraints.Relaxing the need for completely coincident R rs (λ) values between all atmospheric methods, the total number of match-ups increased to a maximum of 34 (Table 1).

In Situ Chlorophyll Data
In situ chlorophyll surface concentrations from two separate datasets were used for validation of the imagery-derived chla products for each atmospheric correction method (Table 1).The first dataset corresponds to samples collected aboard the CCGS W.E. Ricker during the summer and fall of 2012.All chlorophyll samples (n = 192) were collected at the surface (≤3.0 m; this represents, on average, the first optical depth in this region [16]) using the continuous shipboard laboratory pump.Triplicate 1L samples were filtered using 0.7 µm Whatman GF/F glass fibre filters following the Ocean Optics protocols [51].Samples were stored at −25 • C during the cruise and then transferred to a −80 • C facility at the University of Victoria until pigment extraction.Chla pigment concentrations were determined using a Dionex high phase liquid chromatography (HPLC) system equipped with a PDA-100 photodiode array detector and DHI pigment standards [52].
The second in situ chlorophyll dataset of surface (≤3.0 m) chlorophyll concentrations (n = 618) was provided by the Institute of Ocean Sciences, Fisheries and Oceans Canada, as part of their Data repository for water property profiles (http://www.pac.dfo-mpo.gc.ca/science/oceans/datadonnees/index-eng.html), and processed according to [38].

Match-Up Statistics
The statistical analysis considered the comparison between the in situ AERONET vs. MODIS products, the in situ HyperSAS vs. MODIS products, and the in situ chla vs. MODIS derived chla.For all AERONET and MODIS product match-ups, the differences between AERONET (SITU) and Satellite (SAT) of variable X (Å and τ a ) are expressed by the median of absolute relative per cent difference (|ψ|), median relative per cent difference (ψ), which represents a measure of bias, and median absolute difference (|δ|), which represents a measure of uncertainty [46]: ψ = 100 × median |ψ| and ψ indicate per cent uncertainty and bias, respectively, while |δ| is the uncertainty in units of the variable X.
Linear regression analysis and associated slopes, and the Pearson correlation coefficient were carried out to evaluate the performance of satellite to in situ HyperSAS and chla values.Median values were used for central tendencies for the relatively large AERONET dataset [46].However, the mean operator is used here to identify uncertainty and bias calculations in the much smaller HyperSAS-MODIS dataset.Evaluative statistics used include the mean absolute percentage difference (MAD), the mean relative percentage difference (MRD), and the root mean squared error (RMSE) between satellite-derived (X SAT ) and in situ (SITU) R rs (λ) and in situ chla [53]. ) Surface chlorophyll concentrations tend to be log-normally distributed [54], therefore, both MODIS derived and in situ chla were log transformed, and the root mean-square log-error RMSE log was calculated as [55]: For this study, statistical significance is defined at a 95% confidence level.

AERONET Match-Ups
On average, 623 images (i.e., the number of Ångstrom and τ a of AERONET-MODIS match-ups) were analyzed for each atmospheric correction method.The numbers of match-ups vary between individual atmospheric correction methods due to differing numbers of valid images for each method and the quality of accompanying AERONET data.The AERONET data for the period suggests that the dominant aerosol size distributions near Saturna Island correspond to coarse modes (radii ≤ 0.5 µm), with a mode centred in 1.5.Approximately 90% of AERONET Ångstrom values acquired in the area were less than 2.0, with the majority between the 1.2 to 1.6 range (Figure 2), thus representing typical coastal aerosols [8].Approximately 10% of all Åretrievals, however, were fine mode (Å > 2.0, radii ≤ 0.5 µm), indicating the presence of strongly absorbing aerosols associated with urban pollution/biomass combustion [30,47].
Similarly, the MODIS-derived Åalso shows that, regardless of the atmospheric correction method, the large majority of the match-up data corresponds to Ålower than 2.0.Specifically, the NIR method produced the most consistent Ångstrom frequencies when compared with the AERONET frequencies, a mode around 1.2, and resolved exponents greater than 2.0 (Figures 2 and 3).This method also exhibits one of the lowest uncertainties (|ψ| = 24.48%)and biases (ψ = −3.27%)(Figure 3).The SWIR method slightly over-represents Åin the 1.6-2.0range by ~20%, while underestimating all other ranges.This method shows similar uncertainty (|ψ| = 24.33%) to NIR, but exhibits high dispersion, as expressed by a larger positive bias (11.65%).The MUMM + SWIR method exhibits similar results to the NIR method at Åvalues less than 0.8, but it over-represents Åbetween 0.8-1.0 by 12%, and under-characterizes Åvalues greater than 1.6.The MUMM + SWIR method showed a larger negative bias (−13.79%), while uncertainties are similar to the other methods.
Overall, the analysis of the τ a data shows that the three methods exhibit over-estimation for all wavelengths (Figure 4), but slightly improved statistical results for the shorter wavelengths (443 and 675 nm) than the longer wavelengths (870 nm).Among the three evaluated methods, the MUMM + SWIR exhibits the lowest uncertainty, about 23% less uncertainty when compared to NIR for the same bands, and therefore the lowest overall biases (Table 2).Associated with the lowest uncertainties and biases, this method also shows the highest number of match-ups, and relatively high correlation coefficient values (r = 0.70 for the 443 nm band) and lower RMSE (7.04%), similar to the NIR method (Figure 4).The SWIR method shows the poorest performance, with the highest |ψ| (150.0-200.0%), the lowest correlation coefficients (0.44), and larger positive biases and RMSE (Figure 4 and Table 2).From the 194 acquired in situ HyperSAS measurements, only 13-34 were match-ups for the MODIS derived   () analysis, within the defined ±3 h temporal window (Table 1).The number of match-ups differs for the individual atmospheric correction methods due to differing valid pixels for the same image.Figure 5 and Table 3 show the results for all the match-ups for each atmospheric method.From the 194 acquired in situ HyperSAS measurements, only 13-34 were match-ups for the MODIS derived R rs (λ) analysis, within the defined ±3 h temporal window (Table 1).The number of match-ups differs for the individual atmospheric correction methods due to differing valid pixels for the same image.Figure 5 and Table 3 show the results for all the match-ups for each atmospheric method.The SWIR method produced the lowest average match-up incidences (n = 13, 47%) across all wavelengths, and the highest percentage of negative values (47%), followed by the MUMM + SWIR (n = 20, 24%), and the NIR (n = 29, 15%).The greatest frequency of negative   () occurred in the 412 and 443 nm bands for all methods.However, nearly half of all SWIR-derived   (488-547 nm) were negative, and therefore invalid.All negative   () values (mostly blue wavelengths) were removed to be consistent with image processing procedures, and all remaining positive values were investigated for further analysis.
The mean absolute percentage difference (MAD) of the dataset ranged 38-404% (NIR), 53-587% (SWIR), and 37-56% (MUMM + SWIR).The poorest observed performance is for the SWIR method, which exhibits the highest uncertainties for all bands, and the most negative bias and high dispersion around a theoretical 1:1 relationship (Figure 5) as a result of the poorest correlation across all bands (r ranging 0.24-0.57),with slopes ranging 0.27-0.51(Table 3).The NIR method performed slightly better than the SWIR method, thus showing a decrease in the absolute and relative percentage differences and improved average mean r (0.71 ± 0.33).Superior results are observed for the MUMM + SWIR method, which shows the lowest overall uncertainties (45%) and biases (−17%), the highest r  The SWIR method produced the lowest average match-up incidences (n = 13, 47%) across all wavelengths, and the highest percentage of negative values (47%), followed by the MUMM + SWIR (n = 20, 24%), and the NIR (n = 29, 15%).The greatest frequency of negative R rs (λ) occurred in the 412 and 443 nm bands for all methods.However, nearly half of all SWIR-derived R rs (488-547 nm) were negative, and therefore invalid.All negative R rs (λ) values (mostly blue wavelengths) were removed to be consistent with image processing procedures, and all remaining positive values were investigated for further analysis.
The mean absolute percentage difference (MAD) of the dataset ranged 38-404% (NIR), 53-587% (SWIR), and 37-56% (MUMM + SWIR).The poorest observed performance is for the SWIR method, which exhibits the highest uncertainties for all bands, and the most negative bias and high dispersion around a theoretical 1:1 relationship (Figure 5) as a result of the poorest correlation across all bands (r ranging 0.24-0.57),with slopes ranging 0.27-0.51(Table 3).The NIR method performed slightly better than the SWIR method, thus showing a decrease in the absolute and relative percentage differences and improved average mean r (0.71 ± 0.33).Superior results are observed for the MUMM + SWIR method, which shows the lowest overall uncertainties (45%) and biases (−17%), the highest r (0.73 ± 0.14) values, close distribution to the 1:1 line (Figure 5), and also very important, slope values are similar for most of the bands (0.66 ± 0.04), especially the OC3M bands (Table 3).This indicates that the shape of the spectral curve is better preserved, which contributes to improved chla retrievals when using the OC3M.
Figure 6 illustrates the comparison between MODIS R rs (λ) and in situ R rs (λ) match-ups concurrent for the three evaluated methods.This qualitative analysis shows that the shape of the R rs (λ) spectra produced with the SWIR and NIR corrections diverge the most from the HyperSAS in situ spectra when compared to R rs (λ) spectra derived from the MUMM + SWIR method.Specifically, a greater proportion of results for the NIR and SWIR methods shows negative reflectance in the 443 nm and 488 nm bands, thus resulting in negative biases higher than 40% (Figure 7), while the MUMM + SWIR produces negative biases generally lower than 20% at the blue and green wavelengths.The 412 nm band exhibited the highest bias, where MUMM + SWIR under-estimates R rs by 38%, and both NIR and SWIR methods were upwards of 70%.Given that the OC3M model relies on the ratio of 443 nm or 488 nm to the 547 nm band to derive an accurate chlorophyll concentration [43], the MUMM + SWIR method resulted in an overall more accurate R rs (λ) retrieval in these bands, thus the resulting ratio between the blue green bands is more accurate.The highest average absolute difference between the MODIS derived blue-green ratio and the HyperSAS blue-green ratio is 10%, 23% and 24% for the MUMM + SWIR, NIR, and SWIR methods, respectively.
Remote Sens. 2017, 9, x FOR PEER REVIEW 12 of 24 that the shape of the spectral curve is better preserved, which contributes to improved chla retrievals when using the OC3M. Figure 6 illustrates the comparison between MODIS   () and in situ   () match-ups concurrent for the three evaluated methods.This qualitative analysis shows that the shape of the   () spectra produced with the SWIR and NIR corrections diverge the most from the HyperSAS in situ spectra when compared to   () spectra derived from the MUMM + SWIR method.Specifically, a greater proportion of results for the NIR and SWIR methods shows negative reflectance in the 443 nm and 488 nm bands, thus resulting in negative biases higher than 40% (Figure 7), while the MUMM + SWIR produces negative biases generally lower than 20% at the blue and green wavelengths.The 412 nm band exhibited the highest bias, where MUMM + SWIR under-estimates   by 38%, and both NIR and SWIR methods were upwards of 70%.Given that the OC3M model relies on the ratio of 443 nm or 488 nm to the 547 nm band to derive an accurate chlorophyll concentration [43], the MUMM + SWIR method resulted in an overall more accurate   () retrieval in these bands, thus the resulting ratio between the blue green bands is more accurate.The highest average absolute difference between the MODIS derived blue-green ratio and the HyperSAS blue-green ratio is 10%, 23% and 24% for the MUMM + SWIR, NIR, and SWIR methods, respectively.

OC3M Chlorophyll Retrievals
MODIS-OC3M chla retrievals from the three atmospheric correction methods versus in situ chlorophyll-a concentrations are compared for different match-up temporal windows (±1 h to ±8 h).Match-up time differences correspond to the time difference in situ data acquisition and satellite overpass.The in situ chla for the match-ups ranged from 0.4 to 30.0 mg m −3 , representing typical yearly conditions in the Salish Sea [38].
Chla estimates based on the images corrected with either the NIR or the SWIR methods (NIR-OC3M and SWIR-OC3M, respectively) exhibit consistently poor chla retrievals with large overestimation showing values higher than 40 mg m −3 (Figure 8), which is not typical of these waters, even during bloom conditions [38].The retrievals from the NIR-OC3M method shows the highest overestimation of chla, with both uncertainty and bias greater than 400%, and correlation values not significant at the 95% confidence level, regardless of the match-up temporal window.The SWIR-OC3M retrievals show similarly poor results, high bias and uncertainty, not significant correlation coefficients, and a general overestimation of chla concentrations (Table 3).
In contrast, the MUMM + SWIR-OC3M method, regardless of the match-up temporal window, exhibits the lowest uncertainty, bias, and RMSE, as well as the slope closest to 1.0, and the highest correlation coefficients.Further, this method allowed for the largest number of chla retrievals, with a range of 82 samples for ±8 h to 16 samples for the ±1 h temporal window.Perhaps most importantly,

OC3M Chlorophyll Retrievals
MODIS-OC3M chla retrievals from the three atmospheric correction methods versus in situ chlorophyll-a concentrations are compared for different match-up temporal windows (±1 h to ±8 h).Match-up time differences correspond to the time difference between in situ data acquisition and satellite overpass.The in situ chla for the match-ups ranged from 0.4 to 30.0 mg m −3 , representing typical yearly conditions in the Salish Sea [38].
Chla estimates based on the images corrected with either the NIR or the SWIR methods (NIR-OC3M and SWIR-OC3M, respectively) exhibit consistently poor chla retrievals with large overestimation showing values higher than 40 mg m −3 (Figure 8), which is not typical of these waters, even during bloom conditions [38].The retrievals from the NIR-OC3M method shows the highest overestimation of chla, with both uncertainty and bias greater than 400%, and correlation values not significant at the 95% confidence level, regardless of the match-up temporal window.The SWIR-OC3M retrievals show similarly poor results, high bias and uncertainty, not significant correlation coefficients, and a general overestimation of chla concentrations (Table 3).
In contrast, the MUMM + SWIR-OC3M method, regardless of the match-up temporal window, exhibits the lowest uncertainty, bias, and RMSE, as well as the slope closest to 1.0, and the highest correlation coefficients.Further, this method allowed for the largest number of chla retrievals, with a range of 82 samples for ±8 h to 16 samples for the ±1 h temporal window.Perhaps most importantly,

OC3M Chlorophyll Retrievals
MODIS-OC3M chla retrievals from the three atmospheric correction methods versus in situ chlorophyll-a concentrations are compared for different match-up temporal windows (±1 h to ±8 h).Match-up time differences correspond to the time difference between in situ data acquisition and satellite overpass.The in situ chla for the match-ups ranged from 0.4 to 30.0 mg m −3 , representing typical yearly conditions in the Salish Sea [38].
Chla estimates based on the images corrected with either the NIR or the SWIR methods (NIR-OC3M and SWIR-OC3M, respectively) exhibit consistently poor chla retrievals with large over-estimation showing values higher than 40 mg m −3 (Figure 8), which is not typical of these waters, even during bloom conditions [38].The retrievals from the NIR-OC3M method shows the highest overestimation of chla, with both uncertainty and bias greater than 400%, and correlation values not significant at the 95% confidence level, regardless of the match-up temporal window.The SWIR-OC3M retrievals show similarly poor results, high bias and uncertainty, not significant correlation coefficients, and a general overestimation of chla concentrations (Table 3).
In contrast, the MUMM + SWIR-OC3M method, regardless of the match-up temporal window, exhibits the lowest uncertainty, bias, and RMSE, as well as the slope closest to 1.0, and the highest correlation coefficients.Further, this method allowed for the largest number of chla retrievals, with a range of 82 samples for ±8 h to 16 samples for the ±1 h temporal window.Perhaps most importantly, it is the only method with improved statistical results with finer temporal differences (Figure 8, Table 3).For instance, within a ±1 h match-up time, chla retrievals exhibit the best results defined by the linear regression line with a slope of 0.89, the lowest offset (0.65), and r linear of 0.83, thus defining a low bias (21%) and RMSE ( log RMSE = 0.33 mg m -3 ) (Table 4).
Remote Sens. 2017, 9, x FOR PEER REVIEW 14 of 24 it is the only method with improved statistical results with finer temporal differences (Figure 8, Table 3).For instance, within a ±1 h match-up time, chla retrievals exhibit best results defined by the linear regression line with a slope of 0.89, the lowest offset (0.65), and rlinear of 0.83, thus defining a low bias (21%) and RMSE (logRMSE = 0.33 mg m 3 ) (Table 4).4. Lines represent the 1:1 relationship.Examples of final chla maps produced from the respective atmospheric correction techniques are shown in Figure 9. Three typical dates are shown and correspond to winter, spring, and summer conditions.The February image generally shows lower chla (<1.0 mg m −3 ), and large areas of null data for all three methods.This typical low chlorophyll concentrations [38] and lack of data is consistent with the whole dataset for this time of year, where growth is often light limited, in part due to the presence of clouds [21].An example of spring bloom conditions is shown in the MUMM + SWIR-OC3M product for 2 April 2008, where chla concentrations are around 4.0-8.0mg m −3 in the northern region, 3.0 mg m −3 in the Strait of Juan de Fuca, and higher than 10.0 mg m −3 and lower than 40 mg m −3 through a majority of the central Salish Sea.This region is also often characterized by a second bloom in the fall [38].The SWIR-OC3M product shows a noisy distribution of chla, with the majority of values within the 5.0 ± 3.0 mg m −3 range.The NIR-OC3M product shows the most extreme values of chla within the Salish Sea, with values higher than 40.0 mg m −3 , not commonly observed in this region [20,38].Examples of final chla maps produced from the respective atmospheric correction techniques are shown in Figure 9. Three typical dates are shown and correspond to winter, spring, and summer conditions.The February image generally shows lower chla (<1.0 mg m −3 ), and large areas of null data for all three methods.This typical low chlorophyll concentrations [38] and lack of data is consistent with the whole dataset for this time of year, where growth is often light limited, in part due to the presence of clouds [21].An example of spring bloom conditions is shown in the MUMM + SWIR-OC3M product for 2 April 2008, where chla concentrations are around 4.0-8.0mg m −3 in the northern region, 3.0 mg m −3 in the Strait of Juan de Fuca, and higher than 10.0 mg m −3 and lower than 40 mg m −3 through a majority of the central Salish Sea.This region is also often characterized by a second bloom in the fall [38].The SWIR-OC3M product shows a noisy distribution of chla, with the majority of values within the 5.0 ± 3.0 mg m −3 range.The NIR-OC3M product shows the most extreme values of chla within the Salish Sea, with values higher than 40.0 mg m −3 , not commonly observed in this region [20,38].

Discussion
This research comprised the most comprehensive analysis of MODIS-Aqua imagery subject to evaluation of atmospheric correction methods and derived chla products using in situ data from the Salish Sea.We provided an evaluation of three atmospheric correction methods (NIR, SWIR, and MUMM + SWIR) in relation to in situ AERONET and above-water reflectance data, followed by an assessment of MODIS chla retrievals in relation to in situ chla measurements.Our findings show that, for the region of study, the combined statistical results of the tested atmospheric correction methods and chla retrievals support MUMM + SWIR as the most appropriate method to determine accurate MODIS   () for retrieval of chla using the OC3M algorithm.The following sections provide a discussion of our main results.

Atmospheric Correction
To understand the ability of the three atmospheric correction methods to accurately retrieve   (λ) required for chla determination, products generated from each method were compared to a stationary AERONET site and in situ above-water   samples collected throughout the Salish Sea.Overall, the comparison with the AERONET in situ data showed that derived aerosol products were the least accurate for the SWIR method, and relatively similar for NIR and MUMM + SWIR methods.Å (440, 870 nm) biases varied from −13.8% to +11.6%, and uncertainty was approximately 25%, with slightly lower performance for the MUMM + SWIR method; τa (443), however, showed higher ranges of both uncertainty and biases, 71.3-93.4% and 70.4-168.2%,respectively, with higher values for 675 and 875 nm, but still overall better performance was obtained for the SWIR + MUMM method, with correlation coefficient value of 0.7, an associated slope close to 1 and RMSE equal to about 7% for the blue band (Figure 4 and Table 2).
Specifically, the SWIR method produced the largest uncertainties in retrievals for both Å and τa.A few factors contribute to the observed uncertainties.High turbidity levels in the Salish Sea may

Discussion
This research comprised the most comprehensive analysis of MODIS-Aqua imagery subject to evaluation of atmospheric correction methods and derived chla products using in situ data from the Salish Sea.We provided an evaluation of three atmospheric correction methods (NIR, SWIR, and MUMM + SWIR) in relation to in situ AERONET and above-water reflectance data, followed by an assessment of MODIS chla retrievals in relation to in situ chla measurements.Our findings show that, for the region of study, the combined statistical results of the tested atmospheric correction methods and chla retrievals support MUMM + SWIR as the most appropriate method to determine accurate MODIS R rs (λ) for retrieval of chla using the OC3M algorithm.The following sections provide a discussion of our main results.

Atmospheric Correction
To understand the ability of the three atmospheric correction methods to accurately retrieve R rs (λ) required for chla determination, products generated from each method were compared to a stationary AERONET site and in situ above-water R rs samples collected throughout the Salish Sea.Overall, the comparison with the AERONET in situ data showed that derived aerosol products were the least accurate for the SWIR method, and relatively similar for NIR and MUMM + SWIR methods.Å(440, 870 nm) biases varied from −13.8% to +11.6%, and uncertainty was approximately 25%, with slightly lower performance for the MUMM + SWIR method; τ a (443), however, showed higher ranges of both uncertainty and biases, 71.3-93.4% and 70.4-168.2%,respectively, with higher values for 675 nm and 875 nm, but still overall better performance was obtained for the SWIR + MUMM method, with correlation coefficient value of 0.7, an associated slope close to 1 and RMSE equal to about 7% for the blue band (Figure 4 and Table 2).
Specifically, the SWIR method produced the largest uncertainties in retrievals for both Åand τ a .A few factors contribute to the observed uncertainties.High turbidity levels in the Salish Sea may cause non-negligible R rs (1240 nm), thus resulting in poor aerosol product retrievals, as it has been reported for other turbid waters [56][57][58].Turbidity in the Salish Sea reaches values around 30.0 mg L −1 in waters under high influence of the Fraser River plume [16,39].Uncertainty in retrievals with the SWIR method is also associated with the signal-to-noise ratio (SNR) of the employed band that produces broad frequency distributions relative to NIR bands [59].The current MODIS SWIR band (1240 nm) was originally designed for land and atmosphere applications, which generally represents bright signals [60].While there exists a MODIS 1640 nm band with improved SNR compared to the 1240 nm band (495 vs. 157 on-orbit measured, respectively [61]), it is currently non-functional due to a number of broken detectors [62].Although the current 1240 and 2130 nm SWIR bands have produced effective results in some regions with turbid waters [57,63], other studies, similar to our results, show that the lower SNR often produces poor outcomes [62, 64,65].Specific to our results, this lower SNR resulted in the largest uncertainties and overall variability of MODIS τ a when compared with AERONET, and ultimately inaccurate chlorophyll products (Table 4).These findings corroborate observations by [65,66] where products derived current SWIR bands are less accurate than standard NIR bands.One way to minimize inaccuracies of retrievals is to increase the SNR of the SWIR bands to 180 for 1640 nm, and 100 for 2135 nm [62, 65,67].Improved SWIR band SNR is defined for the Ocean Land Color Instrument (OLCI) sensor onboard the Sentinel-3 platform [62], although it remains to be seen whether the OLCI characteristics will be sufficient for effective SWIR correction of these waters.
The NIR and MUMM + SWIR were derived from two very different approaches (Section 2.2.1), but generally produced close results with regard to the uncertainties and biases when compared with AERONET τ a (443 nm, 865 nm), and slightly lower performance for the MUMM + SWIR for estimates of Ångstrom.The observed small differences in uncertainties and biases for the two methods are likely a function of the constraints of the MUMM + SWIR method on how to define the aerosol model for the atmospheric correction.Instead of a pixel-based determination as used in the NIR method, the MUMM + SWIR method defines the aerosol model based on a sample (ε(λi,λj)) from a northern region of the Salish Sea, where waters are less turbid, and assumes spatial homogeneity for the entire region.A similar approach using spatial averaging of aerosol properties has also been successfully applied in the atmospheric correction used, for example, by [40].However, aerosol loads present in the more populated central/southern regions of the Salish Sea may not be present in the relatively more isolated northern region [26,27] used to define ε(λi,λj).As such, the assumption of aerosol homogeneity may not always be valid for areas under the influence of urban environments, which is expressed in our results by the reduced ability of the MUMM + SWIR method to resolve larger Ångstrom values (Figure 2), corresponding to combustible particles and sulphates [47].These are absorbing aerosols that can lead to increased negative bias in the derived short wavelength R rs (λ) [68].Fine-mode carbonaceous aerosols alter atmospheric correction results through either incorrect selection of atmospheric models or, in the case of strongly absorbing aerosols, significant over-estimation of aerosol radiance contributions, over-correction, and negative reflectance in the blue bands (412 and 443 nm) [34].The negative bias (ψ = −13.79%)observed for Ångstrom estimates using MUMM + SWIR is likely a result of this.Specifically, this bias is influenced by the fact that τ a (440 nm) is proportionally larger than τ a (870 nm).High τ a (440 nm) can occur when aerosol single scattering albedo decreases due to extinction from combustion by-products [69].
The range of the spatial aerosol optical properties in the Salish Sea [26] leads to some of the differences between our results and improved results, for example, in [46] for the European marginal seas (Adriatic, Baltic, North, and Black seas) and in [35] for coastal waters of the northern Baltic Sea, using different atmospheric correction methods.We speculate that the lower uncertainties on the retrieved aerosol products reported by these authors in comparison with uncertainties in this study is likely attributed to the Salish Sea being a water body/air mass surrounded by anthropogenic aerosol inputs [26] in contrast to the areas of study of [46], which are characterized by more open and exposed coastal sites.Interestingly, Mélin and coworkers [46] results exclusively for the Adriatic Sea, which similar to the Salish Sea, is an estuarine site bordered by anthropogenic aerosol sources, shows similar outcomes to our findings for the Ångstrom MUMM + SWIR: |ψ| = 27% vs. 17% and |δ|= 0.34 vs. 0.27, respectively, for the Salish Sea and the Adriatic Sea, and both were negatively biased at 13%.
A further important point on the evaluation of the aerosol products is that, ultimately, regardless of the atmospheric correction method, match-ups may be affected by the spatial mismatch in the sampling location of both τ a and Ångstrom used for the MODIS imagery and the land-based location of the AERONET station, although the sampling locations are in close proximity.Previous analysis, however, indicates that slight variations of location and size of the nearby satellite pixels sampling location may not significantly alter final comparisons [70].Therefore, it is unlikely that the location of the 5 × 5 pixel-box used in the study significantly affected overall uncertainty and bias results.
The secondary level of analysis corresponding to the comparison of R rs (λ) from the different atmospheric methods and the in situ HyperSAS R rs (λ) indicates that, overall, the MUMM + SWIR method retrieves more accurate reflectance values for all wavelengths.Note that, although the number of in situ samples is not large, they do represent the distribution of reflectance in the studied area, from the more turbid waters closer to the Fraser plume to less turbid waters in the northern region [39].For the true match-up data including the three methods, the MUMM + SWIR achieved negative biases lower than 20% for the OC3M bands (443, 488 and 547 nm) (Figure 7).This level of uncertainty is generally achieved for the best models (e.g., [5,35]), with poorer performance, but still acceptable, reported by other authors (e.g., [55,71]).The largest uncertainties were observed for the non-OC3M bands, 678 nm band (~50%) and 412 nm (~38%), similar to [35].Given the similar and lower uncertainties of the OC3M bands, the MUMM + SWIR method also produced the lowest absolute difference (10%) when comparing the MODIS blue green reflectance ratios to HyperSAS; that is, the spectral shape of R rs for 443, 488 and 547 nm was accurately preserved.This is an important consideration, as an error in the blue green ratio is propagated to the OC3M chla retrieval.Besides the achieved closure to the OC3M bands and band ratios, the robustness of MUMM + SWIR results is also based on the spatial representativeness of the in situ HyperSAS collected samples-samples are from various water conditions throughout the Salish Sea (Figure 1).
Contrary to the MUMM + SWIR results, the NIR method shows a larger negative and variable bias on R rs (λ) across all bands when compared to HyperSAS (Figure 7), thus suggesting atmospheric over-correction (Table 2).The range in MAD and MRD indicates that the atmospheric model selection and correction is not applied proportionally, likely a result of turbidity causing invalid model selection, and further CDOM absorption in the blue wavelengths affecting individual pixel reflectance.Underestimation of the blue bands (443 nm, 488 nm) reflectance relative to green (547 nm) reflectance likely (Figure 7) resulted in the overestimation of chla over a broad range of water conditions (Figure 8, Table 3), as reported in other studies [35,72].The NIR approach is an established and reliable method for optically simple, more open waters, where variability of marine optical properties such as absorption and scattering is solely a function of phytoplankton composition [31,73].Optics in coastal waters are more complex and, the Black Pixel assumption is not consistently effective.In these waters, total organic and inorganic suspended material often contributes to backscattering in the NIR [10,31,58,73] causing overestimation of NIR aerosol reflectance.Band mischaracterization confounds the selection of an appropriate atmospheric model, causing error in path radiance back calculations and poor estimates of all spectral bands.The result is low or negative water leaving reflectance in the blue wavelengths leading to the failure of chla retrievals.
Although, to our knowledge, we have the largest number of match-ups for ocean colour satellite evaluation for this region, this study is still based on a less than ideal number of match-ups for in situ HyperSAS data.Further, the AERONET site is land-based.Improvements on the validation of atmospheric correction methods would be facilitated by an increased number of in situ reflectance data, such as other coastal sites that are equipped with the AERONET-OC (Ocean Colour) [5,72].Although, as pointed out by [5] and similar to the in situ HyperSAS data used in this research, AERONET-OC stations measure in a specific location (point data), and caution should be taken when interpreting the results of comparisons with reflectance from a region, generally defined as an average of a 3 × 3 or 5 × 5 pixel-box in the satellite image.Other sampling methods, such as moving ship-based initiatives, are also an alternative [74], however, the variability of the reflectance measurements within the pixel-box has to be defined.This will be the case for the recently installed autonomous SAS Solar Tracker above-water radiometers aboard the BC Ferries in the Salish Sea [75].

Chlorophyll Retrievals
The results of coupling the OC3M model with the different atmospheric correction methods clearly shows that the MUMM + SWIR-OC3M resulted in the best statistical performance for retrievals of chla, regardless of the match-up temporal window between in situ and MODIS-derived chla.Statistically significant correlations were consistently high with r linear = 0.81 for 8 h time difference (n = 82 match-ups), increasing to r linear = 0.83 for 1 h match-ups (n = 16), and accompanied slopes ranged from 0.77 and 0.86, respectively.The bias and uncertainty are 21% and 62%, respectively, for 1 h temporal window.These results are in the same range of acceptance as results published for other coastal regions of the world (e.g., [4,8,55,65,71,76,77]).The NIR-OC3M and SWIR-OC3M yielded higher uncertainties and biases when compared to the in situ chla.Both the NIR and SWIR corrected reflectances tend to highly over-estimate in situ chla observations, and both returned the lowest number of valid retrievals, regardless of the match-up temporal window.Chla retrievals from the standard NIR method exhibiting extreme values are associated with negligible or negative R rs (443 nm), a result of over-estimation of the aerosol optical transmittance.
Empirical blue-green (440-550 nm) band-ratio algorithms, such as the OC3M, are the most commonly implemented due to their relative simplicity, and phytoplankton absorption is most influential in this spectral region [4,78].This simplicity is useful for the diverse waters in the Salish Sea and, as originally tested for these waters by [39], provides chla estimates under the greatest range of surface-water optical properties.However, uncertainties of chla retrievals for concentrations below 0.2 mg m −3 are known to be high [39,79].While these low magnitude values were close to the observed for our samples (ranged from 0.4 to 30.0 mg m −3 ), the large majority of chla samples (~82%) were above 1.0 mg m −3 .We also must consider that the increased concentrations of CDOM and total suspended matter in the studied waters impact the accuracy of the current blue-green ratio algorithm to retrieve chla, as demonstrated for other optically complex waters [80].The variable concentrations of CDOM and total suspended matter play a role in the magnitude and shape of the reflectance signal and therefore the efficacy of the OC3M algorithm.For instance, we have measured aCDOM(443) ranging 0.007-3.0m −1 and total suspended matter ranging 1.0-27.0mg L −1 in these waters [16,39,81].These conditions are typical of coastal waters around the world, such as the Beaufort Sea, Chesapeake Bay, and the Baltic Sea, where standard OBPG chla algorithms fail [71,82,83].To address some of the issues related to CDOM adsorption, methods using the chlorophyll fluorescence emission centred at 685nm for non-atmospheric corrected MERIS [84] and MODIS [22,85] imagery have been used in British Columbia coastal waters.Further analysis is required with a larger in situ dataset to explore this method, given that the fluorescence signal can vary not only with chla concentration, but also due to photo-inhibition, phytoplankton species, and cell physiology [85].In addition, effective fluorescence light height retrieval can be difficult in turbid coastal waters with TSM exceeding 5 mg m −3 [86,87] where NIR elastic scattering confounds the signal [88].These are typical concentrations in the Salish Sea during spring/summer [16,39].Ideally, regional algorithms (e.g., [4,71,89], band-difference algorithms (e.g., [76]), and semi-analytical models such as the GSM1 [4,39,77,90] should be explored to test their retrieval accuracies in these waters.
Regardless of the adopted model approach, either based on global or regional empirical or semi-analytical models, the in situ chla satellite match-up samples are generally low, as is also the case for this research.Exceptions of model comparison with large data sets are generally based on the global databases, such as NOMAD [12]; however, caution must be taken on the evaluation of models when the same dataset is used to parameterize the actual models [4].In addition, as a commonality for most of the satellite-in situ chla match-up analyses, in situ chla data are point-wise data and satellite retrievals are based on pixel-box data, which generally represent a few hundred metres.Given that the at-sensor radiance is a function of all the water optically active constituents within the pixel, as well as adjacency effects from surrounding pixel-atmosphere interactions, it is reasonable to assume that a portion of the error in match-ups are due to coarse spatial resolution.This may have large implications for validation of satellite retrievals, especially in dynamic coastal waters [91] such as the case of the Salish Sea [16,81].Further, satellite-derived chla is estimated based on reflectance from the first optical depth, [92], which in the Salish Sea, ranges from one to three metres [16].This is generally shallower than the depth of chlorophyll maximum, for both stratified and non-stratified oceanographic conditions in this region [20,38].This is a limitation of current remote sensing methods [93]; however, in the specific case of the Salish Sea, it does mean that the OC3M retrievals represent waters that are always above the depth of chlorophyll maximum.

Conclusions
The NIR, SWIR, and modified MUMM + SWIR products coupled with OC3M-derived chla concentrations were evaluated using in situ AERONET aerosol optical measurements (N ~650), above-water R rs (λ) (N ~34), and in situ chla (N ~83) data.Our results support the MUMM + SWIR as the most appropriate method to determine accurate MODIS R rs (λ) for retrieval of chla using the OC3M algorithm for deriving time series products for the Salish Sea.
The MUMM + SWIR method works nearly as well as the NIR black pixel assumption to define τ a (λ) and Åwhen compared to the in situ AERONET data.This partly supports the assumption of aerosol spatial homogeneity for the MUMM method.However, this method does not properly resolve fine mode fraction Å, which means that further work is needed to address the influence of strongly absorbing urban aerosols that may occur in the central region due to large urban settlements.Nonetheless, when compared to in situ R rs (λ) collected from broader water and atmospheric conditions throughout the Salish Sea, the MUMM + SWIR performed the best, especially for the OC3M bands.In addition, the MUMM + SWIR method resulted in more accurate chla retrievals with the lowest uncertainty and bias, and best linear regression statistics (r linear = 0.83, slope = 0.89 for ±1 h temporal window).The OC3M coupled with the NIR and SWIR methods did not perform well for the studied waters, as chla values were largely overestimated beyond values typical of the Salish Sea.While these findings clearly show that the MUMM + SWIR performs better in waters where the black pixel assumption fails, we nevertheless recognize that further efforts are needed to address the limited number of viable R rs (λ) match-ups to compare atmospheric correction methods over a broad range of waters with varying optical constituents.Since July 2016, our research group has been addressing this need with the recent installation of two autonomous above-water Hyperspectral sensors (SAS Solar tracker) that are providing continues R rs (λ) along two different ferry routes covering different optical waters of the Salish Sea [75].These new systems will allow for further evaluation of the MUMM + SWIR and other atmospheric models so improved satellite products can be generated.

Figure 1 .
Figure 1.Salish Sea study area (Central and North Strait of Georgia (SoG), Puget Sound, and Strait of Juan de Fuca).Chlorophyll-a in situ sampling stations for 2012 and 2013 are represented by grey points; HyperSAS acquisition sites are represented by triangles; AERONET site is represented by a large yellow triangle.

Figure 1 .
Figure 1.Salish Sea study area (Central and North Strait of Georgia (SoG), Puget Sound, and Strait of Juan de Fuca).Chlorophyll-a in situ sampling stations for 2012 and 2013 are represented by grey points; HyperSAS acquisition sites are represented by triangles; AERONET site is represented by a large yellow triangle.

Figure 2 .
Figure 2. Histogram match-up of Å ngstrom exponent derived from three MODIS atmospheric corrections (N = 581, 620, and 618, respectively) and Saturna AERONET measurements (July 2002 to July 2014).MODIS and AERONET data are plotted as shaded and black outlined bars, respectively.

Figure 2 .
Figure 2. Histogram match-up of Ångstrom exponent derived from three MODIS atmospheric corrections (N = 581, 620, and 618, respectively) and Saturna AERONET measurements (July 2002 to July 2014).MODIS and AERONET data are plotted as shaded and black outlined bars, respectively.

Figure 2 .
Figure 2. Histogram match-up of Å ngstrom exponent derived from three MODIS atmospheric corrections (N = 581, 620, and 618, respectively) and Saturna AERONET measurements (July 2002 to July 2014).MODIS and AERONET data are plotted as shaded and black outlined bars, respectively.

Figure 4 .
Figure 4. Match-up of MODIS to AERONET aerosol optical thickness at three wavelengths for three atmospheric correction methods.The red line is the regression line of the equation, and the dashed line indicates the 1-to-1 relationship.The number of match-ups (±15 min) are 569, 617 and 684 for the NIR, SWIR, and MUMM + SWIR methods, respectively.

Figure 4 .
Figure 4. Match-up of MODIS to AERONET aerosol optical thickness at three wavelengths for three atmospheric correction methods.The red line is the regression line of the equation, and the dashed line indicates the 1-to-1 relationship.The number of match-ups (±15 min) are 569, 617 and 684 for the NIR, SWIR, and MUMM + SWIR methods, respectively.

24 Figure 5 .
Figure 5.Comparison of match-ups of MODIS remote sensing reflectance to convolved in situ HyperSAS data.Respective bands are denoted by symbols.Solid lines represent the 1:1 relationship.

Figure 5 .
Figure 5.Comparison of match-ups of MODIS remote sensing reflectance to convolved in situ HyperSAS data.Respective bands are denoted by symbols.Solid lines represent the 1:1 relationship.

Figure 7 .
Figure 7.Comparison of matching MODIS-derived to in situ (HyperSAS) remote sensing reflectance for three atmospheric correction methods.The y-axis indicates per cent difference to respective MODIS bands along the x-axis (N = 16).

Figure 7 .
Figure 7.Comparison of matching MODIS-derived to in situ (HyperSAS) remote sensing reflectance for three atmospheric correction methods.The y-axis indicates per cent difference to respective MODIS bands along the x-axis (N = 16).

Figure 7 .
Figure 7.Comparison of matching MODIS-derived to in situ (HyperSAS) remote sensing reflectance for three atmospheric correction methods.The y-axis indicates per cent difference to respective MODIS bands along the x-axis (N = 16).

Figure 8 .
Figure 8. Relationship between MODIS derived chlorophyll concentration and in situ samples for ±8 h (grey and grey with black dot) and ±1 h (black dot) temporal windows.Statistics are provided in Table4.Lines represent the 1:1 relationship.

Figure 8 .
Figure 8. Relationship between MODIS derived chlorophyll concentration and in situ samples for ±8 h (grey and grey with black dot) and ±1 h (black dot) temporal windows.Statistics are provided in Table4.Lines represent the 1:1 relationship.

24 Figure 9 .
Figure 9. Example of MODIS-derived chlorophyll distribution for February, April and September.

Figure 9 .
Figure 9. Example of MODIS-derived chlorophyll distribution for February, April and September.

Table 1 .
In situ data used in this study and associated sources.N is the number of samples.

Table 2 .
Validation statistics for the three atmospheric correction methods for MODIS-derived τ a expressed as |ψ| and ψ (in %) and |δ| and δ (in τ a units).

Table 3 .
Results of MODIS and HyperSAS () comparison.Percentage of negative retrievals are given.Statistics are based on all remaining positive values.Method (λ) % Negative Count (N) MAD% MRD% RMSE rlinear Slope

Table 3 .
Results of MODIS and HyperSAS R rs (λ) comparison.Percentage of negative retrievals are given.Statistics are based on all remaining positive values.

Table 4 .
Summary statistics for MODIS vs. in situ remote chlorophyll concentrations.Match-up samples were derived from approximately 40 images.MethodParameters Count (N)

Table 4 .
Summary statistics for MODIS vs. in situ remote chlorophyll concentrations.Match-up samples were derived from approximately 40 images.