Analyzing Performances of Different Atmospheric Correction Techniques for Landsat 8 : Application for Coastal Remote Sensing

Ocean colour (OC) remote sensing is important for monitoring marine ecosystems. However, inverting the OC signal from the top-of-atmosphere (TOA) radiance measured by satellite sensors remains a challenge as the retrieval accuracy is highly dependent on the performance of the atmospheric correction as well as sensor calibration. In this study, the performances of four atmospheric correction (AC) algorithms, the Atmospheric and Radiometric Correction of Satellite Imagery (ARCSI), Atmospheric Correction for OLI ‘lite’ (ACOLITE), Landsat 8 Surface Reflectance (LSR) Climate Data Record (Landsat CDR), herein referred to as LaSRC (Landsat 8 Surface Reflectance Code), and the Sea-Viewing Wide Field-of-View Sensor (SeaWiFS) Data Analysis System (SeaDAS), implemented for Landsat 8 Operational Land Imager (OLI) data, were evaluated. The OLI-derived remote sensing reflectance (Rrs) products (also known as Level-2 products) were tested against near-simultaneous in-situ data acquired from the OC component of the Aerosol Robotic Network (AERONET-OC). Analyses of the match-ups revealed that generic atmospheric correction methods (i.e., ARCSI and LaSRC), which perform reasonably well over land, provide inaccurate Level-2 products over coastal waters, in particular, in the blue bands. Between water-specific AC methods (i.e., SeaDAS and ACOLITE), SeaDAS was found to perform better over complex waters with root-mean-square error (RMSE) varying from 0.0013 to 0.0005 sr−1 for the 443 and 655 nm channels, respectively. An assessment of the effects of dominant environmental variables revealed AC retrieval errors were influenced by the solar zenith angle and wind speed for ACOLITE and SeaDAS in the 443 and 482 nm channels. Recognizing that the AERONET-OC sites are not representative of inland waters, extensive research and analyses are required to further evaluate the performance of various AC methods for high-resolution imagers like Landsat 8 and Sentinel-2 under a broad range of aquatic/atmospheric conditions.


Introduction
Ocean colour (OC) remote sensing provides information on in-water optical properties indicating the concentrations of water constituents such as chlorophyll-a.In optically shallow waters, depth and seafloor spectral reflectance may also be estimated using remotely sensed images.Information about near-surface, in-water optical properties, in the form of water quality maps, can provide advance warning of algal bloom development [1] and potentially lead to early mitigation efforts to reduce health risks and financial losses.Bathymetric maps, derived from water depth estimates, can be used to produce or update navigational charts [2], reducing the risk of ship groundings.Benthic habitat maps, inferred from seafloor spectral reflectance, can be used to track changes in the distribution of seafloor habitats [3,4].However, extracting ocean colour products such as chlorophyll-a, water depth, and bottom types from remotely sensed images is difficult because, over blue ocean waters, only ~10% of the total signal that reaches the TOA (Top of Atmosphere) typically comes from within the water column [5].In addition to the radiation leaving the water column (L w ), the TOA radiance measured by satellite sensors includes contributions from scattering and absorption in the atmosphere and reflection at the sea surface [6].It is important to estimate and account for the contribution from these other sources in order to estimate L w , which is readily normalized by the total downwelling irradiance just above the sea surface to yield the remote sensing reflectance (R rs ).
Atmospheric effects are removed through atmospheric correction (AC) [7], but residual errors in AC can introduce large uncertainties in R rs estimates, resulting in erroneous retrieval of OC products such as apparent optical properties [8].In open ocean waters, where phytoplankton governs the optical regime, it can be conveniently assumed that there is no water-leaving radiance in the near-infrared (NIR) region, such that any measured TOA radiance in this spectral band is attributed to atmospheric path radiance and reflectance from the water surface.While this assumption is valid for open-ocean waters, in shallow or optically complex waters that are, in general, characterized by a combination of constituents, such as phytoplankton, coloured dissolved organic matters, and suspended particulate matters, R rs (NIR) may be significantly greater than zero because of bottom reflectance (which can come from highly reflective near-surface vegetation such as kelp and seagrasses) and backscattering by suspended materials [9].This can lead to over-correction of atmospheric and surface reflectance effects, leading to underestimation and even negative L w estimates within the shorter wavelengths used to derive OC products [9,10].To account for the non-negligible R rs (NIR), algorithms that work for Case 2 waters have been developed (e.g., [10][11][12]) and tested (e.g., [13,14]).With these efforts, it is now possible to retrieve L w over coastal waters.However, the low spatial resolution of traditional ocean colour sensors inhibits the detection of detailed features that are not easily discernible in coarse-resolution satellite images.The recent availability of higher-resolution satellite sensors, e.g., Landsat 8 Operational Land Imager (OLI) and Sentinel-2 Multispectral Instrument (MSI), with adequate spectral and radiometric characteristics for ocean colour applications, has the potential to greatly improve coastal ocean colour applications, including water quality [15,16], bathymetry, and seafloor habitat mapping [17].
Optimizing the utility of the OLI sensor for aquatic science and applications requires validating R rs products to better understand their potential and limitations.A few recent studies have started investigating the quality of OLI-derived R rs for coastal applications.For example, Pahlevan et al. [15] used the AC scheme in SeaDAS software package to determine the best Landsat 8 band combinations that can minimize error in R rs retrieval over different coastal water types at select AERONET-Ocean Colour (OC) sites.Using newly computed calibration gains, they revealed that OLI-derived R rs estimates are as good as those from other ocean colour sensors.Likewise, the work of Franz et al. (2015) [18] illustrated the potential of OLI data for water-related studies.By employing the AC process in the SeaDAS Level-2 processing algorithm (l2gen), they assessed the quality of OLI-derived R rs and subsequently retrieved the chlorophyll-a concentration over the Chesapeake Bay, USA.In agreement with Pahlevan et al. (2017a) [15], they found that with a precise AC procedure, the high radiometric quality and improved imaging capabilities of OLI hold great promise for satellite-based coastal monitoring.Doxani et al. (2018) [19] tested a wide range of AC algorithms over different land cover types, highlighting the strengths and limitations of each algorithm.More recently, Wei et al. (2018) [20] also assessed four AC algorithms with a focus over water bodies, revealing that the NIR-short-wave infrared (SWIR) approach implemented in SeaDAS produced the most robust R rs estimates from Landsat 8.However, none of these studies examined the effects of environmental variables on the R rs retrieval accuracy of different AC algorithms.
With the increase in usage of OC products among the science community, and the need for robust R rs products, it is important to understand the potential of AC algorithms.Most of the ocean colour community has for years been using water-based AC methods for a wide range of applications from coastal to inland waters, so it is important that the effects of relevant environmental variables on the R rs retrieval accuracy of these AC algorithms is examined.Such knowledge may assist in the choice of AC algorithm for a given set of environmental conditions, and/or improved R rs retrieval under a wider range of conditions.Equally important is that other users interested in studying inland waters (e.g., biogeochemists, aquatic biologists) fully understand the accuracy of generic AC processors, in particular the land surface reflectance product, which is commonly used.
Here, we pursued an approach similar to Pahlevan et al. (2017a) [15], but expanded it by evaluating the performances of four different AC algorithms to determine which method produces the most robust R rs products in shallow coastal waters.Also, like Doxani et al. (2018) [19], we tested both land-based and water-based algorithms at multiple sites, but covered more sites over a longer time period to better capture space-time dynamics related to water optical properties.Using 54 in-situ measurements from 14 AERONET-OC sites, we (1) tested the following algorithms for atmospheric correction of Landsat 8 images: (a) Atmospheric and Radiometric Correction of Satellite Imagery (ARCSI) [21], (b) the Atmospheric Correction for OLI 'lite' (ACOLITE) [22], (c) SeaDAS [8], and (d) the United States Geological Survey's standard land-based AC used to produce the Landsat 8 Surface Reflectance (LSR) Climate Data Record (Landsat CDR), herein referred to as LaSRC [23]; (2) analysed the differences in spectral bands between satellite and in-situ measurements; and (3) examined the effects of three key environmental variables on the R rs retrieval accuracy of water-based AC algorithms.To our knowledge, this is the first inter-comparison exercise that tested AC algorithms using representative data from many coastal sites with varying atmospheric conditions and optical properties, combining the three approaches mentioned above in a single study.

Landsat 8 OLI Data
OLI measures TOA radiance in the visible, NIR, and short-wave infrared (SWIR) bands, at a spatial resolution of 30 m.Compared to its predecessors, OLI includes a new coastal/aerosol band (435-451 nm) in addition to the traditional blue (452-512 nm), green (533-590 nm), and red (636-673 nm) bands.The addition of a new band, together with enhanced spectral coverage and radiometric resolution, enables improved observation of water bodies from space and the ability to estimate the concentration of atmospheric aerosols for AC [16,24,25] (note that aerosol estimation for AC by the coastal band is done over land).Compared to data from existing global ocean colour missions, the higher spatial resolution has the potential to make important contributions to ocean colour remote sensing, such as separating and mapping in-water constituents in coastal waters [16,25].Although OLI signal-to-noise ratios (SNRs) (Table 1) are generally lower than those of heritage ocean colour sensors such as SeaWiFS or the Moderate Resolution Imaging Spectroradiometer onboard Aqua (MODISA), its enhanced SNR across all bands compared to the past Landsat missions improves OLI's ability to measure subtle variability in near-surface conditions and ultimately make OLI products a valuable source of data for ocean colour studies [16,18].

AERONET-OC Data
To validate the performance of the AC processors applied to the OLI data, we acquired 122 cloud-screened and fully quality-controlled Level 2.0 AERONET-OC in-situ measurements of normalized water-leaving radiance (nL w ) for 14 AERONET-OC sites, including 12 coastal sites (i.e., Galata Platform, Gloria, GOT Seaprism, Gustav Dalen Tower, Helsinki Lighthouse, Long Island Sound Coastal Observatory (LISCO), Martha's Vineyard Coastal Observatory (MVCO), Thornton C-Power, USC Seaprism, Venise, WaveCIS Site CSI-6, Zeebrugge-MOW1) and two lake sites (Lake Erie, Palgrunden) (Figure 1).AERONET-OC, managed by NASA's Goddard Space Flight Center (GSFC) [26], is a sub-network of the AERONET federated instrument [27,28].Although OLI and AERONET-OC have somewhat different spectral bands, a set of comparable bands centred at 441 nm, 491 nm, 551 nm, and 667 nm can be used for cross-comparison purposes.Thus, AERONET-OC data were collected in four spectral bands centred at 441, 491, 551, and 667 nm, for comparison with OLI's four visible bands, centred at 443, 483, 561 and 655 nm.Note that there is a slight difference, i.e., ±1 to ±3 nm, in all four bands among measurements at different sites.As R rs is not directly available from the AERONET-OC sites, the normalized water-leaving radiances (nL w , W m −2 sr −1 ) were divided by the top-of-the atmosphere (TOA) solar irradiance (F 0 ) [29] to obtain R rs .

AERONET-OC Data
To validate the performance of the AC processors applied to the OLI data, we acquired 122 cloud-screened and fully quality-controlled Level 2.0 AERONET-OC in-situ measurements of normalized water-leaving radiance (nLw) for 14 AERONET-OC sites, including 12 coastal sites (i.e., Galata Platform, Gloria, GOT Seaprism, Gustav Dalen Tower, Helsinki Lighthouse, Long Island Sound Coastal Observatory (LISCO), Martha's Vineyard Coastal Observatory (MVCO) , Thornton C-Power, USC Seaprism, Venise, WaveCIS Site CSI-6, Zeebrugge-MOW1) and two lake sites (Lake Erie, Palgrunden) (Figure 1).AERONET-OC, managed by NASA's Goddard Space Flight Center (GSFC) [26], is a sub-network of the AERONET federated instrument [27,28].Although OLI and AERONET-OC have somewhat different spectral bands, a set of comparable bands centred at 441 nm, 491 nm, 551 nm, and 667 nm can be used for cross-comparison purposes.Thus, AERONET-OC data were collected in four spectral bands centred at 441, 491, 551, and 667 nm, for comparison with OLI's four visible bands, centred at 443, 483, 561 and 655 nm.Note that there is a slight difference, i.e., ±1 to ±3 nm, in all four bands among measurements at different sites.As Rrs is not directly available from the AERONET-OC sites, the normalized water-leaving radiances (nLw, W m -2 sr -1 ) were divided by the top-of-the atmosphere (TOA) solar irradiance (F0) [29] to obtain Rrs.

Match-Up Exercise
To obtain the in-situ Rrs data needed to test AC procedures for OLI, we performed a match-up exercise between the AERONET-OC measurements and OLI data as follows: (i) Using the OLI metadata database file provided by the United States Geological Survey (USGS), python code was created to automatically retrieve all Landsat 8 OLI scenes and the contemporaneous AERONET-OC data (from the AERONET-OC website) that were within a ± 30-minute time window of Landsat 8 overpass times, for the April 2013 to May 2017 timeframe (note that a strict time window of ± 30 minutes, which reduces the number of match-up pairs, was used to limit the error introduced by

Match-Up Exercise
To obtain the in-situ R rs data needed to test AC procedures for OLI, we performed a match-up exercise between the AERONET-OC measurements and OLI data as follows: (i) Using the OLI metadata database file provided by the United States Geological Survey (USGS), python code was created to automatically retrieve all Landsat 8 OLI scenes and the contemporaneous AERONET-OC data (from the AERONET-OC website) that were within a ±30-min time window of Landsat 8 overpass times, for the April 2013 to May 2017 timeframe (note that a strict time window of ±30 min, which reduces the number of match-up pairs, was used to limit the error introduced by water movement between satellite and AERONET-OC observations, and ensure the quality of match-ups).This yielded a text file containing a total of 122 match-ups with coincident satellite and in-situ data for 14 AERONET-OC sites (Figure 2), as well as information on aerosol optical thickness, solar zenith angles (SZA), and wind speed for each match-up.All corresponding OLI scenes were subsequently bulk-downloaded using Landsat-util, a tool to automatically find and download multiple Landsat 8 scenes.Some of the 122 OLI scenes visibly contained a non-negligible amount of specular reflection off the sea surface (sunglint) in the area of the AERONOET-OC site.As not all AC algorithms have the capacity for sunglint correction, to obtain realistic and comparable R rs across all AC methods, scenes with visible specular reflection were excluded, leaving 69 of the original 122 scenes; (ii) Following the approach of Bailey and Werdell (2006) [30], a regional subset of Landsat 8 data was generated for each scene by (1) extracting a 7 × 7 pixel window centred on the location of the AERONET-OC site, and (2) removing the centre 3 × 3 pixels from that window to limit the effect of noise from the site's superstructures and shadows cast by it; (iii) For SeaDAS, remaining low-quality pixels were then removed by employing the internal SeaDAS exclusion flags, which include flags for land, clouds, cloud-shadow, ice, stray light, low nL w (555), high viewing zenith angle (>60 • ), high sunglint, and high TOA radiance.Scenes without any unflagged pixels were eliminated from the match-up exercise, leaving 54 scenes for comparison with AERONET-OC data.Similarly, internal ACOLITE exclusion flags were used to remove low-quality pixels for ACOLITE, which left 56 scenes for AERONET-OC comparison.The 54 scenes remaining after applying the SeaDAS exclusion flags also passed ACOLITE's exclusion flags, and were therefore used for further analysis.To ensure an unbiased inter-comparison, we included pixels with negative (Table A3) and zero R rs retrievals from all methods; (iv) We then obtained the per-band median R rs values of the unflagged pixels from each Landsat regional subset for final comparison with in-situ data, and used the median AERONET-OC measurements collected within the ±30-min window of the Landsat 8 overpass to represent in-situ match-ups.

Description of Atmospheric Correction Algorithms
Atmospheric correction of the OLI data was carried out using four algorithms: ARCSI, ACOLITE (version 20170113.0),SeaDAS (version 7.4), and LaSRC.Note that LaSRC is a product that has already been processed for surface reflectance by the United States Geological Survey (USGS).Both ACOLITE and SeaDAS have been specifically designed for AC over water surfaces, whereas ARCSI and LaSRC have not; we therefore refer to the latter two as land-based methods.The output of the water-based methods is R rs , which is directly comparable to the in-situ data from AERONET (after conversion: R rs = nL w / F 0 ), while the output of the land-based methods is in units of above-surface diffuse reflectance R(0 + ), which we converted to R rs using: ARCSI is an open-source software program developed at Aberystwyth University [21].It is a relatively new AC algorithm with functionalities to process multispectral images from both commercial and publicly available sensors and also to obtain processed data for direct use in remote sensing analyses [31].It is a command line tool where Py6S [32] can be implemented to correct multispectral images to above-surface diffuse reflectance using the 6S model [33], which simulates ground and atmospheric radiation under a variety of conditions.Within the 6S method, input parameters such as the Aerosol Optical Thickness (AOT), vertical column water vapour, and ozone concentration are automatically used by the 6S method to characterize the state of the atmosphere.
ACOLITE is a binary distribution of Landsat 8 OLI and Sentinel-2 MSI processing software developed by the Royal Belgian Institute of Natural Sciences [22,34].It is an image-based AC algorithm that estimates L w by correcting for molecular and aerosol scattering in the atmosphere using the Gordon and Wang (1994a) approach [5].Molecular reflectance correction, based on viewing and illumination geometries, is performed with a 6SV-based look-up table [33].Unlike SeaDAS, which uses 80 aerosol models for aerosol estimation [5,35], aerosol reflectance is estimated by determining aerosol types from the ratio of reflectances in two SWIR bands over water pixels where reflectance can be assumed zero, an approach similar to Ruddick et al. (2000) [36].Based on this assumption, it also retrieves water-leaving reflectances in both the visible and NIR bands together with other parameters of interest in marine and inland waters.ACOLITE is primarily designed for processing Landsat 8 OLI data for aquatic remote sensing applications, but has recently been modified and updated to include processing of Sentinel-2 MSI data [37].
LaSRC is a Level-2 data set produced and released as a provisional product by the USGS since January 2015, primarily to support terrestrial remote sensing applications.Unlike the precursor algorithm, i.e., Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS), used for previous Landsat satellites such as Landsat 4-5 Thematic Mapper (TM) and Landsat 7 Enhanced Thematic Mapper Plus (ETM+), which used the 6S model, LaSRC is generated using a dedicated Landsat Surface Reflectance code [24].Data are available as standalone climate data records (CDRs) that represent specific geophysical and biophysical properties of the land surface [23].AC is mainly based on the MODIS collection 6-AC algorithm, which uses a radiative transfer model for the inversion of atmospheric parameters such as aerosol and water vapour [24].It should be noted that surface reflectance is provided in seven spectral bands (the first seven OLI bands) only for scenes with a solar zenith angle less than 76 • , and that the 443 nm and 482 nm bands are not suitable for analysis as they are 'consumed' for aerosol inversion tests within the LaSRC.Although still provisional and under continuous improvement, LaSRC has been validated and assessed for land applications [38][39][40][41], and has a dedicated aerosol retrieval algorithm for pixels over water [23].
SeaDAS, which uses an NIR and a SWIR band for aerosol estimation for processing OLI imagery, contains an AC scheme originally designed for open ocean water based on the assumption of negligible L w in the NIR bands.This approach, which is NASA's operational AC algorithm [5], includes an l2gen (Level 2 generator) to retrieve R rs and other optical and geophysical water and atmospheric properties.Following some improvements to estimation of aerosol contributions (e.g., [12,42,43]) the l2gen processor in SeaDAS can now be used for deriving R rs in moderately turbid coastal waters [14].

Atmospheric Correction Procedure and Validation
To derive R rs , all four AC algorithms were parameterized using their default processing options.In addition, the following processing was implemented: (1) In SeaDAS, out-of-band correction options were set to zero (outband_opt = 0), i.e.R rs was reported at full bandpass, without correction to the nominal band centre, and no sunglint correction was implemented (glint_opt = 0) since there was no such correction option available for other algorithms; (2) In ARCSI, the 'clear water' option was used for the reflectance of a ground target as processing requires an option from 'green vegetation', 'clear water', 'sand', or 'lake water'.Also, AOT and other atmospheric parameters were automatically identified and estimated by ARCSI during batch processing.To derive AOT for each scene, realistic minimum and maximum values of 0.001 and 0.9, respectively, were manually specified; (3) Finally, to allow for consistency among all methods, we assumed a perfect sensor calibration by applying unity gains for vicarious calibration across all bands for all processors.For SeaDAS, aerosol correction was implemented following the Gordon and Wang (1994a) approach [5], with the NIR/SWIR correction option (865-1609 nm band combination) as suggested in Pahlevan et al. (2017a) [15] and Mobley et al. (2016) [7].ACOLITE aerosol correction was implemented using the default SWIR option (1609 and 2201 nm band combination) which computes Rayleigh-corrected reflectance from the SWIR bands for moderate and turbid waters.Each AC algorithm was applied to the final 54 Landsat 8 OLI scenes, representing a wide variety of coastal and atmospheric conditions (Table A1).When comparing their performance, we considered AERONET-OC data as the reference with negligible uncertainties.Note that uncertainties in the AERONET-OC in-situ measurements are ~5% in the blue to green bands and ~8% in the red band [26].As noted in Pahlevan et al. (2017b) [44] and Mélin and Sclep (2015) [45], compensating for discrepancies arising from the differences in nominal band centre wavelengths is crucial for obtaining a robust match-up analysis across all spectral bands (in particular for OLI's relatively broad spectral bands).To this end, we carried out a spectral band adjustment using the deep neural network approach as described in Pahlevan et al. (2017b) [44] and compared R rs derived with and without band adjustments.
The algorithm performance was compared using six metrics including: as well as the coefficient of determination (R 2 ), slope, and intercept of the line fitted using least-squares regression between in-situ and satellite R rs estimates.The x mea and x est are AERONET-OC and satellite-derived R rs data, respectively.The Spectral Angle (SA), which is insensitive to spectral amplitude, is used to quantify the similarity between satellite and in-situ R rs spectra.Values close to 0 indicate high similarity.

Validation of AC Algorithms
Scatter plots showing the estimated (OLI) and observed (AERONET-OC) R rs values for each match-up are presented in Figure 3, and summary statistics are tabulated in Table 2.There are clear differences between the water-based and land-based AC algorithms, with SeaDAS and ACOLITE outperforming ARCSI and LaSRC in all metrics for all bands, with only one exception (slope for R rs (482)).Between the two water-based methods, SeaDAS outperforms ACOLITE in every metric for all bands, with the exception of slopes for R rs (482), R rs (561), and R rs (655).SeaDAS had RMSEs close to zero between 0.0013 and 0.0005 1/sr across all four wavelengths) demonstrating a high degree of similarity between in-situ and OLI-estimated R rs with OLI data processed through SeaDAS (Table 2).A comparison of RMSE results of the per-band difference with and without band adjustments (Figure A1) shows that SeaDAS was the AC method most sensitive to spectral band differences, with the largest improvement of band adjustment occurring in the 655 nm channel.Spectral Angle values obtained for all algorithms showed that SeaDAS and ACOLITE have the highest similarity with in-situ R rs spectra (ARCSI: 0.46, ACOLITE: 0.27, LaSRC: 0.53 and SeaDAS: 0.20).The overall performance of SeaDAS reveals that the NIR-SWIR aerosol correction option can yield satisfactory results in low-to-moderately turbid waters.A possible reason for this is that the aerosol correction scheme, constructed following Ahmad et al. (2010) [42], was based on aerosol data obtained mainly from AERONET-OC sites [46].Comparison of R 2 values among all methods shows that the lowest and most diverse values are in the 443 nm wavelength, with values between 0.05 and 0.84.For LaSRC in particular, the regression line for the comparison in the 443 nm wavelength deviates very much from the 1:1 line, yielding a poor R 2 and slope (Figure 3).The poor correlation and low RMSE (R 2 : 0.05, slope: 0.23) are mostly a result of the large discrepancies between the observed and estimated R rs for the Zeebrugge-MOW1 site, where mean R rs was underestimated by ~70%.This is the largest underestimation by any method, across all sites.For this site, which is one of the most turbid sites in the AERONET-OC network, mean observed in-situ R rs in the red band is 0.0155 sr −1 , making it the only site with R rs (655) one order of magnitude greater than the mean value of ~0.001 sr −1 observed for all 14 AERONET-OC sites.This level of turbidity is common for this site, which is located only ~3.65 km from the coastline and receives sediment-rich water inputs from nearby rivers, as also noted by Vanhellemont and Ruddick (2015) [47] and was clearly visible in additional scenes excluded during the match-up exercise.Note that the TOA radiance data were used 'as is' without optimizing the vicarious calibration gains (as computed by Pahlevan et al. (2017a) [15]), which might further improve R rs retrievals.Similarly, none of the AC methods with different configuration capabilities that might improve performance were optimized as there was no optimal setting that would work for all cases considered in this paper.

Inter-Comparison of Reflectance Spectra at Each Site
Comparison of mean estimated and observed R rs at each AERONET-OC site (Figure A2) shows that all algorithms except SeaDAS generally overestimate R rs across all wavelengths, with the largest and smallest overestimation occurring in the 443 and 665 nm wavelengths, respectively.This is further supported by the RMSE and bias results (Figure 4).The largest errors (RMSE) and overestimations (bias) are observed in the 443 nm wavelength, probably due to the strong atmospheric scattering in this band.ARCSI has the largest overall positive bias in this wavelength, and indeed the highest overestimation at each site.However, its R rs results across all wavelengths at the Zeebrugge-MOW1 site compare well with R rs estimates from SeaDAS and ACOLITE.This suggests that ARCSI has a low sensitivity to the high concentrations of suspended sediments that dominate this site, as reported by De Maerschalck and Vanlede (2013) [48].This may also serve as an indication that ARCSI can better deal with turbid conditions than can LaSRC, which underestimated R rs by ~45% at this site.The failure of LaSRC for this band is likely due to the fact that it is part of the bands used for the aerosol inversion scheme [23].
The best performance from LaSRC across all bands is at Lake Erie (with two match-ups) where all other AC algorithms except ARCSI also have the best match with in-situ R rs .LaSRC outperforms other algorithms for the first three bands.Percentage difference values are 4.5%, −3.2%, and −0.3% for the first three bands, respectively.For ACOLITE and SeaDAS, the corresponding values are 28.8/−20.7%,13.6/−12.8%,and −0.4/−6.4%.Indeed, at this site LaSRC has the best R rs estimate of all methods in the 561 nm channel, while ACOLITE has the best R rs estimate in the 655 nm channel, with a percentage difference of −1.3%, whereas SeaDAS and LaSRC are −18.2% and −30%, respectively.Similar to the estimated R rs by LaSRC in Lake Erie in the 561 nm channel, LASRC-derived R rs (561) also agrees well with that of in-situ at Zeebrugge-MOW1 site; the percentage difference here is −0.25%.For SeaDAS and ACOLITE, these values are 6.7% and 4.1%, respectively.LaSRC also outperforms other AC algorithms in the 443, 482, and 655 nm channels at the GOT-Seaprism site, with only one match-up.Percentage differences between estimated and in-situ R rs are 8.8%, 11.1%, and −40.8%, respectively.For ACOLITE and SeaDAS, the corresponding values are 87.7/−79.3%,60.9/−46.3%,and 96.4/−127.6%.This is the only site where SeaDAS uncharacteristically underestimates R rs across all four bands.While any conclusion is tentative as GOT Seaprism (2014-026) only has one match-up, the poor performance of SeaDAS here is as a result of algorithm failure (very low R rs in 443, 482, and 561 nm wavelengths, and negative R rs in 655 nm wavelength), which can be attributed to conditions such as residual effects from cloud shadow or overcorrection for aerosol contribution in one or more visible band.Overcorrection typically occurs when water-leaving radiance is non-negligible in the band used to estimate the aerosol contribution [49].Other instances of failure (as defined above) from one or more One possible reason for the generally poor performance of ARCSI can be the aerosol contribution removal which relies on estimates from (1) dense dark vegetated surfaces, based on the assumption that reflectance of vegetated pixels is sufficiently dark, and a linear relationship between reflectance in the SWIR and blue bands or (2) dark pixels in the blue band, based on the assumption of an invariant aerosol concentration over the entire scene.However, these assumptions can easily be violated as finding a vegetated pixel that satisfies this condition may be difficult in scenes acquired over coastal waters and AOT variations may be sufficiently large such that adjoining pixels may have significantly different AOT.For the blue bands in particular, per-scene AOT estimates may lead to erroneous retrievals.For LaSRC, the generally poor performance may be due to the use of land-based pixels for aerosol estimation.In addition, for LaSRC, retrieving accurate R rs estimates over water requires the presence of a considerably large land area adjoining the water pixels.The majority of the AERONET-OC sites used in this study only have relatively small nearby land surfaces.This may help explain the few instances of good performance near land masses (e.g., for the Lake Erie and Zeebrugge MOW-1 sites).One possible reason for the generally poor performance of ARCSI can be the aerosol contribution removal which relies on estimates from (1) dense dark vegetated surfaces, based on the assumption that reflectance of vegetated pixels is sufficiently dark, and a linear relationship between reflectance in the SWIR and blue bands or (2) dark pixels in the blue band, based on the assumption of an invariant aerosol concentration over the entire scene.However, these assumptions can easily be violated as finding a vegetated pixel that satisfies this condition may be difficult in scenes acquired over coastal waters and AOT variations may be sufficiently large such that adjoining pixels may have significantly different AOT.For the blue bands in particular, per-scene AOT estimates may lead to erroneous retrievals.For LaSRC, the generally poor performance may be due to the use of land-based pixels for aerosol estimation.In addition, for LaSRC, retrieving accurate Rrs estimates over water requires the presence of a considerably large land area adjoining the water pixels.The majority of the AERONET-OC sites used in this study only have relatively small nearby land surfaces.This may help explain the few instances of good performance near land masses (e.g. for the Lake Erie and Zeebrugge MOW-1 sites).

Influence of Environmental Factors for SeaDAS and ACOLITE
To understand the impact environmental factors may have on Rrs retrieval errors from the waterbased AC methods, we investigated the influence of three variables: AOT(869), SZA, and hourly wind speed.These three variables are known to influence Rrs retrievals [50,51], e.g.AOT(870) and SZA have

Influence of Environmental Factors for SeaDAS and ACOLITE
To understand the impact environmental factors may have on R rs retrieval errors from the water-based AC methods, we investigated the influence of three variables: AOT(869), SZA, and hourly wind speed.These three variables are known to influence R rs retrievals [50,51], e.g., AOT(870) and SZA have been found to reduce the quality of water-leaving radiance derived from SeaWiFS and MODIS sensors [52].Figure 5a-c illustrates the error (x est − x mea ) for each match-up point as a function of each environmental parameter, for each AC method.Negative values imply that an algorithm underestimated the observed R rs value, and vice versa.We used tests of the statistical significance (two-tailed, α = 0.05, critical value = 0.2262) of the individual Pearson correlation coefficients to guide this analysis (Pearson correlation coefficients were used to show the strength of the relationship as they have been widely used in similar studies).While ACOLITE consistently overestimated R rs in the 443 and 482 nm bands, as also noted in [53], errors for both SeaDAS and ACOLITE were not significantly influenced by AOT (Figure 5a, no statistically significant correlations).However, SZA was significantly and positively correlated with R rs retrieval errors from SeaDAS for all four bands (i.e., r = 0.495486743, 0.483529464, 0.253699366, and 0.427793365, respectively), and from ACOLITE for the 443 and 482 nm bands (i.e., r = 0.239717715 and 0.228792001, respectively) (Figure 5b).A similar pattern was evident for wind speed, which was significantly positively correlated with R rs retrieval errors from SeaDAS for all bands except band 3 (for which a positive correlation was present, but not statistically significant) and from ACOLITE for the 443 and 482 nm bands (Figure 5c).We further examined the significance of the relationship between each environmental variable and errors across all wavelengths and found that SZA and AOT(869) were significant for SeaDAS in the first three and first two wavelengths, respectively, while ACOLITE was only influenced by wind speed in the 443 nm channel.These patterns, while generally causing only small errors in R rs retrieval, may guide further developments of both AC methods to make them more robust across the range of environmental conditions.
Remote Sens. 2018, 10, x FOR PEER REVIEW 11 of 20 pattern was evident for wind speed, which was significantly positively correlated with Rrs retrieval errors from SeaDAS for all bands except band 3 (for which a positive correlation was present, but not statistically significant) and from ACOLITE for the 443 and 482 nm bands (Figure 5c).We further examined the significance of the relationship between each environmental variable and errors across all wavelengths and found that SZA and AOT(869) were significant for SeaDAS in the first three and first two wavelengths, respectively, while ACOLITE was only influenced by wind speed in the 443 nm channel.These patterns, while generally causing only small errors in Rrs retrieval, may guide further developments of both AC methods to make them more robust across the range of environmental conditions.A2.

Conclusions
This paper provides an evaluation of four atmospheric correction algorithms (ACOLITE, ARCSI, LaSRC, and SeaDAS) for estimating R rs .Fifty-four match-ups were used to test the performance of these algorithms over mostly coastal sites that form part of the AERONET-OC network.After accounting for spectral band differences in AERONET-OC and OLI measurements/products, the R rs products from all algorithms were compared to AERONET in-situ R rs data.The generic AC methods (ARCSI and LaSRC) were less accurate for deriving R rs in coastal environments than water-based methods (ACOLITE and SeaDAS).These AC methods were particularly unreliable in the 443 and 482 nm channels, and performed well at only a few sites located in nearshore and inland waters.SeaDAS produced the best performance overall, while ACOLITE, though it performed better than the two generic AC methods, was less accurate than SeaDAS for R rs retrievals over (mostly) low-to-moderately coastal waters such as those typical of the AERONET-OC sites.Analyses of differences in spectral bands between satellite and in-situ measurements revealed that band adjustment minimized differences between sensors with different spectral bands.A relationship seems to exist between R rs retrieval accuracy for the two water-based AC methods and two atmospheric variables: SZA and wind speed.Future studies should examine these relationships further and consider related improvements to the AC methods.Neither of the water-based AC methods can currently be used to process images from commercial sensors such as WorldView-2/3 (which have improved spatial resolution) or previous Landsat missions (though this capability is available in an in-house version of SeaDAS for future public release).Given the usefulness of high spatial resolution data and the understanding that can be gained from time-series analysis for aquatic studies, such improvements would be valuable.Our findings are primarily applicable to nearshore coastal waters under low aerosol condition (i.e., AOT (869) ≤ 0.2).Further validation is required over inland waters (e.g., recently established sites in Green Bay, Grizzly Bay and Lake Okeechobee across the United States) and at stations with few match-up points (e.g., GOT Seaprism and Lake Erie with one and two match-ups, respectively) to better understand the performance of each AC method for various science and application areas.In future studies, the authors intend to evaluate the performance of these AC algorithms over inland waters such as those found over the GloboLakes sites.

Figure 2 .
Figure 2. Number of match-ups between Landsat 8 OLI scenes and AERONET-OC site measurements within ± 30-minute window of the Landsat 8 overpass (GAL: Galata, GLO: Gloria, GOT: Got Seaprism, GUS: Gustav Dalen Tower, HEL: Helsinki, ERIE: Lake Erie, LIS: LISCO, MVC: MVCO, PAL: Palgrunden, THO: Thornton C-Power, USC: USC Seaprism, VEN: Venise, WAV: WaveCIS Site CSI-6, ZEB: Zeebrugge-MOW1).Dark blue represents the total number of initial match-ups within a ± 30minute time window of the Landsat 8 overpass times for each site.Light blue represents the total number of final match-ups used for analysis after excluding scenes with sunglint and performing the match-up exercise.

Figure 2 .
Figure 2. Number of match-ups between Landsat 8 OLI scenes and AERONET-OC site measurements within ±30-min window of the Landsat 8 overpass (GAL: Galata, GLO: Gloria, GOT: Got Seaprism, GUS: Gustav Dalen Tower, HEL: Helsinki, ERIE: Lake Erie, LIS: LISCO, MVC: MVCO, PAL: Palgrunden, THO: Thornton C-Power, USC: USC Seaprism, VEN: Venise, WAV: WaveCIS Site CSI-6, ZEB: Zeebrugge-MOW1).Dark blue represents the total number of initial match-ups within a ±30-min time window of the Landsat 8 overpass times for each site.Light blue represents the total number of final match-ups used for analysis after excluding scenes with sunglint and performing the match-up exercise.

Figure 3 .
Figure 3. Scatterplots of the relationship between in-situ measurements (x-axis) and OLI estimates (y-axis) for each OLI band acquired over 14 AERONET-OC sites.Regression lines are shown in colours, while the thick dotted black lines are 1:1 lines.(a) 443 nm; (b) 482 nm; (c) 561 nm; (d) 651 nm.

Figure 4 .
Figure 4. Overall band-by-band RMSE and mean bias results for all algorithms.

Figure 4 .
Figure 4. Overall band-by-band RMSE and mean bias results for all algorithms.

Figure 5 .Figure 5 .
Figure 5. Scatterplots of the error (sr -1 ) showing the dependency of Rrs retrieval accuracy from both ACOLITE and SeaDAS on (a) AOT(869), (b) SZA, and (c) wind speed.AOT(869) and wind speed Figure 5. Scatterplots of the error (sr −1 ) showing the dependency of R rs retrieval accuracy from both ACOLITE and SeaDAS on (a) AOT(869), (b) SZA, and (c) wind speed.AOT(869) and wind speed were derived from coincident measurements at each AERONET-OC site used in this study, while SZA was obtained by subtracting the sun elevation angle provided in the Landsat 8 metadata from 90.Each circle represents a match-up data point, for a total of 54 data points across the 14 AERONET-OC sites.The 54 match-ups and their corresponding environmental parameter values are tabulated in TableA2.

Figure A2 .
Figure A2.Line graphs showing the Rrs spectra of each of the 14 AERONET-OC stations (Results were averaged for each station except GOT Seaprism for which only one match-up is available).

Figure A2 .
Figure A2.Line graphs showing the R rs spectra of each of the 14 AERONET-OC stations (Results were averaged for each station except GOT Seaprism for which only one match-up is available).

Table 1 .
Comparison of the band centres and the signal-to-noise ratios of MODIS and Landsat 8 Operational Land Imager (OLI) at specified levels of typical spectral radiance.

Table 2 .
Statistical results for the retrieved remote sensing reflectance (Rrs) obtained for all processors with and without band adjustment (values in parenthesis represent results without band adjustment).Best metrics are highlighted in bold letters.After-band-adjustment linear fit, which was employed to reveal the relationship between in-situ and modelled Rrs, improves with increasing wavelength for both Atmospheric Correction (ACOLITE and SeaDAS), with R 2 values of 0.70/0.84,0.85/0.92,0.92/0.95,and 0.93/0.97for bands 1 through 4 for ACOLITE/SeaDAS, respectively.A similar trend is seen for ARCSI and LaSRC, for the first three bands.

Table 2 .
Statistical results for the retrieved remote sensing reflectance (R rs ) obtained for all processors with and without band adjustment (values in parenthesis represent results without band adjustment).Best metrics are highlighted in bold letters.After-band-adjustment linear fit, which was employed to reveal the relationship between in-situ and modelled R rs , improves with increasing wavelength for both Atmospheric Correction (ACOLITE and SeaDAS), with R 2 values of 0.70/0.84,0.85/0.92,0.92/0.95,and 0.93/0.97for bands 1 through 4 for ACOLITE/SeaDAS, respectively.A similar trend is seen for ARCSI and LaSRC, for the first three bands.

Table A1 .
Satellite scenes and their correspondent sites.

Table A2 .
Values of environmental parameters for each match-up.

Table A3 .
Cases of negative R rs retrievals from the four AC algorithms.