remote sensing Evaluation of Sentinel-2/MSI Atmospheric Correction Algorithms over Two Contrasted French Coastal Waters

: The Sentinel-2A and Sentinel-2B satellites, with on-board Multi-Spectral Instrument (MSI), and launched on 23 June 2015 and 7 March 2017, respectively, are very useful tools for studying ocean color, even if they were designed for land and vegetation applications. However, the use of these satellites requires a process called “atmospheric correction”. This process aims to remove the contribution of the atmosphere from the total top of atmosphere reﬂectance measured by the remote sensors. For the purpose of assessing this processing, seven atmospheric correction algorithms have been compared over two French coastal regions (English Channel and French Guiana): Image correction for atmospheric effects (iCOR), Atmospheric correction for OLI ‘lite’ (ACOLITE), Case 2 Regional Coast Colour (C2RCC), Sentinel 2 Correction (Sen2Cor), Polynomial-based algorithm applied to MERIS (Polymer), the standard NASA atmospheric correction (NASA-AC) and the Ocean Color Simultaneous Marine and Aerosol Retrieval Tool (OC-SMART). The satellite-estimated remote-sensing reﬂectances were spatially and temporally matched with in situ measurements collected by an ASD FieldSpec4 spectrophotometer. Results, based on 28 potential individual match-ups, showed that the best performance processor is OC-SMART with the highest values for the total score S tot (16.89) and for the coefﬁcient of correlation R 2 (ranging from 0.69 at 443 nm to 0.92 at 665 nm). iCOR and Sen2Cor show the less accurate performances with total score S tot values of 2.01 and 7.70, respectively. Since the size of the in situ observation platform can be signiﬁcant compared to the pixel resolution of MSI onboard Sentinel-2, it can create bias in the pixel extraction process. Thus, to study this impact, we used different methods of pixel extraction. However, there are no signiﬁcant changes in results; some future research may be necessary.


Introduction
In recent decades, remote sensing has been incorporated into various important research studies on the ocean such as ecological investigation or water quality monitoring [1][2][3]. Satellite data play an important role in providing both spatial and temporal information necessary to monitor the change in water quality parameters [4]. Earth observation (EO) provides several key ocean parameters [5]. One of these parameters is the ocean color: ocean color studies the interaction of the sunlight with the marine particles that are optically-active. It can provide information on the concentration and the bio-optical properties of these particles, such as the chlorophyll-a, the total suspended matter, or the  For deriving the aerosol contribution to the signal measured by a space-borne sensor, the black pixel assumption (i.e., the ocean is totally absorbent in the near-infrared bands (NIR)) is used in the open ocean waters [8]. However, in optically-complex waters such as turbid coastal waters (the areas observed by MSI, up to 20 km offshore), this hypothesis is not valid anymore [13,14] and this leads to large errors when retrieving the output products [9,15,16]. To deal with this issue, many different algorithms have been developed for the ocean color remote sensors, which can handle the non-zero R rs in turbid coastal water areas [14,[17][18][19][20][21][22][23][24][25][26][27][28]. As shown in published studies [10,15,16,22], these algorithms show strong discrepancies in terms of performance, but did not cover the entire variability of the coastal waters and there is still a need in assessing these algorithms.
Although, as mentioned, there are currently many algorithms that have been developed with high processing efficiency, not all algorithms are open-source, easy to use and well-supported for Sentinel-2 products. Some examples can be mentioned such as Fast Line-of-sight Atmospheric Analysis of Hy-percubes (FLAASH) and Maccs-Atcor Joint Algorithm (MAJA). FLAASH is an algorithm used for cloud detection and atmospheric correction [29]. However, this processor is a proprietary commercial software and it only accepts radiance images in band-interleaved-by-line (BIL) or band-interleaved-by-pixel (BIP) format as the input, which means it is not well-supported for Sentinel-2 images [30]. MAJA is an open-source processor, however, this method is a bit complicated to use, and requires memory, disk space and computer power [31]. Therefore, in order to make it easy for readers to apply the results, access and use the processors, the criterion for choosing AC algorithms applied in this study are free to use and easy to operate.
In this paper, a comparison of seven different atmospheric correction algorithms is presented for MSI images. The retrievals with the seven AC are compared to in situ measurements from sea campaigns in the French coasts of the English Channel and French Guiana using a match-up exercise. By using a set of statistical parameters and scheming score, the validation of these seven processors are evaluated based on their accuracy and performances over these coastal areas. We also studied the impact of the pixel extraction in the match-up exercise on the retrievals as we deal with high-spatial-resolution sensors (and the structure used to collect the in situ measurements can be the same size as one or several pixels).
The in situ and satellite datasets and the match-up exercise are fully detailed in Section 2. The associated statistical parameters used in this study are described in Section 3. Section 4 presents the background on atmospheric correction and the principles of seven AC processors used in our study. Section 5 introduces the performances and statistical results of the AC on the match-up dataset. Then the results are discussed in Section 6. Finally, the conclusion of the ranking score and performances and recommendations for future researches are provided in Section 6.

In Situ Measurements
The radiometric dataset included in this study has been collected from two French coastal areas: French Guiana and the English Channel as shown in Figure 1. The data in French Guiana are primarily collected from the coastal area near the Maury and Cayenne rivers, while for the English Channel, between Calais to Brest, with most of the sampling collected in the coastal areas of Wimereux and Boulogne-sur-Mer, in the Eastern part of the English Channel. The English Channel is an area with strong tidal currents and low bathymetry with two strong freshwater inputs from two rivers, the Seine and the Escaut. Under water mixing from natural factors such as water inputs and strong tidal currents, the formation of suspended matter in the water in this area increases. Besides, this area also has the appearance of bloom events of phytoplankton in spring, making the density of planktons increase rapidly during this time. Combining all these factors has made the English Channel an area of relatively high turbidity with a mean near-surface suspended sediments concentration of 10-35 mg l −1 [32,33].
Similar to the English Channel, the French Guiana area has low bathymetry and two strong water inputs from two rivers, the Amazon and the Maury. In addition, the French Guiana area also receives a large amount of suspended sediment from North Brazilian and Guiana currents. All these factors have made French Guiana a highly turbid area [34,35].
The radiometric in situ data were collected using the Analytical Spectral Devices (ASD) FieldSpec4 spectro-photometer in the Visible/Near Infrared (VNIR, 350-1050 nm) and the Shortwave Infrared (SWIR, 900-2500 nm) parts of the spectrum. The sampling principle for the ASD spectra is the one described in [36].
The downwelling irradiance above the surface (E d ) was measured using an almost 100% reflecting Spectralon reference panel. Then, the total upwelling radiance from the water (L t ) (i.e., from the water surface) was measured by pointing the sensor at the water surface at 40 • from nadir, maintaining an azimuth of 90 • or 135 • from the solar plane, depending on the pontoon orientation with respect to the sun. Downwelling sky radiance (L sky ) was measured at a zenith angle of 40 • to account for the skylight reflection [36]. The water-leaving reflectance is then calculated using the following equation: The English Channel is an area with strong tidal currents and low bathymetry with two strong freshwater inputs from two rivers, the Seine and the Escaut. Under water mixing from natural factors such as water inputs and strong tidal currents, the formation of suspended matter in the water in this area increases. Besides, this area also has the appearance of bloom events of phytoplankton in spring, making the density of planktons increase rapidly during this time. Combining all these factors has made the English Channel an area of relatively high turbidity with a mean near-surface suspended sediments concentration of 10-35 mg l −1 [32,33].
Similar to the English Channel, the French Guiana area has low bathymetry and two strong water inputs from two rivers, the Amazon and the Maury. In addition, the French Guiana area also receives a large amount of suspended sediment from North Brazilian and Guiana currents. All these factors have made French Guiana a highly turbid area [34,35].
The radiometric in situ data were collected using the Analytical Spectral Devices (ASD) FieldSpec4 spectro-photometer in the Visible/Near Infrared (VNIR, 350-1050 nm) and the Shortwave Infrared (SWIR, 900-2500 nm) parts of the spectrum. The sampling principle for the ASD spectra is the one described in [36].
The downwelling irradiance above the surface (Ed) was measured using an almost 100% reflecting Spectralon reference panel. Then, the total upwelling radiance from the water (Lt) (i.e., from the water surface) was measured by pointing the sensor at the water surface at 40° from nadir, maintaining an azimuth of 90° or 135° from the solar plane, depending on the pontoon orientation with respect to the sun. Downwelling sky radiance (Lsky) was measured at a zenith angle of 40° to account for the skylight reflection [36]. The water-leaving reflectance is then calculated using the following equation: where ρas is the air-sea interface reflection coefficient. The full protocol is described in [36]. Then the remote-sensing reflectance Rrs is obtained using the following equation:

Satellite
In this study, MSI images are downloaded from the Copernicus Open Access Hub (https://scihub.copernicus.eu/, accessed on 25 August 2020) for both Sentinel-2A and Sentinel-2B images, corresponding to the field trip measurements of 28 days, as shown in Table 2. The temporal window for selecting a match-up was ±2 h between the satellite overpass and the in situ measurements.
where ρ as is the air-sea interface reflection coefficient. The full protocol is described in [36]. Then the remote-sensing reflectance R rs is obtained using the following equation: The English Channel is an area with strong tidal currents and low bathymetry wit two strong freshwater inputs from two rivers, the Seine and the Escaut. Under water mix ing from natural factors such as water inputs and strong tidal currents, the formation o suspended matter in the water in this area increases. Besides, this area also has the ap pearance of bloom events of phytoplankton in spring, making the density of plankton increase rapidly during this time. Combining all these factors has made the English Chan nel an area of relatively high turbidity with a mean near-surface suspended sediment concentration of 10-35 mg l −1 [32,33]. Similar to the English Channel, the French Guiana area has low bathymetry and tw strong water inputs from two rivers, the Amazon and the Maury. In addition, the Frenc Guiana area also receives a large amount of suspended sediment from North Brazilia and Guiana currents. All these factors have made French Guiana a highly turbid are [34,35].
The radiometric in situ data were collected using the Analytical Spectral Device (ASD) FieldSpec4 spectro-photometer in the Visible/Near Infrared (VNIR, 350-1050 nm and the Shortwave Infrared (SWIR, 900-2500 nm) parts of the spectrum. The samplin principle for the ASD spectra is the one described in [36].
The downwelling irradiance above the surface (Ed) was measured using an almos 100% reflecting Spectralon reference panel. Then, the total upwelling radiance from th water (Lt) (i.e., from the water surface) was measured by pointing the sensor at the wate surface at 40° from nadir, maintaining an azimuth of 90° or 135° from the solar plane depending on the pontoon orientation with respect to the sun. Downwelling sky radianc (Lsky) was measured at a zenith angle of 40° to account for the skylight reflection [36]. Th water-leaving reflectance is then calculated using the following equation: where ρas is the air-sea interface reflection coefficient. The full protocol is described in [36 Then the remote-sensing reflectance Rrs is obtained using the following equation:

Satellite
In this study, MSI images are downloaded from the Copernicus Open Access Hu (https://scihub.copernicus.eu/, accessed on 25 August 2020) for both Sentinel-2A and Sen tinel-2B images, corresponding to the field trip measurements of 28 days, as shown i Table 2. The temporal window for selecting a match-up was ±2 h between the satellit overpass and the in situ measurements. (2)

Satellite
In this study, MSI images are downloaded from the Copernicus Open Access Hub (https://scihub.copernicus.eu/, accessed on 25 August 2020) for both Sentinel-2A and Sentinel-2B images, corresponding to the field trip measurements of 28 days, as shown in Table 2. The temporal window for selecting a match-up was ±2 h between the satellite overpass and the in situ measurements.

Match-Ups Selection
The match-up selection process is performed to compare satellite estimates and in situ data. The location of the in situ measurements is co-located with the satellite image pixels. A match-up is obtained following several consecutive steps. First, satellite estimates are extracted over a 3 × 3 pixels box, with the central pixel corresponding to the location of the field station. Second, to guarantee data uniformity between field and satellite data, a time variation of ±2 h is set [37]. Third, a pixel is considered as "valid" when it is not a pixel with values R rs < 0 or flags. The flags pixels include instrument data missing or invalid, water mask, land mask, cloud, possible sea-ice or snow contamination, whitecap, and saturation (within any band from 400 to 754 nm) [16]. Fourth, a match-up is considered "valid" when in the 3 × 3 pixels box at least 6 pixels are considered valid. Finally, a coefficient of variation (CV; standard deviation over mean) is calculated for each wavelength, and the final value of the match-up is "valid" only if the CV is < 0.2 at each wavelength [15,16].

Extraction Method
For the match-up exercise, we first used the protocol described above from Bailey and Werdell [37] and Jamet et al. [16]. However, while the resolution of MSI used for validation in this article is 60 m for all MSI bands, the size of structures used to collect the in situ measurements can be considerable to the pixel (for instance, research vessels, towers, lighthouse). It means that the structures can impact the value of one or more pixels in the extraction box, leading to inaccurate valid pixels (or no valid pixels at all). This has been shown by Concha et al. [38]. In this latter study, different match-up validation protocols have been tested for different ACs, and the results showed that the number of available Remote Sens. 2022, 14, 1099 6 of 25 match-ups was in a direct influence with the position of the measuring platforms. In our study, we studied the impact of shifting the extraction box from a few pixels in order to avoid these disturbances from the measurement structures. We evaluated extraction boxes that are shifted from the centered pixel by 2 pixels to the left, right, up, or down to check the effect on the match-ups analysis. We also look at removing only the central pixel with the box still centered. Figure 2 illustrates the different extraction boxes used for this analysis.

Basics of Atmospheric Correction
The signal measured by the satellite can be decomposed into several terms [7,8]: where ρ t (λ) is the top of atmosphere reflectance measured by satellite; ρ r (λ) is the Rayleigh reflectance, ρ a (λ) is the reflectance from multiple scattering by aerosols, ρ ra (λ) is the aerosolmolecular scattering, ρ wc (λ) is the reflectance from whitecaps, ρ g is the sun-glint reflectance and ρ w (λ) is the water-leaving reflectance; t d and T are diffuse and direct transmittance, respectively. The sun-glint, whitecaps, and gas absorptions corrections are pre-processed and then removed from ρ toa (λ). Then, the Rayleigh corrected reflectance (ρ rc ) is obtained: where ρ A (λ) = ρ a (λ) + ρ ra (λ). In addition to the mentioned parameters, there is another factor that can influence the retrievals, which come from the proximity of land and scattering of surface-reflected radiance, is adjacency effect. However, the impact of the adjacency effect is low from around 10 km from the coasts [39][40][41]. Most of used in situ data were collected around this distance; thus, the adjacency effect is quite negligible. Therefore, the main purpose of atmospheric correction is to remove the multiple scattering aerosol contributions from the Rayleigh corrected reflectance. In our study, we considered that the adjacency effect was negligible.

Description of the Algorithms
Image correction for atmospheric effects (iCOR) is a scene generic atmospheric correction that aims for land, coastal, inland, or transitional waters [42,43], and it has been tested over different coastal waters [48][49][50]. Furthermore, iCOR atmospheric correction processing can be divided into four steps [42]. First, all pixels with land and water values are identified and separated; then, the AOT values will be extracted from the soil pixels according to an improved version of the method developed by Guanter [51]; then, SIMEC is used to implement the adjacency correction approach for water pixels and fixed background land [43]; finally, the radiative transfer equation is solved. Regarding the reference data, iCOR uses the MODTRAN5 LUT to perform the atmospheric correction process mentioned above [52]. In this work, S2 images were processed by using default parameters with iCOR (v.0.3) through SNAP.
Atmospheric correction for OLI 'lite' (ACOLITE) is an atmospheric correction processor for coastal and inland waters, which was developed by the REMSEM group [12,27,44,45]. It operates atmospheric correction by using the "dark spectrum fitting" approach by default, but it can also be configured to use the exponential extrapolation [44]. In this research, the ACOLITE v.20211124 has been applied using the default approach without the negative mask for rhow (l2w_mask_negative_rhow=False) to check its performance.
C2RCC is a development of Case 2 Regional processor, which relies on a large database of ρ w (λ) and related top-of-atmosphere radiances [19]. The inversion of the water leaving signal and the satellite signal is performed by artificial neural networks. A characterization of optically complex water through its inherent optical properties is used to parameterize radiative transfer models for the atmosphere over the water body [53]. For this work, the normal net version with typical ranges of IOPs has been used through the Sentinel Application Platform (SNAP v.8.0). All the satellite images have been processed with default parameters. Sentinel 2 Correction (Sen2Cor) is an atmospheric correction algorithm developed by ESA specifically for Sentinel-2 Level 2A land product [46]. In general, the atmospheric correction algorithm assumes that the vegetation pixels in the satellite image have adequate darkness and that the ratio of the bottom-atmospheric reflectance between the wavelengths is constant, called dark dense vegetation (DDV) [46,54]. Accordingly, this algorithm will require the presence of vegetation, corresponding to the dark areas in satellite images. In case the input satellite image meets the above requirements, the algorithm will automatically extract the AOT data from the dark pixels and correct the image [52]. In this study, the satellite images were processed by Sen2Cor with default parameters.
Polynomial-based algorithm applied to MERIS (Polymer) is an atmospheric correction for oceanic water, specially designed for the high sun-glint condition [26]. It is based on spectral matching, which applies all the spectral bands to atmospheric and sun-glint correction. In detail, the atmospheric component of this algorithm is a polynomial function used to derive spectral reflectance of the atmosphere and sun glint. The satellite images herein were processed following default parameters in Polymer v.4.13.
SeaDAS-integrated NASA atmospheric correction algorithm applied in this study is based on the method presented by Bailey et al. [17] (NASA-AC). To calculate the value of remote sensing reflectance (R rs ), this algorithm applies a bio-optical model in the red and NIR bands. First, to be able to apply this model, an initial atmospheric correction with the black pixel assumption is needed for the calculation of chlorophyll-a concentration (Chl-a). Then, Chl-a is used to calculate the spectral particulate backscattering, which is then used to derive the remote sensing reflectance through an iterative scheme [17]. In this study, indicators for performing atmospheric correction were set as default.
OC-SMART algorithm focuses on analyzing data from ocean color satellites such as Landsat-8, Sentinel-2/3, SeaWiFS, MODIS-AQUA [20,47]. OC-SMART platform can provide remote sensing reflectance (R rs ) and Chlorophyll-a (Chl-a) based on a multilayer neural network (MLNN) method. A coupled ocean-atmospheric radiative transfer model has been used to calculate simultaneously R rs and the Rayleigh radiance corrected at the top of the atmosphere and thereby being used to train a MLNN. The MLNN directly estimates R rs from ρ rc [47]. In this study, the integrated atmospheric correction algorithm in OC-SMART has been applied using default setting for assessment.

Statistical Parameters
Four statistical parameters have been calculated to assess the performance of each atmospheric correction algorithm: the correlation coefficient (R 2 ), the Bias, and the relative error (RE): where N is the number of match-ups, R rs AC is the remote sensing reflectance retrieved by the atmospheric correction algorithms and R rs ASD is the remote sensing reflectance from ASD in situ measurements.
Quality Assurance Score (QAS) [55], Chi-square mean (χ 2 ) [56], and spectral angle mean (SAM) [57] were also calculated to assess the full spectrum of the retrieved R rs . QAS gives quantification of the spectrum quality of R rs retrievals with reference to a reference in situ dataset [55]. For the best performance, QAS value has to be close to one. SAM provides the difference between the in situ and the retrieved R rs spectra, SAM value has to go towards 0 • [57]. χ 2 indicates relative error values of the full spectrum of R rs retrievals and has to be null [56].
In this work, QAS, χ 2 , and SAM were calculated using 4 wavelengths: 443, 490, 560, and 665 nm, which are the closest to the QAS reference R rs wavelengths. However, in χ 2 case, the wavelength at 560 nm is excluded due to its use for error normalization.
where j and i are the indexes of λ and spectrum between 443 and 665 excluding 560 nm, respectively. X is the normalized R rs at 560 nm. 〈R rs ASD , R rs AC 〉 is the dot product of corresponding R rs , and R rs ASD × R rs AC is the Euclidean norm of R rs ASD and R rs AC .

Scoring Scheme
Similar to QAS, χ 2 , and SAM, the total score (S tot ) indicates AC performances over the full spectrum of R rs [36,56]. S tot is calculated based on the scoring of the slope α and intercept β of the regression line, RE, Bias, and R 2 for each algorithm. It is calculated on Remote Sens. 2022, 14, 1099 9 of 25 the variation between the minimum and maximum of those sixth statistical parameters, as shown in the following equations: Four wavelengths were considered for five statistical parameters, which lead to a maximum S tot value of 20. Table 3 shows the number of in situ measurements, days, and available match-ups. A majority of the field measurements are from French Guiana. However, these data were mainly for the aim of research for Sentinel-3 satellite [36], which led to a significant decrease in the potential number of match-ups for Sentinel-2. Table 3. Location, number of days, stations, and potential match-ups by areas.

Study Areas Days Stations Potential Match-Up
English Channel  16  43  22  French Guiana  14  60  8  Total  30  103  30 For the assessment, seven different ACs were applied to the Sentinel-2 satellite images for the evaluation of their performances. Accordingly, the match-up dataset is divided into two categories: individual match-ups and common match-ups. The common matchup in this study aims to focus on evaluating the efficiencies of the correction algorithms under the condition of the same number of output data. This condition helps to assess, more uniformly, the ability to extract remote sensing reflectance from TOA signal between ACs. Meanwhile, individual match-up provides a more comprehensive view. Besides the performance of ACs, the individual match-up can also help to determine which ACs are better at handling atmospheric influences and the overall performances of 7 ACs help to evaluate their ability to remove invalid pixels. Thus, in this section, the match-up results will be divided for these 2 cases.

All Match-Ups: Individual Performance
After selecting the appropriate match-ups based on the protocol mentioned in Section 2.3, the total amount of valid data for each atmospheric correction algorithm is shown in Table 4. The number of valid match-ups varies between the ACs and between wavelengths for a given ACs. NASA-AC is the AC that provided the least number of match-ups (9 at 665 nm) while Sen2Cor, Polymer, OC-SMART, and iCOR are the AC providing the highest number of match-ups (between 21 and 22 depending on the wavelength). Figure 3 shows the comparison between the estimated R rs from atmospheric correction algorithms and the R rs from in situ measurements for MSI bands from 443 nm to 665 nm. The accuracy of the AC is wavelength-dependent. For wavelengths at 443 nm and 490 nm, the retrieved R rs from all ACs are mostly under-estimated except for ACOLITE and iCOR. For higher wavelengths the scattering around the 1:1 line decreases with increasing wavelength, leading to better performances of the atmospheric correction algorithms. Those results agree with the work of Pahlevan et al. [10].   Table 5 indicate the variation in the statistical parameters (RE, Bias, and R 2 ) for all ACs with MSI bands from 443 nm to 665 nm. Similar to the scatter plots, the value of R 2 in the blue ranges from about 0.28 to 0.57 for all AC except for OC-SMART with a value of 0.69 (Table 5). The value of R 2 increases with wavelength and reaches 0.61 to 0.94 in the red region for all AC. Thus, the correlation between the in situ and estimated R rs is relatively high in the red band, showing better estimates compared to those in the blue band. Regarding the error-related statistical parameters, in general, the output of the algorithms has relatively high errors in the blue and red bands. The relative error forms the standard U-shape [15,16] for most ACs. In contrast, the errors are significantly lower at 560 nm compared to the blue band and red band.
In general, the seven ACs can be categorized into two groups depending on their performances: (1) NASA-AC, C2RCC, Polymer, ACOLITE, and OC-SMART with small scattering at higher wavelengths and low biases (especially OC-SMART); (2) Sen2Cor and iCOR showing lower performances compared to the first group, with higher scattering around the 1:1 line and lower performances in term of statistical results (Figure 4 and Tables 5 and 6).   The first group includes NASA-AC, C2RCC, Polymer, ACOLITE, and OC-SMART. In this group, OC-SMART shows the best performances compared to the other algorithms. The second group includes iCOR and Sen2Cor. For this group, both Acs show lower statistical results compared to the first group. All parameters values such as relative error, and bias are substantially higher for this group. For the relative error, iCOR and Sen2Cor showed significantly higher values than the algorithms in group 1. This difference in RE value can be seen in blue bands with RE values 3 to 4 times higher than the Acs of group 1. The same trend can be seen for the Bias value. Accordingly, the Bias value of iCOR and Sen2Cor ranges from 43.06% at 560 nm to 112.94% at 665 nm and 47.22% at 560 nm to 118.57% at 443 nm, respectively, while the Bias value of the algorithms of group 1 ranges only from −10% to 30% except for ACOLITE. The difference in the values of the statistical parameters between group 1 and group 2 was most evident in the Bias value of iCOR at band 665 nm and Sen2Cor at band 443 nm, which are 112.94% and 118.57%, respectively. This value is 5 to 6 times higher than Bias value of group 1. Besides, the lower performances of ACs in group 2 are also clearly shown through the value of R 2 , which is only 0.36 for Sen2Cor at 443 nm.  Table 6 gives the values of QAS, χ 2 , SAM, and S tot for each atmospheric correction algorithms. On the opposite, Sen2Cor shows the highest value (9.66). The SAM value needs to be as low as possible to get estimated spectra as similar as possible as in situ spectra. For S tot , OC-SMART is the atmospheric correction with the highest value, 16.89 over 20, Polymer has the second-highest value of S tot (14.90). The atmospheric correction with the lowest S tot value (2.01) is Sen2Cor.

Common Match-Ups: Performance Inter-Comparison
In this section, we compare the ACs on the same match-ups dataset. The number of common match-ups decreases considerably compared to the case of individual match-ups with the number of match-ups varying between 7 at 665 nm and 16 at 560 nm, as shown in Table 7. This is due to the number of overall match-ups from NASA-AC (Table 4). Figures 5 and 6 are similar to Figures 3 and 4 but for the common match-ups. Figure 5 shows the comparison between the in situ R rs value and the ACs estimated R rs . As for the case of individual match-ups, the accuracy of the ACs increases with increasing wavelength. At 443 nm, most of the ACs provide under-estimation of R rs (except ACOLITE), and this under-estimation tends to decrease with increasing wavelength. The scattering around the 1:1 line is the smallest at 665 nm. Table 7. Number of match-ups for 4 visible MSI bands of all ACs in common match-ups case. Figure 6 shows the variation in the statistical parameters (RE, Bias, R 2 ) with the wavelength on the common match-ups dataset. Overall, the values of the statistical parameters, in this case, show lower performances for all ACs of the first group, and higher performances for the second group compared to the all match-ups case. This can be seen through the evaluation of the correlation between the in situ R rs and AC estimates.  In terms of spectra shapes, QAS values increased considerably for all ACs, except for OC-SMART. This is especially true for iCOR, which QAS increased from 0.65 to 0.89. The same improvement in the shape can be observed for the other ACs, with a decrease in SAM values from 9.59 to 6.7, 7.07 to 5.44, 9.35 to 4.95, 4.99 to 4.73 for Sen2Cor, ACOLITE, iCOR, and NASA-AC, respectively. For χ 2 , the value decreases for all ACs of group 2 and ACOLITE. The most substantial change in χ 2 is the case of iCOR, where the value decreases from 0.375 to 0.046. Overall, with the exception of Polymer, iCOR, OC-SMART, and ACOLITE, the performance of the ACs tend to improve with respect to S tot value. iCOR shows the highest decrease in S tot from 7.7 to 3.37 and NASA-AC shows the highest increase in S tot from 14.36 to 17.13 (Tables 6 and 8).

Sensitivity to the Position of 3-by-3 Pixels Box
As mentioned in Section 2.4, the existence of research vessels, towers, and lighthouses at sea can cause interferences in the match-ups exercise, directly affecting the statistical results of the AC processing. Therefore, to address this issue, we studied the impact of shifting the 3-by-3 pixels box from the in situ measurements. The results presented in the previous sub-sections were estimated using a 3-by-3 pixel box centered on the in situ measurements, which is the standard protocol for classic ocean color images [37]. Herein, we shift the center from the location of the in situ measurements and we calculated the statistics for five different positions for each AC. Figure 7 illustrates the differences in the statistical parameters for two cases: (1) the extraction box is shifted to the right of the in situ measurement by two pixels; (2) the extraction box is shifted downward of the in situ measurements by two pixels. The statistical results show that the location of the 3-by-3 pixels box does not change much the statistical parameters, except for NASA-AC and Sen2Cor. For NASA-AC, in the case when the box is shifted to the right by two pixels, the Bias value increases steadily at 443, 490, and 560 nm bands by about 2-4.7%, especially for the Bias value at 665 nm, which increased by 10,3%. Meanwhile, in the case when the box is shifted downward by two pixels, the Bias value increases steadily at 490, 560, and 665 nm bands by about 3.8-5%, and its value increases the most at band 443 nm by 8%. Accordingly, the correlation between in situ R rs and NASA-AC estimates also substantially decreases-the strongest decrease being at 443 nm from 0.40 to 0.28 for the case when the box is shifted downward by two pixels, and from 0.54 to 0.40 at band 490 nm for the case when the box is shifted downward by two pixels. For Sen2Cor, the Bias value highly increases at 665 nm by around 40% for both cases. However, unlike NASA-AC, the correlation between the in situ R rs and the Sen2Cor AC estimates increases considerably, with values of R 2 increasing on average by 0.015 for the right-shifted case and by 0.03 for the down-shifted case for wavelengths from 443 nm to 560 nm, and decreases by 0.075-0.088 at 665 nm band for both cases.
For other cases of pixel extraction, as shown in Figure 7, the statistical parameters show no significant changes or stable trends. Table A1 shows the statistical indicators on the shape of the estimated R rs spectra depending on the position of the 3-by-3 pixels box. Except for the right-shifted and down-  OC-SMART shows the overall best performances on our in situ dataset. For the case of individual match-ups datasets, OC-SMART yields the lowest errors and bias at two bands 443 nm and 665 nm. R 2 also shows the higher values across all the bands with values ranging between 0.69 at 433 nm and 0.92 at 665 nm for the case of individual match-ups, and between 0.46 at 443 nm and 0.85 at 665 nm for the common match-ups case. The scatterplots for OC-SMART also show a satisfactory adjustment to the 1:1 line with low scattering for all wavelengths. These results show that the estimated R rs from OC-SMART is highly accurate compared to the in situ measurement data. For the scoring scheme, OC-SMART provides the highest S tot result, which is 16.89/20 for the individual, and the second-highest S tot value (14.76) for the common match-ups dataset.
NASA-AC shows good performances as demonstrated by the statistical parameters and the scoring scheme on the common match-ups dataset. Accordingly, for the individual match-up case, the estimated R rs show a good correlation with the in situ data with R 2 values ranging from 0.28 at 443 nm to more than 0.92 at 665 nm. The error and Bias values are also relatively low, which ranges from 31.6% at 443 nm to 56.3% at 665 nm and −3.59% at 443 nm to 31.30% at 665 nm, respectively. These values are 12% to 15% higher than OC-SMART. Especially, NASA-AC yields the highest S tot value (17.13) for the common match-up case. NASA-AC provides good estimations of the spectra of R rs , as shown in the values of SAM, QAS, and χ 2 . Overall, however, NASA-AC has a tradeoff that can be seen in the individual case. Despite the high efficiency in atmospheric correction, the ability to mask pixels with atmospheric noise components is still not good. This can be seen in the number of match-ups shown in Table 4. Accordingly, the number of valid match-ups after processing by NASA-AC is the lowest, and it is strongly reduced for high wavelengths (band 665 nm only has 9 valid match-ups in the individual match-up case). This result is similar to what was mentioned by Ilori et al. [58] and by Pahlevan et al. [10]. This low number of match-ups can be caused by the inhomogeneity of the estimated remote sensing reflectance (R rs ) in the 3-by-3 pixels box, as shown in Figure 8. Besides the low number of match-ups, a part of the problem can be caused by the fact that the NASA-AC methodology requires an initial correction process which assumes water-leaving radiances are null in the NIR region [17]. Thus, it may not be good to apply to some typical types of nearshore coastal water due to turbidity. NASA-AC can be considered as the second-best atmospheric correction algorithm on our in situ dataset. Polymer has good statistical results and relatively high scores in the scoring scheme. The correlation between the in situ and the estimated R rs is relatively high. This is shown by the value of R 2 , which gradually increases from 0.57 at 443 nm to 0.92 at 665 nm. This result is similar to those obtained by Pereira-Sandoval et al. [49], who validated Polymer using in situ data from field campaigns, obtaining R 2 values equal to 0.3 for 443 nm, 0.4 for 490 nm, 0.8 for 560 nm, and 0.93 for 665 nm, which shows an increase for the coefficient of variation value. Besides, those results are in the same trend of values as the ones observed here for both match-ups cases. The algorithm shows quite low performances in the blue band and the accuracy gradually increases in the green and red bands, as shown in Figures 3 and 4. From the statistical results, Polymer retrieves accurate spectra of R rs compared to a reference in situ dataset [55] and low spectra differences between the in situ measurements and the retrieved R rs with values of QAS and SAM of 0.87 and 5.99, respectively (Table 6). Overall, Polymer shows the value of S tot of 14.90 and 13.74 for individual and common match-ups datasets, respectively. Similar to Polymer, C2RCC is also an atmospheric correction algorithm with acceptable and stable performances, not affected by wavelength as much as NASA-AC. C2RCC has a relatively high correlation between the estimated R rs and the in situ measurements with R 2 ranging from 0.4 at 443 nm to nearly 0.88 at 665 nm. Similar to Polymer, C2RCC exhibits relatively low accuracy in the blue bands, and this accuracy increases with increasing wavelength. One reason for this can be the strong Rayleigh scattering in the blue band by atmospheric gases, which might not be properly corrected [49]. Therefore, to solve this problem, as mentioned in the study by Pereira-Sandoval et al. [49], C2RCC is being developed and tested to create a new version capable of more accurate correction for sun-glint in the blue wavelength range. Overall, C2RCC still shows an acceptable scoring scheme with the value of S tot are 13.31 and 14.11 for individual and common match-ups cases.
Polymer and C2RCC have quite similar statistical values for some parameters, such as relative error. However, for others, Polymer shows higher accuracies compared to C2RCC such as lower values of SAM values, which are 5.99 for Polymer and 8.76 for C2RCC (Table 6), and higher R 2 ranging from 0.57 to 0.92, while R 2 for C2RCC ranging only from 0.4 to 0.88. Compared to C2RCC, Polymer has a better ability to remove invalid pixels. This is shown by the number of available match-ups after the match-up selection process, as given in Table 4, wherein the available match-up number of Polymer is around 21 and 22, while this value for C2RCC ranges from 17 to 21. Therefore, Polymer is a more accurate atmospheric correction algorithm than C2RCC in our study.
This agrees with what was presented by Pahlevan et al. [10]. The study of Pahlevan has assessed the performance of a total of six different ACs on Landsat-8 and Sentinel-2. Over 1000 match-ups from seven different optical water types for both fresh and coastal water have been applied. The results of this study show that OC-SMART and NASA-AC are the two best atmospheric corrections for working in coastal waters. Polymer also provides high-quality estimates of the remote-sensing reflectance. However, overall, it does not surpass the performances of OC-SMART and NASA-AC [10].
Unlike other ACs in this group, ACOLITE shows lower performances. For the correlation between estimated and in situ R rs , R 2 values vary between 0.55 at 443 nm and 0.74 at 665 nm. It can be seen from this result that the correlation of ACOLITE R rs is low in the blue band. This also can be seen from scatterplots in Figure 3, with the regressed line for ACOLITE being distinguishable from the 1:1 line. SAM value of ACOLITE is the second-highest value (7.07) compared to the other atmospheric correction algorithms in this group. The S tot value of ACOLITE for the individual and common match case are 14.52 and 13.26, respectively. Although the S tot value of ACOLITE in the all match-up case is relatively high compared to C2RCC and NASA-AC, the statistical parameters of ACOLITE are lower than these 2 ACs. For instance, the RE values of ACOLITE at 443 and 655 nm are 2 times higher than that of C2RCC and NASA-AC. The Bias values of C2RCC and NASA-AC at 443 and 490 nm are negative and range from −11% to −2%, while the Bias values of ACOLITE at these two bands range from 22% to 53%. This low performance could be due to the ACOLITE's mechanism, which masks pixels containing clouds, fog, or cirrus. ACOLITE's masking process allows the removal of areas with cirrus clouds using cirrus detection from the 1375 nm band. These cirrus clouds pixels at 1375 nm are applied to mask the remaining bands of MSI. Therefore, as shown in Figure 9 of French Guiana scene on 28, November 2016, when some of the S2 satellite images at 1375 nm are almost entirely covered by cloud or haze, the masking process of ACOLITE will remove the whole scene. This will result in all extracted values being NaN. Overall, ACOLITE shows the lowest performances in the first group.

Comparison of the Second Group of AC: Sen2Cor and ICOR
The second group of AC (Sen2Cor and iCOR), in general, shows lower performances compared to the first group. This is understandable as both Sen2Cor and iCOR mechanisms mostly depend on the presence of land, as those algorithms are land-based method [54] and improved by Guanter [51] method for Sen2Cor and iCOR, respectively. These methods generate scene parameters requiring a distribution of pixels containing land. In the absence of land, these algorithms use default values of the aerosol optical thickness, which was the case for three match-ups in the English Channel and one match-up in French Guiana in this study. Therefore, it is suggested that the tested versions of these processors should better be applied for inland scenes.
In detail, in the second group, Sen2Cor shows relatively low performances in the visible bands, especially for the blue band. At 443 nm, R 2 is nearly 0.36, which indicates a low accuracy and correlation between estimated and in situ R rs , as suggested by Figure 4. The data points are highly scattered around the 1:1 line for bands between 443 and 560 nm. Furthermore, the total scheming score (S tot ) is very low (2.01 and 4.19 for individual and common match-ups cases, respectively) on our dataset. This suggests that this algorithm shows poor performances in reproducing the spectral shape of the in situ data. This poor performance could partly come from the fact that the approach of this algorithm is based on the "Dense Dark Vegetation", which means that the AC will perform better in the high presence of vegetation. One example is in Ruescas et al. [59] as the authors used Sen2Cor to correct the atmosphere in Albufera lagoon. They showed good accuracies in the retrievals and concluded that it was due to the high concentration of chlorophyll-a all year round in the lagoon, especially in spring and summer when there is an additional resonance of crop rice around the lagoon [59]. Therefore, in the condition of lesser vegetation, the accuracy of Sen2Cor can significantly decrease.
Compared to Sen2Cor, iCOR has a relatively high correlation between the estimated and in situ R rs , as shown by the value of R 2 which gradually increases from 0.44 at 443 nm to 0.86 at 665 nm, while the R 2 value of Sen2Cor varies from 0.36 at 443 nm to 0.61 at 665 nm. However, except for this statistical parameter, iCOR showed higher values in all other indicators such as Bias and RE in 665 nm, which is 30% higher compared to Sen2Cor. This leads to a low value of S tot (7.7 and 3.37 for individual and common dataset cases respectively). Notably, iCOR is the only algorithm that does the adjacency correction, which is SIMilarity Environmental Correction (SIMEC), before performing the atmospheric correction [43]. This means that in order to perform the correction process, iCOR needs two conditions to be met. First, one condition is to have at least one spectral band in the wavelength range between 680 nm and 730 nm, where strong reflections from land plants and weak reflections from water can be obtained. Second, there must be at least one reference band in the NIR region [60]. However, in the case of extremely high turbidity such as coastal waters, the reflectance spectrum of NIR varies considerably [61] and, maybe, this could be the reason for lower performances. Besides, the other reason can be that iCOR uses a fixed aerosol model for the rural area, which is not suitable for the coastal area, which needs a marine or pollution aerosol model [62]. Another reasonable explanation for these lower performances is that iCOR requires a combination of pure green vegetation and bare soil endmember, and when these conditions cannot be satisfied to calculate the AOT value, it is recommended to manually set the appropriate AOT value for the area [42]. However, in this study, instead of using a specific AOT value for each region, the AOT value used by iCOR was set to default.

Processing Time
In terms of the ease of using AC algorithms, with the processor of i7-7700HQ with 8 cores and 16 GB RAM, the time processing of OC-SMART with default setting is the fastest out of the seven algorithms. The normal process time OC-SMART takes for 1830 × 1830 pixels of Sentinel-2 is in the range of 150 to 450 s. ACOLITE with default setting is the second fastest algorithm, followed by Polymer and Sen2Cor. In fourth place, iCOR and C2RCC took approximately 40 to 50 min to process the whole Sentinel-2 scene. NASA-AC took the longest processing time, which could reach around 2 h with the "-2" option for aerosol that is based on Bailey et al. [17] method. Regardless of the time processing and also the performance, OC-SMART is highly recommended for atmospheric correction in the coastal area.

Conclusions
This research aimed to test and validate the performance of seven S2/MSI atmospheric correction processors by using in situ measurements collected in two contrasted French coastal water: the English Channel and French Guiana. We used statistical parameters on the absolute values of R rs and on the R rs spectra and a ranking scheme to evaluate the performances of each AC. The scoring scheme system applied in this study has been effective in evaluating and ranking atmospheric correction algorithms for coastal waters. This scheme was taken into consideration for scoring the AC processors and then ranking them with respect to their performances. The ranking system given in this article should be applied in atmospheric correction of coastal area.
The results showed that out of seven algorithms, OC-SMART and NASA-AC are the best suitable processor on our in situ dataset. This is similar with the conclusion in ACIX-Aqua research of Pahlevan et al. [10]. In detail, OC-SMART has the highest total score of 16.89/20, NASA-AC is the second most suitable AC with the value S tot is 17.13/20 in the case of common match-ups dataset. However, there is a tradeoff for using NASA-AC since its available match-ups number is considerably lower compared to the other ACs. Polymer and C2RCC are, respectively, third and fourth in the ranking system. However, the accuracy of these algorithms is not high in the blue band (443 nm) and increases towards the red and green bands. This trend can also be observed for the other ACs algorithms. ACOLITE is the fifth in the ranking system. The two left ACs show lower performances on our dataset, especially in the blue band. It is suggested that the tested versions of these two left ACs should be better applied to inland scenes.
We tested the sensitivity of the performances on the position of the extraction box to the in situ measurement location. As the size of the structure that helped to collect the in situ measurements can be substantial to the pixel size of S2/MSI, it is necessary to study the impact of these structures on the performances of the AC. However, most of the results obtained for the data extraction methods tested in this study did not show significant changes, except for the case of right-shifted and down-shifted boxes, with a small improvement in the overall scoring scheme results. Therefore, it is not possible to conclude with certainty whether this right shift box or down shift box test method are better than the conventional data extraction method. This might be because our seawaters were too homogeneous or because we processed all the MSI images at a 60-m spatial resolution. However, it is necessary to consider the size of the measurements platforms when validating MSI images over the ocean.
We recommend using the ranking system result of this research over our regions of interest or similar regions.