Retrieval of Chlorophyll a from Sentinel-2 MSI Data for the European Union Water Framework Directive Reporting Purposes

The European Parliament and The Council of the European Union have established the Water Framework Directive (2000/60/EC) for all European Union member states to achieve, at least, “good” ecological status of all water bodies larger than 50 hectares in Europe. The MultiSpectral Instrument onboard European Space Agency satellite Sentinel-2 has suitable 10, 20, 60 m spatial resolution to monitor most of the Estonian lakes as required by the Water Framework Directive. The study aims to analyze the suitability of Sentinel-2 MultiSpectral Instrument data to monitor water quality in inland waters. This consists of testing various atmospheric correction processors to remove the influence of atmosphere and comparing and developing chlorophyll a algorithms to estimate the ecological status of water in Estonian lakes. This study shows that the Sentinel-2 MultiSpectral Instrument is suitable for estimating chlorophyll a in water bodies and tracking the spatial and temporal dynamics in the lakes. However, atmospheric corrections are sensitive to surrounding land and often fail in narrow and small lakes. Due to that, deriving satellite-based chlorophyll a is not possible in every case, but initial results show the Sentinel-2 MultiSpectral Instrument could still provide complementary information to in situ data to support Water Framework Directive monitoring requirements.


Introduction
The Water Framework Directive (2000/60/EC) (WFD) obligates all European Union (EU) member states to implement water management and to estimate the ecological status of water bodies through monitoring and classification [1]. The WFD aims to achieve at least "good" ecological status of inland waters by 2027 through a program of measures, or to maintain "good" status where it already exists [2]. Ecological status is classified into five categories, from "very good" to "very bad", where "good" indicates only a very slight deviation from reference conditions due to human activity [3,4].
Consistent monitoring of water bodies is essential for fulfilling the EU WFD, but traditional in situ monitoring (e.g., water sample collection and later analysis in the laboratory) is a time-consuming and costly way to estimate water quality on a regular basis. Since the late 1970s, following successful ocean color satellite missions, the use of remote sensing technologies has increased [5][6][7], providing possibilities for observing water bodies regularly, locally and globally [8]. The advantages of remote sensing over traditional monitoring methods are temporal and spatial coverage and the possibility to estimate water quality in inaccessible water bodies [6,9,10]. Furthermore, the possibility to compose time series from historical data allows changes in water quality to be evaluated over time [11]. However, passive satellite remote sensing depends on weather, air mass changes, and sunlight conditions, which directly affect the quality and quantity of useful data.
Chlorophyll a (chl a) is the main pigment in phytoplankton and one of the key parameters of the WFD, as it indicates the trophic status of water. Through photosynthesis, phytoplankton converts CO2 and H2O into O2 and is responsible for primary production in the water column [6,12]. In addition, chl a is the main indicator of phytoplankton biomass [13][14][15] and can be used to determine water clarity [9]. Phytoplankton blooms are natural processes in the aquatic environment and reflect the normal functioning of a water ecosystem [16]. However, toxic cyanobacterial blooms and excessive blooms caused by human impact create environmental problems that affect inland waters directly, by reducing water quality, and indirectly, by restricting the use of water for drinking, fishing, and swimming [17].
Satellite remote sensing has been used as a valuable tool for supporting the implementation of the WFD, by deriving phytoplankton and cyanobacterial pigments such as chl a and phycocyanin (PC), total suspended matter (TSM), colored dissolved organic matter (CDOM), and the spectral attenuation coefficient (Kd) [18]. Gohin et al. [19] compared in situ data for 74 coastal water bodies with Sea-viewing Wide Field-of-view Sensor (SeaWiFS) images from the period 1998 to 2004 to estimate annual cycles of fortnightly chl a concentration and thereby assess the ecological status of water by chl a, as required by the WFD. However, the results showed notable differences between in situ data and satellite data, which could have been caused by the influence of the coastline, the resolution, or approximations in the methods used. For large lakes, Medium Resolution Imaging Spectrometer (MERIS) images have been successfully used for estimating the water quality parameters required by the WFD over optically very different lakes, e.g., in perialpine lakes [20] and in oligotrophic to hypereutrophic Nordic lakes [21]. The results showed that it is possible to estimate the trophic status of different water bodies and the trends of chl a seasonal dynamics. Based on a study of Swedish lakes, Philipson et al. [22] pointed out that the spatial resolution of MERIS (300 m) is insufficient for monitoring the smaller lakes covered by the WFD. The Ocean and Land Colour Instrument (OLCI) onboard Sentinel-3 (S3), as a continuation of MERIS, is well suited to monitoring large surface water bodies because of its similarity to the MERIS sensor and the ability to use previously developed algorithms [23]. The MultiSpectral Instrument (MSI) onboard Sentinel-2 (S2), with high resolution (up to 10 m), has shown suitability for detecting cyanobacterial blooms and retrieving chl a concentrations in subalpine lakes [24]. However, further development of atmospheric correction (AC) algorithms and larger validation datasets over optically complex waters are still needed [25]. The EU's Copernicus Program provides full and free access to quality-controlled data, which is an important source of information for environmental monitoring and consistent technological improvement [26].
Different aspects have to be considered when deriving water quality parameters from remotely sensed data. According to Morel and Prieur [27], Case 1 waters are mostly dominated by phytoplankton, whereas Case 2 waters, with varying concentrations of optically active substances (OAS) (chl a, CDOM, TSM), are more complex when deriving water quality parameters. This is due to high and independent absorption and scattering by all OAS [15,28,29]. In addition, as 90% of the signal that reaches the sensor is affected by absorption and scattering by gases (water vapour, ozone, oxygen, carbon dioxide) and aerosols in the atmosphere, atmospheric correction (AC) is an essential procedure. Optical sensors measure light reflected from the atmosphere and from the surface of the water body at visible (VIS) and near-infrared (NIR) wavelengths. AC processors remove the signal scattered by the atmosphere and retrieve the signal from the water's surface, which is called water-leaving reflectance (ρw) [6,30]. For Case 1 waters, AC algorithms assume that the water-leaving radiance (L) is zero in the NIR part of the spectrum. This assumption is not valid in turbid Case 2 waters, because scattering by particles increases ρw in the NIR part, and it causes over-correction in the visible part of the ρw spectrum [31,32]. In small narrow lakes or in the vicinity of the coast, the adjacency effect also influences the reflected light field, because pixels are affected by the signal originating from the surrounding land [33]. Pixels from the coastline are brighter than water pixels, and the multiple scattering precludes accurate derivation of water quality parameters [34].
As chl a has absorption peaks in the VIS part of the spectrum, it is possible to estimate chl a in remote sensing applications through model-based or empirical approaches. A model-based approach uses bio-optical models to simulate ρw or the top-of-atmosphere (TOA) radiance spectrum with specified water constituents. The empirical approach is more widely used; it is based on band ratios and, therefore, might have a lower sensitivity, but it is easier to develop and apply [6,32].
For Case 1 oceanic waters, the Blue-Green Two-Band Ratio Model (Equation (1)) algorithms work most successfully, because of the phytoplankton domination [13]. The model uses blue-green wavelengths (440-550 nanometers (nm)) because the first absorption peak of chl a is around 440 nm (R(λ_blue)) and minimal absorption is around 550 nm (R(λ_green)) [27]:

$$\text{chl } a \propto \frac{R(\lambda_{blue})}{R(\lambda_{green})} \qquad (1)$$

For Case 2 waters, algorithms at blue-green wavelengths fail because of absorption and scattering due to CDOM and TSM [35]. Due to the second chl a absorption peak near 675 nm, the Two-Band NIR-Red Ratio Model (Equation (2)) [36] is widely used:

$$\text{chl } a \propto \frac{R(\lambda_{NIR})}{R(\lambda_{red})} \qquad (2)$$

where R(λ_red) is located in the range of maximum chl a absorption between 660 and 690 nm (R(λ1)) and R(λ_NIR) characterizes the range of wavelengths between 700 and 720 nm (R(λ2)), or wavelengths beyond 710 nm (R(λ3)) [10,37].
For more turbid and productive waters, which contain a wide range of OAS, the Three-Band NIR-Red Ratio Model (Equation (3)) algorithm has been developed [38]. The Three-Band NIR-Red Ratio Model algorithm uses the same wavelengths as the Two-Band NIR-Red Ratio Model algorithm, where R(λ1) is maximally sensitive to the absorption peak of chl a, R(λ2) is minimally sensitive to the absorption of chl a and R(λ3) is minimally affected by the absorption of chl a [39]:

$$\text{chl } a \propto \left[ R^{-1}(\lambda_1) - R^{-1}(\lambda_2) \right] \times R(\lambda_3) \qquad (3)$$
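The band-ratio models above reduce to a few arithmetic operations on band reflectances. The following is a minimal sketch, assuming the commonly cited forms of the blue-green, two-band and three-band models; the function names and the example reflectance values are hypothetical, and the mapping of λ1, λ2, λ3 to MSI bands at 665, 705 and 740 nm is an assumption for illustration, not the band selection made in this study. The functions return only the spectral index; converting it to a chl a concentration requires an empirical regression against in situ data.

```python
import numpy as np


def blue_green_ratio(r_blue, r_green):
    """Blue-green two-band index: R(blue) / R(green)."""
    return np.asarray(r_blue, dtype=float) / np.asarray(r_green, dtype=float)


def two_band_nir_red(r_red, r_nir):
    """Two-band NIR-red index: R(NIR) / R(red)."""
    return np.asarray(r_nir, dtype=float) / np.asarray(r_red, dtype=float)


def three_band_nir_red(r1, r2, r3):
    """Three-band NIR-red index: [1/R(l1) - 1/R(l2)] * R(l3)."""
    r1 = np.asarray(r1, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    r3 = np.asarray(r3, dtype=float)
    return (1.0 / r1 - 1.0 / r2) * r3


# Hypothetical reflectances at 665, 705 and 740 nm (assumed band mapping).
rho_665, rho_705, rho_740 = 0.010, 0.015, 0.005
two_band_index = two_band_nir_red(rho_665, rho_705)
three_band_index = three_band_nir_red(rho_665, rho_705, rho_740)
```

Because the functions accept arrays as well as scalars, the same calls can be applied to whole image bands once these are resampled to a common grid.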
For highly turbid waters, Le [40] further developed the Three-Band NIR-Red Model by adding R(λ4) to the equation. In the Four-Band NIR-Red Model (Equation (4)) algorithm, the fourth band, located at NIR wavelengths, should minimize the impact of absorption and backscattering of TSM in R(λ3):

$$\text{chl } a \propto \frac{R^{-1}(\lambda_1) - R^{-1}(\lambda_2)}{R^{-1}(\lambda_4) - R^{-1}(\lambda_3)} \qquad (4)$$

Additionally, various so-called line height algorithms have been developed. For detecting surface blooms and near-surface vegetation in coastal and ocean waters, the Maximum Chlorophyll Index (MCI) (Equation (5)) has been used with the MERIS sensor; it is based on the height of the peak at 709 nm and is used with chl a over 10 mg/m³ [41]:

$$MCI = L_{709} - L_{681} - 0.389 \times (L_{753} - L_{681}) \qquad (5)$$

In the MCI algorithm, L represents TOA radiance at the specific wavelengths, and the coefficient 0.389 represents the ratio of wavelength differences (709 − 681)/(753 − 681). According to Gower [41], the MCI algorithm can also be used effectively with ρw values instead of radiances.
The Fluorescence Line Height (FLH) (Equation (6)) algorithm is the most suitable for waters where the chl a concentration is 1-20 mg/m³. It measures the height of the chl a fluorescence peak near 685 nm above a linear baseline drawn between two adjacent bands [42]:

$$FLH = L_{681} - L_{665} - 0.364 \times (L_{709} - L_{665}) \qquad (6)$$

In the FLH algorithm, L681 is the water-leaving radiance at the MERIS band nearest the fluorescence maximum, and the coefficient 0.364 represents the ratio of wavelength differences (681 − 665)/(709 − 665).
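Both line-height indices follow the same pattern: the signal at a peak band minus a linear baseline interpolated between two neighbouring bands. The sketch below illustrates this under that assumption; the helper name `line_height` is hypothetical, and the weights are computed from the band centres rather than hard-coded, so they reproduce the 0.389 and 0.364 coefficients quoted in the text.

```python
def line_height(l_left, l_peak, l_right, wl_left, wl_peak, wl_right):
    """Height of the peak band above the straight line joining the two
    baseline bands (inputs may be radiances or reflectances)."""
    weight = (wl_peak - wl_left) / (wl_right - wl_left)
    baseline = l_left + weight * (l_right - l_left)
    return l_peak - baseline


def mci(l681, l709, l753):
    """Maximum Chlorophyll Index: peak at 709 nm over a 681-753 nm baseline."""
    return line_height(l681, l709, l753, 681.0, 709.0, 753.0)


def flh(l665, l681, l709):
    """Fluorescence Line Height: peak at 681 nm over a 665-709 nm baseline."""
    return line_height(l665, l681, l709, 665.0, 681.0, 709.0)


# The interpolation weights match the coefficients quoted in the text.
mci_weight = (709.0 - 681.0) / (753.0 - 681.0)  # about 0.389
flh_weight = (681.0 - 665.0) / (709.0 - 665.0)  # about 0.364
```

A spectrum that is exactly linear across the three bands gives a line height of zero, which is a convenient sanity check when adapting the band centres to another sensor.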
Estonian inland waters are classified as Case 2 waters because of the high amount of different OAS. According to the EU WFD regulations, 89 lakes should be monitored regularly in Estonia (Figure 1); these are divided into eight groups by the type of water body (Table 1). Small Estonian lakes belong to Type 1 to Type 5, the two biggest lakes, Peipsi and Võrtsjärv, each form their own group, and coastal lakes belong to Type 8. To determine the ecological status of a water body using in situ measurements, water samples for phytoplankton and physico-chemical background are collected from inland waters monthly from May to September, except in Peipsi and Võrtsjärv, where samples are gathered monthly from April to October and from July to August, respectively [4,43]. For estimating the ecological status of water, each indicator (biological, physico-chemical, and hydromorphological) has been associated with certain thresholds that assign an ecological status class. The ecological status of a lake is therefore the combined result of each quality indicator, and type-specific conditions need to be considered. Type-specific thresholds for chl a according to the Estonian regulation of Water Act [4] are shown in Table 2. At least seven quality parameters should be considered, and all the parameters are equally important for estimating the ecological status of water [4].
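Assigning a status class from chl a amounts to locating a concentration between type-specific class boundaries. The sketch below shows this lookup only; the threshold values are hypothetical placeholders and not the regulatory values of Table 2, the three intermediate class labels and the convention that a value equal to a boundary falls into the better class are assumptions, and the function name is illustrative.

```python
from bisect import bisect_left

# Five classes from best to worst; the end labels follow the text,
# the intermediate labels are assumed.
STATUS_CLASSES = ("very good", "good", "moderate", "poor", "very bad")


def status_from_chl_a(chl_a, upper_bounds):
    """Return the status class for a chl a value (mg/m^3).

    upper_bounds holds the four ascending upper limits of the first
    four classes for one lake type.
    """
    bounds = list(upper_bounds)
    if len(bounds) != 4 or bounds != sorted(bounds):
        raise ValueError("expected four ascending class boundaries")
    # bisect_left keeps a value equal to a boundary in the better class.
    return STATUS_CLASSES[bisect_left(bounds, chl_a)]


# Hypothetical placeholder thresholds for a single lake type.
example_bounds = (5.0, 10.0, 20.0, 40.0)
example_status = status_from_chl_a(15.2, example_bounds)
```

In practice one set of boundaries would be stored per lake type, and the chl a result would be combined with the other quality parameters before a final status is reported.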
Table 2. Thresholds for chlorophyll a (chl a) (mg/m³) for estimating the ecological status of water for different lake types. Each ecological status class is represented by defined colors [1]. Thresholds are defined according to the Estonian regulation of Water Act [4].

Due to the small water surface area of the Estonian inland waters, high spatial resolution of the sensor is required. The spatial resolution of the MSI gives an advantage over ocean color satellites, with the possibility to also monitor smaller water bodies. A similar sensor to the MSI is the Operational Land Imager (OLI) onboard NASA's satellite Landsat-8 (launched 2013), which, in addition to land data, provides data from aquatic systems. The spatial resolution of OLI is 30 m, but its temporal resolution is 16 days, which is not sufficient for regular monitoring [45].
The S2 mission was originally designed for monitoring land cover changes and is composed of two identical satellites: S2A was launched in 2015 and S2B in 2017. The S2 mission also includes satellites S2C and S2D, which are planned to be sent into orbit in the next decade. The MSI sensor measures in 13 spectral bands from 443 to 2190 nm with spatial resolutions of 10, 20, and 60 m and with a 12-bit radiometric resolution [46]. The MSI is a similar sensor to OLI, but has advantages over OLI due to the higher revisit frequency of S2 and more spectral bands at both visible and NIR wavelengths [47]. Compared to OLCI, the signal-to-noise ratio (SNR) of S2 is lower, and the position and width of the bands are different. The OLCI band for deriving chl a is located in the middle of the second chl a absorption peak (the centre of the band is 674 nm), so the absorption by chl a in the water is easily detectable (Figure 2). The chl a absorption band of MSI is wider (width 38 nm) than the corresponding band of the OLCI sensor (width 7.5 nm) and might therefore be less sensitive to the absorption peak at 675 nm than OLCI.
The purpose of this study is to test the suitability of the S2 MSI for monitoring small lakes with a wide range of OAS. The three specific objectives are: (1) to compare different AC processors and validate the results against in situ measurements to find the most suitable; (2) to adjust previously developed chl a algorithms to MSI bands and develop algorithms for optically different water types; and (3) to derive the ecological status of water based on chl a as required by the WFD.

In Situ Data
Three sets of data were used to (1) test the accuracy of S2 MSI radiometric products; (2) develop the empirical chl a algorithms for S2 MSI, and (3) derive a chl a time series over selected lakes.
First, 13 match-up points with in situ measured radiometric data covering the period 2015-2017 were analyzed to compare the AC processors and identify the most accurate one. These match-ups represent a variety of Case 2 inland waters in which OAS vary over a wide range (a detailed description of each individual match-up is provided in Table 3 under the Results and Discussion section). Radiometric measurements were performed with above-water RAMSES TriOS radiometers, which measure upwelling radiance Lu(λ), downwelling radiance Ld(λ) and downwelling irradiance Ed(λ). The field measurement protocol and the derivation of ρw are based on published protocols [48,49].
The ρw (Equation (7)) is calculated as:

$$\rho_w(\lambda) = \pi \, \frac{L_u(\lambda) - \rho_{sky} L_d(\lambda)}{E_d(\lambda)} \qquad (7)$$

where ρsky is the air-water interface reflection coefficient, which depends on wind speed W (m/s): ρsky = 0.0256 + 0.00039W + 0.000034W² [48].

Second, to compare, develop and adapt empirical chl a algorithms to S2 MSI bands, in situ data collected during the FP7 GLaSS (Global Lakes Sentinel Services (313256)) project were used (hereafter GLaSS dataset). This dataset consists of simultaneous radiometric measurements (processed with the MSI Spectral Response Function (SRF)), Inherent Optical Properties (IOPs), chl a, and TSM data measured over optically different lakes, and comprises 412 data points from Estonia, Finland, The Netherlands, and Italy. More information about the dataset can be found in Reference [50]. As there are only a few match-ups with S2 MSI, the in situ measured GLaSS dataset was used to develop the empirical chl a algorithms for different water types.
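The derivation of ρw from the three radiometer channels can be illustrated with a short sketch. It assumes the widely used above-water formulation in which the sky-glint term ρsky·Ld is subtracted from Lu and the result is normalized by Ed and multiplied by π; the wind-speed polynomial is the one quoted in the text. The function names and the example numbers are hypothetical.

```python
import numpy as np


def sky_reflectance_coefficient(wind_speed):
    """Air-water interface reflection coefficient for wind speed W in m/s."""
    w = float(wind_speed)
    return 0.0256 + 0.00039 * w + 0.000034 * w ** 2


def water_leaving_reflectance(l_u, l_d, e_d, wind_speed):
    """rho_w = pi * (L_u - rho_sky * L_d) / E_d, evaluated per wavelength.

    l_u, l_d and e_d are scalars or arrays sampled on the same wavelengths.
    """
    l_u = np.asarray(l_u, dtype=float)
    l_d = np.asarray(l_d, dtype=float)
    e_d = np.asarray(e_d, dtype=float)
    rho_sky = sky_reflectance_coefficient(wind_speed)
    return np.pi * (l_u - rho_sky * l_d) / e_d


# Hypothetical single-wavelength example at 5 m/s wind speed.
rho_w_example = water_leaving_reflectance(0.010, 0.100, 1.000, wind_speed=5.0)
```

The resulting hyperspectral ρw would still need to be weighted by the MSI spectral response function before it can be compared with satellite bands.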
Third, to derive the seasonal dynamics of chl a in different types of lakes for the ecological status class estimation, chl a data (chl a measured from water samples) from the Estonian National Monitoring database were used for the period 2015-2017. For this dataset, water samples were collected from the surface layer, chl a samples were filtered through Whatman GF/F filters (ø 25 mm), and pigment extraction was done with 5 mL of 96% ethanol. Chl a was measured spectrophotometrically with a Hitachi U-3010 (430-750 nm), and the concentrations of chl a were derived according to Reference [51].

S2 MSI Data
S2 MSI Level-1C (L1C) (processing baseline 02.02, 02.04 or 02.05) images were downloaded from Copernicus Open Access Hub (https://scihub.copernicus.eu/) for the period 2015-2017. Match-ups were selected by allowing a difference of ±3 days between the S2 MSI overpass and the in situ measurements. S2 MSI SRF (v3.0) was applied to in situ ρw. Downloaded images were processed with SNAP (v5), a free and open toolbox for processing data from the Sentinel missions, developed by Brockmann Consult, Array Systems Computing and Communication et Systèmes (C-S). As spatial resolution varies between bands, all bands were resampled to 60 m with the Resampling (v2.0) tool, so that various chl a algorithms could be tested and each AC processor started from the same base. To use only cloud-free pixels, pixel types were identified on L1C images with IdePix (v2.2). Images were processed with the AC processors ACOLITE (v20180925.0), C2RCC (v0.15), POLYMER (v1.1), and Sen2Cor (v2.1.2). The investigated S2 MSI bands are B1 (443 nm) to B7 (783 nm), as these are the main bands for developing chl a algorithms for Case 1 and Case 2 waters.
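The ±3 day match-up rule can be expressed as a simple date comparison. The following is a minimal sketch, assuming that each field date is paired with the closest overpass inside the window; that tie-breaking choice and the function name `find_matchups` are assumptions for illustration, while the example dates are those given later in the text for the August 2016 fieldwork.

```python
from datetime import date, timedelta


def find_matchups(in_situ_dates, overpass_dates, max_days=3):
    """Pair each in situ date with the nearest overpass within +/- max_days.

    Returns a list of (in_situ_date, overpass_date) tuples; field dates
    without an overpass inside the window are omitted.
    """
    window = timedelta(days=max_days)
    pairs = []
    for sampled in in_situ_dates:
        candidates = [o for o in overpass_dates if abs(o - sampled) <= window]
        if candidates:
            nearest = min(candidates, key=lambda o: abs(o - sampled))
            pairs.append((sampled, nearest))
    return pairs


# Fieldwork on 25 August 2016 against an overpass on 28 August 2016.
matchups = find_matchups([date(2016, 8, 25)], [date(2016, 8, 28), date(2016, 9, 7)])
```

A wider window yields more match-ups but increases the chance that water conditions changed between sampling and overpass, so the window length is a trade-off rather than a fixed constant.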

ACOLITE
The ACOLITE processor was developed for coastal and inland waters and is applicable for processing high-resolution Landsat 8 OLI and S2 MSI images to give results over extremely turbid, narrow, and small water bodies. The processor uses the Dark Spectrum Fitting (DSF) approach [52] and outputs ρw at specific sensor wavelengths (different algorithms for IOPs, chl a, and TSM are optional outputs). For this study, only water pixels with no flags were used [53,54].

C2RCC
The C2RCC (Case 2 Regional CoastColour) processor was developed for optically complex Case 2 waters and uses a large database of simulated ρw and TOA radiances. It is based on neural network technology and has been trained over extreme ranges of scattering and absorption properties. C2RCC outputs ρw, IOPs, chl a, and TSM and provides the possibility to add background information such as salinity, elevation, ozone, temperature, and air pressure. For this study, the salinity was set to 0.0001, which differs from the default setting. To obtain accurate results, weather and atmosphere parameters were taken from the European Centre for Medium-Range Weather Forecasts (ECMWF), and pixels flagged VALID_PE (the operator's valid pixel expression has resolved to true) and RHOW_OOR (one of the inputs to the IOP retrieval neural net is out of training range) were used for analyzing ρw [55].

POLYMER
POLYMER (POLYnomial based algorithm applied to MERIS) was originally developed for MERIS data to remove the influence of sun glint and retrieve ocean color parameters and the ρw spectrum, but it has been extended to other sensors, such as S2 MSI. POLYMER uses a spectral matching method based on a polynomial atmospheric model and a bio-optical water reflectance model, and outputs ρw, chl a, TSM, IOPs, the backscattering coefficient of non-covarying particles, quality flags, and the reflectance of the sun glint. Valid pixels were those with a flag value of 0, i.e., water pixels with no flags raised [56,57].

Sen2Cor
Sen2Cor is an AC processor for vegetation applications and scene classification, designed to generate L2 products from S2 MSI data. Sen2Cor relies on a large database of look-up tables and an atmospheric radiative transfer model and is able to classify scenes into 12 classes (clouds, cloud shadows, vegetation, snow, water, cirrus, etc.). It outputs bottom-of-atmosphere (BOA) reflectance images, together with aerosol optical thickness, water vapour, scene classification and quality indicators, such as cloud and snow probability. For this study, only water pixels were used [58].

Statistical Analysis
For the statistical analysis, ρw data from each AC processor were used to analyze the difference between satellite-derived and in situ measured ρw. The mean of a 3 × 3 pixel area was used, as it improves the SNR and also increases the probability of retrieving a value from the processed S2 data. Each statistic was calculated for specific MSI bands. The following statistics were applied: the coefficient of determination (R²) (Equation (8)), the average absolute percentage difference (ψ) (Equation (9)), the root-mean-square difference (∆) (Equation (10)), the bias (δ) (Equation (11)), slope (S) and intercept (I). An S near one and an I near zero indicate that the S2 MSI observations fit well with in situ measurements. In the equations, x_i is the i-th in situ measurement, y_i is the i-th S2 MSI derived value, and N is the number of match-ups [59].
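The match-up statistics can be sketched in a few lines. The exact forms of Equations (8)-(11) are not reproduced in this text, so the definitions below are commonly used ones and should be read as assumptions: R² as the squared Pearson correlation, ψ as the mean of |y − x|/x in percent, ∆ as the root-mean-square of y − x, δ as the mean of y − x, and S and I from an ordinary least-squares fit of y on x. The function names are hypothetical.

```python
import numpy as np


def window_mean_3x3(band, row, col):
    """Mean of the 3 x 3 pixel window centred on (row, col), ignoring NaNs.
    The window is clipped at the image edges."""
    band = np.asarray(band, dtype=float)
    window = band[max(row - 1, 0):row + 2, max(col - 1, 0):col + 2]
    return float(np.nanmean(window))


def matchup_statistics(x, y):
    """Statistics for one band; x = in situ values, y = satellite values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    diff = y - x
    slope, intercept = np.polyfit(x, y, 1)  # least-squares line y = S*x + I
    r = np.corrcoef(x, y)[0, 1]             # Pearson correlation
    return {
        "N": int(x.size),
        "R2": float(r ** 2),
        "psi": float(100.0 * np.mean(np.abs(diff) / x)),
        "rmsd": float(np.sqrt(np.mean(diff ** 2))),
        "bias": float(np.mean(diff)),
        "S": float(slope),
        "I": float(intercept),
    }


# Hypothetical example: satellite values twice the in situ values.
stats = matchup_statistics([0.01, 0.02, 0.03], [0.02, 0.04, 0.06])
```

In this example the fit is perfectly linear (R² of one) while S is two and ψ is 100%, which shows why slope, intercept and the difference statistics need to be read together with R².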

Validation of Water-Leaving Reflectance
Four AC processors were applied and compared to derive ρw from S2 MSI data. Altogether 13 match-ups from nine different water bodies (Table 3) were analyzed.
Figure 3 compares in situ measured ρw with ρw derived by the various AC processors after processor-based flagging. The fewest retrievals were obtained over highly absorbing small inland lakes (Jõemõisa, Kaiu). Results are better for points further away from the shore (e.g., Peipsi_11). A missing ρw spectrum for a given AC processor means that the processor returned no result. In the following paragraphs, the results from Figure 3 are explained and linked with each lake's characteristics based on the S2 MSI overpass.

Jõemõisa, Kaiu, Verevi and Pangodi
The S2 MSI images over the four small lakes Jõemõisa (Figure A1), Kaiu (Figure A1), Verevi (Figure A2) and Pangodi (Figure A3) originate from the 28th of August 2016, and the fieldwork was performed on the 25th of August 2016. Both Jõemõisa and Kaiu are classified as Type 2 (non-stratified, color dark/light) waters according to the WFD (Table 1). These two lakes formed one big lake centuries ago, but became separate lakes as the water level decreased. Both lakes are eutrophic with high chl a (>21 mg/m³) and very high a_CDOM(442) (>10 m⁻¹) (Table 3), due to their location in a bog area, which results in a yellow-brown water color with low transparency (0.8 m) [60,61]. The in situ ρw spectrum is relatively low, with a distinctive peak at 705 nm due to strong absorption by CDOM and high chl a. Of all the AC processors, only C2RCC provided results for these small lakes; the other AC processors did not derive any ρw values under these conditions. The shape of the C2RCC derived ρw is similar to the in situ measured ρw, but ρw is underestimated in both lakes and the peak due to high chl a absorption is absent.
Verevi belongs to Type 3 (stratified, color dark/light) according to the WFD (Table 1). It is a eutrophic lake (chl a 31 mg/m³) with a mostly swampy coastal area, yellow water color, and moderate transparency (1.4 m) (Table 3) [60,61]. C2RCC strongly underestimates the in situ measured ρw and does not show the chl a absorption peak at 675 nm (Figure 3). ACOLITE overestimates the in situ measured ρw, but reproduces the chl a absorption peak seen in the in situ ρw spectrum. NIR bands are very strongly overestimated by ACOLITE, which could be caused by the adjacency effect. Problems with the other AC processors could similarly be caused by the adjacency effect, which influences pixels near the coastline. High CDOM absorption could be a second reason, because the processors are not able to derive ρw for pixels with a low level of backscattered light.
Pangodi (WFD Type 3) is located between agricultural and forestry areas in the south-east of Estonia, where spring waters are very important sources of nutrient-rich inflow. Pangodi is a eutrophic lake with moderate chl a (15.2 mg/m³), relatively low TSM (4.2 mg/m³), and relatively low a_CDOM(442) (1.3 m⁻¹) (Table 3). The color of the water is yellow-green or green-yellow with low to moderate transparency (1.7 m) [60,61]. Compared to the previous lakes, the in situ measured ρw is higher in Pangodi (Figure 3), because of a lower amount of chl a and CDOM. The lower absorption by OAS and the larger water surface area could be the reason that C2RCC, Sen2Cor, and ACOLITE all provided results. The shapes of the ρw spectra of all three AC processors are similar to the in situ measured ρw spectrum, but ρw is underestimated by C2RCC and Sen2Cor and the chl a absorption peak at 675 nm is not derived. ACOLITE estimated the in situ measured ρw very accurately at blue and green wavelengths, but strongly overestimated it at NIR wavelengths, similarly to the Verevi spectrum. POLYMER did not derive any ρw values for Pangodi.

Kirikumäe, Murati and Hino
There was a same-day match-up on the 30th of August 2017 for lakes Kirikumäe (Figure A4), Murati (Figure A5), and Hino (Figure A6). According to the WFD, these lakes are classified as Type 5 (non-stratified, water color light), Type 3 (stratified, water color dark/light), and Type 2 (non-stratified, water color dark/light), respectively (Table 1). Kirikumäe is surrounded by low and swampy areas, which makes it a rare non-stratified eutrophic semi-humus lake in which concentrations of OAS are high (Table 3). The color of the water is yellow to brown-yellow with low to moderate transparency. Murati is a narrow yellow-colored lake, surrounded by agricultural clay soil and sandy forestry areas. The lake is eutrophic (chl a > 20 mg/m³) with a_CDOM(442) over 10 m⁻¹ (Table 3), which makes the transparency of the water low [60,61]. Due to the high CDOM absorption, the in situ measured ρw is low, which agrees with the low ρw values from C2RCC and Sen2Cor (Figure 3). Although both AC processors derive ρw with a shape similar to the in situ data, they slightly overestimate the blue part and underestimate the green-red part of the in situ measured ρw spectrum and do not derive the chl a absorption peak. ACOLITE overestimates the in situ measured ρw spectrum, except at 443 nm, and the shape of its ρw is not similar to the in situ measured spectrum. NIR bands are strongly overestimated, as was also seen for the previous lakes. POLYMER is not able to derive ρw for waters with a small surface area dominated by high chl a and CDOM.
Hino has a large surface area and is located between forest areas. The amount of OAS is relatively low (Table 3), except for high TSM (10.7 mg/m³). However, the water of Hino is described as transparent, with a light to yellow-green water color [60,61]. As TSM is high, the in situ measured ρw spectrum is higher than in the previous lakes. The larger surface area and low OAS made it possible to obtain results not only from ACOLITE, C2RCC, and Sen2Cor but also from POLYMER, which estimates ρw at 560 nm close to the in situ measured ρw (Figure 3), although the blue part is overestimated and the red part underestimated. The shapes most similar to the in situ measured ρw spectrum are those from C2RCC and ACOLITE, but C2RCC, like Sen2Cor, strongly underestimates the in situ ρw, and ACOLITE overestimates it, except in the blue bands.

Otepää Valgjärv
There was a same-day match-up on the 28th of August 2017 for Otepää Valgjärv (Figure A7). It is classified as Type 2 (non-stratified, color dark/light) based on the WFD (Table 1). This green-yellow colored lake is surrounded by forest areas on one side and agricultural areas on the other. It is a eutrophic lake, where chl a is above 25 mg/m³ and TSM is also over 25 mg/m³ (Table 3), which makes the lake's transparency low [60,61]. The in situ measured ρw shows a strong absorption peak at 675 nm (Figure 3), which is not detected by any of the AC processors except ACOLITE, although ACOLITE strongly overestimates the rest of the ρw spectrum. The shape most similar to the in situ spectrum is given by Sen2Cor, which, like POLYMER, estimates ρw well at 560 nm. Although Sen2Cor is not able to detect the chl a absorption at 675 nm, it does not underestimate the in situ measured ρw as strongly as C2RCC and POLYMER.

Peipsi
There was a same-day match-up on the 14th of September 2016 for Estonia's biggest lake, Peipsi (Figure A8), which belongs to Type 7 (non-stratified, water color light) according to the WFD (Table 1). The concentrations of chl a (24-35 mg/m³) and TSM (10-16 mg/m³) are high (Table 3). The color of the water is yellow-green or green-yellow, and the transparency is moderate (0.6-0.9 m) in the open part of the lake [60,61]. The in situ points are located in different parts of the lake: point Peipsi_11 is in the middle of the lake, point Peipsi_12 is adjacent to land, and point Peipsi_38 is at the mouth of a river inflow. The AC processors worked well, and the derived values and shapes of ρw are comparable with in situ data for all points (Figure 3), except for ACOLITE, where the ρw values at 443 nm, 490 nm, 740 nm, and 783 nm were too high. C2RCC provided no chl a absorption peak, whereas the other AC processors derived a chl a absorption peak at 675 nm. At the river mouth (Peipsi_38), POLYMER was not able to derive ρw.

Võrtsjärv
There was also a same-day match-up on the 20th of May 2016 for Estonia's second largest lake, Võrtsjärv (Figure A9), which belongs to Type 6 (non-stratified, water color light) according to the WFD (Table 1). Lake Võrtsjärv is a shallow eutrophic lake with high chl a and TSM (Table 3), which cause low water transparency. The color of the water is yellow-green or green-yellow and is caused by a high amount of plankton and resuspension from sediment [60,61]. Point Võrtsjärv_1 is located in the middle of the lake, and point Võrtsjärv_10 is near the coast. While each processor derives ρw spectra with a shape similar to the in situ data, ACOLITE and Sen2Cor strongly overestimate the blue part of the ρw spectrum. As these match-up points were in the vicinity of clouds, ACOLITE and Sen2Cor could be more sensitive to cloud pixels, which results in a higher ρw. A high TSM (10.8 mg/m³) causes higher ρw at red/NIR wavelengths, which is shown by the results of all AC processors. The chl a absorption peak near 675 nm is derived by all processors except C2RCC.

Comparison of AC processors
Comparison of AC processors over optically different water types revealed that the adjacency effect could cause problems with estimating a correct and accurate ρw spectrum in small lakes. In addition to the adjacency effect, AC processors fail more frequently in CDOM-dominated waters, because the high absorption of light in these conditions causes lower reflectances. As a result, the AC processors are often not able to retrieve ρw from these kinds of lakes. This has been reported by Grendaité et al. [25] for eutrophic lakes in Lithuania and by Kutser et al. [62] for CDOM-dominated waters in Estonia and Sweden. However, C2RCC is capable of deriving ρw in most cases, especially in problematic CDOM-dominated waters where other AC processors failed, and it is able to work across different water types. Sen2Cor and ACOLITE are similarly able to work in CDOM-dominated waters, although they fail to derive ρw more often than C2RCC does. Like Sen2Cor, ACOLITE typically overestimates the blue part of the ρw spectrum. Caballero et al. [63] studied Spanish coastal areas, applying ACOLITE and POLYMER to medium to highly turbid waters, and found that ACOLITE similarly overestimated the blue part of the spectrum; the same has been noticed by Dörnhöfer et al. [64] on oligotrophic lakes. Additionally, ACOLITE often overestimates NIR wavelengths, which could be caused by the adjacency effect. Caballero et al. [63] observed results similar to those of this study for POLYMER, which agreed well with in situ measurements in TSM-dominated waters, such as coastal areas.
The results of the comparison of all AC processors are summarized in Table 4. The number of pixels available for comparison among AC processors was the highest for C2RCC (N = 13), which gives the opportunity to obtain information from small lakes. The least data were retrieved with POLYMER (N = 7), because it did not retrieve ρw in small lakes. For all AC processors, the coefficient of determination is highest at the wavelengths 560, 665, and 705 nm, which are the main wavelengths for developing chl a algorithms for optically complex Case 2 waters. C2RCC shows the highest correlation (R² > 0.7) at these wavelengths, whereas the correlation is smaller (R² < 0.7) for the other AC processors. An intercept (I) of zero for C2RCC indicates that the derived ρw fits well with the in situ measured ρw. ACOLITE gives higher I values (I ≥ 0.1) and the other AC processors give lower I values (I ≤ 0.0). A slope (S) close to one indicates a good fit between derived and in situ measured ρw. At the main wavelengths for developing chl a algorithms, S was close to one for the C2RCC processor (S = 0.8-1.18).
Based on the average absolute percentage difference, POLYMER gives the most accurate results compared to the in situ measured ρw at the main wavelengths for developing chl a algorithms (ψ < 40.6). According to the root-mean-square difference, C2RCC and POLYMER show the most accurate results, which means that the ρw derived by C2RCC and POLYMER is the most comparable with in situ measured ρw; for C2RCC, ∆ was 0.01 only at 560 nm and 0 otherwise. As bias indicates over- or underestimation, the AC processors tend to underestimate the in situ measured ρw spectrum.
Based on the derived statistics, C2RCC and POLYMER showed the highest accuracy. As the purpose of the study is to estimate water quality in small lakes, a high number of derived water pixels is essential, and this number is highest for C2RCC. The reason is the processor-based flagging: C2RCC did not flag out as many pixels as ACOLITE, POLYMER, and Sen2Cor in optically complex waters. Based on the coefficient of determination and root-mean-square difference, C2RCC was selected as the processor to derive ρw as an input for chl a algorithms. Figure 4 shows the correlation between ρw derived by C2RCC and in situ measured ρw at given wavelengths. The ρw values derived by C2RCC are rather underestimated compared with in situ measured ρw values. Table 4 supports this, because the bias is negative from 490 nm to 740 nm.
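The match-up statistics discussed above can be illustrated with a minimal sketch. The exact definitions used for Table 4 are not restated in this section, so the formulas below are assumptions: S and I are taken as the slope and intercept of a least-squares line of derived against in situ ρw, bias as the mean difference, ∆ as the root-mean-square difference, and ψ as the mean absolute percentage difference. The function name and the sample reflectances are hypothetical.

```python
import math


def matchup_statistics(derived, in_situ):
    """Compare AC-derived and in situ reflectance at one band (assumed formulas)."""
    n = len(derived)
    mean_x = sum(in_situ) / n
    mean_y = sum(derived) / n
    sxx = sum((x - mean_x) ** 2 for x in in_situ)
    syy = sum((y - mean_y) ** 2 for y in derived)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(in_situ, derived))
    slope = sxy / sxx                      # S: least-squares slope
    intercept = mean_y - slope * mean_x    # I: least-squares intercept
    diffs = [y - x for x, y in zip(in_situ, derived)]
    return {
        "N": n,
        "R2": sxy ** 2 / (sxx * syy),      # squared Pearson correlation
        "S": slope,
        "I": intercept,
        "bias": sum(diffs) / n,            # negative means underestimation
        "rmsd": math.sqrt(sum(d ** 2 for d in diffs) / n),
        "psi": 100.0 * sum(abs(d) / x for d, x in zip(diffs, in_situ)) / n,
    }


# Hypothetical reflectances at a single band, for illustration only.
in_situ = [0.010, 0.020, 0.030, 0.040]
derived = [0.008, 0.018, 0.028, 0.038]
stats = matchup_statistics(derived, in_situ)
```

In this toy case the derived values sit a constant amount below the in situ values, so the slope stays at one while the bias is negative, which is the pattern of underestimation described for C2RCC.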
Remote Sens. 2018, 10, x FOR PEER REVIEW 14 of 27

Comparing and Developing chl a Algorithms for S2 MSI
Based on the literature overview, 28 empirical algorithms were selected that could be adapted to S2 MSI bands (Table 5). These algorithms were tested on the GLaSS dataset, in which oligotrophic to hypereutrophic lakes are represented.

Table 5. The form of the investigated algorithms with references, which were adjusted to S2 MSI wavelengths for this study. Columns: Investigated Empirical Algorithms; Reference.

To analyze the applicability of various empirical approaches on different lakes, the algorithms (Table 5) were applied either to L1C S2 MSI data or to in situ measured ρw data (GLaSS dataset) to derive the conversion factors for estimating chl a. On L1C data, various MCI-based approaches were tested (Table 6). This analysis is based on a limited amount of data (Table 3, N = 12) with relatively high scatter (R² ~0.25). As there are only a few match-ups between S2 MSI and in situ data, the GLaSS dataset was used to test the algorithms (Table 5) on different water types and to derive the conversion factors suitable for S2 L2 data for estimating chl a (Table 6). For every water type (or lake), the three best empirical approaches were chosen based on the highest R² values (Table 6).
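The selection step described above can be sketched as follows, assuming that each conversion is a simple linear fit of in situ chl a against a band index and that candidates are ranked by R². The function names, index names, and sample values are hypothetical and do not come from the GLaSS dataset.

```python
def fit_linear(index_values, chl_a):
    """Least-squares fit chl_a = a * index + b; returns (a, b, R^2)."""
    n = len(chl_a)
    mean_x = sum(index_values) / n
    mean_y = sum(chl_a) / n
    sxx = sum((x - mean_x) ** 2 for x in index_values)
    syy = sum((y - mean_y) ** 2 for y in chl_a)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(index_values, chl_a))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b, sxy ** 2 / (sxx * syy)


def rank_algorithms(indices, chl_a, top=3):
    """Fit every candidate index and keep the `top` fits with the highest R^2."""
    fits = {name: fit_linear(values, chl_a) for name, values in indices.items()}
    return sorted(fits.items(), key=lambda item: item[1][2], reverse=True)[:top]


# Hypothetical in situ chl a (mg/m^3) and two made-up candidate indices.
chl_a = [5.0, 10.0, 20.0, 40.0]
indices = {
    "index_a": [(c - 2.0) / 4.0 for c in chl_a],  # exactly linear in chl a
    "index_b": [0.1, 0.5, 0.2, 0.9],              # scattered
}
ranked = rank_algorithms(indices, chl_a)
```

Each entry of `ranked` pairs an index name with its conversion factors and R², which is the kind of information reported per water body in Table 6.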

Ecological Status of Water in Lakes Based on chl a
For estimating the ecological status of a water body based on chl a, the specific thresholds have to be considered for each lake type according to the WFD (Table 2). Figure 6 represents the seasonal dynamics of chl a and the corresponding ecological status class for different types of water according to the WFD, using the empirical algorithms from Table 6. Additionally, the chl a value output by the standard C2RCC algorithm is shown in Figure 6.

As Verevi is a small lake, invalid ρw spectra are shown in Figure 6b, as described previously for Figure 5b. When chl a algorithms are applied to Verevi, the concentration of chl a remains constant (Figure 6a), because the ρw spectra are not typical of Case 2 waters. However, the MCI algorithm applied to L1C data gives accurate results compared to in situ data and shows seasonal dynamics.

Lakes with a larger water surface area show better results using C2RCC ρw. Ermistu belongs to Type 2 (non-stratified, water color dark/light), for which chl a between 10.8 and 28 mg/m³ corresponds to the "good" status class. The Four-Band NIR-Red Model approach, $\mathrm{chl}\,a = 43.2 \times \frac{R_{665}^{-1} - R_{705}^{-1}}{R_{740}^{-1} - R_{705}^{-1}} + 10.2$, worked best and showed a trend similar to in situ measured chl a. The standard C2RCC algorithm estimates chl a similarly to the empirical algorithms, but the range between its minimum and maximum values is larger, which makes it less stable for estimating chl a (Figure 6c). As the water surface is large (450 ha), the ρw of C2RCC was estimated well, because the adjacency effect did not influence the water pixels of the lake. Both the in situ measured and the empirically derived chl a values are low (chl a < 10 mg/m³), and there is no strong chl a absorption peak at 675 nm in the C2RCC-derived ρw spectra (Figure 6d), which is reflected in the stable seasonal dynamics. The ecological water status in Ermistu was estimated as "very good" both by in situ measurements (average in situ chl a 6.0 mg/m³, N = 4) and by satellite-derived chl a (average 5.7 mg/m³, N = 10).

Ähijärv belongs to Type 3 (stratified, water color dark/light); therefore, chl a between 5.8 and 13 mg/m³ assigns the lake to the "good" ecological status class. The algorithm $\mathrm{chl}\,a = 24385.4 \times \left(R_{705} - \frac{R_{665} + R_{740}}{2}\right) + 7.7$ gives the most accurate results for this large lake. The same band ratio has been used by Toming et al. [69] on lakes with chl a in the range 3-72 mg/m³, retrieving R² = 0.8. An MCI-based approach gave concentrations similar to in situ measurements; however, lower chl a values were overestimated. Furthermore, in the other investigated Type 3 lakes, the MCI-based algorithm applied to ρw estimates chl a of the same order as the in situ measured values. The standard C2RCC algorithm gave very low or very high concentrations, as noted previously, but the dynamics remain similar to in situ measurements. The ρw of Ähijärv showed slight chl a absorption (Figure 6f), but not as strong as would be expected with chl a over 20 mg/m³. The ecological water status in Ähijärv was estimated as "moderate" by in situ measurements (average in situ chl a 18.0 mg/m³, N = 4), by satellite-derived chl a (empirical algorithm averages 14.7 and 16.4 mg/m³, N = 9), and by L1C MCI (23.6 mg/m³, N = 9).
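The mapping from a mean chl a value to a status class can be sketched as a simple threshold lookup. Only the "good" range for Type 3 (5.8-13 mg/m³) is quoted in the text; the upper limits for "moderate" and "bad" below are hypothetical placeholders standing in for the values of Table 2, and the treatment of values that fall exactly on a limit is also an assumption.

```python
from bisect import bisect_right

STATUS_CLASSES = ["very good", "good", "moderate", "bad", "very bad"]


def classify_status(chl_a, upper_bounds):
    """Return the status class for a chl a value (mg/m^3).

    upper_bounds holds four ascending limits that separate the five classes.
    A value equal to a limit is assigned to the higher (worse) class, which
    is an assumption of this sketch.
    """
    if len(upper_bounds) != 4 or list(upper_bounds) != sorted(upper_bounds):
        raise ValueError("expected four ascending class limits")
    return STATUS_CLASSES[bisect_right(upper_bounds, chl_a)]


# 5.8 and 13.0 come from the text (Type 3 "good" range);
# 26.0 and 52.0 are hypothetical placeholders, not values from Table 2.
TYPE3_BOUNDS = [5.8, 13.0, 26.0, 52.0]

status = classify_status(18.0, TYPE3_BOUNDS)
```

With these limits, the average in situ chl a of 18.0 mg/m³ reported for Ähijärv falls above the "good" range, in line with the "moderate" status stated above.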
Peipsi_2 belongs to Type 6 (described previously). Chl a between 3 and 8 mg/m³ assigns the lake to the "good" ecological status class. The Three-Band NIR-Red Model algorithm $\mathrm{chl}\,a = 260.1 \times \left(\left(R_{665}^{-1} - R_{705}^{-1}\right) \times R_{740}\right) + 27.9$, with different coefficients compared to Type 2 lakes, gave dynamics of chl a similar to in situ measurements, although, like C2RCC, it underestimated the values. The MCI algorithm applied to L1C data overestimates chl a, because the chl a concentration is lower than 15 mg/m³ at the beginning of summer (Figure 6g). However, it follows dynamics similar to those of in situ chl a, albeit at higher concentrations. Since Peipsi is a larger lake, water pixels were not affected by the coastline as much as in small lakes, which is shown by the number of suitable C2RCC pixels (N = 13). The ecological water status in Peipsi_2 was estimated as "moderate" by in situ measurements (average in situ chl a 16.8 mg/m³, N = 5), "good" and "moderate" by satellite-derived chl a (averages 7.4 and 16.7 mg/m³, N = 13), and "bad" by L1C MCI (25 mg/m³, N = 13).

It was possible to estimate the ecological status of water by chl a in some cases. Bresciani et al. [24] investigated subalpine lakes with S2 MSI and Landsat OLI, which showed a clear advantage over in situ measurements for spatial scale analyses. In situ monitoring of chl a consists of point measurements from the lakes, whereas remote sensing is capable of estimating chl a values over the whole lake. The high revisit frequency of the two satellites helps to detect intense phytoplankton blooms and to estimate the ecological status of water while taking phytoplankton events into account in the final classification. The possibility of estimating chl a depended on the capability of C2RCC to derive accurate ρw in small lakes and in the vicinity of land. In Type 2 lakes, the algorithm $\frac{R_{665}^{-1} - R_{705}^{-1}}{R_{740}^{-1} - R_{705}^{-1}}$ worked well, giving results similar to in situ measurements. The Three-Band NIR-Red Model algorithm $R_{705} - \frac{R_{665} + R_{740}}{2}$ also worked in Peipsi and in smaller lakes. However, the standard C2RCC algorithm typically estimates chl a reasonably well when chl a < 10 mg/m³. The MCI-based approach L2 MCI (R783) applied to ρw worked in Ähijärv and Peipsi_2. The L1C MCI approach typically overestimated in situ measured chl a, but showed similar dynamics and concentrations when chl a > 15 mg/m³. The MCI algorithm applied to L1C data is the best solution for avoiding failures that come from AC or that occur in the case of intense phytoplankton blooms. The same tendency has been shown by Toming et al. [69], Grendaité et al. [25], and Alikas et al. [73], where band ratio algorithms applied to L1C data gave higher accuracy in deriving chl a compared to L2 data. However, more data are needed to investigate the relationship between MCI and chl a. The accuracy of the final chl a product could also be increased by using bands with higher spatial resolution (10 m, 20 m), which could decrease error propagation due to the adjacency effect.
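The band combinations named in this section can be written out as a short sketch. The index expressions and the conversion factors 24385.4 and 7.7 are taken from the text; the function names and the sample reflectances are hypothetical, and the linear form chl a = a × index + b is assumed for every algorithm.

```python
def peak_height(r665, r705, r740):
    """R705 - (R665 + R740) / 2, the 705 nm peak above the mean of its neighbours."""
    return r705 - (r665 + r740) / 2.0


def three_band(r665, r705, r740):
    """Three-band NIR-red index: (1/R665 - 1/R705) * R740."""
    return (1.0 / r665 - 1.0 / r705) * r740


def four_band(r665, r705, r740):
    """Four-band NIR-red index: (1/R665 - 1/R705) / (1/R740 - 1/R705)."""
    return (1.0 / r665 - 1.0 / r705) / (1.0 / r740 - 1.0 / r705)


def chl_a_linear(index, a, b):
    """Apply conversion factors a and b to an index value (mg/m^3)."""
    return a * index + b


# Hypothetical reflectances at 665, 705 and 740 nm, for illustration only.
r665, r705, r740 = 0.0040, 0.0048, 0.0016

# Conversion factors quoted in the text for the peak-height form (Ähijärv).
chl_peak = chl_a_linear(peak_height(r665, r705, r740), 24385.4, 7.7)
```

Because the four-band index divides by the difference between the inverse reflectances at 740 nm and 705 nm, it is undefined when those two reflectances are equal, so a practical implementation would need to mask such pixels.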

Conclusions
The EU WFD obligates member states to monitor lakes larger than 50 ha and derive their ecological status. The goal is to achieve, at least, "good" ecological status and to take measures to improve the status if needed. S2 MSI has a suitable spatial resolution, starting from 10 m, which could support the development of new applications over lakes for fulfilling WFD monitoring requirements. With both satellites, S2A and S2B, the temporal resolution of 2 to 3 days makes it possible to include and analyze more data for testing and for developing new applications for S2 MSI. Moreover, it has the advantage of providing time series and dynamics of chl a in lakes.
S2 MSI is mainly designed for vegetation applications; therefore, it is important to compare different AC processors to find the most suitable one for water applications. As the AC procedure is an essential step in developing chl a algorithms, four different AC processors (ACOLITE, C2RCC, POLYMER, and Sen2Cor) were tested in this study. Based on the 13 match-up points, C2RCC was chosen due to its high correlation with in situ measurements at the bands usable for deriving chl a.
As chl a is one of the main parameters for estimating the ecological status of water based on the WFD, the second part of the study included testing and developing chl a algorithms adjusted to S2 MSI bands. Further investigation showed that C2RCC is not able to give accurate results in small, narrow lakes, where the adjacency effect affects the pixels near the coastline. Therefore, it is important to develop corrections for the adjacency effect, which could help to avoid invalid mixed pixels in analyses.
In some cases where the adjacency effect was smaller, especially in larger lakes (over 90 ha) and where the shape of the lake is round, it was possible to estimate chl a at the water surface using empirical algorithms. The standard C2RCC chl a algorithm consistently estimates chl a either higher or lower than the in situ data, but with similar seasonal dynamics. In the case of high chl a (up to 150 mg/m³), good agreement with in situ data was obtained with the Three-Band NIR-Red Model algorithms $\left(R_{665}^{-1} - R_{705}^{-1}\right) \times R_{740}$ and $R_{705} - \frac{R_{665} + R_{740}}{2}$. In lakes with high TSM (up to 19 mg/m³), the Four-Band NIR-Red Model algorithm $\frac{R_{665}^{-1} - R_{705}^{-1}}{R_{740}^{-1} - R_{705}^{-1}}$ worked best, removing the TSM influence, and, as previously mentioned, the Three-Band NIR-Red Model algorithm and the MCI algorithm applied to L1C data estimated chl a similarly to in situ measurements when chl a > 15 mg/m³. As it is not influenced by AC, the MCI algorithm applied to S2 L1C data is the most suitable; however, further investigation is needed.
Furthermore, the C2RCC processor was not very sensitive in estimating chl a absorption at band 665 nm, which could be due to the adjacency effect in small lakes. This means that more validation data from optically complex waters are essential for improving AC processors. S2 MSI has potential for deriving water quality parameters from small lakes for fulfilling EU WFD monitoring and reporting requirements. However, improvements in AC algorithms are essential for producing higher-level water quality products. This would make it possible to exploit the advantages of S2 MSI over in situ measurements and to support regular monitoring of small lakes.

Figure 1. Eighty-nine lakes in Estonia (≥50 ha), which should be monitored regularly according to the European Union Water Framework Directive (EU WFD) [4,44]. Red points represent lakes that were used for validation of the atmospheric correction (AC) processors in the study. Background map courtesy of Maa-amet.


Figure 2. Comparison of MultiSpectral Instrument (MSI) (solid lines) and Ocean and Land Colour Instrument (OLCI) (dotted lines) bands at specific wavelengths, considering the Spectral Response Function over the visible and near-infrared (NIR) part of the spectrum. Bands at the second chl a absorption peak are highlighted by bold grey lines. Thin solid grey lines show the variation of in situ measured ρw spectra.


Figure 5. Time series of Raigastvere (a) and Saadjärv (c), where red areas (and the red spectra) show the condition band 3 < band 1 and black areas (and the black spectra) represent water pixels (b). Grey triangles represent the Raigastvere in situ measured spectrum and grey stars the Saadjärv in situ measured spectrum.


Author Contributions: A.A. and K.A. are responsible for conceptualization and methodology. A.A. did the data analysis, wrote the first draft, and prepared the visualization. Both A.A. and K.A. collected some of the in situ data and contributed to the final version of the manuscript.
Funding: This research was funded by the European Union's Horizon 2020 research and innovation program, grant number 730066, and by the Estonian Research Council grant (PSG10).

Table 1. Classification of Estonian lake types for applying the European Union Water Framework Directive (EU WFD) based ecological status class estimation. Water color "dark" means absorption at 400 nm is ≥4 m⁻¹ and water color "light" means absorption at 400 nm is <4 m⁻¹. High amount of chloride means a chloride content in water of >25 mg/L, and low amount means <25 mg/L.

Table 3. Description of the analyzed Estonian lakes, which were used for validating AC processors. Concentrations of optically active substances (OAS) in each lake were measured in the laboratory from water samples collected during the radiometric measurements. The processing baseline of the S2 MultiSpectral Instrument (MSI) images is given in the last column.

Table 6. Based on the Level-1C (L1C) data (MCI-based approaches) or the Global Lakes Sentinel Services (313256) dataset (GLaSS dataset), three algorithms were selected for every water body based on the highest R². Minimum, maximum, and average concentrations are shown for every lake. The equation for deriving chl a is shown in the last column. "Estonian lakes" refers to the lakes from Table 3 where chl a is over 10 mg/m³, and "empirical chl a algorithm" means the empirical algorithm with conversion factors.