An Optical Classification Tool for Global Lake Waters

Shallow and deep lakes receive and recycle organic and inorganic substances from within the confines of these lakes, their watershed and beyond. Hence, a large range in absorption and scattering and extreme differences in optical variability can be found between and within global lakes. This poses a challenge for atmospheric correction and bio-optical algorithms applied to optical remote sensing for water quality monitoring applications. To optimize these applications for the wide variety of lake optical conditions, we adapted a spectral classification scheme based on the concept of optical water types. The optical water types were defined through a cluster analysis of in situ hyperspectral remote sensing reflectance spectra collected by partners and advisors of the European Union 7th Framework Programme (FP7) Global Lakes Sentinel Services (GLaSS) project. The method has been integrated in the Envisat-BEAM software and the Sentinel Application Platform (SNAP) and generates maps of water types from image data. Two variations of water type classification are provided: one based on area-normalized spectral reflectance focusing on spectral shape (6CN, six-class normalized) and one that retains magnitude with no modification to the reflectance signal (6C). This resulted in a protocol, or processing scheme, that can also be applied or adapted for Sentinel-3 Ocean and Land Colour Imager (OLCI) datasets. We apply both treatments to MERIS imagery of a variety of European lakes to demonstrate their applicability. The studied target lakes cover a range of biophysical types, from shallow turbid to deep and clear, as well as eutrophic and dark absorbing waters, rich in colored dissolved organic matter (CDOM). In shallow, high-reflecting Dutch and Estonian lakes with high sediment load, 6C performed better, while in deep, low-reflecting clear Italian and Swedish lakes, 6CN performed better. The 6CN classification of in situ data is promising for very dark, high CDOM, absorbing lakes, but we show that our atmospheric correction of the imagery was insufficient to corroborate this. We anticipate that the application of the protocol to other lakes with unknown in-water characterization, but with comparable biophysical properties, will suggest similar atmospheric correction (AC) and in-water retrieval algorithms for global lakes.


Introduction
Freshwater lakes, reservoirs and rivers are an essential resource for human and animal survival. Population growth, coupled with changes in land use, hydrologic regimes and climate, is stressing these systems worldwide, threatening their function as sources for drinking water, socio-economic activities and ecological environments. Over the last decade, there has been an increase in the capacity and availability of remote sensing imagery from satellites for lake systems worldwide, promoting the usage and creating new demands for reliable remotely-sensed datasets. These new capabilities stem in part from newly-launched satellites, such as the MultiSpectral Imager (MSI) on board the European Space Agency's (ESA) Sentinel-2 satellite and the Ocean and Land Colour Imager (OLCI) on board ESA's Sentinel-3 satellite. The OLCI sensor is similar in spectral capabilities to the Medium Resolution Imaging Spectrometer (MERIS) sensor (2002-2012), containing spectral channels well suited to derive bio-optical parameters over the large range of optical conditions exhibited in lakes [1,2]. Sentinel-3A was launched in February 2016, and its twin Sentinel-3B is expected to be launched in 2017. The tandem missions of Sentinel-3A/B and follow-ups will provide unprecedented monitoring capabilities for lake water quality because of the favorable band settings, high signal/noise ratios, full spatial resolution (300 m) and high overpass frequency.
A prototype infrastructure for handling of bio-optical algorithms and data products specific to freshwater lakes was prepared within the EU Global Lakes Sentinel Services (GLaSS) project (www.glass-project.eu). GLaSS aimed to develop generic methods and tools for Sentinel-2 and Sentinel-3 data, using legacy datasets, and in support of water quality management for any lake worldwide. One of the GLaSS products developed for lake image analysis is a classification tool based on the spectral matching method of Moore et al. [3] as an expression of optical water types (OWTs). The OWT tool operates on atmospherically-corrected and quality-checked images prior to the application of bio-optical algorithms and provides users with a powerful data analysis technique to visualize and discover the (variability of) optical conditions across image scenes.
Classification schemes are more common for terrestrial imagery, but are gaining traction in aquatic applications and share basic similarities [4,5]. In both cases, the classification systems are based on features (i.e., spectral channels) in a spectral signal related to underlying types with ecological meaning. The features stem from the spectral reflectance shape and magnitude and are ultimately limited by the spectral resolution of the sensors when utilized for image classification. For aquatic uses, water types are analogous to land cover types, representing an optical condition, and hence are referred to as optical water types or OWTs. This notion of optical type has origins in [6,7], where water types were defined by the diffuse attenuation coefficient of downwelling light. These Jerlov types are still used in marine applications [8] and were used in a recent modeling study to generate Inherent Optical Properties (IOPs) for each type [9], directly utilizing type-specific parameters.
More recent water type schemes have been introduced over the last 20 years using a variety of methods based on in situ and/or satellite reflectance data. Regardless of the method, OWTs provide information on the spatial distribution of optical states across image scenes when applied to satellite data. These mapped products function as weighting factors for optimizing bio-optical algorithms and product uncertainties for image scenes [3,10-12]. In these cases, they are intermediary products that are not needed themselves for analysis and are invisible to users. However, OWTs are depictions of optical states, providing information on underlying water conditions that in and of themselves have intrinsic ecological value. They have been used directly for interpretive analysis for ecological diversity [13] and ecological patterns [14] that may not be obvious from other bio-optical products, such as chlorophyll concentration, which may be hard to retrieve in complex lake waters, because of the complex atmospheric and in-water optical properties. In some cases, OWTs have been linked to distinct optical phenomena that relate to specific phytoplankton [15]. These studies collectively illustrate the varying roles and uses for water types, whether freshwater or marine, when applied to remote sensing data.
The GLaSS optical water types are a follow-up to [3], which presented OWTs derived from lake and coastal waters. The GLaSS dataset comprises lake data only and is larger, including more diverse lakes from across the globe. Within this paper, we introduce this classification method (called GLaSS-OWT or GLaSS optical water type method). The water types were derived from a cluster analysis. The classification system that we present has two main implementation options: a set of optical water types for un-modified reflectance data and a set of water types for normalized reflectance data, an aspect not presented in [3]. The method is designed to be applicable to any lake system, covering a large range of biophysical types from shallow turbid to clear and deep, as well as eutrophic and dark absorbing colored dissolved organic matter (CDOM)-rich waters.
We describe the development of the classification method and demonstrate its application to a variety of lake systems processed with different atmospheric correction schemes. The variations in OWT image products are discussed in the context of atmospheric correction. We also examine the strengths and differences of the different OWT schemes and how they may be appropriate for different global lakes with unknown optical properties.

In Situ Data Sources
Conceptually, the OWTs represent optical states that can be determined by the spectral remote sensing reflectance or R_rs(λ). This term refers to the above-water quantity unless otherwise noted. In practice, OWTs are derived by averaging grouped R_rs(λ) spectra that share characteristics (e.g., spectral shape), where each individual spectrum is an instance along an optical continuum bound by the outer ranges of the environmental and optical conditions of all water systems. The goal of the GLaSS lake classification is a meaningful partitioning of the full multi-dimensional R_rs(λ) space into a set of optical water types. This water type-specific approach is intrinsically independent of location and time and therefore designed for global application. Each water type covers a range of optical conditions, and thus, the environmental representation of a water type is that of an average condition.
The GLaSS OWT implementation is based on that of Moore et al. [3], but includes a larger variety of lakes. A motivation for the GLaSS OWT implementation was to develop a lake-specific classification tool for all lakes and conditions. To achieve this, we assembled a dataset of in situ hyperspectral R_rs(λ) with co-measured Chl-a and Total Suspended Matter (TSM) concentrations and absorption of CDOM at 443 nm (aCDOM) from multiple sources covering a wide dynamic range in optical and environmental conditions. This dataset includes the 'lake only' dataset portion (N = 320) from [3], which consists of measurements from the northeast U.S., the Great Salt Lake [16] and across Spain [17]. We refer the readers to these references for further information on the data collection protocols. These data were combined with the GLaSS in situ dataset (Table 1), which consists of R_rs(λ) with co-measured Chl-a, TSM and aCDOM from different countries. This dataset covers a large range of Chl-a, TSM and CDOM concentrations, including high concentrations (Chl-a > 900 mg m−3, TSM > 200 g m−3, aCDOM(443) > 30 m−1), representing a large variety of optical conditions.
The GLaSS R_rs(λ) measurements were collected above water and processed according to standard protocols [18]. The measurements consisted of: (1) light (radiance) emerging from water (L_w) measured at a 40-45 degree elevation angle from nadir and about a 135 degree azimuth angle from the Sun; (2) radiance from the sky (L_sky) measured at the same viewing angles; and (3) downwelling irradiance (E_d). The remote sensing reflectance, R_rs (in sr−1), is then computed with:
$$R_{rs} = \frac{L_w - \rho\,L_{sky}}{E_d} \qquad (1)$$
where ρ, the air-sea interface reflectance factor, was fixed at 0.028 at a zenith angle of 42 degrees [19]. A dataset from Lake Erie measured in 2013 (N = 16) was also added during the development of the tool. These data included hyperspectral R_rs(λ) taken with a Field Spec Pro™ VNIR-NIR1 portable spectrometer system from Analytical Spectral Devices (Boulder, Colorado). The protocol for deriving R_rs(λ) was similar to that of the GLaSS data for Steps 1 and 2, although the downwelling irradiance (E_d) was determined from a grey card plaque.
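The above-water reflectance computation can be illustrated with a minimal sketch. It assumes the usual form R_rs = (L_w − ρ L_sky) / E_d with ρ = 0.028 as stated in the text; the function name and the sample radiance and irradiance values are hypothetical and serve only to show the arithmetic.

```python
def remote_sensing_reflectance(l_w, l_sky, e_d, rho=0.028):
    """Above-water R_rs (sr^-1) per wavelength: (L_w - rho * L_sky) / E_d.

    rho is the interface reflectance factor (0.028 in the text).
    """
    if not (len(l_w) == len(l_sky) == len(e_d)):
        raise ValueError("all spectra must share the same wavelength grid")
    return [(lw - rho * ls) / ed for lw, ls, ed in zip(l_w, l_sky, e_d)]


# Hypothetical two-band example (radiances and irradiance in consistent units).
l_w = [0.010, 0.012]    # radiance measured over the water
l_sky = [0.100, 0.080]  # sky radiance
e_d = [1.0, 1.2]        # downwelling irradiance
rrs = remote_sensing_reflectance(l_w, l_sky, e_d)
```

The sky-radiance term removes the fraction of skylight reflected at the surface, so a larger ρ lowers the derived reflectance.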
All hyperspectral R_rs(λ) data were band averaged to 3-nm resolution in the merged dataset (N = 926), quality controlled and reduced to N = 871 (Figure 1). Quality control measures consisted of visual inspection of every spectral observation and the application of the ocean chlorophyll (OC4) algorithm and MERIS three-band Chl-a algorithms for consistency checking. Observations with noisy or negative spectra were rejected, as were spectra with abnormal Chl-a retrievals. It should be noted that R_rs(λ) associated with floating algal mats (i.e., high NIR values) were removed. We believe this to be a special water type case that will be added in the future. The current dataset contained too few samples for this type to be characterized at present.

Development of the GLaSS Optical Water Types
To create the OWTs, a cluster analysis was applied to the merged, quality controlled R_rs(λ) data. The goal of the clustering is simply to serve as a mechanism to sort data and to produce a partitioning of meaningful sub-groups. The effectiveness of cluster partitioning depends on the features that contribute to separability, in our case R_rs at specific wavelengths, represented as a vector. In many cases, feature dimensionality can be reduced from the original dataset. This is often necessary to minimize processing time and cluster instability from redundant features or bands that highly covary [20], which is the case with hyperspectral data. Prior to clustering, feature selection and extraction were conducted on R_rs(λ). The wavelengths chosen were those that matched the MERIS (and several Sentinel-3) visible and NIR band centers (412, 443, 490, 510, 560, 620, 665, 681, 709 and 753 nm), which reduced the dimension of each R_rs(λ) spectrum from 134 down to 10. Note that the sole purpose of this feature reduction is to identify clusters, not to reduce the spectral dimensionality of the overall dataset.
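The feature reduction step can be sketched as follows. This is an illustration only: it assumes a 3-nm grid starting at 400 nm (134 samples) and picks the sample nearest each band center, whereas the actual tool may average over the sensor band width. The function name and the placeholder spectrum are hypothetical.

```python
MERIS_BANDS_NM = [412, 443, 490, 510, 560, 620, 665, 681, 709, 753]


def select_bands(wavelengths, spectrum, centers=MERIS_BANDS_NM):
    """Return the spectrum values at the samples nearest to each band center."""
    picked = []
    for center in centers:
        nearest = min(range(len(wavelengths)),
                      key=lambda k: abs(wavelengths[k] - center))
        picked.append(spectrum[nearest])
    return picked


# Assumed grid: 134 samples at 3-nm spacing (400, 403, ..., 799 nm).
wavelengths = [400 + 3 * k for k in range(134)]
spectrum = [0.001 * k for k in range(134)]  # placeholder values, not real data
features = select_bands(wavelengths, spectrum)
```

The resulting 10-element vector is what would be passed to the clustering step, while the full 134-element spectrum is kept for building the water type means and covariances.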
We applied the fuzzy c-means (FCM) algorithm [21] to the reduced R_rs(λ) data. Following [3], these data were first transformed to sub-surface values (Equation (2)) according to [22]. It should be noted that the clustering and ensuing membership functions use the below-water quantity, but we will continue to refer to any spectra as R_rs(λ) for simplicity.
$$R_{rs}(0,-) = \frac{R_{rs}(0,+)}{0.52 + 1.7\,R_{rs}(0,+)} \qquad (2)$$
The FCM algorithm partitions the input data into a specified number of clusters. The algorithm operates by minimizing the distance between the data points and the prototype cluster centers (means), which are iteratively adjusted until optimization criteria are met. Since the number of clusters is not known beforehand, FCM was applied to the dataset for cluster numbers ranging from 2 to 20. Cluster validity functions were used to assess the effectiveness of the cluster performance for each outcome. These functions measure various aspects of the entire cluster partitioning and were used to guide the ultimate choice for the number of optimal clusters [3].
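The sub-surface conversion and the clustering step can be sketched with NumPy. This is a generic textbook FCM with Euclidean distance, not the authors' implementation: the fuzzifier m = 2, the stopping tolerance, the random initialization and the synthetic data are all assumptions, and the cluster validity functions mentioned in the text are not shown.

```python
import numpy as np


def to_subsurface(rrs_above):
    """Equation (2): above-water R_rs converted to the below-water quantity."""
    rrs_above = np.asarray(rrs_above, dtype=float)
    return rrs_above / (0.52 + 1.7 * rrs_above)


def fuzzy_c_means(x, n_clusters, m=2.0, max_iter=300, tol=1e-6, seed=0):
    """Generic FCM; returns (cluster centers, membership matrix)."""
    rng = np.random.default_rng(seed)
    u = rng.random((x.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)            # each row sums to one
    for _ in range(max_iter):
        w = u ** m
        centers = (w.T @ x) / w.sum(axis=0)[:, None]
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                    # guard against zero distance
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(u_new - u).max() < tol
        u = u_new
        if converged:
            break
    return centers, u


# Synthetic stand-in for 10-band spectra: a dark group and a bright group.
rng = np.random.default_rng(1)
dark = rng.normal(0.002, 0.0002, size=(50, 10))
bright = rng.normal(0.02, 0.002, size=(50, 10))
x = to_subsurface(np.vstack([dark, bright]))
centers, u = fuzzy_c_means(x, n_clusters=2)
labels = u.argmax(axis=1)
```

In the workflow described above, such a routine would be run for each candidate number of clusters from 2 to 20, with validity measures computed on each result to guide the final choice.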
The clusters define the GLaSS OWTs through their means and covariance matrices. While only a subset of bands was used to determine the cluster partitioning, the OWTs were created with the full hyperspectral data. This allows the membership function (the main component of the classification tool that produces the image classification) to be constructed for any band configuration within the range of the hyperspectral data (400-800 nm) and, thus, for any satellite sensor. It is important to note that the clustering process was applied to the spectrally-reduced R_rs(λ) data, resulting in a partitioning of the data. This partitioning was simply a means for sorting, and once sorted, the membership functions could be produced from the full hyperspectral R_rs(λ) data.
There are two different forms of R_rs(λ) used in classification schemes for depicting OWTs: area-normalized R_rs(λ), e.g., [13,23], and un-modified or non-normalized R_rs(λ), e.g., [3]. The rationale behind normalizing is to remove the influence of magnitude on clustering and to stress the spectral shape. The work in [23] showed that coastal turbid waters are susceptible to magnitude shifts based on the concentration of particles of the same type, which are sorted into the same cluster when normalized. After normalization, absorption characteristics have more impact on clustering.
The GLaSS OWTs are represented through both approaches, resulting in two different water type sets: a normalized set and a non-normalized set. For the normalized set, each spectrum was divided by its integral, computed with a trapezoidal numerical integration over a wavelength range from 400-750 nm (Photosynthetically Active Radiation), hereafter called PAR-normalized. Each dataset was analyzed separately for cluster analysis, cluster validity and the development of optical water types through the means and covariance matrices. For both the non-normalized and the normalized data, the optimal number of clusters (and associated optical water types) was six, based on validity functions and a priori user knowledge. These are denoted as 6C and 6CN, respectively (Figure 2).
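The area normalization can be sketched in a few lines. The sketch assumes that each spectrum is divided by its trapezoidal integral over 400-750 nm; the function names and the synthetic spectra are hypothetical, and whether bands outside the integration window are kept is an assumption made here for illustration.

```python
def trapezoid_area(wavelengths, values):
    """Trapezoidal integral of values over wavelengths."""
    return sum(0.5 * (values[i] + values[i + 1])
               * (wavelengths[i + 1] - wavelengths[i])
               for i in range(len(values) - 1))


def area_normalize(wavelengths, spectrum, lo=400.0, hi=750.0):
    """Divide a spectrum by its integral over [lo, hi] nm."""
    idx = [i for i, w in enumerate(wavelengths) if lo <= w <= hi]
    area = trapezoid_area([wavelengths[i] for i in idx],
                          [spectrum[i] for i in idx])
    if area <= 0:
        raise ValueError("integral must be positive to normalize")
    return [v / area for v in spectrum]


# Two synthetic spectra with the same shape but different magnitude.
wl = list(range(400, 751, 3))
spectrum_a = [0.01 + 0.00001 * (w - 400) for w in wl]
spectrum_b = [3.0 * v for v in spectrum_a]
norm_a = area_normalize(wl, spectrum_a)
norm_b = area_normalize(wl, spectrum_b)
```

Both inputs map to the same normalized spectrum, which is the intended effect: magnitude differences are removed and only spectral shape remains for clustering.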

BEAM/SNAP Implementation and the Membership Function
The classification system has been implemented as a processing tool in Brockmann Consult's BEAM software and its successor SNAP and is available for application to satellite imagery (http://www.brockmann-consult.de/cms/web/beam/project, http://step.esa.int/main/toolboxes/snap/). The tool produces class memberships to OWTs (for either configuration) using membership functions, which produce fuzzy partitions for the OWT set.
Membership functions are formed from the mean and covariance matrix for each cluster, and class (OWT) membership values ranging from 0-1 are assigned to observations (pixels) using a two-step fuzzy process. For the first step, the Mahalanobis distance is computed between the observation and the OWT as:
$$Z^2 = (\vec{R}_{rs} - \vec{\mu}_j)^{T}\,\Sigma_j^{-1}\,(\vec{R}_{rs} - \vec{\mu}_j) \qquad (3)$$
where R_rs is the observed remote sensing reflectance vector, µ_j is the mean reflectance vector of the j-th OWT and Σ_j is the covariance matrix for the j-th OWT, whose inverse appears in Equation (3). The Mahalanobis distance is the multivariate equivalent of the standardized random variable Z = (X − M)/S, which is the distance of the univariate random variable X from its mean M normalized by the standard deviation S. In other words, the Mahalanobis distance is a weighted form of the Euclidean distance and is preferable because it incorporates the shape of the distribution of points around the cluster center (i.e., the geometric shape of the point cloud expressed in terms of variance). For the second step, the membership function converts the Mahalanobis distance into a fuzzy membership using a chi-square probability function. In mathematical terms, if the probability distribution of points belonging to the cluster centered at µ_j is normal and R_rs is a member of that population, then Z^2 as defined by Equation (3) has a chi-squared distribution with n degrees of freedom, where n is the dimensionality of R_rs. The likelihood that R_rs is drawn from the j-th population can be defined as:
$$f_j = 1 - F_n(Z^2) \qquad (4)$$
where F_n(Z^2) is the cumulative chi-square distribution function with n degrees of freedom. The fuzzy membership ranges from 0-1 and depicts the degree to which a measured reflectance vector belongs to a given OWT. The value is one if the measured vector is identical to the mean vector of that OWT, and its value diminishes to zero as the Mahalanobis distance increases. This allows for an observation to have memberships to multiple OWTs, although in practice, one or two are typically expressed as present.
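The two-step membership computation can be sketched as follows. The sketch assumes that the membership equals one minus the cumulative chi-square probability of the squared Mahalanobis distance, which matches the stated behaviour (one at the class mean, falling to zero with distance). The function names, the two-band mean and the covariance values are hypothetical; the chi-square tail is evaluated with the standard incomplete-gamma recurrence so that only `math` and NumPy are needed.

```python
import math

import numpy as np


def chi2_sf(x, n):
    """Upper tail 1 - F_n(x) of the chi-square distribution, integer n >= 1."""
    y = 0.5 * x
    if n % 2 == 0:
        a, q = 1.0, math.exp(-y)              # closed form for n = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # closed form for n = 1
    while a < 0.5 * n:                        # step the shape parameter by one
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def owt_membership(rrs, mean, cov):
    """Fuzzy membership of one reflectance vector to one water type."""
    diff = np.asarray(rrs, dtype=float) - np.asarray(mean, dtype=float)
    z2 = float(diff @ np.linalg.solve(cov, diff))  # squared Mahalanobis distance
    return chi2_sf(z2, diff.size)


# Hypothetical two-band water type (mean spectrum and covariance).
mean = np.array([0.004, 0.006])
cov = np.array([[1e-6, 2e-7],
                [2e-7, 4e-6]])
at_mean = owt_membership(mean, mean, cov)
near = owt_membership(mean + np.array([0.001, 0.0]), mean, cov)
far = owt_membership(mean + np.array([0.010, 0.0]), mean, cov)
```

Applying `owt_membership` with each water type's mean and covariance yields one membership per type for a pixel; these values are computed independently and therefore need not sum to one.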

Characteristics of Remote Sensing Data
For inland waters, high backscatter and absorption in both the atmosphere (by land aerosols) and the water (due to high concentrations of optically-active substances) can confuse the coupled atmospheric correction and in-water retrieval software [24]. Furthermore, nearby vegetated land can cause over-radiation of water pixels in the near-infrared (NIR) wavelengths that are used for atmospheric correction. Therefore, we started with radiometrically-corrected MERIS Level-1 top-of-atmosphere (TOA) radiances. These base datasets were processed with different atmospheric correction algorithms, and the output reflectances (with confidence flags) can subsequently be used in the OWT classification system. The confidence flags are quite strict and will, e.g., indicate extreme reflectances caused by sun glint or visibility of the lake bed in optically-shallow waters. The satellite images were processed with and without correction for stray light from adjacent land pixels, using the Improve Contrast over Ocean and Land (ICOL) processor [25]. The images were atmospherically corrected using several processors: Case 2 Regional (C2R, [26]), CoastColour with C2R (CC2R, [27]) and the Modular Inversion and Processing scheme (MIP) [28-30]. This is a subset of the atmospheric correction (AC) methods tested in the GLaSS project [31], because not all AC output was suitable as input for the OWT tool. SCAPE-M (Self-Contained Atmospheric Parameters Estimation from MERIS data, [32]) is not included in the classification analysis, because of known problems with MERIS Band 2, which would have a large influence on the produced classes. Due to missing spectral bands, the output of the Freie Universität Berlin (FUB/WeW) Water Processor [33] cannot be fed into the OWT tool. The standard MERIS Ground Segment (MEGS) Processor is not included because of the extremely low number of valid pixels it produced in atmospheric correction tests in GLaSS (0-14%, depending on the lake [31]). The output of the 6S (Second Simulation of a Satellite Signal in the Solar Spectrum) [34,35] and the ATCOR [36] processors also did not perform well compared to other atmospheric correction processors for any of the selected lakes, likely because we did not have sufficient information to optimize their parameterization, and they were also not included here.

Properties of the GLaSS OWTs
The cluster analysis for each treatment of R_rs(λ) resulted in the creation of the OWTs (Figures 2 and 3). The optimal number of clusters was six in each case; the two results were not directly linked, and the match is coincidental. Tables 2 and 3 show the distributions of class assignments from the cluster analysis for individual in situ lake datasets for each partition. For referencing OWTs within each scheme, we adopt a nomenclature convention of the scheme followed by the OWT. For example, OWT 1 of the non-normalized scheme will be referenced as 6C-1, and OWT 1 of the PAR-normalized scheme will be referenced as 6CN-1, and so forth. For the non-normalized data (Table 2), the number of points per type varies by slightly more than a factor of two at most (84 points in 6C-6 and 199 in 6C-4). Individual lake datasets typically group into two or three clusters. For example, the Finnish lakes are mostly grouped into 6C-2, while the Italian lakes are spread across five different OWTs, but mostly are grouped into 6C-1. The normalized R_rs(λ) cluster distributions change somewhat (Table 3). In some lakes, the data are spread out across more OWTs (e.g., Spanish, New Hampshire (NH) and Finnish lakes), whereas in the case of Lake Peipsi, the data become more concentrated into a single OWT. Still, most of the data sources show just a few dominant types.
The two OWT schemes differ in small, but important ways in how the R_rs(λ) are distributed and in the resulting OWT means. The PAR-normalized treatment effectively removed magnitude effects. For example, 6C-3 to 6C-6 appear similar in shape and inflection characteristics (e.g., variations on peaks at 550 and 710 nm, depressions at 620 nm and chlorophyll absorption between 665 and 680 nm), but with different magnitudes and with a general flattening of spectra towards 6C-6, as seen in the mean spectra (Figure 2). For the PAR-normalized system, the spectra belonging to a given cluster cover a wide range of magnitudes, as seen in the R_rs(λ) when viewed in their non-normalized condition (Figure 3, right column). The same spectra are distributed over several OWTs in the 6C scheme. Conversely, 6C-2 contains low R_rs(λ) typically associated with high absorption, and these R_rs(λ) are distributed over several OWTs in 6CN, offering new potential for discrimination within dark or high absorbing waters.
An underlying assumption and early motivation for OWT approaches in the context of bio-optical algorithms is that data assigned to the same cluster share IOP characteristics [12]. Without a full set of co-measured IOP data, it is not possible to verify whether or not R_rs(λ) associated with the same OWT share similar IOP characteristics. However, the distributions of co-measured Chl-a (all stations), CDOM and TSM concentrations (available for 376 of the stations) provide insight into spectral drivers behind the water types (Tables 4 and 5). The OWT distributions for Chl-a, CDOM and TSM are shown in Figure 4. The trends for mean Chl-a for the 6C scheme show an increase from OWT 1 to OWT 4, while TSM increases across all six OWTs, indicating that 6C-5 and 6C-6 have major inorganic particle contributions. High mean CDOM values are in 6C-2 to 6C-4, with 6C-2 having the highest value, consistent with the lowest overall mean R_rs(λ). These combinations are broadly consistent with progressively elevated mean R_rs(λ), tempered by suppressed spectra in OWTs with high Chl-a and CDOM. For the 6CN scheme, mean Chl-a follows that of the 6C scheme. A notable difference in the distribution for CDOM is evident, with 6CN-3 having the highest mean. TSM also follows the trend for the 6C scheme, with the highest TSM distributions associated with 6CN-6 and consistent with the shape of the mean R_rs(λ).
The general relations between the OWT optical and in-water properties for the two schemes can be summarized as follows: the 6C mean R_rs(λ) retain absolute shape and are influenced by absorption and scattering properties, ranging from a relatively clear type (6C-1) to turbid, highly scattering waters (6C-6). A very dark water type indicative of high absorption (CDOM-dominated) is also represented (6C-2), and 6C-3 to 6C-5 generally are associated with increasing levels of phytoplankton biomass in eutrophic waters. In 6CN, peaks and valleys in the red and NIR region are the most differentiating aspect of shape, with 6CN-1 and 6CN-2 relatively flat in this region, and varying levels of shape and magnitude for 6CN-3 through 6CN-6. The largest peak amplitude in the red/NIR region is exhibited by 6CN-3, consistent with the highest Chl-a levels. For the 6CN scheme, there is no 'dark' water OWT, as in the case of the 6C scheme.

The GLaSS Lakes Case Studies
The GLaSS OWT tool utilizes these schemes with the membership functions to produce mapped products (Figure 5). Mapped products show (1) the fuzzy memberships to each OWT, (2) the dominant water type (determined from the water type with the highest membership) and (3) the membership sum (i.e., the sum of memberships from all water types). Also included are the normalized memberships (not to be confused with the normalized R_rs(λ)). To obtain these, each membership is divided by the membership sum for that pixel, so that the normalized memberships sum to one for every pixel. We tested the tool and the two schemes on a selection of MERIS images from GLaSS target lakes, which included lakes in Estonia, Finland, Italy, The Netherlands and Sweden (Table 6). One of our goals was to assess how OWTs are impacted by atmospheric correction and how they can inform improvements in the application of atmospheric correction schemes to local imagery. To test this, the satellite images were processed with and without correction for stray light from adjacent land pixels (ICOL [25]) and atmospherically corrected using several AC methods available in GLaSS (see Table 7). The mapped distribution of the dominant OWT for each classification scheme was evaluated on a qualitative basis in consultation with local GLaSS lake experts, since we lack match-up validation data. Atmospheric correction with CC2R gave the best results for most lakes, and maps based on these results are discussed in the next sections.
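The per-pixel products listed above can be sketched in a few lines. The function name and the example membership values are hypothetical, and the handling of pixels whose memberships are all zero is an assumption, since the text does not describe it.

```python
def summarize_pixel(memberships):
    """Return (dominant OWT number, membership sum, normalized memberships)."""
    total = sum(memberships)
    # Water types are numbered from 1, so add one to the zero-based index.
    dominant = max(range(len(memberships)), key=memberships.__getitem__) + 1
    if total > 0:
        normalized = [m / total for m in memberships]
    else:
        normalized = [0.0] * len(memberships)  # assumed fallback for empty pixels
    return dominant, total, normalized


# Hypothetical memberships of one pixel to six water types.
pixel = [0.02, 0.60, 0.20, 0.0, 0.0, 0.0]
dominant, total, normalized = summarize_pixel(pixel)
```

A low membership sum indicates a pixel that is poorly represented by any of the water types, which can be a useful cue when judging atmospheric correction quality.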

Italian Lakes: Deep and Clear
Located in the southern Perialpine region, Lake Garda is the largest Italian lake, typically with meso-oligotrophic conditions. The lake can be divided into two sub-basins: a larger area extending with a N-SW orientation with a deep bottom, and a shallower SE basin. Lake Maggiore is the second largest by surface and volume. It is a very narrow elongated lake with a N-S orientation. The deepest basins (max depth 373 m) are situated in the central and northern parts, with shallower bottoms in the south. Lake Maggiore has experienced eutrophication since the 1960s, but since the 1980s, it has stabilized and cleared, and today, it is classified as oligotrophic. For the Italian lakes, following Tables 2 and 3, we expect 6C-1 and 6CN-1 to occur most of the time, in combination with 6CN-2 [37-39]. Seasonal and daily variation can induce some deviations. Figure 6 shows the classified maps for these lakes after ICOL corrections. The invalid or suspect flags were not applied as masks for the maps, in order not to lose much of the data. Without ICOL, large parts of Garda and all of the other lakes in the area are flagged as 'L2R (level-2 reflectance) invalid' or 'L2R suspect', and the waters are classified as 6C-3 and 6C-4, which is clearly not correct. With ICOL processing, many of the pixels in the Italian lakes (except for the larger Lake Garda) are still flagged as 'L2R invalid', but the resulting water types 6C-1, 6C-2 and 6C-3 could be correct for these lakes. However, the percentage presence of 6C-3 is higher than expected, and 6C-4 is assigned to parts of Lake Lugano and Lake Idro, which is not appropriate. The 6CN classifier assigns OWT 6CN-1 to all of the Italian lakes, except for Lake Lugano (6CN-2). This agrees with the known optics of the lakes and with the distribution shown in Table 3. We believe the OWT normalization is appropriate and accurate here.

The Estonian and Dutch Lakes: Shallow-Turbid and Shallow Phytoplankton-Dominated Lakes
Using Tables 2 and 3 as a guide, we expected Lake Peipsi in Estonia to be classified mostly as 6C-3 and 6C-4 and partly as 6C-5. Lake Võrtsjärv has higher sediment and CDOM loads, and class 6C-5 could therefore be expected. In Figure 7, the results of the 6C (left) and 6CN (right) classifications for Lake Peipsi (east on the map) and Lake Võrtsjärv (west) are shown. 6C-2 and 6C-3 are assigned to the northern part of Lake Peipsi; these are lower OWTs than reported in Table 4. However, the three 6C water types that are found in Lake Peipsi have R_rs(λ) spectra that are similar to field measurements, and the spatial distribution of the OWT classes seems credible: the northern part has lower classes than the southern part. The southern part of Lake Peipsi (Lake Pihkva) is richer in sediments than the northern part, and Lake Pihkva is very similar to Lake Võrtsjärv, which is confirmed by the classifications [40,41]. At the time of image acquisition (18 July 2005), there was a large phytoplankton bloom in the northern part of Lake Peipsi. In the beginning of July 2005, the measured Chl-a varied between 14 and 74 mg m−3, with lower values close to shore and higher values in the center, and in August, the bloom was even more intense. This range of Chl-a concentrations complies somewhat with 6C-2 (Table 4). Importantly, in situ measurements of Lake Peipsi from 2008-2011 [42] show average CDOM absorption at 440 nm of 3.1 m−1. This combination of Chl-a and CDOM concentrations (cf. Table 4) explains why 6C-2 was assigned to this image. OWT 6C-3 indicates the presence of somewhat lower Chl-a concentrations for the areas adjacent to the blooms in Lake Peipsi.
OWTs 6CN-4 and 6CN-5 were expected from the normalized classification of the in situ spectra (Table 3). However, 6CN-2 and 6CN-5 were found in the MERIS image for the northern part of Lake Peipsi. Still, this is a reasonable distribution for a period with a phytoplankton bloom (elevated Chl-a) and CDOM concentrations of around 3 m−1 (Table 4). The smaller southern part, Lake Võrtsjärv, is assigned class OWT 6C-4 or 6CN-6. The non-normalized classification appears to work best here, as Võrtsjärv has high Chl-a, TSM and CDOM, which agree with OWT 6C-4, but not with OWT 6CN-6 (which had a lower CDOM range in the training set; Table 4).
The Dutch Lake IJsselmeer and its split-off Lake Markermeer have quite distinct optical properties. Markermeer is shallow (average depth of 3.6 m), and its bottom is characterized by fine, easily resuspendable sediments, with frequently high surface TSM concentrations [24]. In the past, the River IJssel discharged higher nutrient loads into Lake IJsselmeer. Lake IJsselmeer is still optically dominated by phytoplankton and cyanobacterial blooms. As expected from Table 2, Markermeer is classified with a combination of OWT 6C-3 and 6C-4, while the majority of IJsselmeer is assigned to OWT 6C-3 (Figure 8). Near the outflow of the River IJssel, IJsselmeer also contains OWT 6C-2, which could indicate the presence of CDOM in an otherwise relatively clear region, where mussels filter the water. In both lakes, some OWT 6C-1 pixels are found along the shorelines. This is not correct and is not explained by the adjacency effect, which would lead to a higher (not lower) OWT number. However, all of these 6C-1 pixels are indeed flagged as 'L2R invalid' or 'L2R suspect'. With the normalized classifier, 6CN-2 is dominantly assigned to both lakes IJsselmeer and Markermeer, and 6CN-1 occurs, as well. The latter could be incorrect because the associated concentrations are low (Table 5). In that case, an incorrect atmospheric correction would explain the difference between the results in Table 5 and the MERIS-based maps.
According to Table 2, the Finnish lakes are almost always classified as 6C-2, because of their overall low R_rs(λ). After normalization, more differentiation in shape leads to several assigned normalized OWTs (6CN-2, 6CN-4 and one instance of 6CN-5). With the 6C application, most of Päijänne and the central parts of Pääjärvi were classified as 6C-2, which is according to expectation, since these lakes have generally low R_rs(λ) attributable to high CDOM absorption for Pääjärvi and low TSM and particle scattering in Päijänne. Although Lake Vesijärvi is predominantly classified as 6C-3, this is viewed as accurate, because this lake has low CDOM and typically higher TSM and Chl-a and, thus, higher R_rs(λ). The 'L2R suspect' flag was raised at the shores of Lake Päijänne, which indicates that the OWT 6C-4 pixels might require masking due to excessively high R_rs(λ) values. The 6CN classifier assigns 6CN-1 to Lake Päijänne and 6CN-2 to Lake Pääjärvi. Other surrounding lakes are classified as 6CN-1, as well. 6CN-1 was not expected according to Table 3, and Lake Vesijärvi and the small surrounding lakes could be classified into several classes. The question is whether the normalization and classification did not work well here, or if something else is disturbing the results. Because Tables 2 and 3 do represent the differences between Finnish lakes well, the expectation is that the atmospheric correction might not have been suitable for these lakes [43].
The two largest Swedish lakes are Lake Vänern and Lake Vättern. Lake Vättern is very clear, with very low concentrations of Chl-a, TSM and CDOM. Lake Vänern typically has low-concentration chlorophyll blooms and relatively high CDOM absorption of around 1 m−1 (at 440 nm) [40,44]. Both lakes are mainly classified as OWT 6C-2, Lake Vänern also partly as OWT 6C-3 (Figure 10). The small Bay Dättern, in the south of the eastern basin of Lake Vänern, is very turbid, with high concentrations of TSM (>30 g 
m −3 ), Chl-a (>30 mg m −3 in summer) and very high CDOM concentrations (3-10 m −1 ).Bay Dättern is classified as OWT 6C-4, and probably due to the high CDOM absorption, it does not fall into OWT 6C-6.The occurrence of OWT 6C-5 was also expected for this bay, but that class was not found.ICOL processing makes a difference in Lake Vättern, but not in a positive sense: after ICOL processing, Vättern is classified as OWT 6C-3, while OWT 6C-1 would have been more appropriate.Bay Dättern continued to be flagged as 'L2R invalid' after ICOL processing.The result without ICOL processing is therefore preferred.With the normalized classifier, Vättern is actually classified as OWT 6CN-1 and seems the most appropriate for this lake.After normalization, Vänern is assigned OWT 6CN-2 and Dättern OWT 6CN-3, which is correct.For the Swedish lakes, the normalization seems an improvement over the non-normalized classification.

Discussion
The GLaSS lake OWTs were developed by extending the OWTs derived in [3]. In that earlier study, seven OWTs were identified, but they represented coastal marine waters as well as inland freshwater. In the larger, freshwater-only dataset used here, we found six OWTs. Adding more data did not significantly alter the partitioning of reflectance space into clusters. We purposefully omitted spectra associated with floating algae because there were too few instances to derive stable statistics, but we believe that this water type exists and should be incorporated in future versions.
In addition to developing the six OWTs on unmodified spectra (6C), we developed a parallel set using normalized spectra (6CN), which accentuates absorption features by removing scaled magnitude effects solely attributable to concentration levels [45]. The GLaSS tool contains both options for OWT processing. It is yet to be determined which classification scheme (6C or 6CN) is best for a given lake. This will depend on how the classification maps are to be used and on the nature of the lake system. In the Dutch and Estonian lakes (the lakes with a higher sediment load), the 6C classification performed better, while for the Italian and Swedish lakes (mainly the clear Lake Vättern), the 6CN classification provided the best results based on the current analysis. This is consistent with our expectations: the normalized method discriminates 'low reflecting' lakes (either clear blue lakes or brown/yellow CDOM lakes) that would otherwise end up in the same 'low' reflectance OWT under the non-normalized classification. For the Finnish lakes, however, the results from both classification schemes are not very convincing: there are large contrasts between the classified in situ reflectance values (Tables 4 and 5) and the image results. We believe this is caused by atmospheric correction problems over dark waters. For dark absorbing lakes such as those in Finland and Sweden, the FUB processor is known to perform best. However, due to missing spectral bands, the output of this processor cannot be fed into the OWT tool.
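The effect of normalization on spectral magnitude can be illustrated with a minimal sketch. It assumes that normalization divides each spectrum by its trapezoidal integral over the available bands; the band centres, the reflectance values and the function names are hypothetical and are not taken from the GLaSS tool.

```python
def trapezoid_area(wavelengths, values):
    """Integrate values over wavelengths with the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(wavelengths)):
        width = wavelengths[i] - wavelengths[i - 1]
        area += 0.5 * (values[i] + values[i - 1]) * width
    return area


def normalize_spectrum(wavelengths, rrs):
    """Divide a spectrum by its integral so that only spectral shape remains."""
    area = trapezoid_area(wavelengths, rrs)
    if area <= 0:
        raise ValueError("spectrum must have a positive integral")
    return [value / area for value in rrs]


# Hypothetical band centres (nm) and two spectra with the same shape but
# different magnitude; the numbers are illustrative, not measured data.
bands = [443.0, 490.0, 560.0, 665.0, 709.0]
dark = [0.0010, 0.0015, 0.0030, 0.0012, 0.0010]
bright = [value * 8 for value in dark]

dark_norm = normalize_spectrum(bands, dark)
bright_norm = normalize_spectrum(bands, bright)
```

In this sketch the two spectra become indistinguishable after normalization, which is why a shape-based scheme can separate clear blue from CDOM-rich dark waters but cannot, on its own, separate dark from bright waters of the same shape.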
These AC test results highlight a new role for the OWT classification in identifying atmospheric correction problems, as an overall aim of the GLaSS OWT tool is to improve water quality products generated from satellite image processing for any lake system. A general problem with image processing over lakes is that certain AC and bio-optical algorithm retrieval schemes are more suitable for some optical conditions, while other schemes work better for other conditions (e.g., clear versus turbid waters). Selecting the most suitable AC and retrieval algorithm schemes is the most critical decision for producing accurate and meaningful water quality products. The OWT classification provides a mechanism to assist, for example by indicating whether a dark-pixel correction is possible (non-turbid) or not.
Using OWTs to improve the results of atmospheric correction for imagery over lakes would be a new application for these products. Currently, one iteration of atmospheric correction combined with the application of the OWT tool shows the distribution of OWTs over the whole lake. Where a lake contains water types that perform better with different AC schemes, one could imagine an iterative system: standard AC processing is executed and OWTs are computed; then, if certain water types are found (for example, turbid types for which the standard AC is assumed to be in error), AC is re-applied with a more suitable scheme to the pixels assigned to those OWTs. This approach is conceptually similar to the switching scheme between the NIR and SWIR AC models originally suggested by [46] and further tested by [47,48] for MODIS imagery over various coastal locations, as well as for MERIS Case 1 and Case 2 atmospheric correction [49-52]. In the present case, multiple AC schemes could be available for selection, with images re-combined in a manner similar to the algorithm blending method for in-water retrievals, e.g., [3]. This approach would require further testing of different AC schemes with different image scenes containing a variety of OWTs, but it offers an avenue for blending AC schemes within a single image.
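The re-combination step of such an iterative system could look like the following sketch. It is purely illustrative: the assignment of classes 4 and 5 to a "turbid" AC scheme, the single-band pixel values and the function name are assumptions, and a hard per-pixel switch is used here instead of membership-based blending.

```python
def recombine_ac(owt_map, standard_rrs, turbid_rrs, turbid_classes):
    """Pick, per pixel, the output of the AC scheme tied to its water type."""
    combined = []
    for owt, std_pixel, turb_pixel in zip(owt_map, standard_rrs, turbid_rrs):
        # Pixels in a class linked to the alternative scheme take its output;
        # all other pixels keep the standard AC result.
        combined.append(turb_pixel if owt in turbid_classes else std_pixel)
    return combined


# Hypothetical flattened image of four pixels (one band shown for brevity).
owt_map = [2, 2, 4, 5]                    # dominant OWT after the first AC pass
standard = [0.002, 0.003, 0.020, 0.025]   # Rrs from the standard AC scheme
turbid = [0.004, 0.005, 0.012, 0.015]     # Rrs from a turbid-water AC scheme

combined = recombine_ac(owt_map, standard, turbid, turbid_classes={4, 5})
```

A hard switch can leave visible seams at class boundaries; weighting the two AC outputs by fuzzy class membership would be the smoother alternative suggested by the blending analogy in the text.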
One important issue concerns the use of the flags derived from the pre-processing of the scenes and from the atmospheric correction method used. In the scenes analyzed, the 'L2R suspect' and 'L2R invalid' flags removed some misclassified pixels along the lake shores, which could be caused by a remaining adjacency effect or by mixed land-water pixels. These flags can be used to mask out such problematic pixels, but they can also mask large areas of valid pixels. Therefore, to determine the class to which the main part of a lake belongs, we advise against using additional flagging beyond the land- and cloud-related flags. It is also wise to ignore the much higher classes that are sometimes found in the 1-2 pixels along the shores of the lakes.
The classification maps produced by the GLaSS tool also serve as stand-alone products that provide spatial information for understanding the distributions and long-term trends of optical states that have environmental and ecological linkages. ESA's Diversity II Project (http://www.diversity2.info/) used the dominant OWT class as a monthly inland waters product from MERIS imagery for a variety of globally-distributed lakes. Time series of these classification maps indicate how a given system may be trending or changing, as expressed by changes in OWT. Frequency maps of OWTs can be generated and are useful for understanding the distribution of the dominant water types for a given lake, leading to a first-order indication of the types of AC schemes and retrieval algorithms that may be needed for processing, e.g., [3]. There is much interest in using remote sensing to support reporting for the European Water Framework Directive and the U.S. Clean Water Act [53,54]. Frequent OWT maps can provide insight into the state and seasonal patterns that occur in lakes. Longer-term or unexpected changes can be a reason to perform full processing and to take additional samples for detailed analysis.
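As a rough illustration of how frequency maps and a dominant-class product might be derived from a stack of classified scenes, consider the sketch below. The data layout (one flattened list of class labels per scene, with None for masked pixels) and the function names are assumptions made for this example, not the format used by the GLaSS tool or by Diversity II.

```python
from collections import Counter


def owt_frequency(class_maps):
    """Return, per pixel, the fraction of valid scenes assigned to each class."""
    n_pixels = len(class_maps[0])
    frequencies = []
    for pixel in range(n_pixels):
        # Masked observations (None) are excluded from the statistics.
        labels = [scene[pixel] for scene in class_maps if scene[pixel] is not None]
        counts = Counter(labels)
        total = len(labels)
        frequencies.append({owt: n / total for owt, n in counts.items()} if total else {})
    return frequencies


def dominant_owt(frequencies):
    """Return the most frequent class per pixel (None when no valid data)."""
    return [max(freq, key=freq.get) if freq else None for freq in frequencies]


# Four hypothetical scenes of a three-pixel lake; the last pixel is always masked.
scenes = [
    [2, 3, None],
    [2, 4, None],
    [3, 4, None],
    [2, 4, None],
]
freq = owt_frequency(scenes)
dominant = dominant_owt(freq)
```

Ties between classes are resolved here by whichever class was seen first; an operational product would need an explicit rule for that case.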
The 6C and 6CN classification methods analyzed here are similar in their implementation, but represent different approaches in classification and interpretation. The non-normalized approach (6C) is based on absolute Rrs(λ) values and thus can differentiate 'dark' from 'bright' waters more effectively than the normalized scheme (6CN), which essentially removes magnitude effects attributable to particle scattering. The normalized class partitioning is driven by spectral shape effects, largely from spectrally varying absorption properties. These considerations may be relevant to how the classification tool is ultimately used for a lake, such as to determine the most suitable tuning of a bio-optical algorithm or for general optical assessment.
Water classification is a fairly recent and evolving discipline. Classification schemes exist that use normalized and non-normalized Rrs spectra, but, to our knowledge, there has been no attempt to connect the two approaches. So far, each approach has been treated in isolation, and each has advantages and disadvantages. However, it is possible to unify the two schemes. Figure 11 shows the combined system as a matrix in which the normalized and non-normalized input remote sensing reflectance (Rrs) spectra are separated into their respective clusters using the six-class scheme for both, resulting in 36 potential combinations. Based on our results, 20 of the 36 possibilities are encountered. One approach for integrating the schemes would be to make one classification system subsidiary to the other. Under this approach, fuzzy memberships would be derived for one scheme as the master factor, and a second sorting could take place according to the dominant OWT of the subsidiary scheme. This avoids intermingling fuzzy memberships, which is not yet feasible, but adds a new layer of classification by further discriminating shapes within a class. As an example, the integrated scheme would use the fuzzy memberships of the non-normalized classes as the main fuzzy value for pixel weighting (if serving that function), and the dominant class of the normalized data as a subset variation of the non-normalized class. Theoretically, each non-normalized class has six potential normalized assignments when combining 6C and 6CN. This approach has not been tested, but could be a way to take full advantage of the classification tool. Developing these concepts further is beyond the scope of this study and remains a gap that future work should address.
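The 6 × 6 matrix view can be sketched as a simple cross-tabulation of the dominant 6C and 6CN class of each spectrum or pixel. The labels below are invented for illustration and the function names are hypothetical; the sketch only shows the bookkeeping, not the untested integrated scheme itself.

```python
def cross_tabulate(labels_6c, labels_6cn, n_classes=6):
    """Count spectra for every (6C, 6CN) pair of dominant classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for c, cn in zip(labels_6c, labels_6cn):
        # Classes are numbered from 1, list indices from 0.
        matrix[c - 1][cn - 1] += 1
    return matrix


def occupied_cells(matrix):
    """Number of (6C, 6CN) combinations that occur at least once."""
    return sum(1 for row in matrix for count in row if count > 0)


def combined_label(c, cn):
    """Master class (6C) with the subsidiary class (6CN) as a sub-type."""
    return f"6C-{c}/6CN-{cn}"


# Hypothetical dominant classes for six spectra.
labels_6c = [2, 2, 3, 4, 4, 2]
labels_6cn = [1, 2, 2, 6, 6, 1]

matrix = cross_tabulate(labels_6c, labels_6cn)
n_occupied = occupied_cells(matrix)
labels = [combined_label(c, cn) for c, cn in zip(labels_6c, labels_6cn)]
```

Applied to a full dataset, the count of occupied cells is the quantity reported in the text as 20 of 36 possible combinations.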
The selection of algorithms optimized for local conditions can also be facilitated by using the OWT approach to direct algorithm selection and output blending. As demonstrated by [3], the best-performing algorithm for a particular water type can be determined a priori through algorithm analysis. During operational classification, the class memberships can then be used to weight retrievals from multiple algorithms into a blended product. This assumes that algorithm performance for specific OWTs is globally representative, an assumption that should be checked for a given lake system: local optical drivers or specific inherent optical properties (sIOPs) may deviate from global behavior or conditions.
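A membership-weighted blend for a single pixel can be sketched as follows. This is a minimal illustration assuming a normalized weighted mean; the membership values, the per-class chlorophyll retrievals and the function name are hypothetical, and the exact weighting used in [3] may differ.

```python
def blend_retrievals(memberships, retrievals):
    """Membership-weighted mean of per-class algorithm retrievals for one pixel."""
    total = sum(memberships.values())
    if total == 0:
        return None  # pixel belongs to no class; leave it unblended
    weighted = sum(memberships[owt] * retrievals[owt] for owt in memberships)
    return weighted / total


# Hypothetical fuzzy memberships of one pixel to OWTs 2, 3 and 4, and the
# chlorophyll-a retrieval (mg m^-3) of the algorithm assigned to each class.
memberships = {2: 0.5, 3: 0.25, 4: 0.25}
chl_by_class = {2: 4.0, 3: 10.0, 4: 30.0}

chl_blended = blend_retrievals(memberships, chl_by_class)  # 12.0
```

Dividing by the sum of memberships keeps the result within the range of the individual retrievals even when the memberships do not sum to one.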

Conclusions
Optical water type classification is a developing research topic in aquatic remote sensing. It has evolved from Jerlov water types as descriptors of marine waters to a variety of marine and freshwater schemes designed for use with remote sensing image applications. We have developed a new tool specifically for classifying lake remote sensing images, now available in the BEAM software. The development of the tool is an outgrowth of the method presented by [3], differentiated by new data and new scheme configurations. Optical data from different lakes across Europe, the U.S. and China were merged, covering a wide range of environmental conditions, including dark lake waters, turbid waters and highly eutrophic waters with cyanobacteria blooms. We have re-developed lake optical water types with both a spectral-normalized and a non-normalized treatment, resulting in two separate, but linked schemes. The resulting water types in each scheme were described by in-water concentrations of chlorophyll-a, CDOM and total suspended matter. While the schemes differ at a fundamental level, they serve the same roles for downstream applications, which in the past have included using them as intermediary products for optimizing bio-optical algorithm selection and as stand-alone products for supporting biogeochemical and biodiversity system analysis. We have focused our research on the development of the schemes within the tool and its use with remote sensing imagery from the MERIS sensor for a variety of European lakes as case studies with different optical conditions. We found that each scheme had merits for generating mapped water type products, depending on the lake. While the images used are a small subset of the conditions likely to be found globally, the analysis is useful as a means of contrasting the different approaches for different lake conditions. Choosing the best scheme for any system requires fundamental a priori knowledge of the optical drivers, which is also necessary for interpreting the images. For example, Lake Vättern in Sweden showed better results with the normalized scheme, while other lakes, such as Lake Võrtsjärv in Estonia and Lakes Markermeer and IJsselmeer in the Netherlands, showed better performance with the non-normalized approach. These approaches were assessed by how accurately we believed the classification maps depicted the underlying optics.
Although a prime use of classification maps is for bio-optical algorithm application, we did not set out to test or develop the tool with algorithms, as was done in [3]. There is a wide variety of algorithms and of intended purposes for them, and this type of evaluation with the two-scheme approach was beyond the scope of this work. However, we presented a new use for classification maps: guiding atmospheric correction schemes. As the tool operates directly on the spectral Rrs, the atmospheric correction scheme will impact the effectiveness of the classification tool. We tested several different atmospheric correction schemes with each image, producing different classification maps for each test image. The classification results provided feedback on the performance of the atmospheric correction scheme, and we believe that the classification map interpretations are useful in assessing the performance of atmospheric correction when in situ match-up data are not available, which is generally the case. Thus, another use of classification maps is for atmospheric correction assessment and possibly for selecting and blending schemes as well, although we only speculate on how this may be done.
The two scheme variations presented here (spectral-normalized and non-normalized) represent the current options available to developers and users of images produced from optical water type classification, regardless of origin. We have shown how these schemes differ spectrally and in use, as well as in in-water characterizations. We have also shown how they are linked through a classification matrix and have speculated on the potential to unify the two schemes, which could provide a way to combine the advantages of each. This is an evolving area of research, and the guidelines and uses of optical water type schemes are still being explored and discovered.

Figure 2. OWT mean spectra for the non-normalized (left) and PAR-normalized (right) clusters. (Normalization as explained in text.) Open circles indicate bands used in the clustering.

Figure 3. Distribution of individual Rrs(λ) across the clusters (OWT-1, top; OWT-6, bottom) for the non-normalized (left column) and PAR-normalized (middle column) schemes. The right column shows the same data as the middle column, but not normalized.

Table 1. In situ data from various Global Lakes Sentinel Services (GLaSS) partners and advisory board member Yunlin Zhang. (VIS/NIR, in the visible and near-infrared range; CDOM, colored dissolved organic matter; TSM, total suspended matter; Chl, chlorophyll-a.)

Table 2. Cluster distribution using the six-class (6C) classification scheme.

Table 3. Cluster distribution using the six-class normalized (6CN) classification scheme.

Table 6. List of GLaSS lakes for test case application by country.

Table 7. Atmospheric correction method used in each image tested.