Automatic Extraction of Open Water Using Imagery of Landsat Series

Open surface freshwater is an important resource for terrestrial ecosystems. However, climate change, seasonal precipitation cycling, and anthropogenic activities add high variability to its availability. Thus, timely and accurate mapping of open surface water is necessary. In this study, a methodology based on the concept of spatial autocorrelation was developed for automatic water extraction from Landsat series images, using Taihu Lake in south-eastern China as an example. The results show that this method has great potential to extract continuous open surface water automatically, even when the water surface is covered by floating vegetation or algal blooms. The results also indicate that the second shortwave-infrared band (SWIR2) performs best for water extraction when water is turbid or covered by surficial vegetation. The near-infrared band (NIR), first shortwave-infrared band (SWIR1), and SWIR2 have consistent extraction success when the water surface is not covered by vegetation. Low filter image processing greatly overestimated the area of extracted water bodies, and clouds and salt and pepper noise have a large impact on water extraction using the methods developed in this study.


Introduction
Open surface freshwater bodies, including lakes, reservoirs, rivers, streams, and ponds, are a significant sink and source of CO₂ for aquatic and terrestrial ecosystems, important resources for agricultural, aquacultural, industrial, and residential use, and are integral to social economics, infrastructure stability, and emergency preparedness [1][2][3]. Global climate change, seasonal precipitation, and anthropogenic activities lead to various changes (including predictable seasonal cycles or episodic variability) in open surface water that can substantially influence environmental security, ecological processes, and related ecosystem services [1,4,5]. Therefore, timely, frequent, and precise information on the spatial distribution and temporal change of open water is the foundation of sustainable water resource management, emergency response (flood or drought events), water-related disease control (e.g., malaria), economic development, and environment protection [4][5][6][7][8].
Remote sensing images, recording long-term spatial information of the earth's surface, have proven their potential for tracking land cover change and ecological processes [9]. Maps extracted from remotely sensed images, documenting spatial distribution, temporal dynamics, and long-term trends of surface water bodies, are useful to derive their spatial and temporal patterns [2,8]. Among all the remote sensing platforms, Light Detection and Ranging (LiDAR) provides the most accurate open water maps at regional scales [3], but are limited in their ability to map surface water bodies at global LISA may have great potential for inland water mapping with high spatial autocorrelation within water bodies in contrast to heterogeneous land cover types.
Therefore, this study aims to explore the potential of local indicators of spatial association (LISA) for inland water mapping under various environmental conditions, spectral domains, and image qualities. The specific objectives are: (1) to develop a new method based on the theory of LISA for inland water mapping from Landsat imagery, (2) to evaluate the potential of the new method for water extraction in various environmental conditions, (3) to compare the variation in water extraction from different Landsat bands, and (4) to test the influences of low filter image processing on surface water extraction.

Water 2020, 12, 1928

Study Area
Taihu Lake, the third largest freshwater lake in China (area ~2428 km² including islands or 2338 km² without islands), is a typical shallow inland lake with an average depth of 1.9 m (maximum depth about 2.6 m) [37][38][39][40]. It is located in the lower reaches of the Yangtze River (Figure 1), on the southern Yangtze River Delta [41,42]. Taihu Lake falls in the East Asia Monsoon climate region, with an annual mean temperature of 14.0-16.2 °C and mean annual precipitation of 1000-1400 mm [43][44][45]. Taihu Lake has a very complicated river and channel network with 13 in-flowing rivers (mean annual runoff into the river is 4100 m³) and one outflowing river [44,46]. The drainage basin of Taihu Lake covers some of the most developed regions of China, including Jiangsu, Zhejiang, and Anhui provinces and Shanghai municipality, which contribute about 10% of Gross Domestic Product (GDP) with only 0.4% of China's territory [47]. Rapid urbanization and industrialization since the 1980s, as well as liberal fertilizer use, have resulted in an enormous amount of waste water and sewage discharge into Taihu Lake [48]. As a result, Taihu Lake has experienced serious eutrophication and resultant algal blooms since the 1990s [41,49,50]. Water in Taihu Lake is consistently turbid all year round, with average and maximum concentrations of suspended sediment over 50 and 300 mg L⁻¹, respectively [43,50,51]. Therefore, the optical properties of Taihu Lake are very complex, varying substantially by season and location; even within 24 h at the same location, variation can be high due to sediment resuspension and algal blooms [52]. This makes it a suitable area to test our newly developed method for open water extraction.
Taihu Lake is often divided into six sections (Figure 1) based on shoreline geometry, human activity, and environmental factors [53,54]. Section 1, including Meiliang Bay and Zhushan Bay, is a cyanobacteria-dominated region due to high concentrations of nitrogen and phosphorus. Section 2 (Gongshan Bay) is home to a large number of tributaries of the Yangtze River that have flowed into the lake since 2001. Section 3, including Zhenhu Bay, Guangfu Bay, Xukou Bay, and Dongshan Bay, is a macrophyte-dominated region where most aquatic vegetation is found. Section 4 (East Bay) is a submerged vegetation region with good water quality and rich fishery production. Section 5 is a region where floating-leaf vegetation is distributed. Finally, Section 6 is a cyanobacteria-dominated region [54,55].

Methodology
Processing for this study included pre-processing of Landsat single bands, analysis of spatial autocorrelation, and selection of significant clusters. This final step meant identifying low-value clusters (low-low clusters) for the near-infrared band (NIR), first shortwave-infrared band (SWIR1), and second shortwave-infrared band (SWIR2), but high-value clusters (high-high clusters) for the coastal band and the three visible bands. The selected clusters were then post-processed into open surface water (Figure 2). Each single Landsat band, including the coastal band (i.e., for Landsat OLI images only), blue, green, red, NIR, SWIR1, and SWIR2 bands, was tested separately for water extraction using the methods developed in this study. The specific steps are described here (see the supplementary materials for the Python script of all steps).

Pre-Processing of Landsat Single Bands
Pre-processing included clipping the imagery to the study area, converting rasters to a NumPy array, and standardizing the NumPy array (Equation (1)).

$$X = \frac{R - \bar{R}}{\operatorname{std}(R)} \qquad (1)$$

where $X$ is the standardized reflectance, $R$ is the reflectance of each pixel for one Landsat single band, $\bar{R}$ is the mean, and $\operatorname{std}(R)$ is the standard deviation of all pixels for one Landsat single band.

Figure 2 (abbreviations): NIR = near-infrared band; SWIR1 = first shortwave infrared band; SWIR2 = second shortwave infrared band; Spatial clusters = low-low clusters (i.e., low value pixels significantly cluster together spatially) or high-high clusters (i.e., high value pixels significantly cluster together spatially).

Spatial Autocorrelation for the Standardized NumPy Array
The surrounding eight pixels ($X_j$) were selected as spatial neighbors for the pixel $X_i$ (Figure 2) to calculate the spatial autocorrelation index $I_i$ (Equation (2)), with an equal spatial weight ($W_{ij}$) of 1/8 = 0.125 assigned to each of the eight neighbors (Figure 2). Equation (2) is based on the concept of Moran's Index.

$$I_i = X_i \sum_{j=1}^{n} W_{ij} X_j \qquad (2)$$

where $I_i$ is the spatial autocorrelation index that refers to the association (including statistical strength and direction of the association) between a given pixel and its spatial neighbors, $X_i$ is the standardized reflectance for one pixel of a Landsat single band, $X_j$ is a spatial neighbor of $X_i$, $W_{ij}$ is the spatial weight, and $n$ is the total number of spatial neighbors (eight in this case).

The test statistic $Z$ for the significance test (standard normal distribution) of the spatial autocorrelation index $I_i$ is calculated by Equation (3), which is also based on the concept of Moran's Index. In Equation (3), $Z$ is the test statistic from the standard normal distribution and $I_i$ is the spatial autocorrelation index from Equation (2); $E_I$, $A$, and $B$ are calculated from Equations (4), (5), and (6), respectively. In Equation (4), $N$ is the total number of pixels for the single Landsat band. In Equations (5) and (6), $W_{ij}$ is the spatial weight for each neighbor, $N$ is the total number of pixels for the single Landsat band, and $b$ is calculated using Equation (7). In Equation (7), $X$ is the standardized reflectance of each pixel for one Landsat single band, and $\bar{X}$ is the average value of the standardized array calculated by Equation (1). The p-value associated with the test statistic $Z$ was calculated using the Python package scipy.stats.
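The displayed forms of Equations (3) to (7) are not reproduced in this text. As a hedged reference only, one widely used randomization form of the significance test for a local Moran statistic computed on standardized values is sketched below. The symbols $w_{i(2)}$ and $w_{i(kh)}$ are introduced here for illustration, and how these terms map onto the $A$ and $B$ of Equations (5) and (6) is an assumption that should be checked against the original equations.

```latex
\begin{aligned}
Z &= \frac{I_i - E_I}{\sqrt{\operatorname{Var}(I_i)}}, \qquad
E_I = -\frac{\sum_{j} W_{ij}}{N-1},\\
\operatorname{Var}(I_i) &= \frac{w_{i(2)}\,(N-b)}{N-1}
  + \frac{w_{i(kh)}\,(2b-N)}{(N-1)(N-2)} - E_I^{2},\\
w_{i(2)} &= \sum_{j} W_{ij}^{2}, \qquad
w_{i(kh)} = \sum_{k}\sum_{h \neq k} W_{ik} W_{ih},\\
b &= \frac{N \sum_{i=1}^{N} (X_i-\bar{X})^{4}}
          {\left(\sum_{i=1}^{N} (X_i-\bar{X})^{2}\right)^{2}}.
\end{aligned}
```

With eight equal weights of 0.125, $\sum_j W_{ij} = 1$, $w_{i(2)} = 0.125$, and $w_{i(kh)} = 0.875$, so for an image with many pixels $E_I$ is close to 0 and the variance is close to 0.125.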
Clusters that refer to open surface water were selected from the standardized NumPy array: low-low clusters for the NIR, SWIR1, and SWIR2 bands, and high-high clusters for the coastal band and the three visible bands. In a low-low cluster, the first "low" means that the value of $X_i$ is relatively low within the NumPy array, the second "low" means that the values of its spatial neighbors are relatively low, and the low values are significantly clustered in space. Selection was based on three criteria: (1) $I_i$ > 0 (i.e., either high values or low values are clustered together); (2) the value in the standardized array is less than zero for NIR, SWIR1, and SWIR2 (i.e., low-value clusters are separated from high-value clusters), whereas for the visible bands and the coastal band from Landsat OLI imagery the standardized value must be larger than zero, because the reflectance of water in those bands is higher than that of other land cover types; and (3) p-value ≤ 0.05 (i.e., the spatial autocorrelation index is statistically significant).
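The standardization, the eight-neighbor index, and the three selection criteria above can be sketched in a few lines of NumPy and SciPy. This is a minimal illustration, not the authors' supplementary script: the function names, the edge handling, and the variance used for the significance test (a common randomization form for weights that sum to one) are assumptions.

```python
import numpy as np
from scipy import ndimage, stats


def standardize(band):
    """Equation (1): z-score the reflectance of one band."""
    band = np.asarray(band, dtype=float)
    return (band - band.mean()) / band.std()


def local_index(x):
    """Equation (2): I_i = X_i * sum_j W_ij * X_j with W_ij = 1/8."""
    kernel = np.full((3, 3), 0.125)
    kernel[1, 1] = 0.0  # the centre pixel is not its own neighbour
    # Assumption: edge pixels reuse the nearest pixel value as a neighbour.
    lag = ndimage.convolve(x, kernel, mode="nearest")
    return x * lag


def p_values(x, index):
    """Two-sided p-values from a normal approximation.

    Assumption: this randomization variance stands in for the source's
    Equations (3) to (7), which are not reproduced in the text.
    """
    n = x.size
    expected = -1.0 / (n - 1)                   # the eight weights sum to 1
    b = n * np.sum(x**4) / np.sum(x**2) ** 2    # kurtosis-type term
    w2 = 8 * 0.125**2                           # sum of squared weights
    wkh = 1.0 - w2                              # sum over k != h of W_ik * W_ih
    var = (w2 * (n - b) / (n - 1)
           + wkh * (2 * b - n) / ((n - 1) * (n - 2))
           - expected**2)
    z = (index - expected) / np.sqrt(var)
    return 2.0 * stats.norm.sf(np.abs(z))


def water_mask(band, low_is_water=True, alpha=0.05):
    """Apply the three criteria; use low_is_water=False for visible bands."""
    x = standardize(band)
    index = local_index(x)
    p = p_values(x, index)
    side = x < 0 if low_is_water else x > 0
    return (index > 0) & side & (p <= alpha)
```

Calling `water_mask(swir2_array)` on a clipped SWIR2 array would return a Boolean array of candidate low-low water pixels; `swir2_array` is a hypothetical input name.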

Post-Processing for Open Surface Water Extraction
The post-processing steps included converting the reassigned NumPy array to a raster, defining the coordinate system for the converted raster, converting the raster to polygons, eliminating small polygons (area less than 40,000 m²) by merging them into the surrounding features, selecting polygons of open surface water, and smoothing the polygons. The minimum polygon area threshold of 40,000 m² was selected to reduce the effects of the fish, crab, and shrimp farms near Taihu Lake on water extraction, because those farms are smaller than 40,000 m² in the study area.
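The small-polygon elimination step can be approximated on the raster itself before vectorization. The sketch below is a raster analogue under stated assumptions, not the polygon-based tool chain described above: it assumes 30 m pixels, 4-connected patches, and hypothetical function names, and it merges small patches of either class into their surroundings.

```python
import numpy as np
from scipy import ndimage

PIXEL_AREA_M2 = 30 * 30   # assumption: 30 m Landsat pixels
MIN_AREA_M2 = 40_000      # area threshold stated in the text


def drop_small_patches(mask, min_area_m2=MIN_AREA_M2, pixel_area_m2=PIXEL_AREA_M2):
    """Set connected True patches smaller than the threshold to False."""
    mask = np.asarray(mask, dtype=bool)
    labels, _ = ndimage.label(mask)          # 4-connected patches by default
    counts = np.bincount(labels.ravel())     # pixel count per label
    keep = counts * pixel_area_m2 >= min_area_m2
    keep[0] = False                          # label 0 is the background
    return keep[labels]


def eliminate_small_polygons(mask):
    """Merge small patches of either class into the surrounding class."""
    cleaned = drop_small_patches(mask)       # small water patches become land
    return ~drop_small_patches(~cleaned)     # small land holes become water
```

With these assumptions, a patch needs at least 45 pixels (40,500 m²) to survive, since 44 pixels cover only 39,600 m².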

The Effects of Low Filter Image Process on Water Extraction
To mitigate noise from image quality, many studies introduce low filters to process images, thereby improving feature accuracy and reducing illogical classification results from open water extraction [9,30]. The texture of open water is smoother than that of other land cover types in Landsat imagery, and the low filter process smooths open water features even more, which effectively reduces the influence of image noise (i.e., salt and pepper effects in imagery). However, the low filter process might change the morphology and area of open water extracted from remotely sensed images. Therefore, the extracted open water area was compared between low filtered and original images.
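As an illustration of what a low filter does to a single band, the sketch below applies a moving-window mean. The 3 × 3 window, the edge padding, and the function name are assumptions, since the text does not state the kernel that was used.

```python
import numpy as np
from scipy import ndimage


def low_filter(band, size=3):
    """Replace each pixel with the mean of its size x size window."""
    band = np.asarray(band, dtype=float)
    # Assumption: edges are padded by repeating the nearest pixel.
    return ndimage.uniform_filter(band, size=size, mode="nearest")
```

An isolated dark pixel surrounded by brighter pixels is pulled towards its neighbors, which is how the filter suppresses salt and pepper noise; the same averaging also blurs the boundary between water and land.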

Time Series Analysis and Segmented Linear Regression for Climate and Survey Data
To capture accurate annual trends in precipitation, water level, and pondage data, seasonal effects were reduced using time series analysis (TSA) with the package "xts" (Extensible Time Series) in R (a software environment initially written by Robert Gentleman and Ross Ihaka, Statistics Department, The University of Auckland, New Zealand, and modeled on the S language developed at Bell Laboratories (Lucent Technologies), New Jersey, USA, by John Chambers and colleagues). TSA separates regular seasonal variation and the inter-annual trend from the original temporal data (i.e., monthly data with a frequency of 12 per year). The resulting annual trends were then analyzed using the segmented linear relationship in the R package "segmented". The segmented linear relationship, also called the broken-line relationship, is widely used in ecological research [56]. In this study, it was used to analyze thresholds for the inter-annual dynamics of precipitation, water level, and pondage data, as well as the temporal change of the extracted Taihu area.
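The two ideas in this step, removing a 12-month season and fitting a broken line, can be sketched in Python for readers who do not use R. This is not the xts or segmented implementation: the centered moving average and the grid search over one breakpoint are simplifying assumptions, and the function names are hypothetical.

```python
import numpy as np


def annual_trend(monthly):
    """Centered 2x12 moving average that cancels a regular 12-month season."""
    weights = np.r_[0.5, np.ones(11), 0.5] / 12.0
    return np.convolve(np.asarray(monthly, dtype=float), weights, mode="valid")


def fit_broken_line(t, y):
    """Grid-search one breakpoint for y = a + b*t + c*max(t - tau, 0)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for tau in t[2:-2]:  # keep a few points on each side of the break
        design = np.column_stack([np.ones_like(t), t, np.maximum(t - tau, 0.0)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((y - design @ coef) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(tau), coef)
    _, tau, coef = best
    # Returns the breakpoint, the slope before it, and the slope after it.
    return tau, float(coef[1]), float(coef[1] + coef[2])
```

The trend returned by `annual_trend` is 12 samples shorter than the input because the window needs six months on each side. The breakpoint returned by `fit_broken_line` plays the role of the threshold year in the segmented relationship.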

Open Water Extraction from Different Landsat Bands
Among the seven bands in the Landsat series, open water extraction from SWIR2 and SWIR1 is more stable than from NIR, the visible bands, and the coastal band from OLI imagery (Figure 3). NIR, SWIR1, and SWIR2 bands perform better for open surface water extraction (Figure 4; more example extractions in Supplementary A in the supplementary materials), especially when the water was shallow and turbid. In 1984, when the water in Taihu Lake was clearer than previous years, most of the lake area was extracted from blue, green, and red bands, excluding the shallow area in eastern Taihu Lake and surrounding wetlands (Figure 4a-c). Since Taihu Lake became highly turbid after this, the visible Landsat bands were not suitable for open water extraction (Figure 4g-i). The extracted area from visible and coastal bands in Landsat OLI varied greatly under different water conditions (Figure 3). This is consistent with previous findings that NIR, SWIR1, and SWIR2 (i.e., TM bands 4, 5, and 7) were typically used for open water extraction because water absorbs radiation more completely in those bands than other land cover types do [6]. The reflectance of clear water decreased with increasing wavelength in the optical domain (450-2500 nm) [5]. Therefore, reflectance of water in optical bands with long wavelengths (NIR, SWIR1, and SWIR2) had lower values than other land cover types, which means water bodies were low-value clusters in those bands. The methodology developed based on the concept of spatial autocorrelation is very suitable for detecting low-value clusters (i.e., low-low clusters or cold spots in spatial autocorrelation results). This method has greater potential to eliminate illogical results than other pixel-based classification methods because it emphasizes the autocorrelation of each pixel with its neighbors. Gong's (2017) results also show that spatial weights (i.e., an important parameter for local spatial autocorrelation models) would improve the temporal consistency of remote sensing classification [36].
However, water extraction using the methods from this study was influenced by other land cover types with low reflectance values (e.g., mountain and building shadows). Frazier and Page found that TM band 5 (SWIR1) performed best for open water extraction, while TM bands 4 (NIR) and 7 (SWIR2) were less accurate [57]. Our results indicate that NIR was less accurate than SWIR1 and SWIR2 for open water extraction because it was influenced by "built-up noise" (Figure 4j). Therefore, extraction results from NIR overestimate the actual open water area in some situations (Figure 3). The reason could be that "water-leaving radiance" contributed by NIR was significantly higher in the consistently turbid waters of Taihu Lake than in clear water [58]. As with NIR, all three visible bands were affected by built-up noise during open water extraction (Figure 4g-i). Previous research indicates that normalized differences between green and SWIR1 (the latter equivalent to TM band 5, OLI band 6, or MODIS (Moderate Resolution Imaging Spectroradiometer) band 6) are best for open water extraction in various inundation extremes based on threshold algorithms [5,6], but are greatly influenced by dark surface noise in urban areas [27,28]. According to the results of this study, when open water is extracted with the normalized difference between green and SWIR1, the shadow noise in built-up areas comes from the green band rather than from SWIR1.
Water extraction from both SWIR1 and SWIR2 was slightly influenced by mountain shadows, and shadow effects on SWIR2 were stronger than on SWIR1 (Figure 4k,l). Thus, our results indicate that SWIR1 is superior to SWIR2 and NIR for water extraction, which is consistent with Frazier and Page's findings [57]. This is especially true when water is turbid and mountain shadows have similar spectral signals in Landsat imagery. However, SWIR2 has greater potential than SWIR1 and NIR to extract eutrophic open water with algal blooms (Figure 5). Even though the extracted area of Taihu Lake from SWIR2 was similar to that from SWIR1 (Figure 3), water extraction from SWIR1 fluctuated more than that from SWIR2 because of the severe algal blooms in Taihu Lake (Figures 3 and 5e). Part of the open water covered by the algal bloom was excluded from the SWIR1 extraction (Figure 5b,e), while severe algal blooms and eutrophic water did not affect the water extraction process using SWIR2 based on the methodology in this study (Figure 5c,f). Using the NIR band, only open water not covered by algal blooms was extracted, while water covered by algal blooms was excluded (Figure 5a,d). Therefore, NIR shows much less potential for open water extraction in Taihu Lake because the extracted water area from NIR varies among situations due to eutrophication and turbidity (Figure 3). Taking the highly turbid water and severe algal blooms in Taihu Lake into consideration, SWIR2 is preferred for open water extraction. In this study, shadow effects on open water extraction from SWIR2 were controlled by a simple threshold using SWIR2 and the red band, (SWIR2 − Red)/(SWIR2 + Red) < −0.2; after exploring the spectral characteristics of water bodies and mountain shadows in the Landsat series, this threshold sufficiently reduced noise from mountain shadows.
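The shadow-control threshold can be written as a small mask operation. This is a sketch under stated assumptions: the function names are hypothetical, pixels where both bands are zero are treated as non-water, and pixels with an index below −0.2 are assumed to be the ones retained as water, which the text implies but does not state explicitly.

```python
import numpy as np


def shadow_index(swir2, red):
    """Normalized difference (SWIR2 - Red) / (SWIR2 + Red), NaN where undefined."""
    swir2 = np.asarray(swir2, dtype=float)
    red = np.asarray(red, dtype=float)
    total = swir2 + red
    out = np.full(total.shape, np.nan)
    np.divide(swir2 - red, total, out=out, where=total != 0)
    return out


def remove_shadows(water_mask, swir2, red, threshold=-0.2):
    """Keep candidate water pixels whose index is below the threshold."""
    index = shadow_index(swir2, red)
    # NaN compares as False, so undefined pixels are dropped.
    return np.asarray(water_mask, dtype=bool) & (index < threshold)
```

Turbid water reflects noticeably more in the red band than in SWIR2, so its index is strongly negative, whereas shadowed land is dark in both bands and its index stays closer to zero.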
The accuracy assessment used Sentinel-2A images acquired on 2017-05-29, 2017-12-25, and 2019-11-15 as the reference, paired with water extraction from Landsat OLI images acquired on 2017-05-27, 2017-12-21, and 2019-11-09, respectively. One hundred points were selected randomly in each image within the study area for accuracy assessment. The overall accuracy was 98% (more information about the accuracy assessment, the error matrix, and the comparison between water extraction from Landsat OLI images and Sentinel-2A images is presented in Supplementary B of the supplementary materials).

The Effects of Low Filter Image Processing on Water Extraction
The area of extracted open water from low filtered SWIR2 imagery is overestimated compared to that from the original SWIR2 band for Landsat TM, ETM+, and OLI imagery (Figure 6a). The extraction results from low filtered and original SWIR2 are significantly different for TM, ETM+, and OLI imagery alike (Figure 6b). Interestingly, the results from the low filtered and non-filtered OLI images were more similar to each other than those from Landsat TM and ETM+ (Figure 6). Olmanson's research indicates that estimation results from Landsat OLI imagery are more homogeneous, with less noise, than those from Landsat 7 imagery [59]. This is because Landsat OLI (Landsat 8) imagery has a narrower multispectral band pattern than Landsat TM and ETM+. Four TM images show extremely high water areas in the low filtered SWIR2 imagery that are not identified in the original imagery, because those four images (acquisition dates: 1988-11-03, 1992-12-16, 1993-12-19, and 1998-12-17) show visible salt and pepper noise (Figure 6a).

Temporal Trend of Extracted Area of Taihu Lake
Finally, water extraction results based on the shadow-controlled SWIR2 band from 77 cloud-free images with no image quality issues (i.e., salt and pepper noise) were selected to analyze the temporal trend of the dynamics in Taihu Lake (Figure 7; more extraction results are in Supplementary C of the supplementary materials). The area of Taihu Lake is relatively stable compared to other lakes with surrounding wetlands because Taihu Lake has cofferdams along most of its boundary. The temporal variation of the Taihu area was mainly caused by human activities, including fish, crab, and shrimp farming, wetland recovery, restoration from cultivation, and highway tunnel construction (Figure 8). Most dynamics were observed in Section 4, the south-east part of the lake (Figure 1), because of intensive anthropogenic disturbance (Figure 7). Since 1979, the surrounding wetlands in Section 4 that were once covered by reeds were gradually reclaimed as farms for fish, crab, or shrimp (Figure 7a-f). This increased the water area (Figure 8). Due to seasonal changes in water-covered farmlands that connect to the main lake, the extracted Taihu area had extremely high values in 1991 and 1992 (from images acquired on 1991-10-27, 1991-11-12, 1992-04-20, 1992-05-22, and 1992-06-07; Figures 7 and 8). From 1998, farming in the area surrounding Section 4 was forbidden, so the abandoned farmlands were reclaimed by wetland grasses (Figure 7g), which led to a decrease in water area from 1997 to 1998. However, the area of fish, crab, or shrimp farmlands gradually increased again after 1998, at a slower rate than during 1979 to 1997 (Figure 8). In 2008, policies forbidding farmland expansion were implemented again, and the area of Taihu Lake dropped somewhat from 2008 to 2010 (Figures 7j and 8). After 2010, a policy of lake restoration from cultivation was implemented and the area of Taihu Lake gradually increased (Figure 8). In addition, wetland recovery in the south-east part of Section 2 caused some variation in the lake area during 2004 to 2006 (Figure 7i), and the construction of a highway tunnel also decreased the lake area slightly after 2018 (Figure 7l).

Inter-Annual Dynamics of Precipitation, Water Level and Pondage
The temporal trend of pondage and water level was very consistent with the variation of inter-annual precipitation between 2007 and 2019 (Figure 9). However, the temporal dynamics of the area of Taihu Lake do not correspond to precipitation, water level, or pondage (Figures 8 and 9). The area of Taihu Lake greatly increased from 1984 until 1997 (Figure 8), while precipitation was quite stable between 1984 and 1991 and decreased from 1991 to 2004 (Figure 9). Normally, the area of the shallow water region of a lake increases with higher precipitation. Therefore, the increasing area of Taihu Lake between 1984 and 1997 was not caused by precipitation. Instead, the surrounding wetlands covered by reeds in Section 4 of Taihu Lake (Figure 1) were gradually converted to parts of Taihu Lake (Figure 7). Those areas, unlike the original wetlands, were covered by water all year round and were therefore extracted as parts of Taihu Lake in the water extraction process of this study.

Water Extraction in Different Sections of Taihu Lake
According to the extraction results from different Taihu Lake Sections based on all Landsat bands (Figure 10), NIR, SWIR1, and SWIR2 show significantly higher potential for water extraction than visible bands and the coastal band (from Landsat OLI only). For Sections 2, 3, and 4, the extraction area and its variation were consistent from NIR, SWIR1, and SWIR2 bands (Figure 10b-d). The extraction results from SWIR2 were clearly better than from SWIR1 as well as from NIR for Sections 1, 5, and 6 (Figure 10a,e,f). Sections 2, 3, and 4 are usually macrophyte-dominated regions with limited water surface covered by vegetation. Thus, submerged vegetation does not have much effect on water extraction from NIR or the SWIR1 band. However, Sections 1 and 6 are cyanobacteria-dominated, and Section 5 is largely dominated by floating-leaved vegetation, which challenges water extraction from the NIR band due to the vegetation-covered water surface.

Error Sources of Automatic Open Water Extraction from Landsat Series
Some error sources, including cloud cover, image quality, polygon smoothing, and small polygon elimination, might lower the accuracy of automatic open water extraction from the Landsat series. Cloud shadows and thick clouds that cover water regions have a large impact on automatic open water extraction that uses spatial autocorrelation (Supplementary D), while thin clouds, as well as thick clouds that are not above open water, do not influence water extraction using NIR, SWIR1, or SWIR2 bands (Supplementary C: images acquired on 2004-10-14 and 2016-01-01). However, clouds have a strong influence on water extraction that uses the three visible bands because water has higher reflectance than other land cover types in those bands. This means that water bodies (as with cloud features) are high-value clusters (i.e., hot spots in spatial autocorrelation analyses). The two post-processing steps for automatic water extraction, polygon smoothing and small polygon elimination, were applied after the extracted raster was converted to polygons. However, polygon smoothing might cause some variation (shifts of about 30 m, similar to the spatial resolution of the Landsat series) in the final extracted water boundaries, and small polygon elimination (i.e., the area set for small polygons in the Python script is 40,000 m²) might cause the loss of some small water bodies or wetlands. The advantage of the two steps is that they reduce the image quality effects that affect classification results from traditional remote sensing classification methods, and that they smooth the water bodies to bring them closer to the natural boundaries rather than the serrated boundaries caused by the spatial resolution of Landsat imagery.

Conclusions
The main purpose of this study was to develop a method based on the LISA concept to improve inland water extraction accuracy and temporal consistency in highly turbid and eutrophic water bodies with frequent anthropogenic disturbance. Using Landsat TM, ETM+, and OLI imagery, water extraction from the SWIR2 band using the developed methodology has great potential to extract inland water bodies automatically, even when the water is turbid and the water surface is covered by algal blooms or floating vegetation. Our method with the SWIR2 band greatly reduced the effects of vegetation surface cover compared to NIR, SWIR1, and the visible bands. When the water surface was not covered by vegetation, NIR, SWIR1, and SWIR2 gave consistent water extraction results that were better than those from the visible bands and the coastal band. Based on the methodology developed in this study, both SWIR1 and SWIR2 bands strongly eliminate noise from dark surfaces in urban areas compared to NIR and the three visible bands.
Clouds and image quality (e.g., salt and pepper noise) have large impacts on automatic open water extraction based on the LISA concept. Low filter image processing is often applied to smooth images and reduce salt and pepper effects in remotely sensed imagery, especially for water extraction, because of the smooth texture of water bodies. However, a comparison of results from low filtered and original SWIR2 bands indicates that the low filter process overestimates extracted water areas. Our findings might enable global water extraction from multispectral imagery under various environmental conditions and image qualities.