Long-Term Spatial and Temporal Monitoring of Cyanobacteria Blooms Using MODIS on Google Earth Engine: A Case Study in Taihu Lake

: As cyanobacteria blooms occur in many types of inland water, routine monitoring that is fast and accurate is important for environment and drinking water protection. Compared to ﬁeld investigations, satellite remote sensing is an e ﬃ cient and e ﬀ ective method for monitoring cyanobacteria blooms. However, conventional remote sensing monitoring methods are labor intensive and time consuming, especially when processing long-term images. In this study, we embedded related processing procedures in Google Earth Engine, developed an operational cyanobacteria bloom monitoring workﬂow. Using this workﬂow, we measured the spatiotemporal patterns of cyanobacteria blooms in China’s Taihu Lake from 2000 to 2018. The results show that cyanobacteria bloom patterns in Taihu Lake have signiﬁcant spatial and temporal di ﬀ erentiation: the interannual coverage of cyanobacteria blooms had two peaks, and the condition was moderate before 2006, peaked in 2007, declined rapidly after 2008, remained moderate and stable until 2015, and then reached another peak around 2017; bays and northwest lake areas had heavier cyanobacteria blooms than open lake areas; most cyanobacteria blooms primarily occurred in April, worsened in July and August, then improved after October. Our analysis of the relationship between cyanobacteria bloom characteristics and environmental driving factors indicates that: from both monthly and interannual perspectives, meteorological factors are positively correlated with cyanobacteria bloom characteristics, but as for nutrient loadings, they are only positively correlated with cyanobacteria bloom characteristics from an interannual perspective. We believe reducing total phosphorous, together with restoring macrophyte ecosystem, would be the necessary long-term management strategies for Taihu Lake. Our workﬂow provides an automatic and rapid approach for the long-term monitoring of cyanobacteria blooms, which can improve the automation and e ﬃ ciency of routine environmental management of Taihu Lake and may be applied to other similar inland waters.


Introduction
Under the background of accelerated industrialization and urbanization, harmful algae blooms (HABs) occur frequently in inland and coastal waters due to eutrophication, which is the combined result of natural and anthropogenic processes, although it is primarily caused by inputs of anthropogenic nutrient loadings (nitrogen and phosphorus) [1]. Some species of algae blooms, such as cyanobacteria, can release odors and toxic compounds that are harmful to aquatic creatures and water ecosystems [1][2][3]. remote sensing [27]. GEE is especially suitable for long-term and large-scale spatiotemporal series of remote sensing applications; its multi-petabyte geospatial data and high-performance computation capacity provide a new approach to understanding the earth. Conventional remote sensing algorithms and methods, which were initially conducted with small-scale data, have the potential to be embedded and integrated in GEE, and applied in a variety of fields and scenarios. In the Web of Science database, we analyzed nearly 200 articles that have used GEE; most of their themes were roughly related to agriculture applications or urban or forest mapping. Overall, land use and land cover classification was the most common and prevailing topic. However, optical constituent studies of inland and coastal waters are rare [28][29][30][31][32], due largely to the complexity of bio-optical properties of case II water [3,4,15]. What is more, the lack of embedded water-leaving reflectance products of GEE has hindered the further operational applications in case II water. It is necessary to find and develop available operational inland water processing algorithms and perform them on GEE's existing embedded datasets.
Based on the urgent need for long-term assessment of cyanobacteria blooms in eutrophic inland water, and recent developments in cloud-based technology, the objectives of our study were to (1) develop an operational workflow in GEE to accomplish rapid and automatic monitoring of long-term cyanobacteria blooms, (2) analyze the spatiotemporal patterns of cyanobacteria blooms in Taihu Lake over the most recent 19 years (2000 to 2018), and (3) determine the relationship between cyanobacteria blooms and environmental driving factors using our study's monitoring data.

Study Area
Taihu Lake, the third largest freshwater lake in China, is located in the prosperous Yangtze River Delta (Figure 1). Its water surface coverage is approximately 2338 km 2 , and it has a shallow mean water depth of 1.9 m [2,5,15]. It is a typical subtropical lake and is influenced by monsoons. Taihu Lake is one of the most important sources of drinking water for adjacent cities, such as Wuxi city [2,5]. Due to eutrophication lasting for more than two decades [8], cyanobacteria blooms frequently occur in Taihu Lake [33], and they have had negative effects on aquatic life and human health [34]. The cyanobacteria blooms that occurred in the summer of 2007 heavily affected the safety of the drinking water of nearly 2 million citizens [2,15]. After that crisis, the local government began to manage and control eutrophication [5]; these measures were effective for the next decade, but recent studies reported that the water quality in Taihu Lake has deteriorated again, and severe cyanobacteria blooms frequently occurred in 2017 [35][36][37]. For convenience of analysis, we divided Taihu Lake into eight segments according to previous studies: Central Lake, West Lake, South Lake, East Lake, Zhushan Bay, Meiliang Bay, Gong Bay, and East Bay [38]. East Bay is perennially submerged and emergent dominated by aquatic vegetation [15,[39][40][41][42]. The optical features of aquatic vegetation are similar to floating cyanobacteria, which can decrease the monitoring accuracy; therefore, East Bay was excluded from our study.

Satellite Data
MOD09GQ and MOD09GA products were used in our study. MOD09GQ and MOD09GA are short for MOD09GQ.006 Terra Surface Reflectance Daily L2G Global 250 m and MOD09GA.006 Terra Surface Reflectance Daily L2G Global 1 km and 500 m, respectively. The MODIS surface reflectance products provide an estimate of the surface spectral reflectance as it would be measured at ground level in the absence of atmospheric scattering or absorption [43,44] . Details of the products are listed in Table 1.  [43,44]. FAI calculation requires red and near-infrared (NIR) bands of MOD09GQ and one shortwave infrared (SWIR) band of MOD09GA. Atmospheric correction requires all of the SWIR bands of MOD09GA. To make full use of the spatial resolution advantage of MOD09GQ, three SWIR bands

Satellite Data
MOD09GQ and MOD09GA products were used in our study. MOD09GQ and MOD09GA are short for MOD09GQ.006 Terra Surface Reflectance Daily L2G Global 250 m and MOD09GA.006 Terra Surface Reflectance Daily L2G Global 1 km and 500 m, respectively. The MODIS surface reflectance products provide an estimate of the surface spectral reflectance as it would be measured at ground level in the absence of atmospheric scattering or absorption [43,44]. Details of the products are listed in Table 1. To make full use of the spatial resolution advantage of MOD09GQ, three SWIR bands (sur_refl_b05/sur_refl_b06/sur_refl_b07) of MOD09GA were resampled to 250 m [15,23,41]. The resampling method was bilinear interpolation.

Land Masking
Land surface pixels have high FAI values; thus, they should be filtered by a land mask [15]. The multi-year innermost boundary of Taihu Lake was carefully visually digitized according to Landsat TM/ETM+/OLI images. Finally, the land mask was eroded to 250 m (one pixel width) toward water to avoid the land adjacent effect [15,45].

Cloud Masking
Cloud pixels also have high FAI values [15,16]. To exclude cloud contamination, a cloud mask is required [46]. A single-band threshold R rc (859 nm) > 0.15 was adopted as a cloud mask. Most cloud and sunglint contamination can be filtered by this kind of cloud mask [47][48][49].

Water-Leaving Reflectance Correction
Previous studies broadly used MODIS Rayleigh correction data for inland water applications, however, the GEE has not embedded this kind of data. Hence, we had no choice but to use MODIS surface reflectance products (MOD09) in our workflow. Although the MODIS surface reflectance products were primarily designed for land surfaces [50], previous studies have found that the MOD09 reflectance was systematically greater than the in-situ reflectance over inland water, because MOD09 fails to fully correct for the aerosol effect [39,50,51]. Water-leaving reflectance correction should be implemented before inland water application [31,33]. Here, we performed a simple yet effective method to retrieve the water-leaving reflectance of Taihu Lake, which was developed by Wang et al. [45,49]: where R c rs (λ) is the corrected water-leaving reflectance of the MOD09 product centered at wavelength λ, R(λ) is the original MOD09 reflectance at λ, min(R NIR : R SWIR ) is the minimum positive reflectance value between NIR and SWIR bands, and π is placed as the denominator to transform surface reflectance to water-leaving reflectance. Here, the sur_refl_b05, sur_refl_b06, and sur_refl_b07 bands were used as R NIR : R SWIR . The correction process was implemented on each pixel. This operational water-leaving reflectance correction method has relative stable performances over inland waters under various conditions [45,49].

FAI Threshold
Using FAI gradient histogram statistics of long-term (2000 to 2008) MODIS R rc (λ) data, Hu et al. [15] determined FAI > −0.004 as the floating cyanobacteria distinguishing threshold. After water-leaving reflectance correction, this FAI threshold was adopted to distinguish floating cyanobacteria pixels in our workflow. In the following validation and discussion sections, the potential errors caused by this fixed threshold that may incorporate into our workflow, will be reported in detail. FAI is defined as follows [15,16]: where R rc (645), R rc (859), and R rc (1240) are Rayleigh-corrected reflectance at 645, 859, and 1240 nm, respectively. Here, R rc (645), R rc (859), and R rc (1240) respectively correspond to R c rs (645), R c rs (859) and R c rs (1240), which were derived in Section 2.3.4.

Spatiotemporal Analysis of Cyanobacteria Blooms
The above floating cyanobacteria extraction processes were executed on each MODIS daily image; the final output result was that pixels with FAI values greater than −0.004 were classified as floating cyanobacteria. Those pixels were assigned a value of 1, while non-cyanobacteria pixels were assigned a value of 0. These cyanobacteria-binarized MODIS daily images were initially input to the following temporal and spatial analysis.

Temporal Analysis
Based on cyanobacteria-binarized MODIS images, we counted the total amount of cyanobacteria pixels on a daily basis, then calculated the daily floating cyanobacteria area (FA d ) by multiplying one pixel's area (0.25 × 0.25 km). After that, we calculated the monthly (FA m ) and annual (FA a ) mean area of floating cyanobacteria using daily area data. According to the FA data (including FA d , FA m and FA a , the same hereafter), we determined the temporal trends of floating cyanobacteria at different levels. Due to cyanobacteria blooms with large daily areas being more severe than small blooms, we paid additional attention to the significant cyanobacteria blooms in our monitoring workflow. Based on FA d , which we derived in the above step, significant cyanobacteria blooms were defined as floating cyanobacteria daily coverages that exceeded 25% of the total lake (or segment) area [15]. We derived and analyzed the annual start and end Julian dates of the significant cyanobacteria blooms, and based on floating cyanobacteria severity classification, we analyzed the cumulative percentage temporal trend of the significant cyanobacteria blooms.

Spatial Analysis
We composed cyanobacteria-binarized daily images and calculated the sum of all values at each pixel of each year, and then obtained the annual accumulative floating cyanobacteria occurrence frequency on a pixel basis. The same method was executed for each month over the 19-year period, and the monthly accumulative bloom occurrence frequency was then determined. For the initial date and during analysis, we first attached the Julian date information to each cyanobacteria-binarized image; then, after composing cyanobacteria-binarized daily images of each year, we calculated the minimum and the maximum date value of each pixel, i.e., the date of first and the last floating cyanobacteria occurrence on a pixel basis, and we finally obtained the annual floating cyanobacteria occurrence duration, by calculating the difference between the first and the last date on a pixel basis [15].

Validation
Accuracy validation is essential for long-term series of cyanobacteria bloom monitoring. Here, we used three methods to validate the accuracy of our workflow. The first method involved analyzing the correlation between our monthly floating cyanobacteria areas (FA m ) and monthly in-situ concentration of chlorophyll-a (C Chla ) (collected from Taihu Basin Authority of Ministry of Water Resources, www. tba.gov.cn). C Chla is often regarded as the proxy of cyanobacteria biomass in water [2,14,26]. Previous studies have demonstrated that FA m and C Chla have good overall consistency under static wind weather [22,52]. The second method involved comparing our results with those of other studies. The different spatiotemporal scales of our FA were qualitatively and quantitatively validated by the results or data of similar studies. The third method involved visually interpreting FA d on Landsat TM/ETM+/OLI images to evaluate our concurrent results [5,15]. The finer spatial texture of floating cyanobacteria could be visually delineated under the higher resolution of Landsat true color Remote Sens. 2019, 11, 2269 7 of 32 images [15]. In general, by comparing our results with the in-situ data (the first method) and the remote sensing observations (the second and the third method), we can get a comprehensive accuracy assessment report.

Analysis of the Relationship between Environmental Driving Factors and Cyanobacteria Blooms
To fully mine the long-term series of cyanobacteria bloom information in our study, and to understand the role of environmental factors on driving cyanobacteria blooms, we conducted redundancy analysis (RDA) to identify relationships between cyanobacteria blooms and environmental driving factors. RDA was initially developed for ecologists to draw biplots to display relationships between species and explanatory variables. The length of the arrow is a measure of fit for species. More quantitatively, we can read the approximated correlations of one species with others by projecting the arrowheads of the other species (or environmental variables) onto an imaginary line overlaying that species' arrow [53,54]. In the RDA ordination diagram, the cosine of the angle between two arrows represents their correspondence; an acute angle, obtuse angle, or right angle respectively denote their positive, negative, and lack of correlation [6].
First, we conducted a detrended correspondence analysis (DCA) and found all gradient lengths were less than three, hence, the linear model was adopted for the following analysis. Then, RDA was implemented to find the relationship between environmental variables and cyanobacteria bloom characteristics [6,[54][55][56]. Environmental variables included meteorological and water quality factors. Meteorological factor data were downloaded from the National Ecosystem Research Network of China, water quality factor data were collected from Taihu Basin Authority of Ministry of Water Resources, and cyanobacteria bloom characteristics were derived from our results ( Table 4). Due to the incomplete temporal sequence of the meteorological data, we used monthly and annual data between 2005 and 2016 ( Figure S1) to implement RDA, which occupied two-thirds of the temporal span of the whole study period (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017)(2018) and also included recent years with the exception of 2017 and 2018. Therefore, we think the amount of data was sufficient to produce credible correlation analysis. We not only used the monthly mean area of floating cyanobacteria (FA mean ) but also selected monthly maximum area of floating cyanobacteria (FA max ) as another monthly characteristic, because FA max may better represent monthly floating cyanobacteria coverage due to the wind effect [6,15].
The above processing and analysis steps are all depicted in the workflow in Figure 2.

Validation of Workflow Accuracy
3.1.1. Validation Using In-Situ Data (Chlorophyll-a) Due to the similar FAI characteristics of floating cyanobacteria, macrophytes distributed in the East Lake would cause overestimation of FA, especially in summer and autumn [22]. To consider this

Validation Using In-Situ Data (Chlorophyll-a)
Due to the similar FAI characteristics of floating cyanobacteria, macrophytes distributed in the East Lake would cause overestimation of FA, especially in summer and autumn [22]. To consider this deviation source, we performed a correlation analysis between the FA m of the entire lake and chlorophyll-a concentration (C Chla ), and between the FA m of the entire lake excluding East Lake and C Chla . The FAI threshold was another potential source of error. Since FAI was developed and applied in Taihu Lake, few studies demonstrated that FAI > −0.004 can be extended to extract floating cyanobacteria after 2008, because this threshold value was initially calculated using statistical data from 2000 to 2008 [15]. Hence, applicable uncertainties still remain in our workflow when manipulating these long-term images. Therefore, we conducted a correlation analysis between FA m and C Chla over the whole temporal stage deviation source, we performed a correlation analysis between the FAm of the entire lake and chlorophyll-a concentration (CChla), and between the FAm of the entire lake excluding East Lake and CChla. The FAI threshold was another potential source of error. Since FAI was developed and applied in Taihu Lake, few studies demonstrated that FAI > −0.004 can be extended to extract floating cyanobacteria after 2008, because this threshold value was initially calculated using statistical data from 2000 to 2008 [15]. Hence, applicable uncertainties still remain in our workflow when manipulating these long-term images. Therefore, we conducted a correlation analysis between FAm and CChla over the whole temporal stage (  The Pearson correlation coefficients ( Table 2) show that FAm and CChla have good consistency and correlation. After excluding the contamination of the East Lake, the correlation between FAm and CChla increased in all temporal stages. Another finding was that FAm and CChla had a stronger correlation in the late stage (2009-2018) than in the early stage (2001-2008), regardless of the spatial extent. Our correlation analysis demonstrates that (1) macrophytes in East Lake affected the final FA accuracy; (2) due to the massive loss of macrophytes in Taihu Lake after 2014 (this will be discussed in following sections), our FA results were closer to actual floating cyanobacteria areas in the late stage; and (3) FAI > −0.004 was found to remain robust after 2008, showing that we could rationally use this threshold to process nearly 20 years' worth of images. Table 2. Pearson correlation coefficients between monthly mean CChla and monthly mean floating cyanobacteria area (FAm) over entire lake and entire lake excluding East Lake, in different temporal stages.

Entire lake
Entire lake excluding East Our monitoring workflow mainly focused on the long-term trends and dynamics of cyanobacteria blooms; therefore, large-scale patterns were, to some extent, more important than short-term and partial details. To validate our derived spatial and temporal patterns of cyanobacteria blooms, we visually inspected our results and those of other studies (Table 3). In this process, some The Pearson correlation coefficients (Table 2) show that FA m and C Chla have good consistency and correlation. After excluding the contamination of the East Lake, the correlation between FA m and C Chla increased in all temporal stages. Another finding was that FA m and C Chla had a stronger correlation in the late stage (2009-2018) than in the early stage (2001-2008), regardless of the spatial extent. Our correlation analysis demonstrates that (1) macrophytes in East Lake affected the final FA accuracy; (2) due to the massive loss of macrophytes in Taihu Lake after 2014 (this will be discussed in following sections), our FA results were closer to actual floating cyanobacteria areas in the late stage; and (3) FAI > −0.004 was found to remain robust after 2008, showing that we could rationally use this threshold to process nearly 20 years' worth of images. Table 2. Pearson correlation coefficients between monthly mean C Chla and monthly mean floating cyanobacteria area (FA m ) over entire lake and entire lake excluding East Lake, in different temporal stages.

Entire Lake
Entire Lake Excluding East Lake

Validation Using the Results of Other Studies
Our monitoring workflow mainly focused on the long-term trends and dynamics of cyanobacteria blooms; therefore, large-scale patterns were, to some extent, more important than short-term and partial details. To validate our derived spatial and temporal patterns of cyanobacteria blooms, we visually inspected our results and those of other studies (Table 3). In this process, some artifacts and a mismatch in spatiotemporal patterns were visible [39]. As a result, we found a high consistency between our spatiotemporal patterns and those of similar studies. We also extracted floating cyanobacteria corresponding to specific dates to display the evolution of the 2007 cyanobacteria blooms ( Figure 4) and compared this with the results of Hu et al. [15]. We found that our long-term patterns as well as our daily spatial distribution results were highly consistent with previous reports. floating cyanobacteria corresponding to specific dates to display the evolution of the 2007 cyanobacteria blooms ( Figure 4) and compared this with the results of Hu et al. [15]. We found that our long-term patterns as well as our daily spatial distribution results were highly consistent with previous reports.  In addition to the qualitative comparison between our results with others, we also quantitatively validated our findings using the Taihu Lake cyanobacteria bloom remote sensing inversion dataset (2017), which was produced and contributed by Ma et al. [5,23,57].  In addition to the qualitative comparison between our results with others, we also quantitatively validated our findings using the Taihu Lake cyanobacteria bloom remote sensing inversion dataset (2017), which was produced and contributed by Ma et al. [5,23,57]. The dataset was supported by Lake-Watershed Science Data Center, National Earth System Science Data Sharing Infrastructure, National Science & Technology Infrastructure of China, http://lake.geodata.cn. This dataset uses the algae pixel-growing algorithm (APA), which can extract floating cyanobacteria with higher accuracy.

Validation Using Landsat Interpretation Data
Qualified Landsat images were limited due to their 16-day revisit interval and cloud contamination. We only used around 300 Landsat TM/ETM+/OLI images for validation, therefore, not every MODIS FAd result had corresponding Landsat validation data. However, this method can be regarded as a systematic sampling process ( Figure 6). Since Landsat TM/ETM+/OLI and Terra/MODIS have different local crossing times, it should be noted that wind weather or changing cloud state could affect the final consistency between the two kinds of data but, in most cases, they can retain general spatial similarity during short intervals (i.e., within a few hours). The linear regression between our FAd and Landsat interpretation data shows there is a good correlation between the two datasets, with a linear slope equal to 0.86, R 2 of 0.92, and a root mean square error (RMSE) of 39.25 km 2 ( Figure 7).

Validation Using Landsat Interpretation Data
Qualified Landsat images were limited due to their 16-day revisit interval and cloud contamination. We only used around 300 Landsat TM/ETM+/OLI images for validation, therefore, not every MODIS FA d result had corresponding Landsat validation data. However, this method can be regarded as a systematic sampling process ( Figure 6). Since Landsat TM/ETM+/OLI and Terra/MODIS have different local crossing times, it should be noted that wind weather or changing cloud state could affect the final consistency between the two kinds of data but, in most cases, they can retain general spatial similarity during short intervals (i.e., within a few hours). The linear regression between our FA d and Landsat interpretation data shows there is a good correlation between the two datasets, with a linear slope equal to 0.86, R 2 of 0.92, and a root mean square error (RMSE) of 39.25 km 2 ( Figure 7).

Validation Using Landsat Interpretation Data
Qualified Landsat images were limited due to their 16-day revisit interval and cloud contamination. We only used around 300 Landsat TM/ETM+/OLI images for validation, therefore, not every MODIS FAd result had corresponding Landsat validation data. However, this method can be regarded as a systematic sampling process ( Figure 6). Since Landsat TM/ETM+/OLI and Terra/MODIS have different local crossing times, it should be noted that wind weather or changing cloud state could affect the final consistency between the two kinds of data but, in most cases, they can retain general spatial similarity during short intervals (i.e., within a few hours). The linear regression between our FAd and Landsat interpretation data shows there is a good correlation between the two datasets, with a linear slope equal to 0.86, R 2 of 0.92, and a root mean square error (RMSE) of 39.25 km 2 ( Figure 7).

Temporal Coverage Patterns
The daily cyanobacteria bloom coverage areas (FA d ) for the entire lake (excluding East Bay) and each segment are depicted in Figure 8. Though synoptic interannual cyanobacteria bloom severity fluctuated, an obvious seasonal trend existed within each year: after a long-term bloom-free period, cyanobacteria survive and float to the water surface, relying on buoyancy. In July, August, and September, FA d is at its largest, and individual severe cyanobacteria blooms could occur in other months. Floating cyanobacteria finally retreat during November to February of the next year. Monthly maximum FA d is also depicted to represent the highly changeable floating cyanobacteria coverage due to the wind effect [15].
From the perspective of the entire lake, the floating cyanobacteria condition was moderate during 2000-2004. During this period, even the monthly maximum FA rarely exceeded half of the entire lake area. However, during 2005 to 2007, FA increased, in 2006 and 2007, a series of severe cyanobacteria bloom outbreaks occurred in the entire lake, which threatened the drinking water safety of the 2 million people nearby [5]. During 2008-2010, FA generally decreased. During 2011-2016, the condition fluctuated with a gradual deterioration. In 2017, a series of severe cyanobacteria blooms broke out over the entire lake over nearly the whole year. In terms of the frequency and density of floating cyanobacteria occurrence, this year's severity level tended to overwhelm that observed in 2006 and 2007. In 2018, the condition improved but remained severe. As for Central Lake, South Lake and West Lake that occupy most of the entire lake, their FA a dynamics were similar to that of the entire lake. As for East Lake, two obvious periods were distinguished during the long-term series: the first period was before 2013, in which FA m and FA a in East Lake fluctuated moderately; after 2013, the second period, FA m and FA a began to decrease sharply, reaching their lowest levels in 2015 and 2016, and then recovered slightly in 2017 and 2018, but to a level still less than that observed during the first period. East Lake is not as cyanobacteria-dominated as other segments and many macrophytes existed in this segment over a long period, hence its FA results could not completely represent the cyanobacteria. Three bay segments, Zhushan Bay, Meiliang Bay, and Gong Bay, are all located in the north of Taihu Lake and have bad fluid connectivity. Their FA m and FA a fluctuated but were relatively stable and regular over the 19 years (Figure 9). The FA results of Gong Bay were also contaminated by macrophytes, especially before 2005 (see Section 3.3). Even in some moderate years, seasons, or months, individual and serious cyanobacteria blooms still occurred, which placed heavy stress on the lake's environment and water safety.

Spatial Distributions of Annual and Monthly Occurrence Frequency
According to the cyanobacteria bloom occurrence accumulative frequency (Figure 10a), between 2000 and 2005, floating cyanobacteria frequently occurred in Zhushan Bay, Meiliang Bay, and the northwest shore regions. Gong Bay and East Lake showed high occurrence frequencies; these areas were dominated by macrophytes in the beginning of 21st century [41,42,58], and we can deduce that Taihu Lake was relatively healthy in this period, excluding some bays and nearshore regions. From 2005 to 2007, cyanobacteria blooms spread from the northwest regions to the Central Lake along the shoreline. Especially in 2007, the condition was much more serious, and the bloom occurrence frequency average of 2007 was doubled in comparison to the previous several years in most lake segments (Figure 10b). Many more cyanobacteria blooms occurred in South Lake than before, and Gong Bay and East Lake's high occurrence frequencies declined sharply. According to the findings of previous studies and the frequency-based approach [41], these segments were dominated by macrophytes, and the macrophyte ecosystems of the above two segments were destroyed due to deterioration of the water environment [42]. In 2008 and 2009, the high frequency occurrence regions gradually shrank in the northwest lake area. The increase of high frequency occurrence areas in East Lake and Gong Bay indicates that the aquatic vegetation in these segments recovered to a certain extent. During 2010 to 2014, the high occurrence regions were mainly concentrated in the northeast lake area, Zhushan Bay, and Meiliang Bay; the overall lake condition was moderate in this period. Beginning from 2015, the occurrence of high frequency tended to spread from the northwest lake area to the entire lake again, and a series of heavy cyanobacteria blooms occurred over the entire lake around 2017. The severity in this year was worse than 2007, especially in Meiliang Bay, West Lake, and Central Lake (Figure 10b). Then, the lake condition in 2018 improved slightly in most segments. We also found that in 2015, the high frequency occurrence in Gong Bay and East Lake almost disappeared and did not recover to the previous state in the following years. In 2015, the local government organized a series of mechanized salvages in East Lake and Gong Bay [42], which destroyed macrophyte roots, preventing them from participating in natural restoration.
The monthly occurrence frequencies of cyanobacteria blooms also had obvious spatial differences ( Figure 11). Cyanobacteria blooms initially occurred in around April, with the occurrence locations mainly concentrated in Zhushan Bay, Meiliang Bay, Gong Bay, and the northwest nearshore regions. In summer, especially July and August, cyanobacteria blooms frequently occurred in West Lake, Zhushan Bay, and Meiliang Bay; simultaneously, Gong Bay and East Lake showed a high frequency of occurrence, mostly due to the macrophytes distributed in these areas growing rapidly in summer. From September to November, there was a gradual decrease in the coverage of high frequency; then, from December to the beginning of the next year, the floating cyanobacteria coverage was small. Bay are due to both cyanobacteria blooms and macrophytes, the same as the initial and duration results in Section 3.4.

Spatial Distributions of Annual Initial Date and Duration
According to initial occurrence date of cyanobacteria blooms (Figures 12a,b), before 2004, only Zhushan Bay, Meiliang Bay, and limited nearshore areas had relatively earlier initial dates. The initial dates in most segments were considerably later, and some central lake regions did not experience cyanobacteria blooms at all. However, during 2004 to 2007, the initial date in West Lake, South Lake,

Spatial Distributions of Annual Initial Date and Duration
According to initial occurrence date of cyanobacteria blooms (Figure 12a,b), before 2004, only Zhushan Bay, Meiliang Bay, and limited nearshore areas had relatively earlier initial dates. The initial dates in most segments were considerably later, and some central lake regions did not experience cyanobacteria blooms at all. However, during 2004 to 2007, the initial date in West Lake, South Lake, and Central Lake advanced significantly. Between 2008 and 2016, the initial date of the entire lake was moderate and stable; in 2017, it advanced significantly, especially in Southwest Lake and nearshore areas. In 2018, the initial date was delayed, but in South Lake, Meiliang Bay, and Gong Bay, it was still quite early. The cyanobacteria bloom duration (Figure 13a) is the difference between the last occurrence date and the initial occurrence date, hence, in the case of the roughly stable last date, the spatial patterns of the duration were similar to the initial date, i.e., for some areas that had early (late) initial dates, their durations tended to become longer (shorter). and Central Lake advanced significantly. Between 2008 and 2016, the initial date of the entire lake was moderate and stable; in 2017, it advanced significantly, especially in Southwest Lake and nearshore areas. In 2018, the initial date was delayed, but in South Lake, Meiliang Bay, and Gong Bay, it was still quite early. The cyanobacteria bloom duration (Figure 13a) is the difference between the last occurrence date and the initial occurrence date, hence, in the case of the roughly stable last date, the spatial patterns of the duration were similar to the initial date, i.e., for some areas that had early (late) initial dates, their durations tended to become longer (shorter).

Temporal Characteristics of Significant Cyanobacteria Blooms
We defined significant cyanobacteria blooms as floating cyanobacteria daily coverage areas (FAd) that exceeded 25% of the entire lake or each segment area [15], and their first and last occurrence dates are depicted in Figure 14. For open lake segments (West Lake, South Lake, and Central Lake), their significant cyanobacteria bloom occurrence dates were similar to the entire lake. For bay segments (Zhushan Bay, Meiliang Bay, and Gong Bay), their significant cyanobacteria bloom occurrence dates were earlier than the open lake segments, mainly due to severe eutrophication. The results of East Lake and Gong Bay were mixtures of cyanobacteria blooms and macrophytes, especially before 2015, when macrophytes were not being mechanically salvaged and were still prevailing.

Temporal Characteristics of Significant Cyanobacteria Blooms
We defined significant cyanobacteria blooms as floating cyanobacteria daily coverage areas (FA d ) that exceeded 25% of the entire lake or each segment area [15], and their first and last occurrence dates are depicted in Figure 14. For open lake segments (West Lake, South Lake, and Central Lake), their significant cyanobacteria bloom occurrence dates were similar to the entire lake. For bay segments (Zhushan Bay, Meiliang Bay, and Gong Bay), their significant cyanobacteria bloom occurrence dates were earlier than the open lake segments, mainly due to severe eutrophication. The results of East Lake and Gong Bay were mixtures of cyanobacteria blooms and macrophytes, especially before 2015, when macrophytes were not being mechanically salvaged and were still prevailing.
As mentioned in the previous section, significant cyanobacteria blooms were defined as FA d that exceeded 25% of the entire lake or each segment area. To exclude the land adjacent effect, aquatic vegetation coverage, and floating cyanobacteria extraction deviation, we regarded FA d that was less than 5% of the entire lake (or segments) area as null floating cyanobacteria. We also defined FA d that was greater than 5% but less than 25% of the entire lake (or segments) area as slight cyanobacteria blooms.
According to the accumulative percentage of the cyanobacteria bloom severity level (Figure 15), the entire lake experienced five stages: before 2004, significant cyanobacteria blooms rarely occurred; in 2006 and 2007, they increased rapidly; then declined in 2008; a moderate level was maintained between 2009 and 2015; and their proportion then increased after 2016. The temporal dynamics of slight cyanobacteria blooms were similar those of the significant blooms. The accumulative percentage of cyanobacteria bloom severity levels of open lake segments (West Lake, South Lake, and Central Lake) were similar to the entire lake. The periodic changes in cyanobacteria-dominated Zhushan Bay and Meiliang Bay, were similar to those observed in the entire lake, but their significant and slight severity levels occupied a greater proportion. In other words, those bays were much unhealthier than the average status. Due to much of Gong Bay's macrophytes being vanished after 2005 [35], we deduce that the Gong Bay's cyanobacteria bloom condition deteriorated in recent years, especially after 2015. East Lake, as a well-known macrophyte-dominated segment before 2015, demonstrated similar temporal patterns to Gong Bay.
advanced again between 2016 to 2018. For 2008, 2016, and 2018, significant cyanobacteria blooms occurred quite early because, in the prior years, especially for 2007 and 2017, the cyanobacteria blooms were rather severe, and the early significant cyanobacteria bloom occurrence dates in 2008 and 2018 were extensions of their previous years. For open lake segments (West Lake, South Lake, and Central Lake), their significant cyanobacteria bloom occurrence dates were similar to the entire lake. For bay segments (Zhushan Bay, Meiliang Bay, and Gong Bay), their significant cyanobacteria bloom occurrence dates were earlier than the open lake segments, mainly due to severe eutrophication. The results of East Lake and Gong Bay were mixtures of cyanobacteria blooms and macrophytes, especially before 2015, when macrophytes were not being mechanically salvaged and were still prevailing. As mentioned in the previous section, significant cyanobacteria blooms were defined as FAd that exceeded 25% of the entire lake or each segment area. To exclude the land adjacent effect, aquatic vegetation coverage, and floating cyanobacteria extraction deviation, we regarded FAd that was less than 5% of the entire lake (or segments) area as null floating cyanobacteria. We also defined FAd that was greater than 5% but less than 25% of the entire lake (or segments) area as slight cyanobacteria blooms.
According to the accumulative percentage of the cyanobacteria bloom severity level (Figure 15), the entire lake experienced five stages: before 2004, significant cyanobacteria blooms rarely occurred; in 2006 and 2007, they increased rapidly; then declined in 2008; a moderate level was maintained between 2009 and 2015; and their proportion then increased after 2016. The temporal dynamics of slight cyanobacteria blooms were similar those of the significant blooms. The accumulative percentage of cyanobacteria bloom severity levels of open lake segments (West Lake, South Lake, and Central Lake) were similar to the entire lake. The periodic changes in cyanobacteria-dominated Zhushan Bay and Meiliang Bay, were similar to those observed in the entire lake, but their significant and slight severity levels occupied a greater proportion. In other words, those bays were much unhealthier than the average status. Due to much of Gong Bay's macrophytes being vanished after 2005 [35], we deduce that the Gong Bay's cyanobacteria bloom condition deteriorated in recent years, especially after 2015. East Lake, as a well-known macrophyte-dominated segment before 2015, demonstrated similar temporal patterns to Gong Bay. Timing and duration of significant cyanobacteria blooms for the entire lake (excluding East Bay) and each segment.
As mentioned in the previous section, significant cyanobacteria blooms were defined as FAd that exceeded 25% of the entire lake or each segment area. To exclude the land adjacent effect, aquatic vegetation coverage, and floating cyanobacteria extraction deviation, we regarded FAd that was less than 5% of the entire lake (or segments) area as null floating cyanobacteria. We also defined FAd that was greater than 5% but less than 25% of the entire lake (or segments) area as slight cyanobacteria blooms.
According to the accumulative percentage of the cyanobacteria bloom severity level (Figure 15), the entire lake experienced five stages: before 2004, significant cyanobacteria blooms rarely occurred; in 2006 and 2007, they increased rapidly; then declined in 2008; a moderate level was maintained between 2009 and 2015; and their proportion then increased after 2016. The temporal dynamics of slight cyanobacteria blooms were similar those of the significant blooms. The accumulative percentage of cyanobacteria bloom severity levels of open lake segments (West Lake, South Lake, and Central Lake) were similar to the entire lake. The periodic changes in cyanobacteria-dominated Zhushan Bay and Meiliang Bay, were similar to those observed in the entire lake, but their significant and slight severity levels occupied a greater proportion. In other words, those bays were much unhealthier than the average status. Due to much of Gong Bay's macrophytes being vanished after 2005 [35], we deduce that the Gong Bay's cyanobacteria bloom condition deteriorated in recent years, especially after 2015. East Lake, as a well-known macrophyte-dominated segment before 2015, demonstrated similar temporal patterns to Gong Bay. . Accumulative percentage of cyanobacteria bloom severity level for the entire lake (excluding East Bay) and each segment. Significant, slight, and null cyanobacteria blooms are defined as FAI > −0.004 coverage area (FAd) larger than 25%, larger than 5% but less than or equal to 25%, and less than or equal to 5% of entire lake or each segment, respectively.

Redundancy Analysis (RDA) between Environmental Driving Factors and Cyanobacteria Bloom Characteristics
According to the monthly RDA analysis (Figure 16), temperature, sunshine radiance, wind speed, and precipitation were positively correlated with cyanobacteria blooms (both FAmean and FAmax), but most water quality factors were negatively correlated or nearly uncorrelated with monthly cyanobacteria blooms, except CODMn. The T's monthly and annual individual explanations to FAmean were 0.273 and 0.168, respectively; while the Tmin's monthly and annual individual explanations to FAmean were 0.283 and 0.212, respectively (Table 4). Hence FAmean had a stronger relationship with Tmin than T. Previous studies have found that higher Tmin in winter promotes the cyanobacteria survival over winter, and there is a significant correlation that exists between Tmin and initial blooming date [5], because cyanobacteria grow better at high temperatures [59,60]. Wind speed affects FAmax more than FAmean because wind helps cyanobacteria to form surface scum [15], especially when the wind speed is less than 4 m/s [61]. However, annual environmental variables have different effects; water quality factors, especially nutrient loadings (TN and TP)) were positively correlated with annual cyanobacteria blooms (Figure 17), probably because sporadic cyanobacteria blooms are affected by meteorological factors to a large extent, e.g., wind speed, wind direction, and temperature, but frequent and large-scale cyanobacteria blooms from the interannual perspective are dominated by nutrient distribution and abundance [7,26,62,63]. Meteorological factors were thought to play a more important role in cyanobacteria bloom formation, especially when Taihu Lake's nutrient loadings were sufficient [64,65]. Intensive precipitation carries nutrient loadings from adjacent catchments to the lake [59], therefore, precipitation was also positively correlated with monthly and annual cyanobacteria blooms as what we got (Figures 16 and 17). . Accumulative percentage of cyanobacteria bloom severity level for the entire lake (excluding East Bay) and each segment. Significant, slight, and null cyanobacteria blooms are defined as FAI > −0.004 coverage area (FA d ) larger than 25%, larger than 5% but less than or equal to 25%, and less than or equal to 5% of entire lake or each segment, respectively.

Redundancy Analysis (RDA) between Environmental Driving Factors and Cyanobacteria Bloom Characteristics
According to the monthly RDA analysis (Figure 16), temperature, sunshine radiance, wind speed, and precipitation were positively correlated with cyanobacteria blooms (both FA mean and FA max ), but most water quality factors were negatively correlated or nearly uncorrelated with monthly cyanobacteria blooms, except COD Mn . The T's monthly and annual individual explanations to FA mean were 0.273 and 0.168, respectively; while the T min 's monthly and annual individual explanations to FA mean were 0.283 and 0.212, respectively (Table 4). Hence FA mean had a stronger relationship with T min than T. Previous studies have found that higher T min in winter promotes the cyanobacteria survival over winter, and there is a significant correlation that exists between T min and initial blooming date [5], because cyanobacteria grow better at high temperatures [59,60]. Wind speed affects FA max more than FA mean because wind helps cyanobacteria to form surface scum [15], especially when the wind speed is less than 4 m/s [61]. However, annual environmental variables have different effects; water quality factors, especially nutrient loadings (TN and TP)) were positively correlated with annual cyanobacteria blooms (Figure 17), probably because sporadic cyanobacteria blooms are affected by meteorological factors to a large extent, e.g., wind speed, wind direction, and temperature, but frequent and large-scale cyanobacteria blooms from the interannual perspective are dominated by nutrient distribution and abundance [7,26,62,63]. Meteorological factors were thought to play a more important role in cyanobacteria bloom formation, especially when Taihu Lake's nutrient loadings were sufficient [64,65]. Intensive precipitation carries nutrient loadings from adjacent catchments to the lake [59], therefore, precipitation was also positively correlated with monthly and annual cyanobacteria blooms as what we got (Figures 16 and 17).

Accuracy Deviation Sources in Our Workflow
The validation results show that for rapid and automatic monitoring purposes, our workflow provides acceptable accuracy; however, some deviation sources remained. Firstly, compared with Rayleigh-corrected reflectance data, the MODIS land surface reflectance data (MOD09) used in our study are sensitive to ambient disturbance, which leads to some inevitable noisy, negative, or even a lack of pixel data, as well as patches [50,66]. FAI > −0.004 was used to distinguish cyanobacteria bloom pixels from non-cyanobacteria blooms [15], but no field investigation was performed to validate the accuracy using this critical threshold [15]. Liu et al. recommended FAI > −0.025 as a cyanobacteria bloom extraction threshold in Taihu Lake according to their field investigation [41]. Since most of the previous studies used FAI > −0.004 to distinguish cyanobacteria pixels in Taihu Lake, we also adopted this threshold to allow for comparison of results. FAI could just distinguish dense cyanobacteria blooms that have formed floating scums and have not mixed with the water column [15,16,52], so it is more suitable for monitoring high Chla concentration cyanobacteria blooms [13]. The FAI threshold can extract some other types of aquatic vegetation and contaminate the final statistics, including emerged and floating-leaved macrophytes, which would lead to an overestimation of cyanobacteria blooms, especially for East Lake and Gong Bay, but this deviation should not affect the long-term temporal and large-scale spatial patterns. Regardless, macrophytes distributed in the above two segments can be deduced according to previous studies and the frequency-based approach [41,58].
Another accuracy uncertainty was the land surface adjacent effect. Stray light from brighter land pixels that are adjacent to a dark water target would lead to increased radiance of NIR bands [67]; thus, the water pixels' FAI values would be elevated according to the FAI formula, which would cause erroneous segmentation. The land surface adjacent effect may be influential for up to many kilometers offshore [67,68], but the land mask cannot be shrunk too much toward the water, under the consideration of comprehensive statistics.
Cloud contamination was another deviation source. When processing floating cyanobacteria pixels covered by thin cloud, visual inspection could be used to identify them and classify those pixels to the final floating cyanobacteria statistics, but the single-band threshold would mask thin cloud, so an underestimation of cyanobacteria blooms may have occurred in some cloudy images. The cloud mask would also remove some pixels that were contaminated by the adjacent effect, insufficient atmospheric correction, satellite sensor mechanical fault, and other accidental factors.

Environmental Driving Factors of Taihu Lake's Severe Cyanobacteria Blooms around 2017
Massive nutrient loading input to Taihu Lake came from sewerage, agriculture fertilizer, and livestock drainage. These factors were generally considered fundamental in creating the 2007 cyanobacteria bloom crisis [5,8], which affected water safety and environment of millions of citizens. Other related environmental variables also helped to create a suitable environment for cyanobacteria bloom outbreak, e.g., temperature and wind [5,69]. After that crisis, the local government has made considerable efforts to control the nutrient loadings and pollutants enter Taihu Lake over the whole catchment. After over one decade, opposite trends were observed among some key water quality indices according to monitoring data for the entire lake: TN, NH 3 -N, and COD Mn declined rapidly after the 2007 peak, then remained at a moderate level; TP declined after the 2007 peak, then remained moderate between 2008 and 2014, rose rapidly after 2015, and reached a new peak around 2017 [36]. During the past decade, measures to control catchment pollutants have apparently been effective. However, the huge floods in 2015 and 2016 carried a huge amount of nitrogen and phosphorus to Taihu Lake. Most of the nitrogen with soluble form was discharged after the floods, but most of the phosphorus was deposited in the lake in particulate form. The reserved phosphorus that accumulated in lake sediment was gradually released by means of Microcystis growth, acidity, and oxidation changes [35]. Taihu Lake's severe cyanobacteria blooms in 2017 were attributed to high TP concentration, high temperature, and frequent weak wind, all of which boost cyanobacteria bloom formation [70], which is consistent with our RDA analysis. The role of nitrogen is still ambiguous to some extent, some studies indicated that reducing phosphorus is more effective at causing a decline in cyanobacteria abundance [8,11,69,71,72]. We also believe that reducing TP by the means of controlling external and endogenous sources would be a feasible long-term management strategy, especially under the condition that explanation of TP was 0.135 in annual perspective, much higher than its monthly explanation (0.001) ( Table 4). Long-term climate warming and wind speed decreasing over the past several decades also create a suitable environment for cyanobacteria blooms [37,59,64]. Therefore, establishing a cyanobacteria bloom monitoring system is vital towards taking preventative action in protecting water environments and safety.
As has been mentioned in above sections, FAI is not sensitive enough to distinguish between cyanobacteria and macrophyte [58], so our obtained cyanobacteria bloom spatiotemporal patterns were contaminated by macrophytes more or less, especially in East Lake and Gong Bay. Several studies have indicated that large areas of macrophytes disappeared in East Lake and Gong Bay after 2015, primarily due to mechanized salvage by the local government [42,62,73]. Taihu Lake has changed from a macrophyte-dominated state to a phytoplankton-dominated state under anthropogenic pressure [37]. The existence of an aquatic macrophyte ecosystem would help improve the underwater light environment [8], stabilize phosphorus in the sediment [73], and suppress cyanobacteria growth [74]. Local government and scientists have made great efforts to recover macrophytes; however, artificial planting and restoration in the early 2000s was unsuccessful, largely due to the water environment being unsuitable for macrophyte growth. The artificial restoration of macrophytes can only succeed when the level of nutrient loading is reduced to an appropriate level [75].

Broader Applications of Our Workflow
Inland water has more complicated inherent optical properties (IOPs) than ocean water [76]. Atmospheric correction, the adjacent effect, and other complicated inland water remote sensing problems hinder the data mining and extension of the current methodology to different research locations [77]. We embedded and integrated related processing procedures into a cloud platform (GEE), achieved an acceptable accuracy, and obtained the cyanobacteria bloom spatiotemporal patterns of Taihu Lake. We believe our workflow has the potential to be applied to other lakes and coastal waters for similar purposes. Researchers should attempt to embed different atmospheric correction algorithms and cyanobacteria bloom extraction methods for different inland waters when using our workflow.
The floating cyanobacteria coverage we extracted can only represent the horizontal distribution, but not the cyanobacteria biomass, which must consider the vertical distribution of cyanobacteria biomass in the water column. Thus, related remote sensing inversion algorithms may be embedded into our workflow in future. The FAI threshold method in our workflow could potentially be improved by the algae pixel-growing algorithm (APA) [57] or the histogram adaptive threshold, etc. Overall, further efforts are urgently required to embed many types of conventional inversion algorithms and operational processes into cloud platforms, which will help us find new information by combining remote sensing big data and specific natural or social scenarios.

Conclusions
In this study, we developed an operational cyanobacteria bloom monitoring workflow based on Google Earth Engine, processed nearly 7000 MODIS daily images, and retrieved spatial and temporal patterns of cyanobacteria blooms in China's Taihu Lake, over 19 years (2000 to 2018). Our workflow provides a basis for automatic remote sensing monitoring of cyanobacteria blooms. Several conclusions were reached in our study: (1) Some sources of error in our workflow were reported, such as the FAI threshold, the land surface adjacent effect, and cloud contamination. However, inspected by the validation using in-situ data and remote sensing observations, the workflow still achieved an acceptable accuracy for rapid and automatic monitoring purposes. Our workflow can improve the automation and efficiency of routine environmental management of Taihu Lake and has the potential to be applied to other inland and coastal water areas.
(2) From the perspective of temporal coverage distribution, cyanobacteria blooms in Taihu Lake were moderate before 2006, peaked in 2007, declined rapidly after 2008, remained moderate and stable until 2015, and then reached another peak around 2017. Several lake segments displayed similar temporal trends to the entire lake, especially for those open lake segments. From the perspective of interannual spatial distribution, bay areas and the northwest lake area had more severe cyanobacteria blooms than open lake segments. From the perspective of monthly spatial distribution, most cyanobacteria blooms primarily occurred in April, became severe in July and August, then improved after October. We also found that macrophytes distributed in East Lake and Gong Bay disappeared after 2015 and have not yet been restored.
(3) According to the redundancy analysis, meteorological factors (temperature, wind speed, precipitation, etc.) were positively correlated with cyanobacteria bloom characteristics, from both monthly and interannual perspectives; however, nutrient loadings were only positively correlated with cyanobacteria bloom characteristics from an annual perspective. Reducing total phosphorous (TP) by controlling external and endogenous sources, together with restoring macrophyte ecosystem, would be the feasible long-term management strategies for Taihu Lake.
Author Contributions: T.J. designed and coded the workflow on GEE and analyzed the data, T.J., X.Z. and R.D. drafted and revised the paper.