1. Introduction
Cyanobacterial blooms are a persistent symptom of eutrophication in inland waters and have become a major concern for lake ecosystem conservation and water-resource management. Bloom-forming cyanobacteria may reduce water transparency, alter phytoplankton community structure, affect aquatic habitats, consume dissolved oxygen during bloom decay, and produce cyanotoxins that threaten drinking-water safety and recreational use [1,2,3,4,5]. These problems are particularly serious in large shallow lakes and semi-enclosed eutrophic water bodies, where high nutrient loading, weak water exchange, and frequent meteorological disturbances can support repeated bloom development. In recent decades, climate warming, changes in wind regime, altered precipitation patterns, and continued anthropogenic nutrient inputs have jointly modified the frequency, magnitude, timing, and persistence of cyanobacterial blooms in many lakes [1,2,3,4,5,6]. Therefore, long-term observations are needed not only to identify bloom occurrence, but also to understand how bloom events evolve, decline, and recover under changing environmental conditions.
Satellite remote sensing has provided an effective means of monitoring cyanobacterial blooms over broad spatial extents and long time periods. Compared with routine field sampling, satellite observations can capture lake-wide bloom distribution, short-term spatial redistribution, and interannual variability in a consistent observational framework. A range of optical remote-sensing methods has been developed for bloom monitoring, including empirical chlorophyll-a retrieval models, band-ratio algorithms, fluorescence-related indices, normalized-difference-type indices, and surface-floating algae detection indices [5,7,8,9,10,11,12,13]. Among these methods, the Floating Algae Index (FAI) proposed by Hu [14] has been widely used for detecting floating algal accumulations because it enhances the contrast between near-infrared reflectance and a red–shortwave infrared baseline, and it is relatively robust for long-term monitoring using MODIS observations. Hu et al. [15] further demonstrated the applicability of MODIS observations for characterizing cyanobacterial blooms in Lake Taihu, providing an important methodological basis for subsequent long-term bloom monitoring in large eutrophic lakes.
Based on these methodological advances, remote sensing has been extensively used to reconstruct long-term bloom dynamics in Lake Taihu [16,17,18,19]. Huang et al. [20] used MODIS data to detect algal blooms in Lake Taihu from 2000 to 2011 and examined the roles of water-quality, meteorological, and climatic factors in bloom formation. Subsequent studies further mapped bloom occurrence, bloom area, bloom frequency, and spatial hotspots, showing that Taihu blooms are highly heterogeneous and often concentrated in specific lake regions under the combined influence of nutrient enrichment, wind-driven transport, shoreline configuration, and shallow-water hydrodynamics. Qin et al. [21] emphasized that Lake Taihu continued to suffer from cyanobacterial blooms despite restoration efforts during 2007–2017, and reported a severe bloom event in May 2017 with a very large bloom area, highlighting the persistence of bloom risk in this shallow eutrophic system. These studies have substantially improved understanding of where and when blooms occur in Taihu, and they also show that bloom behavior should be interpreted in relation to lake-region characteristics and the definition of the effective monitoring area.
Lake Dianchi provides a useful contrast to Taihu because it differs markedly in lake morphology, elevation, climatic background, and seasonal thermal conditions. Taihu is a large, shallow, wind-sensitive lake in the lower Yangtze River region, whereas Dianchi is a plateau eutrophic lake with deeper water and a different seasonal hydroclimatic regime. These differences may lead to different bloom persistence and recovery behavior even when similar remote-sensing indicators are used. Previous studies have generally shown that bloom dynamics in eutrophic lakes are regulated by both nutrient conditions and meteorological forcing, but the relative importance of temperature, wind, precipitation, water depth, hydrodynamic mixing, and lake background may vary among lakes [22,23,24,25,26,27,28,29,30,31]. In shallow systems, wind can rapidly redistribute surface cyanobacterial accumulations and alter the apparent bloom area observed by satellite. In more persistent or thermally stratified systems, bloom decline may be more closely related to seasonal temperature changes, water-column stability, and nutrient availability. This means that the same bloom-area or chlorophyll-a signal may represent different ecological and physical processes in different lakes.
Although remote sensing has greatly improved bloom detection and bloom-area monitoring, most existing studies still focus on bloom occurrence, spatial extent, frequency, hotspot distribution, or peak intensity [5,28,32,33,34]. Much less attention has been paid to what happens after the bloom peak or after the visible bloom area begins to contract. This is an important limitation. The disappearance of an obvious surface bloom does not necessarily mean that bloom-related water-quality pressure has ended [34,35]. Chlorophyll-a may remain elevated, residual biomass may persist, and small but persistent bloom patches may continue to affect water quality after the main bloom stage. For lake management, the post-bloom period is directly related to the duration of water-quality risk and the timing of post-event monitoring. However, compared with bloom initiation and outbreak detection, the recovery stage has rarely been quantified as an event-scale remote-sensing process. In particular, there is still no widely used operational framework for estimating how long it takes for satellite-derived chlorophyll-a to return to a non-bloom background level after bloom termination.
In this study, we define this process as operational post-bloom recovery. The term is used in a remote-sensing and management-oriented sense, rather than as an indication of complete ecological restoration. Specifically, it refers to the time interval from bloom termination to the first post-bloom date on which satellite-derived chlorophyll-a falls below an event-specific non-bloom background threshold. This definition allows post-bloom recovery to be quantified as an event-scale indicator. It also provides a basis for examining whether recovery duration is mainly controlled by the magnitude of the preceding bloom, by short-term meteorological transitions during and after bloom decline, or by persistent lake-specific background conditions.
Here, Lake Taihu and Lake Dianchi were selected to test this event-based recovery framework because they represent two distinct eutrophic lake types [28,33,36]. Taihu is a large shallow lake where wind-driven mixing, surface accumulation, and spatial redistribution strongly influence bloom expression. Dianchi is a plateau lake with different thermal and hydrological conditions and a more persistent seasonal bloom background. By applying the same operational framework to both lakes, this study aims to determine whether post-bloom recovery can be consistently quantified from long-term remote-sensing records, and whether the environmental controls of recovery differ between lake types.
The objectives of this study are to:
- (1) reconstruct the long-term temporal and spatial patterns of cyanobacterial blooms in Lake Taihu and Lake Dianchi from 2000 to 2022 using MODIS-derived bloom-area and chlorophyll-a records;
- (2) identify post-bloom recovery events and quantify recovery duration using an event-based operational definition;
- (3) compare recovery-duration distributions and representative post-bloom trajectories between the two lakes; and
- (4) evaluate the environmental controls of recovery duration using statistical screening, nonlinear modeling, grouped contribution analysis, and path-based validation.
The remainder of this paper is organized as follows.
Section 2 introduces the study areas and data sources.
Section 3 describes the remote-sensing preprocessing, bloom identification, chlorophyll-a inversion, recovery-event definition, and environmental-control analysis.
Section 4 presents the long-term bloom patterns, recovery characteristics, recovery trajectories, and driver-analysis results.
Section 5 discusses lake-dependent recovery mechanisms, methodological uncertainties, and implications for bloom monitoring and management.
Section 6 summarizes the main conclusions.
3. Research Methodology
3.1. Data Processing
3.1.1. Cyanobacterial Bloom Identification
Remote-sensing images were preprocessed before bloom identification, including resampling, land masking, cloud screening, and water-leaving reflectance correction. MOD09GQ and MYD09GQ scenes were resampled to 250 m by bilinear interpolation. To reduce contamination from mixed shoreline pixels, emergent vegetation, floating macrophytes, and shallow nearshore substrates, the land–water boundary was eroded by one pixel (250 m) toward the lake interior before lake-wide bloom mapping. Because Lake Taihu is a large shallow lake, contamination from nearshore macrophytes, emergent vegetation such as Typha, floating vegetation, shallow substrates, and mixed shoreline pixels received specific attention during preprocessing. First, the eastern part of Lake Taihu, including East Bay/East Taihu, was excluded from the effective monitoring region because it is characterized by dense aquatic vegetation and has been excluded or treated separately in previous MODIS-based cyanobacterial-bloom studies. Hu et al. [15], for example, excluded East Bay from Taihu bloom statistics because the signals in this area were mainly associated with aquatic vegetation rather than cyanobacteria. Second, the one-pixel inward erosion of the land–water boundary reduced shoreline mixed pixels and nearshore emergent-vegetation contamination. Third, bloom extraction was based on FAI-derived floating-algae signals after cloud screening, water masking, and reflectance correction. These procedures were designed to reduce, although not completely eliminate, the influence of nearshore macrophytes and shallow-water mixed pixels on lake-wide bloom statistics. At MODIS resolution it is not possible to exclude all nearshore vegetation contamination, particularly around localized reed or Typha zones; therefore, the bloom maps are interpreted as screened lake-wide floating-algae signals rather than fine-scale macrophyte maps. Cloud-contaminated pixels were removed using the threshold Rrc(859 nm) > 0.15. To further reduce aerosol effects, water-leaving reflectance was corrected following the method of Wang et al. [46].
Cyanobacterial blooms were identified using the Floating Algae Index (FAI), rather than by applying a chlorophyll-a concentration threshold. Based on red, near-infrared, and short-wave infrared reflectance, FAI enhances the spectral contrast between floating algal mats and surrounding water and has been widely used for bloom detection in both marine and inland waters [5,7,8,10,11,12]. Previous studies have shown that FAI is less sensitive than many alternative indices to atmospheric aerosols, thin clouds, and sun glint, making it suitable for long-term bloom monitoring with MODIS imagery [14]. The index is calculated as follows:

$$
\mathrm{FAI} = R_{rc,NIR} - R'_{rc,NIR}, \qquad
R'_{rc,NIR} = R_{rc,RED} + \left(R_{rc,SWIR} - R_{rc,RED}\right)\frac{\lambda_{NIR} - \lambda_{RED}}{\lambda_{SWIR} - \lambda_{RED}} \tag{1}
$$

In Equation (1), $R_{rc,RED}$, $R_{rc,NIR}$, and $R_{rc,SWIR}$ denote Rayleigh-corrected reflectance in the red, near-infrared, and short-wave infrared bands, respectively. $R'_{rc,NIR}$ represents the linearly interpolated baseline reflectance at the near-infrared wavelength, and $\lambda_{RED}$, $\lambda_{NIR}$, and $\lambda_{SWIR}$ are the center wavelengths of the corresponding bands. The FAI threshold was determined using the gradient-threshold method proposed by Hu et al. [15].
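As a minimal sketch of the index calculation, the function below evaluates FAI for a single pixel, assuming the standard form in which the baseline is a straight line between the red and short-wave infrared reflectances evaluated at the near-infrared wavelength. The band-center wavelengths (645, 859, and 1240 nm) are the MODIS values commonly paired with FAI and are an assumption here, as are the function names and the sample reflectances.

```python
# Band-centre wavelengths in nm. Assumed MODIS values; not listed in the text.
LAMBDA_RED, LAMBDA_NIR, LAMBDA_SWIR = 645.0, 859.0, 1240.0


def fai(r_red, r_nir, r_swir,
        lam_red=LAMBDA_RED, lam_nir=LAMBDA_NIR, lam_swir=LAMBDA_SWIR):
    """Return FAI = R_nir - R'_nir for Rayleigh-corrected reflectances."""
    # Linear red-to-SWIR baseline evaluated at the NIR wavelength.
    baseline = r_red + (r_swir - r_red) * (lam_nir - lam_red) / (lam_swir - lam_red)
    return r_nir - baseline


def is_cloud(r_nir, threshold=0.15):
    """Cloud screen quoted in the text: flag pixels with Rrc(859 nm) > 0.15."""
    return r_nir > threshold


# Hypothetical reflectances: open water versus a floating algal mat.
water_fai = fai(r_red=0.03, r_nir=0.02, r_swir=0.01)
mat_fai = fai(r_red=0.03, r_nir=0.10, r_swir=0.02)
```

With these illustrative inputs the open-water pixel falls below the baseline (negative FAI) and the algal-mat pixel rises above it (positive FAI), which is the contrast the threshold step exploits.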
For Lake Taihu, the bloom-monitoring region was defined as the effective remote-sensing monitoring area rather than the entire administrative lake surface. The eastern part of Lake Taihu, including East Bay/East Taihu, was excluded from the effective monitoring region because this area is characterized by extensive aquatic vegetation and differs from the main cyanobacterial-bloom-prone sectors in optical background and bloom-occurrence characteristics. This treatment is consistent with previous MODIS- and FAI-based studies of cyanobacterial blooms in Lake Taihu, in which East Bay/East Taihu was excluded or treated separately because aquatic vegetation can produce spectral signals similar to floating cyanobacteria and may reduce bloom-detection accuracy [15,47]. Therefore, the bloom-area percentage reported for Lake Taihu in this study refers to the percentage of the defined effective monitoring region, not the percentage of the entire lake surface.
3.1.2. Chlorophyll-a Inversion
Remote-sensing inversion of lake water quality is commonly performed using either empirical models or machine-learning methods [26,48]. Empirical models establish statistical relationships between synchronous image observations and field measurements and remain widely used because they are straightforward to implement and computationally efficient [49,50]. In this study, chlorophyll-a concentrations in Lake Taihu and Lake Dianchi were retrieved using lake-specific empirical models developed from field observations and previous research. Although the reported R² of the Dianchi model is lower than that of the Taihu model, it was retained because it was developed for plateau lakes and provided a consistent MODIS-compatible long-term chlorophyll-a record for the 2000–2022 period. The chlorophyll-a estimates were used primarily to identify event-scale recovery relative to an event-specific non-bloom background threshold, rather than to interpret small differences in absolute concentration on individual dates. Therefore, uncertainty in the chlorophyll-a inversion was considered when interpreting recovery duration and inter-lake contrasts. The inversion algorithms used for chlorophyll-a concentration retrieval are summarized in Table 2.
3.2. Operational Definition of Post-Bloom Recovery
In this study, post-bloom recovery was defined operationally as the elapsed time from the end of a bloom event to the first day on which remotely sensed mean chlorophyll-a concentration fell below a seasonally referenced background threshold. This definition is intended to capture recovery of the remotely sensed chlorophyll-a signal after bloom decline, rather than complete restoration of the lake ecosystem or all aspects of water quality.
Chlorophyll-a concentration was selected as the core recovery indicator because it is directly related to eutrophication status, phytoplankton biomass, and the optical expression of bloom decline in satellite observations. Here, chlorophyll-a is used as an operational indicator of post-bloom state transition in Lake Taihu and Lake Dianchi.
The end date of a bloom event was determined from daily FAI-derived bloom-area records. A bloom day was defined as a day on which the FAI-derived bloom area was greater than or equal to 5% of the effective monitoring area of the lake; days below this threshold were treated as non-bloom days. The 5% threshold serves as an operational criterion for identifying lake-scale bloom days and is intended to exclude small, isolated, or shoreline-confined bloom patches that may not represent a lake-scale bloom event. Similar area-based bloom grading has been used in previous remote-sensing studies. For example, Jing et al. [30,53] classified algal blooms in Lake Dianchi according to bloom-area percentages and defined slight, moderate, and severe bloom conditions using 5%, 10%, and 15% of lake area, respectively, following the bloom-grading framework developed for Lake Taihu.
The 20-day interval threshold was used as an operational event-separation rule rather than as a fixed ecological boundary. This threshold was selected to balance two practical requirements. On the one hand, short interruptions in the bloom-area record may occur because of cloud contamination, missing valid observations, wind-driven redistribution of surface cyanobacterial accumulations, or temporary bloom contraction. Previous satellite studies have shown that cyanobacterial-bloom monitoring depends on valid remote-sensing observations and that bloom area can vary substantially at short time scales because of observation conditions and hydrometeorological redistribution [15,54,55]. Such short interruptions should not automatically split one continuous bloom episode into several independent events. On the other hand, a very long interval would risk merging separate seasonal bloom episodes. Therefore, adjacent FAI-derived bloom episodes were merged when the interval between them was shorter than 20 days and when apparent chlorophyll-a recovery did not persist for at least three consecutive valid observations. If the interval was 20 days or longer, or if chlorophyll-a remained below the event-specific background threshold for at least three consecutive valid observations, the two bloom episodes were treated as independent events.
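The merging rule can be sketched in a few lines of Python. This is an illustrative reading of the rule, not the processing code used in the study: the tuple layout, the function names, the single fixed `background` value, and the sample observations are all assumptions, and the gap is taken as the number of days between the last bloom day of one episode and the next bloom day.

```python
from datetime import date, timedelta


def sustained_recovery(obs, start, end, background, n_required=3):
    """True if chl-a stays below `background` for `n_required` consecutive
    valid observations strictly between `start` and `end`."""
    run = 0
    for day, _, chl in obs:
        if start < day < end:
            run = run + 1 if chl < background else 0
            if run >= n_required:
                return True
    return False


def merge_events(obs, background, area_threshold=0.05, max_gap_days=20):
    """Merge bloom days into events. `obs` is a date-sorted list of
    (date, bloom_area_fraction, chl_a) tuples for valid observations."""
    events = []
    for day, frac, _ in obs:
        if frac < area_threshold:            # non-bloom day
            continue
        if events:
            start, end = events[-1]
            close = (day - end).days < max_gap_days
            if close and not sustained_recovery(obs, end, day, background):
                events[-1] = (start, day)    # extend the current event
                continue
        events.append((day, day))            # start a new event
    return events


# Hypothetical record: (day offset, bloom-area fraction, chl-a).
d0 = date(2020, 7, 1)
rows = [(0, 0.10, 40), (1, 0.08, 38), (3, 0.02, 35), (6, 0.09, 39),
        (7, 0.01, 20), (8, 0.01, 18), (9, 0.01, 17), (12, 0.07, 30),
        (40, 0.06, 33)]
obs = [(d0 + timedelta(days=n), frac, chl) for n, frac, chl in rows]
events = merge_events(obs, background=25.0)
```

In this example the short dip on day 3 does not split the first event, the three consecutive low chlorophyll-a observations on days 7 to 9 separate the bloom on day 12 from it, and the 28-day gap isolates the final bloom day.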
The recovery date was defined as the first day after bloom termination on which mean chlorophyll-a fell below the event-specific background threshold. The background threshold was not a fixed monthly value, and no explicit seasonal signal removal was applied before event detection. Instead, for each event, the threshold was calculated from non-bloom days within a February-to-January annual window surrounding the focal bloom season. This annual window was used to avoid splitting winter recovery periods across calendar years and to provide a seasonally comparable non-bloom reference. Under this framework, the background value was allowed to vary among years and between lakes according to prevailing trophic and meteorological conditions, while bloom days were excluded from the calculation. Recovery is therefore interpreted as the return of the remotely sensed chlorophyll-a signal below its event-specific non-bloom reference level.
Accordingly, the post-bloom recovery duration used in this study is the number of days between bloom termination and the first recovery day defined above.
Figure 2 illustrates the workflow used to identify bloom days, determine event termination, estimate background chlorophyll-a, and calculate operational recovery duration.
Step 1: Identify bloom days and non-bloom days. A bloom day was defined as a day on which bloom area was greater than or equal to 5% of the effective monitoring area of each lake, following the event-based threshold used in previous studies. For Lake Taihu, this effective monitoring area excluded East Bay/East Taihu as described above; therefore, the area denominator used for bloom-day identification was the defined bloom-monitoring region rather than the entire administrative lake surface.
Step 2: Determine bloom termination. Neighboring bloom episodes were merged when the inter-episode interval was shorter than 20 days and when apparent chlorophyll-a recovery did not persist for at least three consecutive valid observations. The last day of the merged episode was treated as the bloom end date.
Step 3: Determine the recovery date. Daily mean chlorophyll-a was derived independently from the chlorophyll-a inversion model, and an event-specific background threshold was estimated from non-bloom days within the surrounding February-to-January annual window. The first post-bloom day on which mean chlorophyll-a fell below that threshold was identified as the recovery date.
Step 4: Calculate recovery duration. Recovery duration was defined as the number of days between bloom termination and the first recovery date identified after the event.
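The threshold and duration steps can be illustrated with a short Python sketch. Two choices below are assumptions, because the text does not specify them: the background threshold is taken as the median chlorophyll-a of non-bloom days, and the February-to-January window is taken as the one that contains the bloom end date. The function names and sample values are hypothetical.

```python
from datetime import date
from statistics import median


def annual_window(day):
    """February-to-January window containing `day` (assumed interpretation)."""
    year = day.year if day.month >= 2 else day.year - 1
    return date(year, 2, 1), date(year + 1, 1, 31)


def background_threshold(obs, bloom_end, area_threshold=0.05):
    """Median chl-a of non-bloom days in the annual window (median is assumed)."""
    lo, hi = annual_window(bloom_end)
    values = [chl for day, frac, chl in obs
              if lo <= day <= hi and frac < area_threshold]
    return median(values)


def recovery_duration(obs, bloom_end, background):
    """Days from bloom termination to the first later valid observation with
    chl-a below `background`; None if no such observation exists."""
    for day, _, chl in obs:
        if day > bloom_end and chl < background:
            return (day - bloom_end).days
    return None


# Hypothetical date-sorted record: (date, bloom-area fraction, chl-a).
obs = [
    (date(2020, 3, 1), 0.00, 10.0), (date(2020, 4, 1), 0.01, 12.0),
    (date(2020, 8, 1), 0.20, 60.0), (date(2020, 8, 5), 0.10, 45.0),
    (date(2020, 8, 7), 0.02, 30.0), (date(2020, 8, 11), 0.01, 11.0),
    (date(2020, 12, 1), 0.00, 14.0),
]
bloom_end = date(2020, 8, 5)
threshold = background_threshold(obs, bloom_end)
duration = recovery_duration(obs, bloom_end, threshold)
```

Here the non-bloom days give a background of 12.0, chlorophyll-a is still 30.0 two days after bloom termination, and the first value below the background occurs on 11 August, so the operational recovery duration is 6 days.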
3.3. Data Analysis of Environmental Factors
Data analysis was organized as a four-layer workflow designed to identify the environmental controls on operational post-bloom recovery at the event scale. First, baseline screening was conducted using Spearman correlation and standardized multiple linear regression (MLR) to identify candidate predictors and provide interpretable linear benchmarks [4,56,57]. Second, random forest regression was used as the primary nonlinear driver-identification model, and spline-smoothed response curves were employed to visualize the shape of key predictor-response relationships [58,59]. Third, hierarchical partitioning was used to quantify the independent contribution of each predictor group [60,61]. Finally, a regression-based path-validation framework, following the logic of path analysis, was used to assess whether dominant controls acted directly on recovery duration or indirectly through bloom-area recovery rate [62,63].
Recovery duration for each event was calculated as

$$
D_i = t_{r,i} - t_{e,i}
$$

where $D_i$ is the recovery duration of event $i$, $t_{e,i}$ is the bloom-end date, and $t_{r,i}$ is the operational recovery date. For each event, predictors were organized into bloom-stage, transition-stage, trigger-stage, water-quality, and lake-identity categories (Table 3). Bloom-stage variables described event magnitude and peak-state conditions; transition-stage variables represented environmental conditions during post-peak adjustment; trigger-stage variables represented conditions immediately surrounding recovery completion; water-quality variables characterized the broader chemical background; and lake identity captured persistent inter-lake differences.
Temperature-related predictors were calculated from ERA5-Land near-surface air temperature. These variables were used to describe atmospheric thermal transitions among bloom, transition, and trigger stages. They were not treated as direct measurements of lake surface or water-column temperature. This distinction is important because lake water temperature may lag behind or deviate from air temperature due to heat storage, vertical mixing, evaporation, and lake morphometry. Accordingly, the temperature effects reported below should be interpreted as statistical associations between atmospheric thermal forcing and post-bloom recovery duration.
Stage-to-stage changes in environmental variables were expressed as

$$
\Delta X_i = X_{s,i} - X_{b,i}
$$

where $X_{b,i}$ denotes the mean value of variable $X$ during the bloom stage of event $i$, and $X_{s,i}$ denotes the corresponding value during the transition or trigger stage. Spearman correlation was used to evaluate monotonic relationships between recovery duration and candidate predictors, and standardized MLR provided a baseline linear model of the form

$$
D_i = \beta_0 + \sum_{p} \beta_p Z_{p,i} + \varepsilon_i
$$

where $\beta_0$ is the intercept, $Z_{p,i}$ is the standardized value of predictor $p$ for event $i$, $\beta_p$ is the corresponding regression coefficient, and $\varepsilon_i$ is the residual term. This baseline model was used only as a reference and not as the primary basis for mechanism inference. Random forest regression then served as the primary nonlinear driver-identification model, and spline-smoothed curves were used to summarize the response shapes of the principal predictors.
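For readers who want to reproduce the screening step without a statistics package, the following sketch computes the Spearman rank correlation as the Pearson correlation of tie-averaged ranks. The function names and the wind and duration values are hypothetical and serve only to illustrate the calculation.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1              # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical event-level values: transition-stage wind and recovery duration.
wind = [1.2, 2.5, 3.1, 4.0, 5.3]
duration = [30, 22, 25, 10, 6]
rho = spearman(wind, duration)
```

Because the statistic depends only on ranks, it is insensitive to the right-skewed duration values that would distort a linear correlation.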
To quantify the independent explanatory role of each driver group, hierarchical partitioning was applied across the grouped predictors. The independent contribution of group $g$ was calculated as:

$$
I_g = \frac{1}{|G|} \sum_{S \subseteq G \setminus \{g\}} \binom{|G|-1}{|S|}^{-1} \left[ R^2\left(S \cup \{g\}\right) - R^2(S) \right]
$$

where $G$ is the set of all predictor groups, $S$ is a subset of groups excluding $g$, and $R^2(\cdot)$ is the goodness of fit of the model built from the indicated groups. Finally, mechanism validation was implemented using a path-based regression framework linking phase-specific meteorological transitions, lake identity, recovery rate of bloom area, and recovery duration. The two component equations were written as

$$
M_i = \alpha_0 + \alpha_1 W_i + \alpha_2 \Delta T_i + \alpha_3 P_i + \alpha_4 L_i + \varepsilon_{M,i}
$$

$$
D_i = \gamma_0 + \gamma_1 M_i + \gamma_2 W_i + \gamma_3 \Delta T_i + \gamma_4 P_i + \gamma_5 L_i + \varepsilon_{D,i}
$$

where $M_i$ is the recovery-rate-of-area mediator, $W_i$ is transition-stage wind, $\Delta T_i$ is trigger-minus-bloom near-surface air-temperature change, $P_i$ is transition-stage precipitation, $L_i$ denotes lake identity, $\alpha$ and $\gamma$ are regression coefficients, and $\varepsilon_{M,i}$ and $\varepsilon_{D,i}$ are residual terms. The symbol $D_i$ denotes recovery duration consistently across the duration definition, the baseline regression model, and the path-based duration equation. This path-based framework was used as a supportive mechanism assessment rather than as a full structural equation model. Together, these four analytical layers allowed the study to move from variable screening to nonlinear identification, and from contribution decomposition to mechanism-oriented validation.
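The hierarchical-partitioning layer of this workflow can be sketched as follows, assuming the common formulation in which a group's independent contribution is its average gain in R² over all orderings in which groups can enter the model. The function name, the group labels, and the R² values are hypothetical; in practice each R² would come from refitting the model on the corresponding subset of predictor groups.

```python
from itertools import combinations
from math import comb


def independent_contributions(groups, r2):
    """Average R^2 gain of each group over all subsets of the other groups.

    `r2` maps a frozenset of group names to the R^2 of the model fitted with
    those groups; the empty frozenset must map to 0.0.
    """
    n = len(groups)
    result = {}
    for g in groups:
        others = [h for h in groups if h != g]
        total = 0.0
        for size in range(n):
            for subset in combinations(others, size):
                s = frozenset(subset)
                gain = r2[s | {g}] - r2[s]
                # Weight so that each hierarchy level counts equally.
                total += gain / comb(n - 1, size)
        result[g] = total / n
    return result


# Hypothetical R^2 values for two predictor groups.
r2 = {
    frozenset(): 0.0,
    frozenset({"transition"}): 0.30,
    frozenset({"bloom"}): 0.20,
    frozenset({"transition", "bloom"}): 0.40,
}
contrib = independent_contributions(["transition", "bloom"], r2)
```

A useful property of this decomposition is that the independent contributions sum to the R² of the full model, so each group's share can be reported as a percentage of explained variance.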
4. Results
4.1. Outbreak Patterns of Cyanobacterial Blooms in Lakes
4.1.1. Temporal Changes in the Size of Algal Bloom Outbreaks
Based on the FAI-derived bloom maps, long-term bloom dynamics in Lake Taihu and Lake Dianchi were reconstructed for 2000–2022. Because the two lakes differ markedly in size, bloom extent was normalized as the percentage of the effective monitoring area covered by cyanobacterial blooms, allowing direct comparison between systems. For Lake Taihu, this percentage was calculated using the defined effective monitoring region, excluding East Bay/East Taihu, rather than the entire administrative lake surface.
Figure 3 presents both the daily bloom-area percentage and the annual mean bloom-area percentage for the two lakes. Cyanobacterial blooms occurred in both lakes throughout the study period, but their temporal trajectories differed substantially.
In Lake Taihu, the daily bloom area fluctuated strongly from year to year, but the long-term pattern remained consistent with the annual mean series. The period 2000–2004 represented the mildest stage of bloom activity. Bloom extent then increased sharply after 2005, with particularly severe events in 2006 and 2007, when the annual mean bloom-area percentage exceeded 10% and the daily maximum bloom coverage within the effective monitoring region approached 80%. From 2008 to 2016, bloom conditions remained serious but comparatively stable, with smaller interannual changes in the annual mean bloom area. A renewed lake-wide intensification occurred in 2017, when both bloom frequency and spatial density exceeded the levels observed in 2006–2007 and the annual mean bloom-area percentage reached 11.9%. Conditions improved during 2018–2022, but bloom extent remained at a relatively high level compared with the early study period.
Lake Dianchi showed a different pattern. Daily bloom area varied markedly, but the annual mean series indicates a phased evolution characterized by an initial increase, a subsequent decline, a secondary increase, and finally a lower-intensity stage. During 2000–2005, bloom extent increased from 2000 and peaked in 2002, when the annual mean bloom-area percentage reached 3.7%, the highest value of the study period. Although daily bloom area fluctuated considerably during 2003–2005, the annual mean remained lower than that of 2000–2002. During 2006–2010, the annual mean bloom area declined relative to the first stage, but the largest single bloom event of the study period occurred in 2008, covering 56.9% of the lake area. Bloom extent increased again during 2011–2014: although the daily maxima were lower than in 2006–2010, the annual mean bloom area exceeded that of the second stage and remained only slightly below that of the first stage. From 2015 to 2022, both daily and annual mean bloom areas were lower than in the preceding three stages. The monthly average bloom-area distributions further highlight these lake-specific seasonal patterns (Figure 4).
4.1.2. Spatial Distribution of Bloom-Frequency Hotspots
To characterize the spatial persistence of bloom activity, the interannual frequency of bloom occurrence was further analyzed at the pixel level for both lakes (Figure 5 and Figure 6). In Lake Taihu, bloom-frequency hotspots expanded from nearshore bays to broader western and central lake sectors after 2005, with pronounced lake-wide intensification in 2007 and 2017. In contrast, Lake Dianchi exhibited a more spatially confined hotspot pattern, with high-frequency bloom occurrence concentrated mainly in the northern lake region.
The Taihu maps indicate a transition from localized, bay-dominated bloom occurrence in the early 2000s to broader lake-wide expansion in later years, consistent with the temporal increase in bloom severity. In Dianchi, the main hotspot remained concentrated in the northern part of the lake, suggesting stronger spatial persistence but a more localized bloom-development background. It should be noted that Figure 3 and Figure 5 describe different quantities. Figure 3 shows the daily and annual mean bloom-area percentages within the effective monitoring region, whereas Figure 5 shows pixel-level bloom-occurrence frequency over the long-term period. Therefore, the color classes in Figure 5 should not be interpreted as the bloom-area percentage on a single day.
4.2. Post-Bloom Recovery Characteristics in Lake Taihu and Lake Dianchi
A total of 82 valid post-bloom recovery events were identified in Lake Taihu, compared with 28 events in Lake Dianchi (Table 4). Recovery duration differed markedly between the two lakes. In Lake Taihu, the mean recovery duration was 8.82 days and the median was 6 days, whereas in Lake Dianchi the mean recovery duration reached 25.32 days and the median was 22 days (Table 4). The interquartile range was also substantially wider in Lake Dianchi than in Lake Taihu, indicating much stronger event-to-event heterogeneity in Dianchi.
The cumulative distribution functions further highlight the contrast in recovery regimes between the two lakes (Figure 7). Recovery in Lake Taihu was concentrated within the short- to medium-duration range, while Lake Dianchi exhibited a much slower cumulative rise, reflecting a substantially larger proportion of long-lasting recovery events. These distributional differences show that post-bloom recovery cannot be adequately described by mean duration alone.
The class-based statistics provide a clearer view of this contrast (Table 5; Figure 8). In Lake Taihu, recovery events were broadly distributed across the 1–20 day range, with 21.95% occurring within 1–2 days, 23.17% within 3–5 days, 20.73% within 6–10 days, and 24.39% within 11–20 days. Only 9.76% of Taihu events lasted longer than 20 days. In contrast, Lake Dianchi was dominated by prolonged recovery, with 57.14% of events exceeding 20 days. Short recovery events were relatively uncommon in Dianchi, with only 10.71% occurring within 1–2 days and 7.14% within 3–5 days (Table 5).
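The class percentages can be checked with a short binning routine. The class edges below follow the text; the per-class event counts (18, 19, 17, 20, and 8 for Lake Taihu) are back-calculated from the reported percentages and the total of 82 events, and the individual duration values are placeholders chosen only so that each event falls in the intended class.

```python
# Duration classes in days, as listed in the text; None marks an open upper edge.
CLASSES = [("1-2", 1, 2), ("3-5", 3, 5), ("6-10", 6, 10),
           ("11-20", 11, 20), (">20", 21, None)]


def class_percentages(durations):
    """Percentage of events in each duration class, rounded to two decimals."""
    counts = {label: 0 for label, _, _ in CLASSES}
    for d in durations:
        for label, lo, hi in CLASSES:
            if d >= lo and (hi is None or d <= hi):
                counts[label] += 1
                break
    n = len(durations)
    return {label: round(100 * c / n, 2) for label, c in counts.items()}


# Placeholder durations reproducing the implied Taihu counts (82 events).
taihu = [1] * 18 + [4] * 19 + [8] * 17 + [15] * 20 + [30] * 8
taihu_pct = class_percentages(taihu)
```

With these counts the routine returns 21.95%, 23.17%, 20.73%, 24.39%, and 9.76%, matching the Taihu values quoted above; the same check gives 57.14% for 16 of the 28 Dianchi events.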
The seasonal distributions of recovery events also differed substantially between the two lakes (Figure 9). Because Taihu and Dianchi have different climatic seasonality, Figure 9 is interpreted as a calendar-month distribution of valid recovery events within each lake rather than as evidence that the same calendar month represents equivalent seasonal conditions in the two systems. In Lake Taihu, recovery events were concentrated mainly from June to December, with the highest frequencies observed between August and October. In Lake Dianchi, recovery events were concentrated primarily in autumn and winter, especially in October, November, and January. November alone accounted for the largest monthly count in Dianchi. These results indicate that the timing of post-bloom recovery is lake-specific and likely linked to differences in seasonal forcing and bloom evolution.
Overall, the results show that Lake Taihu is characterized by more frequent and generally faster post-bloom recovery, whereas Lake Dianchi exhibits fewer but much longer recovery events. This contrast is evident not only in the central tendency of recovery duration but also in the full event-duration distributions and monthly occurrence patterns.
4.3. Representative Post-Bloom Recovery Trajectories
Representative event trajectories further illustrate the process-based meaning of the recovery metric (Figure 10). In Lake Taihu, the event closest to the lake-wide median recovery duration (Event 19, 6 days) showed a relatively rapid decline after bloom termination, with the 7-day mean chlorophyll-a series falling steadily toward the identified recovery date while bloom area simultaneously contracted. In Lake Dianchi, the representative median-like event (Event 16, 22 days) displayed a much more persistent post-bloom phase, with both chlorophyll-a and bloom area decreasing gradually over a longer interval before the recovery threshold was reached. These contrasting trajectories visually confirm that post-bloom recovery in Taihu is typically faster and more abrupt, whereas recovery in Dianchi tends to be slower and more prolonged.
When all recovery events were aligned to the identified recovery date, both lakes exhibited a coherent decline in chlorophyll-a and bloom area toward day 0, but the magnitude and sharpness of that decline differed between systems (Figure 11). In Lake Taihu, normalized chlorophyll-a dropped to 44.5% of the bloom-event peak at the recovery date, while normalized bloom area decreased to 12.1%. In Lake Dianchi, the corresponding normalized values at day 0 were lower (20.4% for chlorophyll-a and 4.9% for bloom area), but the pre-recovery decline was more gradual and extended over a longer time window. The aligned composites therefore indicate that recovery is not an isolated threshold-crossing event; instead, it reflects a structured post-bloom transition during which chlorophyll-a and bloom extent jointly weaken as the system approaches recovery.
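The event-aligned composite can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the procedure used in the study: each event is normalized by the maximum of the series supplied for it (standing in for the bloom-event peak), events are averaged with a simple mean, and the function names and example series are hypothetical.

```python
def align_event(series, recovery_index, window):
    """Express one event's daily series as percent of its peak, keyed by days relative
    to the recovery date (day 0). Days outside the series are omitted."""
    peak = max(series)  # assumed positive; stands in for the bloom-event peak
    aligned = {}
    for rel in range(-window, 1):
        i = recovery_index + rel
        if 0 <= i < len(series):
            aligned[rel] = 100.0 * series[i] / peak
    return aligned

def composite(events, window):
    """Average the aligned events day by day; events lacking a given day are skipped."""
    aligned = [align_event(series, idx, window) for series, idx in events]
    result = {}
    for rel in range(-window, 1):
        values = [a[rel] for a in aligned if rel in a]
        if values:
            result[rel] = sum(values) / len(values)
    return result

# Two hypothetical events: (daily values, index of the recovery date within the series).
events = [([10, 8, 4, 2], 3), ([20, 10, 5], 2)]
print(composite(events, window=2))
```

Reading the composite at day 0 gives the kind of "percent of peak at recovery" figure quoted in the text, while the values at negative days describe how gradual the approach to recovery is.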
Daily observations during the recovery phase showed different degrees of association between chlorophyll-a concentration and bloom area in the two lakes. Because the scatter clouds in Figure 12 are highly dispersed and right-skewed, the relationship between chlorophyll-a during recovery and FAI-derived bloom area during recovery was not interpreted as a simple deterministic linear relationship. Instead, the association was evaluated using Spearman rank correlation, and the trend was visualized using a non-parametric LOWESS smoother.
On the original scale, Lake Taihu showed a statistically significant but weak positive association between chlorophyll-a and FAI-derived bloom area during recovery (Spearman’s rho = 0.23, n = 913, p < 0.001), whereas Lake Dianchi showed a markedly stronger association (Spearman’s rho = 0.64, n = 583, p < 0.001). To improve interpretability and reduce the influence of strong right-skewness, both variables were also examined after log10(x + 1) transformation. Because Spearman rank correlation depends only on ranks, it is invariant to strictly increasing transformations such as log10(x + 1); the rank-based statistics were therefore identical for the original-scale and log-transformed data, while the transformed plots provided a clearer visual representation of the point-cloud structure.
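The invariance of Spearman's rho under the log10(x + 1) transformation can be checked directly. The sketch below implements rank correlation with the standard library only; the paired values are hypothetical and serve only to show that the statistic does not change when both variables are log-transformed.

```python
import math

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical paired observations (chlorophyll-a, bloom area), not the study's data.
chl = [3.2, 15.0, 7.5, 48.0, 22.1, 5.0, 90.3]
area = [0.0, 12.0, 1.5, 30.0, 140.0, 2.0, 95.0]
log_chl = [math.log10(v + 1) for v in chl]
log_area = [math.log10(v + 1) for v in area]

print(spearman(chl, area), spearman(log_chl, log_area))  # identical rank statistics
print(pearson(chl, area), pearson(log_chl, log_area))    # the linear correlation is not invariant
```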
To further assess the potential nonlinear form of the relationship, several candidate parametric nonlinear models were compared, including logarithmic, power, exponential-saturation, and Michaelis–Menten-type saturation models. For Lake Taihu, the best-performing nonlinear model was a logarithmic function, but its explanatory power remained limited (R² = 0.056, RMSE = 9.385, 5-fold CV RMSE = 9.399, AIC = 4092.6), confirming that the relationship was weak and highly dispersed. For Lake Dianchi, the best-performing nonlinear model was a Michaelis–Menten-type saturation model (R² = 0.421, RMSE = 5.999, 5-fold CV RMSE = 6.030, AIC = 2095.0), indicating that the relationship is better interpreted as a stronger nonlinear monotonic association rather than a simple linear coupling.
Therefore, Figure 12 indicates a weak association in Lake Taihu and a stronger, partly nonlinear association in Lake Dianchi, rather than a uniform deterministic relationship across the two lakes.
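For readers comparing the reported fit statistics, the sketch below shows one common way an AIC is obtained from a least-squares fit, namely n ln(RSS/n) + 2k with RSS/n equal to the squared RMSE. The text does not state which AIC formula or parameter counts were used, so both the formula and the values of k below (2 for the logarithmic model, 3 for the saturation model) are assumptions made for illustration.

```python
import math

def aic_least_squares(n, rmse, k):
    """AIC for a least-squares fit under Gaussian errors, up to an additive constant:
    n * ln(RSS / n) + 2k, where RSS / n is the squared RMSE and k the number of parameters."""
    return n * math.log(rmse ** 2) + 2 * k

# n and RMSE are the values reported in the text; k is an assumed parameter count.
taihu_aic = aic_least_squares(n=913, rmse=9.385, k=2)    # logarithmic model
dianchi_aic = aic_least_squares(n=583, rmse=5.999, k=3)  # Michaelis-Menten-type model
print(round(taihu_aic, 1), round(dianchi_aic, 1))
```

Under these assumed parameter counts the formula lands within 0.1 of the reported AIC values, which suggests, but does not establish, that a definition of this form was used. Because the two lakes have different sample sizes, the AIC values are comparable among candidate models within a lake, not between lakes.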
4.4. Integrated Analysis of Environmental Controls on Operational Post-Bloom Recovery
4.4.1. Traditional Statistical Screening and Baseline Regression
Traditional statistical screening identified several variables consistently associated with recovery duration (Figure 13). Spearman analysis showed that lake identity, transition-stage wind, trigger-minus-bloom temperature change, and transition-stage precipitation were among the strongest correlates of recovery duration. Standardized MLR confirmed that the signs of these relationships were stable: stronger transition-stage wind was generally associated with shorter recovery duration, whereas stronger cooling from bloom to trigger stage and the lake-background contrast were associated with prolonged recovery. Taken together, the baseline statistics indicate that post-bloom recovery was more strongly linked to phase-specific environmental transitions than to bloom magnitude alone.
4.4.2. Nonlinear Responses and Variable Importance
Random forest regression, used here as the primary nonlinear driver-identification model, confirmed that recovery duration was governed by a limited set of dominant predictors (Table 6). The most influential variables were trigger-minus-bloom temperature change, lake identity, and transition-stage wind, followed by transition-stage precipitation and selected bloom-state descriptors. Spline-smoothed response curves further showed that the effects of wind and temperature were not purely linear. Recovery duration decreased more rapidly above moderate transition-stage wind levels, whereas stronger negative temperature departures from bloom to trigger stage were associated with disproportionately longer recovery in the upper tail of the response.
4.4.3. Contribution Decomposition of Variable Groups
Hierarchical partitioning clarified how much explanatory information was carried by each variable group (Figure 14; Table 7). Trigger-stage meteorology accounted for the largest independent contribution (31.5%), followed by lake identity (27.6%) and transition-stage meteorology (19.4%). Bloom-stage variables and water-quality variables contributed substantially less. This decomposition indicates that post-bloom recovery was dominated by short-term atmospheric transitions superimposed on lake-specific background sensitivity, whereas bloom magnitude and static water-quality conditions played secondary roles.
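The idea behind an independent contribution can be sketched in a few lines: a group's contribution is its average gain in goodness of fit over every order in which the groups could enter the model. The example below is a generic illustration of that averaging, not the implementation used in the study; the group names and R² values in `fit` are hypothetical.

```python
from itertools import permutations

def independent_contributions(groups, fit):
    """`fit` maps a frozenset of group names to the goodness of fit (e.g. R^2) of the
    model built from those groups. Returns each group's mean increment over all entry orders."""
    totals = {g: 0.0 for g in groups}
    orders = list(permutations(groups))
    for order in orders:
        included = frozenset()
        for g in order:
            expanded = included | {g}
            totals[g] += fit[expanded] - fit[included]
            included = expanded
    return {g: totals[g] / len(orders) for g in groups}

# Hypothetical R^2 values for every subset of three variable groups.
groups = ["trigger_met", "lake", "transition_met"]
fit = {
    frozenset(): 0.00,
    frozenset({"trigger_met"}): 0.20,
    frozenset({"lake"}): 0.18,
    frozenset({"transition_met"}): 0.12,
    frozenset({"trigger_met", "lake"}): 0.32,
    frozenset({"trigger_met", "transition_met"}): 0.28,
    frozenset({"lake", "transition_met"}): 0.26,
    frozenset({"trigger_met", "lake", "transition_met"}): 0.40,
}

contrib = independent_contributions(groups, fit)
full = fit[frozenset(groups)]
for g in groups:
    print(g, round(100 * contrib[g] / full, 1), "% of explained variation")
```

The contributions sum to the fit of the full model, which is why they can be reported as percentage shares. The enumeration grows factorially with the number of groups, so this direct form is only practical for a handful of groups.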
4.4.4. Path-Based Validation of the Recovery Mechanism
Path-based validation was used to examine whether the dominant meteorological controls acted directly on recovery duration or indirectly through the recovery rate of bloom area (Figure 15; Table 8 and Table 9). The results showed that lake identity exerted a significant negative effect on the recovery rate of bloom area, and that a higher recovery rate in turn significantly shortened recovery duration, indicating a measurable indirect pathway. In contrast, trigger-minus-bloom temperature change mainly affected recovery duration directly, whereas its indirect effect through bloom-area recovery rate was weak. Transition-stage wind showed a negative direct coefficient for recovery duration, but its retained effect in the path-based model was weaker than that of temperature change and lake identity.
Overall, the path-based results support a dual-control pattern in which operational post-bloom recovery is shaped by both direct phase-specific meteorological forcing and indirect modulation through lake-dependent recovery pace.
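The distinction between direct and indirect pathways follows the usual product-of-coefficients logic of path analysis. The sketch below is a generic illustration with hypothetical standardized coefficients, not the values in Tables 8 and 9.

```python
def path_effects(a, b, c_direct):
    """a: predictor -> mediator, b: mediator -> outcome, c_direct: predictor -> outcome.
    The indirect effect is the product a * b; the total effect adds the direct path."""
    indirect = a * b
    return {"direct": c_direct, "indirect": indirect, "total": c_direct + indirect}

# Hypothetical standardized coefficients: the predictor lowers the bloom-area recovery
# rate (a < 0), and a higher recovery rate shortens recovery duration (b < 0).
effects = path_effects(a=-0.4, b=-0.5, c_direct=0.3)
print(effects)
```

Two negative paths multiply to a positive indirect effect: a predictor that slows bloom-area contraction lengthens recovery duration through the mediator, in addition to any direct effect it has.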
5. Discussion
5.1. Contrasting Post-Bloom Recovery Regimes Between Lake Taihu and Lake Dianchi
The clearest result of this study is that post-bloom recovery differed substantially between the two lakes. Lake Taihu had more recovery events and a much larger share of short-to-medium recovery durations, whereas Lake Dianchi was characterized by fewer events but a much stronger long-duration tail. This difference is evident in both the ECDF and the class-based statistics. In Taihu, only 9.76% of events lasted longer than 20 days, while in Dianchi 57.14% of events fell into the >20 day class. The monthly distributions show the same contrast: Taihu recovery events were concentrated mainly from June to December, whereas Dianchi recovery events were concentrated mainly in autumn and winter, with November accounting for the largest monthly count.
These differences suggest that bloom termination does not lead to the same type of post-bloom adjustment in the two systems. In Taihu, the representative event and the event-aligned composite both indicate a relatively rapid decline toward the recovery threshold. In Dianchi, the same analyses show a slower and more persistent transition. This interpretation is also supported by the recovery-phase association analysis. During recovery, chlorophyll-a and FAI-derived bloom area showed a weak but significant monotonic association in Taihu (Spearman’s rho = 0.23, n = 913, p < 0.001), whereas the association was stronger in Dianchi (Spearman’s rho = 0.64, n = 583, p < 0.001). This contrast suggests stronger decoupling between pigment concentration and surface bloom expression in Taihu than in Dianchi.
The absence of identified recovery events in some months should not be interpreted as the complete absence of cyanobacterial blooms during those periods. Instead, it reflects the event-based definition used in this study, under which a valid post-bloom recovery event requires a preceding bloom termination followed by a return of remotely sensed chlorophyll-a below the event-specific background threshold. For Lake Taihu, previous remote-sensing studies have shown that cyanobacterial-bloom coverage is generally low from December to April, begins to increase rapidly during April–May, and remains relatively high from May to November, with most large-scale blooms starting in April–May and ending in November–December [64]. In addition, Microcystis in Lake Taihu tends to shift from overwintering or low-activity conditions to spring recruitment and active growth as temperature increases, rather than entering a typical post-bloom decline stage during early spring [65,66]. Therefore, the lack of Taihu recovery events from February to May is consistent with the seasonal bloom-development cycle and the operational recovery definition.
For Lake Dianchi, algal blooms can occur throughout the year, but their frequency is lower in winter–spring and higher in summer–autumn, with severe bloom events usually occurring during July–September [33,53]. Climate analyses have also suggested that summer–autumn warming, reduced precipitation, and declining wind speed can favor algal-bloom outbreaks in the Dianchi watershed [67]. Consequently, February–July in Dianchi more often corresponds to low-frequency, developing, or intensifying bloom conditions rather than completed post-bloom recovery events. This explains why the identified recovery events in Dianchi were concentrated mainly after the major bloom-development period.
5.2. Recovery Was Controlled More by Phase-Specific Transitions than by Bloom Magnitude Alone
The integrated factor analysis indicates that operational post-bloom recovery was governed mainly by phase-specific meteorological transitions rather than by bloom magnitude alone. This conclusion was supported across baseline statistics, nonlinear modeling, hierarchical partitioning, and path-based validation. In particular, transition-stage wind, trigger-minus-bloom temperature change, and lake identity emerged repeatedly as the most influential predictors, while bloom-stage and background water-quality variables explained comparatively less variance. This pattern suggests that the recovery stage is not simply a passive continuation of bloom decline, but a distinct adjustment phase regulated by short-term shifts in atmospheric forcing.
This interpretation is physically plausible. In shallow or meteorologically responsive systems, post-bloom chlorophyll signals may decline rapidly when wind-driven mixing disrupts surface accumulation or when short-term thermal changes alter upper-water-column stability. Previous studies in Lake Taihu have emphasized the sensitivity of cyanobacterial surface aggregation to wind-driven transport and mixing, supporting the idea that short-term wind conditions can rapidly modify bloom expression and apparent recovery pace [16,17,18,19,68]. In Dianchi, by contrast, bloom-related changes have been more strongly linked to thermal and abiotic background conditions, which is consistent with the stronger role of cooling-sensitive and persistence-prone recovery pathways identified here [22,23,26,27,33,38,69]. Thus, the same category of meteorological forcing can generate different recovery outcomes depending on lake-specific environmental sensitivity.
The contribution decomposition and path-based results further clarify the mechanism behind this lake-dependent behavior. Trigger-stage meteorology and lake identity together explained a large share of recovery-duration variability, indicating that recovery was shaped not only by the immediate forcing environment but also by the background sensitivity of the receiving lake. The indirect pathway through bloom-area recovery rate suggests that part of the lake effect operated through the pace of spatial bloom contraction rather than through duration alone. In this sense, operational post-bloom recovery is controlled by both direct phase-specific forcing and indirect lake-dependent modulation.
At the same time, the explanatory power of individual models remained moderate, indicating that post-bloom recovery is still a heterogeneous process that cannot be reduced to a single deterministic pathway. The mechanisms identified here should therefore be interpreted as dominant statistical controls rather than a complete causal representation of recovery dynamics. Even so, the consistency across multiple analytical layers strengthens the conclusion that post-bloom recovery is transition-dominant, lake-dependent, and more strongly associated with short-term meteorological adjustment than with bloom magnitude alone.
These findings can be summarized as a lake-dependent transition framework (Figure 16), in which the same category of phase-specific meteorological forcing is filtered through different lake backgrounds and translated into distinct post-bloom recovery trajectories in Lake Taihu and Lake Dianchi.
5.3. Implications for Bloom Monitoring and Management
These findings have two practical implications. First, bloom monitoring should not stop at bloom termination. The results show that the period after visible bloom decline still contains meaningful information about how long the system remains above its operational background condition. In Taihu, the shorter recovery durations and the weaker coupling between chlorophyll-a and bloom area suggest that apparent surface retreat may occur relatively quickly. In Dianchi, the longer recovery durations and the stronger chlorophyll–area coupling indicate that post-bloom persistence remains important even after the bloom has begun to contract. In other words, the end of a bloom event is not necessarily the end of bloom-related water-quality pressure.
Second, the results show that remote sensing can be used not only to detect bloom occurrence, but also to quantify lake-specific recovery behavior. The representative recovery trajectories, aligned composites, and path-based analysis together show that recovery has internal structure: it is not just a threshold-crossing date, but a measurable transition phase. This extends the role of remote sensing from outbreak surveillance to post-bloom assessment and makes inter-lake comparison more informative. In this study, the same event-based framework captured a rapid-adjustment pattern in Taihu and a persistence-prone pattern in Dianchi, showing that recovery metrics can provide comparative information that bloom area or chlorophyll peak alone cannot fully capture.
5.4. Limitations and Future Directions
This study also has several limitations. First, the recovery metric used here is operational rather than fully ecological. Recovery was defined as the first post-bloom day on which chlorophyll-a fell below the event-specific background threshold, so it represents recovery of the remotely sensed chlorophyll signal rather than complete ecosystem restoration. Second, the number of valid recovery events was smaller in Dianchi than in Taihu, which may affect the stability of event-level comparisons and model ranking. Third, although the integrated framework identifies dominant drivers and plausible pathways, it does not by itself provide a complete causal explanation of post-bloom recovery. Fourth, although shoreline erosion and water masking were used to reduce mixed-pixel effects, residual contamination from nearshore macrophytes, floating vegetation, or shallow substrates cannot be completely ruled out at MODIS resolution. Fifth, differences in valid optical-image availability between the two lakes may affect the exact timing of individual recovery dates, especially for representative event trajectories. These uncertainties were considered when interpreting recovery duration and inter-lake contrasts.
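The operational definition in the first limitation can be made concrete with a short sketch. It assumes, for illustration only, that the input is a daily chlorophyll-a series starting on the first day after bloom termination, that days without a valid observation are marked `None`, and that duration is counted from 1; the function name and example values are hypothetical.

```python
def recovery_duration(chl_after_bloom, background_threshold):
    """Return the 1-based day on which chlorophyll-a first falls below the event-specific
    background threshold, or None if that never happens within the series.
    Days without a valid observation (None) cannot trigger recovery."""
    for day, value in enumerate(chl_after_bloom, start=1):
        if value is not None and value < background_threshold:
            return day
    return None

# Hypothetical post-bloom series with one missing observation.
series = [30.0, 25.0, None, 12.0, 9.0, 8.0]
print(recovery_duration(series, background_threshold=10.0))
```

Because missing observations cannot trigger recovery, a gap around the true crossing date pushes the identified recovery date later, which is one way that differences in image availability between lakes (the fifth limitation) can affect individual recovery dates.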
Residual contamination from nearshore macrophytes and emergent vegetation is a particular concern in Lake Taihu. Although East Bay/East Taihu was excluded from the effective monitoring region and the shoreline mask was eroded inward by one 250 m pixel, MODIS observations cannot fully resolve fine-scale mixtures of cyanobacteria, aquatic vegetation, suspended sediment, and shallow substrates in complex nearshore waters [5,13]. Recent studies have also emphasized that aquatic vegetation and cyanobacterial blooms may show overlapping spectral behavior in visible and near-infrared bands, and that nearshore Taihu pixels can contain mixed signals from cyanobacteria, aquatic vegetation, and suspended sediment. Therefore, the bloom maps in this study should be interpreted as screened lake-wide floating-algae signals rather than fine-scale maps capable of completely separating cyanobacterial blooms from all macrophyte-related signals. Future work should combine MODIS with higher-spatial-resolution imagery, such as Sentinel-2, Landsat, or GF-series data [9,10,13,57], and apply mixed-pixel decomposition or vegetation-specific masking to further reduce nearshore contamination.
Another source of uncertainty is the representation of thermal conditions. In this study, near-surface air temperature from ERA5-Land was used as a proxy for atmospheric thermal forcing because continuous event-scale lake surface temperature or in situ water-temperature observations were not available for both lakes throughout 2000–2022. This choice is reasonable for describing regional atmospheric thermal transitions, as previous studies have shown that lake surface or near-surface water temperature can be statistically related to air temperature and other climatic variables [42,43]. However, air temperature and lake surface water temperature are not interchangeable. Lake depth, heat storage, evaporation, wind-driven mixing, and water-column stability can cause lake water temperature to respond differently from air temperature, and global lake studies have shown that lake surface warming varies markedly among lake systems [45]. Therefore, the temperature-related effects identified here should be interpreted as atmospheric thermal-transition signals rather than direct water-temperature controls. Future work should incorporate lake surface temperature, in situ water temperature, or coupled thermal–hydrodynamic observations to better resolve the thermal mechanisms of post-bloom recovery.
Future work should combine event-based satellite indicators with higher-frequency in situ observations, especially water-column chlorophyll profiles, hydrodynamic information, and process-based ecological measurements [31,70]. This would help distinguish more clearly between rapid disappearance of surface bloom signals, slower decline in biomass, and broader ecological recovery. It would also help clarify how wind, temperature, precipitation, and lake-specific background jointly shape post-bloom trajectories across different types of eutrophic lakes.
6. Conclusions
This study developed an event-based remote-sensing framework to quantify operational post-bloom recovery in Lake Taihu and Lake Dianchi using long-term chlorophyll-a and bloom-area records. The results show that post-bloom recovery differed markedly between the two lakes. Lake Taihu had more recovery events and generally shorter recovery durations, whereas Lake Dianchi had fewer events but a much larger proportion of prolonged recovery episodes.
The analyses further show that recovery duration was governed mainly by phase-specific meteorological transitions and lake background rather than by bloom magnitude alone. Trigger-stage meteorology and lake identity made the largest independent contributions, and the path-based validation indicated that lake identity influenced recovery both directly and indirectly through the recovery rate of bloom area, while trigger-minus-bloom temperature change mainly acted through a direct pathway. These results indicate that post-bloom recovery is not a uniform decay process, but a lake-dependent transition shaped by both short-term forcing and persistent system background.
Overall, the study shows that remote sensing can be used to quantify not only bloom occurrence, but also the duration and structure of post-bloom recovery. This provides a practical basis for comparing lakes, interpreting bloom decline more carefully, and extending bloom assessment beyond outbreak detection alone.