Article

Modeling Algal Toxin Dynamics and Integrated Web Framework for Lakes

1
NOAA, Global Systems Laboratory, Boulder, CO 80305, USA
2
National Academies of Sciences, Engineering, and Medicine, Washington, DC 20001, USA
3
İstanbul Provincial Directorate of Agriculture and Forestry, Ministry of Agriculture and Forestry, İstanbul 34724, Türkiye
4
Oak Ridge Institute for Science and Education, Oak Ridge, TN 37830, USA
5
USDA-ARS, Adaptive Cropping Systems Laboratory, Beltsville, MD 20705, USA
6
IIHR–Hydroscience & Engineering, University of Iowa, Iowa City, IA 52242, USA
7
Department of Geographical and Sustainability Sciences, University of Iowa, Iowa City, IA 52245, USA
8
River-Coastal Science and Engineering, Tulane University, New Orleans, LA 70118, USA
9
ByWater Institute, Tulane University, New Orleans, LA 70118, USA
*
Author to whom correspondence should be addressed.
Toxins 2025, 17(7), 338; https://doi.org/10.3390/toxins17070338
Submission received: 3 June 2025 / Revised: 26 June 2025 / Accepted: 2 July 2025 / Published: 3 July 2025

Abstract

Harmful algal blooms (HABs) are a major environmental concern because they harm public and environmental health, recreational services, and the economy. HAB modeling is challenging due to inconsistent and insufficient data, as well as the nonlinear nature of algae formation data. However, it is crucial for attaining sustainable development goals related to clean water and sanitation. We therefore employed the sparse identification of nonlinear dynamics (SINDy) technique to model microcystin, an algal toxin, using dissolved oxygen as a water quality metric and evaporation as a meteorological parameter. SINDy is a novel approach that combines sparse regression and machine learning to reconstruct the analytical representation of a dynamical system. The model achieved mean absolute percentage error (MAPE) values of approximately 2% in three of the four lakes and 11% in the remaining lake. Moreover, a model-driven, web-based interactive tool was created to support environmental education, raise public awareness of HAB events, and produce more effective solutions to HAB problems through what-if scenarios. This interactive and user-friendly web platform allows users to track the status of HABs in lakes and to observe the impact of specific parameters on harmful algae formation.
Key Contribution: This work modeled the variation in microcystin, an algal toxin, utilizing the SINDy approach, which integrates sparse regression and machine learning, employing dissolved oxygen as a water quality indicator and evaporation as a meteorological variable. Furthermore, an interactive, web-based, user-friendly platform was created based on this model.

Graphical Abstract

1. Introduction

Reduced water clarity, unpleasant odors and tastes, the proliferation of harmful algal blooms (HABs), the loss of aquatic animal populations, increased nutrient concentrations in primary producers, acidification, deoxygenation and shifts in the aquatic food web are all results of eutrophication, which is caused by an influx of nutrients like fertilizers or pollutants [1,2,3]. HABs have many causes, such as water pollution from agricultural activities, wastewater treatment plant discharges, leakages from sewer systems, natural factors like pH and light levels, and climate change impacts. In recent decades, the scientific community has come to regard HABs as a serious hazard to the environment [2,3,4]. They have several detrimental effects on the environment [5,6], such as toxin accumulation in reservoirs or water bodies [7], as well as on public health and the economy [4,8].
The impact of climate change on HABs is anticipated to manifest in alterations to their frequency, magnitude, biogeographical distribution, phenology [9,10], and toxicity [11]. Generally, nutrient pollution from agriculture and industry, water temperature, and water quality parameters are the main drivers of HABs occurrence [5,12,13]. The Intergovernmental Panel on Climate Change (IPCC) Special Report on the Ocean and Cryosphere in a Changing Climate (SROCC), which was approved in September 2019, was the first time the connection between HABs and climate change was stated in a formal way.
HABs primarily consist of one or more species of cyanobacteria, commonly referred to as blue-green algae, including Microcystis and Anabaena [14]. Microcystins, predominantly generated by Microcystis spp., are the most widespread cyanobacterial toxins in global freshwater systems [15].
HAB researchers have endeavored to predict HAB indicators through statistical, process-based, and hybrid models [11,16,17]. Ref. [18] employed a generalized additive model with an identity-link function for a Gaussian distribution. The model incorporated diverse environmental variables such as sunspot numbers, winter North Atlantic Oscillation (NAO) indices, monthly mean rainfall, air and sea water temperature, salinity, winds and Ekman transport, and phytoplankton data. Using cyanobacteria biomass as an indicator, ref. [19] used a Bayesian network (BN) [20] to relate future climate change and land-use management scenarios to ecological state. Ref. [21] utilized empirical dynamic modeling (EDM) [22] to predict chlorophyll-a, demonstrating the efficacy of dynamic models in forecasting ecological parameters.
Some studies reveal that the frequency, volume, biogeography, phenology, and toxicity of HABs are likely to vary as a result of climate change [11,23,24,25,26,27]. Increased ocean stratification [28,29] brought on by greater glacier melting, higher air temperatures [30,31], changing precipitation [32,33] and wind patterns [34,35,36], changed nutrient availability and composition [37], light intensity [33], and ocean acidity all have an impact on HABs [11]. Furthermore, the dispersion of HABs can be influenced by wind, air and lake temperature [38], while precipitation can facilitate the introduction of nutrients into aquatic environments, thereby promoting the development of HABs [6].
Various efforts have been made to study HAB events observed in the state of Iowa, which relies heavily on agriculture as its primary economic sector [39,40]. Ref. [41] developed a novel metric for normalizing microcystin congeners, enabling a comparative analysis of water bodies impacted by cyanobacterial harmful algal blooms (CyanoHABs). Additionally, they introduced a geometry-based image processing technique that facilitated the integration of aerial images captured by a drone above the water surface. A significant linear relationship was observed between the concentrations of chlorophyll-a and microcystin in lakes located in Iowa, as evidenced by a correlation coefficient of 62%. The researchers also observed that the feasibility of multispectral imaging for estimating microcystin concentrations may be limited at present, primarily due to the spectral constraints of the multispectral camera. Ref. [42] collected 65 water samples from various lake beaches in Iowa to examine the potential relationship between microcystin concentrations and the abundances of genes responsible for toxin production. Strong correlations were observed between the abundance estimates of mcyA genes and microcystin concentrations in lake water samples. Ref. [41] also found that microcystins were present in all 10 Iowa lakes that were sampled, and microcystin was identified as the predominant toxin in 90% of these samples. Ref. [43] collected water samples from 38 lakes in Iowa from 2018 to 2021 and developed three models using nine variables, which included chemical, biological, climatic, and land-use factors, to predict cyanobacterial HABs for a one-week period.
HAB modeling is a challenging task for the following reasons: (1) HABs are affected by various multidimensional factors [44,45]; (2) HABs show complex nonlinear behavior [21,36,46]; (3) they are not uniform in either time or space [47,48]; and (4) sufficient and continuous data are not available [41,49,50]. Therefore, existing physical models have difficulty [51] finding relationships among the factors that affect HAB prediction, and they require many variable parameters. Overcoming these restrictions is costly and time-consuming.
Sparse Identification of Nonlinear Dynamics (SINDy) [52] employs sparsity methodologies and machine learning algorithms to reveal the differential equations that govern a dynamical system. It exploits the observation that the dynamics of most systems are governed by a limited number of significant terms. This method has been used in various applications, such as simulating and optimizing microalgal and cyanobacterial photo-production processes [53], physics-informed learning [54], predicting blood glucose levels [55], and modeling air pollutants [56].
To overcome the challenges of HAB modeling, SINDy was used to model microcystin, one of the main indicators of HABs, using dissolved oxygen and daily total evaporation. We selected dissolved oxygen as the water quality criterion; it is also the water quality parameter with the largest amount of accessible data. Another factor included in the study is evaporation, which combines a set of atmospheric variables. The analysis also considered other meteorological characteristics, including wind speed, maximum air temperature, lake water mixed layer temperature, and precipitation. Since all of these meteorological characteristics are correlated, it is crucial to incorporate only one of them into the modeling process to ensure precise modeling. SINDy allows us to model HAB formation with a discrete input dataset [57] and to identify the governing equations that underlie nonlinear natural phenomena [52].
To communicate the drivers and impacts of the HAB model effectively, however, such predictive models must be integrated with web-based technologies. Web-based tools are widely used for data sharing, scientific visualization, data analytics, monitoring critical parameters, dissemination of necessary warnings, and decision support. The development of these information systems [58] is extremely beneficial for enhancing social awareness [59] in terms of scientific communication [60]. As previously mentioned, the state of Iowa has had a severe HAB problem in recent years, and the public’s awareness of this significant environmental issue falls short of expectations [61]. Therefore, it is imperative to disseminate information and enhance public awareness of HABs in lakes across Iowa.
To address this requirement, we created a web-based interactive communication tool that includes a SINDy-based model of the algal toxin microcystin for selected lakes in Iowa. This tool was developed to share the results, estimate the condition of the lakes under what-if scenarios, increase awareness about HABs, and support decision-making. In addition, it provides an easily accessible mapping environment (e.g., Google Maps API) on the web. The platform may be used not only by water professionals but also by teachers, students and the public. When users change any variable, they can see the resulting change in harmful algae formation in the lake and determine whether the harmful algae value remains within the safe range for swimming, fishing, etc.
This paper is structured as follows: Section 2 presents the results of the HAB modeling and its integration into the web-based information system. Section 3 gives the conclusions together with some suggestions and evaluations. Section 4 explains the study area, data, SINDy method, and educational framework of HABs in some Iowa lakes.

2. Results and Discussions

2.1. SINDy Model

The analyzed data were partitioned into training data (75% of the total) and test data (the remaining 25%). Gaussian noise with a standard deviation of 10% of the root mean square error (RMSE) was added to the training data so that only the most significant terms were retained in the model. The subset of candidate terms in the system was determined using sequential thresholded least squares (STLSQ), which is the optimizer used by the SINDy algorithm in its standard form. The algorithm is specifically designed for the least squares formulation and performs effectively, although it cannot easily incorporate modifications such as extra constraints, robust formulations, or nonlinear parameter estimates [62]. The model was fitted to the noisy data, and the coefficients were stored in an array. The performance of the model was assessed using the test data. As the threshold increases, the model includes fewer terms, making it sparser and reducing the risk of overfitting to noise. Nevertheless, setting the threshold too high can remove crucial dynamics. Hence, an optimal threshold value was sought. Figure 1 illustrates the relationship between the RMSE on the test data and the threshold values for the testing trajectory of $dM/dt$, where $M$ indicates microcystin. The optimal threshold value is the one that minimizes the RMSE while preserving significant terms; put simply, it captures the important dynamics without overfitting to noise.
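The threshold search described above can be sketched as follows. This is a minimal illustration with NumPy rather than the PySINDy implementation used in the study; the synthetic data, the noise level, the candidate library, and the function names `stlsq` and `sweep_thresholds` are assumptions made for the example.

```python
import numpy as np

def stlsq(theta, dxdt, threshold, max_iter=10):
    """Sequentially thresholded least squares for one target derivative."""
    coef = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(max_iter):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        active = ~small
        if not active.any():
            break
        # Refit using only the candidate terms that survived the threshold.
        coef[active] = np.linalg.lstsq(theta[:, active], dxdt, rcond=None)[0]
    return coef

def sweep_thresholds(theta_train, dxdt_train, theta_test, dxdt_test, thresholds):
    """Return (threshold, test RMSE, number of active terms) for each candidate."""
    rows = []
    for thr in thresholds:
        coef = stlsq(theta_train, dxdt_train, thr)
        rmse = float(np.sqrt(np.mean((theta_test @ coef - dxdt_test) ** 2)))
        rows.append((thr, rmse, int(np.count_nonzero(coef))))
    return rows

# Synthetic demonstration only (not the lake data): dM/dt = M - 2*M*D.
rng = np.random.default_rng(0)
m = rng.uniform(0.1, 1.0, 200)
d = rng.uniform(0.1, 1.0, 200)
theta = np.column_stack([np.ones_like(m), m, d, m * d, m**2, d**2])
true_coef = np.array([0.0, 1.0, 0.0, -2.0, 0.0, 0.0])
dmdt = theta @ true_coef

split = int(0.75 * len(m))  # 75% training, 25% test
noisy_train = dmdt[:split] + rng.normal(0.0, 0.01, split)  # assumed noise level
results = sweep_thresholds(theta[:split], noisy_train,
                           theta[split:], dmdt[split:],
                           thresholds=[0.01, 0.1, 0.5, 5.0])
best_threshold, best_rmse, n_terms = min(results, key=lambda row: row[1])
```

A very large threshold removes every term, which is the "too high" case described above; plotting the second column of `results` against the first gives a curve of the kind shown in Figure 1.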
West Okoboji, McIntosh Woods, Black Hawk, and Geode Lakes were chosen for modeling microcystin with SINDy, using dissolved oxygen and evaporation as factors. Details of threshold selection and the equations for microcystin, dissolved oxygen, and evaporation are provided for West Okoboji Lake. Only the final equations are given for the other lakes. The data presented in Figure 2, Figure 3, Figure 4 and Figure 5, which display the observed and predicted microcystin graphs for each lake, were not converted back to their pre-processing values because they illustrate the rates of change in the microcystin levels; such a back-transformation would also elevate the error rates.
West Okoboji Lake
The optimal threshold value was determined to be 0.038 for West Okoboji datasets. Figure 1 displays RMSE values plotted against the threshold values for constructing the model using these datasets.
The equation system for microcystin (M), dissolved oxygen (D), and evaporation (E), determined using the optimal threshold value, can be represented as follows (Equations (1)–(3)):
$$\frac{dM}{dt} = 0.984\,M - 2.352\,MD - 0.464\,ME \tag{1}$$
$$\frac{dD}{dt} = 0.121\,ME \tag{2}$$
$$\frac{dE}{dt} = 0.114\,M - 0.054\,D - 0.281\,MD + 0.099\,D^{2} + 0.069\,DE - 0.058\,E^{2} \tag{3}$$
The rate of change in microcystin data was calculated by integrating these equations. Figure 2 displays the rate of change in microcystin and the projected microcystin values for West Okoboji Lake.
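As an illustration of this integration step, the sketch below advances the West Okoboji system with a classical fourth-order Runge-Kutta scheme. The coefficients are those printed in Equations (1)–(3). Several choices are assumptions: terms not joined by an explicit plus sign are taken to be subtracted, and the initial state, the step size, and the function names are hypothetical. The study itself used PySINDy, whose integrator may differ.

```python
def rhs(state):
    """Right-hand side for (M, D, E) in normalized units.

    Assumption: terms printed without an explicit plus sign are subtracted.
    """
    m, d, e = state
    dm = 0.984 * m - 2.352 * m * d - 0.464 * m * e
    dd = 0.121 * m * e
    de = (0.114 * m - 0.054 * d - 0.281 * m * d
          + 0.099 * d**2 + 0.069 * d * e - 0.058 * e**2)
    return (dm, dd, de)

def rk4_step(f, state, dt):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(f, state0, dt, n_steps):
    """Return the trajectory [state0, state1, ...] with n_steps + 1 entries."""
    trajectory = [tuple(state0)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(f, trajectory[-1], dt))
    return trajectory

# Hypothetical normalized initial state (M, D, E) and step size.
trajectory = integrate(rhs, (0.2, 0.5, 0.4), dt=0.25, n_steps=40)
predicted_rate = [rhs(state)[0] for state in trajectory]  # dM/dt along the path
```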
Figure 2 demonstrates that the SINDy model predicts this change accurately, especially when the microcystin change is very sharp.
McIntosh Woods Lake
SINDy gives rise to the model presented in Equations (4)–(6) for McIntosh Woods Lake.
$$\frac{dM}{dt} = 3\,ME \tag{4}$$
$$\frac{dD}{dt} = 0.065 + 0.539\,M + 0.337\,E - 3.105\,M^{2} - 0.356\,MD - 0.775\,ME + 0.145\,D^{2} - 0.241\,DE - 0.328\,E^{2} \tag{5}$$
$$\frac{dE}{dt} = 0.33 + 12.97\,M - 2.32\,D + 0.5\,E - 90.64\,M^{2} - 0.7\,MD - 24.82\,ME + 2.74\,D^{2} + 1.23\,DE - 0.5\,E^{2} \tag{6}$$
The model identified by SINDy contains a large number of terms, which may indicate that it overfits the current data and lacks generalizability. Figure 3 displays the rate of change in microcystin and the projected microcystin values for McIntosh Woods Lake.
The microcystin change rate in McIntosh Woods Lake has remained constant over an extended period of time. It was observed that this value increased rapidly towards the end of the time period. Although the forecast model accurately predicted this sudden rise, it appears to have overestimated it.
Black Hawk Lake
The model for Black Hawk Lake is derived from SINDy and is represented by Equations (7)–(9).
$$\frac{dM}{dt} = 0.9828\,M - 1.08\,MD - 1.5732\,ME \tag{7}$$
$$\frac{dD}{dt} = 0.133\,M^{2} \tag{8}$$
$$\frac{dE}{dt} = 0.097\,M^{2} - 0.124\,ME \tag{9}$$
Figure 4 depicts the rate of change in microcystin and the predicted microcystin values for Black Hawk Lake.
While accurately predicting variations in change is challenging, the SINDy model effectively captures fluctuations in the rate of change.
Geode Lake
The equation system (Equations (10)–(12)) for Geode Lake is as follows:
$$\frac{dM}{dt} = 0.952\,M - 0.356\,MD - 1.688\,ME \tag{10}$$
$$\frac{dD}{dt} = 0.072\,D + 0.029\,E + 0.301\,M^{2} - 0.257\,DE - 0.022\,E^{2} \tag{11}$$
$$\frac{dE}{dt} = 0.188\,M - 0.372\,ME \tag{12}$$
Figure 5 displays the rate of change in microcystin and the projected microcystin values for Geode Lake.
The estimations for Geode Lake are comparable to those conducted for other lakes. It is seen that the model accurately predicts times of rapid increase or decrease in rate of change values. The prediction outcomes for lakes have demonstrated that the forecasts generated by SINDy are highly effective in predicting the time periods during which harmful algae experience rapid growth or decline. Table 1 shows the prediction model performance results for every lake. Correlation coefficient (r), root mean square error (RMSE) and mean absolute percentage error (MAPE) are used as performance indicators.
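For reference, the three performance indicators can be computed as in the following sketch, which uses the standard textbook definitions. The function names and the sample values are illustrative and are not taken from Table 1. MAPE is undefined when an observed value is zero, so the sketch assumes nonzero observations.

```python
import math

def pearson_r(obs, pred):
    """Pearson correlation coefficient between observations and predictions."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mape(obs, pred):
    """Mean absolute percentage error in percent (observations must be nonzero)."""
    return 100.0 / len(obs) * sum(abs((o - p) / o) for o, p in zip(obs, pred))

# Illustrative values only.
observed = [1.0, 2.0, 4.0]
predicted = [1.1, 1.8, 4.4]
scores = (pearson_r(observed, predicted),
          rmse(observed, predicted),
          mape(observed, predicted))
```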
The correlation coefficients between model results and observations are very close to 1 for all lakes except McIntosh Woods, because SINDy closely captured the observed values for those lakes. The prediction results for McIntosh Woods Lake are satisfactory, although the prediction accuracy is lower than for the other lakes. The MAPE results indicate that the SINDy model effectively forecasts the fluctuations in nonlinear microcystin data.

2.2. HALGIS Web Framework

HALGIS is a publicly available informational web platform (Figure 6) that can be accessed freely at https://hydroinformatics.uiowa.edu/lab/halgis (accessed on 2 July 2025). The landing page contains details on the datasets utilized and the analysis available in the system. These harmful algae ML-based prediction results based on SINDy and environmental factors were incorporated into the HALGIS. The data obtained from multiple sources will be temporarily saved in a local database. The web platform incorporates the Google Maps API to display GeoJSON files of the selected lakes in the study area (Figure 7). This allows users to see the size of the lake and which river network and watershed (HUC-8 level) it is connected to. Users are able to open the harmful algae estimator module and change the environmental variables (microcystin, dissolved oxygen, and evaporation) to see the harmful algae trend for the West Okoboji Lake (Figure 8).
HALGIS elevates the understanding of environmental sustainability among different user groups. The general public can utilize it as an informational guide to assess the quality of their nearby lakes, assisting in promoting local ecological awareness and engagement. For educators, it provides a dynamic, interactive tool that promotes in-depth exploration and understanding of aquatic ecosystems and the influence of environmental factors. Students, particularly those involved in environmental science programs, can use HALGIS as a substantial research tool, leveraging authentic data to practice and refine their research skills. The interactivity offered by the platform fosters proactive learning and encourages users to think critically about the interrelatedness of environmental factors and their effect on our water bodies. Thus, HALGIS proves to be a remarkable asset in fostering a more informed and environmentally conscious society.
The HAB estimator indicates a positive correlation between the rise in microcystin levels and the occurrence of HAB events in the lake. Furthermore, it is possible to analyze not only the presence of microcystin but also the comprehensive changes in dissolved oxygen and evaporation parameters, as well as the variations in HAB occurrences in the lake. Displaying the interactive HAB trend would enhance users’ knowledge of this environmental concern and improve the communication abilities of environmental science students as well as the educators.

3. Conclusions

Environmental contaminants and climate change can lead to the development of harmful algal blooms (HABs) in lakes, affecting ecological balance. These formations in lakes can grow to such an extent that they endanger the survival of other organisms in the environment and pose a risk to public health by contaminating drinking water sources. This study aimed to simulate HABs, a critical aspect for environmental health. For this objective, all water quality parameters linked to HABs, indicators of harmful algal presence in the lake, and pertinent meteorological factors were analyzed. Data availability is the main focus in these assessments. As is known, the primary issue in HAB investigations is the insufficient data availability. The second issue that needs to be addressed is synchronizing the data for these parameters. For instance, one water quality measurement could be recorded within an hour, whereas another one could be measured at a different day or time. After identifying various discrepancies, a comprehensive set of data combinations was established, and multi-dimensional time series were generated by aligning them with relevant meteorological data. These time series were used to model HABs with SINDy.
Multiple reasons influenced the selection of SINDy for HAB modeling. The SINDy approach is chosen for its exceptional modeling capability, which remains effective even with limited data. Additionally, it demonstrates robustness in handling data noise features and is well-suited for discrete data. These advantages have been highlighted in research conducted by [57,63,64,65,66]. As a result of modeling experiments, microcystin (a toxic substance produced by harmful algae), dissolved oxygen (a water pollution parameter), and evaporation (a meteorological variable containing temperature and precipitation information) were selected as the three variables that gave the sparsest equation to be used in the study.
The primary lake in the study is West Okoboji Lake, which is actively used for recreational activities such as boating, swimming, and water skiing. The graphs based on the microcystin equations derived from SINDy (Figure 2, Figure 3, Figure 4 and Figure 5) reveal the following. The equations derived for all lakes accurately represented the numerical change in microcystin and closely described the variations in microcystin values. The high correlation and low error values in Table 1 confirm this observation. The SINDy method accurately predicted the nonlinearly varying toxin microcystin, which is produced by cyanobacteria. All models created by SINDy for all lakes share a strong capacity to forecast extreme points, in contrast to conventional prediction models.
The HALGIS web platform was developed as an information system with integrated data access, analysis, and visualization capabilities. HALGIS is a comprehensive online platform that provides access to harmful algae conditions, HAB-related data, information, and interactive visualizations. HALGIS offers information on monitoring harmful algal blooms and the real-time condition of lakes, while also serving as an educational tool on environmental pollution. By adjusting parameter values, students can gain insight into future HAB formation and observe the impact of climate change on environmental sustainability.
To address the data issue, crucial for future HAB studies, it is essential to standardize data collection by ensuring all measurements are taken simultaneously in a uniform format. Validation with ground-based data is essential for the wider utilization of satellite datasets, highlighting the important nature of the data collection step. Benchmark datasets following FAIR (findability, accessibility, interoperability, and reusability) data principles should be created and shared to tackle the significant threat to environmental health issues posed by HABs. Benchmark datasets may enhance estimation and prediction studies on harmful algae by granting access to the latest data. As research progresses, understanding of climate change and its effects on HABs increases, allowing for more precise planning of preventative and protective actions. Advancements in information systems for lake ecosystems and HABs will allow for real-time monitoring of lake pollutants and environmental health.

4. Materials and Methods

4.1. Study Area

In recent decades, Iowa’s lakes have experienced the expansion of cyanoHABs distribution [41,61]. The existing monitoring of cyanoHABs in Iowa is insufficient, resulting in a paucity of data on specific microcystin congeners [41]. West Okoboji, McIntosh Woods (Clear Lake), Black Hawk, and Geode Lakes (see Figure 9) that had the most easily obtainable data were chosen as the pilot lakes for the study. These lakes are significant due to their comparatively larger surface area, proximity to rivers, and regular utilization by the public for sports and recreational pursuits, including fishing (with a habitat for over 25 fish species), swimming, camping, and boating. In Figure 9, blue lines, blue dots, and red dots denote rivers, lakes, and selected lakes, respectively.

4.2. Case Study

The study analyzed various water quality parameters, including dissolved oxygen, chlorophyll-a, total phosphorus, total nitrogen, microcystin, pH, and turbidity data of the lakes, to identify indicators of harmful algal blooms from the Iowa Department of Natural Resources AQuIA database. The study was unable to use every variable due to the unfixed sampling intervals (7 days, 8 days, 10 days or 14 days, etc.) and the very small and discontinuous number of data points for some parameters. After considering the availability and consistency of the data, it was determined that microcystin and dissolved oxygen data would be used.
The time range of algal data is limited to the period from May to September due to certain meteorological and lake water conditions that promote algae development. The primary challenge encountered during the investigation was the acquisition of adequate data at consistent intervals. West Okoboji Lake was designated as the primary lake due to its ample size and form, which allow for data collection from multiple observation sites. The data for West Okoboji was collected from the stations listed in Table 2. Data for additional lakes were obtained at the specific sample site of each corresponding lake. Figure 10 presents the statistical information and graphical representations of the microcystin data. The trend line (red dashed line) in Figure 10 clearly illustrates the rise in microcystin values.
In addition, hourly ERA5-Land data from the ECMWF Reanalysis, the latest global reanalysis covering 1950 to the present at 0.1° horizontal resolution, were used as meteorological data. The meteorological variables used in the study were hourly wind speed at 2 m, air temperature, evaporation, lake mixed layer temperature, and precipitation, all of which were converted to a daily scale. However, because rainfall is limited during the summer months, when HABs occur, a significant portion of the precipitation data consists of zero values, and precipitation was therefore omitted from the analysis. In addition, the other meteorological factors, except evaporation, were eliminated during modeling experiments since they interact with each other, and evaporation allowed the most accurate model to be built.
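The hourly-to-daily conversion can be sketched as below. This is a simplified illustration, not the processing chain used in the study: it assumes that each hourly record holds the amount for that hour only, that totals are appropriate for evaporation and precipitation while means or maxima suit temperature and wind speed, and the function name `to_daily` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def to_daily(records, how="sum"):
    """Aggregate (timestamp, value) hourly records into one value per calendar day."""
    reducers = {
        "sum": sum,                                  # e.g., daily total evaporation
        "mean": lambda vals: sum(vals) / len(vals),  # e.g., daily mean wind speed
        "max": max,                                  # e.g., maximum air temperature
    }
    if how not in reducers:
        raise ValueError(f"unknown aggregation: {how}")
    buckets = defaultdict(list)
    for stamp, value in records:
        buckets[stamp.date()].append(value)
    return {day: reducers[how](vals) for day, vals in sorted(buckets.items())}

# Hypothetical two-day record with 0.1 units of evaporation every hour.
start = datetime(2020, 6, 1)
hourly = [(start + timedelta(hours=h), 0.1) for h in range(48)]
daily_total = to_daily(hourly, how="sum")
```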

Data Preprocessing

The phase space was reconstructed, and the attractor of the microcystin data was then plotted to reveal the characteristics of the data. To reconstruct the phase space, it is necessary to determine the time delay and the embedding dimension [67]. The study employed the mutual information function [68] to determine the time delay. The first minimum of the average mutual information (AMI) is selected as the optimal time delay. According to Figure 11a, the time delay ($\tau$) was taken as 13, and the embedding dimension was assumed to be 3. Figure 11b displays the two-dimensional projection of the resulting attractor. Given that the maximal Lyapunov exponent of the microcystin data is negative (−0.54), it can be concluded that the data is not chaotic [69,70]. However, the presence of a strange attractor in the microcystin data indicates that this data is nonlinear.
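A minimal sketch of the delay-embedding step is given below. It only illustrates how delay vectors are formed for a given time delay and embedding dimension and how the first minimum of an AMI curve can be located. The study used the R packages named later in this section, and the function names and sample values here are hypothetical.

```python
def delay_embed(series, tau, dim):
    """Build delay vectors (x[i], x[i + tau], ..., x[i + (dim - 1) * tau])."""
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this delay and dimension")
    return [tuple(series[i + k * tau] for k in range(dim))
            for i in range(n_vectors)]

def first_local_minimum(values):
    """Index of the first value lower than its left neighbour and not above its right one."""
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] <= values[i + 1]:
            return i
    return None

# Hypothetical series of 40 samples, embedded with tau = 13 and dimension 3.
series = [float(i) for i in range(40)]
vectors = delay_embed(series, tau=13, dim=3)

# Hypothetical AMI curve indexed by lag; its first minimum gives the delay.
ami_curve = [3.0, 2.0, 1.0, 2.0, 0.5, 1.0]
lag = first_local_minimum(ami_curve)
```

Plotting the first two components of each delay vector gives a two-dimensional projection of the kind shown in Figure 11b.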
Modeling nonlinear data such as microcystin is a challenging task. Furthermore, measurement errors and experimental flaws introduce noise into the data. Deriving the dynamics of a parameter or process from data that is both noisy and nonlinear is exceedingly difficult. To ensure accuracy, the microcystin data underwent a sequence of procedures prior to being modeled using the SINDy algorithm (see Figure 12). PySINDy [71,72] was used in this study to implement SINDy.
Data pre-processing techniques, such as standardization and normalization, are used to make variables that have different scales comparable. This helps machine learning algorithms to make more accurate and consistent predictions [56,73,74]. Therefore, microcystin, dissolved oxygen and evaporation values were normalized due to their significant differences in scale. The microcystin data given in Figure 10 is raw data. When only the microcystin samples that coincide in time with dissolved oxygen and evaporation data are kept, the number of microcystin data points decreases even further, as seen in Figure 13.
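The two operations mentioned here, keeping only simultaneous samples and rescaling each variable, can be sketched as follows. The text does not state which normalization was applied, so min-max scaling to [0, 1] is an assumption, as are the function names and the sample values.

```python
def align_on_common_dates(*series):
    """Keep only the dates that are present in every {date: value} series."""
    common = sorted(set.intersection(*(set(s) for s in series)))
    return common, [[s[day] for day in common] for s in series]

def min_max(values):
    """Rescale values to [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical samples keyed by ISO date strings.
microcystin = {"2020-06-01": 0.2, "2020-06-08": 0.9, "2020-06-15": 0.5}
dissolved_oxygen = {"2020-06-01": 8.1, "2020-06-15": 7.4, "2020-06-22": 9.0}
evaporation = {"2020-06-01": 3.0, "2020-06-08": 4.0, "2020-06-15": 5.0}

dates, (m_raw, d_raw, e_raw) = align_on_common_dates(
    microcystin, dissolved_oxygen, evaporation)
m_norm, d_norm, e_norm = min_max(m_raw), min_max(d_raw), min_max(e_raw)
```

In this toy example only two of the three microcystin samples survive the alignment, which mirrors the reduction in data points described above.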
The microcystin data utilized in the study were acquired through weekly sampling. The modified Akima interpolation technique (MAkima) [75], as utilized by [56], was employed here for two reasons: integrating a continuous-time system of ordinary differential equations requires a finer discretization of the time interval, and the interpolation also serves as data augmentation. The MAkima approach modifies the Akima algorithm and is based on shape-preserving piecewise cubic Hermite interpolating polynomial (PCHIP) interpolation [76]. The authors refer to this pre-processing step as data augmentation because it increases the number of data points. Interpolation is a data augmentation approach utilized in machine learning systems [77]. Essentially, the MAkima procedure relies on spline interpolation to determine the values between two given points, resulting in a finer level of discretization. Through this procedure, the number of data points for each variable is quadrupled. Figure 14 shows the raw and splined microcystin data after the normalization step. Figure 15 shows augmented microcystin, dissolved oxygen and evaporation data together after MAkima interpolation.
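The refinement step can be illustrated with the pure-Python sketch below, which follows the commonly documented modified Akima slope formula and evaluates a cubic Hermite polynomial on each interval. It is not the MATLAB routine used in the study. The grid refinement by a factor of four, the function names, and the sample values are assumptions, and the query points are assumed to be sorted and to lie inside the sampled range.

```python
def makima_slopes(x, y):
    """Node slopes from the modified Akima weighting of neighbouring secants."""
    n = len(x)
    delta = [(y[i + 1] - y[i]) / (x[i + 1] - x[i]) for i in range(n - 1)]
    if n == 2:
        return [delta[0], delta[0]]
    # Extrapolate two secant slopes beyond each end of the data.
    d0 = 2.0 * delta[0] - delta[1]
    dm1 = 2.0 * d0 - delta[0]
    dn = 2.0 * delta[-1] - delta[-2]
    dn1 = 2.0 * dn - delta[-1]
    ext = [dm1, d0] + delta + [dn, dn1]
    w = [abs(ext[i + 1] - ext[i]) + abs(ext[i + 1] + ext[i]) / 2.0
         for i in range(n + 2)]
    slopes = []
    for i in range(n):
        w_left, w_right = w[i], w[i + 2]
        total = w_left + w_right
        if total == 0.0:
            slopes.append(0.0)  # locally flat data
        else:
            slopes.append((w_right * ext[i + 1] + w_left * ext[i + 2]) / total)
    return slopes

def makima_interpolate(x, y, x_new):
    """Evaluate the piecewise cubic Hermite interpolant at sorted points x_new."""
    s = makima_slopes(x, y)
    out, k = [], 0
    for xq in x_new:
        while k < len(x) - 2 and xq > x[k + 1]:
            k += 1
        h = x[k + 1] - x[k]
        t = (xq - x[k]) / h
        h00 = 2 * t**3 - 3 * t**2 + 1
        h10 = t**3 - 2 * t**2 + t
        h01 = -2 * t**3 + 3 * t**2
        h11 = t**3 - t**2
        out.append(h00 * y[k] + h10 * h * s[k] + h01 * y[k + 1] + h11 * h * s[k + 1])
    return out

def refine_grid(x, factor=4):
    """Split every sampling interval into `factor` equal sub-intervals."""
    grid = []
    for a, b in zip(x[:-1], x[1:]):
        grid.extend(a + (b - a) * j / factor for j in range(factor))
    grid.append(x[-1])
    return grid

# Hypothetical weekly samples (day, normalized microcystin).
days = [0.0, 7.0, 14.0, 21.0, 28.0]
values = [0.10, 0.35, 0.30, 0.80, 0.60]
fine_days = refine_grid(days, factor=4)
fine_values = makima_interpolate(days, values, fine_days)
```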
The data pre-processing steps have a crucial role in facilitating the extraction of valuable information from data [78]. Applying smoothing and denoising techniques is beneficial for obtaining accurate outcomes when using the SINDy method [79]. In this research, the final stage of data preprocessing is data smoothing. The Whittaker-Henderson approach [80,81,82,83] was used to smooth microcystin and meteorological variables. Whittaker-Henderson smoothing is a successful method of smoothing discrete-time data that is based on spline smoothing and is specifically designed for equally spaced data points [78]. Figure 16 displays the normalized and augmented microcystin data with the smoothed version of this data. The R libraries utilized for AMI, Lyapunov exponent calculations, and Whittaker-Henderson smoothing are ‘tseriesChaos’, ‘nonlinearTseries’, and ‘pracma’, respectively. MATLAB was utilized for the implementation of MAkima.
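For equally spaced data, Whittaker-Henderson smoothing amounts to solving a penalized least squares problem, as in the NumPy sketch below. This is the textbook formulation rather than the R routine used in the study; the smoothing parameter `lam`, the difference order, and the sample values are assumptions.

```python
import numpy as np

def whittaker_smooth(y, lam=10.0, order=2):
    """Minimize sum((y - z)**2) + lam * sum((order-th difference of z)**2)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Rows of `diff` apply the order-th finite difference to a length-n vector.
    diff = np.diff(np.eye(n), n=order, axis=0)
    return np.linalg.solve(np.eye(n) + lam * diff.T @ diff, y)

# Hypothetical normalized series.
raw = np.array([0.10, 0.40, 0.20, 0.60, 0.50, 0.90, 0.70, 1.00])
smoothed = whittaker_smooth(raw, lam=5.0)
```

Larger values of `lam` give a smoother curve. With a second-order penalty, a perfectly linear series is returned unchanged.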

4.3. Sparse Identification of Nonlinear Dynamics (SINDy)

Brunton et al. [52] combined sparse regression and machine learning with nonlinear dynamical systems to model nonlinear processes from noisy data. The only structural assumption about the model is that the dynamics are governed by a few key terms, so that the governing equations are sparse in the space of candidate functions. Through sparse regression, SINDy identifies the smallest number of terms in the governing equations needed to represent the data accurately. This yields a succinct model that balances accuracy against complexity and thereby avoids overfitting. SINDy is a machine learning technique that derives dynamical system models from time series data; these models may take the form of ordinary or partial differential equations [84].
The approach first constructs a library of linear and nonlinear candidate basis functions. The active entries of the coefficient vector are then determined by sparse regression. Finally, the model is refitted using only the active terms, while the remaining terms are discarded according to the regularization parameter of the sparse regression [85].
The state $x(t)$ of a dynamical system can be described by $\dot{x}(t) = f(x(t))$. To determine the function $f$ from the data, a time history of the state $x(t)$ is collected, and the derivative $\dot{x}(t)$ is either measured or numerically approximated from $x(t)$. The data are sampled at times $t_1, t_2, \ldots, t_m$ and arranged into two matrices, a data matrix $X$ and its derivative $\dot{X}$:

$$
X = \begin{bmatrix} x^{T}(t_1) \\ x^{T}(t_2) \\ \vdots \\ x^{T}(t_m) \end{bmatrix}
= \begin{bmatrix}
x_1(t_1) & x_2(t_1) & \cdots & x_n(t_1) \\
x_1(t_2) & x_2(t_2) & \cdots & x_n(t_2) \\
\vdots & \vdots & \ddots & \vdots \\
x_1(t_m) & x_2(t_m) & \cdots & x_n(t_m)
\end{bmatrix},
\qquad
\dot{X} = \begin{bmatrix} \dot{x}^{T}(t_1) \\ \dot{x}^{T}(t_2) \\ \vdots \\ \dot{x}^{T}(t_m) \end{bmatrix}
= \begin{bmatrix}
\dot{x}_1(t_1) & \dot{x}_2(t_1) & \cdots & \dot{x}_n(t_1) \\
\dot{x}_1(t_2) & \dot{x}_2(t_2) & \cdots & \dot{x}_n(t_2) \\
\vdots & \vdots & \ddots & \vdots \\
\dot{x}_1(t_m) & \dot{x}_2(t_m) & \cdots & \dot{x}_n(t_m)
\end{bmatrix}
$$

A library, denoted as $\lambda(X)$, is then created, which contains candidate nonlinear functions of the columns of $X$:

$$
\lambda(X) = \begin{bmatrix}
| & | & | & | \\
1 & X & X^{P_2} & X^{P_3} \\
| & | & | & |
\end{bmatrix}
$$

Here, $X^{P_i}$ denotes polynomials of the $i$th degree. At this point, a sparse regression problem can be formulated to find a coefficient matrix $C = [\xi_1, \xi_2, \ldots, \xi_n]$ that identifies the active nonlinearities in the dynamical system:

$$
\dot{X} = \lambda(X)\, C
$$

Each column $\xi_k$ of $C$ is a sparse vector of coefficients that determines which terms are active on the right-hand side of the corresponding row equation $\dot{x}_k = f_k(x)$ of $\dot{x}(t) = f(x(t))$. Once $C$ has been determined, a model for each row of the governing equations can be constructed as follows:

$$
\dot{x}_k = f_k(x) = \lambda(x^{T})\, \xi_k
$$
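The regression for the coefficient matrix can be sketched in Python using sequentially thresholded least squares, the sparse regression scheme of the original SINDy paper [52]. The text does not specify the solver settings, so the threshold, the iteration count, the degree-2 library, and the two-state test system below are assumptions chosen for illustration; they are not the lake model or its parameters.

```python
import numpy as np


def poly_library(X):
    """Candidate functions [1, x1, x2, x1^2, x1*x2, x2^2] for a two-state system."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1**2, x1 * x2, x2**2])


def stlsq(theta, X_dot, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares for X_dot = theta @ C."""
    C = np.linalg.lstsq(theta, X_dot, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(C) < threshold
        C[small] = 0.0  # discard inactive terms
        for k in range(X_dot.shape[1]):  # refit each state equation
            active = ~small[:, k]
            if active.any():
                C[active, k] = np.linalg.lstsq(
                    theta[:, active], X_dot[:, k], rcond=None
                )[0]
    return C


# Hypothetical test system (not the lake model):
#   x1' = -0.5*x1 + 1.0*x1*x2,   x2' = 1.0 - 2.0*x2^2
C_true = np.zeros((6, 2))
C_true[1, 0], C_true[4, 0] = -0.5, 1.0
C_true[0, 1], C_true[5, 1] = 1.0, -2.0

rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 2))  # sampled states
X_dot = poly_library(X) @ C_true  # exact derivatives, no noise
C_hat = stlsq(poly_library(X), X_dot)
```

Each column of `C_hat` plays the role of one sparse coefficient vector: its nonzero entries select the library terms that appear in the corresponding state equation. With measured data the derivatives would be approximated numerically, and the threshold would be tuned against the fit error.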

4.4. HALGIS Web Framework

HALGIS, the Harmful ALGae Information System, was developed as a web-based platform to track the formation of harmful algal blooms in Iowa lakes by monitoring changes in the level of microcystin, a toxin produced by cyanobacteria. HALGIS aims to offer a one-stop digital platform for accessing data and information about the impacts of HABs on public health, recreational activities, and wildlife. The landing page also describes the causes of HABs, provides information on data integration, analysis, and visualization, and links to the data sources. The main stakeholders for HALGIS are the public, students, and environmental education professionals. It is therefore crucial that the interface is interactive, user-friendly, and accessible to individuals with limited technical knowledge and expertise. The platform can be accessed on multiple devices, such as PCs, smartphones, and tablets. HALGIS is organized into multiple layers, as depicted in Figure 17. It offers data on lakes and HAB conditions to help users understand potential HABs and environmental health risks. Users can also contribute photos of hazardous lakes through the HALGIS interface.

Author Contributions

Conceptualization, Ö.B., S.Y., I.D.; methodology, Ö.B., S.Y.; software, Ö.B., S.Y., A.D., I.D.; validation, Ö.B., S.Y.; formal analysis, Ö.B., S.Y.; investigation, Ö.B.; resources, Ö.B., S.Y., M.L., I.D.; data curation, Ö.B., S.Y.; writing—original draft preparation, Ö.B., S.Y.; writing—review and editing, Ö.B., S.Y., I.D., M.L.; visualization, Ö.B., S.Y., A.D.; supervision, Ö.B., S.Y., I.D.; project administration, Ö.B., S.Y., M.L., I.D.; funding acquisition, Ö.B., S.Y., M.L., I.D. All authors have read and agreed to the published version of the manuscript.

Funding

The University of Iowa Healthy Lakes Initiative provided the funding for this study.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study are publicly available via the following links: AQuIA Database (The Iowa Department of Natural Resources Water Quality Monitoring and Assessment): https://programs.iowadnr.gov/aquia (last accessed on 2 July 2025). ERA5-land hourly data: https://doi.org/10.24381/cds.e2161bac.

Acknowledgments

Special thanks to Daniel Kendall (Iowa Department of Natural Resources) for providing information on the harmful algae database and to the University of Iowa Healthy Lakes Initiative team, under the direction of Corey Markfort (University of Iowa), for their comments that enriched the discussions. Özlem Baydaroğlu was supported in part by an appointment to the NRC Research Associateship Program at the National Oceanic and Atmospheric Administration-Global Systems Laboratory (NOAA-GSL), administered by the Fellowships Office of the National Academies of Sciences, Engineering, and Medicine. The statements, findings, conclusions, and recommendations are those of the author and do not necessarily reflect the views of NOAA or the U.S. Department of Commerce. Serhan Yeşilköy was supported in part by an appointment to the Agricultural Research Service (ARS) Research Participation Program administered by the Oak Ridge Institute for Science and Education (ORISE) through an interagency agreement between the U.S. Department of Energy (DOE) and the U.S. Department of Agriculture (USDA). ORISE is managed by ORAU under DOE contract number DE-SC0014664. All opinions expressed in this paper are the author’s and do not necessarily reflect the policies and views of USDA, DOE, or ORAU/ORISE.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Schindler, D.W. Recent advances in the understanding and management of eutrophication. Limnol. Oceanogr. 2006, 51, 356–363.
  2. Paerl, H.W.; Gardner, W.S.; Havens, K.E.; Joyner, A.R.; McCarthy, M.J.; Newell, S.E.; Qin, B.; Scott, J.T. Mitigating cyanobacterial harmful algal blooms in aquatic ecosystems impacted by climate change and anthropogenic nutrients. Harmful Algae 2016, 54, 213–222.
  3. Rolim, S.B.A.; Veettil, B.K.; Vieiro, A.P.; Kessler, A.B.; Gonzatti, C. Remote sensing for mapping algal blooms in freshwater lakes: A review. Environ. Sci. Pollut. Res. 2023, 30, 19602–19616.
  4. Gobler, C.J. Climate change and harmful algal blooms: Insights and perspective. Harmful Algae 2020, 91, 101731.
  5. Graham, J.L.; Dubrovsky, N.M.; Eberts, S.M. Cyanobacterial Harmful Algal Blooms and US Geological Survey Science Capabilities; US Department of the Interior: Washington, DC, USA; US Geological Survey: Reston, VA, USA, 2016.
  6. Coffey, R.; Paul, M.J.; Stamp, J.; Hamilton, A.; Johnson, T. A review of water quality responses to air temperature and precipitation changes 2: Nutrients, algal blooms, sediment, pathogens. J. Am. Water Resour. Assoc. 2019, 55, 844–868.
  7. Li, Q.; Li, Q.; Wu, J.; He, K.; Xia, Y.; Liu, J.; Wang, F.; Cheng, Y. Wellhead Stability During Development Process of Hydrate Reservoir in the Northern South China Sea: Sensitivity Analysis. Processes 2025, 13, 1630.
  8. CDC. Harmful Algal Bloom (HAB) Associated Illness. 2021. Available online: https://www.cdc.gov/habs/general.html (accessed on 23 February 2023).
  9. Bakanoğulları, F.; Şaylan, L.; Yeşilköy, S. Effects of phenological stages, growth and meteorological factor on the albedo of different crop cultivars. Ital. J. Agrometeorol. 2022, 1, 23–40.
  10. Yeşilköy, S.; Demir, I. Crop yield prediction based on reanalysis and crop phenology data in the agroclimatic zones. Theor. Appl. Climatol. 2024, 155, 7035–7048.
  11. Ralston, D.K.; Moore, S.K. Modeling harmful algal blooms in a changing climate. Harmful Algae 2020, 91, 101729.
  12. Paerl, H.W.; Huisman, J. Blooms like it hot. Science 2008, 320, 57–58.
  13. Paerl, H.W.; Paul, V.J. Climate change: Links to global expansion of harmful cyanobacteria. Water Res. 2012, 46, 1349–1363.
  14. Lad, A.; Breidenbach, J.D.; Su, R.C.; Murray, J.; Kuang, R.; Mascarenhas, A.; Najjar, J.; Patel, S.; Hegde, P.; Youssef, M.; et al. As we drink and breathe: Adverse health effects of microcystins and other harmful algal bloom toxins in the liver, gut, lungs and beyond. Life 2022, 12, 418.
  15. Li, B.; Zhang, X.; Wu, G.; Qin, B.; Tefsen, B.; Wells, M. Toxins from harmful algal blooms: How copper and iron render chalkophore a predictor of microcystin production. Water Res. 2023, 244, 120490.
  16. Brookfield, A.E.; Hansen, A.T.; Sullivan, P.L.; Czuba, J.A.; Kirk, M.F.; Li, L.; Newcomer, M.E.; Wilkinson, G. Predicting algal blooms: Are we overlooking groundwater? Sci. Total Environ. 2021, 769, 144442.
  17. Ruiz-Villarreal, M.; García-García, L.M.; Cobas, M.; Díaz, P.A.; Reguera, B. Modelling the hydrodynamic conditions associated with Dinophysis blooms in Galicia (NW Spain). Harmful Algae 2016, 53, 40–52.
  18. Díaz, P.A.; Ruiz-Villarreal, M.; Pazos, Y.; Moita, T.; Reguera, B. Climate variability and Dinophysis acuta blooms in an upwelling system. Harmful Algae 2016, 53, 145–159.
  19. Moe, S.J.; Haande, S.; Couture, R.M. Climate change, cyanobacteria blooms and ecological status of lakes: A Bayesian network approach. Ecol. Model. 2016, 337, 330–347.
  20. Baydaroğlu, Ö.; Koçak, K. Spatiotemporal analysis of wind speed via the Bayesian maximum entropy approach. Environ. Earth Sci. 2019, 78, 17.
  21. Baydaroğlu, Ö. Harmful algal bloom prediction using empirical dynamic modeling. Sci. Total Environ. 2025, 959, 178185.
  22. Ye, H.; Beamish, R.J.; Glaser, S.M.; Grant, S.C.; Hsieh, C.H.; Richards, L.J.; Schnute, J.T.; Sugihara, G. Equation-free mechanistic ecosystem forecasting using empirical dynamic modeling. Proc. Natl. Acad. Sci. USA 2015, 112, E1569–E1576.
  23. Anderson, C.R.; Moore, S.K.; Tomlinson, M.C.; Silke, J.; Cusack, C.K. Living with harmful algal blooms in a changing world: Strategies for modeling and mitigating their effects in coastal marine ecosystems. In Coastal and Marine Hazards, Risks, and Disasters; Elsevier: Amsterdam, The Netherlands, 2015; pp. 495–561.
  24. Hallegraeff, G.M. Ocean climate change, phytoplankton community responses, and harmful algal blooms: A formidable predictive challenge 1. J. Phycol. 2010, 46, 220–235.
  25. Treuer, G.; Kirchhoff, C.; Lemos, M.C.; McGrath, F. Challenges of managing harmful algal blooms in US drinking water systems. Nat. Sustain. 2021, 4, 958–964.
  26. Zhang, Y.; Shi, K.; Liu, J.; Deng, J.; Qin, B.; Zhu, G.; Zhou, Y. Meteorological and hydrological conditions driving the formation and disappearance of black blooms, an ecological disaster phenomena of eutrophication and algal blooms. Sci. Total Environ. 2016, 569, 1517–1529.
  27. Maniyar, C.B.; Kumar, A.; Mishra, D.R. Continuous and synoptic assessment of Indian inland waters for harmful algae blooms. Harmful Algae 2022, 111, 102160.
  28. Townhill, B.L.; Tinker, J.; Jones, M.; Pitois, S.; Creach, V.; Simpson, S.D.; Dye, S.; Bear, E.; Pinnegar, J.K. Harmful algal blooms and climate change: Exploring future distribution changes. ICES J. Mar. Sci. 2018, 76, 353.
  29. Trainer, V.L.; Moore, S.K.; Hallegraeff, G.; Kudela, R.M.; Clement, A.; Mardones, J.I.; Cochlan, W.P. Pelagic harmful algal blooms and climate change: Lessons from nature’s experiments with extremes. Harmful Algae 2020, 91, 101591.
  30. Gobler, C.J.; Doherty, O.M.; Hattenrath-Lehmann, T.K.; Griffith, A.W.; Kang, Y.; Litaker, R.W. Ocean warming since 1982 has expanded the niche of toxic algal blooms in the North Atlantic and North Pacific oceans. Proc. Natl. Acad. Sci. USA 2017, 114, 4975–4980.
  31. Hou, X.; Feng, L.; Dai, Y.; Hu, C.; Gibson, L.; Tang, J.; Lee, Z.; Wang, Y.; Cai, X.; Liu, J.; et al. Global mapping reveals increase in lacustrine algal blooms over the past decade. Nat. Geosci. 2022, 15, 130–134.
  32. Ho, J.C.; Michalak, A.M. Exploring temperature and precipitation impacts on harmful algal blooms across continental US lakes. Limnol. Oceanogr. 2020, 65, 992–1009.
  33. Zhou, Y.; Yan, W.; Wei, W. Effect of sea surface temperature and precipitation on annual frequency of harmful algal blooms in the East China Sea over the past decades. Environ. Pollut. 2021, 270, 116224.
  34. Deng, J.; Chen, F.; Liu, X.; Peng, J.; Hu, W. Horizontal migration of algal patches associated with cyanobacterial blooms in an eutrophic shallow lake. Ecol. Eng. 2016, 87, 185–193.
  35. Hamilton, D.S.; Perron, M.M.; Bond, T.C.; Bowie, A.R.; Buchholz, R.R.; Guieu, C.; Ito, A.; Maenhaut, W.; Myriokefalitakis, S.; Olgun, N.; et al. Earth, wind, fire, and pollution: Aerosol nutrient sources and impacts on ocean biogeochemistry. Annu. Rev. Mar. Sci. 2022, 14, 303–330.
  36. Liu, S.T.; Zhang, L. Surface Chaotic Theory and the Growth of Harmful Algal Bloom. In Surface Chaos and Its Applications; Springer Nature: Singapore, 2022; pp. 299–320.
  37. Huo, S.; He, Z.; Ma, C.; Zhang, H.; Xi, B.; Xia, X.; Xu, Y.; Wu, F. Stricter nutrient criteria are required to mitigate the impact of climate change on harmful cyanobacterial blooms. J. Hydrol. 2019, 569, 698–704.
  38. Michalak, A.M.; Anderson, E.J.; Beletsky, D.; Boland, S.; Bosch, N.S.; Bridgeman, T.B.; Chaffin, J.D.; Cho, K.; Confesor, R.; Daloğlu, I.; et al. Record-setting algal bloom in Lake Erie caused by agricultural and meteorological trends consistent with expected future conditions. Proc. Natl. Acad. Sci. USA 2013, 110, 6448–6452.
  39. Islam, S.S.; Yeşilköy, S.; Baydaroğlu, Ö.; Yıldırım, E.; Demir, I. State-level multidimensional agricultural drought susceptibility and risk assessment for agriculturally prominent areas. Int. J. River Basin Manag. 2024, 23, 337–354.
  40. Wilkinson, G.M.; Walter, J.A.; Albright, E.A.; King, R.F.; Moody, E.K.; Ortiz, D.A. An evaluation of statistical models of microcystin detection in lakes applied forward under varying climate conditions. Harmful Algae 2024, 137, 102679.
  41. Greene, S.B.D.; LeFevre, G.H.; Markfort, C.D. Improving the spatial and temporal monitoring of cyanotoxins in Iowa lakes using a multiscale and multi-modal monitoring approach. Sci. Total Environ. 2021, 760, 143327.
  42. Lee, J.; Choi, J.; Fatka, M.; Swanner, E.; Ikuma, K.; Liang, X.; Leung, T.; Howe, A. Improved detection of mcyA genes and their phylogenetic origins in harmful algal blooms. Water Res. 2020, 176, 115730.
  43. Villanueva, P.; Yang, J.; Radmer, L.; Liang, X.; Leung, T.; Ikuma, K.; Swanner, E.D.; Howe, A.; Lee, J. One-Week-Ahead Prediction of Cyanobacterial Harmful Algal Blooms in Iowa Lakes. Environ. Sci. Technol. 2023, 57, 20636–20646.
  44. Kim, J.; Jones, J.R.; Seo, D. Factors affecting harmful algal bloom occurrence in a river with regulated hydrology. J. Hydrol. Reg. Stud. 2021, 33, 100769.
  45. Wang, C.; Wang, Z.; Wang, P.; Zhang, S. Multiple effects of environmental factors on algal growth and nutrient thresholds for harmful algal blooms: Application of response surface methodology. Environ. Model. Assess. 2016, 21, 247–259.
  46. Yu, X.; Yuan, S.; Zhang, T. The effects of toxin-producing phytoplankton and environmental fluctuations on the planktonic blooms. Nonlinear Dyn. 2018, 91, 1653–1668.
  47. Hallegraeff, G.M.; Anderson, D.M.; Belin, C.; Bottein, M.Y.D.; Bresnan, E.; Chinain, M.; Enevoldsen, H.; Iwataki, M.; Karlson, B.; McKenzie, C.H.; et al. Perceived global increase in algal blooms is attributable to intensified monitoring and emerging bloom impacts. Commun. Earth Environ. 2021, 2, 117.
  48. Lee, J.; Jeon, W.; Chang, M.; Han, M.S. Evaluation of rapid cell division in non-uniform cell cycles. J. Basic Microbiol. 2015, 55, 1159–1167.
  49. Guo, J.; Dong, Y.; Lee, J.H. A real time data driven algal bloom risk forecast system for mariculture management. Mar. Pollut. Bull. 2020, 161, 111731.
  50. Wang, X.; Bouzembrak, Y.; Marvin, H.J.; Clarke, D.; Butler, F. Bayesian Networks modeling of diarrhetic shellfish poisoning in Mytilus edulis harvested in Bantry Bay, Ireland. Harmful Algae 2022, 112, 102171.
  51. Janssen, A.B.; Janse, J.H.; Beusen, A.H.; Chang, M.; Harrison, J.A.; Huttunen, I.; Kong, X.; Rost, J.; Teurlincx, S.; Troost, T.A.; et al. How to model algal blooms in any lake on earth. Curr. Opin. Environ. Sustain. 2019, 36, 1–10.
  52. Brunton, S.L.; Proctor, J.L.; Kutz, J.N. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA 2016, 113, 3932–3937.
  53. Zhang, D.; Savage, T.R.; Cho, B.A. Combining model structure identification and hybrid modelling for photo-production process predictive simulation and optimisation. Biotechnol. Bioeng. 2020, 117, 3356–3367.
  54. Corbetta, M. Application of sparse identification of nonlinear dynamics for physics-informed learning. In Proceedings of the 2020 IEEE Aerospace Conference, Big Sky, MT, USA, 7–14 March 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–8.
  55. Joedicke, D.; Parra, D.; Kronberger, G.; Winkler, S.M. Identifying differential equations for the prediction of blood glucose using sparse identification of nonlinear systems. In Proceedings of the International Conference on Computer Aided Systems Theory, Las Palmas de Gran Canaria, Spain, 20–25 February 2022; Springer Nature: Cham, Switzerland, 2022; pp. 181–188.
  56. Rubio-Herrero, J.; Marrero, C.O.; Fan, W.T.L. Modeling atmospheric data and identifying dynamics Temporal data-driven modeling of air pollutants. J. Clean. Prod. 2022, 333, 129863.
  57. Kaiser, E.; Kutz, J.N.; Brunton, S.L. Sparse identification of nonlinear dynamics for model predictive control in the low-data limit. Proc. R. Soc. A 2018, 474, 20180335.
  58. Yeşilköy, S.; Baydaroglu, O.; Singh, N.; Sermet, Y.; Demir, I. A contemporary systematic review of Cyberinfrastructure Systems and Applications for Flood and Drought Data Analytics and Communication. Environ. Res. Commun. 2024, 6, 102003.
  59. Albano, R.; Sole, A.; Adamowski, J. READY: A web-based geographical information system for enhanced flood resilience through raising awareness in citizens. Nat. Hazards Earth Syst. Sci. 2015, 15, 1645–1658.
  60. Iyengar, S.; Massey, D.S. Scientific communication in a post-truth society. Proc. Natl. Acad. Sci. USA 2019, 116, 7656–7661.
  61. Shr, Y.H.; Zhang, W. Do Iowa Residents and Farmers Care about Improving Water Quality and Reducing Harmful Algal Blooms?: Results from Two Household Surveys. Center for Agricultural and Rural Development. 2021. Available online: https://www.card.iastate.edu/products/publications/pdf/21pb32.pdf (accessed on 2 July 2025).
  62. Champion, K.; Zheng, P.; Aravkin, A.Y.; Brunton, S.L.; Kutz, J.N. A unified sparse optimization framework to learn parsimonious physics-informed models from data. IEEE Access 2020, 8, 169259–169271.
  63. Champion, K.P.; Brunton, S.L.; Kutz, J.N. Discovery of nonlinear multiscale systems: Sampling strategies and embeddings. SIAM J. Appl. Dyn. Syst. 2019, 18, 312–333.
  64. França, T.; Braga, A.M.B.; Ayala, H.V.H. Feature engineering to cope with noisy data in sparse identification. Expert Syst. Appl. 2022, 188, 115995.
  65. Lisci, S.; Gitani, E.; Mulas, M.; Tronci, S. Modeling a Biological Reactor using Sparse Identification Method. Chem. Eng. Trans. 2021, 86, 901–906.
  66. Moazeni, F.; Khazaei, J. Data-enabled identification of nonlinear dynamics of water systems using sparse regression technique. IFAC-PapersOnLine 2023, 56, 2389–2394.
  67. Takens, F. Detecting strange attractors in turbulence. In Dynamical Systems and Turbulence, Warwick 1980: Proceedings of a Symposium Held at the University of Warwick 1979/80; Springer: Berlin/Heidelberg, Germany, 2006; pp. 366–381.
  68. Fraser, A.M.; Swinney, H.L. Independent coordinates for strange attractors from mutual information. Phys. Rev. A 1986, 33, 1134.
  69. Grassberger, P.; Schreiber, T.; Schaffrath, C. Nonlinear time sequence analysis. Int. J. Bifurc. Chaos 1991, 1, 521–547.
  70. Kantz, H. A robust method to estimate the maximal Lyapunov exponent of a time series. Phys. Lett. A 1994, 185, 77–87.
  71. de Silva, B.M.; Champion, K.; Quade, M.; Loiseau, J.C.; Kutz, J.N.; Brunton, S.L. Pysindy: A python package for the sparse identification of nonlinear dynamics from data. arXiv 2020.
  72. Kaptanoglu, A.A.; de Silva, B.M.; Fasel, U.; Kaheman, K.; Goldschmidt, A.J.; Callaham, J.L.; Delahunt, C.B.; Nicolaou, Z.G.; Champion, K.; Loiseau, J.-C.; et al. PySINDy: A comprehensive Python package for robust sparse system identification. arXiv 2021.
  73. Abdullah, F.; Alhajeri, M.S.; Christofides, P.D. Modeling and control of nonlinear processes using sparse identification: Using dropout to handle noisy data. Ind. Eng. Chem. Res. 2022, 61, 17976–17992.
  74. Fukami, K.; Murata, T.; Zhang, K.; Fukagata, K. Sparse identification of nonlinear dynamics with low-dimensionalized flow representations. J. Fluid Mech. 2021, 926, A10.
  75. Akima, H. A new method of interpolation and smooth curve fitting based on local procedures. J. ACM 1970, 17, 589–602.
  76. Mohamad, N.B.; Lai, A.C.; Lim, B.H. A case study in the tropical region to evaluate univariate imputation methods for solar irradiance data with different weather types. Sustain. Energy Technol. Assess. 2022, 50, 101764.
  77. Baydaroğlu, Ö.; Demir, I. Temporal and spatial satellite data augmentation for deep learning-based rainfall nowcasting. J. Hydroinform. 2024, 26, 589–607.
  78. Baydaroğlu, Ö.; Muste, M.; Cikmaz, A.B.; Kim, K.; Meselhe, E.; Demir, I. Testing protocols for smoothing datasets of hydraulic variables acquired during unsteady flows. Hydrol. Sci. J. 2024, 69, 1813–1830.
  79. Cortiella, A.; Park, K.C.; Doostan, A. A priori denoising strategies for sparse identification of nonlinear dynamical systems: A comparative study. J. Comput. Inf. Sci. Eng. 2023, 23, 011004.
  80. Henderson, R. A new method of graduation. Trans. Actuar. Soc. Am. 1924, 25, 29–40.
  81. Henderson, R. Further remarks on graduation. Trans. Actuar. Soc. Am. 1925, 26, 52–57.
  82. Whittaker, E.T. On a new method of graduation. Proc. Edinb. Math. Soc. 1922, 41, 63–75.
  83. Whittaker, E.T. VIII.—On the Theory of Graduation. Proc. R. Soc. Edinb. 1925, 44, 77–83.
  84. Yuan, Z. Applications of Sparse Identification of Nonlinear Dynamical Systems. Ph.D. Dissertation, California State Polytechnic University, Pomona, CA, USA, 2023.
  85. Kadah, N.; Özbek, N.S. Model Investigation of Nonlinear Dynamical Systems by Sparse Identification. Avrupa Bilim Teknol. Derg. 2020, 254–263.
Figure 1. RMSE values vs. threshold values for the model construction for pre-processed data.
Figure 2. The rate of change in microcystin and predicted microcystin values for West Okoboji Lake.
Figure 3. The rate of change in microcystin and predicted microcystin values for McIntosh Woods Lake.
Figure 4. The rate of change in microcystin and predicted microcystin values for Blackhawk Lake.
Figure 5. The rate of change in microcystin and predicted microcystin values for Geode Lake.
Figure 6. Harmful Algae Information System (HALGIS) landing page.
Figure 7. The web framework allows users to select different lakes and provides HAB-related information about beaches, fishing waypoints, fish kill estimates, and algae bloom reports, integrated from the Iowa Department of Natural Resources databases.
Figure 8. SINDy-based algae trends are calculated through sliders for the dissolved oxygen and the evaporation rate over the lake. The sliders allow users to change the harmful algae-related factors and recalculate the harmful algae trends over the lake.
Figure 9. Study area, covering the lakes in the State of Iowa that have experienced the most HABs.
Figure 10. Raw microcystin data for Iowa and its summary statistics between the years 2006 and 2022 and the months May and September.
Figure 11. (a) Average mutual information (in Bits) (b) The attractor of microcystin (MC).
Figure 12. The flow of a sequential process before the application of the SINDy algorithm (Adopted from Brunton et al., 2016 [52]).
Figure 13. Normalized microcystin, dissolved oxygen and evaporation data.
Figure 14. Augmented microcystin data points after modified Akima interpolation.
Figure 15. Interpolated microcystin, dissolved oxygen and evaporation data.
Figure 16. Microcystin and smoothed microcystin data.
Figure 17. The overall structure and components of the web-based framework.
Table 1. Model performance criteria for each lake.

| Lake | r | RMSE | MAPE (%) |
|---|---|---|---|
| West Okoboji | 0.99 | 0.0001 | 1.61 |
| McIntosh Woods | 0.69 | 0.0046 | 11.3 |
| Blackhawk | 0.99 | 0.0001 | 1.72 |
| Geode | 0.99 | 0.0047 | 1.95 |
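The three performance criteria in Table 1 can be computed as in the following Python sketch. It assumes the standard definitions (Pearson correlation coefficient for r, root mean square error, and mean absolute percentage error expressed in percent); the function name and the sample values are hypothetical and unrelated to the lake data.

```python
import numpy as np


def performance(observed, predicted):
    """Return (r, RMSE, MAPE in percent) for two equally long series."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    r = np.corrcoef(observed, predicted)[0, 1]
    rmse = np.sqrt(np.mean((predicted - observed) ** 2))
    # MAPE is undefined when an observed value is zero.
    mape = 100.0 * np.mean(np.abs((observed - predicted) / observed))
    return r, rmse, mape


# Hypothetical values, for illustration only.
obs = np.array([1.0, 2.0, 4.0, 5.0])
pred = np.array([1.1, 1.9, 4.2, 4.8])
r, rmse, mape = performance(obs, pred)
```

RMSE carries the units of the modeled variable, so values computed on normalized data are not directly comparable with errors in concentration units, whereas MAPE is scale-free.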
Table 2. West Okoboji Lake data sources.

| Site ID | Site Name |
|---|---|
| 21300001 | Gull Point Beach |
| 21300002 | Pikes Point Beach |
| 21300003 | Triboji Beach |
| 22300009 | West Okoboji Lake |
| 14000189 | Emerson Bay 1 |
| 14000190 | Emerson Bay 2 |
| 14000191 | Emerson Bay 3 |
| 14000193 | Emmerson T-4 |
| 14000410 | West Lake Okoboji-Smiths Bay |
| 14000411 | West Lake Okoboji-Millers Bay |
| 14000412 | West Lake Okoboji-Main Basin North |
| 15300001 | Unnamed tributary to Emerson Bay at beach |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Baydaroğlu, Ö.; Yeşilköy, S.; Dave, A.; Linderman, M.; Demir, I. Modeling Algal Toxin Dynamics and Integrated Web Framework for Lakes. Toxins 2025, 17, 338. https://doi.org/10.3390/toxins17070338
