Predicting Cyanobacterial Harmful Algal Blooms (CyanoHABs) in a Regulated River Using a Revised EFDC Model

Cyanobacterial Harmful Algal Blooms (CyanoHABs) produce toxins and odors in public water bodies and drinking water. Current process-based models predict algal blooms by modeling chlorophyll-a concentrations. However, chlorophyll-a concentrations represent all algae; hence, a method for predicting the proportion of harmful cyanobacteria is required. We proposed a technique to predict harmful cyanobacteria concentrations based on the source code of the Environmental Fluid Dynamics Code from the National Institute of Environmental Research. A graphical user interface was developed to generate information about general water quality and algae, which was subsequently used in the model to predict harmful cyanobacteria concentrations. Predictive modeling was performed for the Hapcheon-Changnyeong Weir–Changnyeong-Haman Weir section of the Nakdong River, South Korea, from May to October 2019, the season in which CyanoHABs predominantly occur. To evaluate the success rate of the proposed model, a detailed five-step classification of harmful cyanobacteria levels was proposed. The modeling results demonstrated high prediction accuracy (62%) for harmful cyanobacteria. For the management of CyanoHABs, harmful cyanobacteria rather than chlorophyll-a should be used as the index, to allow for a direct inference of their cell densities (cells/mL). The proposed method may help improve the existing Harmful Algae Alert System in South Korea.


Introduction
An algal bloom is an abnormal proliferation of photosynthetic algae in a water body that turns the water of a river or lake green. From the perspective of traditional taxonomy, algal blooms can be caused by green algae, diatoms, or cyanobacteria. Algal blooms occur globally. In South Korea, this phenomenon is mainly caused by cyanobacteria. Cyanobacterial Harmful Algal Blooms (CyanoHABs) that occur in freshwater lakes, rivers, and estuaries are caused by cyanobacteria, which are also known as blue-green algae, and occur most often in the summer. Species belonging to the genus Microcystis are representative of CyanoHABs in South Korea's freshwaters [1]. CyanoHABs cause many social, economic, and environmental problems each summer in South Korea. Aquatic species, such as fish, are killed by toxins produced by Microcystis spp. [2][3][4], eventually leading to the degradation of the aquatic ecosystem. CyanoHABs also adversely affect water management in drinking water protection zones. The United States Environmental Protection Agency stated, "CyanoHABs and their toxins can harm people, animals, aquatic ecosystems, the economy, drinking water supplies, property values, and recreational activities, including swimming and commercial and recreational fishing in many countries". This means that the prediction and management of CyanoHABs is imperative not only in South Korea, but also globally. Therefore, it is necessary to predict the occurrence of harmful cyanobacteria, Microcystis spp., Anabaena(=Dolichospermum) spp., (2) To predict CyanoHABs by applying the GUI and using the EFDC-NIER to the section between Hapcheon-Changnyeong Weir and Changnyeong-Haman Weir, where severe occurrences of HABs are observed. (3) To verify the accuracy of the prediction of CyanoHABs and suggest improvements to the Harmful Algae Alert System.

EFDC-NIER Model
The EFDC was developed by the Virginia Institute of Marine Science in the U.S., and the U.S. Environmental Protection Agency (EPA) released the Generalized Vertical Grid version. DSI then released EFDC_DS (version 20100328) and has continuously updated the source code. The water quality parameters of the EFDC cover phytoplankton; the carbon, nitrogen, phosphorus, and silicon cycles; dissolved oxygen; and chemical oxygen demand (COD). Phytoplankton are divided into three algal groups (diatoms, green algae, and cyanobacteria), which makes it convenient to account for seasonal transitions. However, because species with different behavioral characteristics are lumped into the same algal group, the model may be limited in reproducing the rapid occurrence of certain algae and complex species transitions. Therefore, to model multiple algal species, the NIER improved the source code of the model. As shown in the schematic of the reactions among multiple algal species (CHx1-CHxn) in Figure 2, the algae-related state parameters were expanded based on EFDC_DS (version 20100328); red in Figure 2 marks the processes affected by the mechanisms of algal growth and death.

[Table columns: Series | Water Quality Variable | Modeling Variable | Input Data | Equations. First entry: Carbon, TOC.]
* Kdbot: Organic substance decomposition rate in the BOD bottle (day⁻¹).

Conversion of Chlorophyll-a Concentration to Carbon Content
The total carbon content of each algal group (diatoms, green algae, and cyanobacteria) in each tributary that flows into the main stream is required to provide boundary conditions. The best strategy would be to convert the cell counts of algal species, observed over a long period at the inflow of each tributary, into carbon amounts. However, for the Nakdong River, cell count data for the tributaries were not available. Therefore, the chlorophyll-a contents at the tributary endpoints were converted to a carbon content for each harmful algal bloom prediction group using cell count data collected in 2018 at seven-day intervals from a point located 500 m upstream from the Changnyeong-Haman Weir (in the main stream). The sampling method is shown in Appendix A Figure A1. A classification of algal species according to characteristics such as habitat environment, environmental tolerance, and sensitivity is called a 'Codon' [20] (Figure 3). The following steps were used for this conversion (as outlined in Figure 3):

1. The 684 species of algae observed in the Nakdong River were classified into five algal groups and their carbon content per cell was determined (Table 2).

2. The carbon content per liter was determined for each species. For example, for Asterionella spp., the number of observed cells was 240 cells/mL and the carbon content per cell was 125 pg C/cell. Therefore, the carbon content per liter was calculated as follows: 240 cells/mL × 125 pg C/cell = 240 cells/mL × 125 × 10⁻⁹ mg C/cell = 0.00003 mg C/mL = 0.03 mg C/L.

3. Based on the number of cells observed every week, the carbon occupancy rate was calculated for each of the five groups.

4. The carbon occupancy proportions for each of the five groups per month were calculated (Table 3).

5. Under the assumption that the carbon occupancy proportions of each harmful algal bloom prediction group observed in the main stream were identical to those in the tributaries, the chlorophyll-a observed in the tributaries was converted to a carbon content for each harmful algal bloom prediction group using the monthly carbon occupancy proportions and the carbon:chlorophyll-a ratio (β; see below).
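Steps 2 to 4 above amount to a unit conversion followed by a normalization. The following Python sketch illustrates them under stated assumptions: the function names `carbon_mg_per_l` and `occupancy` are illustrative and are not part of EFDC-NIER or the GUI, and the group totals passed to `occupancy` in the usage line are hypothetical numbers, not values from Table 3.

```python
PG_TO_MG = 1e-9    # 1 pg = 1e-9 mg
ML_PER_L = 1000.0  # 1 L = 1000 mL


def carbon_mg_per_l(cells_per_ml, pg_c_per_cell):
    """Carbon concentration (mg C/L) from a cell count and per-cell carbon."""
    return cells_per_ml * pg_c_per_cell * PG_TO_MG * ML_PER_L


def occupancy(carbon_by_group):
    """Fraction of the total carbon held by each group (fractions sum to 1)."""
    total = sum(carbon_by_group.values())
    if total == 0:
        # No algal carbon observed: every group gets a zero share.
        return {group: 0.0 for group in carbon_by_group}
    return {group: c / total for group, c in carbon_by_group.items()}


# Step 2 example from the text: Asterionella spp., 240 cells/mL at 125 pg C/cell.
asterionella = carbon_mg_per_l(240, 125)  # about 0.03 mg C/L

# Hypothetical weekly group totals (mg C/L), for illustration only.
shares = occupancy({"M": 0.5, "H1": 0.25, "other": 0.25})
```

Monthly proportions (step 4) would then follow from averaging or pooling the weekly values; the text does not state which, so that choice is left open here.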
In the Changnyeong-Haman Weir, the average ratio of carbon content to observed chlorophyll-a concentration for each prediction group from 2013 to 2018 was 0.12, and the average ratio for eight locations monitored in the Nakdong River (Sangju, Nakdan, Gumi, Chilgok, Gangjeong-Goryeong, Dalseong, Hapcheon-Changnyeong, and Changnyeong-Haman weirs) was also 0.12. Therefore, 0.12 was used as β in this study. Appendix B Figure A4 compares the observed chlorophyll-a with that calculated using β = 0.12.

Table 2. Classification of algal species observed in the Nakdong River into groups and their respective carbon content per cell.
[Table 2: columns include Group and Species.]

Equation (2) was used to convert the chlorophyll-a concentration observed in the main stream into a total carbon content to be used as a boundary condition for the inflow from the tributary. After the chlorophyll-a concentration was converted to a carbon content using β = 0.12 mg C/µg chl-a, the algal carbon content at each boundary was apportioned according to the monthly proportion of each algal group (based on the carbon content) of the Changnyeong-Haman Weir in Table 3. For example, if the observed chlorophyll-a concentration in July in the Changnyeong-Haman Weir was 40 mg/m³, the carbon amount of Microcystis spp. was 0.12 × 0.281 × 40 = 1.3488 mg C/L.
In Equation (2), X_i is the i-th algal group and β = 0.12.
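The worked example above suggests that the conversion multiplies the observed chlorophyll-a by β and by the group's monthly carbon proportion. The Python sketch below follows that reading; it is an illustration, not the GUI's actual code, and the function name `group_carbon` is an assumption. Note that mg/m³ is numerically equal to µg/L, so β in mg C/µg chl-a yields mg C/L directly.

```python
BETA = 0.12  # mg C per µg chl-a, the value adopted in the text


def group_carbon(chl_a_mg_m3, proportion, beta=BETA):
    """Carbon (mg C/L) assigned to one algal group.

    chl_a_mg_m3: observed chlorophyll-a (mg/m^3, numerically equal to µg/L)
    proportion:  monthly carbon occupancy proportion of the group (0 to 1)
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return beta * proportion * chl_a_mg_m3


# July example from the text: 40 mg/m^3 chl-a, Microcystis proportion 0.281.
microcystis_july = group_carbon(40.0, 0.281)  # about 1.3488 mg C/L
```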

Development of a GUI for Automatic Input Data Generation
In order to consistently and rapidly convert the water quality data collected by the Water Environment Information System [21] into data appropriate for the EFDC-NIER model using the method described in Section 2.2.2, a GUI was developed using Microsoft Excel's Visual Basic tool (Figure 4). The Water Environment Information System is a freely downloadable database that provides all water quality observational data in South Korea. The GUI was composed of the following modules:

1. A storage module for storing the carbon occupancy proportions of each harmful algal bloom prediction group in each period.

2. An input module for receiving the water quality data observed at the tributary endpoints, used for boundary conditions.

3. A conversion module for converting the chlorophyll-a concentrations to carbon contents for each harmful algal bloom prediction group at the observation time (to enable modeling by matching these amounts with the carbon occupancy proportions of each harmful algal bloom prediction group according to the modeling period).

4. A modeling module for 3D numerical modeling of each harmful algal bloom prediction group for all study areas based on the carbon content of each harmful algal bloom prediction group.

Building and Modeling of the EFDC-NIER Model
As shown in Figure 5, there are eight multifunctional weirs in the Nakdong River, which is subdivided into 22 mid-sized basin units. Problems related to CyanoHABs are common in the Nakdong River. In particular, the weirs in the downstream portion of the river have poorer water quality and more frequent CyanoHABs than those in the upstream portion. The Hapcheon-Changnyeong Weir-Changnyeong-Haman Weir section (72 km) was selected for this study. It is an appropriate study area because CyanoHABs occur continuously in summer; moreover, this section serves as a water source. As shown in Figure 5, the EFDC-NIER model was applied to the section between the Hapcheon-Changnyeong Weir and the Samrangjin water level observatory. The factors influencing the water balance, such as tributaries flowing into the main stream, effluent from sewage treatment plants, and water intake stations, were accounted for in the model as boundary conditions. The numbers of horizontal and vertical grid elements were 11,290 and 5, respectively. In addition, 15 transverse grids were used to reflect the water body flow with respect to the operation of hydraulic structures. The "Mask" option was set for the grids where multifunctional weirs were located, and the water arriving from upstream was discharged to the downstream section using the hydraulic structure module (QCTL) (Figure 6). The following data were used in the model: hourly meteorological observation data from the Meteorological Data Open Portal of the Korea Meteorological Administration [22]; daily weir operation data provided by K-water [23]; daily dam discharge data provided by the Water Resource Management Information System [24]; flow observation data provided by the Ministry of Environment; and water quality monitoring network data from the Ministry of Environment [21].
In this study, general water quality variables and algae-related water quality were modeled for the study area using the model input data generation GUI developed in Section 2.2. The Hapcheon-Changnyeong Weir was set as the upstream boundary of the model, whereas the Samrangjin water level observatory section was chosen as the downstream boundary. Within the model domain, the Changnyeong-Haman Weir was included as a hydraulic structure. The reproducibility of the model was evaluated for the period from 1 May to 30 October 2019. Predicted values were compared with observed values collected at a point located 500 m upstream from the Changnyeong-Haman Weir. The hydraulic structures in the Changnyeong-Haman Weir were as follows (from the left to the right bank): a fishway, a fixed weir, a movable weir, another fixed weir, a small hydropower station, and finally another fishway. The discharge priorities for the operation of the hydraulic structures are as follows: fishway ≥ small hydropower station > movable weir > fixed weir. During the modeling period, the movable weirs were operated in turning type, and most flows were discharged through the fishway and small hydropower station. Thus, the flow of the water body was concentrated near the right bank. The model parameters were calibrated using the mean absolute error (MAE) and the root mean square error (RMSE). The MAE is the average of the absolute errors between the observed and simulated values and can be used to compare the residuals between models. The RMSE is the square root of the mean squared error between the observed and simulated values and serves as an indicator of model precision. The values in Table 4 were used as the major parameters for harmful algal bloom and water quality analysis. The RMSE and MAE in the model simulation period are summarized in Table 5.
The equations for MAE and RMSE are as follows:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|P_i - O_i\right| \quad (3)$$

and

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(P_i - O_i\right)^2} \quad (4)$$

where P_i is the simulated value at time i, O_i is the observed value at time i, and N is the number of observed values in the whole period.

Figure 7 compares the simulation results with the observed data collected 500 m upstream from the Changnyeong-Haman Weir. The water quality items (e.g., nutrients and organic matter) that are closely related to the water flow characteristics (water level and water temperature), as well as the CyanoHABs and their behavior patterns in the weir section, were considered to be reasonably reproduced (Table 5). The harmful cyanobacteria were simulated with the calibrated EFDC-NIER model from May to October, when CyanoHABs occurred, by converting the carbon content simulated for Group M (Microcystis spp.) and Group H1 (Anabaena(=Dolichospermum) spp. and Aphanizomenon spp.) to cell numbers using the carbon content per unit cell (pg C/cell) in Table 2. Figure 8 compares the observed and simulated harmful cyanobacteria data. The simulation predicted that harmful cyanobacteria occurred from May to October in the Changnyeong-Haman Weir. In particular, the number of harmful cyanobacteria cells was predicted to exceed approximately 100,000 cells/mL.
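The two error metrics, and the conversion of simulated carbon back to cell numbers, can be sketched in Python as follows. This is an illustration using the definitions of P_i, O_i, and N given above; the function names are assumptions and do not come from the EFDC-NIER source code. The conversion function inverts the unit arithmetic used for the boundary conditions (mg C/L to pg C/mL, then division by pg C/cell).

```python
import math


def _check(predicted, observed):
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")


def mae(predicted, observed):
    """Mean absolute error between simulated (P_i) and observed (O_i) values."""
    _check(predicted, observed)
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def rmse(predicted, observed):
    """Root mean square error between simulated and observed values."""
    _check(predicted, observed)
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


def cells_per_ml(carbon_mg_per_l, pg_c_per_cell):
    """Cell density (cells/mL) from simulated carbon and per-cell carbon.

    1 mg C/L = 1e9 pg C per 1000 mL = 1e6 pg C/mL.
    """
    return carbon_mg_per_l * 1e6 / pg_c_per_cell


# Round trip of the Asterionella example: 0.03 mg C/L at 125 pg C/cell.
cells = cells_per_ml(0.03, 125)  # about 240 cells/mL
```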

Applicability of the Model for Short-Term CyanoHABs Predictions
The Harmful Algae Alert System of South Korea defines four alarm-triggering levels for harmful cyanobacteria based on their amount as follows: normal: <1000 cells/mL; concern: 1000-10,000 cells/mL; alert: 10,000-1,000,000 cells/mL; and bloom: >1,000,000 cells/mL [1]. However, the alert stage should be refined and subdivided because the range 10,000-1,000,000 cells/mL is too wide.
The World Health Organization and the Australian National Health and Medical Research Council have adopted water quality threshold values such as the "safe water quality for recreational activities" (harmful cyanobacteria < 20,000 cells/mL), "water quality that can cause adverse effects when humans or animals come in contact with the water" (20,000 < harmful cyanobacteria < 100,000 cells/mL), and "unsafe water quality when ingested by humans" (harmful cyanobacteria > 100,000 cells/mL) [25].
The South Korean Ministry of Environment operates a water quality monitoring network and a Harmful Algae Alert System based on observation points located 500 m upstream of eight multifunctional weirs. In the case of the Changnyeong-Haman Weir, which is a water source section, model predictions need to be provided that can determine if the water quality is unsafe for human ingestion based on harmful cyanobacteria counts (level 3: 10,000-100,000 cells/mL and level 4: 100,000-1,000,000 cells/mL). Table 6 lists the carbon occupancy proportions of harmful cyanobacteria measured at seven-day intervals from 2014 to 2019 at the water quality monitoring network point of the Changnyeong-Haman Weir. Over the five years, the harmful cyanobacteria amounts were less than 100,000 cells/mL for 25.2% of the time; for harmful cyanobacteria > 100,000 cells/mL, the occurrence proportion was 6.2%. Most of the harmful cyanobacteria observations (93.8%) were below 100,000 cells/mL. Given that the Harmful Algae Alert System in the existing section operates in the range 10,000-1,000,000 cells/mL, alert levels can often be triggered in the range 10,000-100,000 cells/mL. Once an alert is triggered, appropriate action should be taken. If two distinct alert levels (3 and 4) are used, step-by-step actions should be taken only if the concentration is 100,000 cells/mL or higher. In other words, a waste of administrative resources can be prevented by operating the Harmful Algae Alert System proactively and efficiently, that is, by using detailed CyanoHABs predictions and acting according to five distinct alert levels. In order to evaluate the applicability of the five distinct alert levels presented in this paper, the number of harmful cyanobacteria cells was predicted and the results were analyzed from May to October 2019. In South Korea, water quality and algae are measured every week. 
Therefore, the modified Harmful Algae Alert System proposed in this study was evaluated with a total of 26 data points. Of these, 17 fell within the alert level (10,000-1,000,000 cells/mL) of the existing Harmful Algae Alert System, and 5 exceeded 100,000 cells/mL (Table 7). Comparing the results of the predictive modeling with the values reported in Table 8, the prediction was successful in 62% of cases. When the predictive power was evaluated for the newly defined alert levels (3 and 4), obtained by dividing the alert level at 100,000 cells/mL, the prediction accuracy was 65% (Table 7). The prediction accuracy is slightly lower than before. However, considering that the weather forecast accuracy of the Korea Meteorological Administration, which has the most significant effect on the CyanoHABs prediction, is approximately 70%, a success rate of 62% is considered the maximum possible goal. Consequently, a further increase in the success rate for the harmful cyanobacteria can be achieved by considering the modeling results for the entire year using the following equation:

Success rate (%) = (A + B + C + D + E)/N × 100 (5)

where A, B, C, D, and E are the numbers of cases in which the observed and predicted values fall in the same section, and N is the total number of cases, as shown in Table 8.

Table 8. Method for calculating the success rate of the harmful cyanobacteria prediction.
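Equation (5) counts the cases in which the observed and predicted cell densities fall in the same level. The Python sketch below illustrates this for both the five-level and the existing four-level classification. The thresholds are taken from the levels described in the text; assigning a value that lies exactly on a threshold to the higher level is an assumption, as are the function names.

```python
from bisect import bisect_right

# Level boundaries in cells/mL.
FIVE_LEVEL = (1_000, 10_000, 100_000, 1_000_000)  # modified system
FOUR_LEVEL = (1_000, 10_000, 1_000_000)           # existing system


def alert_level(cells_per_ml, thresholds=FIVE_LEVEL):
    """Return the 1-based level; boundary values go to the higher level."""
    return bisect_right(thresholds, cells_per_ml) + 1


def success_rate(observed, predicted, thresholds=FIVE_LEVEL):
    """Percentage of cases where observation and prediction share a level."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    hits = sum(
        alert_level(o, thresholds) == alert_level(p, thresholds)
        for o, p in zip(observed, predicted)
    )
    return hits / len(observed) * 100


# Hypothetical pairs: the second one (20,000 observed, 900,000 predicted)
# counts as a hit under four levels but as a miss under five.
obs, pred = [500, 20_000], [800, 900_000]
five = success_rate(obs, pred)              # 50.0
four = success_rate(obs, pred, FOUR_LEVEL)  # 100.0
```

The same pair of series scoring higher under the coarser classification shows why a success rate depends on how wide the levels are.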

Discussion and Conclusions
A method for predicting the amount of harmful cyanobacteria using the EFDC-NIER model was proposed. In addition, policy utilization measures, such as the operation of the Harmful Algae Alert System, were examined by predicting the harmful cyanobacteria using the proposed method and the developed numerical tool. The major findings of this study are as follows:

1. Harmful cyanobacteria occur in large amounts from June to August in the Changnyeong-Haman Weir. According to the National Institute of Environmental Research, the dominant alga observed from June to August 2019 was Microcystis spp. Therefore, CyanoHABs are mainly caused by Microcystis spp. in South Korea. This phenomenon is demonstrated by the simulation results for harmful cyanobacteria in this study. The simulation focused on harmful cyanobacteria because they are the main cause of algal blooms. However, if the cell numbers of other relevant algal groups, such as diatoms and green algae, need to be predicted in the future, these algal groups can also be added. In addition, the carbon content simulated for each group can be converted to cell numbers using the carbon content per unit cell (pg C/cell) in Table 2. Therefore, detailed algal simulations for multiple species will be possible using the modeling method proposed in this study.

2. The developed numerical tool was applied to the Hapcheon-Changnyeong Weir-Changnyeong-Haman Weir section of the Nakdong River, which experiences severe growth of HABs. The modified Harmful Algae Alert System subdivides the alert level (10,000-1,000,000 cells/mL) of the existing Harmful Algae Alert System at 100,000 cells/mL, the threshold above which water quality is unsafe when ingested by humans [25]. The total success rate for the prediction of harmful cyanobacteria was 62% (65% at levels 3 and 4). Because the predictive power of the modified Harmful Algae Alert System presented in this study is slightly reduced, its applicability might be judged inferior. However, this is not the case. For example, if the predicted number of harmful cyanobacteria cells is 900,000 cells/mL and the measured number is 20,000 cells/mL, the prediction is evaluated as successful under the existing Harmful Algae Alert System. This overestimates the predictive power because of the wide range of the alert level in the existing system. Because of such cases, predictive power should not be evaluated solely by the predictive success rate. In such cases, administrative power and budget are used inefficiently through over-reaction, even though the situation is not dangerous to humans. In other words, responding to algal blooms by subdividing the sections according to severity, as in the modified Harmful Algae Alert System, can prevent unnecessary consumption of administrative power and budget and manage algal blooms more efficiently. In terms of policy, more advanced algal bloom management will be possible if the administrative power and budget wasted through over-reaction are used elsewhere, for example for training experts, recruiting more researchers, increasing the algae measurement budget, and upgrading measurement and prediction equipment. A limitation of this study is that the sample size is small. Recently, however, algae have been measured continuously. In addition, the frequency of algae measurement has increased owing to advanced techniques for estimating the concentration of harmful cyanobacteria using phycocyanin and to advances in remote sensing, such as hyperspectral imaging. For this reason, many high-quality data are being obtained. If more high-quality data are obtained and more algal groups are considered in the future, cell number simulation will be possible for multiple algal species as well as for harmful cyanobacteria, and the success rate is expected to improve.
Methods to ensure that the model performs short-term CyanoHABs predictions were suggested. To establish the boundary conditions with respect to the inflow from tributaries into the studied river section, chlorophyll-a concentrations observed in the main stream were converted to monthly carbon proportions for each HAB prediction group. However, in order to apply scientifically based boundary conditions, it is necessary to measure the cell number of each algal species at the inflows from the tributaries and convert them to a carbon content for each HAB prediction group. Furthermore, the initial conditions applied to the model domain are also important. If the initial harmful cyanobacteria concentrations are evaluated from the phycocyanin concentration observed using hyper-spectroscopy (a remote sensing technique) rather than by linearly interpolating point-to-point data, the uncertainties can be reduced.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
The National Institute of Environmental Research of South Korea monitors chlorophyll-a (mg/m³) and cyanobacteria (cells/mL) concentrations every seven days. Chlorophyll-a (mg/m³) is monitored separately for the surface layer and for the mixing average of the water body, and cyanobacteria (cells/mL) are monitored at the surface layer. The detailed monitoring method used in this study is shown in Appendix A Figure A1. The sample of the surface layer was obtained by mixing the samples taken at 0.5 m water depth at three points: left, center, and right. The three points selected were the deepest point (center point) and two points separated from it by 1/4 of the total river width (along the cross section) to the left and right (left and right points). To obtain a representative sample that reflects the mixing average of the water body, samples from the same three points were taken. Additionally, samples were taken at 1/3 and 2/3 of the water depth from the surface. Then, the sample representative of the mixing average of the entire water body was obtained by mixing all collected samples.
Appendix A Figure A2 shows the chlorophyll-a (mg/m³) and cyanobacteria (cells/mL) concentrations observed from 6 January 2014 to 19 October 2020. In South Korea, water quality management was performed based on the sum of the number of cells monitored for the genera Anabaena spp., Aphanizomenon spp., Microcystis spp., and Oscillatoria spp., which release toxic compounds. Drinking water is contaminated by the mass growth of Microcystis spp. from May to October in the Changnyeong-Haman Weir. Previously, the evaluation of CyanoHABs was performed using chlorophyll-a. However, chlorophyll-a is observed even when harmful cyanobacteria are not present, which is a significant issue for using it as an indicator. Appendix A Table A1 shows the monthly observations of harmful cyanobacteria and of chlorophyll-a for the mixing average of the water body. Appendix A Figure A3 shows the correlation between harmful cyanobacteria and chlorophyll-a for the mixing average of the water body observed from 6 January 2014 to 19 October 2020. The correlation analysis shows that chlorophyll-a is not suitable for use as a representative index of CyanoHABs. Therefore, we attempted to develop a technique to manage CyanoHABs through the prediction of harmful cyanobacteria such as Microcystis spp.

Appendix B
Appendix B Table A2 shows an example of calculating the amount of carbon for each algal group and converting the calculated carbon amount to chlorophyll-a. β is the carbon/chlorophyll-a concentration ratio and its value is 0.12. For example, for Codon P on 7 January 2019, 0.2836 divided by 0.12 is 2.3633. The total chlorophyll-a value of all nine Codons from M to C is 14.94 mg/m³. Appendix B Figure A4 compares the observed and calculated chlorophyll-a values.
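The back-conversion described here divides each Codon's carbon amount by β and sums the results. A short Python sketch, assuming that reading, is shown below; the function names are illustrative, and the carbon amounts of the other Codons are not reproduced because they are given only in Table A2.

```python
BETA = 0.12  # carbon/chlorophyll-a concentration ratio used in the text


def chl_a_from_carbon(carbon, beta=BETA):
    """Chlorophyll-a (mg/m^3) implied by a carbon amount and the ratio beta."""
    return carbon / beta


def total_chl_a(carbon_by_codon, beta=BETA):
    """Sum of the chlorophyll-a values implied by each Codon's carbon amount."""
    return sum(chl_a_from_carbon(c, beta) for c in carbon_by_codon.values())


# Codon P on 7 January 2019, as quoted in the text.
codon_p = chl_a_from_carbon(0.2836)  # about 2.3633
```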