Benthic Diatom Communities in Korean Estuaries: Species Appearances in Relation to Environmental Variables

In the Korean Peninsula’s southern estuaries, the distributive characteristics of epilithic diatoms and the important environmental factors predicting species occurrence were examined. The collection of diatoms and measurements of water quality and land-use were performed every May between 2009 and 2016, with no influence from the Asian monsoon and snow. Throughout the study, 564 diatoms were classified with first and second dominant species of Nitzschia inconspicua and N. perminuta. Based on diatom appearance and standing crops, the 512 sampling stations were divided into four groups by cluster analysis, and two regions, namely the West and East Sea. Geographically, G1, G2, G3, and G4 were located in the East Sea, Southeast Sea, West Sea, and Southwest Sea, respectively. Canonical correspondence analysis (CCA) results indicated that environmental factors, such as turbidity, electric conductivity (EC), and total phosphorus (TP), significantly influenced the distribution of epilithic diatoms. A random forest model showed that major environmental factors influencing the diatom species appearance included EC, salinity, turbidity, and total nitrogen. This study demonstrated that the spatial distribution of epilithic diatoms in the southern estuaries of the Korean Peninsula was determined by several factors, including a geographically higher tidal current-driven turbidity increase and higher industrial or anthropogenic nutrient-loading.


Introduction
An estuary is a transition zone where seawater and freshwater meet, and it is also a dynamic ecosystem with a diverse composition of living organisms due to significant physicochemical changes, such as water temperature, salinity, and nutrients [1][2][3]. Despite the fact that estuaries have various functions, including providing habitat, purifying water quality, and producing marine products, they are being destroyed by development projects concentrated in these regions [4]. As a result, nutrients, organic matters, and other pollutants are accumulating in coastal waters [5].
The Korean Peninsula is surrounded by seas on three sides, and due to an increase in coastline development relative to the narrow land area, approximately 460 estuaries have formed. Among these, only 235 estuaries are able to maintain estuarine circulation, whereas the estuarine circulation of the remaining estuaries is cut off by estuarine dams or sea dikes, which limit the formation of a brackish water zone that constitutes an estuarine ecosystem [6]. Moreover, development projects concentrated in estuarine regions destroy the various functions of estuarine wetlands, including providing habitat, purifying water quality, and producing marine products [4]. Meanwhile, estuaries in the Korean Peninsula exhibit different characteristics depending on their geographical location. In particular,
* Open streams have no dams to harvest the flowing water or to keep out the sea.
** Closed streams have one or several dams to harvest the flowing water or to keep out the sea.

Water Quality
Water temperature, dissolved oxygen (DO), potential hydrogen (pH), electrical conductivity (EC), salinity, and turbidity of the survey sites were measured on-site using a multi water quality checker (Horiba U-50, HORIBA Ltd., Kyoto, Japan). The water samples needed for laboratory analysis were collected in a 2-L sterilized water collection bottle from each site and transported to the laboratory while being stored on ice. Biological oxygen demand (BOD) was calculated as the difference between the DO concentration measured on-site and the DO concentration of the water sample collected on-site in a 300 mL BOD bottle and subsequently incubated in an incubator for five days at 20 °C under dark conditions, in accordance with the Winkler azide method. Concentrations of dissolved inorganic matters were measured with the ascorbic acid method using a spectrophotometer (Optizen POP, Neogen Inc., Sejong, Korea), after the cadmium reduction method was used to determine total nitrogen (TN) and persulfate decomposition for total phosphorus (TP) [37]. For chlorophyll (Chl-a) concentration and ash-free dry mass (AFDM), three or more rocks, ≥10 cm in size with a flat surface, were collected from the survey sites. A soft brush was used to clean 25 cm² of the upper section of the collected rocks, which were placed in plastic sample bottles using water from the site. The collected samples were kept under cold and dark conditions and transported to the laboratory for measurement by standard methods [37].

Epilithic Diatom Community
Epilithic diatom samples were scrubbed with a soft brush from the upper part of the substrate collected at the survey sites, fixed in Lugol's solution, and transported to the laboratory, where they were cleaned using the permanganate method [38] and mounted as permanent samples using an encapsulant. The samples were observed using an optical microscope (Nikon E600, Nikon, Tokyo, Japan) under 400× to 1000× magnification. For the analysis of epilithic diatom communities, the relative abundance of each species was determined by counting at least 500 diatom frustules in arbitrarily chosen microscopic fields of view. Species were identified using the methods of Krammer and Lange-Bertalot [39,40] and classified according to Simonsen's classification system [41]. To determine the characteristics of epilithic diatom communities, the dominant species, dominance index [42], diversity index [43], richness index [44], and evenness index [45] of each survey site were derived.
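The formulas behind the cited community indices are not written out in this section. As a minimal sketch, the following assumes commonly used forms (McNaughton dominance, Shannon diversity, Margalef richness, and Pielou evenness); these choices, and the function name community_indices, are assumptions made for illustration and may differ from the formulations in [42][43][44][45].

```python
import math

def community_indices(counts):
    """Compute four community indices for one site.

    counts: dict mapping species name -> number of frustules counted.
    The specific formulas are assumed, not taken from the source.
    """
    counts = {sp: n for sp, n in counts.items() if n > 0}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no frustules counted")
    s = len(counts)
    props = [n / total for n in counts.values()]

    # Assumed McNaughton dominance: share of the two most abundant species.
    dominance = sum(sorted(counts.values(), reverse=True)[:2]) / total
    # Assumed Shannon diversity, natural logarithm.
    diversity = -sum(p * math.log(p) for p in props)
    # Assumed Margalef richness: (S - 1) / ln(N).
    richness = (s - 1) / math.log(total) if total > 1 else 0.0
    # Assumed Pielou evenness: H' / ln(S).
    evenness = diversity / math.log(s) if s > 1 else 0.0

    return {
        "dominance": dominance,
        "diversity": diversity,
        "richness": richness,
        "evenness": evenness,
    }

# Hypothetical counts for one site (500 frustules in total).
site = {"Nitzschia inconspicua": 300, "Nitzschia perminuta": 150, "Cymbella silesiaca": 50}
print(community_indices(site))
```

With these definitions, a site dominated by one or two species has high dominance and low evenness, which is the contrast drawn between the groups later in the text.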

Cluster Analysis
To characterize the distribution of epilithic diatom communities present in estuarine zones within Korea, a cluster analysis was performed based on the number of individuals and the number of species present in the epilithic diatom communities. In the cluster analysis, sites were grouped according to the similarity of their community composition using Ward's linkage method and Euclidean distance.
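The analysis itself was run in PC-Ord; the following is only a small, dependency-free sketch of agglomerative clustering with Ward's criterion on Euclidean distances. The function name ward_clusters and the choice to stop at a fixed number of clusters k are assumptions for illustration (the study instead cut the dendrogram at a 25% level).

```python
def ward_clusters(points, k):
    """Merge points into k clusters using Ward's minimum-variance criterion.

    points: list of equal-length numeric tuples (one per site).
    Returns a list of clusters, each a sorted list of point indices.
    """
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                members_a, centroid_a = clusters[a]
                members_b, centroid_b = clusters[b]
                na, nb = len(members_a), len(members_b)
                dist2 = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        members_a, centroid_a = clusters[a]
        members_b, centroid_b = clusters[b]
        na, nb = len(members_a), len(members_b)
        merged = [(na * x + nb * y) / (na + nb) for x, y in zip(centroid_a, centroid_b)]
        clusters[a] = (members_a + members_b, merged)
        del clusters[b]
    return [sorted(members) for members, _ in clusters]

# Hypothetical sites described by two (already transformed) variables.
sites = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(ward_clusters(sites, 2))
```

This naive version recomputes all pairwise costs at every step, which is adequate for illustration but slow for hundreds of sites; dedicated statistical software uses more efficient update rules.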

Indicator Species Analysis
Indicator species analysis (ISA) was performed to determine the indicator species and indicator value (IndVal) of each group categorized by cluster analysis. The ISA is a non-hierarchical statistical analysis method that uses the relative abundance and frequency of each species at each survey site to calculate the IndVal, upon which, the indicator species is determined. In the IndVal method [46], IndVal appears within a range of 0-100 with higher values representing higher indicative power [47]. In this study, when the IndVal was ≥25, the species with IndVal that was five times higher than that of another group (good species) was selected as the indicator species for that group [48]. The Monte Carlo test was performed to determine the significance of the indicator species analysis.
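As a hedged illustration of the calculation described above, the sketch below assumes the widely used formulation in which IndVal is the product of a specificity term (share of the species' mean abundance found in a group) and a fidelity term (fraction of the group's sites where the species occurs), scaled to 0-100. The function names and the reading of the selection rule (IndVal ≥ 25 and at least five times the value in every other group) are assumptions, not details confirmed by the source.

```python
def indval(abundance, groups):
    """IndVal (0-100) of one species for each group.

    abundance: per-site abundance of the species.
    groups: per-site group label, same length as abundance.
    """
    labels = sorted(set(groups))
    mean_abundance = {}
    frequency = {}
    for g in labels:
        values = [a for a, gg in zip(abundance, groups) if gg == g]
        mean_abundance[g] = sum(values) / len(values)
        frequency[g] = sum(1 for v in values if v > 0) / len(values)
    total_mean = sum(mean_abundance.values())
    if total_mean == 0:
        return {g: 0.0 for g in labels}
    return {
        g: 100 * (mean_abundance[g] / total_mean) * frequency[g]
        for g in labels
    }

def is_good_indicator(values, group, threshold=25.0, ratio=5.0):
    """Assumed selection rule: IndVal >= threshold and >= ratio x every other group."""
    v = values[group]
    others = [x for g, x in values.items() if g != group]
    return v >= threshold and all(v >= ratio * x for x in others)

# Hypothetical species found only in G1 sites.
vals = indval([10, 10, 0, 0], ["G1", "G1", "G2", "G2"])
print(vals, is_good_indicator(vals, "G1"))
```

The significance of each value would still need a permutation (Monte Carlo) test, which is not shown here.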

Random Forest
The random forest model was used to predict the presence of epilithic diatom species. The random forest model is a non-parametric method used to estimate and assess the relationship between latent predictor variables and response variables [49], using various combinations of environmental variables. The importance of environmental variables used in this model was determined using minimum description length (MDL), which is used for comparing the relative importance of environmental factors [50] with MDL values converted to a range between 0 and 100. To assess the predictive power of this model, the accuracy rate and area under curve (AUC) were derived. Accuracy was measured based on the dichotomous method of presence and non-presence, with a range of 0-1. The AUC could predict the reliability of the resulting values of the model and it typically has a range of 0.5-1, but values below this range may also appear.
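The two evaluation measures can be illustrated without any modelling library. The sketch below computes accuracy from presence/absence predictions and AUC as the probability that a randomly chosen presence site receives a higher predicted score than a randomly chosen absence site (ties count half). The function names and the example scores are hypothetical; this is not the CORElearn implementation.

```python
def accuracy(observed, predicted):
    """Fraction of sites where predicted presence (1) / absence (0) matches."""
    return sum(o == p for o, p in zip(observed, predicted)) / len(observed)

def auc(observed, scores):
    """Rank-based AUC: P(score of a presence site > score of an absence site)."""
    pos = [s for o, s in zip(observed, scores) if o == 1]
    neg = [s for o, s in zip(observed, scores) if o == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical observations and predicted probabilities of presence.
observed = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]
print(accuracy(observed, predicted), auc(observed, scores))
```

A model whose scores rank absence sites above presence sites yields an AUC below 0.5, which is why values under the typical 0.5-1 range can occur.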

Biological Integrity Assessment
Various epilithic diatom indices were derived and used for the biological integrity assessment of each survey site. The trophic diatom index (TDI) was used to assess the nutritional status of rivers by calculating the IndVal according to relative density, pollution sensitivity, and level of diatoms present in each site. This approach, together with the diatom assemblage index of organic water pollution (DAIpo), has been utilized for a relatively long time [23]. As nutrient concentrations increase, the value of this index increases, whereas when pollution levels increase, the value decreases. The present study used a modified version of TDI suited for Korea [51]. Meanwhile, DAIpo was used to calculate IndVal according to organic indicators (sensitive species and tolerant species) of species present in the survey sites [21], where an increase in the level of pollution results in a decrease in the value of this index. The Achnanthes/Achnanthes + Navicula (AAN) index is the proportion (%) of Achnanthes among the total sum of Achnanthes (saproxenous species representing clear water areas) and Navicula (eutrophic species that prefer highly polluted water areas) [52]. An increase in the level of pollution results in a decrease in the value of this index. The motile diatoms (MD) index is a value that indicates the proportion (%) of motile diatoms in heavily polluted water areas among the total amount of diatoms present in each survey site [53]. An increase in the concentration of organic matters results in an increase in the value of this index. The classification of motile diatoms in this study followed the method by Hill et al. [54]. Lastly, the number of Gomphonema species (NGO) index is a value that indicates the percentage (%) of species that are members of the Gomphonema genus among all diatom species present in each survey site [52]. An increase in the concentration of organic matters results in a decrease in the value of this index.
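The three proportion-based indices (AAN, MD, and NGO) follow directly from the definitions above and can be sketched as below. The function names are hypothetical, and the set of motile genera passed to md_index is an illustrative assumption only; the study used the classification of Hill et al. [54], which is not reproduced here. TDI and DAIpo depend on species-specific sensitivity values and are not attempted.

```python
def genus(species_name):
    """Genus is taken as the first word of the binomial name."""
    return species_name.split()[0]

def aan_index(counts):
    """% Achnanthes among Achnanthes + Navicula individuals; None if both absent."""
    ach = sum(n for sp, n in counts.items() if genus(sp) == "Achnanthes")
    nav = sum(n for sp, n in counts.items() if genus(sp) == "Navicula")
    if ach + nav == 0:
        return None
    return 100 * ach / (ach + nav)

def md_index(counts, motile_genera):
    """% of individuals belonging to genera treated as motile."""
    total = sum(counts.values())
    motile = sum(n for sp, n in counts.items() if genus(sp) in motile_genera)
    return 100 * motile / total

def ngo_index(counts):
    """% of species present at the site that belong to Gomphonema."""
    species = [sp for sp, n in counts.items() if n > 0]
    gomphonema = [sp for sp in species if genus(sp) == "Gomphonema"]
    return 100 * len(gomphonema) / len(species)

# Hypothetical counts for one site; the motile set is an assumption.
site = {
    "Achnanthes minutissima": 30,
    "Navicula gregaria": 10,
    "Nitzschia inconspicua": 50,
    "Gomphonema lagenula": 10,
}
motile = {"Navicula", "Nitzschia", "Surirella"}
print(aan_index(site), md_index(site, motile), ngo_index(site))
```

Note that AAN and MD are computed on individual counts, whereas NGO is computed on the number of species, following the wording of the definitions.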
To compare the differences in community characteristics (total number of species and biomass), community indices, and environmental factors between the groups, Tukey's post hoc test was performed for Analysis of Variance (ANOVA) nonparametric multiple comparisons. Moreover, Pearson's correlation analysis was employed to analyze the relationships between indicator species and environmental factors for each group.
For cluster, indicator species, and Canonical Correspondence Analyses (CCA), the PC-Ord program (ver. 4.25. MjM Software, Gleneden Beach, OR, USA) was used. The random forest model was run by the CORElearn package in R statistic program (http://cran.r-project.org). ANOVA was performed using SPSS software (ver. 21. IBM, New York, NY, USA).
For epilithic diatom community data, species that appeared in fewer than 5% of all survey sites (25 sites) were identified as rare taxa and excluded from the statistical analysis. Moreover, to reduce variations among individuals, data were transformed to the natural log (ln) after adding a value of 1 to each variable, ln(x + 1), so that the logarithm remains defined for values of 0.
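The two preprocessing steps can be sketched as follows. The function names and the example data are hypothetical; the 5% threshold and the ln(x + 1) transformation come from the description above.

```python
import math

def retained_taxa(site_counts, min_fraction=0.05):
    """Return the species occurring in at least min_fraction of the sites.

    site_counts: list of dicts (one per site), species name -> count.
    """
    n_sites = len(site_counts)
    occurrences = {}
    for site in site_counts:
        for species, count in site.items():
            if count > 0:
                occurrences[species] = occurrences.get(species, 0) + 1
    return {sp for sp, k in occurrences.items() if k / n_sites >= min_fraction}

def log_transform(value):
    """ln(x + 1): maps 0 to 0 and stays defined for zero counts."""
    return math.log1p(value)

# Hypothetical data: 40 sites, one species everywhere, one at a single site.
sites = [{"common sp.": 12} for _ in range(40)]
sites[0]["rare sp."] = 3
kept = retained_taxa(sites)
transformed = [
    {sp: log_transform(n) for sp, n in site.items() if sp in kept}
    for site in sites
]
print(kept, transformed[0])
```

With 512 sites, the same threshold corresponds to roughly 25 sites, matching the figure quoted in the text.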

Diatom Distribution and Community Characteristics
A total of 566 taxa of epilithic diatoms were present in 512 survey sites. The dominant species was Nitzschia inconspicua (13.9%), while the subdominant species was Nitzschia perminuta (7.4%). The total number of epilithic diatom species present was highest in the southern sea with 457 species, followed in order by the eastern sea (354 species) and the western sea (346 species). Nitzschia inconspicua was the dominant species in all of the sea areas, with the highest percentage of 18.9% found in the southern sea area (Table 2).
In the cluster analysis, using the current level of presence (cell density) of a total of 139 species after excluding epilithic diatoms that were present in <5% (<25 sites) of all survey sites, the species were grouped into four groups at a 25% level: Group 1 (G1: 91 sites), Group 2 (G2: 234 sites), Group 3 (G3: 89 sites), and Group 4 (G4: 98 sites) (Figure 2). G1, comprising mostly of sites located in eastern sea region, and G2, comprising sites in the southeastern coast of the Korean Peninsula, showed relatively high percentages of open estuaries with 96% and 71%, respectively. On the other hand, G3 and G4, mostly encompassing sites located in the western sea region, displayed high percentages of closed estuaries with 61% and 93%, respectively.
With respect to the dominant epilithic diatom species in each group, Achnanthes minutissima (8.5%) and A. alteragracillima (8.4%) were the dominant species in G1, but they showed very low percentages in other groups. On the other hand, Nitzschia inconspicua, which was the dominant species for all survey sites, displayed high relative frequencies in G2 (7.9%), G3 (19.4%), and G4 (11.5%), appearing as the dominant or subdominant species (Figure 3).

With respect to biological indices in each group, G3 had the highest number of species present, while G1 and G2 exhibited higher community indices and dominance indices than G3 and G4. Moreover, G3 and G4 had significantly higher diversity indices, G3 had the highest richness index, and G4 had the highest evenness index (p < 0.01; Figure 4).
In the indicator species analysis of each group targeting 139 taxa of epilithic diatoms with frequency ≥ 5% in all survey sites, there were 16 taxa with IndVal ≥ 25%, which were considered good indicators with values more than five times higher than other groups: G1 (5 species), G3 (8 species), and G4 (3 species). Meanwhile, no indicator species appeared in G2. The species with the highest IndVal (46%) was Stephanodiscus invisitatus, which was the indicator species of G3, but all other indicator species presented IndVal ≤ 40% (Table 3).

Physicochemical Water Quality
In the comparison of the physicochemical environment between the four groups (communities) categorized by distribution of epilithic diatoms using ANOVA, the factors that demonstrated significant differences were as follows (Figure 5): water temperature was lowest (18.5 °C) in G1 with the highest percentage of ESE; pH (8.0) was highest in G3; and salinity (5.7 ppt) and EC (9396.8 µS/cm) were highest in G2 with the highest percentage of ESE, SSE, and open estuaries (p < 0.01). Moreover, turbidity, TN, and TP were significantly higher in G3 and G4 than in G1 and G2, while DO was highest in G1 and BOD was highest in G4 (p < 0.01). Consequently, the physicochemical environment of estuaries in the Korean Peninsula was classified similarly to the distribution characteristics of epilithic diatoms, except for water temperature, salinity, and EC, which are affected by geographical influence. Furthermore, the results showed that water quality in the East-Southeastern Sea was better than that of the West-Southwestern Sea.

Relationship between Diatom Distribution and the Environment
In the correlation analysis between the indicator species and environmental factors of each group, most indicator species correlated strongly with turbidity regardless of group, with negative correlations for the indicator species of G1 and positive correlations for those of G3 and G4 (Table 4). In G1, Cymbella silesiaca, the species with the highest IndVal in this group, showed a negative correlation with water temperature, pH, salinity, EC, turbidity, BOD, TN, TP, and AFDM, and a positive correlation with DO. In G3, Stephanodiscus invisitatus, the species with the highest IndVal in this group, exhibited a positive correlation with pH, turbidity, TN, TP, and AFDM. In G4, the indicator species Bacillaria paradoxa correlated positively with turbidity, BOD, and AFDM. These results demonstrate opposite correlations for the indicator species of G1 and the indicator species of G3 and G4.

The environmental factors that impact the presence of species in epilithic diatom communities were assessed using a random forest model. The results varied depending on the species, with accuracy ranging from 0.82 to 0.98 and the AUC displaying a range of 0.94 to 1.00 (Table 5). Among 139 taxa, the species that exhibited the highest predictive power was Navicula atomus var. permitis (accuracy: 0.98, AUC: 1.00), whereas the species with the lowest accuracy were Gomphonema lagenula, Navicula gregaria, and Nitzschia dissipata, and the species that displayed the lowest AUC value was Achnanthes convergens. To assess the contribution of environmental factors that impact the presence of epilithic diatoms, a sensitivity analysis was performed using MDL of random forest (RF). The most important factors that impacted the presence of epilithic diatoms were found to be EC (41 species, 29.5%) and salinity (36 species, 25.9%). These factors explained the presence of 55% of the species in epilithic diatom communities in estuaries.
The results also showed that turbidity (13 species, 9.4%) and TN (13 species, 9.4%) were relatively important for the appearance of species in epilithic diatom communities (Table 4). Other important factors included EC for Cymbella silesiaca (indicator species with the highest IndVal in G1), pH for Stephanodiscus invisitatus (indicator species in G3), and BOD for Bacillaria paradoxa (indicator species in G4) (Table 4). Meanwhile, the main factor that determined the presence of epilithic diatoms in each group was water temperature for G1, salinity for G2 and G4, and TP for G3 (Figure 6).
Table 4. Relationship between indicator species and environmental variables in each diatom group in Korean estuaries between 2009 and 2016. Groups were divided by cluster analysis with diatom abundance and appearance. CODE (name of diatom) cited from Table 3.

Biological Integrity and Water Quality Assessment
When comparing the four groups by ANOVA using epilithic diatom indices for the assessment of biological integrity of estuaries in the Korean Peninsula, all items demonstrated significant differences (Figure 7). TDI, DAIpo, AAN, and NGO, which decrease in value when the level of pollution increases, tended to be highest in G1 and lowest in G3. MD, which increases in value when the level of pollution increases, was lowest in G1 and highest in G3. Thus, the biological integrity was assessed to be high in G1, which exhibited low nutrient levels, consistent with its water quality characteristics, while the integrity in G3 was determined to be low, with a relatively high level of nutrients.

Discussion
The present study analyzed the relationships between environmental factors and the distribution of diatom communities in estuaries in the Korean Peninsula. During the study period, 566 taxa of epilithic diatoms were identified in 512 sites. This number of taxa is higher than the 327 taxa found in 161 sites between 2012 and 2014 in a previous study [31], which appears to be the result of the present study having more survey sites than previous studies. With respect to dominant species, the Nitzschia genus appeared in over 38% of estuaries in the Korean Peninsula. Moreover, the dominant species in the group that was directly affected by the ocean in the Ebro estuary of Spain included Nitzschia frustulum and Nitzschia inconspicua [55]. Meanwhile, the major epilithic diatoms present in estuaries in the Korean Peninsula (Nitzschia, Navicula, Achnanthes, and Fragilaria genera) have also been widely recorded in estuaries throughout the world, including Hungary, Sweden [56], the United States [57], Argentina, Uruguay [58], and the United Kingdom [59].
Based on the similarities of epilithic diatom communities in estuaries in the Korean Peninsula, a cluster analysis was performed to categorize four groups. G1 comprised the eastern sea region that included mostly the eastern area of Han River and parts of Nakdong River. G2 spread widely across the eastern and southern sea regions that included the eastern part of Han River, Nakdong River, and Seomjin River. According to Rho and Lee [4], land cover types in estuaries located in the eastern regions of Han River and Nakdong River showed a high percentage for forest (60%) and a low percentage for farmland (≤20%). G1, located in eastern sea region, had low nutrient concentrations and significantly low levels of salinity and turbidity. This is consistent with previous studies where forest land use was reported to have a negative correlation with nutrient levels [60,61]. Furthermore, because there is a continuation of the ridgeline with rapidly descending altitude that extends from the Tae Baek Mountains to the East Sea, small rivers flowing into the East Sea have steep downward slopes and short extended waterways [62]. Tidal variations in this coast are minimal, and, thus, the tidal effects are usually very weak [63]. Accordingly, estuaries located on the eastern coast display lower salinity than those located on the western or southern coasts.
G3, located in the western sea region, included mostly the eastern section of Han River and parts of Geum River. G4, located in the western and southern sea regions, comprised mostly the Geum and Yeongsan Rivers. The Geum River and Yeongsan River regions have proportions of farmland of ≥35%, which are much higher than the average of 28.8% for Korea. This is because of an increase in farming area from large-scale land reclamation by drainage and land clearing projects concentrated in these regions [4].
G3 and G4, located in western and southwestern estuaries, demonstrated significantly high levels of nutrients and turbidity. Previous studies reported that a high percentage of farmland is highly correlated with total suspended solids (TSS) [64,65]. Moreover, farmland as a land-use type shows strong positive correlations with nutrients such as TN and TP [60], and as turbidity increases, DO concentration is known to decrease [66,67]. Furthermore, salinity and EC appeared at significantly high levels in G2, which appears to be the result of seawater flowing deeply into the estuaries due to the high percentage of open estuaries in the southern sea region.
Achnanthes alteragracillima, the subdominant species in G1, was almost absent from the other groups; it is a saproxenous species that is present in relatively clean water. Nitzschia inconspicua, the dominant species in G2, G3, and G4, also appeared as the dominant species in 30 estuaries in 2012 [30].
Cymbella silesiaca, Fragilaria rumpens var. fragilarioides, and Reimeria sinuata, the indicator species of G1, correlated positively with DO and negatively with salinity, EC, turbidity, BOD, TN, and TP (Table 4). These species are typical saproxenous species [68], which are known to grow in oligotrophic and mesotrophic waters [69]. Stephanodiscus invisitatus, Cyclotella atomus, Stephanodiscus hantzschii, Navicula veneta, and Navicula accomoda, the indicator species in G3, displayed positive correlations with salinity, turbidity, TN, and TP and a negative correlation with DO (Table 4). These species grow mostly by floating in freshwater with high EC or are present in brackish water zones in rivers. They are known to be tolerant to organic pollutants [70,71]. The indicator species in G4, Bacillaria paradoxa, Navicula capitata, and Nitzschia calida, showed positive correlations with turbidity and BOD (Table 4). These species grow in brackish water zones or eutrophic waters with high EC, and they are known to have a broad range of tolerance to pollutants.
Environmental factors that influence species presence or emergence in epilithic diatom communities were predicted using a random forest model (Table 5). The results showed that the most important factors were EC (41 species, 29.5%) and salinity (36 species, 25.9%). These factors explained the existence of 55% of the species in epilithic diatom communities in estuaries. Turbidity (13 species, 9.4%) and TN (13 species, 9.4%) also appeared as being relatively important for species presence in epilithic diatom communities. EC is known to be an important factor for determining the composition of epilithic diatom community [72,73].
In a preliminary study on diatom distribution associated with salinity, the salinity gradient was mostly the result of changes in the concentration of a single salt, sodium chloride (NaCl) [74][75][76]. As a result, it was difficult to differentiate the effect of a specific ion from the overall effect of osmotic pressure. Experimental results showed that medium osmotic pressure was an important factor in limiting the growth of freshwater diatoms [77] and affected nutrient intake [78]. Therefore, total ion strength and EC can explain the significant changes between diatom communities [72]. Amino acids, ammonium, and nitrate, which are types of nitrogen compounds, are nutrients preferred by marine benthic diatoms [79]; however, because the present study only measured TN, it is necessary to conduct further experiments to measure a greater variety of nitrogen compounds. Therefore, if more detailed water quality (various nitrogen compounds and ions) measurements were conducted in the future, then more definitive evidence for the distribution of epilithic diatoms might be available.
As a result of biological integrity assessment using epilithic diatom indices, the TDI could be divided into grades from A for very good to E for very poor, in accordance with the grading system given in the Biomonitoring Survey and Assessment Manual [51]. The TDI grade for estuaries was C (average) in G1, which had the highest IndVal of 42, while all other groups had a grade of D (poor). Compared to river water quality standards in Korea, G1 was assessed as having very good DO, somewhat good TP, and average BOD, with a TDI grade slightly lower than the water quality grade. Moreover, DAIpo, AAN, and NGO also showed low values below 50, relative to 100. This is because most of the epilithic diatom indices used in the present study were developed for freshwater systems, and thus, species that are ecologically important in estuaries were not included in the indices [80]. Moreover, some species that are included in the biological indices may not respond in the same way to environmental situations in estuaries and freshwater. For example, N. frustulum, which appeared as a major species with a high percentage of 6.7%, is very abundant in freshwater due to the high level of organic matters [81], high concentrations of inorganic nutrients [82], and high EC [82,83]. However, it may be present at high levels in estuaries without indicating any reduction in health [80]. Navicula perminuta, another major species, showed peak growth rates at different ammonium and nitrate concentrations depending on salt concentration [84]. Therefore, to establish a biological impact assessment system using epilithic diatoms in water areas with severe fluctuation, such as estuaries, it is necessary to adjust the index values according to the nutritional status of the estuary instead of using freshwater indices as they are. Additional studies are also required to further understand the role of epilithic diatom communities.

Conclusions
Using samples collected every May between 2009 and 2016, we assessed the feasibility of applying previously studied diatom indices to assess the biological integrity of estuaries, while also predicting the importance of environmental factors and species appearance of epilithic diatoms in the southern part of the Korean Peninsula.
1. In total, 564 taxa of diatoms were found and the dominant species were identified as Nitzschia inconspicua and N. perminuta.
2. According to the species appearance and abundances of diatoms, estuaries in the Korean Peninsula were geographically categorized into four groups. G1 showed low water temperature and high DO levels, while nutrient levels were significantly low. G3 and G4 showed significantly high turbidity and nutrient levels.
3. A random forest model indicated that the major factors predicting diatom appearance in estuaries are electric conductivity, salinity, turbidity, and total nitrogen.
4. The biological integrity of estuaries in the Korean Peninsula assessed using "stream diatom indices" is very low throughout the sampling sites; however, a de novo diatom index should be developed to assess the different or specialized ecosystem of the estuary.