Diversity and Distribution of Endemic Stream Insects on a Nationwide Scale, South Korea: Conservation Perspectives

This study aimed to identify the biogeographical and environmental factors affecting the biodiversity of endemic aquatic insects (i.e., Ephemeroptera, Plecoptera, and Trichoptera; EPT). We used data collected from 714 sampling sites combined with 39 environmental factors. Ten endemic EPT species were identified. The sampling sites were grouped into four clusters based on the similarities of the endemic EPT assemblages using a hierarchical cluster analysis. Non-metric multidimensional scaling (NMS) revealed the differences among the four clusters, with the first three axes being strongly related to annual average, August, and January temperatures, as well as altitude. The random forest model identified geographical and meteorological factors as the main factors influencing species distribution, even though the contributions of environmental factors were species-specific. Species with lower occurrence frequencies (i.e., Pteronarcys macra, Kamimuria coreana, and Psilotreta locumtenens) mainly occurred in the least-disturbed habitats. P. macra represents a priority conservation species because it has a limited distribution range and is highly vulnerable to anthropogenic disturbance. Our results support the need for an environmental management policy to regulate deforestation and conserve biodiversity, including endemic species.


Introduction
Endemic species generally inhabit a geographically limited area and are highly vulnerable to small environmental changes [1]. Therefore, we must strive to conserve and manage such endemic species to prevent their loss and extinction. However, knowledge remains limited on the biodiversity of endemic species in freshwater ecosystems, along with the key factors that influence their distribution. Severe pressures from various anthropogenic disturbances are continuously causing changes to original habitats and threatening the continued existence of endemic species [1,2].
Freshwater habitats cover just 0.8% of the Earth's surface and comprise around 0.01% of the world's water [3,4]. However, freshwater habitats contain a disproportionately high biodiversity [5]. Freshwater organisms tend to occupy smaller geographic ranges [6], which has led to the evolution of high levels of endemism. Severe anthropogenic disturbances have further shrunk their potential distribution ranges, as well as individual home ranges. Dudgeon et al. [7] listed five major factors that threaten freshwater biodiversity: overexploitation (e.g., [8]); water pollution (e.g., [9]); flow modification (e.g., [10]); the destruction and degradation of habitat; and the invasion of exotic species (e.g., [11]). The impacts of these factors, both separately and in combination, have caused the populations and distributions of freshwater organisms to decline and have led to the homogenization of freshwater ecosystems worldwide. These factors threaten the survival of endemic species, which are highly adapted to the environmental conditions within their specific ranges [12].
Research on the endemism, diversity, distribution, and conservation of endemic species remains limited, especially with respect to endemic macroinvertebrates at nationwide scales. Most literature on endemic species focuses on single species (e.g., Lednia tumana [13]). Other studies have been conducted within specific areas or under specific environmental conditions, such as with respect to the effects of glacial melting [13], deforestation [14], and the introduction of invasive plants [15].
Among ecologically similar species, rare species tend to have smaller populations that are more vulnerable to environmental events, resulting in a greater extinction risk than for common species [16]. Therefore, the degree of rarity of endemic species should be the first criterion used to determine conservation priorities and conservation responsibilities [17]. Among aquatic insects, species in Ephemeroptera, Plecoptera, and Trichoptera (EPT) respond sensitively to physical environmental factors at broad scales, in addition to water quality factors at small scales. The diversity of EPT represents one of the most important biological indices for evaluating the status of freshwater habitats [18]. However, few studies have focused on the rarity and distribution of endemic EPT species [19], despite their high vulnerability to future environmental change [1,20–23] and their ecological impacts and roles in ecosystem functioning [24]. To our knowledge, even though several studies have investigated the endemism of EPT species, they mainly focus on how climate change or temperature-related factors influence the vulnerability of EPT species [13,19].
In this study, we evaluated two hypotheses: first, endemic species diversity is higher in the least disturbed areas; and second, large-scale factors, such as meteorological and geographical factors (e.g., temperature, latitude, and longitude), are important determinants of the distribution and occurrence of endemic species, even though habitat preference and environmental tolerance differ among endemic species. Our results are expected to provide baseline information on which to build suggestions to advance the conservation of endemic species.

Ecological Data
Data on endemic EPT species were obtained from the database of the National Aquatic Ecological Monitoring Program (NAEMAP), supported by the Ministry of Environment and National Institute of Environmental Research, Republic of Korea [25]. From the NAEMAP database, we selected a dataset consisting of samples collected at 717 sampling sites of 371 streams in five major river basins at a national scale in South Korea from 2008 to 2013. South Korea has a temperate climate with an annual mean temperature of 12.8 °C and an annual mean precipitation of 1589 mm [26]. The Han River basin (basin area: 41,957 km²) runs through Seoul, the capital of South Korea, and is the largest basin in South Korea (covering approximately one third of the country). The basin is located in the northern part of South Korea, and flows westward into the Yellow Sea. The Nakdong River basin (31,785 km²) is located in the southeastern part of South Korea, and flows into the South Sea. The Geum River basin (17,537 km²) is in the mid-western part of South Korea and the Yeongsan River basin (12,833 km²) is in the southwestern part of the country. Both basins flow into the Yellow Sea. The Seomjin River basin (4914 km²) is located between the Nakdong and Yeongsan river basins, and flows into the South Sea. The Yeongsan and Seomjin rivers were treated as one river system (hereafter, the Yeongseom River) for management purposes, because their catchments are in close proximity and share similar geographical conditions [23].
At each sampling site, three replicate samples were collected in riffle zones within a 200 m reach of the river using a Surber net (30 × 30 cm, 1 mm mesh size), based on the guidelines of the National Survey for Stream Ecosystem Health in South Korea [27]. All specimens collected during sampling were sorted and identified to the lowest possible taxonomic level (mostly species level) under a microscope based on the published literature [28–32]. We used 39 environmental factors from seven categories (meteorology, geography, land use, hydrology, flow type, substrate, and physicochemical water quality) to evaluate how these factors affect endemic EPT species. The factors of hydrology, flow type, substrate, and physicochemical water quality were obtained from the NAEMAP database [33]. Meteorological factors were obtained from the Korea Meteorological Administration. Geographical and land use factors were extracted from digital maps in ArcGIS (Ver. 10.1) [34] based on the coordinate information of each sampling site [25].

Data Analysis
We conducted data analyses in four steps to characterize the rarity, abundance, and distribution of endemic EPT species. First, we evaluated the rarity of endemic EPT species by considering the number of sampling sites where endemic EPT species were found and the number of endemic EPT species recorded at each individual sampling site at nationwide and basin scales. Second, a hierarchical cluster analysis (CA) based on the endemic EPT assemblage was conducted to define the pattern of similarity of the endemic EPT assemblage structure. CA was calculated using Ward's linkage method [35] with the Bray-Curtis distance measure. Then, a Kruskal-Wallis test (KW) was conducted to compare the endemic EPT assemblage structure among the clusters defined in the CA. When KW indicated significant differences (p < 0.05), multiple comparison tests were performed to compare the differences between clusters. Multi-response permutation procedures (MRPP) were conducted to check for significant differences among the clusters. CA and MRPP were conducted with the functions hclust and mrpp, respectively; mrpp is provided by the package vegan [36] and hclust by the stats package in R [37]. KW and multiple comparison tests were carried out with the function kruskal in the package agricolae [38] in R [37].
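The Bray-Curtis distance used in the cluster analysis can be illustrated with a minimal Python sketch. This is not the authors' R code: the function name and the abundance values are hypothetical, and the text does not state how pairs of sites without any endemic species (for which the index is undefined) were handled, so the sketch simply raises an error in that case.

```python
from typing import Sequence


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two abundance vectors.

    Returns 0 for identical assemblages and 1 when no species is shared.
    """
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        # Both sites are empty: the index is 0/0 and therefore undefined.
        raise ValueError("Bray-Curtis is undefined when both sites are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical abundances of three species at three sites (not study data).
site_a = [6, 7, 4]
site_b = [10, 0, 6]
site_c = [0, 0, 0]

d_ab = bray_curtis(site_a, site_b)  # (4 + 7 + 2) / (17 + 16) = 13 / 33
d_ac = bray_curtis(site_a, site_c)  # 17 / 17 = 1, nothing shared
```

A matrix of such pairwise values is what a hierarchical clustering routine with Ward's linkage would take as input.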
Third, non-metric multidimensional scaling (NMS) was conducted using the Bray-Curtis distance as the dissimilarity measure for the endemic EPT assemblage to determine the distribution pattern of endemic EPT species. NMS is a non-linear method that is suitable for zero-inflated ecological data sets with unknown data distribution [39]. We used the metaMDS function in the package vegan [36] in R to determine the best solution (i.e., the lowest stress value). The relationship between endemic EPT species and environmental factors was determined using the function envfit in the package vegan [39,40].
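The stress value used to compare NMS solutions can be sketched as follows. This is a minimal illustration of Kruskal's stress formula 1, assuming that the ordination distances and the monotonically fitted disparities are already available; the function name and the numbers are hypothetical, and the sketch is not a reimplementation of metaMDS. Software differs in whether stress is reported as a proportion or as a percentage; the sketch returns a proportion.

```python
import math
from typing import Sequence


def kruskal_stress1(distances: Sequence[float], disparities: Sequence[float]) -> float:
    """Kruskal's stress formula 1: sqrt(sum((d - d_hat)^2) / sum(d^2)).

    distances:   pairwise distances between sites in the ordination space
    disparities: fitted values from the monotone regression of those
                 distances on the original dissimilarities
    """
    if len(distances) != len(disparities):
        raise ValueError("inputs must have the same length")
    denominator = sum(d * d for d in distances)
    if denominator == 0:
        raise ValueError("all ordination distances are zero")
    numerator = sum((d - h) ** 2 for d, h in zip(distances, disparities))
    return math.sqrt(numerator / denominator)


# Hypothetical values for three site pairs (not study data).
ordination_distances = [1.0, 2.0, 2.0]
fitted_disparities = [1.0, 2.0, 1.0]

stress = kruskal_stress1(ordination_distances, fitted_disparities)  # sqrt(1 / 9)
```

Lower values indicate that the ordination distances preserve the rank order of the original dissimilarities more faithfully.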
Finally, the occurrence probability of endemic species was predicted using a random forest (RF) model, using the 39 environmental factors as independent variables [41]. RF is an ensemble machine learning technique that is based on a combination of a large set of decision trees. Each tree is trained on a random sample (i.e., the calibration data set) drawn from the training dataset, using a random subset of the variables [42]. After building the RF model, the relative importance of environmental factors influencing the occurrence of endemic species was calculated from the mean decrease in accuracy, and was then rescaled from 0 to 100. In this study, RF was computed using the package randomForest [43] in R with the three training parameters (ntree, mtry, and nodesize) at their default settings. The abundance of each species and some of the environmental factors with high variation (i.e., distance from source, water width, average depth, and average velocity) were log transformed before the data analysis.
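The two preprocessing and postprocessing steps mentioned above can be sketched in Python. Both functions rest on assumptions that the text does not confirm: the rescaling is shown as min-max scaling (dividing by the maximum would be another plausible reading), and the log transformation is shown as log(x + 1) so that zero abundances remain defined. All names and values are hypothetical.

```python
import math
from typing import Dict


def rescale_importance(raw: Dict[str, float]) -> Dict[str, float]:
    """Rescale raw importance scores to 0-100 (assumed: min-max scaling)."""
    lowest, highest = min(raw.values()), max(raw.values())
    if highest == lowest:
        raise ValueError("all scores are equal; rescaling is undefined")
    span = highest - lowest
    return {name: 100.0 * (value - lowest) / span for name, value in raw.items()}


def log_transform(value: float) -> float:
    """Assumed variant log(x + 1), which keeps zero counts defined."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return math.log1p(value)


# Hypothetical mean-decrease-in-accuracy scores (not study results).
raw_scores = {"altitude": 6.0, "longitude": 4.0, "forest_ratio": 2.0}
scaled = rescale_importance(raw_scores)
# The top-ranked factor receives 100 and the lowest-ranked factor 0.
```

Under this assumption the most influential factor for each species always scores 100, which matches how the importance values are reported per species.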

Characteristics of Endemic EPT Species
Ten endemic EPT species were identified in the dataset, including five, three, and two species belonging to Ephemeroptera, Plecoptera, and Trichoptera, respectively (Table 1). Out of these 10 species, Rhoenanthus coreanus (Ephemeroptera) was the most widely distributed, with an occurrence frequency of 42.2% (301 sites), and was found within all four basins (Figure 1 and Table 1). However, most species had very low occurrence frequencies. Pteronarcys macra (Plecoptera) was only recorded at eight sites (occurrence frequency of 1.1%) in only one basin (the Han River basin) in the northern part of South Korea (Figure 1, Table 1). Only four species (i.e., R. coreanus, Drunella aculea, Potamanthus yooni, and Kamimuria coreana) had an occurrence frequency of more than 10%. Out of these four species, only two (R. coreanus and K. coreana) were observed in all four basins, even though the occurrence frequency of K. coreana was extremely low in the Geum and Yeongseom basins (3.1% in both cases). In addition, out of the six species with an occurrence frequency of less than 10.0%, only one (Neoperla coreensis) was recorded in all basins. Overall, the majority of sites had either no endemic species (i.e., 277 sites, 38.6%) or just one endemic species (i.e., 219 sites, 30.5%) (Figure 2).
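The two rarity measures reported here (occurrence frequency per species and the number of endemic species per site) can be sketched in Python. The site records and species labels are hypothetical placeholders, not study data, and the function names are illustrative.

```python
from collections import Counter
from typing import List, Set


def occurrence_frequency(sites: List[Set[str]], species: str) -> float:
    """Percentage of sampling sites at which the species was recorded."""
    if not sites:
        raise ValueError("at least one site is required")
    hits = sum(1 for recorded in sites if species in recorded)
    return 100.0 * hits / len(sites)


def richness_histogram(sites: List[Set[str]]) -> Counter:
    """Number of sites for each count of endemic species per site."""
    return Counter(len(recorded) for recorded in sites)


# Hypothetical records: the set of endemic species found at each of four sites.
toy_sites = [{"sp_a"}, set(), {"sp_a", "sp_b"}, set()]

freq_a = occurrence_frequency(toy_sites, "sp_a")  # found at 2 of 4 sites
freq_b = occurrence_frequency(toy_sites, "sp_b")  # found at 1 of 4 sites
histogram = richness_histogram(toy_sites)         # sites with 0, 1, 2 species
```

The histogram corresponds to the kind of summary shown in Figure 2, where each bar counts the sites holding a given number of endemic species.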

Relationships between Endemic EPT Species and Environmental Factors
The sampling sites were classified into four clusters (1 to 4) based on the similarities of the endemic EPT assemblage composition (Figure 3). MRPP identified significant differences in the endemic EPT assemblage among the four clusters (A = 0.09, p < 0.05). Cluster 1 was characterized by D. aculea, P. macra, K. coreana, N. coreensis, Psilotreta locumtenens, and Ceraclea armata (KW, p < 0.05, Table 2). Cluster 2 was characterized by R. coreanus. Cluster 3 was characterized by P. yooni. Cluster 4 was characterized by Procloeon halla.
Differences in the composition of the endemic EPT assemblage were also reflected in NMS (Figure 4). In the NMS ordination, the first three axes (stress value = 7.32) were most strongly related to annual average temperature (R² = 0.598, p < 0.05), followed by August temperature (R² = 0.561, p < 0.05), January temperature (R² = 0.503, p < 0.05), latitude (R² = 0.445, p < 0.05), and altitude (R² = 0.373, p < 0.05) (Table 3). The sampling sites with high values for latitude, water velocity, and altitude were located on the left part of axis 1, whereas sites with poor water quality (i.e., high conductivity, biological oxygen demand (BOD), total phosphate (TP), and chlorophyll a (Chl-a)) were on the right part of axis 1 (Figure 4). Species such as P. halla were located on the right part of axis 1. These species mainly inhabit the lowland areas of the southern parts of South Korea, where the sampling areas were mostly characterized by high conductivity, BOD, TP, and Chl-a. Species such as P. locumtenens, K. coreana, P. macra, and D. aculea were located on the left part of axis 1. These species were mainly found in the least disturbed freshwater habitats (e.g., with good water quality, high water velocity, and high altitude), such as mountain areas.

Influential Environmental Factors on the Occurrences of Endemic Species
The distribution of each endemic species was predicted well in the range from 0.990 to 0.998 based on environmental factors through the RF learning process (Figure 5). Overall, geographical and meteorological factors represented the main factors influencing species distribution, even though different species responded differently to various environmental factors. For example, altitude was the most important factor (100) for the occurrence of R. coreanus, which had the highest occurrence frequency in the dataset, followed by pebbles (90.4), longitude (83.1), distance from source (77.4), and annual average temperature (64.7). Meanwhile, D. aculea, which had the second highest occurrence frequency, was characterized by January temperature (100) and annual average temperature (95.9). P. macra, which rarely occurred, was influenced by the ratio of agricultural area in land use (100), followed by the run ratio in flow type (86.8), forest ratio (85.7), phosphate-phosphorus (PO4-P) (84.0), and average depth (52.0).

Priority Species for Conservation
Endemic species are characterized by their limited spatial distribution and poor dispersal, which makes them rare. The rarity of endemic species is a major causal factor of their extinction in both ecological and geological timeframes [44]. Therefore, endemic species are likely to be the first candidates for extinction. In our study, out of the 10 endemic EPT species, the occurrence frequency of six species was less than 10.0%. The frequencies of C. armata (1.5%) and P. macra (1.1%) were lower than 2.0%. Species that are widely distributed across several basins might have lower conservation priority than species that are only found within a certain basin and with a limited distribution [12]. In this sense, both species should have high conservation priority. In particular, P. macra should be considered a priority species of conservation concern, because it was only found in a limited northeastern area of the Han River, which has a low annual temperature. P. macra is a cold-adapted species, making it vulnerable to global warming and anthropogenic disturbances, which threaten its existence in its original habitat [23].

Influential Environmental Factors Related to the Existence of Endemic Species: Conservation Implications
We should be aware of increasing threats to the existence of endemic species. Not surprisingly, endemism is an important criterion for determining national conservation responsibilities [17]. However, assessments of the future extinction of species are hindered by the lack of knowledge of the life history, traits, niche, and resource requirements of endemic species, and by the lack of suitable criteria, making it difficult to determine their rarity [45,46]. The current study showed that the factors influencing the differentiation of endemic EPT species included geography (i.e., latitude and altitude), meteorology (especially temperature-related factors), hydrology (water velocity), and water chemistry (BOD, TP, Chl-a, and conductivity).
Our results support the environmental filtering hypothesis, which proposes that environmental drivers act as hierarchical filters constraining assemblages [47]. Large-scale factors, such as geographical and meteorological factors, strongly influence the local habitat and biological diversity of streams and rivers [48,49]. These factors are closely connected with the diversity and composition of endemic EPT species [6]. Strayer [6] suggested that some range boundaries are set by the climate and other ecological patterns, even though the current distribution patterns of endemic EPT species are based on the history of drainage connections [50]. Freshwater fauna are particularly sensitive and vulnerable to the impacts of climate change, because they usually have limited dispersal abilities [51]. Consequently, current and future changes caused by human activities are likely to have a stronger influence on their diversity than past anthropogenic alterations [52]. Global warming especially threatens the potential future distribution and persistence of the sensitive habitats used by endemic Plecoptera [13,53], even though this phenomenon was not directly considered in our study. Several studies have also shown that land use is an influential factor in determining species distribution, even though, in our study, this influence was relatively weak based on NMS. Flather et al. [54] found that forests and rangelands are important factors for differentiating endangered species "hot spots."
When suitable habitats and their climatic refuges are degraded, the existence of endemic taxa becomes threatened (e.g., [55]). Since the 20th century, intensive alterations of hydrology have been closely related to the massive construction of reservoirs and dams, as well as the disappearance of streams and springs, due to human actions [56]. Dams and impoundments alter the hydrological regimes of rivers, resulting in reduced water flow (water level fluctuations), the accumulation of silt, and the loss of habitat diversity. These alterations change life cycles, block species dispersal, and reduce the abundance of freshwater fauna [57–59]. If these threats coincide with zones of high endemic species richness and no conservation actions are taken in those zones, the further homogenization of these habitats and the simplification of the macroinvertebrate community would accelerate, which, in turn, would precipitate the loss of rare endemic species.
In addition, we found that endemic species with lower occurrence frequencies (i.e., P. macra, K. coreana, and P. locumtenens) were mainly observed in the least-disturbed habitats, such as mountain areas (northeastern part of the Han River catchment). Certain factors (such as low temperature, high altitude, high ratio of forest land use, rapid water velocity, high ratio of riffles, and good water quality) strongly influenced the distribution of these three species based on RF. These three species primarily inhabit zones with little urbanization and extensive forest watersheds. However, the forest area in South Korea has gradually declined (i.e., 6.406 × 10⁶ ha in 2003 and 6.335 × 10⁶ ha in 2015) because of the construction of convenience facilities for humans. Furthermore, the ratio of private forests is high (4.25 × 10⁶ ha, 67.1% of the total forest area); thus, such forest areas are likely to be quickly lost due to potentially disruptive activities by the owners. Therefore, a strict environmental management policy is required to minimize the deforestation of such areas to conserve biodiversity, including endemic species.
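The forest figures quoted above can be checked with a few lines of Python. The constants are the values given in the text; the variable names are illustrative. The 67.1% private share is reproduced when the 2015 total is used as the denominator, which is an inference from the arithmetic rather than something the text states.

```python
# Forest areas in hectares, as quoted in the text.
FOREST_2003_HA = 6.406e6
FOREST_2015_HA = 6.335e6
PRIVATE_FOREST_HA = 4.25e6

# Absolute and relative loss of forest area between 2003 and 2015.
loss_ha = FOREST_2003_HA - FOREST_2015_HA
loss_pct = 100.0 * loss_ha / FOREST_2003_HA

# Share of private forest, assuming the 2015 total as the denominator.
private_share_pct = 100.0 * PRIVATE_FOREST_HA / FOREST_2015_HA

print(f"Forest loss: {loss_ha:,.0f} ha ({loss_pct:.1f}%)")
print(f"Private forest share: {private_share_pct:.1f}%")
```

The loss amounts to 71,000 ha, or roughly 1.1% of the 2003 area, over the 12-year period.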

Conclusions
The conservation, protection, and management of freshwater biodiversity are the ultimate conservation challenge because multiple human stakeholders continue to threaten biodiversity. Various efforts to maintain and protect biodiversity should be conducted globally through establishing a system of protected areas. We suggest that endemic invertebrates should be priority candidate species of conservation concern based on their rarity and the types of environmental factors that determine their diversity and distribution. Out of the 10 endemic species found in South Korea, P. macra had an occurrence frequency of just 1.1% with an extremely limited distribution (i.e., found in one catchment, especially in the least disturbed area), indicating the high conservation priority of this species. RF showed that the distribution of the endemic species was mainly influenced by geographical and meteorological factors. Benthic macroinvertebrates play an important role in many freshwater ecosystems, and are useful for evaluating biological integrity and water and habitat quality. Furthermore, the high diversity of endemic EPT species is a valuable predictor of the diversity of other aquatic invertebrates. Therefore, conservation efforts at sites containing endemic EPT species also guarantee the conservation of other freshwater taxa.

Figure 1. Sampling sites (a) and occurrence patterns of endemic EPT species (b) in South Korea.

Figure 2. Number of sampling sites for endemic EPT species.

Figure 3. Dendrogram of cluster analysis based on the endemic EPT assemblages.

Figure 4. Non-metric multidimensional scaling (NMS) ordination based on endemic EPT assemblages with fitted vectors of environmental factors. Black circles with four letters represent species, and others without letters indicate sampling sites. The direction and length of arrows represent the strength of the relationship between the environmental variables and the ordination axes. Only 10 environmental factors with R² values > 0.3 are displayed in the figure. Abbreviations for species and environmental factors are presented in Tables 1 and 3, respectively. Cluster 1: square with green color; cluster 2: circle with red color; cluster 3: triangle with blue color; and cluster 4: triangle with orange color.

Figure 5. Relative importance in the occurrence of endemic EPT species based on the Random Forest model. Abbreviations of environmental factors are presented in Table 3.

Table 1. Occurrence frequency of endemic Ephemeroptera, Plecoptera, and Trichoptera (EPT) species recorded in South Korea from 2008 to 2013.

Table 2. Differences in the abundance of endemic EPT species among four clusters from a cluster analysis. The values in parentheses are standard deviations. Different letters indicate significant differences of variables among clusters based on the multiple comparison tests (p < 0.05).

Table 3. Relationship between environmental factors and NMS ordination based on endemic EPT assemblages. The environmental factors with the 10 highest R² values are presented in boldface.