Predicting the Impact of Climate Change on Freshwater Fish Distribution by Incorporating Water Flow Rate and Quality Variables

Abstract: In this study, water flow rate and quality variables that restrict freshwater fish distribution were incorporated in species distribution modeling to evaluate the impacts of climate change. A maximum entropy model (MaxEnt) was used to predict the distribution of 76 fish species in the present (2012–2014) and in the future (2025–2035 and 2045–2055) based on representative concentration pathway (RCP) 4.5 and RCP 8.5 scenarios for five major river basins (Han, Nakdong, Geum, Seomjin, and Yeongsan) in South Korea. The accuracy of MaxEnt performance was improved from 0.905 to 0.933, and from 0.843 to 0.864 in the model training and test, respectively, by introducing flow rate, total nitrogen, total phosphorus (TP), and total suspended solids (TSS). TSS and TP were ranked as the second and fourth contributing parameters, respectively, among the 17 variables considered in this study. There was a greater decline in the species richness index (SRI) under scenario RCP 8.5 than under scenario RCP 4.5, and in 2050 compared with 2030. However, the tolerance guild index (TGI) was predicted to improve in the future. The increase in TGI, coupled with the decrease in SRI, indicated that climate change is likely to have adverse effects on freshwater fish. Notably, the habitat of Korean spotted barbel (Hemibarbus mylodon), an endemic species of South Korea, is expected to contract substantially in 2050 based on the RCP 8.5 scenario. These findings demonstrate that the incorporation of flow rate and water quality parameters into climatic variables can improve the prediction of freshwater fish distribution under climate change.


Introduction
Climate change will affect the distribution of fish species, changing the diversity of ecosystems [1]. Previous studies have shown that freshwater ecosystems will be negatively influenced [2], and thus studies are needed to predict species distribution under climate change. In general, the representative greenhouse gas emission scenario (representative concentration pathway, RCP) adopted by the Intergovernmental Panel on Climate Change [3] is used for future climate scenarios. In Korea, the National Institute of Meteorological Research has developed a regional climate model based on the RCP

Study Area and Fish Data
The study area was determined based on the river basin distribution, and five major river basins (Han, Nakdong, Geum, Seomjin, and Yeongsan) in South Korea were selected (Figure S1, Supplementary Materials). Other river basins were excluded from the study area due to insufficient flowmeter and water quality observations to model flow rate and water quality on a subwatershed scale. Some watersheds of the Han River basin are in North Korea across the Military Demarcation Line. Therefore, those were excluded from the study area because of insufficient environmental data. A river map associated with the five selected river basins was converted into a raster form of 1 km units, and the converted river map was set as the reference map. All map coordinates were set to the WGS84 (World Geodetic System 1984) datum.
Sustainability 2020, 12, 10001
Data on the occurrence of freshwater fish were obtained from the Water Environment Information System [25], which was provided by the National Aquatic Ecological Monitoring Program of Korea [26]. Fish were sampled according to national standard protocols [26], ensuring homogeneity and reliability of the data. Fish sampling was conducted twice per year, in spring and fall, to cover the spawning and growth seasons. The casting net and skimming net methods were used for fish capture. Fish occurrence data were collected from 960 representative locations between 2012 and 2014 throughout the five major river basins of South Korea. To ensure homogeneity of fish data, recent occurrence data that were recorded at different sampling locations and periods were not used in this study. These records contain 115 freshwater fish species in total. When sorting presence records, coordinates were standardized, and duplicates were removed. To ensure the accuracy of the distribution model, species with fewer than 10 training presence records were excluded from the target species. Consequently, 76 species were selected for distribution analysis (Table S1, Supplementary Materials).

Environmental Data
Five categories of environmental data, consisting of 17 variables, were prepared at the resolution of 1 km (Table 1). Categories pertaining to air temperature, precipitation, and flow rate were subdivided into four groups to reflect seasonal characteristics in Korea. Specifically, annual average, spring average (from March to May), autumn average (from September to November), and the mean annual difference were calculated for each category. Annual average total nitrogen (TN), total phosphorus (TP), and total suspended solids (TSS) were included in the water quality category. Topographic data were composed of two variables: elevation and slope. All environmental data were compiled from 2012 to 2014 to match the fish occurrence data. Current temperature and precipitation data were downloaded from the Korea Meteorological Administration (KMA) [27] and interpolated to a 1 km scale using ArcGIS version 10.2 [28]. Downscaled future climatic data in 2030 (2025-2035) and 2050 (2045-2055) were also obtained from the KMA, which were projected based on RCP greenhouse gas emission scenarios (RCP 4.5 and 8.5). RCP 4.5 is an intermediate emission scenario while RCP 8.5 is the most pessimistic emission scenario [3]. Current and future flow rate and water quality data were produced using the Soil and Water Assessment Tool (SWAT, version 2012, USDA Agricultural Research Service and Texas A&M AgriLife Research). SWAT is a physically based hydrological model that simulates surface runoff, total flow, sediment transportation, and nitrogen and phosphorus concentrations [29,30]. Briefly, the Han, Nakdong, Geum, Seomjin, and Yeongsan river basins were divided into 182, 195, 78, 46, and 34 subwatersheds, respectively, according to the Water Resources Management Information System [31]. Subwatersheds were further divided into hydrological response units (HRUs), which are nonspatially organized units based on environmental similarity [32]. 
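The seasonal summaries described above can be illustrated with a minimal Python sketch. It assumes monthly values are available for each grid cell, which the text does not state. The function name, dictionary keys, and example temperatures are hypothetical. The mean annual difference is left out on purpose because its exact definition is not given here.

```python
from statistics import mean

# Month numbers for the seasons as defined in the text.
SPRING = (3, 4, 5)      # March to May
AUTUMN = (9, 10, 11)    # September to November


def seasonal_summaries(monthly):
    """Summarize 12 monthly values (index 0 = January) for one grid cell."""
    if len(monthly) != 12:
        raise ValueError("expected 12 monthly values")
    return {
        "avg": mean(monthly),                          # annual average
        "spr": mean(monthly[m - 1] for m in SPRING),   # spring average
        "aut": mean(monthly[m - 1] for m in AUTUMN),   # autumn average
    }


# Hypothetical monthly mean air temperatures (deg C) for a single cell.
example = [-2.0, 0.5, 5.5, 12.0, 17.5, 21.5, 24.5, 25.0, 20.5, 14.0, 7.0, 0.0]
summary = seasonal_summaries(example)
```

The same function would apply unchanged to precipitation or flow rate series, since the three categories share the same seasonal grouping.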
To simulate the point-source loading effect in water quality, we summarized pollution loads in SWAT, allowing for pollutant information along the water channel [33]. Pollution loads were averaged in each subwatershed, and data from a total of 2778 point sources (721, 793, 460, 365, and 439 point sources in the Han, Nakdong, Geum, Seomjin, and Yeongsan river basins, respectively) [31] were used for this process. The parameters of the SWAT model were calibrated with monitoring data from 2010 to 2016, and the period of model stabilization was set at 4 years. After estimating model parameters, we calibrated the flow rate and water quality data with one major river in each river basin. Hydrological changes and pollutant loads were calculated for each HRU, and the flow rate and water quality of one subwatershed were calculated by summing the results of the HRUs belonging to that subwatershed. As a result, present and future flow rates and water quality data were produced for each subwatershed.
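The HRU-to-subwatershed aggregation step described above amounts to a grouped sum. The sketch below shows that bookkeeping only; it is not SWAT itself and does not reproduce SWAT's output format. The row layout, identifiers, variable names, and numbers are all hypothetical.

```python
from collections import defaultdict


def sum_hru_outputs(hru_rows):
    """Sum per-HRU values into per-subwatershed totals.

    hru_rows: iterable of (subwatershed_id, variable_name, value) tuples.
    Returns {subwatershed_id: {variable_name: total}}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for sub_id, variable, value in hru_rows:
        totals[sub_id][variable] += value
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {sub_id: dict(values) for sub_id, values in totals.items()}


# Hypothetical HRU outputs; IDs, variable names, and numbers are made up.
rows = [
    ("han_001", "flow", 1.2),
    ("han_001", "flow", 0.8),
    ("han_001", "tp_load", 0.03),
    ("han_002", "flow", 2.5),
]
subwatershed_totals = sum_hru_outputs(rows)
```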
High-resolution topographic maps (30 m unit scale) for elevation and slope were obtained from the National Geographic Information Institute [34] and the National Institute of Agricultural Sciences [35], respectively, and converted into 1 km units. Topographic variables were assumed to remain constant throughout the measured timeframe.

Species Distribution Modeling
The MaxEnt model (version 3.3.3k) [36] was used to estimate distribution potential [12]. The random test percentage was set to 30% for model verification; thus, 70% of presence records were used for model training (Table S1, Supplementary Materials). Training samples were selected by bootstrapping in each model replicate. MaxEnt controls model complexity using a regularization multiplier, which was set to the default (β = 1) [12]. If the regularization multiplier exceeds 1, the model loses specific ecological characteristics of the training samples. Conversely, if the regularization multiplier is below the default, results can be overfitted to the training set. We generated the area under the receiver operating characteristic curve (AUC) value to determine the accuracy of the quantitative model. The receiver operating characteristic (ROC) curves, which represent the threshold-independent accuracy of the model, were generated for each replicate. Using ROC curves, training AUC and test AUC values were calculated. This study applied model-result filtering to ensure prediction accuracy. The training and test AUC of the ROC curve were used to filter the model results [37]. When the training AUC was less than 0.8, or the test AUC was less than 0.7, in a replicate, the model output of that replicate was excluded. The filtered model outputs were then converted to a logistic probability of presence, which is expressed as a value between 0 and 1. Species' presence and absence were determined by applying the minimum training presence (MTP) threshold rule to the calculated presence probability for each replicate. The presence and absence replicates were averaged and rounded to form an integrated presence-absence map for each species. Overall, presence-absence maps were produced for the 76 fish species. Modeling results were interpreted as the species richness index (SRI) and the tolerance guild index (TGI).
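The post-processing steps described above (AUC filtering, MTP thresholding, and averaging of replicates) can be sketched in Python as follows. This is a sketch under stated assumptions, not the workflow actually used in the study. The data layout, function names, and example values are hypothetical, and the handling of exact ties when rounding is an assumption because the text does not specify it.

```python
def mtp_threshold(probabilities, training_cells):
    """Minimum training presence: lowest predicted value at a training record."""
    return min(probabilities[cell] for cell in training_cells)


def consensus_map(replicates, min_train_auc=0.8, min_test_auc=0.7):
    """Combine replicate outputs into one presence-absence map.

    Each replicate is a dict with keys 'train_auc', 'test_auc',
    'prob' (logistic output per cell) and 'train_cells'
    (indices of cells holding training presences).
    """
    kept = [r for r in replicates
            if r["train_auc"] >= min_train_auc and r["test_auc"] >= min_test_auc]
    if not kept:
        return None  # every replicate failed the AUC filter
    votes = [0] * len(kept[0]["prob"])
    for r in kept:
        cutoff = mtp_threshold(r["prob"], r["train_cells"])
        for i, p in enumerate(r["prob"]):
            votes[i] += 1 if p >= cutoff else 0
    # Average the 0/1 maps and round; exact ties are rounded up here (assumption).
    return [1 if 2 * v >= len(kept) else 0 for v in votes]


# Hypothetical replicates for one species over four grid cells.
replicates = [
    {"train_auc": 0.92, "test_auc": 0.85, "prob": [0.9, 0.6, 0.2, 0.1], "train_cells": [0, 1]},
    {"train_auc": 0.90, "test_auc": 0.75, "prob": [0.8, 0.3, 0.4, 0.1], "train_cells": [0, 2]},
    {"train_auc": 0.95, "test_auc": 0.72, "prob": [0.7, 0.5, 0.1, 0.05], "train_cells": [0]},
    {"train_auc": 0.75, "test_auc": 0.90, "prob": [0.1, 0.9, 0.9, 0.9], "train_cells": [1]},
]
presence_absence = consensus_map(replicates)  # -> [1, 0, 0, 0]
```

In this example the fourth replicate is dropped because its training AUC is below 0.8, and only the first cell is predicted present by a majority of the three remaining replicates.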
The SRI was calculated by accumulating the presence-absence maps for the 76 species in each projection, and thus ranges from 0 to 76. The TGI was applied to identify the relative proportion of sensitive species in the river. The tolerance guild of freshwater fish is determined by the response of fish populations to water pollutants [38]. Each species was classified into one of three groups, including tolerant species (TS) and sensitive species (SS) [39]. TGI was calculated by applying Equation (1). To identify tendencies for changes in freshwater fish distribution under climate change, SRI and TGI were analyzed in association with the representative topographic variable, elevation.
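The SRI accumulation described above is a cell-by-cell sum of the per-species presence-absence maps. A minimal sketch, assuming each map is a flat list of 0/1 values over the same cells (the function name and example maps are hypothetical):

```python
def species_richness(presence_maps):
    """Sum 0/1 presence-absence maps cell by cell to get the SRI per cell."""
    maps = list(presence_maps)
    if not maps:
        return []
    n_cells = len(maps[0])
    if any(len(m) != n_cells for m in maps):
        raise ValueError("all maps must cover the same cells")
    return [sum(cell_values) for cell_values in zip(*maps)]


# Three hypothetical species over four grid cells.
example_maps = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]
sri = species_richness(example_maps)  # -> [2, 2, 2, 0]
```

With 76 species, each cell value would fall between 0 and 76, matching the range given in the text.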

Model Performance
The performance of MaxEnt was evaluated by calculating training and test AUC values, which indicate the accuracy of model training and validation, respectively. When distribution was modeled only with air temperature, precipitation, and topographic variables (Model A), the average ± standard deviation of training and test AUC were 0.905 ± 0.0515 and 0.843 ± 0.0858, respectively. However, inclusion of flow rate and water quality variables (Model B) improved model AUC. The average ± standard deviation of training and test AUC of the model were 0.933 ± 0.0415 and 0.864 ± 0.0764, respectively, indicating "excellent" training and "good" test results [40]. Although the use of AUC to evaluate model performance may cause overestimation [41], AUC values are used widely to determine discrimination power in many statistical models, including MaxEnt [42–44]. Additionally, the AUC is considered a suitable parameter to compare accuracy between models that have different threshold values [11,45].
The average contribution and rank of environmental variables were calculated for Models A and B (Table 2). Both models showed that elevation was the most important variable in the freshwater fish distribution model. Although elevation is a nonclimatic environmental factor, it can restrain the species distribution boundary with climate [46]. Yoon et al. [47] reported that, among 13 environmental variables studied, elevation was the environmental factor that best explained the distribution patterns of 38 freshwater fish in South Korea. In addition to elevation, mean annual difference in temperature (Tdif) and annual average precipitation (Pavg) were the top contributing variables for Model A. In general, temperature variance can limit or change the distribution of fish [48,49]. Kwon et al. [7] also demonstrated that the difference in temperature between July and January was the most important factor to predict the distribution of 22 endemic Korean fishes, in addition to elevation. Additionally, the high contribution of precipitation reflects the importance of instream flow on fish distribution. In general, instream flow affects water depth and velocity, which play key roles in the suitability of fish habitats [50,51].
Water quality parameters, such as total suspended solids (TSS) and total phosphorus (TP), were ranked second and fourth, respectively, in Model B. TSS may induce both direct and indirect sublethal stress on fish (e.g., evasion from adverse environments) [52–54]. Additionally, high suspended solid concentration can incur adverse impacts on fish both in the short term (e.g., reduced feeding) and in the long term (e.g., reduced survival rate) [55,56]. Richter et al. [57] reported that deterioration of TSS affected approximately 35% of freshwater fauna, including fish, and threatened species distribution in habitats. Additionally, altered phosphorus load has been reported as a major threat to freshwater fish, causing habitat degradation [57]. Kwon et al. [58] demonstrated that TP was important for predicting the occurrence of 15 freshwater fish species in Korea using a random forest model, and specifically, was the most important environmental variable for distributions of Cyprinus carpio and Carassius auratus.
Considering its relatively high model performance (AUC) and the strong contribution of water quality parameters, Model B was adopted to predict the distribution of freshwater fish in this study.

Overall Prediction of Fish Distribution
The MaxEnt model predicted the present and future SRI in South Korea (Figures S2 and S3, Supplementary Materials). Presently, the SRI of the Geum, Yeongsan, and Seomjin river basins (40.18, 37.05, and 36.31, respectively) was higher relative to that of the Han and Nakdong river basins (32.53 and 29.54, respectively) (Table S2, Supplementary Materials). Assuming that climate change follows the RCP 8.5 scenario, the number of species in the Geum and Seomjin river basins is expected to decline by 21.13 and 17.78 in 2050, respectively. These values were much greater than those for the Han, Nakdong, and Yeongsan river basins, which are expected to decline by 9.5, 11.09, and 13.37, respectively. Chung et al. [59] also reported that endemic fish species in the Geum and Yeongsan-Seomjin river basins will be more severely affected by climate change than those in the Han and Nakdong river basins, resulting in larger decreases in SRI in the Geum and Yeongsan-Seomjin river basins.
Mean differences of SRI in 2030 and 2050 were −4.29 and −9.37 for scenario RCP 4.5, respectively, and −5.21 and −13.22 for scenario RCP 8.5, respectively (Table 3). These findings show that a larger decline in species is predicted under scenario RCP 8.5 than under scenario RCP 4.5, and in 2050 compared with 2030 under both scenarios. As indicated in Table 4, the increased air temperature (Tavg, Tspr, and Taut), increased fluctuation of precipitation (Pdif) and flow rate (FRdif), and the deterioration of water quality (TN, TP, and TSS) may drive species decline in 2050 under the RCP 8.5 scenario. Similar trends have been reported by Markovic et al. [1], who demonstrated that current fish habitats in Europe will decrease at a median rate of 43.2% owing to climate change. These findings emphasize the severity of climate change impact on freshwater fish.
Pearson correlation analysis showed that the present SRI was negatively correlated to elevation (r = −0.6610, p < 0.05) (Table 5), indicating that high elevation rivers had relatively low species richness compared with low elevation rivers. Kwon et al. [7] also reported that elevation was negatively correlated with the richness of freshwater fish species in Korea based on Spearman's rank correlation. However, the difference in the SRI between the present and the future was positively correlated to elevation (0.4 < r < 0.7), implying a higher increase in the number of species in high elevation areas than in low elevation areas. Chen et al. [60] also demonstrated that terrestrial species, including fish, will shift to higher elevation under climate change. These findings suggest that elevation will greatly influence the number of freshwater fish species under climate change.
Fish tolerance factors, such as the number and percentage of SS, have been used to evaluate freshwater fish habitat [61,62]. The predicted number of SS in the present was the highest in the Han river basin, followed by the Seomjin, Nakdong, Geum, and Yeongsan river basins (Table S2, Supplementary Materials). Both future scenarios showed a net increase in SS in 2030 and a net decrease in 2050 (Table 3). SS will lose more habitat under RCP 8.5 than under RCP 4.5 in 2050. This can be explained by the fact that air temperature (Tavg, Tspr, and Taut) is expected to decrease in 2030 but increase in 2050 under both climate change scenarios.
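For readers who want to reproduce the kind of SRI-elevation correlation reported in this section, a minimal Pearson correlation sketch in plain Python follows. The elevation and SRI values are hypothetical and serve only to show the direction of the relationship; significance testing is not included.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical cell elevations (m) and SRI values; richness falls with elevation.
elevation = [50, 120, 200, 350, 600]
sri_present = [45, 40, 33, 30, 18]
r = pearson_r(elevation, sri_present)  # negative for this example
```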
TGI was also calculated to determine the relative proportion of SS in the present and in the future (Figures S4 and S5, Supplementary Materials). As shown in Table S2, the TGI of five river basins in the present period followed the order Han > Seomjin > Nakdong > Geum > Yeongsan river basins (0.595 ± 0.256, 0.528 ± 0.192, 0.488 ± 0.265, 0.347 ± 0.211, and 0.341 ± 0.199, respectively), which is consistent with the number of SS. However, the average TGI was predicted to increase under both RCP 4.5 and 8.5 scenarios in 2030 and 2050, except for the Han river basin. Given that water quality (TN, TP, and TSS) deteriorates under climate change, the increased TGI is unlikely (Table 4) [63].
The relationship between future changes in TGI and SRI was further analyzed in the five river basins (Figure S6, Supplementary Materials). As indicated in Figure 1, the increase in TGI (∆TGI ≥ 0) under climate change was mainly accompanied by a decrease in SRI (∆SRI < 0). This indicates that the positive effects of climate change (the increase in TGI) should be evaluated with consideration of the negative effects (the decrease in SRI). Moreover, the percentage of areas where both TGI and SRI worsen (∆TGI < 0 and ∆SRI < 0) is expected to increase over time. These findings suggest that climate change will have an adverse impact on the distribution of freshwater fish.
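The joint classification of ∆TGI and ∆SRI discussed above can be expressed as a short tally over grid cells. In this sketch the two conditions come from the text, while the category labels, the catch-all "other" class, and the example values are hypothetical.

```python
def change_percentages(delta_tgi, delta_sri):
    """Percentage of cells in each joint category of TGI and SRI change."""
    if len(delta_tgi) != len(delta_sri) or not delta_tgi:
        raise ValueError("need two non-empty sequences of equal length")
    counts = {"tgi_up_sri_down": 0, "both_down": 0, "other": 0}
    for d_tgi, d_sri in zip(delta_tgi, delta_sri):
        if d_sri < 0 and d_tgi >= 0:
            counts["tgi_up_sri_down"] += 1   # dTGI >= 0 and dSRI < 0
        elif d_sri < 0:
            counts["both_down"] += 1         # dTGI < 0 and dSRI < 0
        else:
            counts["other"] += 1             # SRI unchanged or increased
    n = len(delta_tgi)
    return {name: 100.0 * c / n for name, c in counts.items()}


# Hypothetical per-cell changes (future minus present).
d_tgi = [0.10, 0.00, -0.20, 0.05, -0.10]
d_sri = [-3, -5, -2, 1, 0]
shares = change_percentages(d_tgi, d_sri)
```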

Predicted Distribution of Korean Spotted Barbel
The distribution of Korean spotted barbel under climate change was investigated as a case study. MaxEnt modeling showed that Korean spotted barbel can inhabit the Han, Nakdong, and Geum river basins, while present records were only located for the Han and Nakdong river basins (Figure 2a). According to Lee and Noh [64], Korean spotted barbel inhabited upstream of the Geum river basin until the early 1980s. This study also supports the possibility of the presence of Korean spotted barbel in a small region located upstream of the Daecheong dam (green circle in Figure 2a) in the Geum river basin.
Among all input environmental variables, mean air temperature (Tspr), precipitation (Pspr), river flow rate (FRspr) in spring, and elevation (Elev) contributed more than 10% to the modeled distribution of Korean spotted barbel (Figure S7, Supplementary Materials). The model showed that presence probability increases as the mean spring air temperature decreases (Figure 3a). In general, Korean spotted barbel inhabits waters where the temperature is relatively cool (13 °C on average) [65]. Chung et al. [59] also showed that the optimum temperature for Korean spotted barbel was relatively low, at approximately 9.2 °C, compared with 39 endemic fish species (8.75–11.86 °C) in Korea. The probability of presence was high from 200 to 300 m (Figure 3b). Given the average elevation (201.07 m) of rivers in South Korea (Table 4), the Korean spotted barbel mainly dwells in the midstream to upstream regions [66].
The model showed that the presence probability of Korean spotted barbel was expected to decrease with increasing amounts of spring precipitation (Figure 3c). Furthermore, there was a sharp decrease in the presence probability response curve when the mean spring flow rate was very low (Figure 3d). These findings suggest that Korean spotted barbel are controlled by hydrological conditions in spring, which may influence the building of spawning nests. In general, the Korean spotted barbel spawns in April and May (spring) and builds a nest with gravel to protect the eggs [66]. Nest building by the hornyhead chub (Nocomis biguttatus), river chub (Nocomis micropogon), and bluehead chub (Nocomis leptocephalus) is mainly constrained by flow rate [67–69]. Additionally, water depth is another important parameter for nest building, because it relates to hydrological conditions that avoid flow rates high enough to destroy the nest [67]. However, knowledge of the effect of hydrology on the building of spawning nests by the Korean spotted barbel is very limited, and further investigation is required.
Predictions for 2050 suggest a large decline in the species under the RCP 8.5 scenario (Figure 2c), and a lesser decline under the RCP 4.5 scenario (Figure 2b). This is expected because the distribution of Korean spotted barbel is limited by the increase in temperature under climate change (Table 4). The disappearance of Korean spotted barbel currently inhabiting the Han river basin was also predicted under climate change, as the maximum tolerance temperature was exceeded [70]. Additionally, the RCP 8.5 scenario accelerated habitat loss to a greater extent than the RCP 4.5 scenario did. These findings may be useful in the decision-making process for restoration and conservation plans, including prioritizing conservation areas for Korean spotted barbel.

Model Limitations and Challenges
Fundamental and realized niches can simulate potential species distribution [71]. The fundamental niche considers only the environmental resilience of species [72], while the realized niche also considers ecological effects, such as interspecific competition or predation [73]. Given that it is hard to quantitatively evaluate ecological effects, the fundamental niche was considered for MaxEnt modeling. Therefore, all presence data were assumed to have been recorded in the preferred environment (fundamental niche). This means that fish can occur in the absence of interspecific competition or interference and are free to disperse and migrate. In natural ecosystems, however, competition and migration are common [74]. Thus, it is necessary to reflect the realized niche in fish distribution modeling, although the fundamental niche can indicate an ecological preference of the species.
Fish species with few presence records (<10 training presence records) were excluded from the model to ensure adequate model accuracy (test AUC > 0.7). For example, endangered species (e.g., Pungitius sinensis) and climate-sensitive species (e.g., Rhynchocypris steindachneri, Hypomesus nipponensis, Oncorhynchus masou masou) are distributed in a limited area of South Korea [75,76]. Given that these species are ecologically important, further research is needed to predict their distribution. Recent studies have attempted to simulate fish distribution with an insufficient amount of presence data [77,78]. Anderson and Gonzalez [77] demonstrated that high predictive performance was achieved, even with limited presence data, by adjusting the species-specific regularization value in MaxEnt. Shcheglovitova and Anderson [78] used the delete-1 jackknife sampling method to discourage overfitting in distribution modeling of two spiny pocket mice with sample sizes of fewer than 10. This jackknife approach allows the use of as much data as possible for model training, while leaving the data required for model testing. In a further study, the 39 excluded fish species could be simulated by applying jackknife sampling and species-specific model building, including species-specific regularization values.
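The delete-1 jackknife mentioned above can be illustrated by its data-splitting step alone: each presence record is held out once while the rest are used for training. The sketch below covers only this split, not model fitting or evaluation, and the coordinates are hypothetical.

```python
def delete_one_splits(records):
    """Yield (training, held_out) pairs, leaving out one record at a time."""
    records = list(records)
    for i, held_out in enumerate(records):
        yield records[:i] + records[i + 1:], held_out


# Hypothetical presence records (longitude, latitude) for a rare species.
presences = [(127.1, 37.5), (127.4, 37.9), (128.0, 36.8), (128.3, 37.2)]
splits = list(delete_one_splits(presences))
```

With n presence records this yields n training sets of size n − 1, so a species with only a handful of records still contributes nearly all of its data to every model run.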

Conclusions
The present and future distribution of freshwater fish in five major river basins of South Korea was comprehensively predicted by integrating climate, topography, flow rate, and water quality parameters using a species distribution model, MaxEnt. Under climate change, the SRI of freshwater fish in Korea will decrease in 2050, while the TGI is expected to increase. Given that the increase in TGI is related to the decreased number of species, the adverse impacts of climate change should be assessed by integrating SRI and TGI. In addition, this forecasting on a national scale is important for policy makers in terms of setting conservation areas and a greenhouse gas emission target. In the future, this study should be improved by using an ensemble model with proper external validation.