Transferability of Monitoring Data from Neighboring Streams in a Physical Habitat Simulation

Habitat simulation models rely heavily on monitoring data, the quality of which can strongly affect the success of a physical habitat simulation. When data monitored in a study reach are unavailable or insufficient, data from neighboring streams are commonly used instead. However, the impact of using data from neighboring streams has rarely been studied. Motivated by this, we investigate the transferability of data from neighboring streams in a physical habitat simulation. The study area is a 2.5 km long reach located downstream from a dam in the Dal River, Korea. Zacco platypus was selected as the target fish for the physical habitat simulation. Monitoring data for the Dal River and three neighboring streams were obtained. First, similarities in the data related to channel geometry and in the observed distribution of the target species were examined. Principal Component Analysis (PCA) was also carried out to characterize the habitat use of the target species. Habitat Suitability Curves (HSCs) were constructed using the Gene Expression Programming (GEP) model, and improved Generalized Habitat Suitability Curves (GHSCs) were proposed. The physical habitat simulations were then performed. The Composite Suitability Index (CSI) distributions were predicted, and the impact of using data from the neighboring streams was investigated. The results indicated that the use of data from a neighboring stream, even one in the same watershed, can result in large errors in the prediction of CSI. The physical habitat simulation with the improved GHSCs was found to best predict the CSI.


Introduction
A physical habitat simulation is a numerical tool that quantifies physical habitat in terms of the flow depth, velocity, and substrate at a particular discharge for a given stream [1]. Thus, it is capable of predicting the impact of a change in flow on habitat availability for target species. Physical habitat simulations have been successfully used to estimate environmental flows in rivers [2][3][4][5], to design river restorations [6][7][8], to evaluate river health [9][10][11], and to assess the impact of river works or river development [12][13][14].
For the success of a physical habitat simulation, acquiring relevant monitoring data is important. This is because most habitat simulation models are heavily dependent on the monitoring data [15][16][17][18][19]. The monitoring data include such physical habitat variables as flow depth, velocity, substrate, and related issues. Obtaining monitoring data is, in general, costly and time-consuming. As a result, a significant portion of previous physical habitat simulations have used monitoring data from neighboring streams due to the lack of sufficient data. For example, more than one hundred physical habitat simulations have been carried out for streams in Korea. However, only about 15% of these studies used data obtained for the actual stream being studied [20,21]. The situation is not much better in the US [18,22-25].
The use of monitoring data from neighboring streams involves an implicit hypothesis that the knowledge-based or data-based models constructed using data from a neighboring stream are applicable to the stream being studied. The similarity of HSCs between the study stream and neighboring streams has been studied by many researchers [18,23,25,26-32]. However, the impact of using such data from neighboring streams has rarely been investigated, and a general and efficient solution to this problem has never been proposed. This motivated the present study.
The goal of this study was to assess the impact of using data from neighboring streams in a physical habitat simulation and to propose a generalized and efficient Habitat Suitability Index (HSI) model using data from neighboring streams. For this, a 2.5 km long reach in the Dal River, Korea was selected. This study reach is a gravel-bed stream located downstream from a dam. For the physical habitat simulation, Zacco platypus was selected as the target fish. Monitoring data from three neighboring streams were obtained for the physical habitat simulations of the study reach. Similarities of data for the channel geometry and for the observed distribution of the target fish against physical habitat variables were studied. Physical habitat simulations were carried out using the CCHE2D model and the HSI model for hydraulic and habitat simulations, respectively. For the HSI model, HSCs were constructed using the GEP model, and improved GHSCs were proposed using the suitable range concept. First, the impact of using data from neighboring streams in physical habitat simulations was examined quantitatively. Then, the improved GHSCs were used to predict the CSI distribution, and simulated results were compared.

Study Area
Figure 1 shows the study area in the Dal River, Korea and its neighboring streams. The Dal River is a mid-sized stream, a tributary of the Han River, and the basin area is 682.41 km². The study reach is 2.5 km in length and extends from the Sujeon Bridge to the Daesu Weir. The Goesan Dam, located 0.92 km upstream from the Sujeon Bridge, regulates the flow in the study reach. The Goesan Dam discharges water irregularly only for hydropower generation. As a result, the role of the Daesu Weir is to maintain a constant flow during periods when the dam is not discharging water. For the study reach, the discharges for drought flow, low flow, normal flow, and averaged-wet flow are 1.82, 4.02, 7.23, and 17.13 m³/s, respectively [33].

Monitoring Data
The three neighboring streams are the Hongcheon River, the Geum River, and the Chogang Stream, which are shown in Figure 1. For the Dal River and the three neighboring streams, hydrologic, water quality, and fish monitoring data were collected for the period of 2007-2010 through government R&D projects [34,35]. To measure the water level and velocity, a radar water gauge and a Price current meter were used, respectively, at the Sujeon Bridge. Dissolved oxygen and pH were measured using a handheld dissolved oxygen meter and a pH meter, respectively. Turbidity was measured with a turbidity meter (PT-200). Fish monitoring was carried out using cast nets and kick nets, revealing that the dominant species in the study area is the minnow (Zacco platypus), followed by dark chubs (Zacco temmincki) and swiri (Coreoleuciscus splendidus). These species account for 27%, 15% and 15% of the catch, respectively. In the present study, the adult minnow was selected as the target fish. Since the monitoring data include the number of individuals, flow depth, velocity, and substrate, they are habitat use data of Category II based on the criterion by Bovee [15].

Habitat Suitability Curves
In the present study, the GEP was used to construct HSCs. GEP takes advantage of both the Genetic Algorithm (GA) and Genetic Programming (GP). The GEP uses linear chromosomes with a fixed length and nonlinear parse trees with varied sizes and shapes. The former is obtained from the GA and the latter from the GP. Therefore, in the GEP, individuals are encoded as chromosomes, which are then expressed as expression trees. The combination of these separate entities, chromosomes and expression trees, enables the GEP to perform with a high degree of efficiency compared to GA and GP.
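The chromosome-to-tree expression step described above can be illustrated with a minimal sketch. It assumes the breadth-first (level-by-level) reading of a gene that is standard in GEP; the function set, the symbols `a`, `b`, `d`, and the helper names `decode` and `evaluate` are hypothetical and are not taken from the model used in this study.

```python
from collections import deque
import operator

# Hypothetical function set: symbol -> (arity, implementation).
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
}

def decode(gene):
    """Express a fixed-length linear gene as an expression tree.

    The gene is read left to right and the tree is filled level by level.
    Symbols left over at the end of the gene are simply not expressed,
    which is why trees of different sizes can come from genes of one length.
    A well-formed gene (head plus sufficiently long tail of terminals)
    always supplies enough symbols.
    """
    symbols = iter(gene)
    root = [next(symbols)]          # node = [symbol, child, child, ...]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        arity = FUNCTIONS[node[0]][0] if node[0] in FUNCTIONS else 0
        for _ in range(arity):
            child = [next(symbols)]
            node.append(child)
            queue.append(child)
    return root

def evaluate(node, env):
    """Evaluate an expression tree for the terminal values in `env`."""
    symbol = node[0]
    if symbol in FUNCTIONS:
        _, fn = FUNCTIONS[symbol]
        return fn(*(evaluate(child, env) for child in node[1:]))
    return env[symbol]

# Two genes of the same length express different trees.
tree_1 = decode("+*dabdd")   # (a * b) + d
tree_2 = decode("*+dabdd")   # (a + b) * d
```

In a full GEP run, a population of such genes would be mutated and recombined as plain strings, and each candidate would be scored by how well its expressed tree fits the monitoring data.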

Improved Generalized Habitat Suitability Curves
Maki-Petays et al. [25] introduced GHSCs for physical habitat simulations of juvenile salmon in four rivers in Finland. They constructed GHSCs for each habitat variable using the arithmetic means of the habitat suitability indices for the four neighboring rivers. To smooth the curves, Maki-Petays et al. [25] used the distance weighted least square method.
In the present study, a new method for constructing GHSCs by using the suitable range concept is proposed. The improved GHSCs use the arithmetic means of the HSCs of neighboring streams constructed by using data in a suitable range. The suitable range, the concept of which was proposed by Thomas and Bovee [26], is the range containing the central 95% of the occupied locations in the HSC.
The ranges of data for the Dal River and three neighboring streams are presented in Figure 2, where the total range, suitable range, and optimum range are denoted by dotted, black bold, and red bold arrows, respectively. The optimum range is the interval containing the central 50% of the occupied locations in the HSC [26]. The suitable and optimum ranges of the Dal River data are shaded in black and red, respectively, in the figure. It can be seen that the suitable and optimum ranges of the Chogang Stream data are the most similar to those of the Dal River data.
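The suitable and optimum ranges defined above can be computed from the habitat values at occupied locations with symmetric percentiles. The following is a minimal sketch, assuming that "central 95%" means the 2.5th to 97.5th percentile interval and "central 50%" the 25th to 75th; the function name `central_range` and the sample values are hypothetical.

```python
import numpy as np

def central_range(occupied_values, fraction):
    """Interval containing the central `fraction` of occupied locations.

    `occupied_values` holds one habitat value (e.g. flow depth in m)
    per location where the target fish was observed.
    """
    tail = 100.0 * (1.0 - fraction) / 2.0
    low, high = np.percentile(occupied_values, [tail, 100.0 - tail])
    return float(low), float(high)

# Hypothetical flow depths (m) at occupied locations, for illustration only.
depths = np.linspace(0.0, 1.0, 101)
suitable = central_range(depths, 0.95)   # central 95% of occupied locations
optimum = central_range(depths, 0.50)    # central 50% of occupied locations
```

Comparing these intervals between the study reach and each candidate stream gives a quick numerical counterpart to the visual comparison in Figure 2.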

Hydraulic Simulation
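CCHE2D, a numerical model for analyzing unsteady turbulent flows in open channels, was developed by the National Center for Computational Hydrosciences and Engineering at the University of Mississippi, US. The CCHE2D solves two-dimensional depth-averaged hydrodynamic equations using the efficient element method [38]. The continuity and longitudinal (x) and lateral (y) components of momentum equations are, respectively, given by:

$$\frac{\partial H}{\partial t} + \frac{\partial q_x}{\partial x} + \frac{\partial q_y}{\partial y} = 0$$

$$\frac{\partial q_x}{\partial t} + \frac{\partial (U q_x)}{\partial x} + \frac{\partial (V q_x)}{\partial y} + \frac{g}{2}\frac{\partial H^2}{\partial x} = gH\left(S_{0x} - S_{fx}\right) + \frac{1}{\rho}\frac{\partial (H\tau_{xx})}{\partial x} + \frac{1}{\rho}\frac{\partial (H\tau_{xy})}{\partial y}$$

$$\frac{\partial q_y}{\partial t} + \frac{\partial (U q_y)}{\partial x} + \frac{\partial (V q_y)}{\partial y} + \frac{g}{2}\frac{\partial H^2}{\partial y} = gH\left(S_{0y} - S_{fy}\right) + \frac{1}{\rho}\frac{\partial (H\tau_{yx})}{\partial x} + \frac{1}{\rho}\frac{\partial (H\tau_{yy})}{\partial y}$$

where $H$ is the flow depth, $U$ and $V$ are the depth-averaged velocities in the x- and y-directions, respectively, $q_x$ and $q_y$ are the respective discharges per unit width ($q_x = HU$, $q_y = HV$), $g$ is the gravitational acceleration, $\rho$ is the water density, $S_{0i}$ and $S_{fi}$ are the river bed slope and friction slope in the i-direction, and $\tau_{ij}$ is the horizontal turbulent stress tensor.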

Method of Comparing Data and Results
For quantitative comparisons of geometric data of the streams, the relative error of each component, defined below, was computed:

$$\text{Relative error}_i\ (\%) = \frac{\left|\Phi_{i\_Dal} - \Phi_{i\_NS}\right|}{\Phi_{i\_Dal}} \times 100$$

where $\Phi_{i\_Dal}$ denotes the i-th geometric component of the Dal River and $\Phi_{i\_NS}$ is the same component of the neighboring streams. In addition, to compare the observed and predicted CSIs, the following MAPE (Mean Absolute Percentage Error) is used:

$$\text{MAPE}\ (\%) = \frac{100}{n}\sum_{k=1}^{n}\left|\frac{CSI_o - CSI_p}{CSI_o}\right|$$

in which $n$ is the number of data and $CSI_o$ and $CSI_p$ are the observed and predicted CSIs, respectively.
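The two error measures described above can be sketched in a few lines. This sketch assumes that the relative error is normalized by the Dal River value and the percentage error by the observed CSI, and that zero observed CSI values are skipped to avoid division by zero; these choices, like the function names, are assumptions rather than details confirmed by the text.

```python
def relative_error(phi_dal, phi_ns):
    """Relative error (%) of a neighboring-stream component against the Dal River."""
    return abs(phi_dal - phi_ns) / abs(phi_dal) * 100.0

def mape(csi_observed, csi_predicted):
    """Mean absolute percentage error (%) between observed and predicted CSI.

    Assumption: pairs with an observed CSI of zero are excluded, because
    the percentage error is undefined there.
    """
    pairs = [(o, p) for o, p in zip(csi_observed, csi_predicted) if o != 0]
    if not pairs:
        raise ValueError("no non-zero observed CSI values")
    return 100.0 * sum(abs(o - p) / abs(o) for o, p in pairs) / len(pairs)

# Hypothetical values, for illustration only.
basin_error = relative_error(100.0, 125.0)
csi_error = mape([0.5, 0.4, 0.0], [0.25, 0.6, 0.3])
```

Averaging `relative_error` over the geometric components of one stream gives a single similarity score per candidate stream.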

Data Variability between Target and Source Streams
The Hongcheon River belongs to the Han River basin, the same as the Dal River, whereas the other two belong to the Geum River basin. Detailed characteristics of channel geometry and the substrate of the Dal River and neighboring streams are given in Table 1. The shape factor in the table, defined as the basin area divided by the length of the stream, denotes the average width of the basin. It should be noted in the table that the Geum River is a large-sized sand-bed river whose average slope is mild compared to the other streams. The geometric components include basin area, stream length, average width, mean elevation, and mean slope. The average values of the relative errors of the five geometric components in Table 1 are 51%, 357% and 25% for the Hongcheon River, the Geum River, and the Chogang Stream, respectively. This indicates that the Chogang Stream is the most similar to the Dal River in terms of channel geometry.

Habitat Use Characteristics of Freshwater Minnow and Constructed HSCs
The distribution of Zacco platypus in the Chogang Stream is the most similar to that in the Dal River (Figure 3). However, the target species in the Hongcheon River and Geum River appears to be distributed differently. In the figure, the classification scheme by Wentworth [39] was used for the substrate. Specifically, the ranges for the high population of Zacco platypus in the Dal River are a velocity of 0.4-0.65 m/s, a flow depth of 0.4-0.7 m, and a substrate of 4-6. However, in the Hongcheon River, the target species is densely populated in the ranges of a velocity of 0.6-1.0 m/s, a flow depth of 0.5-0.8 m, and a substrate of 2-4. In the Geum River, the ranges for a high population are a velocity of 0.1-0.6 m/s, a flow depth of 0.2-0.6 m, and a substrate of 2-5. The figure implies that monitoring data for the Chogang Stream can be used acceptably for a physical habitat simulation of the target fish in the study reach if data for the Dal River are not available.
In order to investigate the habitat use difference between the Dal River and the three neighboring streams, Principal Component Analysis (PCA) was carried out with seven variables, namely observed number of individuals, flow depth, velocity, substrate, pH, DO, and turbidity. The total number of data is 1586. A two-dimensional plot of PC2 versus PC1 is shown in Figure 4. It can be seen that the data can be grouped into four, which is the number of streams in this study. For PC1, the data for the Dal River lie in the range of −60 to 5. The ranges of data for PC1 are 10 to 60, 5 to 60, and −45 to −10 for the Hongcheon River, the Geum River, and the Chogang Stream, respectively. For PC2, the data for the Dal River, the Hongcheon River, the Geum River, and the Chogang Stream range from −30 to 40, −60 to 0, 0 to 70, and 0 to 70, respectively. The results from PCA indicate that the pattern of the Chogang Stream data is the most similar to that of the Dal River data, followed by the Geum River data and the Hongcheon River data.
Before performing the physical habitat simulations, HSCs were constructed using the GEP model. For three physical habitat variables, HSCs were constructed with the three monitoring datasets from the neighboring streams and are compared with those constructed using the Dal River data. Figure 5 shows the resulting HSCs. The grey bars in the figure indicate the number of observed individuals for each physical habitat variable. Choi and Choi [40] found that the HSCs by the GEP model are very similar to those by the method of Gosse [41] and that the GEP model predicts HSCs better than the Adaptive Neuro Fuzzy Inference System (ANFIS) model. Furthermore, Choi and Choi [40] indicated that the GEP model is robust and non-subjective compared with the method of Gosse [41].
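The principal component analysis described earlier in this section can be sketched with a singular value decomposition. This is a minimal sketch on synthetic data, not the analysis of the 1586 monitoring records. Standardizing each variable before the decomposition is an assumption, since the text does not state how the seven variables were scaled, and the function name `pca_scores` is hypothetical.

```python
import numpy as np

def pca_scores(data, n_components=2):
    """Project observations onto the leading principal components.

    `data` has one row per observation and one column per variable
    (e.g. individuals, depth, velocity, substrate, pH, DO, turbidity).
    Assumption: variables are standardized to zero mean and unit variance.
    """
    data = np.asarray(data, dtype=float)
    standardized = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Rows of `components` are the principal directions, ordered by variance.
    _, singular_values, components = np.linalg.svd(standardized, full_matrices=False)
    scores = standardized @ components[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, explained[:n_components]

# Synthetic stand-in for the monitoring table: 200 records, 7 variables.
rng = np.random.default_rng(0)
records = rng.normal(size=(200, 7))
scores, explained = pca_scores(records)
# scores[:, 0] and scores[:, 1] correspond to PC1 and PC2 in a plot like Figure 4.
```

Coloring the PC1 versus PC2 scatter by stream then shows whether the records of a neighboring stream overlap those of the study reach.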
For the flow depth and velocity, the HSCs for the Chogang Stream appear to be the most similar to those of the Dal River (Figure 5). However, the HSCs for the Hongcheon River are substantially different from those for the Dal River even though the two streams are part of the same watershed. This level of similarity is consistent with the prior investigations in Table 1 and Figures 3 and 4.
Regarding the substrate, all HSCs differ considerably in Figure 5c. This is because the target fish, Zacco platypus, does not have a substrate preference [42,43]. That is, Zacco platypus lives in both sand-bed and gravel-bed streams. Thus, hereafter, the substrate will not be considered in the physical habitat simulations.
The improved GHSCs are constructed and compared with the GHSCs constructed by the method proposed by Maki-Petays et al. [25] (Figure 6). It can be seen that both curves appear to be very similar except that the GHSCs have long tails where the values for the flow depth and velocity are large.
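The construction of the improved GHSCs can be sketched as: restrict each neighboring stream's data to its suitable range, build one curve per stream, and take the arithmetic mean. In the sketch below a peak-normalized histogram stands in for the GEP-based HSC, which is a deliberate simplification, and the suitable range is taken as the 2.5th to 97.5th percentile interval. The function names and sample data are hypothetical.

```python
import numpy as np

def histogram_hsc(values, edges):
    """Peak-normalized frequency curve (a simple stand-in for a GEP-built HSC)."""
    counts, _ = np.histogram(values, bins=edges)
    peak = counts.max()
    return counts / peak if peak > 0 else counts.astype(float)

def improved_ghsc(stream_values, edges):
    """Arithmetic mean of per-stream curves built from suitable-range data only."""
    curves = []
    for values in stream_values:
        values = np.asarray(values, dtype=float)
        # Suitable range: central 95% of the occupied locations.
        low, high = np.percentile(values, [2.5, 97.5])
        kept = values[(values >= low) & (values <= high)]
        curves.append(histogram_hsc(kept, edges))
    return np.mean(curves, axis=0)

# Hypothetical velocities (m/s) at occupied locations in three streams.
rng = np.random.default_rng(0)
streams = [rng.normal(loc, 0.15, size=300).clip(0.0, 1.5) for loc in (0.5, 0.8, 0.4)]
edges = np.linspace(0.0, 1.5, 16)
ghsc = improved_ghsc(streams, edges)
```

Because observations outside the suitable range are discarded before averaging, rare extreme values no longer contribute to the curve, which is consistent with the shorter tails of the improved GHSCs noted above.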

Transferability of HSCs and Validation of Improved GHSCs
In this section, CSI distributions of the Dal River are presented based on the physical habitat simulations. First, the various HSCs in Figure 5, constructed using data from the study reach and the neighboring streams, were used for the HSI model. Then, the GHSCs in Figure 6 were used to compute the CSI distributions. The former is to investigate the transferability of HSCs of the neighboring streams and the latter is to show the validity of the proposed GHSCs.
Using the HSCs in Figure 5, physical habitat simulations were carried out for Zacco platypus, and the resulting CSI distributions in the study reach are shown in Figure 7. The CSI was calculated using the multiplicative aggregation method. It can be seen that the CSI distribution computed using Chogang Stream data is the most similar to that of the Dal River. The CSI distribution computed using Geum River data is also similar, but the CSI distribution predicted with the use of Hongcheon River data is substantially different from the CSI distribution for the Dal River.
In order to evaluate the impact of using data from neighboring streams, the CSI of the Dal River versus the CSI predicted using data from neighboring streams is plotted in Figure 8. Only non-zero values of CSI are plotted in the figure, where the 45 degree line indicates a perfect match. It can be seen that the values of CSI predicted using Chogang Stream data best match those of the Dal River. However, the use of Hongcheon River data and Geum River data results in slight and serious over-predictions of CSI, respectively. The values for MAPE were also computed, and 67.5%, 39.2% and 25.1% were obtained for predictions using Hongcheon River data, Geum River data, and Chogang Stream data, respectively. The results indicate that the use of data from a neighboring stream, even one in the same watershed, can result in large errors in the prediction of the CSI. Prior investigations such as those in Table 1 and Figures 3 and 4 can be useful for selecting appropriate datasets for a physical habitat simulation.
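The multiplicative aggregation used for the CSI can be sketched as the product of the suitability indices of the individual habitat variables at each computational cell. The sketch assumes that only flow depth and velocity enter the product, since the substrate was excluded earlier, and that each HSC is available as a piecewise-linear curve. The curve points and function names are hypothetical.

```python
import numpy as np

def suitability(values, curve_x, curve_y):
    """Look up a suitability index on a piecewise-linear HSC.

    Values outside the curve's domain are treated as unsuitable (index 0).
    """
    return np.interp(values, curve_x, curve_y, left=0.0, right=0.0)

def composite_suitability(depth, velocity, depth_curve, velocity_curve):
    """CSI by multiplicative aggregation of depth and velocity suitability."""
    return suitability(depth, *depth_curve) * suitability(velocity, *velocity_curve)

# Hypothetical triangular HSCs given as (x points, suitability values).
depth_curve = ([0.0, 0.5, 1.0], [0.0, 1.0, 0.0])
velocity_curve = ([0.0, 0.5, 1.0], [0.0, 1.0, 0.0])

# Hypothetical simulated depth (m) and velocity (m/s) at three cells.
depth = np.array([0.5, 0.25, 2.0])
velocity = np.array([0.5, 0.5, 0.5])
csi = composite_suitability(depth, velocity, depth_curve, velocity_curve)
```

With a product, a cell that is unsuitable in any one variable receives a CSI of zero regardless of the others, which is the main behavioral difference from averaging-type aggregation.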

Conclusions
This paper investigated the impact of using monitoring data from neighboring streams for a physical habitat simulation and proposed a new method for constructing GHSCs using monitoring data from neighboring streams. The present study showed that great attention should be paid to the use of data from neighboring streams when monitoring data are not available for the physical habitat simulation. That is, the data from a neighboring stream whose geometrical properties are similar to those of the study reach should be used. Even the data from a stream that shares the same watershed as the study reach can result in large errors in the prediction of the CSI.
In addition, as a general strategy, the present study proposed an improved GHSC, which can be used for the physical habitat simulation more confidently with data from neighboring streams. The new method used the arithmetic means of the HSCs that were constructed with data only in a suitable range. The predicted CSI distribution was compared with that computed using the conventional HSI model, revealing that the prediction made using the improved GHSCs was better. However, the applicability of the proposed GHSCs should be investigated further by applying the methodology to various target streams that have their own monitoring data as well as data from neighboring streams.

Figure 1. The Dal River and its neighboring streams.

Figure 2. Suitable range and optimum range of data: (a) flow depth; (b) velocity.

Figure 4. Result of a principal component analysis.

Figure 7. CSI distribution for Zacco platypus (a) with Dal River data; (b) with Hongcheon River data; (c) with Geum River data; (d) with Chogang Stream data.

Figure 9 shows the CSI distributions for Zacco platypus in the Dal River. The GHSCs and improved GHSCs were used for the CSI distributions in Figure 9a,b, respectively. It appears that the use of GHSCs substantially improves the CSI distribution compared to the CSI distributions in Figure 7a. Quantitatively, the use of the HSC constructed with the Chogang Stream data, the GHSC, and the improved GHSC results in MAPE values of 17.18%, 19.35% and 15.46%, respectively. This indicates that the use of the improved GHSC leads to a CSI distribution that is most similar to that of the Dal River.

Table 1. Characteristics of the data used.