Evaluating Sampling Designs for Demersal Fish Communities

Fish communities play an important role in determining the dynamics of marine ecosystems, and the evaluation and formulation of protective measures for these communities depend on the quality and quantity of data collected from well-designed sampling programs. An ecological model was first used to predict the distribution of the demersal fish community, which served as the “true” population for evaluating sampling designs. Four sampling designs, namely simple random sampling, systematic sampling, and stratified sampling with two sampling effort allocations (proportional allocation and Neyman allocation), were compared to evaluate their performance in estimating the richness and biodiversity indices of the demersal fish community. The impacts of two temperature change scenarios (a uniform and a non-uniform temperature increase) on the performance of the sampling designs were also evaluated. Stratified sampling with proportional allocation yielded the best estimates of fish community richness and biodiversity relative to the simulated “true” population. However, its performance was not always robust to the simulated temperature change. When the water temperature changed unevenly, systematic sampling tended to perform the best. This suggests that we need to carefully evaluate the appropriateness of stratification, and adjust the strata where necessary, when temperature change-induced habitat changes are large enough to result in substantial changes in the fish community.


Introduction
Fish communities play an important role in the dynamics of marine ecosystems and are an important food source [1,2]. The evaluation and formulation of protective measures for fish communities depend on the quality and quantity of data collected from well-designed sampling programs [3]. Low-quality or insufficient data derived from a poorly designed sampling program can lead to significant errors in quantifying the spatio-temporal dynamics of fish communities, especially in coastal waters because of the complicated habitat types and composition of the biological resources [4][5][6]. With an increased demand for high-quality data for the assessment and management of fisheries resources, the design of sampling programs has received increased attention [7][8][9].
Conducting a sampling program at sea is often difficult and expensive, and tends to be restricted by ocean conditions [3,7,10]. We cannot afford to test and evaluate different designs in the field in order to determine the quantity and quality of the data they yield [10,11]. This calls for a simulation-based evaluation of alternative sampling designs.
Simple random sampling, systematic sampling, and stratified sampling are commonly used in the sampling of fish communities [12]. Simple random sampling provides unbiased data [13] but tends to yield low-precision estimates, especially where populations are not randomly distributed in the ecosystem. A systematic sampling design is also widely used, especially when the spatio-temporal distribution of the target fish community or species is unknown [13]. However, when the spatio-temporal distribution of populations is structured, this design can result in overestimation or underestimation [13]. Stratified sampling designs are widely used in fisheries resource surveys [14]. This design can greatly improve the sampling precision compared to a simple random design for a population with a highly heterogeneous spatio-temporal distribution. A heterogeneous distributional area can be divided into different strata with more homogeneous distributions within strata [13]. As fish populations tend to have patchy distributions following their optimal habitats or biological behaviors such as migration, even within an optimal habitat, the stratified sampling design can be cost-effective, improving sampling precision at the same or lower cost compared to simple random sampling [15]. There are many rules for allocating sampling sites among strata, such as the Neyman allocation and allocation in proportion to stratum area. Different allocations may have different levels of precision. Clearly, the performance of a sampling design may differ with different spatio-temporal distributions of fish populations and objectives of sampling programs, and an optimal sampling design can only be identified through careful evaluation of alternative sampling designs [16].
The estimation of population abundance is the main purpose of many sampling programs [5,17,18,19]. The precision of these estimates is critical to ensure accurate fish stock assessment and management. Other sampling programs tend to focus on characterizing fish communities [20,21]. Variables influencing the spatio-temporal distribution of fish and fish communities should be considered in the development and evaluation of a sampling design [16,22]. Many studies have shown that temperature has an effect on the distribution of a wide range of species [23][24][25]. Such effects may result in shifts in fish distributions over time and should be considered when we develop and evaluate a sampling design [26].
In this study, using the data collected from a fish community study in a coastal marine ecosystem, we developed a Monte Carlo simulation approach to evaluate the performance of four sampling designs for quantifying the demersal fish community. An optimal sampling design was then selected. We subsequently evaluated whether the performance of the selected design might change under different temperature change scenarios.

Approach
A significant relationship was found between the fish community and habitats. An ecological model was first used to establish the relationship between the fish community and the environment, especially habitats. As environmental variables tend to be highly correlated with each other, their direct use as explanatory variables may reduce the statistical power of a generalized additive model (GAM). A principal component analysis (PCA) was therefore first applied to the environmental variables in order to develop a set of new, uncorrelated environmental variables (i.e., principal components). The distribution of fish communities in different habitats predicted by this PCA-based GAM was treated as the "true" population for the evaluation of sampling designs. Four sampling designs, namely simple random sampling, systematic sampling, and stratified sampling with two sampling effort allocations (proportional allocation and Neyman allocation), were compared to evaluate their performance in estimating richness and biodiversity indices of a demersal fish community (Figure 1).

Study Area and Fish Species Composition
The study area, with a total area of 549 square kilometers, is located at the Ma'an archipelago, Zhejiang province, China (Figure 1). Habitats in this area have large spatial heterogeneity and vary greatly on a rather small spatial scale, with habitat types ranging from rock reef, sand, mud, and mussel aquaculture areas to other natural or artificial habitats [27]. The water depth in this area ranges from 3 m to more than 30 m.
The fish communities in the study area showed an obvious seasonal distribution and a non-uniform spatial distribution. In the rocky reef habitat, which is mainly supported by macroalgae, the community structure of living resources changes with the seasonal distribution of macroalgae [27]. Dominant species in the nearshore rocky area are Sebastiscus marmoratus, Agrammus and Nibea albiflora. Platessapercocephalus, Lateolabrax maculatus, and Larimichthys polyactis are common species. The species in the sand habitat, which are mainly Paraplagusia japonica and migratory species such as Thrissa kammalensis and Mugil cephalus, play an important role in the species composition. The species composition in the mud habitat is very different from that in the rocky reef habitat, as Johnius belangerii and Lophius litulon are the common species. In the study area, the mussel aquaculture areas and aquaculture cages were very attractive to Hexagrammos otakii and Sebastiscus marmoratus. The species in the artificial reef habitat are mainly reef fish species, such as Sebastiscus marmoratus.

Data Collection
The demersal fish community and environmental data used for modeling were collected monthly during 2009 from 24 sampling sites, which were selected through a rough stratification based on habitat types (Figure 2). Data collected in the survey included demersal fish abundance, longitude, latitude, month, temperature, salinity, oxygen, chlorophyll, turbidity, depth, and type of habitat. Two types of stationary bottom gillnets were used as the sampling gear because a trawl cannot be used in every type of habitat. One had a height of 1.8 m with four different mesh sizes (25 mm, 34 mm, 43 mm, and 58 mm) and a total length of 60 m. The other set of gillnets was 2.4 m high with mesh sizes of 50 mm, 60 mm, 70 mm, and 80 mm in panels of 30 m length each (a total length of 120 m). Gillnets were set for about 24 h (mean of 23.6 ± 2.5 h) at each sampling site. The data were not transformed to catch-per-unit-effort because the sampling times were similar. Each fish sampled was identified to species and the abundance of each fish species was recorded. The sampling location was recorded by GPS (Global Positioning System). A CTD (conductivity, temperature, and depth sensor; Sea-Bird 19plus) was used to measure depth, temperature, salinity, chlorophyll, turbidity, and oxygen.
Sustainability 2018, 10, x FOR PEER REVIEW
Margalef's richness index (D) and Shannon index (H') were chosen to quantify fish communities in the analysis of spatial assemblages of demersal fish species [28]. The relationship between the selected fish community index and environmental variables was developed using a generalized additive model (GAM). Habitats and environmental variables, such as temperature, depth, and salinity, play a major role in influencing fish community distribution [29,30]. Different habitats may result in different fish community assemblages [31]. As there is a large spatial heterogeneity in the distribution of habitats in the study area, the type of habitat was chosen as an important variable. Other environmental variables, such as depth, temperature, and salinity, were also important in influencing fish distribution directly or indirectly [32][33][34]. The environmental variables chosen for this study included longitude, latitude, month, temperature, salinity, oxygen, chlorophyll, turbidity, depth, and type of habitat.
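The two community indices named above can be computed from a per-site catch table as in the following minimal sketch. It uses the standard textbook definitions, D = (S − 1)/ln N and H' = −Σ p_i ln p_i, where S is the number of species, N the total number of individuals, and p_i the proportion of species i. The function names and the example catch are hypothetical and are not taken from the survey data.

```python
import math


def margalef_richness(abundances):
    """Margalef's richness index D = (S - 1) / ln(N).

    abundances: counts of individuals per species at one site.
    """
    counts = [a for a in abundances if a > 0]
    n_individuals = sum(counts)
    if n_individuals <= 1:
        raise ValueError("D is undefined when N <= 1 because ln(N) <= 0")
    return (len(counts) - 1) / math.log(n_individuals)


def shannon_index(abundances):
    """Shannon index H' = -sum(p_i * ln(p_i)) over the species present."""
    counts = [a for a in abundances if a > 0]
    n_individuals = sum(counts)
    return -sum((c / n_individuals) * math.log(c / n_individuals) for c in counts)


# Hypothetical catch at one site: four species, ten individuals each.
example_catch = [10, 10, 10, 10]
d_value = margalef_richness(example_catch)
h_value = shannon_index(example_catch)
```

For a perfectly even catch of four species, H' equals ln 4, which is a convenient sanity check when adapting the sketch to real data.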
The PCA-based GAM used a logit link function with a Gaussian error distribution. The GAM for the Margalef's richness index or Shannon index was specified in terms of the following components: Margalef's richness index is the richness of the demersal fish community; the Shannon index is the diversity of the demersal fish community; Logit is a logit link function; s is the spline smoother; Lon is longitude; Lat is latitude; comp_i is the ith principal component; n is the number of principal components chosen to be included in the PCA-based GAM; month ranges from January to December; and type is the habitat type, including rock reef, sand, mud, mud and sand, mussel aquaculture area, artificial reef, and aquaculture cage. This PCA-based GAM method was shown to perform better than the GAM based on the original variables [28].
Based on the estimated PCA-based GAM and environmental variables, the spatial distributions of Margalef's richness index and Shannon index in June, August, October, and September in 2012 and February and April in 2013 were predicted (Figure 3). This predicted spatial distribution of the demersal fish community index was considered as the "true" fish community structure in the evaluation of different sampling designs.
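One form of the PCA-based GAM that is consistent with the terms defined above could be written as follows. This is only a sketch: the purely additive structure, the separate smooths for longitude and latitude, the treatment of month and type as categorical terms, and the symbols I and ε are assumptions rather than the fitted model.

```latex
\mathrm{Logit}(I) = s(\mathrm{Lon}) + s(\mathrm{Lat})
  + \sum_{i=1}^{n} s(\mathrm{comp}_i)
  + \mathrm{month} + \mathrm{type} + \varepsilon
```

Here I stands for either Margalef's richness index or the Shannon index, and ε is the Gaussian error term.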

Treatment Sampling Designs
The size of the sampling grid cells used in this study was 300 × 300 m, which was chosen based on the size of the gillnets used in sampling [10]. There were 3381 potential sampling stations of the defined size within the study area. Stations that were on the edges and/or could not be sampled were not included in the 3381 stations. The following four sampling designs were considered in the evaluation:
(1) Simple random sampling. n sites were randomly selected for sampling from the 3381 potential sites. This design is referred to as Design I.
(2) Systematic sampling design. All 3381 sites were sorted by longitude first and then by latitude. The sorted list of stations was evenly divided into n groups, and the sampling distance d was calculated as 3381/n. One station was selected from each of the n groups, resulting in n sites selected for sampling. The first site was chosen randomly from the first group, followed by the sites at sampling distances of d, 2d, ..., and (n−1)d from it. This design is referred to as Design II.
(3) Stratified sampling design. A previous study suggests that substrates tend to influence the fish species composition in the study area [27], and we used the substrates to stratify the survey area. Sand and artificial reef take up a small area and were therefore combined into one stratum. Four strata were thus defined based on substrates: rock reef, sand and artificial reef, mud, and mud and sand. The total number of potential sampling sites was 139 in the rock reef stratum, 35 in sand and artificial reef, 2339 in mud, and 868 in mud and sand. Samples were randomly selected within each stratum. The performance of the stratified sampling design can be greatly affected by the allocation of sampling effort among the strata [3]. In this study, we considered two sample allocation scenarios for the stratified sampling design: (a) allocating samples in proportion to the habitat area of each stratum (referred to as Design III); and (b) allocating the first half of the samples evenly among the strata, then allocating the remaining samples according to the Neyman allocation, which assigns more samples to strata with larger variances (referred to as Design IV).
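The four designs described above can be sketched in a few lines of Python. The stratum sizes come from the text; everything else is an illustrative assumption, including the function names, the largest-remainder rounding rule, and the use of the textbook Neyman rule (samples proportional to stratum size times stratum standard deviation) for the second half of Design IV. The sketch also does not check that an allocation stays within the number of sites available in a small stratum.

```python
import random

# Number of candidate 300 m x 300 m sites per stratum, as given in the text.
STRATA = {"rock reef": 139, "sand and artificial reef": 35,
          "mud": 2339, "mud and sand": 868}


def simple_random_sites(n_total, n, rng):
    """Design I: draw n distinct site indices at random."""
    return sorted(rng.sample(range(n_total), n))


def systematic_sites(n_total, n, rng):
    """Design II: random start in the first group, then steps of d = n_total / n."""
    d = n_total / n
    start = rng.random() * d
    # min() guards against a floating-point result landing on n_total.
    return [min(int(start + k * d), n_total - 1) for k in range(n)]


def proportional_allocation(n, weights):
    """Split n samples among strata in proportion to the given weights."""
    total = sum(weights.values())
    quotas = {h: n * w / total for h, w in weights.items()}
    alloc = {h: int(q) for h, q in quotas.items()}
    # Assumed rounding rule: give leftover samples to the largest remainders.
    leftover = n - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda h: quotas[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc


def neyman_allocation(n, sizes, sds):
    """Textbook Neyman rule: n_h proportional to N_h * S_h."""
    return proportional_allocation(n, {h: sizes[h] * sds[h] for h in sizes})


def design_iv_allocation(n, sizes, pilot_sds):
    """Design IV: half of n spread evenly, the rest by the Neyman rule."""
    even = {h: (n // 2) // len(sizes) for h in sizes}
    rest = neyman_allocation(n - sum(even.values()), sizes, pilot_sds)
    return {h: even[h] + rest[h] for h in sizes}


rng = random.Random(42)
design_i = simple_random_sites(3381, 25, rng)
design_ii = systematic_sites(3381, 25, rng)
design_iii = proportional_allocation(25, STRATA)
```

With equal-sized grid cells, allocating in proportion to the number of candidate sites is equivalent to allocating in proportion to stratum area, which is why Design III reuses `proportional_allocation` with the site counts as weights.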

Simulation Procedure and Measure Indices
In order to test the impact of sample size on precision, three sample sizes were considered (25, 50, and 100). For a given sample size and design, we sampled the "true" fish community (Figure 2) 100 times. From the simulated samples, we calculated the three performance indices used in this study to measure the performance of the sampling designs: design effect (Deff), relative estimation error (REE), and relative bias (RB) [12,35]. The three indices reflect different aspects of sampling performance. Deff measures the effectiveness of a sampling design relative to the simple random design; REE measures the overall error of the estimated mean; and RB measures the level of estimation bias for a given design. The three indices are defined in terms of the following quantities: $V_k(\hat{\theta})$ is the variance of the sample mean for the kth sampling design; $V_{SRS}(\hat{\theta})$ is the variance of the sample mean derived for the simple random design; $V^{\mathrm{estimated}}_j$ is the estimated mean in the jth simulation run of the kth sampling design; $V^{\mathrm{true}}$ is the corresponding true mean; and N is the number of simulation runs (i.e., 100).
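As a rough illustration of how the three indices can be computed from the simulation output, the sketch below assumes the forms commonly used for these measures: Deff as the ratio of the two variances, REE as the root mean squared error divided by the true mean, and RB as the mean error divided by the true mean, both expressed as percentages. These forms, the percentage scaling, and the function names are assumptions and may differ in detail from the definitions used in the study.

```python
import math
import statistics


def design_effect(estimates_k, estimates_srs):
    """Deff: variance of the sample mean under design k over that under SRS."""
    return statistics.variance(estimates_k) / statistics.variance(estimates_srs)


def relative_estimation_error(estimates, true_value):
    """REE (%): root mean squared error of the estimates relative to the truth."""
    mse = sum((e - true_value) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mse) / true_value * 100


def relative_bias(estimates, true_value):
    """RB (%): mean deviation of the estimates relative to the truth."""
    mean_estimate = sum(estimates) / len(estimates)
    return (mean_estimate - true_value) / true_value * 100


# Hypothetical estimated means from a handful of simulation runs.
runs_design_k = [9.0, 11.0, 10.5, 9.5]
runs_srs = [8.0, 12.0, 11.0, 9.0]
deff = design_effect(runs_design_k, runs_srs)
ree = relative_estimation_error(runs_design_k, true_value=10.0)
rb = relative_bias(runs_design_k, true_value=10.0)
```

Under these definitions, a Deff below 1 means the design is more precise than simple random sampling at the same sample size, and an RB near 0 indicates little systematic over- or underestimation.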

The Influence of Temperature Change
In order to assess the influence of temperature on the performance of the sampling designs, the performance was evaluated with increases in temperature that were uniform or non-uniform [36]. Two scenarios were considered: (1) Temperature did not increase uniformly in space. This scenario was simulated by using the temperature of a warmer month while all other environmental variables were kept unchanged. February, April, and June were selected. For example, when the community index in February was simulated, the temperature used for prediction came from April, while the other environmental variables were those of February. (2) Temperature in each month increased consistently and uniformly by 1 °C to 5 °C. The distribution of the demersal fish community was predicted from the GAM with the increased temperature. To avoid temperatures beyond the upper limit, the indices for February were used in this scenario.
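The two scenarios amount to simple transformations of the environmental table that is passed to the fitted model for prediction. The sketch below assumes each site is represented as a dictionary of environmental variables; the field names, function names, and values are hypothetical.

```python
def uniform_increase(sites, delta):
    """Scenario 2: add the same temperature increment (deg C) at every site."""
    return [{**site, "temperature": site["temperature"] + delta} for site in sites]


def borrow_temperature(target_sites, donor_sites):
    """Scenario 1: use the temperature field of a warmer month, keep all else.

    Both lists must describe the same sites in the same order.
    """
    return [{**target, "temperature": donor["temperature"]}
            for target, donor in zip(target_sites, donor_sites)]


# Hypothetical February and April conditions at two sites.
february = [{"site": 1, "temperature": 9.0, "salinity": 30.0},
            {"site": 2, "temperature": 9.5, "salinity": 31.0}]
april = [{"site": 1, "temperature": 14.0, "salinity": 29.0},
         {"site": 2, "temperature": 13.0, "salinity": 30.5}]

scenario_02_04 = borrow_temperature(february, april)
scenario_plus_3 = uniform_increase(february, 3.0)
```

Borrowing a warmer month's temperature field produces a spatially uneven increase because the month-to-month difference varies from site to site (5.0 °C and 3.5 °C in the hypothetical values above).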

Design Effect (DEFF)
The values of Deff showed that the performance of the four sampling designs for the Margalef's richness index had the following ranking: Design III > Design IV > Design II > Design I (Figure 4a,c,e). Design III performed best in all six months. For the Shannon index, the ranking of designs was: Design III > Design I > Design IV > Design II (Figure 4b,d,f). Design III performed well in terms of the overall average Deff but did not perform best in every one of the six months. For example, Design II performed best in June with a sample size of 25 (Figure 4b).

Relative Estimation Error (REE)
Design III yielded the best estimates of Margalef's richness index, followed by Design IV. Design I performed the worst for all three sample sizes considered in this study (Figure 5a,c,e). For example, the average REE was 1.590 (approximately 1.481-1.738) for Design I and 0.470 (0.155-0.890) for Design III. The REE of the stratified sampling designs was smaller than that of the systematic sampling design and simple random sampling (Figure 6).
For the Shannon index, the overall REE of Design III was the lowest (Figure 5b,d,f). However, the REE of Design II was lower than that of Design III in June and October (Figure 5b). The REE of all four sampling designs decreased when the sample size was increased from 25 to 100.

Relative Bias (RB) of Each Sampling Design
For the Margalef's richness index, RB values of the four sampling designs were distributed symmetrically around 0 (Figure 7a,c,e), indicating that all the designs yielded more or less unbiased estimates. The distribution of RB was narrower and more concentrated for the stratified sampling designs than for the simple random sampling design and the systematic sampling design. There was no obvious difference in the RB between the two stratified sampling designs (i.e., Designs III and IV).
For the Shannon index, the RB values of the four sampling designs were not distributed symmetrically around 0. For example, the RB distribution of Design II was skewed, with a tail extending to the right when the sample size was 50, suggesting that the systematic sampling design might overestimate the Shannon index. The stratified sampling designs (Design III and Design IV) yielded similar RB ranges for the Margalef's richness index and the Shannon index (−0.1 to 0.1). However, for Design I and Design II, the RB range for the Margalef's richness index (−0.2 to 0.2) was wider than that for the Shannon index (−0.1 to 0.1). The range of the RB distribution narrowed and moved closer to 0 for all four designs as the sample size increased from 25 to 100 (Figure 7).

Possible Influence of Temperature Changes
When the temperature increased uniformly across the study area by 1-5 °C, Design III performed well, with the lowest Deff and REE among the four sampling designs (Table 1). With increasing temperature, the Deff and REE of Design III did not change much (Figure 8). Because the same result was obtained for the Shannon index, only the Margalef's richness index is shown in Figure 8.
When the temperature increased non-uniformly across the study area, Design II had the lowest REE (Table 2). However, the Deff of Design II was not the best. The values of Deff for the four sampling designs had the following ranking: Design III > Design IV > Design II > Design I.
In Tables 1 and 2, the smallest Deff and REE for each scenario are identified in bold print. The label 02-04 means that the temperature used for prediction came from April while the other environmental variables were kept at their February values; 04-06 means that the temperature came from June while the other environmental variables were kept at their April values; and 06-08 means that the temperature came from August while the other environmental variables were kept at their June values.

Discussion and Conclusions
The performance of different sampling designs in estimating fish abundance was examined in many studies [2,7,9,38].Most of these studies focused on the population of a single fish species [7], while only a few studies focused on the fish communities.There is a lack of research evaluating the influence of temperature change on the performance of a sampling design.In this study, the four sampling designs were compared in their performance of estimating different community structure indices, while the influence of temperature change on sampling was also evaluated.This study shows that a stratified sampling design (Design III) yielded the best estimates.However, the performance of Design III might vary with increased temperature.
All sampling designs have their advantages and disadvantages [39,40]. A systematic sampling design is easy to implement and is more suitable for sampling fish communities when no prior information is available [3,41]. A stratified random sampling design is also commonly used in many fishery surveys [2,12,15], and the variance within strata can be reduced by suitable stratification. Previous studies have shown that systematic and stratified sampling designs tend to have higher accuracy than simple random sampling [16,42]. For an infinite population, a stratified sampling design with equally sized strata has higher precision [2,16]. The distribution of fish community indices showed large spatial variability as a result of the patchy distribution of various habitats. The stratified sampling design (Design III) tested in this study used substrate types for the stratification, which reduces the variance within strata and improves the precision of the estimates.
The stratified sampling design based on strata area (i.e., Design III) performed better than the other stratified sampling design. For a stratified sampling design, the sampling effort can be allocated using several different criteria [2,12]. Neyman allocation allocates samples based on strata variance [43]. However, it is difficult to know the variance before or during a survey. In general, previous years' surveys can be used to estimate the variance [3,43]. In this study, as the sampling area is not very large, it is possible to carry out a Neyman allocation based on an initial equal allocation of samples. However, it is more suitable to use the variance estimated from previous years' surveys when the sampling areas are large or the environment has changed greatly. In this study, Design III, which allocates sampling effort among the strata based on strata areas, tended to perform better than Design IV, which is based on the Neyman allocation, although the difference was small (Figures 2, 3 and 5). Proportional allocation in a stratified sampling design is relatively simple and does not require the prior knowledge of spatial variability in organism distributions that the Neyman allocation requires [12].
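The two allocation rules discussed above can be illustrated with a minimal sketch. It assumes the textbook forms of the rules: proportional allocation assigns samples in proportion to stratum area, and Neyman allocation assigns them in proportion to stratum area multiplied by the within-stratum standard deviation. The function names, the largest-remainder rounding step, and the example numbers are illustrative assumptions and are not taken from this study.

```python
def _round_to_total(shares, n):
    """Largest-remainder rounding so the integer allocations sum to n."""
    base = [int(s) for s in shares]          # floor of each non-negative share
    leftover = n - sum(base)
    # Give the remaining samples to the strata with the largest fractional parts.
    order = sorted(range(len(shares)),
                   key=lambda i: shares[i] - base[i],
                   reverse=True)
    for i in order[:leftover]:
        base[i] += 1
    return base


def proportional_allocation(areas, n):
    """Allocate n samples in proportion to stratum area."""
    total = sum(areas)
    return _round_to_total([n * a / total for a in areas], n)


def neyman_allocation(areas, sds, n):
    """Allocate n samples in proportion to area times within-stratum SD.

    The standard deviations must come from prior information, e.g. a
    pilot survey or previous years' surveys (hypothetical inputs here).
    """
    weights = [a * s for a, s in zip(areas, sds)]
    total = sum(weights)
    return _round_to_total([n * w / total for w in weights], n)


if __name__ == "__main__":
    areas = [60, 30, 10]   # hypothetical stratum areas
    sds = [1, 2, 3]        # hypothetical within-stratum standard deviations
    print(proportional_allocation(areas, 20))
    print(neyman_allocation(areas, sds, 20))
```

The sketch makes the practical difference visible: the proportional rule needs only the stratum areas, whereas the Neyman rule also needs a variance estimate for every stratum before the survey starts.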
The choice of optimal sampling design may depend on the objectives of a sampling program [39]. Traditional sampling designs, such as systematic and stratified sampling designs, are more suitable for a continuous distribution or an area source occupying large spatial areas [3]. As we used fish community indices in this study instead of the abundance indices of a single fish population, the heterogeneity of the spatial distribution of the fish community was reduced over the whole area. For such cases, the traditional sampling designs are more suitable. The performance of the sampling designs differed between the Margalef's richness index and the Shannon index. For example, Design III had the smallest REE in all months for the Margalef's richness index, but this was not true for the Shannon index. This might result from the different spatial distributions of these two indices [7]: the Margalef's richness index yielded more regular and obvious patterns between different habitats than the Shannon index (Figure 2). The strata of the stratified sampling design were defined by habitat type. Thus, the estimation errors of the stratified sampling design were lower for the Margalef's richness index than for the Shannon index.
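For readers less familiar with the two community indices compared above, the following minimal sketch computes them from a vector of species counts at one site. It assumes the standard definitions, Margalef's richness D = (S - 1) / ln N and the Shannon index H' = -sum(p_i ln p_i) with natural logarithms; the function names are illustrative, and the exact formulation used in this study may differ (for example in the logarithm base).

```python
import math


def margalef_richness(counts):
    """Margalef's richness index D = (S - 1) / ln(N).

    S is the number of species present and N the total number of
    individuals. Requires N > 1 so that ln(N) is positive.
    """
    present = [c for c in counts if c > 0]
    n_total = sum(present)
    if n_total <= 1:
        raise ValueError("Margalef's index needs more than one individual")
    return (len(present) - 1) / math.log(n_total)


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over species present."""
    present = [c for c in counts if c > 0]
    n_total = sum(present)
    if n_total == 0:
        raise ValueError("Shannon index needs at least one individual")
    return -sum((c / n_total) * math.log(c / n_total) for c in present)


if __name__ == "__main__":
    haul = [10, 10, 10, 10]   # hypothetical catch: four equally abundant species
    print(margalef_richness(haul), shannon_index(haul))
```

Margalef's index depends only on the number of species and the total catch, while the Shannon index also depends on how evenly individuals are spread among species, which is one reason the two indices can show different spatial patterns.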
Design III performed best when the temperature was predicted to increase uniformly over the study area. The relative performance of the four sampling designs changed little with an increase of 5 °C in temperature. The Deff and REE values of Design III remained at the same level as the temperature changed (Figure 6). This may be because the modeled temperature change did not result in large changes in the spatial distribution of the synthetic fish community indices. In this study, although the values of the richness and diversity indices changed, the spatial pattern did not change significantly. Thus, a uniform change in temperature across the study area resulted in almost no changes in the ranking of the performance of the sampling designs.
Design II performed better when the temperature increased non-uniformly. Design III was not the optimal sampling design, although its Deff was the smallest. Under this temperature change scenario, Design III performed worse than under the unaltered temperature (Figure 5, Table 2). This may result from the large changes in the spatial distribution of fish community indices caused by non-uniform changes in the habitats. Thus, the stratification based on the previous habitat did not perform well. It is important to adjust the strata of a stratified sampling design when the habitat in the target area experiences large changes. This suggests that we need to carefully evaluate the appropriateness of stratification when temperature change-induced habitat changes are large enough to result in substantial changes to a fish community.
Sample size is an important factor affecting the precision of indices estimated from the sampling designs [44]. The influence of sample size was examined in this study with sample sizes increasing from 25 to 100. In some cases, an optimal sampling design can compensate for the loss of precision that results from a low sampling effort. For example, Design III with a sample size of 25 was more accurate than simple random sampling (Design I) with a sample size of 50 (Figure 5). Simple random sampling with a sample size of 100 had the same estimation accuracy as the stratified sampling design with a sample size of 25. This suggests that an optimal sampling design can increase sampling accuracy even with a low sampling effort.
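The trade-off between sample size and design can be explored with a small resampling experiment. The sketch below is not the simulation used in this study: the synthetic population, the stratum names, the number of replicates, and the definitions of REE (root mean squared error as a percentage of the true value) and Deff (variance of the design's estimates divided by the variance under simple random sampling) are all assumptions made for illustration.

```python
import random
import statistics


def make_population(rng):
    """Synthetic index values for three hypothetical habitat strata."""
    spec = {"mud": (500, 1.0), "sand": (300, 2.0), "rock": (200, 3.0)}
    return {name: [rng.gauss(mu, 0.1) for _ in range(size)]
            for name, (size, mu) in spec.items()}


def srs_estimate(pop, n, rng):
    """Mean of a simple random sample drawn from the whole population."""
    values = [v for stratum in pop.values() for v in stratum]
    return statistics.fmean(rng.sample(values, n))


def stratified_estimate(pop, n, rng):
    """Weighted mean under (approximately) proportional allocation.

    Stratum sample sizes are rounded, so their sum may differ slightly
    from n for some inputs.
    """
    total = sum(len(s) for s in pop.values())
    estimate = 0.0
    for stratum in pop.values():
        weight = len(stratum) / total
        n_h = max(1, round(n * weight))
        estimate += weight * statistics.fmean(rng.sample(stratum, n_h))
    return estimate


def ree(estimates, true_value):
    """Assumed REE: RMSE of the estimates as a percentage of the truth."""
    mse = sum((e - true_value) ** 2 for e in estimates) / len(estimates)
    return 100 * mse ** 0.5 / abs(true_value)


def compare(n, reps=500, seed=1):
    """Replicate both designs and summarise REE and Deff for sample size n."""
    rng = random.Random(seed)
    pop = make_population(rng)
    truth = statistics.fmean(v for s in pop.values() for v in s)
    srs = [srs_estimate(pop, n, rng) for _ in range(reps)]
    strat = [stratified_estimate(pop, n, rng) for _ in range(reps)]
    return {
        "ree_srs": ree(srs, truth),
        "ree_strat": ree(strat, truth),
        # Assumed Deff: design variance relative to simple random sampling.
        "deff_strat": statistics.pvariance(strat) / statistics.pvariance(srs),
    }


if __name__ == "__main__":
    for n in (25, 50, 100):
        print(n, compare(n))
```

Because the synthetic strata differ strongly in their means, stratification removes most of the sampling variance here; with weaker habitat structure, or with strata that no longer match the habitat, the advantage would shrink.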
A significant relationship was found between the fish community and habitats. Optimizing a sampling design in complicated habitats mainly involves the sampling method and the allocation of sites. This study provides a framework for choosing an optimal sampling design in complicated habitats. The ecological model was used first to establish the relationship between the fish community and the environment, especially habitats. The fish community distribution in different habitats predicted by this model served as the "true" population for the sampling design. Different sampling designs and sampling effort allocations can then be compared based on this "true" distribution. The stratified sampling design with sampling effort allocated by habitat area was the optimal sampling design, and therefore it can be used for fish community surveys in these waters. An approach similar to the one used in this study can also be applied to other fish species or populations when designing a fishery-independent survey program.
Fish resources are limited, and thus optimal sampling designs are important for the protection and utilization of fish communities. This study evaluated the performance of four sampling designs for quantifying demersal fish diversity and richness and provided a method for choosing a sampling design that can be applied to the monitoring of fisheries resources. This study also suggests that we need to carefully evaluate the appropriateness of stratification when temperature change-induced habitat changes are large enough to result in substantial changes to a fish community.

Figure 1. Flowchart of the "true" population derived from the principal component analysis (PCA)-based generalized additive model (GAM).

Figure 2. The study area and sampling sites.

Figure 3. Habitat types in the study area and the distribution of the demersal fish community indices in October: (a) the distribution of the Margalef's richness index; and (b) the distribution of the Shannon index.

Figure 8. Performance of Design III for the Margalef's richness index with increased temperature for: (a) Deff; and (b) REE.

Table 1. The Deff and REE for different sampling designs of Margalef's richness index with a uniform temperature increase. The smallest Deff and REE for each scenario are identified in bold print.

Table 2. The Deff and REE of each sampling design of Margalef's richness index with a non-uniform temperature increase. The smallest Deff and REE for each scenario are identified in bold print. 02-04 means that the temperature for prediction came from April while keeping the other environmental variables invariant compared to February. 04-06 means that the temperature for prediction came from June while keeping the other environmental variables invariant compared to April. 06-08 means that the temperature for prediction came from August while keeping the other environmental variables invariant compared to June.