1. Introduction
Fisheries bycatch poses a serious threat to seabird populations worldwide, especially albatrosses, petrels, and shearwaters, with both direct and indirect consequences for ecosystem health and the trophic dynamics of ocean systems [1,2,3,4]. The latest global assessment ranks bycatch as the top threat by the number of seabirds affected, followed by invasive species and climate change [5]. Recognizing seabird bycatch as a threat to sustainable fisheries, key tuna Regional Fisheries Management Organizations, such as the International Commission for the Conservation of Atlantic Tunas and the Western and Central Pacific Fisheries Commission (WCPFC), often mandate bycatch mitigation measures, such as night setting and the use of streamer lines and/or hook-shielding devices, in areas with historically high levels of bycatch [6].
Seabirds spend a substantial part of their time beyond national borders, and their life histories depend on the high seas [7,8]. Each year, billions of birds undergo extensive migrations, and long-distance travel across the seas is not uncommon [9]. These flights connect remote regions of the world. The extensive overlap between seabird distributions and the footprint of fishery operations is particularly concerning [10,11,12]. Globally, it has been estimated that at least 160 thousand seabirds are killed annually, with persistent data gaps in some regions and fleets [13]. Despite the global extent of the issue, seabird bycatch assessments are commonly completed at a local level, for example, for the US North Atlantic pelagic longline fleet [14], the Chinese Atlantic longline fleet [15], longline vessels in the Western Mediterranean around the Columbretes Islands [16], and gillnets in the Polish exclusive economic zone of the Baltic Sea [17]. This causes problems when the overall size of seabird bycatch must be evaluated at a regional or global scale from individual assessment reports at local scales. These regional/global assessments have direct consequences for policy making and for directing conservation efforts at the national and international levels.
Assessing the uncertainty in the total seabird bycatch estimate is important because it characterizes the accuracy and reliability of the estimate [13], and some management agencies have set a direct target on its value [18]. In a benchmark review, Anderson et al. [13] compiled individual seabird bycatch assessments from 68 longline fisheries around the globe. They found that the upper range of the total seabird bycatch is approximately double the average estimate, and they could not calculate a lower range because of the high level of uncertainty in the bycatch rate estimates. In their study, the upper ranges from the individual assessments were added together to arrive at the upper range for the total count. This procedure seems harmless, but it almost always overestimates the upper range: for the summed value to be attained, the upper ranges of the seabird bycatch estimates in all 68 fisheries would need to co-occur, which is an unlikely event. So far, three questions remain open: whether seabird bycatch rates from different fisheries commonly fluctuate in a completely synchronized way; whether it is safe to assume independence for spatially distant sources; and, if the answer to both questions is no, how to obtain an uncertainty estimate from individual seabird bycatch assessments at a local level.
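To make the co-occurrence argument concrete, the sketch below computes how likely it is that every fishery exceeds its own upper bound in the same year. It assumes, purely for illustration, that each upper bound is a 97.5th percentile, and it contrasts the two extreme cases of fully independent and fully synchronized bycatch rates. The function name and the percentile are assumptions and are not taken from [13].

```python
def prob_all_exceed_upper(n_sources, tail_prob=0.025, synchronized=False):
    """Probability that all sources exceed their upper bound at the same time.

    tail_prob is the assumed chance that a single source exceeds its own
    upper bound (0.025 for a 97.5th percentile; an illustrative assumption).
    """
    if n_sources < 1:
        raise ValueError("n_sources must be at least 1")
    if synchronized:
        # Perfectly synchronized rates cross their bounds together.
        return tail_prob
    # Independent rates: the joint probability is the product of the tails.
    return tail_prob ** n_sources


if __name__ == "__main__":
    for n in (4, 17, 68):
        p_ind = prob_all_exceed_upper(n)
        p_syn = prob_all_exceed_upper(n, synchronized=True)
        print(f"n={n}: independent={p_ind:.3g}, synchronized={p_syn:.3g}")
```

Under independence the joint probability collapses rapidly as sources are added, which is why a sum of upper ranges is only attainable when the rates are close to perfectly synchronized.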
The main goal of this study is to fill this knowledge gap by developing a workflow for the estimation of the uncertainty level of seabird bycatch at a regional/global scale, based on the assessment reports at local scales. It is assumed that the annual bycatch rates from each source have already been assessed and are available to the researcher, which is a common situation. The task of a local assessment is to provide an unbiased estimate of the seabird bycatch rate. At the local level, it is important to consider aspects affecting bycatch susceptibility, such as the species identity and life history stages. However, at a higher level, it is assumed that these aspects have already been incorporated into the reported bycatch rate from each source, and therefore, they do not further affect the calculation of the total seabird bycatch at a higher level.
In the following, in addition to a theoretical analysis, we also explore different scenarios, based on seabird bycatch records from the WCPFC convention area. These records were used to hypothetically simulate multiple spatially distant separately managed areas with a relatively low level of observer coverage, a common situation for the assessments conducted on a regional/global scale. We demonstrate how to obtain an uncertainty estimate of the total bycatch based on the individual assessments, show that the difference among the different assumptions can be substantial when the number of individual assessments is high, and provide some empirical evidence for the co-variability of the patterns of seabird bycatch rates.
2. Methods
2.1. Problem Statement
Given unbiased seabird bycatch rate estimates and the total fishing effort from multiple sources, the total seabird bycatch can be calculated as the sum over sources of the product of the total fishing effort and the estimated bycatch rate; the task is to quantify the uncertainty of this total through its variance. The primary source of variation is the bycatch rate estimates. For each source, the seabird bycatch rate can be estimated from a sample of observed fishing operations. The seabird bycatch rate varies with natural factors, such as changes in seabird abundance, spatial distribution, and the spatial–temporal overlap between seabirds and fisheries. Other sources of variation include sampling error, intrinsic stochasticity, and changes in the distribution of the fishing effort in each region, and, for simplicity, these variations are assumed to be independent for each region. Specifically, incomplete observer coverage (i.e., less than 100% coverage) contributes to different levels of sampling error. It is important for the estimation model to handle potentially different levels of sampling error from each source, due to the differences in observer coverage.
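As a minimal sketch of the point estimate described above, the function below sums effort times rate over sources. The function name, the units convention (effort in thousands of hooks, rate in birds per 1000 hooks), and the input values are illustrative assumptions.

```python
def total_bycatch(efforts_k_hooks, rates_per_k_hooks):
    """Sum of effort x rate over sources.

    efforts_k_hooks: total fishing effort per source, in thousands of hooks.
    rates_per_k_hooks: estimated bycatch rate per source, birds per 1000 hooks.
    """
    if len(efforts_k_hooks) != len(rates_per_k_hooks):
        raise ValueError("one effort and one rate are needed per source")
    return sum(e * r for e, r in zip(efforts_k_hooks, rates_per_k_hooks))


# Hypothetical inputs for three sources (not values from this study).
example_total = total_bycatch([1000, 2000, 4000], [0.5, 0.25, 0.125])
print(example_total)
```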
The bycatch rate also varies with technological and managerial factors, such as changes in the bycatch mitigation requirements, and these factors mainly affect its average value rather than its variability. In the following, we consider the common case where the fishing effort is known; when the fishing effort is not known, its variation also contributes to the uncertainty in the estimated total bycatch. The coefficient of variation (CV) is another commonly used measure of uncertainty in bycatch studies [16,17]. In the US, the National Bycatch Strategy of the National Marine Fisheries Service recommends the CV for evaluating the bycatch uncertainty of US fisheries, with a target precision of 20–30% [18]. Therefore, both the variance and the CV were investigated in this study.
2.2. Correlation Structures of Seabird Bycatch Rates
When the total fishing effort is known from each source, for example, from fishery logbooks, the variance of the total bycatch can be calculated from the covariance matrix of the bycatch rates. However, the individual assessments supply only part of this covariance matrix, and additional information is needed to fill in the missing entries. The key is to obtain the bycatch rate variability from each source and the co-variability of the bycatch rates among all of the sources. Biologically, the bycatch rate variability from each source is due to both sampling errors and factors such as abundance fluctuations and seasonal migration, and the bycatch rates from different areas may fluctuate synchronously or anti-synchronously. See File S1 for the formula when the total fishing effort is also estimated.
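For the known-effort case, the calculation described above is the variance of a weighted sum. As a sketch in notation assumed here (File S1 may use different symbols), let E_i be the known total effort of source i, r̂_i its estimated bycatch rate with standard deviation σ_i, and ρ_ij the correlation between the rates of sources i and j, with ρ_ii = 1:

```latex
\hat{B} = \sum_{i=1}^{n} E_i \hat{r}_i, \qquad
\operatorname{Var}(\hat{B}) = \sum_{i=1}^{n} \sum_{j=1}^{n} E_i E_j \rho_{ij} \sigma_i \sigma_j, \qquad
\operatorname{CV}(\hat{B}) = \frac{\sqrt{\operatorname{Var}(\hat{B})}}{\hat{B}}
```

The individual assessments supply the σ_i, so the off-diagonal ρ_ij are the missing entries that the rest of this section is concerned with.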
There are three special cases where additional knowledge of the covariance matrix is not needed for the calculation of the uncertainty estimate of the total bycatch. The first one is the completely synchronized case, where the bycatch rates from any two sources are perfectly correlated. For the second case, the bycatch rates from all of the sources are assumed to be independent from each other. For the third case, the variation of the bycatch rates is completely counter-balanced, and the total bycatch is perfectly determined. See File S1 for more technical details on these special cases.
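The first two special cases reduce to closed forms that need only the per-source efforts and standard deviations. The sketch below assumes known efforts and uses hypothetical function names; the third, completely counter-balanced case needs no function because the standard deviation of the total is zero by definition.

```python
import math


def sd_synchronized(efforts, sds):
    """SD of the total when every pair of rates has correlation +1."""
    # With perfect positive correlation the effort-weighted SDs simply add.
    return sum(e * s for e, s in zip(efforts, sds))


def sd_independent(efforts, sds):
    """SD of the total when all rates are mutually independent."""
    # With zero correlation the effort-weighted variances add.
    return math.sqrt(sum((e * s) ** 2 for e, s in zip(efforts, sds)))


if __name__ == "__main__":
    efforts = [1.0, 1.0]  # hypothetical efforts
    sds = [3.0, 4.0]      # hypothetical rate SDs
    print(sd_synchronized(efforts, sds), sd_independent(efforts, sds))
```

The synchronized value is never smaller than the independent one, which is why summing upper ranges corresponds to the most conservative choice.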
Next, we motivate the development of the methodology by considering the biological processes that could naturally produce the correlation structures mentioned above. The processes that synchronize the variations in the bycatch rates include fluctuations in the seabird population’s abundance, and shrinkage and expansion in the distribution of seabirds (Figure 1 left panel). For example, all else being equal, a population decline would reduce the availability of seabirds to incidental capture and would produce a synchronized declining trend in the bycatch rates for all of the fishing vessels operating within range. Similarly, population growth would produce an increasing trend in the bycatch rates. The expansion and shrinkage of the seabird distribution in either space or time have the same effect as the fluctuating abundance discussed above. On the other hand, seabird distribution shifts in either space or time can produce counter-balanced variation in the bycatch rates (Figure 1 right panel). Holding the overall population abundance constant, a distribution shift will increase the availability of seabirds, and thus the bycatch rate, in some regions, and simultaneously decrease the availability of seabirds, and thus the bycatch rate, in other regions. Realistically, fluctuating population abundances and shifts in the temporal and spatial distribution act simultaneously on the bycatch rate, and the overall correlation will depend on the relative strength of these processes. Thus, biological processes naturally give rise to correlated bycatch rates among spatially separate areas. Note that the purpose of this thought experiment is to motivate the development of the methodology and provide some biological context for potential patterns; it is not intended to provide hypotheses to be tested here.
Generally, the bycatch rates from two different regions may vary between the completely synchronized case and the completely counter-balanced case, and the uncertainty estimation of the total bycatch requires additional knowledge of the missing correlation coefficients. When time series of bycatch rate estimates from each source are available, these missing coefficients can be empirically estimated. However, when detailed information is unavailable, it is not clear which simplified case should be used as the default. In the following hypothetical example, we empirically estimate the correlation in bycatch rates under two configurations of spatial partitions in the Western and Central Pacific, and the results may provide guidance for uncertainty calculations in other regions.
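As a rough illustration of estimating the missing coefficients from time series, the sketch below computes simple pairwise Pearson correlations between annual rate series. This is a deliberately simpler stand-in for the generalized least squares fit used later in this study, and the function names and input series are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length annual rate series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(series):
    """series: list of annual rate series, one per area."""
    k = len(series)
    return [[1.0 if i == j else pearson(series[i], series[j])
             for j in range(k)] for i in range(k)]


if __name__ == "__main__":
    # Hypothetical annual rates (birds per 1000 hooks) for three areas.
    rates = [[0.02, 0.03, 0.05, 0.04], [0.04, 0.03, 0.01, 0.02], [0.01, 0.02, 0.02, 0.03]]
    for row in correlation_matrix(rates):
        print([round(v, 3) for v in row])
```

With only a handful of years, such pairwise estimates are noisy, so they are best read as a first look rather than a substitute for a fitted correlation model.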
2.3. Scenario Analysis: Seabird Bycatch in Longline Fisheries in the Western and Central Pacific
The WCPFC convention area covers a large proportion of the ranges of albatross species, many of which are globally threatened [19,20]. In this study, the public domain aggregate (5° × 5° latitude/longitude) seabird bycatch and target catch records from the WCPFC convention area were obtained (Figure 2). The analysis spans the eight years from 2013 to 2020. The bycatch dataset contains a total of 5224 seabird capture records from over 180 million observed hooks. In this study, no distinction is made between the different seabird species, as species identification is incomplete in this dataset, and the seabird bycatch rate is calculated as the number of captured seabirds per 1000 hooks.
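The rate definition above can be sketched as a one-line calculation. The function name is an assumption. Plugging in the reported dataset totals as round figures gives a pooled rate of roughly 0.029 birds per 1000 hooks; because more than 180 million hooks were observed, this is an upper bound on the pooled rate, and it is not an area-specific estimate.

```python
def bycatch_rate_per_1000_hooks(captures, hooks_observed):
    """Observed captures per 1000 observed hooks."""
    if hooks_observed <= 0:
        raise ValueError("hooks_observed must be positive")
    return 1000.0 * captures / hooks_observed


# Reported totals used as round figures ("over 180 million" hooks).
pooled_rate = bycatch_rate_per_1000_hooks(5224, 180_000_000)
print(round(pooled_rate, 4))
```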
The observer coverage of this dataset is approximately 2.5% of the total fishing effort in number of hooks. The observer coverage may be sporadic in some of the areas in this region, and it may contribute to a high level of uncertainty in the bycatch rate estimate. Globally, this situation is the common case rather than an exception. This study addresses the issue of how to properly propagate the uncertainty at a local scale to estimates at a higher level. Another relevant issue to investigate is the validity of independence assumption among spatially distant areas, and the large spatial extent of this dataset allows us to also address this issue. On the other hand, a dataset with a high observer coverage (100%, for example) and local spatial extent would not allow us to expose those important issues commonly encountered in the assessments conducted at a regional/global scale.
We emphasize that the purpose of this example is to demonstrate the workflow and explore some common issues in estimating the uncertainty of the total bycatch at a regional/global scale with a real-world dataset. It is not the purpose of this example to estimate the bycatch rates for this region, which falls under the scope of a local scale assessment. To prevent any potential misuse of the particular results reached in this example, the level of uncertainty was calculated based on a hypothetical vector of total fishing efforts. Again, it falls under the scope of local scale assessment reports to supply the estimated mean and variance of the bycatch rate for each area, which are subsequently used in a regional/global assessment. The total fishing effort can either be known or be estimated with its mean and variance, and the corresponding formula for either case can be found in File S1.
In this study, two scenarios of partitioning the whole region into adjacent latitude bands were investigated (Figure 3). In scenario one, the four adjacent areas were defined based on the latitude of the location: area one for locations to the north of 20° N, area two for the area between 20° N and the equator, area three for the area between the equator and 20° S, and area four for the area south of 20° S (Figure 3 left panel). These areas partition the whole region into approximately equal parts. Next, areas one and two in scenario one were combined to form area one in scenario two, area four in scenario one was divided into areas three and four in scenario two, and area three of scenario one remained unchanged and was renamed area two in scenario two (Figure 3, right panel).
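The scenario one partition can be sketched as a simple lookup on latitude. The function name and the treatment of points lying exactly on a boundary are assumptions; with 5° × 5° cells referenced by their centers, no cell falls exactly on 20° N, the equator, or 20° S, so the boundary convention should not matter in practice.

```python
def scenario_one_area(lat_deg):
    """Map a latitude in degrees (north positive) to areas 1-4 of scenario one."""
    if not -90.0 <= lat_deg <= 90.0:
        raise ValueError("latitude must be between -90 and 90 degrees")
    if lat_deg >= 20.0:
        return 1  # north of 20 N
    if lat_deg >= 0.0:
        return 2  # between the equator and 20 N
    if lat_deg >= -20.0:
        return 3  # between 20 S and the equator
    return 4      # south of 20 S


if __name__ == "__main__":
    # Centers of four hypothetical 5 x 5 degree cells.
    for lat in (32.5, 7.5, -12.5, -37.5):
        print(lat, scenario_one_area(lat))
```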
The purpose of analyzing these configurations of the spatial partition is to show a real-world example of synchronized and counter-balanced variation in the bycatch rates. The hard constraint for these areas is that they must be large enough that there is non-zero observer coverage in every year from 2013 to 2020 for each individual area. Latitude bands were used because of the strong latitudinal gradient in seabird distribution in the Pacific [7], and because seabird bycatch mitigation requirements in this region are also latitude based [21], such that the bycatch rates are more similar among locations within each latitude band than across different bands. To some degree, these individual areas are arbitrarily defined, and other configurations could be used to achieve qualitatively similar results.
For each area, the observed annual seabird bycatch rate was assumed to be given and was calculated as the total number of observed captures divided by the total observed effort, in thousands of hooks, in that year; the estimated annual total bycatch was the sum over areas of the product of the observed bycatch rate and the total fishing effort in each area. The annual bycatch rate was assumed to have mean
. The method of generalized least squares was used to estimate the type of correlation among the bycatch rates in different areas [22]. Three different correlation structures were tested: the independent, compound symmetry, and general symmetric correlation matrices. In the compound symmetry case, the estimate of the common coefficient allows us to test the validity of the three special cases of correlation in the bycatch rates (see File S1 for details).
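To illustrate how a single compound symmetry coefficient relates to the three special cases, the sketch below builds the matrix and classifies the coefficient. The mapping follows from the variance of an equally weighted sum, n(1 + (n - 1)ρ), which is zero at ρ = -1/(n - 1); how File S1 phrases the test is not shown here, and the function names and tolerance are assumptions.

```python
def compound_symmetry(n, rho):
    """n x n correlation matrix with a common off-diagonal coefficient rho."""
    if n < 2:
        raise ValueError("need at least two areas")
    lower = -1.0 / (n - 1)  # smallest rho that keeps the matrix valid
    if not lower <= rho <= 1.0:
        raise ValueError("rho must lie between -1/(n-1) and 1")
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def special_case(n, rho, tol=1e-8):
    """Name the special case that a common coefficient rho corresponds to."""
    if abs(rho - 1.0) <= tol:
        return "synchronized"
    if abs(rho) <= tol:
        return "independent"
    if abs(rho + 1.0 / (n - 1)) <= tol:
        # Exactly counter-balanced only for equal effort-weighted SDs.
        return "counter-balanced"
    return "intermediate"


if __name__ == "__main__":
    for rho in (1.0, 0.0, -1.0 / 3.0, 0.3):
        print(rho, special_case(4, rho))
```

The independent structure has no correlation parameter, compound symmetry has one, and the general symmetric structure has n(n - 1)/2, which is the trade-off that the AIC comparison weighs.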
All of the analyses were conducted in the R 4.1.3 statistical environment [23] with the package nlme [24]. The model selection was based on AIC [25]. As a rule of thumb, models having ΔAIC ≤ 2 have substantial support, those having 4 ≤ ΔAIC ≤ 7 have considerably less support, and models having ΔAIC > 10 have essentially no support [26].
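The AIC comparison can be sketched as follows. The thresholds coded here are the commonly quoted rule-of-thumb cut-offs for AIC differences (at most 2, between 4 and 7, and above 10) and are stated as an assumption; the AIC values and model names in the usage example are hypothetical and are not those of Table 1.

```python
def delta_aic(aic_by_model):
    """AIC difference of each model from the best (lowest AIC) model."""
    best = min(aic_by_model.values())
    return {name: aic - best for name, aic in aic_by_model.items()}


def support_level(delta):
    """Rule-of-thumb support category for a given AIC difference."""
    if delta <= 2:
        return "substantial"
    if 4 <= delta <= 7:
        return "considerably less"
    if delta > 10:
        return "essentially none"
    return "unclassified"  # the rule of thumb leaves gaps between bands


if __name__ == "__main__":
    # Hypothetical AIC values for the three correlation structures.
    aics = {"independent": 112.0, "compound symmetry": 106.5, "general": 101.0}
    for name, d in delta_aic(aics).items():
        print(name, d, support_level(d))
```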
3. Results
The model selection results indicate complex interactions among the seabird bycatch rates in the Western and Central Pacific scenario analysis. For both configurations of the spatial partition, the model with a general correlation matrix was selected as the best model based on AIC (Table 1). For scenario one, the independent case was the second-best model, followed by the compound symmetry case with a similar fit, while for scenario two, the compound symmetry case was the second-best model, followed by the independence case, which fits the data rather poorly with
.
The estimates of the general correlation matrices in the selected models show the existence of both synchronized and counter-balanced variation. Both positive and negative components were present in the estimated correlation matrix of scenario one, and the large positive correlation (0.897) between the two spatially separated areas one and four was potentially mediated through their common negative correlation with area two, located in the middle (Table 2). In contrast, all of the estimated correlation coefficients were positive in scenario two (Table 3). By combining and splitting some of the individual areas (Figure 3), we obtained qualitative changes in the correlation matrix, suggesting that the correlation structure is sensitive to the spatial configuration of the individual areas.
Surprisingly, in both scenarios the strongest correlation appears between the areas most separated in distance, i.e., areas one and four (Table 2 and Table 3). Specifically, in scenario two, the correlation between the bycatch rate in area one and that in each of the other areas increases with distance (Table 3). These results suggest that the commonly used spatial correlation structures, which dictate a monotonically declining correlation with distance, will not provide a good fit for this particular dataset.
Even though the compound symmetry correlation structure is flexible enough to accommodate zero, positive, or negative correlations among the areas, it failed to fit the data well. In the first scenario, a weakly positive correlation parameter was obtained (
in Table A2, Appendix A), whereas both positive and negative correlations were identified in the general model, and the negative correlations were masked by the positive correlation, which is larger in magnitude (Table 2). In the second scenario, where all of the correlations were identified as positive but with different magnitudes in the general model, a moderate correlation parameter was obtained by assuming the compound symmetry structure (
in Table A4, Appendix A). In both cases, the assumption of a common correlation coefficient between the areas was not met by the data, and, as a result, the fit was poor compared with the general case.
With both the variance structure and the correlation structure estimated based on the selected model (Table 2 and Table 3), the uncertainty estimate of the total bycatch can be readily calculated. For example, using the formula for known total fishing efforts (see File S1 for details), for a hypothetical vector of total fishing effort of 20, 30, 40, and 50 million hooks in areas one through four in scenario one and the associated vector of standard errors shown in Table 2, the total seabird bycatch is 5440 birds with a CV of 87.78%. Assuming complete synchronization gives the same average, but with a CV of 91.86%, and assuming independence gives a CV of 64.35%. Since the vector of standard errors comes from the time series model, the calculated uncertainty is for the expected total observed seabird bycatch averaged over all of the years. In this hypothetical example, since we know the observed seabird bycatch rate for all of the years across all of the areas, we can calculate the uncertainty directly from the data; the CV from this direct calculation is 86.81%, which closely matches the value based on the selected model. Note that, in practice, the vector of standard errors would come from the local level assessment report for each area, and a direct calculation would not be possible. When any of the total fishing efforts is also estimated, the area-specific bycatch rate, which also comes from the local level assessment report, enters the calculation through a different formula (also found in File S1). By making additional distributional assumptions on the bycatch rates, one can further obtain confidence interval estimates for the total bycatch. The order of the uncertainty estimates under the different assumptions on the correlation structure is as expected, and the discrepancy among them is relatively small because the number of areas here is only four. In the following, we show that the discrepancy can be substantial when the number of areas is high.
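The calculation for known efforts can be sketched as below. Only the effort vector (20, 30, 40, and 50 million hooks) comes from the text; the rates, standard errors, and correlation matrix are hypothetical placeholders rather than the Table 2 estimates, so the printed numbers will not reproduce the 5440 birds or the CVs reported above.

```python
import math


def total_and_cv(efforts, rates, ses, corr):
    """Return (total bycatch, CV of the total) for known efforts.

    efforts: thousands of hooks; rates, ses: birds per 1000 hooks.
    corr: correlation matrix of the bycatch rates (list of lists).
    """
    n = len(efforts)
    total = sum(e * r for e, r in zip(efforts, rates))
    var = sum(efforts[i] * ses[i] * corr[i][j] * ses[j] * efforts[j]
              for i in range(n) for j in range(n))
    return total, math.sqrt(max(var, 0.0)) / total


efforts = [20_000, 30_000, 40_000, 50_000]  # thousands of hooks (from the text)
rates = [0.05, 0.02, 0.03, 0.06]            # hypothetical rates
ses = [0.04, 0.01, 0.02, 0.05]              # hypothetical standard errors
general = [[1.0, -0.2, 0.1, 0.6],           # hypothetical correlation matrix
           [-0.2, 1.0, 0.1, -0.1],
           [0.1, 0.1, 1.0, 0.1],
           [0.6, -0.1, 0.1, 1.0]]
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
ones = [[1.0] * 4 for _ in range(4)]

total, cv_general = total_and_cv(efforts, rates, ses, general)
_, cv_independent = total_and_cv(efforts, rates, ses, identity)
_, cv_synchronized = total_and_cv(efforts, rates, ses, ones)
print(total, cv_independent, cv_general, cv_synchronized)
```

All three assumptions share the same point estimate and differ only in the CV, which mirrors the comparison reported in the paragraph above.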
On the other hand, the estimates of the seabird bycatch rates are not sensitive to the assumptions on the correlation structure. For both scenarios, the estimates of the area-specific seabird bycatch rate are similar among the models with different correlation structures (Table 2, Table A1, and Table A2 (Appendix A) for scenario one; Table 3, Table A3, and Table A4 (Appendix A) for scenario two). This is not unexpected, because the covariation of the seabird bycatch rates between two areas does not concern the average bycatch rate in any of the areas. However, this does not imply that finding the correct correlation structure is unimportant. As further illustrated below, and also described in the Methods section, the correlation structure becomes more important when we need to compile estimates from a large number of individual sources. Incidentally, the seabird bycatch rates from mid-latitudes in both hemispheres are significantly higher than those from the equatorial region (Table 2), despite the more stringent bycatch mitigation requirements in the higher latitudes, suggesting the need for a review of the implementation and effectiveness of the bycatch mitigation measures in those areas.
Synchronized variation drives a much faster growth of variability in the estimate of the total bycatch than the independent case, and the effect becomes more pronounced as the number of individual sources increases. Assume that each area has the same fishing effort and that the seabird bycatch rate in each area has a common standard deviation. In the completely synchronized case, the standard deviation of the total bycatch scales linearly with the number of individual sources, whereas in the independent case it scales only with the square root of the number of sources. The difference between the two estimates grows steadily with the number of sources (the upper and lower curves in Figure 4) and is substantial when the number of sources is large. For reference, the latest global review of seabird bycatch in longline fisheries compiled estimates from 68 individual sources, still with gaps in some regions and fleets [13]. In terms of the CV, the level of uncertainty of the total seabird bycatch remains constant with respect to the number of sources in the completely synchronized case; for the compound symmetry correlation with an intermediate positive correlation, the CV quickly approaches a finite asymptote; for the independent case, the CV vanishes as the number of sources tends to infinity (Figure 5). These analytical results show that the penalty is large for choosing either the most conservative option (perfectly synchronized) or a wrong correlation structure when the number of individual reports to compile is large. See File S1 for additional details.
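The scaling behavior described above can be checked numerically with a short sketch. It assumes equal effort, a common standard deviation, a common mean rate, and a compound symmetry correlation ρ across n sources, so that the variance of the total is proportional to n(1 + (n - 1)ρ); the function names are assumptions. Under these assumptions, the synchronized standard deviation for 68 sources is √68, or about 8.2 times, the independent one.

```python
import math


def sd_total(n, rho, effort=1.0, sd=1.0):
    """SD of the total for n sources with equal effort, SD, and correlation rho."""
    if n < 1 or (n > 1 and rho < -1.0 / (n - 1)) or rho > 1.0:
        raise ValueError("invalid n or rho")
    # max() guards against tiny negative values from rounding at the lower bound.
    return effort * sd * math.sqrt(max(n * (1.0 + (n - 1) * rho), 0.0))


def cv_total(n, rho, cv_single=1.0):
    """CV of the total, relative to the CV of a single source."""
    if n < 1 or (n > 1 and rho < -1.0 / (n - 1)) or rho > 1.0:
        raise ValueError("invalid n or rho")
    # The ratio tends to cv_single * sqrt(rho) as n grows (for rho >= 0).
    return cv_single * math.sqrt(max((1.0 + (n - 1) * rho) / n, 0.0))


if __name__ == "__main__":
    for n in (4, 17, 68):
        print(n,
              round(sd_total(n, 1.0), 2), round(sd_total(n, 0.0), 2),
              round(cv_total(n, 1.0), 3), round(cv_total(n, 0.3), 3),
              round(cv_total(n, 0.0), 3))
```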
4. Discussion and Conclusions
The complex correlation structure among the seabird bycatch rates from different sources is likely facilitated by underlying biological processes, such as long-distance migration and abundance fluctuations. For example, the Sooty Shearwater (Ardenna grisea), a common bycatch species in longline fisheries [27,28], is one of the most abundant seabird species in the world [29]. In the Pacific, its population conducts trans-equatorial post-breeding migrations during the summer, connecting remote regions in the mid-latitudes of both the Northern and Southern Hemispheres [30]. Even though it is abundant, its populations are in decline both in its primary breeding grounds in the South Pacific and in its wintering grounds [31,32]. Other seabird species, such as the Great Shearwater (Ardenna gravis) and the Short-tailed Shearwater (Ardenna tenuirostris) [33], also undertake trans-equatorial migrations and are also common fishery bycatch species. Potentially, the population fluctuations of these seabird species drive the strong correlation in seabird bycatch rates between areas one and four in both scenarios. Other highly migratory bycatch species, such as marine mammals and turtles, may present similar issues, further complicating the assessment of the total fishery impact on their populations [34].
The factors affecting seabird bycatch estimates are scale dependent. At the local scale, the primary task of a bycatch assessment is to provide an unbiased estimate of the bycatch rate specific to the area. At this level, many factors affect the observed bycatch rate, such as differences in foraging behavior due to species identity and season [35,36,37], density effects due to competition [38], and environmental factors [39,40]. In the case of pelagic longlines, the time of setting is often an important factor [2,16,41,42], and species identity also affects the scaling factor used to recover the unobservable portion of the seabird bycatch [43]. Due to the non-random distribution of seabirds, it is important to standardize the seabird bycatch per unit effort (BPUE) at a local scale, for example, through the use of a distance-based random field [44]. The adoption of bycatch mitigation measures, for example, streamer lines [45,46], night setting [47], weighted hooks [48], and hook-shielding devices [49], can substantially reduce seabird interactions, and their effect should be considered in BPUE standardization. Meanwhile, because the effect of those mitigation measures often varies substantially between experimental trials and actual implementation [50], caution is needed when extrapolating their effect. Once the BPUE has been estimated, those local factors do not further complicate the calculation of the total bycatch at the higher level, because their impact has already been considered. The issue at a higher level is to properly account for the potential synchronous/anti-synchronous variations among the BPUEs from different areas, which can substantially affect the uncertainty estimation, as shown in this study.
Completely synchronized seabird bycatch rates are likely not a common occurrence, and the scenario analysis provides no support for completely synchronized variation in the Western and Central Pacific. In the scenario analysis, spatially adjacent areas are found to have either positive or negative correlations with small magnitudes. The weak correlation between adjacent areas may be due to the relatively low observer coverage in the region. Seabird bycatch is a statistically rare event, and a minimum of 20% observer coverage is recommended for the estimation of bycatch rates of common bycatch species [51]. The current coverage is far below this level; it introduces a large amount of sampling variation into the bycatch rate estimates and, in turn, weakens the correlation seen in the data. This is a common situation, as currently only three out of 17 regional fisheries management organizations require 100% observer coverage on fishing vessels [52]. Thus, in most cases, sampling errors alone are enough that we would not expect to observe completely synchronized variation. Despite the relatively low coverage in the scenario analysis, the absolute observer effort is relatively high, with over 180 million hooks observed. This high effort level enables us to find a correlation structure substantially different from the independence case for this region, and to show that none of the simplified structures explored in this study can be used as the default case.
Simplified correlation structures failed to capture the interaction of seabird bycatch rates in the scenario analysis. Even though the compound symmetry correlation structure can accommodate either positive or negative correlations, it cannot replicate the spatially heterogeneous correlation pattern in the data. Neither can the commonly used distance-based spatial correlation structures replicate the patterns well, due to the presence of strong long-distance synchronized variations and negative correlations. The most common parametric models, such as the Matérn and exponential covariances [53,54], cannot handle negative correlations [55]. Spatial correlations can be useful for inferring the influence of oceanographic processes at a local scale, where geo-referenced bycatch records are available. This is usually carried out at the individual assessment report level. For example, based on the analysis of a Gaussian random field, Gulf Stream meanders have been suggested as having an influence on the seabird bycatch rates of the US Atlantic pelagic longline fleet in the Mid-Atlantic Bight [56]. Similarly, in the Pacific, El Niño–Southern Oscillation (ENSO) events can limit prey availability to seabirds in certain areas [57], and drive distributional shifts in species that cannot persist in low-productivity waters [58,59]. At a regional or global level, however, the estimation of a general correlation matrix becomes necessary, due to the interconnectedness of seabirds at the global scale.
To assess the fisheries’ impact on seabird populations on a global scale, the number of individual assessments to synthesize is likely to be in excess of 68, based on longline gear alone [13]. The inclusion of other gear types, such as gillnets [60,61,62,63] and trawls [64], will further increase the number of sources and highlights the importance of choosing the correct correlation structure. In summary, this study addresses an important issue in the uncertainty estimation of the total seabird bycatch on a regional/global scale. Assuming completely synchronized variation produces the most conservative uncertainty estimate; however, it also misses any opportunity to improve the precision of the estimate when time series of the bycatch rates are available. It is also dangerous to assume independence even between spatially distant areas, because doing so would lead to an underestimate of the uncertainty when there is some connectivity between those areas. The issue of both under- and over-estimation due to an invalid assumption is especially acute when the number of individual areas is high. When time series of the bycatch rates are available, it is recommended to empirically estimate the correlation of the bycatch rates between each pair of sources, instead of relying on unwarranted assumptions.