Advanced Spatial Modeling to Inform Management of Data-Poor Juvenile and Adult Female Rays

Chronic overfishing has depleted numerous elasmobranch stocks in the North East Atlantic, but addressing this issue has been hampered by management complications and lacking data. Spatial management approaches have thus been advocated. This work presents a novel application and further development of an advanced spatial modeling technique to identify candidate nursery grounds and spawning areas for conservation, by subsetting already limited data. Boosted Regression Tree models are used to predict abundance of juvenile and mature female cuckoo (Leucoraja naevus), thornback (Raja clavata), blonde (Raja brachyura), and spotted (Raja montagui) rays in the Irish Sea using fish survey data and data describing fishing pressure, predation and environmental variables. Model-predicted spatial abundance maps of these subsets reveal distinct nuances in species distributions with greater predictive power than maps of the whole stock. These resulting maps are then integrated into a single easily understood map using a novel approach, standardizing and facilitating the spatial management of data-limited fish stocks.


Introduction
Decades of overfishing have significantly impacted the diversity of North-East Atlantic ecosystems, and markedly reduced the numbers of large ray species [1][2][3]. The large size and low fecundity of elasmobranchs such as rays makes them especially vulnerable to fishing pressure [4][5][6]. As a result of high fishing effort, the Irish Sea now supports fewer of each species of ray, which are of smaller average size, and inhabit smaller areas, than ever before [1,7,8]. The study species-cuckoo ray Leucoraja naevus, thornback ray Raja clavata, blonde ray Raja brachyura and spotted ray Raja montagui-are classed as either "data-limited" (blonde ray) or have enough surveyed data to reveal trends (the other species) [9], based on data collected annually in the ICES (International Council for the Exploration of the Sea) IBTS (International Bottom Trawl Survey) [10]. Conservation efforts for these species are complicated by inconsistencies in how the stocks are managed, both spatially and temporally [11].
They are managed under a joint Total Allowable Catch (TAC) limit (with small-eyed (Raja microocellata), sandy (Leucoraja circularis) and shagreen ray (Leucoraja fullonica)) for the geographical stock areas of waters west, northwest and southwest of the UK and Ireland [12], and in the Irish Sea are mainly taken as bycatch to the TR1 métier (otter trawl and demersal seine with mesh size ≥100 mm) fleet, and also targeted by a dwindling Irish fleet of three vessels or fewer. The spatial distribution of fishing effort is relatively annually stable, principally centered around the

Relative Importance of Explanatory Variable Types
The summed influence score for environmental variables was twice as great as fish predator CPUE for the analysis of the juvenile subset ( Figure 1, sixth and subsequent bars), and eggcase-reducing agents (whelk catch rate and scallop effort) had only a minor influence (6.5%). There was no relationship between commercial fishing pressure (LPUE) and surveyed juvenile or adult females ray occurrence and CPUE (Figure 1). See Table 3 for a list of the explanatory variables used in this study, their spatial resolutions and sources.

Influential Variable Relationships and Predicted CPUE Maps
Predicted CPUE surfaces are presented in Figure 2, and Figure S2a-i for per-species binary and Gaussian bar charts of variable influence, and line plots of individual variable/response relationships for presence/absence and CPUE.

Cuckoo Ray
The presence/absence and CPUE of mature female cuckoo ray was best predicted by depth (>100 m), distance to shore (mostly increasing with distance offshore but greatest at 44 km), and current speed (especially >1 m s −1 ), similar to the results for all cuckoo rays found by [11] (see that paper's Supplementary Material). Highest CPUE was predicted in the middle of the deeper Central Channel, and in the North Channel.
The presence/absence of juveniles was best predicted by haddock CPUE (small ray presence peak at 12 haddock per hour, then a trough, then positive correlation, until reaching a plateau above 40 haddock per hour), current speed (presence rising sharply above 0.8 m s −1 ) and distance to shore (positive relationship, almost exponential in appearance, peak at 38 km then final peak and plateau above 48 km). Juvenile CPUE was best predicted by grain size (preference for sand/gravel), blonde rays (positive relationship), and haddock (positive relationship with peak ray CPUE at ~50 haddock per hour, then dropping ray CPUE with continually climbing haddock numbers; Figure S2e). While the model predicted high CPUEs in similar areas to adult females, CPUE was more strongly concentrated in smaller, more distinct central regions within the Central Channel and off the southwest coast of the Isle of Man.

Influential Variable Relationships and Predicted CPUE Maps
Predicted CPUE surfaces are presented in Figure 2, and Figure S2a-i for per-species binary and Gaussian bar charts of variable influence, and line plots of individual variable/response relationships for presence/absence and CPUE.

Cuckoo Ray
The presence/absence and CPUE of mature female cuckoo ray was best predicted by depth (>100 m), distance to shore (mostly increasing with distance offshore but greatest at 44 km), and current speed (especially >1 m s −1 ), similar to the results for all cuckoo rays found by [11] (see that paper's Supplementary Materials). Highest CPUE was predicted in the middle of the deeper Central Channel, and in the North Channel.
The presence/absence of juveniles was best predicted by haddock CPUE (small ray presence peak at 12 haddock per hour, then a trough, then positive correlation, until reaching a plateau above 40 haddock per hour), current speed (presence rising sharply above 0.8 m s −1 ) and distance to shore (positive relationship, almost exponential in appearance, peak at 38 km then final peak and plateau above 48 km). Juvenile CPUE was best predicted by grain size (preference for sand/gravel), blonde rays (positive relationship), and haddock (positive relationship with peak ray CPUE at~50 haddock per hour, then dropping ray CPUE with continually climbing haddock numbers; Figure S2e). While the model predicted high CPUEs in similar areas to adult females, CPUE was more strongly concentrated in smaller, more distinct central regions within the Central Channel and off the southwest coast of the Isle of Man. Model predicted CPUE (numbers per hour) of mature female rays based on environmental variables and fishing pressure (left column) and juvenile rays based on environmental variables, fishing pressure, direct removal of eggcases by scallop dredges, whelk LPUE, and CPUE of suitably sized cod, haddock, plaice, whiting, common skate and blonde ray (right column).

Thornback Ray
Mature female thornback ray presence/absence and CPUE were best predicted by salinity (increasing with increasing salinity), depth (greatest at~40 and 100-120 m) and temperature (strong preference for waters warmer than 16 • C). Highest CPUE of thornback rays was predicted in a band running from the southeastern tip of Ireland to Anglesey, and down across Cardigan Bay, as well as in the near-shore areas in both of the Eastern bays. While these findings concur with those of [11], bands of high CPUE were more pronounced.
Juvenile presence/absence and CPUE were best predicted by plaice CPUE (CPUE of thornback rays increasing with plaice CPUE, peaking at~200 plaice per hour and then dropping to a plateau) ( Table 1), temperature (preference rising sharply above 16 • C), and salinity (avoidance between 33.4-34.2 ppm, peak preference at 34.3-34.6 ppm; Figure S2g). Areas of peak predicted CPUE were in small distinct near-shore regions around Anglesey and Liverpool Bay.

Blonde Ray
Mature female blonde ray presence/absence and CPUE were best predicted by current speed (peak between 1 and 1.2 m s −1 ), depth (<30 m or >90 m) and temperature (preference for colder waters but also at 15 • C). High CPUE of blonde rays was predicted in two discrete patches within the Central Channel and just off the near-shore Irish banks.
Juvenile presence/absence and CPUE was best predicted by depth (aversion to the 40-65 m range), current speed (positive relationship), and temperature (positive relationship but with a sub-peak at the midrange 15.5 • C; Figure S2h). Predicted areas of high CPUE are very similar to those of mature females, though slightly higher in the southern Central Channel. The vertical line of high predicted abundance around 53 • N coincides with the bottom of an ICES rectangle. This plotting artifact occurs because the whelk and scallop data are recorded at the resolution of an ICES rectangle.

Spotted Ray
Mature female spotted ray presence and CPUE was best predicted by current speed (>0.9, peaking at 1.2 m s −1 ), depth (peaks at 75 and >100 m) and salinity (>34 ppm). The highest CPUEs of spotted rays were patchily distributed around the Central Channel, especially at the northern and southern extremes, and off Caernarfon Bay.
Presence/absence and CPUE of juvenile spotted ray was best predicted by plaice CPUE (ray CPUE peaks at 300 and 1050 plaice per hour, with a trough between) and salinity (peak at 34.4 ppm then high afterwards; Figure S2i). This resulted in a few specific patches of high CPUE being predicted in the spatial distribution. The spatial distribution for juveniles features distinct patches of high CPUE in the South-Eastern bays and off the sandbanks by the Central Channel, compared to adult females, which have smoothly blending gradients of predicted high CPUE.

Fish Predators and Eggcase Removers
Aside from the most important explanatory variables listed above, fish predator and eggcase removal predictors showed varying levels of influence on the presence/absence and CPUE of juveniles of these four rays (Table 1).
Larger rays (blonde and thornback, adult total lengths 114 and 105 cm respectively) are only significantly influenced by one predator/eggcase remover each, whereas the smaller rays (cuckoo and spotted, adult total lengths 71 and 80 cm respectively) are influenced by four (Table 1). Predicted CPUE surfaces are presented in Figure 2, and Figure S2a-i for per-species binary and Gaussian bar charts of variable influence, and line plots of individual variable/response relationships for presence/absence and CPUE. Table 1. Influence and relationship of eggcase removers and predators upon juvenile ray CPUE, represented by partial dependence plots produced by the model. The horizontal axes represent rising levels of the predictor variable, the vertical axes represent rising levels of the response variable, i.e., ray CPUE. Bar plots of variable influence, and full partial dependence plots for juvenile and mature female CPUE are available in Figure S2a- Table 1. Influence and relationship of eggcase removers and predators upon juvenile ray CPUE, represented by partial dependence plots produced by the model. The horizontal axes represent rising levels of the predictor variable, the vertical axes represent rising levels of the response variable, i.e., ray CPUE. Bar plots of variable influence, and full partial dependence plots for juvenile and mature female CPUE are available in Figure S2a-i.  Table 1. Influence and relationship of eggcase removers and predators upon juvenile ray CPUE, represented by partial dependence plots produced by the model. The horizontal axes represent rising levels of the predictor variable, the vertical axes represent rising levels of the response variable, i.e., ray CPUE. Bar plots of variable influence, and full partial dependence plots for juvenile and mature female CPUE are available in Figure S2a-i.

Cuckoo Ray
The combination map (see Materials and Methods for details of the construction of this figure) prominently featured high CPUE areas that were also evident in the original maps, such as in the North Channel and patches in the Central Channel. However, some areas of high-CPUE that were prominent in the original maps were less apparent in the combined map; for example, the high-CPUE areas in the Central Channel and North Channel were reduced in size. The various small peaks in juvenile CPUE were mostly downplayed unless they happened to overlap with mature female peaks ( Figure 3, including for subsequent species in this section).

Cuckoo Ray
The combination map (see Materials and Methods for details of the construction of this figure) prominently featured high CPUE areas that were also evident in the original maps, such as in the North Channel and patches in the Central Channel. However, some areas of high-CPUE that were prominent in the original maps were less apparent in the combined map; for example, the high-CPUE areas in the Central Channel and North Channel were reduced in size. The various small peaks in juvenile CPUE were mostly downplayed unless they happened to overlap with mature female peaks (Figure 3, including for subsequent species in this section).

Thornback Ray
Mature female thornback rays' CPUE peaks were mostly in offshore bands, whereas juveniles were small near-shore pockets around Cardigan Bay, Caernarfon Bay, and Anglesey: the combination map assimilates all of this information well.

Thornback Ray
Mature female thornback rays' CPUE peaks were mostly in offshore bands, whereas juveniles were small near-shore pockets around Cardigan Bay, Caernarfon Bay, and Anglesey: the combination map assimilates all of this information well.

Blonde Ray
Mature female blonde rays were solely concentrated in the middle of the Central Channel, off the various banks there; juveniles were found there but also in a low-CPUE patch offshore in the southern Central Channel. The mid-Central Channel CPUE peak dominated the combination map, with only one thin stripe of predicted high CPUE remaining in the Southern channel.

Spotted Ray
Mature female spotted rays were distributed in offshore bands fringing bays like thornback rays, and in a band in the North Channel and patches in the mid-Central Channel like cuckoo rays. The juveniles were distributed in discrete blocks in the Southern Irish Sea. The combination map therefore prioritized the areas where both original maps' CPUE peaks overlapped, which led to bands and patches in the Southern Irish Sea.
Amalgamating all eight predicted CPUE maps (juvenile and adult female for four species) into one ( Figure 4) revealed the extent to which their predictions overlapped. Only two core areas of universally high CPUE remained, notably the Central Channel North-South streak and the patches east of the Irish Banks, around 53.1 • N. Patches of moderate CPUE were clear from the various original maps, such as the offshore and North Channel bands from thornback and cuckoo maps. Mature female blonde rays were solely concentrated in the middle of the Central Channel, off the various banks there; juveniles were found there but also in a low-CPUE patch offshore in the southern Central Channel. The mid-Central Channel CPUE peak dominated the combination map, with only one thin stripe of predicted high CPUE remaining in the Southern channel.

Spotted Ray
Mature female spotted rays were distributed in offshore bands fringing bays like thornback rays, and in a band in the North Channel and patches in the mid-Central Channel like cuckoo rays. The juveniles were distributed in discrete blocks in the Southern Irish Sea. The combination map therefore prioritized the areas where both original maps' CPUE peaks overlapped, which led to bands and patches in the Southern Irish Sea.
Amalgamating all eight predicted CPUE maps (juvenile and adult female for four species) into one ( Figure 4) revealed the extent to which their predictions overlapped. Only two core areas of universally high CPUE remained, notably the Central Channel North-South streak and the patches east of the Irish Banks, around 53.1° N. Patches of moderate CPUE were clear from the various original maps, such as the offshore and North Channel bands from thornback and cuckoo maps.

Model Performance Metrics
For mature females, the high training data Area Under the Receiver-Operator-Characteristic Curve (AUC) scores indicate excellent discrimination of probabilities between presence and absence samples, i.e., the predictive power of the presence/absence model is high. This is also true for the Cross-Validated (CV) AUC scores, with relatively minor decrease from training to CV AUC scores, indicating that overfitting was not a serious issue and predictions were more likely to be correct [30].

Model Performance Metrics
For mature females, the high training data Area Under the Receiver-Operator-Characteristic Curve (AUC) scores indicate excellent discrimination of probabilities between presence and absence samples, i.e., the predictive power of the presence/absence model is high. This is also true for the Cross-Validated (CV) AUC scores, with relatively minor decrease from training to CV AUC scores, indicating that overfitting was not a serious issue and predictions were more likely to be correct [30]. These scores indicate that we can have confidence in the model predictions. See Table 2 for model  performance scores. For juveniles, more so than adult females, these scores indicate that overfitting was not an issue and predictions were likely to be correct. The discrepancies between the mature female and juvenile ( Figure 2) maps indicate that filtering full datasets into subcomponents does indeed allow those subcomponents to be discriminated spatially.
CV AUC standard error was better for the full dataset than mature females but the same as juveniles. While none of the models were affected by overfitting, overfitting was marginally lower for the juveniles than the full dataset, both of which were lower than the mature females.

Overview
The present study has shown that there are many factors that can influence the distributions of these four ray species, with all descriptor variables except common skate CPUE having an influence on the CPUE of at least one species. Furthermore, these factors worked differently across the four species and between immature and mature females. BRT spatial modeling proved capable of resolving differences in the spatial distribution of population components/subsets (i.e., mature females and juveniles), revealing the most influential factors affecting their distribution ( Figure 1, Table 1, and Figure S2a-i). Comparing results of the adult female against the juvenile analyses shows how species' habitat preferences develop over their lives, and comparing both sets of results to the whole-stock results from [11] demonstrates how disambiguating these subsets reveals subtle differences which the whole-stock analysis may blur.
High CPUE areas were clearly delineated for each species (Figure 2)-these are candidate areas where management could protect the stocks' reproductive potential. Scaling and combining predicted CPUE maps from different species' subsets led to combined maps (Figures 3 and 4) which are easy to understand and encompass the most important areas. We included all eight subsets in this analysis but the code allows users to omit subsets as desired.

Juvenile Rays and Teleost "Predators"
For juveniles, fish predator CPUE was a reasonably influential explanatory variable, and showed predominantly positive correlations with ray CPUE, however only a few individual predator species highly influenced ray CPUE. The positive correlation between ray and predator CPUEs-most apparent at the extremes of the variable response line plots ( Figure S2a-i)-may reflect shared habitat preferences, as rays, adult cod, haddock and whiting all prefer sand, gravel, and hard substrates [33]. Ray CPUE rapidly increased at moderate levels of predator CPUE, suggesting that as unfavorable environmental conditions improve, both rays and demersal teleosts benefit. It is unlikely that predator CPUEs are rising in response to rising ray CPUEs, as rays would need to comprise a majority of their predators' diets for this to be the case [34], which they do not [35]. For the teleost species recorded to predate upon young rays (cod, mackerel, tub gurnard [35]), such incidents are not common. While gut evacuation rates are likely to be accelerated for fragile, cartilaginous, often embryonic prey, conceivably lowering the likelihood of recording such predation events, it is also plausible that some of these species do not predate upon young rays or eggcases enough to drive behavioral change. If true, computed relationships between these larger teleosts and young rays would indeed be better explained by mutually beneficial habitat preferences or association with teleost spawning sites (for positive correlations), or resource partitioning (for negative correlations).

Influence of Fishing Pressure
While fishing mortality is often considered to be the key driver of population change in demersal stocks [36,37], ray fishery LPUE had little influence on surveyed ray CPUE (Figure 1), suggesting environmental and teleost variables are more important drivers of their current spatial distribution. However, the LPUE dataset only overlaps 28% of the ray trawl survey data, with juvenile rays only present in 9% of those overlap-area trawls, mature females in 7%. Subsequently, caution must be taken when extrapolating from the model's fishery LPUE/ray CPUE relationship. These rays may have adapted to the fleet by sheltering in near-shore refugia, or the fleet influence on ray distributions may reflect the low overlap between fleet and survey data. Alternately, commercial fishing pressure may be shared across the range of the scientific sampling data, driven by the movements of these fish. Distinctions in species diversity and individual body size have proven the existence of refugia in the Irish and Celtic Seas [36], but the fisheries-independent survey may fail to capture this for the same reasons the commercial fishery fails to capture rays in those areas: the depth is too low and/or the substrate too rocky to reliably trawl there.
Further, associations between ray and fishery distribution-which both have the potential to be highly temporally dynamic-may be muted by the across-year pooling of data, required for the BRTs to run; see Section 3.6 for more details. We know these species are mobile enough within the Irish Sea to mix into and out of refugia rather than being extirpated in fine spatial scales [3,5,8,38]. Before implementing spatial management based on this method, one should ground-truth such areas with directed sampling trips [39,40], incorporate additional datasets where possible (e.g., from the catch and discard observer program), or use manual testing/training splits to test predictive power. When implementing management for data-poor fisheries, such approaches to dedicated data collection will help to maximize the information returned from deployment of scarce monitoring resources.

Modelling Subsets
The predicted CPUE maps for mature female and juvenile rays ( Figure 2) shared similarities with CPUE maps generated using the full datasets [11], and simpler presence-only plots [29], but also revealed distinct localized CPUE peaks. Many areas of high predicted juvenile CPUE were small, distinct hotspots, some not seen in [11], like the Southern Central Channel hotspots for spotted ray. The predictive maps in [11] featured smooth areas of low-to-medium CPUE. This may be because the distributions of adult females, males and the juveniles differ significantly, but the full-dataset maps smooth these differences.
Modeling all of each ray species' data together [11] may mask distinct nuances of the subsets' individual distributions. The AUC scores reveal that predictions were marginally more reliable for the juvenile subset than the full dataset, and comparably reliable for the mature female subset, however the standard error was better for the full dataset, and overfitting was marginally reduced due to increased sample size. That CV AUC scores-the predictive power of the models-were similar or better for the subsetted datasets shows that modeling subsets improves our understanding of species distribution patterns, an important finding. Including relative measures of fishing effort, predation and egg case removal led to improved predictions of juvenile ray CPUE compared to a model of all rays based only on environmental predictors.

Amalgamated and Scaled Maps
Combining the predicted CPUE maps for the juvenile and mature female subsets into one "conservation map" per species facilitates access to the key results, and normalizing both maps to the same scale ( Figure 3) allows the importance of specific subsets to be selectively emphasized for each species. These maps can direct spatial conservation conversations between managers and stakeholders (especially fishermen) by pinpointing preferential closure sites and thus facilitating discussion and compromise between the two parties. Similarly scaling and amalgamating all eight original maps ( Figure 4) produced a single map that gave equal conservation importance to all species. This can again support species management by revealing the priority areas where spatial conservation (e.g., Marine Protected Areas (MPAs)) could be proposed in collaboration with stakeholders, possibly during the spawning season. This map (Figure 4) was similar to the maps of high juvenile and mature female CPUE used to create it (Figure 2), however hotspots were discrete and fragmented rather than appearing as large smooth bands and patches, and more evident than in [11], whose rudimentary "top 50% CPUE" cut-off masks nuances of spatial abundance.
Scaling conservation value maps allows knowledge of species-specific biology and threats to be incorporated into the final maps. While we normalized each maps' values to the same scale (1), the weightings could be adjusted based on conservation priorities for the species in question [41]. In the Irish Sea, blonde and cuckoo rays are considered particularly vulnerable [42] and could be given a higher weighting. Relatedly, the generation of both the per-species (Figure 3) and all-amalgamated maps ( Figure 4) affords marine managers the option to pursue spatial management options for the conservation-priority population components of single or multiple species, at their discretion, again depending on conservation priority status.

Representativeness and Uncertainty
While the survey data reasonably represented the range of environmental parameters in the Irish Sea, predicted CPUE maps should be interpreted according to how representative the underlying data could be (e.g., using Representativeness Surface Builder (RSB) maps: Figure S1). CPUE predictions in areas with poor RSB scores may be less reliable than for areas with good survey coverage. In this study, the annual autumn ray survey data from 1990 to 2014 were aggregated in order to ensure the models ran, and increase their statistical power and the spatial resolution of the output maps. The subsequent predictive CPUE maps therefore identify stock structures from data pooled across those years, relying on the assumption that the stock distributions are temporally stable, which may not be the case. Pooling other data (e.g., scallop, whelk, trawling) across years may also limit the model's ability to discern any influence of temporally dynamic variables, especially if the time ranges are not aligned, or if a lag period is required, e.g., in order to account for impacts on subsequent years' recruitment. Similarly, such pooling may obscure temporal lags between localized fishery LPUE and surveyed CPUE hotspots.
Again, ground-truthing of modeled CPUE hotspots, adding datasets, and/or further statistical evaluation of the results would be appropriate. Reassuringly, a quick analysis of annual CPUE for all rays showed there to be no clear trends, just fluctuation around the mean, suggesting there should be no stark "regime shifts" in abundance. Another approach would be to use years as an explanatory factor, which may reveal the importance of the inter-annual variation of other variables. With sufficient data one could also carry out sensitivity and/or trend based analyses to confirm or deny temporal stability. This is important as high CPUE and repeated use of nursery habitats are two of [43]'s criteria for defining nursery areas. Success in resolving temporality for these data (e.g., by bootstrapping) would also allow modeling of future scenarios such as climate change (e.g., [44]), feasibly predicting shifting abundance hotspots as environmental conditions change.

Spawning Grounds
We were able to identify areas where mature females of each species were abundant (Figure 2). Conservation measures to protect species often focus on spawning grounds in order to conserve reproductive potential. The study species spawn from March to July [45][46][47] whereas the beam trawl survey is conducted in September. Females are known to migrate to spawn (mostly inshore, cuckoo offshore [48][49][50]) therefore CPUEs estimated outside the peak spawning period may not be an ideal proxy for spawning grounds. This issue could be addressed either by conducting sampling during the known spawning season or exploring alternative datasets. The former could be expensive, and is therefore unlikely to be a realistic option for data-poor fisheries. Identification of aggregations of mature females outside of the spawning season may still provide valuable management options for protecting reproductive potential, as long as other management measures ensure that spawning success is maintained. This might include technical measures such as a maximum landing size obligation for females, designed to ensure the mature females are discarded unharmed.

Modelling, Biological and Socioeconomic Context
Delta log-normal BRTs demonstrate some clear benefits over more rudimentary and even highly advanced spatial analysis tools, and out-performed or matched 15 other such models when evaluated together [51,52]. The AUC scores and other metrics from this study showed that we can have confidence in these outputs. The utility of this method as a spatial analysis tool could be further enhanced by relating proposed spatial extent to biological parameters such as home range size, migration preference and spawning substrate preference [53][54][55] or by such proposals with fisheries stock science concepts such as MSY (e.g., [56]). Any use of these or similar findings would need to consider other stakeholders-such as fishermen from different métiers-and any existing or proposed closed areas. The socioeconomic impact of implementing such conservation areas should also be considered, especially the impact on fisheries [57]. Effort displaced from the conservation areas may also have negative consequences for the stocks [58]. This could initially be addressed by treating fleet impact and possibly other variables (e.g., temperature) as pressures rather than states, as is currently the case: using existing explanatory variables and stakeholder input and output calibrators in an iterative annual modeling process would expand this approach to be temporally dynamic and sensitive to shifting spatial patterns by the fleet, rays, and environmental variables. Quantified stakeholder preference and biologically-underpinned closed area proposals are incorporated in [56], which leverages the results of this conservation priority mapping approach into a Decision Support Tool designed to enfranchise stakeholders and facilitate management adoption of the approach detailed in this study.
If employed in fisheries management frameworks, this method could be complemented by technical measures designed to protect sensitive groups. These could include a minimum landing size for juveniles, and again a maximum landing size for mature females. In addition, conservation areas could have a temporal component, for example areas of high juvenile CPUE could be completely closed during certain times of year, while maximum landing size rules for females could remain in place at all times. This could be managed using a real-time closure system (e.g., US Pacific groundfish whiting fishery example in [59]).

Materials and Methods
Delta log-normal boosted regression trees split zero-inflated, long-tailed distribution data (most trawls caught nothing, and very few caught many specimens) into zero/non-zero catches, converted to 0 or 1, and log-normalized non-zero catches (CPUE) (see Figure 5, [11] and [28] for a more detailed explanation).
The model used the author's own function gbm.auto [60] with the following BRT parameter values: • Tree complexity, controlling variable interactions, of 2 or 15 for all juveniles, 2 or 6 for all mature females (whelk, scallop, and fish predators are not included in the female model). This allows us to evaluate whether all variables interacting provides a better model result than only two. • Learning rate, controlling the contribution each tree has to the model, of 0.01 and 0.005 for all rays bar mature female blonde rays where we used 0.01 and 0.001 as this subset has fewer data. Smaller rates processing slower but usually more accurately. • Bag fraction, specifying the proportion of randomly selected training/testing data, of 0.5, as used successfully in [11,28].
The model loops through all combinations of these parameters to find the best one, then tests whether that combination would perform better with any variables omitted (using the gbm.simplify function). The best combination is chosen based on performance metrics generated by the model, such as the training data correlation (the correlation between the whole dataset split into training and testing data; higher is better) and mean deviance (lower is better). Performance of the best binary models was quantified using the AUC statistic [61], calculated for the training and CV data used in the model. The discrepancy between the two indicates overfitting, which BRT modeling is designed to reduce with the k-fold (10) CV subroutine. Model validation is conducted by testing the predictive performance of the training-learnt BRT model object (the size of which is determined by the Bag Fraction) against the (remaining) testing data, before all data are used to construct the final model object which is used for prediction [28].
The model machine-learns the relationship between the explanatory variables and each of the two split sets of response variables. Probability of occurrence is predicted using a BRT model with a binomial distribution on the zero/non-zero catches, then abundance is predicted using a BRT model on the non-zero catches with a Gaussian distribution (under the assumption of normally distributed data once the zero-inflated and long tailed elements are accounted for). Predicted abundance values are then inverse log transformed and bias-corrected to avoid inaccuracies associated with using exponents [62]. The probability of occurrence is then multiplied by expected abundance to create predicted CPUE values for the whole study area, exported as csv files and maps. Bar plots of the most influential variables are created, as are partial dependence plots resulting from thousands of binary predictive splits whereby the model learns a more nuanced predictor/response relationship for each variable-for more see [11].

Database Selection and Processing
The substrate of the Irish Sea is mostly a sandy/gravel mix, generally finer close to shore, with rocks to the center/Eastern edges (off Anglesey) and a large mud bank along the Western edge (Dublin coast), with lower current speed there. Current speed at the bottom is low to moderate other than at a few headlands [63]. It is a well-mixed sea but a persistent front develops in the summer. The Irish Sea is a shallow-shelf sea with a 100 m-deep Central Channel and very shallow (≤5 m) sandbanks running along the Western edge (Irish coast), which create deeper channels 7-12 km from shore [24,63,64].

Database Selection and Processing
The substrate of the Irish Sea is mostly a sandy/gravel mix, generally finer close to shore, with rocks to the center/Eastern edges (off Anglesey) and a large mud bank along the Western edge (Dublin coast), with lower current speed there. Current speed at the bottom is low to moderate other than at a few headlands [63]. It is a well-mixed sea but a persistent front develops in the summer. The Irish Sea is a shallow-shelf sea with a 100 m-deep Central Channel and very shallow (≤5 m) sandbanks running along the Western edge (Irish coast), which create deeper channels 7-12 km from shore [24,63,64].  [11,38,68,69]. Neither standard linear correlations nor GAMs (conducted on these data in previous studies [11]) revealed stronger relationships between survey CPUE and any specific month of environmental variable data. Since the survey trawls were conducted in September, environmental data from September were used in this analysis. Correlated variables can be problematic for some modeling techniques, however unlike GLMs or simple regression trees, BRTs are robust to auto-correlated variables as they feature mechanisms (regularization and stochasticity processes [28]), which improves (minimizes) their overfitting [20,52,70]. [28] also find that overfitting does not compromise BRT predictions. Nonetheless, linear correlations between environmental variable pairs did not exceed an r 2 value of 0.37, and were generally 0.14 or below, limiting cause for concern.

Fishery LPUE
Including fishing pressure might improve predicted ray distributions, so ray LPUE data from demersal trawls operating in the Irish Sea were used as an explanatory variable for the juveniles and mature female CPUE analysis. These data were averaged across the available time period (2006)(2007)(2008)(2009)(2010)(2011)(2012) [9].

Whelk CPUE and Scallop Fishery Effort
Similarly, including processes that remove egg cases might improve predicted distributions of juvenile ray CPUEs. These species deposit fertilized eggs on the seabed, which may be removed by humans or natural predators. Scallop dredging occurs in the Irish Sea on sand and gravel [71][72][73], where rays can lay eggs, which scallop dredges are known to remove [74,75], so higher dredging effort should remove more eggcases. Scallop effort data were then binned into ICES rectangles and averaged across all available years. Whelks (Buccinum undatum) are strongly implicated in the predation of elasmobranch eggcases [76][77][78]. High surveyed whelk LPUE likely indicates high abundance and thus higher eggcases predation. Whelk LPUE data were binned into ICES rectangles and averaged across all available years.

Fish Predators' CPUE
To obtain the CPUE of potential predator species at the size range where they could potentially prey on newborn rays, we first needed to compute the size ranges of those species. The weight of an individual marine predator is generally about 100 times the weight of their prey items, plus or minus one order of magnitude [79]. Therefore, potential ray predators would be at least ten times the rays' birth weight. To calculate this, lengths at birth from [80] were converted to weights using equation W = a x Lb, where a and b are constants again from [80]. The weights were multiplied by ten (100 minus one order of magnitude), then converted back to lengths using the inverse of the previous equation, using predator-specific a and b values from [81], i.e., L = 10W . The resultant lengths are the hypothesised minimum sizes of the potential predatory species of age 0 rays. These sizes were used to filter the survey data for each combination of the four ray species and six potential fish predator species (five for blonde rays). Potential predators ("surveyed fish predator CPUE" in Table 3) were cod (Gadus morhua), haddock (Melanogrammus aeglefinus), plaice (Pleuronectes platessa), whiting (Merlangius merlangus), common skate (Dipturus flossada) and blonde ray (treated as a potential predator for all but blonde ray since these species spatially segregate by size to avoid cannibalism [82]). These species were chosen as a representative selection of the more common piscivores in the Irish Sea, since some of these and similar species have been recorded predating upon these young rays/eggcases in the Irish Sea and surrounding waters [35].

Ray Survey CPUE
Ray survey CPUE data were pooled across all available years as there were insufficient data in any one year for the model to run. Trawl midpoints were used as catch locations, many of which overlapped as the survey aims to resample sites in successive years (see Section 2.2). For the mature female analysis, data were limited to females at or above total length at first maturity-49 cm for cuckoo ray, 60 for thornback, 81 for blonde, and 52 for spotted ray [49,83]. For juvenile analysis, data were limited to rays ≤34/33/36/33 cm (respectively): the maximum lengths at age 1 calculated using the growth parameters of [80].
In order to create a single database of explanatory variables to be predicted to, explanatory variables were interpolated to the highest resolution dataset, which was depth points covering the whole Irish Sea. Distance to shore was calculated using raster proximity analysis in QGIS mapping software [84]. The substrate layer was converted from descriptive Folk classifications [85] into median grain size [86], allowing it to be treated as a continuous variable by the model. For the gridded data where the model predicts CPUE values, polygon layers (ray and whelk trawl catch rates, scallop effort) were appended to the depth points directly, while point layers were first converted to Voronoi polygons then appended. For the response variable dataset, explanatory variables were appended to the ray survey data points using the same approach.

Preliminary Analysis
Compared to the whole study area, the survey trawls under-sample extreme shallows, depths over 50 m, corresponding distances from shore (close and far, respectively), and areas of high current velocity [11]. This was not considered an issue as the model compares sampling frequency of variable ranges from the survey, to the whole study area, and produces "unrepresentativeness maps" using our RSB function (gbm.rsb) in R [87]. These describe the extent to which environmental conditions (depth, current speed etc.) across the study area are represented by the range of conditions encountered in the sampling survey and allows the user to assign a level of confidence to the outputs ( Figure S1).
Auto-correlation of sample data locations (all trawls) was checked to ensure that neighboring points were statistically independent [88,89] by analyzing the residuals of a GAM of CPUE as explained by latitude and longitude, which showed the residuals to be normally distributed (mgcv in R). A Mantel test (vegan in R) showed the model sufficiently accounted for spatial autocorrelation in the raw data, and that the residuals were not auto-correlated (Mantel correlation 0.078, p = 0.001) [89,90].

Modelling Approach
For the juvenile subset, the model was run twice for each species. Once with all six (five for blonde ray) fish predator CPUEs as individual variables, and then with those CPUEs combined. Based on performance metrics, individual variables outperformed combined variables in seven of the eight comparisons (four species' binary and Gaussian BRTs). This makes sense as BRTs can accommodate numerous explanatory variables without penalty [20,28]. We therefore kept variables separate.
To examine the relative importance of each variable in the model, the percentage influence values displayed on the bar plots of variable influence (top left panes in Figure S2a-i) were totaled for each individual variable and grouped into categories: environmental (temperature, salinity, distance to shore, current speed, depth, substrate grain size), fishery (fishery CPUE), potential fish predators (cod, haddock, plaice, whiting, common skate, blonde ray) and eggcase removers (whelk LPUE, scallop effort). This was done for the binary and Gaussian models, for each species (Figure 1).
Amalgamated predicted CPUE maps were initially created by adding the adult and juvenile maps' values together. Maximum CPUE was highest for spotted ray (8.9 fish per hour total; 8.1 for juveniles and 1.1 for mature females) and lowest for blonde ray (1.1 fish per hour total; 0.87 juveniles, 0.56 mature females). To avoid the most abundant subset (juvenile spotted rays) dominating the combined CPUE, we normalized both of the original maps' ranges to 1, then summed them to give a theoretical maximum of 2, presented as a percentage for ease of interpretation ( Figure 3). A similar process was used for Figure 4: each species' dataset for juveniles and adult females were scaled to 1 and summed to a total of 8, again presented as a percentage.

Conclusions
The study has shown that the distribution of mature female and juvenile rays in the Irish Sea can be predicted using an appropriate choice of covariates and available survey data. The analysis method chosen, Boosted Regression Trees, allowed us to model these distributions with many different covariates, but without over-fitting, a common problem with advanced regression methods. Differences in response and hence distribution between adult females and juveniles were also revealed by the use of population component subsets, opening up novel possibilities for nuanced spatial management to protect spawning and nursery areas; these analyses also improved the predictive power of the models. Normalizing and combining their maps into a single map makes the results of these analyses more accessible to users and managers, with the underlying code facilitating rapid processing of these outputs.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2410-3888/2/3/12/s1, Figure S1: RSB map of mature female and juvenile cuckoo, thornback, blonde and spotted rays. The product of comparing histograms of each explanatory variable's values at the response variable sampling sites, with the histograms of those explanatory variables across the whole study area. The modulus of the differences for each histogram bar region, for each variable, for both binary and Gaussian maps, are added, resulting in maps where higher numbers indicate higher total deviance from a representative coverage of the explanatory variable values at that site. These maps give a spatial representation of how well those bins of the environmental variables at a specific site (e.g., 20 m depth, 3 km from shore) are represented in the samples data, i.e., how much confidence one can have for abundance predictions at that site; Figures S2a-i: Bar plots of the relative influence of each explanatory variable to the predicted abundance maps of each species, showing which are the most important, and "partial dependence" line plots detailing the relationship of those variables to predicted ray CPUE. Only produced for Gaussian bars and partial dependence lines, as both binary and Gaussian lines tend to be similar, and only produced for variables exerting >5% influence, in order to reduce figure numbers from 180 to 62.