1. Introduction
Wildlife management actions typically require a relatively accurate estimate of the number of animals in a region [1]. For example, density and population size estimates may be used in semi-arid regions to assess carrying capacity of the forage resource in rangelands. Managers may also use estimates of population size to assess trends over time and to understand the mechanisms driving population dynamics.
One popular approach to estimating population size combines line transect surveys with distance sampling techniques [2]. Distance sampling assumes that animals closer to the transect line have a higher probability of being seen than animals farther from the survey route because of their relative proximity to the observers [3,4]. The perpendicular distance to each animal observed along the transect is recorded, and once the survey is completed, these estimated distances are compiled to develop a detection function for that species and landscape. The detection function mathematically models the probability of observing an individual as a function of its perpendicular distance from the transect line. The shape of the detection function is critical, as it allows for an estimate of how many animals were present but not detected along the survey route. Through distance sampling and subsequent analyses, natural resource managers can obtain an unbiased estimate of density and population size for the species of interest on their landscape [5,6].
However, a unique problem arises in dry regions that may violate an underlying assumption of transect surveys [3,7]. Distance sampling methods assume the transect line is placed randomly with respect to the distribution of objects [4], but this is not always the case. Desolate roads used for servicing artificial waterholes often double as transect survey lines, and these waterholes are crucial to the survival of wildlife species in dry landscapes such as those in Namibia and other regions in southern Africa [8]. Because a lack of resources, time, and manpower prevents transect lines from being placed randomly with respect to the wildlife on the landscape, surveyors must use the existing roads that connect the waterholes. The waterholes affect the dynamics of the ecosystem and lead to a nonrandom concentration of animals near the water source; we note that other resources such as sites for supplemental feeding would result in the same issue. Observers conducting the transect survey have a much higher probability of observing these large concentrations of wildlife immediately on the transect line than they would if the animals were away from the waterhole. Thus, the recorded observations from distance surveys in these locations include many individuals that are close to the transect line, which can positively bias (overestimate) the overall density estimate for the species [7].
Line transect surveys have been compared to other survey types including point surveys [7,9], strip surveys [2], and track surveys [10], but there is limited information available on assessments of biases that result from concentrations of animals near the transect line due to nonrandom transect placement. One such study conducted in Virginia, USA in 2014 found that white-tailed deer (Odocoileus virginianus) population density estimates were likely to be positively biased when using distance surveys along roads due to the roads’ nonrandom placement with respect to the target species and landscape [11]. The scenario we examine in southern Africa has similar potential for bias, but with one unique difference: waterholes cause an attraction of animals to the transect line at specific locations rather than along the entire transect.
Line transect surveys are typically the chosen wildlife survey method due to the relative ease with which they can be conducted and their ability to estimate population sizes for multiple species that are visible on open landscapes. Although analyses based on distance sampling are robust and can account for many sources of error [4], factors such as the use of nonrandom transect routes when conducting distance surveys on pre-established roads can result in biases [12]. Again, these biases can occur due to the violation of the distance sampling assumption that transects are placed randomly with respect to the objects and landscape. Because it is difficult to assess bias in field-based studies, landscape simulations have been used in the past to evaluate biases related to distribution of habitat and animal movements [13,14].
The purpose of this study was to quantify the magnitude of sampling bias caused by transects placed nonrandomly with respect to resources such as waterholes and to evaluate practical solutions for wildlife managers. To achieve this, we used oryx (Oryx gazella) as a model species to simulate populations on virtual landscapes with and without waterholes. By focusing on the unique challenges of arid environments, where waterholes attract wildlife to specific locations along the very roads used as survey transects, we aimed to identify and test correction methods that provide more reliable population estimates. We predicted that waterhole-driven clumping would significantly violate distance sampling assumptions, leading to an overestimation of density, and that our proposed correction methods would effectively mitigate this positive bias.
2. Materials and Methods
2.1. Study Area
Our study and simulation model were based on empirical data (Figure 1) collected at the NamibRand Nature Reserve, which covers 172,200 hectares (1722 sq. km) in southwest Namibia (latitude: −24.999825, longitude: 16.000376) [15]. The reserve staff use distance sampling during the annual game count to survey 10 transect routes by road, each with a length of approximately 50–60 km. The landscapes at NamibRand are managed to minimize human impacts on the reserve, and the road network is kept minimal to protect fragile soil surfaces along the Namib Desert. The landscape and terrain consist of savanna grasslands, sand and gravel plains, inselbergs, small mountain ranges, and vegetated dune belts. While the landscape and terrain do differ slightly across the reserve, we assume the same pattern of detectability for the entire transect survey area for the purposes of this study.
Namibia experienced severe drought conditions from 2013 to 2019, with annual precipitation 92–296 mm below the yearly average in every year except 2014 [16]. During drought conditions, managers of the reserve are especially aware of the importance of density estimates for grazing wildlife species when planning for forage availability, which decreases with the lack of precipitation. NamibRand is home to a variety of animal and plant species that are native to southern Africa, with oryx and springbok (Antidorcas marsupialis) as the most prevalent animal species [15].
2.2. Analysis of Empirical Data
We needed to establish real-world oryx densities and detection probabilities (the probability of a surveyor detecting an oryx at a given distance from the survey transect) based on actual wildlife surveys to use in our simulations. We used oryx sightings from the 2016 game count at NamibRand to estimate density and determine the best model with which to describe detection probabilities of oryx. Surveys for wildlife are conducted each year during late May; transect lines are driven starting 30 min after sunrise. Each management zone is surveyed by a different crew, which determines the distance of each animal from the transect by trained estimation without binoculars. We analyzed the oryx data (n = 1778 observations across the entire reserve) using four detection models in Program DISTANCE [17]: half-normal, uniform, hazard rate, and negative exponential decay curves. We allowed for a cosine adjustment term on the four detection models and used Akaike's Information Criterion (AIC) [7,18] to determine the model that best described the distribution of observations away from the transect line. The negative exponential decay model most closely represented the detection of oryx in the 2016 game count. We then used the negative exponential model in the analyses of our simulated survey data to establish patterns of detection probability and to estimate density and population size.
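For orientation, the negative exponential detection function and the standard line transect density estimator can be written as below. This is the textbook form without the cosine adjustment term. The symbols \lambda (scale parameter), w (truncation distance), L (transect length), n (number of detections), and A (study area) are notation introduced here for illustration, not symbols or fitted values reported from the NamibRand analysis.

```latex
\[
g(x) = \exp\!\left(-\frac{x}{\lambda}\right), \qquad 0 \le x \le w, \qquad g(0) = 1
\]
\[
\mu = \int_0^{w} g(x)\,dx = \lambda\left(1 - e^{-w/\lambda}\right)
\]
\[
\hat{D} = \frac{n}{2 L \hat{\mu}}, \qquad \hat{N} = \hat{D}\,A
\]
```

Here \mu is the effective strip half-width. Extra detections at x = 0 make the fitted curve steeper, which shrinks \hat{\mu} and inflates \hat{D}.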
2.3. Simulation Methods
Our simulations involved two steps. First, we created a population of oryx on the simulated landscape using a Microsoft Excel spreadsheet (v. 2016) in which each row was an individual animal. Columns provided information about the distance of each animal from the transect and the animal’s detection status (detected/not detected). In the real scenario on which we based our study, the actual population size is not known; however, in these simulations we were able to set the true population size ourselves and then determine which analysis produced the estimate most closely aligned with that known value. With that in mind, we simulated a population of 750 oryx, based on the number of oryx seen within 500 m of transects in each management zone (Figure 2) at NamibRand during the 2016 game count. Thus, our simulated populations were similar to the real-life scenario being modeled given the length and width of our simulated landscape. Although oryx are often sighted in groups, we simplified our model and distributed animals as individuals on the landscape.
Our simulated landscape was 1000 m wide and 50 km long (5000 ha), with a transect straight through the middle (500 m width on each side; Figure 2) to mimic a single management zone surveyed at NamibRand. We randomly assigned each oryx a location with a perpendicular distance away from the transect (x-value) and a position along the transect (y-value).
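The population step can be sketched in Python rather than a spreadsheet. This is a minimal illustration, not the study's Excel workbook: the function and constant names are hypothetical, and it assumes both coordinates are drawn from uniform distributions, which the text describes only as random assignment.

```python
import random

# Landscape dimensions from the text; the names are illustrative.
HALF_WIDTH_M = 500            # 500 m on each side of the transect
TRANSECT_LENGTH_M = 50_000    # 50 km transect
N_ORYX = 750                  # simulated population size


def simulate_population(n=N_ORYX, seed=None):
    """Return a list of (x, y) tuples, one per simulated animal.

    x: perpendicular distance from the transect (m), 0..HALF_WIDTH_M
    y: position along the transect (m), 0..TRANSECT_LENGTH_M
    Assumption: both coordinates are uniform random draws.
    """
    rng = random.Random(seed)
    return [
        (rng.uniform(0, HALF_WIDTH_M), rng.uniform(0, TRANSECT_LENGTH_M))
        for _ in range(n)
    ]


population = simulate_population(seed=1)
```

Fixing the seed makes a run reproducible, which helps when comparing scenarios built from the same starting population.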
The second step in our simulation was to conduct a survey. We used the detection probabilities from the 2016 game count data to set discrete regions with increasing distance from the transect and decreasing levels of detection probability on our own landscape. We assigned individual animals that were on the transect line a detection probability of 1.0 due to the underlying assumption of distance sampling that objects directly on the line are detected with certainty [4,17,19].
The probability of detecting an oryx in our simulated survey declined in 100 m intervals out to the maximum distance of 500 m from the transect. Based on the negative exponential detection probability model established with the analyses of empirical data, the detection probabilities and their respective distances in our simulation model were: 0.88 for 1–100 m, 0.70 for 100–200 m, 0.58 for 200–300 m, 0.45 for 300–400 m, and 0.35 for 400–500 m. We determined which of the simulated animals were “seen” by observers in our model by comparing each of the 750 individuals’ zone-specific detection probability with a randomly generated number between 0 and 1.0. If the random number was less than the detection probability assigned to an individual, we categorized that individual as observed. We imported the animals that were observed and their respective x-values (perpendicular distance from transect) into Program DISTANCE. This initial analysis from a landscape with no waterholes and no intentional clumping provided a baseline (S0) of estimated population sizes and animal densities.
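The detection rule described above can be sketched as follows. The bin probabilities come from the text; the function names are hypothetical, and treating each upper bin edge as inclusive is an assumption, because the text does not say how animals exactly on a boundary (for example 100 m) were handled.

```python
import random

# (upper bin edge in m, detection probability), as listed in the text.
DETECTION_BINS = [(100, 0.88), (200, 0.70), (300, 0.58), (400, 0.45), (500, 0.35)]


def detection_probability(distance_m):
    """Zone-specific detection probability for a perpendicular distance."""
    if distance_m == 0:
        return 1.0  # animals on the line are detected with certainty
    for upper_edge_m, prob in DETECTION_BINS:
        if distance_m <= upper_edge_m:  # assumption: upper edge inclusive
            return prob
    return 0.0  # outside the 500 m survey strip


def is_detected(distance_m, rng):
    """An animal is 'seen' when a uniform draw falls below its probability."""
    return rng.random() < detection_probability(distance_m)


rng = random.Random(42)
distances = [rng.uniform(0, 500) for _ in range(750)]  # illustrative distances
observed = [d for d in distances if is_detected(d, rng)]
```

With uniformly distributed distances, the mean detection probability across the five bins is 0.592, so roughly 444 of 750 animals would be expected to be observed in a typical run.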
This initial control (S0) simulated survey, without waterholes or clumped animals, produced a satisfactory estimate (<10% difference) of the known simulated population size (750 individuals); therefore, we proceeded with adjustments to the simulation to assess the potential for bias caused by clumping at waterholes. We then simulated a new landscape with two waterholes located at 12.5 km and 37.5 km along the 50 km transect (Figure 2). We used trigonometry to calculate the distance of each oryx to the waterholes in the initial simulation.
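One way to compute these distances is the straight-line (Euclidean) distance from each animal to each waterhole, with the waterholes placed on the transect (x = 0). The sketch below assumes that reading of "trigonometry"; the function name is hypothetical.

```python
import math

# Waterhole positions along the transect (m); both sit on the line (x = 0).
WATERHOLES_Y_M = (12_500, 37_500)


def nearest_waterhole(x_m, y_m):
    """Return (distance_m, waterhole_y_m) for the closest waterhole.

    x_m: perpendicular distance from the transect
    y_m: position along the transect
    """
    return min((math.hypot(x_m, y_m - wy), wy) for wy in WATERHOLES_Y_M)


# An animal 300 m off the line and 400 m past the first waterhole
# is 500 m from it (a 3-4-5 right triangle).
dist_m, waterhole_y_m = nearest_waterhole(300.0, 12_900.0)
```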
We created three scenarios in which the closest 5%, 10%, and 20% of data points were moved to their respective waterhole (simulations S5, S10, and S20, respectively). We created this gradient of waterhole use to represent the variability in drinking and grazing behavior that may exist in different real-life scenarios. We assigned the selected individuals a new perpendicular distance of zero meters and a detection probability of 1.0. As in the real-life scenario, the waterholes in the simulation are directly along the transect line with all animals at the waterhole detected. Individuals not at the waterholes maintained their same detection status as in the first simulation, and we imported each of the new simulation sets into Program DISTANCE to estimate population sizes and densities. We defined bias in the population size estimate as the difference between the estimate from the initial control simulation (S0, no waterholes or clumping) and the population size estimates resulting from the simulations with waterholes. In this way, we were able to evaluate the influence of levels of clumping on bias in population size estimates.
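The clumping step can be sketched as below. It is one reading of the procedure, under stated assumptions: animals are ranked across the whole landscape by distance to their nearest waterhole, and the number moved is rounded to the nearest integer (the text does not say how fractional counts such as 5% of 750 were handled). All names are hypothetical.

```python
import math

WATERHOLES_Y_M = (12_500, 37_500)  # waterholes lie on the transect (x = 0)


def clump_at_waterholes(animals, fraction):
    """Move the closest `fraction` of animals onto their nearest waterhole.

    animals: list of (x_m, y_m) tuples.
    Returns a list of (x_m, y_m, at_waterhole) tuples in the original order.
    """
    def nearest(x, y):
        return min((math.hypot(x, y - wy), wy) for wy in WATERHOLES_Y_M)

    n_move = round(fraction * len(animals))  # assumption: simple rounding
    # Rank animal indices by distance to their nearest waterhole.
    ranked = sorted(range(len(animals)), key=lambda i: nearest(*animals[i])[0])
    to_move = set(ranked[:n_move])

    clumped = []
    for i, (x, y) in enumerate(animals):
        if i in to_move:
            # Perpendicular distance becomes zero; detection is then certain.
            clumped.append((0.0, float(nearest(x, y)[1]), True))
        else:
            clumped.append((x, y, False))
    return clumped


animals = [(10.0, 12_400.0), (400.0, 25_000.0), (50.0, 37_600.0), (300.0, 1_000.0)]
result = clump_at_waterholes(animals, 0.5)
```

In this toy example the first and third animals are the two closest to a waterhole, so they are moved onto the line while the other two keep their positions.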
We used only one simulation to compare the S0, S5, S10, and S20 scenarios (Figure 3) because we were using large sample sizes of simulated animals to establish the distribution of detected animals in each scenario. Our population size estimates from each scenario should be interpreted as a relative representation for the scenario, as our approach does not provide information about the variability that could result from each scenario. However, we anticipated very little change in the overall distribution of simulated animals and density estimates because of our robust sample size. Instead, our goal for this modeling exercise was to assess the relative difference between scenarios.
2.4. Correction Methods
We then explored ways to correct for the bias caused by clumping of animals at waterholes during surveys. Our two correction methods were to (1) censor data from the survey section near waterholes or (2) redistribute the concentrated animals to where they might have been located in the absence of the attraction of the waterhole. The censor method is currently in use in some areas of Namibia, as it addresses the violation of the distance sampling assumption of random transect placement by eliminating the nonrandom data, and we proposed the redistribution method as an attempt to still account for those individuals in the total count. Because the transects are roads, it is not feasible to modify their placement on the landscape to satisfy the distance sampling assumption that transects are randomly placed with regard to individuals on the landscape. With that in mind, we redistributed these individuals at random locations across the landscape in an attempt to best represent the assumption of random transect placement given the constraints on the setup.
We implemented the censor method by removing the animals observed at the waterholes as well as the transect lengths corresponding to those sections of survey route from where the simulated animals were pulled to clump near the waterhole. We imported the new data into Program DISTANCE and recorded the population estimates and densities. This process was completed for the simulated distributions of animals in the S5, S10, and S20 scenarios, which resulted in censor scenarios S5-C, S10-C, and S20-C.
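A rough sketch of the censor step follows. It rests on an interpretation the text does not spell out: the censored section for each waterhole is taken to span the original along-transect positions of the animals moved to it, and unmoved animals inside that span are dropped with the section. The function name and record layout are hypothetical.

```python
def censor_waterhole_sections(records, transect_length_m):
    """Drop waterhole animals and the transect sections they were drawn from.

    records: list of (x_m, original_y_m, waterhole_y_m) tuples, where
             waterhole_y_m is None for animals that were never moved.
    Returns (kept_distances_m, effective_length_m).
    """
    # One section per waterhole: min and max original position of moved animals.
    sections = {}
    for _, y, wh in records:
        if wh is not None:
            lo, hi = sections.get(wh, (y, y))
            sections[wh] = (min(lo, y), max(hi, y))

    def in_censored_section(y):
        return any(lo <= y <= hi for lo, hi in sections.values())

    # Keep only unmoved animals that lie outside every censored section.
    kept = [x for x, y, wh in records if wh is None and not in_censored_section(y)]
    censored_m = sum(hi - lo for lo, hi in sections.values())
    return kept, transect_length_m - censored_m


records = [
    (0.0, 12_400.0, 12_500),   # moved to the first waterhole
    (0.0, 12_700.0, 12_500),
    (0.0, 37_450.0, 37_500),   # moved to the second waterhole
    (0.0, 37_600.0, 37_500),
    (480.0, 12_500.0, None),   # unmoved, but inside a censored section
    (120.0, 20_000.0, None),   # unmoved, outside censored sections
]
kept, effective_length_m = censor_waterhole_sections(records, 50_000)
```

The shortened transect length matters as much as the removed animals: the density estimate divides detections by surveyed length, so both must be reduced together.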
We moved all animals from the waterhole in the S5, S10, and S20 scenarios to new, random distances from the transect line to implement the redistribution correction method. After redistribution, individual animals were determined to be detected based on the same detection probabilities used for our base simulations. We imported these new data sets into Program DISTANCE, performed the analysis, and recorded the resulting population and density estimates for redistribution scenarios S5-R, S10-R, and S20-R. Our last step was to compare results from the two correction methods to determine which method led to population estimates that were the most similar to the initial simulation data (S0) from a landscape without waterholes.
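The redistribution step can be sketched as follows: each animal flagged at a waterhole receives a new random perpendicular distance and is then re-surveyed with the same binned detection probabilities. Uniform redistribution across the 500 m half-width and inclusive upper bin edges are assumptions, and the names are hypothetical.

```python
import random

HALF_WIDTH_M = 500
# (upper bin edge in m, detection probability), as listed in the text.
DETECTION_BINS = [(100, 0.88), (200, 0.70), (300, 0.58), (400, 0.45), (500, 0.35)]


def detection_probability(distance_m):
    """Zone-specific detection probability for a perpendicular distance."""
    if distance_m == 0:
        return 1.0
    for upper_edge_m, prob in DETECTION_BINS:
        if distance_m <= upper_edge_m:
            return prob
    return 0.0


def redistribute_and_survey(records, rng):
    """Redistribute waterhole animals and re-run detection for them only.

    records: list of (x_m, at_waterhole, detected) tuples.
    Returns a list of (x_m, detected) tuples in the same order.
    """
    surveyed = []
    for x_m, at_waterhole, detected in records:
        if at_waterhole:
            x_m = rng.uniform(0, HALF_WIDTH_M)                    # new random distance
            detected = rng.random() < detection_probability(x_m)  # re-survey
        surveyed.append((x_m, detected))
    return surveyed


records = [(0.0, True, True), (250.0, False, True), (430.0, False, False)]
surveyed = redistribute_and_survey(records, random.Random(7))
```

Animals that were never at a waterhole keep their original distance and detection status, matching the treatment in the uncorrected scenarios.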
3. Results
Our density analysis from the baseline simulation (S0) with 750 oryx and no waterholes resulted in 452 of the 750 oryx observed with a resulting population estimate of 798 individuals (95% CI: 698–912; density estimate = 0.160/ha, 95% CI: 0.140–0.182). The true population size was only 6.7% (48 individuals) different from our estimate and within its 95% confidence interval. Our scenarios with animals clumped at waterholes confirmed the alterations of the distribution of animals (Figure 3), which led to bias in the estimates of density and population size. Our estimates were positively biased, exceeding the known population size of 750 by 67.1%, 603.9%, and 966.8% in scenarios S5, S10, and S20 (5%, 10%, and 20% of the individuals moved to waterholes), respectively (Table 1).
Our censor correction method applied to S5, S10, and S20 resulted in population estimates within 4.1%, 1.3%, and 6.9% of the true population size, while the redistribution method provided estimates that were within 8.3%, 6.8%, and 11.7% of the true size (Table 1). For all six of the correction simulations, the distributions of animals used for analysis were similar to the S0 scenario’s distribution (Figure 3 and Figures S1 and S2). Further, the true population size of 750 oryx fell within the 95% confidence interval of population size from the corrected simulations. For comparison, none of the confidence intervals for estimates of population size from the uncorrected simulations S5, S10, and S20 included the true value of 750 oryx (Table 1).