Article

An Enhanced Spatial Capture Model for Population Analysis Using Unidentified Counts through Camera Encounters

by Mohamed Jaber, Farag Hamad †,‡, Robert D. Breininger and Nezamoddin N. Kachouie *
Department of Mathematics and Systems Engineering, Florida Institute of Technology, Melbourne, FL 32901, USA
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
‡ Current Affiliation: Department of Mathematics, Benghazi University, Benghazi, Libya.
Axioms 2023, 12(12), 1094; https://doi.org/10.3390/axioms12121094
Submission received: 8 August 2023 / Revised: 18 November 2023 / Accepted: 23 November 2023 / Published: 29 November 2023
(This article belongs to the Special Issue Stochastic and Statistical Analysis in Natural Sciences)

Abstract

Spatial capture models are broadly used for population analysis in ecological statistics. Spatial capture models for unidentified individuals rely on data augmentation to create a zero-inflated population. The unknown true population size can be considered as the number of successes of a binomial distribution with an unknown number of independent trials and an unknown probability of success. The augmented population size is a realization of the unknown number of trials and is recommended to be much larger than the unknown population size. As a result, the probability of success of the binomial distribution, i.e., the unknown probability that a hypothetical individual in the augmented population belongs to the true population, can be obtained by dividing the unknown true population size by the augmented population size. This is an inverse problem, as neither the true population size nor the probability of success is known, and the accuracy of their estimates strongly relies on the augmented population size. Therefore, the estimated population size in spatial capture models is very sensitive to the size of the zero-inflated population and, in turn, to the estimated probability of success. This is an important issue because a spatial capture model is a typical count model with censored data (individuals are unidentified and/or undetected). Hence, in this research, we investigated the sensitivity and accuracy of the spatial capture model to address this problem with the objective of improving the robustness of the model. We demonstrated that the estimated population size using the proposed enhanced capture model was more accurate in comparison with the previous spatial capture model.

1. Introduction

Population analysis based on spatial sampling is an emerging field of research due to its broad range of applications. Some important applications of spatial population analysis are to preserve endangered species and to control invasive species. One of the first steps required to manage a population is estimating the population size. Several methods have been developed to estimate the abundance of animals [1,2,3]. A standard approach is counting the members of the population in a random sample to estimate the population size. Population density can be then estimated by dividing the population size by the area of the animals’ habitat. The spatial capture methods can be split into two major groups, capture–removal methods and capture–recapture methods, as demonstrated in Figure 1. In capture–removal methods, the individuals that are captured will be counted, recorded, and then removed from the habitat.
In contrast, in capture–recapture methods, the individuals that are captured will be counted, recorded, and then released back into the habitat. Capture–recapture methods are commonly used in ecological statistics for estimating population size [1,3,4,5,6,7]. Capture–recapture methods can be split into two sub-groups, capture–mark–recapture and capture-unidentified individuals. The distinction between the two sub-groups is whether the captured individuals are marked, identified, and released, or whether individuals are virtually captured without being identified. The intuition behind the capture–mark–recapture method is that the proportion of the individuals that were captured, marked, and released in the first sample and were then recaptured in the second sample can be used to estimate the population size [1]. In the capture-unidentified-individuals method, animals are virtually captured without being identified, obtaining only the number of encountered individuals.
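The proportion argument behind capture–mark–recapture can be made concrete with the classical two-sample (Lincoln–Petersen) estimator. The text above does not name this estimator, so it is shown here only as one standard instance of the idea; the function name and the numbers are hypothetical.

```python
def lincoln_petersen(marked_first: int, caught_second: int, recaptured: int) -> float:
    """Two-sample estimate of population size.

    The fraction of marked animals in the second sample (recaptured / caught_second)
    is assumed to match the fraction of marked animals in the whole population
    (marked_first / N), which gives N ~ marked_first * caught_second / recaptured.
    """
    if recaptured <= 0:
        raise ValueError("at least one recapture is needed for this estimator")
    return marked_first * caught_second / recaptured


# Hypothetical survey: 20 animals marked, 30 caught later, 12 of those already marked.
print(lincoln_petersen(20, 30, 12))  # 50.0
```

With unidentified individuals no recapture count is available, which is why the capture-unidentified approach needs the spatial count model described in the Methods section.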
Capture–mark–recapture methods rely on the physical capture of individuals to collect the individual encounter history. Due to technological advances, the ability to capture individuals has been improved through more efficient methods, such as camera traps [8], acoustic recordings, and DNA samples [9,10]. Individuals can be virtually captured using their signs without being identified. Camera traps can be effectively used for the virtual capture of individuals. The inherent characteristics of sampling using camera traps include the following: (1) the same individual can be captured multiple times by the same camera, (2) the same individual can be captured by different cameras, (3) often, only a subset of individuals will be captured, and (4) captured individuals are not identified.
Conventionally, capture methods do not use spatial information about the captured individuals. Advanced spatial capture models have been implemented in the past decade [11,12,13,14,15,16]. A spatial capture model that has been broadly used was introduced by Chandler and Royle [17]. This model yields promising outcomes using unmarked or partially marked individuals to estimate the population size and density. However, due to the complexity of the spatial capture-unidentified approach, the error in the estimated population size and density could be prohibitively high. An open-ended problem is to address the shortcomings of this method and make the model robust regarding spatial sampling and spatiotemporal population analysis. The objective of this study is to improve the estimated population size by reducing the estimation error to make it robust. To this end, the proposed method performs the following tasks:
(1) Employs a prior distribution for the essential parameter of the zero-inflated population;
(2) Regularizes the Markov chain Monte Carlo (MCMC) by controlling the effective sample size;
(3) Reduces the order of the chain by controlling the correlation of generated samples through Gibbs sampling [18,19,20].

2. Methods

The capture–recapture method is one of the most common methods to estimate the population size. The capture–recapture method has its strengths and limitations based on the species, habitat, and available resources.

2.1. Hierarchical Spatial Capture–Recapture Model

Hierarchical spatial capture–recapture (HSCR) is a statistical model widely used in wildlife ecology to estimate population size [11]. It combines information from multiple trapping sites and accounts for spatial dependences among the captured encounters to estimate the population size [21].
In this model, each individual is associated with an unknown spatial activity center and home range radius. Considering population size $N$, there are $N$ unknown activity centers. It is assumed that individuals in the same habitat have the same unknown home range radius. Individual encounters in the study area are recorded using camera traps. The schematic of the capture–recapture model [22] is depicted in Figure 2 and is discussed below.
Using distance sampling, the distance between the camera locations and the unknown center of activities is reflected in the model. It is assumed that an individual $i$ has a fixed center of activity defined with the coordinates $s_i = (s_x, s_y)$, where $i = 1, 2, \ldots, N$, and $N$ centers of activities are randomly distributed over the area of study $S$. A bivariate uniform prior is used to model the unknown activity center $s_i$:
$$s_i \sim \mathrm{Uniform}(S).$$
There are $J$ camera locations; each is defined by the coordinate $x_j$, $j = 1, 2, \ldots, J$. Notice that an individual can be detected at multiple cameras and/or at multiple times by the same camera during a sampling occasion. A Poisson distribution is used to model a camera encounter history $z_{ijk}$ for individual $i$, at camera $j$, on occasion $k$:
$$z_{ijk} \sim \mathrm{Poisson}(\lambda_{ij}),$$
where $\lambda_{ij}$ is the encounter rate for individual $i$ at camera $j$. The expected number of the captures or detections of an individual $i$ at camera $j$, which is a function of the Euclidean distance between activity center $s_i$ and the camera location, $d_{ij} = \lVert s_i - x_j \rVert$, is
$$\lambda_{ij} = \lambda_0 \, g_{ij},$$
where $\lambda_0$ is the baseline encounter rate, and $g_{ij}$ is a function of the distance which monotonically decreases and is modeled using a half Gaussian function:
$$g_{ij} = \exp\left(-\frac{d_{ij}^2}{\sigma^2}\right),$$
where $\sigma$ is a scale parameter and will be estimated using the collected data. If an individual can be captured at most once during a sampling occasion, the encounter history takes binary values; that is, $z_{ijk}$ takes a value of one if the individual $i$ is captured, or zero otherwise. However, an individual can be captured more than once during a sampling occasion. In this case, $z_{ijk}$ will be the number of times that the individual $i$ has been encountered at camera $j$ on occasion $k$. Therefore, a $(J \times K)$ encounter history matrix is considered for each individual. The capture histories $z_{ijk}$ cannot be directly observed for unmarked individuals. A data augmentation method has been implemented to estimate the unknown population size. The number of camera encounters at camera $j$ on occasion $k$ is modeled by
$$n_{jk} = \sum_{i=1}^{N} z_{ijk}.$$
The full conditional latent encounter data are defined by a multinomial distribution:
$$(z_{1jk}, z_{2jk}, \ldots, z_{Njk}) \sim \mathrm{Multinomial}(n_{jk};\, \pi_{1j}, \pi_{2j}, \ldots, \pi_{Nj}),$$
where $\pi_{ij} = \lambda_{ij} / \sum_{i=1}^{N} \lambda_{ij}$. The camera encounter counts are modeled using a Poisson distribution:
$$n_{jk} \sim \mathrm{Poisson}(\Lambda_j),$$
where
$$\Lambda_j = \lambda_0 \sum_{i=1}^{N} g_{ij}.$$
The total number of camera encounters at camera $j$ over the $K$ occasions can be obtained by
$$n_{j\cdot} = \sum_{k=1}^{K} n_{jk}.$$
Because $\Lambda_j$ does not depend on the occasion $k$ and the counts are independent across occasions,
$$n_{j\cdot} \sim \mathrm{Poisson}(K \Lambda_j).$$
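As a rough numerical illustration of the encounter model above, the sketch below evaluates the encounter rates $\lambda_{ij}$, the per-camera rates $\Lambda_j$, and the expected totals $K\Lambda_j$ for a toy layout. The kernel is written as $\exp(-d_{ij}^2/\sigma^2)$, which is an assumption about the exact scaling of the half-Gaussian; the function names, activity centers, camera coordinates, and parameter values are all hypothetical.

```python
import math


def encounter_rate(center, camera, lam0, sigma):
    """lambda_ij = lam0 * g_ij, with g_ij = exp(-d_ij^2 / sigma^2) (assumed kernel scaling)."""
    d2 = (center[0] - camera[0]) ** 2 + (center[1] - camera[1]) ** 2  # squared Euclidean distance
    return lam0 * math.exp(-d2 / sigma ** 2)


def camera_rates(centers, cameras, lam0, sigma):
    """Lambda_j = sum over individuals i of lambda_ij, one value per camera j."""
    return [sum(encounter_rate(s, x, lam0, sigma) for s in centers) for x in cameras]


# Hypothetical layout: three activity centers and two cameras.
centers = [(0.0, 0.0), (1.0, 1.0), (3.0, 0.5)]
cameras = [(0.0, 0.0), (2.0, 1.0)]
K = 5  # number of sampling occasions

Lambda = camera_rates(centers, cameras, lam0=0.5, sigma=0.5)
expected_totals = [K * rate for rate in Lambda]  # E[n_j.] = K * Lambda_j

for j, (rate, total) in enumerate(zip(Lambda, expected_totals), start=1):
    print(f"camera {j}: Lambda_j = {rate:.4f}, expected total over K occasions = {total:.4f}")
```

The first camera sits on an activity center, so its rate is dominated by that one individual; the second camera is farther from every center and expects far fewer encounters.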
In the data augmentation method used in Chandler and Royle [22], Royle et al. [17], and Royle and Dorazio [23], the camera encounter histories were augmented with a set of all-zero camera encounter histories to create a hypothetical augmented population of size $M$ in the study area. The augmented parameter $M$ is an integer and is recommended to be much greater than the unknown $N$, i.e., $M \gg N$, to avoid the truncation of the posterior distribution of $N$. Notice that a very large value of $M$ will increase the computational time. Uninformative prior distributions are assumed for the unknown parameters. Prior distributions of $\lambda_0$, $\sigma$, and $\psi$ are considered $\mathrm{Uniform}(0, 1)$, where $\psi$, the probability of success, is the probability that an individual in the occupancy model of size $M$ is a member of the true population of size $N$. A binomial prior distribution, $N \sim \mathrm{Binomial}(M, \psi)$, is assumed for $N$, where $\psi \sim \mathrm{Uniform}(0, 1)$. Assuming a discrete uniform distribution for the detection of individuals in the hypothetical population of size $M$, $L = M - n$ individuals are associated with all-zero encounter histories. In turn, indicator variables $\omega_1, \omega_2, \ldots, \omega_M$ are introduced such that
$$\omega_i = \begin{cases} 0, & \text{if individual } i \text{ is not a member of the population} \\ 1, & \text{if individual } i \text{ is a member of the population,} \end{cases}$$
where $\omega_i \sim \mathrm{Bernoulli}(\psi)$, $i = 1, 2, \ldots, M$, with expected value $E[\omega_i] = \psi$ and variance $\mathrm{Var}(\omega_i) = \psi(1 - \psi)$. Hence, the encounter data for each individual in the augmented population can be modeled by
$$z_{ijk} \mid \omega_i = 1 \sim \mathrm{Poisson}(\lambda_{ij}\,\omega_i), \qquad z_{ijk} \mid \omega_i = 0 \sim I(z_{ijk} = 0),$$
and in turn, the population size can be obtained by
$$\hat{N} = \sum_{i=1}^{M} \omega_i.$$
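The role of the indicator variables can be illustrated with a small sketch. It only draws each $\omega_i$ from its $\mathrm{Bernoulli}(\psi)$ prior and sums the indicators; in the full model each $\omega_i$ is updated conditionally on the encounter data, which is not reproduced here. The function name and the values of $M$ and $\psi$ are hypothetical.

```python
import random


def augment_and_count(M, psi, rng):
    """Draw omega_i ~ Bernoulli(psi) for i = 1..M and return (omega, N_hat = sum of omega)."""
    omega = [1 if rng.random() < psi else 0 for _ in range(M)]
    return omega, sum(omega)


rng = random.Random(1)  # seeded only so the illustration is repeatable
omega, n_hat = augment_and_count(M=100, psi=0.25, rng=rng)
print(f"augmented size M = {len(omega)}, members of the population N_hat = {n_hat}")
```

Repeating the draw many times gives counts scattered around $M\psi$, which is the binomial prior $N \sim \mathrm{Binomial}(M, \psi)$ stated above.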
Assuming mutual independence of the prior distributions, the joint prior distribution is
$$P(\psi, \lambda_0, \sigma) \propto P(\psi)\, P(\lambda_0)\, P(\sigma),$$
and in turn, the joint posterior distribution of the parameters is
$$P(z, \omega, s, \psi, \lambda_0, \sigma \mid n, X) \propto \prod_{i=1}^{M} \left\{ \prod_{j=1}^{J} \prod_{k=1}^{K} P(n_{jk} \mid z_{ijk})\, P(z_{ijk} \mid \omega_i, s_i, \sigma, \lambda_0) \right\} P(\omega_i \mid \psi)\, P(s_i)\; P(\psi)\, P(\lambda_0)\, P(\sigma),$$
where $X$ is a coordinate matrix of camera locations. Notice that in the original model, the assumed prior distributions for $\lambda_0$ and $\sigma$ are uninformative. A spatial Metropolis–Gibbs Markov chain Monte Carlo (MCMC) algorithm for estimating the model parameters is used in Chandler and Royle [22].

2.2. Proposed Method

For a random sample Y from a given population with unknown probability distribution f ( Y ,   θ ) , the unknown parameter θ can be estimated using a point estimator to construct a confidence interval for the unknown parameter. In Bayesian statistics, the samples are often generated from an uninformative prior. However, an informative prior to sample parameter θ can be inferred empirically or could be available from previous studies [21,22,23]. In the next section, an informative prior to sample ψ is introduced.

2.3. Sensitivity of the Model to ψ

The probability of success $\psi$ that an individual is a member of the population is
$$\psi = \frac{N}{M},$$
where $N$ is the unknown true population size, and $M$ is the size of the zero-inflated population. In the data augmentation model, an arbitrarily large zero-inflated population is generated. As a result, the upper bound of the augmented population size $M$ can increase indefinitely. Simulations have demonstrated that increasing the zero-inflated population size increases the true estimation error. In turn, the spatial models can substantially overestimate the unknown population size. To address the inflated estimation error due to the inflated size of the population, we suggest bounding the size of the augmented data. A practical lower bound of the augmented data is twice the unknown population size. Increasing the size of the inflated population above twice the true population size increases the error of the estimated population size and, in turn, widens the credible interval.
To investigate the sensitivity of the model to $\psi$, simulations were designed, and the impact of $\psi$ on the estimated population size was studied. In the first set of simulations, an uninformative prior with no constraints was considered to sample the probability of success $\psi$. This means that all values in the range [0, 1] will be accepted for $\psi$ by the Gibbs sampling. In the next set of simulations, different constraints for sampling $\psi$ were enforced. An estimate of $\psi$ can be obtained by $\hat{\psi} = \hat{N} / M$, where $\hat{\psi}$ is sampled from a Beta distribution with parameters
$$\alpha = 1 + \sum_{i=1}^{M} \omega_i$$
and
$$\beta = 1 + M - \sum_{i=1}^{M} \omega_i.$$
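A minimal sketch of this sampling step is given below, assuming the constraint on $\psi$ is enforced by rejecting draws that fall outside the allowed range (the Results section describes rejecting samples with $\psi > 0.50$). The function name, the `max_tries` safeguard, and the example indicator vector are hypothetical and not taken from the original implementation.

```python
import random


def sample_psi(omega, lower=0.0, upper=1.0, rng=random, max_tries=10_000):
    """Draw psi ~ Beta(alpha, beta) and keep the first draw inside [lower, upper].

    alpha = 1 + sum(omega) and beta = 1 + M - sum(omega), with M = len(omega).
    """
    M = len(omega)
    members = sum(omega)
    alpha = 1 + members
    beta = 1 + M - members
    for _ in range(max_tries):
        psi = rng.betavariate(alpha, beta)
        if lower <= psi <= upper:
            return psi
    raise RuntimeError("no draw fell inside the constraint range")


# Hypothetical state of the chain: 25 of M = 100 augmented individuals are currently members.
omega = [1] * 25 + [0] * 75
rng = random.Random(0)
print("unconstrained draw:", round(sample_psi(omega, rng=rng), 3))
print("draw within [0.1, 0.4]:", round(sample_psi(omega, 0.1, 0.4, rng=rng), 3))
```

Narrower ranges reject more draws, which is consistent with the higher rejection rate and computational cost reported for the constrained scenarios.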
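The estimated number of individuals, in Equation (13), is given by $\hat{N} = \sum_{i=1}^{M} \hat{\omega}_i$. The expected population size at camera $j$ is given by
$$E[\hat{N}] = E\left[\sum_{i=1}^{M} \hat{\omega}_i\right] = \sum_{i=1}^{M} E[\hat{\omega}_i] = \sum_{i=1}^{M} I_i \hat{\psi},$$
where $I_i$ is the indicator function for individual $i$, and $\hat{\psi}$ is the estimated probability of success, that is, the probability that an individual belongs to the true population. The variance of the estimated $\hat{N}$ can be derived from
$$\mathrm{Var}(\hat{N}) = \mathrm{Var}\left(\sum_{i=1}^{M} \hat{\omega}_i\right) = \sum_{i=1}^{M} \mathrm{Var}(\hat{\omega}_i) = \sum_{i=1}^{M} I_i \hat{\psi}(1 - \hat{\psi}),$$
and the standard deviations of the estimated $\hat{N}$ and $\hat{\psi}$ are
$$\mathrm{Sd}(\hat{N}) = \sqrt{\sum_{i=1}^{M} I_i \hat{\psi}(1 - \hat{\psi})} = \sqrt{M \hat{\psi}(1 - \hat{\psi})},$$
$$\mathrm{Sd}(\hat{\psi}) = \sqrt{\frac{\hat{\psi}(1 - \hat{\psi})}{M}}.$$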
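To give a feel for the magnitudes involved, the two standard deviation formulas can be evaluated at the true values used later in the simulations ($N = 25$ with $M = 100$ or $M = 200$). This is only a worked evaluation of the binomial formulas; the function names are hypothetical.

```python
import math


def sd_n_hat(M, psi_hat):
    """Sd(N_hat) = sqrt(M * psi_hat * (1 - psi_hat))."""
    return math.sqrt(M * psi_hat * (1.0 - psi_hat))


def sd_psi_hat(M, psi_hat):
    """Sd(psi_hat) = sqrt(psi_hat * (1 - psi_hat) / M)."""
    return math.sqrt(psi_hat * (1.0 - psi_hat) / M)


N = 25
for M in (100, 200):
    psi = N / M  # true probability of success for this augmentation size
    print(f"M = {M}: psi = {psi:.3f}, Sd(N_hat) = {sd_n_hat(M, psi):.3f}, "
          f"Sd(psi_hat) = {sd_psi_hat(M, psi):.4f}")
```

Note that $\mathrm{Sd}(\hat{N}) = M \cdot \mathrm{Sd}(\hat{\psi})$, so a small absolute error in $\hat{\psi}$ is magnified by the augmentation size when it is translated into an error in $\hat{N}$.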

2.4. Autocorrelation Plot

The autocorrelation plot is an effective tool to assess the correlation of the samples produced by a Markov chain and inspect whether the samples are well mixed [24]. The correlation coefficient range is [−1, 1], with −1 indicating perfect negative correlation, zero representing no correlation, and one indicating perfect positive correlation. It will also quantify the correlation between the current value of the chain and its past values (lags). Generated samples by MCMC from one iteration to the next will be somewhat correlated. In a well-mixed chain, the correlation is small, and the autocorrelation drops relatively quickly. However, if the chain does not mix well, samples will be highly correlated, and the correlation will decay slowly. In turn, a large number of iterations is required to reach the stationary distribution of the Markov chain [25,26]. Lag k autocorrelation represents the autocorrelation between the current sample and kth preceding sample [19].
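The lag $k$ autocorrelation of a chain can be computed directly from its samples. The sketch below uses the usual sample autocorrelation (lagged covariance divided by the lag-0 sum of squares); the function name and the toy chains are hypothetical.

```python
def autocorrelation(chain, lag):
    """Sample autocorrelation between each value of the chain and the value `lag` steps earlier."""
    n = len(chain)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(chain)")
    mean = sum(chain) / n
    var = sum((v - mean) ** 2 for v in chain)
    if var == 0:
        raise ValueError("autocorrelation is undefined for a constant chain")
    cov = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return cov / var


# A slowly drifting chain is strongly positively correlated at lag 1 ...
slow = [t / 100 for t in range(100)]
# ... while an alternating chain is strongly negatively correlated at lag 1.
alternating = [1.0, -1.0] * 50

print(round(autocorrelation(slow, 1), 3))
print(round(autocorrelation(alternating, 1), 3))
```

Evaluating the function at lags 0, 1, 2, ... and plotting the values gives the autocorrelation plot; a well-mixed chain shows values that drop toward zero within a few lags.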

2.5. Effective Sample Size

Effective sample size (ESS) is another way to study the convergence of the chain. ESS provides an estimated number of independent observations equivalent to the samples generated by a Markov chain [27] that iterates $T$ times (Figure 2). In other words, it represents the number of independent samples in the simulation and gives an estimate of how well the simulation represents the target distribution. As mentioned above, the samples generated by MCMC are somewhat correlated. This means that highly correlated or poorly mixing chains provide less information. A high ESS indicates a well-mixed chain, while a low ESS indicates poor convergence. A chain with a low ESS must run for a larger number of iterations to improve the convergence of the chain toward its stationary distribution [26].
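One common way to estimate the ESS from a single chain is $T / (1 + 2\sum_k \rho_k)$, where $\rho_k$ is the lag $k$ autocorrelation and the sum is truncated at the first non-positive term. The text does not state which estimator was used in the simulations, so the sketch below is only an assumed, simplified version; the function name and the toy chains are hypothetical.

```python
def effective_sample_size(chain, max_lag=None):
    """Estimate ESS as T / (1 + 2 * sum of positive leading autocorrelations).

    The sum stops at the first lag whose autocorrelation is not positive.
    Cost grows with T * max_lag, so pass max_lag for long chains.
    """
    T = len(chain)
    mean = sum(chain) / T
    var = sum((v - mean) ** 2 for v in chain)
    if var == 0:
        raise ValueError("ESS is undefined for a constant chain")
    last_lag = T - 1 if max_lag is None else min(max_lag, T - 1)
    rho_sum = 0.0
    for lag in range(1, last_lag + 1):
        cov = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(T - lag))
        rho = cov / var
        if rho <= 0:
            break
        rho_sum += rho
    return T / (1.0 + 2.0 * rho_sum)


sticky = [0.0] * 50 + [1.0] * 50   # stays in one state for a long time: poor mixing
alternating = [1.0, -1.0] * 50     # no positive autocorrelation at lag 1

print(round(effective_sample_size(sticky), 1))
print(effective_sample_size(alternating))  # 100.0
```

The sticky chain has 100 samples but an ESS far below 100, which is the sense in which highly correlated chains carry less information.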

3. Results

The simulations were implemented using R v4.2.2 (a statistical analysis programming language) within RStudio and a range of specialized packages for Markov chain Monte Carlo (MCMC), the Metropolis–Hastings algorithm, and the Gibbs sampler (see [19,28,29,30,31]). Statistical simulations were implemented to study the sensitivity of the spatial capture model to the data augmentation parameter $L$ (added number of zeros) by estimating the unknown population size $N$, home range $\sigma$, and $\lambda_0$. In this model, a hypothetical population size ($M$) for the upper bound of the true population size ($N$) is selected such that $M = N + L$. In turn, the estimated $N$ can assume values between zero and $M$. Several simulations were performed to test the sensitivity of the model to the selected value of $M$ for the true value of $N = 25$, and the results are shown in Table 1 and Figure 3. The estimated values of $\sigma$, $\lambda_0$, $\psi$, and $N$ were calculated along with the ESS and the lag-10 autocorrelation. The estimated population size $N$ is obtained by running simulations with and without a constraint on the sampling range of $\psi$ (Table 1). For $M = 100$, the true probability of success ($\psi = N/M$) is 0.25, and it is 0.125 for $M = 200$.
It can be observed (in Table 1 and Figure 3) that the estimated N has assumed a broad range of values. The sensitivity of the model to the added number of zeros L is more noticeable when sampling the probability of success ψ from [0, 1], i.e., without constraint. In all cases, the estimated values of N are more accurate with the constraints on ψ . It was observed that by decreasing the sampling range of the ψ , the effective sample size will increase, while the rejection rate and in turn computational cost increase.
It is recommended to choose large values for the data augmentation parameter $M$. However, it must be pointed out that $M$ may not be increased indefinitely, as it may produce a very small $\psi = N/M$ close to zero and, in turn, a significant overestimation of $N$. Hence, two different constraints, [0.0, 0.5] and [0.1, 0.4], were set for the range of parameter $\psi$, and the simulation results were compared with the uninformed prior of $\psi$ and sampling it from [0, 1]. We ran 50 simulations for each scenario and computed the average to be able to generalize the results.
As depicted in Figure 3, regardless of the number of added zeros L , the sensitivity decreases by constraining the probability of success, and true ψ is contained in the estimated confidence interval highlighted in gray. Furthermore, the estimated N is consistent for ψ = 0.25 , as depicted in Figure 2. A fair estimate of N is obtained when sampling ψ from [0.0, 0.5] regardless of L . The results show that a reasonable estimate of N between 23 and 33 can be obtained by constraining ψ , and with a higher probability of success of 0.25, the estimated N is more accurate (between 23 and 27).
Table 2 shows the estimated values of $\sigma$, $\lambda_0$, $\psi$, and $N$, along with the ESS and lag-10 autocorrelation for $\psi$ and $N$. For $M = 100$, the estimated value of the population size $N$ in the case of no constraint on the parameter $\psi$ is about 32, with an ESS equal to 1862. After enforcing the constraint [0, 0.5] on the range of $\psi$ to reject all samples with $\psi > 0.50$, the estimated value of $N$ is about 27 with an ESS of 4368. Moreover, by limiting the range of $\psi$ to [0.10, 0.40], the estimated population size is 26 with an ESS of 7678. Pointwise nonparametric confidence intervals for population size given the range of $\psi$ are compared for different values of $M$ (Figure 4). As this figure shows, more robust estimates of the population size and $\psi$ are obtained for $M = 100$, while the estimation error and bias are increased for $M = 200$. Figure 5 shows the estimated $\psi$ and its confidence interval regarding the ESS of $\hat{N}$ and the ESS of $\hat{\psi}$ for different values of $M$. The estimated $\psi$ is more robust and converges to its true value within an ESS of 3000 for $M = 100$, while it is overestimated and does not converge to the true value of $\psi$ for $M = 200$.
Figure 6 and Figure 7 show the estimated parameters $\sigma$ and $\lambda_0$ in comparison with their true values. In all cases, the densities of the estimated parameters converged to the stationary distribution. With the constraint on parameter $\psi$, the mixing of samples in the chains was improved, providing more accurate estimates of $N$. With no constraint on the parameter $\psi$ and $M = 200$, the Monte Carlo average of the estimated population size $N$ was about 33 with an ESS of 1779. By enforcing the constraints on the parameter $\psi$, the estimated population size $N$ was 31, which demonstrated a smaller absolute error as well as improved accuracy with a narrower confidence interval in comparison with the scenario with an uninformative prior distribution (Table 2). We should point out that with $M = 200$, the augmented population size is eight times larger than the true population size of $N = 25$; as a result, $N$ is substantially overestimated, and $N = 25$ is well outside of the confidence interval of [LB = 34.0, UB = 44.0] (Table 3).
The estimated $N$ for the constrained $\psi$ in [0.10, 0.40] and $M = 100$ has the minimum absolute error ($|N - \hat{N}| = |25 - 26| = 1$) and provides the highest number of independent samples, with $\mathrm{ESS} = 7678$ generated through 90,000 MCMC iterations. With an uninformative prior for $\psi$, the range of the estimated $\sigma$ is [0.30, 2.03] with an average of 0.56. By regularizing $\psi$ and constraining its range to [0, 0.50], the range of the estimated $\sigma$ is [0.40, 1.70]. Moreover, by constraining $\psi$ to [0.10, 0.40], the range of the estimated $\sigma$ is [0.39, 0.8] with an average of 0.52 and the narrowest range for the estimated $\psi$ containing the true $\psi$ and with the lowest absolute error of the estimate. The estimated $\lambda_0$ for the aforementioned ranges of $\psi$ is 0.497, 0.546, and 0.557, respectively. The estimated population size ranges from 8 to 61 with an average of 32 for no restriction on $\psi$, from 6 to 37 with an average of 27 for $\psi$ in [0, 0.50], and from 16 to 37 with an average of 26 for $\psi$ in [0.1, 0.40]. The estimated value of the population size is more accurate, and the range of the estimated value is narrower after regularizing $\psi$ (Table 1, Table 2 and Table 3). Moreover, the standard error for the estimated population size ranges from 5.8 to 24.7, with an average of 14.37, with no restriction on $\psi$. The standard error is between 5.8 and 13 with an average of 8.56 for $\psi \le 0.50$. With $\psi$ constrained between 0.10 and 0.40, the standard error ranges from 5 to 9 with an average of 6.8.
The estimated values for σ with M = 200 are 0.546, 0.527, and 0.496 (Table 2) for all three scenarios regarding ψ , i.e., no constraint, [0, 0.5], and [0.1, 0.4]. Also, the estimated values of λ 0 are 0.527, 0.564, and 0.545 (Table 2) for the aforementioned constraints on ψ . As it can be observed, the range of the estimated parameter is smaller with a constraint on ψ (Table 1 and Table 2). The estimated population size ranges from 6 to 107 with an average of 33 with no constraint on ψ , from 11 to 61 with an average of 31 by constraining ψ to [0, 0.5], and from 19 to 47 with an average of 31 by constraining ψ to [0.10, 0.40]. As depicted in Figure 7, we can see that by constraining ψ to [0.10, 0.40], the distribution of N ^ has a shorter tail and converges to the stationary posterior distribution. Moreover, there is better mixing of the chains providing more accurate estimates of the parameters.

4. Discussion

By constraining the parameter $\psi$ to the range [0, 0.50], $M$ can assume any value of at least $2N$ ($M \ge 2N$). In contrast, by restricting $\psi$ to the range [0.10, 0.40], $M$ is double-bounded within $2.5N \le M \le 10N$. Enforcing an upper bound of $10N$ is a sufficient assumption that satisfies the $M \gg N$ recommended for spatial capture models, while we prevent an indefinitely large $M$ that can result in prohibitively small estimates of $\psi$ (close to zero) with a highly skewed distribution. In practical applications, determining an accurate prediction for the value of $M$ can be challenging. In cases where no prior information about $M$ is known, the range [0, 0.5] can be considered for $\psi$. However, if prior information about the population size is known, and the camera grid provides adequate coverage of the habitat of interest, the range [0.1, 0.4] for $\psi$ is preferred to avoid underestimation of $\psi$ toward zero.
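The bounds on $M$ quoted above follow directly from $\psi = N/M$: a range $[\psi_{\min}, \psi_{\max}]$ for $\psi$ implies $N/\psi_{\max} \le M \le N/\psi_{\min}$. The sketch below evaluates this for the simulated population of $N = 25$; the function name is hypothetical.

```python
def augmentation_bounds(N, psi_low, psi_high):
    """Range of augmented sizes M compatible with psi = N / M lying in [psi_low, psi_high]."""
    if not 0 <= psi_low < psi_high <= 1:
        raise ValueError("need 0 <= psi_low < psi_high <= 1")
    lower = N / psi_high                                   # largest psi gives the smallest M
    upper = float("inf") if psi_low == 0 else N / psi_low  # psi_low = 0 leaves M unbounded
    return lower, upper


N = 25
for psi_range in ((0.0, 0.5), (0.1, 0.4)):
    lower, upper = augmentation_bounds(N, *psi_range)
    print(f"psi in {psi_range}: {lower:g} <= M <= {upper:g}")
```

For $N = 25$ the range [0.1, 0.4] keeps $M$ between roughly 62 and 250, so $M = 100$ and $M = 200$ both fall inside it, whereas [0, 0.5] only requires $M \ge 50$.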
It was observed that the estimated population size using the spatial capture models with camera encounters is subject to overestimation and bias. The estimation bias tends to increase as $M$ increases. ESS and the lag-10 autocorrelation can be used to control the convergence of MCMC and, in turn, the estimation bias. Specifically, lower ESS values are associated with higher bias, whereas higher ESS values are associated with lower bias. Additionally, larger values of the lag-10 autocorrelation tend to correspond to higher levels of bias, and vice versa.
In terms of $M$, by setting it within five times $N$, the estimated $N$ is relatively accurate with low bias, while by increasing $M$ toward 10 times $N$, the estimated $N$ converges to higher values (overestimates) with increased bias. It should be noticed that the values of $\psi$ and $\lambda_0$ depend on each other, and in turn, their estimated values are not independent. Specifically, there is a trade-off between the two estimated parameters. If the estimate of $\lambda_0$ increases, the estimate of $\psi$ will decrease, and vice versa. Potentially, the estimated population size could be further improved by considering a prior distribution for $\lambda_0$, which is the subject of our future work. Nonetheless, regularizing the value of $\psi$ by a prior distribution in conjunction with controlling the ESS and the lag-10 autocorrelation improved the accuracy of the estimated population size.

5. Conclusions

Population management is important to preserve the populations of endangered species and to control the populations of invasive species. Estimating the population size is an essential task in managing the population. Collecting a random sample of individuals from the population is a feasible practice to study the population when it is not possible to count every individual. Capture–recapture methods based on spatial sampling and count models have become the standard methods in the analytical framework for ecological statistics and are widely used for population analysis to estimate population size and density.
The unknown size of a population is considered as the number of successes in a binomial distribution. The parameters of this binomial distribution are an unknown number of independent trials and an unknown probability of success. This is an inverse problem as none of the parameters including the population size, the probability of success, and the number of independent trials is known. An initial realization of the unknown number of trials is required to approach a solution while the accuracy of the estimated parameters strongly relies on the initial value for the zero-inflated population size. Hence, the estimated population size in spatial capture models is quite sensitive to the size of a zero-inflated population and in turn to the estimated probability of success. This is a typical count model with censored data as captured individuals are not identified, and some individuals are not detected. To address this problem and improve the estimation accuracy, in this research, we investigated the sensitivity and accuracy of the spatial capture-unidentified models (using virtual encounters) with the objective of improving their robustness. Statistical simulations were implemented to study the sensitivity of the spatial capture models to the zero-inflated population parameter ψ .
In capture-unidentified models, augmented population size ( M ) is allowed to be increased indefinitely, and so ψ will be allowed to belong to [0, 1]. As a result, the population size N will be highly overestimated, and the accuracy of the estimate declines for large values of M while the estimated ψ moves toward zero. Consequently, the credible interval becomes wider, and the distribution of N ^ displays a heavy and long right tail. Moreover, the true estimation error and estimation bias increase, while ESS tends to remain low. To address the aforementioned issues, a lower and an upper bound for ψ were recommended to prevent overestimation of population size that is caused by excessive underestimation of ψ when it is not regularized. In this way, the accuracy and bias were retained providing fairly narrow credible intervals, while ESS was improved.

Author Contributions

Conceptualization, M.J., F.H., R.D.B. and N.N.K.; methodology, M.J., F.H., R.D.B. and N.N.K.; software, M.J., F.H., R.D.B. and N.N.K.; validation, M.J., F.H., R.D.B. and N.N.K.; formal analysis, M.J. and F.H.; investigation, M.J., F.H., R.D.B. and N.N.K.; resources, N.N.K.; data curation, M.J., F.H. and R.D.B.; writing—original draft preparation, M.J., F.H., R.D.B. and N.N.K.; writing—review and editing, M.J., F.H., R.D.B. and N.N.K.; visualization, M.J., F.H., R.D.B. and N.N.K.; supervision, N.N.K.; project administration, N.N.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No data was collected. Rather, data was randomly sampled in the simulation study.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

MCMC: Markov chain Monte Carlo
HSCR: Hierarchical spatial capture–recapture
$N$: Population size
$\hat{N}$: Estimated population size
$K$: Sampling occasion
$z_{ijk}$: Camera encounter history for individual $i$, at camera $j$, on occasion $k$
$\lambda_{ij}$: The encounter rate for individual $i$ at camera $j$
$\lambda_0$: The baseline encounter rate
$\sigma$: Home range radius
$d_{ij}$: The Euclidean distance between activity center $s_i$ and the camera location $x_j$
$n_{jk}$: The number of camera encounters at camera $j$ on occasion $k$
$M$: The augmented parameter (the total number of hypothetical individuals)
$\psi$: Probability of success, i.e., the probability that an individual in the occupancy model of size $M$ is a member of the original model of size $N$
$\alpha$ and $\beta$: Parameters of the Beta probability distribution
ESS: The effective sample size
$L$: The number of zeros added to the model, $L = M - N$ (data augmentation size)
Lag $k$: The autocorrelation between the current sample and the $k$th preceding sample

References

  1. Pollock, K.H. Capture-Recapture Models: A Review of Current Methods, Assumptions and Experimental Design. Stud. Avian Biol. 1981, 6, 426–435. [Google Scholar]
  2. Schwarz, C.J.; Seber, G.A. Estimating Animal Abundance: Review III. Stat. Sci. 1999, 14, 427–456. [Google Scholar] [CrossRef]
  3. Nichols, J.D. Capture-Recapture Models. BioScience 1992, 42, 94–102. [Google Scholar] [CrossRef]
  4. Pollock, K.H. A Capture-Recapture Design Robust to Unequal Probability of Capture. J. Wildl. Manag. 1982, 46, 752–757. [Google Scholar] [CrossRef]
  5. Pollock, K.H.; Nichols, J.D.; Brownie, C.; Hines, J.E. Statistical Inference for Capture-Recapture Experiments. Wildl. Monogr. 1990, 48, 3–97. [Google Scholar]
  6. Karanth, K.U. Estimating Tiger Panthera Tigris Populations from Camera-Trap Data Using Capture—Recapture Models. Biol. Conserv. 1995, 71, 333–338. [Google Scholar] [CrossRef]
  7. O’Connell, A.F.; Nichols, J.D.; Karanth, K.U. Camera Traps in Animal Ecology: Methods and Analyses; Springer: Berlin/Heidelberg, Germany, 2011; Volume 271. [Google Scholar]
  8. Jaber, M.; Woesik, R.V.; Kachouie, N.N. Probabilistic Detection Model for Population Estimation; Old Dominion University: Norfolk, VA, USA, 2018; p. 43. [Google Scholar]
  9. Distiller, G.; Borchers, D.L. A spatially explicit capture–recapture estimator for single-catch traps. Ecol. Evol. 2015, 5, 5075–5087. [Google Scholar] [CrossRef]
  10. Engeman, R.; Massei, G.; Sage, M.; Gentle, M.N. Monitoring Wild Pig Populations: A Review of Methods. Environ. Sci. Pollut. Res. 2013, 20, 8077–8091. [Google Scholar] [CrossRef]
  11. Royle, J.A.; Young, K.V. A Hierarchical Model for Spatial Capture–Recapture Data. Ecology 2008, 89, 2281–2289. [Google Scholar] [CrossRef]
  12. Royle, J.A.; Karanth, K.U.; Gopalaswamy, A.M.; Kumar, N.S. Bayesian Inference in Camera Trapping Studies for a Class of Spatial Capture–Recapture Models. Ecology 2009, 90, 3233–3244. [Google Scholar] [CrossRef]
  13. Kery, M.; Gardner, B.; Stoeckle, T.; Weber, D.; Royle, J.A. Use of Spatial Capture-recapture Modeling and DNA Data to Estimate Densities of Elusive Animals. Conserv. Biol. 2011, 25, 356–364. [Google Scholar] [CrossRef] [PubMed]
  14. Borchers, D. A Non-Technical Overview of Spatially Explicit Capture–Recapture Models. J. Ornithol. 2012, 152, 435–444. [Google Scholar] [CrossRef]
  15. Royle, J.A.; Chandler, R.B.; Sollmann, R.; Gardner, B. Spatial Capture-Recapture; Academic Press: Cambridge, MA, USA, 2013; ISBN 0-12-407152-X. [Google Scholar]
  16. Gardner, B.; Reppucci, J.; Lucherini, M.; Royle, J.A. Spatially Explicit Inference for Open Populations: Estimating Demographic Parameters from Camera-trap Studies. Ecology 2010, 91, 3376–3383. [Google Scholar] [CrossRef] [PubMed]
  17. Royle, J.A.; Dorazio, R.M.; Link, W.A. Analysis of Multinomial Models with Unknown Index Using Data Augmentation. J. Comput. Graph. Stat. 2007, 16, 67–85. [Google Scholar] [CrossRef]
  18. Jiao, G.; Liang, J.; Wang, F.; Chen, X.; Chen, S.; Li, H.; Jin, J.; Cai, J.; Zhang, F. Longitudinal Data Analysis Based on Bayesian Semiparametric Method. Axioms 2023, 12, 431. [Google Scholar] [CrossRef]
  19. Carlo, C.M. Markov Chain Monte Carlo and Gibbs Sampling. Lect. Notes EEB 2004, 581, 3. [Google Scholar]
  20. Alotaibi, R.; Nassar, M.; Elshahhat, A. Statistical Analysis of Inverse Lindley Data Using Adaptive Type-II Progressively Hybrid Censoring with Applications. Axioms 2023, 12, 427. [Google Scholar] [CrossRef]
  21. Sollmann, R.; Tôrres, N.M.; Furtado, M.M.; de Almeida Jácomo, A.T.; Palomares, F.; Roques, S.; Silveira, L. Combining Camera-Trapping and Noninvasive Genetic Data in a Spatial Capture–Recapture Framework Improves Density Estimates for the Jaguar. Biol. Conserv. 2013, 167, 242–247. [Google Scholar] [CrossRef]
  22. Chandler, R.B.; Royle, J.A. Spatially explicit models for inference about density in unmarked or partially marked populatioins. Ann. Appl. Stat. 2013, 7, 936–954. [Google Scholar] [CrossRef]
  23. Royle, J.A.; Dorazio, R.M. Parameter-Expanded Data Augmentation for Bayesian Analysis of Capture–Recapture Models. J. Ornithol. 2012, 152, 521–537. [Google Scholar] [CrossRef]
  24. Christensen, O.F.; Roberts, G.O.; Sköld, M. Robust Markov Chain Monte Carlo Methods for Spatial Generalized Linear Mixed Models. J. Comput. Graph. Stat. 2006, 15, 1–17. [Google Scholar] [CrossRef]
  25. Kuzmanovska, I. Markov Chain Monte Carlo Methods in Biological Mechanistic Models. 2012. Available online: https://www.research-collection.ethz.ch/handle/20.500.11850/153590 (accessed on 8 August 2023).
  26. Lewis, A. Efficient Sampling of Fast and Slow Cosmological Parameters. Phys. Rev. D 2013, 87, 103529. [Google Scholar] [CrossRef]
  27. Martino, L.; Elvira, V.; Louzada, F. Effective Sample Size for Importance Sampling Based on Discrepancy Measures. Signal Process. 2017, 131, 386–401. [Google Scholar] [CrossRef]
  28. Fabreti, L.G.; Höhna, S. Convergence Assessment for Bayesian Phylogenetic Analysis Using MCMC Simulation. Methods Ecol. Evol. 2022, 13, 77–90. [Google Scholar] [CrossRef]
  29. Zhang, J.; Cui, S. Investigating the Number of Monte Carlo Simulations for Statistically Stationary Model Outputs. Axioms 2023, 12, 481. [Google Scholar] [CrossRef]
  30. Jaber, M. A Spatiotemporal Bayesian Model for Population Analysis. 2022. Available online: https://repository.fit.edu/etd/880/ (accessed on 8 August 2023).
  31. Martin, A.D.; Quinn, K.M.; Park, J.H. MCMCpack: Markov Chain Monte Carlo in R. J. Stat. Softw. 2011, 42, 1–21. [Google Scholar] [CrossRef]
Figure 1. Lineage of spatial capture methods.
Figure 2. Schematic of capture–recapture model.
Figure 3. Sensitivity of the estimated probability of success ($\hat{\psi}$) to the selected range of $\psi$ for $M = 100$ (a) and $M = 200$ (b); sensitivity of the estimated population size ($\hat{N}$) to the selected range of $\psi$ for $M = 100$ (c) and $M = 200$ (d).
Figure 4. Pointwise 95% confidence intervals for $N$ with respect to $\hat{\psi}$ for $M = 100$ (a) and $M = 200$ (c); nonparametric 95% confidence intervals for $N$ with respect to $\hat{\psi}$ for $M = 100$ (b) and $M = 200$ (d); (e) combined plot of the confidence intervals in panels (b,d) for comparison.
Figure 5. Estimated $\psi$: (a) vs. ESS of $\hat{N}$ and (b) vs. ESS of $\hat{\psi}$; estimated 95% confidence interval for $\psi$ with respect to ESS of $\hat{N}$ for $M = 100$ (c) and $M = 200$ (d); estimated 95% confidence interval for $\psi$ with respect to ESS of $\hat{\psi}$ for $M = 100$ (e) and $M = 200$ (f).
Figure 6. Convergence plots (first column), running mean (second column), and estimated posterior densities (third column) for $\sigma$ (first row), $\lambda_0$ (second row), and $N$ (third row) for $M = 100$. Left: no constraint on $\psi$. Right: $\psi$ constrained to [0.1, 0.4].
Figure 7. Convergence plots (first column), running mean (second column), and estimated posterior densities (third column) for $\sigma$ (first row), $\lambda_0$ (second row), and $N$ (third row) for $M = 200$. Left: no constraint on $\psi$. Right: $\psi$ constrained to [0.1, 0.4].
Table 1. Summary of the estimated mean of $\sigma$, $\lambda_0$, $\psi$, and population size $N$ for different ranges of $\psi$ and $M \in \{100, 200\}$.

| $M$ | $\psi$ range | $\hat{\sigma}$ | $\hat{\lambda}_0$ | $\hat{\psi}$ | $Sd_{\hat{\psi}}$ | $\%e_{\hat{\psi}}$ | $\hat{N}$ | $Sd_{\hat{N}}$ | $ESS_N$ | $ESS_\psi$ | $Lag10_N$ | $Lag10_\psi$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 0.00–1.00 | 0.437 | 0.589 | 0.313 | 0.046 | 25.200 | 30.887 | 2.577 | 1113.2 | 1297.6 | 0.668 | 0.625 |
| 100 | 0.00–0.50 | 0.451 | 0.606 | 0.274 | 0.045 | 9.600 | 27.107 | 2.322 | 1850.8 | 2180.9 | 0.482 | 0.412 |
| 100 | 0.10–0.40 | 0.465 | 0.585 | 0.253 | 0.043 | 1.200 | 25.255 | 2.185 | 2679.4 | 3232.7 | 0.403 | 0.324 |
| 100 | 0.05–0.35 | 0.479 | 0.584 | 0.231 | 0.042 | 7.600 | 23.326 | 2.036 | 2505.5 | 3145.0 | 0.395 | 0.309 |
| 100 | 0.10–0.35 | 0.471 | 0.589 | 0.237 | 0.043 | 5.200 | 23.862 | 2.077 | 2738.4 | 3681.9 | 0.352 | 0.260 |
| 200 | 0.00–1.00 | 0.416 | 0.596 | 0.199 | 0.028 | 59.200 | 39.231 | 2.501 | 250.6 | 263.3 | 0.909 | 0.894 |
| 200 | 0.00–0.50 | 0.443 | 0.579 | 0.158 | 0.026 | 26.400 | 30.979 | 2.030 | 1066.8 | 1140.4 | 0.674 | 0.623 |
| 200 | 0.10–0.40 | 0.415 | 0.579 | 0.174 | 0.027 | 39.200 | 33.433 | 2.192 | 2340.8 | 2727.2 | 0.496 | 0.431 |
| 200 | 0.05–0.35 | 0.443 | 0.579 | 0.153 | 0.025 | 22.400 | 29.922 | 1.969 | 2082.8 | 2203.2 | 0.547 | 0.486 |
| 200 | 0.10–0.35 | 0.411 | 0.592 | 0.173 | 0.027 | 38.400 | 33.340 | 2.184 | 3017.5 | 3701.7 | 0.446 | 0.375 |
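
The $\%e_{\hat{\psi}}$ column in Table 1 appears consistent with an absolute relative error of $\hat{\psi}$ against $\psi = N/M$. The sketch below is a hedged reading of that column, not a definition taken from the text: the true population size `N_true = 25` is an assumption inferred from the tabulated values, and the function name `percent_error_psi` is hypothetical.

```python
def percent_error_psi(psi_hat, M, N_true=25):
    """Absolute relative error (in percent) of psi_hat against psi = N_true / M.

    N_true = 25 is an assumed value, inferred from the table rather than
    stated in this section.
    """
    psi_true = N_true / M
    return 100.0 * abs(psi_hat - psi_true) / psi_true


# psi_hat values taken from the unconstrained rows of Table 1
print(round(percent_error_psi(0.313, 100), 1))  # 25.2
print(round(percent_error_psi(0.199, 200), 1))  # 59.2
```

Under this assumption the remaining rows of the table are reproduced in the same way.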
Table 2. Summary of 50 runs of the estimated mean of $\sigma$, $\lambda_0$, $\psi$, and population size $N$ for different ranges of $\psi$ and $M \in \{100, 200\}$.

| $M$ | $\psi$ range | $\hat{\sigma}$ | $\hat{\lambda}_0$ | $\hat{\psi}$ | $\%e_{\hat{\psi}}$ | $\hat{N}$ | $ESS_N$ | $ESS_\psi$ | $Lag10_N$ | $Lag10_\psi$ |
|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 0.00–1.00 | 0.560 | 0.497 | 0.323 | 29.200 | 31.943 | 1674.3 | 1862.4 | 0.635 | 0.593 |
| 100 | 0.00–0.50 | 0.563 | 0.546 | 0.273 | 9.200 | 27.173 | 3661.7 | 4368.0 | 0.393 | 0.334 |
| 100 | 0.10–0.40 | 0.516 | 0.557 | 0.261 | 4.400 | 26.251 | 5987.0 | 7677.7 | 0.270 | 0.205 |
| 200 | 0.00–1.00 | 0.546 | 0.527 | 0.170 | 36.000 | 33.321 | 1490.3 | 1778.8 | 0.670 | 0.620 |
| 200 | 0.00–0.50 | 0.527 | 0.564 | 0.158 | 26.400 | 31.026 | 2647.5 | 3152.3 | 0.528 | 0.473 |
| 200 | 0.10–0.40 | 0.496 | 0.545 | 0.165 | 32.00 | 31.168 | 3880.0 | 4606.6 | 0.412 | 0.355 |
Table 3. Estimated 95% confidence interval for population size $N$ and probability of success $\psi$.

| $M$ | $\hat{N}$ | 95% CI lower ($\hat{N} - 2Sd_{\hat{N}}$) | 95% CI upper ($\hat{N} + 2Sd_{\hat{N}}$) | $\hat{\psi}$ | 95% CI lower ($\hat{\psi} - 2Sd_{\hat{\psi}}$) | 95% CI upper ($\hat{\psi} + 2Sd_{\hat{\psi}}$) |
|---|---|---|---|---|---|---|
| 100 | 30.887 | 25.733 | 36.041 | 0.313 | 0.220 | 0.406 |
| 100 | 27.107 | 22.463 | 31.751 | 0.274 | 0.185 | 0.363 |
| 100 | 25.255 | 20.886 | 29.624 | 0.253 | 0.166 | 0.340 |
| 100 | 23.326 | 19.255 | 27.397 | 0.231 | 0.147 | 0.315 |
| 100 | 23.862 | 19.707 | 28.017 | 0.237 | 0.152 | 0.322 |
| 200 | 39.231 | 34.230 | 44.232 | 0.199 | 0.143 | 0.255 |
| 200 | 30.979 | 26.919 | 35.039 | 0.158 | 0.106 | 0.210 |
| 200 | 33.433 | 29.049 | 37.817 | 0.174 | 0.120 | 0.228 |
| 200 | 29.922 | 25.984 | 33.860 | 0.153 | 0.102 | 0.204 |
| 200 | 33.340 | 28.972 | 37.708 | 0.173 | 0.120 | 0.226 |
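
The interval bounds in Table 3 follow the estimate ± 2 Sd rule given in its column headers. As a minimal sketch (the function name `two_sd_interval` is hypothetical), the first row can be recomputed from the $\hat{N}$ and $Sd_{\hat{N}}$ values reported for $M = 100$ with unconstrained $\psi$:

```python
def two_sd_interval(estimate, sd):
    """Return (estimate - 2*sd, estimate + 2*sd), the approximate 95% interval."""
    return estimate - 2.0 * sd, estimate + 2.0 * sd


# N_hat = 30.887 and Sd = 2.577 are the reported values for M = 100, psi in [0, 1]
low, high = two_sd_interval(30.887, 2.577)
print(f"{low:.3f}, {high:.3f}")  # 25.733, 36.041
```

The same rule applied to $\hat{\psi}$ and $Sd_{\hat{\psi}}$ gives the $\psi$ bounds, up to rounding of the reported standard deviations.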
Share and Cite

MDPI and ACS Style

Jaber, M.; Hamad, F.; Breininger, R.D.; Kachouie, N.N. An Enhanced Spatial Capture Model for Population Analysis Using Unidentified Counts through Camera Encounters. Axioms 2023, 12, 1094. https://doi.org/10.3390/axioms12121094
