Mathematics — Feature Paper · Article · Open Access

22 November 2025

A Refined Regression Estimator for General Inverse Adaptive Cluster Sampling

1 Department of Mathematics, Faculty of Science, Mahasarakham University, Maha Sarakham 44150, Thailand
2 Department of Mathematics, Faculty of Science, Khon Kaen University, Khon Kaen 40002, Thailand
* Author to whom correspondence should be addressed.

Abstract

Adaptive cluster sampling (ACS) is a sampling technique commonly used for rare populations that exhibit spatial clustering. However, the initially selected sample units may not always satisfy the specified inclusion condition. To address this limitation, general inverse sampling has been incorporated into ACS, in which the initial units are selected sequentially and a termination criterion is applied to control the number of rare elements drawn from the population. The objective of this study is to develop an estimator of the population mean that incorporates auxiliary information within the framework of general inverse adaptive cluster sampling (GI-ACS). The proposed estimator is constructed based on a regression-type estimator and analytically examined. A simulation study was conducted to validate the theoretical findings. Three scenarios were considered, representing low, moderate, and high correlations between the variable of interest and the auxiliary variable. The simulation results indicate that the proposed estimator achieves lower variance than the GI-ACS estimator that does not utilize auxiliary information across all examined correlation scenarios. Therefore, the proposed estimator is more efficient and preferable when auxiliary variables are available.

1. Introduction

Adaptive cluster sampling (ACS), introduced by Thompson [1], is an effective technique for studying populations that are rare and spatially clustered. Initially, a set of sample units is selected using simple random sampling without replacement (SRSWOR). Whenever a selected unit satisfies a predefined condition C based on the variable of interest, all neighboring units are included. This procedure is applied iteratively such that any newly added unit meeting condition C results in the inclusion of its adjacent units, and the process continues until no further units meet the condition. If an initial unit does not satisfy C, no additional units are included, and the cluster size equals one. The set of all initial and associated neighboring units that satisfy C forms a network.
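The iterative expansion just described is essentially a breadth-first search on the grid. Below is a minimal sketch (the function name, the `y > 0` condition, and the four-cell neighborhood are illustrative assumptions, not the paper's code):

```python
from collections import deque

def acs_expand(grid, start, condition=lambda y: y > 0):
    """Grow the ACS network from `start`: whenever a cell satisfies the
    condition C, its four neighbours are examined in turn. A non-qualifying
    initial unit forms a cluster of size one."""
    n_rows, n_cols = len(grid), len(grid[0])
    if not condition(grid[start[0]][start[1]]):
        return {start}                       # cluster of size one
    network, queue, seen = set(), deque([start]), {start}
    while queue:
        i, j = queue.popleft()
        if condition(grid[i][j]):            # qualifying units join the network
            network.add((i, j))
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < n_rows and 0 <= nj < n_cols and (ni, nj) not in seen:
                    seen.add((ni, nj))
                    queue.append((ni, nj))
    return network
```

Edge units (examined neighbours that fail C) are visited but excluded from the returned network, matching the network definition above.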
ACS has been widely applied in the study of rare and clustered populations, including forest ecosystem monitoring [2], herpetofauna surveys in tropical rainforests [3], assessments of sea lamprey larvae [4], and freshwater mussel investigations [5]. It has also been applied in hydroacoustic research [6] and studies related to the COVID-19 pandemic [7,8,9]. Beyond ecological and epidemiological contexts, ACS has recently extended to emerging areas such as autonomous systems and IoT-based applications [10,11].
In many ACS studies, estimation efficiency has been enhanced by incorporating auxiliary variables correlated with the variable of interest. A ratio-type estimator was first proposed in [12], followed by several modifications [13] and extensions to two auxiliary variables [14]. Other developments include ratio-type estimators based on known population parameters—such as the coefficient of variation, kurtosis, and skewness—of the auxiliary variable [15,16], generalized exponential-type estimators [17,18], and generalized ratio-type classes [19]. Beyond advances in estimator development for ACS, significant progress has also been achieved in variance estimation under adaptive cluster sampling [20], with these developments further extending to stratified ACS frameworks [21].
However, in ACS, the initial sample size is predetermined, and it is possible that not all randomly selected units satisfy condition C. To address this issue, inverse ACS was proposed in [22], in which units are sampled sequentially until a specified number of rare elements has been observed. When the number of qualifying units is very small, this procedure may fail to terminate. General inverse sampling, introduced in [23], addresses this limitation by imposing a cap on the final sample size while maintaining control over the number of rare elements obtained. This approach has since been integrated with ACS, forming the general inverse ACS (GI-ACS) design. By imposing an upper bound on the total number of sampled units and simultaneously ensuring that a prescribed number of qualifying units is obtained, GI-ACS provides a mechanism to control the final sample size while maintaining ACS’s capacity to efficiently detect rare and spatially aggregated units. These features make GI-ACS particularly suitable for ecological abundance estimation, environmental surveillance, and epidemiological monitoring. An estimator under GI-ACS based on the Rao–Blackwell theorem was developed in [24], and the combination of unequal-probability inverse sampling with ACS was examined in [25].
In comparison with other probability sampling frameworks, ACS generally achieves superior efficiency when the population is rare and exhibits pronounced spatial clustering. However, its effectiveness may be diminished when the initial random selection yields few or no rare units. GI-ACS alleviates this limitation by combining adaptive neighborhood expansion with a general inverse sampling method. This method controls the final sample size while ensuring a prescribed number of rare units. In contrast, other sampling designs, such as simple random sampling, stratified random sampling, and ranked set sampling, do not expand networks in response to observed rare units, often making them less efficient under spatial association. These distinctions highlight GI-ACS as a flexible alternative that retains the adaptive strengths of ACS while enhancing sampling completeness and operational control.
Previous studies on GI-ACS have primarily focused on estimators based only on the study variable Y. However, in many real-world applications, auxiliary variables correlated with Y are often available and can be measured concurrently at minimal cost. Utilizing such auxiliary information can substantially improve estimation efficiency. Motivated by the regression-type estimator in general inverse sampling proposed in [26], in this study, we develop an estimator that incorporates auxiliary information within the GI-ACS framework.
The remainder of the paper is organized as follows: In Section 2, we provide a review of inverse and general inverse sampling. In Section 3, we describe the general inverse ACS design. In Section 4, we introduce the proposed regression-type estimator using auxiliary information. In Section 5, we present the simulation study and discuss the results. Lastly, Section 6 concludes the study.

2. Inverse Sampling and General Inverse Sampling

Consider a finite population of size $N$ consisting of the values $U = \{y_1, y_2, \ldots, y_N\}$. Let $Y$ denote the variable of interest, and let $y_i$ be the $y$-value associated with the $i$th unit. The population is partitioned into two subgroups according to whether the $y$-value satisfies the condition $C$. Following the notation in [22], define the two subgroups by $U_M = \{y_i : y_i \in C;\ i = 1, 2, \ldots, N\}$ and $U_{NM} = U - U_M = \{y_i : y_i \notin C;\ i = 1, 2, \ldots, N\}$, where $M$ is the unknown number of units in $U_M$. A unit's classification is not known until it is selected. The sampling procedure selects units sequentially at random from the target population until a predetermined number $r$ of units from $U_M$ ($1 < r \le M$) is obtained.
The estimator under inverse sampling was presented in [22]. Suppose an initial sample of size $n_I$ is selected via SRSWOR. If it contains at least $r$ ($r > 1$) units from $U_M$, sampling stops. Otherwise, sampling continues sequentially until exactly $r$ units from $U_M$ are obtained. The total sequential sample size is denoted by $n_T$. Let $\mu_y$ be the population mean of the variable of interest; then an unbiased estimator of $\mu_y$ is

$$\bar{y}_{IS} = \begin{cases} \dfrac{1}{n_I}\displaystyle\sum_{i=1}^{n_I} y_i & \text{if } n_T = n_I, \\[6pt] \dfrac{1}{N}\left[M\,\bar{y}_M + (N - M)\,\bar{y}_{NM}\right] & \text{if } n_T > n_I, \end{cases}$$

where $\bar{y}_M = \frac{1}{r}\sum_{i \in S_M} y_i$ and $\bar{y}_{NM} = \frac{1}{n_T - r}\sum_{i \in S_{NM}} y_i$, with $S_M$ and $S_{NM}$ the index sets of the sampled members of $U_M$ and $U_{NM}$, respectively. In practice, $M$ is unknown; its unbiased estimator is $\hat{M} = N\,\frac{r - 1}{n_T - 1}$.

The variance of $\bar{y}_{IS}$ is

$$V(\bar{y}_{IS}) = \begin{cases} \dfrac{N - n_I}{N}\,\dfrac{S^2}{n_I} & \text{if } n_T = n_I, \\[6pt] V(\bar{y}_I) & \text{if } n_T > n_I, \end{cases}$$

where, using the variance of Murthy's estimator [27],
$$V(\bar{y}_I) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\left(1 - \sum_{s \ni i,j}\frac{\Pr(s \mid i)\,\Pr(s \mid j)}{\Pr(s)}\right)\left(\frac{y_i}{p_i} - \frac{y_j}{p_j}\right)^2 p_i\,p_j,$$
in which $\Pr(s)$ is the probability of obtaining sample $s$, $\Pr(s \mid i)$ is the conditional probability of obtaining $s$ given that the $i$th unit was selected first, $p_i$ is the selection probability of unit $i$ on the first draw, and $S^2 = (N-1)^{-1}\sum_{i=1}^{N}(y_i - \mu_y)^2$.
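The sequential selection rule and the estimator $\bar{y}_{IS}$ (with $M$ replaced by its unbiased estimator $\hat{M}$, as in the text) can be sketched as follows. The function names and the qualifying condition $y > 0$ are illustrative assumptions:

```python
import random

def inverse_sample(y, n_I, r, condition=lambda v: v > 0, rng=random):
    """Draw an SRSWOR sample of size n_I; if it holds fewer than r qualifying
    units, keep drawing sequentially until r qualifying units are observed."""
    order = list(range(len(y)))
    rng.shuffle(order)                       # a random sequential order is SRSWOR
    sample, k = order[:n_I], n_I
    while sum(condition(y[i]) for i in sample) < r and k < len(y):
        sample.append(order[k])
        k += 1
    return sample

def ybar_IS(y, sample, n_I, r, condition=lambda v: v > 0):
    """Inverse-sampling estimator of the population mean (M estimated by M-hat)."""
    N, n_T = len(y), len(sample)
    if n_T == n_I:                           # stopped at the initial sample
        return sum(y[i] for i in sample) / n_I
    S_M  = [i for i in sample if condition(y[i])]      # exactly r units here
    S_NM = [i for i in sample if not condition(y[i])]
    M_hat   = N * (r - 1) / (n_T - 1)
    ybar_M  = sum(y[i] for i in S_M) / r
    ybar_NM = sum(y[i] for i in S_NM) / (n_T - r)
    return (M_hat * ybar_M + (N - M_hat) * ybar_NM) / N
```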
For any sequential design, it may be impossible to complete the sampling procedure in certain situations. This can occur in inverse sampling when $r$ is too large or when the population contains very few units satisfying condition $C$. General inverse sampling was proposed in [23]. Beginning with a sample of size $n_I$ selected via SRSWOR, we stop further sampling if at least $r$ units from $U_M$ are selected. Otherwise, sampling continues until $r$ units from $U_M$ are obtained, subject to a limit $n_F$ ($n_F \le N$) on the final sample size. Then, an unbiased estimator of $\mu_y$ is
$$\bar{y}_{GIS} = \begin{cases} \dfrac{1}{n_I}\displaystyle\sum_{i=1}^{n_I} y_i & \text{if } A_1, \\[6pt] \dfrac{1}{N}\left[\hat{M}\,\bar{y}_M + (N - \hat{M})\,\bar{y}_{NM}\right] & \text{if } A_2, \\[6pt] \dfrac{1}{n_F}\displaystyle\sum_{i=1}^{n_F} y_i & \text{if } A_3, \end{cases}$$

where $A_1 = \{n_T = n_I\}$, $A_2 = \{n_I < n_T < n_F\} \cup \{n_T = n_F \text{ and } |S_M| = r\}$, and $A_3 = \{n_T = n_F \text{ and } |S_M| < r\}$.

The variance of $\bar{y}_{GIS}$ is

$$V(\bar{y}_{GIS}) = \begin{cases} \dfrac{N - n_I}{N}\,\dfrac{S^2}{n_I} & \text{if } A_1, \\[6pt] V(\bar{y}_I) & \text{if } A_2, \\[6pt] \dfrac{N - n_F}{N}\,\dfrac{S^2}{n_F} & \text{if } A_3. \end{cases}$$

The unbiased estimator of the variance of $\bar{y}_{GIS}$ is

$$\hat{V}(\bar{y}_{GIS}) = \begin{cases} \dfrac{N - n_I}{N}\,\dfrac{s_I^2}{n_I} & \text{if } A_1, \\[6pt] \hat{V}(\bar{y}_I) & \text{if } A_2, \\[6pt] \dfrac{N - n_F}{N}\,\dfrac{s_F^2}{n_F} & \text{if } A_3, \end{cases}$$

where
$$\hat{V}(\bar{y}_I) = \frac{1}{N^2}\sum_{i=1}^{n_T}\sum_{j<i}\frac{\Pr(s \mid i,j)\,\Pr(s) - \Pr(s \mid i)\,\Pr(s \mid j)}{\Pr(s)^2}\left(\frac{y_i}{p_i} - \frac{y_j}{p_j}\right)^2 p_i\,p_j,$$
in which $\Pr(s \mid i,j)$ is the probability of obtaining sample $s$ given that units $i$ and $j$ were selected (in either order) on the first two draws, $s_I^2 = (n_I - 1)^{-1}\sum_{i=1}^{n_I}(y_i - \bar{y}_I)^2$ with $\bar{y}_I = n_I^{-1}\sum_{i=1}^{n_I} y_i$, and $s_F^2 = (n_F - 1)^{-1}\sum_{i=1}^{n_F}(y_i - \bar{y}_F)^2$ with $\bar{y}_F = n_F^{-1}\sum_{i=1}^{n_F} y_i$.
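A sketch of the capped sequential rule and of the classification into the cases $A_1$, $A_2$, $A_3$, under the same illustrative assumptions as before (hypothetical function names, condition $y > 0$):

```python
import random

def general_inverse_sample(y, n_I, r, n_F, condition=lambda v: v > 0, rng=random):
    """Inverse sampling with a cap n_F on the final sample size."""
    order = list(range(len(y)))
    rng.shuffle(order)
    sample, k = order[:n_I], n_I
    while sum(condition(y[i]) for i in sample) < r and k < n_F:
        sample.append(order[k])
        k += 1
    return sample

def classify_case(sample, y, n_I, r, n_F, condition=lambda v: v > 0):
    """Return which branch (A1/A2/A3) of the general inverse estimator applies."""
    n_T = len(sample)
    m = sum(condition(y[i]) for i in sample)   # |S_M|
    if n_T == n_I:
        return "A1"
    if n_T < n_F or m == r:
        return "A2"
    return "A3"                                # cap reached with fewer than r found
```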

3. General Inverse ACS

The integration of general inverse sampling with ACS yields the general inverse ACS (GI-ACS) design. Under GI-ACS, if an initially selected unit satisfies the pre-specified condition C, the entire network to which that unit belongs is included. Consequently, the final sample consists of the sequentially selected initial units and all network members generated through adaptive expansion, together with the edge units encountered during that expansion.
An estimator of $\mu_y$ under the general inverse design with ACS [23] is

$$\bar{y}_{GIS,A} = \begin{cases} \dfrac{1}{n_I}\displaystyle\sum_{i=1}^{n_I} \bar{y}_i^* & \text{if } A_1, \\[6pt] \dfrac{1}{N}\left[\hat{M}\,\bar{y}_M^* + (N - \hat{M})\,\bar{y}_{NM}^*\right] & \text{if } A_2, \\[6pt] \dfrac{1}{n_F}\displaystyle\sum_{i=1}^{n_F} \bar{y}_i^* & \text{if } A_3, \end{cases}$$

where $\bar{y}_i^*$ is the average of the variable of interest over the network that contains unit $i$ of the initial sample, $\bar{y}_i^* = \frac{1}{m_i}\sum_{j \in \psi_i} y_j$, with $\psi_i$ denoting the network that includes unit $i$ and $m_i$ the number of units in that network.

The variance of $\bar{y}_{GIS,A}$ is

$$V(\bar{y}_{GIS,A}) = \begin{cases} \dfrac{N - n_I}{N}\,\dfrac{S^{*2}}{n_I} & \text{if } A_1, \\[6pt] V(\bar{y}_I^*) & \text{if } A_2, \\[6pt] \dfrac{N - n_F}{N}\,\dfrac{S^{*2}}{n_F} & \text{if } A_3, \end{cases}$$

where
$$V(\bar{y}_I^*) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\left(1 - \sum_{s \ni i,j}\frac{\Pr(s \mid i)\,\Pr(s \mid j)}{\Pr(s)}\right)\left(\frac{\bar{y}_i^*}{p_i} - \frac{\bar{y}_j^*}{p_j}\right)^2 p_i\,p_j$$
and $S^{*2} = (N-1)^{-1}\sum_{i=1}^{N}(\bar{y}_i^* - \mu_y)^2$.

The unbiased estimator of the variance of $\bar{y}_{GIS,A}$ is

$$\hat{V}(\bar{y}_{GIS,A}) = \begin{cases} \dfrac{N - n_I}{N}\,\dfrac{s_I^{*2}}{n_I} & \text{if } A_1, \\[6pt] \hat{V}(\bar{y}_I^*) & \text{if } A_2, \\[6pt] \dfrac{N - n_F}{N}\,\dfrac{s_F^{*2}}{n_F} & \text{if } A_3, \end{cases}$$

where
$$\hat{V}(\bar{y}_I^*) = \frac{1}{N^2}\sum_{i=1}^{n_T}\sum_{j<i}\frac{\Pr(s \mid i,j)\,\Pr(s) - \Pr(s \mid i)\,\Pr(s \mid j)}{\Pr(s)^2}\left(\frac{\bar{y}_i^*}{p_i} - \frac{\bar{y}_j^*}{p_j}\right)^2 p_i\,p_j,$$
$s_I^{*2} = (n_I - 1)^{-1}\sum_{i=1}^{n_I}(\bar{y}_i^* - \bar{y}_I^*)^2$ with $\bar{y}_I^* = n_I^{-1}\sum_{i=1}^{n_I}\bar{y}_i^*$, and $s_F^{*2} = (n_F - 1)^{-1}\sum_{i=1}^{n_F}(\bar{y}_i^* - \bar{y}_F^*)^2$ with $\bar{y}_F^* = n_F^{-1}\sum_{i=1}^{n_F}\bar{y}_i^*$.
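The network mean $\bar{y}_i^*$ can be computed by growing the network containing unit $i$ and averaging its $y$-values, for example as below (an illustrative sketch; the four-cell neighborhood, the function name, and the $y > 0$ condition are assumptions):

```python
from collections import deque

def network_mean(grid, cell, condition=lambda v: v > 0):
    """ybar_i*: mean of y over the network containing `cell`.
    A non-qualifying cell is its own network of size one."""
    n_rows, n_cols = len(grid), len(grid[0])
    if not condition(grid[cell[0]][cell[1]]):
        return grid[cell[0]][cell[1]]        # network of size one
    net, queue = {cell}, deque([cell])
    while queue:
        i, j = queue.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if (0 <= ni < n_rows and 0 <= nj < n_cols
                    and (ni, nj) not in net and condition(grid[ni][nj])):
                net.add((ni, nj))
                queue.append((ni, nj))
    return sum(grid[i][j] for i, j in net) / len(net)
```

The GI-ACS estimator $\bar{y}_{GIS,A}$ then applies the general inverse formulas to these network means rather than to raw unit values.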

4. Proposed Estimator in General Inverse ACS

Motivated by Moradi's regression-type estimator [26], formulated under inverse sampling with unit-level observations, the proposed estimator adapts this approach to the GI-ACS design, operating on the network-level means produced by adaptive expansion. Throughout this section, let $\mu_x$ denote the population mean of the auxiliary variable; the estimator $\bar{x}_{GIS,A}$ is computed analogously to $\bar{y}_{GIS,A}$, using the $x$-values.
The modified regression estimator under general inverse ACS is

$$\bar{y}_{R\,GIS,A} = \bar{y}_{GIS,A} + B^*\left(\mu_x - \bar{x}_{GIS,A}\right),$$

where
$$B^* = \frac{\sum_{i=1}^{N}\bar{x}_i^*\bar{y}_i^* - N\mu_y\mu_x}{\sum_{i=1}^{N}\bar{x}_i^{*2} - N\mu_x^2}.$$
The means $\mu_y$ and $\mu_x$ are estimated by $\bar{y}_{GIS,A}$ and $\bar{x}_{GIS,A}$, respectively, whereas the totals $\sum_{i=1}^{N}\bar{x}_i^*\bar{y}_i^*$ and $\sum_{i=1}^{N}\bar{x}_i^{*2}$ are estimated using Murthy's estimator, $\hat{T}_{xy}^* = \sum_{i=1}^{n_T}\frac{\Pr(s \mid i)}{\Pr(s)}\,\bar{x}_i^*\bar{y}_i^*$ and $\hat{T}_{xx}^* = \sum_{i=1}^{n_T}\frac{\Pr(s \mid i)}{\Pr(s)}\,\bar{x}_i^{*2}$. Applying the GI-ACS estimator of Section 3 to the new variables $\bar{x}_i^*\bar{y}_i^*$ and $\bar{x}_i^{*2}$ yields

$$\hat{T}_{xy}^* = \begin{cases} \dfrac{N}{n_I}\displaystyle\sum_{i=1}^{n_I}\bar{x}_i^*\bar{y}_i^* & \text{if } A_1, \\[6pt] \hat{M}\,\overline{(x^*y^*)}_M + (N - \hat{M})\,\overline{(x^*y^*)}_{NM} & \text{if } A_2, \\[6pt] \dfrac{N}{n_F}\displaystyle\sum_{i=1}^{n_F}\bar{x}_i^*\bar{y}_i^* & \text{if } A_3, \end{cases}$$

where $\overline{(x^*y^*)}_M = r^{-1}\sum_{i \in S_M}\bar{x}_i^*\bar{y}_i^*$ and $\overline{(x^*y^*)}_{NM} = (n_T - r)^{-1}\sum_{i \in S_{NM}}\bar{x}_i^*\bar{y}_i^*$, and

$$\hat{T}_{xx}^* = \begin{cases} \dfrac{N}{n_I}\displaystyle\sum_{i=1}^{n_I}\bar{x}_i^{*2} & \text{if } A_1, \\[6pt] \hat{M}\,\overline{(x^*x^*)}_M + (N - \hat{M})\,\overline{(x^*x^*)}_{NM} & \text{if } A_2, \\[6pt] \dfrac{N}{n_F}\displaystyle\sum_{i=1}^{n_F}\bar{x}_i^{*2} & \text{if } A_3, \end{cases}$$

where $\overline{(x^*x^*)}_M = r^{-1}\sum_{i \in S_M}\bar{x}_i^{*2}$ and $\overline{(x^*x^*)}_{NM} = (n_T - r)^{-1}\sum_{i \in S_{NM}}\bar{x}_i^{*2}$.

Therefore, the estimator of $B^*$ is
$$b^* = \frac{\hat{T}_{xy}^* - N\,\bar{y}_{GIS,A}\,\bar{x}_{GIS,A}}{\hat{T}_{xx}^* - N\,\bar{x}_{GIS,A}^2}.$$
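For the $A_1$ branch, where the Murthy weights reduce to $N/n_I$, the slope estimate $b^*$ and the resulting regression estimate can be sketched as follows (an illustrative function of our own; it assumes the network means of the initial sample have already been computed):

```python
def regression_estimate(ybar_nets, xbar_nets, mu_x, N):
    """Regression-type GI-ACS estimate from network means of the initial
    sample, illustrated for the A1 branch (Murthy weights = N / n_I)."""
    n = len(ybar_nets)
    ybar = sum(ybar_nets) / n                 # ybar_{GIS,A} under A1
    xbar = sum(xbar_nets) / n                 # xbar_{GIS,A} under A1
    T_xy = N / n * sum(a * b for a, b in zip(xbar_nets, ybar_nets))
    T_xx = N / n * sum(a * a for a in xbar_nets)
    b = (T_xy - N * ybar * xbar) / (T_xx - N * xbar ** 2)
    return ybar + b * (mu_x - xbar)
```

Under $A_2$, the same formula applies with the sums replaced by the $\hat{M}$-weighted versions given in the text.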
The bias of $\bar{y}_{R\,GIS,A}$ follows from

$$E(\bar{y}_{R\,GIS,A}) = E(\bar{y}_{GIS,A}) + E\left[b^*\left(\mu_x - \bar{x}_{GIS,A}\right)\right] = E(\bar{y}_{GIS,A}) + \mu_x E(b^*) - E(b^*\,\bar{x}_{GIS,A}).$$

Since $\bar{y}_{GIS,A}$ and $\bar{x}_{GIS,A}$ are unbiased estimators of $\mu_y$ and $\mu_x$, respectively, we have

$$E(\bar{y}_{R\,GIS,A}) = \mu_y + E(\bar{x}_{GIS,A})E(b^*) - E(b^*\,\bar{x}_{GIS,A}) = \mu_y - \mathrm{Cov}(b^*, \bar{x}_{GIS,A}).$$

Therefore, the bias of $\bar{y}_{R\,GIS,A}$ is

$$\mathrm{Bias}(\bar{y}_{R\,GIS,A}) = E(\bar{y}_{R\,GIS,A}) - \mu_y = -\mathrm{Cov}(b^*, \bar{x}_{GIS,A}).$$

From $\bar{y}_{R\,GIS,A} = \bar{y}_{GIS,A} + b^*(\mu_x - \bar{x}_{GIS,A}) = \bar{y}_{GIS,A} - b^*\bar{x}_{GIS,A} + b^*\mu_x$, let $z_i^* = \bar{y}_i^* - B^*\bar{x}_i^*$, with $B^*$ estimated by $b^*$. Then $\bar{y}_{R\,GIS,A} = \bar{z}_{GIS,A} + b^*\mu_x$.
The variance of $\bar{y}_{R\,GIS,A}$ is

$$V(\bar{y}_{R\,GIS,A}) = \begin{cases} \dfrac{N - n_I}{N}\,\dfrac{S_z^{*2}}{n_I} & \text{if } A_1, \\[6pt] V(\bar{y}_{RI}^*) & \text{if } A_2, \\[6pt] \dfrac{N - n_F}{N}\,\dfrac{S_z^{*2}}{n_F} & \text{if } A_3, \end{cases}$$

where $S_z^{*2} = \frac{1}{N-1}\sum_{i=1}^{N}(z_i^* - \mu_z)^2$ with $\mu_z = \mu_y - B^*\mu_x$, and $V(\bar{y}_{RI}^*)$ is derived from the variance of Murthy's estimator [27], whose proof applies the Rao–Blackwell theorem:

$$V(\bar{y}_{RI}^*) = V\left(E[\bar{z}^* \mid s]\right) = V(\bar{z}^*) - E\left[V(\bar{z}^* \mid s)\right],$$

where $\bar{z}^* = \frac{1}{N}\sum_{i=1}^{N}\frac{z_i^*}{p_i}I_i$ and
$$V(\bar{z}^*) = \frac{1}{N^2}\sum_{i=1}^{N}V\!\left(\frac{z_i^*}{p_i}I_i\right) + \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j \ne i}\mathrm{Cov}\!\left(\frac{z_i^*}{p_i}I_i,\; \frac{z_j^*}{p_j}I_j\right).$$
Here $I_i$ is an indicator function that equals 1 when unit $i$ is selected first and 0 otherwise, so $I_iI_j = 0$ for $i \ne j$, $E(I_iI_j) = 0$, and $\mathrm{Cov}(I_i, I_j) = -p_ip_j$; hence
$$V(\bar{z}^*) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\left(\frac{z_i^*}{p_i} - \frac{z_j^*}{p_j}\right)^2 p_i\,p_j.$$
For the conditional variance,
$$V(\bar{z}^* \mid s) = \frac{1}{N^2}\,V\!\left(\sum_{i=1}^{N}\frac{z_i^*}{p_i}I_i \;\middle|\; s\right) = \frac{1}{N^2}\sum_{i=1}^{N}\frac{z_i^{*2}}{p_i^2}\left[\Pr(I_i = 1 \mid s) - \Pr(I_i = 1 \mid s)^2\right] - \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j \ne i}\frac{z_i^*z_j^*}{p_ip_j}\Pr(I_i = 1 \mid s)\Pr(I_j = 1 \mid s)$$
$$= \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\frac{\Pr(s \mid i)\,\Pr(s \mid j)}{\Pr(s)^2}\left(\frac{z_i^*}{p_i} - \frac{z_j^*}{p_j}\right)^2 p_i\,p_j,$$
which gives
$$E\left[V(\bar{z}^* \mid s)\right] = \frac{1}{N^2}\sum_{s}\Pr(s)\sum_{i=1}^{N}\sum_{j<i}\frac{\Pr(s \mid i)\,\Pr(s \mid j)}{\Pr(s)^2}\left(\frac{z_i^*}{p_i} - \frac{z_j^*}{p_j}\right)^2 p_i\,p_j = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\left(\sum_{s \ni i,j}\frac{\Pr(s \mid i)\,\Pr(s \mid j)}{\Pr(s)}\right)\left(\frac{z_i^*}{p_i} - \frac{z_j^*}{p_j}\right)^2 p_i\,p_j.$$
Therefore,
$$V(\bar{y}_{RI}^*) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\left(1 - \sum_{s \ni i,j}\frac{\Pr(s \mid i)\,\Pr(s \mid j)}{\Pr(s)}\right)\left(\frac{z_i^*}{p_i} - \frac{z_j^*}{p_j}\right)^2 p_i\,p_j.$$
Under the GI-ACS design, the inclusion probabilities are determined by the sampling design. As such, these probabilities are treated as design-based quantities, not as random variables drawn from a specific probability distribution. Independence among inclusion events is not assumed. Adaptive expansion naturally creates dependence among units in the same network. The proposed estimator and its variance rely solely on assumptions of general inverse sampling. No extra distributional assumptions are required beyond these design-based properties.
The estimator of the variance of $\bar{y}_{R\,GIS,A}$ is

$$\hat{V}(\bar{y}_{R\,GIS,A}) = \begin{cases} \dfrac{N - n_I}{N}\,\dfrac{s_{zI}^{*2}}{n_I} & \text{if } A_1, \\[6pt] \hat{V}(\bar{y}_{RI}^*) & \text{if } A_2, \\[6pt] \dfrac{N - n_F}{N}\,\dfrac{s_{zF}^{*2}}{n_F} & \text{if } A_3, \end{cases}$$
where the unbiased estimator of $V(\bar{y}_{RI}^*)$ is
$$\hat{V}(\bar{y}_{RI}^*) = \frac{1}{N^2}\sum_{i=1}^{n_T}\sum_{j<i}\frac{\Pr(s \mid i,j)\,\Pr(s) - \Pr(s \mid i)\,\Pr(s \mid j)}{\Pr(s)^2}\left(\frac{z_i^*}{p_i} - \frac{z_j^*}{p_j}\right)^2 p_i\,p_j,$$
with $\Pr(s \mid i,j)$ the probability of obtaining sample $s$ given that units $i$ and $j$ were selected (in either order) on the first two draws. From Murthy [27],
this estimator expands into a weighted sum of the pairwise squared differences $\left(z_i^* - z_j^*\right)^2$ taken within $S_M$, across $S_M$ and $S_{NM}$, and within $S_{NM}$, which simplifies to

$$\hat{V}(\bar{y}_{RI}^*) = \alpha\, s_M^{*2} + \frac{1}{N^2}\,\hat{V}(\hat{M})\left(\bar{z}_M^* - \bar{z}_{NM}^*\right)^2 + \gamma\, s_{NM}^{*2},$$

where $\alpha$ and $\gamma$ are design coefficients depending on $N$, $n_T$, and $r$,
$$\alpha = \frac{\hat{M}}{N^2}\left[\frac{(N - n_T + 1)(n_T - r)}{n_T\, r} - \frac{(n_T - r)(N - n_T)}{N\, r}\cdot\frac{n_T - 2}{r - 1}\right], \qquad \gamma = \frac{(N - n_T + 1)(n_T - r - 1)}{N(n_T - 1)(n_T - 2)},$$
$$\hat{V}(\hat{M}) = \frac{N(N - n_T + 1)(r - 1)(n_T - r)}{(n_T - 1)^2(n_T - 2)} = \frac{N - n_T + 1}{N}\cdot\frac{\hat{M}(N - \hat{M})}{n_T - 2},$$
$$s_M^{*2} = \frac{1}{r - 1}\sum_{i \in S_M}\left(z_i^* - \bar{z}_M^*\right)^2, \qquad \bar{z}_M^* = \frac{1}{r}\sum_{i \in S_M} z_i^*,$$
$$s_{NM}^{*2} = \frac{1}{n_T - r - 1}\sum_{i \in S_{NM}}\left(z_i^* - \bar{z}_{NM}^*\right)^2, \qquad \bar{z}_{NM}^* = \frac{1}{n_T - r}\sum_{i \in S_{NM}} z_i^*,$$
$$s_{zI}^{*2} = \frac{1}{n_I - 1}\sum_{i=1}^{n_I}\left(z_i^* - \bar{z}_I^*\right)^2, \qquad \bar{z}_I^* = \frac{1}{n_I}\sum_{i=1}^{n_I} z_i^*, \quad \text{and}$$
$$s_{zF}^{*2} = \frac{1}{n_F - 1}\sum_{i=1}^{n_F}\left(z_i^* - \bar{z}_F^*\right)^2, \qquad \bar{z}_F^* = \frac{1}{n_F}\sum_{i=1}^{n_F} z_i^*.$$

5. Simulation Studies and Discussion

Simulation studies were conducted under three correlation scenarios between the auxiliary variable (X) and the variable of interest (Y):
Scenario I: High correlation;
Scenario II: Moderate correlation;
Scenario III: Low correlation, with Y representing real data and X simulated accordingly.
Population I: A Poisson cluster process [28] was used to generate the auxiliary variable $X$ and the variable of interest $Y$ over a 20 × 20 grid (400 units). The population mean of the $y$-values was 59.7375, and the correlation coefficient between the $y$-data and $x$-data was 0.9973. The criterion for including neighboring units was $C = \{y : y > 0\}$. Population I is shown in Figure 1 and Figure 2.
Figure 1. The variable of interest ( y ) for Population I. The shaded regions in different colors denote distinct networks.
Figure 2. The auxiliary variable ( x ) for Population I. The position of the network is aligned with the data y in Figure 1.
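A Poisson cluster (parent–offspring) process of this general kind can be simulated roughly as below. The parameter values, the Gaussian offspring dispersal, and Knuth's Poisson sampler are our own illustrative choices, not the exact process of [28]:

```python
import math
import random

def _poisson(lam, rng):
    """Knuth's method: count uniform draws until their product falls below e^{-lam}."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def poisson_cluster_population(size=20, n_parents=4, mean_children=40,
                               spread=1.2, seed=0):
    """Counts on a size x size grid: parents are uniform on the grid, each
    parent spawns a Poisson number of offspring scattered around it, and
    every offspring increments the count of the cell it lands in."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    for _ in range(n_parents):
        px, py = rng.uniform(0, size - 1), rng.uniform(0, size - 1)
        for _ in range(_poisson(mean_children, rng)):
            i = min(size - 1, max(0, round(rng.gauss(px, spread))))
            j = min(size - 1, max(0, round(rng.gauss(py, spread))))
            grid[i][j] += 1
    return grid
```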
Population II: Based on the method of Chao [12], this population was generated using a linked-pairs process combined with a bivariate Poisson cluster process over a 20 × 20 grid (400 units). The population was more weakly clustered than Population I, with several single-unit networks. The population mean of $y$ was 0.6475, and the Pearson correlation coefficient between $x$ and $y$ was 0.7070. The criterion for including neighboring units was $C = \{y : y > 0\}$. Population II is shown in Figure 3 and Figure 4.
Figure 3. The variable of interest (y) for Population II. The shaded regions in different colors denote distinct networks.
Figure 4. The auxiliary variable (x) for Population II. The position of the network is aligned with the data y in Figure 3.
Population III: This population was derived from the real blue-winged teal dataset [29]. The study area (5000 km² in central Florida) was divided into 50 units of 100 km² each, and these counts were assigned as $y$-values. The corresponding $x$-values [30] were simulated using the model $x_i = 4\bar{y}_i^* + \gamma_i$, where $\gamma_i \sim N(0,\ \bar{y}_i^* \times y_i)$, with $x = 0$ assumed whenever $y = 0$. The population mean of $y$ was 282.420, and the Pearson correlation coefficient between $x$ and $y$ was 0.4733. The criterion for including neighboring units was $C = \{y : y > 0\}$. Population III is shown in Figure 5 and Figure 6.
Figure 5. The variable of interest (y) for Population III (blue-winged teal data). The shaded regions in different colors denote distinct networks.
Figure 6. The auxiliary variable (x) for Population III. The position of the network is aligned with the data y in Figure 5.
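The auxiliary-variable model for Population III can be sketched as follows (assuming the model reads $x_i = 4\bar{y}_i^* + \gamma_i$ with $\gamma_i \sim N(0,\ \bar{y}_i^* \times y_i)$, as reconstructed from the text; the function name is hypothetical):

```python
import math
import random

def simulate_auxiliary(y, ybar_net, seed=0):
    """Simulate x_i = 4 * ybar_i* + gamma_i with gamma_i ~ N(0, ybar_i* * y_i),
    setting x_i = 0 wherever y_i = 0."""
    rng = random.Random(seed)
    x = []
    for yi, ybi in zip(y, ybar_net):
        if yi == 0:
            x.append(0.0)
        else:
            # rng.gauss takes the standard deviation, hence the square root
            x.append(4 * ybi + rng.gauss(0.0, math.sqrt(ybi * yi)))
    return x
```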
A total of 10,000 iterations were performed for each estimator. $E(n_T)$ denotes the expected sequential sample size under general inverse sampling, and $E(\nu)$ denotes the expected final sample size.
The estimated variance of each estimator is defined as
$$\hat{V}(\bar{y}_{GIS,A}) = \frac{1}{10{,}000 - 1}\sum_{i=1}^{10{,}000}\left(\bar{y}_{GIS,A}^{(i)} - \frac{1}{10{,}000}\sum_{j=1}^{10{,}000}\bar{y}_{GIS,A}^{(j)}\right)^2,$$
$$\hat{V}(\bar{y}_{R\,GIS,A}) = \frac{1}{10{,}000 - 1}\sum_{i=1}^{10{,}000}\left(\bar{y}_{R\,GIS,A}^{(i)} - \frac{1}{10{,}000}\sum_{j=1}^{10{,}000}\bar{y}_{R\,GIS,A}^{(j)}\right)^2.$$
The relative efficiency of the proposed estimator, compared with $\bar{y}_{GIS,A}$, is defined as
$$RE(\bar{y}_{R\,GIS,A}) = \frac{\hat{V}(\bar{y}_{GIS,A})}{\hat{V}(\bar{y}_{R\,GIS,A})}.$$
Since the proposed estimator $\bar{y}_{R\,GIS,A}$ is biased, the estimated absolute relative bias is defined as
$$ARB(\bar{y}_{R\,GIS,A}) = \left|\frac{1}{10{,}000}\sum_{i=1}^{10{,}000}\frac{\bar{y}_{R\,GIS,A}^{(i)} - \mu_y}{\mu_y}\right|.$$
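These Monte-Carlo summaries translate directly into code (illustrative helper names of our own; each `estimates` list holds the 10,000 replicate values of an estimator):

```python
def empirical_variance(estimates):
    """Sample variance of an estimator over its Monte-Carlo replicates."""
    n = len(estimates)
    mean = sum(estimates) / n
    return sum((e - mean) ** 2 for e in estimates) / (n - 1)

def relative_efficiency(est_plain, est_regression):
    """RE > 1 means the regression-type estimator is the more efficient one."""
    return empirical_variance(est_plain) / empirical_variance(est_regression)

def absolute_relative_bias(estimates, mu_y):
    """|mean of replicates - mu_y| / mu_y."""
    mean = sum(estimates) / len(estimates)
    return abs(mean - mu_y) / mu_y
```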

Discussion

Based on the data analyzed in this study, the variable of interest and the auxiliary variable exhibit a positive linear relationship. The selected values of the number of rare units r and the initial sample size n I under the GI-ACS design represent practical ranges that support feasible sampling while enabling meaningful variation for evaluating estimator performance under different levels of network expansion.
The results of the study are as follows:
Based on the results presented in Table 1, since the proposed estimator y ¯ R G I S , A is biased, the empirical evidence indicates that when the variable of interest and the auxiliary variable are highly correlated, an increase in both r and n I leads to a decrease in the estimated absolute relative bias. Conversely, when the correlation between the variables is low to moderate, the estimated absolute relative bias of the proposed estimator does not show a significant reduction.
Table 1. The estimated absolute relative bias of the proposed estimator for the population mean of the variable of interest.
Based on the results presented in Table 2, the estimated variance of all estimators decreases as r and n I increase. Across all scenarios, the proposed estimator y ¯ R G I S , A , which incorporates auxiliary information, consistently yields lower estimated variance than the estimator based solely on the variable of interest y ¯ G I S , A . The lower variance of the proposed estimator relative to y ¯ G I S , A across all scenarios reinforces the stabilizing effect of the auxiliary variable. The proposed estimator incorporates regression adjustment at the network level. In Population I and Population III, where network-level variability is prominent, the auxiliary variable significantly enhances the predictive strength. Even in Population II, where the correlation is moderate and clustering is weaker, the proposed estimator maintains lower variance, indicating that auxiliary information offers robustness against variability induced by both sequential sampling and adaptive expansion. Moreover, the relative efficiency of the proposed estimator is greater than one in all cases, indicating that y ¯ R G I S , A is more efficient than y ¯ G I S , A . The regression-type estimators remain effective even under GI-ACS, providing substantial improvement over estimators that rely solely on the study variable.
Table 2. The estimated variance of the estimators for the population mean of the variable of interest and the relative efficiency of the proposed estimator compared with y ¯ G I S , A .
Across all three simulation scenarios, the relative bias decreases and the estimated variance declines as the sample size increases, indicating that the proposed estimator under GI-ACS tends to approach the population mean for the variable of interest. This behavior indicates that the regression-based estimator exhibits design consistency within the GI-ACS framework.

6. Conclusions

General inverse sampling was incorporated into adaptive cluster sampling (ACS) in [23], and this design is referred to as the general inverse ACS (GI-ACS) design. In this study, we developed a regression-type estimator for GI-ACS that incorporates auxiliary information to improve the estimation of the population mean in rare and spatially clustered populations. Analytical results established the bias and variance properties of the proposed estimator, and simulation studies conducted under three correlation scenarios demonstrated clear efficiency gains over the existing GI-ACS estimator that does not employ auxiliary variables.
The findings indicate that, across all investigated scenarios, the proposed estimator is consistently more efficient than the estimator of the population mean based solely on the study variable. Moreover, when the auxiliary variable is highly correlated with the variable of interest, the proposed estimator exhibits noticeably smaller bias compared to situations in which the correlation is low to moderate. These results suggest that the proposed estimator under the GI-ACS framework is particularly advantageous when a strong correlation exists between the auxiliary and study variables. Overall, the findings highlight the practical value of auxiliary information in improving the stability and precision of estimators within the GI-ACS design.
The main methodological contribution of this work lies in extending regression-type estimation to the GI-ACS framework, thereby addressing a key limitation of existing GI-ACS estimators, which rely exclusively on the study variable. By incorporating auxiliary information, the proposed estimator provides enhanced precision and robustness across a range of sampling conditions.
In this study, the proposed estimator utilized a single auxiliary variable. Future research should focus on extensions that incorporate additional auxiliary information, such as the coefficient of variation of the auxiliary variable, the correlation coefficient between the auxiliary variable and the variable of interest, or multiple auxiliary variables. Such extensions could further improve the efficiency of estimators under the GI-ACS framework.

Author Contributions

Conceptualization, N.C. and P.G.; methodology, S.W. and N.C.; software, N.C.; validation, M.C. and C.B.; formal analysis, N.C. and M.C.; investigation, S.W. and P.G.; resources, N.C. and C.B.; data curation, N.C. and C.B.; writing—original draft preparation, N.C. and M.C.; writing—review and editing, P.G. and S.W.; visualization, P.G. and C.B.; supervision, N.C. and P.G.; project administration, N.C.; funding acquisition, N.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research project was financially supported by Mahasarakham University.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

References

  1. Thompson, S.K. Adaptive cluster sampling. J. Am. Stat. Assoc. 1990, 85, 1050–1059. [Google Scholar] [CrossRef]
  2. Magnussen, S.; Kurz, W.; Leckie, D.G.; Paradine, D. Adaptive cluster sampling for estimation of deforestation rates. Eur. J. For. Res. 2005, 124, 207–220. [Google Scholar] [CrossRef]
  3. Noon, B.R.; Ishwar, N.M.; Vasudevan, K. Efficiency of adaptive cluster and random sampling in detecting terrestrial herpetofauna in a tropical rainforest. Wildl. Soc. Bull. 2006, 34, 59–68. [Google Scholar] [CrossRef]
  4. Sullivan, W.P.; Morrison, B.J.; Beamish, F.W.H. Adaptive cluster sampling: Estimating density of spatially autocorrelated larvae of the sea lamprey with improved precision. J. Great Lakes Res. 2008, 34, 86–97. [Google Scholar] [CrossRef]
  5. Smith, D.R.; Villella, R.F.; Lemarié, D.P. Application of adaptive cluster sampling to low-density populations of freshwater mussels. Environ. Ecol. Stat. 2003, 10, 7–15. [Google Scholar] [CrossRef]
  6. Conners, M.E.; Schwager, S.J. The use of adaptive cluster sampling for hydroacoustic surveys. ICES J. Mar. Sci. 2002, 59, 1314–1325. [Google Scholar] [CrossRef]
  7. Olayiwola, O.M.; Ajayi, A.O.; Onifade, O.C.; Wale-Orojo, O.; Ajibade, B. Adaptive cluster sampling with model based approach for estimating total number of Hidden COVID-19 carriers in Nigeria. Stat. J. IAOS 2020, 36, 103–109. [Google Scholar] [CrossRef]
  8. Chandra, G.; Tiwari, N.; Nautiyal, R. Adaptive cluster sampling-based design for estimating COVID-19 cases with random samples. Curr. Sci. 2021, 120, 1204–1210. [Google Scholar] [CrossRef]
  9. Stehlík, M.; Kiseľák, J.; Dinamarca, A.; Alvarado, E.; Plaza, F.; Medina, F.A.; Stehlíková, S.; Marek, J.; Venegas, B.; Gajdoš, A.; et al. REDACS: Regional emergency-driven adaptive cluster sampling for effective COVID-19 management. Stoch. Anal. Appl. 2022, 41, 474–508. [Google Scholar] [CrossRef] [PubMed]
  10. Hwang, J.; Bose, N.; Fan, S. AUV adaptive sampling methods: A Review. Appl. Sci. 2019, 9, 3145. [Google Scholar] [CrossRef]
  11. Giouroukis, D.; Dadiani, A.; Traub, J.; Zeuch, S.; Markl, V. A survey of adaptive sampling and filtering algorithms for the internet of things. In Proceedings of the 14th ACM International Conference on Distributed and Event-Based Systems, Montreal, QC, Canada, 13–17 July 2020; pp. 27–38. [Google Scholar] [CrossRef]
  12. Chao, C.T. Ratio estimation on adaptive cluster sampling. J. Chin. Stat. Assoc. 2004, 42, 307–327. [Google Scholar] [CrossRef]
  13. Dryver, A.L.; Chao, C.T. Ratio estimators in adaptive cluster sampling. Environmetric 2007, 18, 607–620. [Google Scholar] [CrossRef]
  14. Chutiman, N.; Kumphon, B. Ratio estimator using two auxiliary variables for adaptive cluster sampling. Thail. Stat. 2008, 6, 241–256. [Google Scholar]
  15. Chutiman, N. Adaptive cluster sampling using auxiliary variable. J. Math. Stat. 2013, 9, 249–255. [Google Scholar] [CrossRef]
  16. Yadav, S.K.; Misra, S.; Mishra, S. Efficient estimator for population variance using auxiliary variable. Am. J. Oper. Res. 2016, 6, 9–15. [Google Scholar] [CrossRef]
  17. Chaudhry, M.S.; Hanif, M. Generalized exponential-cum-exponential estimator in adaptive cluster sampling. Pak. J. Stat. Oper. Res. 2015, 11, 553–574. [Google Scholar] [CrossRef]
  18. Chaudhry, M.S.; Hanif, M. Generalized difference-cum-exponential estimator in adaptive cluster sampling. Pak. J. Stat. 2017, 33, 335–367. [Google Scholar]
  19. Bhat, A.A.; Sharma, M.; Shah, M.; Bhat, M. Generalized ratio type estimator under adaptive cluster sampling. J. Sci. Res. 2023, 67, 46–51. [Google Scholar] [CrossRef]
  20. Yasmeen, U.; Thompson, M. Variance estimation in adaptive cluster sampling. Commun. Stat. Theory Methods 2019, 49, 2485–2497. [Google Scholar] [CrossRef]
  21. Shahzad, U.; Ahmad, I.; Al-Noor, N.H.; Benedict, T.J. Use of calibration constraints and linear moments for variance estimation under stratified adaptive cluster sampling. Soft Comput. 2022, 26, 11185–11196. [Google Scholar] [CrossRef]
  22. Christman, M.C.; Lan, F. Inverse adaptive cluster sampling. Biometrics 2001, 57, 1096–1105. [Google Scholar] [CrossRef] [PubMed]
  23. Salehi, M.; Seber, G.A.F. A general inverse sampling scheme and its application to adaptive cluster sampling. Aust. N. Z. J. Stat. 2004, 46, 483–494. [Google Scholar] [CrossRef]
  24. Pochai, N. An improved the estimator in inverse adaptive cluster sampling. Thail. Stat. 2008, 6, 15–26. [Google Scholar]
  25. Sangngam, P. Unequal probability inverse adaptive cluster sampling. Chiang Mai J. Sci. 2013, 40, 736–742. [Google Scholar]
  26. Moradi, M.; Salehi, M.; Brown, J.A.; Karimi, N. Regression estimator under inverse sampling to estimate arsenic contamination. Environmetrics 2011, 22, 894–900. [Google Scholar] [CrossRef]
  27. Salehi, M.; Seber, G.A.F. Theory & methods: A new proof of Murthy’s estimator which applies to sequential sampling. Aust. N. Z. J. Stat. 2001, 43, 281–286. [Google Scholar]
  28. Subzar, M.; Alqurashi, T.; Chandawat, D.; Tamboli, S.; Raja, T.A.; Attri, A.K.; Wani, S.A. Generalized robust regression techniques and adaptive cluster sampling for efficient estimation of population mean in case of rare and clustered populations. Sci. Rep. 2025, 15, 2069. [Google Scholar] [CrossRef]
  29. Smith, D.R.; Conroy, M.J.; Brakhage, D.H. Efficiency of adaptive cluster sampling for estimating density of wintering waterfowl. Biometrics 1995, 51, 777–788. [Google Scholar] [CrossRef]
  30. Pochai, N. Double and Resampling in Adaptive Cluster Sampling. Doctoral Dissertation, National Institute of Development Administration, Bangkok, Thailand, 2006. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
