Article

Sharma–Taneja–Mittal Entropy and Its Application to Obesity in Saudi Arabia

by Hanan H. Sakr 1,2,* and Mohamed Said Mohamed 2

1 Department of Management Information Systems, College of Business Administration in Hawtat Bani Tamim, Prince Sattam Bin Abdulaziz University, Saudi Arabia
2 Mathematics Department, Faculty of Education, Ain Shams University, Cairo 11341, Egypt
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(17), 2639; https://doi.org/10.3390/math12172639
Submission received: 6 June 2024 / Revised: 6 July 2024 / Accepted: 10 July 2024 / Published: 25 August 2024
(This article belongs to the Special Issue Nonparametric Statistical Methods and Their Applications)

Abstract: This paper presents several nonparametric estimators for the Sharma–Taneja–Mittal entropy measure of a continuous random variable with known support, based on sample spacings, a local linear model, and a kernel function. The properties of these estimators are discussed, and their performance is examined through real data analysis and Monte Carlo simulations. In the Monte Carlo experiments, the proposed Sharma–Taneja–Mittal entropy estimators are employed to construct a goodness-of-fit test of the standard uniform distribution. The suggested test statistics perform well, as evidenced by a comparison of their power with that of other tests for uniformity. Finally, we examine a classification problem in pattern recognition to underscore the significance of these measures.

1. Introduction

The foundation of information theory commenced with Shannon's introduction of the concept of differential entropy, or Shannon entropy (cf. [1]). The uncertainty linked to a random variable can be assessed through information measures. In various real-life settings, such as experimental physics, neuroscience, data analysis, demography, econometrics, and entropy coding, gauging the uncertainty tied to a random variable is of significant importance. For an absolutely continuous non-negative random variable Y, the differential entropy is given by

$$ S_n(Y) = -\int_0^{\infty} h(y) \ln h(y)\, dy, \tag{1} $$

where $h(y)$ represents the probability density function (PDF). Furthermore, the cumulative residual entropy, as given by Rao et al. [2], is defined as

$$ CS_n(Y) = -\int_0^{\infty} \bar{H}(y) \ln \bar{H}(y)\, dy, \tag{2} $$

where $H(y)$ is the cumulative distribution function (CDF) and $\bar{H}(y) = 1 - H(y)$.
Tsallis [3] introduced the concept of Tsallis entropy as one of several generalizations of uncertainty. The continuous Tsallis entropy measure of a continuous random variable Y, for $\theta > 0$, $\theta \neq 1$, is expressed as

$$ T_n^{\theta}(Y) = \frac{1}{\theta - 1}\left(1 - \int_0^{\infty} h^{\theta}(y)\, dy\right). \tag{3} $$

In the limit, $\lim_{\theta \to 1} T_n^{\theta}(Y) = S_n(Y)$.
Sharma–Taneja–Mittal entropy was introduced independently by Sharma and Taneja [4] and Mittal [5] in the following form:

$$ STM_n^{\theta_1,\theta_2}(Y) = \frac{1}{\theta_1 - \theta_2} \int_0^{\infty} \left( h^{\theta_1}(y) - h^{\theta_2}(y) \right) dy, \tag{4} $$

where $\theta_1 \neq \theta_2 > 0$. Moreover, if $\theta_2 = 1$, then $STM_n^{\theta_1,1}(Y) = -T_n^{\theta_1}(Y)$, so the Sharma–Taneja–Mittal entropy measure can be regarded as an extension of the negative Tsallis entropy. Recently, Kattumannil et al. [6] introduced the Sharma–Taneja–Mittal cumulative residual and cumulative entropies, respectively, as follows:

$$ CRSTM_n^{\theta_1,\theta_2}(Y) = \frac{1}{\theta_1 - \theta_2} \int_0^{\infty} \left( \bar{H}^{\theta_1}(y) - \bar{H}^{\theta_2}(y) \right) dy, \tag{5} $$

$$ CSTM_n^{\theta_1,\theta_2}(Y) = \frac{1}{\theta_1 - \theta_2} \int_0^{\infty} \left( H^{\theta_1}(y) - H^{\theta_2}(y) \right) dy, \tag{6} $$
where $\theta_1 \neq \theta_2 > 0$. Recently, Sudheesh et al. [7] expressed measures (4) and (6) in terms of probability-weighted moments (PWMs); inferential methods can then be developed from the established results on PWMs.
Numerous researchers have introduced nonparametric methods for estimating information measures. For measures that rely on the PDF, Vasicek [8] obtained his entropy estimator by expressing Equation (1) in the alternative form

$$ S_n(Y) = -\int_0^{\infty} h(y) \ln h(y)\, dy = \int_0^1 \ln\left\{ \frac{d}{d\delta} H^{-1}(\delta) \right\} d\delta. \tag{7} $$
The derivative was then replaced by a difference operator, and the empirical CDF $H_n$ was employed in place of $H$; an estimate of the derivative of $H^{-1}(\delta)$ was thus obtained as a function of the order statistics. Formally, given the order statistics $Y_{(1)} \leq \cdots \leq Y_{(n)}$ from the random sample $Y_1, \ldots, Y_n$, Vasicek's estimator is expressed as

$$ S_n(H_n) = \frac{1}{n} \sum_{k=1}^{n} \ln\left( \frac{n}{2r} \left( Y_{(k+r)} - Y_{(k-r)} \right) \right), \tag{8} $$

where the window size $r$ is a positive integer with $r < n/2$, $Y_{(k)} = Y_{(1)}$ if $k < 1$, and $Y_{(k)} = Y_{(n)}$ if $k > n$. Subsequently, Ebrahimi et al. [9] highlighted the shortcomings of (8) at the boundaries and proposed a modified entropy estimator:

$$ S_n(H_n) = \frac{1}{n} \sum_{k=1}^{n} \ln\left( \frac{n}{T_k r} \left( Y_{(k+r)} - Y_{(k-r)} \right) \right), \tag{9} $$
where
$$ T_k = \begin{cases} 1 + \dfrac{k-1}{r}, & 1 \leq k \leq r, \\[4pt] 2, & r+1 \leq k \leq n-r, \\[4pt] 1 + \dfrac{n-k}{r}, & n-r+1 \leq k \leq n. \end{cases} \tag{10} $$
It is worth mentioning that the majority of the tests outlined above employ Vasicek's entropy estimator because of its simplicity and relative accuracy. For instance, Noughabi and Jarrahiferiz [10] and Qiu and Jia [11] estimated the extropy measure using this procedure, and Wachowiak et al. [12] used sample spacings to estimate generalized entropies such as the Tsallis and Rényi entropies.
This article investigates the nonparametric estimation of the Sharma–Taneja–Mittal entropy measure using different techniques. We use real and simulated data to compare the behavior of these nonparametric estimators with the theoretical Sharma–Taneja–Mittal entropy measure. Moreover, we apply the estimators to uniformity testing and to a classification problem in pattern recognition.

Work Motivation

Tsallis introduced the entropy of order θ, known as Tsallis entropy, which plays a significant role in measuring the uncertainty of random variables and leads to nonextensive statistics. This form of entropy underpins nonextensive statistical mechanics, a generalization of the Boltzmann–Gibbs theory. The Sharma–Taneja–Mittal entropy measure can be regarded as an extension of Tsallis entropy, so its applications cover those of Tsallis entropy. Among the many kinds of probability distributions, the uniform distribution is one of the simplest: it assigns equal probability density across a specified range, which makes it particularly useful as a reference distribution. One of its primary applications is in generating random numbers. Additionally, in economics, demand and replenishment often do not conform to the typical normal distribution, so alternative distribution models are used to more accurately predict probabilities and trends. Wanke [13] asserts that a uniform distribution is more efficient for assessing lead time in inventory management during the early stages of a product's lifecycle, when a new product is under analysis. Moreover, social scientists employ the uniform distribution to represent an absence of information: when the distribution is unknown in a simulation, uniform random variates are frequently used. The uniform distribution is also applied to model the measurement error of certain instruments or measuring systems. These reasons contribute to the growing interest in simple and computationally efficient tests of the hypothesis that an analyzed sample is uniformly distributed; uniformity testing therefore merits a detailed discussion.
The remainder of this paper proceeds as follows. In Section 2, the suggested nonparametric estimators, based on spacings, a local linear model, and a kernel function, are presented. The investigation through simulation and real data is presented in Section 3. In Section 4, we test for uniformity by comparing critical values and power between our methods and several other established approaches. Finally, in Section 5, we utilize the discrete Sharma–Taneja–Mittal entropy to address a classification problem.

2. The Proposed Nonparametric Estimators

In this section, we suggest several nonparametric estimators for the Sharma–Taneja–Mittal entropy and subsequently assess their efficacy against prominent alternatives. Assume that $Y_{(1)} \leq \cdots \leq Y_{(n)}$ are the order statistics of a random sample of size n drawn from an unspecified continuous CDF H with PDF h. Table 1 shows the Sharma–Taneja–Mittal entropy for some popular distributions.

2.1. First and Second Procedures

Inspired by the methods in (8) and (9) for estimating the entropy function, we can estimate the Sharma–Taneja–Mittal entropy as follows (a minimal numerical sketch of both estimators follows this list):
  • From (4), a Vasicek-type estimator of the Sharma–Taneja–Mittal entropy measure is
$$ STM_{n1}^{\theta_1,\theta_2}(h_n, Y) = \frac{1}{n(\theta_1 - \theta_2)} \sum_{k=1}^{n} \left[ \left( \frac{n}{2r} \left( Y_{(k+r)} - Y_{(k-r)} \right) \right)^{1-\theta_1} - \left( \frac{n}{2r} \left( Y_{(k+r)} - Y_{(k-r)} \right) \right)^{1-\theta_2} \right], \tag{11} $$
  • From (4) and (9), an Ebrahimi-type estimator is
$$ STM_{n2}^{\theta_1,\theta_2}(h_n, Y) = \frac{1}{n(\theta_1 - \theta_2)} \sum_{k=1}^{n} \left[ \left( \frac{n}{T_k r} \left( Y_{(k+r)} - Y_{(k-r)} \right) \right)^{1-\theta_1} - \left( \frac{n}{T_k r} \left( Y_{(k+r)} - Y_{(k-r)} \right) \right)^{1-\theta_2} \right], \tag{12} $$
where $\theta_1, \theta_2 > 0$, $\theta_1 \neq \theta_2$, the window size $r$ is a positive integer with $r < n/2$, $Y_{(k)} = Y_{(1)}$ if $k < 1$, $Y_{(k)} = Y_{(n)}$ if $k > n$, and $T_k$ is defined in (10).
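For concreteness, the following is a minimal NumPy sketch of the two spacing-based estimators (11) and (12); the function names and the example are illustrative rather than part of the original procedure, and the default window size anticipates the heuristic of Equation (19):

```python
import numpy as np

def stm_spacing(y, theta1, theta2, r=None, ebrahimi=False):
    # Spacing-based STM estimator: STM_n1 (Vasicek weights) by default,
    # STM_n2 (Ebrahimi boundary weights T_k from Eq. (10)) if ebrahimi=True.
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    if r is None:
        r = int(np.sqrt(n) + 0.5)          # heuristic of Eq. (19)
    k = np.arange(1, n + 1)
    upper = y[np.clip(k + r, 1, n) - 1]    # Y_(k+r), with Y_(k) = Y_(n) for k > n
    lower = y[np.clip(k - r, 1, n) - 1]    # Y_(k-r), with Y_(k) = Y_(1) for k < 1
    if ebrahimi:
        t = np.where(k <= r, 1 + (k - 1) / r,
                     np.where(k <= n - r, 2.0, 1 + (n - k) / r))
    else:
        t = 2.0                            # Vasicek-type constant weight
    spacing = n * (upper - lower) / (t * r)   # estimate of 1/h at Y_(k)
    terms = spacing ** (1 - theta1) - spacing ** (1 - theta2)
    return terms.sum() / (n * (theta1 - theta2))

# Example: exponential data with rate 0.5, where the true STM(1,2) is -0.75.
rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=50)
print(stm_spacing(x, 1, 2), stm_spacing(x, 1, 2, ebrahimi=True))
```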
The next result discusses the affine transformation on the proposed estimators.
Proposition 1.
Let $Y_1, \ldots, Y_n$ be independent and identically distributed random variables, and let $Z_i = \alpha Y_i + \beta$, $\alpha > 0$, $\beta \in \mathbb{R}$, $i = 1, 2, \ldots, n$. From (11) and (12), we have

$$ STM_{n1}^{\theta_1,\theta_2}(h_n, Z) = \alpha^{1-\theta_1} STM_{n1}^{\theta_1,\theta_2}(h_n, Y) + \frac{\alpha^{1-\theta_1} - \alpha^{1-\theta_2}}{n(\theta_1 - \theta_2)} \sum_{k=1}^{n} \left( \frac{n}{2r} \left( Y_{(k+r)} - Y_{(k-r)} \right) \right)^{1-\theta_2}, $$

$$ STM_{n2}^{\theta_1,\theta_2}(h_n, Z) = \alpha^{1-\theta_1} STM_{n2}^{\theta_1,\theta_2}(h_n, Y) + \frac{\alpha^{1-\theta_1} - \alpha^{1-\theta_2}}{n(\theta_1 - \theta_2)} \sum_{k=1}^{n} \left( \frac{n}{T_k r} \left( Y_{(k+r)} - Y_{(k-r)} \right) \right)^{1-\theta_2}. $$
Remark 1.
Vasicek [8] and Ebrahimi et al. [9] showed the consistency (in probability) of their proposed entropy estimators. For the proposed Sharma–Taneja–Mittal entropy estimators in (11) and (12), however, consistency is not easy to establish: each estimator involves the difference of two spacing-based terms, $\left( Y_{(k+r)} - Y_{(k-r)} \right)^{1-\theta_1} - \left( Y_{(k+r)} - Y_{(k-r)} \right)^{1-\theta_2}$, which is difficult to handle, and the distribution of this spacing does not follow a known form. For the same reason, obtaining the expectation and variance, and hence a theoretical treatment of the asymptotic normality, is difficult.
Next, we examine the asymptotic normality using Monte Carlo simulation. We aim to verify that

$$ Z_i = \frac{STM_{ni}^{\theta_1,\theta_2}(h_n, Y) - E\left[ STM_{ni}^{\theta_1,\theta_2}(h_n, Y) \right]}{\sqrt{Var\left[ STM_{ni}^{\theta_1,\theta_2}(h_n, Y) \right]}}, \quad i = 1, 2, $$

has a standard normal distribution. In total, 1000 samples, each of size n = 50, are drawn from an exponential distribution with parameter 1, with $\theta_1 = 2$ and $\theta_2 = 4$. To verify the asymptotic normality of the estimators in (11) and (12), the histograms shown in Figure 1 are presented.
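A sketch of this Monte Carlo check, under the stated assumptions (1000 replications, n = 50, exponential data), might look as follows; `stm_spacing` refers to the illustrative sketch above, and the unknown mean and variance are replaced by their sample counterparts:

```python
import numpy as np

rng = np.random.default_rng(2)
reps = np.array([stm_spacing(rng.exponential(1.0, size=50), 2, 4)
                 for _ in range(1000)])
z = (reps - reps.mean()) / reps.std(ddof=1)   # standardized replications

# A histogram of z can be compared with the N(0,1) density; as a quick
# numerical proxy, near-zero skewness and excess kurtosis support normality.
print("skewness:", (z ** 3).mean(), " excess kurtosis:", (z ** 4).mean() - 3)
```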

2.2. Third Procedure

The third estimator is based on a local linear framework. Consider the sample data $(G_n(Y_{(1)}), Y_{(1)}), (G_n(Y_{(2)}), Y_{(2)}), \ldots, (G_n(Y_{(n)}), Y_{(n)})$, where $G_n$ denotes the empirical CDF. From (11), the estimator $STM_{n1}^{\theta_1,\theta_2}$ can be written as

$$ STM_{n1}^{\theta_1,\theta_2}(h_n, Y) = \frac{1}{n(\theta_1 - \theta_2)} \sum_{k=1}^{n} \left[ \left( \frac{\frac{k+r}{n} - \frac{k-r}{n}}{Y_{(k+r)} - Y_{(k-r)}} \right)^{\theta_1 - 1} - \left( \frac{\frac{k+r}{n} - \frac{k-r}{n}}{Y_{(k+r)} - Y_{(k-r)}} \right)^{\theta_2 - 1} \right], $$

where $\left( \frac{k+r}{n} - \frac{k-r}{n} \right) / \left( Y_{(k+r)} - Y_{(k-r)} \right)$ is the slope of the straight line through the points $(G_n(Y_{(k+r)}), Y_{(k+r)})$ and $(G_n(Y_{(k-r)}), Y_{(k-r)})$. Thus, using the method of least squares, the Sharma–Taneja–Mittal entropy estimator is

$$ STM_{n3}^{\theta_1,\theta_2}(h_n, Y) = \frac{1}{n(\theta_1 - \theta_2)} \sum_{k=1}^{n} \left[ \left( \frac{\sum_{i=k-r}^{k+r} (Y_{(i)} - \bar{Y}_{(k)})(i-k)}{n \sum_{i=k-r}^{k+r} (Y_{(i)} - \bar{Y}_{(k)})^2} \right)^{\theta_1 - 1} - \left( \frac{\sum_{i=k-r}^{k+r} (Y_{(i)} - \bar{Y}_{(k)})(i-k)}{n \sum_{i=k-r}^{k+r} (Y_{(i)} - \bar{Y}_{(k)})^2} \right)^{\theta_2 - 1} \right], \tag{13} $$

where $\bar{Y}_{(k)} = \frac{1}{2r+1} \sum_{i=k-r}^{k+r} Y_{(i)}$; the method follows the approach of Correa [14] for estimating the entropy. It fits a local linear model over $2r+1$ points:

$$ G(y_{(i)}) = a + b\, y_{(i)} + \text{error}, \quad i = k-r, \ldots, k+r. $$
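A direct transcription of (13) into code could look as follows; again a hedged sketch, with the same boundary convention $Y_{(i)} = Y_{(1)}$ for $i < 1$ and $Y_{(i)} = Y_{(n)}$ for $i > n$:

```python
import numpy as np

def stm_local_linear(y, theta1, theta2, r=None):
    # Correa-type STM estimator STM_n3 of Eq. (13): the density at Y_(k) is
    # estimated by the least-squares slope of G_n over the 2r+1 nearest points.
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    if r is None:
        r = int(np.sqrt(n) + 0.5)
    total = 0.0
    for k in range(1, n + 1):
        i = np.arange(k - r, k + r + 1)
        yi = y[np.clip(i, 1, n) - 1]           # Y_(i) with boundary convention
        ybar = yi.mean()                       # local mean, bar Y_(k)
        h_hat = np.sum((yi - ybar) * (i - k)) / (n * np.sum((yi - ybar) ** 2))
        total += h_hat ** (theta1 - 1) - h_hat ** (theta2 - 1)
    return total / (n * (theta1 - theta2))
```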
Similar to Remark 1, we cannot show the consistency of the estimator $STM_{n3}^{\theta_1,\theta_2}(h_n, Y)$, but we can examine its asymptotic normality using Monte Carlo simulations. We aim to verify that

$$ Z_3 = \frac{STM_{n3}^{\theta_1,\theta_2}(h_n, Y) - E\left[ STM_{n3}^{\theta_1,\theta_2}(h_n, Y) \right]}{\sqrt{Var\left[ STM_{n3}^{\theta_1,\theta_2}(h_n, Y) \right]}} $$

has a standard normal distribution. In total, 1000 samples, each of size n = 50, are drawn from an exponential distribution with parameter 1, with $\theta_1 = 2$ and $\theta_2 = 4$. To verify the asymptotic normality of the estimator in (13), the histogram shown in Figure 2 is presented.

2.3. Fourth Procedure

In this procedure, we rely on the consistent kernel density estimator (see Parzen [15])

$$ \hat{h}(Y_i) = \frac{1}{nl} \sum_{j=1}^{n} g\left( \frac{Y_i - Y_j}{l} \right), \quad i = 1, \ldots, n, \tag{14} $$

where the kernel function g is chosen as the standard normal PDF, and the normal optimal smoothing formula (i.e., rule-of-thumb bandwidth selection) $l = 1.06\, \hat{\sigma}\, n^{-1/5}$ is employed to determine the bandwidth, with $\hat{\sigma}$ the sample standard deviation. The Sharma–Taneja–Mittal entropy can be written as

$$ STM_n^{\theta_1,\theta_2}(Y) = \frac{1}{\theta_1 - \theta_2} \int_0^{\infty} \left( h^{\theta_1}(y) - h^{\theta_2}(y) \right) dy = \frac{1}{\theta_1 - \theta_2} \left( E\left[ h^{\theta_1 - 1}(Y) \right] - E\left[ h^{\theta_2 - 1}(Y) \right] \right). $$

Thus, we suggest estimating the Sharma–Taneja–Mittal entropy of an unknown continuous PDF h through

$$ STM_{n4}^{\theta_1,\theta_2}(h_n, Y) = \frac{1}{n(\theta_1 - \theta_2)} \sum_{k=1}^{n} \left( \hat{h}^{\theta_1 - 1}(Y_{(k)}) - \hat{h}^{\theta_2 - 1}(Y_{(k)}) \right), \tag{15} $$

where $\hat{h}(y)$ is defined in (14).
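The kernel-based estimator (15) with the Gaussian kernel and rule-of-thumb bandwidth admits a compact vectorized sketch (illustrative names; assumes a continuous sample without ties):

```python
import numpy as np

def stm_kernel(y, theta1, theta2):
    # Kernel-based STM estimator STM_n4 of Eq. (15): plug the Gaussian-kernel
    # density estimate of Eq. (14) into the expectation form of the entropy.
    y = np.asarray(y, dtype=float)
    n = y.size
    l = 1.06 * y.std(ddof=1) * n ** (-1 / 5)        # rule-of-thumb bandwidth
    u = (y[:, None] - y[None, :]) / l               # pairwise scaled differences
    h_hat = np.exp(-0.5 * u ** 2).sum(axis=1) / (n * l * np.sqrt(2 * np.pi))
    terms = h_hat ** (theta1 - 1) - h_hat ** (theta2 - 1)
    return terms.sum() / (n * (theta1 - theta2))
```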
In the following, we discuss the consistency of the estimator $\hat{h}^{\theta}(Y)$, $\theta > 0$. Masry [16] expresses the bias and variance of $\hat{h}(Y)$ as

$$ Bias(\hat{h}(Y)) \approx \frac{l^{\rho} w_{\rho}}{\rho!} h^{(\rho)}(Y), \tag{16} $$

$$ Var(\hat{h}(Y)) \approx \frac{W_g}{nl} h(Y), \tag{17} $$

where $w_{\rho} = \int u^{\rho} g(u)\, du$, $W_g = \int g^2(u)\, du$, and $h^{(\rho)}(Y)$ is the ρ-th derivative of h with respect to y. Using a Taylor series expansion, we obtain

$$ \hat{h}^{\theta}(Y) \approx h^{\theta}(Y) + \theta h^{\theta - 1}(Y) \left( \hat{h}(Y) - h(Y) \right). \tag{18} $$

Then, we have

$$ \hat{h}^{\theta}(Y) - h^{\theta}(Y) \approx \theta h^{\theta - 1}(Y) \left( \hat{h}(Y) - h(Y) \right). $$

As a result, using (16)–(18), the bias and variance of $\hat{h}^{\theta}(Y)$ can be expressed as

$$ Bias(\hat{h}^{\theta}(Y)) \approx \theta h^{\theta - 1}(Y)\, Bias(\hat{h}(Y)) = \frac{\theta l^{\rho} w_{\rho}}{\rho!} h^{\theta - 1}(Y)\, h^{(\rho)}(Y), $$

$$ Var(\hat{h}^{\theta}(Y)) \approx \theta^2 h^{2\theta - 2}(Y)\, Var(\hat{h}(Y)) = \frac{\theta^2 W_g}{nl} h^{2\theta - 1}(Y). $$

Thus, the mean squared error (MSE) is

$$ MSE(\hat{h}^{\theta}(Y)) = \left( Bias(\hat{h}^{\theta}(Y)) \right)^2 + Var(\hat{h}^{\theta}(Y)) \approx \left( \frac{\theta l^{\rho} w_{\rho}}{\rho!} h^{\theta - 1}(Y)\, h^{(\rho)}(Y) \right)^2 + \frac{\theta^2 W_g}{nl} h^{2\theta - 1}(Y), $$

and

$$ MSE(\hat{h}^{\theta}(Y)) \longrightarrow 0 \quad \text{as } n \to \infty, $$

since the chosen bandwidth satisfies $l \to 0$ and $nl \to \infty$.
Then,

$$ \sum_{k=1}^{n} \hat{h}^{\theta}(Y_{(k)}) \xrightarrow{\Pr} \sum_{k=1}^{n} h^{\theta}(Y_{(k)}), $$

which shows the consistency of the estimator $\hat{h}^{\theta}(Y)$, $\theta > 0$. Since $STM_{n4}^{\theta_1,\theta_2}(h_n, Y)$ consists of the difference between two such parts, we can discuss the consistency of each part of (15) separately, but we cannot show the consistency of the estimator $STM_{n4}^{\theta_1,\theta_2}(h_n, Y)$ itself.
Now, we can examine the asymptotic normality using Monte Carlo simulations. We aim to verify that

$$ Z_4 = \frac{STM_{n4}^{\theta_1,\theta_2}(h_n, Y) - E\left[ STM_{n4}^{\theta_1,\theta_2}(h_n, Y) \right]}{\sqrt{Var\left[ STM_{n4}^{\theta_1,\theta_2}(h_n, Y) \right]}} $$

has a standard normal distribution. In total, 1000 samples, each of size n = 50, are drawn from an exponential distribution with parameter 1, with $\theta_1 = 2$ and $\theta_2 = 4$. To verify the asymptotic normality of the estimator in (15), the histogram shown in Figure 3 is presented.

3. Numerical Study

In this section, we present numerical results, including real and simulated data, to demonstrate the behavior of our estimators. To estimate the Sharma–Taneja–Mittal entropy, one must choose an appropriate value of r for a given n. We employ the heuristic formula

$$ r = \left[ \sqrt{n} + 0.5 \right], \tag{19} $$

where $[\cdot]$ denotes the floor function, as suggested by Grzegorzewski and Wieczorkowski [17] for entropy estimation.

3.1. Real Data

This subsection examines obesity rates throughout the 13 administrative regions of Saudi Arabia, categorized by gender and age. The study investigates the frequency of obesity, defined as a Body Mass Index (BMI) of 30 or above, within the surveyed population, segmented by region, gender, and age, as reported by Althumiri et al. [18]. The analysis, depicted in Figure 4, illustrates the BMI distribution among the 13 administrative regions.
Suppose the random variable X follows an inverse Gaussian distribution with parameters μ and λ (i.e., $IG(\mu, \lambda)$), and take $Y = 1/\sqrt{X}$. Then the PDF of Y is

$$ h(y) = \frac{2}{\sqrt{2\pi}\, \nu} \exp\left( -\frac{(y - \eta/y)^2}{2\nu^2} \right), \quad y \geq 0, $$

where $\eta = 1/\mu$ and $\nu^2 = 1/\lambda$. Then, from (4), the Sharma–Taneja–Mittal entropy of Y is

$$ STM_n^{\theta_1,\theta_2}(Y) = \frac{1}{\theta_1 - \theta_2} \int_0^{\infty} \left( h^{\theta_1}(y) - h^{\theta_2}(y) \right) dy = \frac{1}{\theta_1 - \theta_2} \left[ \frac{1}{\sqrt{\theta_1}} \left( \frac{2}{\nu\sqrt{2\pi}} \right)^{\theta_1 - 1} - \frac{1}{\sqrt{\theta_2}} \left( \frac{2}{\nu\sqrt{2\pi}} \right)^{\theta_2 - 1} \right], $$

where each power $h^{\theta}$ integrates to $\theta^{-1/2} \left( 2/(\nu\sqrt{2\pi}) \right)^{\theta - 1}$, since $h^{\theta}$ is proportional to a density of the same form with $\nu_{\theta} = \nu/\sqrt{\theta}$.
Moreover, the uniformly minimum variance unbiased estimator of $\nu^2$ is the following (see Mudholkar and Tian [19]):

$$ uv^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} y_i^2 - \frac{n^2}{\sum_{i=1}^{n} y_i^{-2}} \right). $$

For instance, taking $\theta_1 = 2$ and $\theta_2 = 6$, we obtain

$$ STM_n^{\theta_1,\theta_2}(Y) = \frac{1}{\theta_1 - \theta_2} \left[ \frac{1}{\sqrt{\theta_1}} \left( \frac{2}{\sqrt{2\pi\, uv^2}} \right)^{\theta_1 - 1} - \frac{1}{\sqrt{\theta_2}} \left( \frac{2}{\sqrt{2\pi\, uv^2}} \right)^{\theta_2 - 1} \right] = 0.00462. $$
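The closed-form value and the plug-in estimate can be computed as below; this is a sketch under our reading of the formulas above, assuming transformed values $y_i = 1/\sqrt{x_i}$, and the data array is a hypothetical stand-in, not the paper's BMI dataset:

```python
import numpy as np

def stm_closed_form(nu2, theta1, theta2):
    # Closed-form STM entropy for the density h above, with nu^2 plugged in.
    c = 2 / np.sqrt(2 * np.pi * nu2)
    return (c ** (theta1 - 1) / np.sqrt(theta1)
            - c ** (theta2 - 1) / np.sqrt(theta2)) / (theta1 - theta2)

def umvue_nu2(y):
    # UMVUE of nu^2 = 1/lambda from the transformed sample y_i = 1/sqrt(x_i).
    y = np.asarray(y, dtype=float)
    n = y.size
    return (np.sum(y ** 2) - n ** 2 / np.sum(y ** -2)) / (n - 1)

y = np.array([0.18, 0.19, 0.20, 0.17, 0.185, 0.178, 0.192, 0.183])  # hypothetical
print(stm_closed_form(umvue_nu2(y), 2, 6))
```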
Under the distribution of $Y = 1/\sqrt{X}$, the sample estimates of the Sharma–Taneja–Mittal entropy are $STM_{ni}^{\theta_1,\theta_2}(h_n, Y)$, $i = 1, 2, 3, 4$, given in (11), (12), (13), and (15), respectively. Table 2 displays the Sharma–Taneja–Mittal entropy estimates together with their absolute and relative biases. The four estimators behave similarly and produce values close to the actual value; moreover, as $\theta_1$ and $\theta_2$ and the gap between them increase, the absolute bias becomes smaller.

3.2. Investigation through Simulation

A simulation study was conducted to examine the performance of the suggested estimators of the Sharma–Taneja–Mittal entropy. One thousand samples were generated for each sample size, and the estimators were evaluated along with their root mean squared errors (RMSEs). Exponential and standard uniform distributions were employed in the analysis; the exponential distribution has PDF $h(y) = \lambda e^{-\lambda y}$, $y > 0$, with rate parameter $\lambda = 0.5$ in this study. Table 3, Table 4, Table 5 and Table 6 present the RMSE, along with the standard deviation ($SD_n$), of the four Sharma–Taneja–Mittal estimators for sample sizes $n = 10, 20, 30, 40, 50$ under the two distributions (a sketch of this simulation loop is given after the following list). We can conclude the following:
  • As n increases, the RMSE and the corresponding $SD_n$ decrease for each estimator.
  • Under an exponential distribution, for small n, we can see that the fourth estimator S T M n 4 shows strong performance in comparison to its counterparts; see Figure 5. Meanwhile, for large n, we can see that the four estimators have almost the same effect; see Figure 6.
  • Under a standard uniform distribution, we can see that the fourth estimator S T M n 4 shows strong performance in comparison to its counterparts; see Figure 7 and Figure 8.
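The simulation loop behind Tables 3–6 can be sketched as follows (illustrative; it reuses the `stm_spacing` and `stm_kernel` sketches from Section 2):

```python
import numpy as np

def rmse_study(sampler, true_value, estimator, n, mc=1000, seed=0):
    # Monte Carlo RMSE and SD of an STM estimator at sample size n.
    rng = np.random.default_rng(seed)
    est = np.array([estimator(sampler(rng, n)) for _ in range(mc)])
    return np.sqrt(np.mean((est - true_value) ** 2)), est.std(ddof=1)

# Exponential with rate 0.5; true STM for theta1=1, theta2=2 is -0.75 (Table 1).
sampler = lambda rng, n: rng.exponential(scale=2.0, size=n)
for n in (10, 20, 30, 40, 50):
    print(n, rmse_study(sampler, -0.75, lambda s: stm_kernel(s, 1, 2), n))
```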

4. Test Statistics for Testing Uniformity

In this section, we investigate the potential improvement of the entropy-based test of uniformity by incorporating the best estimator of the Sharma–Taneja–Mittal entropy in the Monte Carlo experiment. Assessing uniformity is significant in various practical contexts, where evaluating goodness-of-fit often involves scrutinizing uniformity. Suppose we have a random sample $Y_1, \ldots, Y_n$ of size n drawn from a population with unknown CDF H. Our aim is to assess the null hypothesis $H_0: H(y) = H_0(y)$ for all $y \in \mathbb{R}$ against the alternative $H_1: H(y) \neq H_0(y)$ for some $y \in \mathbb{R}$, where $H_0(y)$ is fully specified. Let $Z_i = H_0(Y_i)$, $i = 1, \ldots, n$. By the probability integral transformation theorem, the $Z_i$ follow a uniform distribution on the interval (0, 1). Consequently, the goodness-of-fit problem above is equivalent to testing the null hypothesis $H_0: G(y) = y$ for all $y \in (0, 1)$ against $H_1: G(y) \neq y$ for some $y \in (0, 1)$, where G denotes the CDF of the $Z_i$. Marhuenda et al. [20] provide a thorough overview of research on testing uniformity, along with a comprehensive list of references.
Let Y be a random variable with support on the interval (0, 1) and PDF h, and let U be a uniform random variable on (0, 1). From (4), we can obtain a lower bound on the Sharma–Taneja–Mittal entropy using the Cauchy–Schwarz inequality:

$$ STM_n^{\theta_1,\theta_2}(Y) = \frac{1}{\theta_1 - \theta_2} \int_0^1 \left( h^{\theta_1}(y) - h^{\theta_2}(y) \right) dy \geq \frac{1}{\theta_1 - \theta_2} \left[ \left( \int_0^1 h(y) \cdot 1\, dy \right)^{\theta_1} - \left( \int_0^1 h(y) \cdot 1\, dy \right)^{\theta_2} \right] = 0 = STM_n^{\theta_1,\theta_2}(U). $$
Based on the Sharma–Taneja–Mittal entropy estimates $STM_{ni}^{\theta_1,\theta_2}(h_n, Y)$, $i = 1, 2, 3, 4$, given in (11), (12), (13), and (15), respectively, and under the null hypothesis $G(y) = y$ for all $y \in (0, 1)$, we have

$$ STM_{ni}^{\theta_1,\theta_2}(h_n, Y) \xrightarrow{\Pr} 0, \quad \text{as } n, r \to \infty,\ r/n \to 0. $$

Considering an alternative distribution with a PDF h defined on the interval (0, 1), we obtain

$$ STM_{ni}^{\theta_1,\theta_2}(h_n, Y) \xrightarrow{\Pr} STM_n^{\theta_1,\theta_2}(Y) \neq 0, \quad \text{as } n, r \to \infty,\ r/n \to 0. $$
Following the methodology established by Dudewicz and van der Meulen [21] and Zamanzade and Arghami [22], we propose the test statistic $STM_{nU} = STM_{ni}^{\theta_1,\theta_2}(h_n, Y)$, $i = 1, 2, 3, 4$, to assess uniformity. Values of $STM_{nU}$ that deviate from zero indicate non-uniformity, leading us to reject the hypothesis of uniformity.

Critical Values and Power Comparisons

Regrettably, the complexity of the test statistic $STM_{nU}$ prevents us from determining its exact null distribution, so we used Monte Carlo simulations to obtain its critical values (a sketch of this procedure is given after the following list). Under the Sharma–Taneja–Mittal entropy estimates, we employed the Monte Carlo technique to obtain the percentage points. The interval

$$ [\text{lower value},\ \text{upper value}] := \left[ STM_{ni,\, \alpha/2}^{\theta_1,\theta_2},\ STM_{ni,\, 1-\alpha/2}^{\theta_1,\theta_2} \right], \quad i = 1, 2, 3, 4, $$

describes the acceptance region of the uniformity test, where α is the chosen significance level and $STM_{ni,\,\alpha}^{\theta_1,\theta_2}$ denotes the α-quantile, under the null hypothesis, of the approximate (or asymptotic) distribution of the test statistic. Table 7 presents the critical points of the test statistic $STM_{nU}$ for sample sizes $n = 10, 20, 30$, based on Monte Carlo simulations with 1000 iterations. We can see the following:
  • As the sample size n increases, the distance between the percentage points decreases.
  • For fixed n, as $\theta_1$ and $\theta_2$ increase, the distance between the percentage points increases.
  • The fourth estimator yields percentage points that cover the actual value of the Sharma–Taneja–Mittal entropy measure for the different values of $\theta_1$, $\theta_2$, and n.
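A sketch of the Monte Carlo percentage-point computation (illustrative, reusing the `stm_kernel` sketch from Section 2.3; any of the four estimators can be passed in):

```python
import numpy as np

def critical_values(statistic, n, alpha=0.05, mc=1000, seed=3):
    # (alpha/2, 1 - alpha/2) percentage points of an STM test statistic
    # under the uniform(0, 1) null, approximated by Monte Carlo.
    rng = np.random.default_rng(seed)
    stats = np.array([statistic(rng.uniform(size=n)) for _ in range(mc)])
    return np.quantile(stats, alpha / 2), np.quantile(stats, 1 - alpha / 2)

print(critical_values(lambda s: stm_kernel(s, 2, 4), n=20))
```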
Figure 9 shows the density of the four Sharma–Taneja–Mittal entropy estimators for $n = 10, 20, 30, 40, 50$, $\theta_1 = 2$, and $\theta_2 = 4$. As n grows, the test statistics approach the exact values more closely, suggesting that both bias and variance decrease with larger n. Moreover, the fourth estimator, which depends on the kernel function, behaves well in this respect.
We used Monte Carlo simulations to assess the effectiveness of our proposed test by comparing its power with that of other tests under seven different alternatives. Various researchers, including Dudewicz and van der Meulen [21] and Zamanzade [23], have examined a range of alternatives in uniformity testing, classifying them into three groups according to the shapes of their densities, as follows:
$$ A_p^*:\ H(y) = 1 - (1-y)^p, \quad 0 \leq y \leq 1, \quad p = 1.5, 2; $$

$$ B_p^*:\ H(y) = \begin{cases} 2^{p-1} y^p, & 0 \leq y \leq 0.5, \\ 1 - 2^{p-1}(1-y)^p, & 0.5 \leq y \leq 1, \end{cases} \quad p = 1.5, 2, 3; $$

$$ C_p^*:\ H(y) = \begin{cases} 0.5 - 2^{p-1}(0.5-y)^p, & 0 \leq y \leq 0.5, \\ 0.5 + 2^{p-1}(y-0.5)^p, & 0.5 \leq y \leq 1, \end{cases} \quad p = 1.5, 2. $$
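To simulate from these alternatives, one can invert the CDFs above; a minimal sketch (the piecewise inverses follow directly from the definitions):

```python
import numpy as np

def sample_alternative(rng, n, family, p):
    # Draw n variates from A*_p, B*_p, or C*_p by inverse-CDF sampling.
    u = rng.uniform(size=n)
    if family == "A":                        # H(y) = 1 - (1 - y)^p
        return 1 - (1 - u) ** (1 / p)
    if family == "B":                        # density peaked at the center
        return np.where(u <= 0.5,
                        (u * 2 ** (1 - p)) ** (1 / p),
                        1 - ((1 - u) * 2 ** (1 - p)) ** (1 / p))
    if family == "C":                        # density peaked at the endpoints
        step = (np.abs(u - 0.5) * 2 ** (1 - p)) ** (1 / p)
        return np.where(u <= 0.5, 0.5 - step, 0.5 + step)
    raise ValueError("family must be 'A', 'B', or 'C'")
```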
We evaluate the effectiveness of our suggested test statistic in comparison with several established statistics under identical alternatives: the Cramér–von Mises test statistic (Cramér [24]; von Mises [25]), the Kolmogorov–Smirnov test statistic (Kolmogorov [26]; Smirnov [27]), the Anderson–Darling test statistic (Anderson and Darling [28]), and the extropy test statistic (Qiu and Jia [11]). These statistics are given as follows (minimal implementations are sketched after this list):
  • Cramér–von Mises test statistic:
$$ CM_U = \sum_{k=1}^{n} \left( Y_{(k)} - \frac{2k-1}{2n} \right)^2 + \frac{1}{12n}. $$
  • Kolmogorov–Smirnov test statistic:
$$ KS_U = \max\left\{ \max_{1 \leq k \leq n} \left( \frac{k}{n} - Y_{(k)} \right),\ \max_{1 \leq k \leq n} \left( Y_{(k)} - \frac{k-1}{n} \right) \right\}. $$
  • Anderson–Darling test statistic:
$$ AD_U = -\frac{2}{n} \sum_{k=1}^{n} \left[ (k - 0.5) \log Y_{(k)} + (n - k + 0.5) \log\left( 1 - Y_{(k)} \right) \right] - n. $$
  • Extropy test statistic:
$$ Ex_U = -\frac{1}{2n} \sum_{k=1}^{n} \frac{T_k r / n}{Y_{(k+r)} - Y_{(k-r)}}, $$
where the window size r is a positive integer with $r < n/2$; $Y_{(k)} = Y_{(1)}$ if $k < 1$ and $Y_{(k)} = Y_{(n)}$ if $k > n$; and $T_k$ is defined in (10).
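Minimal sketches of three of the competing statistics, together with an illustrative power-estimation loop that reuses `sample_alternative` from the sketch above (all names are ours, not the original implementations):

```python
import numpy as np

def ks_statistic(y):
    # Kolmogorov-Smirnov distance to the uniform(0,1) CDF.
    y = np.sort(y); n = y.size; k = np.arange(1, n + 1)
    return max((k / n - y).max(), (y - (k - 1) / n).max())

def cm_statistic(y):
    # Cramer-von Mises statistic.
    y = np.sort(y); n = y.size; k = np.arange(1, n + 1)
    return np.sum((y - (2 * k - 1) / (2 * n)) ** 2) + 1 / (12 * n)

def ad_statistic(y):
    # Anderson-Darling statistic.
    y = np.sort(y); n = y.size; k = np.arange(1, n + 1)
    return (-2 / n) * np.sum((k - 0.5) * np.log(y)
                             + (n - k + 0.5) * np.log(1 - y)) - n

# Estimated power of the KS test at level 0.05 against B*_2 with n = 20:
rng = np.random.default_rng(4)
crit = np.quantile([ks_statistic(rng.uniform(size=20)) for _ in range(1000)], 0.95)
power = np.mean([ks_statistic(sample_alternative(rng, 20, "B", 2)) > crit
                 for _ in range(1000)])
print(power)
```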
Regrettably, the powers of the proposed tests vary with the alternative distribution and the window size, so it is not feasible to identify the value of r that maximizes the test's power for all alternatives. We therefore choose r by the heuristic formula (19), which guarantees satisfactory (though not optimal) power across all the alternative distributions. Moreover, when selecting the sample size, we opt for n = 10, 20, 30 and do not go beyond these, to stay away from the regime in which the central limit theorem imposes approximate normality.
Table 8, Table 9, Table 10 and Table 11 list the power estimates at a significance level of 0.05 for different values of $\theta_1$ and $\theta_2$. We can see the following:
  • Under alternative $B_p^*$, the four Sharma–Taneja–Mittal entropy estimators behave well compared with the other tests. Stephens [29] interpreted alternative $A_p^*$ as a shift in the mean, alternative $B_p^*$ as a move toward a smaller variance, and alternative $C_p^*$ as a shift toward a larger variance. Our tests therefore perform best when there is a shift toward a smaller variance.
  • Under alternatives $A_p^*$, $B_p^*$, and $C_2^*$, the fourth estimator, which relies on the kernel function, outperforms all other tests across the various values of n, $\theta_1$, and $\theta_2$.
  • For fixed n, as $\theta_1$ and $\theta_2$ increase, the power of the four Sharma–Taneja–Mittal entropy estimators decreases.

5. Classification Problem via Pattern Recognition

In this section, we use the discrete form of the Sharma–Taneja–Mittal entropy to treat a classification problem via pattern recognition. If Y denotes an unknown but observable quantity with a finite discrete set of possible values $\{y_1, y_2, \ldots, y_N\}$ and associated probability mass vector $p_N = (p_1, p_2, \ldots, p_N)$, then the discrete form of the Sharma–Taneja–Mittal entropy is given by

$$ STM_n^{\theta_1,\theta_2}(p_N) = \frac{1}{\theta_1 - \theta_2} \sum_{i=1}^{N} \left( p_i^{\theta_1} - p_i^{\theta_2} \right), $$
where $\theta_1 \neq \theta_2 > 0$. Figure 10 shows $STM_n^{\theta_1,\theta_2}(p_1, p_2, p_3)$, where $p_3 = 1 - p_1 - p_2$, over the 3-D unit simplex, illustrating its convexity and negativity as the gap between $\theta_1$ and $\theta_2$ increases.
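The discrete form is straightforward to evaluate; a short sketch:

```python
import numpy as np

def stm_discrete(p, theta1, theta2):
    # Discrete STM entropy of a probability vector p (entries sum to 1).
    p = np.asarray(p, dtype=float)
    return np.sum(p ** theta1 - p ** theta2) / (theta1 - theta2)

# Degenerate distributions give 0; interior points of the simplex are negative.
print(stm_discrete([1, 0, 0], 2, 4), stm_discrete([1/3, 1/3, 1/3], 2, 4))
```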
We utilize the Sharma–Taneja–Mittal entropy to address a classification problem. Our analysis focuses on the Iris dataset [30], and we contrast our results with the techniques proposed by Kang et al. [31] and Buono and Longobardi [32]. The goal is to categorize three varieties of flowers: Iris Setosa ($C_1$), Iris Versicolour ($C_2$), and Iris Virginica ($C_3$). The dataset comprises 150 samples, with 50 samples per category. The attributes measured for each flower are sepal length ($A_1$), sepal width ($A_2$), petal length ($A_3$), and petal width ($A_4$), all in cm. For each Iris species, 40 specimens were selected, and the highest and lowest values were identified to create a model of interval numbers, as shown in Table 12; each entry in the dataset represents an unknown test sample. The chosen singleton test sample is (4.9, 2.5, 4.5, 1.7), which comes from the $C_3$ species.
Next, we generated four probability distributions (one per attribute) by employing the approach of Kang et al. [31], which draws on the similarity of interval numbers. The similarity is defined by

$$ M(Y_1, Y_2) = \frac{1}{1 + \tau\, I(Y_1, Y_2)}, $$

where τ denotes the support coefficient, set here to τ = 5, and the distance between the intervals $Y_1 = [g_1, g_2]$ and $Y_2 = [h_1, h_2]$ is

$$ I(Y_1, Y_2) = \sqrt{ \left( \frac{g_1 + g_2}{2} - \frac{h_1 + h_2}{2} \right)^2 + \frac{1}{3} \left[ \left( \frac{g_2 - g_1}{2} \right)^2 + \left( \frac{h_2 - h_1}{2} \right)^2 \right] }. $$
To produce the probability distributions, we used the intervals specified in Table 12(i) as $Y_1$ and the individual values of the selected sample as $Y_2$. Each of the four measured attributes yields three similarity values, which are normalized to form a probability distribution, as illustrated in Table 12(ii). An evaluation was then performed on our measures for these probability distributions, as outlined in Table 13(i), across various values of $\theta_1$ and $\theta_2$. Owing to the monotonicity of the exponential function, we chose the baseline weight function $Q(y) = e^{-y}$ and obtained the weights via normalization. For example, for petal width, the Sharma–Taneja–Mittal entropy weight is

$$ Q(A_4) = \frac{e^{-STM_n^{\theta_1,\theta_2}(A_4)}}{e^{-STM_n^{\theta_1,\theta_2}(A_1)} + e^{-STM_n^{\theta_1,\theta_2}(A_2)} + e^{-STM_n^{\theta_1,\theta_2}(A_3)} + e^{-STM_n^{\theta_1,\theta_2}(A_4)}}; $$

the resulting weights are shown in Table 13(ii).
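A sketch of the similarity-and-weighting pipeline under our reading of the formulas above; the final aggregation step, a weighted average of the attribute-wise class distributions, is our inference from the reported numbers and is labeled as such:

```python
import numpy as np

def interval_distance(a, b):
    # Distance I between intervals a = [g1, g2] and b = [h1, h2].
    (g1, g2), (h1, h2) = a, b
    mid = ((g1 + g2) / 2 - (h1 + h2) / 2) ** 2
    wid = ((g2 - g1) / 2) ** 2 + ((h2 - h1) / 2) ** 2
    return np.sqrt(mid + wid / 3)

def similarity(a, b, tau=5.0):
    # Kang-type similarity M = 1 / (1 + tau * I).
    return 1.0 / (1.0 + tau * interval_distance(a, b))

def stm_weights(entropies):
    # Normalized weights Q(A_j) = exp(-STM_j) / sum_i exp(-STM_i).
    e = np.exp(-np.asarray(entropies, dtype=float))
    return e / e.sum()

def combine(attr_dists, weights):
    # Final class distribution as the weighted average of the attribute-wise
    # class distributions (aggregation step inferred from the reported numbers).
    return np.average(np.asarray(attr_dists), axis=0, weights=weights)

# Table 13(i) entropies and Table 12(ii) distributions for theta1=1, theta2=2:
w = stm_weights([-0.654737, -0.66128, -0.614002, -0.61503])
p = combine([[0.270571, 0.419584, 0.309845],      # from attribute A1
             [0.27477, 0.351607, 0.373622],       # A2
             [0.145978, 0.429446, 0.424576],      # A3
             [0.156327, 0.373669, 0.470003]], w)  # A4
print(w, p)   # p is close to (0.213238, 0.393339, 0.393423)
```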
Thus, weighting the four attribute-wise distributions of Table 12(ii) by these weights, the selection of $\theta_1 = 1$ and $\theta_2 = 2$ results in the final probability distribution

$$ P(C_1) = 0.213238, \quad P(C_2) = 0.393339, \quad P(C_3) = 0.393423; $$

for $\theta_1 = 1$ and $\theta_2 = 3$,

$$ P(C_1) = 0.212512, \quad P(C_2) = 0.393471, \quad P(C_3) = 0.394017; $$

for $\theta_1 = 2$ and $\theta_2 = 3$,

$$ P(C_1) = 0.211786, \quad P(C_2) = 0.393602, \quad P(C_3) = 0.394611; $$

and for $\theta_1 = 2$ and $\theta_2 = 4$,

$$ P(C_1) = 0.211627, \quad P(C_2) = 0.39363, \quad P(C_3) = 0.394743. $$
The chosen flower was then assigned to the class with the highest probability, namely Iris Virginica; hence, a correct classification was made in this case.
Employing this method, we examined all 150 samples (50 per species) across various values of $\theta_1$ and $\theta_2$. The overall recognition rate of the method, based on our assessments, is 94.66%. We contrast this with the recognition rates reported by Buono and Longobardi [32] and Kang et al. [31], as presented in Table 14; our approach demonstrates slightly better performance than the other two methodologies.

6. Conclusions

The Sharma–Taneja–Mittal entropy is a generalization of Tsallis entropy. This paper explored nonparametric methods for estimating the Sharma–Taneja–Mittal entropy using different procedures, including spacings, a local linear model, and a kernel function, and discussed properties and consistency issues of the four estimators. We applied the estimators to real and simulated data and compared their performance. Moreover, we presented critical values and power comparisons for testing uniformity, finding that the estimator based on the kernel function gives the most accurate results. Finally, a classification problem on a real dataset was examined to highlight the significance of these measures in pattern recognition.

Author Contributions

Methodology, M.S.M.; Software, M.S.M.; Validation, H.H.S.; Resources, H.H.S. All authors have read and agreed to the published version of the manuscript.

Funding

This study is supported via funding from Prince Sattam Bin Abdulaziz University, project number (PSAU/2024/R/1445).

Data Availability Statement

The datasets generated during and/or analyzed during the current study are available within the manuscript.

Acknowledgments

The authors express their gratitude for the funding from Prince Sattam Bin Abdulaziz University, project number (PSAU/2024/R/1445).

Conflicts of Interest

The authors declare no conflicts of interest. The authors have no relevant financial or non-financial interests to disclose.

References

  1. Shannon, C. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423. [Google Scholar] [CrossRef]
  2. Rao, M.; Chen, Y.; Vemuri, B.; Wang, F. Cumulative residual entropy: A new measure of information. IEEE Trans. Inf. Theory 2004, 50, 1220–1228. [Google Scholar] [CrossRef]
  3. Tsallis, C. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys. 1988, 52, 479–487. [Google Scholar] [CrossRef]
  4. Sharma, B.D.; Taneja, I.J. Entropy of type (α,β) and other generalized measures in information theory. Metrika 1975, 22, 205–215. [Google Scholar] [CrossRef]
  5. Mittal, D.P. On some functional equations concerning entropy, directed divergence and inaccuracy. Metrika 1975, 22, 35–45. [Google Scholar] [CrossRef]
  6. Kattumannil, S.K.; Sreedevi, E.P.; Balakrishnan, N. A Generalized Measure of Cumulative Residual Entropy. Entropy 2022, 24, 444. [Google Scholar] [CrossRef]
  7. Sudheesh, K.K.; Sreedevi, E.P.; Balakrishnan, N. Relationships between cumulative entropy/extropy, Gini mean difference and probability weighted moments. Probab. Eng. Inf. Sci. 2024, 38, 28–38. [Google Scholar]
  8. Vasicek, O. A Test for Normality based on Sample Entropy. J. R. Stat. Soc. Ser. B 1976, 38, 54–59. [Google Scholar] [CrossRef]
  9. Ebrahimi, N.; Pflughoeft, K.; Soofi, E.S. Two Measures of Sample Entropy. Stat. Probab. Lett. 1994, 20, 225–234. [Google Scholar] [CrossRef]
  10. Noughabi, H.A.; Jarrahiferiz, J. Extropy of order statistics applied to testing symmetry. Commun. Stat.-Simul. Comput. 2020, 51, 3389–3399. [Google Scholar] [CrossRef]
  11. Qiu, G.; Jia, K. Extropy Estimators with Applications in Testing Uniformity. J. Nonparametric Stat. 2018, 30, 182–196. [Google Scholar] [CrossRef]
  12. Wachowiak, M.P.; Smolikova, R.; Tourassi, G.D.; Elmaghraby, A.S. Estimation of generalized entropies with sample spacing. Pattern Anal. Appl. 2005, 8, 95–101. [Google Scholar] [CrossRef]
  13. Wanke, P. The uniform distribution as a first practical approach to new product inventory management. Int. J. Prod. Econ. 2008, 114, 811–819. [Google Scholar] [CrossRef]
  14. Correa, J.C. A new Estimator of entropy. Commun. Stat.-Theory Methods 1995, 24, 2439–2449. [Google Scholar] [CrossRef]
  15. Parzen, E. On estimation of a probability density function and mode. Ann. Math. Stat. 1962, 33, 1065–1076. [Google Scholar] [CrossRef]
  16. Masry, E. Recursive probability density estimation for weakly dependent stationary processes. IEEE Trans. Inf. Theory 1986, 32, 254–267. [Google Scholar] [CrossRef]
  17. Grzegorzewski, P.; Wieczorkowski, R. Entropy-based Goodness-of-fit Test for Exponentiality. Commun. Stat.-Theory Methods 1999, 28, 1183–1202. [Google Scholar] [CrossRef]
  18. Althumiri, N.A.; Basyouni, M.H.; AlMousa, N.; AlJuwaysim, M.F.; Almubark, R.A.; BinDhim, N.F.; Alkhamaali, Z.; Alqahtani, S.A. Obesity in Saudi Arabia in 2020: Prevalence, Distribution, and Its Current Association with Various Health Conditions. Healthcare 2021, 9, 311. [Google Scholar] [CrossRef]
  19. Mudholkar, G.S.; Tian, L. An Entropy Characterization of the Inverse Gaussian Distribution and Related Goodness-of-Fit Test. J. Stat. Plan. Inference 2002, 102, 211–221. [Google Scholar] [CrossRef]
  20. Marhuenda, Y.; Morales, D.; Pardo, M.C. A Comparison of Uniformity Tests. Statistics 2005, 39, 315–327. [Google Scholar] [CrossRef]
  21. Dudewicz, E.J.; Van der Meulen, E.C. Entropy-based Tests of Uniformity. J. Am. Stat. Assoc. 1981, 76, 967–974. [Google Scholar] [CrossRef]
  22. Zamanzade, E.; Arghami, N.R. Goodness-of-Fit Test based on Correcting Moments of Modified Entropy Estimator. J. Stat. Comput. Simul. 2011, 81, 2077–2093. [Google Scholar] [CrossRef]
  23. Zamanzade, E. Testing Uniformity based on New Entropy Estimators. J. Stat. Comput. Simul. 2015, 85, 3191–3205. [Google Scholar] [CrossRef]
  24. Cramér, H. On the Composition of Elementary Errors: II. Statistical Applications. Scand. Actuar. J. 1928, 1928, 141–180. [Google Scholar]
  25. Von Mises, R. Wahrscheinlichkeitsrechnung und ihre Anwendung in der Statistik und Theoretischen Physik; Deuticke: Leipzig, Germany, 1931. [Google Scholar]
  26. Kolmogorov, A.N. Sulla Determinazione Empirica di una Legge di Distribuzione. G. dell'Istituto Italiano degli Attuari 1933, 4, 83–91. [Google Scholar]
  27. Smirnov, N.V. Estimate of Derivation between Empirical Distribution Functions in Two Independent Samples. Bull. Mosc. Univ. 1939, 2, 3–16. (In Russian) [Google Scholar]
  28. Anderson, T.W.; Darling, D.A. A Test of Goodness-of-Fit. J. Am. Stat. Assoc. 1954, 49, 765–769. [Google Scholar] [CrossRef]
  29. Stephens, M.A. EDF statistics for goodness of fit and some comparisons. J. Am. Stat. Assoc. 1974, 69, 730–737. [Google Scholar] [CrossRef]
  30. Fisher, R.A. Iris. UCI Machine Learning Repository. 1988. Available online: https://archive.ics.uci.edu/dataset/53/iris (accessed on 1 July 2024).
  31. Kang, B.Y.; Li, Y.; Deng, Y.; Zhang, Y.J.; Deng, X.Y. Determination of basic probability assignment based on interval numbers and its application. Dianzi Xuebao (Acta Electron. Sin.) 2012, 40, 1092–1096. [Google Scholar]
  32. Buono, F.; Longobardi, M. A dual measure of uncertainty: The deng extropy. Entropy 2020, 22, 582. [Google Scholar] [CrossRef]
Figure 1. Histograms of $Z_1$ (left) and $Z_2$ (right).
Figure 2. Histogram of $Z_3$.
Figure 3. Histogram of $Z_4$.
Figure 4. BMI throughout the 13 administrative regions of Saudi Arabia and its analysis.
Figure 5. Sharma–Taneja–Mittal entropy estimates for the exponential distribution with λ = 0.5 and n = 10.
Figure 6. Sharma–Taneja–Mittal entropy estimates for the exponential distribution with λ = 0.5 and n = 50.
Figure 7. Sharma–Taneja–Mittal entropy estimates for the standard uniform distribution with n = 10.
Figure 8. Sharma–Taneja–Mittal entropy estimates for the standard uniform distribution with n = 50.
Figure 9. The density of the four Sharma–Taneja–Mittal entropy estimators for $n = 10, 20, 30, 40, 50$, $\theta_1 = 2$, and $\theta_2 = 4$.
Figure 10. 3-D plot of $STM_n^{\theta_1,\theta_2}(p_1, p_2, p_3)$ over the unit simplex.
Table 1. Sharma–Taneja–Mittal entropy measures of some well-known distributions.

| Distribution | $H(y)$ | $STM_n^{\theta_1,\theta_2}(Y)$ |
| --- | --- | --- |
| Standard uniform | $y$; $0 < y < 1$ | $0$ |
| Uniform | $\frac{y-a}{b-a}$; $a < y < b$ | $\frac{(b-a)\left( (b-a)^{-\theta_1} - (b-a)^{-\theta_2} \right)}{\theta_1 - \theta_2}$ |
| Exponential | $1 - e^{-\lambda y}$; $y > 0$, $\lambda > 0$ | $\frac{1}{\lambda(\theta_1 - \theta_2)} \left( \frac{\lambda^{\theta_1}}{\theta_1} - \frac{\lambda^{\theta_2}}{\theta_2} \right)$ |
| Power | $(\beta y)^{\alpha}$; $0 < y < \frac{1}{\beta}$, $\alpha, \beta > 0$ | $\frac{1}{\beta(\theta_1 - \theta_2)} \left( \frac{(\alpha\beta)^{\theta_1}}{1 + \theta_1(\alpha - 1)} - \frac{(\alpha\beta)^{\theta_2}}{1 + \theta_2(\alpha - 1)} \right)$ |
| Pareto I | $1 - y^{-k}$; $y \geq 1$, $k > 0$ | $\frac{1}{\theta_1 - \theta_2} \left( \frac{k^{\theta_1}}{\theta_1(1+k) - 1} - \frac{k^{\theta_2}}{\theta_2(1+k) - 1} \right)$ |
Table 2. Sharma–Taneja–Mittal entropy, its estimates, absolute biases ($A_b$), and relative biases ($R_b$) for the BMI data from the 13 administrative regions of Saudi Arabia.

| $\theta_1, \theta_2$ | $STM_n$ | $STM_{n1}$ ($A_b$; $R_b$) | $STM_{n2}$ ($A_b$; $R_b$) | $STM_{n3}$ ($A_b$; $R_b$) | $STM_{n4}$ ($A_b$; $R_b$) |
| --- | --- | --- | --- | --- | --- |
| 1, 2 | −0.981514 | −0.9877 (0.006215; −0.63325) | −0.98859 (0.00708; −0.7218) | −0.98611 (0.004604; −0.46912) | −0.982392 (0.000878187; −0.0894727) |
| 1, 3 | −0.499803 | −0.499781 (0.00002194; 0.004391) | −0.499851 (0.00004819; 0.009643) | −0.499853 (0.00005017; 0.010038) | −0.49983 (0.0000273802; 0.0054782) |
| 2, 3 | −0.0180914 | −0.011832 (0.00625941; 34.5988) | −0.0111024 (0.006989; 38.6318) | −0.013587 (0.004504; 24.8969) | −0.017268 (0.000823427; 4.55148) |
| 2, 4 | −0.00923853 | −0.00613189 (0.003106; 33.627) | −0.00569784 (0.0035406; 38.3252) | −0.00693782 (0.0023007; 24.9034) | −0.00880043 (0.000438093; 4.74203) |
| 3, 4 | −0.000385664 | −0.00043179 (0.000046135; 11.9625) | −0.000293323 (0.00009234; 23.9433) | −0.000288458 (0.000097205; 25.2047) | −0.000332904 (0.0000527602; 13.6804) |
| 3, 5 | −0.000197194 | −0.000219139 (0.000021945; 11.1286) | −0.000149047 (0.000048147; 24.4163) | −0.000147068 (0.0000501269; 25.42) | −0.000169846 (0.0000273489; 13.869) |
| 3, 6 | −0.000131531 | −0.000146162 (0.0000146314; 11.1239) | −0.0000994011 (0.0000321299; 24.4276) | −0.0000980839 (0.000033447; 25.429) | −0.000113278 (0.0000182528; 13.8772) |
Table 3. The RMSE and $SD_n$ (in parentheses) for the estimators of the Sharma–Taneja–Mittal entropy of the exponential distribution with λ = 0.5.

$\theta_1 = 1$, $\theta_2 = 2$, $STM_n^{\theta_1,\theta_2}(X) = -0.75$:

| n | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ |
| --- | --- | --- | --- | --- |
| 10 | 0.14461 (0.132534) | 0.129365 (0.124113) | 0.129513 (0.122418) | 0.0855713 (0.0847806) |
| 20 | 0.0816596 (0.0754353) | 0.0781294 (0.0743387) | 0.0734482 (0.0714015) | 0.0671174 (0.0542884) |
| 30 | 0.0665794 (0.0616712) | 0.0648844 (0.0613456) | 0.0607927 (0.0591587) | 0.0626705 (0.0418085) |
| 40 | 0.0548361 (0.0516832) | 0.0539021 (0.0515542) | 0.051165 (0.0502906) | 0.0611906 (0.033919) |
| 50 | 0.0469785 (0.0450095) | 0.0464326 (0.044974) | 0.0444812 (0.0440465) | 0.0608884 (0.030506) |

$\theta_1 = 1$, $\theta_2 = 3$, $STM_n^{\theta_1,\theta_2}(X) = -0.45833$:

| n | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ |
| --- | --- | --- | --- | --- |
| 10 | 0.16812 (0.150558) | 0.132232 (0.121322) | 0.124813 (0.115287) | 0.0277982 (0.0264395) |
| 20 | 0.0549721 (0.0462316) | 0.0505812 (0.043901) | 0.0446313 (0.0403246) | 0.0219168 (0.0143902) |
| 30 | 0.0408954 (0.0342019) | 0.0389946 (0.0333485) | 0.0346193 (0.031104) | 0.0210433 (0.0103421) |
| 40 | 0.0314564 (0.0265216) | 0.0304197 (0.0261155) | 0.0273505 (0.0248309) | 0.0208732 (0.00797974) |
| 50 | 0.0258993 (0.0221422) | 0.025241 (0.0219065) | 0.0229246 (0.0210369) | 0.0208521 (0.00721379) |
Table 4. The RMSE and $SD_n$ (in parentheses) for the estimators of the Sharma–Taneja–Mittal entropy of the exponential distribution with λ = 0.5.

$\theta_1 = 2$, $\theta_2 = 3$, $STM_n^{\theta_1,\theta_2}(X) = -0.166667$:

| n | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ |
| --- | --- | --- | --- | --- |
| 10 | 0.212645 (0.191847) | 0.156784 (0.140966) | 0.144111 (0.13389) | 0.0358821 (0.0355005) |
| 20 | 0.0421057 (0.0312879) | 0.0384475 (0.0281834) | 0.0342259 (0.0270588) | 0.0275815 (0.0268353) |
| 30 | 0.0285834 (0.0206948) | 0.0275623 (0.0197346) | 0.0258025 (0.0199804) | 0.0239704 (0.0217743) |
| 40 | 0.0214715 (0.0148946) | 0.0212414 (0.0146132) | 0.0202013 (0.0150943) | 0.0220457 (0.0182628) |
| 50 | 0.0181633 (0.0122971) | 0.0181706 (0.0121897) | 0.017365 (0.0126431) | 0.0212449 (0.0163526) |

$\theta_1 = 2$, $\theta_2 = 4$, $STM_n^{\theta_1,\theta_2}(X) = -0.109375$:

| n | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ |
| --- | --- | --- | --- | --- |
| 10 | 0.385437 (0.377294) | 0.257659 (0.251518) | 0.252319 (0.247595) | 0.0297265 (0.0297261) |
| 20 | 0.0377388 (0.035082) | 0.0320088 (0.0291091) | 0.0306848 (0.0283813) | 0.0237941 (0.0212249) |
| 30 | 0.0217646 (0.0199784) | 0.0204595 (0.01847) | 0.0205809 (0.0187179) | 0.0216616 (0.0169128) |
| 40 | 0.0146617 (0.0133615) | 0.0144534 (0.0129698) | 0.0148408 (0.0132896) | 0.020708 (0.0140194) |
| 50 | 0.0121171 (0.0110248) | 0.0120924 (0.0108211) | 0.0125347 (0.0111286) | 0.0203776 (0.0125835) |
Table 5. The RMSE and $SD_n$ (in parentheses) for the estimators of the Sharma–Taneja–Mittal entropy of the standard uniform distribution with $STM_n^{\theta_1,\theta_2}(X) = 0$.

$\theta_1 = 1$, $\theta_2 = 2$:

| n | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ |
| --- | --- | --- | --- | --- |
| 10 | 0.324999 (0.302211) | 0.288194 (0.288029) | 0.33933 (0.301435) | 0.257691 (0.233468) |
| 20 | 0.187321 (0.160289) | 0.163991 (0.160743) | 0.175266 (0.158899) | 0.137397 (0.136884) |
| 30 | 0.141005 (0.118538) | 0.124708 (0.121589) | 0.1305 (0.119432) | 0.100954 (0.0994799) |
| 40 | 0.104646 (0.0893122) | 0.0935664 (0.0925936) | 0.0982116 (0.0919585) | 0.0858848 (0.0791714) |
| 50 | 0.08443 (0.0737325) | 0.0767405 (0.0766393) | 0.0800848 (0.0760987) | 0.0790904 (0.0671044) |

$\theta_1 = 1$, $\theta_2 = 3$:

| n | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ |
| --- | --- | --- | --- | --- |
| 10 | 1.75955 (1.42486) | 1.21433 (1.06603) | 1.19997 (1.0179) | 0.366708 (0.322417) |
| 20 | 0.58992 (0.363632) | 0.46281 (0.336973) | 0.450122 (0.319156) | 0.17617 (0.169084) |
| 30 | 0.418685 (0.243132) | 0.344598 (0.239065) | 0.331397 (0.224744) | 0.120857 (0.120316) |
| 40 | 0.307888 (0.158279) | 0.250952 (0.159762) | 0.243638 (0.151957) | 0.0930044 (0.0927536) |
| 50 | 0.25518 (0.123582) | 0.206611 (0.126006) | 0.203724 (0.12069) | 0.0807951 (0.0788088) |
Table 6. The RMSE and $SD_n$ (in parentheses) for the estimators of the Sharma–Taneja–Mittal entropy of the standard uniform distribution with $STM_n^{\theta_1,\theta_2}(X) = 0$.

$\theta_1 = 2$, $\theta_2 = 3$:

| n | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ |
| --- | --- | --- | --- | --- |
| 10 | 3.24246 (2.59429) | 2.2065 (1.88305) | 2.09309 (1.77137) | 0.480594 (0.416191) |
| 20 | 1.0107 (0.573765) | 0.795043 (0.519611) | 0.744272 (0.48933) | 0.22101 (0.203322) |
| 30 | 0.711141 (0.373241) | 0.591952 (0.36185) | 0.5504 (0.337942) | 0.148635 (0.142783) |
| 40 | 0.526556 (0.230126) | 0.438324 (0.229652) | 0.408419 (0.216469) | 0.108955 (0.107423) |
| 50 | 0.441826 (0.175791) | 0.368536 (0.177629) | 0.347178 (0.169075) | 0.0916764 (0.0915285) |

$\theta_1 = 2$, $\theta_2 = 4$:

| n | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ |
| --- | --- | --- | --- | --- |
| 10 | 12.7007 (12.2844) | 9.39012 (9.16204) | 8.66346 (8.36002) | 0.711518 (0.625726) |
| 20 | 1.81269 (1.39144) | 1.49908 (1.20588) | 1.44436 (1.15244) | 0.293881 (0.264811) |
| 30 | 1.16698 (0.841139) | 1.03401 (0.793981) | 0.98455 (0.749115) | 0.193146 (0.180129) |
| 40 | 0.722116 (0.416834) | 0.640433 (0.411796) | 0.617659 (0.395966) | 0.135057 (0.128615) |
| 50 | 0.57095 (0.295466) | 0.505213 (0.295314) | 0.494735 (0.28746) | 0.113137 (0.110129) |
Table 7. The test statistics' percentage points of the Sharma–Taneja–Mittal entropy estimators at a significance level of 0.05.

| n | $\theta_1, \theta_2$ | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ |
| --- | --- | --- | --- | --- | --- |
| 10 | 1, 2 | (−0.252703, 0.560403) | (−0.343895, 0.419436) | (−0.259882, 0.526772) | (−0.202206, 0.406931) |
| 10 | 1, 3 | (0.141055, 2.52503) | (−0.0899544, 1.67325) | (−0.0777993, 1.68698) | (−0.170763, 0.572097) |
| 10 | 2, 3 | (0.526473, 4.53774) | (0.15415, 2.98353) | (0.0929531, 2.8155) | (−0.141173, 0.776607) |
| 10 | 2, 4 | (0.264779, 9.23515) | (0.0310672, 6.0256) | (0.0588096, 6.21605) | (−0.124758, 1.02762) |
| 20 | 1, 2 | (−0.127013, 0.46065) | (−0.197468, 0.385614) | (−0.17302, 0.41996) | (−0.160847, 0.23215) |
| 20 | 1, 3 | (0.0780801, 1.33191) | (−0.0457253, 1.13025) | (−0.0456312, 1.05308) | (−0.138835, 0.343722) |
| 20 | 2, 3 | (0.276409, 2.21915) | (0.105258, 1.87488) | (0.0781993, 1.68431) | (−0.118179, 0.442431) |
| 20 | 2, 4 | (0.212467, 4.06039) | (0.0764948, 3.60973) | (0.0763925, 3.17668) | (−0.104545, 0.567234) |
| 30 | 1, 2 | (−0.0937319, 0.332303) | (−0.152613, 0.293168) | (−0.131898, 0.302147) | (−0.150464, 0.140867) |
| 30 | 1, 3 | (0.0654442, 0.862308) | (−0.023163, 0.772293) | (−0.0217618, 0.729274) | (−0.129833, 0.224846) |
| 30 | 2, 3 | (0.224547, 1.43339) | (0.0999416, 1.27472) | (0.0825614, 1.16331) | (−0.111212, 0.29924) |
| 30 | 2, 4 | (0.19284, 2.4853) | (0.0881774, 2.27333) | (0.0800395, 2.03479) | (−0.0994638, 0.387009) |
Table 8. Power analysis estimations of the different statistics at a significance level of 0.05, with $\theta_1 = 1$ and $\theta_2 = 2$.

| n | Alt. | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ | $KS_U$ | $AD_U$ | $CM_U$ | $Ex_U$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | $A_{1.5}^*$ | 0.242 | 0.253 | 0.248 | 0.255 | 0.154 | 0.156 | 0.168 | 0.136 |
| 10 | $A_{2}^*$ | 0.464 | 0.483 | 0.471 | 0.508 | 0.386 | 0.409 | 0.438 | 0.32 |
| 10 | $B_{1.5}^*$ | 0.225 | 0.243 | 0.266 | 0.252 | 0.037 | 0.015 | 0.026 | 0.21 |
| 10 | $B_{2}^*$ | 0.494 | 0.645 | 0.554 | 0.499 | 0.044 | 0.009 | 0.022 | 0.455 |
| 10 | $B_{3}^*$ | 0.876 | 0.948 | 0.897 | 0.823 | 0.087 | 0.018 | 0.051 | 0.801 |
| 10 | $C_{1.5}^*$ | 0.088 | 0.102 | 0.108 | 0.091 | 0.113 | 0.121 | 0.098 | 0.082 |
| 10 | $C_{2}^*$ | 0.139 | 0.172 | 0.241 | 0.823 | 0.199 | 0.227 | 0.155 | 0.131 |
| 20 | $A_{1.5}^*$ | 0.183 | 0.204 | 0.175 | 0.500 | 0.279 | 0.309 | 0.324 | 0.258 |
| 20 | $A_{2}^*$ | 0.521 | 0.554 | 0.487 | 0.833 | 0.699 | 0.756 | 0.771 | 0.665 |
| 20 | $B_{1.5}^*$ | 0.237 | 0.253 | 0.269 | 0.567 | 0.059 | 0.032 | 0.042 | 0.399 |
| 20 | $B_{2}^*$ | 0.649 | 0.754 | 0.663 | 0.888 | 0.125 | 0.092 | 0.100 | 0.808 |
| 20 | $B_{3}^*$ | 0.988 | 0.995 | 0.986 | 1.000 | 0.410 | 0.537 | 0.513 | 0.991 |
| 20 | $C_{1.5}^*$ | 0.059 | 0.061 | 0.072 | 0.077 | 0.150 | 0.156 | 0.119 | 0.155 |
| 20 | $C_{2}^*$ | 0.100 | 0.093 | 0.115 | 1.000 | 0.315 | 0.371 | 0.250 | 0.363 |
| 30 | $A_{1.5}^*$ | 0.266 | 0.271 | 0.260 | 0.701 | 0.520 | 0.592 | 0.594 | 0.529 |
| 30 | $A_{2}^*$ | 0.737 | 0.740 | 0.712 | 0.966 | 0.955 | 0.978 | 0.978 | 0.919 |
| 30 | $B_{1.5}^*$ | 0.358 | 0.345 | 0.374 | 0.786 | 0.113 | 0.108 | 0.093 | 0.581 |
| 30 | $B_{2}^*$ | 0.873 | 0.917 | 0.879 | 0.989 | 0.376 | 0.560 | 0.445 | 0.924 |
| 30 | $B_{3}^*$ | 1.000 | 1.000 | 1.000 | 1.000 | 0.928 | 0.996 | 0.986 | 1 |
| 30 | $C_{1.5}^*$ | 0.063 | 0.063 | 0.088 | 0.071 | 0.230 | 0.241 | 0.175 | 0.217 |
| 30 | $C_{2}^*$ | 0.177 | 0.128 | 0.154 | 1.000 | 0.561 | 0.695 | 0.557 | 0.782 |
Table 9. Power analysis estimations of the different statistics at a significance level of 0.05, with $\theta_1 = 1$ and $\theta_2 = 3$.

| n | Alt. | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ | $KS_U$ | $AD_U$ | $CM_U$ | $Ex_U$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | $A_{1.5}^*$ | 0.146 | 0.185 | 0.174 | 0.22 | 0.154 | 0.156 | 0.168 | 0.136 |
| 10 | $A_{2}^*$ | 0.332 | 0.376 | 0.358 | 0.437 | 0.386 | 0.409 | 0.438 | 0.32 |
| 10 | $B_{1.5}^*$ | 0.177 | 0.18 | 0.187 | 0.213 | 0.037 | 0.015 | 0.026 | 0.21 |
| 10 | $B_{2}^*$ | 0.403 | 0.663 | 0.415 | 0.443 | 0.044 | 0.009 | 0.022 | 0.455 |
| 10 | $B_{3}^*$ | 0.82 | 0.963 | 0.806 | 0.762 | 0.087 | 0.018 | 0.051 | 0.801 |
| 10 | $C_{1.5}^*$ | 0.097 | 0.128 | 0.093 | 0.096 | 0.113 | 0.121 | 0.098 | 0.082 |
| 10 | $C_{2}^*$ | 0.165 | 0.201 | 0.191 | 0.762 | 0.199 | 0.227 | 0.155 | 0.131 |
| 20 | $A_{1.5}^*$ | 0.129 | 0.148 | 0.127 | 0.399 | 0.279 | 0.309 | 0.324 | 0.258 |
| 20 | $A_{2}^*$ | 0.419 | 0.435 | 0.416 | 0.74 | 0.699 | 0.756 | 0.771 | 0.665 |
| 20 | $B_{1.5}^*$ | 0.174 | 0.174 | 0.177 | 0.447 | 0.059 | 0.032 | 0.042 | 0.399 |
| 20 | $B_{2}^*$ | 0.53 | 0.649 | 0.522 | 0.826 | 0.125 | 0.092 | 0.1 | 0.808 |
| 20 | $B_{3}^*$ | 0.967 | 0.988 | 0.956 | 0.996 | 0.410 | 0.537 | 0.513 | 0.991 |
| 20 | $C_{1.5}^*$ | 0.072 | 0.072 | 0.085 | 0.078 | 0.150 | 0.156 | 0.119 | 0.155 |
| 20 | $C_{2}^*$ | 0.128 | 0.117 | 0.146 | 0.996 | 0.315 | 0.371 | 0.25 | 0.363 |
| 30 | $A_{1.5}^*$ | 0.22 | 0.222 | 0.202 | 0.566 | 0.520 | 0.592 | 0.594 | 0.529 |
| 30 | $A_{2}^*$ | 0.627 | 0.639 | 0.595 | 0.928 | 0.955 | 0.978 | 0.978 | 0.919 |
| 30 | $B_{1.5}^*$ | 0.279 | 0.273 | 0.266 | 0.68 | 0.113 | 0.108 | 0.093 | 0.581 |
| 30 | $B_{2}^*$ | 0.779 | 0.845 | 0.745 | 0.97 | 0.376 | 0.560 | 0.445 | 0.924 |
| 30 | $B_{3}^*$ | 0.998 | 1 | 0.993 | 1 | 0.928 | 0.996 | 0.986 | 1 |
| 30 | $C_{1.5}^*$ | 0.101 | 0.09 | 0.108 | 0.07 | 0.230 | 0.241 | 0.175 | 0.217 |
| 30 | $C_{2}^*$ | 0.249 | 0.186 | 0.244 | 1 | 0.561 | 0.695 | 0.557 | 0.782 |
Table 10. Power analysis estimations of the different statistics at a significance level of 0.05, with $\theta_1 = 2$ and $\theta_2 = 3$.

| n | Alt. | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ | $KS_U$ | $AD_U$ | $CM_U$ | $Ex_U$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | $A_{1.5}^*$ | 0.135 | 0.162 | 0.157 | 0.181 | 0.154 | 0.156 | 0.168 | 0.136 |
| 10 | $A_{2}^*$ | 0.31 | 0.354 | 0.342 | 0.368 | 0.386 | 0.409 | 0.438 | 0.32 |
| 10 | $B_{1.5}^*$ | 0.169 | 0.172 | 0.176 | 0.175 | 0.037 | 0.015 | 0.026 | 0.21 |
| 10 | $B_{2}^*$ | 0.384 | 0.673 | 0.382 | 0.366 | 0.044 | 0.009 | 0.022 | 0.455 |
| 10 | $B_{3}^*$ | 0.806 | 0.965 | 0.787 | 0.692 | 0.087 | 0.018 | 0.051 | 0.801 |
| 10 | $C_{1.5}^*$ | 0.099 | 0.129 | 0.092 | 0.091 | 0.113 | 0.121 | 0.098 | 0.082 |
| 10 | $C_{2}^*$ | 0.162 | 0.205 | 0.178 | 0.943 | 0.199 | 0.227 | 0.155 | 0.131 |
| 20 | $A_{1.5}^*$ | 0.121 | 0.134 | 0.126 | 0.356 | 0.279 | 0.309 | 0.324 | 0.258 |
| 20 | $A_{2}^*$ | 0.388 | 0.407 | 0.388 | 0.71 | 0.699 | 0.756 | 0.771 | 0.665 |
| 20 | $B_{1.5}^*$ | 0.157 | 0.159 | 0.161 | 0.407 | 0.059 | 0.032 | 0.042 | 0.399 |
| 20 | $B_{2}^*$ | 0.492 | 0.649 | 0.465 | 0.793 | 0.125 | 0.092 | 0.1 | 0.808 |
| 20 | $B_{3}^*$ | 0.955 | 0.985 | 0.938 | 0.993 | 0.410 | 0.537 | 0.513 | 0.991 |
| 20 | $C_{1.5}^*$ | 0.078 | 0.074 | 0.089 | 0.084 | 0.150 | 0.156 | 0.119 | 0.155 |
| 20 | $C_{2}^*$ | 0.139 | 0.12 | 0.161 | 1 | 0.315 | 0.371 | 0.25 | 0.363 |
| 30 | $A_{1.5}^*$ | 0.189 | 0.211 | 0.185 | 0.509 | 0.520 | 0.592 | 0.594 | 0.529 |
| 30 | $A_{2}^*$ | 0.575 | 0.586 | 0.554 | 0.894 | 0.955 | 0.978 | 0.978 | 0.919 |
| 30 | $B_{1.5}^*$ | 0.246 | 0.234 | 0.229 | 0.635 | 0.113 | 0.108 | 0.093 | 0.581 |
| 30 | $B_{2}^*$ | 0.718 | 0.825 | 0.692 | 0.958 | 0.376 | 0.560 | 0.445 | 0.924 |
| 30 | $B_{3}^*$ | 0.995 | 0.999 | 0.991 | 1 | 0.928 | 0.996 | 0.986 | 1 |
| 30 | $C_{1.5}^*$ | 0.106 | 0.094 | 0.112 | 0.072 | 0.230 | 0.241 | 0.175 | 0.217 |
| 30 | $C_{2}^*$ | 0.257 | 0.201 | 0.275 | 1 | 0.561 | 0.695 | 0.557 | 0.782 |
Table 11. Power analysis estimations of the different statistics at a significance level of 0.05, with $\theta_1 = 2$ and $\theta_2 = 4$.

| n | Alt. | $STM_{n1}$ | $STM_{n2}$ | $STM_{n3}$ | $STM_{n4}$ | $KS_U$ | $AD_U$ | $CM_U$ | $Ex_U$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | $A_{1.5}^*$ | 0.133 | 0.152 | 0.137 | 0.179 | 0.154 | 0.156 | 0.168 | 0.136 |
| 10 | $A_{2}^*$ | 0.292 | 0.325 | 0.307 | 0.357 | 0.386 | 0.409 | 0.438 | 0.32 |
| 10 | $B_{1.5}^*$ | 0.129 | 0.144 | 0.141 | 0.173 | 0.037 | 0.015 | 0.026 | 0.21 |
| 10 | $B_{2}^*$ | 0.288 | 0.455 | 0.318 | 0.364 | 0.044 | 0.009 | 0.022 | 0.455 |
| 10 | $B_{3}^*$ | 0.684 | 0.832 | 0.712 | 0.68 | 0.087 | 0.018 | 0.051 | 0.801 |
| 10 | $C_{1.5}^*$ | 0.096 | 0.116 | 0.097 | 0.095 | 0.113 | 0.121 | 0.098 | 0.082 |
| 10 | $C_{2}^*$ | 0.163 | 0.186 | 0.179 | 0.802 | 0.199 | 0.227 | 0.155 | 0.131 |
| 20 | $A_{1.5}^*$ | 0.118 | 0.118 | 0.118 | 0.338 | 0.279 | 0.309 | 0.324 | 0.258 |
| 20 | $A_{2}^*$ | 0.351 | 0.347 | 0.344 | 0.693 | 0.699 | 0.756 | 0.771 | 0.665 |
| 20 | $B_{1.5}^*$ | 0.135 | 0.135 | 0.144 | 0.399 | 0.059 | 0.032 | 0.042 | 0.399 |
| 20 | $B_{2}^*$ | 0.401 | 0.475 | 0.396 | 0.779 | 0.125 | 0.092 | 0.1 | 0.808 |
| 20 | $B_{3}^*$ | 0.897 | 0.939 | 0.892 | 0.991 | 0.410 | 0.537 | 0.513 | 0.991 |
| 20 | $C_{1.5}^*$ | 0.084 | 0.082 | 0.097 | 0.091 | 0.150 | 0.156 | 0.119 | 0.155 |
| 20 | $C_{2}^*$ | 0.185 | 0.127 | 0.182 | 1 | 0.315 | 0.371 | 0.25 | 0.363 |
| 30 | $A_{1.5}^*$ | 0.164 | 0.176 | 0.165 | 0.476 | 0.520 | 0.592 | 0.594 | 0.529 |
| 30 | $A_{2}^*$ | 0.495 | 0.509 | 0.484 | 0.871 | 0.955 | 0.978 | 0.978 | 0.919 |
| 30 | $B_{1.5}^*$ | 0.177 | 0.191 | 0.195 | 0.606 | 0.113 | 0.108 | 0.093 | 0.581 |
| 30 | $B_{2}^*$ | 0.611 | 0.664 | 0.592 | 0.947 | 0.376 | 0.560 | 0.445 | 0.924 |
| 30 | $B_{3}^*$ | 0.984 | 0.99 | 0.983 | 1 | 0.928 | 0.996 | 0.986 | 1 |
| 30 | $C_{1.5}^*$ | 0.11 | 0.099 | 0.117 | 0.074 | 0.230 | 0.241 | 0.175 | 0.217 |
| 30 | $C_{2}^*$ | 0.3 | 0.216 | 0.284 | 1 | 0.561 | 0.695 | 0.557 | 0.782 |
Table 12. (i) The interval numbers of the statistical model. (ii) Probability distributions based on Kang's method.

(i)

| Item | $A_1$ | $A_2$ | $A_3$ | $A_4$ |
| --- | --- | --- | --- | --- |
| $C_1$ | [4.4, 5.8] | [2.3, 4.4] | [1.0, 1.9] | [0.1, 0.6] |
| $C_2$ | [4.9, 7.0] | [2.0, 3.4] | [3.0, 5.1] | [1.0, 1.7] |
| $C_3$ | [4.9, 7.9] | [2.2, 3.8] | [4.5, 6.9] | [1.4, 2.5] |

(ii)

| Item | $A_1$ | $A_2$ | $A_3$ | $A_4$ |
| --- | --- | --- | --- | --- |
| $P(C_1)$ | 0.270571 | 0.27477 | 0.145978 | 0.156327 |
| $P(C_2)$ | 0.419584 | 0.351607 | 0.429446 | 0.373669 |
| $P(C_3)$ | 0.309845 | 0.373622 | 0.424576 | 0.470003 |
Table 13. (i) Sharma–Taneja–Mittal entropy measures; (ii) the weights attributed, for different choices of $\theta_1$ and $\theta_2$.

(i)

| $\theta_1, \theta_2$ | $A_1$ | $A_2$ | $A_3$ | $A_4$ |
| --- | --- | --- | --- | --- |
| 1, 2 | −0.654737 | −0.66128 | −0.614002 | −0.61503 |
| 1, 3 | −0.438289 | −0.441816 | −0.420576 | −0.42009 |
| 2, 3 | −0.221841 | −0.222352 | −0.227151 | −0.22515 |
| 2, 4 | −0.149847 | −0.149125 | −0.159518 | −0.158039 |

(ii)

| $\theta_1, \theta_2$ | $Q(A_1)$ | $Q(A_2)$ | $Q(A_3)$ | $Q(A_4)$ |
| --- | --- | --- | --- | --- |
| 1, 2 | 0.254601 | 0.256272 | 0.244438 | 0.244689 |
| 1, 3 | 0.25202 | 0.25291 | 0.247595 | 0.247475 |
| 2, 3 | 0.249429 | 0.249557 | 0.250758 | 0.250256 |
| 2, 4 | 0.248928 | 0.248749 | 0.251347 | 0.250976 |
Table 14. The recognition rates of different methods.

| Approach | $C_1$ | $C_2$ | $C_3$ | Overall |
| --- | --- | --- | --- | --- |
| Kang's approach | 100% | 96% | 84% | 93.33% |
| Buono and Longobardi's approach | 100% | 96% | 86% | 94% |
| Sharma–Taneja–Mittal approach | 100% | 98% | 86% | 94.66% |