Estimation of Weighted Extropy with a Focus on Its Use in Reliability Modeling

In the literature, the estimation of weighted extropy has received little attention. In this paper, some non-parametric estimators of weighted extropy are proposed. The estimators are validated and compared through a simulation study, and their usefulness is demonstrated using real data sets.


Introduction
The concept of extropy and its uses have been explored rapidly in recent years. It measures the uncertainty contained in a probability distribution and is considered the complementary dual of the entropy introduced in [1]. The entropy measure is shift-independent, that is, it is the same for both X and X + b, and so it cannot be applied in some fields such as neurology. Thus, in [2], the notion of a weighted entropy measure was introduced. The authors pointed out that the occurrence of an event has an impact on uncertainty in two ways, presenting both quantitative and qualitative information: it first reveals the probability of the event occurring and then demonstrates its efficacy in achieving the qualitative features of a goal. It is important to note that the information obtained when a device fails to operate, or a neuron fails to release spikes, in a specific time interval differs significantly from the information obtained when such events occur in other equally wide intervals. This is why there is a need, in some cases, to employ a shift-dependent information measure that assigns varying measures to these distributions. The importance of weighted measures of uncertainty was exhibited in [3].
The concept of extropy for a continuous rv X has been presented and discussed across numerous works in the literature. The differential extropy defined by [4] is

J(X) = -(1/2) ∫ f_X²(x) dx.

One can refer to [5] for the extropy properties of order statistics and record values. The applications of extropy in automatic speech recognition can be found in [6]. Various literature sources have presented a range of extropy measures and their extensions. Analogous to weighted entropy, in [7], the concept of weighted extropy (WE) was introduced in the literature. It is given as

J^w(X) = -(1/2) ∫₀^∞ x f_X²(x) dx.

The variable x in the integral emphasizes the weight related to the occurrence of the event X = x; it assigns more significance to large values of X. In the literature, extropy, its different versions and their applications have been studied by several authors (see, for instance, [8-10]). In particular, a unified version of extropy in classical theory and in Dempster-Shafer theory was studied in [11].
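As a worked check of the WE definition, the exponential case can be computed in closed form: for f(x) = λe^(−λx), ∫₀^∞ x λ²e^(−2λx) dx = 1/4, so J^w(X) = −1/8 for every rate λ. A minimal numerical sketch of this calculation, assuming SciPy is available:

```python
import math

from scipy.integrate import quad


def weighted_extropy_exponential(lam):
    """J^w(X) = -(1/2) * integral of x * f(x)^2 for X ~ Exp(lam)."""
    integrand = lambda x: x * (lam * math.exp(-lam * x)) ** 2
    value, _ = quad(integrand, 0, math.inf)
    return -0.5 * value


# Analytically the integral equals 1/4, so J^w = -1/8 regardless of lam.
print(weighted_extropy_exponential(0.640))  # -> approximately -0.125
```

The rate 0.640 is used because it reappears in the exponential fit of the first real data set, where the maximum likelihood estimate of WE is likewise −0.125.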
There are several papers available in the literature that delve into the estimation of extropy and its various versions. Kernel estimation of functionals of the density function was proposed in [12]. The optimal bandwidth for kernel density functionals is provided in [13]. In [14], a brief account was given of optimal bandwidth estimators of kernel density functionals for contaminated data. In [15], estimators of extropy were proposed and applied to testing uniformity. In [16], length-biased sampling was used in estimating extropy. Research on nonparametric estimation using dependent data is also well-explored in the literature. Work by [17] explained the recursive and non-recursive kernel estimation of negative cumulative extropy under the α-mixing dependence condition. Recently, in [18], the kernel estimation of the extropy function was discussed using α-mixing-dependent data. Moreover, in [19], the log kernel estimation of extropy was introduced.
Even though several works are available in the literature on the estimation of extropy, little has been published so far on WE and its estimation. There are situations in which we are forced to use WE instead of extropy: unlike extropy, WE also represents the qualitative characteristics of information. In [20], the significance of employing WE as opposed to regular extropy in certain scenarios was demonstrated. There are instances where distributions possess identical extropy values but exhibit distinct WE values; in such situations, it becomes necessary to opt for WE. The estimators of WE can also be used for model selection in reliability analysis. Here, we find some estimators of WE and validate them using a simulation study and data analysis.
The paper is organized as follows: In Section 2, we introduce the log kernel estimation of WE. In Section 3, an empirical kernel smoothed estimator of WE is given. A simulation study is conducted to evaluate the estimators, and we also compare the log kernel and kernel estimators of WE in Section 4. Section 5 is devoted to real data analysis to examine the proposed estimators. Finally, we conclude the study in Section 6.

Log Kernel Estimation of Weighted Extropy
In this section, we introduce the concept of log kernel-based estimation of WE. Let X be an rv with unknown pdf f_X(x). We assume that X is defined on R and that f_X(x) is continuously differentiable. We suppose {X_i; 1 ≤ i ≤ n} is a sequence of identically distributed rvs. The most commonly used estimator of f_X(x) is the kernel density estimator (KDE), given by [21,22] as

f̂_X(x) = (1/(nh)) Σ_{i=1}^{n} K((x − X_i)/h),

where K(x) is a kernel function satisfying K(x) ≥ 0, ∫ K(x) dx = 1, ∫ x K(x) dx = 0 and ∫ x² K(x) dx < +∞. Here, the bandwidth parameter satisfies h → 0 and nh → +∞ as n → +∞. When probability density functions are estimated non-parametrically, the standard KDE is frequently used. However, for data from distributions with heavy tails, multiple modes, or skewness, particularly positive-valued data, these estimators may lose their effectiveness. In such scenarios, applying a transformation can yield more consistent results; the transformation used here is logarithmic, an important aspect of which is its ability to compress the right tail of the distribution. The resulting KDE is called the logarithmic KDE (denoted L−KDE) (refer to [23]). Let Y = log(X), Y_i = log(X_i), i = 1, 2, . . ., n, and let f_Y(y) be the pdf of Y. The L−KDE is defined as

f̂_X(x) = (1/n) Σ_{i=1}^{n} L(x, X_i, h),

where L(x, z, h) is the log kernel function with bandwidth h > 0 at location parameter z. For any z, h ∈ (0, +∞), L(x, z, h) satisfies L(x, z, h) ≥ 0 for all x ∈ (0, +∞) and ∫₀^∞ L(x, z, h) dx = 1. We let (X_1, X_2, . . ., X_n) be a sample of identically distributed observations. We obtain the L−KDE for WE by using the estimator defined in Equation (4).
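A minimal sketch of the log-transform construction, assuming a Gaussian kernel on the log scale with the back-transformed kernel L(x, z, h) = K((log x − log z)/h)/(hx); the function names and bandwidth value are illustrative:

```python
import numpy as np


def l_kde(x, data, h):
    """Log-kernel density estimate at positive points x.

    Builds a Gaussian KDE for Y = log(X) and maps it back via
    f_X(x) = f_Y(log x) / x, so the estimate integrates to one on (0, inf).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    u = (np.log(x)[:, None] - np.log(np.asarray(data))[None, :]) / h
    kern = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return kern.sum(axis=1) / (len(data) * h * x)


def trapezoid(y, x):
    """Simple trapezoidal rule (avoids NumPy version differences)."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)


rng = np.random.default_rng(0)
sample = rng.lognormal(mean=0.0, sigma=0.5, size=500)
grid = np.linspace(1e-3, 10.0, 2000)
density = l_kde(grid, sample, h=0.2)
print(trapezoid(density, grid))  # close to 1
```

Because the smoothing happens on the log scale, the back-transformed estimate automatically assigns zero density to negative values, which is exactly the boundary behavior a standard KDE lacks for positive data.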
The L−KDE for the WE function is

Ĵ^w_n(X) = −(1/2) ∫₀^∞ x f̂_X²(x) dx,

which again can be alternatively expressed as

Ĵ^w_n(X) = −(1/(2n²)) Σ_{i=1}^{n} Σ_{j=1}^{n} ∫₀^∞ x L(x, X_i, h) L(x, X_j, h) dx.

The following theorem gives the expressions for the bias and variance of the L−KDE of WE.
Theorem 1. Assume that the conditions given in Section 2 are satisfied by the log kernel function L(x) and the bandwidth h. Then, the bias and variance of the L−KDE Ĵ^w_n(X) follow along the same lines as in [19], with the weight x carried through the integrals.

Proof. The proof is omitted, as it is similar to that of [19].
The following theorem shows that the proposed estimator is consistent.
Theorem 2. Ĵ^w_n(X) is a consistent estimator of J^w(X), where Ĵ^w_n(X) and J^w(X) are defined in Equations (2) and (7). Let L(x) be the log kernel function and h the bandwidth satisfying the conditions given in Section 2. Then, as n tends to +∞, Ĵ^w_n(X) converges to J^w(X) in probability.

Proof. Since the proof is similar to that of [19], it is omitted.
The following theorem shows that the L−KDE of WE is an integratedly uniformly consistent in quadratic mean estimator of J^w(X).

Theorem 3. Consider a log kernel function L(x) and bandwidth parameter h that fulfill the conditions outlined in Section 2. If Ĵ^w_n(X) is the L−KDE given by Equation (7), then Ĵ^w_n(X) is an integratedly uniformly consistent in quadratic mean estimator of J^w(X).
Proof.As the proof resembles that of [19], it is omitted here.
Here, we provide the expression for the optimal bandwidth of Ĵw n (X).
Optimal Bandwidth

Here, we offer an expression for the optimal bandwidth using the mean integrated square error (MISE). Using the expressions for the bias and variance given in Equations (9) and (10), the MISE of Ĵ^w_n(X) can be written as the sum of the integrated squared bias and the integrated variance. The asymptotic MISE (AMISE) is obtained by ignoring the higher-order terms. The optimal bandwidth is then attained by minimizing the AMISE with respect to h.

Empirical Estimation of Weighted Extropy
Non-parametric estimation is a widely employed technique for estimating extropy and its associated measures. One common non-parametric approach is kernel density estimation, a popular method in the literature for obtaining smoothed estimates.
In this section, we introduce an empirical method of estimating the pdf to assess WE. This estimation is achieved through a non-parametric KDE (see [24,25]). The empirical kernel smoothed estimator for WE is

Ĵ^w_{n1}(X) = −(1/(2n)) Σ_{i=1}^{n} X_{i:n} f̂_X(X_{i:n}),

where f̂_X(·) is the KDE given by [21] and X_{i:n} is the i-th order statistic of the random sample.
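A minimal sketch of one such empirical kernel smoothed estimator: since E[X f_X(X)] = ∫ x f_X²(x) dx, the sample average −(1/(2n)) Σ X_i f̂_X(X_i) is a plug-in estimate of J^w(X). The Gaussian kernel and Silverman's rule-of-thumb bandwidth below are illustrative assumptions, not necessarily the exact choices of the paper:

```python
import numpy as np

SQRT_2PI = np.sqrt(2.0 * np.pi)


def gaussian_kde(points, data, h):
    """Standard Gaussian kernel density estimate evaluated at `points`."""
    u = (np.asarray(points)[:, None] - np.asarray(data)[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (len(data) * h * SQRT_2PI)


def empirical_weighted_extropy(data, h=None):
    """Plug-in WE estimate -(1/(2n)) * sum of X_i * f_hat(X_i),
    justified by E[X f(X)] = integral of x f(x)^2."""
    data = np.sort(np.asarray(data, dtype=float))
    if h is None:  # Silverman's rule of thumb (an assumption)
        h = 1.06 * data.std(ddof=1) * len(data) ** (-0.2)
    return float(-0.5 * np.mean(data * gaussian_kde(data, data, h)))


rng = np.random.default_rng(1)
sample = rng.exponential(scale=1.0, size=2000)
print(empirical_weighted_extropy(sample))  # true WE of Exp(1) is -1/8
```

On a large exponential sample the estimate lands close to the analytical value −0.125, with a small positive bias from boundary smoothing near zero.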
Table 1 shows the mean and variance of the samples of Example 1. It is clear that the mean changes and the variance tends to zero as the sample size increases; the mean and variance of the empirical estimator are therefore influenced by the sample size.
From Table 2, it is clear that the variance decreases to zero and the mean increases in the case of the Rayleigh distribution with parameter one, which again indicates the dependence of the empirical estimator on the sample size.

Simulation Study
We conduct a simulation study to evaluate the performance of the presented estimators. Random samples of different sizes are generated from some standard distributions, and the bias and root mean square error (RMSE) are calculated over 10,000 samples. The bandwidth parameter h is determined using the plug-in method proposed in [26].
To enable a comparison between the L−KDE and KDE of WE, we also propose a KDE for WE using Equation (3). The estimator is given by

Ĵ^w_{nk}(X) = −(1/2) ∫₀^∞ x f̂_X²(x) dx,

where f̂_X(x) is the KDE given in [21]. Using the consistency property of the KDE, it is clear that the proposed estimator in Equation (18) for WE is also consistent. To lay the groundwork for comparison, we generate samples from the exponential distribution, the log normal distribution, a heavy-tailed distribution and the uniform distribution. The Gaussian log-transformed kernel and the Gaussian kernel are the kernel functions used in the simulation.
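A comparison of this kind can be sketched as a small Monte Carlo experiment. The grid-based integration, rule-of-thumb bandwidths, and plug-in forms below are illustrative assumptions; Exp(1) is used as the sampling distribution because its true WE is −1/8:

```python
import numpy as np

SQRT_2PI = np.sqrt(2.0 * np.pi)


def kde(grid, data, h):
    """Gaussian KDE evaluated on a grid."""
    u = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (len(data) * h * SQRT_2PI)


def l_kde(grid, data, h):
    """Log-transformed Gaussian KDE on a positive grid."""
    u = (np.log(grid)[:, None] - np.log(data)[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (len(data) * h * grid * SQRT_2PI)


def plug_in_we(data, density, h):
    """Plug-in WE: -(1/2) * integral of x * f_hat(x)^2 on a positive grid."""
    grid = np.linspace(1e-4, 3.0 * data.max(), 2000)
    y = grid * density(grid, data, h) ** 2
    return float(-0.5 * np.sum((y[1:] + y[:-1]) * np.diff(grid)) / 2.0)


def simulate(n, reps=200, true_we=-0.125, seed=0):
    """Bias and RMSE of the KDE- and L-KDE-based WE estimators for Exp(1)."""
    rng = np.random.default_rng(seed)
    est_k, est_l = [], []
    for _ in range(reps):
        x = rng.exponential(1.0, n)
        h = 1.06 * x.std(ddof=1) * n ** (-0.2)            # rule of thumb
        hl = 1.06 * np.log(x).std(ddof=1) * n ** (-0.2)   # on the log scale
        est_k.append(plug_in_we(x, kde, h))
        est_l.append(plug_in_we(x, l_kde, hl))

    def summary(est):
        est = np.asarray(est)
        return float(est.mean() - true_we), float(np.sqrt(np.mean((est - true_we) ** 2)))

    return summary(est_k), summary(est_l)


(bias_k, rmse_k), (bias_l, rmse_l) = simulate(n=100)
print(f"KDE:   bias {bias_k:+.4f}, RMSE {rmse_k:.4f}")
print(f"L-KDE: bias {bias_l:+.4f}, RMSE {rmse_l:.4f}")
```

Repeating the call with larger n shows both RMSEs shrinking, mirroring the pattern reported in the tables.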
From Tables 3-5, it is clear that the RMSE and bias of both estimators decrease with the sample size. The decreasing RMSE indicates that the estimates approach the true values as the sample size grows, demonstrating enhanced accuracy and efficiency, and the decreasing bias likewise reflects the accuracy of the estimators. The comparison of bias and RMSE between the presented estimators reveals that the L−KDE slightly outperforms the KDE in certain scenarios, particularly for heavy-tailed and skewed distributions.

Data Analysis
In this section, we perform a comparison study and validate the accuracy of the proposed estimators using real data analysis. In each of the three scenarios, the bandwidth parameter employed for estimation is the one proposed in [26].

Data 1
The comparison between the L−KDE and KDE of WE is demonstrated using the data given in [27]. The data give the number of thousands of cycles to failure for electrical appliances in a life test.
The graphical representation in Figure 1 indicates slight skewness in the dataset. We fit an exponential distribution with parameter 0.640 to the data. The Q-Q plot in Figure 2 makes it evident that the exponential distribution is a suitable model for the observed data. The Kolmogorov-Smirnov test statistic is 0.124 with a p-value of 0.390, which reveals that the exponential distribution is a good fit to the data. The estimate of WE obtained using maximum likelihood estimation is −0.125. The estimates of WE obtained using log kernel and kernel estimation are Ĵ^w_n(X) = −0.127, Ĵ^w_{nk}(X) = −0.144 and Ĵ^w_{n1}(X) = −0.148. Hence, from the closeness of the estimates to the maximum likelihood estimate of WE, it is clear that the estimator Ĵ^w_n(X) performs better than the other two.

Data 2 (Heavy-Tailed Data)
Again, we illustrate the comparison between the three estimators using the data from [28]. The data represent the remission times (in months) of 137 cancer patients. The kurtosis value of 15.195 is exceptionally high and suggests a very heavy-tailed, or leptokurtic, distribution. Hence, a log normal distribution is fitted to the data, and the parameters obtained are μ = 1.756, σ = 1.066.
Figure 3 indicates right-skewed, heavy-tailed data. The Q-Q plot in Figure 4 shows that the data align well with the characteristics of the log normal distribution, indicating that the log normal model is an appropriate fit for the observed dataset. The Kolmogorov-Smirnov test, with a statistic of 0.06 and a p-value of 0.591, confirms that the log normal distribution fits the data well. The estimates of WE using the proposed estimators and by maximum likelihood estimation are calculated for these data. We obtain Ĵ^w_n(X) = −0.1346, Ĵ^w_{nk}(X) = −0.1418, and Ĵ^w_{n1}(X) = −18.952. The estimate of WE using maximum likelihood estimation is −0.1323, which signifies that the L−KDE of WE performs better than WE estimated with standard kernel estimation methods when dealing with heavy-tailed data.

Data 3 (The Time until Failure of the Three Systems)
The data are obtained from [29]. The observations represent three repairable systems observed until the time of their 12th failure. The three identically designed systems exhibit distinct behaviors: the repair rate shows a decreasing trend indicative of improvement in one system, a stable linear trend in another, and an increasing trend signifying deterioration in the third. Figure 5 shows the density plots of the three systems. Table 6 shows the values of the suggested estimators of WE for these systems. According to [30], a system or component with high uncertainty is less reliable. In accordance with this concept, we can infer that System 3 is less reliable than Systems 1 and 2 with regard to all three proposed estimators. Using repair rates, System 3 was also identified in [29] as the deteriorating system. This example vividly demonstrates how the estimation of WE is useful in choosing a reliable system among several competing models.

Conclusions
In this article, we considered non-parametric estimation of WE. The L−KDE and the empirical kernel smoothed estimator for WE were presented. The bias, variance, optimal bandwidth and some properties of the L−KDE of the WE function were also established. A KDE was also proposed to enable a comparison with the proposed L−KDE. We assessed the accuracy of the three estimators by evaluating their performance using measures such as bias and RMSE. We determined that in some situations, for example with heavy-tailed or skewed data sets, the L−KDE of WE performs slightly better than the other two estimators. The real data analyses also assessed the performance of the estimators and their utility in reliability modeling. We also demonstrated how WE is beneficial when choosing a reliable system from various competing models, highlighting its practicality in the selection process.

Figure 5. Density plot of "Failure time of three systems".

Table 2. Mean and variance of Ĵ^w_{n1}(X) for the Rayleigh distribution with parameter 1.