The Lomax-Exponentiated Odds Ratio–G Distribution and Its Applications

Abstract: This paper introduces the Lomax-exponentiated odds ratio–G (L-EOR–G) distribution, a new family designed to handle the complexities of modern datasets. It combines theoretical rigor with practical application to overcome the limitations of traditional models in capturing complex data attributes such as heavy tails, varied curve shapes, and multimodality. The study establishes a systematic theoretical framework by detailing the distribution's statistical properties, and it validates the efficacy and robustness of parameter estimation via Monte Carlo simulations. Empirical evidence from real-world datasets further demonstrates the distribution's superior modeling capabilities, supported by various goodness-of-fit tests. The combination of theoretical precision and practical utility makes the L-EOR–G distribution a significant advancement in statistical modeling, enhancing precision and adaptability. The new model not only addresses a critical need within statistical modeling but also opens avenues for future research, including the development of more sophisticated estimation methods and the adaptation of the model to various data types, thereby promising to refine statistical analysis and interpretation across a wide array of disciplines.


Introduction
The introduction of generalized distributions represents a significant advancement in the field of statistical analysis, laying the foundation for the development of both flexible and complex statistical models. These frameworks are highly regarded by researchers and statisticians for their capability to tailor analytical strategies to the unique challenges encountered in various datasets [1]. Generalized distributions have garnered widespread interest across numerous fields, such as epidemiology and survival analysis, due to their comprehensive applicability, as illustrated by recent models proposed in [2][3][4]. Recent additions to this domain include the transformed Log-Burr III distribution [5], the Ristić-Balakrishnan-Topp-Leone-Gompertz-G distribution [6], a novel family of modified slash distributions [7], the flexible Gumbel distribution [8], the new sine inverted exponential distribution [9], the Beta-truncated Lomax distribution [10], the bivariate Chen distribution [11], the power function-Lindley distribution [12], and the two-parameter Weibull distribution [13], among others.
This paper introduces the Lomax-exponentiated odds ratio-G distribution, a novel integration of the Lomax distribution's resilience with the odds ratio. The Lomax distribution originates from the Pareto Type-II distribution, and its extra scale parameter significantly enhances its effectiveness in accurately estimating failure times and lifespans [14]. The odds ratio is utilized in various areas, including epidemiology and the social sciences, to model binary events. It elucidates the impact of numerous factors on dichotomous outcomes, providing insights into the likelihood of an event occurring. The incorporation of exponentiation within the odds ratio distribution broadens its adaptability, enabling the modeling of complex interrelationships and interactions among variables. This enhancement exemplifies the extensive utility of statistical distributions in addressing a wide array of analytical challenges [15][16][17].
The L-EOR-G distribution demonstrates promising potential as a robust solution for a wide range of data analysis challenges. By amalgamating these well-established concepts, the L-EOR-G distribution seeks to enhance the adaptability and potential of the foundational models. Compared to other sophisticated generalized models, the L-EOR-G distribution is conceptually straightforward and aligns with well-known distributions such as the Pareto type I, beta prime, F, and q-exponential distributions.
The rationale for proposing the L-EOR-G distribution stems not only from the need for greater model flexibility but also from the desire to provide a unified approach to data analysis that can be easily interpreted and applied across different disciplines. The L-EOR-G distribution is posited to offer superior fit and predictive accuracy for datasets that exhibit behaviors challenging to model with traditional distributions. Moreover, the development of the L-EOR-G distribution is motivated by the ongoing evolution in statistical methodologies, where there is a pressing need for models that not only fit the data well but also offer insights into the underlying processes generating the data. By providing a more adaptable and intuitive modeling tool, the L-EOR-G distribution aims to contribute to the advancement of statistical science, opening new avenues for research and application.
The manuscript details the proposal of the new distribution in Section 2, followed by an exploration of the mathematical properties of this distribution family in Section 3. Section 4 discusses various parameter estimation methods alongside Monte Carlo simulations for each estimator. Section 5 presents several special cases, including their probability density functions, hazard rate functions, skewness, and kurtosis. Finally, Section 6 showcases the application of the new model to real-life datasets, demonstrating its practical utility and flexibility. Detailed proofs for all theorems and lemmas presented in this manuscript are available in the Supplementary Information, which is accessible via the following URL: https://github.com/shusenpu/LEORG/blob/8f2f214fad950627216556df0158f2ca6d03554d/SI.pdf (accessed on 17 May 2024).

Lomax-Exponentiated Odds Ratio-G Family of Distributions
Chen et al. proposed the exponentiated odds ratio generator [18], which introduces a comprehensive framework for generalized odds ratio distributions. In this paper, the Lomax distribution is selected as the outer distribution, with its cumulative distribution function (cdf) represented as
$$F(x; \lambda, k) = 1 - \left(1 + \frac{x}{\lambda}\right)^{-k}, \quad x \ge 0, \qquad (1)$$
where λ > 0 is the scale parameter and k > 0 is the concentration of the distribution. Consequently, the cdf of the newly defined Lomax-exponentiated odds ratio-G distribution is expressed by the following:
$$F(x; \alpha, \beta, k, \psi) = 1 - \left[1 + \alpha\,\frac{G(x, \psi)^{\beta}}{\left(1 - G(x, \psi)\right)^{\beta}}\right]^{-k}, \qquad (2)$$
which simplifies to the following:
$$F(x; \alpha, \beta, k, \psi) = 1 - \left[1 + \alpha\left(\frac{G(x, \psi)}{\bar{G}(x, \psi)}\right)^{\beta}\right]^{-k}, \qquad (3)$$
and the probability density function (pdf) is detailed as follows:
$$f(x; \alpha, \beta, k, \psi) = \frac{k\,\alpha\,\beta\, g(x, \psi)\, G(x, \psi)^{\beta-1}}{\bar{G}(x, \psi)^{\beta+1}} \left[1 + \alpha\left(\frac{G(x, \psi)}{\bar{G}(x, \psi)}\right)^{\beta}\right]^{-k-1}, \qquad (4)$$
where the parameters α, β, and k are positive, characterizing the scale, shape, and concentration of the distribution. The function g(x, ψ) denotes the baseline pdf, G(x, ψ) represents the cdf associated with the baseline distribution, and $\bar{G}(x, \psi) = 1 - G(x, \psi)$ is the corresponding survival function. This formulation makes explicit how the proposed L-EOR-G distribution depends on its underlying baseline distribution. Our objective is to demonstrate that the incorporation of extra parameters can enhance simple distributions, originally characterized by limited variability, enabling them to exhibit a diverse array of shapes and skewness. For practical applications of this newly developed family of distributions, we advocate the selection of baseline functions with simple forms.
Notably, when k = 1/α, the L-EOR-G distribution reduces to the extended odd Weibull-G distribution discussed in [19]. The ensuing sections are dedicated to a thorough exploration of the statistical properties, simulations, special cases, and real-world applications of this distribution.
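As a minimal sketch of how the family can be evaluated numerically, the following Python code implements the cdf and pdf for an arbitrary baseline. It assumes the cdf has the form F(x) = 1 − [1 + α (G(x)/(1 − G(x)))^β]^(−k), which is consistent with the stated reduction to the extended odd Weibull-G distribution at k = 1/α. The function names and the exponential baseline used for illustration are hypothetical choices, not taken from the paper.

```python
import math

def leorg_cdf(x, alpha, beta, k, G):
    """Assumed L-EOR-G cdf: 1 - [1 + alpha * (G/(1-G))**beta]**(-k)."""
    Gx = G(x)
    if Gx <= 0.0:
        return 0.0
    if Gx >= 1.0:
        return 1.0
    odds = Gx / (1.0 - Gx)  # odds ratio of the baseline cdf
    return 1.0 - (1.0 + alpha * odds ** beta) ** (-k)

def leorg_pdf(x, alpha, beta, k, G, g):
    """Derivative of the assumed cdf with respect to x."""
    Gx = G(x)
    if Gx <= 0.0 or Gx >= 1.0:
        return 0.0
    surv = 1.0 - Gx
    odds = Gx / surv
    return (k * alpha * beta * g(x) * Gx ** (beta - 1.0) / surv ** (beta + 1.0)
            * (1.0 + alpha * odds ** beta) ** (-k - 1.0))

def exponential_baseline(gamma):
    """Return (G, g) for an exponential baseline with rate gamma (illustrative)."""
    G = lambda x: -math.expm1(-gamma * x) if x > 0 else 0.0
    g = lambda x: gamma * math.exp(-gamma * x) if x > 0 else 0.0
    return G, g

if __name__ == "__main__":
    G, g = exponential_baseline(1.3)
    for x in (0.2, 0.7, 1.5):
        print(x, leorg_cdf(x, 1.5, 2.9, 0.8, G), leorg_pdf(x, 1.5, 2.9, 0.8, G, g))
```

Passing the baseline as a pair of callables keeps the sketch independent of any particular special case; other baselines can be substituted without touching the two main functions.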

Mathematical and Statistical Properties
This section elucidates the mathematical and statistical properties of the Lomax-exponentiated odds ratio-G distribution. The analysis covers the moment-generating function, the raw, central, and incomplete moments, the Rényi entropy, and the probability-weighted moments, establishing a comprehensive understanding of the distribution's characteristics.

Expansion of the Probability Density Function
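Theorem 1. The probability density function (pdf) of the L-EOR-G distribution is formulated as a linear combination of exponentiated-G densities, as follows:
$$f(x; \alpha, \beta, k, \psi) = \sum_{j=0}^{\infty}\sum_{m=0}^{\infty} c_{j,m}\, s_{m+\beta(j+1)}(x, \psi), \qquad (5)$$
where the coefficients $c_{j,m}$ are specified by the following:
$$c_{j,m} = \frac{(-1)^{j}\, k\, \beta\, \alpha^{j+1}}{m+\beta(j+1)} \binom{k+j}{j} \binom{\beta(j+1)+m}{m}, \qquad (6)$$
with the binomial coefficients extended to real arguments through the gamma function. The function $s_{m+\beta(j+1)}(x, \psi)$ represents the exponentiated-G density with parameter m + β(j + 1), described as follows:
$$s_{m+\beta(j+1)}(x, \psi) = \left[m+\beta(j+1)\right] g(x, \psi)\, G(x, \psi)^{m+\beta(j+1)-1}. \qquad (7)$$
Proofs for Theorem 1 and all subsequent theorems and lemmas are provided in the Supplementary Information.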

Hazard Rate
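Theorem 2. The hazard rate function (hrf) of the L-EOR-G distribution is articulated as follows:
$$h(x; \alpha, \beta, k, \psi) = \frac{k\,\alpha\,\beta\, g(x, \psi)\, G(x, \psi)^{\beta-1}\, \bar{G}(x, \psi)^{-\beta-1}}{1 + \alpha\left(G(x, \psi)/\bar{G}(x, \psi)\right)^{\beta}}, \qquad (8)$$
and its reverse hazard rate function (rhrf) is given by the following:
$$\tau(x; \alpha, \beta, k, \psi) = \frac{k\,\alpha\,\beta\, g(x, \psi)\, G(x, \psi)^{\beta-1}\, \bar{G}(x, \psi)^{-\beta-1} \left[1 + \alpha\left(G(x, \psi)/\bar{G}(x, \psi)\right)^{\beta}\right]^{-k-1}}{1 - \left[1 + \alpha\left(G(x, \psi)/\bar{G}(x, \psi)\right)^{\beta}\right]^{-k}}. \qquad (9)$$

Quantile Function
Theorem 3. The quantile function for the L-EOR-G distribution is delineated as follows:
$$Q(u) = G^{-1}\!\left(\frac{z_u}{1+z_u}\right), \qquad z_u = \left[\frac{(1-u)^{-1/k} - 1}{\alpha}\right]^{1/\beta}, \quad 0 < u < 1. \qquad (10)$$
The subsequent subsections further detail moments, central moments, incomplete moments, and generating functions, providing a multifaceted view of the L-EOR-G distribution's mathematical framework.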
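To make the hazard rate and quantile function concrete, the sketch below evaluates both for an exponential baseline G(x) = 1 − e^(−γx), for which the baseline odds ratio is e^(γx) − 1. It assumes the cdf F(x) = 1 − [1 + α (G/(1 − G))^β]^(−k); the closed forms in the code are derived from that assumption, and the function names are illustrative.

```python
import math

def leore_cdf(x, alpha, beta, k, gamma):
    """Assumed cdf with exponential baseline: odds = exp(gamma*x) - 1."""
    odds = math.expm1(gamma * x)
    return 1.0 - (1.0 + alpha * odds ** beta) ** (-k)

def leore_hazard(x, alpha, beta, k, gamma):
    """Hazard rate f / (1 - F) under the same assumption."""
    odds = math.expm1(gamma * x)
    return (k * alpha * beta * gamma * math.exp(gamma * x) * odds ** (beta - 1.0)
            / (1.0 + alpha * odds ** beta))

def leore_quantile(u, alpha, beta, k, gamma):
    """Invert the assumed cdf: solve F(x) = u for x, with 0 <= u < 1."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    # (1 - u)**(-1/k) - 1, computed stably for u close to 0
    lomax_part = math.expm1(-math.log1p(-u) / k)
    odds = (lomax_part / alpha) ** (1.0 / beta)
    # exponential baseline: G = odds / (1 + odds), so x = log(1 + odds) / gamma
    return math.log1p(odds) / gamma

if __name__ == "__main__":
    params = (1.5, 2.9, 0.8, 1.3)  # alpha, beta, k, gamma
    for u in (0.1, 0.5, 0.9):
        x = leore_quantile(u, *params)
        print(u, x, leore_cdf(x, *params), leore_hazard(x, *params))
```

A quick sanity check of the assumed form: with α = β = 1 the model collapses to an exponential distribution with rate kγ, so the hazard is constant.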

Moments, Incomplete Moments and Generating Functions
Moments serve as statistical measures to characterize probability distributions and succinctly summarize datasets. From the first moment (mean), representing the distribution's location, to the variance (second central moment) depicting the spread, and on to skewness and kurtosis (third and fourth standardized moments) illustrating the shape, each moment contributes to a full picture of the distribution.

Raw Moments
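Lemma 1. The rth-order raw moment of the L-EOR-G distribution is expressed as follows:
$$\mu_r' = E[X^r] = \sum_{j=0}^{\infty}\sum_{m=0}^{\infty} c_{j,m}\, E\!\left[Z_{j,m}^{\,r}\right], \qquad (11)$$
where $Z_{j,m}$ is the exponentiated-G random variable with parameter m + β(j + 1), and $c_{j,m}$ is as defined in Equation (6).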
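As a numerical cross-check on the raw moments, one can bypass the series and integrate the quantile function directly, using E[X^r] = ∫₀¹ Q(u)^r du. The sketch below does this with a midpoint rule for the exponential-baseline case, assuming the cdf F(x) = 1 − [1 + α (e^(γx) − 1)^β]^(−k). The function names and grid size are illustrative choices, and this is a swapped-in numerical technique rather than the series representation of Lemma 1.

```python
import math

def leore_quantile(u, alpha, beta, k, gamma):
    """Quantile of the assumed L-EOR-E cdf (exponential baseline)."""
    lomax_part = math.expm1(-math.log1p(-u) / k)   # (1-u)**(-1/k) - 1
    odds = (lomax_part / alpha) ** (1.0 / beta)
    return math.log1p(odds) / gamma

def raw_moment(r, alpha, beta, k, gamma, n_grid=200_000):
    """Midpoint-rule approximation of E[X**r] = integral of Q(u)**r over (0, 1)."""
    h = 1.0 / n_grid
    total = 0.0
    for i in range(n_grid):
        u = (i + 0.5) * h
        total += leore_quantile(u, alpha, beta, k, gamma) ** r
    return total * h

if __name__ == "__main__":
    # With alpha = beta = 1 the model is exponential with rate k * gamma,
    # so the mean should be close to 1 / (k * gamma).
    print(raw_moment(1, 1.0, 1.0, 2.0, 0.5))
    print(raw_moment(1, 1.5, 2.9, 0.8, 1.3))
```

The integrand is unbounded as u approaches 1, so the midpoint rule converges slowly in the upper tail; a finer grid or a tail-adapted quadrature would be preferable when high-order moments are needed.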

Central Moments
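Lemma 2. The nth-order central moment of the L-EOR-G distribution is articulated as follows:
$$\mu_n = E\!\left[(X-\mu)^n\right] = \sum_{r=0}^{n} \binom{n}{r} (-\mu)^{n-r} \sum_{j=0}^{\infty}\sum_{m=0}^{\infty} c_{j,m}\, E\!\left[Y_{m+\beta(j+1)}^{\,r}\right], \qquad (12)$$
where μ = E[X], $E[Y_{m+\beta(j+1)}^{\,r}]$ signifies the rth raw moment of the exponentiated-G random variable $Y_{m+\beta(j+1)}$, and $c_{j,m}$ is as outlined in Equation (6).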

Incomplete Moments
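Lemma 3. The sth incomplete moment of the L-EOR-G distribution is detailed as follows:
$$I_X(t) = \int_{-\infty}^{t} x^{s} f(x)\,dx = \sum_{j=0}^{\infty}\sum_{m=0}^{\infty} c_{j,m} \int_{-\infty}^{t} x^{s}\, h_{m+\beta(j+1)}(x, \psi)\,dx, \qquad (13)$$
that is, $x^s$ is integrated against the exponentiated-G density $h_{m+\beta(j+1)}(x, \psi)$ with parameter m + β(j + 1) over the range from −∞ to t, and $c_{j,m}$ is as defined in Equation (6).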

Rényi Entropy and Order Statistics
Lemma 5. The Rényi entropy of the L-EOR-G distribution is expressed as follows: where $I_{REG}$ is the Rényi entropy of the exponentiated-G family.
In the field of statistics, the kth-order statistic of a statistical sample denotes the kth smallest value within that sample. Order statistics, together with rank statistics, serve as essential instruments in the areas of non-parametric statistics and statistical inference.
Lemma 6. Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed (i.i.d.) random variables from the L-EOR-G distribution. The pdf of the ith-order statistic is articulated as follows:
$$f_{i:n}(x) = \frac{n!}{(i-1)!\,(n-i)!}\, f(x)\, F(x)^{i-1} \left[1 - F(x)\right]^{n-i},$$
where f and F denote the L-EOR-G pdf and cdf, respectively.

Probability-Weighted Moments
Lemma 7. The probability-weighted moments of the L-EOR-G distribution are given as follows:

Special Cases of the L-EOR-G Distribution
This section is dedicated to examining several special cases within the Lomax-exponentiated odds ratio-G framework, emphasizing its flexibility and adaptability across various distributions. Through this exploration, we aim to highlight the versatility of the L-EOR-G model by using different baseline distributions. For each distinct case, we present plots of its probability density function, hazard rate function, as well as its skewness and kurtosis for chosen parameters. Further visualizations of skewness and kurtosis for these cases are provided in the Supplementary Information.

Lomax-Exponentiated Odds Ratio-Uniform Distribution
Given a uniform baseline distribution G(x, ψ) with parameter θ > 0, where g(x; θ) = 1/θ and G(x; θ) = x/θ for 0 < x < θ, we derive the Lomax-exponentiated odds ratio-uniform (L-EOR-U) distribution. Its cumulative distribution function and probability density function are, respectively, defined as follows:
$$F(x; \alpha, \beta, k, \theta) = 1 - \left[1 + \alpha\left(\frac{x}{\theta - x}\right)^{\beta}\right]^{-k}, \quad 0 < x < \theta,$$
$$f(x; \alpha, \beta, k, \theta) = k\,\alpha\,\beta\,\theta\, x^{\beta-1} (\theta - x)^{-\beta-1} \left[1 + \alpha\left(\frac{x}{\theta - x}\right)^{\beta}\right]^{-k-1}.$$
The hazard rate function, which describes the distribution's behavior in failure rate modeling, is as follows:
$$h(x; \alpha, \beta, k, \theta) = \frac{k\,\alpha\,\beta\,\theta\, x^{\beta-1} (\theta - x)^{-\beta-1}}{1 + \alpha\left(x/(\theta - x)\right)^{\beta}}.$$
Figure 1 displays the pdf and hrf for the L-EOR-U distribution, showing a diverse array of curve shapes, including monotonically decreasing trends as well as left- and right-skewed and various bell-shaped patterns. In parallel, the hrf of the L-EOR-U distribution illustrates a spectrum of growth trends, including consistent increases and distinct bathtub profiles, emphasizing the model's versatility. Further demonstrating this flexibility, Figures 2 and 3 present the skewness and kurtosis across different parameter settings for the L-EOR-U distribution, highlighting the broad adaptability and shape variability of this model. The skewness plots show that the distribution can be left- or right-skewed, and the kurtosis plots show that the peak and tails can take various shapes.

Lomax-Exponentiated Odds Ratio-Exponential Distribution
When the baseline distribution G(x, ψ) follows an exponential distribution with parameter γ > 0, i.e., $g(x; \gamma) = \gamma e^{-\gamma x}$ and $G(x; \gamma) = 1 - e^{-\gamma x}$, we obtain the Lomax-exponentiated odds ratio-exponential (L-EOR-E) distribution. Its cdf and pdf are given by the following:
$$F(x; \alpha, \beta, k, \gamma) = 1 - \left[1 + \alpha\left(e^{\gamma x} - 1\right)^{\beta}\right]^{-k}, \quad x > 0,$$
$$f(x; \alpha, \beta, k, \gamma) = k\,\alpha\,\beta\,\gamma\, e^{\gamma x} \left(e^{\gamma x} - 1\right)^{\beta-1} \left[1 + \alpha\left(e^{\gamma x} - 1\right)^{\beta}\right]^{-k-1}.$$
Figure 4 illustrates the pdf for five parameter combinations of the L-EOR-E distribution, demonstrating a range of skewness profiles. The hrf for the L-EOR-E distribution reveals a variety of behaviors, including consistent declines, assorted growth trends, and inverted U-shaped curves. Furthermore, variations in skewness and kurtosis across different parameter settings for the L-EOR-E distribution are detailed in Figures 5 and 6, highlighting the distribution's flexibility in shape under varied conditions. The skewness and kurtosis plots show that the distribution can have different symmetry and tail shapes.

Lomax-Exponentiated Odds Ratio-Weibull Distribution
Considering a Weibull baseline distribution with parameters λ, γ > 0, the Lomax-exponentiated odds ratio-Weibull (L-EOR-W) distribution emerges. Its cdf and pdf, showcasing a rich array of behaviors across different parameter values, are defined as follows:
Figure 7 presents the pdf for various configurations of the L-EOR-W distribution, including a wide range of curve characteristics from left- and right-skewed to nearly symmetrical and declining profiles. Notably, the pdf illustrates a distinct stretched M shape featuring dual peaks. Correspondingly, the hrf for these distributions unveils a variety of shapes, including different forms of U-shaped, inverted U-shaped, and concave-upward trends. Further detailing the distribution's adaptability, Figures 8 and 9 explore the skewness and kurtosis across assorted parameter settings for the L-EOR-W distribution, showcasing its shape versatility under varying conditions. The diverse skewness and kurtosis plots indicate that the L-EOR-W distribution can have different symmetry and tail characteristics.

Lomax-Exponentiated Odds Ratio-Kumaraswamy Distribution
With the Kumaraswamy distribution as the baseline, characterized by parameters a, b > 0, we explore the Lomax-exponentiated odds ratio-Kumaraswamy (L-EOR-K) distribution. This variant is particularly notable for its flexibility in modeling data with diverse behaviors through its cdf and pdf:
Figure 10 features the pdf for five notable instances within the L-EOR-K distribution framework, exhibiting a spectrum of shapes from unimodal to bathtub configurations. The hrf for these distributions further diversifies the analysis, showcasing U-shaped and varied ascending patterns. Additionally, Figures 11 and 12 examine the skewness and kurtosis across multiple parameter settings of the L-EOR-K distribution, revealing its capacity to span a broad range from negative to positive values in skewness, emphasizing the model's extensive flexibility in statistical shape representation. The skewness and kurtosis plots suggest that the L-EOR-K distribution can have different symmetry and tail shapes.
This exploration of special cases within the L-EOR-G distribution framework not only emphasizes its theoretical significance but also shows its practical potential in modeling real-world data with varying characteristics.

Methods of Estimation
This section provides an in-depth examination of the diverse estimation methodologies applicable to the Lomax-exponentiated odds ratio-G distribution. The array of approaches, from maximum likelihood estimation to the Anderson-Darling approach, demonstrates the distribution's efficacy in statistical modeling. This exploration highlights the adaptability and robust nature of the L-EOR-G model through various estimation strategies, affirming its utility in diverse analytical scenarios.

Maximum Likelihood Estimation
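The maximum likelihood estimator (MLE) method is widely recognized for its efficacy in parameter estimation within distributions. Let $\Delta = (\alpha, \beta, k, \psi)^{T}$ and let $x_1, \ldots, x_n$ be the observed sample; then the log-likelihood for Δ is given by the following:
$$\ell(\Delta) = n \log(k\alpha\beta) + \sum_{i=1}^{n} \log g(x_i, \psi) + (\beta - 1)\sum_{i=1}^{n} \log G(x_i, \psi) - (\beta + 1)\sum_{i=1}^{n} \log \bar{G}(x_i, \psi) - (k + 1)\sum_{i=1}^{n} \log\!\left[1 + \alpha\left(\frac{G(x_i, \psi)}{\bar{G}(x_i, \psi)}\right)^{\beta}\right].$$
The MLE can be obtained by solving the nonlinear equations ∂ℓ/∂α = ∂ℓ/∂β = ∂ℓ/∂k = ∂ℓ/∂ψ_s = 0, where ψ_s runs over the components of ψ, employing numerical methods such as the Newton-Raphson approach.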
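The log-likelihood is straightforward to code once a baseline is fixed. The sketch below does so for the exponential baseline, assuming the pdf f(x) = kαβγ e^(γx) (e^(γx) − 1)^(β−1) [1 + α (e^(γx) − 1)^β]^(−k−1) that follows from the cdf F(x) = 1 − [1 + α (e^(γx) − 1)^β]^(−k). The function name is illustrative, and the optimizer itself (Newton-Raphson or any other routine) is deliberately left out.

```python
import math

def leore_loglik(data, alpha, beta, k, gamma):
    """Log-likelihood of the assumed L-EOR-E model; data must be strictly positive."""
    if min(alpha, beta, k, gamma) <= 0.0:
        return -math.inf  # outside the parameter space
    n = len(data)
    total = n * math.log(k * alpha * beta * gamma)
    for x in data:
        odds = math.expm1(gamma * x)  # baseline odds ratio exp(gamma*x) - 1
        total += (gamma * x
                  + (beta - 1.0) * math.log(odds)
                  - (k + 1.0) * math.log1p(alpha * odds ** beta))
    return total

if __name__ == "__main__":
    sample = [0.31, 0.52, 0.77, 0.95, 1.20, 1.48]  # made-up illustrative values
    print(leore_loglik(sample, 1.5, 2.9, 0.8, 1.3))
```

Returning negative infinity outside the parameter space lets a generic maximizer treat the positivity constraints implicitly; in practice one would often optimize over log-parameters instead.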

Least Squares and Weighted Least Squares Estimation
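The least squares (LS) and weighted least squares (WLS) methodologies both provide estimators for the model parameters. Let $x_{(1)} \le \cdots \le x_{(n)}$ denote the ordered sample. The LS estimator minimizes the following:
$$LS(\Delta) = \sum_{i=1}^{n} \left[F(x_{(i)}; \Delta) - \frac{i}{n+1}\right]^{2}.$$
The WLS estimation, on the other hand, seeks to minimize the following:
$$WLS(\Delta) = \sum_{i=1}^{n} \frac{(n+1)^{2}(n+2)}{i\,(n-i+1)} \left[F(x_{(i)}; \Delta) - \frac{i}{n+1}\right]^{2}.$$
The LS or WLS estimates can be derived by solving the nonlinear equations ∂LS/∂k = ∂LS/∂α = ∂LS/∂β = ∂LS/∂ψ_s = 0 (and likewise for WLS) using numerical methods like the Newton-Raphson approach.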
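The two objectives can be sketched in a few lines once a fitted cdf is available as a callable. The code below assumes the standard forms of the criteria, with plotting positions i/(n + 1) and WLS weights (n + 1)²(n + 2)/(i(n − i + 1)); the function names are illustrative, and `cdf` stands for any candidate cdf with its parameters already bound.

```python
def ls_objective(data, cdf):
    """Sum of squared gaps between the fitted cdf and i/(n+1) at the order statistics."""
    xs = sorted(data)
    n = len(xs)
    return sum((cdf(x) - i / (n + 1)) ** 2 for i, x in enumerate(xs, start=1))

def wls_objective(data, cdf):
    """Same gaps, weighted by the inverse variance of the i-th uniform order statistic."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        weight = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
        total += weight * (cdf(x) - i / (n + 1)) ** 2
    return total

if __name__ == "__main__":
    uniform_cdf = lambda x: min(max(x, 0.0), 1.0)  # stand-in cdf for illustration
    print(ls_objective([0.3, 0.4, 0.6, 0.8], uniform_cdf))
    print(wls_objective([0.3, 0.4, 0.6, 0.8], uniform_cdf))
```

Either function can be handed to a general-purpose minimizer by wrapping it in a function of the parameter vector that rebuilds `cdf` on each call.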

Maximum Product Spacing Approach of Estimation
The maximum product spacing (MPS) approach of estimation proves especially valuable when confronting unknown or intricate distributions, as highlighted by [20].
The geometric mean of the spacings is defined as follows:
$$GM(\sigma) = \left[\prod_{i=1}^{n+1} D_i(\sigma)\right]^{\frac{1}{n+1}},$$
where
$$D_i(\sigma) = F(x_{(i)}; \sigma) - F(x_{(i-1)}; \sigma), \quad i = 1, \ldots, n+1,$$
with $F(x_{(0)}; \sigma) = 0$ and $F(x_{(n+1)}; \sigma) = 1$. Therefore, the objective is to maximize the following:
$$MPS(\sigma) = \left[\prod_{i=1}^{n+1} D_i(\sigma)\right]^{\frac{1}{n+1}}.$$
Equivalently, we can maximize the function L(σ) = log[MPS(σ)]. The MPS estimate is obtained by solving the nonlinear equations ∂L/∂k = ∂L/∂α = ∂L/∂β = ∂L/∂ψ_s = 0 via numerical methods.
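The logarithm of the product-spacing criterion can be sketched as follows, assuming the usual definition in which the spacings are differences of the fitted cdf at consecutive order statistics, padded with 0 on the left and 1 on the right. The function name is illustrative and `cdf` is any candidate cdf with its parameters bound.

```python
import math

def log_mps(data, cdf):
    """Mean log-spacing: (1/(n+1)) * sum of log(F(x_(i)) - F(x_(i-1)))."""
    xs = sorted(data)
    n = len(xs)
    probs = [0.0] + [cdf(x) for x in xs] + [1.0]
    spacings = [probs[i] - probs[i - 1] for i in range(1, n + 2)]
    if min(spacings) <= 0.0:
        return -math.inf  # tied observations or a degenerate fit
    return sum(math.log(d) for d in spacings) / (n + 1)

if __name__ == "__main__":
    uniform_cdf = lambda x: min(max(x, 0.0), 1.0)  # stand-in cdf for illustration
    print(log_mps([0.2, 0.4, 0.6, 0.8], uniform_cdf))  # equal spacings
    print(log_mps([0.1, 0.4, 0.6, 0.8], uniform_cdf))  # unequal spacings score lower
```

The criterion is largest when the fitted cdf spreads the observations evenly over (0, 1), which is why equally spaced probabilities attain the upper bound −log(n + 1). Tied observations produce zero spacings and need separate handling in real data.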

Cramér-von Mises Approach of Estimation
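We can employ the Cramér-von Mises criterion to derive estimators by minimizing the function CVM(x, σ) with respect to σ, where
$$CVM(x, \sigma) = \frac{1}{12n} + \sum_{i=1}^{n} \left[F(x_{(i)}; \sigma) - \frac{2i-1}{2n}\right]^{2}.$$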

Anderson-Darling Approach of Estimation
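The Anderson-Darling estimator can be determined by minimizing the function AD(σ) with respect to σ, where
$$AD(\sigma) = -n - \frac{1}{n}\sum_{i=1}^{n} (2i-1)\left[\log F(x_{(i)}; \sigma) + \log\!\left(1 - F(x_{(n+1-i)}; \sigma)\right)\right].$$
Because this criterion places more weight on the tails than the Cramér-von Mises criterion, the AD approach can refine the estimates and contribute to an accurate characterization of the L-EOR-G distribution.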
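Both distance-based criteria can be sketched with the same interface. The code assumes the standard Cramér-von Mises and Anderson-Darling statistics evaluated at the ordered sample; the function names are illustrative and `cdf` is any candidate cdf with its parameters bound.

```python
import math

def cvm_objective(data, cdf):
    """1/(12n) + sum over i of (F(x_(i)) - (2i-1)/(2n))**2."""
    xs = sorted(data)
    n = len(xs)
    return 1.0 / (12 * n) + sum(
        (cdf(x) - (2 * i - 1) / (2 * n)) ** 2 for i, x in enumerate(xs, start=1)
    )

def ad_objective(data, cdf):
    """-n - (1/n) * sum of (2i-1) * [log F(x_(i)) + log(1 - F(x_(n+1-i)))]."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i in range(1, n + 1):
        lower = cdf(xs[i - 1])   # F at the i-th order statistic
        upper = cdf(xs[n - i])   # F at the (n+1-i)-th order statistic
        total += (2 * i - 1) * (math.log(lower) + math.log(1.0 - upper))
    return -n - total / n

if __name__ == "__main__":
    uniform_cdf = lambda x: min(max(x, 0.0), 1.0)  # stand-in cdf for illustration
    sample = [0.125, 0.375, 0.625, 0.875]
    print(cvm_objective(sample, uniform_cdf), ad_objective(sample, uniform_cdf))
```

The Anderson-Darling objective is undefined when the fitted cdf equals 0 or 1 at an observation, so a practical implementation would guard against parameter values that push observations outside the fitted support.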

Simulation Study
In this subsection, we present a simulation study to show the practical application of the estimation techniques explored. By integrating Monte Carlo simulations with various estimation strategies, we estimate the parameters of the Lomax-exponentiated odds ratio-exponential distribution under predefined settings, such as α = 1.5, β = 2.9, γ = 1.3, and k = 0.8. Random samples of size n = 50, 100, 250, 500, and 1000 were generated, with each experiment replicated N = 500 times to ensure robustness. Subsequently, both the bias and mean squared error (MSE) were computed for each dataset to evaluate the estimators' performance. The outcomes, presented in Table 1 and Figure 13, reveal that the MSE tends toward zero as the sample size increases, affirming the reliability and stability of the estimations across different scenarios. This empirical analysis not only validates the effectiveness of the estimation methods but also demonstrates their applicability in real-world data analysis, thereby enhancing the L-EOR-G distribution's utility in statistical modeling.
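The structure of such a Monte Carlo experiment can be sketched in a simplified setting. The code below assumes the L-EOR-E cdf F(x) = 1 − [1 + α (e^(γx) − 1)^β]^(−k), draws samples by inverse transform, and estimates only k while holding α, β, and γ at their true values. Under that assumption the score equation for k has the closed-form solution k̂ = n / Σ log[1 + α (e^(γx_i) − 1)^β]. This is a deliberate simplification for illustration: the study described above estimates all parameters jointly, and the function names here are hypothetical.

```python
import math
import random

def sample_leore(n, alpha, beta, k, gamma, rng):
    """Inverse-transform sampling from the assumed L-EOR-E cdf."""
    out = []
    for _ in range(n):
        u = rng.random()
        lomax_part = math.expm1(-math.log1p(-u) / k)  # (1-u)**(-1/k) - 1
        odds = (lomax_part / alpha) ** (1.0 / beta)
        out.append(math.log1p(odds) / gamma)
    return out

def k_hat(data, alpha, beta, gamma):
    """Closed-form MLE of k when alpha, beta and gamma are treated as known."""
    s = sum(math.log1p(alpha * math.expm1(gamma * x) ** beta) for x in data)
    return len(data) / s

def bias_and_mse(n, reps, alpha, beta, k, gamma, seed=0):
    """Monte Carlo bias and mean squared error of k_hat at sample size n."""
    rng = random.Random(seed)
    estimates = [
        k_hat(sample_leore(n, alpha, beta, k, gamma, rng), alpha, beta, gamma)
        for _ in range(reps)
    ]
    bias = sum(estimates) / reps - k
    mse = sum((e - k) ** 2 for e in estimates) / reps
    return bias, mse

if __name__ == "__main__":
    for n in (50, 100, 250, 500, 1000):
        print(n, bias_and_mse(n, 500, alpha=1.5, beta=2.9, k=0.8, gamma=1.3))
```

Extending the sketch to the full parameter vector would replace `k_hat` with a numerical optimizer applied to the chosen criterion (likelihood, LS, WLS, MPS, CVM, or AD), leaving the bias and MSE bookkeeping unchanged.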

Figure 13. MSE of the parameters in Table 1 (panel title: MSE of k in Predictions).
Our comparative analysis employs a comprehensive set of metrics for evaluating model performance, namely the −2 log-likelihood, Cramér-von Mises ($W^*$), Anderson-Darling ($A^*$), Akaike information criterion (AIC), Bayesian information criterion (BIC), consistent Akaike information criterion (CAIC), Hannan-Quinn information criterion (HQIC), and the Kolmogorov-Smirnov (K-S) statistic with its p-value, to determine the most suitable model for capturing the intricacies of the observed data.
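The information criteria listed above are simple functions of the maximized log-likelihood, the number of parameters p, and the sample size n. The sketch below uses the standard definitions of AIC, BIC, and HQIC. For CAIC it assumes the form −2ℓ + 2pn/(n − p − 1), which is common in the distribution-fitting literature; the exact definition used for the tables is not stated in this section, so that line should be read as an assumption. The function name and the example numbers are illustrative.

```python
import math

def information_criteria(loglik, n_params, n_obs):
    """Return AIC, BIC, CAIC and HQIC for a fitted model (smaller is better)."""
    p, n = n_params, n_obs
    minus2ll = -2.0 * loglik
    return {
        "-2logL": minus2ll,
        "AIC": minus2ll + 2.0 * p,
        "BIC": minus2ll + p * math.log(n),
        # assumed definition of CAIC; other conventions exist
        "CAIC": minus2ll + 2.0 * p * n / (n - p - 1),
        "HQIC": minus2ll + 2.0 * p * math.log(math.log(n)),
    }

if __name__ == "__main__":
    # made-up log-likelihood for a four-parameter model and 66 observations
    for name, value in information_criteria(-100.0, 4, 66).items():
        print(name, round(value, 4))
```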

Analysis of Carbon Fiber Strength Data
This section examines the breaking strength data of carbon fibers, each with a length of 50 mm, based on a study with a sample size of n = 66, as detailed by Nichols et al. [26]. We present a detailed comparison of parameter estimates and goodness-of-fit statistics in Table 2. Furthermore, Figure 14 displays the empirical distribution of the observed data alongside the estimated density functions of several fitted models. Among these, the L-EOR-E distribution provides the best fit for this dataset, as evidenced by its goodness-of-fit statistics and the largest K-S test p-value in Table 2. In addition, Figure 15 presents a collection of diagnostic plots, including the Kaplan-Meier (K-M) survival curve, comparisons of the theoretical and empirical cumulative distribution functions (TCDF and ECDF), and a scaled total time on test (TTT) plot. The agreement between the theoretical curves and the empirical observations supports the suitability of the L-EOR-E distribution for modeling the dataset, particularly its capacity to capture non-monotonic hazard rate behaviors.

Survival Analysis of Guinea Pigs
This dataset encompasses survival durations of guinea pigs following injections with tubercle bacilli, explored in the study by Kundu et al. [27]. Parameter estimates along with goodness-of-fit metrics are detailed in Table 3. Figure 16 shows the histogram of actual survival times against the density curves of several fitted models. The L-EOR-E distribution outperforms its counterparts, as evidenced by its superior goodness-of-fit indicators and the largest K-S test p-value.
Additionally, a set of illustrative plots, including the K-M survival curve, both TCDF and ECDF, and the TTT plot, is presented in Figure 17. The close alignment between empirical data and theoretical predictions reinforces the L-EOR-E model's accuracy in fitting the data. The TTT plot, in particular, suggests the model's proficiency in describing a monotonic hazard rate structure, further confirming its applicability.

Analysis of Chemotherapy Treatment Data
This section examines the survival times (in years) of patients undergoing chemotherapy as part of a study reported by Bekker et al. [28]. Parameter estimates and goodness-of-fit assessments are tabulated in Table 4. Additionally, Figure 18 offers a visual comparison between the empirical distribution and fitted density functions, alongside the expected probabilities. According to the comparative analysis in Table 4, the L-EOR-E distribution emerges as the most accurate model, with the lowest values of the goodness-of-fit statistics and the largest p-value in the K-S test. Figure 19 further provides a suite of plots, including the Kaplan-Meier survival curve, TCDF and ECDF, and a TTT plot. The match between empirical and theoretical observations indicates the efficacy of the proposed model in capturing the dataset's characteristics. Notably, the scaled TTT plot reveals the model's capability to accurately represent a non-monotonic hazard rate structure, affirming its suitability for complex survival data analysis.

Conclusions
The introduction of the Lomax-exponentiated odds ratio-G distribution in this study marks a new advancement in the field of statistical analysis. This work, aimed at transcending the limitations of conventional statistical distributions in capturing the complex nature of modern datasets, rigorously explores the L-EOR-G distribution's theoretical properties, parameter estimation methodologies, and empirical applicability. Our analysis demonstrates the distribution's flexibility and efficiency in modeling a wide spectrum of data behaviors, setting it apart from many existing models.
The empirical analysis demonstrates the L-EOR-G distribution's superiority in fitting a diverse array of data distributions more adeptly than traditional counterparts.This is substantiated by consistently superior goodness-of-fit measures and K-S test p-values across various datasets, illustrating not only the model's exceptional fitting prowess but also its potential to significantly enhance scientific research and informed decision-making processes.In summation, the introduction of the L-EOR-G distribution signifies a leap forward in bridging the divide between theoretical sophistication and practical utility in statistical modeling.

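Moment-Generating Functions
Lemma 4. The moment-generating function (MGF) of the L-EOR-G distribution is provided in terms of the MGF of the exponentiated-G distribution:
$$M_X(t) = \sum_{j=0}^{\infty}\sum_{m=0}^{\infty} c_{j,m}\, M_{m+\beta(j+1)}(t),$$
where $M_{m+\beta(j+1)}(t)$ is the MGF of the exponentiated-G distribution with parameter m + β(j + 1), and $c_{j,m}$ is as defined in Equation (6).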

Figure 1. The pdf (left) and hrf (right) plots for the L-EOR-U distribution with various parameters.

Figure 2. Skewness and kurtosis plots for the L-EOR-U distribution. The parameters α and β are varied, while keeping k = 1 and θ = 1.

Figure 3. Skewness and kurtosis plots for the L-EOR-U distribution. The parameters β and θ are varied, while keeping α = 1 and k = 2.

Figure 4. The pdf (left) and hrf (right) plots for the L-EOR-E distribution with various parameters.

Figure 5. Skewness and kurtosis plots for the L-EOR-E distribution. The parameters α and β are varied, while keeping k = 1 and γ = 1.

Figure 6. Skewness and kurtosis plots for the L-EOR-E distribution. The parameters k and γ are varied, while keeping α = 1 and β = 1.5.

Figure 7. The pdf (left) and hrf (right) plots for the L-EOR-W distribution with various parameters.

Figure 10. The pdf (left) and hrf (right) plots for the L-EOR-K distribution with various parameters.

Figure 11. Skewness and kurtosis plots for the L-EOR-K distribution. The parameters α and β are varied, while keeping k = 1, a = 1, and b = 1.

Figure 12. Skewness and kurtosis plots for the L-EOR-K distribution. The parameters β and a are varied, while keeping α = 1, k = 1, and b = 1.

Figure 14. (left) Fitted density superposed on the histogram and observed probability for the carbon fiber data. (right) Expected probability plots for the carbon fiber data.

Figure 16. (left) Fitted density superposed on the histogram and observed probability for the guinea pig data. (right) Expected probability plots for the guinea pig data.

Figure 18. (left) Fitted density superposed on the histogram and observed probability for the chemotherapy data. (right) Expected probability plots for the chemotherapy data.

Table 2. MLEs and goodness-of-fit statistics for the carbon fiber data.

Table 3. MLEs and goodness-of-fit statistics for the guinea pig data.

Table 4. MLEs and goodness-of-fit statistics for the chemotherapy data.