Article

A Natural Generalization of the XLindley Distribution and Its First-Order Autoregressive Process with Applications to Non-Gaussian Time Series

by Emrah Altun 1,*, Soheyla A. Ghomeishi 1 and Hana N. Alqifari 2,*
1 Department of Statistics, Gazi University, Ankara 06560, Turkey
2 Department of Statistics and Operations Research, College of Science, Qassim University, Buraydah 52571, Saudi Arabia
* Authors to whom correspondence should be addressed.
Axioms 2026, 15(2), 107; https://doi.org/10.3390/axioms15020107
Submission received: 2 December 2025 / Revised: 23 January 2026 / Accepted: 29 January 2026 / Published: 31 January 2026
(This article belongs to the Special Issue Advances in the Theory and Applications of Statistical Distributions)

Abstract

The natural generalization of the XLindley distribution is proposed. The mathematical properties of the generalized XLindley distribution are derived. The importance of the proposed model is evaluated on the first-order autoregressive process, and compared with its counterparts. Extensive simulation studies are carried out to demonstrate the suitability of the estimation methods. Empirical findings reveal that the first-order autoregressive process with generalized XLindley innovations produces better forecasting results than those of the gamma, weighted Lindley, and normal innovations. Additionally, a web-tool application of the proposed model is developed and deployed on a free server that is accessible for practitioners.

1. Introduction

Autoregressive (AR) models are fundamental tools in time series analysis, widely used for modeling and forecasting data over a specified time period. The classical AR(1) process assumes Gaussian innovations due to its mathematical simplicity and tractability. However, in many real-world applications such as finance, hydrology, and energy studies, data exhibit asymmetry, heavy tails, or positivity constraints that violate the normality assumption. Therefore, researchers have developed autoregressive processes with non-Gaussian innovations to better capture such empirical behaviors.
One of the earliest non-Gaussian extensions of the AR(1) process was proposed by Andel [1], who introduced an AR(1) model with exponential innovations. The work of [1] opened a way for further exploration of AR processes with skewed and heavy-tailed innovations such as gamma innovation [2], skew-normal (SN) innovation [3], generalized hyperbolic (GH) innovations [4], and gamma-Lindley innovations [5]. Recently, increasing focus has been given to Lindley-based innovations, motivated by their positive support and analytical tractability. Ref. [6] developed an AR(1) process with Lindley innovations, calling the process LER(1). Ref. [7] introduced an AR(1) process with Lindley marginals and called the process LAR(1) and applied the Gaussian estimation method. The generalization of the LER(1) process, the AR(1)-weighted Lindley process, was introduced by [8]. Ref. [9] used the scale mixtures of SN distributions as an innovation process and carried out the parameter estimation using the Bayesian approach. The AR(1) process with epsilon-SN innovations was introduced by [10]. Similarly to [11], ref. [12] used the scale mixtures of the normal innovations, and parameter estimation was carried out with the expectation-maximization (EM) algorithm. The missing value estimation in the AR(1) process with exponential innovations was investigated by [13].
Ref. [11] introduced the XLindley (XL) distribution as an extended and more flexible form of the classical Lindley distribution (see [14] for more details on the Lindley distribution). The XL distribution is a mixture of exponential and Lindley distributions with θ / ( θ + 1 ) mixing proportion. Ref. [11] compared the XL with exponential, xgamma [15], and Lindley distributions and stated that the XL distribution demonstrates better results than its competitive models according to the model selection criteria and goodness-of-fit statistics. However, the flexibility of the XL distribution is limited since it has only one parameter that is a scale parameter. To gain flexibility, we propose a natural generalization of the XL distribution by adding a shape parameter. The proposed distribution is called generalized XL (GXL). The GXL distribution contains the XL distribution as its sub-model. The statistical properties of the GXL distribution are all obtained in explicit forms. Therefore, the GXL is a tractable distribution and can be adapted to many models, such as regression, time series and survival.
Thanks to the tractable properties of the GXL, the AR(1) process with GXL innovations (AR(1)-GXL) is introduced. The parameters of the AR(1)-GXL process are estimated using three different approaches. The theoretical properties of the AR(1)-GXL process are derived. Two data sets are analyzed with the AR(1)-GXL process and compared with three different innovation distributions: gamma, weighted Lindley (WL) and normal. The ARGXL, the web application of the proposed model, is developed using the shiny package of the R software (version 4.3.1). The ARGXL is accessible via https://gazistat.shinyapps.io/ARGXL (accessed on 2 December 2025). The results presented in this study can be reproduced using the ARGXL web application. Furthermore, researchers can upload their own data sets to obtain parameter estimates for the AR(1)-GXL process, perform residual analyses, and obtain the model fit measures.
Although several non-Gaussian autoregressive models based on Lindley-type distributions have been proposed in the literature, including the Lindley and weighted Lindley AR processes, the proposed AR(1)-GXL model introduces several fundamental novelties. First, unlike the classical Lindley distribution, which is governed by a single scale parameter, the GXL distribution incorporates an additional shape parameter, significantly enhancing its flexibility in modeling skewness and tail behavior. Second, in contrast to the weighted Lindley-based AR model, the proposed innovation distribution provides the ability to model higher skewness and heavy-tailed structures in time series. These features distinguish the proposed model as a flexible and practical alternative to existing Lindley and weighted Lindley autoregressive models.
The remaining parts of the study are outlined as follows: Section 2 discusses the GXL distribution and its properties. The AR(1)-GXL process is introduced in Section 3. Section 4 is devoted to the parameter estimation methods for the AR(1)-GXL process. Two applications are given in Section 5. The web application, ARGXL, is introduced in Section 6. The important outcomes of the presented study are summarized in Section 7.

2. Generalized XLindley Distribution

The probability density function (pdf) of the XL distribution is
$$f(x;\theta) = \frac{\theta^{2}\,(2+\theta+x)\exp(-\theta x)}{(1+\theta)^{2}}, \quad x > 0, \tag{1}$$
where $\theta > 0$. The XL distribution is a mixture distribution of two independent random variables (rvs), $X_1 \sim \mathrm{Exp}(\theta)$ and $X_2 \sim \mathrm{Lindley}(\theta)$, with mixing proportion $p = \theta/(\theta+1)$. Consider the function
$$f(x;\alpha,\theta) \propto (2+\theta+x^{\alpha})\exp(-\theta x), \tag{2}$$
where $\alpha, \theta > 0$. The corresponding pdf of (2) is obtained by determining the appropriate normalizing constant. A natural generalization of the XL distribution, say GXL, is proposed with the following proposition.
Proposition 1.
The pdf of the GXL distribution is
$$f(x;\alpha,\theta) = C(\alpha,\theta)\,(2+\theta+x^{\alpha})\exp(-\theta x), \tag{3}$$
where
$$C(\alpha,\theta) = \frac{\theta^{\alpha+1}}{\theta^{\alpha}(\theta+2)+\Gamma(\alpha+1)} \tag{4}$$
is the normalization constant. The resulting density is referred to as the GXL distribution.
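Proof. 
The GXL distribution is defined as
$$f(x;\alpha,\theta) = C(\alpha,\theta)\,(2+\theta+x^{\alpha})\exp(-\theta x), \quad x > 0, \tag{5}$$
where $\alpha$ is the shape parameter and $C(\alpha,\theta)$ is a normalizing constant that is derived as follows:
$$C^{-1}(\alpha,\theta) = \int_{0}^{\infty}(2+\theta+x^{\alpha})\exp(-\theta x)\,dx. \tag{6}$$
Dividing this integral into two parts, we have
$$\int_{0}^{\infty}(2+\theta+x^{\alpha})\exp(-\theta x)\,dx = (2+\theta)\int_{0}^{\infty}\exp(-\theta x)\,dx + \int_{0}^{\infty}x^{\alpha}\exp(-\theta x)\,dx. \tag{7}$$
The first part of the integration is easy to derive. For the second part, we apply the partial integration setting $u = x^{\alpha}$ and use the gamma function properties,
$$\int_{0}^{\infty}\exp(-\theta x)\,dx = \frac{1}{\theta}, \tag{8}$$
$$\int_{0}^{\infty}x^{\alpha}\exp(-\theta x)\,dx = \frac{\Gamma(\alpha+1)}{\theta^{\alpha+1}}. \tag{9}$$
So, the result of the integration in Equation (6) is
$$\int_{0}^{\infty}(2+\theta+x^{\alpha})\exp(-\theta x)\,dx = \frac{2+\theta}{\theta} + \frac{\Gamma(\alpha+1)}{\theta^{\alpha+1}}. \tag{10}$$
The normalizing constant therefore satisfies
$$C^{-1}(\alpha,\theta) = \frac{\theta^{\alpha}(\theta+2)+\Gamma(\alpha+1)}{\theta^{\alpha+1}}. \tag{11}$$
Inserting Equation (11) into Equation (5), we have
$$f(x;\alpha,\theta) = \frac{\theta^{\alpha+1}\,(2+\theta+x^{\alpha})\exp(-\theta x)}{\theta^{\alpha}(\theta+2)+\Gamma(\alpha+1)}. \tag{12}$$
The proof is completed. □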
The proposed extension of the XL distribution preserves analytical tractability, includes the XLindley distribution as a special case when $\alpha = 1$, and allows for greater control over skewness and tail behavior. The cumulative distribution function (cdf) of GXL is
$$F(x) = \frac{\theta(\theta+2)\left[1-\exp(-\theta x)\right] + \theta^{1-\alpha}\,\gamma(\alpha+1,\theta x)}{\theta^{2}+2\theta+\Gamma(\alpha+1)\,\theta^{1-\alpha}}, \tag{13}$$
where $\gamma(\cdot,\cdot)$ and $\Gamma(\cdot)$ are the lower incomplete and complete gamma functions, respectively. The shapes of the GXL are displayed in Figure 1. It is observed that the GXL density has right-skewed and almost symmetric shapes.
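The density and its normalizing constant can be checked numerically. The following is a minimal Python sketch, not the authors' R implementation; the function names, the Simpson integrator, and the parameter values are illustrative assumptions. It verifies that the GXL density integrates to one and coincides with the XL density when $\alpha = 1$.

```python
import math

def gxl_norm_const(alpha, theta):
    """Normalizing constant C(alpha, theta) of the GXL density."""
    return theta ** (alpha + 1) / (theta ** alpha * (theta + 2) + math.gamma(alpha + 1))

def gxl_pdf(x, alpha, theta):
    """GXL density; returns 0 for negative x."""
    if x < 0:
        return 0.0
    return gxl_norm_const(alpha, theta) * (2 + theta + x ** alpha) * math.exp(-theta * x)

def xl_pdf(x, theta):
    """XLindley density, the alpha = 1 special case."""
    return theta ** 2 * (2 + theta + x) * math.exp(-theta * x) / (1 + theta) ** 2

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Hypothetical parameter values, chosen only for illustration.
alpha, theta = 2.5, 1.5
# The upper limit 60 truncates a numerically negligible tail for these values.
area = simpson(lambda x: gxl_pdf(x, alpha, theta), 0.0, 60.0)
print(f"integral of the GXL pdf: {area:.8f}")
```

The cdf can be checked in the same way by integrating the density up to a given point and comparing with the closed-form expression, which requires an implementation of the lower incomplete gamma function.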

2.1. Moments and Related Measures

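Proposition 2.
The kth raw moment of the GXL distribution is
$$E(X^{k}) = \frac{1}{\theta^{k}}\,\frac{(\theta+2)\theta^{\alpha}\,\Gamma(k+1)+\Gamma(k+\alpha+1)}{(\theta+2)\theta^{\alpha}+\Gamma(\alpha+1)}. \tag{14}$$
Proof. 
Let $D = (\theta+2)\theta^{\alpha}+\Gamma(\alpha+1)$. Then, we have
$$f(x) = \frac{\theta^{\alpha+1}\exp(-\theta x)\,(2+\theta+x^{\alpha})}{D}. \tag{15}$$
The kth raw moment of the rv X is
$$E(X^{k}) = \frac{\theta^{\alpha+1}}{D}\int_{0}^{\infty}x^{k}\exp(-\theta x)\,(2+\theta+x^{\alpha})\,dx. \tag{16}$$
The integration in Equation (16) can be divided in two parts, as follows:
$$\int_{0}^{\infty}x^{k}\exp(-\theta x)\,(2+\theta+x^{\alpha})\,dx = (\theta+2)\underbrace{\int_{0}^{\infty}x^{k}\exp(-\theta x)\,dx}_{\mathrm{I}} + \underbrace{\int_{0}^{\infty}x^{k+\alpha}\exp(-\theta x)\,dx}_{\mathrm{II}}. \tag{17}$$
From the gamma integration, it is known that
$$\frac{\Gamma(k)}{\theta^{k}} = \int_{0}^{\infty}x^{k-1}\exp(-x\theta)\,dx. \tag{18}$$
So, the integration in Equation (17) is
$$\int_{0}^{\infty}x^{k}\exp(-\theta x)\,(2+\theta+x^{\alpha})\,dx = (\theta+2)\frac{\Gamma(k+1)}{\theta^{k+1}} + \frac{\Gamma(k+\alpha+1)}{\theta^{k+\alpha+1}}. \tag{19}$$
Inserting Equation (19) into Equation (16), we have
$$E(X^{k}) = \frac{\theta^{\alpha+1}}{D}\left[(\theta+2)\frac{\Gamma(k+1)}{\theta^{k+1}} + \frac{\Gamma(k+\alpha+1)}{\theta^{k+\alpha+1}}\right]. \tag{20}$$
Replacing $D = (\theta+2)\theta^{\alpha}+\Gamma(\alpha+1)$ in Equation (20), the kth raw moment is
$$E(X^{k}) = \frac{1}{\theta^{k}}\,\frac{(\theta+2)\theta^{\alpha}\,\Gamma(k+1)+\Gamma(k+\alpha+1)}{(\theta+2)\theta^{\alpha}+\Gamma(\alpha+1)}. \tag{21}$$
 □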
The mean of the rv X is
$$\mu = E(X) = \frac{(\theta+2)\theta^{\alpha-1}+\Gamma(\alpha+2)\theta^{-1}}{D}. \tag{22}$$
The second raw moment is
$$E(X^{2}) = \frac{2(\theta+2)\theta^{\alpha-2}+\Gamma(\alpha+3)\theta^{-2}}{D}. \tag{23}$$
Simply, the variance of the GXL is $\sigma^{2} = \mathrm{Var}(X) = E(X^{2}) - \mu^{2}$. The third and fourth central moments are required to calculate the skewness and kurtosis measures,
$$\mu_{3} = E[(X-\mu)^{3}] = E[X^{3}] - 3\mu\sigma^{2} - \mu^{3}, \tag{24}$$
$$\mu_{4} = E[(X-\mu)^{4}] = E[X^{4}] - 4\mu E[X^{3}] + 6\mu^{2}E[X^{2}] - 3\mu^{4}, \tag{25}$$
where
$$E(X^{3}) = \frac{6(\theta+2)\theta^{\alpha-3}+\Gamma(\alpha+4)\theta^{-3}}{D}, \tag{26}$$
and
$$E(X^{4}) = \frac{24(\theta+2)\theta^{\alpha-4}+\Gamma(\alpha+5)\theta^{-4}}{D}. \tag{27}$$
So, the skewness and excess kurtosis can be computed via $\gamma_{1} = \mu_{3}/\sigma^{3}$ and $\gamma_{2} = \mu_{4}/\sigma^{4} - 3$. Figure 2 displays the skewness and kurtosis plots of the GXL. When $\alpha$ is constant, the skewness and kurtosis increase as $\theta$ increases. When $\theta$ is constant, the skewness and kurtosis decrease as $\alpha$ increases.
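The moment formulas translate directly into code. The sketch below is an illustrative Python version (the function names are hypothetical and not part of the authors' software); it evaluates the kth raw moment of Proposition 2 and combines the first four moments into the mean, variance, skewness, and excess kurtosis.

```python
import math

def gxl_raw_moment(k, alpha, theta):
    """k-th raw moment E(X^k) of the GXL distribution (Proposition 2)."""
    d = (theta + 2) * theta ** alpha + math.gamma(alpha + 1)
    num = (theta + 2) * theta ** alpha * math.gamma(k + 1) + math.gamma(k + alpha + 1)
    return num / (theta ** k * d)

def gxl_summary(alpha, theta):
    """Return (mean, variance, skewness, excess kurtosis) of the GXL distribution."""
    m1, m2, m3, m4 = (gxl_raw_moment(k, alpha, theta) for k in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * var - m1 ** 3                       # third central moment
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4  # fourth central moment
    return m1, var, mu3 / var ** 1.5, mu4 / var ** 2 - 3

# alpha = 1 recovers the XL distribution; theta = 1 is an arbitrary choice.
mean, var, skew, exkurt = gxl_summary(alpha=1.0, theta=1.0)
print(mean, var, skew, exkurt)
```

For $\alpha = 1$ and $\theta = 1$ we have $D = 4$, so the mean is $5/4$ and the variance is $3 - 25/16 = 23/16$, which gives a quick hand check of the implementation.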
The skewness and kurtosis parameters play an important role in capturing empirical features of non-Gaussian data. The higher skewness reflects stronger asymmetry, which is often associated with accumulation effects or positive shocks in financial time series. Similarly, higher kurtosis indicates heavier tails and higher possibility of extreme observations, observed in volatile markets. The ability of the GXL distribution to control both skewness and kurtosis through its shape and scale parameters allows practitioners to model asymmetric behavior and tail risk more accurately than classical models.
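Proposition 3.
The moment generating function (mgf) of the GXL is
$$M_{X}(t) = \frac{\theta^{\alpha+1}}{D}\left[\frac{\theta+2}{\theta-t} + \frac{\Gamma(\alpha+1)}{(\theta-t)^{\alpha+1}}\right], \quad \theta > t. \tag{28}$$
Proof. 
The mgf of X is
$$M_{X}(t) = E[\exp(tX)] = \int_{0}^{\infty}\exp(tx)\,f(x)\,dx. \tag{29}$$
Substituting the pdf of X in Equation (29), we have
$$M_{X}(t) = \frac{\theta^{\alpha+1}}{D}\int_{0}^{\infty}\exp\left(-(\theta-t)x\right)(\theta+x^{\alpha}+2)\,dx. \tag{30}$$
Divide the integration into two parts,
$$\int_{0}^{\infty}\exp\left(-(\theta-t)x\right)(\theta+x^{\alpha}+2)\,dx = (\theta+2)\int_{0}^{\infty}\exp\left(-(\theta-t)x\right)dx + \int_{0}^{\infty}x^{\alpha}\exp\left(-(\theta-t)x\right)dx. \tag{31}$$
As in Proposition 1, we use the gamma integration, as follows:
$$\int_{0}^{\infty}\exp(-ux)\,dx = \frac{1}{u}, \tag{32}$$
$$\int_{0}^{\infty}x^{\alpha}\exp(-ux)\,dx = \frac{\Gamma(\alpha+1)}{u^{\alpha+1}}, \tag{33}$$
where $u = \theta - t$. Finally, we have
$$M_{X}(t) = \frac{\theta^{\alpha+1}}{D}\left[\frac{\theta+2}{\theta-t} + \frac{\Gamma(\alpha+1)}{(\theta-t)^{\alpha+1}}\right]. \tag{34}$$
 □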
Proposition 4.
The Laplace transformation (LT) of the GXL is
$$\Phi(s) = \frac{\theta^{\alpha+1}}{D}\left[\frac{\theta+2}{\theta+s} + \frac{\Gamma(\alpha+1)}{(\theta+s)^{\alpha+1}}\right], \quad s \ge 0. \tag{35}$$
Proof. 
The result follows immediately from the relation between the LT and mgf, $\Phi(s) = M_{X}(-s)$. □

2.2. Mixture Representation and Data Generation

Proposition 5.
The GXL is a mixture distribution of exponential and gamma distributions.
Proof. 
Let $D = (\theta+2)\theta^{\alpha}+\Gamma(\alpha+1)$. Then, the pdf of the GXL is
$$f(x;\alpha,\theta) = \frac{\theta^{\alpha}(2+\theta)}{D}\,\theta\exp(-\theta x) + \frac{\Gamma(\alpha+1)}{D}\,\frac{\theta^{\alpha+1}x^{\alpha}\exp(-\theta x)}{\Gamma(\alpha+1)}.$$
Hence, the pdf can be written as a mixture density of the exponential and gamma distributions, as follows:
$$f(x;\alpha,\theta) = p(\alpha,\theta)\,f_{\mathrm{Exp}}(x;\theta) + \left(1-p(\alpha,\theta)\right)f_{\Gamma}(x;\alpha+1,\theta),$$
where
$$f_{\mathrm{Exp}}(x;\theta) = \theta\exp(-\theta x), \qquad f_{\Gamma}(x;\alpha+1,\theta) = \frac{\theta^{\alpha+1}x^{\alpha}\exp(-\theta x)}{\Gamma(\alpha+1)},$$
and the weight function is defined by
$$p(\alpha,\theta) = \frac{\theta^{\alpha}(2+\theta)}{(\theta+2)\theta^{\alpha}+\Gamma(\alpha+1)}.$$
 □
Algorithm 1 is defined to generate random variables from the GXL distribution using the mixture representation.
Algorithm 1 A new algorithm for generating random variables
1: Set the parameters $\alpha$ and $\theta$.
2: Compute the weight $p = \dfrac{(\theta+2)\theta^{\alpha}}{(\theta+2)\theta^{\alpha}+\Gamma(\alpha+1)}$.
3: Generate $U \sim \mathrm{Uniform}(0,1)$.
4: If $U \le p$, generate $X \sim \mathrm{Exp}(\theta)$; otherwise, generate $X \sim \mathrm{Gamma}(\alpha+1,\theta)$.
5: Return to Step 3.
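The steps of Algorithm 1 can be sketched in Python as follows. This is an illustrative version rather than the authors' R code: the function names and the seed are assumptions, and $\theta$ is treated as a rate parameter in both components, as in the mixture representation of Proposition 5. Note that `random.gammavariate` takes a shape and a scale, so the scale is set to $1/\theta$.

```python
import math
import random

def gxl_weight(alpha, theta):
    """Mixing weight p(alpha, theta) of the exponential component."""
    a = (theta + 2) * theta ** alpha
    return a / (a + math.gamma(alpha + 1))

def rgxl(n, alpha, theta, rng=random):
    """Draw n GXL variates through the exponential/gamma mixture."""
    p = gxl_weight(alpha, theta)
    draws = []
    for _ in range(n):
        if rng.random() <= p:
            draws.append(rng.expovariate(theta))                 # Exp with rate theta
        else:
            draws.append(rng.gammavariate(alpha + 1, 1 / theta))  # shape alpha+1, scale 1/theta
    return draws

rng = random.Random(2026)  # arbitrary seed, used only for reproducibility
alpha, theta = 2.0, 1.0
sample = rgxl(50_000, alpha, theta, rng)
sample_mean = sum(sample) / len(sample)

# Theoretical mean from the first raw moment, for comparison.
d = (theta + 2) * theta ** alpha + math.gamma(alpha + 1)
theory_mean = ((theta + 2) * theta ** alpha + math.gamma(alpha + 2)) / (theta * d)
print(sample_mean, theory_mean)
```

For $\alpha = 2$ and $\theta = 1$ the theoretical mean is $9/5$, so the sample mean of a large simulated sample should be close to 1.8.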

2.3. Interpretation of the Additional Parameter

The motivation of the proposed generalization is not only to increase flexibility but also to introduce an additional parameter that admits a structural and interpretable role within the innovation process. The proposed GXL distribution arises from a mixture representation as a convex combination of an exponential and a gamma distribution. Under this representation, the parameter α does not behave as a shape parameter but controls the relative dominance of the gamma component over the exponential component. From a time series perspective, the parameter α regulates the magnitude of positive shocks in the innovation process. Small values of α correspond to frequent and small memoryless shocks, while larger values of α reflect rare but more persistent innovations.

3. AR(1)-GXL Process

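A first-order autoregressive process, AR(1) for short, is
$$X_{t} = \rho X_{t-1} + \varepsilon_{t}, \tag{36}$$
where $0 \le \rho < 1$ and $\varepsilon_{t}$ is the innovation term. We assume that the process $X_{t}$ is a stationary process with GXL innovations and $\mathrm{cov}(\varepsilon_{t}, X_{t-r}) = 0$ for $r \ge 1$ [7]. In the remaining part of the section, we follow the results of [5,8]. The AR(1) process can be defined using the innovation sequence, as follows:
$$\begin{aligned} X_{t} &= \rho X_{t-1} + \varepsilon_{t} \\ &= \rho\left(\rho X_{t-2} + \varepsilon_{t-1}\right) + \varepsilon_{t} \\ &= \rho\left(\rho\left(\rho X_{t-3} + \varepsilon_{t-2}\right) + \varepsilon_{t-1}\right) + \varepsilon_{t} \\ &\;\;\vdots \\ &= \rho^{t}X_{0} + \sum_{r=0}^{t-1}\rho^{r}\varepsilon_{t-r}, \\ \lim_{t\to\infty}\left(\rho^{t}X_{0} + \sum_{r=0}^{t-1}\rho^{r}\varepsilon_{t-r}\right) &= \sum_{r=0}^{\infty}\rho^{r}\varepsilon_{t-r}. \end{aligned} \tag{37}$$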
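Using Equations (37) and (35), the LT of the process $X_{t}$ is
$$\Phi_{X_{t}}(s) = \prod_{r=0}^{\infty}\Phi_{\varepsilon}(s\rho^{r}) = \prod_{r=0}^{\infty}\frac{\theta^{\alpha+1}}{D}\left[\frac{\theta+2}{\theta+s\rho^{r}} + \frac{\Gamma(\alpha+1)}{(\theta+s\rho^{r})^{\alpha+1}}\right]. \tag{38}$$
It is not possible to obtain the distribution of $X_{t}$ from Equation (38). However, the properties of the AR(1) process under the GXL innovations can be obtained using the properties of the GXL distribution. Since $X_{t}$ is a stationary process, $E(X_{t}) = E(\rho X_{t-1} + \varepsilon_{t}) = \rho E(X_{t-1}) + E(\varepsilon_{t}) = \rho E(X_{t}) + E(\varepsilon_{t})$, so the mean of the process is
$$E(X_{t}) = \frac{1}{1-\rho}E(\varepsilon_{t}) = \frac{1}{1-\rho}\,\frac{(\theta+2)\theta^{\alpha-1}+\Gamma(\alpha+2)\theta^{-1}}{D}. \tag{39}$$
The variance of $X_{t}$ is
$$\mathrm{Var}(X_{t}) = \mathrm{Var}(\rho X_{t-1} + \varepsilon_{t}) = \rho^{2}\,\mathrm{Var}(X_{t-1}) + 2\rho\,\mathrm{cov}(X_{t-1},\varepsilon_{t}) + \mathrm{Var}(\varepsilon_{t}). \tag{40}$$
Since $X_{t-1}$ and $\varepsilon_{t}$ are independent, $\mathrm{cov}(X_{t-1},\varepsilon_{t}) = 0$. Using $\mathrm{Var}(\varepsilon_{t}) = E(\varepsilon_{t}^{2}) - [E(\varepsilon_{t})]^{2}$ with Equations (22) and (23), we have
$$\mathrm{Var}(X_{t}) = \rho^{2}\,\mathrm{Var}(X_{t-1}) + \mathrm{Var}(\varepsilon_{t}) = \frac{1}{1-\rho^{2}}\mathrm{Var}(\varepsilon_{t}) = \frac{1}{1-\rho^{2}}\,\frac{D\left[2(\theta+2)\theta^{\alpha-2}+\Gamma(\alpha+3)\theta^{-2}\right] - \left[(\theta+2)\theta^{\alpha-1}+\Gamma(\alpha+2)\theta^{-1}\right]^{2}}{D^{2}}. \tag{41}$$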
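The stationary mean and variance can be compared with a simulated path. The following Python sketch is illustrative only: the helper names, the burn-in length, the starting value, and the seed are assumptions. It simulates an AR(1) process with GXL innovations and reports the empirical and theoretical moments side by side.

```python
import math
import random

def gxl_moment(k, alpha, theta):
    """k-th raw moment of the GXL distribution."""
    d = (theta + 2) * theta ** alpha + math.gamma(alpha + 1)
    return ((theta + 2) * theta ** alpha * math.gamma(k + 1)
            + math.gamma(k + alpha + 1)) / (theta ** k * d)

def rgxl(alpha, theta, rng):
    """One GXL draw from the exponential/gamma mixture."""
    a = (theta + 2) * theta ** alpha
    p = a / (a + math.gamma(alpha + 1))
    if rng.random() <= p:
        return rng.expovariate(theta)
    return rng.gammavariate(alpha + 1, 1 / theta)

def simulate_ar1_gxl(n, rho, alpha, theta, rng, burn_in=500):
    """Simulate n observations of X_t = rho * X_{t-1} + eps_t after a burn-in."""
    x = gxl_moment(1, alpha, theta) / (1 - rho)  # start at the stationary mean (assumption)
    path = []
    for t in range(n + burn_in):
        x = rho * x + rgxl(alpha, theta, rng)
        if t >= burn_in:
            path.append(x)
    return path

rho, alpha, theta = 0.8, 2.0, 1.0  # first parameter vector of the simulation study
mu_eps = gxl_moment(1, alpha, theta)
var_eps = gxl_moment(2, alpha, theta) - mu_eps ** 2
stat_mean = mu_eps / (1 - rho)        # stationary mean of the process
stat_var = var_eps / (1 - rho ** 2)   # stationary variance of the process

path = simulate_ar1_gxl(100_000, rho, alpha, theta, random.Random(1))
emp_mean = sum(path) / len(path)
emp_var = sum((v - emp_mean) ** 2 for v in path) / (len(path) - 1)
print(emp_mean, stat_mean, emp_var, stat_var)
```

With these values the innovation mean is 1.8 and the innovation variance is 2.76, so the stationary mean is 9 and the stationary variance is about 7.67.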

3.1. Conditional Mean and Variance

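The one-step conditional mean of the process is
$$E(X_{t+1} \mid X_{t}) = E(\rho X_{t} + \varepsilon_{t+1} \mid X_{t}). \tag{42}$$
Since $\mathrm{cov}(\varepsilon_{t+1}, X_{t}) = 0$, we have
$$E(X_{t+1} \mid X_{t}) = \rho X_{t} + E(\varepsilon_{t+1}) = \rho X_{t} + \frac{(\theta+2)\theta^{\alpha-1}+\Gamma(\alpha+2)\theta^{-1}}{D}. \tag{43}$$
The conditional variance is
$$\mathrm{Var}(X_{t+1} \mid X_{t}) = \mathrm{Var}(\rho X_{t} + \varepsilon_{t+1} \mid X_{t}) = \mathrm{Var}(\varepsilon_{t+1}) = \sigma_{\varepsilon}^{2} = \frac{D\left[2(\theta+2)\theta^{\alpha-2}+\Gamma(\alpha+3)\theta^{-2}\right] - \left[(\theta+2)\theta^{\alpha-1}+\Gamma(\alpha+2)\theta^{-1}\right]^{2}}{D^{2}}, \tag{44}$$
which follows from $\sigma_{\varepsilon}^{2} = E(\varepsilon_{t}^{2}) - [E(\varepsilon_{t})]^{2}$ with Equations (22) and (23).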
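Similarly to Equation (37), the AR(1) can be defined using the innovation sequence, as follows:
$$\begin{aligned} X_{t+1} &= \rho X_{t} + \varepsilon_{t+1}, \\ X_{t+2} &= \rho X_{t+1} + \varepsilon_{t+2} = \rho\left(\rho X_{t} + \varepsilon_{t+1}\right) + \varepsilon_{t+2} = \rho^{2}X_{t} + \rho\,\varepsilon_{t+1} + \varepsilon_{t+2}, \\ &\;\;\vdots \\ X_{t+h} &= \rho^{h}X_{t} + \sum_{j=1}^{h}\rho^{h-j}\varepsilon_{t+j}. \end{aligned} \tag{45}$$
Equation (45) is useful to obtain the h-step conditional mean and variance of the AR(1)-GXL process. The h-step conditional mean is
$$E(X_{t+h} \mid X_{t}) = E\left(\rho^{h}X_{t} + \sum_{j=1}^{h}\rho^{h-j}\varepsilon_{t+j} \,\Big|\, X_{t}\right) = \rho^{h}X_{t} + \sum_{j=1}^{h}\rho^{h-j}E(\varepsilon_{t+j} \mid X_{t}) = \rho^{h}X_{t} + \sum_{j=1}^{h}\rho^{h-j}E(\varepsilon_{t+j}) = \rho^{h}X_{t} + \sum_{j=1}^{h}\rho^{h-j}\mu_{\varepsilon}, \tag{46}$$
where $\mu_{\varepsilon}$ is the mean of the innovation distribution. Using the geometric series and re-indexing the summation with $i = h - j$, we have
$$\sum_{j=1}^{h}\rho^{h-j} = \sum_{i=0}^{h-1}\rho^{i} = \frac{1-\rho^{h}}{1-\rho}. \tag{47}$$
Inserting Equation (47) into Equation (46), we have
$$E(X_{t+h} \mid X_{t}) = \rho^{h}X_{t} + \mu_{\varepsilon}\frac{1-\rho^{h}}{1-\rho} = \rho^{h}X_{t} + \frac{1-\rho^{h}}{1-\rho}\,\frac{(\theta+2)\theta^{\alpha-1}+\Gamma(\alpha+2)\theta^{-1}}{D}. \tag{48}$$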
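Similarly, the h-step conditional variance is
$$\mathrm{Var}(X_{t+h} \mid X_{t}) = \mathrm{Var}\left(\sum_{j=1}^{h}\rho^{h-j}\varepsilon_{t+j} \,\Big|\, X_{t}\right) = \sum_{j=1}^{h}\rho^{2(h-j)}\,\mathrm{Var}(\varepsilon_{t+j} \mid X_{t}) = \sigma_{\varepsilon}^{2}\sum_{i=0}^{h-1}\rho^{2i} = \sigma_{\varepsilon}^{2}\,\frac{1-\rho^{2h}}{1-\rho^{2}} = \frac{1-\rho^{2h}}{1-\rho^{2}}\,\frac{D\left[2(\theta+2)\theta^{\alpha-2}+\Gamma(\alpha+3)\theta^{-2}\right] - \left[(\theta+2)\theta^{\alpha-1}+\Gamma(\alpha+2)\theta^{-1}\right]^{2}}{D^{2}}, \tag{49}$$
where $\sigma_{\varepsilon}^{2} = E(\varepsilon_{t}^{2}) - \mu_{\varepsilon}^{2}$ is the variance of the innovation distribution. When $h \to \infty$, we have
$$\lim_{h\to\infty}E(X_{t+h} \mid X_{t}) = \frac{\mu_{\varepsilon}}{1-\rho}, \tag{50}$$
$$\lim_{h\to\infty}\mathrm{Var}(X_{t+h} \mid X_{t}) = \frac{\sigma_{\varepsilon}^{2}}{1-\rho^{2}}. \tag{51}$$
It is easy to verify that when $h \to \infty$, the conditional mean and variance are equal to the unconditional mean and variance of the AR(1)-GXL process. The auto-covariance function is
$$\gamma(h) = \mathrm{cov}(X_{t}, X_{t+h}) = \rho^{h}\gamma(0), \tag{52}$$
where
$$\gamma(0) = \frac{\sigma_{\varepsilon}^{2}}{1-\rho^{2}} = \frac{D\left[2(\theta+2)\theta^{\alpha-2}+\Gamma(\alpha+3)\theta^{-2}\right] - \left[(\theta+2)\theta^{\alpha-1}+\Gamma(\alpha+2)\theta^{-1}\right]^{2}}{(1-\rho^{2})\,D^{2}}. \tag{53}$$
So, the autocorrelation function is
$$\eta(h) = \frac{\gamma(h)}{\gamma(0)} = \rho^{h}. \tag{54}$$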
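The h-step conditional mean and variance depend on the innovation distribution only through its mean and variance, so they can be written as a small helper. The Python sketch below is illustrative; the function name and the numerical values are assumptions (the innovation mean 1.8 and variance 2.76 correspond to $\alpha = 2$, $\theta = 1$).

```python
def h_step_forecast(x_t, h, rho, mu_eps, var_eps):
    """h-step conditional mean and variance of an AR(1) process with 0 <= rho < 1."""
    mean = rho ** h * x_t + mu_eps * (1 - rho ** h) / (1 - rho)
    var = var_eps * (1 - rho ** (2 * h)) / (1 - rho ** 2)
    return mean, var

rho, mu_eps, var_eps = 0.8, 1.8, 2.76  # illustrative values
one_step = h_step_forecast(12.0, 1, rho, mu_eps, var_eps)
long_run = h_step_forecast(12.0, 200, rho, mu_eps, var_eps)
print(one_step, long_run)
```

For $h = 1$ the helper returns $\rho X_t + \mu_\varepsilon$ and $\sigma_\varepsilon^2$, and for large $h$ it approaches the unconditional mean $\mu_\varepsilon/(1-\rho)$ and variance $\sigma_\varepsilon^2/(1-\rho^2)$, in line with the limits stated above.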

3.2. Residual

The residual analysis is important to verify the assumptions on the AR(1) process. The standardized Pearson residuals (spr) are calculated by
$$e_{t} = \frac{X_{t} - E(X_{t} \mid X_{t-1})}{\sqrt{V(X_{t} \mid X_{t-1})}}, \tag{55}$$
where $E(X_{t} \mid X_{t-1})$ and $V(X_{t} \mid X_{t-1})$ are in Equations (43) and (44), respectively. Residual autocorrelation is assessed using the autocorrelation function (ACF) plot of the spr. The spr is preferred over raw residuals to evaluate the statistical validity of the fitted model. When the model is statistically valid, the mean and variance of the spr should be close to zero and one, respectively [16].
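A minimal Python sketch of the residual computation is given below. It assumes that the one-step conditional mean is $\rho X_{t-1} + \mu_\varepsilon$ and that the conditional variance is the innovation variance $\sigma_\varepsilon^2$; the function name and the toy series are hypothetical.

```python
import math

def pearson_residuals(x, rho, mu_eps, var_eps):
    """Standardized Pearson residuals of a fitted AR(1) process.

    x is the observed series; mu_eps and var_eps are the mean and
    variance of the fitted innovation distribution.
    """
    sd = math.sqrt(var_eps)
    return [(x[t] - rho * x[t - 1] - mu_eps) / sd for t in range(1, len(x))]

# Toy series and illustrative parameter values (not taken from the paper).
series = [9.0, 9.0, 10.0, 8.5]
spr = pearson_residuals(series, rho=0.8, mu_eps=1.8, var_eps=2.76)
spr_mean = sum(spr) / len(spr)
print(spr, spr_mean)
```

In practice the sample mean and variance of the residuals are then compared with zero and one, and their ACF is inspected for remaining serial correlation.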

4. Estimation

Three different approaches are used to estimate the parameters of the AR(1)-GXL process. These are maximum likelihood, Gaussian and least squares estimation methods.

4.1. Maximum Likelihood

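Assume that the innovation process $\varepsilon_{t}$ follows the GXL distribution, given in Equation (3). The innovation process is defined as $\varepsilon_{t} = X_{t} - \rho X_{t-1}$. So, the log-likelihood function of the AR(1)-GXL process is
$$\ell(\rho,\alpha,\theta) = (T-1)(\alpha+1)\log\theta - \theta\sum_{t=2}^{T}(X_{t}-\rho X_{t-1}) + \sum_{t=2}^{T}\log\left[2+\theta+(X_{t}-\rho X_{t-1})^{\alpha}\right] - (T-1)\log\left[(\theta+2)\theta^{\alpha}+\Gamma(\alpha+1)\right]. \tag{56}$$
Taking partial derivatives of Equation (56) with respect to the unknown parameters and writing $\varepsilon_{t} = X_{t}-\rho X_{t-1}$, we have the score functions
$$\frac{\partial\ell}{\partial\rho} = \theta\sum_{t=2}^{T}X_{t-1} - \sum_{t=2}^{T}\frac{\alpha\,\varepsilon_{t}^{\alpha-1}X_{t-1}}{2+\theta+\varepsilon_{t}^{\alpha}}, \tag{57}$$
$$\frac{\partial\ell}{\partial\theta} = \frac{(T-1)(\alpha+1)}{\theta} - \sum_{t=2}^{T}\varepsilon_{t} + \sum_{t=2}^{T}\frac{1}{2+\theta+\varepsilon_{t}^{\alpha}} - (T-1)\frac{\theta^{\alpha}+\alpha(\theta+2)\theta^{\alpha-1}}{(\theta+2)\theta^{\alpha}+\Gamma(\alpha+1)}, \tag{58}$$
$$\frac{\partial\ell}{\partial\alpha} = (T-1)\log\theta + \sum_{t=2}^{T}\frac{\varepsilon_{t}^{\alpha}\log\varepsilon_{t}}{2+\theta+\varepsilon_{t}^{\alpha}} - (T-1)\frac{(\theta+2)\theta^{\alpha}\log\theta+\Gamma(\alpha+1)\psi(\alpha+1)}{(\theta+2)\theta^{\alpha}+\Gamma(\alpha+1)}, \tag{59}$$
where $\psi(\cdot)$ is the digamma function.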
The ML estimators of the AR(1)-GXL process can be obtained by equating these score vectors to zero and solving simultaneously. However, there are no closed-form expressions for the ML estimators of the AR(1)-GXL process. For this reason, we use the direct maximization method to obtain the ML estimations of the processes.
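Direct maximization needs only a function that evaluates the log-likelihood. The Python sketch below is one possible way to write it, obtained by summing the log of the GXL density over the implied innovations $X_t - \rho X_{t-1}$. It is not the authors' implementation; the function name, the convention of returning minus infinity for inadmissible parameters, and the toy data are assumptions.

```python
import math

def ar1_gxl_loglik(params, x):
    """Conditional log-likelihood of the AR(1)-GXL process.

    params = (rho, alpha, theta); x is the observed series.
    Returns -inf when the parameters are inadmissible or when any
    implied innovation is non-positive (GXL support is x > 0).
    """
    rho, alpha, theta = params
    if not (0 <= rho < 1 and alpha > 0 and theta > 0):
        return -math.inf
    eps = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    if min(eps) <= 0:
        return -math.inf
    d = (theta + 2) * theta ** alpha + math.gamma(alpha + 1)
    n = len(eps)
    return (n * ((alpha + 1) * math.log(theta) - math.log(d))
            - theta * sum(eps)
            + sum(math.log(2 + theta + e ** alpha) for e in eps))

# Toy data and parameter values, for illustration only.
x_demo = [1.0, 1.5, 2.0, 2.2]
ll = ar1_gxl_loglik((0.5, 2.0, 1.0), x_demo)
print(ll)
```

The negative of this function can be passed to a general-purpose optimizer. Returning minus infinity for non-positive innovations keeps the search inside the region where all implied innovations are positive.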

4.2. Gaussian

The Gaussian estimation (GE) method, proposed by [17], is based on the likelihood function of the Gaussian distribution. The mean and variance of the Gaussian distribution are replaced by the conditional mean and variance of the AR(1)-GXL process. The conditional likelihood function is
$$L(\rho,\alpha,\theta) = \prod_{t=2}^{T}f(x_{t} \mid x_{t-1}). \tag{60}$$
The log-likelihood function of (60) is
$$\ell(\rho,\alpha,\theta) = \sum_{t=2}^{T}\log f(x_{t} \mid x_{t-1}), \tag{61}$$
where $X_{t} \mid X_{t-1} \sim N\left(\mu_{X_{t}\mid X_{t-1}}, \sigma^{2}_{X_{t}\mid X_{t-1}}\right)$. The definitions of $\mu_{X_{t}\mid X_{t-1}}$ and $\sigma^{2}_{X_{t}\mid X_{t-1}}$ are in Equations (43) and (44), respectively. Inserting the conditional mean and variance of $X_{t}$ into Equation (61), we have
$$\ell(\rho,\alpha,\theta) = -\frac{T-1}{2}\log(2\pi) - \frac{T-1}{2}\log\sigma_{\varepsilon}^{2} - \frac{1}{2\sigma_{\varepsilon}^{2}}\sum_{t=2}^{T}\left(x_{t}-\rho x_{t-1}-\mu_{\varepsilon}\right)^{2}, \tag{62}$$
with
$$\mu_{\varepsilon} = \frac{(\theta+2)\theta^{\alpha-1}+\Gamma(\alpha+2)\theta^{-1}}{D}, \qquad \sigma_{\varepsilon}^{2} = \frac{D\left[2(\theta+2)\theta^{\alpha-2}+\Gamma(\alpha+3)\theta^{-2}\right] - \left[(\theta+2)\theta^{\alpha-1}+\Gamma(\alpha+2)\theta^{-1}\right]^{2}}{D^{2}}.$$
There are no explicit solutions for the parameters of the AR(1)-GXL process. Therefore, Equation (62) should be maximized using iterative optimization algorithms. For this purpose, we use the Nelder–Mead algorithm defined in the optim function of R.
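The Gaussian objective can be sketched in Python as follows. This is an illustrative version under the assumption that the conditional mean is $\rho x_{t-1} + \mu_\varepsilon$ and the conditional variance is $\sigma_\varepsilon^2$, with both moments computed from the GXL raw moments; the function names are hypothetical, and the paper itself uses R's optim for the maximization.

```python
import math

def gxl_moment(k, alpha, theta):
    """k-th raw moment of the GXL distribution."""
    d = (theta + 2) * theta ** alpha + math.gamma(alpha + 1)
    return ((theta + 2) * theta ** alpha * math.gamma(k + 1)
            + math.gamma(k + alpha + 1)) / (theta ** k * d)

def gaussian_loglik(params, x):
    """Gaussian (quasi-)log-likelihood with AR(1)-GXL conditional moments."""
    rho, alpha, theta = params
    mu_eps = gxl_moment(1, alpha, theta)
    var_eps = gxl_moment(2, alpha, theta) - mu_eps ** 2
    n = len(x) - 1  # number of conditional terms, t = 2, ..., T
    ss = sum((x[t] - rho * x[t - 1] - mu_eps) ** 2 for t in range(1, len(x)))
    return -0.5 * n * math.log(2 * math.pi * var_eps) - 0.5 * ss / var_eps

# Constant toy series at the stationary mean for rho = 0.8, alpha = 2, theta = 1.
ge_value = gaussian_loglik((0.8, 2.0, 1.0), [9.0, 9.0, 9.0, 9.0])
print(ge_value)
```

Unlike the GXL likelihood, this objective does not restrict the implied innovations to be positive.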

4.3. Conditional Least Squares

The conditional least squares (CLS) estimations of the AR(1)-GXL are obtained by minimizing
$$Q(\rho,\alpha,\theta) = \sum_{t=2}^{T}\left[X_{t} - E(X_{t} \mid X_{t-1})\right]^{2}, \tag{63}$$
where $E(X_{t} \mid X_{t-1})$ is defined in Equation (43). Inserting Equation (43) into Equation (63), we have
$$Q(\rho,\alpha,\theta) = \sum_{t=2}^{T}\left[X_{t} - \rho X_{t-1} - \frac{(\theta+2)\theta^{\alpha-1}+\Gamma(\alpha+2)\theta^{-1}}{D}\right]^{2}. \tag{64}$$
Since it is not possible to obtain closed-form expressions for the CLS estimators of the AR(1)-GXL process, as in the MLE approach, Equation (64) should be minimized using iterative optimization algorithms. Again, we use the Nelder–Mead algorithm defined in the optim function.
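The CLS criterion is straightforward to evaluate for a given parameter vector. The short Python sketch below is illustrative only (the function name and toy series are assumptions); it computes the sum of squared one-step prediction errors that an optimizer would then minimize.

```python
import math

def cls_objective(params, x):
    """Sum of squared one-step prediction errors for the AR(1)-GXL process."""
    rho, alpha, theta = params
    d = (theta + 2) * theta ** alpha + math.gamma(alpha + 1)
    # Innovation mean, i.e., the constant term of the conditional mean.
    mu_eps = ((theta + 2) * theta ** (alpha - 1) + math.gamma(alpha + 2) / theta) / d
    return sum((x[t] - rho * x[t - 1] - mu_eps) ** 2 for t in range(1, len(x)))

# Toy series, for illustration only.
q_flat = cls_objective((0.8, 2.0, 1.0), [9.0, 9.0, 9.0])
q_jump = cls_objective((0.8, 2.0, 1.0), [9.0, 10.0])
print(q_flat, q_jump)
```

For $\rho = 0.8$, $\alpha = 2$ and $\theta = 1$ the conditional mean at $X_{t-1} = 9$ is 9, so a constant series at 9 gives a criterion value of zero, and an observation of 10 contributes a squared error of one.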

4.4. Simulation

The ML, GE and CLS methods are compared via a simulation study. The simulation replication number is set to N = 1000. Two parameter vectors and four sample sizes are used. These are (ρ = 0.8, α = 2, θ = 1) and (ρ = 0.8, α = 3, θ = 2), with n = 100, 200, 300, 500. The simulation results are reported in Table 1.
The ML method exhibits the best overall performance for the autoregressive parameter ρ . The ML estimates are nearly unbiased, with mean values consistently close to the true value of 0.8 , and with mean square error (MSE) values smaller than those of GE and CLS. The MSE for ρ under ML decreases rapidly with increasing n, approaching zero when n = 500 , demonstrating the expected asymptotic efficiency of the ML estimator.
For the parameters α and θ , ML maintains good performance but displays moderately higher variance compared to CLS. Although ML remains nearly unbiased for these parameters, its MSE values are consistently larger than those of CLS across all sample sizes. This pattern suggests that ML, while theoretically optimal in large samples, may exhibit less stable behavior for the parameters of the innovation distribution when n is relatively small.
The GE estimator performs consistently worse than CLS and ML for all parameters and across both scenarios. The most important deficiency appears in the estimation of the shape parameter α , where GE exhibits substantial positive bias and extremely large MSEs. The important findings of the simulation study can be summarized as follows:
  • The ML estimator is the most accurate and stable method for estimating ρ .
  • The CLS estimator consistently outperforms the alternatives for the parameters α and θ , providing the smallest MSEs and near-unbiased estimates.
  • The GE estimator performs poorly for all parameters and is especially inefficient in estimating α.
Finally, we recommend the use of the CLS method instead of the GE and ML methods for small sample sizes. However, if the sample size is large enough, the ML and CLS methods can be preferred.

5. Applications

5.1. US Wheat Index

The daily closing prices of the US wheat data, from 2 February 2025 to 11 November 2025, are obtained from https://tr.investing.com/commodities/us-wheat-historical-data (accessed on 15 November 2025). The dataset is divided into two parts: training and testing. 70% of the data is used for training, while 30% is used for testing. Therefore, the first 140 observations are used for training, while the remaining 59 observations are used for testing. The ACF, partial ACF (PACF), and time series plots are displayed in Figure 3. Also, the descriptive statistics and the Phillips–Perron (PP) stationarity test (with drift and no trend) are given in Table 2.

5.1.1. In-Sample Performance

Using the training data set, the AR(1) process is modeled under the GXL and normal innovation distributions. The results are presented in Table 3. According to the results in Table 3, it is concluded that the AR(1)-GXL process yields better results than the AR(1)-N process since its AIC and BIC values are smaller than those of the AR(1)-N process.
In the second stage, the sprs are obtained for both processes using the estimated parameters. According to the Box–Pierce test in Table 4, there is no autocorrelation between the residuals. Furthermore, according to the Kolmogorov–Smirnov (KS) test results, the residuals obtained from both processes are consistent with the distributions assumed for the innovation process.
Figure 4 and Figure 5 display the residuals and fitted values of the AR(1)-GXL and AR(1)-N processes. The ACF and cumulative periodogram plots confirm that the residuals have no trend and no autocorrelation problem. Also, the residuals are randomly distributed around zero. The fitted values of the AR(1)-GXL and AR(1)-N processes are close to the actual values of the US wheat data.
In summary, the use of the GXL distribution as an innovation distribution has increased the success of the model. The fit of the residuals obtained from the AR(1) process to the GXL distribution is higher than that of the AR(1)-N process.

5.1.2. Out-of-Sample Performance

The out-of-sample performance of the AR(1)-GXL and AR(1)-N processes is compared using the root mean squared error (RMSE), mean absolute error (MAE), and symmetric mean absolute percentage error (SMAPE) metrics. The definitions of these measures can be found in [8]. The results are presented in Table 5, where the RMSE, MAE, and SMAPE values of the AR(1)-GXL process are found to be smaller than those of the AR(1)-N process. Therefore, the forecasting performance of the AR(1)-GXL process is higher than that of the AR(1)-N process. Furthermore, a graphical comparison of the forecasted and actual values is provided in Figure 6.
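For reference, the three accuracy measures can be written as short Python functions. The definitions used in the paper are those of [8] and are not reproduced here, so the SMAPE form below (absolute error divided by the average of the absolute actual and forecasted values, reported as a proportion) is an assumption; other conventions differ by a factor of two or of one hundred.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def smape(actual, forecast):
    """Symmetric MAPE as a proportion; undefined if a pair is (0, 0)."""
    terms = [abs(a - f) / ((abs(a) + abs(f)) / 2) for a, f in zip(actual, forecast)]
    return sum(terms) / len(terms)

# Made-up numbers, for illustration only.
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
print(rmse(actual, forecast), mae(actual, forecast), smape(actual, forecast))
```

Smaller values of all three measures indicate better forecasts, which is the sense in which the processes are ranked in Table 5.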
The differences in AIC and BIC reported in Table 3 are numerically modest. However, the practical significance of the AR(1)-GXL model is assessed not just through AIC and BIC, but also via its ability to address skewness, heavier tails, and extreme observations, as evidenced by residual diagnostics and forecasting accuracy.

5.2. Synthetic Data

Consider the AR(1) process $X_{t} = \rho X_{t-1} + \varepsilon_{t}$, where the innovations $\varepsilon_{t}$ are independent and identically distributed (iid) from a two-component mixture,
$$\varepsilon_{t} \sim \begin{cases}\mathrm{Exp}(\lambda) & \text{with probability } p, \\ \mathrm{Gamma}(\alpha,\theta) & \text{with probability } 1-p,\end{cases}$$
where $0 \le p \le 1$, $\lambda > 0$ for the exponential component, and $\alpha > 0$, $\theta > 0$ for the gamma component. The synthetic data is generated using the above structure with the following parameters: ρ = 0.9, p = 0.2, λ = 2, α = 2 and θ = 2. The sample size is determined as N = 500. The ACF, PACF and time series plots of the synthetic data are displayed in Figure 7. From the ACF and PACF plots, it is clear that the data exhibits an AR(1) structure.
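A data set with this structure can be generated as in the following Python sketch. Several details are assumptions because the text does not state them: θ is treated as a rate parameter of the gamma component, the process starts at zero with no burn-in, and the seed is arbitrary, so the series will not match the one analyzed in the paper.

```python
import random

def simulate_mixture_ar1(n, rho, p, lam, alpha, theta, rng, x0=0.0):
    """Simulate an AR(1) path with exponential/gamma mixture innovations."""
    x, path = x0, []
    for _ in range(n):
        if rng.random() < p:
            eps = rng.expovariate(lam)                # Exp(lambda), rate lambda
        else:
            eps = rng.gammavariate(alpha, 1 / theta)  # shape alpha, scale 1/theta (assumed rate theta)
        x = rho * x + eps
        path.append(x)
    return path

series = simulate_mixture_ar1(500, rho=0.9, p=0.2, lam=2.0, alpha=2.0, theta=2.0,
                              rng=random.Random(7))
train, test = series[:350], series[350:]  # 350/150 split used in the text
print(len(train), len(test))
```

The split mirrors the one described in the text, with the first 350 observations kept for estimation and the remaining 150 held out for forecasting.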
We compare the AR(1)-GXL process with the AR(1) process under different innovation distributions such as gamma, weighted Lindley and normal. As in the previous application, the data is divided into two sets: training and testing. The first 350 observations are used as a training set and the remaining 150 observations are used as a testing set. Table 6 shows the estimated parameters of the AR(1) processes for the training set. Since the AR(1)-GXL process has the lowest AIC and BIC values, we conclude that the AR(1)-GXL provides better results than the AR(1)-WL, AR(1)-Gamma and AR(1)-normal.
Table 7 shows the Box–Pierce and KS test results for the residuals of the AR(1) processes. The AR(1)-GXL process has the highest p-value for the KS test result. Also, there is no autocorrelation problem for all AR(1) processes. The mean and variance of the sprs are near zero and one, respectively, for AR(1)-GXL and AR(1)-N. However, the variance of the sprs of the AR(1)-Gamma and AR(1)-WL are far away from one. It means that the variance of the process is underestimated with Gamma and WL innovations.
Figure 8 displays the graphical results for the residuals of the AR(1)-GXL process. These plots indicate that the GXL distribution fits the innovation process of the AR(1) model well.
Table 8 compares the out-of-sample performances of the AR(1) processes. The AR(1)-GXL process has the lowest values of the MAE, RMSE and SMAPE metrics. Therefore, it is selected as the best process for the data. The surprising result belongs to the AR(1)-Normal process. Despite showing the worst performance in the training set, the AR(1)-Normal process is the second-best model in the testing set. The graphical comparison of the forecasted and actual values is displayed in Figure 9.

6. ARGXL Shiny Application

The ARGXL Shiny application provides a computational framework for analyzing and estimating the ARGXL autoregressive model. The ARGXL tool includes data upload, visualization, unit root testing, maximum likelihood estimation, diagnostic checking, and out-of-sample forecasting features. The application is accessible via https://gazistat.shinyapps.io/ARGXL (accessed on 2 December 2025).

6.1. Data Upload Panel

The Data Upload panel enables the user to import datasets in CSV format. The user can specify header options and separators. Once the dataset is loaded, the panel automatically displays a preview of the observations and generates a dynamic variable selection interface (see Figure 10).

6.2. Plots and Unit Root Test Panel

The plots and unit root test panel displays the plot of the selected time series and provides ACF and PACF plots. In addition, it performs the PP test for checking whether the time series is stationary. The results are presented in an interactive table with lag, test statistic, and p-value displayed for different trend and drift specifications. The panel helps the user in determining whether the time series is stationary and suitable for ARGXL modeling (see Figure 11).

6.3. Estimation Panel

The estimation panel carries out maximum likelihood estimation of the ARGXL model parameters. The user may optionally specify the length of the sample used for estimation. Standard errors are computed via numerical approximation of the Hessian matrix. The panel outputs the parameter estimates, their asymptotic standard errors, the maximized log-likelihood, and the commonly used information criteria, AIC and BIC (see Figure 12).

6.4. Diagnostic Panel

The Diagnostics panel evaluates the adequacy of the fitted model (see Figure 13). It computes raw residuals, sprs, and their empirical distribution properties. A histogram of raw residuals with the estimated probability density function of the GXL innovation distribution is provided. Additional diagnostic tools include the following:
  • KS test for comparing residuals with the fitted GXL distribution.
  • Box–Pierce test for serial correlation in Pearson residuals.
  • Cumulative periodogram for assessing the white noise process.

6.5. Forecasting

The forecasting panel implements one-step-ahead rolling forecasting. The user may choose a fixed training sample size or rely on the default choice of 70% of the available observations. At each forecasting iteration, the model is re-estimated and the forecast is obtained from the conditional mean implied by the GXL innovation distribution. Forecasting accuracy is evaluated using the RMSE, MAE, and SMAPE metrics. Additionally, the panel provides a graphical comparison of the forecasted and observed values (see Figure 14).
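
The rolling scheme can be illustrated with a short Python sketch. Several choices here are assumptions made for illustration: the process is taken in the usual form x_t = ρ x_{t−1} + ε_t, the window expands by one observation per iteration, conditional least squares stands in for the ML re-estimation performed by the application, and SMAPE is taken as the mean of 2|a − p| / (|a| + |p|) without multiplication by 100, which matches the magnitudes in Tables 5 and 8. All function names are hypothetical.

```python
import math


def cls_fit(x):
    """Conditional least squares for x_t = rho * x_{t-1} + mu_eps + error.

    Stand-in for the ML step; returns (rho, mu_eps), where mu_eps
    estimates the mean of the innovations.
    """
    prev, curr = x[:-1], x[1:]
    m = len(curr)
    mean_prev = sum(prev) / m
    mean_curr = sum(curr) / m
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    rho = sxy / sxx
    return rho, mean_curr - rho * mean_prev


def rolling_one_step(x, train_frac=0.7):
    """Refit on observations 0..t-1, then forecast x_t by its conditional mean."""
    start = int(len(x) * train_frac)
    actual, predicted = [], []
    for t in range(start, len(x)):
        rho, mu_eps = cls_fit(x[:t])
        predicted.append(rho * x[t - 1] + mu_eps)
        actual.append(x[t])
    return actual, predicted


def mae(a, p):
    return sum(abs(u - v) for u, v in zip(a, p)) / len(a)


def rmse(a, p):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, p)) / len(a))


def smape(a, p):
    return sum(2.0 * abs(u - v) / (abs(u) + abs(v)) for u, v in zip(a, p)) / len(a)


# Hypothetical noise-free series, used only to exercise the functions
series = [1.0]
for _ in range(19):
    series.append(0.5 * series[-1] + 2.0)
actual, predicted = rolling_one_step(series)
print(mae(actual, predicted), rmse(actual, predicted), smape(actual, predicted))
```

Re-estimating at every step keeps each forecast strictly out of sample, at the cost of one model fit per forecasted observation.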

6.6. Packages and Deployment Process

The application is implemented in the R environment using the shiny and shinydashboard packages and is deployed on a free R Shiny server. The packages used are listed below:
  • shinydashboard: User interface generation [18].
  • ggplot2: Graphical visualization and comparison [19].
  • aTSA: PP unit root test [20].
  • Metrics: Computation of RMSE, MAE, and SMAPE [21].
  • numDeriv: Numerical differentiation and estimation of Hessian matrix [22].
  • zipfR: Evaluation of incomplete gamma function [23].
  • DT: Interactive rendering of tables [24].

7. Results

The GXL distribution is proposed by adding a shape parameter to the XL distribution. The moments and related measures of the GXL distribution are derived in explicit form. An AR(1) process with GXL innovations is introduced. Parameter estimation for the proposed process is examined under three estimation methods through a comprehensive simulation study. The practical importance of the AR(1)-GXL process is demonstrated using two data sets. Empirical findings show that the AR(1) process with GXL innovations outperforms its counterparts with gamma, WL, and normal innovations. The ARGXL web application is developed to enable researchers to use the proposed model and to ensure the reproducibility of the results given in the study.
The main limitation of the AR(1)-GXL model concerns parameter estimation. The parameter estimates obtained from the GE and CLS methods may produce negative residuals, which are not possible under the AR(1)-GXL model. This problem does not arise with the ML method. Therefore, although the CLS method performs best in the simulation study, the ML method, which guarantees positive residuals, is recommended.
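
The negative-residual issue can be made concrete with a small Python sketch. It assumes the usual AR(1) form x_t = ρ x_{t−1} + ε_t with strictly positive observations, so that the residual at time t is x_t − ρ x_{t−1} and every residual is non-negative only if ρ does not exceed the smallest ratio x_t / x_{t−1}. The data values and function names are hypothetical.

```python
def residuals(x, rho):
    """Raw residuals e_t = x_t - rho * x_{t-1} of an AR(1) fit."""
    return [x[t] - rho * x[t - 1] for t in range(1, len(x))]


def max_admissible_rho(x):
    """Largest rho that keeps all residuals non-negative (x strictly positive)."""
    return min(x[t] / x[t - 1] for t in range(1, len(x)))


# Hypothetical positive series
x = [10.0, 9.0, 8.5, 9.2]
print(max_admissible_rho(x))
print(min(residuals(x, 0.95)))  # estimate above the bound: a negative residual appears
print(min(residuals(x, 0.85)))  # estimate below the bound: all residuals positive
```

An estimator that does not account for this bound can return an autoregressive estimate above it, whereas a likelihood built on a positive innovation density assigns zero likelihood to such values.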

Author Contributions

Conceptualization, E.A., S.A.G. and H.N.A.; methodology, E.A., S.A.G. and H.N.A.; software, E.A.; validation, E.A., S.A.G. and H.N.A.; writing—original draft preparation, E.A., S.A.G. and H.N.A.; writing—review and editing, E.A., S.A.G. and H.N.A.; visualization, E.A. All authors have read and agreed to the published version of the manuscript.

Funding

The researchers would like to thank the Deanship of Graduate Studies and Scientific Research at Qassim University for financial support (QU-APC-2026).

Data Availability Statement

The original data presented in the study are openly available at https://tr.investing.com/commodities/us-wheat-historical-data (accessed on 22 January 2026).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Andel, J. On AR (1) processes with exponential white noise. Commun. Stat.-Theory Methods 1988, 17, 1481–1495.
  2. Gaver, D.P.; Lewis, P.A. First-order autoregressive gamma sequences and point processes. Adv. Appl. Probab. 1980, 12, 727–745.
  3. Sharafi, M.; Nematollahi, A.R. AR (1) model with skew-normal innovations. Metrika 2016, 79, 1011–1029.
  4. Ghasami, S.; Khodadadi, Z.; Maleki, M. Autoregressive processes with generalized hyperbolic innovations. Commun. Stat.-Simul. Comput. 2020, 49, 3080–3092.
  5. Mello, A.B.; Lima, M.C.; Nascimento, A.D. A notable Gamma-Lindley first-order autoregressive process: An application to hydrological data. Environmetrics 2022, 33, e2724.
  6. Nitha, K.U.; Krishnarani, S.D. On autoregressive processes with Lindley-distributed innovations: Modeling and simulation. Stat. Transit. 2024, 25, 31–47.
  7. Bakouch, H.S.; Popovic, B.V. Lindley first-order autoregressive model with applications. Commun. Stat.-Theory Methods 2016, 45, 4988–5006.
  8. Gabr, M.; Bakouch, H.; El-Taweel, H. A first-order autoregressive process with weighted Lindley innovations and its applications to energy and financial data. Ann. Math. Comput. Sci. 2025, 29, 1–19.
  9. Maleki, M.; Arellano-Valle, R.B.; Dey, D.K.; Mahmoudi, M.R.; Jalali, S.M.J. A Bayesian approach to robust skewed autoregressive processes. Calcutta Stat. Assoc. Bull. 2017, 69, 165–182.
  10. Bondon, P. Estimation of autoregressive models with epsilon-skew-normal innovations. J. Multivar. Anal. 2009, 100, 1761–1776.
  11. Chouia, S.; Zeghdoudi, H. The XLindley distribution: Properties and application. J. Stat. Theory Appl. 2021, 20, 318–327.
  12. Maleki, M.; Nematollahi, A.R. Autoregressive models with mixture of scale mixtures of Gaussian innovations. Iran. J. Sci. Technol. Trans. A Sci. 2017, 41, 1099–1107.
  13. Saadatmand, A.; Nematollahi, A.R.; Sadooghi-Alvandi, S.M. On the estimation of missing values in AR (1) model with exponential innovations. Commun. Stat.-Theory Methods 2017, 46, 3393–3400.
  14. Ghitany, M.E.; Atieh, B.; Nadarajah, S. Lindley distribution and its application. Math. Comput. Simul. 2008, 78, 493–506.
  15. Sen, S.; Maiti, S.S.; Chandra, N. The xgamma distribution: Statistical properties and application. J. Mod. Appl. Stat. Methods 2016, 15, 38.
  16. Weiß, C.; Scherer, L.; Aleksandrov, B.; Feld, M. Checking model adequacy for count time series by using Pearson residuals. J. Time Series Econom. 2020, 12, 20180018.
  17. Whittle, P. Gaussian estimation in stationary time series. Bull. Int. Stat. Inst. 1961, 39, 105–129.
  18. Chang, W.; Borges Ribeiro, B. shinydashboard: Create Dashboards with 'Shiny'. R Package Version 0.7.2. 2021. Available online: https://CRAN.R-project.org/package=shinydashboard (accessed on 15 November 2025).
  19. Wickham, H. ggplot2: Elegant Graphics for Data Analysis; Springer: New York, NY, USA, 2016.
  20. Qiu, D. aTSA: Alternative Time Series Analysis. R Package Version 3.1.2.1. 2024. Available online: https://CRAN.R-project.org/package=aTSA (accessed on 15 November 2025).
  21. Hamner, B.; Frasco, M. Metrics: Evaluation Metrics for Machine Learning. R Package Version 0.1.4. 2018. Available online: https://CRAN.R-project.org/package=Metrics (accessed on 15 November 2025).
  22. Gilbert, P.; Varadhan, R. numDeriv: Accurate Numerical Derivatives. R Package Version 2016.8-1.1. 2019. Available online: https://CRAN.R-project.org/package=numDeriv (accessed on 15 November 2025).
  23. Evert, S.; Baroni, M. zipfR: Word frequency distributions in R. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Posters and Demonstrations Sessions, Prague, Czech Republic, 23–30 June 2007; pp. 29–32.
  24. Xie, Y.; Cheng, J.; Tan, X. DT: A Wrapper of the JavaScript Library 'DataTables'. R Package Version 0.33. 2024. Available online: https://CRAN.R-project.org/package=DT (accessed on 15 November 2025).

Figure 1. Pdf of the GXL distribution for different values of α with fixed θ.
Figure 2. Heat map for skewness and kurtosis measures.
Figure 3. PACF (top-left), ACF (top-right) and daily prices of the US wheat index (bottom-middle).
Figure 4. ACF and cumulative periodogram for the residuals of the AR(1)-GXL, fitted and observed values of the data and estimated GXL innovations with PP plot.
Figure 5. ACF and cumulative periodogram for the residuals of the AR(1)-N, fitted and observed values of the data and estimated normal innovations with PP plot.
Figure 6. Comparison of actual and forecasted values of the US wheat data.
Figure 7. ACF, PACF and time series plots of the synthetic data.
Figure 8. Residual plots of the AR(1)-GXL process.
Figure 9. Comparison of actual and forecasted values of the AR(1) processes.
Figure 10. The data upload panel.
Figure 11. The plots and unit root test panel.
Figure 12. The estimation panel.
Figure 13. The diagnostic panel.
Figure 14. Forecasting panel.
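
Table 1. Simulation results of the AR(1)-GXL process.

| Scenario | n | Method | Mean ρ | Mean α | Mean θ | MSE ρ | MSE α | MSE θ |
|---|---|---|---|---|---|---|---|---|
| I | 100 | GE | 0.7735 | 4.0366 | 1.6600 | 0.0041 | 40.2175 | 4.7136 |
| I | 100 | CLS | 0.7649 | 2.1920 | 0.9987 | 0.0055 | 0.2243 | 0.0229 |
| I | 100 | ML | 0.8019 | 2.1091 | 1.0617 | <0.0001 | 0.8695 | 0.1145 |
| I | 200 | GE | 0.7812 | 3.7325 | 1.5636 | 0.0018 | 12.5379 | 1.1151 |
| I | 200 | CLS | 0.7809 | 2.1189 | 1.0028 | 0.0024 | 0.0946 | 0.0124 |
| I | 200 | ML | 0.8010 | 2.0673 | 1.0340 | <0.0001 | 0.3658 | 0.0314 |
| I | 300 | GE | 0.7870 | 3.5693 | 1.5282 | 0.0011 | 11.9151 | 1.0760 |
| I | 300 | CLS | 0.7891 | 2.0849 | 1.0068 | 0.0015 | 0.0407 | 0.0087 |
| I | 300 | ML | 0.8006 | 2.0146 | 1.0126 | <0.0001 | 0.2935 | 0.0229 |
| I | 500 | GE | 0.7908 | 3.2357 | 1.4343 | 0.0006 | 9.3595 | 0.8257 |
| I | 500 | CLS | 0.7928 | 2.0786 | 1.0101 | 0.0008 | 0.0261 | 0.0053 |
| I | 500 | ML | 0.8004 | 2.0101 | 1.0082 | <0.0001 | 0.1569 | 0.0132 |
| II | 100 | GE | 0.7922 | 2.6881 | 2.0331 | 0.0011 | 16.0022 | 1.8758 |
| II | 100 | CLS | 0.7642 | 3.2364 | 1.9907 | 0.0049 | 0.2732 | 0.0574 |
| II | 100 | ML | 0.8017 | 2.9847 | 2.0433 | <0.0001 | 1.1331 | 0.0902 |
| II | 200 | GE | 0.7939 | 2.8147 | 2.0298 | 0.0008 | 8.2524 | 0.8931 |
| II | 200 | CLS | 0.7809 | 3.1895 | 2.0111 | 0.0022 | 0.1918 | 0.0303 |
| II | 200 | ML | 0.8009 | 3.0669 | 2.0407 | <0.0001 | 0.5851 | 0.0474 |
| II | 300 | GE | 0.7964 | 2.7740 | 2.0029 | 0.0007 | 5.7176 | 0.6291 |
| II | 300 | CLS | 0.7885 | 3.1715 | 2.0320 | 0.0015 | 0.1630 | 0.0251 |
| II | 300 | ML | 0.8005 | 2.9695 | 2.0081 | <0.0001 | 0.3133 | 0.0234 |
| II | 500 | GE | 0.7973 | 2.8148 | 1.9938 | 0.0005 | 3.8376 | 0.4122 |
| II | 500 | CLS | 0.7937 | 3.1374 | 2.0308 | 0.0008 | 0.0867 | 0.0145 |
| II | 500 | ML | 0.8003 | 3.0042 | 2.0090 | <0.0001 | 0.1862 | 0.0157 |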

Table 2. Descriptive statistics and PP test results for the training set.

| Minimum | Maximum | Mean | Std. Deviation | p-Value of PP Test |
|---|---|---|---|---|
| 499 | 617.75 | 545.75 | 21.12 | 0.0479 |

Table 3. Estimated parameters of the AR(1) process with GXL and normal innovations.

| Models | Parameters | Estimates | Std. Errors | −log L | AIC | BIC |
|---|---|---|---|---|---|---|
| AR(1)-GXL | ρ | 0.956 | <0.001 | 496.2745 | 998.5491 | 1007.374 |
| | α | 5.682 | 0.782 | | | |
| | θ | 0.284 | 0.035 | | | |
| AR(1)-N | ρ | 0.912 | 0.034 | 500.240 | 1006.480 | 1015.300 |
| | μ | 47.522 | 18.832 | | | |
| | σ | 8.562 | 0.514 | | | |

Table 4. Residuals of the AR(1)-GXL and AR(1) processes.

| Models | Mean | Variance | Box–Pierce Statistic | Box–Pierce p-Value | KS Statistic | KS p-Value |
|---|---|---|---|---|---|---|
| AR(1)-GXL | <0.001 | 0.950 | 0.053 | 0.818 | 0.075 | 0.416 |
| AR(1)-N | <0.001 | 1.004 | 0.346 | 0.556 | 0.098 | 0.140 |

Table 5. MAE, RMSE and SMAPE values for the AR(1)-GXL and AR(1)-N processes.

| Models | MAE | RMSE | SMAPE |
|---|---|---|---|
| AR(1)-GXL | 5.31783 | 6.91655 | 0.01020 |
| AR(1)-N | 5.81193 | 7.19139 | 0.01116 |

Table 6. Estimated parameters of the AR(1) processes for the synthetic data.

| Models | Parameters | Estimates | Std. Errors | −log L | AIC | BIC |
|---|---|---|---|---|---|---|
| AR(1)-GXL | ρ | 0.900 | 0.000 | 1020.315 | 2046.631 | 2059.275 |
| | α | 2.516 | 0.291 | | | |
| | θ | 0.866 | 0.084 | | | |
| AR(1)-WL | ρ | 0.900 | 0.000 | 1023.564 | 2053.127 | 2065.771 |
| | c | 0.894 | 0.061 | | | |
| | θ | 0.516 | 0.028 | | | |
| AR(1)-Gamma | ρ | 0.899 | 0.000 | 1031.445 | 2068.889 | 2081.533 |
| | α | 1.427 | 0.083 | | | |
| | θ | 2.109 | 0.145 | | | |
| AR(1)-N | ρ | 0.869 | 0.022 | 1112.599 | 2231.198 | 2243.842 |
| | μ | 3.900 | 0.667 | | | |
| | σ | 2.249 | 0.071 | | | |

Table 7. Residuals of the fitted AR(1) processes for the synthetic data.

| Models | Mean | Variance | Box–Pierce Statistic | Box–Pierce p-Value | KS Statistic | KS p-Value |
|---|---|---|---|---|---|---|
| AR(1)-GXL | −0.00131 | 0.96984 | 0.06344 | 0.80110 | 0.02770 | 0.83860 |
| AR(1)-Gamma | 0.00001 | 0.89537 | 0.05466 | 0.81510 | 0.05329 | 0.11750 |
| AR(1)-WL | 0.00581 | 0.87681 | 0.06344 | 0.80110 | 0.05637 | 0.08395 |
| AR(1)-N | −0.00009 | 1.00105 | 0.12223 | 0.72660 | 0.08584 | 0.00128 |

Table 8. Comparison of MAE, RMSE and SMAPE values of the AR(1) processes.

| Models | MAE | RMSE | SMAPE |
|---|---|---|---|
| AR(1)-GXL | 1.87562 | 2.34315 | 0.06344 |
| AR(1)-WL | 1.92101 | 2.39178 | 0.06493 |
| AR(1)-Gamma | 1.92205 | 2.39562 | 0.06496 |
| AR(1)-N | 1.91164 | 2.38439 | 0.06461 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
