Article

Symmetry Analysis of the Uncertain Alternative Box-Cox Regression Model

1 School of Economics and Management, Beijing Forestry University, Beijing 100083, China
2 Center for Statistical Science, Tsinghua University, Beijing 100084, China
3 CEMSE Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
* Author to whom correspondence should be addressed.
Symmetry 2022, 14(1), 22; https://doi.org/10.3390/sym14010022
Submission received: 29 November 2021 / Revised: 14 December 2021 / Accepted: 15 December 2021 / Published: 24 December 2021
(This article belongs to the Special Issue Uncertainty Theory: Symmetry and Applications)

Abstract

The asymmetry of residuals about the origin is a severe issue in estimating a Box-Cox transformed model. In the framework of uncertainty theory, theoretical issues arise in the least-squares estimation (LSE) and maximum likelihood estimation (MLE) of linear models after the Box-Cox transformation of the response variables. Heretofore, only weighting methods for least-squares analysis have been available. This article proposes an uncertain alternative Box-Cox model to alleviate the asymmetry of residuals and to prevent λ from tending to negative infinity under the uncertain LSE or uncertain MLE. Such symmetry of residuals about the origin is reasonable in applications of experts’ experimental data. The parameter estimation method is given via a theorem, and the performance of our model is supported by numerical simulations. According to the numerical simulations, our proposed ‘alternative Box-Cox model’ overcomes the problems of a grossly underestimated λ and the asymmetry of residuals. The estimated residuals neither deviate from zero nor change unevenly, in clear contrast to the downward-biased residuals produced by the LSE and MLE for the uncertain Box-Cox model. Thus, although the LSE and MLE are not applicable to the uncertain Box-Cox model, they fit the uncertain alternative Box-Cox model. Compared with the uncertain Box-Cox model, the issue of a systematically underestimated λ is not likely to occur in our uncertain alternative Box-Cox model. Both the LSE and MLE can be used directly without constructing a weighted estimation method, offering better performance with respect to the asymmetry of residuals.

1. Introduction

Experts’ experimental data involve the subjective judgment of different experts on the possibility of an uncertain event. Due to the small sample size and imprecise data, the mechanism differs from the random sampling in classical statistical research. It may not be appropriate to apply statistical inference directly. Professor Liu Baoding proposed the uncertainty theory to address problems with this kind of data [1]. The uncertain phenomena are described by establishing a new axiomatic system, and the concepts and applications of uncertain measure, uncertain distributions, and uncertain inverse distributions are introduced [2,3,4,5]. This theoretical framework opens up a new research direction and has been widely adopted in many fields [6,7].
Uncertain regression analysis is a classic subject in uncertainty theory and has been discussed in depth by many researchers. For instance, Guo et al. [8] established an uncertain linear regression model based on uncertainty theory, Yao and Liu [9] proposed least-squares estimation, Song and Fu [10] introduced a least-squares method to estimate the unknown parameters of uncertain multivariable linear regression, and Lio and Liu [11,12] studied residuals and confidence intervals and then proposed the uncertain MLE. Meanwhile, the residual analysis of the uncertain Gompertz regression model was provided by Hu and Gao [13]. Liu et al. [14] presented a k-fold cross-validation method for model selection with imprecise observations. A new minimum absolute deviation estimation method for the unknown parameters of the uncertain multiple regression model was proposed by Zhang et al. [15]. Least absolute deviations estimation and variable selection were proposed by Liu and Yang [16,17] for uncertain regression with imprecise observations, and Ye and Liu [18] further proposed uncertain hypothesis testing. For other relevant studies on uncertain regression analysis, please refer to Yang and Liu [19], Yang et al. [20], Liu and Jia [21], and Ye and Liu [22].
In statistics, linear regression is based on the normal distribution, so a data transformation is often required to make the data conform to it. The Box-Cox transformation [23] can usually satisfy the linearity, independence, homogeneity of variance, and normality assumptions of the linear regression model without losing information, and it is therefore widely used in data analysis. These assumptions are often not met by the observed data, and to make sense of linear models one often needs power or logarithmic transformations. The Box-Cox transformation has two objectives. The first is to reduce, to a certain extent, the unobservable errors and the correlation of the prediction variables. The primary change is on the dependent variable: the transformed dependent variable shares a linear relationship with the predictors, and the error terms are normally distributed, have equal variance, and are independent of each other. The second is to give the dependent variable specific properties, such as stationarity in time series analysis or a normal distribution. The regression model obtained from the Box-Cox transformed data is superior to the model before transformation and is easier to interpret.
Similar properties hold in uncertain analysis. Fang et al. [24] proposed three transformation methods for the response variables in uncertain regression analysis with imprecise observation data. The modified least-squares estimation [14] and the uncertain MLE [25] in the discrete case were further proposed. However, the asymmetry of residuals and the tendency of λ toward negative infinity still exist, making the results impractical. Liu et al. [14] alleviated this problem by adding a penalty term to the uncertain least-squares estimation. Fang et al. [25] also pointed out that, in the case of imprecise observations, the same problems occur in the maximum likelihood estimation of the uncertain Box-Cox model. The method of selecting the corresponding penalty term in that model has not been studied. The purpose of the data transformation is to improve the residual fitting results of the uncertain model. When the original data cannot give satisfactory residual fitting results, it is necessary to transform the variables.
Therefore, this paper proposes an alternative Box-Cox model, which is equivalent to a reparameterization. One can fit the uncertain Box-Cox model directly through this model, avoiding both the asymmetry of residuals and the tendency of λ toward negative infinity. In addition, as far as parameter estimation is concerned, applying the least-squares method to the alternative model is equivalent to using the weighted least-squares method on the transformed model proposed by Fang et al. [25]. Since there is no similar weighted modification for the uncertain maximum likelihood estimation yet, the proposed alternative model is valuable and flexible.
The structure of this paper is as follows. The second section offers the theory of the alternative Box-Cox transformation. The third section presents the parameter estimation theory, including the least-squares estimation and the uncertain MLE for the uncertain alternative transformation. The fourth section reports numerical simulations in which the performance of the alternative Box-Cox model and the ordinary Box-Cox model are compared: a set of imprecise observation data is selected, the models are fitted by least squares and by MLE, respectively, and the fitting quality is assessed from the residual graphs. The concluding remarks and discussions are given in the final section.

2. Alternative Box-Cox Transformation

This section proposes the uncertain alternative Box-Cox transformation and the corresponding uncertain model. Recall that, for the uncertain variable $Y$ and the explanatory variables $x_1, \ldots, x_p$, Liu et al. [14] proposed the following uncertain Box-Cox model:
$$H(Y; \lambda) = f(x_1, \ldots, x_p; \beta) + \epsilon, \tag{1}$$
where
$$H(y; \lambda) = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda}, & \text{if } \lambda \neq 0, \\ \log y, & \text{if } \lambda = 0 \end{cases}$$
is the Box-Cox function, $f$ is a prespecified function, $\lambda$ and $\beta$ are unknown parameters, and $\epsilon$ is the error term, whose expectation is zero and whose variance $\sigma^2$ is unknown. In this model, the data can be precisely or imprecisely observed. When the data are precisely observed, the response variable is denoted by $y_i$ and the explanatory variables by $x_{i1}, \ldots, x_{ip}$, where $i = 1, \ldots, n$. When the data are imprecisely observed, the response variable is denoted by $\tilde{y}_i$ and the explanatory variables by $\tilde{x}_{i1}, \ldots, \tilde{x}_{ip}$, $i = 1, \ldots, n$. We assume that $y_i$ or $\tilde{y}_i$ is always larger than zero throughout this article.
The unknown parameters in the uncertain model are often estimated by the uncertain least-squares method. Letting the data be imprecisely observed, the uncertain least-squares estimation of the unknown parameters $\beta$ and $\lambda$ is defined by
$$(\hat{\beta}, \hat{\lambda}) = \operatorname*{argmin}_{\beta, \lambda} L(\beta, \lambda),$$
where
$$L(\beta, \lambda) = \sum_{i=1}^{n} E\left[\left(H(\tilde{y}_i; \lambda) - f(\tilde{x}_{i1}, \ldots, \tilde{x}_{ip}; \beta)\right)^2\right]$$
is the least-squares term. However, Liu et al. [14] pointed out that when $f$ is a linear function, e.g., $f(x_1, \ldots, x_p; \beta) = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p$, the least-squares term tends to zero when $\beta \to 0$ and $\lambda \to -\infty$, leading to unreasonable estimates. They solved this problem by introducing a penalty term in the least-squares estimation. Fang et al. [25] mentioned that when the data are precisely observed, such unreasonable estimates also appear if one adopts the uncertain maximum likelihood estimation (MLE). To the best of our knowledge, there are no relevant research works on avoiding such issues for the uncertain MLE.
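The shrinkage described above can be illustrated with a minimal sketch. It assumes precisely observed responses that are all greater than 1 and sets every $\beta_j$ to zero, so that the least-squares term reduces to $\sum_i H(y_i;\lambda)^2$. The data values and the helper names `box_cox` and `ls_term_zero_beta` are hypothetical and are not taken from the article.

```python
import math

def box_cox(y, lam):
    """Box-Cox function H(y; lambda) for y > 0."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def ls_term_zero_beta(ys, lam):
    """Least-squares term when f is identically zero (all beta_j = 0)."""
    return sum(box_cox(y, lam) ** 2 for y in ys)

# Hypothetical precise responses, all greater than 1.
ys = [1.5, 2.0, 3.5, 5.0]
lams = (1.0, 0.0, -1.0, -5.0, -50.0)
terms = [ls_term_zero_beta(ys, lam) for lam in lams]

for lam, term in zip(lams, terms):
    print(f"lambda = {lam:6.1f}   least-squares term = {term:.6f}")
```

For this toy data the printed term decreases monotonically as $\lambda$ decreases, which is why an unconstrained minimizer is pulled toward $\lambda \to -\infty$ with $\beta = 0$.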
In the classical statistics framework, Draper and Smith [26] also noticed that the Box-Cox transformation may be invalid when $\lambda$ tends to $-\infty$ and proposed the alternative Box-Cox transformation. This article proposes the following uncertain alternative Box-Cox transformation:
$$V(y; \lambda) = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda \dot{Y}^{\lambda - 1}}, & \text{if } \lambda \neq 0, \\ \dot{Y} \log y, & \text{if } \lambda = 0. \end{cases} \tag{2}$$
Here,
$$\dot{Y} = \exp\left(\frac{1}{n} \sum_{i=1}^{n} \log y_i\right)$$
if the data are precisely observed, and
$$\dot{Y} = \exp\left(\frac{1}{n} \sum_{i=1}^{n} E(\log \tilde{y}_i)\right)$$
if the data are imprecisely observed. Note that our proposed transformation $V(y; \lambda)$ is the Box-Cox transformation $H(y; \lambda)$ divided by $\dot{Y}^{\lambda - 1}$. Draper and Smith [26] pointed out in Section 13.2 that the term $\dot{Y}^{\lambda - 1}$ is the $n$-th root of the Jacobian determinant of $H(y, \lambda) := (H(y_1, \lambda), \ldots, H(y_n, \lambda))$ with respect to the responses $y = (y_1, \ldots, y_n)$. Thus, the unit volume is preserved when $y_i$ is transformed to $V(y_i; \lambda)$.
Unlike $H(y; \lambda)$, the size of $V(y; \lambda)$ does not change enormously with respect to different $\lambda$, so the unreasonable estimates caused by the shrinkage of $H(y; \lambda)$ when $\lambda \to -\infty$ will not appear.
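As a rough numerical illustration of the transformation (2), the sketch below computes the geometric mean $\dot{Y}$ for precisely observed data, evaluates $V(y;\lambda)$, and checks that the product of the derivatives $\partial V(y_i;\lambda)/\partial y_i = (y_i/\dot{Y})^{\lambda-1}$ equals one when $\dot{Y}$ is treated as a constant. The data and function names are hypothetical; the imprecise case would replace $\log y_i$ by $E(\log \tilde{y}_i)$.

```python
import math

def box_cox(y, lam):
    """Box-Cox function H(y; lambda) for y > 0."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def geometric_mean(ys):
    """Y-dot for precisely observed responses."""
    return math.exp(sum(math.log(y) for y in ys) / len(ys))

def alt_box_cox(y, lam, y_dot):
    """Alternative Box-Cox function V(y; lambda) = H(y; lambda) / y_dot**(lambda - 1)."""
    if lam == 0:
        return y_dot * math.log(y)
    return (y ** lam - 1.0) / (lam * y_dot ** (lam - 1.0))

ys = [1.5, 2.0, 3.5, 5.0]  # hypothetical precise responses
y_dot = geometric_mean(ys)
lams = (1.0, 0.0, -1.0, -5.0)

# Sums of squares of the transformed responses under H and under V.
h_sizes = [sum(box_cox(y, lam) ** 2 for y in ys) for lam in lams]
v_sizes = [sum(alt_box_cox(y, lam, y_dot) ** 2 for y in ys) for lam in lams]

# Product of dV/dy_i over the sample, with y_dot held fixed.
jacobians = [math.prod((y / y_dot) ** (lam - 1.0) for y in ys) for lam in lams]
```

In this toy example `h_sizes` shrinks toward zero as $\lambda$ decreases, whereas `v_sizes` never collapses toward zero (it actually grows here), and every entry of `jacobians` equals one up to rounding.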
Based on our proposed transformation (2), when the observations $y_i$ or $\tilde{y}_i$ are available, we define the following uncertain alternative Box-Cox model:
$$V(Y; \lambda) = f(x_1, \ldots, x_p; \beta) + \epsilon, \tag{3}$$
where $V(Y; \lambda)$ is defined by (2). Here, $f$ is a predetermined function, $\lambda$ and $\beta$ are unknown parameters, $\epsilon$ is the error term with zero expectation, and the variance of $\epsilon$ is an unknown parameter $\sigma^2$.
When $f(x_1, \ldots, x_p; \beta) = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p$, our proposed model can be regarded as a reparametrization of the uncertain Box-Cox model (1) because (3) is equivalent to
$$H(Y; \lambda) = \tilde{\beta}_0 + \tilde{\beta}_1 x_1 + \cdots + \tilde{\beta}_p x_p + e,$$
where $\tilde{\beta}_i = \beta_i \dot{Y}^{\lambda - 1}$ and $e$ is the error term, whose expectation is zero and whose variance is $\tilde{\sigma}^2 = \sigma^2 \dot{Y}^{2\lambda - 2}$. In fact, using the least-squares estimation to fit our proposed model is equivalent to using the penalized least-squares method introduced by Liu et al. [14] to fit the Box-Cox model (1). Therefore, one can directly apply popular uncertain fitting methods, such as least-squares estimation and maximum likelihood estimation, to our alternative model without worrying about finding penalty terms or tackling the shrinkage problem.
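The reparametrization can be traced step by step. The following derivation is a sketch that uses only the definition $V(y;\lambda) = H(y;\lambda)/\dot{Y}^{\lambda-1}$ and the scaling rule for the variance of a constant multiple of an uncertain variable; the intermediate symbol $e = \dot{Y}^{\lambda-1}\epsilon$ is written out explicitly here as an assumption about how the error term is rescaled.

```latex
\begin{aligned}
\frac{H(Y;\lambda)}{\dot{Y}^{\lambda-1}}
  &= \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \epsilon \\
\Longrightarrow\quad
H(Y;\lambda)
  &= \dot{Y}^{\lambda-1}\beta_0 + \dot{Y}^{\lambda-1}\beta_1 x_1 + \cdots
     + \dot{Y}^{\lambda-1}\beta_p x_p + \dot{Y}^{\lambda-1}\epsilon \\
  &= \tilde{\beta}_0 + \tilde{\beta}_1 x_1 + \cdots + \tilde{\beta}_p x_p + e, \\
E[e] &= \dot{Y}^{\lambda-1} E[\epsilon] = 0, \qquad
\tilde{\sigma}^2 = \left(\dot{Y}^{\lambda-1}\right)^{2}\sigma^2 = \sigma^2\,\dot{Y}^{2\lambda-2}.
\end{aligned}
```

The case $\lambda = 0$ is covered by the same steps, since $V(y;0) = \dot{Y}\log y = H(y;0)/\dot{Y}^{-1}$.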

3. Model Estimation

This section introduces the least-squares estimation (LSE) and maximum likelihood estimation (MLE) of our proposed model (3). When the data are imprecisely observed, the response variable is denoted by $\tilde{y}_i$ and the explanatory variables are denoted by $\tilde{x}_{i1}, \ldots, \tilde{x}_{ip}$, $i = 1, \ldots, n$. For the case where the data are precisely observed, the estimations are performed similarly and thus omitted here.

3.1. The LSE of Uncertain Alternative Box-Cox Regression Model

First, this subsection provides the least-squares estimation (LSE). For the uncertain alternative Box-Cox model (3), the LSE for the parameters $\beta$ and $\lambda$ is defined by
$$(\hat{\beta}, \hat{\lambda}) = \operatorname*{argmin}_{\beta, \lambda} L(\beta, \lambda),$$
where
$$L(\beta, \lambda) = \sum_{i=1}^{n} E\left[\left(V(\tilde{y}_i; \lambda) - f(\tilde{x}_{i1}, \ldots, \tilde{x}_{ip}; \beta)\right)^2\right]. \tag{5}$$
Using Theorem 2.15 in Liu [5], the computation formula of the least-squares term (5) can be provided by the following theorem.
Theorem 1.
Consider the uncertain alternative Box-Cox model (3). Assume that the response data $\tilde{y}_i$ and explanatory data $\tilde{x}_{i1}, \ldots, \tilde{x}_{ip}$, $i = 1, \ldots, n$, are imprecisely observed. Let $\tilde{y}_i$ and $\tilde{x}_{i1}, \ldots, \tilde{x}_{ip}$, $i = 1, \ldots, n$, be independent regular uncertain variables with uncertainty distributions $\Psi_i$ and $\Phi_{i1}, \ldots, \Phi_{ip}$, respectively. Let $f(x_1, \ldots, x_p; \beta)$ be a continuous function with respect to $x_1, \ldots, x_p$ satisfying the following conditions: for all $j = 1, \ldots, p$, there exist sets $B_j^{-}, B_j^{+} \subset \mathbb{R}^p$ such that
1. When $\beta \in B_j^{-}$, $f$ is strictly decreasing with respect to $x_j$;
2. When $\beta \in B_j^{+}$, $f$ is strictly increasing with respect to $x_j$;
3. When $\beta \in \mathbb{R}^p \setminus \{B_j^{-} \cup B_j^{+}\}$, $f$ is irrelevant to $x_j$.
Then, the least-squares term (5) satisfies
$$L(\beta, \lambda) = \sum_{i=1}^{n} \int_0^1 \left(F_{res,i}^{-1}(\alpha)\right)^2 \, d\alpha,$$
where
$$F_{res,i}^{-1}(\alpha) = V(\Psi_i^{-1}(\alpha), \lambda) - f(\Gamma_{i1}^{-1}(\alpha), \ldots, \Gamma_{ip}^{-1}(\alpha); \beta), \quad i = 1, \ldots, n,$$
$$\Gamma_{ij}^{-1}(\alpha) = \begin{cases} \Phi_{ij}^{-1}(\alpha), & \text{if } \beta \in B_j^{-}, \\ \Phi_{ij}^{-1}(1 - \alpha), & \text{if } \beta \in B_j^{+}, \end{cases} \quad j = 1, \ldots, p. \tag{6}$$
Proof of Theorem 1.
For any $\beta$, define $g(y, x_1, \ldots, x_p) = V(y, \lambda) - f(x_1, \ldots, x_p, \beta)$. Then $g$ is strictly increasing with respect to $y$. For any $j \in \{1, \ldots, p\}$, if $\beta \in B_j^{-}$, $g$ is strictly increasing with respect to $x_j$; if $\beta \in B_j^{+}$, $g$ is strictly decreasing with respect to $x_j$; if $\beta \in \mathbb{R}^p \setminus \{B_j^{-} \cup B_j^{+}\}$, $g$ is irrelevant to $x_j$. According to Theorem 2.15 in Liu [5], the inverse uncertainty distribution of $g(\tilde{y}_i, \tilde{x}_{i1}, \ldots, \tilde{x}_{ip}) = V(\tilde{y}_i; \lambda) - f(\tilde{x}_{i1}, \ldots, \tilde{x}_{ip}; \beta)$ is $F_{res,i}^{-1}(\alpha)$ defined by (6).
One can easily verify that $F_{res,i}^{-1}(\alpha)$ is a continuous, strictly increasing function defined on $(0, 1)$ for each $i$. We define $F_{res,i}^{-1}(0) = \lim_{\alpha \to 0^{+}} F_{res,i}^{-1}(\alpha)$ and $F_{res,i}^{-1}(1) = \lim_{\alpha \to 1^{-}} F_{res,i}^{-1}(\alpha)$, so the inverse function $F_{res,i}(x)$ is a continuous, strictly increasing function with respect to $x$ at which $0 < F_{res,i}(x) < 1$, and $\lim_{x \to -\infty} F_{res,i}(x) = 0$, $\lim_{x \to +\infty} F_{res,i}(x) = 1$. Thus, $F_{res,i}(x)$ is a regular uncertainty distribution in the sense of Definition 2.12 in Liu [5]. According to Theorem 2.44 in Liu [5], $E\left[\left(V(\tilde{y}_i; \lambda) - f(\tilde{x}_{i1}, \ldots, \tilde{x}_{ip}; \beta)\right)^2\right] = \int_0^1 \left(F_{res,i}^{-1}(\alpha)\right)^2 d\alpha$, so the theorem is proved. □
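The integral formula of Theorem 1 can be evaluated numerically. The sketch below is one possible implementation under several assumptions that are not in the article: the observations are linear uncertain variables $\mathcal{L}(a, b)$ with inverse distribution $(1-\alpha)a + \alpha b$, $f$ is linear, and a simple midpoint rule stands in for R's integrate function. All function names are hypothetical.

```python
import math

def linear_inv(a, b):
    """Inverse uncertainty distribution of a linear uncertain variable L(a, b)."""
    return lambda alpha: (1.0 - alpha) * a + alpha * b

def integrate01(func, m=2000):
    """Midpoint rule on (0, 1); a stand-in for a proper quadrature routine."""
    h = 1.0 / m
    return h * sum(func((k + 0.5) * h) for k in range(m))

def alt_box_cox(y, lam, y_dot):
    """Alternative Box-Cox function V(y; lambda)."""
    if lam == 0:
        return y_dot * math.log(y)
    return (y ** lam - 1.0) / (lam * y_dot ** (lam - 1.0))

def ls_term(beta, lam, psi_invs, phi_invs):
    """Least-squares term for linear f = beta[0] + sum_j beta[j] * x_j.

    psi_invs[i]    : inverse distribution of the i-th response
    phi_invs[i][j] : inverse distribution of the j-th explanatory variable
    """
    n = len(psi_invs)
    # Y-dot for imprecise data: exp of the average of E[log y_i],
    # where E[log y_i] is the integral of log(Psi_i^{-1}(alpha)) over (0, 1).
    y_dot = math.exp(
        sum(integrate01(lambda a, q=q: math.log(q(a))) for q in psi_invs) / n
    )
    total = 0.0
    for psi_inv, row in zip(psi_invs, phi_invs):
        def f_res_inv(alpha):
            fitted = beta[0]
            for b_j, phi_inv in zip(beta[1:], row):
                # f decreasing in x_j (b_j < 0): plug in alpha;
                # f increasing in x_j (b_j > 0): plug in 1 - alpha.
                arg = alpha if b_j < 0 else 1.0 - alpha
                fitted += b_j * phi_inv(arg)
            return alt_box_cox(psi_inv(alpha), lam, y_dot) - fitted
        total += integrate01(lambda a: f_res_inv(a) ** 2)
    return total

# One imprecise observation: y ~ L(1, 3), x ~ L(0, 2), with lambda = 1, beta = (0, 1).
value = ls_term([0.0, 1.0], 1.0, [linear_inv(1.0, 3.0)], [[linear_inv(0.0, 2.0)]])
```

For this single observation the residual inverse distribution is $4\alpha - 2$, so the exact least-squares term is $\int_0^1 (4\alpha - 2)^2 d\alpha = 4/3$, which the numerical value should reproduce closely.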
Remark 1.
When $f$ is a linear function defined by
$$f(x_1, \ldots, x_p; \beta) = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p, \tag{7}$$
then $f$ satisfies the assumptions of Theorem 1 for $B_j^{-} = \{\beta : \beta_j < 0\}$ and $B_j^{+} = \{\beta : \beta_j > 0\}$.
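Remark 2.
When $f$ is the linear function (7), the LSE of our proposed model is equivalent to the penalized least-squares method proposed by Liu et al. [14]. For our model with imprecisely observed data,
$$L(\beta, \lambda) = \sum_{i=1}^{n} E\left[\left(H(\tilde{y}_i; \lambda) - (\tilde{\beta}_0 + \tilde{\beta}_1 \tilde{x}_{i1} + \cdots + \tilde{\beta}_p \tilde{x}_{ip})\right)^2\right] \dot{Y}^{-(2\lambda - 2)},$$
where $\tilde{\beta}_i = \beta_i \dot{Y}^{\lambda - 1}$. Note that the objective function of the penalized least-squares method for model (1) is
$$R(\beta, \lambda) = \sum_{i=1}^{n} E\left[\left(H(\tilde{y}_i; \lambda) - (\tilde{\beta}_0 + \tilde{\beta}_1 \tilde{x}_{i1} + \cdots + \tilde{\beta}_p \tilde{x}_{ip})\right)^2\right] \dot{Y}^{-2\lambda}.$$
Since $\dot{Y}$ does not depend on $\beta$ or $\lambda$, the two objective functions differ only by the constant factor $\dot{Y}^{2}$, so the two estimates are equivalent.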
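The equivalence in Remark 2 can be checked numerically. The following sketch assumes precisely observed data with a single explanatory variable and takes the penalized objective in the form $\dot{Y}^{-2\lambda}$ times the residual sum of squares of the original model; the data and function names are hypothetical.

```python
import math

def box_cox(y, lam):
    """Box-Cox function H(y; lambda) for y > 0."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

# Hypothetical precise data with one explanatory variable.
ys = [1.5, 2.0, 3.5, 5.0]
xs = [0.2, 0.5, 1.1, 1.6]
y_dot = math.exp(sum(math.log(y) for y in ys) / len(ys))

def ls_alternative(beta, lam):
    """Least-squares term of the alternative model, using V = H / y_dot**(lam - 1)."""
    scale = y_dot ** (lam - 1.0)
    return sum((box_cox(y, lam) / scale - (beta[0] + beta[1] * x)) ** 2
               for y, x in zip(ys, xs))

def ls_penalized(beta_tilde, lam):
    """Residual sum of squares of the original model times y_dot**(-2 * lam)."""
    rss = sum((box_cox(y, lam) - (beta_tilde[0] + beta_tilde[1] * x)) ** 2
              for y, x in zip(ys, xs))
    return rss * y_dot ** (-2.0 * lam)

def ratio(beta, lam):
    """Ratio of the two objectives at matching parameters beta_tilde = beta * y_dot**(lam - 1)."""
    scale = y_dot ** (lam - 1.0)
    beta_tilde = [b * scale for b in beta]
    return ls_alternative(beta, lam) / ls_penalized(beta_tilde, lam)

ratios = [ratio(beta, lam)
          for beta in ([0.3, 1.0], [-1.0, 2.0])
          for lam in (-1.0, 0.0, 0.5)]
```

Every entry of `ratios` equals $\dot{Y}^2$ up to rounding, regardless of $\beta$ and $\lambda$, which is the constant-factor relation behind the equivalence of the two estimates.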

3.2. The MLE of Uncertain Alternative Box-Cox Regression Model

Next, this subsection provides the maximum likelihood estimation based on Theorem 4 of Lio and Liu [12]. For our proposed model (3), the MLE for the parameters $\beta$, $\lambda$, and the error-term parameters $\theta_\epsilon$ is defined by
$$(\hat{\beta}_{MLE}, \hat{\lambda}_{MLE}, \hat{\theta}_{\epsilon, MLE}) = \operatorname*{argmax}_{\beta, \lambda} L(\beta, \lambda, \theta_\epsilon \mid \tilde{z}_1, \ldots, \tilde{z}_n),$$
$$L(\beta, \lambda, \theta_\epsilon \mid \tilde{z}_1, \ldots, \tilde{z}_n) = \max_{z_1, \ldots, z_n} \bigwedge_{i=1}^{n} F'(z_i \mid \theta_\epsilon) \wedge G_i'(z_i \mid \beta, \lambda),$$
where $F(\cdot \mid \theta_\epsilon)$ is the uncertainty distribution of the error term $\epsilon$, and $G_i(\cdot \mid \beta, \lambda)$, $i = 1, \ldots, n$, is the uncertainty distribution of
$$\tilde{z}_i := V(\tilde{y}_i; \lambda) - f(\tilde{x}_{i1}, \ldots, \tilde{x}_{ip} \mid \beta),$$
respectively. We assume that $\epsilon$ has an uncertain normal distribution $\mathcal{N}(0, \sigma)$, for which the uncertainty distribution is
$$\Phi(z \mid \sigma) = \left(1 + \exp\left(-\frac{\pi z}{\sqrt{3}\,\sigma}\right)\right)^{-1},$$
and the error-term parameter is $\sigma$. In our numerical experiment, the derivative $G_i'(z_i \mid \beta, \lambda)$ is computed numerically.
One can derive a penalized MLE for the original Box-Cox model by a method similar to that in Remark 2, but it is more straightforward to apply the MLE directly to our proposed model.
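As a small illustration of the ingredients of the likelihood, the sketch below implements the normal uncertainty distribution $\Phi(z \mid \sigma)$ of $\mathcal{N}(0, \sigma)$, its closed-form derivative, and a central-difference approximation of the kind one might use when no closed form is available. The central difference is only a stand-in for a numerical differentiation routine; the function names and step size are assumptions.

```python
import math

def normal_ud(z, sigma):
    """Uncertainty distribution of the uncertain normal variable N(0, sigma)."""
    return 1.0 / (1.0 + math.exp(-math.pi * z / (math.sqrt(3.0) * sigma)))

def normal_ud_deriv(z, sigma):
    """Closed-form derivative: k * p * (1 - p) with k = pi / (sqrt(3) * sigma)."""
    k = math.pi / (math.sqrt(3.0) * sigma)
    p = normal_ud(z, sigma)
    return k * p * (1.0 - p)

def central_diff(func, z, h=1e-5):
    """Central-difference approximation of func'(z)."""
    return (func(z + h) - func(z - h)) / (2.0 * h)

sigma = 1.0
exact = normal_ud_deriv(0.3, sigma)
approx = central_diff(lambda z: normal_ud(z, sigma), 0.3)
```

Here `exact` and `approx` should agree to several decimal places. The same central-difference helper could be applied to a numerically evaluated $G_i$, although the step size would then need to be tuned to the accuracy of $G_i$ itself.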

4. Numerical Experiment

This section conducts a numerical experiment to compare the goodness of fit between the alternative (rescaled) Box-Cox model and the ordinary Box-Cox model. We select a set of imprecise observation data, fit the two models by least squares and by MLE, respectively, and compare the goodness of fit using the residual graphs.

4.1. Uncertain Linear Case

Firstly, this subsection considers fitting an uncertain Box-Cox linear model. Table 1 shows the imprecise data used, which are the same as the data used in [14] so that the results can be compared with the previous simulation results.
For this numerical experiment, we use the LSE method and the MLE method, respectively, to fit the original uncertain Box-Cox linear model (1) and the uncertain alternative Box-Cox model (3) corresponding to the new transformation (2). The goodness of fit is evaluated by the residual graph and by the mean and standard deviation of the residuals. The extremum is calculated using the optim function of the R software (version 3.6.1) [27], with multiple sets of initial values for $\beta$, $\lambda$, and $\sigma$ in the MLE method: $\beta_i$, $i = 0, 1, 2, 3$, can be set to $-2$, $0$, or $2$; $\lambda$ can be $0$, $0.5$, or $1$; and $\sigma$ can be $0.5$ or $1$. Please refer to Table 2 for the specific selected initial values.
The best result of the extreme value from the numerical experiment is selected as the estimate. The integration in the least-squares method is done using the integrate function, whereas the numerical differentiation in the MLE method is done using the fderiv function of the pracma package (version 2.2.5) [28]. Refer to Table 3 for the results of the parameter estimation as well as the mean and standard deviation of the residual analysis. In this table, ‘RLSE’ stands for the weighted least-squares method from [14], ‘LSE’ stands for the least-squares estimation, and ‘MLE’ stands for the maximum likelihood estimation. ‘Original’ means that the fitted model is a Box-Cox model, ‘Alternative’ means that the fitted model is an alternative Box-Cox model, and ‘Transformed’ means that after fitting the alternative Box-Cox model, the estimation results are reparameterized to be compared with those of the Box-Cox model. The residual plot is shown in Figure 1. In this figure, ‘LSE’ stands for the least-squares estimation, ‘MLE’ stands for the maximum likelihood estimation, ‘Original’ stands for the Box-Cox linear model, and ‘Alternative’ stands for the alternative Box-Cox linear model.
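The multi-start strategy described above (several sets of initial values, keeping the best optimum) can be sketched as follows. This is not the authors' R code: a simple compass search stands in for R's optim function, the objective is a toy quadratic rather than the actual least-squares or likelihood function, and the two-parameter grid of starting values is hypothetical.

```python
import itertools

def compass_search(objective, start, step=0.5, tol=1e-6, max_iter=10000):
    """Derivative-free local minimizer: try +/- step along each coordinate,
    halve the step when no move improves the objective."""
    x = list(start)
    fx = objective(x)
    iterations = 0
    while step > tol and iterations < max_iter:
        improved = False
        for k in range(len(x)):
            for delta in (step, -step):
                cand = list(x)
                cand[k] += delta
                fc = objective(cand)
                if fc < fx:
                    x, fx, improved = cand, fc, True
        if not improved:
            step /= 2.0
        iterations += 1
    return x, fx

def multi_start(objective, grids):
    """Run the local search from every combination of initial values
    and keep the result with the smallest objective value."""
    best = None
    for start in itertools.product(*grids):
        x, fx = compass_search(objective, start)
        if best is None or fx < best[1]:
            best = (x, fx)
    return best

# Toy objective with minimum at (beta, lambda) = (1.0, 0.3); purely illustrative.
def toy_objective(p):
    return (p[0] - 1.0) ** 2 + (p[1] - 0.3) ** 2

# Hypothetical grids of initial values for one coefficient and for lambda.
grids = [(-2.0, 0.0, 2.0), (0.0, 0.5, 1.0)]
best_x, best_f = multi_start(toy_objective, grids)
```

With a real least-squares or likelihood objective, different starting points may end in different local optima, which is the reason for comparing the optima across starts and keeping the best one.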
Figure 1 shows that when the fitted model is a Box-Cox linear model, the residual plots of the least-squares estimation and maximum likelihood estimation have obvious deviations. However, when the fitted model is an alternative Box-Cox linear model, the residuals of the two estimates do not deviate from zero or change unevenly. This shows that the least-squares and maximum likelihood estimations suit the uncertain alternative Box-Cox linear models but not the uncertain Box-Cox linear models. In addition, Table 3 shows that when the fitted model is a Box-Cox linear model, the estimated values of λ for the least-squares analysis and the maximum likelihood estimation are all relatively small. Note that the problem of underestimation of λ in the least-squares estimation of Box-Cox linear model was pointed out in [14]. A similar situation occurred for the maximum likelihood estimation in this numerical experiment.
For the uncertain Box-Cox linear model, a weighted LSE method was proposed by Liu et al. [14]. Table 3 shows that the results of the weighted least-squares estimation method are very similar to those of the direct least-squares estimation after reparameterization. This is because the two estimation methods are equivalent, as explained in Remark 2. Therefore, directly fitting the uncertain alternative Box-Cox linear model without constructing a penalty term can also avoid underestimating λ, and it is more straightforward than the approach in [14].
Table 3 shows that when the fitted model is an uncertain alternative Box-Cox model, the results of the least-squares analysis and that of the maximum likelihood estimation are significantly different. Since the residual biases and standard deviations of the least-squares estimation are relatively small, and the estimation process is relatively simple, we recommend using the uncertain least-squares method to estimate the alternative Box-Cox model.
To sum up, such a numerical example shows that, compared with the Box-Cox model, we do not have the problem of underestimated λ for the uncertain alternative Box-Cox model. Therefore, the least-squares estimation or the maximum likelihood estimation can be used directly without constructing a weighted estimation method. In addition, for the uncertain alternative Box-Cox model, the least-squares estimation shows better performance in terms of residuals and is therefore recommended.

4.2. Uncertain Michaelis–Menten Kinetics Case

Next, this subsection considers fitting an uncertain Box-Cox Michaelis–Menten kinetics regression model [24]. The imprecise data shown in Table 4 are adopted from that source.
In this case, we also apply the LSE and the MLE, respectively, to fit the original uncertain Michaelis–Menten kinetics regression model and the transformed uncertain Box-Cox Michaelis–Menten kinetics regression model.
The residual graph, as well as the mean and standard deviation of the residuals, remains useful here for residual analysis. The optimization, the integration in the LSE, and the numerical differentiation in the MLE are again carried out via the optim function, the integrate function, and the fderiv function, respectively. The initial parameter values are explored in the same way as in the previous linear case. Please refer to Table 5 for the parameter details and Table 6 for the results of the parameter estimation with the mean and standard deviation of the residuals. The residual plot is presented in Figure 2. In this figure, ‘LSE’ stands for the least-squares estimation, ‘MLE’ stands for the maximum likelihood estimation, ‘Original’ stands for the uncertain Box-Cox Michaelis–Menten kinetics regression model, and ‘Alternative’ stands for the alternative uncertain Box-Cox Michaelis–Menten kinetics regression model.
From Figure 2, we can see that for the original uncertain Michaelis–Menten kinetics regression model, there is evident variation in the residual plots from the LSE and the MLE. This is not true for the uncertain alternative Box-Cox Michaelis–Menten kinetics regression model, where the residuals neither deviate from zero nor change unevenly. Therefore, we conclude that the LSE and MLE are both appropriate to use for estimating the uncertain alternative Box-Cox Michaelis–Menten kinetics regression model, but not for the original uncertain Michaelis–Menten kinetics regression models. Table 6 shows the estimation results of the uncertain Box-Cox Michaelis–Menten kinetics regression model as well as the mean (Resid. mean) and standard deviation (Resid. sd) of the residuals. In this table, ‘RLSE’ stands for the weighted least-squares method from [14], ‘LSE’ stands for the least-squares estimation, and ‘MLE’ stands for the maximum likelihood estimation. ‘Original’ means that the fitted model is a Box-Cox model, ‘Alternative’ means that the fitted model is an alternative Box-Cox model, and ‘Transformed’ means that after fitting the alternative Box-Cox model, the estimation results are reparameterized to be compared with that of the Box-Cox model. Table 6 offers evidence that when the original uncertain Michaelis–Menten kinetics regression model is used, both LSE and MLE are inclined to conspicuously underestimate λ , which is a persisting issue pointed out in [14]. On the contrary, Table 6 supports the uncertain alternative Box-Cox Michaelis–Menten kinetics regression model, with distinct LSE and MLE results.
Because of its ease of use and its small residual biases and standard deviations, we advocate estimating the parameters of the alternative Box-Cox Michaelis–Menten kinetics regression model by the uncertain LSE.
The weighted least-squares estimation method proposed by Liu et al. [14] for the uncertain Box-Cox Michaelis–Menten kinetics regression model is also inspected. The corresponding results shown in Table 6 are very close to those from the direct least-squares estimation after reparameterization, which supports our claim on the equivalence of the two estimation methods. As a result, we recommend the uncertain alternative Box-Cox Michaelis–Menten kinetics regression model for the simplicity of fitting it directly, with no need for penalty terms and without concerns about an underestimated λ.
In conclusion, a thorough comparison between the uncertain Box-Cox Michaelis–Menten kinetics regression model and the uncertain alternative Box-Cox Michaelis–Menten kinetics regression model favors the latter. Both LSE and MLE perform well in the latter case, with no need for a weighted procedure. As far as LSE and MLE are compared in the latter case, LSE stands out.

5. Conclusions and Discussion

This article proposed an alternative Box-Cox model, gave its least-squares estimation and maximum likelihood estimation, and compared its performance with previous estimation methods. Parameter estimation was realized mainly by minimizing the Euclidean distance between the response variables $H(y; \lambda)$ (after transformation) and the corresponding fitted values $f(x_1, x_2, \ldots, x_p \mid \beta)$.
There are theoretical issues mentioned in the literature [14]: when $f$ is a linear function of $x_1, x_2, \ldots, x_p$ and all $Y_i$ are greater than 1, this distance approaches zero as $\lambda \to -\infty$ and $\beta = 0$. In the numerical simulations above, the upper and lower bounds of $\lambda$ were used to avoid this problem. However, the cross-validation results showed that the estimated value of $\lambda$ could remain unsatisfactorily low. Therefore, regarding this issue, this article introduced a penalty term on $\lambda$ for MLE and proposed an uncertain alternative Box-Cox model to improve the estimation. In addition, a residual graph method was proposed to evaluate the validity of the fitted uncertain model. It turns out that the residuals become symmetric about the origin, which is desirable in applications of experts’ experimental data. Our work enriches the theoretical framework of uncertain regression, providing effective analytical evaluation methods for a large class of flexible uncertain regression analyses, especially for nonlinear uncertain Box-Cox models. This paper has value, in terms of both theory and applications, for analyzing uncertain data such as experts’ experimental data. Later on, we plan to test the significance of regression coefficients in some uncertain regression models, exploring ways to compare and analyze the differences between the uncertain significance tests and the statistical significance tests.

Author Contributions

Conceptualization, L.F. and Y.H.; formal analysis, L.F. and Y.H.; investigation, L.F.; methodology, Y.H. and L.F.; software, L.F. and Y.H.; supervision, L.F.; validation, L.F.; writing—original draft, L.F. and Y.H.; writing—review & editing, Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Social Science Foundation of China grant number 21BGL164.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

All authors would like to thank the editors and the uncertainty theory Lab at Tsinghua University for their help.

Conflicts of Interest

The authors declare no conflict of interest. We declare that we do not have any relevant or material financial interests that relate to the research described in this paper. The manuscript has neither been published before, nor has it been submitted for consideration of publication in another journal.

Abbreviations

The following abbreviations are used in this manuscript:
LSE: Least-Squares Estimation
MLE: Maximum Likelihood Estimation

References

  1. Liu, B. Uncertainty Theory, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2007; pp. 15–17. [Google Scholar]
  2. Liu, B. Some research problems in uncertainy theory. J. Uncertain Syst. 2009, 3, 3–10. [Google Scholar]
  3. Liu, B. Uncertainty Theory: A Branch of Mathematics for Modeling Human Uncertainty; Springer: Berlin/Heidelberg, Germany, 2010. [Google Scholar]
  4. Liu, B. Uncertainty Theory, 4th ed.; Springer: Berlin/Heidelberg, Germany, 2015. [Google Scholar]
  5. Liu, B. Uncertainty Theory, 5th ed.; Uncertainty Theory Laboratory, Department of Mathematical Sciences, Tsinghua University: Beijing, China, 2018. [Google Scholar]
  6. Chen, X.; Li, J.; Xiao, C.; Yang, P. Numerical solution and parameter estimation for uncertain SIR model with application to COVID-19. Fuzzy Optim. Decis. Mak. 2021, 1, 1–20. [Google Scholar] [CrossRef]
  7. Ding, J.; Zhang, Z. Statistical inference on uncertain nonparametric regression model. Fuzzy Optim. Decis. Mak. 2021, 20, 451–469. [Google Scholar] [CrossRef]
  8. Guo, H.; Wang, X.; Gao, Z. Uncertain linear regression model and its application. J. Intell. Manuf. 2017, 28, 559–564. [Google Scholar] [CrossRef]
  9. Yao, K.; Liu, B. Uncertain regression analysis: An approach for imprecise observations. Soft Comput. 2018, 22, 5579–5582. [Google Scholar] [CrossRef]
  10. Song, Y.; Fu, Z. Uncertain multivariable regression model. Soft Comput. 2018, 22, 5861–5866. [Google Scholar] [CrossRef]
  11. Lio, W.; Liu, B. Residual and confidence interval for uncertain regression model with imprecise observations. J. Intell. Fuzzy Syst. 2018, 35, 2573–2583. [Google Scholar] [CrossRef]
  12. Lio, W.; Liu, B. Uncertain maximum likelihood estimation with application to uncertain regression analysis. Soft Comput. 2020, 24, 9351–9360. [Google Scholar] [CrossRef]
  13. Hu, Z.; Gao, J. Uncertain Gompertz regression model with imprecise observations. Soft Comput. 2020, 24, 2543–2549. [Google Scholar] [CrossRef]
  14. Liu, S.; Fang, L.; Zhou, Z.; Hong, Y. Uncertain Box-Cox Regression Analysis With Rescaled least squares Estimation. IEEE Access 2021, 8, 84769–84776. [Google Scholar] [CrossRef]
  15. Zhang, C.; Liu, Z.; Liu, J. Least Absolute Deviations for Uncertain Multivariate Regression Model. Int. J. Gen. Syst. 2020, 49, 449–465.
  16. Liu, Z.; Yang, Y. Least absolute deviations estimation for uncertain regression with imprecise observations. Fuzzy Optim. Decis. Mak. 2020, 19, 33–52.
  17. Liu, Z.; Yang, X. Variable selection in uncertain regression analysis with imprecise observations. Soft Comput. 2021, 25, 13377–13387.
  18. Ye, T.; Liu, B. Uncertain hypothesis test with application to uncertain regression analysis. Fuzzy Optim. Decis. Mak. 2021, 2021, 1–18.
  19. Yang, X.; Liu, B. Uncertain time series analysis with imprecise observations. Fuzzy Optim. Decis. Mak. 2019, 18, 263–278.
  20. Yang, X.; Gao, J.; Ni, Y. Resolution principle in uncertain random environment. IEEE Trans. Fuzzy Syst. 2018, 26, 1578–1588.
  21. Liu, Z.; Jia, L. Cross-validation for the uncertain Chapman-Richards growth model with imprecise observations. Fuzziness Knowl.-Based Syst. 2020, 5, 769–783.
  22. Ye, T.; Liu, Y. Multivariate uncertain regression model with imprecise observations. J. Ambient Intell. Humaniz. Comput. 2020, 11, 41–49.
  23. Box, G.; Cox, D. An analysis of transformations. J. R. Stat. Soc. Ser. B 1964, 26, 211–252.
  24. Fang, L.; Hong, Y. Uncertain revised regression analysis with responses of logarithmic, square root and reciprocal transformations. Soft Comput. 2019, 24, 2655–2670.
  25. Fang, L.; Hong, Y.; Zhou, Z.; Chen, W. Uncertain logistic and Box-Cox regression analysis with maximum likelihood estimation. Commun. Stat.-Theory Methods 2021, 9, 1–20.
  26. Draper, N.; Smith, H. Applied Regression Analysis, 3rd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2010; pp. 277–280.
  27. R Core Team. R: A Language and Environment for Statistical Computing, version 3.6.1; R Foundation for Statistical Computing: Vienna, Austria, 2019; Available online: https://www.R-project.org/ (accessed on 14 November 2021).
  28. Borchers, H.W. pracma: Practical Numerical Math Functions, R package version 2.3.3. 2019. Available online: https://CRAN.R-project.org/package=pracma/ (accessed on 14 November 2021).
Figure 1. The residuals of uncertain Box-Cox and alternative Box-Cox linear models.
Figure 2. The residuals of uncertain Box-Cox and uncertain alternative Box-Cox Michaelis–Menten kinetics regression models.
Table 1. The imprecise observation data used to fit the uncertain Box-Cox linear model, where $\mathcal{L}(a, b)$ denotes a linear uncertain variable.
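| $i$ | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| $\tilde{x}_{i1}$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(3, 4)$ | $\mathcal{L}(4, 5)$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(6, 7)$ |
| $\tilde{x}_{i2}$ | $\mathcal{L}(18, 20)$ | $\mathcal{L}(20, 22)$ | $\mathcal{L}(9, 10)$ | $\mathcal{L}(31, 34)$ | $\mathcal{L}(33, 36)$ | $\mathcal{L}(13, 15)$ |
| $\tilde{x}_{i3}$ | $\mathcal{L}(7, 8)$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(7, 8)$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(5, 6)$ |
| $\tilde{y}_i$ | $\mathcal{L}(9, 11)$ | $\mathcal{L}(9, 11)$ | $\mathcal{L}(6, 8)$ | $\mathcal{L}(8, 10)$ | $\mathcal{L}(15, 17)$ | $\mathcal{L}(11, 13)$ |
| $i$ | 7 | 8 | 9 | 10 | 11 | 12 |
| $\tilde{x}_{i1}$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(3, 4)$ | $\mathcal{L}(7, 8)$ | $\mathcal{L}(4, 5)$ | $\mathcal{L}(4, 5)$ |
| $\tilde{x}_{i2}$ | $\mathcal{L}(30, 33)$ | $\mathcal{L}(25, 28)$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(47, 50)$ | $\mathcal{L}(11, 13)$ | $\mathcal{L}(25, 28)$ |
| $\tilde{x}_{i3}$ | $\mathcal{L}(4, 5)$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(8, 9)$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(5, 6)$ |
| $\tilde{y}_i$ | $\mathcal{L}(19, 21)$ | $\mathcal{L}(13, 15)$ | $\mathcal{L}(4, 6)$ | $\mathcal{L}(13, 15)$ | $\mathcal{L}(6, 8)$ | $\mathcal{L}(7, 9)$ |
| $i$ | 13 | 14 | 15 | 16 | 17 | 18 |
| $\tilde{x}_{i1}$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(8, 9)$ | $\mathcal{L}(7, 8)$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(3, 4)$ |
| $\tilde{x}_{i2}$ | $\mathcal{L}(39, 42)$ | $\mathcal{L}(35, 38)$ | $\mathcal{L}(23, 26)$ | $\mathcal{L}(40, 43)$ | $\mathcal{L}(7, 8)$ | $\mathcal{L}(21, 24)$ |
| $\tilde{x}_{i3}$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(7, 8)$ | $\mathcal{L}(7, 8)$ | $\mathcal{L}(7, 8)$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(4, 5)$ |
| $\tilde{y}_i$ | $\mathcal{L}(14, 16)$ | $\mathcal{L}(8, 10)$ | $\mathcal{L}(22, 24)$ | $\mathcal{L}(19, 21)$ | $\mathcal{L}(10, 12)$ | $\mathcal{L}(6, 8)$ |
| $i$ | 19 | 20 | 21 | 22 | 23 | 24 |
| $\tilde{x}_{i1}$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(4, 5)$ | $\mathcal{L}(3, 4)$ | $\mathcal{L}(4, 5)$ | $\mathcal{L}(4, 5)$ | $\mathcal{L}(5, 6)$ |
| $\tilde{x}_{i2}$ | $\mathcal{L}(27, 30)$ | $\mathcal{L}(23, 26)$ | $\mathcal{L}(15, 17)$ | $\mathcal{L}(35, 38)$ | $\mathcal{L}(34, 37)$ | $\mathcal{L}(33, 36)$ |
| $\tilde{x}_{i3}$ | $\mathcal{L}(4, 5)$ | $\mathcal{L}(3, 4)$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(8, 9)$ | $\mathcal{L}(4, 5)$ |
| $\tilde{y}_i$ | $\mathcal{L}(15, 17)$ | $\mathcal{L}(11, 13)$ | $\mathcal{L}(9, 11)$ | $\mathcal{L}(5, 7)$ | $\mathcal{L}(2, 4)$ | $\mathcal{L}(14, 16)$ |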
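As a reading aid for the entries of Table 1, recall the standard facts from uncertainty theory that a linear uncertain variable $\mathcal{L}(a, b)$ has uncertainty distribution $\Phi(x) = (x - a)/(b - a)$ on $[a, b]$, inverse uncertainty distribution $\Phi^{-1}(\alpha) = (1 - \alpha)a + \alpha b$, and expected value $(a + b)/2$. The following is a minimal Python sketch of these three quantities; the class name `LinearUncertain` and its method names are hypothetical helpers introduced here for illustration and are not taken from the article, whose own computations were carried out in R.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearUncertain:
    """Linear uncertain variable L(a, b) with a < b (hypothetical helper)."""
    a: float
    b: float

    def cdf(self, x: float) -> float:
        # Uncertainty distribution: 0 below a, linear on [a, b], 1 above b.
        if x <= self.a:
            return 0.0
        if x >= self.b:
            return 1.0
        return (x - self.a) / (self.b - self.a)

    def inverse(self, alpha: float) -> float:
        # Inverse uncertainty distribution: (1 - alpha) * a + alpha * b.
        return (1.0 - alpha) * self.a + alpha * self.b

    def expected(self) -> float:
        # The expected value of L(a, b) is the midpoint of [a, b].
        return (self.a + self.b) / 2.0


# First observation in Table 1: response L(9, 11) and second regressor L(18, 20).
y1 = LinearUncertain(9, 11)
x12 = LinearUncertain(18, 20)
print(y1.expected(), x12.expected(), y1.inverse(0.25))
```

Objectives such as the least-squares or likelihood criteria for uncertain regression are built from expectations or distributions of functions of these variables, so a small container of this kind is one possible starting point for reproducing the fits; the estimation procedure itself is not sketched here.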
Table 2. The initial values of the parameters selected for fitting the linear Box-Cox model.
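| No. | $\beta_0$ | $\beta_1$ | $\beta_2$ | $\beta_3$ | $\lambda$ | $\sigma$ | No. | $\beta_0$ | $\beta_1$ | $\beta_2$ | $\beta_3$ | $\lambda$ | $\sigma$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | −2 | −2 | −2 | −2 | 0 | 0.5 | 10 | −2 | −2 | 2 | 2 | 0.5 | 1 |
| 2 | −2 | 0 | 0 | 0 | 0.5 | 0.5 | 11 | −2 | 0 | −2 | −2 | 1 | 1 |
| 3 | −2 | 2 | 2 | 2 | 1 | 0.5 | 12 | −2 | 2 | 0 | 0 | 0 | 1 |
| 4 | 0 | −2 | −2 | 0 | 0.5 | 0.5 | 13 | 0 | −2 | 0 | 2 | 0 | 1 |
| 5 | 0 | 0 | 0 | 2 | 1 | 0.5 | 14 | 0 | 0 | 2 | −2 | 0.5 | 1 |
| 6 | 0 | 2 | 2 | −2 | 0 | 0.5 | 15 | 0 | 2 | −2 | 0 | 1 | 1 |
| 7 | 2 | −2 | 0 | −2 | 1 | 0.5 | 16 | 2 | −2 | 2 | 0 | 1 | 1 |
| 8 | 2 | 0 | 2 | 0 | 0 | 0.5 | 17 | 2 | 0 | −2 | 2 | 0 | 1 |
| 9 | 2 | 2 | −2 | 2 | 0.5 | 0.5 | 18 | 2 | 2 | 0 | −2 | 0.5 | 1 |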
Table 3. The estimation results of the linear Box-Cox model and the mean (Resid. mean) and standard deviation (Resid. sd) of the residuals.
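| Method | Model | $\beta_0$ | $\beta_1$ | $\beta_2$ | $\beta_3$ | $\lambda$ | $\sigma$ | Resid. mean | Resid. sd |
|---|---|---|---|---|---|---|---|---|---|
| RLSE | Original | 2.4892 | 0.8741 | 0.0185 | −0.4453 | 0.5422 | | 0.0000 | 1.0710 |
| LSE | Original | 0.3271 | 0.0010 | 0.0000 | −0.0014 | −3.0853 | | 0.0003 | 0.0032 |
| LSE | Alternative | 7.2965 | 2.5621 | 0.0542 | −1.3052 | 0.5423 | | 0.0008 | 3.1399 |
| LSE | Transformed | 2.4892 | 0.8741 | 0.0185 | −0.4453 | 0.5423 | | 0.0003 | 1.0712 |
| MLE | Original | −0.7096 | 0.0029 | 0.0115 | 0.0930 | −3.8280 | 0.2833 | 0.0811 | 0.2097 |
| MLE | Alternative | 0.4599 | 2.9195 | 0.1418 | −1.5496 | 1.0001 | 1.7716 | 0.1880 | 3.4442 |
| MLE | Transformed | 0.4600 | 2.9200 | 0.1418 | −1.5498 | 1.0001 | 1.7716 | 0.1880 | 3.4448 |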
Table 4. The imprecise data used to fit the uncertain Box-Cox Michaelis–Menten kinetics regression model, where $\mathcal{L}(a, b)$ denotes a linear uncertain variable.
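| $i$ | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| $\tilde{x}_i$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(3, 4)$ | $\mathcal{L}(4, 5)$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(6, 7)$ |
| $i$ | 7 | 8 | 9 | 10 | 11 | 12 |
| $\tilde{x}_i$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(3, 4)$ | $\mathcal{L}(7, 8)$ | $\mathcal{L}(4, 5)$ | $\mathcal{L}(4, 5)$ |
| $i$ | 13 | 14 | 15 | 16 | 17 | 18 |
| $\tilde{x}_i$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(8, 9)$ | $\mathcal{L}(7, 8)$ | $\mathcal{L}(6, 7)$ | $\mathcal{L}(3, 4)$ |
| $i$ | 19 | 20 | 21 | 22 | 23 | 24 |
| $\tilde{x}_i$ | $\mathcal{L}(5, 6)$ | $\mathcal{L}(4, 5)$ | $\mathcal{L}(3, 4)$ | $\mathcal{L}(4, 5)$ | $\mathcal{L}(4, 5)$ | $\mathcal{L}(5, 6)$ |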
Table 5. The initial values of the parameters selected for fitting the uncertain Box-Cox Michaelis–Menten kinetics regression model.
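| No. | $\beta_1$ | $\beta_2$ | $\lambda$ | $\sigma$ | No. | $\beta_1$ | $\beta_2$ | $\lambda$ | $\sigma$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 2 | 0 | 0.5 | 10 | 2 | 2 | 0.5 | 1 |
| 2 | 0 | 0 | 0.5 | 0.5 | 11 | 2 | 2 | 1 | 1 |
| 3 | 2 | 2 | 1 | 0.5 | 12 | 0 | 0 | 0 | 1 |
| 4 | 2 | 0 | 0.5 | 0.5 | 13 | 0 | 2 | 0 | 1 |
| 5 | 0 | 2 | 1 | 0.5 | 14 | 2 | 2 | 0.5 | 1 |
| 6 | 2 | 2 | 0 | 0.5 | 15 | 2 | 0 | 1 | 1 |
| 7 | 0 | 2 | 1 | 0.5 | 16 | 2 | 0 | 1 | 1 |
| 8 | 2 | 0 | 0 | 0.5 | 17 | 2 | 2 | 0 | 1 |
| 9 | 2 | 2 | 0.5 | 0.5 | 18 | 0 | 2 | 0.5 | 1 |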
Table 6. The estimation results of the uncertain Box-Cox Michaelis–Menten kinetics regression model as well as the mean (Resid. mean) and standard deviation (Resid. sd) of the residuals.
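| Method | Model | $\beta_1$ | $\beta_2$ | $\lambda$ | $\sigma$ | Resid. mean | Resid. sd |
|---|---|---|---|---|---|---|---|
| RLSE | Original | 1.3959 | 5.3682 | 2.0315 | - | 0.2643 | 1.5360 |
| LSE | Original | 0.0019 | 0.3167 | −2.5421 | - | 0.0000 | 1.4704 |
| LSE | Alternative | 8.2893 | 3.1876 | 2.0316 | - | 0.0008 | 3.1312 |
| LSE | Transformed | 1.3959 | 5.3682 | 2.0316 | - | 0.2643 | 1.5360 |
| MLE | Original | 0.0129 | 0.0012 | −3.3220 | 0.2124 | 0.1132 | 0.1421 |
| MLE | Alternative | 3.4232 | 0.3681 | 1.2561 | 1.3327 | 0.2781 | 4.3663 |
| MLE | Transformed | 3.4336 | 0.3681 | 1.2561 | 1.3327 | 0.2781 | 3.3663 |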
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Fang, L.; Zhou, Z.; Hong, Y. Symmetry Analysis of the Uncertain Alternative Box-Cox Regression Model. Symmetry 2022, 14, 22. https://doi.org/10.3390/sym14010022
