1. Introduction
Two mathematical tools are available for describing time-varying systems in the real world more realistically. One is the stochastic differential equation, based on probability theory (proposed by Kolmogorov [1]), and the other is the uncertain differential equation, based on uncertainty theory (proposed by Liu [2]). Since their proposal, stochastic differential equations have been studied extensively, have given rise to numerous research branches, and have been applied successfully in fields such as financial research, social sciences, physical systems, and chemical evolution.
Although stochastic differential equations have produced many achievements, most studies focus only on theoretical results and do not examine their applicability to actual data. The theoretical basis of stochastic differential equations is probability theory, and probability theory can reasonably model the real world only if the frequency of the observed data is sufficiently stable, which is a strong condition. In fact, a large number of empirical studies have shown that observed data in practice do not meet the condition of frequency stability. In that case, the distribution function generated from historical data cannot be close enough to the actual frequency of the research object. If it is nevertheless regarded as a probability distribution function, and a stochastic differential equation based on probability theory is used to describe the research object, unreasonable results will be obtained. For example, Yang and Ke [3] showed that stochastic differential equations are not suitable for portraying Shanghai interbank offered rates, which in turn shows that actual data in the financial system cannot meet the frequency stability condition. Yang and Liu [4] illustrated that if population size data are characterized by stochastic partial differential equations, unstable frequency results will also be obtained, and Liu et al. [5] modeled online ride-hailing data in Beijing, China, by using the stochastic renewal process and obtained results that violate common sense. These results all show that actual data in the social system are also frequency unstable. Beyond this empirical evidence that stochastic differential equations are not suitable for modeling the real world, Liu [6] also pointed out that the velocity of Brownian motion is infinite (that is, the derivative of Brownian motion at any moment is a normal random variable whose expected value is 0 and whose variance is infinite) and theoretically deduced a paradox of the stochastic differential equation. Since the real world cannot guarantee that the assumption of frequency stability holds, stochastic differential equations cannot reasonably model time-varying systems in the real world.
To describe the phenomenon of frequency instability reasonably, Liu [2] proposed uncertainty theory in 2007, an axiomatic mathematical system based on the normality, duality, subadditivity, and product axioms. To date, uncertainty theory has been studied by many scholars and has spawned many research branches, such as the uncertain differential equation [7], uncertain optimal control [8], the uncertain renewal process [9], and the uncertain graph [10].
Among the many theoretical branches of uncertainty theory, uncertain differential equations hold a crucial position as a powerful tool for characterizing the evolution of dynamic systems over time in uncertain environments. They were proposed by Liu [11] as a class of differential equations driven by Liu processes. Since their proposal, they have attracted the attention of many scholars. In theoretical research, Chen and Liu [12] first investigated the existence and uniqueness conditions for solutions of uncertain differential equations. Based on this work, Liu [13] began research on the stability of uncertain differential equations. Subsequently, Yao et al. [14] proposed several stability theorems, further advancing stability analysis for uncertain differential equations. The Yao–Chen formula developed by Yao and Chen [15] is a significant contribution to the study of uncertain differential equations. It links uncertain differential equations with ordinary differential equations by demonstrating that the solution to an uncertain differential equation can be represented by the solutions of a family of ordinary differential equations. The Yao–Chen formula therefore provides an accurate and elegant way to obtain numerical solutions to uncertain differential equations, and on this basis Yao and Chen [15] proposed the first numerical method for solving them. Building on this pioneering work, many scholars have devoted themselves to researching and improving the theory of uncertain differential equations, producing a large number of results.
Applying uncertain differential equations in practice requires parameter estimation, which is also a current research hotspot. However, current research on parameter estimation of uncertain differential equations mainly focuses on methods based on the residuals of uncertain differential equations. Although residual-based parameter estimation methods have high accuracy, they rely strictly on the existence of analytical solutions to uncertain differential equations or on the existence of the inverse uncertainty distributions of their solutions. It is well known that only a few uncertain differential equations admit analytical solutions, and only regular uncertain differential equations can be solved numerically to obtain inverse uncertainty distributions. Therefore, parameter estimation methods based on the residuals of uncertain differential equations have certain limitations. In order to study the least squares estimation of general uncertain differential equations, this paper constructs a symmetric statistical invariant based on the difference scheme of uncertain differential equations, and studies the least squares estimation based on this statistical invariant and the principle of least squares. Specifically, the main contributions of this paper are as follows:
A statistical invariant based on the difference scheme of uncertain differential equations has been constructed, and for general uncertain differential equations, the analytical expression of this statistical invariant can be obtained. This resolves the problem that the residual-based parameter estimation method for uncertain differential equations is limited by the solvability of the uncertainty distribution or the inverse uncertainty distribution of the solution.
The least squares estimations of the constant parameters and time-varying parameters of general uncertain differential equations are proposed. This solves the problem of applying the least squares method based on uncertainty distribution to the parameter estimation of constant parameters and time-varying parameters in uncertain differential equations.
Two numerical algorithms are designed to compute the aforementioned least squares estimations numerically.
Two numerical examples and an empirical study are provided to illustrate the corresponding methods proposed in this paper.
2. Literature Review of Parameter Estimation of Uncertain Differential Equations
Parameter estimation of uncertain differential equations is a current research hotspot in this field and is also the research topic of this paper. Therefore, this section reviews the literature on parameter estimation of uncertain differential equations.
Parameter estimation for uncertain differential equations originated from the work of Yao and Liu [16], who first proposed moment estimation for uncertain differential equations based on their difference scheme and the method of moments. However, when there are too many unknown parameters, moment estimation may not be possible. To address this issue, Liu [17] further developed the generalized moment estimation of uncertain differential equations by minimizing the sum of squares of the deviations between each order of sample moments and each order of population moments. Furthermore, Sheng et al. [18] investigated a least squares estimation for uncertain differential equations by minimizing the noise terms. Yang et al. [19] explored the minimum covering estimation by ensuring that the observed data fall within a reasonable range of the α-paths. Liu and Liu [20] borrowed the idea of uncertain maximum likelihood to present the maximum likelihood estimation for uncertain differential equations. These works are early achievements in this field and are all based on difference schemes for uncertain differential equations. Their disadvantage is that the difference schemes introduce errors, and these errors are particularly significant when the time step of the observed data is not sufficiently small, which is usually impossible to control.
In order to address the problems caused by the difference schemes, Liu and Liu [21] proposed the concept of residuals for uncertain differential equations and established a precise connection between uncertain differential equations and observed data. Since then, the focus of many scholars has shifted to residual-based parameter estimation methods for uncertain differential equations. Based on the residuals of uncertain differential equations, Liu and Liu [21] first studied residual-based moment estimation. Then, Liu and Liu borrowed the ideas of uncertain maximum likelihood and least squares criteria to explore residual-based maximum likelihood estimation (Liu and Liu [22]) and residual-based least squares estimation (Liu and Liu [23]), respectively. In addition, scholars have also explored the parameter estimation of multi-factor uncertain differential equations, including residual-based moment estimation (Yao and Sheng [24]), residual-based maximum likelihood estimation (Liu et al. [25]), and residual-based least squares estimation (Wu and Liu [26]). Some scholars have also studied parameter estimation for a class of uncertain partial differential equations. For example, Yang and Liu [4] studied residual-based moment estimation of a class of uncertain partial differential equations, and Yang and Liu [27] proposed residual-based least squares estimation and applied it to modeling the Chinese population. These studies are the current methodological focus of parameter estimation in uncertain differential equations. Their advantage is that residuals connect the uncertain differential equations to the observed data accurately, which yields relatively more accurate results. Their disadvantage is that they rely heavily on the solvability of the residuals. For general uncertain differential equations that do not satisfy regularity conditions, the residuals are essentially unobtainable. In such cases, these methods fail, and we are forced to resort to parameter estimation methods based on difference schemes.
3. Preliminary
In this section, we will provide some concepts and theorems about uncertainty theory and uncertain differential equations to ensure that readers can understand the following content more easily.
The uncertain measure is the foundation of uncertainty theory. It is a set function defined on a σ-algebra, and the specific definition is as follows:
Definition 1 (Liu [2]). Let Γ denote the universal set, and let $\mathcal{L}$ denote the σ-algebra over Γ. If the measurable set function $\mathcal{M}$ defined on the σ-algebra $\mathcal{L}$ satisfies the following three conditions, then $\mathcal{M}$ is called an uncertain measure, and $(\Gamma, \mathcal{L}, \mathcal{M})$ is called an uncertainty space:
Normality Axiom: $\mathcal{M}\{\Gamma\} = 1$ for the universal set Γ.
Duality Axiom: For any event $\Lambda$, $\mathcal{M}\{\Lambda\} + \mathcal{M}\{\Lambda^{c}\} = 1$ always holds.
Subadditivity Axiom: For every countable sequence of events $\Lambda_1, \Lambda_2, \ldots$, $\mathcal{M}\left\{\bigcup_{i=1}^{\infty} \Lambda_i\right\} \le \sum_{i=1}^{\infty} \mathcal{M}\{\Lambda_i\}$ always holds.
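To obtain the uncertain measure of a composite event, the product uncertain measure $\mathcal{M}$ on the product σ-algebra $\mathcal{L}$ was defined by Liu [13] by assuming it follows the product axiom. That is, let $(\Gamma_k, \mathcal{L}_k, \mathcal{M}_k)$, $k = 1, 2, \ldots$, be a sequence of uncertainty spaces. If the uncertain measure $\mathcal{M}$ defined on the product σ-algebra $\mathcal{L} = \mathcal{L}_1 \times \mathcal{L}_2 \times \cdots$ satisfies
$$\mathcal{M}\left\{\prod_{k=1}^{\infty} \Lambda_k\right\} = \bigwedge_{k=1}^{\infty} \mathcal{M}_k\{\Lambda_k\},$$
where $\Lambda_k \in \mathcal{L}_k$ for $k = 1, 2, \ldots$, then the uncertain measure $\mathcal{M}$ is called a product uncertain measure. Based on the uncertainty space $(\Gamma, \mathcal{L}, \mathcal{M})$, an instrument used to describe quantities in uncertain environments is the uncertain variable, which was defined by Liu [2] as a measurable function from an uncertainty space $(\Gamma, \mathcal{L}, \mathcal{M})$ to the set of real numbers and was denoted by ξ, i.e., the set $\{\xi \in B\} = \{\gamma \in \Gamma \mid \xi(\gamma) \in B\}$ is always an event for any Borel set B of real numbers. Subsequently, in order to facilitate the calculation of uncertain variables in practical situations, the uncertainty distribution of an uncertain variable ξ was also defined by Liu [2] as
$$\Phi(x) = \mathcal{M}\{\xi \le x\}$$
for any real number $x$.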
Example 1. If the uncertain variable ξ has a normal uncertainty distribution
$$\Phi(x) = \left(1 + \exp\left(\frac{\pi (e - x)}{\sqrt{3}\,\sigma}\right)\right)^{-1}, \quad x \in \mathbb{R},$$
then it is called a normal uncertain variable and is represented as $\xi \sim \mathcal{N}(e, \sigma)$, where e is the expected value and σ is the standard deviation. In addition, a normal uncertainty distribution is called standard if $e = 0$ and $\sigma = 1$.
In practice, operations are carried out only on independent uncertain variables. In order to define what kind of uncertain variables are independent, Liu [13] suggested that if the uncertain variables $\xi_1, \xi_2, \ldots, \xi_n$ satisfy
$$\mathcal{M}\left\{\bigcap_{i=1}^{n} (\xi_i \in B_i)\right\} = \bigwedge_{i=1}^{n} \mathcal{M}\{\xi_i \in B_i\}$$
for any Borel sets $B_1, B_2, \ldots, B_n$ of real numbers, then the uncertain variables $\xi_1, \xi_2, \ldots, \xi_n$ can be called independent. Furthermore, in order to provide a more accurate description of time-varying systems in uncertain environments, the Liu process $C_t$ was defined by Liu [13] as follows:
Definition 2. An uncertain process $C_t$ is said to be a Liu process if
(i) $C_0 = 0$, and almost all sample paths are Lipschitz continuous;
(ii) $C_t$ has stationary and independent increments;
(iii) every increment $C_{s+t} - C_s$ is a normal uncertain variable with expected value 0 and standard deviation t.
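The normal uncertainty distribution of Example 1 and the increment law in Definition 2 (iii) can be evaluated directly. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and the inverse distribution $\Phi^{-1}(\alpha) = e + \frac{\sqrt{3}\,\sigma}{\pi}\ln\frac{\alpha}{1-\alpha}$ is obtained by solving $\Phi(x) = \alpha$ for $x$.

```python
import math


def normal_uncertainty_cdf(x, e=0.0, sigma=1.0):
    """Normal uncertainty distribution Phi(x) = 1 / (1 + exp(pi*(e - x) / (sqrt(3)*sigma)))."""
    z = math.pi * (e - x) / (math.sqrt(3.0) * sigma)
    # For very large z, exp(z) would overflow; Phi(x) is numerically 0 there.
    if z > 700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def normal_uncertainty_inverse(alpha, e=0.0, sigma=1.0):
    """Inverse distribution, obtained by solving Phi(x) = alpha for x (0 < alpha < 1)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return e + sigma * math.sqrt(3.0) / math.pi * math.log(alpha / (1.0 - alpha))


def liu_increment_cdf(x, t):
    """Distribution of a Liu process increment C_{s+t} - C_s, i.e. N(0, t) with t > 0."""
    return normal_uncertainty_cdf(x, 0.0, t)
```

Because $\Phi(x) + \Phi(-x) = 1$ when $e = 0$, the standard normal uncertainty distribution is symmetric about the origin, which can be checked numerically with `normal_uncertainty_cdf`.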
4. Symmetric Statistical Invariant Based on Difference Scheme
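In this section, we consider the general uncertain differential equation expressed in the following form:
$$dX_t = f(t, X_t; \theta)\,dt + g(t, X_t; \theta)\,dC_t, \tag{1}$$
where $f$ and $g$ are two continuous real-valued functions satisfying the existence and uniqueness conditions for the solution of uncertain differential Equation (1), i.e., the linear growth condition
$$|f(t, x; \theta)| + |g(t, x; \theta)| \le L(1 + |x|)$$
and the Lipschitz condition
$$|f(t, x; \theta) - f(t, y; \theta)| + |g(t, x; \theta) - g(t, y; \theta)| \le L|x - y|$$
for some constant $L$, $\theta$ is a vector of unknown parameters to be estimated, and $C_t$ is a Liu process.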
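Suppose also that there exists a set of observed data
$$x_{t_1}, x_{t_2}, \ldots, x_{t_n} \tag{2}$$
of the solution $X_t$ to uncertain differential Equation (1) at time points $t_1, t_2, \ldots, t_n$ with $t_1 < t_2 < \cdots < t_n$, respectively. In order to link the uncertain differential Equation (1) with the observed data (2), we first discretize uncertain differential Equation (1) at the observed time points $t_1, t_2, \ldots, t_n$. That is, we write uncertain differential Equation (1) in the Euler difference form,
$$X_{t_i} = X_{t_{i-1}} + f(t_{i-1}, X_{t_{i-1}}; \theta)(t_i - t_{i-1}) + g(t_{i-1}, X_{t_{i-1}}; \theta)(C_{t_i} - C_{t_{i-1}}).$$
Based on the above Euler difference form, we can obtain
$$\frac{X_{t_i} - X_{t_{i-1}} - f(t_{i-1}, X_{t_{i-1}}; \theta)(t_i - t_{i-1})}{g(t_{i-1}, X_{t_{i-1}}; \theta)(t_i - t_{i-1})} = \frac{C_{t_i} - C_{t_{i-1}}}{t_i - t_{i-1}}$$
for $i = 2, 3, \ldots, n$. For each index $i$ with $2 \le i \le n$, it follows from Definition 2 that the increment $C_{t_i} - C_{t_{i-1}}$ of the Liu process $C_t$ over the time interval $[t_{i-1}, t_i]$ is a normal uncertain variable with expected value 0 and standard deviation $t_i - t_{i-1}$. That is,
$$C_{t_i} - C_{t_{i-1}} \sim \mathcal{N}(0, t_i - t_{i-1}).$$
Therefore, it is easy to infer that
$$\frac{C_{t_i} - C_{t_{i-1}}}{t_i - t_{i-1}}, \quad i = 2, 3, \ldots, n,$$
are always independent of each other and
$$\frac{C_{t_i} - C_{t_{i-1}}}{t_i - t_{i-1}} \sim \mathcal{N}(0, 1)$$
always holds. By using the above transformation formula of the Euler difference form of uncertain differential Equation (1), we can also infer that
$$\frac{X_{t_i} - X_{t_{i-1}} - f(t_{i-1}, X_{t_{i-1}}; \theta)(t_i - t_{i-1})}{g(t_{i-1}, X_{t_{i-1}}; \theta)(t_i - t_{i-1})}, \quad i = 2, 3, \ldots, n,$$
are always independent of each other and
$$\frac{X_{t_i} - X_{t_{i-1}} - f(t_{i-1}, X_{t_{i-1}}; \theta)(t_i - t_{i-1})}{g(t_{i-1}, X_{t_{i-1}}; \theta)(t_i - t_{i-1})} \sim \mathcal{N}(0, 1)$$
always holds.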
Then, it follows from the above analysis that
On the other hand, we substitute the observed data (
2) into
and write
Note that
,
are
real-valued functions with respect to the vector of unknown parameters
and can be regarded as the samples of
,
, respectively. That is, we should have
At this point, we have constructed a series of real-valued functions
,
of the vector of unknown parameters
based on uncertain differential Equation (
1) and its observed data (
2), and processed the population distribution of the real-valued functions
,
as a statistical invariant
. Since the uncertainty distribution of
is symmetric about the origin, this statistical invariant is also called symmetric.
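The construction of this section amounts to normalizing each Euler increment of the observed data by the drift, the diffusion, and the time step. The following is a minimal sketch under stated assumptions: the function name `invariant_samples`, the callables `f` and `g` with signature `(t, x, theta)`, and the toy data are all hypothetical and chosen only for illustration.

```python
def invariant_samples(times, values, f, g, theta):
    """Return the n-1 normalized Euler increments of the observed data.

    Each sample is (x_i - x_{i-1} - f*dt) / (g*dt), with f and g evaluated at the
    left endpoint (t_{i-1}, x_{i-1}); under the model it should behave like N(0, 1).
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    samples = []
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("time points must be strictly increasing")
        drift = f(times[i - 1], values[i - 1], theta)
        diffusion = g(times[i - 1], values[i - 1], theta)
        if diffusion == 0:
            raise ZeroDivisionError("diffusion term vanishes at an observation")
        samples.append((values[i] - values[i - 1] - drift * dt) / (diffusion * dt))
    return samples


# Illustrative (made-up) data for dX_t = mu dt + sigma dC_t with theta = (mu, sigma).
times = [0.0, 1.0, 2.0, 3.0]
values = [0.0, 1.5, 2.0, 4.0]
samples = invariant_samples(
    times, values,
    f=lambda t, x, th: th[0],
    g=lambda t, x, th: th[1],
    theta=(1.0, 0.5),
)
# samples == [1.0, -1.0, 2.0]
```

Because the samples depend on the parameters only through `f` and `g`, no analytical solution or inverse uncertainty distribution of the equation is needed to evaluate them.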
6. Least Squares Estimation of General Uncertain Differential Equations with Time-Varying Parameters
Generally speaking, the unknown parameters in uncertain differential equations tend to be time-varying rather than constant. Building on this, we will further investigate the least squares estimation problem for uncertain differential equations with time-varying parameters.
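Here, we consider the following uncertain differential equation with time-varying parameters,
$$dX_t = f(t, X_t; \theta_t)\,dt + g(t, X_t; \theta_t)\,dC_t, \tag{11}$$
where all the assumptions are the same as those in (1), with the exception that $\theta_t$ is assumed to be a time-varying parameter related to time $t$. On the other hand, we also assume that there exists a set of observed data
$$x_{t_1}, x_{t_2}, \ldots, x_{t_n}$$
of the solution $X_t$ to uncertain differential Equation (11) at time points $t_1, t_2, \ldots, t_n$ with $t_1 < t_2 < \cdots < t_n$, respectively. By using the method proposed in Section 4, we can also obtain $n - 1$ samples of the symmetric statistical invariant $\mathcal{N}(0, 1)$, where the time-varying parameter $\theta_t$ serves as the independent variable. That is,
$$\frac{x_{t_i} - x_{t_{i-1}} - f(t_{i-1}, x_{t_{i-1}}; \theta_{t_{i-1}})(t_i - t_{i-1})}{g(t_{i-1}, x_{t_{i-1}}; \theta_{t_{i-1}})(t_i - t_{i-1})} \sim \mathcal{N}(0, 1), \quad i = 2, 3, \ldots, n.$$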
Before estimating the time-varying parameters $\theta_t$, we first set a sliding window $k$, which is a given positive integer within an appropriate range. Then, for each admissible index $j$, the $k$ consecutive samples in the $j$-th window can also be regarded as a set of samples of $\mathcal{N}(0, 1)$. In this case, since the observed data involved lie solely within the time interval covered by this window, we can assume that the variations of the time-varying parameters $\theta_t$ within this period are small. Thus, the time-varying parameters $\theta_t$ can be regarded as a constant in this period. Therefore, we can obtain the least squares estimation of this constant by solving the following minimization problem:
Iterating over $j$ from 2 up to the last admissible window, and repeating the above operation, we can obtain a set of least squares estimations (15), one for each window, for the time-varying parameters $\theta_t$. Then, we select an appropriate uncertain regression model to fit the set of least squares estimations (15), and the resulting regression equation is taken as the least squares estimation of the time-varying parameters $\theta_t$.
To ensure that the least squares estimation of the time-varying parameters can actually be computed, we design the following numerical algorithm.
Step 0: Input the general uncertain differential equation with time-varying parameters, $dX_t = f(t, X_t; \theta_t)\,dt + g(t, X_t; \theta_t)\,dC_t$, that needs to be estimated, the observed data $x_{t_1}, x_{t_2}, \ldots, x_{t_n}$, and the sliding window $k$.
Step 1: Determine the feasible region of the vector of unknown parameters.
Step 2: Set .
Step 3: For each
, compute
by
for
.
Step 4: Set and .
Step 5: Set
and
.
Step 6: If , then go to Step 5.
Step 7: Find such that reaches its minimum value by using MATLAB, and set .
Step 8: If , then go to Step 3.
Step 9: Select the uncertain regression model to fit , and obtain the regression equation .
Step 10: Output .
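The sliding-window procedure can be sketched in a few lines. This is only an illustration under several loudly stated assumptions, not the algorithm as specified in the paper: the objective `window_objective` (squared distance between the standard normal uncertainty distribution at the sorted samples and the plotting positions $(i - 0.5)/k$) is an assumed stand-in for the paper's least squares criterion, a plain grid search over a hypothetical list `candidates` replaces the MATLAB minimization of Step 7, the window indexing is our own choice, and the regression of Step 9 is omitted.

```python
import math


def phi(x):
    """Standard normal uncertainty distribution 1 / (1 + exp(-pi*x/sqrt(3)))."""
    z = -math.pi * x / math.sqrt(3.0)
    return 0.0 if z > 700.0 else 1.0 / (1.0 + math.exp(z))


def window_objective(samples):
    """ASSUMED criterion: squared gap between phi(sorted samples) and (i - 0.5)/k."""
    ordered = sorted(samples)
    k = len(ordered)
    return sum((phi(h) - (i + 0.5) / k) ** 2 for i, h in enumerate(ordered))


def sliding_window_estimates(times, values, f, g, candidates, k):
    """For each window of k consecutive increments, pick the candidate parameter
    (treated as constant inside the window) that minimizes window_objective."""
    n = len(times)
    if len(values) != n or not 1 <= k <= n - 1:
        raise ValueError("need equal-length data and 1 <= k <= n - 1")
    estimates = []
    for start in range(1, n - k + 1):          # window covers increments start..start+k-1
        best_theta, best_cost = None, math.inf
        for theta in candidates:               # grid search over the feasible region
            samples = []
            for i in range(start, start + k):
                dt = times[i] - times[i - 1]
                drift = f(times[i - 1], values[i - 1], theta)
                diffusion = g(times[i - 1], values[i - 1], theta)
                samples.append(
                    (values[i] - values[i - 1] - drift * dt) / (diffusion * dt)
                )
            cost = window_objective(samples)
            if cost < best_cost:
                best_theta, best_cost = theta, cost
        # Label each estimate with the left endpoint of its window.
        estimates.append((times[start - 1], best_theta))
    return estimates
```

The returned pairs of window time and estimate are the raw material for the final regression step. A finer candidate grid or a continuous optimizer would sharpen each window estimate at higher computational cost.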