Fractional State Space Description: A Particular Case of the Volterra Equations

To tackle several limitations recently highlighted in the field of fractional differentiation and fractional models, some authors have proposed new kernels for the definition of fractional integration/differentiation operators. Some limitations still remain, however, with these kernels, whereas solutions prior to the introduction of fractional models exist in the literature. This paper shows that the fractional pseudo state space description, a fractional model widely used in the literature, is a special case of the Volterra equations, equations introduced nearly a century ago. Volterra equations can thus be viewed as a serious alternative to fractional pseudo state space descriptions for modelling power law type long memory behaviours. This paper thus presents a new class of model involving a Volterra equation and several kernels associated with this equation capable of generating power law behaviours of various kinds. One is particularly interesting as it permits a power law behaviour in a given frequency band and, thus, a limited memory effect on a given time range (as the memory length is finite, the description does not exhibit infinitely slow and infinitely fast time constants as for pseudo state space descriptions).


Introduction
For 30 years, research on fractional differentiation and integration operators has steadily increased. While from a mathematical point of view these operators are now well defined, this is not the case for their physical interpretations or for their use in the definition of fractional models.
In the field of automatic control, the literature abounds with papers based on a generalisation of the classical state space description denoted "fractional pseudo state space description". It is mathematically defined by the following equations:

$$\frac{d^{\nu}}{dt^{\nu}}[x(t)] = A\,x(t) + B\,u(t), \qquad y(t) = C\,x(t). \tag{1}$$

In relation (1), $\frac{d^{\nu}}{dt^{\nu}}[x(t)]$ replaces the usual derivative and denotes the fractional derivative of order ν of the variable (or vector of variables) x(t). A first difficulty related to such a representation is the considerable number of possible definitions for the fractional differentiation operator $\frac{d^{\nu}}{dt^{\nu}}[\,\cdot\,]$. More than 30 are listed in [1], and new definitions of fractional operators have appeared recently [2][3][4][5][6]. These definitions are not equivalent if initial conditions are taken into account, and in many cases each author chooses the definition that suits the results they want to obtain. The most widely used definitions are the Riemann-Liouville and Caputo definitions [7]. The latter takes initial conditions into account very easily (as for classical state space descriptions), but it was demonstrated that this is not done in a physically consistent way [8][9][10][11] (the memory of the model is not taken into account at the initial time). While a similar problem exists with the Riemann-Liouville definition, it nonetheless has the merit of taking into account the entire past of the system at the initial time $t = t_0$, which better reflects the behaviour of the model for $t > t_0$ [9].

Fractal Fract. 2020, 4, 23
A fractional pseudo state space description (1), which is better denoted a pseudo state space description since the variable x(t) does not meet the definition of a state as shown in [9,12], is in fact a simple "fractionalisation" of the classical state space description. It has no physical justification and results from the need for models that fit power law type long memory behaviours.
Numerous examples of this kind of behaviour were revealed in various domains, for instance:
• Electrochemistry due to charge diffusion in batteries [13,14];
• Thermal conduction in a semi-infinite medium [15];
• Biology for modelling complex dynamics in biological tissues [16];
• Mechanics with the dynamical property of viscoelastic materials and, in particular, wave propagation problems in these materials [17];
• Acoustics to model visco-thermal losses in wind instruments [18];
• Electrical distribution networks [19].
Given the ubiquity of such systems, efficient modelling tools are needed. Of course, fractional integration and differentiation operators have the ability to fit these kinds of behaviours. A fractional integration operator of order ν that links an input u(t) to an output y(t) is defined in the Riemann-Liouville sense by the relation [7]:

$$y(t) = \frac{1}{\Gamma(\nu)} \int_0^t (t-\tau)^{\nu-1}\,u(\tau)\,d\tau. \tag{2}$$

Such an operator exhibits a power law behaviour, as the modulus of its Fourier transform, denoted $|Y(j\omega)/U(j\omega)|$, is equal to $1/\omega^{\nu}$. As it is defined using a fractional integration operator, the fractional-order differentiation operator has a similar property.
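As a quick numerical illustration of this power law property, the sketch below evaluates the frequency response of an ideal fractional integrator, assumed here to be $1/(j\omega)^{\nu}$ (the Laplace-domain counterpart of the Riemann-Liouville integral with zero initial conditions), and checks that the gain falls by 20ν dB per decade. The function names and the sample frequencies are illustrative choices, not taken from the text.

```python
import math


def fractional_integrator_gain(omega: float, nu: float) -> float:
    """Modulus of 1/(j*omega)**nu, the assumed frequency response of a
    fractional integrator of order nu."""
    return abs(1.0 / (1j * omega) ** nu)


def gain_slope_db_per_decade(omega: float, nu: float) -> float:
    """Gain variation in dB between omega and 10*omega."""
    g1 = fractional_integrator_gain(omega, nu)
    g2 = fractional_integrator_gain(10.0 * omega, nu)
    return 20.0 * math.log10(g2 / g1)


if __name__ == "__main__":
    nu = 0.5  # illustrative order
    for omega in (0.01, 1.0, 100.0):
        print(omega,
              fractional_integrator_gain(omega, nu),
              gain_slope_db_per_decade(omega, nu))
```

The slope is the same at every frequency, which is the "infinite band" character of the operator that the later kernels try to restrict.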
Even if a nice fitting of many systems' input-output behaviours is obtained with models such as (1), several drawbacks are associated with the description (1) and with fractional differential equations in general:

• The variable x(t) that plays the role of the state does not have the properties of a state, which is why the name "pseudo state" was introduced [11,12];
• The initial conditions are not well taken into account if the Caputo or Riemann-Liouville definitions are used for the derivative of the pseudo state, and it is better to define the description with a fractional integration to take the model past into account [10][11][12][20];
• The memory of this description is infinite and it exhibits infinitely slow and infinitely fast time constants (even if they are attenuated, they exist), which excludes the possibility of linking the pseudo-state variable to a physical variable [11];
• Observability results (but also controllability, flatness, etc.) depend on the definition used for fractional differentiation [20][21][22], and exact observability cannot be reached if all the system past must be known to predict its future [20];
• Fractional integration given by relation (2) involves a singular kernel, as mentioned in [2]; this leads to complications in the solution/simulation of the fractional-order differential equations (and that is why some new kernels were proposed; see, for instance, [2][3][4][5][6]);
• The parameter units associated with description (1) (parameters inside matrices A and B) have no physical meaning (e.g., $\mathrm{s}^{-\nu}$);
• The physical interpretations proposed in the literature [23][24][25][26][27][28][29] are not very convincing, and in the case of incommensurate orders, some can invalidate the obtained model [30].
It is therefore necessary to find models that overcome these limitations and that are physically consistent and interpretable. In this paper, Section 2 highlights that the pseudo state space description frequently encountered in the literature for power law type long memory behaviours is a special case of the Volterra equations, in the sense that the kernel used in its definition has a constrained form. In the field of power law type long memory behaviour modelling, it thus appears more general to work directly with Volterra equations, which allow great freedom in the choice of the kernel. Several kernels are proposed in Section 3, some of which have never been proposed in the literature, that permit power law type long memory behaviours while solving several drawbacks associated with pseudo state space descriptions. Note that the introduction of special kernels in other integro-differential equations to produce power law type long memory behaviour was done in [31,32]. The paper ends, in Section 4, with a discussion of the results presented.
Notation: In the following, the same letter is used to designate a time or a frequency variable, or the Laplace transform of these variables. The argument associated with this variable then makes it possible to lift the indeterminacy:
- z(t) denotes a time variable;
- z(jω) denotes a frequency variable;
- z(s) denotes the Laplace transform of z(t), $z(s) = \mathcal{L}\{z(t)\}$.

Pseudo State Space Description: A Particular Case of the Volterra Equations
Volterra equations of the first kind were introduced by Volterra himself [33,34] to model population growth. In the linear and scalar case ($x \in \mathbb{R}$, $v \in \mathbb{R}$) they are defined by

$$v(t) = \int_0^t \eta(t-\tau)\,x(\tau)\,d\tau. \tag{3}$$

Such a representation appears more general than the fractional pseudo state space description (1). To demonstrate this, the operator $\frac{d^{\nu}}{dt^{\nu}}[\,\cdot\,]$ in relation (1) is assumed to be defined in the Riemann-Liouville sense [7] (for the reasons explained in Section 1). It is also assumed in (1) that $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times n}$, and $C \in \mathbb{R}^{1\times n}$.
According to [7] (p. 46) (if the fractional integral of order ν of each component of vector x(t) exists) and after first-order integration of both sides of the first equation in relation (1), the following equation can be obtained:

$$\int_0^t \eta_f(t-\tau)\,x(\tau)\,d\tau = \int_0^t \big(A\,x(\tau) + B\,u(\tau)\big)\,d\tau, \tag{4}$$

where the kernel in (4) is $\eta_f(t) = t^{-\nu}/\Gamma(1-\nu)$ and multiplies each component of vector x(t). Representation (1) can thus be rewritten under the form of a Volterra equation of the first kind:

$$\int_0^t \big(\eta_f(t-\tau)\,I_n - A\big)\,x(\tau)\,d\tau = \int_0^t B\,u(\tau)\,d\tau, \tag{5}$$

where $I_n$ denotes an identity matrix of dimension n. Relation (5) demonstrates that a pseudo state space description is a particular case of a Volterra equation of the first kind, as the kernel in (5) has a fixed structure. Using a Volterra equation, it is thus possible to generalise a pseudo state space description in two ways:
• By adapting the kernel η(t) in relation (3), it is possible to produce, with the same kind of equation, power law behaviours of various types (denoted explicit, implicit), but also many other long memory behaviours;
• In relation (3), if $x(t) \in \mathbb{R}^{n}$, then η(t) is a matrix of kernels such that $\eta(t) = [\eta_{i,j}(t)]$, thus permitting great flexibility in the tuning of relation (3). The case $\eta(t) = \mathrm{diag}[\eta_i(t)]$ comes closer to the non-commensurate fractional pseudo state space representation case, but it should be remembered that physical interpretations invalidate this kind of model [30].

Description (3) has another important advantage. Model memory can be limited by introducing a parameter $T_f$ in the integral bounds such that

$$v(t) = \int_{t-T_f}^{t} \eta(t-\tau)\,x(\tau)\,d\tau. \tag{6}$$

Using the change of variable ξ = t − τ, relation (6) becomes

$$v(t) = \int_0^{T_f} \eta(\xi)\,x(t-\xi)\,d\xi. \tag{7}$$

Relation (7) explicitly shows that only the values x(t − ξ) for $\xi \in [0, T_f]$, that is, the model past over a window of length $T_f$, are required to compute its future. Modification of the lower bound of relation (3) to produce relation (6) is thus of interest in the initialisation problem.
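As a consistency check on the link between relation (1) and a Volterra equation, the short derivation below takes the Laplace transform of the integrated form of relation (1), assuming zero initial conditions, 0 < ν < 1, and the Riemann-Liouville kernel $\eta_f(t) = t^{-\nu}/\Gamma(1-\nu)$ given above. The first line is this sketch's reading of the integrated equation and the output equation y(t) = C x(t) is assumed; neither is quoted from the source.

```latex
\begin{aligned}
\int_0^t \eta_f(t-\tau)\,x(\tau)\,d\tau
  &= \int_0^t \big(A\,x(\tau) + B\,u(\tau)\big)\,d\tau, \\
\mathcal{L}\{\eta_f(t)\}
  &= \frac{1}{\Gamma(1-\nu)}\cdot\frac{\Gamma(1-\nu)}{s^{1-\nu}} = s^{\nu-1}, \\
s^{\nu-1}\,x(s) &= \frac{1}{s}\big(A\,x(s) + B\,u(s)\big)
  \;\Longrightarrow\; s^{\nu}\,x(s) = A\,x(s) + B\,u(s), \\
y(s) &= C\,\big(s^{\nu} I_n - A\big)^{-1} B\,u(s).
\end{aligned}
```

The last line is the usual input-output transfer function of a commensurate fractional pseudo state space description, which is what one would expect if the Volterra form with kernel $\eta_f$ is indeed equivalent to relation (1) from an input-output point of view.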
Initialisation of relation (6) only requires knowledge of the past of variable x(t) on the interval $[t_0 - T_f, t_0]$ if $t_0$ denotes the initial time, while knowledge of the past on $]-\infty, t_0]$ is required for fractional model (1).
For all these reasons, in a modelling approach it is better to work with model (6), which is more general as previously demonstrated, than with model (1). This can be done by searching for the kernel η(t) directly, without any assumption on its structure.
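The finite-memory idea can be sketched numerically. The code below approximates a truncated convolution of the assumed form $v(t) = \int_0^{T_f} \eta(\xi)\,x(t-\xi)\,d\xi$ with the midpoint rule; the function name, the exponential kernel and the step count are illustrative assumptions and are not taken from the paper. The point of the sketch is that x is only ever evaluated on $[t - T_f, t]$.

```python
import math
from typing import Callable


def finite_memory_convolution(
    kernel: Callable[[float], float],
    x: Callable[[float], float],
    t: float,
    memory_length: float,
    steps: int = 2000,
) -> float:
    """Approximate v(t) = integral over [0, Tf] of kernel(xi) * x(t - xi)
    with the midpoint rule. Only x on [t - Tf, t] is evaluated."""
    h = memory_length / steps
    total = 0.0
    for k in range(steps):
        xi = (k + 0.5) * h  # midpoint of the k-th sub-interval
        total += kernel(xi) * x(t - xi)
    return total * h


if __name__ == "__main__":
    # Illustrative smooth kernel (an assumption, not one of the paper's kernels).
    eta = lambda xi: math.exp(-xi)
    x_const = lambda tau: 1.0
    approx = finite_memory_convolution(eta, x_const, t=10.0, memory_length=3.0)
    exact = 1.0 - math.exp(-3.0)  # closed form for this kernel and input
    print(approx, exact)
```

A midpoint rule is used so that the kernel is never evaluated at ξ = 0, which keeps the sketch usable with kernels that are singular at the origin, although convergence is then much slower.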

A Volterra-Equation-Based Model for Power Law Type Long Memory Behaviour
This section shows that by choosing an appropriate kernel η(t) in relation (3), it is possible to produce power law behaviours of various kinds (explicit, implicit). To highlight this, model (8), which involves a Volterra equation of the first kind, is considered. It is assumed that $u(t) \in \mathbb{R}$ denotes the input of the model and that $y(t) \in \mathbb{R}$ is its output. The Laplace transform of relation (8), without considering initial conditions, is given by relation (9), and from an input-output point of view, the transfer function (10) is thus obtained.

A First Kernel
The paper by Cole and Cole [35] is considered by some as the first application of fractional differentiation. As Volterra equations of the first kind were introduced by Volterra himself [33] at least 20 years before the Cole and Cole paper, instead of using their empirical formula (see relation (1) in [35]) for dispersion and absorption modelling in dielectrics, model (8) (or, more generally, a Volterra equation of the first kind) could have been used with the kernel $\eta_1(t)$ of relation (12), in which H(t) denotes the Heaviside function. The Laplace transform of relation (12) is given by relation (13) and thus leads to the frequency response (11) for the transfer function (10), a frequency response that exhibits a power law behaviour of order ν − 1 on the frequency domain $[1/\tau_0, \infty[$. However, this kernel is singular at t = 0. Moreover, computation of the impulse response of the transfer function (13) using Cauchy's theorem, as detailed in Section 3.5, gives relation (14), and the Laplace transform applied to relation (14) leads to relation (15). Relation (15) shows that the transfer function $\eta_1(s)$ exhibits a distribution of poles within the interval $]-\infty, 0]$. Kernel $\eta_1(t)$ thus exhibits infinitely slow and infinitely fast time constants, and is not a candidate to solve the drawbacks mentioned in the introduction.

A Second Kernel
The second kernel studied, $\eta_2(t)$, is the so-called tempered (or truncated) Mittag-Leffler memory kernel, which has been investigated in different contexts and especially in generalised diffusion equations [36,37], in generalised diffusion-wave equations [38], and in generalised Langevin equations [39,40].
As $\mathcal{L}\{t^{\alpha} E_{1,1+\alpha}(at)\} = s^{-\alpha}/(s-a)$ [41], and using the shift rule of the Laplace transform, the Laplace transform of kernel $\eta_2(t)$ can be obtained, and with it the transfer function of model (8). The latter is also known as the Davidson-Cole transfer function, which exhibits a power law behaviour of order −ν on the frequency domain $[\omega_{\min}, \infty[$. Such a kernel remains singular at t = 0. Computation of the impulse response of the transfer function (18) using Cauchy's theorem, as detailed in Section 3.5, gives relation (20).
The Laplace transform applied to relation (20) leads to an expression which shows that the transfer function $\eta_2(s)$ exhibits a distribution of poles within the interval $]-\infty, -\omega_{\min}]$ (without taking into account the 1/s term). Kernel $\eta_2(t)$ thus does not exhibit infinitely slow time constants, which should simplify the initialisation of the equation in which this kernel is used. However, this kernel still has infinitely fast time constants, which can be problematic for a physical interpretation.
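The power law behaviour of a Davidson-Cole transfer function above $\omega_{\min}$ can be checked numerically. The sketch below assumes the usual Davidson-Cole form $1/(1 + j\omega/\omega_{\min})^{\nu}$; the function names and the one-decade slope estimate are illustrative choices rather than the author's procedure.

```python
import math


def davidson_cole(omega: float, omega_min: float, nu: float) -> complex:
    """Assumed Davidson-Cole frequency response 1 / (1 + j*omega/omega_min)**nu."""
    return 1.0 / (1.0 + 1j * omega / omega_min) ** nu


def local_slope(omega: float, omega_min: float, nu: float) -> float:
    """Slope of log10|H| versus log10(omega), estimated between omega and 10*omega."""
    g1 = abs(davidson_cole(omega, omega_min, nu))
    g2 = abs(davidson_cole(10.0 * omega, omega_min, nu))
    return math.log10(g2 / g1)


if __name__ == "__main__":
    omega_min, nu = 1.0, 0.5  # illustrative values
    for omega in (1e-4, 1e-2, 1.0, 1e2, 1e4):
        print(omega, local_slope(omega, omega_min, nu))
```

Well below $\omega_{\min}$ the slope is close to 0 (constant gain), and well above it the slope tends to −ν, which matches the statement that the power law holds only on $[\omega_{\min}, \infty[$.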

A Third Kernel
The third kernel studied, $\eta_3(t)$, is defined by relation (22). With the Laplace transform $\eta_3(s)$ of this kernel, the transfer function (10) exhibits a power law behaviour of order −ν on the frequency domain $[0, \omega_{\min}]$. Moreover, as $\lim_{s\to\infty} s\,\eta_3(s) = 0$, using the initial value theorem, it can be concluded that kernel $\eta_3(t)$ is not singular at time t = 0. Computation of the impulse response of the transfer function (23) using Cauchy's theorem, as detailed in Section 3.5, gives relation (25). The Laplace transform applied to relation (25) leads to an expression which shows that the transfer function $\eta_3(s)$ exhibits a distribution of poles within the interval $]-\infty, 0]$, as for kernel $\eta_1(t)$.

A Fourth Kernel
The fourth kernel studied, $\eta_4(t)$, is defined by relation (27). The Laplace transform of this kernel is defined by

$$\eta_4(s) = \frac{1}{s\left(\dfrac{s}{\omega_{\min}} + 1\right)^{\nu}} \tag{28}$$

and the resulting transfer function (10) exhibits a power law behaviour of order +ν on the frequency domain $[\omega_{\min}, \infty[$. As $\lim_{s\to\infty} s\,\eta_4(s) = 0$, using the initial value theorem, it can be concluded that kernel $\eta_4(t)$ is not singular at time t = 0. Computation of the impulse response of the transfer function (28) using Cauchy's theorem, as detailed in Section 3.5, gives relation (30). The Laplace transform applied to relation (30) leads to an expression which shows that transfer function $\eta_4(s)$ exhibits a distribution of poles within the interval $]-\infty, -\omega_{\min}]$ (without taking into account the 1/s term). Kernel $\eta_4(t)$ thus does not exhibit infinitely slow time constants, which should simplify the initialisation of the equation in which this kernel is used. However, this kernel still has infinitely fast time constants, which can be problematic for a physical interpretation.
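The initial value argument used for the fourth kernel can be written out in one line. The block below assumes that relation (28) reads $\eta_4(s) = 1/\big(s\,(s/\omega_{\min} + 1)^{\nu}\big)$ with ν > 0, and contrasts it with the Riemann-Liouville kernel $\eta_f(t) = t^{-\nu}/\Gamma(1-\nu)$ of Section 2, whose Laplace transform is $s^{\nu-1}$ for 0 < ν < 1. The second line is only a heuristic, since the initial value theorem requires the limit to exist.

```latex
\begin{aligned}
\eta_4(0^+) &= \lim_{s\to\infty} s\,\eta_4(s)
  = \lim_{s\to\infty} \left(\frac{s}{\omega_{\min}} + 1\right)^{-\nu} = 0, \\
\lim_{s\to\infty} s\,\mathcal{L}\{\eta_f(t)\}
  &= \lim_{s\to\infty} s^{\nu} = \infty .
\end{aligned}
```

The first limit is finite, so under the stated assumption the kernel starts from a finite value at t = 0; the divergence of the second is consistent with the singularity of $t^{-\nu}$ at the origin.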

A Fifth Kernel
The following kernel, $\eta_5(t)$, defined by relation (32), is now considered. In relation (32), function ${}_1F_1(\cdot,\cdot,\cdot)$ is a confluent hypergeometric function [42]. The Laplace transform of this kernel is given in [43] (p. 238). Note that this kernel can be represented in terms of the three-parameter Mittag-Leffler function due to the following link with the confluent hypergeometric function: ${}_1F_1(\delta, 1, az) = E^{\delta}_{1,1}(az)$. The input-output transfer function $H_5(s)$ of the model is then given by relation (34).
The gain diagrams of transfer functions $\eta_5(s)$ and $H_5(s)$ are shown in Figure 1 with $\omega_{\min}$ = 0.001 rad/s, $\omega_{\max}$ = 1000 rad/s, and ν = 0.5. These transfer functions exhibit power law behaviour in the frequency band $[\omega_{\min}, \omega_{\max}]$.
A property (and advantage) of this kernel is a distribution of its poles on a limited frequency band. To demonstrate such a property, the impulse response of the transfer function G(s) (which appears in the definition of $\eta_5(s)$) is computed using Cauchy's theorem over the path $\Gamma = \gamma_0 \cup \ldots \cup \gamma_7$ represented in Figure 2. For 0 < ν < 1, the transfer function G(s) has no pole inside Γ and, thus, the integral of $G(s)e^{ts}$ over Γ is zero. With the operator G(s) being strictly proper, by Jordan's lemma, the integrals on the large circular arcs of radius R, R → ∞, can be neglected. Let $s = xe^{j\pi}$ and $x \in\, ]\infty, \omega_{\max}]$ on $\gamma_2$; thus, $ds = e^{j\pi}dx$. Let also $s = xe^{-j\pi}$ and $x \in [\omega_{\max}, \infty[$ on $\gamma_6$; thus, $ds = e^{-j\pi}dx$. Then the contributions of $\gamma_2$ and $\gamma_6$ to $J_{\gamma_2+\gamma_6}(t) = \int_{\gamma_2+\gamma_6} G(s)e^{ts}\,ds$ cancel and, thus, $J_{\gamma_2+\gamma_6}(t) = 0$.
The Laplace transform of relation (48), without considering initial conditions, is given by relation (49), and thus the transfer function $H_5^{T_f}(s)$ linking the input u(t) to the output y(t) is defined by relation (50). Figure 3 compares the gain diagram of $H_5^{T_f}(s)$ (computed numerically) with the gain diagram of $H_5(s)$ given by relation (34) in the domain of interest (the whole gain diagram is shown in Figure 1). It shows that the frequency response of $H_5^{T_f}(s)$ remains similar to that of $H_5(s)$ under the condition $T_f > 5/\omega_{\min}$. In such a situation, parameter $T_f$ can be viewed as the memory length of this kernel. Note that the memory length is connected to the corner frequency below which the kernel frequency response has a constant gain.
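To give an order of magnitude for the condition $T_f > 5/\omega_{\min}$, the sketch below computes the minimum memory length for a given corner frequency. The helper names are hypothetical. The second function reflects one possible reading of the factor 5, namely the common rule of thumb that a first-order mode with time constant $1/\omega_{\min}$ has decayed below 1% after five time constants; this interpretation is an assumption and is not stated in the text.

```python
import math


def minimum_memory_length(omega_min: float, factor: float = 5.0) -> float:
    """Smallest T_f (in seconds) satisfying T_f = factor / omega_min."""
    return factor / omega_min


def slowest_mode_residual(memory_length: float, omega_min: float) -> float:
    """Value of exp(-omega_min * t) at t = memory_length, i.e. what is left
    of a unit first-order mode with time constant 1/omega_min (assumed reading)."""
    return math.exp(-omega_min * memory_length)


if __name__ == "__main__":
    omega_min = 0.001  # rad/s, the value used for the gain diagrams
    t_f = minimum_memory_length(omega_min)
    print(t_f, slowest_mode_residual(t_f, omega_min))
```

With $\omega_{\min}$ = 0.001 rad/s, the threshold is about 5000 s, which illustrates how directly the memory length scales with the lowest corner frequency of the kernel.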

Conclusions
Dynamic systems with power law type long memory behaviours are generally modelled in the literature using fractional models. However, several studies have highlighted severe limitations of these models. This situation is, in fact, surprising as all the material required for power law type long memory behaviours existed before the introduction of fractional models, in the form of Volterra equations.
The paper first showed that the fractional pseudo state space description currently used in the literature is, in fact, a particular case of the Volterra equations. This class of equations was introduced by Volterra himself in the 1910s, that is, 20 years before the paper by Cole and Cole, which is perceived by some as the first application of fractional differentiation and models. Cole and Cole could have used this equation to model dispersion and absorption phenomena in dielectrics. As shown in the present paper, by defining a new class of model involving a Volterra equation, and by choosing appropriate kernels in the Volterra equation, various kinds of power law type long memory behaviours can be generated. One kernel was studied in more depth as it permits power law behaviour in a given frequency band and thus avoids infinitely small and infinitely large time constants. With this kernel, it was also shown that the model memory can be limited without affecting its input-output behaviour, provided that a condition linking the memory length and the slowest time constant of the kernel is met.
Some of the kernels proposed in Section 3 are continuous. These kernels can solve many of the questions currently raised by the classical and historical definitions of fractional operators. In relation to the list of drawbacks given in the introduction, with the analysed kernels associated with model (3) or (8):
2. The mathematical operators used in Equations (3) and (8) are clearly defined, and these equations also clearly define how the model past (initial conditions) should be taken into account;
3. The choice of kernel is free in Relations (3) and (8), and as shown in this paper with kernel $\eta_5(t)$ (Relation (24)), the model time constants can be limited within an interval, and as shown with relation (35), the model history can be limited;
4. Relations (3) and (8) are uniquely defined, so they cannot lead to different conclusions on the properties of the model;
5. Non-singular kernels can be used in relations (3) and (8), such as kernels $\eta_3(t)$ and $\eta_4(t)$, to produce power law type long memory behaviours;
6. Several physical interpretations can be found in the literature for Relation (3) (see, for instance, [34]).
In contradiction with the results published in [44], these kernels are not restrictive. The author of [44] reached this conclusion because their analysis was based on the idea that initial conditions can be defined using only information at time t = 0 on the variable to which the differential equation relates (u(x, 0) = φ(0) in [44]). As previously claimed, it was indeed shown in [9] and [11] that for the classical Caputo or Riemann-Liouville kernels, the entire past of this variable is required from −∞ to ensure proper initialisation of the equation. The problem is not continuous kernels, but how initialisation is defined. This is also in line with the work of Lorenzo and Hartley [45], in which an initialisation function was added to the equation solution to take into account the entire equation past. A similar situation holds in a distributed delay equation, in which the past must be known within the time interval on which the integral is defined [46].
To finish, the author argues that the current trend to redefine fractional differentiation operators in the hope of solving the limitations of fractional models, such as the one given by Relation (1) or more generally those based on fractional differential equations, is not a profitable strategy. It seems more interesting and general to work directly on integro-differential equations and select appropriate kernels. The author thus intends now to study in greater depth the properties of Volterra equations for power law type long memory behaviours.
Funding: This research received no external funding.