Matrix Algebraic Properties of the Fisher Information Matrix of Stationary Processes

In this survey paper, a summary of results which are to be found in a series of papers is presented. The subject of interest is focused on matrix algebraic properties of the Fisher information matrix (FIM) of stationary processes. The FIM is an ingredient of the Cramér-Rao inequality, and belongs to the basics of asymptotic estimation theory in mathematical statistics. The FIM is interconnected with the Sylvester, Bezout and tensor Sylvester matrices. Through these interconnections it is shown that the FIM of scalar and multiple stationary processes fulfills the resultant matrix property. A statistical distance measure involving entries of the FIM is presented. In quantum information, a different statistical distance measure is set forth. It is related to the Fisher information, but there only the information about one parameter in a particular measurement procedure is considered. The FIM of scalar stationary processes is also interconnected with the solutions of appropriate Stein equations, and conditions for the FIM to verify certain Stein equations are formulated. The presence of Vandermonde matrices is also emphasized.


Introduction
In this survey paper, a summary of results derived and described in a series of papers is presented. It concerns some matrix algebraic properties of the Fisher information matrix (abbreviated as FIM) of stationary processes. An essential property emphasized in this paper is the matrix resultant property of the FIM of stationary processes. To be more explicit, consider the coefficients of two monic polynomials p(z) and q(z) of finite degree as the entries of a matrix such that the matrix becomes singular if and only if the polynomials p(z) and q(z) have at least one common root. Such a matrix is called a resultant matrix and its determinant is called the resultant. The Sylvester, Bezout and tensor Sylvester matrices have this property and are extensively studied in the literature, see e.g., [1][2][3]. The FIM associated with various stationary processes will be expressed by these matrices. The interconnections are obtained by developing the necessary factorizations of the FIM in terms of the Sylvester, Bezout and tensor Sylvester matrices. These factored forms of the FIM enable us to show that the FIM of scalar and multiple stationary processes fulfills the resultant matrix property. Consequently, the singularity conditions of the appropriate Fisher information matrices and of the Sylvester, Bezout and tensor Sylvester matrices coincide; these results are described in [4][5][6].
A statistical distance measure involving entries of the FIM is presented and is based on [7]. In quantum information, a statistical distance measure is set forth, see [8][9][10]. It is related to the Fisher information, but there only the information about one parameter in a particular measurement procedure is considered. This leads to a challenging question: can the existing distance measure in quantum information be developed at the matrix level?
The matrix Stein equation, see e.g., [11], is associated with the Fisher information matrices of scalar stationary processes through the solutions of the appropriate Stein equations. Conditions for the Fisher information matrices or associated matrices to verify certain Stein equations are formulated and proved in this paper. The presence of Vandermonde matrices is also emphasized. The general and more detailed results are set forth in [12] and [13]. In this survey paper it is shown that the FIM of linear stationary processes forms a class of structured matrices. Note that in [14], the authors emphasize that statistical problems related to stationary processes have been treated successfully with the aid of Toeplitz forms.
This paper is organized as follows. The various stationary processes considered in this paper are presented in Section 2, and the Fisher information matrices of the stationary processes are displayed in Section 3. Section 3 also sets forth the interconnections between the Fisher information matrices and the Sylvester, Bezout, tensor Sylvester matrices, and solutions to Stein equations. A statistical distance measure is expressed in terms of entries of a FIM.

The Linear Stationary Processes
In this section we display the class of linear stationary processes whose corresponding Fisher information matrix shall be investigated in a matrix algebraic context. But first some basic definitions are set forth, see e.g., [15].
If a random variable X is indexed by time, usually denoted by t, the set of observations {X_t, t ∈ T} is called a time series, where T is a time index set (for example, T = Z, the set of integers).
Definition 2.1. A stochastic process is a family of random variables {X_t, t ∈ T} defined on a probability space (Ω, F, P).
Definition 2.2. The autocovariance function. If {X_t, t ∈ T} is a process such that Var(X_t) < ∞ (variance) for each t, then the autocovariance function γ_X(•, •) of {X_t} is defined by $\gamma_X(r, s) = \mathrm{Cov}(X_r, X_s) = E[(X_r - EX_r)(X_s - EX_s)]$, r, s ∈ Z, where E represents the expected value.
Definition 2.3. Stationarity. The time series {X_t, t ∈ Z}, with the index set Z = {0, ±1, ±2, ...}, is said to be stationary if (i) $E|X_t|^2 < \infty$ for all t ∈ Z, (ii) $E X_t = m$ for all t ∈ Z, where m is the constant average or mean, and (iii) $\gamma_X(r, s) = \gamma_X(r + t, s + t)$ for all r, s, t ∈ Z.
When the process is in addition Gaussian, as assumed in Section 3, it can be concluded from Definition 2.3 that the joint probability distributions of the random variables {X_{t_1}, X_{t_2}, ..., X_{t_n}} and {X_{t_1+k}, X_{t_2+k}, ..., X_{t_n+k}} are the same for arbitrary times t_1, t_2, ..., t_n, for all n and all lags or leads k = 0, ±1, ±2, .... The probability distribution of observations of such a stationary process is invariant with respect to shifts in time. In the next section the linear stationary processes that will be considered throughout this paper are presented.
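As an illustration of condition (iii) of Definition 2.3, the following minimal sketch computes the autocovariance of a linear process X_t = Σ_j ψ_j ε_{t−j} driven by uncorrelated noise with variance σ². The function name `linear_process_autocov` and the MA(1) example are illustrative assumptions and are not taken from the papers surveyed; the point is only that γ_X(r, s) depends on the lag h = |r − s| alone.

```python
import numpy as np

def linear_process_autocov(psi, sigma2, max_lag):
    """Autocovariance gamma(h) = sigma2 * sum_j psi[j] * psi[j + h], h = 0..max_lag.

    psi holds the finitely many weights of X_t = sum_j psi[j] * eps_{t-j}
    (hypothetical helper, for illustration only).
    """
    psi = np.asarray(psi, dtype=float)
    gammas = []
    for h in range(max_lag + 1):
        if h < len(psi):
            # Overlap of the weight sequence with itself shifted by h.
            gammas.append(sigma2 * float(np.dot(psi[: len(psi) - h], psi[h:])))
        else:
            gammas.append(0.0)
    return np.array(gammas)

# MA(1) example: X_t = eps_t + 0.5 * eps_{t-1}, unit noise variance.
gamma = linear_process_autocov([1.0, 0.5], sigma2=1.0, max_lag=3)
print(gamma)
```

The result depends on the lag only, never on the time t itself, which is the content of condition (iii).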

The Vector ARMAX or VARMAX Process
We display one of the most general linear stationary processes, the multivariate autoregressive, moving average and exogenous process, called the VARMAX process. To be more specific, consider the vector difference equation representation of a linear system {y(t), t ∈ Z}, of order (p, r, q),
$$\sum_{j=0}^{p} A_j\, y(t-j) = \sum_{j=0}^{r} C_j\, x(t-j) + \sum_{j=0}^{q} B_j\, \varepsilon(t-j), \qquad t \in \mathbb{Z}, \qquad (1)$$
where y(t) are the observable outputs, x(t) the observable inputs and ε(t) the unobservable errors; all are n-dimensional. The acronym VARMAX stands for vector autoregressive-moving average with exogenous variables. The left side of (1) is the autoregressive part, the second term on the right is the moving average part and x(t) is exogenous. If x(t) does not occur the system is said to be (V)ARMA. Next to exogenous, the input x(t) is also named the control variable, depending on the field of application: in econometrics and time series analysis, e.g., [15], and in signal processing and control, e.g., [16,17]. The matrix coefficients A_j ∈ R^{n×n}, C_j ∈ R^{n×n} and B_j ∈ R^{n×n} are the associated parameter matrices. We have the property A_0 ≡ B_0 ≡ C_0 ≡ I_n. Equation (1) can compactly be written as
$$A(z)\, y(t) = C(z)\, x(t) + B(z)\, \varepsilon(t), \qquad (2)$$
where
$$A(z) = \sum_{j=0}^{p} A_j z^j, \qquad C(z) = \sum_{j=0}^{r} C_j z^j, \qquad B(z) = \sum_{j=0}^{q} B_j z^j,$$
and we use z to denote the backward shift operator, for example z x_t = x_{t−1}. The matrix polynomials A(z), B(z) and C(z) are the associated autoregressive, moving average and exogenous matrix polynomials, of order p, q and r respectively. Hence the process described by Equation (2) is denoted as a VARMAX(p, r, q) process. Here z ∈ C, with a duplicate use of z as an operator and as a complex variable, which is usual in the signal processing and time series literature, e.g., [15,16,18]. The assumptions Det(A(z)) ≠ 0 for |z| ≤ 1 and Det(B(z)) ≠ 0 for |z| < 1, z ∈ C, are imposed so that the VARMAX(p, r, q) process (2) has exactly one stationary solution; the condition Det(B(z)) ≠ 0 implies the invertibility condition, see e.g., [15] for more details. Under these assumptions, the eigenvalues of the matrix polynomials A(z) and B(z) lie outside the unit circle. The eigenvalues of a matrix polynomial Y(z) are the roots of the equation Det(Y(z)) = 0, where Det(X) is the determinant of X. The VARMAX(p, r, q) stationary process (2) is thoroughly discussed in [15,18,19].
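The root condition on Det(A(z)) can be checked numerically. The sketch below assumes the convention A(z) = I + A_1 z + ... + A_p z^p with A_0 = I_n and uses the block companion matrix of (−A_1, ..., −A_p): its non-zero eigenvalues are the reciprocals of the roots of Det(A(z)) = 0, so all roots lie strictly outside the unit circle exactly when all companion eigenvalues lie strictly inside it. The helper names `companion` and `roots_outside_unit_circle` are hypothetical.

```python
import numpy as np

def companion(coeffs):
    """Block companion matrix for A(z) = I + A_1 z + ... + A_p z^p.

    coeffs = [A_1, ..., A_p], each an n x n array (illustrative helper).
    """
    coeffs = [np.asarray(c, dtype=float) for c in coeffs]
    p = len(coeffs)
    n = coeffs[0].shape[0]
    top = np.hstack([-c for c in coeffs])
    if p == 1:
        return top
    # Identity blocks on the sub-diagonal shift the state vector.
    lower = np.hstack([np.eye(n * (p - 1)), np.zeros((n * (p - 1), n))])
    return np.vstack([top, lower])

def roots_outside_unit_circle(coeffs, tol=1e-12):
    """True if every root of Det(A(z)) = 0 satisfies |z| > 1."""
    eig = np.linalg.eigvals(companion(coeffs))
    return bool(np.all(np.abs(eig) < 1.0 - tol))

# Det(A(z)) = (1 - 0.5 z)(1 - 0.2 z): roots z = 2 and z = 5.
stable = roots_outside_unit_circle([np.array([[-0.5, 0.0], [0.0, -0.2]])])
# Det(A(z)) = (1 - 1.5 z)(1 - 0.2 z): root z = 2/3 lies inside the unit circle.
unstable = roots_outside_unit_circle([np.array([[-1.5, 0.0], [0.0, -0.2]])])
print(stable, unstable)
```

The same check applies to B(z) when the strict form of the invertibility condition is wanted.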
The error {ε(t), t ∈ Z} is a collection of uncorrelated zero mean n-dimensional random variables, each having positive definite covariance matrix Σ, and we assume, for all s, t, $E_\vartheta\{x(s)\, \varepsilon^{\top}(t)\} = 0$, where X^⊤ denotes the transpose of matrix X and E_ϑ represents the expected value under the parameter ϑ. The matrix ϑ represents all the VARMAX(p, r, q) parameters, with the total number of parameters being n²(p + q + r). For different purposes, which will be specified in the next sections, two choices of the parameter structure are considered. First, the parameter vector ϑ ∈ R^{n²(p+q+r)×1} is defined by
$$\vartheta = \mathrm{vec}\,(A_1\ A_2\ \ldots\ A_p\ C_1\ C_2\ \ldots\ C_r\ B_1\ B_2\ \ldots\ B_q). \qquad (3)$$
The vec operator transforms a matrix into a vector by stacking the columns of the matrix one underneath the other according to $\mathrm{vec}\, X = \mathrm{col}(\mathrm{col}(X_{ij})_{i=1}^{n})_{j=1}^{n}$, see e.g., [2,20]. A different choice is set forth when the parameter matrix ϑ ∈ R^{n×n(p+q+r)} is of the form
$$\vartheta = (\vartheta_1\ \vartheta_2\ \ldots\ \vartheta_p\ \ \vartheta_{p+1}\ \vartheta_{p+2}\ \ldots\ \vartheta_{p+r}\ \ \vartheta_{p+r+1}\ \vartheta_{p+r+2}\ \ldots\ \vartheta_{p+r+q}) \qquad (4)$$
$$\phantom{\vartheta} = (A_1\ A_2\ \ldots\ A_p\ \ C_1\ C_2\ \ldots\ C_r\ \ B_1\ B_2\ \ldots\ B_q). \qquad (5)$$
Representation (5) of the parameter matrix has been used in [21]. The estimation of the matrices A_1, A_2, ..., A_p, C_1, C_2, ..., C_r, B_1, B_2, ..., B_q and Σ has received considerable attention in the time series and statistical signal processing literature, see e.g., [15,17,19]. In [19], the authors study the asymptotic properties of maximum likelihood estimates of the coefficients of VARMAX(p, r, q) processes, stored in an (ℓ × 1) vector ϑ, where ℓ = n²(p + q + r).
Before describing the control-exogenous variable x(t) used in this survey paper, we shall present the different special cases of the model described in Equations (1) and (2).
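The vec operator described above corresponds to column-major flattening. The following short sketch shows this with NumPy; the small 2 × 2 coefficient matrices and the ordering of the stacked blocks (autoregressive, exogenous, moving average) are assumptions made for illustration.

```python
import numpy as np

def vec(X):
    """Stack the columns of X underneath each other (column-major order)."""
    return np.asarray(X).reshape(-1, 1, order="F")

# Hypothetical coefficient matrices for n = 2 and p = r = q = 1.
A1 = np.array([[1, 2], [3, 4]])
C1 = np.array([[5, 6], [7, 8]])
B1 = np.array([[9, 10], [11, 12]])

# Parameter vector built from the n x n(p+q+r) parameter matrix.
theta = vec(np.hstack([A1, C1, B1]))
print(vec(A1).ravel())   # columns of A1, one after the other
print(theta.shape)       # n^2 (p + q + r) rows, one column
```

Note that NumPy flattens row by row unless `order="F"` is given, so the explicit order argument is what makes this agree with the definition of vec.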

The Vector ARMA or VARMA Process
When the process (2) does not contain the control process x(t), it yields
$$A(z)\, y(t) = B(z)\, \varepsilon(t), \qquad (6)$$
which is a vector autoregressive and moving average process, VARMA(p, q) process, see e.g., [15]. The matrix ϑ now represents all the VARMA parameters, with the total number of parameters being n²(p + q). The VARMA(p, q) version of the parameter vector ϑ defined in (3) is then given by
$$\vartheta = \mathrm{vec}\,(A_1\ A_2\ \ldots\ A_p\ B_1\ B_2\ \ldots\ B_q). \qquad (7)$$
The VARMA equivalent of the parameter matrix (4) is then the n × n(p + q) parameter matrix
$$\vartheta = (A_1\ A_2\ \ldots\ A_p\ \ B_1\ B_2\ \ldots\ B_q). \qquad (8)$$
A description of the input variable x(t) in Equation (2) follows. Generally, one can assume either that x(t) is non-stochastic or that x(t) is stochastic. In the latter case, we assume $E_\vartheta\{x(s)\, \varepsilon^{\top}(t)\} = 0$ for all s, t, and that statistical inference is performed conditionally on the values taken by x(t). In this case it can be interpreted as constant, see [22] for a detailed exposition. However, in the papers referred to in this survey, like [21] and [23], the observed input variable x(t) is assumed to be a stationary VARMA process of the form
$$\alpha(z)\, x(t) = \beta(z)\, \eta(t), \qquad (9)$$
where α(z) and β(z) are the autoregressive and moving average polynomials of appropriate degree and {η(t), t ∈ Z} is a collection of uncorrelated zero mean n-dimensional random variables, each having positive definite covariance matrix Ω. The spectral density of the VARMA process x(t) is R_x(•)/2π, for a definition see e.g., [15,16], with
$$R_x(e^{i\omega}) = \alpha^{-1}(e^{i\omega})\, \beta(e^{i\omega})\, \Omega\, \beta^{*}(e^{i\omega})\, \alpha^{-*}(e^{i\omega}), \qquad \omega \in [-\pi, \pi], \qquad (10)$$
where i is the imaginary unit with the property i² = −1, ω is the frequency, and X^* is the complex conjugate transpose of matrix X. The spectral density R_x(e^{iω}) is Hermitian, and we further have R_x(e^{iω}) ≥ 0 and $\int_{-\pi}^{\pi} R_x(e^{i\omega})\, d\omega < \infty$. As mentioned above, the basic assumption that x(t) and ε(t) are independent or at least uncorrelated processes, which corresponds geometrically to orthogonal processes, holds.

The ARMAX and ARMA Processes
The scalar equivalents of the VARMAX(p, r, q) and VARMA(p, q) processes, given by Equations (2) and (6) respectively, shall now be displayed, to obtain
$$a(z)\, y(t) = c(z)\, x(t) + b(z)\, \varepsilon(t) \qquad (11)$$
for the ARMAX(p, r, q) process and
$$a(z)\, y(t) = b(z)\, \varepsilon(t) \qquad (12)$$
for the ARMA(p, q) process, popularized in, among others, the Box-Jenkins type of time series analysis, see e.g., [15]. Here a(z), b(z) and c(z) are respectively the scalar autoregressive, moving average and exogenous polynomials, with corresponding scalar coefficients a_j, b_j and c_j,
$$a(z) = \sum_{j=0}^{p} a_j z^j, \qquad b(z) = \sum_{j=0}^{q} b_j z^j, \qquad c(z) = \sum_{j=0}^{r} c_j z^j. \qquad (13)$$
Note that, as in the multiple case, a_0 = b_0 = 1. The parameter vectors ϑ for the processes (11) and (12) are then
$$\vartheta = (a_1, \ldots, a_p, c_1, \ldots, c_r, b_1, \ldots, b_q)^{\top} \qquad (14)$$
and
$$\vartheta = (a_1, \ldots, a_p, b_1, \ldots, b_q)^{\top} \qquad (15)$$
respectively.
In the next section the matrix algebraic properties of the Fisher information matrix of the stationary processes (2), (6), (11) and (12) will be verified. Interconnections with various known structured matrices like the Sylvester resultant matrix, the Bezout matrix and the Vandermonde matrix are set forth. The Fisher information matrix of the various stationary processes is also expressed in terms of the unique solutions to the appropriate Stein equations.

Structured Matrix Properties of the Asymptotic Fisher Information Matrix of Stationary Processes
The Fisher information is an ingredient of the Cramér-Rao inequality, also called by some the Cauchy-Schwarz inequality in mathematical statistics, and belongs to the basics of asymptotic estimation theory in mathematical statistics. The Cramér-Rao theorem [24] is therefore considered. When assuming that the estimators of ϑ, defined in the previous sections, are asymptotically unbiased, the inverse of the asymptotic information matrix yields the Cramér-Rao bound, and provided that the estimators are asymptotically efficient, the asymptotic covariance matrix then verifies the inequality $\mathrm{Cov}\,\hat{\vartheta} \succeq I_{\vartheta}^{-1}$. Here I_ϑ is the FIM and Cov ϑ̂ is the covariance of ϑ̂, the unbiased estimator of ϑ; for a detailed fundamental statistical analysis, see [25,26]. The inverse of the FIM equals the Cramér-Rao lower bound, and the subject of the FIM is also of interest in the control theory and signal processing literature, see e.g., [27]. Its quantum analog was introduced immediately after the foundation of mathematical quantum estimation theory in the 1960s, see [28,29] for a rigorous exposition of the subject. More specifically, the Fisher information is also emphasized in the context of quantum information theory, see e.g., [30,31]. The Cramér-Rao inequality receives considerable attention because it is located on the boundary of statistics, information and quantum theory and, more recently, matrix theory. In the next sections, the Fisher information matrices of linear stationary processes will be presented and their role as a new class of structured matrices will be the subject of study.
When time series models are the subject, Equation (2) is used for all t ∈ Z to determine the residual ε(t), also written ε_t(ϑ) to emphasize the dependency on the parameter vector ϑ. Assuming that x(t) is stochastic and that (y(t), x(t)) is a Gaussian stationary process, the asymptotic FIM F(ϑ) is defined by the following (ℓ × ℓ) matrix, which does not depend on t,
$$F(\vartheta) = E_\vartheta\left\{ \left(\frac{\partial \varepsilon_t(\vartheta)}{\partial \vartheta^{\top}}\right)^{\!\top} \Sigma^{-1} \left(\frac{\partial \varepsilon_t(\vartheta)}{\partial \vartheta^{\top}}\right) \right\}, \qquad (16)$$
where ∂(•)/∂ϑ^⊤ denotes the (υ × ℓ) matrix of derivatives with respect to ϑ^⊤ of any (υ × 1) column vector (•), and ℓ is the total number of parameters. The derivative with respect to ϑ^⊤ is used for obtaining the appropriate dimensions. Equality (16) is used for computing the FIM of the various time series processes presented in the previous sections, and appropriate definitions of the derivatives are used, especially for the multivariate processes (2) and (6), see [21,22].

The Fisher Information Matrix of an ARMA(p, q) Process
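In this section, the focus is on the FIM of the ARMA process (12). When ϑ is given in Equation (15), the derivatives in Equation (16) are at the scalar level
$$\frac{\partial \varepsilon_t(\vartheta)}{\partial a_j} = \frac{1}{a(z)}\, \varepsilon_{t-j} \quad \text{for } j = 1, \ldots, p, \qquad \frac{\partial \varepsilon_t(\vartheta)}{\partial b_k} = -\frac{1}{b(z)}\, \varepsilon_{t-k} \quad \text{for } k = 1, \ldots, q.$$
When combined for all j and k, the FIM of the ARMA process (12), with the variance of the noise process ε_t(ϑ) equal to one, yields the block decomposition, see [32],
$$F(\vartheta) = \begin{pmatrix} F_{aa}(\vartheta) & F_{ab}(\vartheta) \\ F_{ba}(\vartheta) & F_{bb}(\vartheta) \end{pmatrix}. \qquad (17)$$
The expressions of the different blocks of the matrix F(ϑ) are
$$F_{aa}(\vartheta) = \frac{1}{2\pi i} \oint_{|z|=1} \frac{u_p(z)\, v_p^{\top}(z)}{a(z)\, \hat{a}(z)}\, dz, \qquad (18)$$
$$F_{ab}(\vartheta) = -\frac{1}{2\pi i} \oint_{|z|=1} \frac{u_p(z)\, v_q^{\top}(z)}{a(z)\, \hat{b}(z)}\, dz, \qquad (19)$$
$$F_{ba}(\vartheta) = -\frac{1}{2\pi i} \oint_{|z|=1} \frac{u_q(z)\, v_p^{\top}(z)}{b(z)\, \hat{a}(z)}\, dz, \qquad (20)$$
$$F_{bb}(\vartheta) = \frac{1}{2\pi i} \oint_{|z|=1} \frac{u_q(z)\, v_q^{\top}(z)}{b(z)\, \hat{b}(z)}\, dz, \qquad (21)$$
where the integration above and everywhere below is counterclockwise around the unit circle. The reciprocal monic polynomials â(z) and b̂(z) are defined as â(z) = z^p a(z^{−1}) and b̂(z) = z^q b(z^{−1}), and ϑ = (a_1, ..., a_p, b_1, ..., b_q)^⊤ was introduced in (15). For each positive integer k we have $u_k(z) = (1, z, \ldots, z^{k-1})^{\top}$ and $v_k(z) = z^{k-1} u_k(z^{-1})$. The stability condition of the ARMA(p, q) process implies that all the roots of the monic polynomials a(z) and b(z) lie outside the unit circle. Consequently, the roots of the polynomials â(z) and b̂(z) lie within the unit circle and will be used as the poles for computing the integrals (18)-(21) when Cauchy's residue theorem is applied. Notice that the FIM F(ϑ) is symmetric block Toeplitz, so that F_ab(ϑ) = F_ba^⊤(ϑ), and the integrands in (18)-(21) are Hermitian. The computation of the integral expressions (18)-(21) is easily implementable by using the standard residue theorem. The algorithms displayed in [33] and [22] are suited for numerical computations of, among others, the FIM of an ARMA(p, q) process.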
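As a numerical cross-check of the block structure, the FIM of an ARMA(p, q) process can also be approximated in the time domain instead of by contour integration. The sketch below is not the algorithm of [33] or [22]; it assumes unit noise variance and uses the scalar derivatives ∂ε_t/∂a_j = ε_{t−j}/a(z) and ∂ε_t/∂b_k = −ε_{t−k}/b(z), the latter obtained by differentiating ε_t = a(z) y(t)/b(z). Each derivative is expanded as a truncated moving average in ε, and the FIM is the Gram matrix of the resulting weight rows. The names `inverse_impulse_response` and `arma_fim` and the truncation length are hypothetical choices.

```python
import numpy as np

def inverse_impulse_response(coeffs, n_terms):
    """Power-series weights of 1 / (1 + c_1 z + ... + c_k z^k)."""
    h = np.zeros(n_terms)
    h[0] = 1.0
    for n in range(1, n_terms):
        for i, c in enumerate(coeffs, start=1):
            if i > n:
                break
            h[n] -= c * h[n - i]
    return h

def arma_fim(a, b, n_terms=2000):
    """Approximate FIM for theta = (a_1..a_p, b_1..b_q), unit noise variance.

    a and b hold the coefficients of a(z) = 1 + a_1 z + ... and
    b(z) = 1 + b_1 z + ...; the series are truncated after n_terms weights.
    """
    p, q = len(a), len(b)
    G = np.zeros((p + q, n_terms + max(p, q)))
    phi = inverse_impulse_response(a, n_terms)   # weights of 1/a(z)
    psi = inverse_impulse_response(b, n_terms)   # weights of 1/b(z)
    for j in range(1, p + 1):
        G[j - 1, j:j + n_terms] = phi            # d eps_t / d a_j
    for k in range(1, q + 1):
        G[p + k - 1, k:k + n_terms] = -psi       # d eps_t / d b_k
    return G @ G.T                               # expectation of the outer product

# ARMA(1,1) with a(z) = 1 - 0.5 z and b(z) = 1 + 0.3 z (no common root).
F = arma_fim([-0.5], [0.3])
# ARMA(2,1) with a(z) = (1 - 0.5 z)(1 + 0.2 z) and b(z) = 1 - 0.5 z (common root z = 2).
F_common = arma_fim([-0.3, -0.1], [-0.5])
print(F)
print(np.linalg.eigvalsh(F_common).min())
```

For the ARMA(1,1) case the entries agree with the closed forms 1/(1 − a_1²), −1/(1 − a_1 b_1) and 1/(1 − b_1²), and the second example has a smallest eigenvalue at rounding level, in line with the resultant property discussed in the next section.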

The Sylvester Resultant Matrix - The Fisher Information Matrix
The resultant property of a matrix is considered. To show that the FIM F(ϑ) has the matrix resultant property means to show that the matrix F(ϑ) becomes singular if and only if the appropriate scalar monic polynomials a(z) and b(z) have at least one common zero. To illustrate the subject, the following known property of two polynomials is set forth. The greatest common divisor (frequently abbreviated as GCD) of two polynomials is a polynomial, of the highest possible degree, that is a factor of both original polynomials; the roots of the GCD of two polynomials are the common roots of the two polynomials. Consider the coefficients of two monic polynomials p(z) and q(z) of finite degree as the entries of a matrix such that the matrix becomes singular if and only if the polynomials p(z) and q(z) have at least one common root. Such a matrix is called a resultant matrix and its determinant is called the resultant. Therefore we present the known (p + q) × (p + q) Sylvester resultant matrix S(a, b) of the polynomials a and b, see e.g., [2]. Consider the q × (p + q) and p × (p + q) upper and lower submatrices S_p(b) and S_q(−a) of the Sylvester resultant matrix S(−b, a). The matrix S(a, b) becomes singular in the presence of one or more common zeros of the monic polynomials a(z) and b(z); this property is assessed by the equalities (24) and (25) for the resultants R(a, b) and R(b, −a), where R(a, b) is the resultant of a(z) and b(z), and is equal to Det S(a, b). The string of equalities in (24) and (25) holds, see [34]. The zeros of the scalar monic polynomials a(z) and b(z) are α_i and β_j respectively and are assumed to be distinct. By this is meant, when we have factors $(z - \alpha_i)^{n_{\alpha_i}}$ and $(z - \beta_j)^{n_{\beta_j}}$ with the powers $n_{\alpha_i}$ and $n_{\beta_j}$ both greater than one, that only the distinct roots will be considered, free from the corresponding powers. The key property of the classical Sylvester resultant matrix S(a, b) is that its null space provides a complete description of the common zeros of the polynomials involved. In particular, in the scalar case the polynomials a(z) and b(z) are coprime if and only if S(a, b) is non-singular. The following key property of the classical Sylvester resultant matrix S(a, b) is given by the well known theorem on resultants, to obtain
$$\dim \mathrm{Ker}\, S(a, b) = \nu(a, b), \qquad (26)$$
where ν(a, b) is the number of common roots of the polynomials a(z) and b(z), counting multiplicities, see e.g., [3]. The dimension of a subspace V is represented by dim(V), and Ker(X) is the null space or kernel of the matrix X, also denoted by Null. The null space of an n × n matrix A with coefficients in a field K (typically the field of the real numbers or of the complex numbers) is the set Ker A = {x ∈ K^n : Ax = 0}, see e.g., [1,2,20].
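The theorem on resultants can be illustrated numerically. The sketch below builds a Sylvester matrix from coefficient vectors given in descending powers, which is one common textbook layout and may differ from the arrangement S(−b, a) used in [5] by row order and signs; the function names are hypothetical. The nullity of the matrix equals the number of common roots, and for monic coprime polynomials the absolute value of the determinant equals the product of all |α_i − β_j|.

```python
import numpy as np

def sylvester(a, b):
    """Sylvester matrix of two polynomials given in descending powers.

    a has degree p and b has degree q; the result is (p + q) x (p + q),
    with q shifted copies of a followed by p shifted copies of b.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    p, q = len(a) - 1, len(b) - 1
    S = np.zeros((p + q, p + q))
    for i in range(q):
        S[i, i:i + p + 1] = a
    for i in range(p):
        S[q + i, i:i + q + 1] = b
    return S

def nullity(M):
    """Dimension of the null space of a square matrix."""
    return M.shape[0] - np.linalg.matrix_rank(M)

a_poly = np.poly([0.5, -0.2])            # roots 0.5 and -0.2
b_coprime = np.poly([0.3, 0.7])          # no root shared with a_poly
b_one_common = np.poly([0.5, 0.7])       # shares the root 0.5
b_two_common = np.poly([0.5, -0.2])      # shares both roots

print(np.linalg.det(sylvester(a_poly, b_coprime)))
print(nullity(sylvester(a_poly, b_one_common)))
print(nullity(sylvester(a_poly, b_two_common)))
```

The rank computation relies on NumPy's default tolerance, which is adequate for this small, well-scaled example but may need tuning for polynomials of high degree.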
In order to prove that the FIM F(ϑ) fulfills the resultant matrix property, the factorization (27) of F(ϑ) in terms of the Sylvester resultant matrix S(−b, a) and a matrix P(ϑ) ∈ R^{(p+q)×(p+q)}, given by the integral expression (28), is derived in Lemma 2.1 of [5]. It is proved in [5] that the symmetric matrix P(ϑ) fulfills the property P(ϑ) ≻ O. The factorization (27) allows us to show the matrix resultant property of the FIM, as Corollary 2.2 in [5] states.
The FIM of an ARMA(p, q) process with polynomials a(z) and b(z) of order p and q respectively becomes singular if and only if the polynomials a(z) and b(z) have at least one common root. From Corollary 2.2 in [5] it can be concluded that the FIM of an ARMA(p, q) process and the Sylvester resultant matrix S(−b, a) have the same singularity property. By virtue of (26) and (27) we will specify the dimension of the null space of the FIM F(ϑ); this is set forth in the following lemma.
Proof. The matrix P(ϑ) ∈ R^{(p+q)×(p+q)}, given in (27), fulfills the property of positive definiteness, as proved in [5]. This implies that a Cholesky decomposition can be applied to P(ϑ), see [35] for more details, to obtain P(ϑ) = L^⊤(ϑ)L(ϑ), where L(ϑ) is an upper triangular matrix in R^{(p+q)×(p+q)} that is unique if its diagonal elements are all positive. Consequently, all its eigenvalues are positive, so that the matrix L(ϑ) is non-singular. The factorization (27) now admits the representation (30), and taking into account the property that if A is an m × n matrix, then Ker(A) = Ker(A^⊤A), yields, when applied to (30), the kernel of F(ϑ) in terms of the kernel of the Sylvester resultant matrix. We will now consider the Rank-Nullity Theorem, see e.g., [1]: if A is an m × n matrix, then
$$\dim(\mathrm{Ker}\, A) + \dim(\mathrm{Im}\, A) = n.$$
Combined with the fact that a square matrix and its transpose have kernels of equal dimension, this completes the proof.
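The argument of the proof can be mimicked on a small example. The sketch below assumes a factorization of the generic form F = S P S^⊤ with P positive definite; the exact arrangement of (27) and (30) is not reproduced here, and the matrices S and P are arbitrary illustrative choices rather than an actual Sylvester matrix or the matrix P(ϑ). It shows that the Cholesky factor turns F into a Gram matrix and that the nullity of F equals the nullity of S.

```python
import numpy as np

def nullity(M):
    """Dimension of the null space of a square matrix."""
    return M.shape[0] - np.linalg.matrix_rank(M)

# Illustrative singular matrix (second row is twice the first) and a
# symmetric positive definite matrix; both are hypothetical stand-ins.
S = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 1.0]])
P = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

# NumPy returns the lower factor; its transpose is the upper factor L
# with P = L^T L, as in the proof.
L = np.linalg.cholesky(P).T
F = S @ P @ S.T
A = L @ S.T                      # F = A^T A, hence Ker(F) = Ker(A)

print(np.allclose(F, A.T @ A))
print(nullity(F), nullity(S))
```

Because L is non-singular, Ker(A) = Ker(S^⊤), and a square matrix and its transpose have the same nullity, which is the step the Rank-Nullity Theorem supplies.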
Notice that the dimension of the null space of matrix A is called the nullity of A and the dimension of the image of matrix A, dim (Im A), is termed the rank of matrix A. An alternative proof to the one developed in Corollary 2.2 in [5], is given in a corollary to Lemma 3.1, reconfirming the resultant matrix property of the FIM F(ϑ).
Corollary 3.2. The FIM F(ϑ) of an ARMA(p, q) process becomes singular if and only if the autoregressive and moving average polynomials a(z) and b(z) have at least one common root.

The Statistical Distance Measure and the Fisher Information Matrix
In [7] statistical distance measures are studied. Most multivariate statistical techniques are based upon the concept of distance. For that purpose a statistical distance measure is considered that is a normalized Euclidean distance measure with entries of the FIM as weighting coefficients. The measurements x_1, x_2, ..., x_n are subject to random fluctuations of different magnitudes and therefore have different variabilities. It is then important to consider a distance that takes the variability of these variables or measurements into account when determining the distance from a fixed point. A rotation of the coordinate system through a chosen angle, while keeping the scatter of points given by the data fixed, is also applied, see [7] for more details. It is shown that when the FIM is positive definite, the appropriate statistical distance measure is a metric. In case of a singular FIM of an ARMA stationary process, the metric property depends on the rotation angle. The statistical distance measure is based on m parameters, unlike a statistical distance measure introduced in quantum information, see e.g., [8,9], that is also related to the Fisher information but where the information about one parameter in a particular measurement procedure is considered.
The straight-line or Euclidean distance between the stochastic vector x = (x_1 x_2 ... x_n)^⊤ and the fixed vector y = (y_1 y_2 ... y_n)^⊤, where x, y ∈ R^n, is given by
$$d(x, y) = \|x - y\| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2},$$
where the metric d(x, y) := ‖x − y‖ is induced by the standard Euclidean norm ‖•‖ on R^n, see e.g., [2] for the metric conditions. The observations x_1, x_2, ..., x_n are used to compute maximum likelihood estimates of the parameters ϑ_1, ϑ_2, ..., ϑ_m, where m < n. These estimated parameters are random variables, see e.g., [15]. The distance of the estimated vector ϑ ∈ R^m, given in (15), is studied. Entries of the FIM are inserted in the distance measure as weighting coefficients. A linear transformation is applied, where L_i(φ) ∈ R^{m×m} is the Givens rotation matrix with rotation angle φ, with 0 ≤ φ ≤ 2π and i ∈ {1, ..., m − 1}, see e.g., [36]. The matrix decomposition (35) is applied in order to obtain a transformed FIM, where F_φ(ϑ) and F(ϑ) are respectively the transformed and untransformed Fisher information matrices.
It is straightforward to conclude that, by virtue of (35), the transformed and untransformed Fisher information matrices F_φ(ϑ) and F(ϑ) are similar, since the rotation matrix L_i(φ) is orthogonal. Two matrices A and B are similar if there exists an invertible matrix X such that the equality AX = XB holds. As can be seen, the Givens matrix L_i(φ) involves only two coordinates that are affected by the rotation angle φ, whereas the other directions, which correspond to eigenvalues of one, are unaffected by the rotation matrix. By virtue of (35) it can be concluded that a positive definite FIM, F(ϑ) ≻ 0, implies a positive definite transformed FIM, F_φ(ϑ) ≻ 0. Consequently, the elements on the main diagonal of F(ϑ), f_{1,1}, f_{2,2}, ..., f_{m,m}, as well as the elements on the main diagonal of F_φ(ϑ), f_{1,1}(φ), f_{2,2}(φ), ..., f_{m,m}(φ), are all positive. However, the elements on the main diagonal of a singular FIM of a stationary ARMA process are also positive.
As developed in [7], combining (33) and (35) yields the distance measure (36) of the estimated parameters ϑ_1, ϑ_2, ..., ϑ_m, where f_{j,l} are entries of the FIM F(ϑ), whereas f_{i,i}(φ) and f_{i+1,i+1}(φ) are the transformed components, since the rotation affects only the entries (i, i) and (i + 1, i + 1), as can be seen in the matrix L_i(φ). In [7], the existence of inequalities is proved which guarantee the metric property of (36). When the FIM of an ARMA(p, q) process is the case, a combination of (27) and (35) for the ARMA(p, q) parameters given in (15) yields a factorization of the transformed FIM, where P(ϑ) is given by (28), in terms of a transformed Sylvester resultant matrix S_φ(−b, a). Proposition 3.5 in [7] proves that the transformed FIM F_φ(ϑ) and the transformed Sylvester matrix S_φ(−b, a) fulfill the resultant matrix property by using the equalities (40) and (39). In the next section a distance measure introduced in quantum information is discussed.
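The similarity argument can be sketched numerically. The code below assumes a square Givens rotation acting on coordinates i and i + 1 and a transformed matrix of the form L F L^⊤; the sign convention of the rotation, the order of the factors and the 3 × 3 matrix F are illustrative assumptions, not the exact expressions (34) and (35) of [7].

```python
import numpy as np

def givens(m, i, phi):
    """m x m Givens rotation acting on coordinates i and i + 1 (0-based)."""
    L = np.eye(m)
    c, s = np.cos(phi), np.sin(phi)
    L[i, i], L[i, i + 1] = c, s
    L[i + 1, i], L[i + 1, i + 1] = -s, c
    return L

# Hypothetical symmetric positive definite stand-in for a FIM.
F = np.array([[2.0, -0.5, 0.1],
              [-0.5, 1.5, 0.3],
              [0.1, 0.3, 1.0]])

L = givens(3, 0, np.pi / 6)
F_phi = L @ F @ L.T              # transformed matrix, similar to F

print(np.allclose(L @ L.T, np.eye(3)))       # the rotation is orthogonal
print(np.linalg.eigvalsh(F))
print(np.linalg.eigvalsh(F_phi))             # same spectrum as F
print(np.diag(F_phi))                        # diagonal entries stay positive
```

Only the rows and columns i and i + 1 of F change under the rotation, while the eigenvalues, and hence positive definiteness, are preserved.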

Statistical Distance Measure - Fisher Information and Quantum Information
In quantum information, the Fisher information, the information about a parameter θ in a particular measurement procedure, is expressed in terms of the statistical distance s, see [8][9][10]. The statistical distance used is defined as a measure to distinguish two probability distributions on the basis of measurement outcomes, see [37]. The Fisher information and the statistical distance are statistical quantities, and generally refer to many measurements, as is the case in this survey. However, in the quantum information theory and quantum statistics context, the problem setup is presented as follows. There may or may not be a small phase change θ, and the question is whether it is there. In that case one can design quantum experiments that give the answer unambiguously in a single measurement. The equality derived is of the form
$$F(\theta) = \left(\frac{ds}{d\theta}\right)^{2}, \qquad (41)$$
that is, the Fisher information is the square of the derivative of the statistical distance s with respect to θ. This is contrary to (36), where the square of the statistical distance measure is expressed in terms of entries of a FIM F(ϑ) which is based on information about m parameters estimated from n measurements, for m < n. A challenging question could therefore be formulated as follows: can a generalization of equality (41) be developed in a quantum information context but at the matrix level? To be more specific, the question concerns many observations or measurements that lead to more than one parameter, such that the corresponding Fisher information matrix is interconnected with an appropriate statistical distance matrix, a matrix whose entries are scalar distance measures. This question could equally be a challenge to algebraic matrix theory and to quantum information.

The Bezoutian - The Fisher Information Matrix
In this section an additional resultant matrix is presented; it concerns the Bezout matrix or Bezoutian. The notation of Lancaster and Tismenetsky [2] shall be used and the results presented are extracted from [38]. Assume the polynomials a and b given by a(z) = This matrix is often referred to as the Bezoutian. We will display a decomposition of the Bezout matrix B(a, b) developed in [38]. For that purpose the matrix U_φ and its inverse T_φ are presented, where φ is a given complex number, to obtain The following non-symmetric decomposition of the Bezoutian is derived, considering the notations above with a α 1 such that a α 1 u n (z) = a −1 similarly for b β 1 . Iteration gives the following expansion for the Bezout matrix where e_1^n is the first unit standard basis column vector in R^n; by e_j we denote the jth coordinate vector, e_j = (0, ..., 1, ..., 0)^⊤, with all its components equal to 0 except the jth component, which equals 1.
The following corollaries to Proposition 3.1 in [38] are now presented. Corollary 3.2 in [38] states the following. Let φ be a common zero of the polynomials a(z) and b(z). Then This is a direct consequence of (42), from which it can be concluded that the Bezoutian B(a, b) is non-singular if and only if the polynomials a(z) and b(z) have no common factors. A similar conclusion is drawn for the FIM in (27), so that the matrices F(ϑ) and B(a, b) have the same singularity property. Related to Corollary 3.2 in [38], we next give a description of the kernel or null space of the Bezout matrix.
Corollary 3.3 in [38] is now presented. Let φ_1, ..., φ_m be all the common zeros of the polynomials a(z) and b(z), with multiplicities n_1, ..., n_m. Let e_n be the last unit standard basis column vector in R^n and put $w_k^j = T_{\varphi_k}^{j} J^{j-1} e_n$ for k = 1, ..., m and j = 1, ..., n_k, where by J we denote the forward n × n shift matrix, J_{ij} = 1 if i = j + 1. Consequently, the subspace Ker B(a, b) is the linear span of the vectors w_k^j. An alternative representation to (27), involving the Bezoutian B(b, a) and derived in Proposition 5.1 in [38], is of the form (43), where H(ϑ) is given by (44),
$$M(b, a) = \begin{pmatrix} P & 0 \\ P\, S(\hat{a})\, P & P\, S(\hat{b})\, P \end{pmatrix},$$
and Q(ϑ) ≻ 0. The matrix S(â) is the symmetrizer of the polynomial â(z), in this paper a_0 = 1, see [2], and P is a permutation matrix. In [38] it is shown that the matrix Q(ϑ) is the unique solution to an appropriate Stein equation and is strictly positive definite. However, in the next section an explicit form of the Stein solution Q(ϑ) is developed. Some comments concerning the property summarized in Corollary 5.2 in [38] follow.
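The singularity and kernel properties of the Bezoutian can be checked on a small example. The sketch below assumes the classical generating-function definition, in which the entries B_ij of the Bezout matrix are the coefficients of x^i y^j in (a(x)b(y) − a(y)b(x))/(x − y), with both polynomials given in ascending powers and padded to a common degree n; the function name `bezout_matrix` is hypothetical and the decomposition of [38] is not reproduced.

```python
import numpy as np

def bezout_matrix(a, b):
    """Bezout matrix from coefficient vectors in ascending powers.

    For r > s the term (a_r b_s - a_s b_r)(x^r y^s - x^s y^r) divided by
    (x - y) contributes to the entries (s + t, r - 1 - t), t = 0..r-s-1.
    """
    n = max(len(a), len(b)) - 1
    a = np.pad(np.asarray(a, dtype=float), (0, n + 1 - len(a)))
    b = np.pad(np.asarray(b, dtype=float), (0, n + 1 - len(b)))
    B = np.zeros((n, n))
    for r in range(n + 1):
        for s in range(r):
            w = a[r] * b[s] - a[s] * b[r]
            for t in range(r - s):
                B[s + t, r - 1 - t] += w
    return B

# np.poly returns descending coefficients; reverse them to ascending order.
a_asc = np.poly([0.5, -0.2])[::-1]
b_coprime = np.poly([0.3, 0.7])[::-1]
b_common = np.poly([0.5, 0.7])[::-1]     # shares the zero 0.5 with a

B_reg = bezout_matrix(a_asc, b_coprime)
B_sing = bezout_matrix(a_asc, b_common)
u = np.array([1.0, 0.5])                 # (1, phi) for the common zero phi = 0.5

print(np.linalg.det(B_reg))
print(np.linalg.det(B_sing))
print(B_sing @ u)
```

The vector (1, φ, ..., φ^{n−1})^⊤ built from a common zero φ lies in the kernel, which follows directly from the generating function vanishing at y = φ.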
The matrix H(ϑ) is non-singular if and only if the polynomials a(z) and b(z) have no common factors. The proof is straightforward: since the matrix Q(ϑ) is non-singular, H(ϑ) is non-singular only when the Bezoutian B(b, a) is non-singular, and this holds if and only if the polynomials a(z) and b(z) have no common factors. The matrix M(b, a) is non-singular if a_0 ≠ 0 and b_0 ≠ 0, which is the case since a_0 = b_0 = 1. From (43) it can be concluded that the FIM F(ϑ) is non-singular only when the matrix H(ϑ) is non-singular or, by virtue of (44), when the Bezoutian B(b, a) is non-singular. Consequently, the singularity conditions of the Bezoutian B(b, a), the FIM F(ϑ) and the Sylvester resultant matrix S(b, −a) are equivalent. It can be concluded, by virtue of (29 ...

In [12], a link between the FIM of an ARMA process and an appropriate solution of a Stein equation is set forth. In this survey paper we shall present some of these results and confront them with results displayed in the previous sections. However, alternative proofs will be given to some results obtained in [12,38].

In this section an interconnection between the representation (27) of the FIM F(ϑ) and an appropriate solution to a Stein equation of the form (45), as developed in [12], is set forth. The distinct roots of the polynomials a(z) and b(z) are denoted by α_1, α_2, ..., α_p and β_1, β_2, ..., β_q respectively, such that the non-singularity of the FIM F(ϑ) is guaranteed. When Cauchy's residue theorem is applied to the integral expression (28), the factored representation (47) of P(ϑ) is obtained, equation (4.8) in [12], where

U(ϑ) = (u_{p+q}(α_1), u_{p+q}(α_2), ..., u_{p+q}(α_p), u_{p+q}(β_1), u_{p+q}(β_2), ..., u_{p+q}(β_q)),
Û(ϑ) = (v_{p+q}(α_1), v_{p+q}(α_2), ..., v_{p+q}(α_p), v_{p+q}(β_1), v_{p+q}(β_2), ..., v_{p+q}(β_q)),

D(ϑ) is a (p + q) × (p + q) diagonal matrix whose entries are indexed by the roots α_i, i = 1, ..., p, and β_j, j = 1, ..., q, and the polynomial p(·; β) is defined by p(z; β) = p(z)/(z − β). The matrices U(ϑ) and Û(ϑ) in (47) are the (p + q) × (p + q) Vandermonde matrices V_αβ and V̂_αβ respectively. It is clear that V_αβ and V̂_αβ are nonsingular when α_i ≠ α_j, β_k ≠ β_h and α_i ≠ β_k for all i, j = 1, ..., p with i ≠ j and k, h = 1, ..., q with k ≠ h. A systematic evaluation of the Vandermonde determinants Det V_αβ and Det V̂_αβ yields products of pairwise differences of the roots. Since V̂_αβ = P V_αβ, and given the configuration of the permutation matrix P, this leads to the equalities Det V̂_αβ = Det P · Det V_αβ and Det P = (−1)^{(p+q)(p+q−1)/2}, so that Det V̂_αβ = (−1)^{(p+q)(p+q−1)/2} Det V_αβ.

We shall now introduce an appropriate Stein equation of the form (45) such that an interconnection with P(ϑ) in (47) can be verified. For that purpose the (p + q) × (p + q) companion matrix C_g (48) is introduced, where the entries g_i are given by z^{p+q} + Σ_{i=1}^{p+q} g_i(ϑ) z^{p+q−i} = â(z) b̂(z) = ĝ(z, ϑ) and g(ϑ) is the vector g(ϑ) = (g_{p+q}(ϑ), g_{p+q−1}(ϑ), ..., g_1(ϑ))^T. Likewise, g(z, ϑ) = a(z)b(z), with the associated vector (g_1(ϑ), g_2(ϑ), ..., g_{p+q}(ϑ))^T. For the properties of a companion matrix see e.g. [36], [2]. Since all the roots of the polynomials a(z) and b(z) are distinct and lie within the unit circle, the products satisfy α_i β_j ≠ 1, α_i α_j ≠ 1 and β_i β_j ≠ 1 for all i = 1, 2, ..., p and j = 1, 2, ..., q. Consequently, the uniqueness condition of the solution of an appropriate Stein equation is verified.

The following Stein equation and its solution, according to (45) and (46), are now presented,

S − C_g S C_g^T = Γ,   S = (1/(2πi)) ∮_{|z|=1} (zI_{p+q} − C_g)^{−1} Γ (I_{p+q} − zC_g^T)^{−1} dz,   (49)

where the closed contour is now the unit circle |z| = 1 and the matrix Γ is of size (p + q) × (p + q). A more explicit expression of the solution S is

S = (1/(2πi)) ∮_{|z|=1} adj(zI_{p+q} − C_g) Γ adj(I_{p+q} − zC_g^T) / (â(z) b̂(z) a(z) b(z)) dz,

where adj(X) = X^{−1} Det(X) is the adjugate of the matrix X. When Cauchy's residue theorem is applied to the solution S in (49), the factored form (50) of S is derived, equation (4.9) in [12], in which D(ϑ) is given in (47). In its derivation a matrix rule involving the tensor (Kronecker) product ⊗ of two matrices is applied, see e.g. [2], [20]. Combining (47) and (50), and noting that the assumption α_i ≠ α_j, β_k ≠ β_h and α_i ≠ β_k implies that the inverses of the (p + q) × (p + q) Vandermonde matrices V_αβ and V̂_αβ exist, leads to Lemma 4.2 in [12].
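The role of the distinctness condition on the roots can be checked numerically. The following is a minimal sketch, not taken from [12]: it assumes that V_αβ has columns u_n(x) = (1, x, ..., x^{n−1})^T and that V̂_αβ is obtained by reversing the rows, and the root values are arbitrary illustrative choices.

```python
import numpy as np
from itertools import combinations

def vandermonde(nodes):
    """Matrix whose columns are u_n(x) = (1, x, ..., x^(n-1))^T, one per node."""
    return np.vander(np.asarray(nodes, dtype=float), increasing=True).T

def reversed_vandermonde(nodes):
    """Matrix whose columns are v_n(x) = (x^(n-1), ..., x, 1)^T (rows of V reversed)."""
    return vandermonde(nodes)[::-1, :]

def det_by_differences(nodes):
    """Classical Vandermonde determinant: product of (x_j - x_i) over i < j."""
    return float(np.prod([xj - xi for xi, xj in combinations(nodes, 2)]))

# Illustrative (assumed) roots: alphas for a(z), betas for b(z), all distinct.
alphas, betas = [0.5, 0.2], [0.3, -0.4]
nodes = alphas + betas
n = len(nodes)

V = vandermonde(nodes)
V_hat = reversed_vandermonde(nodes)
sign = (-1) ** (n * (n - 1) // 2)  # determinant of the row-reversal permutation

det_V = np.linalg.det(V)
det_V_hat = np.linalg.det(V_hat)

# A repeated root (here alpha_1 = beta_1) makes the matrix singular.
det_repeated = np.linalg.det(vandermonde([0.5, 0.2, 0.5, -0.4]))
```

Under these assumptions `det_V` agrees with the product of pairwise differences, `det_V_hat` equals `sign * det_V`, and the determinant vanishes as soon as two roots coincide.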
The following equality (51) holds true. Consequently, under the condition α_i ≠ α_j, β_k ≠ β_h and α_i ≠ β_k, and by virtue of (27) and (51), an interconnection between the FIM F(ϑ) and a solution S to an appropriate Stein equation is established. In the Stein equation (52), e_{p+q} is the last standard basis column vector in R^{p+q}, and e_i^m is the i-th unit standard basis column vector in R^m, with all its components equal to 0 except the i-th component, which equals 1. The next lemma is formulated.

Using the property of the companion matrix C_g, standard computation shows that the last column of adj(zI_{p+q} − C_g) is the basic vector u_{p+q}(z) and consequently the last column of adj(I_{p+q} − zC_g) is the basic vector v_{p+q}(z) = z^{p+q−1} u_{p+q}(z^{−1}). This implies that adj(zI_{p+q} − C_g) e_{p+q} = u_{p+q}(z) and e_{p+q}^T adj(I_{p+q} − zC_g^T) = v_{p+q}^T(z), or

S = (1/(2πi)) ∮_{|z|=1} u_{p+q}(z) v_{p+q}^T(z) / (â(z) b̂(z) a(z) b(z)) dz.

Consequently, the solution S to the Stein Equation (52) coincides with the matrix P(ϑ) defined in (28).
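The adjugate property used in this argument can be verified numerically. The sketch below assumes a particular companion layout (ones on the superdiagonal, last row (−g_{p+q}, ..., −g_1)), which is one common convention and may differ from the exact form of (48); the polynomial coefficients and the test point z are illustrative choices.

```python
import numpy as np

def companion_last_row(g):
    """Companion matrix of z^n + g[0] z^(n-1) + ... + g[n-1].

    Assumed layout: ones on the superdiagonal, last row (-g_n, ..., -g_1).
    """
    n = len(g)
    C = np.zeros((n, n))
    C[:-1, 1:] = np.eye(n - 1)
    C[-1, :] = -np.asarray(g, dtype=float)[::-1]
    return C

def adjugate(X):
    """adj(X) = Det(X) X^{-1}, valid for non-singular X."""
    return np.linalg.det(X) * np.linalg.inv(X)

# Monic product of two illustrative degree-two polynomials (assumed values).
g_full = np.polymul([1.0, -0.7, 0.1], [1.0, 0.1, -0.12])
C_g = companion_last_row(g_full[1:])
n = C_g.shape[0]
I = np.eye(n)

z = 0.7 + 0.2j                      # arbitrary test point, not a root
u = z ** np.arange(n)               # u_n(z) = (1, z, ..., z^(n-1))
v = z ** np.arange(n - 1, -1, -1)   # v_n(z) = (z^(n-1), ..., z, 1)

last_col_zI_minus_C = adjugate(z * I - C_g)[:, -1]
last_col_I_minus_zC = adjugate(I - z * C_g)[:, -1]
char_poly_at_z = np.linalg.det(z * I - C_g)
```

With this layout the last column of adj(zI − C_g) reproduces u_n(z), the last column of adj(I − zC_g) reproduces v_n(z), and Det(zI − C_g) equals the monic polynomial evaluated at z.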
The Stein equation that is verified by the FIM F(ϑ) will now be considered. For that purpose we display the p × p and q × q companion matrices C_a and C_b, where e_p^1 and e_q^1 are the first standard basis column vectors in R^p and R^q respectively. Consider the Stein equation S − K(ϑ) S K^T(ϑ) = BB^T (53), followed by the theorem.
Representation (54) is such that, in order to obtain a representation equivalent to the FIM F(ϑ) in (17), the transpose of the solution to the Stein Equation (53) is required, which gives (55). The symmetry property of the FIM F(ϑ) leads to S = F(ϑ). From the representation (55) it can be concluded that the solution S of the Stein Equation (53) coincides with the symmetric block Toeplitz FIM F(ϑ) given in (17). This completes the proof.
It is straightforward to verify that the submatrix (1,2) in (55) is the complex conjugate transpose of the submatrix (2,1), whereas each submatrix on the main diagonal is Hermitian; consequently, the integrand is Hermitian. This implies that, when the standard residue theorem is applied, F(ϑ) = F^T(ϑ).

An Illustrative Example of Theorem 3.6
To illustrate Theorem 3.6, the case of an ARMA(2, 2) process is considered. We will use the representation (17) for computing the FIM F(ϑ) of an ARMA(2, 2) process. The autoregressive and moving average polynomials are of degree two, p = q = 2, and the ARMA(2, 2) process is described by (56), where y(t) is the stationary process driven by white noise ε(t), a(z) and b(z) are the autoregressive and moving average polynomials, and the parameter vector is ϑ = (a_1, a_2, b_1, b_2)^T. The condition that the zeros of the polynomials are in absolute value smaller than one is imposed. The FIM F(ϑ) of the ARMA(2, 2) process (56) is of the block form (57), with submatrices F_aa(ϑ), F_ab(ϑ) and F_bb(ϑ). The submatrices F_aa(ϑ) and F_bb(ϑ) are symmetric and Toeplitz, whereas F_ab(ϑ) is Toeplitz. Without loss of generality, the symmetric block Toeplitz property holds for the class of Fisher information matrices of stationary ARMA(p, q) processes, where p and q are arbitrary finite integers that represent the degrees of the autoregressive and moving average polynomials, respectively. The appropriate companion matrices C_a and C_b and the 4 × 4 matrices K(ϑ) and BB^T are given in (58), where B = (1, 0, −1, 0)^T. It can be verified that the Stein equation F(ϑ) − K(ϑ) F(ϑ) K^T(ϑ) = BB^T holds true when F(ϑ) is of the form (57) and the matrices K(ϑ) and BB^T are given in (58).
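A numerical version of this check can be sketched as follows. Several conventions here are assumptions, since (17), (56) and (58) are not reproduced above: the process is taken as y(t) + a_1 y(t−1) + a_2 y(t−2) = ε(t) + b_1 ε(t−1) + b_2 ε(t−2) with unit noise variance, K(ϑ) is taken as the block diagonal matrix of C_a and C_b, and each companion matrix carries the negated coefficients in its first row. The FIM is computed independently from truncated impulse responses and compared with the Stein solution.

```python
import numpy as np

def companion_first_row(coeffs):
    """Companion matrix with first row -coeffs and ones on the subdiagonal (assumed layout)."""
    n = len(coeffs)
    C = np.zeros((n, n))
    C[0, :] = -np.asarray(coeffs, dtype=float)
    C[1:, :-1] = np.eye(n - 1)
    return C

def solve_stein_symmetric(K, G):
    """Solve S - K S K^T = G using vec(K S K^T) = (K kron K) vec(S)."""
    n = K.shape[0]
    lhs = np.eye(n * n) - np.kron(K, K)
    return np.linalg.solve(lhs, G.flatten(order="F")).reshape((n, n), order="F")

def impulse_response(coeffs, n_terms):
    """First n_terms power-series coefficients of 1 / (1 + c_1 z + c_2 z^2 + ...)."""
    h = np.zeros(n_terms)
    h[0] = 1.0
    for k in range(1, n_terms):
        for j, c in enumerate(coeffs, start=1):
            if k - j >= 0:
                h[k] -= c * h[k - j]
    return h

def fim_by_series(a, b, n_terms=200):
    """Covariance of the lagged filtered noises eps/a and -eps/b (unit noise variance)."""
    ha, hb = impulse_response(a, n_terms), impulse_response(b, n_terms)
    rows = [np.concatenate([np.zeros(i), ha[: n_terms - i]]) for i in range(len(a))]
    rows += [-np.concatenate([np.zeros(i), hb[: n_terms - i]]) for i in range(len(b))]
    H = np.array(rows)
    return H @ H.T

# Illustrative (assumed) parameters; the associated roots are 0.5, 0.2 and 0.3, -0.4.
a = [-0.7, 0.1]
b = [0.1, -0.12]

K = np.zeros((4, 4))
K[:2, :2] = companion_first_row(a)
K[2:, 2:] = companion_first_row(b)
B = np.array([[1.0], [0.0], [-1.0], [0.0]])

S = solve_stein_symmetric(K, B @ B.T)
F = fim_by_series(a, b)
```

Under these assumptions `S` and `F` agree to numerical precision, and the diagonal blocks of `S` are symmetric Toeplitz, in line with the structure described for (57).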

Some Additional Results
In Proposition 5.1 in [38], it is proved that the matrix Q(ϑ) in (44) fulfills the Stein Equation (59) and that Q(ϑ) ≻ 0. It states that when e_P^T = (e_1^T P, 0) = (e_n^T, 0_n^T) ∈ R^{2n}, where e_1 is the first unit standard basis column vector in R^n and e_n is the last, or n-th, unit standard basis column vector in R^n, the Stein equation admits the form (59). A corollary to Proposition 5.1 in [38] will be set forth, in which the involvement of various Vandermonde matrices in the explicit solution to Equation (59) is confirmed. For that purpose the Vandermonde matrices V_α and V̂_α are displayed in (60), where V_β and V̂_β have the same configuration as V_α and V̂_α respectively. A corollary to Proposition 5.1 in [38] is now formulated.

Corollary 3.7. An explicit expression of the solution to the Stein equation (59) is of the form (61), where the n × n and 2n × 2n diagonal matrices D_kl(ϑ) shall be specified in the proof.
Proof. The condition for a unique solution of the Stein Equation (59) is guaranteed since the eigenvalues of the companion matrices C_a and C_b, given respectively by the zeros of the polynomials â(z) and b̂(z), are in absolute value smaller than one. Consequently, the unique solution to the Stein Equation (59) exists and is given by the contour integral (62). In order to proceed, a matrix property is applied to Equation (62). Considering that the last column vectors of the matrices adj(zI_n − C_a) and adj(I_n − zC_a) are the vectors u_n(z) and v_n(z) respectively, and applying the standard residue theorem, expressions for the submatrices Q_11(ϑ), Q_12(ϑ), Q_21(ϑ) and Q_22(ϑ) are obtained in terms of n × n and 2n × 2n diagonal matrices. It is clear that the first and third matrices in Q_11(ϑ), Q_12(ϑ), Q_21(ϑ) and Q_22(ϑ) are the appropriate Vandermonde matrices displayed in (60); it can be concluded that the representation (61) is verified. This completes the proof.
In this section an explicit form of the solution Q(ϑ), expressed in terms of various Vandermonde matrices, is displayed. Also, an interconnection between the Fisher information matrix F(ϑ) and appropriate solutions to Stein equations and related matrices is presented. Proofs are given for the cases where the Stein equations are verified by the FIM F(ϑ) and the associated matrix P(ϑ). These are alternatives to the proofs developed in [38]. The presence of various forms of Vandermonde matrices is also emphasized. In the next section some matrix properties of the FIM of an ARMAX process are presented.

The Fisher Information Matrix of an ARMAX(p, r, q) Process
The FIM of the ARMAX process (11) is set forth according to [4]. The derivatives in the corresponding representation (16) are taken with respect to the parameters a_j, c_l and b_k, where j = 1, ..., p, l = 1, ..., r and k = 1, ..., q. Combining all j, l and k yields the FIM G(ϑ) in block form, where the submatrices of G(ϑ) are given by (63) and R_x(z) is the spectral density of the process x(t), defined in (10). Combining all the expressions in (63) leads to the representation (64) of G(ϑ) as the sum of two matrices, where (X)* is the complex conjugate transpose of the matrix X ∈ C^{m×n}. As in (23), the Sylvester matrix S(−c, a) is decomposed into blocks; here S_p(c) is formed by the top p rows of S(−c, a). The representation (64) can then be expressed by the appropriate block representations of the Sylvester resultant matrices, to obtain (65), where the matrix P(ϑ) is given in (28) and the matrix W(ϑ) ∈ R^{(p+r)×(p+r)} is of the form (66). It is shown in [4] that W(ϑ) ≻ O. As can be seen in (65), the ARMAX part is explained by the first term, whereas the ARMA part is described by the second term; the combination of both terms summarizes the Fisher information of an ARMAX(p, r, q) process. The FIM G(ϑ) in the form (65) allows us to prove the following property, Theorem 3.1 in [4]: the FIM G(ϑ) of the ARMAX(p, r, q) process with polynomials a(z), c(z) and b(z) of order p, r and q respectively becomes singular if and only if these polynomials have at least one common root. Consequently, the class of resultant matrices is extended by the FIM G(ϑ).

The Stein Equation - The Fisher Information Matrix of an ARMAX(p, r, q) Process

In Lemma 3.5 it is proved that the matrix P(ϑ) in (28) fulfills the Stein Equation (52). We will now consider the conditions under which the matrix W(ϑ) in (66) verifies an appropriate Stein equation. For that purpose we consider the spectral density to be of the form R_x(z) = 1/(h(z)h(z^{−1})). The degree of the polynomial h(z) is ℓ and we assume the distinct roots of the polynomial h(z) to lie outside the unit circle; consequently, the roots of the polynomial ĥ(z) lie within the unit circle. We therefore rewrite W(ϑ) accordingly. We consider a companion matrix of the form (48) with size p + q + ℓ, denoted by C_f, whose entries f_i are given by z^{p+q+ℓ} + Σ_{i=1}^{p+q+ℓ} f_i(ϑ) z^{p+q+ℓ−i} = â(z) b̂(z) ĥ(z) = f̂(z, ϑ), and f(ϑ) is the vector f(ϑ) = (f_{p+q+ℓ}(ϑ), f_{p+q+ℓ−1}(ϑ), ..., f_1(ϑ))^T. Likewise, f(z, ϑ) = a(z)b(z)h(z), with the associated vector (f_1(ϑ), f_2(ϑ), ..., f_{p+q+ℓ}(ϑ))^T. The properties Det(zI_{p+q+ℓ} − C_f) = â(z) b̂(z) ĥ(z) and Det(I_{p+q+ℓ} − zC_f) = a(z)b(z)h(z) hold. Assume

r = q + ℓ, or p + q + ℓ = p + r and r > q.   (67)

W(ϑ) is then of the form (68). We will formulate a Stein equation with the matrix Γ = e_{p+r} e_{p+r}^T, which is of the form

S − C_f S C_f^T = e_{p+r} e_{p+r}^T,   (69)

where e_{p+r} is the last standard basis column vector in R^{p+r}. The next lemma is formulated. Consequently, the matrix W(ϑ) defined in (68) verifies the Stein Equation (69). This completes the proof.
The matrices P(ϑ) and W(ϑ) in (65) verify, under specific conditions, appropriate Stein equations, as has been shown in Lemma 3.5 and Lemma 3.8, respectively. We will now confirm the presence of Vandermonde matrices by applying the standard residue theorem to W(ϑ) in (68), to obtain the factored form (70). It involves a (p + r) × (p + r) diagonal matrix R(ϑ), defined through φ(z) = a(z)b(z)h(z) with i = 1, ..., p, j = 1, ..., q and k = 1, ..., ℓ, and the (p + r) × (p + r) Vandermonde matrices V_αβξ and V̂_αβξ. The matrices V_αβξ and V̂_αβξ are nonsingular when the roots of the three polynomials are pairwise distinct, for all i, j = 1, ..., p, k, h = 1, ..., q and m, n = 1, ..., ℓ.

Lemma 3.9. Assume the conditions (67) hold and consider the representations of P(ϑ) and W(ϑ) in (47) and (70) respectively. This leads to an alternative form of (65).

In Lemma 3.9, the FIM G(ϑ) is expressed by submatrices of two Sylvester matrices and various Vandermonde matrices; both types of matrices become singular if and only if the appropriate polynomials have at least one common root.

The Fisher Information Matrix of a Vector ARMA(p, q) Process
The process (6) is considered, and we assume that {y(t), t ∈ N} is a zero mean Gaussian time series and {ε(t), t ∈ N} is an n-dimensional vector random variable, such that E_ϑ{ε(t)} = 0 and E_ϑ{ε(t)ε^T(t)} = Σ, and the parameter vector ϑ is of the form (7). In [6] it is shown that representation (16) for the n²(p + q) × n²(p + q) asymptotic FIM of the VARMA process (6) takes the form (71), where ∂ε/∂ϑ^T is of size n × n²(p + q) and, for convenience, t is omitted from ε(t). Using the differential rules outlined in [6] yields the expression (72) for ∂ε/∂ϑ^T. The substitution of representation (72) in (71) yields the FIM of a VARMA process. The purpose is to construct a factorization of the FIM F(ϑ) that is a multiple variant of the factorization (27), so that a multiple resultant matrix property can be proved for F(ϑ). As illustrated in [6], the multiple version of the Sylvester resultant matrix (22) does not fulfill the multiple resultant matrix property. In that case, even when the matrix polynomials A(z) and B(z) have a common zero or a common eigenvalue, the multiple Sylvester matrix is not necessarily singular. This has also been illustrated in [3]. In order to consider a multiple equivalent to the resultant matrix S(−b, a), Gohberg and Lerer set forth the n²(p + q) × n²(p + q) tensor Sylvester matrix S^⊗(−B, A) (73). In [3], the authors prove that the tensor Sylvester matrix S^⊗(−B, A) fulfills the multiple resultant property: it becomes singular if and only if the matrix polynomials A(z) and B(z) have at least one common zero. In Proposition 2.2 in [6], the factorized form (74) of the Fisher information matrix F(ϑ) is developed, with the matrix polynomials Θ(z) and σ(z) given in (75). In order to obtain a multiple variant of (27), the matrix M(ϑ) in (76) is introduced; it contains the matrix P(ϑ) of (77), which is a multiple variant of the matrix P(ϑ) in (28). In Lemma 2.3 in [6], it is proved that the matrix M(ϑ) in (76) becomes singular if and only if the matrix polynomials A(z) and B(z) have at least one common eigenvalue-zero. The proof is a
multiple equivalent of the proof of Corollary 2.2 in [5], since the equality (76) is a multiple version of (27). Consequently, the matrix M(ϑ), like the tensor Sylvester matrix S^⊗(−B, A), fulfills the multiple resultant matrix property. Since the matrix M(ϑ) is derived from the FIM F(ϑ), this enables us to prove that the matrix F(ϑ) fulfills the multiple resultant matrix property by showing that it becomes singular if and only if the matrix M(ϑ) is singular; this is done in Proposition 2.4 in [6]. Consequently, it can be concluded from [6] that the FIM F(ϑ) of a VARMA process and the tensor Sylvester matrix S^⊗(−B, A) have the same singularity conditions. The FIM F(ϑ) of a VARMA process can therefore be added to the class of multiple resultant matrices.
A brief summary of the contribution of [6] follows. In order to show that the FIM F(ϑ) of a VARMA process is a multiple resultant matrix, two new representations of the FIM are derived. To construct such representations, appropriate matrix differential rules are applied. The newly obtained representations are expressed in terms of the multiple Sylvester matrix and the tensor Sylvester matrix. The representation of the FIM expressed by the tensor Sylvester matrix is used to prove that the FIM becomes singular if and only if the autoregressive and moving average matrix polynomials have at least one common eigenvalue. It then follows that the FIM and the tensor Sylvester matrix have equivalent singularity conditions. In a numerical example it is shown, however, that the FIM fails to detect common eigenvalues due to numerical instability. The tensor Sylvester matrix reveals them clearly, which demonstrates the usefulness of the results derived in [6].
3.9. The Fisher Information Matrix of a Vector ARMAX(p, r, q) Process

The n²(p + q + r) × n²(p + q + r) asymptotic FIM of the VARMAX(p, r, q) process (2) is displayed according to [23] and is an extension of the FIM of the VARMA(p, q) process (6). To obtain the term ∂ε/∂ϑ^T in (78), of size n × n²(p + q + r), the same differential rules are applied as for the VARMA(p, q) process. In Proposition 2.3 in [23], the representation of the FIM of a VARMAX process is expressed in terms of tensor Sylvester matrices; this is obtained when ∂ε/∂ϑ^T in (78) is substituted in (16), which gives (79). The matrices in (79) are of the form (80); additionally we have Ψ(z) = R_x(z) ⊗ σ(z), where the Hermitian spectral density matrix R_x(z) is defined in (10) and the matrix polynomials Θ(z) and σ(z) are presented in (75). In (80), we have the pn² × (p + q)n² and qn² × (p + q)n² submatrices S^⊗_p(−B) and S^⊗_q(A) of the tensor Sylvester resultant matrix S^⊗_{p,q}(−B, A), whereas the matrices S^⊗_p(−C) and S^⊗_r(A) are the upper and lower blocks of the (p + r)n² × (p + r)n² tensor Sylvester resultant matrix S^⊗_{p,r}(−C, A). As for the FIM of the VARMA(p, q) process, the objective is to construct a multiple version of (65); this is done in [23], to obtain (81), where P(ϑ) is given in (77). Note that the matrices Φ_x(z), Λ_x(z), L(z) and W(z) are the corrected versions of the corresponding matrices in [23].
A parallel between the scalar and multiple structures is straightforward. This is best illustrated by comparing the representations (27) and (28) with (76) and (77) respectively, confronting the FIM for scalar and vector ARMA(p, q) processes. The FIM of the scalar ARMAX(p, r, q) process contains an ARMA(p, q) part; this is confirmed by (65), through the presence of the matrix P(ϑ), which is originally displayed in (28). The multiple resultant matrices M(ϑ) and M_x(ϑ), derived from the FIM of the VARMA(p, q) and VARMAX(p, r, q) processes respectively, both contain the matrix P(ϑ) of (77), whereas the first matrix terms of the matrices Φ(z) and Φ_x(z), which are of different size, consist of the same nonzero submatrices. To summarize, in [23] compact forms of the FIM of a VARMAX process expressed in terms of multiple and tensor Sylvester matrices are developed. The tensor Sylvester matrices allow us to investigate the multiple resultant matrix property of the FIM of VARMAX(p, r, q) processes. However, since no proof of the multiple resultant matrix property of the FIM G(ϑ) has been given yet, a conjecture is considered: the FIM G(ϑ) of a VARMAX(p, r, q) process becomes singular if and only if the matrix polynomials A(z), B(z) and C(z) have at least one common eigenvalue. A proof that is a multiple equivalent to Theorem 3.1 in [4], combined with Proposition 2.4 in [6] but based on the representations (79) and (81), can be envisaged and will be a subject for future study.

Conclusions
In this survey paper, matrix algebraic properties of the FIM of stationary processes are discussed. The presented material is a summary of papers where several matrix structural aspects of the FIM are investigated. The FIM of scalar and multiple processes like the (V)ARMA(X) are set forth with appropriate factorized forms involving (tensor) Sylvester matrices. These representations enable us to prove the resultant matrix property of the corresponding FIM. This has been done for (V)ARMA(p, q) and ARMAX(p, r, q) processes in the papers [4][5][6]. The development of the stages that lead to the appropriate factorized form (79) of the FIM G(ϑ) is set forth in [23]. However, no proof has yet been given that confirms the multiple resultant matrix property of the FIM G(ϑ) of a VARMAX(p, r, q) process. This justifies the conjecture formulated in the previous section, which can be a subject for future study.
The statistical distance measure derived in [7] involves entries of the FIM. This distance measure can be a challenge to its quantum information counterpart (41), because (36) involves information about m parameters estimated from n measurements, whereas in quantum information, as in e.g. [8][9][10], the information about one parameter in a particular measurement procedure is considered for establishing an interconnection with the appropriate statistical distance measure. A possible approach that combines matrix algebra and quantum information, for developing a statistical distance measure in quantum information or quantum statistics at the matrix level, can be a subject of future research. Some results concerning interconnections between the FIM of ARMA(X) models and appropriate solutions to Stein matrix equations are discussed; the material is extracted from the papers [12] and [13]. However, in this paper some alternative and new proofs, which emphasize the conditions under which the FIM fulfills appropriate Stein equations, are set forth. The presence of various types of Vandermonde matrices is also emphasized when an explicit expansion of the FIM is computed. These Vandermonde matrices appear in the interconnections with appropriate solutions to Stein equations. This explains why, when the matrix algebraic structures of the FIM of stationary processes are investigated, the involvement of structured matrices like the (tensor) Sylvester, Bezoutian and Vandermonde matrices is essential.

Lemma 3.1. Assume that the polynomials a(z) and b(z) have ν(a, b) common roots, counting multiplicities. The factorization (27) of the FIM and the property (26) enable us to prove the equality dim(Ker F(ϑ)) = dim(Ker S(b, −a)) = ν(a, b).

Proof. By virtue of the equality (31), combined with the property Det S^T(b, −a) = Det S(b, −a) and the matrix resultant property of the Sylvester matrix S(b, −a), it follows that Det S^T(b, −a) = 0 ⇐⇒ Ker S^T(b, −a) ≠ {0} if and only if the ARMA(p, q) polynomials a(z) and b(z) have at least one common root. Equivalently, Det S^T(b, −a) ≠ 0 ⇐⇒ Ker S^T(b, −a) = {0} if and only if the ARMA(p, q) polynomials a(z) and b(z) have no common roots. Consequently, by virtue of the equality Ker F(ϑ) = Ker S^T(b, −a), it can be concluded that the FIM F(ϑ) becomes singular if and only if the ARMA(p, q) polynomials a(z) and b(z) have at least one common root. This completes the proof.
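The kernel dimension statement can be illustrated on the Sylvester side alone. The following sketch uses the classical Sylvester matrix of two polynomials given by their coefficient lists (highest degree first); this layout is an assumption and may differ from the paper's S(b, −a) in row ordering and signs, which does not affect the rank. The polynomials are illustrative choices.

```python
import numpy as np

def sylvester(a, b):
    """Classical (p+q) x (p+q) Sylvester matrix of polynomials a (degree p) and b (degree q)."""
    p, q = len(a) - 1, len(b) - 1
    S = np.zeros((p + q, p + q))
    for i in range(q):              # q shifted copies of the coefficients of a
        S[i, i:i + p + 1] = a
    for i in range(p):              # p shifted copies of the coefficients of b
        S[q + i, i:i + q + 1] = b
    return S

def nullity(a, b):
    """Dimension of the kernel of the Sylvester matrix."""
    S = sylvester(a, b)
    return S.shape[0] - np.linalg.matrix_rank(S)

a = [1.0, -0.7, 0.1]             # (z - 0.5)(z - 0.2)
b_shared = [1.0, -0.2, -0.15]    # (z - 0.5)(z + 0.3): one root in common with a
b_coprime = [1.0, 0.1, -0.12]    # (z - 0.3)(z + 0.4): no root in common with a

nu_shared = nullity(a, b_shared)
nu_coprime = nullity(a, b_coprime)
nu_identical = nullity(a, a)     # two common roots
```

In these three cases the nullity equals the number of common roots (1, 0 and 2), which is the quantity ν(a, b) that Lemma 3.1 also attaches to Ker F(ϑ).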

Let a(z) = Σ_{j=0}^{n} a_j z^j and b(z) = Σ_{j=0}^{n} b_j z^j, cf. (13) but where p = q = n, and we further assume a_0 = b_0 = 1. The Bezout matrix B(a, b) of the polynomials a and b is defined by the relation a(z)b(w) − a(w)b(z) = (z − w) u_n^T(z) B(a, b) u_n(w). Let (1 − α_1 z) and (1 − β_1 z) be a factor of a(z) and b(z) respectively, so that α_1 and β_1 are zeros of â(z) and b̂(z). Consider the factored form of the nth order polynomials a(z) and b(z), a(z) = (1 − α_1 z)a_{−1}(z) and b(z) = (1 − β_1 z)b_{−1}(z) respectively. Proceeding this way for α_2, ..., α_n yields the recursion a_{−(k−1)}(z) = (1 − α_k z)a_{−k}(z), and equivalently for the polynomials b_{−k}(z), with a_0(z) = a(z) and b_0(z) = b(z). Proposition 3.1 in [38] is presented.

The Stein matrix equation is now set forth. Let A ∈ C^{m×m}, B ∈ C^{n×n} and Γ ∈ C^{n×m} and consider the Stein equation

S − BSA = Γ.   (45)

It has a unique solution if and only if λµ ≠ 1 for any λ ∈ σ(A) and µ ∈ σ(B), where the spectrum of a matrix D ∈ C^{m×m} is σ(D) = {λ ∈ C : det(λI_m − D) = 0}, the set of eigenvalues of D. The unique solution is given in the next theorem [11].

Theorem 3.4. Let A and B be such that there is a single closed contour C with σ(B) inside C and, for each non-zero w ∈ σ(A), w^{−1} outside C. Then for an arbitrary Γ the Stein Equation (45) has the unique solution

S = (1/(2πi)) ∮_C (λI_n − B)^{−1} Γ (I_m − λA)^{−1} dλ.   (46)

By virtue of (27) and (51), an interconnection involving the FIM F(ϑ), a solution S to an appropriate Stein equation, the Sylvester matrix S(b, −a) and the Vandermonde matrices V_αβ and V̂_αβ is established. It is clear that, by using the expression (43), the Bezoutian B(a, b) can be inserted in equality (51). We will formulate a Stein equation with the matrix Γ = e_{p+q} e_{p+q}^T,

S − C_g S C_g^T = e_{p+q} e_{p+q}^T.   (52)
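The agreement between the Stein equation (45) and the contour integral (46) can be checked numerically. The sketch below is an illustration under stated assumptions rather than the method of [11]: the equation is solved through the standard identity vec(BSA) = (A^T ⊗ B) vec(S), the contour is taken to be the unit circle, and the integral is approximated by an equally spaced quadrature rule. The matrices are arbitrary choices with all eigenvalues inside the unit circle.

```python
import numpy as np

def solve_stein(B, A, Gamma):
    """Solve S - B S A = Gamma via (I - A^T kron B) vec(S) = vec(Gamma), column-major vec."""
    n, m = Gamma.shape
    lhs = np.eye(n * m) - np.kron(A.T, B)
    s = np.linalg.solve(lhs, Gamma.flatten(order="F"))
    return s.reshape((n, m), order="F")

def stein_by_contour(B, A, Gamma, n_points=512):
    """Approximate (1/(2 pi i)) * contour integral over |lam| = 1 of
    (lam I - B)^{-1} Gamma (I - lam A)^{-1} d lam, using lam = exp(i theta)."""
    n, m = Gamma.shape
    total = np.zeros((n, m), dtype=complex)
    for theta in 2 * np.pi * np.arange(n_points) / n_points:
        lam = np.exp(1j * theta)
        left = np.linalg.solve(lam * np.eye(n) - B, Gamma)   # (lam I - B)^{-1} Gamma
        # d lam = i lam d theta, so the factor 1/(2 pi i) leaves a weight lam / n_points
        total += (left @ np.linalg.inv(np.eye(m) - lam * A)) * lam
    return (total / n_points).real

# Illustrative (assumed) data: spectra of A and B lie inside the unit circle.
B = np.array([[0.5, 0.1],
              [0.0, 0.3]])
A = np.array([[0.2, 0.0, 0.0],
              [0.1, 0.4, 0.0],
              [0.0, 0.0, -0.5]])
Gamma = np.ones((2, 3))

S_linear = solve_stein(B, A, Gamma)
S_contour = stein_by_contour(B, A, Gamma)
residual = S_linear - B @ S_linear @ A - Gamma
```

For this data `residual` vanishes to rounding error and the two solutions coincide, as Theorem 3.4 predicts when σ(B) lies inside the contour and the reciprocals of the non-zero eigenvalues of A lie outside it.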

Proof. The unique solution of (52) is, according to (46),

S = (1/(2πi)) ∮_{|z|=1} (zI_{p+q} − C_g)^{−1} e_{p+q} e_{p+q}^T (I_{p+q} − zC_g^T)^{−1} dz,

or, written more explicitly,

S = (1/(2πi)) ∮_{|z|=1} adj(zI_{p+q} − C_g) e_{p+q} e_{p+q}^T adj(I_{p+q} − zC_g^T) / (a(z)b(z) â(z) b̂(z)) dz.

Theorem 3.6. The Fisher information matrix F(ϑ) in (17) coincides with the solution to the Stein equation (53).

Proof. The eigenvalues of the companion matrices C_a and C_b are respectively the zeros of the polynomials â(z) and b̂(z), which are in absolute value smaller than one. This implies that the unique solution of the Stein Equation (53) exists and is given by

S = (1/(2πi)) ∮_{|z|=1} (zI_{p+q} − K(ϑ))^{−1} BB^T (I_{p+q} − zK^T(ϑ))^{−1} dz.

This integral expression is then developed in a more explicit form. Considering the form of the companion matrices C_a and C_b, straightforward computation shows that the first column of adj(zI_p − C_a) is the basic vector v_p(z) and consequently the first column of adj(I_p − zC_a) is the basic vector u_p(z). Proceeding equivalently for the companion matrix C_b, this yields the representation (54).