Numerical Inverse Transformation Methods for Z-Transform

: Numerical inverse Z-transformation (NIZT) methods have been efﬁciently used in engineering practice for a long time. In this paper, we compare the abilities of the most widely used NIZT methods, and propose a new variant of a classic NIZT method based on contour integral approximation, which is efﬁcient when the point of interest (at which the value of the function is needed) is smaller than the order of the NIZT method. We also introduce a vastly different NIZT method based on concentrated matrix geometric (CMG) distributions that tackles the limitations of many of the classic methods when the point of interest is larger than the order of the NIZT method.


Introduction
Z-transformation is one of the most frequently used non-linear transformations for describing discrete time series [1]. In several engineering and applied mathematical fields, non-linear transformations provide a compact description of the system behavior. Many practically important operations, e.g., discrete convolution, are much easier to handle in Z-transform domain, which makes the use of Z-transform domain system description widespread in many fields. While a lot of important characteristics (e.g., poles, frequency response, initial/final values, etc.) can be obtained directly from Z-transform domain description, there are plenty of practically important cases where explicit time domain values are needed. In this case, inverse Z-transformation (IZT) must be performed.
In some special cases, symbolic IZT is feasible, but in a wide range of practically important cases numerical IZT (NIZT) is required to compute the time domain values based on the Z-transform domain description. This paper focuses on the problem of NIZT.
Since NIZT has been widely applied in practice for a long time, its literature is rather rich. In this paper, we consider only the most efficient methods of the recent literature [2]. These methods are introduced in the subsequent discussions and are used also for numerical comparison.
Section 2 is devoted to the general setup of Z-transformation. Section 3 gives a brief review of general methods in the literature. From among the few NIZT methods available in the literature, one stands out, described by several authors independently [2][3][4]. We will refer to this method as the CIR method, and it is described in Section 4 along with an interpretation based on Contour Integral approximation starting from the positive Real axis. Section 5 provides a variant of this method with the starting point of the Contour Integral Shifted, referred to as the CIS method. Section 6 provides an entirely new method, referred to as the CMG method, based on Concentrated Matrix Geometric (CMG) where C is a counterclockwise closed path encircling the origin and entirely in the region of convergence, and i is the complex unit. The Z-transform has a natural scaling property: which allows any NIZT method to be applied to g (az) instead of g (z) to approximate a −T g(T) (and then g(T) by multiplying by a T ). In some cases, the scaling with a has special analytical interpretation; we provide such interpretation for contour integral methods. In the numerical section we study the effect of a for all the methods included in the present paper.
In this work, we assume that g(t) is real for any non-negative integer t. This assumption is well-suited for most practical applications and implies g (z) =ḡ (z) (wherez denotes the complex conjugate of z). Consequently, it is sufficient to evaluate g (z) only at one of each complex conjugate pair.

General Inverse Z-Transformation Methods
In this section, we provide a short overview of various NIZT methods proposed in the literature that do not use the contour integral in (2) for the inverse Z-transformation. The contour integral-based NIZT approach of [2][3][4] will be discussed separately in Section 4.

Inverse Transformation Based on Moments
In [5], Tagliani proposes a method for NIZT that requires the availability of a finite number of the transform's derivatives. The derivatives are used to calculate the moments of g(t), and based on these moments an approximating "analytical form" is calculated. Consequently, while the author presents it as an inverse Z-transform method, it is more of a moment fitting algorithm. The benefit of Tagliani's approach is that it can be used as long as the moments of g(t) are obtainable even if only a functional equation is available for the Z-transform. However, the performance of the method is significantly worse than numerical integration-based methods both in terms of precision and in computation time.

Inverse Transformation Based on Orthogonal Decomposition
Rajković et al. propose an inversion method in [6] that approximates g(t) with Parameter q can be chosen freely, but should be slightly smaller than 1 (in the numerical experiments of [6] q = 3/4 and q = 5/6 are used). The b q,n,k coefficients are calculated as The c n coefficients are calculated as The above c n parameters will minimize ||g(t) − g N (t)|| 2 for the given ϕ q (t) set of series. The idea behind the method is that the b q,n,k parameters are chosen such that ϕ (1) q (t) = δ j,k r(q) ∀j, k ∈ N, where δ j,k is the Kronecker-delta (δ j,k = 1 if j = k and δ j,k = 0 otherwise) and r(q) is a function of q. It is the consequence of this orthogonality that the c n values calculated as above are optimal (in 2-norm).

Inverse Transformation Based on a Linear System of Equations
The g(t) series can also be approximated based on a simple truncation of the Z-transform presented by Merrikh-Bayat in [4]. From the definition of the Z-transform we have By using this approximation in the z 1 , z 2 , . . . , z m set of points chosen from the ROC we obtain If m = N, the above equation has a unique solution (assuming that the matrix is non-singular, which is true if z j = z k , ∀j = k). The issue with the m = N case is that it leads to an ill-conditioned problem, thus Merrikh-Bayat proposes to choose m ≈ 1.1N then minimize ||Ag t − g z || 2 , where g t = [g(0), . . . , g(N)], g z = [g (z 1 ), . . . , g (z m )], and Minimization of ||Ag t − g z || 2 is a least squares problem, which is done using QR decomposition or singular value decomposition in [4]. These have a computational cost of O(m 2 N)(= O(N 3 ) when m ≈ 1.1N) for an m × N matrix, thus this method is computationally expensive compared to numerical integration-based methods. The error in (4) is small when g(t) is rapidly decaying and N is large enough so that g(t), 0 ≤ t < N captures most of the significant values in the sequence. Accordingly, numerical experiments show that the method gives best results for rapidly decaying g(t) sequences when N is sufficiently large; however, increasing N further does not seem to improve the error. Overall, the non-vanishing error renders the applicability of this method limited. This is addressed further in Section 7.

Contour Integral-Based Inverse Z-Transformation Methods
Equation (2) can be rewritten as where is the complex unit. Equation (6) corresponds to the case when C is the circle of radius a in (2). a is actually equivalent to the scaling parameter in (3), as shown in Remark 4. Contour integral-based methods, in general, approximate integral (5) with the finite sum where ω k , k = 0, 1, 2, . . . , N define a properly chosen partition of [0, 2π] and N is the order of the approximation. This method is described in [3,4,7] in a slightly different manner independently from each other. In theory any ω k partition could be chosen, but the above papers use exclusively. Since the ω k 's are equidistant with ω k − ω k−1 = 2π N , (6) can be further simplified as where The β k parameters are referred to as nodes in the rest of the paper. Choosing the nodes according to (9) has several consequences: 1. Since the β k s consist of complex conjugate pairs (along with real numbers 1 (and also (−1) for even N)), approximation (8) is guaranteed to be real. 2. In case one of the aβ k values coincides with a pole of g , (8) cannot be evaluated (even if they are close, there is numerical instability). A typical example is when g has a pole at 1 and a = 1. 3. Selecting a to be larger than the limit of absolute convergence c guarantees that (8) avoids all poles of g . That said, selecting a too large may also cause issues: depending on the function g (and g ) evaluating g (aβ k ) with sufficient precision might be difficult; also, the large factor a T may cause numerical instability if there are cancellations in the sum in (8). The choice of a that provides the most accurate estimate for g(T) is highly dependent on g (and g ) and can be difficult to determine in general. 4. We have showing that in accordance with (3), (8) is indeed the same numerical method applied to g (az) instead of g (z). 5. Due to complex conjugate pairs, the actual number of evaluations of g to compute (8) is (N + 1)/2 . However, we stick with N in the notation as N is in general a better indicator for the properties of the approximation.

Interpretation Using Approximate Dirac function
This section provides a different interpretation of the approximating function g N (T). Substituting (1) into (8) gives where (10) tells us that g N (t) is a convolution of the original g( ) with the function f N,a ( ); g N will approximate g well if f N,a ( ) is close to the function δ 0 (which is defined to be 1 at 0 and 0 at every other integer).
At integer points f N,a ( ) simplifies to thus f N,a (0) = 1, and the closest non-zero values are at ±N. As N → ∞, f N,a ( ) converges to δ 0 at integer points. (10) and (12) also ensure that g N (T) is N-periodic except for the exponential factor introduced by a, i.e., This periodic behavior indicates that for non-periodic g(t) functions the g N (T) approximation might be poor. The error of the approximation is Remarks about (14): 1. As long as N > T, the first sum vanishes, since f N,a (T) = 0 for T = 1, 2, . . . , N − 1.
2. Based on (12), (14) can be written as (15) can be intuitively understood as cutting the sequence g(T) into sections of length N, shifting them over the same interval and summing them, see Figure 1. The first section (K = 0) corresponds to values of the original function over the interval 0 ≤ T < N, while the rest corresponds to the error. One consequence of (15) is that for non-decaying or slowly decaying z(T) sequences, selecting a > 1 is necessary to ensure fast decay of the error. 3. If a > 1, then the first sum in (15) is magnified and the second sum is diminished. This is particularly useful when N > T, since in that case the first sum vanishes and the second sum can be diminished arbitrarily (at the cost of loss of precision, more on this later). 4. If a < 1, the first sum is diminished, and the second sum is magnified. 5. If N ≤ T, the first sum in (15) does not vanish, and is magnified when choosing a < 1. However, choosing a > 1 magnifies the second sum in (15) instead; altogether, this results in a non-vanishing error regardless of the choice of a. Consequently, the classic method (or any contour integral-based method) has significant approximation error when N ≤ T, as shown by the numerical results in Section 7. 6. When g( ) > 0, according to (12) the second sum has positive terms only, so there are no cancellations. On the other hand, the approximation preserves nonnegativity, which might be a relevant property in certain applications. 7. If g( ) has positive and negative values alternating, then there can be cancellations according to (15) which reduce the corresponding error.

Shifting the Nodes of the CIR Method
In this section, we present a variant of the CIR method, referred to as CIS method, where the nodes are shifted by a half period compared to (7), i.e., we propose to use in (6) instead of (7). Figures 2 and 3 display the positioning of the β k nodes for the CIR and the β (2) k nodes for the CIS methods for even and odd values of N.
Theorem 1. Applying the β (2) k nodes partition in the contour integral-based NIZT according to (8), results in the NIZT procedure whose error is N,a ( ) converges to δ 0 at integer points.   In this section, we present a variant of the CIR method, referred to as nodes are shifted by a half period compared to (7). That is, we propose to us in (6)  1. Sections shifted and summed according to (15)    Proof. Applying the β (2) k values in (8) gives (17). Similar to (10), we have where which satisfies (19) in integer points. Subtracting g(T) from both sides of (20) gives (18).
Remarks about (18): 1. f N,a (0) = 1, and the closest non-zero values are at ±N, which are negative. 2. As long as N > T, the first sum vanishes, since f 4. If a > 1, then the first sum is magnified, and the second sum is diminished. This is particularly useful when N > T, since in that case the first sum vanishes, and the second sum can be diminished arbitrarily. 5. If a < 1, the first sum is diminished, and the second sum is magnified. 6. When g(T) > 0, (22) has alternating terms, so there are cancellations in the second sum, reducing the corresponding error. This is particularly useful when g has a pole at c, since the β (2) k values do not include a after the node shift, so (17) can still be evaluated with a = c, and the error from the second sum will be smaller due to cancellations. On the other hand, unlike for CIR, the approximation does not necessarily preserve nonnegativity. 7. If g(T) is alternating, then there are no cancellations in (22). 8. Due to complex conjugate pairs, the actual number of evaluations of g necessary to compute (17) is N/2 . However, just like for CIR, we stick with N as the notation.

Concentrated Matrix Geometric Distribution-Based Inverse Transformation
As we have seen in Remark 5 of (14), contour integral-based approximation methods are generally ill-suited to approximate g(T) when the node number is smaller than T, i.e., N ≤ T. If increasing N further is not feasible (e.g., because the computational cost of evaluating g (z) N times gets too large, or due to the precision loss of the procedure with the given floating point number representation), then a different NIZT procedure is needed.
In this section, we propose an approach for NIZT based on concentrated matrix geometric distributions, which we thus call CMG method and is the application of the CME (concentrated matrix exponential) method [12] for discrete time. The CME method is a numerical inverse Laplace transformation (NILT) procedure that uses the Abate-Whitt framework [13]. In the following, we first introduce the Abate-Whitt framework, then we present the CME method, finally we show how it can be applied to discrete time to obtain the proposed CMG method.

Abate-Whitt Framework for Numerical Inverse Laplace Transformation
The Laplace-transform of a function h(t) is defined as The inverse transform problem is to find an approximate value of function h at point T (i.e., h(T)) based on h * (s). The Abate-Whitt framework uses the following form for this approximation: This approximation has a simple interpretation based on the reformulation where If f n (t) was the Dirac impulse function at point T, then the Laplace inversion would be perfect, but depending on the weights η k and nodes β k , the function f n (t) only approximates the Dirac impulse function with a given accuracy. There are multiple different types of f n (t) functions that can be used for the approximation [12].

CME Method
In the CME method the probability density function (pdf) of a matrix exponential distribution is chosen as the f n (t) function. The class of matrix exponential (ME) distributions of order N contains positive random variables with pdf of the form where α is a real row vector of length N, A is a real matrix of size N × N, and 1 is a column vector of ones of size N [14]. To ensure ∞ 0 f X (t)dt = 1, α and A are such that α1 = 1 and the eigenvalues of A have negative real part.
Nonnegativity of f X (t) does not follow from (25), but some (α, A) pairs result in f X (t) functions that are non-negative for t ≥ 0 [15]. When A is diagonalizable with spectral decomposition A = ∑ N k=1 u k λ k v k , where λ k are the eigenvalues, u k are the right eigenvectors and v k are the left eigenvectors of A for k = 1, . . . N, then f X can be written as with η k = c k and β k = −λ k . Comparing (26) and (24) shows that ME distributions with diagonalizable matrix A can be used in the place of f n (t) to obtain an ILT method of the Abate-Whitt framework. As mentioned before, the primary task when using the Abate-Whitt framework is to approximate the Dirac impulse with the f n (t) function as closely as possible. The squared coefficient of variation (SCV) measures how concentrated a non-negative normalized function on R + is, and it is a good indicator of the quality of the approximation. The SCV of f can be calculated as Function f with SCV( f ) = 0 is the Dirac function and the smaller SCV( f ) is, the better f approximates the Dirac function. The parameters of CME distributions with low SCV have been calculated for up to order 1000 [15] and can be accessed at [16].
The CME method has several advantages compared to other NILT methods [12]. It is more stable numerically, provides smooth, over-and under-shooting free approximation even for discontinuous functions and, contrary to other methods of the family, its precision gradually improves when increasing its order (N). The application of the CME method for NIZT is discussed in the next section.

CMG Method
A discrete counterpart of the CME method is formulated in the following theorem.
Theorem 2. For a discrete function g : Z → R, defining the continuous function and applying the CME inverse Laplace transformation method at point T with weights η k and nodes β k , results in the NIZT procedure Proof. The inverse Laplace transformation ofĝ(t) according to (23) can be written aŝ Based on this last expression, we can expressĝ N (T) from g (z) using its definition in (1) aŝ From the definition of the Z-transform we have g(0) = g (0), thus (28) takes the form of (27).

General Comparison of the NIZT Methods
For this comparison, we included the NIZT methods from the previous sections which provide the best results. Ort1 and Ort2 refer to the method presented in Section 3.2 and [6] (using the suggested q = 3/4 and q = 5/6 parameters), based on orthogonal functions. MB refers to the method in Section 3.3, based on matrix pseudoinverse calculation. Table 2 compares the error of all methods for the list of functions in Table 1 for N = 64 and T max = 32. In the table,"∼0" means practical zero, a value smaller than 10 −100 , "p.inf." stands for practical infinity, denoting errors larger than 10 2 , while "n/a" means not applicable due to a pole of g which is evaluated by the given NIZT method (e.g., the CIR method with a = 1 fails for functions with a pole at 1). All calculations related to Table 2 were carried out using high precision (200 digits) floating point arithmetic.
According to Remark 4 at the end of Section 4 (and Remark 5 at the end of Section 5), as long as T max < N, the accuracy of the contour integral-based methods CIR and CIS improves as a is increased, and this also helps avoiding the pole at 1 for the CIR method. Table 3 displays this effect by setting a = 2 instead of a = 1.
Based on Tables 2 and 3, we conclude that the Ort1 method gives the most precise results, the CIR, CIS give precise results when a is sufficiently large, the Ort2 and MB methods are unreliable, and CMG is relatively reliable for a = 1 (although the error is not as small as for CIR, CIS and Ort1), but unreliable for a = 2. Altogether, one might have the impression that Ort1 is the best method; however, we have not yet examined other important questions like precision loss or running time. In the next subsection, precision loss is examined along with a more detailed analysis of the role of a.
Due to their unreliability (c.f. Tables 2 and 3) methods Ort2 and MB are excluded from further investigations.

Precision Loss and the Effect of a
For a more detailed view on the performance of the methods CIR, CIS, Ort1, and CMG, we investigate their accuracy as a function of parameter a. Table 4 contains the error and precision loss (p.l., in digits, calculated using Wolfram Mathematica) for the Polynomial(1/t) function with N = 64, T max = 32, and a taking the values 1/2, 1, 11/10, 2, 4.
For contour integral methods CIR and CIS, as long as N > T, setting a to a larger value will diminish the error in the second term of (14) and (18), while the first term cancels out entirely. For a = 1/2, the approximations in Table 4 are poor. This is because for a < 1, the error in the tail is magnified. If g(t) is rapidly decaying, then the error introduced by a = 0.5 is small, but a = 1 or a > 1 are still better choices.
For a = 1, the "n/a" values for the CIR method are due to the pole of the function at 1. This is where the CIS method has an advantage: the shifted nodes avoid the pole when a = 1, so it gives a meaningful result.
For a > 1, the error decreases rapidly for the contour integral methods CIR and CIS as the error in the tail is diminished, at the cost of increased precision loss. Interestingly, the precision loss is of similar order as the error. (This is not necessarily the case in general, but the precision loss does seem to increase rapidly with a in general.) The error for the CIS method is slightly smaller than for the CIR due to cancellations (see Remark 6 after (18)), but the difference is practically negligible.
The Ort1 method, while gives the lowest error, suffers from huge precision losses. Notably, for any calculations with precision smaller than 142 digits, Ort1 would give meaningless results due to precision loss. The precision loss is inherent to the Ort1 method due to the highly fluctuating orthogonal functions involved. Overall, we recommend avoiding the use of the Ort1 method due to its unpredictable high precision loss, and instead we recommend using either CIR or CIS when T < N, with a set to as large as possible, depending on the precision loss tolerated.
Interestingly, with the Polynomial(1/t) function the CMG method works best when setting a = 1. An intuitive explanation of this property is as follows. For the CMG method, a > 1 enlarges the errors that are caused by the non-zero g(t) values for t < T and diminishes errors that are caused by the non-zero g(t) values for t > T, and a < 1 has the opposite effect. Since Polynomial(1/t) is a rather flat function the effect of enlarging the error is more dominant for both a > 1 and a < 1. This property might slightly change for steeper functions, but, in general, we recommend using a = 1 for the CMG method. We also note that the CMG method involves practically no loss of precision, so it can be used efficiently even with standard precision floating point arithmetic. Table 5 investigates the error and the precision loss for various Z-transform g functions using the CIR, CIS and Ort1 methods with parameters N = 64, T max = 32, a = 2. We note that the CMG method fails due to a > 1. From Table 5, we conclude that the precision loss heavily depends on the g function. As a result, high precision calculations with precision loss check is recommended for NIZT with the CIR, CIS and Ort1 methods. However, with high precision calculations, the precision loss remains tolerable for CIR and CIS.

Numerical Properties of NIZT Methods When T ≥ N
Finally, we examine the case when T ≥ N. This case is relevant when sampling of the Z-transform g is costly, but we still need to approximate g(T) for large values of T.

Failure of Contour Integral Methods
To start off, we display the remarks at the end of Section 4.2 in practice. Table 6 shows the result of applying the CIR approximation method to the z-transform of the Poisson(1) distribution with order N = 4 and various choices of a, compared with the actual values of the Poisson distribution.
For a = 1, the approximation is relatively accurate up to T = N − 1 = 3, but there is a sharp change in the behavior at T = N: the approximation is periodic with a period of N, rendering the approximation values for T ≥ N useless (see also Figure 1). For a = 2, the approximation is even more accurate up to T = N − 1, but the error is magnified for T ≥ N (due to the scaling by a N in accordance with (13), e.g., g N (4) = 5.9014 = a 4 · g N (0) = 16 · 0.3688 in this case). For a = 1/2, the error for the T ≥ N terms is diminished, but the approximation up to T = N − 1 is much less accurate. Altogether, the CIR method cannot be used to obtain an approximation suitable for both T < N and T ≥ N. The CIS method suffers from the same issues. Since the error of the cases when T < N has been considered in Section 7.1, in this section we define the error as Table 7 contains the error of each method for various functions. The parameters are T max = 32, N = 16 and a either 1 or 1.1. In this table, "p.inf." marks elements larger than 10 3 . We note that the MB method is not applicable with these parameters, since the pseudoinverse calculation is only applicable when T max < 1.1N.
Based on Table 7, we conclude that as long as T ≥ N, the error of the CIR and CIS methods ( (14) and (18)) is large either in first or the second term depending on whether a < 1 or a > 1, and the Ort1 and Ort2 methods are also unreliable. Only the CMG method gives meaningful results for the case when T ≥ N. As for the value of a, we recommend using 1 as generally that seems to give a reliably low error.  Table 8 contains the errors with parameters T max = 512, N = 256 and a = 1. At this parameter setting, all other methods except CMG fail, while CMG gives a reasonably good approximation.

Conclusions
In this paper, we collected many of the NIZT methods from the literature and presented two new methods. One of them, the CIS method, is a variant of the contour integral-based CIR method and the other one, the CMG method, is inherited from numerical inverse Laplace transformation.
A wide numerical investigation on these NIZT methods indicated that different methods are accurate when the order of the method is higher than the required parameter (N > T) and when it is lower (N ≤ T). In the first case, the CIR and the CIS methods are the most reliable (moderate error and tolerable precision loss) and they perform rather similarly. Their behavior differs only when one of the methods must evaluate the transform function near one of its poles. In the second case, when N ≤ T, the CMG method outperforms all other methods in both accuracy and precision loss. The rest of the methods perform poorly, in general, except the Ort1 method, which gives more accurate results than the CIR and the CIS methods if appropriately high precision is used, but its numerical instability often results in larger precision loss than the applied high precision arithmetic.
While the optimal method to choose may depend on multiple factors (e.g., tolerated precision loss, computational time, tolerated error, etc.) as a rule of thumb for N > T we recommend to use the CIR and CIS methods, and for N ≤ T the CMG method. The Mathematica implementation of the considered NIZT methods and some results of the numerical evaluation are available at http: //webspn.hit.bme.hu/~telek/tools/nizt.zip.