Discretization of Fractional Operators: Analysis by Means of Advanced Computational Techniques

Abstract: This paper studies the discretization of fractional operators by means of advanced clustering methods. The Grünwald–Letnikov fractional operator is approximated by series generated by the Euler, Tustin and generalized mean schemes. The series for different fractional orders form the objects to be assessed. For this purpose, several distances associated with the hierarchical clustering and multidimensional scaling computational techniques are tested. The ArcCosine distance and the 3-dim multidimensional scaling produce good results. The visualization of the graphical representations allows a better understanding of the properties embedded in each type of approximation of the fractional operators.


Introduction
Fractional calculus (FC) generalizes the concept of integrals and derivatives to real and complex orders [1,2]. FC is a mathematical tool that captures non-local effects and phenomena with long-range memory. Due to this key characteristic, FC entails a variety of applications that have been successfully explored in recent years [3][4][5][6][7][8][9]. Nevertheless, the adoption of FC in applied sciences poses some difficulties in the development of efficient algorithms, due to the higher complexity required by the calculation of the fractional operators.
In signal processing and control applications, the fractional operators are approximated by recursive algorithms based on the Grünwald-Letnikov (GL) definition [10,11]. The resulting expressions are calculated in discrete time, either using series or Padé fractions, often expanded with the Z-transform. Usually, the performance of such approximations is assessed through the time and frequency responses [12]. Moreover, we can interpret fractional derivatives and integrals in a number of different ways [13,14], which reveals the higher generality of FC and its adaptability to a number of distinct problems. Nonetheless, besides the classical methods of comparison, recent methods of clustering and computational visualization play a key role in the development of new algorithms and should be explored in the scope of FC.
The adoption of computer-based techniques has been an area of fast development during the last decade. In this paper, we study the application of hierarchical clustering (HC) and multidimensional scaling (MDS) methods for computational clustering and visualization [15][16][17][18][19][20][21][22][23]. The HC yields 2-dim data portraits known as dendrograms and trees, where the objects under analysis are the 'leaves'. In the case of the MDS, the objects are 'points' positioned either in 2- or in 3-dim charts. These computational schemes have been adopted successfully in a number of scientific areas and allow unveiling patterns embedded in the data [24,25]. In the fields of dynamical analysis and control systems, the application of these methods is still taking its first steps [26][27][28], and therefore we can foresee a significant development in the near future. This paper takes advantage of the two computer-based techniques for comparing several series approximating the GL fractional operator. The study is organized in four sections. Section 2 introduces the fundamental concepts adopted in the rest of the manuscript and includes two sub-sections. The first summarizes the discretization schemes of fractional derivatives. The second outlines the clustering analysis and visualization techniques and is divided into three parts, where the ideas behind the concepts of distance, HC and MDS are discussed briefly. Section 3 applies the HC and MDS techniques to numerical series and includes two sub-sections. The first illustrates the proposed methods by means of a test-bed of 13 well-known series. The second assesses several series approximating fractional operators. Finally, Section 4 summarizes the main conclusions.

Discretization of Fractional Derivatives
The most common definitions of a fractional derivative or integral are the so-called Riemann-Liouville, Caputo and GL formulations [29,30]. From the point of view of signal processing and real-time control systems, the GL definition is the most straightforward to apply. The GL definition of the derivative of fractional order α ∈ R of the signal x(t) is given by

D^α x(t) = lim_{h→0} (1/h^α) Σ_{k=0}^{∞} (−1)^k [Γ(α+1) / (Γ(k+1) Γ(α−k+1))] x(t − kh),   (1)

where Γ(·) is the gamma function and h is the time increment. This formulation leads to a discrete-time algorithm, with the time increment h approximated by the sampling period T. After truncating at the r-th term, definition (1) yields the expression in the Z domain:

D^α X(z) ≈ (1/T^α) Σ_{k=0}^{r} (−1)^k [Γ(α+1) / (Γ(k+1) Γ(α−k+1))] z^{−k} X(z),   (2)

where Z{x(t)} = X(z). Expression (2) corresponds to the Euler conversion scheme s → z, adopted in continuous-to-discrete time approximations, with s and z standing for the variables in the Laplace and Z domains. Nonetheless, in control system design, the Tustin conversion scheme is also often adopted. Therefore, for a fractional operator of order α, the two s → z schemes can be written as

Euler: H_0(z^{−1}) = [(1 − z^{−1}) / T]^α,   (3)
Tustin: H_1(z^{−1}) = [(2/T) · (1 − z^{−1}) / (1 + z^{−1})]^α,   (4)

respectively. Expressions (3) and (4) are also called generating approximants of zero and first order, H_0 and H_1, respectively. We can obtain an average of H_0 and H_1 by weighting them by the factors p and 1 − p, respectively, such that

M^α(z^{−1}) = p H_0(z^{−1}) + (1 − p) H_1(z^{−1}),   (5)

where 0 ≤ p ≤ 1. The mean M^α(z^{−1}) involves one parameter p, which allows some adaptation to a specific application. Nonetheless, expression (5) can be further generalized using the concept of generalized mean [12,31,32], yielding

M_q^α(z^{−1}) = [p H_0^q(z^{−1}) + (1 − p) H_1^q(z^{−1})]^{1/q},   (6)

where 0 ≤ p ≤ 1 and q ∈ R. For q = {−1, 0, 1} we obtain the so-called harmonic, geometric and arithmetic means.
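The truncated series coefficients of the Euler and Tustin schemes (3) and (4) can be generated recursively from the binomial expansion, without evaluating gamma functions directly. The following is a minimal Python sketch (the function names are our own, not the paper's):

```python
import numpy as np

def euler_coeffs(alpha, r, T=1.0):
    """Coefficients of the truncated Euler series H0(z^-1) = ((1 - z^-1)/T)^alpha.

    Uses the recursion w_0 = 1, w_k = w_{k-1} (k - 1 - alpha) / k, which
    follows from the binomial expansion of (1 - z^-1)^alpha.
    """
    w = np.empty(r + 1)
    w[0] = 1.0
    for k in range(1, r + 1):
        w[k] = w[k - 1] * (k - 1 - alpha) / k
    return w / T**alpha

def tustin_coeffs(alpha, r, T=1.0):
    """Truncated series of H1(z^-1) = ((2/T)(1 - z^-1)/(1 + z^-1))^alpha."""
    num = euler_coeffs(alpha, r)                 # (1 - z^-1)^alpha
    den = euler_coeffs(-alpha, r)                # (1 - z^-1)^(-alpha)
    den = den * (-1.0) ** np.arange(r + 1)       # flip signs -> (1 + z^-1)^(-alpha)
    return (2.0 / T) ** alpha * np.convolve(num, den)[: r + 1]

# Half-derivative (alpha = 0.5), Euler scheme, T = 1
print(euler_coeffs(0.5, 4))    # 1, -0.5, -0.125, -0.0625, -0.0390625
# Integer derivative (alpha = 1), Tustin scheme, T = 1
print(tustin_coeffs(1.0, 2))   # 2, -4, 4, i.e., 2(1 - z^-1)(1 - z^-1 + ...)
```

Each 1 × (r + 1) coefficient vector of this kind is one of the objects compared by the clustering techniques in Section 3.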

Distances
A function d(·, ·) is a metric that gives the distance between two objects S_1 and S_2 if it satisfies the identity, symmetry and triangle inequality axioms [33]:

d(S_1, S_2) = 0 ⇔ S_1 = S_2,
d(S_1, S_2) = d(S_2, S_1),
d(S_1, S_3) ≤ d(S_1, S_2) + d(S_2, S_3).   (7)

These axioms imply the non-negativity (or separation condition) d(S_1, S_2) ≥ 0. In fact, axioms (7) allow the use of different functions, each one with distinct pros and cons [34,35].
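As a quick numerical sanity check (not a proof), the axioms can be exercised on random vectors; the sketch below uses the Euclidean distance:

```python
import numpy as np

rng = np.random.default_rng(0)
S1, S2, S3 = rng.standard_normal((3, 100))   # three random 1 x 100 "objects"

d = lambda a, b: float(np.linalg.norm(a - b))   # Euclidean metric

assert d(S1, S1) == 0.0                          # identity
assert d(S1, S2) == d(S2, S1)                    # symmetry
assert d(S1, S3) <= d(S1, S2) + d(S2, S3)        # triangle inequality
assert d(S1, S2) >= 0.0                          # implied non-negativity
print("metric axioms hold for this sample")
```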
A variety of algorithms [36][37][38] has been adopted for comparing data sequences [39][40][41][42]. Nonetheless, we must keep in mind that the selection of a specific distance for an application requires some experience and that some preliminary tests are necessary before selecting the 'best' one [43][44][45][46].
For capturing the characteristics of each object, we construct a 1 × r vector with the coefficients of the numerical series. This means that, for example, the harmonic series ∑_{k=1}^{∞} 1/k is described by its truncated form, that is, by the 1 × r vector (1/1, 1/2, 1/3, . . . , 1/r). To compare the vectors we consider four distances, namely the Euclidean, Tchebichef, Jaccard and ArcCosine metrics, given by [35]

d_E(S_i, S_j) = [Σ_{k=1}^{r} (γ_i(k) − γ_j(k))^2]^{1/2},
d_T(S_i, S_j) = max_k |γ_i(k) − γ_j(k)|,
d_J(S_i, S_j) = Σ_{k=1}^{r} (γ_i(k) − γ_j(k))^2 / [Σ_{k=1}^{r} γ_i(k)^2 + Σ_{k=1}^{r} γ_j(k)^2 − Σ_{k=1}^{r} γ_i(k) γ_j(k)],
d_AC(S_i, S_j) = arccos{ Σ_{k=1}^{r} γ_i(k) γ_j(k) / [ (Σ_{k=1}^{r} γ_i(k)^2)^{1/2} (Σ_{k=1}^{r} γ_j(k)^2)^{1/2} ] },   (8)

where γ_i(k) denotes the k-th coefficient of the i-th series, i = 1, . . . , N. The Euclidean and Tchebichef distances are special cases of the Minkowski distance d_M(S_i, S_j) = [Σ_{k=1}^{r} |γ_i(k) − γ_j(k)|^p]^{1/p}, namely for p = 2 and p → ∞, respectively. The Jaccard distance measures the dissimilarity between two sample sets using a logic similar to that of the Venn diagrams and is useful for comparing observations with categorical variables. The ArcCosine distance is not sensitive to the amplitudes of the two vectors and gives the angle between them.
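The four distances in (8) translate directly into numpy. The sketch below is a plain reading of those formulas; in particular, the real-valued (generalized) Jaccard variant is our assumption about the form intended in [35]:

```python
import numpy as np

def d_euclidean(x, y):
    return float(np.sqrt(np.sum((x - y) ** 2)))

def d_tchebichef(x, y):
    return float(np.max(np.abs(x - y)))

def d_jaccard(x, y):
    # generalized Jaccard dissimilarity for real-valued vectors (assumed variant)
    num = np.sum((x - y) ** 2)
    den = np.sum(x ** 2) + np.sum(y ** 2) - np.sum(x * y)
    return float(num / den)

def d_arccosine(x, y):
    c = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))   # clip guards rounding

# Truncated harmonic series as a 1 x r vector, as in the text
r = 100
h = 1.0 / np.arange(1, r + 1)
print(d_arccosine(h, 2.0 * h))   # ~0: insensitive to amplitude scaling
```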
A typical question is what distances to adopt, given the high number of possible expressions. Indeed, we tested other distances such as the Manhattan, Canberra, Clark, Lorentzian, Sørenson and others [47], but the expressions (8) are the best for characterizing numerical series.

Hierarchical Clustering
The HC is a technique that compares a set of N objects in an n-dim space A and yields a graphical portrait highlighting their main similarities in the sense of some metric [18,48].
The algorithm starts by gathering the N objects in a high-dimensional data-set A. After defining a metric for comparing the objects, an N × N matrix D = [d_ij], i, j = 1, . . . , N, of object-to-object distances is calculated. When adopting distances, the matrix D is symmetric, with zeros in the main diagonal and positive values elsewhere. The HC uses the matrix D to construct a graphical representation, consisting of a dendrogram or a hierarchical tree, that tries to reflect the input information. In this case, the objects are represented by 'leaves'.
The HC can use either the agglomerative, or the divisive clustering iterative computational schemes. In the agglomerative, each object starts in a separate cluster and the algorithm merges the most similar objects during the iterations. In the divisive, all objects start in a common cluster and the successive iterations separate the most dissimilar. A linkage criterion, such as the maximum, minimum and average linkages, calculates the dissimilarity between clusters [49]. The clustering quality can be assessed by means of the cophenetic correlation [50]. When the cophenetic correlation is close to 1 (to 0) we have a good (weak) cluster representation of the original data.
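The agglomerative scheme with a linkage criterion and the cophenetic quality check are available in scipy; a minimal sketch on a toy distance matrix (the 4 × 4 values below are illustrative, not the paper's data) is:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

# Toy symmetric N x N distance matrix with zero diagonal
D = np.array([[0.0, 0.1, 0.9, 1.0],
              [0.1, 0.0, 0.8, 0.9],
              [0.9, 0.8, 0.0, 0.2],
              [1.0, 0.9, 0.2, 0.0]])

cond = squareform(D)                     # condensed form expected by scipy
Z = linkage(cond, method='average')      # agglomerative, average linkage
c, _ = cophenet(Z, cond)                 # cophenetic correlation coefficient
print(f"cophenetic correlation: {c:.3f}")  # close to 1 -> good representation
# scipy.cluster.hierarchy.dendrogram(Z) would draw the 2-dim portrait,
# with the four objects as the leaves.
```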

Multidimensional Scaling
The MDS is a technique that approximately reproduces, in a space of dimension n, objects originally described in a space of dimension m > n. In this case, the objects are represented by 'points'.
The MDS calculates an N × N matrix D̂ = [d̂_ij] of distances d̂_ij that tries to reproduce the original ones. Usually, the algorithm minimizes some quadratic index, often called 'stress', and the problem is converted to a numerical optimization of an index such as

Stress = [ Σ_{i<j} (d̂_ij − d_ij)^2 / Σ_{i<j} d_ij^2 ]^{1/2}.

The iterations of the MDS algorithm stop when either no further significant improvement of the stress reduction is achieved, or the maximum number of iterations is reached. Usually, the dimensions n = 2 or n = 3 are adopted to produce a simple visualization. In the follow-up, n = 3 is selected because the loci allow a superior visualization, at the cost of requiring some successive operations of rotation, shift and amplification for achieving the best perspective.
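Classical (Torgerson) MDS, as implemented by Matlab's cmdscale, can be sketched in a few lines of numpy via double centering and an eigendecomposition; the stress index above then measures the quality of the embedding (function names are ours):

```python
import numpy as np

def cmdscale(D, n=3):
    """Classical MDS: embed an N x N distance matrix D into n dimensions."""
    N = D.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n]           # n largest eigenvalues
    L = np.sqrt(np.clip(vals[order], 0.0, None))
    return vecs[:, order] * L                    # N x n coordinates

def stress(D, X):
    """Normalized residual between the reproduced and original distances."""
    Dh = np.sqrt(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
    iu = np.triu_indices_from(D, k=1)
    return float(np.sqrt(np.sum((Dh[iu] - D[iu]) ** 2) / np.sum(D[iu] ** 2)))

# Distances between 4 points lying exactly in a plane embed with ~zero stress
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
D = np.sqrt(np.sum((P[:, None] - P[None, :]) ** 2, axis=-1))
X = cmdscale(D, n=2)
print(stress(D, X))   # essentially zero for this exact planar configuration
```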
The quality of the MDS loci, also called a 'map', can be assessed by means of the so-called Shepard and stress diagrams. The Shepard diagram plots d_ij versus d̂_ij, and a low/high scatter means a good/poor match between the distances. The stress diagram plots the stress versus the number of dimensions n. This representation gives a monotonically decreasing curve, which usually reveals a significant reduction after the initial values.
The axes of the MDS representation have no physical meaning, and there is no good/bad interpretation of the high/low values of the point coordinates. Indeed, the interpretation of the map follows the clusters and patterns formed by the points. As for the HC, users must test several distances in advance for capturing the characteristics of the data. In general, the MDS loci are different for each distinct metric, but several distances can produce 'good' maps.
We calculate the MDS through the Matlab classical multidimensional scaling command cmdscale.

A Test-Bed of Numerical Series
We start by considering a test-bed of numerical series S_i, i = 1, . . . , 13. The series S_1 to S_12 converge to known results, while the harmonic series S_13 diverges. Nonetheless, we have a truncation at r terms. Therefore, all series converge, but the final result will not be identical to the one for the non-truncated expression. For the sake of easing the identification, we denote the series by their ideal values in the plots, while the harmonic series S_13 is denoted by H. Figure 1 shows the HC portrayed by a dendrogram and a hierarchical tree for the series S_i, i = 1, . . . , 13, using the ArcCosine distance, d_AC, for r = 100. We verify that for this test-bed all distances and graphical representations lead to approximately identical conclusions. In general and in global terms, we observe two sets, A and B: a close placement of the π-related series A = {S_2, S_3, S_4, S_5, S_6} and some separation with regard to the series B = {S_1, S_7, S_8, S_9, S_10, S_11, S_12, S_13}. As mentioned before, in spite of the truncation, they are denoted by A = {π/4, π²/6, π⁴/90, π³/32, π²/12} and B = {ln 2, 1/2, 3/4, e, 2, 4, 4/3, H}, respectively. We must note that the truncated harmonic series, H, leads to a finite sum. It is not obvious from these initial simple tests but, as we shall verify in the follow-up, the ArcCosine distance, d_AC, gives slightly better display results, and the MDS 3-dim loci allow a better visualization for a larger number of series.
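To illustrate how such truncated series become 1 × r coefficient vectors, the sketch below builds three classical series as stand-ins (the Leibniz series for π/4, the Basel series for π²/6, and the harmonic series; the paper's complete list of 13 is not reproduced here) and compares them with the ArcCosine distance:

```python
import numpy as np

r = 100
k = np.arange(1, r + 1)

leibniz = (-1.0) ** (k - 1) / (2 * k - 1)   # 1 - 1/3 + 1/5 - ... -> pi/4
basel = 1.0 / k ** 2                        # 1 + 1/4 + 1/9 + ... -> pi^2/6
harmonic = 1.0 / k                          # diverges, but the truncation is finite

def d_ac(x, y):
    """ArcCosine distance: the angle between two coefficient vectors."""
    c = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

# The two monotone positive series lie at a smaller angle to each other
# than either does to the sign-alternating Leibniz series.
print(d_ac(basel, harmonic), d_ac(basel, leibniz))
```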
We can also compare the results obtained with a different truncation order. Figure 4 depicts the MDS 3-dim plots for the series S_i, i = 1, . . . , 13, using the ArcCosine distance for r = 100. We observe the minor influence of the truncation order r upon the clustering. Of course, we could discuss further the pros and cons of this test-bed and, eventually, include other series. However, this is not the main purpose of this initial set of experiments, which merely intends to illustrate the proposed method.

Series Approximating Fractional Operators
In this sub-section, we compare the series approximating the fractional operators. Therefore, we apply the HC and MDS techniques to the series produced by the GL (or Euler) and Tustin schemes (3) and (4), abbreviated by the symbols G^α and T^α, together with those arising from the generalized mean M_q^α. We consider (i) the interval between the integer-order integral and derivative of order 1, that is, −1 ≤ α ≤ 1 with steps of 0.1, (ii) the parameter value p = 0.5, and (iii) the cases q = {−1, 0, 1} (i.e., the harmonic, geometric and arithmetic means) for α = {−1, −0.5, 0.5, 1}. Figure 5 shows the dendrogram and the hierarchical tree for the N = 54 numerical series using the ArcCosine distance, d_AC, for r = 100. The series G^α and T^α, with 21 objects each, are depicted in red and blue, respectively. The averages M_q^α, with 3 objects each, are colored in yellow, cyan, green and white for α = {−1, −0.5, 0.5, 1}. Without loss of relevant information, some labels are omitted in several plots, since their overlap would worsen the visualization. We note clearly that (i) the 3-dim MDS representation in association with the ArcCosine distance, d_AC, is superior in the sense of yielding a clearer visualization; (ii) as expected, the cases with α = 0 coincide; (iii) the difference between the G^α and T^α approximations is particularly visible for the integral case (i.e., α < 0); and (iv) the generalized mean M_q^α, q = {−1, 0, 1}, interpolates between the G^α and T^α approximations. In summary, the proposed method gives clear conclusions and provides a new perspective that complements the classical performance evaluation based on the time and frequency responses.

The Effect of Series Truncation
The MDS approach also allows assessing other aspects of the discrete approximations. In this perspective, Figure 8 compares the GL and Tustin series truncated at r = 10 and r = 50 with the ones obtained with r = 100, using the ArcCosine distance. For clarifying the graphical representation, only the integer orders have labels. The straight lines connect plot marks corresponding to the same fractional order. We verify that small values of r have a significant impact due to the truncation. The results are particularly remarkable for the fractional integrals, being less significant for the derivatives. We note also that, as expected, when r increases, the effect of truncation diminishes. In conclusion, the use of clustering methods is a relevant option for taking advantage of present-day computational resources. While the paper focused mainly on the analysis of the series generated by expanding the fractional operators, we can address other problems, such as comparing the time and frequency responses of closed-loop control systems in the presence of nonlinear dynamics. Another possible direction of study is to explore other clustering algorithms that have been advanced recently [53][54][55].
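The truncation effect can be reproduced numerically: zero-padding a short GL expansion and measuring its ArcCosine distance to a longer one shows a larger gap for the integral (α < 0) than for the derivative, consistent with the slower decay of the integral coefficients (this is a sketch of the comparison logic, not the paper's exact computation):

```python
import numpy as np

def gl_coeffs(alpha, r):
    """GL (Euler) series coefficients of (1 - z^-1)^alpha, truncated at r."""
    w = np.empty(r + 1)
    w[0] = 1.0
    for k in range(1, r + 1):
        w[k] = w[k - 1] * (k - 1 - alpha) / k
    return w

def d_ac(x, y):
    c = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def truncation_gap(alpha, r_small=10, r_full=100):
    """Angle between a zero-padded short expansion and the full one."""
    short = np.zeros(r_full + 1)
    short[: r_small + 1] = gl_coeffs(alpha, r_small)
    return d_ac(short, gl_coeffs(alpha, r_full))

# The integral's coefficients decay slowly, so its truncation gap is larger
print(truncation_gap(-0.5), truncation_gap(0.5))
```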

Conclusions
This paper presented a new method for comparing numerical series based on advanced computational techniques. For this purpose, the association of several distances and clustering techniques was explored. The ArcCosine distance and the 3-dim MDS representation provided the best visualization for the series under assessment. First, a test-bed involving a collection of 13 well-known numerical series was tested. Afterwards, a collection of 54 objects, representing several discretizations of fractional operators stemming from the GL, Tustin and generalized mean schemes, was considered. The results show a minor variability between objects for derivatives, but a larger scattering for the case of integrals. Additionally, the effect of series truncation was also explored in the light of the MDS representation. Moreover, the option of the generalized mean gives an extra degree of freedom to adapt the discretization to each specific application. In the future, this technique can be further applied for analyzing the performance of the complete closed-loop system under distinct controller and discretization designs.