Multidimensional Scaling Visualization Using Parametric Similarity Indices

Abstract: In this paper, we apply multidimensional scaling (MDS) and parametric similarity indices (PSI) in the analysis of complex systems (CS). Each CS is viewed as a dynamical system, exhibiting an output time-series to be interpreted as a manifestation of its behavior. We start by adopting a sliding window to sample the original data into several consecutive time periods. Second, we define a given PSI for tracking pieces of data. We then compare the windows for different values of the parameter, and we generate the corresponding MDS maps of "points". Third, we use Procrustes analysis to linearly transform the MDS charts for maximum superposition and to build a global MDS map of "shapes". This final plot captures the time evolution of the phenomena and is sensitive to the PSI adopted. The generalized correlation, the Minkowski distance and four entropy-based indices are tested. The proposed approach is applied to the Dow Jones Industrial Average stock market index and the Europe Brent Spot Price FOB time-series.


Introduction
Complex systems (CS) are frequent in many natural (e.g., geophysics, cosmology, ecology, biology, genetics) and man-made (e.g., economy, computer science, chemical and physical apparatus) systems [1][2][3][4][5][6][7][8][9][10]. They are often constituted by multiple interacting entities that contribute to a collective behavior revealing surprising dynamical phenomena. CS modeling can adopt sophisticated mathematical tools, but often, we verify that those methods are still far from capturing the overall richness of the system evolution. Therefore, a fruitful interplay is possible by experimenting in a given case with mathematical tools that are usual in distinct areas [11].
Multidimensional scaling (MDS) is a computer method for visualizing and comparing data that has been applied in many distinct areas [12][13][14][15]. Polzella and Reid [16] used MDS in the study of pilot performance data obtained during simulated combat. Costa et al. [17] considered MDS to analyze DNA code in the perspective of identifying structural patterns in the nuclear and mitochondrial genomes. Machado et al. [18] adopted MDS to study fifteen stock markets and to unveil time-varying correlations between them. Oñate and Pou [19] analyzed the temperature time-series from eleven meteorological stations over the Iberian Peninsula. Trends were identified by means of Mann-Kendall tests, while MDS was employed to build automatic groupings. More recently, Stephenson and Doblas-Reyes [20] used MDS as an exploratory tool for describing ensembles of forecasts. Lopes and Machado [11] studied global temperature time-series, showing that MDS is able to provide an intuitive and useful visual representation of the complex relationships present in the data.
In this paper, we combine MDS tools and parametric similarity indices (PSI) in the analysis of CS. Each phenomenon is viewed as a dynamical system whose output is a time-series. The time-series are interpreted as manifestations of the system behavior. The novel methodology is structured as follows. First, we adopt a sliding time-window to convert the original data into smaller parts that reflect evolution in time. Second, for a given PSI, we compare the time-windows for different values of the parameter, and we generate the corresponding MDS maps of "points". Third, we transform the individual MDS charts to obtain the maximum object superposition. The construction of the global MDS map unveils the characteristics of the CS.
Our approach represents a generalization of classical MDS schemes. In fact, standard MDS representations capture the system dynamics by means of a single similarity index. Such an index depends on the researcher's choice, and therefore, we can define distinct criteria. The MDS interpretation is based on the emerging clusters and distances between "objects" ("points") in the map, rather than on their absolute coordinates or the geometrical form of the locus. We propose subdividing each time-series into several smaller slices, to capture time dynamics, and adopting PSI in each MDS map, providing distinct comparisons for the same windows. The interpretation of the MDS map is now based on "objects" consisting of "shapes" (a collection of points) instead of "points", capturing the time evolution of the phenomena and being sensitive to the parametric index adopted.
Loosely speaking, with the standard MDS methods, we are viewing "objects" under a monochrome beam of light, while with the PSI approach, we vary the light wavelength, leading to a more colorful and detailed characterization of the "objects".
The generalized correlation, Minkowski distance and entropy are tested as candidates for the parameter-dependent comparison indices. The proposed approach is applied to the Dow Jones Industrial Average stock market index (DJ) and Europe Brent Spot Prices FOB (BR) time-series.
Bearing these ideas in mind, this paper is organized as follows. Section 2 introduces the main mathematical tools used for processing the data. Section 3 analyses real data by means of several PSI and MDS. Finally, Section 4 outlines the main conclusions.

Mathematical Tools
This section formulates the main mathematical tools adopted for the numerical analysis of the data. Section 2.1 presents the PSI, namely the generalized correlation, the Minkowski distance and the entropy-based indices. Section 2.2 addresses the MDS technique.

Generalized Correlation
Given two random variables X and Y, their generalized correlation coefficient, $\rho_q$, is defined in [21] by Expression (1), where E(·) denotes the expectation operator and $(X_i, Y_i)$ and $(X_j, Y_j)$ are independent bivariate vectors with $i \neq j$, $(i, j) = 1, 2, ..., n$. The definition relies on a function $g_q(z)$, given by Equation (2). The generalized correlation always lies in the interval $\rho_q \in [-1, 1]$, and the two extreme values of the parameter q yield the Kendall, $\rho_K$, and Pearson, $\rho_P$, correlation coefficients, respectively. Expression (1) is a general parametric statistic with an extra degree of freedom, q, that can be tuned by the user according to the problem under analysis.
The generalized correlation coefficient of a sample, $\hat{\rho}_q$, can be estimated from the n observed pairs $(x_i, y_i)$.
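As a minimal sketch of how a sample estimate could be computed, the snippet below assumes the kernel $g_q(z) = \mathrm{sign}(z)\,|z|^q$ and replaces the expectations by sums over all pairs $i < j$. Both choices are assumptions made for illustration, and the function names are hypothetical. The estimator of [21] may differ in its details. Under these assumptions, q = 1 reproduces the Pearson coefficient and q = 0 reproduces Kendall's coefficient when there are no ties.

```python
import numpy as np

def g_q(z, q):
    # Assumed kernel: sign(z) * |z|**q (q = 0 gives sign(z), q = 1 gives z).
    return np.sign(z) * np.abs(z) ** q

def generalized_correlation(x, y, q):
    """Pairwise sample estimate of a generalized correlation (illustrative)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(len(x), k=1)      # all pairs with i < j
    gx = g_q(x[i] - x[j], q)
    gy = g_q(y[i] - y[j], q)
    # Normalized cross-product, bounded in [-1, 1] by Cauchy-Schwarz.
    return np.sum(gx * gy) / np.sqrt(np.sum(gx ** 2) * np.sum(gy ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=100)
    y = 0.5 * x + rng.normal(size=100)
    for q in (0.0, 0.5, 1.0):
        print(q, generalized_correlation(x, y, q))
```

Sweeping q over a grid in [0, 1] then gives one correlation value per parameter, which is the kind of parameter-dependent comparison used later in the paper.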

Minkowski Distance
The Minkowski distance is a metric on the Euclidean space that generalizes other distances [22]. Given two points $X = (x_1, x_2, ..., x_n)$ and $Y = (y_1, y_2, ..., y_n)$ in $\mathbb{R}^n$, the q-order Minkowski distance is expressed by:
$$d_q(X, Y) = \left( \sum_{i=1}^{n} |x_i - y_i|^q \right)^{1/q}. \qquad (9)$$
For q = 1, Expression (9) yields the Manhattan distance; for q = 2, we have the Euclidean distance; and for $q \to \infty$, we obtain the Chebyshev distance.
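The three special cases can be checked numerically. The following is a small sketch using the standard definition of the q-order Minkowski distance. The function name is ours and is purely illustrative.

```python
import numpy as np

def minkowski(x, y, q):
    """q-order Minkowski distance between two points of R^n."""
    diff = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    if np.isinf(q):
        # Limit q -> infinity: Chebyshev distance (largest coordinate gap).
        return diff.max()
    return np.sum(diff ** q) ** (1.0 / q)

if __name__ == "__main__":
    X, Y = (0.0, 0.0), (3.0, 4.0)
    print(minkowski(X, Y, 1))        # Manhattan
    print(minkowski(X, Y, 2))        # Euclidean
    print(minkowski(X, Y, np.inf))   # Chebyshev
```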

Entropy
Entropy has been used not only in the context of CS, but also in many other scientific areas [23][24][25]. In information theory, entropy was introduced by Shannon for studying the amount of information contained in a message. Shannon entropy satisfies the so-called Khinchin axioms [26] and is given by:
$$S = -\sum_i p_i \ln p_i. \qquad (10)$$
Entropy represents the expected value of the information content, $I(p_i) = -\ln p_i$, of an event with probability of occurrence $p_i$, where $\sum_i p_i = 1$. Rényi and Tsallis entropies are generalizations of Shannon's entropy and are given by, respectively:
$$S^{(R)}_q = \frac{1}{1-q} \ln \left( \sum_i p_i^q \right), \qquad (11)$$
$$S^{(T)}_q = \frac{1}{q-1} \left( 1 - \sum_i p_i^q \right). \qquad (12)$$
Inspired by the concepts of fractional calculus (FC), Ubriaco [27] proposed the following expression:
$$S^{(U)}_q = \sum_i p_i \left( -\ln p_i \right)^q, \qquad (13)$$
that has the same properties as the Shannon entropy, except additivity. Rényi, Tsallis and Ubriaco entropies reduce to Shannon's expression when $q \to 1$.
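For illustration, the Shannon, Rényi, Tsallis and Ubriaco entropies can be estimated from a histogram of relative frequencies. The sketch below assumes the usual textbook forms of these entropies and uses hypothetical helper names. The default of 10 bins matches the value adopted later in the paper, while the handling of empty bins (they are discarded) is our own assumption.

```python
import numpy as np

def probabilities(x, bins=10):
    """Relative frequencies of x over equal-width bins (empty bins dropped)."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts / counts.sum()
    return p[p > 0]

def shannon(p):
    return -np.sum(p * np.log(p))

def renyi(p, q):
    # Valid for q > 0, q != 1.
    return np.log(np.sum(p ** q)) / (1.0 - q)

def tsallis(p, q):
    # Valid for q != 1.
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def ubriaco(p, q):
    return np.sum(p * (-np.log(p)) ** q)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = probabilities(rng.normal(size=1000))
    print(shannon(p), renyi(p, 0.999), tsallis(p, 0.999), ubriaco(p, 0.999))
```

Evaluating the three parametric forms close to q = 1 gives values near the Shannon entropy, which is a convenient sanity check on an implementation.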
FC denotes the branch of calculus that extends the concepts of integrals and derivatives to non-integer and complex orders. During the last few decades, FC was found to play a fundamental role in modeling many important physical phenomena and emerged as an important tool in the area of dynamical systems [28][29][30].
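Recently, also inspired by FC, the concepts of information content and entropy of order $q \in \mathbb{R}$, $I^{(G)}_q$ and $S^{(G)}_q$, were proposed [31,32]. These generalizations of the classical definitions are given by:
$$I^{(G)}_q(p_i) = D^q I(p_i) = -\frac{p_i^{-q}}{\Gamma(q+1)} \left[ \ln p_i + \psi(1) - \psi(1-q) \right], \qquad (14)$$
$$S^{(G)}_q = \sum_i \left\{ -\frac{p_i^{-q}}{\Gamma(q+1)} \left[ \ln p_i + \psi(1) - \psi(1-q) \right] \right\} p_i, \qquad (15)$$
where $D^q(\cdot)$ is the fractional derivative of order q and $\Gamma(\cdot)$ and $\psi(\cdot)$ represent the gamma and digamma functions, respectively. The generalized entropy (15) does not obey some of the Khinchin axioms, except for q = 0 [31]. In this case, it leads to the classical Shannon entropy (10).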

Multidimensional Scaling
MDS is a technique for visualizing information that explores similarities in data [33][34][35][36][37]. The main idea is to detect underlying dimensions that allow the researcher to observe similarities between the items under analysis. The MDS algorithm requires the definition of a similarity index and the construction of an $h \times h$ matrix M of item-to-item similarities, where h is the total number of items. In classical MDS, matrix M is symmetric, and its main diagonal is composed of "1". MDS assigns a coordinate point to each item in a multi-dimensional space and arranges the set of h coordinates in order to reproduce the observed similarities. Often, dissimilarities, or distances, between the items are considered instead of similarities. For low dimensional spaces (e.g., m = 2 or m = 3), the resulting "points" can be displayed in a "map". As mentioned, by rearranging the item positions in the space, MDS tries to arrive at a configuration that best approximates the observed similarities. For this purpose, MDS uses a function minimization algorithm that evaluates different configurations with the goal of maximizing the goodness-of-fit. A common measure to evaluate how well a particular configuration reproduces the observed distance matrix is the raw stress, S, in general defined as:
$$S = \sum_{i,j} \left[ d_{ij} - f(\delta_{ij}) \right]^2,$$
where $d_{ij}$ stands for the reproduced distances between items i and j, $(i, j) = \{0, 1, 2, ..., h-1\}$, $\delta_{ij}$ represents the observed distances and $f(\cdot)$ indicates some type of transformation. The smaller the value of S, the better the fit between $d_{ij}$ and $\delta_{ij}$. We can rotate or translate the MDS map, since the distances between "points" remain identical. The quality of the MDS approach can be evaluated by means of the stress and Shepard plots. The stress plot represents S versus the number of dimensions m of the MDS map. We get a monotonically decreasing chart, and we choose m as a compromise between reducing S and having a low dimension for the MDS map. The Shepard diagram compares the $d_{ij}$ distances, for a particular value m, versus the $\delta_{ij}$ distances. Therefore, a narrow scatter around the 45-degree line indicates a good fit between $d_{ij}$ and $\delta_{ij}$.
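To make the roles of the map, the dimension m and the raw stress concrete, the sketch below embeds a distance matrix with classical (Torgerson) MDS and evaluates the raw stress with f taken as the identity. Classical MDS is used here only because it fits in a few lines of NumPy. It is a substitute for, not a reproduction of, the iterative stress-minimization algorithm described above, and the function names are hypothetical.

```python
import numpy as np

def classical_mds(delta, m=2):
    """Classical (Torgerson) MDS of an h x h distance matrix into m dimensions."""
    delta = np.asarray(delta, dtype=float)
    h = delta.shape[0]
    J = np.eye(h) - np.ones((h, h)) / h          # centering matrix
    B = -0.5 * J @ (delta ** 2) @ J              # double-centered Gram matrix
    w, v = np.linalg.eigh(B)                     # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:m]                # keep the m largest
    return v[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

def raw_stress(coords, delta):
    """Raw stress with f = identity, summed over pairs i < j."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    i, j = np.triu_indices(len(coords), k=1)
    return np.sum((d[i, j] - np.asarray(delta, dtype=float)[i, j]) ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(8, 3))
    delta = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    # Stress plot data: S versus the number of dimensions m.
    for m in (1, 2, 3):
        print(m, raw_stress(classical_mds(delta, m), delta))
```

Plotting the reproduced distances against the entries of `delta` for a fixed m would give the corresponding Shepard diagram.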

MDS Analysis and Visualization of Complex Systems
In this section, we apply PSI and MDS tools in the analysis of real-world time-series. In Section 3.1, we use a simple example to introduce the approach. In Sections 3.2, 3.3 and 3.4, we process data from two CS by means of the generalized correlation index, the Minkowski distance and entropy, respectively. We use DJ and BR time-series at a daily time horizon [38,39] to characterize the CS dynamics. The data are available at the Yahoo Finance (https://finance.yahoo.com/) and the U.S. Energy Information Administration (http://www.eia.gov/) websites. The time period of analysis is from 20 May 1987 up to 5 January 2015. Some missing values are estimated by means of a linear interpolation algorithm, applied between neighboring values, so that all weeks have five values.

Illustrative Example
Given a time-series representative of a CS over the period T, $x(t)$, $0 \le t < T$, we divide the data into h intervals, denoted by $x_k(t)$, $kT/h \le t < (k+1)T/h$, $k = \{0, 1, 2, ..., h-1\}$. For a PSI, we calculate p similarity matrices, $M_q$, of dimension $h \times h$, where $q \in \{q_1, q_2, ..., q_p\}$. Matrices $M_q$ feed the MDS algorithm, which generates p intermediate maps of "points" (i.e., one map per value q). The charts are then processed by means of Procrustes analysis in order to obtain a single global plot of "shapes" where the "points" of the original maps are optimally "superimposed".
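The first two stages of this pipeline can be sketched as follows, taking the Minkowski distance as the PSI. The helper names are hypothetical, and dropping trailing samples that do not fill a complete window is an assumption of this sketch rather than something stated in the text.

```python
import numpy as np

def split_windows(x, h):
    """Split a series into h consecutive, non-overlapping, equal-length windows.
    Assumption: trailing samples that do not fill a window are discarded."""
    x = np.asarray(x, dtype=float)
    n = len(x) // h
    return x[: n * h].reshape(h, n)

def minkowski_matrices(windows, qs):
    """Return an array of shape (p, h, h): one distance matrix M_q per value q."""
    diff = np.abs(windows[:, None, :] - windows[None, :, :])   # (h, h, n)
    return np.array([np.sum(diff ** q, axis=-1) ** (1.0 / q) for q in qs])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    series = np.cumsum(rng.normal(size=210))    # synthetic stand-in for x(t)
    windows = split_windows(series, h=42)
    qs = np.linspace(0.12, 1.20, 22)            # p = 22 parameter values
    M = minkowski_matrices(windows, qs)
    print(M.shape)                              # (p, h, h)
```

Each slice `M[k]` would then be passed to the MDS algorithm to produce the intermediate map for the k-th value of q.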
Procrustes analysis performs linear transformations, namely translation, reflection, orthogonal rotation and scaling, with the objective of minimizing a measure of the difference between the "points" in the original maps. The algorithm: (i) chooses a reference MDS map (by selecting one of the available instances); (ii) superimposes all other MDS instances onto the current reference; (iii) computes the mean form of the current set of superimposed maps; (iv) compares the distance between the mean and the reference instances to a given threshold value and, if above, sets the reference to the mean form and returns to Step (ii).
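Steps (i) to (iv) can be sketched with an SVD-based least-squares superposition. This is a minimal illustration with hypothetical function names, not the exact routine used to produce the figures. Rescaling the mean form to the size of the first map, which prevents the mean from shrinking across iterations, is an assumption added here.

```python
import numpy as np

def align(ref, pts):
    """Superimpose pts onto ref by translation, scaling and rotation/reflection."""
    mu_r, mu_p = ref.mean(axis=0), pts.mean(axis=0)
    r0, p0 = ref - mu_r, pts - mu_p
    u, s, vt = np.linalg.svd(p0.T @ r0)
    rot = u @ vt                              # optimal orthogonal matrix
    scale = s.sum() / np.sum(p0 ** 2)         # optimal isotropic scale
    return scale * p0 @ rot + mu_r

def generalized_procrustes(maps, tol=1e-8, max_iter=100):
    maps = [np.asarray(m, dtype=float) for m in maps]
    ref = maps[0]                                         # step (i)
    size = np.linalg.norm(ref - ref.mean(axis=0))
    for _ in range(max_iter):
        aligned = [align(ref, m) for m in maps]           # step (ii)
        mean = np.mean(aligned, axis=0)                   # step (iii)
        c = mean.mean(axis=0)
        # Assumption: keep the mean form at the size of the first map.
        mean = c + (mean - c) * size / np.linalg.norm(mean - c)
        if np.linalg.norm(mean - ref) < tol:              # step (iv)
            break
        ref = mean
    return aligned, mean

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    base = rng.normal(size=(6, 2))
    noisy = [base + 0.05 * rng.normal(size=base.shape) for _ in range(5)]
    aligned, mean = generalized_procrustes(noisy)
    print(np.linalg.norm(aligned[0] - mean))
```

In the setting of the paper, `maps` would hold the p intermediate MDS maps (each of size h × m), and the global map of "shapes" would collect, for each window, its p superimposed points.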
To illustrate our approach, we start by using a time-series consisting of the U.S. GDP per capita, for the period from year 1800 up to 2010, with units of 1990 GK dollars per capita. This time-series has the advantage of being simple and smooth over time, but rich enough to be used as an illustrative example. The data are from the Maddison Project [40].
Figure 1 shows the GDP time-series divided into h = 42 non-overlapping intervals of five years each. We adopt the Minkowski distance (9) with p = 22 parameter values and q varying uniformly in the interval $q \in [0.12, 1.20]$. We then calculate the matrices $M_q = [d_{ij}^{q}]$, $(i, j) = 0, ..., 41$, that constitute the input for the MDS algorithm.
Figure 2 depicts the two- and three-dimensional global MDS maps that result from the Procrustes analysis. In these charts, we get a continuous set of p points for each time-window, meaning that we have maps of "shapes", instead of merely "points". We verify that the "shapes" have a one-dimensional nature and that their length varies along time. For example, "shapes" of windows {27, 31, 32, 34, 36, 42} exhibit a larger length, meaning that they are more sensitive to the variation of q. Superimposed on the global maps, we represent the time behavior of the CS by means of a set of oriented arrows that connect the window "shapes" and indicate the flow of time. We verify longer "jumps" between certain time-windows, namely {29-30, 30-31, 31-32, 32-33, 33-34}, which means that they have different characteristics from the rest.
The following list emphasizes the relationships between the GDP time-series and the MDS global maps:
• During the time period 1800-1930, the GDP had a moderate growth with small oscillations over time. In the MDS maps, we observe small "shapes", representing Windows 1 up to 26, appearing in sequence and close to each other;

MDS Based on the Generalized Correlation Index
We start by dividing the total DJ series into h = 27 non-overlapping intervals of approximately a one-year time-length each. This value of h mitigates the problems related to the non-stationarity of the data and establishes a good compromise between time discrimination and solid histogram representation. We then adopt p = 22 distinct parameter values, uniformly distributed in the interval $q \in [0, 1]$, and we calculate the corresponding matrices $M_q = [\rho_{ij}^{q}]$, $(i, j) = 0, ..., 26$, where $\rho_{ij}^{q}$ represents the generalized correlation between $x_i(t)$ and $x_j(t)$.
To assess the quality of the p intermediate MDS maps that are used to construct the global map of "shapes", we plot the Shepard and the stress diagrams. We illustrate the results obtained for the DJ time-series and $\rho_q$. For the BR data, and for all PSI, the results are similar. Figure 5 represents the p Shepard diagrams superimposed in a single chart. We can observe a scatter of points distributed around the 45-degree line, which indicates a good fit of the observed distances to the dissimilarities. As expected, for m = 3, we get slightly better results than for m = 2. Figure 6 depicts the superposition of the stress diagrams, revealing that, for the p intermediate MDS maps, a three-dimensional space describes the data well.

MDS Based on Entropy Measures
In this subsection, we embed entropy-based PSI into the MDS. For data comparison, we compute the p matrices, $M_q = [c_{ij}^{q}]$, where $c_{ij}^{q}$ represents the Canberra distance between the entropy values of the time-windows $x_i(t)$ and $x_j(t)$:
$$c_{ij}^{q} = \frac{\left| S^{(\cdot)}_q(x_i) - S^{(\cdot)}_q(x_j) \right|}{\left| S^{(\cdot)}_q(x_i) \right| + \left| S^{(\cdot)}_q(x_j) \right|}.$$
The entropy, $S^{(\cdot)}_q$, is estimated by means of the histograms of relative frequencies. For constructing the histograms, we use N = 10 bins.
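A possible implementation of this comparison step is sketched below, using the standard Canberra distance applied to scalar entropy values. The function name is hypothetical, and setting the distance to zero when both entropies vanish is a convention assumed here to avoid a division by zero.

```python
import numpy as np

def canberra_matrix(s):
    """Pairwise Canberra distances between scalar values s_0, ..., s_{h-1}."""
    s = np.asarray(s, dtype=float)
    num = np.abs(s[:, None] - s[None, :])
    den = np.abs(s[:, None]) + np.abs(s[None, :])
    out = np.zeros_like(num)
    # Assumed convention: 0/0 is mapped to a distance of 0.
    np.divide(num, den, out=out, where=den > 0)
    return out

if __name__ == "__main__":
    entropies = [1.8, 2.1, 1.2, 2.0]   # illustrative values, one per window
    print(canberra_matrix(entropies))
```

Repeating this for the entropy values obtained at each parameter q yields the p matrices that feed the MDS algorithm.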
The results obtained with Expressions (11), (12), (13) and (15) are similar for both the DJ and the BR time-series. We opt for presenting the MDS maps generated with the generalized entropy (15), since it is more sensitive to the data.
The parameter q varies uniformly in the interval $q \in [0, 0.3]$, and h and p take the values adopted in the previous subsections.
Figures 9 and 10 depict the two- and three-dimensional MDS maps for the DJ and BR time-series, respectively. As before, we observe that the global charts exhibit continuous sets of points ("shapes") for all time windows. On the other hand, as for $\rho_q$, the entropy-based indices reveal difficulties with time discrimination, giving rise to a large number of "jumps" between "shapes". For the DJ time-series, windows A = {1, 18, 19, 21} and B = {14} have different behavior, being located far apart from the remaining windows. Regarding the BR data, windows A = {4, 9} and B = {27} unveil characteristics different from those exhibited by the other time periods.
The proposed methodology yields two orthogonal descriptions, namely the set of "shapes" and the set of discrete time steps, referred to previously as "jumps". Therefore, it is of interest to check how the two concepts evolve along the series, for distinct indices. Figure 11 depicts the locus of "jump" versus "shape" lengths (T and Σ, respectively), where the labels represent the window numbers. To avoid distinct ranges and scales that could mislead, both variables are normalized to the interval [0, 1]. We show the results obtained for the DJ data; however, identical conclusions can be drawn for the BR data series.
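The text does not spell out how the lengths T and Σ are measured, so the following sketch rests entirely on our own assumptions: the "shape" length of a window is taken as the length of the polyline joining its p points in order of increasing q, the "jump" length is taken as the distance between the centroids of consecutive "shapes", and normalization is done by min-max rescaling. The function names are hypothetical.

```python
import numpy as np

def shape_and_jump_lengths(shapes):
    """shapes: array of shape (h, p, m), i.e. p map points per window.
    Returns (sigma, jumps) with lengths h and h - 1, under the assumed definitions."""
    shapes = np.asarray(shapes, dtype=float)
    seg = np.linalg.norm(np.diff(shapes, axis=1), axis=-1)       # (h, p - 1)
    sigma = seg.sum(axis=1)                                      # polyline length
    centroids = shapes.mean(axis=1)                              # (h, m)
    jumps = np.linalg.norm(np.diff(centroids, axis=0), axis=-1)  # (h - 1,)
    return sigma, jumps

def normalize(v):
    """Min-max rescaling to [0, 1] (assumed normalization)."""
    v = np.asarray(v, dtype=float)
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shapes = rng.normal(size=(27, 22, 2))    # synthetic stand-in for a global map
    sigma, jumps = shape_and_jump_lengths(shapes)
    print(normalize(sigma)[:3], normalize(jumps)[:3])
```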
We observe a pattern where time is an irregular flux, but where it often visits the same trail. The pattern varies with the comparison index, so it is not an intrinsic property, but rather the result of the observation perspective provided by each individual index. The patterns reveal trains, somehow in a tree structure. Another aspect to investigate is the effect of the window length on the time MDS map. Therefore, we consider shorter time windows, namely h = 2 × 27 and h = 3 × 27, corresponding to periods of approximately six and four months, respectively. Figure 12 shows the locus of T versus Σ for the DJ time-series, where the labels represent the windows. We observe the emergence of more complex, fractal-like, trees. Moreover, we verify that the CS revisits the same areas of the locus over time. The study of their properties needs to be pursued, but it seems to reveal that, for the CS under analysis, time flow has a non-smooth texture.
In conclusion, we have demonstrated that the choice of the comparison index leads to distinct sensitivities to the variation of the parameter and the evolution of time. While the graphical layout of the MDS "points" is usually of smaller importance, their clustering is fundamental in drawing conclusions. Therefore, the construction of MDS "shapes" can be explored to get an extra insight into the phenomena under analysis. This study constitutes a step towards building a more general formulation of the standard MDS method, and several questions emerge. Since we can explore indices with several parameters, do they lead to useful multidimensional "shapes" in the MDS maps? How can we analyze more assertively the sensitivity of different time windows? While the standard visualization treats the flux of time as a constant-velocity process, is MDS pointing to a variable-speed time arrow? We plan to apply the new methodology to several data series in the future, aiming to clarify some of these issues.

Figure 1. Time-series of the U.S. GDP per capita, for the period from year 1800 up to 2010.

Figure 2. MDS maps of "shapes" and the flow of time, obtained after Procrustes analysis: (a) m = 2; (b) m = 3. The Minkowski distance, $d_q$, is used with the U.S. GDP per capita time-series.

Figure 3. Multidimensional scaling (MDS) maps of "shapes" and the flow of time, obtained after Procrustes analysis: (a) m = 2; (b) m = 3. The generalized correlation index, $\rho_q$, is used with the Dow Jones (DJ) data.

Figure 4. MDS maps of "shapes" and the flow of time, obtained after Procrustes analysis: (a) m = 2; (b) m = 3. The generalized correlation index, $\rho_q$, is used with the Brent Spot (BR) data.

Figure 5. Superposition of the p MDS Shepard diagrams: (a) m = 2; (b) m = 3. The generalized correlation index, $\rho_q$, is used with the DJ data.

Figure 6. Superposition of the p MDS stress diagrams. The generalized correlation index, $\rho_q$, is used with the DJ data.

Figure 7. MDS maps of "shapes" and the flow of time, obtained after Procrustes analysis: (a) m = 2; (b) m = 3. The Minkowski distance, $d_q$, is used with the DJ data.

Figure 8. MDS maps of "shapes" and the flow of time, obtained after Procrustes analysis: (a) m = 2; (b) m = 3. The Minkowski distance, $d_q$, is used with the BR data.

Figure 9. MDS maps of "shapes" and the flow of time, obtained after Procrustes analysis: (a) m = 2; (b) m = 3. The generalized entropy index, $S^{(G)}_q$, is used with the DJ data.

Figure 10. MDS maps of "shapes" and the flow of time, obtained after Procrustes analysis: (a) m = 2; (b) m = 3. The generalized entropy index, $S^{(G)}_q$, is used with the BR data.

• In the time period 1930-1935, corresponding to the worst years of the Great Depression, the GDP had a strong oscillation, increasing (with small fluctuation) during the subsequent five-year period, 1935-1940. Such behavior is observed in the MDS maps as two moderate length "jumps" from Window 26 towards Window 27 and then towards Window 28;
• Periods 1940-1945, 1945-1950 and 1950-1955 had contrasting behavior, characterized by growth and recession, mostly determined by World War II. In the MDS maps, we observe large "jumps" across Windows 29 up to 31;
• For the time period 1955-1960, GDP growth became slower and then recovered during the period 1960-1965. This corresponds to large "jumps" across Windows 31, 32 and 33 in the MDS;
• From year 1960 onward, we observe a GDP moderate growth trend, with oscillations over time. This behavior translates into moderate, and similar, length "jumps" between shapes in the MDS maps.