Complexity Analysis of Escher’s Art

Art is the output of a complex system based on the human spirit and driven by several inputs that embed social, cultural, economic and technological aspects of a given epoch. A solid quantitative analysis of art poses considerable difficulties and reaching assertive conclusions is a formidable challenge. In this paper, we adopt complexity indices, dimensionality-reduction and visualization techniques for studying the evolution of Escher’s art. Grayscale versions of 457 artworks are analyzed by means of complexity indices and represented using the multidimensional scaling technique. The results are correlated with the distinct periods of Escher’s artistic production. The time evolution of the complexity and the emergent patterns demonstrate the effectiveness of the approach for a quantitative characterization of art.


Introduction
During human history, artists conceived harmonies of objects and forms in their works [1]. Artworks are manifestations of the artists' creativity, reflecting their thoughts and culture [2]. Often we have a glimpse of a kind of mathematical exercise embedded in the artworks [3][4][5][6][7][8][9]. It is well known the symbiosis of art and science produced by Leonardo da Vinci [10], or the magic complexity of the music composed by Bach [11]. The fascination of many artists/scientists for art/mathematics is ubiquitous in human history [12][13][14]. Nowadays, we can take advantage of computational algorithms to stimulate synergies between art and science.
Art is a token of a complex system produced by mankind and influenced by a plethora of social, cultural, economic and technological inputs that interact in time and space. The study of artworks may help interpreting the world and the human mind. However, an assertive analysis of art poses conceptual and practical difficulties and reaching quantitative conclusions represents a huge challenge.
The tools to analyze complex systems have been successfully adopted in different areas, including economics, life and social sciences [15], with the objective of finding fundamental principles and universalities that govern the systems' dynamics [16,17]. Complexity is helpful to quantitatively describe nonlinear systems and to detect changes in their dynamics.
Several complexity indices have been adopted to analyze art, namely entropy [18,19], Kolmogorov complexity [20,21], fractal dimension [22,23], and others [24,25]. Such indices are not independent, but they capture different aspects of the system state, complementing each other and leading to a deeper assessment of the subject under study [26]. The quantitative analysis of art dates back to 1933, when Birkhoff proposed an aesthetic measure as the ratio between order (i.e., number of regularities) and complexity (i.e., number of elements) of an image. Nevertheless, only recently quantitative techniques were applied, impelled by the availability of digital data and the development of computational tools. Taylor et al. [22] verified that Jackson Pollock's ) dripped patterns are fractals and that the fractal dimension of the paintings increased over the course of his artistic career. Dodgson [18] adopted the concepts of entropy and correlation to describe Bridget Riley's (1961-2012) stripe paintings. Cucker [3] suggested that geometry is an important source of rules for artistic creation. Wallraven et al. [27] used multidimensional scaling (MDS) and clustering techniques to categorize paintings. Kim et al. [28] analyzed a large database of images, finding that the color-usage distribution is remarkably different among historical periods of western paintings. Lee et al. [29] examined almost 180 thousand paintings, focusing on the evolution of the color contrast. Among other findings, they observed a sudden increase in the diversity of color contrast after 1850. Machado and Lopes [20] studied paintings from the viewpoint of information theory and fractional calculus. Sigaki et al. [19] addressed the local order patterns of almost 140 thousand artwork images using complexity indices and observed a clear and robust time evolution. Escher (1898Escher ( -1972 is one of the most celebrated modern graphic artists [30,31]. He is known for his works in woodcuts, lithographs and mezzotints representing fantastic, unusual and impossible objects, with various perspectives, generating optical illusions. Escher is considered a mathematical artist, especially geometric, influenced by his relationships with mathematicians and by its own interests and abilities in mathematics. His artworks explore aspects such as infinity, perspective, symmetry, reflection, hyperbolic geometry, truncated and stellated polyhedra, and tessellations [32,33]. Escher (1946)(1947)(1948)(1949)(1950)(1951)(1952)(1953)(1954)(1955)(1956))-in this period Escher worked with engravings, using unusual and multiple viewpoints, vanishing points and perspectives. Some works suggest the infinity of space through multiple vanishing points and bundles of straight lines. Escher stressed the sense of depth through the use of colors, progressively blurred throughout the pictures and creating the idea of an aerial perspective (e.g., 'Depth ', c. 1955). He also demonstrated interest in geometric solids due to his studies in mineralogy and crystallography.
In this paper we adopt complexity indices, dimensionality reduction and visualization techniques for studying the evolution of Escher's art. In the first phase, 457 artworks, produced between 1916 and 1969, were converted into digital format, discretized and represented in grayscale. The images were then processed by computing six distinct complexity indices, whose evolution was correlated with the periods of the artist's career. In a second phase, a MDS algorithm is adopted for visualizing complexity. The MDS was fed with dissimilarity information calculated with two different distance measures. The MDS maps were interpreted under the light of the emerging clusters and correlated with the periods of Escher's art. It should be noted that other indices can be used for quantifying complexity and different techniques can be adopted for dimensionality reduction, clustering and visualization. We can mention, for example, the use of time-frequency signal processing and hierarchical clustering for studying tidal data [35], the Lempel-Ziv complexity, sample entropy, signal harmonics power ratio, and fractal dimension for analyzing temperature time series [26], and information theory, fractional calculus and hierarchical clustering for studying art [20].
Considering these ideas, Section 2 introduces the mathematical background, emphasizing the concept of complexity and the MDS technique. Section 3 analyzes the evolution of Escher's art in the perspective of six complexity indices. Section 4 uses MDS for dimensionality reduction and data visualization, and interprets the generated maps in the perspective of the periods of the artist's career. Finally, Section 5 presents the conclusions.

Classic Information Indices
The information theory proposed by Shannon [36] was recently adopted in the study of complex systems [37,38].
Let us consider a discrete 1-dim random variable X with sample space {x 1 , . . . , x i , . . . , x N } and probability distribution P(X). An event, x i , with probability of occurrence P(x i ) has the information content: The Shannon entropy is the arithmetic average, or expected value, of h [P (x i )]: where the operator E (·) represents the expected value. The joint entropy of a two-dimensional discrete random variable (X, Y) with sample spaces {x 1 , . . . , x i , . . . , x N } and {y 1 , . . . , y j , . . . , y M }, and joint probability distribution P(X, Y) is [39]: The mutual information of (X, Y), with marginal probability distributions P(X) and P(Y), respectively, is given by [40,41]: and assesses the information shared by (X, Y). When X and Y are independent, there is no shared information between them, and the mutual information is I(X; Y) = 0.
In the follow-up, the complexity indices S(X, Y) and I(X; Y) will be interpreted as state variables and the locus (S, I) will be called the entropy-mutual information plane.

Permutation Entropy and Statistical Complexity
The permutation entropy (PE) was originally proposed as a robust index to assess the complexity of time series [42].
For a time series, {x k : k = 1, . . . , W}, x k ∈ R, we define two parameters: the embedding dimension, d ≥ 2, d ∈ N, and the embedding delay, τ ∈ N, that represent the length of the time series partitioning sequences, and the separation time between their elements, respectively. Let us denote by Ψ = {Π 1 , . . . , Π d! } the set of all possible permutations of the ordinals {1, . . . , d}, and by [I] the Iverson bracket [43], such that I = 1, if I is true 0, if I is false . The procedure for calculating PE is as follows: 1.
Sort the array by increasing order of the elements in the first row; 1.4.
Denote by π k the sequence of numbers in the second row of the sorted array; 2.
Compute the probability distribution P = [p 1 , p 2 , . . . , The value of PE lies in the interval 0 ≤ PE ≤ 1. The lower value PE = 0 indicates that the time series is regular or predictable, while the upper value PE = 1 corresponds to a random time series. The embedding dimension must be chosen such that W d! in order to obtain reliable values of PE. For practical purposes the values d ∈ {3, . . . , 7} and τ = 1 are suggested [42].
For two-dimensional data, as is the case of images, the generalization of the procedure is straightforward. Let us consider a greyscale image represented by the N x × N y matrix A. We define four parameters: two for the embedding dimensions, d x , d y ≥ 2, with d x , d y ∈ N, and two for the embedding delays, τ x , τ y ∈ N. The symbolic sequences for calculating the probabilities, P = [p 1 , p 2 , . . . , p (d x d y )! ], are now obtained from the spatial information in overlapping d x × d y dimensional submatrices. We have a total of (d x d y )! permutations of the ordinals {1, . . . , d x d y } and we choose d x and d y such that N x N y (d x d y )! for having reliable values for PE (for details see [44,45]). For images, the PE is a measure of 'randomness' in the layout of the pixels. If the pixels appear in a random (in the same) order, then PE → 1 (PE → 0).
Another complexity measure is the statistical complexity (C) proposed in [46,47]: where JSD [P U] is the Jensen-Shannon divergence between P = [p 1 , p 2 , . . . , p (d x d y )! ] and the uniform distribution (7) is a normalization constant.
Often the locus (PE, C) is adopted for characterizing data, being called as the complexityentropy plane [46,48].

Kolmogorov Complexity-Based Indices
The Kolmogorov complexity, K(A), of an object A provides a measure of information independently of any probabilistic assumptions about the data sequences in A. The complexity K(A) is defined as the size of the shortest program that, given an empty object at its input, computes A in an universal computer and then stops [49][50][51]. The exact value of K(A) is not computable [49][50][51][52][53][54]. Therefore, approximations based on the Lempel-Ziv [55], linguistic [56] and compression [57] algorithms are used to obtain the upper bounds K(A).
Lossless compression algorithms approximate K(A) by the size of the compressed object, K(A) ≈ size{Φ(A)}, where Φ(·) denotes compression [50,51]. However, for obtaining a good approximation, the compressor has to be 'normal'. This means that, given the object A and the concatenation of A with itself, AA, the compressor has to generate compressed objects such that size[Φ(A)] ≈ size[Φ(AA)] [50,51]. For having a complexity index independent of size[A] we use the complexity ratio (CR): The information distance between two objects {A 1 , A 2 } can be computed using the conditional Kolmogorov complexity, K(A 1 |A 2 ). This corresponds to the size of the shortest program to compute A 1 , given that A 2 is provided [58,59]. Therefore, when we have almost similar objects A 1 and A 2 , the task is less complex and the size of the program is smaller. The inequality K(A 1 |A 2 ) ≤ K(A 1 ) always holds and the normalized information distance, N ID, is formulated as an universal metric [58]: The N ID is a distance and thus it satisfies the conditions: N ID(A 1 , The N ID is based on K(·) and it is not computable [58]. To surpass this limitation, the normalized compression distance: was proposed for approximating the N ID [60,61]. The NCD is nonnegative, and we have 0 < NCD(A 1 , A 2 ) < 1 + , where > 0 denotes the error introduced by the compressor [61,62]. Stemming from (10), we propose the complexity index NCD R (A, R), where R represents a reference object of the same size of A. We tested for R pixels with random noise and all-identical shade of gray. The results showed that NCD R (A, R) is quite insensitive to the type of R, since it consists of a normalized distance. The locus (CR, NCD R ) will be designated by compression distance-ratio plane.

Multidimensional Scaling
The MDS is a technique for dimensionality reduction, clustering and computational visualization of multidimensional data [63][64][65][66][67][68][69]. Given L objects x i , i = 1, . . . , L, in a r-dim space and a measure of dissimilarity between the ith and jth objects, δ ij (x i , x j ), the procedure starts by calculating a L × L symmetric matrix, ∆ = [δ ij ] of object-to-object dissimilarities. The matrix ∆ gives the input information to the MDS numerical algorithm. The MDS represents objects by means of points located in a q-dim space (q < r) at distances θ ij , and iterates multiple configurations in order to maximize a fitness function and to achieve a map of points that approximates the original ones. By other words, the MDS calculates the matrix of distances Θ = [θ ij ] that try to mimic ∆ = [δ ij ]. A fitness function widely used is the raw stress: where f (·) denotes some linear or non-linear transformation. The MDS interpretation follows the patterns of points obtained in the MDS locus. Two similar (dissimilar) objects are shown as two points that are close to (far from) each other. Thus, the MDS interpretation does not follow neither the coordinates of the points, nor the shape of the clusters. In fact, we can translate, rotate and magnify the map, since the object-to-object distances are identical. Moreover, the MDS axes have neither units nor special physical meaning.
The MDS quality can be quantified by means of the Shepard and stress plots. The Shepard diagram compares θ ij and δ ij , for a particular value of q. A narrow scattering of the points represents a good fit between θ ij and δ ij . The stress diagram represents the locus of R versus q. Usually users adopt q = 2 or q = 3, because such values allow a direct visualization and establish a compromise between achieving low values of R and q.

Complexity of Escher's Art
This Section addresses the evolution of Escher's art through time in the perspective of six complexity indices. Section 3.1 characterizes the data set and the conversion scheme of the artwork into digital format. Section 3.2 analyzes the time evolution of the complexity indices. Section 3.3 discusses the complexity of Escher's art under the light of the loci (S, I), (PE, C) and (CR, NCD R ).

Data Description
The study involved 457 artworks created by Escher between the years 1916 and 1969. The digital images were obtained from the visual arts encyclopedia, at the website www.wikiart.org, on 5 March 2019. WikiArt is one of the largest visual arts databases available online for free. Each image file, stored in JPEG format, is read into a N x × N y × N z dimensional matrix, A. For N z = 3, A represents color images and its elements are 8-bit integers in the range between 0 and 255, corresponding to red, green and blue (RGB) intensities. The color images are converted to grayscale and represented by two-dimensional matrices (N z = 1), where the elements of A denote values between black and white (Gr), generated by the ITU-R BT.601 RGB to gray conversion scheme, Gr = 0.2989R + 0.5870G + 0.1140B. This pre-processing yields some loss of information in the originals, not only in terms of color and texture, but also about the three-dimensional textural surface of the painting. However, since most Escher paintings are black and white or with shades of gray, the procedure, necessary for reducing the volume of information and convert all works into an uniform format, does not compromise significantly the analysis. For example, Figure 1 depicts the 'Another World II' (c. 1947), and its corresponding grayscale image.

Time Evolution of the Complexity Indices
Here the artworks are characterized by means of the complexity indices in the set {S, I, PE, C, CR, NCD R }. Their time evolution is then correlated with the Escher's artistic periods P i , i = 1, . . . , 5.
For calculating S and I the probabilities P(x i , y j ) are obtained from the matrices A = [a ij ], i = 1, 2, . . . , N x , j = 1, 2, . . . , N y , through the proportion P( , where x i and y i are the i-th row and j-th column. Therefore P(x i , y i ) represents a normalized version of an image A in the perspective of probability. Figure 2 presents an illustrative example, showing P(x i , y j ) for the grayscale version of the artwork 'Another World II'. For determining PE and C, we use the parameters d x = d y = 2 and τ x = τ y = 1, which were adjusted by numeric simulations. For computing CR and NCD R we adopt the Windows implementation of the gzip compressor, version 1.3.12 (built upon the Lempel-Ziv coding algorithm LZ77), while the reference for the NCD R is a white image. Figure 3 depicts the S and I versus time, using box plots, for the period 1916-1969. In each box, the central mark indicates the median, the bottom and top edges correspond to the 25 and 75 percentiles, respectively, the whiskers span between the extreme data points (not considering the outliers), and the outliers are depicted using the marker '+'. The complexity indices were calculated considering artworks within five-year windows centered at the time stamp. Numerical experiments showed that this value was a good compromise for obtaining a readable and detailed graphical representation. Lower values increase the detail, but blur the charts, while higher values tend to filter too much the data and details are lost. We note that S has higher variations than I, discriminating better between periods. Indeed, we can find a relationship between the time evolution of S and I and the different periods of the Escher's artistic career, even knowing that this division is neither rigidly defined, nor absolutely consensual [34]. 1916  1917  1918  1919  1920  1921  1922  1923  1924  1925  1926  1927  1928  1929  1930  1931  1932  1933  1934  1935  1936  1937  1938  1939  1940  1941  1942  1943  1944  1945  1946  1947  1948  1949  1950  1951  1952  1953  1954  1955  1956  1957  1958  1959  1960  1961  1962  1963  1964  1965  1966  1967  1968  1969 Year 1 1.5  1918  1919  1920  1921  1922  1923  1924  1925  1926  1927  1928  1929  1930  1931  1932  1933  1934  1935  1936  1937  1938  1939  1940  1941  1942  1943  1944  1945  1946  1947  1948  1949  1950  1951  1952  1953  1954  1955  1956  1957  1958  1959  1960  1961  1962  1963  1964  1965  1966  1967  1968  1969 Year  For the other complexity indices we reach similar conclusions and, therefore, the corresponding box plots are omitted.
In synthesis, the complexity indices unveil direct correspondence with Escher's artistic periods. However, we note in the box plots some dispersion and outliers that may rise interesting questions.
For example, what is it about the outliers that make the complexity indices to exhibit fluctuation? Should the outliers be interpreted as anomalies? What is it about the quality of the original images? Since they are from a public database, is the quality reliable? Does it have influence in the results?

Loci of the Complexity Indices
We analyzed the loci (S, I), (PE, C) and (CR, NCD R ) to assess the complexity of Escher's art. The resulting curves are depicted in Figure 4, where the labels denote the two last digits of the years (form 1916 to 1969) and the points correspond to the medians of the complexity indices of artworks within year-centered five-year length windows. This windowing procedure hides some details, but gives a clear picture of the global locus.
Analyzing the locus (S, I) (Figure 4a), we verify that during P 1 the complexity has a very limited evolution, for P 2 it has a large excursion, for P 3 it changes direction, for P 4 it has another large excursion and, finally for P 5 it evolves almost in opposite direction with respect to the former period. It is also interesting to see that between two consecutive periods P i and P i+1 , (i = 2, 3, 4) we always has a tangle revealing the artist search for the new direction of work. For the locus (PE, C) (Figure 4b), we verified that the complexity indices were strongly correlated and, therefore, the five periods of Escher's career appear somewhat overlapped. Therefore, the exercise of discriminating P i , i = 1, . . . , 5, had some limitations. Moreover, we did not find an obvious presence of the inter-period tangles observed in Figure 4a. For (CR, NCD R ) (Figure 4c), we observed again a good match between the artistic periods of Escher and the locus. Furthermore, we noted also the presence of the inter-period tangles.

MDS Visualization of Complexity
In this Section we use the MDS technique for visualizing complexity. In Section 4.1 the input for the MDS is the dissimilarity information between artworks computed with the NCD (10). In Section 4.2 the MDS is fed with the dissimilarity information between the six complexity indices {S, I, PE, C, CR, NCD R } calculated with the Euclidean distance.

MDS Visualization of Complexity Based on the NCD
We compute the 457 × 457 dimensional matrix ∆ NCD = [NCD(A i , A j )], where NCD(A i , A j ) is given by (10) and denotes the dissimilarity between the artworks A i and A j , i, j = 1, . . . , 457. The matrix ∆ NCD is used as the input to the MDS numerical scheme. Since the MDS outputs a large number of points, we post-process the results by calculating the medians of the (x, y, z) coordinates that correspond to artworks within year-centered five-year length windows. Figure 5 depicts the resulting 54-point MDS three-dimensional map obtained with ∆ NCD for the period 1916-1969. We verify again the presence of the five periods P 1 -P 5 and the inter-period tangle structures.  Figure 6a,b show the corresponding MDS assessment charts. Since the Shepard diagram reveals a small scatter around the 45 degree line, we conclude that there exists a good fit between the original and the reproduced distances. The stress plot shows that the three-dimensional map (n = 3) is a good representation of the locus of points, since n = 3 corresponds to the elbow of the curve. Therefore, three-dimensional maps represent a good compromise between accuracy and readability.

MDS Visualization of Complexity Based on the Euclidean Distance
The six complexity indices {S, I, PE, C, CR, NCD R } reveal some correlation, as shown in Figure 7. Nonetheless, we conjecture that each index captures distinct details of the complex system and that a more complete assessment is accomplished when using all indices simultaneously. However, a six-dimensional visual representation is not possible and we decided to test the MDS technique for dimensionality-reduction and visualization. Correlation Matrix In a first phase a 54 × 6 dimensional array, T = [t ik ], is constructed, where t ik , i = 1, . . . , 54, k = 1, . . . , 6, represents the i-th year and the k-th complexity index. The data in T is then normalized by the mean and standard deviation to avoid numerical saturation. Therefore, the columns of T, that is, v k , are converted to:v where µ(·) and σ(·) denote the arithmetic mean and the standard deviation, respectively. In a second phase, the lines of the normalized array,û i , are used for calculating the dissimilarity matrix ∆ T = [δ(û i ,û j )], i, j = 1, . . . , 54, where δ(û i ,û j ) = ∑ k (û ik −û jk ) 2 1 2 denotes the Euclidean distance between the complexity indicesû i andû j . Other distances can be adopted, but several numerical experiments that the Euclidean distance yields good results. Finally, in a third phase, the matrix ∆ T is processed by means of the MDS for constructing the loci of objects that represent the the evolution of complexity. Figure 8 depicts the MDS three-dimensional map obtained with ∆ T for the period 1916-1969. We verify roughly the same characteristics as before for the periods, but in this case we obtain a smoother trajectory, since the adoption of six indices works like a low-pass filter of noise present in the individual complexity indices.
The Sheppard and stress plots are omitted since they are similar to those shown in Figure 6a,b.

Conclusions
We adopted six complexity indices, dimensionality-reduction and visualization techniques for studying the evolution of Escher's art. A total of 457 artworks, produced between years 1916 and 1969, were converted into digital format, discretized and represented in grayscale. The artworks were assessed by means of six distinct complexity indices and by the MDS technique. The results showed that the evolution of complexity is correlated with the periods of Escher's artistic career. On a different level, we conclude that the proposed indices represent reliable and assertive tools for assessing artwork, and motivate their adoption in other artistic manifestations of the human spirit.