Contextuality Analysis of Impossible Figures

This paper has two purposes. One is to demonstrate contextuality analysis of systems of epistemic random variables. The other is to evaluate the performance of a new, hierarchical version of the measure of (non)contextuality introduced in earlier publications. As objects of analysis we use impossible figures of the kind created by the Penroses and Escher. We make no assumptions as to how an impossible figure is perceived, taking it instead as a fixed physical object allowing one of several deterministic descriptions. Systems of epistemic random variables are obtained by probabilistically mixing these deterministic systems. This probabilistic mixture reflects our uncertainty or lack of knowledge rather than random variability in the frequentist sense.


Introduction
Our main purpose is to illustrate the use of epistemic random variables using objects that are naturally described in a deterministic way, but not uniquely. That is, these objects are described by one of several deterministic systems. The first application of contextuality analysis to such systems is presented in Reference [1], using various deterministic representations of the Liar's paradox. In the present paper the objects of the analysis are the so-called impossible figures: the Penrose triangle and several similar figures, as well as the Ascending and Descending staircase lithograph by M. C. Escher. The triangle and the staircase figures were famously discussed by Penrose and Penrose [2]. In Appendix B to this paper we report several measures of contextuality computed for our systems of epistemic random variables, but in the main text we focus on one measure only, introduced here for the first time: a hierarchical version of the (non)contextuality measure $\mathrm{CNT}_2$-$\mathrm{NCNT}_2$ described in References [3,4].
The Contextuality-by-Default (CbD) theory [1,3-5] has been developed to apply to abstract systems of random variables, irrespective of one's interpretation of the probabilistic notions involved. However, most of its applications have dealt with empirical data and random variables understood in the frequentist sense. In this paper we use the term epistemic random variable to denote a variable for which the probabilities with which it falls in different sets of possible values reflect our uncertainty or lack of knowledge rather than random variability in the frequentist sense.
A system $\mathcal{R}$ of random variables is a set of double-indexed random variables $R^c_q$, where $q \in Q$ denotes their content, which can be defined as a question to which the random variable responds, and $c \in C$ is their context, encompassing the conditions under which it is recorded. A system can be presented as
$$\mathcal{R} = \{R^c_q : c \in C,\ q \in Q,\ q \prec c\}, \qquad (1)$$
where $q \prec c$ indicates that content $q$ is responded to in context $c$. The variables of the subset
$$R^c = \{R^c_q : q \in Q,\ q \prec c\} \qquad (2)$$
are jointly distributed, whereas any two random variables in the subset
$$\mathcal{R}_q = \{R^c_q : c \in C,\ q \prec c\} \qquad (3)$$
are stochastically unrelated. (More generally, any two $R^c_q, R^{c'}_{q'} \in \mathcal{R}$ with $c \ne c'$ are stochastically unrelated. That is, they do not possess a joint distribution.) The set $R^c$ is called the bunch corresponding to context $c$, and the set $\mathcal{R}_q$ is referred to as the connection for content $q$.
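The bookkeeping behind bunches and connections can be illustrated with a minimal Python sketch. The encoding is an assumption made here for illustration only: each random variable $R^c_q$ is identified with a pair `(c, q)`, and the function names are hypothetical rather than part of any CbD software.

```python
from collections import defaultdict

# Hypothetical encoding: a system is the set of (context, content) pairs
# for which q ≺ c holds; each pair stands for one random variable R^c_q.
def bunches_and_connections(pairs):
    """Group variables into bunches (by context) and connections (by content)."""
    bunches = defaultdict(list)      # variables sharing a context c
    connections = defaultdict(list)  # variables sharing a content q
    for c, q in sorted(pairs):
        bunches[c].append((c, q))
        connections[q].append((c, q))
    return dict(bunches), dict(connections)

def jointly_distributed(v, w):
    """Two variables possess a joint distribution only if they share a context."""
    return v[0] == w[0]

# A 2 x 2 example: two contexts, each responding to the same two contents.
pairs = {(1, 1), (1, 2), (2, 1), (2, 2)}
bunches, connections = bunches_and_connections(pairs)
print(bunches[1])       # the bunch for context 1
print(connections[2])   # the connection for content 2
```

In this example the bunch for context 1 contains the variables `(1, 1)` and `(1, 2)`, which are jointly distributed, while the connection for content 2 contains `(1, 2)` and `(2, 2)`, which are stochastically unrelated.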
We will limit our discussion and applications to finite systems of binary random variables, with $n = |Q|$ and $m = |C|$. Without loss of generality we may assume all variables $R^c_q$ take values 0/1. These systems can be represented by three vectors. The first one is the vector of the low-level marginals,
$$\mathbf{l} = (1,\ p^c_q : c \in C,\ q \in Q,\ q \prec c),$$
where $p^c_q = \langle R^c_q \rangle = \Pr(R^c_q = 1)$, with 1 formally equal to $\langle \cdot \rangle$. The second vector, $\mathbf{c}$, is the vector of maximal probabilities, $\min(p^c_q, p^{c'}_q)$, with which two random variables $R^c_q, R^{c'}_q$ in a connection could both equal 1 (if they possessed a joint distribution). It is uniquely determined by vector $\mathbf{l}$. The third vector is $\mathbf{b} = (\mathbf{b}_2, \ldots, \mathbf{b}_r)$, where
$$\mathbf{b}_s = (p^c_{q_1,\ldots,q_s} : q_1, \ldots, q_s \in Q,\ c \in C,\ q_1, \ldots, q_s \prec c,\ q_1, \ldots, q_s \text{ are distinct}).$$
Here, $s = 2, \ldots, r$, with
$$p^c_{q_1,\ldots,q_s} = \langle R^c_{q_1} \cdots R^c_{q_s} \rangle = \Pr(R^c_{q_1} = 1, \ldots, R^c_{q_s} = 1),$$
and $2 \le r \le n$ is the largest number of distinct $q$'s within a bunch. (We could allow $r = 1$, in which case $\mathbf{b}$ is empty and the system is trivially noncontextual.) Thus, $\mathbf{b}$ is a minimal set of probabilities that completely describes the joint distributions of the bunches in the system (for a given vector $\mathbf{l}$). The coordinates of a specific system of random variables are then given by
$$\mathbf{p}^* = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*).$$
This is the reduced vectorial representation of the system, as discussed in Reference [3]. The system is noncontextual if and only if there is a vector $\mathbf{h} \ge 0$ (component-wise) such that
$$\mathbf{M}\mathbf{h} = \mathbf{p}^*. \qquad (9)$$
Here, the elements of $\mathbf{h}$ are probabilities assigned to all possible combinations of values (1's and 0's) assigned to all random variables $R^c_q$ in the system, and $\mathbf{M}$ is an incidence (Boolean) matrix [3,5]: in the row of $\mathbf{M}$ corresponding to a given element of $\mathbf{p}^*$, say, $\Pr(R^c_{q_1} = \ldots = R^c_{q_s} = 1)$, we put 1 in the columns corresponding to the elements of $\mathbf{h}$ in which $R^c_{q_1}, \ldots, R^c_{q_s}$ are assigned the value 1; other columns in this row are filled with 0.
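The construction of the incidence matrix described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the implementation used for the reported results: variables are encoded as `(context, content)` pairs, the function name `incidence_matrix` is hypothetical, and the rows are ordered as a normalization row, then the low-level marginals, then the connection rows, then the bunch rows by increasing $s$.

```python
from itertools import combinations, product

def incidence_matrix(pairs):
    """Build row labels and the Boolean matrix M for a system of 0/1 variables.

    Each row label is the tuple of variables that must all equal 1;
    each column is one assignment of 0/1 values to all variables.
    """
    variables = sorted(pairs)                       # one variable per (c, q)
    contexts = sorted({c for c, _ in variables})
    bunches = [[v for v in variables if v[0] == c] for c in contexts]
    r = max(len(b) for b in bunches)                # largest bunch size

    labels = [()]                                   # normalization row (all ones)
    labels += [(v,) for v in variables]             # low-level marginals
    labels += [(v, w) for v, w in combinations(variables, 2)
               if v[1] == w[1]]                     # same content, different contexts
    for s in range(2, r + 1):                       # bunch rows for s = 2, ..., r
        for bunch in bunches:
            labels += list(combinations(bunch, s))

    position = {v: k for k, v in enumerate(variables)}
    columns = list(product((0, 1), repeat=len(variables)))
    M = [[int(all(col[position[v]] == 1 for v in label)) for col in columns]
         for label in labels]
    return labels, M

labels, M = incidence_matrix({(1, 1), (1, 2), (2, 1), (2, 2)})
print(len(M), len(M[0]))   # number of rows and columns
```

For the 2 × 2 system in the example, the matrix has 9 rows (one normalization row, four marginals, two connection rows, two bunch rows) and 16 columns, one per assignment of values to the four variables. The number of columns doubles with every added variable, so this explicit construction is practical only for small systems.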
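Denoting the rows of $\mathbf{M}$ that correspond to $\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*$ by, respectively, $\mathbf{M}_l, \mathbf{M}_c, \mathbf{M}_b$, we can rewrite (9) in extenso:
$$\begin{pmatrix} \mathbf{M}_l \\ \mathbf{M}_c \\ \mathbf{M}_b \end{pmatrix} \mathbf{h} = \begin{pmatrix} \mathbf{l}^* \\ \mathbf{c}^* \\ \mathbf{b}^* \end{pmatrix}.$$
Reference [3] also introduces a measure of contextuality, $\mathrm{CNT}_2$, with its natural extension into a measure of noncontextuality, $\mathrm{NCNT}_2$. In Reference [4], the behavior of both measures was characterized for a special class of systems, known as cyclic. These measures can be computed with the aid of linear programming. To compute $\mathrm{CNT}_2$, one solves the following task:
$$\text{find } \mathbf{x} \text{ minimizing } \mathbf{1}^\top \mathbf{d} \text{ subject to } -\mathbf{d} \le \mathbf{b}^* - \mathbf{M}_b \mathbf{x} \le \mathbf{d},\quad \mathbf{x}, \mathbf{d} \ge 0,\quad \mathbf{M}_l \mathbf{x} = \mathbf{l}^*,\quad \mathbf{M}_c \mathbf{x} = \mathbf{c}^*.$$
For any solution $\mathbf{x}^*$, one finds $\mathrm{CNT}_2 = \|\mathbf{b}^* - \mathbf{M}_b \mathbf{x}^*\|_1$ ($L_1$-norm). By enumerating the elements of $\mathbf{b}^*$ as $1, \ldots, K$, the $\mathrm{NCNT}_2$ measure of noncontextuality is computed as
$$\mathrm{NCNT}_2 = \min_{i = 1, \ldots, K} \min(d^{*-}_i, d^{*+}_i),$$
where $d^{*-}_i, d^{*+}_i$ are solutions of the following linear programming tasks (denoting by $\mathbf{e}_i$ the unit vector with unity in its $i$-th element):
$$\text{find } d^{\mp}_i, \mathbf{x} \text{ maximizing } d^{\mp}_i \text{ subject to } \mathbf{b}^* \mp d^{\mp}_i \mathbf{e}_i = \mathbf{M}_b \mathbf{x},\quad d^{\mp}_i, \mathbf{x} \ge 0,\quad \mathbf{M}_l \mathbf{x} = \mathbf{l}^*,\quad \mathbf{M}_c \mathbf{x} = \mathbf{c}^*.$$
Clearly, for a contextual system $\mathrm{CNT}_2 > 0$ and $\mathrm{NCNT}_2 = 0$, whereas for a noncontextual one, $\mathrm{CNT}_2 = 0$ and $\mathrm{NCNT}_2 \ge 0$.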

Hierarchically Measuring Contextuality
The development of this measure has greatly benefited from discussions with and critical analysis by Janne V. Kujala (see Acknowledgements).
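For a given vector $\mathbf{l}^*$ (hence also the vector $\mathbf{c}^*$), we call the convex polytope
$$\mathbb{K} = \{\mathbf{b} : \exists\, \mathbf{h} \ge 0,\ \mathbf{M}_l \mathbf{h} = \mathbf{l}^*,\ \mathbf{M}_c \mathbf{h} = \mathbf{c}^*,\ \mathbf{M}_b \mathbf{h} = \mathbf{b}\}$$
the noncontextuality polytope. A system represented by the point $\mathbf{p}^* = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*)$ is noncontextual if and only if $\mathbf{b}^* \in \mathbb{K}$. The measures $\mathrm{CNT}_2$ and $\mathrm{NCNT}_2$ are the $L_1$-distance from $\mathbf{b}^*$ to the surface of $\mathbb{K}$ when the system is, respectively, contextual or noncontextual. In References [4,6] it is noted that
$$\max(0,\ p^c_q + p^c_{q'} - 1) \le p^c_{q,q'} \le \min(p^c_q, p^c_{q'}).$$
This observation is easily generalized to higher levels, for example,
$$\max(0, \ldots) \le p^c_{q_1,q_2,q_3,q_4} \le \min(p^c_{q_1,q_2,q_3},\ p^c_{q_1,q_2,q_4},\ p^c_{q_1,q_3,q_4},\ p^c_{q_2,q_3,q_4}),$$
and analogously for each $s = 2, \ldots, r$. That is, the elements of $\mathbf{b}^*$ are hierarchically bounded, with $\mathbf{l}^*$ providing the bounds for $\mathbf{b}^*_2$ and $\mathbf{b}^*_{s-1}$ determining the bounds for $\mathbf{b}^*_s$ if $2 < s \le r$. These hierarchical restrictions suggest a hierarchical way of approaching the measurement of the (non)contextuality of a system $\mathcal{R}$ represented by $\mathbf{p}^*$. Consider the systems of equations
$$(\mathbf{M}_l, \mathbf{M}_c, \mathbf{M}_2, \ldots, \mathbf{M}_s)\, \mathbf{h}_s = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*_2, \ldots, \mathbf{b}^*_s), \quad s = 2, \ldots, r, \qquad (17)$$
where $\mathbf{M}_s$ is the submatrix formed by the rows of $\mathbf{M}$ corresponding to the elements of $\mathbf{b}^*_s$. Clearly, for a contextual system there is a value $2 \le s^* \le r$ such that there is no solution $\mathbf{h}_s \ge 0$ for any $s \ge s^*$, while there is a solution for each $s < s^*$. For a noncontextual system, the solution to the system (9) implies the solution to all systems (17). Therefore, if $\mathcal{R}$ is contextual, we can further qualify its contextuality and say that it is contextual at level $s^*$.

Its degree of contextuality at level $s^*$ can be computed by solving the linear programming task
$$\text{find } \mathbf{x} \text{ minimizing } \mathbf{1}^\top \mathbf{d} \text{ subject to } -\mathbf{d} \le \mathbf{b}^*_{s^*} - \mathbf{M}_{s^*} \mathbf{x} \le \mathbf{d},\quad \mathbf{x}, \mathbf{d} \ge 0,\quad \mathbf{M}_l \mathbf{x} = \mathbf{l}^*,\quad \mathbf{M}_c \mathbf{x} = \mathbf{c}^*,\quad \mathbf{M}_s \mathbf{x} = \mathbf{b}^*_s \ (2 \le s < s^*),$$
and computing, for any solution $\mathbf{x}^*$,
$$\mathrm{CNT}_2^{s^*} = \|\mathbf{b}^*_{s^*} - \mathbf{M}_{s^*} \mathbf{x}^*\|_1.$$
Moreover, for each $s$ for which (17) has a solution, we can compute the noncontextuality of $\mathcal{R}$ at level $s$ as
$$\mathrm{NCNT}_2^{s} = \min_{i = 1, \ldots, K_s} \min(d^{*-}_i, d^{*+}_i),$$
where $K_s$ is the number of elements of $\mathbf{b}^*_s$, and $d^{*-}_i, d^{*+}_i$ are solutions to the linear programming tasks
$$\text{find } d^{\mp}_i, \mathbf{x} \text{ maximizing } d^{\mp}_i \text{ subject to } \mathbf{b}^*_s \mp d^{\mp}_i \mathbf{e}_i = \mathbf{M}_s \mathbf{x},\quad d^{\mp}_i, \mathbf{x} \ge 0,\quad \mathbf{M}_l \mathbf{x} = \mathbf{l}^*,\quad \mathbf{M}_c \mathbf{x} = \mathbf{c}^*,\quad \mathbf{M}_t \mathbf{x} = \mathbf{b}^*_t \ (2 \le t < s).$$
In this way, we construct the hierarchical measure of (non)contextuality, which characterizes the degree of (non)contextuality of a system $\mathcal{R}$ by a vector of size $s^* - 1$ if the system is contextual, namely $(\mathrm{NCNT}_2^{2}, \ldots, \mathrm{NCNT}_2^{s^*-1}, \mathrm{CNT}_2^{s^*})$, or of size $r - 1$ if the system is noncontextual, namely $(\mathrm{NCNT}_2^{2}, \ldots, \mathrm{NCNT}_2^{r})$. Writing
$$\mathbb{K}_s = \{\mathbf{b}_s : \exists\, \mathbf{h} \ge 0,\ \mathbf{M}_l \mathbf{h} = \mathbf{l}^*,\ \mathbf{M}_c \mathbf{h} = \mathbf{c}^*,\ \mathbf{M}_t \mathbf{h} = \mathbf{b}^*_t \ (2 \le t < s),\ \mathbf{M}_s \mathbf{h} = \mathbf{b}_s\},$$
for a contextual system $\mathrm{CNT}_2^{s^*}$ gives the $L_1$-distance from $\mathbf{b}^*_{s^*}$ to the surface of the polytope $\mathbb{K}_{s^*}$, and at each level $2 \le s < s^*$ ($2 \le s \le r$ for a noncontextual system), $\mathrm{NCNT}_2^{s}$ is the $L_1$-distance from $\mathbf{b}^*_s$ to the surface of the polytope $\mathbb{K}_s$. Analogously to $\mathrm{CNT}_2$ and $\mathrm{NCNT}_2$, for a system contextual at level $s^*$ we have $\mathrm{CNT}_2^{s^*} > 0$ and $\mathrm{NCNT}_2^{s^*} = 0$, and in addition $\mathrm{CNT}_2^{s} = 0$ for $s < s^*$.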
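The hierarchical bounds on the elements of the vector b can be checked mechanically. The following Python sketch is only an illustration: the function name and the dictionary encoding are assumptions made here, the pairwise lower bound used is the standard Fréchet bound, and beyond pairs only nonnegativity is checked from below.

```python
from fractions import Fraction
from itertools import combinations

def hierarchically_bounded(probs):
    """Check the level-by-level bounds within one bunch.

    probs maps a sorted tuple of variable indices to the probability that
    all of those variables equal 1, e.g. (0,) for a marginal, (0, 1) for a pair.
    """
    for idx, p in probs.items():
        s = len(idx)
        if s < 2:
            continue
        # Upper bound: smallest probability among the (s-1)-element subsets.
        upper = min(probs[sub] for sub in combinations(idx, s - 1))
        # Lower bound: Frechet bound for pairs; only nonnegativity beyond pairs.
        lower = max(0, probs[(idx[0],)] + probs[(idx[1],)] - 1) if s == 2 else 0
        if not (lower <= p <= upper):
            return False
    return True

half = Fraction(1, 2)
anticorrelated = {(0,): half, (1,): half, (0, 1): Fraction(0)}
correlated = {(0,): half, (1,): half, (0, 1): half}
invalid = {(0,): half, (1,): half, (0, 1): Fraction(3, 4)}  # hypothetical values
print(hierarchically_bounded(anticorrelated),
      hierarchically_bounded(correlated),
      hierarchically_bounded(invalid))
```

The first two dictionaries describe a pair of variables with marginals 1/2 whose joint probability of both equalling 1 sits at the lower and at the upper bound, respectively. The third, hypothetical one exceeds the upper bound and cannot describe any bunch.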

The Penrose Triangle
We will now apply the measures just constructed to some drawings known as impossible figures. The general idea underlying their contextuality analysis is that the epistemic random variables representing an impossible figure should always be compared to those representing a realizable (i.e., "normal") figure, and the degree of contextuality be derived from the difference between the two.
We begin with the well-known Penrose triangle, depicted in Figure 1a. Observe that one can see precisely two faces of each of the three bars forming the figure. Each corner of the triangle is therefore formed by four of these faces, one of which ends in the inner fold of the corner (invisible in Figure 1a; in other figures, for example, in Figure 2a, if the inner fold is visible, then two faces, of two different bars, end in it). Looking at the two faces of a given bar near one of the corners, either one of the faces ends in the inner fold while the other does not (we encode this case by 0), or neither does (encoded as 1). In Figure 1b this is shown by interrupted and solid lines of the cuts made in each bar at the two corners it connects. The endpoint labels 1 and 0 correspond to, respectively, the case when both cut lines are solid, and the case when one of them is interrupted.
We do not claim that this encoding describes how the figure or its elements are perceived. The latter is a question for an empirical investigation, such as the one presented in Reference [7], where contextuality analysis was applied to perception of an ambiguous figure (Schröder's stair). We are merely selecting a possible description of the figure as a fixed physical object. Other mathematical descriptions of the Penrose triangle and other impossible figures (some of them very different from those considered in this paper) can be found in References [8][9][10].
Perception may make use of such descriptions, but it is most likely a complex process with descriptions changing in time. We can construct a system representation of the Penrose triangle in two ways. The first one consists in looking at each of the three bars separately, and arbitrarily picking one of its two ends as the first one in an ordered pair. We see in Figure 1 that whenever the end we pick is coded 0, the other end of the bar is coded 1, and vice versa. Thus, we get two deterministic descriptions,
$$(R^c_q, R^c_{q'}) = (0, 1) \quad \text{and} \quad (R^c_q, R^c_{q'}) = (1, 0),$$
where $c$ designates a specific bar (here, the left, right, or bottom one), and $(q, q')$ is an ordered pair of its endpoints. Since these deterministic descriptions are equally plausible (i.e., there is no preferred ordering of the endpoints of an isolated bar), we can assign the epistemic probability 1/2 to each of them, thereby obtaining two perfectly anticorrelated uniformly distributed epistemic random variables,
$$\begin{array}{c|cc|c}
c & R^c_{q'} = 1 & R^c_{q'} = 0 & \\ \hline
R^c_q = 1 & 0 & 1/2 & 1/2 \\
R^c_q = 0 & 1/2 & 0 & 1/2 \\ \hline
 & 1/2 & 1/2 &
\end{array}$$
The stand-alone numbers in the table are joint and marginal probabilities. We now have to add for comparison epistemic random variables describing a bar of a realizable (i.e., "normal") triangle, such as the ones shown in Figure 2. We define a realizable figure as one that can be viewed as an oblique projection of a physical object of relatively small thickness (so that perspective can be ignored). Each of the contexts representing a realizable bar has two identically labelled ends, and depending on the oblique projection angle, each bar (left, right, or bottom) can be labelled (1, 1) or (0, 0). Using the same uniform epistemic mixing as before, we obtain, in a separate context $c'$, two perfectly correlated uniformly distributed epistemic random variables,
$$\begin{array}{c|cc|c}
c' & R^{c'}_{q'} = 1 & R^{c'}_{q'} = 0 & \\ \hline
R^{c'}_q = 1 & 1/2 & 0 & 1/2 \\
R^{c'}_q = 0 & 0 & 1/2 & 1/2 \\ \hline
 & 1/2 & 1/2 &
\end{array}$$
Repeating this reasoning for each of the three bars, we get the context-content matrix.
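The uniform epistemic mixing of deterministic descriptions is easy to reproduce. The sketch below, with hypothetical helper names, mixes the two deterministic descriptions of a Penrose bar and of a realizable bar and confirms that the resulting pairs of variables are perfectly anticorrelated and perfectly correlated, respectively.

```python
import math
from collections import Counter
from fractions import Fraction

def uniform_mixture(patterns):
    """Assign equal epistemic probability to each listed deterministic pattern."""
    counts = Counter(patterns)
    return {pat: Fraction(k, len(patterns)) for pat, k in counts.items()}

def pr_ones(dist, *positions):
    """Probability that the variables at the given positions all equal 1."""
    return sum(p for pat, p in dist.items() if all(pat[i] == 1 for i in positions))

def correlation(dist, i, j):
    """Pearson correlation of two 0/1 variables with nondegenerate marginals."""
    pi, pj = pr_ones(dist, i), pr_ones(dist, j)
    cov = pr_ones(dist, i, j) - pi * pj
    return float(cov) / math.sqrt(pi * (1 - pi) * pj * (1 - pj))

penrose_bar = uniform_mixture([(0, 1), (1, 0)])     # ends labelled differently
realizable_bar = uniform_mixture([(0, 0), (1, 1)])  # ends labelled identically

print(pr_ones(penrose_bar, 0), pr_ones(penrose_bar, 0, 1), correlation(penrose_bar, 0, 1))
print(pr_ones(realizable_bar, 0), pr_ones(realizable_bar, 0, 1), correlation(realizable_bar, 0, 1))
```

Exact fractions are used for the probabilities so that the marginals (1/2) and the joint probabilities (0 and 1/2) match the values in the tables without rounding.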
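$$\begin{array}{cccccc|c}
R^1_1 & R^1_2 & \cdot & \cdot & \cdot & \cdot & c_1 \\
R^2_1 & R^2_2 & \cdot & \cdot & \cdot & \cdot & c_2 \\
\cdot & \cdot & R^3_3 & R^3_4 & \cdot & \cdot & c_3 \\
\cdot & \cdot & R^4_3 & R^4_4 & \cdot & \cdot & c_4 \\
\cdot & \cdot & \cdot & \cdot & R^5_5 & R^5_6 & c_5 \\
\cdot & \cdot & \cdot & \cdot & R^6_5 & R^6_6 & c_6 \\ \hline
q_1 & q_2 & q_3 & q_4 & q_5 & q_6 & P^1_3
\end{array}$$
(Option 1)

The contexts and contents in $P^1_3$ have been numbered so that the odd contexts represent the bars of the Penrose triangle (with the perfectly anticorrelated variables), and the even ones represent the corresponding realizable bars (with perfectly correlated variables). This ordering highlights the fact that the system is composed of three disjoint 2 × 2 subsystems (formally, cyclic systems of rank 2). We call this way of representing the impossible figure Option 1.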
Another way (Option 2) to approach the construction of a system of epistemic random variables describing the Penrose triangle is to look at the sequence of the labels for the cuts in the entire triangle. We arbitrarily choose one of the six cuts in the figure as a starting point, and then proceed to the second cut in the same bar, then to the nearest cut in the adjacent bar, and so forth. This produces one of the two patterns shown below, representing three starting points each:
$$(1, 0, 1, 0, 1, 0) \qquad (0, 1, 0, 1, 0, 1)$$
By uniformly mixing these patterns, we obtain a vector of six jointly distributed epistemic random variables that represent the Penrose triangle in a single context:
$$(R^1_1, R^1_2, R^1_3, R^1_4, R^1_5, R^1_6) \overset{\text{dist}}{=} \begin{array}{ccc}
(1, 0, 1, 0, 1, 0) & (0, 1, 0, 1, 0, 1) & \text{any other pattern} \\
1/2 & 1/2 & 0
\end{array}$$
The numbers in the second row are probabilities of the values in the first row, and $\overset{\text{dist}}{=}$ stands for "distributed as." To complete Option 2, we have to add for comparison the epistemic random variables describing a realizable triangle in the same fashion. We select the triangle depictions that may be obtained by oblique projection at arbitrary angles excluding multiples of 60 deg. (We exclude the multiples of 60 deg because at these angles one of the bars is drawn with a single visible side instead of two.) Except for rotations, this process produces two distinct figures with respect to the patterns of 0 and 1 that describe them in a similar way to the Penrose triangle above. The three left patterns in (28) below are the possible patterns that describe the triangle in Figure 2a, and the three patterns on the right describe the one in Figure 2b.
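The claim that the six starting points yield only two patterns, three starting points each, can be verified with a short Python sketch. The layout is an assumption made for illustration: cuts `2k` and `2k + 1` are taken to be the two ends of bar `k`, listed in order around the figure, and the helper name `read_pattern` is hypothetical.

```python
from collections import Counter

def read_pattern(labels, start):
    """Read all cut labels around the figure, starting at cut `start`.

    Assumed layout: cuts 2k and 2k+1 are the two ends of bar k. The first
    move goes to the other cut of the same bar, which fixes the direction
    of travel for the rest of the sequence.
    """
    n = len(labels)
    step = 1 if start % 2 == 0 else -1
    return tuple(labels[(start + step * k) % n] for k in range(n))

# Endpoint codes of the Penrose triangle: the two ends of every bar differ.
penrose_labels = (1, 0, 1, 0, 1, 0)
patterns = [read_pattern(penrose_labels, start) for start in range(6)]
tally = Counter(patterns)
for pattern, count in sorted(tally.items(), reverse=True):
    print(pattern, count)
```

Each of the two patterns is produced by three of the six starting points, so the uniform mixture over starting points assigns probability 1/2 to each pattern.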
By taking the uniform mixture of these deterministic patterns, we produce a joint distribution that represents a realizable triangle as the second context in the context-content matrix $P^2_3$ below:
$$\begin{array}{cccccc|c}
R^1_1 & R^1_2 & R^1_3 & R^1_4 & R^1_5 & R^1_6 & c_1 \\
R^2_1 & R^2_2 & R^2_3 & R^2_4 & R^2_5 & R^2_6 & c_2 \\ \hline
q_1 & q_2 & q_3 & q_4 & q_5 & q_6 & P^2_3
\end{array}$$
(Option 2)

Whichever of the two options we choose, the pairs of random variables that represent two cuts on the same bar are perfectly negatively correlated in the case of the Penrose triangle, and are perfectly positively correlated in a realizable triangle. Because of this the systems in both Options 1 and 2 are contextual. In both cases the contextuality is achieved at level 2. Therefore their hierarchical representation is a one-component vector. The values are $\mathrm{CNT}_2^2 = 1.5$ in Option 1 and $\mathrm{CNT}_2^2 = 4.5$ in Option 2. (At this introductory stage we have not discussed the issue of normalization of the (non)contextuality values, because of which we should avoid comparing the values computed for systems of different format, such as our Option 1 and Option 2 systems.) The value of $\mathrm{CNT}_2^2$ for Option 1 is clearly the sum of the contextuality values of the three disjoint subsystems. Generally, for a system $\mathcal{R}$ which is composed of $N$ disjoint systems $\mathcal{R}_i$, the noncontextuality polytope $\mathbb{K}$ is the Cartesian product of the corresponding noncontextuality polytopes $\mathbb{K}_i$ of each system. Consequently,
$$\mathrm{CNT}_2(\mathcal{R}) = \sum_{i=1}^{N} \mathrm{CNT}_2(\mathcal{R}_i)$$
and
$$\mathrm{CNT}_2^{s}(\mathcal{R}) = \sum_{i=1}^{N} \mathrm{CNT}_2^{s}(\mathcal{R}_i)$$
for $2 \le s \le s^*$.
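The Option 1 value can be checked by hand for a single 2 × 2 subsystem. The following worked computation is a sketch that follows the definition of the measure as an $L_1$-distance; the coupling variables $S^c_q$ and the parameter $t$ are symbols introduced here for illustration only.

```latex
% One 2 x 2 subsystem: a Penrose bar (context 1) and a realizable bar (context 2).
\begin{align*}
&p^1_1 = p^1_2 = p^2_1 = p^2_2 = \tfrac12, \qquad
  \mathbf{b}^* = (p^1_{1,2},\, p^2_{1,2}) = (0,\, \tfrac12). \\
&\text{Connection constraints: } \Pr(S^1_q = S^2_q = 1) = \min(p^1_q, p^2_q) = \tfrac12
  \;\Rightarrow\; S^1_q = S^2_q \text{ with probability } 1, \quad q = 1, 2. \\
&\text{Hence } \Pr(S^1_1 = S^1_2 = 1) = \Pr(S^2_1 = S^2_2 = 1) = t,
  \qquad 0 \le t \le \tfrac12. \\
&\|\mathbf{b}^* - (t, t)\|_1 = |0 - t| + |\tfrac12 - t| = \tfrac12
  \quad \text{for every admissible } t. \\
&\mathrm{CNT}_2^2(\text{one subsystem}) = \tfrac12, \qquad
  \mathrm{CNT}_2^2(P^1_3) = 3 \times \tfrac12 = 1.5 .
\end{align*}
```

The distance does not depend on $t$, so every admissible coupling is optimal, and summing over the three disjoint subsystems reproduces the reported value of 1.5.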

Other Impossible Figures
To further explore how our Option 1 and Option 2 representations capture the impossibility of a figure by the degree of contextuality of the resulting systems, let us consider an alternative impossible triangle, depicted in Figure 3. Unlike the Penrose triangle, this one has only one bar (the left one) with different labels at its ends. For the bottom bar, both ends are coded as 1, and for the right bar the two ends are coded as 0. Intuitively, this triangle seems "less impossible" than the Penrose one. Following the same Option 1 procedure as before, with the same variables describing a realizable triangle, we obtain a system of the same format as system $P^1_3$; however, in two of the disjoint 2 × 2 subsystems, in the odd-numbered contexts representing the bottom and right bars, the two random variables become deterministic, thereby making these subsystems noncontextual. (In CbD, any deterministic variable in a system can be deleted from the system without affecting its (non)contextuality, and the same is true for any variable that is alone in its connection. Therefore, the two subsystems corresponding to the bottom and the right bars can be removed without affecting the analysis.) For Option 2, we obtain a system of the same format as $P^2_3$, but the distribution of the bunch corresponding to $c_1$ (impossible triangle) is given by the uniform mixture of the patterns below:

Finally, we need to check that the procedures above yield no contextuality if an impossible figure is replaced with a realizable one. We use the realizable triangle in Figure 2a. Option 1 produces three deterministic pairs of random variables for all three odd-numbered contexts corresponding to the pictured triangle.
For Option 2, the realizable triangle is represented by a context with the joint distribution given by the uniform mixture of the following patterns:

The realizable triangle is noncontextual under both options, and all its hierarchical $\mathrm{NCNT}_2^{s}$ measures are zero (indicating that the systems for the realizable triangle lie on the surface of the corresponding noncontextuality polytopes).
Contextuality analysis similar to that reported above for the impossible triangles can be extended to other impossible figures. We have conducted this analysis for the impossible square (or rectangle) and the impossible circle (also known as the impossible loop). For the impossible square, in addition to the figure constructed with the same type of corners as in the Penrose triangle (Figure 4a), we have considered an alternative impossible square (Figure 4b). Figure 4c shows a realizable ("normal") square. Since the procedures and reasoning here are in all essential details the same as for the impossible triangles, we have relegated the details to Appendix A. Figure 5 depicts an impossible circle (Figure 5a) and a realizable one (Figure 5b). To characterize the circles, we may look at them as being composed of two handles joined by curved bars, with the cut lines drawn between the handles and the joining bars. In this way, we obtain systems analogous to those obtained above under Options 1 and 2. Again, we relegate the details of the analysis to Appendix A. We summarize the results of our analysis in Table 1. As we see, all the impossible figures we explored, under both Option 1 and Option 2, are contextual at level 2.

Table 1. Values of $\mathrm{CNT}_2^2$ for the impossible figures under Option 1 and Option 2.

Figure                  Option 1   Option 2
Triangle in Figure 1    1.5        4.5
Triangle in Figure 3    0.5        1.5
Square in Figure 4a     2.0        8.0
Square in Figure 4b     1.0        2.0
Circle in Figure 5a     1.0        2.0

Escher's "Ascending and Descending"
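We approach the Ascending and Descending lithograph by M. C. Escher (Figure 6) by considering the four stair flights as four contents, $q_1, q_2, q_3, q_4$. As with the impossible figures, we have two ways of constructing the system of epistemic random variables. In Option 1, the system is represented by the following context-content matrix:
$$\begin{array}{cccc|c}
R^1_1 & R^1_2 & \cdot & \cdot & c_1 \\
\cdot & R^2_2 & R^2_3 & \cdot & c_2 \\
\cdot & \cdot & R^3_3 & R^3_4 & c_3 \\
R^4_1 & \cdot & \cdot & R^4_4 & c_4 \\
R^5_1 & R^5_2 & R^5_3 & R^5_4 & c_5 \\ \hline
q_1 & q_2 & q_3 & q_4 & E^1
\end{array} \qquad (35)$$
If we view a stair flight as ascending, then $R^c_q = 1$, otherwise $R^c_q = 0$. Any two consecutive stair flights in the picture are seen as both ascending or both descending. By uniformly mixing the deterministic patterns $(0, 0)$ and $(1, 1)$ for each such pair, we form the first four contexts in the system. Thus, in contexts $c_1$ to $c_4$, the distributions are described by
$$\begin{array}{c|cc|c}
c_k & R^k_{k \oplus 1} = 1 & R^k_{k \oplus 1} = 0 & \\ \hline
R^k_k = 1 & 1/2 & 0 & 1/2 \\
R^k_k = 0 & 0 & 1/2 & 1/2 \\ \hline
 & 1/2 & 1/2 &
\end{array}$$
for $k = 1, \ldots, 4$ ($\oplus$ is cyclic addition, with $4 \oplus 1 = 1$). The fifth context in $E^1$ includes the possible patterns of ascent and descent for the entire staircase. The strangeness (or impossibility) of the situation in the lithograph is that four stair flights forming a closed loop cannot ascend or descend indefinitely: the number of ascending stair flights should be precisely two to counterbalance the descending ones. In other words, the physically possible values of the bunch $(R^5_1, R^5_2, R^5_3, R^5_4)$ are
$$\begin{array}{cccc|c}
0 & 0 & 1 & 1 & c_5 \\
0 & 1 & 0 & 1 & c_5 \\
0 & 1 & 1 & 0 & c_5 \\
1 & 0 & 0 & 1 & c_5 \\
1 & 0 & 1 & 0 & c_5 \\
1 & 1 & 0 & 0 & c_5 \\ \hline
q_1 & q_2 & q_3 & q_4 &
\end{array}$$
The epistemic distribution of the fifth bunch is therefore given by the uniform mixture of these deterministic patterns.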
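The fifth bunch can be generated by enumeration rather than listed by hand. The Python sketch below (helper names are hypothetical) keeps the patterns with exactly two ascending flights, mixes them uniformly, and compares the probability that two consecutive flights both ascend with the value 1/2 that holds in contexts $c_1$ to $c_4$.

```python
from fractions import Fraction
from itertools import product

FLIGHTS = 4  # 1 = ascending, 0 = descending, as in the text

# Physically possible patterns for a closed loop: exactly two ascending flights.
closed_loop = [p for p in product((0, 1), repeat=FLIGHTS) if sum(p) == 2]
weight = Fraction(1, len(closed_loop))  # uniform epistemic mixture

def pr_ascending(*flights):
    """Probability, in the fifth context, that all listed flights ascend."""
    return sum(weight for p in closed_loop if all(p[i] == 1 for i in flights))

print(len(closed_loop))                           # number of admissible patterns
print(pr_ascending(0))                            # marginal for one flight
print([pr_ascending(k, (k + 1) % FLIGHTS) for k in range(FLIGHTS)])
```

There are six admissible patterns, every flight ascends with probability 1/2, and two consecutive flights both ascend with probability 1/6. In contexts $c_1$ to $c_4$ the same pairs both ascend with probability 1/2, and this mismatch between the bunches is what the contextuality analysis quantifies.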
The second way in which we represent the Ascending-Descending staircase is by looking at the four stair flights together. This forms one ("impossible") context where either all stair flights are described as ascending ($R^1_q = 1$, for $q = 1, \ldots, 4$) or all of them are descending ($R^1_q = 0$, for $q = 1, \ldots, 4$). A second context, describing the physically realizable patterns, is formed in the same way as the fifth context above. The system we obtain in this way is
$$\begin{array}{cccc|c}
R^1_1 & R^1_2 & R^1_3 & R^1_4 & c_1 \\
R^2_1 & R^2_2 & R^2_3 & R^2_4 & c_2 \\ \hline
q_1 & q_2 & q_3 & q_4 & E^2
\end{array}$$
For both these options, the resulting systems are contextual at the level $s^* = 2$, and the values are $\mathrm{CNT}_2^2 = 4/3$ for Option 1 and $\mathrm{CNT}_2^2 = 2$ for Option 2.

Conclusions
We have introduced a hierarchical measure of (non)contextuality of systems of random variables. It follows the same logic and is calculated similarly to the measures of contextuality $\mathrm{CNT}_2$ and noncontextuality $\mathrm{NCNT}_2$. It is clear that in cyclic systems the hierarchical measure equals $\mathrm{CNT}_2$ when the system is contextual, and it equals $\mathrm{NCNT}_2$ when the system is noncontextual. It still remains to investigate whether some of the properties of $\mathrm{CNT}_2$-$\mathrm{NCNT}_2$ described in Reference [4] for cyclic systems generalize to some classes of noncyclic systems.
The analysis of the impossible figures shows that the intuitive degree of the impossibility or strangeness of those figures can be captured through the contextuality of the systems of epistemic random variables chosen to describe them. When the endpoint codes of a bar in a Penrose-like figure are anticorrelated, the adjacent bars appear to twist toward different directions. The more such "twisted" situations we see in a figure, the stranger it looks, and the greater its contextuality (if achieved at the same level).
All contextual systems constructed in this paper happen to be contextual at level 2. This cannot be otherwise for the Option 1 systems which only contain two random variables in each bunch. However, it is an empirical rather than mathematically deducible fact for other systems. More work is needed to find out the scope of the impossible figures whose reasonable descriptions have the same property.

Appendix A
To create the epistemic random variables depicting a realizable ("normal") square, we proceed analogously to the realizable triangles: we systematically consider all depictions that may be obtained by oblique projection at arbitrary angles excluding the multiples of 90 deg. In this case, ignoring rotations, there is only one pattern of endpoint labels produced by all realizable squares. Varying the starting point, we get the following sequences whose uniform mixture yields the epistemic random variables for realizable squares:
$$\begin{array}{cccccccc|c}
0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & c_2 \\
1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & c_2 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & c_2 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & c_2 \\ \hline
q_1 & q_2 & q_3 & q_4 & q_5 & q_6 & q_7 & q_8 &
\end{array} \qquad (A1)$$
All the results for impossible squares are obtained as a straightforward expansion of those for impossible triangles. Following Option 1, we obtain the systems describing the rectangles in Figure 4 in the same format as $P^1_3$, extended to four bars (four disjoint 2 × 2 subsystems over the contents $q_1, \ldots, q_8$). The systems for Option 2 have the same format as $P^2_3$, with each of the two contexts containing all eight contents. For the first option, just as with the triangles, all random variables representing a realizable bar are perfectly correlated. The random variables representing bars of the Penrose-like square are negatively correlated, whereas for the alternative impossible square, two of the four odd-numbered bunches become deterministic. For the realizable square in Figure 4c, all four odd-numbered bunches are deterministic, making the system trivially noncontextual.

For the impossible circle, the system for Option 1 consists of disjoint 2 × 2 subsystems over the four contents $q_1, \ldots, q_4$, and the system for Option 2 consists of two contexts, each containing all four contents. The distributions of the random variables in each context for Option 1 follow the same pattern as the corresponding Option 1 distributions for the triangles and the squares. The two variables in a bunch corresponding to the impossible loop are anticorrelated, and the ones corresponding to the realizable one are perfectly correlated.
For Option 2, the patterns
$$\begin{array}{cccc}
0 & 1 & 0 & 1 \\
1 & 0 & 1 & 0 \\ \hline
q_1 & q_2 & q_3 & q_4
\end{array} \qquad (A8)$$
generate the joint distribution for the bunch representing the impossible circle. The patterns that generate the joint distribution for the bunch representing the realizable circle are
$$\begin{array}{cccc}
0 & 0 & 1 & 1 \\
1 & 1 & 0 & 0 \\ \hline
q_1 & q_2 & q_3 & q_4
\end{array} \qquad (A9)$$
The measures of contextuality for the impossible loop are $\mathrm{CNT}_2^2 = 1$ under Option 1 and $\mathrm{CNT}_2^2 = 2$ under Option 2. The realizable figure is noncontextual and all its noncontextuality measures are zero.

Appendix B. Measures of Contextuality
For completeness and comparison purposes, we present the values of the contextuality measures $\mathrm{CNT}_1$, $\mathrm{CNT}_2$, $\mathrm{CNT}_2^2$, $\mathrm{CNT}_3$, and the contextual fraction (CNTF), for each of the systems used to represent the impossible figures. The measure $\mathrm{CNT}_2^2$ is the hierarchical measure of (non)contextuality described in this paper. The contextual fraction was introduced in Reference [11]. The other measures and the linear programming tasks that may be used to compute them are presented in Reference [3]. Table A1 presents the contextuality measures for the systems representing the triangles.