Ruling out higher-order interference from purity principles

As first noted by Rafael Sorkin, there is a limit to quantum interference. The interference pattern formed in a multi-slit experiment is a function of the interference patterns formed between pairs of slits; there are no genuinely new features resulting from considering three slits instead of two. Sorkin has introduced a hierarchy of mathematically conceivable higher-order interference behaviours, where classical theory lies at the first level of this hierarchy and quantum theory at the second. Informally, the order in this hierarchy corresponds to the number of slits on which the interference pattern has an irreducible dependence. Many authors have wondered why quantum interference is limited to the second level of this hierarchy. Does the existence of higher-order interference violate some natural physical principle that we believe should be fundamental? In the current work we show that such principles can be found which limit interference behaviour to second-order, or "quantum-like", interference, but that do not restrict us to the entire quantum formalism. We work within the operational framework of generalised probabilistic theories, and prove that any theory satisfying Causality, Purity Preservation, Pure Sharpness, and Purification (four principles that formalise the fundamental character of purity in nature) exhibits at most second-order interference. Hence these theories are, at least conceptually, very "close" to quantum theory. Along the way we show that systems in such theories correspond to Euclidean Jordan algebras. Hence, they are self-dual and, moreover, multi-slit experiments in such theories are described by pure projectors.


Introduction
Described by Feynman as "impossible, absolutely impossible, to explain in any classical way" [36] (volume 1, chapter 37), quantum interference is a distinctive signature of non-classicality. However, as first noted by Rafael Sorkin [69,70], there is a limit to this interference; in contrast to the case of two slits, the interference pattern formed in a three-slit experiment can be written as a linear combination of two-slit and one-slit patterns. Sorkin has introduced a hierarchy of mathematically conceivable higher-order interference behaviours, where classical theory lies at the first level of this hierarchy and quantum theory at the second. Informally, the order in this hierarchy corresponds to the number of slits on which the interference pattern has an irreducible dependence.
Many authors have wondered why quantum interference is limited to the second level of this hierarchy [69,52,50,8,73,72,71,60,51,32,12]. Does the existence of higher-order interference violate some natural physical principle that we believe should be fundamental [53]? In the current work we show that such natural principles can be found which limit interference behaviour to second-order, or "quantum-like", interference, but that do not restrict us to the entire quantum formalism.
We work in the framework of general probabilistic theories [10,39,14,15,38,6,9,31,55,22,47,49,48,11]. This framework is general enough to accommodate essentially arbitrary operational theories, where an operational theory specifies a set of laboratory devices which can be connected together in different ways, and assigns probabilities to different experimental outcomes. Investigating how the structural and information-theoretic features of a given theory in this framework depend on different physical principles deepens our physical and intuitive understanding of such features. Indeed, many authors [38,15,40,31,55] have derived the entire structure of finite-dimensional quantum theory from simple information-theoretic axioms, reminiscent of Einstein's derivation of special relativity from two simple physical principles. So far, ruling out higher-order interference has required thermodynamic arguments. Indeed, by combining the results and axioms of Refs. [20,46], higher-order interference could be ruled out in theories satisfying the combined axioms. In this paper we show that we can prove this in a more direct way from first principles, using only the axioms of Ref. [20].
Many experimental investigations have searched for divergences from quantum theory by looking for higher-order interference [68,67,61,45,44]. These experiments involved passing a particle through a physical barrier with multiple slits and comparing the interference patterns formed on a screen behind the barrier when different subsets of slits are closed. Given this set-up, one would expect that the physical theory being tested should possess transformations that correspond to the action of blocking certain subsets of slits. Moreover, blocking all but two subsets of slits should not affect states which can pass through either slit. This intuition suggests that these transformations should correspond to projectors.
Many operational probabilistic theories do not possess such a natural mathematical interpretation of multi-slit experiments; indeed many theories do not admit well-defined projectors [52]. Here, we show that there exist natural information-theoretic principles that both imply the existence of the projector structure, and rule out third- and higher-order interference. The principles that ensure this structure are Causality, Purity Preservation, Pure Sharpness, and Purification. These formalise intuitive ideas about the fundamental role of purity in nature. More formally, we show that such theories possess a self-dualising inner product, and that there exist pure projectors which represent the opening and closing of slits in a multi-slit experiment. Barnum, Müller and Ududec have shown that in any self-dual theory in which such projectors exist for every face, if projectors map pure states to pure states, then there can be at most second-order interference [8] (Proposition 29). The conjunction of our new results and the principle of Purity Preservation implies the conditions of Barnum et al.'s proposition. Hence sharp theories with purification do not exhibit higher-order interference. In fact we prove a stronger result: the systems in such theories are Euclidean Jordan algebras, which have been studied in quantum foundations [73,8,7].
This paper is organised as follows. In Section 2 we review the basics of the operational probabilistic theory framework. In Section 3 we formally define higher-order interference. In Section 4 we define sharp theories with purification and review relevant known results. In Section 5 we present and prove our new results. Finally, in Section 6, we offer some suggestions on how new experiments might be devised to observe higher-order interference.

Framework
We will describe theories in the framework of operational-probabilistic theories (OPTs) [14,15,39,40,41,13,16], arising from the marriage of category theory [1,25,26,65,28,29] with probabilities. The foundation of this framework is the idea that any successful physical theory must provide an account of experimental data. Hence, such theories should have an operational description in terms of such experiments.
The OPT framework is based on the graphical language of circuits, describing experiments that can be performed in a laboratory with physical systems connecting together physical processes, which are denoted as wires and boxes respectively. The systems/wires are labelled with a type denoted A, B, C, . . . . For example, the type given to a quantum system is the dimension of the Hilbert space describing the system. The processes/boxes are then viewed as transformations with some input and output systems/wires. For instance, in quantum theory these correspond to quantum instruments. We now give a brief introduction to the important concepts in this formalism.

States, transformations, and effects
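A fundamental tenet of the OPT framework is composition of systems and physical processes. Given two systems A and B, they can be combined into a composite system, denoted by A ⊗ B. Physical processes can be composed to build circuits, such as
[Circuit diagram, Equation (2.1): a closed circuit built from the state $\rho$, the transformations $\mathcal{A}$, $\mathcal{A}'$, $\mathcal{B}$, and the effects $a$, $b$.]
Processes with no inputs (such as $\rho$ in the above diagram) are called states, those with no outputs (such as $a$ and $b$) are called effects, and those with both inputs and outputs (such as $\mathcal{A}$, $\mathcal{A}'$, $\mathcal{B}$) are called transformations. We define: $\mathsf{St}(\mathrm{A})$ as the set of states of system A, $\mathsf{Eff}(\mathrm{A})$ as the set of effects on A, $\mathsf{Transf}(\mathrm{A},\mathrm{B})$ as the set of transformations from A to B, and $\mathsf{Transf}(\mathrm{A})$ as the set of transformations from A to A. OPTs include a particular system, the trivial system I, representing the lack of input or output for a particular device.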
Hence, states (resp. effects) are transformations with the trivial system as input (resp. output). Circuits with no external wires, like the circuit in Equation (2.1), are called scalars and are associated with probabilities. We will often use the notation $(a|\rho)$ to denote the scalar circuit in which the effect $a$ is applied to the state $\rho$ of system A, and the notation $(a|\mathcal{C}|\rho)$ to denote the scalar circuit in which the transformation $\mathcal{C}$ acts on $\rho$ and is followed by the effect $a$. The fact that scalars are probabilities, and so are real numbers, induces a notion of a sum of transformations, so that the sets $\mathsf{St}(\mathrm{A})$, $\mathsf{Transf}(\mathrm{A},\mathrm{B})$, and $\mathsf{Eff}(\mathrm{A})$ become spanning sets of real vector spaces, denoted by $\mathsf{St}_{\mathbb{R}}(\mathrm{A})$, $\mathsf{Transf}_{\mathbb{R}}(\mathrm{A},\mathrm{B})$, and $\mathsf{Eff}_{\mathbb{R}}(\mathrm{A})$. In this work we will restrict our attention to finite systems, i.e., systems for which the vector space spanned by states is finite-dimensional. Operationally this assumption means that one need not perform an infinite number of distinct experiments to fully characterise a state. Restricting the coefficients to non-negative real numbers, we obtain the convex cones of states and of effects, denoted by $\mathsf{St}_{+}(\mathrm{A})$ and $\mathsf{Eff}_{+}(\mathrm{A})$ respectively. We moreover make the assumption that the set of states is closed. Operationally this is justified by the fact that up to any experimental error a state space is indistinguishable from its closure.
The composition of states and effects leads naturally to a norm. This is defined, for states $\rho$, as $\|\rho\| := \sup_{a\in\mathsf{Eff}(\mathrm{A})}(a|\rho)$, and similarly for effects $a$ as $\|a\| := \sup_{\rho\in\mathsf{St}(\mathrm{A})}(a|\rho)$. The set of normalised states (resp. effects) of system A is denoted by $\mathsf{St}_1(\mathrm{A})$ (resp. $\mathsf{Eff}_1(\mathrm{A})$).
Transformations are characterised by their action on states of composite systems: if $\mathcal{A}, \mathcal{A}' \in \mathsf{Transf}(\mathrm{A},\mathrm{B})$, then $\mathcal{A} = \mathcal{A}'$ if and only if $(\mathcal{A}\otimes\mathcal{I}_{\mathrm{S}})\rho = (\mathcal{A}'\otimes\mathcal{I}_{\mathrm{S}})\rho$ (2.2) for every system S and every state $\rho \in \mathsf{St}(\mathrm{A}\otimes\mathrm{S})$. It follows, however, that effects (resp. states) are completely defined by their action on states (resp. effects) of a single system [14]. Equality on states of the single system A is, in general, not enough to discriminate between $\mathcal{A}$ and $\mathcal{A}'$, as is the case for quantum theory over real Hilbert spaces [75]. However, for the scope of the present article, which focuses on single-system properties, we often concern ourselves with equality on single systems.

Tests and channels
In general, the boxes corresponding to physical processes come equipped with classical pointers. When used in an experiment, the final position of a given pointer indicates the particular process which occurred for that box in that run. In general, this procedure can be non-deterministic. These non-deterministic processes are described by tests [14,16]: a test from A to B is a collection of transformations $\{\mathcal{C}_i\}_{i\in X}$ from A to B, where $X$ is the set of outcomes. If A (resp. B) is the trivial system, the test is called a preparation-test (resp. observation-test). If the set of outcomes $X$ has a single element, we say that the test is deterministic, because only one transformation can occur. Deterministic transformations will be called channels. A channel $\mathcal{U}$ from A to B is reversible if there exists another channel $\mathcal{U}^{-1}$ from B to A such that $\mathcal{U}^{-1}\mathcal{U} = \mathcal{I}_{\mathrm{A}}$ and $\mathcal{U}\mathcal{U}^{-1} = \mathcal{I}_{\mathrm{B}}$, where $\mathcal{I}_{\mathrm{S}}$ is the identity transformation on system S. If there exists a reversible channel transforming A into B, we say that A and B are operationally equivalent, denoted as A ≃ B. The composition of systems is required to be symmetric, meaning that A ⊗ B ≃ B ⊗ A. Physically, this means that for every pair of systems there exists a reversible channel swapping them. A state $\chi$ is called invariant if $\mathcal{U}\chi = \chi$ for all reversible channels $\mathcal{U}$.
A particularly useful class of observation-tests allows for the following.

Pure transformations
There are various different ways to define pure transformations, for example in terms of resources [42,37,18,20,21] or "side information" [16,64]. Informally pure transformations correspond to an experimenter having maximal control of or information about a process.
Here, we formalise this notion by defining the notion of a coarse-graining [14]. Coarse-graining is the operation of joining two or more outcomes of a test into a single outcome. More precisely, a test $\{\mathcal{C}_i\}_{i\in X}$ is a coarse-graining of the test $\{\mathcal{D}_j\}_{j\in Y}$ if there is a partition $\{Y_i\}_{i\in X}$ of $Y$ such that $\mathcal{C}_i = \sum_{j\in Y_i}\mathcal{D}_j$ for every $i \in X$. In this case, we say that the test $\{\mathcal{D}_j\}_{j\in Y}$ is a refinement of the test $\{\mathcal{C}_i\}_{i\in X}$, and that the transformations $\{\mathcal{D}_j\}_{j\in Y_i}$ are a refinement of the transformation $\mathcal{C}_i$. A transformation $\mathcal{C} \in \mathsf{Transf}(\mathrm{A},\mathrm{B})$ is pure if it has only trivial refinements, namely refinements $\{\mathcal{D}_j\}$ of the form $\mathcal{D}_j = p_j\mathcal{C}$, where $\{p_j\}$ is a probability distribution. We denote the sets of pure transformations, pure states, and pure effects as $\mathsf{PurTransf}(\mathrm{A},\mathrm{B})$, $\mathsf{PurSt}(\mathrm{A})$, and $\mathsf{PurEff}(\mathrm{A})$ respectively. Similarly, $\mathsf{PurSt}_1(\mathrm{A})$ and $\mathsf{PurEff}_1(\mathrm{A})$ denote normalised pure states and effects respectively. Non-pure states are called mixed.
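As an illustration of coarse-graining, the following minimal sketch works in classical probability theory, which is an assumption made purely for concreteness: effects on a three-level classical system are represented as tuples of response probabilities, and the helper names `coarse_grain` and `prob` are hypothetical. It joins outcomes of a fine-grained observation-test according to a partition and checks that the coarse-grained probabilities are sums of the fine-grained ones.

```python
from fractions import Fraction

def coarse_grain(test, partition):
    """Sum the effects D_j over each block Y_i of the partition (C_i = sum_{j in Y_i} D_j)."""
    dim = len(next(iter(test.values())))
    coarse = {}
    for label, block in partition.items():
        total = [Fraction(0)] * dim
        for j in block:
            total = [t + x for t, x in zip(total, test[j])]
        coarse[label] = tuple(total)
    return coarse

def prob(effect, state):
    """Pairing (a|rho) between a classical effect and a classical state."""
    return sum(e * s for e, s in zip(effect, state))

# Fine-grained observation-test on a classical three-level system (illustrative data).
fine = {
    "j1": (Fraction(1), Fraction(0), Fraction(0)),
    "j2": (Fraction(0), Fraction(1), Fraction(0)),
    "j3": (Fraction(0), Fraction(0), Fraction(1)),
}
# Partition of the outcome set Y into blocks labelled by the coarse outcomes.
partition = {"low": ["j1", "j2"], "high": ["j3"]}
coarse = coarse_grain(fine, partition)

state = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))
p_low = prob(coarse["low"], state)
p_high = prob(coarse["high"], state)
```

In this toy setting the effect labelled "low" is not pure, since it admits the non-trivial refinement into "j1" and "j2", whereas each fine-grained effect has only trivial refinements.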
Clearly, no states are contained in a pure state. At the other end of the spectrum we have complete states.
Definition 2.5. We say that two transformations A, A ′ ∈ Transf (A, B) are equal upon input of the state ρ ∈ St 1 (A) if Aσ = A ′ σ for every state σ contained in ρ. In this case we will write A = ρ A ′ .

Causality
A natural requirement of a physical theory is that it is causal, that is, no signals can be sent from the future to the past. In the OPT framework this is formalised as follows: Axiom 2.6 (Causality [14,16]). The probability that a transformation occurs is independent of the choice of tests performed on its output.
Causality is equivalent to the requirement that, for every system A, there exists a unique deterministic effect $u_{\mathrm{A}}$ on A (or simply $u$, when no ambiguity can arise) [14]. Owing to the uniqueness of the deterministic effect, the marginals of a bipartite state can be uniquely defined as $\rho_{\mathrm{A}} := (\mathcal{I}_{\mathrm{A}}\otimes u_{\mathrm{B}})\rho_{\mathrm{AB}}$, that is, by applying the deterministic effect to system B. Moreover, this uniqueness forbids the ability to signal [14,27]. We will denote by $\mathrm{Tr}_{\mathrm{B}}\,\rho_{\mathrm{AB}}$ the marginal on system A, in analogy with the notation used in the quantum case. We will stick to the notation Tr in formulas where the deterministic effect is applied directly to a state, e.g., $\mathrm{Tr}\,\rho := (u|\rho)$.
In a causal theory it is easy to see that the norm of a state takes the form $\|\rho\| = \mathrm{Tr}\,\rho$, and that a state can be prepared deterministically if and only if it is normalised.
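The role of the deterministic effect can be illustrated with a small sketch in classical probability theory (an assumption chosen for simplicity; the function names `marginal_on_a` and `apply_channel_on_b` and all numerical values are hypothetical). A bipartite state is a joint probability table, the deterministic effect on B sums over the values of B, and the marginal on A is unchanged by any channel applied to B, which is the no-signalling property mentioned above.

```python
from fractions import Fraction

def marginal_on_a(joint):
    """Apply the deterministic effect u_B (sum over B) to a joint table joint[a][b]."""
    return [sum(row) for row in joint]

def apply_channel_on_b(joint, channel):
    """Apply a classical channel to B; channel[k][j] is the probability of sending j to k."""
    return [
        [sum(channel[k][j] * row[j] for j in range(len(row))) for k in range(len(channel))]
        for row in joint
    ]

def trace(state):
    """Tr rho := (u|rho), the total probability of a classical state."""
    return sum(state)

# A correlated bipartite state of two classical bits (illustrative numbers).
joint = [
    [Fraction(1, 4), Fraction(1, 4)],
    [Fraction(0), Fraction(1, 2)],
]
# A noisy channel on B; each column sums to one, so it is deterministic.
noisy = [
    [Fraction(9, 10), Fraction(1, 5)],
    [Fraction(1, 10), Fraction(4, 5)],
]

rho_a = marginal_on_a(joint)
rho_a_after = marginal_on_a(apply_channel_on_b(joint, noisy))
```

Here `rho_a` is normalised, in line with the statement that the norm of a state in a causal theory is its trace.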

Higher-order interference
The definition of higher-order interference we shall present in this section takes its motivation from the set-up of multi-slit interference experiments. In such experiments a particle passes through slits in a physical barrier and is detected at a screen. By repeating the experiment many times, one builds up a pattern on the screen. To determine if this experiment exhibits interference one compares this pattern to those produced when certain subsets of the slits are blocked. In quantum theory, for example, the two-slit experiment exhibits interference as the pattern formed with both slits open is not equal to the sum of the one-slit patterns.
Consider the state of the particle just before it passes through the slits. For every slit, there should exist states such that the particle is definitely found at that slit, if measured. Mathematically, this means that there is a face [8] of the state space, such that all states in this face give unit probability for the "yes" outcome of the two-outcome measurement "is the particle at this slit?". Recall that a face is a convex set with the property that if $px + (1-p)y$, for $0 < p < 1$, is an element then $x$ and $y$ are also elements. These faces will be labelled $F_i$, one for each of the $n$ slits $i \in \{1,\dots,n\}$. As the slits should be perfectly distinguishable, the faces associated with each slit should be perfectly distinguishable, or orthogonal. One can additionally ask coarse-grained questions of the form "Is the particle found among a certain subset of slits, rather than somewhere else?". The set of states that give outcome "yes" with probability one must contain all the faces associated with each slit in the subset. Hence the face associated with the subset of slits $I \subseteq \{1,\dots,n\}$ is the smallest face containing each face in this subset, $F_I := \bigvee_{i\in I} F_i$, where the operation $\bigvee$ is the least upper bound of the lattice of faces, in which the ordering is provided by subset inclusion of one face within another. The face $F_I$ contains all those states which can be found among the slits contained in $I$. The experiment is "complete" if all states in the state space (of a given system A) can be found among some subset of slits, that is, if $F_{12\cdots n} = \mathsf{St}(\mathrm{A})$.
An $n$-slit experiment requires a system that has $n$ orthogonal faces $F_i$, with $i \in \{1,\dots,n\}$. Consider an effect $E$ associated with finding a particle at a particular point on the screen. We now formally define an $n$-slit experiment: it is a collection of effects $\{e_I\}$, with $I \subseteq \{1,\dots,n\}$, such that $(e_I|\rho) = (E|\rho)$, $\forall\rho \in F_I$, and $(e_I|\rho) = 0$, $\forall\rho$ where $\rho \perp F_I$. The effects introduced in the above definition arise from the conjunction of blocking off the slits $\{1,\dots,n\}\setminus I$ and applying the effect $E$. If the particle was prepared in a state such that it would be unaffected by the blocking of the slits (i.e., $\rho \in F_I$) then we should have $(e_I|\rho) = (E|\rho)$. If instead the particle is prepared in a state which is guaranteed to be blocked (i.e., $\rho' \perp F_I$) then the particle should have no probability of being detected at the screen, i.e., $(e_I|\rho') = 0$.
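The relevant quantities for the existence of various orders of interference are [6,73,69,52], for a given state $\rho$: $I_n := \sum_{\emptyset\neq I\subseteq\{1,\dots,n\}} (-1)^{n-|I|}\,(e_I|\rho)$. For instance, $I_2 = (e_{12}|\rho) - (e_1|\rho) - (e_2|\rho)$ and $I_3 = (e_{123}|\rho) - (e_{12}|\rho) - (e_{13}|\rho) - (e_{23}|\rho) + (e_1|\rho) + (e_2|\rho) + (e_3|\rho)$. In a slightly different formal setting, it was shown in [69] that $I_n = 0 \implies I_{n+1} = 0$, so if there is no $n$th order interference, there will be no $(n+1)$th order interference; the argument of [69] applies here.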
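The following sketch illustrates the hierarchy in ordinary quantum theory. It rests on several assumptions that are not part of the text above: the standard Sorkin alternating sum $I_n = \sum_{\emptyset\neq I\subseteq\{1,\dots,n\}}(-1)^{n-|I|}(e_I|\rho)$ is used for the interference quantities, slits are modelled as orthogonal basis vectors of a three-level system, blocking slits is modelled by the projector onto the open ones, and the names `pattern` and `interference` as well as the particular vectors are hypothetical. Under these assumptions the second-order term is non-zero while the third-order term vanishes.

```python
from itertools import combinations

def pattern(psi, phi, open_slits):
    """Detection probability (e_I|rho) = |<phi| P_I |psi>|^2 with only the slits in I open."""
    amplitude = sum(phi[i].conjugate() * psi[i] for i in open_slits)
    return abs(amplitude) ** 2

def interference(psi, phi, slits):
    """Alternating sum over all non-empty subsets of the given slits."""
    n = len(slits)
    total = 0.0
    for size in range(1, n + 1):
        sign = (-1) ** (n - size)
        for subset in combinations(slits, size):
            total += sign * pattern(psi, phi, subset)
    return total

s = 3 ** -0.5
# Normalised pure state of the particle just before the slits (illustrative choice).
psi = [complex(s), complex(s), complex(0, -s)]
# Vector defining the effect E for one point on the screen (illustrative choice).
phi = [complex(s), complex(s), complex(s)]

i2 = interference(psi, phi, (0, 1))      # second-order term for slits 1 and 2
i3 = interference(psi, phi, (0, 1, 2))   # third-order term for all three slits
```

The vanishing of `i3` reflects the fact that the squared modulus of a sum of amplitudes contains only pairwise cross terms.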
It should be noted that there appears to be a lot of freedom in choosing a set of effects $\{e_I\}$ to test for the existence of higher-order interference. Indeed, in arbitrary generalised theories this appears to be the case [52]. However, it is natural to ask whether there exist physical transformations $T_I$ in the theory which correspond to leaving the subset of slits $I$ open and blocking the rest. In that case a unique $e_I$ is assigned to each fixed $E$, defined as $e_I = E\,T_I$. Ruling out the existence of higher-order interference then reduces to proving certain properties of the $T_I$. This will turn out to be the case in sharp theories with purification.

Sharp theories with purification
In this section we present the definition and important properties of sharp theories with purification. They were originally introduced in [19,20,21] for the analysis of the foundations of thermodynamics and statistical mechanics.
Sharp theories with purification are causal theories defined by three axioms. The first axiom, Purity Preservation, states that no information can leak when two pure transformations are composed:
Axiom 4.1 (Purity Preservation [17]). Sequential and parallel compositions of pure transformations yield pure transformations.
The second axiom, Pure Sharpness, guarantees that every system possesses at least one elementary property.
Axiom 4.2 (Pure Sharpness [19]). For every system there exists at least one pure effect occurring with unit probability on some state.
These axioms are satisfied by both classical and quantum theory. Our third axiom, Purification, signals the departure from classicality, and characterises when a physical theory admits a level of description where all deterministic processes are pure and reversible.
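Given a normalised state $\rho_{\mathrm{A}} \in \mathsf{St}_1(\mathrm{A})$, a normalised pure state $\Psi \in \mathsf{PurSt}_1(\mathrm{A}\otimes\mathrm{B})$ is a purification of $\rho_{\mathrm{A}}$ if $\mathrm{Tr}_{\mathrm{B}}\,\Psi = \rho_{\mathrm{A}}$; in this case B is called the purifying system. We say that a pure state $\Psi \in \mathsf{PurSt}(\mathrm{A}\otimes\mathrm{B})$ is an essentially unique purification of its marginal $\rho_{\mathrm{A}}$ [16] if every other pure state $\Psi' \in \mathsf{PurSt}(\mathrm{A}\otimes\mathrm{B})$ satisfying the purification condition must be of the form $\Psi' = (\mathcal{I}_{\mathrm{A}}\otimes\mathcal{U})\Psi$ for some reversible channel $\mathcal{U}$ on B. Quantum theory, both on complex and real Hilbert spaces, satisfies Purification, as does Spekkens' toy model [35]. Examples of sharp theories with purification besides quantum theory include fermionic quantum theory [33,34], a superselected version of quantum theory known as doubled quantum theory [21], and a recent extension of classical theory with the theory of codits [20].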

Properties of sharp theories with purification
Sharp theories with purification enjoy some nice properties, which were mainly derived in Refs. [19,20]. The first property is that every non-trivial system admits perfectly distinguishable states [19], and that all maximal sets of pure states have the same cardinality [20], which we denote by $d_{\mathrm{A}}$ for system A. Note that we will omit the subscript A when the context is clear. In sharp theories with purification every state can be diagonalised, i.e., written as a convex combination of perfectly distinguishable pure states (cf. Refs. [19,20]).
Theorem 4.5. Every normalised state $\rho \in \mathsf{St}_1(\mathrm{A})$ of a non-trivial system can be decomposed as $\rho = \sum_{i=1}^{d} p_i\alpha_i$, where $\{p_i\}_{i=1}^{d}$ is a probability distribution and $\{\alpha_i\}_{i=1}^{d}$ is a pure maximal set. Moreover, given $\rho$, the collection of the $p_i$'s is unique up to rearrangements.
Such a decomposition is called a diagonalisation of $\rho$, the $p_i$'s are the eigenvalues of $\rho$, and the $\alpha_i$'s are the eigenstates. Theorem 4.5 implies that the eigenvalues of a state are unique, and independent of its diagonalisation. Sharp theories with purification have a unique invariant state $\chi$ [14], which can be diagonalised as $\chi = \frac{1}{d}\sum_{i=1}^{d}\alpha_i$, where $\{\alpha_i\}_{i=1}^{d}$ is any pure maximal set [20]. Furthermore, the diagonalisation result of Theorem 4.5 can be extended to every vector in $\mathsf{St}_{\mathbb{R}}(\mathrm{A})$, but here the eigenvalues will generally be arbitrary real numbers [20].
One of the most important consequences for this paper of the axioms defining sharp theories with purification is a duality between normalised pure states and normalised pure effects.
Theorem 4.6 (States-effects duality [19,20]). For every system A, there is a bijective correspondence † : PurSt 1 (A) → PurEff 1 (A) such that if α ∈ PurSt 1 (A), α † is the unique normalised pure effect such that α † α = 1. Furthermore this bijection can be extended by linearity to an isomorphism between the vector spaces St R (A) and Eff R (A).
With a little abuse of notation we will use † also to denote the inverse map. A diagonalisation result holds for vectors of $\mathsf{Eff}_{\mathbb{R}}(\mathrm{A})$ as well [20]: they can be written as $X = \sum_{i=1}^{d}\lambda_i\alpha_i^{\dagger}$, where $\{\alpha_i\}_{i=1}^{d}$ is a pure maximal set. Again, the $\lambda_i$'s are uniquely defined given $X$.
Another result that we will use in the following sections expresses the possibility of constructing non-disturbing measurements [15,62,24]. It was shown to hold in Ref. [20].
Proposition 4.7. Given a system A, let $a \in \mathsf{Eff}(\mathrm{A})$ be an effect such that $(a|\rho) = 1$, for some $\rho \in \mathsf{St}_1(\mathrm{A})$. Then there exists a pure transformation $\mathcal{T} \in \mathsf{PurTransf}(\mathrm{A})$ such that $\mathcal{T} =_{\rho} \mathcal{I}$, with $(u|\mathcal{T}|\sigma) \le (a|\sigma)$, for every state $\sigma \in \mathsf{St}_1(\mathrm{A})$.
Note that the pure transformation T is non-disturbing on ρ because it acts as the identity on ρ and on all states contained in it. In other words, whenever we have an effect occurring with unit probability on some state ρ, we can always find a transformation that does not disturb ρ (i.e., a non-disturbing, non-demolition measurement) [20].
Finally, a property that we will use often is a sort of no-restriction hypothesis for tests, derived in [15] (Corollary 4).

Sharp theories with purification have no higher-order interference
Here we will show that sharp theories with purification do not exhibit higher-order interference. Our proof strategy will be to show that results of [8], which rule out the existence of higher-order interference from certain assumptions, hold in sharp theories with purification.
To this end, we will first prove that these theories are self-dual, and that they admit pure orthogonal projectors which satisfy certain properties, compatible with the setting presented in Section 3.

Self-duality
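Now we will prove that sharp theories with purification are self-dual. Recall that a theory is self-dual if, for every system A, there is an inner product $\langle\bullet,\bullet\rangle$ on $\mathsf{St}_{\mathbb{R}}(\mathrm{A})$ such that $\xi \in \mathsf{St}_{+}(\mathrm{A})$ if and only if $\langle\xi,\eta\rangle \ge 0$ for every $\eta \in \mathsf{St}_{+}(\mathrm{A})$. To show that, we need to find a self-dualising inner product on $\mathsf{St}_{\mathbb{R}}(\mathrm{A})$ for every system A. The dagger will provide us with a good candidate. First we need the following lemma.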
Lemma 5.1. Let $a \in \mathsf{Eff}_1(\mathrm{A})$ be a normalised effect. Then $a$ can be diagonalised as $a = \sum_{i\in I}\alpha_i^{\dagger} + \sum_{j\in J}\lambda_j\alpha_j^{\dagger}$, where $I$ is a non-empty subset of $\{1,\dots,d\}$, $J$ is a (possibly empty) subset of the complement of $I$, and $\lambda_j \in (0,1)$ for every $j \in J$.
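Proof. We know that every effect $a$ can be written as $a = \sum_{i=1}^{r}\lambda_i\alpha_i^{\dagger}$, where the pure states $\{\alpha_i\}_{i=1}^{r}$ are perfectly distinguishable, and for every $i \in \{1,\dots,r\}$, $\lambda_i \in (0,1]$. Since the state space is closed, and $a$ is normalised, there exists a (normalised) state $\rho$ such that $(a|\rho) = 1$. One has $1 = (a|\rho) = \sum_{i=1}^{r}\lambda_i(\alpha_i^{\dagger}|\rho) \le \lambda_{\max}\sum_{i=1}^{r}(\alpha_i^{\dagger}|\rho) \le \lambda_{\max}$, where $\lambda_{\max}$ is the maximum of the $\lambda_i$'s and the last step uses $\sum_{i=1}^{r}(\alpha_i^{\dagger}|\rho) \le (u|\rho) = 1$. Therefore, $\lambda_{\max} \ge 1$, which implies $\lambda_{\max} = 1$. Taking $I$ to be the set of indices with $\lambda_i = 1$ and $J$ the set of the remaining indices yields the stated form. This means that there always exists at least one eigenstate with eigenvalue 1, but in general there may be some eigenstates with eigenvalues strictly less than 1.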
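We can use this result to prove the following.
Lemma 5.2. For every system A, the map $\langle\xi,\eta\rangle := (\xi^{\dagger}|\eta)$, with $\xi,\eta \in \mathsf{St}_{\mathbb{R}}(\mathrm{A})$, is an inner product on $\mathsf{St}_{\mathbb{R}}(\mathrm{A})$.
Proof. The map $\langle\bullet,\bullet\rangle$ is clearly bilinear by construction, because the dagger is also linear. Let us show that it is positive-definite. Take a non-null vector $\xi \in \mathsf{St}_{\mathbb{R}}(\mathrm{A})$, and diagonalise it as $\xi = \sum_{i=1}^{d}x_i\alpha_i$, where the $x_i$'s are real numbers, not all zero. Then $\langle\xi,\xi\rangle = \sum_{i,j=1}^{d}x_ix_j(\alpha_i^{\dagger}|\alpha_j) = \sum_{i=1}^{d}x_i^2 > 0$, where we have used the fact that for perfectly distinguishable pure states $(\alpha_i^{\dagger}|\alpha_j) = \delta_{ij}$ [20].
The hard part is to prove that this bilinear map is symmetric, namely $\langle\xi,\eta\rangle = \langle\eta,\xi\rangle$, for every $\xi,\eta \in \mathsf{St}_{\mathbb{R}}(\mathrm{A})$. Let us define a new (double) dagger ‡. The double dagger of a normalised state $\rho$ is an effect $\rho^{\ddagger}$ whose action on normalised states $\sigma$ is defined as $(\rho^{\ddagger}|\sigma) := (\sigma^{\dagger}|\rho)$, (5.1) where † is the dagger of Theorem 4.6. Note that Equation (5.1) is enough to characterise $\rho^{\ddagger}$ completely, and it guarantees that $\rho^{\ddagger}$ is a mathematically well-defined effect, because it is linear and $(\sigma^{\dagger}|\rho) \in [0,1]$. Consider now $\rho$ and $\sigma$ to be a normalised pure state $\psi$. Then $(\psi^{\ddagger}|\psi) = (\psi^{\dagger}|\psi) = 1$, so $\psi^{\ddagger}$ is normalised and, by Lemma 5.1, it can be diagonalised as $\psi^{\ddagger} = \sum_{i\in I}\alpha_i^{\dagger} + \sum_{j\in J}\lambda_j\alpha_j^{\dagger}$, where the pure states $\{\alpha_i\}_{i\in I}\cup\{\alpha_j\}_{j\in J}$ are perfectly distinguishable. Note that $\psi^{\ddagger}$ is pure if and only if $|I| = 1$ and $J = \emptyset$. Let us evaluate $\psi^{\ddagger}$ on $\chi$: $(\psi^{\ddagger}|\chi) = \frac{1}{d}\bigl(|I| + \sum_{j\in J}\lambda_j\bigr)$. (5.2) On the other hand, by Equation (5.1), $(\psi^{\ddagger}|\chi) = (\chi^{\dagger}|\psi) = \frac{1}{d}(u|\psi) = \frac{1}{d}$, (5.3) where we have used $\chi^{\dagger} = u/d$ [20]. Since $|I| \ge 1$ and $\sum_{j\in J}\lambda_j \ge 0$, a comparison between Equations (5.2) and (5.3) shows that it must be $|I| = 1$ and $J = \emptyset$. This means that $\psi^{\ddagger}$ is a (physical) pure effect, whence $\psi^{\ddagger} = \psi^{\dagger}$ by Theorem 4.6.
Now we can show that the double dagger ‡ actually coincides with the dagger of Theorem 4.6. Indeed, given a state $\rho$, diagonalised as $\rho = \sum_{i}p_i\alpha_i$, for every normalised state $\sigma$ we have $(\rho^{\ddagger}|\sigma) = (\sigma^{\dagger}|\rho) = \sum_{i}p_i(\sigma^{\dagger}|\alpha_i) = \sum_{i}p_i(\alpha_i^{\ddagger}|\sigma) = \sum_{i}p_i(\alpha_i^{\dagger}|\sigma) = (\rho^{\dagger}|\sigma)$. This means that ‡ = †, and that Equation (5.1) is nothing but a redefinition of the usual dagger. Hence for all normalised states $\rho$ and $\sigma$ we have $(\rho^{\dagger}|\sigma) = (\sigma^{\dagger}|\rho)$, (5.4) and this extends linearly to all vectors $\xi,\eta \in \mathsf{St}_{\mathbb{R}}(\mathrm{A})$. We have proved that $\langle\bullet,\bullet\rangle$ is symmetric, and this concludes the proof.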
Note that the above result immediately yields the "symmetry of transition probabilities" as defined in Refs. [3,5]. Now we prove that this inner product is invariant under reversible transformations.
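Proposition 5.3. For every $\xi,\eta \in \mathsf{St}_{\mathbb{R}}(\mathrm{A})$ and every reversible channel $\mathcal{U}$ one has $\langle\mathcal{U}\xi,\mathcal{U}\eta\rangle = \langle\xi,\eta\rangle$.
Proof. To prove the statement, let us first prove that for a normalised pure state $\alpha$ one has $(\mathcal{U}\alpha)^{\dagger} = \alpha^{\dagger}\mathcal{U}^{-1}$, for every reversible channel $\mathcal{U}$. $\alpha^{\dagger}\mathcal{U}^{-1}$ is a pure effect and one has $(\alpha^{\dagger}\mathcal{U}^{-1}|\mathcal{U}\alpha) = (\alpha^{\dagger}|\alpha) = 1$. By the uniqueness of the dagger for normalised pure states, $(\mathcal{U}\alpha)^{\dagger} = \alpha^{\dagger}\mathcal{U}^{-1}$. Extending this by linearity gives $(\mathcal{U}\xi)^{\dagger} = \xi^{\dagger}\mathcal{U}^{-1}$ for every $\xi \in \mathsf{St}_{\mathbb{R}}(\mathrm{A})$, whence $\langle\mathcal{U}\xi,\mathcal{U}\eta\rangle = (\xi^{\dagger}\mathcal{U}^{-1}|\mathcal{U}\eta) = (\xi^{\dagger}|\eta) = \langle\xi,\eta\rangle$.
The fact that $\langle\bullet,\bullet\rangle$ is an inner product allows us to define an additional norm in sharp theories with purification: if $\xi \in \mathsf{St}_{\mathbb{R}}(\mathrm{A})$, define the dagger norm as $\|\xi\|_{\dagger} := \sqrt{\langle\xi,\xi\rangle}$. See Appendix A.1 for an extended discussion on the properties of this norm.
Now we are ready to state the core of this subsection: sharp theories with purification are self-dual, that is, $\xi \in \mathsf{St}_{+}(\mathrm{A})$ if and only if $\langle\xi,\eta\rangle \ge 0$ for every $\eta \in \mathsf{St}_{+}(\mathrm{A})$.
Proof. Necessity. Take $\xi,\eta \in \mathsf{St}_{+}(\mathrm{A})$, diagonalised as $\xi = \sum_{i=1}^{d}x_i\alpha_i$ and $\eta = \sum_{j=1}^{d}y_j\beta_j$, with $x_i, y_j \ge 0$. Then $\langle\xi,\eta\rangle = \sum_{i,j=1}^{d}x_iy_j(\alpha_i^{\dagger}|\beta_j) \ge 0$ because all the terms $x_i$, $y_j$, and $(\alpha_i^{\dagger}|\beta_j)$ are non-negative.
Sufficiency. Take $\xi \in \mathsf{St}_{\mathbb{R}}(\mathrm{A})$, and assume that $\langle\xi,\eta\rangle \ge 0$ for all $\eta \in \mathsf{St}_{+}(\mathrm{A})$. Assume $\xi$ is diagonalised as $\xi = \sum_{i=1}^{d}x_i\alpha_i$, where the $x_i$'s are generic real numbers. We wish to prove that all the $x_i$'s are non-negative. Then $\langle\xi,\eta\rangle = \sum_{i=1}^{d}x_i(\alpha_i^{\dagger}|\eta) \ge 0$. Recalling that for perfectly distinguishable pure states one has $(\alpha_i^{\dagger}|\alpha_j) = \delta_{ij}$ [20], it is enough to take $\eta$ to be one of the states $\{\alpha_i\}_{i=1}^{d}$ to conclude that $x_i \ge 0$ for every $i \in \{1,\dots,d\}$, meaning that $\xi \in \mathsf{St}_{+}(\mathrm{A})$.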
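The properties of the inner product can be checked numerically in the simplest quantum example. The sketch below assumes, as an illustration only, that for a qubit the dagger inner product coincides with the Hilbert-Schmidt product $\mathrm{Tr}(\rho\sigma)$ and that reversible channels act by unitary conjugation; the helper names (`pure_state`, `inner`, `rotate`) and the chosen angles are hypothetical. It verifies symmetry, normalisation on pure states, non-negativity on states, and invariance under a reversible channel.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def pure_state(theta, phi):
    """Density matrix of the qubit pure state with Bloch angles (theta, phi)."""
    v = [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def inner(rho, sigma):
    """Assumed quantum form of the inner product: <rho, sigma> = Tr(rho sigma)."""
    prod = matmul(rho, sigma)
    return (prod[0][0] + prod[1][1]).real

def rotate(angle, rho):
    """Reversible channel rho -> U rho U^dagger for a real rotation U."""
    c, s = math.cos(angle), math.sin(angle)
    u = [[c, -s], [s, c]]
    return matmul(matmul(u, rho), adjoint(u))

alpha = pure_state(0.7, 0.3)
beta = pure_state(2.1, -1.2)

symmetric_gap = abs(inner(alpha, beta) - inner(beta, alpha))
invariance_gap = abs(inner(rotate(0.4, alpha), rotate(0.4, beta)) - inner(alpha, beta))
orthogonal_overlap = inner(pure_state(0.0, 0.0), pure_state(math.pi, 0.0))
```

In this example `orthogonal_overlap` is numerically zero, matching the statement that perfectly distinguishable pure states satisfy $(\alpha_i^{\dagger}|\alpha_j) = \delta_{ij}$.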
The self-dualising inner product, besides being a nice mathematical tool, has some operational meaning, because it provides a measure of the distinguishability of states, as explained in Appendix A.2. Moreover, it is the starting point for extending the dagger to all transformations. This is done in Appendix B.

Existence of pure orthogonal projectors
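Now we show that we have orthogonal projectors on every face of the state space. A consequence of diagonalisation is that all faces are generated by perfectly distinguishable pure states. Indeed, every face $F$ is generated by a state $\omega$ in its relative interior. $\omega$ can be diagonalised as $\omega = \sum_{i=1}^{r}p_i\alpha_i$, where $r \le d$, and $p_i > 0$ for $i \in \{1,\dots,r\}$. By definition of face, this means that the states $\{\alpha_i\}_{i=1}^{r}$ are in $F$, and therefore generate $F$. Consequently, there is an effect $a$ that picks out the whole face as the set of states $\rho$ such that $(a|\rho) = 1$. In the specific case considered above, it is $a = \sum_{i=1}^{r}\alpha_i^{\dagger}$. Such faces are called exposed. Therefore the study of faces of sharp theories with purification reduces to the study of normalised effects of the form $a_I := \sum_{i\in I}\alpha_i^{\dagger}$, where $I \subseteq \{1,\dots,d\}$ and $\{\alpha_i\}_{i=1}^{d}$ is a pure maximal set. We write $F_I$ for the face of states $\rho$ with $(a_I|\rho) = 1$, and $F_I^{\perp}$ for the orthogonal face.
Definition 5.5. An orthogonal projector on the face $F_I$ is a transformation $P_I$ such that $P_I\rho = \rho$ for every $\rho \in F_I$, and $P_I\sigma = 0$ for every $\sigma \in F_I^{\perp}$.
Proposition 5.6. For every face $F_I$ there exists a pure orthogonal projector $P_I$, and it satisfies $uP_I = a_I$.
Proof. Suppose $\rho$ is any state in $F_I$, then $(a_I|\rho) = 1$. By Proposition 4.7 we know that there is a pure transformation $P_I$ such that $P_I\rho = \rho$ for every $\rho \in F_I$. We also have $(u|P_I|\sigma) \le (a_I|\sigma)$, so if $\sigma \in F_I^{\perp}$, we have $(u|P_I|\sigma) = 0$, whence $P_I\sigma = 0$. To prove that $uP_I = a_I$, first note that $\psi^{\dagger}P_I = \psi^{\dagger}$ for every pure state $\psi \in F_I$. Indeed $\psi^{\dagger}P_I$ is pure by Purity Preservation, and we have $(\psi^{\dagger}P_I|\psi) = (\psi^{\dagger}|\psi) = 1$ because $P_I\psi = \psi$ by definition. By Theorem 4.6, we have $\psi^{\dagger}P_I = \psi^{\dagger}$. Furthermore, $\varphi^{\dagger}P_I = 0$ for a pure state $\varphi \in F_I^{\perp}$. Indeed, consider $(\varphi^{\dagger}P_I|\chi) = \frac{1}{d}\sum_{i\in I}(\varphi^{\dagger}|P_I|\alpha_i) + \frac{1}{d}\sum_{i\notin I}(\varphi^{\dagger}|P_I|\alpha_i)$. The second term vanishes because $\alpha_i \in F_I^{\perp}$ for $i \notin I$. The first term vanishes because $P_I\alpha_i = \alpha_i$ for $i \in I$, and $\varphi$ is perfectly distinguishable from any of the $\alpha_i$'s for $i \in I$ by means of the observation-test $\{u - a_I, a_I\}$, implying $(\varphi^{\dagger}|\alpha_i) = 0$ [20]. This means that $\varphi^{\dagger}P_I$ occurs with zero probability on all states contained in $\chi$, and since $\chi$ is complete [14], $\varphi^{\dagger}P_I = 0$. Now, when we calculate $uP_I$, we use $u = \sum_{i=1}^{d}\alpha_i^{\dagger}$ and separate the contribution arising from states in orthogonal faces: $uP_I = \sum_{i\in I}\alpha_i^{\dagger}P_I + \sum_{i\notin I}\alpha_i^{\dagger}P_I = \sum_{i\in I}\alpha_i^{\dagger} = a_I$. This concludes the proof.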
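In other words, $P_I$ occurs with the same probability as $a_I$, thus satisfying one of the desiderata of Section 3. Moreover, extending some of the results in the proof of Proposition 5.6 by linearity, we obtain the dual statements of Definition 5.5, namely $\rho^{\dagger}P_I = \rho^{\dagger}$ for every $\rho \in F_I$, and $\rho^{\dagger}P_I = 0$ for every $\rho \in F_I^{\perp}$. Another consequence of Proposition 5.6 is that projectors actually project on their associated face, viz. for every normalised state $\rho$, $P_I\rho = \lambda\sigma$, where $\sigma$ is in $F_I$, and $\lambda = (a_I|\rho)$. Indeed, $\lambda = (u|P_I|\rho) = (a_I|\rho)$. If $\lambda \neq 0$, which means $\rho \notin F_I^{\perp}$, then $\sigma = \frac{1}{\lambda}P_I\rho$ and $(a_I|\sigma) = \frac{1}{\lambda}(a_I|P_I|\rho)$. However, we know that $a_IP_I = a_I$, so $(a_I|\sigma) = 1$, showing that $\sigma \in F_I$. Furthermore, we can show that every projector $P_I$ has a complement $P_I^{\perp}$, which is the projector associated with the effect $a_I^{\perp} = \sum_{i\notin I}\alpha_i^{\dagger}$, which defines the orthogonal face $F_I^{\perp}$. Clearly $P_I^{\perp}\rho = (a_I^{\perp}|\rho)\,\sigma$, with $\sigma \in F_I^{\perp}$. In particular, $P_I^{\perp}\rho$ vanishes if and only if $\rho \in F_I$. These properties are the starting point for proving the idempotence of projectors.
Proposition 5.7. For every projector $P_I$ one has $P_I^2 \doteq P_I$, where $\doteq$ denotes equality on the states of the single system A. Moreover, if $I$ and $J$ are disjoint, then $P_IP_J \doteq 0$.
Proof. Recall that for every state $\rho$, $P_I\rho = \lambda\sigma$, where $\sigma$ is in $F_I$. Now, $P_I$ leaves $\sigma$ invariant by definition, so $P_I^2\rho = \lambda P_I\sigma = \lambda\sigma$, so $P_I^2 \doteq P_I$. To prove the other property, note that if $I$ and $J$ are disjoint, they define orthogonal faces. Indeed, suppose $\rho \in F_I$, then $(a_I|\rho) + (a_J|\rho) \le (u|\rho) = 1$, which implies $(a_J|\rho) = 0$ because $(a_I|\rho) = 1$. Hence $\rho \in F_J^{\perp}$. Now, given any normalised state $\rho$, $P_IP_J\rho = 0$ because $P_J\rho$ is proportional to a state in $F_I^{\perp}$. This proves that $P_IP_J \doteq 0$.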
This result shows that, once a pure maximal set $\{\alpha_i\}_{i=1}^{d}$ is fixed, whenever we have a partition $\{I_j\}$ of $\{1,\dots,d\}$, the test $\{P_{I_j}\}$ is a von Neumann measurement. The only thing left to check is that $\sum_j uP_{I_j} = u$, which is a sufficient condition for a set of transformations to be a test in sharp theories with purification. This is satisfied because, recalling Proposition 5.6, $\sum_j uP_{I_j} = \sum_j a_{I_j} = \sum_{i=1}^{d}\alpha_i^{\dagger} = u$. Because of the properties proved above, von Neumann measurements are repeatable and minimally disturbing measurements in the sense of Refs. [23,24]. Indeed, $a_{I_j}P_{I_j} = a_{I_j}$, and $a_{I_j}P_{I_k} = 0$ for $k \neq j$, because the $P_{I_k}$'s project on faces orthogonal to $F_{I_j}$.
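The projector properties above have a familiar quantum counterpart, which the following sketch checks on a three-level system. It assumes, for illustration only, that the pure maximal set is the computational basis and that $P_I$ acts on a density matrix by keeping the block indexed by $I$; the function names `project` and `trace` and the chosen state are hypothetical. The sketch verifies idempotence, orthogonality for disjoint subsets, the product rule for overlapping subsets, and that the outcome probabilities of a von Neumann measurement sum to one.

```python
from fractions import Fraction

def project(rho, keep):
    """Assumed quantum form of P_I: rho -> Pi_I rho Pi_I with Pi_I diagonal in the fixed basis."""
    n = len(rho)
    return [
        [rho[i][j] if (i in keep and j in keep) else Fraction(0) for j in range(n)]
        for i in range(n)
    ]

def trace(rho):
    """(u|rho) for a density matrix."""
    return sum(rho[i][i] for i in range(len(rho)))

def is_zero(rho):
    return all(x == 0 for row in rho for x in row)

# Real pure state with exact rational entries: v = (2/3, 2/3, 1/3), rho = v v^T.
v = [Fraction(2, 3), Fraction(2, 3), Fraction(1, 3)]
rho = [[a * b for b in v] for a in v]

# Von Neumann measurement associated with the partition {{0, 1}, {2}}.
partition = [{0, 1}, {2}]
probabilities = [trace(project(rho, block)) for block in partition]

idempotent = project(project(rho, {0, 1}), {0, 1}) == project(rho, {0, 1})
disjoint_vanishes = is_zero(project(project(rho, {2}), {0, 1}))
intersection_rule = project(project(rho, {1, 2}), {0, 1}) == project(rho, {1})
```

The last check mirrors the generalisation discussed later in this section, where the product of two projectors is related to the projector on the intersection of the index sets.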
The next proposition concerns the interplay between orthogonal projectors and the dagger.
Proposition 5.8. For every normalised state $\rho$, and for every projector $P_I$ on a face $F_I$, one has $(P_I\rho)^\dagger = \rho^\dagger P_I$.
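Proof. First of all, note that $0 \le \|P_I\rho\| \le 1$, and that $\|P_I\rho\|$ vanishes if and only if $\rho \in F_I^\perp$. If $\rho \in F_I^\perp$, then $\rho^\dagger P_I = 0$, so the statement is trivially true. Now suppose $\|P_I\rho\| > 0$. We will first prove the statement for normalised pure states $\psi$; it is then sufficient to extend it by linearity to all states. We will make use of the uniqueness of the dagger for normalised pure states. The statement is then equivalent to proving
$$\left(\frac{P_I\psi}{\|P_I\psi\|}\right)^\dagger = \frac{\psi^\dagger P_I}{\|P_I\psi\|}.$$
Noting that the term in brackets is a normalised pure state (by Purity Preservation), and that the RHS is a pure effect (again by Purity Preservation), by the uniqueness of the dagger for normalised pure states (cf. Theorem 4.6), it is enough to prove that $\frac{(\psi^\dagger|P_I P_I|\psi)}{\|P_I\psi\|^2} = 1$; in other words, that $(\psi^\dagger|P_I P_I|\psi) = \|P_I\psi\|^2$. Recall that $P_I^2 \doteq P_I$ (Proposition 5.7), so $(\psi^\dagger|P_I P_I|\psi) = (\psi^\dagger|P_I|\psi)$. Now, $P_I\psi = \|P_I\psi\|\,\psi'$, where $\psi'$ is a pure state in $F_I$. We have $(\psi^\dagger|P_I P_I|\psi) = \|P_I\psi\|\,(\psi^\dagger|\psi')$. We only need to prove that $(\psi^\dagger|\psi') = \|P_I\psi\|$. Recall that $(\psi^\dagger|\psi') = ({\psi'}^\dagger|\psi)$ by Lemma 5.2, and that ${\psi'}^\dagger P_I = {\psi'}^\dagger$ as $\psi' \in F_I$, thus
$$({\psi'}^\dagger|\psi) = ({\psi'}^\dagger|P_I|\psi) = \|P_I\psi\|\,({\psi'}^\dagger|\psi') = \|P_I\psi\|.$$
By the uniqueness of the dagger for normalised pure states we conclude that $(P_I\psi)^\dagger = \psi^\dagger P_I$.
A consequence of this proposition is that orthogonal projectors play nicely with the inner product of Lemma 5.2, namely for every $\xi, \eta \in \mathsf{St}_{\mathbb{R}}(A)$ one has
$$\langle P_I\xi, \eta\rangle = \langle\xi, P_I\eta\rangle. \qquad (5.5)$$
In other words, projections are symmetric with respect to the inner product. The last property we need is a generalisation of the results of Proposition 5.7.
Proposition 5.9. For every pair of projectors $P_I$ and $P_J$ one has $P_I P_J \doteq P_{I\cap J}$.
Proof. First let us prove that
$$P_I P_J\rho = \|P_I P_J\rho\|\,\rho' \qquad (5.6)$$
for every normalised state $\rho$, where $\rho' \in F_{I\cap J}$. Let us show that $\|P_I P_J\rho\| = (a_{I\cap J}|\rho)$. By Proposition 5.6, $(u|P_I P_J|\rho) = (a_I|P_J|\rho)$. Now, recalling that $a_I = \sum_{i\in I}\alpha_i^\dagger$,
$$(a_I|P_J|\rho) = \sum_{i\in I}(\alpha_i^\dagger|P_J|\rho) = \sum_{i\in I\cap J}(\alpha_i^\dagger|\rho) = (a_{I\cap J}|\rho),$$
where we have used the fact that $\alpha_i^\dagger P_J = \alpha_i^\dagger$ if $i \in J$, and $\alpha_i^\dagger P_J = 0$ if $i \notin J$. If $\rho \in F_{I\cap J}^\perp$, both the LHS and the RHS of Equation (5.6) vanish, and the statement is trivially satisfied.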
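Now, let us assume $\rho \notin F_{I\cap J}^\perp$; in this case $(a_{I\cap J}|\rho) > 0$. We wish to prove that $(a_{I\cap J}|P_I P_J|\rho) = (a_{I\cap J}|\rho)$. Recalling the expression of $a_{I\cap J}$, we have
$$(a_{I\cap J}|P_I P_J|\rho) = \sum_{i\in I\cap J}(\alpha_i^\dagger|P_I P_J|\rho) = \sum_{i\in I\cap J}(\alpha_i^\dagger|\rho) = (a_{I\cap J}|\rho),$$
again by the properties of $P_I$ and $P_J$. This means that $P_I P_J$ maps every normalised state to a state of $F_{I\cap J}$, up to normalisation. Now let us prove that $(P_I P_J)^2 \doteq P_I P_J$. First note that $F_{I\cap J} \subseteq F_I$. Indeed, suppose $\rho \in F_{I\cap J}$, then
$$(a_I|\rho) = \sum_{i\in I}(\alpha_i^\dagger|\rho) = \sum_{i\in I\cap J}(\alpha_i^\dagger|\rho) = (a_{I\cap J}|\rho) = 1,$$
where we have used the fact that $(\alpha_i^\dagger|\rho) = 0$ if $i \notin I\cap J$. By a similar argument, $F_{I\cap J} \subseteq F_J$. Now, $P_I P_J\rho = \|P_I P_J\rho\|\,\rho'$, with $\rho' \in F_{I\cap J}$. Then $(P_I P_J)^2\rho = \|P_I P_J\rho\|\,P_I P_J\rho'$. However, $\rho' \in F_J$, so $P_J\rho' = \rho'$, and, similarly, $\rho' \in F_I$, so $P_I\rho' = \rho'$. Consequently, $(P_I P_J)^2\rho = \|P_I P_J\rho\|\,\rho' = P_I P_J\rho$, proving that $(P_I P_J)^2 \doteq P_I P_J$.
Now let us prove that for every $\xi \in \mathsf{St}_{\mathbb{R}}(A)$, we have $(P_I P_J\xi)^\dagger = \xi^\dagger P_I P_J$. Following the lines of the proof of Proposition 5.8, let us show that this is true when $\xi$ is a normalised pure state $\psi$. This boils down to showing that $(\psi^\dagger|P_I P_J P_I P_J|\psi) = \|P_I P_J\psi\|^2$.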
The proof goes on as for Proposition 5.8, noting that if $\psi' \in F_{I\cap J}$, then ${\psi'}^\dagger P_I P_J = {\psi'}^\dagger$ because ${\psi'}^\dagger P_I = {\psi'}^\dagger$ as $\psi' \in F_I$, and, similarly, ${\psi'}^\dagger P_J = {\psi'}^\dagger$ as $\psi' \in F_J$. Eventually we find that for pure states $(P_I P_J\psi)^\dagger = \psi^\dagger P_I P_J$, and by linearity this means that $(P_I P_J\xi)^\dagger = \xi^\dagger P_I P_J$.
A consequence of this property is that $\langle P_I P_J\xi, \eta\rangle = \langle\xi, P_I P_J\eta\rangle$, for all $\xi, \eta \in \mathsf{St}_{\mathbb{R}}(A)$. These linear maps on $\mathsf{St}_{\mathbb{R}}(A)$ are such that $\mathsf{St}_{\mathbb{R}}(A) = \operatorname{im} P_I P_J \oplus \ker P_I P_J$, and $\ker P_I P_J$ is the orthogonal subspace to $\operatorname{im} P_I P_J$, hence it is uniquely defined once $\operatorname{im} P_I P_J$ is fixed. Note that for any projector $P_I$ we have $\operatorname{im} P_I = \operatorname{span} F_I$, and we have just proved that $\operatorname{im} P_I P_J = \operatorname{span} F_{I\cap J} = \operatorname{im} P_{I\cap J}$. Having the same image, and consequently the same kernel, $P_I P_J$ and $P_{I\cap J}$ agree on a basis of $\mathsf{St}_{\mathbb{R}}(A)$, therefore they agree also on all states of $A$, meaning that $P_I P_J \doteq P_{I\cap J}$.
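The projector properties collected in this section can be checked numerically in the most familiar sharp theory with purification, finite-dimensional quantum theory. The sketch below is only an illustration under stated assumptions: it takes a 4-level system whose levels play the role of slits, models $P_I$ as $\rho \mapsto \Pi_I\rho\Pi_I$ with $\Pi_I$ the diagonal projector on the levels in $I$, and uses the Hilbert-Schmidt inner product as the self-dualising inner product. The helper names `density`, `project`, `trace` and `inner` are hypothetical and do not come from the paper.

```python
def density(psi):
    """Density matrix |psi><psi| of a normalised amplitude vector."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def project(rho, I):
    """Assumed quantum model of P_I: keep entries whose row and column lie in I."""
    d = len(rho)
    return [[rho[r][c] if (r in I and c in I) else 0 for c in range(d)]
            for r in range(d)]

def trace(m):
    """(u|m): the deterministic effect is the trace in quantum theory."""
    return sum(m[i][i] for i in range(len(m)))

def inner(x, y):
    """Hilbert-Schmidt inner product Tr(x^dagger y)."""
    d = len(x)
    return sum(x[r][c].conjugate() * y[r][c]
               for r in range(d) for c in range(d)).real

rho = density([0.5, 0.5j, -0.5, 0.5])   # a pure state spread over four slits
eta = density([0.5, 0.5, 0.5, 0.5])     # a second state, for the symmetry check
I, J = {0, 1, 2}, {1, 2, 3}

idempotent = project(project(rho, I), I) == project(rho, I)          # P_I^2 = P_I
orthogonal = project(project(rho, {2, 3}), {0, 1})                   # disjoint sets
product_rule = project(project(rho, J), I) == project(rho, I & J)    # P_I P_J = P_{I and J}
probability = trace(project(rho, I)).real                            # (u|P_I|rho) = (a_I|rho)
symmetric = abs(inner(project(rho, I), eta) - inner(rho, project(eta, I))) < 1e-12
```

In this toy model the checks hold for any choice of index sets, because projecting simply zeroes the rows and columns outside the chosen set; the content of Propositions 5.7 to 5.9 is that the same algebra holds in every sharp theory with purification.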

Main result
Proposition 29 of [8] asserts that theories satisfying two postulates, Strong Symmetry and Projectivity, have no higher-order interference if and only if their projectors (in our terminology here) preserve purity. A close examination of its proof, and of the proofs of all lemmas and propositions used in it (notably Lemma 22 and Propositions 18, 25, 26, and 28 of [8]), reveals that only premises weaker than the conjunction of Strong Symmetry and Projectivity are used: self-duality, the "spectral-like decomposition" of effects as in Lemma 5.1 above, the fact that faces are determined by subsets of maximal distinguishable sets of states as in Section 5.2 above, the existence of projectors onto each face in the sense of Definition 5.5 above, and the fact that these are symmetric with respect to the self-dualising inner product (i.e., orthogonal projectors), and satisfy Proposition 5.9 above. We have established these weaker premises for sharp theories with purification, and moreover, we have established in Proposition 5.6 that their projectors preserve purity, so we have proved:
Theorem 5.10. In any sharp theory with purification there can be no $n$th-order interference for $n \ge 3$.
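As a concrete illustration of what the theorem excludes, the following sketch evaluates Sorkin's interference terms for ordinary quantum amplitudes at a single point on the screen. It assumes the standard alternating-sum form of the $n$th-order term, $I_n = \sum_{\emptyset \neq S \subseteq \{1,\ldots,n\}} (-1)^{n-|S|} p_S$, where $p_S$ is the detection probability with exactly the slits in $S$ open; the amplitudes and the function names `pattern` and `interference` are hypothetical choices made for the example.

```python
from itertools import combinations
import cmath

def pattern(amplitudes, open_slits):
    """Quantum detection probability (up to normalisation) with the given slits open."""
    return abs(sum(amplitudes[i] for i in open_slits)) ** 2

def interference(amplitudes, slits):
    """Alternating sum of patterns over all non-empty subsets of `slits`."""
    n = len(slits)
    total = 0.0
    for k in range(1, n + 1):
        for subset in combinations(slits, k):
            total += (-1) ** (n - k) * pattern(amplitudes, subset)
    return total

# Hypothetical complex amplitudes for three slits at one screen position.
amps = [0.6, 0.5 * cmath.exp(1.0j), 0.4 * cmath.exp(2.5j)]

I2 = interference(amps, (0, 1))      # second-order term: generically non-zero
I3 = interference(amps, (0, 1, 2))   # third-order term: vanishes in quantum theory
```

Here `I2` equals the familiar cross term $2\,\mathrm{Re}(a_1\bar a_2)$, while `I3` vanishes up to floating-point error for any choice of amplitudes, because the squared modulus of a sum contains only pairwise cross terms.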

Jordan-algebraic structure
Our results also imply that systems, and therefore also the "subsystems" associated with their faces, are operationally equivalent to finite-dimensional Jordan-algebraic systems. These are systems $A$ for which $\mathsf{St}_+(A)$ is the cone of squares in a finite-dimensional Euclidean Jordan algebra (EJA) and $\mathsf{Eff}_+(A)$ is identified with the same cone, with evaluation of effects on states given by the inner product and the Jordan unit as the deterministic effect. (See [7] for more on Jordan algebraic operational systems, and [3] for a mathematical treatment.)
Theorem 5.11. In a sharp theory with purification, every system $A$ has both $\mathsf{St}_+(A)$ and $\mathsf{Eff}_+(A)$ isomorphic to the cone of squares in a Euclidean Jordan algebra (EJA) via isomorphisms $S$ and $T$ such that $(a|\rho) = \langle Ta, S\rho\rangle$, where $\langle\bullet,\bullet\rangle$ is the canonical inner product on the EJA, and $T$ takes the deterministic effect to the Jordan unit.
Proof. The proof uses results of Alfsen and Shultz [2], for which we refer to [3]. Theorem 9.33 in [3] implies that finite-dimensional systems with symmetry of transition probabilities (STP), a type of projection operator they call "compression" associated with every face, and whose compressions preserve purity, have state spaces affinely isomorphic to the state spaces of Euclidean Jordan algebras. Sharp theories with purification satisfy STP, as noted following Lemma 5.2 above. Our projectors are easily shown to be examples of compressions by the same argument as in Theorem 17 of [8]; this argument uses only properties satisfied by our projectors (the same ones needed in the proof of Theorem 5.10, except for Purity Preservation) and does not need Strong Symmetry. As shown above, our projectors also preserve purity.
Faces of Jordan-algebraic systems are also Jordan-algebraic, and hence so are the faces of state spaces in sharp theories with purification. To see this, combine a result of Iochum [43] (Theorem 5.32 in [3]), whose finite-dimensional case is that all faces of EJAs are the positive part of the images of compressions, with the facts (cf. pp. 22-26 of [3]) that every face of the cone of squares is the image of such a compression $P$ ([3], Lemma 1.39), and also a Jordan subalgebra whose unit is the image of the order unit under $P$ ([3], Proposition 1.43). However, in sharp theories with purification a face of a system is not necessarily isomorphic to a stand-alone system of the theory (an object of the category, in the categorical formulation), although it is always possible to extend the theory so that every face is.
Every category has a Cauchy completion: this is a minimal extension of the category such that every idempotent morphism $\pi : A \to A$ can be written as a retraction-section pair, i.e., as the composition $\pi = \sigma \circ \rho$, with $\rho : A \to B$ and $\sigma : B \to A$, such that the reverse composition $\rho \circ \sigma$ is the identity morphism on $B$. When the idempotents are projectors $P$ like the ones we consider here, $B$ will be a system isomorphic to the face $\operatorname{im}_+(P)$. Of course, since there may be idempotents beyond the projectors onto faces (for example, decoherence of a set of orthogonal subspaces, or damping to a fixed state, in quantum theory), Cauchy completion of an operational theory $T$ may add many objects in addition to ones isomorphic to faces of systems of $T$; indeed, for many operational theories (e.g., ones possessing idempotent decoherence maps) this will add some classical systems. This is indeed the case for quantum theory, where the Cauchy completion leads to the category of finite-dimensional C*-algebras and completely positive maps [30].
The Cauchy completion can be thought of as adding in all operationally accessible systems that can be simulated on the physical system via a consistent restriction on the allowed states, effects and transformations. The Cauchy completion of a sharp theory with purification will likely satisfy the Ideal Compression postulate by virtue of containing the faces that are images of orthogonal projectors; but there are also non-Cauchy complete theories that satisfy it, e.g., the category CPM of finite-dimensional quantum systems and CP maps, in which all systems, and also all images of orthogonal projectors as defined above, are fully coherent quantum systems, but there are no classical systems.
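The retraction-section splitting of an idempotent can be made concrete in a small quantum example. The sketch below is an illustration only, under the assumption that the idempotent is the projector $\pi(Y) = \Pi Y \Pi$ onto the first two levels of a 3-level system; it splits through a 2-level system $B$ via an isometry $V$, with section $\sigma(X) = V X V^{T}$ and retraction $\rho(Y) = V^{T} Y V$. All function names are hypothetical, and real matrices are used to keep the code short.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Isometry embedding the 2-level system B into the 3-level system A.
V = [[1, 0],
     [0, 1],
     [0, 0]]
Vt = transpose(V)

def section(x):
    """sigma : B -> A, embeds a matrix on B into A."""
    return matmul(matmul(V, x), Vt)

def retraction(y):
    """rho : A -> B, restricts a matrix on A to the 2-level block."""
    return matmul(matmul(Vt, y), V)

def projector(y):
    """pi = sigma o rho : A -> A, the idempotent being split."""
    return section(retraction(y))

y = [[1, 2, 3],
     [2, 4, 5],
     [3, 5, 6]]          # an arbitrary symmetric matrix on A
x = [[1, 2],
     [2, 4]]             # an arbitrary symmetric matrix on B

splits_identity = retraction(section(x)) == x            # rho o sigma = id_B
is_idempotent = projector(projector(y)) == projector(y)  # pi o pi = pi
```

The new object $B$ added by the completion is, in this example, the 2-level system whose states are exactly those supported on the face picked out by the projector.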
In [7], some categories, including dagger-compact-closed categories, of Jordan algebraic systems were constructed; these categories are equivalent to operational theories as we use the term here. Although sharp theories with purification also have Jordan algebraic state and effect spaces, it is interesting to note that some of the explicit examples in [20,21] involve composites different from those that would be obtained in the categories considered in [7] for systems with the same state spaces. On the other hand, the category combining real and quaternionic systems in [7] does not satisfy Purity Preservation by parallel composition and hence falls outside the class of sharp theories with purification, although its filters do preserve purity. Of course, the failure of Purity Preservation by parallel composition seems likely to allow phenomena like the nonextensiveness of entropy when products of states are taken, which could warrant focusing on sharp theories with purification in thermodynamically motivated work such as [20].
That Jordan-algebraic systems lack higher-order interference was shown by Barnum and Ududec ([72]; announced in [4]) and by Niestegge [59]; combining this with Theorem 5.11 gives another way to see that our results on sharp theories with purification imply the absence of higher-order interference. Moreover, as not all EJAs satisfy our postulates, it is clear that our postulates are sufficient but not necessary conditions for ruling out higher-order interference.

Discussion and conclusions
We proved that in sharp theories with purification multi-slit experiments must have a pure projector structure and, moreover, such theories exhibit at most second-order interference. Hence these theories are, at least conceptually, very "close" to quantum theory. Moreover, recent work has shown that sharp theories with purification are close to quantum theory in terms of other physical and information processing features. Indeed, such theories possess quantum-like contextuality behaviour [23,24], quantum-like computation [50,51], and quantum-like thermodynamic properties [19,20,21]. Recall from Section 4 that quantum theory is not the only example of a generalised probabilistic theory satisfying these principles. Hence Causality, Purity Preservation, Pure Sharpness, and Purification do not recover the entire quantum formalism.
However, if one were to introduce the Ideal Compression and Local Discriminability principles of the reconstruction of quantum theory due to Chiribella, D'Ariano, and Perinotti [15], one would indeed regain the entire quantum formalism. Indeed, both additional principles are necessary: Local Discriminability to preclude real quantum theory, and Ideal Compression to preclude the contrived, yet admissible, example of the theory in which all systems are composites of qubits. Sharp theories with purification thus serve as a fertile test-bed for physics that is conceptually quite close to that predicted by the quantum world, but which may diverge from it in certain small, yet interesting, ways.

Finding higher-order interference
To date there has been no experiment that has found higher-order interference, at least, none that cannot be explained by taking into account the fact that the "sets of histories are not mutually exclusive" [69,67]. However, this might be due to the specific experimental set-up employed, rather than a fundamental preclusion of higher-order interference in nature. We show here that many of the properties needed to rule out observing higher-order interference are in fact quite natural assumptions which appear to be suggested by the experimental set-up employed. This suggests that the experimental set-up itself may implicitly rule out observing higher-order interference from the outset.
The main result of the current work is that sharp theories with purification can never exhibit higher-order interference in any experiment. However, in a wider class of theories, we still will not observe higher-order interference in a particular experiment if the following three conditions are met; hence, to have any chance of observing higher-order interference, experiments must be designed in order to try to violate these conditions.
1. The transformations corresponding to blocking slits satisfy $T_I T_J = T_{I\cap J}$. By this we mean that they share several properties with the projectors $P_I$ of Section 5: if we define the effects $a_I = uT_I$ and the faces $F_I$ and $F_I^\perp$ as in Section 5.2, i.e., as the 1-set and 0-set of $a_I$, then the $T_I$ are assumed to be orthogonal projectors in the sense of Definition 5.5, and to be both idempotent and "orthogonal" ($T_I T_J = 0$) if $I$ and $J$ are disjoint (as in Proposition 5.7).
2. The $T_I$'s map pure states to pure states.
3. The $T_I$'s are self-adjoint.
The first of these is generally expected as only those slits belonging to both I and J will not be blocked by either T I or T J , and so should hold in this experimental set-up for any theory that can describe it.
The second assumption, which is also natural given the multi-slit set-up, is that, in an idealised scenario, the slits should not introduce fundamental noise. That is, if an input state $\rho$ is pure, i.e., has no classical noise associated with it, then $T_I\rho$ should also be pure. Hence it appears natural to assume that $T_I$ maps pure states to pure states. Violating this principle by just adding noise to the experiment does not seem likely to demonstrate higher-order interference. A more plausible way to violate it, however, would be for the particle passing through the slits to become entangled with some degree of freedom associated with them: if we do not have access to this degree of freedom, a pure input is sent to a mixed state.
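The entanglement mechanism just described can be sketched in quantum theory. The toy model below is an assumption made for illustration: a particle in an equal superposition of two slits leaves a "marker" state $m_i$ in an inaccessible two-level degree of freedom attached to slit $i$, with $m_0 = (1, 0)$ and $m_1 = (\cos\theta, \sin\theta)$. Discarding the marker gives the particle's reduced state, whose purity $\mathrm{Tr}\,\rho^2$ drops below 1 as soon as the markers differ. The function names are hypothetical.

```python
import math

def reduced_state(theta):
    """Particle state after tracing out the marker, for marker angle theta."""
    markers = [(1.0, 0.0), (math.cos(theta), math.sin(theta))]
    # Joint amplitudes c[i][k]: particle at slit i, marker component k.
    c = [[m_k / math.sqrt(2) for m_k in markers[i]] for i in range(2)]
    # Partial trace over the marker: rho[i][j] = sum_k c[i][k] * c[j][k].
    return [[sum(c[i][k] * c[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]

def purity(rho):
    """Tr(rho^2) for a real symmetric 2x2 matrix; equals 1 for pure states."""
    return sum(rho[i][j] * rho[j][i] for i in range(2) for j in range(2))

no_record = purity(reduced_state(0.0))            # identical markers: state stays pure
partial = purity(reduced_state(math.pi / 4))      # overlapping markers: partly mixed
full_record = purity(reduced_state(math.pi / 2))  # orthogonal markers: maximally mixed
```

In this model the purity is $(1 + \cos^2\theta)/2$, so the transformation seen by an observer without access to the marker fails to preserve purity for every $\theta \neq 0$ (modulo $\pi$).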
The final assumption is far less general than the others, as it places a constraint on the theory. That is, to even discuss whether a transformation is self-adjoint (cf. also Appendix B), one requires that the theory itself be self-dual. To fully understand what this assumption entails, one needs an operational or physical interpretation of the self-dualising inner product (see [63] for an example of such an interpretation). However, intuitively this notion reflects the inherent symmetry of the experimental set-up. Here one could consider propagation from the source to the effect or from the effect to the source as being "dual" to one another and, moreover, that the physical blocking of slits has an equivalent effect in either situation. That is, the assumption of self-adjointness corresponds to the statement that the projector has an equivalent action on the effects associated with a particular slit as it does on the states which can pass through them.
If an experiment satisfies these assumptions then for any self-dual theory it was shown in [8] (Proposition 29) that we will not see higher-order interference in this experiment. Hence any set of physical principles which ensure these assumptions hold will rule out higher-order interference. Because the mathematical assumptions involved in formalising a multi-slit experiment are so natural when interpreted operationally, perhaps one should search for higher-order interference in set-ups that do not seem to preclude it from the outset. This could involve "asymmetric" multi-slit set-ups that are not obviously time-symmetric in an arbitrary generalised probabilistic theory. One could also consider experiments that search for higher-order phases [51], a reformulation of higher-order interference that makes no reference to projectors and hence does not preclude certain generalised theories from the outset. The assumption that nature is self-dual could also be rejected; this poses the question as to whether it is possible to find a direct experimental test of this principle.

Acknowledgments
The authors thank J. van de Wetering for pointing out an error in Lemma 5.1 in the previous version of the paper. The authors also thank J. Barrett

A.1 Operational norm and dagger norm
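In Ref. [14] the operational norm for every vector $\xi \in \mathsf{St}_{\mathbb{R}}(A)$ was introduced:
$$\|\xi\| := \sup_{a\in\mathsf{Eff}(A)}(a|\xi) - \inf_{a\in\mathsf{Eff}(A)}(a|\xi).$$
As pointed out in [14], in quantum theory the operational norm coincides with the trace norm. The analogy is apparent also in sharp theories with purification: if $\xi = \sum_{i=1}^d x_i\alpha_i$ is a diagonalisation of $\xi$, then $\|\xi\| = \sum_{i=1}^d |x_i|$.
Proof. Let us separate the terms with non-negative eigenvalues from the terms with negative eigenvalues, so that we can write $\xi = \xi_+ - \xi_-$, where $\xi_+ := \sum_{x_i\ge0} x_i\alpha_i$, and $\xi_- := \sum_{x_i<0}(-x_i)\alpha_i$. In order to achieve the supremum of $(a|\xi)$ we must have $(a|\xi_-) = 0$. Moreover, the supremum is attained by the effect $a = \sum_{x_i\ge0}\alpha_i^\dagger$, which gives $\sup_a(a|\xi) = \sum_{x_i\ge0}x_i$; by the same argument $\inf_a(a|\xi) = \sum_{x_i<0}x_i$, whence $\|\xi\| = \sum_{i=1}^d|x_i|$.
For $p \ge 1$, the $p$-norm of a vector $x \in \mathbb{R}^d$ is defined as $\|x\|_p := \left(\sum_{i=1}^d|x_i|^p\right)^{1/p}$. In sharp theories with purification we have an additional norm, the dagger norm, defined in Section 5.1. The dagger norm of a vector $\xi$ is $\|\xi\|_\dagger = \sqrt{\sum_{i=1}^d x_i^2}$, where the $x_i$'s are the eigenvalues of $\xi$. It is obvious from the very definition that $\|\xi\|_\dagger = \|x\|_2$. Thanks to these results following from diagonalisation, we can derive the standard bounds between the two norms, by making use of the well-known bounds $\|x\|_2 \le \|x\|_1 \le \sqrt{d}\,\|x\|_2$:
$$\|\xi\|_\dagger \le \|\xi\| \le \sqrt{d}\,\|\xi\|_\dagger. \qquad (A1)$$
Note that, unlike Ref. [57], here the bounds are derived without assuming Bit Symmetry [58,8].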
If we take $\xi$ to be a normalised state $\rho$, its eigenvalues form a probability distribution, and we have $\|\rho\|_\dagger \le 1$, with equality if and only if $\rho$ is pure. Note that $\|\rho\|_\dagger$ is a Schur-convex function [54] of the eigenvalues of $\rho$, so it is a purity monotone [20]. As such, it attains its minimum on the invariant state $\chi$, which is $\|\chi\|_\dagger = \frac{1}{\sqrt{d}}$, so for every normalised state one has
$$\frac{1}{\sqrt{d}} \le \|\rho\|_\dagger \le 1,$$
consistently with the bounds (A1). The square of the dagger norm, still a Schur-convex function, was called purity in Refs. [57,56]. Consequently $1 - \|\rho\|_\dagger^2$ is a measure of mixedness, sometimes called the impurity $I(\rho)$ of $\rho$. The impurity can be extended to subnormalised states by defining it as $I(\rho) := (\mathrm{Tr}\,\rho)^2 - \|\rho\|_\dagger^2$ [8]. The two norms behave differently under channels applied to states. In Ref. [14] it was shown that in causal theories the operational norm of a state $\rho$ is preserved by channels: $\|\mathcal{C}\rho\| = \|\rho\|$ for every channel $\mathcal{C}$, because channels are such that $u\mathcal{C} = u$.
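The relations between the two norms can be checked directly on spectra. The sketch below assumes, as the diagonalisation argument indicates, that the operational norm of a vector is the 1-norm of its eigenvalues and that the dagger norm is their 2-norm; the function names and the sample spectra are hypothetical.

```python
import math

def operational_norm(spectrum):
    """Assumed form of the operational norm: 1-norm of the eigenvalues."""
    return sum(abs(x) for x in spectrum)

def dagger_norm(spectrum):
    """Dagger norm: 2-norm of the eigenvalues."""
    return math.sqrt(sum(x * x for x in spectrum))

def impurity(spectrum):
    """I(rho) = (Tr rho)^2 - ||rho||_dagger^2, also for subnormalised states."""
    return sum(spectrum) ** 2 - dagger_norm(spectrum) ** 2

d = 3
xi = [0.5, -0.25, 0.75]      # spectrum of a generic vector in the real span of states
chi = [1 / d] * d            # invariant (maximally mixed) state
pure = [1.0, 0.0, 0.0]       # a pure state

lower, middle, upper = dagger_norm(xi), operational_norm(xi), math.sqrt(d) * dagger_norm(xi)
min_dagger_norm = dagger_norm(chi)   # expected to equal 1 / sqrt(d)
```

For normalised states the operational norm equals 1, so the two-sided bound collapses to $1/\sqrt{d} \le \|\rho\|_\dagger \le 1$, with the two ends reached by `chi` and `pure` respectively.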
Instead the dagger norm shows a different behaviour. To describe it, it is useful to divide channels into two classes: unital and non-unital channels [21].
Unital channels do not increase the dagger norm of states.
Proof. Unital channels can be chosen as free operations for the resource theory of purity [21]. In Ref. [21] it was shown that, for a unital channel $\mathcal{D}$, the spectrum of $\mathcal{D}\rho$ is majorised by the spectrum of $\rho$ (see Ref. [54] for a definition of majorisation and Schur-convex functions). Since the dagger norm is a Schur-convex function, we have $\|\mathcal{D}\rho\|_\dagger \le \|\rho\|_\dagger$.
Clearly if $\mathcal{D}$ is reversible, the dagger norm is preserved, by Proposition 5.3. For non-unital channels there is at least one state, the invariant state $\chi$, for which the dagger norm increases. Indeed, if $\mathcal{C}$ is non-unital, $\chi$ is majorised by $\mathcal{C}\chi$, whence $\|\chi\|_\dagger \le \|\mathcal{C}\chi\|_\dagger$. Is it true, then, that non-unital channels increase the dagger norm of all states? The answer is clearly negative. Consider the non-unital channel mapping all states to a fixed mixed state $\rho_0 \neq \chi$. For some states, e.g., the invariant state, the dagger norm will increase; for others, e.g., pure states, the dagger norm will decrease because it is a purity monotone. In short, for non-unital channels there is no uniform behaviour of the dagger norm.
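The contrast between unital and non-unital channels can be illustrated with a toy model that acts directly on spectra. This is a simplifying assumption, not the general setting of the paper: a unital channel is represented by a doubly stochastic matrix acting on the probability vector of eigenvalues, and the non-unital example is the reset channel sending every state to a fixed $\rho_0 \neq \chi$. The matrices `D` and `R` and the function names are hypothetical.

```python
import math

def dagger_norm(p):
    """2-norm of a spectrum."""
    return math.sqrt(sum(x * x for x in p))

def apply(matrix, p):
    """Act with a stochastic matrix (list of rows) on a probability vector."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in matrix]

def majorises(p, q):
    """True if p majorises q: partial sums of the sorted p dominate those of q."""
    ps, qs = sorted(p, reverse=True), sorted(q, reverse=True)
    cum_p = cum_q = 0.0
    for a, b in zip(ps, qs):
        cum_p += a
        cum_q += b
        if cum_q > cum_p + 1e-12:
            return False
    return True

# Doubly stochastic matrix: rows and columns all sum to 1 (unital toy channel).
D = [[0.5, 0.5, 0.0],
     [0.5, 0.25, 0.25],
     [0.0, 0.25, 0.75]]
# Reset channel to rho_0 = (0.6, 0.3, 0.1): every column equals rho_0 (non-unital).
R = [[0.6, 0.6, 0.6],
     [0.3, 0.3, 0.3],
     [0.1, 0.1, 0.1]]

p = [0.7, 0.2, 0.1]
chi = [1 / 3, 1 / 3, 1 / 3]
pure = [1.0, 0.0, 0.0]

unital_ok = majorises(p, apply(D, p)) and dagger_norm(apply(D, p)) <= dagger_norm(p)
reset_raises_chi = dagger_norm(apply(R, chi)) > dagger_norm(chi)
reset_lowers_pure = dagger_norm(apply(R, pure)) < dagger_norm(pure)
```

The last two flags show the non-uniform behaviour in miniature: the same reset channel increases the dagger norm of `chi` and decreases that of `pure`.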

A.2 Dagger fidelity
The inner product defined in Section 5.1 allows us to define a fidelity-like quantity, called the dagger fidelity.
Definition A.4. Given two normalised states $\rho$ and $\sigma$, the dagger fidelity is defined as
$$F_\dagger(\rho,\sigma) := \frac{\langle\rho,\sigma\rangle}{\|\rho\|_\dagger\,\|\sigma\|_\dagger}.$$
The dagger fidelity measures the overlap between two states. It shares some properties with the fidelity in quantum theory (cf. for instance Ref. [74]), despite not coinciding with it. The first, obvious one, is that $F_\dagger(\rho,\sigma) = F_\dagger(\sigma,\rho)$.
To prove the other properties we need the following lemma, generalising one of the results of Ref. [20].
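Lemma A.5. Let $\{\rho_i\}$ be a set of perfectly distinguishable states. Then $(\rho_i^\dagger|\rho_j) = 0$ for all $i \neq j$.
Proof. Clearly what we need to prove is that $(\rho_i^\dagger|\rho_j) = 0$ for $i \neq j$. Let $\{a_i\}$ be the perfectly distinguishing test, and let $\rho_i$ be diagonalised as $\rho_i = \sum_{k=1}^{r_i}p_{k,i}\alpha_{k,i}$, where $p_{k,i} > 0$ for all $k = 1, \ldots, r_i$. We have $(a_i|\rho_i) = 1$, hence by Proposition 4.7 there exists a non-disturbing pure transformation $T_i$ such that $T_i =_{\rho_i} \mathcal{I}$. Specifically, we have that $T_i\alpha_{k,i} = \alpha_{k,i}$. Moreover, if $i \neq j$, we have $(u|T_i|\rho_j) \le (a_i|\rho_j) = 0$, whence $(u|T_i|\rho_j) = 0$. This means that $T_i\rho_j = 0$ for all $j \neq i$. Now, consider
$$(\alpha_{k,i}^\dagger|T_i|\alpha_{k,i}) = (\alpha_{k,i}^\dagger|\alpha_{k,i}) = 1,$$
where we have used the fact that $T_i\alpha_{k,i} = \alpha_{k,i}$. Since $\alpha_{k,i}^\dagger T_i$ is a pure effect, it must be $\alpha_{k,i}^\dagger T_i = \alpha_{k,i}^\dagger$ by Theorem 4.6. By linearity we have $\rho_i^\dagger T_i = \rho_i^\dagger$. Now, using this fact, for all $j \neq i$,
$$(\rho_i^\dagger|\rho_j) = (\rho_i^\dagger|T_i|\rho_j) = 0.$$
Recalling that $(\rho^\dagger|\sigma) = \langle\rho,\sigma\rangle$, this lemma means that perfectly distinguishable states form an orthogonal set. Specifically, if the states are pure, the set is orthonormal.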
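In quantum theory the orthogonality statement reduces to a familiar fact, which the following sketch verifies on a 3-level system. It assumes that the self-dualising inner product is the Hilbert-Schmidt inner product $\langle\rho,\sigma\rangle = \mathrm{Tr}(\rho\sigma)$, and that perfectly distinguishable states are those with orthogonal supports; the matrices and the helper `hs_inner` are hypothetical examples.

```python
def hs_inner(x, y):
    """Tr(x y) for real symmetric matrices given as lists of rows."""
    d = len(x)
    return sum(x[r][c] * y[c][r] for r in range(d) for c in range(d))

# A mixed state supported on levels 0 and 1, and a pure state on level 2.
rho_1 = [[0.5, 0.25, 0.0],
         [0.25, 0.5, 0.0],
         [0.0, 0.0, 0.0]]
rho_2 = [[0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0],
         [0.0, 0.0, 1.0]]
# A second pure state, perfectly distinguishable from rho_2.
alpha_0 = [[1.0, 0.0, 0.0],
           [0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0]]

mixed_overlap = hs_inner(rho_1, rho_2)     # orthogonal: distinguishable states
pure_overlap = hs_inner(alpha_0, rho_2)    # orthogonal
pure_norm_sq = hs_inner(alpha_0, alpha_0)  # pure states have unit dagger norm
```

The mixed state `rho_1` has squared dagger norm `hs_inner(rho_1, rho_1)` below 1, which is why only sets of pure perfectly distinguishable states are orthonormal rather than merely orthogonal.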
The following proposition extends and generalises the properties of the self-dualising inner product of Ref. [58].
Proposition A.6. The dagger fidelity has the following properties, for all normalised states ρ and σ.
Proof. Let us prove the various properties.
4. This property follows by Proposition 5.3, because the inner product and the dagger norm are invariant under reversible channels.
Note that Property 3 captures the sharpness of the dagger for all normalised states [63]. A property involving tensor product of states is the following.

Now comes the actual proof.
Proof of Proposition A.7. We have Now, by Lemma A.8, Putting everything together,

B Dagger of all transformations
Inspired by the results of Lemma 5.2, in sharp theories with purification, we can extend the dagger to all transformations, a feature often present in process theories [66,28,63,29].
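Definition B.1. Given the transformation $\mathcal{A} \in \mathsf{Transf}(A, B)$, its dagger (or adjoint) is a linear transformation $\mathcal{A}^\dagger$ from $B$ to $A$ defined as
$$(\mathcal{A}^\dagger \otimes \mathcal{I}_S)\,\rho := \left(\rho^\dagger(\mathcal{A} \otimes \mathcal{I}_S)\right)^\dagger \qquad (B1)$$
for every system $S$, and every state $\rho \in \mathsf{St}_1(B \otimes S)$.
This definition specifies the dagger of a transformation completely, thanks to Equation (2.2). Note that Lemma 5.2 allows us to formulate Equation (5.4) in terms of effects and their dagger: $(a|b^\dagger) = (b|a^\dagger)$ for all effects $a$ and $b$. In this way, Definition B.1 can be recast in equivalent terms by taking $b$ as the term in round brackets in the RHS of Equation (B1). This yields
$$(E|\mathcal{A}^\dagger \otimes \mathcal{I}_S|\rho) = (\rho^\dagger|\mathcal{A} \otimes \mathcal{I}_S|E^\dagger) \qquad (B2)$$
for every system $S$, every state $\rho \in \mathsf{St}_1(B \otimes S)$, and every effect $E \in \mathsf{Eff}(A \otimes S)$.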
The dagger of a transformation may not be a physical transformation, i.e., it may send physical states to non-physical ones. Indeed, the action of $\mathcal{A}^\dagger \otimes \mathcal{I}$ on a generic state (the LHS of Equation (B1)) is defined as the dagger of an effect. However, not all daggers of effects are physical states. For instance, take the deterministic effect $u$: its dagger is $u^\dagger = \sum_{i=1}^d\alpha_i = d\chi$, which is a supernormalised (and hence non-physical) state.
For channels, we can give a necessary condition for the existence of a physical dagger of the channel.
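Proposition B.2. Let $\mathcal{C} \in \mathsf{Transf}(A, B)$ be a channel. If $\mathcal{C}^\dagger$ is a physical transformation, then $\mathcal{C}$ is unital, and $\mathcal{C}^\dagger$ itself is a unital channel.
Proof. If $\mathcal{C}^\dagger$ is a physical transformation, then, for every normalised state $\rho \in \mathsf{St}_1(B)$, we have $\|\mathcal{C}^\dagger\rho\| \le 1$, or in other words, $(u|\mathcal{C}^\dagger|\rho) \le 1$. By Equation (B2), $(u|\mathcal{C}^\dagger|\rho) = (\rho^\dagger|\mathcal{C}|u^\dagger)$, so, since $u^\dagger = d\chi$, the condition $\|\mathcal{C}^\dagger\rho\| \le 1$ is equivalent to
$$(\rho^\dagger|\mathcal{C}|\chi) \le \frac{1}{d}, \qquad (B3)$$
with equality if and only if $\mathcal{C}^\dagger$ is a channel. Suppose by contradiction that $\mathcal{C}$ is not unital, then $\mathcal{C}\chi = \rho_0 \neq \chi$. Diagonalise $\rho_0$ as $\rho_0 = \sum_{i=1}^d p_i\alpha_i$, where $p_1 \ge p_2 \ge \ldots \ge p_d \ge 0$, and $p_1 > \frac{1}{d}$. Then taking $\rho$ to be $\alpha_1$ in $(\rho^\dagger|\mathcal{C}|\chi)$ yields $p_1$, but $p_1 > \frac{1}{d}$, contradicting Equation (B3). Since $\mathcal{C}$ is unital, we have that
$$(\rho^\dagger|\mathcal{C}|\chi) = (\rho^\dagger|\chi) = \frac{1}{d},$$
showing that $\mathcal{C}^\dagger$ is itself a channel. Let us prove it is unital. The action of $\mathcal{C}^\dagger$ on $\chi$ is defined in Equation (B1), so
$$\mathcal{C}^\dagger\chi = \left(\chi^\dagger\mathcal{C}\right)^\dagger = \left(\tfrac{1}{d}\,u\,\mathcal{C}\right)^\dagger = \left(\tfrac{1}{d}\,u\right)^\dagger = \chi,$$
where we have used the fact that $\mathcal{C}$ is a channel, so $u\mathcal{C} = u$. This proves that $\mathcal{C}^\dagger$ is unital.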
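A qubit example makes the proposition tangible. The sketch below assumes that, in quantum theory, the dagger of a channel with Kraus operators $\{K_k\}$ reduces to the Hilbert-Schmidt adjoint $\rho \mapsto \sum_k K_k^\dagger \rho K_k$; this identification, the choice of the amplitude damping and dephasing channels, and all function names are assumptions made for illustration. Real Kraus operators are used so that the adjoint is a plain transpose.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def channel(kraus, rho):
    """Apply rho -> sum_k K rho K^T (all Kraus operators here are real)."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for k in kraus:
        term = matmul(matmul(k, rho), transpose(k))
        out = [[out[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return out

def adjoint(kraus, rho):
    """Apply the assumed dagger of the channel: rho -> sum_k K^T rho K."""
    return channel([transpose(k) for k in kraus], rho)

gamma, p = 0.5, 0.25
amplitude_damping = [[[1.0, 0.0], [0.0, math.sqrt(1 - gamma)]],
                     [[0.0, math.sqrt(gamma)], [0.0, 0.0]]]       # non-unital
s, t = math.sqrt(1 - p), math.sqrt(p)
dephasing = [[[s, 0.0], [0.0, s]], [[t, 0.0], [0.0, -t]]]          # unital

chi = [[0.5, 0.0], [0.0, 0.5]]       # invariant state
alpha_0 = [[1.0, 0.0], [0.0, 0.0]]   # a pure state

damped_chi = channel(amplitude_damping, chi)                 # differs from chi
dagger_trace_nonunital = trace(adjoint(amplitude_damping, alpha_0))
dagger_trace_unital = trace(adjoint(dephasing, alpha_0))
```

With these parameters the adjoint of amplitude damping sends `alpha_0` to an operator of trace $1 + \gamma = 1.5$, a supernormalised and hence non-physical state, whereas the adjoint of the unital dephasing channel preserves the trace, in line with the proposition.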
We can prove that the dagger of a transformation has some nice properties.  Comparing this with Equation (B4), we get the thesis.
We can give a characterisation of the dagger of reversible channels, which are unital channels. In particular we have that the dagger of the SWAP channel between two systems is the SWAP with the input and output systems reversed. The orthogonal projectors of Section 5.2, on the other hand, are self-adjoint on a single system.
Proposition B.5. Given the orthogonal projector $P_I$ on a face $F_I$, we have $P_I^\dagger \doteq P_I$.
Proof. For every $\rho$ and $E$, we have $(E|P_I^\dagger|\rho) = (\rho^\dagger|P_I|E^\dagger)$. The RHS is $\langle\rho, P_I E^\dagger\rangle$. By the properties of projectors, $\langle\rho, P_I E^\dagger\rangle = \langle P_I\rho, E^\dagger\rangle = \langle E^\dagger, P_I\rho\rangle = (E|P_I|\rho)$.
This shows that $P_I^\dagger \doteq P_I$.
Finally we prove some properties of the dagger with respect to compositions. We need an easy lemma first. This means that the dagger respects the composition of diagrams, and corresponds to the action of flipping a diagram with respect to a vertical axis.