The ABC of Deutsch-Hayden Descriptors

It has been more than 20 years since Deutsch and Hayden proved the locality of quantum theory, using the Heisenberg picture of quantum computational networks. Of course, locality holds even in the face of entanglement and Bell's theorem. Today, most researchers in quantum foundations are still convinced not only that a local description of quantum systems has not yet been provided, but that it cannot exist. The main goal of this paper is to address this misconception by re-explaining the descriptor formalism in a hopefully accessible and self-contained way. It is a step-by-step guide to how and why descriptors work. Finally, superdense coding is revisited in the light of descriptors.


Motivation
It is still a widespread belief that a complete description of a composite entangled quantum system cannot be obtained from descriptions of the parts, if those are expressed independently of what happens to the other parts. This apparently holistic feature of entangled quantum states entails the violation of Bell inequalities [1,2] and quantum teleportation [3], which are repeatedly invoked to sanctify the "non-local" character of quantum theory. But this widespread belief was proven false more than twenty years ago by Deutsch and Hayden [4], who by the same token provided an entirely local explanation of Bell-inequality violations and teleportation.
Descriptions of dynamically isolated (but possibly entangled) systems A and B are local if that of A is unaffected by any process system B may undergo, and vice versa. After Bell, it has become conventional wisdom to equate locality with a possible explanation by a local hidden variable theory. However, local hidden variables are only one way in which locality can be instantiated [5]. Here, locality is taken in its crudest form, the one advocated by Einstein: "the real factual situation of the system $S_2$ is independent of what is done with the system $S_1$, which is spatially separated from the former" [6]. Descriptions of individual systems A and B are complete if, when put together, they can predict the distributions of any measurement performed on the whole system AB.
For instance, if AB is in a pure entangled state $|\Psi\rangle_{AB}$, the reduced density matrices $\rho_A = \mathrm{tr}_B |\Psi\rangle\langle\Psi|$ and $\rho_B = \mathrm{tr}_A |\Psi\rangle\langle\Psi|$ are local but incomplete descriptions. They are local because $\rho_A$ is left unaffected regardless of what happens to system B. They are incomplete because, since $|\Psi\rangle_{AB}$ is entangled, neither it nor its associated density matrix $|\Psi\rangle\langle\Psi|$ can be recovered from $\rho_A$ and $\rho_B$. Some information that could prove crucial for computing the distribution of some joint measurements has been discarded in the tracing out. If instead the descriptions of A and B are both taken to be the global wave function $|\Psi\rangle_{AB}$, then one finds a complete but non-local account. We seem to be stuck in a dichotomy, apparently forced to describe quantum systems either non-locally or incompletely. But the dichotomy is false. Following Gottesman's [7] quantum computation in the Heisenberg picture, Deutsch and Hayden defined so-called descriptors for individual qubits and showed this mode of description to be both local and complete, hence vindicating the locality of quantum theory. In other words, even entangled systems admit a separable description.
When such a bold foundational result collects a mere 190 citations in more than 20 years, it is evidence that a large portion of the quantum foundations community is unaware of the idea, or worse, does not understand it. This is the problem that this paper addresses, and it does so by providing a detailed and self-contained explanation of how and why descriptors work. The paper culminates with the superdense coding protocol being revisited in the established framework. It is aimed at both experts and non-experts in quantum theory. A background in physics is optional; only introductory knowledge of quantum information theory is required.

A Question of Picture
In quantum theory, computations leading to statistics of measurable quantities all take the same form, namely, that of Dirac's celebrated bra-ket notation, $\langle \cdots | \cdots | \cdots \rangle$. Physicists recognize this kind of computation as the expected value of some observable. Quantum information scientists, bear with me for another 10 lines. An observable $O$ is represented by a hermitian operator which admits a spectral decomposition
\[ O = \sum_i \lambda_i \Pi_i , \]
where $\lambda_i \in \mathbb{R}$ is an eigenvalue corresponding to the measurement outcome and $\Pi_i$ is the corresponding projector on the eigensubspace. If the system is in state $|\psi\rangle$, the expected value of such an observable is given by $\langle\psi| O |\psi\rangle$, since
\[ \langle\psi| O |\psi\rangle = \sum_i \lambda_i \langle\psi| \Pi_i |\psi\rangle = \sum_i \lambda_i p_i , \]
where $p_i$ can be thought of as the probability of measuring outcome $\lambda_i$. While this type of computation is routine for physicists, quantum information scientists usually compute probabilities of measurement outcomes. An $n$-qubit network in the state $\sum_{j=0}^{2^n-1} \alpha_j |j\rangle$ has a probability $|\alpha_l|^2$ to return the classical value "$l$". But $|\alpha_l|^2 = \langle\psi|l\rangle\langle l|\psi\rangle$ is nothing but the expectation value of the observable $|l\rangle\langle l|$. Hence, the reader who is unfamiliar with observables can simply keep in mind projectors of the form $|l\rangle\langle l|$ required to compute probabilities, but footnote 1 explains further.
A generic state $|\psi\rangle$ arises from the evolution of an initial state that shall be denoted $|0\rangle$. If $U$ is the unitary operator representing this evolution, $|\psi\rangle = U|0\rangle$, so the computations carried out to predict measurable quantities all have the form
\[ \langle 0 | U^\dagger O U | 0 \rangle . \tag{1} \]
The Schrödinger picture is about viewing the sandwich equation (1) as if the bread evolves and the meat stays constant, namely,
\[ \big( \langle 0 | U^\dagger \big) \, O \, \big( U | 0 \rangle \big) . \]
With such a viewpoint, the initial state $|0\rangle$ evolves to the final state $|\psi\rangle = U|0\rangle$ and the observable $O$ remains constant.
The Heisenberg picture is about regarding the sandwich equation as if the meat evolves but the bread remains constant,
\[ \langle 0 | \, \big( U^\dagger O U \big) \, | 0 \rangle . \]
In this picture, the state vector remains fixed to $|0\rangle$ but the observable $O$ evolves to $U^\dagger O U$. Therefore, in the Heisenberg picture, the term 'state', which refers to a quantity that is fixed to $|0\rangle$, becomes a misnomer. It will thus be called the reference vector. But then, in the Heisenberg picture, can the quantum information of the system at a given time be encoded in a single mathematical object? Yes: it is precisely what the descriptor does.
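The two readings of the sandwich (1) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper `random_unitary`, the two-qubit dimension and the randomly drawn observable are illustrative choices that do not appear in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(dim, rng):
    """Unitary obtained from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    unitary, _ = np.linalg.qr(z)
    return unitary

dim = 4                                    # two qubits, chosen for illustration
U = random_unitary(dim, rng)
ref = np.zeros(dim, dtype=complex)
ref[0] = 1.0                               # reference vector |0>

a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
O = a + a.conj().T                         # an arbitrary hermitian observable

# Schrödinger picture: the bread evolves, the meat stays constant.
psi = U @ ref
schrodinger = np.vdot(psi, O @ psi)

# Heisenberg picture: the meat evolves, the bread stays constant.
O_evolved = U.conj().T @ O @ U
heisenberg = np.vdot(ref, O_evolved @ ref)

print(np.isclose(schrodinger, heisenberg))
```

Both numbers are the same expectation value; only the bracketing of the product differs.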

Tracking Observables
In the Heisenberg picture, a quantum system shall no longer be described by its state vector, but rather by an object that encodes the information about all the evolved observables of the system. This is a tall order since there is an uncountable number of such observables. Things are greatly simplified once it is realized that observables are linear operators and that the latter form a vector space. Since the evolution $O \to U^\dagger O U$ is linear, one does not need to track the evolution of infinitely many observables: tracking a basis of the linear operators suffices. Indeed, if $O = \sum_j a_j B_j$, then $U^\dagger O U = \sum_j a_j U^\dagger B_j U$, so it suffices to track how each operator $B_j$ of the basis evolves by $U$ to then compute how any observable evolves.

The Descriptor of a 1-Qubit Network
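In the case of a single qubit, the Pauli matrices
\[ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \]
together with the identity $\mathbb{1}$, form a basis of the $2 \times 2$ matrices, if the linear combinations are taken over complex numbers. Following the evolution of $\mathbb{1}$ is trivial, $U^\dagger \mathbb{1} U = \mathbb{1}$, so it can be neglected. This means that one only needs to follow the evolution of $\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ to then be able to recover any evolved observable, or the expectation value thereof.
Hence, for a single-qubit quantum network, the descriptor of the qubit at time $t$ is given by $\boldsymbol{q}(t) = U^\dagger \boldsymbol{\sigma} U$, where $U$ is the unitary operator that represents the evolution undergone by the quantum network between time 0 and time $t$.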
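To illustrate how a single-qubit descriptor suffices to recover any evolved observable, the sketch below expands an observable in the Pauli basis and rebuilds its evolved counterpart from the components of the descriptor alone. The function names `pauli_coefficients` and `descriptor`, the rotation used as $U$ and the particular observable are assumptions made for the example.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_coefficients(O):
    """Coefficients (a0, ax, ay, az) such that O = a0*1 + ax*SX + ay*SY + az*SZ."""
    return [np.trace(P @ O) / 2 for P in (I2, SX, SY, SZ)]

def descriptor(U):
    """Descriptor q(t) = U^dagger sigma U of a single qubit, as a tuple (qx, qy, qz)."""
    Ud = U.conj().T
    return tuple(Ud @ P @ U for P in (SX, SY, SZ))

theta = 0.3                                 # arbitrary angle for the example
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]], dtype=complex)

O = np.array([[2, 1 - 1j], [1 + 1j, -1]], dtype=complex)   # a hermitian observable
a0, ax, ay, az = pauli_coefficients(O)
qx, qy, qz = descriptor(U)

# By linearity, the evolved observable is the same combination of evolved basis elements.
from_descriptor = a0 * I2 + ax * qx + ay * qy + az * qz
direct = U.conj().T @ O @ U

print(np.allclose(from_descriptor, direct))
```

The coefficients are computed once, at time 0; all the time dependence sits in the descriptor.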

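Example 1. Consider a quantum circuit in which a single qubit, initialized in $|0\rangle$, undergoes the gate $H$ between time 0 and time 1, where
\[ H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \]
is the Hadamard gate. At time $t = 0$, the descriptor is $\boldsymbol{q}(0) = \boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$, while at time $t = 1$, the descriptor is
\[ \boldsymbol{q}(1) = H^\dagger \boldsymbol{\sigma} H = (\sigma_z, -\sigma_y, \sigma_x) . \tag{2} \]
The Heisenberg picture and the expression for $\boldsymbol{q}(1)$ can be used to compute the probability of measuring the outcome "0". Representing $|0\rangle$ and $|1\rangle$ respectively by $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, the projector reads $|0\rangle\langle 0| = \frac{1}{2}(\mathbb{1} + \sigma_z)$, so
\[ \langle 0 | H^\dagger |0\rangle\langle 0| H | 0 \rangle = \langle 0 | \tfrac{1}{2} \big( \mathbb{1} + q_z(1) \big) | 0 \rangle = \tfrac{1}{2} \langle 0 | ( \mathbb{1} + \sigma_x ) | 0 \rangle = \tfrac{1}{2} . \]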
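Example 1 can be reproduced in a few lines. This is a sketch under the assumption that the circuit consists of a single Hadamard gate acting on a qubit initialized in $|0\rangle$; the variable names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

H = (SX + SZ) / np.sqrt(2)                  # Hadamard gate

# Descriptor at time 1: q(1) = H^dagger sigma H, component by component.
q1 = tuple(H.conj().T @ s @ H for s in (SX, SY, SZ))

# Probability of outcome "0": expectation of the evolved projector (1 + q_z(1)) / 2
# in the fixed reference vector |0>.
ref = np.array([1, 0], dtype=complex)
evolved_projector = (I2 + q1[2]) / 2
p0 = np.vdot(ref, evolved_projector @ ref).real

print(p0)
```

The reference vector never changes; the Hadamard gate only swaps the x and z components of the descriptor and flips the sign of the y component.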

Descriptors of an n-Qubit Network
Consider now, and for the rest of the paper, the case of $n$ interacting qubits in a quantum computational network. Suppose that the qubits are initialized at time 0 in the state $|0\rangle^{\otimes n}$, which, when more conveniently denoted $|0\rangle$, corresponds to the Heisenberg reference vector so far invoked. Although this network seems like a restricted system, its ability to simulate any other quantum system to arbitrary accuracy [8] makes it completely general. Moreover, no generality is lost by assuming that each gate in the network requires exactly one unit of time, so that the state of the network need only be specified at integer values of time. Let again $U$ be the unitary operator representing the evolution of the network between time 0 and time $t$.
A natural basis of the space of all operators on $n$ qubits is the product of Pauli operators, namely,
\[ B = \{ \sigma_{\mu_1} \otimes \sigma_{\mu_2} \otimes \cdots \otimes \sigma_{\mu_n} : \mu_i \in \{0, x, y, z\} \} , \qquad \sigma_0 = \mathbb{1} . \]
There are $4^n$ such matrices, and they are linearly independent, so, indeed, they form a basis of the $2^n \times 2^n = 4^n$-dimensional complex vector space of linear operators on $n$ qubits.
This means that if one knows how each observable of the basis $B$ evolves by the action of $U$, then one knows, by linearity, how each observable evolves.

The Main Simplification
A great simplification is to track the evolution of only the set of observables
\[ \boldsymbol{q}_i(0) = \mathbb{1}^{i-1} \otimes \boldsymbol{\sigma} \otimes \mathbb{1}^{n-i} , \qquad i = 1, \dots, n , \tag{3} \]
where $\mathbb{1}^k$ stands for the tensor product of $k$ copies of the identity. Note that for each $i$, $\boldsymbol{q}_i(0)$ has 3 components, each of them being an operator acting on the whole Hilbert space. The $n$-tuple whose components are the $\boldsymbol{q}_i(0)$ is denoted $\boldsymbol{q}(0)$. Bold quantities are vectors, so for instance one writes $\boldsymbol{q}_i(0)$, but $q_{ix}(0)$. This $\boldsymbol{q}_i(0)$ is the descriptor of qubit $i$ at time 0. The descriptor at time $t$ is then given by
\[ \boldsymbol{q}_i(t) = U^\dagger \boldsymbol{q}_i(0) U , \tag{4} \]
where the conjugation by $U$ acts on each component. Importantly, note that $\boldsymbol{q}(0)$ contains many fewer components than $B$ contains elements. In fact, instead of tracking the $4^n$ operators of $B$, only $3n$ are suggested here. The reason is that these $3n$ operators can be multiplied to generate any of the $4^n$ basis operators. Moreover, this multiplicative structure is preserved by the evolution $U$, namely, if an observable is generated multiplicatively by $q_{iw}(0) q_{jw'}(0)$, then the evolved observable is given by
\[ U^\dagger q_{iw}(0) q_{jw'}(0) U = U^\dagger q_{iw}(0) U \, U^\dagger q_{jw'}(0) U = q_{iw}(t) q_{jw'}(t) . \]
This observation obviously extends to larger products, as well as to sums of products of components of $\boldsymbol{q}(0)$.
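The construction of the initial descriptors and the preservation of products under evolution can be sketched as follows. The helpers `embed`, `initial_descriptors`, `evolve` and `random_unitary` are hypothetical names introduced for this illustration, and qubits are labelled 1 to $n$ as in the text.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def embed(op, i, n):
    """Operator acting as `op` on qubit i (1-based) and as the identity elsewhere."""
    return reduce(np.kron, [op if k == i else I2 for k in range(1, n + 1)])

def initial_descriptors(n):
    """q[i][w] is the component w in 'xyz' of the descriptor of qubit i at time 0."""
    return {i: {w: embed(P, i, n) for w, P in zip("xyz", (SX, SY, SZ))}
            for i in range(1, n + 1)}

def evolve(q, U):
    """Heisenberg evolution of every component: q -> U^dagger q U."""
    Ud = U.conj().T
    return {i: {w: Ud @ op @ U for w, op in comps.items()} for i, comps in q.items()}

def random_unitary(dim, rng):
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    unitary, _ = np.linalg.qr(z)
    return unitary

n = 3
q0 = initial_descriptors(n)

# The basis element sigma_x (x) 1 (x) sigma_z is a product of two descriptor components.
basis_element = reduce(np.kron, [SX, I2, SZ])
generated = q0[1]["x"] @ q0[3]["z"]

# The same product, taken at time t, gives the evolved basis element.
U = random_unitary(2 ** n, np.random.default_rng(1))
qt = evolve(q0, U)
evolved_directly = U.conj().T @ basis_element @ U
evolved_from_descriptors = qt[1]["x"] @ qt[3]["z"]

print(np.allclose(generated, basis_element),
      np.allclose(evolved_directly, evolved_from_descriptors))
```

For $n = 3$, this tracks 9 operators instead of the 64 elements of $B$.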

Example 2.
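Considering a 2-qubit network, the observable $|01\rangle\langle 01|$ can be expanded in the basis $B = \{\sigma_\mu \otimes \sigma_\nu : \mu, \nu \in \{0, x, y, z\}\}$, and then expressed in terms of $\boldsymbol{q}_1(0)$ and $\boldsymbol{q}_2(0)$. Indeed,
\[ |01\rangle\langle 01| = \tfrac{1}{2}(\mathbb{1} + \sigma_z) \otimes \tfrac{1}{2}(\mathbb{1} - \sigma_z) = \tfrac{1}{4} \big( \mathbb{1} \otimes \mathbb{1} - \mathbb{1} \otimes \sigma_z + \sigma_z \otimes \mathbb{1} - \sigma_z \otimes \sigma_z \big) = \tfrac{1}{4} \big( \mathbb{1} + q_{1z}(0) \big) \big( \mathbb{1} - q_{2z}(0) \big) . \]
This can then be used to express in terms of $\boldsymbol{q}(t)$ the time-evolved counterpart of the observable, $U^\dagger |01\rangle\langle 01| U$, under an evolution $U$ between time 0 and $t$:
\[ U^\dagger |01\rangle\langle 01| U = \tfrac{1}{4} \big( \mathbb{1} + q_{1z}(t) \big) \big( \mathbb{1} - q_{2z}(t) \big) . \]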

The Algebra of Descriptors
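The addition and multiplication of components of descriptors endow them with an algebraic structure.
Remark 1. The operators of $\boldsymbol{q}(0)$ satisfy the $su(2)^{\otimes n}$ algebra, namely
\[ [q_{iw}(0), q_{jw'}(0)] = 0 \quad (i \neq j, \ \forall\, w, w') , \]
\[ q_{ix}(0) q_{iy}(0) = i \, q_{iz}(0) \quad \text{(and cyclic permutations)} , \]
\[ q_{iw}(0)^2 = \mathbb{1} \quad (\forall\, w) . \]
In the first line, the bracket denotes the commutator, $[A, B] = AB - BA$. The above algebraic relations follow from those of the Pauli matrices and from the factorized form of the descriptors at time 0, displayed in equation (3). After evolving by $U$, the descriptors $\boldsymbol{q}_i(t)$ shall in general lose their direct connection with Pauli matrices, as well as their factorized form, but still, they preserve their algebraic relations.
Remark 2. For any $t$, $\boldsymbol{q}(t)$ satisfies the $su(2)^{\otimes n}$ algebra:
\[ [q_{iw}(t), q_{jw'}(t)] = 0 \quad (i \neq j, \ \forall\, w, w') , \]
\[ q_{ix}(t) q_{iy}(t) = i \, q_{iz}(t) \quad \text{(and cyclic permutations)} , \]
\[ q_{iw}(t)^2 = \mathbb{1} \quad (\forall\, w) . \]
This holds because conjugation by $U$ preserves sums and products; for instance, $q_{ix}(t) q_{iy}(t) = U^\dagger q_{ix}(0) U \, U^\dagger q_{iy}(0) U = U^\dagger \, i q_{iz}(0) \, U = i \, q_{iz}(t)$.
One might object that unitary evolution is but a special case of a larger class of processes represented by completely positive and trace-preserving maps. Such processes include for instance noisy channels or maps that do not preserve the dimensionality of the system (and hence do not preserve the system's algebra). These processes are, however, themselves a special case of unitary evolution. Not only can they be mathematically understood, by the Stinespring dilation theorem, as sub-processes of a larger unitary evolution, but physically they are such sub-processes. Real quantum processes are unitary evolutions.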
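The preservation of the algebra can be checked numerically on a small network. The sketch below assumes a 2-qubit network and a randomly drawn unitary; `satisfies_algebra` and the other helper names are illustrative and not taken from the text.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def embed(op, i, n):
    """Operator acting as `op` on qubit i (1-based) and as the identity elsewhere."""
    return reduce(np.kron, [op if k == i else I2 for k in range(1, n + 1)])

def satisfies_algebra(q):
    """Check squares, cyclic products and cross-qubit commutation of a descriptor set."""
    dim = next(iter(q.values()))["x"].shape[0]
    one = np.eye(dim)
    for i, qi in q.items():
        x, y, z = qi["x"], qi["y"], qi["z"]
        squares = all(np.allclose(a @ a, one) for a in (x, y, z))
        cyclic = (np.allclose(x @ y, 1j * z) and np.allclose(y @ z, 1j * x)
                  and np.allclose(z @ x, 1j * y))
        commute = all(np.allclose(a @ b - b @ a, 0)
                      for j, qj in q.items() if j != i
                      for a in qi.values() for b in qj.values())
        if not (squares and cyclic and commute):
            return False
    return True

n = 2
q0 = {i: {w: embed(P, i, n) for w, P in zip("xyz", (SX, SY, SZ))} for i in (1, 2)}

rng = np.random.default_rng(2)
g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
U, _ = np.linalg.qr(g)                       # a generic entangling evolution
qt = {i: {w: U.conj().T @ op @ U for w, op in c.items()} for i, c in q0.items()}

print(satisfies_algebra(q0), satisfies_algebra(qt))
```

After a generic entangling evolution the components are no longer of the factorized form (3), yet the same relations hold.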

One More Simplification
Following Gottesman [7], the generating tuple $\boldsymbol{q}(0)$ can be reduced to $2n$ elements by noticing a redundancy due to the $su(2)^{\otimes n}$ algebra. In fact, for any $i$, only two of the triplet of operators $(q_{ix}(0), q_{iy}(0), q_{iz}(0))$ are required, since the omitted operator can be recovered by the product of the selected two. In what follows, the notation will not be modified, but one will happily use this shortcut to avoid tracking the observables $q_{iy}(t)$, keeping in mind that $q_{iy}(t) = i \, q_{ix}(t) q_{iz}(t) = -i \, q_{iz}(t) q_{ix}(t)$, as follows from $q_{iz} q_{ix} = i q_{iy}$.
Summing this up, the Heisenberg picture is about tracking the evolution $O \to U^\dagger O U$ of uncountably many initial observables $O$. This can be done by instead tracking the evolution $\boldsymbol{q}(0) \to \boldsymbol{q}(t) = U^\dagger \boldsymbol{q}(0) U$ of only $2n$ observables ($q_{iy}$ is omitted). In fact, $\boldsymbol{q}(t)$ allows one to infer, by multiplication, the evolution of the $4^n$ observables of $B$, which in turn allow one to infer, by linearity, the evolution of any observable.
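The claim that $2n$ generators suffice can be illustrated for $n = 2$: the sketch below rebuilds all 16 elements of $B$ from $q_{1x}, q_{1z}, q_{2x}, q_{2z}$ alone, using $q_{iy} = i\, q_{ix} q_{iz}$. The names `gens` and `factor` are hypothetical helpers introduced for the example.

```python
import numpy as np
from functools import reduce
from itertools import product

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
PAULI = {"0": I2, "x": SX, "y": SY, "z": SZ}

def embed(op, i, n):
    """Operator acting as `op` on qubit i (1-based) and as the identity elsewhere."""
    return reduce(np.kron, [op if k == i else I2 for k in range(1, n + 1)])

n = 2
dim = 2 ** n
# Only the x and z components are stored: 2n generators in total.
gens = {i: {"x": embed(SX, i, n), "z": embed(SZ, i, n)} for i in (1, 2)}

def factor(i, mu):
    """Single-qubit factor sigma_mu on qubit i, built from the two stored generators."""
    if mu == "0":
        return np.eye(dim, dtype=complex)
    if mu == "y":
        return 1j * gens[i]["x"] @ gens[i]["z"]   # q_iy = i q_ix q_iz
    return gens[i][mu]

all_match = all(
    np.allclose(factor(1, mu) @ factor(2, nu), np.kron(PAULI[mu], PAULI[nu]))
    for mu, nu in product("0xyz", repeat=2)
)
print(all_match)
```

Because conjugation by $U$ preserves products, the same construction applied to the generators at time $t$ yields the evolved basis.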

Evolution from the Future?!
Although $\boldsymbol{q}(0) \to \boldsymbol{q}(t) = U^\dagger \boldsymbol{q}(0) U$ looks like a completely fine way in which observables should evolve, when $U$ is broken down into different gates, for instance $U = G_t \cdots G_2 G_1$, one finds that the observables of the descriptors evolve in the wrong order! In fact, the order in which the gates are applied is first $G_1$, then $G_2$, and so on, until the last gate $G_t$ is applied. However, the descriptors evolve as
\[ \boldsymbol{q}(t) = U^\dagger \boldsymbol{q}(0) U = G_1^\dagger G_2^\dagger \cdots G_t^\dagger \, \boldsymbol{q}(0) \, G_t \cdots G_2 G_1 . \tag{5} \]
The evolution of observables appears to occur from the last gate of the network to the first, which is inconvenient, since the network needs to be fully specified before one can start to compute anything. Much worse, it does not reflect the actual dynamics that the system is undergoing, so this kind of evolution from the future simply cannot be the right explanation. The way out of this conundrum is to notice that inasmuch as observables are linear operators generated by some set $\boldsymbol{q}(0)$ of operators, the evolution operators (the gates) are too. They are generated multiplicatively and additively by the same set $\boldsymbol{q}(0)$, since questions of hermiticity versus unitarity do not arise.

The Functional Representation of a Gate
For a fixed gate with matrix representation $G$, its multiplicative and additive generation by $\boldsymbol{q}(0)$ defines a function $U_G(\cdot)$ through $G = U_G(\boldsymbol{q}(0))$.
The function $U_G(\cdot)$ takes values in unitary operators and will be referred to as the functional representation of the gate $G$. Its functional form encodes the multiplicative and linear generation of $G$ by the elements of $\boldsymbol{q}(0)$. In other words, any matrix $G$ can be expressed as a polynomial in the $2n$ matrices $q_{1x}(0), q_{1z}(0), \dots, q_{nz}(0)$, and $U_G(\cdot)$ is one such polynomial. Now, when $\boldsymbol{q}(t)$ varies with $t$, the matrix representation $U_G(\boldsymbol{q}(t))$ varies accordingly, but as we shall see in the next section, it is the fixed functional form of $U_G$ that plays a central algebraic role when performing computations in the Heisenberg picture.

Example 3.
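In the case of a single-qubit network, the negation and Hadamard gates are described by
\[ N = \sigma_x \qquad \text{and} \qquad H = \frac{1}{\sqrt{2}} (\sigma_x + \sigma_z) , \]
so their functional representations are
\[ U_N(\boldsymbol{q}) = q_x \qquad \text{and} \qquad U_H(\boldsymbol{q}) = \frac{1}{\sqrt{2}} (q_x + q_z) . \]
The counterclockwise rotation of a state vector in the plane spanned by $|0\rangle$ and $|1\rangle$ is described by
\[ R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = \cos\theta \, \mathbb{1} - i \sin\theta \, \sigma_y = \cos\theta \, \mathbb{1} + \sin\theta \, \sigma_x \sigma_z , \]
which defines its functional representation $U_{R_\theta}(\cdot)$.
In the case of an $n$-qubit network, if such a unary gate, say $H$, is applied on qubit $i$, while all other qubits are left invariant, then the matrix representation of the corresponding evolution operator is
\[ H_i = \mathbb{1}^{i-1} \otimes H \otimes \mathbb{1}^{n-i} = \frac{1}{\sqrt{2}} \big( q_{ix}(0) + q_{iz}(0) \big) = U_H(\boldsymbol{q}_i(0)) . \]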
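As a sketch of what a functional representation looks like in practice, the polynomials of Example 3 can be written as ordinary functions of the descriptor components and evaluated at time 0 to recover the familiar matrices. The Python names `U_N`, `U_H` and `U_R` mirror the notation of the text but are otherwise assumptions of this illustration, as are the angle and the choice $n = 3$, $i = 2$.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def embed(op, i, n):
    """Operator acting as `op` on qubit i (1-based) and as the identity elsewhere."""
    return reduce(np.kron, [op if k == i else I2 for k in range(1, n + 1)])

# Functional representations: polynomials in the x and z components of a descriptor.
def U_N(q):
    return q["x"]

def U_H(q):
    return (q["x"] + q["z"]) / np.sqrt(2)

def U_R(q, theta):
    one = np.eye(q["x"].shape[0], dtype=complex)
    return np.cos(theta) * one + np.sin(theta) * (q["x"] @ q["z"])

# Evaluated on a single qubit at time 0, they return the usual gate matrices.
single = {"x": SX, "z": SZ}
hadamard_matrix = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
theta = 0.7
rotation_matrix = np.array([[np.cos(theta), -np.sin(theta)],
                            [np.sin(theta),  np.cos(theta)]], dtype=complex)

# Evaluated on qubit i of an n-qubit network, U_H gives 1 (x) H (x) 1.
n, i = 3, 2
q_i = {"x": embed(SX, i, n), "z": embed(SZ, i, n)}
H_i = U_H(q_i)

print(np.allclose(U_H(single), hadamard_matrix),
      np.allclose(U_R(single, theta), rotation_matrix),
      np.allclose(H_i, embed(hadamard_matrix, i, n)))
```

The same function applied to descriptors at a later time returns a different matrix, which is the point exploited in the next section.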

Back in order!
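The apparently reverse-ordered evolution of equation (5) can then be transformed back into the right order. Denoting $V = G_{t-1} \cdots G_2 G_1$, one finds
\[ U^\dagger \boldsymbol{q}(0) U = V^\dagger G_t^\dagger \, \boldsymbol{q}(0) \, G_t V \]
\[ = V^\dagger \, U_{G_t}^\dagger(\boldsymbol{q}(0)) \, \boldsymbol{q}(0) \, U_{G_t}(\boldsymbol{q}(0)) \, V \]
\[ = U_{G_t}^\dagger(V^\dagger \boldsymbol{q}(0) V) \; V^\dagger \boldsymbol{q}(0) V \; U_{G_t}(V^\dagger \boldsymbol{q}(0) V) \]
\[ = U_{G_t}^\dagger(\boldsymbol{q}(t-1)) \; V^\dagger \boldsymbol{q}(0) V \; U_{G_t}(\boldsymbol{q}(t-1)) . \]
In the second-to-last line, the function $U_{G_t}$ (and its hermitian conjugate) is applied to the components of $\boldsymbol{q}(0)$ that are sandwiched by $V^\dagger$ and $V$. The equality holds because if $U_{G_t}$ contains products of components of $\boldsymbol{q}(0)$, the inner $V^\dagger$ and $V$ in the expansion of $U_{G_t}(V^\dagger \boldsymbol{q}(0) V)$ shall cancel out, leaving only the outer ones, which can then be factored out to retrieve the line before.
At this stage, the computation can be continued in two different ways. First, remembering that $V = G_{t-1} \cdots G_2 G_1$, the argument can be iterated on both sides of the equation. This makes explicit that the problem of the order in which the observables evolve in the Heisenberg picture is solved by introducing the functional representation of the gates. Indeed, evolving the observables by the matrix representation of the gates acting in the wrong order, $G_1^\dagger G_2^\dagger \cdots G_t^\dagger \, \boldsymbol{q}(0) \, G_t \cdots G_2 G_1$, is equivalent to the right ordering of the functional representation of the gates evaluated at the corresponding times, i.e.,
\[ G_t \cdots G_2 G_1 = U_{G_1}(\boldsymbol{q}(0)) \, U_{G_2}(\boldsymbol{q}(1)) \cdots U_{G_t}(\boldsymbol{q}(t-1)) . \]
Another way to continue the previous calculation is to invoke equation (4) on both sides of the equation to find
\[ \boldsymbol{q}(t) = U_{G_t}^\dagger(\boldsymbol{q}(t-1)) \; \boldsymbol{q}(t-1) \; U_{G_t}(\boldsymbol{q}(t-1)) . \tag{6} \]
This is the way in which descriptors are prescribed to evolve in Ref. [4]. It is in fact correct and equivalent to equation (4), although the equivalence is not trivially recognized.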
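The equivalence between the reversed matrix ordering and the forward functional ordering can be tested on a small network. The sketch below assumes a hypothetical 2-qubit network made of a Hadamard on qubit 1, a Cnot from qubit 1 to qubit 2 and a Hadamard on qubit 2; the Cnot polynomial is the standard expansion $\frac{1}{2}(\mathbb{1} + q_{cz} + q_{tx} - q_{cz} q_{tx})$, and all helper names are illustrative.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
ONE = np.eye(4, dtype=complex)

def embed(op, i, n=2):
    """Operator acting as `op` on qubit i (1-based) and as the identity elsewhere."""
    return reduce(np.kron, [op if k == i else I2 for k in range(1, n + 1)])

# Functional representations, as functions of the whole descriptor tuple q.
def hadamard(i):
    return lambda q: (q[i]["x"] + q[i]["z"]) / np.sqrt(2)

def cnot(c, t):
    return lambda q: (ONE + q[c]["z"] + q[t]["x"] - q[c]["z"] @ q[t]["x"]) / 2

def conjugate(q, G):
    """Apply q -> G^dagger q G to every component."""
    Gd = G.conj().T
    return {i: {w: Gd @ op @ G for w, op in c.items()} for i, c in q.items()}

q0 = {i: {"x": embed(SX, i), "z": embed(SZ, i)} for i in (1, 2)}
network = [hadamard(1), cnot(1, 2), hadamard(2)]     # G_1, G_2, G_3

# Matrix representations G_k = U_{G_k}(q(0)), composed as U = G_3 G_2 G_1.
U = ONE
for f in network:
    U = f(q0) @ U
q_all_at_once = conjugate(q0, U)                     # equations (4) and (5)

# Step-by-step evolution with functional representations evaluated at q(t-1).
q = q0
forward_product = ONE
for f in network:
    G = f(q)                                         # U_{G_t}(q(t-1))
    forward_product = forward_product @ G            # U_{G_1}(q(0)) U_{G_2}(q(1)) ...
    q = conjugate(q, G)                              # equation (6)

print(np.allclose(forward_product, U))
```

In the step-by-step loop, each gate is processed in the order in which it is applied, and nothing about later gates is needed to compute the descriptors at a given time.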

The Action on Descriptors
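Evolving the descriptor in a step-by-step fashion, as prescribed by equation (6), allows one to find out how a specific gate affects the different descriptors, i.e., the action of the gate on the descriptors. A gate $G_t$ transforms the $2n$ components of $\boldsymbol{q}(t-1)$ in the following way:
\[ \boldsymbol{q}(t) = U_{G_t}^\dagger(\boldsymbol{q}(t-1)) \; \boldsymbol{q}(t-1) \; U_{G_t}(\boldsymbol{q}(t-1)) . \]
Leveraging the fact that the descriptors at time $t-1$ satisfy the $su(2)^{\otimes n}$ algebra (cf. Remark 2), the functional representation $U_{G_t}(\boldsymbol{q}(t-1))$ can be expanded, and the algebraic relations of the many components of $\boldsymbol{q}(t-1)$ that crop up can be used to simplify the expression. As shall be seen, the locality of the applied gate renders most of those $2n$ computations trivial.
Consider for instance a Hadamard gate $H_i$ applied on qubit $i$ between times $t-1$ and $t$. Recalling that $U_{H_i}(\boldsymbol{q}) = \frac{1}{\sqrt{2}}(q_{ix} + q_{iz})$, the action on descriptor $\boldsymbol{q}_i$ is then
\[ \big( q_{ix}(t), q_{iz}(t) \big) = U_{H_i}^\dagger(\boldsymbol{q}(t-1)) \, \big( q_{ix}(t-1), q_{iz}(t-1) \big) \, U_{H_i}(\boldsymbol{q}(t-1)) \]
\[ = \tfrac{1}{2} \big( q_{ix}(t-1) + q_{iz}(t-1) \big) \big( q_{ix}(t-1), q_{iz}(t-1) \big) \big( q_{ix}(t-1) + q_{iz}(t-1) \big) \]
\[ = \tfrac{1}{2} \big( q_{ix} + 2 q_{iz} + q_{iz} q_{ix} q_{iz} , \ q_{iz} + 2 q_{ix} + q_{ix} q_{iz} q_{ix} \big) \]
\[ = ( q_{iz} , q_{ix} ) . \]
When the context does not require them, the time labels can be omitted, as here, where from the third line onwards the "$(t-1)$" has been discarded. One can then simply denote the action of the gate on the descriptors as $H_i : (q_{ix}, q_{iz}) \to (q_{iz}, q_{ix})$ without insisting on the time labels, since the calculation relies only on the time-independent algebra of descriptors. Notice that the result is analogous to what has been computed in Example 1, equation (2), but here, no matrix multiplication was involved, only the algebra of descriptors. More specifically, the properties $q_{iw}^2 = \mathbb{1}$ and $q_{iz} q_{ix} = i q_{iy} = -q_{ix} q_{iz}$ have been used. How about the action of $H_i$ on all other $\boldsymbol{q}_j$, with $j \neq i$? Since $U_{H_i}(\boldsymbol{q})$ depends only on $q_{ix}$ and $q_{iz}$ (time labels removed), it commutes with $\boldsymbol{q}_j$, leaving it invariant,
\[ \boldsymbol{q}_j(t) = U_{H_i}^\dagger(\boldsymbol{q}) \, \boldsymbol{q}_j \, U_{H_i}(\boldsymbol{q}) = \boldsymbol{q}_j \, U_{H_i}^\dagger(\boldsymbol{q}) U_{H_i}(\boldsymbol{q}) = \boldsymbol{q}_j(t-1) . \]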
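The action $H_i : (q_{ix}, q_{iz}) \to (q_{iz}, q_{ix})$ holds at any time, not only at time 0. The following sketch checks this on a 2-qubit network whose descriptors at time $t-1$ result from a randomly drawn evolution $V$; the random evolution and the helper names are assumptions of the example.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def embed(op, i, n=2):
    """Operator acting as `op` on qubit i (1-based) and as the identity elsewhere."""
    return reduce(np.kron, [op if k == i else I2 for k in range(1, n + 1)])

def conjugate(q, G):
    """Apply q -> G^dagger q G to every component."""
    Gd = G.conj().T
    return {i: {w: Gd @ op @ G for w, op in c.items()} for i, c in q.items()}

# Descriptors at time t-1 after an arbitrary (generally entangling) evolution V.
rng = np.random.default_rng(3)
g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
V, _ = np.linalg.qr(g)
q0 = {i: {"x": embed(SX, i), "z": embed(SZ, i)} for i in (1, 2)}
q_prev = conjugate(q0, V)

# Hadamard on qubit 1: functional representation evaluated at q(t-1).
G = (q_prev[1]["x"] + q_prev[1]["z"]) / np.sqrt(2)
q_next = conjugate(q_prev, G)

swapped = (np.allclose(q_next[1]["x"], q_prev[1]["z"])
           and np.allclose(q_next[1]["z"], q_prev[1]["x"]))
other_untouched = all(np.allclose(q_next[2][w], q_prev[2][w]) for w in "xz")
print(swapped, other_untouched)
```

The descriptor of qubit 2 is unchanged even though, after $V$, its components are no longer of the factorized form of equation (3).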

Locality and Completeness
The fact that $U_{H_i}$ depends only on $\boldsymbol{q}_i$, and so leaves invariant the descriptor of all qubits but qubit $i$, is precisely due to the fact that $H_i$ is a gate that acts only on qubit $i$. More generally, if the gate $G_t$ acts only on qubits of the subset $I \subseteq \{1, 2, \dots, n\}$, then its functional representation $U_{G_t}$ shall only depend on components of $\boldsymbol{q}_k(t-1)$, for $k \in I$. For $j \notin I$, the descriptor $\boldsymbol{q}_j(t-1)$ shall then commute with $U_{G_t}(\boldsymbol{q}(t-1))$, so it will remain unchanged between times $t-1$ and $t$. Hence, anything that is done to any system that does not concern qubit $j$ leaves its descriptor invariant; that is, the descriptors are a local description of quantum systems.
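The descriptors are also complete, in that the expectation value of any time-evolved observable $U^\dagger O U$ that concerns only qubits of $I$ can be determined by the descriptors $\boldsymbol{q}_k(t)$, with $k \in I$. This can be seen more clearly at time 0, where an observable $O$ on the qubits of $I$ is a linear (hermitian) operator that acts non-trivially only on the qubits of $I$. Any such operator can be generated additively and multiplicatively by the components of $\boldsymbol{q}_k(0)$, with $k \in I$, thereby defining a polynomial $f_O(\cdot)$ for which
\[ O = f_O\big( \{ \boldsymbol{q}_k(0) \}_{k \in I} \big) , \]
and so
\[ \langle 0 | U^\dagger O U | 0 \rangle = \langle 0 | f_O\big( \{ \boldsymbol{q}_k(t) \}_{k \in I} \big) | 0 \rangle . \]
Example 5. Determine the action of $N = \sigma_x$ and of $\sigma_z$ on the descriptor of the qubit that is acted upon.
Since $U_N(\boldsymbol{q}) = q_x$,
\[ \big( q_x(t), q_z(t) \big) = q_x(t-1) \, \big( q_x(t-1), q_z(t-1) \big) \, q_x(t-1) = \big( q_x(t-1), -q_z(t-1) \big) . \]
Similarly, and with a lighter, time-independent notation,
\[ \sigma_z : (q_x, q_z) \to (q_z q_x q_z, \ q_z q_z q_z) = (-q_x, q_z) . \]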
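The two actions of Example 5 can be verified in the same manner. The sketch below assumes a 2-qubit network in which the gate acts on qubit 1 after a randomly drawn evolution; the functional representations $q_x$ for $N$ and $q_z$ for $\sigma_z$ are evaluated at the descriptors of time $t-1$, and the helper names are illustrative.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def embed(op, i, n=2):
    """Operator acting as `op` on qubit i (1-based) and as the identity elsewhere."""
    return reduce(np.kron, [op if k == i else I2 for k in range(1, n + 1)])

def conjugate(q, G):
    """Apply q -> G^dagger q G to every component."""
    Gd = G.conj().T
    return {i: {w: Gd @ op @ G for w, op in c.items()} for i, c in q.items()}

rng = np.random.default_rng(4)
g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
V, _ = np.linalg.qr(g)
q0 = {i: {"x": embed(SX, i), "z": embed(SZ, i)} for i in (1, 2)}
q_prev = conjugate(q0, V)                    # descriptors at time t-1

after_N = conjugate(q_prev, q_prev[1]["x"])  # N on qubit 1: U_N(q) = q_1x
after_Z = conjugate(q_prev, q_prev[1]["z"])  # sigma_z on qubit 1: U(q) = q_1z

n_action = (np.allclose(after_N[1]["x"], q_prev[1]["x"])
            and np.allclose(after_N[1]["z"], -q_prev[1]["z"]))
z_action = (np.allclose(after_Z[1]["x"], -q_prev[1]["x"])
            and np.allclose(after_Z[1]["z"], q_prev[1]["z"]))
print(n_action, z_action)
```

Each gate flips the sign of exactly one of the two tracked components and leaves the descriptor of qubit 2 untouched.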

The Cnot
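The controlled-not gate, denoted Cnot, is a two-qubit gate of great importance. Not only does it represent a perfect measurement, but when the Cnot is supplemented by arbitrary unary gates, it forms a universal gate set. This means that any unitary transformation can be realized by a circuit with gates chosen solely among this set.
Consider a Cnot gate where the qubit $c$ controls the target qubit $t$. Restricting to the subspace acted upon, the linear transformation is represented by
\[ \mathrm{Cnot} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} = |0\rangle\langle 0| \otimes \mathbb{1} + |1\rangle\langle 1| \otimes \sigma_x = \tfrac{1}{2} \big( \mathbb{1} \otimes \mathbb{1} + \sigma_z \otimes \mathbb{1} + \mathbb{1} \otimes \sigma_x - \sigma_z \otimes \sigma_x \big) . \]
The functional representation of Cnot ($c$ controls $t$) is thus given by
\[ U_{\mathrm{Cnot}}(\boldsymbol{q}(t)) = \tfrac{1}{2} \big( \mathbb{1} + q_{cz}(t) + q_{tx}(t) - q_{cz}(t) q_{tx}(t) \big) . \]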
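The action of the Cnot on the descriptors that it affects can be found to be
\[ \mathrm{Cnot} : \ \begin{pmatrix} (q_{cx}, q_{cz}) \\ (q_{tx}, q_{tz}) \end{pmatrix} \to \begin{pmatrix} (q_{cx} q_{tx}, \ q_{cz}) \\ (q_{tx}, \ q_{cz} q_{tz}) \end{pmatrix} . \]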
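For example, the calculation of $q_{cx}(t)$ is done below.
\[ q_{cx}(t) = U_{\mathrm{Cnot}}^\dagger(\boldsymbol{q}) \, q_{cx} \, U_{\mathrm{Cnot}}(\boldsymbol{q}) = \tfrac{1}{4} \big( \mathbb{1} + q_{cz} + q_{tx} - q_{cz} q_{tx} \big) \, q_{cx} \, \big( \mathbb{1} + q_{cz} + q_{tx} - q_{cz} q_{tx} \big) \]
\[ = \tfrac{1}{4} \big( q_{cx} + q_{cx} q_{cz} + q_{cx} q_{tx} - q_{cx} q_{cz} q_{tx} \]
\[ \quad + q_{cz} q_{cx} + q_{cz} q_{cx} q_{cz} + q_{cz} q_{cx} q_{tx} - q_{cz} q_{cx} q_{cz} q_{tx} \]
\[ \quad + q_{tx} q_{cx} + q_{tx} q_{cx} q_{cz} + q_{tx} q_{cx} q_{tx} - q_{tx} q_{cx} q_{cz} q_{tx} \]
\[ \quad - q_{cz} q_{tx} q_{cx} - q_{cz} q_{tx} q_{cx} q_{cz} - q_{cz} q_{tx} q_{cx} q_{tx} + q_{cz} q_{tx} q_{cx} q_{cz} q_{tx} \big) \]
\[ = q_{cx} q_{tx} , \]
where the dependency on $t-1$ has again been discarded. The last equality uses the fact that $q_{cx}$ anticommutes with $q_{cz}$ and commutes with $q_{tx}$: the terms proportional to $q_{cx}$, to $q_{cx} q_{cz}$ and to $q_{cx} q_{cz} q_{tx}$ cancel in pairs, while four terms equal to $q_{cx} q_{tx}$ remain.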
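The action of a gate on a descriptor can also be found directly from the matrix representation of the gate, without the detour through its functional representation and the gymnastics of the $su(2)^{\otimes n}$ algebra. Let us exemplify the method with the case of the Cnot, which, in this case, consists of calculating
\[ \mathrm{Cnot}^\dagger \, \big( \sigma_x \otimes \mathbb{1} , \ \sigma_z \otimes \mathbb{1} , \ \mathbb{1} \otimes \sigma_x , \ \mathbb{1} \otimes \sigma_z \big) \, \mathrm{Cnot} . \]
For the $q_{cx}$ element, this yields
\[ \mathrm{Cnot}^\dagger \, ( \sigma_x \otimes \mathbb{1} ) \, \mathrm{Cnot} = \sigma_x \otimes \sigma_x = ( \sigma_x \otimes \mathbb{1} ) ( \mathbb{1} \otimes \sigma_x ) , \]
consistently with the previous approach. But why does this work? In fact, what has been computed is
\[ q_{cx}(1) = U_{\mathrm{Cnot}}^\dagger(\boldsymbol{q}(0)) \, q_{cx}(0) \, U_{\mathrm{Cnot}}(\boldsymbol{q}(0)) = q_{cx}(0) q_{tx}(0) . \]
The leap to the general case, i.e., to have $t$ and $t-1$ instead of 1 and 0 in the above equation, follows from observing that the calculation could have been done by replacing the matrix Cnot by its functional representation $U_{\mathrm{Cnot}}(\boldsymbol{q}(0))$, and then using the $su(2)^{\otimes n}$ algebraic relations at time 0. But since the algebraic relations are preserved, $\boldsymbol{q}(0)$ could equally have been changed to $\boldsymbol{q}(t-1)$, to obtain that, generically, $q_{cx}(t) = q_{cx}(t-1) q_{tx}(t-1)$.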
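Both routes to the action of the Cnot can be compared numerically. The sketch below, for a 2-qubit network with control $c = 1$ and target $t = 2$, first conjugates the time-0 descriptors by the Cnot matrix, then repeats the computation with the functional representation at a generic time obtained from a random evolution; the helper names and the random evolution are assumptions of the example.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
ONE = np.eye(4, dtype=complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def embed(op, i, n=2):
    """Operator acting as `op` on qubit i (1-based) and as the identity elsewhere."""
    return reduce(np.kron, [op if k == i else I2 for k in range(1, n + 1)])

def conjugate(q, G):
    """Apply q -> G^dagger q G to every component."""
    Gd = G.conj().T
    return {i: {w: Gd @ op @ G for w, op in c.items()} for i, c in q.items()}

def U_cnot(q, c=1, t=2):
    """Functional representation of the Cnot, c controlling t."""
    return (ONE + q[c]["z"] + q[t]["x"] - q[c]["z"] @ q[t]["x"]) / 2

def is_cnot_action(before, after, c=1, t=2):
    """Check (q_cx, q_cz), (q_tx, q_tz) -> (q_cx q_tx, q_cz), (q_tx, q_cz q_tz)."""
    return (np.allclose(after[c]["x"], before[c]["x"] @ before[t]["x"])
            and np.allclose(after[c]["z"], before[c]["z"])
            and np.allclose(after[t]["x"], before[t]["x"])
            and np.allclose(after[t]["z"], before[c]["z"] @ before[t]["z"]))

q0 = {i: {"x": embed(SX, i), "z": embed(SZ, i)} for i in (1, 2)}

# Route 1: matrix representation, from time 0 to time 1.
matrix_route = is_cnot_action(q0, conjugate(q0, CNOT))

# Route 2: functional representation at a generic time t-1.
rng = np.random.default_rng(5)
g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
V, _ = np.linalg.qr(g)
q_prev = conjugate(q0, V)
functional_route = is_cnot_action(q_prev, conjugate(q_prev, U_cnot(q_prev)))

print(matrix_route, functional_route)
```

At time 0 the functional representation evaluates to the Cnot matrix itself, which is why the two routes agree.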

Superdense Coding, Revisited
In the Schrödinger picture, superdense coding [9] may appear to hinge on 'non-local' properties of the wave function. See Figure 1. The Schrödinger state at time 2 is given by the Bell state
\[ \frac{1}{\sqrt{2}} \big( |00\rangle + |11\rangle \big) . \]
The local operations performed by Alice on her qubit shall evolve the system to one of the four Bell states in accordance with the bits $i$ and $j$ that she wants to transmit. The latter are then revealed by a Bell measurement. See Table 1.
Table 1. Columns: bits $i, j$; state at time 4; state at time 6.

The protocol is now revisited in the language of descriptors. Denoting the descriptor at time 0 without any time labels, the computation can be done as follows.
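Denoting by $U^{(ij)}$ the evolution throughout the protocol, the probability of measuring an outcome "$i'$" on the first qubit is given by
\[ \langle 00 | \, U^{(ij)\dagger} \big( |i'\rangle\langle i'| \otimes \mathbb{1} \big) U^{(ij)} \, | 00 \rangle . \]
In the Heisenberg picture, this computation is performed from the middle outwards. The initial observables are expressed in terms of descriptors as
\[ |i'\rangle\langle i'| \otimes \mathbb{1} = \tfrac{1}{2} \big( \mathbb{1} + (-1)^{i'} q_{1z} \big) , \]
which evolve by $U^{(ij)}$ to
\[ \tfrac{1}{2} \big( \mathbb{1} + (-1)^{i'} q_{1z}(6) \big) = \tfrac{1}{2} \big( \mathbb{1} + (-1)^{i + i'} q_{1z} \big) . \]
The expectation value with the reference vector $|00\rangle$ thus yields
\[ \tfrac{1}{2} \big( 1 + (-1)^{i + i'} \big) = \delta_{ii'} . \]
Similarly, the probability of measuring "$j'$" on qubit 2 is given by $\delta_{jj'}$ and hence, the system shall deterministically return the value of the bits $i$ and $j$.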
When revisited with the help of descriptors, the superdense coding of two bits into a single qubit appears quite natural: the descriptor of Alice's qubit has precisely two slots in which bits can be encoded. When Alice transmits her qubit to Bob, measurements on that qubit alone could not leak any information about $i$ or $j$. In fact, any observable on Alice's qubit at time 4 is a linear combination of $\mathbb{1}$, $q_{1x}(4)$, $q_{1y}(4) = i \, q_{1x}(4) q_{1z}(4)$ and $q_{1z}(4)$, and since
\[ \langle 00 | \, \big( \mathbb{1}, q_{1x}(4), q_{1y}(4), q_{1z}(4) \big) \, | 00 \rangle = (1, 0, 0, 0) , \]
the expectation value of any observable on that qubit alone is independent of $i$ and $j$. However, the information about the bits $i$ and $j$ is contained in the transmitted qubit at time 4, since not only does $\boldsymbol{q}_1(4)$ depend on $i$ and $j$, but those bits eventually become accessible to measurement. This kind of information, present in a system but unretrievable by measurements on the system alone, has been called locally inaccessible by Deutsch and Hayden. In step 5 of the protocol, Bob's qubit serves as a key as well as an extra capacity: it unlocks the bit $i$ by getting rid of the obfuscating $q_{2x}$ while copying the bit $j$ in its $z$ component.
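The descriptor account of superdense coding can be simulated step by step. The sketch below rests on an assumed circuit, since the figure is not reproduced here: a Hadamard on qubit 1 (time 1), a Cnot from qubit 1 to qubit 2 (time 2), Alice applying $N^j$ (time 3) and then $\sigma_z^i$ (time 4) on qubit 1, a second Cnot (time 5) and a final Hadamard on qubit 1 (time 6). The order of Alice's two gates and all helper names are assumptions of this illustration.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
ONE = np.eye(4, dtype=complex)
REF = np.array([1, 0, 0, 0], dtype=complex)          # reference vector |00>

def embed(op, i, n=2):
    """Operator acting as `op` on qubit i (1-based) and as the identity elsewhere."""
    return reduce(np.kron, [op if k == i else I2 for k in range(1, n + 1)])

def conjugate(q, G):
    """Apply q -> G^dagger q G to every component."""
    Gd = G.conj().T
    return {k: {w: Gd @ op @ G for w, op in c.items()} for k, c in q.items()}

# Functional representations of the gates of the assumed circuit.
def identity(q):
    return ONE

def hadamard(k):
    return lambda q: (q[k]["x"] + q[k]["z"]) / np.sqrt(2)

def cnot(c, t):
    return lambda q: (ONE + q[c]["z"] + q[t]["x"] - q[c]["z"] @ q[t]["x"]) / 2

def superdense(i_bit, j_bit):
    """Return the list of descriptors q(0), ..., q(6) for Alice's bits i and j."""
    q = {k: {"x": embed(SX, k), "z": embed(SZ, k)} for k in (1, 2)}
    gates = [hadamard(1), cnot(1, 2),
             (lambda q: q[1]["x"]) if j_bit else identity,   # N^j on qubit 1
             (lambda q: q[1]["z"]) if i_bit else identity,   # sigma_z^i on qubit 1
             cnot(1, 2), hadamard(1)]
    history = [q]
    for f in gates:
        q = conjugate(q, f(q))               # equation (6), one gate per time step
        history.append(q)
    return history

def expectation(op):
    return np.vdot(REF, op @ REF).real

def prob(q, qubit, outcome):
    """Probability of `outcome` on `qubit`: <00| (1 + (-1)^outcome q_z) / 2 |00>."""
    return expectation((ONE + (-1) ** outcome * q[qubit]["z"]) / 2)

for i_bit in (0, 1):
    for j_bit in (0, 1):
        h = superdense(i_bit, j_bit)
        print(i_bit, j_bit, prob(h[6], 1, i_bit), prob(h[6], 2, j_bit))
```

Under these assumptions, the descriptor of qubit 2 is the same at times 2 and 4 for every choice of bits, and the expectation values of the components of $\boldsymbol{q}_1(4)$ vanish, in line with the locally inaccessible character of the encoded bits.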
Finally, notice that between time 2 and time 4, only the descriptor of the first qubit is affected, which invalidates the idea that the superdense coding protocol relies on non-local properties of entanglement. Indeed, there is an important asymmetry to be underlined: The existence of a local way in which a phenomenon (or more generally, a theory) can be explained makes the phenomenon (or theory) local. But this doesn't hold for the attribute "non-local", otherwise, all phenomena and all theories would qualify as non-local by considering ad hoc non-local explanations.

Conclusions
The formalism of descriptors has been re-explained in this paper in what I hope is a more complete exposition. I re-showed that the Heisenberg picture entails a local and complete way of describing quantum systems, and I used the approach to revisit superdense coding. Incidentally, in quantum field theory, locality in the sense advocated here, namely no action at a distance, as well as Lorentz invariance, are also recognized in the Heisenberg picture. The reader who is curious to unravel the mysteries of Bell inequality violations and of quantum teleportation is referred to §4 and §5 of the article by Deutsch and Hayden (op. cit.). When I explained the teleportation process in terms of descriptors to one of its pioneers, Gilles Brassard told me enthusiastically that it was the most satisfactory elucidation he had ever heard of his own invention. The best explanations of quantum processes are unlocked by the Heisenberg picture, which is manifestly local, but remain obscured in the widespread Schrödinger picture.

[...]nett, Xavier Coiteux-Roy, Samuel Ducharme, Samuel Kuypers, Chiara Marletto, Pierre McKenzie, Lodovico Scarpa, William Schober and Nicetu Tibau Vidal for fruitful discussions and comments on earlier versions of this paper. I also wish to thank Stefan Wolf as well as the Institute for Quantum Optics and Quantum Information of Vienna, in particular Marcus Huber's group, for a warm welcome and inspiring discussions.