Structure of Optimal State Discrimination in Generalized Probabilistic Theories

We consider optimal state discrimination in a general convex operational framework, the so-called generalized probabilistic theories (GPTs), and present a general method of optimal discrimination by applying the complementarity problem from convex optimization. The method exploits the convex geometry of states, but no other detailed conditions or relations of states and effects. We also show that properties of optimal quantum state discrimination are shared by GPTs in general: i) no measurement is sometimes optimal for discrimination, and ii) the optimal measurement is not unique.


Introduction
Suppose that there is a party, say Alice, who prepares her system in a particular state. The state is chosen from a set of states that have been publicly declared. The system is then given to the other party, called Bob, who then applies a measurement to find which of the possible states has been prepared. This scenario defines the problem of optimal state discrimination, which seeks the guessing probability, i.e. the maximum probability with which Bob can correctly guess the state prepared by Alice, as well as the optimal measurement that achieves it. Optimal state discrimination shows that there is a fundamental limit to the distinguishability of systems. The problem underlies some of the most fundamental measures in information theory, with deep connections to applications in quantum information processing [1] [2] [3].
Generalized probabilistic theories (GPTs) capture the formalism of the convex operational framework, in which the operational significance of states, effects, and dynamics can be identified and characterized [4] [5] [6]; see also a recent review [7]. States are elements of a convex set; effects map states into probabilities and thus represent probability measures; and dynamics constrains the possible evolution of states. In quantum theory, states correspond to non-negative, unit-trace bounded operators on Hilbert spaces; effects are postulated such that pairing elements of positive-operator-valued measures with states yields probabilities; and dynamics is generally described by positive, completely positive maps. GPTs are of fundamental interest, particularly within the foundations of quantum information theory, and they are also useful for identifying specific properties of states or effects that have operational significance. For instance, in quantum theory the fact that quantum states cannot be perfectly cloned may be viewed as one of the properties associated with Hilbert spaces, e.g. the non-orthogonality of state vectors. However, the no-cloning theorem does not necessarily rely on the structure of Hilbert spaces; in fact, GPTs that violate Bell inequalities can also incorporate the no-cloning theorem [8].
Recently, optimal state discrimination in GPTs has been considered and it has been shown that it is tightly connected to ensemble steering of states and the no-signaling principle [9]. Specifically, in a GPT where ensemble steering is possible, the no-signaling principle can determine optimal state discrimination. This also holds true in quantum theory, where the no-signaling principle elucidates the relation between optimal state discrimination and quantum cloning [10]. Given that ensemble steering itself does not single out quantum theory [11], the result is valid even beyond quantum theory as long as ensemble steering is allowed in a theory. That is, GPTs are a useful theoretical tool to find operational relations that may play a key role in quantum information applications [7].
In this work, we investigate general properties of optimal state discrimination in GPTs and present a method of optimal state discrimination based on the convex geometry of a state space. After briefly introducing the framework of GPTs and optimal state discrimination, we formalize optimal state discrimination within the convex optimization framework. We show that the primal and dual problems return the identical result, and thus formulate the problem in the form of the complementarity problem, which generalizes the optimization problems. This then allows us to derive a geometric method of state discrimination. We consider an example of GPTs, the polygon states, and apply the geometric formulation to optimal discrimination. We identify properties that optimal quantum state discrimination shares with GPTs: i) the optimal measurement is not unique in general, and ii) no measurement can sometimes give optimal state discrimination.
The present paper is structured as follows. We first review the framework of GPTs and optimal state discrimination, and then formulate optimal state discrimination in the convex optimization framework. We show that the primal and dual problems result in the same solutions, due to the strong duality in the problem. We then apply the complementarity problem, which generalizes the primal and dual problems, and derive the method of optimal state discrimination. The polygon system is considered as an example of GPTs, and we apply the method to optimal discrimination of polygon states.

Optimal state discrimination in GPTs
We briefly summarise GPTs [4] [5] [6] and formulate optimal state discrimination as a convex optimization problem. In particular, we apply the complementarity problem and then present a method of optimal discrimination based on the convex geometry of states.

Generalized Probabilistic Theories
As mentioned, a GPT contains states and effects such that together they produce probabilities. Any convex set can serve as a state space. The set of states, denoted by Ω, consists of all possible states in which a system can be prepared. Any probabilistic mixture of states, i.e. p w_1 + (1 − p) w_2 ∈ Ω for w_1, w_2 ∈ Ω and probability p, is also a state, and thus the set is convex. A general mapping from states to probabilities is described by effects, linear functionals Ω → [0, 1]. A measurement, denoted by s, is described by a set of effects {e_x^(s)}, with which the probability of obtaining outcome x for measurement s on state w is given by p(x|s) = e_x^(s)[w]. A unit effect u is introduced so that states are normalized with respect to effects: u[w] = 1 for all w ∈ Ω. Thus, for any measurement s, it holds that Σ_x e_x^(s) = u. As effects are dual to the state space, the set of effects is also convex.
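These definitions can be made concrete in a few lines of code. The sketch below is illustrative only: it encodes states and effects as vectors paired by the Euclidean inner product, with unit effect u = (0, 0, 1)^T, a representation that anticipates the polygon systems discussed later; the particular two-outcome measurement is a hypothetical choice. It checks the defining properties u[w] = 1, e[w] ∈ [0, 1], and Σ_x e_x = u.

```python
import math

# States of a square-shaped state space: w = (r cos t, r sin t, 1), so that u[w] = 1.
r = math.sqrt(math.sqrt(2.0))
states = [(r * math.cos(math.pi * x / 2), r * math.sin(math.pi * x / 2), 1.0)
          for x in range(4)]
u = (0.0, 0.0, 1.0)                      # unit effect

def pair(e, w):
    """Pairing e[w]: a linear functional applied to a state."""
    return sum(a * b for a, b in zip(e, w))

# A two-outcome measurement {f, u - f}: each effect maps states into [0, 1].
f = (0.5 / r, 0.0, 0.5)
measurement = [f, tuple(a - b for a, b in zip(u, f))]

normalized = all(abs(pair(u, w) - 1.0) < 1e-12 for w in states)
valid_effects = all(-1e-12 <= pair(e, w) <= 1.0 + 1e-12
                    for e in measurement for w in states)
completeness = all(abs(sum(e[i] for e in measurement) - u[i]) < 1e-12
                   for i in range(3))
print(normalized, valid_effects, completeness)
```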

State discrimination in convex optimization
Optimal state discrimination in GPTs can be described as a game between two parties, Alice and Bob, as follows. Suppose that they have agreed on a set of N states in advance. Alice prepares a system in one of the N states with some a priori probability and gives it to Bob; the a priori probabilities are also publicly known. Given the set of states and the a priori probabilities, Bob applies a measurement and attempts to guess which state has been prepared. If he guesses correctly, the score is 1, and 0 otherwise. The goal is to maximize the average score by optimizing over measurements.
Let us label the N states by {w_x}_{x=1}^N and their a priori probabilities by {q_x}_{x=1}^N, so that the ensemble is written as {q_x, w_x}_{x=1}^N. Bob's measurement is described by a set of effects {e_x}_{x=1}^N that fulfills the condition Σ_x e_x = u, in such a way that he guesses w_x upon obtaining the outcome of effect e_x. Let p_{B|A}(x|y) = e_x[w_y] denote the probability that Bob guesses w_x, from effect e_x, when the state w_y is given by Alice. Optimal state discrimination then determines the guessing probability, the maximum probability that Bob makes correct guesses on average,

  p_guess = max Σ_{x=1}^N q_x e_x[w_x],   (1)

where the maximization runs over all sets of effects. Note that GPTs are generally not self-dual; that is, the spaces of states and effects are in general not isomorphic [12].
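As an illustration, a sketch (using the square-shaped polygon state space treated later in the examples; the particular measurement is one hypothetical choice) computes the conditional probabilities p_{B|A}(x|y) = e_x[w_y] and the average success probability Σ_x q_x e_x[w_x] directly:

```python
import math

r = math.sqrt(math.sqrt(2.0))            # r_4 = sec(pi/4) ** 0.5
states = [(r * math.cos(math.pi * x / 2), r * math.sin(math.pi * x / 2), 1.0)
          for x in range(4)]
priors = [0.25] * 4
u = (0.0, 0.0, 1.0)

def pair(e, w):
    return sum(a * b for a, b in zip(e, w))

# Four-outcome measurement: e_x points toward vertex x; divided by 2 so the effects sum to u.
effects = [(0.5 * math.cos(math.pi * x / 2) / r,
            0.5 * math.sin(math.pi * x / 2) / r, 0.5) for x in range(4)]
measurement = [tuple(c / 2 for c in e) for e in effects]
assert all(abs(sum(e[i] for e in measurement) - u[i]) < 1e-12 for i in range(3))

p_B_given_A = [[pair(measurement[x], states[y]) for y in range(4)] for x in range(4)]
success = sum(priors[x] * p_B_given_A[x][x] for x in range(4))
print(round(success, 12))                 # average success probability
```

For each prepared state w_y, the outcome probabilities p_{B|A}(x|y) sum to one, reflecting the completeness condition Σ_x e_x = u.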

A convex optimization framework
We recall that the state space Ω is convex, and so is its dual, the space of effects, leading naturally to the following optimization problem:

  max Σ_{x=1}^N q_x e_x[w_x]  subject to  e_x ≥ 0 for all x, and Σ_x e_x = u,

where by e_x ≥ 0 it is meant that e_x[w] ≥ 0 for all w ∈ Ω. Note that the above problem is feasible, as the set of parameters satisfying the constraints is not empty: a trivial feasible point is e_x = u for a single x and e_y = 0 for all y ≠ x. For convenience, we follow the notation in Ref. [13] and rewrite the maximization above as a minimization,

  min − Σ_{x=1}^N q_x e_x[w_x]  subject to  e_x ≥ 0 for all x, and Σ_x e_x = u.

It is then straightforward to derive the dual problem. The Lagrangian can be written as

  L = − Σ_x q_x e_x[w_x] + Σ_x e_x[K] − u[K] − Σ_x r_x e_x[d_x],

where K is the dual parameter for the constraint Σ_x e_x = u, the multipliers for the constraints e_x ≥ 0 are written as r_x d_x with r_x ≥ 0, and {d_x} are normalized states. The dual problem is obtained by minimizing the Lagrangian over all effects. The minimization is bounded precisely when e[K − q_x w_x] ≥ 0 for all effects e; we write this as K ≥ q_x w_x for each x. The dual problem is thus as follows,

  min u[K]  subject to  K − q_x w_x ≥ 0 for all x,

or, equivalently, K ≥ q_x w_x for all x. In the above, the inequality means an order relation in a convex cone, which is determined by effects; that is, e[K − q_x w_x] ≥ 0 for all effects e. Note also that the dual problem is feasible: a trivial feasible point is K = Σ_x q_x w_x.

Constraint Qualification
Recall that in general the primal and dual problems do not return an identical solution; there can be a finite gap between the solutions of the two problems. In the case of state discrimination above, both problems are feasible. By Slater's constraint qualification, strong duality therefore holds. Hence no gap exists between the solutions; in other words, one can obtain the optimal solution by solving either the primal or the dual problem.
In addition, strong duality also implies that the list of optimality conditions, the so-called Karush-Kuhn-Tucker (KKT) conditions, is also sufficient. That is, parameters satisfying the KKT conditions provide optimal solutions to both the primal and dual problems. For the optimization problems above, the KKT conditions are, together with the constraints of both the primal and dual problems, as follows,

  K = q_x w_x + r_x d_x  for all x,
  r_x e_x[d_x] = 0  for all x.

The former is called Lagrangian stability, and the latter complementary slackness. The fact that strong duality holds also guarantees that there exist dual parameters K and {r_x, d_x}_{x=1}^N that fulfill the KKT conditions, and those parameters give optimal solutions to the primal and dual problems. The optimal effects are characterized by the complementary slackness above. This also shows the existence of optimal effects, or observables, in a GPT. All of this follows from the fact that the state space is convex. For comparison with the quantum case, the corresponding formulation of minimum-error discrimination is shown in Ref. [14]; see also its applications to various figures of merit in Ref. [15].
To summarize, the sole fact that state and effect spaces are convex allows us to formalize the discrimination problem within the convex optimization framework [13]. This provides a general approach to finding optimal discrimination in GPTs. For states {q_x, w_x}_{x=1}^N, we take the form in Eq. (1) as the primal problem, denoted by p*, and derive its dual d*, as follows,

  p* = max Σ_{x=1}^N q_x e_x[w_x]  subject to  e_x ≥ 0 for all x, and Σ_x e_x = u,   (4)

  d* = min u[K]  subject to  K ≥ q_x w_x for all x,   (5)

where the inequalities denote the order relation in the convex cones: by e_x ≥ 0, it is meant that e_x[w] ≥ 0 for all w ∈ Ω, and by K ≥ q_x w_x, that e[K − q_x w_x] ≥ 0 for all effects e.
For the primal and dual problems in Eqs. (4) and (5), the property called strong duality holds true. This means that the two problems have an identical solution, i.e. p* = d*, and therefore one can obtain the guessing probability by solving either of the problems. Strong duality can be shown from Slater's constraint qualification in convex optimization: a sufficient condition for strong duality is strict feasibility of either the primal or the dual problem, that is, the existence of a strictly feasible point. For instance, the primal parameters {e_x = u/N}_{x=1}^N give such a point, since e_x[w_y] = 1/N > 0 for all x, y and Σ_x e_x = u. From this, it follows that the guessing probability can be obtained from either the primal or the dual problem.
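Strong duality can also be witnessed numerically on a concrete instance. The sketch below (using the square state space of the later examples; all specific choices are illustrative) exhibits a primal feasible point and a dual feasible point with matching values, certifying optimality of both. Dual feasibility K − q_x w_x ≥ 0 is verified by showing that the difference is a non-negative multiple of a state, on which every effect is non-negative by definition.

```python
import math

r = math.sqrt(math.sqrt(2.0))
states = [(r * math.cos(math.pi * x / 2), r * math.sin(math.pi * x / 2), 1.0)
          for x in range(4)]
priors = [0.25] * 4
u = (0.0, 0.0, 1.0)

def pair(e, w):
    return sum(a * b for a, b in zip(e, w))

# Primal feasible point: four effects e_x/2 with e_x[w_x] = 1, summing to u.
measurement = [(0.25 * math.cos(math.pi * x / 2) / r,
                0.25 * math.sin(math.pi * x / 2) / r, 0.25) for x in range(4)]
primal_value = sum(priors[x] * pair(measurement[x], states[x]) for x in range(4))

# Dual feasible point: K = (0, 0, 1/2); then K - q_x w_x = (1/4) w_{x+2},
# a subnormalized state, so e[K - q_x w_x] >= 0 for every effect e.
K = (0.0, 0.0, 0.5)
dual_feasible = all(
    all(abs((K[i] - priors[x] * states[x][i]) - 0.25 * states[(x + 2) % 4][i]) < 1e-12
        for i in range(3))
    for x in range(4))
dual_value = pair(u, K)
print(primal_value, dual_value, dual_feasible)
```

Since the primal value of a feasible measurement never exceeds the dual value of a feasible K (weak duality), their coincidence here certifies that both points are optimal.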

The complementarity problem
In convex optimization, there is another approach, called the complementarity problem, which generalizes the primal and dual problems. It collects the optimality conditions and analyzes them directly. Consequently, the complementarity problem deals with both the primal and dual parameters in Eqs. (4) and (5) and finds all of the optimal parameters. In this sense, the approach is generally not considered more efficient than solving the primal or dual problem alone. Its advantage, rather, lies in the fact that generic structures present in an optimization problem are uncovered and exploited.
The optimality conditions for optimal state discrimination in Eqs. (4) and (5) can be summarized by the so-called Karush-Kuhn-Tucker (KKT) conditions, which are the constraints listed in Eqs. (4) and (5) together with the following,

  (Symmetry operator)  K = q_x w_x + r_x d_x  for all x,   (6)
  (Orthogonality)  r_x e_x[d_x] = 0  for all x,   (7)

where r_x ∈ [0, 1] for all x, and {d_x}_{x=1}^N are normalized states, i.e. u[d_x] = 1. We call {d_x}_{x=1}^N complementary states, which construct the symmetry operator. The two conditions above are explained in terms of the convex geometry of the given states, as follows.
1. The first condition, the symmetry operator in Eq. (6), follows from the Lagrangian stability and shows that for any discrimination problem, e.g. {q_x, w_x}_{x=1}^N, there exists a single parameter K which decomposes in N different ways into the given states and the complementary states {r_x, d_x}_{x=1}^N.
2. The second condition, in Eq. (7), follows from the complementary slackness and characterizes optimal effects by the orthogonality relation between complementary states and optimal effects.
These generalize the optimality conditions from the quantum case to all GPTs; see also the various forms of the optimality conditions in the quantum case [14].

Primal and dual parameters satisfying the KKT conditions are automatically optimal parameters that provide solutions to the primal and dual problems. Note also that, since strong duality holds, both problems yield the same solution. Conversely, the fact that strong duality holds in Eqs. (4) and (5) implies the existence of optimal parameters which satisfy the KKT conditions and give the guessing probability in Eq. (1).
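To make the KKT conditions concrete, the sketch below verifies Lagrangian stability and complementary slackness for a specific candidate solution in the square-shaped state space used in the later examples; the choices of K, r_x, d_x, and the effects are assumptions for illustration.

```python
import math

r = math.sqrt(math.sqrt(2.0))
states = [(r * math.cos(math.pi * x / 2), r * math.sin(math.pi * x / 2), 1.0)
          for x in range(4)]
priors = [0.25] * 4

def pair(e, w):
    return sum(a * b for a, b in zip(e, w))

# Candidate optimal data: symmetry operator K, constants r_x = 1/4,
# complementary states d_x = w_{x+2}, and effects e_x with e_x[w_x] = 1, e_x[d_x] = 0.
K = (0.0, 0.0, 0.5)
rs = [0.25] * 4
comp_states = [states[(x + 2) % 4] for x in range(4)]
effects = [(0.5 * math.cos(math.pi * x / 2) / r,
            0.5 * math.sin(math.pi * x / 2) / r, 0.5) for x in range(4)]

# Lagrangian stability: K = q_x w_x + r_x d_x for every x.
stability = all(
    all(abs(K[i] - (priors[x] * states[x][i] + rs[x] * comp_states[x][i])) < 1e-12
        for i in range(3))
    for x in range(4))

# Complementary slackness: r_x e_x[d_x] = 0 for every x.
slackness = all(abs(rs[x] * pair(effects[x], comp_states[x])) < 1e-12 for x in range(4))
print(stability, slackness)
```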
Note that a similar approach has been taken in Ref. [16] in the form of the so-called Helstrom family, by generalizing examples in quantum cases to GPTs. For quantum state discrimination, the approach based on the complementarity problem was first applied in Refs. [17] [18] for two qubit states. This has been generalized to a pair of arbitrary states in GPTs [19]. When this result is generalized to an arbitrary number of states in GPTs, however, the existence of the symmetry operator and the orthogonality conditions has only been assumed [16]; in particular, those cases for which the optimal parameters exist are called Helstrom families. Here, we apply the complementarity problem, which immediately proves the existence of the optimal parameters.

The geometric method and the general form of the guessing probability
We are now ready to present a geometric method of solving minimum-error state discrimination within the framework of GPTs, based on the complementarity problem. We first observe that, in the optimality conditions in Eqs. (6) and (7), the constraints for states and effects are separated. The symmetry operator K is characterized on the state space and gives the guessing probability, see Eq. (5), that is,

  p_guess = d* = u[K].   (8)

This means that one can find the guessing probability from the state space alone. To do so, one has to find the symmetry operator K such that it decomposes into a given state q_x w_x and a complementary state r_x d_x, for each x. Equivalently, one has to search for complementary states {r_x, d_x}_{x=1}^N fulfilling Eq. (6) on the state space. Let us introduce the convex polytope P({q_x, w_x}_{x=1}^N) of the given states in the state space: each vertex of the polytope corresponds to the state q_x w_x for x = 1, · · · , N. Then, the polytope of complementary states, P({r_x, d_x}_{x=1}^N), is congruent to P({q_x, w_x}_{x=1}^N) in the state space: from Eq. (6) the following holds,

  q_x w_x − q_y w_y = −(r_x d_x − r_y d_y)  for all x, y,   (9)

which shows that corresponding edges of the two polytopes P({q_x, w_x}_{x=1}^N) and P({r_x, d_x}_{x=1}^N) are of equal length and anti-parallel. Then, from the convex geometry of the state space, one can find the polytope of complementary states, and thus the complementary states themselves, by placing the two congruent polytopes such that the condition in Eq. (6) holds. Once the complementary states are obtained, the optimal effects can be found from the orthogonality relation in Eq. (7), accordingly.
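The congruence of the two polytopes can be checked directly for the square example treated in the next section (a sketch; the complementary states d_x = w_{x+2} are taken from that example):

```python
import math

r = math.sqrt(math.sqrt(2.0))
states = [(r * math.cos(math.pi * x / 2), r * math.sin(math.pi * x / 2), 1.0)
          for x in range(4)]
q = [0.25] * 4
rs = [0.25] * 4
comp = [states[(x + 2) % 4] for x in range(4)]    # complementary states d_x = w_{x+2}

def edge(weights, pts, x, y):
    """Edge vector between vertices weights[x]*pts[x] and weights[y]*pts[y]."""
    return tuple(weights[x] * pts[x][i] - weights[y] * pts[y][i] for i in range(3))

# Corresponding edges of the two polytopes are equal in length and anti-parallel:
# q_x w_x - q_y w_y = -(r_x d_x - r_y d_y) for all pairs x, y.
congruent = all(
    all(abs(edge(q, states, x, y)[i] + edge(rs, comp, x, y)[i]) < 1e-12 for i in range(3))
    for x in range(4) for y in range(4))
print(congruent)
```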
Finally, let us provide a general form of the guessing probability in GPTs when the a priori probabilities are equal, i.e. q_x = 1/N for all x. In this case the guessing probability takes a simpler form, whose meaning we exhibit with the convex geometry. First, from Eq. (8) we have p_guess = q_x + r_x for any x. Since q_x = 1/N, we have r_x = r_y for all x, y. Denoting r := r_x for all x, the guessing probability has the form

  p_guess = 1/N + r,   (10)

where the expression for r follows from the condition in Eq. (9), with a distance measure ‖·‖ that can be defined on the state space. The parameter r has a meaning as the ratio between the two polytopes, P({1/N, w_x}_{x=1}^N) of the given states and P({d_x}_{x=1}^N) of the complementary states.
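For completeness, the identity p_guess = q_x + r_x used above follows directly from the symmetry-operator decomposition in Eq. (6), using linearity of u and the normalizations u[w_x] = u[d_x] = 1:

```latex
p_{\mathrm{guess}} \;=\; d^{*} \;=\; u[K]
\;=\; u\!\left[\,q_x w_x + r_x d_x\,\right]
\;=\; q_x\, u[w_x] + r_x\, u[d_x]
\;=\; q_x + r_x .
```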

Examples: polygon states
We illustrate the method of optimal state discrimination in GPTs with an example, the polygon systems of Ref. [12]. We consider the cases of three and four states and apply the geometric method of optimal state discrimination; it is straightforward to extend the approach to N states. The polygon system is in general given by N states

  w_x = (r_N cos(2πx/N), r_N sin(2πx/N), 1)^T,  x = 0, · · · , N − 1,  with  r_N = sec(π/N)^{1/2},

where the unit effect is u = (0, 0, 1)^T and the map from states and effects to probabilities is given by the Euclidean inner product.

A case of N = 3
Let us first consider the case N = 3, in which the states and effects are given as

  w_x = (√2 cos(2πx/3), √2 sin(2πx/3), 1)^T,  f_x = (1/3)(√2 cos(2πx/3), √2 sin(2πx/3), 1)^T,  x = 0, 1, 2,

since r_3 = sec(π/3)^{1/2} = √2. One can easily check that f_0 + f_1 + f_2 = u and, moreover, that f_x[w_y] = δ_{xy}. We consider optimal state discrimination for {1/3, w_x}_{x=0}^2. Applying the geometric method, and also from the geometry of the polygon system for N = 3, see also Fig. (1), one finds that it holds that

  K = (1/3) w_x + (2/3) d_x,  with  d_x = (w_{x+1} + w_{x+2})/2  (mod 3),

so that p_guess = u[K] = 1: the three states, which form a classical trit, are perfectly discriminated by the measurement {f_x}_{x=0}^2.
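A quick numerical check of the N = 3 case (a sketch assuming the polygon parametrization above):

```python
import math

r3 = math.sqrt(2.0)                       # r_3 = sec(pi/3) ** 0.5 = sqrt(2)
ang = [2.0 * math.pi * x / 3.0 for x in range(3)]
states = [(r3 * math.cos(a), r3 * math.sin(a), 1.0) for a in ang]
effects = [tuple(c / 3.0 for c in w) for w in states]   # f_x = w_x / 3
u = (0.0, 0.0, 1.0)

def pair(e, w):
    return sum(a * b for a, b in zip(e, w))

# f_x[w_y] = delta_xy: the triangle (classical trit) states are perfectly distinguishable.
delta_ok = all(abs(pair(effects[x], states[y]) - (1.0 if x == y else 0.0)) < 1e-12
               for x in range(3) for y in range(3))
complete = all(abs(sum(f[i] for f in effects) - u[i]) < 1e-12 for i in range(3))

# Guessing probability for uniform priors with this measurement.
p_guess = sum((1.0 / 3.0) * pair(effects[x], states[x]) for x in range(3))
print(delta_ok, complete, p_guess)
```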

A case of N = 4
We next consider the case N = 4, in which the states and effects are given as

  w_x = (r_4 cos(πx/2), r_4 sin(πx/2), 1)^T,  f_x = (1/2)(r_4 cos((2x − 1)π/4), r_4 sin((2x − 1)π/4), 1)^T,  x = 0, 1, 2, 3,

with r_4 = sec(π/4)^{1/2}. For the four states {1/4, w_x}_{x=0}^3, the goal is now to find the guessing probability and the optimal measurement. Exploiting the convex geometry, one sees that the polytope P({1/4, w_x}_{x=0}^3) forms a rectangle, from which it follows that r = 1/4 from Eq. (10). To be precise, from the state-space geometry, one can see that

  K = (1/4) w_x + (1/4) w_{x+2}  for x = 0, 1, 2, 3 (mod 4),

where the complementary states are obtained as d_x = w_{x+2}. Thus we have the guessing probability, from the condition in Eq. (6),

  p_guess = 1/4 + 1/4 = 1/2.

Note that these four states are analogous to a case in quantum theory, two pairs of orthogonal states: for the corresponding four quantum states, the guessing probability is also given by 1/2 [14].
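These statements can be verified numerically. In the sketch below, the explicit form of the effects e_x with e_x[w_x] = 1 and e_x[w_{x+2}] = 0 (used for the four-outcome measurement discussed next) is an assumed parametrization:

```python
import math

r4 = math.sqrt(math.sqrt(2.0))            # r_4 = sec(pi/4) ** 0.5
states = [(r4 * math.cos(math.pi * x / 2), r4 * math.sin(math.pi * x / 2), 1.0)
          for x in range(4)]
u = (0.0, 0.0, 1.0)

def pair(e, w):
    return sum(a * b for a, b in zip(e, w))

# Symmetry operator: K = (1/4) w_x + (1/4) w_{x+2} is the same for every x.
Ks = [tuple(0.25 * states[x][i] + 0.25 * states[(x + 2) % 4][i] for i in range(3))
      for x in range(4)]
same_K = all(abs(Ks[x][i] - Ks[0][i]) < 1e-12 for x in range(4) for i in range(3))
p_guess_dual = pair(u, Ks[0])             # u[K] = 1/4 + 1/4 = 1/2

# Achieving measurement {e_x / 2} with e_x[w_x] = 1 and e_x[w_{x+2}] = 0.
effects = [(0.5 * math.cos(math.pi * x / 2) / r4,
            0.5 * math.sin(math.pi * x / 2) / r4, 0.5) for x in range(4)]
measurement = [tuple(c / 2.0 for c in e) for e in effects]
p_guess_primal = sum(0.25 * pair(measurement[x], states[x]) for x in range(4))
print(same_K, p_guess_dual, p_guess_primal)
```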
The optimal measurement is obtained by using the orthogonality condition in Eq. (7). In fact, the optimal measurement is not unique, and each of the following sets of effects attains the guessing probability.
• i) {e_x/2}_{x=0}^3: the four-outcome measurement, where e_x denotes the extremal effect with e_x[w_x] = 1 and e_x[w_{x+2}] = 0, so that outcome x concludes that the given state is w_x. One can easily check the orthogonality condition (1/2)e_x[d_x] = 0 and also that Σ_x e_x/2 = u.
• ii) {f_0, f_2}: In this case, the outcome of effect f_0 indicates that the given state is either w_0 or w_3, and that of f_2 indicates w_1 or w_2. This is because, from the orthogonality condition in Eq. (7), it holds that

  f_0[d_0] = f_0[d_3] = 0  and  f_2[d_1] = f_2[d_2] = 0.

Once the outcome of effect f_0 (f_2) is found, one randomly concludes that the given state is either w_0 or w_3 (w_1 or w_2), and the guessing probability 1/2 is obtained.
• iii) {f_1, f_3}: This case works similarly to the previous one: the outcome of effect f_1 indicates that the given state is either w_0 or w_1, and that of f_3 indicates w_2 or w_3.
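The two-outcome strategy ii) can likewise be checked numerically (a sketch assuming polygon effects f_x at the half-angles (2x − 1)π/4, consistent with the response pattern described above):

```python
import math

r4 = math.sqrt(math.sqrt(2.0))
states = [(r4 * math.cos(math.pi * x / 2), r4 * math.sin(math.pi * x / 2), 1.0)
          for x in range(4)]
effects = [(0.5 * r4 * math.cos((2 * x - 1) * math.pi / 4),
            0.5 * r4 * math.sin((2 * x - 1) * math.pi / 4), 0.5) for x in range(4)]

def pair(e, w):
    return sum(a * b for a, b in zip(e, w))

# f_0 responds only to {w_0, w_3}; f_2 = u - f_0 responds only to {w_1, w_2}.
assert abs(pair(effects[0], states[1])) < 1e-12 and abs(pair(effects[0], states[2])) < 1e-12
assert abs(pair(effects[2], states[0])) < 1e-12 and abs(pair(effects[2], states[3])) < 1e-12

# Strategy ii): on outcome f_0 guess w_0 or w_3 at random; on f_2 guess w_1 or w_2.
guesses = {0: (0, 3), 2: (1, 2)}
success = sum(0.25 * pair(effects[o], states[y]) * 0.5
              for o, ys in guesses.items() for y in ys)
print(success)
```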
From the optimal measurements shown above, we remark that properties of optimal quantum state discrimination also hold true in GPTs. First, the optimal measurement in quantum state discrimination is generally not unique [1], and the example above shows that this also holds true in GPTs. Moreover, an optimal measurement for discriminating N quantum states does not always contain the same number of outcomes, that is, N POVM elements [14] [20]. As shown above, this also holds true in GPTs in general.

When no measurement is optimal
We here show another property that quantum state discrimination shares with GPTs: namely, that no measurement is sometimes optimal in state discrimination. That is, applying no measurement but simply guessing the state according to the a priori probabilities gives a guessing probability at least as high as that of any other strategy. This also holds true in GPTs. In the following, we provide an example adapted from the result in the quantum case [20].
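The construction worked out in the following paragraph admits a direct numerical check. The sketch below (with the illustrative choice p = 1/2, the square polygon parametrization assumed, and the complementary states solved from the decomposition) verifies that K = p w_4 decomposes as required and that u[K] = p:

```python
import math

r4 = math.sqrt(math.sqrt(2.0))
states = [(r4 * math.cos(math.pi * x / 2), r4 * math.sin(math.pi * x / 2), 1.0)
          for x in range(4)]
w4 = tuple(sum(w[i] for w in states) / 4.0 for i in range(3))   # the mixture, (0, 0, 1)
u = (0.0, 0.0, 1.0)

p = 0.5                                    # illustrative prior for the mixed state w_4
q = (1.0 - p) / 4.0                        # priors of w_0, ..., w_3
rx = (5.0 * p - 1.0) / 4.0                 # r_x = (5p - 1)/4 >= 0 whenever p >= 1/5

def pair(e, w):
    return sum(a * b for a, b in zip(e, w))

K = tuple(p * c for c in w4)               # symmetry operator K = p w_4

# Solve d_x from K = q w_x + r_x d_x and check it is a normalized state (u[d_x] = 1).
comp = [tuple((K[i] - q * states[x][i]) / rx for i in range(3)) for x in range(4)]
normalized = all(abs(pair(u, d) - 1.0) < 1e-12 for d in comp)

p_guess = pair(u, K)                       # guessing w_4 without measurement attains u[K] = p
print(normalized, p_guess)
```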
Let us consider the four polygon states {w_x}_{x=0}^3 for N = 4 in the above, together with their mixture w_4 = Σ_{x=0}^3 w_x/4. Let q_x = (1 − p)/4 denote the a priori probabilities of the states w_x for x = 0, 1, 2, 3, respectively, and q_4 = p that of the state w_4. Hence, we consider optimal state discrimination for {q_x, w_x}_{x=0}^4. In particular, let us also assume that p ≥ 1/5. Then, one can find the optimal discrimination with the symmetry operator as follows,

  K = p w_4 = q_x w_x + r_x d_x,  x = 0, 1, 2, 3,   (16)

with constants {r_x = (5p − 1)/4}_{x=0}^3, and with r_4 = 0 for the state w_4. It is then straightforward to find {d_x}_{x=0}^3 such that the equalities in Eq. (16) hold true. Note also that whenever p ≥ 1/5 it holds that r_x d_x ≥ 0. The guessing probability is then simply given as p_guess = u[K] = p, which can be attained by guessing the state w_4 according to the a priori probability, without any measurement.

Concluding remarks
Optimal state discrimination is an operational task that corresponds to the information-theoretic measure, the min-entropy [21]. On the other hand, GPTs are of theoretical and fundamental interest in that states, effects, and dynamics are identified in a convex operational framework; their operational significance can be found without the detailed structures of a given theory, e.g. the Hilbert spaces of quantum theory.
In the present work, we have considered optimal state discrimination in GPTs within the convex optimization framework. This generalizes the result in quantum cases where optimization runs over symmetric operators describing quantum states and measurements [14]. Here, we have considered optimal state discrimination without such structures, and shown that the results in quantum cases, e.g. see Ref. [14], are shared in GPTs in general. These include, firstly, the convex optimization and the complementariy problem, and then the method of optimal state discrimination with the convex geometry of state spaces. In particular, we has shown with the polygon systems how the method can be applied. We have shown that the followings hold true in GPTs in general: i) optimal measurement is not unique, and ii) no measurement can sometimes give optimal discrimination. The results may be useful in the operational characterization of quantum information processing, and we also envisage their usefulness in quantum information applications.