The ansatz for the mathematical theory of quantum physics is to represent a physical quantity as a self-adjoint operator $A$ in the space of linear operators over a state space $\mathcal{H}$, which carries the structure of a complex Hilbert space of some dimension $n$. The values that this quantity can assume in an experiment are the corresponding eigenvalues $\lambda_i$ of $A$. We have to find a way to assign probabilities to these eigenvalues, a task which is equivalent to assigning probabilities to the orthogonal projection operators $P_i$, which project states in $\mathcal{H}$ to eigenstates corresponding to the $\lambda_i$. If agents assign a probability $p$ to a projection operator $P$ that is common to two families of orthogonal projectors, $\{P_i\}$ and $\{P'_j\}$, and do corresponding experiments, they would like to be sure that, if they find the frequency $p$, the results describe the same event $P$. This is a non-contextuality condition (note that the set of (conditional) probabilities over realized values is contextual, as a theorem by Kochen–Specker [8] and an example by Hardy [9] show). A famous theorem by Gleason [10] says that to any non-contextual measure $\mu$ on the set of projectors $\mathcal{P}(\mathcal{H})$ over a Hilbert space $\mathcal{H}$ of dimension $n \geq 3$, there exists a unique positive semi-definite, self-adjoint operator $\rho$ of trace one, called density operator, such that $\mu(P) = \mathrm{tr}(\rho P)$ for all $P \in \mathcal{P}(\mathcal{H})$. This is the Born rule. The theorem defines the appropriate measures as well as the quantum states, which are identified with the density operators $\rho$. There is a special class, $\rho^2 = \rho$, of density operators, called pure states. Every unit vector $|\psi\rangle \in \mathcal{H}$ defines a corresponding pure state $\rho_\psi = |\psi\rangle\langle\psi|$, which is the projection operator onto $|\psi\rangle$. The set of density operators, $S(\mathcal{H})$, is the set of all convex combinations of pure states
        $$\rho = \sum_i p_i\, |\psi_i\rangle\langle\psi_i|, \qquad p_i \geq 0, \quad \sum_i p_i = 1. \tag{1}$$
The weights $p_i$ corresponding to the convex combinations are the second category of probabilities mentioned above. Important as Gleason’s theorem is, because it technically defines the right quantum probabilities, it does not say much about their nature or interpretation.
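The Born rule $\mu(P) = \mathrm{tr}(\rho P)$ and the non-contextuality condition can be checked numerically on a small example. The following is a minimal sketch, assuming a real three-dimensional state space and an illustrative mixed state; the helper names (`projector`, `convex`, `born`) and the chosen vectors are hypothetical and not taken from the text.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def projector(v):
    """Projector |v><v| / <v|v> onto the ray spanned by a real integer vector v."""
    norm2 = sum(x * x for x in v)
    return [[F(x * y, norm2) for y in v] for x in v]

def convex(weights, states):
    """Convex combination sum_i p_i rho_i of density matrices."""
    n = len(states[0])
    return [[sum(w * s[i][j] for w, s in zip(weights, states))
             for j in range(n)] for i in range(n)]

def born(rho, p):
    """Probability tr(rho P) assigned to the projector P."""
    return trace(matmul(rho, p))

# Illustrative mixed state (an assumption): equal mixture of two pure states.
rho = convex([F(1, 2), F(1, 2)], [projector([1, 1, 0]), projector([0, 0, 1])])

# Two orthogonal families that share the projector onto (0, 0, 1).
family_a = [projector(v) for v in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]
family_b = [projector(v) for v in ([1, 1, 0], [1, -1, 0], [0, 0, 1])]

probs_a = [born(rho, p) for p in family_a]
probs_b = [born(rho, p) for p in family_b]
```

In this sketch both families yield probabilities that sum to one, and the shared projector receives the same probability in either family, as non-contextuality demands.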
  2.2. Probabilities of Pure States
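We now consider a system $\mathcal{S}$ represented by a pure state $|\psi\rangle$ with resolution in the eigenbasis $\{|e_k\rangle\}_{k=1}^{n}$ of a self-adjoint operator $A$, $|\psi\rangle = \sum_k a_k |e_k\rangle$. We can form the corresponding pure state $\rho_\psi = |\psi\rangle\langle\psi|$ with matrix entries $(\rho_\psi)_{kl} = a_k \bar{a}_l$. The Born rule then assigns probabilities
        $$p_k = \mathrm{tr}(\rho_\psi P_k) = |a_k|^2 \tag{2}$$
to the projectors $P_k = |e_k\rangle\langle e_k|$. Assume that there is an additional system $\mathcal{E}$ with orthonormal basis states $\{|j\rangle\}$, which is initially in the base state $|0\rangle$. A measurement of some state $|\psi\rangle$ by the probe $\mathcal{E}$ is an operation $U$ on the joint system
        $$|\psi\rangle \otimes |0\rangle \;\mapsto\; U\,(|\psi\rangle \otimes |0\rangle), \tag{3}$$
where $U$ is unitary, $U^\dagger U = I$ (this follows from the fact that a general interaction evolution $e^{-iHt}$ is unitary). A general unitary transformation on a tensor product, expressed in the respective bases, can be written as a block matrix
        $$U = \begin{pmatrix} M_{00} & M_{01} & \cdots \\ M_{10} & M_{11} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}, \tag{4}$$
where the operators $M_{jl}$ on $\mathcal{H}$ are given by $M_{jl} = (I \otimes \langle j|)\, U\, (I \otimes |l\rangle)$. We denote the sub-block $M_{j0}$, which is the one that acts when the probe starts in $|0\rangle$, simply by $M_j$. Since $U$ is unitary, we have
        $$\sum_j M_j^\dagger M_j = I. \tag{5}$$
Conversely, we can choose any set of operators $\{M_j\}$ satisfying the resolution-of-the-identity condition (5) to define a measurement on an initial joint state $|\psi\rangle \otimes |0\rangle$. We now have the necessary elements in place to give the main argument.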
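Assume there is a second system $\mathcal{E}$ with basis $\{|j\rangle\}$ and an observer who would like to know in what state $|e_k\rangle$ the system $\mathcal{S}$ is, by making an appropriate measurement $U$ on the joint system $\mathcal{S} \otimes \mathcal{E}$. If that is possible in the first place, then, having no additional knowledge, the observer does not know a priori in what state $|j\rangle$ the probe will be after the measurement and before observation, which leads to permutation symmetry. Let the underlying pure state $\rho_\psi = |\psi\rangle\langle\psi|$, $|\psi\rangle = \sum_k a_k |e_k\rangle$, have coefficients $|a_k|^2 = m_k/m$ with positive integers $m_k$ and $m = \sum_k m_k$ (since the rational numbers $\mathbb{Q}$ are dense in $\mathbb{R}$, the choice of rational $|a_k|^2$ is general enough). The probe $\mathcal{E}$ can be chosen appropriately coarse-grained (this coarse-graining was first introduced in [11] in the context of many-worlds) such that it has exactly $m$ outcome states $|j\rangle$. After the measurement and before observation, the observer is in a situation where, for lack of further information, she will by Laplace’s principle of indifference a priori attribute to each outcome $j$ equal probability $p_j = 1/m$. This attribution is equivalent to maximizing the entropy function $H(p) = -\sum_j p_j \log p_j$. The observer can therefore write down, in the spirit of (1), an average of outcomes
        $$\rho' = \sum_j \frac{1}{m}\, \frac{M_j\, \rho_\psi\, M_j^\dagger}{\mathrm{tr}(M_j\, \rho_\psi\, M_j^\dagger)}, \tag{6}$$
where the $M_j$ are the measurement operators of (5). For our purpose, we now choose the operators $M_{(k,s)}$ to be the scaled projectors $\frac{1}{\sqrt{m_k}} P_k$ onto the basis states $|e_k\rangle$, with $P_k = |e_k\rangle\langle e_k|$ and $s = 1, \dots, m_k$. Note that we have replaced the simple index $j$ by the double index $(k,s)$. This choice is consistent with the demands of a measurement, since the $M_{(k,s)}$ satisfy (5):
        $$\sum_k \sum_{s=1}^{m_k} M_{(k,s)}^\dagger M_{(k,s)} = \sum_k m_k \cdot \frac{1}{m_k}\, P_k = \sum_k P_k = I. \tag{7}$$
Each normalized term in (6) then equals $P_k$, because $M_{(k,s)}\, \rho_\psi\, M_{(k,s)}^\dagger = \frac{|a_k|^2}{m_k} P_k$. Therefore, we can write (6) in the following form
        $$\rho' = \sum_k \sum_{s=1}^{m_k} \frac{1}{m}\, P_k = \sum_k \frac{m_k}{m}\, P_k. \tag{8}$$
Comparing Equation (8) with Equation (1), we see that $\rho'$ can be viewed as a mixed state with probabilities
        $$p_k = \frac{m_k}{m} = |a_k|^2, \tag{9}$$
which is the Born rule.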
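The counting step of this argument can be made concrete. The sketch below assumes rational weights $|a_k|^2 = m_k/m$, enumerates the fine-grained probe outcomes $(k, s)$, assigns each the same probability $1/m$, and sums over $s$. The function names and the example weights are illustrative assumptions; the code reproduces only the counting, not the operator formalism.

```python
from fractions import Fraction
from math import lcm

def coarse_grain(weights):
    """Return (m, [m_k]) such that weights[k] == m_k / m for rational weights."""
    m = lcm(*(w.denominator for w in weights))
    return m, [int(w * m) for w in weights]

def born_from_indifference(weights):
    """Give every fine-grained outcome (k, s) probability 1/m, then sum over s."""
    m, counts = coarse_grain(weights)
    outcomes = [(k, s) for k, m_k in enumerate(counts) for s in range(m_k)]
    uniform = Fraction(1, len(outcomes))  # principle of indifference
    probs = [Fraction(0)] * len(weights)
    for k, _ in outcomes:
        probs[k] += uniform
    return probs

def completeness_holds(weights):
    """Check that sum_s 1/m_k == 1 for every k, the coefficient of P_k in (7)."""
    _, counts = coarse_grain(weights)
    return all(sum(Fraction(1, m_k) for _ in range(m_k)) == 1 for m_k in counts)

# Hypothetical squared moduli |a_k|^2 of a three-level pure state.
weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
m, counts = coarse_grain(weights)
probs = born_from_indifference(weights)
```

For these weights the probe needs $m = 6$ outcome states, split as $3 + 2 + 1$, and the uniform assignment returns the original weights. The sketch requires every $m_k$ to be positive, in line with the assumption that all $a_k$ are non-zero.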
Before we turn to the frequencies, let us have a look at composite systems $\mathcal{H}_1 \otimes \mathcal{H}_2$. (To show the principle, it is sufficient to look at binary systems.) The state $\rho_{12}$ may be mixed or pure, and we can apply the findings in a straightforward way. The single components are given by the partial traces $\rho_1 = \mathrm{tr}_2(\rho_{12})$ and $\rho_2 = \mathrm{tr}_1(\rho_{12})$. If the state has the form $\rho_{12} = \rho_1 \otimes \rho_2$, then we are in the separable case and can apply the results in 2.1 and 2.2 to each individual component, which can be pure or mixed. If $\rho_{12}$ is entangled, then the partial trace always produces a mixed state. When we now consider frequencies, the individual quantum systems may be single or composite; what is important is that they are temporally separable in order to allow for statistics.
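The difference between the product case and the entangled case shows up directly in the purity $\mathrm{tr}(\rho_1^2)$ of the reduced state. The following is a small sketch for two qubits, assuming real amplitudes and exact rational arithmetic; the two example states and the function names are illustrative choices, not taken from the text.

```python
from fractions import Fraction as F

def pure_state(v):
    """Density matrix |v><v| / <v|v> of a real, integer-valued vector v."""
    norm2 = sum(x * x for x in v)
    return [[F(x * y, norm2) for y in v] for x in v]

def partial_trace_second(rho, d1, d2):
    """Trace out the second factor of a (d1*d2) x (d1*d2) density matrix."""
    return [[sum(rho[i * d2 + a][j * d2 + a] for a in range(d2))
             for j in range(d1)] for i in range(d1)]

def purity(rho):
    """tr(rho^2): equals 1 for pure states and is smaller for mixed states."""
    n = len(rho)
    return sum(rho[i][k] * rho[k][i] for i in range(n) for k in range(n))

# Basis order |00>, |01>, |10>, |11>.
product_state = pure_state([1, 1, 0, 0])  # |0> (x) (|0> + |1>) / sqrt(2)
bell_state = pure_state([1, 0, 0, 1])     # (|00> + |11>) / sqrt(2)

reduced_product = partial_trace_second(product_state, 2, 2)
reduced_bell = partial_trace_second(bell_state, 2, 2)
```

Here the reduced state of the product state stays pure, while the reduced state of the Bell state is the maximally mixed state with purity $1/2$.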
  2.3. Frequencies
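The theory so far covers only single trials. Assume there is a density operator $\rho$ and a complete set of projectors $\{P_i\}_{i=1}^{n}$. To find probabilities for a sequence of outcomes $(i_1, \dots, i_N)$ of $N$ experiments on $\rho$ (this is done on $N$ identically prepared systems) we can apply Gleason’s theorem to the tensor product [5]
        $$\rho^{(N)} = \rho^{\otimes N} = \rho \otimes \cdots \otimes \rho \tag{10}$$
to get
        $$p(i_1, \dots, i_N) = \mathrm{tr}\big(\rho^{\otimes N}\, P_{i_1} \otimes \cdots \otimes P_{i_N}\big) = \prod_{l=1}^{N} p_{i_l} \tag{11}$$
with
        $$p_i = \mathrm{tr}(\rho P_i). \tag{12}$$
So the outcomes of repeated measurements are independent and identically distributed (i.i.d.). The probability for outcome $i$ to occur $N_i$ times, $i = 1, \dots, n$, $\sum_i N_i = N$, is given by the multinomial distribution
        $$p(N_1, \dots, N_n) = \frac{N!}{N_1! \cdots N_n!}\, \prod_{i=1}^{n} p_i^{N_i}. \tag{13}$$
The individual counting functions $N_i$ are binomially distributed and hence $\mathbb{E}[N_i] = N p_i$. For large $N$ the averages, $N_i/N$, of the statistical counting functions approach the expectation values and therefore
        $$\lim_{N \to \infty} \frac{N_i}{N} = p_i. \tag{14}$$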
The fact that $N_i/N \to p_i$ is due to the law of large numbers. The frequencies with their implied principle of indifference (14) indeed replicate the probabilities. This is achieved by a strong assumption in the theory, reflected in Equations (11), (13), and (14). It is the independence condition for the multi-trial states $\rho^{\otimes N}$. Actually, it is itself a consequence of the assumption that agents have maximal information about a system of $N$ copies of a quantum state [5]. Independence implies serial permutation symmetry, i.e., the fact that it does not matter in which sequence the results occur. So in the case of multiple trials the theory uses a stronger assumption than permutation symmetry to obtain the compatible frequency-probabilities (14). Can we weaken the assumption?
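The multinomial distribution (13) and the concentration of the relative frequencies $N_i/N$ around $p_i$ can be verified exactly for small $N$. The sketch below enumerates all count vectors by brute force; the probabilities `probs` and the function names are illustrative assumptions, and the enumeration is only practical for small $N$ and few outcomes.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def multinomial_prob(counts, probs):
    """Probability of seeing outcome i exactly counts[i] times in sum(counts) trials."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)
    result = Fraction(coeff)
    for c, p in zip(counts, probs):
        result *= p ** c
    return result

def count_vectors(n_trials, n_outcomes):
    """All tuples (N_1, ..., N_n) of non-negative integers that sum to n_trials."""
    for counts in product(range(n_trials + 1), repeat=n_outcomes):
        if sum(counts) == n_trials:
            yield counts

def frequency_moments(n_trials, probs, i):
    """Exact mean and variance of the relative frequency N_i / N."""
    mean = Fraction(0)
    second = Fraction(0)
    for counts in count_vectors(n_trials, len(probs)):
        weight = multinomial_prob(counts, probs)
        freq = Fraction(counts[i], n_trials)
        mean += weight * freq
        second += weight * freq * freq
    return mean, second - mean * mean

# Hypothetical single-trial Born probabilities p_i = tr(rho P_i).
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
total = sum(multinomial_prob(c, probs) for c in count_vectors(4, 3))
mean_4, var_4 = frequency_moments(4, probs, 0)
mean_8, var_8 = frequency_moments(8, probs, 0)
```

The mean of $N_1/N$ equals $p_1$ for every $N$, while its variance $p_1(1-p_1)/N$ halves when $N$ doubles from 4 to 8, which is the mechanism behind the law of large numbers.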
It is remarkable that, due to the (infinite) de Finetti theorem [12], the assumption of independence can be weakened to that of exchangeability and still allow reasonable statistics. Exchangeability stands for permutation symmetry of the joint distribution of $N$ trials
        $$p^{(N)}(i_1, \dots, i_N) = p^{(N)}(i_{\pi(1)}, \dots, i_{\pi(N)}) \quad \text{for all permutations } \pi,$$
and for consistency from step $N+1$ to $N$, $p^{(N)}(i_1, \dots, i_N) = \sum_{i_{N+1}} p^{(N+1)}(i_1, \dots, i_N, i_{N+1})$. If these conditions are satisfied, an $N$-trial state $\rho^{(N)}$ can be uniquely represented by an integral over product states $\rho^{\otimes N}$ of form (10) by means of a measure $\mu$ on $S(\mathcal{H})$,
        $$\rho^{(N)} = \int_{S(\mathcal{H})} \rho^{\otimes N}\, d\mu(\rho). \tag{15}$$
(The measure $\mu$ belongs to the second category of probabilities.) For states $\rho^{(N)}$ of form (15) the statistical approach (14) works with some suitable adjustments (while the distributions are directly integral averages over the product-state distributions, a law of large numbers holds only conditional on a suitable $\sigma$-algebra [13]). Whether we work with states of maximal information (10) or states of form (15), in any case permutation symmetry and the principle of indifference are key features of the frequencies (14) derived in multiple trials.
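A classical toy model illustrates how exchangeability is weaker than independence. The sketch below assumes a two-outcome experiment whose joint distribution is a mixture of two i.i.d. product distributions, i.e., a discrete measure $\mu$ with two atoms; this corresponds to the commuting (diagonal) special case of an integral representation over product states. The mixture weights and the function names are illustrative assumptions.

```python
from fractions import Fraction as F
from itertools import permutations, product

# Discrete measure mu: pairs (weight, probability of outcome 1 in a single trial).
MIXTURE = [(F(1, 2), F(1, 4)), (F(1, 2), F(3, 4))]

def joint(seq, mixture=MIXTURE):
    """Joint probability of an outcome sequence under the mixture of i.i.d. laws."""
    total = F(0)
    for weight, p1 in mixture:
        term = weight
        for x in seq:
            term *= p1 if x == 1 else 1 - p1
        total += term
    return total

def is_exchangeable(n):
    """Permutation symmetry of the n-trial distribution."""
    return all(joint(seq) == joint(perm)
               for seq in product((0, 1), repeat=n)
               for perm in permutations(seq))

def is_consistent(n):
    """Summing out trial n+1 recovers the n-trial distribution."""
    return all(joint(seq) == joint(seq + (0,)) + joint(seq + (1,))
               for seq in product((0, 1), repeat=n))

p_one = joint((1,))
p_one_one = joint((1, 1))
```

In this toy model the joint distribution is permutation-symmetric and consistent, yet `p_one_one` differs from `p_one * p_one`, so the trials are exchangeable without being independent.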