Review

Quantum Models of Classical World

Institute for Theoretical Physics, University of Berne, Sidlerstrasse 5, CH-3012 Bern, Switzerland
Entropy 2013, 15(3), 789-925; https://doi.org/10.3390/e15030789
Submission received: 21 December 2012 / Revised: 11 February 2013 / Accepted: 17 February 2013 / Published: 27 February 2013
(This article belongs to the Special Issue Maximum Entropy and Bayes Theorem)

Abstract

This paper is a review of our recent work on three notorious problems of non-relativistic quantum mechanics: realist interpretation, quantum theory of classical properties, and the problem of quantum measurement. Considerable progress has been achieved, based on four distinct new ideas. First, objective properties are associated with states rather than with values of observables. Second, all classical properties are selected properties of certain high entropy quantum states of macroscopic systems. Third, registration of a quantum system is strongly disturbed by systems of the same type in the environment. Fourth, detectors must be distinguished from ancillas and the states of registered systems are partially dissipated and lost in the detectors. The paper has two aims: a clear explanation of all new results and a coherent and contradiction-free account of the whole of quantum mechanics, including all necessary changes to its current textbook version.

Contents
1 Introduction
I Corrected language of quantum mechanics
2 States, observables and symmetries
 2.1 States
 2.2 Observables
 2.3 Galilean group
3 Composition of quantum systems
 3.1 Composition of heterogeneous systems
 3.2 Composition of identical systems
 3.3 State reduction
II The models
4 Quantum models of classical properties
 4.1 Modified correspondence principle
 4.2 Maximum entropy assumption in classical mechanics
 4.3 Classical ME packets mechanics
 4.4 Quantum ME packets
 4.5 Classical limit
 4.6 A model of classical rigid body
 4.7 Joint measurement of position and momentum
5 Quantum models of preparation and registration
 5.1 Old theory of measurement
 5.2 New theory of measurement
 5.3 Models of direct registrations
 5.4 Comparison with other changes of separation status
6 Conclusions
Acknowledgements
References

1. Introduction

Quantum mechanics was originally developed for the world of atoms and electrons, where it has been very successful. The understanding of the microscopic world, let us call it “quantum world”, that has developed from this success, seems to be very different or even incompatible with the understanding of the everyday world of our immediate experience, which we can call “classical world”. This is unsatisfactory because one of the strongest feelings of a modern physicist is the belief in the unity of knowledge. It is even paradoxical because the bodies of the everyday world are composed of atoms and electrons, which ought to be described by quantum mechanics.
We can distinguish three problems that are met on the way from quantum to classical physics. Classical theories such as Newtonian mechanics, Maxwellian electrodynamics or thermodynamics are objective, in the sense that the systems they are studying can be considered as real objects, and the values of their observables, such as position, momentum, field strengths, charge current, temperature, etc., can be ascribed to the systems independently of whether they are observed or not. If we are going to construct quantum models of classical systems, the question naturally arises how such an objective world can emerge from quantum mechanics.
Indeed, in quantum mechanics, the values obtained by most registrations on microsystems (technically, they are values of observables) cannot be assumed to exist before the registrations, that is, to be objective properties of the microsystem on which the registration is made. An assumption of this kind would lead to contradictions with other assumptions of standard quantum mechanics and, ultimately, with observable facts (contextuality [1,2], Bell inequalities [3], Hardy impossibilities [4], Greenberger–Horne–Zeilinger equality [5]).
This property of quantum mechanical observables has led to the growing popularity of various forms of weakened realism. For example, according to Bohr, realism applies only to the results of quantum measurements, which can be described by the relation between objective classical properties of real classical preparation and registration apparatuses. Various concepts of quantum mechanics itself, such as electron, wave function, observable, etc., do not possess any direct counterparts in reality; they are just instruments to keep order in our experience and to make it ready for application. A rigorous account of this kind of weakened realism is [6]. A similar view is the so-called “statistical” or “ensemble” interpretation [7]. It refuses to attribute any kind of reality to the quantum-mechanical probability amplitudes either at the microscopic or macroscopic level. According to this view, the amplitudes are simply intermediate symbols in a calculus whose only ultimate function is to predict the statistical probability of various directly observed macroscopic outcomes, and no further significance should be attributed to them. Another example of even weaker realism is the “constructive empiricism” of van Fraassen [8], which proposes to take only empirical adequacy, but not necessarily “truth”, as the goal of science. Even the reality of classical systems and their properties is then only apparent. The focus is on describing “appearances” rather than how the world really is [9]. Thus, quantum mechanics does not seem to allow a realist interpretation. Let us call this the Problem of Realist Interpretation.
The second problem for construction of quantum models of classical systems can be called Problem of Classical Properties. This is the apparent absence of quantum superpositions, as well as the robustness, of classical properties (for detailed discussion, see Ref. [10]). Clearly, classical properties such as a position do not allow linear superposition. Nobody has ever seen a table to be in a linear superposition of being simultaneously in the kitchen as well as in the bedroom. Also, observing the table in the kitchen does not shift it to the bedroom, while quantum registration changes properties of registered systems. That is roughly what is meant when one says that classical properties are robust (for a better definition, see Ref. [10]).
Finally, there is a serious problem at the quantum–classical boundary. For quantum measurements, evidence suggests the assumption that the registration apparatus always is in a well-defined classical state at the end of any quantum measurement indicating just one value of the registered observable. This is called objectification requirement [11]. However, if the initial state of the registered system is a linear superposition of different eigenvectors of the observable, then the linearity of Schrödinger equation implies that the end state of the apparatus is also a linear superposition of eigenstates of its pointer observable. (This holds, strictly speaking, only if the measured system S and the apparatus A constitute together an isolated system. But if they are not isolated from their environment E then the composite S + A + E can be considered as isolated and the difficulty reappears.) Thus, it turns out that realism in most cases leads to contradictions with the postulate of linear quantum evolution, see the analysis in [10]. Let us call this Problem of Quantum Measurement.
There is a vast literature dealing with the three problems, containing many different proposals from various variants of weakened realism to radical changes of quantum mechanics (for clear reviews see [11,12,13]). It has also been proposed that quantum mechanics is based on a kind of approximation that ceases to be valid for macroscopic systems, see e.g. [10]. An even more radical approach is to look for the way quantum mechanics could be obtained from a kind of deeper theory with classical character (see e.g. the proceedings [14]). These or other attempts in the literature do not seem to lead to a satisfactory solution. We shall not discuss this work because we shall look for the solution in a different and utterly novel direction. Our approach starts directly from quantum mechanics as it has been formulated by Bohr, Born, Dirac, Heisenberg, von Neumann, Pauli and Schrödinger. Our analysis has shown that they have delivered enough tools to deal with the problems. The aim of this review is to prove that our proposals of solutions to the three problems form a logically coherent whole with the rest of quantum mechanics.
Let us briefly describe the ideas from which our approach starts. First, it is true that values of observables are not objective. However, in [15], we have shown that there are other observable properties of quantum systems that can be ascribed to the systems without contradictions and there is a sufficient number of them to describe the state of the systems completely. We shall introduce and discuss this approach in general terms in Section 1.0.3 and then, in technical detail, in Section 2.1.2.
Second, many attempts to derive classical theories from quantum mechanics are based on quantum states of minimal uncertainty (Gaussian wave packets, coherent states, etc.). But a sharp classical trajectory might be just a figment of imagination because each measurement of a classical trajectory is much fuzzier than the minimum quantum uncertainty. Thus, the experimental results of classical physics do not justify any requirement that we have to approximate absolutely sharp trajectories as accurately as possible. This was observed as early as 1922 by Exner [16] and developed further in 1955 by Born [17]. Taking this as a starting point, one must ask next what the quantum states that correspond to realistically fuzzy classical ones are. In [18], a new assumption about such quantum states has been formulated and studied. We shall discuss it in Section 4.
Third, in [19] it is shown that the measurement of observables such as a position, momentum, spin, angular momentum, energy, etc. on a quantum system must be strongly disturbed by all existing systems of the same type, at least according to the standard quantum mechanics. To eliminate the disturbance, and to give an account of what is in fact done during a quantum measurement process, the standard theory of observables must be rewritten. We shall do this in Section 3.2 on identical quantum systems.
Fourth, the current theory of quantum measurement describes a registration apparatus by a microscopic quantum system, the so-called pointer, and assumes that a reading of the apparatus is an eigenvalue of a pointer observable. For example, Ref. [20], p. 64, describes a measurement of energy eigenvalues with the help of scattering process similar to Stern–Gerlach experiment, and it explicitly states:
We can consider the centre of mass [of a microscopic system] as a ‘special’ measuring apparatus...
Similarly, Ref. [21], p. 17 describes Stern–Gerlach experiment:
The microscopic object under investigation is the magnetic moment μ of an atom. … The macroscopic degree of freedom to which it is coupled in this model is the centre of mass position r. … I call this degree of freedom macroscopic because different final values of r can be directly distinguished by macroscopic means, such as the detector. … From here on, the situation is simple and unambiguous, because we have entered the macroscopic world: The type of detectors and the detail of their functioning are deemed irrelevant.
Paradoxically, this notion of measurement apparatus (mostly called “meter”) is quite useful for the very precise modern quantum experiments, such as non-demolition measurements or weak measurements, see, e.g., Refs. [22,23]. Quite generally, these experiments utilize auxiliary microscopic systems called ancillas. If the pointer is interpreted as an ancilla, then the old theory works well at least for the interaction of the measured system with the ancilla. However, the fact that ancillas themselves must be registered by macroscopic detectors is “deemed irrelevant”. A more detailed analysis is given in Section 5.1.
In fact, detectors are very special macroscopic systems. We shall show that their role is to justify the state reduction (collapse of wave function) and to define the so-called “preferred basis” (see [13]) that determines the form of state reduction. Further, the reduction can be assumed to be a process within detectors at the time of the detector’s macroscopic signal. We shall discuss the design and role of detectors in Section 5.
The present paper has grown from review [24] by adding new results, by correcting many minor errors and by explaining many points in a clearer and more coherent way. It is a systematic exposition of the non-relativistic quantum mechanics (the space and time structure is everywhere assumed to be Newtonian).

1.0.1. Examples of quantum systems 

To explain what quantum mechanics is about, this section describes some well-known quantum systems following Ref. [24]. It also introduces some general notions, such as microsystem, macrosystem, type of system and structural property following Ref. [15].
Quantum mechanics is a theory that describes a certain class of properties of a certain class of objects, in the same way as any other physical theory does. For example, among others, Newtonian mechanics describes bodies that can be considered as point-like in a good approximation and studies the motion of the bodies.
Quantum systems that we shall consider are photons, electrons, neutrons and nuclei, which we call particles, and systems containing some number of particles, such as atoms, molecules and macroscopic systems, which are called composite. (One can ask whether there is a non-relativistic limit of photons. In one such limit, photons may move with infinite velocity and therefore their position does not need to be very well defined. In another, photons may be represented by a classical electromagnetic wave.) Of course, neutrons and nuclei themselves are composed of quarks and gluons, but non-relativistic quantum mechanics can and does start from some phenomenological description of neutrons and nuclei.
Let us call particles and quantum systems that are composed of a small number of particles microsystems. They are extremely tiny and they mostly cannot be perceived directly by our senses. We can observe directly only macroscopic quantum systems that are composed of very many particles. (It is true that the eye can recognize signals of just several photons, but it can be viewed as a quantum registration apparatus with macroscopic parts and only these are observed “directly”.) “Very many” is not too different from $10^{23}$ (the Avogadro number). Let us call these macrosystems. Some properties of most macroscopic systems obey classical theories. For example, shape and position of my chair belong to Euclidean geometry, its mass distribution to Newtonian mechanics, chemical composition of its parts to classical chemistry and thermodynamic properties of the parts such as phase or temperature to phenomenological thermodynamics. Such properties are called ‘classical’. Thus, properties of microsystems can only be observed via classical properties of macrosystems; if microsystems interact with them, then this interaction changes their classical properties.
Microsystems are divided into types, such as electrons, hydrogen atoms, etc. Systems of one type are not distinguishable from each other in a sense not existing in classical physics. Systems of the same type are often called identical. Microsystems exist always in a huge number of identical copies. The two properties of microsystems, viz. 1) their inaccessibility to direct observations and 2) utter lack of individuality that is connected with the existence of a huge number of identical copies, make them rather different from classical systems or “things”. Each classical system can be observed directly by humans (in principle: for example, the distant galaxies) and each can be labelled and distinguished from other classical systems, because it is a quantum system composed of a huge number of particles and hence it is highly improbable that it has a kin of the same type in the world.
Objective properties that are common to all microsystems of the same type will be called structural. Thus, each particle has a mass, spin and electric charge. For example, the mass of the electron is about 0.5 MeV, the spin 1/2 and the charge about $10^{-19}$ C. In non-relativistic quantum mechanics, any composite system consists of definite numbers of different particles with their masses, spins and charges. (We do not view quasiparticles as particles but as auxiliary entities useful for description of the spectrum of some composite systems.) E.g., a hydrogen atom contains one electron and one proton (nucleus). The composition of a system is another structural property. The structural properties influence the dynamics of quantum systems; the way they do it and what dynamics is will be explained later. Only then will it be clear what the meaning of these parameters is and that the type of each system can be recognized if its dynamics is observed. When we know more about dynamics, further structural properties will emerge.
Structural properties are objective in the sense that they can be assumed to exist before and independently of any measurement. Such assumption does not lead to any contradictions with standard quantum mechanics and is at least tacitly made by most physicists. In fact, the ultimate goal of practically all experiments is to find structural properties of quantum systems.
From the formally logical point of view, all possible objective properties of given kind of objects ought to form a Boolean lattice. The structural properties satisfy this condition: systems with a given structural property form a subset of all systems. These subsets are always composed of whole type-classes of quantum systems. Clearly, the intersection of two such subsets and the complement of any such subset is again a structural property.
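The lattice structure described above can be illustrated by a minimal sketch in which a structural property is represented by the set of system types that possess it. The catalogue of types and the two attributes used here are hypothetical and chosen only for illustration; they are not taken from the text.

```python
# Hypothetical catalogue of system types; the entries are illustrative only.
SYSTEM_TYPES = {
    "electron": {"charged": True, "composite": False},
    "neutron": {"charged": False, "composite": False},
    "hydrogen atom": {"charged": False, "composite": True},
    "helium ion He+": {"charged": True, "composite": True},
}
ALL_TYPES = frozenset(SYSTEM_TYPES)


def structural_property(attribute):
    """Return the set of system types that possess the given attribute."""
    return frozenset(t for t, attrs in SYSTEM_TYPES.items() if attrs[attribute])


def complement(prop):
    """Return the property 'not prop' as the set of all remaining types."""
    return ALL_TYPES - prop


charged = structural_property("charged")
composite = structural_property("composite")

# Meet, join and complement are again sets of whole type-classes.
charged_and_composite = charged & composite
charged_or_composite = charged | composite
neutral = complement(charged)

# Two identities that hold in every Boolean lattice.
de_morgan_holds = complement(charged & composite) == (
    complement(charged) | complement(composite)
)
distributive_holds = (charged & (composite | neutral)) == (
    (charged & composite) | (charged & neutral)
)

print(sorted(charged_and_composite), sorted(neutral))
print(de_morgan_holds, distributive_holds)
```

Because every property is a subset of one fixed set of types, the Boolean identities are inherited from ordinary set algebra. No such representation by subsets is available for values of quantum observables.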
Structural properties characterize a system type completely but they are not sufficient to determine the dynamics of individual systems.

1.0.2. Examples of quantum experiments 

The topic of this section plays an important role in understanding quantum mechanics. Specific examples of typical experiments will be given in some detail following Ref. [24]. In this way, we gain access to the notions of preparation and registration, which are assumed by the basic ideas of our realist interpretation of quantum mechanics. Describing the experiments, we shall already use some of the language of the interpretation, which will be introduced and motivated in this way.
Let us first consider experiments with microsystems that are carried out in laboratories. Such an experiment starts at a source of microsystems that are to be studied. Let us give examples of such sources.
  • Electrons. One possible source (called field emission, see e.g. Ref. [25], p. 38) consists of a cold cathode in the form of a sharp tip and a flat anode with an aperture in the middle at some distance from the cathode, in a vacuum tube. The electrostatic field of, say, a few kV will enable electrons to tunnel from the metal and form an electron beam of about $10^{7}$ electrons per second through the aperture, with a relatively well-defined average energy.
  • Neutrons can be obtained through nuclear reaction. This can be initiated by charged particles or gamma rays that can be furnished by an accelerator or a radioactive substance. For example, the so-called Ra-Be source consists of finely divided RaCl$_2$ mixed with powdered Be, contained in a small capsule. Decaying Ra provides alpha particles that react with Be. The yield for 1 mg Ra is about $10^{4}$ neutrons per second with a broad energy spectrum from small energies to about 13 MeV. The emission of neutrons is roughly spherically symmetric, centred at the capsule.
  • Atoms and molecules. A macroscopic specimen of the required substance in gaseous phase at certain temperature can be produced, e.g., by an oven. The gas is in a vessel with an aperture from which a beam of the atoms or molecules emerges.
Each source is defined by an arrangement of macroscopic bodies of different shapes, chemical compositions, temperatures and by electric or magnetic fields that are determined by their macroscopic characteristics, such as average field intensities: that is, by their classical properties. These properties determine uniquely what type of microsystem is produced. Let us call this description empirical. It is important that the classical properties defining a source do not include time and position so that the source can be reproduced later and elsewhere. We call different sources that are defined by the same classical properties equivalent. Empirical description is sufficient for reproducibility of experiments but it is not sufficient for understanding of how the sources work. If a source defined by an empirical description is set into action, we have an instance of the so-called preparation.
Quantum mechanics assumes that these are general features of all sources, independently of whether they are arranged in a laboratory by humans or occur spontaneously in nature. For example, classical conditions at the centre of the Sun (temperature, pressure and plasma composition) lead to emission of neutrinos that reach the space outside the Sun.
Often, a source yields very many microsystems that are emitted in all possible directions, a kind of radiation. We stress that the detailed structure of the radiation as it is understood in classical physics, that is where each individual classical system exactly is at different times, is not determined for quantum systems and the question even makes no sense. Still, a fixed source gives the microsystems that originate from it some properties. In quantum mechanics, these properties are described on the one hand by the structural properties that define the prepared type, on the other, e.g., by the so-called quantum state. The mathematical entity that is used in quantum mechanics to describe a state (the so-called state operator) will be explained in Section 2.1. To determine the quantum state that results from a preparation with a given empirical description in each specific case requires the full formalism of quantum mechanics. Hence, we postpone this point to Section 5.
After arranging the source, another stage of the experiment can start. Generally, only a very small part of the radiation from a source has the properties that are needed for the planned experiment. The next step is, therefore, to select the part and to block off the rest. This is done by the so-called collimator, mostly a set of macroscopic screens with apertures and macroscopic electric or magnetic fields. For example, the electron radiation can go through an electrostatic field that accelerates the electrons and through electron-microscope “lenses”, each followed by a suitable screen. A narrow part of the original radiation, a beam, remains. Another example is a beam of molecules obtained from an oven. It can also contain parts of broken molecules including molecules with different degrees of ionisation. The part with suitable composition can then be selected by a mass spectrometer and the rest blocked off by a screen. Again, the beam resulting from a raw source and a collimator consists of individual quantum objects with a well-defined type and quantum state. The process of obtaining these individual quantum objects can be viewed as a second stage of the preparation. Again, there is an empirical description that defines an equivalence class of preparations and equivalent preparations can be reproduced.
The final beam can be characterized not only by the quantum state of individual objects but also by its approximate current, that is, how many individual objects it yields per second. The beam can be made very thin. For example, in the electron-diffraction experiment [26], the beam that emerges from the collimator represents an electric current of about $10^{-16}$ A, or $10^{3}$ electrons per second. As the approximate velocity of the electrons and the distance between the collimator and the detector are known, one can estimate the average number of electrons that are there simultaneously at each time. In the experiment, it is less than one. One can understand in this way that it is an experiment with individual electrons.
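The estimate in the preceding paragraph can be reproduced by a short calculation. Only the beam current is taken from the text. The kinetic energy of 50 keV and the flight path of 1 m are assumed values, inserted purely for illustration, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19     # elementary charge in coulombs (exact SI value)
C_LIGHT = 299_792_458.0        # speed of light in m/s (exact SI value)
ELECTRON_REST_KEV = 510.99895  # electron rest energy in keV


def electrons_per_second(current_amperes):
    """Average number of electrons passing per second for a given beam current."""
    return current_amperes / E_CHARGE


def electron_speed(kinetic_energy_kev):
    """Relativistic speed in m/s of an electron with the given kinetic energy."""
    gamma = 1.0 + kinetic_energy_kev / ELECTRON_REST_KEV
    return C_LIGHT * math.sqrt(1.0 - 1.0 / gamma**2)


def mean_number_in_flight(current_amperes, kinetic_energy_kev, flight_length_m):
    """Average number of electrons between collimator and detector at one time."""
    transit_time = flight_length_m / electron_speed(kinetic_energy_kev)
    return electrons_per_second(current_amperes) * transit_time


BEAM_CURRENT = 1e-16       # amperes, the value quoted in the text
ASSUMED_ENERGY_KEV = 50.0  # ASSUMPTION: not given in the text
ASSUMED_PATH_M = 1.0       # ASSUMPTION: not given in the text

rate = electrons_per_second(BEAM_CURRENT)
n_in_flight = mean_number_in_flight(BEAM_CURRENT, ASSUMED_ENERGY_KEV, ASSUMED_PATH_M)
print(f"rate = {rate:.0f} electrons/s, mean number in flight = {n_in_flight:.1e}")
```

The rate comes out at several hundred electrons per second, which agrees with the quoted order of magnitude. Under the assumed energy and path length the mean number of electrons in flight is many orders of magnitude below one, so the conclusion does not depend sensitively on the assumed values.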
Next, the beam can be led through a further arrangement of macroscopic bodies and fields. For example, to study the phenomenon of diffraction of electrons, each electron can be scattered by a thin slab of crystalline graphite or by an electrostatic biprism interference apparatus. The latter consists of two parallel plates and a wire in between with a potential difference between the wire on the one hand and the plates on the other. An electron object runs through between the wire and both the left and right plate simultaneously and interferes with itself afterwards (for details see [26]). Again, the beam from the graphite or the biprism can be viewed as prepared by the whole arrangement of the source, collimator and the interference apparatus. This is another example of a reproducible preparation procedure.
Finally, what results from the original beam must be made directly perceptible by its interaction with another system of macroscopic bodies and fields. This process is called registration and the system registration apparatus. The division of an experimental arrangement into preparation and registration parts is not unique. For example, in the electron diffraction experiment, one example of a registration apparatus begins after the biprism interference, another one includes also the biprism interference apparatus. Similarly to preparations, the registrations are defined by an empirical description of their relevant classical properties in such a way that equivalent registrations can be reproduced.
An important, even defining, property of a registration is that it is applicable to an individual quantum system and that each empirical result of a registration is caused by just one individual system. In the above experiment, this assumption is made plausible by the extreme thinning of the beam, but it is adopted in general even if the beam is not thin.
An empirical description of a registration apparatus can determine a quantum mechanical observable in the same way as a preparation determines a quantum state. Again, more theory is needed to understand what the mathematical entities describing observables are (the so-called positive operator valued measures, see Section 2.2) and how they are related to registration devices. Each individual registration performed by the apparatus, i.e., registration performed on a single quantum object, then gives some value of the observable. The registration is not considered to be finished without the registration apparatus having given a definite, macroscopic and classical signal. This is the objectification requirement.
Detectors are a part of registration apparatuses for microsystems. At the empirical level, a detector is determined by an arrangement of macroscopic fields and bodies, as well as by the chemical composition of its sensitive matter [27]. For example, in the experiment [26] on electron diffraction, the electrons coming from the biprism interference apparatus are absorbed in a scintillation film placed transversally to the beam. An incoming electron is thus transformed into a light signal. The photons are guided by a parallel system of fibres to a photo-cathode. The resulting (secondary) electrons are accelerated and led to a micro-channel plate, which is a system of parallel thin photo-multipliers. Finally, a system of tiny anodes makes it possible to record the time and the transversal position of the small flash of light in the scintillation film.
In this way, each individual electron coming from the biprism is detected at a position (two transversal coordinates determined by the anodes and one longitudinal coordinate determined by the position of the scintillation film). Such a triple x of numbers is the result of each registration and the value of the corresponding observable, which is a coarsened and localized position operator q in this case (see Section 3.2.3). The time of the arrival at each anode can also be approximately determined. Thus, each position obtains a certain time.
For our theory, the crucial observation is the following. When an electron that has been prepared by the source and the collimator hits the scintillation film, it is lost as an individual system. Indeed, there is no property that would distinguish it from other electrons in the scintillation matter. Thus, this particular registration is a process inverse to preparation: while the preparation has created a quantum system with certain individuality, the registration entails a loss of the individuality. Our theory of quantum registration in Section 5 will make precise and generalize this observation.
If we repeat the experiment with individual electrons many times and record the transversal position coordinates, the gradual formation of the electron interference pattern can be observed. The pattern can also be described by some numerical values. For example, the distance of adjacent maxima and the direction of the interference fringes can be such values. Still, the interference pattern is not a result of one but of a whole large set of individual registrations.
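The gradual formation of the pattern from individual registrations can be illustrated by a small sampling sketch. It is not a quantum mechanical calculation. Each hit position is simply drawn from an assumed idealized two-beam intensity profile (a squared cosine under a Gaussian envelope, in units of the fringe spacing), and all function names and parameter values are hypothetical.

```python
import math
import random


def fringe_density(x, fringe_spacing=1.0, envelope_width=4.0):
    """Unnormalised hit density at x, with values between 0 and 1 (assumed profile)."""
    envelope = math.exp(-((x / envelope_width) ** 2))
    return envelope * math.cos(math.pi * x / fringe_spacing) ** 2


def register_one_electron(rng, half_width=8.0):
    """Draw a single hit position by rejection sampling from fringe_density."""
    while True:
        x = rng.uniform(-half_width, half_width)
        if rng.random() < fringe_density(x):
            return x


def count_between(hits, lower, upper):
    """Number of recorded hits with lower <= x < upper."""
    return sum(1 for x in hits if lower <= x < upper)


rng = random.Random(0)  # fixed seed so that the run is reproducible
hits = [register_one_electron(rng) for _ in range(5000)]

# Compare a window around the central maximum with one around the next minimum.
for n in (10, 100, 5000):
    bright = count_between(hits[:n], -0.25, 0.25)
    dark = count_between(hits[:n], 0.25, 0.75)
    print(f"{n:5d} registrations: {bright:4d} near maximum, {dark:4d} near minimum")
```

With only a handful of registrations the counts carry little information about the fringes. The contrast between the windows around a maximum and a minimum becomes clear only when a large set of individual registrations has accumulated.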
In some sense, each electron must be spread out over the whole plane of the scintillation detector after coming from the biprism but the excitation of the molecules in the detector matter happens always only within a tiny well-localized piece of it, which is different for different electrons. (This is what is sometimes called “the collapse of wave function”.) Thus, one can say that the interference pattern must be encoded in each individual electron, even if it is not possible to obtain the property by a single registration. The interference pattern can be considered as an objective property of the individual electrons prepared by the source, the collimator and the biprism interference apparatus. The interference pattern is not a structural property: preparations that differ in the voltage at some stage of the experiment (e.g., the accelerating field in the collimator or the field between the wire and the side electrodes in the biprism interference apparatus, etc.) will give different interference patterns. We call such objective properties dynamical. On the other hand, the hitting position of each individual electron cannot be considered as its objective property. Such an assumption would lead to contradictions with results of other experiments. The position must be regarded as created in the detection process.
It is a double-slit experiment, a special kind of which is described above, that provides a strong motivation for considering an individual quantum particle as an extended object of sorts. Without any mathematical description, it is already clear that such an extended character of electrons could offer an explanation for the stability of some states of electrons orbiting atomic nuclei. Indeed, a point-like electron would necessarily have a time-dependent dipole moment and lose energy by radiation. However, an extended electron can define a stationary charge current around the nucleus.
Some structural properties can be measured directly by a registration (on individual quantum systems) and their values are real numbers. For example, mass can be measured by a mass spectrometer. Such structural properties can be described by quantum observables (see Section 2.2.5). (These observables must commute with all other observables ([6], IV.8), and can be associated with the so-called superselection rules, see e.g. [11].) However, there are also structural properties, such as cross sections or branching ratios, that, like the interference pattern, cannot be directly measured on individual objects. They cannot be described by observables.

1.0.3. Realist Model Approach to quantum mechanics 

In the previous section, describing specific experiments, we have used certain words that are avoided in careful textbooks of quantum mechanics such as objective properties or quantum object or
…An electron object runs through between the wire and both the left and right plate simultaneously and interferes with itself afterwards …
The electron is viewed here as a real object that is extended over the whole width of the biprism apparatus. After this intuitive introduction, we give now a general and systematic account of our realist interpretation.
A realist interpretation of a physical theory is a more subtle question than whether the world exists for itself rather than being just a construction of our mind. This question can always be answered in the positive without any danger of falsification. However, every physical theory introduces some general, abstract concepts. For example, Newtonian mechanics works with mass points, their coordinates, momenta and their dynamical trajectories. The truly difficult question is whether such concepts possess any counterparts in the real world. On the one hand, it seems very plausible today that mass points and their sharp trajectories cannot exist and are at most some idealizations. On the other hand, if we are going to understand a real system, such as a snooker ball moving on a table, then we can work with a construction that uses these concepts but is more closely related to reality. For example, we choose a system of infinitely many mass points forming an elastic body of a spherical shape and calculate the motion of this composite system using Newton’s laws valid for its constituent points. Then, some calculated properties of such a model can be compared with interesting observable properties of the real system. Thus, even if the general concepts of the theory do not directly describe anything existing, a suitable model constructed with the help of the general concepts can account for some aspects of a real system.
Motivated by this observation, we shall divide any physical theory into two parts. First, there is a treasure of successful models. Each model gives an approximate representation of some aspects of a real object [28]. Historically, models form a primary and open part of the theory. For example, in Newtonian mechanics, the solar system was carefully observed by Tycho Brahe and then its model was constructed by Kepler. Apparently, Newton was able to calculate accelerations and, doing so for Kepler trajectories, he might have discovered that they pointed towards the Sun. Perhaps this led to the Second Law. The hydrogen atom had a similar role in quantum mechanics.
Second, there is a general language part. It contains the mathematical structure of state space, conditions on trajectories in the state space, their symmetries and the form of observables [8]. It is obtained by generalization from the study of models and is an instrument of further model construction and of model unification. For example, in Newtonian mechanics, the state space is a phase space, the conditions on trajectories are given by the general structure of Newton’s dynamical equations, symmetries are Galilean transformations and observables are real functions on the phase space.
A model is constructed as a particular subset of trajectories in a particular state space as well as a choice of important observables. For example, to describe the solar system, assumptions such as the number of bodies, their point-like form, their masses, the form of gravitational force and certain class of their trajectories can be made if we want to construct a model. The observed positions of the planets would then match the theoretical trajectories of the model within certain accuracy. Thus, a model consists of a language component on the one hand, and an identifiable-object component on the other. The language component always contains simplifying assumptions, always holds only for some aspects of the associated object and only within some approximation. The approximation that is referred to is bounded from above by the accuracy of performed measurements. This is measurable and can be expressed numerically by statistical variances.
Clearly, the models of a given theory are not predetermined by the general part but obtained in the historical evolution and dependent on observation of real objects. On the one hand, the general part can also be used to construct language components of models that do not have any real counterparts. On the other hand, the model part is steadily evolving and never closed. For example, a satisfactory quantum model of high temperature superconductivity is not yet known. This is why the treasure of successful models is an independent and, in fact, the basic part of any theory.
Such philosophy forms a first step of what we call our Realist Model Approach to quantum mechanics. Thus, the Approach lies somewhat within the recent trend of the philosophy of science that defines a theory as a class of models (see, e.g., [8,28,29,30,31]). It can be said that it combines ideas of the constructive realism of Ronald Giere (I enthusiastically adopt Giere’s view that philosophy of science is to be removed from the realm of philosophy and put into the realm of cognitive sciences) with van Fraassen’s notions of state space and symmetries [8] as a basis of the general language part. (Van Fraassen also applied his constructive empiricism to quantum mechanics [32] and, adding some further ideas, arrived at his own, the so-called “modal interpretation” of it. To prevent misunderstanding, it must be stressed that the account and interpretation of quantum mechanics described here is different from van Fraassen’s.) It is important that constructive realism is immune to the usual objections against naive realism. In addition, we add some further importance to the general part by recognizing its unification role. The effort at unification is without any doubt a salient feature of scientists that can be observed at any stage of research. For example, Newton was admired for his unification of such different phenomena as apples falling from trees and the Moon moving in the sky. Today’s endeavor to unify the theories of quantum fields and gravity is a very well observable historical fact. From the point of view of Giere, this bent might, perhaps, be understood as one of the cognitive instincts.
The focus on models allows us to define the task of quantum-explaining the classical world in the following way. Instead of trying to find a direct relation between the general language parts of, say, quantum and Newtonian mechanics or a universal correspondence between states of Newtonian and quantum mechanics such as the Wigner–Weyl–Moyal map [33], p. 85, one ought to build quantum models of real macroscopic systems and their aspects for which there are models in Newtonian mechanics, striving for approximate agreement between the two kinds of models on those aspects for which the Newtonian models are successful. For example, we shall not attempt to obtain from quantum mechanics the sharp trajectories, which are a concept of the general language part of Newtonian mechanics, but rather try to model the observed fuzzy trajectories of specific classical systems (Section 4), or to analyse different specific registration apparatuses first and then try to formulate some features common to all (Section 5).
However, the Realist Model Approach is not as easily applied to quantum mechanics as it is to Newtonian mechanics. A question looms large at the very start: What is a real quantum object? Of course, such objects are met “empirically” in preparations and registrations. However, we would like to subscribe to the notion that the language component of a model must ascribe to its real object a sufficient number of objective properties. Objective means that the properties can be ascribed to the object alone. Sufficient means that the dynamics of any object as given by its model is uniquely determined by initial data defined by values of a minimal set of its objective properties. For example, in Newtonian mechanics, the values of coordinates and momenta determine a unique solution of Newton’s dynamical equations.
In Newtonian mechanics, coordinates and momenta are observables, and values of observables can be viewed as objective without any danger of contradictions. Using this analogy, one asks: Can values of observables be viewed as objective properties of quantum systems? As is well known, the answer is negative (see Section 2.2.4.). If we assume that values of observables are the only properties of quantum systems that are relevant to their reality, then there are no real quantum systems. For rigorous no-go theorems concerning such objective properties see, e.g., Ref. [6].
Our approach to properties of quantum systems is therefore different from those that can be found in literature. First, we extend the notion of properties to include complex ones in the following sense [24]:
  • Their values may be arbitrary mathematical entities (sets, maps between sets, etc.). For example, the Hamiltonian of a closed quantum system involves a relation between energy and some other observables of the system. This relation is an example of such a complex property.
  • Their values do not need to be directly obtained by individual registrations. For example, to measure a cross-section a whole series of scattering experiments must be done. Thus, their values do not necessarily possess probability distribution but may be equivalent to, or derivable from, probability distributions.
The second point is usually not clearly understood and we must make it more precise. A real system of Newtonian mechanics is sufficiently robust so that we can do many experiments with it and perform many measurements on it without changing it. Moreover, any such system is sufficiently different from other systems anywhere in the world (even two cars from one factory series can be distinguished from each other). Any physical experiment on a given classical system can then be repeated many times and only then can the results be considered reliable. The results are then formulated in statistical terms (e.g., as averages, variances, etc.). One can, therefore, feel that it might be a more precise account of what one does generally in physics if one spoke of ensembles of equivalent experiments done on equivalent object systems in terms of equivalent experimental set-ups and of the statistics of these ensembles.
This is of course a well-known idea. We shall apply it consistently to Newtonian mechanics in the theory of classical limit in Section 4. However, one ought not to forget that each ensemble must consist of some elements. Indeed, to obtain statistics, one has to possess a sufficiently large number of different individual results. Hence, these individual elements must always be there independently of how large the ensemble is. Then, we can ask the question: What do the statistical properties of an ensemble tell us about properties of the individual object systems used in each individual experiment? In classical physics, at least, the answer to this question is considered quite obvious and one interprets the experimental results as properties of the individual objects.
Quantum microsystems are never robust in the above sense. After a single registration, the microsystem is usually lost. Then we can repeat the experiment only if we do it with another system. Here, we can utilize another property of quantum microsystems that is different from classical ones: there is always a huge number of microscopic systems of the same type, which are in principle indistinguishable from each other. Thus, we can apply the same preparation together with the same registration many times. In quantum mechanics, the thought set of all such experiments is called an ensemble of experiments, and similarly one speaks of ensembles of prepared systems and of obtained results. The elements of the ensembles are again called individuals. Suppose that each individual result of an ensemble of measurements is a real number. Then we can, e.g., calculate an average over the ensemble of results. The average can then surely be considered as a property of the ensemble.
However, as in classical physics, one can also understand the average as a property of each individual system of the ensemble. In any case, the fact that a given individual belongs to a given ensemble is a property of the individual. It is a crucial step in our theory of properties that we consider a property of an ensemble as a property of each individual element of the ensemble. In fact, this is the only way in which the logical union or intersection of two properties can be understood. For example, the logical union, $A \vee B$, of properties $A$ and $B$ of system $S$ is the property that $S$ has either property $A$ or $B$.
In our theory, we shall use both notions, individual object and ensemble. The notion of system ensemble is defined as usual (see, e.g., [21], p. 25): it is the thought set of all systems obtained through equivalent preparations.
Returning to objectivity of observables in quantum mechanics, the problem is that a registration of an eigenvalue $a$ of an observable $\mathsf{A}$ of a quantum system $S$ by an apparatus $\mathcal{A}$ disturbs the microsystem and that the result of the registration is only created during the registration process. The result of the individual registration cannot thus be assumed to be an objective property of $S$ before the registration. It can however be assumed, as we shall do, that it is an objective property of the composite $S + \mathcal{A}$ after the registration. This is the objectification requirement [11].
It seems therefore that the objective properties of quantum systems, if there are any, cannot be directly related to individual registrations, as they can in classical theories. (Paradoxically, most of the prejudices that hinder construction of quantum models of classical theories originate in the same classical theories.) However, there are observable properties in quantum mechanics that are different from values of observables [24]:
Basic Ontological Hypothesis of Quantum Mechanics. A sufficient condition for a property to be objective is that its value is uniquely determined by a preparation according to the rules of standard quantum mechanics. The “value” is the value of the mathematical expression that describes the property and it may be more general than just a real number. To observe an objective property, many registrations of one or more observables are necessary.
In fact, the Hypothesis just states explicitly the meaning that is tacitly given to preparation by standard quantum mechanics. More discussion on the meaning of preparation is in [19,34]. In any case, prepared properties can be assumed to be possessed by the prepared system without either violating any rule of standard quantum mechanics or contradicting possible results of any registration performed on the prepared system. The relation of registrations to such objective properties is only indirect: an objective property entails limitations on values of observables that will be registered. In many cases, we shall use the Hypothesis as a heuristic principle: it will just help to find some specific properties and then it will be forgotten, that is, an independent assumption will be made that these properties can be objective and each of them will be further studied.
We shall divide objective properties into structural (see Section 1.0.1.) and dynamical and describe the dynamical ones mathematically in Section 2.1.2. Examples of dynamical properties are a state, the average value and the variance of an observable. We shall define so-called simple objective properties and show first, that there are enough simple objective properties to characterize quantum systems completely (at least from the standpoint of standard quantum mechanics) and second, that the logic of simple properties satisfies Boolean lattice rules. Thus, a reasonable definition of a real object in quantum mechanics can be given (see Section 2.1.2.).
Often, the Hypothesis meets one of the following two objections. First, how can the Hypothesis be applied to cosmology, when there was nobody there at the Big Bang to perform any state preparation? Second, a state preparation is an action of some human subject; how can it result in an objective property? Both objections result from a too narrow view of preparations (see Section 1.0.2.). Moreover, the second objection is not much more than a pun. It is not logically impossible that a human manipulation of a system results in an objective property of the system. For example, pushing a snooker ball imparts to it a certain momentum and angular momentum that can then be assumed to be objective properties of the ball.
One may wonder how the average of an observable in a state can be objective while the individual registered values of the observable are not, because an average seems to be defined by the individual values. However, the average is a property of a prepared state and is, therefore, defined also by the preparation. The results of a huge number of individual registrations must add up to their predetermined average. This can be seen very well for averages with small variances. In our theory of classical properties (Section 4), the explanation of classical realism will be based on the objectivity of averages because some important classical observables will be defined as (quantum) averages of (quantum) observables in a family of specific (quantum) states that will be called classicality states.
Let us compare our Realist Model Approach with what is usually understood as the Realism of Classical Theories. This is the philosophy that extends some successful features of classical theories, especially Newtonian mechanics, to the whole real world. There are three aspects of the Realism of Classical Theories that are not included in our Realist Model Approach. First, classical physics is deterministic, assuming that every event has a cause, but quantum theory does not tell us the causes of some of its events. Second, each interaction of classical physics is local in the sense that the mutual influence of two interacting systems asymptotically vanishes if the systems are separated by increasing spatial distance (cluster separability). But, in addition to local interactions, quantum theory contains mutual influence that is independent of distance (entanglement, mutual influence between particles of the same type). Third, classical physics requires a causal explanation for every correlation. This can be rigorously expressed by Reichenbach’s condition of common cause [35]. The existence of the common cause for some quantum correlations is incompatible with experiments (for discussion, see [32]).
Our Realist Model Approach just states which ontological hypotheses can reasonably be made under the assumption that quantum mechanics is valid. A few words have to be said on ontological hypotheses. As is well-known, the objective existence of anything cannot be proved (even that of the chair on which I am now sitting, see, e.g., Ref. [12], where this old philosophical tenet is explained from the point of view of a physicist). Thus, all such statements are only hypotheses, called ontological.
It is clear, however, that a sufficiently specific ontological hypothesis may lead to contradictions with some observations. Exactly that happens if one tries to require objectivity of quantum observables. (More precisely, the existence in question is that of systems with sufficient number of properties defined by values of observables.) Moreover, hypotheses that do not lead to contradictions may be useful. For example, the objective existence of the chair nicely explains why we all agree on its properties. Similarly, the assumption that quantum systems possess certain objective properties will be useful for the quantum theory of classical properties or for a solution of the problem of quantum measurement. The usefulness of ontological hypotheses in the work of experimental physicists has been analysed by Giere [28], p. 115. The hypothesis in question is the existence of certain protons and neutrons. It explains, and helps to perform, the production, the manipulations, the control and the observations of proton and neutron beams in an experiment at Indiana University Cyclotron Facility. From the point of view of van Fraassen, the ontological hypotheses of the kind used by this paper might perhaps be considered as a part of theoretical models: such a hypothesis may or may not be “empirically adequate”.
The position on ontological hypotheses taken here is, therefore, rather different from what has been called “metaphysical realism” by Hilary Putnam [36]: “There is exactly one true and complete description of ‘the way the world is’”.
The Realist Model Approach enables us to characterize the subject of quantum mechanics as follows:
Quantum mechanics studies objective properties of existing microscopic objects.
This can be contrasted with the usual cautious characterization of the subject, as, e.g., in [21], p. 13:
...quantum theory is a set of rules allowing the computation of probabilities for the outcomes of tests [registrations] which follow specific preparations.

1.0.4. Probability and information 

Let us return to the Tonomura experiment. At each individual registration, a definite value $x$ of the observable $\mathsf{x}$ is obtained. Quantum mechanics cannot predict which value it will be, but it can give the probability $p(x)$ that the value $x$ will be obtained. This is a general situation for any registration. In this way, registrations introduce a specific statistical element into quantum mechanics.
A correct understanding of probability and information is an important part in the conceptual framework of the theory. The discussion whether probability describes objective properties that can be observed in nature or subjective states of the knowledge of some humans has raged since the invention of probability calculus by Jacob Bernoulli and Pierre-Simon Laplace [37]. The cause of this eternal argument might be that the dispute cannot be decided: probability has both aspects, ontological and epistemic [24].
Probability is a function of a proposition $A$ and its value, $p(A)$, is a measure of the degree of certainty that $A$ is true. As a function on a Boolean lattice of propositions, it satisfies Cox’s axioms [37]. Then it becomes a real additive measure on the lattice. Whether a proposition is true or false must be decided by observation, at least in principle. Hence, the probability always concerns objective events, at least indirectly.
As an example, consider the Tonomura experiment. The probability $p(x)$ concerns the proposition that the value of observable $\mathsf{x}$ individually registered on an electron is $x$. The value of the probability can be verified by studying an ensemble of such registrations. Indeed, if we perform a huge number of such registrations, we obtain an interference pattern that approximately reproduces the smooth probability distribution obtained by calculating the quantum mechanical model of an electron in Tonomura’s apparatus. It is a real interference pattern shown by the apparatus.
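The way in which relative frequencies of many registrations approach a given probability distribution can be illustrated by a small simulation. This is only a sketch under stated assumptions: the five-cell detector and the numbers in `p` are invented for illustration and are not taken from Tonomura’s experiment, and the function name `relative_frequencies` is hypothetical.

```python
import random


def relative_frequencies(probabilities, n_registrations, seed=0):
    """Simulate independent registrations and return the relative
    frequency with which each outcome occurred."""
    rng = random.Random(seed)
    outcomes = list(range(len(probabilities)))
    draws = rng.choices(outcomes, weights=probabilities, k=n_registrations)
    counts = [0] * len(probabilities)
    for k in draws:
        counts[k] += 1
    return [c / n_registrations for c in counts]


# Hypothetical detector with five cells and a fringe-like distribution
# (illustrative numbers only, not data from the experiment).
p = [0.05, 0.30, 0.10, 0.40, 0.15]

few = relative_frequencies(p, 100)       # pattern barely visible
many = relative_frequencies(p, 200_000)  # pattern close to p

print("after 100 registrations:    ", few)
print("after 200000 registrations: ", many)
```

A single draw tells us nothing about `p`; only the frequencies of a large ensemble of draws make the distribution visible, which is the sense in which the interference pattern is a property observed on an ensemble.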
We have used the term “ensemble” with the meaning of a statistical ensemble of real events or objects. Here, the objectification requirement is involved. The value $x$ obtained in each individual registration is considered as a real property of the system consisting of the electron and the apparatus. Then, the probability concerns both the lack of knowledge of what objectively happens and properties of real systems. We emphasize that $p(x)$ is not the probability that the prepared electron possesses value $x$ of observable $\mathsf{x}$ but the probability that a registration will give such a value.
Another question is whether the individual outcomes are in principle predictable from some more detailed initial conditions (the so-called “hidden variables”) on the electron that we do not know. Quantum mechanics does not contain information on any such conditions. It does not deliver these predictions and it would even be incompatible with any ’deeper’ theory that did (see Section 2.2.4.). We shall, therefore, assume that they are objectively unpredictable.
Let us describe the general framework that is necessary for any application of probability theory. First, there is a system, denote it by $S$. Second, certain definite objective conditions are imposed on the system, e.g., it is prepared as in the example above and observable $\mathsf{x}$ is registered. In general, there must always be an analogous set of conditions, let us denote it by $C$. To each system $S$ subject to condition $C$, possibilities in some range are open. These possibilities are described by a set of propositions that form a Boolean lattice $F$. A probability distribution $p : F \to [0, 1]$ is a real additive measure on $F$.
In quantum mechanics, $F$ is usually constructed with the help of some observable, $\mathsf{E}$ say. If $\mathsf{E}$ is a discrete observable, its value set $\Omega$ is at most countable, $\Omega = \{\omega_1, \omega_2, \ldots\}$. Then the single-element sets $\{\omega_k\}$, $k = 1, 2, \ldots$, are atoms of $F$ that generate $F$ and the probability distribution $p(X)$, $X \in F$, can be calculated from $p(\omega_k)$ by means of Cox’s axioms. The atoms are called outcomes. If $\mathsf{E}$ is a continuous observable, then there are no atoms but continuous observables can be considered as idealizations of more realistic discrete observables with well-defined atoms. For example, in Tonomura’s experiment, observable $\mathsf{x}$ is defined by the photo-multiplier cells in the micro-channel plate, and not only is discrete but also even has a finite number of values.
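For a finite value set, the calculation of $p(X)$ from the probabilities of the atoms reduces to additivity over the outcomes contained in $X$. The following sketch makes this explicit; the three outcomes and their probabilities are hypothetical, and the function name `probability` is introduced here only for illustration.

```python
from fractions import Fraction


def probability(event, atom_probabilities):
    """p(X) for a proposition X represented as a set of outcomes,
    obtained by adding the probabilities of the atoms it contains."""
    return sum((atom_probabilities[w] for w in event), Fraction(0))


# Hypothetical discrete observable with three outcomes (atoms).
atoms = {"w1": Fraction(1, 2), "w2": Fraction(1, 3), "w3": Fraction(1, 6)}

X = {"w1", "w3"}
Y = {"w2", "w3"}

print(probability(X, atoms))        # probability of the proposition X
print(probability(X | Y, atoms))    # logical union (disjunction)
print(probability(X & Y, atoms))    # logical intersection (conjunction)
```

Exact fractions are used so that the additivity relation $p(X \vee Y) + p(X \wedge Y) = p(X) + p(Y)$ can be checked without rounding error.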
If condition $C$ is reproducible or it obtains spontaneously sufficiently often (which is mostly the case in physics), anybody can test the value of the probabilities because probability theory enables us to calculate the frequencies of real events starting from any theoretical probability distribution and the frequencies are measurable [37]. The probability distribution $p(X)$ is therefore an objective property of condition $C$ on $S$. It is so even in cases when the outcomes can be in principle predicted if the occurrence of more detailed conditions is observable and can in principle be known. More precisely, this would depend on whether condition $C$ can be decomposed into other conditions $C_i$, $i = 1, \ldots, N$, in the following sense. Condition $C$ can be viewed as a logical statement ‘$S$ satisfies $C$’. Let
  • $C = C_1 \vee \cdots \vee C_N$, where $\vee$ is the logical union (disjunction),
  • each $C_i$ be still recognizable and reproducible,
  • each outcome allowed by $C$ be uniquely determined by one of the $C_i$’s.
It follows that $C_i \wedge C_k = \emptyset$ for all $k \neq i$, where $\wedge$ is logical intersection (conjunction). Even in such a case, condition $C$ itself leaves the system a definite amount of freedom that can be described in all detail by the probability distribution $p(X)$ and it is an objective, verifiable property of $C$ alone. And, if we know only that condition $C$ obtains and that a probability distribution is its objective property, then this probability distribution describes the state of our knowledge, independently of whether the conditions $C_k$ do exist and we just do not know which of them obtains in each case or not. Examples of these two different situations are given by standard quantum mechanics, which denies the existence of $C_k$’s, and the Bohm–de Broglie pilot wave theory, which specifies such $C_k$’s.
Of course, there are also cases where some condition, $C$, say, occurs only once so that a measurement of frequencies is not possible. Then, no probability distribution associated with $C$ can be verified so that our knowledge about $C$ is even more incomplete. However, in some such cases, one can still give a rigorous sense to the question [37], p. 343 (see also the end of this section): “What is the most probable probability distribution associated with $C$?” One can then base one’s bets on such a probability distribution. Such a probability distribution can be considered as an objective property of $C$ and again, there is no contradiction between the objective and subjective aspects of probability.
In quantum mechanics, it is also possible to mix preparations in a random way. Suppose we have two preparations, $P_1$ and $P_2$, and can mix them randomly by, e.g., mixing the resulting particle beams in certain proportions $c_1$ and $c_2$, $c_1 + c_2 = 1$. Then, each particle in the resulting beam is either prepared by $P_1$, with probability $c_1$, or by $P_2$, with probability $c_2$. In this way, another kind of statistical element can be introduced into quantum theory. This element will be discussed in Section 2.1.2.
Condition $C$ is defined by the two beams and their mixing so that the ensemble has, say, $N$ particles. Then, $C$ can be decomposed into $N_1 C_1 \vee N_2 C_2$, where $C_k$ is the preparation by $P_k$ and $c_k = N_k / N$, $k = 1, 2$. We can know $c_1$ and $c_2$ because we know the intensity of the corresponding beams but it is unlikely that we also know whether a given element of the ensemble has been prepared by $P_1$ or $P_2$.
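The random mixing of two preparations can be sketched numerically as follows. The outcome distributions `p1` and `p2` attached to the two preparations are hypothetical, as are the function name and the mixing proportions; the only general fact used beyond the text is the law of total probability, according to which the registered frequencies of the mixed beam approach $c_1 p_1 + c_2 p_2$.

```python
import random


def register_mixed_beam(c1, p1, p2, n_particles, seed=1):
    """Mix two preparations with proportions c1 and 1 - c1, register one
    outcome per particle, and return (numbers per preparation, outcome
    frequencies of the whole beam)."""
    rng = random.Random(seed)
    outcomes = list(range(len(p1)))
    n_by_prep = {1: 0, 2: 0}
    counts = [0] * len(p1)
    for _ in range(n_particles):
        # Hidden label: which preparation produced this particle.
        prep = 1 if rng.random() < c1 else 2
        n_by_prep[prep] += 1
        weights = p1 if prep == 1 else p2
        counts[rng.choices(outcomes, weights=weights)[0]] += 1
    return n_by_prep, [c / n_particles for c in counts]


# Illustrative proportions and outcome distributions (assumptions).
c1, c2 = 0.3, 0.7
p1 = [0.8, 0.1, 0.1]
p2 = [0.2, 0.2, 0.6]

n_by_prep, freq = register_mixed_beam(c1, p1, p2, 100_000)
expected = [c1 * a + c2 * b for a, b in zip(p1, p2)]

print("N_1 / N:", n_by_prep[1] / 100_000)
print("registered frequencies:", freq)
print("c1*p1 + c2*p2:         ", expected)
```

The label `prep` plays the role of the condition $C_k$: it exists for each particle in the simulation, but an observer who sees only `freq` knows just the proportions, not the label of any given particle.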
An important role in probability theory is played by entropy. Entropy is a certain functional of $p(X)$ that inherits both objective and subjective aspects from probability. Discussions similar to those about probability spoil the atmosphere around entropy. The existence of a subjective aspect of entropy (the lack of information) tempts people to ask confused questions such as: Can a change of our knowledge about a system change the energy of the system?
The general definition of entropy as a measure of missing information has been given by Shannon [38] and its various applications to communication theory are, e.g., described in the book by Pierce [39]. Our version is:
Definition 1
Let $p : F \to [0, 1]$ be a probability distribution and let $X_k$ be the atoms of $F$. Then the entropy of $p(X)$ is
$$S = -\sum_k p(X_k) \ln\bigl(p(X_k)\bigr). \qquad (1)$$
Let us return to the quantum-mechanical example in order to explain which information is concerned. After the choice of preparation and registration devices, we do not know what will be the outcome of a registration but we just know that any outcome $X_k$ of the registration has probability $p(X_k)$. After an individual registration, one particular outcome will be known with certainty. The amount of information gained by the registration is the value of $S$ given by Equation (1). For more detail, see Ref. [21]. Thus, the value of $S$ measures the lack of information before the registration.
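Equation (1) is straightforward to evaluate for a finite set of outcomes. The sketch below assumes the usual convention $0 \ln 0 = 0$ (not stated explicitly in the text), and the example distributions are purely illustrative.

```python
import math


def entropy(probabilities):
    """Entropy S = -sum_k p_k ln p_k of a finite distribution,
    using the convention 0 * ln 0 = 0."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


uniform = [0.25, 0.25, 0.25, 0.25]   # nothing known beyond the four outcomes
peaked = [0.97, 0.01, 0.01, 0.01]    # outcome almost certain beforehand
certain = [1.0, 0.0, 0.0, 0.0]       # outcome known beforehand

for name, dist in [("uniform", uniform), ("peaked", peaked), ("certain", certain)]:
    print(name, entropy(dist))
```

The more sharply the distribution is concentrated, the less information a registration adds; when the outcome is certain in advance, $S$ vanishes.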
The entropy and the so-called Maximum Entropy Principle (MEP) have become important notions of mathematical probability calculus, see, e.g., Ref. [37]. (There is also a principle of statistical thermodynamics that carries the same name but ought not to be confused with the mathematical MEP.) The mathematical problem MEP solves can be generally characterized as follows. Let system $S$, condition $C$ and lattice $F$ with atoms $X_k$ be given. Let there be more than one set of $p(X_k)$’s that appears compatible with $C$. How are the probabilities $p(X_k)$ to be assigned so that condition $C$ is properly accounted for without any additional bias? Such $p(X_k)$’s yield the maximum of $S$ as given by (1). MEP clearly follows from the meaning of entropy as a measure of lack of information. We shall use this kind of MEP in Section 4.2.
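A minimal numerical sketch of MEP may help. It assumes, purely for illustration, that condition $C$ fixes only the average of the registered values; this is the familiar textbook example of a die with a prescribed mean, not the application made later in this review. Under that assumption, the method of Lagrange multipliers gives a maximizer of the exponential form $p_k \propto \exp(\lambda x_k)$, and the sketch finds $\lambda$ by bisection. All names are hypothetical.

```python
import math


def entropy(probabilities):
    """Entropy S = -sum_k p_k ln p_k, with the convention 0 * ln 0 = 0."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def maxent_distribution(values, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Maximum-entropy distribution on `values` with a prescribed mean.

    The maximizer has the form p_k proportional to exp(lam * x_k); the
    mean increases monotonically with lam, so lam is found by bisection.
    The target mean must lie strictly between min(values) and max(values).
    """
    def distribution(lam):
        shift = max(lam * x for x in values)  # avoid overflow in exp
        weights = [math.exp(lam * x - shift) for x in values]
        z = sum(weights)
        return [w / z for w in weights]

    def mean(lam):
        return sum(p * x for p, x in zip(distribution(lam), values))

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return distribution(0.5 * (lo + hi))


faces = [1, 2, 3, 4, 5, 6]
p_maxent = maxent_distribution(faces, 4.5)

# Another distribution with the same mean 4.5, carrying extra bias.
p_other = [0.0, 0.0, 0.0, 0.5, 0.5, 0.0]

print("maximum-entropy probabilities:", p_maxent)
print("entropy (maximum-entropy):", entropy(p_maxent))
print("entropy (biased alternative):", entropy(p_other))
```

Both distributions are compatible with the assumed constraint, but the second one excludes four faces without any justification from $C$; its lower entropy expresses exactly this additional bias.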
Part I
Corrected language of quantum mechanics
The origin of quantum mechanics can be traced back to the study of a few real systems: the hydrogen atom and black-body radiation. The resulting successful models used some new concepts and methods that were readily generalized so that they formed a first version of a new theoretical language. This language was then used to construct models of some aspects of further real objects, such as atoms and molecules, solid bodies, etc., and this activity led in turn to the refinement of the language.
This evolution does not seem to be finished. A large number of real objects, the so-called classical world, have as yet no satisfactory quantum models. Our own attempts [15,19] at constructing such models have led to some changes in the quantum language. The first part of this review starting here is an attempt at a systematic formulation of this new language.
The general notions of a theoretical language are imported from some mathematical theory and satisfy the corresponding relations given by the axioms and theorems of the mathematical theory. They are rather abstract and by themselves, they do not possess any direct connection to real physical systems. However, as building blocks of various models that do possess such connections, some of them acquire physical meaning. Such a model-mediated physical interpretation can be postulated for most of the mathematical notions by basic assumptions that will be called rules to distinguish them from the axioms of the mathematical theory. What can be derived from these rules and axioms will be called propositions. We shall however formulate only most important theorems and propositions explicitly as such in order to keep the text smooth.
A state of a quantum system is determined by a preparation while a value of an observable is determined by a registration. The notions of preparation and registration are used in their empirical (see Section 1.0.2.) meaning first, just to catch the model-mediated significance of mathematical notions, and the quantum mechanical models explaining relevant aspects of preparation and registration processes will be constructed later. The calculation of a state from classical conditions defined by the preparation needs a sophisticated model of the mature quantum mechanics. Similarly, to calculate an observable from classical properties of the registration device, a quantum model must be used. Only in Section 4 and Section 5 shall we be able to find the way from the empirical description of preparation and registration to a particular mathematical state or observable.

2. States, observables and symmetries

This section introduces the notions and most important properties of quantum states, quantum observables and their relation to symmetry transformations. The states and observables are described by specific mathematical entities. We study the mathematical aspects first and then discuss the physical interpretation.

2.1. States

Quantum-mechanical states are often described by wave functions. However, this leads to some confusion of the preparation statistics with the statistics of registered values, which is a hindrance to understanding quantum measurement (see Section 5). Moreover, it is not adequate for studying the states of macroscopic bodies that we meet in our everyday life (see Section 4), which are very different from wave functions. We start, therefore, with the general notion of quantum states, called either density matrices [21] or state operators [11].
In the mathematical part of this section, the construction of the space of states from the Hilbert space of a system is described and the most important general properties of states are listed. In the interpretation part, the Realist Model Approach introduced in Section 1.0.3. is described in detail.

2.1.1. Mathematical preliminaries

This subsection lists briefly all necessary definitions and theorems, stating some explicitly and giving reference to literature for others. Good textbooks are [6,40,42].
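Let $\mathbf{H}$ be a complex separable Hilbert space with inner product $\langle\cdot|\cdot\rangle$ satisfying $\langle a\phi|b\psi\rangle = a^{*}b\,\langle\phi|\psi\rangle$, where '$*$' denotes complex conjugation. An element $\phi\in\mathbf{H}$ is a unit vector if its norm defined by
$$\|\phi\| = \sqrt{\langle\phi|\phi\rangle}$$
equals one,
$$\|\phi\| = 1,$$
and the non-zero vectors $\phi,\psi\in\mathbf{H}$ are orthogonal if
$$\langle\phi|\psi\rangle = 0.$$
A set $\{\phi_k\}\subset\mathbf{H}$ is orthonormal if the vectors $\phi_k$ are mutually orthogonal unit vectors. $\{\phi_k\}\subset\mathbf{H}$, $k = 1, 2, \ldots$, is an orthonormal basis of $\mathbf{H}$ if any $\psi\in\mathbf{H}$ can be expressed as a series
$$\psi = \sum_k \langle\phi_k|\psi\rangle\,\phi_k$$
with
$$\|\psi\|^2 = \sum_k |\langle\phi_k|\psi\rangle|^2.$$
Separability means that there is at least one countable basis.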
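Let $\mathbf{H}$ and $\mathbf{H}'$ be two separable Hilbert spaces of the same dimension, $\{\phi_k\}$ and $\{\phi'_k\}$ two orthonormal bases, $\{\phi_k\}\subset\mathbf{H}$ and $\{\phi'_k\}\subset\mathbf{H}'$. Let us define map $\mathsf{U} : \{\phi_k\}\mapsto\{\phi'_k\}$ by
$$\mathsf{U}\phi_k = \phi'_k$$
for each $k$. Then, $\mathsf{U}$ can be extended by linearity and continuity to the whole of $\mathbf{H}$ and it maps $\mathbf{H}$ onto $\mathbf{H}'$. The map $\mathsf{U}$ is called unitary. Unitary maps preserve linear superposition,
$$\mathsf{U}(a\psi + b\phi) = a\,\mathsf{U}\psi + b\,\mathsf{U}\phi$$
and inner product,
$$\langle\mathsf{U}\psi|\mathsf{U}\phi\rangle = \langle\psi|\phi\rangle. \tag{2}$$
They can be defined by these properties for general Hilbert spaces and used as equivalence morphisms in the theory of Hilbert spaces. Any two separable Hilbert spaces of the same dimension are thus unitarily equivalent.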
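Any unit vector $\phi\in\mathbf{H}$ determines a one-dimensional (orthogonal) projection operator $\mathsf{P}[\phi]$ by the formula
$$\mathsf{P}[\phi]\,\psi = \langle\phi|\psi\rangle\,\phi$$
for all $\psi\in\mathbf{H}$. We also use the Dirac notation $|\phi\rangle\langle\phi|$ for this projection. If $\{\phi_k\}$ is an orthonormal basis of $\mathbf{H}$, then the projection operators $\mathsf{P}[\phi_k]$ satisfy
$$\mathsf{P}[\phi_k]\,\mathsf{P}[\phi_l] = \mathsf{0}$$
for all $k\neq l$ (we say they are mutually orthogonal) and
$$\sum_k \mathsf{P}[\phi_k] = \mathsf{1},$$
where $\mathsf{1}$ is the identity operator on $\mathbf{H}$.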
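These relations can be checked numerically in a finite-dimensional space. The sketch below assumes a four-dimensional Hilbert space and obtains an orthonormal basis from the QR decomposition of a random complex matrix; the names `basis`, `projector` and `psi_rebuilt` are illustrative choices, not notation of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4  # assumed finite dimension, for illustration only

# Orthonormal basis {phi_k}: columns of the unitary factor of a QR decomposition.
m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
basis, _ = np.linalg.qr(m)

def projector(phi):
    """P[phi] = |phi><phi| for a unit vector phi."""
    return np.outer(phi, phi.conj())

projs = [projector(basis[:, k]) for k in range(dim)]

# Arbitrary vector psi and its expansion coefficients <phi_k|psi>.
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
coeffs = np.array([np.vdot(basis[:, k], psi) for k in range(dim)])

# psi = sum_k <phi_k|psi> phi_k
psi_rebuilt = basis @ coeffs
```

Here `np.vdot` conjugates its first argument, which matches the convention that the inner product is antilinear in the first slot.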
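An operator $\mathsf{A} : \mathbf{H}\mapsto\mathbf{H}$ is called bounded if its norm
$$\|\mathsf{A}\| = \sup_{\|\psi\| = 1}\|\mathsf{A}\psi\| \tag{3}$$
is finite. The domain of a bounded operator is clearly the whole Hilbert space $\mathbf{H}$. A linear operator $\mathsf{A}$ is defined by the property
$$\mathsf{A}(a\phi + b\psi) = a\,\mathsf{A}\phi + b\,\mathsf{A}\psi$$
for all $\phi,\psi\in\mathbf{H}$. Multiplication $\mathsf{A}\mathsf{B}$ of two linear operators is defined by
$$\mathsf{A}\mathsf{B}\,\phi = \mathsf{A}(\mathsf{B}\phi)$$
and linear combination $a\mathsf{A} + b\mathsf{B}$ by
$$(a\mathsf{A} + b\mathsf{B})\,\phi = a\,\mathsf{A}\phi + b\,\mathsf{B}\phi$$
for all $\phi\in\mathbf{H}$ and for any $a, b\in\mathbb{C}$. Let us denote the algebra of all bounded linear operators on $\mathbf{H}$ by $\mathbf{L}(\mathbf{H})$.
The adjoint $\mathsf{A}^\dagger$ of operator $\mathsf{A}\in\mathbf{L}(\mathbf{H})$ is defined by
$$\langle\mathsf{A}^\dagger\phi|\psi\rangle = \langle\phi|\mathsf{A}\psi\rangle$$
for all $\phi,\psi\in\mathbf{H}$, and $\mathsf{A}$ is self-adjoint (s.a.) if
$$\mathsf{A}^\dagger = \mathsf{A}.$$
Let us denote the set of all bounded s.a. operators by $\mathbf{L}_r(\mathbf{H})$. For self-adjoint operators, the spectral theorem holds (see Section 2.2.1.).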
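Unitary maps $\mathsf{U} : \mathbf{H}\mapsto\mathbf{H}$ are bounded operators and we obtain from Equation (2)
$$\mathsf{U}^\dagger\cdot\mathsf{U} = \mathsf{U}\cdot\mathsf{U}^\dagger = \mathsf{1}.$$
Let $\mathbf{H}$ and $\mathbf{H}'$ be two separable Hilbert spaces and $\mathsf{U} : \mathbf{H}\mapsto\mathbf{H}'$ be a unitary map. Then $\mathsf{U}$ defines a map of $\mathbf{L}(\mathbf{H})$ onto $\mathbf{L}(\mathbf{H}')$ by $\mathsf{A}\mapsto\mathsf{U}\mathsf{A}\mathsf{U}^\dagger$. This map preserves operator action,
$$(\mathsf{U}\mathsf{A}\mathsf{U}^\dagger)(\mathsf{U}\phi) = \mathsf{U}(\mathsf{A}\phi),$$
linear relation,
$$\mathsf{U}(a\mathsf{A} + b\mathsf{B})\mathsf{U}^\dagger = a\,\mathsf{U}\mathsf{A}\mathsf{U}^\dagger + b\,\mathsf{U}\mathsf{B}\mathsf{U}^\dagger,$$
operator product
$$(\mathsf{U}\mathsf{A}\mathsf{U}^\dagger)(\mathsf{U}\mathsf{B}\mathsf{U}^\dagger) = \mathsf{U}(\mathsf{A}\mathsf{B})\mathsf{U}^\dagger$$
and norm,
$$\|\mathsf{U}\mathsf{A}\mathsf{U}^\dagger\| = \|\mathsf{A}\|.$$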
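An operator $\mathsf{A}\in\mathbf{L}_r(\mathbf{H})$ is positive, $\mathsf{A}\geq\mathsf{0}$, where $\mathsf{0}$ is the null operator, if
$$\langle\phi|\mathsf{A}\phi\rangle\geq 0$$
for all vectors $\phi\in\mathbf{H}$. The relation $\mathsf{A}\geq\mathsf{B}$ is defined by
$$\mathsf{A} - \mathsf{B}\geq\mathsf{0}.$$
The order relation is preserved by unitary maps,
$$\mathsf{U}\mathsf{A}\mathsf{U}^\dagger\geq\mathsf{U}\mathsf{B}\mathsf{U}^\dagger\quad\text{if}\quad\mathsf{A}\geq\mathsf{B}.$$
Let $\{\phi_k\}$ be any orthonormal basis of $\mathbf{H}$. For any $\mathsf{A}\in\mathbf{L}_r(\mathbf{H})$, we define the trace by
$$\mathrm{tr}[\mathsf{A}] = \sum_k\langle\phi_k|\mathsf{A}\phi_k\rangle.$$
Trace is independent of basis and invariant with respect to unitary maps,
$$\mathrm{tr}[\mathsf{U}\mathsf{A}\mathsf{U}^\dagger] = \mathrm{tr}[\mathsf{A}].$$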
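Theorem 1 Trace defines the norm $\|\mathsf{A}\|_s$ on $\mathbf{L}_r(\mathbf{H})$ by
$$\|\mathsf{A}\|_s = \mathrm{tr}\left[\sqrt{\mathsf{A}^2}\right] \tag{4}$$
satisfying
$$\|\mathsf{A}\|_s\geq\|\mathsf{A}\|$$
for all $\mathsf{A}\in\mathbf{L}_r(\mathbf{H})$.
For proof, see Ref. [6], Appendix IV.11.
Definition 2 The norm (4) is called trace norm and all elements of $\mathbf{L}_r(\mathbf{H})$ with finite trace norm are called trace-class. The set of all trace-class operators is denoted by $\mathbf{T}(\mathbf{H})$.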
Trace norm is preserved by unitary maps.
Theorem 2 $\mathbf{T}(\mathbf{H})$ with the operation of linear combination of operators on $\mathbf{H}$, partial ordering $\geq$ defined above and completed with respect to the norm (4) is an ordered Banach space. A trace-class operator is bounded, its trace is finite and its spectrum is discrete.
For proof, see Ref. [6], Appendix IV.11.
Let $\mathbf{T}(\mathbf{H})^+_1$ be the set of all positive elements of $\mathbf{L}_r(\mathbf{H})$ with trace 1. As these operators are positive, their trace is equal to their trace norm and they lie on the unit sphere in $\mathbf{T}(\mathbf{H})$. $\mathbf{T}(\mathbf{H})^+_1$ is not a linear space but a convex set: let $\mathsf{T}_1,\mathsf{T}_2\in\mathbf{T}(\mathbf{H})^+_1$, then
$$\mathsf{T} = w\,\mathsf{T}_1 + (1 - w)\,\mathsf{T}_2\in\mathbf{T}(\mathbf{H})^+_1$$
for all $0 < w < 1$. The sum is called convex combination and states $\mathsf{T}_1$ and $\mathsf{T}_2$ are called convex components of $\mathsf{T}$.
It follows that any convex combination
$$\mathsf{T} = \sum_k w_k\,\mathsf{T}_k$$
of an at most countable set of $\mathsf{T}_k\in\mathbf{T}(\mathbf{H})^+_1$ with weights $w_k$ satisfying
$$0\leq w_k\leq 1,\qquad\sum_k w_k = 1$$
and the series converging in the trace-norm topology also lies in $\mathbf{T}(\mathbf{H})^+_1$.
In general, elements of $\mathbf{T}(\mathbf{H})^+_1$ can be written in (infinitely) many ways as convex combinations of other elements.
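The non-uniqueness of convex decompositions can be seen already in a two-dimensional Hilbert space. The following sketch assumes this finite-dimensional setting; the helper names `is_state` and `projector` are hypothetical. It builds the same state operator from two different pairs of convex components.

```python
import numpy as np

def is_state(t, tol=1e-12):
    """True if t is self-adjoint, positive and has trace 1."""
    hermitian = np.allclose(t, t.conj().T, atol=tol)
    positive = np.all(np.linalg.eigvalsh(t) >= -tol)
    unit_trace = abs(np.trace(t).real - 1.0) < tol
    return bool(hermitian and positive and unit_trace)

def projector(phi):
    """Normalize phi and return the projection |phi><phi|."""
    phi = np.asarray(phi, dtype=complex)
    phi = phi / np.linalg.norm(phi)
    return np.outer(phi, phi.conj())

up, down = projector([1, 0]), projector([0, 1])
plus, minus = projector([1, 1]), projector([1, -1])

t_a = 0.5 * up + 0.5 * down      # one convex combination
t_b = 0.5 * plus + 0.5 * minus   # a different one, same operator
t_c = 0.3 * up + 0.7 * plus      # a generic convex combination
```

Both `t_a` and `t_b` equal one half of the identity, although their convex components differ.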
Definition 3 Face $\mathbf{W}$ is a (norm) closed subset of $\mathbf{T}(\mathbf{H})^+_1$ that is invariant with respect to convex combinations and contains all convex components of any $\mathsf{T}\in\mathbf{W}$.
Then, $\mathbf{T}(\mathbf{H})^+_1$ itself is a face. “Face” is an important notion of the mathematical theory of convex sets.
Theorem 3 Every face $\mathbf{W}\subset\mathbf{T}(\mathbf{H})^+_1$ can be written as $\mathbf{W}(\mathsf{T})$ for a suitably chosen $\mathsf{T}\in\mathbf{W}$, where $\mathbf{W}(\mathsf{T})$ is the smallest face that contains $\mathsf{T}$.
For proof, see [6], p. 76. There is a useful relation between faces and projections:
Theorem 4 To each face $\mathbf{W}$ of $\mathbf{T}(\mathbf{H})^+_1$ there is a unique projection $\mathsf{P} : \mathbf{H}\mapsto\mathbf{H}'$, where $\mathbf{H}'$ is a closed subspace of $\mathbf{H}$, for which $\mathsf{T}\in\mathbf{W}$ is equivalent to
$$\mathsf{T} = \mathsf{P}\,\mathsf{T}\,\mathsf{P}.$$
The map so defined between the set of faces and the set of projections is an order isomorphism, i.e., it is invertible and $\mathsf{P} < \mathsf{P}'$ is equivalent to $\mathbf{W}\subset\mathbf{W}'$ for the corresponding faces.
For proof, see [6], p. 77. We shall denote the face that corresponds to a projection $\mathsf{P}$ by $\mathbf{W}_{\mathsf{P}}$.
Clearly, intersection of two faces, if non-empty, is a face, and a unitary map of a face is a face. The next theorem shows that $\mathbf{W}(\mathsf{T})$ is not necessarily the set of all convex components of $\mathsf{T}$.
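Theorem 5 Let $\mathsf{P}(\mathbf{H})$ be infinite-dimensional and let $\mathsf{T}_1,\mathsf{T}_2\in\mathbf{W}_{\mathsf{P}}$ be positive definite on $\mathsf{P}(\mathbf{H})$. Then
$$\mathbf{W}(\mathsf{T}_1) = \mathbf{W}(\mathsf{T}_2) = \mathbf{W}_{\mathsf{P}}.$$
Let $\{|k\rangle\}$ be an orthonormal basis of $\mathsf{P}(\mathbf{H})$ and let
$$\sup_k\frac{\langle k|\mathsf{T}_1|k\rangle}{\langle k|\mathsf{T}_2|k\rangle} = \infty. \tag{7}$$
Then $\mathsf{T}_1$ is not a convex component of $\mathsf{T}_2$.
Proof Suppose that $\mathsf{T}_1$ is a component of $\mathsf{T}_2$. Then, there is $\mathsf{T}_3$ and $w\in(0,1)$ such that
$$w\,\mathsf{T}_1 + (1 - w)\,\mathsf{T}_3 = \mathsf{T}_2.$$
Hence $\mathsf{T}_2 - w\,\mathsf{T}_1\geq\mathsf{0}$ and
$$\langle k|\mathsf{T}_2|k\rangle - w\,\langle k|\mathsf{T}_1|k\rangle\geq 0$$
for some positive $w$ and all $k$. The ratio $\langle k|\mathsf{T}_1|k\rangle/\langle k|\mathsf{T}_2|k\rangle$ is then bounded by $1/w$ for all $k$, which contradicts Equation (7), QED.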
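Definition 4 An element $\mathsf{T}$ is called extremal element of $\mathbf{T}(\mathbf{H})^+_1$ if $\mathbf{W}(\mathsf{T})$ is zero-dimensional, i.e., if the condition
$$\mathsf{T} = w\,\mathsf{T}_1 + (1 - w)\,\mathsf{T}_2$$
with $\mathsf{T}_1,\mathsf{T}_2\in\mathbf{T}(\mathbf{H})^+_1$ and $0 < w < 1$, implies that $\mathsf{T} = \mathsf{T}_1 = \mathsf{T}_2$.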
For extremal states, we have:
Theorem 6 $\mathsf{T}$ is extremal iff $\mathsf{T} = |\psi\rangle\langle\psi|$, where $\psi$ is a unit vector of $\mathbf{H}$.
For proof, see [6], p. 78. The set of all extremal elements of $\mathbf{T}(\mathbf{H})^+_1$ generates $\mathbf{T}(\mathbf{H})^+_1$ in the sense that any $\mathsf{T}\in\mathbf{T}(\mathbf{H})^+_1$ can be expressed as a countable convex combination of some extremal elements,
$$\mathsf{T} = \sum_k w_k\,\mathsf{P}[\phi_k].$$
Such a decomposition can be obtained, in particular, from the spectral decomposition
$$\mathsf{T} = \sum_k t_k\,\mathsf{P}_k.$$
In that case $\mathsf{T}$ is decomposable into mutually orthogonal projectors $\mathsf{P}[\phi_l]$ onto elements of a basis, with weights $w_l = t_l/n_l$, where $n_l$ is the degeneracy of the eigenvalue $t_l$ (the degeneracy subspaces of trace-class operators have finite dimensions).
A unit vector $\phi$ of $\mathbf{H}$ defines a unique extremal element $\mathsf{P}[\phi]\in\mathbf{T}(\mathbf{H})^+_1$ but $\mathsf{P}[\phi]$ determines $\phi$ only up to a phase factor $e^{i\alpha}$. Due to the (complex) linear structure of $\mathbf{H}$, there is an operation on vectors called linear superposition. Linear superposition $\psi = \sum_k c_k\phi_k$ of mutually orthogonal unit vectors with complex coefficients satisfying
$$\sum_k |c_k|^2 = 1$$
is another unit vector and the resulting projector $\mathsf{P}[\psi]$,
$$\mathsf{P}[\psi] = \Big|\sum_k c_k\phi_k\Big\rangle\Big\langle\sum_l c_l\phi_l\Big| = \sum_{kl} c_l^{*}c_k\,|\phi_k\rangle\langle\phi_l|\neq\sum_k |c_k|^2\,|\phi_k\rangle\langle\phi_k|, \tag{8}$$
is different from the corresponding convex combination,
$$\sum_k |c_k|^2\,|\phi_k\rangle\langle\phi_k|.$$
Observe that $\mathsf{P}[\psi]$ is not determined by the projections $\mathsf{P}[\phi_k]$ because it depends on the relative phases of vectors $\phi_k$.
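The difference between a linear superposition and the corresponding convex combination can be made concrete in two dimensions. The sketch below assumes two orthonormal vectors and equal weights; the function names are hypothetical. Changing the relative phase of the coefficients changes the projector of the superposition but leaves the convex combination untouched.

```python
import numpy as np

# Two orthonormal vectors of an assumed two-dimensional Hilbert space.
phi1 = np.array([1.0, 0.0], dtype=complex)
phi2 = np.array([0.0, 1.0], dtype=complex)

def superposition_projector(c1, c2):
    """P[psi] for psi = c1*phi1 + c2*phi2 (contains cross terms)."""
    psi = c1 * phi1 + c2 * phi2
    return np.outer(psi, psi.conj())

def convex_combination(c1, c2):
    """|c1|^2 P[phi1] + |c2|^2 P[phi2] (no cross terms)."""
    return (abs(c1) ** 2 * np.outer(phi1, phi1.conj())
            + abs(c2) ** 2 * np.outer(phi2, phi2.conj()))

c = 1 / np.sqrt(2)
p_plus = superposition_projector(c, c)     # relative phase 0
p_minus = superposition_projector(c, -c)   # relative phase pi
mixture = convex_combination(c, c)

# tr[T^2] equals 1 for an extremal state and is smaller otherwise.
purity_superposition = np.trace(p_plus @ p_plus).real
purity_mixture = np.trace(mixture @ mixture).real
```

The diagonal elements of `p_plus`, `p_minus` and `mixture` coincide; only the off-diagonal (cross) terms carry the relative phase.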

2.1.2. General rules

The preceding subsection has introduced technical tools that will now be used to further develop the Realist Model Approach (see Section 1.0.3.) so that it can serve as a basis of the general quantum language.
Rule 1 With each quantum system $\mathcal{S}$ of type $\tau$, a complex separable Hilbert space $\mathbf{H}_\tau$ is associated. $\mathbf{H}_\tau$ is a representation space of a certain group associated with the Galilean group and $\tau$ determines the representation (see Section 2.3).
Thus, every system has its own copy of a Hilbert space and the structure of the space depends only on the type of the system. Starting from the Hilbert space, all important entities concerning S such as state space or algebra of observables are constructed. The word “system” can be used in two different ways: it may represent a real, i.e., a prepared quantum object, or an idealized entity used in the construction of a theoretical model of another prepared object. This idealized entity can be described by, or identified with, the corresponding Hilbert space because the construction uses only this Hilbert space. (If the model contains more identical subsystems, then none of these subsystems has an individual existence and none is in a state of its own (see Section 3.2.1.).) However, we can assume:
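Rule 2 The state space of $\mathcal{S}$ is $\mathbf{T}(\mathbf{H}_\tau)^+_1$. For each $\mathsf{T}\in\mathbf{T}(\mathbf{H}_\tau)^+_1$, there is a preparation $\mathcal{P}$ that prepares a system $\mathcal{S}$ of type $\tau$ in state $\mathsf{T}$. $\mathsf{T}$ is then an objective property of $\mathcal{S}$.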
If a state is not extremal, then it can always be written as a convex combination of other states. What is the physical meaning of this mathematical operation?
Definition 5 Let $\mathcal{P}_1$ and $\mathcal{P}_2$ be two preparations of $\mathcal{S}$ and $w\in[0,1]$. Statistical mixture
$$\mathcal{P} = \{(w,\mathcal{P}_1),\,((1 - w),\mathcal{P}_2)\} \tag{10}$$
of the two preparations is a new preparation constructed as follows. Let system $\mathcal{S}$ be prepared either by $\mathcal{P}_1$ or by $\mathcal{P}_2$ in a random way so that $\mathcal{P}_1$ is used with probability $w$ and $\mathcal{P}_2$ with probability $1 - w$.
This definition can easily be extended to any number of preparations. Examples of statistical mixture are given in Section 1.0.4. and Section 5.3.1. Then, we assume:
Rule 3 Let $\mathcal{P}_1$ and $\mathcal{P}_2$ be two preparations of $\mathcal{S}$ and let the corresponding states be $\mathsf{T}_1$ and $\mathsf{T}_2$. Then the statistical mixture (10) prepares state
$$\mathsf{T} = w\,\mathsf{T}_1\,(+)_p\,(1 - w)\,\mathsf{T}_2. \tag{11}$$
The purpose of the sign “$(+)_p$” on the right-hand side is to distinguish statistical decompositions from convex combinations. This distinction is very important in the theory of quantum measurement. For example, the theory of quantum decoherence can show that the final state of the apparatus is a convex combination of pointer states but cannot conclude that it is a statistical decomposition and must, therefore, resort to further assumptions such as the Everett interpretation [13]. Let us stress that a statistical decomposition of state $\mathsf{T}$ is not determined by the mathematical structure of state operator $\mathsf{T}$ but by the preparation of $\mathsf{T}$.
Sometimes, one meets the objection that the states $w\,\mathsf{T}_1\,(+)_p\,(1 - w)\,\mathsf{T}_2$ and $w\,\mathsf{T}_1 + (1 - w)\,\mathsf{T}_2$ of system $\mathcal{S}$ cannot be distinguished by any measurement. But this is only true if the measurements are limited to registrations of observables of $\mathcal{S}$. If observables of arbitrary composite systems containing $\mathcal{S}$ are also admitted, then the difference of the two states can be found by measurements [24]. This is exactly the argument against the decoherence theory described in [12], p. 171. Let us also emphasize that quantum state statistics has nothing to do with the statistics of values of observables.
As a mathematical operation, $(+)_p$ is commutative and associative, and state statistics is invariant with respect to state composition and unitary evolution [19]. Thus, the definitions and assumptions can be generalized to more than two preparations and states (see Rules 12 and 15).
Some comment is in order. Rules 2 and 3 imply, on the one hand, that $\mathsf{T}$ is an objective property of $\mathcal{S}$ because it has been prepared by $\mathcal{P}$ and, on the other hand, that $\mathcal{S}$ is at the same time objectively either in state $\mathsf{T}_1$ or in state $\mathsf{T}_2$ because it has also been prepared either by $\mathcal{P}_1$ or by $\mathcal{P}_2$. There seems to be a contradiction: a particular copy of $\mathcal{S}$ is announced to be in two different states simultaneously. However, this contradiction is only due to too narrow an understanding of the term state. To explain this with the help of a simple example, let us consider a large box filled with small balls, some red and some green, in a random mixture. Each ball chosen blindly from the box is either red, with probability 1/2, or green, with probability 1/2, and each of them is also colored, with certainty. Ball properties red, green and colored in this example can be considered as objective. Thus, two different objective properties concerning the same thing, namely the color, e.g., red and colored, can exist simultaneously.
Definition 6 Let $\mathcal{P}$ of the form (10) prepare $\mathcal{S}$ in state $\mathsf{T}$. Then we write (11) and call the right-hand side of Equation (11) the statistical decomposition of $\mathsf{T}$. States that have a non-trivial statistical decomposition ($w\in(0,1)$) will be called decomposable, otherwise indecomposable.
Observe that the statistical decomposition of quantum states has nothing to do with the statistical structure of values of observables and that it is usually the latter that is understood as implying the statistical character of quantum mechanics.
The properties “decomposable”, “indecomposable” and statistical decomposition are determined uniquely by a preparation, hence they are objective according to Basic Ontological Hypothesis. As explained above, they are in principle observable. If a preparation is not completely known, it can prepare a state without determining whether it is decomposable or indecomposable.
State operator $\mathsf{T}$ does not, by itself, determine the statistical decomposition of a prepared state described by it, unless $\mathsf{T}$ is extremal so that every convex decomposition of it is trivial. One works usually with wave functions, which do represent extremal states. Some mathematical properties of statistical decomposition that can help to decide whether a state is decomposable will be introduced later (conservation under unitary time evolution and under system composition). However, in many cases, a possible statistical decomposition of a state is not important because sufficiently many properties of the state are independent of its statistical decomposition so that everything one needs can be obtained from the state operator.
There are examples of prepared non-extremal states that are indecomposable. Consider system $\mathcal{S}$ prepared in the EPR experiment [21]. $\mathcal{S}$ is composite, $\mathcal{S} = \mathcal{S}_1 + \mathcal{S}_2$, so that the spins of the two subsystems are correlated. $\mathcal{S}$ is prepared in extremal state $|\psi\rangle\langle\psi|$. Then the state of $\mathcal{S}_1$ is $\mathrm{tr}_{\mathcal{S}_2}\big[|\psi\rangle\langle\psi|\big]$, which is well-known to be indecomposable but not extremal.
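The non-extremality of the reduced state can be verified by a direct calculation. The sketch below assumes that the EPR state is the spin singlet of two spin-1/2 subsystems and uses a hypothetical helper `partial_trace_second`. It can only exhibit that the reduced state operator is not extremal; indecomposability is a property of the preparation and cannot be read off the operator.

```python
import numpy as np

# Assumed EPR state: spin singlet in the basis |00>, |01>, |10>, |11>.
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
t_total = np.outer(psi, psi.conj())

def partial_trace_second(t, d1, d2):
    """Trace out the second factor of an operator on H1 (x) H2."""
    return np.einsum('ijkj->ik', t.reshape(d1, d2, d1, d2))

t_1 = partial_trace_second(t_total, 2, 2)

# tr[T^2] = 1 characterizes extremal states in finite dimensions.
purity_total = np.trace(t_total @ t_total).real
purity_1 = np.trace(t_1 @ t_1).real
```

The composite state has `purity_total` equal to one, while the state of the first subsystem is one half of the identity operator.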
We observe that a quantum state is conceptually very different from a state in Newtonian mechanics. It may be helpful to look at some important differences. Let us define a state of a Newtonian system as a point of the phase space of the system, Γ. Newtonian state defined in this way is generally assumed to satisfy:
  • objectivity: a state of a system is an objective property of the system,
  • universality: any system is always in some state,
  • exclusivity: a system cannot be in two different states simultaneously,
  • completeness: any state of a system contains maximum information that can exist about the system,
  • locality: the state of a system determines the position of the system.
It follows that incomplete information about the state of a system can be described by a probability distribution on Γ. Indeed, the system is always at a particular point of Γ, but we do not know at which. Such a distribution is sometimes called statistical state. In any case, we distinguish a state from a statistical state.
A quantum state is an element of $\mathbf{T}(\mathbf{H})^+_1$ and the comparison with Newtonian states shows that it satisfies just the objectivity in the form that a prepared state is an objective property, at least according to our interpretation. However, a quantum system does not have to be in any state (an example is a particle $\mathcal{S}$ in a system $\mathcal{S}'$ of identical particles and we assume that a state of $\mathcal{S}'$ has been prepared but that of $\mathcal{S}$ has not [34]). Also, a system can be in several states simultaneously, such as $\mathsf{T}_1$ and $w\,\mathsf{T}_1\,(+)_p\,(1 - w)\,\mathsf{T}_2$ in formula (11). Moreover, a state operator alone does not contain any information on the statistical decomposition of a prepared state. However, if an indecomposable state of a system is given, no more knowledge on the system can objectively exist than that given by the state. (It follows that a collapse of wave function or analogous processes are not just changes of our information about the system but genuine physical processes, see Section 5 and [19,34].) Finally, quantum states are non-local: most states of a single particle do not determine its position, but simultaneous registrations by two detectors at different positions will give anti-correlated results (see Section 5).
In particular, disregarding all differences, a Newtonian state is, in a sense, analogous to an indecomposable state in quantum mechanics and probability distributions on Newtonian phase space are, in the same sense, analogous to decomposable states. The following comparison is amusing. If the knowledge of a state of a Newtonian system is not complete, then we can describe it by a probability distribution on the phase space. If the knowledge of the preparation of a quantum state is incomplete, then the state is known but its statistical decomposition does not need to be.
At this stage, we can understand a different approach to reality of states [43,44] and compare it to ours. In this approach, the existence of a theory with realist interpretation is assumed, the states of which have the features 1–4 above and such that quantum states can be considered as distributions over these real states. Thus, the information contained in quantum states is never complete. Then, the question is asked, whether different extremal quantum states can be non-overlapping distributions, so that each real state defines at most one extremal quantum state. This is interpreted as reality of quantum states. (Under some reasonable conditions, one can prove that the answer to the question is positive.) One can then try to understand the collapse of wave function as the change of information resulting from a registration together with, or rather than, a physical change in a real state. Clearly, we do not assume the existence of any further theory and directly consider all quantum states to be objective properties without being probability distributions over some different kind of real states. Rather, we say that indecomposable states are not distributions over any real states (and represent complete information) while decomposable states can be considered as distributions over real indecomposable states. The collapse of wave function is then a physical process.
Any unit vector $\phi\in\mathbf{H}_\tau$ defines state $\mathsf{P}[\phi]$. Often, such states are called “pure” while general state operators are called “mixed”. In fact, the common use of the terms “pure” and “mixed” states is misleading. From the point of view of statistical physics, pure ought to be the indecomposable states and mixed the decomposable ones. The confusion is aggravated by the fact that decomposable states have various names in literature: direct mixture [6] or proper mixture [12] or Gemenge [11].
In Newtonian mechanics, simple physical properties are constructed from real functions on the phase space. For example energy is such a function and the proposition “Energy has value E” is a property. Such propositions form a Boolean lattice with the logical operation union and intersection. The values of the functions also define subsets of the phase space: e.g., the set of all points of the phase space where the energy has value E. The Boolean lattice of the propositions can be isomorphically mapped onto the Boolean lattice of the subsets. We can give analogous definitions in quantum mechanics:
Definition 7 Let $\mathcal{S}$ be a system of type $\tau$ and $f : \mathbf{T}(\mathbf{H}_\tau)^+_1\mapsto\mathbb{R}$ a function. Then the proposition “$f = a$” is a simple property. Simple properties form a Boolean lattice with the logical operations union and intersection. Each simple property defines a subset $\{\mathsf{T}\in\mathbf{T}(\mathbf{H}_\tau)^+_1 \,|\, f(\mathsf{T}) = a\}$. The Boolean lattice of simple properties is isomorphic to the Boolean lattice of the subsets.
According to the Basic Ontological Hypothesis, simple properties are objective because they are uniquely defined by preparations. Indeed, a preparation defines a unique state $\mathsf{T}$ and the state a unique simple property $f = f(\mathsf{T})$.
The statistical decomposition of a state is an example of an objective property that is not simple. Next section will give important examples of simple properties. Hence, in both theories, Newtonian and quantum mechanics, the values of simple properties are real numbers and the logics of the properties are the same.
Let us briefly compare our logic of simple properties with the so-called quantum logic (for more details, see Section 2.2.4.). Properties of quantum logic are described mathematically by orthogonal projections onto subspaces of Hilbert space $\mathbf{H}_\tau$. Any such projection $\mathsf{P}$ is an observable and the corresponding property is the proposition “Observable $\mathsf{P}$ has value $\eta$”, where $\eta\in\{0, 1\}$. The properties are not objective and do not form a Boolean lattice. The basic difference is that our properties are associated with preparations while the properties of quantum logic are associated with registrations.
Extremal states allow another mathematical operation, a linear superposition (8), which is different from a convex combination. The non-diagonal (cross) terms in sum (8) lead to interference phenomena (such as the electron interference in the Tonomura [26] experiment, see Section 1.0.2.) that are purely quantum and unknown in the Newtonian mechanics.

2.2. Observables

The popular description of observables is by self-adjoint operators. However, this notion is not adequate for accounts of registrations that use ancillas (see Section 5.1) and our construction of D-local observables (see Section 3.2.3.) also needs a more general notion of quantum observable. In the present section, the general theory of observables will be briefly described. First, we give the mathematical construction of observables from the Hilbert space, then their general relation to registration. For this purpose, the empirical notion of registration that was explained in the Introduction is sufficient. Also, some important properties of observables such as joint measurability, contextuality and superselection rules are discussed.

2.2.1. Mathematical preliminaries

This section is a brief review of most important definitions and theorems. More details and proofs can be found e.g. in [6]. According to Rule 1, every system S is associated with a Hilbert space H . Starting from H , we can perform the following constructions.
The set of all bounded linear operators denoted by $\mathbf{L}(\mathbf{H})$ in Section 2.1.1. is closed with respect to both multiplication and linear combination of its elements. One can show that $\mathbf{L}(\mathbf{H})$ is a Banach algebra with norm (3). That is, $\mathbf{L}(\mathbf{H})$ is complete with respect to the norm and its multiplication satisfies
$$\|\mathsf{A}\mathsf{B}\|\leq\|\mathsf{A}\|\,\|\mathsf{B}\|.$$
For proof, see [40], p. 75. Moreover, the identity operator $\mathsf{1}$ defined by $\mathsf{1}\phi = \phi$ for all $\phi$ satisfies $\mathsf{1}\mathsf{A} = \mathsf{A}$ for any $\mathsf{A}\in\mathbf{L}(\mathbf{H})$ and $\|\mathsf{1}\| = 1$, hence $\mathbf{L}(\mathbf{H})$ is a Banach algebra with unity [40].
Linear combinations of bounded s.a. operators with real coefficients are again s.a., but their products are not, in general. Hence, the set of all bounded s.a. operators forms a real linear space. If completed with respect to the norm (3), it is an ordered Banach space denoted by $\mathbf{L}_r(\mathbf{H})$.
Definition 8 Let $\mathbf{F}$ be the Boolean lattice of all Borel subsets of $\mathbb{R}^n$. A positive operator valued (POV) measure
$$\mathsf{E} : \mathbf{F}\mapsto\mathbf{L}_r(\mathbf{H})$$
is defined by the properties
1. positivity: $\mathsf{E}(X)\geq\mathsf{0}$ for all $X\in\mathbf{F}$,
2. σ-additivity: if $\{X_k\}$ is a countable collection of disjoint sets in $\mathbf{F}$ then
$$\mathsf{E}(\cup_k X_k) = \sum_k\mathsf{E}(X_k),$$
where the series converges in weak operator topology, i.e., averages in any state converge to an average in the state,
3. normalization:
$$\mathsf{E}(\mathbb{R}^n) = \mathsf{1}.$$
The number $n$ is called dimension of $\mathsf{E}$. Let us denote the support of the measure $\mathsf{E}$ by $\Omega$. The set $\Omega$ is called the value space of $\mathsf{E}$. The operators $\mathsf{E}(X)$ for $X\in\mathbf{F}$ are called effects.
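The support $\Omega$ of measure $\mathsf{E}$ is defined as follows
$$\Omega = \{x\in\mathbb{R}^n \,|\, \mathsf{E}(\{y \,|\, |x - y| < \epsilon\})\neq\mathsf{0}\ \ \forall\,\epsilon > 0\}.$$
Let $\mathbf{H}$ and $\mathbf{H}'$ be two separable Hilbert spaces and $\mathsf{U} : \mathbf{H}\mapsto\mathbf{H}'$ a unitary map. Then $\mathsf{U}\mathsf{E}\mathsf{U}^\dagger : \mathbf{F}\mapsto\mathbf{L}(\mathbf{H}')$ is a POV measure on $\mathbf{H}'$.
We denote by $\mathbf{L}_r(\mathbf{H})^+_{\leq 1}$ the set of all effects.
Theorem 7 $\mathbf{L}_r(\mathbf{H})^+_{\leq 1}$ is the set of elements of $\mathbf{L}_r(\mathbf{H})$ satisfying the inequality
$$\mathsf{0}\leq\mathsf{E}(X)\leq\mathsf{1}. \tag{12}$$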
That effects must satisfy (12) follows from the positivity and normalization of a POV measure [6]. On the other hand, each element of the set defined in Theorem 7 is an effect. For example, if $\mathsf{E}$ is such an element, we can define $\mathsf{E}(\{1\}) = \mathsf{E}$ and $\mathsf{E}(\{-1\}) = \mathsf{1} - \mathsf{E}$. The two operators satisfy Equation (12) and they sum to $\mathsf{1}$, so they determine a POV measure with the value set $\Omega = \{-1, +1\}$.
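The two-valued construction can be tried out numerically. The sketch below assumes a two-dimensional Hilbert space and an arbitrarily chosen effect that is not a projection; the matrix entries, the state `t` and the helper `is_effect` are hypothetical illustrations.

```python
import numpy as np

# Hypothetical effect on C^2: self-adjoint with spectrum inside [0, 1].
e_plus = np.array([[0.8, 0.1], [0.1, 0.3]], dtype=complex)
e_minus = np.eye(2) - e_plus

def is_effect(e, tol=1e-12):
    """True if e is self-adjoint and 0 <= e <= 1."""
    evals = np.linalg.eigvalsh(e)
    hermitian = np.allclose(e, e.conj().T)
    return bool(hermitian and evals.min() >= -tol and evals.max() <= 1 + tol)

# POV measure with value set {-1, +1}.
povm = {+1: e_plus, -1: e_minus}

# Probabilities tr[T E({x})] in an assumed extremal state T.
t = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
probs = {x: np.trace(t @ e).real for x, e in povm.items()}
```

Since `e_plus` is not idempotent, the resulting POV measure is not a PV measure, yet its probabilities are still non-negative and sum to one.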
Theorem 7 implies that the spectrum of each effect is a subset of $[0, 1]$. An effect is a projection operator ($\mathsf{E}(X)^2 = \mathsf{E}(X)$) if and only if its spectrum is a subset of the two-point set $\{0, 1\}$.
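Definition 9 Let $\mathsf{E} : \mathbf{F}\mapsto\mathbf{L}(\mathbf{H})$ and $\mathsf{E}' : \mathbf{F}'\mapsto\mathbf{L}(\mathbf{H})$ be two POV measures that satisfy
$$\mathsf{E}(X)\,\mathsf{E}'(X') = \mathsf{E}'(X')\,\mathsf{E}(X)$$
for all $X\in\mathbf{F}$ and $X'\in\mathbf{F}'$. Then we say that the two POV measures commute.
Theorem 8 For any POV measure $\mathsf{E} : \mathbf{F}\mapsto\mathbf{L}(\mathbf{H})$ the following two conditions are equivalent
$$\mathsf{E}(X)^2 = \mathsf{E}(X)$$
for all $X\in\mathbf{F}$ and
$$\mathsf{E}(X\cap Y) = \mathsf{E}(X)\,\mathsf{E}(Y)$$
for all $X, Y\in\mathbf{F}$.
Thus, a POV measure is a projection valued (PV) measure exactly when it is multiplicative. In this case, all its effects commute with each other.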
PV measures for n = 1 are equivalent to s.a. operators that are not necessarily bounded. Section 2.1.1. dealt only with bounded s.a. operators, but now we need more general entities.
Let $\mathsf{A}$ be an operator on Hilbert space $\mathbf{H}$ that is not necessarily bounded. Then it is not defined on the whole of $\mathbf{H}$ but only on a linear subspace $\mathbf{D}_{\mathsf{A}}$ that is dense in $\mathbf{H}$ and is called the domain of $\mathsf{A}$. The definition of the adjoint has two steps: $\mathsf{A}^\dagger$ is first defined on some domain associated with $\mathbf{D}_{\mathsf{A}}$ and then possibly extended to a larger domain. The self-adjoint operator is defined similarly. For details, see e.g. Ref. [42].
To define a sum and product of two unbounded operators $\mathsf{A}$ and $\mathsf{B}$, their domains are used as follows. If
$$\mathbf{D}_{\mathsf{A}} = \mathbf{D}_{\mathsf{B}}$$
then we say that $\mathsf{A}$ and $\mathsf{B}$ have a common domain. If, moreover, the common domain $\mathbf{D}$ is invariant with respect to both operators, i.e.,
$$\mathsf{A}\mathbf{D}\subset\mathbf{D},\qquad\mathsf{B}\mathbf{D}\subset\mathbf{D}$$
then the sum $\mathsf{A} + \mathsf{B}$, product $\mathsf{A}\mathsf{B}$ and commutator $[\mathsf{A},\mathsf{B}] = \mathsf{A}\mathsf{B} - \mathsf{B}\mathsf{A}$ are well defined on $\mathbf{D}$ and can be possibly extended.
For s.a. operators, bounded or unbounded, the so-called spectral theorem holds, see Refs. [42], [40], Chap. 10. This says that any s.a. operator $\mathsf{A}$ is equivalent to a PV measure, which is called, in this case, the spectral measure of $\mathsf{A}$. Let $\mathsf{E}$ be a PV measure, then $\mathsf{E}$ determines a unique self-adjoint operator $\int_{\mathbb{R}}\iota\,d\mathsf{E}$, where $\iota$ denotes the identity function on $\mathbb{R}$. Conversely, each s.a. operator $\mathsf{A}$ on $\mathbf{H}$ determines a unique PV measure $\mathsf{E} : \mathbf{F}\mapsto\mathbf{L}(\mathbf{H})$ such that
$$\mathsf{A} = \int_{\mathbb{R}}\iota\,d\mathsf{E}.$$
If PV measure $\mathsf{E}$ is equivalent to a s.a. operator $\mathsf{A}$, we shall denote it $\mathsf{E}_{\mathsf{A}}$. Thus, POV measure is a generalization of a self-adjoint operator. Clearly, two s.a. operators $\mathsf{A}$ and $\mathsf{B}$ commute if PV measures $\mathsf{E}_{\mathsf{A}}$ and $\mathsf{E}_{\mathsf{B}}$ commute.
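Expression $\mathrm{tr}[\mathsf{T}\mathsf{E}]$ is well-defined for all $\mathsf{T}\in\mathbf{T}(\mathbf{H})$ and $\mathsf{E}\in\mathbf{L}_r(\mathbf{H})$ and it is a real-valued bilinear form with respect to which the two Banach spaces are dual to each other [6], p. 413. We even have:
Theorem 9 If $\mathsf{T}_1$ and $\mathsf{T}_2$ from $\mathbf{T}(\mathbf{H}_\tau)^+_1$ satisfy
$$\mathrm{tr}[\mathsf{T}_1\mathsf{E}] = \mathrm{tr}[\mathsf{T}_2\mathsf{E}]$$
for all $\mathsf{E}\in\mathbf{L}_r(\mathbf{H}_\tau)^+_{\leq 1}$ then $\mathsf{T}_1 = \mathsf{T}_2$; if $\mathsf{E}_1$ and $\mathsf{E}_2$ from $\mathbf{L}_r(\mathbf{H}_\tau)^+_{\leq 1}$ satisfy
$$\mathrm{tr}[\mathsf{T}\mathsf{E}_1] = \mathrm{tr}[\mathsf{T}\mathsf{E}_2]$$
for all $\mathsf{T}\in\mathbf{T}(\mathbf{H}_\tau)^+_1$ then $\mathsf{E}_1 = \mathsf{E}_2$.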
An important further property of the bilinear form is:
Theorem 10 For each $T \in \mathbf{T}(\mathbf{H}_\tau)^+_1$ and $E \in \mathbf{L}_r(\mathbf{H}_\tau)^+_{\leq 1}$, the condition $\mathrm{tr}[T E] = 1$ is equivalent to
$$E T = T.$$
In general, we call a state $T$ satisfying
$$E T = a T$$
with some real $a$ an eigenstate of $E$ to eigenvalue $a$. If $T = P[\psi]$, then $\psi$ is called an eigenvector.
An example is a discrete POV measure. A POV measure $E$ is discrete if its value set $\Omega$ is an at most countable subset of $\mathbb{R}$. Let $E$ be a discrete POV measure and let $\Omega = \{a_k, k \in \mathbb{N}\}$, where $\mathbb{N}$ is the set of positive integers. Then $E$ is a PV measure, $E = E^A$, $A = \sum_k a_k E_k$, $a_k$ are eigenvalues of $A$ and
$$E_k = E(\{a_k\})$$
are projections on the corresponding eigenspaces of the s.a. operator $A$ that is defined in this way.
Theorem 11 For any POV measure $E : \mathbf{F} \to \mathbf{L}(\mathbf{H})$ and any $T \in \mathbf{T}(\mathbf{H})^+_1$, the mapping
$$p^E_T : \mathbf{F} \to [0, 1]$$
defined by
$$p^E_T(X) = \mathrm{tr}[T E(X)]$$
for all $X \in \mathbf{F}$ is a real σ-additive probability measure with values in $[0, 1]$.
This follows from the defining properties of $E$ and the continuity and linearity of the trace, see e.g. [11]. The measure is preserved by unitary maps,
$$\mathrm{tr}[(U T U^\dagger)(U E(X) U^\dagger)] = \mathrm{tr}[T E(X)].$$
We note that a convex combination of states induces a convex combination of measures,
$$T = \sum_k w_k T_k \;\Rightarrow\; p^E_T = \sum_k w_k \, p^E_{T_k}.$$
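The probability formula and its convexity can be checked numerically on a small case. The sketch below is only an illustration under assumptions that do not come from the text: a two-dimensional Hilbert space, a two-outcome unsharp POV measure with a hypothetical sharpness parameter `ETA`, and two arbitrarily chosen states. All helper names are made up for this example.

```python
# Minimal sketch: p_T^E(X) = tr[T E(X)] for a qubit and a two-outcome POV measure.
# ETA, the states and all function names are illustrative assumptions.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def prob(T, E):
    """Probability tr[T E] of the outcome described by effect E in state T."""
    return trace(matmul(T, E)).real

def mix(w, T1, T2):
    """Convex combination w*T1 + (1-w)*T2 of two state operators."""
    return [[w * T1[i][j] + (1 - w) * T2[i][j] for j in range(2)] for i in range(2)]

ETA = 0.6  # 0 < ETA < 1 gives non-projective (unsharp) effects
E_plus = [[(1 + ETA) / 2, 0], [0, (1 - ETA) / 2]]
E_minus = [[(1 - ETA) / 2, 0], [0, (1 + ETA) / 2]]
povm = [E_plus, E_minus]  # E_plus + E_minus is the identity

T1 = [[1, 0], [0, 0]]            # extremal state P[|0>]
T2 = [[0.5, 0.5], [0.5, 0.5]]    # extremal state P[(|0>+|1>)/sqrt(2)]
w = 0.25
T = mix(w, T1, T2)

p_T = [prob(T, E) for E in povm]
p_mix = [w * prob(T1, E) + (1 - w) * prob(T2, E) for E in povm]
print(p_T, p_mix)
```

The two printed lists agree, which is the convexity property, and each sums to one because the two effects add up to the identity.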
More about mathematical properties of states and effects and the corresponding spaces $\mathbf{T}(\mathbf{H})^+_1$ and $\mathbf{L}_r(\mathbf{H})^+_{\leq 1}$ can be found in Ref. [6].

2.2.2. General rules

The physical interpretation of POV measures is given by:
Rule 4 Any quantum mechanical observable for system $\mathcal{S}$ of type τ is mathematically described by some POV measure $E : \mathbf{F} \to \mathbf{L}_r(\mathbf{H}_\tau)$. Each outcome of an individual registration of the observable $E(X)$ performed on quantum object $\mathcal{S}$ yields an element of $\mathbf{F}$. Each registration apparatus that interacts with $\mathcal{S}$ determines a unique observable of $\mathcal{S}$.
Definition 10 If a PV measure is an observable, the observable is called sharp.
Often, a stronger assumption than Rule 4 is made, namely that each s.a. operator on $\mathbf{H}_\tau$ is a sharp observable of system $\mathcal{S}$ of type τ. Such systems are called proper quantum systems [11]. In Section 3.2.2 and Section 5.2, we shall show that, in a strict sense, there is no proper quantum system. This does not seem to represent any genuine difficulty. Usually, the construction of a model needs only a few observables and, for a given object, our theory of quantum measurement will itself determine which POV measures cannot be observables. These facts do not invalidate all applications of $C^*$-algebras (e.g. [45]) as a powerful mathematical method in quantum mechanics. In particular, $C^*$-algebras are useful in quantum field theory, where they are generated by local operators, which are measurable according to our theory. However, some of the physical interpretations that are sometimes [46] given to $C^*$-algebras in non-relativistic quantum mechanics are difficult to maintain.
An important assumption of quantum mechanics is the following generalization of Born's rule:
Rule 5 The number $p^E_T(X)$ defined by Equation (14) is the probability that a registration of the observable $E(X)$ performed on object $\mathcal{S}$ in the state $T$ leads to a result in the set $X$.
Using the same preparation $N$ times results in a set of systems with the same state $T$. Performing on each element the registration of $E(X)$ gives a value (or an approximate value) $y \in \Omega$. The relative frequency of finding $y \in X$ on this set approaches $p^E_T(X)$ as $N \to \infty$. This is equal to the probability measure given by Equation (15). Thus, the relative frequency of a measurement result for a registration on elements of an ensemble can be directly calculated from the state operator. The relative frequencies of a registration result on the ensemble so constructed are approximately measurable and this gives the physical meaning to the probability distribution $p^E_T(X)$. This does not mean that we define probability as a frequency, see Section 1.0.4.
Theorem 9 implies that a state $T$ can be determined uniquely, if it is prepared many times and a sufficient number of different registrations can be performed on it. On the other hand, the question of what is its statistical decomposition cannot be decided in this way because $\mathrm{tr}[E T]$ is generally independent of the statistical decomposition of $T$. There is one exception: extremal states.
Observe that the probability is only meaningful if it concerns really existing (but maybe unknown) outcomes, see Section 1.0.4. Thus, it can only be associated with registration. Hence, in Rule 5, the probability $p^E_T(X)$ refers to registrations on an object and not to the object itself. In general, the property that value $x \in \Omega$ lies within subset $X \in \mathbf{F}$ is not an objective property of objects in the sense that it could be attributed to an object itself and that the measurement would just reveal it. Such an assumption would lead to contradictions, see Section 2.2.4. Thus, there is an asymmetry between states and observables: states are objective properties but observables are not; all elements of $\mathbf{T}(\mathbf{H}_\tau)^+_1$ can be prepared but not all effects of the dual space are observables.
Special cases in which the outcome of a registration is predictable are described as follows. Let $E : \mathbf{F} \to \mathbf{L}_r(\mathbf{H}_\tau)$ be an observable and let $T \in \mathbf{T}(\mathbf{H}_\tau)^+_1$ be such that
$$\mathrm{tr}[T E(X)] = 1$$
for some $X \in \mathbf{F}$. Then the probability that the registration of $E(X)$ on $T$ will give a value in $X$ is 1 and we can say that the prepared object in the state $T$ possesses the property independently of any measurement. Theorem 10 implies that $T$ is an eigenstate of $E(X)$ to eigenvalue 1.
Rule 5 implies useful formulae for averages and higher moments of observable $E$ that generalize the well-known formulae for sharp observables. Let us restrict the dimension of POV measures to $n = 1$. For the average $\langle E \rangle_T$ of observable $E$ in the state $T$, we have
$$\langle E \rangle_T = \int_{\mathbb{R}} \iota \, dp^E_T.$$
Using Equation (14), we obtain
$$\langle E \rangle_T = \int_{\mathbb{R}} \iota \, d\,\mathrm{tr}[T E] = \mathrm{tr}\left[T \int_{\mathbb{R}} \iota \, dE\right].$$
For a sharp observable $E^A$, Equation (13) yields the usual relation
$$\langle A \rangle_T = \mathrm{tr}[T A].$$
In the case of an extremal state, $T = P[\phi]$, we obtain $\langle A \rangle_T = \langle \phi | A \phi \rangle$. For higher moments, we just have to substitute a suitable power of $\iota$ in the integral and the proof is analogous.
Definition 11 Let $A$ be a sharp observable and $T$ be a state. Then
$$\Delta_T A = \sqrt{\langle A^2 \rangle_T - \langle A \rangle_T^2}$$
is called variance of $A$ in $T$.
Next,
Definition 12 The normalized correlation in a state $T$ of any two commuting sharp observables $A$ and $B$ is defined by
$$C(A, B, T) = \frac{\langle A B \rangle_T - \langle A \rangle_T \langle B \rangle_T}{\Delta_T A \, \Delta_T B}.$$
The normalized correlation satisfies
$$-1 \leq C(A, B, T) \leq 1$$
for all commuting sharp observables $A$ and $B$ and for all states $T$ because of Schwarz' inequality. The proof is based on the facts that $\mathrm{Re}\,\langle A B \rangle_T$ is a positive symmetric bilinear form on the real linear space $\mathbf{L}_r(\mathbf{H}_\tau)$ and that $\mathrm{Re}\,\langle A B \rangle_T = \langle A B \rangle_T$ if $A$ and $B$ commute. If $C(A, B, T) = 1$, the observables are strongly correlated, if $C(A, B, T) = 0$, they are uncorrelated and if $C(A, B, T) = -1$, they are strongly anticorrelated (see [11], p. 50).
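The three limiting values of the normalized correlation can be illustrated on a pair of two-level systems. The sketch below assumes, purely for illustration, the commuting observables $A = \sigma_z \otimes 1$ and $B = 1 \otimes \sigma_z$; because both are diagonal in the basis $|00\rangle, |01\rangle, |10\rangle, |11\rangle$, their averages in an extremal state $P[\psi]$ depend only on the weights $|c_k|^2$. The function names are hypothetical.

```python
import math

# Eigenvalues of A = sigma_z (x) 1 and B = 1 (x) sigma_z in the common
# eigenbasis |00>, |01>, |10>, |11> (an assumed example, not from the text).
A = [1, 1, -1, -1]
B = [1, -1, 1, -1]

def mean(diag, psi):
    """Average <D>_T of a diagonal operator D in the extremal state T = P[psi]."""
    return sum(abs(c) ** 2 * d for c, d in zip(psi, diag))

def spread(diag, psi):
    """Delta_T D = sqrt(<D^2>_T - <D>_T^2)."""
    return math.sqrt(mean([d * d for d in diag], psi) - mean(diag, psi) ** 2)

def correlation(psi):
    """Normalized correlation C(A, B, T) of Definition 12."""
    ab = [a * b for a, b in zip(A, B)]
    cov = mean(ab, psi) - mean(A, psi) * mean(B, psi)
    return cov / (spread(A, psi) * spread(B, psi))

s = 1 / math.sqrt(2)
c_same = correlation([s, 0, 0, s])           # (|00> + |11>)/sqrt(2)
c_opp = correlation([0, s, -s, 0])           # (|01> - |10>)/sqrt(2)
c_prod = correlation([0.5, 0.5, 0.5, 0.5])   # product state |+>|+>
print(c_same, c_opp, c_prod)
```

The first state gives strong correlation, the second strong anticorrelation and the product state gives uncorrelated observables.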
The values of an observable are not objective properties of an object but its states are. As averages and moments of a given observable are uniquely determined by states, we have
Proposition 1 The average $\langle A \rangle_T$ and variance $\Delta_T A$ of any sharp observable $A$ and correlation $C(A, B, T)$ of any two commuting sharp observables $A$ and $B$ in state $T$ of object $\mathcal{S}$ are an objective (dynamical) property of $\mathcal{S}$ that has been prepared in state $T$.
Proposition 1 gives the most important examples of simple properties (they have been defined in Section 2.1.2.).
Let us consider a discrete observable $A$ (which is always sharp) with spectrum $\{a_k\}$ that is non-degenerate, $P_k = P[\psi_k]$ for all $k$, where $\psi_k$ are eigenvectors of $A$. If we prepare the state $P[\psi_k]$ and then register $A$, the result must be $a_k$ with probability 1. Next suppose that we prepare a linear superposition $P[\Psi]$,
$$\Psi = \sum_k c_k \psi_k,$$
with $\sum_k |c_k|^2 = 1$. Before the registration, the object in this state does not possess any of the values $a_k$ but one can say that it does possess all those $a_k$ simultaneously for which $c_k \neq 0$. The probability $p_k$ that the registration of $A$ will give the result $a_k$ is
$$p_k = \mathrm{tr}[P[\psi_k] P[\Psi]] = |c_k|^2.$$
Equation (18) gives the physical meaning to the absolute value of the coefficients in the linear superposition and is called the Born rule.
There is also some physical meaning of the relative phases of the coefficients of decomposition (18). This can be seen experimentally by means of the interference phenomena and correlations. As for interference, the average in state $P[\Psi]$ of observable $B$ that does not commute with $A$ does depend on matrix elements of $B$ between different states $\psi_k$:
$$\mathrm{tr}[B P[\Psi]] = \sum_k \sum_l c_k^* c_l \langle \psi_k | B | \psi_l \rangle.$$
Registering values of $B$ many times on systems in state $P[\Psi]$ and knowing that they add to average (19), we can see that each individual system must “know” the values of $c_k^* c_l \langle \psi_k | B | \psi_l \rangle$ for all pairs $\{k, l\}$. Similarly, the normalized correlations of a suitable pair of observables can depend on the cross terms in Equation (8).
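The role of the relative phase can be made concrete in a two-level example. The following sketch assumes, for illustration only, $A = \sigma_z$ with eigenvectors $\psi_1, \psi_2$ and $B = \sigma_x$; it compares the superposition $\Psi = (\psi_1 + e^{i\theta}\psi_2)/\sqrt{2}$ with the convex combination of $P[\psi_k]$ carrying the same weights $|c_k|^2$. The names are hypothetical.

```python
import cmath
import math

# Matrix elements <psi_k|B|psi_l> of B = sigma_x in the eigenbasis of A = sigma_z
# (an assumed example; any B not commuting with A would do).
B = [[0, 1], [1, 0]]

def born_probabilities(c):
    """p_k = |c_k|^2 for a registration of A."""
    return [abs(ck) ** 2 for ck in c]

def average_in_superposition(c, B):
    """tr[B P[Psi]] = sum_k sum_l conj(c_k) c_l <psi_k|B|psi_l>."""
    n = len(c)
    return sum(c[k].conjugate() * c[l] * B[k][l] for k in range(n) for l in range(n)).real

def average_in_mixture(c, B):
    """Average of B in the convex combination sum_k |c_k|^2 P[psi_k] (no cross terms)."""
    return sum(abs(c[k]) ** 2 * B[k][k] for k in range(len(c)))

def coefficients(theta):
    return [1 / math.sqrt(2), cmath.exp(1j * theta) / math.sqrt(2)]

for theta in (0.0, math.pi / 2, math.pi):
    c = coefficients(theta)
    print(born_probabilities(c), average_in_superposition(c, B), average_in_mixture(c, B))
```

The Born probabilities for $A$ stay at one half for every phase, while the average of $B$ in the superposition follows $\cos\theta$ and the average in the convex combination stays at zero.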
Let $\{\psi_n\}$ be a basis of Hilbert space $\mathbf{H}_\tau$ and $a(n)$ a strictly monotonic function of $n$. Then, the s.a. operator
$$A = \sum_n a(n) \, |\psi_n\rangle \langle \psi_n|$$
is a discrete non-degenerate observable. The registration of any discrete non-degenerate observable is called a complete test (see, e.g., [21], p. 29). If state $T$ is prepared, then the probability distribution of a complete test $A$ is
$$p_n = \mathrm{tr}[T \, |\psi_n\rangle \langle \psi_n|].$$
In quantum mechanics, probability distributions of registration outcomes are generally associated with both preparations and registrations. This has to do with the non-objective nature of observables. The amount of information that can be gained in a complete test is given by the Shannon entropy (1) of the distribution $p_n$ and depends both on state $T$ and observable $A$.
Still, can there be a general measure of information lack associated with a state alone? Clearly, such a question is meaningful and the answer is the minimum of entropies, each associated with a complete test. This minimum, $S(T)$, is a well-defined function of state $T$ and is called von Neumann entropy. One can show (see, e.g., Ref. [21]) that, up to a constant factor,
$$S(T) = -\mathrm{tr}[T \ln(T)].$$
As each $T$ must have a discrete spectrum with positive eigenvalues $t_k$, we have
$$S(T) = -\sum_k t_k \ln(t_k).$$
The lack of information $S(T)$ is associated with the state $T$ of object $\mathcal{S}$ and consequently with the (classical) condition $C$ that defines the preparation. Hence, according to our criterion of objectivity, von Neumann entropy is an objective property of object $\mathcal{S}$ prepared in state $T$. The objective property of a state that is given by a value of von Neumann entropy can be called its fuzziness.
The only states that are not fuzzy are the extremal ones. Indeed, the complete test on extremal state $|\psi\rangle \langle \psi|$ defined by any basis with $\psi_1 = \psi$ has a trivial probability distribution $p_n = \delta_{n1}$, for which the entropy is zero, and entropy cannot be negative. A very important observation is that a quantum mechanical state can be both fuzzy and indecomposable. An example is given in Section 3.1.2.
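The characterization of $S(T)$ as the minimum of the Shannon entropies of complete tests can be explored numerically. The sketch below is restricted, as an assumption made for simplicity, to a two-level system with a state that is diagonal with eigenvalues $0.9$ and $0.1$ and to complete tests whose basis is rotated by a real angle $\theta$ against the eigenbasis; it scans only this one-parameter family rather than all bases. All names are illustrative.

```python
import math

def shannon(p):
    """Shannon entropy -sum p ln p, with the convention 0 ln 0 = 0."""
    return -sum(x * math.log(x) for x in p if x > 0)

def complete_test_distribution(t, theta):
    """p_n = tr[T |psi_n><psi_n|] for T = diag(t) and a basis rotated by theta."""
    c2, s2 = math.cos(theta) ** 2, math.sin(theta) ** 2
    return [t[0] * c2 + t[1] * s2, t[0] * s2 + t[1] * c2]

t = [0.9, 0.1]                       # assumed spectrum of the state T
von_neumann = shannon(t)             # S(T) = -sum_k t_k ln t_k

angles = [k * math.pi / 200 for k in range(101)]   # theta from 0 to pi/2
test_entropies = [shannon(complete_test_distribution(t, th)) for th in angles]

pure_entropy = shannon([1.0, 0.0])   # complete test adapted to an extremal state
print(von_neumann, min(test_entropies), max(test_entropies), pure_entropy)
```

Within this family the minimum is reached in the eigenbasis of $T$ and equals $-\sum_k t_k \ln t_k$, the maximum is $\ln 2$ at $\theta = \pi/4$, and the adapted test on an extremal state gives zero.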

2.2.3. Joint measurability

This section adapts the theory of joint measurability as given, e.g., in [6], p. 84., to our changes in general language of quantum mechanics.
Proposition 2 Let $A$ and $B$ be two s.a. operators with a common invariant domain, let $i[A, B]$ have a s.a. extension and let $T$ be an arbitrary state. Then
$$\Delta A \, \Delta B \geq \frac{|\langle [A, B] \rangle|}{2}.$$
Equation (21) is called the uncertainty relation.
The interpretation of the uncertainty relation is as follows. If we prepare many copies of an object in state $T$ and register either $A$ or $B$ or $i[A, B]$ on each, then the average of $[A, B]$ and variances of $A$ as well as of $B$ will satisfy Equation (21). It is not necessary to register all observables jointly.
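A finite-dimensional instance of the uncertainty relation is easy to verify directly. The sketch below assumes, for illustration, $A = \sigma_x$, $B = \sigma_y$ and a state with Bloch vector $(0.3, 0.2, 0.8)$; all three quantities are computed from the same state operator, mirroring the remark that the observables need not be registered jointly. The helper names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def expect(T, A):
    """Average tr[T A]."""
    m = matmul(T, A)
    return m[0][0] + m[1][1]

def spread(T, A):
    """Delta_T A = sqrt(<A^2>_T - <A>_T^2), clipped at zero against rounding."""
    mean = expect(T, A).real
    return math.sqrt(max(expect(T, matmul(A, A)).real - mean ** 2, 0.0))

SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

def bloch_state(rx, ry, rz):
    """State operator (1 + r.sigma)/2; requires rx^2 + ry^2 + rz^2 <= 1."""
    return [[(1 + rz) / 2, (rx - 1j * ry) / 2], [(rx + 1j * ry) / 2, (1 - rz) / 2]]

T = bloch_state(0.3, 0.2, 0.8)       # assumed example state
ab, ba = matmul(SX, SY), matmul(SY, SX)
commutator = [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

lhs = spread(T, SX) * spread(T, SY)
rhs = abs(expect(T, commutator)) / 2
print(lhs, rhs)
```

For this state the left-hand side is $\sqrt{0.91 \cdot 0.96}$ and the right-hand side is $0.8$, so the inequality holds with some margin.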
An important notion of quantum mechanics is that of joint measurability. Often, it is also called simultaneous measurability but the notion has nothing to do with time. It simply means that there is one registration device that can measure both quantities.
Definition 13 Two elements $E_1$ and $E_2$ of $\mathbf{L}_r(\mathbf{H})^+_{\leq 1}$ are jointly measurable if there is a POV measure $E : \mathbf{F} \to \mathbf{L}_r(\mathbf{H})^+_{\leq 1}$ such that $E_1 = E(X_1)$ and $E_2 = E(X_2)$ for some $X_1$ and $X_2$ in $\mathbf{F}$.
This is a mathematical property that, in fact, is only a necessary condition for existence of the required registration device. Even if E existed, the question whether it is an observable is non-trivial.
Proposition 3 Two effects $E_1$ and $E_2$ are jointly measurable if and only if there are three elements $E'_1$, $E'_2$ and $E_{12}$ in $\mathbf{L}_r(\mathbf{H})^+_{\leq 1}$ such that
$$E'_1 + E'_2 + E_{12} \in \mathbf{L}_r(\mathbf{H})^+_{\leq 1}$$
and
$$E_1 = E'_1 + E_{12}, \quad E_2 = E'_2 + E_{12}.$$
For proof, see [6], p. 89. Proposition 3 has two corollaries: (1) two projections are jointly measurable if they commute and (2) a sufficient condition for joint measurability of $E_1$ and $E_2$ is that
$$E_1 + E_2 \in \mathbf{L}_r(\mathbf{H})^+_{\leq 1}.$$
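Both corollaries can be read off from Proposition 3 by exhibiting the three auxiliary effects explicitly. The following is a sketch of that step; the particular decompositions are one convenient choice made here for illustration and are not quoted from Ref. [6].

```latex
% (1) Commuting projections E_1, E_2: E_1 E_2 is again a projection, and so are
%     E_1 - E_1 E_2 and E_2 - E_1 E_2. Choose
\begin{align*}
E_{12} &= E_1 E_2, \qquad E'_1 = E_1 - E_1 E_2, \qquad E'_2 = E_2 - E_1 E_2, \\
E'_1 + E'_2 + E_{12} &= E_1 + E_2 - E_1 E_2 \leq 1 ,
\end{align*}
% because E_1 + E_2 - E_1 E_2 is the projection onto the span of the two ranges.

% (2) Effects with E_1 + E_2 \leq 1: choose
\begin{align*}
E_{12} &= 0, \qquad E'_1 = E_1, \qquad E'_2 = E_2, \\
E'_1 + E'_2 + E_{12} &= E_1 + E_2 \leq 1 .
\end{align*}
% A POV measure containing both effects is then given on \Omega = \{1, 2, 3\} by
% E(\{1\}) = E_1, \quad E(\{2\}) = E_2, \quad E(\{3\}) = 1 - E_1 - E_2 .
```

In both cases $E_1 = E'_1 + E_{12}$ and $E_2 = E'_2 + E_{12}$ hold by construction, so the conditions of Proposition 3 are met.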
If an observable is sharp, then its effects commute, but the effects of a general observable do not necessarily commute. Thus, even non-commuting operators can be jointly measurable. This is completely compatible with standard quantum mechanics. Indeed, any POV measure that is an observable of an object $\mathcal{S}$ can be measured as a sharp observable of a composite system containing $\mathcal{S}$ and another object called ancilla as subsystems. For proof, see e.g. [21], p. 285. An important example will be given in Section 4.7.
Two observables $E_1 : \mathbf{F}_1 \to \mathbf{L}_r(\mathbf{H})$ and $E_2 : \mathbf{F}_2 \to \mathbf{L}_r(\mathbf{H})$ with $\mathbf{F}_1$ comprising the Borel subsets of $\mathbb{R}^{n_1}$ and $\mathbf{F}_2$ those of $\mathbb{R}^{n_2}$ are called jointly measurable, if each two effects $E_1(X_1)$ and $E_2(X_2)$ are jointly measurable. In this case, there is an effect $E_1(X_1) \wedge E_2(X_2)$ in $\mathbf{L}_r(\mathbf{H})^+_{\leq 1}$ giving the probability that observable $E_1$ has value in $X_1$ and observable $E_2$ has value in $X_2$. Then, there is a unique observable
$$E : \mathbf{F} \to \mathbf{L}_r(\mathbf{H}),$$
where $\mathbf{F}$ is the set of Borel subsets of $\mathbb{R}^{n_1 + n_2}$ and
$$E(X_1 \times X_2) = E_1(X_1) \wedge E_2(X_2).$$
We call $E$ the compound of $E_1$ and $E_2$.
An important example are two sharp observables $A$ and $B$. They are jointly measurable, if and only if all their effects (which are projections) commute with each other, i.e., the operators commute. In this case, the wedge is just the ordinary operator product,
$$E^A(X_1) \wedge E^B(X_2) = E^A(X_1) \, E^B(X_2).$$
The compound $E^{A \wedge B}$ of $E^A$ and $E^B$ is an observable that represents registration of a pair of values, one of $A$, the other of $B$.
If two sharp observables A and B do not commute then they can be jointly measurable at best with the inaccuracy corresponding to their uncertainty relation. This means that only some effects of A are jointly measurable with only some effects of B . An example will be given in Section 4.7.

2.2.4. Contextuality

The investigations in the field of contextuality were motivated by the following problem. In Newtonian mechanics, a statistical state of any system $\mathcal{S}$ is described by probability distributions $\rho : Z \mapsto [0, 1]$, $Z \in \Gamma$, on its phase space Γ. Any such distribution results from a fixed preparation and describes the ensemble of individual systems $\mathcal{S}$ prepared in this way. Each individual system of the ensemble is, however, always assumed, at least in Newtonian mechanics, to be in some state given by a point $Z$ of Γ. We just do not know which point and the distribution $\rho(Z)$ describes the state of our incomplete knowledge. Thus, all observables, which are functions on Γ, have determinate values for each element of the ensemble.
Can anything analogous be assumed for quantum mechanics, making a general state operator analogous to a statistical state of Newtonian mechanics? Could a state operator of a quantum object $\mathcal{S}$ describe our knowledge of an ensemble defined by the preparation and could the individual elements of the ensemble each have determinate values of all observables, which are just not known, or even, be it for whatever reason, could not be known? To solve this problem, it is sufficient to restrict the observables to sharp observables, because if the question had a negative answer for a restricted class, it would have a negative answer for any class containing the restricted one.
It is technically advantageous to restrict the problem even further by limiting oneself to (orthogonal) projection operators to subspaces of $\mathbf{H}_\tau$. Let us denote the set of all such projections by $\mathbf{P}(\mathbf{H}_\tau) \subset \mathbf{L}_r(\mathbf{H}_\tau)$. The problem can then be formulated as follows. Is there a dispersion-free probability distribution $h : \mathbf{P}(\mathbf{H}_\tau) \to \{0, 1\}$? Dispersion-free means that its values are just zero and one, so that such a distribution determines values of all projections.
The dispersion-free distribution $h$ has to fulfill certain conditions, or else it could not be interpreted as concerning properties. Let us first describe the structure of $\mathbf{P}(\mathbf{H}_\tau)$.
Each projection $a \in \mathbf{P}(\mathbf{H}_\tau)$ can be mapped on the linear subspace $a(\mathbf{H}_\tau)$ of $\mathbf{H}_\tau$ and this bijective map allows one to define the lattice relations on $\mathbf{P}(\mathbf{H}_\tau)$. First, $a \leq b$ if $a(\mathbf{H}_\tau) \subset b(\mathbf{H}_\tau)$. This defines a partial ordering on $\mathbf{P}(\mathbf{H}_\tau)$. Second, $a = b^\perp$ if $a(\mathbf{H}_\tau)$ consists of all vectors orthogonal to $b(\mathbf{H}_\tau)$. Projection $a$ is called the orthocomplement of $b$. Third, $c = a \wedge b$ if $c(\mathbf{H}_\tau)$ is the set-theoretical intersection of $a(\mathbf{H}_\tau)$ and $b(\mathbf{H}_\tau)$. Observe that $a \wedge b = a b$ (operator product of the projections) only if $a$ and $b$ commute (are orthogonal). Finally, $c = a \vee b$ if $c(\mathbf{H}_\tau)$ is the linear hull of $a(\mathbf{H}_\tau)$ and $b(\mathbf{H}_\tau)$.
$\mathbf{P}(\mathbf{H}_\tau)$ with these relations forms the so-called orthocomplemented lattice (for proof, see, e.g., Refs. [47,48]), but not a Boolean lattice. (In the so-called quantum logic, properties of $\mathcal{S}$ are described by elements of $\mathbf{P}(\mathbf{H}_\tau)$. They represent the mathematical counterpart of the so-called YES-NO registrations [49,50]. If we pretend that the values obtained in the YES-NO experiments are properties of a well-defined single quantum system, then we are forced to replace the Boolean lattice of ordinary logic by the orthocomplemented lattice of quantum logic. But this pretence is against all logic because these values are not properties of one but of many different systems each consisting of $\mathcal{S}$ plus some registration apparatus.) $\mathbf{P}(\mathbf{H}_\tau)$ has, however, Boolean sublattices, which represent sets of jointly measurable sharp effects, and must therefore contain only mutually commuting projections. On these sublattices, $h$ is, in fact, an assignment of truth values 0 and 1, and it has to satisfy the usual logical conditions
$$h(a \vee b) = h(a) \vee h(b), \quad h(a \wedge b) = h(a) \wedge h(b), \quad h(a^\perp) = h(a)^\perp.$$
In these relations, we consider the set $\{0, 1\}$ as a Boolean lattice of one empty set and one arbitrary non-empty set $A$ with ⊥ the set-theoretical complement in $A$, ∨ the set-theoretical union and ∧ the set-theoretical intersection. Thus, on each Boolean sublattice, $h$ must also be a Boolean lattice homomorphism.
The first result relevant to the question of existence of such maps is Gleason's theorem [51]. It states that the set of all probability distributions on $\mathbf{P}(\mathbf{H}_\tau)$ for Hilbert space $\mathbf{H}_\tau$ of dimension greater than 2 is just $\mathbf{T}(\mathbf{H}_\tau)^+_1$. Thus, our dispersion-free distributions are to be found in $\mathbf{T}(\mathbf{H}_\tau)^+_1$. It is easy to show that there are none there.
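The last step can be spelled out in a few lines. The argument below is a sketch supplied for illustration; it uses only the spectral decomposition of a state operator and the assignment $h(P[\psi]) = \mathrm{tr}[T P[\psi]]$ that Gleason's theorem attaches to a state $T$. The symbols $t_k$, $\phi_k$ and $\chi$ are introduced here for the purpose of the example.

```latex
% Let T = \sum_k t_k P[\phi_k] with t_k > 0, \sum_k t_k = 1, \phi_k orthonormal.
% For a one-dimensional projection P[\psi] the induced distribution is
\[
h(P[\psi]) = \mathrm{tr}\,[T P[\psi]] = \langle \psi | T \psi \rangle .
\]
% Case 1: T is not extremal, so some eigenvalue satisfies 0 < t_1 < 1. Then
\[
h(P[\phi_1]) = t_1 \notin \{0, 1\} .
\]
% Case 2: T = P[\phi_1] is extremal. Take a unit vector \chi orthogonal to \phi_1
% and \psi = (\phi_1 + \chi)/\sqrt{2}. Then
\[
h(P[\psi]) = |\langle \psi | \phi_1 \rangle|^2 = \tfrac{1}{2} \notin \{0, 1\} .
\]
% In both cases h takes a value different from 0 and 1, so no element of
% T(H_\tau)^+_1 defines a dispersion-free distribution.
```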
One can object that $\mathbf{P}(\mathbf{H}_\tau)$ is an idealization containing an infinity of observables. This leads to the question whether there is a finite subset of $\mathbf{P}(\mathbf{H}_\tau)$ that does not admit such distributions, either. This is the next relevant result, the Kochen–Specker no-go theorem [2], in which an example of such a subset is given. There are more examples now provided by different physicists (see, e.g., Ref. [48]). The reason why $h$ does not exist is that the assignment of a truth value to projection $a$, say, depends on which Boolean sublattice $a$ belongs to. This can be understood as “context”: in one case, $a$ is registered jointly with elements of one Boolean sublattice, in another case with those of another sublattice.
The definitive result in this field seems to be the Bub–Clifton–Goldstein theorem [48], which lists all maximal sublattices of $\mathbf{P}(\mathbf{H}_\tau)$ that admit dispersion-free probability distributions. They do not need to be Boolean but they are always only proper sublattices of $\mathbf{P}(\mathbf{H}_\tau)$. Hence, only a limited number of projections can be assumed to have determinate values before registration, and this limits possible “non-collapse” interpretations and modifications of quantum mechanics such as Bohm–de Broglie or modal interpretations. In fact, each of these interpretations or modifications is based on a unique Bub–Clifton–Goldstein sublattice so that these theories can be classified according to these sublattices [48].

2.2.5. Superselection rules

A preparation that deals only with systems of one and the same type prepares so-called one-type systems. Not every preparation is such. For example, we can randomly mix electrons in some states with protons in some states. Such systems, and some generalizations of them, are called mixed systems. (This has nothing to do with the term “mixed state”, which is sometimes used for non-vector states.) In this section, the mathematical description of mixed systems will be explained. We shall mix just two one-type systems, S 1 and S 2 , but the generalization to any number is straightforward.
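Let $\mathbf{H}_1$ and $\mathbf{H}_2$ be the Hilbert spaces of $\mathcal{S}_1$ and $\mathcal{S}_2$. Then the Hilbert space $\mathbf{H}$ of the system $\mathcal{S}$ mixing $\mathcal{S}_1$ and $\mathcal{S}_2$ is
$$\mathbf{H} = \mathbf{H}_1 \oplus \mathbf{H}_2.$$
The direct sum on the right-hand side is the space of pairs $(\psi_1, \psi_2)$, $\psi_1 \in \mathbf{H}_1$ and $\psi_2 \in \mathbf{H}_2$, with linear superposition defined by
$$a (\psi_1, \psi_2) + b (\phi_1, \phi_2) = (a \psi_1 + b \phi_1, \; a \psi_2 + b \phi_2)$$
and the inner product defined by
$$\langle (\psi_1, \psi_2), (\phi_1, \phi_2) \rangle = \langle \psi_1, \phi_1 \rangle_1 + \langle \psi_2, \phi_2 \rangle_2,$$
where $\langle \cdot, \cdot \rangle_i$ is the inner product of $\mathbf{H}_i$. There are then embeddings $\iota_i : \mathbf{H}_i \to \mathbf{H}$ defined by
$$\iota_1(\psi_1) = (\psi_1, 0), \quad \iota_2(\psi_2) = (0, \psi_2)$$
for any $\psi_i \in \mathbf{H}_i$ and $i = 1, 2$.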
Let $\mathbf{L}(\mathbf{H})$ be the algebra of bounded linear operators on $\mathbf{H}$ and $\mathbf{L}(\mathbf{H}_i)$ that of $\mathbf{H}_i$. If two operators $A_i \in \mathbf{L}(\mathbf{H}_i)$, $i = 1, 2$, are given, then an operator $(A_1 \oplus A_2) \in \mathbf{L}(\mathbf{H})$, called the direct sum of $A_1$ and $A_2$, is defined by
$$(A_1 \oplus A_2)(\psi_1, \psi_2) = (A_1(\psi_1), A_2(\psi_2))$$
for any $(\psi_1, \psi_2) \in \mathbf{H}$. The special property of the direct-sum operators is that the subspaces $\iota_i(\mathbf{H}_i) \subset \mathbf{H}$ are invariant with respect to them. Clearly, not all operators in $\mathbf{L}(\mathbf{H})$ are of this form. Effects of the form $A_1 \oplus 0_2$ ($0_1 \oplus A_2$) can be interpreted as representing registrations done on $\mathcal{S}_1$ ($\mathcal{S}_2$) alone.
The embeddings $\iota_i$ define maps on projections $P[\psi_i]$, which can be denoted by the same symbol, viz.
$$\iota_i(P[\psi_i]) = P[\iota_i(\psi_i)]$$
for all $\psi_i \in \mathbf{H}_i$. The maps $\iota_i$ can be extended to the whole spaces $\mathbf{T}(\mathbf{H}_i)^+_1$, $\iota_i : \mathbf{T}(\mathbf{H}_i)^+_1 \to \mathbf{T}(\mathbf{H})^+_1$, as follows. Let $T_i \in \mathbf{T}(\mathbf{H}_i)^+_1$, then
$$\iota_1(T_1) = T_1 \oplus 0, \quad \iota_2(T_2) = 0 \oplus T_2.$$
The convex combinations of states from $\mathbf{T}(\mathbf{H}_1)^+_1$ and $\mathbf{T}(\mathbf{H}_2)^+_1$ are states of mixed systems defined at the beginning of this section. However, the states $\mathbf{T}(\mathbf{H})^+_1$ are not exhausted by convex combinations of states from $\mathbf{T}(\mathbf{H}_1)^+_1$ and $\mathbf{T}(\mathbf{H}_2)^+_1$.
The structural properties in which the systems $\mathcal{S}_1$ and $\mathcal{S}_2$ differ from each other can now be viewed as non-trivial operators on $\mathbf{H}$. Let for example the masses $\mu_1$ and $\mu_2$ satisfy $\mu_1 \neq \mu_2$. Then the s.a. operator $\mu_1 1_1 \oplus \mu_2 1_2$ defines the mass operator $m$ on $\mathbf{H}$ with two eigenspaces $\iota_i(\mathbf{H}_i)$ and eigenvalues $\mu_i$, $i = 1, 2$. Similarly for charges, spins, etc. Now, it is easy to see that all operators of $\mathbf{L}(\mathbf{H})$ that commute with $m$ have the form $A_1 \oplus A_2$ and all operators of this form commute with $m$. Clearly, s.a. operators of the form $A_1 \oplus A_2$ are observables of system $\mathcal{S}$. The nature of mixing systems suggests the basic rule
Rule 6 All effects that can be registered on system $\mathcal{S}$ mixing $\mathcal{S}_1$ and $\mathcal{S}_2$ have the form $A_1 \oplus A_2$ where $A_1 \in \mathbf{L}_r(\mathbf{H}_1)^+_{\leq 1}$ and $A_2 \in \mathbf{L}_r(\mathbf{H}_2)^+_{\leq 1}$.
Hence, the space of all observables of $\mathcal{S}$ is not $\mathbf{L}_r(\mathbf{H})^+_{\leq 1}$ but the subset of all observables that commute with $m$.
In general, we define
Definition 14 A discrete sharp observable $Z$ of a system $\mathcal{S}$ with Hilbert space $\mathbf{H}$ is called a superselection observable if $Z$ commutes with all observables of $\mathcal{S}$, sharp or not sharp. The existence of such an observable is called a superselection rule. The eigenspaces of $Z$ are called superselection sectors. All superselection observables form the centre $\mathcal{Z}$ of the algebra of all sharp observables of $\mathcal{S}$.
A restriction of the set of observables for system $\mathcal{S}$ suggests the introduction of an equivalence relation on the set of states $\mathbf{T}(\mathbf{H}_\tau)^+_1$.
Definition 15 Two states $T_1$ and $T_2$ are equivalent with respect to a set $\mathbf{O}$ of observables if these states assign the same probability measures to each observable of $\mathbf{O}$, that is, $p^E_{T_1} = p^E_{T_2}$ for each $E \in \mathbf{O}$. In that case, we write $T_1 \approx_{\mathbf{O}} T_2$.
For example, linear superposition
$$\Psi = a (\psi_1, 0) + b (0, \psi_2)$$
for any $\psi_i \in \mathbf{H}_i$, $i = 1, 2$, $|a|^2 + |b|^2 = 1$, and convex combination
$$T = |a|^2 \, \iota_1(|\psi_1\rangle \langle \psi_1|) + |b|^2 \, \iota_2(|\psi_2\rangle \langle \psi_2|)$$
are equivalent with respect to the set of superselection observables $\mathbf{O}$,
$$|\Psi\rangle \langle \Psi| \approx_{\mathbf{O}} T.$$
For a given set of observables $\mathbf{O}$, $\approx_{\mathbf{O}}$ is, indeed, an equivalence relation on the set of states $\mathbf{T}(\mathbf{H})^+_1$. Let us denote the set of equivalence classes of states in $\mathbf{T}(\mathbf{H})^+_1$ with respect to the set of observables $\mathbf{O}$ by $\mathbf{T}_{\mathbf{O}}(\mathbf{H})^+_1$. No two states of the same class can be distinguished by any registrations. No two different states from $\mathbf{T}(\mathbf{H})^+_1$ are equivalent with respect to all observables unless the set of effects $\mathbf{O}$ is restricted.
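The equivalence of the superposition and the convex combination can be seen in the smallest possible case. The sketch below assumes, only for illustration, one-dimensional sectors $\mathbf{H}_1$ and $\mathbf{H}_2$, so that $\mathbf{H}$ is two-dimensional and every effect of the form $A_1 \oplus A_2$ is a diagonal matrix; the coefficients and the effects are arbitrary choices and the names are hypothetical.

```python
import math

def trace_product(T, E):
    """tr[T E] for 2x2 matrices."""
    return sum(T[i][k] * E[k][i] for i in range(2) for k in range(2)).real

# Psi = a (psi_1, 0) + b (0, psi_2) with one-dimensional sectors (assumed example).
a, b = complex(math.sqrt(0.3)), complex(math.sqrt(0.7))

pure = [[a * a.conjugate(), a * b.conjugate()],
        [b * a.conjugate(), b * b.conjugate()]]      # |Psi><Psi|
mixed = [[abs(a) ** 2, 0], [0, abs(b) ** 2]]         # convex combination T

block_effect = [[0.2, 0], [0, 0.9]]                  # has the form A_1 (+) A_2
coupling_effect = [[0.5, 0.5], [0.5, 0.5]]           # connects the two sectors

same_pure = trace_product(pure, block_effect)
same_mixed = trace_product(mixed, block_effect)
diff_pure = trace_product(pure, coupling_effect)
diff_mixed = trace_product(mixed, coupling_effect)
print(same_pure, same_mixed, diff_pure, diff_mixed)
```

The block-diagonal effect gives the same probability in both states, whereas an effect connecting the sectors would distinguish them; under Rule 6 such an effect is not registrable on the mixed system, which is why the two states fall into one class.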
Proposition 4 Let $\mathcal{S}$ be a system with the sets $\mathbf{O}$ and $\mathbf{T}_{\mathbf{O}}(\mathbf{H})^+_1$ of observables and state classes, respectively. Then the following statements are equivalent:
A. $Z$ is a superselection observable.
B. Each state is equivalent to a unique convex combination of eigenstates of $Z$.
See Ref. [11], p. 18.
Rule 7 The states of system $\mathcal{S}$ mixing $\mathcal{S}_1$ and $\mathcal{S}_2$ are the equivalence classes with respect to the set of observables $\mathbf{O}$ of the form $A_1 \oplus A_2$. The unique convex combination $C_Z(T)$ of eigenstates of $Z$ to which each element $T$ of $\mathbf{T}(\mathbf{H})^+_1$ is equivalent describes the physical meaning of the class.
Clearly, state $C_Z(T)$ does not contain any unmeasurable correlation.
Attempts at a solution of the problem of classical properties and of quantum measurement [46,52,53,54,55,56,57,58] utilize properties of superselection rules. For example, in the theory of measurement, one would like to evolve extremal states to convex combinations and this can indeed be achieved by suitable superselection rules. However, the method might work only if a stronger assumption than Rule 7 were made: the words “convex combination” would have to be replaced by “statistical decomposition”.

2.3. Galilean group

In quantum mechanics, the Galilean relativity principle holds in the following form:
Rule 8 The same experiments performed in two different Newtonian inertial frames have the same results, i.e., give the same frequencies of observable values.
“The same experiment” means that the first experiment has the same empirical description with respect to the first inertial frame as the second experiment has with respect to the second frame. In the present section, we restrict ourselves to the proper Galilean group and work out some consequences of the principle. For example, the most important PV measures of quantum mechanics will be defined and the time evolution equation formulated. We keep the exposition brief; for more detail and references see e.g. Refs. [21,59].
The group of transformations that leave the geometric structure of Newtonian spacetime invariant (see, e.g., Ref. [60], p. 296) is called the Galilean group $G$. It is also the group of transformations between inertial frames. (We understand symmetry as a transformation that leaves some well-defined structure invariant. For example, the Galilean group contains all transformations that leave Newtonian spacetime geometry invariant and a symmetry of a quantum system leaves its Hamiltonian invariant.) We shall restrict ourselves just to the component of unity of $G$, called the proper Galilean group $G^+$, and shall ignore the improper transformations such as space inversions and time reversal. More about them can be found in Ref. [61].
Let $(x^1, x^2, x^3, t)$ be a Newtonian inertial frame so that $x^1, x^2, x^3$ are Cartesian coordinates. A general element $g(\lambda)$ of $G^+$ can be written in the form
$$\mathbf{x} \mapsto O \mathbf{x} + \mathbf{a} + \mathbf{v} t$$
and
$$t \mapsto t + \lambda^{10},$$
where $O$ is a proper orthogonal matrix (element of rotation group $SO(3)$) determined by three parameters $\lambda^1, \lambda^2, \lambda^3$, $\mathbf{a}$ is a vector of space shift with components $\lambda^4, \lambda^5, \lambda^6$, $\mathbf{v}$ is a boost velocity with components $\lambda^7, \lambda^8, \lambda^9$ and $\lambda^{10}$ is a time shift. The relations are written in matrix notation so that, e.g., $\mathbf{x}$ is a column matrix with components of vector $\mathbf{x}$.
The group product $g(\lambda_3) = g(\lambda_2) g(\lambda_1)$ is defined as the composition of the transformations $g(\lambda_2)$ and $g(\lambda_1)$ so that $g(\lambda_1)$ is performed first and $g(\lambda_2)$ second. Then,
$$O_3 = O_2 O_1,$$
$$\mathbf{a}_3 = \mathbf{a}_2 + O_2 \mathbf{a}_1 + \mathbf{v}_2 \lambda^{10}_1,$$
$$\mathbf{v}_3 = \mathbf{v}_2 + O_2 \mathbf{v}_1,$$
and
$$\lambda^{10}_3 = \lambda^{10}_2 + \lambda^{10}_1.$$
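The composition law can be checked by applying two transformations in sequence and comparing with the single composed transformation. In the sketch below a group element is represented, as an assumption of this example, by a tuple `(O, a, v, tau)` of rotation matrix, space shift, boost velocity and time shift; the particular rotations and numbers are arbitrary test values.

```python
import math

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def mat_vec(O, x):
    return [sum(O[i][k] * x[k] for k in range(3)) for i in range(3)]

def mat_mat(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def act(g, x, t):
    """Apply g = (O, a, v, tau): x -> O x + a + v t, t -> t + tau."""
    O, a, v, tau = g
    Ox = mat_vec(O, x)
    return [Ox[i] + a[i] + v[i] * t for i in range(3)], t + tau

def compose(g2, g1):
    """Group product g2 g1 (g1 performed first), using the formulas in the text."""
    O2, a2, v2, tau2 = g2
    O1, a1, v1, tau1 = g1
    O2a1, O2v1 = mat_vec(O2, a1), mat_vec(O2, v1)
    a3 = [a2[i] + O2a1[i] + v2[i] * tau1 for i in range(3)]
    v3 = [v2[i] + O2v1[i] for i in range(3)]
    return mat_mat(O2, O1), a3, v3, tau2 + tau1

g1 = (rot_z(0.3), [1.0, 2.0, 3.0], [0.5, 0.0, -0.2], 1.5)
g2 = (rot_x(-0.7), [0.0, -1.0, 4.0], [0.1, 0.3, 0.0], -0.4)
x, t = [0.2, -1.1, 2.5], 0.9

x_seq, t_seq = act(g2, *act(g1, x, t))       # g1 first, then g2
x_cmp, t_cmp = act(compose(g2, g1), x, t)    # single composed element
print(x_seq, t_seq)
print(x_cmp, t_cmp)
```

Both routes give the same point and time, which confirms that the term $\mathbf{v}_2 \lambda^{10}_1$ in the space shift is needed: the second boost acts on the time already shifted by the first transformation.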
Group $G^+$ is not simply connected because its subgroup $SO(3)$ is not. The universal covering of $SO(3)$ is group $SU(2)$ so that there are always two elements of $SU(2)$, differing by rotation by 2π, that are homomorphically mapped on one element of $SO(3)$. Let us denote by $\bar{G}^+$ the universal covering group of $G^+$.
Let us first give an intuitive motivation of why the Galilean group acts on state operators and POV measures. Given a quantum system $\mathcal{S}$ of type τ, classical apparatuses that are supposed to prepare and register $\mathcal{S}$ have well-defined Galilean transformations. Often, $\mathcal{S}$ is subject to macroscopic influences external to $\mathcal{S}$ such as classical external fields. These also possess non-trivial transformation laws with respect to Galilean transformations.
Let us consider a measurement that consists of a preparation and a registration of system $\mathcal{S}$ of type τ by apparatuses $\mathcal{A}_p$ and $\mathcal{A}_r$, respectively, and let there be some external fields $f$. Let $\mathcal{A}_p$ prepare object $\mathcal{S}$ in state $T$ and let $\mathcal{A}_r$ register effect $E(X)$. Next, we can transport both $\mathcal{A}_p$ and $\mathcal{A}_r$ by $g \in G$ to $g(\mathcal{A}_p)$ and $g(\mathcal{A}_r)$. Let $T_{g,f}$ be the state prepared by $g(\mathcal{A}_p)$ and $E_{g,f}(X)$ the effect registered by $g(\mathcal{A}_r)$. If the external influences are also transformed by $g$: $f \mapsto g(f)$, we obtain state $T_{g,g(f)}$ and effect $E_{g,g(f)}(X)$. The experiment has then been completely transferred by $g$ and, therefore, the Galilean relativity principle implies
Proposition 5 State $T_{g,g(f)}$ and effect $E_{g,g(f)}(X)$ satisfy
$$\mathrm{tr}[T E(X)] = \mathrm{tr}[T_{g,g(f)} E_{g,g(f)}(X)]$$
for all $g \in G$, $T \in \mathbf{T}(\mathbf{H}_\tau)^+_1$ and $E(X) \in \mathbf{L}_r(\mathbf{H}_\tau)^+_{\leq 1}$.
Let us next compare measurable properties of the two states $T$ and $T_{g,g(f)}$. Proposition 5 implies:
$$\mathrm{tr}[T_{g,g(f)} E(X)] = \mathrm{tr}[T E_{g^{-1},g^{-1}(f)}(X)].$$
Hence, values that can be registered on the state $T_{g,g(f)}$ are those on $T$ transformed by $g^{-1}$.

2.3.1. Closed systems

To keep the description of Galilean group action simple, we restrict ourselves to systems for which all external macroscopic influences can be assumed to be negligible and call such systems closed. Then, $f = 0$, and we shall discard the second index at transformed states and effects. The first basic assumption is
Rule 9 Let $\mathcal{S}$ be a closed system of type τ. Then there is a unique linear map $U(g) : \mathbf{H}_\tau \to \mathbf{H}_\tau$ for each element $g \in \bar{G}^+$ so that
$$T_{g^{-1}} = U(g) \, T \, U(g)^\dagger,$$
$$E_{g^{-1}}(X) = U(g) \, E(X) \, U(g)^\dagger$$
for all $T \in \mathbf{T}(\mathbf{H}_\tau)^+_1$, $E(X) \in \mathbf{L}_r(\mathbf{H}_\tau)^+_{\leq 1}$ and $g \in \bar{G}^+$.
For quantum mechanics, it is important that the universal covering group, rather than Galilean group itself, acts on Hilbert spaces.
Then, Rules 8 and 9 imply
Proposition 6 $g \mapsto U(g)$ is a unitary ray representation of $\bar{G}^+$ on $\mathbf{H}_\tau$.
For proof and references see, e.g., [59], pp. 285–292. The "ray" representation means that
$$U(g_2 g_1) = \exp[i\omega(g_1, g_2)]\,U(g_2)\,U(g_1),$$
where $\omega: \bar{G}^+ \times \bar{G}^+ \mapsto \mathbb{R}$ satisfies
$$\omega(g_1, g_2) + \omega(g_1 g_2, g_3) = \omega(g_1, g_2 g_3) + \omega(g_2, g_3).$$
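Equations (25) and (26) and the fact that rotation by $2\pi$ is always represented by $\mathsf{1}$ or $-\mathsf{1}$ imply that it is indeed the proper Galilean group itself, and not only its covering group $\bar{G}^+$, that acts on $\mathbf{T}(\mathbf{H}_\tau)^+_1$ and $\mathbf{L}_r(\mathbf{H}_\tau)^+_{\leq 1}$: the sign of $U(g)$ cancels in the products $U(g)\,T\,U(g)^\dagger$ and $U(g)\,E(X)\,U(g)^\dagger$.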
Clearly, the probability is preserved,
$$\mathrm{tr}\big[\big(U(g)\,T\,U(g)^\dagger\big)\big(U(g)\,E(X)\,U(g)^\dagger\big)\big] = \mathrm{tr}[T\,E(X)].$$
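The invariance of the probability under a common unitary transformation of state and effect can be checked numerically. The following is a minimal sketch on a hypothetical four-dimensional Hilbert space; the random state, effect and unitary are illustrative stand-ins, not objects taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # hypothetical Hilbert-space dimension


def random_complex(d, rng):
    return rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))


def random_state(d, rng):
    """Positive operator with trace one."""
    a = random_complex(d, rng)
    t = a @ a.conj().T
    return t / np.trace(t).real


def random_effect(d, rng):
    """Hermitian operator with spectrum inside [0, 1)."""
    a = random_complex(d, rng)
    e = a @ a.conj().T
    return e / (np.linalg.eigvalsh(e).max() + 1.0)


def random_unitary(d, rng):
    """Unitary factor of a QR decomposition."""
    q, _ = np.linalg.qr(random_complex(d, rng))
    return q


T, E, U = random_state(d, rng), random_effect(d, rng), random_unitary(d, rng)
p_before = np.trace(T @ E).real
p_after = np.trace((U @ T @ U.conj().T) @ (U @ E @ U.conj().T)).real
```

The two numbers agree up to rounding because $U^\dagger U = \mathsf{1}$ and the trace is cyclic.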
We assume that Borel sets $X$ have a well-defined transformation with respect to $g$, $\gamma(g): \mathbf{F} \mapsto \mathbf{F}$, and that the two effects $E(X)$ and $E_g(X)$ belong to the same observable. Then
$$U(g)\,E(X)\,U(g)^\dagger = E(\gamma(g)^{-1}X)$$
(see Ref. [62] for generalization of this relation).
Given a one-parameter group of unitary operators $U(\lambda)$, then according to Stone's theorem (see, e.g., [40]), there exists a s.a. operator $G$ satisfying
$$U(\lambda) = \exp(iG\lambda).$$
It is called the generator of group $U(\lambda)$ and can be calculated with the help of the formula
$$iG = U(\lambda)^\dagger\,\frac{dU(\lambda)}{d\lambda}.$$
If the parameter is not specified by any further convention, it is defined only up to a real multiplier and so is the generator.
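As an illustration of the generator formula, the sketch below builds $U(\lambda) = \exp(iG\lambda)$ from a hypothetical $2 \times 2$ self-adjoint matrix and recovers $G$ from $U(\lambda)^\dagger\, dU/d\lambda$ by a central finite difference. The matrix, the value of $\lambda$ and the step size are assumptions chosen only for the example.

```python
import numpy as np

# Hypothetical self-adjoint generator
G = np.array([[1.0, 0.5 - 0.2j],
              [0.5 + 0.2j, -0.3]])


def U(lam):
    """U(lam) = exp(i G lam), built from the spectral decomposition of G."""
    w, v = np.linalg.eigh(G)
    return (v * np.exp(1j * w * lam)) @ v.conj().T


lam, h = 0.7, 1e-6
dU = (U(lam + h) - U(lam - h)) / (2 * h)   # central difference for dU/dlam
G_rec = (U(lam).conj().T @ dU) / 1j        # from i G = U^dagger dU/dlam
```

The recovered matrix `G_rec` agrees with `G` up to the finite-difference error, and `U(0.3) @ U(0.4)` reproduces `U(0.7)`, which is the group property that Stone's theorem presupposes.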
A small technical problem for application of Stone's theorem to our case is that the set $U(g(\lambda))$ for any one-parameter subgroup $g(\lambda)$ of $\bar{G}^+$ does not necessarily form a group because map $U(g)$ is only a ray representation. This is usually treated by working with the central extension $\bar{G}^+_c$ of group $\bar{G}^+$ that has elements $(g, \phi)$, $g \in \bar{G}^+$ and $\phi \in (\mathbb{R} \bmod 2\pi)$, multiplication law
$$(g_1, \phi_1)(g_2, \phi_2) = (g_1 g_2,\ \phi_1 + \phi_2 + \omega(g_1, g_2))$$
and action on $\mathbf{H}_\tau$
$$\tilde{U}(g, \phi)\,|\psi\rangle = e^{i\phi}\,U(g)\,|\psi\rangle.$$
Then, $(g, \phi) \mapsto \tilde{U}(g, \phi)$ is a unitary representation of $\bar{G}^+_c$ on $\mathbf{H}_\tau$.
Group $\bar{G}^+_c$ has eleven parameters $\lambda_1, \ldots, \lambda_{10}, \phi$ and each parameter has its generator, which can be defined by the following conventions. Each one-parameter subgroup of space translations can be represented on $\mathbf{H}_\tau$ by unitary maps of the form
$$\exp\left(\frac{i}{\hbar}\,\vec{n}\cdot\vec{P}\,a\right),$$
where $\vec{n}$ is the unit vector in the direction, and $a$ the distance, of the translation in a chosen system of units. Then, the distance $a$ also serves as the parameter of the subgroup.
Each one-parameter subgroup of $SU(2)$ can be represented by
$$\exp\left(\frac{i}{\hbar}\,\vec{n}\cdot\vec{J}\,\theta\right),$$
where $\vec{n}$ is a unit vector along the rotation axis and the parameter is the angle $\theta$ of rotation in the counterclockwise direction around the axis, measured in radians, $\theta \in [0, 4\pi)$.
Each one-parameter boost subgroup can be represented by
$$\exp\left(\frac{i}{\hbar}\,\vec{n}\cdot\vec{K}\,v\right),$$
where $\vec{n}$ is the direction and $v$ the velocity of the boost in the chosen system of units; $v$ is the parameter of the group.
Each time translation can be represented by
$$\exp\left(-\frac{i}{\hbar}\,H\,t\right),$$
where $t$ is time in the chosen system of units and serves as the parameter of the group.
Finally, each phase transformation can be denoted by
$$\exp\left(i\,\mathsf{M}\,\phi\right),$$
where the parameter $\phi$ has the dimension $1/[\text{mass}]$.
Definition 16 The three s.a. operators $P_k$ are components of total momentum, $J_k$ are components of total angular momentum and $Q_k = (1/M)\,K_k$ are components of position (centre of mass) of $\mathcal{S}$. $M$ is the total mass of $\mathcal{S}$ and
$$\mathsf{M} = M\,\mathsf{1}$$
is the operator of total mass of $\mathcal{S}$. Operator $H$ is the Hamiltonian of $\mathcal{S}$.
The self-adjoint operators defined by Definition 16 have a common invariant domain (see, e.g., [63]), hence their sums and products are well-defined. They are the most important operators for any system in the sense that most quantum mechanical observables of the system are constructed from them. For example, an internal angular momentum or spin $\vec{s}$ can be defined in terms of these generators by
$$\vec{s} = \vec{J} - \vec{Q} \times \vec{P}.$$
However, not all unitary ray representations $\tilde{U}(g, \phi)$ of group $\bar{G}^+_c$ are physical. For example, the choice of phase shift $\omega(g_1, g_2) = 0$ does not contradict any assumption above but has wrong physical consequences. There is an additional rule restricting $\omega(g_1, g_2)$:
Rule 10 The non-zero commutators of the generators are
$$[J_k, J_l] = i\hbar \sum_j \epsilon_{kjl}\,J_j, \qquad [J_k, P_l] = i\hbar \sum_j \epsilon_{kjl}\,P_j,$$
$$[J_k, K_l] = i\hbar \sum_j \epsilon_{kjl}\,K_j, \qquad [K_k, H] = -i\hbar\,P_k, \qquad [K_k, P_l] = i\hbar\,\delta_{kl}\,M\,\mathsf{1}.$$
Thus, we can see why the mass is a generator in a Galilean invariant theory.
We can also observe that the majority of observables dealt with in standard textbooks of quantum mechanics are obtained from generators of Galilean group and must, therefore, satisfy commutation relations dictated by the Lie algebra of the group. But we can also observe that the situation in Newtonian mechanics is analogous. Galilean group acts on the phase space as a group of symplectic transformations and its generators are closely related to classical observables. This explains the fact that the commutation relations of quantum observables resemble Poisson brackets of classical observables.
Each Hilbert space $\mathbf{H}_\tau$ carries a definite representation of $\bar{G}^+_c$ depending only on τ (τ can contain more information, e.g., about the electric charge, etc.). Construction of models can start with the choice of a suitable representation. For example, the representation is irreducible for particles. Irreducible representations are classified by three numbers, $\mu$, $s$ and $E_0$, with the meaning of mass, spin and ground state energy, respectively (see Ref. [64], p. 221). Let us describe this representation.
Consider the Schwartz space (see, e.g., Ref. [42]) $\mathbf{S}^{2s+1}(\mathbb{R}^3)$ of rapidly decreasing $(2s+1)$-tuples of $C^\infty$ functions $\phi(\vec{x}, m)$, where $m \in \{-s, -s+1, \ldots, s\}$ represents a set of discrete parameters depending on the system spin. $\mathbf{S}^{2s+1}(\mathbb{R}^3)$ is a common invariant domain of all generators.
Let $\mathbf{H}_\tau$ be the completion of $\mathbf{S}^{2s+1}(\mathbb{R}^3)$ with respect to the inner product
$$\langle \phi | \psi \rangle = \sum_m \int_{\mathbb{R}^3} d^3x\ \phi^*(\vec{x}, m)\,\psi(\vec{x}, m).$$
The elements of $\mathbf{H}_\tau$ are called wave functions. To describe the operators, it is sufficient to define the action of their components on $C^\infty$ functions $\phi(\vec{x}, m)$ because there is always only one s.a. extension. The result is
$$Q_k\,\phi(\vec{x}, m) = x_k\,\phi(\vec{x}, m)$$
and
$$P_k\,\phi(\vec{x}, m) = -i\hbar\,\frac{\partial}{\partial x_k}\,\phi(\vec{x}, m).$$
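The three components of spin are $(2s+1) \times (2s+1)$ Hermitian matrices $s^k_{mn}$ that act on wave functions as follows
$$s^k\,\phi(\vec{x}, m) = \sum_{n=-s}^{s} s^k_{mn}\,\phi(\vec{x}, n).$$
For example, the matrices for $s = 0$ are $1 \times 1$ and all equal to $0$, and those for $s = 1/2$ are
$$s^1 = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad s^2 = \frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad s^3 = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
Then, the operator of angular momentum can be constructed from Equation (28).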
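The basic properties of the $s = 1/2$ matrices can be verified directly. The sketch below works in units with $\hbar = 1$, which is an assumption of the example only, and checks Hermiticity, the eigenvalues $\pm\hbar/2$ and the value $s(s+1)\hbar^2$ of $\vec{s}\cdot\vec{s}$.

```python
import numpy as np

hbar = 1.0  # assumption of this sketch: units with hbar = 1

s1 = hbar / 2 * np.array([[0, 1], [1, 0]], dtype=complex)
s2 = hbar / 2 * np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = hbar / 2 * np.array([[1, 0], [0, -1]], dtype=complex)
spin = [s1, s2, s3]

# Total spin squared, expected to be s(s+1) hbar^2 times the identity
s_squared = s1 @ s1 + s2 @ s2 + s3 @ s3

# Eigenvalues of each component, returned in ascending order
eigenvalues = [np.linalg.eigvalsh(s) for s in spin]
```

Each component has the two eigenvalues $-\hbar/2$ and $+\hbar/2$, matching the index set $m \in \{-s, \ldots, s\}$ for $s = 1/2$.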
Finally, the Hamiltonian is
$$H = \frac{\vec{P}\cdot\vec{P}}{2\mu} + E_0.$$
This representation of $\bar{G}^+_c$ is an example of a structure that is labelled by index τ. The representation of Hilbert space and of operators described above is called Q-representation. (The meaning of the term “representation” here is different from that of group representation. It is called Q-representation because operators $\vec{Q}$ are diagonal in it. The same unitary representation of $\bar{G}^+_c$ that we are describing in Q-representation can also be described in, say, P-representation, etc.) All operators (not necessarily group generators) on $\mathbf{H}_\tau$ in Q-representation can be written in the form of integral operators with kernels
$$A = A(\vec{x}, m; \vec{x}', m'),$$
the kernel $A(\vec{x}, m; \vec{x}', m')$ being a generalized function of its arguments acting on functions $\phi(\vec{x}, m) \in \mathbf{S}^{2s+1}(\mathbb{R}^3)$ as follows
$$(A\phi)(\vec{x}, m) = \sum_{m'} \int_{\mathbb{R}^3} d^3x'\ A(\vec{x}, m; \vec{x}', m')\,\phi(\vec{x}', m'),$$
where the right-hand side represents the action of the generalized function on a test function. For example,
$$(Q_k\phi)(\vec{x}, m) = \sum_{m'} \int_{\mathbb{R}^3} d^3x'\ x_k\,\delta(\vec{x} - \vec{x}')\,\delta_{mm'}\,\phi(\vec{x}', m')$$
and
$$(P_k\phi)(\vec{x}, m) = \sum_{m'} \int_{\mathbb{R}^3} d^3x'\ i\hbar\,\frac{\partial}{\partial x'_k}\,\delta(\vec{x} - \vec{x}')\,\delta_{mm'}\,\phi(\vec{x}', m').$$
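The action of $Q_k$ and $P_k$ can be imitated on a grid in one dimension. The sketch below is a discretization under stated assumptions (units with $\hbar = 1$, a central-difference derivative, a Gaussian test function, a hypothetical grid of 401 points); it is not the author's construction, only a numerical check that $[Q, P]\phi \approx i\hbar\,\phi$ on a rapidly decreasing function.

```python
import numpy as np

hbar = 1.0                               # assumption: units with hbar = 1
x = np.linspace(-10.0, 10.0, 401)        # hypothetical grid
dx = x[1] - x[0]
phi = np.exp(-x**2 / 2)                  # rapidly decreasing test function

# Position acts by multiplication, momentum by -i*hbar*d/dx (central difference)
Q = np.diag(x)
ones = np.ones(len(x) - 1)
D = (np.diag(ones, 1) - np.diag(ones, -1)) / (2 * dx)
P = -1j * hbar * D

commutator_phi = Q @ (P @ phi) - P @ (Q @ phi)
```

On the grid the commutator returns $i\hbar$ times the average of the two neighbouring values of $\phi$, so the deviation from $i\hbar\,\phi$ is of order $dx^2$.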
The transformation from abstract notation to a representation can be understood as expansion in an orthonormal basis. If $\{|n\rangle\}$ is a basis, then any vector $|\psi\rangle$ can be represented by the function $\psi(n) = \langle n|\psi\rangle$, and any operator $A$ can be represented by its matrix elements $A(n, m) = \langle n|A|m\rangle$. Only, for the Q-representation, we use a generalized basis [65].
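For a finite-dimensional space the expansion in an orthonormal basis is ordinary matrix algebra. The following sketch uses a random orthonormal basis of a hypothetical three-dimensional space and checks that $(A\psi)(n) = \sum_m A(n, m)\,\psi(m)$.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3  # hypothetical dimension


def random_complex(shape):
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)


# Columns of `basis` are the orthonormal vectors |n>
basis, _ = np.linalg.qr(random_complex((d, d)))
psi = random_complex(d)
A = random_complex((d, d))

psi_n = basis.conj().T @ psi        # psi(n) = <n|psi>
A_nm = basis.conj().T @ A @ basis   # A(n, m) = <n|A|m>
lhs = basis.conj().T @ (A @ psi)    # (A psi)(n)
rhs = A_nm @ psi_n                  # sum over m of A(n, m) psi(m)
```

In the Q-representation the sum over $m$ becomes the integral over $\vec{x}'$ together with the sum over $m'$ in the kernel formula.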
The irreducible representation described above is a basic building block of quantum mechanics. The models for all other systems can be constructed from it. Many examples of such construction are given in textbooks of quantum mechanics and we shall see some examples later.

2.3.2. Time translations

In non-relativistic quantum mechanics, time is, unlike position, just a parameter and time translation defines, unlike space shift, the dynamics of the system. This is the asymmetry between time and space in non-relativistic quantum mechanics. In this subsection, we drop the assumption that systems are closed. For general systems, we assume
Rule 11 Let $\mathcal{S}$ be a system of type τ and let external fields $f$ be given. Then, time translation from $t_1$ to $t_2$ is represented by unitary operator $U(f, t_2, t_1)$ on $\mathbf{H}_\tau$ satisfying
$$U(f, t_3, t_1) = U(f, t_3, t_2)\,U(f, t_2, t_1)$$
and we have
$$T_{g(t_2 - t_1)^{-1},\,f} = U(f, t_2, t_1)\,T\,U(f, t_2, t_1)^\dagger,$$
where $g(t_2 - t_1)$ is the group element for $\lambda_1 = \cdots = \lambda_9 = 0$, $\lambda_{10} = t_2 - t_1$ and $T \in \mathbf{T}(\mathbf{H}_\tau)^+_1$.
A given time translation element of $G$ does not define a unique map on $\mathbf{T}(\mathbf{H}_\tau)^+_1$ in this case. $U(f, t_2, t_1)$ depends not only on $t_2 - t_1$ but also on the position of the system with respect to the external fields (even if the fields are stationary). Time translations $U(f, t_2, t_1)$ do not form a group. We shall omit the argument $f$ in $U(f, t_2, t_1)$ in agreement with the current practice.
Definition 17 The operator $H(t)$ defined by
$$H(t) = i\hbar\,\frac{dU(t, t_0)}{dt}\,U(t, t_0)^\dagger,$$
is the Hamiltonian of $\mathcal{S}$.
Operators $H(t)$ for different $t$ do not commute in general.
An important observable in non-relativistic quantum mechanics is energy. The corresponding operator for system $\mathcal{S}$ will be constructed from its Hamiltonian in Section 3.2.3. On the other hand, any individually prepared quantum system has a definite Hamiltonian with the form
$$H = H(\text{operators}, \text{external fields}),$$
where function $H$ symbolizes the construction of the operator from the other s.a. operators of $\mathcal{S}$. The form of the Hamiltonian is a model assumption of quantum mechanics. The choice of a Hamiltonian is usually the most important step in model construction. This Hamiltonian is usually different from the observable called energy of the system.
For closed systems, the choice of Hamiltonian is included in the choice of the ray representation of group $\bar{G}^+_c$. In fact, for each kind τ, there is also one system of this kind that is closed and the corresponding representation can be viewed as a part of τ. Then, for a given system of kind τ that happens not to be closed, some generators of the representation are not physical. Mostly, this is just the Hamiltonian, but there are also cases when the physical momentum is different from that defined by τ. In general, the Hamiltonian of $\mathcal{S}$ defined by τ can be decomposed into the free Hamiltonian of the mass-centre motion and the internal Hamiltonian. The internal Hamiltonian is invariant with respect to the Galilean group and is an objective (structural) property of $\mathcal{S}$. Some aspects of it, such as the spectrum, are in principle (indirectly) observable even if it itself, as an operator, is not an observable.
Dynamics of quantum system $\mathcal{S}$ has to do with the time aspect of its preparation and registration. Any preparation procedure finishes at some time instant $t_p$ and any registration procedure starts at some time instant $t_r$. The times $t_r$ and $t_p$ serve to define the time aspect because the preparation and registration processes can themselves take some time. The dynamics enables one to calculate how the probabilities depend on the times $t_r$ and $t_p$. The dependence can be obtained in two ways. We can either make the state $T(t)$ a function of $t$ by shifting $T(0)$ forwards by $U(t, 0)$, or make the observable $E(t)$ a function of $t$ by shifting $E(0)$ backwards by $U(t, 0)$. The probability $p^T_E(X, t)$ corresponding to the time $t$ is then
$$p^T_E(X, t) = \mathrm{tr}[T(t)\,E] = \mathrm{tr}[T\,E(t)].$$
The first method is called Schrödinger picture, the second Heisenberg picture.
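The equivalence of the two pictures can be illustrated for a two-level system. The sketch below assumes a hypothetical time-independent Hamiltonian and units with $\hbar = 1$; the state is shifted forwards as $U\,T\,U^\dagger$ and the effect backwards as $U^\dagger E\,U$.

```python
import numpy as np

hbar = 1.0                                   # assumption: units with hbar = 1
H = np.array([[1.0, 0.3], [0.3, -1.0]])      # hypothetical Hamiltonian


def U(t):
    """U(t, 0) = exp(-i H t / hbar) for the time-independent H above."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * w * t / hbar)) @ v.conj().T


T0 = np.array([[1, 0], [0, 0]], dtype=complex)         # state at t = 0
E0 = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)   # effect at t = 0

t = 1.3
Ut = U(t)
T_t = Ut @ T0 @ Ut.conj().T     # Schroedinger picture
E_t = Ut.conj().T @ E0 @ Ut     # Heisenberg picture
p_schroedinger = np.trace(T_t @ E0).real
p_heisenberg = np.trace(T0 @ E_t).real
```

Both traces give the same probability, as follows from the cyclic property of the trace.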
Rule 11 implies:
Proposition 7 Let quantum system $\mathcal{S}$ have a Hamiltonian $H(t)$. Then, the dynamical evolution of $\mathcal{S}$ in Schrödinger picture obeys von Neumann–Liouville equation of motion
$$i\hbar\,\frac{dT(t)}{dt} = [H(t), T(t)]$$
and in Heisenberg picture Heisenberg equation of motion
$$i\hbar\,\frac{dE(X, t)}{dt} = [E(X, t), H(t)].$$
Proposition 8 Let $T$ be an extremal state $|\phi\rangle\langle\phi|$. Then
$$T(t) = |\phi(t)\rangle\langle\phi(t)|,$$
where $\phi(t)$ obeys Schrödinger equation
$$i\hbar\,\frac{d\phi(t)}{dt} = H(t)\,\phi(t)$$
and
$$\phi(t) = U(t, 0)\,\phi(0).$$
In particular, extremal states remain extremal states in unitary evolution.
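Propositions 7 and 8 can be checked numerically on a small example. The sketch below assumes a hypothetical time-independent two-level Hamiltonian and $\hbar = 1$, evolves an extremal state with $U(t, 0)$, and compares a finite-difference time derivative of $T(t)$ with the commutator $[H, T(t)]$.

```python
import numpy as np

hbar = 1.0                                   # assumption: units with hbar = 1
H = np.array([[0.5, 0.2], [0.2, -0.5]])      # hypothetical Hamiltonian


def U(t):
    """U(t, 0) = exp(-i H t / hbar)."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * w * t / hbar)) @ v.conj().T


phi0 = np.array([1.0, 1.0j]) / np.sqrt(2)    # normalized initial vector


def T(t):
    """Extremal state |phi(t)><phi(t)| with phi(t) = U(t, 0) phi(0)."""
    phi_t = U(t) @ phi0
    return np.outer(phi_t, phi_t.conj())


t, h = 0.8, 1e-6
lhs = 1j * hbar * (T(t + h) - T(t - h)) / (2 * h)   # i hbar dT/dt
rhs = H @ T(t) - T(t) @ H                            # [H, T(t)]
purity = np.trace(T(t) @ T(t)).real
```

The purity $\mathrm{tr}[T(t)^2]$ stays equal to one, which is the numerical counterpart of the statement that extremal states remain extremal.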
The unitary time evolution of a statistical decomposition must satisfy the following rule
Rule 12 Let $T$ be the state with statistical decomposition (11). Then, its time evolution is a state operator with statistical decomposition
$$T(t) = w\,T_1(t)\ (+)_p\ (1 - w)\,T_2(t),$$
where $T_k(t)$ is determined by (36) for each $k = 1, 2$ and $w$ is time independent.
Of course, Equation (39) is easily obtained from Equation (11) by multiplying both sides by $U(t, 0)$ from the left and by $U(t, 0)^\dagger$ from the right. The non-trivial assumption is that the statistical decomposition is conserved by unitary dynamics.
With the help of Hamiltonian, we can define the notion of symmetry of a closed system.
Definition 18 Each unitary transformation $U: \mathbf{H}_\tau \mapsto \mathbf{H}_\tau$ that leaves the Hamiltonian $H$ of $\mathcal{S}$ invariant,
$$U^\dagger H\,U = H,$$
is called a symmetry of system S .
All symmetries of a system S form a unitary group that is an objective (structural) property of S . The generators of its one-parameter subgroups are s.a. operators that commute with the Hamiltonian and, as sharp observables, yield probability distributions that are independent of time.

3. Composition of quantum systems

This section studies composition of two kinds of quantum systems: heterogeneous and identical. For the identical systems, it introduces a number of new ideas.
Suppose that $\mathcal{S}_1$ and $\mathcal{S}_2$ are two quantum systems. They can be particles or composites. Then, one can consider this pair as one quantum system $\mathcal{S} = \mathcal{S}_1 + \mathcal{S}_2$. This is called composition of systems. Clearly, composition is an effective tool of model building so that any system can be constructed from particles. Also, various kinds of interaction between quantum systems can be studied.
For the sake of simplicity, we assume that $\mathcal{S}_1$ and $\mathcal{S}_2$ are disjoint, i.e., they have no subsystem in common. Composition of systems that are not disjoint can clearly be reduced to composition of disjoint ones. More important is a stronger condition, which we call heterogeneity: systems $\mathcal{S}_1$ and $\mathcal{S}_2$ are heterogeneous if there is no pair of particles $\mathcal{S}'_1$ and $\mathcal{S}'_2$ of the same type such that $\mathcal{S}'_1 \subset \mathcal{S}_1$ and $\mathcal{S}'_2 \subset \mathcal{S}_2$. The rules of composition are different for systems that are or are not heterogeneous.

3.1. Composition of heterogeneous systems

This section gives a brief account of the composition of heterogeneous systems and the important non-local effect of entanglement.

3.1.1. Tensor product of Hilbert spaces

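First, we describe the mathematical apparatus. The tensor product $\mathbf{H} = \mathbf{H}_1 \otimes \mathbf{H}_2$ of $\mathbf{H}_1$ and $\mathbf{H}_2$ is the Cauchy completion of the linear span of the set of products
$$\phi \otimes \psi, \qquad \phi \in \mathbf{H}_1,\ \psi \in \mathbf{H}_2$$
with respect to the inner product of $\mathbf{H}$, which is determined by
$$\langle \phi \otimes \psi\,|\,\phi' \otimes \psi' \rangle = \langle \phi | \phi' \rangle\,\langle \psi | \psi' \rangle.$$
Tensor product operation “⊗” is postulated to be associative and distributive in both arguments. Hence, if $\{\phi_k\}$ and $\{\psi_k\}$ are bases of $\mathbf{H}_1$ and $\mathbf{H}_2$, then $\{\phi_k \otimes \psi_l\}$ is a basis of $\mathbf{H}$. If the bases are orthonormal then any $\Psi \in \mathbf{H}$ can be expressed as
$$\Psi = \sum_{kl} \langle \phi_k \otimes \psi_l\,|\,\Psi \rangle\ \phi_k \otimes \psi_l.$$
Thus, any vector $\Psi \in \mathbf{H}_1 \otimes \mathbf{H}_2$ can be represented by the function of two arguments,
$$\Psi(k_1, k_2) = \langle \phi_{k_1} \otimes \psi_{k_2}\,|\,\Psi \rangle.$$
For instance, two-particle state $\Psi$ in Q-representation is described by wave function $\Psi(\vec{x}_1, \vec{x}_2)$.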
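If $A \in \mathbf{L}(\mathbf{H}_1)$ and $B \in \mathbf{L}(\mathbf{H}_2)$, then their tensor product $A \otimes B$ on $\mathbf{H}_1 \otimes \mathbf{H}_2$ is determined via the relation
$$(A \otimes B)(\phi \otimes \psi) = A\phi \otimes B\psi$$
for all $\phi \in \mathbf{H}_1$ and $\psi \in \mathbf{H}_2$. It follows that
$$\mathrm{tr}[A \otimes B] = \mathrm{tr}[A]\,\mathrm{tr}[B].$$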
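In finite dimensions the tensor product of operators is the Kronecker product, so both relations can be tested with `numpy.kron`. The sketch below uses random matrices on hypothetical spaces of dimension 2 and 3.

```python
import numpy as np

rng = np.random.default_rng(2)


def random_complex(shape):
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)


A, B = random_complex((2, 2)), random_complex((3, 3))
phi, psi = random_complex(2), random_complex(3)

AB = np.kron(A, B)                      # matrix of A (x) B in the product basis
lhs = AB @ np.kron(phi, psi)            # (A (x) B)(phi (x) psi)
rhs = np.kron(A @ phi, B @ psi)         # A phi (x) B psi
trace_product = np.trace(A) * np.trace(B)
```

The ordering of the product basis used by `numpy.kron` is $(k_1, k_2) \mapsto 3k_1 + k_2$ here, which corresponds to the function $\Psi(k_1, k_2)$ of two arguments.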
In a basis $\{\phi_{k_1} \otimes \psi_{k_2}\}$, operator $A \otimes B$ is described by its kernel
$$K(k_1, k_2; k'_1, k'_2) = \langle \phi_{k_1} | A | \phi_{k'_1} \rangle\,\langle \psi_{k_2} | B | \psi_{k'_2} \rangle.$$
For instance, a kernel in the Q-representation is $K(\vec{x}_1, k, \vec{x}_2, l;\ \vec{x}'_1, k', \vec{x}'_2, l')$.
An important example is the tensor product of group representations. Let $(g, \phi) \mapsto \tilde{U}_1(g, \phi)$ be a unitary representation of $\bar{G}^+_c$ on $\mathbf{H}_1$ and $(g, \phi) \mapsto \tilde{U}_2(g, \phi)$ on $\mathbf{H}_2$. Then $(g, \phi) \mapsto \tilde{U}(g, \phi) = \tilde{U}_1(g, \phi) \otimes \tilde{U}_2(g, \phi)$ is a unitary representation of $\bar{G}^+_c$ on $\mathbf{H}_1 \otimes \mathbf{H}_2$.
Another example is the tensor product of states, $T_1 \otimes T_2$, of $T_1 \in \mathbf{T}(\mathbf{H}_1)^+_1$ and $T_2 \in \mathbf{T}(\mathbf{H}_2)^+_1$. It is a positive trace-class operator with trace 1, so that $T_1 \otimes T_2 \in \mathbf{T}(\mathbf{H}_1 \otimes \mathbf{H}_2)^+_1$. However, $\mathbf{T}(\mathbf{H}_1 \otimes \mathbf{H}_2)^+_1$ also contains convex combinations of tensor products of elements from $\mathbf{T}(\mathbf{H}_1)^+_1$ and $\mathbf{T}(\mathbf{H}_2)^+_1$, which cannot themselves generally be written as such tensor products.
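The partial trace over the Hilbert space $\mathbf{H}_2$, say, is the positive linear mapping
$$\Pi_2: \mathbf{T}(\mathbf{H}_1 \otimes \mathbf{H}_2)^+_1 \mapsto \mathbf{T}(\mathbf{H}_1)^+_1$$
defined via the relation
$$\mathrm{tr}[\Pi_2(W)\,A] = \mathrm{tr}[W\,(A \otimes \mathsf{1}_2)]$$
for all $A \in \mathbf{L}_r(\mathbf{H}_1)$ and $W \in \mathbf{T}(\mathbf{H}_1 \otimes \mathbf{H}_2)^+_1$, where $\mathsf{1}_2$ is the identity operator on $\mathbf{H}_2$. State operator $\Pi_2(W)$ is uniquely defined because of Theorem 9.
If $\{\phi_k\} \subset \mathbf{H}_1$ and $\{\psi_k\} \subset \mathbf{H}_2$ are orthonormal bases, then $\Pi_2(W)$ can be written as
$$\Pi_2(W) = \sum_{ij} \sum_k \langle \phi_i \otimes \psi_k\,|\,W(\phi_j \otimes \psi_k) \rangle\ |\phi_i\rangle\langle\phi_j|.$$
Here $|\phi_i\rangle\langle\phi_j|$ is the bounded linear operator on $\mathbf{H}_1$ given by
$$|\phi_i\rangle\langle\phi_j|\,(\phi) = \langle \phi_j | \phi \rangle\,\phi_i$$
for all $\phi \in \mathbf{H}_1$. The partial trace over $\mathbf{H}_1$ is defined similarly.
If $W = T_1 \otimes T_2$, then $T_1 = \Pi_2(W)$ and $T_2 = \Pi_1(W)$ but, in general,
$$W \neq \Pi_2(W) \otimes \Pi_1(W).$$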
In particular, if $W = P[\Psi]$, then
$$P[\Psi] = \Pi_2(P[\Psi]) \otimes \Pi_1(P[\Psi])$$
if and only if
$$\Psi = \phi \otimes \psi$$
for some $\phi \in \mathbf{H}_1$ and $\psi \in \mathbf{H}_2$. In that case also
$$\Pi_2(P[\Psi]) = P[\phi], \qquad \Pi_1(P[\Psi]) = P[\psi].$$
Thus, tensor products of extremal elements of the sets $\mathbf{T}(\mathbf{H}_1)^+_1$ and $\mathbf{T}(\mathbf{H}_2)^+_1$ do not exhaust the set of extremal elements of $\mathbf{T}(\mathbf{H}_1 \otimes \mathbf{H}_2)^+_1$.
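The partial trace formula can be implemented in a few lines for finite-dimensional spaces. The function `partial_trace` and the example vectors below are hypothetical names and values introduced for illustration; the sketch contrasts a product vector, for which the reduced states are again projectors, with a vector that is not a product.

```python
import numpy as np


def partial_trace(W, d1, d2, over):
    """Trace out factor `over` (1 or 2) of an operator on C^d1 (x) C^d2."""
    W4 = W.reshape(d1, d2, d1, d2)      # indices (i, k; j, l)
    if over == 2:
        return np.einsum('ikjk->ij', W4)
    if over == 1:
        return np.einsum('kikj->ij', W4)
    raise ValueError("over must be 1 or 2")


def projector(v):
    """P[v] = |v><v| for the normalized vector v."""
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())


phi, psi = np.array([1.0, 2.0j]), np.array([1.0, -1.0, 0.5])
W_product = projector(np.kron(phi, psi))

# A vector that is not of the form phi (x) psi
Psi = np.kron([1, 0], [1, 0, 0]) + np.kron([0, 1], [0, 1, 0])
W_entangled = projector(Psi)

red1 = partial_trace(W_entangled, 2, 3, over=2)   # Pi_2(P[Psi])
red2 = partial_trace(W_entangled, 2, 3, over=1)   # Pi_1(P[Psi])
```

For the product vector the two partial traces return `projector(phi)` and `projector(psi)`, whereas `np.kron(red1, red2)` differs from `W_entangled`.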
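Tensor product is also an operation for POV measures. Let $E_1: \mathbf{F}_1 \mapsto \mathbf{L}_r(\mathbf{H}_1)$ with dimension $n_1$ and $E_2: \mathbf{F}_2 \mapsto \mathbf{L}_r(\mathbf{H}_2)$ with dimension $n_2$ be two POV measures on Hilbert spaces $\mathbf{H}_1$ and $\mathbf{H}_2$ with value sets $\Omega_1$ and $\Omega_2$. Then POV measure $(E_1 \otimes E_2): (\mathbf{F}_1 \times \mathbf{F}_2) \mapsto \mathbf{L}_r(\mathbf{H}_1 \otimes \mathbf{H}_2)$ on the tensor product $\mathbf{H}_1 \otimes \mathbf{H}_2$ has dimension $n_1 + n_2$, value set $\Omega_1 \times \Omega_2$ and is defined by
$$(E_1 \otimes E_2)(X_1 \times X_2) = E_1(X_1) \otimes E_2(X_2)$$
for all $X_1 \subset \mathbb{R}^{n_1}$ and $X_2 \subset \mathbb{R}^{n_2}$. Tensor product of POV measures is associative but not commutative.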
Tensor products of more Hilbert spaces and the corresponding notions and relations can be obtained using the above axioms and relations.
Let us now turn to physical interpretation. Then, the Hilbert spaces are associated with heterogeneous quantum systems and the indices distinguishing the Hilbert spaces carry the information on the system types.
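Rule 13 Let $\mathcal{S}_1$ and $\mathcal{S}_2$ be two heterogeneous quantum systems and their Hilbert spaces be $\mathbf{H}_1$ and $\mathbf{H}_2$, respectively. Then, the system $\mathcal{S}$ composed of $\mathcal{S}_1$ and $\mathcal{S}_2$ has the Hilbert space $\mathbf{H} = \mathbf{H}_1 \otimes \mathbf{H}_2$, its states are elements of $\mathbf{T}(\mathbf{H}_1 \otimes \mathbf{H}_2)^+_1$ and its effects are elements of $\mathbf{L}_r(\mathbf{H}_1 \otimes \mathbf{H}_2)^+_{\leq 1}$.
An important assumption concerns the states and observables of subsystems:
Rule 14 Let system $\mathcal{S}$ composed of heterogeneous systems $\mathcal{S}_1$ and $\mathcal{S}_2$ be prepared in state $T \in \mathbf{T}(\mathbf{H}_1 \otimes \mathbf{H}_2)^+_1$. Then $\mathcal{S}_1$ is simultaneously prepared in state $\Pi_2(T)$ and $\mathcal{S}_2$ in state $\Pi_1(T)$. The observables of $\mathcal{S}_1$ and $\mathcal{S}_2$ can be identified with observables $E_1 \otimes \mathsf{1}_2$ and $\mathsf{1}_1 \otimes E_2$, respectively, of the composite. $\mathcal{S}_1$ and $\mathcal{S}_2$ are called subsystems of $\mathcal{S}$.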
Hence, in the case that a system is composed of heterogeneous systems, these systems retain their individuality in the sense that they have well-defined states and observables of their own.
The theory of composition would not be complete if we did not know how group G ¯ c + acts on Hilbert space H 1 H 2 of the composite system. In this way, interaction between systems can be defined.
Definition 19 Let heterogeneous systems $\mathcal{S}_1$ and $\mathcal{S}_2$ have Hilbert spaces $\mathbf{H}_1$ and $\mathbf{H}_2$. Let the representative of $(g, \phi) \in \bar{G}^+_c$ on $\mathbf{H}_k$ be denoted by $\tilde{U}_k(g, \phi)$ and that on $\mathbf{H}_1 \otimes \mathbf{H}_2$ by $\tilde{U}(g, \phi)$. If
$$\tilde{U}(g, \phi) = \tilde{U}_1(g, \phi) \otimes \tilde{U}_2(g, \phi),$$
we say that $\mathcal{S}_1$ and $\mathcal{S}_2$ do not interact.
The consequence for the generators is that they are additive:
Proposition 9 Let a one-parameter subgroup of $\bar{G}^+_c$ be given, with generator $G$ of its representation on $\mathbf{H}_1 \otimes \mathbf{H}_2$, $G_1$ of its representation on $\mathbf{H}_1$ and $G_2$ of its representation on $\mathbf{H}_2$. Then, in the case of non-interacting subsystems,
$$G = G_1 \otimes \mathsf{1}_2 + \mathsf{1}_1 \otimes G_2.$$
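The additivity of generators can be verified in finite dimensions: because $G_1 \otimes \mathsf{1}_2$ and $\mathsf{1}_1 \otimes G_2$ commute, the exponential of their sum factorizes into a tensor product. The matrices below are hypothetical generators chosen for the sketch, and `expi` is a helper name introduced here.

```python
import numpy as np


def expi(G, lam):
    """exp(i G lam) for a Hermitian matrix G, via its spectral decomposition."""
    w, v = np.linalg.eigh(G)
    return (v * np.exp(1j * w * lam)) @ v.conj().T


G1 = np.array([[0.2, 1.0 - 1.0j], [1.0 + 1.0j, -0.5]])   # hypothetical, on H_1
G2 = np.diag([0.0, 1.0, 2.5])                             # hypothetical, on H_2
lam = 0.9

G = np.kron(G1, np.eye(3)) + np.kron(np.eye(2), G2)       # additive generator
U_total = expi(G, lam)
U_product = np.kron(expi(G1, lam), expi(G2, lam))
```

An interaction term added to `G` that is not of this additive form would in general spoil the equality of `U_total` and `U_product`.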
As an example, consider two particles $\mathcal{S}_1$ and $\mathcal{S}_2$ of different types with Hilbert spaces $\mathbf{H}_1$ and $\mathbf{H}_2$ and let us compose them so that they do not interact. We can then construct the representation of the group $\bar{G}^+_c$ on $\mathbf{H}_1 \otimes \mathbf{H}_2$ from its representations on $\mathbf{H}_1$ and $\mathbf{H}_2$ as follows.
Let the representations on $\mathbf{H}_i$ have the parameters $\mu_i$, $s_i$, $E_{0i}$ and the corresponding group generators be $\vec{p}_i$, $\vec{j}_i$, $\vec{k}_i = \mu_i \vec{x}_i$ and $h_i = E_{0i} + \vec{p}_i \cdot \vec{p}_i / 2\mu_i$ (see Section 2.3.1). Let the generators of the group on $\mathbf{H}_1 \otimes \mathbf{H}_2$ that are to be determined be denoted by $\vec{P}$, $\vec{J}$, $\vec{K} = M\vec{Q}$ and $H$. We use the Q-representation so that elements of $\mathbf{H}_1 \otimes \mathbf{H}_2$ are constructed from rapidly decreasing $C^\infty$ functions $\Psi(\vec{x}_1, \vec{x}_2)$, which also form the common invariant domain of all group generators. Then, Proposition 9 implies
$$\vec{P} = \vec{p}_1 + \vec{p}_2, \qquad \vec{Q} = \frac{\mu_1 \vec{x}_1 + \mu_2 \vec{x}_2}{\mu_1 + \mu_2},$$
$$M = \mu_1 + \mu_2, \qquad H = E_{01} + E_{02} + \frac{\vec{p}_1 \cdot \vec{p}_1}{2\mu_1} + \frac{\vec{p}_2 \cdot \vec{p}_2}{2\mu_2},$$
where $\vec{P}$, $\vec{p}_1$ and $\vec{p}_2$ are the differential operators of the form (32).
It is advantageous to change variables $\vec{x}_1$, $\vec{x}_2$ to $\vec{Q}$, $\vec{q}$ so that wave functions have the form $\Psi(\vec{Q}, \vec{q})$ and
$$[Q_k, q_l] = [p_k, P_l] = 0, \qquad [Q_k, P_l] = [q_k, p_l] = i\hbar\,\delta_{kl}.$$
The transformation is uniquely determined by conditions:
$$\vec{x}_1 = \vec{Q} - \frac{\mu_2}{\mu_1 + \mu_2}\,\vec{q}, \quad \vec{x}_2 = \vec{Q} + \frac{\mu_1}{\mu_1 + \mu_2}\,\vec{q}, \quad \vec{p}_1 = \frac{\mu_1}{\mu_1 + \mu_2}\,\vec{P} - \vec{p}, \quad \vec{p}_2 = \frac{\mu_2}{\mu_1 + \mu_2}\,\vec{P} + \vec{p}.$$
The transformed Hamiltonian and the angular momentum are
$$H = E_{01} + E_{02} + \frac{\vec{P} \cdot \vec{P}}{2M} + \frac{\vec{p} \cdot \vec{p}}{2\mu}, \qquad \vec{J} = \vec{Q} \times \vec{P} + \vec{q} \times \vec{p} + \vec{s}_1 + \vec{s}_2,$$
where $\vec{s}_1$ and $\vec{s}_2$ are spin operators of $\mathcal{S}_1$ and $\mathcal{S}_2$ given by Equation (34) and
$$\mu = \frac{\mu_1 \mu_2}{\mu_1 + \mu_2}$$
is the so-called reduced mass.
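The change of variables can be checked with ordinary numbers. The sketch below replaces the operators by random classical vectors, which tests only the algebra of the transformation and not operator ordering; the masses are hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(3)
mu1, mu2 = 1.0, 3.0                       # hypothetical masses
M = mu1 + mu2
mu = mu1 * mu2 / M                        # reduced mass

# Random classical stand-ins for Q, q, P, p (three components each)
Q, q, P, p = rng.normal(size=(4, 3))

x1 = Q - mu2 / M * q
x2 = Q + mu1 / M * q
p1 = mu1 / M * P - p
p2 = mu2 / M * P + p

kinetic_particles = p1 @ p1 / (2 * mu1) + p2 @ p2 / (2 * mu2)
kinetic_cm = P @ P / (2 * M) + p @ p / (2 * mu)
orbital_particles = np.cross(x1, p1) + np.cross(x2, p2)
orbital_cm = np.cross(Q, P) + np.cross(q, p)
```

The kinetic terms and the orbital angular momenta agree in both sets of variables, and the relations $\vec{P} = \vec{p}_1 + \vec{p}_2$ and $M\vec{Q} = \mu_1\vec{x}_1 + \mu_2\vec{x}_2$ are recovered.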
To construct models of interacting systems, we modify the above generators in any way subject only to the condition that the commutation relations (29) and (30) are preserved. For example, an arbitrary term can be added to the Hamiltonian if the term commutes with all other generators, i.e., if it is invariant under the Galilean group. In particular, $\vec{q} \cdot \vec{q}$ and any function of it are invariant. Such a function, $V(\vec{q} \cdot \vec{q})$, is called a potential.
The final assumption of this section concerns decomposable states.
Rule 15 Suppose that the state of the system composed of heterogeneous systems $\mathcal{S}_1 + \mathcal{S}_2$ is $T$. The necessary and sufficient condition for the statistical decomposition of the state of $\mathcal{S}_1$ to be
$$\Pi_2(T) = \Big(\sum_k\Big)_p\, w_k\,T_{1k}$$
is that $T$ itself has statistical decomposition
$$T = \Big(\sum_k\Big)_p\, w_k\,T_{1k} \otimes T_{2k},$$
where $T_{1k}$ are some states of $\mathcal{S}_1$ and $T_{2k}$ are some states of $\mathcal{S}_2$.
One can say that statistical decomposition is invariant with respect to compositions. By registration of observables of $\mathcal{S}_1$ alone, only the state operator $\Pi_2(T)$ can be determined, not its statistical decomposition. However, by registration of observables pertaining to a composite object $\mathcal{S}_1 + \mathcal{S}_2$, some information about the statistical decomposition of $\Pi_2(T)$ can be obtained from Rule 15. Suppose, e.g., that $\mathcal{S}_1 + \mathcal{S}_2$ is in an extremal state $T = P[\Psi]$. Then state operator $\Pi_2(P[\Psi])$ cannot have a non-trivial statistical decomposition. This fact is at the root of the objectification problem in quantum theory of measurement (cf. [11], our Section 5.1 and Section 5.2).
Based on the mathematical properties of the tensor products, the description of systems composed of arbitrary number of sub-systems can be obtained by extension of the above methods. The only condition is that each two subsystems are heterogeneous.

3.1.2. Entanglement

Entanglement is a kind of mutual influence between quantum systems that is very different from any effect known from classical theories. It is not an interaction according to Definition 19.
Definition 20 If object $\mathcal{S}$ composed of two heterogeneous quantum objects $\mathcal{S}_1$ and $\mathcal{S}_2$ is in an indecomposable state $W$ that satisfies condition (40), one says that $\mathcal{S}_1$ and $\mathcal{S}_2$ are entangled or that state $W$ is entangled.
Consider a decomposable state
$$W = p\,T_1 \otimes T_2\ (+)_p\ p'\,T'_1 \otimes T'_2,$$
where $T_i$ and $T'_i$, $i = 1, 2$, are states of $\mathcal{S}_i$. Clearly, neither $T_1 \otimes T_2$ nor $T'_1 \otimes T'_2$ is entangled, and $W$ cannot, therefore, be considered as entangled, either. But $W$ does satisfy condition (40). This is the reason for the condition that $W$ be indecomposable in Definition 20. (There are different definitions of entanglement, e.g., [66]. Our definition seems to be intuitively clearer and mathematically much simpler. The problem is that pure mathematics cannot distinguish decomposable state operators from indecomposable ones.)
Entanglement is a physical phenomenon that has measurable consequences. One of these consequences is correlations between outcomes of registrations that are performed simultaneously on the two (or more) entangled systems.
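As an example, consider two heterogeneous objects $\mathcal{S}_1$ and $\mathcal{S}_2$ and two sharp observables $A_i: \mathbf{H}_i \mapsto \mathbf{H}_i$, $i = 1, 2$. Let $|a_i\rangle \in \mathbf{H}_i$ and $|b_i\rangle \in \mathbf{H}_i$ be four eigenstates,
$$A_i\,|a_i\rangle = a_i\,|a_i\rangle, \qquad A_i\,|b_i\rangle = b_i\,|b_i\rangle,$$
and let $b_i > a_i$, $i = 1, 2$. The state $P[\Psi]$ of the composite object $\mathcal{S}_1 + \mathcal{S}_2$, where
$$|\Psi\rangle = \frac{1}{\sqrt{2}}\,\big(|a_1\rangle \otimes |b_2\rangle + |b_1\rangle \otimes |a_2\rangle\big)$$
is entangled. Indeed,
$$\Pi_2(P[\Psi]) = \frac{1}{2}\,\big(|a_1\rangle\langle a_1| + |b_1\rangle\langle b_1|\big), \qquad \Pi_1(P[\Psi]) = \frac{1}{2}\,\big(|a_2\rangle\langle a_2| + |b_2\rangle\langle b_2|\big)$$
and $P[\Psi]$ is not a tensor product of these two states.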
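Let us calculate the correlations of two sharp observables, $A_1 \otimes \mathsf{1}$ and $\mathsf{1} \otimes A_2$ on $\mathbf{H}_1 \otimes \mathbf{H}_2$. The observables commute, hence Definition 12 is applicable. We need the first two moments of observables $A_1 \otimes \mathsf{1}$ and $\mathsf{1} \otimes A_2$ in state $P[\Psi]$:
$$\langle \Psi | A_1 \otimes \mathsf{1} | \Psi \rangle = \frac{1}{2}\,(a_1 + b_1), \qquad \langle \Psi | \mathsf{1} \otimes A_2 | \Psi \rangle = \frac{1}{2}\,(a_2 + b_2)$$
and
$$\langle \Psi | (A_1 \otimes \mathsf{1})^2 | \Psi \rangle = \frac{1}{2}\,(a_1^2 + b_1^2), \qquad \langle \Psi | (\mathsf{1} \otimes A_2)^2 | \Psi \rangle = \frac{1}{2}\,(a_2^2 + b_2^2).$$
The variances (Definition 11) are
$$\Delta(A_1 \otimes \mathsf{1}) = \left(\frac{b_1 - a_1}{2}\right)^2, \qquad \Delta(\mathsf{1} \otimes A_2) = \left(\frac{b_2 - a_2}{2}\right)^2.$$
The average of the product $(A_1 \otimes \mathsf{1})(\mathsf{1} \otimes A_2) = A_1 \otimes A_2$ is
$$\langle \Psi | (A_1 \otimes A_2) | \Psi \rangle = \frac{a_1 b_2 + b_1 a_2}{2}$$
so that, finally
$$C(A_1 \otimes \mathsf{1},\ \mathsf{1} \otimes A_2,\ P[\Psi]) = -1.$$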
The result is that the observables are strongly anticorrelated in state $P[\Psi]$. What this means can be seen from the probability distributions for different possible outcomes by measuring the two observables. The corresponding PV measures define the four projections
$$P_{aa} = E_1(\{a_1\}) \otimes E_2(\{a_2\}), \qquad P_{ab} = E_1(\{a_1\}) \otimes E_2(\{b_2\}),$$
$$P_{ba} = E_1(\{b_1\}) \otimes E_2(\{a_2\}), \qquad P_{bb} = E_1(\{b_1\}) \otimes E_2(\{b_2\}),$$
and we obtain
$$\langle \Psi | P_{aa} | \Psi \rangle = 0, \qquad \langle \Psi | P_{ab} | \Psi \rangle = \frac{1}{2},$$
$$\langle \Psi | P_{ba} | \Psi \rangle = \frac{1}{2}, \qquad \langle \Psi | P_{bb} | \Psi \rangle = 0.$$
It follows: if the registration of $A_1 \otimes \mathsf{1}$ gives $a_1$ ($b_1$) then the registration of $\mathsf{1} \otimes A_2$ gives $b_2$ ($a_2$) with certainty, and vice versa, the correlation being symmetric with respect to $\mathcal{S}_1$ and $\mathcal{S}_2$.
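The moments, the correlation and the four probabilities can be reproduced numerically. The sketch below assumes hypothetical eigenvalues with $b_i > a_i$ and takes the correlation to be the covariance divided by the square root of the product of the variances, which is an assumption about the normalization in Definition 12 that is consistent with the value $-1$ above.

```python
import numpy as np

a1, b1, a2, b2 = 0.0, 1.0, -1.0, 2.0      # hypothetical eigenvalues, b_i > a_i
ket_a, ket_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
A1, A2, I2 = np.diag([a1, b1]), np.diag([a2, b2]), np.eye(2)

# |Psi> = (|a1>|b2> + |b1>|a2>) / sqrt(2)
Psi = (np.kron(ket_a, ket_b) + np.kron(ket_b, ket_a)) / np.sqrt(2)


def mean(op):
    """Expectation value <Psi|op|Psi> (real for the real matrices used here)."""
    return Psi @ op @ Psi


X, Y = np.kron(A1, I2), np.kron(I2, A2)
var_X = mean(X @ X) - mean(X) ** 2
var_Y = mean(Y @ Y) - mean(Y) ** 2
corr = (mean(X @ Y) - mean(X) * mean(Y)) / np.sqrt(var_X * var_Y)

proj = {"a": np.outer(ket_a, ket_a), "b": np.outer(ket_b, ket_b)}
probs = {k1 + k2: mean(np.kron(proj[k1], proj[k2]))
         for k1 in "ab" for k2 in "ab"}
```

Only the outcome pairs `ab` and `ba` have non-zero probability, which is the content of the statement that one outcome determines the other with certainty.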
Existence of such correlations has surprising consequences. For example, it allows one to register observable $A_1$ by an apparatus that interacts only with $\mathcal{S}_2$: if the apparatus registers observable $A_2$ and gives value $b_2$ for it, then it simultaneously gives value $a_1$ for $A_1$. This is an example of indirect registration.
State $\Pi_2(P[\Psi])$ in Equation (45) is an example of a non-extremal state that is indecomposable. This follows from Rule 15 and the fact that $P[\Psi]$ is extremal.
Another aspect of entanglement is the amount of information that the entangled state and the states of the subsystems carry. This can be studied with the help of von Neumann entropy (20). Von Neumann entropy of composite objects satisfies a number of interesting inequalities, see, e.g., Ref. [67]. One is the so-called sub-additivity [68], which is described by the following:
Proposition 10 Let $\mathcal{S}$ be composed of heterogeneous systems $\mathcal{S}_1$ and $\mathcal{S}_2$ of different types. Let $T$ be the state of $\mathcal{S}$ and let
$$T_1 = \Pi_2(T), \qquad T_2 = \Pi_1(T).$$
Then
$$S(T) \leq S(T_1) + S(T_2)$$
and the equality sign is valid only if
$$T = T_1 \otimes T_2.$$
In the example above,
$$S(P[\Psi]) = 0, \qquad S(\Pi_2(P[\Psi])) = S(\Pi_1(P[\Psi])) = \ln 2,$$
confirming the inequality and showing that an entangled state of two systems can contain more information than the sum of the information in both subsystems. This possibility is utilized by quantum computers.
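The entropies of the example can be computed from the eigenvalues of the state operators. The sketch below assumes the von Neumann entropy in the form $S(T) = -\mathrm{tr}[T \ln T]$ with the natural logarithm, which matches the value $\ln 2$ quoted above; the helper names are hypothetical.

```python
import numpy as np


def von_neumann_entropy(T):
    """S(T) = -tr[T ln T], computed from the non-zero eigenvalues of T."""
    w = np.linalg.eigvalsh(T)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))


def reduced_states(W):
    """Return (Pi_2(W), Pi_1(W)) for an operator W on C^2 (x) C^2."""
    W4 = W.reshape(2, 2, 2, 2)
    return np.einsum('ikjk->ij', W4), np.einsum('kikj->ij', W4)


ket_a, ket_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
Psi = (np.kron(ket_a, ket_b) + np.kron(ket_b, ket_a)) / np.sqrt(2)
W = np.outer(Psi, Psi)
T1, T2 = reduced_states(W)

S_total = von_neumann_entropy(W)
S_1, S_2 = von_neumann_entropy(T1), von_neumann_entropy(T2)
```

Here `S_total` vanishes while `S_1` and `S_2` each equal $\ln 2$, so the inequality is strict, in agreement with $P[\Psi] \neq T_1 \otimes T_2$.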
Suppose next that $\mathcal{S}_1$ and $\mathcal{S}_2$ are far from each other, $\mathcal{S}_1$ near point $\vec{x}_1$ and $\mathcal{S}_2$ near point $\vec{x}_2$. State $\Psi$ of the composite is independent of the distance between $\vec{x}_1$ and $\vec{x}_2$. Could one use the strong anticorrelation to send signals from $\vec{x}_1$ to $\vec{x}_2$, say? There are two reasons why one cannot. First, one has no choice of the value, $a_1$ or $b_1$, that is obtained at $\vec{x}_1$. As is easily seen from the state $\Pi_2(P[\Psi])$ of $\mathcal{S}_1$, Equation (45), the probability of each outcome is 1/2. One has, therefore, no control over what signal will be sent. Second, suppose that the state $P[\Psi]$ is prepared many times and let the observer at $\vec{x}_1$ register $A_1$ every time in the first ensemble of experiments and do nothing in the second ensemble. Is there any difference between the two cases that could be recognized at $\vec{x}_2$? In the first case, the state of $\mathcal{S}_2$ given by Equation (45) is the statistical decomposition of the state, whereas in the second case, the state operator is the same but the state is indecomposable because the composite system is in an extremal state. However, by direct registrations of any observable pertinent to $\mathcal{S}_2$, the observer at $\vec{x}_2$ cannot distinguish different statistical decompositions of the same state operator from each other.
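The second argument can be illustrated numerically. The sketch below models the registration of $A_1$ in the first ensemble by the non-selective projection map $W \mapsto \sum_k (P_k \otimes \mathsf{1})\,W\,(P_k \otimes \mathsf{1})$; this map is an assumption made for the example and is not taken from the text. It compares the state operator of $\mathcal{S}_2$ in the two ensembles.

```python
import numpy as np

ket_a, ket_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
Psi = (np.kron(ket_a, ket_b) + np.kron(ket_b, ket_a)) / np.sqrt(2)
W = np.outer(Psi, Psi)                    # second ensemble: nothing is done


def state_of_S2(W):
    """Pi_1(W): partial trace over the first factor of C^2 (x) C^2."""
    return np.einsum('kikj->ij', W.reshape(2, 2, 2, 2))


# First ensemble: A_1 is registered, outcomes are not communicated
Pa = np.kron(np.outer(ket_a, ket_a), np.eye(2))
Pb = np.kron(np.outer(ket_b, ket_b), np.eye(2))
W_registered = Pa @ W @ Pa + Pb @ W @ Pb

T2_untouched = state_of_S2(W)
T2_registered = state_of_S2(W_registered)
```

The composite state operators `W` and `W_registered` differ, but the state operator of $\mathcal{S}_2$ is the same in both ensembles, so no registration on $\mathcal{S}_2$ alone can reveal what was done at $\vec{x}_1$.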
The next interesting question is, how the influence of a registered value at x 1 on that at x 2 is to be understood. Even in classical mechanics, one can arrange strong correlations. For example, if a body with zero angular momentum with respect to its centre of mass decays into two bodies flying away from each other, the angular momentum of the first is exactly the opposite to that of the second. This strong anticorrelation cannot be used to send signals either. It is moreover clear that a measurement of the first angular momentum giving the value L 1 is not a cause of the second angular momentum having the value - L 1 . Rather, the decay is the common cause of the two values being opposite. (The condition of common cause can be formulated rigorously [35], see also the discussion in Ref. [32], pp. 83–94.) The process of creating the two opposite values at distant points is also completely local: the decay is a local phenomenon and the movement of each of the debris is governed by a local equation of motion. Moreover, the values L 1 and - L 1 are objective, that is, they exist on the debris independently of any measurement and can be, in this way, transported from the decay point to the measurements in a local way.
In quantum mechanics, such an explanation of the correlations is not possible. A value of an observable is created only during its registration. It does not exist in any form before the registration, except in the special case of state which is an eigenstate. However, in our example, Ψ is not an eigenstate either of A 1 1 or of 1 A 2 . The registrations performed simultaneously at x 1 and at x 2 , which can be very far from each other, are connected by a relation that is utterly non-local. How can the apparatuses together with parts of the quantum system at two distant points x 1 and at x 2 “know” what values they are to create so that the correlations result? Note that the rejection of objectivity of observables leads more directly to non-locality than assumption of any sort of realism.
The nature of the simultaneity with which the correlations take place can be described in more detail. In Ref. [69], the entangled state is created by a decay of one particle that defines a preferred frame, namely its rest frame, which, in turn, defines the simultaneity. Led by this example, we adopt the following general assumption about entanglement. First, the entangled state on which the correlations between remote registrations can be observed must be prepared, and the apparatus that prepares it (or some of its parts) defines a unique preferred frame, similarly to the case considered in Ref. [69]. Second, the apparatus location and the time interval of the preparation process define a spacetime region $D$ that must lie inside the common past of the events at which the simultaneous correlations are registered. Region $D$ can be arbitrarily large (it does not have to be just a small neighborhood of the particle-decay event as in Ref. [69]). This assumption is weaker than Reichenbach’s condition of common cause because it holds, e.g., for the EPR effect. The point is that it does not require the causal independence of the registrations that are correlated (allowing for non-locality).
The non-local correlation between the registration outcomes can be ascertained only if both values are known and can be compared. It may therefore be more precise to say that the non-local correlations can only be seen if the non-local observable $\mathsf{A}_1\otimes\mathsf{A}_2$ is registered. This non-locality of quantum correlations in entangled states does not lead to any internal contradictions in quantum mechanics and is compatible with other successful theories (such as special relativity) as well as with existing experimental data. Nonetheless, it is very surprising and it has been very thoroughly studied. In this way, various conditions (e.g., the Bell inequality) have been found that would have to be satisfied by values of observables if the values were objective and locality were satisfied. Experiments show that such conditions are violated and, moreover, that their violation can even be exploited in quantum communication techniques [70].
Finally, let us repeat here that non-local correlations and non-objectivity of observables are accepted by our interpretation of quantum mechanics to the full extent and that this does not lead to any contradiction with the Basic Ontological Hypothesis of Quantum Mechanics or with the Realist Model Approach (see Section 1.0.3.).

3.2. Composition of identical systems

This section gives an account of composition of non-heterogeneous systems, explains why the standard theory is inadequate and introduces all necessary corrections.
Fock-space methods are an efficient mathematical tool for dealing with such systems (see, e.g., [21], p. 137 and [59]). However, they are not advantageous for the presentation of our ideas on general identical systems and we shall not use them in this paper. This does not mean that they cannot be used to help solve various mathematical problems that occur in specific cases.

3.2.1. Identical subsystems

If a prepared object has more than one subsystem of the same type (identical subsystems), then these subsystems are indistinguishable. This idea can be mathematically expressed as invariance with respect to permutations.
Let $S_N$ be the permutation group of $N$ objects, that is, each element $g$ of $S_N$ is a bijective map $g : \{1,\dots,N\} \to \{1,\dots,N\}$, the inverse element to $g$ is the inverse map $g^{-1}$ and the group product of $g_1$ and $g_2$ is defined by $(g_1 g_2)(k) = g_1(g_2(k))$, $k \in \{1,\dots,N\}$.
Given a Hilbert space $\mathbf{H}$, let us denote by $\mathbf{H}^N$ the tensor product of $N$ copies of $\mathbf{H}$,
$$\mathbf{H}^N = \mathbf{H}\otimes\mathbf{H}\otimes\dots\otimes\mathbf{H}.$$
On $\mathbf{H}^N$, the permutation group $S_N$ acts as follows. Let $\psi_k \in \mathbf{H}$, $k = 1,\dots,N$, then
$$\psi_1\otimes\dots\otimes\psi_N \in \mathbf{H}^N$$
and
$$g(\psi_1\otimes\dots\otimes\psi_N) = \psi_{g(1)}\otimes\dots\otimes\psi_{g(N)}. \tag{46}$$
The map $g$ preserves the inner product of $\mathbf{H}^N$ and is, therefore, bounded and continuous. Hence, it can be extended by linearity and continuity to the whole of $\mathbf{H}^N$. The resulting operator on $\mathbf{H}^N$ is denoted by the same symbol $g$ and is a unitary operator by construction. The action (46) thus defines a unitary representation of the group $S_N$ on $\mathbf{H}^N$.
All vectors of $\mathbf{H}^N$ that transform according to a fixed unitary representation $R$ of $S_N$ form a closed linear subspace of $\mathbf{H}^N$ that will be denoted by $\mathbf{H}_R^N$. The representations being unitary, the subspaces $\mathbf{H}_R^N$ are orthogonal to each other. Let us denote by $\mathsf{P}_R^{(N)}$ the orthogonal projection operator,
$$\mathsf{P}_R^{(N)} : \mathbf{H}^N \to \mathbf{H}_R^N.$$
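An important property of the subspaces is their invariance with respect to tensor products of unitary operators. Let $\mathsf{U}$ be a unitary transformation on $\mathbf{H}$; then $\mathsf{U}\otimes\mathsf{U}\otimes\dots\otimes\mathsf{U}$ is a unitary transformation on $\mathbf{H}\otimes\mathbf{H}\otimes\dots\otimes\mathbf{H}$ and each subspace $\mathbf{H}_R^N$ is invariant with respect to it. Hence, $\mathsf{U}\otimes\mathsf{U}\otimes\dots\otimes\mathsf{U}$ acts as a unitary transformation on $\mathbf{H}_R^N$ for each $R$.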
The location order of a given state in a tensor product can be considered as information about the identity of the corresponding system. Such information has no physical meaning and a change of the ordering is just a kind of gauge transformation. (A different and independent part of the theory of identical particles, ignored here, is that states of two identical systems can also be swapped in a physical process of continuous evolution, which can entail a non-trivial phase factor in the total state (anyons, see, e.g., Ref. [71]).) Motivated by this idea, we look for one-dimensional unitary representations of $S_N$ because only these transform vectors by multiplication with a phase factor. $S_N$ has exactly two one-dimensional unitary representations: the symmetric (trivial) one, $g \mapsto \mathsf{1}$, and the alternating one, $g \mapsto \eta(g)\mathsf{1}$ for each $g \in S_N$, where $\eta(g) = 1$ for even and $\eta(g) = -1$ for odd permutations $g$. If $R$ is the symmetric (alternating) representation, we use the symbol $\mathbf{H}_s^N$ ($\mathbf{H}_a^N$) for $\mathbf{H}_R^N$. Let $\mathsf{P}_s^{(N)}$ ($\mathsf{P}_a^{(N)}$) be the orthogonal projection on $\mathbf{H}_s^N$ ($\mathbf{H}_a^N$). Note that the usual operation of symmetrisation or antisymmetrisation on a vector $\Psi \in \mathbf{H}^N$, such as
$$\psi\otimes\phi \mapsto (1/2)(\psi\otimes\phi \pm \phi\otimes\psi)$$
in $\mathbf{H}^2$, is nothing but $\mathsf{P}_s^{(N)}\Psi$ or $\mathsf{P}_a^{(N)}\Psi$, respectively.
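The projections $\mathsf{P}_s^{(N)}$ and $\mathsf{P}_a^{(N)}$ can be tried out on a small finite-dimensional model. The following is only a minimal sketch under stated assumptions: $\mathbf{H}$ is replaced by a real space of dimension `dim`, tensors are stored as dictionaries from index tuples to amplitudes, and the helper names `sign`, `act`, `project` and `inner` are hypothetical, chosen for this illustration. The projections are realised as group averages over $S_N$, weighted by $\eta(g)$ in the alternating case.

```python
from itertools import permutations, product
from math import factorial


def sign(g):
    """Return eta(g): +1 for an even permutation, -1 for an odd one."""
    inversions = sum(
        1 for i in range(len(g)) for j in range(i + 1, len(g)) if g[i] > g[j]
    )
    return -1 if inversions % 2 else 1


def act(g, tensor, dim):
    """Apply g to a tensor stored as {index tuple: amplitude}.

    On product vectors this sends psi_1 x ... x psi_N to
    psi_g(1) x ... x psi_g(N), as in Equation (46), with 0-based labels.
    """
    n = len(g)
    out = {}
    for idx in product(range(dim), repeat=n):
        src = [0] * n
        for k in range(n):
            src[g[k]] = idx[k]
        out[idx] = tensor.get(tuple(src), 0.0)
    return out


def project(tensor, n, dim, alternating=False):
    """Average g over S_N, weighted by eta(g) in the alternating case."""
    out = {idx: 0.0 for idx in product(range(dim), repeat=n)}
    for g in permutations(range(n)):
        weight = sign(g) if alternating else 1
        for idx, value in act(g, tensor, dim).items():
            out[idx] += weight * value / factorial(n)
    return out


def inner(a, b):
    """Inner product of two tensors with real amplitudes."""
    return sum(value * b.get(idx, 0.0) for idx, value in a.items())


# psi x phi with psi = e_0 and phi = e_1 in a two-dimensional model space
psi_phi = {(0, 1): 1.0}
sym = project(psi_phi, n=2, dim=2)
alt = project(psi_phi, n=2, dim=2, alternating=True)
```

In this model, `sym` and `alt` reproduce $(1/2)(\psi\otimes\phi \pm \phi\otimes\psi)$, they are mutually orthogonal, and applying `project` a second time leaves them unchanged, as expected of orthogonal projections.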
Now, we are ready to formulate the basic assumption concerning identical subsystems. From relativistic quantum field theory [72], we take over the following result.
Rule 16 Let $\mathcal{S}^N$ be a quantum system composed of $N$ subsystems $\mathcal{S}$, each of type $\tau$ with Hilbert space $\mathbf{H}_\tau$. Then, the Hilbert space of $\mathcal{S}^N$ is $\mathbf{H}_{\tau s}^N$ for subsystems with integer spin and $\mathbf{H}_{\tau a}^N$ for those with half-integer spin. If systems $\mathcal{S}$ are closed and do not interact, then the representation of group $\bar{G}_c^+$ on $\mathbf{H}_{\tau s}^N$ or $\mathbf{H}_{\tau a}^N$ is the tensor product of its $N$ representations on $\mathbf{H}_\tau$.
Definition 21 Systems with integer spin are called bosons and those with half-integer spin are called fermions. The symmetry properties of states lead to Bose–Einstein statistics for bosons and Fermi–Dirac statistics for fermions.
We can, therefore, introduce a useful notation: a common symbol $\mathbf{H}_{R(\tau)}^N$ for the subspaces of the symmetric and anti-symmetric representations and $\mathsf{P}_\tau^{(N)}$ for the corresponding projections, because the representation is determined by the system type $\tau$.
For the Galilean group, we have:
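Proposition 11 Let $\mathsf{G}$ be the generator of subgroup $g(t)$ of $\bar{G}_c^+$ on $\mathbf{H}_\tau$. Then, the generator $\tilde{\mathsf{G}}$ of $g(t)$ on $\mathbf{H}_{R(\tau)}^N$ is given by
$$\tilde{\mathsf{G}} = \mathsf{G}\otimes\mathsf{1}_2\otimes\dots\otimes\mathsf{1}_N + \mathsf{1}_1\otimes\mathsf{G}\otimes\mathsf{1}_3\otimes\dots\otimes\mathsf{1}_N + \dots + \mathsf{1}_1\otimes\dots\otimes\mathsf{1}_{N-1}\otimes\mathsf{G}.$$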
Observe that the form of the generator is independent of whether the space is symmetric or anti-symmetric. Proposition 11 also suggests a way of constructing an operator on $\mathbf{H}_{R(\tau)}^N$ from that on $\mathbf{H}_\tau$ that will be important later. Let us study the properties of the construction with the help of a simple example.
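Suppose that $\mathcal{S}$ is a bosonic particle with Hilbert space $\mathbf{H}_\tau$ and let $\mathcal{S}^2$ be composed of two such bosons with Hilbert space $\mathbf{H}_{\tau s}^2$. Let $\mathsf{A}$ be a bounded s.a. operator on $\mathbf{H}_\tau$ with discrete non-degenerate value set $\{a_k\}$ and let the projection on the eigenspace of $a_k$ be $\mathsf{P}_k = |\psi_k\rangle\langle\psi_k|$ for all $k$, where $\psi_k$ is an eigenvector of $\mathsf{A}$, $\mathsf{A}|\psi_k\rangle = a_k|\psi_k\rangle$. Consider the operator $\mathsf{A}\otimes\mathsf{1} + \mathsf{1}\otimes\mathsf{A}$ on $\mathbf{H}_{\tau s}^2$. We obtain
$$\mathsf{A}\otimes\mathsf{1} + \mathsf{1}\otimes\mathsf{A} = \sum_{k=1} a_k(\mathsf{P}_k\otimes\mathsf{1} + \mathsf{1}\otimes\mathsf{P}_k).$$
However, $\mathsf{P}_k\otimes\mathsf{1} + \mathsf{1}\otimes\mathsf{P}_k$ is not a projection. Indeed,
$$(\mathsf{P}_k\otimes\mathsf{1} + \mathsf{1}\otimes\mathsf{P}_k)^2 = \mathsf{P}_k\otimes\mathsf{1} + \mathsf{1}\otimes\mathsf{P}_k + 2\,\mathsf{P}_k\otimes\mathsf{P}_k.$$
The eigenvectors of $\mathsf{A}\otimes\mathsf{1} + \mathsf{1}\otimes\mathsf{A}$ are
$$\frac{1}{\sqrt{2}}\left(|\psi_k\rangle\otimes|\psi_l\rangle + |\psi_l\rangle\otimes|\psi_k\rangle\right) \tag{49}$$
with eigenvalue $a_k + a_l$, for all $k < l$, and
$$|\psi_k\rangle\otimes|\psi_k\rangle$$
with eigenvalue $2a_k$ for all $k$. There can be a degeneracy if there are values $k < l$ and $k' < l'$ such that $a_k + a_l = a_{k'} + a_{l'}$. The projection on eigenvector (49) is
$$\frac{1}{2}\mathsf{P}_k\otimes\mathsf{P}_l + \frac{1}{2}\mathsf{P}_l\otimes\mathsf{P}_k + \frac{1}{2}\left(|\psi_k\rangle\langle\psi_l|\otimes|\psi_l\rangle\langle\psi_k| + |\psi_l\rangle\langle\psi_k|\otimes|\psi_k\rangle\langle\psi_l|\right),$$
which cannot be easily expressed in terms of $\mathsf{P}_k$ and $\mathsf{P}_l$. We can see from this example that it is not straightforward to construct a POV measure for a system composed of identical particles from that of its subsystems. But we can still use expression (48) and its analogues in calculations. We shall generalize this construction later.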
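For readers who want the intermediate steps of this example, the following short derivation is a sketch that uses only $\mathsf{P}_k^2 = \mathsf{P}_k$ and the eigenvalue equation $\mathsf{A}|\psi_k\rangle = a_k|\psi_k\rangle$ stated above.

```latex
\begin{align*}
(\mathsf{P}_k\otimes\mathsf{1} + \mathsf{1}\otimes\mathsf{P}_k)^2
  &= \mathsf{P}_k^2\otimes\mathsf{1}
     + 2\,\mathsf{P}_k\otimes\mathsf{P}_k
     + \mathsf{1}\otimes\mathsf{P}_k^2 \\
  &= \mathsf{P}_k\otimes\mathsf{1} + \mathsf{1}\otimes\mathsf{P}_k
     + 2\,\mathsf{P}_k\otimes\mathsf{P}_k , \\[4pt]
(\mathsf{A}\otimes\mathsf{1} + \mathsf{1}\otimes\mathsf{A})
  \bigl(|\psi_k\rangle\otimes|\psi_l\rangle + |\psi_l\rangle\otimes|\psi_k\rangle\bigr)
  &= (a_k + a_l)\,|\psi_k\rangle\otimes|\psi_l\rangle
     + (a_l + a_k)\,|\psi_l\rangle\otimes|\psi_k\rangle \\
  &= (a_k + a_l)
     \bigl(|\psi_k\rangle\otimes|\psi_l\rangle + |\psi_l\rangle\otimes|\psi_k\rangle\bigr).
\end{align*}
```

The extra term $2\,\mathsf{P}_k\otimes\mathsf{P}_k$ is what prevents $\mathsf{P}_k\otimes\mathsf{1} + \mathsf{1}\otimes\mathsf{P}_k$ from being idempotent.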
For states and observables, we have:
Proposition 12 Possible states of system $\mathcal{S}^N$ composed of $N$ systems of type $\tau$ are elements of $\mathbf{T}(\mathbf{H}_{R(\tau)}^N)_1^+$ and the effects of $\mathcal{S}^N$ are elements of $\mathbf{L}_r(\mathbf{H}_{R(\tau)}^N)_{\leq 1}^+$.
Proposition 12 follows directly from the definition of a system with a given Hilbert space. To see its significance, let us make a comparison with the case of subsystems of different types. Let system $\mathcal{S}$ be composed of two particles $\mathcal{S}_1$ and $\mathcal{S}_2$ of different types. Let us prepare $\mathcal{S}$ in a state $\mathsf{P}[\Psi(\vec{x}_1,\vec{x}_2)] \in \mathbf{T}(\mathbf{H}_1\otimes\mathbf{H}_2)_1^+$. Then, $\mathcal{S}_1$ is in state $\mathsf{T} \in \mathbf{T}(\mathbf{H}_1)_1^+$ given by kernel
$$T(\vec{x}_1;\vec{x}_1') = \int d^3x_2\, \Psi^*(\vec{x}_1',\vec{x}_2)\,\Psi(\vec{x}_1,\vec{x}_2)$$
and a sharp observable of $\mathcal{S}_1$ with kernel $\mathsf{A}$ can be measured in such a way that we measure the sharp observable
$$\mathsf{A}\otimes\mathsf{1}_2 \tag{50}$$
of $\mathcal{S}$. This is an expression of the obvious fact that observing anything on a subsystem is tantamount to observing something on the whole system. Thus, composing heterogeneous systems disturbs neither their individuality nor the rules valid for each of them separately (even in the cases in which they are entangled).
Looking at the composition of identical systems we observe a very different picture. A simple example of two spin-zero particles will give a sufficient illustration of the phenomena. Let us consider two experiments.
Experiment I: State $\mathsf{P}[\psi]$ of particle $\mathcal{S}_1$ of type $\tau$ is prepared in our laboratory.
Experiment II: State $\mathsf{P}[\psi]$ is prepared as in Experiment I and state $\mathsf{P}[\phi]$ of particle $\mathcal{S}_2$ of the same type $\tau$ is prepared simultaneously in a remote laboratory.
If our laboratory does not know about the second one, it believes that the state of $\mathcal{S}_1$ is $\mathsf{P}[\psi] \in \mathbf{T}(\mathbf{H}_\tau)_1^+$. If it does, then it believes that the state of the composite system $\mathcal{S}$ is, according to Proposition 12, $\mathsf{P}[\Psi] \in \mathbf{T}(\mathbf{H}_{R(\tau)}^2)_1^+$ given by
$$\Psi(\vec{x}_1,\vec{x}_2) = \nu\bigl(\psi(\vec{x}_1)\phi(\vec{x}_2) + \phi(\vec{x}_1)\psi(\vec{x}_2)\bigr), \tag{51}$$
where $\nu = [2(1+|c|^2)]^{-1/2}$ is a normalization factor, $c = \langle\psi|\phi\rangle$. This is true even if states $\psi$ and $\phi$ are localized (the wave functions have supports) within the respective laboratories.
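The value of the normalization factor can be checked directly. The following is a short sketch that assumes only that $\psi$ and $\phi$ are normalized and that $c = \langle\psi|\phi\rangle$.

```latex
\begin{align*}
\langle\Psi|\Psi\rangle
  &= \nu^2\bigl(\langle\psi|\psi\rangle\langle\phi|\phi\rangle
     + \langle\psi|\phi\rangle\langle\phi|\psi\rangle
     + \langle\phi|\psi\rangle\langle\psi|\phi\rangle
     + \langle\phi|\phi\rangle\langle\psi|\psi\rangle\bigr) \\
  &= \nu^2\bigl(1 + |c|^2 + |c|^2 + 1\bigr)
   = 2\nu^2\bigl(1 + |c|^2\bigr) = 1
  \quad\Longrightarrow\quad
  \nu = \bigl[2(1+|c|^2)\bigr]^{-1/2}.
\end{align*}
```

For states localized in disjoint laboratories, $c = 0$ and the factor reduces to $\nu = 1/\sqrt{2}$.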
Localisation of the states $\psi$ and $\phi$ removes at least the following difficulty. Suppose that a fermion is prepared in a remote laboratory in a state $\phi$. Then a fermion of the same type cannot be prepared in the same state $\psi = \phi$ in our laboratory: the composite state of the two fermions would then have to be zero (Pauli exclusion principle). However, if the wave function of the fermion prepared in our laboratory falls off rapidly outside our laboratory and that in the remote laboratory does the same outside the remote one, then the two wave functions would be different, $\psi(\vec{x}) \neq \phi(\vec{x})$, and their antisymmetric combination would not be zero even if $\psi$ was just a Euclidean group transform of $\phi$. The requirement of the fall-off is very plausible indeed. It would be technically impossible to prepare a state with the same wave function $\psi$ in a laboratory located in Prague and simultaneously in a laboratory in Bern, say, even for bosons.
Returning to our original experiments, it seems that the intuitive notion of preparation reaches its limits and becomes ambiguous. Does this ambiguity have any observable consequences? To answer this question, let us first consider Experiment I supplemented by a registration corresponding to the sharp observable $\mathsf{A}$ of $\mathcal{S}_1$ and let the registration be made in our laboratory. Measurements of this kind lead to the average value $\langle\psi|\mathsf{A}|\psi\rangle$. Second, perform Experiment II supplemented by the registration by the same apparatus in our laboratory as above. Because the apparatus cannot distinguish between the contributions by the two particles, the correct observable corresponding to this registration now is:
$$\mathsf{A}\otimes\mathsf{1}_2 + \mathsf{1}_1\otimes\mathsf{A}. \tag{52}$$
We can see a physical motivation for the construction suggested above by the considerations about the Galilean group: here, it is not the additivity as in the case of heterogeneous systems but the way identical particles contribute to averages. Operator (52) acts on $\mathbf{H}_{R(\tau)}^2$ so that it satisfies Proposition 12 while (50) does not. The proposed measurements lead to the average value defined by Equations (51) and (52):
$$\frac{\langle\psi|\mathsf{A}\psi\rangle + \langle\phi|\mathsf{A}\phi\rangle + c\,\langle\phi|\mathsf{A}\psi\rangle + c^*\langle\psi|\mathsf{A}\phi\rangle}{1 + |c|^2}. \tag{53}$$
Average (53) appreciably differs from $\langle\psi|\mathsf{A}|\psi\rangle$ for many choices of $\mathsf{A}$, such as all generators of group $\bar{G}_c^+$ and most operators constructed from them such as position, momentum, spin, angular momentum, energy, etc. (the majority of operators dealt with in any textbook). In particular, if the states are localized inside each laboratory that prepares them, then $c = 0$ and average (53) is $\langle\psi|\mathsf{A}\psi\rangle + \langle\phi|\mathsf{A}\phi\rangle$. Consider the position operator $\vec{\mathsf{Q}}$. In $Q$-representation, the difference between the two averages is
$$\int d^3x\, \vec{x}\,\phi^*(\vec{x})\phi(\vec{x}).$$
This can be made arbitrarily large by choosing the remote laboratory far enough.
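The size of this effect can be illustrated numerically. The sketch below is an assumption-laden toy model, not part of the original argument: space is replaced by a one-dimensional lattice of eight sites, wave functions are real amplitude lists, and the names `symmetrized`, `two_body_average` and `formula_53` are hypothetical helpers. It compares a brute-force average of $\mathsf{A}\otimes\mathsf{1}_2 + \mathsf{1}_1\otimes\mathsf{A}$ in state (51) with expression (53).

```python
def inner(u, v):
    """<u|v> for real amplitude lists on a 1-D lattice."""
    return sum(a * b for a, b in zip(u, v))


def apply(op, v):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in op]


def symmetrized(psi, phi):
    """Two-boson amplitudes Psi[x1][x2] of Equation (51), normalized."""
    c = inner(psi, phi)
    nu = (2.0 * (1.0 + c * c)) ** -0.5
    n = len(psi)
    return [[nu * (psi[i] * phi[j] + phi[i] * psi[j]) for j in range(n)]
            for i in range(n)]


def two_body_average(op, big_psi):
    """<Psi| A x 1 + 1 x A |Psi> computed by brute force."""
    n = len(big_psi)
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # A acting on the first slot, then on the second slot
                total += big_psi[i][j] * op[i][k] * big_psi[k][j]
                total += big_psi[i][j] * op[j][k] * big_psi[i][k]
    return total


def formula_53(op, psi, phi):
    """Expression (53) for real wave functions, where c* = c."""
    c = inner(psi, phi)
    numerator = (inner(psi, apply(op, psi)) + inner(phi, apply(op, phi))
                 + c * inner(phi, apply(op, psi))
                 + c * inner(psi, apply(op, phi)))
    return numerator / (1.0 + c * c)


SITES = 8
position = [[float(i) if i == j else 0.0 for j in range(SITES)]
            for i in range(SITES)]
psi = [0.6, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # "our" laboratory
phi = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.8, 0.6]  # remote laboratory

single = inner(psi, apply(position, psi))        # Experiment I
composite = formula_53(position, psi, phi)       # Experiment II
difference = composite - single                  # equals <phi|Q phi> here
```

In this toy model the two averages differ by the mean position of the remote particle, and moving the support of `phi` to more distant sites increases `difference` accordingly.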
We conclude that system $\mathcal{S}_1$ does not seem to be in state $\mathsf{P}[\psi]$ prepared in our laboratory and registration of its observable $\mathsf{A}$ in our laboratory seems to give values influenced by external circumstances that are not under our control. Thus, Proposition 12 would lead to violation of the Rules stated in Section 2 if it were not supplemented by some further assumptions.

3.2.2. Cluster separability

There was a proposal (Ref. [21], p. 128) that the problem described in the previous section can be avoided by adding some kind of locality to properties of observables. Such locality assumptions are quite popular in various branches of quantum theory. Let us briefly look at them.
The full relativistic theory starts with the requirement that space-time symmetries of a closed system be realized by unitary representations of Poincaré group on the Hilbert space of states, see Refs. [72] and [45]. Then, the cluster decomposition principle, a locality assumption, states that if multi-particle scattering experiments are studied in distant laboratories, then the S-matrix element for the overall process factorizes into those concerning only the experiments in the single laboratories. This ensures a factorization of the corresponding transition probabilities, so that an experiment in one laboratory cannot influence the results obtained in another one. Cluster decomposition principle implies non-trivial local properties of the theory underlying the S-matrix, in particular it plays a crucial part in suggesting that local field theory is inevitable (cf. Ref. [72], Chap. 4).
In the phenomenological theory of relativistic or non-relativistic many-body systems, Hilbert space of a closed system must also carry a unitary representation of Poincaré or Galilei group. Then, the so-called cluster separability is a locality assumption, see, e.g., Refs. [59] or [73] and references therein. It is a condition on interaction terms in the generators of the space-time symmetry group saying: if the system is divided into disjoint subsystems (i.e., clusters) by a sufficiently large spacelike separation, then each subsystem behaves as a closed system with a suitable representation of space-time symmetries on its Hilbert space, see Ref. [59], Section 6.1. Let us call this principle Cluster Separability I.
Peres’ proposal contains another special case of locality assumption. Let us reformulate it as follows:
Cluster Separability II No quantum experiment with a system in a local laboratory is affected by the mere presence of an identical system in remote parts of the universe.
It is well known (see, e.g., Ref. [21], p. 136) that this principle leads to restrictions on possible statistics (fermions, bosons). What is less well known is that it also motivates non-trivial locality conditions on observables.
Some locality condition is already formulated in Ref. [21], p. 128:
... a state $w$ is called remote if $\|\mathsf{A}w\|$ is vanishingly small, for any operator $\mathsf{A}$ which corresponds to a quantum test in a nearby location. ... We can now show that the entanglement of a local quantum system with another system in a remote state (as defined above) has no observable effect.
This is a condition on $\mathsf{A}$ inasmuch as there has to be at least one remote state for $\mathsf{A}$.
However, Peres does not warn that most operators of quantum theory do not satisfy his condition on $\mathsf{A}$ [19]. Indeed, observables of quantum field theory and many-body theories that are constructed from generators of Poincaré or Galilei groups do not satisfy the locality condition. It follows that Cluster Separability II is logically independent of the cluster decomposition principle and of Cluster Separability I.
The present section reformulates and extends Peres’ ideas. Let us explain everything on an example of single spin-zero particle, working in Q-representation of Hilbert space H τ and of operators on it. A more general theory will be developed in the next subsection.
We introduce an important locality property of observables [19]. (A similar local condition on observables has been introduced in [58,74].) Let $D \subset \mathbb{R}^3$ be open. An operator with kernel $a(\vec{x}_1;\vec{x}_1')$ is $D$-local if
$$\int d^3x_1'\, a(\vec{x}_1;\vec{x}_1')\,f(\vec{x}_1') = \int d^3x_1\, a(\vec{x}_1;\vec{x}_1')\,f(\vec{x}_1) = 0,$$
for any test function $f$ that vanishes in $D$.
Now assume for Experiment II that our laboratory is inside open set $D \subset \mathbb{R}^3$ and that $\operatorname{supp}\phi \cap D = \emptyset$. Then, the second term in (53) vanishes for all $D$-local observables and averages $\langle\psi|\mathsf{A}|\psi\rangle$ and (53) agree in this case. Hence, in this case, the two approaches are compatible: First, system $\mathcal{S}_1$ is in state $\psi$ and the observable that is measured on it is $\mathsf{A}$. Second, system $\mathcal{S}_1 + \mathcal{S}_2$ is in state (51) and the observable being measured is (52).
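The agreement of the two averages for $D$-local observables can be checked in a lattice toy model. This is a minimal sketch under explicit assumptions: space is an eight-site one-dimensional lattice, $D$ is the set of sites `{0, 1, 2, 3}`, the operator `hopping` is an arbitrary illustrative choice of a non-diagonal s.a. matrix, and `restrict` and `average_53` are hypothetical helper names. Restricting the kernel to $D \times D$ is one simple way to obtain a $D$-local operator in this model.

```python
def inner(u, v):
    """<u|v> for real amplitude lists on a 1-D lattice."""
    return sum(a * b for a, b in zip(u, v))


def apply(op, v):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in op]


def restrict(op, region):
    """Keep the kernel a(x; x') only where both x and x' lie in region."""
    n = len(op)
    return [[op[i][j] if i in region and j in region else 0.0
             for j in range(n)] for i in range(n)]


def average_53(op, psi, phi):
    """Expression (53) for real wave functions, where c* = c."""
    c = inner(psi, phi)
    numerator = (inner(psi, apply(op, psi)) + inner(phi, apply(op, phi))
                 + c * inner(phi, apply(op, psi))
                 + c * inner(psi, apply(op, phi)))
    return numerator / (1.0 + c * c)


SITES = 8
D = {0, 1, 2, 3}                                   # region of our laboratory
hopping = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
            for j in range(SITES)] for i in range(SITES)]
local_op = restrict(hopping, D)                    # D-local in this model

psi = [0.6, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]     # supp psi inside D
phi = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.8, 0.6]     # supp phi disjoint from D

single = inner(psi, apply(local_op, psi))          # <psi|A psi>
with_local = average_53(local_op, psi, phi)        # agrees with single
with_full = average_53(hopping, psi, phi)          # picks up <phi|A phi>
```

With the restricted operator the remote particle contributes nothing, whereas the unrestricted `hopping` operator yields an average that also contains the contribution of `phi`.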
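This suggests the following idea. Let $\mathcal{S}$ be prepared in a state $\psi(\vec{x})$ such that $\operatorname{supp}\psi \subset D$ and that registration of any $D$-local observable $\mathsf{A}$ of $\mathcal{S}$ leads to average $\langle\psi(\vec{x})|\mathsf{A}\psi(\vec{x})\rangle$. In such a case, we say that $\mathcal{S}$ has separation status $D$.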
This includes the following condition that must be fulfilled by the apparatus $\mathcal{A}$ registering observable $\mathsf{A}$: the states of those subsystems of $\mathcal{A}$ that are identical with $\mathcal{S}$ do not disturb the registration of $\mathsf{A}$ by $\mathcal{A}$ (but may disturb other registrations). A model of $\mathcal{A}$ satisfying the condition is given in Section 5.3.
In the above example, the reason why the registration of $\mathsf{A}$ is not disturbed by other identical systems is that the wave functions of these systems vanish in $D$. This could also be weakened to
$$\int_D d^3x\, \phi^*(\vec{x})\phi(\vec{x}) \approx 0,$$
so that the registration apparatuses are not sensitive enough to react to $\phi$. In any case, the vanishing of the wave functions of all external identical systems provides a good mathematical model of separation status and we shall assume that the mathematical results obtained for such models are approximately valid for more realistic situations.
Now, we can introduce a correction to our intuitive notion of preparation. The separation status must be understood as new condition on preparations and observables that could be formulated as follows:
Rule 17 Let $\mathcal{S}$ be a quantum system of type $\tau$ with Hilbert space $\mathbf{H}_\tau$. Any preparation of $\mathcal{S}$ must give it separation status $D$ satisfying $D \neq \emptyset$. Then the prepared state of $\mathcal{S}$ is an element of $\mathbf{T}(\mathbf{H}_\tau)_1^+$ and $D$-local effects of $\mathbf{L}_r(\mathbf{H}_\tau)_{\leq 1}^+$ are individually registrable on $\mathcal{S}$, but only these are.
Thus, any preparation must also provide sufficient isolation of the prepared system $\mathcal{S}$ from the influence of particles of the environment that are identical to constituents of the prepared system. In this way, control over what is prepared and what is registered might be regained and Cluster Separability II would hold.
What is the nature and aim of the limitation of observables to $D$-local ones? A current opinion is that, on the one hand, the states of each quantum system of type $\tau$ with Hilbert space $\mathbf{H}_\tau$ are all positive normalized s.a. operators on $\mathbf{H}_\tau$ and its observables are all s.a. operators on $\mathbf{H}_\tau$. On the other hand, in the everyday mess of practical conditions of each particular experiment, only some of these states occur and only some of these effects are registrable. That is all that the current opinion can say.
In this sense, the current language of quantum mechanics is too abstract and too far away from real experiments. So far away that it cannot give a detailed account of experiments without getting entangled in contradictions. This is one of the reasons why it has problems with realist interpretation, classical properties and theory of measurement. Our aim is to refine the language so that it becomes more suitable for description of real experiments.
For example, the concept “quantum system of type τ” is an ideal theoretical one that has no real counterpart. It has been abstracted by generalization from many particular experiments with real objects. Only objects that have been prepared are real and we have to study these real objects first to arrive at useful abstract concepts. We must view each real physical quantum object S as a system with a modified set of observables. The modification depends on the bounded open set D that is determined by the preparation of S . Hence, in general, any registrable observable must be D-local for some bounded open set D. The price is that, strictly speaking, no quantum system is proper (see Section 2.2.2.).
Simple examples of separation status are $D = \emptyset$ and $D = \mathbb{R}^3$. The first, the so-called trivial separation status, is the case when $\mathcal{S}$ is not separated from a system composed of $N$ particles of the same kind. Then, $\mathcal{S}$ has no individual states and observables. An example of the second is the case that $\mathcal{S}$ is isolated: the whole space $\mathbb{R}^3$ is empty except for $\mathcal{S}$. The standard quantum mechanics works just with these two separation statuses, assuming tacitly that systems are approximately isolated. Our definition of general separation status $D$ explains what may be meant by “approximately” and why quantum mechanics works at all in practical applications.

3.2.3. Mathematical theory of D-local observables

In the two foregoing sections, some raw physical ideas on composition of identical systems have been put forward. Now, we start to give a general mathematical formulation of these ideas. In the present section, we extend the definition of $D$-local operators to composite systems containing more than just one kind of particle and to non-vector states, and we study the question of how $D$-local POV measures can be constructed from general ones. We limit ourselves to one-dimensional POV measures, that is, $\mathbf{F}$ is the set of subsets of $\mathbb{R}$.
Let $\mathcal{S}$ be a general $N$-particle system. We shall work in $Q$-representation throughout, suppress spin indices and consider only those operators $\mathsf{A}$ on $\mathbf{H}$ that fulfill the following condition. The kernel $A(\vec{x}_1,\dots,\vec{x}_N;\vec{x}_1',\dots,\vec{x}_N')$ of $\mathsf{A}$ is a distribution on Schwartz space $S(\mathbb{R}^{3N})$ (its elements are rapidly decreasing $C^\infty$ functions, see, e.g., Ref. [42]) in variables $\vec{x}_1',\dots,\vec{x}_N'$ for any fixed value of $\vec{x}_1,\dots,\vec{x}_N$, and
$$g(\vec{x}_1,\dots,\vec{x}_N) = \int d^{3N}x'\, A(\vec{x}_1,\dots,\vec{x}_N;\vec{x}_1',\dots,\vec{x}_N')\,f(\vec{x}_1',\dots,\vec{x}_N') \in S(\mathbb{R}^{3N})$$
for any $f(\vec{x}_1,\dots,\vec{x}_N) \in S(\mathbb{R}^{3N})$. This is usually satisfied, see Section 2.3.1.
The general and formal definition of D-local operators is the following:
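Definition 22 Let $D \subset \mathbb{R}^3$ be open, let $\mathsf{A}$ be an operator on $\mathbf{H}$ and let the following conditions hold:
1. $A(\vec{x}_1,\dots,\vec{x}_N;\vec{x}_1',\dots,\vec{x}_N')$ is the zero distribution for any $(\vec{x}_1,\dots,\vec{x}_N) \in \mathbb{R}^{3N}\setminus D^N$.
2. $\int d^{3N}x'\, A(\vec{x}_1,\dots,\vec{x}_N;\vec{x}_1',\dots,\vec{x}_N')\,f(\vec{x}_1',\dots,\vec{x}_N') = 0$ for any test function $f$ such that $\operatorname{supp} f(\vec{x}_1,\dots,\vec{x}_N) \subset \mathbb{R}^{3N}\setminus D^N$.
Then $\mathsf{A}$ is called $D$-local.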
Let $\mathsf{A}$ be $D$-local for some open set $D$, let $D'$ be open and $D \subset D'$. Then, $\mathsf{A}$ is $D'$-local. Let $\mathsf{A}$ be $D$-local and also $D'$-local for two open sets $D$ and $D'$. Then, $\mathsf{A}$ is $(D \cap D')$-local. Thus, all open sets $D$ such that $\mathsf{A}$ is $D$-local form a filter in the Boolean lattice of open subsets of $\mathbb{R}^3$.
Example 1 $N = 1$, $A(\vec{x};\vec{x}') = x^1\delta(\vec{x}-\vec{x}')$. $\mathsf{A}$ is $\mathbb{R}^3$-local and the filter is $\{\mathbb{R}^3\}$.
Example 2 $N = 1$, $\psi(\vec{x})$ is a wave function with support $D$. Then $|\psi(\vec{x})\rangle\langle\psi(\vec{x})|$ is $D'$-local, where $D'$ is the interior of $D$, and the filter is the family of all open sets containing $D'$.
Definition 22 can be applied both to state operators and to effects (of a POV measure, see Section 2.2.1.). However, the definition of a $D$-local POV measure is more complicated than just the condition that its effects be $D$-local. Let us denote by $\mathbf{H}(D)$ the Hilbert space obtained by completion of the linear space of rapidly decreasing $C^\infty$-functions with support in $D^N$ with respect to the inner product of $\mathbf{H}$. $\mathbf{H}(D)$ is a closed linear subspace of $\mathbf{H}$, as is the set $\mathbf{H}(D)^\perp$ of all vectors in $\mathbf{H}$ orthogonal to $\mathbf{H}(D)$. Let $\mathsf{P}[\mathbf{H}(D)]$ be the orthogonal projection from $\mathbf{H}$ onto $\mathbf{H}(D)$. Then $\mathsf{1} - \mathsf{P}[\mathbf{H}(D)]$ is the projection onto $\mathbf{H}(D)^\perp$.
Definition 23 A POV measure $\mathsf{E}(X)$ of dimension 1 is called $D$-local if
1. effect $\mathsf{E}(X)$ is $D$-local for all $X \in \mathbf{F}$ such that $0 \notin X$ and
2. $\mathsf{E}(X) = \mathsf{1} - \mathsf{P}[\mathbf{H}(D)] + \mathsf{E}'(X)$ for all $X \in \mathbf{F}$ such that $0 \in X$, where $\mathsf{E}'(X)$ is a $D$-local effect.
In principle, real registrations can register only $D$-local POV measures for some bounded $D$ because each registration apparatus occupies only a limited region $D$ of space and is not sensitive to any system localized outside $D$. But there is no a priori bound on $D$.
Next, we shall prove that each one-dimensional POV measure $\mathsf{E}$ of a general system $\mathcal{S}$ is associated, for a given open set $D$, with a unique one-dimensional $D$-local POV measure $\Lambda_D(\mathsf{E})$, the so-called $D$-localization of $\mathsf{E}$, such that the condition
$$tr[\mathsf{E}(X)\mathsf{T}] = tr[\Lambda_D(\mathsf{E})(X)\,\mathsf{T}] \tag{54}$$
is satisfied for all $X \in \mathbf{F}$ and for all $D$-local states $\mathsf{T}$. Thus, any standard observable $\mathsf{E}$, such as position, momentum, etc., can be registered under the condition that the system is within $D$ by registering $\Lambda_D(\mathsf{E})$, and the probability distribution would be the same as if we had registered $\mathsf{E}$. We are going to propose that exactly this has been done whenever somebody claims to have measured position, momentum, etc.
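Let $\mathsf{A} \in \mathbf{L}_r(\mathbf{H})$. Then $\mathsf{P}[\mathbf{H}(D)]\,\mathsf{A}\,\mathsf{P}[\mathbf{H}(D)] \in \mathbf{L}_r(\mathbf{H})$. Moreover, if $\mathsf{A}$ is positive, so is $\mathsf{P}[\mathbf{H}(D)]\,\mathsf{A}\,\mathsf{P}[\mathbf{H}(D)]$, and we have also
$$\|\mathsf{P}[\mathbf{H}(D)]\,\mathsf{A}\,\mathsf{P}[\mathbf{H}(D)]\| \leq \|\mathsf{A}\|.$$
Hence, if $\mathsf{A} \in \mathbf{L}_r(\mathbf{H})_{\leq 1}^+$ then $\mathsf{P}[\mathbf{H}(D)]\,\mathsf{A}\,\mathsf{P}[\mathbf{H}(D)] \in \mathbf{L}_r(\mathbf{H})_{\leq 1}^+$. Let us call $\mathsf{P}[\mathbf{H}(D)]\,\mathsf{A}\,\mathsf{P}[\mathbf{H}(D)]$ the $D$-projection of $\mathsf{A}$. Of course, the $D$-projection is not a unitary map and it changes the spectral measure of the operator. For example, the spectral measure of $\mathsf{P}[\mathbf{H}(D)]\,\mathsf{A}\,\mathsf{P}[\mathbf{H}(D)]$ must contain $\mathsf{P}(\{0\}) = \mathsf{1} - \mathsf{P}[\mathbf{H}(D)]$ even if that of $\mathsf{A}$ does not, because all vectors in $\mathbf{H}(D)^\perp$ are eigenvectors of $\mathsf{P}[\mathbf{H}(D)]\,\mathsf{A}\,\mathsf{P}[\mathbf{H}(D)]$ to eigenvalue 0.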
Now, we are ready to give the following definition:
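Definition 24 Let $\mathsf{E}$ be a one-dimensional POV measure on $\mathbf{H}$. Then, the $D$-localization $\Lambda_D(\mathsf{E})$ of $\mathsf{E}$ is defined by
1. $\Lambda_D(\mathsf{E})(X) = \mathsf{P}[\mathbf{H}(D)]\,\mathsf{E}(X)\,\mathsf{P}[\mathbf{H}(D)]$ for all $X \in \mathbf{F}$ such that $0 \notin X$ and
2. $\Lambda_D(\mathsf{E})(X) = \mathsf{P}[\mathbf{H}(D)]\,\mathsf{E}(X)\,\mathsf{P}[\mathbf{H}(D)] + \mathsf{1} - \mathsf{P}[\mathbf{H}(D)]$ for all $X \in \mathbf{F}$ such that $0 \in X$.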
Clearly,
$$\Lambda_D(\mathsf{E})(\mathbb{R}) = \mathsf{P}[\mathbf{H}(D)]\,\mathsf{1}\,\mathsf{P}[\mathbf{H}(D)] + \mathsf{1} - \mathsf{P}[\mathbf{H}(D)] = \mathsf{1},$$
and the normalization condition holds. Then, all other conditions on POV measures are also satisfied by $\Lambda_D(\mathsf{E})$ and Equation (54) is true.
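Definition 24 can be illustrated on a three-dimensional toy model. The sketch below rests on assumptions that are not in the text: $\mathbf{H}$ is replaced by $\mathbb{R}^3$ with one basis vector per lattice site, $\mathbf{H}(D)$ is spanned by the first two sites, the sharp observable has values 0, 1, 2 with the listed effects, and the helper names `d_projection` and `localize` are hypothetical. Each value is treated as the singleton set $X = \{a_k\}$.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b, factor=1.0):
    """Return a + factor * b."""
    return [[x + factor * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


ONE = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
P_D = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]  # projection onto H(D)

# Effects of a sharp observable with values 0, 1, 2; two of its
# eigenvectors, (1, 0, 1)/sqrt(2) and (1, 0, -1)/sqrt(2), straddle D.
EFFECTS = {
    0.0: [[0.5, 0.0, 0.5], [0.0, 0.0, 0.0], [0.5, 0.0, 0.5]],
    1.0: [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],
    2.0: [[0.5, 0.0, -0.5], [0.0, 0.0, 0.0], [-0.5, 0.0, 0.5]],
}


def d_projection(op):
    """P[H(D)] op P[H(D)]."""
    return matmul(matmul(P_D, op), P_D)


def localize(value, effect):
    """D-localization of the effect for the singleton X = {value}."""
    out = d_projection(effect)
    if value == 0.0:  # the case 0 in X of Definition 24
        out = add(out, add(ONE, P_D, -1.0))
    return out


LOCALIZED = {value: localize(value, e) for value, e in EFFECTS.items()}

# A D-local state: the projector on psi = (0.6, 0.8, 0), supported in D
T = [[0.36, 0.48, 0.0], [0.48, 0.64, 0.0], [0.0, 0.0, 0.0]]
probs = {v: trace(matmul(e, T)) for v, e in EFFECTS.items()}
probs_localized = {v: trace(matmul(e, T)) for v, e in LOCALIZED.items()}
```

In this model the localized effects sum to the identity and give the same probabilities on `T` as the original effects, in line with Equation (54), while `d_projection(EFFECTS[2.0])` is no longer idempotent, so sharpness is lost.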
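Let us study this definition for a simple example. Let $\mathsf{A}$ be a bounded s.a. operator on $\mathbf{H}$ with discrete value set $\{a_k\}$ including $a_0 = 0$ and let the projection on the eigenspace of $a_k$ be $\mathsf{P}_k$ for all $k$. Then
$$\mathsf{A} = \sum_{k=1} a_k \mathsf{P}_k. \tag{55}$$
The corresponding PV measure is
$$\mathsf{E}^{\mathsf{A}}(X) = \sum_{k \in K(X)} \mathsf{P}_k,$$
where $k \in K(X)$ if $a_k \in X$. Let $D$ be open, and let $\mathbf{H}(D)$ and $\mathsf{P}[\mathbf{H}(D)]$ be defined as above. Equation (55) and Definition 24 imply that the $D$-projection of $\mathsf{A}$ can then be written as
$$\mathsf{P}[\mathbf{H}(D)]\,\mathsf{A}\,\mathsf{P}[\mathbf{H}(D)] = \sum_{k=0} a_k \Lambda_D(\mathsf{P}_k), \tag{56}$$
where
$$\Lambda_D(\mathsf{P}_k) = \mathsf{P}[\mathbf{H}(D)]\,\mathsf{P}_k\,\mathsf{P}[\mathbf{H}(D)] + \delta_{0k}\bigl(\mathsf{1} - \mathsf{P}[\mathbf{H}(D)]\bigr)$$
and
$$\Lambda_D(\mathsf{E}^{\mathsf{A}})(X) = \sum_{k \in K(X)} \Lambda_D(\mathsf{P}_k).$$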
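Clearly, $\mathsf{P}[\mathbf{H}(D)]\,\mathsf{P}_k\,\mathsf{P}[\mathbf{H}(D)]$ is a projection only if $\mathsf{P}_k$ and $\mathsf{P}[\mathbf{H}(D)]$ commute. Projections onto subspaces $\mathbf{H}(D)$ and $\mathbf{H}_k$ of $\mathbf{H}$ commute only if either $\mathbf{H}(D) \cap \mathbf{H}_k = \mathbf{0}$, where $\mathbf{0}$ is the zero vector of $\mathbf{H}$, or $\mathbf{H}(D) \subset \mathbf{H}_k$ or $\mathbf{H}_k \subset \mathbf{H}(D)$.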
Of course, the $k = 0$ element of the right-hand side of sum (56) vanishes and it is not necessary for the validity of the equation to care how $\Lambda_D(\mathsf{P}_0)$ is to be defined. With our definition, however, the normalization condition on the POV measures is satisfied.
As $\mathsf{A}$ is bounded and s.a., so is $\mathsf{P}[\mathbf{H}(D)]\,\mathsf{A}\,\mathsf{P}[\mathbf{H}(D)]$; hence it also possesses its own spectral decomposition, which is in general different from Equation (56). Its value set will be different from $\{a_k\}$ and its projections different from $\mathsf{P}[\mathbf{H}(D)]\,\mathsf{P}_k\,\mathsf{P}[\mathbf{H}(D)]$ (there does not seem to be any general way to calculate $\mathsf{E}^{\mathsf{P}[\mathbf{H}(D)]\mathsf{A}\mathsf{P}[\mathbf{H}(D)]}$ from $\mathsf{E}^{\mathsf{A}}$). Hence, our definition of the $D$-localization of an observable is different from the $D$-projection of the corresponding operator. In particular, our definition preserves the value set (adding possibly 0 to it) but does not preserve the sharpness. Moreover, it makes sense for unbounded operators, too.
The above definitions of $D$-local observables and of $D$-localization could be generalized as follows. As yet, $D$ has been an open subset in the spectrum of the position $\vec{\mathsf{Q}}$ of $\mathcal{S}$. We can take any sharp observable instead of position and all definitions and their consequences remain valid. A very important example is the following. As we shall see in Section 5, every registration apparatus $\mathcal{A}$ must include detectors. It seems that any detector can work only if the total energy of the detected system is higher than a certain threshold $E(\mathcal{A}) > 0$. Hence, the observable that is registered by such an apparatus must be $D$-local, where $D$ is the subset $E > E(\mathcal{A})$ of the energy spectrum. If the energy of all systems identical with $\mathcal{S}$ that are in the environment is smaller than $E(\mathcal{A})$, then the apparatus cannot be influenced by them, just as it is not influenced by wave functions that vanish in a neighborhood of $\mathcal{A}$. Under such conditions, $\mathcal{S}$ prepared so that its energy lies in $D$ has the generalized separation status $D$. Here, the term “local” loses its purely spatial character.

3.2.4. Separation status

The generalization of the notion of separation status introduced in Section 3.2.2. to non-vector states of composite systems and to one-dimensional POV measures is straightforward:
Definition 25 Let $D \subset \mathbb{R}^3$ be an open set and system $\mathcal{S}$ be prepared in state $\mathsf{T}$ that satisfies the following conditions:
1. There is at least one $D$-local POV measure $\mathsf{E}$ such that its average on $\mathsf{T}$ does not vanish, $\int_{\mathbb{R}} \iota\, tr[\mathsf{T}\, d\mathsf{E}] \neq 0$.
2. The average of any $D$-local one-dimensional POV measure $\mathsf{E}$ as registered on $\mathsf{T}$ is given by $\int_{\mathbb{R}} \iota\, tr[\mathsf{T}\, d\mathsf{E}]$.
Then, open set $D$ is called separation status of $\mathcal{S}$.
The second condition means that the registration of E on T is not disturbed by any other existing identical system.
Rule 17 then guarantees that control is regained, external influences are removed and possible ambiguity is harmless. An example of such an ambiguity can be constructed from Experiments I and II if the preparations are viewed as hierarchically nested. More generally, let system $\bar{\mathcal{S}}$ be prepared in state $\bar{\mathsf{T}}$ with separation status $\bar{D}$ and have a subsystem $\mathcal{S}$ that is simultaneously prepared in state $\mathsf{T}$ with separation status $D \subset \bar{D}$. Suppose further that $\bar{\mathcal{S}}$ contains at least two particles $\mathcal{S}_1$ and $\mathcal{S}_2$ of the same type such that $\mathcal{S}_1$ belongs to $\mathcal{S}$ and $\mathcal{S}_2$ does not. Then $\bar{\mathcal{S}}$ has more subsystems that are different from, but contain particles of the same type as, $\mathcal{S}$. Thus, $\mathcal{S}$ would in general be only a mathematical entity because it could not be physically distinguished from some other subsystems of $\bar{\mathcal{S}}$. However, in our special case, one of these subsystems, viz. $\mathcal{S}$, has a separation status $D$ and is, therefore, recognizable and has state $\mathsf{T}$, on which $D$-local observables of $\mathcal{S}$ can be registered.
Let us set up a mathematical model of such an ambiguity. It concerns composition of general non-heterogeneous systems S and S and the construction of observables of S ¯ = S + S from those of S analogous to Equation (52).
We assume that there is, generally, a fixed number $K$ of fermion particle types and a fixed number $L$ of boson particle types in quantum mechanics. Let $S$ be a general N-particle system composed of numbers $F_1, \ldots, F_K$ of the fermions and $B_1, \ldots, B_L$ of the bosons, where $F_k$ and $B_l$ are non-negative integers so that
$$\sum_{k=1}^{K} F_k + \sum_{l=1}^{L} B_l = N$$
is the total particle number in $S$. We call $F_k$ and $B_l$ occupation numbers. Let the Hilbert space of $F_k$ fermions be $H_a^{F_k}$, $k = 1, \ldots, K$, and that of $B_l$ bosons be $H_s^{B_l}$, $l = 1, \ldots, L$, according to Rule 16, so that the Hilbert space of $S$ is
$$H = \bigotimes_{k=1}^{K} H_a^{F_k} \otimes \bigotimes_{l=1}^{L} H_s^{B_l} \tag{57}$$
according to Rule 13. The tensor product on the right-hand side is to be understood so that the factor $H_a^{F_k}$ is left out if $F_k = 0$, and similarly for the bosons. Let the occupation numbers of $S'$ be $F'_1, \ldots, F'_K$ and $B'_1, \ldots, B'_L$, and those of $\bar S$ be $\bar F_1, \ldots, \bar F_K$ and $\bar B_1, \ldots, \bar B_L$. We have
$$N' = \sum_{k=1}^{K} F'_k + \sum_{l=1}^{L} B'_l , \quad F_k + F'_k = \bar F_k , \quad B_l + B'_l = \bar B_l$$
for all $k$ and $l$. The Hilbert spaces $H_\tau$, $H_{\tau'}$ and $H_{\tau\tau'}$ of $S$, $S'$ and $\bar S$, respectively, are given by equations analogous to (57). Let us write the Hilbert space $H_\tau \otimes H_{\tau'}$, on which $T \otimes T'$ is a state operator, as follows:
$$H_\tau \otimes H_{\tau'} = \bigotimes_{k=1}^{K} \left( H_a^{F_k} \otimes H_a^{F'_k} \right) \otimes \bigotimes_{l=1}^{L} \left( H_s^{B_l} \otimes H_s^{B'_l} \right) .$$
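For any factor $H_\rho^{k} \otimes H_\rho^{k'}$ in the expression for $H_\tau \otimes H_{\tau'}$, there is a factor $H_\rho^{k+k'}$ in the expression for $H_{\tau\tau'}$, as $\bar k = k + k'$. Here, $\rho$ is either $a$ or $s$ and $k$ is either $F_k$ or $B_l$, etc. Let $P_\rho^{k+k'}$ be the orthogonal projection,
$$P_\rho^{k+k'} : H_\rho^{k} \otimes H_\rho^{k'} \to H_\rho^{k+k'} ,$$
consisting of total symmetrisation or anti-symmetrisation depending on $\rho$. We define
$$P_{\tau\tau'} : H_\tau \otimes H_{\tau'} \to H_{\tau\tau'}$$
by
$$P_{\tau\tau'} = \bigotimes_{k=1}^{K} P_a^{F_k+F'_k} \otimes \bigotimes_{l=1}^{L} P_s^{B_l+B'_l} . \tag{58}$$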
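From the definition, it follows immediately that $P_{\tau\tau'}$ is an orthogonal projection. Let us define
$$J : \mathbf{T}(H_\tau)^+_1 \times \mathbf{T}(H_{\tau'})^+_1 \to \mathbf{T}(H_{\tau\tau'})^+_1$$
by
$$J(T, T') = \frac{P_{\tau\tau'} (T \otimes T') P_{\tau\tau'}}{\mathrm{tr}[P_{\tau\tau'} (T \otimes T') P_{\tau\tau'}]} .$$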
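The map $J$ can be illustrated numerically in the simplest case. The following is a minimal sketch, assuming two subsystems that each consist of one fermion of the same type and a hypothetical single-particle space of finite dimension `d` (the Hilbert spaces of the text are infinite-dimensional). The names `random_state`, `swap` and `J` are introduced here for illustration only.

```python
import numpy as np

d = 4                                   # single-particle dimension (toy value)
rng = np.random.default_rng(1)

def random_state(dim):
    """Random state operator: positive, trace one."""
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    t = m @ m.conj().T
    return t / np.trace(t).real

# SWAP exchanges the two tensor factors of C^d (x) C^d.
swap = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        swap[i * d + j, j * d + i] = 1.0

# Antisymmetrising projection, the toy analogue of P_{tau tau'}.
P = 0.5 * (np.eye(d * d) - swap)

def J(T, Tp):
    """J(T, T') = P (T (x) T') P / tr[P (T (x) T') P]."""
    x = P @ np.kron(T, Tp) @ P
    return x / np.trace(x).real

T, Tp = random_state(d), random_state(d)
T_bar = J(T, Tp)
print(np.trace(T_bar).real)             # trace of the composite state
```

In this toy model `P` is idempotent and self-adjoint, and `T_bar` is a positive trace-one operator that changes sign under `swap`, as expected for a state on the antisymmetric subspace.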
The symmetry properties of states such as $J(T, T')$ are stronger than the symmetry properties that operators must have so that their action does not change the symmetry properties of the states. For example, the kernels of operators (52) and (47) are not of the form $P A P$, where $P$ is an (anti-)symmetrising projection such as $P_{\tau\tau'}$. Rather, they are symmetrized in whole variable pairs $(x_i; x'_i)$. Formulas (52) and (47) are examples of the construction of an operator $\bar A$ of a system $\bar S$ from an operator $A$ of its subsystem $S$. The general form of such an operator is
$$\bar A = \sum_{\tilde S + \tilde S' = \bar S} A_{\tilde S} \otimes 1_{\tilde S'} , \tag{60}$$
where the sum is over all different pairs of subsystems $\tilde S$ and $\tilde S'$ such that $\tilde S$ is of the same kind as $S$. We call $\bar A$ the extension of $A$ to the composite. It is clear that our definition of extension is also valid for the composition of two heterogeneous systems. In this case, the extension would be $\bar A = A \otimes 1'$.
Now, we can tackle the composition of general systems. The composition of systems of the same type is determined by Rule 16 and Proposition 12 together with Rule 17 so that we can choose a state for the system that is prepared from the corresponding state space. The generalization to arbitrary systems can be formulated as follows.
Definition 26 Let system $S$ be prepared in state $T$ with separation status $D$ and system $S'$ in state $T'$ with separation status $D'$ such that $D \cap D' = \emptyset$. Then, $S$ and $S'$ are called separated.
This is a generalization of the situation occurring in Experiments I and II.
Rule 18 Let $S$ and $S'$ be separated. Then system $S + S'$ can be considered as prepared in state $\bar T = J(T, T')$ with separation status $D \cup D'$, and operators of the form (60) for $A \in \mathbf{A}_S(D)$ are observables of $S + S'$. Alternatively, $S + S'$ can be considered as prepared in state $T \otimes T'$, and operators of the form $A \otimes 1'$ are observables of $S + S'$, where $A \in \mathbf{A}_S(D)$ and $1' \in \mathbf{A}_{S'}(D')$.
Now, it also ought to be clear why we do not employ the Fock-space method to deal with identical systems: it automatically (anti-)symmetrizes over all systems of the same type.
To show that the ambiguity is innocuous, it may be helpful to consider a simple example in which the Euclidean space has just one dimension. Let $S$ have occupation numbers $F_1 = 2$, $B_1 = 1$ and $S'$ have $F'_1 = 1$ and $B'_1 = 1$, and let the prepared state of $S$ be $\psi_1(x_1, x_2)\phi_1(x_3)$ and that of $S'$ be $\psi_2(x_4)\phi_2(x_5)$. The wave function $\psi_1(x_1, x_2) = -\psi_1(x_2, x_1)$ is antisymmetric in its arguments and the normalization is
$$\int dx_1 dx_2\, \psi_1^*(x_1, x_2)\psi_1(x_1, x_2) = \int dx_3\, \phi_1^*(x_3)\phi_1(x_3) = \int dx_4\, \psi_2^*(x_4)\psi_2(x_4) = \int dx_5\, \phi_2^*(x_5)\phi_2(x_5) = 1 .$$
The state of $\bar S$ then is
$$\bar\psi(x_1, x_2, x_3, x_4, x_5) = \nu \left[ \psi_1(x_1, x_2)\psi_2(x_4) + \psi_1(x_4, x_1)\psi_2(x_2) + \psi_1(x_2, x_4)\psi_2(x_1) \right] \times \left[ \phi_1(x_3)\phi_2(x_5) + \phi_1(x_5)\phi_2(x_3) \right] , \tag{61}$$
where $\nu$ is a normalization factor. Observe that the kernel
$$\bar\psi(x_1, x_2, x_3, x_4, x_5)\, \bar\psi^*(x'_1, x'_2, x'_3, x'_4, x'_5)$$
is antisymmetric in the three variables $x_1, x_2, x_4$ and independently so in the further three variables $x'_1, x'_2, x'_4$, and similarly symmetric in the two variables $x_3, x_5$ and independently so in the further two variables $x'_3, x'_5$.
Kernels of observables have less symmetry. For example, let $A(x_1, x_2, x_3; x'_1, x'_2, x'_3)$ be the kernel of an operator for $S$. It must be an operator on the Hilbert space $H_a^2 \otimes H$ of $S$. To satisfy this requirement, it is sufficient that
$$A(x_1, x_2, x_3; x'_1, x'_2, x'_3) = A(x_2, x_1, x_3; x'_2, x'_1, x'_3) ,$$
as a simple calculation easily shows. The extension of $A$ to $\bar S$ is
$$\begin{aligned} \bar A(x_1, x_2, x_3, x_4, x_5; x'_1, x'_2, x'_3, x'_4, x'_5) ={}& A(x_1, x_2, x_3; x'_1, x'_2, x'_3)\, \delta(x_4, x'_4)\delta(x_5, x'_5) + A(x_2, x_4, x_3; x'_2, x'_4, x'_3)\, \delta(x_1, x'_1)\delta(x_5, x'_5) \\ &+ A(x_4, x_1, x_3; x'_4, x'_1, x'_3)\, \delta(x_2, x'_2)\delta(x_5, x'_5) + A(x_1, x_2, x_5; x'_1, x'_2, x'_5)\, \delta(x_4, x'_4)\delta(x_3, x'_3) \\ &+ A(x_2, x_4, x_5; x'_2, x'_4, x'_5)\, \delta(x_1, x'_1)\delta(x_3, x'_3) + A(x_4, x_1, x_5; x'_4, x'_1, x'_5)\, \delta(x_2, x'_2)\delta(x_3, x'_3) . \end{aligned}$$
The six terms are obtained by exchanging identical particles only between the different subsystems.
Suppose next that state $\psi_1(x_1, x_2)\phi_1(x_3)$ and operator $A(x_1, x_2, x_3; x'_1, x'_2, x'_3)$ are both D-local while state $\psi_2(x_4)\phi_2(x_5)$ is $D'$-local, with $D \cap D' = \emptyset$. In such a case, the calculation of traces simplifies considerably. To see how this comes about, consider $\mathrm{tr}[|\bar\psi\rangle\langle\bar\psi|]$, or
$$\int dx_1 dx_2 dx_3 dx_4 dx_5\, \bar\psi^*(x_1, x_2, x_3, x_4, x_5)\, \bar\psi(x_1, x_2, x_3, x_4, x_5) .$$
Taking the product of two arbitrary terms of (61), e.g.,
$$\int dx_1 dx_2 dx_3 dx_4 dx_5\, \psi_1^*(x_1, x_2)\psi_2^*(x_4)\phi_1^*(x_5)\phi_2^*(x_3)\, \psi_1(x_2, x_4)\psi_2(x_1)\phi_1(x_3)\phi_2(x_5) ,$$
we observe that, e.g., the integral over $x_1$ must vanish because it connects the functions $\psi_1^*(x_1, x_2)$ and $\psi_2(x_1)$, which are non-zero in two different non-overlapping domains $D$ and $D'$ of $x_1$. It is clear that only terms that are obtained by the same permutation of the original variables in both factors, such as
$$\int dx_1 dx_2 dx_3 dx_4 dx_5\, \psi_1^*(x_1, x_2)\psi_2^*(x_4)\phi_1^*(x_5)\phi_2^*(x_3)\, \psi_1(x_1, x_2)\psi_2(x_4)\phi_1(x_5)\phi_2(x_3) = \langle\psi_1|\psi_1\rangle \langle\phi_1|\phi_1\rangle \langle\psi_2|\psi_2\rangle \langle\phi_2|\phi_2\rangle = 1 ,$$
can give a non-zero result. Hence,
$$\int dx_1 dx_2 dx_3 dx_4 dx_5\, \bar\psi^*(x_1, x_2, x_3, x_4, x_5)\, \bar\psi(x_1, x_2, x_3, x_4, x_5) = 6 |\nu|^2 ,$$
and normalization requires $\nu = 1/\sqrt{6}$. The same observation holds for traces containing a D-local observable. For example,
$$\mathrm{tr}[\bar A\, |\bar\psi\rangle\langle\bar\psi|]$$
contains $6^3$ terms but only six survive, namely those in which the same permutation of the variables $x_1, x_2, x_3, x_4, x_5$ meets in $\bar A$ and $|\bar\psi\rangle\langle\bar\psi|$ and the same permutation of the variables $x'_1, x'_2, x'_3, x'_4, x'_5$ meets in $\bar A$ and $|\bar\psi\rangle\langle\bar\psi|$. For instance,
$$\begin{aligned} &\int dx_1 dx_2 dx_3 dx_4 dx_5\, dx'_1 dx'_2 dx'_3 dx'_4 dx'_5\, A(x_2, x_4, x_5; x'_2, x'_4, x'_5) \\ &\qquad \times \delta(x_1, x'_1)\delta(x_3, x'_3)\, \psi_1^*(x_2, x_4)\psi_2^*(x_1)\phi_1^*(x_5)\phi_2^*(x_3)\, \psi_1(x'_2, x'_4)\psi_2(x'_1)\phi_1(x'_5)\phi_2(x'_3) \\ &\quad = \int dx_2 dx_4 dx_5\, dx'_2 dx'_4 dx'_5\, A(x_2, x_4, x_5; x'_2, x'_4, x'_5)\, \psi_1^*(x_2, x_4)\phi_1^*(x_5)\, \psi_1(x'_2, x'_4)\phi_1(x'_5) . \end{aligned}$$
Hence,
$$\mathrm{tr}[\bar A \bar T] = \mathrm{tr}[A T] ,$$
where
$$\bar T = |\bar\psi\rangle\langle\bar\psi|$$
and
$$T = |\psi_1 \phi_1\rangle\langle\psi_1 \phi_1| .$$
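The five-particle example can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the line is replaced by a hypothetical lattice of four sites, $D$ and $D'$ are taken as the sites {0, 1} and {2, 3}, all functions are real, and the D-local kernel `A` is a random symmetric array restricted to $D$. The helper `apply` and all variable names are introduced here and are not part of the original text.

```python
import numpy as np

n = 4                                   # lattice sites 0..3 (toy stand-in for the line)
in_D = np.array([1.0, 1.0, 0.0, 0.0])   # indicator of D  = {0, 1}
in_Dp = 1.0 - in_D                      # indicator of D' = {2, 3}
rng = np.random.default_rng(0)

def normalized(v):
    return v / np.linalg.norm(v)

# Prepared states: psi1 * phi1 is D-local, psi2 * phi2 is D'-local.
g = rng.normal(size=(n, n)) * np.outer(in_D, in_D)
psi1 = normalized(g - g.T)                      # antisymmetric two-fermion function
phi1 = normalized(rng.normal(size=n) * in_D)
psi2 = normalized(rng.normal(size=n) * in_Dp)
phi2 = normalized(rng.normal(size=n) * in_Dp)

# State (61) without the factor nu; array axes are (x1, x2, x3, x4, x5).
F = (np.einsum('ab,d->abd', psi1, psi2)
     + np.einsum('da,b->abd', psi1, psi2)
     + np.einsum('bd,a->abd', psi1, psi2))
G = np.outer(phi1, phi2) + np.outer(phi2, phi1)
Psi = np.einsum('abd,ce->abcde', F, G)
norm2 = float(np.sum(Psi ** 2))                 # squared norm before normalization
Psi_n = Psi / np.sqrt(norm2)

# Random real symmetric 3-particle kernel, symmetric under exchange of the
# two fermion arguments and restricted to D in every variable (D-local).
R = rng.normal(size=(n ** 3, n ** 3))
M = (R + R.T).reshape((n,) * 6)
M = 0.5 * (M + M.transpose(1, 0, 2, 4, 3, 5))
mask = np.einsum('i,j,k->ijk', in_D, in_D, in_D)
A = M * mask[:, :, :, None, None, None] * mask[None, None, None, :, :, :]

def apply(A, psi, axes):
    """Apply the 3-particle kernel A to the listed axes of psi, identity elsewhere."""
    out = np.tensordot(A, psi, axes=([3, 4, 5], list(axes)))
    return np.moveaxis(out, [0, 1, 2], list(axes))

# The six terms of the extension, as zero-based positions of the arguments of A.
terms = [(0, 1, 2), (1, 3, 2), (3, 0, 2), (0, 1, 4), (1, 3, 4), (3, 0, 4)]
lhs = float(np.sum(Psi_n * sum(apply(A, Psi_n, ax) for ax in terms)))  # tr[Abar Tbar]

chi = np.einsum('ab,c->abc', psi1, phi1)        # state of S alone
rhs = float(np.sum(chi * np.tensordot(A, chi, axes=([3, 4, 5], [0, 1, 2]))))  # tr[A T]

print(norm2, lhs, rhs)
```

In this toy setting the squared norm of the unnormalized state comes out as 6, so that $\nu = 1/\sqrt{6}$, and the two traces agree to rounding error. This is a check of one example, not a proof of the general statement.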
We can state the following general key property of composition of non-heterogeneous systems:
Theorem 12 Let systems $S$ and $S'$ be separated. Then, system $S + S'$ is prepared in state $\bar T = J(T, T')$ with separation status $D \cup D'$. Let further $A$ be a D-local s.a. operator for $S$ and $\bar A$ its extension to $S + S'$. Then,
$$\mathrm{tr}[\bar A \bar T] = \mathrm{tr}[A T] .$$
In fact, this “Theorem” is only a conjecture because we have not proved it for the general case stated in it.
Thus, the ambiguity of preparation has no observable consequences. The resulting methods that use tensor products instead of full (anti-)symmetrized tensor products are in agreement with the common practice in quantum mechanics. In fact, they make quantum mechanics viable because the state of the whole environment is never known. For example, in the theory of the experiment described in Section 1.0.2., the state is prepared as a state of an individual electron, and its entanglement with all other electrons, which exist, in fact, everywhere in huge amounts, is serenely ignored. Due to Theorem 12, such a method cannot lead to any problems.

3.2.5. Change of separation status

In classical mechanics, the possible states of system $S$ are points of the phase space $\Gamma$ and possible observables are real functions on $\Gamma$. Clearly, all such observables have definite values on $S$ in a fixed state independently of external circumstances. $\Gamma$ is uniquely associated with the system alone and forms the basis of its kinematic description. Alternatively, we can always consider $S$ as a subsystem of a larger system $\bar S$ with a bigger phase space $\bar\Gamma$. In Newtonian mechanics, $\Gamma$ is then a subspace of $\bar\Gamma$ and observables of $S$ can be extended to $\bar S$ by defining them to vanish outside of $\Gamma$. Hence, there is an ambiguity in the choice of the space of states in Newtonian mechanics analogous to that in quantum mechanics. However, no additional conditions, such as suitable separation statuses, are needed there. Thus, the quantum theory of observables is much more complicated than the Newtonian one: not only can their values not be ascribed to microsystem $S$ alone, but some of them are not even registrable in principle due to the environment of $S$.
We assume that the quantum kinematics of a microsystem is defined mathematically by possible states represented by all positive normalized (trace one) operators, and possible observables represented by some POV measures, on the Hilbert space associated with the system. Then the transition from state $T \otimes T'$ to $J(T, T')$, as it occurs in Rule 18, is a change of kinematic description.
Let us study this transition in more detail. We observe that $P_{\tau\tau'} : H_\tau \otimes H_{\tau'} \to H_{\tau\tau'}$ is a linear but in general non-invertible and non-unitary operator and that the normalization is even a non-linear operation on the two states. We can, however, show that the maps can be inverted in a special case of separation statuses $D$ and $D'$ of $T$ and $T'$.
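That the projection cannot be invertible in general can already be seen by counting dimensions. The following sketch assumes a hypothetical finite number `d` of single-particle states per particle type (the spaces in the text are infinite-dimensional) and uses the standard dimensions of antisymmetric and symmetric subspaces. The function name `dim_sector` is introduced here for illustration.

```python
from math import comb

def dim_sector(d, fermions, bosons):
    """Dimension of (x)_k H_a^{F_k} (x) (x)_l H_s^{B_l} over d single-particle states."""
    dim = 1
    for F in fermions:
        dim *= comb(d, F)            # antisymmetric F-particle subspace
    for B in bosons:
        dim *= comb(d + B - 1, B)    # symmetric B-particle subspace
    return dim

d = 4
F, B = (2,), (1,)                    # occupation numbers of S
Fp, Bp = (1,), (1,)                  # occupation numbers of S'
Fbar = tuple(a + b for a, b in zip(F, Fp))
Bbar = tuple(a + b for a, b in zip(B, Bp))

dim_product = dim_sector(d, F, B) * dim_sector(d, Fp, Bp)   # dimension of H_tau (x) H_tau'
dim_composite = dim_sector(d, Fbar, Bbar)                   # dimension of H_{tau tau'}
print(dim_product, dim_composite)
```

With these toy numbers the product space has dimension 384 and the (anti-)symmetrized space has dimension 40, so a linear map from the former into the latter must have a non-trivial kernel. A factor with occupation number zero contributes dimension one, in line with the convention of leaving such factors out.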
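Let system $S$ consist of $N$ and $S'$ of $N'$ particles. Consider first vector states $\phi$ of $S$ and $\phi'$ of $S'$. Let
$$\Phi_{as} = P_{\tau\tau'}(\phi \otimes \phi') , \quad \Phi_{asn} = J(\phi, \phi') = \frac{P_{\tau\tau'}(\phi \otimes \phi')}{\sqrt{\langle P_{\tau\tau'}(\phi \otimes \phi') | P_{\tau\tau'}(\phi \otimes \phi') \rangle}} .$$
If $S$ and $S'$ are separated ($D \cap D' = \emptyset$), then $\phi$ and $\phi'$ satisfy:
$$\int d^3x_i\, f(x_i)\, \phi(x_1, \ldots, x_N) = 0$$
for any $i = 1, \ldots, N$ and for any test function $f$ with $\mathrm{supp}\, f \subset D'$, and
$$\int d^3x_i\, f(x_i)\, \phi'(x_1, \ldots, x_{N'}) = 0$$
for any $i = 1, \ldots, N'$ and for any test function $f$ with $\mathrm{supp}\, f \subset D$.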
Let $f'$ be a test function such that $f' \in H_{\tau'}$ with $\mathrm{supp}\, f' \subset (D' \times)^{N'}$, where $(D' \times)^{N'}$ is an abbreviation for the Cartesian product of $N'$ factors $D'$. Let us define the map $R[f', D'] : H_{\tau\tau'} \to H_\tau$ by
$$(R[f', D'] \Phi_{as})(x_1, \ldots, x_N) = \int d^3x_{N+1} \cdots d^3x_{N+N'}\, f'(x_{N+1}, \ldots, x_{N+N'})\, \Phi_{as}(x_1, \ldots, x_N, x_{N+1}, \ldots, x_{N+N'}) ,$$
and similarly, for a test function $f \in H_\tau$ with $\mathrm{supp}\, f \subset (D \times)^{N}$, the map $R[f, D] : H_{\tau\tau'} \to H_{\tau'}$ by
$$(R[f, D] \Phi_{as})(x_{N+1}, \ldots, x_{N+N'}) = \int d^3x_1 \cdots d^3x_N\, f(x_1, \ldots, x_N)\, \Phi_{as}(x_1, \ldots, x_N, x_{N+1}, \ldots, x_{N+N'}) .$$
Then, we obtain easily:
$$R[f', D'] \Phi_{as} = \nu_{f'}\, \phi(x_1, \ldots, x_N) ,$$
where
$$\nu_{f'} = \nu_{\tau\tau'} \int d^3x_{N+1} \cdots d^3x_{N+N'}\, f'(x_{N+1}, \ldots, x_{N+N'})\, \phi'(x_{N+1}, \ldots, x_{N+N'}) ,$$
and $\nu_{\tau\tau'}$ is the normalization factor defined by $P_{\tau\tau'}$. $\nu_{f'}$ is non-zero for at least some $f'$. Similarly,
$$R[f, D] \Phi_{as} = \nu_{f}\, \phi'(x_{N+1}, \ldots, x_{N+N'}) ,$$
where
$$\nu_{f} = \nu_{\tau\tau'} \int d^3x_1 \cdots d^3x_N\, f(x_1, \ldots, x_N)\, \phi(x_1, \ldots, x_N) .$$
Thus, we obtain both functions $\phi(x_1, \ldots, x_N)$ and $\phi'(x_{N+1}, \ldots, x_{N+N'})$ up to normalization. As the functions are normalized, they can be reconstructed. Analogous steps work for $\Phi_{asn}$.
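The reconstruction can be tried out in the smallest case $N = N' = 1$. The sketch below assumes two fermions of the same type on a hypothetical lattice of six sites, with $D$ the first three and $D'$ the last three sites, and real-valued functions. In this case the projection is $(1 - \mathrm{swap})/2$, so the factor $\nu_{\tau\tau'}$ equals 1/2. All names are introduced here for illustration.

```python
import numpy as np

n = 6
in_D = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])   # D  = sites 0..2
in_Dp = 1.0 - in_D                                 # D' = sites 3..5
rng = np.random.default_rng(2)

def normalized(v):
    return v / np.linalg.norm(v)

phi = normalized(rng.normal(size=n) * in_D)     # state of S, localized in D
phip = normalized(rng.normal(size=n) * in_Dp)   # state of S', localized in D'

# Phi_as(x1, x2) = P (phi (x) phi') for two fermions of the same type.
Phi_as = 0.5 * (np.outer(phi, phip) - np.outer(phip, phi))

fp = rng.normal(size=n) * in_Dp   # test function f' supported in D'
f = rng.normal(size=n) * in_D     # test function f supported in D

R_fp = Phi_as @ fp                # (R[f', D'] Phi_as)(x1) = sum over x2 of f'(x2) Phi_as(x1, x2)
R_f = f @ Phi_as                  # (R[f, D] Phi_as)(x2)  = sum over x1 of f(x1) Phi_as(x1, x2)

nu_fp = 0.5 * (fp @ phip)
nu_f = 0.5 * (f @ phi)

phi_rec = normalized(R_fp)        # phi recovered up to an overall sign
print(abs(phi_rec @ phi))
```

The cross terms drop out because `fp @ phi` and `f @ phip` vanish for disjoint supports, which is exactly the separation condition used in the text.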
For the generalization of these ideas to state operators, we shall need the adjoints of operators $R[f', D']$ and $R[f, D]$. The definition of $R[f', D']^\dagger : H_\tau \to H_{\tau\tau'}$ is
$$(R[f', D']^\dagger \phi, \Phi) = (\phi, R[f', D'] \Phi)$$
for all $\phi \in H_\tau$ and $\Phi \in H_{\tau\tau'}$. A simple calculation yields
$$R[f', D']^\dagger \phi = P_{\tau\tau'}(\phi \otimes f'^*) .$$
Similarly,
$$R[f, D]^\dagger \phi' = P_{\tau\tau'}(f^* \otimes \phi') .$$
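The calculation behind the first adjoint formula can be spelled out as follows. This is a sketch that assumes the scalar product is antilinear in its first argument and uses only that $P_{\tau\tau'}$ is self-adjoint and that $P_{\tau\tau'}\Phi = \Phi$ for $\Phi \in H_{\tau\tau'}$; here $x$ stands for $(x_1, \ldots, x_N)$ and $x'$ for $(x_{N+1}, \ldots, x_{N+N'})$.

```latex
\begin{aligned}
(\phi, R[f', D']\,\Phi)
  &= \int d^{3N}x\; \phi^*(x) \int d^{3N'}x'\; f'(x')\, \Phi(x, x') \\
  &= (\phi \otimes f'^*, \Phi)
   = (\phi \otimes f'^*, P_{\tau\tau'} \Phi)
   = (P_{\tau\tau'}(\phi \otimes f'^*), \Phi) .
\end{aligned}
```

Comparing with the defining relation of the adjoint gives $R[f', D']^\dagger \phi = P_{\tau\tau'}(\phi \otimes f'^*)$; the second formula follows in the same way with the roles of the two factors exchanged.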
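The map $T \otimes T' \mapsto P_{\tau\tau'} (T \otimes T') P_{\tau\tau'}$ is linear in both $T$ and $T'$ and its result is an operator on $H_\tau \otimes H_{\tau'}$ that leaves $H_{\tau\tau'}$ invariant. The operator $P_{\tau\tau'} (T \otimes T') P_{\tau\tau'} : H_{\tau\tau'} \to H_{\tau\tau'}$ is self-adjoint and positive if $T$ and $T'$ are state operators, but it is not normalized. Let $\{\psi_n\}$ be a basis of $H_\tau$ and $\{\psi'_\alpha\}$ that of $H_{\tau'}$. We can write
$$T = \sum_{mn} T_{mn} |\psi_m\rangle\langle\psi_n| , \quad T' = \sum_{\alpha\beta} T'_{\alpha\beta} |\psi'_\alpha\rangle\langle\psi'_\beta| .$$
Then
$$P_{\tau\tau'} (T \otimes T') P_{\tau\tau'} = \sum_{mn} \sum_{\alpha\beta} T_{mn} T'_{\alpha\beta}\, |P_{\tau\tau'}(\psi_m \otimes \psi'_\alpha)\rangle\langle P_{\tau\tau'}(\psi_n \otimes \psi'_\beta)| .$$
Now, the above proof that vector states $\phi$ and $\phi'$ can be reconstructed from $J(\phi, \phi')$ can be easily extended to general state operators $T$ and $T'$ by expanding the state operators into the bases and acting on them by $R$'s from the left and $R^\dagger$'s from the right.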
Moreover, for separated systems, the “individual” observables from $\mathbf{A}[S]_D$ and $\mathbf{A}[S']_{D'}$ can be recovered from operators on $H_{\tau\tau'}$ that are extensions of operators either of $\mathbf{A}[S]_D$ or of $\mathbf{A}[S']_{D'}$. Let us show it for a simple example.
Example Let $S$ be a fermion particle and $S'$ a composite of one fermion of the same type as $S$ and some particle of a different type. Let $\phi(x_1)$ be an arbitrary element of $H_\tau$ and $\phi'(x_2, x_3)$ that of $H_{\tau'}$, $x_2$ being the coordinate of the fermion. Then
$$\Psi(x_1, x_2, x_3) = P_{\tau\tau'} \left( \phi(x_1) \phi'(x_2, x_3) \right) = \frac{1}{2} \left( \phi(x_1) \phi'(x_2, x_3) - \phi(x_2) \phi'(x_1, x_3) \right) .$$
Let $a \in \mathbf{A}[S]_D$. Then its extension $A$ is an operator on $H_{\tau\tau'}$ defined by its kernel
$$a(x_1; x'_1)\, \delta(x_2 - x'_2)\, \delta(x_3 - x'_3) + a(x_2; x'_2)\, \delta(x_1 - x'_1)\, \delta(x_3 - x'_3)$$
so that
$$(A \Psi)(x_1, x_2, x_3) = \frac{1}{2} \left( (a\phi)(x_1)\, \phi'(x_2, x_3) - (a\phi)(x_2)\, \phi'(x_1, x_3) \right) .$$
Then,
$$R[f', D'] (A \Psi) = \nu_{f'}\, (a\phi)(x_1) ,$$
where
$$\nu_{f'} = \frac{1}{2} \int d^3x_2\, d^3x_3\, f'(x_2, x_3)\, \phi'(x_2, x_3) .$$
But $\phi(x_1)$, $\phi'(x_2, x_3)$ and $f'(x_2, x_3)$ are known; hence, as $\phi$ is arbitrary, $a$ is well-defined.
To summarise: for separated systems $S$ and $S'$, there are two equivalent descriptions: the standard QM description of $S + S'$ on the Hilbert space $H_{\tau\tau'}$ and the untangled QM description on $H_\tau \otimes H_{\tau'}$ explained above.
So far, the considerations apply to situations at a fixed instant of time. The new aspect that time evolution can introduce is that the separation status of a system in a state can change in time. Let us define mathematically what this means.
First, we come to the notion of formal evolution.
Definition 27 Let system $S$ be initially ($t = t_1$) prepared in state $T$, another quantum system $S'$ in state $T'$, and let them be separated at $t_1$. Let the composite have a time-independent Hamiltonian defining a unitary group $U(t - t_1)$ of evolution operators on $H_{\tau\tau'}$. Then, the standard quantum mechanical evolution of $S + S'$,
$$\bar T(t) = U(t - t_1)\, J(T, T')\, U(t - t_1)^\dagger ,$$
is called the formal evolution of the two interacting systems $S$ and $S'$.
The idea is analogous to the well-known time-dependent Hartree–Fock method in the theory of nuclear fusion [75]. Thus, the formal evolution uses the standard QM description. It is called “formal” because the character of the separation statuses can change during the evolution and it is not clear whether standard quantum mechanics is then still applicable. Indeed, this evolution does not agree with the observation of separation status changes that occur during registrations. However, the formal evolution is our first step in the mathematical analysis of separation status changes. With its help, we can decide whether a change of separation status has taken place in a given theoretical model. Let us study an example in some detail.
Let $S$ and $S'$ be two quantum systems, $S$ containing $N$ particles and $S'$ containing $N'$ particles. Let the systems be prepared, at time $t_1$, in states $T$ and $T'$ with non-trivial separation statuses $D_1$ and $D'$, respectively, and $D_1 \cap D' = \emptyset$. Thus, $S$ and $S'$ are separated at $t_1$. Let the formal evolution of the composite $S + S'$ for the initial state $\bar T(t_1) = J(T, T')$ be described by its kernel in Q-representation:
$$\bar T(t)(x_1, \ldots, x_N, x_{N+1}, \ldots, x_{N+N'}; x'_1, \ldots, x'_N, x'_{N+1}, \ldots, x'_{N+N'}) .$$
  • Suppose that, for some $t_2 > t_1$, $\mathrm{supp}\, \bar T(t_2) = (D' \times)^{2(N+N')}$. (This can easily be generalized to a more realistic condition, e.g., $\int_{(D' \times)^{N+N'}} d^3x_1 \cdots d^3x_{N+N'}\, \bar T(t_2)(x_1, \ldots, x_{N+N'}; x_1, \ldots, x_{N+N'}) \approx 1$.) Then we can say: at time $t_2$, the separation status of $S$ is $\emptyset$, that of $S'$ is $D'$ and that of the composite $S + S'$ is also $D'$, or that $S$ is swallowed by $S'$.
  • Suppose that, for some $t_3 > t_2$, there is an open set $D_3 \subset \mathbb{R}^3$, $D_3 \cap D' = \emptyset$, such that the kernel $T_J(t_3)$ has the properties:
    (a) For any test function $f' \in H_{\tau'}$ with $\mathrm{supp}\, f' = (D' \times)^{N'}$, $R[f', D']\, T_J(t_3)\, R[f', D']^\dagger \neq 0$, and $\nu R[f', D']\, \bar T(t_3)\, R[f', D']^\dagger$ is a state operator of $S$ independent of $f'$, where $\nu$ is the normalization factor.
    (b) For any test function $f \in H_{\tau}$ with $\mathrm{supp}\, f = (D_3 \times)^{N}$, $R[f, D_3]\, T_J(t_3)\, R[f, D_3]^\dagger \neq 0$, and $\nu R[f, D_3]\, \bar T(t_3)\, R[f, D_3]^\dagger$ is a state operator of $S'$ independent of $f$, where $\nu$ is the normalization factor.
    (c) For any test function $g \in H_{\tau}$ with $\mathrm{supp}\, g = (D_3 \times)^{N}$, we have $R[f', D']\, T_J(t_3)\, R[f', D']^\dagger\, |g\rangle = 0$.
    (d) For any test function $g' \in H_{\tau'}$ with $\mathrm{supp}\, g' = (D' \times)^{N'}$, we have $R[f, D_3]\, T_J(t_3)\, R[f, D_3]^\dagger\, |g'\rangle = 0$.
    Then we can say: the systems become separated again at time $t_3 > t_2$, system $S$ being in state $\nu R[f', D']\, T_J(t_3)\, R[f', D']^\dagger$ with separation status $D_3$ and system $S'$ in state $\nu R[f, D_3]\, T_J(t_3)\, R[f, D_3]^\dagger$ with separation status $D'$.
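The "more realistic condition" of the first item, namely that the diagonal of the kernel integrated over $(D' \times)^{N+N'}$ is close to one, is easy to evaluate in a discretized model. The following is a minimal sketch under stated assumptions: two fermions of the same type on a hypothetical lattice of four sites, $D'$ taken as sites 2 and 3, and composite states given as matrices in the product basis. The function names `prob_all_in` and `two_fermion_state` are introduced here for illustration.

```python
import numpy as np

def prob_all_in(region, T_bar, n_particles):
    """Sum of the diagonal kernel T_bar(x; x) over all x in region^n_particles."""
    mask = region
    for _ in range(n_particles - 1):
        mask = np.kron(mask, region)
    return float(np.real(np.sum(np.diag(T_bar) * mask)))

def two_fermion_state(a, b, n):
    """Projector on the antisymmetrized state with fermions at lattice sites a and b."""
    e = np.eye(n)
    v = np.kron(e[a], e[b]) - np.kron(e[b], e[a])
    v /= np.linalg.norm(v)
    return np.outer(v, v)

n = 4
in_Dp = np.array([0.0, 0.0, 1.0, 1.0])      # indicator of D' = sites 2 and 3

# One fermion at site 0 (outside D'), one at site 2: the systems are separated.
p_separated = prob_all_in(in_Dp, two_fermion_state(0, 2, n), 2)
# Both fermions inside D': the first one counts as swallowed.
p_swallowed = prob_all_in(in_Dp, two_fermion_state(2, 3, n), 2)
print(p_separated, p_swallowed)
```

In a model of the formal evolution one would evaluate such a quantity along $\bar T(t)$ and read off the times at which it approaches one; the two static states above only illustrate the two extreme values.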
Thus, we judge the separation statuses of the two systems by studying the supports of the kernels of the Q-representation of their state operators during the formal (i.e., ordinary unitary) evolution of their composite. The change from $D_1$ at $t_1$ to $\emptyset$ at $t_2$ and to $D_3$ at $t_3$ is a complicated function of the evolution of the whole composite system. As for the observables, their unitary evolution is, in fact, irrelevant to what can be registered. Consider, e.g., the position of $S$. The standard position operator $Q$ can never serve as the observable “position of $S$”. The $D_1$-localization of $Q$ is registrable and its meaning is “position of $S$” at $t_1$. But it is not “position of $S$” at $t_2$ or $t_3$ because at $t_2$, $S$ does not possess any observable of its own, including position. At $t_3$, “position of $S$” is the $D_3$-localization of $Q$. At any time, one can construct the extensions of the corresponding localizations to the whole composite, but the registrable meaning of these extensions changes with time. Thus, the observables change with time even if we are working in the Schrödinger representation.
Although we have based the time process on a unitary evolution (the formal evolution), the time evolution of genuinely registrable properties of $S$ does not look like a unitary evolution. And, although we can find the separation statuses of $S$ and $S'$ by studying the formal evolution of $S + S'$, we cannot claim that the formal evolution gives the physical state of the composite. The question even seems natural whether the formal evolution ought to be further corrected in the case that it leads to separation-status changes.
What has been said up to now shows that standard quantum mechanics is incomplete in the following sense:
  • It accepts and knows only two separation statuses:
    (a) that of isolated systems, $D = \mathbb{R}^3$, with the standard operators (position, momentum, energy, spin, etc.) as observables, and
    (b) that of a member of a system of identical particles, $D = \emptyset$, with no observables of its own.
  • It disregards the fact that separation status can change during time evolution. In particular, it does change during preparations and registrations, and that makes the measurement a process physically different from most other processes considered by quantum mechanics. The question whether the unitary evolution law provides an adequate description to such changes naturally arises.
This suggests that quantum mechanics can be supplemented by a theory of general separation status and by new rules that govern processes in which separation status changes. The new rules must not contradict the rest of quantum mechanics and ought to agree with, and to explain, observational facts.

3.3. State reduction

The standard quantum theory of indistinguishable particles as explained in the foregoing sections leads to an important but as yet insufficiently studied or even ignored phenomenon: A prepared state of a quantum system will often be mangled and degraded during an interaction with a large system such as a macroscopic body. We consider this to be an objective change of the state similarly as worn boots are objectively different from new boots. Let us give a simple example.
In many optical experiments, such as [76], polarisers, such as Glan–Thompson ones, are employed. A polariser is a macroscopic body that decomposes the coming light into two orthogonal-polarisation parts. One part disappears inside an absorber and the other is left through practically unchanged. Similarly, in most quantum experiments, one or more screens are used. A screen is a macroscopic body that decomposes the incoming state into one part that disappears inside the body and the other that evolves further.
Disappearance of a quantum system S in a macroscopic body B is the following process. The body is assumed to be a perfect absorber. First, S enters B and ditches most of its kinetic energy somewhere inside B . Second, the energy passed to B is dissipated and distributed homogeneously through B in a process aiming at thermodynamic equilibrium. In this way, S ceases to be separated from the other systems of the same type within B by its energy. Moreover, a photon might be annihilated and a massive particle becomes entangled with all other particles of its type inside B and its separation status by position becomes trivial. In this way, S ceases to exist as an individual object and no more registrations can be done on it. This can be viewed as a complete or partial loss of the system because it becomes undistinguishable from all subsystems of B that are of the same type as S .
The mathematical description of the initial and the final state of the composite $S + B$ can be easily given. Let $S$ be a particle of type $\tau$ and $\psi(x)$ the wave function of its initial state, prepared with a separation status $D$. Let screen $B$ be a macroscopic quantum system of type $\tau'$ with separation status $D'$ having a sufficiently large common boundary with $D$. Let
$$\psi(x) = c_{\mathrm{thr}}\, \psi_{\mathrm{thr}}(x) + c_{\mathrm{sw}}\, \psi_{\mathrm{sw}}(x) \tag{65}$$
be the decomposition of the initial state, where $\psi_{\mathrm{thr}}(x)$ is a normalized wave function of the part that will be left through and $\psi_{\mathrm{sw}}(x)$ that of the part that will be swallowed by $B$. This decomposition is determined by the nature of $B$: for a polariser, these are the two orthogonal polarisation states, and for a simple screen, these can be calculated from the geometry of $B$ and the incoming beam, as is usually done, e.g., in accounts of a double-slit experiment.
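The initial state of $B$ is a classical state, which is a high-entropy one (see Section 4). It is, therefore, described by a state operator $T'$. The initial state of the composite is then
$$\bar T_{\mathrm{i}} = |\psi\rangle\langle\psi| \otimes T' .$$
Now, the initial state for the formal evolution of the composite is
$$\bar T_{\mathrm{fei}} = \nu\, P_{\tau\tau'} (|\psi\rangle\langle\psi| \otimes T') P_{\tau\tau'} ,$$
where $\nu = 1/\mathrm{tr}[P_{\tau\tau'} (|\psi\rangle\langle\psi| \otimes T') P_{\tau\tau'}]$ is the normalization factor and $P_{\tau\tau'}$ is defined by Equation (58). Using decomposition (65), we can write
$$\begin{aligned} \bar T_{\mathrm{fei}} = \nu \big( & c^*_{\mathrm{thr}} c_{\mathrm{thr}}\, P_{\tau\tau'} (|\psi_{\mathrm{thr}}\rangle\langle\psi_{\mathrm{thr}}| \otimes T') P_{\tau\tau'} + c_{\mathrm{thr}} c^*_{\mathrm{sw}}\, P_{\tau\tau'} (|\psi_{\mathrm{thr}}\rangle\langle\psi_{\mathrm{sw}}| \otimes T') P_{\tau\tau'} \\ & + c_{\mathrm{sw}} c^*_{\mathrm{thr}}\, P_{\tau\tau'} (|\psi_{\mathrm{sw}}\rangle\langle\psi_{\mathrm{thr}}| \otimes T') P_{\tau\tau'} + c^*_{\mathrm{sw}} c_{\mathrm{sw}}\, P_{\tau\tau'} (|\psi_{\mathrm{sw}}\rangle\langle\psi_{\mathrm{sw}}| \otimes T') P_{\tau\tau'} \big) . \end{aligned}$$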
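Let $U$ be the unitary operator that describes the formal evolution on the Hilbert space $H_{\tau\tau'}$ of the composite. After the process is finished, we obtain
$$\begin{aligned} \bar T_{\mathrm{fef}} = \nu \big( & c^*_{\mathrm{thr}} c_{\mathrm{thr}}\, P_{\tau\tau'} (|\psi'_{\mathrm{thr}}\rangle\langle\psi'_{\mathrm{thr}}| \otimes T'_{\mathrm{thr}}) P_{\tau\tau'} + c_{\mathrm{thr}} c^*_{\mathrm{sw}}\, U P_{\tau\tau'} (|\psi_{\mathrm{thr}}\rangle\langle\psi_{\mathrm{sw}}| \otimes T') P_{\tau\tau'} U^\dagger \\ & + c_{\mathrm{sw}} c^*_{\mathrm{thr}}\, U P_{\tau\tau'} (|\psi_{\mathrm{sw}}\rangle\langle\psi_{\mathrm{thr}}| \otimes T') P_{\tau\tau'} U^\dagger \big) + c^*_{\mathrm{sw}} c_{\mathrm{sw}}\, \bar T' , \end{aligned} \tag{68}$$
where
$$\bar T' = \nu\, U P_{\tau\tau'} (|\psi_{\mathrm{sw}}\rangle\langle\psi_{\mathrm{sw}}| \otimes T') P_{\tau\tau'} U^\dagger$$
is the end state of the screen with the swallowed part of $S$, and we have assumed that
$$U P_{\tau\tau'} (|\psi_{\mathrm{thr}}\rangle\langle\psi_{\mathrm{thr}}| \otimes T') P_{\tau\tau'} U^\dagger = P_{\tau\tau'} (|\psi'_{\mathrm{thr}}\rangle\langle\psi'_{\mathrm{thr}}| \otimes T'_{\mathrm{thr}}) P_{\tau\tau'} ,$$
where $\psi'_{\mathrm{thr}}$ is the wave function of $S$ with separation status $D_{\mathrm{thr}}$ describing the part that went through, $D_{\mathrm{thr}} \cap D = \emptyset$, $D_{\mathrm{thr}} \cap D' = \emptyset$, and $T'_{\mathrm{thr}}$ is the corresponding state of the screen with separation status $D'$.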
The crucial step now is that the two terms containing products of the left-through and the swallowed parts of $S$ are discarded so that the physical final state of the composite is
$$\bar T_{\mathrm{f}} = p_{\mathrm{thr}}\, |\psi'_{\mathrm{thr}}\rangle\langle\psi'_{\mathrm{thr}}| \otimes T'_{\mathrm{thr}} \; (+)_p \; p_{\mathrm{sw}}\, \bar T' , \tag{69}$$
where
$$p_{\mathrm{thr}} = c^*_{\mathrm{thr}} c_{\mathrm{thr}} , \quad p_{\mathrm{sw}} = c^*_{\mathrm{sw}} c_{\mathrm{sw}} .$$
The change from (68) to (69) is called state reduction. It is not a unitary transformation: the non-diagonal terms in (68) have been erased. The sign “$(+)_p$” suggests that the convex combination is a statistical decomposition (see Section 2.1.2.). Thus, not only have some terms been erased, but the state of the composite has also been changed further.
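The algebraic core of this step, the erasure of the non-diagonal terms, can be illustrated in a deliberately crude toy model. The sketch below assumes that the two macroscopically different end states of the composite can be represented by two orthonormal vectors `e_thr` and `e_sw` in a two-dimensional space; it ignores (anti-)symmetrization and the internal structure of the screen, and the amplitudes are hypothetical.

```python
import numpy as np

c_thr, c_sw = np.sqrt(0.7), 1j * np.sqrt(0.3)    # hypothetical amplitudes
e_thr = np.array([1.0, 0.0])   # composite end state: part of S left through
e_sw = np.array([0.0, 1.0])    # composite end state: S swallowed by the screen

Psi = c_thr * e_thr + c_sw * e_sw
T_fef = np.outer(Psi, Psi.conj())                # end state of the formal evolution

P_thr, P_sw = np.outer(e_thr, e_thr), np.outer(e_sw, e_sw)
T_f = P_thr @ T_fef @ P_thr + P_sw @ T_fef @ P_sw   # cross terms discarded

p_thr, p_sw = abs(c_thr) ** 2, abs(c_sw) ** 2
purity_before = np.trace(T_fef @ T_fef).real
purity_after = np.trace(T_f @ T_f).real
print(purity_before, purity_after)
```

The trace is preserved and the weights are $p_{\mathrm{thr}}$ and $p_{\mathrm{sw}}$, but the purity drops below one. Since a unitary transformation leaves the purity unchanged, this shows in the toy model why the transformation cannot be unitary.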
What part of the state operator is to be erased must be judged both with the help of an assessment of the experimental arrangement resulting in a theoretical model thereof and calculation of the unitary evolution of an initial state, which had to lead to decomposition (65). Thus, Equation (69) is a result of a judicious decision that may but does not necessarily work, because it may but does not necessarily express the reality with sufficient accuracy. It is analogous to the decision of which state has been prepared by a given experimental setup of a preparation apparatus.
In our previous work [19,24,34], we assumed that transformations analogous to that from (65) to (69) result from the application of some general alternative dynamical law and that such a law must be postulated. An extended study of all empirical cases that have come to mind so far has shown that a number of specific details must in each case be taken into account in order to give an adequate theoretical description of what happens. The Schrödinger equation gives a physical evolution only in ideal cases of isolated systems. Under more general conditions, the Schrödinger equation gives only a formal evolution. Then, the results of the Schrödinger evolution must be suitably corrected to express the resulting state degradation. This is the content of the following rule.
Rule 19 Let $S$ be a microscopic quantum system and $A$ a macroscopic one in a classical (high entropy) state. Let there be a process with an interaction between them such that the resulting change in the state of $A$ includes a dissipation of a portion of the state within a macroscopic part of the degrees of freedom of $A$. Then the end state of the formal evolution of $S + A$ must be corrected by discarding all terms that express correlations between macroscopically different end states of the composite, and the resulting convex combination of states is a statistical decomposition.
The state reduction above is formally similar to what is often called the collapse of wave function or the state reduction or the dynamical state reduction. State reduction was declared to be a basic new kind of dynamics that sometimes replaces the unitary dynamics [77] or always corrects a unitary evolution [78,79] but no cause for the state reduction has been given (for an extended discussion see Section 5.1 and Section 5.2). As explained above, the state reduction runs parallel to, and the reason for it is provided by, the unitary evolution together with relevant specific empirical data for each case. Its basic feature is a complete or partial disappearance of the system due to the dissipation and this is the cause of a state degradation, the result of which is a state reduction.
Rule 19 formulates only a few general features and leaves some freedom in the choice of the exact mathematical details, which must be assumed separately for each particular object studied so that the resulting model corresponds well to its observed properties. This is, in fact, similar to the Schrödinger equation, which is restricted only in its overall general features and whose details must also be assumed for each model separately with the aim of yielding a good model of the object.
The described action of the screen on incoming individual system S is not a registration: by itself, it does not deliver the value of any observable of S . Hence, it is either a preparation or a part of a registration. For example, in experiment [76], the position of photons leaving the polariser is measured by a photodiode. Thus, a detector is needed to accomplish a registration. We shall study various models of registration in Section 5 and many examples will show how Rule 19 is applied.
Part II
The models
The treasure of successful models is the primary part of any physical theory. Textbooks of quantum mechanics dedicate most of their text to models of atoms and molecules, to scattering theory of particles on atoms and molecules, to solid bodies, etc. A general method of such constructions has been described in Section 2.3.1. The only part of the textbooks that has to be changed concerns the operators that are used in the construction and referred to by the textbooks as “observables”. This name is not correct because we have seen in Section 3.2.1 that the measurement of all such operators would be disturbed by the environment.
For particles, D-localizations (see Definition 24) of these operators are already genuine observables. For composite systems, such as atoms, the construction of position and momentum observables of their mass centre is analogous, but other observables may lead to more complications. Real experiments must be carefully studied and the corresponding observable must be constructed accordingly. For example, the energy spectrum of the hydrogen atom is usually measured indirectly via the energy of photons scattered off the atoms. The corresponding observable will not be just a D-localization of the Hamiltonian operator. How observables describing indirect registrations are constructed is well known (see, e.g., [21], p. 282). We do not expect any contradictions to our new rules or difficult mathematical problems that would hinder such constructions. Hence, we shall skip the whole menagerie of models of microscopic systems and restrict ourselves only to those models that are immediately important for our main aim: to deal with the problems of classical properties (Section 4) and of quantum measurement (Section 5).
To construct models of classical world will require, in addition to the already described changes of language, some further new ideas, which are specific to the particular objects to be modelled.

4. Quantum models of classical properties

There are many classical aspects of real objects that have been successfully modelled by quantum mechanics, such as electrical conductivity or specific heats. These are typically phenomena that occur in systems with very many degrees of freedom so that statistical methods can be used. The statistical methods were invented before quantum mechanics was born and introduced some elements that could be understood only later by quantum mechanics. For example, the microcanonical and canonical ensembles are, in fact, methods of preparation of thermodynamic systems. Moreover, theoretical results are given in the form of averages and variances. The modern condensed matter theory around room temperature can, therefore, be included into our theory of classical properties without much change. We would just utilize the objectivity of averages and variances in our interpretation of quantum mechanical results.
However, the Galilean invariance of quantum theory leads to separation of the overall motion from all other degrees of freedom. The motion of mass centre and of the total angular momentum with respect to the mass centre comprises only six degrees of freedom that do not seem to allow statistical methods. Exactly this kind of motion is studied by Newtonian mechanics. Thus, the situation is that there are quantum models of classical thermodynamic properties but none of mechanical properties that would be really satisfactory.
Quantum modelling of non-thermodynamic properties of classical systems encounters two main problems. First, a key feature of Newtonian mechanics (and any other classical theory as well) is that each system objectively has a sharp trajectory. Any fuzziness is just due to incomplete knowledge. In particular, the state of a Newtonian system is described by a point of its phase space, and the system is always in a definite state, i.e., it cannot be at two points of the phase space simultaneously (see also the discussion at the end of Section 2.2.2.). Second, the system is robust so that measurements can be done on it without changing its properties. For example, the state of a system can be determined or confirmed by a suitable set of measurements on the system.
Thus, any quantum model of a classical system must satisfy the first two conditions of what Leggett has called Principle of Macroscopic Realism [10]:
  • A macroscopic system that has available to it two or more distinct macroscopic states is at any given time in a definite one of those states.
  • It is possible in principle to determine which of these states the system is in without any effect on the state itself or on the subsequent system dynamics.
In trying to model the sharpness of classical states and trajectories, one may be misled into overestimating the importance of quantum states of minimum uncertainty, that is, of coherent states. However, such states are always extremal states, which can be linearly superposed, and quantum mechanics requires that linear superpositions of available states are also available states. Moreover, measurements of the classical parameters of a coherent state necessarily disturb the values of the parameters.
To solve this problem, one could e.g. assume that some as yet unknown phenomena exist at the macroscopic level that are not compatible with standard quantum mechanics. For example, they may prevent linear superpositions (see, e.g., [10] and the references therein). However, no such phenomena have been observed.
Another strategy is to assume that the macroscopic realism is only apparent in the sense that there are linear superpositions of macroscopic states but the corresponding interference phenomena are difficult or impossible to observe. For example, the quantum decoherence theory [9,80] works only if certain observables concerning both the environment and the quantum system cannot be measured (see the analysis in [12,48]). Another example is the class of theories in which coarse-grained operators are assumed to be measurable but fine-grained ones are not [21,81,82]. A third example is the Coleman–Hepp theory [52,53,54,55,56] and its modifications [46,57,58]: they are based on some particular theorems that hold only for infinite systems (see the analysis in [53]) or for asymptotic regions [58].
However, if we turn from theory to experiment, we may notice that any well-founded scientific observation of classical properties always has a statistical form. A measurement or observation is only viewed as well understood if it is given as an average with a variance. This fact does not by itself contradict the sharp character of the corresponding theory. The usual excuse is that the observation methods are beset with inaccuracy but that improvement of techniques can lead to better and better results approaching the “objective sharp” values arbitrarily closely. In any case, however, the measured classical parameters of real objects are much fuzzier than the minimal quantum uncertainty requires.
Moreover, that popular excuse is clearly incompatible with the assumption that the classical world is only an aspect of a deeper quantum world and that each classical model is nothing but a kind of incomplete description of the underlying quantum system. If we assume such universality of quantum theory, then the statistical character of classical observational results must not only be due to inaccuracy of observational methods but also to genuine uncertainty of quantum origin. This point of view is due to Exner [16], p. 669, and Born [17] and will be adopted here as a starting point of our theory of classical properties.
We can formulate this idea in terms of the Realist Model Approach as follows. The language part of classical theories contains the notions of sharp state and trajectory. These are idealized notions that do not possess any counterpart in the real world, but they are useful for model construction.
In this section, we first formulate some general hypotheses that can be applied to both thermodynamic and mechanical properties, introducing thus a unified theory of classical properties: they turn out to be selected objective properties of high-entropy quantum states of macroscopic systems. Next, we show in detail how these ideas are to be applied to Newtonian mechanics, introducing states called ME packets. Then, we construct a quantum model of a classical rigid body. Finally, we modify the well-known model of a simultaneous measurement of position and momentum of a Gaussian wave packet to that of position and momentum of a ME packet.
Thus, our project to construct quantum models of observed classical systems seems to work nicely. What remains open is the question of what is the origin of all the high-entropy states that are observed in such a great abundance around us.

4.1. Modified correspondence principle

The Born–Exner assumption has quite radical consequences, which are only seldom realized. First, the exactly sharp states and trajectories of classical theories are not objective. They do not exist in reality but are only idealizations. What really exist are fuzzy states and trajectories. The objectivity of fuzzy states of classical models is a difficult point to accept and understand. Let us explain it in more detail.
In quantum mechanics, the basis of objectivity of dynamical properties is the objectivity of the conditions that define preparation procedures. In other words, if a property is uniquely determined by a preparation, then it is an objective property. If we look closely, one hindrance to trying the same idea in classical theories is the custom always to speak about initial data instead of preparations. An initial datum can be and mostly is a sharp state. The question of how an exactly sharp state can come into being is ignored. This in turn seems justified by the hypothesis that sharp states are objective, that is, they just exist by themselves.
To come away from this self-deception, we accept that preparation procedures play the same basic role in classical physics as in quantum physics. Then, the nature and form of necessary preparation procedures must be specified and the corresponding states described. In this way, the Exner–Born idea leads to a rather radical change of interpretation of classical theories and this will enable us to construct quantum models of all classical aspects of real objects.
An obvious starting point of such constructions is that all classical systems are also quantum systems. Let us now make this more precise. Consider a real physical object (so to speak, independent of any theory). If there is an adequate classical model $S_c$ of some aspects of this system, the system is called classical. Adequate means that some properties of $S_c$ approximately represent important properties of the real system. In addition, the same real object must also be understood in terms of a quantum model $S_q$. $S_q$ could be richer than $S_c$ so that classical properties of $S_c$ can be identified with some quantum properties of $S_q$. In particular, these quantum properties ought to be objective. This follows from the fact that all classical properties can be assumed to be objective without any danger of contradictions.
The construction of quantum model $S_q$ consists of the following points. (1) The composition of $S_q$ must be defined. (2) The observables that can be measured on $S_q$ are to be determined. On the one hand, this is a non-trivial problem because there are relatively strong restrictions on what observables of macroscopic systems can be measured (see Section 3.2.2.). On the other hand, as any quantum observable is measurable only by a classical apparatus, the existence of such apparatuses is tacitly assumed from the very beginning. Quantum model $S_q$ will thus always depend on some classical elements. This does not mean that classicality has been smuggled in because, in our approach, classical properties are specific quantum ones. (3) A Hamiltonian operator of the system must be set up. (4) Suitable quantum states must be chosen. Finally, the known classical properties of $S_c$ must be listed and each derived as an objective property of $S_q$ from the four sets of assumptions above. This is a self-consistent framework for a non-trivial problem.
It follows that there must be some at least approximate relation between classical observables of $S_c$ and quantum observables of $S_q$ as well as between the classical states of $S_c$ and the quantum states of $S_q$. The following model assumption on such a relation might be viewed as a version of the Correspondence Principle; let us call it the Modified Correspondence Principle.
Assumption 1
1. The state of classical model $S_c$ in a given classical theory is described by a set of $n$ numbers $\{a_1, \ldots, a_n\}$ that represent values of some classical observables. The set is not uniquely determined. Let us call any such set state coordinates.
2. We assume that state coordinates $\{a_1, \ldots, a_n\}$ can be chosen so that there is a subset $\{\mathsf{a}_1, \ldots, \mathsf{a}_n\}$ of sharp observables of quantum model $S_q$ and a state $\mathsf{T}$ of $S_q$ such that
$$
\mathrm{tr}[\mathsf{T}\,\mathsf{a}_k] = a_k . \tag{70}
$$
3. All such states form a subset of $\mathbf{T}(\mathbf{H}_{S_q})^{+}_{1}$. Some of these states satisfy the condition that all properties of $S_c$ can be (at least approximately) obtained from $S_q$ if it is in such states. They are called classicality states of quantum system $S_q$.
Clearly, Modified Correspondence Principle does not need the assumption that all observables from $\{\mathsf{a}_1, \ldots, \mathsf{a}_n\}$ commute with each other. Then, even if they themselves are not jointly measurable, their fuzzy values can be (see Section 4.7). Further, it does not follow that each classical property is an average of a quantum operator. That would be false. We assume only that the classical state coordinates $a_1, \ldots, a_n$ can be chosen in such a way.
It is important to realize that Modified Correspondence Principle suggests how Principle of Macroscopic Realism is to be understood. For example, macroscopic systems also have extremal states that satisfy Equation (70). These seem to be macroscopic states available to the system. However, extremal states are readily linearly superposed and any quantum registration that ought to find the parameters of a coherent state (a generalized measurement: positive operator valued measure) would strongly change the state (for a general argument, see Ref. [11], p. 32). We assume that the validity of Principle of Macroscopic Realism can be achieved if the words “distinct macroscopic states” are replaced with “distinct classicality states”. Let us try to motivate a proposal of what such classicality states might be.
An interesting subset of classical properties of macroscopic systems is the thermodynamic ones. They are important for us because quantum models of these properties are available. Existing models based on statistical physics need one non-trivial assumption: the states of sufficiently small macroscopic systems that we observe around us are approximately states of maximum entropy. As discussed in Section 1.0.4., entropy is an objective property of quantum systems because it is defined by their preparation. Thus, the validity of thermodynamics depends on the preparation conditions, or the origin, of observed macroscopic systems. The averages and variances that result from the models based on the maximum-entropy assumption agree with observations. In particular, they explain why classical states and properties are relatively sharp. Moreover, high-entropy states are very far from extremal and linear superposition does not make any sense for them.
The physical foundations of thermodynamics are not yet well understood but there are many ideas around about the origin of high-entropy states. Their existence might follow partially from logic (Bayesian approach, [37]) and partially from quantum mechanics (thermodynamic limit, [83], Vol. 4). Some very interesting models of how maximum-entropy quantum states come into being are based on entanglement [84,85,86,87]. However, just in order to construct quantum models of classical properties, high-entropy states can be used as one of the main assumptions without really understanding their origin.
We generalize the statistical methods as follows [18].
Assumption 2 All classicality states are states of high entropy.
Assumption 2 is a heuristic one and it is therefore formulated a little vaguely. It will be made clearer after some examples of its use have been studied in this section. But some brief discussion can be given already now.
Consider first states of macroscopic systems that are at or near absolute zero of temperature. These are approximately or exactly extremal and maximize entropy at the same time. Thus, they are not classicality states but the entropy, though maximal, is not high, either. Second, consider states of macroscopic systems at room temperature that are not at their thermodynamic equilibrium but are close to it. There are many such states, and they and the systems can be described by classical physics to a good approximation. They are not in maximum- but in high-entropy states.

4.2. Maximum entropy assumption in classical mechanics

In this subsection, we follow loosely Ref. [18]. As explained at the start of this section, the basic notions of the language part of Newtonian mechanics are that of a sharp state—a point of the phase space—and of a sharp trajectory—a curve in the phase space of an isolated system. We accept this language without assuming that the sharp trajectories have any real counterpart in the world because this does not prevent us from building fuzzy models that have a more direct relation to reality.
However, most physicists take the existence of sharp trajectories seriously and try to obtain them from quantum mechanics as exactly as possible. Hence, they focus on quantum states the phase-space picture of which is as sharp as possible. Those are states with minimum uncertainty allowed by quantum mechanics. For one degree of freedom, described by coordinate q and momentum p , the uncertainty is given by the quantity
$$
\nu = \frac{2\,\Delta q\,\Delta p}{\hbar} ,
$$
where $\Delta a$ denotes the variance of quantity $a$, as defined by Equation (16).
The states with minimum uncertainty $\nu = 1$ are, however, very special extremal states. Such states do exist for macroscopic systems but are very difficult to prepare, unlike the usual states of macroscopic systems described by classical mechanics. As we explained in Section 2.1.2., they have a number of properties that are very strange from the point of view of classical theories and they are therefore not what we have called classicality states.
We feel that there is no point in attempts to derive the language part of Newtonian mechanics from that of quantum mechanics. Instead, we propose that the classical limit is to be considered at the level of models. That is, properties of successful Newtonian models are to be obtained from some quantum models under suitable conditions. We assume further that a good model of Newtonian mechanics is necessarily fuzzy and that the fuzziness is determined by the preparation of the system similarly as in quantum mechanics. Let us give some examples.
Consider a gun in a position that is fixed in a reproducible way and that shoots bullets using cartridges of a given provenance. All shots made under these conditions form an ensemble with average trajectory $(Q^k_{\mathrm{gun}}(t), P^k_{\mathrm{gun}}(t))$ and the trajectory variance $(\Delta Q^k_{\mathrm{gun}}(t), \Delta P^k_{\mathrm{gun}}(t))$ that describes objective properties of the ensemble. The Newtonian model of this ensemble is the evolution $\rho_{\mathrm{gun}}(Q^k, P^k; t)$ of a suitable distribution function on the phase space. According to Newtonian mechanics, each individual shot has a sharp trajectory $(Q^k(t), P^k(t))$. Each individual shot is also an element of the ensemble and this is a property of the individual that can be considered also as objective, even in Newtonian mechanics.
The existence of this fuzzy property of an individual shot does not contradict the fact that some more precise observations (optical, say) of this one shot can give a different fuzzy structure. Indeed, such an optical measurement method ought to have been studied on other ensembles and to be already well established, which allows one to estimate its error (variance) and hence to understand the result of the measurement as saying that a given, fixed trajectory is an element of a thought ensemble with an average $(Q^k_{\mathrm{opt}}(t), P^k_{\mathrm{opt}}(t))$ and variance $(\Delta Q^k_{\mathrm{opt}}(t), \Delta P^k_{\mathrm{opt}}(t))$, where
$$
Q^k_{\mathrm{opt}}(t) \approx Q^k_{\mathrm{gun}}(t) , \quad \Delta Q^k_{\mathrm{opt}}(t) < \Delta Q^k_{\mathrm{gun}}(t) ,
$$
and similarly for the momentum part. Still, $\Delta Q^k_{\mathrm{opt}}(t) \cdot \Delta P^k_{\mathrm{opt}}(t)$ must be much larger than the minimum quantum uncertainty $\hbar/2$.
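To give a feeling for the orders of magnitude, the following minimal Python sketch evaluates the ratio ν = 2ΔQΔP/ħ for a macroscopic preparation. The spreads used are hypothetical values chosen only for illustration (they are not taken from any experiment discussed here), and the function name `uncertainty_ratio` is an assumption of this sketch.

```python
# Illustrative only: the spreads below are hypothetical, not data from the text.
HBAR = 1.054571817e-34  # reduced Planck constant in J*s


def uncertainty_ratio(delta_q, delta_p):
    """Return nu = 2 * delta_q * delta_p / hbar; nu = 1 is the quantum minimum."""
    return 2.0 * delta_q * delta_p / HBAR


# Hypothetical bullet: position spread 1 micrometre, momentum spread 1e-9 kg*m/s.
nu_bullet = uncertainty_ratio(1e-6, 1e-9)
print(f"nu for the hypothetical bullet: {nu_bullet:.2e}")
```

Even for this rather sharp hypothetical preparation, ν exceeds the quantum minimum ν = 1 by many orders of magnitude, which is the sense in which classical fuzziness is "much larger" than the quantum bound.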
The simplest way to construct a fuzzy model is to fix initial averages and variances of coordinates and momenta, $Q^k$, $\Delta Q^k$, $P^k$, $\Delta P^k$, and leave everything else as fuzzy as possible. To calculate the corresponding probability distributions in classical mechanics and the state operators in quantum mechanics, we shall, therefore, apply the maximum entropy principle. The resulting states will be called maximum-entropy packets, abbreviated as ME packets. The averages of coordinates and momenta take over the role of coordinates and momenta in classical mechanics. In any case, these averages represent measurable aspects of these variables. Quantities $Q^k$, $\Delta Q^k$, $P^k$, $\Delta P^k$ will also play the role of classical state coordinates defined by Assumption 1. To limit ourselves just to given averages and variances of coordinates and momenta is a great simplification that enables us to obtain interesting results easily. One can imagine, however, more complicated models, where further moments are fixed, or moments of different observables (e.g., mass centre, total momentum, angles and total angular momentum) are fixed.
The variances are not assumed small. How large they are depends on the accuracy of a preparation or of a measurement, as the gun example shows.
In fact, the dynamical evolution of variances is an important indicator of the applicability of the model one is working with. It determines the time intervals within which reasonable predictions are possible. Consider a three-body system that is to model the Sun, Earth and Jupiter in Newtonian mechanics. It turns out that generic trajectories starting as near to each other as, say, the dimension of the irregularities of the Earth's surface will diverge from each other by dimensions of the Earth–Sun distance after a time of only about $10^{7}$ years. This seems to contradict the $4 \times 10^{12}$ years of relatively stable Earth motion around the Sun that is borne out by observations. The only way out is the existence of a few special trajectories that are much more stable than the generic ones and the fact that bodies following an unstable trajectory have long ago fallen into the Sun or have been ejected from the solar system. By the way, this spontaneous evolution can be considered as a preparation procedure of the solar system.

4.3. Classical ME packets

Let us first consider a system $S$ with one degree of freedom and then generalize it to any number of degrees. Let the coordinate be $q$ and the momentum $p$. A state is a distribution function $\rho(q, p)$ on the phase space spanned by $q$ and $p$. The function $\rho(q, p)$ is dimensionless and normalized by
$$
\int \frac{dq\,dp}{v}\,\rho = 1 ,
$$
where $v$ is an auxiliary phase-space volume to make $\rho$ dimensionless. The entropy of $\rho(q, p)$ can be defined by
$$
S := -\int \frac{dq\,dp}{v}\,\rho \ln \rho .
$$
The value of entropy will depend on $v$ but most other results will not. Classical mechanics does not offer any idea of how to fix $v$. We shall get its value from quantum mechanics.

4.3.1. Definition and properties

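Definition 28 ME packet is the distribution function $\rho$ that maximizes the entropy subject to the conditions:
$$
\langle q \rangle = Q , \quad \langle q^2 \rangle = \Delta Q^2 + Q^2 , \tag{72}
$$
and
$$
\langle p \rangle = P , \quad \langle p^2 \rangle = \Delta P^2 + P^2 , \tag{73}
$$
where $Q$, $P$, $\Delta Q$ and $\Delta P$ are given values.
We have used the abbreviation
$$
\langle x \rangle = \int \frac{dq\,dp}{v}\, x\, \rho .
$$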
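The explicit form of $\rho$ can be found using the partition-function method as described in Ref. [37]. The variational principle yields
$$
\rho = \frac{1}{Z(\lambda_1, \lambda_2, \lambda_3, \lambda_4)} \exp\left( -\lambda_1 q - \lambda_2 p - \lambda_3 q^2 - \lambda_4 p^2 \right) , \tag{74}
$$
where
$$
Z = \int \frac{dq\,dp}{v} \exp\left( -\lambda_1 q - \lambda_2 p - \lambda_3 q^2 - \lambda_4 p^2 \right) ,
$$
and $\lambda_1$, $\lambda_3$, $\lambda_2$ and $\lambda_4$ are the four Lagrange multipliers corresponding to the four conditions (72) and (73). Hence, the partition function for classical ME packet is
$$
Z = \frac{\pi}{v} \frac{1}{\sqrt{\lambda_3 \lambda_4}} \exp\left( \frac{\lambda_1^2}{4\lambda_3} + \frac{\lambda_2^2}{4\lambda_4} \right) . \tag{75}
$$
The expressions for $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ in terms of $Q$, $P$, $\Delta Q$ and $\Delta P$ can be obtained by solving the equations
$$
\frac{\partial \ln Z}{\partial \lambda_1} = -Q , \quad \frac{\partial \ln Z}{\partial \lambda_3} = -\Delta Q^2 - Q^2 ,
$$
and
$$
\frac{\partial \ln Z}{\partial \lambda_2} = -P , \quad \frac{\partial \ln Z}{\partial \lambda_4} = -\Delta P^2 - P^2 .
$$
The result is:
$$
\lambda_1 = -\frac{Q}{\Delta Q^2} , \quad \lambda_3 = \frac{1}{2\Delta Q^2} , \tag{76}
$$
and
$$
\lambda_2 = -\frac{P}{\Delta P^2} , \quad \lambda_4 = \frac{1}{2\Delta P^2} . \tag{77}
$$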
Substituting this into Equation (74), we obtain the distribution function of a one-dimensional ME packet. The generalization to any number of dimensions is:
Theorem 13 The distribution function of the ME packet for a system with given averages and variances $Q_1, \ldots, Q_n$, $\Delta Q_1, \ldots, \Delta Q_n$ of coordinates and $P_1, \ldots, P_n$, $\Delta P_1, \ldots, \Delta P_n$ of momenta, is
$$
\rho = \left( \frac{v}{2\pi} \right)^{n} \prod_{k=1}^{n} \frac{1}{\Delta Q_k \Delta P_k} \exp\left[ -\frac{(q_k - Q_k)^2}{2\Delta Q_k^2} - \frac{(p_k - P_k)^2}{2\Delta P_k^2} \right] . \tag{78}
$$
We observe that all averages obtained from $\rho$ are independent of $v$ and that the right-hand side of Equation (78) is a Gaussian distribution, in agreement with Jaynes’ conjecture that the maximum entropy principle gives the Gaussian distribution if the only conditions are fixed values of the first two moments.
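The solution for the Lagrange multipliers can be checked numerically. The sketch below is a minimal consistency check, not part of the original derivation: it takes ln Z from Equation (75), inserts the multipliers of Equations (76) and (77), and compares finite-difference derivatives of ln Z with the right-hand sides of the four conditions. The helper names (`ln_Z`, `multipliers`, `partial`) and the numerical test values are assumptions made for illustration.

```python
import math


def ln_Z(l1, l2, l3, l4, v=1.0):
    """Logarithm of the classical partition function, Equation (75)."""
    return (math.log(math.pi / v) - 0.5 * math.log(l3 * l4)
            + l1**2 / (4.0 * l3) + l2**2 / (4.0 * l4))


def multipliers(Q, P, dQ, dP):
    """Lagrange multipliers (lambda_1..lambda_4) from Equations (76) and (77)."""
    return (-Q / dQ**2, -P / dP**2, 1.0 / (2.0 * dQ**2), 1.0 / (2.0 * dP**2))


def partial(f, args, i, h=1e-6):
    """Central finite difference of f with respect to its i-th argument."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)


# Hypothetical test values for the prescribed averages and variances.
Q, P, dQ, dP = 1.5, -0.7, 0.3, 0.4
lam = multipliers(Q, P, dQ, dP)

# d ln Z / d lambda_i, ordered as (lambda_1, lambda_2, lambda_3, lambda_4).
grads = [partial(ln_Z, lam, i) for i in range(4)]
targets = [-Q, -P, -(dQ**2 + Q**2), -(dP**2 + P**2)]

for g, t in zip(grads, targets):
    print(f"numerical derivative = {g:+.6f}, required value = {t:+.6f}")
```

The value of `v` only shifts ln Z by a constant, so it drops out of all four derivatives, in line with the remark that averages do not depend on v.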
As $\Delta Q$ and $\Delta P$ approach zero, $\rho$ becomes a δ-function and the state becomes sharp. For some quantities this limit is sensible, for others it is not. In particular, the entropy, which can easily be calculated,
$$
S = 1 + \ln \frac{2\pi\,\Delta Q\,\Delta P}{v} ,
$$
diverges to $-\infty$. This is due to a general difficulty in giving a definition of entropy for a continuous system that would be satisfactory in every respect. What one could do is to divide the phase space into cells of volume $v$ so that $\Delta Q\,\Delta P$ could not be chosen smaller than $v$. Then, the limit $\Delta Q\,\Delta P \to v$ of entropy would make more sense.
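As a cross-check of the entropy formula, the following sketch integrates −ρ ln ρ numerically for a one-dimensional ME packet and compares the result with the closed form S = 1 + ln(2πΔQΔP/v). It is a plain midpoint-rule estimate under stated assumptions: the grid size, the cut-off at eight standard deviations, and all numerical values are hypothetical choices made for illustration.

```python
import math


def me_packet(q, p, Q, P, dQ, dP, v):
    """One-dimensional classical ME packet (Gaussian with the given moments)."""
    norm = v / (2.0 * math.pi * dQ * dP)
    return norm * math.exp(-(q - Q)**2 / (2 * dQ**2) - (p - P)**2 / (2 * dP**2))


def entropy_numeric(Q, P, dQ, dP, v, n=200, width=8.0):
    """Midpoint-rule estimate of S = -int (dq dp / v) rho ln rho."""
    hq = 2 * width * dQ / n
    hp = 2 * width * dP / n
    total = 0.0
    for i in range(n):
        q = Q - width * dQ + (i + 0.5) * hq
        for j in range(n):
            p = P - width * dP + (j + 0.5) * hp
            rho = me_packet(q, p, Q, P, dQ, dP, v)
            if rho > 0.0:  # guard against log(0) in the far tails
                total -= rho * math.log(rho)
    return total * hq * hp / v


def entropy_closed_form(dQ, dP, v):
    """Closed-form entropy of the ME packet."""
    return 1.0 + math.log(2.0 * math.pi * dQ * dP / v)


S_num = entropy_numeric(1.5, -0.7, 0.3, 0.4, v=0.01)
S_exact = entropy_closed_form(0.3, 0.4, v=0.01)
print(f"numerical S = {S_num:.6f}, closed form S = {S_exact:.6f}")
```

Shrinking ΔQΔP below v/(2πe) makes the closed-form value negative, which illustrates why the sharp limit is not meaningful for the entropy.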
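The average of any monomial of the form $q^k p^l q^{2m} p^{2n}$ can be calculated with the help of the partition-function method as follows:
$$
\langle q^k p^l q^{2m} p^{2n} \rangle = \frac{(-1)^{N}}{Z} \frac{\partial^{N} Z}{\partial \lambda_1^{k}\, \partial \lambda_2^{l}\, \partial \lambda_3^{m}\, \partial \lambda_4^{n}} , \tag{79}
$$
where $N = k + l + m + n$ is the total number of derivatives, $Z$ is given by Equation (75) and the values (76) and (77) must be substituted for the Lagrange multipliers after the derivatives are taken.
Observe that this enables one to calculate the average of a monomial in several different ways. Each of these ways, however, leads to the same result due to the identities
$$
\frac{\partial^2 Z}{\partial \lambda_1^2} = -\frac{\partial Z}{\partial \lambda_3} , \quad \frac{\partial^2 Z}{\partial \lambda_2^2} = -\frac{\partial Z}{\partial \lambda_4} ,
$$
which are satisfied by the partition function.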
Assumption 3 ME packet Equation (78) is a part of a satisfactory model for many systems in Newtonian mechanics.

4.3.2. Classical equations of motion

Let us assume that the Hamiltonian of $S$ has the form
$$
H = \frac{p^2}{2\mu} + V(q) , \tag{80}
$$
where $\mu$ is the mass and $V(q)$ the potential function. The equations of motion are
$$
\dot q = \{ q, H \} , \quad \dot p = \{ p, H \} .
$$
Inserting (80) for $H$, we obtain
$$
\dot q = \frac{p}{\mu} , \quad \dot p = -\frac{dV}{dq} .
$$
The general solution to these equations can be written in the form
$$
q(t) = q(t; q, p) , \quad p(t) = p(t; q, p) , \tag{82}
$$
where
$$
q(0; q, p) = q , \quad p(0; q, p) = p ,
$$
$q$ and $p$ being arbitrary initial values. This implies for the time dependence of the averages and variances, if the initial state is an ME packet:
$$
Q(t) = \langle q(t; q, p) \rangle , \quad \Delta Q(t) = \sqrt{ \langle ( q(t; q, p) - Q(t) )^2 \rangle }
$$
and
$$
P(t) = \langle p(t; q, p) \rangle , \quad \Delta P(t) = \sqrt{ \langle ( p(t; q, p) - P(t) )^2 \rangle } .
$$
In general, $Q(t)$ and $P(t)$ will depend not only on initial $Q$ and $P$, but also on $\Delta Q$ and $\Delta P$.
Let us consider the special case of at most quadratic potential:
$$
V(q) = V_0 + V_1 q + \frac{1}{2} V_2 q^2 ,
$$
where $V_k$ are constants with suitable dimensions. If $V_1 = V_2 = 0$, we have a free particle; if $V_2 = 0$, it is a particle in a homogeneous force field; and if $V_2 \neq 0$, it is a harmonic or anti-harmonic oscillator.
In this case, general solution (82) has the form
$$
q(t) = f_0(t) + q f_1(t) + p f_2(t) ,
$$
$$
p(t) = g_0(t) + q g_1(t) + p g_2(t) ,
$$
where $f_0(0) = f_2(0) = g_0(0) = g_1(0) = 0$ and $f_1(0) = g_2(0) = 1$. If $V_2 \neq 0$, the functions are
$$
f_0(t) = -\frac{V_1}{V_2} (1 - \cos \omega t) , \quad f_1(t) = \cos \omega t , \quad f_2(t) = \frac{1}{\xi} \sin \omega t ,
$$
$$
g_0(t) = -\xi \frac{V_1}{V_2} \sin \omega t , \quad g_1(t) = -\xi \sin \omega t , \quad g_2(t) = \cos \omega t ,
$$
where
$$
\xi = \sqrt{\mu V_2} , \quad \omega = \sqrt{\frac{V_2}{\mu}} .
$$
Only for $V_2 > 0$, the functions remain bounded. If $V_2 = 0$, we obtain
$$
f_0(t) = -\frac{V_1}{2\mu} t^2 , \quad f_1(t) = 1 , \quad f_2(t) = \frac{t}{\mu} ,
$$
$$
g_0(t) = -V_1 t , \quad g_1(t) = 0 , \quad g_2(t) = 1 .
$$
The time dependence of averages and variances resulting from Equations (82), (72) and (73) is [18]
$$
Q(t) = f_0(t) + Q f_1(t) + P f_2(t)
$$
and
$$
\Delta Q^2(t) + Q^2(t) = f_0^2(t) + (\Delta Q^2 + Q^2) f_1^2(t) + (\Delta P^2 + P^2) f_2^2(t) + 2 Q f_0(t) f_1(t) + 2 P f_0(t) f_2(t) + 2 \langle q p \rangle f_1(t) f_2(t) . \tag{94}
$$
For the last term, we have from Equation (79)
$$
\langle q p \rangle = \frac{1}{Z} \frac{\partial^2 Z}{\partial \lambda_1\, \partial \lambda_2} .
$$
Using Equations (75), (76) and (77), we obtain from Equation (94)
$$
\Delta Q(t) = \sqrt{ f_1^2(t)\, \Delta Q^2 + f_2^2(t)\, \Delta P^2 } .
$$
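The intermediate step can be spelled out as follows. This is a short worked derivation that uses only Equations (75)–(77), (79) and (94); it adds no new assumptions.

```latex
% Step 1: the partition function (75) factorizes in q and p, so
\langle q p \rangle
  = \frac{1}{Z}\frac{\partial^2 Z}{\partial\lambda_1\,\partial\lambda_2}
  = \frac{\lambda_1}{2\lambda_3}\cdot\frac{\lambda_2}{2\lambda_4}
  = (-Q)(-P) = Q P ,
% where (76) and (77) give \lambda_1/(2\lambda_3) = -Q and \lambda_2/(2\lambda_4) = -P.

% Step 2: square the average Q(t) = f_0 + Q f_1 + P f_2,
Q^2(t) = f_0^2 + Q^2 f_1^2 + P^2 f_2^2
       + 2 Q f_0 f_1 + 2 P f_0 f_2 + 2 Q P f_1 f_2 .

% Step 3: subtract Q^2(t) from the right-hand side of (94);
% all terms containing f_0 and the cross term 2 Q P f_1 f_2 cancel, leaving
\Delta Q^2(t) = f_1^2(t)\,\Delta Q^2 + f_2^2(t)\,\Delta P^2 .
```

In particular, the variance does not depend on the inhomogeneous part $f_0(t)$ of the motion, only on how the initial spreads are transported by $f_1$ and $f_2$.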
Similarly,
$$
P(t) = g_0(t) + Q g_1(t) + P g_2(t) ,
$$
$$
\Delta P(t) = \sqrt{ g_1^2(t)\, \Delta Q^2 + g_2^2(t)\, \Delta P^2 } .
$$
We observe that, if functions $f_1(t)$, $f_2(t)$, $g_1(t)$ and $g_2(t)$ remain bounded, the variances also remain bounded and predictions are possible in arbitrarily long intervals of time. Otherwise, there will always be only limited time intervals in which the theory can make predictions.
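The bounded and unbounded cases can be compared with a short numerical sketch. The code below implements the coefficient functions for the harmonic case ($V_2 > 0$) and for $V_2 = 0$, and propagates the averages and variances of an ME packet with the formulas given above. The function names and all parameter values are hypothetical; the anti-harmonic case $V_2 < 0$ is deliberately left out of this sketch.

```python
import math


def coefficient_functions(t, mu, V1, V2):
    """Return (f0, f1, f2, g0, g1, g2) at time t for V(q) = V0 + V1*q + V2*q**2/2.

    Only V2 > 0 (harmonic oscillator) and V2 == 0 (free particle or
    homogeneous force field) are covered here.
    """
    if V2 > 0:
        xi = math.sqrt(mu * V2)
        omega = math.sqrt(V2 / mu)
        c, s = math.cos(omega * t), math.sin(omega * t)
        return (-(V1 / V2) * (1 - c), c, s / xi,
                -xi * (V1 / V2) * s, -xi * s, c)
    if V2 == 0:
        return (-V1 * t**2 / (2 * mu), 1.0, t / mu, -V1 * t, 0.0, 1.0)
    raise ValueError("anti-harmonic case V2 < 0 is not covered in this sketch")


def evolve_packet(t, Q, P, dQ, dP, mu, V1, V2):
    """Averages (Q, P) and variances (dQ, dP) of an ME packet at time t."""
    f0, f1, f2, g0, g1, g2 = coefficient_functions(t, mu, V1, V2)
    Qt = f0 + Q * f1 + P * f2
    Pt = g0 + Q * g1 + P * g2
    dQt = math.sqrt(f1**2 * dQ**2 + f2**2 * dP**2)
    dPt = math.sqrt(g1**2 * dQ**2 + g2**2 * dP**2)
    return Qt, Pt, dQt, dPt


# Hypothetical initial packet and parameters (arbitrary units).
for t in (0.0, 1.0, 10.0, 100.0):
    _, _, dQ_osc, _ = evolve_packet(t, 1.0, 2.0, 0.1, 0.2, mu=1.0, V1=0.5, V2=4.0)
    _, _, dQ_free, _ = evolve_packet(t, 1.0, 2.0, 0.1, 0.2, mu=1.0, V1=0.0, V2=0.0)
    print(f"t = {t:6.1f}: oscillator dQ = {dQ_osc:.4f}, free particle dQ = {dQ_free:.4f}")
```

For the oscillator the position variance oscillates within fixed bounds, whereas for the free particle it grows without limit, so the prediction horizon is finite in the second case.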
In the case of a general potential, the functions (82) can be expanded in products of powers of $q$ and $p$, and the averages of these products will contain powers of the variances. However, as one easily sees from formulas (79) and (75),
$$
\langle q^k p^l \rangle = Q^k P^l + X \Delta Q + Y \Delta P ,
$$
where $X$ and $Y$ are bounded functions. It follows that the dynamical equations for averages coincide, in the limit $\Delta Q \to 0$, $\Delta P \to 0$, with the exact dynamical equations for $q$ and $p$. It is an idealization that we consider as not realistic, even in principle, but it may still be useful for calculations.
Let us expand a general potential function in powers of q,
$$
V(q) = \sum_{k=0}^{\infty} \frac{1}{k!} V_k q^k , \tag{98}
$$
where $V_k$ are constants of appropriate dimensions. The Hamilton equations can be used to calculate all time derivatives at $t = 0$. First, we have
$$
\frac{dq}{dt} = \{ q, H \} = \frac{p}{\mu} .
$$
This equation can be used to calculate all derivatives of $q$ in terms of those of $p$:
$$
\frac{d^n q}{dt^n} = \frac{1}{\mu} \frac{d^{n-1} p}{dt^{n-1}} .
$$
A simple iterative procedure gives:
$$
\frac{dp}{dt} = -V_1 - V_2 q - \frac{V_3}{2} q^2 - \frac{V_4}{6} q^3 + r_5 , \tag{100}
$$
$$
\frac{d^2 p}{dt^2} = -\frac{V_2}{\mu} p - \frac{V_3}{\mu} q p - \frac{V_4}{2\mu} q^2 p + r_5 , \tag{101}
$$
$$
\begin{aligned}
\frac{d^3 p}{dt^3} ={}& -\frac{V_3}{\mu^2} p^2 - \frac{V_4}{\mu^2} q p^2 + \frac{V_1 V_2}{\mu} + \frac{V_1 V_3 + V_2^2}{\mu} q + \frac{3 V_2 V_3 + V_1 V_4}{2\mu} q^2 \\
& + \frac{4 V_2 V_4 + 3 V_3^2}{6\mu} q^3 + \frac{5 V_3 V_4}{12\mu} q^4 + \frac{V_4^2}{12\mu} q^5 + r_5 ,
\end{aligned} \tag{102}
$$
and
$$
\begin{aligned}
\frac{d^4 p}{dt^4} ={}& -\frac{V_4}{\mu^3} p^3 + \frac{3 V_1 V_3 + V_2^2}{\mu^2} p + \frac{3 V_1 V_4 + 5 V_2 V_3}{\mu^2} q p + \frac{5 V_3^2 + 8 V_2 V_4}{2\mu^2} q^2 p \\
& + \frac{3 V_3 V_4}{\mu^2} q^3 p + \frac{3 V_4^2}{4\mu^2} q^4 p + r_5 ,
\end{aligned} \tag{103}
$$
where $r_k$ is the remainder term that is due to all powers in (98) that are not smaller than $k$ (the remainders symbolize different expressions in different equations). The purpose of having time derivatives up to the fourth order is to see better the difference to quantum corrections that will be calculated in Section 4.4.3.
Taking the average of both sides of Equations (100)–(103), and using Equations (79), (75)–(77), we obtain
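$$
\frac{dP}{dt} = -V_1 - V_2 Q - \frac{V_3}{2} Q^2 - \frac{V_4}{6} Q^3 - \frac{V_3 + V_4 Q}{2} \Delta Q^2 + r_5 , \tag{104}
$$
$$
\frac{d^2 P}{dt^2} = -\frac{V_2}{\mu} P - \frac{V_3}{\mu} Q P - \frac{V_4}{2\mu} Q^2 P - \frac{V_4}{2\mu} P\, \Delta Q^2 + r_5 , \tag{105}
$$
$$
\begin{aligned}
\frac{d^3 P}{dt^3} ={}& -\frac{V_3}{\mu^2} P^2 - \frac{V_4}{\mu^2} Q P^2 + \frac{V_1 V_2}{\mu} + \frac{V_1 V_3 + V_2^2}{\mu} Q + \frac{3 V_2 V_3 + V_1 V_4}{2\mu} Q^2 \\
& + \frac{4 V_2 V_4 + 3 V_3^2}{6\mu} Q^3 + \frac{5 V_3 V_4}{12\mu} Q^4 + \frac{V_4^2}{12\mu} Q^5 - \left( \frac{V_3}{\mu^2} + \frac{V_4}{\mu^2} Q \right) \Delta P^2 \\
& + \left( \frac{3 V_2 V_3 + V_1 V_4}{2\mu} + \frac{4 V_2 V_4 + 3 V_3^2}{2\mu} Q + \frac{5 V_3 V_4}{2\mu} Q^2 + \frac{5 V_3 V_4}{4\mu} \Delta Q^2 + \frac{5 V_4^2}{6\mu} Q^3 + \frac{5 V_4^2}{4\mu} Q\, \Delta Q^2 \right) \Delta Q^2 + r_5 ,
\end{aligned} \tag{106}
$$
and
$$
\begin{aligned}
\frac{d^4 P}{dt^4} ={}& -\frac{V_4}{\mu^3} P^3 + \frac{3 V_1 V_3 + V_2^2}{\mu^2} P + \frac{3 V_1 V_4 + 5 V_2 V_3}{\mu^2} Q P + \frac{5 V_3^2 + 8 V_2 V_4}{2\mu^2} Q^2 P \\
& + \frac{3 V_3 V_4}{\mu^2} Q^3 P + \frac{3 V_4^2}{4\mu^2} Q^4 P - \frac{3 V_4}{\mu^3} P\, \Delta P^2 \\
& + \left( \frac{5 V_3^2 + 8 V_2 V_4}{2\mu^2} P + \frac{9 V_3 V_4}{\mu^2} Q P + \frac{9 V_4^2}{2\mu^2} Q^2 P + \frac{9 V_4^2}{4\mu^2} P\, \Delta Q^2 \right) \Delta Q^2 + r_5 .
\end{aligned} \tag{107}
$$
We can see that the limit $\Delta Q \to 0$, $\Delta P \to 0$ in Equations (104)–(107) leads to equations that coincide with Equations (100)–(103) if $Q \to q$, $P \to p$, as promised.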
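The variance corrections in the averaged equations come entirely from the Gaussian moments of the ME packet, for instance $\langle q^2 \rangle = Q^2 + \Delta Q^2$ and $\langle q^3 \rangle = Q^3 + 3 Q \Delta Q^2$. The sketch below checks this for Equation (104): it averages the right-hand side of Equation (100) term by term with exact Gaussian moments and compares the result with the closed form. The remainder $r_5$ is dropped, the moment recursion is the standard one for a normal distribution, and the function names and numerical values are assumptions made for illustration.

```python
def gaussian_moments(mean, var, order):
    """Raw moments <x^n>, n = 0..order, of a Gaussian, using the recursion
    m_n = mean * m_{n-1} + (n - 1) * var * m_{n-2}."""
    m = [1.0, mean]
    for n in range(2, order + 1):
        m.append(mean * m[n - 1] + (n - 1) * var * m[n - 2])
    return m[:order + 1]


def dP_dt_closed_form(Q, dQ, V):
    """Right-hand side of Equation (104) with the remainder r_5 dropped."""
    V1, V2, V3, V4 = V
    return (-V1 - V2 * Q - V3 / 2 * Q**2 - V4 / 6 * Q**3
            - (V3 + V4 * Q) / 2 * dQ**2)


def dP_dt_from_moments(Q, dQ, V):
    """Average of Equation (100), term by term, over the ME packet."""
    V1, V2, V3, V4 = V
    m = gaussian_moments(Q, dQ**2, 3)
    return -V1 - V2 * m[1] - V3 / 2 * m[2] - V4 / 6 * m[3]


# Hypothetical potential coefficients (V1, V2, V3, V4) and packet parameters.
V = (0.3, 1.2, -0.5, 0.8)
Q, dQ = 2.0, 0.5
print(dP_dt_closed_form(Q, dQ, V), dP_dt_from_moments(Q, dQ, V))

# In the sharp limit dQ -> 0 the average reduces to Equation (100) at q = Q.
print(dP_dt_closed_form(Q, 0.0, V))
```

The same moment recursion gives $\langle q^4 \rangle = Q^4 + 6 Q^2 \Delta Q^2 + 3 \Delta Q^4$ and $\langle q^5 \rangle = Q^5 + 10 Q^3 \Delta Q^2 + 15 Q \Delta Q^4$, which are the sources of the numerical coefficients in the higher-order equations.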

4.4. Quantum ME packets

Let us now turn to quantum mechanics and try to solve an analogous problem.
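Definition 29 Let the quantum model $S_q$ of system $S$ have spin 0, position $\mathsf{q}$ and momentum $\mathsf{p}$. State $\mathsf{T}$ that maximizes von Neumann entropy (see Section 2.2.2.)
$$
S = -\mathrm{tr}( \mathsf{T} \ln \mathsf{T} ) \tag{108}
$$
under the conditions
$$
\mathrm{tr}[\mathsf{T} \mathsf{q}] = Q , \quad \mathrm{tr}[\mathsf{T} \mathsf{q}^2] = Q^2 + \Delta Q^2 , \tag{109}
$$
$$
\mathrm{tr}[\mathsf{T} \mathsf{p}] = P , \quad \mathrm{tr}[\mathsf{T} \mathsf{p}^2] = P^2 + \Delta P^2 , \tag{110}
$$
where $Q$, $P$, $\Delta Q$ and $\Delta P$ are given numbers, is called quantum ME packet.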

4.4.1. Calculation of the state operator

To solve the mathematical problem, we use the method of Lagrange multipliers as in the classical case. Thus, the following equation results:
$$
dS - \lambda_0\, d\,\mathrm{tr}[\mathsf{T}] - \lambda_1\, d\,\mathrm{tr}[\mathsf{T}\mathsf{q}] - \lambda_2\, d\,\mathrm{tr}[\mathsf{T}\mathsf{p}] - \lambda_3\, d\,\mathrm{tr}[\mathsf{T}\mathsf{q}^2] - \lambda_4\, d\,\mathrm{tr}[\mathsf{T}\mathsf{p}^2] = 0 . \tag{111}
$$
The differentials of the terms that are linear in $\mathsf{T}$ are simple to calculate:
$$
d\,\mathrm{tr}[\mathsf{T}\mathsf{x}] = \sum_{mn} x_{nm}\, dT_{mn} .
$$
Although not all elements of the matrix $dT_{mn}$ are independent (it is a Hermitian matrix), we can proceed as if they were because the matrix $x_{nm}$ is also Hermitian. The only problem is to calculate $dS$. We have the following
Lemma 1
$$
dS = -\sum_{mn} \left[ \delta_{mn} + (\ln \mathsf{T})_{mn} \right] dT_{mn} .
$$
Proof Let $M$ be a unitary matrix that diagonalizes $\mathsf{T}$,
$$
M^{\dagger} \mathsf{T} M = R ,
$$
where $R$ is a diagonal matrix with elements $R_n$. Then $S = -\sum_n R_n \ln R_n$. The correction to $R_n$ if $\mathsf{T} \mapsto \mathsf{T} + d\mathsf{T}$ can be calculated by the first-order formula of the stationary perturbation theory. This theory is usually applied to Hamiltonians but it holds for any perturbed Hermitian operator. Moreover, the formula is exact for infinitesimal perturbations. Thus,
$$
R_n \mapsto R_n + \sum_{kl} M^{*}_{kn} M_{ln}\, dT_{kl} .
$$
In this way, we obtain
$$
\begin{aligned}
dS &= -\left[ \sum_n \left( R_n + \sum_{kl} M^{*}_{kn} M_{ln}\, dT_{kl} \right) \ln \left( R_n \left( 1 + \frac{1}{R_n} \sum_{rs} M^{*}_{rn} M_{sn}\, dT_{rs} \right) \right) - \sum_n R_n \ln R_n \right] \\
&= -\sum_n \left[ \ln R_n \sum_{kl} M^{*}_{kn} M_{ln}\, dT_{kl} + \sum_{kl} M^{*}_{kn} M_{ln}\, dT_{kl} \right] \\
&= -\sum_{kl} \left[ \delta_{kl} + (\ln \mathsf{T})_{kl} \right] dT_{kl} ,
\end{aligned}
$$
QED.
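With the help of Lemma 1, Equation (111) becomes
$$\mathrm{tr}\left[(1 + \ln\mathsf{T} + \lambda_0 + \lambda_1\mathsf{q} + \lambda_2\mathsf{p} + \lambda_3\mathsf{q}^2 + \lambda_4\mathsf{p}^2)\,d\mathsf{T}\right] = 0$$
so that we have
$$\mathsf{T} = \exp(-\lambda_0 - 1 - \lambda_1\mathsf{q} - \lambda_2\mathsf{p} - \lambda_3\mathsf{q}^2 - \lambda_4\mathsf{p}^2) . \tag{113}$$
The first two terms in the exponent determine the normalization constant
$$e^{-\lambda_0 - 1}$$
because they commute with the rest of the exponent and are independent of the dynamical variables. Taking the trace of Equation (113), we obtain
$$e^{-\lambda_0 - 1} = \frac{1}{Z(\lambda_1,\lambda_2,\lambda_3,\lambda_4)} , \tag{114}$$
where $Z$ is the partition function
$$Z(\lambda_1,\lambda_2,\lambda_3,\lambda_4) = \mathrm{tr}\left[\exp(-\lambda_1\mathsf{q} - \lambda_2\mathsf{p} - \lambda_3\mathsf{q}^2 - \lambda_4\mathsf{p}^2)\right] .$$
Thus, the state operator has the form
$$\mathsf{T} = \frac{1}{Z(\lambda_1,\lambda_2,\lambda_3,\lambda_4)}\exp(-\lambda_1\mathsf{q} - \lambda_2\mathsf{p} - \lambda_3\mathsf{q}^2 - \lambda_4\mathsf{p}^2) . \tag{115}$$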
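At this stage, the quantum theory begins to differ from the classical one. It turns out that, for the case of non-commuting operators in the exponent of the partition function, formula (79) is not valid in general. We can only show that it holds for the first derivatives. To this end, we prove the following
Lemma 2 Let $\mathsf{A}$ and $\mathsf{B}$ be Hermitian matrices. Then
$$\frac{d}{d\lambda}\mathrm{tr}[\exp(\mathsf{A} + \mathsf{B}\lambda)] = \mathrm{tr}[\mathsf{B}\exp(\mathsf{A} + \mathsf{B}\lambda)] .$$
Proof We express the exponential function as a series and then use the invariance of the trace with respect to any cyclic permutation of its argument.
$$\begin{aligned} d\,\mathrm{tr}[\exp(\mathsf{A} + \mathsf{B}\lambda)] &= \sum_{n=0}^{\infty}\frac{1}{n!}\mathrm{tr}[d(\mathsf{A} + \mathsf{B}\lambda)^n] = \sum_{n=0}^{\infty}\frac{1}{n!}\mathrm{tr}\Big[\sum_{k=1}^{n}(\mathsf{A} + \mathsf{B}\lambda)^{k-1}\mathsf{B}(\mathsf{A} + \mathsf{B}\lambda)^{n-k}\Big]d\lambda \\ &= \sum_{n=0}^{\infty}\frac{1}{n!}\sum_{k=1}^{n}\mathrm{tr}\left[\mathsf{B}(\mathsf{A} + \mathsf{B}\lambda)^{n-1}\right]d\lambda = \mathrm{tr}[\mathsf{B}\exp(\mathsf{A} + \mathsf{B}\lambda)]\,d\lambda , \end{aligned}$$
QED.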
The proof of Lemma 2 shows why formula (79) is not valid for higher derivatives than the first in the quantum case: the operator $\mathsf{B}$ does not commute with $\mathsf{A} + \mathsf{B}\lambda$ and cannot be shifted from its position to the first position in the product
$$(\mathsf{A} + \mathsf{B}\lambda)^k\,\mathsf{B}\,(\mathsf{A} + \mathsf{B}\lambda)^l .$$
For the first derivative, it can be brought there by a suitable cyclic permutation. However, each commutator $[\mathsf{B}, (\mathsf{A} + \mathsf{B}\lambda)]$ is proportional to $\hbar$. Hence, formula (79) with higher derivatives is the leading term in the expansion of averages in powers of $\hbar$.
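As a numerical illustration of Lemma 2 and of the remark on higher derivatives, the following minimal sketch uses two hypothetical non-commuting real symmetric 2×2 matrices (the Pauli matrices $\sigma_z$ and $\sigma_x$, chosen here only as stand-ins for $\mathsf{A}$ and $\mathsf{B}$). The helper names `matmul`, `expm` and `combo` are assumptions of this sketch, and the matrix exponential is a plain truncated Taylor series, adequate only for small matrices of moderate norm.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(X, terms=40):
    """Matrix exponential via a truncated Taylor series (sketch only)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = matmul(term, X)                       # X^k / (k-1)!
        term = [[v / k for v in row] for row in term]  # X^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

# Hypothetical stand-ins for the Hermitian matrices A and B of Lemma 2.
A = [[1.0, 0.0], [0.0, -1.0]]   # sigma_z
B = [[0.0, 1.0], [1.0, 0.0]]    # sigma_x, does not commute with A

def combo(lam):
    """Return A + lam * B."""
    return [[A[i][j] + lam * B[i][j] for j in range(2)] for i in range(2)]

def f(lam):
    """tr exp(A + lam * B)."""
    return trace(expm(combo(lam)))

# First derivative: finite difference versus tr[B exp(A + B lam)].
lam, h = 0.3, 1e-5
first_fd = (f(lam + h) - f(lam - h)) / (2 * h)
first_tr = trace(matmul(B, expm(combo(lam))))

# Second derivative at lam = 0 versus the naive guess tr[B^2 exp(A)].
h2 = 1e-4
second_fd = (f(h2) - 2 * f(0.0) + f(-h2)) / h2**2
naive_second = trace(matmul(matmul(B, B), expm(combo(0.0))))

print(first_fd, first_tr)        # agree, as Lemma 2 states
print(second_fd, naive_second)   # differ, because A and B do not commute
```

For this particular pair the trace is $2\cosh\sqrt{1+\lambda^2}$, so the true second derivative at $\lambda = 0$ is $2\sinh 1$, while the naive expression gives $2\cosh 1$.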
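Together with Equation (114), Lemma 2 implies the formulae:
$$\frac{\partial\ln Z}{\partial\lambda_1} = -Q , \quad \frac{\partial\ln Z}{\partial\lambda_3} = -Q^2 - \Delta Q^2 \tag{117}$$
and
$$\frac{\partial\ln Z}{\partial\lambda_2} = -P , \quad \frac{\partial\ln Z}{\partial\lambda_4} = -P^2 - \Delta P^2 . \tag{118}$$
The values of the multipliers can be calculated from Equations (117) and (118), if the form of the partition function is known.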
Variational methods can find locally extremal values that are not necessarily maxima. We can, however, prove that our state operator maximizes entropy. The proof is based on the generalized Gibbs’ inequality,
$$\mathrm{tr}[\mathsf{T}\ln\mathsf{T} - \mathsf{T}\ln\mathsf{S}] \geq 0$$
for all pairs $\{\mathsf{T}, \mathsf{S}\}$ of state operators (for proof of the inequality, see [21], p. 264). The proof of maximality is then analogous to the “classical” proof (see, e.g., [37], p. 357). The first proof of maximality in the quantum case was given by von Neumann [77].
The state operator (115) can be inserted in the formula (108) to give the value of the maximum entropy,
$$S = \ln Z + \lambda_1\langle\mathsf{q}\rangle + \lambda_2\langle\mathsf{p}\rangle + \lambda_3\langle\mathsf{q}^2\rangle + \lambda_4\langle\mathsf{p}^2\rangle . \tag{119}$$
This, together with Equations (117) and (118), can be considered as the Legendre transformation from the function $\ln Z(\lambda_1,\lambda_2,\lambda_3,\lambda_4)$ to the function $S(\langle\mathsf{q}\rangle, \langle\mathsf{p}\rangle, \langle\mathsf{q}^2\rangle, \langle\mathsf{p}^2\rangle)$.

4.4.2. Diagonal representation

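The exponent in Equation (115) can be written in the form
$$\frac{\lambda_1^2}{4\lambda_3} + \frac{\lambda_2^2}{4\lambda_4} - 2\sqrt{\lambda_3\lambda_4}\,\mathsf{K} ,$$
where
$$\mathsf{K} = \frac{1}{2}\sqrt{\frac{\lambda_3}{\lambda_4}}\left(\mathsf{q} + \frac{\lambda_1}{2\lambda_3}\right)^2 + \frac{1}{2}\sqrt{\frac{\lambda_4}{\lambda_3}}\left(\mathsf{p} + \frac{\lambda_2}{2\lambda_4}\right)^2 .$$
This is an operator acting on the Hilbert space of our system. $\mathsf{K}$ has the form of the Hamiltonian of a harmonic oscillator with the coordinate $\mathsf{u}$ and momentum $\mathsf{w}$
$$\mathsf{u} = \mathsf{q} + \frac{\lambda_1}{2\lambda_3} , \quad \mathsf{w} = \mathsf{p} + \frac{\lambda_2}{2\lambda_4} , \tag{122}$$
that satisfy the commutation relation $[\mathsf{u}, \mathsf{w}] = i\hbar$. The oscillator has mass $M$ and frequency $\Omega$,
$$M = \sqrt{\frac{\lambda_3}{\lambda_4}} , \quad \Omega = 1 . \tag{123}$$
(The operator $\mathsf{K}$ must not be confused with the Hamiltonian $\mathsf{H}$ of our system, which can be arbitrary.) The normalized eigenstates $|k\rangle$ of the operator $\mathsf{K}$ form a basis in the Hilbert space of our system, defining the so-called diagonal representation, and its eigenvalues are $\hbar/2 + \hbar k$. As usual, we introduce the operator $\mathsf{A}$ such that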
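$$\mathsf{u} = \sqrt{\frac{\hbar}{2M}}\,(\mathsf{A} + \mathsf{A}^\dagger) , \tag{124}$$
$$\mathsf{w} = -i\sqrt{\frac{\hbar M}{2}}\,(\mathsf{A} - \mathsf{A}^\dagger) , \tag{125}$$
$$\mathsf{K} = \frac{\hbar}{2}\,(\mathsf{A}^\dagger\mathsf{A} + \mathsf{A}\mathsf{A}^\dagger) ,$$
$$\mathsf{A}|k\rangle = \sqrt{k}\,|k-1\rangle ,$$
$$\mathsf{A}^\dagger|k\rangle = \sqrt{k+1}\,|k+1\rangle .$$
To calculate $Z$ in the diagonal representation is easy:
$$\begin{aligned} Z &= \mathrm{tr}\left[\exp\left(\frac{\lambda_1^2}{4\lambda_3} + \frac{\lambda_2^2}{4\lambda_4} - 2\sqrt{\lambda_3\lambda_4}\,\mathsf{K}\right)\right] = \sum_{k=0}^{\infty}\langle k|\exp\left(\frac{\lambda_1^2}{4\lambda_3} + \frac{\lambda_2^2}{4\lambda_4} - 2\sqrt{\lambda_3\lambda_4}\,\mathsf{K}\right)|k\rangle \\ &= \exp\left(\frac{\lambda_1^2}{4\lambda_3} + \frac{\lambda_2^2}{4\lambda_4} - \hbar\sqrt{\lambda_3\lambda_4}\right)\sum_{k=0}^{\infty}\exp\left(-2\hbar\sqrt{\lambda_3\lambda_4}\,k\right) . \end{aligned}$$
Hence, the partition function for the quantum ME packets is
$$Z = \frac{\exp\left(\frac{\lambda_1^2}{4\lambda_3} + \frac{\lambda_2^2}{4\lambda_4}\right)}{2\sinh\left(\hbar\sqrt{\lambda_3\lambda_4}\right)} . \tag{129}$$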
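Now, we can express the Lagrange multipliers in terms of the averages and variances. Equations (117) and (118) yield
$$\lambda_1 = -\frac{Q}{\Delta Q^2}\,\frac{\nu}{2}\ln\frac{\nu+1}{\nu-1} , \quad \lambda_2 = -\frac{P}{\Delta P^2}\,\frac{\nu}{2}\ln\frac{\nu+1}{\nu-1} , \tag{130}$$
and
$$\lambda_3 = \frac{1}{2\Delta Q^2}\,\frac{\nu}{2}\ln\frac{\nu+1}{\nu-1} , \quad \lambda_4 = \frac{1}{2\Delta P^2}\,\frac{\nu}{2}\ln\frac{\nu+1}{\nu-1} , \tag{131}$$
where $\nu$ is defined by Equation (71).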
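The consistency of Equations (129)–(131) with the constraints (117) and (118) can be checked numerically. The sketch below assumes units with $\hbar = 1$, takes $\nu = 2\Delta Q\Delta P/\hbar$, and uses arbitrary illustrative values of $Q$, $P$, $\Delta Q$, $\Delta P$; the function names are hypothetical. It differentiates $\ln Z$ by central finite differences and compares the results with the prescribed averages.

```python
import math

HBAR = 1.0  # assumption of this sketch: units with hbar = 1

def multipliers(Q, P, dQ, dP):
    """Lagrange multipliers of Equations (130) and (131)."""
    nu = 2.0 * dQ * dP / HBAR
    g = 0.5 * nu * math.log((nu + 1.0) / (nu - 1.0))
    return (-Q * g / dQ**2, -P * g / dP**2,
            g / (2.0 * dQ**2), g / (2.0 * dP**2))

def ln_Z(l1, l2, l3, l4):
    """Logarithm of the partition function of Equation (129)."""
    x = HBAR * math.sqrt(l3 * l4)
    return l1**2 / (4.0 * l3) + l2**2 / (4.0 * l4) - math.log(2.0 * math.sinh(x))

def partial(func, args, i, h=1e-6):
    """Central finite-difference derivative with respect to argument i."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (func(*up) - func(*dn)) / (2.0 * h)

# Illustrative values only (nu = 6 here).
Q, P, dQ, dP = 0.7, -0.4, 1.5, 2.0
lams = multipliers(Q, P, dQ, dP)

mean_q = -partial(ln_Z, lams, 0)    # should reproduce Q
mean_p = -partial(ln_Z, lams, 1)    # should reproduce P
mean_q2 = -partial(ln_Z, lams, 2)   # should reproduce Q^2 + dQ^2
mean_p2 = -partial(ln_Z, lams, 3)   # should reproduce P^2 + dP^2

print(mean_q, mean_p, mean_q2, mean_p2)
```

The check works because $\hbar\sqrt{\lambda_3\lambda_4} = \frac{1}{2}\ln\frac{\nu+1}{\nu-1}$, so that $\coth(\hbar\sqrt{\lambda_3\lambda_4}) = \nu$.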
From Equations (119), (130) and (131), we obtain the entropy:
$$S = -\ln 2 + \frac{\nu+1}{2}\ln(\nu+1) - \frac{\nu-1}{2}\ln(\nu-1) . \tag{132}$$
Thus, $S$ depends on $Q$, $P$, $\Delta Q$, $\Delta P$ only via $\nu$. We have
$$\frac{dS}{d\nu} = \frac{1}{2}\ln\frac{\nu+1}{\nu-1} > 0 ,$$
so that $S$ is an increasing function of $\nu$. Near $\nu = 1$,
$$S \approx -\frac{\nu-1}{2}\ln(\nu-1) .$$
Asymptotically ($\nu \to \infty$),
$$S \approx \ln\nu + 1 - \ln 2 .$$
In the classical region, $\nu \gg 1$, $S \approx \ln\nu$.
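The behaviour of the entropy as a function of $\nu$ can be tabulated directly. The following minimal sketch (function names are hypothetical) evaluates Equation (132), compares a finite-difference slope with the stated derivative, and compares $S$ with the asymptotic form $\ln\nu + 1 - \ln 2$.

```python
import math

def entropy(nu):
    """Entropy of an ME packet, Equation (132); S = 0 in the limit nu = 1."""
    if nu == 1.0:
        return 0.0
    return (-math.log(2.0)
            + 0.5 * (nu + 1.0) * math.log(nu + 1.0)
            - 0.5 * (nu - 1.0) * math.log(nu - 1.0))

def entropy_slope(nu):
    """dS/dnu as given in the text."""
    return 0.5 * math.log((nu + 1.0) / (nu - 1.0))

def asymptotic(nu):
    """Large-nu form ln(nu) + 1 - ln(2)."""
    return math.log(nu) + 1.0 - math.log(2.0)

# Finite-difference check of the derivative at nu = 3.
h = 1e-6
slope_fd = (entropy(3.0 + h) - entropy(3.0 - h)) / (2.0 * h)

for nu in (1.001, 2.0, 10.0, 1.0e4):
    print(nu, entropy(nu), asymptotic(nu))
```

The difference between $S$ and its asymptotic form falls off like $1/(6\nu^2)$, so the two columns agree closely already for moderate $\nu$.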
It is clear that the choice of $Q$ and $P$ cannot influence the entropy. The independence of $S$ from $Q$ and $P$ does not contradict the Legendre transformation properties. Indeed, usually, one would have
$$\frac{\partial S}{\partial Q} = \lambda_1 ,$$
but here
$$\frac{\partial S}{\partial Q} = \lambda_1 + 2\lambda_3 Q ,$$
which is zero.
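The resulting state operator, generalized to $n$ degrees of freedom, is described by the following
Theorem 14 The state operator of the ME packet of a system with given averages and variances $Q_1, \ldots, Q_n$, $\Delta Q_1, \ldots, \Delta Q_n$ of coordinates and $P_1, \ldots, P_n$, $\Delta P_1, \ldots, \Delta P_n$ of momenta, is
$$\mathsf{T} = \prod_{k=1}^{n}\frac{2}{\sqrt{\nu_k^2 - 1}}\exp\left(-\frac{1}{\hbar}\ln\frac{\nu_k+1}{\nu_k-1}\,\mathsf{K}_k\right) , \tag{133}$$
where
$$\mathsf{K}_k = \frac{1}{2}\frac{\Delta P_k}{\Delta Q_k}(\mathsf{q}_k - Q_k)^2 + \frac{1}{2}\frac{\Delta Q_k}{\Delta P_k}(\mathsf{p}_k - P_k)^2$$
and
$$\nu_k = \frac{2\Delta P_k\Delta Q_k}{\hbar} .$$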
Strictly speaking, the state operator (133) is not a Gaussian distribution. Thus, it seems to be either a counterexample to, or a generalization of, Jaynes’ hypothesis.
Assumption 4 The quantum model $\mathcal{S}_q$ corresponding to the classical model $\mathcal{S}_c$ described by Assumption 3 is the ME packet (133).
Let us study the properties of quantum ME packets. In the diagonal representation, we have for $n = 1$:
$$\mathsf{T} = \sum_{m=0}^{\infty} R_m\,|m\rangle\langle m| .$$
We easily obtain for $R_m$ that
$$R_m = 2\,\frac{(\nu-1)^m}{(\nu+1)^{m+1}} . \tag{137}$$
Hence,
$$\lim_{\nu\to 1} R_m = \delta_{m0} ,$$
and the state $\mathsf{T}$ becomes $|0\rangle\langle 0|$. In general, states $|m\rangle$ depend on $\nu$. The state vector $|0\rangle$ expressed as a function of $Q$, $P$, $\Delta Q$ and $\nu$ is given, for any $\nu$, by
$$\psi(q) = \left(\frac{1}{\pi}\frac{\nu}{2\Delta Q^2}\right)^{1/4}\exp\left[-\frac{\nu}{4\Delta Q^2}(q - Q)^2 + \frac{iPq}{\hbar}\right] . \tag{138}$$
This is a Gaussian wave packet that corresponds to other values of variances than the original ME packet but has the minimum uncertainty. For $\nu \to 1$, it remains regular and the projection $|0\rangle\langle 0|$ becomes the state operator of the original ME packet. Hence, Gaussian wave packets are special cases of quantum ME packets.
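The spectrum $R_m$ can be checked numerically against the properties stated so far. The sketch below (helper names are hypothetical; the infinite sums are truncated at a finite number of levels, which is an approximation) verifies that the eigenvalues sum to one, that the von Neumann entropy $-\sum_m R_m\ln R_m$ reproduces the closed form of the entropy as a function of $\nu$, and that the state approaches the projector on $|0\rangle$ as $\nu \to 1$.

```python
import math

def spectrum(nu, m_max):
    """Eigenvalues R_m = 2 (nu-1)^m / (nu+1)^(m+1) for m = 0 .. m_max-1."""
    x = (nu - 1.0) / (nu + 1.0)
    return [2.0 / (nu + 1.0) * x**m for m in range(m_max)]

nu = 5.0
R = spectrum(nu, 400)   # truncation: the neglected tail is far below 1e-16

# Normalization of the state operator.
total = sum(R)

# Entropy from the spectrum versus the closed-form expression in nu.
s_spec = -sum(r * math.log(r) for r in R if r > 0.0)
s_closed = (-math.log(2.0)
            + 0.5 * (nu + 1.0) * math.log(nu + 1.0)
            - 0.5 * (nu - 1.0) * math.log(nu - 1.0))

# Near nu = 1 almost all weight sits on the level m = 0.
near_pure = spectrum(1.0 + 1e-9, 3)

print(total, s_spec, s_closed, near_pure)
```

The spectrum is geometric, $R_m = (1-x)x^m$ with $x = (\nu-1)/(\nu+1)$, which is why both sums have simple closed forms.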
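The diagonal representation offers a method for calculating averages of coordinates and momenta products that replaces the partition function way. Let us denote such a product $\mathsf{X}$. We have
$$\langle\mathsf{X}\rangle = \sum_{k=0}^{\infty} R_k\,\langle k|\mathsf{X}|k\rangle . \tag{139}$$
To calculate $\langle k|\mathsf{X}|k\rangle$, we use Equations (124), (125), (122), (123), (130) and (131) to obtain
$$\mathsf{q} = Q + \frac{\Delta Q}{\sqrt{\nu}}(\mathsf{A} + \mathsf{A}^\dagger) , \quad \mathsf{p} = P - i\frac{\Delta P}{\sqrt{\nu}}(\mathsf{A} - \mathsf{A}^\dagger) .$$
By substituting these relations into $\mathsf{X}$ and using the commutation relation $[\mathsf{A}, \mathsf{A}^\dagger] = 1$, we obtain
$$\mathsf{X} = \mathcal{P}(\mathsf{N}) + \mathcal{Q}(\mathsf{A}, \mathsf{A}^\dagger) ,$$
where $\mathsf{N} = \mathsf{A}^\dagger\mathsf{A}$ and where, in each monomial of the polynomial $\mathcal{Q}$, the number of $\mathsf{A}$-factors is different from the number of $\mathsf{A}^\dagger$-factors. Thus,
$$\langle k|\mathsf{X}|k\rangle = \mathcal{P}(k) .$$
In Equation (139), there are, therefore, sums
$$\sum_{k=0}^{\infty} k^n R_k .$$
With Equation (137), this becomes
$$\sum_{k=0}^{\infty} k^n R_k = \frac{2}{\nu+1}\,I_n ,$$
where
$$I_n(\nu) = \sum_{k=0}^{\infty} k^n\left(\frac{\nu-1}{\nu+1}\right)^k .$$
We easily obtain
$$I_n = \left(\frac{\nu^2-1}{2}\frac{d}{d\nu}\right)^n\frac{\nu+1}{2} .$$
The desired average value is then given by
$$\langle\mathsf{X}\rangle = \frac{2}{\nu+1}\,\mathcal{P}\left(\frac{\nu^2-1}{2}\frac{d}{d\nu}\right)\frac{\nu+1}{2} . \tag{140}$$
The calculation of the polynomial $\mathcal{P}$ for a given $\mathsf{X}$ and the evaluation of the right-hand side of Equation (140) are the two steps of the promised method.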
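The two-step method can be illustrated for the simplest non-trivial case $\mathsf{X} = \mathsf{q}^2$. Substituting $\mathsf{q} = Q + (\Delta Q/\sqrt{\nu})(\mathsf{A} + \mathsf{A}^\dagger)$ and keeping only terms with equal numbers of $\mathsf{A}$ and $\mathsf{A}^\dagger$ gives $\mathcal{P}(k) = Q^2 + (\Delta Q^2/\nu)(2k+1)$; this polynomial is worked out here for illustration and is not quoted from the text. The sketch below (hypothetical helper names, truncated sums, illustrative parameter values) evaluates the weighted sum over $k$ and also compares the sums $I_n$ with the closed forms that follow from the differential-operator formula, $I_0 = (\nu+1)/2$, $I_1 = (\nu^2-1)/4$ and $I_2 = \nu(\nu^2-1)/4$.

```python
def weight(k, nu):
    """Eigenvalue R_k = 2 (nu-1)^k / (nu+1)^(k+1) of the ME packet."""
    x = (nu - 1.0) / (nu + 1.0)
    return 2.0 / (nu + 1.0) * x**k

def average(poly, nu, k_max=500):
    """Sum_k R_k * poly(k), truncated at k_max levels."""
    return sum(weight(k, nu) * poly(k) for k in range(k_max))

def I_direct(n, nu, k_max=500):
    """Direct (truncated) evaluation of I_n = sum_k k^n x^k."""
    x = (nu - 1.0) / (nu + 1.0)
    return sum(k**n * x**k for k in range(k_max))

# Illustrative values only.
Q, dQ, nu = 0.3, 1.2, 4.0

# Step 1: the polynomial P(k) for X = q^2 (derived in the lead-in).
# Step 2: the weighted sum over the diagonal representation.
mean_q2 = average(lambda k: Q**2 + dQ**2 / nu * (2 * k + 1), nu)

print(mean_q2, Q**2 + dQ**2)                  # both give <q^2>
print(I_direct(0, nu), (nu + 1.0) / 2.0)
print(I_direct(1, nu), (nu**2 - 1.0) / 4.0)
print(I_direct(2, nu), nu * (nu**2 - 1.0) / 4.0)
```

The result $\langle\mathsf{q}^2\rangle = Q^2 + \Delta Q^2$ follows because the mean occupation number is $\sum_k k R_k = (\nu-1)/2$.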

4.4.3. Quantum equations of motion

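Let the Hamiltonian of $\mathcal{S}_q$ be $\mathsf{H}$ and the unitary evolution group be $\mathsf{U}(t)$. The dynamics in the Schrödinger picture leads to the time dependence of $\mathsf{T}$:
$$\mathsf{T}(t) = \mathsf{U}(t)\,\mathsf{T}\,\mathsf{U}(t)^\dagger .$$
Substituting for $\mathsf{T}$ from Equation (133) and using a well-known property of exponential functions, we obtain
$$\mathsf{T}(t) = \frac{2}{\sqrt{\nu^2-1}}\exp\left(-\frac{1}{\hbar}\ln\frac{\nu+1}{\nu-1}\,\mathsf{U}(t)\,\mathsf{K}\,\mathsf{U}(t)^\dagger\right) .$$
In the Heisenberg picture, $\mathsf{T}$ remains constant, while $\mathsf{q}$ and $\mathsf{p}$ are time dependent and satisfy the equations
$$i\hbar\frac{d\mathsf{q}}{dt} = [\mathsf{q}, \mathsf{H}] , \quad i\hbar\frac{d\mathsf{p}}{dt} = [\mathsf{p}, \mathsf{H}] . \tag{142}$$
They are solved by
$$\mathsf{q}(t) = \mathsf{U}(t)^\dagger\,\mathsf{q}\,\mathsf{U}(t) , \quad \mathsf{p}(t) = \mathsf{U}(t)^\dagger\,\mathsf{p}\,\mathsf{U}(t) ,$$
where $\mathsf{q}$ and $\mathsf{p}$ are the initial operators, $\mathsf{q} = \mathsf{q}(0)$ and $\mathsf{p} = \mathsf{p}(0)$. The resulting operators can be written in the form of operator functions analogous to classical expressions (82) so that Equations (84) and (85) can again be used.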
The example with potential function (86) is solvable in quantum theory, too, and we can use it for comparison with the classical dynamics as well as for a better understanding of the ME packet dynamics. Equation (142) then has the solutions given by (87) and (88) with functions $f_n(t)$ and $g_n(t)$ given by (89) and (90) or (91) and (92). The calculation of the averages and variances is analogous to the classical one and we obtain Equations (93) and (95) again, with the difference that the term $2\langle qp\rangle$ on the right-hand side of (94) is now replaced by $\langle\mathsf{q}\mathsf{p} + \mathsf{p}\mathsf{q}\rangle$.
To calculate $\langle\mathsf{q}\mathsf{p} + \mathsf{p}\mathsf{q}\rangle$, we use the method introduced in the previous section. We have
$$\mathsf{q}\mathsf{p} + \mathsf{p}\mathsf{q} = 2QP + 2P\frac{\Delta Q}{\sqrt{\nu}}(\mathsf{A} + \mathsf{A}^\dagger) - 2iQ\frac{\Delta P}{\sqrt{\nu}}(\mathsf{A} - \mathsf{A}^\dagger) - 2i\frac{\Delta Q\Delta P}{\nu}(\mathsf{A}^2 - \mathsf{A}^{\dagger 2}) ,$$
hence, $\mathcal{P} = 2QP$, and
$$\langle\mathsf{q}\mathsf{p} + \mathsf{p}\mathsf{q}\rangle = 2QP .$$
The result is again Equation (95). Similarly for $\mathsf{p}$, the results are given by Equations (96) and (97).
We have shown that the averages and variances of quantum ME packets have exactly the same time evolution as those of classical ME packets in the special case of at-most-quadratic potentials. From Equations (95) and (97) we can also see an interesting fact. On the one hand, both variances must increase near t = 0 . On the other hand, the entropy must stay constant because the evolution of the quantum state is unitary. As the relation between entropy and ν is fixed for ME packets, the ME packet form is not preserved by the evolution (the entropy ceases to be maximal). This is similar for Gaussian-packet form or for coherent-state form.
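For general potentials, there are two types of corrections to the dynamics of the averages: terms containing the variances and terms containing $\hbar$. To obtain these corrections, let us calculate time derivatives for the quantum analogue of Hamiltonian (80) with potential (98). The Heisenberg-picture equations of motion give again
$$\frac{d\mathsf{q}}{dt} = \frac{1}{\mu}\mathsf{p} ,$$
so that Equation (99) is valid. The other equation,
$$i\hbar\frac{d\mathsf{p}}{dt} = [\mathsf{p}, \mathsf{H}] ,$$
can be applied iteratively as in the classical case so that all time derivatives of $\mathsf{p}$ can be obtained. Thus,
$$\frac{d\mathsf{p}}{dt} = -V_1 - V_2\mathsf{q} - \frac{V_3}{2}\mathsf{q}^2 - \frac{V_4}{6}\mathsf{q}^3 + r_5 ,$$
and
$$\frac{d^2\mathsf{p}}{dt^2} = -\frac{V_2}{\mu}\mathsf{p} - \frac{V_3}{2\mu}(\mathsf{q}\mathsf{p} + \mathsf{p}\mathsf{q}) - \frac{V_4}{6\mu}(\mathsf{q}^2\mathsf{p} + \mathsf{q}\mathsf{p}\mathsf{q} + \mathsf{p}\mathsf{q}^2) + r_5 .$$
This differs from the classical equation only by factor ordering. We can use the commutator $[\mathsf{q}, \mathsf{p}] = i\hbar$ to simplify the last term,
$$\frac{d^2\mathsf{p}}{dt^2} = -\frac{V_2}{\mu}\mathsf{p} - \frac{V_3}{2\mu}(\mathsf{q}\mathsf{p} + \mathsf{p}\mathsf{q}) - \frac{V_4}{2\mu}\mathsf{q}\mathsf{p}\mathsf{q} + r_5 .$$
Similarly,
$$\frac{d^3\mathsf{p}}{dt^3} = -\frac{V_3}{\mu^2}\mathsf{p}^2 - \frac{V_4}{\mu^2}\mathsf{p}\mathsf{q}\mathsf{p} + \frac{V_1V_2}{\mu} + \frac{V_1V_3 + V_2^2}{\mu}\mathsf{q} + \frac{3V_2V_3 + V_1V_4}{2\mu}\mathsf{q}^2 + \frac{4V_2V_4 + 3V_3^2}{6\mu}\mathsf{q}^3 + \frac{5V_3V_4}{12\mu}\mathsf{q}^4 + \frac{V_4^2}{12\mu}\mathsf{q}^5 + r_5 ,$$
and
$$\frac{d^4\mathsf{p}}{dt^4} = -\frac{V_4}{\mu^3}\mathsf{p}^3 + \frac{3V_1V_3 + V_2^2}{\mu^2}\mathsf{p} + \frac{3V_1V_4 + 5V_2V_3}{2\mu^2}(\mathsf{q}\mathsf{p} + \mathsf{p}\mathsf{q}) + \frac{5V_3^2 + 8V_2V_4}{2\mu^2}\mathsf{q}\mathsf{p}\mathsf{q} + \frac{3V_3V_4}{2\mu^2}(\mathsf{q}^3\mathsf{p} + \mathsf{p}\mathsf{q}^3) + \frac{3V_4^2}{4\mu^2}\mathsf{q}^2\mathsf{p}\mathsf{q}^2 + r_5 . \tag{146}$$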
Next, we calculate quantum averages with the help of Equation (140). The quantum averages of the monomials that are linear in one of the variables $\mathsf{q}$ or $\mathsf{p}$ can differ from their classical counterparts only by terms that are of the first order in $1/\nu$ and purely imaginary. For example,
$$\langle\mathsf{q}\mathsf{p}\rangle = QP + \frac{i\hbar}{2} ,$$
or
$$\langle\mathsf{q}^3\mathsf{p}\rangle = Q^3P + 3QP\,\Delta Q^2 + 3iQ^2\frac{\Delta Q\Delta P}{\nu} + 3i\frac{\Delta Q^3\Delta P}{\nu} .$$
These corrections clearly cancel for all symmetric factor orderings. The first term in which a second-order correction occurs is $\mathsf{q}^2\mathsf{p}^2$ and we obtain for it:
$$\langle\mathsf{p}\mathsf{q}^2\mathsf{p}\rangle = \langle q^2p^2\rangle_{\mathrm{class}} + 2\frac{\Delta Q^2\Delta P^2}{\nu^2} .$$
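Equations (143)–(146) do not contain any such terms and so their averages coincide exactly with the classical Equations (104)–(107). The terms $\mathsf{q}^2\mathsf{p}^2$ with different factor orderings occur in the fifth time derivative of $\mathsf{p}$ and have the form
$$\frac{3V_3V_4}{2\mu^2}\left[\mathsf{q}^3\mathsf{p} + \mathsf{p}\mathsf{q}^3, \frac{\mathsf{p}^2}{2\mu}\right] + \frac{V_3V_4}{2\mu^3}\left[\frac{1}{3}\mathsf{q}^3, \mathsf{p}^3\right] = i\hbar\,\frac{V_3V_4}{2\mu^3}\left(21\,\mathsf{p}\mathsf{q}^2\mathsf{p} - 11\hbar^2\right) .$$
The average of the resulting term in the fifth time derivative of $\mathsf{p}$ is
$$\frac{V_3V_4}{2\mu^3}\left(21Q^2P^2 + 21P^2\Delta Q^2 + 21Q^2\Delta P^2 + 21\Delta Q^2\Delta P^2 - \frac{\hbar^2}{2}\right) .$$
If we express $\hbar$ as $2\Delta Q\Delta P/\nu$, we can write the last two terms in the parentheses as
$$\Delta Q^2\Delta P^2\left(21 - \frac{2}{\nu^2}\right) .$$
A similar term appears in the third time derivative of $\mathsf{p}$, if we allow $V_5 \neq 0$ in the expansion (98):
$$\left[-\frac{V_5}{12\mu}(\mathsf{q}^3\mathsf{p} + \mathsf{p}\mathsf{q}^3), \frac{\mathsf{p}^2}{2\mu}\right] = i\hbar\,\frac{-V_5}{4\mu^2}\left(2\,\mathsf{p}\mathsf{q}^2\mathsf{p} + \hbar^2\right) ,$$
which contributes to $d^3P/dt^3$ by
$$-\frac{V_5}{2\mu^2}\left(\langle q^2p^2\rangle_{\mathrm{class}} + 4\frac{\Delta Q^2\Delta P^2}{\nu^2}\right) .$$
Again, the correction is of the second order in $\nu^{-1}$.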
We can conclude as follows. The quantum equations begin to differ from the classical ones only for the higher-order terms in $V$ or in the higher time derivatives, and the correction is of the second order in $1/\nu$. This seems to be very satisfactory: our quantum model reproduces the classical dynamics very well. Moreover, Equation (138) shows that Gaussian wave packets are special cases of ME packets with $\nu = 1$. Thus, they approximate classical trajectories less accurately than ME packets with large $\nu$. Of course, these results have as yet been shown only for the first four time derivatives. It would be nice if a general theorem could be proved.

4.5. Classical limit

Let us now look to see if our equations give some support to the statement that $\nu \gg 1$ is the classical regime.
The quantum partition function (129) differs from its classical counterpart (75) by the denominator $\sinh(\hbar\sqrt{\lambda_3\lambda_4})$. If
$$\hbar\sqrt{\lambda_3\lambda_4} \ll 1 , \tag{147}$$
we can write
$$\sinh\left(\hbar\sqrt{\lambda_3\lambda_4}\right) = \hbar\sqrt{\lambda_3\lambda_4}\left[1 + O\left((\hbar\sqrt{\lambda_3\lambda_4})^2\right)\right] .$$
The leading term in the partition function then is
$$Z = \frac{\pi}{h}\frac{1}{\sqrt{\lambda_3\lambda_4}}\exp\left(\frac{\lambda_1^2}{4\lambda_3} + \frac{\lambda_2^2}{4\lambda_4}\right) ,$$
where $h = 2\pi\hbar$. Comparing this with Equation (75) shows that the two expressions are identical if we set
$$v = h .$$
We can interpret this by saying that quantum mechanics gives us the value of $v$. Next, we have to express condition (147) in terms of the averages and variances. Equations (130) and (131) imply
$$\hbar\sqrt{\lambda_3\lambda_4} = \frac{1}{2}\ln\frac{\nu+1}{\nu-1} .$$
Hence, condition (147) is equivalent to
$$\nu \gg 1 .$$
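How quickly the quantum partition function approaches its classical leading term can be seen from their ratio. With $x = \hbar\sqrt{\lambda_3\lambda_4} = \frac{1}{2}\ln\frac{\nu+1}{\nu-1}$, the exponential prefactors cancel and the ratio of the quantum expression (129) to the leading term above is $x/\sinh x$. The following minimal sketch (function names are hypothetical) tabulates this ratio for a few values of $\nu$.

```python
import math

def x_of_nu(nu):
    """x = hbar * sqrt(lambda_3 * lambda_4) expressed through nu."""
    return 0.5 * math.log((nu + 1.0) / (nu - 1.0))

def quantum_to_classical_ratio(nu):
    """Ratio Z_quantum / Z_classical = x / sinh(x)."""
    x = x_of_nu(nu)
    return x / math.sinh(x)

for nu in (1.01, 2.0, 10.0, 100.0):
    print(nu, x_of_nu(nu), quantum_to_classical_ratio(nu))
```

For $\nu$ close to 1 the ratio is well below one, while for $\nu = 100$ it deviates from one by roughly $x^2/6 \approx 2\times 10^{-5}$, in line with condition (147).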
The result can be formulated as follows. Classical mechanics allows not only sharp, but also fuzzy trajectories and the comparison of some classical and quantum fuzzy trajectories shows a very good match. The fuzzy states chosen here are the so-called ME packets. Their fuzziness is described by the quantity $\nu = 2\Delta Q\Delta P/\hbar$. The entropy of an ME packet depends only on $\nu$ and is an increasing function of it. The time evolution of classical and quantum ME packets with the same initial values of averages and variances defines the averages as time functions. The larger $\nu$ is, the better the quantum and the classical evolutions of average values have been shown to agree for the first four terms in the expansion in powers of time. Thus, the classical regime is neither $\Delta Q = \Delta P = 0$ (absolutely sharp trajectory) nor $\nu = 1$ (minimum quantum uncertainty). This is the most important result of Ref. [18]. The time functions coincide for the two theories in the limit $\nu \to \infty$. Hence, in our approach, this is the classical limit. This is just the opposite of the usual assumption that the classical limit must yield variances that are as small as possible. Of course, $\nu$ can be very large and still compatible with classically negligible variances.
One also often requires that commutators of observables vanish in the classical limit. This is, however, only motivated by the assumption that all basic quantum properties are single values of observables. Within our interpretation, this assumption is replaced by the following claim: if classical observables are related to quantum operators at all, then only as average values of the operators in classicality states. Then, first, all such averages are defined by a preparation and do exist simultaneously, independently of whether the operators commute or not. For example, $Q$ and $P$ are such simultaneously existing variables for ME packets. Second, a joint measurement of fuzzy values of non-commuting observables is possible. This will be explained in Section 4.7.
It might be helpful to emphasize that the construction of models of Newtonian mechanics and the so-called semi-classical or WKB approximation to quantum mechanics are two different things. Indeed, the semi-classical approximation is a mathematical method, usually defined as the expansion in powers of $h$ in some quantum expressions [21], to calculate approximately correct values of quantum expressions in suitable applications. Equations resulting from $h \to 0$ may be similar to the corresponding classical equations. In fact, the limit $\nu \to \infty$ also results from $h \to 0$ if the variances are kept constant. The suitable applications can be more general than the above construction of models in that they, e.g., do not necessarily concern fuzzy trajectories and macroscopic systems.

4.6. A model of classical rigid body

To show how the above theory of classical properties works, we construct a one-dimensional model of a free solid body. The restrictions to one dimension and absence of external forces enable us to calculate everything explicitly: the model is completely solvable. The real object is a thin solid rod of mass $M$ and length $L$. Its classical model $\mathcal{S}_c$ is a one-dimensional continuum of the same mass and length, with mass density $M/L$, internal energy $E$, centre of mass $X$ and total momentum $P$. The classical state coordinates (see Section 4.1) are $M$, $L$, $X$, $P$ and $E$.
The construction of its quantum model $\mathcal{S}_q$ entails that, first, the structural properties of the system must be defined; second, some assumptions on the state of the system must be made; third, the quantum objective properties must be found that correspond to the classical properties $M$, $L$, $E$, $X$ and $P$. Large parts of this section follow reference [15].

4.6.1. Composition, Hamiltonian and spectrum

Assumption 5 $\mathcal{S}_q$ is an isolated linear chain of $N$ identical particles of mass $\mu$ distributed along the $x$-axis with the quantum Hamiltonian
$$\mathsf{H} = \frac{1}{2\mu}\sum_{n=1}^{N}\mathsf{p}_n^2 + \frac{\kappa^2}{2}\sum_{n=2}^{N}(\mathsf{x}_n - \mathsf{x}_{n-1} - \xi)^2 , \tag{149}$$
involving only nearest-neighbor elastic forces. Here operator $\mathsf{x}_n$ is the position and operator $\mathsf{p}_n$ the momentum of the $n$-th particle, $\kappa$ is the oscillator strength and $\xi$ the equilibrium interparticle distance.
The parameters $N$, $\mu$, $\kappa$ and $\xi$ are structural properties (determining the Hamiltonian of a closed system, see Section 2.3.2).
This kind of chain seems to be different from most chains that are studied in the literature: the positions of the chain particles are dynamical variables so that the chain can move as a whole and the invariance with respect to the Galilean group is achieved. However, the chain can still be solved by methods that are described in [88].
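First, we find the variables $\mathsf{u}_m$ and $\mathsf{q}_m$ that diagonalize the Hamiltonian and thus define the so-called normal modes. The transformation is
$$\mathsf{x}_n = \sum_{m=0}^{N-1} Y_n^m\,\mathsf{u}_m + \left(n - \frac{N+1}{2}\right)\xi , \tag{150}$$
and
$$\mathsf{p}_n = \sum_{m=0}^{N-1} Y_n^m\,\mathsf{q}_m , \tag{151}$$
where the mode index $m$ runs through $0, 1, \ldots, N-1$ and $Y_n^m$ is an orthogonal matrix; for even $m$,
$$Y_n^m = A(m)\cos\left[\frac{\pi m}{N}\left(n - \frac{N+1}{2}\right)\right] , \tag{152}$$
while for odd $m$,
$$Y_n^m = A(m)\sin\left[\frac{\pi m}{N}\left(n - \frac{N+1}{2}\right)\right] , \tag{153}$$
and the normalization factors are given by
$$A(0) = \frac{1}{\sqrt{N}} , \quad A(m) = \sqrt{\frac{2}{N}} , \quad m > 0 . \tag{154}$$
To show that $\mathsf{u}_m$ and $\mathsf{q}_m$ do represent normal modes, we substitute Equations (150) and (151) into (149) and obtain, after some calculation,
$$\mathsf{H} = \frac{1}{2\mu}\sum_{m=0}^{N-1}\mathsf{q}_m^2 + \frac{\mu}{2}\sum_{m=0}^{N-1}\omega_m^2\,\mathsf{u}_m^2 ,$$
which is indeed diagonal. The mode frequencies are
$$\omega_m = \frac{2\kappa}{\sqrt{\mu}}\sin\left(\frac{m}{N}\frac{\pi}{2}\right) .$$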
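The normal-mode transformation can be verified numerically for a small chain. The sketch below uses hypothetical values of $N$, $\kappa$ and $\mu$ and hypothetical helper names. It builds the matrix $Y_n^m$ from Equations (152)–(154), checks that it is orthogonal, and checks that it diagonalizes the quadratic form of the elastic energy with eigenvalues $\mu\omega_m^2$ (the constant offset $\xi$ drops out of this quadratic form because it is absorbed by the shift in Equation (150)).

```python
import math

def Y(n, m, N):
    """Matrix element Y_n^m for particle n = 1..N and mode m = 0..N-1."""
    arg = math.pi * m / N * (n - (N + 1) / 2.0)
    amp = math.sqrt(1.0 / N) if m == 0 else math.sqrt(2.0 / N)
    return amp * (math.cos(arg) if m % 2 == 0 else math.sin(arg))

def omega(m, N, kappa, mu):
    """Mode frequency omega_m = (2 kappa / sqrt(mu)) sin(m pi / (2 N))."""
    return 2.0 * kappa / math.sqrt(mu) * math.sin(math.pi * m / (2.0 * N))

def stiffness(N, kappa):
    """Matrix D with (kappa^2 / 2) sum_n (x_n - x_{n-1})^2 = x^T D x / 2."""
    D = [[0.0] * N for _ in range(N)]
    for n in range(1, N):
        D[n][n] += kappa**2
        D[n - 1][n - 1] += kappa**2
        D[n][n - 1] -= kappa**2
        D[n - 1][n] -= kappa**2
    return D

# Hypothetical parameter values for a short chain.
N, kappa, mu = 6, 1.3, 0.7
D = stiffness(N, kappa)

max_orth = 0.0   # largest deviation of Y^T Y from the unit matrix
max_diag = 0.0   # largest deviation of Y^T D Y from diag(mu * omega_m^2)
for m in range(N):
    for k in range(N):
        overlap = sum(Y(n, m, N) * Y(n, k, N) for n in range(1, N + 1))
        quad = sum(Y(n, m, N) * D[n - 1][j - 1] * Y(j, k, N)
                   for n in range(1, N + 1) for j in range(1, N + 1))
        target = mu * omega(m, N, kappa, mu)**2 if m == k else 0.0
        max_orth = max(max_orth, abs(overlap - (1.0 if m == k else 0.0)))
        max_diag = max(max_diag, abs(quad - target))

print(max_orth, max_diag)   # both at rounding-error level
```

Up to sign, the columns of $Y$ are the free-end chain eigenvectors $\cos[\pi m(n - \frac{1}{2})/N]$, which is why the even and odd modes appear as cosines and sines about the chain centre.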
Consider the terms with $m = 0$. We have $\omega_0 = 0$ and $Y_n^0 = 1/\sqrt{N}$. Hence,
$$\mathsf{u}_0 = \sum_{n=1}^{N}\frac{1}{\sqrt{N}}\,\mathsf{x}_n , \quad \mathsf{q}_0 = \sum_{n=1}^{N}\frac{1}{\sqrt{N}}\,\mathsf{p}_n ,$$
so that
$$\mathsf{u}_0 = \sqrt{N}\,\mathsf{X} , \quad \mathsf{q}_0 = \frac{1}{\sqrt{N}}\,\mathsf{P} ,$$
where $\mathsf{X}$ is the centre-of-mass coordinate of the chain and $\mathsf{P}$ is its total momentum. The “zero” terms in the Hamiltonian then reduce to
$$\frac{1}{2M_0}\mathsf{P}^2 \tag{156}$$
with $M_0 = N\mu$. Thus, the “zero mode” describes a straight, uniform motion of the chain as a whole. The fact that the centre-of-mass degrees of freedom decouple from the other (internal) ones is a consequence of Galilean invariance.
The other modes are harmonic oscillators called “phonons” with eigenfrequencies $\omega_m$, $m = 1, 2, \ldots, N-1$. The energy of the phonons,
$$\mathsf{E} = \mathsf{H} - \frac{1}{2M_0}\mathsf{P}^2 ,$$
is the internal energy of our system and its spectrum is built from the mode frequencies by the formula
$$E = \sum_{m=1}^{N-1}\nu_m\hbar\omega_m ,$$
where $\{\nu_m\}$ is an $(N-1)$-tuple of non-negative integers, the phonon occupation numbers.
Let us define the operator describing the mass by
$$\mathsf{M} = M_0 + \frac{\mathsf{E}}{c^2}$$
and the length of the body by
$$\mathsf{L} = \mathsf{x}_N - \mathsf{x}_1 .$$
We assume that the second term in the expression for the mass can be safely neglected in the non-relativistic regime in which we are working. The length can be expressed in terms of modes $\mathsf{u}_m$ using Equation (150),
$$\mathsf{L} = (N-1)\xi + \sum_{m=0}^{N-1}(Y_N^m - Y_1^m)\,\mathsf{u}_m .$$
The differences on the right-hand side are non-zero only for odd values of $m$, and are then equal to $-2Y_1^m$. We easily find, using Equations (153) and (154):
$$\mathsf{L} = (N-1)\xi - \sqrt{\frac{8}{N}}\sum_{m=1}^{[N/2]}(-1)^m\cos\left(\frac{2m-1}{N}\frac{\pi}{2}\right)\mathsf{u}_{2m-1} . \tag{160}$$

4.6.2. Maximum-entropy assumption

The next point is the choice of classicality states. We write the Hilbert space of $\mathcal{S}_q$ as
$$\mathbf{H} = \mathbf{H}_{\mathrm{CM}}\otimes\mathbf{H}_{\mathrm{int}} ,$$
where $\mathbf{H}_{\mathrm{CM}}$ is constructed from the wave functions $\Psi(X)$ (see Section 2.3.1) and $\mathbf{H}_{\mathrm{int}}$ has the phonon eigenstates as a basis.
Assumption 6 The classicality states have the form
$$\mathsf{T}_{\mathrm{CM}}\otimes\mathsf{T}_{\mathrm{int}} .$$
Internal state $\mathsf{T}_{\mathrm{int}}$ maximizes the entropy under the condition of fixed average of the internal energy,
$$\mathrm{Tr}\left[\mathsf{T}_{\mathrm{int}}\left(\mathsf{H} - \frac{1}{2M_0}\mathsf{P}^2\right)\right] = E .$$
The external state $\mathsf{T}_{\mathrm{CM}}$ is the ME packet for given averages $\langle X\rangle$, $\langle P\rangle$, $\Delta X$ and $\Delta P$.
Let us first focus on $\mathsf{T}_{\mathrm{int}}$. It is the state of thermodynamic equilibrium or the Gibbs state, which we denote by $\mathsf{T}_E$ (see, e.g., [37]).
The maximum of entropy does not represent an additional condition but rather the absence of any, see Section 1.0.4. This is, of course, also a condition, and its validity in an overwhelming number of real cases is an interesting problem. For $\mathsf{T}_{\mathrm{int}}$, it must have to do with the preparation (not by physicists but by nature). Physically, the thermodynamic equilibrium can settle down spontaneously starting from an arbitrary state only if some weak but non-zero interaction exists both between the phonons and between the rod and the environment. We assume that this can be arranged so that the interaction can be neglected in the calculations of the present section.
The internal energy has itself a very small relative variance in the Gibbs state if N is large. This explains why it appears to be sharp. All other classical internal properties will turn out to be functions of the classical internal energy. Hence, for the internal degrees of freedom, E forms itself a complete set of state coordinates introduced in Assumption 1. The properties of internal energy are well known and we shall not repeat the calculations here.

4.6.3. The length of the body

The mathematics associated with the maximum entropy principle is variational calculus. The condition of fixed average energy is included with the help of a Lagrange multiplier denoted by $\lambda$. It becomes a function $\lambda(E)$ for the resulting state. As is well known, $\lambda(E)$ has to do with temperature.
The phonons of one species are excitation levels of a harmonic oscillator, so we have
$$\mathsf{u}_m = \sqrt{\frac{\hbar}{2\mu\omega_m}}\,(\mathsf{a}_m + \mathsf{a}_m^\dagger) ,$$
where $\mathsf{a}_m$ is the annihilation operator for the $m$-th species. The diagonal matrix elements between the energy eigenstates $|\nu_m\rangle$ that we shall need then are
$$\langle\nu_m|\mathsf{u}_m|\nu_m\rangle = 0 , \quad \langle\nu_m|\mathsf{u}_m^2|\nu_m\rangle = \frac{\hbar}{2\mu\omega_m}(2\nu_m + 1) .$$
For our system, the phonons of each species form statistically independent subsystems, hence the average of an operator concerning only one species in the Gibbs state $\mathsf{T}_E$ of the total system equals the average in the Gibbs state for the one species. Such a Gibbs state operator for the $m$-th species has the form
$$\mathsf{T}_m = \sum_{\nu_m=0}^{\infty}|\nu_m\rangle\,p^{(m)}_{\nu_m}\,\langle\nu_m| ,$$
where
$$p^{(m)}_{\nu_m} = Z_m^{-1}\exp(-\hbar\lambda\omega_m\nu_m)$$
and $Z_m$ is the partition function for the $m$-th species
$$Z_m(\lambda) = \sum_{\nu_m=0}^{\infty}e^{-\hbar\lambda\omega_m\nu_m} = \frac{1}{1 - e^{-\hbar\lambda\omega_m}} .$$
The average length is obtained using Equation (160),
$$\langle\mathsf{L}\rangle_E = (N-1)\xi .$$
It is a function of objective properties $N$, $\xi$ and $E$.
Equation (160) is an important result. It shows that contributions to the length are more or less evenly distributed over all odd modes. Such a distribution leads to a very small variance of $\mathsf{L}$ in Gibbs states. A lengthy calculation [15] gives for large $N$
$$\frac{\Delta L}{\langle\mathsf{L}\rangle_E} \approx \frac{2\sqrt{3}}{\pi\kappa\xi\sqrt{\lambda}}\frac{1}{\sqrt{N}} .$$
Thus, the small relative variance for large $N$ does not need to be assumed from the start. The only assumptions are values of some structural properties and that an average value of energy is fixed. We have obtained even more information, viz. the internal-energy dependence of the length (in this model, the dependence is trivial). This is an objective relation that can in principle be tested by measurements.
Similar results can be obtained for further thermodynamic properties such as specific heat, elasticity coefficient, etc. (If we extend the classical model so that it contains the elasticity coefficient, we could calculate the coefficient for an extended quantum model, in which the rod would be placed into a non-homogeneous “gravitational” field described by, say, a quadratic potential. This would again give a solvable model.) All these quantities are well known to have small variances in Gibbs states. The reason is that the contributions to these quantities are evenly distributed over the normal modes and the modes are mechanically and statistically independent.

4.6.4. The bulk motion

The mechanical properties of the system are the centre of mass and the total momentum. The contributions to them are evenly distributed over all atoms, not modes: the bulk motion is mechanically and statistically independent of all other modes and so its variances will not be small in Gibbs states defined by a fixed average of the total energy. Still, generalized statistical methods of Section 4.2, Section 4.3 and Section 4.4 can be applied to it. This is done in the present subsection.
First, we assume that the real rod we are modelling cannot possess a sharp trajectory. Thus, satisfactory models of it can be ME packets in both Newtonian and quantum mechanics. Then, according to Assumption 6 and Theorem 13, the external state of the classical model can be chosen as
$$\rho = \frac{v}{2\pi}\frac{1}{\Delta X\Delta P}\exp\left[-\frac{(X - \langle X\rangle)^2}{2\Delta X^2} - \frac{(P - \langle P\rangle)^2}{2\Delta P^2}\right] .$$
(For the definition of $v$, see Section 4.3.) Similarly, Theorem 14 implies that the external state of the quantum model can be chosen as
$$\mathsf{T}_{\mathrm{CM}} = \frac{2}{\sqrt{\nu^2-1}}\exp\left(-\frac{1}{\hbar}\ln\frac{\nu+1}{\nu-1}\,\mathsf{K}\right) ,$$
where
$$\mathsf{K} = \frac{1}{2}\frac{\Delta P}{\Delta X}(\mathsf{X} - \langle X\rangle)^2 + \frac{1}{2}\frac{\Delta X}{\Delta P}(\mathsf{P} - \langle P\rangle)^2$$
and
$$\nu = \frac{2\Delta P\Delta X}{\hbar} .$$
The Hamiltonian for the bulk motion of both models is given by Equation (156). Thus, as explained in Section 4.4.3, the quantum trajectory coincides with the classical one exactly. (Recall that trajectory has been defined as the time dependence of averages and variances.)
Hopefully, this simple rod example has sufficiently illustrated how our idea of model construction works in the case of classical properties and we can finish the comparison of classical and quantum models here.

4.7. Joint measurement of position and momentum

The existence of an observable that represents a joint measurement of position and momentum plays some role in the theory of classical properties. To show it, we generalize the construction of such an observable for a simplified model that was first proposed in Ref. [89]. We follow Ref. [24]. The model system $\mathcal{S}$ is a free one-dimensional spin-zero particle with position $\mathsf{q}$ and momentum $\mathsf{p}$. The Hilbert space is $L^2(\mathbb{R})$ and the operators are defined by equations analogous to (31) and (32).
Operators $\mathsf{q}$ and $\mathsf{p}$ have an invariant common domain and their commutator is easily calculated to be
$$[\mathsf{q}, \mathsf{p}] = i\hbar .$$
Hence, the joint measurement may be a problem.
The general construction of a non-trivial POV measure for system $\mathcal{S}$ introduces another system, an ancilla, that forms a composite system with $\mathcal{S}$. Let our ancilla $\mathcal{A}$ be a similar particle with position $\mathsf{Q}$ and momentum $\mathsf{P}$. We work in the $Q$-representation so that the Hilbert space of the composite system $\mathcal{S} + \mathcal{A}$ is $L^2(\mathbb{R})\otimes L^2(\mathbb{R})$, which can be identified with $L^2(\mathbb{R}^2)$. Then, we have wave functions $\Psi(q, Q)$ and integral operators with kernels of the form $K(q, Q; q', Q')$.
The dynamical variables $\mathsf{A} = \mathsf{q} - \mathsf{Q}$ and $\mathsf{B} = \mathsf{p} + \mathsf{P}$ of the composite system $\mathcal{S} + \mathcal{A}$ commute and can therefore be measured jointly. The value space of the PV observable $\mathsf{E}_{AB}$ is $\mathbb{R}^2$ with coordinates $a$ and $b$ (see end of Section 2.2.3).
The next step is to smear $\mathsf{E}_{AB}$ to obtain a realistic POV measure $\mathsf{E}_{kl}$, where $k$ and $l$ are integers. Let us divide the $ab$ plane into disjoint rectangular cells $X_{kl} = [a_k, a_{k+1}]\times[b_l, b_{l+1}]$ covering the entire plane. Each cell is centred at $(\bar{a}_k, \bar{b}_l)$, $\bar{a}_k = (a_{k+1} + a_k)/2$, $\bar{b}_l = (b_{l+1} + b_l)/2$, and $S_{kl} = (a_{k+1} - a_k)(b_{l+1} - b_l)$ is its area. Then,
$$\mathsf{E}_{kl} = \mathsf{E}_{AB}(X_{kl}) = \mathsf{E}_A([a_k, a_{k+1}])\,\mathsf{E}_B([b_l, b_{l+1}]) .$$
The cells can be arbitrarily small.
The probability to obtain the outcome $\{k, l\}$ in state $\bar{\mathsf{T}}$ of the composite system is
$$\mathrm{tr}[\mathsf{E}_{kl}\bar{\mathsf{T}}] = \int dq\,dQ\,dq'\,dQ'\;E_{kl}(q, Q; q', Q')\,\bar{T}(q', Q'; q, Q) . \tag{168}$$
We assume that the composite system $\mathcal{S} + \mathcal{A}$ is prepared in a factorized state
$$\bar{\mathsf{T}} = \mathsf{T}_{\mathcal{S}}\otimes\mathsf{T}_{\mathcal{A}}$$
and express the probability (168) in terms of the state $\mathsf{T}_{\mathcal{S}}$. The action of the projections $\mathsf{E}_A([a_k, a_{k+1}])$ and $\mathsf{E}_B([b_l, b_{l+1}])$ on vector states of the form $\Psi(q, Q)$ is
$$\mathsf{E}_A([a_k, a_{k+1}])\,\Psi(q, Q) = \chi[a_k, a_{k+1}](q - Q)\,\Psi(q, Q) ,$$
where
χ [ a k , a k + 1 ] ( x ) = 1 ∀</