# Thermodynamics as Control Theory

*Keywords:* thermodynamics; Landauer; control theory

Balliol College, Oxford OX1 3BJ, UK

Received: 23 October 2013 / Revised: 18 November 2013 / Accepted: 17 December 2013 / Published: 24 January 2014

(This article belongs to the Special Issue Maxwell’s Demon 2013)

I explore the reduction of thermodynamics to statistical mechanics by treating the former as a control theory: A theory of which transitions between states can be induced on a system (assumed to obey some known underlying dynamics) by means of operations from a fixed list. I recover the results of standard thermodynamics in this framework on the assumption that the available operations do not include measurements which affect subsequent choices of operations. I then relax this assumption and use the framework to consider the vexed questions of Maxwell’s demon and Landauer’s principle. Throughout, I assume rather than prove the basic irreversibility features of statistical mechanics, taking care to distinguish them from the conceptually distinct assumptions of thermodynamics proper.

Thermodynamics is misnamed. The name implies that it stands alongside the panoply of other “X-dynamics” theories in physics: Classical dynamics, quantum dynamics, electrodynamics, hydrodynamics, chromodynamics and so forth [1]. But what makes these theories dynamical is that they tell us how systems of a certain kind—classical or quantum systems in the abstract, or charged matter and fields, or fluids, or quarks and gluons, or whatever—evolve if left to themselves. The paradigm of a dynamical theory is a state space, giving us the possible states of the system in question at an instant, and a dynamical equation, giving us a trajectory (or, perhaps, a family of trajectories indexed by probabilities) through each state that tells us how that state will evolve under the dynamics.

Thermodynamics basically delivers on the state space part of the recipe: Its state space is the space of systems at equilibrium. But it is not in the business of telling us how those equilibrium states evolve if left to themselves, except in the trivial sense that they do not evolve at all: That is what equilibrium means, after all. When the states of thermodynamical systems change, it is because we do things to them: We put them in thermal contact with other systems, we insert or remove partitions, we squeeze or stretch or shake or stir them. And the laws of thermodynamics are not dynamical laws like Newton’s: They concern what we can and cannot bring about through these various interventions.

There is a general name for the study of how a system can be manipulated through external intervention: Control theory. Here again a system is characterised by its possible states, but instead of a dynamics being specified once and for all, a range of possible control actions is given. The name of the game is to investigate, for a given set of possible control actions, the extent to which the system can be controlled: That is, the extent to which it can be induced to transition from one specified state to another. The range of available transitions will be dependent on the forms of control available; the more liberal a notion of control, the more freedom we would expect to have to induce arbitrary transitions.

This conception of thermodynamics is perfectly applicable to the theory understood phenomenologically: That is, without any consideration of its microphysical foundations. However, my purpose in this paper is instead to use the control-theory paradigm to explicate the relation between thermodynamics and statistical mechanics. That is: I will begin by assuming the main results of non-equilibrium statistical mechanics and then consider what forms of control theory they can underpin. In doing so I hope to clarify both the control-theory perspective itself and the reduction of thermodynamics to statistical mechanics, as well as providing some new ways to get insight into some puzzles in the literature: Notably, those surrounding Maxwell’s Demon and Landauer’s Principle.

In Sections 2 and 3, I review the core results of statistical mechanics (making no attempt to justify them). In Sections 4 and 5 I introduce the general idea of a control theory and describe two simple examples: Adiabatic manipulation of a system and the placing of systems in and out of thermal contact. In Sections 6–8, I apply these ideas to construct a general account of classical thermodynamics as a control theory, and demonstrate that a rather minimal form of thermodynamics possesses the full control strength of much more general theories; I also explicate the notion of a one-molecule gas from the control-theoretic (and statistical-mechanical) perspective. In the remainder of the paper, I extend the notion of control theory to include systems with feedback, and demonstrate in what senses this does and does not increase the scope of thermodynamics.

I develop the quantum and classical versions of the theory in parallel, and fairly deliberately flit between quantum and classical examples. When I use classical examples, in each case (I believe) the discussion transfers straightforwardly to the quantum case unless noted otherwise. The same is probably true in the other direction; if not, no matter, given that classical mechanics is of (non-historical) interest in statistical physics only insofar as it offers a good approximation to quantum mechanics.

Statistical mechanics, as I will understand it in this paper, is a theory of dynamics in the conventional sense: It is in the business of specifying how a given system will evolve spontaneously. For the sake of definiteness, I lay out here exactly what I assume to be delivered by statistical mechanics.

- (1)
The systems are classical or quantum systems, characterised inter alia by a classical phase space or quantum-mechanical Hilbert space, and by a Hamiltonian H[V_{I}] which may depend on one or more external parameters V_{I} (in the paradigm case of a gas in a box, the parameter is volume). In the quantum case I assume the spectrum of the Hamiltonian to be discrete; in either case I assume that the possible values of the parameters comprise a connected subset of R^{N} and that the Hamiltonian depends smoothly on them.

- (2)
The states are probability distributions over phase space, or mixed states in Hilbert space. (Here I adopt what is sometimes called a Gibbsian approach to statistical mechanics; in [2], I defend the claim that this is compatible with a view of statistical mechanics as entirely objective.) Even in the classical case the interpretation of these probabilities is controversial; sometimes they are treated as quantifying an agent’s state of knowledge, sometimes as being an objective feature of the system; my own view is that the latter is correct (and that the probabilities are a classical limit of quantum probabilities; cf [3]). In the quantum case the interpretation of the mixed states merges into the quantum measurement problem, an issue I explore further in [4]. For the most part, though, the results of this paper are independent of the interpretation of the states.

- (3)
Given two systems, their composite is specified by the Cartesian product of the phase spaces (classical case) or by the tensor product of the Hilbert spaces (quantum case), and by the sum of the Hamiltonians (either case).

- (4)
The Gibbs entropy is a real function of the state, defined in the classical case as

$${S}_{G}(\rho )=-\int \text{d}x\hspace{0.17em}\rho (x)\hspace{0.17em}\text{ln}\hspace{0.17em}\rho (x)$$

and in the quantum case as

$${S}_{G}(\rho )=-\text{Tr}(\rho \hspace{0.17em}\text{ln}\hspace{0.17em}\rho ).$$

- (5)
The dynamics are given by some flow on the space of states. In Hamiltonian dynamics, or unitary quantum mechanics, this would be the flow generated by Hamilton’s equation or the Schrödinger equation from the Hamiltonian H[V_{I}], under which the Gibbs entropy is a constant of the motion; in statistical mechanics, however, we assume only that the flow (a) is entropy-non-decreasing, and (b) conserves energy, in the sense that the probability given by the state to any given energy is invariant under the flow.

- (6)
For any given system there is some time, the equilibration timescale, after which the system has evolved to that state which maximises the Gibbs entropy subject to the conservation constraint above [5].
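These assumptions can be made concrete in a discrete toy setting. The sketch below is my own illustration, not part of the original argument: it computes the discrete Gibbs entropy and checks that, as the equilibration assumption requires, the uniform distribution over the accessible microstates maximises it.

```python
import math

def gibbs_entropy(probs):
    """Discrete Gibbs entropy S_G = -sum_i p_i ln p_i (p_i = 0 terms contribute 0)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# A non-equilibrium distribution over four microstates of equal energy...
rho_noneq = [0.7, 0.1, 0.1, 0.1]
# ...and the uniform (maximum-entropy) distribution over the same states.
rho_eq = [0.25] * 4

assert gibbs_entropy(rho_eq) > gibbs_entropy(rho_noneq)
# The maximum value is ln(number of accessible states):
assert abs(gibbs_entropy(rho_eq) - math.log(4)) < 1e-12
```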

Now, to be sure, it is controversial at best how statistical mechanics delivers all this. In particular, we have good reason to suppose that isolated (classical or quantum) systems ought really to evolve by Hamiltonian or unitary dynamics, according to which the Gibbs entropy is constant and equilibrium is never achieved; more generally, the statistical-mechanical recipe I give here is explicitly time-reversal-noninvariant, whereas the underlying dynamics of the systems in question have a time reversal symmetry.

There are a variety of responses to offer to this problem, among them:

Perhaps no system can be treated as isolated, and interaction with an external environment somehow makes the dynamics of any realistic system non-Hamiltonian.

Perhaps the probability distribution (or mixed state) needs to be understood not as a property of the physical system but as somehow tracking our ignorance about the system’s true state, and the increase in Gibbs entropy represents an increase in our level of ignorance.

Perhaps the true dynamics is not, after all, Hamiltonian, but incorporates some time-asymmetric correction.

My own preferred solution to the problem (and the one that I believe most naturally incorporates the insights of the “Boltzmannian” approach to statistical mechanics) is that the state ρ should not be interpreted as the true probability distribution over microstates, but as a coarse-grained version of it, correctly predicting the probabilities relevant to any macroscopically manageable process but not correctly tracking the fine details of the microdynamics, and that the true signature of statistical mechanics is the possibility of defining (in appropriate regimes, under appropriate conditions, and for appropriate timescales) autonomous dynamics for this coarse-grained distribution that abstract away from the fine-grained details. The time asymmetry of the theory, on this view, arises from a time asymmetry in the assumptions that have to be made to justify that coarse-graining.

But from the point of view of understanding the reduction of thermodynamics to statistical mechanics, all this is beside the point. The most important thing to realise about the statistical-mechanical results I give above is that manifestly they are correct: The entire edifice of statistical mechanics (a) rests upon them; and (b) is abundantly supported by empirical data. (See [6] for more on this point.) There is a foundational division of labour here: the question of how this machinery is justified given the underlying mechanics is profoundly important, but it can be distinguished from the question of how thermodynamics relates to statistical mechanics. Statistical mechanics is a thoroughly successful discipline in its own right, and not merely a foundational project to shore up thermodynamics.

The “state which maximises the Gibbs entropy” can be evaluated explicitly. If the initial state ρ has a definite energy U, it will evolve to the distribution with the largest Gibbs entropy for that energy, and it is easy to see that (up to normalisation) in the classical case this is the uniform distribution on the hypersurface H[V_{I}](x) = U, and that in the quantum case it is the projection onto the eigensubspace of Ĥ[V_{I}] with energy U. Writing ρ_{U} to denote this state, it follows that in general the equilibrium state achieved by a general initial ρ will be that statistical mixture of ρ_{U} that gives the same probability to each energy as ρ did. In the classical case this is

$$\rho \to \int \text{d}U\hspace{0.17em}\text{Pr}(U){\rho}_{U}$$

where

$$\text{Pr}(U)=\int \rho \hspace{0.17em}\delta (H-U);$$

in the quantum case it is

$$\rho \to \sum _{i}\text{Pr}({U}_{i}){\rho}_{{U}_{i}}$$
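As a hedged numerical sketch of this equilibration map, in a toy quantum-style model of my own (two energy shells, of dimensions 2 and 3): equilibration replaces the state by the mixture of shell-uniform states with the same energy probabilities, conserving each Pr(U_i) while not decreasing the Gibbs entropy.

```python
import math

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

# Microstates grouped into energy shells: shell 0 = {0, 1}, shell 1 = {2, 3, 4}.
shells = [[0, 1], [2, 3, 4]]

# An arbitrary non-equilibrium state over the five microstates.
rho = [0.5, 0.1, 0.3, 0.05, 0.05]

# Equilibration: within each shell, spread that shell's total probability
# Pr(U_i) uniformly over the shell's microstates.
rho_eq = [0.0] * 5
for shell in shells:
    pr = sum(rho[i] for i in shell)
    for i in shell:
        rho_eq[i] = pr / len(shell)

# The probability of each energy is conserved...
for shell in shells:
    assert abs(sum(rho[i] for i in shell) - sum(rho_eq[i] for i in shell)) < 1e-12
# ...and the Gibbs entropy does not decrease.
assert entropy(rho_eq) >= entropy(rho)
```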

We can define the density of states $\mathcal{V}$(U) at energy U for a given Hamiltonian H in the classical case as follows: We take $\mathcal{V}$(U)δU to be the phase-space volume of states with energies between U and U + δU. We can use the density of states to write the Gibbs entropy of a generalised equilibrium state explicitly as

$${S}_{G}(\rho )=\int \text{d}U\hspace{0.17em}\text{Pr}(U)\hspace{0.17em}\text{ln}\hspace{0.17em}\mathcal{V}(U)+\left(-\int \text{d}U\hspace{0.17em}\text{Pr}(U)\hspace{0.17em}\text{ln}\hspace{0.17em}\text{Pr}(U)\right).$$

$${S}_{G}(\rho )=\sum _{i}\text{Pr}({U}_{i})\hspace{0.17em}\text{ln}\hspace{0.17em}(\text{Dim}\hspace{0.17em}{U}_{i})+\left(-\sum _{i}\text{Pr}({U}_{i})\hspace{0.17em}\text{ln}\hspace{0.17em}\text{Pr}({U}_{i})\right)$$

Now, suppose that the effective spread ΔU over energies of a generalised equilibrium state around its expected energy U_{0} is narrow enough that the Gibbs entropy can be accurately approximated simply as the logarithm of $\mathcal{V}$(U_{0}). States of this kind are called microcanonical equilibrium states, or microcanonical distributions (though the term is sometimes reserved for the ideal limit, where Pr(U) is a delta function at U_{0}, so that ρ(x) = (1/$\mathcal{V}$(U_{0}))δ(H(x) − U_{0})). A generalised equilibrium state can usefully be thought of as a statistical mixture of microcanonical distributions.
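The decomposition above (the average microcanonical entropy plus the entropy of the energy distribution itself) can be checked directly in the discrete case; the numbers below are illustrative choices of my own.

```python
import math

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

# Shell dimensions Dim(U_i) and energy probabilities Pr(U_i).
dims = [2, 3]
prs = [0.6, 0.4]

# The generalised equilibrium state is uniform within each shell.
rho = [prs[0] / dims[0]] * dims[0] + [prs[1] / dims[1]] * dims[1]

# S_G(rho) = sum_i Pr(U_i) ln Dim(U_i) + ( -sum_i Pr(U_i) ln Pr(U_i) ).
lhs = entropy(rho)
rhs = sum(p * math.log(d) for p, d in zip(prs, dims)) + entropy(prs)
assert abs(lhs - rhs) < 1e-12
```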

If ρ is a microcanonical ensemble with respect to H[V_{I}] for particular values of the parameters V_{I}, in general it will not be even a generalised equilibrium state for different values of those parameters. However, if close-spaced eigenvalues of the Hamiltonian remain close-spaced even when the parameters are changed, ρ will equilibrate into the microcanonical distribution. In this case, I will say that the system is parameter-stable; I will assume parameter stability for most of the systems I discuss.

A microcanonical distribution is completely characterised (up to details of the precise energy width δU and the spread over that width) by its energy U and the external parameters V_{I}. On the assumption that $\mathcal{V}$(U) is monotonically increasing with U for any values of the parameters (and, in the quantum case, that the system is large enough that we can approximate $\mathcal{V}$(U) as continuous) we can invert this and regard U as a function of Gibbs entropy S and the parameters. This function is (one form of) the equation of state of the system: For the ideal monatomic gas with N mass-m particles, for instance, we can readily calculate that

$$\mathcal{V}(U,V)\propto {V}^{N}{(2mU)}^{3N/2-1}$$

$$S\simeq {S}_{0}+N\hspace{0.17em}\text{ln}\hspace{0.17em}V+(3N/2)\hspace{0.17em}\text{ln}\hspace{0.17em}U,$$

The microcanonical temperature is then defined as

$$T={\left(\frac{\partial U}{\partial S}\right)}_{{V}_{I}}$$
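For the ideal monatomic gas this definition can be evaluated in closed form: inverting S ≃ S_0 + N ln V + (3N/2) ln U gives T = 2U/3N, i.e. U = (3/2)NT in units with k_B = 1. A quick finite-difference check, with illustrative values of my own:

```python
import math

N, S0 = 10.0, 0.0  # particle number and entropy constant (illustrative)

def U_of(S, V):
    # Invert S = S_0 + N ln V + (3N/2) ln U for the ideal monatomic gas.
    return math.exp((S - S0 - N * math.log(V)) * 2.0 / (3.0 * N))

S, V, h = 5.0, 2.0, 1e-6
# Microcanonical temperature T = (dU/dS)_V by central difference.
T = (U_of(S + h, V) - U_of(S - h, V)) / (2 * h)

U = U_of(S, V)
assert abs(T - 2 * U / (3 * N)) < 1e-6  # T = 2U/3N, i.e. U = (3/2) N T
```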

At the risk of repetition, it is not (or should not be!) controversial that these probability distributions are empirically correct as regards predictions of measurements made on equilibrated systems, both in terms of statistical averages and of fluctuations around those averages. It is an important and urgent question why they are correct, but it is not our question.

Given this understanding of statistical mechanics, we can proceed to the control theory of systems governed by it. We will develop several different control theories, but each will have the same general form, being specified by:

A controlled object, the physical system being controlled.

A set of control operations that can be performed on the controlled object.

A set of feedback measurements that can be made on the controlled object.

A set of control processes, which are sequences of control operations and feedback measurements, possibly subject to additional constraints and where the control operation performed at a given point may depend on the outcomes of feedback measurements made before that point.

Our goal is to understand the range of transitions between states of the controlled object that can be induced. In this section and the next I develop two extremely basic control theories intended to serve as components for thermodynamics proper in Section 6.

The first such theory, adiabatic control theory, is specified as follows:

The controlled object is a statistical-mechanical system which is parameter-stable and initially at microcanonical equilibrium.

The control operations consist of (a) smooth modifications to the external parameters of the controlled object over some finite interval of time; (b) leaving the controlled object alone for a time long compared to its equilibration timescale.

There are no feedback measurements: The control operations are applied without any feedback as to the results of previous operations.

The control processes are sequences of control operations ending with a leave-alone operation.

Because of parameter stability, the end state is guaranteed to be not just at generalised equilibrium but at microcanonical equilibrium. The control processes therefore consist of moving the system’s state around in the space of microcanonical equilibrium states. Since for any value of the parameters the controlled object’s evolution is entropy-nondecreasing, one result is immediate: The only possible transitions are between states x, y with S_{G}(y) ≥ S_{G}(x). The remaining question is: Which such transitions are possible?

To answer this, consider the following special control processes: A process is quasi-static if any variations of the external parameters are carried out so slowly that the systems can be approximated to any desired degree of accuracy as being at or extremely close to equilibrium throughout the process.

A crucial feature of quasi-static processes is that the increase in Gibbs entropy in such a process is extremely small, tending to zero as the length of the process tends to infinity. To see this [7], suppose for simplicity that there is only one external parameter whose value at time t is V(t). If the expected energy of the state at time t is U(t), there will be a unique microcanonical equilibrium state ρ_{eq}[U(t), V(t)] for each time determined by the values U(t) and V(t) of the expected energy and the parameter at that time. The full state ρ(t) at that time can be written as

$$\rho (t)={\rho}_{eq}[U(t),V(t)]+\delta \rho (t),$$

The system’s dynamics is determined by some equation of the form

$$\dot{\rho}(t)=L[V(t)]\rho (t),$$

and since the instantaneous equilibrium state is invariant under the flow at fixed parameter values, L[V(t)]ρ_{eq}[U(t), V(t)] = 0, so that

$$\dot{\rho}(t)=L[V(t)]\delta \rho (t);$$

if the whole process takes place over a long time T, the deviation from equilibrium remains of order

$$\delta \rho \sim \Delta \rho /T,$$

where Δρ is the overall change of state across the process, so that δρ → 0 in the quasi-static limit.

Now the rate of change of Gibbs entropy in such a process is given by

$$\dot{S}(t)={\frac{\delta S}{\delta \rho}|}_{\rho (t)}\cdot \dot{\rho}(t)={\frac{\delta S}{\delta \rho}|}_{\rho (t)}\cdot L[V(t)]\delta \rho (t)$$

$$\dot{S}(t)=\left({\frac{\delta S}{\delta \rho}|}_{{\rho}_{eq}(t)}+{\frac{{\delta}^{2}S}{\delta {\rho}^{2}}|}_{{\rho}_{eq}(t)}\cdot \delta \rho (t)+o(\delta {\rho}^{2})\right)\cdot L[V(t)]\delta \rho (t).$$

But since ρ_{eq} maximises Gibbs entropy for given expected energy, and since the time evolution operator L[V] conserves expected energy,

$${\frac{\delta S}{\delta \rho}|}_{{\rho}_{eq}(t)}\cdot L[V(t)]\delta \rho (t)=0,$$

so that Ṡ(t) is of second order in δρ: the entropy production rate scales as 1/T², and the total entropy increase over the whole process scales as 1/T, vanishing in the quasi-static limit.

(To see intuitively what is going on here, consider a very small change V → V + δV made suddenly to a system initially at equilibrium. The sudden change leaves the state, and hence the Gibbs entropy, unchanged. The system then regresses to equilibrium on a trajectory of constant expected energy. But since the change is very small, and since the equilibrium state is an extremal state of entropy on the constant-expected-energy surface, to first order in δV the change in entropy in this part of the process is also zero.)

To summarise: Quasi-static adiabatic processes are isentropic: They do not induce changes in system entropy. What about non-quasi-static adiabatic processes? Well, if at any point in the process the system is not at (or very close to) equilibrium, by the baseline assumptions of statistical mechanics it follows that its entropy will increase as it evolves. So an adiabatic control process is isentropic if quasi-static, entropy-increasing otherwise.
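The contrast can be illustrated in a toy quantum model of my own devising: in a quasi-static process the occupation probabilities of the energy eigenstates are carried along unchanged, while a sudden change scrambles them by a doubly stochastic map, which cannot decrease the Gibbs entropy.

```python
import math

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

# Occupation probabilities of three energy eigenstates (illustrative).
p = [0.8, 0.15, 0.05]

# Quasi-static change: each eigenstate is carried into the corresponding
# eigenstate of the new Hamiltonian; probabilities, and entropy, are unchanged.
p_slow = list(p)
assert abs(entropy(p_slow) - entropy(p)) < 1e-12

# Sudden change: old eigenstates overlap several new ones, and the induced map
# on probabilities is doubly stochastic (matrix chosen for illustration only).
M = [[0.6, 0.3, 0.1],
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.6]]
p_fast = [sum(M[i][j] * p[j] for j in range(3)) for i in range(3)]

assert abs(sum(p_fast) - 1) < 1e-12
assert entropy(p_fast) > entropy(p)  # non-quasi-static: entropy increases
```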

In at least some cases, the result that quasi-static adiabatic processes are isentropic does not rely on any explicit equilibration assumption. To be specific: If the Hamiltonian has the form

$$\widehat{H}[{V}_{I}]=\sum _{i}{U}_{i}({V}_{I})|{\psi}_{i}({V}_{I})\rangle \hspace{0.17em}\langle {\psi}_{i}({V}_{I})|$$

then (under the gap conditions of the quantum adiabatic theorem) a sufficiently slow change of the parameters carries each eigenstate |ψ_{i}(V_{I})⟩ into the corresponding eigenstate for the new parameter values; the occupation probabilities, and hence the Gibbs entropy, are then unchanged without any appeal to equilibration.

In any case, we now have a complete solution to the control problem. By quasi-static processes we can move the controlled object’s state around arbitrarily on a given constant-entropy hypersurface; by applying a non-quasi-static process we can move it from one such hypersurface to a higher-entropy hypersurface. So the condition that the final state’s entropy is not lower than the initial state’s is sufficient as well as necessary: Adiabatic control theory allows a transition between equilibrium states iff it is entropy-nondecreasing.

A little terminology: The work done on the controlled object under a given adiabatic control process is just the change in its energy, and is thus the same for any two control processes that induce the same transition, and it has an obvious physical interpretation: The work done is the energy cost of inducing the transition by any physical implementation of the control theory. (In phenomenological treatments of thermodynamics it is usual to assume some independent understanding of “work done”, so that the observation that adiabatic transitions from x to y require the same amount of work however they are performed becomes contentful, and is one form of the First Law of Thermodynamics; from our perspective, though, it is just an application of conservation of energy.)

Following the conventions of thermodynamics, we write đW for a very small quantity of work done during some part of a quasi-static control process. We have

$$\text{đ}W=\text{d}U{|}_{\delta S=0}=\sum _{I}{\left(\frac{\partial U}{\partial {V}_{I}}\right)}_{{V}_{J},S}\text{d}{V}_{I}\equiv -\sum _{I}{P}^{I}\text{d}{V}_{I}$$

Our second control theory, thermal contact theory, is again intended largely as a tool for the development of more interesting theories. To develop it, suppose that we have two systems initially dynamically isolated from one another, and that we introduce a weak interaction Hamiltonian between the two systems. Doing so, to a good approximation, will leave the internal dynamics of each system largely unchanged but will allow energy to be transferred between the systems. Given our statistical-mechanical assumptions, this will cause the two systems (which are now one system with two almost-but-not-quite-isolated parts) to proceed, on some timescale, to a joint equilibrium state. When two systems are coupled in this way, we say that they are in thermal contact. Given our assumption that the interaction Hamiltonian is small, we will assume that the equilibration timescales of each system separately are very short compared to the joint equilibration timescale, so that the interaction is always between systems which separately have states extremely close to the equilibrium state.

The result of this joint equilibration can be calculated explicitly. If two systems each confined to a narrow energy band are allowed to jointly equilibrate, the energies of one or other may end up spread across a wide range. For instance, if one system consists of a single atom initially with a definite energy E and it is brought in contact with a system of a great many such atoms, its post-equilibration energy distribution will be spread across a large number of states. However, for the most part we will assume that the microcanonical systems we consider are not induced to transition out of microcanonical equilibrium as a consequence of joint equilibration; systems with this property I call thermally stable.

There is a well-known result that characterises systems that equilibrate with thermally stable systems which is worth rehearsing here. Suppose two systems have density-of-state functions $\mathcal{V}$_{1}, $\mathcal{V}$_{2} and are initially in microcanonical equilibrium with total energy U. The probability of the two systems having energies U_{1}, U_{2} is then

$$\text{Pr}({U}_{1},{U}_{2})\propto {\mathcal{V}}_{1}({U}_{1}){\mathcal{V}}_{2}({U}_{2})\delta ({U}_{1}+{U}_{2}-U)$$

$$\text{Pr}({U}_{1})\propto {\mathcal{V}}_{1}({U}_{1}){\mathcal{V}}_{2}(U-{U}_{1}).$$

Assuming that the second system is thermally stable, we express the second term on the right hand side in terms of its Gibbs entropy and expand to first order around U (the assumption that the second system’s energy distribution is narrow tells us that higher terms in the expansion will be negligible):

$${\mathcal{V}}_{2}(U-{U}_{1})=\text{exp}({S}_{2}(U-{U}_{1}))\simeq \text{exp}\hspace{0.17em}\left\{{S}_{2}(U)-{\left(\frac{\partial {S}_{2}}{\partial U}\right)}_{{V}_{I}}{U}_{1}\right\}.$$

Since the partial derivative here is just the inverse of the microcanonical temperature T of the second system, the conclusion is that

$$\text{Pr}({U}_{1})\propto {\mathcal{V}}_{1}({U}_{1}){\text{e}}^{-{U}_{1}/T},$$
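This derivation can be checked numerically in a toy model of my own: a small system (with density of states V₁(U₁) = U₁², chosen arbitrarily) in contact with a large bath whose density of states is V₂(U) = U^k. The exact microcanonical weight V₁(U₁)V₂(U − U₁) agrees with the Boltzmann form V₁(U₁)e^(−U₁/T) whenever U₁ is small compared with the total energy; the computation is done in logarithms to avoid overflow.

```python
import math

k = 1000.0   # bath exponent: density of states V2(U) = U**k
U = 1000.0   # total energy
T = U / k    # bath temperature: 1/T = dS2/dU = k/U

def log_V1(u):
    return 2 * math.log(u)  # small system: V1(u) = u**2 (illustrative)

for U1 in [0.5, 1.0, 2.0, 4.0]:
    log_exact = log_V1(U1) + k * math.log(U - U1)      # ln[V1(U1) V2(U - U1)]
    log_boltz = log_V1(U1) + k * math.log(U) - U1 / T  # ln[V1(U1) V2(U) e^(-U1/T)]
    # Up to normalisation the two agree closely in the U1 << U regime.
    assert abs(math.exp(log_exact - log_boltz) - 1) < 0.01
```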

In any case, so long as we assume thermal stability then systems placed into thermal contact may be treated as remaining separately at equilibrium as they evolve towards a joint state of higher entropy.

We can now state thermal contact theory:

The controlled object is a fixed, finite collection of mutually isolated thermally stable statistical mechanical systems.

The available control operations are (i) placing two systems in thermal contact; (ii) breaking thermal contact between two systems; (iii) waiting for some period of time.

There are no feedback measurements.

The control processes are arbitrary sequences of control operations.

Given the previous discussion, thermal contact theory shares with adiabatic control theory the feature of inducing transitions between systems at equilibrium, and we can characterise the evolution of the systems during the control process entirely in terms of the energy flow between systems. The energy flow between two bodies in thermal contact is called heat. (A reminder: Strictly speaking, the actual amount of heat flow is a probabilistic quantity very sharply peaked around a certain value.)

The quantitative rate of heat flow between two systems in thermal contact will of course depend inter alia on the precise details of the coupling Hamiltonian between the two systems. But in fact the direction of heat flow is independent of these details. For the total entropy change (in either the microcanonical or canonical framework) when a small quantity of heat đQ flows from system A to system B is

$$\delta S=\delta {S}_{A}+\delta {S}_{B}=\left\{-{\left(\frac{\partial {S}_{A}}{\partial {U}_{A}}\right)}_{{V}_{I}}+{\left(\frac{\partial {S}_{B}}{\partial {U}_{B}}\right)}_{{V}_{I}}\right\}\text{đ}Q.$$

But since the thermodynamical temperature T is just the rate of change of energy with entropy while external parameters are held constant, this can be rewritten as

$$\delta S=(1/{T}_{B}-1/{T}_{A})\text{đ}Q.$$

So heat will flow from A to B only if the inverse thermodynamical temperature of A is lower than that of B. In most cases (there are exotic counter-examples, notably in quantum systems with bounded energy) thermodynamical temperature is positive, so that this can be restated as: Heat will flow from A to B only if the thermodynamical temperature of A is greater than that of B. For simplicity I confine attention to this case.
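The sign argument can be watched in action by simulating two bodies with constant unit heat capacities exchanging small quanta of heat (a toy model of my own): every step produces δS = (1/T_B − 1/T_A)đQ > 0, the flow stops when the temperatures meet, and the accumulated entropy matches the closed-form C ln(T_f/T_i) bookkeeping.

```python
import math

TA, TB = 400.0, 300.0   # initial temperatures: A is the hotter body
CA, CB = 1.0, 1.0       # constant heat capacities (units with k_B = 1)
dQ = 0.01               # small quantum of heat transferred per step
S_total = 0.0

while TA - TB > 1e-6:
    # Heat dQ flows from the hotter body A to the colder body B,
    # producing entropy (1/T_B - 1/T_A) dQ > 0.
    dS = (1.0 / TB - 1.0 / TA) * dQ
    assert dS > 0
    S_total += dS
    TA -= dQ / CA
    TB += dQ / CB

assert abs(TA - TB) < 1e-3  # thermal equilibrium: equal temperatures
# Compare with the exact result: Delta S = C_A ln(T_f/T_A0) + C_B ln(T_f/T_B0).
exact = CA * math.log(350.0 / 400.0) + CB * math.log(350.0 / 300.0)
assert abs(S_total - exact) < 1e-3
```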

If we define two systems as being in thermal equilibrium when placing them in thermal contact does not lead to any heat flow between them, then we have the following thermodynamical results:

- (1)
Two systems each in thermal equilibrium with a third system are at thermal equilibrium with one another; hence, thermal equilibrium is an equivalence relation. (The Zeroth Law of Thermodynamics).

- (2)
There exist real-valued empirical temperature functions which assign to each equilibrium system X a temperature t(X) such that heat flows from X to Y when they are in thermal contact iff t(X) > t(Y).

Returning to control theory, we can now see just what transitions can and cannot be achieved via thermal contact theory. Specifically, the only transitions that can be induced are the heating and cooling of systems, and a system can be heated only if there is another system available at a higher temperature. The exact range of transitions thus achievable will depend on the size of the systems (if I have bodies at temperatures 300 K and 400 K, I can induce some temperature increase in the first, but how much will depend on how quickly the second is cooled).

A useful extreme case involves heat baths: Systems at equilibrium assumed to be so large that no amount of thermal contact with other systems will appreciably change their temperature (and which are also assumed to have no controllable parameters, not that this matters for thermal control theory). The control transitions available via thermal contact theory with heat baths are easy to state: Any system can be cooled if its temperature is higher than some available heat bath, or heated if it is cooler than some such bath.

We are now in a position to do some non-trivial thermodynamics. In fact, we can consider two different thermodynamic theories that can be thought of as two extremes. To be precise: Maximal no-feedback thermodynamics is specified like this:

The controlled object is a fixed, finite collection of mutually isolated statistical mechanical systems, assumed to be both thermally and parameter stable.

The control operations are (i) arbitrary entropy-non-decreasing transition maps on the combined states of the system; (ii) leaving the systems alone for a time longer than the equilibration timescale of each system.

There are no feedback measurements.

The control processes are arbitrary sequences of control operations terminating in operation (ii) (that is, arbitrary sequences after which the systems are allowed to reach equilibrium).

The only constraints on this control theory are that control operations do not decrease the Gibbs entropy, and that the control operations to apply are chosen once-and-for-all and not changed on the basis of feedback.

By contrast, here is minimal thermodynamics, obtained simply by conjoining thermal contact theory and adiabatic control theory:

The controlled object is a fixed, finite collection of mutually isolated statistical mechanical systems, assumed to be both thermally and parameter stable.

The control operations are (i) moving two systems into or out of thermal contact; (ii) making smooth changes in the parameters determining the Hamiltonians of one or more system over some finite interval of time; (iii) leaving the systems alone for a time longer than the equilibration timescale of each system.

There are no feedback measurements.

The control processes are arbitrary sequences of control operations terminating in operation (iii) (that is, arbitrary sequences after which the systems are allowed to reach equilibrium).

The control theory for maximal thermodynamics is straightforward. The theory induces transitions between equilibrium states; no such transition can decrease entropy; transitions are otherwise totally arbitrary. So we can induce a transition x → y between two equilibrium states x, y iff S(x) ≤ S(y). It is a striking feature of thermodynamics that under weak assumptions minimal thermodynamics has exactly the same control theory, so that the apparently much greater strength of maximal no-feedback thermodynamics is illusory.

To begin a demonstration, recall that in the previous sections we defined the heat flow into a system as the change in its energy due to thermal contact, and the work done on a system as the change in its energy due to modification of the parameters. By decomposing any control process into periods of arbitrarily short length—in each of which we can linearise the total energy change as the change that would have occurred due to parameter change while treating each system as isolated plus the change that would have occurred due to entropy-increasing evolution while holding the dynamics fixed—and summing the results, we can preserve these concepts in minimal thermodynamics. For any system, we then have

$$\Delta U=Q+W.$$

The reader will probably recognise this result as another form of the First Law of Thermodynamics. In this context, it is a fairly trivial result: Its content, insofar as it has any, is just that there is a useful decomposition of energy changes by their various causes. In phenomenological treatments of thermodynamics the First Law gets physical content via some independent understanding of what “work done” is (in the axiomatic treatment of [10], for instance, it is understood in terms of the potential energy of some background weight). But the real content of the First Law from that perspective is that there is a thermodynamical quantity called energy which is conserved. In our microphysical-based framework the conservation of (expected) energy is a baseline assumption and does not need to be so derived.

The concept of a quasi-static transition also generalises from adiabatic control theory to minimal thermodynamics. If dU is the change in system energy during an extremely small step of such a control process, we have

$$\text{d}U=\sum _{I}{\left(\frac{\partial U}{\partial {V}_{I}}\right)}_{{V}_{J},S}\text{d}{V}_{I}+{\left(\frac{\partial U}{\partial S}\right)}_{{V}_{I}}\text{d}S$$

and hence, writing ${P}^{I}=-{\left(\partial U/\partial {V}_{I}\right)}_{{V}_{J},S}$ for the generalised pressures and $T={\left(\partial U/\partial S\right)}_{{V}_{I}}$ for the temperature,

$$\text{d}U=-\sum _{I}{P}^{I}\text{d}{V}_{I}+T\text{d}S,$$

Putting our results so far together, we know that

- (1)
Any given system can be induced to make any entropy-nondecreasing transition between states.

- (2)
Any given system’s entropy may be reduced by allowing it to exchange heat with a system at a lower temperature, at the cost of increasing that system’s entropy by a greater amount.

- (3)
The total entropy of the controlled object may not decrease.

The only remaining question is then: Which transitions between collections of systems that do not decrease the total entropy can be induced by a combination of (1) and (2)? So far as I know there is no general answer to the question. However, we can answer it fully if we assume that one of the systems is what I will call a Carnot system: A system such that for any value of S, ${\left(\frac{\partial U}{\partial S}\right)}_{{V}_{I}}$ takes all positive values on the constant-S hypersurface. The operational content of this claim is that a Carnot system in any initial equilibrium state can be controlled so as to take on any temperature by an adiabatic quasi-static process.

The ideal gas is an example of a Carnot system: Informally, it is clear that its temperature can be arbitrarily increased or decreased by adiabatically changing its volume. More formally, from its equation of state (8) we have

$$0=\frac{N}{V}\text{d}V+\frac{3N}{2U}\text{d}U{|}_{\delta S=0},$$
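As a numerical sanity check (my own illustration, not part of the original argument), the differential condition above can be integrated directly: it implies dU/dV = −2U/(3V), so that U V^{2/3} is constant along adiabats. A short sketch:

```python
# Sketch (my illustration): the adiabatic condition 0 = (N/V)dV + (3N/2U)dU
# gives dU/dV = -2U/(3V), whose solutions satisfy U * V**(2/3) = const.
# We check this by stepping the ODE with a simple RK4 integrator.

def adiabatic_energy(V_start, V_end, U_start, steps=10000):
    """Integrate dU/dV = -2U/(3V) from V_start to V_end."""
    h = (V_end - V_start) / steps
    V, U = V_start, U_start
    f = lambda V, U: -2.0 * U / (3.0 * V)
    for _ in range(steps):
        k1 = f(V, U)
        k2 = f(V + h / 2, U + h * k1 / 2)
        k3 = f(V + h / 2, U + h * k2 / 2)
        k4 = f(V + h, U + h * k3)
        U += (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
        V += h
    return U

# Expanding V by a factor 8 should reduce U by a factor 8^(2/3) = 4.
U1 = adiabatic_energy(1.0, 8.0, 1.0)
assert abs(U1 - 0.25) < 1e-9
```

Since temperature is proportional to U for the ideal gas, this confirms informally that adiabatic volume changes can drive the temperature to any desired positive value.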

In any case, given a Carnot system we can transfer entropy between systems with arbitrarily little net entropy increase. For given two systems at temperatures T_{A}, T_{B} with T_{A} > T_{B}, we can (i) adiabatically change the temperature of the Carnot system to just below T_{A}; (ii) place it in thermal contact with the hotter system, so that heat flows into the Carnot system with arbitrarily little net entropy increase; (iii) adiabatically lower the Carnot system to a temperature just above T_{B}; (iv) place it in thermal contact with the colder system, so that (if we wait the right period of time) heat flows out of the Carnot system with again arbitrarily little net entropy increase. (In the thermodynamics literature this kind of process is called a Carnot cycle: Hence my name for Carnot systems.)
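The entropy bookkeeping of this shuttle can be sketched numerically (my own illustration). In the hypothetical function below, eps is the temperature offset at which the Carnot system is held during the two thermal-contact steps; the net entropy production is always positive but vanishes as eps → 0:

```python
# Sketch (my illustration): entropy bookkeeping for the four-step shuttle.
# Heat Q leaves the hot system at T_A into the Carnot system held at
# T_A - eps; the same entropy is later discharged at T_B + eps into the
# cold system. The Carnot system returns to its initial state.

def shuttle_entropy_production(Q, T_A, T_B, eps):
    S_carried = Q / (T_A - eps)        # entropy absorbed by the Carnot system
    Q_out = S_carried * (T_B + eps)    # heat discharged at T_B + eps (cyclic: same entropy out)
    dS_hot = -Q / T_A                  # hot system loses heat Q at T_A
    dS_cold = Q_out / T_B              # cold system gains heat Q_out at T_B
    return dS_hot + dS_cold            # Carnot system's own entropy is unchanged

big = shuttle_entropy_production(1.0, 400.0, 300.0, eps=50.0)
small = shuttle_entropy_production(1.0, 400.0, 300.0, eps=0.01)
tiny = shuttle_entropy_production(1.0, 400.0, 300.0, eps=1e-6)
assert big > small > tiny > 0     # always positive, but arbitrarily small
assert tiny < 1e-7
```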

We then have a complete solution to the control problem for minimal thermodynamics: The possible transitions of the controlled object are exactly those which do not decrease the total entropy of all of the components. So “minimal” thermodynamics is, indeed, not actually that minimal.

The major loophole in all this—feedback—will be discussed from Section 9 onwards. Firstly, though, it will be useful to make a connection with the Second Law of Thermodynamics in its more phenomenological form.

While “the Second Law of Thermodynamics” is often read simply as synonymous with “entropy cannot decrease”, in phenomenological thermodynamics it has more directly empirical statements, each of which translates straightforwardly into our framework. Here’s the first:

The Second Law (Clausius statement):No sequence of control processes can induce heat flow Q out of one system with inverse temperature 1/T_{A} and heat flow Q into a second system with a lower inverse temperature 1/T_{B}, while leaving the states of all other systems unchanged.

This is a generalisation of the basic result of thermal contact theory, and the argument is essentially the same: Any such process decreases the entropy of the first system by more than it increases the entropy of the second. Since the entropy of the remaining systems is unchanged (they start and end the process in the same equilibrium states), the process is overall entropy-decreasing and thus forbidden by the statistical-mechanical dynamics. If both temperatures are positive, the condition becomes the more familiar one that T_{B} > T_{A}: Heat cannot be made to flow from a colder system to a hotter one.

And the second:

The Second Law (Kelvin statement):No sequence of control processes can induce heat flow Q from any one system with positive temperature while leaving the states of all other systems unchanged.

By the conservation of energy, any such process must result in net work Q being generated; an alternative way to give the Kelvin version is therefore “no process can extract heat Q from one system and turn it into work while leaving the states of all other systems unchanged”. In any case, the Kelvin version is again an almost immediate consequence of the principle that Gibbs entropy is non-decreasing: Since temperature is the rate of change of energy with entropy at constant parameter value, heat flow from a positive-temperature system must decrease its entropy, which (since the other systems are left unchanged) is again forbidden by the statistical-mechanical dynamics.

In both cases the “leaving the states of all other systems unchanged” clause is crucial. It is trivial to move heat from system A to system B with no net work cost if, for instance, system C, a box of gas, is allowed to expand in the process and generate enough work to pay for the work cost of the transition. Thermodynamics textbooks often use the phrase “operating in a cycle” to describe this constraint, and it will be useful to cast that notion more explicitly in our framework.

Specifically, let’s define heat bath thermodynamics (without feedback) as follows:

The controlled object consists of (a) a collection of heat baths at various initial temperatures; (b) another finite collection of statistical-mechanical systems, the auxiliary object, containing at least one Carnot system, and whose initial states are unconstrained.

The control operations are (a) moving one or more systems in the auxiliary object into or out of thermal contact with other auxiliary-object systems and/or with one or more heat baths; (b) applying any desired smooth change to the parameters of the systems in the auxiliary object over some finite period of time; (c) inducing one or more systems in the auxiliary object to evolve in an arbitrary entropy-nondecreasing way.

There are no feedback measurements.

A control process is an arbitrary sequence of control operations.

In this framework, a control process is cyclic if it leaves the state of the auxiliary object unchanged. The Clausius and Kelvin statements are then, respectively, that no cyclic process can have as its sole effect on the heat baths (a) that net heat Q flows from one bath to one with a higher temperature at no cost in work, and (b) that net heat Q from one bath is converted into work. And again, these are fairly immediate consequences of the fact that entropy is nondecreasing.

But perhaps we don’t care about cyclic processes? What does it matter what the actual final state of the auxiliary system is, provided the process works? We can make this intuition more precise like this: A control process delivers a given outcome repeatably if (i) we can perform it arbitrarily often using the final state of each process as the initial state of the next, and (ii) the Hamiltonian of the auxiliary object is the same at the end of each process as at the beginning. The Clausius statement, for instance, is now that no process can repeatably cause any quantity Q of heat to flow from one heat bath to another of higher temperature at no cost in work and with no heat flow between other heat baths.

This offers no real improvement, though. In the Clausius case, any such heat flow is entropy-decreasing on the heat baths: Specifically, if they have temperatures T_{A} and T_{B} with T_{A} > T_{B}, a transfer of heat Q from the colder to the hotter leads to an entropy decrease of Q(1/T_{B} − 1/T_{A}). So the entropy of the auxiliary object must increase by at least this much. By conservation of energy the auxiliary object’s expected energy must be constant in this process. But the entropy of the auxiliary object has a maximum for given expected energy [11] and so this can be carried out only finitely many times. A similar argument can readily be given for the Kelvin statement.
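This finiteness can be made quantitative with a toy calculation (my own illustration; the numbers are hypothetical). If the auxiliary object starts with entropy headroom ΔS below its maximum, the number of repeatable runs is bounded by ΔS divided by the per-run deficit Q(1/T_{B} − 1/T_{A}):

```python
# Sketch (my illustration): each forbidden transfer of heat Q from a bath
# at T_B to a bath at T_A > T_B lowers the baths' entropy by
# Q*(1/T_B - 1/T_A), which the auxiliary object must absorb; its entropy
# headroom bounds the number of runs. Exact rational arithmetic via Fraction.
from fractions import Fraction

def max_repetitions(Q, T_A, T_B, S_headroom):
    """Upper bound on how many times the transfer can be repeated."""
    dS_deficit = Q * (Fraction(1, T_B) - Fraction(1, T_A))
    return S_headroom // dS_deficit

# Hypothetical numbers: Q = 1, baths at 400 and 300, headroom 1/100.
n = max_repetitions(Fraction(1), 400, 300, Fraction(1, 100))
assert n == 12    # per-run deficit is 1/1200, so at most 12 runs
```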

I pause to note that we can turn these entirely negative constraints on heat and work into quantitative limits in a familiar way by using our existing control theory results. (Here I largely recapitulate textbook thermodynamics.) Given two heat baths having temperatures T_{A}, T_{B} with T_{A} > T_{B}, and a Carnot system initially at temperature T_{A}, the Carnot cycle to transfer heat from the colder system to the hotter is:

- (1)
Adiabatically transition the Carnot system to the lower temperature T_{B}.

- (2)
Place the Carnot system in thermal contact with the lower-temperature heat bath, and modify its parameters quasi-statically so as to cause heat to flow from the heat bath to the system. (That is, carry out modifications which if done adiabatically would decrease the system’s temperature.) Do so until heat Q_{B} has been transferred to the system.

- (3)
Adiabatically transition the Carnot system to temperature T_{A}.

- (4)
Place the Carnot system in thermal contact with the higher-temperature heat bath, and return its parameters quasi-statically to their initial values.
At the end of this process the Carnot system has the same temperature and parameter values as at the beginning and so will be in the same equilibrium state; the process is therefore cyclic, and the entropy and energy of the Carnot system will be unchanged. But the entropy of the system is changed only by the heat flow in steps 2 and 4. If the heat flow out of the system in step 4 is Q_{A}, then the entropy changes in those steps are respectively +Q_{B}/T_{B} and −Q_{A}/T_{A}, so that Q_{A}/Q_{B} = T_{A}/T_{B}. By conservation of energy the net work done on the Carnot system in the cycle is W = Q_{A} − Q_{B}, and we have the familiar result that

$$W=\left(\frac{{T}_{A}-{T}_{B}}{{T}_{B}}\right){Q}_{B}$$

Since the process consists entirely of quasi-static modifications of parameters (and the making and breaking of thermal contact), it can as readily be run in reverse, giving us the equally-familiar formula for the efficiency of a heat engine: 1 − T_{B}/T_{A}. And since (on pain of violating the Kelvin statement) all reversible heat engines have the same efficiency (and all irreversible ones a lower efficiency), this result is general and not restricted to Carnot cycles.
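The cycle’s bookkeeping can be checked with hypothetical numbers (my own sketch, using only the relations stated above: Q_{A}/Q_{B} = T_{A}/T_{B} from the zero net entropy change of the cyclic process, and W = Q_{A} − Q_{B} from energy conservation):

```python
# Sketch (my illustration): Carnot-cycle bookkeeping for hypothetical
# bath temperatures 400 and 300 and heat intake Q_B = 3 at the cold bath.
T_A, T_B, Q_B = 400.0, 300.0, 3.0

Q_A = Q_B * T_A / T_B     # from Q_A/Q_B = T_A/T_B (cyclic => zero net entropy change)
W = Q_A - Q_B             # net work done on the Carnot system, by energy conservation
assert abs(Q_A - 4.0) < 1e-12
assert abs(W - 1.0) < 1e-12

# Entropy bookkeeping on the system: +Q_B/T_B in step 2 exactly cancels
# -Q_A/T_A in step 4.
assert abs(Q_B / T_B - Q_A / T_A) < 1e-12

# Run in reverse as a heat engine: efficiency W/Q_A = 1 - T_B/T_A.
efficiency = W / Q_A
assert abs(efficiency - (1 - T_B / T_A)) < 1e-12
```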

The Carnot systems used in our analysis so far have been assumed to be parameter-stable, thermally stable systems that can be treated via the microcanonical ensemble (and thus, in effect, to be macroscopically large). But in fact, this is an overly restrictive conception of a Carnot system, and it will be useful to relax it. All we require of such a system is that for any temperature T it possesses states which will transfer heat to and from temperature-T heat baths with arbitrarily low entropy gain, and that it can be adiabatically and quasi-statically transitioned between any two such states.

As I noted in Section 5, it is a standard result in statistical mechanics that a system of any size in equilibrium with a heat bath of temperature T is described by the canonical distribution for that temperature, having probability density at energy U proportional to e^{−U/T}. There is no guarantee that adiabatic, quasi-static transitions preserve the form of the canonical ensemble, but any system where this is the case will satisfy the criteria required for Carnot systems. I call such systems canonical Carnot systems; from here on, Carnot systems will be allowed to be either canonical or microcanonical.

To get some insight into which systems are canonical Carnot systems, assume for simplicity that there is only one parameter V and that the Hamiltonian can be written in the form required by the adiabatic theorem:

$$\widehat{H}[V]=\sum _{i}{U}_{i}(V)|{\psi}_{i}(V)\rangle \hspace{0.17em}\langle {\psi}_{i}(V)|.$$

Then if the system begins in canonical equilibrium, its initial state is

$$\rho (V)=\frac{1}{Z}\sum _{i}{\text{e}}^{-\beta {U}_{i}(V)}|{\psi}_{i}(V)\rangle \hspace{0.17em}\langle {\psi}_{i}(V)|.$$

By the adiabatic theorem, if V is altered sufficiently slowly to V′ while the system continues to evolve under Hamiltonian dynamics, it will evolve to

$$\rho ({V}^{\prime})=\frac{1}{Z}\sum _{i}{\text{e}}^{-\beta {U}_{i}(V)}|{\psi}_{i}({V}^{\prime})\rangle \hspace{0.17em}\langle {\psi}_{i}({V}^{\prime})|.$$

This will itself be in canonical form if we can find β′ and Z′ such that

$$\frac{{\text{e}}^{-\beta {U}_{i}(V)}}{Z}=\frac{{\text{e}}^{-{\beta}^{\prime}{U}_{i}({V}^{\prime})}}{{Z}^{\prime}}$$

for every i. Taking logarithms, this holds just in case the energy-level differences rescale uniformly, that is, just in case there is some f(V, V′) such that

$${U}_{i}({V}^{\prime})-{U}_{j}({V}^{\prime})=f(V,{V}^{\prime})({U}_{i}(V)-{U}_{j}(V)),$$

in which case β′ = β/f(V, V′).

For an ideal gas, elementary quantum mechanics tells us that the energy of a given mode is inversely proportional to the volume of the box in which the gas is confined:

$${U}_{i}(V)=\frac{g(i)}{V}.$$

(Quick proof sketch: increasing the size of the box by a factor K decreases the gradient by that factor, and hence decreases the kinetic energy density by a factor K^{2}. Energy is energy density × volume.)

So an ideal gas is a canonical Carnot system. This result is independent of the number of particles in the gas and independent of any assumption that the gas spontaneously equilibrates. So in principle, even a gas with a single particle—the famous one-molecule gas introduced by [12]—is sufficient to function as a Carnot system. Any repeatable transfer of heat between heat baths via arbitrary entropy-non-decreasing operations on auxiliary systems can in principle be duplicated using only quasi-static operations on a one-molecule gas [13].
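The scaling condition can be verified numerically (my own sketch, with arbitrary hypothetical mode constants g(i)): with U_{i}(V) = g(i)/V we have f(V, V′) = V/V′, and the adiabatically transported probabilities should coincide with the canonical distribution at β′ = β/f.

```python
# Sketch (my illustration): with mode energies U_i(V) = g(i)/V, the
# adiabatically transported canonical state at beta is again canonical
# at beta' = beta/f(V, V') where f(V, V') = V/V'.
import math

g = [1.0, 2.0, 3.5, 7.0]       # hypothetical mode constants g(i)
beta, V, V_prime = 2.0, 1.0, 4.0

def canonical(energies, beta):
    weights = [math.exp(-beta * U) for U in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

# The adiabatic theorem carries the occupation probabilities along unchanged:
p_transported = canonical([gi / V for gi in g], beta)

# Claim: these equal the canonical probabilities at V' with beta' = beta/f.
f = V / V_prime
beta_prime = beta / f
p_canonical = canonical([gi / V_prime for gi in g], beta_prime)

assert all(abs(a - b) < 1e-12 for a, b in zip(p_transported, p_canonical))
```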

For the rest of the paper, I will consider how the account developed is modified when feedback is introduced. The one-molecule gas was introduced into thermodynamics for just this purpose, and will function as a useful illustration.

What happens to the Gibbs entropy when a system with state ρ is measured? The classical case is easiest to analyse: Suppose phase space is decomposed into disjoint regions Γ_{i} and that

$${\int}_{{\Gamma}_{i}}\rho ={p}_{i}.$$

Then p_{i} is the probability that a measurement of which phase-space region the system lies in will give result i. The state can be rewritten in the form

$$\rho =\sum _{i}{p}_{i}{\rho}_{i},$$

where

$${\rho}_{i}(x)=\frac{1}{{p}_{i}}\rho (x)$$

for x ∈ Γ_{i}, and ρ_{i}(x) = 0 otherwise.

The expected value of the Gibbs entropy after the measurement (“p-m”) is then

$${\langle {S}_{G}\rangle}_{p-m}=\sum _{i}{S}_{G}({\rho}_{i}){p}_{i}.$$

But we have

$${S}_{G}(\rho )=-\int \left(\sum _{i}{p}_{i}{\rho}_{i}\right)\hspace{0.17em}\text{ln}\hspace{0.17em}\left(\sum _{i}{p}_{i}{\rho}_{i}\right)$$

and, since the ρ_{i} have disjoint supports,

$${S}_{G}(\rho )=-\sum _{i}{p}_{i}\int {\rho}_{i}\hspace{0.17em}\text{ln}({p}_{i}{\rho}_{i})=-\sum _{i}{p}_{i}\hspace{0.17em}\text{ln}\hspace{0.17em}{p}_{i}\int {\rho}_{i}-\sum _{i}{p}_{i}\int {\rho}_{i}\hspace{0.17em}\text{ln}\hspace{0.17em}{\rho}_{i}.$$

But the integral in the first term is just 1 (since the ρ_{i} are normalised) and the integral in the second term is −S_{G}(ρ_{i}). So we have

$${\langle {S}_{G}\rangle}_{p-m}={S}_{G}(\rho )-\left(-\sum _{i}{p}_{i}\hspace{0.17em}\text{ln}\hspace{0.17em}{p}_{i}\right).$$

That is, measurement may decrease entropy for two reasons. Firstly, pure chance may mean that the measurement happens to yield a post-measurement state with low Gibbs entropy. But even the average value of the post-measurement entropy decreases, and the size of the decrease is equal to the Shannon entropy of the probability distribution of measurement outcomes. A sufficiently indeterministic measurement process could therefore, in principle, lead to a very sharp decrease in average Gibbs entropy [14].
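A discrete toy example (my own, not the paper’s) makes the identity concrete: partition a hypothetical six-state distribution into three measurement cells and compare the average post-measurement entropy with S_{G}(ρ) − H(p).

```python
# Sketch (my illustration): discrete analogue of the measurement identity
# <S>_post = S(rho) - H(p), with three measurement cells Gamma_i.
import math

def shannon(ps):
    return -sum(p * math.log(p) for p in ps if p > 0)

rho = [0.30, 0.10, 0.25, 0.05, 0.20, 0.10]   # hypothetical distribution
cells = [[0, 1], [2, 3], [4, 5]]             # measurement partition

p = [sum(rho[k] for k in cell) for cell in cells]                  # outcome probabilities
post = [[rho[k] / p[i] for k in cell] for i, cell in enumerate(cells)]  # conditional states

S_before = shannon(rho)
S_after_avg = sum(p[i] * shannon(post[i]) for i in range(len(cells)))

# The average entropy drops by exactly the Shannon entropy of the outcomes.
assert abs(S_after_avg - (S_before - shannon(p))) < 1e-12
assert S_after_avg < S_before
```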

In the quantum case, the situation is slightly more complicated. We can represent the measurement by a collection of mutually orthogonal projectors Π̂_{i} summing to unity, and define measurement probabilities

$${p}_{i}=\text{Tr}({\widehat{\Pi}}_{i}\rho )$$

$${\rho}_{i}=\frac{1}{{p}_{i}}{\widehat{\Pi}}_{i}\rho {\widehat{\Pi}}_{i},$$

Insofar as “the Second Law of Thermodynamics” is taken just to mean “entropy never decreases”, then, measurement is a straightforward counter-example, as has been widely recognised (see, for instance, [12,15], [16] (ch. 5), or [17]) [18]. From the control-theory perspective, though, the interesting content of thermodynamics is which transitions it allows and which it forbids, and the interesting question about feedback measurements is whether they permit transitions which feedback-free thermodynamics does not. Here the answer is again unambiguous: They do.

To be precise: Define heat bath thermodynamics with feedback as follows:

The controlled object consists of (a) a collection of heat baths at various initial temperatures; (b) another finite collection of statistical-mechanical systems, the auxiliary object, containing at least one Carnot system, and whose initial states are unconstrained.

The control operations are (a) moving one or more systems in the auxiliary object into or out of thermal contact with other auxiliary-object systems and/or with one or more heat baths; (b) applying any desired smooth change to the parameters of the systems in the auxiliary object over some finite period of time; (c) inducing one or more systems in the auxiliary object to evolve in an arbitrary entropy-nondecreasing way.

Arbitrary feedback measurements may be made.

A control process is an arbitrary sequence of control operations.

In this framework, the auxiliary object can straightforwardly be induced (with high probability) to transition from equilibrium state x to equilibrium state y with S_{G}(y) < S_{G}(x). Firstly, pick a measurement such that performing it transitions x to x_{i} with probability p_{i}, such that

$$-\sum _{i}{p}_{i}\hspace{0.17em}\text{ln}\hspace{0.17em}{p}_{i}\gg {S}_{G}(x)-{S}_{G}(y).$$

The expected value of the entropy of the post-measurement state will be much less than that of y; for an appropriate choice of measurement, with high probability the actually-obtained post-measurement state x_{i} will satisfy S_{G}(x_{i}) < S_{G}(y). Now perform an entropy-increasing transformation from x_{i} to y. (For instance, perform a Hamiltonian transformation of x_{i} to some equilibrium state, then use standard methods of equilibrium thermodynamics to change that state to y).

As such, the scope of controlled transitions of the auxiliary object is total: It can be transitioned between any two states. As a corollary, the Clausius and Kelvin versions of the Second Law do not apply to this control theory: Energy can be arbitrarily transferred from one heat bath to another, or converted from a heat bath into work.

In fact, the full power of the arbitrary transformations available on the auxiliary system is not needed to produce these radical results. Following Szilard’s classic method, let us assume that the auxiliary system is a one-molecule gas confined to a cylindrical container by a movable piston at each end, so that the Hamiltonian of the gas is parametrised by the position of the pistons. Now suppose that the position of the gas atom can be measured. If it is found to be closer to one piston than the other, the second piston can rapidly be moved at zero energy cost to the mid-point between the two. As a result, the volume of the gas has been halved without any change in its internal energy (and so its entropy has been decreased by ln 2; cf Equation (8).) If we now quasi-statically and adiabatically expand the gas to its original volume, its energy will decrease and so work will have been extracted from it.

Now suppose we take a heat bath at temperature T and a one-atom gas at equilibrium also at temperature T. The above process allows us to reduce the energy of the box and extract some amount of work δW from it. Placing it back in thermal contact with the heat bath will return it to its initial state and so, by conservation of energy, extracts heat δQ = δW from the bath. This is a straightforward violation of the Kelvin version of the Second Law. If we use the extracted work to heat a heat bath which is hotter than the original bath, we generate a violation of the Clausius version also.
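As a numerical sketch of the work figure in play (my own illustration, using a slight variant of the cycle in which the re-expansion from V/2 to V is carried out quasi-statically in contact with the bath): the extracted work is ∫ P dV with P = T/V for a one-molecule ideal gas in units where k_{B} = 1, which comes to T ln 2 per cycle.

```python
# Sketch (my illustration): quasi-static isothermal re-expansion of the
# one-molecule gas from V/2 to V at bath temperature T. Work done by the
# gas is the integral of P = T/V, evaluated here by the midpoint rule.
import math

def isothermal_work(T, V0, V1, steps=100000):
    h = (V1 - V0) / steps
    return sum(T / (V0 + (k + 0.5) * h) * h for k in range(steps))

T, V = 1.7, 1.0                       # hypothetical bath temperature and volume
W = isothermal_work(T, V / 2, V)
assert abs(W - T * math.log(2)) < 1e-8   # the Szilard figure: T ln 2 per cycle
```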

To make this explicit, let’s define Szilard theory as follows:

The controlled object consists of (a) a collection of heat baths at various initial temperatures; (b) a one-atom gas as defined above.

The control operations are (a) moving the one-atom gas into or out of thermal contact with one or more heat baths; (b) applying any desired smooth change in the positions of the pistons confining the one-atom gas.

The only possible feedback measurement is a measurement of the position of the atom in the one-atom gas.

A control process is an arbitrary sequence of control operations.

Then the control operations available in Szilard theory include arbitrary cyclic transfers of heat between heat baths and conversion of heat into work.

The use of a one-atom gas in this algorithm is not essential. Suppose that we measure instead the particle density in each half of a many-atom gas at equilibrium. Random fluctuations ensure that one side of the gas is at a slightly higher density than the other; compressing the gas slightly using the piston on the low-density side will reduce its volume at a slightly lower cost in work than would be possible on average without feedback; iterating such processes will again allow heat to be converted into work. (The actual numbers in play here are utterly negligible, of course—as for the one-atom gas—but we are interested here in in-principle possibility, not practicality [19].)

The most famous example of measurement-based entropy decrease, of course, is Maxwell’s demon: A partition is placed between two boxes of gas initially at equilibrium at the same temperature. A flap, which can be opened or closed, is placed in the partition, and at short time intervals δt the boxes are measured to ascertain if, in the next period of time δt, any particles will collide with the flap from (a) the left or (b) the right. If (a) holds but (b) does not, the flap is opened for the next δt seconds. Applying this alternation of feedback measurement and control operation for a sufficiently long time will reliably cause the density of the gas on the left to be much lower than on the right. Quasi-statically moving the partition to the left will then allow work to be extracted. The partition can then be removed, and reinserted in the middle; the temperature of the box will have been reduced. Placing the box in thermal contact with a heat bath will then extract heat from the bath equal to the work done; the Kelvin version of the Second Law is again violated. I will refrain from formally stating the “demonic control theory” into which these results could be embedded, but it is fairly clear that such a theory could be formulated.
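A crude toy simulation (entirely my own, and far simpler than the kinetic picture above) captures the feedback logic: at each step a randomly chosen particle approaches the flap, and the demon opens it only for left-to-right crossings. Density reliably builds up on the right.

```python
# Sketch (my toy illustration, not the paper's model): a demon that opens
# the flap only for particles approaching from the left.
import random

def run_demon(n_particles=100, n_steps=10000, seed=1):
    rng = random.Random(seed)
    side = ['L' if rng.random() < 0.5 else 'R' for _ in range(n_particles)]
    for _ in range(n_steps):
        i = rng.randrange(n_particles)   # a particle approaches the flap...
        if side[i] == 'L':               # ...which opens only for left-to-right crossings
            side[i] = 'R'
    return side.count('L'), side.count('R')

left, right = run_demon()
assert right > left              # essentially all particles end up on the right
assert left + right == 100
```

The feedback is doing all the work here: with the flap opened blindly (no measurement), particles would cross in both directions and no density gradient would develop.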

Szilard control theory, and demonic control theory, allow thermodynamically forbidden transitions. Big deal, one might reasonably think: So does abracadabra control theory, where the allowed control operations include completely arbitrary shifts in a system’s state. We don’t care about abracadabra control theory because we have no reason to think that it is physically possible; we only have reason to care about entropy-decreasing control theories based on measurement if we have reason to think that they are physically possible.

Of course, answering the general question of what is physically possible isn’t easy. Is it physically possible to build mile-long relativistic starships? The answer turns on rather detailed questions of material science and the like. But no general physical principle forbids it. Similarly, detailed problems of implementation might make it impossible to build a scalable quantum computer, but the theory of fault-tolerant quantum computation [20,21] gives us strong reasons to think that such computers are not ruled out in principle. On the other hand, we do have reason to think that faster-than-light starships, or computers that can compute Turing-non-computable functions are in principle ruled out. It is this “in-principle” question of implementability that is of interest here.

To answer that question, consider again heat-bath control theory. The action takes place mostly with respect to the auxiliary object: The heat baths are not manipulated in any way beyond moving into or out of contact with that object. We can then imagine treating the auxiliary object, and the control machinery, as a single larger system: We set the system going, and then simply allow it to run. It churns away, from time to time establishing or breaking physical contact with a heat bath or perhaps drawing on or topping up an external energy reservoir, and in due course completes the control process it was required to implement.

This imagined treatment of the system can be readily incorporated into our framework: We can take the auxiliary object of heat-bath theory with feedback together with its controlling mechanisms, draw a box around both together, and treat the result as a single auxiliary object for a heat-bath theory without feedback. Put another way, if the feedback-based control processes we are considering are physically possible, we ought to be able to treat the machinery that makes the measurement as physical, and the machinery that decides what operation to perform based on a given feedback result as likewise physical, and treat all that physical apparatus as part of the larger auxiliary object. Let’s call the assumption that this is possible the automation constraint; to violate it is to assume that some aspects of computation or of measurement cannot be analysed as physical processes, an assumption I will reject here without further discussion.

But we already know that heat bath theory without feedback does not permit any repeatable transfer of heat into work, or of a given quantity of heat from a cold body to a hotter body. Such transfers are possible, but only if the auxiliary object increases in Gibbs entropy. And given that the auxiliary object breaks into controlling sub-object and controlled sub-object and that ex hypothesi the control processes we are considering leave the controlled sub-object’s state unchanged, we can conclude that the Gibbs entropy of the controlling sub-object must have increased.

This raises an interesting question. From the perspective of the controlling system, control theory with feedback looks like a reasonable idealisation, but from the external perspective, we know that something must go wrong with that idealisation. The resolution of this problem lies in the effects of the measurement process on the controlling system itself: The process of iterated measurement is radically indeterministic from the perspective of the controlling object, and it can have only a finite number of relevantly distinct states, so eventually it runs out of states to use.

This point (though controversial; cf [22,23], and references therein) has been widely appreciated in the physics literature and can be studied from a variety of perspectives; in the rest of this section, I briefly describe the most commonly discussed one. Keep in mind in the sequel that we already know that somehow the controlling system’s strategy must fail (at least given the automation constraint): The task is not to show that it does but to understand how it does.

The perspective we will discuss uses what might be called a computational model of feedback: It is most conveniently described within quantum mechanics. We assume that the controlling object consists, at least in part, of some collection of N systems (“bits”), each of whose Hilbert space is the direct sum of two memory subspaces 0 and 1, and each of which begins with its state somewhere in the 0 subspace. A measurement with two outcomes is then a dynamical transition which leaves the measured system alone and causes some so-far-unused bit to transition into the 1 subspace if one outcome is obtained and to remain in the 0 subspace if the other is obtained. That is, if T̂ is some unitary transformation of the bit’s Hilbert space that maps the 0 subspace into the 1 subspace, the measurement is represented by some unitary transformation

$$\widehat{V}=\widehat{P}\otimes \widehat{T}+(\widehat{1}-\widehat{P})\otimes \widehat{1}$$

where P̂ projects onto those states of the measured system associated with the first outcome. A subsequent control operation conditioned on the measurement result is then represented by a unitary of the form

$$\widehat{U}={\widehat{U}}_{0}\otimes {\widehat{P}}_{0}+{\widehat{U}}_{1}\otimes {\widehat{P}}_{1}$$

where P̂_{0} and P̂_{1} project onto the bit’s two memory subspaces and Û_{0}, Û_{1} are the operations to be performed given each recorded outcome.
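As a concrete sketch (my own illustration: a single qubit stands in for both the measured and the controlled system, T̂ is taken to be the bit-flip X, and Û_{0}, Û_{1} are arbitrary hypothetical choices), both operators can be written out as matrices and checked for unitarity:

```python
# Sketch (my illustration): V = P (x) T + (1-P) (x) 1 records the outcome
# in the bit (here a CNOT), and U = U0 (x) P0 + U1 (x) P1 acts back on the
# system conditioned on the bit's memory subspace.
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
P = np.diag([0.0, 1.0])                              # projector onto the system's |1>
P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])    # bit memory subspaces 0 and 1

V = np.kron(P, X) + np.kron(I2 - P, I2)              # measurement unitary
assert np.allclose(V @ V.T.conj(), np.eye(4))        # V is unitary

U0, U1 = I2, X                                       # hypothetical conditional operations
U = np.kron(U0, P0) + np.kron(U1, P1)                # feedback unitary
assert np.allclose(U @ U.T.conj(), np.eye(4))        # U is unitary too

# Feedback in action: system |1>, bit |0>. V records "1"; U then flips the system.
psi = np.kron(np.array([0.0, 1.0]), np.array([1.0, 0.0]))    # |1>_sys |0>_bit
out = U @ (V @ psi)
assert np.allclose(out, np.kron(np.array([1.0, 0.0]), np.array([0.0, 1.0])))  # |0>_sys |1>_bit
```

Note that the bit ends the run in its 1 subspace: this is exactly the “used bit” that the argument below turns on.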

The problem with this process is that eventually, the system runs out of unused bits. (Note that the procedure described above only works if the bit is guaranteed to be in the 0 subspace initially.) To operate repeatably, the system will then have to reset some bits to the initial state. But Landauer’s Principle states that such resetting carries an entropy cost. Since the principle is controversial (at least in the philosophy literature!) I will work through the details here from a control-theory perspective.

Specifically, let’s define a computational process as follows: It consists of N bits (the memory) together with a finite system (the computer) and another system (the environment). A computation is a transition which is deterministic at the level of bits: that is, if the N bits begin, collectively, in subspaces that encode the binary form of some natural number n, after the transition they are found, collectively, in subspaces encoding f(n) for some fixed function f. (Reference [24] is a highly insightful discussion which inter alia considers the case of indeterministic computation.) The control processes are arbitrary unitary (quantum) or Hamiltonian (classical) evolutions on the combined system of memory, computer, and environment; the question of interest is what constraints on the transitions of computer and environment are required for given computational transitions to be implemented. For the sake of continuity with the literature I work in the classical framework (the quantum generalisation is straightforward); for simplicity I assume that the bits have equal phase-space volume V assigned to 0 and 1.

If the function f is one-to-one, the solution to the problem is straightforward. The combined phase space of the memory can be partitioned into 2^{N} subspaces each of equal volume and each labelled with the natural number they represent. There is then a phase-space-preserving map from n to f(n) for each n, and these maps can be combined into a single map from the memory to itself. One-to-one (‘reversible’) computations can then be carried out without any implications for the states of computer or environment.

But now suppose that the function f takes values only between 0 and 2^{M} − 1 (M < N), so that any map implementing f must map the bits M + 1, . . . , N into their zero subspaces independent of input. Any such map would take the uniform distribution over the memory (which has entropy N ln 2V) to one with support in a region of volume (2V)^{M} × V^{N−M} (and so with maximum entropy M ln 2V + (N − M) ln V). Since the map as a whole is by assumption entropy-preserving, it must increase the joint entropy of computer plus environment by at least (N − M) ln 2. In the limiting case of reset, M = 0 (f(n) = 0 for all n), and so the computer and environment must jointly increase in entropy by at least N ln 2. This is Landauer’s principle: Each bit that is reset generates at least ln 2 entropy.
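The volume counting here can be checked numerically (a sketch with illustrative values of N, M and V; the function name is mine):

```python
import math

def landauer_entropy_cost(n_bits, m_bits, V=1.0):
    """Entropy the memory loses when f maps N bits onto M free bits.

    Before: uniform over volume (2V)^N, entropy N ln 2V.
    After: support of volume (2V)^M * V^(N-M), maximum entropy
    M ln 2V + (N - M) ln V.  The difference, (N - M) ln 2, must be
    exported to computer plus environment.
    """
    before = n_bits * math.log(2 * V)
    after = m_bits * math.log(2 * V) + (n_bits - m_bits) * math.log(V)
    return before - after

N, M = 8, 3
assert math.isclose(landauer_entropy_cost(N, M), (N - M) * math.log(2))
# The result is independent of the per-subspace volume V:
assert math.isclose(landauer_entropy_cost(N, M, V=2.0), (N - M) * math.log(2))
# Full reset (M = 0): at least N ln 2 entropy is generated.
assert math.isclose(landauer_entropy_cost(N, 0), N * math.log(2))
```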

If the computer is to carry out the reset operation repeatably, its own entropy cannot increase without limit. So a repeatable reset process dumps entropy of at least ln 2 per bit into the environment. In the special case where the environment is a heat bath at temperature T, Landauer’s principle becomes the requirement that reset generates at least T ln 2 heat per bit.
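In SI units (restoring Boltzmann’s constant, which the text sets to 1), the heat per reset bit is k_B T ln 2; a quick numeric check, with an illustrative room temperature:

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant, J/K

def reset_heat(n_bits, T):
    """Minimum heat (in joules) dumped into a bath at temperature T
    by a repeatable reset of n_bits bits: n * k_B * T * ln 2."""
    return n_bits * K_B * T * math.log(2)

# At T = 300 K this is roughly 2.9e-21 J per bit.
q = reset_heat(1, 300.0)
assert 2.8e-21 < q < 3.0e-21
```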

A more realistic feedback-based control theory, then, might incorporate Landauer’s Principle explicitly, as in the following (call it computation heat-bath thermodynamics):

The controlled object consists of (a) a collection of heat baths at various initial temperatures; (b) another finite collection of statistical-mechanical systems, the auxiliary object, containing at least one Carnot system, and whose initial states are unconstrained; (c) a finite number N of 2-state systems (“bits”), the computational memory, each of which begins in some fixed (“zero”) initial state with probability 1.

The control operations are (a) moving one or more systems in the auxiliary object into or out of thermal contact with other auxiliary-object systems and/or with one or more heat baths; (b) applying any desired smooth change to the parameters of the systems in the auxiliary object over some finite period of time; (c) inducing one or more systems in the auxiliary object to evolve in an arbitrary entropy-nondecreasing way; (d) erasing M bits of the memory—that is, restoring them to their zero states—and at the same time transferring heat MT ln 2 to some heat bath at temperature T; (e) applying any computation to the computational memory.

Arbitrary feedback measurements may be made (including measurements of the memory bits) provided that: (a) they have finitely many results; (b) the result of the measurement is faithfully recorded in the state of some collection of bits, each of which initially has probability 1 of being in the 0 state.

A control process is an arbitrary sequence of control operations.

At first sight, measurement in this framework is in the long run entropy-increasing: A measurement with 2^{M} outcomes having probabilities p_{1}, . . . , p_{2^{M}} will reduce the entropy by ΔS = −∑_{i} p_{i} ln p_{i}, but the maximum value of this is M ln 2, which is the entropy increase required to erase the M bits required to record the result. But as Zurek [15] has pointed out, Shannon’s noiseless coding theorem allows us to compress those M bits to, on average, −∑_{i} p_{i} log_{2} p_{i} bits, so that the overall process can be made entropy-neutral.
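Zurek’s observation can be checked numerically (a sketch; the outcome distribution is illustrative and the function names are mine): the entropy reduction from measurement, −∑ p_i ln p_i nats, equals ln 2 times the average compressed record length −∑ p_i log₂ p_i bits, so erasing the compressed record exactly cancels the gain.

```python
import math

def measurement_gain_nats(p):
    """Entropy reduction from a measurement with outcome probabilities p."""
    return -sum(q * math.log(q) for q in p if q > 0)

def compressed_bits(p):
    """Shannon limit on the average record length after noiseless coding."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# Four outcomes, naively recorded in M = 2 bits, but far from uniform:
p = [0.7, 0.2, 0.05, 0.05]
gain = measurement_gain_nats(p)
erase_cost = compressed_bits(p) * math.log(2)  # ln 2 entropy per erased bit
assert math.isclose(gain, erase_cost)          # the cycle is entropy-neutral
assert compressed_bits(p) < 2                  # fewer than the raw M = 2 bits
```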

This strategy of using Landauer’s principle to explain why Maxwell demons cannot repeatably violate the Second Law has a long history (see [25] and references therein). It has recently come under sharp criticism from John Earman and John Norton [22,26] as either trivial or question-begging: They argue that any such defence (‘exorcism’) relies on an argument for Landauer’s Principle that is either Sound (that is, starts off by assuming the Second Law) or Profound (that is, does not so start off). Exorcisms relying on Sound arguments are question-begging; those relying on Profound arguments leave us no good reason to accept Landauer’s principle in the first place.

Responses to Earman and Norton (see, e.g., [27,28]) have generally embraced the first horn of the dilemma, accepting that Landauer’s Principle does assume the Second Law but arguing that use of it can still be pedagogically illuminating. (See [26,29] for responses to this move.) But I believe the dialectic here fails to distinguish between statistical mechanics and thermodynamics. The argument here for Landauer’s Principle does indeed assume that the underlying dynamics are entropy-non-decreasing, and from that perspective appeal to Landauer’s principle is merely of pedagogical value: It helps us to make sense of how feedback processes can be entropy-decreasing despite the fact that any black-box process, even if it involves internal measurement of subsystems, cannot repeatably turn heat into work. But (this is one central message of this paper) that dynamical assumption within statistical mechanics should not simply be identified with the phenomenological Second Law. In Earman and Norton’s terminology, the argument for Landauer’s Principle is Sound with respect to statistical mechanics, but Profound with respect to phenomenological thermodynamics.

The results of my exploration of control theory can be summarised as follows:

- (1) In the absence of feedback, physically possible control processes are limited to inducing transitions that do not lower Gibbs entropy.
- (2) That limit can be reached with access to very minimal control resources: Specifically, a single Carnot system and the ability to adiabatically control it and put it in thermal contact with other systems.
- (3) Introducing feedback allows arbitrary transitions.
- (4) If we try to model the feedback process as an internal dynamical process in a larger system, we find that feedback does not increase the power of the control process.
- (5) (3) and (4) can be reconciled by considering the physical changes to the controlling system during feedback processes. In particular, on a computation model of control and feedback, the entropy cost of resetting the memory used to record the result of measurement at least cancels out the entropy reduction induced by the measurement.

I will end with a more general moral. As a rule, and partly for pedagogical reasons, foundational discussions of thermal physics tend to begin with thermodynamics and continue to statistical mechanics. The task of recovering thermodynamics from successfully grounded statistical mechanics is generally not cleanly separated from the task of understanding statistical mechanics itself, and the distinctive requirements of thermodynamics blur into the general problem of understanding statistical-mechanical irreversibility. Conversely, foundational work on thermodynamics proper is often focussed on thermodynamics understood phenomenologically: A well-motivated and worthwhile pursuit, but not one that obviates the need to understand thermodynamics from a statistical-mechanical perspective.

The advantage of the control-theory way of seeing thermodynamics is that it permits a clean separation between the foundational problems of statistical mechanics itself and the reduction problem of grounding thermodynamics in statistical mechanics. I hope to have demonstrated: (a) that these really are distinct problems, so that an understanding of (e.g.) why systems spontaneously approach equilibrium does not in itself suffice to give an understanding of thermodynamics; but also (b) that such an understanding, via the interpretation of thermodynamics as the control theory of statistical mechanics, can indeed be obtained, and can shed light on a number of extant problems at the statistical-mechanics/thermodynamics boundary.

In writing this paper I benefitted greatly from conversations with David Albert, Harvey Brown, Wayne Myrvold, John Norton, Jos Uffink, and in particular Owen Maroney. I also wish to acknowledge comments from an anonymous referee.

- In fact, the etymology of thermodynamics, according to the Oxford English Dictionary, is just that it is the study of heat (thermo) and work (dynamics) and their interaction. (I am grateful to Jos Uffink for this observation.)
- Wallace, D. The Non-Problem of Gibbs vs. Boltzmann Entropy. 2014. Unpublished work.
- Wallace, D. There Are No Statistical-Mechanical Probabilities. 2014. Unpublished work.
- Wallace, D. Inferential vs. Dynamical Conceptions of Physics. 2013. arXiv:1306.4907v1.
- Strictly speaking there is generally no finite time after which the system has exactly equilibrated, but for any given level of approximation there will be a timescale after which the system has equilibrated to within that level.
- Wallace, D. What Statistical Mechanics Actually Does. 2014. Unpublished work.
- This is my own formulation of the argument, but I do not claim any particular originality for it; for an argument along similar lines, see [30] (pp. 541–548).
- An anonymous reviewer (to whom I’m grateful for prompting me to clarify the preceding argument) queries whether the argument is in any case necessary (this is one thermodynamic fact the readership will take for granted). But notice that what is derived here is not really a thermodynamic fact but a statistical-mechanical one, providing the statistical-mechanical underpinning to one of the basic assumptions of phenomenological thermodynamics.
- See, e.g., [31] (ch. XVII, sections 10–14) or [32] (pp. 193–196).
- Lieb, E.H.; Yngvason, J. The physics and mathematics of the second law of thermodynamics. Phys. Rep. 1999, 310, 1–96.
- The canonical distribution can be characterised as the distribution which maximises Gibbs entropy for given expected energy, so this maximum is just the entropy of that canonical distribution.
- Szilard, L. Über die Entropieverminderung in einem thermodynamischen System bei Eingriffen intelligenter Wesen. Z. Phys. 1929, 53, 840–856. (English translation: On the Decrease of Entropy in a Thermodynamic System by the Intervention of Intelligent Beings. Feld, B.T., Szilard, G.W., Eds.; MIT Press: Cambridge, MA, USA, 1972.)
- The name one-molecule is a little unfortunate: the “molecule” here is monatomic and lacks internal degrees of freedom.
- An anonymous referee worries that something is wrong here: “I cannot convert heat to work merely by discovering where I left the car keys”. However, I can convert heat to work (more accurately: increase my capability for turning heat into work) merely by remembering which side of the partition I left my one-molecule gas; that I cannot do this with my car keys relies on mundane features of their non-isolated state and macroscopic scale.
- Zurek, W.H. Algorithmic randomness and physical entropy. Phys. Rev. A 1989, 40, 4731–4751.
- Albert, D.Z. Time and Chance; Harvard University Press: Cambridge, MA, USA, 2000.
- Hemmo, M.; Shenker, O. The Road to Maxwell’s Demon: Conceptual Foundations of Statistical Mechanics; Cambridge University Press: Cambridge, UK, 2012. [Google Scholar]
- Several of these accounts (notably [16,17]) use a Boltzmannian setup for statistical mechanics (I am grateful to an anonymous referee for pressing this point). I argue in [2] that nothing very deep hangs on the decision to do so; in this context, the main point to note is that Boltzmann entropy, unlike Gibbs entropy, is not reduced by measurement; rather, measurement gives us the capacity to perform a subsequent entropy-reducing control operation which could not be carried out if our only information about the system was its macrostate and the standard probability distribution. This illustrates the point (which I develop further in [2]) that Boltzmann entropy in the absence of additional probabilistic assumptions is almost totally uninformative about a thermodynamic system’s behaviour.
- For forceful defence of the idea that the practicalities are what prevents Second Law violation in these cases, see [33].
- Preskill, J. Fault-tolerant quantum computation. Introd. Quantum Comput. Inf. 1998, 213.
- Shor, P.W. Fault-tolerant quantum computation. In Proceedings of the IEEE 37th Annual Symposium on Foundations of Computer Science, Los Alamitos, CA, USA; IEEE Computer Society Press, 1996; pp. 56–65.
- Earman, J.; Norton, J. Exorcist XIV: The wrath of Maxwell’s demon. Part II. From Szilard to Landauer and beyond. Stud. Hist. Philos. Mod. Phys. 1999, 30, 1–40.
- Maroney, O. Information Processing and Thermodynamic Entropy. In The Stanford Encyclopedia of Philosophy; Zalta, E.N., Ed.; 2009. Available online: http://plato.stanford.edu/archives/fall2009/entries/information-entropy/ (accessed on 1 December 2013).
- Maroney, O.J.E. The (absence of a) relationship between thermodynamic and logical reversibility. Stud. Hist. Philos. Mod. Phys. 2005, 36, 355–374.
- Leff, H.; Rex, A.F. Maxwell’s Demon: Entropy, Information, Computing, 2nd ed.; Institute of Physics Publishing: London, UK, 2002.
- Norton, J.D. Eaters of the lotus: Landauer’s principle and the return of Maxwell’s demon. Stud. Hist. Philos. Mod. Phys. 2005, 36, 375–411.
- Bennett, C.H. Notes on Landauer’s principle, reversible computation, and Maxwell’s demon. Stud. Hist. Philos. Mod. Phys. 2003, 34, 501–510.
- Ladyman, J.; Presnell, S.; Short, A. The use of the information-theoretic entropy in thermodynamics. Stud. Hist. Philos. Mod. Phys. 2008, 39, 315–324.
- Norton, J. Waiting for Landauer. 2011. Available online: http://philsci-archive.pitt.edu/8635/ (accessed on 1 December 2013).
- Tolman, R.C. The Principles of Statistical Mechanics; Oxford University Press: New York, NY, USA, 1938.
- Messiah, A. Quantum Mechanics, Volume II; North-Holland Publishing Company: Amsterdam, The Netherlands, 1962.
- Weinberg, S. Lectures on Quantum Mechanics; Cambridge University Press: Cambridge, UK, 2013.
- Norton, J. The End of the Thermodynamics of Computation: A No-Go Result. In Proceedings of the Philosophy of Science Association 23rd Biennial Meeting, San Diego, CA, USA, 2012. Available online: http://philsci-archive.pitt.edu/9658/ (accessed on 1 December 2013).

© 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).