Entropy and Computation: The Landauer-Bennett Thesis Reexamined

The so-called Landauer-Bennett thesis says that physically implemented logically irreversible operations, such as erasure, necessarily involve dissipation of at least k ln 2 per bit of lost information. We identify the physical conditions that are necessary and sufficient for erasure and show that the thesis does not follow from the principles of classical mechanics. In particular, we show that even if one assumes that information processing is constrained by the laws of classical mechanics, it need not be constrained by the Second Law of thermodynamics.


Introduction
The so-called Landauer-Bennett (LB) thesis says that the physical implementations of logically irreversible operations necessarily and universally involve dissipation of at least k ln 2 per bit of lost information (see Landauer [1,2], Bennett [3,4]). (Leff and Rex [5] ascribe the thesis also to Oliver Penrose (hence "LBP thesis"); see Penrose [6].) In this paper we argue that while dissipation may occur in interesting and even familiar situations, the thesis itself, as a universal claim, does not logically follow from the principles of classical mechanics. This means that the thermodynamic properties of physical processes implementing computations depend on their physical details. In this sense the LB thesis is not universal. Although there are cases in which the entropy of the universe increases during erasure, there are also cases compatible with classical mechanics in which the entropy does not increase during erasure (see Figures 7-9). As far as we know there is no general characterization of the conditions under which erasure is dissipative, so that as of now there seems to be no proof even of a limited version of the thesis. (The LB thesis may have motivated the search for embedding logically irreversible operations within logically reversible algorithms, such as Fredkin's gate; see Fredkin and Toffoli [7]. However, since the LB thesis is not universal, these results, although interesting as theorems in logic, do not bear on the thermodynamics of computation.)
Rolf Landauer writes: "Consider a typical logical process, which discards information, e.g., a logical variable that is reset to 0, regardless of its initial state.... The erasure process we are considering must map the 1 space down into the 0 space. Now, in a closed conservative system phase space cannot be compressed, hence the reduction in the spread [in the degrees of freedom representing 1 and 0] must be compensated by a phase space expansion [in other degrees of freedom], i.e., a heating of the irrelevant degrees of freedom, typically thermal lattice vibrations. Indeed, we are involved here in a process which is similar to adiabatic magnetization (i.e., the inverse of adiabatic demagnetization), and we can expect the same entropy increase to be passed to the thermal background as in adiabatic magnetization, i.e., k ln 2 per erasure process. At this point, it becomes worthwhile to be a little more detailed.... This is, however, rather like the isothermal compression of a gas in a cylinder into half its original volume. The entropy of the gas has been reduced and the surroundings have been heated, but the process is not irreversible: the gas can subsequently be expanded again. Similarly, as long as 1 and 0 occupy distinct phase space regions, the mapping is reversible. The real irreversibility comes from the fact that the 1 and 0 spaces will subsequently be treated alike and will eventually diffuse into each other." (Landauer [2], p. 2, our emphasis).
In what follows we shall analyze Landauer's idea of diffusion and put forward a condition we call blending, which clarifies it. Blending, as we define it, is a necessary and sufficient condition for erasure.

Macrostates
We start with a strictly classical mechanical underpinning of the notions that figure in the LB thesis, in particular macrostates and entropy. (By entropy we mean the Boltzmann notion of the Lebesgue measure of the macrostate of the system. For a Gibbsian version of the LB thesis, and criticism, see Maroney [8].) According to all the major theories of physics the universe is at each moment in some well-defined state, called a microstate (under the restrictions of relativity and a certain understanding of what instantaneous velocity is). The nature of the microstate is described by the physical theory; for example, in classical mechanics a microstate of the universe is the positions and velocities of its particles. The time evolution of the microstate is given by equations of motion that completely describe all the changes the system undergoes over time. This is all there is in mechanics. And so the reduction of thermodynamics to mechanics means that everything that we say about subsystems of the universe should be phrased in terms of statements about microstates and their time evolution. It is customary to represent the microstates of the universe as points in an abstract state space. (The theory that describes the nature of the microstates also determines the properties and dimensionality of the state space; most of these details do not matter for us here. Here we restrict attention to classical mechanics.)
It is possible (and usually uncontroversial) to describe the universe as consisting of two sets of degrees of freedom, which we shall call O and G, such that it is meaningful to talk about the microstate of each set separately. (Separability need not always be the case, but we assume it for convenience. Relaxing it requires certain changes in the way some of the ideas here are expressed, but the main ideas still hold.) Not every microstate is possible for every system at every time: in general, there are constraints on the universe, which determine that it can only be in certain microstates and not others; for example, its energy may be limited. Given the constraints, the set of all the microstates in which the universe may be is represented by a region in the state space called the accessible region (the energy hypersurface is an important example in the reduction of thermodynamics to mechanics). Let us consider two schemes of the structure of accessible regions.
In the first (extremely simple) example (illustrated in Figure 1) the accessible region consists of a segment of a straight oblique line in the OG plane. In this structure of the accessible region there is a one-to-one correlation between the microstates of O and the microstates of G, and so if one knows the microstate of O one can deduce from it the microstate of G. In the second example (see Figure 2) the accessible region of the universe consists of two horizontal line segments in the OG plane. The correlation between O and G here is one-to-many: each microstate of O is correlated with several microstates of G. For example, the microstate o1 of O is correlated with the three microstates p1, p2 and p3 of G. Due to this structure of the accessible region one cannot infer the microstate of G from the microstate of O, even if one knows the structure of the accessible region.
The microstates p1, p2 and p3 share the physical property of being correlated with o1; this is a physical property since it is determined by the structure of the accessible region of the O + G universe, and this structure, in turn, is determined by the general constraints and limitations on the possible microstates of the universe. In some very important (and possibly rare) cases, such sets of microstates of G that are formed in virtue of one-to-many correlations with O gain significance. These are the cases in which the physical system O has a special physical structure that justifies calling it an observer which observes its environment G. Our working hypothesis is that the relevant notion of an observer is a physical notion, and in particular that our experience of thermodynamic systems, that is, our micro correlations with our environment, can be accounted for in purely physical terms (we argue for this partially in [9]).
We will now give an account of the notion of a macrostate in these physical terms. The account of O intended here is purely physical. O is nothing but a physical system, a subsystem of the O + G universe, and whatever we say about O, we say about O as a purely physical system. In the case of Figure 2, we may say that when O is in the microstate o1, O cannot tell whether G is in the microstate p1, p2 or p3, even if O knows the structure of the accessible region. These microstates are indistinguishable for O. In this case we shall say that the microstates p1, p2 and p3 of G belong to one macrostate of G, which we denote by B1, and (for a similar reason) the microstates p4, p5 and p6 belong to a different macrostate of G, which we denote by B2. It is of utmost importance to bear in mind that macrostates in statistical mechanics are sets of microstates that are grouped together because an observer does not distinguish between them. In this sense macrostates are relative to the observer's resolution power. But this relativity is of course physical and objective. For further details about this point see our [10], Chapter 5 and [11].
To sum up: What is a macrostate? It is a set of microstates of G that is correlated with a microstate of O. Macrostates in this sense are objective since, given classical mechanics, there is at each moment of time a physical fact concerning the question of which set of microstates of G is correlated with each microstate of O. In this way we achieve both desiderata: we wanted macrostates to be grounded in objective facts, and we wanted these facts to be physical facts.
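The definition just summarized can be illustrated with a small sketch. It is only a toy under our own assumptions: the accessible region is represented as a finite list of (O microstate, G microstate) pairs, the label o2 and the function name `macrostates_of_g` are hypothetical, and the continuous line segments of Figures 1 and 2 are replaced by a handful of points.

```python
from collections import defaultdict

def macrostates_of_g(accessible_region):
    """Group the microstates of G by the microstate of O they are correlated with.

    accessible_region: iterable of (o, g) pairs, a discrete stand-in for the
    accessible region of the O + G universe (an assumption of this sketch).
    """
    groups = defaultdict(set)
    for o, g in accessible_region:
        groups[o].add(g)
    return {o: frozenset(gs) for o, gs in groups.items()}

# One-to-many correlation, in the spirit of Figure 2.
# "o2" is a hypothetical label for the O microstate correlated with p4, p5, p6.
one_to_many = [("o1", "p1"), ("o1", "p2"), ("o1", "p3"),
               ("o2", "p4"), ("o2", "p5"), ("o2", "p6")]

# One-to-one correlation, in the spirit of Figure 1.
one_to_one = [("o1", "p1"), ("o2", "p2"), ("o3", "p3")]

B = macrostates_of_g(one_to_many)
B1, B2 = B["o1"], B["o2"]  # the macrostates called B1 and B2 in the text
```

In the one-to-one case every group is a singleton, so knowing the microstate of O fixes the microstate of G; in the one-to-many case the groups are the macrostates, and they are fixed entirely by the structure of the accessible region.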
Macrostates, construed in this way, are objective physical facts: they are determined by the structure of the accessible region of the O + G universe; and this structure, in turn, is determined by the constraints and limitations on the possible microstates of the universe. And so we have here an account of how sets can gain ontological significance. However, it turns out that in order to endow sets of microstates with objective physical status, we must realize that an observer is part of the theory: a physical observer, but an observer nonetheless. And so it turns out that statistical mechanics cannot be made observerless.
Given this notion of a macrostate of G as a set of microstates of G that stand in correlation with a microstate of O, two more facts about the thermodynamic partition to macrostates become interesting: one concerns the special nature of the thermodynamic macrostates, and the other concerns the regularity they exhibit.
Each thermodynamic macrostate of G consists of microstates that share two kinds of physical properties. First, they share a correlation with a microstate of O, and this fact makes them a macrostate. Second, all the microstates in a given macrostate of G share some property pertaining to G alone. For example, they may share the average kinetic energy of particles, or average position in space. The difference here is crucial: we see the world in terms of macrostates because of the correlations between O and G; but we characterize the thermodynamic macrostates in terms that pertain to G only, and these properties have been discovered as part of the creation of statistical mechanics. It is precisely because thermodynamic macrostates can be expressed by properties of G alone that the correlation with the observer O is overlooked. But this correlation is of utmost importance since without it, we would not have looked for the G-only properties, and we would not have taken them to be physically significant. (For more on this see our [10], Chapter 5 and [11].)
In the discussion so far we described a case where the observer O observes the system G directly. But observers often measure the state of the observed system indirectly, by using measuring devices. Let us describe a situation in which O measures some quantity of G by a measuring device D. The state space of this case is illustrated in Figure 3, where the vertical axis D stands for the measuring device. The external constraints upon O, G and D and the internal interactions between them determine their common accessible region. S is the macrostate of D representing its Ready state, and 0 and 1 are the macrostates of D representing the measurement outcomes. Taking the macrostates of G and the macrostates of D together, one may talk about the macrostates of D + G relative to O.
We reiterate that macrostates in our approach are relative to observers, and for this reason we keep the O axis visible in our figures although for simplicity we do not depict its details. This fact is highly significant for understanding the notion of erasure (which is macroscopic). In Figure 3 the macrostates of D + G relative to O are denoted by the two-dimensional rectangles M1, M2 and M3. On the basis of this account we can understand the LB thesis, to which we now turn.

Erasure
As already noted by Landauer [1] (p. 149 in Leff and Rex [5]), in classical mechanics there is no microscopic erasure because of the determinism of the dynamics. To see why, note that in a trivial sense the microstate of the universe at any moment is a memory of all the microstates of the universe to the past of this moment, since given the equations of motion one can derive any past (as well as future) microstate from the present microstate. Therefore in classical mechanics microscopic memory can never be erased.
Another way of looking at this matter is this. In the context of the physical implementation of computation (namely, in computers), Landauer took an erasure to be a physical implementation of the function restore-to-one: f(0) = f(1) = 1. This is a special case of the function f(0) = f(1) = X (where X is some standard state) in the sense that one cannot infer the initial state from the final state. However, since the classical dynamics is both deterministic and time reversal invariant, it follows that two different microstates, such as those implementing the data 0 and 1, cannot both evolve to the same microstate, such as 0 or 1. And so it turns out that erasure (as well as other logically irreversible operations) can be carried out only on macrostates. (A measurement is a sort of reversal of erasure and likewise is essentially macroscopic; see our [10], Chapter 9.)
A necessary and sufficient condition for an erasure is what we call blending, which has some similarity with Landauer's notion of diffusion. Landauer [2], in the quotation cited in the Introduction, says that the completion of an erasure requires what he calls diffusion. The setup that Landauer seems to have had in mind is the following, as sketched in Figures 4, 5 and 6 (the shaded areas contain the actual microstate in each case). The structure of the trajectories of the universe in this case is such that trajectories that start in the region M1 end up after some time in the region M0, and likewise trajectories that start in the region M2 also end up after the same time in M0. In the figure we do not depict the regions within M0 occupied by the end points of these two trajectory bundles (since as far as erasure is concerned these details are not relevant). This structure of trajectories implies that whether the actual pre-erasure macrostate is M1 (which is the case in Figure 4) or M2 (which is the case in Figure 5), after the erasure the macrostate will be M0, as in Figure 6. Since both bundles of trajectories that start in M1 and in M2 evolve to M0, Liouville's theorem requires that the Lebesgue measure of M0 be at least equal to the sum of the measures of M1 and M2, and this requirement is satisfied in this case. Likewise, Liouville's theorem is satisfied in all the cases we analyze below. Here, after the two bundles of trajectories arrive at M0 from M1 and M2, they diffuse or blend in M0, in the sense that they are no longer distinguishable, since the observer O using the measuring device D cannot infer from the final macrostate M0 the macroscopic history of D + G. That is, O cannot say, given M0, whether the macrostate of D + G was M1 or M2. (Here we did not distinguish between the erasure of known and unknown data; see below.) Since the Lebesgue measure of M0 (which is entropy in the Boltzmannian approach) is equal in this case to the sum of the Lebesgue measures of M1 and M2, the entropy of D + G increases. This case, which satisfies the LB thesis, is often discussed in the literature (in various versions) but it is not the general case, as we shall see. (For the choice of the measure of entropy in statistical mechanics, see our [10].)
Landauer may have thought that after some time any region of positive Lebesgue measure within M0 contains both end points that came from M1 and end points that came from M2, and so it is impossible to identify sub-regions in M0 that contain end points that belong to only one of these bundles. The idea of diffusion is indeed very important, but it can and should be generalized; and the generalization entails, as we will now see, that Landauer's thesis concerning the entropy of erasure is not a universal theorem of mechanics. (A different argument criticizing the LB thesis is given by Norton [12]. A defense of Landauer's thesis is given, e.g., by Bub [13] and Ladyman and Robertson [14].)
To see this, consider first Figures 7 and 8, which illustrate the necessary and sufficient condition for erasure that we call blending. In Figure 7 all four macrostates have the same Lebesgue measure, and the trajectories that start in the macrostate M1 evolve in such a way that the bundle of trajectories partly overlaps with macrostates M3 and M4. (In our [10] we call this bundle of trajectories, or to be more precise, the end points at a given time of this bundle, the dynamical blob.) The shaded areas are the initial macrostate and its evolution. In this special case, designed for simplicity, the bundle overlaps with exactly ½ of M3 and ½ of M4. Similarly, in Figure 8 we see the evolution of the trajectories that start in M2: they evolve so that the bundle of trajectories overlaps with exactly the remaining ½ of M3 and the remaining ½ of M4. By the end of this evolution, O measures the state of D in order to learn from it the state of G. Here, the usual measurement process takes place (see [10], Chapter 9). Assuming that O is correlated with D and G in such a way that O can distinguish between M3 and M4, a detection takes place and then the dynamical blob collapses on either M3 or M4, depending on which of them contains the actual microstate of D + G. We reiterate that in statistical mechanics macrostates are equivalence classes defined relative to an observer's resolution power. For this reason we believe that the observer in statistical mechanics is essential (see our [11]). Since the account of both processes of erasure and measurement in mechanics necessarily involves macrostates, we keep the reference to an observer O in our discussion although we do not go into its details. For further reading about the role of the observer in statistical mechanics see our [10] and [11].
By the end of this measurement, O can say only that the macrostate of D + G is the outcome of the measurement, but cannot tell which sub-region of the actual macrostate contains the actual microstate, for this is the very idea of the notion of a macrostate. Since M4 (and similarly M3) contains end points that started out in both M1 and M2, that is, since the blobs that started in M1 and M2 blend within M3 and M4, O cannot infer from M4 (or M3, depending on the actual outcome of the erasure) which macrostate, M1 or M2, was the case before the erasure. In this special case the final macrostate detected by O is either M3 or M4, and since the Lebesgue measure of M3 and of M4 is equal to the Lebesgue measure of the initial macrostate (M1 or M2), the entropy of D + G did not change during the erasure.
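The scenario of Figures 7 and 8 can be mimicked by a discrete toy model. The sketch below rests on assumptions that are ours, not the paper's: phase space is replaced by 16 cells of equal measure, each macrostate is a set of four cells, the dynamics is a hand-picked permutation of the cells (a stand-in for measure-preserving, invertible classical evolution), and entropy is taken as the logarithm of the number of cells, in units of k.

```python
import math

# Hypothetical discretization: four macrostates of four equal-measure cells each.
M1, M2 = set(range(0, 4)), set(range(4, 8))
M3, M4 = set(range(8, 12)), set(range(12, 16))

# Hand-picked permutation of the 16 cells. Half of M1 goes to M3 and half to M4;
# M2 fills the remaining halves. Cells of M3 and M4 are sent back to M1 and M2
# only so that the map is a bijection on the whole toy phase space.
step = {0: 8, 1: 9, 2: 12, 3: 13,
        4: 10, 5: 11, 6: 14, 7: 15,
        8: 0, 9: 1, 10: 2, 11: 3,
        12: 4, 13: 5, 14: 6, 15: 7}

def is_measure_preserving(dynamics):
    """A bijection on equal-measure cells is the toy analogue of Liouville's theorem."""
    return sorted(dynamics.values()) == sorted(dynamics.keys())

def image(dynamics, region):
    """End points of the bundle of trajectories that start in `region` (the blob)."""
    return {dynamics[cell] for cell in region}

def is_blended(dynamics, initial_macrostates, final_macrostate):
    """True if the final macrostate contains end points from every initial macrostate."""
    return all(image(dynamics, m) & final_macrostate for m in initial_macrostates)

def entropy(macrostate):
    """Boltzmann entropy in units of k: log of the (counting) measure."""
    return math.log(len(macrostate))

print("measure preserving:", is_measure_preserving(step))
print("blended in M3:", is_blended(step, [M1, M2], M3))
print("blended in M4:", is_blended(step, [M1, M2], M4))
print("entropy change if O resolves M3 from M4:", entropy(M3) - entropy(M1))
print("entropy change if O sees only M3 + M4:", entropy(M3 | M4) - entropy(M1))
```

The same microscopic map counts as an erasure under both partitions; only the entropy bookkeeping differs, which is the sense in which blending and entropy are independent in this toy.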
The case in which the correlations between O and D + G are such that O can distinguish between the macrostates M3 and M4 is special: in general this correlation can be either finer or coarser. An example of an erasure with a coarser correlation is illustrated above in Figure 6. An example of an erasure with a finer correlation is given in Figure 9, in which the regions M3, M4, M5 and M6 are macrostates. It is very important to realize that there is no intrinsic connection between blending (or diffusing), which depends on the structure of trajectories in the blobs, and the entropy, which is fixed by the measure of the macrostates. It seems to us that Landauer [1,2] had in mind a blending dynamics that takes place within a given macrostate. But what we have just seen is that blending may equally take place across different macrostates (each of which may be of a smaller Lebesgue measure than the initial macrostate). Finally, the above analysis of erasure holds also for the special case of erasure as restore-to-one, as can be seen in Figure 10, where S1 is Landauer's information-bearing degree of freedom. Here, restoring to one does not result in an entropy increase, due to the partition into macrostates on the S2 (non-information-bearing) degree of freedom.
In familiar thermodynamic situations there seem to be fixed limitations on the observation capabilities of human observers and, in this sense, one can perhaps introduce a maximally fine-grained partition into thermodynamic macrostates (see Earman [15]) which results in some specific entropy of post-erasure macrostates. But the details of these macrostates are a contingent matter of fact. In particular, the principles of mechanics entail no specific relation between the pre-erasure and post-erasure entropy of the universe. This means that whether or not the LB thesis is true for the familiar thermodynamic situations is a question of contingent fact as well. In any case, our analysis of erasure demonstrates that, contrary to the conventional wisdom, the LB thesis is not a theorem in classical mechanics (nor is it a theorem in quantum mechanics; see our [10], Appendix B.3).
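The dependence of the entropy change on the partition can be written out as a short calculation. It is a sketch under two assumptions of ours: that the Boltzmann entropy of a macrostate M is written S(M) = k ln μ(M), with μ the Lebesgue measure, and that the initial macrostates M1 and M2 have equal measure (the text states equal measures only for Figure 7). The block uses the amsmath `align*` environment.

```latex
\[
  S(M) = k \ln \mu(M), \qquad
  \Delta S = k \ln \frac{\mu(M_{\mathrm{final}})}{\mu(M_{\mathrm{initial}})}
\]
\begin{align*}
  \text{Coarser partition (Figure 6):} \quad
    & \mu(M_0) = \mu(M_1) + \mu(M_2) = 2\,\mu(M_1)
      \;\Rightarrow\; \Delta S = k \ln 2 \\
  \text{Figures 7 and 8:} \quad
    & \mu(M_3) = \mu(M_4) = \mu(M_1)
      \;\Rightarrow\; \Delta S = 0 \\
  \text{Finer partition (Figure 9):} \quad
    & \mu(M_{\mathrm{final}}) < \mu(M_1)
      \;\Rightarrow\; \Delta S < 0
\end{align*}
```

In all three lines the microscopic dynamics may be equally blending; what differs is the measure of the macrostate that the observer ends up with.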

Erasure of Random Data
Up to now our account focused on the erasure of data known to the observer, as opposed to unknown or random data (see Bennett [3,4] for these terms). (Bennett's [3] and Feynman's [16] analyses of an erasure using a bi-stable well are special cases of our analysis; see our [10], Chapter 12.) In the literature erasure is often described in terms that pertain to information-bearing degrees of freedom vs. other degrees of freedom. Let us re-describe our account of erasure in the previous section in these terms. Consider first the erasure dynamics described in Figures 11 and 12. Here S1 is the information-bearing degree of freedom and S2 is the other, non-information-bearing degree of freedom. In Figures 11 and 12 the dynamics maps the initial macrostates La and Ra to the final macrostate Rab. As can be seen in the two figures, the Lebesgue measure of Rab is equal to the sum of the Lebesgue measures of the initial macrostates La and Ra, as required in this case by Liouville's theorem. It can easily be seen that this dynamics is blending since one cannot infer from the final macrostate R of S1 the initial macrostate L or R of S1. This dynamics, however, results in an increase of the total entropy of the universe S1 + S2, and in particular the dynamics increases the entropy of the non-information-bearing degree of freedom S2.
However, this entropy increase is not necessary. Suppose, for example, that in the case of Figure 11 the observer carries out a measurement on the post-erasure blob in order to find out whether the system is in the macrostate Ra or Rb. The outcome of this measurement is either the one in Figure 13 or the one in Figure 14. Similarly for the case of Figure 12: here, too, the outcome of the measurement will be either the one in Figure 13 or the one in Figure 14. And note that this measurement can be finished at the same time that the erasure (in Figures 11 and 12) ends. In these cases, the post-erasure macrostate has the same entropy as the pre-erasure macrostate. And of course other changes of entropy are possible, depending on the partition of the state space into macrostates at the same time (that is, the time at which the erasure is finished), according to the details of the correlations between the observer and the system.
However, Landauer [1] insisted that the interesting case of erasure (namely the one that is relevant for analyzing computation and for applying the LB thesis) is that of unknown or random data. This case is described in Figures 15-17. In the initial state in Figure 15 the observer cannot distinguish between La and Ra and therefore the initial macrostate is La + Ra. In some sense one can say that in this case there is nothing to be erased, since at the initial time before the erasure there is no information that is known to the observer. The only way to make sense of erasure of random data is in terms of counterfactuals. That is, if the observer were to know the initial data, then the erasing dynamics would result in a final macrostate from which the observer would not be able to recover that data. Here is a dynamics that implements an erasure of this sort. In Figure 16 the dynamics takes all the trajectories in the macrostate La + Ra to the final macrostate Ra + Rb. In this case there is no explicit blending, but none is in fact required since, as we said, there is no information known to the observer that is finer than the initial macrostate L + R along S1. In the counterfactual sense we proposed above, the dynamics maps L + R onto R while expanding along S2 (which is necessary due to Liouville's theorem). Although in this case the total entropy of the universe S1 + S2 is conserved, the entropy of S2 increases.
However, even this increase is not necessary. Consider Figure 17. Here the dynamics results in erasing (counterfactually) the information stored in S1, by mapping the initial macrostate La + Ra to two different macrostates Ra and Rb of S2, which in this case are distinguishable by the observer. That is, the blending in this case takes place across different macrostates of S2, while in the previous case (of Figure 16) the blending takes place within a single macrostate of S2. Whether the macrostates L and R of S1 (and similarly the macrostates a and b of S2) are distinguishable by an observer is a question of fact, and therefore cannot be settled universally by any principle of mechanics.
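The entropy bookkeeping for the known-data and random-data cases can be tabulated with a few lines of code. This is a sketch under our own simplifying assumptions: the regions La, Ra and Rb are treated as cells of equal unit measure, entropy is the logarithm of the number of cells in units of k, and the entropy of S2 alone is taken as the logarithm of the number of S2 regions (a, b) that the macrostate spans. The numbers it produces follow from these toy assumptions and are not values stated in the text; in particular, the total entropy change for the Figure 17 row is a consequence of the equal-measure assumption only.

```python
import math

def entropy(macrostate):
    """Boltzmann entropy in units of k, assuming unit-measure cells."""
    return math.log(len(macrostate))

def s2_entropy(macrostate):
    """Entropy of the S2 projection: log of the number of S2 regions spanned."""
    return math.log(len({s2 for _, s2 in macrostate}))

# Cells are (S1 region, S2 region) pairs; labels follow the figures' La, Ra, Rb.
La, Ra, Rb = ("L", "a"), ("R", "a"), ("R", "b")

# (initial macrostate, final macrostate) for each case discussed above.
cases = {
    "known data, Figs 11-12": ({La}, {Ra, Rb}),
    "known data + measurement, Figs 13-14": ({La}, {Ra}),
    "random data, Fig 16": ({La, Ra}, {Ra, Rb}),
    "random data, Fig 17": ({La, Ra}, {Ra}),
}

def changes(initial, final):
    """Return (total entropy change, S2 entropy change) in units of k."""
    return (entropy(final) - entropy(initial),
            s2_entropy(final) - s2_entropy(initial))

for name, (initial, final) in cases.items():
    total, s2 = changes(initial, final)
    print(f"{name}: total = {total:+.3f} k, S2 = {s2:+.3f} k")
```

Where a measurement outcome could be Ra or Rb, the table uses Ra; by the equal-measure assumption the Rb outcome gives the same numbers.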
This completes our account of the notion of blending, which is both necessary and sufficient for erasure in classical mechanics. Our analysis can easily be generalized to all logically irreversible operations (see [10], Chapter 12).

Maxwell's Demon
Elsewhere we showed ([10], Chapter 9) that measurement has no specific implications concerning the behavior of entropy. In particular, entropy can decrease during measurement (contra Szilard [17]). Maxwell's Demon is the conjunction of a measurement that decreases entropy and opens a causal route to exploiting energy to produce work, and an erasure that does not increase entropy (too much). Since both dynamics are compatible with the principles of classical mechanics, so is a Maxwellian Demon (see Albert [18], Chapter 5, and a detailed phase space construction of a Maxwellian Demon in our [10], Chapter 13 and Appendix A, and [19,20]). (We believe that a Maxwellian Demon is consistent with quantum mechanics as well; see [10], Appendix B.3. However, our argument in the context of quantum mechanics may not be conclusive since it makes no appeal to the notion of entropy. The reason is that we are not sure what the quantum mechanical counterpart of thermodynamic entropy is. We know it is not the von Neumann entropy; see our [21].)
An influential argument to the effect that a Maxwellian Demon is incompatible with classical mechanics was given by Bennett [3], in which the LB thesis is put to work. In Bennett's analysis it is not entirely clear whether his erasure is of known or random data. However, we have shown above that the LB thesis does not follow from the principles of mechanics in either case. Bennett's [3] analysis of the erasure in his thought experiment is a special case in which entropy increases (by lifting the partition), but it cannot be generalized. (Bennett's [3] account of the measurement in his thought experiment is a case of what we called the shadows approach; see [10], Section 9.5.)

Conclusion: Information is Physical
We showed above that the LB thesis is not universal in the sense that erasure need not be entropy increasing. In this sense information is not thermodynamic. However, we believe that information is physical and agree with Landauer that "information is not a disembodied abstract entity; it is always tied to a physical representation… this ties the handling of information to all the possibilities and restrictions of our real physical world, its laws of physics and its storehouse of available parts." (Landauer [22], p. 188) Without analyzing the concept of information we believe that it is safe to say that wherever there is information in the world it is represented in the physical states of things: "…by engraving on stone tablet, a spin, a charge, a hole in punched card, a mark on paper, or some other equivalent." (Landauer [22], p. 188) Here again we agree with Landauer, but wish to stress that if one wishes to conform to a robust physicalist outlook, one must understand in addition the notion of representation in physical terms (and likewise of course all the terms appearing in this sentence). In particular, one must add first of all to Landauer's list the human brain, and most importantly one must be ready to accept a full-fledged identity theory in which all concepts such as representation, computation, memory, intentionality, as well as mathematical concepts, are nothing but identical with some types of physical states of the brain (or any other system interacting with it). It is today a prevalent approach in both philosophy of mind and mathematics that one need not and cannot reduce mental states and mathematical concepts to physical states, but nonetheless that this inability is compatible with a physicalist outlook (according to which identity is mandatory at the token level). We are not sure that a full-fledged identity theory is either possible or true, although we believe that given the success of physics this seems to be the best research program. What we are sure about, however, is that if this research program is in fact false, then the physicalist outlook is false, even at the token level. We join Landauer in saying that "we cannot expect our colleagues in mathematics and in computer science to be cheerful about surrendering their independence. Mathematicians in particular have long assumed that mathematics was there first and that physics needed that to describe the universe. We will instead ask for a self-consistent framework." (Landauer [22], p. 188)

Figure 1. Accessible region with one-to-one correlation.

Figure 2. Accessible region with one-to-many correlation.

Figure 11. Erasure of known data: Case a.

Figure 12. Erasure of known data: Case b.