A Cost/Speed/Reliability Tradeoff to Erasing

Gopalkrishnan, Manoj

doi:10.3390/e18050165

Open AccessArticle

A Cost/Speed/Reliability Tradeoff to Erasing^†

by

Manoj Gopalkrishnan

School of Technology and Computer Science, Tata Institute of Fundamental Research, Mumbai 400005, India

^†

This paper is an extended version of our paper published in the 14th International Conference on Unconventional Computation and Natural Computation (UCNC), Auckland, New Zealand, 30 August–3 September 2015

Entropy 2016, 18(5), 165; https://doi.org/10.3390/e18050165

Submission received: 21 December 2015 / Revised: 21 March 2016 / Accepted: 22 April 2016 / Published: 28 April 2016

(This article belongs to the Special Issue Information and Entropy in Biological Systems)

Download

Browse Figure

Versions Notes

Abstract

:

We present a Kullback–Leibler (KL) control treatment of the fundamental problem of erasing a bit. We introduce notions of reliability of information storage via a reliability timescale

τ_{r}

, and speed of erasing via an erasing timescale

τ_{e}

. Our problem formulation captures the tradeoff between speed, reliability, and the KL cost required to erase a bit. We show that rapid erasing of a reliable bit costs at least

log 2 - log (1 - e^{- \frac{τ_{e}}{τ_{r}}}) > log 2

, which goes to

\frac{1}{2} log \frac{2 τ_{r}}{τ_{e}}

when

τ_{r} > > τ_{e}

.

Keywords:

erasing; information; thermodynamics; Kullback–Leibler; optimal control; reliability; speed; tradeoff

1. Introduction

Biological systems are remarkably ordered at multiple scales and dimensions, from the spatial order witnessed in the packing of DNA inside the nucleus, the arrangement of cells to form tissues, and organs, and whole organisms, to the temporal order witnessed in the execution of various cellular processes. Superficially, such order might appear to violate the second law of thermodynamics which requires an increase in disorder with overwhelmingly high probability. In fact, there is no violation since biological systems expend energy to bring about and maintain this increase in order.

We would like to understand this “energy to order” conversion quantitatively. What are the fundamental limits to this conversion? Order can be measured in terms of information by counting the number of bits required to describe that order. From this point of view, understanding how much energy is required to create order becomes an instance of the investigation of the connection between information processing and thermodynamics. The basic information processing operation that increases order is the operation of “erasing” or resetting a bit to state 0. To fix ideas, imagine erasing random chalk marks from a blackboard, to leave it in a neat and ordered state.

Szilard [1] and later Landauer [2] have argued from the second law of thermodynamics that erasing at temperature T requires at least

k_{B} T log 2

units of energy, where

k_{B}

is Boltzmann’s constant. The Szilard engine is a simple illustration of this result. Imagine a single molecule of ideal gas in a cylindrical vessel. If this molecule is in the left half of the vessel, think of that as encoding the bit “0”, and the bit “1” otherwise. Erasing this Brownian bit corresponds to ensuring that the molecule lies on the left half, for example by compressing the ideal gas to half its volume. For a heuristic analysis, we may use the ideal gas law

P V = k_{B} T

, integrating the expression

d W = - P d V

for work from limits V to

V / 2

to obtain

W = k_{B} T log 2

. More rigorous and general versions of this calculation are known, which also clarify why this is a lower bound [3,4,5].

In practice, one finds that both man-made and biological instrumentation often require energy substantially more than

k_{B} T log 2

to perform erasing [6,7]. John von Neumann remarked on this large gap in his 1949 lectures at the University of Illinois [8]. Bennett [9] has remarked that DNA polymerases come close to the

k_{B} T log 2

bound. To copy a single base, a DNA polymerase hydrolyzes a triphosphate molecule to a monophosphate, which provides close to

18 k_{B} T

at temperature

T = 300

K. Note that this is still almost two orders of magnitude away from

k_{B} T log 2

. Furthermore, it is not clear whether the comparison is valid at all since copying and erasing are different operations.

How does one explain this large gap? Note that the result of

k_{B} T log 2

holds only in the isothermal limit, which takes infinite time. In practice, we want erasing to be performed rapidly, say in time

τ_{e}

, which requires extra entropy production. For intuition, suppose one wants to compress a gas in finite time

τ_{e}

. The gas heats up, and pushes back, increasing the work required.

Several groups [10,11,12] have recognized that rapid erasing requires entropy production which pushes up the cost of erasing beyond

k_{B} T log 2

, and have obtained bounds for this problem. A grossly oversimplified, yet qualitatively accurate, sketch of these various results is obtained by considering the energy cost of compressing the Szilard engine rapidly. Specializing a result from finite-time thermodynamics [13] to the case of the Szilard engine, one obtains an energy cost

(1 + \frac{k_{B} log 2}{σ τ_{e} - k_{B} log 2}) k_{B} T log 2

where σ is the coefficient of heat conductivity of the vessel.

The bounds obtained by such considerations depend on technological parameters like the heat conductivity σ, and not just on fundamental constants of physics and the requirement specifications of the problem. If one varies over the technological parameters as well, e.g., allowing

σ \to \infty

, the energy cost tends to

k_{B} T log 2

. Does there exist a more fundamental analysis for the cost of erasing that is independent of technological parameters, and improves on

k_{B} T log 2

? This is the open question we address in this paper.

Our contribution: We follow up on von Neumann’s suggestion [8] that the gap was “due to something like a desire for reliability of operation”. Swanson [14] and Alicki [15] have also looked into issues of reliability. We introduce the notion of “reliability timescale”

τ_{r}

, and explicitly consider the three-way trade-off between speed, reliability, and cost.

The other novelty of our approach is in bringing the tools of Kullback–Leibler (KL) control [16,17] to bear on the problem of erasing a bit. The intuitive idea is that the control can reshape the dynamics as it pleases but pays for the deviation from the uncontrolled dynamics. The cost of reshaping the dynamics is a relative entropy or KL divergence between the controlled and the uncontrolled dynamics, expressed as measures on path space.

We find the optimal control for rapid erasing of a reliable bit, and argue that it requires cost of at least

log 2 - log (1 - e^{- \frac{τ_{e}}{τ_{r}}}) > log 2

, which goes to

\frac{1}{2} log \frac{2 τ_{r}}{τ_{e}}

when

τ_{r} > > τ_{e}

. Importantly, our answer does not depend on any technological parameters, but only on the requirement specifications

τ_{r}

and

τ_{e}

of the problem.

2. The Erasing Problem

As a model of a bit, consider a two-state continuous-time Markov chain with states 0 and 1 and the passive or uncontrolled dynamics given by transition rates

k_{01}

from state 0 to state 1 and

k_{10}

from state 1 to state 0.

The transition rates

k_{01}

and

k_{10}

model spontaneous transitions between the states when no one is looking at the bit or trying to erase it. The time independence of these rates represents the physical fact that the system is not being driven.

Such finite Markov chain models often arise in physics by “coarse-graining”. For example, for the case of the Szilard engine, the transition rate

k_{10}

models the rate at which the molecule enters the left side, conditioned on it currently being on the right side.

Apart from their importance in approximating the behavior of real physical systems, finite Markov chains are also important to thermodynamics from a logical point of view. They may be viewed as finite models of a mathematical theory of thermodynamics. The terms “theory” and “model” are to be understood in their technical sense as used in mathematical logic. We develop this remark no further here since doing so would take us far afield.

Suppose the distribution at time t is

(p_{0} (t), p_{1} (t))

with

p_{1} (t) = 1 - p_{0} (t)

. Then, the time evolution of the bit is described by the ordinary differential equation (ODE):

\begin{matrix} {\dot{p}}_{0} (t) = - k_{01} p_{0} (t) + k_{10} (1 - p_{0} (t)) . \end{matrix}

(1)

Setting

π_{0} = k_{10} / (k_{01} + k_{10})

and the reliability timescale

τ_{r} : = 1 / (k_{01} + k_{10})

, this admits the solution

\begin{matrix} p_{0} (t) = π_{0} + e^{- t / τ_{r}} (p_{0} (0) - π_{0}) . \end{matrix}

(2)

Here,

τ_{r}

represents the time scale on which memory is preserved. The smaller the rates

k_{01}

and

k_{10}

, the larger is the value of

τ_{r}

, and the slower the decay to equilibrium, so that the system remembers information for longer.

Fix a required erasing time

τ_{e}

. Fix

p (0) = π_{0}

. We want to control the dynamics with transition rates

u_{01} (t)

and

u_{10} (t)

to achieve

p (τ_{e}) = (1, 0)

, where

\begin{matrix} {\dot{p}}_{0} (t) & = - u_{01} (t) p_{0} (t) + u_{10} (t) (1 - p_{0} (t)) . \end{matrix}

(3)

We want to find the cost of the optimal protocol

u_{01}^{*} (t)

and

u_{10}^{*} (t)

to achieve this objective, according to a cost function which we introduce next. In particular, when

k_{01} = k_{10}

, the equilibrium distribution

π = (π_{0}, 1 - π_{0})

takes the value

(1 / 2, 1 / 2)

, and we can interpret this task as erasing a bit of reliability

τ_{r} = 1 / (k_{01} + k_{10})

in time

τ_{e}

.

Kullback–Leibler Cost

Define the path space

P : = {0, 1}^{[0, τ_{e}]}

of the two-state Markov chain. This is the set of all paths in the time interval

[0, τ_{e}]

that jump between states 0 and 1 of the Markov chain. Each path can also be succinctly described by its initial state, and the times at which jumps occur. We can also effectively think of the path space as the limit as

h \to 0

of the space

P_{h} : = {0, 1}^{{0, h, 2 h, \dots, N h = τ_{e}}}

corresponding to the discrete-time Markov chain that can only jump at clock ticks of h units (Figure 1).

Once the rates

u_{01} (t), u_{10} (t)

and the initial distribution

p (0) = p

for the Markov chain are fixed, there is a unique probability measure

μ_{u, p}

on path space which intuitively assigns to every path the probability of occurrence of that path according to the Markov chain evolution (Equation (3)) with initial conditions p.

For pedagogic reasons, we first describe the discrete-time measure

μ_{u, p}^{h}

for a single path

i = (i_{0}, i_{1}, \dots, i_{N}) \in P_{h}

. First, we describe the transition probabilities of the discrete-time Markov chain. For

a, b \in {0, 1}

with

a \neq b

, for all times t, define

u_{a a}^{h} (t) : = 1 - h u_{a b} (t)

and

u_{a b}^{h} (t) : = h u_{a b} (t)

as the probability of jumping to a and to b, respectively, in the time step t, conditioned on being in state a. Then, the probability of the path i under control u is given by:

μ_{u, p} (i) : = p_{i_{0}} \prod_{j = 0}^{N - 1} u_{i_{j}, i_{j + 1}}^{h} (j h) .

We describe the continous-time case now. We could obtain the measure

μ_{u, p}

from

μ_{u, p}^{h}

by sending

h \to 0

, but it can also be described more directly. Fix

i_{0} \in {0, 1}

, and consider the set of paths

S = S_{i_{0}, t_{1}, t_{2}, \dots, t_{n}}

starting at

i_{0}

with jumps occurring around times

t_{1} < t_{2} < \dots < t_{n}

within infinitesimal intervals

d t_{1}, d t_{2}, \dots, d t_{n}

and leading to the trajectory

(i_{0}, i_{1}, \dots, i_{n}) \in {0, 1}^{n + 1}

. Setting

t_{0} = 0

:

μ_{u, p} (S) : = p_{i_{0}} \prod_{j = 0}^{n - 1} e^{- \int_{t_{j}}^{t_{j + 1}} u_{i_{j} i_{j + 1}} (s) d s} u_{i_{j} i_{j + 1}} (t_{j + 1}) d t_{j + 1},

where

p_{i_{0}}

is the probability of starting at

i_{0}

,

e^{- \int_{0}^{t_{1}} u_{i_{0} i_{1}} (s) d s}

is the probability of not jumping in the time interval

(0, t_{1})

,

u_{i_{0} i_{1}} (t_{1}) d t_{1}

is the probability of jumping from

i_{0}

to

i_{1}

in the interval

(t_{1}, t_{1} + d t_{1})

and so on. This is the well-known Feynam–Kac formula for this Markov chain.

Specializing to

u_{01} (t) = k_{01}

and

u_{10} (t) = k_{10}

, we obtain the probability measure

μ_{k, p}

induced on

P

by the passive dynamics (Equation (1)) with initial conditions p.

We declare the Kullback–Leibler (KL) cost

D (μ_{u, p} ∥ μ_{k, p})

as the cost for implementing the control u. More generally, for a physical system with path space

P

, passive dynamics corresponding to a measure ν on

P

, and a controlled dynamics with a control corresponding to a measure μ on

P

, we declare

D (μ ∥ ν)

as the cost for implementing the control. This cost function has been widely used in control theory [16,17,18,19,20,21,22,23,24,25,26,27]. In Section 4, we will explore some other interpretations of this cost function.

3. Solution to the Erasing Problem

Out of all controls

u (t)

, we want to find a control

u^{*} (t)

that starts from

p (0) = π = (\frac{k_{10}}{(k_{01} + k_{10}}, \frac{k_{01}}{k_{01} + k_{10}})

and achieves

p (τ_{e}) = (1, 0)

while minimizing the relative entropy

D (μ_{u^{*}, π} ∥ μ_{k, π})

,

\begin{matrix} To find : u^{*} = arg inf_{u} D (μ_{u, π} ∥ μ_{k, π}), \\ Subject to : μ_{u, π} (τ_{e}) = (1, 0) . \end{matrix}

This question can be described within the framework of a well-studied problem in optimal control theory that has a closed-form solution [16,17,28]. Following Todorov [16], we introduce the optimal cost-to-go function

v (t) = (v_{0} (t), v_{1} (t))

. We intend

v_{i} (t)

to denote the expected cumulative cost for starting at state i at time

t < τ_{e}

, and reaching a distribution close to

(1, 0)

at time

τ_{e}

.

To discourage the system from being in state 1 at time

τ_{e}

, define

v_{1} (τ_{e}) = + \infty

and

v_{0} (τ_{e}) = 0

. Suppose the control performs actions

u_{01} (t)

and

u_{10} (t)

at time t. Fix a small time

h > 0

. Define the transition probability

u_{i j}^{h} (t)

as the probability that a trajectory starting in state i at time t will be found in state j at time

t + h

. When

i \neq j

,

u_{i j}^{h} (t) \approx h u_{i j} (t)

, whereas

u_{i i}^{h} (t) \approx 1 - u_{i j}^{h} (t)

ignoring terms of size

O (h^{2})

. We define

k_{i j}^{h}

similarly.

Let “log” denote the natural logarithm. To derive the law satisfied by the optimal cost-to-go

v (t)

, we approximate

v (t)

by the backward recursion relations:

\begin{matrix} v_{0} (t) & = min_{u_{01} (t)} E [v_{i} (t + h) + log \frac{u_{0 i}^{h} (t)}{k_{0 i}^{h}}], \\ v_{1} (t) & = min_{u_{10} (t)} E [v_{i} (t + h) + log \frac{u_{1 i}^{h} (t)}{k_{1 i}^{h}}], \end{matrix}

(4)

where the first expectation is over

i \sim_{law} (u_{00}^{h} (t), u_{01}^{h} (t))

, and the second is over

i \sim_{law} (u_{10}^{h} (t), u_{11}^{h} (t))

, and the approximation ignores terms of size

O (h^{2})

. As

h \to 0

, the second terms

E log \frac{u_{j i}^{h} (t)}{k_{j i}^{h}}

approach the relative entropy cost in path space over the time interval

(t, t + h)

.

Equation (4) says that the cost-to-go from state 0 at time t equals the cost of the control

u (t)

plus the expected cost-to-go in the new state i reached at time

t + h

. The cost of the control is measured by relative entropy of the control dynamics relative to the passive dynamics, over the time interval

(t, t + h)

.

Define the desirability

z_{0} (t) = e^{- v_{0} (t)}

and

z_{1} (t) = e^{- v_{1} (t)}

. Define

\begin{matrix} G_{0} [z] (t) & = k_{00}^{h} z_{0} (t) + k_{01}^{h} z_{1} (t), \\ G_{1} [z] (t) & = k_{10}^{h} z_{0} (t) + k_{11}^{h} z_{1} (t) . \end{matrix}

We can rewrite Equation (4) as:

\begin{matrix} log z_{0} (t) & = log G_{0} [z] (t + h) - min_{u_{01} (t)} E [log \frac{u_{0 i}^{h} (t) G_{0} [z] (t + h)}{k_{0 i}^{h} z_{i} (t + h)}], \\ log z_{1} (t) & = log G_{1} [z] (t + h) - min_{u_{10} (t)} E [log \frac{u_{1 i}^{h} (t) G_{1} [z] (t + h)}{k_{1 i}^{h} z_{i} (t + h)}] . \end{matrix}

(5)

Since the last term is the relative entropy of

(u_{j 0}^{h} (t), u_{j 1}^{h} (t))

relative to the probability distribution

(k_{j 0}^{h} z_{0} (t + h) / G_{j} [z] (t + h), k_{j 1}^{h} z_{1} (t + h) / G_{j} [z] (t + h))

, its minimum value is 0, and is achieved by the protocol

u^{*}

given by:

\begin{matrix} \frac{u_{j i}^{*} (t)}{k_{j i}} = lim_{h \to 0} \frac{e^{- v_{i} (t + h)}}{G_{j} [z] (t + h)} = \frac{e^{- v_{i} (t)}}{e^{- v_{j} (t)}}, \end{matrix}

(6)

when

i \neq j

.

It remains to solve for

z (t)

and the optimal cost. From Equation (5), at the optimal control

u^{*}

, the desirability

z (t)

must satisfy the equation

- log z (t) = - log G [z] (t + h) + 0

, so that:

\begin{matrix} (\begin{matrix} z_{0} (t) \\ z_{1} (t) \end{matrix}) = (\begin{matrix} 1 - k_{01} h & k_{01} h \\ k_{10} h & 1 - k_{10} h \end{matrix}) (\begin{matrix} z_{0} (t + h) \\ z_{1} (t + h) \end{matrix}), \end{matrix}

which simplifies to

\frac{d z}{d t} = - K z

in the limit

h \to 0

, where

K = (\begin{matrix} - k_{01} & k_{10} \\ k_{10} & - k_{10} \end{matrix})

is the infinitesimal generator of the Markov chain. This equation has the formal solution

z (τ_{e} - t) = e^{K t} z (τ_{e})

where

z (τ_{e}) = (\begin{matrix} 1 \\ 0 \end{matrix})

. In the symmetric case,

k_{01} = k_{10}

,

z (t) = (\begin{matrix} 1 / 2 \\ 1 / 2 \end{matrix}) + e^{- \frac{τ_{e} - t}{τ_{r}}} (\begin{matrix} 1 / 2 \\ - 1 / 2 \end{matrix}),

where

τ_{r} = 1 / (k_{01} + k_{10})

. Substituting

t = 0

and taking logarithms, we find the cost-to-go function at time 0:

v (0) = (\begin{matrix} log 2 \\ log 2 \end{matrix}) - (\begin{matrix} log (1 + e^{- τ_{e} / τ_{r}}) \\ log (1 - e^{- τ_{e} / τ_{r}}) \end{matrix}) .

When

ν (0) = (1 / 2, 1 / 2)

with

k_{01} = k_{10}

, the cost

C_{erase} (τ_{r}, τ_{e}, T)

required for erasing a bit of reliability

τ_{r} = 1 / (k_{01} + k_{10})

in time

τ_{e}

is at least:

\begin{matrix} log 2 - \frac{1}{2} log (1 - e^{- 2 τ_{e} / τ_{r}}) . \end{matrix}

(7)

Note that

C_{erase} \geq log 2

with equality when

τ_{e} / τ_{r} \to \infty

, since

1 - e^{- 2 τ_{e} / τ_{r}} \leq 1

. From Equation (7),

C_{erase} \geq \frac{1}{2} log \frac{2 τ_{r}}{τ_{e}}

when

τ_{r} > > τ_{e}

.

4. Interpreting the KL Cost

One motivation for our cost function comes from the field of KL control theory. We now compare other possible meanings to this cost function.

4.1. Path Space Szilard–Landauer Correspondence

The correspondence between information and thermodynamics was revealed in the work of Szilard, and clarified by Landauer. More rigorous and general treatments of this correspondence have been worked out recently [3,4,5]. We first recall this result, and then show how our cost function is a formal extension of this result.

Consider a physical system with finite state space S and energy

E : S \to R

. (More general state spaces S can be handled by replacing the sum by an appropriate integral. For our present purposes, it suffices to assume S is finite.) Define the Gibbs distribution π at temperature T by

π (i) = \frac{e^{- E_{i} / k_{B} T}}{\sum_{j \in S} e^{- E_{j} / k_{B} T}},

for all

i \in S

. Define the free energy:

F (p) : = \sum_{i \in S} p_{i} E_{i} - k_{B} T \sum_{i \in S} p_{i} log \frac{1}{p_{i}},

where p is a probability distribution.

Define the relative entropy

D (p ∥ q) = \sum_{i \in S} p_{i} log \frac{p_{i}}{q_{i}}

with Euler’s constant for the base of the logarithm. Following Jaynes [29], assume that equilibrium π corresponds to a maximally uninformative state of the system, so that we have zero information about the system when it is at equilibrium. Recall that a nat is the unit of information when logarithms are taken to the base of Euler’s constant. 1 bit

= log 2

nats. Then, the relative entropy

D (p ∥ π)

has an axiomatic identification with the amount of information in nats that we know about the system when it is in a nonequilibrium state p [4].

The following identity is easily verified:

\begin{matrix} F (p) - F (π) = k_{B} T D (p ∥ π) . \end{matrix}

(8)

The conceptual significance of this simple identity is that it supplies a dictionary between thermodynamics and information theory [4]. In particular, erasing a bit corresponds to increasing relative entropy, which, in turn, corresponds—via the identity—to increasing available free energy

F (p) - F (π)

by

k_{B} T log 2

, recovering the classical result of Szilard as an alternative statement of the second law of thermodynamics. In the other direction, charging a battery corresponds to increasing available free energy which in turn corresponds—via Identity Equation (8)—to erasing of information. This relates the energy efficiency of charging a battery to the energy required to erase a bit.

Now, consider our cost function

D (μ ∥ ν)

. The relative entropy

D (μ ∥ ν)

counts the number of nats erased by the control in path space, relative to the passive dynamics. Since the Szilard–Landauer principle asserts that erasing one bit requires at least

k_{B} T log 2

units of energy, our cost function may be viewed as a Path Space Szilard–Landauer Principle, formally extending Identity Equation (8) to path space.

4.2. Thermodynamic Interpretation

We wish to compare the cost

D (μ ∥ ν)

with the usual thermodynamic expected work

Δ W

. We will quickly outline how thermodynamic quantities can be defined for a two-state Markov chain.

4.2.1. Thermodynamics on a Two-State Markov Chain

The ideas we present here are well-known in the nonequilibrium thermodynamics community, for example see Propp’s thesis [30]. The construction can be carried out more generally, but the generalization is not necessary for our present purposes.

Consider again the two-state continuous-time Markov chain with passive dynamics given by transition rates $k_{01}$ and $k_{10}$ .

Let $E_{0}$ and $E_{1}$ denote the internal energy of states “0” and “1”, respectively. Then, the equilibrium distribution is given by $π_{0} \propto e^{- E_{0} / k_{B} T}$ and $π_{1} \propto e^{- E_{1} / k_{B} T}$ . We also have $k_{01} π_{0} = k_{10} π_{1}$ from detailed balance. Together this yields

$\begin{matrix} E_{0} - E_{1} = k_{B} T log \frac{k_{01}}{k_{10}} . \end{matrix}$

(9)
Now consider the same two-state system with a control applied to it by means of a field of potential $ϕ (t) = (ϕ_{0} (t), ϕ_{1} (t))$ so that the potential energy in state i becomes $E_{i} + ϕ_{i} (t)$ . The transition rates due to the control become $u_{01} (t)$ and $u_{10} (t)$ . By a reasoning similar to how we derived Equation (9), we get

$E_{0} + ϕ_{0} - E_{1} - ϕ_{1} = k_{B} T log \frac{u_{01}}{u_{10}} .$

Combining with Equation (9), this yields

$\begin{matrix} ϕ_{0} - ϕ_{1} = k_{B} T log \frac{u_{01} k_{10}}{u_{10} k_{01}} . \end{matrix}$

(10)
Given a distribution p = ( p 0 , p 1 ) on the states, we can define the following thermodynamic quantities:
- Expected internal energy $E (p) = p_{0} E_{0} + p_{1} E_{1}$ .
- Entropy $S (p) = - p_{0} log p_{0} - p_{1} log p_{1}$ .
- Nonequilibrium free energy $F (p) = E (p) - k_{B} T S (p)$ .
Given a transition from state i to state j in the presence of the control field, we can define the following thermodynamic quantities:
- Heat dissipated $Q_{i j} (t) = E_{i} + ϕ_{i} (t) - E_{j} - ϕ_{j} (t)$ .
- Work done by the control $W_{i j} (t) = ϕ_{i} (t) - ϕ_{j} (t)$ . This expression for work can be traced back to Sekimoto [31], and is commonly employed in the field of Stochastic Thermodynamics to describe the work done by switching on a control field [32].
- Entropy increase of the system $S_{i j} (t) = log \frac{p_{i} (t)}{p_{j} (t)}$ .
The first law of thermodynamics manifests as W i j = Q i j + E i - E j .
Suppose the system is described at time t by a distribution $p (t) = (p_{0} (t), p_{1} (t))$ . Define the Current $J_{i j} (t) = p_{i} (t) u_{i j} (t) - p_{j} (t) u_{j i} (t)$ so that ${\dot{p}}_{0} (t) = - J_{01} (t)$ .
We can further compute

$\begin{matrix} \frac{d E}{d t} & = J_{01} (t) (E_{1} (t) - E_{0} (t)) = k_{B} T J_{01} (t) log \frac{k_{10}}{k_{01}}, \\ \frac{d W}{d t} & = J_{01} (t) W_{01} (t) = J_{01} (t) (ϕ_{0} (t) - ϕ_{1} (t)) = k_{B} T J_{01} (t) log \frac{u_{01} (t) k_{10}}{u_{10} (t) k_{01}}, \\ \frac{d Q}{d t} & = J_{01} (t) Q_{01} (t) = J_{01} (t) (E_{0} + ϕ_{0} (t) - E_{1} - ϕ_{1} (t)) = k_{B} T J_{01} (t) log \frac{u_{01} (t)}{u_{10} (t)}, \\ \frac{d S}{d t} & = J_{01} (t) S_{01} (t) = J_{01} (t) log \frac{p_{0} (t)}{p_{1} (t)}, \\ \frac{d F}{d t} & = k_{B} T J_{01} (t) log \frac{k_{10} p_{1} (t)}{k_{01} p_{0} (t)} . \end{matrix}$
Define Total Entropy Production $S_{tot} (t)$ to be the total entropy produced from time 0 to time t. In other words, $S_{tot} (0) = 0$ and

$\frac{d S_{tot} (t)}{d t} = \frac{1}{k_{B} T} \frac{d Q}{d t} + \frac{d S}{d t} .$

After simplification,

$\begin{matrix} \frac{d S_{tot} (t)}{d t} = (p_{0} (t) u_{01} (t) - p_{1} (t) u_{10} (t)) log \frac{p_{0} (t) u_{01} (t)}{p_{1} (t) u_{10} (t)} \geq 0, \end{matrix}$

(11)

which is a statement of the second law of thermodynamics.
The following identity is immediate

$\begin{matrix} \frac{d W}{d t} = \frac{d F}{d t} + \frac{d S_{tot}}{d t} \end{matrix}$

and is another form of the first law.

4.2.2. Thermodynamic Cost for Rapid Erasing of a Reliable Bit

How much does it cost for rapid erasing of a reliable bit, with the cost function equal to

Δ W

? We claim that it costs

k_{B} log 2

. In particular, neither the reliability timescale

τ_{r}

nor the erasing timescale

τ_{e}

appear in this answer.

Suppose we can erase a

(τ_{r}, τ_{e})

bit for work W. First, note that

\frac{d W}{d t}

is a function of

k_{01} / k_{10}

,

u_{01} (t) / u_{10} (t)

and

J_{01} (t)

as in Equation (6). In particular, simultaneously sending the rates

k_{01}

and

k_{10}

as low as possible while keeping their ratio the same has no effect on the work. Thus, if we can erase a

(τ_{r}, τ_{e})

bit for work W, then we can erase a

(A τ_{r}, τ_{e})

bit for work W, for an arbitrarily large constant A. In particular, it is enough for us to demonstrate a protocol when

τ_{r} = 1

.

Now, note that

\frac{d W}{d t}

depends only on the ratio

u_{01} (t) / u_{10} (t)

and not on the actual values of the rates. We can also erase a

(τ_{r}, τ_{e} / 2)

bit for work W by taking the

(τ_{r}, τ_{e})

-protocol

(u_{i j} (t))

and defining new rates

v_{i j} (t) = 2 u_{i j} (2 t)

. Since

v_{01} (t) / v_{10} (t) = u_{01} (2 t) / u_{10} (2 t)

, it follows from a simple calculation that the work required does not change.

By taking a limit of this time scaling argument, we only need to erase a

(1, + \infty)

bit. Here, the infinite-time isothermal protocol, which proceeds by raising the `1’ well infinitesimally, waiting for the system to equilibrate, and repeating, erases for a total work of

k_{B} T log 2

since that is the free energy difference between the initial and final state, and there is no extra dissipation. This establishes our claim.

A more detailed version of this calculation can be found in [33]. This work assumes that there is a maximum energy limit

E_{max}

to which a state can be raised, so that there will be some small error to erasing. It also makes another assumption about thermalization timescale which translates in our setting to assuming that there is a maximum value to the rates

u_{01} (t)

and

u_{10} (t)

. With these assumptions, they show that the cost of rapid erasing is slightly more than

k_{B} log 2

and goes to

k_{B} T log 2

very quickly as the timescale of thermal relaxation becomes smaller and

E_{max}

goes to infinity.

4.2.3. Link between KL-Cost and Thermodynamic Work

We will now characterize entropy production in terms of time reversal. This will allow us to make a link between KL-cost and the thermodynamic work W.

We first recall the notion of time reversal of a Markov chain. Usually time-reversal is defined for time-homogeneous Markov chains. However, for the purposes of characterizing entropy production in terms of time reversal, we will work with a definition that applies to time-inhomogeneous Markov chains also. Instead of giving this definition in full generality, we work with a Markov chain with a finite state space. This is sufficient for our purposes and allows us to avoid dealing with certain technical issues.

Note that given a matrix U with positive non-diagonal entries and

U \cdot 1 = 0

, there is a nonnegative vector v such that

v \cdot U = 0

. This can be shown by applying the Perron–Frobenius theorem to the exponential matrix

e^{U}

.

Definition 1

(Time-reversal). Consider a continuous-time time-inhomogeneous Markov chain with state space

[n] = {1, 2, \dots, n}

described by a time-dependent transition matrix

U (t) = {(u_{i j} (t))}_{i, j \in [n]}

(so that at time t,

u_{i j} (t)

denotes the rate of jumping to state j given that the system is in state i). Let

π (t)

be a sequence of stationary probability distributions on

[n]

, i.e.,

π (t) U (t) = 0

and

π_{i} \geq 0

for all

i \in [n]

and

\sum_{i} π_{i} = 1

. Then, the time-reversal Markov chain is described by the time-dependent transition matrix

\hat{U} (t) = {({\hat{u}}_{i j} (t))}_{i, j \in [n]}

where

{\hat{u}}_{i j} (t) = \frac{π_{i} (t) u_{i j} (t)}{π_{i} (t)} .

A Markov chain is reversible if

\hat{U} = U

.

The justification for considering time-reversal comes from Bayes’ rule. Reversible Markov chains are well-known to be characterized by the conditions of existence of a detailed balanced equilibrium, as well as by the Kolmogorov chain conditions. In particular, two-state Markov chains are always reversible.

For the special case of Equation (3) in particular, given a distribution q at time

τ_{e}

, the time reversal Markov chain evolves in time according to the ODE:

\begin{matrix} {\dot{q}}_{0} (t) = - u_{01} (t) q_{0} (t) + u_{10} (t) (1 - q_{0} (t)), q (τ_{e}) = q . \end{matrix}

(12)

We see that the difference is that, in Equation (3), the boundary condition was specified at time 0, whereas here the boundary condition is specified at time

τ_{e}

.

We define the time-reversed measure

μ_{u, q}^{rev}

as the measure on path space corresponding to the process described by Equation (12). Strictly speaking, we should write

μ_{u, q, τ_{e}}^{rev}

to denote the time at which the boundary condition is provided to the differential equation, but we will avoid this by using the convention that we are always going to set the boundary condition at time

τ_{e}

when considering the time-reversal.

The following result is key to our comparison.

Theorem 1.

Run the control dynamics Equation (3) forward from initial condition

p (0)

upto time

τ_{e}

to obtain the distribution

p (τ_{e})

. Consider the measure

μ_{u, p (τ_{e})}^{rev}

. Then, the total entropy production

S_{tot} (τ_{e})

from time 0 to time

τ_{e}

equals

S_{tot} (τ_{e}) = D (μ_{u, p (0)} ∥ μ_{u, p (τ_{e})}^{rev}) .

Proof.

We will show that the time derivative of the right-hand side (RHS) equals the rate of entropy production. This will prove the theorem.

Fix a time

t \in [0, τ_{e}]

. Let the probability distribution at time t be represented by

p (t) = (p_{0} (t), p_{1} (t))

. Let

f_{i j} : = p_{i} (t) u_{i j} (t)

denote the flow rate from state i to state

j \neq i

at time t. Then,

p_{i} (t + h) = p_{i} (t) + h (f_{j i} - f_{i j}) + o (h)

where

i \neq j

and

o (h)

denotes terms

g (h)

such that

{lim}_{h \to 0} g (h) / h \to 0

.

We will consider the probabilities of the four Markov chain transitions

0 \to 0

,

0 \to 1

,

1 \to 0

and

1 \to 1

in the interval

(t, t + h)

in the limit

h \to 0

according to

μ_{u, p (0)}

and according to

μ_{u, p (τ_{e})}^{rev}

. Up to terms of size

o (h^{2})

, we have for

i \neq j

:

\begin{matrix} μ_{u, p (0)} (i \to j) & = h p_{i} (t) u_{i j} (t) = h f_{i j}, \\ μ_{u, p (τ_{e})}^{rev} (i \to j) & = h p_{j} (t + h) u_{j i} (t) = h (f_{j i} + h (f_{i j} - f_{j i})) . \end{matrix}

The increment in the relative entropy in the time interval

(t, t + h)

equals, up to

o (h^{2})

terms:

\begin{matrix} (1 - h f_{01}) log \frac{1 - h f_{01}}{1 - h f_{10} - h^{2} (f_{01} - f_{10})} + h f_{01} log \frac{h f_{01}}{h f_{10} + h^{2} (f_{01} - f_{10})} \\ + h f_{10} log \frac{h f_{10}}{h f_{01} + h^{2} (f_{10} - f_{01})} + (1 - h f_{10}) log \frac{1 - h f_{10}}{1 - h f_{01} - h^{2} (f_{10} - f_{01})} . \end{matrix}

The off-diagonal terms contribute:

\begin{matrix} (1 - h f_{01}) log \frac{1 - h f_{01}}{1 - h f_{10} - h^{2} (f_{01} - f_{10})} + (1 - h f_{10}) log \frac{1 - h f_{10}}{1 - h f_{01} - h^{2} (f_{10} - f_{01})} \\ \approx (1 + o (h)) (h f_{10} - h f_{01} + o (h^{2})) + (1 + o (h)) (h f_{01} - h f_{10} + o (h^{2})) \\ \approx o (h^{2}) . \end{matrix}

Divide by h, and take the limit

h \to 0

. We can ignore the off-diagonal terms. The diagonal terms sum to the rate of entropy production

\frac{d S_{tot}}{d t}

as in Equation (11), and we are done. ☐

By the First Law of Thermodynamics and Theorem 1,

\begin{matrix} Δ W = Δ F + k_{B} T D (μ_{u, p} ∥ μ_{u, p (τ_{e})}^{rev}), \end{matrix}

(13)

where the increase in free energy of the system

Δ F = k_{B} T (D (p (τ_{e}) ∥ π) - D (p (0) ∥ π))

by Equation (8). Now, to compare our cost function with

Δ W

.

Theorem 2.

The

K L

-cost equals change in free energy by

k_{B} T

plus a path-space relative entropy term that resembles entropy production:

\begin{matrix} k_{B} T D (μ_{u, p} ∥ μ_{k, p}) = Δ F + k_{B} T D (μ_{u, p} ∥ μ_{k, p (τ_{e})}^{rev}), \end{matrix}

(14)

where

p (τ_{e})

is—as in Equation (13)—the solution to the control dynamics Equation (3) at time

τ_{e}

.

Proof.

Using Equation (8), we can rewrite the claim as

D (μ_{u, p} ∥ μ_{k, p}) + D (p (0) ∥ π) = D (μ_{u, p} ∥ μ_{k, p (τ_{e})}^{rev}) + D (p (τ_{e}) ∥ π) .

Both left-hand side (LHS) and RHS equal

D (μ_{u, p} ∥ μ_{k, π})

. The assertion for the LHS is straightforward. The assertion for the RHS is true because time-reversal dynamics was defined to keep the stationary distribution π remaining stationary under time reversal. ☐

Comparing Equations (13) and (14), a KL control treatment replaces the total entropy production

D (μ_{u, p} ∥ μ_{u, p (τ_{e})}^{rev})

in Equation (13) by the new term

D (μ_{u, p} ∥ μ_{k, p (τ_{e})}^{rev})

which compares the control dynamics with the time reversal of the passive dynamics. This suggests an interpretation as follows. If we applied the control during some time interval

[0, τ_{e}]

, and remembered what control we applied, then the entropy production is correctly given by

D (μ_{u, p} ∥ μ_{u, p (τ_{e})}^{rev})

. However, the information that a control was applied also needs to be stored somewhere. If we forget that a control was applied, and if application of the control is very rare, then our default model for the dynamics should be much closer to the passive dynamics. In this case, entropy production may be closer to the value

D (μ_{u, p} ∥ μ_{k, p (τ_{e})}^{rev})

.

4.3. Large Deviations Interpretation

Our cost function

D (μ ∥ ν)

also admits a large deviation interpretation which was, remarkably, already noted by Schrödinger in 1931 [34,35,36,37]. Motivated by quantum mechanics, Schrödinger asked: conditioned on a more or less astonishing observation of a system at two extremes of a time interval, what is the least astonishing way in which the dynamics in the interval could have proceeded? Specializing to our problem of erasing, suppose an ensemble of two-state Markov chain with passive dynamics given by Equation (1) was observed at time 0 and at time

τ_{e}

. Suppose the empirical state distribution over the ensemble was found to be the equilibrium distribution π at time 0, and (1, 0) at time

τ_{e}

, respectively. This would be astonishing because no control has been applied, yet the ensemble has arrived at a state of higher free energy. Conditioned on this rare event having taken place, what is the least unlikely measure

μ^{*}

on path space via which the process took place?

By a statistical treatment of multiple single particle trajectories, Schrödinger found that the likelihood of an empirical measure μ on path space falls exponentially fast with the relative entropy

D (μ ∥ ν)

, where ν is the measure induced by the passive dynamics. In particular, the least unlikely measure

μ^{*}

is that measure which—among all μ whose marginals at time 0 and time

τ_{e}

respect the observations—minimizes

D (μ ∥ ν)

. Thus, for the problem of erasing, where

k_{01} = k_{10}

, the measure μ varies over all measures that have marginal

(1 / 2, 1 / 2)

at time 0 and marginal (1, 0) at time

τ_{e}

, and

μ^{*}

is that measure among all such μ that minimizes

D (μ ∥ μ_{k, (1 / 2, 1 / 2)})

. Thus, our optimal control produces in expectation the least surprising trajectory among all controls that perform rapid erasing.

4.4. Gibbs Measure

Equation (6) is not accidental for this example, but is in fact a general feature when the cost function is relative entropy [28]. More abstractly, the Radon–Nikodym derivative (i.e., “probability density”)

\frac{d μ^{*}}{d ν}

of the measure

μ^{*}

induced on path space by the optimal control

u^{*}

is a Gibbs measure with respect to the measure ν induced by the passive dynamics, with the cost-to-go function

v (t)

playing the role of an energy function. In other words, mathematically our problem is precisely the free energy minimization problem so familiar from statistical mechanics. There is also a possible physical interpretation: we are choosing paths in

P

as microstates, instead of points in phase space. The idea of paths as microstates has occurred before [38].

5. Conclusions

Since charging a battery can also be thought of as erasing a bit [4], our result may also hold insights into the limits of efficiencies of rapidly charging batteries that must simultaneously hold their energy for a long time.

So long as the noise is Markovian, we conjecture that the KL cost for erasing the two-state Markov chain is a lower bound for more general cases—for example, for bits with Langevin dynamics [39]—which is a stochastic differential equation expressing Newton’s laws of motion with Brownian noise perturbations.

Acknowledgments

I thank Sanjoy Mitter, Vivek Borkar, Nick S. Jones, Mukul Agarwal, and Krishnamurthy Dvijotham for helpful discussions. I thank Abhishek Behera for drawing Figure 1.

Conflicts of Interest

The author declares no conflict of interest.

References

Szilard, L. Über die Entropieverminderung in einem thermodynamischen System bei Eingriffen intelligenter Wesen. Z. Phys. 1929, 53, 840–856. (In German) [Google Scholar] [CrossRef]
Landauer, R. Irreversibility and heat generation in the computing process. IBM J. Res. Dev. 1961, 5, 183–191. [Google Scholar] [CrossRef]
Esposito, M.; van den Broeck, C. Second law and Landauer principle far from equilibrium. Europhys. Lett. 2011, 95, 40004. [Google Scholar] [CrossRef]
Gopalkrishnan, M. The Hot Bit I: The Szilard–Landauer correspondence. 2013; arXiv:1311.3533. [Google Scholar]
Reeb, D.; Wolf, M.M. An improved Landauer principle with finite-size corrections. New J. Phys. 2014, 16, 103011. [Google Scholar] [CrossRef]
Laughlin, S.B.; de Ruyter van Steveninck, R.R.; Anderson, J.C. The metabolic cost of neural information. Nat. Neurosci. 1998, 1, 36–41. [Google Scholar] [CrossRef] [PubMed]
Mudge, T. Power: A first-class architectural design constraint. Computer 2001, 34, 52–58. [Google Scholar] [CrossRef]
Von Neumann, J. Theory of Self-Reproducing Automata; University of Illinois Press: Urbana, IL, USA, 1966; p. 66. [Google Scholar]
Bennett, C.H. The thermodynamics of computation—A review. Int. J. Theor. Phys. 1982, 21, 905–940. [Google Scholar] [CrossRef]
Aurell, E.; Gawȩdzki, K.; Mejía-Monasterio, C.; Mohayaee, R.; Muratore-Ginanneschi, P. Refined second law of thermodynamics for fast random processes. J. Stat. Phys. 2012, 147, 487–505. [Google Scholar] [CrossRef] [Green Version]
Diana, G.; Bagci, G.B.; Esposito, M. Finite-time erasing of information stored in fermionic bits. Phys. Rev. E 2013, 87, 012111. [Google Scholar] [CrossRef] [PubMed]
Zulkowski, P.R.; DeWeese, M.R. Optimal finite-time erasure of a classical bit. Phys. Rev. E 2014, 89, 052140. [Google Scholar] [CrossRef] [PubMed]
Salamon, P.; Nitzan, A. Finite time optimizations of a Newton’s law Carnot cycle. J. Chem. Phys. 1981, 74, 441482. [Google Scholar] [CrossRef]
Swanson, J.A. Physical versus logical coupling in memory systems. IBM J. Res. Dev. 1960, 4, 305–310. [Google Scholar] [CrossRef]
Alicki, R. Information is not physical. 2014; arXiv:1402.2414. [Google Scholar]
Todorov, E. Efficient computation of optimal actions. Proc. Natl. Acad. Sci. USA 2009, 106, 11478–11483. [Google Scholar] [CrossRef] [PubMed]
Fleming, W.H.; Mitter, S.K. Optimal control and nonlinear filtering for nondegenerate diffusion processes. Stochastics 1982, 8, 63–77. [Google Scholar] [CrossRef]
Kappen, H.J. Path integrals and symmetry breaking for optimal control theory. J. Stat. Mech. 2005, 2005, P11011. [Google Scholar] [CrossRef]
Kappen, H.J. Linear theory for control of nonlinear stochastic systems. Phys. Rev. Lett. 2005, 95, 200201. [Google Scholar] [CrossRef] [PubMed]
Theodorou, E.A. Iterative Path Integral Stochastic Optimal Control: Theory and Applications to Motor Control. Ph.D. Thesis, University of Southern California, Los Angeles, CA, USA, 2011. [Google Scholar]
Theodorou, E.; Todorov, E. Relative entropy and free energy dualities: Connections to path integral and KL control. In Proceedings of the 51st IEEE Conference on Decision and Control (CDC), Maui, HI, USA, 10–13 December 2012; pp. 1466–1473.
Stulp, F.; Theodorou, E.A.; Schaal, S. Reinforcement learning with sequences of motion primitives for robust manipulation. IEEE Trans. Robot. 2012, 28, 1360–1370. [Google Scholar] [CrossRef]
Dvijotham, K.; Todorov, E. A unified theory of linearly solvable optimal control. In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI 2011), Barcelona, Spain, 14–17 July 2011.
Kappen, H.J.; Gómez, V.; Opper, M. Optimal control as a graphical model inference problem. Mach. Learn. 2012, 87, 159–182. [Google Scholar] [CrossRef]
Van den Broek, B.; Wiegerinck, W.; Kappen, B. Graphical model inference in optimal control of stochastic multi-agent systems. J. Artif. Intell. Res. 2008, 32, 95–122. [Google Scholar]
Wiegerinck, W.; van den Broek, B.; Kappen, H. Stochastic optimal control in continuous space-time multi-agent systems. 2012; arXiv:1206.6866. [Google Scholar]
Horowitz, M.B. Efficient Methods for Stochastic Optimal Control. Ph.D. Thesis, California Institute of Technology, Pasadena, CA, USA, 2014. [Google Scholar]
Dupuis, P.; Ellis, R.S. A Weak Convergence Approach to the Theory of Large Deviations; Wiley: New York, NY, USA, 2011. [Google Scholar]
Jaynes, E.T. Information theory and statistical mechanics. Phys. Rev. 1957, 106, 620–630. [Google Scholar] [CrossRef]
Propp, M.B. The Thermodynamic Properties of Markov Processes. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 1985. [Google Scholar]
Sekimoto, K. Kinetic characterization of heat bath and the energetics of thermal ratchet models. J. Phys Soc. Jpn. 1997, 66, 1234–1237. [Google Scholar] [CrossRef]
Seifert, U. Stochastic thermodynamics, fluctuation theorems and molecular machines. Rep. Prog. Phys. 2012, 75, 126001. [Google Scholar] [CrossRef] [PubMed]
Browne, C.; Garner, A.J.P.; Dahlsten, O.C.O.; Vedral, V. Guaranteed energy-efficient bit reset in finite time. Phys. Rev. Lett. 2014, 113, 100603. [Google Scholar] [CrossRef] [PubMed]
Schrödinger, E. Uber die umkehrung der naturgesetze, sitzung ber preuss. Akad. Wiss. Berlin Phys. Math. 1931, 2, 144–153. (In German) [Google Scholar]
Beurling, A. An automorphism of product measures. Ann. Math. 1960, 72, 189–200. [Google Scholar] [CrossRef]
Föllmer, H. Random fields and diffusion processes. In École d’Été de Probabilités de Saint-Flour XV–XVII, 1985–87; Springer: Berlin/Heidelberg, Germany, 1988; pp. 101–203. [Google Scholar]
Aebi, R. Schrödinger Diffusion Processes; Springer: Berlin/Heidelberg, Germany, 1996. [Google Scholar]
Wissner-Gross, A.D.; Freer, C.E. Causal entropic forces. Phys. Rev. Lett. 2013, 110, 168702. [Google Scholar] [CrossRef] [PubMed]
Zwanzig, R. Nonequilibrium Statistical Mechanics; Oxford University Press: New York, NY, USA, 2001. [Google Scholar]

Figure 1. The discrete-time path space

P_{h}

. A specific path is labeled in red.

Figure 1. The discrete-time path space

P_{h}

. A specific path is labeled in red.

© 2016 by the author; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Gopalkrishnan, M. A Cost/Speed/Reliability Tradeoff to Erasing. Entropy 2016, 18, 165. https://doi.org/10.3390/e18050165

AMA Style

Gopalkrishnan M. A Cost/Speed/Reliability Tradeoff to Erasing. Entropy. 2016; 18(5):165. https://doi.org/10.3390/e18050165

Chicago/Turabian Style

Gopalkrishnan, Manoj. 2016. "A Cost/Speed/Reliability Tradeoff to Erasing" Entropy 18, no. 5: 165. https://doi.org/10.3390/e18050165

APA Style

Gopalkrishnan, M. (2016). A Cost/Speed/Reliability Tradeoff to Erasing. Entropy, 18(5), 165. https://doi.org/10.3390/e18050165

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

A Cost/Speed/Reliability Tradeoff to Erasing^†

Abstract

1. Introduction

2. The Erasing Problem

Kullback–Leibler Cost

3. Solution to the Erasing Problem

4. Interpreting the KL Cost

4.1. Path Space Szilard–Landauer Correspondence

4.2. Thermodynamic Interpretation

4.2.1. Thermodynamics on a Two-State Markov Chain

4.2.2. Thermodynamic Cost for Rapid Erasing of a Reliable Bit

4.2.3. Link between KL-Cost and Thermodynamic Work

4.3. Large Deviations Interpretation

4.4. Gibbs Measure

5. Conclusions

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

A Cost/Speed/Reliability Tradeoff to Erasing †

Abstract

1. Introduction

2. The Erasing Problem

Kullback–Leibler Cost

3. Solution to the Erasing Problem

4. Interpreting the KL Cost

4.1. Path Space Szilard–Landauer Correspondence

4.2. Thermodynamic Interpretation

4.2.1. Thermodynamics on a Two-State Markov Chain

4.2.2. Thermodynamic Cost for Rapid Erasing of a Reliable Bit

4.2.3. Link between KL-Cost and Thermodynamic Work

4.3. Large Deviations Interpretation

4.4. Gibbs Measure

5. Conclusions

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

A Cost/Speed/Reliability Tradeoff to Erasing^†