1.1. General Review
Today we are faced with the task of making the limits of validity of the Second Law of Thermodynamics more precise.
Marian Smoluchowski (1914) [1]
The Second Law of Thermodynamics is one of the most remarkable physical laws for its profound meaning, its deep implications in every field of physics and its several applications in other fields of science, such as every field of engineering, chemistry, biology, genetics, medicine and, more generally, the natural sciences. And yet, behind Smoluchowski's motto above there are grounds for examining the limits of this law's validity [1].
In simple terms, Clausius' First Statement of the Second Law of Thermodynamics states that the transfer of heat from a higher-temperature reservoir to a lower-temperature reservoir is spontaneous and continues until thermal equilibrium is established (if possible). Instead, the transfer of heat from a lower-temperature reservoir to a higher-temperature reservoir is not spontaneous and occurs only if work is performed [2]. More profoundly, the understanding of heat, energy and their interplay requires the knowledge of both the First Law of Thermodynamics and the Second Law of Thermodynamics [3].
Thanks to the introduction of the concept of entropy ($S$) by Clausius, who studied the relation between heat transfer and work as a form of heat energy, a second version of the Second Law of Thermodynamics can be stated, which can be identified with Clausius' Second Statement for closed and isolated thermodynamic systems subjected to reversible processes in terms of entropy change: entropy does not decrease, namely, $\Delta S \geq 0$. The entropy variation $\Delta S$ is equal to the ratio of the heat $Q$ reversibly exchanged with the environment and the thermodynamic temperature $T$ at which this exchange occurs, viz., $\Delta S = Q/T$. If the heat is absorbed from the environment by the thermodynamic system, $Q > 0$, leading to $\Delta S > 0$, while if it is released by the system into the environment, $Q < 0$ and $\Delta S < 0$. Here, $\Delta$ indicates the finite variation experienced either by entropy or by heat when the thermodynamic system passes from its initial to its final state, both supposed in thermodynamic equilibrium. The relationship between entropy and heat energy in terms of their variations can be regarded as the quantitative formulation of the Second Law of Thermodynamics based on a classical framework [4,5,6,7,8,9,10]. This relationship was also given by Clausius in differential form; for reversible processes, it is $dS = \delta Q / T$, where $dS$ is the infinitesimal entropy change due to the infinitesimal heat $\delta Q$ reversibly exchanged. It should be emphasized that while $\delta Q$ is not an exact differential, $dS$ is a complete (exact) differential due to the integrating factor $1/T$, so $S$ is a function of state depending only on the initial and final thermodynamic states.
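To make the role of the integrating factor concrete, consider a short worked example (ours, for illustration) for one mole of an ideal gas, where $\delta Q = C_V\,dT + (RT/V)\,dV$:

$$\frac{\partial C_V}{\partial V} = 0 \;\neq\; \frac{\partial}{\partial T}\!\left(\frac{RT}{V}\right) = \frac{R}{V} \quad\Longrightarrow\quad \delta Q \ \text{is not exact},$$

$$dS = \frac{\delta Q}{T} = \frac{C_V}{T}\,dT + \frac{R}{V}\,dV, \qquad \frac{\partial}{\partial V}\!\left(\frac{C_V}{T}\right) = 0 = \frac{\partial}{\partial T}\!\left(\frac{R}{V}\right) \quad\Longrightarrow\quad dS \ \text{is exact}.$$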
The so-called equivalent formulation of the Second Law of Thermodynamics is represented by the Kelvin Statement, according to which it is not possible to convert all the heat absorbed from an external reservoir at higher temperature into work: part of it will be transferred to an external reservoir at lower temperature. Specifically, the proof of the equivalence between the Kelvin Statement and Clausius' First Statement is outlined in [11]. The foundations of the Kelvin Statement were laid in [12,13], where the consequences of the Carnot proposition, according to which there is a waste of mechanical energy when heat is allowed to pass from one body to another at lower temperature, are discussed. This statement has been historically referred to as the Kelvin–Planck Statement [11,14]. However, another pioneer of this principle was Ostwald, who completed the Kelvin Statement by formulating the Perpetuum Mobile of the Second Kind. For this reason, in this paper, we combine the two pioneering contributions and refer to the Kelvin–Ostwald Statement. Planck's role in pioneering this principle may not have been major; it was mainly focused on its dissemination within the scientific community. Note that recently, Moran et al. [15] have discussed the Second Law of Thermodynamics by using Clausius' First Statement and the Kelvin Statement, but they did not take into account Clausius' Second Statement and the Carnot Type Statement.
In strict relation with the above-mentioned general statements, it can be stated that, according to the Second Law of Thermodynamics, it is impossible to realize a Perpetuum Mobile of the Second Kind able to use the internal energy of only one heat reservoir (see also Section 1.2 for more details).
As a general note, the entity of entropy variations in the Second Law of Thermodynamics applied to isolated thermodynamic systems allows for the distinction between reversible processes, i.e., ideal processes for which $\Delta S = 0$ ($dS = 0$), and irreversible processes, i.e., real processes for which $\Delta S > 0$ ($dS > 0$). To study reversible and irreversible thermodynamic processes, it is not sufficient to consider the thermodynamic system itself: it is also necessary to consider its local and nonlocal surroundings, which together with the system define what is called the thermodynamic universe [11,14]. This picture describes the constructive role of entropy growth and makes the case that energy matters but entropy growth matters more.
At a later time, the Second Law of Thermodynamics was reformulated mathematically and geometrically, again in a classical thermodynamic framework, through the so-called Carathéodory's Statement [16], which exists in different formulations appearing in textbooks and articles (see, e.g., [11,14]), even though they are very similar to one another. Following Koczan [17], Carathéodory's Statement reads: "In the surroundings of each thermodynamic state, there are states that cannot be achieved by adiabatic processes". This statement has often been given without a rigorous proof of equivalence with other formulations, but it has also been proved, even though not so rigorously, that it is a direct consequence of Clausius' and Kelvin Statements [18,19,20]. However, a recent critique of this statement [21] shows that it had already received strong criticism from Planck, and the issue of its necessity and validity is still a matter of some discussion.
Afterwards, the Second Law of Thermodynamics acquired more general significance when a statistical physics definition of entropy was introduced by Boltzmann and Planck [22,23,24,25,26] and then generalized by Boltzmann himself [27] and by Gibbs [28]—an aspect not discussed by the authors in previous works [14,17]. The main advancement was the celebrated Boltzmann probabilistic formula proposed within his kinetic theory of gases: $S = k_B \ln \mathcal{W}$, with $k_B$ being the Boltzmann constant and $\mathcal{W}$ the number of microstates which characterize a gas's macrostate. This formula was put forward and interpreted later by Planck [29] and is also known as the Boltzmann–Planck relationship: the entropy of a thermodynamic system is proportional to the logarithm of the number of ways its atoms or molecules can be arranged in different thermodynamic microstates. If the occupation probabilities of the microstates differ, it can be shown that the Boltzmann formula takes the Gibbs form, or Boltzmann–Gibbs form, valid also for states far from thermodynamic equilibrium: $S = -k_B \sum_i p_i \ln p_i$, with $p_i$ being the probability that microstate $i$ has energy $E_i$ during the system energy fluctuation. It is straightforward to show that the Boltzmann–Gibbs entropy infinitesimal change in a canonical ensemble is equivalent to Clausius' entropy change, and thanks to this equivalence, the Second Law of Thermodynamics is also formulated within a statistical physics approach.
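As a numerical illustration of this equivalence (our sketch, not part of the original argument; the energy levels are hypothetical), one can compare the temperature derivative of the Boltzmann–Gibbs entropy with Clausius' $dS = \delta Q_{\mathrm{rev}}/T$, using the fact that for fixed energy levels the reversibly absorbed heat equals the change in mean energy:

```python
import numpy as np

k_B = 1.380649e-23                          # J/K
E = np.array([0.0, 1.0, 2.5, 4.0]) * 1e-21  # hypothetical energy levels (J)

def canonical(T):
    """Boltzmann-Gibbs entropy and mean energy at temperature T."""
    w = np.exp(-E / (k_B * T))
    p = w / w.sum()                          # canonical probabilities p_i
    S = -k_B * np.sum(p * np.log(p))         # S = -k_B sum_i p_i ln p_i
    U = np.sum(p * E)                        # mean energy
    return S, U

T, dT = 300.0, 1e-3
(S1, U1), (S2, U2) = canonical(T - dT), canonical(T + dT)
print((S2 - S1) / (2 * dT))      # dS/dT by finite differences ...
print((U2 - U1) / (2 * dT) / T)  # ... agrees with (1/T) dU/dT = (1/T) dQ/dT
```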
It is important to point out that very recently, Koczan has shown that Clausius' and Kelvin Statements are not exhaustive formulations of the Second Law of Thermodynamics if compared with the Carnot Type Statement [17]. Specifically, it has been proved that the Kelvin Statement is a weaker statement than (or, more strictly, non-equivalent to) Clausius' First Statement, and Clausius' First Statement is a weaker statement than the Carnot Type Statement, which can be considered equivalent to Clausius' Second Statement. By indicating the heat absorbed from a heat source at temperature $T_1$ with $Q$ ($Q > 0$) and the work resulting from the conversion of heat with $W$ in the device operating between the heat source and the heat receiver at temperature $T_2 < T_1$, the Carnot Type Statement states that $\eta \leq (T_1 - T_2)/T_1$, with $\eta$ being the efficiency $\eta = W/Q$; the efficiency of the conversion process of heat $Q$ into work $W$ in the device operating in the range between $T_1$ and $T_2$ cannot be greater than the ratio of the difference between the two temperatures and the temperature of the heat source. Koczan's arguments [17] about Clausius' and Kelvin Statements were presented by Gadomski [30], who reiterated that the Kelvin Statement and Clausius' First Statement are not exhaustive formulations of the Second Law of Thermodynamics and that instead, the Carnot Type Statement is more far-reaching for such a formulation. In this respect, very recently, the concept of entropy and the Second Law of Thermodynamics were reinterpreted and reformulated by Neukart [31] to assess the complexity of computational problems, giving a thermodynamic perspective on computational complexity.
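For illustration only (a sketch of ours with made-up numbers, not taken from [17]), the Carnot Type Statement reduces to a one-line numerical test:

```python
# Carnot Type Statement as a test: a device absorbing heat Q from a source
# at T1 and operating down to a receiver at T2 may deliver work W only if
# the efficiency eta = W / Q does not exceed (T1 - T2) / T1.
def satisfies_carnot_type(Q, W, T1, T2):
    """Return True if the heat-to-work conversion respects the Carnot bound."""
    eta = W / Q
    return eta <= (T1 - T2) / T1

print(satisfies_carnot_type(Q=1000.0, W=200.0, T1=600.0, T2=300.0))  # True  (0.2 <= 0.5)
print(satisfies_carnot_type(Q=1000.0, W=600.0, T1=600.0, T2=300.0))  # False (0.6 >  0.5)
```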
This article ignores Carathéodory's Statement, showing that there is no real equivalence between this statement and Clausius' and Kelvin Statements of the Second Law of Thermodynamics, contrary to what is asserted in some textbooks (see, e.g., [11]). Instead, more attention is devoted to the problem of deriving the Second Law of Thermodynamics from statistical physics. The problem of reversibility and irreversibility related to the usual formulation of the Second Law of Thermodynamics in terms of increasing entropy is also revised. It turns out that this formulation cannot always be considered a fundamental and elementary law of statistical physics and, in some cases, is even not completely true, as recently proved via the fluctuation theorem [32,33]. At the same time, it is also shown that the Second Law of Thermodynamics is a phenomenological law of nature working extremely well for describing the behavior of several thermodynamic systems. In this respect, one of the aims of this work is to explain the above-mentioned discordance based on precise and novel formulations of the Second Law of Thermodynamics.
More specifically, it is shown that three propositions can be formulated by taking into account that the thermodynamic formulations are non-exhaustive and the statistical formulations are incomplete and partly paradoxical: (i) the Kelvin–Ostwald Statement is strengthened by replacing the Perpetuum Mobile of the Second Kind with the Perpetuum Mobile of the Third Kind by means of the Carnot Type Statement, according to which even this latter engine does not exist (Proposition 1); (ii) the Inequality of Heat and Temperature Proportions is stronger than Clausius' First Statement (Proposition 2); and (iii) according to the Probabilistic Scheme of the Second Law of Thermodynamics, the expected value of the change in the elementary entropy in a state of non-maximum probability beyond thermodynamic equilibrium is positive (Proposition 3). Moreover, the Refrigeration Perpetuum Mobile, which is a practically unknown (or somewhat forgotten) limitation for heat pumps and refrigerators, is introduced.
This paper is organized as follows: Section 1 (introductory review chapter) presents the basic definitions and statements of phenomenological thermodynamics and statistical mechanics, including the various definitions of entropy, Boltzmann's H theorem and the fluctuation theorems. Section 2 is devoted to some general clarifications of the Second Law of Thermodynamics in its purely phenomenological and classical form, while Section 3 deals with some important clarifications of this law in statistical physics. These three main sections are divided into four detailed subsections. Conclusions are drawn in Section 4.
1.2. Basic Definitions of Phenomenological Thermodynamics
A very important concept that led to the formulation of the laws of thermodynamics was the Perpetuum Mobile. However, even in the classical sense, there are several types of it. It is, therefore, worth clarifying this terminology of types now, so that it can be developed and used effectively in later sections of the article.
The Perpetuum Mobile is most often understood to mean a hypothetical device which would operate contrary to the accepted laws of physics. Usually, by Perpetuum Mobile, we mean a heat engine or a heat machine; therefore, it has the following basic definitions.
Definition 1 (Perpetuum Mobile of Kind Zero and Perpetuum Mobile of the First Kind). The "Perpetuum Mobile" of Kind Zero is a hypothetical device which moves forever without power and appears to be free from resistance to motion. This device neither performs work nor emits heat. On the other hand, a "Perpetuum Mobile" of the First Kind is a hypothetical device which performs work without an external energy source (its efficiency would be infinite or greater than 100%).
Sometimes, the Perpetuum Mobile of Kind Zero is referred to as being of the Third Kind. However, the number 0 better reflects the original Latin meaning of the term (from Latin perpetuum mobile = perpetual motion), and the number III will be used further on. Anyway, in traditional thermodynamics, the greatest emphasis, not necessarily rightly, was placed on number II.
Definition 2 (Perpetuum Mobile of the Second Kind). The "Perpetuum Mobile" of the Second Kind is a hypothetical losslessly operating heat engine with an efficiency of exactly 100%.
It is quite obvious that the Second and Third Kinds do not exist, but it is not clear that Kind Zero cannot exist. For example, the solar system, which has existed in an unchanged form for millions of years, seems to be of Kind Zero in kinematic terms. Of course, we assume here that the energy of the Sun’s thermonuclear processes does not directly affect the motion of the planets.
Since the First Law of Thermodynamics was recognized historically later than the second one, we will introduce a special form of the First Law for the needs of the Second Law of Thermodynamics.
Definition 3 (First Law of Thermodynamics for Two Heat Reservoirs). We consider thermal processes taking place in contact with two reservoirs: a radiator with a temperature $T_1$ and a cooler with a temperature $T_2$. We denote the heat released by the radiator by $Q_1$ and the heat absorbed by the cooler by $Q_2$. We assume that we can only transfer work to the environment or absorb work from it but we cannot remove or absorb heat from the environment. Therefore, the principle of conservation of energy for the process takes the form

$Q_1 = Q_2 + W, \quad (1)$

which can be written, following the example of the First Law of Thermodynamics, as an expression for the change in the internal energy E of the system:

$\Delta E = Q_1 - Q_2 - W = 0. \quad (2)$

In the set of processes, all possible signs and zeros are allowed for the quantities $Q_1$, $Q_2$ and W that meet condition (1) or (2). In addition, thermodynamic processes can be combined (added) but not necessarily subtracted. The specific form of the First Law of Thermodynamics for a system of two reservoirs results from the assumption of internal heat exchange only. The definition of the First Law of Thermodynamics formulated in this way creates a space of energetically allowed processes for the Second Law of Thermodynamics. However, the Second Law of Thermodynamics introduces certain restrictions on these processes (see [17]).
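A minimal sketch (ours; the class and method names are hypothetical) of this space of energetically allowed processes, with processes as triples $(Q_1, Q_2, W)$ that can be combined by addition:

```python
from dataclasses import dataclass

@dataclass
class Process:
    Q1: float  # heat released by the radiator (any sign allowed)
    Q2: float  # heat absorbed by the cooler (any sign allowed)
    W: float   # work transferred to the environment (any sign allowed)

    def first_law_ok(self, tol: float = 1e-12) -> bool:
        """Condition (1): Q1 = Q2 + W."""
        return abs(self.Q1 - self.Q2 - self.W) < tol

    def __add__(self, other: "Process") -> "Process":
        # processes can be combined (added), as stated in Definition 3
        return Process(self.Q1 + other.Q1, self.Q2 + other.Q2, self.W + other.W)

engine = Process(Q1=1000.0, Q2=800.0, W=200.0)
print(engine.first_law_ok())              # True: energetically allowed
print((engine + engine).first_law_ok())   # addition preserves condition (1)
```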
Definition 4 (Clausius' First Statement). Heat naturally flows from a body at higher temperature to a body at lower temperature. Therefore, a direct (not forced by work) process of heat transfer from the body at lower temperature to the body at higher temperature is not possible. Clausius' First Statement allows for a large set of possible physical processes which do not violate the First Law of Thermodynamics.
Definition 5 (Kelvin–Ostwald Statement). The processes of converting heat into work and work into heat do not run symmetrically. A full conversion of work into heat (internal energy) is possible. However, a full conversion of heat into work is not possible in a cyclical process. In other words, there is no Perpetuum Mobile of the Second Kind. The Kelvin–Ostwald Principle allows for the existence of a large set of possible physical processes which do not create a Perpetuum Mobile of the Second Kind and do not violate the First Law of Thermodynamics.
To provide a more comprehensive statement of the Second Law of Thermodynamics, thermodynamic entropy must be defined. It turns out that this can be done in thermodynamics in three subtly different ways. We will see further that in statistical mechanics, contrary to appearances, there are even more possibilities.
Definition 6 (C-type Entropy for Reservoirs). There is a thermodynamic function of the state of a system of two heat reservoirs. Its change is equal to the sum of the ratios of the heat absorbed by these reservoirs to the absolute temperatures of these reservoirs:

$\Delta S_C = -\frac{Q_1}{T_1} + \frac{Q_2}{T_2}.$

The minus sign results from the convention that a reservoir with higher temperature releases heat (for $Q_1 > 0$). However, if $Q_1 < 0$, then by the same convention, the reservoir with temperature $T_1$ will absorb heat. In this version, the total entropy change in the heat reservoirs is considered. We assume that the devices operating between these reservoirs operate cyclically, so their entropy remains unchanged. The definition of C-type entropy assumes that the heat capacity of the reservoirs is so large that the heat flow does not change their temperatures. Following Clausius, we can of course generalize the definition of entropy to processes with changing temperature.
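As a quick numerical illustration (ours, with arbitrary values), the C-type entropy change for a direct transfer of heat between the two reservoirs is positive exactly when the heat flows from hot to cold, in agreement with Clausius' First Statement:

```python
# C-type entropy change of Definition 6: dS_C = -Q1/T1 + Q2/T2,
# here applied to a direct heat transfer (W = 0, so Q2 = Q1).
def delta_S_C(Q1, Q2, T1, T2):
    return -Q1 / T1 + Q2 / T2

# 100 J flowing from a 400 K reservoir to a 300 K reservoir:
print(delta_S_C(100.0, 100.0, 400.0, 300.0))    # +0.0833 J/K > 0, allowed
# the reversed (spontaneous cold-to-hot) transfer:
print(delta_S_C(-100.0, -100.0, 400.0, 300.0))  # -0.0833 J/K < 0, forbidden
```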
Definition 7 (Q- and q-type Clausius Entropies). The entropy change in a thermodynamic system or part of it is the integral of the heat gain divided by the temperature of that system or part of it, i.e.,

$\Delta S_Q = \int \frac{\delta Q}{T},$

whereby if we use the temperature $T_s$ of the surroundings instead of the system temperature, we define the original (somewhat forgotten) Clausius entropy:

$\Delta S_q = \int \frac{\delta Q}{T_s}.$

A simple analysis based on Clausius' First Statement leads to entropy inequalities of the Q- and q-types:

$\Delta S_Q \geq \Delta S_q,$

where the equality can only apply to isothermal or trivial adiabatic processes (0 = 0). Applying this inequality to cyclic processes ($\Delta S_Q = 0$) leads to Clausius' inequality ($\oint \delta Q / T_s \leq 0$), which became the basis of the entropic formulation of the Second Law of Thermodynamics. Of course, the change in the sign of this inequality resulted from further redefinitions of the entropy of the q-type to the entropy of the C-type and similar twists.
The Q-type entropy formula can apply not only to reversible processes but also to some irreversible ones. The simplest example is the equalization of the temperatures of two bodies. It turns out that in some other irreversible processes, entropy increases despite the lack of heat flow. Therefore, we can further generalize the entropy formula, e.g., for the expansion of a gas into a vacuum [34].
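For instance, anticipating the V-type formula of Definition 8 below, a standard worked case (ours) is the free expansion of $n$ moles of an ideal gas from $V_1$ to $V_2$: no heat flows and the internal energy $E$ is unchanged, yet

$$\Delta S = \int \frac{dE + p\,dV}{T} = \int_{V_1}^{V_2} \frac{nRT}{V}\,\frac{dV}{T} = nR \ln\frac{V_2}{V_1} > 0.$$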
Definition 8 (V-type Entropy). The entropy change in a gas is equal to the integral of the sum of the increase in internal energy and the product of the pressure and the increase in volume, divided by the temperature of the gas:

$\Delta S_V = \int \frac{dE + p\,dV}{T}.$

Entropies of the V- and Q-types seem equal, but it will be shown later that this need not be the case—which also applies to entropy of the C-type. However, for any entropy (by default of the C- or Q-type), the following is considered the most popular statement of the Second Law of Thermodynamics (see [17]).
Definition 9 (Clausius' Second Statement). There is a function of the state of a thermodynamic system called entropy, which reaches a maximum when the system is in a state of thermodynamic equilibrium. In other words, in thermally insulated systems, the only possible cyclical processes are those in which the total entropy of the system (two heat reservoirs) does not decrease:

$\Delta S_C \geq 0.$

For reversible processes, the increase in total entropy is zero, and for irreversible processes, it is greater than zero. The same statement can also be applied to the Q-type entropy used for reservoirs ($\Delta S_Q \geq 0$). In the language of the q-type entropy for a gas ($\Delta S_q$), the statement is Clausius' inequality:

$\oint \frac{\delta Q}{T_s} \leq 0.$

Assuming a given definition of entropy and thanks to the principle of its additivity, Clausius' Second Statement is a direct condition for every process (possible and impossible). Therefore, this statement immediately states which process is possible and which is impossible and should be rejected according to this statement.
However, Clausius’ First Statement and the Kelvin–Ostwald Statement allow us to reject only some of the impossible processes. We reject the remaining part of the impossible processes on the basis of adding processes leading to the impossible ones. Possible processes are those which, as a result of arbitrary addition, do not lead to impossible processes. It has been shown that Clausius’ Second Statement is stronger and not equivalent to Clausius’ First Statement and the Kelvin–Ostwald Statement [
17].
1.3. A Review of Statistical Definitions of Entropy and Its Application to an Ideal Gas
It is also worth considering statistical definitions of entropy, a topic not discussed in [14,17]. Contrary to appearances, this issue is not any less ambiguous than in phenomenological thermodynamics. On the contrary, there are many different aspects and cases of entropy in statistical physics. It is probably not even possible to talk about a strict universal definition, but only about definitions for characteristic types of statistical systems. Most often, definitions of entropy in statistical physics refer implicitly to an equilibrium situation—that is, a state with the maximum possible entropy. From the point of view of the analysis of the Second Law of Thermodynamics, this situation seems circular—we postulate entropy maximization, but we define maximum entropy. However, dividing the system into subsystems makes sense of such a procedure. This shows a sample of the problem of defining entropy and the problem of precisely formulating the Second Law of Thermodynamics. Boltzmann's general definition is considered to be the first historical statistical definition of entropy.
Definition 10 (General Boltzmann Entropy for Number of Microstates). The entropy of the state of a macroscopic system is a logarithmic measure of the number $\mathcal{W}$ of microstates that can realize the given macrostate, assuming a small but finite resolution of distinguishing microscopic states (in terms of location in the volume and values of speed, energy or temperature):

$S = k_B \ln \mathcal{W},$

where the proportionality coefficient is the Boltzmann constant, $k_B \approx 1.38 \times 10^{-23}$ J/K. The counting of states in classical mechanics is arbitrary, in the sense that it depends on the sizes of the elementary resolution cells. However, due to the property of the logarithm function for large numbers, the resolution parameters should only affect the entropy value in an additive way. In the framework of quantum mechanics, counting states is more literal, but for example, for an ideal gas without rotational degrees of freedom, it should lead to essentially the same entropy value. However, entropy defined in this way cannot be completely unambiguous as to the additive constant. Even in quantum mechanics, not all microstates are completely quantized, so there may be an element of arbitrariness in the choice of resolution parameters. Even if the quantum mechanical method for an ideal gas were unambiguous, it should be realized that at low temperatures, such a method loses its physical sense and cannot be consistent with the experiment. Therefore, to eliminate the freedom of the constant additive entropy, the Third Law of Thermodynamics is introduced, which postulates the zeroing of entropy as the temperature approaches zero. However, such a principle has only a conventional and definitional character and cannot be treated as a fundamental solution—and should not even be treated as a law of physics. Moreover, it does not apply to an ideal gas, because no constant will eliminate the logarithm singularity at zero temperature.
Sometimes, in the definition of Boltzmann entropy, other symbols are used instead of $\mathcal{W}$: $\Omega$ or $\Gamma$. However, the use of the Greek letter omega may be misleading, because it suggests a reference to all microstates of the system, not just those that realize a given macrostate. Therefore, in this article, it was decided to use the letter $W$, just like on Boltzmann's tombstone—but in a decorative "mathcal" version $\mathcal{W}$ to distinguish it from the work symbol W.
Until we specify what microstates we consider to realize a given macrostate, (i) we do not even know whether we can count microstates of the non-equilibrium type in the entropy formula. Similarly, the following is not clear: (ii) can the Boltzmann definition apply to non-equilibrium macrostates? Regarding problem (ii), it seems that the Boltzmann definition can be used for non-equilibrium states, even though the equilibrium entropy formulas are the ones most often given. The latter results from the simple fact that the equilibrium state is described by fewer parameters. For example, when specifying the parameters of a gas state, we mean (consciously or not) the equilibrium state. Therefore, consistently regarding (i), if we have a non-equilibrium macrostate, then its entropy must be calculated over the non-equilibrium microstates. If the macrostate is in equilibrium, it is logical to calculate the entropy over the microstates relating to equilibrium. Unfortunately, some definitions of entropy force the counting of non-equilibrium states. This is the case, for example, in the microcanonical ensemble, in which one should consider a state in which one particle has taken over the energy of the entire system. Fortunately, this state has negligible importance (probability), so problem (i) is not critical. This is because probability, the Second Law of Thermodynamics and entropy distinguish equilibrium states. However, this distinction is a result of the nature of things and should not be put in place by hand at the level of definition.
It is worth making the definition of Boltzmann entropy a bit more specific. Let us consider a large set of $N$ identical particles (but not necessarily quantum-indistinguishable). Let us assume that we recognize a given macrostate as a specific filling of $k$ cells into which the phase space has been divided (the space of positions and momenta of one particle—not to be confused with the full configuration space of all particles). Each cell contains a certain number of particles, e.g., in the $i$th cell there are $n_i$ particles. Of course, the numbers of particles must sum to $N$:

$\sum_{i=1}^{k} n_i = N.$

Now, the possible number of all configurations is equal to the number of permutations with repetitions, because we do not distinguish permutations within one cell:

$\mathcal{W} = \frac{N!}{n_1!\, n_2! \cdots n_k!}. \quad (9)$

In mathematical considerations, one should be prepared for a formally infinite number of cells $k$ and, consequently, fractional or even smaller-than-unity values of $n_i$. However, even in such cases, it is possible to calculate a finite number of states $\mathcal{W}$. Taking the above into account, let us define Boltzmann entropy in relation to the Maxwell–Boltzmann distribution.
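A toy numerical check (ours) of the count (9) against the Stirling-type estimate $\ln\mathcal{W} \approx N\ln N - \sum_i n_i \ln n_i$ commonly used in such derivations:

```python
from math import lgamma, log

def ln_W_exact(n):
    """ln of W = N! / (n_1! n_2! ... n_k!), via log-gamma to avoid overflow."""
    N = sum(n)
    return lgamma(N + 1) - sum(lgamma(ni + 1) for ni in n)

def ln_W_stirling(n):
    """Crude Stirling estimate: N ln N - sum_i n_i ln n_i."""
    N = sum(n)
    return N * log(N) - sum(ni * log(ni) for ni in n if ni > 0)

cells = [250, 250, 250, 250]  # N = 1000 particles in k = 4 cells
print(ln_W_exact(cells), ln_W_stirling(cells))  # ~1375.9 vs ~1386.3: close on the N ln N scale
```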
Definition 11 (Boltzmann Entropy at a Specific Temperature). The entropy of an equilibrium macroscopic system with volume V and temperature T, composed of N "point" particles, is a logarithmic measure of the number of microstates that can realize this homogeneous macrostate by assuming small-volume cells υ in the space of positions and μ in the space of momenta:

$S_{\mathcal{W}T} = k_B \ln \mathcal{W},$

where the proportionality coefficient is the Boltzmann constant, $k_B \approx 1.38 \times 10^{-23}$ J/K. The counting of states within classical mechanics is arbitrary here, in the sense that it depends on the volume of the unit cells. However, thanks to the property of the logarithm function for large numbers, these parameters only affect the additive entropy constant (and also allow the arguments of the logarithmic function to be written in dimensionless form). An elementary formula for this type of Boltzmann entropy can be derived by using formula (9) for the distribution in space cells and the Maxwell–Boltzmann distribution for momentum space (the temperature-dependent part) [35,36]:

$S_{\mathcal{W}T} = N k_B \left( \ln \frac{V}{\upsilon} + \frac{3}{2} \ln \frac{T}{\tau} + \frac{3}{2} \right), \quad (11)$

where the cell size μ of the momentum-space volume has been replaced for simplicity by the equivalent temperature "pixel":

$\tau = \frac{\mu^{2/3}}{m k_B},$

where m is the mass of one particle.
In formula (11), what is disturbing is the presence of the extensive variable V in the logarithm, instead of the intensive combination of variables $V/N$. In [35] (p. 376), and in [36] (pp. 72, 73), in the entropy derivation, division by N does not appear explicitly or implicitly via an additive constant either. The question is whether it is an error of this type of derivation (in two sources) or an error of the definition, which should, for example, include some dividing factor of the Gibbs type. This issue is taken up further in this work. However, it is worth seeing that in the volume part, formula (11) for $V = 2\ \mathrm{m}^3$ of air gives an entropy value more than twice as large as for $V = 1\ \mathrm{m}^3$, and it should not be like that.
The second doubt concerns the coefficient 3/2 in the expression $\frac{3}{2} N k_B$ in the derivations in [35] (p. 378) and in [36] (p. 72), which also appears in the entropy derived by Sackur in 1913 (see [37]). In further alternative calculations, the ratio is 5/2—e.g., in Tetrode's calculations from 1912 (see [37]). It is worth adding that we are considering here a gas of material "points" (so-called monatomic) with the number of degrees of freedom for a determinate particle of 3, not 5. It is, therefore, difficult to indicate the reason for the discrepancy and to determine whether it is important, due to the frequent omission of the considered term (or terms conditioned by "random" constants). There is quite a popular erroneous opinion that the considered term can only be derived on the basis of statistical mechanics and cannot be obtained within the framework of phenomenological thermodynamics. In fact, if, at a constant volume, we start to increase the number of particles at the same temperature, then, according to the definition of the V-type, we will obtain the entropy term under consideration with the coefficient 3/2. However, if the same were calculated formally at constant pressure, the coefficient would be 5/2.
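This phenomenological derivation can be spelled out in one line (our check, using $E = \frac{3}{2}Nk_BT$ and $pV = Nk_BT$): adding $dN$ particles at fixed temperature gives

$$dS_V = \frac{dE + p\,dV}{T} = \begin{cases} \dfrac{3}{2}\,k_B\,dN, & V = \mathrm{const}\ (dV = 0),\\[6pt] \dfrac{5}{2}\,k_B\,dN, & p = \mathrm{const}\ \left(dV = \dfrac{k_B T}{p}\,dN\right), \end{cases}$$

so integration yields the term with the coefficient 3/2 or 5/2, respectively.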
A steady-temperature state subject to the Maxwell–Boltzmann distribution is effectively a canonical ensemble—this will be analyzed further in these terms. One can also consider a microcanonical ensemble in which the energy E of the system is fixed. This leads to a slightly different way of understanding Boltzmann entropy.
Definition 12 (Boltzmann Entropy at a Specific Energy). The entropy of a macroscopic system with volume V and energy E, composed of N particles, is a logarithmic measure of the number of microstates that can realize this homogeneous macrostate by assuming small cells of volume υ and a low resolution ε of energy levels:

$S_{\mathcal{W}E} = k_B \ln \mathcal{W},$

where the proportionality coefficient is the Boltzmann constant, $k_B \approx 1.38 \times 10^{-23}$ J/K. The counting of states within classical mechanics is arbitrary in the sense that it depends on the choice of the size of the elementary volume cell and the choice of the width of the energy intervals. However, due to the property of the logarithm function for large numbers, the parameters υ and ε only affect the entropy value in an additive way. The calculation of this type of Boltzmann entropy for the part of the spatial volume is the same as before. Unfortunately, calculating the energy part is much more difficult. First, one must consider the partition of the energy E into individual molecules. Secondly, the isotropic velocity distribution should be considered here without assuming the Maxwell–Boltzmann distribution. Therefore, the counting of states should take place over a larger set than for equilibrium states, but for states that are isotropic in terms of velocity distribution.
Counting states by using the partition function is rarely performed because it is cumbersome. An example is represented by the calculations for the photon sphere of a black hole in the context of Hawking radiation [34]. These calculations led to a result consistent with the equation of state of the photon gas, but compliance with Hawking radiation was obtained only for low energies.
While the Boltzmann entropy of the $\mathcal{W}T$-type was defined directly for equilibrium states, and the entropy of the $\mathcal{W}E$-type counted isotropic but not necessarily equilibrium states, one can also consider an entropy counting even non-isotropic states. This takes place in the extended phase space, i.e., in the configuration space, where the microstate is described by one point in the $6N$-dimensional space. In other words, there is another approach to measuring the complexity of a macrostate, different from the previous ones. Instead of dividing the space of positions and of momenta into cells and counting all possibilities, we can take, as a measure, the volume of configurational phase space that a given macrostate could realize. Unfortunately, an energy resolution or cell size is also somewhat useful here.
Definition 13 (Boltzmann Entropy for a Phase Volume with the Gibbs Divider). The entropy of a macroscopic system with a volume V, composed of N particles, is a logarithmic measure of the phase volume $\Delta\Gamma$ of all microstates of the system with energy in the range $[E, E + \varepsilon]$, which can realize a given macrostate with energy in this range:

$S_{\mathcal{W}\Gamma} = k_B \ln \frac{\Delta\Gamma}{N!\,\omega^N},$

where in addition to the unit volume ω of the phase-space configuration cell, there is the so-called Gibbs divider N!. The role of the Gibbs divider is to reduce a significant value of the phase volume, and it is sometimes interpreted on the basis of the indistinguishability of identical particles. The dependence on the parameters ε and ω is additive. The definition could be limited to the surface area of the hypersphere in the configurational phase space instead of the volume of the hypersphere shell. This would remove the parameter ε but would formally require changing the volume of the unit cell to a unit area one dimension smaller.
In Boltzmann phase entropy, counting takes place over all microstates from the considered energy range. Therefore, both non-isotropic states in velocity and non-uniform states in position—in short, non-equilibrium states—are taken into account. Nevertheless, we will refer the entropy to the equilibrium state, since no non-equilibrium parameters are given for the state. The entropy of non-equilibrium macrostates can be found by dividing them into subsystems that can be treated as equilibrium and then adding their entropies.
It is sometimes suggested, e.g., see [36] (p. 97), that the volume of a unit cell in phase space follows from the Heisenberg uncertainty principle, the strict form of which is $\Delta x\, \Delta p \geq \hbar/2$. Then, it should be $\omega = (\hbar/2)^3$, although it is most often defined more simply as $\omega = h^3$. The units are more important here than the values themselves, but the problem of the correct divisor is rather decidable in quantum computations. This can be seen most easily from the de Broglie relation $\lambda = h/p$, which defines the volume of two-dimensional phase space (for momentum and position in the range of the matter wavelength λ): $\lambda\, p = h$. A more detailed calculation for a single particle involves quantizing the momentum on a cubic box. In the quasiclassical limit, the summation over discrete states turns into a continuous integral over phase space, see [38] (pp. 181, 415):

$\sum_{\mathbf{n}} \longrightarrow \frac{V}{h^3} \int d^3p.$

So, there is indeed a divisor $h^3$ here, which for N particles will be $h^{3N}$. The limit is quasiclassical because if the size of the unit cell tended to zero ($h \to 0$), the entropy would tend to infinity. Therefore, quantum mechanics is believed to specify a finite value of entropy even at the level of the additive constant. It is worth knowing, however, that this is a theoretical assumption that is not necessary for thermodynamic or even statistical considerations. In other words, most of such considerations do not even depend on the arbitrary choices of the additive entropy constant and the size of the elementary cells.
The area of the constant-energy phase hypersurface of dimension $3N - 1$ can be calculated from the exact formula for the area of an odd-dimensional hypersphere of radius r immersed in an even-dimensional space ($2n = 3N$):

$A_{2n-1}(r) = \frac{2\pi^n}{(n-1)!}\, r^{2n-1}.$

Taking into account that the radius of the sphere in the configurational momentum space can be related to the energy for material points as $r = \sqrt{2mE}$, we obtain

$\Delta\Gamma = V^N A_{3N-1}\!\left(\sqrt{2mE}\right) \sqrt{\frac{m}{2E}}\;\varepsilon.$
For further simplifications, Stirling's asymptotic formula is standardly used:

$n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^n.$

Its application to further approximations leads to

$\ln n! \approx n \ln n - n.$
On this basis, the Boltzmann phase entropy of an ideal gas takes the form

$S_{\mathcal{W}\Gamma} = N k_B \left[ \ln\!\left( \frac{V}{N} \left( \frac{4\pi m E}{3 N h^2} \right)^{3/2} \right) + \frac{5}{2} \right],$

where the auxiliary dimensional constant h was chosen so that the volume of the phase cell of one particle is $\omega = h^3$. It is customary to simply omit the dimensional constants and the additive constant (but not in the so-called Sackur–Tetrode entropy formula). In any case, here, the simplest part of the additive term has a coefficient of 5/2, and not 3/2 as before. Furthermore, the volume under the logarithm sign is divided by the number of particles N. Generally, the presented result and the derivation method are consistent with the Tetrode method (see [37]).
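A quick numerical sanity check of this result (ours, not from [37]): evaluating the Sackur–Tetrode form for helium treated as an ideal monatomic gas at $T = 300$ K and $p = 101325$ Pa should reproduce the tabulated molar entropy of about 126 J/(mol K), i.e., $S/(Nk_B) \approx 15.2$:

```python
import math

k_B = 6.62607015e-34 and 1.380649e-23  # see constants below
k_B = 1.380649e-23                     # Boltzmann constant, J/K
h = 6.62607015e-34                     # Planck constant, J s
m = 6.6464731e-27                      # mass of one helium atom, kg
T, p = 300.0, 101325.0

v_per_particle = k_B * T / p                     # V/N from the ideal gas law
lam = h / math.sqrt(2 * math.pi * m * k_B * T)   # thermal de Broglie wavelength
S_over_NkB = math.log(v_per_particle / lam**3) + 2.5
print(S_over_NkB)  # ~15.18, matching the tabulated molar entropy / R
```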
There is no temperature yet in the entropy formula obtained from the microcanonical decomposition. However, the general formalism of thermodynamics allows us to introduce temperature in a formal way:

$\frac{1}{T} = \left( \frac{\partial S}{\partial E} \right)_{V,N} = \frac{3 N k_B}{2 E} \quad\Longrightarrow\quad E = \frac{3}{2} N k_B T.$

The formula uses the Boltzmann phase $S_{\mathcal{W}\Gamma}$ entropy, although it also applies to the $S_{\mathcal{W}E}$ entropy; the latter, however, has not been calculated (at least here). In any case, by using the above relation between temperature and energy, the $S_{\mathcal{W}\Gamma}$ entropy can be given in a form almost equivalent to the $S_{\mathcal{W}T}$ entropy of formula (11). The difference concerns the discussed factor dividing the volume and the less important term with the factors 5/2 vs. 3/2.
There is yet a slightly different approach to the statistical definition of entropy than Boltzmann’s original approach. Most often, it refers to the canonical distribution and is attributed to Gibbs. Gibbs was probably the first to use the new formula for statistical entropy, but Boltzmann did not shy away from this formula either (see below). In addition, Shannon used this formula in information theory, as well as in ordinary mathematical statistics without a physical context.
Definition 14 (Gibbs or Gibbs–Shannon Entropy). The Gibbs entropy of a macroscopic system whose microscopic energy realizations have probabilities $p_i$ is given by

$S_G = -k_B \sum_i p_i \ln p_i,$

where the proportionality coefficient is the Boltzmann constant taken with a minus sign. The counting of states may be arbitrary, in the sense that they can be divided into the micro-scale realizations of a macrostate and, consequently, assigned a probability distribution. This entropy is sometimes also called Gibbs–Shannon entropy. The Gibbs entropy usually applies to a canonical system in which the energy of the system is not strictly defined, but the temperature is specified. Therefore, we can only talk about the dependence of entropy on the average energy value. In a sense, entropy (like energy) in the canonical distribution also has a secondary, resulting role—as a derivative of another quantity or as an average value. A more direct role is played by the Helmholtz free energy, which in thermodynamics is defined as follows:

$F = E - TS.$
Entropy can then be calculated from this energy alone. This is an entropy equivalent to the Gibbs entropy, but due to a different way of obtaining it, we will treat it as a new definition.
Definition 15 (Entropy of the F-type). The entropy of a (canonical) system, expressed in terms of the Helmholtz free energy F, is, with opposite sign, equal to the partial derivative of this free energy with respect to temperature at fixed volume V and number of particles N:

$S_F = -\left( \frac{\partial F}{\partial T} \right)_{V,N}.$

In statistical terms, the free energy F is proportional, with opposite sign, to the absolute temperature and to the logarithm of the statistical sum:

$F = -k_B T \ln Z.$

The statistical sum is the normalization coefficient of the unnormalized exponential probabilities of energy in the thermal (canonical) distribution and is called the partition function:

$Z = \sum_i e^{-E_i/(k_B T)}.$

Although this definition of entropy partially resembles the Boltzmann definition (e.g., instead of the number of states $\mathcal{W}$, there is a statistical sum Z), it is more complex and specific (it includes temperature, which complicates the partial derivative). At least superficially, this entropy seems to be slightly different from the general idea of Boltzmann entropy (or even from Gibbs entropy), as

$S_F = k_B \ln Z + \frac{\langle E \rangle}{T},$

but immediately, one sees that the additional last term contains only a part proportional to the number of particles N and is often omitted in many types of entropy (this does not apply to the Sackur–Tetrode entropy). However, in this work, it was decided to check the main part of this term, omitting the scale constants (Planck's constant, the sizes of unit cells, the mass of particles and the logarithms of constants such as $\pi$ and $e$).
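A small consistency check (our sketch, for a hypothetical two-level system in units with $k_B = 1$) that the F-type entropy $S_F = -\partial F/\partial T$ agrees with the Gibbs entropy:

```python
import numpy as np

E = np.array([0.0, 1.0])  # two hypothetical energy levels, k_B = 1

def S_gibbs(T):
    """Gibbs entropy -sum_i p_i ln p_i of the canonical distribution."""
    p = np.exp(-E / T)
    p /= p.sum()
    return -np.sum(p * np.log(p))

def S_F(T, dT=1e-6):
    """F-type entropy -dF/dT with F = -T ln Z, by finite differences."""
    def F(Tx):
        return -Tx * np.log(np.sum(np.exp(-E / Tx)))
    return -(F(T + dT) - F(T - dT)) / (2 * dT)

print(S_gibbs(0.7), S_F(0.7))  # identical up to the finite-difference error
```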
For an ideal gas, the statistical sum Z can be calculated similarly to the $\mathcal{W}T$ entropy. For the calculation of the Boltzmann $\mathcal{W}T$ entropy, we referred to arbitrary cells of the position and momentum space. Due to the discrete nature of the statistical sum, its calculation is usually performed as part of quantization on a cubic or cuboid box. The spatial boundary conditions on the box quantize momentum and energy, so the counting takes place only in terms of momentum quantum numbers—the volume of the box appears only indirectly in these relations. The result of this calculation is [39]

$Z = \left[ \frac{V}{h^3} \left( 2\pi m k_B T \right)^{3/2} \right]^N.$

One sees that this value is very similar to the number of states $\mathcal{W}$ appearing in the calculation of the Boltzmann entropy of the $\mathcal{W}T$-type. Indeed, after making standard approximations, the first principal term ($k_B \ln Z$) of the entropy of an ideal gas of the F-type differs from the previous entropy only by a term proportional to the variable N itself. However, the additional term of this entropy will be $\langle E \rangle / T = \frac{3}{2} N k_B$ and will reconcile these entropies:

$S_F = N k_B \left( \ln \frac{V}{\upsilon'} + \frac{3}{2} \ln \frac{T}{\tau'} + \frac{3}{2} \right),$

where in this case, the auxiliary dimensional constants $\upsilon'$ and $\tau'$ satisfy the relation $\upsilon' \left( 2\pi m k_B \tau' \right)^{3/2} = h^3$.
In addition to the microcanonical and canonical (thermal) systems, there is also a large canonical system (grand canonical system), in which even the number of particles is not constant. Due to the numerous complications in defining entropy so far, the grand canonical system will not be considered here. Even more so, isobaric systems (isothermal or isenthalpic) will be omitted.
Note that the probability appearing in the Gibbs entropy is simply normalized to unity, and not to the number of particles N, which may suggest that the Gibbs entropy differs from the Boltzmann-type entropy by the absence of this multiplicative factor N. Indeed, this is reflected in the following version of the Boltzmann entropy, the Boltzmann entropy of the H-type, in which there is a single-particle probability distribution function in the phase space [40] normalized to N (and not to unity).
Definition 16 (Boltzmann Entropy of H-type). In the state described in the phase space by the single-particle distribution function $f(t, \mathbf{r}, \mathbf{v})$, with t being the time, $\mathbf{r}$ the spatial coordinate and $\mathbf{v}$ the velocity, the H-type entropy of the system (or simply the H function, but not the Hamiltonian and not enthalpy) is defined by the following integral:

$S_H = -k_B \int f \ln \frac{f \varphi}{e}\, d^3r\, d^3v,$

where the distribution function f is normalized to the number of particles N, and in the denominator there is a product of constants. The φ constant is usually the neglected volume of the elementary phase cell for positions and velocities (a dimensional constant), and e is a useful numerical constant here. Note that the divisor e, after taking into account the normalization condition of the distribution function, leads to an additive entropy term of $+N k_B$ without the factor 3/2 or 5/2. However, perhaps, this divisor actually corrects the target entropy value.
Moreover, it is often assumed that the quantity defined above (or slightly modified) is not entropy but, taken with opposite sign, the so-called H function. However, in light of the various definitions cited in this work, there is no point in considering this quantity to be something different from entropy, all the more so because Boltzmann formulated a theorem regarding this function, which was supposed to reflect the Second Law of Thermodynamics. To formulate this theorem, the Boltzmann kinetic equation and the assumption of molecular chaos are also needed.
The Boltzmann kinetic equation for the one-particle distribution function $f(t, \mathbf{r}, \mathbf{v})$ for a state of N particles (and normalized to N) postulates that the total derivative of the distribution function is equal to the partial derivative taking into account two-particle collisions (without which the total derivative of the distribution function would be zero, i.e., the Liouville equation):

$\frac{df}{dt} = \frac{\partial f}{\partial t} + \mathbf{v} \cdot \nabla_{\mathbf{r}} f + \mathbf{a} \cdot \nabla_{\mathbf{v}} f = \left( \frac{\partial f}{\partial t} \right)_{\mathrm{coll}}, \quad (32)$

where $\mathbf{a}$ is the external acceleration field—e.g., the gravitational field—and the subscript coll stands for collisions. The assumption of molecular chaos (Stosszahlansatz), in simple terms, consists in assuming that two-particle collisions are factored by using the one-particle distribution function as follows:

$\left( \frac{\partial f}{\partial t} \right)_{\mathrm{coll}} = \int \left( f' f_1' - f f_1 \right) |\mathbf{v} - \mathbf{v}_1|\, \sigma\, d\Omega\, d^3v_1,$

where $\sigma$ is the differential cross-section related to the velocity angles satisfying the relation $|\mathbf{v}' - \mathbf{v}_1'| = |\mathbf{v} - \mathbf{v}_1|$. The kinematic–geometric idea of the molecular chaos assumption is quite simple (it is a simplification), but the mathematical formula itself is already complex; therefore, we will not go into details here. Readers interested in the details can be referred to the extensive textbook [41], which describes an even richer set of equations, there called the Bogoliubov equations. In another source, this set of equations is called the BBGKY hierarchy (after Bogoliubov, Born, Green, Kirkwood and Yvon, respectively) [42].
It turned out that the Stosszahlansatz assumption together with evolution Equation (32) was enough for Boltzmann, in a sense, to derive the Second Law of Thermodynamics.
Theorem 1 (Boltzmann H). If the single-particle distribution function satisfies the "Stosszahlansatz" assumption regarding Boltzmann evolution Equation (32), then the entropy of the Boltzmann H-type is a non-decreasing function of time:

$\frac{dS_H}{dt} \geq 0.$

The proof of Boltzmann's theorem, in a notation analogous to the one adopted here, can be found in the textbooks [38,40,42].
Although Boltzmann’s theorem is a true mathematical theorem, it is unfortunately not (and cannot be) a derivation of the Second Law of Thermodynamics. The thesis of every theorem follows from an assumption, and the assumption of molecular chaos (Stosszahlansatz) is not entirely true. This assumption, in some strange way, introduces the irreversibility and asymmetries of time evolution into the macroscopic system, when the remaining equations of physics do not contain this element of time asymmetry. This issue has been called the irreversibility problem or even Loschmidt’s irreversibility paradox.
An interesting discussion on molecular chaos was undertaken by Huang, the author of the textbook [38] (pp. 85–91), who called the paradox of irreversibility a purely historical issue. Huang claims that molecular chaos is a local minimum of entropy (a local maximum of the H function), while the global maximum is the Maxwell–Boltzmann distribution (a global minimum of the H function). The evolution of the system does not proceed strictly according to the Boltzmann kinetic equation, but the entropy increases to a maximum value taking into account the fluctuation noise, with local minima in states of molecular chaos and local maxima in other unidentified states. This increase in entropy is called the statistical approach to the Second Law of Thermodynamics. This image is quite suggestive, but a bit inconsistent and somewhat destructive in terms of knowing the evolution of the system. The presence of the destruction effect is clear and perhaps necessary—the strict evolution of the system was simply rejected in order to introduce local time reversal symmetry—without a rational justification that, on average, the evolution proceeds according to the kinetic equation. The element of inconsistency concerns the suggestion (used in the proof of a local entropy minimum in a state of molecular chaos) that time reversal leads further to an increase in entropy (backwards in time). This leads to two ambiguities. First, it is not known whether this is how time reversal should work (perhaps it should). Second, even if the time reversal were correct, what would be the departure from Boltzmann kinetic evolution? Perhaps in the form of "backward time" evolution. But if so, how do we explain the advantage of forward-in-time evolution over backward-in-time evolution? In any case, this does not provide a new rigorous formulation of the Second Law of Thermodynamics, but only a somewhat speculative explanation of the reversibility paradox at the expense of Boltzmann kinetic evolution.
Typically, in the context of the irreversibility problem, it is claimed (somewhat incorrectly) that all the fundamental equations of physics are time-symmetric. However, this does not in any way apply to equations involving resistance to motion, including friction and viscosity. It is difficult to explain why the aspect of resistance to motion is so neglected in the history of physics. The ignorance of resistance to motion is imputed to Aristotle, when in fact it is exactly the opposite: Aristotle described a proportion of motion that included resistance. This proportion of motion can even be interpreted as consistent with Newtonian dynamics according to the correspondence principle [43]. The time asymmetry of Aristotle's equation (proportion) of dynamics was noticed by the American physicist Leonard Susskind (see [43]).
Another example of an equation lacking time symmetry is the Langevin equation, which also relates to viscous friction and has applications in statistical mechanics ([36] p. 236). The anisotropic generalization of the Langevin equation is even used to describe the fission process of atomic nuclei [44]. If we generalize the Langevin equation for a point particle to a continuous medium (with viscosity), we obtain the Navier–Stokes equation (see [45] p. 45). Moreover, the general solutions of the Navier–Stokes equations constitute the oldest and still unsolved Millennium Prize Problem. This equation can be applied even to interstellar matter [46]. Another, simpler example is the thermal conductivity equation, which almost by definition distinguishes the direction of time flow and therefore the direction of heat flow. One can only ask whether the Langevin, Navier–Stokes or heat conduction equations are fundamental equations. In a sense, all of these equations reflect the Second Law of Thermodynamics, the first and second ones implicitly and the third one explicitly in the form of Clausius' First Statement. It is worth noting that the conduction equation is a first-order differential equation with respect to time. So, is it this property that determines the irreversibility of this equation in time? It seems so, and this criterion can also be tested against other equations with this property.
A good example of a general equation in statistical mechanics with a first derivative with respect to time is the Fokker–Planck equation (later also known as the Kolmogorov forward equation, from 1931). This equation has applications even in nuclear physics [47]. The Fokker–Planck equation is a partial differential equation describing the evolution of the probability distribution function taking into account drift, diffusion and drag forces. In its most general version, the equation from Planck's 1917 paper [48], the probability distribution is given in phase space, but most of the equations in that paper concern only position space. However, in the textbook [40] (pp. 91–93), this equation is presented only in momentum space, as the space-averaged (integrated) Boltzmann kinetic equation in the diffusion approximation. At the same time, it was noted there that the form of the Fokker–Planck equation is so universal that it can be obtained in various variables. This is probably the reason why in the textbook [36], the Fokker–Planck equation on page 215 is a position equation, and on pages 236 and 238, it is already a phase equation (with velocity as a variable). Regardless of the subtleties mentioned, there is no doubt that the Fokker–Planck equation is not symmetric in time.
A very similar example is the Smoluchowski diffusion equation from 1916 in [49] (see also [50]). While the Fokker–Planck equation concerns the distribution function in position space or momentum space or both (phase space), the Smoluchowski diffusion equation concerns the distribution function in position space only, in one dimension. In classical mechanics, the momentum and position pictures are not complementary, so it is difficult to compare the equations written in these two different pictures. However, the fact is that equation "(4)" from Smoluchowski's 1916 paper [49] is a version of the Fokker–Planck equation, which took its final form in Planck's 1917 paper [48]. At the same time, for example, in Fokker's work from 1914 [51], there is no such equation explicitly written down, nor even a substantially similar equation (with the second spatial derivative and the first time derivative—in explicit or Fourier form). The positional picture allows for a more accurate description of Brownian motion, which Smoluchowski had already described earlier, in 1906, with a formula now known as the Einstein–Smoluchowski relation [52].
In this respect, it has recently been shown by Yuvan and Bier [53] that unlike diffusion systems, out-of-equilibrium statistical systems consisting of overdamped particles transporting or converting energy are no longer subjected to Gaussian noise but to Lévy noise. The particles exhibit another type of Brownian motion generated by Lévy noise and thus obey an $\alpha$-stable nonhomogeneous distribution, or Lévy distribution. Bringing the systems back to thermodynamic equilibrium, the relaxation process leads to an increase in entropy which, in turn, causes a decrease in free energy.
The issue of the time derivative is of a completely different nature in quantum mechanics and for the Schrödinger equation, which was written only in 1926. This equation contains almost the same derivatives, including only the first derivative with respect to time, as the Fokker–Planck or Smoluchowski equation. Due to the complex nature of this equation and the imaginary unit in the time derivative, it is treated as invariant under time reversal. This statement requires the assumption of the complex conjugate accompanying time reversal.
In the context of Boltzmann and statistical physics, it is impossible not to mention the very fundamental ergodic hypothesis, which serves as a basic assumption and postulate. The ergodic hypothesis assumes that the average value of a physical quantity (random variable) based on a statistical distribution is realized over time, i.e., it tends to the average time value of this physical quantity (random variable). Let us assume that this pursuit simply boils down to the postulate of equality of these two types of averages (without precisely defining the period $\tau$ of this averaging):

$\langle A \rangle = \overline{A} \equiv \frac{1}{\tau} \int_0^{\tau} A(t)\, dt.$

If the distribution is stationary (it does not evolve over time), then of course $\tau$ can and should tend to infinity ($\tau \to \infty$). And this is indeed the standard assumption; however, averaging over the future or the past or the entire timeline should not differ. However, for non-stationary (time-dependent) distributions, averaging over an infinite time seems to be pointless. In such situations, one could, therefore, consider "local" averages over time. Then, the forward time averaging used above could prefer evolution forward in time relative to a symmetric averaging time interval. Unfortunately, ergodicity is limited to stationary situations, even where it supposedly does not always occur.
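As a simple illustration of the stationary case (our sketch, not from the paper), a discrete AR(1) process is ergodic, and its time average of $x^2$ approaches the stationary ensemble second moment $s^2/(1-a^2)$:

```python
import numpy as np

rng = np.random.default_rng(0)
a, s, n_steps = 0.9, 1.0, 500_000

x, acc = 0.0, 0.0
for _ in range(n_steps):
    x = a * x + s * rng.normal()  # one step of the AR(1) evolution
    acc += x * x

time_avg = acc / n_steps           # time average of x^2 along one trajectory
ensemble_avg = s**2 / (1 - a**2)   # stationary (ensemble) second moment
print(time_avg, ensemble_avg)      # both close to 5.26
```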
As one can see, the physical status of the ergodic hypothesis is not entirely clear. While the left side of the considered equalities has a clear statistical sense, it is not entirely clear how to determine the right side, which requires knowledge of the time evolution of the system under consideration. What extension of the Liouville equation should be used to describe this evolution in time, the Boltzmann equation or some other equivalent? In other words, it is not clear whether in practice, the ergodic hypothesis is a working definition of time averages (right-hand side) or a postulate or principle of physics that can be confirmed or disproved theoretically or experimentally. The literature states as an indisputable fact that given known systems are ergodic or not ergodic. Nevertheless, we should be more humble regarding the physical status of the ergodic hypothesis and treat it as an important methodological tool. In any case, it seems that the ergodic hypothesis should be independent of the Boltzmann equation with molecular chaos. Therefore, the problem of irreversibility cannot be solved merely by questioning the ergodic hypothesis. Ergodicity should not be confused with weak ergodicity, which is based on the Boltzmann equation as a simplifying assumption.
The violation of weak ergodicity (molecular chaos) probably has deeper reasons, since even a simple system of rigid spheres caused problems in the proof of this property (no violation) in the context of the Sinai theorem [54]. Anyway, more physical examples of magnetics, glasses (regular and spin) and colloids do not satisfy weak ergodicity or do not satisfy the molecular chaos hypothesis at the basis of the Boltzmann equation.
Regardless of the existence of resistance to motion, criticisms of the molecular chaos assumption, the problems with the status of ergodicity and Boltzmann’s generally enormous contribution, it seems that the essence of the Second Law of Thermodynamics in a statistical approach can be contained in the Newer Fluctuation Theorem and its corollary, the Inequality of the Second Law.