# Calibration Invariance of the MaxEnt Distribution in the Maximum Entropy Principle


Section for the Science of Complex Systems, Center for Medical Statistics, Informatics, and Intelligent Systems (CeMSIIS), Medical University of Vienna, Spitalgasse 23, 1090 Vienna, Austria

Complexity Science Hub Vienna, Josefstädterstrasse 39, 1080 Vienna, Austria

Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University, 11519 Prague, Czech Republic

Received: 11 December 2020 / Revised: 7 January 2021 / Accepted: 9 January 2021 / Published: 11 January 2021

(This article belongs to the Special Issue The Statistical Foundations of Entropy)

The maximum entropy principle consists of two steps: The first step is to find the distribution that maximizes entropy under given constraints. The second step is to calculate the corresponding thermodynamic quantities. The second step is determined by the relation of the Lagrange multipliers to measurable physical quantities, such as temperature or Helmholtz free energy/free entropy. We show that for a given MaxEnt distribution, there exists a whole class of entropies and constraints that leads to the same distribution but generally to different thermodynamics. Two simple classes of transformations that preserve the MaxEnt distributions are studied: The first case is a transform of the entropy to an arbitrary increasing function of that entropy. The second case is the transform of the energetic constraint to a combination of the normalization and energetic constraints. We derive group transformations of the Lagrange multipliers corresponding to these transformations and determine their connections to thermodynamic quantities. For each case, we provide a simple example of the transformation.

The maximum entropy principle (MEP) is one of the most fundamental concepts in equilibrium statistical mechanics. It was originally proposed by Jaynes [1,2] in order to connect the information entropy introduced by Shannon with the thermodynamic entropy introduced by Clausius, Boltzmann, and Gibbs. Although the MEP was originally introduced for Shannon entropy, with the advent of generalized entropies [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17] a natural effort was made to apply the maximum entropy principle beyond the Shannon case. Another question that arose naturally is whether the MEP can be applied to constraints other than the ordinary linear ones. Examples of constraints that might be considered in connection with the MEP are escort constraints [18,19,20], Kolmogorov–Nagumo means [21,22], or more exotic types of constraints [23]. This brought about some discussion of the applicability of the principle to generalized entropies [24,25] and nonlinear constraints, and of its thermodynamic interpretation [26,27,28,29,30]. Indeed, the MEP is not the only extremal principle in statistical physics; let us mention, e.g., the principle of maximum caliber [31], which is useful in non-equilibrium physics. In this paper, however, we stick to the MEP, as it is the most widespread principle, and the theory of generalized thermostatistics has mainly focused on it. For a recent review of other principles, see also [32]. For a discussion of the relation between entropy in information theory and in thermodynamics, see [33]. For the sake of simplicity, we consider the canonical ensemble, i.e., fluctuations in internal energy. For the grand-canonical ensemble, one can obtain results similar to those presented in this paper, with a chemical potential $\mu $.

In order to grasp the debate about the applicability of the MEP, let us emphasize that the MEP consists of two main parts:

- (I) Finding a distribution (MaxEnt distribution) that maximizes entropy under given constraints.
- (II) Plugging the distribution into the entropic functional and calculating physical quantities such as thermodynamic potentials, temperature, or response coefficients (specific heat, compressibility, etc.).

The first part is rather a mathematical procedure of finding a maximum subject to constraints. This is done by the method of Lagrange multipliers, by defining a Lagrange function in the form

$$\text{Lagrange function} = \text{entropy} - (\text{Lagrange multiplier}) \cdot (\text{constraint}).$$

The role of the Lagrange multipliers at this stage is to ensure fulfillment of the constraints, as they are determined from the set of equations obtained from the maximization of the Lagrange function. In statistics, this procedure is known as Softmax, a method used to infer a distribution from given data. Shore and Johnson [34,35] therefore studied the MEP as a statistical inference procedure and established a set of consistency axioms. Shore and Johnson’s work sparked a debate about whether the MEP for generalized entropies can also be understood as a statistical inference method satisfying the consistency requirements [24,36,37,38,39,40,41]. In [42], it was shown that the class of entropies satisfying the original Shore–Johnson axioms is wider than previously thought. Moreover, in [43], the connection between the Shore–Johnson axioms and the Shannon–Khinchin axioms was investigated, and the equivalence of the information-theoretic and statistical-inference axiomatics was established.
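For Shannon entropy and a linear energy constraint, step (I) can be sketched in a few lines. This is a minimal illustration, not code from the paper: the energies and the target mean energy are made-up values, and the multiplier $\beta$ is found by plain bisection.

```python
import math

def boltzmann(energies, beta):
    """MaxEnt (softmax) distribution for Shannon entropy with a linear
    energy constraint: p_i = exp(-beta * E_i) / Z."""
    weights = [math.exp(-beta * E) for E in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

def solve_beta(energies, E_target, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection on beta so that <E> matches the constraint (step I of the MEP).
    <E> is strictly decreasing in beta, so bisection on [lo, hi] works."""
    def mean_energy(beta):
        p = boltzmann(energies, beta)
        return sum(pi * Ei for pi, Ei in zip(p, energies))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_energy(mid) > E_target:
            lo = mid   # need a larger beta to lower <E>
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical three-level spectrum and constraint value
energies = [0.0, 1.0, 2.0]
beta = solve_beta(energies, E_target=0.5)
p = boltzmann(energies, beta)
```

Here `solve_beta` plays the role of solving the constraint equations for the energetic Lagrange multiplier; the normalization multiplier is absorbed into the partition function `Z`.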

In the second part, the physical interpretation of entropy starts to arise. Similar to the case of Lagrangian mechanics, where the Lagrangian is the difference between kinetic and potential energy and the Lagrange multipliers play the role of the normal force to the constraints, here the entropy becomes a thermodynamic state variable. For Shannon entropy and linear constraints, the Lagrange multipliers become inverse temperature and free entropy, respectively.

The main aim of this paper is to discuss the relation between points (I) and (II). In the first part, it is possible to find a class of entropic functionals and constraints leading to the same MaxEnt distribution. However, in the second part, different entropy and/or constraints lead to different thermodynamics and different relations between physical quantities and Lagrange multipliers. The two main messages of this paper are listed below.

- (i) For each MaxEnt distribution, there exists a whole class of entropies and constraints leading to generally different thermodynamics.
- (ii) It is possible to establish transformation relations of the Lagrange parameters (and subsequently of the thermodynamic quantities) for classes of entropies and constraints giving the same MaxEnt distribution.

We call the latter transformation relation calibration invariance of the MaxEnt distribution. A straightforward consequence is that in order to fully determine the statistical properties of a thermal system in equilibrium, it is not enough to measure the statistical distribution of energies.

The rest of the paper is organized as follows. In the next section, we briefly discuss the main aspects of MEP for the case of general entropic functional and general constraints. In the following two sections, we introduce two simple transformations of entropic functional (Section 3) and constraints (Section 4) that lead to the same MaxEnt distribution and derive transformations between the Lagrange multipliers. These transformations form a group. After the general derivation, we provide a few simple examples for each case. The last section is devoted to conclusions.

The maximum entropy principle is a way of obtaining a representative probability distribution from a limited amount of information. Our aim is to find the probability distribution of the system $P={\left\{{p}_{i}\right\}}_{i=1}^{n}$ under a set of given constraints. In the simplest case, the principle can be formulated as follows.

The normalization condition is considered in the regular form, i.e., ${f}_{0}\left(P\right)={\sum}_{i}{p}_{i}-1=\langle 1\rangle -1$. Moreover, we have a class of constraints which originally described the average energy of the system; we therefore call them energy constraints. For simplicity, we consider only one energy constraint, although there can be more, and they do not have to involve only internal energy but also other thermodynamic quantities. In the original formulation, the energy constraint is linear in the probabilities, i.e.,

$${f}_{E}\left(P\right)=\sum _{i}{p}_{i}{E}_{i}-E=\langle E\rangle -E,$$

but it can generally be any nonlinear function of the probabilities—escort means provide an example. A large class of energy constraints can be written in a separable form, ${f}_{E}\left(P\right)=\mathcal{E}\left(P\right)-E$, i.e., in a form expressing the “expected” internal energy (a macroscopic variable) as a function of the probability distribution (a microscopic variable). This class of constraints plays a dominant role in thermodynamic systems.

In order to find a solution of the maximum entropy principle, we use the standard method of Lagrange multipliers, i.e., we maximize the Lagrange function

$$\mathcal{L}(P;\alpha ,\beta )=S\left(P\right)-\alpha {f}_{0}\left(P\right)-\beta {f}_{E}\left(P\right).$$

The maximization procedure leads to the set of equations

$$\begin{array}{ccc}\hfill \frac{\partial \mathcal{L}(P;\alpha ,\beta )}{\partial {p}_{i}}& =& 0\phantom{\rule{1.em}{0ex}}\forall \phantom{\rule{4pt}{0ex}}i\in \{1,\dots ,n\}\hfill \\ \hfill \frac{\partial \mathcal{L}(P;\alpha ,\beta )}{\partial \alpha}& =& {f}_{0}\left(P\right)=0\hfill \\ \hfill \frac{\partial \mathcal{L}(P;\alpha ,\beta )}{\partial \beta}& =& {f}_{E}\left(P\right)=0\hfill \end{array}$$

from which we determine the resulting MaxEnt distribution. In order to obtain a unique solution, we require that the entropic functional be a Schur-concave symmetric function [42].

As a consequence, we obtain the values of the Lagrange multipliers $\alpha $ and $\beta $. From a strictly mathematical point of view, Lagrange multipliers are just auxiliary parameters to be solved for from the set of Equation (3). In physics, however, Lagrange parameters also have a physical interpretation. In Lagrangian mechanics, they play the role of the normal force to the constraints. Similarly, in ordinary statistical mechanics, based on Shannon entropy $H\left(P\right)=-{\sum}_{i}{p}_{i}\mathrm{log}{p}_{i}$ and linear constraints (1), the Lagrange multipliers have the particular physical interpretation

$$\begin{array}{ccc}\hfill \beta & =& \frac{1}{T}\phantom{\rule{1.em}{0ex}}\left(\mathrm{inverse}\phantom{\rule{4pt}{0ex}}\mathrm{temperature}\right),\hfill \end{array}$$

$$\begin{array}{ccc}\hfill \alpha & =& S-\frac{1}{T}E\phantom{\rule{1.em}{0ex}}\left(\mathrm{free}\phantom{\rule{4pt}{0ex}}\mathrm{entropy}\right).\hfill \end{array}$$

Note that the free entropy is, similarly to the Helmholtz free energy, a Legendre transform of the entropy, here with respect to the internal energy. For the case of ordinary thermodynamics (Shannon entropy and linear constraints), it is equal to the logarithm of the partition function.

This interpretation is valid only in this case. When we use a different entropic functional or different constraints, these relations between the Lagrange multipliers and thermodynamic quantities are no longer valid. This is the case even when the resulting MaxEnt distribution is the same.

The main aim of this paper is to show how the invariance of the MaxEnt distribution affects the Lagrange multipliers and their relations to thermodynamic quantities. Let us now solve Equation (3). The first set of equations leads to

$$\frac{\partial S\left(P\right)}{\partial {p}_{i}}-\alpha \frac{\partial {f}_{0}\left(P\right)}{\partial {p}_{i}}-\beta \frac{\partial {f}_{E}\left(P\right)}{\partial {p}_{i}}=0.$$

Let us assume the normalization in the usual form, which leads to $\frac{\partial {f}_{0}\left(P\right)}{\partial {p}_{i}}=1$. Moreover, let us consider a separable energy constraint, so $\frac{\partial {f}_{E}\left(P\right)}{\partial {p}_{i}}=\frac{\partial \mathcal{E}\left(P\right)}{\partial {p}_{i}}$. The resulting probability distribution can be expressed as

$${p}_{i}^{\star}={\frac{\partial S}{\partial {p}_{i}}}^{(-1)}\left[\alpha +\beta \frac{\partial \mathcal{E}\left(P\right)}{\partial {p}_{i}}\right],$$

where ${}^{(-1)}$ denotes the inverse function of $\partial S/\partial {p}_{i}$ (provided it exists and is unique). We can express $\alpha $ by multiplying the equation by ${p}_{i}$ and summing over $i$, which leads to

$$\alpha =\langle {\nabla}_{P}S\left(P\right)\rangle -\beta \langle {\nabla}_{P}\mathcal{E}\left(P\right)\rangle ,$$

where $\langle X\rangle ={\sum}_{i}{x}_{i}{p}_{i}$ and ${\nabla}_{P}=(\frac{\partial}{\partial {p}_{1}},\dots ,\frac{\partial}{\partial {p}_{n}})$. By plugging this back into the previous equation, we get $\beta $ as

$$\beta =\frac{{\Delta}_{i}({\nabla}_{P}S\left(P\right))}{{\Delta}_{i}({\nabla}_{P}\mathcal{E}\left(P\right))},$$

where ${\Delta}_{i}\left(X\right)={x}_{i}-\langle X\rangle $ is the deviation from the average.
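These expressions for $\alpha$ and $\beta$ can be checked numerically in the Shannon case, where $\partial S/\partial p_i=-\ln p_i-1$ and $\partial\mathcal{E}/\partial p_i=E_i$. A small sketch with made-up energies and inverse temperature (not values from the paper):

```python
import math

# Made-up three-level spectrum and inverse temperature for the check
energies = [0.0, 1.0, 2.5]
beta0 = 0.7
w = [math.exp(-beta0 * E) for E in energies]
Z = sum(w)
p = [wi / Z for wi in w]                    # Shannon MaxEnt state at beta0

grad_S = [-math.log(pi) - 1.0 for pi in p]  # dS/dp_i for Shannon entropy
grad_E = energies                           # dE(P)/dp_i for the linear constraint

def mean(x):
    """Average <X> = sum_i x_i p_i with respect to the MaxEnt state."""
    return sum(pi * xi for pi, xi in zip(p, x))

# beta as the ratio of deviations from the mean -- the same for every i
betas = [(gs - mean(grad_S)) / (ge - mean(grad_E))
         for gs, ge in zip(grad_S, grad_E)]
# alpha from the averaged stationarity condition
alpha = mean(grad_S) - beta0 * mean(grad_E)
```

The per-component ratios `betas` all reproduce `beta0`, and the recovered `alpha` reconstructs the distribution via $p_i=\exp(-1-\alpha-\beta E_i)$, which is the inverse-function formula above specialized to Shannon entropy.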

The solution of Equation (3) depends on the internal energy E. However, in thermodynamics it is natural to invert the relation $\beta =\beta \left(E\right)$ and express the relevant quantities in terms of $\beta $, so $E=E\left(\beta \right)$. With that, we can calculate the dependence of the entropy on $\beta $:

$$\frac{\partial S}{\partial \beta}=\sum _{i}\frac{\partial S}{\partial {p}_{i}}\frac{\partial {p}_{i}}{\partial \beta}=\sum _{i}\left(\alpha +\beta \frac{\partial \mathcal{E}\left(P\right)}{\partial {p}_{i}}\right)\frac{\partial {p}_{i}}{\partial \beta}=\beta \sum _{i}\frac{\partial {f}_{E}}{\partial {p}_{i}}\frac{\partial {p}_{i}}{\partial \beta}=\beta \left(-\frac{\partial {f}_{E}}{\partial E}\frac{\partial E}{\partial \beta}\right)$$

For separable energy constraints, $\frac{\partial {f}_{E}}{\partial E}=-1$, so we obtain the well-known relation

$$\frac{\partial S}{\partial \beta}=\beta \frac{\partial E}{\partial \beta}\Rightarrow \beta =\frac{\partial S}{\partial E}.$$
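This relation can be sketched with a finite-difference check for the Boltzmann case; the spectrum and the value of $\beta$ below are arbitrary choices for the illustration:

```python
import math

def thermo(beta, energies):
    """Return (S, E) for the Shannon MaxEnt (Boltzmann) state at inverse
    temperature beta."""
    w = [math.exp(-beta * e) for e in energies]
    Z = sum(w)
    p = [wi / Z for wi in w]
    S = -sum(pi * math.log(pi) for pi in p)
    E = sum(pi * ei for pi, ei in zip(p, energies))
    return S, E

energies = [0.0, 1.0, 3.0]   # made-up spectrum
beta, h = 0.8, 1e-6
S_minus, E_minus = thermo(beta - h, energies)
S_plus, E_plus = thermo(beta + h, energies)
dS = (S_plus - S_minus) / (2 * h)   # central difference dS/dbeta
dE = (E_plus - E_minus) / (2 * h)   # central difference dE/dbeta
# Expected: dS/dbeta = beta * dE/dbeta, i.e., beta = dS/dE
```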

Let us now define the Legendre conjugate of the entropy, called free entropy (also called the Jaynes parameter [44] or Massieu function [45]):

$$\psi =S-\frac{\partial S}{\partial E}E=S-\beta E.$$

Free entropy is connected to the Helmholtz free energy as $\psi =-\beta F$. The difference between $\alpha $ and $\psi $ can be expressed as

$$\psi -\alpha =\left(S-\langle {\nabla}_{P}S\rangle \right)-\beta (E-\langle {\nabla}_{P}\mathcal{E}\rangle ).$$

Therefore, we can understand the difference $\psi -\alpha $ as the Legendre transform of $\psi $ with respect to P. From this, we see that the difference between $\psi $ and $\alpha $ is a constant (not depending on thermodynamic quantities) if two independent conditions are fulfilled, i.e., $E=\langle {\nabla}_{P}\mathcal{E}\left(P\right)\rangle $ and $S=\langle {\nabla}_{P}S\rangle +a$. The former condition leads to linear energy constraints, while the latter leads to the conclusion that the entropy must be of trace form $S\left(P\right)={\sum}_{i}g\left({p}_{i}\right)$. Moreover, the function g has to fulfill the equation

$$g\left(x\right)-ax=x{g}^{\prime}\left(x\right),$$

leading to $g\left(x\right)=-ax\mathrm{log}\left(x\right)+bx$, which is equivalent to Shannon entropy.
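The stated solution can be verified by direct substitution:

```latex
g(x) = -ax\log x + bx
\;\Rightarrow\;
g'(x) = -a\log x - a + b,
\qquad
x\,g'(x) = -ax\log x + (b-a)x = g(x) - ax .
```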

In the next sections, we explore how transformations of the entropy and of the energy constraint that leave the MaxEnt distribution invariant affect the Lagrange multipliers and their relation to thermodynamic quantities.

The simplest transformation of the Lagrange functional that leaves the MaxEnt distribution invariant is to consider an arbitrary increasing function of the entropy, i.e., we replace $S\left(P\right)$ by $c\left(S\left(P\right)\right)$, where ${c}^{\prime}\left(x\right)>0$. Let us note that this transform preserves the uniqueness of the MEP, because it is easy to show that if $S\left(P\right)$ is Schur-concave, then $c\left(S\left(P\right)\right)$ is also Schur-concave [42], which is a sufficient condition for uniqueness of the MaxEnt distribution.

In this case, the Lagrange equations are adjusted as follows,

$${c}^{\prime}\left(S\left(P\right)\right)\frac{\partial S\left(P\right)}{\partial {p}_{i}}-{\alpha}_{c}\frac{\partial {f}_{0}\left(P\right)}{\partial {p}_{i}}-{\beta}_{c}\frac{\partial \mathcal{E}\left(P\right)}{\partial {p}_{i}}=0,$$

leading to

$${\alpha}_{c}={c}^{\prime}\left(S\left(P\right)\right)\langle {\nabla}_{P}S\left(P\right)\rangle -{\beta}_{c}\langle {\nabla}_{P}\mathcal{E}\left(P\right)\rangle $$

and

$${\beta}_{c}={c}^{\prime}\left(S\left(P\right)\right)\frac{{\Delta}_{i}\left({\nabla}_{P}S\left(P\right)\right)}{{\Delta}_{i}\left({\nabla}_{P}\mathcal{E}\left(P\right)\right)}.$$

We thus find that the function c rescales $\alpha $ and $\beta $,

$$\begin{array}{c}\hfill {\alpha}_{c}={c}^{\prime}\left(S\left(P\right)\right)\phantom{\rule{0.166667em}{0ex}}\alpha ,\end{array}$$

$$\begin{array}{c}\hfill {\beta}_{c}={c}^{\prime}\left(S\left(P\right)\right)\phantom{\rule{0.166667em}{0ex}}\beta ,\end{array}$$

while their ratio remains unchanged, i.e., ${\alpha}_{c}/{\beta}_{c}=\alpha /\beta $. In fact, the set of increasing functions forms a group acting on the Lagrange multipliers, because it is easy to show that the Lagrange parameters related to the entropy ${c}_{1}\left({c}_{2}\left(S\left(P\right)\right)\right)$ satisfy

$${\beta}_{{c}_{1}\circ {c}_{2}}={c}_{1}^{\prime}\left({c}_{2}\left(S\left(P\right)\right)\right)\cdot {c}_{2}^{\prime}\left(S\left(P\right)\right)\phantom{\rule{0.166667em}{0ex}}\beta ={c}_{1}^{\prime}\left({c}_{2}\left(S\left(P\right)\right)\right){\beta}_{{c}_{2}},$$

which can be described by the group operation $({c}_{1}\circ {c}_{2})^{\prime}\mapsto {c}_{1}^{\prime}\left({c}_{2}\right)\cdot {c}_{2}^{\prime}$.

An important property of this transformation is that it changes the extensive–intensive duality of the conjugated pair of thermodynamic variables and the respective forces, while it maintains the distribution. Notably, by changing the entropic functional from extensive (i.e., $S\left(n\right)\sim U\left(n\right)$) to non-extensive, it changes $\beta $ from intensive (i.e., size-independent, at least in the thermodynamic limit) to non-intensive, i.e., explicitly size-dependent. This point has been discussed in connection with q-non-extensive statistical physics [29,30], and the relation to the zeroth law of thermodynamics was shown in [46]. As one can see from the example below, although Rényi entropy and Tsallis entropy have the same maximizer, the corresponding thermodynamics is different: while Rényi entropy is additive (and therefore extensive for systems where $U\left(n\right)\sim n$) and the temperature is intensive, Tsallis entropy is non-extensive, and the corresponding temperature explicitly depends on the size of the system.

Let us finally mention that the difference between the free entropy and the Lagrange parameter $\alpha $ transforms as

$${\psi}_{c}-{\alpha}_{c}=\left(c\left(S\right)-{c}^{\prime}\left(S\right)\langle {\nabla}_{P}S\left(P\right)\rangle \right)-{c}^{\prime}\left(S\right)\beta (E-\langle {\nabla}_{P}\mathcal{E}\left(P\right)\rangle )={c}^{\prime}\left(S\right)\left(\psi -\alpha \right)+(c\left(S\right)-{c}^{\prime}\left(S\right)\cdot S).$$

While the free entropy and other thermodynamic potentials are transformed, the heat change remains invariant under this transformation:

$$\text{đ}{Q}_{c}={T}_{c}\,\mathrm{d}\,c\left(S\right)=\frac{T}{{c}^{\prime}\left(S\right)}\,{c}^{\prime}\left(S\right)\,\mathrm{d}S=T\,\mathrm{d}S=\text{đ}Q.$$

We exemplify the calibration invariance with two popular pairs of closely related entropies.

**Rényi entropy and Tsallis entropy**: The two most famous examples of generalized entropies are Rényi entropy ${R}_{q}\left(P\right)=\frac{1}{1-q}\mathrm{ln}\left({\sum}_{i}{p}_{i}^{q}\right)$ and Tsallis entropy ${S}_{q}\left(P\right)=\frac{1}{1-q}\left({\sum}_{i}{p}_{i}^{q}-1\right)$. Their relation can be expressed as

$${R}_{q}\left(P\right)={c}_{q}\left({S}_{q}\left(P\right)\right)=\frac{1}{1-q}\mathrm{ln}\left[(1-q){S}_{q}\left(P\right)+1\right],$$

and therefore we obtain

$${c}_{q}^{\prime}\left({S}_{q}\left(P\right)\right)=\frac{1}{1+(1-q){S}_{q}}=\frac{1}{{\sum}_{i}{p}_{i}^{q}}.$$

The difference between the free entropy and $\alpha $ can be obtained as

$${\psi}_{R}-{\alpha}_{R}=\frac{1}{{\sum}_{i}{p}_{i}^{q}}({\psi}_{S}-{\alpha}_{S})+\left({R}_{q}\left(P\right)-\frac{{S}_{q}\left(P\right)}{{\sum}_{i}{p}_{i}^{q}}\right).$$

One can therefore see that even though Rényi and Tsallis entropy lead to the same MaxEnt distribution, their thermodynamic quantities, such as temperature or free entropy, are different. Whether a system follows Rényi or Tsallis entropy depends on additional facts, such as (non)-extensivity and (non)-intensivity of thermodynamic quantities.

**Shannon entropy and entropy power**: A similar example is provided by Shannon entropy $H\left(P\right)={\sum}_{i}{p}_{i}\mathrm{ln}(1/{p}_{i})$ and the entropy power $\mathcal{P}\left(P\right)={\prod}_{i}{\left(1/{p}_{i}\right)}^{{p}_{i}}$. The relation between them is simply

$$H\left(P\right)=c\left(\mathcal{P}\left(P\right)\right)=\mathrm{log}\left(\mathcal{P}\left(P\right)\right),$$

so we obtain

$${c}^{\prime}\left(\mathcal{P}\left(P\right)\right)=1/\mathcal{P}\left(P\right)=\mathrm{exp}(-H\left(P\right)).$$

For the difference between the free entropy and $\alpha $, we obtain

$$0={\psi}_{H}-{\alpha}_{H}=\frac{1}{\mathcal{P}\left(P\right)}\left({\psi}_{\mathcal{P}}-{\alpha}_{\mathcal{P}}\right)+\left(H\left(P\right)-1\right),$$

from which we get

$${\psi}_{\mathcal{P}}-{\alpha}_{\mathcal{P}}=\mathcal{P}\left(P\right)\left(1-\mathrm{log}\mathcal{P}\left(P\right)\right).$$

Therefore, we see that even though the MaxEnt distribution remains unchanged, the relation between $\alpha $ and the free entropy is different.
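The rescaling by ${c}_{q}^{\prime}\left({S}_{q}\right)=1/{\sum}_{i}{p}_{i}^{q}$ can be seen directly at the level of gradients, since ${\nabla}_{P}{R}_{q}={c}_{q}^{\prime}\left({S}_{q}\right){\nabla}_{P}{S}_{q}$ holds for any distribution, so the Lagrange equations (and hence $\alpha$ and $\beta$) pick up exactly this factor. A quick numerical check, with an arbitrary distribution and $q$ chosen only for the sketch:

```python
# Arbitrary normalized distribution and entropic index, made up for this check
q = 1.5
p = [0.5, 0.3, 0.2]
sum_pq = sum(pi ** q for pi in p)

# Componentwise gradients of Tsallis and Renyi entropies w.r.t. p_i
grad_tsallis = [q * pi ** (q - 1) / (1 - q) for pi in p]
grad_renyi = [q * pi ** (q - 1) / ((1 - q) * sum_pq) for pi in p]

# Predicted rescaling factor c_q'(S_q) = 1 / sum_i p_i^q
c_prime = 1.0 / sum_pq
```

Since every component of the Rényi gradient is `c_prime` times the Tsallis one, the two entropies share the same maximizer while $\beta$ is rescaled by `c_prime`.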

Similarly, one can uncover the invariance of the MaxEnt distribution when the constraints are transformed in a certain way. Generally, if two sets of constraints define the same domain, the resulting maximum entropy principle should lead to equivalent results. We will not be so general, but focus on a specific situation, which might be quite interesting for thermodynamic applications. Let us recall the two conditions we assume: the normalization ${f}_{0}\left(P\right)=0$ and the energy constraint ${f}_{E}\left(P\right)=0$. Let us investigate the latter. Similarly to the previous case, it is possible to take any function g of ${f}_{E}\left(P\right)$ for which $g\left(y\right)=0$ if and only if $y=0$. More generally, we can also take into account the normalization constraint and replace the original energy condition by

$$g({f}_{0}\left(P\right),{f}_{E}\left(P\right))=0$$

for any $g(x,y)$ for which $g(x,y)=0\Rightarrow y=0$. Let us investigate the maximum entropy principle for this case. We can express the Lagrange function as

$$\mathcal{L}\left(P\right)=S\left(P\right)-{\alpha}_{g}{f}_{0}\left(P\right)-{\beta}_{g}\,g({f}_{0}\left(P\right),{f}_{E}\left(P\right)),$$

which leads to the set of equations

$$\frac{\partial S\left(P\right)}{\partial {p}_{i}}-{\alpha}_{g}\frac{\partial {f}_{0}\left(P\right)}{\partial {p}_{i}}-{\beta}_{g}\left[{G}^{(1,0)}\frac{\partial {f}_{0}\left(P\right)}{\partial {p}_{i}}+{G}^{(0,1)}\frac{\partial \mathcal{E}\left(P\right)}{\partial {p}_{i}}\right]=0,$$

where ${G}^{(1,0)}={\frac{\partial g(x,y)}{\partial x}|}_{(0,0)}$ and ${G}^{(0,1)}={\frac{\partial g(x,y)}{\partial y}|}_{(0,0)}$. We again take into account that $\frac{\partial {f}_{0}\left(P\right)}{\partial {p}_{i}}=1$, multiply the equations by ${p}_{i}$, and sum over $i$. This gives us

$${\alpha}_{g}=\langle {\nabla}_{P}S\left(P\right)\rangle -{\beta}_{g}\left[{G}^{(1,0)}+{G}^{(0,1)}\langle {\nabla}_{P}\mathcal{E}\left(P\right)\rangle \right]\phantom{\rule{0.166667em}{0ex}}.$$

By plugging ${\alpha}_{g}$ back, we end up with the relation for ${\beta}_{g}$:

$${\beta}_{g}=\frac{1}{{G}^{(0,1)}}\frac{{\Delta}_{i}\left({\nabla}_{P}S\left(P\right)\right)}{{\Delta}_{i}\left({\nabla}_{P}\mathcal{E}\left(P\right)\right)}\phantom{\rule{0.166667em}{0ex}}.$$

For ${\alpha}_{g}$, we end up with

$${\alpha}_{g}=\langle {\nabla}_{P}S\left(P\right)\rangle -\frac{{\Delta}_{i}\left({\nabla}_{P}S\left(P\right)\right)}{{\Delta}_{i}\left({\nabla}_{P}\mathcal{E}\left(P\right)\right)}\langle {\nabla}_{P}\mathcal{E}\left(P\right)\rangle \left[1+\frac{{G}^{(1,0)}}{{G}^{(0,1)}}\frac{1}{\langle {\nabla}_{P}\mathcal{E}\left(P\right)\rangle}\right]\phantom{\rule{0.166667em}{0ex}}.$$

Thus, we again end up with a rescaling of ${\alpha}_{g}$ and ${\beta}_{g}$, which reads

$$\begin{array}{c}\hfill {\alpha}_{g}(\alpha ,\beta )=\alpha -\frac{{G}^{(1,0)}}{{G}^{(0,1)}}\beta \phantom{\rule{0.166667em}{0ex}},\end{array}$$

$$\begin{array}{c}\hfill {\beta}_{g}\left(\beta \right)=\frac{\beta}{{G}^{(0,1)}}\phantom{\rule{0.166667em}{0ex}}.\end{array}$$

The ratio of the Lagrange multipliers is also transformed, so we get

$$\frac{{\alpha}_{g}}{{\beta}_{g}}={G}^{(0,1)}\frac{\alpha}{\beta}-{G}^{(1,0)}.$$

Again, the set of all functions fulfilling the aforementioned condition forms a group. The group operation can be described by the relation between the coefficients ${G}^{(1,0)}$ and ${G}^{(0,1)}$ for the composite function $g(x,y)={g}_{1}(x,{g}_{2}(x,y))$. We obtain

$$\begin{array}{ccc}\hfill {G}^{(1,0)}& =& {G}_{1}^{(1,0)}+{G}_{1}^{(0,1)}{G}_{2}^{(1,0)},\hfill \end{array}$$

$$\begin{array}{ccc}\hfill {G}^{(0,1)}& =& {G}_{1}^{(0,1)}{G}_{2}^{(0,1)},\hfill \end{array}$$

which leads to the group relations

$$\begin{array}{ccc}\hfill {\alpha}_{g}(\alpha ,\beta )& =& {\alpha}_{{g}_{1}}({\alpha}_{{g}_{2}}(\alpha ,\beta ),{\beta}_{{g}_{2}}\left(\beta \right)),\hfill \end{array}$$

$$\begin{array}{ccc}\hfill {\beta}_{g}\left(\beta \right)& =& {\beta}_{{g}_{1}}({\beta}_{{g}_{2}}\left(\beta \right))=\frac{{\beta}_{{g}_{2}}\left(\beta \right)}{{G}_{1}^{(0,1)}}.\hfill \end{array}$$
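The composition law for the linearization coefficients can be checked by finite differences. The functions $g_1$ and $g_2$ below are arbitrary smooth test functions invented for this sketch (they are not constraint functions from the paper); only their partial derivatives at the origin matter:

```python
import math

# Arbitrary smooth test functions, made up for this check
def g1(x, y):
    return y * math.exp(x) + 2.0 * x        # G1^(1,0) = 2, G1^(0,1) = 1

def g2(x, y):
    return math.sin(y) - 3.0 * x            # G2^(1,0) = -3, G2^(0,1) = 1

def g(x, y):
    return g1(x, g2(x, y))                  # composite function g1(x, g2(x, y))

def coeffs(f, h=1e-6):
    """Return (G10, G01) = (df/dx, df/dy) at (0,0) via central differences."""
    G10 = (f(h, 0.0) - f(-h, 0.0)) / (2 * h)
    G01 = (f(0.0, h) - f(0.0, -h)) / (2 * h)
    return G10, G01

G10_1, G01_1 = coeffs(g1)
G10_2, G01_2 = coeffs(g2)
G10, G01 = coeffs(g)
# Expected: G10 = G10_1 + G01_1 * G10_2 and G01 = G01_1 * G01_2
```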

Here we mention two simple examples of the aforementioned transformation.

**Energy shift**: Under this scheme, we consider a constant shift of the energy spectrum. Let us rewrite the constraint ${f}_{E}\left(P\right)$ in the following form,

$${f}_{E}\left(P\right)=\sum_i {p}_{i}{E}_{i}-E=\sum_i {p}_{i}({E}_{i}-{E}^{\prime})-(E-{E}^{\prime}),$$

where the second equality holds on the normalization shell ${\sum}_{i}{p}_{i}=1$. This allows us to identify the function $g(x,y)$ as

$$g(x,y)=y-{E}^{\prime}x,$$

so that ${G}^{(1,0)}=-{E}^{\prime}$ and ${G}^{(0,1)}=1$.

**Latent escort means**: Apart from linear means, it is possible to use generalized approaches. One example is provided by the so-called escort mean:

$${E}_{q}={\langle E\rangle}_{q}=\frac{{\sum}_{i}{p}_{i}^{q}{E}_{i}}{{\sum}_{i}{p}_{i}^{q}}.$$

For $q=1$, the corresponding (latent) constraint can be written as

$$\frac{\sum_i {p}_{i}{E}_{i}}{\sum_i {p}_{i}}-E=0,$$

which corresponds to

$$g(x,y)=\frac{y+E}{x+1}-E.$$

Therefore, we obtain ${G}^{(1,0)}=-E$ and ${G}^{(0,1)}=1$, which corresponds to the previous example for ${E}^{\prime}=E$. The latent energy mean can thus be understood, in terms of the MaxEnt procedure, as a shift of the energy spectrum by its average energy.
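The energy-shift example can be verified directly for the Shannon case: shifting the spectrum by $E'$ leaves the Boltzmann distribution unchanged, while, per the transformation rules, $\beta_g=\beta$ and $\alpha_g=\alpha+E'\beta$ (here $\alpha=\ln Z-1$, consistent with the Shannon stationarity condition $-\ln p_i-1=\alpha+\beta E_i$). The numbers are made up for the sketch:

```python
import math

def maxent(es, beta):
    """Shannon MaxEnt state p_i = exp(-beta*E_i)/Z and alpha = ln Z - 1."""
    w = [math.exp(-beta * e) for e in es]
    Z = sum(w)
    return [wi / Z for wi in w], math.log(Z) - 1.0

# Made-up spectrum, inverse temperature, and shift E'
energies = [0.0, 1.0, 2.0]
beta = 1.3
shift = 0.75

p, alpha = maxent(energies, beta)
p_shift, alpha_shift = maxent([e - shift for e in energies], beta)
# Expected: p_shift == p (same MaxEnt distribution),
#           alpha_shift == alpha + shift * beta (calibration of alpha)
```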

In this paper, we have discussed the calibration invariance of the MEP, which means that for a given MaxEnt distribution there exists a whole class of entropies and constraints that lead to different thermodynamics (thermodynamic quantities and response coefficients generally behave differently; for example, an intensive temperature can turn into a temperature that explicitly depends on the size of the system). We have stressed that the MEP procedure consists of two parts: the first part, determining the MaxEnt distribution, is rather a mathematical tool, while the second part, making the connection between the Lagrange multipliers and thermodynamic quantities, is specific to the application of the MEP in statistical physics. The paper does not cover all possible transformations leading to the same MaxEnt distribution (let us mention, at least, the additive duality of Tsallis entropy, where maximizing ${S}_{2-q}$ with a linear constraint leads to the same result as maximizing ${S}_{q}$ with escort constraints [47]). The main lesson of this paper is that, in order to fully determine a thermal system in equilibrium, we need to measure not only the probability distribution but also all relevant thermodynamic quantities (such as the entropy). Moreover, the transformation between the Lagrange parameters and its connection to the thermodynamic potentials can be useful in situations where one is not certain about the exact form of the entropy.

This research was funded by the Austrian Science Fund (FWF), project I 3073, Austrian Research Promotion agency (FFG), project 882184 and by the Grant Agency of the Czech Republic (GAČR), grant No. 19-16066S.

I would like to thank Petr Jizba for helpful discussions.

The author declares no conflict of interest.

- Jaynes, E.T. Information Theory and Statistical Mechanics. Phys. Rev. 1957, 106, 620.
- Jaynes, E.T. Information Theory and Statistical Mechanics. II. Phys. Rev. 1957, 108, 171.
- Burg, J.P. The relationship between maximum entropy spectra and maximum likelihood spectra. Geophysics 1972, 37, 375–376.
- Rényi, A. Selected Papers of Alfréd Rényi; Akademia Kiado: Budapest, Hungary, 1976; Volume 2.
- Havrda, J.H.; Charvát, F. Quantification Method of Classification Processes. Concept of Structural α-Entropy. Kybernetika 1967, 3, 30–35.
- Sharma, B.D.; Mitter, J.; Mohan, M. On Measures of “Useful” Information. Inf. Control 1978, 39, 323–336.
- Tsallis, C. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys. 1988, 52, 479–487.
- Frank, T.; Daffertshofer, A. Exact time-dependent solutions of the Renyi Fokker-Planck equation and the Fokker-Planck equations related to the entropies proposed by Sharma and Mittal. Physica A 2000, 285, 351–366.
- Kaniadakis, G. Statistical mechanics in the context of special relativity. Phys. Rev. E 2002, 66, 056125.
- Jizba, P.; Arimitsu, T. The world according to Rényi: Thermodynamics of multifractal systems. Ann. Phys. 2004, 312, 17–59.
- Hanel, R.; Thurner, S. A comprehensive classification of complex statistical systems and an ab-initio derivation of their entropy and distribution functions. Europhys. Lett. 2011, 93, 20006.
- Thurner, S.; Hanel, R.; Klimek, P. Introduction to the Theory of Complex Systems; Oxford University Press: Oxford, UK, 2018.
- Korbel, J.; Hanel, R.; Thurner, S. Classification of complex systems by their sample-space scaling exponents. New J. Phys. 2018, 20, 093007.
- Tempesta, P.; Jensen, H.J. Universality classes and information-theoretic Measures of complexity via Group entropies. Sci. Rep. 2020, 10, 1–11.
- Ilić, V.M.; Stanković, M.S. Generalized Shannon-Khinchin axioms and uniqueness theorem for pseudo-additive entropies. Physica A 2014, 411, 138–145.
- Ilić, V.M.; Scarfone, A.M.; Wada, T. Equivalence between four versions of thermostatistics based on strongly pseudoadditive entropies. Phys. Rev. E 2019, 100, 062135.
- Czachor, M. Unifying Aspects of Generalized Calculus. Entropy 2020, 22, 1180.
- Beck, C.; Schlögl, F. Thermodynamics of Chaotic Systems: An Introduction; Cambridge University Press: Cambridge, UK, 1993.
- Abe, S. Geometry of escort distributions. Phys. Rev. E 2003, 68, 031101.
- Bercher, J.-F. On escort distributions, q-gaussians and Fisher information. AIP Conf. Proc. 2011, 1305, 208.
- Czachor, M.; Naudts, J. Thermostatistics based on Kolmogorov-Nagumo averages: Unifying framework for extensive and nonextensive generalizations. Phys. Lett. A 2002, 298, 369–374.
- Scarfone, A.M.; Matsuzoe, H.; Wada, T. Consistency of the structure of Legendre transform in thermodynamics with the Kolmogorov-Nagumo average. Phys. Lett. A 2016, 380, 3022–3028.
- Bercher, J.-F. Tsallis distribution as a standard maximum entropy solution with ‘tail’ constraint. Phys. Lett. A 2008, 372, 5657–5659.
- Pressé, S.; Ghosh, K.; Lee, J.; Dill, K.A. Nonadditive Entropies Yield Probability Distributions with Biases not Warranted by the Data. Phys. Rev. Lett. 2013, 111, 180604.
- Oikonomou, T.; Bagci, B. Misusing the entropy maximization in the jungle of generalized entropies. Phys. Lett. A 2017, 381, 207–211.
- Tsallis, C.; Mendes, R.S.; Plastino, A.R. The role of constraints within generalized nonextensive statistics. Physica A 1998, 261, 534–554.
- Martínez, S.; Nicolás, F.; Pennini, F.; Plastino, A. Tsallis’ entropy maximization procedure revisited. Physica A 2000, 286, 489–502.
- Plastino, A.; Plastino, A.R. On the universality of thermodynamics’ Legendre transform structure. Phys. Lett. A 1997, 226, 257–263.
- Rama, S.K. Tsallis Statistics: Averages and a Physical Interpretation of the Lagrange Multiplier β. Phys. Lett. A 2000, 276, 103–108.
- Campisi, M.; Bagci, G.B. Tsallis Ensemble as an Exact Orthode. Phys. Lett. A 2007, 362, 11–15.
- Dixit, P.D.; Wagoner, J.; Weistuch, C.; Pressé, S.; Ghosh, K.; Dill, K.A. Perspective: Maximum caliber is a general variational principle for dynamical systems. J. Chem. Phys. 2018, 148, 010901.
- Lucia, U. Stationary Open Systems: A Brief Review on Contemporary Theories on Irreversibility. Physica A 2013, 392, 1051–1062.
- Palazzo, P. Hierarchical Structure of Generalized Thermodynamic and Informational Entropy. Entropy
**2018**, 20, 553. [Google Scholar] [CrossRef] - Shore, J.E.; Johnson, R.W. Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Trans. Inf. Theor.
**1980**, 26, 26–37. [Google Scholar] [CrossRef] - Shore, J.E.; Johnson, R.W. Properties of cross-entropy minimization. IEEE Trans. Inf. Theor.
**1981**, 27, 472–482. [Google Scholar] [CrossRef] - Uffink, J. Can the maximum entropy principle be explained as a consistency requirement? Stud. Hist. Philos. Mod. Phys.
**1995**, 26, 223–261. [Google Scholar] [CrossRef] - Tsallis, C. Conceptual Inadequacy of the Shore and Johnson Axioms for Wide Classes of Complex Systems. Entropy
**2015**, 17, 2853–2861. [Google Scholar] [CrossRef] - Pressé, S.; Ghosh, K.; Lee, J.; Dill, K.A. Reply to C. Tsallis’ “Conceptual Inadequacy of the Shore and Johnson Axioms for Wide Classes of Complex Systems”. Entropy
**2015**, 17, 5043–5046. [Google Scholar] [CrossRef] - Oikonomou, T.; Bagci, G.B. Rényi entropy yields artificial biases not in the data and incorrect updating due to the finite-size data. Phys. Rev. E
**2019**, 99, 032134. [Google Scholar] [CrossRef] - Jizba, P.; Korbel, J. Comment on “Rényi entropy yields artificial biases not in the data and incorrect updating due to the finite-size data”. Phys. Rev. E
**2019**, 100, 026101. [Google Scholar] [CrossRef] - Oikonomou, T.; Bagci, G.B. Reply to “Comment on Rényi entropy yields artificial biases not in the data and incorrect updating due to the finite-size data”. Phys. Rev. E
**2019**, 100, 026102. [Google Scholar] [CrossRef] - Jizba, P.; Korbel, J. Maximum Entropy Principle in Statistical Inference: Case for Non-Shannonian Entropies. Phys. Rev. Lett.
**2019**, 122, 120601. [Google Scholar] [CrossRef] - Jizba, P.; Korbel, J. When Shannon and Khinchin meet Shore and Johnson: Equivalence of information theory and statistical inference axiomatics. Phys. Rev. E
**2020**, 101, 042126. [Google Scholar] [CrossRef] - Plastino, A.; Plastino, A.R. Tsallis Entropy and Jaynes’ Information Theory Formalism. Braz. J. Phys.
**1999**, 29, 50–60. [Google Scholar] [CrossRef] - Naudts, J. Generalized Thermostatistics; Springer: London, UK, 2011. [Google Scholar]
- Biró, T.S.; Ván, P. Zeroth law compatibility of nonadditive thermodynamics. Phys. Rev. E
**2011**, 83, 061147. [Google Scholar] [CrossRef] [PubMed] - Wada, T.; Scarfone, A.M. Connections between Tsallis’ formalisms employing the standard linear average energy and ones employing the normalized q-average energy. Phys. Lett. A
**2005**, 335, 351–362. [Google Scholar] [CrossRef]


© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).