1. Introduction
This paper applies our work in [1] on the Value of Information (VoI), based on the Shannon/Stratonovich approach.
VoI measures the usefulness or worth of acquiring or possessing certain information. It thereby helps decision makers decide efficiently by assessing the potential benefits of obtaining new information. Knowing the VoI of different tasks allows us to prioritize and allocate scarce resources efficiently, and to determine whether it is worth investing time, effort, or money in gathering additional information before making a decision. The VoI thus allows us to weigh the potential benefits against the costs of obtaining or processing information. (Another recent approach, the Payoffs–Beliefs Duality framework of De Lara and Gossner [2], evaluates the value of information by explicitly modelling the interaction between the agent’s beliefs and the set of available actions.)
The VoI concept was founded in the works of Shannon and Stratonovich and is a subject of information theory [3,4]. In [1], we focus on a circular spatial setting. The circle is a convenient and general vehicle to introduce real or perceived space (e.g., product differentiation) into economic modelling. The circular space can be partitioned into distinct segments, thereby inducing a spatial network structure that enables the incorporation of transport costs, such as, in the simplest case, the taxi cost associated with transferring a passenger from a given origin to a designated destination.
The focus in this paper is the simpler linear spatial setting. Our strategy is to employ standard economic utility (or cost) functions such as linear, quadratic, and constant absolute and relative risk aversion (CARA and CRRA) functions [5] and to modify the stochastic environments and their priors to derive the relevant VoI. This supremum utility is found by solving
where
=
denotes sets of random variables with
u, some action, and
, some resulting cost or negative utility. The maximization is performed over conditional or mutual probabilities, a formulation that is also standard in the economic rational inattention literature (see, e.g., [6] for a survey), subject to an upper bound I on the available Shannon [3] mutual information:
The more information the agent is able to acquire about the true probability distribution, the more accurately they can determine the optimal course of action. Information about x may in practice be communicated by some variable y. We then have a triple and a decision rule and again .
In [1], we extend the original discrete Stratonovich example of VoI in a circular setting, compare it to a simpler information measure (referred to as Hartley information in [4]), and examine how changes in the distribution and alternative specifications of transport costs influence the resulting values. In the present paper, we shift the focus to continuous distributions within a linear spatial framework, employing economically motivated utility functions across various stochastic environments to analyze the resulting VoI.
2. The VoI Calculations
In [1], based on [4], we make use of the following result:
Theorem 1. The condition for the optimal joint measure can be derived as where is the cumulant generating function, is the partition function as defined below, and β is the inverse Lagrange multiplier of the information constraint, which is put in terms of Shannon mutual information. The partition function is a transformed transport cost function, where is the inverse Lagrange multiplier for the information constraint. In [4], Stratonovich employs the partition function which is discrete and assumes equal probabilities of all n outcomes, as the information constraint is an inequality, ≥ 0. The cumulant-generating function can be derived as .
Unlike [1], this paper focuses on a continuous linear setting with a continuous prior that is not necessarily uniform. As in [1], we consider translation-invariant cost functions, that is, cases where the setting can be reduced to an outcome with . We thus have
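The Stratonovich procedure has several instructive duality properties. Importantly, writing $\Psi(\beta) = \ln Z(\beta)$ for the cumulant-generating function, its second derivative is the variance of the cost under the exponentially tilted distribution, so we have
$$\Psi''(\beta) = \operatorname{Var}_{\beta}(c) \geq 0,$$
which implies that for any non-degenerate prior distribution, the cumulant-generating function $\Psi(\beta)$ is convex.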
An equivalent formulation of the variational problem can be obtained by replacing the cost function $c$ with a utility function $u = -c$. This change in sign inverts the optimization objective from cost minimization to utility maximization while preserving the structure of the exponential family distribution. Consequently, the cumulant-generating function remains defined as $\Psi(\beta) = \ln Z(\beta)$, but its derivative now yields the expected utility: $\Psi'(\beta) = U$. Although the statistical mechanics formulation is unchanged, the interpretation of the Lagrange multiplier shifts accordingly: under a utility-maximizing perspective, $\beta$ takes the opposite sign relative to the cost-minimizing case.
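Solving the key equations
$$U = \Psi'(\beta)$$
and
$$I = \beta\,\Psi'(\beta) - \Psi(\beta),$$
where $\Psi(\beta) = \ln Z(\beta)$ is the cumulant-generating function, jointly for all “inverse temperatures” $\beta$ yields $U(I)$, which after normalizing for the expected cost/utility under no information, $U(0)$, yields the Value of Information (VoI) as
$$V(I) = U(I) - U(0).$$
By $\Psi''(\beta) \geq 0$, we then have that its Legendre–Fenchel transform $I(U) = \sup_{\beta}\{\beta U - \Psi(\beta)\}$ is also convex. Then $U(I)$ is concave by application of duality.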
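As a purely illustrative sketch of this parametric procedure, the following Python code evaluates the key equations for a small discrete prior. The outcome probabilities, the utility values, and the function names `tilted_stats` and `value_of_information` are hypothetical choices made for illustration, and `psi` is merely our label for the cumulant-generating function. The sketch assumes the reduced single-variable setting, in which the optimal measure is an exponential tilting of the prior, and it solves for β by bisection, which relies on the information being non-decreasing in β ≥ 0.

```python
import math

def tilted_stats(beta, prior, utils):
    """Return (psi, U, I) at inverse temperature beta for a discrete prior."""
    weights = [p * math.exp(beta * u) for p, u in zip(prior, utils)]
    z = sum(weights)                       # partition function Z(beta)
    psi = math.log(z)                      # cumulant-generating function ln Z
    expected_u = sum(w * u for w, u in zip(weights, utils)) / z  # derivative of psi
    info = beta * expected_u - psi         # information in nats
    return psi, expected_u, info

def value_of_information(info_target, prior, utils, beta_hi=50.0, tol=1e-12):
    """Solve I(beta) = info_target by bisection and return U(I) - U(0).

    Assumes info_target lies below the largest attainable information,
    which for a finite prior is -ln of the prior mass on the best outcome.
    """
    u0 = tilted_stats(0.0, prior, utils)[1]   # zero-information utility
    lo, hi = 0.0, beta_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tilted_stats(mid, prior, utils)[2] < info_target:
            lo = mid
        else:
            hi = mid
    return tilted_stats(0.5 * (lo + hi), prior, utils)[1] - u0

# Hypothetical example data (not taken from the paper).
prior = [0.5, 0.3, 0.2]
utils = [0.0, 1.0, 2.0]
for info in (0.0, 0.1, 0.2, 0.3):
    print(info, value_of_information(info, prior, utils))
```

For this example, equal increments of information produce positive but shrinking gains in expected utility, in line with the concavity argument above.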
Much of the previous economic literature has focused on various forms of “non-concavities” in the value of information, e.g., Radner and Stiglitz [7]. Recently, the entropy-constrained optimization underlying VoI has also been investigated in rational inattention models [6,8], and it is closely related to entropic risk measures [9,10]. Originally, Stratonovich’s VoI theory and Shannon’s rate distortion theory focus on the utility of decisions, evaluating signal errors and their associated costs. These are, however, not economic utilities in the sense of reflecting preferences or money but are often task-specific and indirect. The theories consider the maximum possible improvement in expected utility due to additional information $I$. Formally, this improvement can be written as the difference between the constrained suprema of expected utility with information $I$ and with zero information, or more simply $V(I) = U(I) - U(0)$ as above. Notably, these formulations do not account for the cost of obtaining the extra information $I$, and VoI is strictly increasing and concave, reflecting pure gains from additional information.
In contrast, Radner and Stiglitz [7] incorporate the cost of obtaining information directly into their utility function. They define
, where
d is a decision and
is an information constraint, which is non-increasing (and even decreasing) with respect to
. Here,
represents the expected utility and differs from our
, the supremum of expected utility. Radner and Stiglitz’s
, analogous to our
, is the supremum of
. By including the cost of information, they observe a potential negative derivative of
, indicating non-concavity, which they attribute to the cost factor.
To reconcile these views, we may define a function
, where
is the expected cost of obtaining information
I. This function,
(the return from information), can be negative if the information cost exceeds the maximum possible gain
. Moreover,
does not need to be concave, and its derivative is
. Radner and Stiglitz’s formulation allows
but not
, as the supremum is taken
after paying for the information, preventing a strictly decreasing value. The static literature on rational inattention, e.g., Matějka and McKay [8], typically also incorporates the cost of obtaining information, such as the cost of entropy reduction, into the problem framework; thus, its inclusion of information costs aligns with the Radner and Stiglitz approach.
We would argue, however, following Stratonovich and Shannon, that the cost of obtaining information should be considered separately from the VoI. The VoI, $V(I)$, measures the maximum gain in expected utility related to a specific decision problem without factoring in the cost of information. This separation provides a clear upper bound that the cost of information should not exceed. By maintaining a distinction between the value and cost of information, we can better understand and optimize decision-making processes, ensuring that the pure benefits of information are accurately assessed and utilized.
Clearly, the curvature of $V(I)$, i.e., the impact of marginal information on the expected cost/utility, which determines whether acquiring additional information is worthwhile for the decision maker, depends on the specifics of the problem, namely the utility function and the stochastic environment. We focus on the implications of these specifics next.
3. The Stratonovich VoI and Quadratic Utility with a Normal Prior
We next consider the VoI under quadratic utility. Quadratic loss is a common choice in rate distortion applications. However, in economics, quadratic utility has some drawbacks, one being increasing absolute risk aversion (ARA). Here the coefficient of absolute risk aversion is given as
As wealth increases, individuals with quadratic utility become more averse to risk in absolute terms, which is not consistent with observed behavior where people tend to exhibit decreasing or constant absolute risk aversion.
To satisfy the participation constraint, a constant
can be added, and the mean may be increased by some
to
Still, for large values of wealth, the marginal utility (the additional satisfaction gained from an extra unit of wealth) can become negative. This implies that beyond a certain wealth level, individuals derive negative satisfaction from additional wealth, which is unrealistic at least in the long run. So, in essence, the use of quadratic utility is most appropriate at lower levels of wealth and consumption or for short-term consumption, where goods may well reveal a consumption maximum.
A well-known compensating advantage is that we may assume the prior distribution to be normal (Gaussian), i.e., that
where
is the “precision” of the normal.
Here the partition function can be derived analytically as
It is well known that the computation and conditions for convergence of such partition functions cannot be taken for granted and occupy a central position in statistical physics. Following the above procedure, we find the utility
as
and information
as
or
Solving
for
explicitly, we find the relevant solution for a given level of
U as
which allows us to solve for
explicitly as
which has a minimum at
. Higher values of precision
will change the slope of the convex
function as well as this minimum. With
, we can take the derivative
so that a higher precision of the prior distribution will flatten
but also move
, which we will have to take into account for conclusions about the
VoI.
Figure 1 shows the
I(U) functions for
for various precision
(blue),
(yellow), and
(green).
The effect of $f$ is unambiguous. We can show that higher values of $f$ will change the convex $I(U)$ function, increasing the VoI.
Figure 2 shows the
I(U) functions for
,
and
(blue), 5 (yellow), and 9 (green).
4. The Stratonovich VoI and Linear Utility with an Exponentially Distributed Prior
The above line of reasoning can also be applied to a “simpler” linear utility, $u(x) = x$, which implies the extreme case of no risk aversion, i.e., $-u''(x)/u'(x) = 0$. Here the individual is indifferent between a certain outcome and a risky one with the same expected value.
We assume an exponential prior $p(x) = a e^{-ax}$ for $x \geq 0$ and $a > 0$, so with $Z(\beta) = \int_0^\infty a e^{-ax} e^{\beta x}\,dx = \frac{a}{a-\beta}$ for $\beta < a$, we have
$$U(\beta) = \frac{1}{a-\beta}$$
and
$$I(\beta) = \frac{\beta}{a-\beta} - \ln\frac{a}{a-\beta}.$$
We can use $\beta = a - 1/U$ to obtain the explicit dependency
$$I(U) = aU - 1 - \ln(aU).$$
This function is convex, and it is minimized at $U = 1/a$ (i.e., the minimum depends on the rate parameter). If the reference measure tends to a uniform (i.e., $a \to 0$), then information is never minimized and the function is decreasing for all $U$. This also means that the inverse temperature $\beta$ is always negative for $a \to 0$.
This implies that one cannot really maximize the expected utility in systems where a uniform measure is taken as a reference. Intuitively, this follows from $U = 1/(a-\beta) \to -1/\beta$ as $a \to 0$, which basically means that the maximum value of expected utility should be negative if $\beta > 0$, but this is a contradiction, because we require $x \geq 0$. Negative $\beta$ corresponds to a minimization problem (and it is okay to minimize on $[0, \infty)$).
However, for any exponential reference with $a > 0$, there is an area where $\beta > 0$ (i.e., $I(U)$ is minimized at $U = 1/a$, and it is increasing for $U > 1/a$). This means that maximizing expected utility on probability measures that have finite divergence from the reference exponential measure is possible. The maximum expected utility $U(I)$ will be a concave function corresponding to the inverse of $I(U)$ for $U \geq 1/a$. Using $I(U) = aU - 1 - \ln(aU)$ and taking the derivative $\partial I/\partial a = U - 1/a$, which is positive for $U > 1/a$, an increase in the rate parameter (lowering the variance of the prior distribution) will steepen $I(U)$ and thus flatten $U(I)$ and thus the VoI, which is supported by the intuition that for $a \to 0$ we get to the uniform distribution where information is most valuable.
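The dependence of the VoI on the rate parameter can be checked numerically. The sketch below assumes a linear utility $u(x) = x$ and an exponential prior with density $a e^{-ax}$ on $x \geq 0$, for which the information needed to reach expected utility $U$ is $I(U) = aU - 1 - \ln(aU)$; this closed form, and the function names `info_of_utility`, `utility_of_info`, and `voi`, are our own illustrative reconstruction under these assumptions. The inversion is done by bisection on the increasing branch $U \geq 1/a$.

```python
import math

def info_of_utility(u, a):
    """I(U) = a*U - 1 - ln(a*U), assuming u(x) = x and prior a*exp(-a*x)."""
    return a * u - 1.0 - math.log(a * u)

def utility_of_info(info, a, tol=1e-12):
    """Invert I(U) on the increasing branch U >= 1/a by bisection."""
    lo = 1.0 / a                       # zero-information utility, I = 0
    hi = 2.0 / a
    while info_of_utility(hi, a) < info:
        hi *= 2.0                      # enlarge the bracket until it contains I
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if info_of_utility(mid, a) < info:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def voi(info, a):
    """V(I) = U(I) - U(0), with U(0) = 1/a the prior mean."""
    return utility_of_info(info, a) - 1.0 / a

# One nat of information under three hypothetical rate parameters.
for a in (0.5, 1.0, 2.0):
    print(a, voi(1.0, a))
```

Because $I$ depends on $U$ only through the product $aU$, the resulting VoI scales as $1/a$ in this sketch, so a higher rate parameter lowers the value of a given amount of information.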
Most individuals exhibit some degree of risk aversion, preferring certain outcomes over uncertain ones with the same expected value. Also with linear utility, the marginal utility of wealth is constant, meaning that each additional unit of wealth provides the same amount of utility regardless of the individual’s current level of wealth. This contradicts empirical observations where the marginal utility of wealth typically decreases as wealth increases. Risk-averse behavior is what we will allow for next.
5. The Stratonovich VoI with a CRRA Utility and an Exponentially Distributed Prior
The constant relative risk aversion (CRRA) utility function is widely used in economics due to its appealing properties. CRRA utility implies a constant coefficient of relative risk aversion, which is given as
$$R(x) = -\frac{x\,u''(x)}{u'(x)}$$
and implies that an individual’s relative aversion to risk does not change with their wealth level. This is more realistic than other utility functions (e.g., quadratic), where risk aversion may vary with wealth. As individuals become wealthier, they are willing to take on more absolute risk, which aligns with observed behavior. The utility function takes the following form:
$$u(x) = \frac{x^{1-g} - 1}{1-g}$$
for $g \neq 1$, for which $R(x) = g$. Note that by l’Hôpital’s rule, as $g \to 1$, we obtain the simple logarithmic utility function
$$u(x) = \ln x.$$
CRRA utility functions are scale-invariant, so the utility derived from consumption depends on the proportion of wealth consumed rather than on the absolute level. This property makes CRRA utility functions useful for analyzing decisions involving proportional changes, such as the investment and savings decisions in portfolio selection theories, for example the capital asset pricing model (CAPM) and the consumption-based capital asset pricing model (CCAPM). They help explain how individuals allocate their wealth among different assets based on their risk–return trade-offs. Also, investors with CRRA utility are more likely to diversify their portfolios to manage risk, as they are consistently risk-averse regardless of their wealth levels.
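The two defining properties of CRRA utility, a relative risk aversion that does not vary with wealth and the logarithmic limit, can be verified with a short numerical sketch. It assumes the normalized form $u(x) = (x^{1-g} - 1)/(1-g)$ with $u(x) = \ln x$ at $g = 1$; the function names and the finite-difference step are illustrative choices of ours.

```python
import math

def crra(x, g):
    """CRRA utility (x**(1-g) - 1)/(1-g), with the log limit at g = 1."""
    if abs(g - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - g) - 1.0) / (1.0 - g)

def relative_risk_aversion(x, g, h=1e-4):
    """Finite-difference estimate of -x * u''(x) / u'(x)."""
    step = h * x
    u_plus, u_mid, u_minus = crra(x + step, g), crra(x, g), crra(x - step, g)
    first = (u_plus - u_minus) / (2.0 * step)            # approximates u'(x)
    second = (u_plus - 2.0 * u_mid + u_minus) / step**2  # approximates u''(x)
    return -x * second / first

# The estimate should be close to g at every wealth level x.
for g in (0.5, 1.0, 2.0):
    print(g, [round(relative_risk_aversion(x, g), 4) for x in (0.5, 1.0, 10.0)])

# As g approaches 1, the CRRA utility approaches ln(x).
print(crra(3.0, 1.0 + 1e-6), math.log(3.0))
```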
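The prior distribution is again $p(x) = a e^{-ax}$ for $x \geq 0$ and $a > 0$. Thus, the partition function becomes
$$Z(\beta) = \int_0^\infty a e^{-ax} \exp\!\left(\beta\,\frac{x^{1-g} - 1}{1-g}\right) dx.$$
Unfortunately, for general $g$, this partition function has no analytic solution. First, we will therefore focus on the particular case $g = 1$ where the partition function becomes
$$Z(\beta) = \int_0^\infty a e^{-ax} x^{\beta}\,dx = a^{-\beta}\,\Gamma(1+\beta)$$
if $\beta > -1$, where $\Gamma(\cdot)$ is the gamma function:
$$\Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt.$$
Following the procedure above, we find that
$$U = \psi(1+\beta) - \ln a, \qquad I = \beta\,[\psi(1+\beta) - \ln a] - \ln[a^{-\beta}\,\Gamma(1+\beta)],$$
where $\psi$ is the digamma function, the logarithmic derivative of the gamma function. Note that with $\ln a^{-\beta} = -\beta \ln a$, the last equation simplifies to
$$I = \beta\,\psi(1+\beta) - \ln\Gamma(1+\beta),$$
which now no longer depends on the rate parameter $a$.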
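To find the value of information VoI as defined by $V(I) = U(I) - U(0)$, we need to normalize. The unconditional, zero-information expected utility can be derived either as
$$U(0) = \int_0^\infty a e^{-ax} \ln x\,dx = -\gamma - \ln a$$
or alternatively as
$$U(0) = U\big|_{\beta = 0} = \psi(1) - \ln a,$$
where $\psi(1) = -\gamma$, the negative of the Euler constant. We thus find the VoI for $g = 1$ as
$$V = \psi(1+\beta) - \ln a - [\psi(1) - \ln a]$$
or
$$V = \psi(1+\beta) + \gamma,$$
which no longer depends on the distribution parameter $a$ either.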
Hence, as neither the inverse temperature $\beta$ via $I(\beta)$ nor $V(\beta)$ will depend on $a$, the VoI will be independent of the parameter of the prior exponential distribution. Remember that this result also comprises the uniform prior distribution as a limit case. In Figure 3, we show the VoI function V(I) for any $a > 0$. The implications of this invariance property for economics, portfolio theory, and risk analysis deserve further investigation [11].
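The invariance property can be illustrated numerically. The sketch below assumes logarithmic utility and an exponential prior with rate $a$, for which $\ln Z(\beta) = -\beta \ln a + \ln \Gamma(1+\beta)$ for $\beta > -1$. It compares the closed-form expressions $I = \beta\,\psi(1+\beta) - \ln\Gamma(1+\beta)$ and $V = \psi(1+\beta) - \psi(1)$ with values obtained by differentiating $\ln Z$ numerically while keeping $a$ explicit. The helper `digamma` is a hand-written approximation (upward recurrence plus an asymptotic series), not a library routine, and all function names are ours.

```python
import math

def digamma(x):
    """Approximate digamma for x > 0 via recurrence and an asymptotic series."""
    result = 0.0
    while x < 6.0:                     # shift the argument up using psi(x) = psi(x+1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def closed_form(beta):
    """(I, V) in nats from the closed-form expressions; no dependence on a."""
    info = beta * digamma(1.0 + beta) - math.lgamma(1.0 + beta)
    voi = digamma(1.0 + beta) - digamma(1.0)
    return info, voi

def log_z(beta, a):
    """ln Z(beta) = -beta*ln(a) + ln Gamma(1 + beta), valid for beta > -1."""
    return -beta * math.log(a) + math.lgamma(1.0 + beta)

def numeric(beta, a, h=1e-5):
    """(I, V) from central differences of ln Z, with the rate a kept explicit."""
    u = (log_z(beta + h, a) - log_z(beta - h, a)) / (2.0 * h)
    u0 = (log_z(h, a) - log_z(-h, a)) / (2.0 * h)
    return beta * u - log_z(beta, a), u - u0

# Hypothetical rate parameters: the pairs (I, V) should coincide across a.
for a in (0.5, 2.0, 10.0):
    print(a, numeric(2.0, a), closed_form(2.0))
```

At $\beta = 2$, the closed form gives $V = \psi(3) - \psi(1) = 1 + 1/2 = 1.5$ for every rate, since the $\ln a$ terms cancel in both $I$ and $V$.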
Finally, we will show that this property does not hold for $g \neq 1$. Assuming
, the CRRA utility becomes
The prior distribution is again $p(x) = a e^{-ax}$ for $x \geq 0$ and $a > 0$. The unconditional expectation of this utility is
Thus, the partition function $Z(\beta)$ can be derived, and proceeding as above yields Figure 4, where we show the
VoI function
V(I) for a CRRA utility with
and
confirming that for $g \neq 1$, the invariance property no longer holds.