From Penalties to Protection: The Continuous Time Sustainable Efficiency Frontier

Müller, Lukas

doi:10.3390/jrfm18110610

Open Access Technical Note

by

Lukas Müller

Deutsche Bundesbank, 60329 Frankfurt am Main, Germany

J. Risk Financial Manag. 2025, 18(11), 610; https://doi.org/10.3390/jrfm18110610

Submission received: 26 September 2025 / Revised: 28 October 2025 / Accepted: 29 October 2025 / Published: 30 October 2025

(This article belongs to the Special Issue Featured Papers in Mathematics and Finance, 2nd Edition)

Abstract

We develop a robust continuous time portfolio optimization framework that incorporates time-varying ESG risk through dynamically evolving drift ambiguity. Building on the equivalence between linear ESG penalties in mean-variance optimization and robust formulations under drift uncertainty, we extend the analysis to a dynamic setting with time-dependent, ESG-weighted ellipsoidal ambiguity sets. The model admits a tractable solution under CRRA preferences: the worst-case drift is obtained in closed form, and the optimal portfolio strategy is characterized as the unique maximizer of an ESG-adjusted Markowitz-type objective at each point in time. Economically, the framework provides a rigorous justification for penalty-based ESG portfolio models, while offering time-consistent robustness, forward-looking risk management, and dynamic hedging against sustainability-related model risk.

Keywords:

sustainable finance; ESG integration; risk management; robust portfolio optimization; model uncertainty; financial stability and resilience; regulatory frameworks for sustainable finance

1. Introduction

1.1. Portfolio Optimization

The foundations of modern portfolio theory date back to the pioneering work of H. Markowitz (1959), who formulated the trade-off between expected return and variance in a static setting. Later, Merton (1969) introduced a dynamic framework in continuous time, modeling portfolio selection as a stochastic control problem, while the martingale approach developed by Pliska (1986), Karatzas et al. (1987), and Cox and Huang (1989) offered an alternative perspective via duality and terminal wealth characterization. These models laid the basis for a vast literature on optimal portfolio selection, risk management, and dynamic asset allocation.

1.2. Risk, Ambiguity, and Model Uncertainty

A key assumption in classical models is the precise knowledge of asset return dynamics and their underlying probability measure. However, as emphasized by Knight (1921), not all uncertainty can be expressed as measurable risk. Ellsberg (1961) further highlighted the behavioral distinction between risk, where probabilities are known, and ambiguity, where they are not. Portfolio optimization under model uncertainty has since been studied in both dominated frameworks, where alternative measures are absolutely continuous with respect to a reference measure (Z. Chen & Epstein, 2002; Garlappi et al., 2006; Liu, 2011; Schied et al., 2009), and undominated frameworks, where no such reference measure exists (Hernández-Hernández & Schied, 2006; Neufeld & Sikic, 2018; Nutz, 2016). Robust portfolio problems in continuous time, such as those studied by Biagini and Pınar (2017), typically consider drift and volatility ambiguity within these frameworks and provide optimal strategies for utility-maximizing investors.

1.3. ESG Risk and Sustainable Investment

Environmental, social, and governance (ESG) considerations have become central to modern portfolio management, as investors and institutions increasingly integrate sustainability metrics into asset selection and optimization. A prominent approach extends the classical mean–variance optimization framework of H. M. Markowitz (1952) by incorporating ESG disutilities, typically through penalty terms that reduce utility in proportion to the portfolio’s aggregate ESG exposure. Linear specifications (Blitz et al., 2024; Garcia-Bernabeu et al., 2024; Gasser et al., 2016; Utz et al., 2014) subtract a term

λ s^{⊤} w

from expected returns, thereby discouraging allocations to firms with poor ESG profiles. More general utility functions

f (s^{⊤} w)

, as in Pedersen et al. (2021), embed nonlinear preferences. A recent survey by Varmaz et al. (2022) documents such penalty-based methods as standard tools for greening portfolios.

Beyond penalization, sustainable investing has also been framed as a multi-criteria decision problem. Kanamura (2023) analyzes diversification effects of sustainable assets using CVaR-based optimization, while Feng et al. (2024) combine sustainability sentiment indicators with machine learning to construct profitable strategies. The challenge of balancing financial and ESG objectives is also highlighted in Alda (2025), who studies portfolio concentration in socially responsible pension funds. These approaches underscore the breadth of methods available for integrating sustainability preferences into portfolio design. Recent contributions such as Müller and Joubrel (2025) explicitly construct ESG-driven ambiguity sets that tilt allocations toward sustainable assets without requiring explicit penalty terms, thereby linking ESG risk to worst-case return scenarios.

Although a considerable strand of the literature has investigated optimal investment strategies that integrate sustainability considerations, the majority of these contributions, such as those outlined above, remain anchored in a one-period Markowitz framework. In contrast, studies that address portfolio choice in continuous time under sustainability constraints are comparatively scarce. Continuous time approaches offer significant advantages: they enable time-consistent decision-making, allow explicit modeling of dynamic ESG risk, and provide a natural framework for incorporating forward-looking information and scenario analysis. An example can be found in Korn and Nurkanović (2025), where the authors analyze the optimal investment problem of a life insurer facing explicit demand from clients for sustainable investment. Korn (2025) advances this discussion by shifting the perspective to the state or government, which exerts influence through regulation and mandatory requirements. Another notable exception in continuous time is provided by A. Chen et al. (2025), who employ a utility-maximization approach while explicitly incorporating carbon risk constraints.

1.4. Limitations of Existing Approaches

While a substantial body of literature has incorporated ESG considerations into portfolio choice, most models remain anchored in static or discrete-time settings. Linear or quadratic penalty functions are widely used to discourage allocations to low-rated firms, yet these approaches are essentially one-period adaptations of the Markowitz framework. They lack time consistency, do not naturally accommodate forward-looking ESG risk dynamics, and their robust optimization interpretation has remained largely unexplored.

In existing continuous time works, sustainability is often integrated via explicit, time-varying portfolio constraints. For example, A. Chen et al. (2025) impose a deterministic carbon risk limit that is designed to decrease over time, requiring the portfolio’s carbon exposure to remain below that limit throughout the horizon. Likewise, Korn and Nurkanović (2025) capture client demand by constraining the portfolio’s volume-weighted average sustainability rating over time. While such constraint-based formulations are well suited to encode minimum sustainability requirements, they treat sustainability primarily as a feasibility condition: they rely on exogenously specified, time-varying limits or demand schedules and do not explicitly capture model uncertainty associated with ESG-related risks.

1.5. Contribution of This Paper

We show that common linear ESG penalties admit an exact robustness interpretation and extend this equivalence to continuous time by introducing time-dependent, ESG-weighted ellipsoidal drift sets. For any admissible portfolio, we derive the worst-case drift in closed form and demonstrate that the max–min problem reduces pointwise to a strictly concave ESG-adjusted Markowitz program. Under CRRA preferences, the optimal policy is the unique, measurable maximizer of this tractable objective. Taken together, these results delineate a continuous time sustainable efficiency frontier, generalizing the classical mean-variance trade-off to incorporate ESG-driven model uncertainty. The framework thus provides a time-consistent, analytically transparent approach that connects heuristic ESG penalties to rigorous dynamic robust control.

Beyond its technical results, the framework has broader significance. Practically, it offers a tractable way for investors to account for ESG considerations. From a policy perspective, it guides the integration of ESG metrics into investment strategies and highlights potential implications of ESG-driven model uncertainty for portfolio risk and return.

2. ESG Penalty in Mean-Variance Optimization and Its Robust Interpretation

This section revisits the classical mean-variance portfolio optimization framework and examines the role of a linear penalty based on ESG scores, as commonly employed in sustainable investment practice. We further establish a robust optimization interpretation, where ESG scores parameterize the uncertainty in expected returns.

2.1. Static Model with Linear ESG Penalty

Consider a universe of d risky assets. The admissible portfolios are given by the long-only simplex,

W : = \{w \in R^{d} : w^{⊤} 1_{d} = 1, w_{i} \geq 0 \forall i\} .

Each asset i is characterized by its expected return

μ_{i}

and the positive definite covariance matrix

Σ ≻ 0

. Let

μ = {(μ_{1}, \dots, μ_{d})}^{⊤}

denote the vector of expected returns, and let

s \in R_{+ +}^{d}

be a vector of ESG scores, where higher values

s_{i}

indicate poorer ESG characteristics (i.e., higher ESG risk).

The investor seeks to maximize a mean-variance objective, penalized linearly by the ESG scores:

max_{w \in W} \{μ^{⊤} w - \frac{γ}{2} w^{⊤} Σ w - λ s^{⊤} w\},

where

γ > 0

is the risk aversion parameter and

λ \geq 0

controls the strength of the ESG penalty. The penalty term

λ s^{⊤} w

directly reduces allocations to assets with higher ESG scores, proportionally to the invested weights.

2.2. Robust Formulation: ESG-Dependent Return Uncertainty

An alternative perspective is to model the expected returns as uncertain within intervals determined by the ESG scores. Specifically, for a robustness parameter

ζ \geq 0

, we define the uncertainty set

U_{R} (ζ) : = \{μ^{'} \in R^{d} : μ_{i}^{'} \in [μ_{i} - ζ s_{i}, μ_{i} + ζ s_{i}] \forall i\} .

This formulation reflects the view that assets with higher ESG scores are subject to greater uncertainty in their expected returns, both upwards and downwards, proportional to

s_{i}

.

The corresponding robust mean-variance optimization problem is

max_{w \in W} min_{μ^{'} \in U_{R} (ζ)} \{{μ^{'}}^{⊤} w - \frac{γ}{2} w^{⊤} Σ w\} .

2.3. Equivalence of Linear ESG Penalty and Robust Optimization

The following result formalizes the equivalence between the linear ESG penalty and the robust optimization approach with symmetric uncertainty sets.

Before stating the theorem, we note that in the robust formulation, the inner minimization over

μ^{'}

is separable across assets and, due to the non-negativity of portfolio weights, always attains its minimum at the lower endpoint of each interval. This observation allows us to directly relate the robust objective to the penalized mean-variance objective.

Lemma 1

(Equivalence of Linear ESG Penalty and Robust Mean-Variance Optimization). Let

γ > 0

,

λ \geq 0

, and define

U_{R} (λ)

as above with

ζ = λ

. Then, for every

w \in W

,

min_{μ^{'} \in U_{R} (λ)} \{{μ^{'}}^{⊤} w - \frac{γ}{2} w^{⊤} Σ w\} = μ^{⊤} w - λ s^{⊤} w - \frac{γ}{2} w^{⊤} Σ w,

and thus, the robust and penalized mean-variance problems admit the same optimizer and objective value.

Proof.

Fix

w \in W

. By definition, for each asset i,

μ_{i}^{'} \in [μ_{i} - λ s_{i}, μ_{i} + λ s_{i}]

. Since

w_{i} \geq 0

, the minimum over

μ_{i}^{'}

is attained at

μ_{i} - λ s_{i}

. Therefore,

min_{μ^{'} \in U_{R} (λ)} w^{⊤} μ^{'} = \sum_{i = 1}^{d} w_{i} min_{μ_{i}^{'} \in [μ_{i} - λ s_{i}, μ_{i} + λ s_{i}]} μ_{i}^{'} = \sum_{i = 1}^{d} w_{i} (μ_{i} - λ s_{i}) = μ^{⊤} w - λ s^{⊤} w .

Subtracting the risk term yields the stated equivalence. The strict concavity of the mean-variance objective on the compact convex set

W

ensures uniqueness of the optimizer and objective value. □

This equivalence demonstrates that a linear ESG penalty in the mean-variance framework is mathematically identical to a robust optimization approach in which the expected returns are subject to symmetric, ESG-dependent uncertainty intervals. In both cases, the effect is to reduce allocations to assets with higher ESG scores, reflecting both investor preferences and a conservative stance towards ESG-related risks.
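The identity in Lemma 1 can be checked numerically for a single long-only portfolio. The following is a minimal sketch in plain Python; the three-asset inputs (`mu`, `s`, `Sigma`, `gamma`, `lam`, `w`) are hypothetical values chosen for illustration and do not come from the paper. The robust value is found by brute force over the corners of the box, which is valid because the objective is linear in the uncertain return vector.

```python
import itertools

def mean_variance(mu, w, Sigma, gamma):
    """Mean-variance objective mu'w - (gamma/2) w'Sigma w."""
    d = len(w)
    expected_return = sum(mu[i] * w[i] for i in range(d))
    variance = sum(w[i] * Sigma[i][j] * w[j] for i in range(d) for j in range(d))
    return expected_return - 0.5 * gamma * variance

def penalized_objective(mu, s, w, Sigma, gamma, lam):
    """Penalized objective: mean-variance term minus lam * s'w."""
    return mean_variance(mu, w, Sigma, gamma) - lam * sum(si * wi for si, wi in zip(s, w))

def robust_objective(mu, s, w, Sigma, gamma, lam):
    """Worst case over the box U_R(lam), found by enumerating its 2^d corners.

    The objective is linear in mu', so its minimum over a box sits at a corner.
    """
    intervals = [(m - lam * si, m + lam * si) for m, si in zip(mu, s)]
    return min(mean_variance(corner, w, Sigma, gamma)
               for corner in itertools.product(*intervals))

# Hypothetical inputs (illustration only, not taken from the paper).
mu = [0.06, 0.08, 0.05]              # expected returns
s = [1.0, 3.0, 2.0]                  # ESG scores, higher = riskier
Sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.0625]]       # covariance matrix
gamma, lam = 3.0, 0.005
w = [0.5, 0.2, 0.3]                  # long-only weights summing to one

gap = abs(robust_objective(mu, s, w, Sigma, gamma, lam)
          - penalized_objective(mu, s, w, Sigma, gamma, lam))
print(gap)  # should be zero up to floating-point rounding
```

Corner enumeration scales as 2^d and is used here only as an independent check of the closed-form argument in the proof.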

While the mathematical equivalence between a linear ESG penalty and robust optimization is clear, some economic and practical considerations remain. The approach assumes that higher ESG scores correspond to greater uncertainty in expected returns, which may not capture all nuances of ESG risk. In practice, ESG scores can vary across providers and over time, and the symmetric uncertainty assumption is a simplification. Nevertheless, the linear penalty offers a tractable first approximation and a useful foundation for more refined models that account for dynamic or non-linear ESG effects.

Remark 1

(On long-only constraints and general weight domains). The equivalence established in Lemma 1 relies on the long-only simplex

W

, which ensures that all portfolio weights satisfy

w_{i} \geq 0

. This condition implies that the inner minimization over the interval uncertainty set

U_{R} (λ)

always selects the lower endpoint

μ_{i} - λ s_{i}

, thereby producing the penalty term

- λ s^{⊤} w

.

If short-selling were permitted (e.g., under box constraints allowing

w_{i} < 0

), the inner minimization would select the upper endpoint of the interval for negative weights. In such cases, the robust counterpart corresponds to a penalty of the form

- λ s^{⊤} | w |

, since adverse return shifts would occur in the opposite direction for short positions. Hence, the equivalence between a linear ESG penalty and a robust formulation continues to hold when extended to general sign patterns, provided the penalty depends on the absolute portfolio weights.
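A short numerical sketch of this remark, in plain Python with hypothetical two-asset inputs (not from the paper): with one short position, the coordinate-wise worst case over the box matches the absolute-weight penalty and is strictly lower than the value produced by the signed linear penalty.

```python
def worst_case_return(mu, s, w, lam):
    """Minimum of mu'^T w over the box U_R(lam) for weights of any sign.

    The problem separates by asset: the lower endpoint is selected when
    w_i >= 0 and the upper endpoint when w_i < 0.
    """
    return sum(wi * (mi - lam * si if wi >= 0 else mi + lam * si)
               for mi, si, wi in zip(mu, s, w))

def absolute_penalty_return(mu, s, w, lam):
    """mu'w - lam * s'|w|, the penalty form with absolute weights."""
    return (sum(mi * wi for mi, wi in zip(mu, w))
            - lam * sum(si * abs(wi) for si, wi in zip(s, w)))

def linear_penalty_return(mu, s, w, lam):
    """mu'w - lam * s'w, the signed linear penalty of the long-only case."""
    return sum((mi - lam * si) * wi for mi, si, wi in zip(mu, s, w))

# Hypothetical inputs: asset 2 is sold short.
mu, s, lam = [0.06, 0.08], [1.0, 3.0], 0.005
w = [1.3, -0.3]

worst = worst_case_return(mu, s, w, lam)
abs_form = absolute_penalty_return(mu, s, w, lam)
linear_form = linear_penalty_return(mu, s, w, lam)
print(worst, abs_form, linear_form)
```

The signed linear penalty would reward the short position in the high-score asset, whereas the robust worst case penalizes it.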

The restriction to long-only weights in this section is thus primarily conceptual: it aligns with the common setup of ESG-penalized mean-variance models in the literature, which typically preclude short positions. Our main goal here is to demonstrate the robustness interpretation for this widely used specification. The subsequent continuous time model in Section 3 deliberately relaxes this restriction, defining admissible portfolio processes by square-integrability alone to provide a more general and analytically tractable dynamic framework.

Finally, we note that this paper focuses on drift ambiguity as the robust counterpart to linear ESG penalties. Extending the analysis to joint ambiguity in both drift and volatility (as in Biagini and Pınar (2017), where a similar robust approach is used without an ESG context) constitutes a natural and promising direction for future research, particularly in view of ESG-related uncertainty in higher-order risk dynamics.

3. Continuous Time Market with Time-Dependent ESG Drift Ambiguity

Motivated by the static linear-penalty equivalence of Section 2, we introduce a continuous time portfolio problem under ESG-dependent drift uncertainty. Following the approach of Müller and Joubrel (2025), we define ellipsoidal uncertainty sets for the drift, weighted by ESG scores.

Compared with box-type uncertainty sets, ellipsoids are often preferred because box sets allow independent, simultaneous worst-case shifts in each asset, an unduly conservative and typically unrealistic assumption, whereas ellipsoids couple deviations across assets and mitigate such extreme joint combinations. To capture evolving ESG risks, we extend the static ellipsoids to a time-dependent ESG-weighted ellipsoidal set, where ESG scores

s_{i} (t)

vary deterministically over time.

Remark 2

(On the economic and methodological significance of time-dependent ESG scores). A central innovation of this paper is the explicit modeling of time-dependent ESG scores within a continuous time robust portfolio framework. This approach departs fundamentally from the classical Markowitz paradigm and most of the existing sustainable investment literature, where ESG characteristics are typically treated as static, exogenously given, or at best as discrete-time updates. In reality, however, ESG profiles of firms and sectors evolve dynamically in response to regulatory changes, technological innovation, shifts in consumer preferences, and firm-level actions. By allowing the ESG score

s_{i} (t)

of each asset to vary deterministically over time, our model captures this essential feature of real-world sustainability risk.

The economic advantages of this dynamic treatment are manifold. First, it enables investors to anticipate and hedge against future changes in ESG risk, rather than merely reacting to current or historical scores. This is particularly relevant for long-term investors, such as pension funds or insurance companies, whose investment horizons may span periods of significant regulatory or societal transformation. Second, the time-dependent structure allows for the integration of forward-looking information, such as regulatory roadmaps, transition scenarios, or expert forecasts, directly into the portfolio optimization process. Third, the framework naturally accommodates risk management applications, including scenario analysis and stress testing: by specifying adverse or sector-specific ESG trajectories, risk managers can assess the impact of regulatory shocks, reputational events, or technological disruptions on optimal allocations and portfolio performance.

From a methodological perspective, the explicit time dependence of ESG scores opens new avenues for future research. One promising direction is to model

s_{i} (t)

analogously to credit spreads, decomposing the ESG score into a sector-specific component and an entity-specific (idiosyncratic) component. The sector component could be driven by macroeconomic or regulatory factors, while the entity component reflects firm-level actions and idiosyncratic events. Such a decomposition would allow for the use of economic arguments, scenario-based modeling, or even stochastic processes to describe the evolution of ESG risk over time. In particular, sector-level ESG trajectories could be informed by regulatory transition pathways or climate policy scenarios, while entity-specific paths could be modeled using firm disclosures, ESG rating transitions, or machine learning forecasts. This would enable a more granular and economically grounded integration of ESG risk into both portfolio construction and risk management.

The remainder of the exposition follows standard continuous time portfolio notation, incorporating the time-dependent ESG ellipsoids into the definition of admissible drift processes.

3.1. Mathematical Setting

Consider a finite horizon

T > 0

on a complete probability space

(Ω, F, P)

. The market contains a risk-free asset B with constant rate r and d risky stocks

P_{i}

,

i = 1, \dots, d

, with drift vectors denoted by

b (t) \in R^{d}

and volatility matrix

σ \in R^{d \times n}

, where

n \geq d

. We assume that the volatility matrix

σ \in R^{d \times n}

is nondegenerate, that is, the covariance matrix

Σ : = σ σ^{⊤}

is positive definite (i.e.,

Σ ≻ 0

). To capture the dynamics of the risky stocks, we employ an n-dimensional Brownian motion

{(W (t))}_{t \in [0, T]}

. The probability space is structured with a filtration

F = {(F_{t})}_{t \in [0, T]}

that adheres to the "usual conditions" and incorporates the filtration generated by the Brownian motion W.

Let each asset i have a deterministic, time-dependent ESG score

s_{i} : [0, T] \to \{1, \dots, S_{max}\}

, with higher values indicating greater ESG risk. Fix a strictly increasing score transformation

F : \{1, \dots, S_{max}\} \to R_{> 0}

and define the diagonal, time-dependent matrix

Ξ_{t} : = diag (F {(s_{1} (t))}^{2}, \dots, F {(s_{d} (t))}^{2}), t \in [0, T] .

Let

\bar{b} \in R^{d}

denote a nominal historical drift estimate. For a fixed robustness radius

ϵ > 0

, define the time-dependent ESG-weighted ellipsoidal uncertainty set:

\begin{matrix} B_{t} (F) : = \{b \in R^{d} : {(b - \bar{b})}^{⊤} Ξ_{t}^{- 1} (b - \bar{b}) \leq ϵ^{2}\}, t \in [0, T] . \end{matrix}

(1)

The collection of all admissible drift processes is

P : = \{b (\cdot) progressively measurable : b (t) \in B_{t} (F) \forall t \in [0, T]\} .

The set

B_{t} (F)

collects plausible drift vectors at each point in time: assets with higher ESG score

s_{i} (t)

allow larger deviations from

{\bar{b}}_{i}

, reflecting the greater uncertainty about the impact of ESG risks on future returns. The robustness parameter

ϵ

sets the overall size of the ambiguity set and thus the degree of conservatism; larger values widen the set and lead to more cautious allocations. The ESG transformation

F (\cdot)

maps raw ESG assessments into weights that modulate how uncertainty is distributed across assets; different functional forms encode different emphasis across the ESG spectrum. For a specification of

F (\cdot)

, a mathematical analysis of how

ϵ

and

F (\cdot)

shape the ellipsoidal ambiguity set, and calibration considerations, see Müller and Joubrel (2025).
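To make the construction of the set in (1) concrete, the sketch below builds the diagonal of Ξ_t from score paths and tests membership of a candidate drift vector. Everything numerical here is an assumption for illustration: the linear transformation `F(s) = 0.01 * s`, the step-function score paths in `scores_at`, and the values of `b_bar` and `eps` are hypothetical and are not the calibration of Müller and Joubrel (2025).

```python
def F(score, scale=0.01):
    """Hypothetical strictly increasing, positive score transformation."""
    return scale * score

def xi_diag(scores):
    """Diagonal entries F(s_i(t))^2 of Xi_t."""
    return [F(s) ** 2 for s in scores]

def in_ellipsoid(b, b_bar, scores, eps):
    """Check (b - b_bar)' Xi_t^{-1} (b - b_bar) <= eps^2 for diagonal Xi_t."""
    q = sum((bi - bbi) ** 2 / xi
            for bi, bbi, xi in zip(b, b_bar, xi_diag(scores)))
    return q <= eps ** 2

def scores_at(t):
    """Hypothetical deterministic paths: asset 1 stays at score 2,
    asset 2 is downgraded from score 3 to score 6 at t = 1."""
    return [2, 3 if t < 1.0 else 6]

b_bar = [0.06, 0.08]   # nominal drift estimate (hypothetical)
eps = 1.0              # robustness radius (hypothetical)
candidate = [0.06, 0.03]  # drift of asset 2 lowered by 5 percentage points

inside_before = in_ellipsoid(candidate, b_bar, scores_at(0.5), eps)
inside_after = in_ellipsoid(candidate, b_bar, scores_at(1.5), eps)
print(inside_before, inside_after)
```

Because Ξ_t is diagonal, the half-axis of the ellipsoid along asset i has length ϵ F(s_i(t)), so a downgrade of asset 2 admits drift scenarios that were excluded before the downgrade.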

We now formalize the set of admissible portfolio processes in relation to the financial market model and the drift uncertainty. The admissibility of a portfolio process is defined relative to the drift parameter, ensuring well-posedness of the wealth dynamics for all possible model specifications.

Definition 1

(Admissible portfolio process and wealth dynamics). Consider the financial market model as described above, with initial capital

x > 0

. For a given drift process

b (\cdot) \in P

, the set

A^{b}

of admissible portfolio processes consists of all d-dimensional progressively measurable processes

π (\cdot)

such that the associated wealth process

X^{π, b}

is the unique strong solution to the stochastic differential equation

\begin{matrix} X^{π, b} (0) & = x, \\ d X^{π, b} (t) & = X^{π, b} (t) (r + π {(t)}^{⊤} (b (t) - r 1_{d})) d t + X^{π, b} (t) π {(t)}^{⊤} σ d W (t), \end{matrix}

for

t \in [0, T]

, and such that the integrability condition

\int_{0}^{T} {| π_{i} (t) |}^{2} d t < \infty \quad P\text{-a.s. for all } i = 1, \dots, d

is satisfied.

The set of admissible portfolio processes under drift uncertainty is then defined as

A : = ⋂_{b (\cdot) \in P} A^{b},

i.e.,

A

consists of all progressively measurable processes

π (\cdot)

that are admissible for every possible drift process

b (\cdot) \in P

.
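For intuition, the wealth equation of Definition 1 can be simulated along one path. The sketch below uses an Euler-Maruyama discretization, which is an assumption of this illustration and not a method used in the paper; the function name `simulate_wealth` and all numerical inputs are hypothetical. The drift and the portfolio are passed as functions of time so that any deterministic element of the admissible drift set can be plugged in.

```python
import math
import random

def simulate_wealth(x0, r, b, sigma, pi, T, n_steps, seed=0):
    """One Euler-Maruyama path of
    dX = X (r + pi'(b(t) - r 1)) dt + X pi' sigma dW.

    b and pi are callables t -> list of length d; sigma is a d x n nested list.
    Returns the simulated terminal wealth X(T).
    """
    rng = random.Random(seed)
    d, n = len(sigma), len(sigma[0])
    dt = T / n_steps
    x = x0
    for k in range(n_steps):
        t = k * dt
        weights, drift = pi(t), b(t)
        excess = sum(weights[i] * (drift[i] - r) for i in range(d))
        dW = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
        diffusion = sum(weights[i] * sigma[i][j] * dW[j]
                        for i in range(d) for j in range(n))
        x += x * (r + excess) * dt + x * diffusion
    return x

# Hypothetical market with d = n = 2.
sigma = [[0.20, 0.00],
         [0.06, 0.30]]
x_T = simulate_wealth(
    x0=1.0, r=0.02,
    b=lambda t: [0.06, 0.08],      # constant drift scenario
    sigma=sigma,
    pi=lambda t: [0.25, 0.15],     # constant-proportion portfolio
    T=1.0, n_steps=250,
)
print(x_T)
```

With a zero portfolio the scheme reduces to compounding at the risk-free rate, which gives a simple sanity check. The discretized path is only an approximation of the strong solution and, unlike the exact solution, is not guaranteed to stay positive for coarse time steps.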

3.2. Investor Preferences and the Max–Min Problem

Let the investor have CRRA utility with parameter

γ > 0, γ \neq 1

:

U (x) = \frac{1}{1 - γ} x^{1 - γ}, x > 0 .

The investor faces Knightian ambiguity over the drift and solves the worst-case problem

\begin{matrix} sup_{π \in A} inf_{b (\cdot) \in P} E [U (X^{π, b} (T))] . \end{matrix}

(2)

4. Solution: Worst-Case Drift and Optimal Portfolio

We now address the robust continuous time portfolio optimization problem with time-dependent ESG-weighted drift uncertainty as formulated in Section 3. The solution proceeds by decomposing the expected utility, characterizing the pointwise worst-case drift within the time-dependent ellipsoidal uncertainty sets, and deriving the optimal portfolio strategy.

4.1. Decomposition of the Terminal Utility

For

π (\cdot) \in A

and

b (\cdot) \in P

, consider the associated wealth process

X^{π, b}

. Applying Itô’s formula to

x \mapsto x^{1 - γ}

(with

γ \neq 1

) yields the standard multiplicative decomposition

U (X^{π, b} (T)) = U (x) exp ((1 - γ) \int_{0}^{T} ϕ (π (t), b (t)) d t) M_{T} (π),

where the drift term is

ϕ (π, b) = r + π^{⊤} (b - r 1_{d}) - \frac{1}{2} γ π^{⊤} Σ π,

and the stochastic exponential is

M_{t} (π) = exp ((1 - γ) \int_{0}^{t} π {(s)}^{⊤} σ d W (s) - \frac{1}{2} {(1 - γ)}^{2} \int_{0}^{t} π {(s)}^{⊤} Σ π (s) d s) .

Note that

M (π)

depends only on

π

(and

σ

), not on the drift

b (\cdot)

. This separation is crucial for the subsequent robust analysis.
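The decomposition can be verified numerically in the special case of constant π and b, where the wealth equation has the explicit solution X(T) = x exp((r + π^⊤(b − r 1_d) − ½ π^⊤ Σ π) T + π^⊤ σ W(T)). The sketch below, with hypothetical inputs and an arbitrary value for W(T), compares both sides of the identity; the restriction to constant coefficients is an assumption made to keep the check deterministic.

```python
import math

def crra(x, gamma):
    """CRRA utility U(x) = x^(1 - gamma) / (1 - gamma), gamma != 1."""
    return x ** (1.0 - gamma) / (1.0 - gamma)

def decomposition_sides(x0, r, gamma, pi, b, sigma, T, W_T):
    """Return (U(X_T), U(x0) * exp((1 - gamma) * phi * T) * M_T)
    for constant pi and b, given the terminal Brownian value W_T."""
    d, n = len(sigma), len(sigma[0])
    Sigma = [[sum(sigma[i][k] * sigma[j][k] for k in range(n))
              for j in range(d)] for i in range(d)]
    excess = sum(pi[i] * (b[i] - r) for i in range(d))
    quad = sum(pi[i] * Sigma[i][j] * pi[j] for i in range(d) for j in range(d))
    noise = sum(pi[i] * sigma[i][k] * W_T[k] for i in range(d) for k in range(n))

    # Left-hand side: utility of the explicit terminal wealth.
    X_T = x0 * math.exp((r + excess - 0.5 * quad) * T + noise)
    lhs = crra(X_T, gamma)

    # Right-hand side: drift term phi and stochastic exponential M_T.
    phi = r + excess - 0.5 * gamma * quad
    M_T = math.exp((1.0 - gamma) * noise - 0.5 * (1.0 - gamma) ** 2 * quad * T)
    rhs = crra(x0, gamma) * math.exp((1.0 - gamma) * phi * T) * M_T
    return lhs, rhs

# Hypothetical inputs; W_T is an arbitrary realization of the Brownian motion.
lhs, rhs = decomposition_sides(
    x0=1.0, r=0.02, gamma=3.0,
    pi=[0.25, 0.15], b=[0.06, 0.08],
    sigma=[[0.20, 0.00], [0.06, 0.30]],
    T=2.0, W_T=[0.7, -1.1],
)
print(lhs, rhs)
```

The two sides agree because the quadratic terms combine as −½(1 − γ) = −½γ(1 − γ) − ½(1 − γ)², which is where the factor γ in ϕ comes from.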

4.2. Pointwise Minimization over the Ellipsoidal Uncertainty Set

Recall the time-dependent ellipsoids

B_{t} (F) = \{b \in R^{d} : {(b - \bar{b})}^{⊤} Ξ_{t}^{- 1} (b - \bar{b}) \leq ϵ^{2}\},

which are deterministic for each fixed t. Because

ϕ (π (t), b (t))

is affine in

b (t)

for fixed

π (t)

, the minimization over

b (\cdot) \in P

reduces to a pointwise-in-time minimization of the linear functional

b \mapsto π {(t)}^{⊤} b

over

B_{t} (F)

. The next lemma gives the explicit solution and verifies measurability/admissibility of the pointwise minimizer when assembled into a process.

Lemma 2

(Pointwise minimizer). Fix

t \in [0, T]

and let

π \in R^{d}

.

1. If $π \neq 0$, the unique minimizer of $b \mapsto π^{⊤} b$ over $B_{t} (F)$ is

$b^{π} (t) = \bar{b} - ϵ \frac{Ξ_{t} π}{\sqrt{π^{⊤} Ξ_{t} π}},$

and the minimum value equals $π^{⊤} \bar{b} - ϵ \sqrt{π^{⊤} Ξ_{t} π}$.
2. If $π = 0$, any $b \in B_{t} (F)$ attains the minimum; we set $b^{π} (t) = \bar{b}$ by convention.
3. The mapping $π \mapsto b^{π} (t)$ is Borel measurable on $R^{d}$ for each fixed t. In particular, if $π (\cdot)$ is progressively measurable, then so is $b^{π} (\cdot)$, and $b^{π} (\cdot) \in P$.

Proof.

(1) Let

π \neq 0

. The constrained minimization problem is

min_{b \in R^{d}} π^{⊤} b \quad \text{s.t.} \quad {(b - \bar{b})}^{⊤} Ξ_{t}^{- 1} (b - \bar{b}) \leq ϵ^{2} .

Set

u = b - \bar{b}

and minimize

π^{⊤} u

subject to

u^{⊤} Ξ_{t}^{- 1} u \leq ϵ^{2}

. Form the Lagrangian

L (u, λ) = π^{⊤} u + \frac{λ}{2} (u^{⊤} Ξ_{t}^{- 1} u - ϵ^{2}), λ \geq 0 .

A minimizer (on the compact feasible set) either lies in the interior (then

λ = 0

) or on the boundary. If

λ = 0

, the first-order condition gives

π = 0

, contradicting the assumption

π \neq 0

. Hence the constraint must be active and

λ > 0

. First-order condition in u yields

π + λ Ξ_{t}^{- 1} u = 0 ⟹ u = - λ^{- 1} Ξ_{t} π .

Enforce the active constraint

u^{⊤} Ξ_{t}^{- 1} u = ϵ^{2}

to determine

λ

:

λ^{- 2} π^{⊤} Ξ_{t} π = ϵ^{2} ⟹ λ = ϵ^{- 1} \sqrt{π^{⊤} Ξ_{t} π} .

Hence

u^{*} = - ϵ \frac{Ξ_{t} π}{\sqrt{π^{⊤} Ξ_{t} π}}, b^{*} = \bar{b} + u^{*} = \bar{b} - ϵ \frac{Ξ_{t} π}{\sqrt{π^{⊤} Ξ_{t} π}} .

Substituting in the objective gives the stated minimum value. Uniqueness follows because a nonconstant linear functional attains its minimum over a strictly convex compact set at exactly one point (here

π \neq 0

and the ellipsoid is a strictly convex set); equivalently, the first-order condition yields a unique

u^{*}

.

(2) If

π = 0

the objective is constant and every feasible b is optimal; we set

b^{π} (t) = \bar{b}

by convention.

(3) For each fixed t, the mapping

π \mapsto b^{π} (t)

is given explicitly by

b^{π} (t) = \begin{cases} \bar{b} - ϵ \frac{Ξ_{t} π}{\sqrt{π^{⊤} Ξ_{t} π}}, & \text{if } π \neq 0, \\ \bar{b}, & \text{if } π = 0 . \end{cases}

This is a composition of continuous functions on

R^{d} ∖ \{0\}

, and the value at

π = 0

can be assigned arbitrarily, so the entire mapping is Borel measurable on

R^{d}

. Therefore, if

π (\cdot)

is progressively measurable, then

b^{π} (\cdot)

is also progressively measurable and thus,

b^{π} (\cdot) \in P

. □
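The closed form of Lemma 2 is easy to test against random points on the boundary of the ellipsoid. The following plain-Python sketch assumes a diagonal Ξ_t passed as the list `xi`; the numerical inputs are hypothetical. It checks that the closed-form drift lies on the boundary, that it attains the stated minimum value, and that no sampled boundary point gives a lower value of π^⊤ b.

```python
import math
import random

def worst_case_drift(pi, b_bar, xi, eps):
    """Closed-form minimizer of b -> pi'b over the ellipsoid, Xi_t = diag(xi)."""
    norm = math.sqrt(sum(x * p * p for x, p in zip(xi, pi)))
    if norm == 0.0:
        return list(b_bar)  # convention for pi = 0
    return [bb - eps * x * p / norm for bb, x, p in zip(b_bar, xi, pi)]

def random_boundary_point(b_bar, xi, eps, rng):
    """Boundary point b_bar + eps * Xi_t^{1/2} v with v uniform on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in xi]
    length = math.sqrt(sum(c * c for c in v))
    return [bb + eps * math.sqrt(x) * c / length for bb, x, c in zip(b_bar, xi, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical inputs.
pi = [0.25, -0.10, 0.40]
b_bar = [0.06, 0.08, 0.05]
xi = [0.0004, 0.0009, 0.0036]
eps = 0.5

b_star = worst_case_drift(pi, b_bar, xi, eps)
min_value = dot(pi, b_star)
formula_value = dot(pi, b_bar) - eps * math.sqrt(sum(x * p * p for x, p in zip(xi, pi)))
radius_sq = sum((bs - bb) ** 2 / x for bs, bb, x in zip(b_star, b_bar, xi))

rng = random.Random(42)
sampled_values = [dot(pi, random_boundary_point(b_bar, xi, eps, rng))
                  for _ in range(2000)]
print(min_value, formula_value, min(sampled_values))
```

Sampling only gives a one-sided check (no sampled point should beat the closed form); the equality itself follows from the Cauchy-Schwarz inequality after the change of variables v = Ξ_t^{-1/2}(b − \bar{b})/ϵ.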

4.3. Reduced Objective and Concavity Properties

Using Lemma 2, inserting the pointwise minimizer into

ϕ

yields for every t and

π

min_{b \in B_{t} (F)} ϕ (π, b) = r + π^{⊤} (\bar{b} - r 1_{d}) - ϵ \sqrt{π^{⊤} Ξ_{t} π} - \frac{1}{2} γ π^{⊤} Σ π .

Define the pointwise reduced objective

Ψ_{t} (π) : = r + π^{⊤} (\bar{b} - r 1_{d}) - ϵ \sqrt{π^{⊤} Ξ_{t} π} - \frac{1}{2} γ π^{⊤} Σ π .

We next establish the concavity properties of

Ψ_{t}

used in the main theorem.

Lemma 3

(Concavity). For each fixed

t \in [0, T]

the function

π \mapsto Ψ_{t} (π)

is concave on

R^{d}

. If

Σ ≻ 0

and

γ > 0

then

Ψ_{t}

is strictly concave.

Proof.

Ψ_{t}

is the sum of a linear function, the negative of a norm (which is concave), and a negative definite quadratic form (which is strictly concave if

Σ ≻ 0

and

γ > 0

). The claim follows. □
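As a numerical sanity check of Lemma 3, the sketch below evaluates the midpoint inequality Ψ_t((p + q)/2) ≥ ½(Ψ_t(p) + Ψ_t(q)) on randomly drawn pairs of portfolios. This is a spot check under hypothetical parameters, not a proof; `m` denotes the nominal excess drift \bar{b} − r 1_d and `xi` the diagonal of Ξ_t.

```python
import math
import random

def psi(pi, r, m, xi, Sigma, gamma, eps):
    """Reduced objective Psi_t(pi) with m = b_bar - r*1 and Xi_t = diag(xi)."""
    d = len(pi)
    linear = sum(pi[i] * m[i] for i in range(d))
    ambiguity = math.sqrt(sum(xi[i] * pi[i] ** 2 for i in range(d)))
    quad = sum(pi[i] * Sigma[i][j] * pi[j] for i in range(d) for j in range(d))
    return r + linear - eps * ambiguity - 0.5 * gamma * quad

# Hypothetical parameters.
params = dict(
    r=0.02,
    m=[0.04, 0.06],
    xi=[0.0004, 0.0009],
    Sigma=[[0.04, 0.006], [0.006, 0.09]],
    gamma=3.0,
    eps=0.5,
)

rng = random.Random(1)
min_gap = float("inf")
for _ in range(200):
    p = [rng.uniform(-2.0, 2.0) for _ in range(2)]
    q = [rng.uniform(-2.0, 2.0) for _ in range(2)]
    mid = [(a + b) / 2.0 for a, b in zip(p, q)]
    # Concavity means the midpoint value is at least the average of the endpoints.
    gap = psi(mid, **params) - 0.5 * (psi(p, **params) + psi(q, **params))
    min_gap = min(min_gap, gap)
print(min_gap)  # nonnegative up to rounding if the midpoint inequality holds
```

The quadratic term alone contributes (γ/8)(p − q)^⊤ Σ (p − q) to each gap, which is strictly positive for p ≠ q and is the source of strict concavity.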

4.4. Optimal Portfolio Strategy

We now characterize the solution to the robust portfolio optimization problem. Due to the time-dependence of the uncertainty set, the optimal portfolio is, in general, time-dependent.

Theorem 1

(Optimal portfolio under time-dependent ESG drift uncertainty). Suppose

Σ = σ σ^{⊤} ≻ 0

and

γ > 0

. For each

t \in [0, T]

, let

π^{*} (t) \in R^{d}

be the unique maximizer of the reduced objective

Ψ_{t} (π)

, and let

b^{π^{*}} (t)

denote the corresponding worst-case drift as in Lemma 2. Then,

π^{*}

solves the robust portfolio optimization problem

sup_{π \in A} inf_{b \in P} E [U (X^{π, b} (T))],

and the value of the problem is

sup_{π \in A} inf_{b \in P} E [U (X^{π, b} (T))] = E [U (X^{π^{*}, b^{π^{*}}} (T))] = U (x) exp ((1 - γ) \int_{0}^{T} Ψ_{t} (π^{*} (t)) d t) .

Proof.

Let

π (\cdot) \in A

be arbitrary. Recall the decomposition used above:

U (X^{π, b} (T)) = U (x) exp ((1 - γ) \int_{0}^{T} ϕ (π (t), b (t)) d t) M_{T} (π),

where

ϕ (π, b) = r + π^{⊤} (b - r 1_{d}) - \frac{1}{2} γ π^{⊤} Σ π,

and

M_{t} (π) = exp ((1 - γ) \int_{0}^{t} π {(s)}^{⊤} σ d W (s) - \frac{1}{2} {(1 - γ)}^{2} \int_{0}^{t} π {(s)}^{⊤} Σ π (s) d s) .

Note that

M_{T} (π)

depends on

π

and

σ

but not on the drift process

b (\cdot)

.

Because

M_{T} (π) \geq 0

and the inner objective is multiplicative in the factor depending on b, we have for fixed

π

and each fixed

ω \in Ω

the pointwise identity

\begin{matrix} inf_{b (\cdot) \in P} \{exp ((1 - γ) \int_{0}^{T} ϕ (π (t), b (t)) d t) M_{T} (π)\} \\ = M_{T} (π) \cdot inf_{b (\cdot) \in P} exp ((1 - γ) \int_{0}^{T} ϕ (π (t), b (t)) d t) . \end{matrix}

Since the uncertainty set is time-separable (each

B_{t} (F)

depends only on the time t and deterministic quantities), the inner minimization over

b (\cdot)

reduces to a pointwise-in-time problem. By Lemma 2, for each t and admissible

π (t)

there exists a pointwise minimizer

b^{π} (t) \in B_{t} (F)

. As a progressively measurable process,

b^{π} (\cdot)

then belongs to

P

, so that the infimum over

b (\cdot)

is attained and can be interchanged with the expectation:

inf_{b (\cdot) \in P} E [U (X^{π, b} (T))] = U (x) E [M_{T} (π) \cdot inf_{b (\cdot) \in P} exp ((1 - γ) \int_{0}^{T} ϕ (π (t), b (t)) d t)] .

Moreover, Lemma 2 provides the explicit formula for the pointwise minimizer, which allows us to compute the inner integral:

inf_{b (\cdot) \in P} \int_{0}^{T} ϕ (π (t), b (t)) d t = \int_{0}^{T} Ψ_{t} (π (t)) d t .

Hence, we obtain

inf_{b (\cdot) \in P} E [U (X^{π, b} (T))] = U (x) E [M_{T} (π) exp ((1 - γ) \int_{0}^{T} Ψ_{t} (π (t)) d t)] .

At this point we use two further observations. First, by assumption on the admissible set

A

, the stochastic exponential

M_{t} (π)

is a true martingale and

E [M_{T} (π)] = 1

for every

π \in A

. Second, by Lemma 3 the function

π \mapsto Ψ_{t} (π)

is strictly concave for each fixed t, and therefore the pointwise maximizer

π^{*} (t) : = arg {max}_{π \in R^{d}} Ψ_{t} (π)

is unique; note that

π^{*} (t)

depends only on deterministic objects at time t (namely

\bar{b}, Ξ_{t}, Σ, γ

) and is therefore deterministic (measurable and non-random).

For any admissible π(\cdot) \in A we have the pointwise inequality

Ψ_{t}(π(t, ω)) \leq Ψ_{t}(π^{*}(t)) for a.e. (t, ω),

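hence, for γ < 1 (so that 1 - γ > 0 and the exponential is increasing in the integral), almost surely

exp((1 - γ) \int_{0}^{T} Ψ_{t}(π(t)) dt) \leq exp((1 - γ) \int_{0}^{T} Ψ_{t}(π^{*}(t)) dt).

For γ > 1 the inequalities between the exponentials in this and the following steps are reversed; since the CRRA utility then satisfies U(x) < 0, multiplying by U(x) restores the direction, and the resulting bound on the robust expected utility is unchanged.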

Multiplying both sides by the nonnegative random variable M_{T}(π) and taking expectations (the exponential on the right-hand side is deterministic and factors out), we get

E [M_{T} (π) exp ((1 - γ) \int_{0}^{T} Ψ_{t} (π (t)) d t)] \leq exp ((1 - γ) \int_{0}^{T} Ψ_{t} (π^{*} (t)) d t) E [M_{T} (π)] .

By the martingale property, E[M_{T}(π)] = 1, and therefore

inf_{b (\cdot) \in P} E [U (X^{π, b} (T))] \leq U (x) exp ((1 - γ) \int_{0}^{T} Ψ_{t} (π^{*} (t)) d t) .

Finally, taking π = π^{*} itself yields equality. Indeed, for π^{*} the pointwise minimizer is b^{π^{*}}(\cdot) and the same calculation as above gives

inf_{b(\cdot) \in P} E[U(X^{π^{*}, b}(T))] = U(x) E[M_{T}(π^{*}) exp((1 - γ) \int_{0}^{T} Ψ_{t}(π^{*}(t)) dt)]
= U(x) exp((1 - γ) \int_{0}^{T} Ψ_{t}(π^{*}(t)) dt),

because the exponential factor is deterministic and E[M_{T}(π^{*})] = 1. Combining the inequalities shows that π^{*} attains the supremum over π \in A and is unique by strict concavity. This completes the proof. □

In particular, Theorem 1 reveals a key structural insight: the robust continuous time portfolio problem with time-dependent ESG drift ambiguity reduces, at each instant in time, to a static maximization of the reduced objective Ψ_{t}(π). This means that, despite the dynamic and path-dependent nature of the original optimization problem, the optimal portfolio strategy can be characterized by solving, for each t, a Markowitz-type problem with an explicit, ESG-adjusted penalty term. The reduced objective

Ψ_{t}(π) = r + π^{⊤}(\bar{b} - r 1_{d}) - ϵ \sqrt{π^{⊤} Ξ_{t} π} - \frac{1}{2} γ π^{⊤} Σ π

is strictly concave (Lemma 3), and its maximizer lies on the boundary of a dynamic sustainable efficiency frontier, reflecting the trade-off between expected return, variance, and ESG-driven ambiguity. This structural parallel demonstrates that our continuous time robust model provides the natural dynamic analogue to the sustainable investment problems widely used in the finance literature. It rigorously justifies interpreting ESG-penalized mean–variance optimization as a special case of robust control under drift uncertainty and extends this perspective to a fully dynamic, time-consistent setting.
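To make the pointwise problem concrete, the following sketch computes the maximizer of Ψ_{t} at a single time for several assets. It is one possible numerical route, not the paper's method: it assumes Ξ_{t} and Σ are positive definite, uses the first-order condition \bar{b} - r 1_{d} = ϵ Ξ_{t} π / \sqrt{π^{⊤} Ξ_{t} π} + γ Σ π, and solves for the scalar k = \sqrt{π^{⊤} Ξ_{t} π} by bisection. The function names and all inputs are hypothetical.

```python
import numpy as np

def psi(pi, r, mu, Xi, Sigma, eps, gamma):
    """Reduced objective Psi_t(pi), with excess drift mu = b_bar - r * 1."""
    return r + pi @ mu - eps * np.sqrt(pi @ Xi @ pi) - 0.5 * gamma * pi @ Sigma @ pi

def optimal_weights(mu, Xi, Sigma, eps, gamma, tol=1e-12):
    """Maximizer of Psi_t, assuming Xi and Sigma are positive definite."""
    # pi = 0 is optimal when the excess drift is dominated by the ambiguity radius
    if np.sqrt(mu @ np.linalg.solve(Xi, mu)) <= eps:
        return np.zeros_like(mu, dtype=float)

    # Otherwise pi = k * v(k) with v(k) = (gamma*k*Sigma + eps*Xi)^{-1} mu,
    # where k > 0 solves sqrt(v(k)' Xi v(k)) = 1; the left side decreases in k.
    def scaled(k):
        v = np.linalg.solve(gamma * k * Sigma + eps * Xi, mu)
        return np.sqrt(v @ Xi @ v), v

    lo, hi = 0.0, 1.0
    while scaled(hi)[0] > 1.0:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if scaled(mid)[0] > 1.0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    return k * scaled(k)[1]

# Illustrative (hypothetical) inputs for two assets
r, gamma, eps = 0.02, 3.0, 0.1
mu = np.array([0.04, 0.03])
Sigma = np.array([[0.04, 0.006], [0.006, 0.09]])
Xi = np.diag([0.01, 0.04])

pi_star = optimal_weights(mu, Xi, Sigma, eps, gamma)
merton = np.linalg.solve(gamma * Sigma, mu)   # benchmark without ambiguity
print("robust weights:", pi_star)
print("Merton weights:", merton)
```

Comparing the two printed vectors shows how the ambiguity penalty shrinks positions relative to the classical weights; for a sufficiently large ϵ the function returns the zero portfolio.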

Economically, this reduction to a sequence of explicit, pointwise optimization problems has several important implications. First, it ensures analytical tractability and transparency: the optimal portfolio at each time is given by the unique maximizer of the strictly concave reduced objective Ψ_{t}(π), analogous to classical Markowitz weights but dynamically adjusted for ESG-related model risk. This allows for straightforward computation, sensitivity analysis, and a clear economic interpretation of how ESG risk and ambiguity affect optimal allocations. Second, the time-dependent structure enables investors to adapt their portfolios continuously in response to evolving ESG risk profiles, regulatory changes, or anticipated sustainability trends. In particular, it allows for forward-looking risk management, where investors can hedge not only against current ESG risks but also against future developments in a time-consistent manner. Third, by bridging the gap between static, penalty-based approaches and dynamic, robust optimization, our framework provides a unified theoretical foundation for sustainable portfolio construction that is both rigorous and directly applicable in practice. Finally, the explicit, time-local structure of the solution facilitates further extensions, such as incorporating stochastic or scenario-based ESG scores, learning mechanisms, or multi-period stress testing, thereby opening new avenues for research and practical implementation in sustainable investment.
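The time-local structure can be illustrated with a single risky asset, for which the maximizer of Ψ_{t} has a closed form: the excess drift is shrunk toward zero by ϵ \sqrt{Ξ_{t}} and then divided by γ σ^{2}. The sketch below is an illustration under stated assumptions: the linear ESG trajectory, the choice F(s) = s/5, and the link Ξ_{t} = (F(s(t)) σ)^{2} are hypothetical and not taken from the paper.

```python
import math

def optimal_weight_1d(mu, sigma, xi, eps, gamma):
    """Maximizer of r + pi*mu - eps*sqrt(xi)*|pi| - 0.5*gamma*sigma^2*pi^2."""
    shrunk = max(abs(mu) - eps * math.sqrt(xi), 0.0)  # ambiguity shrinks the excess drift
    return math.copysign(shrunk, mu) / (gamma * sigma ** 2)

def esg_score(t, T=5.0, start=4.0, end=2.0):
    """Hypothetical deterministic ESG risk trajectory s(t), improving linearly."""
    return start + (end - start) * t / T

def xi_t(t, sigma):
    """Hypothetical ambiguity scale xi_t = (F(s(t)) * sigma)^2 with F(s) = s / 5."""
    return (esg_score(t) / 5.0 * sigma) ** 2

mu, sigma, eps, gamma = 0.05, 0.2, 0.25, 3.0
path = [(t, optimal_weight_1d(mu, sigma, xi_t(t, sigma), eps, gamma)) for t in range(6)]
for t, w in path:
    print(f"t={t}: pi*={w:.4f}")
```

Under these illustrative parameters the optimal weight rises from roughly 0.08 at t = 0 to 0.25 at t = 5 as the assumed ESG risk score improves, while remaining below the ambiguity-free weight mu / (gamma * sigma^2).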

Remark 3 (Calibration and interpretation of time-dependent ESG ambiguity). The implementation of the dynamic ESG-weighted ambiguity sets requires calibration of the ESG trajectories s_{i}(t), the transformation function F(\cdot), and the robustness parameter ϵ. As outlined in Remark 2, the functions s_{i}(t) represent deterministic ESG profiles that may be derived from historical or projected sustainability indicators, regulatory scenarios, or other forward-looking pathways.

The transformation F(\cdot) converts the discrete ESG scale into weights that determine the relative impact of ESG risk on model uncertainty. Its specific form controls how strongly differences in ESG risk translate into ambiguity adjustments; linear specifications maintain proportionality, whereas convex forms place greater emphasis on high-risk assets.
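The difference between linear and convex specifications can be seen in a small example. The five-point risk scale and the two functional forms below are hypothetical choices made only for illustration; the paper does not prescribe them.

```python
def f_linear(s, s_max=5):
    """Linear transformation: weight proportional to the ESG risk score."""
    return s / s_max

def f_convex(s, s_max=5):
    """Convex (quadratic) transformation: emphasizes high-risk scores."""
    return (s / s_max) ** 2

scores = [1, 2, 3, 4, 5]   # hypothetical discrete ESG risk scale, 5 = highest risk
linear = [f_linear(s) for s in scores]
convex = [f_convex(s) for s in scores]

# Ratio between the weight of the riskiest and the safest score
spread_linear = linear[-1] / linear[0]
spread_convex = convex[-1] / convex[0]
print("linear weights:", linear, "spread:", spread_linear)
print("convex weights:", convex, "spread:", spread_convex)
```

With these choices the riskiest score receives about 5 times the weight of the safest one under the linear form and about 25 times under the quadratic form, which is the sense in which convex forms concentrate ambiguity on high-risk assets.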

The robustness radius ϵ > 0 determines the overall size of the uncertainty set and can be viewed as an implied confidence radius. Smaller values reflect higher trust in the nominal drift estimate \bar{b}, while larger values capture stronger ambiguity aversion. In practice, ϵ may be linked to statistical confidence levels or tuned via backtesting and sensitivity analysis to ensure stability of portfolio outcomes under plausible ESG-driven perturbations. A simple sensitivity figure, showing how optimal weights or certainty equivalents vary with (ϵ, F), can further illustrate the interaction between ESG amplification and model robustness.
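A sensitivity table of this kind can be sketched for one risky asset with constant parameters. The example assumes the power form of CRRA utility, so that the robust certainty equivalent of terminal wealth is x exp(T Ψ^{*}), where Ψ^{*} is the maximized reduced objective; the parameter values and the function name are hypothetical.

```python
import math

def robust_solution_1d(r, mu, sigma, xi, eps, gamma):
    """Optimal weight and maximized reduced objective for one risky asset."""
    shrunk = max(abs(mu) - eps * math.sqrt(xi), 0.0)
    weight = math.copysign(shrunk, mu) / (gamma * sigma ** 2)
    value = r + shrunk ** 2 / (2.0 * gamma * sigma ** 2)   # Psi at the optimum
    return weight, value

# Illustrative (hypothetical) constant parameters
r, mu, sigma, gamma = 0.02, 0.05, 0.2, 3.0
xi = 0.16 ** 2          # ESG-weighted ambiguity scale
x0, T = 1.0, 5.0        # initial wealth and horizon

rows = []
for eps in (0.0, 0.1, 0.2, 0.3, 0.4):
    weight, value = robust_solution_1d(r, mu, sigma, xi, eps, gamma)
    ce = x0 * math.exp(T * value)   # certainty equivalent under constant parameters
    rows.append((eps, weight, value, ce))
    print(f"eps={eps:.1f}  pi*={weight:.4f}  Psi*={value:.5f}  CE={ce:.4f}")
```

The table shows the weight and the certainty equivalent declining as ϵ grows; once ϵ \sqrt{Ξ} exceeds the absolute excess drift, the investor holds only the riskless asset and Ψ^{*} equals r.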

5. Conclusions

This paper develops a robust continuous time portfolio optimization framework that incorporates time-dependent ESG risk through dynamically evolving drift uncertainty sets. By establishing a rigorous equivalence between linear ESG penalties and robust optimization under ESG-weighted ambiguity, we provide a unified theoretical foundation that connects classical mean–variance models with modern sustainable investment practice. The explicit reduction of the dynamic problem to a sequence of tractable, Markowitz-type optimizations enables both analytical clarity and practical implementation, while allowing for forward-looking, time-consistent risk management.

Our approach highlights the importance of modeling ESG risk as a dynamic, evolving phenomenon, rather than as a static constraint or penalty. This perspective opens new avenues for integrating regulatory scenarios, sectoral transitions, and firm-level developments into portfolio construction and risk assessment. Nevertheless, the framework is built upon simplifying assumptions, most notably deterministic ESG trajectories and the exclusion of transaction costs, that delineate its theoretical scope. Extending the model to incorporate market frictions and transaction cost–sensitive structures would further enhance its empirical realism and robustness. Future research may also explore stochastic or scenario-based ESG trajectories, learning mechanisms, or multi-period stress testing, further enhancing its relevance for both academic inquiry and real-world sustainable investment. Beyond these methodological considerations, recent data-driven paradigms such as generative modeling of asset returns (Cheng & Chen, 2023) and reinforcement learning–based portfolio selection (Alzaman, 2025) offer complementary directions. Integrating robust control principles with adaptive learning architectures could yield hybrid frameworks that combine model uncertainty awareness with data-driven adaptability, thus bridging theoretical robustness and empirical flexibility.

Overall, our results demonstrate that robust, dynamic modeling of ESG risk is not only theoretically sound but also practically valuable, providing a flexible and transparent toolkit for investors seeking to align financial performance with evolving sustainability objectives.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

In preparing this manuscript, we used ChatGPT-5 to check for grammatical and typographical errors, as well as consistency and readability.

Conflicts of Interest

The author declares no conflicts of interest. This note contains the author’s personal opinions and does not represent the opinion of Deutsche Bundesbank.

References

Alda, M. (2025). Importance of portfolio optimization in SRI and conventional pension funds. Financial Innovation, 11(1), 79.
Alzaman, C. (2025). Optimizing portfolio selection through stock ranking and matching: A reinforcement learning approach. Expert Systems with Applications, 269, 126430.
Biagini, S., & Pınar, M. Ç. (2017). The robust Merton problem of an ambiguity averse investor. Mathematics and Financial Economics, 11(1), 1–24.
Blitz, D., Chen, M., Howard, C., & Lohre, H. (2024). 3D investing: Jointly optimizing return, risk and sustainability. Financial Analysts Journal, 80(2), 22–40.
Chen, A., Gerick, L., & Jin, Z. (2025). Optimizing portfolios under carbon risk constraints: Setting effective constraints to favor green investments. Energy Economics, 148, 108634.
Chen, Z., & Epstein, L. (2002). Ambiguity, risk, and asset returns in continuous time. Econometrica, 70(4), 1403–1443.
Cheng, T., & Chen, K. (2023). A general framework for portfolio construction based on generative models of asset returns. The Journal of Finance and Data Science, 9, 100113.
Cox, J. C., & Huang, C.-F. (1989). Optimal consumption and portfolio policies when asset prices follow a diffusion process. Journal of Economic Theory, 49(1), 33–83.
Ellsberg, D. (1961). Risk, ambiguity, and the Savage axioms. The Quarterly Journal of Economics, 75(4), 643–669.
Feng, X., von Mettenheim, H., Sermpinis, G., & Stasinakis, C. (2024). Sustainable portfolio construction via machine learning: ESG, SDG and sentiment. European Financial Management, 31(3), 1148–1169.
Garcia-Bernabeu, A., Hilario-Caballero, A., Tardella, F., & Pla-Santamaria, D. (2024). ESG integration in portfolio selection: A robust preference-based multicriteria approach. Operations Research Perspectives, 12, 100305.
Garlappi, L., Uppal, R., & Wang, T. (2006). Portfolio selection with parameter and model uncertainty: A multi-prior approach. The Review of Financial Studies, 20(1), 41–81.
Gasser, S., Rammerstorfer, M., & Weinmayer, K. (2016). Markowitz revisited social portfolio engineering. SSRN Electronic Journal, 258(3), 1181–1190.
Hernández-Hernández, D., & Schied, A. (2006). Robust utility maximization in a stochastic factor model. Statistics & Risk Modeling, 24(1), 109–125.
Kanamura, T. (2023). Portfolio diversification and sustainable assets from new perspectives. Journal of Asset Management, 24(7), 581–600.
Karatzas, I., Lehoczky, J. P., & Shreve, S. E. (1987). Optimal portfolio and consumption decisions for a “small investor” on a finite horizon. SIAM Journal on Control and Optimization, 25(6), 1557–1586.
Knight, F. H. (1921). Risk, uncertainty and profit (Vol. 31). Houghton Mifflin.
Korn, R. (2025). A framework for optimal portfolios with sustainable assets and climate scenarios. European Actuarial Journal, 15(1), 1–13.
Korn, R., & Nurkanović, A. (2025). Sustainable portfolio optimization and sustainable taxation. European Actuarial Journal, 1–20.
Liu, H. (2011). Dynamic portfolio choice under ambiguity and regime switching mean returns. Journal of Economic Dynamics and Control, 35(4), 623–640.
Markowitz, H. (1959). Portfolio selection. Yale University Press; New Haven.
Markowitz, H. M. (1952). Portfolio selection. The Journal of Finance, 7(1), 77–91.
Merton, R. C. (1969). Lifetime portfolio selection under uncertainty: The continuous-time case. The Review of Economics and Statistics, 51, 247–257.
Müller, L., & Joubrel, M. (2025). A novel approach to sustainable mean-variance portfolio optimization: Accounting for ESG-related uncertainty. Finance Research Letters, 85, 108056.
Neufeld, A., & Sikic, M. (2018). Robust utility maximization in discrete-time markets with friction. SIAM Journal on Control and Optimization, 56(3), 1912–1937.
Nutz, M. (2016). Utility maximization under model uncertainty in discrete time. Mathematical Finance, 26(2), 252–268.
Pedersen, L. H., Fitzgibbons, S., & Pomorski, L. (2021). Responsible investing: The ESG-efficient frontier. Journal of Financial Economics, 142(2), 572–597.
Pliska, S. (1986). A stochastic calculus model of continuous trading: Optimal portfolios. Mathematics of Operations Research, 11, 371–382.
Schied, A., Föllmer, H., & Weber, S. (2009). Robust preferences and robust portfolio choice. Handbook of Numerical Analysis, 15, 29–87.
Utz, S., Wimmer, M., Hirschberger, M., & Steuer, R. E. (2014). Tri-criterion inverse portfolio optimization with application to socially responsible mutual funds. European Journal of Operational Research, 234(2), 491–498.
Varmaz, A., Fieberg, C., & Poddig, T. (2022). Portfolio optimization for sustainable investments. Annals of Operations Research, 341(2–3), 1151–1176.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
