1. Introduction
Risk measurement is one of the central themes in modern finance, forming the foundation for regulatory frameworks, portfolio management, and stress testing of financial markets. While systemic risk is often defined in terms of cross-sectional spillovers and contagion across institutions, an equally important dimension concerns the persistence and amplification of tail losses over time in aggregate markets. This temporal dimension of risk captures how extreme losses propagate and cluster across successive periods, even in the absence of explicit cross-institutional contagion. The increasing complexity of financial systems has therefore motivated the development of robust tools for quantifying conditional and joint tail risk in stressed market environments. Traditional measures such as Value at Risk (VaR) and Expected Shortfall (ES) remain widely adopted by both practitioners and regulators. However, these measures are primarily designed to capture the risk of individual portfolios and often abstract from nonlinear dependence structures and asymmetric tail behavior that emerge during periods of market stress.
A frequently cited limitation of VaR and ES concerns their ability to reflect asymmetric dependence, nonlinear tail behavior, and clustering of extreme events. It is important to emphasize, however, that VaR and ES themselves are distribution-free risk functionals defined via quantiles and tail expectations, rather than moment-based objects. Only specific parametric or semi-parametric implementations rely on assumptions about moments or distributional forms. In practice, financial markets often exhibit skewness, heavy tails, and nonlinear dependence that are not fully summarized by variance or correlation alone. This has motivated the widespread use of higher-order co-moment functionals—such as co-skewness and co-kurtosis—as descriptive tools for characterizing asymmetric and nonlinear dependence in financial returns.
Despite their descriptive appeal, higher-order co-moment functionals do not satisfy the axioms of coherence introduced by Artzner et al. (1999). In particular, co-skewness can violate subadditivity and co-kurtosis can violate monotonicity, confirming that these functionals are not coherent risk measures and cannot serve as risk measures in any normative, regulatory, or capital-allocation sense. Their role is therefore inherently descriptive: they summarize distributional shape and nonlinear dependence but do not provide admissible measures of tail risk.
In parallel, the literature on tail-risk measurement has developed conditional risk measures that explicitly focus on joint losses under stressed conditions. Co-Expected Shortfall (CoES) inherits the coherence of ES while extending it to a conditional joint-tail framework, thereby providing a theoretically rigorous measure of conditional tail losses. Co-Value at Risk (CoVaR), while intuitive and widely used in stress-testing and policy discussions, inherits the non-coherence of VaR—most notably violating subadditivity—and captures tail events only at a quantile level without accounting for loss severity. As such, CoVaR is best interpreted as a conditional stress indicator rather than a fully coherent risk measure.
The gap in the literature lies in the tension between these two classes of tools. Higher-order co-moment functionals capture asymmetric and nonlinear dependence but lack coherence, while conditional tail-risk measures such as CoES offer normative rigor but abstract from higher-order distributional shape. The purpose of this paper is not to position co-moments as systemic risk measures, nor to model cross-sectional contagion across institutions, but rather to formally demonstrate why co-moments fail coherence axioms and to contrast them with coherent tail-based measures within a unified framework of conditional tail risk. This comparison clarifies the distinct roles played by descriptive distributional diagnostics and coherent conditional risk measures, and motivates a principled pathway for incorporating higher-order adjustments into CoES while preserving coherence.
This study contributes in three ways. First, it provides formal demonstrations—via explicit counterexamples—that co-skewness violates subadditivity and co-kurtosis violates monotonicity, clarifying their non-coherence and delimiting their role as descriptive statistics. Second, it establishes a transparent hierarchy among co-moments, CoVaR, and CoES, highlighting the trade-offs between statistical descriptiveness and normative robustness. Third, it illustrates the empirical behavior of these measures during major U.S. equity market stress episodes, emphasizing their relative stability, interpretability, and identification properties.
To maintain consistency between theory and empirics, we adopt right-tail loss notation throughout and employ a predictive, lagged-conditioning design with $X_t = L_{t-1}$ and $Y_t = L_t$, where $L_t$ denotes the loss on the index in period $t$. Although both variables are derived from the same market index, they represent distinct random variables at different points in time, rendering higher-order co-moment functionals well-defined in this single-index setting. Within this framework, co-moments are interpreted strictly as descriptive diagnostics of nonlinear temporal dependence and tail asymmetry, rather than as measures of cross-sectional contagion or systemic interconnectedness across institutions.
Conditional tail-risk measures are reported only when the joint tail contains a sufficient number of observations to ensure reliable identification, reflecting a conservative empirical design rather than numerical limitation. To assess robustness under less sparse tail conditioning, we additionally examine conditional tail measures at a less extreme tail level, without altering the qualitative interpretation of stress regimes. This approach prevents spurious tail estimates driven by a handful of extreme observations and aligns the empirical analysis with the theoretical emphasis on coherent conditional risk measurement.
The predictive interpretation follows standard tail-risk methodologies developed by Acharya et al. (2017), Brownlees and Engle (2017), and Adrian and Brunnermeier (2016), all of which condition tail-risk measures on lagged variables or dynamic state processes. Within this framework, we study short-horizon propagation of market-wide tail risk while preserving the formal structure of conditional tail-based risk measures.
The remainder of the paper is structured as follows.
Section 2 reviews the relevant literature on higher-order co-moments and coherent tail-risk measures.
Section 3 presents the definitions and mathematical properties of co-skewness, co-kurtosis, CoES, and CoVaR, including counterexamples establishing non-coherence.
Section 4 describes the empirical design, data, and results.
Section 5 concludes and discusses potential hybrid extensions that integrate higher-order distributional adjustments into coherent tail-based measures.
2. Literature Review
The measurement of financial risk has undergone substantial development over the past three decades, driven by theoretical advances and repeated episodes of market turmoil. The introduction of Value at Risk (VaR) in the 1990s marked a pivotal shift in risk management by providing a quantile-based measure of downside exposure. However, VaR soon drew criticism for failing to satisfy subadditivity, implying that diversification could perversely increase measured risk. This critique was formalized by Artzner et al. (1999), who introduced the framework of coherent risk measures and established four axioms (monotonicity, subadditivity, positive homogeneity, and translation invariance) that any theoretically sound risk measure must satisfy. Expected Shortfall (ES), also known as Conditional VaR, subsequently emerged as the most prominent coherent alternative. Acerbi and Tasche (2002) demonstrated that ES satisfies all coherence axioms while capturing the average severity of losses beyond a given quantile, and ES was later incorporated into the Basel III regulatory framework, reinforcing its practical relevance.
It is important to emphasize that VaR and ES are distribution-free risk functionals defined via quantiles and tail expectations, rather than moment-based objects. Nevertheless, many commonly used parametric and semi-parametric implementations of these measures rely on low-order moments and may fail to capture stylized features of financial returns such as skewness, heavy tails, volatility clustering, and nonlinear dependence (Bollerslev, 1986; Ding et al., 1993). These considerations motivated the use of higher-order co-moment functionals, which are designed to describe asymmetry, comovement asymmetries, and nonlinear dependence rather than to serve as risk measures.
A substantial literature documents the empirical relevance of higher-order moments and co-moments. Harvey and Siddique (2000) show that conditional skewness is priced in asset returns, highlighting the role of asymmetric dynamics in risk premia. Jurczenko and Maillet (2006) demonstrate that co-skewness and co-kurtosis affect optimal portfolio allocation, particularly for investors concerned with downside risk. More recently, Zou et al. (2025) document cross-market contagion transmitted through third- and fourth-moment channels in a U.S.–China setting, showing that higher-order co-moments react sharply during periods of market stress. While these functionals reveal meaningful distributional characteristics, they are not risk measures: they do not evaluate tail losses, they do not generate capital requirements, and they fail the coherence axioms of Artzner et al. (1999). Their role in risk analysis is therefore descriptive and diagnostic rather than normative.
The global financial crisis of 2007–2009 further exposed the limitations of portfolio-level risk measures for understanding system-wide vulnerabilities. In response, a new generation of conditional risk measures was developed to explicitly capture joint tail behavior under stressed conditions. Among the most influential is Co-Value at Risk (CoVaR), introduced by Adrian and Brunnermeier (2016), which quantifies the value at risk of one entity conditional on another being in distress. CoVaR has become widely used for assessing conditional stress and interconnectedness, yet it inherits the theoretical shortcomings of VaR, most notably the failure of subadditivity. Co-Expected Shortfall (CoES) addresses this limitation by extending ES to a conditional framework. By evaluating expected losses in the joint tail, CoES preserves coherence while providing a theoretically sound measure of conditional joint tail risk.
The literature on tail-risk measurement can be broadly divided into two complementary strands. The first focuses on cross-sectional systemic risk, examining contemporaneous contagion and spillovers across institutions or market segments. This strand includes CoVaR, MES, SRISK, network-based contagion models (Diebold & Yilmaz, 2014), and copula-based dependence structures (Patton, 2012), all of which are designed to capture joint distress and interconnectedness across distinct entities. The second strand emphasizes time-series tail risk, studying the temporal propagation and persistence of extreme losses within a given market or institution. In this predictive setting, lagged conditioning variables and dynamic state processes are routinely employed.
Notably, Acharya et al. (2017) and Brownlees and Engle (2017) estimate MES and SRISK using dynamic models in which conditional tail risk depends on lagged returns and volatility states. Similarly, empirical implementations of CoVaR following Adrian and Brunnermeier (2016) condition risk measures on lagged macro-financial variables. These predictive approaches are widely used in risk management and stress testing, where the objective is to forecast near-term vulnerability rather than to measure contemporaneous cross-institutional contagion.
The present study lies at the intersection of these strands. Conceptually, our theoretical analysis adopts the axiomatic framework of conditional tail risk and coherence, which is often invoked in cross-sectional definitions of systemic risk. Empirically, however, we adopt a deliberately parsimonious time-series design that examines how lagged market distress conditions future tail losses of the same market index. This design does not aim to measure cross-sectional contagion across institutions; rather, it serves as a transparent empirical illustration of how higher-order co-moment diagnostics, CoVaR, and CoES behave within a predictive conditional tail-risk framework.
Taken together, the literature reveals a clear hierarchy among commonly used tools. Higher-order co-moment functionals provide valuable descriptive information about asymmetry and nonlinear dependence but lack coherence and cannot serve as normative risk measures. CoVaR offers an intuitive conditional stress indicator but inherits the non-coherence of VaR. CoES emerges as the coherent benchmark for conditional tail risk, combining theoretical rigor with sensitivity to joint extreme losses. By formalizing these distinctions within a unified framework, this paper clarifies the respective roles of descriptive diagnostics and coherent tail-risk measures, and motivates future extensions that integrate higher-order distributional information into coherent conditional risk frameworks.
3. Mathematical Properties of Co-Moment Functionals and Conditional Tail Risk Measures
The purpose of this section is clarificatory and expository. We formalize the analytical objects examined in the paper, namely higher-order co-moment functionals (co-skewness and co-kurtosis) and conditional tail risk measures (Co-Expected Shortfall and Co-Value at Risk), and place them within the coherence framework of Artzner et al. (1999). As emphasized in the Introduction, co-skewness and co-kurtosis are not risk measures in any normative, regulatory, or capital-allocation sense. They are purely descriptive statistics summarizing asymmetry and nonlinear dependence. The objective here is therefore not to elevate co-moments to systemic risk measures, but to demonstrate formally and transparently why they fail the coherence axioms of Artzner et al. (1999), thereby justifying their role as diagnostic tools rather than actionable measures of conditional tail fragility.
For clarity, we adopt the following working definition of a conditional tail risk measure: a conditional tail risk measure is a functional that evaluates the loss of one component of a financial system conditional on stress in another component, institution, or market segment. Formally, given a bivariate loss vector $(X, Y)$, such a measure quantifies the behavior of $Y$ when $X$ enters a distress region such as $\{X \ge \mathrm{VaR}_\alpha(X)\}$. This conditional perspective is central to the tail-risk and systemic-risk literatures, where joint extreme events and stress conditioning play a defining role.
Under this definition, CoVaR and Co-Expected Shortfall are canonical conditional tail risk measures, whereas higher-order co-moment functionals do not qualify as conditional risk metrics and should be interpreted solely as distributional descriptors. This distinction is fundamental: co-moments do not assess losses, do not yield capital requirements, and fail key coherence axioms, while CoES—by inheriting the coherence of ES—provides a rigorously interpretable measure of joint tail fragility.
To establish a unified analytical framework, we present the mathematical definitions of co-skewness, co-kurtosis, CoVaR, and Co-Expected Shortfall, and evaluate each within the coherence framework of Artzner et al. (1999), which consists of monotonicity, subadditivity, positive homogeneity, and translation invariance. For the higher-order co-moments, we provide streamlined counterexamples that highlight, in the most transparent form, the structural reasons why co-skewness violates subadditivity and co-kurtosis violates monotonicity. These diagnostics demonstrate that higher-order co-moments, although useful as descriptive functionals, cannot serve as coherent risk measures. In parallel, we show that CoVaR inherits the non-subadditivity of VaR, whereas CoES satisfies all four coherence axioms.
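For reference, the four axioms can be written out explicitly. The block below states them in right-tail loss notation, so that larger values of a position mean larger losses; the symbol $\rho$ for a generic risk functional and the constants $\lambda$ and $c$ are notational assumptions made here for illustration. In this convention translation invariance adds the constant, whereas in a net-worth convention the sign of $c$ flips.

```latex
\begin{align*}
&\text{Monotonicity:} && X \le Y \ \text{a.s.} \;\Rightarrow\; \rho(X) \le \rho(Y), \\
&\text{Subadditivity:} && \rho(X + Y) \le \rho(X) + \rho(Y), \\
&\text{Positive homogeneity:} && \rho(\lambda X) = \lambda\,\rho(X) \quad \text{for all } \lambda \ge 0, \\
&\text{Translation invariance:} && \rho(X + c) = \rho(X) + c \quad \text{for all } c \in \mathbb{R}.
\end{align*}
```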
Table 1 provides a taxonomy distinguishing “tail risk” from “extreme conditional tail risk.” Tail risk refers to exceedances beyond moderate quantiles (e.g., 5% or 1% VaR), whereas extreme conditional tail risk concerns joint realizations in the far tail with very small probabilities under stress conditioning. In this taxonomy, CoVaR focuses on conditional quantiles at a given probability level, while CoES evaluates joint expected losses in the extreme tail and therefore captures conditional tail fragility more comprehensively. Higher-order co-moments may reflect asymmetric dependence but do not isolate the joint tail region in the way coherent conditional tail risk measures such as CoES do.
The remainder of this section proceeds as follows.
Section 3.1 presents the non-coherence of co-skewness and co-kurtosis, including formal proofs and counterexamples.
Section 3.2 introduces Co-Expected Shortfall and CoVaR, highlighting their respective coherence properties.
Section 3.3 synthesizes the findings in a unified comparative framework that positions co-moments as diagnostic statistics and CoES as a coherent benchmark for conditional tail-risk measurement.
3.1. Non-Coherence of Higher-Order Co-Moment Functionals
The purpose of this subsection is clarificatory and pedagogical. Higher-order co-moment functionals describe how deviations in one asset’s returns interact with the distributional shape of another. They summarize asymmetry (co-skewness) and tail-heaviness (co-kurtosis) in a multivariate setting and therefore play a useful descriptive role in the empirical literature on nonlinear dependence and contagion-like comovements. However, they are not risk measures in the sense of Artzner et al. (1999); they do not evaluate losses, they generate no capital requirements, and, crucially, they violate key coherence axioms. The objective here is not to contribute new theoretical results, but to present transparent, self-contained counterexamples that formally demonstrate why co-moments cannot serve as coherent or conditional tail risk measures. This framing aligns with the paper’s aim of positioning co-moments strictly as diagnostic descriptors rather than normative tools for conditional or systemic tail-risk assessment.
3.1.1. Failure of Subadditivity for Co-Skewness
Proposition 1. Co-skewness fails the subadditivity axiom and therefore cannot be a coherent risk measure.
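Proof. For two random variables $X$ and $Z$, co-skewness is
$$\mathrm{CoSkew}(X, Z) = E\big[(X - E[X])^2\,(Z - E[Z])\big].$$
Consider the functional $\rho(X) = \mathrm{CoSkew}(X, Z)$ for fixed $Z$. Subadditivity requires
$$\rho(X_1 + X_2) \le \rho(X_1) + \rho(X_2).$$
Let $\tilde X_i = X_i - E[X_i]$, $\tilde Z = Z - E[Z]$, etc. Then
$$\rho(X_1 + X_2) = \rho(X_1) + \rho(X_2) + 2\,E[\tilde X_1 \tilde X_2 \tilde Z].$$
Whenever the triple product $E[\tilde X_1 \tilde X_2 \tilde Z] > 0$, subadditivity is violated.
Counterexample. Let $\varepsilon_1, \varepsilon_2$ be independent Rademacher variables, with $P(\varepsilon_i = 1) = P(\varepsilon_i = -1) = 1/2$. Set $X_1 = \varepsilon_1$, $X_2 = \varepsilon_2$, and $Z = \varepsilon_1 \varepsilon_2$. Because $E[X_1] = E[X_2] = E[Z] = 0$, we have $\tilde X_1 = X_1$, etc. Then
$$\rho(X_1) + \rho(X_2) = E[\varepsilon_1^2\,\varepsilon_1 \varepsilon_2] + E[\varepsilon_2^2\,\varepsilon_1 \varepsilon_2] = 2\,E[\varepsilon_1 \varepsilon_2] = 0,$$
but
$$\rho(X_1 + X_2) = \rho(X_1) + \rho(X_2) + 2\,E[\varepsilon_1^2 \varepsilon_2^2] = 2.$$
Thus
$$\rho(X_1 + X_2) > \rho(X_1) + \rho(X_2),$$
violating subadditivity. □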
The counterexample is presented in a deliberately minimal form to make the violation of subadditivity fully transparent. Co-skewness captures how asymmetric shocks in one asset covary with fluctuations in another—particularly during periods of market stress—which explains its usefulness in descriptive analyses of nonlinear dependence and contagion-like comovements. However, its failure to satisfy subadditivity means that diversification can appear to increase the assessed “risk,” contradicting a core principle of coherent measurement. Combined with its substantial sampling variability, this limitation confines co-skewness to a diagnostic or exploratory role rather than a coherent measure of conditional tail fragility.
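A violation of this kind can be verified by exhaustive enumeration over the four equally likely outcomes of two independent Rademacher signs. The sketch below assumes the unnormalised co-skewness $E[(X - E[X])^2 (Z - E[Z])]$ and one admissible choice of variables, $X_1 = \varepsilon_1$, $X_2 = \varepsilon_2$, $Z = \varepsilon_1 \varepsilon_2$; the function names and this particular benchmark are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Four equally likely outcomes of two independent Rademacher signs (e1, e2).
OUTCOMES = list(product((-1, 1), repeat=2))
PROB = Fraction(1, 4)


def expect(f):
    """Exact expectation of f(e1, e2) over the four outcomes."""
    return sum(PROB * f(e1, e2) for e1, e2 in OUTCOMES)


def coskew(x, z):
    """Unnormalised co-skewness E[(X - EX)^2 (Z - EZ)] (assumed definition)."""
    mx, mz = expect(x), expect(z)
    return expect(lambda e1, e2: (x(e1, e2) - mx) ** 2 * (z(e1, e2) - mz))


def x1(e1, e2):
    return e1


def x2(e1, e2):
    return e2


def z(e1, e2):
    return e1 * e2  # assumed benchmark variable


def x_sum(e1, e2):
    return e1 + e2


rho_x1 = coskew(x1, z)
rho_x2 = coskew(x2, z)
rho_sum = coskew(x_sum, z)

# Subadditivity would require rho_sum <= rho_x1 + rho_x2.
print("rho(X1) + rho(X2) =", rho_x1 + rho_x2)
print("rho(X1 + X2)      =", rho_sum)
print("subadditive?      ", rho_sum <= rho_x1 + rho_x2)
```

Exact rational arithmetic is used so that the comparison does not depend on floating-point rounding.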
3.1.2. Failure of Monotonicity for Co-Kurtosis
Proposition 2. Co-kurtosis fails the monotonicity axiom and therefore cannot be a coherent risk measure.
Proof. Co-kurtosis is defined as
Consider
. Monotonicity requires: if
almost surely, then
.
Counterexample. Let and let Y be Bernoulli with . Clearly a.s.
Define
so
when
and
when
. Then
,
, and a direct computation yields
Hence
violating monotonicity. □
Monotonicity requires that a position that is everywhere riskier should not be assigned a smaller value. Co-kurtosis fails this basic ordering property: its sign and magnitude depend critically on how the benchmark variable Z aligns with the realizations of Y, especially in the tails. Although co-kurtosis can highlight channels of joint extreme behavior or crash comovements, its violation of monotonicity and extreme sensitivity to tail alignment make it unreliable for conditional tail-risk monitoring or capital-allocation purposes. Consequently, co-kurtosis belongs in the class of descriptive statistics that provide insight into nonlinear tail dependence, not in the class of coherent or systemic tail-risk measures.
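One construction consistent with this description can be checked by direct enumeration. The sketch assumes the asymmetric co-kurtosis $E[(Y - E[Y])(Z - E[Z])^3]$, a constant position $X = 0$, a Bernoulli loss $Y$ with success probability 1/2, and a benchmark $Z = 1 - Y$; all four choices, and the function names, are illustrative assumptions.

```python
from fractions import Fraction

# Two states of the world, indexed by the realisation of Y (assumed p = 1/2).
STATES = (0, 1)
PROB = Fraction(1, 2)


def expect(f):
    """Exact expectation of f(state) over the two states."""
    return sum(PROB * f(s) for s in STATES)


def cokurt(v, z):
    """Asymmetric co-kurtosis E[(V - EV)(Z - EZ)^3] (assumed definition)."""
    mv, mz = expect(v), expect(z)
    return expect(lambda s: (v(s) - mv) * (z(s) - mz) ** 3)


def x_pos(s):
    return 0  # constant position, never larger than Y


def y_pos(s):
    return s  # Bernoulli loss


def z_bench(s):
    return 1 - s  # benchmark moves against Y (assumed)


rho_x = cokurt(x_pos, z_bench)
rho_y = cokurt(y_pos, z_bench)

# X <= Y in every state, so monotonicity would require rho_x <= rho_y.
print("X <= Y everywhere:", all(x_pos(s) <= y_pos(s) for s in STATES))
print("rho(X) =", rho_x, " rho(Y) =", rho_y)
print("monotone?", rho_x <= rho_y)
```

The sign of the result is driven entirely by how the benchmark is aligned with $Y$, which is the sensitivity the paragraph above describes.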
3.2. Coherent Systemic Risk Measures
Expected Shortfall (ES) is the canonical example of a coherent risk measure (Acerbi & Tasche, 2002). Its conditional extension, Co-Expected Shortfall (CoES), evaluates expected losses in the joint tail and is defined by
$$\mathrm{CoES}_{\alpha,\beta}(Y \mid X) = E\big[\,Y \mid X \ge \mathrm{VaR}_\alpha(X),\; Y \ge \mathrm{VaR}_\beta(Y)\,\big],$$
where both variables are interpreted as losses in systemic applications. Under standard integrability and measurability conditions, CoES inherits monotonicity, subadditivity, positive homogeneity, and translation invariance from ES, and is therefore coherent.
CoES directly implements the definition of a systemic risk measure introduced in Section 3: it evaluates the magnitude of Y’s losses conditional on X entering a distress region. Unlike CoVaR, which captures only a quantile shift, CoES accounts for the severity of losses in the joint tail. This makes CoES particularly well suited to systemic contexts in which the propagation of large losses, not merely threshold exceedances, is central. Its coherence ensures that diversification is rewarded, capital scaling is respected, and comparisons across institutions are meaningful. Regulatory frameworks such as Basel III and IV explicitly prefer ES over VaR for these reasons, reinforcing CoES as the natural coherent benchmark for systemic-risk measurement. Although estimation of joint tails presents computational challenges, CoES remains the most theoretically rigorous and policy-relevant systemic measure among those considered in this paper.
CoVaR, by contrast, is not coherent. It captures only a conditional quantile and ignores the severity of tail losses beyond the threshold. Formally, for conditioning variable
X and target variable
Y,
Because it inherits all theoretical limitations of VaR—most notably the failure of subadditivity—it may assign lower risk values to more diversified positions. This violates the axiomatic foundation of coherent measurement and limits CoVaR’s suitability for capital allocation or regulatory use.
Failure of Subadditivity
Let
X be Bernoulli with
. Let
be i.i.d. losses such that, conditional on
,
Then
But
takes values
, and its conditional median given
is 1. Thus,
violating subadditivity. Therefore, CoVaR cannot be coherent.
CoVaR extends VaR to a conditional setting and provides an intuitive communication tool for stress testing: it quantifies how the $\alpha$-quantile of Y shifts when X becomes distressed. However, because it neglects loss severity beyond the threshold and fails to satisfy coherence axioms, its role is necessarily limited to descriptive or scenario-analysis applications. It is informative but not normative, and should not be interpreted as a coherent systemic risk measure.
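The failure of subadditivity for a conditional median can be reproduced with a two-point loss distribution. The sketch below works entirely under the distress event and assumes that, conditional on distress, each loss equals 1 with probability 2/5 and 0 otherwise; this probability, the median level, and the function names are illustrative assumptions chosen so that each individual conditional median is 0 while the conditional median of the sum is 1.

```python
from fractions import Fraction
from itertools import product

P_LOSS = Fraction(2, 5)  # assumed P(Y_i = 1 | distress); illustrative value
ALPHA = Fraction(1, 2)   # quantile level: the conditional median


def quantile(pmf, alpha):
    """Smallest value v with P(V <= v) >= alpha for a discrete pmf {value: prob}."""
    cumulative = Fraction(0)
    for value in sorted(pmf):
        cumulative += pmf[value]
        if cumulative >= alpha:
            return value
    raise ValueError("probabilities do not reach the requested level")


# Conditional distribution of a single loss under distress.
pmf_single = {0: 1 - P_LOSS, 1: P_LOSS}

# Conditional distribution of Y1 + Y2 for conditionally i.i.d. losses.
pmf_sum = {}
for (a, pa), (b, pb) in product(pmf_single.items(), repeat=2):
    pmf_sum[a + b] = pmf_sum.get(a + b, Fraction(0)) + pa * pb

covar_single = quantile(pmf_single, ALPHA)
covar_sum = quantile(pmf_sum, ALPHA)

# Subadditivity would require covar_sum <= 2 * covar_single.
print("conditional median of each loss:", covar_single)
print("conditional median of the sum:  ", covar_sum)
print("subadditive?", covar_sum <= 2 * covar_single)
```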
Because systemic risk measures operate on random vectors, their behaviour depends critically on the form of dependence between components. Higher-order co-moment functionals grow in magnitude when nonlinear or asymmetric dependence strengthens and vanish under independence. CoVaR is sensitive to dependence because it measures a conditional quantile: if X and Y are independent, the conditional quantile of Y coincides with its unconditional quantile. CoES is even more sensitive because it averages losses over the joint tail region. As extreme co-movement increases, the joint tail becomes heavier and CoES rises systematically. Under independence, co-moments vanish, CoVaR reduces to the unconditional VaR of Y, and CoES reduces to the unconditional ES of Y, providing a clean hierarchy:
This hierarchy will be central to the empirical analysis in Section 4, where we illustrate how these measures behave across multiple stress episodes.
This clarification highlights how marginal tail properties and dependence structure jointly shape the behaviour of systemic risk functionals, and why only CoES satisfies the full coherence axioms while capturing severity in the joint tail.
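The behaviour under independence follows from factorisation of the joint law. The short derivation below is a sketch under stated assumptions: $A$ is any distress set with $P(X \in A) > 0$, $q_X$ and $q_Y$ are thresholds with positive tail probability, CoES is taken in its joint-tail form, co-skewness is taken in unnormalised form, and all moments involved are finite. The symbols $A$, $q_X$, and $q_Y$ are notation introduced here for illustration.

```latex
\begin{align*}
P(Y \le y \mid X \in A)
  &= \frac{P(Y \le y)\,P(X \in A)}{P(X \in A)} = P(Y \le y), \\
E[\,Y \mid X \ge q_X,\; Y \ge q_Y\,]
  &= \frac{E[\,Y\,\mathbf{1}\{Y \ge q_Y\}\,]\;P(X \ge q_X)}{P(Y \ge q_Y)\;P(X \ge q_X)}
   = E[\,Y \mid Y \ge q_Y\,], \\
E[\,(X - E[X])^2\,(Y - E[Y])\,]
  &= E[\,(X - E[X])^2\,]\;E[\,Y - E[Y]\,] = 0.
\end{align*}
```

The first line shows that every conditional quantile of $Y$ equals the corresponding unconditional quantile, the second that the joint-tail expectation collapses to the tail expectation of $Y$ alone, and the third that the co-moment vanishes.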
3.3. Comparative Synthesis and Implications
Table 1 summarizes the conceptual and mathematical distinctions among the four measures examined in this paper. Higher-order co-moment functionals (co-skewness and co-kurtosis) provide descriptive information on asymmetry and joint tail shape but do not satisfy coherence axioms, which restricts their role to diagnostic analysis rather than formal risk measurement. Co-Expected Shortfall (CoES), by contrast, is fully coherent and captures joint tail severity under stress conditioning, making it the most theoretically rigorous and practically relevant conditional tail risk measure. Co-Value at Risk (CoVaR) occupies an intermediate position: it offers an intuitive conditional-quantile view of stress transmission but is not coherent and ignores tail severity beyond the quantile.
In
Table 1, “tail risk” refers to losses exceeding moderate quantile thresholds (e.g., 5%–1% VaR), while “extreme conditional tail risk” refers to the far joint tail (
) under stress conditioning. Co-moment functionals summarize distributional asymmetry and tail shape, whereas CoVaR and CoES evaluate conditional or joint tail behavior directly.
The theoretical results highlight why these measures play fundamentally different roles. Co-skewness and co-kurtosis, although informative for assessing asymmetry and nonlinear dependence, fail subadditivity and monotonicity, respectively. These violations disqualify them from serving as risk measures in the sense of Artzner et al. (1999), but they remain useful for exploratory analyses of dependence patterns and distributional distortions.
Co-Expected Shortfall emerges as the most robust and policy-aligned conditional tail risk measure. It respects all coherence axioms, captures both the likelihood and the severity of joint tail losses, and scales consistently with exposure. Its adoption in Basel III/IV underscores its suitability for supervisory stress testing, capital design, and conditional tail-risk monitoring.
CoVaR continues to be valuable as an intuitive scenario and communication tool—its quantile-based formulation is straightforward and widely understood. However, because it neglects tail severity and fails subadditivity, its appropriate role is as a complement to, rather than a substitute for, CoES.
Taken together, the findings support a clear hierarchical interpretation of tail-risk tools:
Co-moments: descriptive supplements for diagnosing asymmetry and nonlinear dependence;
CoVaR: intuitive stress-test indicator capturing conditional quantiles;
CoES: coherent foundation for conditional tail-risk measurement, capital assessment, and regulatory analysis.
For researchers, this hierarchy clarifies the theoretical boundaries between descriptive functionals and coherent conditional risk measures. For practitioners and policymakers, it provides a structured toolkit for evaluating market fragility—from descriptive diagnostics, to scenario-based signals, to rigorous, capital-relevant measures grounded in coherence.
4. Empirical Design and Results
While the theoretical analysis clarifies the distinction between descriptive co-moment functionals and coherent tail-based risk measures, empirical analysis is essential for illustrating their behavior during episodes of market stress. In this section, we evaluate co-skewness and co-kurtosis strictly as descriptive indicators of nonlinear dependence and tail asymmetry, and we examine Co-Expected Shortfall (CoES) and Co-Value at Risk (CoVaR) as conditional tail-risk measures. Importantly, the empirical analysis is not intended to measure cross-sectional systemic contagion across institutions. Instead, it focuses on temporal tail dependence and short-horizon propagation of market stress within a single broad market index. Accordingly, the empirical results should be interpreted as evidence on market-level tail dynamics under predictive conditioning, rather than as a full systemic-risk assessment in the cross-sectional sense.
4.1. Data, Stress Windows, and Estimation Strategy
We use daily close-to-close prices for the S&P 500 index over the period 2007–2023, a span that includes both tranquil regimes and several well-documented episodes of market turmoil.
A central choice in our empirical strategy is the adoption of a predictive, lagged-conditioning design, $X_t = L_{t-1}$ and $Y_t = L_t$, where $L_t$ denotes the day-$t$ loss, which studies how one-day-ahead tail losses respond to stress in the immediately preceding period. This design parallels predictive tail-risk frameworks such as MES and SRISK (Acharya et al., 2017; Brownlees & Engle, 2017), in which conditional tail losses depend explicitly on lagged information sets. Similarly, CoVaR implementations following Adrian and Brunnermeier (2016) condition quantiles on lagged state variables. Our approach therefore preserves the formal definitions of conditional tail-risk measures, while shifting the conditioning set by one period in a forecasting-oriented manner. This design enables short-horizon analysis of temporal tail-risk propagation without invoking cross-sectional contagion across distinct institutions.
Although CoVaR and CoES are often introduced in cross-sectional settings involving distinct institutions or assets, their defining feature is conditional tail evaluation rather than cross-sectional structure per se. In the present predictive, single-index design, $X_t = L_{t-1}$ and $Y_t = L_t$ represent distinct random variables at successive points in time, so conditional tail measures remain well-defined. Under this interpretation, CoVaR and CoES quantify how current-period tail losses respond to lagged market stress, capturing temporal conditional tail dependence and short-horizon propagation of extreme losses rather than contemporaneous spillovers across institutions. This usage is consistent with the operational implementation of MES, SRISK, and CoVaR, which condition tail risk on lagged returns or state variables in forecasting-oriented frameworks.
Returns are computed as log differences, $r_t = \ln P_t - \ln P_{t-1}$, where $P_t$ is the closing price on day $t$, and losses are defined as $L_t = -r_t$. Non-trading days are removed. As a robustness check, we also consider winsorizing
at the
quantiles, although all baseline results use the raw, unadjusted data.
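The return, loss, and lagged-pairing steps can be sketched in a few lines. The sketch assumes that losses are negated log returns, consistent with the right-tail loss convention, and that the conditioning variable is the previous day's loss; the helper names and the short price list are hypothetical and carry no empirical content.

```python
import math


def log_returns(prices):
    """Close-to-close log returns r_t = ln(P_t) - ln(P_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def losses_from_returns(returns):
    """Right-tail loss convention: a positive value is a loss."""
    return [-r for r in returns]


def lagged_pairs(losses):
    """Pairs (X_t, Y_t) = (L_{t-1}, L_t) for the predictive design."""
    return list(zip(losses[:-1], losses[1:]))


# Hypothetical closing prices, for illustration only.
prices = [100.0, 98.0, 99.0, 95.0]
losses = losses_from_returns(log_returns(prices))
pairs = lagged_pairs(losses)

for x_t, y_t in pairs:
    print(f"lagged loss X = {x_t:+.4f}, current loss Y = {y_t:+.4f}")
```

Building the pairs once and reusing them for every statistic keeps the co-moment, CoES, and CoVaR calculations on the same information set.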
Within the full sample, we focus on five periods of pronounced market stress in U.S. equity markets. These include the Lehman phase of the Global Financial Crisis (September 2008–March 2009), the Eurozone spillover (July 2011–June 2012), the China devaluation and energy bust (August 2015–February 2016), the COVID-19 crash (February 2020–June 2020), and the inflation-and-Ukraine bear market (February 2022–October 2022). These episodes represent a broad spectrum of market stress events, ranging from globally synchronized crises to more localized and short-lived disruptions.
To maintain internal consistency across all objects, we apply the same predictive lagged-conditioning framework to co-moment statistics, CoES, and CoVaR. This ensures that all measures are evaluated with respect to the same information set, avoiding the conceptual mismatch that would arise if co-moments were computed using contemporaneous values while tail-risk measures used predictive conditioning. Thus, differences in empirical behavior reflect differences in measure construction—not differences in data alignment.
For each window
, we compute sample means
and
and define centered series
and
. The descriptive co-moment functionals are computed as
These quantities are used purely to illustrate nonlinear dependence and asymmetric co-movement during crises; they are not interpreted as risk measures and do not carry normative or capital-allocation meaning.
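A hedged sketch of a standardized sample co-moment is given below. The exact cross-product orders used in the paper are not recoverable from the text, so the function takes the orders `i` and `j` as arguments; treating co-skewness as order (1, 2) and co-kurtosis as order (1, 3) is one common convention and is an assumption here, as is the function name.

```python
import math


def standardized_co_moment(x, y, i, j):
    """Mean of (x - mean_x)^i * (y - mean_y)^j divided by sd_x^i * sd_y^j."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cx = [a - mean_x for a in x]          # centered conditioning series
    cy = [b - mean_y for b in y]          # centered target series
    sd_x = math.sqrt(sum(a * a for a in cx) / n)
    sd_y = math.sqrt(sum(b * b for b in cy) / n)
    if sd_x == 0.0 or sd_y == 0.0:
        raise ValueError("constant series cannot be standardized")
    raw = sum(a ** i * b ** j for a, b in zip(cx, cy)) / n
    return raw / (sd_x ** i * sd_y ** j)


# Assumed conventions: (1, 2) for co-skewness, (1, 3) for co-kurtosis.
x = [1.0, 2.0, 3.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0]
co_skew = standardized_co_moment(x, y, 1, 2)
co_kurt = standardized_co_moment(x, y, 1, 3)
```

With `i = j = 1` the function reduces to the sample correlation, which is why the standardized co-moments are scale-free in the same sense.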
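For Co-Expected Shortfall, we select confidence levels $\alpha \in \{5\%, 10\%\}$ and compute empirical quantiles $\hat{q}_X(\alpha)$ and $\hat{q}_Y(\alpha)$. The joint tail set $\mathcal{T}_\alpha$ collects the observations at which both $X$ and $Y$ lie in their respective $\alpha$-tails, and the conditional shortfall is the average of $Y$ over $\mathcal{T}_\alpha$, reported only when the number of observations in $\mathcal{T}_\alpha$ exceeds a minimum threshold (e.g., 25 observations).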
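The joint-tail average and its minimum-hit rule can be illustrated with a short sketch. Several details are assumptions rather than the paper's specification: the tail is taken as the upper tail of losses (threshold at the $1-\alpha$ order-statistic quantile), the rule is implemented as "at least `min_hits` observations", and the function names are hypothetical.

```python
import math


def empirical_quantile(values, p):
    """Order-statistic quantile: smallest value whose ECDF is at least p."""
    ordered = sorted(values)
    # The small tolerance guards against floating-point noise in p * n.
    k = max(1, math.ceil(p * len(ordered) - 1e-9))
    return ordered[k - 1]


def co_expected_shortfall(x, y, alpha=0.05, min_hits=25):
    """Average of y over the joint upper tail, or None if the tail is too sparse."""
    qx = empirical_quantile(x, 1 - alpha)
    qy = empirical_quantile(y, 1 - alpha)
    tail = [b for a, b in zip(x, y) if a >= qx and b >= qy]
    if len(tail) < min_hits:      # assumed reading of the minimum-hit rule
        return None               # reported as NA rather than estimated
    return sum(tail) / len(tail)
```

Returning `None` instead of a number keeps sparse windows visible as NA entries, so a downstream table cannot silently mix identified and unidentified estimates.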
Co-Value at Risk is estimated via quantile regression,
$$Q_{\alpha}(Y \mid X) = \beta_0(\alpha) + \beta_1(\alpha)\,X,$$
yielding the conditional quantile of $Y$ when $X$ lies at its own $\alpha$-quantile loss.
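For intuition, the quantile-regression step can be sketched without external libraries. The sketch below is an assumption-laden illustration, not the paper's estimator: it fits the line by exhaustive search over lines through two observations (a minimizer of the check loss always exists among these when the regressor takes at least two distinct values), which is exact but only practical for short windows. The quantile level $\tau = 1 - \alpha$ for losses and all function names are likewise assumptions.

```python
import math
from itertools import combinations


def check_loss(u, tau):
    """Koenker-Bassett check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def fit_quantile_line(x, y, tau):
    """Exact fit of Q_tau(Y | X) = b0 + b1 * X by search over two-point lines."""
    best = None
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        if x1 == x2:
            continue                      # vertical line, not a valid candidate
        b1 = (y2 - y1) / (x2 - x1)
        b0 = y1 - b1 * x1
        loss = sum(check_loss(yi - b0 - b1 * xi, tau) for xi, yi in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, b0, b1)
    if best is None:
        raise ValueError("x needs at least two distinct values")
    return best[1], best[2]


def empirical_quantile(values, p):
    """Order-statistic quantile: smallest value whose ECDF is at least p."""
    ordered = sorted(values)
    k = max(1, math.ceil(p * len(ordered) - 1e-9))
    return ordered[k - 1]


def covar(x, y, alpha=0.05):
    """Fitted conditional quantile of y evaluated at x's own stress quantile."""
    tau = 1 - alpha                       # assumed: upper tail of losses
    b0, b1 = fit_quantile_line(x, y, tau)
    return b0 + b1 * empirical_quantile(x, tau)
```

In applied work a linear-programming or interior-point quantile-regression routine from a statistics library would replace the exhaustive search; the objective being minimized is the same.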
For robustness, we verified that winsorization changes neither the relative ordering of stress severity nor the qualitative distinction between CoES and CoVaR. Block bootstrap inference (15-day blocks, 1000 replications) confirms that CoES exhibits statistically different conditional tail behavior between severe and moderate stress episodes, consistent with its joint-tail construction, whereas CoVaR produces wider and frequently overlapping confidence intervals due to its quantile-based definition.
4.2. Estimation Procedure and Statistical Properties
Our empirical estimation follows established nonparametric methods for descriptive co-moment statistics and for conditional tail-risk measures. Because co-skewness and co-kurtosis are used here purely as distributional descriptors of nonlinear dependence and asymmetry, their estimators are simply sample co-moments computed from centered losses.
For a given window, define the centered losses as deviations from the window sample means and compute the sample co-skewness and co-kurtosis as averages of the corresponding third- and fourth-order cross-products. These estimators are strongly consistent under mild moment and mixing conditions by the law of large numbers for polynomial functionals of dependent data; see Harvey and Siddique (2000) and Jurczenko and Maillet (2006). Their purpose is descriptive: their values summarize how asymmetric dependence behaves across crisis windows, and they are not interpreted as risk measures.
For tail-risk measures, we estimate empirical quantiles $\hat{q}_X(\alpha)$ and $\hat{q}_Y(\alpha)$ for $X$ and $Y$ using order statistics. Under stationarity and ergodicity, sample quantiles are strongly consistent (Newey & McFadden, 1994). Co-Expected Shortfall is then computed by averaging losses within the joint tail region defined by these quantiles. The resulting estimator is consistent whenever the joint tail probability is positive and standard regularity conditions hold; see Acerbi and Tasche (2002) and Fissler and Ziegel (2016).
Because CoES averages losses in the joint tail, its statistical behavior depends primarily on joint extreme dependence between X and Y under stress conditioning.
Co-Value at Risk is estimated through quantile regression,
$$\hat{Q}_{\alpha}(Y \mid X) = \hat{\beta}_0(\alpha) + \hat{\beta}_1(\alpha)\,X.$$
Quantile regression estimators are asymptotically normal and consistent under standard regularity conditions (Koenker, 2005; Koenker & Bassett, 1978). CoVaR, as a conditional quantile, reflects threshold behavior of $Y$ under distress in $X$, and therefore serves as a stress-testing indicator rather than a coherent risk measure.
To quantify sampling uncertainty across all measures, we compute block bootstrap confidence intervals using 15-day moving blocks with 1000 replications. This approach accounts for serial dependence and produces reliable inference for both tail-based measures and descriptive co-moments. Bootstrap replications in which the joint tail set is insufficiently populated are discarded rather than imputed, ensuring that inference is not driven by empty or ill-defined tail events.
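The resampling scheme described above can be sketched as a moving block bootstrap in which pairs $(X_t, Y_t)$ are resampled jointly. The percentile interval, the 95% level, the fixed seed, and the function names are assumptions made for this illustration; the source specifies only the block length, the number of replications, and the discard rule.

```python
import random


def moving_block_indices(n, block_len, rng):
    """Concatenate randomly placed blocks of consecutive indices, trimmed to n."""
    if block_len < 1 or block_len > n:
        raise ValueError("block_len must satisfy 1 <= block_len <= n")
    idx = []
    while len(idx) < n:
        start = rng.randrange(0, n - block_len + 1)
        idx.extend(range(start, start + block_len))
    return idx[:n]


def block_bootstrap_ci(x, y, statistic, block_len=15, reps=1000, level=0.95, seed=0):
    """Percentile interval; replications where statistic returns None are discarded."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(reps):
        idx = moving_block_indices(n, block_len, rng)
        value = statistic([x[i] for i in idx], [y[i] for i in idx])
        if value is not None:             # sparse-tail replications are dropped
            draws.append(value)
    if not draws:
        return None
    draws.sort()
    lower = draws[int((1 - level) / 2 * (len(draws) - 1))]
    upper = draws[int((1 + level) / 2 * (len(draws) - 1))]
    return lower, upper, len(draws)       # also report how many draws survived
```

Reporting the number of surviving replications alongside the interval makes it visible when an interval rests on only a fraction of the requested draws.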
Overall, this estimation framework ensures that (i) co-skewness and co-kurtosis are treated strictly as descriptive statistics summarizing higher-order dependence, and (ii) CoES and CoVaR are estimated using statistically sound methods appropriate for conditional market tail-risk analysis. We now illustrate the dynamic behavior of CoES using a rolling-window specification.
Figure 1 illustrates a 1000-day rolling CoES path at the 5% level for the S&P 500. Peaks during 2008–2009, 2020, and 2022–2023 correspond closely to periods of heightened market stress, while the prolonged tranquil period from 2013–2017 exhibits persistently low values. This rolling analysis reinforces the central empirical finding that CoES delivers a coherent, robust, and timely signal of conditional market fragility in a predictive, single-index setting.
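A rolling specification of this kind amounts to re-evaluating a window statistic on each trailing block of observations. The helper below is a generic sketch with a hypothetical name; the statistic passed in would be a CoES estimator, and the toy statistic in the usage lines is only a stand-in.

```python
def rolling_statistic(x, y, statistic, window=1000):
    """Apply statistic to every block of `window` consecutive paired observations."""
    if window < 1 or window > len(x) or len(x) != len(y):
        raise ValueError("need len(x) == len(y) and 1 <= window <= len(x)")
    return [
        statistic(x[start:start + window], y[start:start + window])
        for start in range(len(x) - window + 1)
    ]


# Toy usage: the largest current-period loss in each 3-observation window.
x = [0.1, 0.2, 0.3, 0.4, 0.5]
y = [1.0, 2.0, 3.0, 4.0, 5.0]
path = rolling_statistic(x, y, lambda a, b: max(b), window=3)
```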
4.3. Discussion of Results
The empirical evidence across U.S. stress episodes is fully consistent with the theoretical hierarchy established earlier. Coherent conditional tail-based measures provide stable and economically interpretable information about market-wide fragility when their identifying conditions are satisfied, whereas higher-order co-moment statistics function solely as descriptive indicators of distributional asymmetry and nonlinear dependence. To eliminate trivial scale effects, higher-order co-moments are analyzed in standardized form, constructed analogously to correlation coefficients and therefore scale-free, so that comparisons are invariant to the units of daily returns. Even under this normalization, co-skewness and co-kurtosis remain sign-unstable and exhibit no systematic ordering across stress regimes, confirming that their limitations are structural rather than an artifact of raw magnitudes.
Co-Expected Shortfall (CoES) emerges as the most reliable and economically meaningful measure in our empirical analysis when the relevant joint tail is sufficiently populated. By construction, CoES is non-negative and summarizes the average magnitude of losses conditional on lagged market distress. At the 5% level, CoES is well identified only in stress episodes characterized by persistent and prolonged downside pressure—most notably the Eurozone spillover and the Inflation/Ukraine bear market—attaining values of 0.0065 and 0.0069, respectively. These magnitudes reflect sustained conditional tail exposure rather than isolated extreme realizations.
Table 2.
Conditional tail-based risk measures across U.S. stress episodes. CoES is reported only when the effective joint tail exceeds a conservative minimum-hit threshold required for reliable identification. NA entries indicate insufficient identifying observations due to tail sparsity, not numerical instability or lack of market stress.
| Window | n | CoES (5%) | CoES (10%) | CoVaR (5%) | CoVaR (10%) | ΔCoVaR (5%) | ΔCoVaR (10%) |
|---|---|---|---|---|---|---|---|
| GFC (Lehman phase) | 146 | NA | 0.0096 | NA | 0.0357 | NA | |
| Eurozone spillover | 252 | 0.0065 | 0.0049 | 0.0303 | 0.0217 | 0.0063 | 0.0055 |
| China/energy bust | 145 | NA | 0.0074 | NA | 0.0251 | NA | 0.0111 |
| COVID crash (expanded) | 104 | NA | 0.0141 | NA | 0.0297 | NA | |
| Inflation/Ukraine bear | 189 | 0.0069 | 0.0069 | 0.0250 | 0.0181 | | |
In contrast, for shorter or more abrupt stress episodes—such as the Lehman phase of the Global Financial Crisis, the COVID-19 crash, and the China/energy bust—the number of tail observations at the 5% level falls below a conservative minimum-hit threshold. In these cases, CoES is intentionally reported as NA. This outcome reflects identification limits rather than numerical instability and prevents tail estimates from being driven by only a handful of extreme observations. By enforcing a minimum tail requirement, the analysis avoids spurious inference and ensures that CoES is reported only when it provides a statistically meaningful summary of conditional downside risk.
Bootstrap confidence intervals reported in Appendix A.1 confirm that, whenever CoES is identifiable in the original sample, the resulting estimates are statistically stable. Bootstrap resampling does not overcome identification limits imposed by tail sparsity in short stress windows, reinforcing the importance of explicit tail-population criteria.
To illustrate robustness under less sparse tail conditioning, we additionally report CoES at the 10% level. As expected, a broader subset of stress windows becomes identifiable at this level, including the Lehman phase of the Global Financial Crisis, the China/energy bust, and the COVID-19 crash. Importantly, while the expanded tail definition increases statistical feasibility, the qualitative ranking across stress regimes remains unchanged: prolonged stress episodes continue to exhibit higher and more persistent conditional tail losses than short-lived dislocations.
CoVaR provides complementary information in the form of conditional quantiles and is reported when estimation is feasible under the same minimum-hit discipline. At both the 5% and 10% levels, CoVaR values are elevated during periods of heightened stress, reflecting sharp threshold losses when lagged market distress is present. The corresponding ΔCoVaR is defined as the difference between the distress-state conditional quantile and a baseline-state conditional quantile, where the baseline is defined using a robust interquartile-range band for the conditioning variable.
Because this baseline band can itself be elevated during persistent stress regimes, ΔCoVaR need not be positive. For example, during the Inflation/Ukraine bear market, ΔCoVaR is negative at both the 5% and 10% levels, indicating that baseline conditional risk remains unusually high even outside the formal distress state. Similar behavior at the 10% level during the Lehman phase and the COVID-19 crash reflects elevated baseline risk rather than numerical artifacts. In shorter crisis windows, ΔCoVaR is frequently undefined due to insufficiently populated distress or baseline conditioning sets, highlighting a structural limitation of quantile-difference measures in short samples.
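The quantile-difference construction can be illustrated with a subset-based sketch. This is an assumption about implementation, not the paper's procedure: the distress state is taken as $X$ at or above its upper-tail quantile, the baseline state as $X$ inside its interquartile range, both conditional quantiles are computed from the corresponding subsets of $Y$, and the names and the minimum-hit reading are hypothetical.

```python
import math


def empirical_quantile(values, p):
    """Order-statistic quantile: smallest value whose ECDF is at least p."""
    ordered = sorted(values)
    k = max(1, math.ceil(p * len(ordered) - 1e-9))
    return ordered[k - 1]


def delta_covar(x, y, alpha=0.05, min_hits=25):
    """Distress-state minus baseline-state quantile of y, or None if either set is sparse."""
    tau = 1 - alpha                                   # assumed: upper tail of losses
    stress_cut = empirical_quantile(x, tau)
    q1 = empirical_quantile(x, 0.25)
    q3 = empirical_quantile(x, 0.75)
    distress = [b for a, b in zip(x, y) if a >= stress_cut]
    baseline = [b for a, b in zip(x, y) if q1 <= a <= q3]
    if len(distress) < min_hits or len(baseline) < min_hits:
        return None                                   # undefined, reported as NA
    return empirical_quantile(distress, tau) - empirical_quantile(baseline, tau)


# Toy data in which losses are larger in the baseline band than under distress,
# so the difference is negative.
x = [float(v) for v in range(1, 101)]
y = [100.0 - v for v in x]
negative_delta = delta_covar(x, y, alpha=0.10, min_hits=5)
```

The toy data make the sign behavior concrete: whenever the baseline band carries higher tail losses than the distress state, the difference is negative by construction.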
Higher-order co-moment statistics convey a fundamentally different type of information. Standardized co-skewness and co-kurtosis summarize nonlinear dependence and distributional shape in the joint loss distribution but do not scale systematically with crisis severity and do not satisfy coherence axioms, as documented in Table 3. They therefore serve as descriptive diagnostics of nonlinear dependence rather than admissible measures of conditional tail risk.
Taken together, the empirical results align closely with the theoretical framework developed in this paper. CoES provides a coherent, stable, and economically interpretable benchmark for conditional market tail risk when its identifying conditions are satisfied. CoVaR supplies auxiliary stress-testing information but lacks coherence and robustness in short samples, while ΔCoVaR is particularly sensitive to baseline specification and tail sparsity. Higher-order co-moments remain informative descriptive tools but are unsuitable for monitoring, ranking, or regulating market-wide fragility.
5. Conclusions
This paper compared higher-order co-moment functionals (co-skewness and co-kurtosis) with conditional tail-risk measures, namely Co-Expected Shortfall (CoES) and Co-Value at Risk (CoVaR), through the lens of coherence. The theoretical analysis establishes that higher-order co-moments, while informative about asymmetry and nonlinear dependence, fail to satisfy the coherence axioms of Artzner et al. (1999). In particular, co-skewness violates subadditivity and co-kurtosis violates monotonicity, confirming that these objects are descriptive statistical functionals rather than valid risk measures in a normative or regulatory sense. By contrast, CoES inherits the coherence of Expected Shortfall and extends it to a conditional joint-tail framework, making it a theoretically sound benchmark for conditional tail-risk assessment. CoVaR, although intuitive and widely used in policy and stress-testing contexts, remains non-coherent due to its quantile-based construction and failure of subadditivity.
The empirical analysis, based on S&P 500 losses across major U.S. market stress episodes from 2008 to 2023, reinforces this theoretical hierarchy in a predictive, single-index setting. When the joint tail is sufficiently populated—most notably during prolonged stress regimes such as the Eurozone spillover and the Inflation/Ukraine bear market—CoES delivers stable and economically interpretable estimates of conditional tail risk. In shorter or more abrupt crisis episodes, the number of joint tail realizations at conventional confidence levels is insufficient once conservative identification thresholds are imposed, and CoES is intentionally not reported, reflecting identification limits rather than numerical instability. To assess robustness under less sparse tail conditioning, the analysis also considers a higher tail probability, which confirms that the qualitative ordering of stress regimes is preserved even when statistical feasibility improves.
CoVaR provides complementary information in the form of conditional quantiles and is estimable across all stress windows at the 10% level. Its values increase during periods of heightened market distress, indicating sharp threshold losses when lagged stress is present. However, CoVaR exhibits greater sensitivity to sample size and window length, and the corresponding ΔCoVaR is frequently undefined due to an insufficiently populated baseline conditioning set. These features are not implementation artifacts but reflect fundamental limitations of quantile-based measures in short samples. Consistent with theory, CoVaR is therefore best interpreted as a stress-testing indicator rather than a coherent measure of conditional tail risk.
Higher-order co-moment statistics exhibit extremely small magnitudes and pronounced sign instability across stress episodes. Their empirical behavior confirms their role as descriptive diagnostics of nonlinear dependence and distributional asymmetry rather than as measures of tail risk. Importantly, these statistics do not scale systematically with crisis severity and provide no normative guidance for risk aggregation or regulation.
Taken together, the theoretical and empirical evidence supports a clear hierarchy among the examined functionals. Higher-order co-moments serve exclusively as descriptive tools, CoVaR functions as an auxiliary stress-testing metric with limited theoretical robustness, and CoES emerges as the coherent benchmark for conditional tail-risk measurement when its identifying conditions are met. These distinctions reflect fundamental structural properties of the measures rather than sampling variability or modeling choices.
It is important to emphasize that the empirical design adopted in this paper captures temporal conditional tail dependence in aggregate market losses rather than cross-sectional contagion across institutions. Accordingly, the results should be interpreted as evidence on market-wide tail dynamics in a forecasting-oriented framework, not as a full cross-sectional measure of systemic risk. Extending the analysis to genuinely multivariate or network-based settings constitutes a natural and important direction for future research.
A further avenue for future work lies in the development of hybrid approaches that incorporate higher-order distributional information into coherent tail-risk measures. For example, CoES could be augmented with asymmetry-sensitive or moment-adjusted components that preserve coherence while enriching tail-shape diagnostics. Such extensions may be particularly valuable in stress-testing environments, where practitioners seek both robust tail-loss measures and insight into the structure of extreme risks.
Overall, the contribution of this paper is deliberately clarificatory. By organizing widely used dependence measures within a unified coherence-based framework and demonstrating how their empirical behavior aligns with their theoretical properties, the paper helps distinguish descriptive statistics from coherent tail-risk measures and clarifies their appropriate roles in empirical finance, risk management, and stress-testing applications.