1. Introduction
Understanding and accurately estimating the covariance matrix is a fundamental task that connects modern statistics with a wide range of applied disciplines. In quantitative finance, it forms the analytical foundation for asset allocation and portfolio design, providing theoretical support for the mean–variance framework proposed by Markowitz (1952) [1] and its subsequent developments [2,3]. This structure likewise underpins financial risk evaluation, as studies such as [4,5] highlight the impact of noise and stability in empirical correlation matrices. Beyond finance, covariance modeling is crucial in genomics and bioinformatics [6], where it helps uncover dependencies among high-dimensional gene-expression variables. In engineering fields such as signal reconstruction and image completion, covariance-based low-rank formulations [7] offer powerful theoretical and algorithmic frameworks for recovering structured information from complex data.
In empirical research, the sample covariance matrix is often used as a straightforward measure of cross-sectional dependence, yet its performance deteriorates rapidly in high-dimensional contexts. When the dimensionality $p$ approaches or exceeds the sample size $n$, the estimator becomes numerically unstable, amplifies estimation noise, and eventually loses invertibility, which undermines its use in portfolio construction and risk evaluation. This limitation has stimulated a wide spectrum of methodological developments. A prominent line of inquiry builds on latent factor representations to capture dominant co-movements and achieve substantial dimensionality reduction [3,8]. Another direction focuses on imposing regularization or shrinkage so as to stabilize eigenvalue dispersion and improve estimation robustness [9,10]. Parallel advances further exploit structured assumptions such as sparsity or banding in the covariance or precision domain, producing estimators with strong theoretical guarantees and competitive empirical accuracy [11,12]. Despite their methodological differences, these approaches typically require either strong low-dimensional assumptions or large effective sample sizes, and often provide limited control over the global stability and interpretability of covariance estimates in large-scale financial applications.
Beyond these “Euclidean” regularization strategies, an important body of work approaches covariance matrices as elements on the manifold of symmetric positive definite (SPD) matrices, advocating geometry-aware metrics and computations. In particular, the affine-invariant Riemannian metric provides an intrinsic notion of distance and averaging on SPD matrices [13], while the log-Euclidean framework offers a computationally convenient alternative that preserves key geometric properties by performing calculations in the matrix-log domain [14]. This geometric perspective has been widely used to define meaningful distances, interpolation, and averaging of covariance-type objects, and it highlights the substantive modeling implications of working in the log domain [15]. However, much of this literature is primarily developed for geometric analysis and signal/image processing tasks, and it is less directly connected to scalable covariance parameterizations with economic interpretability for large-universe portfolio optimization. A further related strand concerns dynamic covariance models that explicitly capture time variation in conditional covariances and correlations, such as multivariate GARCH formulations (e.g., CCC and BEKK) and dynamic conditional correlation models [16,17,18]. While these models are economically meaningful for volatility and correlation dynamics, their direct application to large asset universes is often limited by parameter proliferation and computational burden, which typically necessitate strong restrictions or low-dimensional structure assumptions in practice.
While numerous advances have been made in covariance matrix estimation, many existing methods still rely, either explicitly or implicitly, on the assumption that the effective sample size expands in tandem with dimensionality, or impose strong structural restrictions and tuning choices that limit scalability, numerical stability, and economic interpretability in large-scale financial applications. In practice, this assumption often breaks down, particularly in fast-developing equity markets where the universe of tradable securities grows much faster than the available return history. The U.S. stock market provides a clear example: the cross-section of listed firms has expanded rapidly, yet time-series observations remain short, resulting in a severely high-dimensional setting that magnifies estimation noise. To enhance stability under such conditions, recent work has advocated the use of structured covariance specifications that embed economically meaningful constraints to achieve both parsimony and robustness. Among these, blockwise formulations stand out for their ability to reconcile flexibility with efficiency: by clustering $p$ assets into $K$ internally homogeneous groups, the dimensionality of the covariance parameter space contracts from the order of $p^2$ to roughly $K^2$, yielding substantial gains in interpretability and numerical tractability.
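For concreteness, consider the parameter counts implied by this contraction under the homogeneous blockwise parameterization introduced in Section 2; the counts below are illustrative, assuming one variance and one within-block correlation per block plus one correlation per block pair:
\[
\underbrace{\tfrac{p(p+1)}{2}}_{\text{unrestricted}} = 125{,}250
\qquad \text{vs.} \qquad
\underbrace{2K + \tbinom{K}{2}}_{\text{blockwise}} = 230
\qquad (p = 500,\; K = 20).
\]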
Although blockwise covariance modeling has enhanced parameter efficiency, it still suffers from several important deficiencies. In practice, the total number of parameters continues to grow quadratically with the number of blocks $K$, which risks overfitting and compromises numerical stability when $K$ becomes large. Furthermore, empirical block estimates may lose positive definiteness, creating difficulties for matrix inversion and other operations that depend on stable covariance structures. In addition, the framework of [19], which conceptually inspires this study, leaves several key aspects unresolved: it offers no clear guidance on how to cluster assets into economically meaningful groups, provides no analytical criterion for selecting the appropriate number of clusters $K$, and relies exclusively on historical return data without incorporating other potentially informative signals such as firm fundamentals, industry classifications, or latent network relations. These limitations motivate the development of a more general and data-enriched positive-definite block modeling approach capable of addressing both statistical and economic considerations in high-dimensional settings.
To resolve the aforementioned limitations, we construct a Blockwise Exponential Covariance Model (BECM) that leverages the structural properties of block matrices. A key observation, formalized in Corollary 1 of [19], is that both the matrix logarithm and the matrix exponential preserve the original block partitioning, allowing each block to be modeled independently while maintaining the global block organization. The exponential mapping further ensures that the resulting covariance matrix is strictly positive definite for any real-valued parameterization, providing numerical stability for inversion-based applications. Building on these properties, we model the logarithm of the block covariance matrix, where intra- and inter-block dependencies are parameterized through interpretable coefficients and a kernel alignment similarity of block-level factor loadings, which captures their representational consistency in the feature space. The covariance matrix is then recovered through the matrix exponential transformation. Conceptually, this framework extends the theoretical foundation of [19] by offering a more general formulation that accommodates additional information sources, thereby enhancing both the statistical and economic interpretability of blockwise covariance estimation in high-dimensional settings.
This study makes several contributions to the literature on high-dimensional covariance estimation and portfolio optimization. First, we propose a blockwise log-covariance estimation framework, referred to as the Blockwise Exponential Covariance Model (BECM), which parameterizes dependence structures at the log-covariance level and recovers the covariance matrix via the matrix exponential. This formulation ensures strict positive definiteness by construction, substantially reduces the number of free parameters through blockwise structure, and provides a flexible architecture for incorporating additional information into covariance estimation. Second, using the estimation procedure of [19] as a benchmark, we conduct a systematic empirical assessment of how different clustering algorithms and various numbers of clusters influence the out-of-sample performance of minimum-variance portfolios. Finally, the results demonstrate that the proposed BECM-based minimum-variance strategy consistently outperforms the estimator of [19] across multiple evaluation metrics, highlighting its superior stability and risk-adjusted returns in high-dimensional applications. Overall, the proposed BECM addresses a central economic challenge in high-dimensional financial markets: how to obtain stable and interpretable risk estimates that remain robust to structural choices and can incorporate economic information, thereby improving the practical reliability of portfolio optimization and risk management.
The paper is organized into three main parts.
Section 2 introduces the methodological framework and explains the key components of our approach.
Section 3 presents the empirical design, data description, and comparative analysis of results.
Section 4 closes the study with concluding remarks and a discussion of its broader implications.
2. Model Setup and Estimation
We propose a generalized framework named the Blockwise Exponential Covariance Model (BECM), designed to capture nonlinear and covariate-dependent linkages among variable groups. To motivate the formulation, we first recall the idea of a block-partitioned covariance representation that serves as the conceptual foundation for our approach.
Let $X_1, \dots, X_n$ be a collection of $p$-dimensional random vectors independently drawn from a multivariate normal distribution with mean vector $\mathbf{0}_p$ and covariance matrix $\Sigma$, i.e., $X_i \sim \mathcal{N}_p(\mathbf{0}_p, \Sigma)$ for $i = 1, \dots, n$. Here $\mathbf{0}_p$ represents a $p \times 1$ vector of zeros. We assume that the coordinates of $X_i$ display a groupwise correlation pattern, divided into $K$ distinct blocks.
We denote the complete index set by $G = \{1, 2, \dots, p\}$ and partition it into $K$ mutually exclusive and collectively exhaustive subsets $G_1, \dots, G_K$, each representing one block of variables: every element of $G$ belongs to exactly one subset, and no two subsets overlap. Let $p_k$ denote the number of variables in the $k$-th subset. The total number of variables across all blocks therefore satisfies $\sum_{k=1}^{K} p_k = p$.
Based on this grouping, each observation vector $X_i$ can be rearranged as
\[
X_i = \bigl(X_{i,G_1}^{\top}, X_{i,G_2}^{\top}, \dots, X_{i,G_K}^{\top}\bigr)^{\top}.
\]
Under this block representation, the population covariance matrix can be expressed as
\[
\Sigma = \begin{pmatrix}
\Sigma_{11} & \Sigma_{12} & \cdots & \Sigma_{1K} \\
\Sigma_{21} & \Sigma_{22} & \cdots & \Sigma_{2K} \\
\vdots & \vdots & \ddots & \vdots \\
\Sigma_{K1} & \Sigma_{K2} & \cdots & \Sigma_{KK}
\end{pmatrix}.
\]
The $k$-th diagonal block has the intra-group structure
\[
\Sigma_{kk} = \sigma_k^2 \bigl[(1-\rho_k)\, I_{p_k} + \rho_k\, J_{p_k}\bigr],
\]
while the off-diagonal block between groups $k$ and $\ell$ takes the homogeneous-correlation form
\[
\Sigma_{k\ell} = \rho_{k\ell}\, \sigma_k \sigma_\ell\, J_{p_k \times p_\ell}, \qquad k \neq \ell.
\]
Here $I_{p_k}$ and $J_{p_k}$ (resp. $J_{p_k \times p_\ell}$) denote the identity and all-ones matrices of the indicated size.
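As a concrete illustration, a minimal NumPy sketch assembling the homogeneous block structure just defined; the function name and argument layout are illustrative, not part of the original specification:

```python
import numpy as np

def block_covariance(p_sizes, sigma, rho_within, rho_between):
    """Assemble the homogeneous blockwise covariance matrix.

    p_sizes     : list of block sizes p_k
    sigma       : per-block standard deviations sigma_k
    rho_within  : per-block correlations rho_k
    rho_between : K x K matrix of between-block correlations rho_kl
    """
    K = len(p_sizes)
    edges = np.concatenate(([0], np.cumsum(p_sizes)))
    Sigma = np.empty((edges[-1], edges[-1]))
    for k in range(K):
        rk = slice(edges[k], edges[k + 1])
        for l in range(K):
            rl = slice(edges[l], edges[l + 1])
            if k == l:
                # sigma_k^2 [(1 - rho_k) I + rho_k J]
                Sigma[rk, rl] = sigma[k] ** 2 * (
                    (1 - rho_within[k]) * np.eye(p_sizes[k])
                    + rho_within[k] * np.ones((p_sizes[k], p_sizes[k]))
                )
            else:
                # rho_kl sigma_k sigma_l J
                Sigma[rk, rl] = (rho_between[k, l] * sigma[k] * sigma[l]
                                 * np.ones((p_sizes[k], p_sizes[l])))
    return Sigma
```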
This formulation characterizes the traditional blockwise dependence pattern, where both within- and between-group correlations are parameterized in a homogeneous manner. In the following subsection, we extend this framework to the Blockwise Exponential Covariance Model (BECM).
In the BECM framework, we directly specify the covariance structure in the log-domain. Specifically, the logarithm of the covariance matrix, denoted by $L = \log \Sigma$, is assumed to exhibit a blockwise form
\[
L_{kk} = \alpha_k\, I_{p_k} + \beta_k\, J_{p_k}, \qquad
L_{k\ell} = \gamma\, w_{k\ell}\, J_{p_k \times p_\ell} \quad (k \neq \ell),
\]
where $\boldsymbol{\alpha} = (\alpha_1, \dots, \alpha_K)^{\top}$ and $\boldsymbol{\beta} = (\beta_1, \dots, \beta_K)^{\top}$ are the parameter vectors, $\gamma$ is an inter-block loading, and $w_{k\ell}$ is the information-guided similarity weight constructed below. In addition, let each variable $j \in G$ be associated with a $d$-dimensional covariate vector $z_j \in \mathbb{R}^{d}$. For the $k$-th block and the $m$-th covariate dimension, we define the group-level mean as
\[
\bar{z}_{k,m} = \frac{1}{p_k} \sum_{j \in G_k} z_{j,m}, \qquad \bar{z}_k = (\bar{z}_{k,1}, \dots, \bar{z}_{k,d})^{\top}.
\]
This quantity summarizes the average covariate feature within each block.
Although the model involves multiple components, identifiability is ensured by structural separation. The blockwise log-covariance specification decomposes dependence into distinct within-block, between-block, and information-guided components, each operating at a different aggregation level. Parameters are defined at the block level rather than at the asset level, which substantially reduces dimensionality and prevents overlap among parameter roles. As a result, the mapping from parameters to the log-covariance matrix is injective under a fixed block partition, ensuring structural identifiability without requiring additional normalization constraints.
To capture the nonlinear similarity between groups, we construct the blockwise weight
\[
w_{k\ell} = \frac{\langle \phi(\bar{z}_k), \phi(\bar{z}_\ell) \rangle}{\lVert \phi(\bar{z}_k) \rVert \, \lVert \phi(\bar{z}_\ell) \rVert},
\]
where $\phi(\cdot)$ denotes the standardized feature mapping of the group-level covariates. The resulting $w_{k\ell}$ represents the cosine similarity between the standardized feature mappings of the group-level covariates and thus serves as a data-driven measure of inter-block association.
The cosine similarity is adopted for its scale invariance and stability properties when applied to standardized block-level factor representations. Since the similarity measure depends on directional alignment rather than magnitude, it is well suited for guiding relative inter-block associations in the log-covariance space. Importantly, the proposed framework does not rely on cosine similarity per se; alternative similarity or kernel functions could be incorporated without altering the underlying log-covariance parameterization.
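To make the construction concrete, a compact sketch of the weight computation, assuming a linear (identity) feature mapping applied to cross-block standardized group means; the helper name and standardization convention are illustrative:

```python
import numpy as np

def block_similarity(Z, labels):
    """Cosine-similarity weights w_kl between block-level covariate means.

    Z      : (p, d) array of per-variable covariates (e.g., factor loadings)
    labels : length-p integer array of block assignments in {0, ..., K-1}
    """
    K = labels.max() + 1
    # group-level means z_bar_k, stacked into a (K, d) matrix
    Zbar = np.vstack([Z[labels == k].mean(axis=0) for k in range(K)])
    # standardize each covariate dimension across blocks
    Zbar = (Zbar - Zbar.mean(axis=0)) / Zbar.std(axis=0)
    # cosine similarity of the standardized block representations
    U = Zbar / np.linalg.norm(Zbar, axis=1, keepdims=True)
    return U @ U.T
```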
From a numerical perspective, the log-covariance parameterization further inherits well-established stability properties of symmetric positive definite (SPD) matrix geometry. It is well known that operating in the matrix-log domain effectively regularizes spectral behavior by mapping eigenvalues to an additive scale, thereby reducing sensitivity to extreme or near-zero eigenvalues [14,15]. The subsequent exponential mapping preserves eigenvalue ordering and guarantees positive definiteness by construction, avoiding numerical instabilities associated with direct covariance estimation. These properties have motivated the widespread use of log-Euclidean representations in SPD-based modeling and signal processing [12], and they provide theoretical support for the numerical stability of the proposed framework without requiring additional constraints.
Following [19], the proposed specification leads to a conveniently separable likelihood function, allowing the parameters to be estimated in a straightforward manner. Provided that the initial parameter values yield a positive definite covariance matrix, the optimization routine quickly converges to a blockwise covariance estimator that remains strictly positive definite. After obtaining the estimated log-covariance matrix $\hat{L}$, the final blockwise covariance estimator is obtained through the matrix exponential mapping $\hat{\Sigma} = \exp(\hat{L})$, ensuring positive definiteness by construction. The next section presents an empirical analysis that revisits several open issues in [19] and evaluates the out-of-sample performance of our proposed approach.
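Under the parameterization assumed above, the mapping from block-level parameters to a strictly positive definite estimate can be sketched as follows; parameter names mirror the earlier equations, and the likelihood optimization itself is omitted:

```python
import numpy as np
from scipy.linalg import expm

def becm_covariance(p_sizes, alpha, beta, gamma, W):
    """Map blockwise log-covariance parameters to a covariance estimate.

    p_sizes     : block sizes p_k
    alpha, beta : per-block diagonal-block coefficients
    gamma       : inter-block loading
    W           : K x K similarity weights w_kl
    """
    K = len(p_sizes)
    edges = np.concatenate(([0], np.cumsum(p_sizes)))
    L = np.zeros((edges[-1], edges[-1]))
    for k in range(K):
        rk = slice(edges[k], edges[k + 1])
        for l in range(K):
            rl = slice(edges[l], edges[l + 1])
            if k == l:
                L[rk, rl] = (alpha[k] * np.eye(p_sizes[k])
                             + beta[k] * np.ones((p_sizes[k], p_sizes[k])))
            else:
                L[rk, rl] = gamma * W[k, l] * np.ones((p_sizes[k], p_sizes[l]))
    # the exponential of a real symmetric matrix is symmetric positive definite
    return expm(L)
```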
Model Scope and Assumptions. The proposed BECM is intended for high-dimensional settings where assets exhibit meaningful block-level dependence structures. Its performance relies on the existence of relatively stable groupwise correlations and economically interpretable clustering. Parameters are defined at the block level rather than the asset level, which enhances numerical stability but may limit flexibility when block assignments are highly unstable. Accordingly, the BECM should be viewed as a structured regularization approach that emphasizes stability and interpretability rather than a fully unrestricted covariance estimator.
3. Real Data Analysis
This section empirically examines the practical implementation issues of blockwise covariance modeling in portfolio optimization. We first analyze the Canonical Block Representation Model (CBRM) to evaluate how clustering algorithms and partition granularity influence the out-of-sample performance of minimum-variance portfolios, and then implement the proposed Blockwise Exponential Covariance Model (BECM) under the same rolling-window framework.
Consistent with this objective, the empirical study uses monthly excess returns of 500 randomly selected U.S. stocks from July 2013 to June 2023. In each 120-month rolling window, covariance matrices are estimated, portfolios are rebalanced, and performance is evaluated using the Sharpe ratio (SR), volatility (VOL), and turnover (TO). This setup enables a comprehensive comparison between models and offers practical insights into the design of blockwise covariance estimators for high-dimensional portfolio construction.
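One standard way to compute the three evaluation metrics is sketched below; the turnover definition (average sum of absolute weight changes across rebalancing dates) is a common convention and may differ in detail from the paper's exact formula:

```python
import numpy as np

def performance_metrics(returns, weights):
    """Monthly out-of-sample metrics for a rebalanced portfolio.

    returns : (T, p) realized excess returns over the evaluation months
    weights : (T, p) portfolio weights held during each month
    """
    port = (weights * returns).sum(axis=1)   # monthly portfolio excess returns
    vol = port.std(ddof=1)                   # monthly volatility (VOL)
    sr = port.mean() / vol                   # Sharpe ratio (SR), not annualized
    to = np.abs(np.diff(weights, axis=0)).sum(axis=1).mean()  # turnover (TO)
    return sr, vol, to
```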
Cluster selection is conducted using an inner validation window that is fully separated from the evaluation period, while portfolio performance is evaluated strictly out of sample.
3.1. Clustering Method and Block Number in the CBRM Framework
We conduct a real-data experiment to evaluate how different clustering methods and varying numbers of clusters influence the out-of-sample performance of CBRM-based minimum-variance portfolios. Specifically, we estimate block covariance matrices using four representative clustering algorithms, namely spectral, hierarchical, K-means, and Gaussian mixture models (GMM), across cluster counts $K$ ranging from 5 to 30. For each configuration, a minimum-variance portfolio is constructed and evaluated using the Sharpe ratio, volatility, and turnover. This experimental setup enables a systematic analysis of how clustering choices and structural granularity affect portfolio outcomes, as summarized in Table 1.
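The experimental grid can be sketched as follows; each configuration feeds a block covariance estimate into the closed-form minimum-variance weights (clustering inputs and affinity choices are illustrative, and the paper's exact settings may differ):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans, SpectralClustering
from sklearn.mixture import GaussianMixture

def min_variance_weights(Sigma):
    """Closed-form minimum-variance weights w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)."""
    w = np.linalg.solve(Sigma, np.ones(Sigma.shape[0]))
    return w / w.sum()

def cluster_labels(method, R, K):
    """Assign p assets to K blocks from a (T, p) window of returns R."""
    if method == "spectral":
        affinity = (np.corrcoef(R.T) + 1) / 2   # shift correlations to [0, 1]
        return SpectralClustering(n_clusters=K,
                                  affinity="precomputed").fit_predict(affinity)
    if method == "hierarchical":
        return AgglomerativeClustering(n_clusters=K).fit_predict(R.T)
    if method == "kmeans":
        return KMeans(n_clusters=K, n_init=10).fit_predict(R.T)
    return GaussianMixture(n_components=K).fit(R.T).predict(R.T)  # GMM
```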
To address the first empirical question regarding which clustering method performs better in constructing the Canonical Block Representation model, the results are summarized as follows. Across all clustering methods, the volatility and turnover of the resulting portfolios are generally comparable. At each given value of $K$, these two measures remain close in magnitude across methods, and both tend to increase monotonically as $K$ grows. This pattern indicates that finer block segmentation systematically raises estimation variance and leads to more frequent portfolio rebalancing. In contrast, the Sharpe ratio displays more variation across clustering algorithms and cluster counts. K-means and GMM achieve their highest performance at smaller values of $K$, suggesting that these methods perform better under coarser grouping structures. Hierarchical clustering performs best at a moderate level of $K$, whereas spectral clustering attains its peak Sharpe ratio at a larger $K$. Despite these differences in the location of the optimal $K$, the highest Sharpe ratios of all four clustering approaches are relatively close, generally between 0.18 and 0.20, and all exceed the benchmark 1/N portfolio, which yields a Sharpe ratio of 0.156. Overall, these results indicate that the four clustering methods deliver broadly comparable improvements in portfolio efficiency within the CBRM-based minimum-variance portfolios, each exhibiting distinct advantages under different segmentation levels.
To address the second empirical question concerning how the number of clusters affects portfolio performance, we first discuss the theoretical expectation behind the bias–variance mechanism. The theoretical expectation is straightforward. When the number of clusters $K$ is small, the Canonical Block Representation model is highly simplified. By construction, both intra-cluster and inter-cluster covariances may be ignored or excessively smoothed, which introduces substantial structural bias while keeping estimation variance low. As $K$ increases, the block partition becomes more granular and the bias decreases because the model captures a larger portion of the true covariance structure. However, relative to the available number of observations, the number of parameters to be estimated rises rapidly with $K$, making the overall model more sensitive to noise and increasing the estimation variance. Consequently, portfolio performance should not vary monotonically with $K$. It is expected to improve in the beginning when reduced bias dominates, but to deteriorate later when estimation variance becomes excessive. The optimal cluster number lies at the balance point between these two opposing forces, where the total mean squared error of the covariance estimate is minimized.
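In symbols, the familiar decomposition behind this argument, stated generically for a covariance estimator $\hat{\Sigma}_K$ built from a $K$-block working model:
\[
\mathbb{E}\,\bigl\lVert \hat{\Sigma}_K - \Sigma \bigr\rVert_F^2
= \underbrace{\bigl\lVert \mathbb{E}\,\hat{\Sigma}_K - \Sigma \bigr\rVert_F^2}_{\text{structural bias}}
\;+\; \underbrace{\mathbb{E}\,\bigl\lVert \hat{\Sigma}_K - \mathbb{E}\,\hat{\Sigma}_K \bigr\rVert_F^2}_{\text{estimation variance}},
\]
where, under the reasoning above, the first term shrinks and the second grows as $K$ increases.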
We next verify this theoretical expectation using the empirical results. Portfolio performance exhibits a clear non-monotonic relationship with the number of clusters $K$. As $K$ increases from 5 to 30, the out-of-sample Sharpe ratios under all four clustering methods first rise and then decline, forming a typical single-peaked pattern. For example, under spectral clustering, the Sharpe ratio increases steadily from 0.146 at $K = 5$ to a peak of 0.198, the highest among all methods, and then drops to 0.156 at $K = 30$. A similar pattern is observed for the other methods: hierarchical clustering reaches its peak at a moderate $K$ (SR = 0.184) and declines sharply thereafter; K-means performs relatively well at small $K$ (SR ≈ 0.18) but deteriorates significantly at higher $K$; and GMM attains a local maximum with SR = 0.198. Meanwhile, both volatility and turnover increase with $K$, reflecting the expansion of estimation variance and more frequent portfolio rebalancing. Overall, the results provide clear empirical evidence for the theoretical bias–variance trade-off: moderate clustering helps reduce structural bias, whereas excessive segmentation amplifies estimation variance and ultimately deteriorates portfolio performance.
In summary, the empirical evidence suggests that the choice of clustering algorithm within the Canonical Block Representation model (CBRM) leads to distinct yet broadly comparable portfolio outcomes. No single method consistently outperforms the others, indicating that the clustering technique itself is not the dominant factor in determining the effectiveness of the CBRM-based minimum-variance portfolio. In contrast, the number of clusters $K$ exerts a much more pronounced influence, with portfolio performance exhibiting the expected bias–variance trade-off as $K$ increases. These findings highlight that the practical challenge lies not in selecting the clustering approach but in determining the appropriate level of structural granularity. Motivated by this observation, the next section focuses on addressing the optimal selection of $K$ and evaluating the proposed Blockwise Exponential Covariance Model (BECM), which aims to endogenize this choice and further enhance out-of-sample efficiency.
3.2. Empirical Performance of the Blockwise Exponential Covariance Model
In implementing the Blockwise Exponential Covariance Model (BECM), we begin by estimating firm-level factor loadings based on the Fama–French three-factor model [20]. Specifically, for each stock, monthly excess returns are regressed on the three systematic risk factors, market (MKT), size (SMB), and value (HML), to obtain its individual factor exposures. The estimated beta coefficients capture each stock’s sensitivity to common sources of risk. According to the clustering assignments obtained from the previous analysis, we then compute the average factor-loading vector for each cluster by taking the mean of the betas of all member stocks within that cluster. The distance between the mean loading vectors of different clusters is used to measure inter-cluster similarity, providing a more accurate representation of the differences in their common risk exposures. This similarity measure serves as the basis for quantifying inter-cluster covariances within the BECM framework.
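This step amounts to a stock-by-stock time-series regression followed by within-cluster averaging; a sketch is given below (array layouts and helper names are illustrative), whose output feeds the block-level similarity weights of Section 2:

```python
import numpy as np

def factor_loadings(excess_returns, factors):
    """OLS betas of each stock on the three factors (MKT, SMB, HML).

    excess_returns : (T, p) monthly excess returns
    factors        : (T, 3) monthly factor realizations
    """
    X = np.column_stack([np.ones(len(factors)), factors])   # add intercept
    coefs, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)
    return coefs[1:].T                                       # (p, 3) beta matrix

def cluster_mean_loadings(betas, labels):
    """Average factor-loading vector within each cluster."""
    K = labels.max() + 1
    return np.vstack([betas[labels == k].mean(axis=0) for k in range(K)])
```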
Given that the previous section demonstrated that the choice of clustering algorithm has limited influence on the performance of minimum-variance portfolios constructed from block covariance matrices, we adopt spectral clustering as the default method in implementing the BECM. The number of clusters, $K$, is determined automatically through a cross-validation procedure. Specifically, in each rolling window, we estimate BECM covariance matrices under a range of candidate values of $K$ and construct the corresponding minimum-variance portfolios. We then evaluate the out-of-sample Sharpe ratios of these portfolios and select the $K$ that achieves the highest Sharpe ratio as the optimal number of clusters for that period. This performance-driven approach ensures a more balanced trade-off between structural flexibility and estimation stability in high-dimensional settings.
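A sketch of this selection step, reusing min_variance_weights from the snippet in Section 3.1; the estimate_sigma callable stands in for the BECM estimation routine and is an assumed interface:

```python
import numpy as np

def select_k(R_train, R_valid, candidate_ks, estimate_sigma):
    """Choose K by the Sharpe ratio achieved on an inner validation window.

    R_train, R_valid : (T1, p) and (T2, p) excess-return panels
    estimate_sigma   : callable (returns, K) -> covariance estimate
    """
    best_k, best_sr = None, -np.inf
    for K in candidate_ks:
        Sigma = estimate_sigma(R_train, K)       # BECM estimate under K blocks
        port = R_valid @ min_variance_weights(Sigma)
        sr = port.mean() / port.std(ddof=1)
        if sr > best_sr:
            best_k, best_sr = K, sr
    return best_k
```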
To benchmark the empirical performance of the BECM, we consider two reference strategies. The first is the naive 1/N portfolio, which is widely regarded as a robust diversification benchmark and does not rely on covariance estimation. The second is a minimum-variance portfolio based on the Canonical Block Representation Model (CBRM), constructed using spectral clustering with a fixed number of clusters. By construction, the CBRM mitigates instability in high-dimensional covariance estimation through blockwise structure but does not allow for adaptive adjustment in the number of clusters. Together, these benchmarks provide a transparent reference framework for evaluating the relative performance of the proposed BECM.
Table 2 reports the out-of-sample performance of the minimum-variance portfolios constructed using the BECM, the CBRM, and the naive 1/N benchmark. The Sharpe ratios are computed using monthly portfolio excess returns based on the standard definition, without annualization. All strategies employ the same rolling monthly covariance estimation and monthly rebalancing scheme, so the reported Sharpe ratios are intended for relative comparison under a common experimental design rather than for direct interpretation as annualized performance. Several observations emerge from the results. First, the BECM exhibits a markedly higher out-of-sample Sharpe ratio (SR = 0.423) than both the CBRM (0.198) and the 1/N portfolio (0.158). This improvement is observed in strictly out-of-sample evaluations and is particularly pronounced in a high-dimensional environment where estimation noise and covariance instability are non-negligible. Unlike approaches with fixed block segmentation, the BECM allows the number of clusters to vary over time through cross-validation, enabling the model to adapt to changes in cross-sectional dependence patterns. In addition, incorporating factor-loading information to characterize differences in cluster-level risk exposures embeds richer cross-sectional structure into the covariance estimation. From an analytical perspective, this mechanism reduces structural bias by aligning inter-block dependence with economically meaningful risk similarities, while the log-covariance formulation mitigates estimation variance through spectral regularization.
Moreover, the BECM maintains strict positive definiteness of the covariance matrix even when the number of clusters is large, ensuring numerical stability and feasibility of portfolio optimization in high-dimensional environments. Owing to its matrix-exponential structure and the effective reduction in the number of parameters to be estimated induced by the blockwise approximation, the BECM avoids singularity or ill-conditioning under increasingly fine-grained partitions. This property helps explain why the BECM continues to perform well as $K$ increases and suggests that the model attains a favorable balance between bias and variance in high-dimensional settings.
Although the BECM delivers higher risk-adjusted returns, these gains are accompanied by higher volatility and turnover. The volatility of the BECM portfolio (VOL = 0.198) exceeds that of the CBRM (0.050) and the 1/N portfolio (0.058), reflecting greater responsiveness to changes in the estimated covariance structure. Similarly, the turnover of the BECM (TO = 0.636) is higher than that of the CBRM (0.535) and the 1/N portfolio (0.099). This behavior is consistent with the model’s adaptive design: time-varying clustering induces more frequent portfolio rebalancing as market dependence patterns evolve. While higher turnover implies increased trading intensity and potential transaction costs, it also indicates an enhanced ability to capture shifts in market co-movements and respond to changing risk interdependencies.
In this sense, the empirical performance of the BECM can be interpreted as an extension of the bias–variance trade-off discussed in Section 3.1, where adaptive clustering and log-domain regularization jointly prevent excessive variance inflation under fine-grained partitions.
3.3. Economic Interpretation and Practical Implications
From a financial perspective, the higher volatility and turnover observed for BECM-based portfolios can be interpreted as a natural consequence of their responsiveness to changes in the estimated dependence structure. Unlike approaches that rely on fixed clustering schemes or heavily smoothed covariance estimates, BECM allows cross-asset correlations and block relationships to adjust over time. As a result, shifts in the underlying risk structure are reflected more promptly in portfolio weights, which manifests empirically as more frequent rebalancing and higher short-term return volatility.
Importantly, this pattern reflects a well-known trade-off in portfolio management rather than an anomaly. Methods that impose stronger smoothing or static structure tend to produce more stable portfolio weights and lower turnover, but they may also respond more slowly to changes in market-wide dependence. In contrast, covariance estimators that are more adaptive to evolving correlation patterns naturally lead to greater variation in portfolio composition and realized volatility. The empirical characteristics of BECM portfolios are therefore consistent with a strategy that places greater emphasis on tracking time-varying risk relationships.
From the perspective of practical portfolio analysis, these results suggest that BECM-based portfolios embody a different balance between stability and adaptability compared with more static benchmarks. The higher Sharpe ratios reported in the empirical analysis indicate that the increased volatility and turnover are accompanied, on average, by higher risk-adjusted returns over the sample period. At the same time, the observed trading intensity highlights that the performance of such strategies should be interpreted jointly with their rebalancing behavior, particularly when comparing them to portfolios constructed under stronger structural or smoothing assumptions.
From a decision-making perspective, these findings provide evidence that covariance estimators emphasizing adaptability to evolving dependence structures can support risk management objectives when evaluated jointly with their implications for trading intensity and volatility.