Article

High-Dimensional Precision Matrix Estimation through GSOS with Application in the Foreign Exchange Market

Azam Kheyri, Andriette Bekker and Mohammad Arashi
1 Department of Statistics, Faculty of Natural and Agricultural Sciences, University of Pretoria, Pretoria 0028, South Africa
2 Department of Statistics, Faculty of Mathematical Sciences, Ferdowsi University of Mashhad, Mashhad 9177948974, Iran
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(22), 4232; https://doi.org/10.3390/math10224232
Submission received: 14 October 2022 / Revised: 4 November 2022 / Accepted: 9 November 2022 / Published: 12 November 2022
(This article belongs to the Special Issue Contemporary Contributions to Statistical Modelling and Data Science)

Abstract: This article studies the estimation of the precision matrix of a high-dimensional Gaussian network. We investigate the graphical selector operator with shrinkage, GSOS for short, which maximizes a penalized likelihood function in which an elastic net-type penalty combines an $L_1$ penalty with a targeted Frobenius norm penalty. Numerical illustrations demonstrate that our proposed methodology is a competitive candidate for high-dimensional precision matrix estimation compared to some existing alternatives. We demonstrate the relevance and efficiency of GSOS using a foreign exchange markets dataset and estimate dependency networks for 32 different currencies from 2018 to 2021.

1. Introduction

In recent years, covariance and precision matrix estimation has been studied extensively under high-dimensional scenarios. The motivation for these investigations is that traditional likelihood-based techniques perform inaccurately, or do not even exist, in such settings. Prominent examples of regularized approaches include the graphical lasso algorithm and its extensions ([1,2,3,4,5,6,7]) and the ridge regularization-driven approach for the log-likelihood function ([8,9,10,11]). In addition to considering single $L_1$ (graphical lasso) and Frobenius (graphical ridge) penalties for the precision matrix, some studies have proposed combining both, resulting in an elastic net-type penalty; for instance, see [10,12,13,14].
Despite the vast number of studies reported in the literature on the estimation of precision matrices, few concern generalizations that allow for the incorporation of prior knowledge of the precision matrix. For the Frobenius-penalized case, we can mention [9,10,11]. The only proposals for incorporating target matrices (prior knowledge) into the $L_1$ and elastic net penalties are those of [10,15]. In this paper, we propose the graphical selector operator with shrinkage (GSOS) as a combination of the generalized ridge penalty (ridge with a target matrix) and the $L_1$ penalty. The main aim of this paper is to present and compare the new precision matrix estimator and to propose a novel algorithm for Gaussian graphical models. The numerical study shows that the GSOS estimator is comparable with other available estimators and in most cases performs better.

Estimation, Literature Review

Gaussian graphical models are frequently used for modeling conditional dependencies in multivariate data. The dependency structure is determined by estimating the precision matrix through standard methodologies, for instance, maximum likelihood, or its regularized counterpart in high-dimensional cases.
Consider a random vector $\mathbf{z} \sim N_p(\boldsymbol{\mu}, \Sigma)$ following a multivariate Gaussian distribution with mean vector $\boldsymbol{\mu} \in \mathbb{R}^p$ and positive definite covariance matrix $\Sigma \in \mathbb{R}^{p \times p}$, corresponding to a conditional independence graph $\mathcal{G}$. In mathematical terms, the pair $(\mathbf{z}, \mathcal{G})$ is a Gaussian graphical model. In these models, the graph $\mathcal{G}$ and the precision matrix $\Theta := \Sigma^{-1}$ are intimately related: the zeroes in the precision matrix correspond to pairs of features that are conditionally independent given all the other variables, that is, pairs with no edge between the corresponding nodes in the graph $\mathcal{G}$. Conversely, two coordinates $z_k$ and $z_{k'}$ are conditionally independent if there is no edge between them in the graph $\mathcal{G}$. Maximum likelihood estimation is a logical technique to estimate the positive definite precision matrix $\Theta$; this estimator is expressed as
$$\hat{\Theta} = \underset{\Theta \succ 0}{\arg\min} \left\{ -\log\det(\Theta) + \operatorname{tr}(S\Theta) \right\}, \qquad (1)$$
where $S = \mathbf{Z}^\top \mathbf{Z}/n$ is the sample covariance matrix (with $\mathbf{Z}$ the $n \times p$ matrix of centered observations), and the maximum likelihood estimator of $\Theta$ equals $S^{-1}$. However, two issues might emerge when utilizing this maximum likelihood technique to estimate $\Theta$. First, in the high-dimensional case $p > n$, the empirical covariance matrix $S$ is singular and hence cannot be inverted to provide an estimate of $\Theta$. Even if $p$ and $n$ are almost equal and $S$ is non-singular, the maximum likelihood estimate of $\Theta$ will have a relatively high variance. Second, it is frequently valuable to identify pairs of disconnected variables in a graphical model that are conditionally independent; these correspond to zeroes in $\Theta$. However, in general, (1) will produce an estimate of $\Theta$ with no elements equal to zero.
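To illustrate the first issue, the following is a small base-R sketch (ours, with simulated data used only for illustration) showing that $S$ is rank-deficient whenever $p > n$, so the inverse required by the maximum likelihood estimator does not exist:

```r
set.seed(1)
n <- 50; p <- 100                   # high-dimensional regime: p > n
Z <- matrix(rnorm(n * p), n, p)     # n draws from N_p(0, I_p)
S <- crossprod(Z) / n               # sample covariance S = Z'Z / n (p x p)
qr(S)$rank                          # at most n = 50 < p = 100: S is singular
# solve(S)                          # would fail: Theta-hat = S^{-1} does not exist
```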
The likelihood function is complemented with a penalty function in high-dimensional settings, which yields the maximum regularized likelihood estimator. The graphical lasso adds an $L_1$ penalty to the log-likelihood and induces sparsity in $\Theta$ by maximizing the penalized log-likelihood:
$$\hat{\Theta}_{\mathrm{Glasso}}(\rho) = \underset{\Theta \succ 0}{\arg\min} \left\{ -\log\det(\Theta) + \operatorname{tr}(S\Theta) + \rho \|\Theta\|_1 \right\}, \qquad (2)$$
where $\rho$ is a non-negative tuning parameter and $\|\Theta\|_1$ denotes the sum of the absolute values of the elements of $\Theta$. This approach was considered almost simultaneously by [1,16,17,18]. Due to the ensuing sparse solution, the graphical lasso estimate has gained significant interest and has become popular; it is still an active area of research (cf. [6,19,20,21]). In contrast to circumstances where sparsity is advantageous, there are cases where more exact representations of the high-dimensional precision matrix are fundamentally desirable. Furthermore, the genuine (graphical) model does not have to be (very) sparse in terms of having many zero elements. In these situations, we may use a regularization approach that shrinks the estimated precision matrix elements instead of forcing them to be zero. The ridge precision estimator maximizes the log-likelihood augmented by a Frobenius norm penalty; in its most general form, [9] presented the ridge estimator as follows:
$$\hat{\Theta}_{\mathrm{Ridge}}(\lambda) = \underset{\Theta \succ 0}{\arg\min} \left\{ -\log\det(\Theta) + \operatorname{tr}(S\Theta) + \frac{1}{2}\lambda \|\Theta - T\|_F^2 \right\}, \qquad (3)$$
where $\lambda$ is the penalty parameter, $T$ is a known symmetric and semi-positive definite target matrix, and $\|A\|_F^2$ denotes the squared Frobenius norm, the sum of the squared elements of the matrix $A$. Before estimation, the target matrix is set as an initial guess toward which the precision estimate is shrunk. Estimator (3) is called the alternative Type I ridge precision estimator; when the target matrix equals the zero matrix, it is called the alternative Type II ridge precision estimator. It should be mentioned that [11] considered the estimator in (3) independently and concurrently, and called it a ridge-type operator for precision matrix estimation (ROPE).
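For reference, estimator (3) admits a closed form. Setting the gradient of the objective to zero gives the stationarity condition $\Theta^{-1} = S - \lambda T + \lambda\Theta$, which is solved (cf. [9]) by

$$\hat{\Theta}(\lambda) = \left\{ \tfrac{1}{2}(S - \lambda T) + \left[\lambda I_p + \tfrac{1}{4}(S - \lambda T)^2\right]^{1/2} \right\}^{-1}.$$

The R helper below (ours; the function name is illustrative) computes this via an eigendecomposition:

```r
## Closed-form ridge precision estimator for (3), a sketch based on the
## stationarity condition above; E = S - lambda*T is symmetric.
ridge_precision <- function(S, lambda, T = diag(nrow(S))) {
  E  <- S - lambda * T
  eg <- eigen(E, symmetric = TRUE)                  # E = V diag(d) V'
  inv_d <- 1 / (eg$values / 2 + sqrt(lambda + eg$values^2 / 4))
  eg$vectors %*% (inv_d * t(eg$vectors))            # V diag(inv_d) V'
}
```

Since $\sqrt{\lambda + d^2/4} > |d|/2$ for every eigenvalue $d$ of $E$ when $\lambda > 0$, the resulting estimate is positive definite by construction.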
In a recent study in this area, Ref. [10] generalized the ridge inverse covariance estimator to allow for entry-wise penalization. Their proposed estimator shrinks toward a user-specified non-random target matrix and is shown to be positive definite and consistent. Furthermore, they obtained a generalization of the graphical lasso estimator and its elastic net counterpart. Recently, Ref. [15] considered elastic net-type penalization (gelnet) for precision matrix estimation in the presence of a diagonal target matrix. They showed that the iterative procedure of [10] can be adopted for the elastic net problem, but that it is not computationally attractive.
Although the ridge estimator (3) is continuous, it produces no sparsity, and the resulting estimates of the precision matrix can be far from the actual value $\Theta$; hence the motivation of our approach to find a closer estimator. We use the Frobenius norm penalty to penalize the deviation between any primary or initial target estimator and $\Theta$, and tune it to improve the accuracy of the final estimation. Furthermore, the $L_1$ penalty is added to account for sparsity, returning to the well-known elastic net with some added refinement. Therefore, in this paper, we consider combining the estimation approaches (2) and (3) to propose an elastic net-type estimator for the precision matrix: the GSOS.

2. The Proposed Method

Let $X_{n \times p}$ be the data matrix, containing $n$ observations from a $p$-dimensional Gaussian distribution with zero mean and positive definite covariance matrix $\Sigma$. Consider the following estimation problem for the unknown precision matrix $\Theta = \Sigma^{-1}$ based on the $n$ observations:
$$\hat{\Theta}(\alpha, \lambda, T) = \underset{\Theta \succ 0}{\arg\min} \left\{ -\log\det(\Theta) + \operatorname{tr}(S\Theta) + \lambda\left(\alpha\|\Theta\|_1 + \frac{1-\alpha}{2}\|\Theta - T\|_F^2\right) \right\}, \qquad (4)$$
where $T_{p \times p}$ is a known symmetric and semi-positive definite target matrix, and $\lambda \ge 0$ and $\alpha \in [0, 1]$ are tuning parameters. Note that if $\alpha = 0$, the precision matrix estimator (4) reduces to the ridge estimator in (3).
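For concreteness, the following is a small R helper (ours, not taken from the paper's code) that evaluates the GSOS objective (4) at a candidate positive definite matrix Theta:

```r
## Evaluate the GSOS objective (4); a sketch for checking candidate solutions.
gsos_objective <- function(Theta, S, lambda, alpha, T = diag(nrow(S))) {
  neg_loglik <- -as.numeric(determinant(Theta, logarithm = TRUE)$modulus) +
                sum(diag(S %*% Theta))                  # -log det + tr(S Theta)
  penalty    <- lambda * (alpha * sum(abs(Theta)) +     # L1 part
                (1 - alpha) / 2 * sum((Theta - T)^2))   # targeted Frobenius part
  neg_loglik + penalty
}
```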
Using sub-gradient notation and rules from [22], we obtain the following optimality conditions for (4):
$$-\Theta^{-1} + S + \lambda\alpha\Gamma + \lambda(1-\alpha)(\Theta - T) = 0,$$
i.e., $-\Theta^{-1} + S + \lambda\alpha\Gamma - \lambda(1-\alpha)T + \lambda(1-\alpha)\Theta = 0$,
i.e.,
$$-\Theta^{-1} + A + \lambda\alpha\Gamma + \lambda(1-\alpha)\Theta = 0, \qquad (5)$$
where $A := A(S, \alpha, \lambda, T) = S - \lambda(1-\alpha)T$, and the matrix $\Gamma = [\gamma_{kk'}]$ denotes the component-wise signs of $\Theta$, with $\gamma_{kk'} \in [-1, 1]$ for $\Theta_{kk'} = 0$. The solution of the normal Equation (5) is found by iteratively running over the columns/rows while keeping the remaining ones fixed. The update requires that each matrix in (5) is partitioned as follows:
$$\Theta = \begin{pmatrix} \Theta_{11} & \theta_{12} \\ \theta_{21} & \theta_{22} \end{pmatrix}, \quad \Gamma = \begin{pmatrix} \Gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{pmatrix}, \quad A = \begin{pmatrix} A_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \qquad (6)$$
where $\Theta_{11}$ is a $(p-1) \times (p-1)$ matrix, $\theta_{12}$ is a $(p-1)$-vector, and $\theta_{22}$ is a scalar; $\Gamma$ and $A$ are partitioned similarly. Consider $W = \Theta^{-1}$; using the properties of inverses of block-partitioned matrices, we have that
$$W = \begin{pmatrix} W_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} = \begin{pmatrix} \Theta_{11}^{-1} + \dfrac{\Theta_{11}^{-1}\theta_{12}\theta_{21}\Theta_{11}^{-1}}{\theta_{22} - \theta_{21}\Theta_{11}^{-1}\theta_{12}} & -\dfrac{\Theta_{11}^{-1}\theta_{12}}{\theta_{22} - \theta_{21}\Theta_{11}^{-1}\theta_{12}} \\[2ex] -\dfrac{\theta_{21}\Theta_{11}^{-1}}{\theta_{22} - \theta_{21}\Theta_{11}^{-1}\theta_{12}} & \dfrac{1}{\theta_{22} - \theta_{21}\Theta_{11}^{-1}\theta_{12}} \end{pmatrix}. \qquad (7)$$
Considering the pth column of (5), we obtain
$$-w_{12} + a_{12} + \lambda\alpha\gamma_{12} + \lambda(1-\alpha)\theta_{12} = 0. \qquad (8)$$
By substituting $w_{12}$ and $w_{22}$ from (7) into (8), we have that
$$\frac{\Theta_{11}^{-1}\theta_{12}}{\theta_{22} - \theta_{21}\Theta_{11}^{-1}\theta_{12}} + a_{12} + \lambda\alpha\gamma_{12} + \lambda(1-\alpha)\theta_{12} = 0,$$
i.e., $\Theta_{11}^{-1}\theta_{12} w_{22} + a_{12} + \lambda\alpha\gamma_{12} + \lambda(1-\alpha)\theta_{12} = 0$,
i.e.,
$$\left(\Theta_{11}^{-1} w_{22} + \lambda(1-\alpha) I_{p-1}\right)\theta_{12} + a_{12} + \lambda\alpha\gamma_{12} = 0. \qquad (9)$$
Consider $\tilde{q}_{12} := \operatorname{abs}(\theta_{12})$ and $\tilde{\gamma} := \lambda\alpha\gamma_{12}$; then, (9) is equivalent to
$$\left(\frac{\Theta_{11}^{-1} w_{22}}{\alpha\lambda} + \frac{1-\alpha}{\alpha} I_{p-1}\right)\left(\tilde{q}_{12} * \tilde{\gamma}\right) + a_{12} + \tilde{\gamma} = 0, \quad \tilde{q}_{12} * \left(\operatorname{abs}(\tilde{\gamma}) - \lambda\alpha\, 1_{p-1}\right) = 0, \quad \|\tilde{\gamma}\|_\infty \le \lambda\alpha, \qquad (10)$$
where $*$ denotes the element-wise multiplication of two vectors. The conditions in (10) are the KKT optimality conditions for the following box-constrained problem in $\tilde{\gamma} \in \mathbb{R}^{p-1}$:
$$\underset{\|\tilde{\gamma}\|_\infty \le \lambda\alpha}{\operatorname{minimize}} \; \frac{1}{2}\left(a_{12} + \tilde{\gamma}\right)^\top \left(\frac{\Theta_{11}^{-1} w_{22}}{\alpha\lambda} + \frac{1-\alpha}{\alpha} I_{p-1}\right)^{-1} \left(a_{12} + \tilde{\gamma}\right). \qquad (11)$$
By solving the optimization problem (11) and finding the optimum point $\gamma^*$, $\theta_{12}$ can be updated as follows for all $\alpha \in (0, 1]$:
$$\hat{\theta}_{12} = -\left(\frac{\Theta_{11}^{-1} w_{22}}{\alpha\lambda} + \frac{1-\alpha}{\alpha} I_{p-1}\right)^{-1} \left(a_{12} + \gamma^*\right). \qquad (12)$$
For the diagonal element $\theta_{22}$ of the precision matrix, we consider the corresponding diagonal entry of (5):
$$-w_{22} + a_{22} + \lambda\alpha + \lambda(1-\alpha)\theta_{22} = 0. \qquad (13)$$
From (7), we have that $w_{22} = 1/(\theta_{22} - \theta_{21}\Theta_{11}^{-1}\theta_{12})$; this implies that (13) is equivalent to
$$1 + \left(a_{22} + \lambda\alpha + \lambda(1-\alpha)\theta_{22}\right)\left(\theta_{21}\Theta_{11}^{-1}\theta_{12} - \theta_{22}\right) = 0. \qquad (14)$$
Let $C := \theta_{21}\Theta_{11}^{-1}\theta_{12}$; then we obtain the quadratic equation
$$-\lambda(1-\alpha)\theta_{22}^2 + \left(\lambda(1-\alpha)C - a_{22} - \lambda\alpha\right)\theta_{22} + C\left(a_{22} + \lambda\alpha\right) + 1 = 0. \qquad (15)$$
Quadratic Equation (15) has two distinct real roots, one positive (acceptable) and one negative (unacceptable); hence, we can update the diagonal elements of the precision matrix with the positive root. Finally, $w_{12}$ and $w_{22}$ are updated using the normal Equation (5). Our method here is similar to the dpglasso estimator proposed by [5]. Alternatively, we can follow the glasso approach to solve the problem. The fundamental distinction between glasso and dpglasso is that in glasso, $W$ is not equal to the inverse of $\Theta$ at every stage; glasso operates on $W$, whereas dpglasso operates on its inverse $\Theta$. Algorithm 1 shows the procedure for $\alpha \in (0, 1]$.
Algorithm 1 GSOS based on the dpglasso approach
  • Initialize $\Theta$ as a diagonal matrix with diagonal elements, for $k = 1, \dots, p$,
    $$\theta_{kk} = \frac{-(a_{kk} + \lambda\alpha) + \sqrt{(a_{kk} + \lambda\alpha)^2 + 4\lambda(1-\alpha)}}{2\lambda(1-\alpha)},$$
    and set $W = A + \lambda\alpha I_p + \lambda(1-\alpha)\Theta$;
  • Cycle over the columns and repeatedly perform the following steps until convergence:
(A) Rearrange the rows/columns so that the target column is last (according to (6));
(B) Solve (11) and update $\hat{\theta}_{12} = -\left(\frac{\Theta_{11}^{-1} w_{22}}{\alpha\lambda} + \frac{1-\alpha}{\alpha} I_{p-1}\right)^{-1}(a_{12} + \gamma^*)$;
(C) Solve (15) to obtain $\hat{\theta}_{22}$;
(D) Update $\hat{w}_{12} = a_{12} + \gamma^*$;
(E) Update $\hat{w}_{22} = a_{22} + \lambda\alpha + \lambda(1-\alpha)\hat{\theta}_{22}$.
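A minimal R transcription (ours) of the initialization step of Algorithm 1 follows; it is valid for $\alpha \in (0, 1)$, since for $\alpha = 1$ the Frobenius term vanishes and the quadratic degenerates to the linear glasso case:

```r
## Initialization of Algorithm 1: diagonal Theta from the positive root of
## the diagonal quadratic (C = 0 in (15)), then W = A + la*I + l(1-a)*Theta.
gsos_init <- function(S, lambda, alpha, T = diag(nrow(S))) {
  p <- nrow(S)
  A <- S - lambda * (1 - alpha) * T              # A = S - lambda(1 - alpha)T
  a <- diag(A)
  theta <- (-(a + lambda * alpha) +
            sqrt((a + lambda * alpha)^2 + 4 * lambda * (1 - alpha))) /
           (2 * lambda * (1 - alpha))            # positive root, elementwise
  Theta <- diag(theta, p)
  W <- A + lambda * alpha * diag(p) + lambda * (1 - alpha) * Theta
  list(Theta = Theta, W = W)
}
```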

Glasso Scenario

The following relationship between W and Θ is established in glasso
$$W = \begin{pmatrix} W_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} = \begin{pmatrix} \Theta_{11} & \theta_{12} \\ \theta_{21} & \theta_{22} \end{pmatrix}^{-1} = \begin{pmatrix} W_{11} & -\dfrac{W_{11}\theta_{12}}{\theta_{22}} \\[2ex] -\dfrac{\theta_{21}W_{11}}{\theta_{22}} & \dfrac{1}{\theta_{22}} + \dfrac{\theta_{21}W_{11}\theta_{12}}{\theta_{22}^2} \end{pmatrix}. \qquad (16)$$
Consider the pth column of (5) and substitute w 12 from (16):
$$-w_{12} + a_{12} + \lambda\alpha\gamma_{12} + \lambda(1-\alpha)\theta_{12} = 0, \quad \frac{W_{11}\theta_{12}}{\theta_{22}} + a_{12} + \lambda\alpha\gamma_{12} + \lambda(1-\alpha)\theta_{12} = 0. \qquad (17)$$
Define $\beta = -\theta_{12}/\theta_{22}$; therefore,
$$W_{11}\beta - a_{12} - \lambda\alpha\gamma_{12} + \lambda(1-\alpha)\theta_{22}\beta = 0. \qquad (18)$$
Letting $W_{11} = V$, $a_{12} = u$, and $\gamma_{12} = \rho$, then, for every $i = 1, \dots, p-1$, we have that
$$V_{ii}\beta_i + \sum_{k \ne i}^{p-1} V_{ki}\beta_k - u_i - \lambda\alpha\rho_i + \lambda(1-\alpha)\theta_{22}\beta_i = 0. \qquad (19)$$
Subsequently, the update has the following form
$$\hat{\beta}_i \leftarrow \frac{S\left(u_i - \sum_{k \ne i}^{p-1} V_{ki}\beta_k, \; \alpha\lambda\right)}{V_{ii} + \lambda(1-\alpha)\theta_{22}},$$
where $S(\cdot, \cdot)$ is the soft-thresholding operator, $S(x, t) = \operatorname{sign}(x)(|x| - t)_+$. We cycle through the predictors until convergence. $\hat{\beta}$ is the optimal point of the following quadratic problem:
$$\underset{\beta \in \mathbb{R}^{p-1}}{\operatorname{minimize}} \; \frac{1}{2}\beta^\top W_{11}\beta - \beta^\top a_{12} + \lambda\alpha\|\beta\|_1 + \frac{\lambda(1-\alpha)}{2}\theta_{22}\,\beta^\top\beta, \qquad (20)$$
which corresponds to the normal equations (17). After solving this quadratic problem and finding $\hat{\beta}$, we can update $\hat{w}_{12}$. From (5), for the diagonal element $\theta_{22}$ of the precision matrix, we have that
$$-w_{22} + a_{22} + \lambda\alpha + \lambda(1-\alpha)\theta_{22} = 0. \qquad (21)$$
From (16), we have that $w_{22} = \frac{1}{\theta_{22}} + \frac{\theta_{21} W_{11} \theta_{12}}{\theta_{22}^2} = \frac{1}{\theta_{22}} + \beta^\top W_{11}\beta$; hence, (21) is equivalent to
$$\lambda(1-\alpha)\theta_{22}^2 - \left(\beta^\top W_{11}\beta - a_{22} - \alpha\lambda\right)\theta_{22} - 1 = 0. \qquad (22)$$
Equation (22) has two distinct real roots, one positive (acceptable) and one negative (unacceptable); hence, we can update the diagonal elements of the precision matrix with the positive root. Finally, $\hat{\theta}_{12}$ and $\hat{w}_{22}$ are updated by (5). Algorithm 2 briefly outlines these steps.
Algorithm 2 GSOS based on the glasso approach
  • Initialize $\Theta$ as a diagonal matrix with diagonal elements, for $k = 1, \dots, p$,
    $$\theta_{kk} = \frac{-(a_{kk} + \lambda\alpha) + \sqrt{(a_{kk} + \lambda\alpha)^2 + 4\lambda(1-\alpha)}}{2\lambda(1-\alpha)},$$
    and set $W = A + \lambda\alpha I_p + \lambda(1-\alpha)\Theta$;
  • Cycle over the columns and repeatedly perform the following steps until convergence:
(A) Rearrange the rows/columns so that the target column is last;
(B) Solve (20) and update $\hat{w}_{12} = W_{11}\hat{\beta}$;
(C) Solve (22) to update $\hat{\theta}_{22}$;
(D) Update $\hat{\theta}_{12} = -\hat{\theta}_{22}\hat{\beta}$;
(E) Update $\hat{w}_{22} = a_{22} + \lambda\alpha + \lambda(1-\alpha)\hat{\theta}_{22}$.
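A minimal R sketch (ours) of the soft-thresholding update used in step (B), i.e., one coordinate-descent sweep for problem (20), with $V = W_{11}$, $u = a_{12}$, and $\theta_{22}$ fixed as in (19):

```r
## Soft-thresholding operator S(x, t) = sign(x) * (|x| - t)_+
soft <- function(x, t) sign(x) * pmax(abs(x) - t, 0)

## One full sweep over the p - 1 coordinates of beta; repeat until convergence.
beta_sweep <- function(beta, V, u, lambda, alpha, theta22) {
  for (i in seq_along(beta)) {
    r <- u[i] - sum(V[-i, i] * beta[-i])               # partial residual
    beta[i] <- soft(r, lambda * alpha) /
               (V[i, i] + lambda * (1 - alpha) * theta22)
  }
  beta
}
```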

3. Simulation

In this section, the statistical performance of GSOS is compared to some other popular estimators: glasso [1], ROPE (the alternative ridge) [9,11], and the graphical elastic net (gelnet) [15].
We simulate the data from a multivariate Gaussian distribution $N_p(0, \Sigma)$, where $\Sigma = [\sigma_{k,k'}]$ and $\Theta = \Sigma^{-1} = [\theta_{k,k'}]$ are $p \times p$ positive definite matrices. Six different models are used to compare the methods (see Appendix A for the precision matrices; an R sketch generating two of these structures follows the list). We evaluate these estimators on the six network structures with $p = 20, 50, 100$ nodes and a sample size of $n = 50$ by implementing Algorithm 2.
  • Network 1: A model with compound symmetry structure, where $\sigma_{k,k} = 1$ and $\sigma_{k,k'} = 0.6^2$ for $k \ne k'$. In this model, the covariance matrix is structured and non-sparse;
  • Network 2: A prototype matrix $\Theta_0$ is used to standardize the precision matrix $\Theta$ to possess a unit diagonal. Let $\Theta_0 = A + aI$, where each off-diagonal entry of $A$ is generated independently and equals 0.5 with probability 0.1 or 0 with probability 0.9, and $a$ is chosen such that the condition number of the matrix equals $p$. Here, we have an unstructured and sparse precision matrix;
  • Network 3: The precision matrix is defined as $\Theta = \frac{1}{n} Y^\top Y$, where $Y = [y_{kk'}]$ is an $n \times p$ matrix with $n = 10{,}000$ and $y_{kk'} \sim N(0, 1)$. The precision matrix of this model is unstructured and non-sparse;
  • Network 4: A star model with $\theta_{k,k} = 1$, $\theta_{1,k} = \theta_{k,1} = 0.1$, and $\theta_{k,k'} = 0$ otherwise. This local area network has a structured and sparse precision matrix;
  • Network 5: A moving average (MA) model with $\sigma_{k,k} = 1$, $\sigma_{k,k-1} = \sigma_{k-1,k} = 0.2$, and $\sigma_{k,k-2} = \sigma_{k-2,k} = 0.2^2$. This covariance matrix is structured and sparse;
  • Network 6: A diagonally dominant model. Consider $B = \frac{1}{2}(A + A^\top)$, where $A$ is a $p \times p$ matrix with zero diagonal elements and each off-diagonal element of $A$ drawn from a standard uniform distribution. Compute $D = \frac{1}{\gamma} B$, where $D = [d_{k,k'}]$ and $\gamma$ is the largest row sum of the absolute values of the elements of $B$. Finally, each off-diagonal element of $\Sigma$ is chosen as $\sigma_{k,k'} = d_{k,k'}$, and $\sigma_{k,k} = 1 + e_k$, where $e_k$ is drawn from a uniform distribution with minimum 0 and maximum 0.1. This covariance matrix is unstructured and non-sparse.
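As referenced above, the following is a minimal R sketch (ours) of two of these covariance structures; MASS::mvrnorm is used only to illustrate the sampling step:

```r
p <- 50
## Network 1: compound symmetry, sigma_kk = 1 and sigma_kk' = 0.6^2 (k != k')
Sigma1 <- matrix(0.6^2, p, p)
diag(Sigma1) <- 1
## Network 5: moving average bands at lags 1 and 2
Sigma5 <- diag(p)
Sigma5[abs(row(Sigma5) - col(Sigma5)) == 1] <- 0.2
Sigma5[abs(row(Sigma5) - col(Sigma5)) == 2] <- 0.2^2
## draw n = 50 observations from N_p(0, Sigma1)
X <- MASS::mvrnorm(n = 50, mu = rep(0, p), Sigma = Sigma1)
S <- cov(X)   # sample covariance fed to the estimators
```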
To calculate the performance measures for all the methods, 100 independent simulations are performed for each network and the average of several loss functions is calculated. The optimal tuning parameter $\lambda$ for each method is determined in each simulation run using five-fold cross-validation. For glasso, gelnet, and ROPE, we use the five-fold cross-validation routine in the R package "GLassoElnetFast", with a grid of 50 values of $\lambda$ ranging from 0.01 to 10 and, for gelnet and our estimator, a grid of 20 values of $\alpha$ ranging from 0.01 to 0.99. As target matrices, we consider the identity matrix $I$ and a scalar matrix $\nu I$, where $\nu = p^2/\operatorname{tr}(S)$.
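A minimal R sketch (ours) of these grids and targets, assuming the dimension p and a sample covariance S are already in scope (e.g., from the previous sketch); the cross-validation itself is left to the package routines:

```r
lambda_grid <- seq(0.01, 10, length.out = 50)    # 50 values of lambda
alpha_grid  <- seq(0.01, 0.99, length.out = 20)  # 20 values of alpha
target_I    <- diag(p)                           # identity target I
nu          <- p^2 / sum(diag(S))                # scalar factor as described above
target_nu   <- nu * diag(p)                      # scalar target nu * I
```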

3.1. Performance Measures

To calculate the performance of a given estimator $\hat{\Theta}$, we consider four loss functions that have been widely used in this area of research (see, e.g., [7,11,14,15]); R implementations are sketched after the list.

Loss Functions

  • The Kullback–Leibler loss: $KL = \operatorname{tr}(\Sigma\hat{\Theta}) - \log\det(\Sigma\hat{\Theta}) - p$;
  • The $L_2$ loss: $L_2 = \|\Theta - \hat{\Theta}\|_F$;
  • The quadratic loss: $QL = \operatorname{tr}\left[(\Sigma\hat{\Theta} - I_p)^2\right]$;
  • The spectral norm loss: $SP = d_1$, where $d_1^2$ is the largest eigenvalue of the matrix $(\Theta - \hat{\Theta})^2$.
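As referenced above, a minimal R sketch (ours; the function names are illustrative) of the four loss functions, where Sigma and Theta denote the true matrices and Theta_hat the estimate:

```r
kl_loss <- function(Sigma, Theta_hat) {          # Kullback-Leibler loss
  M <- Sigma %*% Theta_hat
  sum(diag(M)) - as.numeric(determinant(M, logarithm = TRUE)$modulus) - nrow(M)
}
l2_loss <- function(Theta, Theta_hat)            # Frobenius-norm loss
  sqrt(sum((Theta - Theta_hat)^2))
ql_loss <- function(Sigma, Theta_hat) {          # quadratic loss
  D <- Sigma %*% Theta_hat - diag(nrow(Sigma))
  sum(diag(D %*% D))
}
sp_loss <- function(Theta, Theta_hat)            # spectral norm loss
  max(abs(eigen(Theta - Theta_hat, symmetric = TRUE)$values))
```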
To provide risk measures, the averages of these losses are computed for each method over the 100 simulations. Figure 1, Figure 2, Figure 3 and Figure 4 show the final findings. The plot layout is taken from [11]: the columns with the small black dots represent loss means, and the bars at the top of each column represent standard errors (mean $\pm$ SD).

3.2. Simulation Results

The results for different networks are displayed in Figure 1, Figure 2, Figure 3 and Figure 4. We summarize some observations based on the results as follows:
  • In general, as mentioned in [15], it is advantageous to include a target in the methods, since each approach works better with the appropriate target;
  • Compared to other alternatives, GSOS is often a strong contender for high-dimensional precision matrix estimation;
  • The question of which target is more effective remains. However, in most cases, our simulations suggest that the identity target matrix works better.

4. Real Data

In the following section, we study market data provided by the Pacific Exchange Rate Service (PERS) dataset, available at https://fx.sauder.ubc.ca/, accessed on 13 September 2022. PERS provides daily values of currencies and commodities priced in various base currencies, known as numeraires. We consider 32 different currencies over the four years from 2018 to 2021. The names of these 32 currencies and their abbreviations are listed in Table 1.
We gather the data initially with the US dollar (USD) as numeraire and then, from these initial data, calculate all cross exchange rates (for more detail, see [23,24]). After determining all exchange rates, using each of the 32 currencies as numeraire, we compute the daily log returns of the remaining 31 currencies. We estimate the related covariance and precision matrices based on all possible currency exchanges (496 different exchange rates). We apply our proposed estimator GSOS, with the scalar matrix as target, to the four-year data and to each year separately (five-fold cross-validation is used to select the tuning parameters). The estimated networks are presented in Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9; we display the 50 strongest partial correlations.
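A minimal R sketch (ours) of this pipeline for a single numeraire. The object rates (a days-by-31 matrix of exchange rates) and the GSOS estimate Theta_hat are assumed to be available; the partial correlations are obtained by the usual rescaling $\rho_{jk} = -\theta_{jk}/\sqrt{\theta_{jj}\theta_{kk}}$:

```r
log_ret <- diff(log(as.matrix(rates)))    # daily log returns, column-wise
S       <- cov(log_ret)                   # sample covariance of log returns
## Theta_hat: GSOS estimate with scalar target (tuning via five-fold CV)
D    <- diag(1 / sqrt(diag(Theta_hat)))   # rescaling by diagonal of Theta_hat
pcor <- -D %*% Theta_hat %*% D            # partial correlation matrix
diag(pcor) <- 1
## keep the 50 strongest off-diagonal partial correlations for plotting
off   <- abs(pcor[upper.tri(pcor)])
thr   <- sort(off, decreasing = TRUE)[50]
edges <- which(abs(pcor) >= thr & upper.tri(pcor), arr.ind = TRUE)
```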

Real Data Results

To report the analytical results, we consider different thresholds for the partial correlations and retain only those that exceed the threshold, in order to highlight the most important relations between the considered currencies and to explain how these networks are formed.
  • 2018–2021 network: First, strong partial correlations appear between the European currencies DKK, HRK, CZK, PLN, HUF, CHF, and EUR. Additionally, we find a cluster that consists of HKD, SAR, CNY, TWD, SGD, and USD. Considering a smaller threshold, one notices the partial correlation between the Oceania pair AUD and NZD together with CAD, and the joining of EUR to the USD-based cluster, which connects the European currencies to this cluster. Finally, a rather interesting cluster centered around the Latin American currency MXN appears, which also contains BRL, ZAR, and RUB;
  • 2018 network: We observe strong partial correlations between the European currencies DKK, HRK, CZK, PLN, HUF, and EUR. The USD-based cluster here consists of HKD, PEN, TWD, SGD, CNY, THB, and MYR. The EUR currency connects these two major clusters. For a smaller threshold, we find the relation between the Oceania pair AUD and NZD together with CAD. In addition, a cluster between the Latin American, African, and Asian currencies MXN, ZAR, and RUB is noteworthy;
  • 2019 network: Similar to the 2018 network, we have strong and connected European currencies: DKK, HRK, CZK, PLN, and HUF. The USD-based cluster is formed by HKD, SAR, TWD, SGD, CNY, and MYR at first; by considering a smaller threshold, we find other currencies such as PEN and THB. We observe that EUR connects these two clusters, and another cluster consists of AUD, NZD, NOK, and SEK;
  • 2020 network: This year, we observe the USD cluster as follows: HKD, SAR, TWD, SGD, and CNY. We can see MYR, PHP, and THB as a separate clique (every two distinct vertices are adjacent). Regarding the European currencies, EUR and CHF play a connecting role between this cluster and the USD-based cluster. The European currency cluster consists of DKK, HRK, CZK, PLN, and HUF. In addition, we observe two other clusters: the Oceania pair AUD and NZD with CAD and GBP, and MXN, BRL, ZAR, and RUB. These two clusters are connected to the European currencies by SEK;
  • 2021 network: As previously, we observe that the USD-based cluster consists of HKD, SAR, TWD, CNY, and SGD. DKK, HRK, CZK, HUF, PLN, CHF, SEK, and EUR form a European currency cluster, which is connected to the USD cluster by EUR. In addition, we discover an interesting cluster centered around AUD consisting of NZD, CAD, NOK, and GBP. Considering a smaller threshold, we find more relations in the last cluster and considerable partial correlations between AUD, NZD, NOK, and SEK. This year, we see a strong partial correlation between GBP and SGD, which has never been observed before. Finally, we have a small cluster consisting of three Latin American, Asian, and African currencies: MXN, RUB, and ZAR.

5. Discussion

This section provides a summary of the paper and highlights the results. Additionally, we address some limitations of the research and gaps for future research. Finally, we review the findings and the results of the proposed method: GSOS.

5.1. Summary

In this paper, we proposed two methods to estimate a precision matrix in multivariate Gaussian settings. Estimating a precision matrix is one of the most critical tools for reconstructing an undirected graphical model expressing conditional dependencies between variables. Furthermore, obtaining an estimated network of partial correlations with a suitable rescaling of the precision matrix is possible.
We focused on the high-dimensional case to obtain regularized and sparse estimates by considering an elastic net-type penalty. This penalty combines the Frobenius norm, which takes a target matrix into account, with the $L_1$ norm penalty that creates sparsity. The first method, Algorithm 1, employs the dpglasso technique proposed by [5] to find the penalized log-likelihood estimate. The second method, Algorithm 2, follows an approach similar to glasso, as studied by [1], and solves the optimization problem by turning it into quadratic subproblems.
We conducted a simulation study to test the proposed algorithms using three network dimensions ($p = 20, 50, 100$) and six common network structures reflecting various forms of conditional dependence. To calculate the performance measures for all the methods, we considered 100 independent simulations for each network; we presented the average of the Kullback–Leibler, $L_2$, quadratic, and spectral norm loss functions. The optimal tuning parameters for each method were determined using five-fold cross-validation in each simulation run.
Lastly, we presented an empirical study of the network between 32 currencies for the years 2018 to 2021, estimating both annual and four-year dependency networks. In this real data study, we estimated a high-dimensional precision matrix for every yearly network.

5.2. Contributions

We added sparsity to the alternative ridge estimator by considering the $L_1$ norm as an additional penalty. In this area, [15] also focused on elastic net-type penalties for Gaussian log-likelihood-based precision matrix estimation, called gelnet. However, they considered the target matrix simultaneously for the $L_1$ and Frobenius norms. Therefore, our estimator cannot be obtained from gelnet when nonzero target matrices are considered. These relations encouraged us to choose glasso, alternative ridge, and gelnet as competitor estimators to compare with our proposed estimator, GSOS.
We used the R programming language [25] and the "GLassoElnetFast" package for the simulation study; the codes are available at https://github.com/Azamkheyri/GSOS.git, accessed on 27 October 2022. In terms of simulation results, GSOS outperformed the alternative ridge under most of the underlying structures in the high-dimensional case for all considered dimensions. GSOS and gelnet behaved almost similarly in most cases, but in Network 4, for the Kullback–Leibler and $L_2$ risk measures, GSOS significantly outperformed gelnet.
Regarding glasso: according to the Kullback–Leibler risk measure, GSOS and glasso performed similarly, except in Networks 1 and 2, where GSOS performed better. For the $L_2$ measure, GSOS performed better in Network 1, while for the quadratic risk measure, their behaviors differed: in three networks, GSOS outperformed glasso, and for the remainder, glasso was better. Therefore, our proposed estimator is an efficient way to estimate precision matrices for high-dimensional Gaussian graphical models.
Finally, we added a real data example, the PERS dataset, and estimated four annual and one four-year dependency networks from 2018 to 2021. Since the log returns must be calculated using data from two consecutive days, we removed from our study those currencies with at least ten missing values for exchange rates with the USD. Therefore, we estimated high-dimensional precision matrices with GSOS for 32 currencies worldwide.

5.3. Strengths and Limitations

As mentioned before, we considered an elastic net-type penalty for the Gaussian log-likelihood problem to estimate the precision matrix. The most important strength of our proposal is that this kind of penalty simultaneously captures the advantages of the Frobenius norm and $L_1$-penalized estimation. We could not prove mathematically that our proposed estimator outperforms the known methods; this might be considered a limitation. However, we illustrated the performance of the GSOS estimator in a simulation study under six dependency networks frequently used in the literature.
Our simulation results depend on the considered network structures and performance measures; hence, they are not fully general. Furthermore, despite our proposed estimator's considerable statistical performance, there is no overall winner.
Finally, because of the presence of missing data in the foreign exchange markets, and because the log returns have to be calculated from two consecutive days, only those currencies with fewer than ten missing values for exchange rates based on USD were considered.

5.4. Future Work

For future research, we will consider studying the asymptotic behavior of our suggested estimators. In addition, while all the recommended estimators depend on the multivariate Gaussian assumption, we might also consider ways to relax it, by proposing distribution-free approaches or by looking at other distributions that better capture the data's properties. The relationship between the values of the two tuning parameters, $\alpha$ and $\lambda$, might also be worthwhile to investigate.
Moreover, the considered real data example has more potential for further study, for instance, considering long-term studies and interpreting the results based on geographic, economic, or political factors.

6. Conclusions

The proposed precision estimator is based on the $L_1$ penalty on the Gaussian log-likelihood for sparsity, together with Frobenius norm shrinkage toward an arbitrary non-random target matrix. We proposed two algorithms, one based on coordinate-wise descent (glasso) and one based on a box-constrained quadratic program (dpglasso). Our approach is similar to gelnet by [15], but with some refinement; the estimator proposed here cannot be obtained from gelnet. Compared to other alternatives, the simulation study illustrated that our proposed strategy is a good competitor for high-dimensional precision matrix estimation. Additionally, we presented an empirical analysis of a network of 32 popular currencies and estimated yearly and four-year dependency networks.

Author Contributions

Funding acquisition, A.B. and M.A.; methodology, A.K., A.B. and M.A.; project administration, A.B.; supervision, A.B. and M.A.; validation, A.K., A.B. and M.A.; writing—original draft, A.K.; writing—review and editing, A.B. and M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was based upon research supported in part by the National Research Foundation (NRF) of South Africa, Ref.: SRUG190308422768, grant no. 120839; the South African DST-NRF-MRC SARChI Research Chair in Biostatistics (grant no. 114613); and STATOMET at the Department of Statistics at the University of Pretoria. The third author’s research (M. Arashi) is supported by a grant from Ferdowsi University of Mashhad (N.2/58091).

Data Availability Statement

The data under consideration in this study is in the public domain.

Acknowledgments

We would like to sincerely thank the two anonymous reviewers for their constructive comments, which led us to add many details to the paper and improve the presentation.

Conflicts of Interest

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
GLASSO: Graphical lasso with zero target matrix
GLASSO-I: Graphical lasso with identity target matrix
GLASSO-vI: Graphical lasso with scalar target matrix
GELNET: Graphical elastic net with zero target matrix
GELNET-I: Graphical elastic net with identity target matrix
GELNET-vI: Graphical elastic net with scalar target matrix
ROPE: ROPE with zero target matrix
ROPE-I: ROPE with identity target matrix
ROPE-vI: ROPE with scalar target matrix
GSOS: GSOS with zero target matrix
GSOS-I: GSOS with identity target matrix
GSOS-vI: GSOS with scalar target matrix

Appendix A

Figure A1 shows the precision matrices of the considered networks in the simulation study.
Figure A1. Precision matrices (diagonal elements ignored).

References

  1. Friedman, J.; Hastie, T.; Tibshirani, R. Sparse inverse covariance estimation with the graphical lasso. Biostatistics 2008, 9, 432–441.
  2. Fan, J.; Feng, Y.; Wu, Y. Network exploration via the adaptive LASSO and SCAD penalties. Ann. Appl. Stat. 2009, 3, 521.
  3. Bien, J.; Tibshirani, R.J. Sparse estimation of a covariance matrix. Biometrika 2011, 98, 807–820.
  4. Witten, D.M.; Friedman, J.H.; Simon, N. New insights and faster computations for the graphical lasso. J. Comput. Graph. Stat. 2011, 20, 892–900.
  5. Mazumder, R.; Hastie, T. The graphical lasso: New insights and alternatives. Electron. J. Stat. 2012, 6, 2125.
  6. Danaher, P.; Wang, P.; Witten, D.M. The joint graphical lasso for inverse covariance estimation across multiple classes. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2014, 76, 373–397.
  7. Avagyan, V.; Alonso, A.M.; Nogales, F.J. Improving the graphical lasso estimation for the precision matrix through roots of the sample covariance matrix. J. Comput. Graph. Stat. 2017, 26, 865–872.
  8. Warton, D.I. Penalized normal likelihood and ridge regularization of correlation and covariance matrices. J. Am. Stat. Assoc. 2008, 103, 340–349.
  9. Van Wieringen, W.N.; Peeters, C.F. Ridge estimation of inverse covariance matrices from high-dimensional data. Comput. Stat. Data Anal. 2016, 103, 284–303.
  10. Van Wieringen, W.N. The generalized ridge estimator of the inverse covariance matrix. J. Comput. Graph. Stat. 2019, 28, 932–942.
  11. Kuismin, M.; Kemppainen, J.; Sillanpää, M. Precision matrix estimation with ROPE. J. Comput. Graph. Stat. 2017, 26, 682–694.
  12. Rothman, A.J. Positive definite estimators of large covariance matrices. Biometrika 2012, 99, 733–740.
  13. Atchadé, Y.F.; Mazumder, R.; Chen, J. Scalable computation of regularized precision matrices via stochastic optimization. arXiv 2015, arXiv:1509.00426.
  14. Bernardini, D.; Paterlini, S.; Taufer, E. New estimation approaches for graphical models with elastic net penalty. arXiv 2021, arXiv:2102.01053.
  15. Kovács, S.; Ruckstuhl, T.; Obrist, H.; Bühlmann, P. Graphical Elastic Net and Target Matrices: Fast Algorithms and Software for Sparse Precision Matrix Estimation. arXiv 2021, arXiv:2101.02148.
  16. Yuan, M.; Lin, Y. Model selection and estimation in the Gaussian graphical model. Biometrika 2007, 94, 19–35.
  17. Banerjee, O.; El Ghaoui, L.; d'Aspremont, A. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. J. Mach. Learn. Res. 2008, 9, 485–516.
  18. Yuan, M. Efficient computation of ℓ1 regularized estimates in Gaussian graphical models. J. Comput. Graph. Stat. 2008, 17, 809–826.
  19. Guo, J.; Levina, E.; Michailidis, G.; Zhu, J. Joint estimation of multiple graphical models. Biometrika 2011, 98, 1–15.
  20. Shan, L.; Kim, I. Joint estimation of multiple Gaussian graphical models across unbalanced classes. Comput. Stat. Data Anal. 2018, 121, 89–103.
  21. Londschien, M.; Kovács, S.; Bühlmann, P. Change-point detection for graphical models in the presence of missing values. J. Comput. Graph. Stat. 2021, 30, 768–779.
  22. Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
  23. Basnarkov, L.; Stojkoski, V.; Utkovski, Z.; Kocarev, L. Correlation patterns in foreign exchange markets. Phys. A Stat. Mech. Appl. 2019, 525, 1026–1037.
  24. Fenn, D.J.; Porter, M.A.; Mucha, P.J.; McDonald, M.; Williams, S.; Johnson, N.F.; Jones, N.S. Dynamical clustering of exchange rates. Quant. Financ. 2012, 12, 1493–1520.
  25. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2022.
Figure 1. The mean of Kullback–Leibler loss for the different models and methods based on 100 replications.
Figure 2. The mean of $L_2$ loss for the different models and methods based on 100 replications.
Figure 3. The mean of quadratic loss for the different models and methods based on 100 replications.
Figure 4. The mean of spectral norm loss for the different models and methods based on 100 replications.
Figure 5. Estimated currency network based on 2018 to 2021 foreign exchange markets data.
Figure 6. Estimated currency network based on 2018 foreign exchange markets data.
Figure 7. Estimated currency network based on 2019 foreign exchange markets data.
Figure 8. Estimated currency network based on 2020 foreign exchange markets data.
Figure 9. Estimated currency network based on 2021 foreign exchange markets data.
Table 1. List of abbreviations of the considered currencies.

AUD: Australian dollar; BRL: Brazilian real; CAD: Canadian dollar; CHF: Swiss franc; CNY: Chinese renminbi; CZK: Czech koruna; DKK: Danish krone; EUR: Euro; GBP: British pound; HKD: Hong Kong dollar; HRK: Croatian kuna; HUF: Hungarian forint; IDR: Indonesian rupiah; INR: Indian rupee; ISK: Icelandic krona; JPY: Japanese yen; MXN: Mexican peso; MYR: Malaysian ringgit; NOK: Norwegian krone; NZD: New Zealand dollar; PEN: Peruvian nuevo sol; PHP: Philippine peso; PLN: Polish zloty; RUB: Russian ruble; SAR: Saudi Arabian riyal; SEK: Swedish krona; SGD: Singapore dollar; THB: Thai baht; TRY: Turkish lira; TWD: Taiwanese dollar; USD: US dollar; ZAR: South African rand.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

