Article

Canonical Concordance Correlation Analysis

Independent Researcher, Minneapolis, MN 55305, USA
Mathematics 2023, 11(1), 99; https://doi.org/10.3390/math11010099
Submission received: 24 November 2022 / Revised: 14 December 2022 / Accepted: 21 December 2022 / Published: 26 December 2022

Abstract

A multivariate technique named Canonical Concordance Correlation Analysis (CCCA) is introduced. In contrast to the classical Canonical Correlation Analysis (CCA), which is based on maximization of the Pearson correlation coefficient between linear combinations of two sets of variables, CCCA maximizes Lin's concordance correlation coefficient, which accounts not just for the maximum correlation but also for the closeness of the aggregates' mean values and the closeness of their variances. While CCA employs centered data with the means of the variables excluded, CCCA can be understood as a more comprehensive characteristic of similarity, or agreement, between two data sets, measured simultaneously by the distance of their mean values and the distance of their variances, together with the maximum possible correlation between the aggregates of the variables in the sets. The CCCA is expressed as a generalized eigenproblem which reduces to the regular CCA if the means of the aggregates are equal, but for different means it yields a solution different from that of CCA. The properties and applications of this type of multivariate analysis are described. The CCCA approach can be useful for solving various applied statistical problems when closeness of the aggregated means and variances, together with the maximum canonical correlations, is needed for a general agreement between two data sets.

1. Introduction

Multivariate statistical analysis for two sets of variables is usually performed by Canonical Correlation Analysis (CCA), a tool for finding the parameters that aggregate each set into a total score so that the two scores are maximally correlated. The CCA was proposed by H. Hotelling [1] for studying correlations between two aggregated variables and has been extended in multiple works, including generalizations to nonlinear analysis and to several data sets [2,3,4,5]. Its applications are known in various fields, from management and information systems to machine learning and biometrics [6,7,8,9,10,11,12,13]. The classical CCA is based on the maximization of the Pearson correlation between the aggregates of two sets of variables.
In place of the Pearson correlation, a related measure of agreement can be used, namely Lin's concordance correlation coefficient [14], which accounts not only for the correlation but also for the closeness of the variables' mean values. Various features of Lin's coefficient are discussed in [15], including measuring agreement among raters, instruments, predictors, and intra-class correlations (ibid., with the references within). Results obtained with Lin's coefficient define the best linear predictor under the restriction that its variance equals the variance of the dependent variable itself [15,16], which also corresponds to the so-called diagonal, or geometric mean, regression [17].
The current paper introduces Canonical Concordance Correlation Analysis (CCCA), a multivariate alternative to CCA which maximizes Lin's concordance correlation coefficient between the aggregates of two data sets. It simultaneously takes into account not just the correlation but also the closeness of the aggregates' mean values and the closeness of their variances. The regular CCA employs centered data with the means of the variables excluded, but the CCCA uses a more comprehensive characteristic of closeness, similarity, or agreement between data sets. It incorporates the maximum possible correlation between the aggregates of the variables in the data sets with optimization by the minimum distance between their mean values and the minimum distance between their variances.
The CCCA can be expressed as a generalized eigenproblem for the matrices of variances and covariances of the data sets, in a special form of block matrices and outer products of the mean vectors, for the combined vector of loadings of the variables in both data sets. The CCCA reduces to the regular CCA if the means of the aggregates are equal; otherwise the CCCA solution differs from the corresponding CCA solution. In particular, it is shown that, in contrast to the regular CCA with only positive correlation values, the CCCA can also produce negative concordance correlations with optimal features, which extends the properties of the canonical correlations. The properties and applications of this type of multivariate analysis for sample data are considered in this work.
The paper is organized as follows. The regular CCA and its modification into CCCA are described in Section 2 and Section 3, a numerical example and comparisons are given in Section 4, and Section 5 summarizes the findings. Appendix A contains additional mathematical results.

2. Pair Correlation and Canonical Correlations

Consider two sampled data sets presented as matrices X and Y of the orders N × n and N × m, where N is the number of observations, and n and m are the numbers of variables $x_j$ (j = 1, 2, …, n) and $y_k$ (k = 1, 2, …, m), or columns of X and Y, respectively. The mean values (denoted by bars) of these variables, $\bar{x}_j$ and $\bar{y}_k$, can be assembled as the elements of the column vectors $\bar{x}$ and $\bar{y}$ of the n-th and m-th order, respectively. Let the matrices X and Y contain the centered data, so the centered second moments (denoted by C), or sample variances and covariances of these variables, can be presented in matrix form as follows:
$$\mathrm{var}(x) = C_{xx} = \frac{1}{N}X'X, \qquad \mathrm{var}(y) = C_{yy} = \frac{1}{N}Y'Y, \qquad \mathrm{cov}(x,y) = C_{xy} = \frac{1}{N}X'Y, \tag{1}$$
where the prime indicates transposition. The $C_{xx}$ and $C_{yy}$ are square matrices of the n-th and m-th order, while $C_{xy}$ and its transpose $C_{yx}$ (which equals $\frac{1}{N}Y'X$) are rectangular matrices of order n by m and m by n, respectively.
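As a concrete illustration, here is a minimal numpy sketch of the second-moment matrices in (1); the data arrays, their sizes, and the random seed are hypothetical placeholders, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, m = 50, 2, 3                      # observations, numbers of x and y variables
X_raw = rng.normal(size=(N, n))         # hypothetical first data set
Y_raw = rng.normal(size=(N, m))         # hypothetical second data set

x_bar, y_bar = X_raw.mean(axis=0), Y_raw.mean(axis=0)   # vectors of the means
X, Y = X_raw - x_bar, Y_raw - y_bar                      # centered data matrices

# Second-moment (variance-covariance) matrices of Equation (1)
Cxx = X.T @ X / N                       # n x n
Cyy = Y.T @ Y / N                       # m x m
Cxy = X.T @ Y / N                       # n x m
Cyx = Cxy.T                             # m x n, equals (1/N) Y'X
```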
Let us briefly describe some main formulae of the canonical correlation analysis, or CCA. The aggregate scores of each data set are:
$$v = Xa, \qquad w = Yb, \tag{2}$$
where v and w are column vectors of the N-th order, and a and b are column vectors of parameters of order n and m, respectively. The Pearson correlation between the two variables (2) can be expressed via their covariance and variances as
$$\rho = \frac{\mathrm{cov}(v,w)}{\sqrt{\mathrm{var}(v)\,\mathrm{var}(w)}}. \tag{3}$$
The canonical correlation of two data sets, or the pair correlation of their scores v and w, measures the connection between these sets. With definitions (1) and (2), the expression (3) equals:
$$\rho = \frac{\frac{1}{N}v'w}{\sqrt{\left(\frac{1}{N}v'v\right)\left(\frac{1}{N}w'w\right)}} = \frac{a'\left(\frac{1}{N}X'Y\right)b}{\sqrt{\left(a'\left(\frac{1}{N}X'X\right)a\right)\left(b'\left(\frac{1}{N}Y'Y\right)b\right)}} = \frac{a'C_{xy}b}{\sqrt{(a'C_{xx}a)(b'C_{yy}b)}}. \tag{4}$$
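Continuing the sketch above, the pair correlation (4) of the aggregate scores (2) can be evaluated for arbitrary loading vectors directly from the second-moment matrices:

```python
def pair_correlation(a, b, Cxx, Cyy, Cxy):
    """Pearson correlation (4) of the aggregate scores v = Xa and w = Yb."""
    return (a @ Cxy @ b) / np.sqrt((a @ Cxx @ a) * (b @ Cyy @ b))

# Example with arbitrary (non-optimal) loading vectors
a0, b0 = np.ones(n), np.ones(m)
print(pair_correlation(a0, b0, Cxx, Cyy, Cxy))
```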
CCA calculates the vectors a and b by maximizing the correlation (4), which can be achieved using the conditional objective
$$\rho = a'C_{xy}b - \frac{\lambda}{2}\left(a'C_{xx}a - 1\right) - \frac{\mu}{2}\left(b'C_{yy}b - 1\right), \tag{5}$$
where λ and μ are Lagrange multipliers. Differentiating ρ (5) with respect to the vectors a and b and equating to zero yields the system of equations:
$$C_{xy}b = \lambda C_{xx}a, \qquad C_{yx}a = \mu C_{yy}b. \tag{6}$$
Multiplying the first and second equations in (6) by the row vectors a′ and b′, respectively, and using the conditions in (5) yields the equality of the Lagrange multipliers to one another and to the Pearson coefficient of correlation,
$$\lambda = \mu = a'C_{xy}b = \rho. \tag{7}$$
This means that the two normalizing conditions used in (5) can be reduced to one combined condition on the sum of both quadratic forms of the variances from (4). Some other normalizing conditions are described in Appendix A.
Solving the second Equation (6) for the vector b and substituting it into the first equation, and also solving the first Equation (6) for the vector a and substituting it into the second one, leads to the expressions:
$$C_{xx}^{-1}C_{xy}C_{yy}^{-1}C_{yx}\,a = \lambda^{2} a, \qquad C_{yy}^{-1}C_{yx}C_{xx}^{-1}C_{xy}\,b = \lambda^{2} b. \tag{8}$$
The eigenproblems (8) present the classical CCA solution as the eigenvalues for the squared canonical correlations $\lambda^2 = \rho^2$ and the corresponding pairs of eigenvectors a and b. The eigenproblems (8) yield all canonical correlations and the corresponding pairs of eigenvectors a and b for the first, second, third, etc., canonical variables, whose number equals the minimum rank of the data matrices X and Y and corresponds to the solutions with nonzero $\lambda^2$. It is sufficient to solve only one of the eigenproblems in (8), the one of lesser order; then the dual vectors can be obtained from one of the linear systems (6). For example, if there are ten x variables and twenty y variables, we can solve the first eigenproblem (8) for the matrix of the 10-th order, and then, for each $\lambda_j$ and $a_j$ (j = 1, 2, …, 10), solve the second linear system in (6) for the dual vectors $b_j = \lambda_j^{-1}C_{yy}^{-1}C_{yx}a_j$.
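A sketch of this computation with numpy/scipy, continuing the hypothetical example above; it assumes n ≤ m and nonzero canonical correlations, solves the smaller eigenproblem in (8), and recovers the dual vectors from (6):

```python
from scipy import linalg

# First eigenproblem in (8): Cxx^{-1} Cxy Cyy^{-1} Cyx a = lambda^2 a
M = linalg.solve(Cxx, Cxy) @ linalg.solve(Cyy, Cyx)
lam2, A_cca = linalg.eig(M)
lam2, A_cca = lam2.real, A_cca.real
order = np.argsort(-lam2)                   # sort by decreasing squared correlation
lam2, A_cca = lam2[order], A_cca[:, order]
rho = np.sqrt(np.clip(lam2, 0.0, None))     # canonical correlations

# Dual vectors from the second equation in (6): b_j = lambda_j^{-1} Cyy^{-1} Cyx a_j
B_cca = linalg.solve(Cyy, Cyx @ A_cca) / rho
```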
Equations (6) and (7) can be presented via one generalized eigenproblem for the extended block-matrices:
$$\begin{pmatrix} 0_{nn} & C_{xy} \\ C_{yx} & 0_{mm} \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \lambda \begin{pmatrix} C_{xx} & 0_{nm} \\ 0_{mn} & C_{yy} \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix}, \tag{9}$$
where $0_{nn}$ is the zero matrix of the n-th order, and similar notations are used for the orders of the other zero matrices. In place of the two eigenproblems (8), there is here one generalized eigenproblem (9) for square matrices of order n + m. From the combined eigenvector in (9) we can extract two sub-vectors a and b which, after normalizing, coincide with the eigenvectors obtained from the two separate problems (8).
It is important to note the following. Suppose the minimum rank of the matrices in (8) equals n; then the first eigenproblem in (8) yields n positive eigenvalues $\lambda^2$, while the second eigenproblem in (8), among its m eigenvalues, has the same n positive eigenvalues $\lambda^2$ plus m − n eigenvalues equal to zero. The generalized eigenproblem (9), among its eigenvalues $\lambda$, has 2n values of opposite signs in pairs $\lambda = \pm\sqrt{\lambda^2}$ (they correspond to the singular values in the SVD, or singular value decomposition), plus m − n zero eigenvalues. The eigenvectors corresponding to these paired eigenvalues in the eigenproblem (9) are the same in one part, a or b, of the total vector, but of opposite sign in the other part, b or a, respectively. This produces pairs of canonical correlations (4) of opposite values.
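The combined problem (9) can be handed directly to a generalized eigensolver; a sketch with scipy, continuing the example above, where the eigenvalues come back in pairs of opposite signs plus zeros, as just described:

```python
# Generalized eigenproblem (9) for the combined loading vector (a, b)
LHS = np.block([[np.zeros((n, n)), Cxy],
                [Cyx,              np.zeros((m, m))]])
RHS_cca = np.block([[Cxx,              np.zeros((n, m))],
                    [np.zeros((m, n)), Cyy]])
lam, V = linalg.eig(LHS, RHS_cca)
lam, V = lam.real, V.real
A_all, B_all = V[:n, :], V[n:, :]       # split each combined eigenvector into a and b
print(np.round(np.sort(lam)[::-1], 4))  # paired +/- canonical correlations and zeros
```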

3. Concordance Correlation Coefficient and Canonical Concordance Correlation Analysis

Consider another measure of the relation between two variables, Lin's concordance correlation coefficient $\rho_c$, which can be expressed as follows:
$$\rho_c = \frac{2\rho\sqrt{\mathrm{var}(v)\,\mathrm{var}(w)}}{\mathrm{var}(v) + \mathrm{var}(w) + [E(v) - E(w)]^2} = \frac{2\,\mathrm{cov}(v,w)}{\mathrm{var}(v) + \mathrm{var}(w) + [E(v) - E(w)]^2}, \tag{10}$$
where ρ is the Pearson correlation between the two variables (2), and E(v) − E(w) is the difference of the expectations at the population level, or of the sample means of these variables at the sample level. Using (10) in place of (3), we can repeat the derivation (4)–(9) for the proposed Canonical Concordance Correlation Analysis, or CCCA. The analogue of expression (4) for the measure of agreement (10) of the scores (2) is as follows:
$$\rho_c = \frac{2\left(\frac{1}{N}v'w\right)}{\left(\frac{1}{N}v'v\right) + \left(\frac{1}{N}w'w\right) + [a'\bar{x} - b'\bar{y}]^2} = \frac{2a'\left(\frac{1}{N}X'Y\right)b}{a'\left(\frac{1}{N}X'X\right)a + b'\left(\frac{1}{N}Y'Y\right)b + [a'\bar{x} - b'\bar{y}]^2} = \frac{2a'C_{xy}b}{a'C_{xx}a + b'C_{yy}b + [a'\bar{x} - b'\bar{y}]^2}, \tag{11}$$
where $\bar{x}$ and $\bar{y}$ are the column vectors of the sample means, and the matrices of sample second moments are the same as in relations (1).
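For given loading vectors, the coefficient (11) is again a simple function of the second moments and the mean vectors; a small helper continuing the earlier sketches:

```python
def concordance_correlation(a, b, Cxx, Cyy, Cxy, x_bar, y_bar):
    """Lin's concordance correlation (11) of the aggregate scores v = Xa and w = Yb."""
    mean_diff = a @ x_bar - b @ y_bar
    return 2.0 * (a @ Cxy @ b) / (a @ Cxx @ a + b @ Cyy @ b + mean_diff ** 2)
```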
CCCA calculates the vectors a and b by maximizing the concordance correlation coefficient (11), which is a ratio of a bilinear form to a quadratic form and can be optimized, similarly to (5), by the conditional objective of the numerator subject to the unit value of the denominator:
$$\rho_c = 2a'C_{xy}b - \lambda\left(a'C_{xx}a + b'C_{yy}b + [a'\bar{x} - b'\bar{y}]^2 - 1\right). \tag{12}$$
In contrast to the regular canonical correlation (4), which contains the product of two variances and two normalizing conditions (although, due to (7), reducible to one combined condition with the sum of the variances), the additive terms in the denominator of (11) can be restricted by just one condition, used in (12). This is possible because we can vary the parameters of the aggregating vectors a and b to achieve this aim.
Setting the partial derivatives of the objective (12) with respect to the vectors a and b to zero defines the system of equations for finding these vectors:
$$\begin{cases} \dfrac{\partial\rho_c}{\partial a} = 2C_{xy}b - 2\lambda\left(C_{xx}a + (a'\bar{x} - b'\bar{y})\,\bar{x}\right) = 0 \\[2mm] \dfrac{\partial\rho_c}{\partial b} = 2C_{yx}a - 2\lambda\left(C_{yy}b - (a'\bar{x} - b'\bar{y})\,\bar{y}\right) = 0 \end{cases}. \tag{13}$$
This system can be rearranged as follows:
$$\begin{cases} C_{xy}b = \lambda\left(C_{xx}a + (\bar{x}\bar{x}')a - (\bar{x}\bar{y}')b\right) \\ C_{yx}a = \lambda\left(C_{yy}b + (\bar{y}\bar{y}')b - (\bar{y}\bar{x}')a\right) \end{cases}, \tag{14}$$
where $(\bar{x}\bar{x}')$ is the outer product of the x-means vector with itself, so it is a square matrix of the n-th order. The $(\bar{x}\bar{y}')$ is the outer product of the x-means vector with the y-means vector, so it is a rectangular matrix of order n by m, and the other outer products in (14) are constructed similarly.
Solving one equation from (14) for one of the vectors and substituting it into the other equation generates eigenproblems quadratic in λ, which are difficult to solve in practice. However, it is possible to present the system (14) as one generalized eigenproblem for the extended block matrices:
$$\begin{pmatrix} 0_{nn} & C_{xy} \\ C_{yx} & 0_{mm} \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \lambda \begin{pmatrix} C_{xx} + \bar{x}\bar{x}' & -\bar{x}\bar{y}' \\ -\bar{y}\bar{x}' & C_{yy} + \bar{y}\bar{y}' \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix}. \tag{15}$$
Similar to the CCA solution (9), the CCCA solution can be given by the one generalized eigenproblem (15) for the square block matrices of order n + m. Separating the obtained combined eigenvectors in (15) into two sub-vectors a and b and normalizing each of them yields the eigenvectors of the Canonical Concordance Correlation Analysis.
The matrix on the left-hand side is the same in both problems (9) and (15). If all variables have zero mean values then the right-hand side matrix in (15) reduces to the right-hand side matrix in (9), so the CCCA reduces to the classical CCA solution. Similar to the feature of the generalized eigenproblem (9) discussed above, the eigenproblem (15) has n positive and n negative eigenvalues, and the remaining m − n values equal zero. Due to the structure of the corresponding eigenvectors, the values of the concordance canonical coefficients (11) appear in pairs of opposite signs, although, in contrast to the model (9), not equal in absolute value.
Multiplying the relation (15) from the left by the row vector (a′, b′) of the obtained eigenvector and taking into account the normalizing condition in (12) yields the equality
$$2a'C_{xy}b = \lambda\left(a'C_{xx}a + b'C_{yy}b + [a'\bar{x} - b'\bar{y}]^2\right) = \lambda = \rho_c. \tag{16}$$
Therefore, the eigenvalues λ in (15) define the spectrum of the optimal concordance correlation coefficients $\rho_c$ in the CCCA approach.
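A sketch of the CCCA computation (15), continuing the earlier examples and reusing the left-hand block matrix from (9); by (16) the generalized eigenvalues are themselves the concordance correlations, which the last lines cross-check against the direct formula (11):

```python
# Right-hand-side matrix of (15): block variances plus outer products of the means
RHS_ccca = np.block([[Cxx + np.outer(x_bar, x_bar), -np.outer(x_bar, y_bar)],
                     [-np.outer(y_bar, x_bar),       Cyy + np.outer(y_bar, y_bar)]])
lam_c, Vc = linalg.eig(LHS, RHS_ccca)
lam_c, Vc = lam_c.real, Vc.real                 # eigenvalues are the rho_c of (16)
a_c, b_c = Vc[:n, :], Vc[n:, :]

j = int(np.argmax(lam_c))                       # the largest concordance correlation
print(lam_c[j],
      concordance_correlation(a_c[:, j], b_c[:, j], Cxx, Cyy, Cxy, x_bar, y_bar))
```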
The generalized matrix problem can be reduced to a regular eigenproblem by first solving the eigenproblem of the matrix on the right-hand side of (15), then finding the square root of this matrix, and using it to transform the generalized problem into the regular eigenproblem of a symmetric matrix. The eigenproblem solution of a matrix with the added outer product of a vector, as in (15), is related to the eigenproblem for the matrix itself [18,19,20]. Most modern software packages can solve an eigenproblem of a non-symmetric matrix, so a simple way to reduce the generalized problem (15) to a regular eigenproblem is to invert the matrix on the right-hand side and multiply it by the matrix on the left-hand side of (15); this is presented in analytical form in Appendix A.
In contrast to Lin's pairwise concordance correlation coefficient, which is not a scale-invariant measure because it depends on the units of the variables, the CCCA is scale invariant because the parameters of the vectors a and b transform all the variables' units into the same comparable units of the combined variables (2).
Let us consider the difference between the pair correlation and Lin's concordance correlation coefficient $\rho_c$ (10):
$$\rho - \rho_c = \rho\left(1 - \frac{2\sqrt{\mathrm{var}(v)\,\mathrm{var}(w)}}{\mathrm{var}(v) + \mathrm{var}(w) + [E(v) - E(w)]^2}\right) = \rho\left(\frac{\left(\sqrt{\mathrm{var}(v)} - \sqrt{\mathrm{var}(w)}\right)^2 + [E(v) - E(w)]^2}{\mathrm{var}(v) + \mathrm{var}(w) + [E(v) - E(w)]^2}\right). \tag{17}$$
The expression in parentheses on the right-hand side of (17) is always non-negative. Then, for $\rho \geq 0$ the left-hand side is $\rho - \rho_c \geq 0$, so $\rho \geq \rho_c$; thus, the pair correlation is not smaller than the concordance coefficient. In the opposite case, for $\rho < 0$, we have $\rho - \rho_c \leq 0$, so $\rho \leq \rho_c$. Therefore, the Pearson pair correlation is always at least as large as Lin's concordance coefficient in absolute value:
$$|\rho| \geq |\rho_c|. \tag{18}$$
This relation holds for the canonical correlations as well. Only in the situation of equal means and equal variances can the concordance correlation (10) reach the pair correlation of the variables (2), $\rho_c = \rho$.
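A quick numeric check of (18) on the hypothetical data of the earlier sketches, for random loading vectors:

```python
# |rho| >= |rho_c| must hold for any loading vectors a and b
for _ in range(5):
    a_t, b_t = rng.normal(size=n), rng.normal(size=m)
    r = pair_correlation(a_t, b_t, Cxx, Cyy, Cxy)
    rc = concordance_correlation(a_t, b_t, Cxx, Cyy, Cxy, x_bar, y_bar)
    assert abs(r) >= abs(rc) - 1e-12
```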

4. Numerical Example

For a numerical illustration, let us consider the dataset LifeCycleSavings for 50 countries, used in [8] and available in the R statistical package for an example of canonical correlation analysis performed by the function cancor. In this demographic-financial problem, the first matrix X contains two variables: x1, the percentage of population under 15 years old (pop15), and x2, the percentage of population over 75 years old (pop75). The second matrix Y consists of three variables: y1, the saving ratio (sr), given by the aggregate personal saving divided by the disposable income; y2, the real per-capita disposable income (dpi); and y3, the percentage rate of change in per-capita disposable income (ddpi). These characteristics were averaged over a decade to remove the business cycle and other short-term fluctuations, as described in [21]. Table 1 shows the mean values and standard deviations (std) of these data.
The matrices of variances and covariances (1) for these data are:
$$C_{xx} = \begin{pmatrix} 82.079 & -10.517 \\ -10.517 & 1.633 \end{pmatrix}, \qquad C_{yy} = \begin{pmatrix} 19.673 & 958.717 & 3.841 \\ 958.717 & 962184.7 & -360.849 \\ 3.841 & -360.849 & 8.071 \end{pmatrix},$$
$$C_{xy} = \begin{pmatrix} -18.305 & -6720.091 & -1.231 \\ 1.794 & 986.429 & 0.0919 \end{pmatrix}.$$
With these matrices, the CCA (9) solution was obtained. The same matrices and the mean values from Table 1 yield the CCCA (15) solution. Each combined vector (a, b) is normalized so that the total of its squared elements equals one. The rank of the matrices equals the minimum size r = min(n, m) = min(2, 3) = 2, so there are two main canonical pairs of vectors for CCA and for CCCA. Table 2 presents the CCA results obtained by the combined eigenproblem (9).
For the CCA solution (9), there are two positive canonical correlations, $\rho_1 = 0.8248$ and $\rho_2 = 0.3653$, and two negative ones of the same absolute value, $\rho_3 = -0.8248$ and $\rho_4 = -0.3653$, shown in the first two and the last two columns of Table 2, respectively. The first two solutions coincide (up to normalization) with the solutions which can be obtained from the classical CCA eigenproblems (8). We can see from Table 2 that the CCA vectors satisfy a3 = −a1 and a4 = −a2, which explains the opposite signs of the CCA correlations (3), $\rho_3 = -\rho_1$ and $\rho_4 = -\rho_2$, of the aggregated scores (2) with one vector a of opposite sign.
With each pair of vectors a and b, we can additionally estimate the corresponding values of the concordance correlation (11); those are shown in the last row of Table 2: $\rho_{c1} = 0.1358$ and $\rho_{c2} = 0.0034$, and also $\rho_{c3} = -0.8014$ and $\rho_{c4} = -0.0051$. These values correspond to the inequality (18). Three of these $\rho_c$ are close to zero, but one of them, $\rho_{c3}$, is large in absolute value. Thus, for the generalized solution (9), it is possible to find a concordance correlation solution which is close to the classical CCA (i.e., −0.8014 vs. −0.8248). This specific solution with the negative correlation identifies the vectors a3 and b3 with which we reach the best agreement between the two data sets measured by Lin's concordance correlation, that is, with a high pair correlation and close mean values of the aggregate variables.
Table 3 is organized similarly to Table 2 and presents the CCCA results obtained by the combined eigenproblem (15).
Table 3 contains four pairs of the dual vectors a and b in its columns, with two positive and two negative concordance correlations $\rho_c$, all different in absolute value, as shown in the last row. For each pair of vectors a and b, the additional regular pair correlations ρ (4) of the aggregate variables were calculated; they are presented in the second row from the bottom. Again, the inequality (18) is satisfied for each solution. The first solution (a1, b1) yields a high canonical concordance correlation $\rho_{c1} = 0.8080$ which is very close to the regular pair correlation of the scores, $\rho_1 = 0.8082$. The third solution is even better, with the correlations practically equal in absolute value (with more precision, the values are $\rho_{c3} = -0.824679$ and $\rho_3 = -0.824681$, so they differ only by 0.000002). Therefore, the solution (a3, b3) with the negative relation of the scores identifies the vectors for the best agreement between the two data sets measured by Lin's concordance correlation and by the regular correlation, with close mean values of the aggregate variables, hence a good agreement of the data in the two sets.
Comparison between the CCA and CCCA vectors in Table 2 and Table 3 shows the following. The square root of the sum of squares (SRSS) of the differences of the solutions given in the columns of these tables equals, respectively, 0.2193, 1.1009, 0.0403, and 1.9343, so the best similarity occurs for the third dual vectors (a3, b3). The pair correlations of the vector solutions in each column of Table 2 and Table 3 are, respectively, 0.9693, 0.2618, 0.9997, and −0.8204, so the best correspondence in absolute value between the CCA and CCCA solutions is reached for the third pair of vectors. They can be used both as the best CCA solution and the best CCCA solution. It is important to note that this optimal solution corresponds to the negative canonical correlation in Table 2 which can be obtained via the combined eigenproblem (9), while the classical CCA eigenproblems (8) produce only positive eigenvalues. Concerning the eigenvalues, the CCA in Table 2 yields the maximum possible canonical correlations, but the additionally estimated concordance correlations are very low, with the exception of the third solution. The eigenvalues of the CCCA in Table 3 are all higher than the concordance coefficients in the previous table, and the related estimates of the correlations are high, especially for the first and third solutions.
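A sketch of how this example could be reproduced end to end; loading the R dataset through statsmodels' get_rdataset is an assumption (it fetches LifeCycleSavings from the Rdatasets repository and needs internet access), and the printed eigenvalues should recover the correlations of Tables 2 and 3 up to rounding.

```python
import numpy as np
from scipy import linalg
import statsmodels.api as sm

# LifeCycleSavings: 50 countries, columns sr, pop15, pop75, dpi, ddpi
lcs = sm.datasets.get_rdataset("LifeCycleSavings").data
X_raw = lcs[["pop15", "pop75"]].to_numpy()          # x1, x2
Y_raw = lcs[["sr", "dpi", "ddpi"]].to_numpy()       # y1, y2, y3
N, n, m = X_raw.shape[0], 2, 3

x_bar, y_bar = X_raw.mean(axis=0), Y_raw.mean(axis=0)
X, Y = X_raw - x_bar, Y_raw - y_bar
Cxx, Cyy, Cxy = X.T @ X / N, Y.T @ Y / N, X.T @ Y / N

LHS = np.block([[np.zeros((n, n)), Cxy], [Cxy.T, np.zeros((m, m))]])
RHS_cca = np.block([[Cxx, np.zeros((n, m))], [np.zeros((m, n)), Cyy]])
RHS_ccca = np.block([[Cxx + np.outer(x_bar, x_bar), -np.outer(x_bar, y_bar)],
                     [-np.outer(y_bar, x_bar), Cyy + np.outer(y_bar, y_bar)]])

for name, rhs in [("CCA (9)", RHS_cca), ("CCCA (15)", RHS_ccca)]:
    lam, V = linalg.eig(LHS, rhs)
    print(name, np.round(np.sort(lam.real)[::-1], 4))
```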

5. Summary

The paper introduced a multivariate statistical method, Canonical Concordance Correlation Analysis, or CCCA, and compared it with the related Canonical Correlation Analysis, or CCA. The classical CCA is based on the maximization of the Pearson correlation between the aggregates of two sets of variables. In place of the Pearson correlation, the related measure of Lin's concordance correlation coefficient, which accounts both for the correlation and for the closeness of the variables' mean values, was adopted. Both the CCA and CCCA methods are presented in a unified framework of generalized eigenproblems for the combined vectors of loadings of the two sets of variables. The CCCA eigenproblem reduces to the CCA eigenproblem if the means of the aggregates become equal.
The properties and applications of this type of multivariate analysis for sample data are considered. It is shown that the combined generalized eigenproblem of the CCA method is preferable for finding all positive and negative eigenvalues defining the canonical correlations of both signs. The closeness of the solutions produced by the CCA and CCCA can serve as an indication of the optimal vectors of loadings which yield high correlations of the aggregate scores together with close mean values of the aggregates, and thus the best agreement between the two data sets. Applying CCA, we get the best possible canonical correlations but not necessarily a high level of agreement measured by the concordance correlations. However, applying the CCCA, we reach the maximum possible agreement given by the canonical concordance correlations, together with a high level of correlation among the aggregated variables of the two data sets.
The proposed type of multivariate analysis can serve in various practical research projects operating with two data sets, particularly for data fusion [22] and for other problems [15] which require finding a maximum agreement, similarity, or closeness between variables. Future studies on canonical concordance correlation analysis can extend this method to robust solutions, to evaluations for three and more data sets, to information presented in multidimensional arrays or matrices with three and more entries, and to more adequate treatment of the complex data of modern science and applications.

Funding

This research received no external funding.

Data Availability Statement

The data sources are given in the numerical example section.

Acknowledgments

I am very grateful to George Luta from Georgetown University for proposing this problem and for helping to elaborate and improve the results. I am also thankful to the four anonymous reviewers for their valuable comments and suggestions refining the paper.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Additional Mathematical Results

Due to the property (7) of equal Lagrange multipliers, λ = μ, the objective (5) can be reduced to one combined condition on the sum of both quadratic forms of the variances:
$$\rho = a'C_{xy}b - \frac{\lambda}{2}\left(a'C_{xx}a + b'C_{yy}b - 2\right). \tag{A1}$$
The objective (A1) corresponds to maximizing the following quotient of the bilinear to quadratic forms:
$$\rho = \frac{2a'C_{xy}b}{a'C_{xx}a + b'C_{yy}b}. \tag{A2}$$
This means that the classical CCA solution can be obtained not only from maximizing the correlation coefficient (4), but also from the objective (A2) of the doubled covariance to the sum of the variances.
Instead of maximizing the ratio of forms in (A2), we can also optimize their difference. Consider the least squares (LS) objective of difference of the aggregated variables scores (2):
$$LS = \frac{1}{N}\|v - w\|^2 = \frac{1}{N}(Xa - Yb)'(Xa - Yb) = \frac{1}{N}a'X'Xa + \frac{1}{N}b'Y'Yb - \frac{2}{N}a'X'Yb = a'C_{xx}a + b'C_{yy}b - 2a'C_{xy}b. \tag{A3}$$
Setting the derivatives with respect to the vectors a and b to zero yields the homogeneous equations
$$C_{xy}b = C_{xx}a, \qquad C_{yx}a = C_{yy}b, \tag{A4}$$
which essentially are the same relations (6) but without the normalizing terms. Solving Equation (A4) for one vector via another one,
$$a = C_{xx}^{-1}C_{xy}b, \qquad b = C_{yy}^{-1}C_{yx}a, \tag{A5}$$
and substituting a from the first equation into the second one, and vice versa, leads to the system (8) expressed as the eigenproblems. The eigenvalues can be taken equal to one if the eigenvectors are normalized by the square roots of their eigenvalues. Thus, minimizing the LS objective (A3) corresponds to maximizing the correlation objective (4). The absence of the normalizing conditions in (A3) means that the obtained solutions (A5) can be considered local extrema, while adding the normalizing conditions (5) and (6) makes them global extrema.
Comparing the objective (A2) with the CCCA objective (11), we see that they differ only by the term of the squared difference of the aggregated means in the denominator. Then, similarly to using the difference of the scores in place of their ratio in (A2) and (A3), instead of the concordance correlation ratio (11) we can use the corresponding difference:
$$LS = \frac{1}{N}\|v - w\|^2 + [E(v) - E(w)]^2 = \frac{1}{N}(Xa - Yb)'(Xa - Yb) + [a'\bar{x} - b'\bar{y}]^2 = \frac{1}{N}a'X'Xa + \frac{1}{N}b'Y'Yb - \frac{2}{N}a'X'Yb + \left(a'\bar{x}\bar{x}'a + b'\bar{y}\bar{y}'b - 2a'\bar{x}\bar{y}'b\right) = a'(C_{xx} + \bar{x}\bar{x}')a + b'(C_{yy} + \bar{y}\bar{y}')b - 2a'(C_{xy} + \bar{x}\bar{y}')b. \tag{A6}$$
The variance-covariance matrices with the added products of the mean values of the corresponding variables define the second-moment matrices of the non-centered data. Thus, using the minimization criterion (A6) for the difference of the scores corresponds to using the non-centered data, or to maximization of the cosine between the aggregate scores. Some other normalizing conditions for the optimization of the objectives were applied in [3,4,23].
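A small check of this identity, continuing the earlier sketches: the centered second moments plus the outer products of the means equal the second moments of the non-centered data.

```python
# Non-centered second moments equal the centered ones plus outer products of means
np.testing.assert_allclose(X_raw.T @ X_raw / N, Cxx + np.outer(x_bar, x_bar))
np.testing.assert_allclose(X_raw.T @ Y_raw / N, Cxy + np.outer(x_bar, y_bar))
```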
Let us also consider the matrix normalizations corresponding to the standardization of the variables, useful for more precise computations. The variance-covariance matrices in CCA (9) or CCCA (15) can contain very different values, depending on the units of the variables, which can lead to numerical problems. To make all the elements of similar magnitude, it is advisable to transform the variables to standardized values, i.e., centered and divided by the standard deviations. For example, the problem (15) can be represented as follows:
$$D^{-1/2}\begin{pmatrix} 0_{nn} & C_{xy} \\ C_{yx} & 0_{mm} \end{pmatrix}D^{-1/2}\,D^{1/2}\begin{pmatrix} a \\ b \end{pmatrix} = \lambda\, D^{-1/2}\begin{pmatrix} C_{xx} + \bar{x}\bar{x}' & -\bar{x}\bar{y}' \\ -\bar{y}\bar{x}' & C_{yy} + \bar{y}\bar{y}' \end{pmatrix}D^{-1/2}\,D^{1/2}\begin{pmatrix} a \\ b \end{pmatrix}, \tag{A7}$$
where the square root of the diagonal matrix D is defined by the diagonal elements of the variance matrices:
$$D^{1/2} = \begin{pmatrix} (\mathrm{diag}(C_{xx}))^{1/2} & 0_{nm} \\ 0_{mn} & (\mathrm{diag}(C_{yy}))^{1/2} \end{pmatrix}, \tag{A8}$$
and $D^{-1/2}$ consists of the reciprocal elements. Then, the problem (A7) can be rewritten as
$$\begin{pmatrix} 0_{nn} & R_{xy} \\ R_{yx} & 0_{mm} \end{pmatrix}\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \lambda \begin{pmatrix} R_{xx} + m_x m_x' & -m_x m_y' \\ -m_y m_x' & R_{yy} + m_y m_y' \end{pmatrix}\begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \tag{A9}$$
with matrices of correlations and normalized mean vectors constructed by the following patterns:
$$R_{xx} = (\mathrm{diag}(C_{xx}))^{-1/2}\, C_{xx}\, (\mathrm{diag}(C_{xx}))^{-1/2}, \qquad m_x = (\mathrm{diag}(C_{xx}))^{-1/2}\,\bar{x}, \tag{A10}$$
and similarly for the other variables. The new combined eigenvector in (A9) is defined via the diagonal matrix (A8) as follows:
$$\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = D^{1/2}\begin{pmatrix} a \\ b \end{pmatrix}. \tag{A11}$$
All variables in (A9) are in the standardized units. The eigenvalues in the problems (15) and (A9) are the same, and when the vectors α and β are found the original vectors a and b can be defined by the inversion of the relation (A11). The generalized eigenproblem (15) can be reduced to the regular eigenproblem:
$$\begin{pmatrix} C_{xx} + \bar{x}\bar{x}' & -\bar{x}\bar{y}' \\ -\bar{y}\bar{x}' & C_{yy} + \bar{y}\bar{y}' \end{pmatrix}^{-1}\begin{pmatrix} 0_{nn} & C_{xy} \\ C_{yx} & 0_{mm} \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \lambda\begin{pmatrix} a \\ b \end{pmatrix}. \tag{A12}$$
Let us describe how to invert the matrix from the left-hand side of (A12). Denoting the original matrix as A, we have:
$$A = \begin{pmatrix} C_{xx} + \bar{x}\bar{x}' & -\bar{x}\bar{y}' \\ -\bar{y}\bar{x}' & C_{yy} + \bar{y}\bar{y}' \end{pmatrix} = \begin{pmatrix} C_{xx} & 0_{nm} \\ 0_{mn} & C_{yy} \end{pmatrix} + \begin{pmatrix} \bar{x} \\ -\bar{y} \end{pmatrix}\begin{pmatrix} \bar{x}' & -\bar{y}' \end{pmatrix} \equiv B + cc', \tag{A13}$$
where B is the block-diagonal matrix of the second moments, and cc′ is the outer product of the column vector c, stacking the mean vectors $\bar{x}$ and $-\bar{y}$, with its transposed row vector c′. Due to the well-known Sherman-Morrison formula, the inverse of a matrix with such a structure can be expressed as follows:
$$A^{-1} = (B + cc')^{-1} = B^{-1} - \frac{B^{-1}cc'B^{-1}}{1 + c'B^{-1}c}. \tag{A14}$$
With definitions from (A13), the Formula (A14) can be rewritten in the explicit form:
$$A^{-1} = \begin{pmatrix} C_{xx} & 0_{nm} \\ 0_{mn} & C_{yy} \end{pmatrix}^{-1} - \frac{\begin{pmatrix} C_{xx} & 0_{nm} \\ 0_{mn} & C_{yy} \end{pmatrix}^{-1}\begin{pmatrix} \bar{x} \\ -\bar{y} \end{pmatrix}\begin{pmatrix} \bar{x}' & -\bar{y}' \end{pmatrix}\begin{pmatrix} C_{xx} & 0_{nm} \\ 0_{mn} & C_{yy} \end{pmatrix}^{-1}}{1 + \begin{pmatrix} \bar{x}' & -\bar{y}' \end{pmatrix}\begin{pmatrix} C_{xx} & 0_{nm} \\ 0_{mn} & C_{yy} \end{pmatrix}^{-1}\begin{pmatrix} \bar{x} \\ -\bar{y} \end{pmatrix}} = \begin{pmatrix} C_{xx}^{-1} & 0_{nm} \\ 0_{mn} & C_{yy}^{-1} \end{pmatrix} - k\,qq', \tag{A15}$$
where, for more compact notation, we use the vector q defined as
$$q = \begin{pmatrix} C_{xx} & 0_{nm} \\ 0_{mn} & C_{yy} \end{pmatrix}^{-1}\begin{pmatrix} \bar{x} \\ -\bar{y} \end{pmatrix} = \begin{pmatrix} C_{xx}^{-1}\bar{x} \\ -C_{yy}^{-1}\bar{y} \end{pmatrix}, \tag{A16}$$
and the constant k
$$k = \frac{1}{1 + \begin{pmatrix} \bar{x}' & -\bar{y}' \end{pmatrix}\begin{pmatrix} C_{xx} & 0_{nm} \\ 0_{mn} & C_{yy} \end{pmatrix}^{-1}\begin{pmatrix} \bar{x} \\ -\bar{y} \end{pmatrix}} = \frac{1}{1 + \bar{x}'C_{xx}^{-1}\bar{x} + \bar{y}'C_{yy}^{-1}\bar{y}}. \tag{A17}$$
Using (A16) and (A17) in (A15) yields:
$$A^{-1} = \begin{pmatrix} C_{xx}^{-1} - k\,C_{xx}^{-1}\bar{x}\bar{x}'C_{xx}^{-1} & k\,C_{xx}^{-1}\bar{x}\bar{y}'C_{yy}^{-1} \\ k\,C_{yy}^{-1}\bar{y}\bar{x}'C_{xx}^{-1} & C_{yy}^{-1} - k\,C_{yy}^{-1}\bar{y}\bar{y}'C_{yy}^{-1} \end{pmatrix}, \tag{A18}$$
where k is defined in (A17). Using (A18), the matrix product from (A12) becomes:
$$A^{-1}\begin{pmatrix} 0_{nn} & C_{xy} \\ C_{yx} & 0_{mm} \end{pmatrix} = \begin{pmatrix} k\,C_{xx}^{-1}\bar{x}\bar{y}'C_{yy}^{-1}C_{yx} & C_{xx}^{-1}C_{xy} - k\,C_{xx}^{-1}\bar{x}\bar{x}'C_{xx}^{-1}C_{xy} \\ C_{yy}^{-1}C_{yx} - k\,C_{yy}^{-1}\bar{y}\bar{y}'C_{yy}^{-1}C_{yx} & k\,C_{yy}^{-1}\bar{y}\bar{x}'C_{xx}^{-1}C_{xy} \end{pmatrix}. \tag{A19}$$
With this matrix (A19), the eigenproblem (A12) can be solved, and all the eigenvalues and vectors obtained.
If we set k = 0, which corresponds to the absence of the last term in (A15), or to zero values of the means $\bar{x}$ and $\bar{y}$ making q in (A16) a zero vector, then the matrix in (A19) reduces to the matrix which can be obtained from (9) when that generalized eigenproblem is transformed to a regular eigenproblem. In other words, for k = 0 the matrix (A19) of the CCCA problem coincides with the matrix of the CCA problem.
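A sketch of this reduction, continuing the earlier examples: the Sherman-Morrison inverse (A14)-(A17) is assembled explicitly and applied as in (A12), and the resulting spectrum should agree with a generalized eigensolver applied to (15).

```python
# Sherman-Morrison inverse (A14)-(A17) of the right-hand matrix A of problem (15)
B_inv = np.block([[linalg.inv(Cxx), np.zeros((n, m))],
                  [np.zeros((m, n)), linalg.inv(Cyy)]])
c = np.concatenate([x_bar, -y_bar])              # stacked mean vector c of (A13)
q = B_inv @ c                                    # (A16)
k = 1.0 / (1.0 + c @ B_inv @ c)                  # (A17)
A_inv = B_inv - k * np.outer(q, q)               # (A15)

# Regular (non-symmetric) eigenproblem (A12): eigenvalues of A^{-1} times the LHS
lam_reg = np.linalg.eigvals(A_inv @ LHS)
print(np.round(np.sort(lam_reg.real)[::-1], 4))  # same spectrum as problem (15)
```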
The considered derivation shows how to transform the CCCA generalized eigenproblem with two matrices into the regular eigenproblem (A12) with one matrix. The matrix (A19) is not symmetric, although it is often better to have a symmetric matrix for computations, so let us briefly describe the transformation of the generalized eigenproblem (15) into the eigenproblem of one symmetric matrix. The right-hand matrix in (15) is denoted as the matrix A in (A13). First, we solve the eigenproblem for the matrix A, that is, $A\gamma = \mu\gamma$, with μ and γ the eigenvalue and eigenvector, respectively. In matrix form, this eigenproblem can be presented as AG = GM, with M the diagonal matrix of the eigenvalues and G the matrix holding the eigenvectors in its columns. The matrix G is orthogonal, so its inverse equals the transposed matrix, G−1 = G′. The spectral decomposition of the matrix by its eigenvectors, the square root of this matrix, and its inversion are:
$$A = GMG', \qquad A^{1/2} = GM^{1/2}G', \qquad A^{-1/2} = GM^{-1/2}G'. \tag{A20}$$
With these definitions, the generalized eigenproblem (15) can be rewritten as follows:
$$\begin{pmatrix} 0_{nn} & C_{xy} \\ C_{yx} & 0_{mm} \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \lambda\, A^{1/2}A^{1/2}\begin{pmatrix} a \\ b \end{pmatrix}, \tag{A21}$$
which can be further transformed to:
$$A^{-1/2}\begin{pmatrix} 0_{nn} & C_{xy} \\ C_{yx} & 0_{mm} \end{pmatrix}A^{-1/2}\, A^{1/2}\begin{pmatrix} a \\ b \end{pmatrix} = \lambda\, A^{1/2}\begin{pmatrix} a \\ b \end{pmatrix}. \tag{A22}$$
With the new vector
$$\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = A^{1/2}\begin{pmatrix} a \\ b \end{pmatrix}, \tag{A23}$$
the problem (A22) is expressed as the eigenproblem of the symmetric matrix:
$$\left(A^{-1/2}\begin{pmatrix} 0_{nn} & C_{xy} \\ C_{yx} & 0_{mm} \end{pmatrix}A^{-1/2}\right)\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \lambda\begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \tag{A24}$$
The eigenvalues in the problems (15) and (A24) are the same, and after finding the eigenvectors in (A24) the original vectors a and b can be obtained by inverting the relation (A23), where the square root matrix is defined in (A20). The eigenvectors of the regular eigenproblem of the symmetric matrix (A24) are orthogonal, so from (A23) we have the relation of orthogonality for the original vectors of (15) in the metric of the matrix A:
$$\begin{pmatrix} \alpha \\ \beta \end{pmatrix}_j'\begin{pmatrix} \alpha \\ \beta \end{pmatrix}_k = \begin{pmatrix} a \\ b \end{pmatrix}_j' A \begin{pmatrix} a \\ b \end{pmatrix}_k = \delta_{jk}, \tag{A25}$$
where $\delta_{jk}$ is the Kronecker delta.
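A sketch of the symmetric reduction (A20)-(A24), continuing the earlier examples; the matrix A is the right-hand matrix of (15), assumed positive definite so that its inverse square root exists.

```python
# Spectral square roots (A20) of the symmetric right-hand matrix A = RHS_ccca
mu, G = np.linalg.eigh(RHS_ccca)                 # A = G M G'
A_half = G @ np.diag(np.sqrt(mu)) @ G.T
A_inv_half = G @ np.diag(1.0 / np.sqrt(mu)) @ G.T

# Symmetric eigenproblem (A24); its eigenvalues equal those of (15)
S = A_inv_half @ LHS @ A_inv_half
lam_sym, W_sym = np.linalg.eigh(S)
V_orig = A_inv_half @ W_sym                      # recover (a, b) by inverting (A23)
print(np.round(np.sort(lam_sym)[::-1], 4))
```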

References

  1. Hotelling, H. Relations between two sets of variates. Biometrika 1936, 28, 321–377.
  2. Horst, P. Relations among m sets of measures. Psychometrika 1961, 26, 129–149.
  3. Tishler, A.; Lipovetsky, S. Canonical correlation analyses for three data sets: A unified framework with application to management. Comput. Oper. Res. 1996, 23, 667–679.
  4. Tishler, A.; Lipovetsky, S. Modeling and forecasting with robust canonical analysis: Method and application. Comput. Oper. Res. 2000, 27, 217–232.
  5. Kessy, A.; Lewin, A.; Strimmer, K. Optimal whitening and decorrelation. Am. Stat. 2018, 72, 309–314.
  6. Hardoon, D.; Szedmak, S.; Shawe-Taylor, J. Canonical correlation analysis: An overview with application to learning methods. Neural Comput. 2004, 16, 2639–2664.
  7. Lipovetsky, S. Dual PLS analysis. Int. J. Inf. Technol. Decis. Mak. 2012, 11, 879–891.
  8. Lipovetsky, S. Orthonormal canonical correlation analysis. Open Stat. 2021, 2, 24–36.
  9. Adrover, J.; Donato, S. A robust predictive approach for canonical correlation analysis. J. Multivar. Anal. 2015, 133, 356–376.
  10. Cao, D.S.; Liu, S.; Zeng, W.B.; Liang, Y.Z. Sparse canonical correlation analysis applied to -omics studies for integrative analysis and biomarker discovery. J. Chemom. 2015, 29, 371–378.
  11. Wilms, I.; Croux, C. Sparse canonical correlation analysis from a predictive point of view. Biom. J. 2015, 57, 834–851.
  12. Jendoubi, T.; Strimmer, K. A whitening approach to probabilistic canonical correlation analysis for omics data integration. BMC Bioinform. 2019, 20, 15.
  13. Lê Cao, K.A.; Welham, Z. Data Integration Using R: Methods and Applications with the mixOmics Package; CRC/Chapman and Hall: Boca Raton, FL, USA, 2022.
  14. Lin, L. A concordance correlation coefficient to evaluate reproducibility. Biometrics 1989, 45, 255–268.
  15. Bottai, M.; Kim, T.; Lieberman, B.; Luta, G.; Pena, E. On optimal correlation-based prediction. Am. Stat. 2022, 76, 313–321.
  16. Christensen, R. Comment on "On Optimal Correlation-Based Prediction," by Bottai et al. (2022). Am. Stat. 2022, 76, 322.
  17. Lipovetsky, S. Comment on "On Optimal Correlation-Based Prediction", by Bottai et al. (2022). Am. Stat. 2023, 77, forthcoming.
  18. Gu, M.; Eisenstat, S.C. A stable and efficient algorithm for the rank-one modification of the symmetric eigenproblem. SIAM J. Matrix Anal. Appl. 1994, 15, 1266–1276.
  19. Betcke, T.; Higham, N.J.; Mehrmann, V.; Schröder, C.H.; Tisseur, F. NLEVP: A Collection of Nonlinear Eigenvalue Problems; MIMS EPrint: Manchester, UK, 2008; pp. 1–18.
  20. Huang, X.; Bai, Z.; Su, Y. Nonlinear rank-one modification of the symmetric eigenvalue problem. J. Comput. Math. 2010, 28, 218–234.
  21. Belsley, D.A.; Kuh, E.; Welsch, R.E. Regression Diagnostics; John Wiley & Sons: New York, NY, USA, 1980.
  22. Lipovetsky, S. Data Fusion in Several Algorithms. Adv. Adapt. Data Anal. 2013, 5, 3.
  23. Lipovetsky, S.; Tishler, A.; Conklin, M. Multivariate least squares and its relation to other multivariate techniques. Appl. Stoch. Model. Bus. Ind. 2002, 18, 347–356.
Table 1. The dataset LifeCycleSavings: means and standard deviations.

Characteristic    x1       x2      y1      y2        y3
mean              35.09    2.29    9.67    1106.76   3.76
std               9.15     1.29    4.48    990.87    2.87
Table 2. CCA solution.

Variables                                    CCA by the Eigenproblem (9)
                                     (a1, b1)    (a2, b2)    (a3, b3)    (a4, b4)
x1                                    0.1808     −0.1366     −0.1808      0.1366
x2                                   −0.9655     −0.9815      0.9655      0.9815
y1                                   −0.1681      0.1259     −0.1681      0.1259
y2                                   −0.0026     −0.0003     −0.0026     −0.0003
y3                                   −0.0828     −0.0463     −0.0828     −0.0463
Canonical correlation coefficient, ρ  0.8248      0.3653     −0.8248     −0.3653
Added concordance correlation coefficient, ρc  0.1358  0.0034  −0.8014  −0.0051
Table 3. CCCA solution.

Variables                                    CCCA by the Eigenproblem (15)
                                     (a1, b1)    (a2, b2)    (a3, b3)    (a4, b4)
x1                                    0.0084     −0.0505     −0.2131     −0.0969
x2                                   −0.9980     −0.3061      0.9546     −0.8014
y1                                   −0.0419      0.3663     −0.1891     −0.5870
y2                                   −0.0013     −0.0007     −0.0028      0.0013
y3                                   −0.0456     −0.8773     −0.0875     −0.0616
Added correlation coefficient, ρ      0.8082      0.1767     −0.8247     −0.3237
Canonical concordance correlation coefficient, ρc  0.8080  0.0167  −0.8247  −0.0944

