
Canonical Correlation Analysis

Canonical-correlation analysis (CCA), also called canonical variates analysis, is a way of inferring information from cross-covariance matrices. Suppose we have two groups of variables: \(\mathbf X\) with \(p\) variables \[\mathbf X=\begin{bmatrix} X_1\\ X_2\\ \vdots\\ X_p\\ \end{bmatrix}\] and \(\mathbf Y\) with \(q\) variables \[\mathbf Y=\begin{bmatrix} Y_1\\ Y_2\\ \vdots\\ Y_q\\ \end{bmatrix}\] with \[E(\mathbf X)=\boldsymbol\mu_X\] \[Cov(\mathbf X)=\boldsymbol\Sigma_{XX}\] and \[E(\mathbf Y)=\boldsymbol\mu_Y\] \[Cov(\mathbf Y)=\boldsymbol\Sigma_{YY}\] and \[Cov(\mathbf X,\mathbf Y)=\boldsymbol\Sigma_{XY}=\boldsymbol\Sigma_{YX}^T=E\bigl[(\mathbf X-\boldsymbol\mu_X)(\mathbf Y-\boldsymbol\mu_Y)^T\bigr]=\begin{bmatrix} \sigma_{X_1Y_1}&\sigma_{X_1Y_2}&\cdots&\sigma_{X_1Y_q}\\ \sigma_{X_2Y_1}&\sigma_{X_2Y_2}&\cdots&\sigma_{X_2Y_q}\\ \vdots&\vdots&\ddots&\vdots\\ \sigma_{X_pY_1}&\sigma_{X_pY_2}&\cdots&\sigma_{X_pY_q}\\ \end{bmatrix}\] Linear combinations provide simple summary measures of a set of variables. Let \[U=\mathbf a^T\mathbf X\] \[V=\mathbf b^T\mathbf Y\] Then \[Var(U)=Var(\mathbf a^T\mathbf X)=\mathbf a^TCov(\mathbf X)\mathbf a=\mathbf a^T\boldsymbol\Sigma_{XX}\mathbf a\] \[Var(V)=Var(\mathbf b^T\mathbf Y)=\mathbf b^TCov(\mathbf Y)\mathbf b=\mathbf b^T\boldsymbol\Sigma_{YY}\mathbf b\] \[Cov(U,V)=Cov(\mathbf a^T\mathbf X,\mathbf b^T\mathbf Y)=\mathbf a^TCov(\mathbf X,\mathbf Y)\mathbf b=\mathbf a^T\boldsymbol\Sigma_{XY}\mathbf b\] We seek coefficient vectors \(\mathbf a\) and \(\mathbf b\) such that \[Cor(U,V)=Cor(\mathbf a^T\mathbf X,\mathbf b^T\mathbf Y)=\frac{Cov(\mathbf a^T\mathbf X,\mathbf b^T\mathbf Y)}{\sqrt{Var(\mathbf a^T\mathbf X)}\sqrt{Var(\mathbf b^T\mathbf Y)}}=\frac{\mathbf a^T\boldsymbol\Sigma_{XY}\mathbf b}{\sqrt{\mathbf a^T\boldsymbol\Sigma_{XX}\mathbf a}\sqrt{\mathbf b^T\boldsymbol\Sigma_{YY}\mathbf b}}\] is as large as possible.
The first pair of canonical variables is the pair of linear combinations \((U_1,V_1)\) that has unit variances and maximizes the correlation \(Cor(U,V)\). The \(k^{th}\) pair of canonical variables is the pair of linear combinations \((U_k,V_k)\) that has unit variances, maximizes the correlation \(Cor(U,V)\), and is uncorrelated with all of the previous \(k-1\) canonical variable pairs.

  1. If the variables \(x_1,x_2,\cdots,x_p\) are uncorrelated, the statistical distance from \(\mathbf X\) to the origin \(\mathbf O\) is \[d(\mathbf X,\mathbf O)=\sqrt{\frac{x_1^2}{s_{11}}+\frac{x_2^2}{s_{22}}+\cdots+\frac{x_p^2}{s_{pp}}}\] If the variables \(x_1,x_2,\cdots,x_p\) are correlated, we can rotate the original coordinate system through an angle \(\theta\) while keeping the scatter fixed and label the variables in the rotated axes \(\widetilde x_1,\widetilde x_2,\cdots,\widetilde x_p\), which are uncorrelated. Then the statistical distance from \(\mathbf X\) to the origin \(\mathbf O\) in the rotated axes is \[d(\mathbf X,\mathbf O)=\sqrt{\frac{\widetilde x_1^2}{\widetilde s_{11}}+\frac{\widetilde x_2^2}{\widetilde s_{22}}+\cdots+\frac{\widetilde x_p^2}{\widetilde s_{pp}}}=\sqrt{\mathbf x^T\mathbf A\mathbf x}\] where \(\mathbf A\) is the symmetric positive definite matrix determined by the rotation: \[\mathbf A=\begin{bmatrix} a_{11}&a_{12}&\cdots&a_{1p}\\ a_{12}&a_{22}&\cdots&a_{2p}\\ \vdots&\vdots&\ddots&\vdots\\ a_{1p}&a_{2p}&\cdots&a_{pp}\\ \end{bmatrix}\] By the spectral decomposition \[\mathbf A=\sum_{i=1}^{p}\lambda_i\mathbf e_i\mathbf e_i^T\] so \[d^2=\mathbf x^T\mathbf A\mathbf x=\mathbf x^T\Bigl(\sum_{i=1}^{p}\lambda_i\mathbf e_i\mathbf e_i^T\Bigr)\mathbf x=\sum_{i=1}^{p}\lambda_i(\mathbf x^T\mathbf e_i)^2\] Let \(\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_p\); then \(\mathbf x=d\lambda_1^{-1/2}\mathbf e_1\) satisfies \[\mathbf x^T\mathbf A\mathbf x=\sum_{i=1}^{p}\lambda_i(\mathbf x^T\mathbf e_i)^2=\lambda_1\Bigl[(d\lambda_1^{-1/2}\mathbf e_1)^T\mathbf e_1\Bigr]^2=d^2\] and every point \(\mathbf x_i=d\lambda_i^{-1/2}\mathbf e_i\) likewise satisfies \(\mathbf x_i^T\mathbf A\mathbf x_i=d^2\). Thus the points at distance \(d\) lie on a hyperellipsoid whose axes are given by the eigenvectors of \(\mathbf A\), with lengths proportional to the reciprocals of the square roots of the eigenvalues. The constant of proportionality is \(d\), and the half-length of the axis of the hyperellipsoid in the direction \(\mathbf e_i\) is \(\frac{d}{\sqrt{\lambda_i}}\).
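The ellipsoid geometry above can be checked numerically. The sketch below uses a small, arbitrarily chosen positive definite matrix in the role of \(\mathbf A\) (the numbers are purely illustrative) and verifies that each point \(d\lambda_i^{-1/2}\mathbf e_i\) has squared statistical distance \(d^2\):

```python
import numpy as np

# A small symmetric positive definite matrix playing the role of A
# (hypothetical numbers, chosen only for illustration).
A = np.array([[2.0, 0.8],
              [0.8, 1.0]])

lam, E = np.linalg.eigh(A)   # eigenvalues ascending; columns of E are e_i
d = 3.0                      # the chosen statistical distance

# Points along each eigenvector direction: x_i = d * lambda_i^{-1/2} * e_i
points = [d / np.sqrt(l) * E[:, i] for i, l in enumerate(lam)]

# Each such point satisfies x^T A x = d^2, so it lies on the ellipse of distance d
dists_sq = [x @ A @ x for x in points]
ok = np.allclose(dists_sq, d ** 2)
```

The half-lengths \(d/\sqrt{\lambda_i}\) are exactly the norms of the `points` vectors, since the \(\mathbf e_i\) are unit vectors.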

  2. Let \(\underset{p\times 1}{\mathbf b}\) and \(\underset{p\times 1}{\mathbf d}\) be any two vectors, and let \(\underset{p\times p}{\mathbf B}\) be a positive definite matrix. Then \[(\mathbf b^T\mathbf d)^2=(\mathbf b^T\mathbf B^{1/2}\mathbf B^{-1/2}\mathbf d)^2=\Bigl((\mathbf B^{1/2}\mathbf b)^T(\mathbf B^{-1/2}\mathbf d)\Bigr)^2\\ \le\Bigl((\mathbf B^{1/2}\mathbf b)^T(\mathbf B^{1/2}\mathbf b)\Bigr)\Bigl((\mathbf B^{-1/2}\mathbf d)^T(\mathbf B^{-1/2}\mathbf d)\Bigr)\\ =(\mathbf b^T\mathbf B\mathbf b)(\mathbf d^T\mathbf B^{-1}\mathbf d)\] with equality if and only if \(\mathbf b=c\mathbf B^{-1}\mathbf d\) for some constant \(c\). So \[\underset{\mathbf b\ne\mathbf 0}{\text{max}}\Bigl(\frac{\mathbf b^T\mathbf d}{\sqrt{\mathbf b^T\mathbf B\mathbf b}}\Bigr)=\sqrt{\mathbf d^T\mathbf B^{-1}\mathbf d}\] In words: the square of the inner product of two vectors is at most the product of (1) the quadratic form of one vector under a positive definite matrix \(\mathbf B\) and (2) the quadratic form of the other vector under \(\mathbf B^{-1}\). The maximum is achieved when \(\mathbf B\mathbf b\) points in the same direction as \(\mathbf d\), i.e. \(\mathbf b\propto\mathbf B^{-1}\mathbf d\).
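This maximization lemma is the workhorse of the whole derivation, so it is worth a quick numerical sanity check. The sketch below (random \(\mathbf d\) and \(\mathbf B\), seed chosen arbitrarily) verifies that the bound \(\sqrt{\mathbf d^T\mathbf B^{-1}\mathbf d}\) is attained at \(\mathbf b=\mathbf B^{-1}\mathbf d\) and never exceeded by random trial vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

p = 4
d = rng.normal(size=p)
M = rng.normal(size=(p, p))
B = M @ M.T + p * np.eye(p)          # positive definite B

# Claimed maximum of (b^T d) / sqrt(b^T B b) over b != 0
max_claimed = np.sqrt(d @ np.linalg.solve(B, d))

# Achieved at b = c * B^{-1} d (any c > 0)
b_star = np.linalg.solve(B, d)
val_at_b_star = (b_star @ d) / np.sqrt(b_star @ B @ b_star)

# Random b should never beat the bound
vals = []
for _ in range(1000):
    b = rng.normal(size=p)
    vals.append((b @ d) / np.sqrt(b @ B @ b))

ok = np.isclose(val_at_b_star, max_claimed) and max(vals) <= max_claimed + 1e-12
```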

  3. By the spectral decomposition, the hyperellipsoid \[d^2=\mathbf a^T\boldsymbol\Sigma_{XX}\mathbf a=\mathbf a^T\Bigl(\sum_{i=1}^{p}\lambda_i\mathbf e_i\mathbf e_i^T\Bigr)\mathbf a=\sum_{i=1}^{p}\lambda_i(\mathbf a^T\mathbf e_i)^2\] where \((\lambda_i,\mathbf e_i)\) are the eigenvalue-eigenvector pairs of \(\boldsymbol\Sigma_{XX}\), and \(\mathbf a_i=d\lambda_i^{-1/2}\mathbf e_i\) lies on the hyperellipsoid at distance \(\frac{d}{\sqrt{\lambda_i}}\) from the origin \(\mathbf O\) in the \(\mathbf e_i\) direction. Then \[U_i=\mathbf a_i^T\mathbf X=(d\lambda_i^{-1/2}\mathbf e_i)^T\mathbf X=d\lambda_i^{-1/2}\mathbf e_i^T\mathbf X=d\mathbf e_i^T\boldsymbol\Sigma_{XX}^{-1/2}\mathbf X\] and \[V_j=\mathbf b_j^T\mathbf Y=c\kappa_j^{-1/2}\mathbf f_j^T\mathbf Y=c\mathbf f_j^T\boldsymbol\Sigma_{YY}^{-1/2}\mathbf Y\] are uncorrelated linear combinations, where \((\kappa_j,\mathbf f_j)\) are the eigenvalue-eigenvector pairs of \(\boldsymbol\Sigma_{YY}\), \(d=\sqrt{\mathbf a^T\boldsymbol\Sigma_{XX}\mathbf a}\), and \(c=\sqrt{\mathbf b^T\boldsymbol\Sigma_{YY}\mathbf b}\). Then the maximum correlation is \[Cor(U_k,V_k)=\frac{\mathbf a_k^T\boldsymbol\Sigma_{XY}\mathbf b_k}{\sqrt{\mathbf a_k^T\boldsymbol\Sigma_{XX}\mathbf a_k}\sqrt{\mathbf b_k^T\boldsymbol\Sigma_{YY}\mathbf b_k}}=\frac{dc\,\mathbf e_k^T\boldsymbol\Sigma_{XX}^{-1/2}\boldsymbol\Sigma_{XY}\boldsymbol\Sigma_{YY}^{-1/2}\mathbf f_k}{dc}=\mathbf e_k^T\boldsymbol\Sigma_{XX}^{-1/2}\boldsymbol\Sigma_{XY}\boldsymbol\Sigma_{YY}^{-1/2}\mathbf f_k\] and \[Cor^2(U_k,V_k)=\mathbf e_k^T\boldsymbol\Sigma_{XX}^{-1/2}\boldsymbol\Sigma_{XY}\boldsymbol\Sigma_{YY}^{-1/2}\mathbf f_k\mathbf f_k^T\boldsymbol\Sigma_{YY}^{-1/2}\boldsymbol\Sigma_{YX}\boldsymbol\Sigma_{XX}^{-1/2}\mathbf e_k=\mathbf e_k^T\boldsymbol\Sigma_{XX}^{-1/2}\boldsymbol\Sigma_{XY}\boldsymbol\Sigma_{YY}^{-1}\boldsymbol\Sigma_{YX}\boldsymbol\Sigma_{XX}^{-1/2}\mathbf e_k=\rho_k^2\] where \(\rho_k^2\) are the eigenvalues of \(\boldsymbol\Sigma_{XX}^{-1/2}\boldsymbol\Sigma_{XY}\boldsymbol\Sigma_{YY}^{-1}\boldsymbol\Sigma_{YX}\boldsymbol\Sigma_{XX}^{-1/2}\). Also \[Cor^2(U_k,V_k)=\mathbf f_k^T\boldsymbol\Sigma_{YY}^{-1/2}\boldsymbol\Sigma_{YX}\boldsymbol\Sigma_{XX}^{-1/2}\mathbf e_k\mathbf e_k^T\boldsymbol\Sigma_{XX}^{-1/2}\boldsymbol\Sigma_{XY}\boldsymbol\Sigma_{YY}^{-1/2}\mathbf f_k=\mathbf f_k^T\boldsymbol\Sigma_{YY}^{-1/2}\boldsymbol\Sigma_{YX}\boldsymbol\Sigma_{XX}^{-1}\boldsymbol\Sigma_{XY}\boldsymbol\Sigma_{YY}^{-1/2}\mathbf f_k=\rho_k^{*2}\] where \(\rho_k^{*2}\) are the eigenvalues of \(\boldsymbol\Sigma_{YY}^{-1/2}\boldsymbol\Sigma_{YX}\boldsymbol\Sigma_{XX}^{-1}\boldsymbol\Sigma_{XY}\boldsymbol\Sigma_{YY}^{-1/2}\). Usually we normalize \(\mathbf a\) and \(\mathbf b\) so that \(d=\sqrt{\mathbf a^T\boldsymbol\Sigma_{XX}\mathbf a}=1\) and \(c=\sqrt{\mathbf b^T\boldsymbol\Sigma_{YY}\mathbf b}=1\), giving \[Var(U)=\mathbf a^T\boldsymbol\Sigma_{XX}\mathbf a=1\] \[Var(V)=\mathbf b^T\boldsymbol\Sigma_{YY}\mathbf b=1\]
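The two eigenvalue problems can be verified numerically with sample covariance matrices standing in for the population \(\boldsymbol\Sigma\) blocks. The sketch below (synthetic data, arbitrary seed) builds both matrices and checks that their nonzero eigenvalues, the squared canonical correlations, coincide:

```python
import numpy as np

def inv_sqrt(S):
    """Symmetric inverse square root of a positive definite matrix."""
    lam, P = np.linalg.eigh(S)
    return P @ np.diag(lam ** -0.5) @ P.T

rng = np.random.default_rng(1)
n, p, q = 500, 3, 4
X = rng.normal(size=(n, p))
Y = 0.5 * X @ rng.normal(size=(p, q)) + rng.normal(size=(n, q))

S = np.cov(np.hstack([X, Y]).T)
Sxx, Sxy = S[:p, :p], S[:p, p:]
Syx, Syy = S[p:, :p], S[p:, p:]

Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)

# rho_k^2: eigenvalues of Sxx^{-1/2} Sxy Syy^{-1} Syx Sxx^{-1/2}
Mx = Sxx_is @ Sxy @ np.linalg.inv(Syy) @ Syx @ Sxx_is
# rho_k*^2: eigenvalues of Syy^{-1/2} Syx Sxx^{-1} Sxy Syy^{-1/2}
My = Syy_is @ Syx @ np.linalg.inv(Sxx) @ Sxy @ Syy_is

rho2_x = np.sort(np.linalg.eigvalsh(Mx))[::-1][:min(p, q)]
rho2_y = np.sort(np.linalg.eigvalsh(My))[::-1][:min(p, q)]

# The nonzero eigenvalues of the two matrices coincide
ok = np.allclose(rho2_x, rho2_y)
canonical_correlations = np.sqrt(rho2_x)
```

Note that `My` is \(q\times q\) while `Mx` is \(p\times p\); when \(p\ne q\) the larger matrix simply carries \(|p-q|\) extra zero eigenvalues.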

  4. If the original variables are standardized with \[\mathbf Z_X=\begin{bmatrix} \mathbf Z_{X1}\\ \mathbf Z_{X2}\\ \vdots\\ \mathbf Z_{Xp}\\ \end{bmatrix}\] and \[\mathbf Z_Y=\begin{bmatrix} \mathbf Z_{Y1}\\ \mathbf Z_{Y2}\\ \vdots\\ \mathbf Z_{Yq}\\ \end{bmatrix}\] the canonical variates are of the form \[U_k=\mathbf a_k^T\mathbf Z_X=\mathbf e_k^T\boldsymbol\rho_{XX}^{-1/2}\mathbf Z_X\] and \[V_k=\mathbf b_k^T\mathbf Z_Y=\mathbf f_k^T\boldsymbol\rho_{YY}^{-1/2}\mathbf Z_Y\] Here \(Cov(\mathbf Z_X)=\boldsymbol\rho_{XX}\), \(Cov(\mathbf Z_Y)=\boldsymbol\rho_{YY}\), and \(Cov(\mathbf Z_X, \mathbf Z_Y)=\boldsymbol\rho_{XY}=\boldsymbol\rho_{YX}^T\). The canonical correlations satisfy \[Cor^2(U_k,V_k)=\rho_k^2\] where \(\rho_k^2\) are the eigenvalues of the matrix \[\boldsymbol\rho_{XX}^{-1/2}\boldsymbol\rho_{XY}\boldsymbol\rho_{YY}^{-1}\boldsymbol\rho_{YX}\boldsymbol\rho_{XX}^{-1/2}\] or, equivalently, the eigenvalues of the matrix \[\boldsymbol\rho_{YY}^{-1/2}\boldsymbol\rho_{YX}\boldsymbol\rho_{XX}^{-1}\boldsymbol\rho_{XY}\boldsymbol\rho_{YY}^{-1/2}\] The canonical coefficients of the original variables, \(\mathbf e_k^T\boldsymbol\Sigma_{XX}^{-1/2}\) and \(\mathbf f_k^T\boldsymbol\Sigma_{YY}^{-1/2}\), are generally not the same as the canonical coefficients of the standardized variables, \(\mathbf e_k^T\boldsymbol\rho_{XX}^{-1/2}\) and \(\mathbf f_k^T\boldsymbol\rho_{YY}^{-1/2}\), but the canonical correlations are unaffected by the standardization.
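The invariance under standardization is easy to demonstrate: compute the canonical correlations from raw data with wildly different column scales, then from the standardized data, and compare. A sketch under assumed synthetic data (arbitrary seed and scale factors):

```python
import numpy as np

def canonical_corrs(X, Y):
    """Canonical correlations from the eigenvalues of
    Sxx^{-1/2} Sxy Syy^{-1} Syx Sxx^{-1/2} (assumes X has the smaller block)."""
    p = X.shape[1]
    S = np.cov(np.hstack([X, Y]).T)
    Sxx, Sxy, Syx, Syy = S[:p, :p], S[:p, p:], S[p:, :p], S[p:, p:]
    lam, P = np.linalg.eigh(Sxx)
    Sxx_is = P @ np.diag(lam ** -0.5) @ P.T
    M = Sxx_is @ Sxy @ np.linalg.solve(Syy, Syx) @ Sxx_is
    return np.sqrt(np.sort(np.linalg.eigvalsh(M))[::-1][:p])

rng = np.random.default_rng(2)
n = 400
X = rng.normal(size=(n, 2)) * np.array([1.0, 100.0])   # very different scales
Y = 0.4 * X[:, [0]] + rng.normal(size=(n, 3))

Zx = (X - X.mean(0)) / X.std(0, ddof=1)                # standardized variables
Zy = (Y - Y.mean(0)) / Y.std(0, ddof=1)

# Canonical correlations from raw and standardized data coincide
ok = np.allclose(canonical_corrs(X, Y), canonical_corrs(Zx, Zy))
```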

  5. The first \(p\) canonical variables in \(\mathbf U\) are \[\underset{(p\times 1)}{\mathbf U}=\mathbf A\mathbf X=\begin{bmatrix} \mathbf a_1^T\\ \mathbf a_2^T\\ \vdots\\ \mathbf a_p^T\\ \end{bmatrix}\begin{bmatrix} X_1\\ X_2\\ \vdots\\ X_p\\ \end{bmatrix}\] where \(\mathbf A\) is the matrix whose rows contain the canonical coefficients. Then \[Cov(\mathbf U, \mathbf X)=Cov(\mathbf A\mathbf X, \mathbf X)=\mathbf A\boldsymbol\Sigma_{XX}\] Because \(Var(U_i)=1\) for \(i=1,2,\cdots,p\), the correlations between the canonical variates and their component variables are \[Cor(U_i,X_k)=\frac{Cov(U_i,X_k)}{\sqrt{1}\sqrt{Var(X_k)}}=\frac{Cov(U_i,X_k)}{\sigma_{kk}^{1/2}}=Cov(U_i,X_k)\sigma_{kk}^{-1/2}=Cov(U_i,\sigma_{kk}^{-1/2}X_k)\] Let the diagonal matrix \[\mathbf D_{XX}^{-1/2}=\begin{bmatrix} \sigma_{11}^{-1/2}&0&\cdots&0\\ 0&\sigma_{22}^{-1/2}&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&\sigma_{pp}^{-1/2}\\ \end{bmatrix}\] Then the canonical correlations between canonical variates and the original variables are \[\underset{(p\times p)}{\boldsymbol\rho_{\mathbf U, \mathbf X}}=Cor(\mathbf U, \mathbf X)=Cov(\mathbf U, \mathbf D_{XX}^{-1/2}\mathbf X)=Cov(\mathbf A\mathbf X, \mathbf D_{XX}^{-1/2}\mathbf X)=\mathbf A\boldsymbol\Sigma_{XX}\mathbf D_{XX}^{-1/2}\] The first \(q\) canonical variables in \(\mathbf V\) are \[\underset{(q\times 1)}{\mathbf V}=\mathbf B\mathbf Y=\begin{bmatrix} \mathbf b_1^T\\ \mathbf b_2^T\\ \vdots\\ \mathbf b_q^T\\ \end{bmatrix}\begin{bmatrix} Y_1\\ Y_2\\ \vdots\\ Y_q\\ \end{bmatrix}\] where \(\mathbf B\) is the matrix whose rows contain the canonical coefficients.
Then the canonical correlations between canonical variates and the original variables are \[\underset{(q\times q)}{\boldsymbol\rho_{\mathbf V, \mathbf Y}}=Cor(\mathbf V, \mathbf Y)=Cov(\mathbf V, \mathbf D_{YY}^{-1/2}\mathbf Y)=Cov(\mathbf B\mathbf Y, \mathbf D_{YY}^{-1/2}\mathbf Y)=\mathbf B\boldsymbol\Sigma_{YY}\mathbf D_{YY}^{-1/2}\] Similarly, the canonical correlations between canonical variates \(\mathbf U\) and the original variables \(\mathbf Y\) are \[\underset{(p\times q)}{\boldsymbol\rho_{\mathbf U, \mathbf Y}}=Cor(\mathbf U, \mathbf Y)=Cov(\mathbf U, \mathbf D_{YY}^{-1/2}\mathbf Y)=Cov(\mathbf A\mathbf X, \mathbf D_{YY}^{-1/2}\mathbf Y)=\mathbf A\boldsymbol\Sigma_{XY}\mathbf D_{YY}^{-1/2}\] and \[\underset{(q\times p)}{\boldsymbol\rho_{\mathbf V, \mathbf X}}=Cor(\mathbf V, \mathbf X)=Cov(\mathbf V, \mathbf D_{XX}^{-1/2}\mathbf X)=Cov(\mathbf B\mathbf Y, \mathbf D_{XX}^{-1/2}\mathbf X)=\mathbf B\boldsymbol\Sigma_{YX}\mathbf D_{XX}^{-1/2}\]
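The identity \(\boldsymbol\rho_{\mathbf U,\mathbf X}=\mathbf A\boldsymbol\Sigma_{XX}\mathbf D_{XX}^{-1/2}\) holds for any coefficient matrix whose rows give unit-variance combinations, so it can be checked without running a full CCA. A sketch with an arbitrary (hypothetical) coefficient matrix, rescaled row-wise to unit variance:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 300, 3
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))  # correlated columns

Sxx = np.cov(X.T)
A = rng.normal(size=(p, p))                 # stand-in coefficient matrix
# Rescale each row so Var(a_i^T X) = a_i^T Sxx a_i = 1, as for canonical variates
A /= np.sqrt(np.einsum('ij,jk,ik->i', A, Sxx, A))[:, None]

U = X @ A.T                                  # rows of A give U_i = a_i^T X
D_is = np.diag(np.diag(Sxx) ** -0.5)         # D_XX^{-1/2}

loadings_formula = A @ Sxx @ D_is            # rho_{U,X} = A Sxx D_XX^{-1/2}
loadings_empirical = np.array([[np.corrcoef(U[:, i], X[:, k])[0, 1]
                                for k in range(p)] for i in range(p)])
ok = np.allclose(loadings_formula, loadings_empirical)
```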

  6. If the original variables are standardized with \[\mathbf Z_X=\begin{bmatrix} \mathbf Z_{X1}\\ \mathbf Z_{X2}\\ \vdots\\ \mathbf Z_{Xp}\\ \end{bmatrix}\] then \[\boldsymbol\rho_{\mathbf U, \mathbf Z_X}=Cor(\mathbf U, \mathbf Z_X)=Cov(\mathbf A_Z\mathbf Z_X, \mathbf Z_X)=\mathbf A_Z\boldsymbol\rho_{XX}=\mathbf A_Z\mathbf D_{XX}^{-1/2}\boldsymbol\Sigma_{XX}\mathbf D_{XX}^{-1/2}\\ =\mathbf A\mathbf D_{XX}^{1/2}\mathbf D_{XX}^{-1/2}\boldsymbol\Sigma_{XX}\mathbf D_{XX}^{-1/2}=\mathbf A\boldsymbol\Sigma_{XX}\mathbf D_{XX}^{-1/2}=\underset{(p\times p)}{\boldsymbol\rho_{\mathbf U, \mathbf X}}\] where \(\mathbf A_Z=\mathbf A\mathbf D_{XX}^{1/2}\) is the matrix whose rows contain the canonical coefficients for \(\mathbf Z_X\). The canonical correlations are unaffected by the standardization. Similarly, \[\boldsymbol\rho_{\mathbf V, \mathbf Z_Y}=\mathbf B_Z\boldsymbol\rho_{YY}\] with \(\mathbf B_Z=\mathbf B\mathbf D_{YY}^{1/2}\).

  7. Because the correlation between single components is unchanged by scaling, \(|\rho_{ik}|=|Cor(X_i,Y_k)|=|Cor(aX_i,bY_k)|\), and a single pair of variables is a special case of linear combinations, the first canonical correlation is at least as large as the absolute value of any entry in the matrix \(Cov(\mathbf Z_X, \mathbf Z_Y)=\boldsymbol\rho_{XY}=\mathbf D_{XX}^{-1/2}\boldsymbol\Sigma_{XY}\mathbf D_{YY}^{-1/2}\)
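This bound is algebraic, so it holds exactly for sample quantities computed from the same data. A sketch with synthetic data (arbitrary seed), comparing the first canonical correlation against every entrywise cross-correlation:

```python
import numpy as np

def first_canonical_corr(X, Y):
    """Largest canonical correlation via the Sxx^{-1/2}-whitened eigenproblem."""
    p = X.shape[1]
    S = np.cov(np.hstack([X, Y]).T)
    Sxx, Sxy, Syx, Syy = S[:p, :p], S[:p, p:], S[p:, :p], S[p:, p:]
    lam, P = np.linalg.eigh(Sxx)
    Sxx_is = P @ np.diag(lam ** -0.5) @ P.T
    M = Sxx_is @ Sxy @ np.linalg.solve(Syy, Syx) @ Sxx_is
    return np.sqrt(np.linalg.eigvalsh(M).max())

rng = np.random.default_rng(4)
n = 300
X = rng.normal(size=(n, 2))
Y = 0.6 * X @ rng.normal(size=(2, 3)) + rng.normal(size=(n, 3))

rho1 = first_canonical_corr(X, Y)
# Entrywise correlations between components of X and Y (the rho_XY block)
cross_cors = np.corrcoef(X.T, Y.T)[:2, 2:]
ok = rho1 >= np.abs(cross_cors).max() - 1e-12
```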

  8. The change of coordinates from \(\mathbf X\) to \(\mathbf U=\mathbf A\mathbf X\) and from \(\mathbf Y\) to \(\mathbf V=\mathbf B\mathbf Y\) is chosen to maximize \(Cor(U_1,V_1)\) and, successively, \(Cor(U_i,V_i)\), where \((U_i,V_i)\) have zero correlation with the previous pairs. The canonical variables \(\mathbf U\) have covariance matrix \[Cov(\mathbf U)=Cov(\mathbf A\mathbf X)=\mathbf A\boldsymbol\Sigma_{XX}\mathbf A^T=\mathbf I\] and \[\mathbf A=\mathbf E^T\boldsymbol\Sigma_{XX}^{-1/2}=\mathbf E^T\mathbf P_X\boldsymbol\Lambda_{X}^{-1/2}\mathbf P_X^T\] where \(\mathbf E\) is an orthogonal matrix with rows \(\mathbf e_i^T\), \(\mathbf P_X\) is an orthogonal matrix whose columns are the eigenvectors of \(\boldsymbol\Sigma_{XX}\), and \(\boldsymbol\Lambda_{X}\) is a diagonal matrix with the eigenvalues of \(\boldsymbol\Sigma_{XX}\) on its main diagonal. \(\mathbf P_X^T\mathbf X\) is the set of principal components derived from \(\mathbf X\), and \(\boldsymbol\Lambda_{X}^{-1/2}\mathbf P_X^T\mathbf X\) has \(i^{th}\) row \(\frac{1}{\sqrt{\lambda_i}}\mathbf p_i^T\mathbf X\), which is the \(i^{th}\) principal component scaled to have unit variance, so \(Cov(\boldsymbol\Lambda_{X}^{-1/2}\mathbf P_X^T\mathbf X)=\mathbf I\). The canonical variables \[\mathbf U=\mathbf A\mathbf X=\mathbf E^T\mathbf P_X\boldsymbol\Lambda_{X}^{-1/2}\mathbf P_X^T\mathbf X\] can therefore be interpreted as (1) a transformation of \(\mathbf X\) to uncorrelated standardized principal components, followed by (2) a rigid (orthogonal) rotation \(\mathbf P_X\) determined by \(\boldsymbol\Sigma_{XX}\), and then (3) another rotation \(\mathbf E^T\) determined from the full covariance matrix \[\boldsymbol\Sigma_{XX}^{-1/2}\boldsymbol\Sigma_{XY}\boldsymbol\Sigma_{YY}^{-1}\boldsymbol\Sigma_{YX}\boldsymbol\Sigma_{XX}^{-1/2}\]
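The whiten-then-rotate interpretation can be checked directly: after scaling the principal components to unit variance, the covariance is the identity, and any further orthogonal rotation (here a random stand-in for \(\mathbf E^T\mathbf P_X\)) leaves it at the identity. A sketch with synthetic data (arbitrary seed):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 1000, 3
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))  # correlated data

Sxx = np.cov(X.T)
lam, Px = np.linalg.eigh(Sxx)        # Sxx = Px Lam Px^T

# (1) principal components Px^T X, scaled to unit variance by Lam^{-1/2}
Z = X @ Px @ np.diag(lam ** -0.5)    # data version of Lam^{-1/2} Px^T X
ok_white = np.allclose(np.cov(Z.T), np.eye(p))

# (2)-(3) any further orthogonal rotation leaves the covariance at I,
# so Cov(U) = Cov(E^T Px Lam^{-1/2} Px^T X) = I for orthogonal Px and E
Q, _ = np.linalg.qr(rng.normal(size=(p, p)))   # a stand-in orthogonal rotation
U = Z @ Q.T
ok_rot = np.allclose(np.cov(U.T), np.eye(p))
```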

In R, canonical correlation analysis is available through `cancor`:

```r
## signs of results are random
pop <- LifeCycleSavings[, 2:3]
oec <- LifeCycleSavings[, -(2:3)]
cancor(pop, oec)
## $cor
## [1] 0.8247966 0.3652762
## 
## $xcoef
##               [,1]        [,2]
## pop15 -0.009110856 -0.03622206
## pop75  0.048647514 -0.26031158
## 
## $ycoef
##              [,1]          [,2]          [,3]
## sr   0.0084710221  3.337936e-02 -5.157130e-03
## dpi  0.0001307398 -7.588232e-05  4.543705e-06
## ddpi 0.0041706000 -1.226790e-02  5.188324e-02
## 
## $xcenter
##   pop15   pop75 
## 35.0896  2.2930 
## 
## $ycenter
##        sr       dpi      ddpi 
##    9.6710 1106.7584    3.7576
```

With random data, the returned coefficients reproduce the canonical correlations, and the canonical variates within each set are uncorrelated:

```r
x <- matrix(rnorm(150), 50, 3)
y <- matrix(rnorm(250), 50, 5)
(cxy <- cancor(x, y))
## $cor
## [1] 0.4648744 0.3393778 0.2497012
## 
## $xcoef
##             [,1]         [,2]        [,3]
## [1,]  0.13681854 -0.011328223  0.07117195
## [2,] -0.06953136  0.001488554  0.11053153
## [3,] -0.01805799 -0.124911518 -0.01680600
## 
## $ycoef
##             [,1]        [,2]        [,3]         [,4]        [,5]
## [1,]  0.04242329  0.01471038 -0.09464761  0.003304399 -0.12610091
## [2,] -0.13756352 -0.06554034  0.01593199 -0.006577211 -0.05380149
## [3,]  0.08329928 -0.02767489  0.11328385  0.083475987 -0.05480750
## [4,]  0.06465024 -0.07792535 -0.01358597 -0.116063204  0.00723675
## [5,] -0.01178345  0.07447101  0.13377486 -0.053148728 -0.02503208
## 
## $xcenter
## [1]  0.09045249 -0.15751434 -0.08311951
## 
## $ycenter
## [1]  0.088946456  0.004213401  0.107837528 -0.133090644 -0.004320256
all(abs(cor(x %*% cxy$xcoef,
            y %*% cxy$ycoef)[,1:3] - diag(cxy$cor)) < 1e-15)
## [1] TRUE
all(abs(cor(x %*% cxy$xcoef) - diag(3)) < 1e-15)
## [1] TRUE
all(abs(cor(y %*% cxy$ycoef) - diag(5)) < 1e-15)
## [1] TRUE
```
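The same invariants can be reproduced from scratch. The sketch below mirrors the QR-plus-SVD approach used by R's `cancor` (coefficient signs and scaling may differ from R's output, but the correlation checks are scale- and sign-invariant); the data here are new random matrices, not the R session's:

```python
import numpy as np

def cancor_np(x, y):
    """Canonical correlations and coefficient matrices via QR + SVD,
    in the spirit of R's cancor (up to signs and coefficient scaling)."""
    xc = x - x.mean(0)
    yc = y - y.mean(0)
    qx, rx = np.linalg.qr(xc)
    qy, ry = np.linalg.qr(yc)
    u, cor, vt = np.linalg.svd(qx.T @ qy)
    k = min(x.shape[1], y.shape[1])
    xcoef = np.linalg.solve(rx, u[:, :k])
    ycoef = np.linalg.solve(ry, vt.T[:, :k])
    return cor[:k], xcoef, ycoef

rng = np.random.default_rng(6)
x = rng.normal(size=(50, 3))
y = rng.normal(size=(50, 5))
cor, xcoef, ycoef = cancor_np(x, y)

# Same invariants the R session checks:
c_uv = np.corrcoef((x @ xcoef).T, (y @ ycoef).T)[:3, 3:]
ok1 = np.allclose(np.abs(np.diag(c_uv)), cor)            # canonical correlations
ok2 = np.allclose(np.corrcoef((x @ xcoef).T), np.eye(3)) # U components uncorrelated
```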