Let the random vector \(\mathbf X^T=[X_1,X_2,\cdots,X_p]\) have the covariance matrix \(\boldsymbol\Sigma\) with eigenvalues \(\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_p\ge0\). The linear combinations \(Y_i=\mathbf a_i^T\mathbf X=a_{i1}X_1+a_{i2}X_2+\cdots+a_{ip}X_p, \quad (i=1,2,\cdots,p)\) have \(Var(Y_i)=Var(\mathbf a_i^T\mathbf X)=\mathbf a_i^TCov(\mathbf X)\mathbf a_i=\mathbf a_i^T\boldsymbol\Sigma\mathbf a_i\) and \(Cov(Y_i,Y_k)=Cov(\mathbf a_i^T\mathbf X, \mathbf a_k^T\mathbf X)=\mathbf a_i^T\boldsymbol\Sigma\mathbf a_k, \quad i,k=1,2,\cdots,p\). The principal components are those uncorrelated linear combinations \(Y_1,Y_2,\cdots,Y_p\) of \([X_1,X_2,\cdots,X_p]\) whose variances \(Var(Y_i)=\mathbf a_i^T\boldsymbol\Sigma\mathbf a_i\) are as large as possible, subject to \(\mathbf a_i^T\mathbf a_i=1\). These linear combinations represent the selection of a new coordinate system obtained by rotating the original system, with \(Y_1,Y_2,\cdots,Y_p\) as the new coordinate axes.
The first principal component is the linear combination \(Y_1=\mathbf a_1^T\mathbf X\) that maximizes \(Var(Y_1)=\mathbf a_1^T\boldsymbol\Sigma\mathbf a_1\) subject to \(\mathbf a_1^T\mathbf a_1=1\); the second principal component is the linear combination \(Y_2=\mathbf a_2^T\mathbf X\) that maximizes \(Var(Y_2)=\mathbf a_2^T\boldsymbol\Sigma\mathbf a_2\) subject to \(\mathbf a_2^T\mathbf a_2=1\) and \(Cov(Y_1,Y_2)=\mathbf a_1^T\boldsymbol\Sigma\mathbf a_2=0\); and the \(i^{th}\) principal component is the linear combination \(Y_i=\mathbf a_i^T\mathbf X\) that maximizes \(Var(Y_i)=\mathbf a_i^T\boldsymbol\Sigma\mathbf a_i\) subject to \(\mathbf a_i^T\mathbf a_i=1\) and \(Cov(Y_i,Y_k)=\mathbf a_i^T\boldsymbol\Sigma\mathbf a_k=0\) for \(k<i\).
Let \((\lambda_i,\boldsymbol e_i), \quad i=1,2,\cdots,p\) be the eigenvalue-eigenvector pairs of \(\boldsymbol\Sigma\), where \(\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_p\ge0\). Writing \(\boldsymbol\Sigma=\mathbf P\boldsymbol\Lambda\mathbf P^T\) and \(\mathbf y=\mathbf P^T\mathbf a\), we have \[\frac{\mathbf a^T\boldsymbol\Sigma\mathbf a}{\mathbf a^T\mathbf a}=\frac{\mathbf a^T\boldsymbol\Sigma^{\frac{1}{2}}\boldsymbol\Sigma^{\frac{1}{2}}\mathbf a}{\mathbf a^T\mathbf a}=\frac{\mathbf a^T\mathbf P\boldsymbol \Lambda^{\frac{1}{2}}\mathbf P^T\mathbf P\boldsymbol \Lambda^{\frac{1}{2}}\mathbf P^T\mathbf a}{\mathbf a^T\mathbf P\mathbf P^T\mathbf a}=\frac{\mathbf y^T\boldsymbol\Lambda\mathbf y}{\mathbf y^T\mathbf y}\\
=\frac{\displaystyle\sum_{i=1}^{p}\lambda_iy_i^2}{\displaystyle\sum_{i=1}^{p}y_i^2}\le \lambda_1\frac{\displaystyle\sum_{i=1}^{p}y_i^2}{\displaystyle\sum_{i=1}^{p}y_i^2}=\lambda_1\] where \(\underset{p\times p}{\mathbf P}\) is the orthogonal matrix whose columns are the eigenvectors \(\mathbf e_1,\mathbf e_2,\cdots,\mathbf e_p\) and \(\underset{p\times p}{\boldsymbol\Lambda}\) is the diagonal matrix with eigenvalues \(\lambda_1,\lambda_2,\cdots,\lambda_p\) along the main diagonal. Therefore \(\underset{\mathbf a\ne\mathbf 0}{\text{max}}\displaystyle\frac{\mathbf a^T\boldsymbol\Sigma\mathbf a}{\mathbf a^T\mathbf a}=\lambda_1=\frac{\mathbf e_1^T\boldsymbol\Sigma\mathbf e_1}{\mathbf e_1^T\mathbf e_1}=\mathbf e_1^T\boldsymbol\Sigma\mathbf e_1=Var(Y_1)\), attained when \(\mathbf a=\mathbf e_1\). Similarly, \(\underset{\mathbf a\bot\mathbf e_1,\mathbf e_2,\cdots,\mathbf e_k}{\text{max}}\displaystyle\frac{\mathbf a^T\boldsymbol\Sigma\mathbf a}{\mathbf a^T\mathbf a}=\lambda_{k+1}=Var(Y_{k+1}), \quad k=1,2,\cdots,p-1\), attained when \(\mathbf a=\mathbf e_{k+1}\).
Then \(Y_1=\mathbf e_1^T\mathbf X, Y_2=\mathbf e_2^T\mathbf X,\cdots,Y_p=\mathbf e_p^T\mathbf X\) are the principal components, which are uncorrelated and have variances equal to the eigenvalues of the covariance matrix \(\boldsymbol\Sigma\) of \(\mathbf X^T=[X_1,X_2,\cdots,X_p]\).
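To make this concrete, here is a minimal sketch with simulated data (the mixing matrix is arbitrary, chosen only for illustration): the eigenvectors of the covariance matrix yield uncorrelated linear combinations whose variances are the eigenvalues.

# Minimal sketch with simulated data (mixing matrix chosen arbitrarily for illustration):
# the eigenvectors of the covariance matrix give uncorrelated linear combinations
# whose variances equal the eigenvalues
set.seed(1)
X <- matrix(rnorm(200 * 3), ncol = 3) %*% matrix(c(2, 1, 0, 1, 2, 1, 0, 1, 2), 3, 3)
e <- eigen(cov(X))
Y <- X %*% e$vectors   # column i holds Y_i = e_i' X for each observation
zapsmall(cov(Y))       # diagonal matrix with the eigenvalues on the diagonal
e$values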
- Population Principal Components
Because the covariance matrix \(\boldsymbol\Sigma\) of \(\mathbf X^T=[X_1,X_2,\cdots,X_p]\) has diagonal elements \(\sigma_{11},\sigma_{22},\dots,\sigma_{pp}\), we have \(tr(\boldsymbol\Sigma)=\displaystyle\sum_{i=1}^{p}\sigma_{ii}=\sum_{i=1}^{p}Var(X_i)\), and because \(tr(\boldsymbol\Sigma)=tr(\mathbf P\boldsymbol\Lambda\mathbf P^T)=tr(\boldsymbol\Lambda\mathbf P^T\mathbf P)=tr(\boldsymbol\Lambda)=\displaystyle\sum_{i=1}^{p}\lambda_i=\displaystyle\sum_{i=1}^{p}Var(Y_i)\), it follows that \((\text{Total population variance})=\displaystyle\sum_{i=1}^{p}Var(X_i)=\displaystyle\sum_{i=1}^{p}\sigma_{ii}=\displaystyle\sum_{i=1}^{p}Var(Y_i)=\displaystyle\sum_{i=1}^{p}\lambda_{i}\). The proportion of total variance explained by the \(k^{th}\) principal component is \(\Biggl({\substack{\text{Proportion of total }\\ \text{population variance due to } k^{th}\\ \text{ principal component}}}\Biggr)=\displaystyle\frac{\lambda_k}{\lambda_1+\lambda_2+\cdots+\lambda_p}, \quad k=1,2,\cdots,p\)
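For example, the proportions (and cumulative proportions) follow directly from the eigenvalues of the covariance matrix; a short sketch, using the longley data analyzed later in this section:

# Proportion of total variance explained by each principal component
# (sketch; S is the sample covariance matrix of the longley data used below)
S  <- cov(longley)
ev <- eigen(S)$values
ev / sum(ev)           # lambda_k / (lambda_1 + ... + lambda_p)
cumsum(ev) / sum(ev)   # cumulative proportion explained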
Because the \(k^{th}\) component of \(\mathbf X\) is \(X_k=\mathbf a_k^T\mathbf X\), where \(\mathbf a_k^T=[0,\cdots,0,1,0,\cdots,0]\), and because \(Y_i=\mathbf e_i^T\mathbf X\), we have \(Cov(Y_i, X_k)=Cov(X_k, Y_i)=Cov(\mathbf a_k^T\mathbf X, \mathbf e_i^T\mathbf X)=\mathbf a_k^T\boldsymbol\Sigma\mathbf e_i=\mathbf a_k^T\lambda_i\mathbf e_i=\lambda_ie_{ik}\). The correlation coefficients between the principal components \(Y_i\) and the variables \(X_k\) are therefore \[\rho_{Y_i,X_k}=\frac{Cov(Y_i, X_k)}{\sqrt{Var(Y_i)}\sqrt{Var(X_k)}}=\frac{\lambda_ie_{ik}}{\sqrt{\lambda_i}\sqrt{\sigma_{kk}}}=\frac{\sqrt{\lambda_i}e_{ik}}{\sqrt{\sigma_{kk}}}\quad (i,k=1,2,\cdots,p)\]
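This formula is easy to evaluate numerically; a sketch, again using the longley data introduced below:

# Correlations rho_{Y_i, X_k} = sqrt(lambda_i) * e_{ik} / sqrt(sigma_kk)
# (sketch; S is the sample covariance matrix of the longley data used below)
S   <- cov(longley)
en  <- eigen(S)
rho <- sqrt(en$values) * t(en$vectors)     # row i scaled by sqrt(lambda_i)
rho <- sweep(rho, 2, sqrt(diag(S)), "/")   # column k scaled by 1 / sqrt(sigma_kk)
dimnames(rho) <- list(paste0("Y", 1:ncol(S)), colnames(S))
rho                                        # rho[i, k] = corr(Y_i, X_k)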
If \(\mathbf X\) is multivariate normal, distributed as \(N_p(\boldsymbol\mu, \boldsymbol\Sigma)\), then the contours of constant density for \(\mathbf X\) are the ellipsoids centered at \(\boldsymbol\mu\): \[(\mathbf x-\boldsymbol\mu)^T\boldsymbol\Sigma^{-1}(\mathbf x-\boldsymbol\mu)=(\mathbf x-\boldsymbol\mu)^T\displaystyle\sum_{i=1}^{p}(\frac{1}{\lambda_i}\mathbf e_i\mathbf e_i^T)(\mathbf x-\boldsymbol\mu)\\ =\displaystyle\sum_{i=1}^{p}\frac{1}{\lambda_i}\Bigl[\mathbf e_i^T(\mathbf x-\boldsymbol\mu)\Bigr]^2=c^2,\] which have axes \(\pm c\sqrt{\lambda_i}\mathbf e_i, \quad (i=1,2,\cdots, p)\), where \((\lambda_i,\mathbf e_i)\) are the eigenvalue–eigenvector pairs of \(\boldsymbol\Sigma\). We can set \(\boldsymbol\mu=\boldsymbol0\), a translation that does not change the covariance of \(\mathbf X\); then \(\mathbf x^T\boldsymbol\Sigma^{-1}\mathbf x=\displaystyle\sum_{i=1}^{p}\frac{1}{\lambda_i}(\mathbf e_i^T\mathbf x)^2=\displaystyle\sum_{i=1}^{p}\frac{1}{\lambda_i}y_i^2=c^2\), where \(y_i=\mathbf e_i^T\mathbf x\) are the principal components of \(\mathbf x\). This equation defines an ellipsoid in a coordinate system with axes \(y_i\) lying in the directions \(\mathbf e_i\), which are the directions of the axes of a constant density ellipsoid. Any point on the \(i^{th}\) ellipsoid axis has \(\mathbf x\) coordinates proportional to \(\mathbf e_i^T=[e_{i1},e_{i2},\cdots,e_{ip}]\).
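As a numerical illustration (a sketch with a hypothetical 2 × 2 covariance matrix; \(c^2\) is taken as a chi-square quantile so the contour encloses a given probability):

# Axes of the constant-density ellipsoid (x - mu)' Sigma^{-1} (x - mu) = c^2
# (sketch; Sigma below is a hypothetical 2 x 2 covariance matrix)
Sigma <- matrix(c(4, 2, 2, 3), 2, 2)
c2    <- qchisq(0.95, df = 2)        # c^2 for the 95% contour of a bivariate normal
en    <- eigen(Sigma)
sqrt(c2) * sqrt(en$values)           # half-lengths c * sqrt(lambda_i) of the two axes
en$vectors                           # directions e_i of the axes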
If the random variables are standardized as \[Z_i=\frac{X_i-\mu_i}{\sqrt{\sigma_{ii}}},\] or in matrix notation \(\mathbf Z=(\mathbf D^{\frac{1}{2}})^{-1}(\mathbf X-\boldsymbol\mu)\), where \[\mathbf D=\begin{bmatrix} \sigma_{11}&0&0&\cdots&0\\ 0&\sigma_{22}&0&\cdots&0\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ 0&0&0&\cdots&\sigma_{pp}\\ \end{bmatrix}\] then \(E(\mathbf Z)=\mathbf0\) and \[Cov(\mathbf Z)=(\mathbf D^{\frac{1}{2}})^{-1}\boldsymbol\Sigma(\mathbf D^{\frac{1}{2}})^{-1}=\begin{bmatrix} \frac{\sigma_{11}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{11}}}&\frac{\sigma_{12}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}}&\frac{\sigma_{13}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{33}}}&\cdots&\frac{\sigma_{1p}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{pp}}}\\ \frac{\sigma_{12}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}}&\frac{\sigma_{22}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{22}}}&\frac{\sigma_{23}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{33}}}&\cdots&\frac{\sigma_{2p}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{pp}}}\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ \frac{\sigma_{1p}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{pp}}}&\frac{\sigma_{2p}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{pp}}}&\frac{\sigma_{3p}}{\sqrt{\sigma_{33}}\sqrt{\sigma_{pp}}}&\cdots&\frac{\sigma_{pp}}{\sqrt{\sigma_{pp}}\sqrt{\sigma_{pp}}}\\ \end{bmatrix}=\boldsymbol\rho\] The principal components of \(\mathbf Z\) can therefore be generated from the eigenvectors of \(\boldsymbol\rho\), which is both the covariance matrix of \(\mathbf Z\) and the correlation matrix of \(\mathbf X\). In general, the principal components derived from \(\boldsymbol\Sigma\) are different from those derived from \(\boldsymbol\rho\).
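A quick check of this relationship (a sketch using the longley data introduced below): standardizing the columns makes the sample covariance matrix equal to the correlation matrix, so the principal components of the standardized variables come from the eigenvectors of \(\boldsymbol\rho\).

# Standardized variables: Cov(Z) equals the correlation matrix of X
# (sketch using the longley data introduced below)
Z <- scale(longley)                                        # (X_i - mu_i) / sqrt(sigma_ii), column-wise
all.equal(cov(Z), cor(longley), check.attributes = FALSE)  # TRUE
eigen(cor(longley))$values   # generally differ from the covariance-based eigenvalues below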
- Sample Principal Components
Consider \(n\) measurements \(\mathbf x_1,\mathbf x_2,\cdots,\mathbf x_n\) on a \(p\)-dimensional population with mean vector \(\boldsymbol\mu\) and covariance matrix \(\underset{p\times p}{\boldsymbol\Sigma}\). These samples have sample mean vector \(\bar{\mathbf x}\), sample covariance matrix \(\underset{p\times p}{\mathbf S}\), and sample correlation matrix \(\underset{p\times p}{\mathbf R}\). A linear combination of the \(j^{th}\) measurement \(\mathbf x_j\) is \(\mathbf a_1^T\mathbf x_j=a_{11}x_{j1}+a_{12}x_{j2}+\cdots+a_{1p}x_{jp}, \quad (j=1,2,\cdots,n)\); this linear combination has sample mean \(\mathbf a_1^T\bar{\mathbf x}\) and sample variance \(\mathbf a_1^T\mathbf S\mathbf a_1\), and two linear combinations \(\mathbf a_1^T\mathbf x_j, \mathbf a_2^T\mathbf x_j\) have sample covariance \(\mathbf a_1^T\mathbf S\mathbf a_2\). The \(i^{th}\) sample principal component is the linear combination \(\mathbf a_i^T\mathbf x_j\) that maximizes the sample variance \(\mathbf a_i^T\mathbf S\mathbf a_i\) subject to \(\mathbf a_i^T\mathbf a_i=1\) and the sample covariance \(\mathbf a_i^T\mathbf S\mathbf a_k=0, \quad k<i\). The first sample principal component, the linear combination maximizing \(\mathbf a_1^T\mathbf S\mathbf a_1\), is \(\hat{y}_1=\hat{\mathbf e}_1^T\mathbf x_j\); the maximum is attained when \(\mathbf a_1=\hat{\mathbf e}_1\), the eigenvector of \(\mathbf S\) associated with its largest eigenvalue.
For the \(n\) measurements on a \(p\)-dimensional population, if the eigenvalue-eigenvector pairs of the sample covariance matrix \(\underset{p\times p}{\mathbf S}\) are \(\{(\hat{\lambda}_i,\hat{\mathbf e}_i), \quad i=1,2,\cdots,p\}\) with \(\hat{\lambda}_1\ge\hat{\lambda}_2\ge\cdots\ge\hat{\lambda}_p\), the \(i^{th}\) sample principal component is given by \(\hat{y}_i=\hat{\mathbf e}_i^T\mathbf x, \quad (i=1,2,\cdots,p)\), where \(\mathbf x\) is any observation on the variables \(x_1,x_2,\cdots,x_p\). The sample variance of \(\hat{y}_i\) is \(Var(\hat{y}_i)=\hat{\lambda}_i\), distinct sample principal components are uncorrelated, \(Cov(\hat{y}_i,\hat{y}_k)=0, \quad k\ne i\), and the total sample variance is \(\displaystyle\sum_{i=1}^{p}s_{ii}=\sum_{i=1}^{p}\hat{\lambda}_i\). Similarly, the correlation coefficients between the sample principal components \(\hat{y}_i\) and the variables \(x_k\) are \(\rho_{\hat{y}_i,x_k}=\displaystyle\frac{Cov(\hat{y}_i,x_k)}{\sqrt{Var(\hat{y}_i)}\sqrt{Var(x_k)}}=\frac{\hat{e}_{ik}\sqrt{\hat{\lambda}_i}}{\sqrt{s_{kk}}}, \quad i,k=1,2,\cdots,p\). If \(\mathbf X\) is distributed as \(N_p(\boldsymbol\mu, \boldsymbol\Sigma)\), the sample principal components \(\hat{y}_i=\hat{\mathbf e}_i^T(\mathbf x-\bar{\mathbf x})\) are realizations of the population principal components \(Y_i=\mathbf e_i^T(\mathbf X-\boldsymbol\mu)\), where the vector \([Y_1,Y_2,\cdots,Y_p]^T\) is distributed as \(N_p(\boldsymbol0, \boldsymbol\Lambda)\) and the diagonal matrix \(\boldsymbol\Lambda\) has entries \(\lambda_1,\lambda_2,\cdots,\lambda_p\), the eigenvalues of \(\boldsymbol\Sigma\). The contour consisting of all vectors \(\mathbf x\) that satisfy \((\mathbf x-\bar{\mathbf x})^T\mathbf S^{-1}(\mathbf x-\bar{\mathbf x})=c^2\) estimates the constant density contour \((\mathbf x-\boldsymbol\mu)^T\boldsymbol\Sigma^{-1}(\mathbf x-\boldsymbol\mu)=c^2\). The example below applies these results to the longley data; a short prcomp() sketch at the end computes the sample principal components directly.
data(longley)
longley
## GNP.deflator GNP Unemployed Armed.Forces Population Year Employed
## 1947 83.0 234.289 235.6 159.0 107.608 1947 60.323
## 1948 88.5 259.426 232.5 145.6 108.632 1948 61.122
## 1949 88.2 258.054 368.2 161.6 109.773 1949 60.171
## 1950 89.5 284.599 335.1 165.0 110.929 1950 61.187
## 1951 96.2 328.975 209.9 309.9 112.075 1951 63.221
## 1952 98.1 346.999 193.2 359.4 113.270 1952 63.639
## 1953 99.0 365.385 187.0 354.7 115.094 1953 64.989
## 1954 100.0 363.112 357.8 335.0 116.219 1954 63.761
## 1955 101.2 397.469 290.4 304.8 117.388 1955 66.019
## 1956 104.6 419.180 282.2 285.7 118.734 1956 67.857
## 1957 108.4 442.769 293.6 279.8 120.445 1957 68.169
## 1958 110.8 444.546 468.1 263.7 121.950 1958 66.513
## 1959 112.6 482.704 381.3 255.2 123.366 1959 68.655
## 1960 114.2 502.601 393.1 251.4 125.368 1960 69.564
## 1961 115.7 518.173 480.6 257.2 127.852 1961 69.331
## 1962 116.9 554.894 400.7 282.7 130.081 1962 70.551
dim(longley)
## [1] 16 7
cor(longley)
## GNP.deflator GNP Unemployed Armed.Forces Population
## GNP.deflator 1.0000000 0.9915892 0.6206334 0.4647442 0.9791634
## GNP 0.9915892 1.0000000 0.6042609 0.4464368 0.9910901
## Unemployed 0.6206334 0.6042609 1.0000000 -0.1774206 0.6865515
## Armed.Forces 0.4647442 0.4464368 -0.1774206 1.0000000 0.3644163
## Population 0.9791634 0.9910901 0.6865515 0.3644163 1.0000000
## Year 0.9911492 0.9952735 0.6682566 0.4172451 0.9939528
## Employed 0.9708985 0.9835516 0.5024981 0.4573074 0.9603906
## Year Employed
## GNP.deflator 0.9911492 0.9708985
## GNP 0.9952735 0.9835516
## Unemployed 0.6682566 0.5024981
## Armed.Forces 0.4172451 0.4573074
## Population 0.9939528 0.9603906
## Year 1.0000000 0.9713295
## Employed 0.9713295 1.0000000
cov(longley)
## GNP.deflator GNP Unemployed Armed.Forces Population
## GNP.deflator 116.45763 1063.6041 625.8666 349.0254 73.50300
## GNP 1063.60412 9879.3537 5612.4370 3088.0428 685.24094
## Unemployed 625.86663 5612.4370 8732.2343 -1153.7876 446.27415
## Armed.Forces 349.02537 3088.0428 -1153.7876 4843.0410 176.40981
## Population 73.50300 685.2409 446.2742 176.4098 48.38735
## Year 50.92333 470.9779 297.3033 138.2433 32.91740
## Employed 36.79666 343.3302 164.9103 111.7681 23.46197
## Year Employed
## GNP.deflator 50.92333 36.79666
## GNP 470.97790 343.33021
## Unemployed 297.30333 164.91027
## Armed.Forces 138.24333 111.76811
## Population 32.91740 23.46197
## Year 22.66667 16.24093
## Employed 16.24093 12.33392
# Get the eigenvalues and eigenvectors using eigen()
(en <- eigen(cov(longley)))
## eigen() decomposition
## $values
## [1] 1.536819e+04 7.078799e+03 1.205492e+03 1.645780e+00 2.352774e-01
## [6] 9.817098e-02 9.428974e-03
##
## $vectors
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] -0.08246505 0.03431671 0.04174461 0.953414736 0.200588528 -0.201911092
## [2,] -0.75612880 0.31889266 0.55626174 -0.078811527 -0.003389211 0.100012924
## [3,] -0.62581871 -0.58065916 -0.52048350 -0.006811143 -0.010137793 0.006269162
## [4,] -0.15764282 0.74796225 -0.64460779 -0.011957646 -0.003793022 -0.002353907
## [5,] -0.05438061 0.01345204 0.03418848 -0.281879616 0.461366947 -0.837567707
## [6,] -0.03716835 0.01181644 0.01849000 0.019100099 -0.323260157 -0.132982953
## [7,] -0.02509395 0.01402362 0.02995436 0.069128695 -0.801423112 -0.479562668
## [,7]
## [1,] -0.016606051
## [2,] 0.030401713
## [3,] 0.009713214
## [4,] 0.004370434
## [5,] -0.043101018
## [6,] -0.935729980
## [7,] 0.348192811
# Eigenvectors of a symmetric matrix are orthogonal; check A'A using crossprod()
crossprod(en$vectors)
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1.000000e+00 -6.619054e-17 5.594483e-17 -7.155734e-18 -3.469447e-18
## [2,] -6.619054e-17 1.000000e+00 1.501078e-16 -4.445229e-17 0.000000e+00
## [3,] 5.594483e-17 1.501078e-16 1.000000e+00 -1.647987e-17 6.938894e-18
## [4,] -7.155734e-18 -4.445229e-17 -1.647987e-17 1.000000e+00 -5.551115e-17
## [5,] -3.469447e-18 0.000000e+00 6.938894e-18 -5.551115e-17 1.000000e+00
## [6,] -5.204170e-18 1.561251e-17 1.734723e-18 2.012279e-16 -4.440892e-16
## [7,] 0.000000e+00 8.673617e-19 -6.938894e-18 0.000000e+00 0.000000e+00
## [,6] [,7]
## [1,] -5.204170e-18 0.000000e+00
## [2,] 1.561251e-17 8.673617e-19
## [3,] 1.734723e-18 -6.938894e-18
## [4,] 2.012279e-16 0.000000e+00
## [5,] -4.440892e-16 0.000000e+00
## [6,] 1.000000e+00 -2.775558e-17
## [7,] -2.775558e-17 1.000000e+00
# trace(cov) = sum of eigenvalues
matlib::tr(cov(longley))
## [1] 23654.47
sum(en$values)
## [1] 23654.47
# determinant = product of eigenvalues, det(A) = prod(lambda_i);
# the determinant is zero if any lambda_i = 0
det(cov(longley))
## [1] 47005222
prod(en$values)
## [1] 47005222
# rank = number of non-zero eigenvalues
matlib::R(cov(longley))
## [1] 7
sum(en$values != 0)
## [1] 7
# sum of squares of the elements of A = sum of squared eigenvalues, sum(lambda_i^2)
sum((cov(longley))^2)
## [1] 287744025
sum(en$values^2)
## [1] 287744025
# get the inverse matrix using solve()
AI <- solve(cov(longley))
AI
## GNP.deflator GNP Unemployed Armed.Forces Population
## GNP.deflator 1.16786027 -0.30776300 -0.042606945 -0.013034547 2.028608136
## GNP -0.30776300 0.20404438 0.037941637 0.012064719 -0.985381346
## Unemployed -0.04260695 0.03794164 0.011169221 0.004788169 -0.116613849
## Armed.Forces -0.01303455 0.01206472 0.004788169 0.002655552 -0.005301107
## Population 2.02860814 -0.98538135 -0.116613849 -0.005301107 8.295913765
## Year 1.65695843 -3.14878669 -0.958587835 -0.425468239 4.774745254
## Employed -0.27011189 0.64236279 0.362297651 0.185293607 0.916474819
## Year Employed
## GNP.deflator 1.6569584 -0.2701119
## GNP -3.1487867 0.6423628
## Unemployed -0.9585878 0.3622977
## Armed.Forces -0.4254682 0.1852936
## Population 4.7747453 0.9164748
## Year 93.4862110 -32.8030642
## Employed -32.8030642 17.9334871
# zapsmall() is handy for cleaning up tiny values.
zapsmall(AI %*% cov(longley))
## GNP.deflator GNP Unemployed Armed.Forces Population Year Employed
## GNP.deflator 1 0 0 0 0 0 0
## GNP 0 1 0 0 0 0 0
## Unemployed 0 0 1 0 0 0 0
## Armed.Forces 0 0 0 1 0 0 0
## Population 0 0 0 0 1 0 0
## Year 0 0 0 0 0 1 0
## Employed 0 0 0 0 0 0 1
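Finally, a minimal sketch of the sample principal components themselves: prcomp() centers the data and uses the eigenvectors of the sample covariance matrix \(\mathbf S\) as loadings, so its squared standard deviations reproduce the eigenvalues computed above (pass scale. = TRUE to work from the correlation matrix \(\mathbf R\) instead).

# Sample principal components of the longley data (sketch)
pca <- prcomp(longley)            # covariance-based; scale. = TRUE would use R instead of S
all.equal(pca$sdev^2, en$values)  # variances of the sample PCs = eigenvalues of S
head(pca$x)                       # sample principal component scores e_hat_i'(x - x_bar)
summary(pca)                      # proportion of total sample variance explained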