
Matrix Algebra
L. R. Schaeffer, March 1999

Definitions

1.
Scalars. A scalar is a single quantity usually designated by an italicized lower case letter, for example a, r, and k.

2.
Matrix. A matrix is a group of elements arranged in rows and columns. Elements can be scalars, mathematical expressions, or other matrices. Upper case letters in bold face type denote matrices. Examples of matrices are

\begin{displaymath}\bf {A} \mbox{ = } \left( \begin{array}{rrrr}
7 & 18 & -2 & 22 \\ -16 & 3 & 55 & 1 \\ 9 & -4 & 0 & 31
\end{array} \right), \end{displaymath}


\begin{displaymath}\bf {B} \mbox{ = } \left( \begin{array}{ccc}
x & y+1 & x+y+z \\ \cdots & c \log d & e \\ \sqrt{x-y} & (m+n)/n & p
\end{array} \right), \end{displaymath}

and

\begin{displaymath}\bf {C} \mbox{ = } \left( \begin{array}{ll}
\bf {C}_{11} & \bf {C}_{12} \\ \bf {C}_{21} & \bf {C}_{22}
\end{array} \right). \end{displaymath}

The order or dimension of a matrix is the number of rows and columns.

3.
Vectors. A column vector is a matrix with one column, and a row vector is a matrix with one row. Usually if the word vector is used, it is implied that the vector is a column vector. Vectors are commonly denoted by lower case letters in bold face type, but this is not a universal practice.

4.
Transposition. Transposition is the formation of a new matrix by making the rows of the original matrix into the columns of the new matrix, in corresponding order. The transpose of a column vector is a row vector, and the transpose of a matrix of order \( m \times n \) is a new matrix of order \( n \times m \). If \( \bf {A} \) is the original matrix, then the transpose is denoted as \( \bf {A'} \) in these notes, but in some places the transpose is denoted as \( \bf {A}^T \). An example is given at the end of this list.

5.
Diagonals. The diagonal elements of a matrix are those elements whose row and column numbers are the same. For example, if

\begin{displaymath}\bf {D} \mbox{ = } \left( \begin{array}{rrr}
8 & 3 & -2 \\ 5 & 7 & -1 \end{array} \right) \end{displaymath}

then the diagonal elements are 8 and 7.

Off-diagonal elements of a matrix are all of the other elements excluding the diagonals.
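
As an example of transposition (item 4 above), the transpose of the \( 3 \times 4 \) matrix \( \bf {A} \) shown in item 2 is the \( 4 \times 3 \) matrix

\begin{displaymath}{\bf A'} \mbox{ = } \left( \begin{array}{rrr}
7 & -16 & 9 \\ 18 & 3 & -4 \\ -2 & 55 & 0 \\ 22 & 1 & 31
\end{array} \right). \end{displaymath}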

Types of Matrices

1.
Rectangular. A matrix is rectangular if the number of rows does not equal the number of columns. All vectors are rectangular matrices.
2.
Square. Square matrices have the same number of rows as columns.
3.
Symmetric. Symmetric matrices are also square matrices. The element in the ith row and jth column is equal to the element in the jth row and ith column, for all values of i and j. The transpose of a symmetric matrix is the same matrix, i.e. \( \bf {A'} = \bf {A} \). Take any matrix, say \( \bf {B} \), and form the product \( \bf {B'B} \); that product is always square and symmetric. A numerical illustration is given at the end of this list.
4.
Diagonal. A diagonal matrix is a square, symmetric matrix in which all of the elements are zero except the diagonal elements. A diagonal matrix can be written as

\begin{displaymath}\bf {D} \mbox{ = } {\bf diag}( d_{1} \mbox{ } d_{2} \cdots
d_{n}) \end{displaymath}

where \( d_{i} \) are the diagonal elements.
5.
Identity. An identity matrix is a special diagonal matrix in which all of the diagonal elements are equal to one. An identity matrix is usually represented by \( \bf {I} \). Because it is a diagonal matrix, an identity matrix is also square and symmetric.
6.
Null. A null matrix is a matrix of any dimension in which all elements are equal to zero, and is denoted by \( \bf {0} \).
7.
Matrices With Only 1's. A column vector of length n with every element equal to 1, is usually denoted as \( \bf {1}_{n} \). A matrix with r rows and c columns and all elements equal to 1 is denoted as \( \bf {J} \).
8.
Triangular. Triangular matrices are square matrices in which all of the elements on one side of the diagonal are equal to zero. An upper triangular matrix has all elements below the diagonal equal to zero, and a lower triangular matrix has all elements above the diagonal equal to zero.
9.
Tridiagonal. A tridiagonal matrix is a square matrix with all elements equal to zero except the diagonals and the elements immediately to the left and right of the diagonal. An example is shown below.

\begin{displaymath}\bf {B} \mbox{ = } \left( \begin{array}{rrrrrr}
10 & 3 & 0 & 0 & 0 & 0 \\
3 & 10 & 3 & 0 & 0 & 0 \\
0 & 3 & 10 & 3 & 0 & 0 \\
0 & 0 & 3 & 10 & 3 & 0 \\
0 & 0 & 0 & 3 & 10 & 3 \\
0 & 0 & 0 & 0 & 3 & 10 \end{array} \right) \end{displaymath}
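
As the numerical illustration promised in item 3 above, take a small matrix chosen here only for illustration,

\begin{displaymath}{\bf B} \mbox{ = } \left( \begin{array}{rr}
1 & 2 \\ 3 & 4 \end{array} \right), \mbox{ then }
{\bf B'B} \mbox{ = } \left( \begin{array}{rr}
1 & 3 \\ 2 & 4 \end{array} \right)
\left( \begin{array}{rr}
1 & 2 \\ 3 & 4 \end{array} \right)
\mbox{ = } \left( \begin{array}{rr}
10 & 14 \\ 14 & 20 \end{array} \right), \end{displaymath}

which is square and symmetric.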

Addition of Matrices

Matrices are conformable for addition if they have the same order. The resulting sum is a matrix having the same number of rows and columns as the two matrices to be added. Matrices that are not of the same order cannot be added together. If \( \bf {A} = \{ a_{ij} \} \) and \( \bf {B} = \{ b_{ij} \} \), then

\begin{displaymath}\bf {A}+\bf {B} \mbox{ = } \{ a_{ij} + b_{ij} \}. \end{displaymath}

For example, let

\begin{displaymath}{\bf A} \mbox{ = } \left( \begin{array}{rrr}
4 & 5 & 3 \\ 6 & 0 & 2 \end{array} \right) \mbox{ and }
{\bf B} \mbox{ = } \left( \begin{array}{rrr}
1 & 0 & 2 \\ 3 & 4 & 1 \end{array} \right) \end{displaymath}

then

\begin{displaymath}{\bf A}+{\bf B} \mbox{ = } \left( \begin{array}{rrr}
4+1 & 5+0 & 3+2 \\ 6+3 & 0+4 & 2+1 \end{array} \right) \end{displaymath}


\begin{displaymath}\mbox{ = } \left( \begin{array}{rrr}
5 & 5 & 5 \\ 9 & 4 & 3 \end{array} \right) \mbox{ = }
{\bf B} + {\bf A} . \end{displaymath}

Subtraction is the addition of two matrices, one of which has all elements multiplied by a minus one (-1). That is,

\begin{displaymath}{\bf A}+(-1){\bf B} \mbox{ = } \left( \begin{array}{rrr}
3 & 5 & 1 \\ 3 & -4 & 1 \end{array} \right). \end{displaymath}

Multiplication of Matrices

Two matrices are conformable for multiplication if the number of columns in the first matrix equals the number of rows in the second matrix. If $\bf {C}$ has order $p \times q$ and $\bf {D}$ has order $m \times n$, then the product $\bf {CD}$ exists only if q=m. The product matrix has order $p \times n$. In general, \( \bf {CD} \) does not equal \( \bf {DC} \), and most often the product \( \bf {DC} \) may not even exist because $\bf {D}$ may not be conformable for multiplication with $\bf {C}$. Thus, the ordering of matrices in a product must be carefully and precisely written.

The computation of a product is defined as follows: let

\begin{displaymath}{\bf C}_{p \times q} = \{ c_{ij} \} \end{displaymath}

and

\begin{displaymath}{\bf D}_{m \times n} = \{ d_{ij} \} \end{displaymath}

and q=m, then

\begin{displaymath}{\bf CD}_{p \times n} = \{ \sum_{k=1}^{m} c_{ik}d_{kj} \}. \end{displaymath}

As an example, let

\begin{displaymath}{\bf C} \mbox{ = } \left( \begin{array}{rrr}
6 & 4 & -3 \\ 3 & 9 & -7 \\ 8 & 5 & -2 \end{array} \right) \mbox{ and }
{\bf D} \mbox{ = } \left( \begin{array}{rr}
1 & 1 \\ 2 & 0 \\ 3 & -1 \end{array} \right), \end{displaymath}

then

\begin{displaymath}{\bf CD} \mbox{ = } \left( \begin{array}{rr}
6(1)+4(2)-3(3) & 6(1)+4(0)-3(-1) \\
3(1)+9(2)-7(3) & 3(1)+9(0)-7(-1) \\
8(1)+5(2)-2(3) & 8(1)+5(0)-2(-1) \end{array} \right)
\mbox{ = } \left( \begin{array}{rr}
5 & 9 \\ 0 & 10 \\ 12 & 10 \end{array} \right). \end{displaymath}

Transpose of a Product

The transpose of the product of two or more matrices is the product of the transposes of each matrix in reverse order. That is, the transpose of $\bf {CDE}$, for example, is $\bf {E'D'C'}$. The reader should verify that the result is conformable for multiplication, and that the result has the correct dimensions.
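
As a quick numerical check, with two small matrices chosen here only for illustration, let

\begin{displaymath}{\bf C} \mbox{ = } \left( \begin{array}{rr}
1 & 2 \\ 3 & 4 \end{array} \right) \mbox{ and }
{\bf D} \mbox{ = } \left( \begin{array}{r}
5 \\ 6 \end{array} \right), \mbox{ then }
{\bf CD} \mbox{ = } \left( \begin{array}{r}
17 \\ 39 \end{array} \right), \mbox{ and }
({\bf CD})' \mbox{ = } \left( \begin{array}{rr}
17 & 39 \end{array} \right) \mbox{ = } {\bf D'C'}. \end{displaymath}

Note that \( {\bf D'C'} \) is a \( (1 \times 2)(2 \times 2) \) product of order \( 1 \times 2 \), the same order as \( ({\bf CD})' \).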

Special Products

A matrix, say $\bf {A}$, is idempotent if the product of the matrix with itself equals itself, i.e. $\bf {AA} = \bf {A}$. This implies that the matrix must be square, but not necessarily symmetric.
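
For example, the matrix below (chosen here only for illustration) is idempotent, because multiplying it by itself reproduces the same matrix:

\begin{displaymath}{\bf A} \mbox{ = } \left( \begin{array}{rr}
.5 & .5 \\ .5 & .5 \end{array} \right), \mbox{ and }
{\bf AA} \mbox{ = } \left( \begin{array}{rr}
.5 & .5 \\ .5 & .5 \end{array} \right) \mbox{ = } {\bf A}. \end{displaymath}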

A matrix is nilpotent if the product of the matrix with itself equals a null matrix, i.e. $\bf {BB} = \bf {0}$, then $\bf {B}$ is nilpotent.

A matrix is orthogonal if the product of the matrix with its transpose equals an identity matrix, i.e. $\bf {UU'} = \bf {I}$, which also implies that $\bf {U'U} = \bf {I}$.
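
A small example of an orthogonal matrix (chosen here only for illustration) is

\begin{displaymath}{\bf U} \mbox{ = } \frac{1}{\sqrt{2}} \left( \begin{array}{rr}
1 & 1 \\ -1 & 1 \end{array} \right), \mbox{ for which }
{\bf UU'} \mbox{ = } \frac{1}{2} \left( \begin{array}{rr}
2 & 0 \\ 0 & 2 \end{array} \right) \mbox{ = } {\bf I}. \end{displaymath}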

Some Rules on Multiplication

1.
Multiplication of a matrix by a scalar results in multiplying every element of the matrix by that scalar.
2.
\(\bf {ABC} = \bf {A(BC)} = \bf {(AB)C} \).
3.
\(\bf {A(B+C)} = \bf {AB}+\bf {AC} \).
4.
\(\bf {(A+B)}^{2} = (\bf {A}+\bf {B})(\bf {A}+\bf {B}) =
\bf {AA}+\bf {AB}+\bf {BA}+\bf {BB} \).

Traces of Square Matrices

The trace is an operation that is applied to a square matrix and is denoted by the letters tr. The trace operation is the sum of the diagonal elements of a matrix. The sum is a scalar quantity. Let

\begin{displaymath}{\bf A} \mbox{ = } \left( \begin{array}{rrr}
.51 & -.32 & -.19 \\ -.28 & .46 & -.14 \\ -.21 & -.16 & .33
\end{array} \right), \end{displaymath}

then the trace is

\begin{displaymath}tr({\bf A}) \mbox{ = } .51 + .46 + .33 \mbox{ = } 1.30. \end{displaymath}

The trace of the product of conformable matrices has a special property known as the rotation rule of traces. That is,

\begin{displaymath}tr({\bf ABC}) \mbox{ = } tr({\bf BCA}) \mbox{ = } tr({\bf CAB}). \end{displaymath}

The traces are equal, because they are scalars, even though the dimensions of the three products might be greatly different.
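
As a small check of the rotation rule, with matrices chosen here only for illustration, let

\begin{displaymath}{\bf A} \mbox{ = } \left( \begin{array}{rr}
1 & 2 \\ 3 & 4 \end{array} \right) \mbox{ and }
{\bf B} \mbox{ = } \left( \begin{array}{rr}
5 & 6 \\ 7 & 8 \end{array} \right), \mbox{ then }
{\bf AB} \mbox{ = } \left( \begin{array}{rr}
19 & 22 \\ 43 & 50 \end{array} \right) \mbox{ and }
{\bf BA} \mbox{ = } \left( \begin{array}{rr}
23 & 34 \\ 31 & 46 \end{array} \right), \end{displaymath}

so that \( tr({\bf AB}) = 19 + 50 = 69 = 23 + 46 = tr({\bf BA}) \).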

Euclidean Norm

The Euclidean Norm is a matrix operation usually used to determine the degree of difference between two matrices of the same order. The norm is a scalar and is denoted as $ \Vert {\bf A} \Vert $ .


\begin{displaymath}\Vert {\bf A} \Vert \mbox{ = } [ tr({\bf AA'}) ]^{.5} \mbox{ = }
( \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^{2} )^{.5}. \end{displaymath}

For example, let

\begin{displaymath}{\bf A} \mbox{ = } \left( \begin{array}{rrr}
7 & -5 & 2 \\ 4 & 6 & 3 \\ 1 & -1 & 8 \end{array} \right), \end{displaymath}

then

\begin{displaymath}\Vert {\bf A} \Vert \mbox{ = } (49 + 25 + 4 + \cdots + 1 + 64 )^{.5} \end{displaymath}


\begin{displaymath}\mbox{ = } (205)^{.5} = 14.317821. \end{displaymath}

Other types of norms also exist.

Other Useful Matrix Operations

1.
Direct Sum. For matrices of any dimension, say ${\bf H}_{1}$, ${\bf H}_{2}$, ... $ {\bf H}_{n}$, the direct sum is


\begin{displaymath}\mbox{ $ \sum_{i}^{+} $ } {\bf H}_{i} \mbox{ = } \left( \begin{array}{cccc}
{\bf H}_{1} & {\bf0} & \cdots & {\bf0} \\
{\bf0} & {\bf H}_{2} & \cdots & {\bf0} \\
\vdots & \vdots & \ddots & \vdots \\
{\bf0} & {\bf0} & \cdots & {\bf H}_{n} \end{array} \right). \end{displaymath}

2.
Kronecker Product. The Kronecker product, also known as the direct product, is formed by multiplying every element of the first matrix, as a scalar, by the entire second matrix. Suppose that $\bf {B}$ is a matrix of order $m \times n$ and that $\bf {A}$ is of order $2 \times 2$, then the direct product of $\bf {A}$ times $\bf {B}$ is

\begin{displaymath}{\bf A} \star {\bf B} \mbox{ = } {\bf A} \otimes {\bf B} \mbox{ = } \left( \begin{array}{cc}
a_{11}{\bf B} & a_{12}{\bf B} \\ a_{21}{\bf B} & a_{22}{\bf B}
\end{array} \right). \end{displaymath}

Notice that the dimension of the example product is $2m \times 2n$.

3.
Hadamard Product. The Hadamard product exists for two matrices that are conformable for addition. The corresponding elements of the two matrices are multiplied together, and the order of the resulting product is the same as that of the matrices in the product. For two matrices, $\bf {A}$ and $\bf {B}$, of order $2 \times 2$, the Hadamard product is

\begin{displaymath}{\bf A} \odot {\bf B} \mbox{ = } \left( \begin{array}{cc}
a_{11}b_{11} & a_{12}b_{12} \\ a_{21}b_{21} & a_{22}b_{22}
\end{array} \right). \end{displaymath}
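
To make the three operations above concrete, here is a small numerical illustration with two \( 2 \times 2 \) matrices chosen only for this purpose. Let

\begin{displaymath}{\bf A} \mbox{ = } \left( \begin{array}{rr}
1 & 2 \\ 3 & 4 \end{array} \right) \mbox{ and }
{\bf B} \mbox{ = } \left( \begin{array}{rr}
0 & 5 \\ 6 & 7 \end{array} \right). \end{displaymath}

Then the direct sum, with \( {\bf H}_{1} = {\bf A} \) and \( {\bf H}_{2} = {\bf B} \), is

\begin{displaymath}\mbox{ $ \sum_{i}^{+} $ } {\bf H}_{i} \mbox{ = } \left( \begin{array}{rrrr}
1 & 2 & 0 & 0 \\ 3 & 4 & 0 & 0 \\
0 & 0 & 0 & 5 \\ 0 & 0 & 6 & 7 \end{array} \right), \end{displaymath}

the Kronecker product is

\begin{displaymath}{\bf A} \otimes {\bf B} \mbox{ = } \left( \begin{array}{rrrr}
0 & 5 & 0 & 10 \\ 6 & 7 & 12 & 14 \\
0 & 15 & 0 & 20 \\ 18 & 21 & 24 & 28 \end{array} \right), \end{displaymath}

and the Hadamard product is

\begin{displaymath}{\bf A} \odot {\bf B} \mbox{ = } \left( \begin{array}{rr}
0 & 10 \\ 18 & 28 \end{array} \right). \end{displaymath}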

Elementary Operators

Elementary operator matrices are identity matrices that have been modified by one of three methods. The product of an elementary operator matrix with any other matrix does not alter the rank of that matrix. Elementary operator matrices reduce a matrix to either a triangular or a diagonal form. The reduction process may involve many elementary operator matrices multiplied together. To illustrate the three types of elementary operator matrices and the process of reducing a matrix, begin with an example matrix, $\bf {A}$, where

\begin{displaymath}{\bf A} \mbox{ = } \left( \begin{array}{rrr}
4 & 16 & 24 \\ 3 & -1 & -2 \\ 1 & 5 & 8 \end{array} \right). \end{displaymath}

The first type of elementary operator matrix has one of the diagonal elements of an identity matrix replaced with a constant other than 1. Let $ {\bf E}_{11}(.25) $ be an elementary operator where the first diagonal element has been changed to .25. Pre-multiplication of $\bf {A}$ by this matrix gives $ {\bf A}_{1} $, where

\begin{displaymath}{\bf A}_{1} \mbox{ = } \left( \begin{array}{ccc}
.25 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right)
\left( \begin{array}{rrr}
4 & 16 & 24 \\ 3 & -1 & -2 \\ 1 & 5 & 8 \end{array} \right) \end{displaymath}


\begin{displaymath}\mbox{ = } \left( \begin{array}{rrr}
1 & 4 & 6 \\ 3 & -1 & -2 \\ 1 & 5 & 8 \end{array} \right). \end{displaymath}

The second type of elementary operator matrix interchanges rows or columns of a matrix. Let $ {\bf E}_{ij} $ be an elementary operator where rows i and j are interchanged. Pre-multiplication of $ {\bf A}_{1} $ by this matrix gives $ {\bf A}_{2} $ where

\begin{displaymath}{\bf A}_{2} \mbox{ = } \left( \begin{array}{rrr}
1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{array} \right)
{\bf A}_{1} \end{displaymath}


\begin{displaymath}\mbox{ = } \left( \begin{array}{rrr}
1 & 4 & 6 \\ 1 & 5 & 8 \\ 3 & -1 & -2 \end{array} \right). \end{displaymath}

The third type of elementary operator matrix combines the elements of two rows or columns of a matrix by addition or subtraction. Let $ {\bf E}_{ij}(c) $ be an elementary operator where c times row i is added to row j. Pre-multiplication of $ {\bf A}_{2} $ by this matrix gives $ {\bf A}_{3} $ where

\begin{displaymath}{\bf A}_{3} \mbox{ = } \left( \begin{array}{rrr}
1 & 0 & 0 \\ 0 & 1 & 0 \\ -3 & 0 & 1 \end{array} \right)
{\bf A}_{2} \end{displaymath}


\begin{displaymath}\mbox{ = } \left( \begin{array}{rrr}
1 & 4 & 6 \\ 1 & 5 & 8 \\ 0 & -13 & -20 \end{array} \right). \end{displaymath}

By continued pre-multiplication of $ {\bf A}_{3} $ by other elementary operators, $\bf {A}$ may be reduced to an upper triangular matrix. Further post-multiplication by another set of elementary operators can reduce $\bf {A}$ to a diagonal matrix.
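
For instance, one possible continuation (not part of the original example) uses two more operators of the third type:

\begin{displaymath}\left( \begin{array}{rrr}
1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right) {\bf A}_{3}
\mbox{ = } \left( \begin{array}{rrr}
1 & 4 & 6 \\ 0 & 1 & 2 \\ 0 & -13 & -20 \end{array} \right),
\mbox{ and then }
\left( \begin{array}{rrr}
1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 13 & 1 \end{array} \right)
\left( \begin{array}{rrr}
1 & 4 & 6 \\ 0 & 1 & 2 \\ 0 & -13 & -20 \end{array} \right)
\mbox{ = } \left( \begin{array}{rrr}
1 & 4 & 6 \\ 0 & 1 & 2 \\ 0 & 0 & 6 \end{array} \right), \end{displaymath}

which is upper triangular.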

Matrix Inversion

By definition, an inverse of a square matrix $\bf {A}$, denoted by $ {\bf A}^{-1} $, is a matrix which, when pre- or post-multiplied by the original matrix, yields an identity matrix. That is,

\begin{displaymath}{\bf AA^{-1} } \mbox{ = } {\bf I}, \mbox{ and }
{\bf A^{-1}A } \mbox{ = } {\bf I} .\end{displaymath}

Any square matrix with a nonzero determinant can be inverted. Thus, the first step to matrix inversion is the calculation of the determinant of the matrix. The determinant is a scalar quantity. For a $2 \times 2$ matrix, say

\begin{displaymath}{\bf A} \mbox{ = } \left( \begin{array}{cc}
a_{11} & a_{12} \\ a_{21} & a_{22} \end{array} \right), \end{displaymath}

then the determinant is

\begin{displaymath}\vert {\bf A} \vert \mbox{ = } a_{11}a_{22}-a_{21}a_{12} . \end{displaymath}

For a $3 \times 3$ matrix, the determinant can be reduced to a series of determinants of $2 \times 2$ matrices. For example, let

\begin{displaymath}{\bf B} \mbox{ = } \left( \begin{array}{rrr}
6 & -1 & 2 \\ 3 & 4 & -5 \\ 1 & 0 & -2 \end{array} \right),\end{displaymath}

then

\begin{displaymath}\vert {\bf B} \vert \mbox{ = } 6 \left\vert \begin{array}{rr}
4 & -5 \\ 0 & -2 \end{array} \right\vert
- (-1) \left\vert \begin{array}{rr}
3 & -5 \\ 1 & -2 \end{array} \right\vert
+ 2 \left\vert \begin{array}{rr}
3 & 4 \\ 1 & 0 \end{array} \right\vert \end{displaymath}


\begin{displaymath}\mbox{ = } 6(-8) \mbox{ } +1(-1) \mbox{ } +2(-4) \end{displaymath}


\begin{displaymath}\mbox{ = } -57. \end{displaymath}

The general expression for the determinant of a matrix is

\begin{displaymath}\vert {\bf B} \vert \mbox{ = } \sum_{j=1}^{n} (-1)^{i+j}b_{ij}
\vert {\bf M}_{ij} \vert \end{displaymath}

where $\bf {B}$ is of order n, and $ {\bf M}_{ij} $ is a minor submatrix of $\bf {B}$ resulting from the deletion of the ith row and jth column of $\bf {B}$. Any row of $\bf {B}$ may be used to compute the determinant because the result is the same for every row. Columns may also be used instead of rows. The reader may verify this assertion.
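
For instance, expanding along the third row of the \( 3 \times 3 \) matrix \( \bf {B} \) above gives

\begin{displaymath}\vert {\bf B} \vert \mbox{ = } (+1)(1) \left\vert \begin{array}{rr}
-1 & 2 \\ 4 & -5 \end{array} \right\vert
+ (-1)(0) \left\vert \begin{array}{rr}
6 & 2 \\ 3 & -5 \end{array} \right\vert
+ (+1)(-2) \left\vert \begin{array}{rr}
6 & -1 \\ 3 & 4 \end{array} \right\vert
\mbox{ = } (1)(-3) + 0 + (-2)(27) \mbox{ = } -57, \end{displaymath}

the same value as obtained from the first row.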

If the determinant is non-zero, then proceed to the next step which is to find the matrix of signed minors, known as the adjoint matrix. Using the same matrix, $\bf {B}$ above, the minors and their determinants are as follows:

\begin{displaymath}{\bf M}_{11} \mbox{ = } +1 \left( \begin{array}{rr}
4 & -5 \\ 0 & -2 \end{array} \right), \mbox{ and }
\vert {\bf M}_{11} \vert \mbox{ = } -8, \end{displaymath}


\begin{displaymath}{\bf M}_{12} \mbox{ = } -1 \left( \begin{array}{rr}
3 & -5 \\ 1 & -2 \end{array} \right), \mbox{ and }
\vert {\bf M}_{12} \vert \mbox{ = } +1, \end{displaymath}


\begin{displaymath}{\bf M}_{13} \mbox{ = } +1 \left( \begin{array}{rr}
3 & 4 \\ 1 & 0 \end{array} \right), \mbox{ and }
\vert {\bf M}_{13} \vert \mbox{ = } -4, \end{displaymath}


\begin{displaymath}{\bf M}_{21} \mbox{ = } -1 \left( \begin{array}{rr}
-1 & 2 \\ 0 & -2 \end{array} \right), \mbox{ and }
\vert {\bf M}_{21} \vert \mbox{ = } -2, \end{displaymath}


\begin{displaymath}{\bf M}_{22} \mbox{ = } +1 \left( \begin{array}{rr}
6 & 2 \\ 1 & -2 \end{array} \right), \mbox{ and }
\vert {\bf M}_{22} \vert \mbox{ = } -14, \end{displaymath}


\begin{displaymath}{\bf M}_{23} \mbox{ = } -1 \left( \begin{array}{rr}
6 & -1 \\ 1 & 0 \end{array} \right), \mbox{ and }
\vert {\bf M}_{23} \vert \mbox{ = } -1, \end{displaymath}


\begin{displaymath}{\bf M}_{31} \mbox{ = } +1 \left( \begin{array}{rr}
-1 & 2 \\ 4 & -5 \end{array} \right), \mbox{ and }
\vert {\bf M}_{31} \vert \mbox{ = } -3, \end{displaymath}


\begin{displaymath}{\bf M}_{32} \mbox{ = } -1 \left( \begin{array}{rr}
6 & 2 \\ 3 & -5 \end{array} \right), \mbox{ and }
\vert {\bf M}_{32} \vert \mbox{ = } 36, \mbox{ and } \end{displaymath}


\begin{displaymath}{\bf M}_{33} \mbox{ = } +1 \left( \begin{array}{rr}
6 & -1 \\ 3 & 4 \end{array} \right), \mbox{ and }
\vert {\bf M}_{33} \vert \mbox{ = } 27. \end{displaymath}

The adjoint matrix of signed minors, $ {\bf M}_{B} $ is then

\begin{displaymath}{\bf M}_{B} \mbox{ = } \left( \begin{array}{rrr}
-8 & 1 & -4 \\ -2 & -14 & -1 \\ -3 & 36 & 27 \end{array} \right). \end{displaymath}

The inverse of $\bf {B}$ is then

\begin{displaymath}{\bf B}^{-1} \mbox{ = } \vert {\bf B} \vert^{-1} {\bf M'}_{B} \end{displaymath}


\begin{displaymath}\mbox{ = } \frac{1}{-57} \left( \begin{array}{rrr}
-8 & -2 & -3 \\ 1 & -14 & 36 \\ -4 & -1 & 27 \end{array} \right). \end{displaymath}
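
A direct check shows that this is indeed the inverse:

\begin{displaymath}{\bf B}{\bf B}^{-1} \mbox{ = } \frac{1}{-57} \left( \begin{array}{rrr}
6 & -1 & 2 \\ 3 & 4 & -5 \\ 1 & 0 & -2 \end{array} \right)
\left( \begin{array}{rrr}
-8 & -2 & -3 \\ 1 & -14 & 36 \\ -4 & -1 & 27 \end{array} \right)
\mbox{ = } \frac{1}{-57} \left( \begin{array}{rrr}
-57 & 0 & 0 \\ 0 & -57 & 0 \\ 0 & 0 & -57 \end{array} \right)
\mbox{ = } {\bf I}. \end{displaymath}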

If the determinant is zero, then the inverse is not defined or does not exist. A square matrix with a non-zero determinant is said to be nonsingular. Only nonsingular matrices have inverses. Matrices with zero determinants are called singular and do not have an inverse.

An important result is that the inverse of an inverse matrix is equal to the original matrix. That is,

\begin{displaymath}({\bf A^{-1}})^{-1} \mbox{ = } {\bf A}. \end{displaymath}

The inverse of the product of two or more nonsingular matrices follows a rule similar to that for the transpose of a product of matrices. Let \( {\bf A, B, \mbox{ and } C } \) be three nonsingular matrices, then

\begin{displaymath}( {\bf ABC})^{-1} \mbox{ = } {\bf C^{-1}B^{-1}A^{-1} }, \end{displaymath}

and

\begin{displaymath}{\bf C^{-1}B^{-1}A^{-1}ABC} \mbox{ = } {\bf I}. \end{displaymath}

Generalized Inverses

Suppose we have a set of equations, \( {\bf Ax = r } \), then a solution for $ {\bf x} $ is obtained by pre-multiplying both sides of the equation by $ {\bf A}^{-1} $ provided that $\bf {A}$ is a nonsingular matrix. Then \( {\bf x = A^{-1}r } \). Often, however, $\bf {A}$ is a singular matrix, i.e. the determinant is zero and the inverse does not exist. Solutions to the set of equations can still be calculated using a generalized inverse of $\bf {A}$. A generalized inverse, denoted as $ {\bf A^{-}} $, is a matrix that satisfies the following expression

\begin{displaymath}{\bf AA^{-}A} \mbox{ = } {\bf A}. \end{displaymath}

In general, neither \( {\bf AA^{-}} \mbox{ nor } {\bf A^{-}A} \) is equal to $ {\bf I} $ except in the case when $\bf {A}$ is nonsingular.
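
As a small illustration, with a singular matrix and a generalized inverse chosen here only for illustration,

\begin{displaymath}{\bf A} \mbox{ = } \left( \begin{array}{rr}
2 & 1 \\ 2 & 1 \end{array} \right), \hspace{3mm}
{\bf A^{-}} \mbox{ = } \left( \begin{array}{rr}
.5 & 0 \\ 0 & 0 \end{array} \right), \mbox{ then }
{\bf AA^{-}A} \mbox{ = } \left( \begin{array}{rr}
1 & 0 \\ 1 & 0 \end{array} \right)
\left( \begin{array}{rr}
2 & 1 \\ 2 & 1 \end{array} \right)
\mbox{ = } {\bf A}, \end{displaymath}

even though \( {\bf AA^{-}} \) is clearly not an identity matrix.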

There are an infinite number of generalized inverses for any one singular matrix. A unique generalized inverse is the Moore-Penrose inverse which satisfies the following conditions:

1.
\( {\bf AA^{-}A} \ = \ {\bf A}, \)
2.
\( {\bf A^{-}AA^{-}} \ = \ {\bf A^{-}}, \)
3.
\( ({\bf A^{-}A})' \ = \ {\bf A^{-}A}, \) and
4.
\( ({\bf AA^{-}})' \ = \ {\bf AA^{-}}. \)
Usually, a generalized inverse that satisfies only the first condition is sufficient for practical purposes.

Eigenvalues and Eigenvectors

There are a number of square, and sometimes symmetric, matrices involved in statistical procedures that must be positive definite. Suppose that ${\bf Q}$ is any square matrix; then ${\bf Q}$ is positive definite if \( {\bf x'Qx} \) is greater than zero for every non-null vector ${\bf x}$, positive semi-definite if \( {\bf x'Qx} \) is greater than or equal to zero for every ${\bf x}$ and equal to zero for at least one non-null ${\bf x}$, and indefinite if \( {\bf x'Qx} \) can be positive for some vectors ${\bf x}$ and negative for others.

To determine the condition of the matrix ${\bf Q}$, one must calculate the eigenvalues (or latent roots or characteristic roots) of the matrix. The eigenvalues are useful in that, for a symmetric ${\bf Q}$, the matrix is positive definite if all of the eigenvalues are positive, positive semi-definite if some of the eigenvalues are zero and the remainder are positive, and indefinite if some of the eigenvalues are positive and some are negative.

If ${\bf Q}$ is a square, symmetric matrix, then it can be represented as

\begin{displaymath}{\bf Q} \mbox{ = } {\bf UDU'} \end{displaymath}

where $\bf {D}$ is a diagonal matrix, the canonical form of ${\bf Q}$, containing the eigenvalues of ${\bf Q}$, and ${\bf U}$ is an orthogonal matrix of the corresponding eigenvectors. Recall that for an orthogonal matrix \( {\bf UU'} \mbox{ = } {\bf I} \mbox{ = }
{\bf U'U}, \) and therefore \( {\bf U^{-1}} \mbox{ = } {\bf U'} \).

The eigenvalues and eigenvectors are found by solving

\begin{displaymath}\mid {\bf Q}-d{\bf I} \mid \mbox{ = } 0, \end{displaymath}

and

\begin{displaymath}{\bf Qu}-d{\bf u} \mbox{ = } {\bf0}, \end{displaymath}

where d is one of the eigenvalues of ${\bf Q}$ and ${\bf u}$ is the corresponding eigenvector. There are numerous computer routines for calculating $\bf {D}$ and ${\bf U}$.
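
As a small worked example (chosen here only for illustration), let

\begin{displaymath}{\bf Q} \mbox{ = } \left( \begin{array}{rr}
2 & 1 \\ 1 & 2 \end{array} \right), \mbox{ then }
\mid {\bf Q}-d{\bf I} \mid \mbox{ = } (2-d)^{2} - 1 \mbox{ = } 0 \end{displaymath}

gives \( d_{1} = 3 \) and \( d_{2} = 1 \), with corresponding eigenvectors proportional to \( (1 \;\; 1)' \) and \( (1 \; -1)' \). Scaling the eigenvectors to unit length gives

\begin{displaymath}{\bf U} \mbox{ = } \frac{1}{\sqrt{2}} \left( \begin{array}{rr}
1 & 1 \\ 1 & -1 \end{array} \right), \hspace{3mm}
{\bf D} \mbox{ = } \left( \begin{array}{rr}
3 & 0 \\ 0 & 1 \end{array} \right), \mbox{ and }
{\bf UDU'} \mbox{ = } {\bf Q}. \end{displaymath}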

Differentiation

The differentiation of mathematical expressions involving matrices follows similar rules as for those involving scalars. Some of the basic results are shown below.

Let

\begin{displaymath}c = 3x_{1} + 5x_{2} + 9x_{3} = {\bf b'x} \end{displaymath}


\begin{displaymath}\mbox{ = } \left( \begin{array}{rrr}
3 & 5 & 9 \end{array} \right)
\left( \begin{array}{c}
x_{1} \\ x_{2} \\ x_{3} \end{array} \right). \end{displaymath}

With scalars we have

\begin{displaymath}\frac{\partial c}{\partial x_{1}} \mbox{ = } 3 \end{displaymath}


\begin{displaymath}\frac{\partial c}{\partial x_{2}} \mbox{ = } 5 \end{displaymath}


\begin{displaymath}\frac{\partial c}{\partial x_{3}} \mbox{ = } 9, \end{displaymath}

but with vectors we have

\begin{displaymath}\frac{\partial c}{\partial {\bf x}} \mbox{ = } {\bf b}. \end{displaymath}

The general rule is

\begin{displaymath}\frac{\partial {\bf A'x}}{\partial {\bf x}} \mbox{ = }
{\bf A}. \end{displaymath}

Another function might be

\begin{displaymath}c = 9x_{1}^{2} + 6x_{1}x_{2} + 4x_{2}^{2} \end{displaymath}

or

\begin{displaymath}c = \left( \begin{array}{cc}
x_{1} & x_{2} \end{array} \right)
\left( \begin{array}{cc}
9 & 3 \\ 3 & 4 \end{array} \right)
\left( \begin{array}{c}
x_{1} \\ x_{2} \end{array} \right) \mbox{ = } {\bf x'Ax}. \end{displaymath}

With scalars we would have

\begin{displaymath}\frac{\partial c}{\partial x_{1}} \mbox{ = } 2(9)x_{1} + 6x_{2} \end{displaymath}


\begin{displaymath}\frac{\partial c}{\partial x_{2}} \mbox{ = } 6x_{1} + 2(4)x_{2}, \end{displaymath}

then in matrix form,

\begin{displaymath}\frac{\partial c}{\partial {\bf x}} \mbox{ = } 2{\bf Ax}. \end{displaymath}

If $\bf {A}$ was not a symmetric matrix, then

\begin{displaymath}\frac{\partial {\bf x'Ax}}{\partial {\bf x}} \mbox{ = }
{\bf Ax} + {\bf A'x}. \end{displaymath}
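
As a check of this rule, with a non-symmetric matrix chosen here only for illustration, let

\begin{displaymath}{\bf A} \mbox{ = } \left( \begin{array}{rr}
1 & 2 \\ 0 & 1 \end{array} \right), \mbox{ so that }
{\bf x'Ax} \mbox{ = } x_{1}^{2} + 2x_{1}x_{2} + x_{2}^{2}. \end{displaymath}

Differentiating the scalar form gives \( 2x_{1} + 2x_{2} \) with respect to each of \( x_{1} \) and \( x_{2} \), and

\begin{displaymath}{\bf Ax} + {\bf A'x} \mbox{ = }
\left( \begin{array}{c} x_{1}+2x_{2} \\ x_{2} \end{array} \right)
+ \left( \begin{array}{c} x_{1} \\ 2x_{1}+x_{2} \end{array} \right)
\mbox{ = } \left( \begin{array}{c} 2x_{1}+2x_{2} \\ 2x_{1}+2x_{2} \end{array} \right), \end{displaymath}

which agrees.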

Cholesky Decomposition

In simulation studies or applications of Gibbs sampling there is frequently a need to factor a symmetric positive definite matrix into the product of a matrix times its transpose. The Cholesky decomposition of a matrix, say ${\bf V}$, is a lower triangular matrix such that

\begin{displaymath}{\bf V} = {\bf T}{\bf T}', \end{displaymath}

and ${\bf T}$ is lower triangular. Suppose that

\begin{displaymath}{\bf V} = \left( \begin{array}{rrr} 9 & 3 & -6 \\
3 & 5 & 0 \\ -6 & 0 & 21 \end{array} \right). \end{displaymath}

The problem is to derive

\begin{displaymath}{\bf T} = \left( \begin{array}{rrr} t_{11} & 0 & 0 \\
t_{21} & t_{22} & 0 \\ t_{31} & t_{32} & t_{33} \end{array} \right), \end{displaymath}

such that

\begin{eqnarray*}t_{11}^{2} & = & 9 \\
t_{11}t_{21} & = & 3 \\
t_{11}t_{31} & = & -6 \\
t_{21}^{2} + t_{22}^{2} & = & 5 \\
t_{21}t_{31} + t_{22}t_{32} & = & 0, \ \ {\rm and} \\
t_{31}^{2} + t_{32}^{2} + t_{33}^{2} & = & 21
\end{eqnarray*}


These equations give \( t_{11} = 3 \), then \( t_{21} = 3/t_{11} = 1 \), and \( t_{31} = -6/t_{11} = -2 \). From the fourth equation, \( t_{22} \) is the square root of \( (5 - t_{21}^{2}) \), or \( t_{22} = 2 \). The fifth equation says that \( (1)(-2) + (2)t_{32} = 0 \), or \( t_{32} = 1 \). The last equation says that \( t_{33}^{2} = 21 - (-2)^{2} - (1)^{2} = 16 \), so that \( t_{33} = 4 \). The end result is

\begin{displaymath}{\bf T} = \left( \begin{array}{rrr} 3 & 0 & 0 \\ 1 & 2 & 0 \\
-2 & 1 & 4 \end{array} \right). \end{displaymath}

The derivation of the Cholesky decomposition is easily programmed for a computer. Note that in calculating the diagonals of ${\bf T}$ the square root of a number is needed, and consequently this number must always be positive. If the matrix is positive definite, then all of the necessary square roots will be of positive numbers. Conversely, if the factorization can be completed with every diagonal element of ${\bf T}$ positive, then the matrix is positive definite, so the decomposition itself provides one check of positive definiteness; the eigenvalues of the matrix, described earlier, give a more complete description of its condition.
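
The following is a minimal sketch of that computation in Python; it is not part of the original notes, and the function name and the list-of-lists representation of the matrix are arbitrary choices made only for illustration.

import math

def cholesky(V):
    # Return the lower triangular T such that V = T T',
    # for a symmetric positive definite matrix V given as a list of lists.
    n = len(V)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(T[i][k] * T[j][k] for k in range(j))
            if i == j:
                # Diagonal element: the argument of the square root
                # must be positive, as noted above.
                T[i][j] = math.sqrt(V[i][i] - s)
            else:
                T[i][j] = (V[i][j] - s) / T[j][j]
    return T

# The example V from the text recovers T = [[3,0,0], [1,2,0], [-2,1,4]].
V = [[9, 3, -6], [3, 5, 0], [-6, 0, 21]]
T = cholesky(V)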

Inverse of a Lower Triangular Matrix

The inverse of a lower triangular matrix is also a lower triangular matrix, and can be easily derived. The diagonals of the inverse are the inverse of the diagonals of the original matrix. Using the matrix ${\bf T}$ from the previous section, then

\begin{displaymath}{\bf T}^{-1} = \left( \begin{array}{rrr} t^{11} & 0 & 0 \\
t^{21} & t^{22} & 0 \\ t^{31} & t^{32} & t^{33} \end{array} \right), \end{displaymath}

where

\begin{eqnarray*}t^{ii} & = & 1 / t_{ii} \\
t_{21}t^{11} + t_{22}t^{21} & = & 0 \\
t_{31}t^{11} + t_{32}t^{21} + t_{33}t^{31} & = & 0 \\
t_{32}t^{22} + t_{33}t^{32} & = & 0.
\end{eqnarray*}


These equations give

\begin{eqnarray*}t^{11} & = & \frac{1}{3} \\
t^{21} & = & - \frac{1}{6} \\
t^{31} & = & \frac{5}{24} \\
t^{22} & = & \frac{1}{2} \\
t^{32} & = & - \frac{1}{8} \ \ {\rm and} \\
t^{33} & = & \frac{1}{4}
\end{eqnarray*}


Likewise the determinant of a triangular matrix is the product of all of the diagonal elements. Hence all diagonal elements need to be non-zero for the inverse to exist.

The natural logarithm of the determinant of a triangular matrix is the summation of the natural logarithms of the individual diagonal elements. This property is useful in derivative free restricted maximum likelihood.
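
Using the matrix ${\bf T}$ from the Cholesky example,

\begin{displaymath}\vert {\bf T} \vert \mbox{ = } (3)(2)(4) \mbox{ = } 24,
\hspace{3mm} \log \vert {\bf T} \vert \mbox{ = } \log 3 + \log 2 + \log 4,
\mbox{ and } \vert {\bf V} \vert \mbox{ = } \vert {\bf T} \vert \vert {\bf T'} \vert \mbox{ = } 576. \end{displaymath}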

