
Matrix Algebra
L. R. Schaeffer, March 1999

Definitions

1.
Scalars. A scalar is a single quantity usually designated by an italicized lower case letter, for example a, r, and k.

2.
Matrix. A matrix is a group of elements arranged in rows and columns. Elements can be scalars, mathematical expressions, or other matrices. Upper case letters in bold face type denote matrices. Examples of matrices are

A = [ a11  a12  a13 ]      and      B = [ 2  1 ]
    [ a21  a22  a23 ]                   [ 4  3 ]

The order or dimension of a matrix is the number of rows and columns. A matrix with r rows and c columns has order r x c; the matrix A above has order 2 x 3.

3.
Vectors. A column vector is a matrix with one column, and a row vector is a matrix with one row. Usually if the word vector is used, it is implied that the vector is a column vector. Vectors are commonly denoted by lower case letters in bold face type, but this is not a universal practice.

4.
Transposition. Transposition is the formation of a new matrix by making rows of the original matrix into columns of the new matrix in corresponding order. The transpose of a column vector is a row vector, and the transpose of a matrix of order r x c results in a new matrix of order c x r. If A is the original matrix, then the transpose is denoted as A' in these notes, but in some places the transpose is denoted as A^T.

5.
Diagonals. The diagonal elements of a matrix are those elements whose row and column numbers are the same. For example, if

A = [ 8  2 ]
    [ 4  7 ]

then the diagonal elements are 8 and 7.

Off-diagonal elements of a matrix are all of the other elements excluding the diagonals.

Types of Matrices

1.
Rectangular. A matrix is rectangular if the number of rows does not equal the number of columns. All vectors are rectangular matrices.
2.
Square. Square matrices have the same number of rows as columns.
3.
Symmetric. Symmetric matrices are also square matrices. The element in the ith row and jth column is equal to the element in the jth row and ith column, for all values of i and j. The transpose of a symmetric matrix is that same matrix, i.e. A' = A. Take any matrix, say B, and form the product B'B; that product is symmetric and square.
4.
Diagonal. A diagonal matrix is a square, symmetric matrix in which all of the elements are zero except the diagonal elements. A diagonal matrix can be written as

D = diag( d1  d2  ...  dn )

where the di are the diagonal elements.
5.
Identity. An identity matrix is a special diagonal matrix in which all of the diagonal elements are equal to one. An identity matrix is usually represented by I. Because it is a diagonal matrix, an identity matrix is also square and symmetric.
6.
Null. A null matrix is a matrix of any dimension in which all elements are equal to zero, and is denoted by 0.
7.
Matrices With Only 1's. A column vector of length n with every element equal to 1 is usually denoted as 1n. A matrix with r rows and c columns and all elements equal to 1 is denoted as Jrc.
8.
Triangular. Triangular matrices are square matrices with either all off-diagonal elements above or below the diagonal equal to zero. An upper triangular matrix has all elements below the diagonal equal to zero, and a lower triangular matrix has all elements above the diagonal equal to zero.
9.
Tridiagonal. A tridiagonal matrix is a square matrix with all elements equal to zero except the diagonals and the elements immediately to the left and right of the diagonal. An example is shown below.

[ 5  2  0  0 ]
[ 1  5  2  0 ]
[ 0  1  5  2 ]
[ 0  0  1  5 ]

Addition of Matrices

Matrices are conformable for addition if they have the same order. The resulting sum is a matrix having the same number of rows and columns as the two matrices to be added. Matrices that are not of the same order cannot be added together. If A = {aij} and B = {bij}, then A + B = {aij + bij}.

For example, let

A = [ 1  2 ]    and    B = [ 5  6 ]
    [ 3  4 ]               [ 7  8 ]

then

A + B = [  6   8 ]
        [ 10  12 ]

Subtraction is the addition of two matrices, one of which has all elements multiplied by a minus one (-1). That is,

A - B = A + (-1)B = {aij - bij}.
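The rules above can be sketched in a few lines of Python on nested lists; this is illustrative only (the helper names madd and msub are not from the notes):

```python
# Minimal sketch of matrix addition and subtraction on nested lists.
def madd(A, B):
    # Conformable for addition: both matrices must have the same order.
    assert len(A) == len(B) and len(A[0]) == len(B[0])
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def msub(A, B):
    # Subtraction is addition after multiplying every element of B by -1.
    return madd(A, [[-b for b in row] for row in B])

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(madd(A, B))  # [[6, 8], [10, 12]]
print(msub(A, B))  # [[-4, -4], [-4, -4]]
```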

Multiplication of Matrices

Two matrices are conformable for multiplication if the number of columns in the first matrix equals the number of rows in the second matrix. If A has order p x q and B has order m x n, then the product AB exists only if q = m. The product matrix AB has order p x n. In general, AB does not equal BA, and most often the product BA may not even exist because B may not be conformable for multiplication with A. Thus, the ordering of matrices in a product must be carefully and precisely written.

The computation of a product is defined as follows: let

A = {aij} of order p x q,

and

B = {bjk} of order m x n,

and q = m, then

AB = { sum over j of aij bjk }, a matrix of order p x n.

As an example, let

A = [ 1  2 ]    and    B = [ 5  6 ]
    [ 3  4 ]               [ 7  8 ]

then

AB = [ 1(5)+2(7)  1(6)+2(8) ] = [ 19  22 ]
     [ 3(5)+4(7)  3(6)+4(8) ]   [ 43  50 ]
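The product formula can be sketched in Python on nested lists (an illustrative helper, not part of the notes):

```python
def matmul(A, B):
    # A is p x q and B is m x n; the product AB exists only if q == m,
    # and the result has order p x n.
    p, q = len(A), len(A[0])
    m, n = len(B), len(B[0])
    assert q == m, "not conformable for multiplication"
    return [[sum(A[i][j] * B[j][k] for j in range(q)) for k in range(n)]
            for i in range(p)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
print(matmul(B, A))  # [[23, 34], [31, 46]] -- AB and BA differ
```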

Transpose of a Product

The transpose of the product of two or more matrices is the product of the transposes of each matrix in reverse order. That is, the transpose of AB, for example, is (AB)' = B'A'. The reader should verify that the result is conformable for multiplication, and that the result has the correct dimensions.
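The reverse-order rule can be checked numerically; a small sketch (helper names are illustrative):

```python
def transpose(A):
    # Rows of A become columns of the transpose.
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
B = [[1, 0], [2, 1], [0, 3]]  # 3 x 2
# (AB)' equals B'A', computed in reverse order; both are 2 x 2 here.
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
```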

Special Products

A matrix, say P, is idempotent if the product of the matrix with itself equals itself, i.e. PP = P. This implies that the matrix must be square, but not necessarily symmetric.

A matrix is nilpotent if the product of the matrix with itself equals a null matrix, i.e. if NN = 0, then N is nilpotent.

A matrix is orthogonal if the product of the matrix with its transpose equals an identity matrix, i.e. UU' = I, which also implies that U'U = I.
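Small numerical instances of the three special products, as a sketch (the particular matrices are illustrative choices):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

P = [[1, 0], [1, 0]]          # idempotent (PP = P) but not symmetric
assert matmul(P, P) == P

N = [[0, 1], [0, 0]]          # nilpotent: NN is the null matrix
assert matmul(N, N) == [[0, 0], [0, 0]]

U = [[0, 1], [1, 0]]          # orthogonal: UU' = U'U = I
I = [[1, 0], [0, 1]]
assert matmul(U, transpose(U)) == I
assert matmul(transpose(U), U) == I
```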

Some Rules on Multiplication

1.
Multiplication of a matrix by a scalar results in multiplying every element of the matrix by that scalar.
2.
The distributive law holds: A(B + C) = AB + AC.
3.
Similarly, (A + B)C = AC + BC.
4.
The associative law holds: A(BC) = (AB)C.

Traces of Square Matrices

The trace is an operation that is applied to a square matrix and is denoted by the letters tr. The trace operation is the sum of the diagonal elements of a matrix. The sum is a scalar quantity. Let

A = [ 3  1 ]
    [ 4  6 ]

then the trace is

tr(A) = 3 + 6 = 9.

The trace of the product of conformable matrices has a special property known as the rotation rule of traces. That is,

tr(ABC) = tr(BCA) = tr(CAB).
The traces are equal, because they are scalars, even though the dimensions of the three products might be greatly different.
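The rotation rule can be verified numerically; a sketch with illustrative 2 x 2 matrices:

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def trace(A):
    # Sum of the diagonal elements of a square matrix.
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 0], [1, 3]]
t = trace(matmul(matmul(A, B), C))
# Rotation rule: tr(ABC) = tr(BCA) = tr(CAB).
assert t == trace(matmul(matmul(B, C), A))
assert t == trace(matmul(matmul(C, A), B))
```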

Euclidean Norm

The Euclidean Norm is a matrix operation usually used to determine the degree of difference between two matrices of the same order. The norm is a scalar, is denoted as ||A||, and is the square root of the sum of squares of all of the elements,

||A|| = ( sum over i and j of aij^2 )^(1/2).

For example, let

A = [ 3  0 ]
    [ 0  4 ]

then

||A|| = ( 9 + 0 + 0 + 16 )^(1/2) = 5.

Other types of norms also exist.
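A sketch of the Euclidean norm, including its use on the difference of two matrices (the matrices here are illustrative):

```python
import math

def euclidean_norm(A):
    # Square root of the sum of squares of all elements.
    return math.sqrt(sum(a * a for row in A for a in row))

A = [[3, 0], [0, 4]]
print(euclidean_norm(A))  # 5.0

# Degree of difference between two matrices of the same order:
B = [[3, 1], [1, 4]]
D = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
print(euclidean_norm(D))  # sqrt(2), about 1.414
```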

Other Useful Matrix Operations

1.
Direct Sum. For matrices of any dimension, say A, B, ..., the direct sum is

A (+) B = [ A  0 ]
          [ 0  B ]

a block diagonal arrangement with null matrices in the off-diagonal blocks; further matrices in the sum extend the pattern down the diagonal.

2.
Kronecker Product. The Kronecker product, also known as the direct product, is where every element of the first matrix is multiplied, as a scalar, times the second matrix. Suppose that A = {aij} is a matrix of order p x q and that B is of order m x n, then the direct product of A times B is

A (x) B = { aij B },

the matrix formed by replacing each element aij with the block aij B. Notice that the dimension of the product is pm x qn.

3.
Hadamard Product. The Hadamard product exists for two matrices that are conformable for addition. The corresponding elements of the two matrices are multiplied together. The order of the resulting product is the same as the matrices in the product. For two matrices, A and B, each of order p x q, the Hadamard product is

A (.) B = { aij bij }.
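The three operations above can be sketched on nested lists (helper names and example values are illustrative):

```python
def direct_sum(A, B):
    # Block diagonal arrangement with null off-diagonal blocks.
    q, n = len(A[0]), len(B[0])
    return [row + [0] * n for row in A] + [[0] * q + row for row in B]

def kronecker(A, B):
    # Every element of A multiplies, as a scalar, the whole of B;
    # for A of order p x q and B of order m x n the result is pm x qn.
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def hadamard(A, B):
    # Element-by-element product; A and B must be conformable for addition.
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 2]]     # 1 x 2
B = [[3], [4]]   # 2 x 1
print(direct_sum(A, B))  # [[1, 2, 0], [0, 0, 3], [0, 0, 4]]
print(kronecker(A, B))   # [[3, 6], [4, 8]]  -- order (1*2) x (2*1)
print(hadamard([[1, 2]], [[5, 6]]))  # [[5, 12]]
```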

Elementary Operators

Elementary operator matrices are identity matrices that have been modified by one of three methods. The product of an elementary operator matrix with any other matrix does not alter the rank of that matrix. Elementary operator matrices reduce a matrix to either a triangular or a diagonal form. The reduction process may involve many elementary operator matrices multiplied together. To illustrate the three types of elementary operator matrices and the process of reducing a matrix, begin with an example matrix A.

The first type of elementary operator matrix has one of the diagonal elements of an identity matrix replaced with a constant other than 1. Let E1 be an elementary operator where the first diagonal element has been changed to .25. Pre-multiplication of A by this matrix, E1 A, multiplies every element in the first row of A by .25.

The second type of elementary operator matrix interchanges rows or columns of a matrix. Let E2 be an elementary operator formed by interchanging rows i and j of an identity matrix. Pre-multiplication of A by this matrix interchanges rows i and j of A.

The third type of elementary operator matrix combines the elements of two rows or columns of a matrix by addition or subtraction. Let E3 be an elementary operator formed from an identity matrix by placing a constant c in the (j, i) position. Pre-multiplication of A by this matrix adds c times row i of A to row j of A.
By continued pre-multiplication of by other elementary operators, may be reduced to an upper triangular matrix. Further post-multiplication by another set of elementary operators can reduce to a diagonal matrix.
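A sketch of the three operator types acting on a hypothetical 2 x 2 matrix (the values are illustrative, not from the notes):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[4, 2], [1, 3]]          # hypothetical example matrix

# Type 1: replace a diagonal of I with a constant (scale row 1 by .25).
E1 = [[0.25, 0], [0, 1]]
print(matmul(E1, A))          # [[1.0, 0.5], [1, 3]]

# Type 2: interchange two rows of I (swap rows 1 and 2).
E2 = [[0, 1], [1, 0]]
print(matmul(E2, A))          # [[1, 3], [4, 2]]

# Type 3: add c times row i to row j (here add -1 times row 1 to row 2
# of E1*A, reducing the matrix to upper triangular form).
E3 = [[1, 0], [-1, 1]]
print(matmul(E3, matmul(E1, A)))  # [[1.0, 0.5], [0.0, 2.5]]
```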

Matrix Inversion

By definition, an inverse of a square matrix A, denoted by A^{-1}, is a matrix which when pre- or post- multiplied times the original matrix yields an identity matrix. That is,

A A^{-1} = A^{-1} A = I.

Any square matrix with a nonzero determinant can be inverted. Thus, the first step to matrix inversion is the calculation of the determinant of the matrix. The determinant is a scalar quantity. For a 2 x 2 matrix, say

A = [ a  b ]
    [ c  d ]

then the determinant is

|A| = ad - cb.

For a 3 x 3 matrix, the determinant can be reduced to a series of determinants of 2 x 2 matrices. For example, let

A = [ a11  a12  a13 ]
    [ a21  a22  a23 ]
    [ a31  a32  a33 ]

then

|A| = a11 | a22  a23 | - a12 | a21  a23 | + a13 | a21  a22 |
          | a32  a33 |       | a31  a33 |       | a31  a32 |

The general expression for the determinant of a matrix is

|A| = sum over j of (-1)^(i+j) aij |Mij|,

where A is of order n, and Mij is a minor submatrix of A resulting from the deletion of the ith row and jth column of A. Any row of A may be used to compute the determinant because the result should be the same for each row. Columns may also be used instead of rows. The reader may verify this assertion.

If the determinant is non-zero, then proceed to the next step, which is to find the matrix of signed minors, known as the adjoint matrix. Compute the determinant of each minor, Mij, and attach the sign (-1)^(i+j). The adjoint matrix of signed minors is then

adj(A) = { (-1)^(i+j) |Mij| }' ,

the transpose of the matrix of signed minor determinants. The inverse of A is then

A^{-1} = adj(A) / |A| .
If the determinant is zero, then the inverse is not defined or does not exist. A square matrix with a non-zero determinant is said to be nonsingular. Only nonsingular matrices have inverses. Matrices with zero determinants are called singular and do not have an inverse.
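The determinant-and-adjoint procedure can be sketched for a small matrix (the example matrix is an illustrative choice):

```python
def minor(A, i, j):
    # Delete row i and column j of A.
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    # Expansion along the first row: |A| = sum_j (-1)^j a0j |M0j|.
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def inverse(A):
    # A^{-1} = adj(A) / |A|, where adj(A) is the transposed matrix of
    # signed minor determinants; |A| must be non-zero (nonsingular A).
    d = det(A)
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) / d for j in range(n)]
            for i in range(n)]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
print(det(A))             # 8
print(inverse(A)[0])      # [0.625, -0.25, 0.125]
```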

An important result is that the inverse of an inverse matrix is equal to the original matrix. That is,

(A^{-1})^{-1} = A.

The inverse of the product of two or more nonsingular matrices follows a rule similar to that for the transpose of a product of matrices. Let A, B, and C be three nonsingular matrices, then

(AB)^{-1} = B^{-1} A^{-1}

and

(ABC)^{-1} = C^{-1} B^{-1} A^{-1}.

Generalized Inverses

Suppose we have a set of equations, Ax = b; then a solution for x is obtained by pre-multiplying both sides of the equation by A^{-1}, provided that A is a nonsingular matrix. Then x = A^{-1} b. Often, however, A is a singular matrix, i.e. the determinant is zero and the inverse does not exist. Solutions to the set of equations can still be calculated using a generalized inverse of A. A generalized inverse, denoted as A^-, is a matrix that satisfies the following expression:

A A^- A = A.

In general, neither A A^- nor A^- A are equal to I, except in the case when A is nonsingular.

There are an infinite number of generalized inverses for any one singular matrix. A unique generalized inverse is the Moore-Penrose inverse, G, which satisfies the following conditions:

1.
A G A = A,
2.
G A G = G,
3.
(A G)' = A G,
and
4.
(G A)' = G A.

Usually, a generalized inverse that satisfies only the first condition is sufficient for practical purposes.

• Linear Independence. A square matrix with a nonzero determinant is a matrix in which the rows and columns of the matrix are linearly independent. If A is the matrix, then linear independence means that no vector, say c, exists such that Ac = 0 except for c = 0.

A square matrix with a zero determinant implies that at least one non-null vector, c, exists such that Ac = 0. To compute a generalized inverse of a matrix that has a zero determinant requires that we know the number of linearly independent rows and columns in that matrix. The number of linearly independent rows or columns is called the rank of a matrix, and rank is a term that applies to rectangular as well as square matrices.

• Rank. Elementary operators are used to determine the rank of a matrix. Reduction of the matrix to a diagonal matrix is called reduction to canonical form, and the reduced matrix is called the canonical form under equivalence. If the matrix is square and symmetric, then the reduction of the matrix to a diagonal form with either ones or minus ones on the diagonals is called congruent reduction and the resulting matrix is the canonical form under congruence. Usually, only a reduction to an upper triangular form is necessary to determine the rank of the matrix.

Let

A = [ 1  2  3  4   5 ]
    [ 2  4  7  9  11 ]
    [ 1  2  4  5   6 ]

The rank of a matrix cannot be greater than the minimum of the number of rows or columns, whichever is smaller. In the example above, A has 3 rows and 5 columns, and therefore the rank of A cannot be greater than 3. To find the exact rank, elementary operator matrices are used. Let E1 be the first elementary operator matrix that will zero out the first elements in the second and third rows. Pre-multiplication of any matrix by elementary operator matrices does not change the rank of that matrix. Let

E1 = [  1  0  0 ]
     [ -2  1  0 ]
     [ -1  0  1 ]

then

E1 A = [ 1  2  3  4  5 ]
       [ 0  0  1  1  1 ]
       [ 0  0  1  1  1 ]

Now use E2 to subtract the third row from the second row, so that

E2 = [ 1  0   0 ]
     [ 0  1  -1 ]
     [ 0  0   1 ]

and

E2 E1 A = [ 1  2  3  4  5 ]
          [ 0  0  0  0  0 ]
          [ 0  0  1  1  1 ]

The rank is the number of non-zero diagonal elements of the reduced matrix, i.e. r(A) = 2.

Elementary operators of order 5 x 5 would be post-multiplied times A in order to reduce the matrix to a diagonal form, and the rank would still be the number of non-zero diagonal elements.
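The triangular reduction can be sketched as Gaussian elimination; a minimal illustrative implementation (not from the notes):

```python
def rank(A, tol=1e-12):
    # Reduce a copy of A toward triangular form with elementary row
    # operations; the rank is the number of non-zero pivots found.
    M = [list(map(float, row)) for row in A]
    r = 0
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if abs(M[i][col]) > tol), None)
        if piv is None:
            continue                    # no pivot in this column
        M[r], M[piv] = M[piv], M[r]     # type 2: interchange rows
        for i in range(r + 1, len(M)):  # type 3: zero out below the pivot
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[1, 2, 3, 4, 5],
     [2, 4, 7, 9, 11],
     [1, 2, 4, 5, 6]]   # row 2 minus row 3 equals row 1, so rank 2
print(rank(A))  # 2
```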

• Definitions.
Full-row rank
If A has order m x n with rank equal to m, then A has full row rank.
Full-column rank
A matrix with rank equal to the number of columns has full-column rank.
Full rank
A square matrix with rank equal to the number of rows or columns has full rank. A full rank matrix is nonsingular, has a non-zero determinant, and has an inverse.

1.
A null matrix has a rank of zero.
2.
A J-matrix has a rank of one.
3.
The rank of an idempotent matrix is equal to its trace.
4.
If has rank r and has rank q, and the two matrices are conformable for multiplication, then the product, , has a maximum possible rank equal to the lesser of r or q.

• Consequences. If a matrix has rank less than its number of rows or columns, then the matrix can be partitioned to isolate a square, full rank submatrix whose order equals the rank, plus other matrices containing the remaining rows and columns. The remaining rows and columns are linearly dependent upon the rows and columns of that square submatrix; that is, they can be formed from the rows and columns that are linearly independent.

Let A be a matrix of order p x q with rank r, where r is less than either p or q; then there are p - r rows of A and q - r columns of A which are not linearly independent. Partition A as follows:

A = [ A11  A12 ]
    [ A21  A22 ]

such that A11 has order r x r and rank of r. Re-arrangement of rows and columns of A may be needed to find an appropriate A11. A12 has order r x (q - r), A21 has order (p - r) x r, and A22 has order (p - r) x (q - r).

There exist matrices, K1 and K2, such that

A12 = A11 K1,

and

A21 = K2 A11,

and

A22 = K2 A11 K1.

To illustrate, let

A = [ 2  1  3 ]
    [ 1  1  2 ]
    [ 3  2  5 ]

and the rank of this matrix is 2. A full rank submatrix within A is the upper left 2 x 2 matrix. Let

A11 = [ 2  1 ]
      [ 1  1 ]

then

A11^{-1} = [  1  -1 ]
           [ -1   2 ]

where

K1 = A11^{-1} A12 = [  1  -1 ] [ 3 ] = [ 1 ]
                    [ -1   2 ] [ 2 ]   [ 1 ]

where

K2 = A21 A11^{-1} = [ 3  2 ] [  1  -1 ] = [ 1  1 ]
                             [ -1   2 ]

and

A22 = K2 A11 K1 = [ 5 ].
This kind of partitioning of matrices with less than full rank is always possible. In practice, we need only know that we can perform this kind of partitioning; it is usually not necessary to derive K1 and K2 explicitly.

• Calculations. For a matrix, A, of less than full rank, there is an infinite number of possible generalized inverses, all of which would satisfy A A^- A = A. However, only one generalized inverse needs to be computed in practice. A method to derive one particular type of generalized inverse has the following steps:
1.
Determine the rank of A.
2.
Obtain A11, a square, full rank submatrix of A with rank equal to the rank of A.
3.
Partition A as

A = [ A11  A12 ]
    [ A21  A22 ]

4.
Compute the generalized inverse as

G = [ A11^{-1}  0 ]
    [  0        0 ]

If A has order p x q, then G must have order q x p. To prove that G is a generalized inverse of A, then multiply out the expression

A G A = [ A11  A12 ] [ A11^{-1}  0 ] [ A11  A12 ]
        [ A21  A22 ] [  0        0 ] [ A21  A22 ]

      = [ A11                 A12 ]
        [ A21  A21 A11^{-1} A12   ]

From the previous section, A22 = K2 A11 K1 = A21 A11^{-1} A12, so that A G A = A.
• Solutions to Equations. Because there are an infinite number of generalized inverses to a matrix that has less than full rank, it logically follows that, for a system of consistent equations, Ax = b, where the solutions are computed as x = Gb, there would also be an infinite number of solution vectors for x. Having computed only one generalized inverse, however, it is possible to compute many different solution vectors. From Searle (1971, page 9, Theorem 2), if A has q columns and if G is one generalized inverse of A, then the consistent equations Ax = b have solution

x = Gb + (GA - I)z,

where z is any arbitrary vector of length q. The number of linearly independent solution vectors, however, is (q - r + 1).

It is also possible to generate other generalized inverses of the same matrix. If G is a generalized inverse of A, then so is

G* = G + U(I - AG) + (I - GA)V

for any U and V. Pre- and post- multiplication of G* by A shows that this is so.

• Special Generalized Inverses. In linear models analyses, the product X'RX occurs frequently, where X is a matrix that usually has more rows than it has columns and has rank less than the number of columns, and R is a square matrix that is usually diagonal. Generalized inverses of this product matrix have special features that are used in the derivation and proof of various statistical procedures. Let X be a matrix of order N x p with rank r. The product X'RX has order p x p and is symmetric with rank r. Let G represent any generalized inverse of the product matrix; then the following results are true.
1.
G is not necessarily a symmetric matrix, in which case G' is also a generalized inverse of the product matrix.
2.
X G X'RX = X,
or
X'RX G X'RX = X'RX.
3.
G X'R is a generalized inverse of X.
4.
X G X' is always symmetric and unique for all generalized inverses of the product matrix, X'RX.
5.
If t' = s'X for some s, then t' G X'RX = t'.
The proofs of the above results can be found in Searle (1971). These results are very important.

Eigenvalues and Eigenvectors

There are a number of square, and sometimes symmetric, matrices involved in statistical procedures that must be positive definite. Suppose that A is any square matrix; then

• A is positive definite if x'Ax is always greater than zero for all non-null vectors x.
• A is positive semi-definite if x'Ax is greater than or equal to zero for all vectors x, and for at least one non-null vector x, x'Ax = 0.
• A is non-negative definite if A is either positive definite or positive semi-definite.
To determine the appropriate condition of the matrix A, one must calculate the eigenvalues (or latent roots or characteristic roots) of the matrix. The eigenvalues are useful in that
• The product of the eigenvalues equals the determinant of the matrix.
• The number of non-zero eigenvalues equals the rank of the matrix.
• If all the eigenvalues are greater than zero, then the matrix is positive definite.
• If all the eigenvalues are greater than or equal to zero and one or more are equal to zero, then the matrix is positive semi-definite.

If A is a square, symmetric matrix, then it can be represented as

A = U D U',

where D is a diagonal matrix, the canonical form of A, containing the eigenvalues of A, and U is an orthogonal matrix of the corresponding eigenvectors. Recall that for a matrix to be orthogonal, UU' = I and U'U = I.

The eigenvalues and eigenvectors are found by solving

|A - dI| = 0

and

A u = d u,

where d is one of the eigenvalues of A and u is the corresponding eigenvector. There are numerous computer routines for calculating D and U.
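The decomposition A = U D U' can be verified numerically for a small symmetric matrix whose eigenvalues are known (an illustrative example):

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[2, 1], [1, 2]]     # symmetric, with eigenvalues 3 and 1
s = 1 / math.sqrt(2)
U = [[s, s], [s, -s]]    # orthogonal matrix of eigenvectors (columns)
D = [[3, 0], [0, 1]]     # canonical form: diagonal of eigenvalues

# Verify A = U D U' (to floating point accuracy).
R = matmul(matmul(U, D), transpose(U))
assert all(abs(R[i][j] - A[i][j]) < 1e-12 for i in range(2) for j in range(2))

# Product of eigenvalues equals the determinant: 3 * 1 = 2*2 - 1*1.
assert 3 * 1 == 2 * 2 - 1 * 1
```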

Differentiation

The differentiation of mathematical expressions involving matrices follows similar rules as for those involving scalars. Some of the basic results are shown below.

Let

c = a'x = x'a,

a scalar formed from the vector of constants a and the vector of variables x. With scalars we have

d(ax)/dx = a,

but with vectors we have

∂(a'x)/∂x = a.

The general rule is

∂(a'x)/∂x = ∂(x'a)/∂x = a.

Another function might be

c = 9x1^2 + 6x1x2 + 4x2^2

or

c = x'Ax,  where  A = [ 9  3 ]  and  x = [ x1 ]
                      [ 3  4 ]           [ x2 ]

With scalars we would have

∂c/∂x1 = 18x1 + 6x2  and  ∂c/∂x2 = 6x1 + 8x2,

then in matrix form,

∂c/∂x = 2Ax.

If A was not a symmetric matrix, then

∂c/∂x = (A + A')x.
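The gradient rule for the quadratic form can be checked numerically with central differences; a sketch using the example above:

```python
# Numerical check that the gradient of c = x'Ax is (A + A')x,
# which equals 2Ax when A is symmetric.
def quad(A, x):
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

A = [[9, 3], [3, 4]]               # c = 9x1^2 + 6x1x2 + 4x2^2
x = [1.0, 2.0]
grad = [2 * sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
print(grad)  # [30.0, 22.0] = [18x1 + 6x2, 6x1 + 8x2]

h = 1e-6
for i in range(2):
    xp, xm = x[:], x[:]
    xp[i] += h
    xm[i] -= h
    numeric = (quad(A, xp) - quad(A, xm)) / (2 * h)
    assert abs(numeric - grad[i]) < 1e-4   # central difference agrees
```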

Cholesky Decomposition

In simulation studies or applications of Gibbs sampling there is frequently a need to factor a symmetric positive definite matrix into the product of a matrix times its transpose. The Cholesky decomposition of a matrix, say A, is

A = TT',

where T is lower triangular. Suppose that

A = [  9  3  -6 ]
    [  3  5   0 ]
    [ -6  0  21 ]

The problem is to derive

T = [ t11   0    0  ]
    [ t21  t22   0  ]
    [ t31  t32  t33 ]

such that

t11^2 = 9
t21 t11 = 3
t31 t11 = -6
t21^2 + t22^2 = 5
t31 t21 + t32 t22 = 0
t31^2 + t32^2 + t33^2 = 21

These equations give t11 = 3, then t21 = 3/t11 = 1, and t31 = -6/t11 = -2. From the fourth equation, t22 is the square root of (5 - t21^2), or t22 = 2. The fifth equation says that (1)(-2) + (2)t32 = 0, or t32 = 1. The last equation says that t33^2 is 21 - (-2)^2 - (1)^2 = 16. The end result is

T = [  3  0  0 ]
    [  1  2  0 ]
    [ -2  1  4 ]
The derivation of the Cholesky decomposition is easily programmed for a computer. Note that in calculating the diagonals of T, the square root of a number is needed, and consequently this number must always be positive. Hence, if the matrix is positive definite, then all necessary square roots will be of positive numbers. The converse also holds in exact arithmetic: if the factorization can be completed with every diagonal element of T positive, then A = TT' with T nonsingular, and the matrix is positive definite. With rounded computations, however, a near-singular matrix may still appear to factor, so the definitive check is to calculate the eigenvalues of the matrix and to see that they are all positive.
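A sketch of the row-by-row algorithm described above, applied to the example matrix:

```python
import math

def cholesky(A):
    # Row by row: t_jj = sqrt(a_jj - sum of squares already in row j),
    # and t_ij = (a_ij - sum of cross-products) / t_jj for i > j.
    n = len(A)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(T[i][k] * T[j][k] for k in range(j))
            if i == j:
                T[i][j] = math.sqrt(A[i][i] - s)  # must be positive
            else:
                T[i][j] = (A[i][j] - s) / T[j][j]
    return T

A = [[9, 3, -6], [3, 5, 0], [-6, 0, 21]]
print(cholesky(A))
# [[3.0, 0.0, 0.0], [1.0, 2.0, 0.0], [-2.0, 1.0, 4.0]]
```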

Inverse of a Lower Triangular Matrix

The inverse of a lower triangular matrix is also a lower triangular matrix, and can be easily derived. The diagonals of the inverse are the inverses of the diagonals of the original matrix. Using the matrix T from the previous section, write

T^{-1} = [ b11   0    0  ]
         [ b21  b22   0  ]
         [ b31  b32  b33 ]

where

T T^{-1} = I.

These equations give

T^{-1} = [  1/3    0     0  ]
         [ -1/6   1/2    0  ]
         [  5/24  -1/8  1/4 ]
Likewise the determinant of a triangular matrix is the product of all of the diagonal elements. Hence all diagonal elements need to be non-zero for the inverse to exist.

The natural logarithm of the determinant of a triangular matrix is the summation of the natural logarithms of the individual diagonal elements. This property is useful in derivative free restricted maximum likelihood.
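A forward-substitution sketch of the triangular inverse, together with the log-determinant property, using the T from the Cholesky example (helper names are illustrative):

```python
import math

def lower_tri_inverse(T):
    # Forward substitution, one column of the inverse at a time; the
    # diagonals of the inverse are the inverses of the diagonals of T.
    n = len(T)
    B = [[0.0] * n for _ in range(n)]
    for j in range(n):
        B[j][j] = 1.0 / T[j][j]
        for i in range(j + 1, n):
            B[i][j] = -sum(T[i][k] * B[k][j] for k in range(j, i)) / T[i][i]
    return B

T = [[3, 0, 0], [1, 2, 0], [-2, 1, 4]]
B = lower_tri_inverse(T)
print(B[0][0], B[1][1], B[2][2])  # 0.3333333333333333 0.5 0.25

# Determinant of a triangular matrix is the product of the diagonals,
# so log|T| is the sum of the logs of the diagonals.
det = 3 * 2 * 4
assert abs(math.log(det) - sum(math.log(T[i][i]) for i in range(3))) < 1e-12
```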


Larry Schaeffer
1999-02-26