
Animal Model
L. R. Schaeffer, March 1999

Assumptions of the Model

A simple animal model will be utilized in this lesson. The equation of the model is

\begin{displaymath}y_{i} \ = \ \mu \ + \ a_{i} \ + \ e_{i}, \end{displaymath}

where yi is the phenotypic record or observation, $\mu$ is the overall mean of the observations, ai are the additive genetic values of animals, and ei are the residual or environmental effects associated with the ith observation. Some of the assumptions of this model are
1.
The trait of interest is influenced by an infinite number of loci, each with a small effect.
2.
The population is large (infinite) in size.
3.
The individuals within the population are mating randomly.
4.
No selection has been exercised on individual records or on mating assignments.
5.
Only additive genetic effects exist.

Genotypic Values of Animals

One way to understand a model is to generate some fictitious data that correspond to the model exactly. Below are pedigrees of 16 animals. The first four were randomly chosen from a large population and subsequently mated at random.

Animal Sire Dam Animal Sire Dam
1 0 0 9 5 6
2 0 0 10 7 8
3 0 0 11 5 8
4 0 0 12 7 6
5 1 2 13 1 6
6 3 4 14 3 8
7 3 2 15 5 4
8 1 4 16 7 8

The Relationship Matrix

The matrix of additive genetic relationships among the sixteen individuals, ${\bf A}$ (times 16), is given below:

\begin{displaymath}\left( \begin{array}{rrrrrrrrrrrrrrrr}
16 & 0 & 0 & 0 & 8 & 0 & 0 & 8 & 4 & 4 & 8 & 0 & 8 & 4 & 4 & 4 \\
0 & 16 & 0 & 0 & 8 & 0 & 8 & 0 & 4 & 4 & 4 & 4 & 0 & 0 & 4 & 4 \\
0 & 0 & 16 & 0 & 0 & 8 & 8 & 0 & 4 & 4 & 0 & 8 & 4 & 8 & 0 & 4 \\
0 & 0 & 0 & 16 & 0 & 8 & 0 & 8 & 4 & 4 & 4 & 4 & 4 & 4 & 8 & 4 \\
8 & 8 & 0 & 0 & 16 & 0 & 4 & 4 & 8 & 4 & 10 & 2 & 4 & 2 & 8 & 4 \\
0 & 0 & 8 & 8 & 0 & 16 & 4 & 4 & 8 & 4 & 2 & 10 & 8 & 6 & 4 & 4 \\
0 & 8 & 8 & 0 & 4 & 4 & 16 & 0 & 4 & 8 & 2 & 10 & 2 & 4 & 2 & 8 \\
8 & 0 & 0 & 8 & 4 & 4 & 0 & 16 & 4 & 8 & 10 & 2 & 6 & 8 & 6 & 8 \\
4 & 4 & 4 & 4 & 8 & 8 & 4 & 4 & 16 & 4 & 6 & 6 & 6 & 4 & 6 & 4 \\
4 & 4 & 4 & 4 & 4 & 4 & 8 & 8 & 4 & 16 & 6 & 6 & 4 & 6 & 4 & 8 \\
8 & 4 & 0 & 4 & 10 & 2 & 2 & 10 & 6 & 6 & 18 & 2 & 5 & 5 & 7 & 6 \\
0 & 4 & 8 & 4 & 2 & 10 & 10 & 2 & 6 & 6 & 2 & 18 & 5 & 5 & 3 & 6 \\
8 & 0 & 4 & 4 & 4 & 8 & 2 & 6 & 6 & 4 & 5 & 5 & 16 & 5 & 4 & 4 \\
4 & 0 & 8 & 4 & 2 & 6 & 4 & 8 & 4 & 6 & 5 & 5 & 5 & 16 & 3 & 6 \\
4 & 4 & 0 & 8 & 8 & 4 & 2 & 6 & 6 & 4 & 7 & 3 & 4 & 3 & 16 & 4 \\
4 & 4 & 4 & 4 & 4 & 4 & 8 & 8 & 4 & 8 & 6 & 6 & 4 & 6 & 4 & 16
\end{array} \right). \end{displaymath}
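For readers who want to verify this matrix, the following sketch builds ${\bf A}$ from the pedigree by the tabular method, using Python with NumPy (the list layout, the function name, and the printing of $16{\bf A}$ are illustrative choices, not part of the original notes):

\begin{verbatim}
import numpy as np

# Pedigree from the table above: (animal, sire, dam), 0 = unknown parent.
pedigree = [
    (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0),
    (5, 1, 2), (6, 3, 4), (7, 3, 2), (8, 1, 4),
    (9, 5, 6), (10, 7, 8), (11, 5, 8), (12, 7, 6),
    (13, 1, 6), (14, 3, 8), (15, 5, 4), (16, 7, 8),
]

def tabular_A(ped):
    """Additive relationship matrix A by the tabular method."""
    n = len(ped)
    A = np.zeros((n + 1, n + 1))        # row/column 0 is a dummy for unknown parents
    for i, s, d in ped:                 # parents must appear before their progeny
        A[i, i] = 1.0 + 0.5 * A[s, d]   # 1 + F_i
        for j in range(1, i):
            A[i, j] = A[j, i] = 0.5 * (A[j, s] + A[j, d])
    return A[1:, 1:]

A = tabular_A(pedigree)
print((16 * A).round().astype(int))     # compare with the matrix shown above
\end{verbatim}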

One method to generate additive genetic values of the 16 animals would be to partition ${\bf A}$ into the product of a lower triangular matrix times its transpose, obtain a vector of 16 pseudo-random normal deviates, and pre-multiply this vector by the lower triangular matrix. The Cholesky Decomposition is the procedure to compute the lower triangular matrix. In SAS IML, write ${\bf L} =$ HALF(${\bf A}$). The theory is that the vector of pseudo-random normal deviates, say ${\bf v}$, has

\begin{eqnarray*}E({\bf v}) & = & {\bf0}, \mbox{ and } \\
V({\bf v}) & = & {\bf I}.
\end{eqnarray*}


Then the vector of additive genetic values is

\begin{eqnarray*}{\bf a} & = & {\bf Lv}, \mbox{ where } \\
E({\bf a}) & = & {\bf L}E({\bf v}) \ = \ {\bf0}, \\
V({\bf a}) & = & {\bf LIL'} = {\bf LL'} \\
& = & {\bf A}
\end{eqnarray*}


Thus, ${\bf a}$ has the desired expectation and variance-covariance matrix. The Cholesky decomposition is a difficult method to implement for a large number of animals, because ${\bf A}$ and ${\bf L}$ would be very large.
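As a rough sketch of this approach (not the SAS IML code referred to above), NumPy's Cholesky factor can play the role of HALF, reusing the ${\bf A}$ from the earlier sketch and taking $\sigma^{2}_{a}=36$ from the example that follows; the seed is arbitrary:

\begin{verbatim}
rng = np.random.default_rng(1999)       # arbitrary seed; deviates are illustrative
sigma2_a = 36.0                         # additive genetic variance used in the example

L = np.linalg.cholesky(A * sigma2_a)    # lower triangular, L L' = A * sigma2_a
v = rng.standard_normal(16)             # E(v) = 0, V(v) = I
a = L @ v                               # E(a) = 0, V(a) = L L' = A * sigma2_a
\end{verbatim}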

By Following Pedigrees

A second and slightly more efficient method of generating true breeding values is to follow the pedigree file chronologically. Sort the pedigrees so that the oldest animals appear before their progeny (as already given in the above table). Note that there are base population animals with unknown parents and animals with known parents.

Base population animals, by definition, are unrelated to each other and are non-inbred. Let ${\bf a}_{b}$ represent the vector of base population animals, in this case animals 1 to 4. Assume that

\begin{displaymath}{\bf a}_{b} \sim N({\bf0}, {\bf I}\sigma^{2}_{a}), \end{displaymath}

where $\sigma^{2}_{a}$ is the additive genetic variance, which represents the sum of additive genetic variances at all loci that govern this trait.

The other animals, 5 to 16, with known parents, are represented by ${\bf a}_{n}$. The additive genetic value of any animal in this vector can be written as

\begin{displaymath}a_{i} \ = \ 0.5*(a_{s} + a_{d}) + m_{i}, \end{displaymath}

where the subscripts s and d refer to sire and dam of animal i, and mi is the deviation of animal i from the average additive genetic value of its parents, known as the Mendelian sampling effect. In all situations, the Mendelian sampling effect is assumed to NOT be affected by selection of parents, and thus,

\begin{eqnarray*}m_{i} & \sim & N(0, (0.5*(1 - \bar{F})\sigma^{2}_{a}) ),
\end{eqnarray*}


where $\bar{F}$ is the average inbreeding coefficient of the two parents. The above formula for ai can be used to generate the additive genetic values of animals with parents, as long as the inbreeding coefficients of the parents are known. This is easily achieved using the Meuwissen and Luo (1992) algorithm described in an earlier chapter.
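A minimal sketch of this second method, again in Python and reusing the pedigree and ${\bf A}$ from the earlier sketches (the inbreeding coefficients are read off the diagonal of ${\bf A}$ here rather than computed with the Meuwissen and Luo algorithm, purely to keep the example short):

\begin{verbatim}
rng = np.random.default_rng(42)          # arbitrary seed
sigma_a = 6.0                            # sqrt(36)
F = np.diag(A) - 1.0                     # inbreeding coefficients, F_i = a_ii - 1

a = np.zeros(17)                         # index 0 unused (unknown parent)
for i, s, d in pedigree:
    if s == 0 and d == 0:                # base population animal
        a[i] = rng.standard_normal() * sigma_a
    else:                                # both parents known, as in this pedigree
        Fbar = 0.5 * (F[s - 1] + F[d - 1])
        m_sd = np.sqrt(0.5 * (1.0 - Fbar)) * sigma_a   # sd of Mendelian sampling
        a[i] = 0.5 * (a[s] + a[d]) + rng.standard_normal() * m_sd
\end{verbatim}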

Example

Assume heritability is 0.36 and that the variance of phenotypic records is 100. Then, $\sigma^{2}_{a} = 36$, and $\sigma^{2}_{e} = 64$. A base population animal's additive genetic value is created by obtaining a pseudo-random normal deviate, RND, and multiplying it by $\sigma_{a}$. For animals 1 to 4,

\begin{eqnarray*}a_{1} & = & -1.8014 * 6 \ = \ -10.8 \\
a_{2} & = & 0.6556 * 6 \ = \ 3.9 \\
a_{3} & = & 0.8029 * 6 \ = \ 4.8 \\
a_{4} & = & -0.4242 * 6 \ = \ -2.5
\end{eqnarray*}


The subsequent progeny can be generated as follows:

\begin{eqnarray*}a_{5} & = & 0.5*(a_{1} \ + \ a_{2}) \ + \ m_{5} \\
& = & 0.5*(-10.8 \ + \ 3.9) \ + \ m_{5} \ = \ -5.2 \\
& \vdots & \\
a_{15} & = & 0.5*(a_{5} \ + \ a_{4}) \ + \ m_{15} \ = \ -5.4 \\
a_{16} & = & -3.55 \ + \ (1.4474)*4.2426 \ = \ 2.6
\end{eqnarray*}


where 4.2426 is $(0.5 \ \sigma^{2}_{a})^{.5}$, and -3.55 is one half the sum of the parents' additive genetic values, $0.5*(a_{7}+a_{8})$. In general, the variance of the Mendelian sampling effect is $\sigma^{2}_{a} \times b_{ii}$, where $b_{ii} = 0.5*(1 - \bar{F})$.

Residual and Phenotypic Values

To create phenotypic records, the equation of the model must be specified. Assume that

\begin{displaymath}y_{i} \ = \ \mu \ + \ a_{i} \ + \ e_{i}. \end{displaymath}

To create yi, $\mu$ and ei need to be known. Let $\mu = 50$. Because $\mu$ does not have a subscript, all yi contain the same constant as part of the record. However, ei has a subscript i, and therefore a different value is added to each yi. The ei are assumed to be from a Normal distribution with a mean of zero and variance equal to $\sigma^{2}_{e}$,

\begin{displaymath}e_{i} \ \sim \ N(0, \sigma^{2}_{e}), \end{displaymath}

and these residual terms are assumed to be independent of ai. That is, the covariance between ai and ei is zero for all i. The residuals, ei, can be generated as

\begin{displaymath}e_{i} \ = \ \mbox{ RND } * \sigma_{e}. \end{displaymath}

The following table contains the components of each record. Records will be formed only for animals 5 to 16. Animals 1 to 4 are base animals and do not have any records themselves.

i $\mu$ $a_{i}$ RND $*\sigma_{e}$ $= e_{i}$   $= y_{i}$
5 50 -5.2 -0.7905 *8 = -6.324 = 38.5
6 50 3.4 -0.5634 *8 = -4.507 = 48.9
7 50 2.9 1.4213 *8 = 11.370 = 64.3
8 50 -10.0 1.3096 *8 = 10.476 = 50.5
9 50 -2.4 -1.4456 *8 = -11.564 = 36.0
10 50 3.3 1.6360 *8 = 13.088 = 66.4
11 50 -11.4 -1.2174 *8 = -9.739 = 28.9
12 50 3.6 2.4276 *8 = 19.420 = 73.0
13 50 -0.9 -0.6109 *8 = -4.887 = 44.2
14 50 -6.7 1.2639 *8 = 10.111 = 53.4
15 50 -5.4 -1.3777 *8 = -11.021 = 33.6
16 50 2.6 -0.3930 *8 = -3.144 = 49.5
Note that the ei values are generally much larger than the ai values and that they have the largest impact on yi. This is typical of nearly all traits measured on all livestock species.
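The records themselves can be simulated along the same lines; a sketch, continuing from the pedigree-following sketch above (it draws its own random deviates rather than the RND values shown in the table):

\begin{verbatim}
mu, sigma_e = 50.0, 8.0                  # mean and residual standard deviation
ids = np.arange(5, 17)                   # animals 5 to 16 have records
e = rng.standard_normal(ids.size) * sigma_e
y = mu + a[ids] + e                      # one record per animal
\end{verbatim}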

The variance of this sample of ei values was 109.83, which is greater than the population value of 64. However, such differences can be expected in small samples. In fact, the variance of the estimate of the sample variance is

\begin{displaymath}V( \hat{\sigma}^{2}_{e} ) \ = \
\frac{2 \sigma^{4}_{e}}{n}, \end{displaymath}

where n is the number of elements in the sample. In this case, n = 12, and

\begin{displaymath}V( \hat{\sigma}^{2}_{e} ) \ = \ \frac{2*4096}{12} \ = \ 682.666.. \end{displaymath}

which gives a standard deviation of 26.13. Thus, the average variance estimate of samples of 12 ei would be expected to be 64 with a standard deviation of 26.13, and the sample variance estimates would follow a Chi-square distribution with 12 degrees of freedom.

Derivation of Estimators

Selection Index

Assume a single record per animal, then

\begin{displaymath}I = b(y- \mu), \end{displaymath}

where b, the selection index weight, is derived from selection index equations as,

\begin{displaymath}\sigma^{2}_{y} \ b \ = \ \sigma_{ya}, \end{displaymath}

and $\mu$ is assumed to be known. The variance of y is

\begin{eqnarray*}\sigma^{2}_{y} & = & \sigma^{2}_{a} \ + \ \sigma^{2}_{e} \\
& = & 36 \ + \ 64 \ = \ 100,
\end{eqnarray*}


and the covariance between y and a is

\begin{displaymath}\sigma_{ya} \ = \ \sigma^{2}_{a} \ = \ 36, \end{displaymath}

so that

\begin{eqnarray*}\hat{b} & = & \sigma_{ya} / \sigma^{2}_{y} \\
& = & \sigma^{2}_{a} / ( \sigma^{2}_{a} \ + \ \sigma^{2}_{e} ) \\
& = & h^{2} \ = \ 0.36.
\end{eqnarray*}


An estimate of $\mu$ is just the overall mean, i.e.

\begin{displaymath}\hat{\mu} \ = \ (38.5 + 48.9 + \cdots + 49.5) / 12 \ = \ 48.9333.. \end{displaymath}

The indexes of the animals with records are given in the table below. Because the true breeding values (TBV) are known (from the simulation process), the correlation between index values and TBV can be calculated. The expected correlation is the square root of the heritability or .6. For this small amount of data the correlation was .6575.
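A short sketch of the selection index calculation, using the phenotypes and TBV from the tables in this lesson (the variable names are mine):

\begin{verbatim}
import numpy as np

# Records and true breeding values for animals 5 to 16, from the tables.
y = np.array([38.5, 48.9, 64.3, 50.5, 36.0, 66.4,
              28.9, 73.0, 44.2, 53.4, 33.6, 49.5])
tbv = np.array([-5.2, 3.4, 2.9, -10.0, -2.4, 3.3,
                -11.4, 3.6, -0.9, -6.7, -5.4, 2.6])

h2 = 0.36
mu_hat = y.mean()                        # 48.9333...
I = h2 * (y - mu_hat)                    # index, b = h^2 for a single record
print(np.round(I, 3))                    # -3.756, -0.012, 5.532, ... as in the table
print(np.corrcoef(I, tbv)[0, 1])         # about 0.6575
\end{verbatim}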

Results for Selection Index Method.
Animal TBV   I I-TBV $\hat{e}_{i}$
5 -5.2   -3.756 1.444 -6.67733
6 3.4   -0.012 -3.412 -.02133
7 2.9   5.532 2.632 9.83467
8 -10.0   0.564 10.564 1.00267
9 -2.4   -4.656 -2.256 -8.27733
10 3.3   6.288 2.988 11.17867
11 -11.4   -7.212 4.188 -12.82133
12 3.6   8.664 5.064 15.40267
13 -0.9   -1.704 -0.804 -3.02933
14 -6.7   1.608 8.308 2.85867
15 -5.4   -5.520 -0.120 -9.81333
16 2.6   0.204 -2.396 0.36267
Var. 30.0906   24.4841   70.93335
Cov.   17.8476      
Cor.   0.6575      
MSE       17.3062  

Note that base animals were not evaluated by this index because they did not have records. Also, relationships among animals were ignored. Animal 7, for example, had a record and 3 progeny, but the progeny information was not included in the evaluation. The MSE, mean squared error, is the average squared difference between the index and TBV. The MSE criterion is often used to compare different types of estimators. Another criterion is the variance of the estimated residuals (last column). The method giving the highest correlation between estimated and true values, or the smallest MSE, or the lowest variance of estimated residuals should be the preferred method.

The selection index method, although not a bad first attempt, comes pretty close to the true breeding values. The following methods are attempts to improve upon the selection index.

Generalized Least Squares

Let a general model be written as

\begin{displaymath}{\bf y} \ = \ {\bf W}\mbox{\boldmath$\beta$ } \ + \ {\bf e}, \end{displaymath}

where

\begin{displaymath}E({\bf y}) = {\bf W} \mbox{\boldmath$\beta$ }, \
\ \mbox{and} \ \ V({\bf y}) = {\bf V}. \end{displaymath}

The objective is to estimate $\beta$ by a linear function of the observations, ${\bf L}'{\bf y}$, such that the variance of the error of estimation is minimized and such that the expectation of the estimator equals the expected value of the estimand. Create a matrix ${\bf F}$ as the sum of the variance of the error of estimation and a Lagrange multiplier to force unbiasedness, then

\begin{eqnarray*}{\bf F} & = & Var({\bf L}'{\bf y}-\mbox{\boldmath$\beta$ })
+ ({\bf L}'{\bf W} - {\bf I}) \mbox{\boldmath$\phi$ } \\
& = & {\bf L}'{\bf VL} + ({\bf L}'{\bf W} - {\bf I})
\mbox{\boldmath$\phi$ }
\end{eqnarray*}


Now take derivatives with respect to the unknowns, ${\bf L}$ and $\phi$, and set the derivatives equal to null matrices. The following equations are obtained:

\begin{displaymath}\left( \begin{array}{ll} {\bf V} & {\bf W} \\
{\bf W}' & {\bf0} \end{array} \right)
\left( \begin{array}{c} {\bf L} \\ \mbox{\boldmath$\phi$ } \end{array} \right) \ = \
\left( \begin{array}{c}
{\bf0} \\ {\bf I} \end{array} \right). \end{displaymath}

Solving these equations gives

\begin{displaymath}{\bf L} \ = \ {\bf V}^{-1}{\bf W}({\bf W}'{\bf V}^{-1}{\bf W})^{-} \end{displaymath}

and therefore,

\begin{eqnarray*}{\bf L}'{\bf y} & = & ({\bf W}'{\bf V}^{-1}{\bf W})^{-}
{\bf W}'{\bf V}^{-1}{\bf y} \\
& = & \hat{ \mbox{ \boldmath$\beta$ } }
\end{eqnarray*}


The estimator of $\beta$ is the solution to the following equations, which are called Generalized Least Squares (GLS),

\begin{displaymath}{\bf W}'{\bf V}^{-1}{\bf W} \mbox{ \boldmath$\beta$ } \ = \
{\bf W}'{\bf V}^{-1}{\bf y}. \end{displaymath}

Ordinary Least Squares equations, (OLS), are obtained when ${\bf V} = {\bf I}\sigma^{2}$, which are

\begin{displaymath}{\bf W}'{\bf W} \mbox{ \boldmath$\beta$ } \ = \
{\bf W}'{\bf y}. \end{displaymath}

Weighted Least Squares, WLS, equations are obtained when ${\bf V}$ is a diagonal matrix and the diagonal elements are not all equal.

For the animal model, then

\begin{eqnarray*}{\bf W} & = & ( {\bf 1} \ \ \ {\bf I} ) \\
{\bf V} & = & {\bf A}_{nn}\sigma^{2}_{a} \ + \ {\bf I}\sigma^{2}_{e} \\
{\bf W}'{\bf V}^{-1}{\bf W} & = & \left( \begin{array}{ll}
{\bf 1}'{\bf V}^{-1}{\bf 1} & {\bf 1}'{\bf V}^{-1} \\
{\bf V}^{-1}{\bf 1} & {\bf V}^{-1} \end{array} \right) \\
{\bf W}'{\bf V}^{-1}{\bf y} & = & \left( \begin{array}{c}
{\bf 1}'{\bf V}^{-1}{\bf y} \\ {\bf V}^{-1}{\bf y}
\end{array} \right).
\end{eqnarray*}


To solve these GLS equations a restriction is needed. Let $\hat{ \mu} = 48.9333$, then

\begin{eqnarray*}{\bf V}^{-1}{\bf 1}\hat{\mu} + {\bf V}^{-1} \tilde{\bf a} & = &
{\bf V}^{-1}{\bf y} \\
{\bf V}^{-1} \tilde{\bf a} & = & {\bf V}^{-1}({\bf y} - {\bf 1}\hat{\mu}) \\
\tilde{\bf a} & = & ({\bf y} - {\bf 1}\hat{\mu})
\end{eqnarray*}


Thus, GLS in this situation gives estimated breeding values that are simply the observations deviated from the overall mean. The correlation of $\tilde{\bf a}$ with TBV is 0.6575, and the ranking of animals by $\tilde{\bf a}$ is identical to the ranking from the selection index. Thus, in terms of genetic change, exactly the same animals would be selected for breeding. Note that the MSE for the GLS estimator is much larger than for the selection index estimator. Thus, MSE is not a perfect criterion for distinguishing between two procedures.

Results for Generalized Least Squares.
Animal TBV   $\tilde{\bf a}$ $\tilde{\bf a}$-TBV
5 -5.2   -10.4333 -5.2333
6 3.4   -0.0333 -3.4333
7 2.9   15.3667 12.4667
8 -10.0   1.5667 11.5667
9 -2.4   -12.9333 -10.5333
10 3.3   17.4667 14.1667
11 -11.4   -20.0667 -8.6333
12 3.6   24.0667 20.4667
13 -0.9   -4.7333 -3.8333
14 -6.7   4.4667 11.1667
15 -5.4   -15.3333 -9.9333
16 2.6   0.5667 -2.0333
Var. 30.0906   188.9206  
Cov.   49.5766    
Cor.   0.6575    
MSE       109.8697

The estimated residuals in this case are all equal to zero, and therefore, the variance of the estimated residuals is also zero. However, this does not imply that GLS is the best method because the error is actually becoming part of the estimated breeding value.

Regressed Least Squares

This method consists of 'shrinking' the GLS estimator, $\tilde{\bf a}$, towards its expected value. If animals were unrelated, then the shrunken estimator would be $h^{2}\tilde{\bf a}$ which is identical to the selection index estimator. However, because animals are indeed related, the procedure is formally written as

\begin{eqnarray*}{\bf u}^{*} & = & ({\bf Z}'{\bf R}^{-1}{\bf Z}+{\bf G}^{-1})^{-1}
{\bf Z}'{\bf R}^{-1}({\bf y} - {\bf X}\hat{\bf b}) \\
& = & ({\bf Z}'{\bf R}^{-1}{\bf Z}+{\bf G}^{-1})^{-1}
{\bf Z}'{\bf R}^{-1}{\bf Z}\hat{\bf u}
\end{eqnarray*}


where $\hat{\bf u}$ and $\hat{\bf b}$ are solutions from

\begin{displaymath}\left( \begin{array}{c} \hat{\bf b} \\ \hat{\bf u} \end{array} \right)
\ = \ \left( \begin{array}{ll} {\bf X}'{\bf R}^{-1}{\bf X} & {\bf X}'{\bf R}^{-1}{\bf Z} \\
{\bf Z}'{\bf R}^{-1}{\bf X} & {\bf Z}'{\bf R}^{-1}{\bf Z} \end{array} \right)^{-}
\left( \begin{array}{c} {\bf X}'{\bf R}^{-1}{\bf y} \\
{\bf Z}'{\bf R}^{-1}{\bf y} \end{array} \right). \end{displaymath}

For the animal model example, then

\begin{eqnarray*}{\bf Z}'{\bf R}^{-1}{\bf Z}+{\bf G}^{-1} & = & ({\bf I} + {\bf A}^{-1}_{nn}k)
\frac{1}{\sigma^{2}_{e}} \\
{\bf u}^{*} & = & ({\bf I}+{\bf A}^{-1}_{nn}k)^{-1}({\bf y}
- \bar{\bf y}).
\end{eqnarray*}
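In Python, the last expression can be evaluated directly, reusing ${\bf A}$ and the phenotype vector y from the earlier sketches and taking ${\bf A}_{nn}$ as the block of ${\bf A}$ for animals 5 to 16 (an interpretation on my part):

\begin{verbatim}
k = 64.0 / 36.0                          # sigma_e^2 / sigma_a^2
A_nn = A[4:, 4:]                         # relationships among the animals with records
u_star = np.linalg.solve(np.eye(12) + np.linalg.inv(A_nn) * k, y - y.mean())
print(np.round(u_star, 2))               # cf. the RLS column in the table below
\end{verbatim}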


The selection index and regressed LS estimates are given in the following table.

Results for Regressed Least Squares and Selection Index.
Animal TBV I RLS $\hat{e}_{i}$  
5 -5.2 -3.76 -7.17 -3.2633  
6 3.4 -0.01 2.11 -2.1433  
7 2.9 5.53 8.41 6.9567  
8 -10.0 0.56 -2.19 3.7567  
9 -2.4 -4.66 -4.81 -8.1233  
10 3.3 6.29 6.26 11.2067  
11 -11.4 -7.21 -8.07 -11.9633  
12 3.6 8.66 9.39 14.6767  
13 -0.9 -1.70 -2.08 -2.6533  
14 -6.7 1.61 2.40 2.0667  
15 -5.4 -5.52 -6.76 -8.5733  
16 2.6 0.20 2.55 -1.9833  
Cor.   0.66 0.75  
MSE   17.31 20.18  
Var.       59.7168  

The RLS estimator gave a higher correlation than selection index, but the MSE was also greater. The variance of estimated residuals was smaller than for selection index.

Best Linear Unbiased Prediction

Prediction refers to the estimation of the realized value of a random variable (from data) that has been sampled from a population with a known variance-covariance structure. The general mixed model is written as

\begin{displaymath}{\bf y} \mbox{ = } {\bf Xb} + {\bf Zu} + {\bf e} \end{displaymath}

where
${\bf y}$
is an Nx1 vector of observations,
${\bf b}$
is a px1 vector of unknown constants,
${\bf u}$
is a qx1 vector of unknown effects of random variables,
${\bf e}$
is an Nx1 vector of unknown residual effects,
${\bf X,Z}$
are known matrices of order Nxp and Nxq respectively, that relate elements of ${\bf b}$ and ${\bf u}$ to elements of ${\bf y}$.

The elements of ${\bf b}$ are considered to be fixed effects while the elements of ${\bf u}$ are random factors from populations of random effects with known variance-covariance structures. Both ${\bf b}$ and ${\bf u}$ may be partitioned into one or more factors depending on the situation.

The expectations of the random variables are

\begin{eqnarray*}E({\bf b}) & = & {\bf b} \\
E({\bf u}) & = & {\bf0} \\
E({\bf e}) & = & {\bf0} \\
E({\bf y}) & = & E({\bf Xb+Zu+e}) \\
& = & {\bf Xb}
\end{eqnarray*}


and the variance-covariance structure is typically represented as

\begin{displaymath}Var \left( \begin{array}{c} {\bf u} \\ {\bf e} \end{array} \right)
\mbox{ = } \left( \begin{array}{cc} {\bf G} & {\bf0} \\
{\bf0} & {\bf R} \end{array} \right), \end{displaymath}

where ${\bf G}$ and ${\bf R}$ are known, positive definite matrices. Consequently,

\begin{eqnarray*}Var({\bf y}) & = & Var({\bf Xb+Zu+e}) \\
& = & {\bf ZGZ'+R}, \ = \ {\bf V}, \\
Cov({\bf y,u}) & = & {\bf ZG} \\
Cov({\bf y,e}) & = & {\bf R}
\end{eqnarray*}


If ${\bf u}$ is partitioned into s factors as

\begin{displaymath}{\bf u'} \mbox{ = } \left( \begin{array}{cccc}
{\bf u'_{1}} & {\bf u'_{2}} & \cdots & {\bf u'_{s}}
\end{array} \right), \end{displaymath}

then

\begin{displaymath}Var({\bf u}) \mbox{ = } Var \left( \begin{array}{c}
{\bf u_{1}} \\ {\bf u_{2}} \\ \vdots \\ {\bf u_{s}} \end{array} \right)
\mbox{ = } \left( \begin{array}{cccc}
{\bf G_{11}} & {\bf G_{12}} & \cdots & {\bf G_{1s}} \\
{\bf G'_{12}} & {\bf G_{22}} & \cdots & {\bf G_{2s}} \\
\vdots & \vdots & \ddots & \vdots \\
{\bf G'_{1s}} & {\bf G'_{2s}} & \cdots & {\bf G_{ss}}
\end{array} \right). \end{displaymath}

Each ${\bf G_{ij}}$ is assumed to be known.

The prediction problem involves both ${\bf b}$ and ${\bf u}$. A few definitions are needed.

Predictand
is the quantity to be predicted, in this case ${\bf K'b+M'u}$,
Predictor
is the function used to predict the predictand, a linear function of ${\bf y}$, i.e. ${\bf L'y}$.
Prediction Error
is the difference between the predictor and the predictand, i.e. \( {\bf K'b+M'u-L'y} \).
Unbiasedness
is a property of a predictor in which the expectation of the predictor is the same as the expectation of the predictand.

The derivation of BLUP begins by equating the expectations of the predictor and the predictand to determine what needs to be true in order for unbiasedness to hold. That is,

\begin{eqnarray*}E({\bf L'y}) & = & {\bf L'Xb} \\
E({\bf K'b+M'u}) & = & {\bf K'b}
\end{eqnarray*}


then to be unbiased for all possible vectors ${\bf b}$,

\begin{displaymath}{\bf L'X} \mbox{ = } {\bf K'} \end{displaymath}

must hold, or

\begin{displaymath}{\bf L'X-K'} \mbox{ = } {\bf0}. \end{displaymath}

The next step is to derive the variance of prediction error:

\begin{eqnarray*}Var({\bf K'b+M'u-L'y}) & = & Var({\bf M'u-L'y}) \\
& = & {\bf M'GM} + {\bf L'VL} - {\bf M'GZ'L} - {\bf L'ZGM} \\
& = & Var(PE).
\end{eqnarray*}


By requiring the predictor to be unbiased, the mean squared error is equivalent to the variance of prediction error. Now combine the variance of prediction error with a Lagrange multiplier to force unbiasedness to obtain the matrix ${\bf F}$, where

\begin{displaymath}{\bf F} \mbox{ = } Var(PE) + ({\bf L'X}-{\bf K'}){\bf\Phi}. \end{displaymath}

Minimization of the diagonals of ${\bf F}$ is achieved by differentiating ${\bf F}$ with respect to the unknowns, ${\bf L}$ and ${\bf\Phi}$, and equating the partial derivatives to null matrices.

\begin{eqnarray*}\frac{\partial {\bf F}}{\partial {\bf L}} & = &
2{\bf VL}-2{\bf ZGM}+{\bf X \Phi} \mbox{ = } {\bf0} \\
\frac{\partial {\bf F}}{\partial {\bf\Phi}} & = &
{\bf X'L} -{\bf K} \mbox{ = } {\bf0}
\end{eqnarray*}


Let ${\bf\theta}=.5{\bf\Phi}$, then the first derivative can be written as

\begin{eqnarray*}{\bf VL} & = & {\bf ZGM} - {\bf X \theta}
\end{eqnarray*}


then solve for ${\bf L}$ as

\begin{eqnarray*}{\bf L} & = & {\bf V^{-1}ZGM} - {\bf V^{-1}X \theta}.
\end{eqnarray*}


Substituting the above for ${\bf L}$ into the second derivative, then we can solve for ${\bf\theta}$ as

\begin{eqnarray*}{\bf X'L-K} & = & {\bf0} \\
{\bf X'(V^{-1}ZGM-V^{-1}X \theta)} & = & {\bf K} \\
{\bf X'V^{-1}X \theta} & = & {\bf X'V^{-1}ZGM} - {\bf K} \\
{\bf\theta} & = & ({\bf X'V^{-1}X})^{-}({\bf X'V^{-1}ZGM}-{\bf K})
\end{eqnarray*}


Substituting this solution for ${\bf\theta}$ into the equation for ${\bf L}$ gives

\begin{eqnarray*}{\bf L'} & = & {\bf M'GZ'V^{-1}}+{\bf K'(X'V^{-1}X)^{-}X'V^{-1}} \\
& & -{\bf M'GZ'V^{-1}X(X'V^{-1}X)^{-}X'V^{-1}}.
\end{eqnarray*}


If we let

\begin{displaymath}\hat{{\bf b}} \mbox{ = } ({\bf X'V^{-1}X})^{-}{\bf X'V^{-1}y} \end{displaymath}

then the predictor becomes

\begin{displaymath}{\bf L'y} \mbox{ = } {\bf K'\hat{b}} + {\bf M'GZ'V^{-1}}
({\bf y}-{\bf X \hat{b}}) \end{displaymath}

which is the BLUP of ${\bf K'b+M'u}$, and $\hat{\bf b}$ is a GLS solution for ${\bf b}$. A special case for this predictor would be to let ${\bf K'=0}$ and ${\bf M'=I}$, then the predictand is \( {\bf K'b+M'u} \mbox{ = } {\bf u} \), and

\begin{displaymath}{\bf L'y} \mbox{ = } \hat{{\bf u}} \mbox{ = }
{\bf GZ'V^{-1}}({\bf y - X \hat{b}}). \end{displaymath}

Hence the result that the predictor of ${\bf K'b+M'u}$ is \( \left( \begin{array}{ll} {\bf K'} & {\bf M'} \end{array} \right) \) times the predictor of \( \left( \begin{array}{ll} {\bf b'} & {\bf u'} \end{array} \right)' \), which is \( \left( \begin{array}{ll} {\bf\hat{b}'} & {\bf\hat{u}'} \end{array} \right)'. \)
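As a sketch, this V-based predictor can be evaluated directly for the example data, reusing ${\bf A}$ and y from the earlier sketches (the matrices below are my construction of ${\bf X}$, ${\bf Z}$, ${\bf G}$ and ${\bf R}$ for this example):

\begin{verbatim}
sigma2_a, sigma2_e = 36.0, 64.0
X = np.ones((12, 1))                              # overall mean
Z = np.hstack([np.zeros((12, 4)), np.eye(12)])    # records on animals 5..16 only
G = A * sigma2_a
R = np.eye(12) * sigma2_e
V = Z @ G @ Z.T + R

Vi = np.linalg.inv(V)
b_hat = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)   # GLS solution for the mean
u_hat = G @ Z.T @ Vi @ (y - X @ b_hat)                # BLUP of all 16 breeding values
\end{verbatim}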

Let

\begin{eqnarray*}{\bf P} & = & ({\bf X'V^{-1}X})^{-} \\
{\bf\hat{b}} & = & {\bf PX'V^{-1}y} \end{eqnarray*}


then

\begin{eqnarray*}{\bf\hat{u}} & = & {\bf GZ'V^{-1}}({\bf y-XPX'V^{-1}y}) \\
& = & {\bf GZ'V^{-1}Qy} \\
\mbox{for} \ \ {\bf Q} & = & ({\bf I-XPX'V^{-1}}). \end{eqnarray*}


From the results on generalized inverses of ${\bf X}$,

\begin{eqnarray*}{\bf XPX'V^{-1}X} & = & {\bf X}, \mbox{ and therefore} \\
{\bf QX} & = & ({\bf I-XPX'V^{-1}}){\bf X} \\
& = & {\bf X-XPX'V^{-1}X} \\
& = & {\bf X-X} \ = \ {\bf0}. \end{eqnarray*}


The variance of the predictor is then,

\begin{eqnarray*}Var({\bf\hat{u}}) & = & {\bf GZ'V^{-1}Q}(V({\bf y})){\bf Q'V^{-1}ZG} \\
& = & {\bf GZ'V^{-1}QVQ'V^{-1}ZG} \\
& = & {\bf GZ'V^{-1}}({\bf V-XPX'}){\bf V^{-1}ZG} \\
& = & {\bf GZ'V^{-1}ZG}-{\bf GZ'V^{-1}XPX'V^{-1}ZG}.
\end{eqnarray*}


The covariance between ${\bf\hat{b}}$ and ${\bf\hat{u}}$ is

\begin{eqnarray*}Cov({\bf\hat{b},\hat{u}}) & = & {\bf PX'V^{-1}}V({\bf y})
{\bf...
...X'Q'V^{-1}ZG} \\
& = & {\bf0} \ \mbox{because} \ {\bf X'Q = 0}
\end{eqnarray*}


Therefore, the total variance of the predictor is

\begin{eqnarray*}Var({\bf K'\hat{b}+M'\hat{u}}) & = & {\bf K'PK}+{\bf M'GZ'V^{-1}ZGM} \\
& & -{\bf M'GZ'V^{-1}XPX'V^{-1}ZGM}. \end{eqnarray*}


The variances of prediction error are

\begin{eqnarray*}Var({\bf\hat{b}-b}) & = & Var({\bf\hat{b}})+Var({\bf b})
-Cov({\bf\hat{b},b}) - Cov({\bf b,\hat{b}}) \\
Var({\bf\hat{u}-u}) & = & Var({\bf\hat{u}})+Var({\bf u})
-Cov({\bf\hat{u},u}) - Cov({\bf u,\hat{u}}) \\
Cov({\bf\hat{b}-b,\hat{u}-u}) & = & Cov({\bf\hat{b},\hat{u}})
- Cov({\bf\hat{b},u}) \\
& = & {\bf0}-{\bf PX'V^{-1}ZG}. \end{eqnarray*}


Derivation of MME

Take the first and second partial derivatives of ${\bf F}$, the variance-covariance matrix of prediction errors plus the Lagrange multiplier to force unbiasedness, and write them in matrix notation as


\begin{displaymath}\left( \begin{array}{ll} {\bf V} & {\bf X} \\ {\bf X'} & {\bf0}
\end{array} \right)
\left( \begin{array}{c} {\bf L} \\ {\bf\theta} \end{array} \right) \mbox{ = }
\left( \begin{array}{c}
{\bf ZGM} \\ {\bf K} \end{array} \right) \end{displaymath}

Recall that \( {\bf V} \mbox{ = } Var({\bf y}) \mbox{ = }
{\bf ZGZ'+R}, \) and let

\begin{displaymath}{\bf S} \mbox{ = } {\bf G}({\bf Z'L-M}) \end{displaymath}

which when re-arranged gives

\begin{displaymath}{\bf M} \mbox{ = } {\bf Z'L-G^{-1}S}, \end{displaymath}

then the previous equations can be re-written as

\begin{displaymath}\left( \begin{array}{lll} {\bf R} & {\bf X} & {\bf Z} \\
{\bf X'} & {\bf0} & {\bf0} \\
{\bf Z'} & {\bf0} & -{\bf G^{-1}} \end{array} \right)
\left( \begin{array}{c} {\bf L} \\ {\bf\theta} \\ {\bf S} \end{array} \right)
\mbox{ = }
\left( \begin{array}{c} {\bf0} \\ {\bf K} \\ {\bf M} \end{array} \right).\end{displaymath}

Take the first row of these equations and solve for ${\bf L}$, then substitute the solution for ${\bf L}$ into the other two equations. Thus,

\begin{displaymath}{\bf L} \mbox{ = } -{\bf R^{-1}X \theta}-{\bf R^{-1}ZS} \end{displaymath}

and

\begin{displaymath}- \left( \begin{array}{ll} {\bf X'R^{-1}X} & {\bf X'R^{-1}Z} \\
{\bf Z'R^{-1}X} & {\bf Z'R^{-1}Z+G^{-1}} \end{array} \right)
\left( \begin{array}{c} {\bf\theta} \\ {\bf S} \end{array} \right) \mbox{ = }
\left( \begin{array}{c} {\bf K} \\ {\bf M} \end{array}\right). \end{displaymath}

Let a solution to these equations be obtained by computing a generalized inverse of

\begin{displaymath}\left( \begin{array}{ll} {\bf X'R^{-1}X} & {\bf X'R^{-1}Z} \\
{\bf Z'R^{-1}X} & {\bf Z'R^{-1}Z+G^{-1}} \end{array} \right) \end{displaymath}

denoted as

\begin{displaymath}\left( \begin{array}{ll} {\bf C}_{xx} & {\bf C}_{xz} \\
{\bf C}_{zx} & {\bf C}_{zz} \end{array} \right), \end{displaymath}

then the solutions are

\begin{displaymath}\left( \begin{array}{c} {\bf\theta} \\ {\bf S} \end{array}\right)
\mbox{ = } - \left( \begin{array}{ll} {\bf C}_{xx} & {\bf C}_{xz} \\
{\bf C}_{zx} & {\bf C}_{zz} \end{array} \right)
\left( \begin{array}{c} {\bf K} \\ {\bf M} \end{array} \right). \end{displaymath}

Therefore, the predictor is

\begin{eqnarray*}{\bf L'y} & = & \left( \begin{array}{ll} {\bf K'} & {\bf M'}
\end{array} \right)
\left( \begin{array}{c} {\bf\hat{b}} \\ {\bf\hat{u}}
\end{array} \right),
\end{eqnarray*}


where \( {\bf\hat{b}} \mbox{ and } {\bf\hat{u}} \) are solutions to

\begin{displaymath}\left( \begin{array}{ll} {\bf X'R^{-1}X} & {\bf X'R^{-1}Z} \\
{\bf Z'R^{-1}X} & {\bf Z'R^{-1}Z+G^{-1}} \end{array} \right)
\left( \begin{array}{c} {\bf\hat{b}} \\ {\bf\hat{u}} \end{array} \right)
\mbox{ = }
\left( \begin{array}{c} {\bf X'R^{-1}y} \\
{\bf Z'R^{-1}y} \end{array} \right). \end{displaymath}

These final equations are known as Henderson's Mixed Model Equations or HMME. Notice that these equations are of order equal to the number of elements in ${\bf b}$ and ${\bf u}$, which is usually less than the number of elements in ${\bf y}$, and therefore, are more practical to obtain than the original BLUP formulation. Also, these equations require the inverse of ${\bf R}$ rather than ${\bf V}$, both of which are of the same order, but ${\bf R}$ is usually diagonal or has a simpler structure than ${\bf V}$. Also, the inverse of ${\bf G}$ is needed, which is of order equal to the number of elements in ${\bf u}$. The ability to compute the inverse of ${\bf G}$ depends on the model and the definition of ${\bf u}$.
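A generic sketch of setting up and solving these equations (the function name and the use of a least-squares solve as a stand-in for a generalized inverse are my choices):

\begin{verbatim}
import numpy as np

def solve_mme(X, Z, y, Ginv, Rinv):
    """Build and solve Henderson's mixed model equations."""
    C = np.block([
        [X.T @ Rinv @ X, X.T @ Rinv @ Z],
        [Z.T @ Rinv @ X, Z.T @ Rinv @ Z + Ginv],
    ])
    rhs = np.concatenate([X.T @ Rinv @ y, Z.T @ Rinv @ y])
    sol = np.linalg.lstsq(C, rhs, rcond=None)[0]   # acts as a g-inverse if C is singular
    p = X.shape[1]
    return sol[:p], sol[p:]                        # b_hat, u_hat
\end{verbatim}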

The variances of the predictors and prediction errors can be expressed in terms of the generalized inverse of the coefficient matrix of HMME. Recall that

\begin{displaymath}\left( \begin{array}{c} {\bf\hat{b}} \\ {\bf\hat{u}}
\end{array} \right) \mbox{ = }
\left( \begin{array}{ll} {\bf C_{xx}} & {\bf C_{xz}} \\
{\bf C_{zx}} & {\bf C_{zz}} \end{array} \right)
\left( \begin{array}{c} {\bf X'R^{-1}} \\
{\bf Z'R^{-1}} \end{array} \right) {\bf y}, \end{displaymath}

or in shortened notation as

\begin{displaymath}{\bf\hat{b}} \mbox{ = } {\bf C_{b}'}{\bf y}, \end{displaymath}

and

\begin{displaymath}{\bf\hat{u}} \mbox{ = } {\bf C_{u}'}{\bf y}. \end{displaymath}

Without loss of generality, assume that the coefficient matrix of HMME is full rank (to simplify the presentation of results), then

\begin{displaymath}\left( \begin{array}{ll} {\bf C_{xx}} & {\bf C_{xz}} \\
{\bf C_{zx}} & {\bf C_{zz}} \end{array} \right)
\left( \begin{array}{ll} {\bf X'R^{-1}X} & {\bf X'R^{-1}Z} \\ {\bf Z'R^{-1}X} &
{\bf Z'R^{-1}Z+G^{-1}} \end{array} \right)
\mbox{ = } \left( \begin{array}{cc} {\bf I} & {\bf0} \\
{\bf0} & {\bf I} \end{array} \right), \end{displaymath}

which gives the result that

\begin{displaymath}\left( \begin{array}{ll} {\bf C_{xx}} & {\bf C_{xz}} \\
{\bf C_{zx}} & {\bf C_{zz}} \end{array} \right)
\left( \begin{array}{ll} {\bf X'R^{-1}X} & {\bf X'R^{-1}Z} \\ {\bf Z'R^{-1}X} &
{\bf Z'R^{-1}Z} \end{array} \right)
\mbox{ = } \left( \begin{array}{ll}
{\bf I} & -{\bf C_{xz}}{\bf G^{-1}} \\
{\bf0} &
{\bf I}-{\bf C_{zz}}{\bf G^{-1}} \end{array} \right). \end{displaymath}

This last result is used over and over in deriving the remaining results. Now,

\begin{eqnarray*}Var({\bf\hat{b}}) & = & V({\bf C_{b}'}{\bf y}) \\
& = & {\bf C_{b}'}\,Var({\bf y})\,{\bf C_{b}} \\
& = & {\bf C_{xx}}.
\end{eqnarray*}


The remaining results are derived in a similar manner. These give

\begin{eqnarray*}Var({\bf\hat{u}}) & = & {\bf C_{u}}'Var({\bf y}){\bf C_{u}} \\
& = & {\bf G-C_{zz}} \\
Cov({\bf\hat{b}, \hat{u}-u}) & = & -Cov({\bf\hat{b},u}) \\
& = & {\bf C_{xz}}
\end{eqnarray*}


In matrix form, the variance-covariance matrix of the predictors is

\begin{displaymath}Var \left( \begin{array}{l} {\bf\hat{b}} \\ {\bf\hat{u}}
\end{array} \right) \mbox{ = }
\left( \begin{array}{ll}
{\bf C_{xx}} & {\bf0} \\ {\bf0} & {\bf G-C_{zz}} \end{array} \right), \end{displaymath}

and the variance-covariance matrix of prediction errors is

\begin{displaymath}Var \left( \begin{array}{c} {\bf\hat{b}} \\ {\bf\hat{u}-u}
\end{array} \right) \mbox{ = }
\left( \begin{array}{ll}
{\bf C_{xx}} & {\bf C_{xz}} \\ {\bf C_{zx}} & {\bf C_{zz}}
\end{array} \right). \end{displaymath}

As the number of observations in the analysis increases, two things can be noted from these results:

1.
$Var({\bf\hat{u}})$ increases in magnitude towards a maximum of G, and
2.
$Var({\bf\hat{u}-u})$ decreases in magnitude towards a minimum of 0.

Application to Example Data

For the simple animal model, HMME can be formed by noting that

\begin{eqnarray*}( {\bf X} \ \ \ {\bf Z}) & = & ({\bf 1} \ \ {\bf0} \ \ {\bf I}) \\
{\bf R} & = & {\bf I}\sigma^{2}_{e} \\
{\bf G} & = & {\bf A}\sigma^{2}_{a} \\
{\bf G}^{-1} & = & \left( \begin{array}{ll} {\bf A}^{bb} & {\bf A}^{bn} \\
{\bf A}^{nb} & {\bf A}^{nn} \end{array} \right)
\frac{1}{\sigma^{2}_{a}}.
\end{eqnarray*}


The resulting equations, with $k=\sigma^{2}_{e} / \sigma^{2}_{a}$, are

\begin{displaymath}\left( \begin{array}{lll} N & {\bf0} & {\bf 1}' \\
{\bf0} & {\bf A}^{bb}k & {\bf A}^{bn}k \\
{\bf 1} & {\bf A}^{nb}k & {\bf I}+{\bf A}^{nn}k \end{array} \right)
\left( \begin{array}{c} \hat{\mu} \\ \hat{\bf a}_{b} \\ \hat{\bf a}_{n}
\end{array} \right) \ = \
\left( \begin{array}{l} {\bf 1}'{\bf y} \\ {\bf0} \\
{\bf y} \end{array} \right). \end{displaymath}

Note that extra rows and columns were added to allow the base generation animals to be included. An alternative set of equations, which would give identical solutions for the animals with records, would be

\begin{displaymath}\left( \begin{array}{ll} N & {\bf 1}' \\ {\bf 1} &
{\bf I}+{\bf A}_{nn}^{-1}k \end{array} \right)
\left( \begin{array}{c} \hat{\mu} \\ \hat{\bf a}_{n} \end{array} \right) \ = \
\left( \begin{array}{l} {\bf 1}'{\bf y} \\ {\bf y}
\end{array} \right). \end{displaymath}

In order to use the latter equations, ${\bf A}_{nn}$ would have to be constructed and then inverted, and this would have to be accomplished without using Henderson's easy rules. The easy rules apply only if the base animals are included, and thus the former equations are recommended. The solutions to the equations for all animals in the example data set are given in the following table. The estimate of $\mu$ was 48.888601.
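A sketch of these equations for the example, reusing ${\bf A}$ and y from the earlier sketches (here ${\bf A}^{-1}$ is obtained numerically rather than by Henderson's rules, and the ordering of the 17 equations, the mean followed by the 16 animals, is my own choice):

\begin{verbatim}
k = sigma2_e / sigma2_a                  # 64/36
Ainv = np.linalg.inv(A)
N = 12

C = np.zeros((17, 17))                   # order: mu, then animals 1..16
C[0, 0] = N
C[0, 5:] = 1.0                           # animals 5..16 have records
C[5:, 0] = 1.0
C[1:, 1:] = Ainv * k
C[5:, 5:] += np.eye(12)

rhs = np.concatenate([[y.sum()], np.zeros(4), y])
sol = np.linalg.solve(C, rhs)
mu_hat, a_hat = sol[0], sol[1:]          # mu_hat is about 48.8886
\end{verbatim}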

Results for BLUP
Animal TBV   $\hat{\bf a}$ $\hat{\bf a}$-TBV $\hat{e}_{i}$
1 -10.8   -4.74 6.06  
2 3.9   0.37 -3.53  
3 4.8   5.86 1.06  
4 -2.5   -1.50 1.00  
           
5 -5.2   -7.14 -1.94 -3.2486
6 3.4   2.14 -1.26 -2.1286
7 2.9   8.44 5.54 6.9714
8 -10.0   -2.16 7.84 3.7714
9 -2.4   -4.78 -2.38 -8.1086
10 3.3   6.30 3.00 11.2114
11 -11.4   -8.04 3.36 -11.9486
12 3.6   9.42 5.82 14.6914
13 -0.9   -2.04 -1.14 -2.6486
14 -6.7   2.44 9.14 2.0714
15 -5.4   -6.73 -1.33 -8.5586
16 2.6   2.59 -0.01 -1.9786
Var(5-16) 30.0906   37.2250   59.7024
Var(1-16) 32.6700   31.2449    
Cov(5-16)   25.2577      
Cov(1-16)   24.1477      
Cor(5-16)   0.7547      
Cor(1-16)   0.7558      
MSE(5-16)       20.3286  
MSE(1-16)       18.4532  

Thus, BLUP analysis of the animal model gave a higher correlation with TBV than the simple selection index or the GLS estimator. The result was very similar to the regressed least squares estimator, except that the MSE was slightly higher for BLUP and the variance of the estimated residuals was slightly smaller. The BLUP analysis also provided estimators for the base generation animals, and utilized the relationships among all animals in the data.

Partitioning BLUP Solutions

Partitioning solutions from HMME became useful in explaining results from BLUP to dairy breeders in terms that they could understand. Partitioning also helps the researcher to understand what is contributing to the solutions.

From HMME, animal solutions have contributions from three basic sources of information, i.e. their own records, their parent average breeding value, and the average of their progeny EBV deviated from one half the mate's EBV. Take, for example, the equation for animal 7 in the example. The equation is

\begin{eqnarray*}1\mu - k(a_{2}) -k(a_{3}) & + & .5k(a_{6})+ (1+3.5k)a_{7} \\
+k(a_{8}) - k(a_{10}) -k(a_{12}) & - & k(a_{16}) \ = \ 64.3.
\end{eqnarray*}


This can be re-arranged as follows:

\begin{eqnarray*}(1+3.5k)a_{7} & = & (64.3 \ - \ 1\mu) \\
& & + k(a_{2}+a_{3}) \\
& & + k(a_{10}+a_{12}+a_{16}) \ - \ k(a_{8}) \ - \ .5k(a_{6}),
\end{eqnarray*}
so that the solution for animal 7 is a weighted combination of its own record (deviated from the mean), the average of its parents' EBV, and the average of its progeny EBV deviated from one half the mates' EBV, with weights
\begin{eqnarray*}w_{1} & = & n / (n+2k + .5pk) \\
w_{2} & = & 2k / (n+2k + .5pk) \\
w_{3} & = & .5pk / (n+2k + .5pk)
\end{eqnarray*}
where n is the number of records on the animal and p is the number of progeny.


For animal 7,

\begin{eqnarray*}\mbox{Data Ave.} & = & 64.3 - 48.888601 \ = \ 15.4114 \\
\mbox{Parent Ave.} & = & .5(\hat{a}_{2}+\hat{a}_{3}) \ = \ 3.1178 \\
\mbox{Progeny Ave.} & = & ( (2\hat{a}_{10}-\hat{a}_{8}) + (2\hat{a}_{12}-\hat{a}_{6})
+ (2\hat{a}_{16}-\hat{a}_{8}) )/3 \ = \ 12.9298 \\
\hat{a}_{7} & = & .13846(15.4114)+.49231(3.1178) \\
& & +.36923(12.9298) \\
& = & 8.4428
\end{eqnarray*}


From this small example, the weight on the parent average, PA, w2, is greater than the weights on either the animal's own record or on the average of 3 progeny, even at a heritability of .36. Dairy producers believe that w2 should be smaller than either w1 or w3, even though this could lead to a lower correlation between TBV and EBV. Below is a table of the weights for animal 7 if the heritability were different values.

HMME Weights for n=1, p=3, and both parents known.
h2 (n+2k+.5pk) w1 w2 w3
    Data PA Progeny
0.10 32.5 .03077 .55385 .41538
0.20 15.0 .06667 .53333 .40000
0.36 7.2222 .13846 .49231 .36923
0.50 4.5 .22222 .44444 .33333

As the heritability increases, the weight on an animal's own records increases, the weight on the parent average decreases, and the weight on the average of three progeny decreases. However, in all cases the weight on the parent average is still greater than the other two weights. In general, at least 4 progeny are needed to have the same weight on the parent average as on the progeny average. Thus, the parent average is equivalent to the average of four progeny, in terms of importance. At the same time, an animal needs $n \geq 2k$ records to have more importance than the parent average. At h2=0.5, this means that n must be greater than 2, but at h2=0.1, n must be greater than 18.
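The weights in the table are easy to reproduce; a sketch (the function name is mine, and $k = \sigma^{2}_{e}/\sigma^{2}_{a} = (1-h^{2})/h^{2}$), with the last line anticipating the 100-progeny case discussed next:

\begin{verbatim}
def hmme_weights(h2, n=1, p=3):
    """Weights on own records, parent average, progeny average; both parents known."""
    k = (1.0 - h2) / h2
    denom = n + 2.0 * k + 0.5 * p * k
    return n / denom, 2.0 * k / denom, 0.5 * p * k / denom

for h2 in (0.10, 0.20, 0.36, 0.50):
    print(h2, [round(w, 5) for w in hmme_weights(h2)])

print(hmme_weights(0.36, n=1, p=100))    # about (0.011, 0.038, 0.951)
\end{verbatim}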

If animal 7 had 100 progeny, n=1, and h2=0.36, then w1=0.01070, w2=0.03805, and w3=0.95125, so that the progeny average becomes the most important piece of information in the long run. This is because the progeny average represents what an animal transmits to its offspring, while an animal's own record can be affected by many factors, and the parent average also does not have much relevance when an animal has many progeny.

Robertson and Dempfle

This methodology was first given by Alan Robertson (1955) and was later revived by Leo Dempfle (1989) in a symposium in honour of Alan Robertson. The method is Bayesian in philosophy and consists of combining prior information about $\beta$ with information coming from the data. Let

\begin{eqnarray*}{\bf y} & = & {\bf W}\mbox{\boldmath$\beta$ } + {\bf e} \\
E(\mbox{\boldmath$\beta$ }) & = & {\bf r} \\
Var(\mbox{\boldmath$\beta$ }) & = & {\bf M} \\
{\bf y} \, | \, \mbox{\boldmath$\beta$ } & \sim & N( {\bf W}
\mbox{\boldmath$\beta$ }, {\bf R})
\end{eqnarray*}


Thus,

\begin{eqnarray*}\mbox{Prior Info} & = & E(\mbox{\boldmath$\beta$ }) \ = \ {\bf r} \\
\mbox{weight} & = & {\bf M}^{-1} \\
\mbox{Data Info} & = & ({\bf W}'{\bf R}^{-1}{\bf W})^{-1}{\bf W}'{\bf R}^{-1}{\bf y} \\
\mbox{weight} & = & {\bf W}'{\bf R}^{-1}{\bf W}
\end{eqnarray*}


The Prior Info is weighted by its weight and the Data Info is weighted by its weight, and the sum is divided by the sum of the weights. Hence,

\begin{eqnarray*}\mbox{\boldmath$\beta$ }^{o} & = & ({\bf W}'{\bf R}^{-1}{\bf W}
+ {\bf M}^{-1})^{-1} (
{\bf W}'{\bf R}^{-1}{\bf y} + {\bf M}^{-1}{\bf r})
\end{eqnarray*}


The result can be expanded by noting that

\begin{eqnarray*}\mbox{\boldmath$\beta$ } & = & \left( \begin{array}{c}
{\bf b} \\ {\bf u} \end{array} \right), \ \ \
{\bf W} \ = \ ( {\bf X} \ \ {\bf Z} ), \\
{\bf M} & = & \left( \begin{array}{cc} {\bf S} & {\bf0} \\
{\bf0} & {\bf G} \end{array} \right), \ \ \
{\bf r} \ = \ \left( \begin{array}{c} {\bf r}_{1} \\ {\bf r}_{2} \end{array} \right), \\
{\bf W}'{\bf R}^{-1}{\bf y} & = & \left( \begin{array}{c}
{\bf X}'{\bf R}^{-1}{\bf y} \\ {\bf Z}'{\bf R}^{-1}{\bf y}
\end{array} \right)
\end{eqnarray*}


Combining these bits gives Robertson's Mixed Model Equations, RMME, as follows.

\begin{displaymath}\left( \begin{array}{lr} {\bf X}'{\bf R}^{-1}{\bf X}+{\bf S}^{-1} &
{\bf X}'{\bf R}^{-1}{\bf Z} \\
{\bf Z}'{\bf R}^{-1}{\bf X} & {\bf Z}'{\bf R}^{-1}{\bf Z}+{\bf G}^{-1}
\end{array} \right)
\left( \begin{array}{c} {\bf b}^{o} \\ {\bf u}^{o} \end{array} \right) \ = \
\left( \begin{array}{l} {\bf X}'{\bf R}^{-1}{\bf y} + {\bf S}^{-1}{\bf r}_{1} \\
{\bf Z}'{\bf R}^{-1}{\bf y} +
{\bf G}^{-1}{\bf r}_{2} \end{array} \right). \end{displaymath}

RMME are slightly different from HMME. If ${\bf S}^{-1} = {\bf0}$, which would happen if nothing was known about ${\bf b}$, and if $E({\bf u})={\bf r}_{2}={\bf0}$, then RMME would become the same as HMME. When ${\bf r}_{2} \neq {\bf0}$, this is a situation in which selection may have occurred. Thus, RMME are more general in concept than HMME, but require more previous knowledge.
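A sketch of RMME as a small modification of the HMME function given earlier (again an illustration, with my own function and argument names):

\begin{verbatim}
def solve_rmme(X, Z, y, Ginv, Rinv, Sinv, r1, r2):
    """Robertson's mixed model equations: HMME plus prior means and variances."""
    C = np.block([
        [X.T @ Rinv @ X + Sinv, X.T @ Rinv @ Z],
        [Z.T @ Rinv @ X,        Z.T @ Rinv @ Z + Ginv],
    ])
    rhs = np.concatenate([X.T @ Rinv @ y + Sinv @ r1,
                          Z.T @ Rinv @ y + Ginv @ r2])
    sol = np.linalg.solve(C, rhs)
    p = X.shape[1]
    return sol[:p], sol[p:]              # b_o, u_o
\end{verbatim}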

Application to Example Data

For the animal model example, the RMME would be

\begin{displaymath}\left( \begin{array}{lll} N+s^{-1} & {\bf0} & {\bf 1}' \\
{\bf0} & {\bf A}^{bb}k & {\bf A}^{bn}k \\
{\bf 1} & {\bf A}^{nb}k & {\bf I}+{\bf A}^{nn}k \end{array} \right)
\left( \begin{array}{c} \mu^{o} \\ {\bf a}^{o}_{b} \\
{\bf a}^{o}_{n} \end{array} \right)
= \left( \begin{array}{l} {\bf 1}'{\bf y} + s^{-1}r_{1} \\
{\bf A}^{bb}k{\bf r}_{2b}+{\bf A}^{bn}k{\bf r}_{2n} \\
{\bf y} + {\bf A}^{nb}k{\bf r}_{2b}+{\bf A}^{nn}k{\bf r}_{2n}
\end{array} \right). \end{displaymath}

Suppose that from previous research it was known that

\begin{eqnarray*}r_{1} & = & 50, \\
s^{-1} & = & 5000, \\
\left( \begin{array}{c} {\bf r}_{2b} \\ {\bf r}_{2n} \end{array} \right)
& = & \left( \begin{array}{c} {\bf a}_{b} \\
{\bf a}_{n} \end{array} \right).
\end{eqnarray*}


That is, the mean is known to be 50 with a variance of $\frac{1}{5000}$, which is pretty accurate, but not perfect, and the expected values of ${\bf a}$ are their TBV (which would never be known in real life). The results from RMME using the above information give a correlation between estimated and true BV of .91, which is substantially better than the other methods.

More realistically, the expected values of the base population animals can be assumed to be zero, i.e. ${\bf r}_{2b}={\bf0}$, but due to selection or non-random mating, the expected value of offspring may not be a null vector. Assume that ${\bf r}_{2n} = -2 {\bf 1}$, which says that the offspring are below the average of the base population animals. Also, assume that nothing is known about $\mu$, so that $s^{-1}=0$. The results from RMME under these assumptions and from HMME are given in the following table.

Results for RMME and HMME.
Animal TBV HMME RMME
5 -5.2 -7.14 -9.14
6 3.4 2.14 0.14
7 2.9 8.44 6.44
8 -10.0 -2.16 -4.16
9 -2.4 -4.78 -6.78
10 3.3 6.30 4.30
11 -11.4 -8.04 -10.04
12 3.6 9.42 7.42
13 -0.9 -2.04 -4.04
14 -6.7 2.44 0.44
15 -5.4 -6.73 -8.73
16 2.6 2.59 0.59
Cor.   0.75 0.75
MSE   20.33 15.45

The correlation for RMME is the same as for HMME because the solutions for RMME are exactly 2 units lower than for HMME. However, notice that the MSE is smaller for RMME, which says that the solutions for offspring are closer to their true values.

The outcome from the use of RMME is dependent upon the Prior Info that goes into RMME. With good Prior Info the results are better than if the priors are poor. If the Prior Info is in error, then RMME could be worse than HMME or other methods.

RMME have not been used very often in animal breeding research, at least in terms of using Prior Info. With selection and non-random mating becoming more important in animal breeding, use of RMME should increase in the future because it provides a means to account for these biases.



Larry Schaeffer
1999-02-26