This LaTeX document is available as postscript or asAdobe PDF.
These notes are based on Sorensen (1998) from a course on "Gibbs Sampling in Quantitative Genetics" given at AGBU, Armidale, NSW, Australia. The Gibbs sampling tool to derive estimates of variance components from the joint posterior distribution seems to be a very powerful tool to use with animal models. Computationally, one needs only a program that calculates solutions to Henderson's mixed model equations, very good random number generators, and computer time to process very large numbers of samples.
Begin with the simple single trait animal model. That is,
For
,
the vector of additive genetic values, quantitative
genetics theory suggests that they follow a normal distribution, i.e.
Now form the joint posterior distribution as
In order to implement Gibbs sampling, all of the fully conditional
posterior distributions (one for each component of
) need
to be derived from the above joint posterior distribution.
Let
In general terms, the conditional posterior distribution of
is
Computational Scheme
Gibbs sampling is much like Gauss-Seidel iteration, except that
Round 1
Round 2 begins by re-forming the MME using the new variance ratio.
The equations have changed to
Burn In Period
Generally, the Gibbs chain of samples does not immediately converge to give samples from the joint posterior distribution. A burn in period is usually allowed during which the sampling process moves from the initial values of the parameters to those from the joint posterior distribution. The length of the burn-in period is usually judged by visually inspecting a plot of sample values across rounds. In the example figure below, the burn-in period should likely be 2000 rounds. This figure represents an animal model analysis of over 323,000 records on slightly over 1 million animals. The trait was one of the type traits in the Holstein breed. The samples generated during the burn-in period are typically discarded from further use.
Estimates
Each round of Gibbs sampling is dependent on the results of the previous round. Depending on the total number of observations and parameters, one round may be positively correlated with the next twenty to three hundred rounds. Even though samples are correlated, the user can determine the effective number of samples in the total by calculating lag correlations. Suppose a total of 12,000 samples (after removing the burn-in rounds) gave an effective number of samples equal to 500. This implies that every 24th sample should be uncorrelated.
An overall estimate of a parameter can be obtained by averaging all of the 12,000 samples (after the burn-in). However, to derive a confidence interval or to plot the distribution of the samples or to calculate the standard deviation of the estimate, the variance of the 500 independent samples should be used.
Influence of the Priors
In the small example,
va=ve=10 whereas N was only 5. Thus,
the prior values of the variances received more weight than information
coming from the data. This is probably appropriate for this small
example, but if N were 5,000,000, then the influence of the priors
would be next to none. The amount of influence of the priors is
not directly determined by the ratio of vi to N. In the small
example, even though
,
the influence of
S2a could be greater than
.
Schenkel (1998)
applied some methods recommended to him by Gianola in his thesis work.
Long Chain or Many Chains?
Early papers on MCMC (Monte Carlo Markov Chain) methods recommended running many chains of samples and then averaging the final values from each chain. This was to insure independence of the samples. That philosophy seems to have changed to a recommendation of one single long chain. For animal breeding applications this could mean 100,000 samples or more. If a month is needed to run 50,000 samples, then maybe three chains of 50,000 would be preferable. If only an hour is needed for 50,000 samples, then 1,000,000 samples would not be difficult to run. The main conclusion is that it is very difficult to know if or when a Gibbs chain has converged to the joint posterior distribution. This is currently an active area of statistical research.
This LaTeX document is available as postscript or asAdobe PDF.
Larry Schaeffer