This LaTeX document is available as postscript or as Adobe PDF.

Genomic Model

A genomic model separates the additive genetic merit of an individual into contributions from the male parent and from the female parent. The ability to do this may have an advantage in estimating the magnitude of QTL effects. In matrix form, the model is

\begin{displaymath}{\bf y} = {\bf Xb} + {\bf Zg}_{m} + {\bf Zg}_{f}
+ {\bf e}, \end{displaymath}

where ${\bf g}_{m}$ is the male parent contribution and ${\bf g}_{f}$ is the female parent contribution. Summed together they give the total additive genetic merit of an individual. Now let ${\bf g}$ represent all of the genomic effects ordered in pairs by animal, so that

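\begin{displaymath}{\bf g} = \left( \begin{array}{c} g_{1m} \\ g_{1f} \\
g_{2m} \\ g_{2f} \\ \vdots \\ g_{nm} \\ g_{nf} \end{array} \right)
= \left( \begin{array}{c} \mbox{animal 1, male genome} \\
\mbox{animal 1, female genome} \\ \mbox{animal 2, male genome} \\
\mbox{animal 2, female genome} \\ \vdots \\
\mbox{animal n, male genome} \\
\mbox{animal n, female genome}
\end{array} \right). \end{displaymath}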

Then

\begin{displaymath}Var( {\bf g}) = {\bf G}\sigma^{2}_{g}, \end{displaymath}

but

\begin{displaymath}{\bf G}\sigma^{2}_{g} = \sum_{i=1}^{\infty} {\bf G}_{i}
\sigma^{2}_{i}, \end{displaymath}

where ${\bf G}_{i}$ is the gametic relationship matrix for locus i and $\sigma^{2}_{i}$ is one half the additive genetic variance of locus i. The infinitesimal model assumes that there are an infinite number of loci affecting the trait, each with a small and nearly equal variance. The genomic relationship matrix, ${\bf G}$, can be viewed as the average of the infinite number of gametic relationship matrices, ${\bf G}_{i}$, under those assumptions. The genomic relationship matrix can be constructed using fairly simple rules.

1. Genomic Relationships

Suppose we have the following pedigrees.

Table 1.
    Animal    Sire    Dam
      A        -       -
      B        -       -
      C        A       B
      D        A       C
      E        D       B

Expand this table to identify the genomic pedigree structure.

Table 2.
    Animal    Genome    Parent1    Parent2
      A         A1         -          -
      A         A2         -          -
      B         B1         -          -
      B         B2         -          -
      C         C1         A1         A2
      C         C2         B1         B2
      D         D1         A1         A2
      D         D2         C1         C2
      E         E1         D1         D2
      E         E2         B1         B2

The genomic relationship matrix will therefore be of order 10. The diagonals of the genomic relationship matrix are all equal to 1.

                 A           B           C           D           E
              A1    A2    B1    B2    C1    C2    D1    D2    E1    E2
    A   A1     1     0     0     0
        A2     0     1     0     0
    B   B1     0     0     1     0
        B2     0     0     0     1
    C   C1                             1
        C2                                   1
    D   D1                                         1
        D2                                               1
    E   E1                                                     1
        E2                                                           1

Because the parents of A and B are unknown, A and B are assumed to be randomly drawn from a large random mating population and to have no genes identical by descent between them. All relationships among the genomes A1, A2, B1, and B2 are therefore zero.

Let (A1,C1) indicate the element in the above table between A1, the male parent contribution of animal A, and C1, the male parent contribution of animal C. Because C1 is a copy of either A1 or A2, each with probability one half, the value that goes into that location is

    (A1,C1) = 0.5 * [ (A1,A1) + (A1,A2) ] = 0.5.

Similarly, for the rest of the A1 row,

    (A1,C2) = 0.5 * [ (A1,B1) + (A1,B2) ] = 0,  
    (A1,D1) = 0.5 * [ (A1,A1) + (A1,A2) ] = 0.5, 
    (A1,D2) = 0.5 * [ (A1,C1) + (A1,C2) ] = 0.25, 
    (A1,E1) = 0.5 * [ (A1,D1) + (A1,D2) ] = 0.375,
    (A1,E2) = 0.5 * [ (A1,B1) + (A1,B2) ] = 0.

This recursive pattern follows through the entire table. The completed table is shown below.

                 A           B           C           D           E
              A1    A2    B1    B2    C1    C2    D1    D2    E1    E2
    A   A1     1     0     0     0    .5     0    .5   .25  .375     0
        A2     0     1     0     0    .5     0    .5   .25  .375     0
    B   B1     0     0     1     0     0    .5     0   .25  .125    .5
        B2     0     0     0     1     0    .5     0   .25  .125    .5
    C   C1    .5    .5     0     0     1     0    .5    .5    .5     0
        C2     0     0    .5    .5     0     1     0    .5   .25    .5
    D   D1    .5    .5     0     0    .5     0     1   .25  .625     0
        D2   .25   .25   .25   .25    .5    .5   .25     1  .625   .25
    E   E1  .375  .375  .125  .125    .5   .25  .625  .625     1  .125
        E2     0     0    .5    .5     0    .5     0   .25  .125     1
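
The recursive rule can be sketched in a few lines of Python. This is only an illustration and not part of the original notes: the names `pedigree` and `build_g` are made up for the example, and the sketch assumes that genomes are listed with parents before offspring and that a genome has either both parent genomes known or both unknown.

```python
# Genomic pedigree from Table 2: genome -> (parent genome 1, parent genome 2).
# None marks an unknown parent; parents are listed before their offspring.
pedigree = {
    "A1": (None, None), "A2": (None, None),
    "B1": (None, None), "B2": (None, None),
    "C1": ("A1", "A2"), "C2": ("B1", "B2"),
    "D1": ("A1", "A2"), "D2": ("C1", "C2"),
    "E1": ("D1", "D2"), "E2": ("B1", "B2"),
}

def build_g(pedigree):
    """Fill the genomic relationship table one genome at a time."""
    names = list(pedigree)
    g = {}
    for i, x in enumerate(names):
        g[x, x] = 1.0                      # all diagonals are equal to 1
        p1, p2 = pedigree[x]
        for y in names[:i]:                # every genome older than x
            if p1 is None:
                value = 0.0                # base genomes share no genes
            else:
                # average of the relationships of y with the two parent genomes
                value = 0.5 * (g[y, p1] + g[y, p2])
            g[x, y] = g[y, x] = value
    return g

G = build_g(pedigree)
print(G["A1", "E1"], G["D1", "D2"], G["E1", "E2"])   # prints 0.375 0.25 0.125
```

The three printed values are (A1,E1) and the inbreeding coefficients of D and E, and they agree with the completed table.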

Animals D and E are inbred, and the off-diagonals between D1 and D2 (.25) and between E1 and E2 (.125) are their inbreeding coefficients. Both the additive and dominance relationship matrices may be obtained from this genomic relationship table. The additive relationship between animals A and C is given by

    0.5 * [ (A1,C1) + (A1,C2) + (A2,C1) + (A2,C2) ] = 0.5.

For any pair of animals, add the four numbers in the corresponding 2 by 2 square of the table and divide by 2. The ${\bf A}$ matrix is then

\begin{displaymath}{\bf A} = \left( \begin{array}{lllll}
1 & 0 & .5 & .75 & .375 \\
0 & 1 & .5 & .25 & .625 \\
.5 & .5 & 1 & .75 & .625 \\
.75 & .25 & .75 & 1.25 & .75 \\
.375 & .625 & .625 & .75 & 1.125 \end{array}\right). \end{displaymath}

The dominance genetic relationship between animals X and Y, in general, is given by

    (X1,Y1)*(X2,Y2) + (X1,Y2)*(X2,Y1).

The complete dominance relationship matrix is

\begin{displaymath}{\bf D} = \left( \begin{array}{lllll}
1 & 0 & 0 & .25 & 0 \\
0 & 1 & 0 & 0 & .125 \\
0 & 0 & 1 & .25 & .25 \\
.25 & 0 & .25 & 1.0625 & .15625 \\
0 & .125 & .25 & .15625 & 1.015625
\end{array} \right). \end{displaymath}
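
Both matrices can be checked numerically. The Python sketch below is an illustration only: it copies the completed genomic relationship table (entered times 8 so that whole numbers can be typed) and applies the two formulas above. The function names `additive` and `dominance` are hypothetical, and animals are assumed to be numbered 0 to 4 with their male and female genomes in adjacent rows.

```python
# Completed genomic relationship table, each element multiplied by 8.
G8 = [
    [8, 0, 0, 0, 4, 0, 4, 2, 3, 0],
    [0, 8, 0, 0, 4, 0, 4, 2, 3, 0],
    [0, 0, 8, 0, 0, 4, 0, 2, 1, 4],
    [0, 0, 0, 8, 0, 4, 0, 2, 1, 4],
    [4, 4, 0, 0, 8, 0, 4, 4, 4, 0],
    [0, 0, 4, 4, 0, 8, 0, 4, 2, 4],
    [4, 4, 0, 0, 4, 0, 8, 2, 5, 0],
    [2, 2, 2, 2, 4, 4, 2, 8, 5, 2],
    [3, 3, 1, 1, 4, 2, 5, 5, 8, 1],
    [0, 0, 4, 4, 0, 4, 0, 2, 1, 8],
]
G = [[v / 8 for v in row] for row in G8]
n = len(G) // 2   # animal x owns rows 2x (male genome) and 2x+1 (female genome)

def additive(x, y):
    """Half the sum of the four genomic relationships between x and y."""
    return 0.5 * sum(G[2 * x + s][2 * y + t] for s in (0, 1) for t in (0, 1))

def dominance(x, y):
    """(X1,Y1)*(X2,Y2) + (X1,Y2)*(X2,Y1)."""
    return (G[2 * x][2 * y] * G[2 * x + 1][2 * y + 1]
            + G[2 * x][2 * y + 1] * G[2 * x + 1][2 * y])

A = [[additive(x, y) for y in range(n)] for x in range(n)]
D = [[dominance(x, y) for y in range(n)] for x in range(n)]
for row in A:
    print(row)
for row in D:
    print(row)
```

The diagonals of `A` are one plus the inbreeding coefficient, for example 1.25 for animal D and 1.125 for animal E.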

2. Example Genomic Model

Assume the five animals (A through E) had records equal to 5, 7, 9, 2, and 4, respectively, and that the ratio of residual to genomic variance is 3. Then the SAS/IML statements to analyze the data by the genomic model would be

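    proc iml;
    /* w = (X Z): column 1 is the mean, columns 2 to 11 are the
       genomes A1, A2, ..., E2; each record loads on both genomes
       of its animal */
    w = { 1  1 1  0 0  0 0  0 0  0 0,
          1  0 0  1 1  0 0  0 0  0 0,
          1  0 0  0 0  1 1  0 0  0 0,
          1  0 0  0 0  0 0  1 1  0 0,
          1  0 0  0 0  0 0  0 0  1 1};
    y = { 5, 7, 9, 2, 4 };
    yy = y`*y;
    ww = w`*w;
    wy = w`*y;
    /* genomic relationship matrix, entered times 8 */
    g = { 8 0  0 0  4 0  4 2  3 0,
          0 8  0 0  4 0  4 2  3 0,
          0 0  8 0  0 4  0 2  1 4,
          0 0  0 8  0 4  0 2  1 4,
          4 4  0 0  8 0  4 4  4 0,
          0 0  4 4  0 8  0 4  2 4,
          4 4  0 0  4 0  8 2  5 0,
          2 2  2 2  4 4  2 8  5 2,
          3 3  1 1  4 2  5 5  8 1,
          0 0  4 4  0 4  0 2  1 8};
    g = g/8;
    gi = inv(g);
    B = j(1,1,0);          /* nothing is added for the mean */
    H = block(B,gi);
    mme = ww + (H*3);      /* variance ratio of 3 */
    c = inv(mme);
    dhat = c*wy;           /* solutions */
    red = dhat`*wy;        /* reduction in sum of squares */
    ev = (yy - red)/4;     /* residual variance, 5 - 1 = 4 */
    print yy, red, ev, ww, wy, c, dhat;
    quit;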

Note that all the elements of G were multiplied by 8, so that whole numbers could be entered into the SAS program rather than decimal numbers. This avoids problems with rounding errors and makes the matrix easier to enter. The function B=j(n,m,k); sets up a matrix of order n by m with all elements equal to k. The block(B,gi) function creates a new matrix that looks like

\begin{displaymath}\left( \begin{array}{ll} \verb*/B/ & {\bf0} \\
{\bf0} & \verb*/gi/ \end{array} \right). \end{displaymath}

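The solution vector, dhat, contains $\hat{\mu}$ followed by the genome solutions in the order $\hat{A1}, \hat{A2}, \ldots, \hat{E2}$. Its last four elements are

\begin{displaymath}\left( \begin{array}{c} \hat{D1} \\ \hat{D2} \\
\hat{E1} \\ \hat{E2} \end{array} \right) = \left( \begin{array}{r}
-.81364 \\ -.28628 \\ -.68468 \\
.12053 \end{array} \right). \end{displaymath}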

3. Inverse of Genomic Matrix

Henderson (1975) presented a fast method to invert the additive genetic relationship matrix for the case when animals are not inbred. Quaas (1976) showed how to compute the inverse when animals are inbred. The fastest method of calculating inbreeding coefficients was presented by Meuwissen and Luo (1992). Combining the results of these papers gives a fast way of inverting the genomic relationship matrix.

Any positive definite matrix may be factored as

\begin{displaymath}{\bf G} = {\bf TDT}' \end{displaymath}

where ${\bf T}$ is a lower triangular matrix and ${\bf D}$ is a diagonal matrix. The Meuwissen and Luo (1992) algorithm has been modified to provide the diagonals of ${\bf D}$ while forming a row of ${\bf T}$. Animal genomes are processed in order from oldest to youngest. For animal genomes with unknown parent genomes, the diagonals of ${\bf D}$ are equal to 1. Therefore, the diagonals of ${\bf D}$ for A1, A2, B1, and B2 are equal to 1.

Begin with C1, whose parent genomes are A1 and A2. Form a table as follows:

    Genome    t       D
      C1      1       x
      A1      .5      1
      A2      .5      1

The diagonal element for (C1,C1) in ${\bf G}$ is equal to 1, which is equal to ${\bf t}'{\bf Dt}$, which is

\begin{displaymath}(1)^{2}x \ + \ (.5)^{2}(1) \ + \ (.5)^{2}(1) \ = \ 1, \end{displaymath}

which can be re-arranged and solved for x,

\begin{displaymath}x = 1 \ - \ .25 \ - \ .25 \ = \ .5. \end{displaymath}

A similar table and calculations can be made for C2, D1, and E2. Thus, their diagonal elements of ${\bf D}$ are also equal to .5.

Now make the table for D2 whose parent genomes are C1 and C2.

    Genome    t       D
      D2      1       x
      C1      .5      .5
      C2      .5      .5

Now we need to add the parent genomes of C1 and C2, as follows:

    Genome    t       D
      D2      1       x
      C1      .5       .5
      C2      .5       .5
      A1      .25     1
      A2      .25     1
      B1      .25     1
      B2      .25     1

The next step would be to add the 'parents' of A1 and A2, then B1 and B2, but these 'parents' are unknown, and so no further additions to the table are made. Now we compute ${\bf t}'{\bf Dt}$ as

\begin{displaymath}x + (.5)^{2}(.5) + (.5)^{2}(.5) + 4(.25)^{2}(1) = 1, \end{displaymath}

or

\begin{displaymath}x = 1 - .125 - .125 - 4(.0625) \ = \ .5. \end{displaymath}

So far, the diagonals of ${\bf D}$ have been either 1 or .5. Now make a table for E1, whose parent genomes are D1 and D2. As the animals become younger, these tables can become longer, and with n generations there can be up to $2^{n+1}$ elements in the table.

    Genome    t       D
      E1      1       x
      D1      .5      .5
      D2      .5      .5
      A1      .25     1 
      A2      .25     1
      C1      .25     .5
      C2      .25     .5
      A1      .125    1
      A2      .125    1
      B1      .125    1
      B2      .125    1

Notice that A1 and A2 appear twice in the table, and their coefficients in ${\bf t}$ must be added together before computing ${\bf t}'{\bf Dt}$. The new table, after adding coefficients, is

    Genome    t       D
      E1      1       x
      D1      .5      .5
      D2      .5      .5
      A1      .375     1 
      A2      .375     1
      C1      .25     .5
      C2      .25     .5
      B1      .125    1
      B2      .125    1

Then

\begin{displaymath}x = 1 - 2(.5)^{2}(.5) - 2(.375)^{2}(1) - 2(.25)^{2}(.5) - 2(.125)^{2}(1) = .375. \end{displaymath}

The complete results for the diagonals of ${\bf D}$ are given in the next table.

Table 3.
    Animal    Genome    Parent1    Parent2    ${\bf D}$
      A         A1         -          -         1
      A         A2         -          -         1
      B         B1         -          -         1
      B         B2         -          -         1
      C         C1         A1         A2        .5
      C         C2         B1         B2        .5
      D         D1         A1         A2        .5
      D         D2         C1         C2        .5
      E         E1         D1         D2        .375
      E         E2         B1         B2        .5
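
The table-building steps for the diagonals of ${\bf D}$ can be sketched in Python as follows. This is a minimal illustration of the tracing procedure described above, not the modified Meuwissen and Luo (1992) program itself; the names `parents` and `d_diagonals` are hypothetical, and genomes are assumed to be numbered 0 to 9 in Table 3 order with parents before offspring.

```python
# Parent genome indices for A1, A2, B1, B2, C1, C2, D1, D2, E1, E2
# (Table 3 order, oldest first); None marks unknown parent genomes.
parents = [None, None, None, None,
           (0, 1), (2, 3), (0, 1), (4, 5), (6, 7), (2, 3)]

def d_diagonals(parents):
    """Diagonal of D for every genome, oldest to youngest."""
    d = []
    for i, par in enumerate(parents):
        if par is None:
            d.append(1.0)              # unknown parents: diagonal is 1
            continue
        # t holds the coefficients of genome i and of all its ancestors
        t = [0.0] * (i + 1)
        t[i] = 1.0
        # Work from youngest to oldest so that repeated ancestors
        # (such as A1 and A2 for E1) are summed before being passed on.
        for j in range(i, -1, -1):
            if t[j] != 0.0 and parents[j] is not None:
                for p in parents[j]:
                    t[p] += 0.5 * t[j]
        # The diagonal of G is 1 = x + sum over ancestors of t_j^2 * d_j
        d.append(1.0 - sum(t[j] ** 2 * d[j] for j in range(i)))
    return d

print(d_diagonals(parents))
# prints [1.0, 1.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5, 0.375, 0.5]
```

The printed list matches the last column of Table 3.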

The inverse of ${\bf G}$ is

\begin{displaymath}{\bf G}^{-1} = {\bf T}^{-T}{\bf D}^{-1}{\bf T}^{-1}, \end{displaymath}

and as Henderson (1975) discovered, the elements of ${\bf T}^{-1}$ are all 1's on the diagonal, and each row has a -.5 in the columns corresponding to the two parent genomes. All other elements are equal to 0. This structure leads to a simple set of rules for creating the inverse of ${\bf G}$, which can be applied by going through the pedigrees in Table 3, one genome at a time. Let $d_{i}$ be equal to one over the diagonal of ${\bf D}$ for the ith genome, and let p1 and p2 be its parent genomes. The contributions of this genome to the inverse of ${\bf G}$ are the following values, added to the rows and columns for i, p1, and p2:

              i          p1         p2
    i        d_i      -.5d_i     -.5d_i
    p1     -.5d_i     .25d_i     .25d_i
    p2     -.5d_i     .25d_i     .25d_i

Applying these rules gives the complete inverse shown in the table below.

                     A               B               C               D               E
                A1      A2      B1      B2      C1      C2      D1      D2      E1      E2
    A   A1       2       1       0       0      -1       0      -1       0       0       0
        A2       1       2       0       0      -1       0      -1       0       0       0
    B   B1       0       0       2       1       0      -1       0       0       0      -1
        B2       0       0       1       2       0      -1       0       0       0      -1
    C   C1      -1      -1       0       0     2.5      .5       0      -1       0       0
        C2       0       0      -1      -1      .5     2.5       0      -1       0       0
    D   D1      -1      -1       0       0       0       0  2.6667   .6667 -1.3333       0
        D2       0       0       0       0      -1      -1   .6667  2.6667 -1.3333       0
    E   E1       0       0       0       0       0       0 -1.3333 -1.3333  2.6667       0
        E2       0       0      -1      -1       0       0       0       0       0       2

The reader should check that ${\bf G}^{-1}$, given above, multiplied by ${\bf G}$ gives an identity matrix, ${\bf I}$.
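
That check can be done exactly with rational arithmetic. The Python sketch below is an illustration, not the author's program: it builds the inverse from the rules (each genome adds one over its diagonal of ${\bf D}$ times the outer product of a vector with 1 for the genome and -.5 for each parent genome), rebuilds ${\bf G}$ from the recursive rule of Section 1, and multiplies the two. The variable names are hypothetical and genomes are numbered 0 to 9 in Table 3 order.

```python
from fractions import Fraction

# Table 3: parent genome indices and diagonals of D for A1, A2, ..., E2.
parents = [None, None, None, None,
           (0, 1), (2, 3), (0, 1), (4, 5), (6, 7), (2, 3)]
d = [Fraction(1)] * 4 + [Fraction(1, 2)] * 4 + [Fraction(3, 8), Fraction(1, 2)]
q = len(parents)

# Inverse of G by the rules, one genome at a time.
Ginv = [[Fraction(0)] * q for _ in range(q)]
for i, par in enumerate(parents):
    w = {i: Fraction(1)}                 # the genome itself
    if par is not None:
        for p in par:
            w[p] = Fraction(-1, 2)       # each parent genome
    for r, wr in w.items():
        for c, wc in w.items():
            Ginv[r][c] += wr * wc / d[i]

# G itself from the recursive rule: average over the two parent genomes.
G = [[Fraction(0)] * q for _ in range(q)]
for i, par in enumerate(parents):
    G[i][i] = Fraction(1)
    if par is not None:
        for j in range(i):
            G[i][j] = G[j][i] = (G[j][par[0]] + G[j][par[1]]) / 2

product = [[sum(Ginv[r][k] * G[k][c] for k in range(q)) for c in range(q)]
           for r in range(q)]
identity = [[Fraction(int(r == c)) for c in range(q)] for r in range(q)]
print(product == identity)               # prints True
print(float(Ginv[6][6]), float(Ginv[6][8]))
```

The last line prints the (D1,D1) and (D1,E1) elements, which are 8/3 and -4/3, the values shown rounded as 2.6667 and -1.3333 in the table.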

4. REML Estimation

Utilizing the example data in these notes, the usual EM algorithm formulas are

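\begin{eqnarray*}\hat{\sigma}^{2}_{e} & = & ( {\bf y}'{\bf y} - \hat{\bf d}'
{\bf W}'{\bf y} ) / ( N - r({\bf X})) \\
& = & (175 \ - \ 153.16603) / (5-1) \\
& = & 5.4585,
\end{eqnarray*}

for the residual variance (with ${\bf W}$ and $\hat{\bf d}$ the matrices w and dhat of the SAS statements, $N$ the number of observations, and $r({\bf X})$ the rank of ${\bf X}$), and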

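\begin{eqnarray*}\hat{\sigma}^{2}_{g} & = & ( \hat{\bf g}'{\bf G}^{-1}
\hat{\bf g} + tr({\bf G}^{-1}{\bf C}_{gg}) \hat{\sigma}^{2}_{e} ) / q \\
& = & (1.7345 \ + \ (2.9989)(5.4585) ) / 10 \\
& = & 1.8104,
\end{eqnarray*}

for the genomic variance, which equals one half of the additive genetic variance. Here ${\bf C}_{gg}$ is the block of the inverse of the mixed model coefficient matrix (c in the SAS statements) that corresponds to the genomic effects, and $q = 10$ is the number of genomes. The new variance ratio, 5.4585/1.8104 = 3.015, is slightly above the starting value of 3, which is not surprising with only 5 observations.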
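
One round of these calculations can be reproduced with the Python sketch below. It is an illustration under stated assumptions rather than the author's program: it assumes the usual EM-REML updates, with the trace term taken as the trace of ${\bf G}^{-1}$ times the genomic block of the inverse coefficient matrix, builds ${\bf G}^{-1}$ from the rules of Section 3, and uses a starting variance ratio of 3 as in the SAS statements. The helper `inverse` and all variable names are hypothetical.

```python
from fractions import Fraction

def inverse(m):
    """Gauss-Jordan inverse of a square matrix of Fractions."""
    n = len(m)
    a = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        piv = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[piv] = a[piv], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a]

# Inverse of G from Table 3 (parent genome indices and diagonals of D).
parents = [None, None, None, None,
           (0, 1), (2, 3), (0, 1), (4, 5), (6, 7), (2, 3)]
d = [Fraction(1)] * 4 + [Fraction(1, 2)] * 4 + [Fraction(3, 8), Fraction(1, 2)]
q = len(parents)
Ginv = [[Fraction(0)] * q for _ in range(q)]
for i, par in enumerate(parents):
    w = {i: Fraction(1)}
    for p in par or ():
        w[p] = Fraction(-1, 2)
    for r, wr in w.items():
        for c, wc in w.items():
            Ginv[r][c] += wr * wc / d[i]

# Mixed model equations: W = (1 | Z), animal k loads on genomes 2k and 2k+1.
y = [5, 7, 9, 2, 4]
N = len(y)
W = [[1] + [int(j // 2 == k) for j in range(q)] for k in range(N)]
ratio = 3
mme = [[Fraction(sum(W[k][r] * W[k][c] for k in range(N)))
        for c in range(q + 1)] for r in range(q + 1)]
for r in range(q):
    for c in range(q):
        mme[r + 1][c + 1] += ratio * Ginv[r][c]
C = inverse(mme)
wy = [sum(W[k][r] * y[k] for k in range(N)) for r in range(q + 1)]
dhat = [sum(C[r][c] * wy[c] for c in range(q + 1)) for r in range(q + 1)]
ghat = dhat[1:]

# EM-REML updates for the residual and genomic variances.
yy = sum(v * v for v in y)
red = sum(a * b for a, b in zip(dhat, wy))
sigma_e = (yy - red) / (N - 1)
quad = sum(ghat[r] * Ginv[r][c] * ghat[c] for r in range(q) for c in range(q))
trace = sum(Ginv[r][c] * C[c + 1][r + 1] for r in range(q) for c in range(q))
sigma_g = (quad + trace * sigma_e) / q
print(round(float(sigma_e), 4), round(float(sigma_g), 4))
```

Exact fractions are used so that the small example carries no rounding error; for real data a numerical linear algebra library would be the natural choice.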


Larry Schaeffer
2001-11-20