Haplotype probabilities in advanced intercross populations.

Karl W Broman

Abstract

Advanced intercross populations, in which multiple inbred strains are mated at random for many generations, have the advantage of greater precision of genetic mapping because of the accumulation of recombination events across the multiple generations. Related designs include heterogeneous stock and the diversity outcross population. In this article, I derive the two-locus haplotype probabilities on the autosome and X chromosome with these designs. These haplotype probabilities provide the key quantities for developing hidden Markov models for the treatment of missing genotype information. I further derive the map expansion in these populations, which is the frequency of recombination breakpoints on a random chromosome.

Keywords:  Collaborative Cross; Mouse Genetic Resource; advanced intercross lines; diversity outcross; heterogeneous stock; map expansion

Year:  2012        PMID: 22384398      PMCID: PMC3284327          DOI: 10.1534/g3.111.001818

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Advanced intercross populations, in which multiple inbred strains are mated at random for many generations, have the advantage of greater precision of genetic mapping because of the accumulation of recombination events across the multiple generations. The most commonly used form, which begins with two inbred strains, was formally introduced by Darvasi and Soller (1995) and called advanced intercross lines (AIL). A closely related design is that of heterogeneous stock (HS; see Mott et al.), in which eight inbred strains are randomly mated for many generations. Svenson et al. (2012) developed the diversity outcross population (DO), which was formed with progenitors that were partially inbred individuals drawn from intermediate generations in the development of the Collaborative Cross (so-called pre-CC mice; see Aylor et al. 2011).

The mapping of quantitative trait loci in such populations, whether by interval mapping (Lander and Botstein 1989) or Haley-Knott regression (Haley and Knott 1992), generally requires conditional genotype probabilities at putative quantitative trait loci, given the available marker genotype data. Such probabilities are often calculated using a hidden Markov model (HMM; see Broman and Sen 2009, App. D). An HMM for this purpose formally requires the calculation of two-locus diplotype probabilities. However, if the populations are formed with a large number of mating pairs, the two haplotypes within an individual are independent, and so it is sufficient to calculate two-locus haplotype probabilities.

Darvasi and Soller (1995) derived the two-locus haplotype probabilities for the autosome in AIL. I am not aware of any work considering the X chromosome. In this article, I derive the two-locus haplotype probabilities for the autosome and X chromosome in AIL, HS, and the DO. The calculations for the DO rely on recent results on haplotype probabilities in pre-CC mice (Broman 2012).
Throughout, I assume an effectively infinite set of mating pairs at each generation, no sex difference in recombination, and no selection or mutation.

Let us first revisit the two-locus autosomal haplotype probabilities in AIL, as they serve as a simple example of the technique used in these calculations (see also Bulmer 1980, Ch. 3). Let $p_s$ denote the frequency of the AA haplotype at generation F$_s$. Then $p_1 = 1/2$ and $p_2 = (1 - r)/2$ (an F$_1$ parent transmits an intact AA haplotype with probability $(1 - r)/2$), and for $s \ge 2$ we have the recurrence relation

$$p_{s+1} = (1 - r)\,p_s + \frac{r}{4} \qquad (1)$$

where $r$ is the recombination fraction (in one meiosis) between the two loci. Equation (1) is derived by noting that an AA haplotype drawn from generation F$_{s+1}$ is either an intact AA haplotype at generation F$_s$, transmitted without recombination, or it is a recombinant haplotype bringing two independent A alleles together. Note that the frequency of the A allele is 1/2 at every generation. The solution of this recurrence relation (see Graham et al.) is, for $s \ge 2$,

$$p_s = \frac{1}{4}\left[1 + (1 - 2r)(1 - r)^{s-2}\right] \qquad (2)$$

The frequency of recombinant haplotypes at generation F$_s$ is $1 - 2p_s$.

For the X chromosome in AIL, I will first consider a balanced case, begun with equal proportions of F$_1$ individuals from reciprocal crosses, A × B and B × A, so that the F$_1$ males are equally likely to be hemizygous A or B. Let $m_s$ and $f_s$ denote the frequency of the AA haplotype in males and females, respectively, at generation F$_s$. Then $m_1 = f_1 = 1/2$, $m_2 = (1 - r)/2$, and $f_2 = (2 - r)/4$, and for $s \ge 2$ we have

$$m_{s+1} = (1 - r)\,f_s + \frac{r}{4}, \qquad f_{s+1} = \frac{1}{2}\,m_s + \frac{1}{2}\left[(1 - r)\,f_s + \frac{r}{4}\right] \qquad (3)$$

This recurrence relation is derived in a similar way to that for the autosome, noting that the male haplotype was drawn from his mother, with a chance for recombination, and a random female haplotype is equally likely to have been drawn from her father, without recombination, or from her mother, with the potential for recombination. I again make use of the fact that the frequency of the A allele is 1/2 in both males and females at every generation. The solution to this relation for $s \ge 2$, equation (4), is a linear combination of powers of $w$ and $y$ about the limiting value 1/4, where $z = \sqrt{(1 - r)(9 - r)}$, $w = (1 - r + z)/4$, and $y = (1 - r - z)/4$.
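The recurrences for the autosome and the balanced X chromosome in AIL are easy to check numerically. The following is a minimal Python sketch, not code from the article: the function names are hypothetical, and the starting values at F2 ($p_2 = (1-r)/2$, $m_2 = (1-r)/2$, $f_2 = (2-r)/4$) are assumptions worked out from an F1 that is fully heterozygous in females and equally likely to be hemizygous A or B in males.

```python
def ail_autosome_iter(r, s):
    """AA haplotype frequency at F_s (s >= 2), by iterating
    p_{s+1} = (1 - r) p_s + r/4 from the assumed start p_2 = (1 - r)/2."""
    if s < 2:
        raise ValueError("s must be at least 2")
    p = (1 - r) / 2
    for _ in range(s - 2):
        p = (1 - r) * p + r / 4
    return p


def ail_autosome_closed(r, s):
    """Closed form for s >= 2: p_s = [1 + (1 - 2r)(1 - r)^(s-2)] / 4."""
    return (1 + (1 - 2 * r) * (1 - r) ** (s - 2)) / 4


def ail_x_balanced(r, s):
    """(m_s, f_s): AA haplotype frequencies on the X chromosome in males and
    females at F_s (s >= 2) for the balanced design, by direct iteration."""
    if s < 2:
        raise ValueError("s must be at least 2")
    m, f = (1 - r) / 2, (2 - r) / 4  # assumed F_2 values
    for _ in range(s - 2):
        # Haplotype transmitted by a female: intact, or recombinant joining
        # two independent A alleles (each with frequency 1/2).
        from_mother = (1 - r) * f + r / 4
        # Males inherit from the mother only; females from either parent.
        m, f = from_mother, (m + from_mother) / 2
    return m, f


if __name__ == "__main__":
    r, s = 0.1, 12
    print(ail_autosome_iter(r, s), ail_autosome_closed(r, s))
    m, f = ail_x_balanced(r, s)
    print("overall X recombinant frequency:", 1 - (2 * m + 4 * f) / 3)
```

Iterating the recurrence directly avoids any dependence on the closed-form X-chromosome solution, which is convenient when only numerical values are needed.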
Note that the frequencies of recombinant haplotypes in males and females are $1 - 2m_s$ and $1 - 2f_s$, respectively, and that the overall frequency is $1 - (2m_s + 4f_s)/3$.

Now I turn to the unbalanced case for the X chromosome, in which all F$_1$ individuals are derived from the cross female A × male B, so that all F$_1$ males are hemizygous A. This appears to be widely used in practice (e.g., Norgard et al. 2008; Kelly et al. 2010). The calculations are more difficult, because the allele frequencies are different in males and females and across generations.

I first calculate the single-locus allele frequencies. Let $q_s$ be the frequency of the A allele in females at generation F$_s$. Note that the frequency in males at F$_s$ is $q_{s-1}$. The initial values are $q_0 = 1$ and $q_1 = 1/2$, and we have the recurrence relation $q_{s+1} = (q_s + q_{s-1})/2$, which comes from the fact that a random allele drawn from the female at generation F$_{s+1}$ is equally likely to be an allele from the female or male at generation F$_s$, and the allele in the male at F$_s$ is a random allele from the female at F$_{s-1}$. The solution of the recurrence relation is $q_s = \frac{2}{3} + \frac{1}{3}\left(-\frac{1}{2}\right)^s$, for $s \ge 0$.

I now turn to the two-locus haplotype probabilities. Let $\tilde{m}_s$ and $\tilde{f}_s$ denote the frequencies of the AA haplotype on the X chromosome in males and females at generation F$_s$ in an unbalanced AIL, and note that $\tilde{m}_1 = 1$ and $\tilde{f}_1 = 1/2$. The haplotype probabilities satisfy a recurrence relation similar to that in equation (3):

$$\tilde{m}_{s+1} = (1 - r)\,\tilde{f}_s + r\,q_{s-1}\,q_{s-2}, \qquad \tilde{f}_{s+1} = \frac{1}{2}\,\tilde{m}_s + \frac{1}{2}\left[(1 - r)\,\tilde{f}_s + r\,q_{s-1}\,q_{s-2}\right] \qquad (5)$$

with $q_{-1} = 0$, the frequency of the A allele in the founder males. Note the distinction between equations (3) and (5): if a recombinant haplotype is transmitted from the F$_s$ female, the chance that it brings two A alleles together depends on the frequency of the A allele in males and females in the F$_{s-1}$ generation. In the balanced case, these are each 1/2; in the unbalanced case, they are different from each other and vary across generations. I have been unable to obtain closed-form solutions for $\tilde{m}_s$ and $\tilde{f}_s$. However, the values can be quickly calculated numerically, using equation (5).

Haplotype probabilities in the DO are calculated similarly. The progenitors for the DO were pre-CC mice.
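Before moving on to the DO, the numerical calculation for the unbalanced X chromosome described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the article's code: the function names are hypothetical, the female allele frequency is taken to be $q_s = 2/3 + (1/3)(-1/2)^s$ (which also gives $q_{-1} = 0$ for the founder males), and the recombinant term in the recurrence is taken to be $r$ times the product of the A-allele frequencies in F$_{s-1}$ females and males.

```python
def q_female(s):
    """Assumed A-allele frequency in females at F_s: 2/3 + (1/3)(-1/2)^s.
    Males at F_s carry frequency q_female(s - 1); q_female(-1) is 0."""
    return 2 / 3 + (-0.5) ** s / 3


def unbalanced_x(r, s):
    """(m_s, f_s): AA haplotype frequencies on the X chromosome in males and
    females at F_s (s >= 1) when every F_1 comes from female A x male B."""
    if s < 1:
        raise ValueError("s must be at least 1")
    m, f = 1.0, 0.5  # F_1: males hemizygous A, females heterozygous
    for t in range(1, s):  # step from F_t to F_{t+1}
        # An F_t female carries a maternal haplotype drawn from F_{t-1}
        # females and a paternal one from F_{t-1} males; a recombinant joins
        # one allele from each, so both are A with probability q_{t-1} q_{t-2}.
        from_mother = (1 - r) * f + r * q_female(t - 1) * q_female(t - 2)
        m, f = from_mother, (m + from_mother) / 2
    return m, f


if __name__ == "__main__":
    for gen in (2, 5, 10):
        print(gen, unbalanced_x(0.1, gen))
```

With $r = 0$ the two loci are completely linked, so the haplotype frequencies should reduce to the single-locus allele frequencies; this gives a simple sanity check on the indexing.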
I assume a large number of progenitors, that they were drawn from independent lines, and that the order of the crosses that generated the different lines was random, giving complete balance across the eight alleles.

In a potential abuse of notation, I will redefine the $q$, $p$, $m$, and $f$ variables used previously. Let $q_k$ denote the frequency of the AA haplotype at generation G2:F$_k$ in the pre-CC; this is obtained by rescaling the haplotype probability in Table 4 of Broman (2012). Let $p_s$ be the probability of the AA haplotype at generation $s$ of the diversity outcross.

The pre-CC progenitors of the DO were drawn from independent lines at a variety of different generations along the course to inbreeding. Let $\alpha_k$ denote the proportion of the pre-CC progenitors that were at generation G2:F$_k$, and note that a pre-CC progenitor at generation G2:F$_k$ will transmit the AA haplotype with frequency $q_{k+1}$ (that is, the frequency of the AA haplotype at generation G2:F$_{k+1}$). Thus, the frequency of the AA haplotype at the first generation of the DO is $p_1 = \sum_k \alpha_k\, q_{k+1}$. The recurrence relation for the $p_s$ is like that in equation (1): $p_{s+1} = (1 - r)\,p_s + r/64$. The solution is

$$p_s = \frac{1}{64} + (1 - r)^{s-1}\left(p_1 - \frac{1}{64}\right)$$

Note that the recombinant haplotypes are all equally likely, due to the random order of the initial crosses, and so each has probability $(1 - 8p_s)/56$. HS corresponds to the DO with $\alpha_1 = 1$ (that is, $k \equiv 1$), in which case $p_1 = q_2 = (1 - r)^3/8$.

I now turn to the X chromosome. Let $m_s$ and $f_s$ denote the frequency of the AA haplotype on the X chromosome in males and females in the DO at generation $s$. Assuming random orders of crosses to generate the pre-CC progenitors, $f_1$ is an $\alpha_k$-weighted sum built from the frequencies of the AA and CC haplotypes on the X chromosome in females at the corresponding G1:F generations in the construction of four-way RIL by sibling mating (see Broman 2012, Table 4); $m_1$ is calculated in the same way. The recurrence relations are much like equation (3):

$$m_{s+1} = (1 - r)\,f_s + \frac{r}{64}, \qquad f_{s+1} = \frac{1}{2}\,m_s + \frac{1}{2}\left[(1 - r)\,f_s + \frac{r}{64}\right]$$

The solutions are again linear combinations of powers of $w$ and $y$, now about the limiting value 1/64, where $w$, $y$, and $z$ are as in equation (4).
Again, HS corresponds to the DO with $\alpha_1 = 1$, in which case $f_1 = (4 - 5r + r^2)/32$ and $m_1 = (2 - 3r + r^2)/16$.

In Figure 1, the probabilities of recombinant two-locus haplotypes are displayed for the different populations. For the DO, I used the distribution of $k$ as in Figure 1 of Svenson et al. (2012) and $s = 5$. For HS and AIL, I used $s = 10$ and 12, respectively, to match the total number of generations with recombination; the average $k$ in Svenson et al. (2012) was six. Recombinant haplotypes are more frequent on the autosome, and are more frequent in HS than in the DO; inbreeding in the pre-CC progenitors of the DO is accompanied by a loss of recombinants.
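The autosomal calculation for the DO and HS can be illustrated with a short Python sketch. It is a sketch under assumptions rather than the article's implementation: the function names are hypothetical, the recurrence is taken as $p_{s+1} = (1 - r)p_s + r/64$, and the HS starting value is assumed to be $p_1 = (1 - r)^3/8$ (three fully heterozygous meioses). For the DO itself, $p_1$ would have to be supplied from the pre-CC haplotype probabilities in Broman (2012), which are not reproduced here.

```python
def do_autosome_iter(p1, r, s):
    """AA haplotype probability at DO generation s (s >= 1), by iterating
    p_{s+1} = (1 - r) p_s + r/64 from a supplied first-generation value p1."""
    if s < 1:
        raise ValueError("s must be at least 1")
    p = p1
    for _ in range(s - 1):
        p = (1 - r) * p + r / 64
    return p


def do_autosome_closed(p1, r, s):
    """Closed form of the same recurrence: 1/64 + (1 - r)^(s-1) (p1 - 1/64)."""
    return 1 / 64 + (1 - r) ** (s - 1) * (p1 - 1 / 64)


def hs_p1(r):
    """Assumed first-generation value for HS (alpha_1 = 1): (1 - r)^3 / 8."""
    return (1 - r) ** 3 / 8


def each_recombinant_prob(p):
    """Probability of any one of the 56 recombinant two-locus haplotypes,
    given the probability p of a single nonrecombinant haplotype."""
    return (1 - 8 * p) / 56


if __name__ == "__main__":
    r, s = 0.05, 10
    p = do_autosome_closed(hs_p1(r), r, s)
    print("P(AA) =", p, " P(AB) =", each_recombinant_prob(p))
```

The eight nonrecombinant and 56 recombinant haplotype probabilities sum to one by construction, which is a quick consistency check when plugging in a different $p_1$.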
Figure 1 

Frequency of a two-locus haplotype being recombinant, as a function of the recombination fraction at meiosis, for the diversity outcross population at s = 5 (solid curves), heterogeneous stock at s = 10 (dashed curves), and balanced AIL at s = 12 (dotted curves), for the autosome (black), male X (blue), and female X (red). The green dashed curve is the recombinant frequency for HS at s = 10 assumed in Mott et al.

It is particularly interesting to consider the map expansion in these populations, which is the frequency of recombination breakpoints on a random chromosome. Let $R(r)$ denote the probability of a recombinant haplotype; then the map expansion is $dR/dr$ evaluated at $r = 0$ (see Teuscher and Broman 2007).

The map expansion on an autosome in AIL is $s/2$. For the DO, on an autosome, the map expansion satisfies $M_s = M_1 + 7(s - 1)/8$, where $M_1$ is the weighted average (with weights $\alpha_k$) of the map expansion in the pre-CC at generation G2:F$_{k+1}$ (see Broman 2012, Table 4). For the particular progenitors detailed in Svenson et al. (2012, Figure 1), this is approximately $(7s + 37)/8$. For HS, we have $M_1 = 3$ and $M_s = (7s + 17)/8$.

For the X chromosome in balanced AIL, HS and DO, the map expansion is 2/3 that of the autosome. For the case of the X chromosome in unbalanced AIL, in which all F$_1$ males are hemizygous A, I cannot derive a closed-form solution, but taking the derivatives of the recurrence relations in equation (5), I can derive a simple recurrence relation for the map expansion. (Note that the overall map expansion on the X chromosome can be obtained as the average of the sex-specific map expansions, with weight 2/3 given to the female, since two-thirds of the X chromosomes are in females.) Let $M_s$ denote the map expansion at F$_s$, and again let $q_s$ be the frequency of the A allele in females at F$_s$. The map expansion then satisfies a recurrence relation in $M_s$ and $q_s$, with initial conditions fixed by the first generations. Although I have not been able to derive a closed-form solution for $M_s$, it is easily calculated numerically.
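The autosomal map expansions for the DO and HS follow from the haplotype recurrence in a few lines. The following is a sketch of that step, assuming the autosomal recurrence $p_{s+1} = (1 - r)p_s + r/64$ and writing $R_s = 1 - 8p_s$ for the probability of a recombinant haplotype and $M_s$ for its derivative at $r = 0$ (symbols introduced here for convenience):

```latex
\begin{align*}
R_{s+1} &= 1 - 8\left[(1 - r)\,p_s + \tfrac{r}{64}\right]
         = (1 - r)\,R_s + \tfrac{7}{8}\,r, \\
M_{s+1} &= \left.\frac{dR_{s+1}}{dr}\right|_{r=0}
         = M_s - R_s(0) + \tfrac{7}{8}
         = M_s + \tfrac{7}{8}
         \qquad \text{since } R_s(0) = 0, \\
M_s &= M_1 + \tfrac{7}{8}\,(s - 1).
\end{align*}
```

With $M_1 = 3$ for HS this gives $3 + 7(s - 1)/8 = (7s + 17)/8$, and the approximate DO value $(7s + 37)/8$ corresponds to $M_1 \approx 11/2$.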
The aforementioned haplotype probabilities provide the key quantities for developing HMMs for advanced intercross populations. However, it should be noted that there are other approaches to handling such data. For example, Besnier et al. (2011) used a variance components model to analyze outbred chicken AIL data, with identity-by-descent probabilities calculated using a modified version of the method of Pong-Wong et al. (2001) for general pedigree data.

The aforementioned result for HS differs from that in Mott et al., which is incorporated into the HAPPY software. They had assumed a different map expansion in HS, whereas I show it to be $3 + 7(s - 1)/8 = (7s + 17)/8$. In the first three generations with recombination, individuals are fully heterozygous, and so all recombination events can be seen; in the subsequent $s - 1$ generations, there is a 1/8 chance of homozygosity and so only 7/8 of recombination events can be seen. Mott et al. further assumed that the transition probabilities along an HS chromosome are a function of genetic distance, but that requires knowledge of the map function. It is more direct to express the transition probabilities in terms of the recombination fraction at meiosis. The green curve in Figure 1 displays the probability of a recombinant haplotype assumed in Mott et al. for HS with $s = 10$ when the map function corresponding to the gamma model with the level of crossover interference estimated for the mouse in Broman et al. (2002) is used. The probability is slightly smaller than that from my calculations; at $r = 0.01$, the equation in Mott et al. gives 0.099, whereas I obtain 0.103.

I have assumed an effectively infinite number of mating pairs at each generation. In practice, with a finite number of mating pairs, there will be some inbreeding and so an increased frequency of homozygosity and a decreased frequency of recombination. In addition, the individuals at the final generation will include siblings, and the relationships among individuals might be used to improve the genotype reconstruction.
In practice, for computational efficiency, both the inbreeding and the relationships among individuals would probably be ignored in the genotype reconstruction, and with dense genotype data, there will be little loss of information.
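The HS figures quoted in the comparison with Mott et al. can be reproduced with a small numerical check. This Python sketch is illustrative only: the function names are hypothetical, and it assumes the autosomal closed form $p_s = 1/64 + (1 - r)^{s-1}(p_1 - 1/64)$ with the HS starting value $p_1 = (1 - r)^3/8$. The map expansion is approximated by a one-sided finite difference rather than an exact derivative.

```python
def hs_recombinant_prob(r, s):
    """Probability that a two-locus autosomal haplotype is recombinant in HS
    at generation s, assuming p_1 = (1 - r)^3 / 8."""
    p1 = (1 - r) ** 3 / 8
    p = 1 / 64 + (1 - r) ** (s - 1) * (p1 - 1 / 64)
    return 1 - 8 * p


def hs_map_expansion(s, eps=1e-7):
    """Finite-difference approximation to dR/dr at r = 0.
    Uses R(0) = 0, so the slope is R(eps) / eps up to O(eps) error."""
    return hs_recombinant_prob(eps, s) / eps


if __name__ == "__main__":
    s = 10
    print("R(r = 0.01):", round(hs_recombinant_prob(0.01, s), 3))
    print("map expansion:", hs_map_expansion(s), "vs (7s + 17)/8 =", (7 * s + 17) / 8)
```

At $s = 10$ the finite-difference slope should agree with $(7s + 17)/8 = 10.875$ to several decimal places, and the recombinant probability at $r = 0.01$ rounds to the 0.103 quoted in the text.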
References (10 of 13 shown)

1.  Crossover interference in the mouse.

Authors:  Karl W Broman; Lucy B Rowe; Gary A Churchill; Ken Paigen
Journal:  Genetics       Date:  2002-03       Impact factor: 4.562

2.  Genetic architecture of voluntary exercise in an advanced intercross line of mice.

Authors:  Scott A Kelly; Derrick L Nehrenberg; Jeremy L Peirce; Kunjie Hua; Brian M Steffy; Tim Wiltshire; Fernando Pardo-Manuel de Villena; Theodore Garland; Daniel Pomp
Journal:  Physiol Genomics       Date:  2010-04-13       Impact factor: 3.107

3.  A simple regression method for mapping quantitative trait loci in line crosses using flanking markers.

Authors:  C S Haley; S A Knott
Journal:  Heredity (Edinb)       Date:  1992-10       Impact factor: 3.821

4.  Mapping mendelian factors underlying quantitative traits using RFLP linkage maps.

Authors:  E S Lander; D Botstein
Journal:  Genetics       Date:  1989-01       Impact factor: 4.562

5.  High-resolution genetic mapping using the Mouse Diversity outbred population.

Authors:  Karen L Svenson; Daniel M Gatti; William Valdar; Catherine E Welsh; Riyan Cheng; Elissa J Chesler; Abraham A Palmer; Leonard McMillan; Gary A Churchill
Journal:  Genetics       Date:  2012-02       Impact factor: 4.562

6.  Genetic analysis of complex traits in the emerging Collaborative Cross.

Authors:  David L Aylor; William Valdar; Wendy Foulds-Mathes; Ryan J Buus; Ricardo A Verdugo; Ralph S Baric; Martin T Ferris; Jeff A Frelinger; Mark Heise; Matt B Frieman; Lisa E Gralinski; Timothy A Bell; John D Didion; Kunjie Hua; Derrick L Nehrenberg; Christine L Powell; Jill Steigerwalt; Yuying Xie; Samir N P Kelada; Francis S Collins; Ivana V Yang; David A Schwartz; Lisa A Branstetter; Elissa J Chesler; Darla R Miller; Jason Spence; Eric Yi Liu; Leonard McMillan; Abhishek Sarkar; Jeremy Wang; Wei Wang; Qi Zhang; Karl W Broman; Ron Korstanje; Caroline Durrant; Richard Mott; Fuad A Iraqi; Daniel Pomp; David Threadgill; Fernando Pardo-Manuel de Villena; Gary A Churchill
Journal:  Genome Res       Date:  2011-03-15       Impact factor: 9.043

7.  A simple and rapid method for calculating identity-by-descent matrices using multiple markers.

Authors:  R Pong-Wong; A W George; J A Woolliams; C S Haley
Journal:  Genet Sel Evol       Date:  2001 Sep-Oct       Impact factor: 4.297

8.  Identification of quantitative trait loci affecting murine long bone length in a two-generation intercross of LG/J and SM/J Mice.

Authors:  Elizabeth A Norgard; Charles C Roseman; Gloria L Fawcett; Mihaela Pavlicev; Clinton D Morgan; L Susan Pletscher; Bing Wang; James M Cheverud
Journal:  J Bone Miner Res       Date:  2008-06       Impact factor: 6.741

9.  Fine mapping and replication of QTL in outbred chicken advanced intercross lines.

Authors:  Francois Besnier; Per Wahlberg; Lars Rönnegård; Weronica Ek; Leif Andersson; Paul B Siegel; Orjan Carlborg
Journal:  Genet Sel Evol       Date:  2011-01-17       Impact factor: 4.297

10.  Genotype probabilities at intermediate generations in the construction of recombinant inbred lines.

Authors:  Karl W Broman
Journal:  Genetics       Date:  2012-02       Impact factor: 4.562
