
Modeling X-linked ancestral origins in multiparental populations.

Abstract

The models for the mosaic structure of an individual's genome from multiparental populations have been developed primarily for autosomes, whereas X chromosomes receive very little attention. In this paper, we extend our previous approach to model ancestral origin processes along two X chromosomes in a mapping population, which is necessary for developing hidden Markov models in the reconstruction of ancestry blocks for X-linked quantitative trait locus mapping. The model accounts for the joint recombination pattern, the asymmetry between maternally and paternally derived X chromosomes, and the finiteness of population size. The model can be applied to various mapping populations such as the advanced intercross lines (AIL), the Collaborative Cross (CC), the heterogeneous stock (HS), the Diversity Outcross (DO), and the Drosophila synthetic population resource (DSPR). We further derive the map expansion, density (per Morgan) of recombination breakpoints, in advanced intercross populations with L inbred founders under the limit of an infinitely large population size. The analytic results show that for X chromosomes the genetic map expands linearly at a rate (per generation) of two-thirds times 1 - 10/(9L) for the AIL, and at a rate of two-thirds times 1 - 1/L for the DO and the HS, whereas for autosomes the map expands at a rate of 1 - 1/L for the AIL, the DO, and the HS.


Keywords: Collaborative Cross (CC); Diversity Outcross (DO); Drosophila synthetic population resource (DSPR); MPP; Multiparent Advanced Generation Inter-Cross (MAGIC); advanced intercross lines (AIL); map expansion; multiparental populations


Year: 2015 PMID: 25740936 PMCID: PMC4426366 DOI: 10.1534/g3.114.016154

Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154

Recently designed quantitative trait locus (QTL) mapping populations use either multiple parents, to increase the genetic diversity of the founder population, or many intercross generations, to improve the mapping resolution by accumulating historical recombination events. Examples include the Collaborative Cross (CC) (Churchill ), the advanced intercross lines (AIL) (Darvasi and Soller 1995), the heterogeneous stock (HS) (Mott ), the Diversity Outcross (DO) (Svenson ), and the Drosophila synthetic population resource (DSPR) (King ). The CC can be regarded as a set of eight-way recombinant inbred lines (RIL) by sibling mating, where the eight founders of each line are permuted. The genomes of individuals in QTL mapping populations are random mosaics of the founders’ genomes. QTL mapping generally requires the reconstruction of these genome blocks along two homologous chromosomes of a sampled individual from available genotype data. Such reconstruction is often performed under a hidden Markov model (HMM) with the latent state being the pair of ancestral origins at a locus, where the transition probabilities of ancestral origins between two loci, or the two-locus diplotype (two-haplotype) probabilities, are required. Modeling ancestral origins along a pair of autosomal chromosomes has been well developed recently. Broman (2012a) extended the approach of Haldane and Waddington (1931) from the two-way to four- and eight-way RIL by sibling mating and provided recipes for calculating autosomal two-locus diplotype probabilities numerically. Johannes and Colome-Tatche (2011) derived autosomal two-locus diplotype probabilities for the two-way RIL by selfing. Zheng described a general modeling framework for ancestral origins that can be applied to autosomes in various mapping populations such as the RIL by selfing or sibling mating and the AIL. A special treatment is required for modeling ancestral origins along a pair of X chromosomes.
Haldane and Waddington (1931) derived the recurrence relations of the X-linked two-locus diplotype probabilities for the two-way RIL by sibling mating and the bi-parental repeated parent-offspring mating, and their closed form solutions for the final homozygous lines. Broman (2005) extended the solutions to the two- and three-locus haplotype probabilities for the two-, four-, or eight-way RIL by sibling mating. Broman (2012b) derived the X-linked two-locus haplotype probabilities in advanced intercross populations including the AIL, the HS, and the DO, assuming an infinitely large population size. In this paper, we extend our previous work (Zheng ) to model the ancestral origins along a pair of X chromosomes in a finite mapping population. This extension also builds on the theory of junctions in inbreeding (Fisher 1949, 1954). A junction is defined as a boundary point of genome blocks on chromosomes where two distinct ancestral origins meet, and the boundary points that occur at the same location along multiple chromosomes are counted as a single junction. The map expansion is the expected junction density (per Morgan) on a maternally or paternally derived X chromosome, denoted by $R^m$ or $R^p$, respectively. We denote by the overall junction density along the XX chromosomes of a female, and it can be used as a measure of X-linked QTL mapping resolution (Darvasi and Soller 1995; Weller and Soller 2004). The key feature of this extension is to account for the asymmetry between maternally and paternally derived X chromosomes, because the latter did not experience any crossover events with Y chromosomes. We first present a model framework for X-linked ancestral origins, where the recurrence relations are derived for various junction densities including the map expansions $R^m$ and $R^p$.
Then, we derive the closed form solutions for these expected densities in mapping populations including the RIL by sibling mating, the AIL, the HS, the DO, and the DSPR; they are evaluated by forward simulation studies. Lastly, we discuss the model assumptions and the implications of the analytic results on haplotype reconstructions and breeding designs.

A model for X-linked ancestral origins

Assumptions and notation

Consider a dioecious population with two separate sexes: homogametic females with sex chromosomes XX and heterogametic males with sex chromosomes XY. There are no recombination events between X and Y, and thus we ignore the pseudoautosomal regions on the XY chromosomes. As in most mammals and some insects (Drosophila), some flowering species, such as white campion (Silene latifolia), papaya (Carica papaya), and asparagus (Asparagus officinalis), have the XY sex determination system (Ming and Moore 2007). The dioecious population was founded in generation 0, and it has nonoverlapping generations. There has been no natural or artificial selection since the founder population. The mating schemes for producing the next generation are random, and they may vary from one generation to the next. The assignments of offspring genders are assumed to be independent of mating schemes. The ancestral origins along two homologous autosomes have been modeled as a continuous time Markov chain (CTMC) (Zheng ). We extend the approach to account for the asymmetry of XX chromosomes, using superscript m (p) for maternally (paternally) derived genes or chromosomes. See Supporting Information, Table S1 for a list of symbols used in this paper. Let be the ordered pair of the ancestral origins at location x along the two X chromosomes of a randomly sampled female. The ancestral origin process is assumed to follow a CTMC, where x is the time parameter in units of Morgans. We assign a unique ancestral origin to the X chromosomes of each inbred founder, or to each X chromosome of each outbred founder. Multiple genes, within or between loci, are identical by descent (IBD) if they have the same ancestral origins. Let L be the number of possible ancestral origins that or may take. L may be less than the number of inbred founders if some male founders did not produce daughters to pass down their X chromosomes.
For example, L = 3 for the four-way RIL by sibling mating, since one of the founder mating pairs produces only one son (Figure 1A).

Figure 1

The continuous time Markov chain (CTMC) of X-linked ancestry blocks in the four-way recombinant inbred lines (RILs) by sibling mating. (A) One realization of ancestry blocks in the four-way RIL with generation up to F3. The sex chromosomes of the four inbred founders are represented by different colors and labeled as A, B, C, and D. The short bars denote Y chromosomes. The ancestral origin D is impossible in the X chromosomes of generation t ≥ 1. (B) Evaluation of the exchangeability assumption by one-locus genotype probabilities. The gray dashed line refers to the average genotype probability for one particular non-IBD genotype AB, AC, or BC; the black dashed line is for one particular IBD genotype AA, BB, or CC. Note that the ancestral origins A and B are exchangeable, but the ancestral origin C is not exchangeable with either A or B. (C) Schematics of the seven junction types along the maternally (left) and paternally (right) derived X chromosomes. (D) The rate matrix of the CTMC for the four-way RIL in (A). The diagonal elements are given so that row sums are zero. The rate matrix is determined by the seven basic rates, each corresponding to one of the seven junction types. The subscripts of the basic rates denote the IBD (1) or non-IBD (0) states on the left- and right-hand sides of the junctions, and the rates with superscript * refer to the transitions on the paternally derived chromosome. (E) The general relationships between the basic rates and the expected densities for the seven types of junctions, with L = 3 for the four-way RIL in (A).

The L possible ancestral origins are assumed to be exchangeable, so that we focus on the changes of ancestral origins. See Figure 1B and the relevant part of Discussion on the exchangeability assumption. The initial distribution of at the leftmost locus is specified by , a probability that the two ancestral origins are the same (IBD) at a locus. Let be the non-IBD probability. Given either IBD or non-IBD at the locus, the ancestral origin pair takes each of the possible combinations with equal probability.

The transition rate matrix of the CTMC can be constructed from the expected densities of all the junction types along the two X chromosomes of a female. The junction type denotes the four-gene IBD configuration on both sides of a junction, where () is on the left-hand (right-hand) side, haplotype () is on the first (second) chromosome, and the same integers denote IBD. Figure 1C illustrates the seven types of junctions: , , , , , , and for , where the two types and do not exist for . We do not define junction types for the eight two-locus configurations , , , , , , , and , because there are either zero or at least two junctions between the two loci. Figure 1D shows the transition rate matrix of the CTMC in the four-way RIL by sibling mating. Figure 1E shows the relationships between the expected densities and the transition rates, and they are derived based on the interpretation that is the two-locus diplotype probability, in the limit that the genetic distance (in Morgan) between two loci goes to zero. The map expansions and and the overall expected junction density are given by equations (1–3), which are similar to those for autosomes (Zheng ) except that for X chromosomes. We have and , since the junction densities do not depend on the direction of chromosomes. In contrast to the single-locus two-gene non-IBD probability , the ordering of the superscripts in generally does matter, that is, except for the junction type . In addition, we have (see Figure 1C). Thus, the CTMC of X-linked ancestral origins can be described by one non-IBD probability and the five expected junction densities , , , , and , under the exchangeability assumption of the L possible ancestral origins.
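As a hedged illustration of how a CTMC along the chromosome yields HMM transition probabilities, the sketch below treats the simpler case of a single X chromosome (for example, the maternally derived X of a male) rather than the joint two-chromosome process of Figure 1D. It assumes L exchangeable ancestral origins and a constant junction density R per Morgan, so that the process leaves its current origin at rate R and jumps to each of the other L − 1 origins at rate R/(L − 1). The function name and its arguments are illustrative and do not come from the paper.

```python
import math

def haplotype_transition(L, R, d):
    """Transition probabilities along ONE chromosome under a symmetric CTMC.

    Assumptions (not from the paper's rate matrix): L exchangeable origins,
    total leaving rate R per Morgan, uniform jumps to the other L - 1 origins.
    Returns (P[same origin at distance d], P[one specific other origin]).
    """
    if L < 2:
        raise ValueError("need at least two ancestral origins")
    # The non-zero eigenvalue of the symmetric rate matrix is -R * L / (L - 1).
    decay = math.exp(-R * L / (L - 1) * d)
    same = 1.0 / L + (1.0 - 1.0 / L) * decay
    other = (1.0 - decay) / L
    return same, other

# Example: 8 origins, map expansion of 5 junctions per Morgan, loci 1 cM apart.
same, other = haplotype_transition(L=8, R=5.0, d=0.01)
```

The two returned values satisfy same + (L − 1) × other = 1, so they define a valid row of a transition matrix between adjacent markers.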

Single-locus non-IBD probabilities

The calculation of the expected junction densities necessitates the introduction of the probabilities for the two- and three-gene IBD configurations at a single locus. All the following derivations of the recurrence relations for these probabilities are based on the Mendelian inheritance of X-linked genes: a paternally derived gene must be a copy of the maternally derived gene in a male of the previous generation, and a maternally derived gene has equal probability of being a copy of either the maternally derived gene or the paternally derived gene in a female of the previous generation. In a dioecious mapping population, the single-locus two-gene probabilities of IBD configuration depend on whether or not the two homologous genes are in a single individual. Thus, we denote by , , and the two-gene probability of IBD configuration , given that the two homologous genes are in two distinct individuals in generation t and have parental origins , , and , respectively (Figure 2A); it holds that .

Figure 2

Schematics of (A) the probabilities of the two-gene IBD configurations, (B) the probabilities of the three-gene IBD configurations, and (C) the expected junction densities. Circles denote females, and dashed rectangles denote males or females. Black vertical lines denote the maternally derived X chromosomes, and gray vertical lines denote the paternally derived ones. Dots denote genes on chromosomes.

The recurrence relations of the two-gene non-IBD probabilities are derived by tracing the parental origins of two homologous genes from generation t ≥ 1 into the previous generation, and they are given by equations (4a–4d), where equation (4d) holds immediately after one generation of random mating, although it may not hold in the founder population at . In equation (4a), the first term on the right-hand side refers to the scenario that the two genes with parental origins in generation t come from a single female of the previous generation with the probability , and with probability that they come from different genes of the female. In equation (4b), the two genes with parental origins cannot merge because they must come from one male and one female of the previous generation. In equation (4c), the two genes with parental origins in generation t come from a single male of the previous generation with the probability ; if so, they must merge because there is only one X chromosome in a male.

We introduce the single-locus three-gene probabilities of IBD configuration . Let , , , and be the probabilities of IBD configuration , given that the three homologous genes are in three distinct individuals in generation t and have parental origins , , , and , respectively (Figure 2B). Similarly, we define and for three homologous genes in two distinct individuals. The ordering of the superscripts does not matter for these three-gene probabilities, for example, . The recurrence relations of the three-gene non-IBD probabilities are derived by tracing the parental origins of three homologous genes from generation t ≥ 1 into the previous generation, and they are given by equations (5a–5f), where is the coalescence probability of three maternally derived genes in generation t that a particular pair of genes come from a single female of the previous generation and the third comes from another female of the previous generation, and similarly for three paternally derived genes. The equations (5e, 5f) hold immediately after one generation of random mating, although they may not hold in the founder population at . The derivations of the recurrence equations (5a–5d) for the three-gene non-IBD probabilities are similar to equations (4a–4c) for the two-gene non-IBD probabilities. In equation (5a), the pre-factor 3 denotes that each of the three possible pairs of genes may come from a single female of the previous generation; the term is the probability that the three maternally derived genes in generation t come from three distinct females of the previous generation, and it is obtained by the probability that one pair of genes come from two distinct females minus the probability that the third gene and either gene of the pair come from a single female of the previous generation. Similarly, the term in equation (5d) is the probability that the three paternally derived genes in generation t come from three distinct males of the previous generation.
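The two-gene recurrences are not legible in this copy, so the block below is a reconstruction from the verbal description of X-linked Mendelian inheritance given above, not a transcription of equations (4a–4d). The notation is assumed: $\alpha^{mp}_t(12)$ is the non-IBD probability within a female, $\beta^{mm}_t(12)$, $\beta^{mp}_t(12)$, and $\beta^{pp}_t(12)$ are the between-individual probabilities, and $s^{mm}_t$ ($s^{pp}_t$) is the probability that two maternally (paternally) derived genes come from a single female (male) of the previous generation.

```latex
\begin{align}
\beta^{mm}_t(12) &= s^{mm}_t \,\tfrac{1}{2}\, \alpha^{mp}_{t-1}(12)
  + \bigl(1 - s^{mm}_t\bigr)\Bigl[\tfrac{1}{4}\beta^{mm}_{t-1}(12)
  + \tfrac{1}{2}\beta^{mp}_{t-1}(12) + \tfrac{1}{4}\beta^{pp}_{t-1}(12)\Bigr], \\
\beta^{mp}_t(12) &= \tfrac{1}{2}\beta^{mm}_{t-1}(12) + \tfrac{1}{2}\beta^{mp}_{t-1}(12), \\
\beta^{pp}_t(12) &= \bigl(1 - s^{pp}_t\bigr)\,\beta^{mm}_{t-1}(12), \\
\alpha^{mp}_t(12) &= \beta^{mp}_t(12).
\end{align}
```

In the second line the maternally derived gene copies either gene of a female parent with probability one half each, while the paternally derived gene is always a copy of the maternally derived gene of a male parent. With $s^{mm}_t = s^{pp}_t = 1$ these relations reduce to the sibling-mating case, whose characteristic equation $\lambda^2 = \lambda/2 + 1/4$ has the roots $(1 \pm \sqrt{5})/4$.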

Expected junction densities

We derive the recurrence relations for , , , , and . The recurrence relation for follows from the theory of junctions (Fisher 1954): a new junction is formed whenever a recombination event occurs between two X chromosomes that are non-IBD at the location of a crossover. The recurrence relations for the map expansions and are given by equations (6a, 6b), where equation (6b) follows directly from no recombination events occurring between the XY chromosomes in a male of the previous generation. To measure differential map expansions between maternally and paternally derived chromosomes, we define and , and their recurrence relations are given by equations (7a, 7b) according to the recurrence equations (6a, 6b). If there are equal numbers of males and females in the population, a randomly chosen X chromosome is maternally derived with probability , and it is paternally derived with probability . Thus can be interpreted as the map expansion on a randomly chosen X chromosome. For comparison, we denote by the map expansion on a randomly chosen autosome, and its recurrence relation is given by equation (8) (MacLeod ; Zheng ), where refers to the non-IBD probability between two homologous autosomal genes in an individual. The equations (7a, 8) show that the map expansion for an X chromosome is two-thirds of that for an autosome if the non-IBD probability for autosomes is the same as for XX chromosomes, and the sex ratio is 1. In addition to and , we define , , , and for haplotypes and that are in two distinct individuals and have parental origins , , , and , respectively (Figure 2C). The contributions to the junctions in the current generation come from either the existing junctions at the previous generation, or a new junction via a crossover event. In the following, we focus on the formation of a new junction, because the contributions of the existing junctions in the previous generation are similar to those for the two-gene non-IBD probabilities in the recurrence equations (4a–4c).

The schematics of the recurrence relations for junction types and are shown in Figure S1. The ancestry transitions of type occur on both haplotypes and at exactly the same location, and thus a new junction of type can be formed only by duplicating a chromosome segment. It holds that and because of the symmetry of type . We have equations (9a–9d) for t ≥ 1, where equation (9d) may not hold in the founder population at , the first term on the right-hand side of equation (9a) refers to the scenario that both haplotypes and come from a single female of the previous generation, and the first term on the right-hand side of equation (9c) refers to the scenario that both haplotypes are the duplicated copies of the maternally derived X chromosome in a male of the previous generation (Figure S1A). According to equations (6a, 6b) and equations (9a–9d), the overall expected density in equation (3) does not depend on the three-gene non-IBD probabilities. The ancestry transition of type occurs on haplotype . A new junction of type is formed whenever the two parental chromosomes of haplotype and the parental chromosome of haplotype are distinct and have the IBD configuration at the location of the crossover. We have equations (10a–10f) for , where equations (10e, 10f) may not hold in the founder population at . A new junction is formed at the rate (), given that the parental chromosome of haplotype is maternally (paternally) derived. The density in equation (10c) has no contributions of a new junction because there are no crossover events occurring between the XY chromosomes in the father of haplotype (Figure S1B). We denote by , and , and their recurrence relations are given in Appendix A. Both and measure the asymmetry between maternally and paternally derived X chromosomes.
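The map-expansion recurrences can be sketched numerically. The code below is a minimal illustration under assumed forms of equations (6a, 6b), reconstructed from the text rather than copied from it: a maternally derived X is a recombinant of the mother's two X chromosomes and gains new junctions at a rate equal to the within-female non-IBD probability, whereas a paternally derived X is an unrecombined copy of the father's maternally derived X. The function names, and the weights 2/3 and 1/3 for a sex ratio of 1, are assumptions of this sketch.

```python
def map_expansion(alpha_mp, rm0=0.0, rp0=0.0):
    """Iterate assumed forms of equations (6a, 6b).

    alpha_mp[t] is the within-female non-IBD probability in generation t.
    Returns a list of (R^m, R^p) pairs, one per generation, starting at t = 0.
    """
    rm, rp = rm0, rp0
    history = [(rm, rp)]
    for a in alpha_mp:
        # (6a): average of the mother's two chromosomes plus new junctions.
        # (6b): the father passes on his maternally derived X unchanged.
        rm, rp = 0.5 * (rm + rp) + a, rm
        history.append((rm, rp))
    return history

def random_x(rm, rp):
    """Map expansion on a randomly chosen X chromosome, assuming sex ratio 1."""
    return (2.0 * rm + rp) / 3.0

# With the non-IBD probability held at 1, R^X grows by 2/3 per generation,
# compared with 1 per generation for an autosome under the same condition.
hist = map_expansion([1.0, 1.0, 1.0])
rx = [random_x(rm, rp) for rm, rp in hist]
```

Under these assumed recurrences the increment of the averaged quantity is exactly two-thirds of the non-IBD probability in every generation, which is the two-thirds relationship stated in the text.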

Model evaluation by simulations

To evaluate the theoretical predictions of non-IBD probabilities and expected junction densities, we perform simulation studies with the same model assumptions: random mating with discrete generations, no natural selection, and no genetic interference, except that the ancestral origins along chromosomes do not follow Markov assumptions. Instead, the genome ancestral origins are simulated forward in time by first generating a pedigree according to a given breeding design, and then dropping on the pedigree the distinct founder genome labels (ancestral origins) that are assigned to the whole X chromosomes of each completely inbred founder or to each X chromosome of each outbred founder. The X chromosomes of each descendant gamete are specified as a list of the labeled segments determined by chromosomal crossovers. For a mapping population with the particular breeding design, the realized junction densities and IBD probabilities are saved for all individuals in each generation in each simulation replicate, and they are averaged over in total replicates. Various mating schemes are used in simulating breeding pedigrees. We denote by RM1 the random mating where each sampling of two randomly chosen individuals with opposite genders produces one offspring, and RM2 the random mating where each sampling of two randomly chosen individuals with opposite genders produces two offspring. We combine these mating schemes with -NE if each parent contributes a Poisson distributed number of gametes to the next generation, and -E if each parent contributes exactly two gametes. Thus, we have four random mating schemes, RM1-NE, RM1-E, RM2-NE, and RM2-E. The sibling mating belongs to RM2-E with population size 2, and the exclusive pairing in $2^n$-way (n ≥ 1) crosses can be regarded as a special case of random mating without inbreeding. The genders are assigned randomly, independent of mating schemes.
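The gene-dropping procedure described above can be sketched for the simplest design, a two-way RIL by sibling mating. This is a minimal illustration, not the authors' simulation code: chromosomes are lists of (start position, founder label) segments, crossovers are placed as a Poisson process with rate 1 per Morgan (no interference), and all function names are hypothetical.

```python
import random

def merge(segs):
    """Collapse adjacent segments that carry the same founder label."""
    out = []
    for start, label in segs:
        if not out or out[-1][1] != label:
            out.append((start, label))
    return out

def slice_chrom(chrom, lo, hi):
    """Return the segments of chrom that overlap the interval [lo, hi)."""
    out = []
    for i, (start, label) in enumerate(chrom):
        end = chrom[i + 1][0] if i + 1 < len(chrom) else float("inf")
        if end <= lo or start >= hi:
            continue
        out.append((max(start, lo), label))
    return out

def meiosis(mat, pat, length, rng):
    """Female meiosis: a recombinant gamete of the two X chromosomes."""
    cuts, x = [], rng.expovariate(1.0)
    while x < length:
        cuts.append(x)
        x += rng.expovariate(1.0)
    strands = (mat, pat)
    k = rng.randrange(2)  # strand copied at the left end
    out, lo = [], 0.0
    for hi in cuts + [length]:
        out.extend(slice_chrom(strands[k], lo, hi))
        k, lo = 1 - k, hi
    return merge(out)

def sib_mating_x(generations, length, rng):
    """Drop founder labels A (inbred female) and B (inbred male) down a sib-mating line."""
    female = ([(0.0, "A")], [(0.0, "A")])  # (maternal X, paternal X)
    male = [(0.0, "B")]                    # single, maternally derived X
    for _ in range(generations):
        # A daughter receives a recombinant maternal X and the father's X unchanged.
        daughter = (meiosis(female[0], female[1], length, rng), male)
        son = meiosis(female[0], female[1], length, rng)
        female, male = daughter, son
    return female, male

def junction_count(chrom):
    return len(chrom) - 1
```

Averaging `junction_count` over many replicates and dividing by the chromosome length gives the realized junction density that is compared with the theoretical predictions; the paternally derived X of an F2 female, for instance, always carries zero junctions in this design.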

Application to QTL mapping populations

Multistage populations

For mapping populations with stage-wise constant mating schemes, we derive analytic expressions of the non-IBD probabilities and the expected junction densities for constructing the CTMC of X-linked ancestral origins, according to the recurrence relations. The closed form solutions are obtained by linking the results of each subsequent stage via the initial conditions. The general results for a population with constant random mating are derived in Appendix A, where three scenarios are considered: a finite population of size ≥ 6, a sibling-mating population of size 2, and a large population of size ≫ 6. Table S2 gives the coalescence probabilities of X chromosomes for various mating schemes, similar to Table 1 of Zheng for autosomes. Table S3 summarizes the results for X chromosomes in a sibling-mating population, and Table S4 for autosomes; they are necessary for dioecious breeding populations with a stage of inbreeding by sibling mating such as the CC and the DSPR. We use the superscript A to denote the quantities for autosomes.

Table 1

Results for X chromosomes in the $2^n$-way RIL by sibling mating in the last generation , where for and for

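Quantity	Theoretical Prediction
(A) 2-way, sibling mating
$\alpha_g^{mp}(12)$	$\frac{5+\sqrt{5}}{10}\lambda_1^{V} + \text{conjugate}$
$R_g^{X}$	$\frac{8}{3} - \frac{20+8\sqrt{5}}{15}\lambda_1^{V} + \text{conjugate}$
$R_g^{m}$	$\frac{8}{3} - \frac{2}{3}\left(-\frac{1}{2}\right)^{V} - \frac{5+3\sqrt{5}}{5}\lambda_1^{V} + \text{conjugate}$
$J_g^{mp}(1122)$	$\frac{8}{3} + \frac{1}{3}\left(-\frac{1}{2}\right)^{V} - \left(\frac{3+\sqrt{5}}{2} + \frac{5+\sqrt{5}}{10}V\right)\lambda_1^{V} + \text{conjugate}$
$J_g^{mp+}(1232)$	$0$
$J_g^{mp-}(1232)$	$0$
(B) $2^n$-way ($n \ge 2$), sibling mating
$\alpha_g^{mp}(12)$	$\frac{5+3\sqrt{5}}{10}\lambda_1^{V} + \text{conjugate}$
$R_g^{X}$	$\frac{2}{3}(U+6) - \frac{30+14\sqrt{5}}{15}\lambda_1^{V} + \text{conjugate}$
$R_g^{m}$	$\frac{2}{3}(U+6) + \frac{1}{3}\left(R_{U+1}^{m} - R_{U+1}^{p}\right)\left(-\frac{1}{2}\right)^{V} - \frac{10+4\sqrt{5}}{5}\lambda_1^{V} + \text{conjugate}$
$J_g^{mp}(1122)$	$\frac{2}{3}(U+6) - \frac{1}{6}\left(R_{U+1}^{m} - R_{U+1}^{p}\right)\left(-\frac{1}{2}\right)^{V} - \left[\frac{10+4\sqrt{5}}{5} + \frac{1+\sqrt{5}}{4}R_{U+1}^{m} + \frac{5+\sqrt{5}}{20}R_{U+1}^{p} + \frac{5+3\sqrt{5}}{10}V\right]\lambda_1^{V} + \text{conjugate}$
$J_g^{mp+}(1232)$	$-\left(\frac{1}{2}\right)^{V} + \left[\frac{5+3\sqrt{5}}{10} + \frac{1+\sqrt{5}}{4}R_{U+1}^{m} + \frac{5+\sqrt{5}}{20}R_{U+1}^{p}\right]\lambda_1^{V} + \text{conjugate}$
$J_g^{mp-}(1232)$	$\left(\frac{1}{2}\right)^{V+1} + \left(R_{U+1}^{p} - R_{U+1}^{m} + 1\right)\left(-\frac{1}{2}\right)^{V+1}$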

The eigenvalues and . The map expansions and are given by equations (11a, 11b). The conjugate is given by replacing with from the terms involving . For example, the conjugate term for in (A) is given by . RIL, recombinant inbred line.

We derive the analytic expressions of , , , , , and in the mapping populations of the RIL, the AIL, and the DO, and they are given in Table 1, Table 2, and Table 3, respectively. These results are necessary for constructing the CTMC of ancestral origins along the XX chromosomes of a female; only the expression of is needed for the maternally derived X chromosome of a male. For comparison, the autosomal results for , , , and are included. The results for the AIL, the DO, and the DSPR are derived under the assumption of a large population size in the intercross stage. We evaluate this assumption in the DSPR, because the evaluation results hold similarly for the AIL and the DO. In addition, the map expansions and are given explicitly under the assumption of an infinitely large intercross population size, which may be used as a simple measure of QTL mapping resolution.

Table 2

Results for the AIL in the last generation

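Quantity	Theoretical Prediction
(A) X chromosomes
$\alpha_g^{mp}(12)$	$\left(1-\frac{10}{9L}\right)\lambda_1^{U} + \frac{2}{9L}\left(-\frac{1}{2}\right)^{U} + \frac{8}{9L}\left(\frac{1}{4}\right)^{U}$
$R_g^{X}$	$\frac{8}{9L} + \frac{2}{3}\left(1-\frac{10}{9L}\right)\frac{1-\lambda_1^{U}}{1-\lambda_1} - \frac{8}{81L}\left(-\frac{1}{2}\right)^{U} - \frac{64}{81L}\left(\frac{1}{4}\right)^{U}$
$R_g^{m}$	$\frac{2}{9} + \frac{52}{81L} + \frac{2}{3}\left(1-\frac{10}{9L}\right)\frac{1-\lambda_1^{U}}{1-\lambda_1} - \left(\frac{2}{9} + \frac{20}{81L} + \frac{4}{27L}U\right)\left(-\frac{1}{2}\right)^{U} - \frac{32}{81L}\left(\frac{1}{4}\right)^{U}$
$J_g^{mp}(1122)$	$\frac{2}{9} + \frac{52}{81L} + \frac{2}{3}\left(1-\frac{10}{9L}\right)\frac{1-\lambda_1^{U}}{1-\lambda_1} - \left[\frac{2}{9} + \frac{52}{81L} + \left(\frac{2}{3} + \frac{16}{27}s\right)\left(1-\frac{10}{9L}\right)U\right]\lambda_1^{U}$
$J_g^{mp+}(1232)$	$\left(1-\frac{2}{L}\right)\left(1-\frac{4}{3L}\right)\frac{\lambda_1^{U}-\lambda_4^{U}}{s} + \left(1-\frac{2}{L}\right)\left(-\frac{1}{9} + \frac{76}{81L}\right)\lambda_1^{U} + \left(1-\frac{2}{L}\right)\frac{16}{81L}\lambda_4^{U} + \left(1-\frac{2}{L}\right)\left(\frac{1}{9} - \frac{4}{81L} + \frac{8}{27L}U\right)\left(-\frac{1}{2}\right)^{U} + \left(1-\frac{2}{L}\right)\left(-\frac{88}{81L} + \frac{16}{27L}U\right)\left(\frac{1}{4}\right)^{U}$
$J_g^{mp-}(1232)$	$\left(1-\frac{2}{L}\right)\left(\frac{1}{3} - \frac{4}{9L}\right)\lambda_4^{U} - \left(1-\frac{2}{L}\right)\left(\frac{1}{3} + \frac{4}{9L}\right)\left(-\frac{1}{2}\right)^{U} + \left(1-\frac{2}{L}\right)\frac{8}{9L}\left(\frac{1}{4}\right)^{U}$
(B) Autosomes
$\alpha_g^{AA}(12)$	$\left(1-\frac{1}{L}\right)\left(\lambda_1^{A}\right)^{U-1}$
$R_g^{A}$	$1 + \left(1-\frac{1}{L}\right)\frac{1-\left(\lambda_1^{A}\right)^{U-1}}{1-\lambda_1^{A}}$
$J_g^{AA}(1122)$	$\left[1-\left(\lambda_1^{A}\right)^{U-1}\right] + \left(1-\frac{1}{L}\right)\left[\frac{1-\left(\lambda_1^{A}\right)^{U-1}}{1-\lambda_1^{A}} - (U-1)\left(\lambda_1^{A}\right)^{U-2}\right]$
$J_g^{AA}(1232)$	$\left(1-\frac{2}{L}\right)\left(\lambda_1^{A}\right)^{U-1} + \left(1-\frac{1}{L}\right)\left(1-\frac{2}{L}\right)\frac{\left(\lambda_1^{A}\right)^{U-1}-\left(\lambda_4^{A}\right)^{U-1}}{s^{A}}$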

The eigenvalues and for X chromosomes, and for autosomes and . AIL, advanced inter-cross lines.

Table 3

Results for the DO in the last generation

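Quantity	Theoretical Prediction
(A) X chromosomes
$\alpha_g^{mp}(12)$	$\left(1-\frac{1}{L}\right)\lambda_1^{U}$
$R_g^{X}$	$R_0^{X} + \frac{2}{3}\alpha_0^{mp}(12) + \frac{2}{3}\left(1-\frac{1}{L}\right)\frac{1-\lambda_1^{U}}{1-\lambda_1}$
$R_g^{m}$	$R_0^{X} + \frac{2}{3}\alpha_0^{mp}(12) + \frac{2}{9}\left(1-\frac{1}{L}\right) + \frac{2}{3}\left(1-\frac{1}{L}\right)\frac{1-\lambda_1^{U}}{1-\lambda_1} - \left[\frac{2}{9}\left(1-\frac{1}{L}\right) + \frac{1}{6}\left(R_0^{m}-R_0^{p}\right) - \frac{1}{3}\alpha_0^{mp}(12)\right]\left(-\frac{1}{2}\right)^{U}$
$J_g^{mp}(1122)$	$R_0^{X} + \frac{2}{3}\alpha_0^{mp}(12) + \frac{2}{9}\left(1-\frac{1}{L}\right) + \frac{2}{3}\left(1-\frac{1}{L}\right)\frac{1-\lambda_1^{U}}{1-\lambda_1} - \left[R_0^{X} + \frac{2}{3}\alpha_0^{mp}(12) + \frac{2}{9}\left(1-\frac{1}{L}\right) + \left(\frac{2}{3} + \frac{16}{27}s\right)\left(1-\frac{1}{L}\right)U\right]\lambda_1^{U}$
$J_g^{mp+}(1232)$	$\left(1-\frac{1}{L}\right)\left(1-\frac{2}{L}\right)\frac{\lambda_1^{U}-\lambda_4^{U}}{s} + \left(1-\frac{2}{L}\right)\left[-\frac{1}{9}\left(1-\frac{1}{L}\right) + R_0^{X} + \frac{2}{3}\alpha_0^{mp}(12)\right]\lambda_1^{U} + \left(1-\frac{2}{L}\right)\left[\frac{1}{9}\left(1-\frac{1}{L}\right) + \frac{1}{12}\left(R_0^{m}-R_0^{p}\right) - \frac{1}{6}\alpha_0^{mp}(12)\right]\left(-\frac{1}{2}\right)^{U}$
$J_g^{mp-}(1232)$	$\frac{1}{3}\left(1-\frac{1}{L}\right)\left(1-\frac{2}{L}\right)\lambda_4^{U} + \left(1-\frac{2}{L}\right)\left[-\frac{1}{3}\left(1-\frac{1}{L}\right) - \frac{1}{4}\left(R_0^{m}-R_0^{p}\right) + \frac{1}{2}\alpha_0^{mp}(12)\right]\left(-\frac{1}{2}\right)^{U}$
(B) Autosomes
$\alpha_g^{AA}(12)$	$\left(1-\frac{1}{L}\right)\left(\lambda_1^{A}\right)^{U}$
$R_g^{A}$	$R_0^{A} + \alpha_0^{AA}(12) + \left(1-\frac{1}{L}\right)\frac{1-\left(\lambda_1^{A}\right)^{U}}{1-\lambda_1^{A}}$
$J_g^{AA}(1122)$	$\left[R_0^{A} + \alpha_0^{AA}(12)\right]\left[1-\left(\lambda_1^{A}\right)^{U}\right] + \left(1-\frac{1}{L}\right)\left[\frac{1-\left(\lambda_1^{A}\right)^{U}}{1-\lambda_1^{A}} - U\left(\lambda_1^{A}\right)^{U-1}\right]$
$J_g^{AA}(1232)$	$\left[R_0^{A} + \alpha_0^{AA}(12)\right]\left(1-\frac{2}{L}\right)\left(\lambda_1^{A}\right)^{U} + \left(1-\frac{1}{L}\right)\left(1-\frac{2}{L}\right)\frac{\left(\lambda_1^{A}\right)^{U}-\left(\lambda_4^{A}\right)^{U}}{s^{A}}$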

The eigenvalues and for X chromosomes, and for autosomes and . DO, diversity outcross.

Many breeding populations can be divided into three stages: mixing, intercross, and inbreeding, such as the RIL by sibling mating, the CC, and the DSPR. There is no inbreeding stage for the AIL, the HS, and the DO. We denote by U the number of intercross generations, V the number of inbreeding generations, and N the intercross population size. Let and denote the random mating schemes for mixing and intercross stages, respectively. We choose the mixing stage to consist of one generation of random mating, so that the non-IBD probabilities and the expected junction densities in the population do not depend on whether genes or haplotypes are in distinct individuals. The general derivation procedure is as follows. First, we derive the initial conditions in the population for the intercross stage, according to the genetic compositions of the founder population . Second, we substitute the obtained initial conditions into the theorems of Appendix A3 under the assumption of a large intercross population size. Alternatively, the theorems of Appendix A1 may be used for a finite intercross population. Lastly, if there is a stage of inbreeding by sibling mating, we substitute analytic expressions in the population into the theorems of Appendix A2 to obtain the results in the last generation .

RIL

The $2^n$-way RIL by sibling mating can be regarded as a three-stage mapping population without the intercross stage for . All the founders are fully inbred, and the intercross mating scheme is exclusive pairing so that inbreeding is completely avoided. Thus , and the non-IBD probability during the intercross stage , where for and for . According to the recurrence equations (6a, 6b), the map expansions satisfy equations (11a, 11b) and . Furthermore, it is straightforward to obtain , , , , and , where the indicator if and 0 otherwise, since the two maternally derived genes at must come from the inbred female founder for the two-way RIL. Substituting the initial conditions in the population into Table S3, we obtain the results for the RIL in the last generation shown in Table 1. The non-IBD probabilities for X chromosomes are the same as those for autosomes (Table 2 of Zheng ). Thus, we show analytically that the map expansion for the X chromosome is two-thirds that of the autosome for the $2^n$-way (n ≥ 1) RIL, according to equations (7a, 8). Broman (2012a) has verified this two-thirds rule via Maxima for the $2^n$-way RIL up to . Figure 3 shows that these theoretical predictions fit very well with the forward simulation results for the two- and eight-way RIL by sibling mating. The differential densities and decay very fast with generation t and show some oscillations in the beginning generations. The overall expected junction density reaches the maximum in the same generation as for autosomes.
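The two-way results can be checked numerically with a short sketch. It makes several assumptions that should be read as such: the eigenvalues are taken to be the classical sibling-mating values (1 ± √5)/4, the Table 1(A) entry for the non-IBD probability is read as (5 + √5)/10 times the V-th power of the larger eigenvalue plus its conjugate, and the map expansion on a random X chromosome is assumed to grow by two-thirds of the previous generation's non-IBD probability. The state variables a, b, and c are hypothetical names for the non-IBD probabilities within the female, between the female's maternal gene and the male's gene, and between the female's paternal gene and the male's gene.

```python
import math

SQRT5 = math.sqrt(5.0)
LAM1 = (1 + SQRT5) / 4  # assumed eigenvalue for X-linked sibling mating
LAM2 = (1 - SQRT5) / 4  # its conjugate

def alpha_closed_form(v):
    """Assumed reading of Table 1(A): non-IBD probability after v inbreeding generations."""
    return (5 + SQRT5) / 10 * LAM1 ** v + (5 - SQRT5) / 10 * LAM2 ** v

def alpha_by_recurrence(n_gen):
    """Trace X-linked genes one generation back in a sib-mating line."""
    a, b, c = 1.0, 0.0, 1.0  # F1 of a two-way cross: female A|B, male A
    out = [a]
    for _ in range(n_gen):
        # Daughter: maternal gene copies either maternal gene of the mother,
        # paternal gene copies the father's single X-linked gene.
        a, b, c = 0.5 * b + 0.5 * c, 0.5 * a, 0.5 * b + 0.5 * c
        out.append(a)
    return out

alphas = alpha_by_recurrence(200)
rx_limit = 2.0 / 3.0 * sum(alphas)  # assumed accumulation rule for R^X
```

Under these assumptions the recurrence and the closed form agree generation by generation, and the accumulated map expansion approaches 8/3, the constant term of the two-way entry for the random-X map expansion in Table 1(A).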

Figure 3

Results of the -way recombinant inbred lines (RILs) with by sibling mating for (left panels) and (right panels). The filled symbols refer to the results for X chromosomes, the empty symbols for autosomes, and lines for the theoretical predictions in Table 1. The non-IBD probabilities for X chromosomes and autosomes are overlapped with each other. The brown filled diamonds refer to in (C) and (D) and in (E) and (F).

AIL

We consider a multiparental AIL population that is founded by inbred females and inbred males. A unique ancestral origin is assigned to each inbred founder’s genomes so that the two-gene non-IBD probabilities and , and similarly for the three-gene non-IBD probabilities , if they exist. The population of size N is produced by mating scheme RM1-NE or RM2-NE. According to Table S2, the coalescence probabilities and for mating scheme RM1-NE, and they hold approximately for RM2-NE with large population size N ≫ 6. Thus, the two-gene non-IBD probabilities at are given by and according to the recurrence equations (4a–4d), and the three-gene non-IBD probabilities at are given by and according to the recurrence equations (5a–5f). In addition, no junctions can be formed from inbred founders so that it holds that , , and . The population is maintained for U generations with constant size N and sex ratio 1. Assuming that the intercross population size is large (N ≫ 6), all the two- and three-gene coalescence probabilities at are approximately equal and are denoted by s, and they are determined by the intercross mating scheme according to Table S2. Substituting the initial conditions in the population into the theorems of Appendix A3, we obtain in Table 2 the results for X chromosomes in the AIL in the last generation . Table 2 also shows the results for autosomes, which are derived according to Zheng . As shown in Table 2, the non-IBD probabilities for X chromosomes are not equal to those for autosomes, and thus the map expansions generally do not satisfy the two-thirds rule. According to the map expansions and in Table 2, we derive their approximations under the limit of an infinitely large population size (N → ∞) so that the coalescence probability goes to zero (s → 0); they are given by equations (12a, 12b), where the last two terms for in Table 2 are small and thus ignored. The equations (12a, 12b) show that the two-thirds rule is approximately valid for a large number L of founder lines.
The map expansion of equation (12b) for is consistent with the previous results (Darvasi and Soller 1995; Liu ; Winkler ; Broman 2012b). The left panels of Figure 4 show for the AIL that the theoretical predictions fit very well with the forward simulation results, where RM1-NE, RM1-E, , and . Within intercross generations, the non-IBD probability decreases slowly with generation t, the differential map expansion remains almost constant after a few generations of oscillations, and the map expansions in equations (12a, 12b), shown as thick red lines in Figure 4, are very good approximations.
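The per-generation expansion rates quoted in the abstract for an infinitely large population can be tabulated with a few lines of code. The sketch below only encodes those three stated rates; the function name and its string arguments are illustrative.

```python
from fractions import Fraction

def map_expansion_rate(L, population, chromosome="X"):
    """Per-generation map expansion rate in the infinite-population limit.

    L is the number of inbred founders; population is "AIL", "DO", or "HS".
    Rates follow the abstract: autosomes 1 - 1/L for all three populations,
    X chromosomes (2/3)(1 - 10/(9L)) for the AIL and (2/3)(1 - 1/L) otherwise.
    """
    if population not in ("AIL", "DO", "HS"):
        raise ValueError("population must be 'AIL', 'DO', or 'HS'")
    if chromosome == "A":
        return 1 - Fraction(1, L)
    if chromosome != "X":
        raise ValueError("chromosome must be 'X' or 'A'")
    if population == "AIL":
        return Fraction(2, 3) * (1 - Fraction(10, 9 * L))
    return Fraction(2, 3) * (1 - Fraction(1, L))

# Ratio of X to autosomal expansion for an eight-founder AIL.
ratio = map_expansion_rate(8, "AIL") / map_expansion_rate(8, "AIL", "A")
```

For the AIL the ratio of X to autosomal rates is slightly below two-thirds and approaches two-thirds as L grows, in line with the statement that the two-thirds rule is only approximately valid for a large number of founder lines.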

Figure 4

Results of the AIL (left panels) and the HS (right panels) with L = 8 and N = 100. The random mating schemes M = RM1-NE for the AIL and RM1-E for the HS, and M = RM1-E for both populations. The symbols and lines are the same as those in Figure 3. The theoretical predictions refer to Table 2 for the AIL and Table 3 for the DO. The additional red lines denote the map expansions under the large size assumption, given by equations (12a, 12b) for the AIL and equations (13a, 13b) for the HS.

HS and DO

The HS and the DO differ from the AIL only in the genetic compositions of the founder population. The N progenitors of the DO at were sampled independently from pre-CC lines at a variety of different generations. Each pre-CC line is produced by the RIL by sibling mating from randomly permuted founder strains. Let denote the proportion of the pre-CC progenitors that were in generation k. Thus, for a random progenitor, it holds and , where and for can be obtained from Table 1. Because the founder strains are exchangeable, we obtain , , and , and because recombination crossovers are independent among different pre-CC lines, the between-individual expected junction densities at are given by , , and , where refers to the probability that the third ancestral origin on haplotype is different from the two ancestral origins on haplotype where the ancestry transition occurs. The within-individual expected junction densities at are not required in the following derivations. The population of size N is produced by random mating with equal sex ratio. Assuming that the population size N ≫ 6, the coalescence probabilities at are approximated to be zero. According to the recurrence equations for the two- and three-gene non-IBD probabilities, the between-individual probabilities do not change and the within-individual non-IBD probabilities at are equal to the corresponding between-individual probabilities. In addition, we have , , , , , according to the recurrence equations for the expected junction densities. Similar to the intercross stage of the AIL, we obtain in Table 3 the results for X chromosomes in the DO in the last generation by substituting the initial conditions in the population into the theorems of Appendix A3. Table 3 also shows the results for autosomes, which are derived according to Zheng . 
Under the limit of an infinitely large population size (N → ∞), we obtain from Table 3 showing that the two-thirds rule is valid under such an approximation since and for progenitors drawn from the RIL (Table 1). The map expansion in equation (13b) for is the same as the one obtained by Broman (2012b). The right panels of Figure 4 show for the HS that the theoretical predictions fit very well with the forward simulation results, where RM1-E, the 100 individuals in the population were sampled independently from CC funnels at the same generation . The results are similar to those for the AIL with the same L shown in the left panels of Figure 4. For X chromosomes, the non-IBD probabilities in the DO are larger than those in the AIL, and thus in the DO the map expands at a higher rate than that for the AIL, see equations (12a, 13a).
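The comparison between the DO (or HS) and the AIL can be made concrete with the per-generation rates stated in the abstract. The following is a minimal sketch under the large-population limit only; the function names and the string labels for the populations are assumptions made for illustration.

```python
from fractions import Fraction


def x_rate(L, population):
    """Per-generation X-chromosome map expansion rate in the
    large-population limit, using the rates quoted in the abstract."""
    if population == "AIL":
        return Fraction(2, 3) * (1 - Fraction(10, 9 * L))
    if population in ("DO", "HS"):
        return Fraction(2, 3) * (1 - Fraction(1, L))
    raise ValueError(f"unknown population: {population}")


def autosome_rate(L):
    """Per-generation autosomal rate, shared by the AIL, the DO, and the HS."""
    return 1 - Fraction(1, L)


if __name__ == "__main__":
    for L in (4, 8, 16):
        gap = x_rate(L, "DO") - x_rate(L, "AIL")
        ratio = x_rate(L, "DO") / autosome_rate(L)
        print(f"L={L:2d}  DO-AIL gap={float(gap):.5f}  "
              f"DO X/autosome={float(ratio):.4f}")
```

With these rates the DO and HS satisfy the two-thirds rule exactly, and the X-linked rate exceeds that of the AIL by 2/(27L), a gap that shrinks as the number of founders grows.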

DSPR

The DSPR RILs were derived from two synthetic populations, each created independently by following the multiparental AIL with an inbreeding stage by sibling mating (King ). As an example, we derive the analytic expressions of the map expansions in one synthetic population with L founder strains. We assume that , which holds in a non-inbreeding population and approximately in a large population (e.g., ) with a large number of intercross generations (e.g., ). According to the map expansions in Table S3, we have where and , and and are given in Table 1, Table 2, or Table 3 if the population is the last generation of the RIL, the AIL, or the DO, respectively. We evaluate the large size assumption for various random mating schemes by simulation studies of the DSPR. Figure 5 shows the fit of the theoretical predictions to the forward simulation results for the intercross size 20, 50, and 100, where the mating schemes RM1-NE and RM1-E (RM1-NE) for the left (right) panels. The theoretical predictions are obtained by combining the results for the AIL (Table 2) with those for the sibling-mating population (Table S3), assuming the large size (N ≫ 6). The relatively worse fit for the differential densities and is probably attributable to the limited number (2 × 10⁴) of simulation replicates. The fit improves with increasing size N, and it is very good for N = 100 within the range of U = 20 intercross generations. The fit for RM1-E is better than that for RM1-NE because in the former case the two-gene coalescence probabilities are always equal to the three-gene probabilities (Table S2), independent of the size N. Figure S2 shows similar results for the random mating scheme RM2, except that the expected junction densities are slightly smaller. Figure S3 and Figure S4 show that the results for autosomes are less sensitive to the large size assumption, and the fits are very good even for N = 20.

Figure 5

Results of the DSPR for X chromosomes with and 20 (cyan), 50 (brown), and 100 (blue). The random mating schemes M = RM1-NE for all panels and M = RM1-E (RM1-NE) for the left (right) panels. The lines denote the theoretical predictions under the large size assumption. The filled circles refer to in panels C and D, and in panels E and F; the filled diamonds refer to in (C) and (D) and in (E) and (F).

Discussion

We have extended our previous framework of modeling ancestral origin processes from autosomes to X chromosomes, and thus the same assumptions such as exchangeability of ancestral origins, Markov properties, and random mating also apply (Zheng ). The deviations from Markov properties result in larger variances in the IBD-tract length and the junction densities, which have been shown to be acceptable (Chapman and Thompson 2003; Martin and Hospital 2011). The random mating assumption indicates that our approach does not apply to breeding populations with marker-assisted selection. In contrast to the previous approaches (Haldane and Waddington 1931; Broman 2012a), the exchangeability assumption of ancestral origins greatly reduces model complexity, because the number of possible junction types does not depend on the number of founders for L ≥ 3 whereas the number of diplotype states increases very fast with L. The assumption affects the rate matrix of the Markov model, but not the expected junction densities where only changes of ancestral origins matter. The exchangeability is a good approximation for the AIL- or the multiparent advanced generation inter-cross (i.e., MAGIC)-type populations with random mating, but it does not hold for the multiway RIL by sibling mating. However, the exchangeability assumption is not critical for the application of our results to haplotype reconstructions from genotype data. The genomes of the individuals collected in the last generation have been well mixed by random chromosomal segregations over many generations. This is demonstrated in Figure 1A for the four-way RIL by sibling mating, where a female A and a male B were crossed, and a female C and a male D were crossed, and then a daughter from A × B and a son from C × D were crossed. The X chromosome of the founder D is lost in . 
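The loss of founder D's X chromosome in the four-way cross follows directly from X-linked inheritance: a son receives his single X chromosome from his mother, whereas a daughter receives one X chromosome from each parent. The sketch below traces which founder origins can possibly appear on X chromosomes in each generation. It is a bookkeeping illustration only, with a hypothetical helper name, and it does not model recombination positions or probabilities.

```python
def child_x_origins(mother, father, sex):
    """Founder origins that may appear on a child's X chromosome(s).

    mother, father: sets of founder labels present on each parent's
    X chromosome(s). A daughter ("F") inherits a maternal X (possibly
    recombinant) and the father's single X; a son ("M") inherits only
    a maternal X.
    """
    if sex == "F":
        return set(mother) | set(father)
    if sex == "M":
        return set(mother)
    raise ValueError("sex must be 'F' or 'M'")


# Inbred founders of the four-way cross: females A and C, males B and D.
A, B, C, D = {"A"}, {"B"}, {"C"}, {"D"}

daughter_ab = child_x_origins(A, B, "F")  # daughter of A x B
son_cd = child_x_origins(C, D, "M")       # son of C x D

# Offspring of the daughter from A x B and the son from C x D.
next_female = child_x_origins(daughter_ab, son_cd, "F")
next_male = child_x_origins(daughter_ab, son_cd, "M")

if __name__ == "__main__":
    print("daughter of A x B:", sorted(daughter_ab))
    print("son of C x D:", sorted(son_cd))
    print("next-generation female:", sorted(next_female))
    print("next-generation male:", sorted(next_male))
```

Because the son of C × D carries only C's X chromosome, origin D never enters the sibling-mating line, which is why only the origins A, B, and C appear in the X-linked genotypes.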
The genotype probabilities for AB and AC are different and given in Table 2 of Broman (2012a), although the sum of the genotype probabilities for AB, AC, and BC is equal to in Table 1. Figure 1B shows that the genotype probability for AB or AC becomes close to the average probability as generation t increases. Furthermore, in the beginning generations when the asymmetry among ancestral origins is large, there are fewer recombination breakpoints, and thus more marker data per genome block are available to estimate ancestral origins. As a result, a priori equal weights of ancestral origins have little effect. An HMM is under development for reconstructing ancestral origins for both autosomes and X chromosomes from marker data, using the present model and the previous one (Zheng ) as the prior distribution. The previously implemented HMM methods, such as GAIN (Liu ) and HAPPY (Mott ), were developed for autosomes, and they do not account for the asymmetry between maternally and paternally derived X chromosomes. The closed form expressions for non-IBD probabilities and various expected junction densities have been derived for stage-wise mapping populations. They provide not only the complete information for constructing the CTMC along two X chromosomes but also guidance for designing a new population in terms of X-linked QTL mapping resolutions. For advanced intercross populations such as the AIL, the HS, and the DO under the assumption of a large intercross size, the map expands linearly at a rate that falls short of its large-L limit by a term proportional to the inverse of the number L of inbred founders, and this rate is robust to intercross mating schemes. For the RIL and the inbreeding stage of the DSPR, the map expansion slows down with increasing level of inbreeding. The overall junction density for the DSPR decreases after one generation of the inbreeding stage by sibling mating, whereas for the RIL it reaches the maximum in the middle of inbreeding by sibling mating. 
These conclusions can also be applied to autosomes. Thus the most effective way of improving mapping resolutions is to increase the number U of intercross generations in a large population (N ≥ 5U, empirically).

19 in total

1. On the determination of recombination rates in intermated recombinant inbred populations.

Authors: Christopher R Winkler; Nicole M Jensen; Mark Cooper; Dean W Podlich; Oscar S Smith
Journal: Genetics Date: 2003-06 Impact factor: 4.562

2. An analytical formula to estimate confidence interval of QTL location with a saturated genetic map as a function of experimental design.

Authors: Joel Ira Weller; Morris Soller
Journal: Theor Appl Genet Date: 2004-09-23 Impact factor: 5.699

3. The genomes of recombinant inbred lines.

Authors: Karl W Broman
Journal: Genetics Date: 2004-11-15 Impact factor: 4.562

4. The Collaborative Cross, a community resource for the genetic analysis of complex traits.

Authors: Gary A Churchill; David C Airey; Hooman Allayee; Joe M Angel; Alan D Attie; Jackson Beatty; William D Beavis; John K Belknap; Beth Bennett; Wade Berrettini; Andre Bleich; Molly Bogue; Karl W Broman; Kari J Buck; Ed Buckler; Margit Burmeister; Elissa J Chesler; James M Cheverud; Steven Clapcote; Melloni N Cook; Roger D Cox; John C Crabbe; Wim E Crusio; Ariel Darvasi; Christian F Deschepper; R W Doerge; Charles R Farber; Jiri Forejt; Daniel Gaile; Steven J Garlow; Hartmut Geiger; Howard Gershenfeld; Terry Gordon; Jing Gu; Weikuan Gu; Gerald de Haan; Nancy L Hayes; Craig Heller; Heinz Himmelbauer; Robert Hitzemann; Kent Hunter; Hui-Chen Hsu; Fuad A Iraqi; Boris Ivandic; Howard J Jacob; Ritsert C Jansen; Karl J Jepsen; Dabney K Johnson; Thomas E Johnson; Gerd Kempermann; Christina Kendziorski; Malak Kotb; R Frank Kooy; Bastien Llamas; Frank Lammert; Jean-Michel Lassalle; Pedro R Lowenstein; Lu Lu; Aldons Lusis; Kenneth F Manly; Ralph Marcucio; Doug Matthews; Juan F Medrano; Darla R Miller; Guy Mittleman; Beverly A Mock; Jeffrey S Mogil; Xavier Montagutelli; Grant Morahan; David G Morris; Richard Mott; Joseph H Nadeau; Hiroki Nagase; Richard S Nowakowski; Bruce F O'Hara; Alexander V Osadchuk; Grier P Page; Beverly Paigen; Kenneth Paigen; Abraham A Palmer; Huei-Ju Pan; Leena Peltonen-Palotie; Jeremy Peirce; Daniel Pomp; Michal Pravenec; Daniel R Prows; Zhonghua Qi; Roger H Reeves; John Roder; Glenn D Rosen; Eric E Schadt; Leonard C Schalkwyk; Ze'ev Seltzer; Kazuhiro Shimomura; Siming Shou; Mikko J Sillanpää; Linda D Siracusa; Hans-Willem Snoeck; Jimmy L Spearow; Karen Svenson; Lisa M Tarantino; David Threadgill; Linda A Toth; William Valdar; Fernando Pardo-Manuel de Villena; Craig Warden; Steve Whatley; Robert W Williams; Tim Wiltshire; Nengjun Yi; Dabao Zhang; Min Zhang; Fei Zou
Journal: Nat Genet Date: 2004-11 Impact factor: 38.330

5. Marker densities and the mapping of ancestral junctions.

6. Genome-wide high-resolution mapping by recurrent intermating using Arabidopsis thaliana as a model.

7. High-resolution genetic mapping using the Mouse Diversity outbred population.

8. Quantitative epigenetics through epigenomic perturbation of isogenic lines.

9. Haplotype probabilities in advanced intercross populations.

10. Genotype probabilities at intermediate generations in the construction of recombinant inbred lines.
