
XVth QTLMAS: simulated dataset.

Jean-Michel Elsen, Simon Tesseydre, Olivier Filangi, Pascale Le Roy, Olivier Demeure.

Abstract

BACKGROUND: Our aim was to simulate the data for the QTLMAS2011 workshop following a pig-type family structure under an oligogenic model in which each QTL had its own specific features.
RESULTS: The population comprised 3000 individuals descended from 20 sires and 200 dams. Within each family, 10 progeny belonged to the experimental population and were assigned phenotypes and marker genotypes, and 5 belonged to the selection population, known only through their marker genotypes. A total of 10,000 SNPs carried by 5 chromosomes of 1 Morgan each were simulated. Eight QTL were created (1 quadri-allelic, 2 linked in phase, 2 linked in repulsion, 1 imprinted and 2 epistatic). Random noise was added, giving a heritability of 0.30. The marker density, LD and MAF were similar to real-life parameters.


Year:  2012        PMID: 22640408      PMCID: PMC3363151          DOI: 10.1186/1753-6561-6-S2-S1

Source DB:  PubMed          Journal:  BMC Proc        ISSN: 1753-6561


Background

Statistical methods and software for the marker-assisted genetic analysis of quantitative traits and for the Genomic Evaluation of Breeding Values are partly converging in the new context of high-density SNP chip technology. Genome Wide Association Studies (GWAS) based on independent individuals are used on a very large scale in human genetics, whereas GEBV techniques have mostly been developed for ruminant species, in particular dairy cattle, where sires have very large numbers of offspring but dams have only one progeny per mating. However, both GWAS and GEBV are universal approaches which should be adaptable to any family structure, for instance the medium-sized full-sib families found in pigs. As in the 2009 and 2008 workshops [1,2], the data sets offered for exploration during the QTLMAS 2011 workshop were organized following this pig-type structure. The architecture of analyzed traits can be highly variable. The number of QTL varies from one, in the monogenic inheritance found for some disease resistances, to a huge number of tiny QTLs in other cases. Moreover, the QTL may be subject to various effects including dominance, epistasis or imprinting. To assess the ability of methods to deal with these situations, the choice was made in our simulation to avoid polygenic noise and limit the genetic variation to 8 segregating QTLs, each displaying its own features.

Simulation method

Pedigree

The population was a collection of 20 non-independent sire families. Each sire was mated to 10 dams, a given dam being mated to only one sire. Each dam gave birth to two sets of 10 and 5 offspring, respectively. The first progeny group (n = 2000 individuals) formed the experimental population, with marker genotypes and trait phenotype information. The second group (n = 1000 individuals) comprised candidates for selection, recorded only for their marker information. The parental generation (20 sires and 200 dams) was generated by randomly drawing two gametes from pools of 75. These 2x75 gamete pools were generated after a long history of random drift and mutation simulated by the LDSO software [3]. This history involved two steps: 1000 generations of a population comprising 1000 gametes, followed by a severe bottleneck with 150 gametes evolving over 30 generations.
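The mating design described above can be checked with a minimal sketch. The record layout and the identifiers (`build_pedigree`, the group labels) are hypothetical and are not taken from the LDSO software; only the family sizes come from the text.

```python
from itertools import count

N_SIRES = 20
DAMS_PER_SIRE = 10
EXPERIMENTAL_PER_DAM = 10   # phenotyped and genotyped offspring
CANDIDATES_PER_DAM = 5      # genotyped-only selection candidates


def build_pedigree():
    """Return a list of (animal_id, sire_id, dam_id, group) records."""
    ids = count(1)
    sires = [next(ids) for _ in range(N_SIRES)]
    records = []
    for sire in sires:
        for _ in range(DAMS_PER_SIRE):
            dam = next(ids)  # each dam is mated to exactly one sire
            for _ in range(EXPERIMENTAL_PER_DAM):
                records.append((next(ids), sire, dam, "experimental"))
            for _ in range(CANDIDATES_PER_DAM):
                records.append((next(ids), sire, dam, "candidate"))
    return records


pedigree = build_pedigree()
n_experimental = sum(1 for r in pedigree if r[3] == "experimental")
n_candidates = sum(1 for r in pedigree if r[3] == "candidate")
n_dams = len({r[2] for r in pedigree})
print(n_experimental, n_candidates, n_dams)  # 2000 1000 200
```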

Genomes

The genome structure consisted of five autosomal chromosomes of one Morgan each. Biallelic SNPs were simulated, located every 0.05 cM (2000 SNPs per chromosome). A pool of 1000 gametes was first generated in linkage equilibrium. During the 1150 generations following this initial step, a mutation rate of 0.0002 was applied.
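As a rough illustration of this marker layout, the sketch below builds an evenly spaced map and counts the SNPs. Placing the first SNP at 0.05 cM is an assumption made here for illustration; the text only gives the spacing and the chromosome length.

```python
N_CHROMOSOMES = 5
CHROMOSOME_LENGTH_CM = 100.0  # one Morgan
SPACING_CM = 0.05

snps_per_chromosome = round(CHROMOSOME_LENGTH_CM / SPACING_CM)

# (chromosome, position in cM) for every simulated SNP.
# Assumption: the first SNP sits at 0.05 cM and the last at 100 cM.
marker_map = [
    (chrom, round((i + 1) * SPACING_CM, 2))
    for chrom in range(1, N_CHROMOSOMES + 1)
    for i in range(snps_per_chromosome)
]

print(snps_per_chromosome, len(marker_map))  # 2000 10000
```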

Quantitative trait phenotypes

The trait variability was due to the segregation of 8 QTLs and to environmental noise. The QTLs were generated by transforming SNPs that were still polymorphic in the last generation. These SNPs were then removed from the marker data file. The QTL located on chromosome 1 was generated by pooling alleles from two adjacent SNPs, in order to create a quadri-allelic locus. QTL characteristics varied between chromosomes and were chosen to represent extreme situations (Table 1). The effects of the QTLs are given in "trait units" (TU). Environmental noise variance was adjusted to the observed genetic variation, i.e. the genetic variation due to the additive effects of QTL, in order to give a realized heritability of 0.3. The resulting phenotypic standard deviation was 9.37 TU.
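The noise adjustment can be sketched from the definition of heritability, h2 = Vg / (Vg + Ve), which gives Ve = Vg (1 - h2) / h2. The function names below are hypothetical, and using the population variance of the supplied genetic values as Vg is an assumption, since the authors' exact computation is not given. The last lines derive the variance components implied by the reported phenotypic standard deviation.

```python
import random
import statistics


def environmental_variance(genetic_variance, heritability):
    """Noise variance such that Vg / (Vg + Ve) equals the heritability."""
    return genetic_variance * (1.0 - heritability) / heritability


def add_noise(genetic_values, heritability, rng):
    """Add Gaussian noise scaled to the observed genetic variance (assumed estimator)."""
    var_g = statistics.pvariance(genetic_values)
    sd_e = environmental_variance(var_g, heritability) ** 0.5
    return [g + rng.gauss(0.0, sd_e) for g in genetic_values]


# Variance components implied by the reported figures (h2 = 0.30, SD = 9.37 TU).
var_p = 9.37 ** 2
var_g = 0.30 * var_p
var_e = environmental_variance(var_g, 0.30)
print(round(var_p, 1), round(var_g, 1), round(var_e, 1))  # 87.8 26.3 61.5

# Toy usage with made-up genetic values.
phenotypes = add_noise([0.0, 2.0, 4.0, 6.0, 8.0], 0.30, random.Random(1))
```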
Table 1

Characteristics of the simulated QTLs

QTL   Chrom.  Position (cM)  Type                      Effects
QTL1  1       2.85           4 alleles, additive, big  Allele 1 = 0.0, 2 = 2.0, 3 = 4.0, 4 = 6.0

                                                             11    12    22
QTL2  2       81.9           in phase with QTL3        11   -4.0  -2.0   0.0
QTL3  2       93.75          in phase with QTL2        12   -2.0   0.0   2.0
                                                       22    0.0   2.0   4.0

                                                             11    12    22
QTL4  3       5.0            opposition with QTL5      11    0.0   2.0   4.0
QTL5  3       15.0           opposition with QTL4      12   -2.0   0.0   2.0
                                                       22   -4.0  -2.0   0.0

                                                             11    12    21    22
QTL6  4       32.2           Imprinted                       2.0   0.0   0.0   0.0

                                                             11    12    22
QTL7  5       36.3           epistatic with QTL8       11    2.0   1.0   0.0
QTL8  5       99.2           epistatic with QTL7       12    0.0   0.0   0.0
                                                       22    0.0   0.0   0.0
On chromosome 1, a QTL (QTL1) with 4 alleles, displaying large additive effects (0.0, 2.0, 4.0 and 6.0 TU for alleles 1 to 4), was positioned close to the chromosome border (2.85 cM). The deviation between extreme genotypes (44 vs. 11) was thus 12 TU, i.e. about 1.28 phenotypic standard deviations. Chromosomes 2 and 3 were each assigned two linked additive QTLs showing a 1-TU allelic effect, acting "in phase" on chromosome 2 and "in repulsion" on chromosome 3. The terms "phase" and "repulsion" should be clarified in our context. Four classes of chromosome 2 (resp. 3) were observed in the last generation, defined by the alleles present at QTL2 and QTL3 (resp. QTL4 and QTL5): 1-1, 1-2, 2-1 and 2-2. As the associations 1-1 and 2-2 were more frequent than 1-2 or 2-1 in both cases, we assigned the same direction to the effects of alleles 1 (resp. 2) at QTL2 and 1 (resp. 2) at QTL3, and of alleles 1 (resp. 2) at QTL4 and 2 (resp. 1) at QTL5. Chromosome 4 carried an imprinted QTL of moderate effect (2 TU). All individuals receiving allele 1 from their sire displayed a quantitative phenotype increased by 2 TU as compared to individuals receiving allele 2. On chromosome 5, two epistatic QTLs were positioned far from each other. The effect of QTL7 was expressed (with mean values of 0, 1 and 2 for genotypes 11, 12 and 22) only when animals displayed genotype 11 at QTL8.
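The additive cases can be reproduced with a short sketch. The QTL1 allele effects are those listed in Table 1. For the in-phase pair, coding allele 1 as -1 TU and allele 2 as +1 TU at both QTL2 and QTL3 is an assumption inferred from the "1-TU allelic effect" wording and the joint-genotype values in Table 1. Function names are hypothetical.

```python
QTL1_ALLELE_EFFECTS = {1: 0.0, 2: 2.0, 3: 4.0, 4: 6.0}  # TU, from Table 1
PHENOTYPIC_SD = 9.37                                      # TU, from the text

# Assumed coding for the in-phase pair: allele 1 = -1 TU, allele 2 = +1 TU.
IN_PHASE_ALLELE_EFFECT = {1: -1.0, 2: 1.0}


def qtl1_value(genotype):
    """Additive value of a QTL1 genotype given as a pair of alleles."""
    first, second = genotype
    return QTL1_ALLELE_EFFECTS[first] + QTL1_ALLELE_EFFECTS[second]


def in_phase_value(genotype_qtl2, genotype_qtl3):
    """Joint additive value of QTL2 and QTL3 genotypes (pairs of alleles)."""
    return sum(IN_PHASE_ALLELE_EFFECT[a] for a in genotype_qtl2 + genotype_qtl3)


# Deviation between the extreme QTL1 genotypes, in phenotypic SD units.
deviation = qtl1_value((4, 4)) - qtl1_value((1, 1))
print(deviation, round(deviation / PHENOTYPIC_SD, 2))  # 12.0 1.28

# 3 x 3 table of joint values for genotypes 11, 12 and 22 at each locus.
genotypes = [(1, 1), (1, 2), (2, 2)]
table = [[in_phase_value(g2, g3) for g3 in genotypes] for g2 in genotypes]
for row in table:
    print(row)
```

Under this coding the printed rows match the QTL2/QTL3 block of Table 1. The repulsion pair would follow from reversing the sign of the allele effects at one of the two loci.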

Results

Amongst the 10,000 SNPs, 7,130 were still polymorphic in the last generation. The Minor Allele Frequency was classically distributed with a peak near 0 and a nearly uniform distribution elsewhere (Figure 1). The average MAF was 0.23 with a standard deviation of 0.15.
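A minimal sketch of how such summary statistics could be obtained from genotype data is given below. Genotypes are assumed to be coded as counts (0, 1 or 2) of one allele, and restricting the mean and standard deviation to polymorphic SNPs is an assumption, since the text does not say which SNPs were included. The toy data are made up.

```python
import statistics


def minor_allele_frequency(allele_counts):
    """MAF at one SNP from per-animal counts (0, 1 or 2) of one allele."""
    p = sum(allele_counts) / (2.0 * len(allele_counts))
    return min(p, 1.0 - p)


def summarise_maf(genotype_columns):
    """Number of polymorphic SNPs, then mean and SD of their MAF."""
    mafs = [minor_allele_frequency(col) for col in genotype_columns]
    polymorphic = [m for m in mafs if m > 0.0]
    return len(polymorphic), statistics.mean(polymorphic), statistics.pstdev(polymorphic)


# Toy data: three SNPs scored on four animals; the second SNP is fixed.
toy_columns = [[0, 1, 2, 1], [0, 0, 0, 0], [2, 2, 1, 2]]
n_polymorphic, mean_maf, sd_maf = summarise_maf(toy_columns)
print(n_polymorphic, mean_maf)  # 2 0.3125
```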
Figure 1

Minor Allele Frequency distribution in the last generation.

The linkage disequilibrium generated by the simulation process is typical of livestock populations (Figure 2). When compared with theoretical curves obtained using the formula of Tenesa et al. [4], E(r²) = 1/(4Nc + 2), with N the effective population size and c the recombination rate, the observed LD was closer to the N = 1000 curve at short distances and to the N = 150 curve at larger distances between SNPs (Figure 3). This pattern is consistent with a recent bottleneck in a formerly sizeable population.
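The theoretical curves can be sketched as follows, assuming the expectation E(r²) = 1/(4Nc + 2) with c expressed as a recombination fraction. Converting map distance to c as distance in cM divided by 100 is an approximation adopted here for short distances, and the list of distances is illustrative only.

```python
def expected_r2(effective_size, recombination_rate):
    """Expected r2 between two loci: 1 / (4 N c + 2)."""
    return 1.0 / (4.0 * effective_size * recombination_rate + 2.0)


for distance_cm in (0.05, 0.5, 1.0, 5.0):
    c = distance_cm / 100.0  # assumption: c approximated by distance in Morgans
    print(
        distance_cm,
        round(expected_r2(150, c), 3),
        round(expected_r2(1000, c), 3),
    )
```

At the marker spacing of 0.05 cM, the N = 1000 expectation is 0.25 under this approximation, while the N = 150 expectation is higher.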
Figure 2

Mean and maximum Linkage Disequilibrium (r²).

Figure 3

Observed and expected (assuming effective population sizes of 150 and 1000 reproducers) Linkage Disequilibrium (r²).

The 220 parents of the final generation were related, due to the limited sample size of the historical population. The distribution of the genomic relationship coefficients is given in Figure 4 as per [5]. It shows that the animals were far from unrelated, although unrelatedness is often assumed in simple QTL detection approaches.
Figure 4

Distribution of the genomic relationship coefficients in the parental generation.
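For readers who wish to compute such coefficients, the sketch below uses the standardised SNP cross-product commonly used for genomic relationships between pairs of distinct individuals, A_jk = (1/N) Σ_i (x_ij - 2p_i)(x_ik - 2p_i) / (2p_i(1 - p_i)). Whether this is exactly the estimator the authors applied is an assumption. Estimating allele frequencies from the sample itself and skipping monomorphic SNPs are also assumptions, and the function name is hypothetical.

```python
def pairwise_genomic_relationships(genotypes):
    """Off-diagonal genomic relationship coefficients.

    genotypes[j][i] is the count (0, 1 or 2) of one allele at SNP i for animal j.
    Returns a dict mapping (j, k) with j < k to the coefficient.
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    # Allele frequencies estimated from the sample itself (assumption).
    freqs = [sum(g[i] for g in genotypes) / (2.0 * n_animals) for i in range(n_snps)]
    usable = [i for i, p in enumerate(freqs) if 0.0 < p < 1.0]
    coefficients = {}
    for j in range(n_animals):
        for k in range(j + 1, n_animals):
            total = 0.0
            for i in usable:
                p = freqs[i]
                total += (
                    (genotypes[j][i] - 2.0 * p)
                    * (genotypes[k][i] - 2.0 * p)
                    / (2.0 * p * (1.0 - p))
                )
            coefficients[(j, k)] = total / len(usable)
    return coefficients


# Toy data: four animals, two SNPs.
toy = [[0, 2], [2, 0], [1, 1], [1, 1]]
relationships = pairwise_genomic_relationships(toy)
print(relationships[(0, 1)], relationships[(0, 2)])  # -2.0 0.0
```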


Discussion

The simulated data described here were provided to teams taking part in the QTLMAS2011 workshop in order to compare their QTL mapping and Genomic EBV techniques. The marker structure was similar to situations encountered in livestock populations, with one SNP every 0.05 cM (corresponding to a 60K SNP chip for a classical 3000 cM genome), an average MAF of 0.23, and a mean LD between close (0.05 cM) loci of 0.27, similar to findings previously described in cattle [6]. The co-ancestry relationships displayed a large variability, as expected in real breeds. In contrast, the genetic architecture of the quantitative trait was probably much simpler than most of the situations prevailing for production traits: only 8 segregating QTLs, one or two per chromosome. Different types of allelic relationships were chosen: additivity for a single major QTL (chromosome 1), linked genes (chromosomes 2 and 3), an imprinting feature on chromosome 4 and two epistatic loci on chromosome 5. This simplified situation was chosen on purpose to avoid a possible confounding effect due to polygenic noise and to emphasize the abilities of the compared techniques to deal with such extreme cases.

List of abbreviations used

SNP: Single Nucleotide Polymorphism; QTL: Quantitative Trait Locus; MAF: Minor Allele Frequency; LD: Linkage Disequilibrium; GEBV: Genomic Estimated Breeding Value; GWAS: Genome Wide Association Studies.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

All authors contributed to the ideas and methods, and read and approved the manuscript. ST, JME and OF programmed the simulations. JME wrote the manuscript.
References

1. Albart Coster, John W M Bastiaansen, Mario P L Calus, Chris Maliepaard, Marco C A M Bink. QTLMAS 2009: simulated dataset. BMC Proc. 2010-03-31.

2. Mogens Sandø Lund, Goutam Sahana, Dirk-Jan de Koning, Guosheng Su, Orjan Carlborg. Comparison of analyses of the QTLMAS XII common dataset. I: Genomic selection. BMC Proc. 2009-02-23.

3. F Ytournel, S Teyssèdre, D Roldan, M Erbe, H Simianer, D Boichard, H Gilbert, T Druet, A Legarra. LDSO: a program to simulate pedigrees and molecular information under various evolutionary forces. J Anim Breed Genet. 2012-01-23.

4. Albert Tenesa, Pau Navarro, Ben J Hayes, David L Duffy, Geraldine M Clarke, Mike E Goddard, Peter M Visscher. Recent human effective population size estimated from linkage disequilibrium. Genome Res. 2007-03-09.

5. Jian Yang, Beben Benyamin, Brian P McEvoy, Scott Gordon, Anjali K Henders, Dale R Nyholt, Pamela A Madden, Andrew C Heath, Nicholas G Martin, Grant W Montgomery, Michael E Goddard, Peter M Visscher. Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010-06-20.

6. Stephanie D McKay, Robert D Schnabel, Brenda M Murdoch, Lakshmi K Matukumalli, Jan Aerts, Wouter Coppieters, Denny Crews, Emmanuel Dias Neto, Clare A Gill, Chuan Gao, Hideyuki Mannen, Paul Stothard, Zhiquan Wang, Curt P Van Tassell, John L Williams, Jeremy F Taylor, Stephen S Moore. Whole genome linkage disequilibrium maps in cattle. BMC Genet. 2007-10-25.
