
A selective genotyping approach identifies QTL in a simulated population.

Bianca Moioli, Francesco Napolitano, Gennaro Catillo.

Abstract

BACKGROUND: Identifying QTLs for important phenotypic traits with medium-density genome-wide SNP panels is one of the most challenging areas in animal genetics; it avoids the time-consuming direct sequencing of putative candidate genes when searching for the mutations that affect a trait. Appropriate statistical analyses allow the identification of genomic regions associated with the investigated trait in the genotyped population.
METHODS: The selective genotyping technique was applied to 1000 genotyped animals with known phenotype. Sliding windows of five consecutive SNPs were created for each chromosome; we assumed that the QTLs lay in the windows showing the largest difference in allele frequencies between the two most divergent productive groups (the two tails of the distribution).
RESULTS: Ten windows affected at least one trait. For five of these windows, the highest and significant effect was given by a single SNP, which could therefore be taken as the QTL itself.
CONCLUSIONS: In this study we proposed a simple method to identify genomic regions associated with the phenotype under study. Identifying the DNA region is the first step in the search for the mutation actually responsible for the trait variability, through direct sequencing of the genome regions that encode the QTL.


Year:  2014        PMID: 25519519      PMCID: PMC4195409          DOI: 10.1186/1753-6561-8-S5-S5

Source DB:  PubMed          Journal:  BMC Proc        ISSN: 1753-6561


Background

The recent availability of genome-wide SNP panels, which make it possible to evaluate the variation in SNP allele frequencies between populations, has allowed genomic regions subject to positive selection to be found in humans and cattle [1-5]. For the identification of selective sweeps for milk traits, efficient application of the selective genotyping strategy for QTL mapping has been reported in dairy cattle [6], swine [7] and sheep [8]. In these cases, the individuals most divergent for a trait (the two tails of the distribution) are chosen and genotyped. Boligon et al. [9] compared selective genotyping strategies for prediction of breeding values in a population undergoing selection, and concluded that animals with extreme yield deviation values in a reference population are the most informative when training genomic selection models. Using the selective genotyping approach, Moioli et al. [8] identified two novel non-synonymous mutations associated with milk yield in sheep, and demonstrated their effect in independent populations as well. In the present study, we hypothesized that selective sweeps detected in a simulated population would be useful for mapping QTLs for the trait under selection in the whole population.

Materials and methods

Dataset

Three milk production traits were simulated in a population of 3,000 females, included in a data set of 4,100 individuals of 4 different generations (G0 to G4) with known pedigree. Female and parental genotypes at 10,000 SNPs, equally distributed over 5 chromosomes, were available. A detailed description of the population is reported by Usai et al. [10].

Statistical analysis

The selective genotyping technique was simulated on the 1000 females of generation 3, assuming that they were those that had benefited most from selection. Their production is reported in Table 1. For each trait, allele frequencies at each SNP of each chromosome were calculated separately for the group whose production was below -1 st dev and for the group whose production was above 1 st dev. The number of animals in each group is also reported in Table 1. The QTLs so hypothesized might be affected by the number of individuals included in the production tails and by the additive relationship between them, which might not represent the average relationship of the whole population. Habier et al. [11], in the context of predicting genomic breeding values (GEBV), reported that the additive-genetic relationships between the training individuals and a selection candidate, as captured by SNPs, affect the GEBV accuracy of that candidate. Therefore, in the present study, the coefficients of relationship between the individuals of each tail, as well as within the whole population, were calculated as in Wright [12] using Proc Inbreed in SAS [13].
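The tail selection and per-group allele frequencies described above can be sketched as follows. This is a minimal illustration, not the authors' SAS code: the function name, the 0/1/2 allele-count coding of the input, and the reading of the thresholds as the trait mean plus or minus one sample standard deviation are all assumptions.

```python
from statistics import mean, stdev

def tail_allele_frequencies(genotypes, phenotype, n_sd=1.0):
    """Allele frequencies per SNP in the low and high phenotype tails.

    genotypes: one list per animal of allele counts coded 0, 1 or 2
               (hypothetical input layout, not taken from the paper).
    phenotype: one trait value per animal, in the same order.
    """
    # Assumption: tails are defined as mean -/+ n_sd sample standard deviations.
    centre, sd = mean(phenotype), stdev(phenotype)
    low = [g for g, y in zip(genotypes, phenotype) if y < centre - n_sd * sd]
    high = [g for g, y in zip(genotypes, phenotype) if y > centre + n_sd * sd]

    def frequencies(group):
        # Two alleles per animal, so frequency = total count / (2 * animals).
        return [sum(snp) / (2 * len(group)) for snp in zip(*group)]

    return frequencies(low), frequencies(high), len(low), len(high)
```

The function is called once per trait, which mirrors the separate treatment of the three traits in the text.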
Table 1

Statistical parameters relevant to the analyzed traits in the female population of generation 3

Variable  N     mean      st dev   min      max     N < -1 st dev  N > 1 st dev
Trait1    1000  -6.42     171.32   -526.43  483.17  173            150
Trait2    1000  -0.170    9.60     -32.23   25.51   165            167
Trait3    1000  0.000396  0.02388  -0.089   0.085   156            153
The QTL effect was subsequently estimated using sliding windows of five consecutive SNPs, calculated for each of the five chromosomes. The number of markers per window was chosen because the SNP density of the simulated population was similar to the average SNP density of the cattle panel used by Stella et al. [2], who suggested that sliding windows of 5, 9 and 19 SNPs give similar results when searching for selective sweeps in cattle. For each window, the sum of the absolute differences in allele frequency between the two productive groups, taken over the SNPs of the window, was calculated; the sliding windows were then ranked on this sum within each chromosome. We arbitrarily hypothesized that the potential QTL for the considered trait was located in the top-ranking window. Because the selective genotyping was performed separately for the three traits, the potential QTLs could be located in different windows; for this reason, more than one window on the same chromosome was considered in the subsequent analyses. The top-ranking sliding windows, encoding the hypothesized QTL, and the potentially affected traits are reported in Table 2.
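The window score and ranking can be illustrated with a short sketch. It assumes that windows advance one SNP at a time (the text does not state the step) and that the two inputs are per-SNP allele frequencies of the low and high tails on a single chromosome; the function name is hypothetical.

```python
def rank_windows(freq_low, freq_high, window=5):
    """Score each run of `window` consecutive SNPs on one chromosome.

    Returns (scores, order): scores[i] is the summed absolute allele-frequency
    difference for the window starting at SNP index i, and order lists the
    window start indices from highest to lowest score.
    """
    diff = [abs(h - l) for l, h in zip(freq_low, freq_high)]
    # Assumption: consecutive windows overlap and are shifted by one SNP.
    scores = [sum(diff[i:i + window]) for i in range(len(diff) - window + 1)]
    order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return scores, order
```

The first element of `order` is the start index of the top-ranking window, i.e. the window in which the potential QTL is hypothesized to lie.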
Table 2

Top ranking sliding windows based on the highest difference in allelic frequencies between the two productive groups, separately for each trait

chr  Starting position  End position  markers            QTLtrait1/QTLtrait2/QTLtrait3
1    84,000,000         84,200,000    SNP1681 - SNP1685  xx
1    14,500,000         14,750,000    SNP291 - SNP295    x
2    92,500,000         92,450,000    SNP3847 - SNP3851  x
2    46,700,000         46,900,000    SNP2935 - SNP2939  x
2    76,900,000         77,100,000    SNP3539 - SNP3543  x
3    400,000            600,000       SNP4009 - SNP4013  x
3    26,600,000         26,800,000    SNP4533 - SNP4537  x
3    36,850,000         37,050,000    SNP4738 - SNP4742  x
4    7,650,000          7,850,000     SNP6154 - SNP6158  x
4    24,850,000         25,250,000    SNP6498 - SNP6502  xx
5    69,300,000         69,500,000    SNP9387 - SNP9391  xx
5    2,700,000          2,950,000     SNP8055 - SNP8059  x

Estimation of the QTL effect for the whole window of 5 SNPs

The QTL effect was calculated on the whole recorded population as follows. For each sliding window, the most probable haplotype alleles were estimated using the EM algorithm [14], through Proc Haplotype in SAS [13], and were assigned to each phenotyped individual (n = 3000). For each haplotype allele with a frequency ≥ 0.07 in the recorded population, the allelic substitution effect was estimated as a covariate on each trait, as in Sherman et al. [15], with the following model:

y = b(haplotype allele) + e

where y = trait1, trait2 or trait3. Alleles were coded by copy number: two copies of the same allele = 2; one copy = 1; no copy = 0. To account for multiple testing, the corrected probability of the effect was estimated using the False Discovery Rate test with Proc Multtest in SAS [13].
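As an illustration of the two steps above, the sketch below regresses a trait on the 0/1/2 copy number of one haplotype allele and then adjusts a set of p-values for multiple testing. It is not the SAS analysis: the function names are hypothetical, the regression includes an intercept (the model in the text does not show one), and the FDR adjustment is assumed to be the Benjamini-Hochberg step-up procedure.

```python
def substitution_effect(copies, y):
    """Least-squares slope b of y = a + b * copies + e.

    copies: number of copies (0, 1 or 2) of the haplotype allele per animal.
    y:      trait value per animal, in the same order.
    """
    n = len(copies)
    mx, my = sum(copies) / n, sum(y) / n
    sxy = sum((x - mx) * (v - my) for x, v in zip(copies, y))
    sxx = sum((x - mx) ** 2 for x in copies)
    return sxy / sxx

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=pvalues.__getitem__)
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

The slope is the estimated change in the trait per additional copy of the haplotype allele, which is the quantity reported as "Effect" in Table 4.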

Estimation of the SNP effect from the haplotype effect

Under the hypothesis that one SNP of each haplotype was expected to have a major effect on the recorded trait, direct observation of the haplotype alleles that showed a highly significant effect (P < .00001) on one trait allowed the selection of one SNP whose two alleles showed opposite effects on that trait. For each of those SNPs, the allelic substitution effect was estimated as a covariate on each trait, with the same model used for the haplotype alleles.
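The authors made this selection by direct observation. One way to mechanise the same idea, offered only as a sketch with a hypothetical function name, is to look for positions in the haplotype at which one allele occurs only in haplotypes with a positive effect and the other only in haplotypes with a negative effect.

```python
def discriminating_positions(haplotype_effects):
    """Return 0-based SNP positions whose two alleles split the haplotypes
    into one group with only positive and one with only negative effects.

    haplotype_effects: dict mapping a haplotype string such as "11222"
    to its estimated effect on one trait.
    """
    n_snps = len(next(iter(haplotype_effects)))
    hits = []
    for pos in range(n_snps):
        signs = {}
        for hap, effect in haplotype_effects.items():
            signs.setdefault(hap[pos], set()).add(effect > 0)
        # Both alleles present, each with a single sign, and the signs differ.
        if len(signs) == 2 and all(len(s) == 1 for s in signs.values()):
            a, b = signs.values()
            if a != b:
                hits.append(pos)
    return hits

# Trait 1 effects of the significant haplotypes in the chromosome 1 window
# at 84.0-84.2 Mb (SNP1681 - SNP1685), taken from Table 4.
window = {"11222": -48.7, "12121": 32.5, "12111": 21.8, "12122": 24.2}
candidates = discriminating_positions(window)  # [1, 2]
```

For this window the sketch returns positions 1 and 2, i.e. the second and third SNP of the window. Table 5 reports SNP1682, the second SNP, so the rule narrows the candidates but does not by itself choose between them.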

Results

Because the selective genotyping strategy was performed separately for the three traits, the statistically significant windows varied depending on the considered trait (Table 2). The average additive relationship values of the selected tails, for each trait, were very similar to each other (Table 3), ranging from 4.26 to 4.37%, but they were higher than the corresponding value calculated for the whole population (3.01%). For all tested haplotypes, the probabilities of the allelic substitution effects after FDR correction are reported in Table 4.
Table 3

Average relationships in the selected groups of animals and in the whole population

Trait  tail     N     Coefficient of relationship
                      mean  st dev
1      highest  150   4.33  0.017
1      lowest   173   4.27  0.018
2      highest  167   4.32  0.017
2      lowest   165   4.27  0.018
3      highest  153   4.37  0.017
3      lowest   156   4.26  0.017
Total           4100  3.01  0.016
Table 4

Haplotype effects.

chr  Pos. (Mb)    Haplo.  Freq  Trait 1          Trait 2          Trait 3
     Start  End   allele        Effect  FDR P    Effect  FDR P    Effect     FDR P

1    84.0   84.2  11222   0.21  -48.7   <10^-4   -0.5    ns       9.9*10^-3  <10^-4
                  12121   0.24  32.5    <10^-4   0.9     2*10^-2  -4.1*10^-3 <10^-4
                  12111   0.07  21.8    2*10^-2  -0.2    ns       -7.0*10^-3 <10^-4
                  11121   0.09  2.0     ns       0.3     ns       6.1*10^-4  ns
                  12122   0.07  24.2    1*10^-2  -0.8    ns       -9.8*10^-3 <10^-4
                  21222   0.07  -5.7    ns       0.1     ns       1.5*10^-3  ns

1    14.5   14.7  11111   0.08  -16.7   7*10^-2  -0.4    ns       2.1*10^-3  8*10^-2
                  12111   0.19  37.1    <10^-4   2.8     <10^-4   2.9*10^-3  5*10^-4
                  12112   0.18  18.1    4*10^-3  0.5     ns       -2.2*10^-3 8*10^-3
                  12121   0.14  0.5     ns       1.3     3*10^-4  6.0*10^-3  <10^-4
                  12212   0.32  -25.5   <10^-4   -2.3    <10^-4   -4.0*10^-3 <10^-4

2    46.7   46.9  11122   0.11  -14.1   ns       -0.5    ns       1.7*10^-3  ns
                  11112   0.26  -3.6    ns       0.4     ns       3.0*10^-3  <10^-4
                  12111   0.23  -17.9   4*10^-3  -1.0    2*10^-3  4.3*10^-4  ns
                  22111   0.19  8.4     ns       0.0     ns       -2.4*10^-3 6*10^-3

2    76.9   77.1  11112   0.29  -12.9   2*10^-2  -0.8    8*10^-3  2.7*10^-6  ns
                  21112   0.28  2.4     ns       -0.1    ns       -1.2*10^-3 ns
                  22221   0.26  7.0     ns       0.8     1*10^-2  1.3*10^-3  ns
                  22112   0.07  25.0    1*10^-2  0.2     ns       -5.1*10^-3 2*10^-4

2    92.3   92.5  11111   0.39  13.1    2*10^-2  0.6     5*10^-2  -5.1*10^-4 ns
                  22122   0.11  0.5     ns       0.1     ns       6.7*10^-4  ns
                  22222   0.11  -16.5   4*10^-2  -0.6    ns       1.6*10^-3  ns

3    0.4    0.6   11121   0.11  -22.1   8*10^-3  -1.5    1*10^-4  -1.7*10^-5 ns
                  11122   0.27  10.1    ns       -0.1    ns       -3.2*10^-3 <10^-4
                  22211   0.32  9.5     ns       1.4     <10^-4   3.5*10^-3  <10^-4
                  22121   0.1   -9.0    ns       0.2     ns       3.4*10^-3  3*10^-3

3    26.6   26.8  11121   0.38  -17.3   4*10^-4  -0.9    4*10^-4  7.4*10^-4  ns
                  22212   0.22  31.7    <10^-4   1.2     2*10^-4  -3.3*10^-3 <10^-4
                  11222   0.07  -15.9   ns       -0.9    ns       -6.2*10^-4 ns
                  22222   0.11  -4.8    ns       0.2     ns       2.4*10^-3  ns

3    36.9   37.1  11111   0.14  24.4    1*10^-3  -0.4    ns       -1.0*10^-2 <10^-4
                  21111   0.11  -19.4   2*10^-2  0.4     ns       7.4*10^-3  <10^-4
                  21122   0.1   0.6     ns       0.6     ns       3.2*10^-3  3*10^-3
                  21112   0.08  -2.3    ns       -0.2    ns       7.5*10^-4  ns
                  12111   0.16  -4.0    ns       -0.3    ns       -1.1*10^-1 ns
                  21222   0.19  -2.7    ns       0.4     ns       3.6*10^-3  <10^-4

4    7.7    7.9   11112   0.34  13.8    1*10^-2  0.2     ns       -2.3*10^-3 8*10^-4
                  11122   0.11  -14.9   5*10^-2  -1.1    1*10^-2  -6.4*10^-4 ns
                  12112   0.07  -25.8   1*10^-2  -1.9    2*10^-4  -1.8*10^-3 ns
                  22212   0.16  -13.9   4*10^-2  0.2     ns       4.7*10^-3  <10^-4

4    24.9   25.3  11122   0.07  -12.3   ns       -1.8    1*10^-4  -3.8*10^-3 ns
                  11211   0.15  -24.5   1*10^-4  -1.3    1*10^-4  8.2*10^-4  ns
                  12121   0.07  49.8    <10^-4   3.0     <10^-4   -6.0*10^-4 ns
                  12122   0.23  35.8    <10^-4   2.6     <10^-4   1.5*10^-3  ns
                  21111   0.14  -37.9   <10^-4   -2.5    <10^-4   -2.2*10^-5 ns

5    69.3   69.5  12121   0.35  -27.7   <10^-4   -1.6    <10^-4   5.2*10^-4  ns
                  12112   0.15  20.2    2*10^-3  0.8     4*10^-2  -2.5*10^-3 1*10^-2
                  12212   0.08  12.0    ns       -0.1    ns       -2.5*10^-3 1*10^-2
                  21212   0.19  15.3    1*10^-2  1.1     2*10^-3  8.6*10^-4  ns

5    2.7    2.9   12111   0.19  -11.4   ns       -0.4    ns       1.3*10^-3  ns
                  21221   0.16  -27.3   <10^-4   -0.4    ns       4.9*10^-3  <10^-4
                  22111   0.19  26.3    <10^-4   0.8     5*10^-2  -3.0*10^-3 1*10^-4
                  22112   0.16  5.3     ns       -0.3    ns       -3.7*10^-3 <10^-4
Through direct observation of the haplotype alleles that showed a significant effect on one trait, it was possible to identify which SNP within the haplotype allele might be directly responsible for the trait variability. Table 5 reports only the SNPs that showed a highly significant (P < .0001) allelic substitution effect. These SNPs, located on chromosomes 1, 3 and 4, might themselves be considered the QTLs influencing the relevant trait.
Table 5

Effect of allele 1 of the SNP with major effect on each trait.

Chr  SNP      position    affected trait  effect   P-value
1    SNP293   14,600,000  2               2.470    2.0*10^-10
1    SNP293   14,600,000  2               2.470    <1.0*10^-16
1    SNP293   14,600,000  3               0.004    7.3*10^-09
1    SNP1682  84,050,000  1               -39.230  <1.0*10^-16
1    SNP1682  84,050,000  3               0.008    <1.0*10^-16
3    SNP4738  36,850,000  3               -0.008   <1.0*10^-16
4    SNP6155  7,700,000   1               18.050   8.0*10^-05
4    SNP6155  7,700,000   3               -0.003   4.0*10^-05
4    SNP6499  24,900,000  1               -58.550  <1.0*10^-16
4    SNP6499  24,900,000  2               -3.760   <1.0*10^-16

Discussion

In this study, two assumptions were arbitrarily made. The first was that the selective genotyping strategy was suitable for QTL mapping. Although the literature reports evidence of the suitability of this strategy [9], the decision on which animals should be considered highly divergent for each trait was a choice of the authors. Therefore, the results obtained, both in the number and in the position of the QTLs, might have been different if more or less restrictive thresholds had been chosen. The additive relationship values of the selected tails, for each trait, were very similar to each other, ranging from 4.26 to 4.37%, but they were higher than the corresponding value calculated for the whole population (3.01%). To appraise the extent of this difference, it is useful to cite Vahlsten et al. [16], who reported that an increase of 0.96 percentage units of relationship per generation is to be considered slow; this value refers to Friesian bulls born over 40 years and belonging to a population of over 400,000 animals. It can therefore be inferred that the relationship differences observed in the present study merely reflect the generational trend.

The second assumption was that the QTL was encoded by a haplotype of 5 consecutive SNPs. Weller and Ron [17] underlined the importance of the extent of LD in the application of genome scans to breeding programs. These authors noted that population-wide LD extends, in dairy cattle, over less than 1 cM, i.e. a much shorter distance than genetic linkage within families, which extends over tens of centimorgans. It is therefore possible that the hypothesis that the QTL was encoded by the haplotype with the highest effect on each trait was not the most appropriate for this study, as the analyzed population was a simulated sample. However, because the sliding windows encompass consecutive markers, the choice of the top-ranking window for each trait seemed appropriate: it allowed the identification of single SNPs (Table 5) with a highly significant effect on one trait, the probability for some of them being < 1.0E-16.

Conclusions

In this study we proposed a simple method to identify genomic regions associated with the phenotype under study, regions that could therefore be considered potential QTLs. Identifying the DNA region is the first step towards identifying the mutation actually responsible for the variability of the trait, through direct sequencing of the genomic regions that encode the QTL. The precision of the QTL estimation can vary depending on the deviation thresholds set in the reference population to define which animals are extremely divergent.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

BM planned the study and applied the procedures to set up the sliding windows to be used in the subsequent analyses. GC and FN performed the statistical analysis of association providing the estimation of the QTL effects. All authors have contributed to the editing of the article, and approved the final manuscript.

1.  Interrogating a high-density SNP map for signatures of natural selection.

Authors:  Joshua M Akey; Ge Zhang; Kun Zhang; Li Jin; Mark D Shriver
Journal:  Genome Res       Date:  2002-12       Impact factor: 9.043

2.  A genome-wide scan for signatures of recent selection in Holstein cattle.

Authors:  S Qanbari; E C G Pimentel; J Tetens; G Thaller; P Lichtner; A R Sharifi; H Simianer
Journal:  Anim Genet       Date:  2010-01-21       Impact factor: 3.169

3.  Comparison of selective genotyping strategies for prediction of breeding values in a population undergoing selection.

Authors:  A A Boligon; N Long; L G Albuquerque; K A Weigel; D Gianola; G J M Rosa
Journal:  J Anim Sci       Date:  2012-12       Impact factor: 3.159

Review 4.  Positive natural selection in the human lineage.

Authors:  P C Sabeti; S F Schaffner; B Fry; J Lohmueller; P Varilly; O Shamovsky; A Palma; T S Mikkelsen; D Altshuler; E S Lander
Journal:  Science       Date:  2006-06-16       Impact factor: 47.728

5.  Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population.

Authors:  L Excoffier; M Slatkin
Journal:  Mol Biol Evol       Date:  1995-09       Impact factor: 16.240

Review 6.  Invited review: quantitative trait nucleotide determination in the era of genomic selection.

Authors:  J I Weller; M Ron
Journal:  J Dairy Sci       Date:  2011-03       Impact factor: 4.034

7.  The impact of genetic relationship information on genomic breeding values in German Holstein cattle.

Authors:  David Habier; Jens Tetens; Franz-Reinhold Seefried; Peter Lichtner; Georg Thaller
Journal:  Genet Sel Evol       Date:  2010-02-19       Impact factor: 4.297

8.  Polymorphisms and haplotypes in the bovine neuropeptide Y, growth hormone receptor, ghrelin, insulin-like growth factor 2, and uncoupling proteins 2 and 3 genes and their associations with measures of growth, performance, feed efficiency, and carcass merit in beef cattle.

Authors:  E L Sherman; J D Nkrumah; B M Murdoch; C Li; Z Wang; A Fu; S S Moore
Journal:  J Anim Sci       Date:  2007-09-04       Impact factor: 3.159

9.  Signatures of selection identify loci associated with milk yield in sheep.

Authors:  Bianca Moioli; Maria Carmela Scatà; Roberto Steri; Francesco Napolitano; Gennaro Catillo
Journal:  BMC Genet       Date:  2013-09-03       Impact factor: 2.797

10.  Identification of a short region on chromosome 6 affecting direct calving ease in Piedmontese cattle breed.

Authors:  Silvia Bongiorni; Giordano Mancini; Giovanni Chillemi; Lorraine Pariset; Alessio Valentini
Journal:  PLoS One       Date:  2012-12-04       Impact factor: 3.240
