
Identification of selection footprints on the X chromosome in pig.

Yunlong Ma1, Haihan Zhang1, Qin Zhang1, Xiangdong Ding1.   

Abstract

Identifying footprints of selection can provide straightforward insight into the mechanism of artificial selection and help uncover the causal genes related to important traits. In this study, three between-population and two within-population approaches, namely the Cross Population Extended Haplotype Homozygosity Test (XPEHH), the Cross Population Composite Likelihood Ratio (XPCLR), the F-statistics (Fst), the Integrated Haplotype Score (iHS) and Tajima's D, were implemented to detect selection footprints on the X chromosome in three pig breeds using the Illumina Porcine60K SNP chip. Using the between-population methods, 11, 11 and 7 potential selection regions with total lengths of 15.62 Mb, 12.32 Mb and 9.38 Mb were identified in Landrace, Chinese Songliao and Yorkshire by XPEHH, respectively; 16, 13 and 17 regions with total lengths of 15.20 Mb, 13.00 Mb and 19.21 Mb by XPCLR; and 4, 2 and 4 regions with total lengths of 3.20 Mb, 1.60 Mb and 3.20 Mb by Fst. Using the within-population methods, 7, 10 and 9 potential selection regions with total lengths of 8.12 Mb, 8.40 Mb and 9.99 Mb were identified in Landrace, Chinese Songliao and Yorkshire by iHS, and 4, 3 and 2 regions with total lengths of 3.20 Mb, 2.40 Mb and 1.60 Mb by Tajima's D. Moreover, the selection regions from different methods partly overlapped; in particular, the region around 22∼25 Mb was detected as under selection in Landrace and Yorkshire, but not in Chinese Songliao, by all three between-population methods. Very little overlap was found between the selection regions identified by between-population and within-population methods. Bioinformatics analysis showed that genes relevant to meat quality, reproduction and immunity were located in the potential selection regions. In addition, three out of five significant SNPs associated with hematological traits reported in our genome-wide association study were harbored in potential selection regions.

Year:  2014        PMID: 24740293      PMCID: PMC3989256          DOI: 10.1371/journal.pone.0094911

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Artificial selection plays an important role in the adaptive evolution of domestic animals [1]. So far, a series of noticeable differences caused by artificial selection have been identified, especially in economic traits that have brought huge economic profit during the development of human society [1], [2]. With the development of high-throughput genotyping technology, hunting for genomic evidence of selection on genes or genomic regions via high-density SNP chips or sequencing data has proved useful for providing straightforward insights into the meaning of selection and for exploring causal genes relevant to traits of interest [3], [4]. Theoretically, a novel causal variant that has been under selection pressure usually shows long-range linkage disequilibrium (LD) and a high population frequency over a long period of time. Hence, selection footprints can be detected through the decay of linkage disequilibrium and the variation of allele frequency. A series of related methods have been proposed, which can be grouped into site-frequency-spectrum and linkage-disequilibrium categories according to their underlying theory [5]. The F-statistics (Fst) [6], Tajima's D test [7] and the Cross Population Composite Likelihood Ratio (XPCLR) [8] for the former category, and the Cross Population Extended Haplotype Homozygosity Test (XPEHH) [4] and the Integrated Haplotype Score (iHS) [9] for the latter, are representative methods that are widely used in identifying selection footprints. Among them, Fst, XPCLR and XPEHH are mainly used to detect selection footprints between populations (between-population methods), whereas Tajima's D and iHS primarily use information from a single population (within-population methods). Fst was initially used to assess population differentiation according to the DNA polymorphism of populations [6], which was attributed to geographically variable selection.
Currently, several variants of the Fst method have been developed, e.g. the two-step method of Gianola's Fst [10] and the Fst-based Bayesian hierarchical model [11]. Unlike Fst, XPCLR uses the differentiation of multi-locus allele frequencies between two populations to detect selection footprints; it is effective in identifying fast changes in allele frequency at a locus relative to random drift [8]. Fst and XPCLR mainly consider the variation of allele frequency, whereas XPEHH assumes that selection can be traced by measuring LD or by observing overrepresented haplotypes in a population, which makes it capable of detecting entirely or approximately fixed sites [4]. iHS is also based on the theory of linkage disequilibrium; it is sensitive to regions in which the frequency of the derived allele at selected sites has increased rapidly [9]. Tajima's D is based on allele frequency and is sensitive to purifying selection and balancing selection [7]. At present, many studies of selection footprints in humans and animals have been reported [3], [4], [12]–[15]. However, most of these studies restrict their investigations to the autosomes and rarely consider the X chromosome. Compared with the autosomes, the X chromosome has its own particularities and plays an important role in the evolution of humans and animals. McVicker et al. (2009) investigated the genomic signature of natural selection and found that the reduction in genome diversity caused by selection is higher on the X chromosome (12–40%) than on the autosomes (19–26%) [13]. The X chromosome has been subject to higher selection pressure than the autosomes owing to sex-specific dosage compensation (SSDC), which places genes on the X chromosome under more direct and effective selection [16], [17]. As an important model animal, the pig has experienced a long history of artificial selection in the process of domestication and breeding [18].
The X chromosome of the pig carries many interesting genes, such as the androgen receptor gene (AR) and the thyroid-binding globulin gene (TBG). Therefore, it is necessary to investigate the occurrence of selection footprints on the X chromosome in pig. In this study, three between-population methods (XPEHH, Fst and XPCLR) and two within-population methods (iHS and Tajima's D) were implemented to scan the whole X chromosome for selection footprints in three pig breeds using the Illumina PorcineSNP60K BeadChip (Illumina, San Diego, CA). Afterwards, a series of analyses, including gene searching and functional enrichment analysis, were performed to explain the biological significance of the selection footprints.

Materials and Methods

Experiment Animals

A total of 515 pigs from three breeds were selected as the experimental population in this study: 67 Landrace individuals (32 boars and 35 sows), 375 Yorkshire individuals (207 boars and 168 sows) and 73 Chinese Songliao individuals (39 boars and 34 sows; Songliao for short). Songliao was bred in 1988 using boars from a cross of Duroc and Landrace and sows of Minzhu, a famous Chinese native breed. In order to identify population structure and avoid relatedness among animals, a principal component analysis (PCA) following Paschou et al. (2007) was performed using the genotype data [19]. A total of 113 sows, including 35 from Landrace, 34 from Songliao and 44 from Yorkshire, were finally chosen to detect selection footprints on the X chromosome. As shown in Fig. S1, the first principal component captures 36.14% of the variance in these data, and the second 22.03%.

Genotyping Data

Genomic DNA samples were extracted from ear tissue of all 515 pigs. The whole procedure for collecting ear tissue samples was carried out in strict accordance with the protocol approved by the Animal Welfare Committee of China Agricultural University (Permit Number: DK996). All DNA samples were genotyped using the Infinium II Multisample assay (Illumina Inc.). Illumina Porcine60K SNP arrays were scanned using iScan (Illumina Inc.) and analyzed using BeadStudio (Version 3.2.2, Illumina, Inc.). We applied the following quality-control procedure to ensure high-quality genotypes on the X chromosome of all sows: (1) individuals with a call rate less than 0.90 were discarded; (2) SNP loci with a call rate <0.90 were removed; (3) SNP loci that severely deviated from Hardy-Weinberg equilibrium (p-value <10e–6) were removed; (4) SNP loci with a minor allele frequency (MAF) less than 0.05 were removed, but only when within-population methods were performed. After quality control, we imputed the missing genotypes and inferred haplotypes using BEAGLE [20].
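The SNP-level filters listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: genotypes are assumed to be coded 0/1/2 (copies of one allele) with `None` for a missing call, all function names are hypothetical, and the Hardy-Weinberg test is a simple 1-df chi-square approximation, since the text does not state which test was used. The HWE threshold is a parameter because the notation "10e–6" in the text is ambiguous.

```python
import math

def call_rate(genotypes):
    """Fraction of non-missing genotype calls (missing coded as None)."""
    return sum(g is not None for g in genotypes) / len(genotypes)

def minor_allele_frequency(genotypes):
    """MAF from 0/1/2 allele counts, ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)

def hwe_pvalue(genotypes):
    """1-df chi-square test of Hardy-Weinberg proportions (assumed test)."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    p = sum(called) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0  # monomorphic SNP: no deviation can be measured
    observed = [called.count(0), called.count(1), called.count(2)]
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(genotypes, within_population=False,
              min_call_rate=0.90, hwe_threshold=1e-6, min_maf=0.05):
    """Apply the SNP filters; the MAF filter only for within-population methods."""
    if call_rate(genotypes) < min_call_rate:
        return False
    if hwe_pvalue(genotypes) < hwe_threshold:
        return False
    if within_population and minor_allele_frequency(genotypes) < min_maf:
        return False
    return True
```

Because only sows were analyzed, every animal carries two copies of the X chromosome, so the diploid Hardy-Weinberg expectation used here applies.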

Detection of Selection Footprints

Three between-population methods (XPEHH, Fst and XPCLR) and two within-population methods (iHS and Tajima's D) were implemented to detect the selection footprints. Fst, XPCLR and Tajima's D can directly handle SNP genotypes, while XPEHH and iHS require phased data.

Calculation of XPEHH scores

XPEHH derives from the idea of Extended Haplotype Homozygosity (EHH), which is defined as the probability that two randomly chosen extended haplotypes carrying a given core haplotype are homozygous [3], [4], [21]. EHH is calculated as

$$EHH_t = \frac{\sum_{i=1}^{s} \binom{e_{ti}}{2}}{\binom{c_t}{2}}$$

where $c_t$ is the number of samples of a particular core haplotype $t$, $e_{ti}$ is the number of samples of a particular extended haplotype $i$ that is based on core haplotype $t$, and $s$ is the number of unique extended haplotypes. The basic idea of XPEHH is to test whether a site is homozygous in one population but polymorphic in another by comparing the EHH scores of the two populations at one core SNP. It is expressed as

$$XPEHH = \ln\left(\frac{I_A}{I_B}\right)$$

where $I_A$ is the integral of the EHH value with respect to genetic distance in population A and $I_B$ is the corresponding integral in population B. Population B is viewed as the reference population and population A as the observed population. A negative XPEHH score suggests that selection happened in the reference population; a positive score suggests selection in the observed population. XPEHH is highly powerful in detecting selected loci that are fixed or approximately fixed [3].
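The EHH definition above can be illustrated with a short sketch. It is not the software used in the study: haplotypes are assumed to be equal-length strings of alleles, the core is a single SNP, the extension is measured in SNPs rather than genetic distance, and the function names are hypothetical. The second function only takes the log-ratio of two already integrated EHH values, without the standardization that is usually applied afterwards.

```python
import math
from collections import Counter

def ehh(haplotypes, core_index, extent):
    """EHH per core allele over SNPs [core_index - extent, core_index + extent].

    Returns {core_allele: probability that two randomly chosen haplotypes
    carrying that core allele are identical over the extended region}.
    """
    lo = max(0, core_index - extent)
    hi = core_index + extent + 1
    by_core = {}
    for hap in haplotypes:
        by_core.setdefault(hap[core_index], []).append(tuple(hap[lo:hi]))
    result = {}
    for core, extended in by_core.items():
        c = len(extended)  # number of samples carrying this core allele
        if c < 2:
            continue  # a pair cannot be drawn from a single haplotype
        identical_pairs = sum(math.comb(e, 2) for e in Counter(extended).values())
        result[core] = identical_pairs / math.comb(c, 2)
    return result

def unstandardized_xpehh(integrated_ehh_a, integrated_ehh_b):
    """Log-ratio of integrated EHH in observed (A) and reference (B) populations."""
    return math.log(integrated_ehh_a / integrated_ehh_b)
```

With this sign convention, a positive value points to selection in population A and a negative value to selection in population B, matching the interpretation given in the text.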

Population differentiation index

As a single-locus analysis method, Fst quantifies the relationship between pairs of alleles within a subpopulation and within the meta-population, and thereby measures the degree of differentiation. In this study, the two-step process proposed by Gianola et al. [10] was employed to identify selection footprints based on population differentiation. In the first step, with a non-informative prior distribution on allelic frequencies, a Bayesian model is used to draw samples from the posterior distribution of the parameters. According to Bayes' theorem, the joint posterior density of all allelic frequencies is written in terms of R, the total number of subpopulations, and of the frequencies of allele A and of allele a at each site in each subpopulation. In the second step, the samples from the posterior distribution are treated as “data” and a finite mixture model is fitted to identify clusters of the statistics. A draw from the posterior distribution of Fst is then obtained, and the posterior mean of Fst between populations ranges from 0 (identical populations) to 1 (complete differentiation).
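For intuition about the 0-to-1 scale, the sketch below computes a plain variance-based Fst at one locus from the allele-A frequencies of the subpopulations. This is explicitly not the two-step Bayesian procedure of Gianola et al. used in the study; it is the textbook ratio of the between-subpopulation variance of the allele frequency to its maximum possible value, and the function name is hypothetical.

```python
def fst_from_frequencies(freqs):
    """Variance-based Fst at one locus.

    freqs: frequency of allele A in each subpopulation (equal weights assumed).
    """
    r = len(freqs)
    mean_p = sum(freqs) / r
    if mean_p in (0.0, 1.0):
        return 0.0  # locus is monomorphic everywhere: no differentiation
    var_p = sum((p - mean_p) ** 2 for p in freqs) / r
    return var_p / (mean_p * (1 - mean_p))
```

Two subpopulations with identical frequencies give 0, and two subpopulations fixed for different alleles give 1.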

Calculation of XPCLR values

To avoid the influence of SNP ascertainment bias, XPCLR was built upon the multiple-locus composite likelihood ratio method (CLR) [8]. It not only makes use of the differences in allele frequencies between populations, but also models the joint allele frequency spectrum under selection. The likelihood function depends on the following quantities: r is the vector of recombination rates, n is the sample size, the count of the neutral allele at locus i is recorded for each locus, s is the selection coefficient, k is the size of the sliding window, w is a weight factor on linkage disequilibrium and p represents the allele frequency.

Calculation of iHS scores

The iHS statistic was defined as the log of the ratio of the integrated EHH score for haplotypes centered on the ancestral allele to the integrated EHH score for haplotypes centered on the derived allele, as described by Voight et al. (2006) [9]. The standardized iHS is defined as

$$iHS = \frac{\ln\left(\frac{iHH_A}{iHH_D}\right) - E\left[\ln\left(\frac{iHH_A}{iHH_D}\right)\right]}{SD\left[\ln\left(\frac{iHH_A}{iHH_D}\right)\right]}$$

where $iHH_A$ and $iHH_D$ represent the integrated EHH scores for the ancestral and derived core alleles, respectively. The final statistic approximately follows a standard normal distribution [9].
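The standardization step can be sketched as below, assuming the integrated EHH scores for the ancestral and derived alleles are already available for every SNP. The function names are hypothetical. As a simplification, the sketch standardizes across all SNPs at once, whereas Voight et al. estimate the expectation and standard deviation among SNPs with a similar derived-allele frequency.

```python
import math
import statistics

def unstandardized_ihs(ihh_ancestral, ihh_derived):
    """Log-ratio of integrated EHH for the ancestral and derived core alleles."""
    return math.log(ihh_ancestral / ihh_derived)

def standardize(scores):
    """Centre and scale scores to mean 0 and standard deviation 1."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return [(s - mean) / sd for s in scores]
```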

Calculation of Tajima's D

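Tajima's D test considers the difference between the mean pairwise difference and the number of segregating sites in nucleotide polymorphism data [7]. It is expressed as

$$D = \frac{\hat{\pi} - S/a_1}{\sqrt{\widehat{Var}\left(\hat{\pi} - S/a_1\right)}}$$

where $\hat{\pi}$ is the mean pairwise difference, $S$ is the number of segregating sites, $a_1 = \sum_{i=1}^{n-1} \frac{1}{i}$ and $n$ is the number of sequences. The statistic is expected to be zero for neutral variation. It is negative when there is an excess of rare polymorphisms, as caused by a recent selective sweep, and positive when there is an excess of intermediate-frequency variants, which suggests balancing selection for multiple alleles.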
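A compact sketch of the statistic is given below, using the standard variance constants from Tajima's original derivation. It assumes the input is a list of equal-length allele strings (one per sequence) for a single window; the function name is hypothetical and this is not the program used in the study.

```python
import math
from itertools import combinations

def tajimas_d(sequences):
    """Tajima's D for a list of equal-length allele strings."""
    n = len(sequences)
    length = len(sequences[0])
    # S: number of segregating (polymorphic) sites.
    S = sum(len({seq[i] for seq in sequences}) > 1 for i in range(length))
    if S == 0:
        return 0.0  # no variation, nothing to test
    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))
```

A sample in which every variant is a singleton gives a negative value, while a sample split evenly between two haplotypes gives a positive value.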

Identifying potential selection footprints

Separately for each population or population-pair analysis, two different procedures were implemented to determine the significance of the statistic values and to identify potential selection footprints. (1) For XPEHH, iHS and XPCLR, which make use of multiple markers, we followed Voight et al. (2006) [9] and based the empirical cutoffs for the X chromosome on the autosomal cutoffs. We determined empirical cutoffs for the top 5% of signals genome-wide on all autosomes; statistic values on the X chromosome above these thresholds were considered outliers and treated as potential selection footprints. (2) For Fst and Tajima's D, we implemented 5000 permutation tests to establish their empirical distributions. As described by Qanbari et al. [22], in each permutation test we shuffled the allele frequencies randomly across the fixed SNP positions. The threshold values at a significance level of 0.05 from the empirical distribution were used to determine the significance of each statistic. In addition, we also carried out 5000 permutation tests on XPCLR to assess the plausibility of permutation tests for approaches that handle multiple markers.
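Both thresholding procedures can be sketched in a few lines. This is an illustrative outline under assumptions, not the authors' code: the function names are hypothetical, scores are plain lists of numbers, an outlier is taken to be any X-chromosome score at or above the smallest autosomal score within the top fraction, and `statistic` is any user-supplied function that maps the shuffled per-SNP values (in positional order) to one number.

```python
import random

def empirical_cutoff(scores, top_fraction=0.05):
    """Smallest score that still belongs to the top `top_fraction` of scores."""
    ordered = sorted(scores)
    k = max(1, round(len(ordered) * top_fraction))
    return ordered[-k]

def outliers(x_scores, autosomal_scores, top_fraction=0.05):
    """Indices of X-chromosome scores reaching the autosomal cutoff."""
    cutoff = empirical_cutoff(autosomal_scores, top_fraction)
    return [i for i, s in enumerate(x_scores) if s >= cutoff]

def permutation_threshold(values, statistic, n_perm=5000, alpha=0.05, seed=0):
    """Threshold from the null distribution obtained by shuffling per-SNP
    values across the fixed SNP positions and recomputing `statistic`."""
    rng = random.Random(seed)
    shuffled = list(values)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(statistic(shuffled))
    return empirical_cutoff(null, alpha)
```

Shuffling values across positions destroys the correlation between neighbouring SNPs, which is harmless for single-locus statistics but changes the null distribution of statistics that combine several adjacent markers.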

Bioinformatics Analysis

Based on the findings from detection of selection footprints, further bioinformatics analyses were carried out to reveal the potential biological function of genes harbored in selection regions.

Enrichment analysis

Enrichment analysis, covering cellular component, molecular function, biological process and KEGG pathway terms, was performed for the candidate selection regions. Because very little annotation is available for the pig genome, the abundant human genomic information was used to identify genes on the pig genome. BioMart (http://www.biomart.org/) [23] and DAVID 6.7 (http://david.abcc.ncifcrf.gov/) [24] were employed to generate the homologous gene set and to perform the gene enrichment analysis.

Gene annotation

In the analysis, the region of interest (the so-called selection region) for annotation is empirically defined as the chromosome segment obtained by extending each outlier or selection footprint by about 400 kb in both the upstream and downstream directions. For these selection regions, we identified biological functions through the NCBI gene database (http://www.ncbi.nlm.nih.gov/gene/). In addition, we compared these regions with the candidate regions found in our previous genome-wide association study (GWAS) [25].
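The region definition above amounts to extending each outlier position by a flank and merging overlapping intervals. A minimal sketch, assuming positions in base pairs on one chromosome and hypothetical function names:

```python
def merge_regions(footprint_positions, flank=400_000):
    """Extend each footprint by `flank` bp on both sides and merge overlaps.

    Returns a sorted list of non-overlapping (start, end) tuples.
    """
    intervals = sorted((max(0, p - flank), p + flank) for p in footprint_positions)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(region) for region in merged]

def total_length(regions):
    """Summed length in bp of a list of (start, end) regions."""
    return sum(end - start for start, end in regions)
```

For example, footprints at 1.0 Mb and 1.5 Mb merge into a single 1.3 Mb region, while an isolated footprint yields an 800 kb region.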

Results

Information of Markers

After quality control and principal component analysis, 35, 34, 44 individuals and 1163, 1136 and 1159 SNPs corresponding to Landrace, Songliao and Yorkshire were finally retained in this analysis. In order to implement three between-population methods XPEHH, Fst and XPCLR, 1129, 1146 and 1132 common SNPs were separately chosen from the pairs of Landrace-Songliao (L-S for short), Landrace-Yorkshire (L-Y) and Yorkshire-Songliao (Y-S). The average distance of adjacent SNPs corresponding to three breed pairs is 111.60 kb, 109.95 kb and 111.31 kb, respectively.

Empirical Distribution of Test Statistic

The distributions of the test statistics of the three between-population methods for each breed pair, and of the two within-population methods for each breed, were examined. Taking Landrace and the breed pair Landrace-Yorkshire (L-Y) as an example, Fig. 1 plots the distributions of the five test statistics on the X chromosome (red line), the empirical distributions of Fst and Tajima's D (black line) and XPCLR (yellow line) from 5000 permutation tests, and the distributions of XPEHH, iHS and XPCLR on all autosomes (black line). The distributions of XPEHH and iHS on the X chromosome are nearly in accordance with their distributions on the autosomes, and the autosomal values of these two statistics follow the standard normal distribution more closely, as pointed out by Sabeti et al. [4]. Correspondingly, the critical values for iHS at a significance level of 0.05 are 1.96 and −1.96, and those for XPEHH, 1.934 and −2.082, are very close to those of the standard normal distribution. For Fst and Tajima's D, the critical values from the empirical distributions are much stricter, making the detection of selection footprints more convincing. Two procedures were used to generate critical values for XPCLR, but the critical value from the permutation test is so high that no selection footprints were detected. The distribution of XPCLR on the X chromosome is nearly the same as that on all autosomes; therefore, the critical value from the autosomal cutoffs is more reasonable and was used throughout our study. In addition, the distributions of the five test statistics show a similar tendency for the other breeds and breed pairs (Fig. S2 and Fig. S3).
Figure 1

Posterior density of five test statistics.

The distributions of five test statistics on the X chromosome (red line), empirical distributions of Fst, Tajima's D (black line) and XPCLR (yellow line) from 5000 permutation tests, and the distributions of XPEHH, iHS and XPCLR on all autosomes (black line), respectively. Tajima's D and iHS are for Landrace, XPEHH, XPCLR and Fst are for breed pair of Landrace-Yorkshire only.

Selection footprints and regions detected by between- and within-population methods

Table 1 summarizes the selection footprints identified in the three breed pairs (L-S, L-Y and Y-S) by the three between-population methods. With the XPEHH test, for the breed pair Landrace-Songliao (L-S), 27 negative values out of 64 outliers suggest that selection happened in the reference population, Songliao, and the other 37 outliers indicate that selection happened in Landrace. Similarly, 32 outliers were detected in Landrace-Yorkshire (L-Y), of which 5 indicate selection in Yorkshire and 27 in Landrace. Hence, 64 selection footprints in total, including 37 outliers from the L-S pair and 27 outliers from the L-Y pair, were revealed in Landrace. Likewise, 72 and 34 selection footprints were detected in Songliao and Yorkshire, respectively (Table 1).
Table 1

Summary of selection footprints detected by three between-population methods in different pig breed pairs.

| Breed pair¹ | Number of SNPs | Average SNP density (kb) | XPEHH | XPCLR | Fst |
|---|---|---|---|---|---|
| L-S | 1129 | 111.60 | 64 (L 37, S 27)² | 59 (S 41, L 18)² | 1 (L 1, S 1)² |
| L-Y | 1146 | 109.95 | 32 (L 27, Y 5) | 60 (Y 31, L 29) | 3 (L 3, Y 3) |
| Y-S | 1132 | 111.31 | 74 (Y 29, S 45) | 70 (S 33, Y 37) | 1 (Y 1, S 1) |

¹ L-S represents the breed pair of Landrace and Songliao; Y represents Yorkshire.

² The numbers of selection footprints (selection regions for Fst) identified separately in the two breeds of a breed pair are given in brackets.

For the implementation of Gianola's Fst, the whole X chromosome was divided into a series of non-overlapping, consecutive 800-kb windows. Windows containing SNPs with Fst values higher than the empirical critical value at a significance level of 0.05 from the permutation test were treated as potential selection regions. One selection region detected by Fst indicates that selection happened in both breeds of a breed pair; e.g. for the breed pair Landrace-Songliao (L-S), the same selection region was detected in Landrace and Songliao (Table 1). Unlike XPEHH and Fst, XPCLR detects selection footprints separately in the observed population; e.g. for the breed pair Landrace-Songliao (L-S), 41 selection footprints were detected in Landrace when Songliao was regarded as the reference population, and 18 in Songliao when Landrace was the reference population (Table 1). To account for overlapping selection regions, the selection footprints detected by the three between-population methods were merged within each breed (Table 2). Taking Landrace as an example, 64 outliers in total were detected by XPEHH (37 from L-S and 27 from L-Y); after merging the overlapping selection regions harboring those outliers, 11 selection regions were finally identified. Similarly, 11 and 7 selection regions were detected for Songliao and Yorkshire, respectively. On average, each selection region has a length of 1.42 Mb, 1.12 Mb and 1.34 Mb and contains approximately 16.5, 18.3 and 12.3 SNPs in the three breeds, respectively. Likewise, 16, 13 and 17 selection regions were identified by XPCLR in Landrace, Songliao and Yorkshire, respectively, with average lengths of 0.95 Mb, 1.00 Mb and 1.13 Mb and harboring 11.63, 15.46 and 13.70 SNPs each on average. For Fst, 4, 2 and 4 selection regions in total were found in Landrace, Songliao and Yorkshire, each with a fixed length of 800 kb and containing 6.25, 6.00 and 7.25 SNPs on average.
Table 2

Summary of incorporating selection regions in three pig breeds with three between-population methods and two within-population methods.

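| Method | Measure | Landrace | Songliao | Yorkshire |
|---|---|---|---|---|
| XPEHH | Number of regions | 11 | 11 | 7 |
| XPEHH | Average length (Mb) | 1.42 | 1.12 | 1.34 |
| XPEHH | Number of SNPs/region | 16.50 | 18.30 | 12.30 |
| XPCLR | Number of regions | 16 | 13 | 17 |
| XPCLR | Average length (Mb) | 0.95 | 1.00 | 1.13 |
| XPCLR | Number of SNPs/region | 11.63 | 15.46 | 13.70 |
| Fst¹ | Number of regions | 4 | 2 | 4 |
| Fst¹ | Number of SNPs/region | 6.25 | 6.00 | 7.25 |
| iHS | Number of regions | 7 | 10 | 9 |
| iHS | Average length (Mb) | 1.16 | 0.84 | 1.11 |
| iHS | Number of SNPs/region | 20.10 | 13.30 | 21.00 |
| Tajima's D¹ | Number of regions | 4 | 3 | 2 |
| Tajima's D¹ | Number of SNPs/region | 13.25 | 15.00 | 11.50 |

¹ Each selection region has a fixed length of 800 kb.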

Compared with the between-population methods, detecting selection footprints in one population with the within-population methods iHS and Tajima's D was relatively simple. As shown in Table 2, after merging the overlapping selection regions, 7, 10 and 9 selection regions in total were identified by iHS in Landrace, Songliao and Yorkshire, with average lengths of 1.16 Mb, 0.84 Mb and 1.11 Mb and harboring 20.1, 13.3 and 21.0 SNPs each on average. Likewise, Tajima's D was applied within each breed using the same windows as Fst in the breed pairs; 4, 3 and 2 selection regions (balancing selection) were identified, each with a fixed length of 800 kb and containing 13.25, 15.0 and 11.5 SNPs on average, but no positive selection was identified.

The overlap of selection region

Fig. 2 presents a scatter plot showing the distribution of the quantile values (q-values) of the five approaches along physical position on the X chromosome in the three breeds. The majority of selection footprints in the three breeds are concentrated at the two ends of the X chromosome, and a high proportion of them overlap across breeds. Table 3 further shows both the total length of the selection regions identified by each of the five methods and the length of the regions shared by each pair of methods. Taking Landrace as an example, the total lengths of the selection regions detected by XPEHH, XPCLR, Fst, iHS and Tajima's D were about 15.62 Mb, 15.20 Mb, 3.20 Mb, 8.12 Mb and 3.20 Mb, respectively. Among them, overlapping regions of 3.15 Mb, 1.95 Mb and 1.43 Mb correspond to the pairs of between-population methods XPEHH-XPCLR, XPCLR-Fst and XPEHH-Fst. In contrast, there is very little overlap between within-population and between-population methods. The selection regions detected by Tajima's D do not overlap with those detected by XPEHH, XPCLR or Fst. Similarly, only a small proportion of the selection regions identified by iHS overlap with those from XPEHH and XPCLR, and none overlap with those from Fst.
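The pairwise overlap lengths reported for each pair of methods can be obtained by intersecting two lists of regions. A minimal sketch, assuming each list holds non-overlapping (start, end) intervals in base pairs (for instance after merging) and using a hypothetical function name:

```python
def overlap_length(regions_a, regions_b):
    """Total length shared by two lists of (start, end) regions.

    Each list is assumed to contain non-overlapping intervals, so no
    shared segment is counted twice.
    """
    total = 0
    for a_start, a_end in regions_a:
        for b_start, b_end in regions_b:
            total += max(0, min(a_end, b_end) - max(a_start, b_start))
    return total
```

The diagonal of such an overlap matrix, the overlap of a method with itself, is simply the total length of that method's regions.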
Figure 2

Distribution of selection footprints and selection regions on the X chromosome in three pig breeds.

The part above bold line plots the quantile value of selection footprints, and the part below bold line shows the selection regions along the X chromosome. The quantile value is defined as which quartile of the top values of the respective statistic the reported value cuts off.

Table 3

Overlap of selection regions (Mb) from five methods in three pig breeds.

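| Breed | Method | XPEHH | XPCLR | Fst | iHS | Tajima's D |
|---|---|---|---|---|---|---|
| Landrace | XPEHH | 15.62 | 3.15 | 1.43 | 0.80 | 0 |
| Landrace | XPCLR | | 15.20 | 1.95 | 1.21 | 0 |
| Landrace | Fst | | | 3.20 | 0 | 0 |
| Landrace | iHS | | | | 8.12 | 0 |
| Landrace | Tajima's D | | | | | 3.20 |
| Songliao | XPEHH | 12.32 | 1.88 | 0 | 1.29 | 0 |
| Songliao | XPCLR | | 13.00 | 0.18 | 2.77 | 0 |
| Songliao | Fst | | | 1.60 | 0 | 0 |
| Songliao | iHS | | | | 8.40 | 0.74 |
| Songliao | Tajima's D | | | | | 2.40 |
| Yorkshire | XPEHH | 9.38 | 0.56 | 1.82 | 0.88 | 0 |
| Yorkshire | XPCLR | | 19.21 | 2.15 | 0.74 | 0 |
| Yorkshire | Fst | | | 3.20 | 0 | 0 |
| Yorkshire | iHS | | | | 9.99 | 1.16 |
| Yorkshire | Tajima's D | | | | | 1.60 |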

The biological function in selection regions

Based on the selection regions found, orthologous comparison analysis revealed that a total of 166, 132 and 241 genes were harbored in all selection regions in Landrace, Songliao and Yorkshire, respectively. However, further enrichment analysis using DAVID v2.1 [24] indicated that very few functional terms were significant after Benjamini or Bonferroni correction (see Table S1). Wang et al. (2012) reported 5 genes on the X chromosome associated with hematological traits, using the same experimental population as in this study [25]. Of these 5 genes, three fall completely into, and one lies close to, the selection regions identified in this study. Table 4 lists the 5 significant SNPs from their report and the 9 selection regions, spread over the 3 breeds, in which they are involved. This implies that these selection regions might reflect the potential genetic basis of hematological traits in pig. A series of genes that are not completely harbored in, but overlap with, potential selection regions in this study are shown in Table 5. These genes are relevant to reproduction, immunity and meat quality according to the NCBI gene database (http://www.ncbi.nlm.nih.gov/gene/). Among them, ACE2, which inhibits the differentiation of adipocytes [26], overlaps with the potential selection region at 13.06∼13.09 Mb identified by XPEHH and iHS, and ACSL4, a gene related to meat quality, overlaps with the region at 105.37∼105.45 Mb detected by XPEHH [27]. Two genes (ATP1B4 and HTR2C) that overlap with potential selection regions in Yorkshire are relevant to sow reproduction traits, such as infanticide and perinatal development of the embryo [28], [29]. Two other genes (TRPC5 and ZDHHC9) have been reported to be relevant to disease traits in several studies [30], [31]. Table S2 presents in detail the genes located in potential selection regions detected by at least two methods.
Table 4

Selection regions harboring SNPs associated hematological traits reported by Wang et al. (2012).

| Position of outlier SNP | Trait | Selection regions (breed) | Max statistical value (method, q-value) | Candidate gene |
|---|---|---|---|---|
| 3917606 | Mean corpuscular volume | 3418340–4781142 (Y); 3897515–4872159 (S); 3584968–4384968 (S); 3767181–4567181 (L); 3234420–4803149 (L) | 2.32 (iHS, 0.971); 2.57 (XPEHH, 0.982); 1.98 (iHS, 0.943); 2.34 (XPEHH, 0.988); 2.62 (iHS, 0.977) | KAL1 |
| 9272275 | Red blood cell count | 8516875–9391368 (L) | 2.02 (XPEHH, 0.960) | LOC100157657 |
| 43443513 | Platelet count | 40108407–40908406 (L) | 3.59 (Tajima's D, 0.984) | LOC100155983 |
| 54837338 | Plateletcrit | | | LOC100516479 |
| 92131194 | Platelet count | 91878407–92678407 (Y); 92108407–92908406 (S) | 7.66 (XPCLR, 0.998); 3.35 (Tajima's D, 0.975) | LOC100524920 |
Table 5

Some candidate genes located in selection regions.

| Position (Mb) | q-value (statistic, breed) | Candidate gene | Gene function |
|---|---|---|---|
| 122.170∼122.210 | 0.999 (XPCLR, S); 0.995 (XPCLR, Y) | ZDHHC9 | Related with the congenital splay leg [31] |
| 13.060∼13.090 | 0.974 (iHS, Y); 0.982 (XPEHH, L) | ACE2 | Related with the inhibition of the differentiation of adipocytes [39] |
| 14.053∼14.060 | 0.990 (iHS, Y); 0.982 (XPEHH, L); 0.993 (XPCLR, Y) | S100G | Related with the establishment and maintenance of pregnancy in pigs [40] |
| 7.065∼7.273 | 0.996 (iHS, Y); 0.986 (iHS, L); 0.960 (XPEHH, S) | STS | Related with the estrogen actions [41] |
| 16.290∼16.320 | 0.992 (Fst, S); 0.992 (Fst, L) | RS1 | Related with the X-linked juvenile retinoschisis [42] |
| 109.824∼109.827 | 0.997 (XPCLR, Y) | AGTR2 | Related with preeclampsia [43] |
| 1.958∼1.965 | 0.998 (XPEHH, S); 0.994 (XPCLR, S) | OBP | Odorant-binding proteins [44] |
| 112.770∼112.790 | 0.997 (XPCLR, Y) | ATP1B4 | Plays an essential role in perinatal development [28] |
| 105.370∼105.450 | 0.990 (XPEHH, Y) | ACSL4 | Related with pork quality [27] |
| 106.140∼106.280 | 0.992 (XPCLR, S) | TRPC5 | Related with the fight against cardiovascular disease [30] |
| 108.810∼108.870 | 0.995 (XPCLR, Y) | HTR2C | Related with infanticide phenomenon [29] |

Discussion

In the past few years, hunting for genomic evidence of selection has been widely viewed as an effective approach for exploring the potential genetic mechanisms of phenotypic polymorphism and, with the application of high-throughput technology, for providing a more accurate interpretation of evolution [15], [32]. A series of approaches have been proposed to detect selection footprints, each with its own strengths and weaknesses. In this study, we employed five representative methods, XPEHH, XPCLR, Fst, Tajima's D and iHS, to explore selection footprints on the X chromosome. Among them, Fst is effective for detecting selection footprints at a single locus based on population differentiation [10]. XPEHH was proposed to detect selection footprints at fixed or approximately fixed selected loci [4], while XPCLR is more robust in scenarios where the change in allele frequency occurs too quickly [8]. iHS is effective in detecting ongoing selection, but not recently completed selection [5]. Tajima's D is a traditional and well-known within-population method that is sensitive to purifying selection and balancing selection [5]. Furthermore, because XPEHH, XPCLR and iHS each locate a significant core SNP or grid window by utilizing multiple-locus information, they identified more selection regions than Fst and Tajima's D. Most of the methods used to detect selection footprints do not follow strict distributions; e.g. XPEHH and iHS only approximately follow a normal distribution. Therefore, the risk of false positives in a traditional significance test remains high owing to the uncertainty about the null distribution of the test statistic. In addition, a genome-wide scan for selection footprints also raises the problem of multiple testing.
Permutation testing has proved robust and powerful in gene mapping and in the detection of selection footprints, because it establishes the empirical distribution of the test statistic [22], [33]. Our results indicate that permutation testing is suitable for methods that rely only on variation in allele frequency, such as Fst and Tajima's D. In contrast, no selection footprints were detected when the permutation test was applied to XPCLR. Once the allele frequencies of fixed SNPs were shuffled by permutation, the linkage disequilibrium among adjacent SNPs changed accordingly, which introduces bias because XPCLR mainly makes use of information from multiple SNPs. Similarly, permutation testing is unsuitable for the haplotype-based methods XPEHH and iHS, because haplotypes depend heavily on the linkage disequilibrium among SNPs. In addition, permutation testing is computationally demanding for XPEHH and iHS. For multiple-SNP methods, Voight et al. (2006) suggested empirical cutoffs based on the top 1% or 5% of genome-wide values across all autosomes to determine the significance of the test statistic [9]. Our results show that this strategy is more reasonable and saves computing time.

In addition, selection footprints identified by multiple methods are, to some extent, more convincing. Our results indicate that a high proportion of the selection regions identified by the three between-population methods overlap. In particular, the region around 22∼25 Mb was detected as under selection in Landrace and Yorkshire by all three between-population methods, while only a small part of this region was detected as under selection in Songliao, by Tajima's D (Fig 2). Unfortunately, information on the genes harbored in these regions is not yet available (Table S2).
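The two significance strategies discussed above, a permutation null distribution for a frequency-based statistic and an empirical top-percentile cutoff, can be illustrated with a minimal Python sketch. It is not the procedure used in this study: it assumes that genotypes of sows are coded as 0, 1 or 2 copies of one allele, that the permutation shuffles breed labels among individuals, and that Fst takes the simple form (H_T − H_S)/H_T with equal breed weights. All function names are hypothetical.

```python
import random
from typing import Sequence


def allele_freq(genotypes: Sequence[int]) -> float:
    """Frequency of the counted allele; each genotype is 0, 1 or 2 copies."""
    return sum(genotypes) / (2.0 * len(genotypes))


def fst(p1: float, p2: float) -> float:
    """Single-SNP Fst = (H_T - H_S) / H_T for two equally weighted breeds."""
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # heterozygosity of the pooled frequency
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-breed
    return 0.0 if h_t == 0.0 else (h_t - h_s) / h_t


def permutation_pvalue(geno_a: Sequence[int], geno_b: Sequence[int],
                       n_perm: int = 1000, seed: int = 0) -> float:
    """Empirical P-value of the observed Fst under shuffled breed labels."""
    rng = random.Random(seed)
    observed = fst(allele_freq(geno_a), allele_freq(geno_b))
    pooled = list(geno_a) + list(geno_b)
    n_a = len(geno_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign individuals to the two breeds at random
        stat = fst(allele_freq(pooled[:n_a]), allele_freq(pooled[n_a:]))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible P = 0.
    return (hits + 1) / (n_perm + 1)


def empirical_cutoff(values: Sequence[float], top_fraction: float = 0.01) -> float:
    """Smallest statistic that still falls in the top fraction of all values."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    return ranked[k - 1]
```

Under this sketch, a SNP or window would be flagged when its statistic is greater than or equal to the value returned by `empirical_cutoff`, which requires no permutation and therefore leaves the linkage disequilibrium among SNPs untouched.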
Compared with Chinese Songliao, Landrace and Yorkshire share more of their genetic background and have already undergone a relatively long period of adaptive evolution, so that some genes in these regions are nearly fixed, whereas Songliao was bred through hybridization of Duroc, Landrace and Chinese Minzhu over the past three decades. This region might therefore harbor genes relevant to the domestication of European and Chinese pigs, and it deserves deeper investigation.

Our results indicate that LD measured by r² on the X chromosome (0.395 in Landrace, 0.366 in Songliao and 0.381 in Yorkshire) is slightly higher than on autosomes (0.354 in Landrace, 0.363 in Songliao and 0.344 in Yorkshire), even though the average SNP spacing is about 110 kb on the X chromosome but 60 kb on autosomes. This implies that LD on the X chromosome might be much higher at the same SNP density as on autosomes. Genome diversity decreases with high LD, as reported by McVicker et al. (2009); correspondingly, the genes on the X chromosome would experience higher selection pressure [13]. At the same time, we found that the potential selection regions clustered at the two ends of the X chromosome (0–40 Mb and 80–120 Mb). The end on the short arm (0–40 Mb) in particular experienced more selection; this region also overlaps the pseudoautosomal region (PAR) of pig, which Skinner (2013) mapped to the beginning of the short arm, although the exact position of the PAR is still unclear [34]. Genes in the PAR are probably subject to higher selection pressure than autosomal genes, owing to sex-specific dosage compensation (SSDC) [16], [17]. Meanwhile, X-inactivation, also a consequence of SSDC, produces some silent regions on the X chromosome, in which no signal of selection is detected.
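For readers who want to reproduce the kind of LD comparison described above, the following minimal Python sketch computes r² between two biallelic SNPs and averages it over adjacent SNP pairs. It assumes phased haplotypes coded as 0/1 alleles and averaging over neighboring markers; how LD was actually summarized in this study is not stated here, so both choices and the function names are illustrative assumptions.

```python
import math
from typing import List, Sequence


def r_squared(locus_a: Sequence[int], locus_b: Sequence[int]) -> float:
    """r^2 between two SNPs; each sequence holds one 0/1 allele per haplotype."""
    if len(locus_a) != len(locus_b) or len(locus_a) == 0:
        raise ValueError("both loci need the same non-zero number of haplotypes")
    n = len(locus_a)
    p_a = sum(locus_a) / n  # frequency of allele 1 at the first SNP
    p_b = sum(locus_b) / n  # frequency of allele 1 at the second SNP
    p_ab = sum(a * b for a, b in zip(locus_a, locus_b)) / n  # haplotype 1-1
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        return math.nan  # r^2 is undefined when either SNP is monomorphic
    return d * d / denom


def mean_adjacent_r2(loci: List[Sequence[int]]) -> float:
    """Average r^2 over neighboring SNP pairs, skipping undefined pairs."""
    values = [r_squared(loci[i], loci[i + 1]) for i in range(len(loci) - 1)]
    values = [v for v in values if not math.isnan(v)]
    return sum(values) / len(values) if values else math.nan
```

Because r² decays with physical distance, averages computed this way on a sparser marker map (about 110 kb spacing) are expected to understate the LD that would be observed at a denser spacing (about 60 kb), which is the reasoning behind the comparison in the text.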
This could be one explanation for the few selection footprints identified in the central region (∼40–80 Mb) of the X chromosome in our study, even though the SNP density in this region is nearly equal to that at the two ends.

The enrichment analysis of the selection regions identified in this study did not find significant terms after correction, although some terms with a P-value below 0.05 in a single test carried biological information related to hematological traits. For instance, two GO Biological Process terms, GO:0002035∼brain renin-angiotensin system and GO:0002002∼regulation of angiotensin levels in blood, were found for Yorkshire and Landrace (Table S1). These GO terms imply that some hematological traits might have been under selection in the course of evolution and domestication. Consistently, our findings also indicate that significant SNPs associated with hematological traits in our previous study [25] are harbored in selection regions, which to some extent suggests that the hematological traits concerned, including RBC (red blood cell count), MCV (mean corpuscular volume), PLT (platelet count) and PCT (plateletcrit), experienced artificial or natural selection. Hematological traits are usually regarded as important indicators of immune traits. This might indicate that the X chromosome plays an important role in the immune system of the pig, as it does in humans [35].

So far, several studies have been carried out to identify selection footprints in pig [2], [15], [32], [36]. Ai et al. (2013) sampled 18 populations with sample sizes per breed ranging from 5 to 32 [36]. Wilkinson et al. (2013) collected 14 pig breeds with 24–34 individuals per breed [2]. Although these two studies detected selection footprints on the X chromosome as well as on autosomes using Porcine SNP60 BeadChips, pooling boars and sows is not appropriate for the analysis of the X chromosome. Rubin et al.
(2012) pointed out that the X chromosome should be analyzed separately when identifying selection footprints [32], and that only sows should be used, because sex chromosomes and autosomes, and even the two sexes, are subject to different selective pressures and have different effective population sizes [37]. The small sample sizes per breed in Ai et al. (2013) [36] and Wilkinson et al. (2013) [2] clearly made it infeasible to use sows only. Amaral et al. (2011) carried out genome-wide detection of selection footprints by sequencing pooled DNA [15], which makes it even more difficult to analyze the X chromosome separately. In addition, the pool size and sequencing coverage need to be taken into consideration, as pointed out by Cutler et al. [38]. Therefore, it was worthwhile to use a sufficient number of sows to detect selection footprints on the X chromosome in this study.

Scatter plots of the population structure of 113 individuals via principal component analysis. (TIFF)

Posterior density of five test statistics. Tajima's D and iHS are for Songliao; XPEHH, XPCLR and Fst are for the Landrace-Songliao breed pair only. (TIFF)

Posterior density of five test statistics. Tajima's D and iHS are for Yorkshire; XPEHH, XPCLR and Fst are for the Songliao-Yorkshire breed pair only. (TIFF)

The complete list of three breeds' enrichment analysis. (XLSX)

Selection regions identified by more than two methods. (DOCX)
  43 in total

1.  Detecting recent positive selection in the human genome from haplotype structure.

Authors:  Pardis C Sabeti; David E Reich; John M Higgins; Haninah Z P Levine; Daniel J Richter; Stephen F Schaffner; Stacey B Gabriel; Jill V Platko; Nick J Patterson; Gavin J McDonald; Hans C Ackerman; Sarah J Campbell; David Altshuler; Richard Cooper; Dominic Kwiatkowski; Ryk Ward; Eric S Lander
Journal:  Nature       Date:  2002-10-09       Impact factor: 49.962

2.  Dosage compensation of the active X chromosome in mammals.

Authors:  Di Kim Nguyen; Christine M Disteche
Journal:  Nat Genet       Date:  2005-12-11       Impact factor: 38.330

3.  Retinoschisin forms a multi-molecular complex with extracellular matrix and cytoplasmic proteins: interactions with beta2 laminin and alphaB-crystallin.

Authors:  Marie-France Steiner-Champliaud; José Sahel; David Hicks
Journal:  Mol Vis       Date:  2006-08-10       Impact factor: 2.367

Review 4.  Sex chromosome specialization and degeneration in mammals.

Authors:  Jennifer A Marshall Graves
Journal:  Cell       Date:  2006-03-10       Impact factor: 41.582

5.  Statistical method for testing the neutral mutation hypothesis by DNA polymorphism.

Authors:  F Tajima
Journal:  Genetics       Date:  1989-11       Impact factor: 4.562

6.  Empirical threshold values for quantitative trait mapping.

Authors:  G A Churchill; R W Doerge
Journal:  Genetics       Date:  1994-11       Impact factor: 4.562

7.  A regulatory mutation in IGF2 causes a major QTL effect on muscle growth in the pig.

Authors:  Anne-Sophie Van Laere; Minh Nguyen; Martin Braunschweig; Carine Nezer; Catherine Collette; Laurence Moreau; Alan L Archibald; Chris S Haley; Nadine Buys; Michael Tally; Göran Andersson; Michel Georges; Leif Andersson
Journal:  Nature       Date:  2003-10-23       Impact factor: 49.962

8.  Ancient DNA, pig domestication, and the spread of the Neolithic into Europe.

Authors:  Greger Larson; Umberto Albarella; Keith Dobney; Peter Rowley-Conwy; Jörg Schibler; Anne Tresset; Jean-Denis Vigne; Ceiridwen J Edwards; Angela Schlumbaum; Alexandru Dinu; Adrian Balaçsescu; Gaynor Dolman; Antonio Tagliacozzo; Ninna Manaseryan; Preston Miracle; Louise Van Wijngaarden-Bakker; Marco Masseti; Daniel G Bradley; Alan Cooper
Journal:  Proc Natl Acad Sci U S A       Date:  2007-09-13       Impact factor: 11.205

9.  Genome-wide detection and characterization of positive selection in human populations.

Authors:  Pardis C Sabeti; Patrick Varilly; Ben Fry; Jason Lohmueller; Elizabeth Hostetter; Chris Cotsapas; Xiaohui Xie; Elizabeth H Byrne; Steven A McCarroll; Rachelle Gaudet; Stephen F Schaffner; Eric S Lander; International HapMap Consortium
Journal:  Nature       Date:  2007-10-18       Impact factor: 49.962

10.  A map of recent positive selection in the human genome.

Authors:  Benjamin F Voight; Sridhar Kudaravalli; Xiaoquan Wen; Jonathan K Pritchard
Journal:  PLoS Biol       Date:  2006-03-07       Impact factor: 8.029

