Mingzhou Li, Shilin Tian, Carol K L Yeung, Xuehong Meng, Qianzi Tang, Lili Niu, Xun Wang, Long Jin, Jideng Ma, Keren Long, Chaowei Zhou, Yinchuan Cao, Li Zhu, Lin Bai, Guoqing Tang, Yiren Gu, An'an Jiang, Xuewei Li, Ruiqiang Li.
Abstract
Domesticated organisms have experienced strong selective pressures directed at genes or genomic regions controlling traits of biological, agricultural or medical importance. The genomes of native and domesticated pigs provide a unique opportunity for tracing the history of domestication and identifying signatures of artificial selection. Here we used whole-genome sequencing to explore the genetic relationships among the European native pig Berkshire and breeds that are distributed worldwide, and to identify genomic footprints left by selection during the domestication of Berkshire. Numerous genes containing nonsynonymous SNPs fall into olfactory-related categories, which are part of a rapidly evolving superfamily in the mammalian genome. Phylogenetic analyses revealed a deep phylogenetic split between European and Asian pigs rather than between domestic and wild pigs. Admixture analysis showed a higher proportion of Chinese genetic material in the Berkshire pigs, which is consistent with the historical record regarding the breed's origin. Selective sweep analyses revealed strong signatures of selection affecting genomic regions that harbor genes underlying economic traits such as disease resistance, pork yield, fertility, tameness and body length. These discoveries confirm the history of origin of the Berkshire pig by genome-wide analysis and illustrate how domestication has shaped the patterns of genetic variation.
Year: 2014 PMID: 24728479 PMCID: PMC3985078 DOI: 10.1038/srep04678
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Summary and annotation of SNPs in Berkshire pigs
| Category | Number of SNPs |
|---|---|
| | 3,645,294 |
| | 23,796 |
| Missense | 7,713 |
| Stop gain | 44 |
| Stop loss | 16 |
| Synonymous | 14,132 |
| | 792,471 |
| | 118 |
| | 23,597 |
| | 243 |
| | 2,783,164 |
The ANNOVAR package (ref. 56) was used to determine whether SNPs cause protein-coding changes and which amino acids are affected. ‘Upstream’ means that a variant overlaps with the 1 kb region upstream of the gene start site. ‘Stop gain’ means that an nsSNP leads to the creation of a stop codon at the variant site. ‘Stop loss’ means that an nsSNP leads to the elimination of a stop codon at the variant site. ‘Splicing’ means that a variant is within 2 bp of a splice junction. ‘Downstream’ means that a variant overlaps with the 1 kb region downstream of the gene end site. ‘Upstream/Downstream’ means that a variant is located in both a downstream and an upstream region (possibly of two different genes).
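The flanking-region definitions in the note above can be illustrated with a minimal sketch. It is not ANNOVAR's actual implementation: the `Gene` record, the function names and the coordinates are hypothetical, and only the 1 kb upstream and downstream rules from the note are modelled.

```python
from dataclasses import dataclass
from typing import List, Optional

FLANK = 1_000  # 1 kb flanking region, as stated in the table note


@dataclass(frozen=True)
class Gene:
    name: str    # hypothetical identifier
    start: int   # leftmost coordinate, inclusive
    end: int     # rightmost coordinate, inclusive
    strand: str  # "+" or "-"


def flank_label(pos: int, gene: Gene) -> Optional[str]:
    """Return 'upstream', 'downstream', or None for one variant/gene pair."""
    left = gene.start - FLANK <= pos < gene.start
    right = gene.end < pos <= gene.end + FLANK
    if not (left or right):
        return None
    # On the minus strand the gene start is the rightmost coordinate,
    # so the left flank is downstream and the right flank is upstream.
    if gene.strand == "+":
        return "upstream" if left else "downstream"
    return "downstream" if left else "upstream"


def classify(pos: int, genes: List[Gene]) -> str:
    """Combine per-gene labels into one flanking category for the variant."""
    labels = {flank_label(pos, g) for g in genes} - {None}
    if labels == {"upstream", "downstream"}:
        return "upstream/downstream"
    return labels.pop() if labels else "other"


# Hypothetical genes: a variant between them is downstream of one
# and upstream of the other.
genes = [Gene("geneA", 5_000, 6_000, "+"), Gene("geneB", 6_500, 8_000, "+")]
print(classify(6_200, genes))
```

The sketch ignores exonic, intronic and splicing categories, which would require exon coordinates that are not described in the note.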
Figure 1. Phylogenetic relationship and gene introgression.
(a) Two-way PCA plot of pig breeds. The fraction of the variance explained is 33.56% for eigenvector 1 and 9.56% for eigenvector 2, with a Tracy-Widom P value < 10⁻⁶ (Supplementary Table S5). (b) NJ phylogenetic tree of pig breeds. The scale bar represents p distance. (c) Four-taxon ABBA/BABA test of introgression. First panel from the left: the ABBA and BABA nucleotide sites employed in the test are derived (- - B -) in Chinese domestic pigs compared with the warthog outgroup (- - - A), but differ between Berkshire and the other five European domestic pigs (either ABBA or BABA). As this almost exclusively restricts attention to sites polymorphic in the ancestor of Chinese domestic pigs, Berkshire and the other five European domestic pigs, equal numbers of ABBA and BABA sites are expected under a null hypothesis of no introgression, as depicted in the two gene genealogies. Second through last panels from the left: distribution among chromosomes of the D-statistic (± s.e.), which measures the excess of ABBA sites over BABA sites, here for the comparison: other five European domestic pigs (i.e. Duroc, Landrace, Pietrain, Large White and Hampshire), Berkshire, Chinese domestic pigs, African warthog.
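The D-statistic in panel (c) can be sketched as D = (ABBA − BABA) / (ABBA + BABA), the standard form for a four-taxon test. The example below assumes one allele call per taxon per site, ordered as (other European pigs, Berkshire, Chinese domestic pigs, warthog); the function name and the toy sites are hypothetical, and the standard error shown in the figure is not computed here.

```python
def d_statistic(sites):
    """Count ABBA and BABA sites and return (abba, baba, D).

    Each site is a tuple (p1, p2, p3, outgroup) of single allele calls.
    The outgroup allele is treated as ancestral ("A"); any other allele
    is treated as derived ("B").
    """
    abba = baba = 0
    for p1, p2, p3, out in sites:
        if p3 == out:
            continue  # the test only uses sites where P3 is derived
        if p1 == out and p2 == p3:
            abba += 1  # P2 shares the derived allele with P3
        elif p2 == out and p1 == p3:
            baba += 1  # P1 shares the derived allele with P3
    if abba + baba == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return abba, baba, (abba - baba) / (abba + baba)


# Toy data (hypothetical): three ABBA sites, one BABA site,
# and two uninformative sites that are skipped.
toy_sites = [
    ("A", "G", "G", "A"),
    ("A", "G", "G", "A"),
    ("A", "G", "G", "A"),
    ("G", "A", "G", "A"),
    ("A", "A", "G", "A"),
    ("G", "G", "G", "A"),
]
print(d_statistic(toy_sites))
```

Under the null hypothesis of no introgression D is expected to be near zero; a positive value indicates an excess of ABBA sites, that is, more derived alleles shared between the second and third taxa.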
Figure 2. Identification of genomic regions with strong selective sweep signals in Berkshire pigs.
(a) LD patterns of Berkshire and European wild boars. (b) Distribution of log2(θπ ratio), where the θπ ratio is θπ, wild boar/θπ, Berkshire, and of FST, both calculated in 100 kb windows sliding in 10 kb steps. Data points located to the right of the vertical line (corresponding to the 10% right tail of the empirical log2(θπ ratio) distribution, where log2(θπ ratio) is 3.14) and above the horizontal line (the 10% right tail of the empirical FST distribution, where FST is 0.71) were identified as selected regions for Berkshire pigs (red points). (c) Violin plot of θπ ratio and FST values for regions of Berkshire pigs that have undergone positive selection versus the whole genome. The width of each “violin” depicts a 90°-rotated kernel density trace and its reflection. Vertical black boxes denote the interquartile range (IQR) between the first and third quartiles (25th and 75th percentiles, respectively) and the white point inside denotes the median. Vertical black lines denote the lowest and highest values within 1.5 times the IQR from the first and third quartiles, respectively. Statistical significance was calculated by the Mann-Whitney U test.
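The window scan in panel (b) can be sketched as follows. The window size, step and the two cutoffs (3.14 and 0.71) come from the caption; the function names, the half-open window convention and the per-window values are assumptions made for illustration only.

```python
import math


def sliding_windows(chrom_len, size=100_000, step=10_000):
    """Yield (start, end) half-open windows that fit fully on the chromosome."""
    for start in range(0, chrom_len - size + 1, step):
        yield start, start + size


def log2_pi_ratio(pi_wild_boar, pi_berkshire):
    """log2 of the diversity ratio; large values mean reduced diversity in Berkshire."""
    return math.log2(pi_wild_boar / pi_berkshire)


def candidate_windows(stats, ratio_cut=3.14, fst_cut=0.71):
    """Keep windows that exceed both cutoffs.

    stats is an iterable of (window, log2_ratio, fst) tuples.
    """
    return [w for w, ratio, fst in stats if ratio > ratio_cut and fst > fst_cut]


# Hypothetical per-window statistics for a 120 kb stretch.
windows = list(sliding_windows(120_000))
stats = [
    (windows[0], 3.5, 0.80),  # passes both cutoffs
    (windows[1], 3.5, 0.60),  # FST too low
    (windows[2], 2.0, 0.90),  # log2 ratio too low
]
print(candidate_windows(stats))
```

Requiring both statistics to exceed their cutoffs keeps only windows that show reduced diversity in Berkshire together with strong differentiation from wild boar.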
Top ten functional gene categories enriched for genes affected by domestication
| Category | Term description | Involved gene number | P value |
|---|---|---|---|
| GO-BP: 0051607 | Defense response to virus | 4 | 0.001 |
| InterPro: 013151 | Immunoglobulin | 11 | 0.001 |
| KEGG-pathway: 04722 | Neurotrophin signaling pathway | 8 | 0.003 |
| GO-BP: 0040008 | Regulation of growth | 11 | 0.003 |
| InterPro: 007110 | Immunoglobulin-like | 16 | 0.004 |
| KEGG-pathway: 04114 | Oocyte meiosis | 6 | 0.004 |
| GO-BP: 0009615 | Response to virus | 6 | 0.005 |
| GO-BP: 0003006 | Reproductive developmental process | 9 | 0.005 |
| GO-BP: 0045137 | Development of primary sexual characteristics | 6 | 0.005 |
| GO-MF: 0005267 | Potassium channel activity | 8 | 0.013 |
P values (i.e. EASE scores), indicating the significance of the overlap between various gene sets, were calculated using a Benjamini-corrected modified Fisher's exact test. A complete list of categories and gene names is provided in Supplementary Data S5.
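A minimal sketch of a modified Fisher's exact test follows, assuming the EASE score is defined as in the DAVID tool: a one-tailed Fisher's exact P value computed after subtracting one from the number of list genes in the category. The function names and the counts are hypothetical, and the Benjamini correction is not shown.

```python
from math import comb


def hypergeom_upper_tail(k, list_total, pop_hits, pop_total):
    """P(X >= k) when list_total genes are drawn from pop_total genes,
    of which pop_hits belong to the category (one-tailed Fisher's exact test)."""
    denom = comb(pop_total, list_total)
    upper = min(list_total, pop_hits)
    numer = sum(
        comb(pop_hits, i) * comb(pop_total - pop_hits, list_total - i)
        for i in range(max(k, 0), upper + 1)
    )
    return numer / denom


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Conservative variant: remove one hit before computing the tail probability."""
    return hypergeom_upper_tail(list_hits - 1, list_total, pop_hits, pop_total)


# Hypothetical counts: 3 of 300 list genes fall in a category that
# contains 40 of 30,000 background genes.
print(hypergeom_upper_tail(3, 300, 40, 30_000))  # unmodified Fisher P value
print(ease_score(3, 300, 40, 30_000))            # EASE score, always >= the above
```

Subtracting one hit penalises categories supported by very few genes, so the EASE score is never smaller than the unmodified P value.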
Figure 3. Genes related to body length with strong selective sweep signals in Berkshire pigs.
(a) Log2(θπ ratio), where the θπ ratio is θπ, wild boar/θπ, Berkshire, and FST values are plotted using a 10 kb sliding window for genes embedded in selected regions. Genomic regions located above the upper horizontal blue line (corresponding to a 10% significance level of FST, where FST = 0.71) and above the lower horizontal red line (a 10% significance level of θπ ratio, where log2(θπ ratio) = 3.14) were termed regions with strong selective sweep signals (green regions). Genome annotations are shown at the bottom (black bar: coding sequences, blue bar: genes). The boundaries of ten genes related to body length are marked in red. (b) NR6A1 gene with strong selective sweep signals. Out of 482 genes embedded in selected regions, which crossed 1,144 windows of 100 kb in length sliding in 10 kb steps, only one gene (i.e. NR6A1) is embedded in the most significantly selected regions (1% right tails of the log2(θπ ratio) and FST distributions; log2(θπ ratio) = 6.72; FST = 0.91).