Beatriz Gutiérrez-Gil, Cristina Esteban-Blanco, Pamela Wiener, Praveen Krishna Chitneedi, Aroa Suarez-Vega, Juan-Jose Arranz.
Abstract
BACKGROUND: To identify selection signals in three Merino sheep lines that are highly specialized for fine wool production (Australian Industry Merino, Australian Merino and Australian Poll Merino), and considering that these lines have been selected not only for wool traits but also for growth and carcass traits and parasite resistance, we contrasted the OvineSNP50 BeadChip (50 K-chip) pooled genotypes of these Merino lines with the genotypes of a phylogenetically related coarse-wool breed, the Spanish Churra dairy sheep. Genome re-sequencing datasets of the two breeds were analyzed to further explore the genetic variation of the regions initially identified as putative selection signals.
Year: 2017 PMID: 29115919 PMCID: PMC5674817 DOI: 10.1186/s12711-017-0354-x
Source DB: PubMed Journal: Genet Sel Evol ISSN: 0999-193X Impact factor: 4.297
Fig. 1 Sheep breeds selected for this study, Australian Merino (left) and Spanish Churra (right). Original images taken from Wikipedia (https://commons.wikimedia.org/w/index.php?curid=12599612; https://commons.wikimedia.org/w/index.php?curid=12174588)
Fig. 2 Results of the genetic differentiation analysis (a) and of the analysis of reduced heterozygosity in the Australian Merino populations studied here (Australian Industry Merino, Australian Merino and Australian Poll Merino) (b) and in Spanish Churra sheep (c). a FST values obtained across the whole genome (averaged in sliding 9-SNP windows) when contrasting the 50 K-chip pooled genotypes of the Australian Merino and Spanish Churra sheep samples considered in this work. The horizontal line indicates the top 0.5% threshold of the FST distribution. b, c Genome-wide distribution of observed heterozygosity values (averaged over a sliding 9-SNP window) for the pooled genotypes of the three Australian Merino populations (b) and Spanish Churra (c). The horizontal lines indicate the bottom 0.5% thresholds of the heterozygosity distributions
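The windowing and thresholding described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the window is applied to a single ordered list of per-SNP values (chromosome boundaries are ignored for brevity), and the cut-off is taken by simple ranking rather than by any particular percentile estimator.

```python
from statistics import mean


def sliding_window_means(values, window=9):
    """Average per-SNP values over every run of `window` consecutive SNPs.

    Assumption: `values` is already ordered by genomic position and belongs
    to one chromosome; the source does not say how boundaries were handled.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def top_fraction_threshold(scores, fraction=0.005):
    """Return the smallest score that still falls in the top `fraction`."""
    ranked = sorted(scores, reverse=True)
    n_top = max(1, int(len(ranked) * fraction))  # keep at least one window
    return ranked[n_top - 1]


def bottom_fraction_threshold(scores, fraction=0.005):
    """Return the largest score that still falls in the bottom `fraction`."""
    ranked = sorted(scores)
    n_bottom = max(1, int(len(ranked) * fraction))
    return ranked[n_bottom - 1]
```

Under these assumptions, windows whose mean FST is at or above `top_fraction_threshold(...)` would be flagged as differentiation signals, and windows whose mean heterozygosity is at or below `bottom_fraction_threshold(...)` as reduced-heterozygosity signals.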
Fig. 3 Results of the selection sweep mapping analyses performed with the two haplotype-based methods used in this work, using the hapFLK (a) and the rehh (XP-EHH analysis) (b) software. The genome-wide distribution of the log (1/P value) obtained from each analysis is represented on the Y-axis. The horizontal lines represent the significance threshold considered (P < 0.001)
Convergence regions identified in this study based on the overlapping of the results of the four mapping analyses performed to identify selection sweeps between Churra and Australian Merino breeds
| CCR (a) | SS (b) | Chr (c) | CCR flanking markers | Start position (bp) | End position (bp) | XP-EHH value (d) |
|---|---|---|---|---|---|---|
| 1 | FST-SS4 | 2 | | 51898098 | 53597080 | |
| | Churra-ObsHtz-SS5 | 2 | | 51898098 | 52997998 | |
| | Churra-ObsHtz-SS6 | 2 | | 53366034 | 53670410 | |
| 2 | FST-SS5 | 2 | | 79017511 | 79189919 | |
| 3 | FST-SS6 | 3 | | 151512221 | 151778900 | |
| | FST-SS7 | 3 | | 152215311 | 152227684 | |
| 4 | FST-SS8 | 3 | | 152795421 | 153090551 | |
| | FST-SS9 | 3 | | 153459890 | 153519437 | |
| 5 | FST-SS10 | 3 | | 154069702 | 154522600 | |
| 6 | FST-SS11 | 3 | | 155167107 | 155252399 | |
| 7 | Churra-ObsHtz-SS23 | 3 | | 179832455 | | |
| 8 | Churra-ObsHtz-SS24 | 3 | | 182778735 | 182916410 | |
| 9 | Churra-ObsHtz-SS25 | 3 | | 183347210 | 183368930 | |
| | Merino-ObsHtz-SS22 | 3 | | 183368930 | | |
| 10 | FST-SS15 | 3 | | 188276666 | | |
| 11 | Merino-ObsHtz-SS35 | 6 | | 36461468 | 36655091 | |
| | | | | | | − |
| 12 | FST-SS24 | 6 | | 37164263 | | |
| | | | | | | − |
| | Merino-ObsHtz-SS36 | 6 | | 37987281 | | |
| | Merino-ObsHtz-SS37 | 6 | | 38214088 | | |
| | Merino-ObsHtz-SS38 | 6 | | 38417881 | 38481174 | |
| 13 | Churra-ObsHtz-SS37 | 8 | | 32849509 | 32979538 | |
| 14 | Churra-ObsHtz-SS38 | 8 | | 37211967 | 37313171 | |
| 15 | FST-SS29 | 10 | | 29353089 | 29713193 | |
| | Churra-ObsHtz-SS42 | 10 | | 29476678 | 29688513 | |
| 16 | Merino-ObsHtz-SS52 | 11 | | 26512466 | 26939891 | |
| | | | | | | − |
| 17 | Merino-ObsHtz-SS70 | 15 | | 74618189 | 74636302 | |
| | XPEHH-SS95 | 15 | | 74618189 | 74636302 | − |
| 18 | FST-SS49 | 25 | | 7599609 | 7608913 | |
| | | | | | | − |
After defining the selection signals identified by the different selection sweep mapping methods considered in our study, i.e. differentiation analysis (FST-SS), identification of regions of reduced heterozygosity (ObsHtz-SS) and the haplotype-based hapFLK and XP-EHH analyses (hapFLK-SS and XPEHH-SS), the corresponding intervals were compared, and convergence candidate regions (CCR) were defined when at least one haplotype-based method coincided with either of the two other analyses performed.
(a) Convergence candidate regions defined based on the convergence of selection signals identified in this study
(b) Selection signals identified by the four analysis methods used in this study: the methods based on the estimation of FST and observed heterozygosity (ObsHtz) and the two methods based on haplotype analysis (hapFLK and XPEHH). Note that the signals identified by the haplotype-based methods are indicated in italics. A region was labeled as a CCR only when at least one significant haplotype-based SS (identified by the hapFLK or the XPEHH analyses) overlapped with at least one SS identified by either of the two other methods (FST or ObsHtz-based analyses)
(c) Chromosome
(d) For the SS identified with the XP-EHH test, the most extreme XP-EHH estimate is provided. Note that positive and negative (negative highlighted in bold font) estimates indicate selection in the Churra and Merino populations, respectively
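The convergence rule stated in the table notes (at least one haplotype-based signal overlapping at least one FST or ObsHtz signal) can be sketched as a simple interval-overlap check. This is an illustration under assumptions: the dictionary layout, the method labels and the function names are hypothetical, and the sketch does not reproduce how the authors set the final CCR boundaries from flanking markers. The three example signals use coordinates listed in the table above.

```python
HAPLOTYPE_METHODS = {"hapFLK", "XPEHH"}  # illustrative labels, not from any library


def overlaps(a, b):
    """True when two signals share a chromosome and at least one base pair."""
    return (a["chrom"] == b["chrom"]
            and a["start"] <= b["end"]
            and b["start"] <= a["end"])


def convergent_pairs(signals):
    """Pair every haplotype-based signal with each overlapping FST/ObsHtz signal."""
    haplotype = [s for s in signals if s["method"] in HAPLOTYPE_METHODS]
    other = [s for s in signals if s["method"] not in HAPLOTYPE_METHODS]
    return [(h["name"], o["name"])
            for h in haplotype for o in other if overlaps(h, o)]


signals = [
    {"name": "Merino-ObsHtz-SS70", "method": "ObsHtz", "chrom": 15,
     "start": 74618189, "end": 74636302},
    {"name": "XPEHH-SS95", "method": "XPEHH", "chrom": 15,
     "start": 74618189, "end": 74636302},
    {"name": "FST-SS49", "method": "FST", "chrom": 25,
     "start": 7599609, "end": 7608913},
]

print(convergent_pairs(signals))  # [('XPEHH-SS95', 'Merino-ObsHtz-SS70')]
```

In this toy input only the chromosome 15 pair qualifies; FST-SS49 has no haplotype-based partner among the three signals supplied, so it would not form a CCR on its own.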
Correspondence of the 18 convergence candidate regions (CCR) identified as putative selection signals for Churra and Australian Merino sheep populations with previously reported signatures of selection
| Present study | | Other studies | | |
|---|---|---|---|---|
| Region | Genomic interval (Mb) | Correspondence with other studies | Putative candidate genes according to other studies | Population (target trait) |
| CCR1 | Chr2: 51.659–53.837 | OAR2: 52.266–52.454 | | Zel-Lori Bakhtiri and HapMap dataset [ |
| | | OAR2: 52.40 (peak SNP) | | HapMap dataset [ |
| | | OAR2: 51.41–53.44 | | HapMap dataset [ |
| | | OAR2: 51.72–51.95 | | HapMap dataset [ |
| | | OAR2: 51.200–52.100; 52.100–52.900; 53.60–54.5800 | | Duolang sheep [ |
| CCR2 | Chr2: 78.854–79.190 | | | |
| CCR3 | Chr3: 151.088–152.393 | OAR3: 150.5–154.2 | | Spanish breeds [ |
| | | OAR3: 151.42–156.93 | | HapMap dataset [ |
| | | OAR3: 152.68–154.679 | | HapMap dataset [ |
| CCR4 | Chr3: 152.545–153.519 | OAR3: 150.5–154.2 | | Spanish breeds [ |
| | | OAR3: 152.68–154.679 | | HapMap dataset [ |
| CCR5 | Chr3: 154.007–154.523 | OAR3: 154.213 (peak SNP) | | HapMap dataset [ |
| | | OAR3: 154.79–154.93 | | HapMap dataset [ |
| | | OAR3: 151.42–156.93 | | HapMap dataset [ |
| | | OAR3: 150.5–154.2 | | Spanish breeds [ |
| | | OAR3: 152.68–154.679 | | HapMap dataset [ |
| CCR6 | Chr3: 154.638–158.339 | OAR3: 154.79–154.93 | | HapMap dataset [ |
| CCR7 | Chr3: 179.816–180.129 | | | |
| CCR8 | Chr3: 182.779–182.916 | OAR3: 182.00–184.00 | | Duolang sheep [ |
| CCR9 | Chr3: 183.347–183.430 | OAR3: 182.00–184.00 | | Duolang sheep [ |
| CCR10 | Chr3: 187.634–188.482 | | | |
| CCR11 | Chr6: 36.461–36.914 | OAR6: 36.073 (peak SNP) | | HapMap dataset [ |
| | | OAR6: 34.71–39.12 | | HapMap dataset [ |
| | | OAR6: 36.63–36.8 | | HapMap dataset [ |
| | | OAR6: 36.200–36.500 | | Duolang sheep [ |
| | | OAR6: 30.367–41.863 | | HapMap dataset [ |
| CCR12 | Chr6: 37.164–38.580 | OAR6: 34.71–39.12 | | |
| | | OAR6: 37.2–38.0 | | HapMap dataset [ |
| | | OAR6: 37.40–37.60 | | HapMap dataset [ |
| | | | | Small-tailed Han sheep [ |
| | | OAR6: 30.367–41.863 | | HapMap dataset [ |
| CCR13 | Chr8: 32.779–33.477 | OAR8: 32.159 (peak SNP) | | HapMap dataset [ |
| CCR14 | Chr8: 37.075–37.423 | | | |
| CCR15 | Chr10: 29.344–29.713 | OAR10: 29.476 (peak SNP) | | HapMap dataset [ |
| | | OAR10: 29.1–29.3 | | Spanish breeds [ |
| | | OAR10: 28.50–30.50 | | HapMap dataset [ |
| | | OAR10: 28.71–29.00 | | HapMap dataset [ |
| | | | | HapMap dataset [ |
| | | | | HapMap dataset [ |
| | | OAR10: 27.1–31.2 | | Small-tailed Han sheep [ |
| | | OAR10: 29.1–31.9 | | Duolang sheep [ |
| | | OAR10: 29.40–29.700 | | |
| | | OAR10: 29.50–29.400 | | |
| CCR16 | Chr11: 26.512–26.940 | OAR11: 24.18–38.74 | | HapMap dataset [ |
| | | OAR11: 26.8–29.9 | | Barki sheep versus temperate breeds (hot arid environment) [ |
| | | | | Small-tailed Han sheep [ |
| CCR17 | Chr15: 74.618–74.636 | OAR15: 72.774–74.55 | | HapMap dataset [ |
| CCR18 | Chr25: 7.356–7.821 | OAR25: 7.517 (peak SNP) | | HapMap dataset [ |
| | | OAR25: 7.400–7.600 | | Duolang sheep [ |
Correspondence between the convergence candidate regions (CCR) identified in the core analyses between Australian Merino and Churra breeds (labeled as CCR1 to CCR18), with the CCR identified in the validation analyses performed by contrasting a small dataset of Spanish Merino and Churra sheep genotypes (labeled as CCR101 to CCR118)
| CCR AustralianMerino-Churra | | | CCR SpanishMerino-Churra | | |
|---|---|---|---|---|---|
| Region | Genomic region | Most extreme XPEHH value (a) | Region | Genomic region | Most extreme XPEHH value (a) |
| CCR1 | Chr2: 51.659–53.837 | 6.297 | CCR101 | Chr2: 51.530–53.798 | 4.282 |
| CCR2 | Chr2: 78.854–79.190 | 4.571 | | | |
| CCR3 | Chr3: 151.088–152.393 | 5.232 | CCR102 | Chr3: 151.433–152.055 | 3.648 |
| CCR4 | Chr3: 152.545–153.519 | 6.651 | CCR103 | Chr3: 152.855–152.861 | 3.560 |
| CCR5 | Chr3: 154.007–154.523 | 4.324 | | | |
| CCR6 | Chr3: 154.638–158.339 | 5.409 | | | |
| CCR7 | Chr3: 179.816–180.129 | 4.066 | | | |
| CCR8 | Chr3: 182.779–182.916 | 3.373 | | | |
| CCR9 | Chr3: 183.347–183.430 | 4.061 | | | |
| CCR10 | Chr3: 187.634–188.482 | 4.323 | | | |
| | | | CCR104 | Chr4: 30.499–30.929 | − |
| CCR11 | Chr6: 36.461–36.914 | − | | | |
| CCR12 | Chr6: 37.164–38.580 | − | CCR105 | Chr6: 38.181–38.255 | − |
| | | | CCR106 | Chr6: 38.429–38.617 | − |
| | | | CCR107 | Chr8: 31.613–31.699 | |
| CCR13 | Chr8: 32.779–33.477 | 4.846 | CCR108 | Chr8: 32.364–32.597 | |
| | | | CCR109 | Chr8: 33.676–34.622 | |
| | | | CCR110 | Chr8: 34.791–35.740 | |
| CCR14 | Chr8: 37.075–37.423 | 4.194 | | | |
| | | | CCR111 | Chr8: 51.730–52.676 | − |
| | | | CCR112 | Chr8: 52.997–54.352 | − |
| | | | CCR113 | Chr8: 59.193–60.187 | − |
| CCR15 | Chr10: 29.344–29.713 | 3.716 | | | |
| | | | CCR114 | Chr10: 51.490–52.154 | − |
| | | | CCR115 | Chr10: 52.389–52.670 | − |
| CCR16 | Chr11: 26.512–26.940 | − | | | |
| | | | CCR116 | Chr15: 37.553–37.776 | 4.543 |
| | | | CCR117 | Chr15: 38.783–38.943 | 3.734 |
| CCR17 | Chr15: 74.618–74.636 | − | | | |
| CCR18 | Chr25: 7.356–7.821 | − | CCR118 | Chr25: 7.356–7.970 | − |
(a) For the CCR including a selection signal identified by the XP-EHH test, the most extreme XP-EHH estimate is provided. Positive and negative (highlighted in bold font) XP-EHH estimates indicate selection in the Churra and Merino populations, respectively
Fig. 4 Results of the association analysis performed for the 135,061 SNPs from the processing of 28 whole-genome sequencing samples of Churra and Australian Merino sheep breeds with the aim of identifying the markers with the most divergent allele frequencies between the two breeds compared. The genome-wide distribution of the log (1/P value) obtained from the association analysis with the breed identity is represented on the Y-axis. The horizontal line represents the significance threshold considered after a Bonferroni correction for multiple testing (P < 0.05/139,244, i.e. P value < 0.000000359; log (1/P value) = 6.44)
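The Bonferroni threshold quoted in this caption can be checked with a few lines of arithmetic. The sketch below only reproduces the numbers given in the caption (0.05 and 139,244 tests); the function name is hypothetical, and a base-10 logarithm is assumed because it matches the reported value of 6.44.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Return the per-test P value cut-off and its log10(1/P) equivalent."""
    p_cutoff = alpha / n_tests
    return p_cutoff, math.log10(1 / p_cutoff)


# Values taken from the caption: nominal 0.05 level, 139,244 tests.
p_cutoff, log_score = bonferroni_threshold(0.05, 139_244)
print(f"{p_cutoff:.3g}")      # 3.59e-07
print(round(log_score, 2))    # 6.44
```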
Characterization of the three missense mutations identified in this study
| Features | Missense mutations identified as divergent variants based on the analysis of whole-genome sequence datasets from Churra and Australian Merino samples | | | |
|---|---|---|---|---|
| SNP position (Oar_v3.1) | 52,429,848 | 37,308,727 | 37,355,721 | 37,356,400 |
| Chromosome | 2 | 6 | 6 | 6 |
| dbSNP_ID | | | | |
| Gene | | | | |
| Ref. (Texel Oar_v3.1) → Alt (b) | T → C | C → T | T → A | A → T |
| Position in CDS | c.2540 | c.1754 | c.4321 (c) | c.3642 (c) |
| Base pair substitution in CDS | T → C | C → T | A → T | T → A |
| Breed (mutant allele) (d) | Merino | Merino | Churra | Merino |
| Codon change | cAc → cGc | TCC → TTC | ATA → TTA | GAT → GAA |
| Amino acid change | Histidine (H) → Arginine (R) | Serine (S) → Phenylalanine (F) | Isoleucine (I) → Leucine (L) | Aspartate (D) → Glutamate (E) |
| Protein change | | | | |
| Functional impact (ensemblVEP_Oarv3.1) | Moderate | Moderate | Moderate | Moderate |
| Functional impact (Polyphen-2) | Benign | Benign (score = 0.252; sensitivity: 0.91; specificity: 0.88) | Benign | Benign |
| Functional impact (SIFT_Oarv3.1) | Tolerated | Deleterious | Tolerated (low confidence) | Tolerated |
| Properties of wild-type amino acid | Moderate hydropathy, charge “+” | Hydrophilic, polar, no charge | Hydrophobic, no charge | Hydrophilic, charge “−” |
| Properties of mutant amino acid | Hydrophilic, charge “+” | Hydrophobic, apolar, no charge | Hydrophobic, no charge | Hydrophilic, charge “−” |
| Churra genotypes | TT (15) | CC (14), CT (1) | AT (1), TT (14) | AA (14), AT (1) |
| Australian Merino genotypes | CC (9), TC (4) | TT (9), TC (3), CC (1) | AA (9), AT (3), TT (1) | TT (9), TA (3), AA (1) |
(a) Mutation initially annotated within the ENSOARG00000004249 novel gene (Oar_3.1). BLASTN analyses showed correspondence with the human LCORL gene and the ovine LCOR according to the most recent version of the sheep genome (Oar_v4.0)
(b) Ref. (Texel Oar_v3.1) → Alt: Reference and alternative alleles, respectively, identified in the analysis of the whole-genome sequence datasets
(c) Position of the SNP in the coding sequence based on the alignment of the sequence harboring the mutation to the annotation of the LCORL gene in the most recent version of the sheep genome (Oar_v4.0): NCBI Reference sequences: XM_015096407.1, XP_014951893.1 (ligand-dependent nuclear receptor corepressor-like protein isoform X1)
(d) Breed with the highest frequency for the mutant allele (relative to the wild-type protein sequence). Note that for SNP rs419074913, the Texel sheep of the reference genome harbors the mutant allele according to the CDS and protein sequence
Fig. 5 Multi-species alignment of the amino-acid sequence of the NCAPG protein across ruminant species, humans and mouse using Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/). The serine amino acid affected by the NCAPG_c.1754C > T mutation shows a high conservation level in all species considered