Elisa Somenzi, Paolo Ajmone-Marsan, Mario Barbato.
Abstract
Hybridisation of wild populations with their domestic counterparts can lead to the loss of wildtype genetic integrity, outbreeding depression, and loss of adaptive features. The Mediterranean island of Sardinia hosts one of the last extant autochthonous European mouflon (Ovis aries musimon) populations. Although conservation policies, including reintroduction plans, have been enforced to preserve Sardinian mouflon, crossbreeding with domestic sheep has been documented. We identified panels of single nucleotide polymorphisms (SNPs) that could act as ancestry informative markers able to assess admixture in feral × domestic sheep hybrids. Medium-density SNP array genotyping data from Sardinian mouflon and domestic sheep (O. aries aries) of pure ancestry were used as references. We applied a two-step selection algorithm to these data, consisting of preselection via Principal Component Analysis followed by a supervised machine learning classification method based on random forest, to develop SNP panels of various sizes. We generated ancestry informative marker (AIM) panels and tested their ability to assess admixture in mouflon × domestic sheep hybrids in both simulated and real populations of known ancestry proportions. All the AIM panels recorded high correlations with the ancestry proportion computed using the full medium-density SNP array. The AIM panels proposed here may be used by conservation practitioners as diagnostic tools to exclude hybrids from reintroduction plans and improve conservation strategies for mouflon populations.
Keywords: ancestry informative marker; hybridisation; mouflon; random forest; sheep
Year: 2020 PMID: 32235592 PMCID: PMC7222383 DOI: 10.3390/ani10040582
Source DB: PubMed Journal: Animals (Basel) ISSN: 2076-2615 Impact factor: 2.752
Figure 1. Principal Components Analysis (PC1 vs. PC2) of the two reference populations (MSar and SAR) analysed using the full single nucleotide polymorphism (SNP) set. The percentage of variance explained by each component is given in brackets.
Characteristics of the ancestry informative marker panels. The data processes used for marker selection were: preselection (Pre), Random Forest (RF), iterated Random Forest (iRF), and top-markers choice (tc). N is the number of SNPs in each panel. The SNP distribution per chromosome can be found in Supplementary Table S1.
| Panel Name | Scope | Method | N |
|---|---|---|---|
| GW1 | Genome-wide | Pre | 1279 |
| GW2 | Genome-wide | Pre + RF | 131 |
| GW3 | Genome-wide | Pre + RF + iRF | 51 |
| CH1 | Chromosome-wide | Pre + RF | 933 |
| CH2 | Chromosome-wide | Pre + RF + tc | 78 |
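The Pre and RF steps listed in the table above can be illustrated with a minimal sketch. It is not the authors' pipeline: ranking SNPs by their absolute loading on PC1, the function name `select_aims`, the parameter values, and the toy genotype matrix (30 fixed-difference SNPs among 300) are all assumptions made for illustration, using scikit-learn's `PCA` and `RandomForestClassifier`.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier


def select_aims(genotypes, labels, n_preselect=50, n_final=10, seed=0):
    """Hypothetical two-step selection: PCA loadings, then random forest.

    genotypes: (n_animals, n_snps) array of allele counts (0, 1, 2).
    labels:    (n_animals,) array, 0 = domestic sheep, 1 = mouflon.
    Returns the column indices of the selected SNPs.
    """
    # Step 1 (preselection): keep the SNPs with the largest absolute
    # loading on PC1 (an assumed ranking criterion).
    pca = PCA(n_components=2).fit(genotypes)
    loadings = np.abs(pca.components_[0])
    preselected = np.argsort(loadings)[::-1][:n_preselect]

    # Step 2 (random forest): classify the two reference populations and
    # keep the SNPs with the highest impurity-based importance.
    forest = RandomForestClassifier(n_estimators=500, random_state=seed)
    forest.fit(genotypes[:, preselected], labels)
    best = np.argsort(forest.feature_importances_)[::-1][:n_final]
    return preselected[best]


# Toy reference data (simulated, not from the study): 20 sheep, 20 mouflon.
rng = np.random.default_rng(0)
X = rng.binomial(2, 0.5, size=(40, 300))
X[:20, :30] = 0   # sheep fixed for one allele at the first 30 SNPs
X[20:, :30] = 2   # mouflon fixed for the other allele
y = np.array([0] * 20 + [1] * 20)

selected = select_aims(X, y)
print(sorted(selected.tolist()))
```

In this toy setting the selected indices should fall among the 30 fixed-difference SNPs, since only those columns separate the two reference populations.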
Figure 2. PCA and density distribution of the PC1 obtained using the full SNP set (top-left panel) and three AIMs on reference populations and simulated hybrids (HYB) using the genome-wide discovery approach. BC1S and BC1M are the simulated F1 backcrossed with SAR and MSar, respectively. The gradient legend on the right side of the plot shows the transition gradient from sheep to mouflon genetic components.
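As a rough illustration of how simulated F1 and backcross genotypes such as BC1S and BC1M can be generated, the sketch below draws gametes from parental genotypes coded as allele counts. The helper names `gamete` and `cross`, the toy genotypes, and the treatment of SNPs as unlinked are assumptions; the source does not state which simulation tool was used.

```python
import random


def gamete(genotype, rng):
    """Draw one haploid gamete from a diploid genotype coded as 0/1/2."""
    # A homozygote always transmits its allele; a heterozygote transmits
    # the counted allele with probability 0.5 (SNPs treated as unlinked).
    return [1 if rng.random() < g / 2 else 0 for g in genotype]


def cross(parent_a, parent_b, rng):
    """Return an offspring genotype (allele counts) from two parents."""
    return [a + b for a, b in zip(gamete(parent_a, rng), gamete(parent_b, rng))]


rng = random.Random(42)        # fixed seed, illustrative only
mouflon = [2, 2, 2, 2, 2, 2]   # toy MSar genotype at six diagnostic SNPs
sheep = [0, 0, 0, 0, 0, 0]     # toy SAR genotype at the same SNPs

f1 = cross(mouflon, sheep, rng)   # heterozygous at every fixed difference
bc1s = cross(f1, sheep, rng)      # F1 backcrossed with sheep
bc1m = cross(f1, mouflon, rng)    # F1 backcrossed with mouflon
print(f1, bc1s, bc1m)
```

With fixed differences between the parents, the F1 is heterozygous at every SNP, while each backcross carries on average 25% or 75% of the mouflon alleles.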
Figure 3. PCA and density distribution of the PC1 obtained using the full SNP set (top-left panel) and three AIMs on reference populations and real mouflon × domestic hybrids (MxS) using the genome-wide discovery approach. The gradient legend on the right side of the plot shows the transition gradient from sheep to mouflon genetic components. The analysis was performed using 33,481 SNPs and the three GW panels.
Figure 4. Supervised Admixture plot of the MxS dataset obtained using the full set of SNPs. MSar and SAR were used as prior populations.
Coefficient of determination values (r²) calculated between the ancestry percentages obtained using the full set of SNPs and the AIM panels in the simulated (HYB) and case study (MxS) populations. N is the number of SNPs in each panel.
| AIMs | N | HYB | MxS |
|---|---|---|---|
| GW1 | 1279 | 0.997 | 0.99 |
| GW2 | 131 | 0.985 | 0.971 |
| GW3 | 51 | 0.966 | 0.966 |
| CH1 | 933 | 0.997 | 0.989 |
| CH2 | 78 | 0.961 | 0.946 |
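For reference, values like those in the table above can be computed as the squared Pearson correlation between two vectors of per-animal ancestry proportions. This is a minimal sketch that assumes r² is defined in this way; the function name `r_squared` and the ancestry values are hypothetical and not taken from the study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical mouflon ancestry proportions for five animals, estimated
# from the full SNP set and from a reduced AIM panel.
full_set = [0.00, 0.26, 0.49, 0.74, 1.00]
aim_panel = [0.02, 0.22, 0.53, 0.71, 0.99]

print(round(r_squared(full_set, aim_panel), 3))
```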
Coefficient of determination values calculated between the ancestry percentages obtained using the full SNP set and the AIMs in three commercial sheep breeds.
| Breed | Acronym | GW1 | GW2 | GW3 | CH1 | CH2 |
|---|---|---|---|---|---|---|
| New Zealand Texel | TEX | 0.995 | 0.971 | 0.945 | 0.993 | 0.920 |
| Australian Poll Merino | APM | 0.995 | 0.968 | 0.938 | 0.993 | 0.923 |
| Lacaune | LAC | 0.994 | 0.962 | 0.928 | 0.992 | 0.923 |