Giovanna Salvatore, Valentino Palombo, Stefano Esposito, Nicolaia Iaffaldano, Mariasilvia D'Andrea.
Abstract
Brown trout (Salmo trutta), like many other freshwater species, is threatened by the release of alien species into its natural environment and by restocking with allochthonous conspecific stocks. Many conservation projects are ongoing, and several morphological and genetic tools have been proposed to support activities aimed at restoring the genetic integrity of native populations. Nevertheless, because of the complex degree of introgression reached after many generations of crossing, dichotomous keys and molecular markers such as mtDNA, LDH-C1* and microsatellites are often not sufficient to discriminate native from admixed specimens at the individual level. Here we propose a reduced panel of ancestry-informative SNP markers (AIMs) to support field activities for Mediterranean trout management and conservation. Starting from genotype data obtained from specimens sampled in the two main rivers of Molise (Central-Southern Italy), a panel of 47 AIMs was identified and validated on simulated and real hybrid population datasets, mainly through a machine learning approach based on a Random Forest classifier. The proposed AIM panel may represent a cost-effective tool for monitoring the level of introgression between native and allochthonous trout populations for conservation purposes, and the methodology could also be applied to other species.
Keywords: Mediterranean trout; SNP array; ancestry informative markers; introgression; machine learning; random forest
Year: 2022 PMID: 36011262 PMCID: PMC9407066 DOI: 10.3390/genes13081351
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
Figure 1. PCA obtained using the full SNP set on the reference and entire populations, encompassing non-admixed native and alien trout samples. Non-admixed native samples are shown in green and orange, split by river, and alien samples in red. The percentage of variance explained by each component is reported in brackets.
Number of SNPs shared between pairs of SNP panels determined with the seven different methods reported in this study (the diagonal shows the panel size, 96 SNPs).
| Method | RF GI 1 | RF GI 2 | RF EN 1 | RF EN 2 | Delta | FST | PCA |
|---|---|---|---|---|---|---|---|
| RF GI 1 | 96 | | | | | | |
| RF GI 2 | 83 | 96 | | | | | |
| RF EN 1 | 89 | 81 | 96 | | | | |
| RF EN 2 | 85 | 89 | 84 | 96 | | | |
| Delta | 79 | 77 | 80 | 80 | 96 | | |
| FST | 81 | 81 | 80 | 83 | 88 | 96 | |
| PCA | 52 | 50 | 53 | 53 | 56 | 57 | 96 |
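The pairwise overlaps in the table above amount to set intersections between panels. The following is a minimal sketch of that bookkeeping, not the authors' pipeline: the panel contents and SNP IDs are made-up toy values, and `shared_counts` is a hypothetical helper name. The last line also computes the SNPs common to all panels; whether the 47-AIM panel was derived exactly this way is an assumption.

```python
def shared_counts(panels):
    """Return lower-triangular shared-SNP counts keyed by (row, column)."""
    names = list(panels)
    counts = {}
    for i, row in enumerate(names):
        # Only fill the diagonal and the cells below it, as in the table.
        for col in names[: i + 1]:
            counts[(row, col)] = len(panels[row] & panels[col])
    return counts


# Hypothetical toy panels: SNP IDs are placeholders, not study data.
panels = {
    "Delta": {"snp1", "snp2", "snp3", "snp4"},
    "FST": {"snp2", "snp3", "snp4", "snp5"},
    "PCA": {"snp3", "snp4", "snp6", "snp7"},
}

counts = shared_counts(panels)

# SNPs present in every panel (assumed analogue of a "common" panel).
common = set.intersection(*panels.values())
```

The diagonal entries equal the panel size because each panel is intersected with itself, which matches the 96 on the diagonal of the table.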
Out-of-bag (OOB) and classification accuracy scores obtained by the RF algorithm for the reference and test trout populations using the seven 96-SNP panels.
| Method | OOB | Train | Test |
|---|---|---|---|
| RF GI 1 | 90% | 100% | 95% |
| RF GI 2 | 87% | 100% | 97% |
| RF EN 1 | 85% | 100% | 95% |
| RF EN 2 | 86% | 100% | 97% |
| Delta | 91% | 100% | 92% |
| FST | 87% | 100% | 92% |
| PCA | 87% | 100% | 95% |
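For readers unfamiliar with the OOB column: in a random forest, each tree is fitted on a bootstrap sample of the training individuals, and the individuals that sample misses (out-of-bag) serve as an internal validation set for that tree. The sketch below illustrates only the bootstrap/OOB split, in plain Python. The function name `bootstrap_split`, the sample size and the seed are illustrative assumptions, and no claim is made about the software or settings used in the study.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample and return (in-bag indices, OOB indices)."""
    # Sample n indices with replacement: this is what one tree would train on.
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    # Individuals never drawn are out-of-bag for this tree.
    oob = [i for i in range(n_samples) if i not in drawn]
    return in_bag, oob


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
n = 1000                # hypothetical number of training individuals
in_bag, oob = bootstrap_split(n, rng)
oob_fraction = len(oob) / n  # expected to be close to 1/e, about 0.37
```

Because OOB individuals were not seen by the tree that scores them, an OOB accuracy below the training accuracy, as in the table, is the expected pattern.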
Coefficient of determination (r²) between the ancestry percentages estimated with the full set of SNPs and with each AIM panel in the case study populations. N is the number of SNPs in each panel.
| SNPs Panel | N | Biferno (r²) | Volturno (r²) |
|---|---|---|---|
| Delta | 96 | 0.982 | 0.989 |
| FST | 96 | 0.981 | 0.988 |
| PCA | 96 | 0.973 | 0.984 |
| RF EN 1 | 96 | 0.985 | 0.985 |
| RF EN 2 | 96 | 0.986 | 0.988 |
| RF GI 1 | 96 | 0.985 | 0.985 |
| RF GI 2 | 96 | 0.983 | 0.987 |
| Candidate AIM | 47 | 0.955 | 0.979 |
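As an illustration of how such values can be obtained, the sketch below computes r² as the squared Pearson correlation between two vectors of per-individual ancestry proportions. That definition is an assumption (the record does not state which formula was used), `r_squared` is a hypothetical helper, and the ancestry values are invented toy numbers.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical native-ancestry proportions for five individuals.
ancestry_full_set = [0.95, 0.80, 0.40, 0.10, 0.60]   # full SNP set
ancestry_aim_panel = [0.93, 0.84, 0.35, 0.15, 0.58]  # reduced panel

r2 = r_squared(ancestry_full_set, ancestry_aim_panel)
```

A value near 1 means the reduced panel ranks and scales individuals almost exactly as the full SNP set does.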
Figure 2. PCA and density distribution of PC1 obtained using the common 47 AIMs on reference populations and the real hybrid dataset, split by river.
Figure 3. PCA and density distribution of PC1 obtained using the common 47 AIMs on reference populations and the simulated hybrid dataset, split by river.
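The record does not describe how the simulated hybrids were generated. One common approach, shown here purely as an assumed sketch, is to build a first-generation hybrid by drawing one allele per SNP from each parental population according to that population's allele frequency. The function name `simulate_f1`, the frequencies and the seed are all hypothetical.

```python
import random


def simulate_f1(freq_native, freq_alien, rng):
    """Simulate one F1 genotype (0/1/2 copies of the counted allele per SNP)."""
    genotype = []
    for p_nat, p_ali in zip(freq_native, freq_alien):
        # One allele inherited from a native parent, one from an alien parent.
        allele_from_native = 1 if rng.random() < p_nat else 0
        allele_from_alien = 1 if rng.random() < p_ali else 0
        genotype.append(allele_from_native + allele_from_alien)
    return genotype


rng = random.Random(42)          # arbitrary seed for reproducibility
freq_native = [1.0, 0.9, 0.8]    # hypothetical allele frequencies, native
freq_alien = [0.0, 0.1, 0.2]     # hypothetical allele frequencies, alien
hybrids = [simulate_f1(freq_native, freq_alien, rng) for _ in range(5)]
```

At a SNP fixed for different alleles in the two populations (the first one here), every simulated F1 is heterozygous, which is why highly differentiated markers are informative for detecting admixture.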
Figure 4. Distribution on the 40 trout chromosomes of the SNPs selected for the 96 SNP panels using the four different methods described in this study (RF GI 1 = random forest Gini Index stability occurrence; RF GI 2 = random forest Gini Index stability mean; RF EN 1 = random forest Entropy stability occurrence; RF EN 2 = random forest Entropy stability mean; Delta; FST = Fixation index; PCA = principal component analysis).
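Two of the ranking criteria named in the legend, Delta and FST, can be computed per SNP from allele frequencies in the two reference groups. The sketch below assumes Delta is the absolute allele-frequency difference and uses a simple Nei-style FST for two equally weighted populations. The estimator actually used in the study is not stated in this record, and the genotypes are toy values coded as 0/1/2 copies of one allele.

```python
def allele_freq(genotypes):
    """Frequency of the counted allele from diploid 0/1/2 genotype codes."""
    return sum(genotypes) / (2 * len(genotypes))


def delta(p_a, p_b):
    """Absolute allele-frequency difference between two populations."""
    return abs(p_a - p_b)


def fst_two_pops(p_a, p_b):
    """Nei-style FST = (HT - HS) / HT for two equally weighted populations."""
    p_bar = (p_a + p_b) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0                 # monomorphic SNP: no differentiation
    h_s = (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2  # mean within-pop
    return (h_t - h_s) / h_t


# Hypothetical genotypes at one SNP for five native and five alien fish.
native = [2, 2, 2, 1, 2]
alien = [0, 0, 1, 0, 0]

p_native = allele_freq(native)
p_alien = allele_freq(alien)
snp_delta = delta(p_native, p_alien)
snp_fst = fst_two_pops(p_native, p_alien)
```

Ranking SNPs by either statistic in descending order and keeping the top 96 would give a panel in the spirit of the Delta and FST columns, under the assumptions above.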
List of genes pinpointed by the VEP tool within or close to (<50 kb) the common SNPs included in the panels selected by the seven different methods used in this study (RF GI 1, RF GI 2, RF EN 1, RF EN 2, Delta, FST and PCA).
| SNP | Chr | Genomic position (bp) | Gene(s) |
|---|---|---|---|
| AX-89926492 | 1 | 36,478,362 | |
| AX-89933844 | 3 | 56,808,067 | |
| AX-89957249 | 3 | 49,711,361 | |
| AX-89933361 | 4 | 43,428,599 | |
| AX-89954271 | 5 | 11,245,515 | |
| AX-89955512 | 5 | 31,053,287 | |
| AX-89964745 | 6 | 28,034,638 | |
| AX-89965418 | 6 | 53,152,780 | |
| AX-89922103 | 8 | 11,821,548 | |
| AX-89923685 | 8 | 40,108,784 | |
| AX-89930404 | 10 | 9,847,082 | |
| AX-89926808 | 12 | 25,219,858 | |
| AX-89935881 | 12 | 68,862,681 | |
| AX-89941680 | 12 | 72,156,842 | |
| AX-89943019 | 12 | 78,807,143 | |
| AX-89944919 | 12 | 68,082,269 | |
| AX-89966227 | 12 | 24,351,329 | |
| AX-89937326 | 13 | 48,590,388 | |
| AX-89970985 | 13 | 27,301,563 | |
| AX-89928338 | 14 | 30,569,008 | |
| AX-89965056 | 14 | 22,507,646 | |
| AX-89976571 | 14 | 25,409,004 | |
| AX-89975434 | 15 | 23,622,069 | |
| AX-89971379 | 16 | 37,104,371 | |
| AX-89961240 | 19 | 43,548,455 | |
| AX-89961754 | 21 | 24,970,403 | |
| AX-89969654 | 22 | 13,715,689 | |
| AX-89957356 | 24 | 15,818,961 | |
| AX-89924719 | 25 | 31,489,217 | |
| AX-89935421 | 25 | 23,270,420 | |
| AX-89950643 | 25 | 32,760,767 | |
| AX-89936803 | 26 | 26,628,471 | |
| AX-89959464 | 26 | 22,256,971 | |
| AX-89948079 | 27 | 20,927,232 | |
| AX-89961304 | 28 | 42,593,898 | |
| AX-89963552 | 28 | 24,003,249 | |
| AX-89961685 | 29 | 17,322,240 | |
| AX-89965310 | 33 | 12,906,283 | |
| AX-89927784 | 35 | 3,818,258 | |
| AX-89958723 | 35 | 4,892,093 | |
| AX-89938669 | 38 | 8,489,802 | |