Cornelia Di Gaetano, Giovanni Fiorito, Maria Francesca Ortu, Fabio Rosa, Simonetta Guarrera, Barbara Pardini, Daniele Cusi, Francesca Frau, Cristina Barlassina, Chiara Troffa, Giuseppe Argiolas, Roberta Zaninello, Giovanni Fresu, Nicola Glorioso, Alberto Piazza, Giuseppe Matullo.
Abstract
The peculiar position of Sardinia in the Mediterranean Sea has made its population an interesting biogeographical isolate. The aim of this study was to investigate the genetic population structure, and to estimate runs of homozygosity (RoH) and regions under positive selection, using about 1.2 million single nucleotide polymorphisms genotyped in 1077 Sardinian individuals. Using four different methods (fixation index, inflation factor, principal component analysis and ancestry estimation), we were able to highlight, as expected for a genetic isolate, the high internal homogeneity of the island. Sardinians showed a higher percentage of the genome covered by RoH longer than 0.5 Mb (FRoH%≥0.5) than peninsular Italians, with the sole exception of the area surrounding Alghero. We furthermore identified 9 genomic regions showing signs of positive selection, and we recaptured many previously inferred signals. Other regions harbor novel candidate genes for positive selection, such as TMEM252, or contain long non-coding RNAs. The present study confirms the high genetic homogeneity of Sardinia, which may be explained by shared ancestry combined with the action of evolutionary forces.
Year: 2014 PMID: 24651212 PMCID: PMC3961211 DOI: 10.1371/journal.pone.0091237
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Map of the Mediterranean basin showing the location of Sardinia and the Sardinian linguistic domains.
A) Map of the Mediterranean basin showing the geographic position of Sardinia. B) The Sardinian linguistic domains: 1 = Gallurese (77 individuals); 2 = Nuorese (88); 3 = Logudorese (385); 4 = Sassarese (342); 5 = Alghero (87); 6 = Campidanese (98).
Figure 2. SNP-based principal component analysis of 1,077 individuals from Sardinia.
A) Individuals grouped by linguistic macro-area. Color key: red: Campidanese; green: Alghero; deep blue: Gallurese; light blue: Logudorese; yellow: Sassarese; purple: Nuorese. B) Individuals grouped by geographical area. Color key: green: Southern Sardinia; grey: Central Sardinia; yellow: Northern Sardinia.
Fst values (in bold) and genomic control inflation factor (λGC) (in italics) between Sardinian linguistic macro-areas.
| λGC/Fst | Campidanese | Alghero | Gallurese | Logudorese | Nuorese | Sassarese |
| Campidanese | | | | | | |
| Alghero | | | | | | |
| Gallurese | | | | | | |
| Logudorese | | | | | | |
| Nuorese | | | | | | |
| Sassarese | | | | | | |
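The two pairwise statistics named in the caption above can be illustrated with a minimal Python sketch. It rests on assumptions: the record does not say which Fst estimator the authors used, so Hudson's ratio-of-averages form is swapped in here for illustration, and λGC is taken in its usual form as the median association chi-square statistic divided by the median of a 1-degree-of-freedom chi-square distribution. The function names, allele frequencies and test statistics are hypothetical; only the sample sizes (98 Campidanese and 87 Alghero individuals) come from Figure 1.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Return lambda_GC: observed median chi-square over the expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def hudson_fst(freqs_a, freqs_b, n_chrom_a, n_chrom_b):
    """Hudson's Fst as a ratio of averages over SNPs (assumed estimator).

    freqs_a, freqs_b: per-SNP allele frequencies in the two groups.
    n_chrom_a, n_chrom_b: sampled chromosomes (2 x individuals for autosomes).
    """
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs_a, freqs_b):
        # Squared frequency difference, corrected for finite sample size.
        numerator += ((p1 - p2) ** 2
                      - p1 * (1 - p1) / (n_chrom_a - 1)
                      - p2 * (1 - p2) / (n_chrom_b - 1))
        denominator += p1 * (1 - p2) + p2 * (1 - p1)
    return numerator / denominator


# Hypothetical allele frequencies at four SNPs (illustrative only).
campidanese = [0.12, 0.45, 0.80, 0.33]
alghero = [0.15, 0.40, 0.78, 0.36]
fst = hudson_fst(campidanese, alghero, n_chrom_a=2 * 98, n_chrom_b=2 * 87)

# Hypothetical chi-square statistics; the median equals the expected
# median, so lambda_GC is 1.0 for this toy input.
lam = genomic_inflation([0.2, 0.4549364, 1.3])
```

With closely related groups the sample-size correction can push the estimate slightly below zero, which is usually read as no detectable differentiation.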
Figure 3. ADMIXTURE software results for K = 2–4.
Ancestry for each individual inferred using ADMIXTURE software.
Mean genomic inbreeding coefficients (FRoH %) using 0.5 and 5 Mb minimum RoH thresholds and mean sum of RoH.
| Macro-area | FRoH% ≥0.5 Mb | FRoH% ≥5 Mb | Mean sum of RoH ≥0.5 Mb (Mb) | SD | Mean sum of RoH ≥5 Mb (Mb) | SD |
| Campidanese | 3.08 | 0.49 | 82.55 | 3.81 | 13.24 | 3.14 |
| Alghero | 2.71 | 0.29 | 72.77 | 2.86 | 7.86 | 2.28 |
| Gallurese | 3.10 | 0.51 | 83.09 | 4.39 | 13.65 | 3.82 |
| Logudorese | 2.96 | 0.41 | 79.33 | 1.79 | 11.06 | 1.53 |
| Nuorese | 2.89 | 0.42 | 77.61 | 3.74 | 11.15 | 3.26 |
| Sassarese | 2.94 | 0.44 | 78.84 | 2.05 | 11.67 | 1.84 |
| Italy | 2.52 | 0.47 | 67.55 | 4.98 | 12.64 | 4.28 |
* p-value smaller than 0.05 when comparing each linguistic macro-area to peninsular Italy.
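As a reading aid for the table above, FRoH% can be sketched as the summed length of an individual's RoH above a minimum length, divided by the genome length and expressed as a percentage. The Python below is a minimal sketch under assumptions: the function name, the segment lengths and the genome length are hypothetical placeholders, not values from the study.

```python
def froh_percent(roh_lengths_mb, genome_length_mb, min_length_mb=0.5):
    """Percentage of the genome covered by RoH at least min_length_mb long."""
    covered = sum(length for length in roh_lengths_mb
                  if length >= min_length_mb)
    return 100.0 * covered / genome_length_mb


# Hypothetical individual: RoH segment lengths in Mb (illustrative only).
segments_mb = [0.6, 0.9, 1.4, 2.5, 6.0, 0.3]

# Placeholder genome length in Mb; an assumption, not the study's value.
GENOME_LENGTH_MB = 3000.0

froh_05 = froh_percent(segments_mb, GENOME_LENGTH_MB, min_length_mb=0.5)
froh_5 = froh_percent(segments_mb, GENOME_LENGTH_MB, min_length_mb=5.0)
```

Raising the minimum length from 0.5 Mb to 5 Mb keeps only the long segments, which is why the second FRoH% column is much smaller than the first.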
Mean inbreeding coefficients.
| Macro-area | Mean inbreeding coefficient | SE | T test p-value |
| Campidanese | 0.0106 | 0.00015 | 0.002 |
| Alghero | 0.0058 | 0.00014 | 0.26 |
| Gallurese | 0.0100 | 0.00022 | 0.01 |
| Logudorese | 0.0086 | 0.00002 | 0.003 |
| Nuorese | 0.0079 | 0.00016 | 0.06 |
| Sassarese | 0.0082 | 0.00004 | 0.01 |
| Italy | 0.0046 | 0.00001 | - |
Mean inbreeding coefficients, standard errors (SE) and T test p-Values of Sardinia macro-areas and peninsular Italy.
Percentage of the accessible genome occupied (2.84 Gb) and mean sum of RoH in Mb (with standard errors SE) for six classes of RoH.
| | 0.5–1 Mb | | | 1–2 Mb | | | 2–4 Mb | | | 4–8 Mb | | | 8–16 Mb | | | >16 Mb | | |
| Macro-area | % RoH | mean | SE | % RoH | mean | SE | % RoH | mean | SE | % RoH | mean | SE | % RoH | mean | SE | % RoH | mean | SE |
| Campidanese | 1.23 | 32.92 | 0.53 | 0.98 | 26.25 | 0.56 | 0.31 | 8.27 | 0.6 | 0.2 | 5.28 | 0.78 | 0.17 | 4.66 | 1.26 | 0.19 | 5.18 | 1.74 |
| Alghero | 1.19 | 31.86 | 0.56 | 0.95 | 25.61 | 0.58 | 0.26 | 6.89 | 0.51 | 0.1 | 2.66 | 0.57 | 0.13 | 3.46 | 1.06 | 0.09 | 2.29 | 1.12 |
| Gallurese | 1.19 | 31.83 | 0.55 | 1.01 | 27.12 | 0.65 | 0.32 | 8.68 | 0.64 | 0.22 | 5.98 | 1.06 | 0.15 | 4.07 | 1.07 | 0.2 | 5.42 | 2.26 |
| Logudorese | 1.21 | 32.42 | 0.27 | 1 | 26.91 | 0.33 | 0.29 | 7.68 | 0.27 | 0.14 | 3.82 | 0.34 | 0.16 | 4.17 | 0.58 | 0.16 | 4.33 | 0.93 |
| Nuorese | 1.19 | 31.82 | 0.58 | 0.93 | 25.02 | 0.59 | 0.31 | 8.33 | 0.49 | 0.15 | 3.89 | 0.74 | 0.19 | 5.01 | 1.25 | 0.13 | 3.54 | 1.82 |
| Sassarese | 1.21 | 32.52 | 0.28 | 0.99 | 26.47 | 0.32 | 0.26 | 6.96 | 0.27 | 0.14 | 3.7 | 0.39 | 0.14 | 3.72 | 0.59 | 0.2 | 5.48 | 1.18 |
| Italy | 0.98 | 26.25 | 0.43 | 0.8 | 21.49 | 0.63 | 0.24 | 6.42 | 0.63 | 0.13 | 3.45 | 1.01 | 0.18 | 4.9 | 1.51 | 0.16 | 4.4 | 2.07 |
*T test p-value<0.05 comparing to Italy,
T test p-value<0.05 comparing to Alghero.
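The six length classes in the table above can be illustrated by binning one individual's RoH segments and summing the length per class. This is a minimal Python sketch under assumptions: the record does not state how segments lying exactly on a class boundary were assigned, so the sketch places them in the upper class, and the function name and segment lengths are hypothetical.

```python
from bisect import bisect_right

# Lower bounds (Mb) of the six RoH length classes used in the table.
CLASS_LOWER_BOUNDS_MB = [0.5, 1, 2, 4, 8, 16]
CLASS_LABELS = ["0.5-1 Mb", "1-2 Mb", "2-4 Mb", "4-8 Mb", "8-16 Mb", ">16 Mb"]


def sum_roh_by_class(roh_lengths_mb):
    """Return the summed RoH length (Mb) falling in each length class.

    Assumption: a segment exactly on a boundary goes to the upper class.
    Segments shorter than 0.5 Mb are ignored.
    """
    totals = {label: 0.0 for label in CLASS_LABELS}
    for length in roh_lengths_mb:
        index = bisect_right(CLASS_LOWER_BOUNDS_MB, length) - 1
        if index < 0:
            continue  # below the 0.5 Mb minimum
        totals[CLASS_LABELS[index]] += length
    return totals


# Hypothetical segment lengths in Mb (illustrative only).
totals = sum_roh_by_class([0.3, 0.7, 1.0, 3.0, 20.0])
```

Averaging these per-class sums across the individuals of a macro-area would give figures comparable in kind to the "mean" columns.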
Nine genomic regions showing signals of positive selection in the Sardinian genome, ordered by |iHS|.
| Position (NCBI36/hg18) | Size (kb) | n SNP with absolute iHS > 4 | n SNP | Max iHS | Max iHS SNP | P-value | Empirical P-value | Genes |
| chr19: 32,961,206–33,175,723 | 215 | 12 | 69 | 6.25 | rs17714275 | 3.87e−34 | <0.0001 | |
| chr6: 29,555,703–33,009,633 | 3454 | 35 | 4884 | 5.37 | rs397081 | 1.64e−21 | <0.0001 | |
| chr2: 238,113,451–238,164,950 | 51 | 7 | 37 | −5.33 | rs2292871 | 6.62e−22 | <0.0001 | |
| chr9: 70,303,655–70,400,714 | 97 | 8 | 37 | 5.06 | rs11143002 | 4.86e−25 | <0.0001 | |
| chr19: 22,561,972–22,586,080 | 24 | 8 | 16 | −4.66 | rs4932781 | 4.49e−29 | <0.0001 | |
| chr5: 109,659,513–109,731,650 | 72 | 13 | 28 | −4.65 | rs10478008 | 9.03e−44 | <0.0001 | |
| chr1: 247,047,666–247,088,866 | 41 | 7 | 15 | −4.48 | rs12058711 | 1.05e−25 | <0.0001 | |
| chr4: 34,062,734–34,244,104 | 181 | 13 | 46 | −4.39 | rs11936559 | 5.37e−40 | <0.0001 | |
| chr11: 55,732,908–56,414,929 | 682 | 10 | 228 | 4.27 | rs12576240 | 4.79e−25 | <0.0001 | |
Column headers: Position: position on NCBI36/hg18 of the region showing evidence for selection; Size: size of the genomic region in kb; nSNP |iHS|>4: number of SNPs in the region with |iHS| greater than 4; nSNP: total number of SNPs in the region; Max iHS: the iHS value with the largest absolute magnitude in the region, reported with its sign; Max iHS SNP: the polymorphism carrying that value; P-values: nominal p-values; Empirical p-values: p-values after permutation-based multiple testing correction; Genes: the genes within the region. When a region contains more than 4 genes, only the first 4 are listed.
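The per-region columns described above (nSNP, nSNP |iHS|>4, Max iHS and Max iHS SNP) can be derived from per-SNP iHS scores as in the following Python sketch. It assumes the region boundaries are already known and that each region holds at least one SNP. The function name, SNP identifiers and scores are hypothetical, and the sketch does not reproduce how the study delimited regions or computed p-values.

```python
def summarize_region(snps, threshold=4.0):
    """Summarize one region from a non-empty list of (snp_id, ihs) pairs."""
    # SNPs whose absolute iHS exceeds the threshold.
    n_extreme = sum(1 for _, ihs in snps if abs(ihs) > threshold)
    # SNP with the largest absolute iHS; the sign is kept in the output.
    top_id, top_ihs = max(snps, key=lambda pair: abs(pair[1]))
    return {
        "n_snp": len(snps),
        "n_snp_extreme": n_extreme,
        "max_ihs": top_ihs,
        "max_ihs_snp": top_id,
    }


# Hypothetical SNP identifiers and iHS scores (illustrative only).
region = [
    ("snp_a", 1.2),
    ("snp_b", -4.6),
    ("snp_c", 4.1),
    ("snp_d", -0.3),
]
summary = summarize_region(region)
```

Keeping the sign of the top score matches the table, where several regions report a negative value in the Max iHS column.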