Isabel Mendizabal, Cristina Valente, Alfredo Gusmão, Cíntia Alves, Verónica Gomes, Ana Goios, Walther Parson, Francesc Calafell, Luis Alvarez, António Amorim, Leonor Gusmão, David Comas, Maria João Prata.
Abstract
Previous genetic, anthropological and linguistic studies have shown that Roma (Gypsies) constitute a founder population dispersed throughout Europe whose origins might be traced to the Indian subcontinent. Linguistic and anthropological evidence points to Indo-Aryan ethnic groups from North-western India as the ancestral parental population of Roma. Recently, a strong genetic hint supporting this theory came from a study of a private mutation causing primary congenital glaucoma. In the present study, complete mitochondrial control sequences of Iberian Roma and previously published maternal lineages of other European Roma were analyzed in order to establish the genetic affinities among Roma groups, determine the degree of admixture with neighbouring populations, infer the migration routes followed since the first arrival in Europe, and survey the origin of Roma within the Indian subcontinent. Our results show that the maternal lineage composition in the Roma groups follows a pattern of different migration routes, with several founder effects and low effective population sizes along their dispersal. Our data allowed the confirmation of a North/West migration route shared by Polish, Lithuanian and Iberian Roma. Additionally, eleven Roma founder lineages were identified and degrees of admixture with host populations were estimated. Finally, the comparison with an extensive database of Indian sequences allowed us to identify the Punjab state, in North-western India, as the putative ancestral homeland of the European Roma, in agreement with previous linguistic and anthropological studies.
Year: 2011 PMID: 21264345 PMCID: PMC3018485 DOI: 10.1371/journal.pone.0015988
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Sequence diversity indices for mtDNA lineages (positions 16,090–16,365) in the Roma populations and corresponding host populations included in the present study.
| Population | N | K (%K) | S (%S) | Ĥ | π | θπ | θK (95% CI) | θS | D | FS |
| Roma Portugal | 138 | 22 (15.94) | 38 (13.77) | 0.85±0.02 | 0.01±0.01 | 3.92±1.98 | 7.14 (4.38–11.35) | 6.91±1.91 | −1.30 | −3.92 |
| Roma Spain | 115 | 29 (38.26) | 36 (13.77) | 0.89±0.02 | 0.01±0.01 | 3.59±1.84 | 12.20 (7.80–18.70) | 6.76±1.93 | −1.43 | −12.83 |
| Roma Bulgaria 1 | 71 | 35 (49.29) | 36 (13.04) | 0.97±0.01 | 0.02±0.01 | 4.97±2.45 | 26.70 (16.65–42.66) | 7.45±2.25 | −1.07 | −21.24 |
| Roma Bulgaria 2 | 53 | 23 (43.40) | 29 (10.51) | 0.96±0.01 | 0.02±0.01 | 4.57±2.53 | 14.90 (8.58–25.65) | 6.39±2.07 | −0.94 | −8.69 |
| Roma Bulgaria 3 | 108 | 31 (28.70) | 35 (12.68) | 0.92±0.01 | 0.02±0.01 | 4.40±2.43 | 14.20 (9.12–21.74) | 6.66±1.92 | −1.04 | −12.44 |
| Roma Hungary | 205 | 43 (20.98) | 50 (18.12) | 0.91±0.01 | 0.02±0.01 | 6.05±2.89 | 16.30 (11.40–23.10) | 8.48±2.15 | −1.22 | −17.97 |
| Roma Lithuania | 18 | 5 (27.77) | 9 (3.26) | 0.66±0.10 | 0.01±0.01 | 2.98±1.64 | 1.90 (0.70–5.10) | 2.62±1.22 | 0.01 | 1.28 |
| Roma Poland | 69 | 13 (18.84) | 21 (7.61) | 0.82±0.03 | 0.02±0.01 | 5.19±2.54 | 4.50 (2.40–8.10) | 4.37±1.45 | −0.03 | 0.08 |
| Host Portugal | 118 | 81 (68.64) | 60 (21.74) | 0.97±0.01 | 0.01±0.01 | 4.21±2.10 | 112.73 (76.90–166.80) | 11.23±2.97 | −1.97 | −25.74 |
| Host Spain | 68 | 61 (89.71) | 58 (21.01) | 0.99±0.01 | 0.02±0.01 | 5.26±2.57 | 281.23 (142.90–591.60) | 12.11±3.48 | −1.89 | −25.46 |
| Host Bulgaria | 141 | 86 (60.99) | 70 (25.36) | 0.98±0.01 | 0.02±0.01 | 4.89±2.40 | 92.60 (65.90–130.80) | 12.68±3.22 | −2.12 | −25.74 |
| Host Hungary | 211 | 135 (69.59) | 79 (28.62) | 0.98±0.01 | 0.02±0.01 | 4.26±2.12 | 160.56 (120.84–214.07) | 13.33±3.17 | −2.07 | −25.45 |
| Host Lithuania | 162 | 96 (59.26) | 72 (26.09) | 0.98±0.01 | 0.02±0.01 | 5.63±2.72 | 98.20 (71.40–135.30) | 12.72±3.16 | −1.98 | −25.46 |
| Host Poland | 413 | 195 (47.21) | 102 (36.96) | 0.97±0.01 | 0.02±0.01 | 5.14±2.50 | 143.7 (117.20–176.00) | 15.46±3.31 | −2.13 | −25.13 |
Data from present study;
Data from Fernandez et al. [20];
Data from Gresham et al. [6];
Data from Irwin et al. [7];
Data from Malyarchuk et al. [9];
Unpublished data;
Data from Alvarez et al. [21];
Data from Richards et al. [22];
Data from Lappalainen et al. [23];
Data from Grzybowski et al. [24].
N, sample size; K, number of different sequences; S, number of polymorphic sites; Ĥ, sequence diversity; π, nucleotide diversity; θπ, mean number of pairwise differences between sequences; θK, mean number of pairwise differences based on K; θS, mean number of pairwise differences based on S; D, Tajima's test of selective neutrality; FS, Fu's test of selective neutrality;
*P-value<0.05;
**P-value<0.001.
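As an illustration of how several of the indices defined above can be obtained from aligned sequences, the following is a minimal Python sketch, not the authors' pipeline. It assumes equal-length aligned sequences, Nei's unbiased estimator for Ĥ, and Watterson's estimator for θS; the function names and the toy sequences are hypothetical. θK, Tajima's D and Fu's FS are left out because they need further numerical steps.

```python
from collections import Counter
from itertools import combinations


def watterson_theta(s, n):
    """theta_S from S polymorphic sites in n sequences (Watterson's estimator)."""
    a1 = sum(1.0 / i for i in range(1, n))
    return s / a1


def diversity_indices(seqs):
    """Basic diversity indices for a list of equal-length aligned sequences."""
    n = len(seqs)
    length = len(seqs[0])
    counts = Counter(seqs)

    # K: number of different sequences; S: number of polymorphic sites
    k = len(counts)
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # H: sequence diversity (assumed here: Nei's unbiased estimator)
    h = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

    # theta_pi: mean number of pairwise differences between sequences
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    theta_pi = total_diffs / len(pairs)

    return {
        "N": n,
        "K": k,
        "pct_K": 100.0 * k / n,        # %K = K / N
        "S": s,
        "pct_S": 100.0 * s / length,   # %S = S / number of sites
        "H": h,
        "pi": theta_pi / length,       # nucleotide diversity per site
        "theta_pi": theta_pi,
        "theta_S": watterson_theta(s, n),
    }


# Hypothetical toy alignment, for illustration only
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
result = diversity_indices(toy)
print(result)
```

Under these assumptions, `watterson_theta(38, 138)` gives roughly 6.91, which agrees with the θS reported for Roma Portugal (S = 38, N = 138). Likewise, 22/138 gives the 15.94 %K in that row, and 38 polymorphic sites out of the 276 positions between 16,090 and 16,365 give the 13.77 %S.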
Figure 1. MtDNA haplogroups corresponding to founder lineages in the European Roma populations.
Percentages of non-founder lineages are shown in white in the circles. Sample sizes (n) and sequence diversity (Ĥ) are shown for each Roma sample.
AMOVA with mtDNA sequences from the Roma populations analyzed.
| Grouping criteria | Groups | Variance among groups (%) | Variance among populations within groups (%) | Variance within populations (%) |
| Total sample | 19 populations in the original publications | | 7.02 | 92.98 |
| Country | Portugal, Spain, Lithuania, Poland, Hungary, Bulgaria | 3.77 | 3.61 | 92.62 |
| Historical migration | Early settlement in Balkans: Bulgaria 1; settlement in Bulgaria and Hungary from Wallachia/Moldavia, 17th-18th centuries: Bulgaria 2 and Hungary; settlement in Bulgaria from Wallachia/Moldavia, late 19th century: Bulgaria 3; North/Western route: Lithuanian, Polish, Spanish, Portuguese Roma | 4.99 | 3.08 | 91.92 |
The grouping shown is the one that explains the highest variance among groups (other results in Table S2).
*P-value<0.05;
**P-value<0.001.
Figure 2. Non-metric Multidimensional Scaling (NMDS) plot of the pairwise differences between Roma and the corresponding host populations (stress value = 0.068).
The labels “Roma Bulgaria 1”, “Roma Bulgaria 2” and “Roma Bulgaria 3” stand for Bulgarian Roma populations grouped according to history of migrations as in Gresham et al. [6].
Estimated probabilities for the subcontinental regions and states considered in the matching analysis and the corresponding standard deviations (SD).
| Subcontinental region | State | n | Probability Region (SD) | Probability State (SD) |
| | | 418 | 0.721 (0.038) | |
| | Himachal Pradesh | 37 | | 0.017 (0.011) |
| | Kashmir | 19 | | - |
| | Punjab | 362 | | 0.536 (0.042) |
| | | 314 | 0.022 (0.012) | |
| | Uttar Pradesh | 232 | | 0.014 (0.010) |
| | Madhya Pradesh | 82 | | 0.002 (0.003) |
| | | 348 | 0.008 (0.007) | |
| | Gujarat | 91 | | 0.002 (0.004) |
| | Maharashtra | 221 | | 0.004 (0.005) |
| | Rajasthan | 36 | | - |
| | | 431 | - | |
| | Karnataka | 201 | | - |
| | Kerala | 230 | | - |
| | | 1443 | 0.051 (0.019) | |
| | Tamil Nadu | 427 | | 0.003 (0.005) |
| | Andhra Pradesh | 1016 | | 0.033 (0.015) |
| | | 483 | 0.198 (0.034) | |
| | Bihar | 45 | | 0.058 (0.020) |
| | Orissa | 153 | | 0.299 (0.039) |
| | West Bengal | 285 | | 0.034 (0.015) |
| | | 314 | - | |
| | Arunachal Pradesh | 26 | | - |
| | Assam | 58 | | - |
| | Manipur | 9 | | - |
| | Mizoram | 14 | | - |
| | Nagaland | 43 | | - |
| | Tripura | 134 | | - |
| | Bangladesh | 30 | | - |
| TOTAL | | 3751 | | |
Those regions/states with no matches with Roma sequences (probability = 0) are shown with hyphens (-).
Figure 3. Median-joining network of the mtDNA sequences belonging to the M5*, M51a, M25, M35 and M18 haplogroups in the Roma and Indian populations (numbers represent the mutations defining these haplogroups).