Zohreh Mehrjoo, Zohreh Fattahi, Maryam Beheshtian, Marzieh Mohseni, Hossein Poustchi, Fariba Ardalani, Khadijeh Jalalvand, Sanaz Arzhangi, Zahra Mohammadi, Shahrouz Khoshbakht, Farid Najafi, Pooneh Nikuei, Mohammad Haddadi, Elham Zohrehvand, Morteza Oladnabi, Akbar Mohammadzadeh, Mandana Hadi Jafari, Tara Akhtarkhavari, Ehsan Shamsi Gooshki, Aliakbar Haghdoost, Reza Najafipour, Lisa-Marie Niestroj, Barbara Helwing, Yasmina Gossmann, Mohammad Reza Toliat, Reza Malekzadeh, Peter Nürnberg, Kimia Kahrizi, Hossein Najmabadi, Michael Nothnagel.
Abstract
Iran, despite its size, geographic location and past cultural influence, has largely been a blind spot for human population genetic studies. With only sparse genetic information on the Iranian population available, we pursued its genome-wide and geographic characterization based on 1021 samples from eleven ethnic groups. We show that Iranians, while close to neighboring populations, present distinct genetic variation consistent with long-standing genetic continuity, harbor high heterogeneity and different levels of consanguinity, fall apart into a cluster of similar groups and several admixed ones and have experienced numerous language adoption events in the past. Our findings render Iran an important source for human genetic variation in Western and Central Asia, will guide adequate study sampling and assist the interpretation of putative disease-implicated genetic variation. Given Iran's internal genetic heterogeneity, future studies will have to consider ethnic affiliations and possible admixture.
Year: 2019 PMID: 31550250 PMCID: PMC6759149 DOI: 10.1371/journal.pgen.1008385
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Samples included in this study.
| Ethnic group | Language subfamily (top-level family) | Samples before QC | Samples after QC (female/male) |
|---|---|---|---|
| Iranian Arabs | Arabic (Afro-Asiatic) | 100 | 96 (42/54) |
| Iranian Azeris | Azeric (Turkic) | 100 | 99 (44/55) |
| Iranian Baluchis | Iranian (Indo-European) | 100 | 92 (37/55) |
| Iranian Gilaks | Iranian (Indo-European) | 77 | 75 (29/46) |
| Iranian Kurds | Iranian (Indo-European) | 100 | 97 (46/51) |
| Iranian Lurs | Iranian (Indo-European) | 100 | 98 (58/40) |
| Iranian Mazanderanis (Tabari) | Iranian (Indo-European) | 92 | 87 (38/49) |
| Iranian Persians | Iranian (Indo-European) | 100 | 95 (51/44) |
| Iranian Persian Gulf (PG) Islanders | Iranian (Indo-European) | 100 | 91 (43/48) |
| Iranian Sistanis | Iranian (Indo-European) | 100 | 94 (49/45) |
| Iranian Turkmen | Turkmen (Turkic) | 100 | 97 (47/50) |
| Total | | 1069 | 1021 (484/537) |
Given are ethnic affiliation, spoken language families (according to Glottolog 3.2; http://glottolog.org/) and the number of samples per ethnic group before and after quality control (QC).
*: Samples were part of the Iranome project [124].
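The totals in the table can be re-derived from the per-group rows. The following Python sketch is only an illustrative consistency check and not part of the study's own pipeline; `SAMPLES` and `qc_summary` are hypothetical names introduced here, and the counts are transcribed from the table above.

```python
# Per-group counts transcribed from the table above:
# group -> (before QC, after QC, female after QC, male after QC)
SAMPLES = {
    "Arabs": (100, 96, 42, 54),
    "Azeris": (100, 99, 44, 55),
    "Baluchis": (100, 92, 37, 55),
    "Gilaks": (77, 75, 29, 46),
    "Kurds": (100, 97, 46, 51),
    "Lurs": (100, 98, 58, 40),
    "Mazanderanis": (92, 87, 38, 49),
    "Persians": (100, 95, 51, 44),
    "PG Islanders": (100, 91, 43, 48),
    "Sistanis": (100, 94, 49, 45),
    "Turkmen": (100, 97, 47, 50),
}


def qc_summary(samples):
    """Return column totals and the per-group fraction retained after QC."""
    for name, (before, after, female, male) in samples.items():
        # The female/male split must add up to the post-QC count.
        assert female + male == after, name
    totals = tuple(sum(column) for column in zip(*samples.values()))
    retained = {
        name: after / before
        for name, (before, after, _female, _male) in samples.items()
    }
    return totals, retained


totals, retained = qc_summary(SAMPLES)
print("before QC, after QC, female, male:", totals)
# List groups from lowest to highest retention.
for name, fraction in sorted(retained.items(), key=lambda item: item[1]):
    print(f"{name:14s} {fraction:.1%}")
```

With the counts as transcribed, the column sums reproduce the Total row (1069 before QC; 1021 after QC, of which 484 female and 537 male).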
Fig 1. Internal Iranian population structure.
Relative sample locations with respect to the first two MDS components. (A) Relative sample locations of the Iranian ethnic groups from this study, including 90% density limits; (B) zoomed view into the subset of the seven groups belonging to the Central Iranian Cluster (CIC).
Fig 2. Iranian ethnic groups in a global context.
Relative sample locations with respect to the first two MDS components. Iranian ethnic groups in a global context (subset of “Old World” populations from the global 1000G data set); inset: zoomed view of the CIC and adjacent European populations.
Fig 3. Iranian ethnic groups in a regional context.
Relative sample locations with respect to the first two MDS components. Iranian ethnic groups (solid points) in a local context of samples from [2, 6, 44] (open symbols, triangles and 90% density limits).
Assessment of population substructure.
| Iranian … | Arabs | Azeris | Baluchis | Gilaks | Kurds | Lurs | Mazanderanis | Persians | PG Islanders | Sistanis | Turkmen |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Arabs | | 1.24 | 2.65 | 1.50 | 1.49 | 1.43 | 1.61 | 1.36 | 2.30 | 2.08 | 2.64 |
| Azeris | 0.0017 | | 2.39 | 1.25 | 1.28 | 1.24 | 1.29 | 1.17 | 2.28 | 1.77 | 2.03 |
| Baluchis | 0.0089 | 0.0073 | | 2.19 | 2.54 | 2.43 | 2.11 | 2.16 | 2.62 | 1.39 | 2.74 |
| Gilaks | 0.0030 | 0.0015 | 0.0074 | | 1.37 | 1.32 | 1.56 | 1.23 | 2.18 | 1.72 | 2.42 |
| Kurds | 0.0025 | 0.0013 | 0.0084 | 0.0021 | | 1.31 | 1.41 | 1.31 | 2.43 | 1.98 | 2.59 |
| Lurs | 0.0022 | 0.0011 | 0.0076 | 0.0018 | 0.0015 | | 1.32 | 1.19 | 2.34 | 1.88 | 2.52 |
| Mazanderanis | 0.0033 | 0.0016 | 0.0064 | 0.0008 | 0.0023 | 0.0018 | | 1.23 | 2.19 | 1.61 | 2.36 |
| Persians | 0.0018 | 0.0008 | 0.0061 | 0.0014 | 0.0016 | 0.0010 | 0.0012 | | 2.11 | 1.60 | 2.21 |
| PG Islanders | 0.0070 | 0.0067 | 0.0091 | 0.0076 | 0.0076 | 0.0071 | 0.0068 | 0.0059 | | 2.21 | 3.00 |
| Sistanis | 0.0058 | 0.0041 | 0.0021 | 0.0043 | 0.0053 | 0.0046 | 0.0034 | 0.0032 | 0.0067 | | 2.13 |
| Turkmen | 0.0090 | 0.0056 | 0.0097 | 0.0089 | 0.0087 | 0.0081 | 0.0079 | 0.0067 | 0.0110 | 0.0065 | |
Lower-left triangle: Weir’s FST for pairs of Iranian ethnic groups and for single groups, respectively; upper-right triangle: upper bound for genomic inflation factor (GIF) between pairs of groups (see main text for details).
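Because the table packs two statistics into one matrix, it can help to unpack it into one record per pair of groups. The following is a minimal Python sketch under the assumption that the values are read exactly as printed above; `GROUPS`, `FST_ROWS`, `GIF_ROWS` and `pair_table` are hypothetical names introduced here for illustration and do not come from the study.

```python
GROUPS = ["Arabs", "Azeris", "Baluchis", "Gilaks", "Kurds", "Lurs",
          "Mazanderanis", "Persians", "PG Islanders", "Sistanis", "Turkmen"]

# Lower-left triangle: FST_ROWS[i][j] is FST between GROUPS[i] and GROUPS[j], j < i.
FST_ROWS = [
    [],
    [0.0017],
    [0.0089, 0.0073],
    [0.0030, 0.0015, 0.0074],
    [0.0025, 0.0013, 0.0084, 0.0021],
    [0.0022, 0.0011, 0.0076, 0.0018, 0.0015],
    [0.0033, 0.0016, 0.0064, 0.0008, 0.0023, 0.0018],
    [0.0018, 0.0008, 0.0061, 0.0014, 0.0016, 0.0010, 0.0012],
    [0.0070, 0.0067, 0.0091, 0.0076, 0.0076, 0.0071, 0.0068, 0.0059],
    [0.0058, 0.0041, 0.0021, 0.0043, 0.0053, 0.0046, 0.0034, 0.0032, 0.0067],
    [0.0090, 0.0056, 0.0097, 0.0089, 0.0087, 0.0081, 0.0079, 0.0067, 0.0110, 0.0065],
]

# Upper-right triangle: GIF_ROWS[i][k] is the GIF upper bound between
# GROUPS[i] and GROUPS[i + 1 + k].
GIF_ROWS = [
    [1.24, 2.65, 1.50, 1.49, 1.43, 1.61, 1.36, 2.30, 2.08, 2.64],
    [2.39, 1.25, 1.28, 1.24, 1.29, 1.17, 2.28, 1.77, 2.03],
    [2.19, 2.54, 2.43, 2.11, 2.16, 2.62, 1.39, 2.74],
    [1.37, 1.32, 1.56, 1.23, 2.18, 1.72, 2.42],
    [1.31, 1.41, 1.31, 2.43, 1.98, 2.59],
    [1.32, 1.19, 2.34, 1.88, 2.52],
    [1.23, 2.19, 1.61, 2.36],
    [2.11, 1.60, 2.21],
    [2.21, 3.00],
    [2.13],
    [],
]


def pair_table():
    """Map each unordered pair of groups to its (FST, GIF upper bound)."""
    pairs = {}
    for i, row in enumerate(FST_ROWS):
        for j, fst in enumerate(row):
            gif = GIF_ROWS[j][i - j - 1]  # mirror cell in the upper triangle
            pairs[(GROUPS[j], GROUPS[i])] = (fst, gif)
    return pairs


pairs = pair_table()
# Print the three most differentiated pairs by FST.
for pair in sorted(pairs, key=lambda p: pairs[p][0], reverse=True)[:3]:
    fst, gif = pairs[pair]
    print(f"{pair[0]} / {pair[1]}: FST={fst:.4f}, GIF bound={gif:.2f}")
```

With the values as transcribed, the largest FST (0.0110) and the largest GIF bound (3.00) both belong to the pair PG Islanders and Turkmen.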
Fig 4. ADMIXTURE inference of Iranian ethnic groups.
(A) Inference in Iranian data set. Inferred mixture proportions for 1021 Iranian samples from this study for k = 4 ancestral populations, yielding a minimal cross-validation (CV) error of 0.544; (B) Inference in global data set. Additional inclusion of the global 1000G data set (k = 13; CV = 0.499). (C) Inference in local data set. Additional inclusion of the local data set (k = 8; CV = 0.575).
Fig 5. Ancient DNA samples from 45,000 (Upper Palaeolithic)–3350 BCE in the context of extant Iranian ethnic groups.
Time-period specific ancient DNA samples (S3 Table) projected onto extant human variation (S18 Fig). The geographic origin of the ancient samples is coded by color.
Fig 6. Ancient DNA samples from 3350–1200 BCE in the context of extant Iranian ethnic groups.
Time-period specific ancient DNA samples (S3 Table) projected onto extant human variation (S18 Fig). The geographic origin of the ancient samples is coded by color. Previous time strata are indicated by 95% density limits (refer to Fig 5).
Fig 7. Ancient DNA samples from 1200 BCE–1460 CE in the context of extant Iranian ethnic groups.
Time-period specific ancient DNA samples (S3 Table) projected onto extant human variation (S18 Fig). The geographic origin of the ancient samples is coded by color. Previous time strata are indicated by 95% density limits (refer to Figs 5 and 6).
Comparative consanguinity assessment.
| Iranian … | FI | Runs of homozygosity (PLINK): number | Runs of homozygosity (PLINK): cumulative [Mb] | Autozygous segments (IBDseg): number | Autozygous segments (IBDseg): cumulative [Mb] | Class C segments (GARLIC): number | Class C segments (GARLIC): cumulative [Mb] |
|---|---|---|---|---|---|---|---|
| Arabs | 0.0122±0.0192 | 43.9±9.7 | 127.7±96.6 | 21.6±8.8 | 91.2±102.6 | 2.9±3.7 | 54.2±75.9 |
| Azeris | 0.0025±0.0127 | 40.6±7.7 | 82.5±67.8 | 17.1±6.6 | 40.5±70.2 | 1.2±2.7 | 21.8±53.9 |
| Baluchis | 0.0123±0.0213 | 51.1±12.2 | 156.1±118.8 | 20.9±11.2 | 114.4±122.0 | 2.8±3.8 | 59.0±86.6 |
| Gilaks | 0.0001±0.0104 | 43.1±12.3 | 84.0±73.2 | 11.0±9.2 | 34.8±70.2 | 0.6±1.7 | 11.3±30.6 |
| Kurds | 0.0010±0.0100 | 42.5±7.1 | 82.3±52.7 | 15.2±5.3 | 36.8±55.6 | 1.1±2.3 | 19.2±41.6 |
| Lurs | 0.0048±0.0137 | 43.4±9.5 | 100.2±81.3 | 18.0±8.1 | 57.6±85.8 | 1.4±2.8 | 27.0±55.4 |
| Mazanderanis | 0.0037±0.0110 | 44.5±9.0 | 95.1±60.4 | 17.1±8.1 | 49.9±62.3 | 0.5±1.0 | 13.7±26.0 |
| Persians | 0.0057±0.0149 | 42.9±9.6 | 97.9±74.2 | 18.3±9.0 | 56.1±78.5 | 0.9±1.8 | 21.0±40.4 |
| PG Islanders | 0.0024±0.0114 | 42.1±9.2 | 97.9±74.2 | 12.2±7.7 | 60.3±74.0 | 1.9±2.8 | 31.2±50.6 |
| Sistanis | 0.0132±0.0202 | 48.1±12.3 | 147.9±111.5 | 23.1±11.3 | 110.0±119.1 | 2.6±3.4 | 53.4±71.4 |
| Turkmen | 0.0069±0.0167 | 38.7±8.5 | 99.0±82.9 | 17.9±8.4 | 62.0±85.1 | 1.4±2.4 | 32.1±57.9 |
Given are mean±standard deviation for selected indicators of autozygosity in the Iranian ethnic groups.