David G Mahal, Ianis G Matsoukas.
Abstract
Several studies have evaluated the movements of large populations to the Indian subcontinent; however, the ancient geographic origins of smaller ethnic communities are not clear. Although historians have attempted to identify the origins of some ethnic groups, the evidence is typically anecdotal and based upon what others have written before. In this study, recent developments in DNA science were assessed to provide a contemporary perspective by analyzing the Y chromosome haplogroups of some key ethnic groups and tracing their ancient geographical origins from genetic markers on the Y-DNA haplogroup tree. A total of 2,504 Y-DNA haplotypes, representing 50 different ethnic groups in the Indian subcontinent, were analyzed. The results identified 14 different haplogroups with 14 geographic origins for these people. Moreover, every ethnic group had representation in more than one haplogroup, indicating multiple geographic origins for these communities. The results also showed that despite their varied languages and cultural differences, most ethnic groups shared some common ancestors because of admixture in the past. These findings provide new insights into the ancient geographic origins of ethnic groups in the Indian subcontinent. With about 2,000 other ethnic groups and tribes in the region, it is expected that more scientific discoveries will follow, providing insights into how, from where, and when the ancestors of these people arrived in the subcontinent to create so many different communities.
Keywords: DNA; Indian subcontinent; Y chromosome; ethnic group; haplogroup; human migration
Year: 2018 PMID: 29410676 PMCID: PMC5787057 DOI: 10.3389/fgene.2018.00004
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Datasets used in this study.
| Dataset | Description | Reference |
| The National Geographic Society's Genographic Project | The Genographic Project is studying the genetic signatures of ancient human migrations and creating a database of yDNA and mtDNA profiles. Currently, there are over 800,000 participants from over 140 countries. | Genographic, |
| The Ethnic Groups of South Asia | The study covered a high-resolution assessment (69 informative Y-chromosome binary markers and 10 microsatellite markers) of a large set of representative ethnic groups of South Asia. This included 728 samples from India representing 36 populations, with 17 tribal populations, from six geographic regions and different social and linguistic categories, and 176 samples from Pakistan representing eight populations. | Sengupta et al., |
| The Origin of Romanies | The haplotype frequencies for 11 Y-STR markers in a Romani population ( | Nagy et al., |
| Paternal Lineages among North Indians | A total of 32 Y-chromosomal markers in 560 North Indian males collected from three higher caste groups (Brahmins, Chaturvedis, and Bhargavas) and two Muslim groups (Shia and Sunni) were genotyped. | Zhao et al., |
| The Himachal Pradesh, India Autosomal and Y-Chromosome Study | Genotypic analysis of 48 Malani individuals at 15 highly polymorphic autosomal STR loci. | Giroti and Talwar, |
| The Ezhava Population of Kerala | Haplotype analysis of the Ezhava population of Kerala ( | Nair et al., |
| The Dravidian populations | Two Dravidian populations, namely Lingayat ( | Chennakrishnaiah et al., |
| The Pathans Group of Pakistan | Haplotype analysis of 22 Y-STR haplotypes and Y haplogroup distribution in Pathans ( | Lee et al., |
A master dataset of 2,504 Y-chromosome profiles of 50 ethnic groups in the Indian subcontinent was compiled from eight different sources.
The allele sequence variants that have been exploited in this study.
| Marker | Notes | Repeat motif | Allele range | Mutation rate | Aliases |
| DYS19 | n/a | [TAGA]n | 10–19 | 0.25 | DY-27H39; DYS394 |
| DYS385a | The order of DYS385a may be reversed. Its sequence is referred to as the Kittler order. | [GAAA]n | 7–28 | 0.21 | DYS385 I |
| DYS385b | The order of DYS385b may be reversed. Its sequence is referred to as the Kittler order. | [GAAA]n | 7–28 | 0.21 | DYS385 II |
| DYS389I | DYS389 is a multi-copy marker that includes DYS389I and DYS389II. DYS389II refers to the total length of DYS389. Therefore, a one-step mutation at DYS389I also appears in DYS389II. | [TCTG]3[TCTA]n | 9–17 | 0.24 | DYS389a |
| DYS389II | DYS389 is a multi-copy marker that includes DYS389I and DYS389II. DYS389II refers to the total length of DYS389. Therefore, a one-step mutation at DYS389I also appears in DYS389II. | [TCTG]n[TCTA]pN48[TCTG]3[TCTA]q | 24–34 | 0.35 | DYS389b |
| DYS390 | n/a | [TCTG]8[TCTA]n[TCTG]1[TCTA]4 | 17–28 | 0.25 | n/a |
| DYS391 | n/a | [TCTA]n | 6–14 | 0.28 | n/a |
| DYS392 | n/a | [TAT]n | 6–17 | 0.07 | n/a |
| DYS393 | n/a | [AGAT]n | 9–17 | 0.08 | DYS395 |
| DYS437 | n/a | [TCTA]n[TCTG]2[TCTA]4 | 13–17 | 0.13 | |
| DYS438 | n/a | [TTTTC]n | 6–14 | 0.07 | |
| DYS439 | n/a | [GATA]n | 9–14 | 0.61 | |
| DYS448 | n/a | [AGAGAT]nN42[AGAGAT]m | 20–26 | 0.11 | |
| DYS456 | n/a | [AGAT]n | 13–18 | 0.53 | |
| DYS458 | n/a | [GAAA]n | 13–20 | 1.06 | |
n, p, and q represent the number of individual repeats per short tandem repeat unit. Reference motifs are based on sequences provided in STRBase.
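The note on DYS389 in the table above has a practical consequence when two haplotypes are compared: because DYS389II reports the total length of the marker, a single mutation in the DYS389I segment is counted twice unless DYS389I is first subtracted from DYS389II. The following is a minimal sketch of that adjustment, not the authors' procedure. The function names (`independent_dys389`, `step_distance`) and the two example haplotypes are hypothetical, and the allele values are illustrative numbers chosen within the ranges listed in the table, not study data.

```python
# Illustrative sketch: avoid double-counting a DYS389I mutation in DYS389II.
# Function names and allele values are hypothetical, not taken from the study.

def independent_dys389(haplotype):
    """Return a copy in which DYS389II holds only the repeats not already
    counted in DYS389I (i.e. DYS389II minus DYS389I)."""
    adjusted = dict(haplotype)
    if "DYS389I" in adjusted and "DYS389II" in adjusted:
        adjusted["DYS389II"] = adjusted["DYS389II"] - adjusted["DYS389I"]
    return adjusted


def step_distance(hap_a, hap_b):
    """Sum of absolute repeat-count differences over the markers that both
    haplotypes share, after the DYS389 adjustment."""
    a = independent_dys389(hap_a)
    b = independent_dys389(hap_b)
    shared = sorted(set(a) & set(b))
    return sum(abs(a[m] - b[m]) for m in shared)


# Two made-up haplotypes that differ by one repeat in the DYS389I segment.
hap_a = {"DYS19": 14, "DYS389I": 13, "DYS389II": 29, "DYS390": 24}
hap_b = {"DYS19": 14, "DYS389I": 14, "DYS389II": 30, "DYS390": 24}

# Naive comparison counts the same mutation at both DYS389I and DYS389II.
naive = sum(abs(hap_a[m] - hap_b[m]) for m in hap_a)

print("naive distance:", naive)
print("adjusted distance:", step_distance(hap_a, hap_b))
```

In this made-up pair, the naive comparison reports two steps while the adjusted comparison reports one, which matches the single underlying change in DYS389I.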
Fifty ethnic groups of the Indian Subcontinent (n = 2,504) represented in 14 Y-Chromosome Haplogroups, with their markers and date scale.
| Ethnic group | n | Haplogroups represented (one √ per haplogroup) | Source |
| Bangladesh, Bangladeshi | 22 | √ √ √ √ | Genographic, |
| Bangladesh, Bengali | 49 | √ √ √ √ √ √ √ √ | Genographic, |
| India, Agharia | 10 | √ √ √ | Sengupta et al., |
| India, Ambalakarar | 29 | √ √ √ √ √ √ | Sengupta et al., |
| India, Assamese | 6 | √ √ √ √ | Genographic, |
| India, Bhargavas | 96 | √ √ √ √ √ √ √ √ | Zhao et al., |
| India, Brahmin | 152 | √ √ √ √ √ √ √ √ √ √ √ | Sengupta et al., |
| India, Chamar | 18 | √ √ √ | Sengupta et al., |
| India, Chaturvedis | 88 | √ √ √ √ √ √ √ √ √ | Zhao et al., |
| India, Ezhava | 113 | √ √ √ √ √ √ √ √ √ | Nair et al., |
| India, Gujarati | 116 | √ √ √ √ √ √ √ | Genographic, |
| India, Halba | 20 | √ √ √ √ | Sengupta et al., |
| India, Ho | 30 | √ √ | Sengupta et al., |
| India, Irula | 30 | √ √ √ √ √ √ | Sengupta et al., |
| India, Iyengar | 43 | √ √ √ √ √ √ √ | Sengupta et al., |
| India, Iyer | 51 | √ √ √ √ √ √ √ | Sengupta et al., |
| India, Jamatia | 30 | √ √ √ | Sengupta et al., |
| India, Jat Haryana | 108 | √ √ √ √ √ √ √ √ | Nagy et al., |
| India, Jat Sikh | 98 | √ √ √ √ √ √ √ √ | Nagy et al., |
| India, Kamar | 30 | √ √ √ √ | Sengupta et al., |
| India, Kashmiri | 21 | √ √ √ √ | Genographic, |
| India, Koknasth Brahmin | 25 | √ √ √ √ | Sengupta et al., |
| India, Konda Reddy | 30 | √ √ √ √ | Sengupta et al., |
| India, Konkane | 53 | √ √ √ √ √ √ √ √ √ | Genographic, |
| India, Kota | 15 | √ √ √ | Sengupta et al., |
| India, Koya Dora | 27 | √ √ √ √ | Sengupta et al., |
| India, Kurumba | 19 | √ √ √ √ | Sengupta et al., |
| India, Lingayat | 101 | √ √ √ √ √ √ √ | Chennakrishnaiah et al., |
| India, Lodha | 20 | √ √ √ √ | Sengupta et al., |
| India, Malani | 30 | √ √ √ √ | Giroti and Talwar, |
| India, Malayali | 97 | √ √ √ √ √ √ √ √ √ √ | Genographic, |
| India, Maratha | 88 | √ √ √ √ √ √ √ √ | Sengupta et al., |
| India, Mizo | 27 | √ √ | Sengupta et al., |
| India, Muria | 20 | √ √ √ | Sengupta et al., |
| India, Nair | 48 | √ √ √ √ √ √ √ √ √ √ | Nair et al., |
| India, Pallan | 29 | √ √ √ √ √ √ | Sengupta et al., |
| India, Rajput | 47 | √ √ √ √ √ √ √ | Genographic, |
| India, Tripuri | 20 | √ √ | Sengupta et al., |
| India, Vanniyar | 25 | √ √ √ √ √ √ | Sengupta et al., |
| India, Vellalar | 32 | √ √ √ √ √ √ | Sengupta et al., |
| India, Vokkaliga | 102 | √ √ √ √ √ √ √ √ | Chennakrishnaiah et al., |
| Pakistan, Balochi | 29 | √ √ √ √ √ | Sengupta et al., |
| Pakistan, Brahui | 25 | √ √ √ √ √ √ √ | Sengupta et al., |
| Pakistan, Burusho | 20 | √ √ √ √ √ √ √ √ | Sengupta et al., |
| Pakistan, Hazara | 27 | √ √ √ √ √ √ | Sengupta et al., |
| Pakistan, Kalash | 20 | √ √ √ √ √ | Sengupta et al., |
| Pakistan, Makrani | 20 | √ √ √ √ √ | Sengupta et al., |
| Pakistan, Pashtun | 20 | √ √ √ √ √ | Genographic, |
| Pakistan, Pathan | 288 | √ √ √ √ √ √ √ √ √ √ | Sengupta et al., |
| Pakistan, Sindhi | 40 | √ √ √ √ √ √ √ | Sengupta et al., |

| Haplogroup | All | C | E | F | G | H | I | J | K | L | O | P | Q | R | T |
| Total | 2,504 | 76 | 14 | 71 | 84 | 403 | 15 | 279 | 37 | 281 | 175 | 11 | 65 | 963 | 30 |
| Percent (%) | 100 | 3.0 | 0.6 | 2.8 | 3.4 | 16.1 | 0.6 | 11.1 | 1.5 | 11.2 | 7.0 | 0.4 | 2.6 | 38.5 | 1.2 |
Sources for markers and date scale: Smolenyak and Turner.
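The Percent (%) row of the table above can be recomputed from the Total row as a quick consistency check. The sketch below copies the 14 totals from that row. The haplogroup letters attached to them are an assumption: they follow the order C, E, F, G, H, I, J, K, L, O, P, Q, R, T used in the table of ancient geographic origins. The variable names are hypothetical.

```python
# Totals copied from the "Total" row of the table above.
# ASSUMPTION: the 14 columns follow the order C, E, F, G, H, I, J, K, L,
# O, P, Q, R, T used in the table of ancient geographic origins.
totals = {
    "C": 76, "E": 14, "F": 71, "G": 84, "H": 403, "I": 15, "J": 279,
    "K": 37, "L": 281, "O": 175, "P": 11, "Q": 65, "R": 963, "T": 30,
}

# The haplogroup counts should add up to the number of profiles analyzed.
grand_total = sum(totals.values())

# Share of each haplogroup, rounded to one decimal place as in the table.
percent = {hg: round(100 * n / grand_total, 1) for hg, n in totals.items()}

print("profiles:", grand_total)
for hg, n in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{hg}: {n} ({percent[hg]}%)")
```

Under that column-order assumption, the totals sum to the 2,504 profiles reported in the abstract, and the rounded shares reproduce the Percent (%) row.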
Figure 1. Fourteen top-level haplogroups and recent markers with date scale (Smolenyak and Turner, 2004; Wells, 2007; Y-DNA Haplogroup Tree, markers, and descriptions at ISOGG, http://isogg.org/tree/index.html). kya, thousand years ago.
Ancient geographic origins of 14 Y-chromosome haplogroups.
| Haplogroup | Marker | Approximate date | Ancient geographic origin |
| C | M130 | ~50 kya | Southern Asia, part of the first migration out of Africa, coastal India to Southeast Asia |
| E | M96 | ~30–40 kya | Northeast Africa, part of the second migration out of Africa, initially settled in the Middle East |
| F | M89 | ~45 kya | Northeastern Africa or the Middle East (in 90% of all non-African men), parent of haplogroups G–T |
| G | M201 | ~10–23 kya | Eastern edge of the Middle East, close to the Himalayan foothills, Indus Valley |
| H | M69 | ~30 kya | South central Asia, known as the “Indian Marker” |
| I | M170 | ~25 kya | Europe, Near East, Central Asia, known as the “European haplogroup” |
| J | M304 | ~15 kya | Fertile Crescent (Mesopotamia, the land in and around the Tigris and Euphrates rivers) |
| K | M9 | ~40 kya | Iran or south-central Asia, diverged from the M89 Middle Eastern clan |
| L | M11 | ~25–30 kya | Pamir Knot region (Hindu Kush, Tian Shan, Himalayas) in Tajikistan |
| O | M175 | ~35 kya | Central or East Asia, part of the M9 Eurasian clan, early Siberians |
| P | M45 | ~35 kya | Central Asia, part of the M9 Eurasian clan, north of the Hindu Kush mountains |
| Q | M242 | ~15–20 kya | Siberia (North Asia), descendants were the first arrivals in North America |
| R | M207 | ~4–27 kya | Central Asia (from the Caspian Sea to the border of western China) |
| T | M184 | ~25 kya | Low frequencies in Europe, the Middle East, North Africa, and East Africa |
Source: Smolenyak and Turner.