Wan Isa Hatin, Ab Rajab Nur-Shafawati, Ali Etemad, Wenfei Jin, Pengfei Qin, Shuhua Xu, Li Jin, Soon-Guan Tan, Pornprot Limprasert, Merican Amir Feisal, Mohammed Rizman-Idid, Bin Alwi Zilfalil.
Abstract
BACKGROUND: The Malays consist of various sub-ethnic groups that are believed to have different ancestral origins, reflecting migrations centuries ago. The sub-ethnic groups can be divided by the region they inhabit: the northern (Melayu Kedah and Melayu Kelantan), western (Melayu Minang) and southern (Melayu Bugis and Melayu Jawa) parts of Peninsular Malaysia. We analyzed 54,794 autosomal single nucleotide polymorphisms (SNPs) shared by 472 unrelated individuals from 17 populations to determine the genetic structure and the distribution of ancestral genetic components in five Malay sub-ethnic groups, namely Melayu Bugis, Melayu Jawa, Melayu Minang, Melayu Kedah and Melayu Kelantan. The analysis also included 12 other study populations from Thailand, Indonesia, China, India and Africa, together with Orang Asli sub-groups in the Malay Peninsula, obtained from the Pan Asian SNP Initiative (PASNPI) Consortium and the International HapMap Project database.
Keywords: Admixture; Genetic structure; Haplotypes; Malays; Single nucleotide polymorphisms
Year: 2014 PMID: 27090253 PMCID: PMC7735395 DOI: 10.1186/s11568-014-0005-z
Source DB: PubMed Journal: Hugo J ISSN: 1877-6558
Pair-wise Fst (× 1000) between the Malay sub-ethnic groups and other populations in this study
| | MY-BG | MY-JV | MY-MN | MY-KN | MY-KD | TH-PT | MY-TM | MY-JH | MY-KS | ID-JV | ID-ML | ID-TR | CN-JN | CN-WA | IN-WL | IN-DR | YRI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MY-BG | | | | | | | | | | | | | | | | | |
| MY-JV | 24 | | | | | | | | | | | | | | | | |
| MY-MN | 24 | 21 | | | | | | | | | | | | | | | |
| MY-KN | 23 | | | | | | | | | | | | | | | | |
| MY-KD | 22 | | | | | | | | | | | | | | | | |
| TH-PT | 26 | 21 | 23 | 21 | | | | | | | | | | | | | |
| MY-TM | 26 | | | | | | | | | | | | | | | | |
| MY-JH | 42 | 34 | 35 | 32 | 31 | 37 | 30 | | | | | | | | | | |
| MY-KS | 53 | 47 | 46 | 42 | 41 | 48 | 41 | 23 | | | | | | | | | |
| ID-JV | 22 | | | | | | | | 44 | | | | | | | | |
| ID-ML | 26 | 25 | 23 | 23 | 23 | 28 | 25 | 41 | 53 | 23 | | | | | | | |
| ID-TR | | 22 | 21 | 21 | 21 | 25 | 23 | 40 | 52 | 20 | 22 | | | | | | |
| CN-JN | 31 | 25 | 27 | 24 | 23 | 28 | 25 | 40 | 52 | 23 | 31 | 28 | | | | | |
| CN-WA | 24 | | | | | | | | 44 | | 24 | 22 | 17 | | | | |
| IN-WL | 57 | 54 | 46 | | | 48 | 50 | 57 | 61 | 52 | 57 | 56 | 57 | 51 | | | |
| IN-DR | 51 | 48 | 40 | | | 42 | 44 | 50 | 55 | 46 | 51 | 50 | 51 | 45 | 17 | | |
| YRI | 112 | 109 | 102 | 99 | 97 | 106 | 104 | 111 | 116 | 107 | 112 | 110 | 112 | 106 | 88 | 84 | |
*Bold numbers indicate a close genetic relationship between the populations.
Figure 1. MDS analysis for 17 populations based on Fst: A) two dimensions (2D) and B) three dimensions (3D).
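The MDS in Figure 1 takes the pairwise Fst matrix as its dissimilarities. The following is a minimal sketch of that idea, not the authors' procedure: the text does not say which MDS variant or software was used, so classical (Torgerson) MDS is assumed here, and the function names are illustrative. It uses only a subset of nine populations whose pairwise values are all present in the table as it appears in this record.

```python
import numpy as np

# Subset of populations with a complete set of pairwise values in the table.
labels = ["MY-BG", "MY-JV", "MY-MN", "MY-JH", "MY-KS",
          "ID-ML", "CN-JN", "IN-WL", "YRI"]

# Lower-triangle Fst (x 1000) values copied from the table, one row per label.
lower = [
    [],
    [24],
    [24, 21],
    [42, 34, 35],
    [53, 47, 46, 23],
    [26, 25, 23, 41, 53],
    [31, 25, 27, 40, 52, 31],
    [57, 54, 46, 57, 61, 57, 57],
    [112, 109, 102, 111, 116, 112, 112, 88],
]


def full_matrix(lower_rows):
    """Expand lower-triangle rows into a symmetric matrix of Fst values."""
    n = len(lower_rows)
    d = np.zeros((n, n))
    for i, row in enumerate(lower_rows):
        for j, value in enumerate(row):
            d[i, j] = d[j, i] = value / 1000.0  # undo the x 1000 scaling
    return d


def classical_mds(d, n_components=2):
    """Classical MDS (an assumed variant): eigendecompose the
    double-centered matrix of squared dissimilarities."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Fst is not guaranteed to be Euclidean, so clip negative eigenvalues.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


fst = full_matrix(lower)
coords = classical_mds(fst, n_components=2)
for name, (x, y) in zip(labels, coords):
    print(f"{name}: {x:+.4f} {y:+.4f}")
```

Passing `n_components=3` gives the three-dimensional analogue of panel B.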
Figure 2. The estimated population structure and ancestral membership coefficients of each of the 472 individuals for K = 2 to K = 10 from dataset S2. The linguistic family of each population is shown at the top of the figure and the population names are shown below it. Populations are separated by solid lines, and each individual is represented by a thin vertical line partitioned into K colored segments that represent the individual's estimated Q fractions in the K clusters.
Proportion of membership coefficient (Q) for each population in each of the six inferred clusters (K = 6)
| Population (n) | Location | Code | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 |
|---|---|---|---|---|---|---|---|---|
| Malaysia: | | | | | | | | |
| Malays*: | | | | | | | | |
| Jawa (19) | Johor | MY-JV | 0.419 | 0.004 | 0.010 | 0.665 | 0.004 | 0.001 |
| Bugis (14) | Johor | MY-BG | 0.255 | 0.027 | 0.044 | 0.561 | 0.008 | 0.001 |
| Minang (20) | Negeri Sembilan | MY-MN | 0.318 | 0.011 | 0.019 | 0.525 | 0.125 | 0.002 |
| Kelantan (18) | Kelantan | MY-KN | 0.222 | 0.018 | 0.054 | 0.542 | 0.162 | 0.002 |
| Kedah (24) | Kedah | MY-KD | 0.206 | 0.028 | 0.031 | 0.527 | 0.208 | 0.001 |
| Proto-Malay^a: | | | | | | | | |
| Temuan (49) | Negeri Sembilan | MY-TM | 0.101 | 0.361 | 0.053 | 0.478 | 0.006 | 0.001 |
| Negritos^a: | | | | | | | | |
| Jahai (50) | Perak | MY-JH | 0.012 | 0.010 | 0.808 | 0.168 | 0.002 | 0.000 |
| Kensui (30) | Kedah | MY-KS | 0.018 | 0.006 | 0.926 | 0.035 | 0.015 | 0.001 |
| Thailand*: | | | | | | | | |
| Pattani (14) | Pattani | TH-PT | 0.237 | 0.013 | 0.039 | 0.570 | 0.140 | 0.001 |
| Indonesia^a: | | | | | | | | |
| Jawa (19) | Java | ID-JV | 0.251 | 0.029 | 0.048 | 0.663 | 0.008 | 0.001 |
| Melayu (12) | Sumatera | ID-ML | 0.378 | 0.005 | 0.018 | 0.586 | 0.012 | 0.001 |
| Toraja (20) | Sulawesi | ID-TR | 0.444 | 0.004 | 0.009 | 0.540 | 0.003 | 0.001 |
| China^a: | | | | | | | | |
| Jinuo (29) | Yunnan | CN-JN | 0.016 | 0.005 | 0.015 | 0.959 | 0.005 | 0.001 |
| Wa (56) | Yunnan | CN-WA | 0.032 | 0.005 | 0.021 | 0.936 | 0.005 | 0.001 |
| India^a: | | | | | | | | |
| Marathi (14) | Maharashtra | IN-WL | 0.002 | 0.003 | 0.006 | 0.006 | 0.979 | 0.004 |
| Telugu (24) | Andhra Pradesh | IN-DR | 0.003 | 0.003 | 0.011 | 0.006 | 0.975 | 0.002 |
| Africa^b: | | | | | | | | |
| Yoruba (60) | Nigeria | YRI | 0.001 | 0.001 | 0.003 | 0.001 | 0.002 | 0.992 |
^a Genotype data were obtained from the database of the PASNPI Consortium (http://www4a.biotec.or.th/PASNP/).
^b Genotype data were obtained from the International HapMap Consortium (http://hapmap.ncbi.nlm.nih.gov/).
*Inclusion criteria: the sampled individual must have at least three generations of ancestry in the same population, have no parental admixture, and communicate daily in the local dialect. Individuals who did not meet these criteria were excluded.
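Each row of membership coefficients describes how one population is apportioned among the six inferred clusters, so the six values in a row should sum to approximately one. The snippet below is a minimal sketch of that consistency check on the values as they are printed in the table above; the 0.005 rounding tolerance and the function name are assumptions made for illustration.

```python
# Membership coefficients copied from the table above, keyed by population code.
q_table = {
    "MY-JV": [0.419, 0.004, 0.010, 0.665, 0.004, 0.001],
    "MY-BG": [0.255, 0.027, 0.044, 0.561, 0.008, 0.001],
    "MY-MN": [0.318, 0.011, 0.019, 0.525, 0.125, 0.002],
    "MY-KN": [0.222, 0.018, 0.054, 0.542, 0.162, 0.002],
    "MY-KD": [0.206, 0.028, 0.031, 0.527, 0.208, 0.001],
    "MY-TM": [0.101, 0.361, 0.053, 0.478, 0.006, 0.001],
    "MY-JH": [0.012, 0.010, 0.808, 0.168, 0.002, 0.000],
    "MY-KS": [0.018, 0.006, 0.926, 0.035, 0.015, 0.001],
    "TH-PT": [0.237, 0.013, 0.039, 0.570, 0.140, 0.001],
    "ID-JV": [0.251, 0.029, 0.048, 0.663, 0.008, 0.001],
    "ID-ML": [0.378, 0.005, 0.018, 0.586, 0.012, 0.001],
    "ID-TR": [0.444, 0.004, 0.009, 0.540, 0.003, 0.001],
    "CN-JN": [0.016, 0.005, 0.015, 0.959, 0.005, 0.001],
    "CN-WA": [0.032, 0.005, 0.021, 0.936, 0.005, 0.001],
    "IN-WL": [0.002, 0.003, 0.006, 0.006, 0.979, 0.004],
    "IN-DR": [0.003, 0.003, 0.011, 0.006, 0.975, 0.002],
    "YRI":   [0.001, 0.001, 0.003, 0.001, 0.002, 0.992],
}

TOLERANCE = 0.005  # assumed allowance for three-decimal rounding


def rows_off_unity(table, tol=TOLERANCE):
    """Return {code: row sum} for rows whose coefficients do not sum to ~1."""
    return {
        code: round(sum(values), 3)
        for code, values in table.items()
        if abs(sum(values) - 1.0) > tol
    }


flagged = rows_off_unity(q_table)
print(flagged)
```

With the values as printed here, the MY-JV and MY-BG rows sum to 1.103 and 0.896 respectively, so those two rows are worth checking against the published table; every other row sums to one within rounding.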
Figure 3. Haplotype sharing (HS) percentages of the Malays (MY), northern Malays (NMY), peninsular Malays (PMY), Chinese (CN), Indians (IN), Proto-Malays (PM) and Negritos (NG). Haplotype sharing analysis (HSA) assigned the haplotypes in population A to four classes: 1) private to population A; 2) shared with population B only; 3) shared with population C only; and 4) shared among all three populations. A) MY; B) CN; C) IN; D) NMY; E) CN; F) IN; G) PMY; H) PM and I) NG. HS proportions were obtained by sampling 76 chromosomes 100 times without replacement and were calculated without considering the frequencies of distinct haplotypes.
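The four-class scheme in the Figure 3 caption can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: haplotypes are represented as plain strings for a single genomic bin, the 76-chromosome subsample is drawn from each of the three populations in every replicate, and all function and key names are hypothetical.

```python
import random

CLASSES = ("private", "shared_b_only", "shared_c_only", "shared_all")


def classify_sharing(hap_a, hap_b, hap_c):
    """Split the distinct haplotypes of A into four classes.

    Only distinct haplotypes are counted, so haplotype frequencies
    are ignored, as the caption describes.
    """
    a, b, c = set(hap_a), set(hap_b), set(hap_c)
    counts = {
        "private": len(a - b - c),
        "shared_b_only": len((a & b) - c),
        "shared_c_only": len((a & c) - b),
        "shared_all": len(a & b & c),
    }
    return {name: counts[name] / len(a) for name in CLASSES}


def haplotype_sharing(hap_a, hap_b, hap_c, n_chrom=76, n_rep=100, seed=0):
    """Average the class proportions over repeated subsamples.

    Each replicate draws n_chrom chromosomes without replacement from
    every population (an assumed reading of the caption).
    """
    rng = random.Random(seed)
    totals = dict.fromkeys(CLASSES, 0.0)
    for _ in range(n_rep):
        samples = [rng.sample(h, n_chrom) for h in (hap_a, hap_b, hap_c)]
        for name, value in classify_sharing(*samples).items():
            totals[name] += value
    return {name: total / n_rep for name, total in totals.items()}


# Toy data: 80 chromosomes per population, haplotypes named h1..h5.
pop_a = ["h1"] * 40 + ["h2"] * 20 + ["h3"] * 10 + ["h4"] * 10
pop_b = ["h2"] * 40 + ["h4"] * 40
pop_c = ["h3"] * 30 + ["h4"] * 30 + ["h5"] * 20

hs = haplotype_sharing(pop_a, pop_b, pop_c)
print(hs)
```

In the toy data every haplotype has at least ten copies, so dropping four chromosomes per replicate never removes one entirely and each class comes out at one quarter of population A's distinct haplotypes.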
Figure 4. Phylogenetic trees from the STRUCTURE analyses and haplotype sharing analyses (HSAs). The trees in A) and B) were reconstructed using two genetic distance methods: A) Cavalli-Sforza's DC and B) Nei's DA. The MIT clade consists of Malays, Indonesians and Thais; PM is Proto-Malays, CN is Chinese, NG is Semang, IN is Indians and YRI is Yoruba. The trees in C) and D) are based on haplotype sharing distances from 100 kb bins of the HSAs. C) PMY, private haplotypes found only in peninsular Malay samples; PM, private haplotypes found only in Proto-Malay samples; NG, private haplotypes found only in Semang samples; Shared, haplotypes found in all of the PMY, PM and NG samples. D) NMY, private haplotypes found only in northern Malay samples; CN, private haplotypes found only in Chinese samples; IN, private haplotypes found only in Indian samples; Shared, haplotypes found in all of the NMY, CN and IN samples; YRI, the African haplotypes used as the outgroup.
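The two distance measures named in the Figure 4 caption have standard definitions based on the per-locus sum of sqrt(x_i * y_i) over alleles. The sketch below applies those standard forms to biallelic SNP allele frequencies. How the authors derived the input frequencies from the STRUCTURE results is not stated in this record, so the inputs and function names here are assumptions.

```python
import numpy as np


def _allele_overlap(p, q):
    """Per-locus sum over both alleles of sqrt(x_i * y_i) for biallelic SNPs,
    where p and q are the frequencies of one allele in two populations."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sqrt(p * q) + np.sqrt((1.0 - p) * (1.0 - q))


def nei_da(p, q):
    """Nei's DA: one minus the overlap, averaged over loci."""
    return float(np.mean(1.0 - _allele_overlap(p, q)))


def chord_dc(p, q):
    """Cavalli-Sforza chord distance DC: (2 / pi) times the mean over loci
    of sqrt(2 * (1 - overlap))."""
    # Clip guards against tiny negative values from floating-point rounding.
    inner = np.clip(1.0 - _allele_overlap(p, q), 0.0, None)
    return float((2.0 / np.pi) * np.mean(np.sqrt(2.0 * inner)))


# Hypothetical allele frequencies at four SNPs in two populations.
freq_x = [0.10, 0.45, 0.80, 0.30]
freq_y = [0.20, 0.50, 0.60, 0.35]
print(nei_da(freq_x, freq_y), chord_dc(freq_x, freq_y))
```

Both distances are zero for identical frequencies. For populations fixed for different alleles at every locus, DA reaches 1 and DC reaches 2√2/π. A matrix of such pairwise distances is what a tree-building method would then take as input.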
Figure 5. Map of the Asian continent depicting the geographic locations of the sampled populations in six countries. The small box in the upper right corner of the figure shows the African continent. Colors denote the linguistic family of each population.