Lian Deng, Boon-Peng Hoh, Dongsheng Lu, Woei-Yuh Saw, Rick Twee-Hee Ong, Anuradhani Kasturiratne, H Janaka de Silva, Bin Alwi Zilfalil, Norihiro Kato, Ananda R Wickremasinghe, Yik-Ying Teo, Shuhua Xu.
Abstract
The Malay people are a major ethnic group in Southeast Asia, but their genetic make-up and population structure remain poorly studied. Here we conducted a genome-wide study of four geographical Malay populations: Peninsular Malaysian Malay (PMM), Singaporean Malay (SGM), Indonesian Malay (IDM) and Sri Lankan Malay (SLM). All four Malay populations showed substantial admixture with multiple ancestries. We identified four major ancestral components in Malay populations: Austronesian (17%-62%), Proto-Malay (15%-31%), East Asian (4%-16%) and South Asian (3%-34%). Approximately 34% of the genetic makeup of SLM is of South Asian ancestry, which gives it a genetic pattern distinct from that of the other three Malay populations. In addition, substantial differentiation was observed between the Malay populations from the north and the south, and between those from the west and the east. In summary, this study revealed that the genetic identity of the Malays is a mixture of multiple ancestries, represented by Austronesian, Proto-Malay, East Asian and South Asian, with most of the admixture events estimated to have occurred 175 to 1,500 years ago. This in turn suggests that geographical isolation and independent admixture have significantly shaped the genetic architecture and diversity of the Malay populations.
Year: 2015 PMID: 26395220 PMCID: PMC4585825 DOI: 10.1038/srep14375
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Phylogenetic tree showing genetic relatedness of four Malay populations.
A phylogenetic tree was constructed by using the neighbor-joining method, taking YRI as the outgroup. The pair-wise population distance was measured by global FST with 1,000 bootstrapping repeats. Bootstrap values are noted on the branches. Population IDs are shown in Supplementary Table S4. Geographical groups are indicated by colors. The four Malay populations are highlighted in bold font with red asterisks.
Figure 2. Plots of the first two principal components for hierarchical analyses of worldwide populations.
(A) 2,713 individuals representing 83 populations from Africa, Europe and Asia. (B) 2,597 individuals representing 82 populations from Europe and Asia (excluding YRI). PMMK and PMMM denote Kelantan Malay and Minangkabau, respectively.
Figure 3. Locations and genetic makeup of the Malays and other populations.
The genetic makeup of each population, averaged across its individuals, is indicated by the bars. Each color represents an independent cluster at K = 9. Southeast Asian 1 and Southeast Asian 2 represent the aboriginal Southeast Asian component and the Austronesian component, respectively. All the Malay populations are arranged in the dashed box. Population IDs are shown in Supplementary Table S4. The map was generated using R v2.11.1 with the packages mapdata v2.2-3, mapplots v1.5 and maps v2.3-9 (http://cran.r-project.org/web/packages/).
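The per-population bars described in this caption are averages of individual ancestry proportions. The following is a minimal sketch of that averaging step, not the authors' pipeline: the function name, the population labels and the proportions are all made up for illustration, and K = 3 is used instead of K = 9 to keep the toy data short.

```python
from collections import defaultdict


def mean_ancestry_by_population(samples):
    """Average per-individual ancestry proportions within each population.

    samples: iterable of (population_id, [proportion for each cluster]).
    Returns {population_id: [mean proportion for each cluster]}.
    """
    sums = {}
    counts = defaultdict(int)
    for pop, props in samples:
        if pop not in sums:
            sums[pop] = [0.0] * len(props)
        for k, p in enumerate(props):
            sums[pop][k] += p
        counts[pop] += 1
    return {pop: [s / counts[pop] for s in total] for pop, total in sums.items()}


# Hypothetical individuals; these values are NOT taken from the study.
toy = [
    ("POP_A", [0.50, 0.25, 0.25]),
    ("POP_A", [0.25, 0.50, 0.25]),
    ("POP_B", [1.00, 0.00, 0.00]),
]
means = mean_ancestry_by_population(toy)
for pop, props in means.items():
    print(pop, props)
```

Each averaged vector still sums to 1 as long as every individual's proportions do, which is what allows a population to be drawn as a single stacked bar.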
Dating gene flow to the Malays.
| Group | PMM (MY-MLY) | PMM (MY-KN) | PMM (MY-MN) | SGM (SG-MAS) | SGM (SG-MY) | IDM | SLM |
|---|---|---|---|---|---|---|---|
| European | 35.93 ± 5.79 | 30.88 ± 6.34 | 31.21 ± 7.38 | 9.12 ± 0.68 | 17.96 ± 4.24 | NA | 7.92 ± 0.82 |
| Southeast Asian 1 | NA | 261.46 ± 113.84 | NA | 29.53 ± 1.10 | NA | NA | 7.33 ± 0.59 |
| Southeast Asian 2 | NA | NA | NA | NA | NA | NA | 6.41 ± 1.04 |
| South Asian | 40.21 ± 15.71 | 34.05 ± 3.70 | 35.48 ± 6.09 | 9.57 ± 0.89 | 17.48 ± 3.96 | 72.39 ± 18.00 | 8.26 ± 1.59 |
| East Asian | 6.83 ± 2.55 | NA | NA | 4.94 ± 0.88 | NA | NA | 7.47 ± 0.44 |
Populations in the first column are gene flow donors, and those in the first row are gene flow recipients. The date (mean ± sd) is estimated for each donor-recipient pair and is given in generations. The date of gene flow from each ancestry summarizes the mean dates estimated from the sub-populations (Supplementary Table S1). NA: no data available.
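Because the table reports dates in generations while the abstract speaks of years, a conversion step is implied. The sketch below shows that arithmetic under a loudly stated assumption: the generation time of 25 years is a placeholder chosen for illustration and is not given in this record. The function name is hypothetical; the input values are the SLM column of the table.

```python
def generations_to_years(mean_gen, sd_gen, years_per_generation=25.0):
    """Scale an admixture date (mean, sd) from generations to years.

    years_per_generation is an ASSUMED value, not one stated in the source.
    """
    return mean_gen * years_per_generation, sd_gen * years_per_generation


# SLM column of the table: donor -> (mean, sd) in generations.
slm_dates = {
    "European": (7.92, 0.82),
    "Southeast Asian 1": (7.33, 0.59),
    "Southeast Asian 2": (6.41, 1.04),
    "South Asian": (8.26, 1.59),
    "East Asian": (7.47, 0.44),
}

slm_years = {
    donor: generations_to_years(mean, sd) for donor, (mean, sd) in slm_dates.items()
}
for donor, (mean_y, sd_y) in slm_years.items():
    print(f"{donor}: {mean_y:.0f} ± {sd_y:.0f} years ago")
```

A different generation time rescales every date proportionally, so the ordering of the admixture events is unaffected by this choice.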