
Paternal genetic affinity between Western Austronesians and Daic populations.

Hui Li, Bo Wen, Shu-Juo Chen, Bing Su, Patcharin Pramoonjago, Yangfan Liu, Shangling Pan, Zhendong Qin, Wenhong Liu, Xu Cheng, Ningning Yang, Xin Li, Dinhbinh Tran, Daru Lu, Mu-Tsu Hsu, Ranjan Deka, Sangkot Marzuki, Chia-Chen Tan, Li Jin.

Abstract

BACKGROUND: Austronesian is a linguistic family spread across most of Southeast Asia, the Pacific Ocean, and the Indian Ocean. On the basis of linguistic similarity, this family includes the Malayo-Polynesians and the Taiwan aborigines. The linguistic similarity also led to the controversial hypothesis that Taiwan is the homeland of all the Malayo-Polynesians, a hypothesis that has been debated by ethnologists, linguists, archaeologists, and geneticists. It is well accepted that the Eastern Austronesians (Micronesians and Polynesians) derived from the Western Austronesians (Island Southeast Asians and Taiwanese), and the Daic populations on the mainland are thought to be the source of all the Austronesian populations.
RESULTS: In this report, we studied 20 SNPs and 7 STRs in the non-recombining region of 1,509 Y chromosomes from 30 Daic populations in China, 23 Malayo-Polynesian populations in Indonesia and Vietnam, and 11 Taiwan aboriginal populations. These three groups show many resemblances in paternal lineages. Admixture analyses demonstrated that the Daic populations are hardly influenced by Han Chinese genetically, and that they make up the largest proportion of Indonesians. Most of the population samples contain a high frequency of haplogroup O1a-M119, which is nearly absent in other ethnic families. The STR network of haplogroup O1a* illustrated that Indonesian lineages did not derive from Taiwan aborigines, as linguistic studies suggest, but from Daic populations.
CONCLUSION: We show that, in contrast to the Taiwan homeland hypothesis, the Island Southeast Asians do not have a Taiwan origin based on their paternal lineages. Furthermore, we show that both Taiwan aborigines and Indonesians likely derived from the Daic populations based on their paternal lineages. These two populations seem to have evolved independently of each other. Our results indicate that a super-phylum, which includes Taiwan aborigines, Daic, and Malayo-Polynesians, is genetically educible.

Year:  2008        PMID: 18482451      PMCID: PMC2408594          DOI: 10.1186/1471-2148-8-146

Source DB:  PubMed          Journal:  BMC Evol Biol        ISSN: 1471-2148            Impact factor:   3.260


Background

Austronesian is one of the most important linguistic families, spread in most regions of Island Southeast Asia, the Pacific Ocean, and the Indian Ocean, and comprising more than one fifth of all the languages in the world [1]. This linguistic family was originally proposed by Murdock [2] by bringing two groups of speakers, i.e. Malayo-Polynesians (Island Southeast Asians (ISEA), Malagasy, Micronesians, and Polynesians) and Taiwan aborigines together as a monophyletic unit based on their linguistic similarity [3,4]. Later, Benedict found that another linguistic family in East Asia, Daic, has many resemblances with the so-called Austronesian, and therefore announced a super-phylum of Austro-Tai [5]. Daic is a linguistic family located to the north of the ISEA groups, mainly in South China. Some Daic populations spread to Laos, Thailand, and as far as India [1]. Substantial resemblances among Taiwan aborigines, Malayo-Polynesians, and Daic speakers have been reported by ethnologists [6-10] and linguists [11-15], linking Taiwan aborigines and Malayo-Polynesians to coastal populations in Southeast China, primarily Daic speakers and their ancestry, Baiyue. The origin of Austronesian has always been a controversial subject in linguistics and other related fields. The Express Train Hypothesis, a well accepted linguistic theory on the origin of Austronesian [3,4,16,17], postulates that proto-Austronesians originated in Taiwan and began to expand southward about 5,000–6,000 years ago by way of the Philippines and Eastern Indonesia. They eventually navigated eastward to Micronesia and Polynesia, and westward to Western Indonesia and Madagascar. The 'express train' refers to a rapid dispersal across the present Austronesian range starting from Eastern Indonesia. 
The hypothesis of the Taiwan origin of all the Austronesians (Taiwan Homeland Hypothesis or THH hereafter) is primarily based on the observation that a much higher linguistic diversity exists among languages of Taiwan aborigines than among the Malayo-Polynesians [3,4]. However, some linguists found evidence against the THH, and suggested that Kalimantan or Sulawesi may be the homeland of Austronesian [15,18,19]. The THH was further challenged by ethnologists [6-9], archaeologists [10], and geneticists [20-25].
Genetic evidence has been equally controversial. Some mitochondrial DNA (mtDNA) studies suggested a Taiwan origin of Polynesians [20-22]. A recent mtDNA study on Taiwan aborigines found a root of the "Polynesian Motif" in Taiwan, which suggests that the THH may be confirmed in maternal lineages [26]. On the other hand, this theory was challenged in paternal lineages by Y chromosome studies that showed a lack of resemblance between the Polynesians and Taiwan aborigines [23]. It was also challenged by other mtDNA studies, which suggest an Indonesian origin of Polynesians [24,25]. The conflicts in the genetic evidence can be attributed to the lack of data from two crucial groups: (1) coastal populations in Southeast Asia ancestral to the three Austronesian groups (Taiwan aborigines, ISEA, and Polynesians), and (2) ISEA populations, including Indonesians, from which Polynesians derived.
Another important factor in the genetic structure of Austronesians is that Eastern Austronesians are distinctly different from Western Austronesians (ISEA and Taiwan aborigines, Figure 1). Autosomal STR variation studies [27] revealed a pronounced genetic division between Polynesians and Western Austronesians. These studies suggest that the Polynesians might have undergone natural selection or have been admixed with Melanesians, a process that changed their genetic structure [16,20,28]. There is also the possibility of genetic drift and founder effects during the dispersal of Polynesians. The genetic structure of Western Austronesians, especially that of the ISEA, is therefore more pivotal to the origin of Austronesians (Figure 1). The high Y chromosome diversity of two Indonesian populations, the Bali and Sumba islanders, suggests that these populations have existed since the Palaeolithic age [29,30]. Because of this high genetic diversity, it appears that the ISEA, especially the Indonesians, are not solely of Taiwanese origin.
Figure 1

Geographic distribution of sampled populations and migration routes suggested by Y chromosome analysis. The codes for the population samples are the same as those in Table 1. Green arrows indicate expansion of Daic; blue arrows, Taiwanese; orange arrows, ISEA. The origin of Polynesians, purple arrows, remains controversial in paternal lineages.

Here, we examined the THH of ISEA by studying the Y chromosome diversity of all relevant population groups such as that of the Daic, Indonesians, and Taiwan aborigines. We show that the paternal lineages of both ISEA and Taiwan aborigines derived from the Daic, although independently of each other. In addition, our findings indicate that it is unlikely that Taiwan is the homeland of the paternal lineages of the ISEA populations.

Results and Discussion

To determine the genetic affinity between the Daic populations and the Western Austronesians, we typed twenty single nucleotide polymorphisms (SNPs) and seven short tandem repeats (STRs) in the non-recombining region of 1,509 Y chromosomes sampled from 30 Daic populations, 23 ISEA populations, and 11 Taiwan aboriginal populations (see Figure 1 for locations of the populations and Table 1 for population information). Almost all of the Daic populations in China and all of the Taiwan aboriginal populations were sampled in this study.
Table 1

Classification, population, and location information of the populations sampled in this study

No. | ETHNIC | ISO639-3 | FAMILY | SUB-FAMILY | BRANCH | POPULATION | COUNTRY | PROVINCE | COUNTY
D1 | Bolyu | ply | Austro-Asiatic | Mon-Khmer | Palyu | 10,000 | China | Guangxi | Longlin
D2 | Yerong | yrn | Daic | Kadai | Bu-Rong | 400 | China | Guangxi | Napo
D3 | Qau | gio | Daic | Kadai | Ge-Chi | 3,000 | China | Guizhou | Bijie
D4 | Blue-Gelao | giq | Daic | Kadai | Ge-Chi | 1,700 | China | Guangxi | Longlin
D5 | Lachi | lbt | Daic | Kadai | Ge-Chi | 9,016 | China | Yunnan | Maguan
D6 | Mollao | | Daic | Kadai | Ge-Chi | 30,000 | China | Guizhou | Majiang
D7 | Red-Gelao | gir | Daic | Kadai | Ge-Chi | 1,500 | China | Guizhou | Dafang
D8 | White-Gelao | giw | Daic | Kadai | Ge-Chi | 1,200 | China | Yunnan | Malipo
D9 | Hlai-Qi | lic | Daic | Kadai | Hlai | 747,000 | China | Hainan | Tongza
D10 | Jiamao | jio | Daic | Kadai | Hlai | 52,300 | China | Hainan | Baoting
D11 | Buyang | byu | Daic | Kadai | Yang-Biao | 3,000 | China | Yunnan | Guangnan
D12 | Cun | cuq | Daic | Kadai | Yang-Biao | 70,000 | China | Hainan | Dongfang
D13 | Laqua | laq | Daic | Kadai | Yang-Biao | 307 | China | Yunnan | Malipo
D14 | Man-Caolan | mlc | Daic | Kam-Tai | Be-Tai | 114,000 | China | Guangxi | Fangcheng
D15 | Zhuang-N | ccx | Daic | Kam-Tai | Be-Tai | 10,000,000 | China | Guangxi | Wuming
D16 | Zhuang-S | ccy | Daic | Kam-Tai | Be-Tai | 4,000,000 | China | Guangxi | Chongzuo
D17 | Lingao | onb | Daic | Kam-Tai | Be-Tai | 520,000 | China | Hainan | Lingao
D18 | E | eee | Daic | Kam-Tai | Be-Tai | 30,000 | China | Guangxi | Rongshui
D19 | Ai-Cham | aih | Daic | Kam-Tai | Kam-Sui | 2,300 | China | Guizhou | Libo
D20 | Dong/Kam | doc | Daic | Kam-Tai | Kam-Sui | 907,560 | China | Guangxi | Sanjiang
D21 | Sui | swi | Daic | Kam-Tai | Kam-Sui | 345,993 | China | Guangxi | Rongshui
D22 | Mak | mkg | Daic | Kam-Tai | Kam-Sui | 10,000 | China | Guizhou | Libo
D23 | Mulam | mlm | Daic | Kam-Tai | Kam-Sui | 159,328 | China | Guangxi | Luocheng
D24 | Maonan | mmd | Daic | Kam-Tai | Kam-Sui | 37,000 | China | Guangxi | Huanjiang
D25 | Biao | byk | Daic | Kam-Tai | Kam-Sui | 20,000 | China | Guangdong | Huaiji
D26 | Then | tct | Daic | Kam-Tai | Kam-Sui | 20,000 | China | Guizhou | Pingtang
D27 | Danga | | Daic | Unclassified | | 1,000,000 | China | Hainan | Lingshui
D28 | DornQdayc | | Daic | Unclassified | | 500,000 | China | Shanghai | Minhang
D29 | CaoMiao | cov | Daic | Kam-Tai | Kam-Sui | 63,632 | China | Guangxi | Rongshui
D30 | Laka | lbc | Daic | Kam-Tai | Kam-Sui | 12,000 | China | Guangxi | Jinxiu
T1 | Amis | ami | Austronesian | Taiwan | Paiwanic | 130,000 | China | Taiwan | Hualien
T2 | Pazeh | uun | Austronesian | Taiwan | Paiwanic | 300 | China | Taiwan | Cholan
T3 | Siraiya-Makatao | fos | Austronesian | Taiwan | Paiwanic | 10,000 | China | Taiwan | Hualien
T4 | Thao | ssf | Austronesian | Taiwan | Paiwanic | 248 | China | Taiwan | Nantou
T5 | Paiwan | pwn | Austronesian | Taiwan | Paiwanic | 53,000 | China | Taiwan | Taitung
T6 | Atayal | tay | Austronesian | Taiwan | Atayalic | 63,000 | China | Taiwan | Yilan
T7 | Rukai | dru | Austronesian | Taiwan | Paiwanic | 8,007 | China | Taiwan | Pingtung
T8 | Pyuma | pyu | Austronesian | Taiwan | Paiwanic | 8,132 | China | Taiwan | Taitung
T9 | Tsou | tsu | Austronesian | Taiwan | Tsouic | 5,797 | China | Taiwan | Kagi
T10 | Bunun | bnn | Austronesian | Taiwan | Paiwanic | 34,000 | China | Taiwan | Hualien
T11 | Saisiyat | xsy | Austronesian | Taiwan | Paiwanic | 4,194 | China | Taiwan | Yilan
I1 | Batak | bbc | Austronesian | Malayo-Polynesian | Western | 5,800,000 | Indonesia | Sumatera Utara |
I2 | Bangka | mly | Austronesian | Malayo-Polynesian | Western | 500,000 | Indonesia | Sumatera Selatan | Bangka
I3 | Malay (Riau) | mly | Austronesian | Malayo-Polynesian | Western | 2,000,000 | Indonesia | Riau |
I4 | Minangkabau | min | Austronesian | Malayo-Polynesian | Western | 4,000,000 | Indonesia | Sumatera Barat |
I5 | Palembang | plm | Austronesian | Malayo-Polynesian | Western | 1,100,000 | Indonesia | Sumatera Selatan |
I6 | Nias | nia | Austronesian | Malayo-Polynesian | Western | 600,000 | Indonesia | Sumatera Utara | Nias
I7 | Dayak | dyk | Austronesian | Malayo-Polynesian | Western | 2,100,000 | Indonesia | Kalimantan Tengah |
I8 | Banjar | bjn | Austronesian | Malayo-Polynesian | Western | 3,000,000 | Indonesia | Kalimantan Selatan |
I9 | Javanese | jav | Austronesian | Malayo-Polynesian | Western | 75,500,000 | Indonesia | Jawa Tengah |
I10 | Tengger | tes | Austronesian | Malayo-Polynesian | Western | 500,000 | Indonesia | Jawa Timur |
I11 | Balinese | ban | Austronesian | Malayo-Polynesian | Western | 3,800,000 | Indonesia | Bali |
I12 | Bugis | bug | Austronesian | Malayo-Polynesian | Western | 3,500,000 | Indonesia | Sulawesi Selatan |
I13 | Toraja | sda | Austronesian | Malayo-Polynesian | Western | 500,000 | Indonesia | Sulawesi Selatan |
I14 | Makasar | mak | Austronesian | Malayo-Polynesian | Western | 1,600,000 | Indonesia | Sulawesi Selatan |
I15 | Minahasa | tom | Austronesian | Malayo-Polynesian | Western | 200,000 | Indonesia | Sulawesi Utara |
I16 | Kaili | lew | Austronesian | Malayo-Polynesian | Western | 471,000 | Indonesia | Sulawesi Tengah |
I17 | Sasak | sas | Austronesian | Malayo-Polynesian | Western | 2,100,000 | Indonesia | Nusa Tenggara Barat | Lombok
I18 | Sumbawa | smw | Austronesian | Malayo-Polynesian | Western | 400,000 | Indonesia | Nusa Tenggara Barat | Sumbawa
I19 | Sumba | xbr | Austronesian | Malayo-Polynesian | Central | 234,574 | Indonesia | Nusa Tenggara Timur | Sumba
I20 | Alor | aol | Austronesian | Malayo-Polynesian | Central | 25,000 | Indonesia | Nusa Tenggara Timur | Alor
I21 | Irian | | Geelvink Bay | | | 20,806 | Indonesia | Irian Jaya |
I22 | Cham | cjm | Austronesian | Malayo-Polynesian | Western | 99,000 | Vietnam | Binhdinh |
I23 | Tsat | huq | Austronesian | Malayo-Polynesian | Western | 4,500 | China | Hainan | Sanya

Detailed information can be looked up online at http://www.ethnologue.com by ISO639-3 code.

In addition, principal component (PC) analysis of 134 East Asian populations encompassing all linguistic groups in East and Southeast Asia was performed using the frequencies of haplogroups defined by SNPs. The result showed that Daic populations are closer to the Western Austronesian groups than any other East and Southeast Asian populations are (Figure 2), indicating a strong genetic affinity between Daic speakers and Western Austronesians. The separation of the Daic-ISEA-Taiwan cluster from the other ethnic groups is attributable to PC2 rather than to PC1, and O1a* is the haplogroup that shows the strongest correlation with PC2 (r2 = -0.875, P < 10^-4; see Additional file 1 for details). Furthermore, O1a-M119 is the dominant haplogroup in Taiwan aborigines (average 77%), ranging from 54% to 100% (Table 2, sum of O1a* and O1a2). This lineage is also highly prevalent in Daic speakers (20.5%) and in ISEA (21.2%), but not in the other East Asians (< 5%) [23,31-34]. Therefore, O1a-M119 is expected to provide much information for delineating the relationship between the Daic and Western Austronesians.
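As an illustration of the kind of analysis described above, the following is a minimal sketch of a principal component computed from a population-by-haplogroup frequency matrix. The frequencies are toy values, not the data of this study, and the function names are hypothetical; the software actually used for the PC analysis is not stated in this section. Only the first component is extracted, by power iteration on the covariance matrix.

```python
import math

# Toy haplogroup frequencies (columns: O1a, O2a, O3); NOT the study's data.
TOY_FREQS = [
    [0.8, 0.1, 0.1],  # population A
    [0.7, 0.2, 0.1],  # population B
    [0.1, 0.2, 0.7],  # population C
    [0.2, 0.1, 0.7],  # population D
]


def center_columns(rows):
    """Subtract each column mean so every haplogroup has mean frequency 0."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [[r[j] - means[j] for j in range(k)] for r in rows]


def first_pc(rows, iterations=200):
    """Return (unit axis, per-population scores) for the first principal component."""
    x = center_columns(rows)
    n, k = len(x), len(x[0])
    # Sample covariance matrix between haplogroup columns (k x k).
    cov = [[sum(x[i][a] * x[i][b] for i in range(n)) / (n - 1)
            for b in range(k)] for a in range(k)]
    # Power iteration from an arbitrary non-zero starting vector.
    v = [1.0 / (j + 1) for j in range(k)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # Project each centered population onto the axis.
    scores = [sum(x[i][j] * v[j] for j in range(k)) for i in range(n)]
    return v, scores


axis, scores = first_pc(TOY_FREQS)
```

In this toy case, populations A and B (high O1a) fall on one side of the axis and C and D (high O3) on the other, which is the sense in which a single haplogroup can drive the separation along one component.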
Figure 2

Principal component plot of Y-SNP. (A) PC plot of all the population samples. DC (green stars) is closest to MP (purple crosses) and TA (blue crosses). All of the other groups including ST, HM, AA, and AT (red spots including triangles, squares and diamonds) are rather far removed from MP and TA, which indicates that DC is the only group that might be related to MP and TA. (B) PC plots of pooled samples. The ST, HM, AA, and AT samples were pooled according to the linguistic families. The DC samples were pooled according to the sub-families. MP and TA samples were pooled according to the geographic locations. Ethnic groups: AA, Austro-Asiatic speakers; AT, Altaic speakers; DC, Daic speakers; HM, Hmong-Mien speakers; MP, Malayo-Polynesian speakers; ST, Sino-Tibetan speakers; TA, Taiwan aborigines.

Table 2

Y-SNP haplogroup frequencies of the newly studied samples (%)

Population | Size | C | D* | D1 | F | M | K | O* | O1a* | O1a2 | O2a* | O2a1 | O3* | O3a1 | O3a4 | O3a5 | O3a5a | P
Bolyu303.33.310.010.03.323.330.06.710.0
Yerong1662.56.318.812.5
Qau1315.47.723.115.430.87.7
Blue Gelao303.313.360.016.73.33.3
Lachi303.33.313.313.316.76.710.03.36.723.3
Mollao3010.03.313.33.33.363.33.3
Red Gelao313.26.522.622.616.112.916.1
White Gelao1435.714.342.97.1
Hlai-Qi3435.332.429.42.9
Jiamao2725.951.922.2
Buyang323.16.36.39.43.171.9
Cun313.26.59.738.738.73.2
Laqua2532.04.060.04.0
Man-Caolan3010.010.053.33.320.03.3
Zhuang-N2213.64.672.74.64.6
Zhuang-S1513.320.060.06.7
Lingao303.316.726.713.33.310.026.7
E313.23.29.716.16.554.83.23.2
Laka234.452.24.48.726.14.4
Kam/Dong3821.15.310.539.510.52.610.5
Sui508.010.018.044.020.0
Mak&AiCham402.587.55.02.52.5
Mulam402.512.57.55.05.025.030.07.55.0
Maonan329.49.415.656.39.4
Biao342.95.914.717.752.95.9
Then303.33.333.350.06.73.3
Danga4020.05.02.57.517.57.55.017.52.515.0
DornQdayc-S742.16.339.612.58.34.227.1
DornQdayc-N515.92.02.031.429.42.02.011.813.7
CaoMiao338.210.03.066.712.1
Amis287.142.817.87.121.43.6
Pazeh2114.338.119.114.314.3
Makatao372.72.75.470.35.413.5
Thao224.681.84.69.1
Paiwan2263.627.39.1
Atayal2295.54.5
Rukai1181.818.2
Pyuma1172.79.19.19.1
Tsou1888.95.65.6
Bunun175.917.658.817.6
Saisiyat1145.59.19.19.127.3
Batak1311.619.323.115.423.17.7
Bangka137.77.730.823.123.17.7
Malay137.77.77.738.57.723.17.7
Minangkabau156.720.020.013.320.020.0
Palembang119.163.618.29.1
Nias128.391.7
Dayak156.726.720.020.06.76.713.3
Banjar1513.36.726.726.726.7
Javanese1526.726.720.013.313.3
Tengger1216.78.333.333.38.3
Balinese1428.614.37.128.614.37.1
Bugis1513.320.033.326.76.7
Torajan1513.313.313.313.36.733.36.7
Minahasa147.150.021.47.114.3
Makassar1323.130.815.47.723.1
Kaili156.733.320.06.726.76.7
Sasak1513.313.326.76.720.020.0
Sumbawa1816.783.3
Sumba1414.378.67.1
Alor1338.530.723.17.7
Irian1145.536.418.2
Cham119.190.9
Tsat3112.916.158.13.26.53.2

Zhuang and DornQdayc are divided into Southern (S) and Northern (N) parts.

The PC plot in Figure 2 indicates that some Daic populations are close to the Sino-Tibetan cluster. It is possible that Daic and Sino-Tibetan populations share a common ancestry, which might have resulted in their genetic resemblance. However, another explanation for this observation is that Daic populations in mainland East Asia may have been genetically influenced by Han Chinese, as the two have coexisted as neighbors for around 2,500 years. Admixture analysis can estimate the proportions of assumed Daic or Han ancestry in the present Daic populations, and Daic populations isolated from Han Chinese can serve as the parental population in such an analysis. Aboriginal populations on Hainan Island (Hlai, Jiamao, and Cun) and Taiwan Island are assumed to have been relatively isolated, as their cultures were little influenced by outside cultures from the mainland. Therefore, the genetic structures of these island aborigines might be the closest to that of ancestral Daic [35].
To estimate the assumed genetic influence of Han Chinese on the mainland Daic, we applied the Y-SNP data of mainland Daic, Hainan aborigines, Taiwan aborigines, and Han Chinese [34] to our admixture analysis. For this analysis, we set the latter three pooled populations as the parental populations of mainland Daic. Our results show that the genetic contribution of the Hainan aborigines is very high (2.145 ± 0.927), while those of the Han Chinese (-0.314 ± 0.422) and Taiwan aborigines (-0.831 ± 0.662) are hardly detected. Here, the negative contribution estimates produced by the ADMIX program suggest that these two groups made no detectable contribution to the present Daic populations. This result indicates that the paternal lineages of Daic populations are relatively undisturbed, and the genetic affinity between Daic and Western Austronesian populations has hardly been influenced by population admixture.
The ISEA populations may also be admixed. In our study, we assumed that the ISEA were an admixture of three potential parental populations: Daic populations, Taiwan aborigines, and the indigenous populations of the Sunda Islands, who are similar to Papuans. We performed an admixture analysis on the Indonesians, and included data of the Papuans from the literature [36,37] as one of the parental population structures in the analysis. Our analysis showed the following admixture proportions: Daic (0.713 ± 0.124), Taiwan (0.143 ± 0.125), and Papuans (0.144 ± 0.050), indicating that the Daic contribution to the Indonesians is the dominant one. There is some uncertainty in these data, as our assumption that the ISEA population is an admixture cannot be tested.
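To see how an admixture estimate can fall outside the range 0 to 1, as with the negative values reported above, consider the following minimal sketch. It is not the estimator implemented in the ADMIX program; it is a simple unconstrained least-squares fit for two parental populations, with hypothetical function names and toy haplogroup frequencies, assuming the admixed population is a linear mixture of the two parents.

```python
def two_parent_admixture(hybrid, parent1, parent2):
    """Least-squares share of parent1, assuming hybrid = m*parent1 + (1-m)*parent2.

    The estimate is deliberately left unconstrained, so it can be below 0
    or above 1 when the hybrid lies outside the range spanned by the parents.
    """
    num = sum((h - b) * (a - b) for h, a, b in zip(hybrid, parent1, parent2))
    den = sum((a - b) ** 2 for a, b in zip(parent1, parent2))
    if den == 0:
        raise ValueError("parental populations have identical frequencies")
    return num / den


# Toy haplogroup frequencies; NOT the study's data.
parent_a = [0.6, 0.3, 0.1]
parent_b = [0.1, 0.3, 0.6]

# A population that is an exact 70/30 mixture of the two parents.
mixed = [0.45, 0.30, 0.25]
m_inside = two_parent_admixture(mixed, parent_a, parent_b)

# A population more extreme than parent_a: parent_b's share comes out negative.
extreme = [0.70, 0.25, 0.05]
m_outside = two_parent_admixture(extreme, parent_a, parent_b)
share_b_outside = 1.0 - m_outside
```

In the second case the fitted share of `parent_a` exceeds 1 and that of `parent_b` is negative, which is read as no detectable contribution from `parent_b` rather than as a literal proportion.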
As the haplogroup O1a* is the most distinctive haplogroup of the Daic and Western Austronesian populations, we estimated pairwise genetic divergence between Daic, Indonesians, and Taiwan aborigines using seven STRs carried by O1a* individuals (see Table 3 for genetic distances and Additional file 2 for STR raw data). Our study shows that the divergence between Taiwan aborigines and Indonesians is the largest, about three times that between the Daic group and Taiwan aborigines. The divergence between the Daic group and Indonesians is comparable to that between the Daic group and Taiwan aborigines. These findings indicate that the Indonesians and Taiwan aborigines are each genetically closer to the Daic group than the two Western Austronesian groups are to each other. Furthermore, the diversity based on the seven STRs carried by O1a* individuals is higher in the Daic speakers than in Indonesians and Taiwan aborigines (Table 3). The population with the highest diversity is not always the oldest, since high diversity can also result from admixture with neighbouring populations. However, the high diversity of the O1a* haplogroup in the Daic speakers most likely reflects the greater age of the population, as this haplogroup is almost absent in the neighbouring populations and admixture therefore cannot add diversity. Taking the results of diversity and divergence together, the Daic population group is likely the ancestral group from which the Indonesians and Taiwan aborigines derived separately in paternal lineages. Other haplogroups of Y chromosomes (e.g. O3-M122, O2a-M95) displayed a pattern similar to that of O1a*, showing that the Daic group is genetically closer to Indonesians and Taiwan aborigines than these latter two groups are to each other (Table 3). Interestingly, O2a may be traced even further to Austro-Asiatic populations, as suggested by a recent study [38].
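The within-group diversity compared above can be sketched as follows. This assumes the common unbiased estimator of gene diversity, h = n/(n-1) × (1 - Σ p_i²), computed per STR locus and then averaged over loci; whether this matches the exact procedure of the cited reference is an assumption. The haplotypes are toy repeat counts and the function names are hypothetical.

```python
from collections import Counter


def gene_diversity(alleles):
    """Unbiased gene diversity at one locus: n/(n-1) * (1 - sum of squared allele frequencies)."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def average_gene_diversity(haplotypes):
    """Average the per-locus diversity over all STR loci of a set of haplotypes."""
    loci = list(zip(*haplotypes))  # transpose: one tuple of alleles per locus
    return sum(gene_diversity(locus) for locus in loci) / len(loci)


# Toy two-locus STR haplotypes (repeat counts); NOT the study's data.
toy_haplotypes = [(13, 24), (13, 25), (14, 24), (13, 24)]
toy_average = average_gene_diversity(toy_haplotypes)
```

Here each toy locus has allele frequencies 0.75 and 0.25 among four chromosomes, so both loci, and hence their average, give a diversity of 0.5.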
Table 3

Y-STR diversity of O1a, O2a, and O3 haplogroup

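Between-group Diversity (Genetic distance)

Groups | RST O1a | RST O2a | RST O3 | Linearized RST O1a | Linearized RST O2a | Linearized RST O3
Daic-TA | 0.109 (P < 10^-5) | 0.012 (P = 0.271) | 0.019 (P = 0.187) | 0.122 | 0.012 | 0.019
Daic-ISEA | 0.108 (P < 10^-5) | 0.093 (P < 10^-5) | 0.049 (P = 0.001) | 0.121 | 0.102 | 0.052
TA-ISEA | 0.269 (P < 10^-5) | 0.318 (P < 10^-5) | 0.285 (P < 10^-5) | 0.368 | 0.466 | 0.398

Within-group Diversity

Group | Size O1a | Size O2a | Size O3 | Average Gene Diversity O1a | Average Gene Diversity O2a | Average Gene Diversity O3 | Average Variance O1a | Average Variance O2a | Average Variance O3
Daic | 140 | 292 | 145 | 0.601 | 0.518 | 0.658 | 0.938 | 1.041 | 1.494
ISEA | 75 | 38 | 64 | 0.547 | 0.397 | 0.498 | 0.897 | 0.320 | 0.634
TA | 147 | 12 | 14 | 0.503 | 0.543 | 0.621 | 0.656 | 0.685 | 1.220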

For all three haplogroups (O1a, O2a, and O3), the Y-STR genetic distance between Taiwan aborigines (TA) and ISEA was always the largest, more than twice the distance between Daic and either the TA or the ISEA group. The two statistics, RST and linearized RST, follow reference [53].

Average gene diversity follows reference [55], and variance follows reference [56]. Daic had the largest average variance for all three haplogroups and the largest average gene diversity for O1a and O3; for O2a, TA had a slightly higher gene diversity than Daic.
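The relation between the two distance columns of Table 3 can be checked with a short sketch. It assumes the usual linearization RST / (1 - RST), which reproduces the tabulated O1a values to within rounding; the function name is hypothetical, and the RST values themselves are taken from Table 3 rather than recomputed from STR data.

```python
def linearize_rst(rst):
    """Linearized distance RST / (1 - RST); assumed form of the linearization."""
    if rst >= 1.0:
        raise ValueError("RST must be below 1")
    return rst / (1.0 - rst)


# (RST, linearized RST) for haplogroup O1a, as listed in Table 3.
TABLE3_O1A = {
    "Daic-TA": (0.109, 0.122),
    "Daic-ISEA": (0.108, 0.121),
    "TA-ISEA": (0.269, 0.368),
}

# Largest difference between recomputed and tabulated values (rounding only).
max_rounding_gap = max(
    abs(linearize_rst(rst) - tabulated) for rst, tabulated in TABLE3_O1A.values()
)

# How much larger the TA-ISEA distance is than the Daic-TA distance.
fold_difference = linearize_rst(0.269) / linearize_rst(0.109)
```

With these inputs the recomputed values agree with the table to the third decimal place, and the TA-ISEA distance comes out at roughly three times the Daic-TA distance.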

A median-joining network was constructed based on 7-STR haplotypes of O1a* individuals in the three ethnic groups (Figure 3). If THH of ISEA is true, i.e., ISEA primarily derived from Taiwan aborigines, one would expect sharing and/or connections of ISEA lineages and Taiwan aboriginal lineages in the network. In Figure 3, Daic lineages (green nodes) constitute the center of the network. All ISEA lineages (yellow nodes) and Taiwan aboriginal lineages (blue nodes) are either shared or connected to one of the Daic lineages, either directly or indirectly. In contrast, none of the Taiwan aboriginal lineages (except for one) are shared with or connected to the ISEA lineages. These observations suggest that ISEA did not directly derive from Taiwan aborigines but that the ISEA and Taiwan aborigines derived from the Daic independently of each other.
Figure 3

Haplotype network of Y-STRs of Haplogroup O1a* individuals. As the original network was too complicated to display, here we presented the shortest tree of the largest possibility reduced from the network (this function is available in the recent versions of NETWORK program). Each node represents an O1a* STR haplotype. The lengths of the lines are proportional to the mutation steps. The broken line stands for only one step. The sizes of the nodes are proportional to their frequencies. Almost none of the ISEA haplotypes is directly linked to Taiwan aborigines, and both ISEA and Taiwanese are linked directly or indirectly to the Daic haplotypes holding the centre of the network (big green node).

We further examined the Daic lineages that are connected to ISEA lineages in the network. Interestingly, most of the Daic haplotypes connecting to the ISEA are either from Hainan Island or from Guangxi, which lies to the northwest of Hainan (green nodes with dark green frames in Figure 3). These Hainan and Guangxi populations are located around the Gulf of Tonkin. In particular, Cham, a Malayo-Polynesian population in South Vietnam, as well as Tsat in Hainan, which is a subgroup of Cham [11,39], were found to connect Daic and Indonesians in the network. Therefore, we hypothesized that the ISEA likely originated in the area around the Gulf of Tonkin, and migrated southward through the Indochina Peninsula to the Malay Peninsula before they spread to most of the islands of the Pacific Ocean and the Indian Ocean.
The age of the O1a* haplogroup was estimated in the network. The total age is 33765 ± 5221 years, which corresponds to the last Ice Age. The age of all the Daic samples in the network is 33193 ± 5577 years, close to the age of O1a*. It is not easy to estimate the real age of the Taiwan clusters as they overlap with the Daic haplotypes to a large extent. This kind of overlap also indicates multiple migrations from Daic populations to Taiwan aborigines. We estimated the age of the Taiwan cluster in the left side of the network to be 14659 ± 3110 years. The estimated age of all the Taiwan samples is 21268 ± 3148 years. Interestingly, this latter age is close to the age of the oldest human remains found in Taiwan, those of the Chochen Man [40]. Therefore, we conclude that the migration of O1a* individuals from the mainland to Taiwan Island occurred during the Palaeolithic Age.
Because two fairly specific clusters of ISEA haplotypes can be observed in the network, we performed time estimates in both clusters. The age of the left ISEA cluster in the network is 9895 ± 2393 years, whereas that of the right cluster is 25880 ± 7137 years. The linguistic estimate for the origin of Malayo-Polynesian, around 5000–6000 years ago [16], is younger than our estimates. Moreover, little overlap between Daic haplotypes and ISEA haplotypes is observed in the network, which indicates that bottleneck effects might have formed the two ISEA clusters during the emigration of ISEA populations out of the ancestral Daic populations. Geographically, the bottleneck might be the narrow seashore of Vietnam. Therefore, the O1a* haplogroup was most probably introduced into ISEA populations during the origin of the Malayo-Polynesians more than 7500 years ago. However, the possibility of recent migrations of O1a individuals into ISEA cannot be ignored, because the genetic time estimate is not precise enough to eliminate such a possibility.
It should be noted that the Express Train Hypothesis has two different aspects: 1) the origin of the migrations, i.e. the Taiwan Homeland Hypothesis, and 2) the mode of migrations, i.e., a rapid dispersal starting from Indonesia. In this study, we examined the THH in Western Austronesians by including the Daic speakers and ISEA, both of which are largely missing in previous studies. We show that Taiwan is unlikely to be the homeland of Indonesian ISEA, at least not for the major paternal lineages. Although both Taiwan aborigines and Indonesian ISEA derived from the Daic, their departures occurred separately, suggesting that the major paternal lineages of Western Austronesian populations are not monophyletic. Interestingly, the spread of the domestic pig in the Southeast Asian archipelago and the Pacific took place in almost the same way as that of Western Austronesian populations suggested by our study. The pigs in Taiwan and in regions as far as Micronesia came directly from the mainland of East Asia, while those in the Southeast Asian archipelago and Polynesia came from the Indochina Peninsula. It is assumed that the domestic pig was introduced by human populations during early migrations, which would imply that humans also entered the Southeast Asian archipelago and the Pacific by two different routes [41]. In fact, our observations are consistent with a monophyletic Austro-Tai super-phylum which contains Daic speakers, Malayo-Polynesians, and Taiwan aborigines [5].
The observations presented in this study demonstrate that it is necessary to include Daic populations and ISEA in studies of Austronesian origins. Without these groups, Polynesians and Taiwan aborigines would have appeared most similar to each other, leading to the conclusion that all the Austronesians originated in Taiwan. Our results suggest that the Gulf of Tonkin is more likely the homeland of the paternal lineages of ISEA. Due to the complex nature of population migrations from Eastern Indonesia to the Pacific Islands [23,42-47], and the pronounced genetic division between Eastern and Western Austronesians [27], we opted not to include Polynesian data in our analysis. Instead, we only analyzed Western Austronesians. The absence of O1a-M119 in Polynesian populations is intriguing, and it cannot be explained simply by invoking the bottleneck effect [21-25], given that a great deal of diversity of Y chromosome haplotypes has been observed in Polynesians [23,42].
Consistent with our findings for paternal lineages, mitochondrial DNA studies on populations from Peninsular Malaysia also suggest an ancestry of aboriginal Malays in Indochina around the time of the Last Glacial Maximum [48]. This ancestry subsequently dispersed through the Malay Peninsula into island Southeast Asia [48]. The ISEA mtDNA studies also indicated that if an Austronesian migration from Taiwan did take place, it was demographically minor [49].
Most of our conclusions are based on the analysis of O1a*, which is only a fraction of the Y chromosome lineages found in these populations. The frequency of this group of lineages is remarkable in Taiwanese populations, but it is not so dramatic in Malayo-Polynesians or Daic populations. It is possible that some population events could have involved other Y chromosome lineages. It is also reasonable that there are other minor parts of paternal lineages with different origins, such as aboriginal populations of Indonesia prior to the formation of Austronesian, or that more recent migrations from South Asia took place [29]. The genetic relationships among East and Southeast Asians are much more complicated than expected.

Conclusion

Our results show that the Daic populations are closer to the Western Austronesian populations in paternal lineages than any other ethnic group in East Asia is. The STR diversity of the Y chromosome haplogroup O1a-M119, the major haplogroup among the Daic and Western Austronesian populations, shows that the Taiwan aborigines and ISEA, two groups of Western Austronesians, derived from the Daic independently of each other. Therefore, it is most likely that the ISEA populations mainly originated in the region around the Tonkin Gulf, the homeland of the Daic, and migrated to Indonesia through the Vietnam corridor. In contrast, the Taiwan aborigines migrated directly from mainland China. Our results indicate that a super-phylum, which includes Taiwan aborigines, Daic, and Malayo-Polynesians, can be inferred from the genetic data.

Methods

Sampling

Blood samples from 30 Daic populations across South China were collected using FTA cards (Whatman® Inc), covering almost all of the Daic populations in China. Samples from 11 Taiwan aboriginal populations were collected from both the lowlands and the highlands of Taiwan. Samples from 23 Malayo-Polynesian populations were also collected: 21 across Indonesia, 1 from Binhdinh in Vietnam, and 1 from Hainan in China. The sample size for each population is given in Table 2. All of the 1,509 individuals studied from these populations are unrelated and gave their consent for this study. Individuals were sampled from diverse regions of each population's distribution area to make the sample more diverse. Reference data for 70 other groups in East and Southeast Asia were obtained from the literature (including some Daic-speaking populations [23], Malayo-Polynesians [23], Taiwan aborigines [23], Tibeto-Burman-speaking populations [31-33], Han Chinese [31,34], and Altaic-speaking populations [31]), for a total reference sample size of 1,348 individuals. In the PC analysis, these samples represent a total of 134 population groups, including both newly typed and previously published populations. Although the sample sizes of some populations were relatively small, we do not think it is necessary to enlarge them, as they were collected from very small populations with low Y chromosome diversity, such as the Ai-Cham and Geelvink Irians. The effective population size of the Y chromosome is usually less than one fourth that of the autosomes. Therefore, Y chromosome diversity studies require much smaller sample sizes than studies of autosomal genetic markers. For a population of typical size, some hundred thousand people, a sample of around 30 individuals is sufficient. Even fewer samples are required for small populations. Here we maintained a sample size of around 30 for most of the populations, and around 15 for small populations.
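The one-fourth figure can be sketched by counting gene copies. This is an idealized illustration only: it assumes equal numbers of breeding males and females, and the symbols N_m (breeding males) and N_f (breeding females) are introduced here and do not appear in the original text.

```latex
% Idealized count of gene copies among N_m breeding males and N_f breeding females
\begin{align}
  \text{autosomal copies} &= 2\,(N_m + N_f), \\
  \text{Y-chromosome copies} &= N_m, \\
  \frac{\text{Y-chromosome copies}}{\text{autosomal copies}}
    &= \frac{N_m}{2\,(N_m + N_f)} = \frac{1}{4} \quad \text{when } N_m = N_f .
\end{align}
```

Under this reading, anything that reduces the number of males contributing to the next generation, such as greater variance in male than in female reproductive success, would push the ratio below one fourth.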

Genetic markers

Twenty bi-allelic Y-chromosome markers (SNPs), namely YAP, M15, M130, M89, M9, M5, M122, M134, M7, M117, M121, M111, M17, M175, M119, M110, M95, M88, M45, and M120, were typed by PCR-based restriction fragment length polymorphism methods [31]. Most of these markers are highly informative in East Asians and define 19 haplogroups following the Y Chromosome Consortium nomenclature [50]. Seven microsatellite markers (STRs) on the Y chromosome, namely DYS19, DYS388, DYS389-1, DYS390, DYS391, DYS392, and DYS393, were typed using fluorescent-labelled primers [51]. The genotyping results are given in Additional file 2.

Data analysis

Population relationships were investigated by principal component (PC) analysis of Y-chromosome haplogroup frequencies using SPSS 11.0 software (SPSS Inc.). Some of the SNPs, such as M175 and M117, were not typed for the previously published populations; therefore, in our PC analysis, our O*-M175 data were combined into haplogroup K, and O3a5a-M117 into O3a5*. Correlation analysis among haplogroups and PCs was also conducted using SPSS 11.0. The admixture analysis was performed using the ADMIX 2.0 program [52] in order to evaluate the genetic influence of Han Chinese on the Daic populations. We assumed that the potential admixture started 2,500 years ago, when the Qin army entered the Daic area in Canton. The admixture proportions of the Indonesians were also estimated by ADMIX 2.0, and the admixture history was assumed to start 5,000 years ago. The genetic distances among Daic, Taiwan aborigines, and Malayo-Polynesians were estimated by RST and linearized RST [53] using ARLEQUIN software [54], and the diversities of the three groups were evaluated by average gene diversity, haplotype diversity [55], and variance of the STR allele sizes [56]. A Median-Joining network of O1a* STR haplotypes was drawn using Network 4.1 software (Fluxus Technology Ltd). The age of O1a* was estimated in the network. The mutation rate used in the time estimate is 1.932 × 10⁻⁴ per year, the sum of the mutation rates [57] of all the STRs used in the network. We assumed 25 years for one generation.
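The diversity measures and the time estimate described above can be illustrated with a minimal Python sketch. It is not the ARLEQUIN or Network implementation. The haplotypes below are hypothetical toy data, not results from this study, and the function names are introduced here for illustration. The age function assumes a rho-style estimate (mean number of single-repeat steps from a root haplotype, divided by the summed mutation rate) and simplifies it by measuring each haplotype's direct distance to the root rather than its path through a network.

```python
from collections import Counter
from statistics import variance

# Values stated in the text.
MUTATION_RATE_PER_YEAR = 1.932e-4  # summed over the STRs used in the network
YEARS_PER_GENERATION = 25


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def mean_str_variance(haplotypes):
    """Sample variance of repeat counts at each locus, averaged over loci."""
    loci = list(zip(*haplotypes))  # one tuple of repeat counts per locus
    return sum(variance(locus) for locus in loci) / len(loci)


def rho_age_years(haplotypes, root, rate_per_year=MUTATION_RATE_PER_YEAR):
    """Simplified rho-style age: mean repeat-step distance to root / rate."""
    steps = [sum(abs(a - b) for a, b in zip(h, root)) for h in haplotypes]
    rho = sum(steps) / len(steps)
    return rho / rate_per_year


# Hypothetical repeat counts at seven STR loci (toy data only).
root = (15, 12, 12, 23, 10, 13, 13)
sample = [
    root,
    root,
    (16, 12, 12, 23, 10, 13, 13),  # one step away at the first locus
    (15, 12, 12, 24, 10, 13, 13),  # one step away at the fourth locus
]

print("haplotype diversity:", haplotype_diversity(sample))
print("mean STR variance:", mean_str_variance(sample))
print("age in years:", rho_age_years(sample, root))
print("rate per generation:", MUTATION_RATE_PER_YEAR * YEARS_PER_GENERATION)
```

With 25 years per generation, the stated rate of 1.932 × 10⁻⁴ per year equals 4.83 × 10⁻³ per generation for the whole haplotype. If all seven typed STRs enter the network with a common rate (an assumption, since the text does not list the loci used), this works out to 6.9 × 10⁻⁴ per locus per generation.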

Authors' contributions

HL, SJC, BS, YL, PP, ZQ, WL, XC, XL, and NY carried out the molecular genetic studies. HL and LJ drafted the manuscript. HL, BW, DL, MH, RD, SM, CCT, and LJ participated in the design of the study and performed the statistical analysis. HL, SJC, PP, SP, and DT collected the samples. All authors read and approved the final manuscript.

Additional file 1

Correlation coefficients between haplogroups and PCs. Although the P values of the correlation coefficients between PC1 and M9, M110, M95, M88, etc. are all highly significant, all of these correlation coefficients are less than 0.5. Thus, PC1 carries little information about the ethnic clustering. In contrast, PC2 is significantly correlated with O1a-M119, with a large correlation coefficient. This haplogroup distinguishes the Daic-MP-TW cluster. Thus, PC2 provides information on ethnic clustering.

Additional file 2

Y-STR haplotypes of individual samples. The names of the individuals begin with the ISO 639-3 codes of their populations.