
Patrilineal background of esophageal cancer and gastric cardia cancer patients in a Chaoshan high-risk area in China.

Shuhui Liu, Bo Huang, Haihua Huang, Xiaoyun Li, Guangcan Chen, Guohong Zhang, Wengting Lin, Dan Guo, Jie Wang, Zefeng Yu, Xi Liu, Min Su.

Abstract

The Taihang Mountain range of north-central China, the southern region of Fujian province, and the Chaoshan plain of Guangdong province are 3 major regions in China well known for their high incidence of esophageal cancer (EC). These areas also exhibit a high incidence of gastric cardia cancer (GCC). The ancestors of the Chaoshanese, now the major inhabitants of the Chaoshan plain, came from north-central China. We hypothesized that EC and GCC patients in the Chaoshan area share a common ancestry with Taihang Mountain patients. We analyzed 16 East Asian-specific Y-chromosome biallelic markers (single nucleotide polymorphisms; Y-SNPs) and 6 Y-chromosome short tandem repeat (Y-STR) loci in 72 EC and 48 GCC patients from Chaoshan and 49 EC and 63 GCC patients from the Taihang Mountain range. We also compared data for 32 Chaoshan Hakka people and 24 members of the aboriginal She minority who live near the Chaoshan area. Y-SNP data were analyzed by frequency distribution and by principal component, correlation and hierarchical cluster analysis. Chaoshan patients were closely related to Taihang Mountain patients, even though they are geographically distant. Y-STR analysis revealed that the 4 patient groups were more closely related to each other than to other groups. Network analysis of haplogroup O3a3c1-M117 showed a high degree of patient-specific substructure. We suggest that EC and GCC patients from these 2 areas share a similar patrilineal genetic background, which may be an important genetic factor for EC and GCC in these populations.


Year:  2013        PMID: 24339953      PMCID: PMC3858265          DOI: 10.1371/journal.pone.0081670

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Esophageal cancer (EC) is one of the most common fatal cancers worldwide. China has geographical “hot spots” of high EC incidence. A well-known high-risk region is the Taihang Mountain area between Henan, Hebei, and Shanxi provinces in north-central China, part of the famous “Asian EC belt” that ranges from the Caucasus Mountains, across northern Iran, all the way to northern China [1]. The incidence of gastric cardia cancer (GCC) is also high in the belt. For example, the world-standardized incidence of EC and GCC in Linxian, Henan province, was 81.96/100,000 and 31.04/100,000, respectively, between 1983 and 2002 [2], [3]. The Chaoshan area in southern China is another EC high-risk area. The age-standardized incidence rates on Nanao island for EC and GCC were 74.47/100,000 and 34.81/100,000, respectively, between 1995 and 2004 [4]. The geographic features of the south-littoral Chaoshan area and the north-central Taihang Mountain area are distinct, but the incidence of EC and GCC is high in both regions [5]. We and others have reported familial aggregation of EC and GCC and increased EC and GCC risk in family members in this high-risk population [6]–[9]. In the Chaoshan high-risk area, the incidence of EC and GCC is not uniform among population groups, although they are exposed to a similar environment. The 3 main populations in the Chaoshan area are 2 Han populations, the Chaoshanese with Chaoshan dialects and the Hakka with Hakka dialects, and one local aboriginal She population. Since the Qin Dynasty (216∼207 BC), the Henan and Shanxi Han people of north-central China have migrated into the Chaoshan area in Guangdong province via Fujian province because of war and famine. They gradually became the predominant inhabitants of the Chaoshan area and are called Chaoshanese [10]; as a result, the Chaoshan dialect is similar to ancient Chinese. Hakka Chinese originated from the northern Han Chinese of the Yellow River and Luohe River basin of the Central Plain. From the Jin Dynasty (266∼316 AD) to the Song Dynasty (960∼1297 AD), they were also forced to move south because of wars. When the Hakkas arrived in the Chaoshan area, the Chaoshanese had already settled in the rich plain area, so the Hakkas had to settle in the mountain area, where they lived with the local aborigines, the She population (Fig 1).
Figure 1

Geographic distribution of the three studied EC and GCC high-risk populations and the two low-risk populations (Hakka and She) in the Chaoshan area.

Arrows show the north-to-south migrations of Han inhabitants from north-central China according to historical records. 218BC, AD311 and AD669 are the three major time periods of north-to-south migrations.

The Hakka and Chaoshanese populations each have distinctive cultures [10]–[13] that share many similarities with northern Han Chinese culture, including some features of dialect, lifestyle, customs, and habits [10]. The Chaoshan She population is the only aboriginal minority population in the area. She people mainly work in agriculture, forestry, and animal husbandry; their language and living customs differ from those of the Han population [14]. Although all 3 populations are exposed to a similar geographical environment, only the Chaoshanese have a high incidence of EC and GCC. Our previous research on Y-chromosome and mtDNA haplogroups concluded that the EC high-risk populations in Taihang Mountain, Fujian Minnan and Guangdong Chaoshan share a similar patrilineal and matrilineal genetic background [15], [16]. In the present study, we further explored the patrilineal genetic structure of EC and GCC patients in Chaoshan high-risk areas and compared it with matched high-risk populations and corresponding low-risk populations. We aimed to examine whether Chaoshan cancer patients have a common ancestry with Taihang Mountain patients and whether they share the same unique Y-chromosome haplotypes. We also compared these data for Y-chromosome single nucleotide polymorphisms (Y-SNPs) and Y-chromosome short tandem repeats (Y-STRs) with those of other Chinese populations from public databases to explore the relative genetic affinity of the studied populations.
We first analyzed the non-recombining portion of the Y chromosome (NRY) in these 6 populations with 16 East Asian-specific biallelic markers (SNPs) [17], [18], which are characterized by a low mutation rate and low probabilities of back and parallel mutation and are therefore suitable for tracing early demographic events in human history. We then investigated the genetic distance among EC and GCC patients with Y-STR loci, which have a relatively high mutation rate and are appropriate for analyzing the relationships among closely related groups and their microevolution [15], [16]. Both the Y-SNP and Y-STR results indicate that the Chaoshan patients are closely related genetically to the Taihang Mountain patients and that the patients are more closely related to each other than to the high-risk populations.

Results

Distribution of NRY Haplogroups in the 6 Studied Populations in China

Y-SNP genotyping revealed the haplogroup frequencies of the Chaoshan EC and GCC patients, Taihang Mountain EC and GCC patients, and Chaoshan Hakka and She populations (Table 1). The most frequent haplogroup in Chaoshan patients was O3a3c1-M117, which is the characteristic haplogroup for Northern East Asians. Its frequency was also high in Taihang Mountain patients but was significantly lower in the Chaoshan Hakka and She populations than in Chaoshan patients (p<0.05). Both the Chaoshan Hakka and She populations showed a high frequency of O1a*, the characteristic haplogroup for Southeastern Asians; its frequency was significantly higher in Chaoshan Hakka than in Chaoshan patients (p<0.05). The She population showed a uniquely high frequency of O3a3b* (20.83%); among the other studied populations, this haplogroup was found only in Chaoshan GCC patients, at a very low frequency of 2.08%.
Table 1

Y-chromosome single nucleotide polymorphism (Y-SNP) haplogroup frequencies of the 6 studied populations (%).

Haplogroup | Chaoshan EC patients (%), n = 72 | Chaoshan CC patients (%), n = 48 | Taihang Mountain EC patients (%), n = 49 | Taihang Mountain CC patients (%), n = 63 | Chaoshan Hakkas (%), n = 32 | Chaoshan She (%), n = 24
C* | 0 | 0 | 16.33 | 9.52 | 6.25 | 0
D/E(M1) | 0 | 0 | 0 | 1.59 | 0 | 0
D1(M15) | 1.39 | 0 | 2.04 | 0 | 0 | 0
F*(M89) | 4.17 | 0 | 0 | 0 | 0 | 0
K*(M9) | 1.39 | 12.5 | 0 | 1.59 | 0 | 0
O*(M175) | 8.33 | 10.42 | 2.04 | 6.35 | 0 | 4.17
O3*(M122) | 15.28 | 18.75 | 26.53 | 23.81 | 31.25 | 29.17
O3a1(M121) | 2.78 | 0 | 2.04 | 0 | 0 | 0
O3a3c*(M134) | 5.56 | 4.17 | 16.33 | 23.81 | 6.25 | 0
O3a3c1*(M117) | 22.22 | 37.5 | 24.49 | 15.87 | 3.13 | 8.33
O3a3b*(M7) | 0 | 2.08 | 0 | 0 | 0 | 20.83
O1a*(M119) | 16.67 | 14.58 | 2.04 | 3.17 | 43.75 | 20.83
O2a*(M95) | 16.67 | 0 | 0 | 6.35 | 6.25 | 4.17
O2a1*(M88,M111) | 5.56 | 0 | 0 | 0 | 3.13 | 0
P*(M45) | 0 | 0 | 4.08 | 4.76 | 0 | 0
Q1a1(M120) | 0 | 0 | 4.08 | 3.17 | 0 | 12.5
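
The frequency comparisons reported above (p<0.05) can be illustrated with a small sketch. The text does not state which statistical test was used, so the two-sided Fisher's exact test below is an assumption, and the carrier counts are back-calculated from the percentages in Table 1 rather than taken from the raw data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x carriers in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Counts back-calculated from Table 1 percentages (an assumption, not raw data):
# O3a3c1-M117 in 22.22% of 72 Chaoshan EC patients and 3.13% of 32 Hakka.
ec_carriers, ec_total = round(0.2222 * 72), 72
hakka_carriers, hakka_total = round(0.0313 * 32), 32

p_value = fisher_exact_two_sided(
    ec_carriers, ec_total - ec_carriers,
    hakka_carriers, hakka_total - hakka_carriers,
)
print(f"M117: {ec_carriers}/{ec_total} vs {hakka_carriers}/{hakka_total}, p = {p_value:.4f}")
```

Under these assumptions the p-value for O3a3c1-M117 in Chaoshan EC patients versus Chaoshan Hakka falls below 0.05, which is in line with the significance reported in the text; a different test (for example chi-square) could give a somewhat different value.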

Principal Component Analysis Revealed Close Affinity among the 4 Patient Groups

Principal component analysis (PCA) is a mathematical procedure that transforms a number of correlated variables into a smaller number of uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible. In the principal-component plot, the smaller the distance between two populations, the closer their genetic relationship. Figure 2 shows the results of principal component analysis, with 3 components (PC1, PC2, PC3), for Y-SNP frequencies based on genotyping results of the 6 studied populations and additional data for other Chinese Han populations. For comparison, the haplotype frequencies of 3 high-risk populations from the Chaoshan (CSHR), Fujian (FJHR) and Taihang Mountain (THHR) areas were included [15]. The 3 components accounted for 86.2% of the total variation in Y-SNP frequencies. The 4 patient groups and 3 high-risk populations clustered together. The Chaoshan She and Hakka populations formed another cluster. The remaining Northern Han and Southern Han populations formed a third group. The Chaoshan patients and high-risk population were separated from the Chaoshan Hakka and She populations and from Guangzhou Han.
Figure 2

3-D principal component maps of frequencies of Y-chromosome single nucleotide polymorphism (Y-SNP) in Chinese populations.

The smaller the distance between populations, the closer the relationship. We divided 26 populations into 3 clusters: 1) Cluster1 (Red circle): 4 patient groups and 3 populations at high risk of esophageal cancer (EC): CSEC: Chaoshan EC patients, CSCC: Chaoshan gastric cardia cancer (GCC) patients, CSHR: Chaoshan high-risk population; FJHR: Fujian high-risk population; THEC: Taihang Mountain EC patients; THCC: Taihang Mountain GCC patients; and THHR: Taihang Mountain high-risk population; 2) Cluster2 (Green circle) Chaoshan Hakka (CSKJ) and She population (CSSZ); 3) Cluster3 (Orange circle) Northern and southern Han populations. Northern Han populations: HeB: Hebei Han; LN: Liaoning Han; XJ: Xinjiang Han; NMG: Neimeng Han; HB: Hubei Han; HN: Henan Han; GS: Gansu Han; SX: Shanxi Han; SD: Shandong Han. Southern Han populations: GD: Guangzhou Han; SH: Shanghai Han; ZJ: Zhejiang Han; AH: Anhui Han; JS: Jiangsu Han; HuN: Hunan Han; JX: Jiangxi Han; SC: Sichuan Han.
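
As a rough illustration of the principal component step, the sketch below projects only the 6 populations of Table 1 onto 3 components. It is not the authors' analysis: the published plot also includes the high-risk and Han reference populations, the software is not named in the text, and the power-iteration routine `pca_scores` is a hypothetical helper written for this example.

```python
import math

# Frequencies (%) transcribed from Table 1, 16 haplogroups per population.
# Population codes follow Figure 2 (CSKJ = Chaoshan Hakka, CSSZ = Chaoshan She).
POPULATIONS = {
    "CSEC": [0, 0, 1.39, 4.17, 1.39, 8.33, 15.28, 2.78, 5.56, 22.22, 0, 16.67, 16.67, 5.56, 0, 0],
    "CSCC": [0, 0, 0, 0, 12.5, 10.42, 18.75, 0, 4.17, 37.5, 2.08, 14.58, 0, 0, 0, 0],
    "THEC": [16.33, 0, 2.04, 0, 0, 2.04, 26.53, 2.04, 16.33, 24.49, 0, 2.04, 0, 0, 4.08, 4.08],
    "THCC": [9.52, 1.59, 0, 0, 1.59, 6.35, 23.81, 0, 23.81, 15.87, 0, 3.17, 6.35, 0, 4.76, 3.17],
    "CSKJ": [6.25, 0, 0, 0, 0, 0, 31.25, 0, 6.25, 3.13, 0, 43.75, 6.25, 3.13, 0, 0],
    "CSSZ": [0, 0, 0, 0, 0, 4.17, 29.17, 0, 0, 8.33, 20.83, 20.83, 4.17, 0, 0, 12.5],
}


def pca_scores(rows, n_components=3, iters=500):
    """PCA scores via power iteration with deflation on the Gram matrix."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    gram = [[sum(a * b for a, b in zip(u, v)) for v in centered] for u in centered]
    total = sum(gram[i][i] for i in range(n))  # total variance (unnormalized)
    scores, ratios = [], []
    for _ in range(n_components):
        v = [math.sin(i + 1.0) for i in range(n)]  # deterministic start vector
        for _ in range(iters):
            w = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(gram[i][k] * v[k] for k in range(n)) for i in range(n))
        lam = max(lam, 0.0)
        scores.append([math.sqrt(lam) * x for x in v])  # coordinates on this PC
        ratios.append(lam / total)                      # share of total variance
        # Deflate: remove the component just found before extracting the next.
        gram = [[gram[i][k] - lam * v[i] * v[k] for k in range(n)] for i in range(n)]
    return scores, ratios


scores, ratios = pca_scores(list(POPULATIONS.values()))
for name, coords in zip(POPULATIONS, zip(*scores)):
    print(name, [round(c, 2) for c in coords])
print("explained variance ratios:", [round(r, 3) for r in ratios])
```

The Euclidean distance between two rows of printed coordinates corresponds to the "distance between populations" read off the 3-D plot. Because the reference populations are omitted here, the explained-variance figures will not reproduce the 86.2% reported for the full data set.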


Positive Correlation between 4 Patient Populations and Chinese Han Populations

Y-SNP haplogroup frequencies for the patient groups and the high-risk population from the same area were positively correlated, and frequencies for all patient groups were positively correlated with those of the Fujian and Chaoshan high-risk populations (Table 2). Frequencies for the Chaoshan EC patients and Chaoshan Hakka were correlated, but the coefficient was the lowest of the significant correlations for Chaoshan EC patients. Frequencies for Taihang Mountain GCC patients were positively correlated with most of the Chinese Han frequencies, and those for Taihang Mountain EC patients were positively correlated with some of the Chinese Han frequencies.
Table 2

Correlation analysis of Y-chromosome SNP haplogroup frequencies in the studied populations and 3 high-risk populations and 17 Chinese Han populations.

Population | Chaoshan EC patients | Taihang Mountain EC patients | Chaoshan GCC patients | Taihang Mountain GCC patients
Taihang Mountain EC | 0.453 | | |
Chaoshan CC | 0.745** | 0.636* | |
Taihang Mountain CC | 0.471 | 0.897** | 0.497 |
Chaoshan high-risk | 0.771** | 0.827** | 0.828** | 0.770**
Fujian high-risk | 0.618* | 0.730** | 0.720** | 0.614*
Taihang Mountain high-risk | 0.434 | 0.691** | 0.334 | 0.830**
Chaoshan Hakka | 0.550* | 0.313 | 0.362 | 0.345
Chaoshan She | 0.411 | 0.332 | 0.404 | 0.287
Hebei Han | 0.188 | 0.605* | 0.273 | 0.773**
Shandong Han | 0.104 | 0.540* | 0.229 | 0.709**
Henan Han | 0.122 | 0.443 | 0.308 | 0.522*
Anhui Han | 0.175 | 0.505 | 0.29 | 0.519*
Zhejiang Han | 0.35 | 0.478 | 0.226 | 0.625**
Jiangsu Han | 0.195 | 0.418 | 0.264 | 0.6*
Shanghai Han | 0.283 | 0.377 | 0.343 | 0.484
Hubei Han | 0.201 | 0.515* | 0.236 | 0.727**
Sichuan Han | −0.7 | 0.165 | 0.057 | 0.426
Jiangxi Han | 0.156 | 0.303 | 0.188 | 0.532*
Hunan Han | 0.385 | 0.472 | 0.263 | 0.723**
Gansu Han | −0.071 | 0.385 | 0.08 | 0.520*
Liaoning Han | 0.073 | 0.334 | 0.133 | 0.471
Neimeng Han | −0.087 | 0.571* | 0.031 | 0.695**
Shanxi Han | −0.005 | 0.399 | 0.139 | 0.638*
Xinjiang Han | 0.055 | 0.518* | 0.186 | 0.719**
Guangdong Han | 0.235 | 0.449 | 0.121 | 0.722**

** P<0.01 level (2-tailed).

* P<0.05 level (2-tailed).
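
The correlation analysis can be sketched for a single pair of populations. The text does not say which coefficient was computed, so Pearson's r is assumed here; the two frequency vectors are the Chaoshan EC and Chaoshan CC columns of Table 1, in the row order of that table.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Haplogroup frequencies (%) from Table 1, in the row order of that table.
chaoshan_ec = [0, 0, 1.39, 4.17, 1.39, 8.33, 15.28, 2.78, 5.56, 22.22, 0, 16.67, 16.67, 5.56, 0, 0]
chaoshan_cc = [0, 0, 0, 0, 12.5, 10.42, 18.75, 0, 4.17, 37.5, 2.08, 14.58, 0, 0, 0, 0]

r = pearson(chaoshan_ec, chaoshan_cc)
print(f"Chaoshan EC vs Chaoshan CC: r = {r:.3f}")
```

With these rounded percentages the coefficient comes out close to the 0.745 listed in Table 2 for this pair; small differences are expected because the published value was presumably computed from unrounded data.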

Hierarchical Cluster Analysis Isolates Patients and High-risk Population from Other Populations

To study the affinity among the 4 patient groups and their relationship with other Han and minority nationalities, we analyzed Y-SNP data by hierarchical cluster analysis with average linkage (between groups). We compared 17 Chinese Han populations (the same populations as in the principal component analysis), 3 southern minority nationalities (Yao, Zhuang and Dong) [19] and 5 northern minority nationalities (Tibetan, Mongol (MG), Hui, Ewenki (EWK), and Shui). The Taihang Mountain patients and high-risk population (Taihang) were genetically close and formed a branch; meanwhile, the Chaoshan patients were genetically close to the Chaoshan and Fujian high-risk populations (Chaoshan, Fujian) and formed another branch (Fig. 3). These 2 branches then joined and clustered with the Chaoshan Hakka and She populations. All other populations clustered outside the main branch formed by populations from high-risk areas. Therefore, EC or GCC patients and high-risk populations were genetically closer to each other than to the Chaoshan Hakka, She and other populations.
Figure 3

Dendrogram of Y-SNP data.

Shows the affinity between the studied populations, the high-risk population, Chinese Han and Chinese minority nationalities. Taihang: Taihang Mountain high-risk population; Chaoshan: Chaoshan high-risk population; Fujian: Fujian high-risk population. The other abbreviations are defined in the Methods and Figure 2.
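
The average-linkage (between-groups) clustering described above can be sketched as follows. This is an illustration under assumptions: only the 6 populations of Table 1 are clustered, Euclidean distance between frequency vectors is assumed because the text does not name the distance measure, and `average_linkage` is a hypothetical helper.

```python
import math

# Frequencies (%) transcribed from Table 1; population codes follow Figure 2.
POPULATIONS = {
    "CSEC": [0, 0, 1.39, 4.17, 1.39, 8.33, 15.28, 2.78, 5.56, 22.22, 0, 16.67, 16.67, 5.56, 0, 0],
    "CSCC": [0, 0, 0, 0, 12.5, 10.42, 18.75, 0, 4.17, 37.5, 2.08, 14.58, 0, 0, 0, 0],
    "THEC": [16.33, 0, 2.04, 0, 0, 2.04, 26.53, 2.04, 16.33, 24.49, 0, 2.04, 0, 0, 4.08, 4.08],
    "THCC": [9.52, 1.59, 0, 0, 1.59, 6.35, 23.81, 0, 23.81, 15.87, 0, 3.17, 6.35, 0, 4.76, 3.17],
    "CSKJ": [6.25, 0, 0, 0, 0, 0, 31.25, 0, 6.25, 3.13, 0, 43.75, 6.25, 3.13, 0, 0],
    "CSSZ": [0, 0, 0, 0, 0, 4.17, 29.17, 0, 0, 8.33, 20.83, 20.83, 4.17, 0, 0, 12.5],
}


def average_linkage(names, vectors):
    """Agglomerative clustering; cluster distance = mean pairwise member distance."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    clusters = {name: [i] for i, name in enumerate(names)}  # label -> member indices
    merges = []
    while len(clusters) > 1:
        labels = list(clusters)
        best = None
        for i, a in enumerate(labels):
            for b in labels[i + 1:]:
                pairs = [(p, q) for p in clusters[a] for q in clusters[b]]
                d = sum(dist(vectors[p], vectors[q]) for p, q in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Merge the closest pair and record the height at which it joined.
        clusters[f"({a},{b})"] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
    return merges


merges = average_linkage(list(POPULATIONS), list(POPULATIONS.values()))
for a, b, height in merges:
    print(f"join {a} + {b} at distance {height:.2f}")
```

The sequence of joins is the information a dendrogram displays. With only 6 populations the branching need not match Figure 3, which was built from the full set of patient, high-risk, Han and minority populations.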


Genetic Distance Analysis and Construction of a Phylogenetic Tree

We used Y-STR data to investigate the genetic relationships between the 4 patient populations. Rst distances between pairs of populations were calculated on the basis of 6 Y-STRs: DYS389 (I, II), DYS390, DYS391, DYS392, DYS393 and DYS394. We included 6 additional Chinese populations and 3 high-risk populations: Zhejiang [20], Henan [21], Dongbei [22], Tianjin [23], Hunan Han [24], and Tibetan [25], and the Chaoshan, Fujian, and Taihang Mountain high-risk populations [15], all of which belong to the Sino-Tibetan language family, as do the 4 patient groups. From the Rst distance matrix, we constructed an unrooted neighbor-joining tree (Fig. 4). The patient groups were closer to each other than to the high-risk populations and the other Chinese Han populations.
Figure 4

Neighbor-joining tree of genetic distance between patients, high-risk EC population and Chinese Han populations based on Y-chromosome short tandem repeat (Y-STR) data.

The 4 patient groups are close to each other and are clustered with the high-risk populations.
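
To make the idea behind the Rst distance concrete, the sketch below computes a simplified variance-ratio version for two populations: the share of the total variance in repeat number that lies between populations, summed over loci. This is only an approximation in the spirit of Rst. It is not the AMOVA-based estimator implemented in Arlequin, it uses population (not unbiased) variances, and the repeat counts are invented for the example.

```python
from statistics import pvariance


def simple_rst(pop_a, pop_b):
    """Simplified pairwise Rst: (total variance - mean within variance) / total.

    Each population is a list of haplotypes; each haplotype is a tuple of
    repeat counts, one per Y-STR locus. Variances are summed over loci.
    """
    n_loci = len(pop_a[0])
    within = total = 0.0
    for locus in range(n_loci):
        a = [h[locus] for h in pop_a]
        b = [h[locus] for h in pop_b]
        within += (pvariance(a) + pvariance(b)) / 2  # mean within-population variance
        total += pvariance(a + b)                    # variance in the pooled sample
    return (total - within) / total if total > 0 else 0.0


# Hypothetical two-locus haplotypes (repeat counts), for illustration only.
group_1 = [(22, 10), (23, 10), (22, 11), (23, 11)]
group_2 = [(22, 10), (23, 10), (22, 11), (23, 11)]  # same distribution as group_1
group_3 = [(25, 13), (25, 13), (25, 13), (25, 13)]  # fixed for different alleles
group_4 = [(22, 10), (22, 10), (22, 10), (22, 10)]

rst_identical = simple_rst(group_1, group_2)
rst_distinct = simple_rst(group_3, group_4)
print(f"identical groups: {rst_identical:.3f}, fully distinct groups: {rst_distinct:.3f}")
```

Values near 0 indicate little differentiation in repeat number and values near 1 indicate strong differentiation; a matrix of such pairwise values is the kind of input from which a neighbor-joining tree is built.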


Network Analysis of Y-STR Haplogroups of the 4 Patient Groups and 3 High-risk Populations

The haplogroup with the highest frequency shared by the Chaoshan patients was O3a3c1-M117 (Table 1). We therefore constructed a network for patients and high-risk populations based on haplogroup O3a3c1-M117. In all, 12 Henan and 15 Chaoshan EC patients, 17 Chaoshan and 9 Henan GCC patients, and 23 Chaoshan, 8 Henan and 24 Fujian high-risk individuals belonged to haplogroup O3a3c1-M117. Individuals whose Y-STR haplotype occurred fewer than 2 times were eliminated from the analysis, which left data for 55 individuals (Fig. 5). The central node represented 8 Fujian high-risk individuals, 1 Henan high-risk individual and 1 Chaoshan EC patient. All of the other haplogroup O3a3c1-M117 individuals derived from this central node. The central node was connected to 5 one-step neighbors: 2 neighbors represented 5 Fujian high-risk individuals; the third neighbor represented 8 Chaoshan high-risk individuals, 1 Henan high-risk individual, 2 Fujian high-risk individuals and 1 Chaoshan EC patient; the fourth neighbor represented 2 Chaoshan EC patients, 1 Chaoshan high-risk individual and 1 Fujian high-risk individual; and the fifth neighbor represented 1 Chaoshan GCC patient and 1 Chaoshan high-risk individual. Most patients derived from the fifth one-step neighbor and thus clustered mainly in one area (circle in Fig. 5). This area included all GCC patients and 5 EC patients, with the remaining 6 EC patients scattered in other nodes.
Figure 5

Y-STR network of haplogroup O3a3c1-M117 for patients and high-risk populations belonging to cluster 1 in figure 2.

Most patient groups were generated from one node and clustered mainly in one area (circle). Circles represent lineages, area is proportional to frequency, and color indicates population of origin.
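
The two bookkeeping steps behind such a network, grouping identical Y-STR haplotypes into nodes and finding one-step neighbors, can be sketched as below. This is not the algorithm used by the Network software; the haplotypes are invented repeat counts, and `build_nodes` and `one_step_links` are hypothetical helpers. The `min_count=2` filter mirrors the exclusion of haplotypes seen fewer than 2 times.

```python
from collections import Counter
from itertools import combinations


def build_nodes(haplotypes, min_count=2):
    """Group identical haplotypes into nodes and drop those seen < min_count times."""
    counts = Counter(haplotypes)
    return {hap: c for hap, c in counts.items() if c >= min_count}


def one_step_links(nodes):
    """Pairs of nodes that differ by exactly one repeat at exactly one locus."""
    links = []
    for h1, h2 in combinations(sorted(nodes), 2):
        diffs = [abs(a - b) for a, b in zip(h1, h2) if a != b]
        if diffs == [1]:
            links.append((h1, h2))
    return links


# Hypothetical 7-locus haplotypes (repeat counts), for illustration only.
central = (14, 30, 23, 10, 14, 12, 15)
neighbor_1 = (14, 30, 24, 10, 14, 12, 15)   # one step away at the third locus
neighbor_2 = (14, 30, 23, 10, 14, 13, 15)   # one step away at the sixth locus
singleton = (15, 31, 25, 11, 14, 12, 15)    # seen once, so it is filtered out

samples = [central] * 3 + [neighbor_1] * 2 + [neighbor_2] * 2 + [singleton]
nodes = build_nodes(samples)
links = one_step_links(nodes)

for hap, count in nodes.items():
    print(hap, "n =", count)
print("one-step links:", len(links))
```

In a drawn network, node area would be proportional to the count stored in `nodes`, and each link corresponds to a single-step Y-STR mutation between neighboring haplotypes.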


Materials and Methods

Sample Collection and DNA Extraction

Blood samples from 288 unrelated males were collected from the Taihang Mountain and Chaoshan high-risk areas. Subjects were 1) Chaoshan patients: 72 EC and 48 GCC patients; 2) Taihang Mountain patients: 49 EC and 63 GCC patients; and 3) Chaoshan EC low-risk populations: 24 She people from Chaoshan Fenghuang Mountain and 32 Chaoshan Hakka from Chaoshan Puning county. Disease in all patients was confirmed pathologically. All participants gave written informed consent. The study was approved by the ethical review committee of Shantou University Medical College. Genomic DNA was extracted from whole blood with the TIANamp Blood DNA kit (DP318-03) (Tiangen Biotech Co., Beijing).

Genotyping of Y-SNPs and Y-STRs

Y-SNPs were genotyped with the Sequenom MassARRAY iPLEX Gold module (Sequenom Inc.); PCR primers and extension primers are listed in Table 3. The M1 polymorphism (Alu insertion, also called YAP) was analyzed directly by agarose gel electrophoresis after PCR [26]. STRs were genotyped by fluorescence PCR as previously described [15], and fluorescent-labeled extension products were separated by capillary electrophoresis on an ABI 3730x Genetic Analyzer (ABI, USA). All primers were synthesized by Sangon Co. (Shanghai). In 1999, Su et al. ascertained 17 Y-chromosome haplogroups based on 19 East Asian-specific biallelic markers as the paternal structure of East Asians [19]. The adjusted phylogenetic diagram of Y-SNPs [27] includes nearly 600 SNPs and defines 311 haplogroups. The phylogenetic diagram of the 17 haplogroups defined by the 16 Y-SNPs used here is shown in Figure 6.
Table 3

PCR primers and extension primers for the 15 SNPs used in Sequenom genotyping.

SNP_ID WELL 1st-PCRP 2nd-PCRP UEP_SEQ
M134 W1 ACGTTGGATGGAATCATCAAACCCAGAAGG ACGTTGGATGGGAGAGATACTTTTGATCCC TTTTGATCCCCACCAAT
M119 W1 ACGTTGGATGGGGAGACAGATAATTCTGC ACGTTGGATGATGGGTTATTCCAATTCAGC CAATTCAGCATACAGGC
M88 W1 ACGTTGGATGCAGTGCTAGAGAGGAAAACC ACGTTGGATGTATAGGCTATGGCCTAGGTG TATTCCTGCTTCTTCTGC
M45 W1 ACGTTGGATGCAGTAACTCTAGGAGAGAGG ACGTTGGATGCCTGGACCTCAGAAGGAGC TCAGAAGGAGCTTTTTGC
M122 W1 ACGTTGGATGCAAGGTAGAAAAGCAATTGAG ACGTTGGATGCTCTGTGTTAGAAAAGATAGC ccGATTTTCCCCTGAGAGC
M15 W1 ACGTTGGATGTGTCCAGAGGGTCTGCTAAC ACGTTGGATGGGAAGAGTAGAGAAAAGGTG GAGAAAAGGTGGTACAATG
M7 W1 ACGTTGGATGGCATCACCAAAGGGCATGTA ACGTTGGATGTTGTAGTTGAGTTACTGTT GTTGAGTTACTGTTCTTCTT
M95 W1 ACGTTGGATGTCTCCTAAGCCTACAGGTTG ACGTTGGATGATGGAGTTCCTGAGGATAAG GGAAAGACTACCATATTAGTG
M117 W1 ACGTTGGATGATTGACAGTTATCAGTTTG ACGTTGGATGATAACTCACCAAAGGAATGC CTCACCAAAGGAATGCACATCT
M111 W1 ACGTTGGATGGCCAAAAACAACAGAACAAG ACGTTGGATGTGTGGTACTTGTTTTGTGTG AGGTAAATTTTGGGGAGAAAAC
M89 W1 ACGTTGGATGAAAGGTAGCTGCAACTCAGG ACGTTGGATGTCCTGGATTCAGCTCTCTTC CCTAAGGTTATGTACAAAAATCT
M120 W1 ACGTTGGATGCGCAATAAAGTATAATTTCCC ACGTTGGATGAACACACTGCTAATGATCCG tTCCGTTTTTTGATGTGGAAATA
M175 W2 ACGTTGGATGCTACTGATACCTTTGTTTCTG ACGTTGGATGTGAATCAGGCACATGCCTTC ATGCCTTCTCACTTCTC
M9 W2 ACGTTGGATGCATTGAACGTTTGAACATGTC ACGTTGGATGCAGAACTGCAAAGAAACGGC GGCCTAAGATGGTTGAAT
M121 W2 ACGTTGGATGCAGCATGATATTTCCACATC ACGTTGGATGCATCGCTAAACACACGTACC CACACGTACCATAAATCAAA
Figure 6

Phylogenetic diagram of 17 haplogroups in Chinese populations based on 16 Y-chromosome biallelic markers, drawn according to the non-recombining portion of the Y-chromosome haplogroup tree of East Asia.

The most recent markers defining the haplogroups are beside the branches.


Population and Genotyping

Subjects were genotyped for Y-SNP haplogroups, and frequencies were compared among the 4 patient populations and the She and Hakka populations (Table S1). Principal component, correlation and hierarchical cluster analyses were used to analyze the relationships among the 6 populations. Three high-risk populations from the Taihang Mountain, Fujian Minnan, and Chaoshan areas and 25 previously published Chinese populations were included for comparison. The 25 Chinese populations were divided into 4 groups by geographic location and nationality [15]: northern Han (NHs), northern minority nationalities (NMNs), southern Han (SHs) and southern minority nationalities (SMNs). NH populations were Hebei [28], Liaoning (data provided by the State Key Laboratory of Genetic Engineering and Center for Anthropological Studies, School of Life Sciences, Fudan University), Xinjiang, Gansu, Shanxi, Neimeng, Shandong and Henan [28]; SH populations were Hunan, Hubei, Zhejiang, Jiangxi, Shanghai, Anhui, Jiangsu, Sichuan [28], Guangzhou and Guangxi (data provided by Fudan University); SMN populations were Yao, Zhuang and Dong [19]; and NMN populations were Tibetan, Mongol, Hui, Ewenki, and Shui (data provided by Fudan University). Chaoshan patients, Henan patients, Chaoshan Hakka and the Chaoshan She population belong to SHs, NHs, SHs, and SMNs, respectively. Guangzhou Han, Chaoshan Hakka, and Chaoshan patients belong to the Guangfu, Hakka, and Fulao/Helao clans, respectively, the 3 major clans in Guangdong Province. Chaoshan She people are the major SMN group living in the Chaoshan area. These 4 populations are geographically proximate. STRs can be used to analyze fine-scale genetic diversity in closely related populations, so on the basis of the Y-SNP results, Y-STRs were used to analyze the genetic differentiation and origin of patients and high-risk populations (Table S2).
We added Y-STR data for 3 high-risk populations from our previous research [15] and for 6 previously published populations: Zhejiang [20], Henan [21], Dongbei [22], Tianjin [23], Hunan Han [24] and Tibetan people [25]. The extent of genetic differentiation of the populations was estimated by the Rst statistic on the basis of the Y-STR haplotypes with Arlequin 3.1. A neighbor-joining tree was constructed from the Rst distance matrix with MEGA 5.1. A network of Y-STR data was constructed with Network 4.6.1.1 (www.fluxus-engineering.com). In the network map, individuals with the same Y-STR haplotype were placed in the same node, and one node could give rise to other nodes through stepwise Y-STR mutation [15].
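
The tree-building step can be illustrated with a compact neighbor-joining routine that takes a distance matrix and returns the sequence of joins with branch lengths. It is a sketch of the standard algorithm, not the MEGA implementation, and the 4 labels and distances below are a made-up additive example rather than Rst values from this study.

```python
def neighbor_joining(names, dist):
    """Return a list of joins (node_a, node_b, branch_a, branch_b)."""
    nodes = list(names)
    d = {(a, b): dist[i][j] for i, a in enumerate(names) for j, b in enumerate(names)}
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        r = {a: sum(d[a, b] for b in nodes) for a in nodes}  # row sums
        candidates = [(x, y) for i, x in enumerate(nodes) for y in nodes[i + 1:]]
        # Join the pair that minimizes the neighbor-joining Q criterion.
        a, b = min(candidates, key=lambda p: (n - 2) * d[p] - r[p[0]] - r[p[1]])
        len_a = 0.5 * d[a, b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a, b] - len_a
        new = f"({a},{b})"
        for c in nodes:
            if c not in (a, b):
                # Distance from the new internal node to every remaining node.
                d[new, c] = d[c, new] = 0.5 * (d[a, c] + d[b, c] - d[a, b])
        d[new, new] = 0.0
        nodes = [c for c in nodes if c not in (a, b)] + [new]
        joins.append((a, b, len_a, len_b))
    # The last two nodes are connected by a single remaining branch.
    joins.append((nodes[0], nodes[1], d[nodes[0], nodes[1]], 0.0))
    return joins


# Hypothetical additive distances for the tree ((A:1, B:2):1, (C:3, D:4)).
labels = ["A", "B", "C", "D"]
matrix = [
    [0, 3, 5, 6],
    [3, 0, 6, 7],
    [5, 6, 0, 7],
    [6, 7, 7, 0],
]

joins = neighbor_joining(labels, matrix)
for a, b, len_a, len_b in joins:
    print(a, b, len_a, len_b)
```

On an additive matrix such as this one, neighbor joining recovers the generating tree and its branch lengths. Real Rst matrices are only approximately additive, so the resulting tree is an estimate.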

Discussion

Chaoshanese are descendants of north-central China Han people. North-central Chinese Han began to migrate into southern China in the Qin Dynasty (216 BC). The Han Dynasty (206 BC–220 AD) experienced another 3 waves of large-scale migration into southern China because of the decrease in the native population in this area. Gradually, over 2,000 years, the north-central Chinese Han became the main population of the Chaoshan region, the Chaoshanese. They are called Helao, who migrated directly from north-central China, or Fulao, who first migrated to Fujian Minnan and then to Chaoshan, with well-maintained language and customs from north-central China. The Taihang Mountain area in north-central China and the Fujian Minnan and Chaoshan areas are well known for their high incidence of EC [15]. With the development of diagnostic techniques and improved epidemiology, more GCC cases have been confirmed in these areas. EC and GCC are the 2 most common cancers in these 3 areas. Our previous genetic research showed that high-risk populations in these 3 areas share a common ancestry [15], [16]. In the present study, we studied Y-chromosome haplogroups of EC and GCC patients from the Chaoshan and Taihang Mountain areas to further explore the paternal genetic background of the patients. We compared the data with those of 2 low-risk populations, Chaoshan Hakka and She, and 3 high-risk populations. We first analyzed the distribution of Y-SNP haplogroups among the studied populations. The haplogroup with the highest frequency shared by Chaoshan EC and GCC patients was O3a3c1-M117, one of the northern Han dominant haplogroups, which was also frequent in Taihang Mountain patients but rare in the Chaoshan Hakka and She populations. As compared with Chaoshan patients and the high-risk population, the Chaoshan Hakka and She populations showed a relatively higher frequency of the southern native dominant O1a*.
Similar to Taihang Mountain patients, Chaoshan patients showed northern Han dominant haplogroups as their highest-frequency haplogroups, so Chaoshan and Taihang Mountain patients are relatively closely related. On Y-SNP principal component analysis, the paternal structure of Chaoshan patients differed from that of the Chaoshan Hakka and She populations, although they are in geographic proximity and Chaoshan Hakka are also descendants of north-central Chinese Han. Chaoshan patients clustered closely with the Fujian and Henan high-risk populations and patients, although they are geographically distant. The Chaoshan Hakka and She populations clustered together, which agrees with historical records: Chaoshan Hakka mainly inhabit the mountain area, which allows more gene flow with the She population, who also live there. Y-SNP haplotype frequencies were positively correlated among patients, which further supports their close genetic affinity. The results of hierarchical cluster analysis also supported the close genetic affinity among patients and high-risk populations. Phylogenetically, the patient groups were more closely related to each other than to the high-risk populations (Fig. 4). Network analysis (Fig. 5) suggested that the patrilineal lineage of haplogroup O3a3c1-M117 individuals traces to the Taihang Mountain and Fujian high-risk individuals and the Chaoshan EC patient who constituted the central node, and that O3a3c1-M117 patients from the 2 studied areas derived largely from a single one-step neighbor containing 1 Chaoshan high-risk individual and 1 Chaoshan GCC patient. The haplogroup O3a3c1-M117 network analysis revealed variation among populations but also a high degree of patient-specific substructure. All 14 GCC patients and 5 of the 11 EC patients fall into one cluster (Fig. 5, circle). Haplogroup O3a3c1-M117 patients may have originated from the same ancestral haplogroup.
Thus, we suggest a patrilineal genetic affinity between the 2 geographically separated groups of GCC and EC patients in China. Recent genome-wide association studies from high-risk areas in China showed a significant association of a variant at 10q23 in PLCE1 with both esophageal squamous cell carcinoma and gastric cardia adenocarcinoma, which highlights the common genetic mechanisms that may contribute to the etiology of both cancers [29]. Although EC and GCC are pathologically distinct, the epidemiology studies [2]–[9], genome-wide association studies and the present study all support the view that EC and GCC may share a common genetic structure. The esophagus and gastric cardia are anatomically adjacent and have similar embryogenesis, and they are exposed to similar environmental conditions during life. However, why the 2 cancers may be affected by a common genetic structure is still unknown. We suggest that EC and GCC do not occur at random in high-risk populations but are closely associated with a certain patrilineal background structure, and that these related patients may inherit a pathogenic genetic structure from their common ancestors. In summary, the patrilineal genetic structure of Chaoshan and Taihang Mountain patients is similar, and patients have closer affinity with each other than with the high-risk populations. The EC and GCC patients share a recent common ancestor. In contrast, the Chaoshan Hakka and She populations have a relatively distant relationship with Chaoshanese people, which may explain in part the high incidence of EC and GCC in Chaoshanese people.

Supporting Information

Table S1. Raw data for individual Y-SNP. (XLS)

Table S2. Raw data for individual Y-STR. (XLS)
  20 in total

1.  The origin of Mosuo people as revealed by mtDNA and Y chromosome variation.

Authors:  Wen Bo; Shi Hong; Ren Ling; Xi Huifeng; Li Kaiyuan; Zhang Wenyi; Su Bing; Si Shiheng; Jin Li; Xiao Chunjie
Journal:  Sci China C Life Sci       Date:  2004-02

2.  Tracing the origins of Hakka and Chaoshanese by mitochondrial DNA analysis.

Authors:  Wen-Zhi Wang; Cheng-Ye Wang; Yao-Ting Cheng; An-Long Xu; Chun-Ling Zhu; Shi-Fang Wu; Qing-Peng Kong; Ya-Ping Zhang
Journal:  Am J Phys Anthropol       Date:  2010-01       Impact factor: 2.868

3.  Prevalence and risk factors associated with prehypertension and hypertension in the Chinese She population.

Authors:  Yinghua Lin; Xiaolan Lai; Gang Chen; Yuan Xu; Baoying Huang; Zichun Chen; Shaoheng Zhu; Jin Yao; Qiqin Jiang; Huibin Huang
Journal:  Kidney Blood Press Res       Date:  2012-03-01       Impact factor: 2.687

4.  Temporal trends of esophageal cancer during 1995-2004 in Nanao Island, an extremely high-risk area in China.

Authors:  Min Su; Min Liu; Dong-Ping Tian; Xiao-Yun Li; Guo-Hong Zhang; He-Lin Yang; Xiaolong Fan; Hai-Hua Huang; Yu-Xia Gao
Journal:  Eur J Epidemiol       Date:  2006-12-29       Impact factor: 8.082

5.  Y-Chromosome evidence for a northward migration of modern humans into Eastern Asia during the last Ice Age.

Authors:  B Su; J Xiao; P Underhill; R Deka; W Zhang; J Akey; W Huang; D Shen; D Lu; J Luo; J Chu; J Tan; P Shen; R Davis; L Cavalli-Sforza; R Chakraborty; M Xiong; R Du; P Oefner; Z Chen; L Jin
Journal:  Am J Hum Genet       Date:  1999-12       Impact factor: 11.025

6.  Esophageal cancer in Shanxi Province, People's Republic of China: a case-control study in high and moderate risk areas.

Authors:  Y P Wang; X Y Han; W Su; Y L Wang; Y W Zhu; T Sasaba; K Nakachi; Y Hoshiyama; Y Tagashira
Journal:  Cancer Causes Control       Date:  1992-03       Impact factor: 2.506

7.  Prospective study of risk factors for esophageal and gastric cancers in the Linxian general population trial cohort in China.

Authors:  Gina D Tran; Xiu-Di Sun; Christian C Abnet; Jin-Hu Fan; Sanford M Dawsey; Zhi-Wei Dong; Steven D Mark; You-Lin Qiao; Philip R Taylor
Journal:  Int J Cancer       Date:  2005-01-20       Impact factor: 7.396

8.  [Origin of Hakka and Hakkanese: a genetics analysis].

Authors:  Hui Li; Wu-Yun Pan; Bo Wen; Ning-Ning Yang; Jian-Zhong Jin; Li Jin; Da-Ru Lu
Journal:  Yi Chuan Xue Bao       Date:  2003-09

9.  Y-chromosome haplotype distribution in Han Chinese populations and modern human origin in East Asians.

Authors:  Y Ke; B Su; J Xiao; H Chen; W Huang; Z Chen; J Chu; J Tan; L Jin; D Lu
Journal:  Sci China C Life Sci       Date:  2001-06

10.  A shared susceptibility locus in PLCE1 at 10q23 for gastric adenocarcinoma and esophageal squamous cell carcinoma.

Authors:  Christian C Abnet; Neal D Freedman; Nan Hu; Zhaoming Wang; Kai Yu; Xiao-Ou Shu; Jian-Min Yuan; Wei Zheng; Sanford M Dawsey; Linda M Dong; Maxwell P Lee; Ti Ding; You-Lin Qiao; Yu-Tang Gao; Woon-Puay Koh; Yong-Bing Xiang; Ze-Zhong Tang; Jin-Hu Fan; Chaoyu Wang; William Wheeler; Mitchell H Gail; Meredith Yeager; Jeff Yuenger; Amy Hutchinson; Kevin B Jacobs; Carol A Giffen; Laurie Burdett; Joseph F Fraumeni; Margaret A Tucker; Wong-Ho Chow; Alisa M Goldstein; Stephen J Chanock; Philip R Taylor
Journal:  Nat Genet       Date:  2010-08-22       Impact factor: 38.330

