Souhir Mestiri1,2, Sami Boussetta3, Andrew J Pakstis4, Sarra El Kamel3, Amel Ben Ammar El Gaaied3, Kenneth K Kidd4, Lotfi Cherni2,3. 1. Laboratory of Genetics, Biodiversity and Bioresource Valorization (LR11ES41), University of Monastir, Monastir, Tunisia. 2. Higher Institute of Biotechnology of Monastir, Monastir University, Monastir, Tunisia. 3. Laboratory of Genetics, Immunology and Human Pathologies, Faculty of Sciences of Tunis, University of Tunis El Manar, Tunis, Tunisia. 4. Department of Genetics, Yale University School of Medicine, New Haven, Connecticut, USA.
Abstract
BACKGROUND: The single nucleotide polymorphisms (SNPs) of the dopamine D3 receptor (DRD3), the CUB and sushi multiple domains 1 (CSMD1) and the neuregulin 1 (NRG1) genes were used to study the genetic diversity and affinity among North African populations and to examine their genetic relationships in worldwide populations. METHODS: The rs3773678, rs3732783 and rs6280 SNPs of the DRD3 gene located on chromosome 3, the rs10108270 SNP of the CSMD1 gene and the rs383632, rs385396 and rs1462906 SNPs of the NRG1 gene located on chromosome 8 were analysed in 366 individuals from seven North African populations (Libya, Kairouan, Mehdia, Sousse, Kesra, Smar and Kerkennah). RESULTS: The low values of FST indicated that only 0.27%-1.65% of the genetic variability was due to the differences between the populations. The Kairouan population has the lowest average heterozygosity among the North African populations. Haplotypes composed of the ancestral alleles ACC and ACAT were more frequent in the Kairouan population than in other North African populations. The PCA and the haplotypic analysis showed that the genetic structure of populations in North Africa was closer to that of Europeans, Admixed Americans, South Asians and East Asians. However, analysis of the rs3732783 and rs6280 SNPs revealed that the CT microhaplotype was specific to the North African population. CONCLUSIONS: The Kairouan population exhibited a relatively low rate of genetic variability. The North African population has undergone significant gene flow but also evolutionary forces that have made it genetically distinct from other populations.
The dopamine D3 receptor (DRD3) is a member of the dopamine receptor G protein‐coupled receptor family. It is located in the limbic areas of the brain (Sokoloff et al., 1992) and plays a role in cognitive and emotional functions (Bombin et al., 2008). The DRD3 gene (OMIM accession number: *126451), located on chromosome 3q13.3, encodes the DRD3 receptor (Le Coniat et al., 1991). The rs6280 and rs3732783 polymorphisms of the DRD3 gene have been extensively studied to demonstrate genetic associations with neuropsychiatric disorders (Gassó et al., 2020; Yang et al., 2016; Zhang et al., 2011). The rs6280 (Gly9Ser) is a functional single nucleotide polymorphism (SNP), which corresponds to a cytosine (C) to thymine (T) substitution leading to a glycine (Gly) to serine (Ser) substitution in the extracellular N‐terminal domain of the receptor (Utsunomiya et al., 2012). It has been shown that the presence of the glycine leads to a significant increase in dopamine‐binding affinity and amplifies dopamine intracellular signalling (Jeanneteau et al., 2006). The rs6280 and rs3732783 polymorphisms are physically close (27 base pairs apart). These two SNPs constitute a microhaplotype, which has been studied by Kidd et al. (2014). The rs3732783 (Ala17Ala) is a synonymous variant that has no effect on the resulting protein sequence. Zainal Abidin et al. (2015) reported an association of the rs3732783 polymorphism with the development of impulse control behaviour among Malaysian Parkinson's disease patients. The rs3773678 polymorphism of the DRD3 gene is an intronic polymorphism that has been associated with nicotine dependence in a Chinese population (Wei et al., 2012).

The CUB and sushi multiple domains 1 (CSMD1) protein plays an important role in the regulation of complement activation and inflammation in the developing central nervous system (Kraus et al., 2006). The CSMD1 protein is encoded by the CSMD1 gene (OMIM accession number: *608397), which is located on chromosome 8p23 and contains 69 exons (Sun et al., 2001). Recently, it has been shown that the expression level of the CSMD1 gene increases after antipsychotic treatment in patients with schizophrenia (Liu et al., 2019). Several polymorphisms of the CSMD1 gene, such as rs10503253 and rs2616984, have been associated with the onset of schizophrenia and multiple neurodevelopmental disorders (Athanasiu et al., 2017; Bocharova et al., 2017; Donohoe et al., 2013). However, no studies have been performed with the rs10108270 polymorphism of the CSMD1 gene.

Neuregulin 1 (NRG1) is a multifunctional protein. It plays a fundamental role in the development of the peripheral nervous system and in the process of nerve repair (Bare et al., 2011; Ronchi et al., 2016). The NRG1 gene (OMIM accession number: *142445) is made up of 21 alternatively spliced exons. Most of the exons are located in a 200‐kb region at the 3′ end of the gene (Steinthorsdottir et al., 2004). Recently, it has been shown that the expression of the NRG1 gene is associated with the clinical risk of psychosis (Jagannath et al., 2018). Stefansson et al. (2002) first identified NRG1 as a susceptibility gene for schizophrenia in the Icelandic population. Several association studies have confirmed this finding in other populations (Fukui et al., 2006; Jagannath et al., 2017; Mostaid et al., 2016). However, other investigations have failed to show the association of NRG1 with schizophrenia (Allen et al., 2008; Loh et al., 2013). These conflicting results may be due to the difference in allele frequencies between populations for NRG1 and other genetic risk factors. Indeed, Gardner et al. (2006) analysed 13 SNPs of the NRG1 gene, including rs385396 and rs1462906, in 39 populations worldwide. They suggested that the allele and haplotypic frequencies of these SNPs differ greatly between populations. They also showed a significant difference in the structure of the linkage disequilibrium between Europe and other continental groups such as America, Central/South Asia, East Asia, Middle East/North Africa, Oceania and Sub‐Saharan Africa. These results may be due to events in the history of populations from different geographic regions (Gardner et al., 2007).

Several other studies have examined the genetic diversity of the North African populations (Boussetta et al., 2019; Serra‐Vidal et al., 2019). Recently, we used the markers of the DRD2/ANKK1 locus to show that the North African population structure is the result of a mixture of European, Asian and Sub‐Saharan lineages (Mestiri et al., 2021). To confirm this result, we analysed the SNPs rs6280, rs3732783 and rs3773678 of the DRD3 gene, the SNP rs10108270 of the CSMD1 gene and the SNPs rs385396, rs383632 and rs1462906 of the NRG1 gene in seven North African populations. The allele and haplotypic frequencies of the North African populations were compared with those of other worldwide populations in order to place them in the global context and to better understand their evolutionary history.
MATERIALS AND METHODS
Study populations
The study was performed on seven North African populations (Figure 1) consisting of 362 healthy and unrelated individuals who gave their informed written consent for DNA genotyping.
FIGURE 1
Geographical location of the seven populations analysed in this study
The Smar sample includes 62 individuals. Smar is a Berber village in the south‐east of Tunisia. The Berbers may be descendants of Capsian and Neolithic peoples (Murdock, 1959). The Berbers of the region of Smar have been living in Tunisia since the end of the 16th century.

The Kesra sample includes 44 individuals. Kesra is a Berber village in the north‐west of Tunisia. During the Arab invasion, the Berber population took refuge in this mountainous region (Ibn Khaldoun, 1968).

The Kerkennah sample is made up of 46 individuals. Kerkennah is a group of islands off the coast of southern Tunisia, where several past civilizations such as the Amazigh (Berbers), Phoenicians and Romans have settled. The Islamic history of Kerkennah began with the Ottomans (Fehri, 2009).

The Sousse sample is made up of 48 individuals. Sousse, a port city in east‐central Tunisia, was founded by the Phoenicians in the 11th century before the common era. After the fall of the Roman Empire, the city came under the control of the Vandals and then the Byzantines. In the seventh century, the Arabs conquered the city and introduced the Islamic religion and the Arabic language (Djelloul, 2006). The Normans of Sicily (Norwich, 1992) and then the Spaniards invaded Sousse. It received contributions from the Ottoman Turks in the 16th century, then it came under the control of the French in 1881 (Pétridès, 1910).

The Mahdia sample comprises 47 individuals. Mahdia is located in the center‐east of Tunisia. It was the capital of the Fatimids (921–973) thanks to its proximity to the sea and to its promontory, on which a military colony had existed since the time of the Phoenicians (Favreau, 1995).

The Kairouan sample includes 45 individuals. Kairouan is a city in central Tunisia. It was the first Muslim city in the Maghreb, built around 670. It was a center of learning attracting Muslims from other countries (McEvedy & Jones, 1978).

The Libyan sample comprises 70 individuals collected from six different communities in Libya. Libya is a country in North Africa that is part of the Maghreb. Several civilizations have succeeded one another in Libya: the Phoenicians, the Romans, the Vandals, the Byzantines, the Arabs, the Spaniards, the Ottomans and the Italians. The majority of Libyans are of mixed Berber‐Arab culture (Camps, 1981).

Data from the 26 global populations of the 1000 Genomes (1KGP) project (Table 1) were added to those obtained from the seven North African populations of this study to allow a comparison between populations (Consortium, 2015). The populations of the Americas (Puerto Ricans, Colombians, Peruvians and Mexican Americans) are mixtures of West African and European origin. Peruvians have the highest proportion of Native American ancestry.
TABLE 1
The size of the global populations of the 1000 Genomes (1KGP) project analysed in this study

| Population | Sample size |
|---|---|
| African (AFR) | |
| Yoruba in Ibadan, Nigeria (YRI) | 108 |
| Luhya in Webuye, Kenya (LWK) | 99 |
| Gambian in Western Gambia (GWD) | 113 |
| Mende in Sierra Leone (MSL) | 85 |
| Esan in Nigeria (ESN) | 99 |
| Americans of African Ancestry in SW USA (ASW) | 61 |
| African Caribbeans in Barbados (ACB) | 96 |
| Admixed American (AMR) | |
| Mexican Ancestry from Los Angeles, USA (MXL) | 64 |
| Puerto Ricans from Puerto Rico (PUR) | 104 |
| Colombians from Medellin, Colombia (CLM) | 94 |
| Peruvians from Lima, Peru (PEL) | 85 |
| East Asian (EAS) | |
| Han Chinese in Beijing, China (CHB) | 103 |
| Japanese in Tokyo, Japan (JPT) | 104 |
| Southern Han Chinese (CHS) | 105 |
| Chinese Dai in Xishuangbanna, China (CDX) | 93 |
| Kinh in Ho Chi Minh City, Vietnam (KHV) | 99 |
| European (EUR) | |
| Utah Residents from North and West Europe (CEU) | 99 |
| Toscani in Italia (TSI) | 107 |
| Finnish in Finland (FIN) | 99 |
| British in England and Scotland (GBR) | 91 |
| Iberian population in Spain (IBS) | 107 |
| South Asian (SAS) | |
| Gujarati Indian from Houston, Texas (GIH) | 103 |
| Punjabi from Lahore, Pakistan (PJL) | 96 |
| Bengali from Bangladesh (BEB) | 86 |
| Sri Lankan Tamil from the UK (STU) | 102 |
| Indian Telugu from the UK (ITU) | 102 |
DNA genotyping
Genomic DNA was extracted from all individuals by the phenol‐chloroform method. We analysed three SNPs of the DRD3 gene, one SNP of the CSMD1 gene and three SNPs of the NRG1 gene (Table 2). Genotyping of all SNPs was performed using the TaqMan® assay according to the manufacturer's protocol. The 384‐well plates were read on an AB7900 thermal cycler using SDS software.
TABLE 2
Description of the seven single nucleotide polymorphisms in the DRD3, CSMD1 and NRG1 genes
Abbreviations: SNP, single nucleotide polymorphism; UTR, untranslated region.
Statistical analysis
Allele and genotypic frequencies were calculated with the PLINK 1.09 software (Purcell et al., 2007). The Hardy–Weinberg equilibrium was tested with a chi‐square test. Haplotypes formed with the seven SNPs in all populations were obtained from the LDlink website (Machiela & Chanock, 2015). The FST distance matrix was computed with the software ARLEQUIN v3.11 (Excoffier & Lischer, 2010) and the principal component analysis was carried out with the software PAST (Hammer et al., 2001). The analysis of the haplotypes obtained with the seven SNPs was performed using the PHASE v2.1.1 software (Stephens et al., 2001; Stephens & Scheet, 2005). The degree of linkage disequilibrium (LD) was assessed with the Haploview software.
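As an illustration of the Hardy–Weinberg chi-square test mentioned above, the following minimal Python sketch computes the statistic for one biallelic SNP from genotype counts. It is not the PLINK implementation; the function name `hwe_chi_square` and the genotype counts in the example are hypothetical, and the p-value assumes one degree of freedom.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium for one biallelic SNP.

    Arguments are the observed counts of the three genotypes (AA, AB, BB).
    Returns the chi-square statistic and its p-value (1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts for a sample of 44 individuals.
chi2, p_value = hwe_chi_square(10, 24, 10)
print(f"chi2 = {chi2:.4f}, p = {p_value:.4f}")
```

A SNP would be considered to respect the equilibrium at the 1% level used in this study when the p-value is above 0.01.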
RESULTS
Allele frequencies and heterozygosity
All SNP sites were polymorphic in all populations. For all SNPs studied, the Hardy–Weinberg equilibrium (HWE) was respected at a significance level of 1% (Table S1). Table 3 displays the allele frequencies of all SNPs in all populations of North Africa. The results did not reveal any significant differences in the allele frequencies of the five SNPs rs6280, rs3732783, rs385396, rs383632 and rs1462906 among North African populations. However, a significant difference at the 0.05 level in the allele frequencies of rs3773678 between the Sousse and Libya populations was demonstrated (p = 0.014). In addition, the allele frequencies of rs10108270 in Kairouan were also different at the 0.05 level from those of Kerkennah (p = 0.006), Smar (p = 0.018) and Libya (p = 0.02). The ancestral C allele of rs3732783 was the least frequent allele in the North African populations studied; its frequency varied from 2.3% in Kerkennah to 11.5% in Sousse. In contrast, the ancestral alleles rs385396 A and rs383632 C were the most frequent alleles in these populations, with frequencies ranging from 88.5% in Sousse to 97.1% in Libya for rs385396 and from 88.5% in Sousse to 97.8% in Kairouan for rs383632 (Table 3).

Table 4 shows the heterozygosity for each locus and for each North African population. The average heterozygosity for each population varied from 0.283 in Kairouan to 0.333 in Smar. In general, the heterozygosity of each locus was close to the maximum value of heterozygosity (0.5), except for the loci rs3732783, rs385396 and rs383632. Indeed, the average heterozygosity varied from 0.127 for rs3732783 and rs383632 to 0.492 for rs6280.
TABLE 3
Allele frequencies in North African populations at seven SNPs (rs6280, rs3732783 and rs3773678 of the DRD3 gene; rs10108270 of the CSMD1 gene and rs385396, rs383632 and rs1462906 of the NRG1 gene). Values are allele counts (frequencies)

| SNP | Allele | Kairouan | Kerkennah | Kesra | Mehdia | Smar | Sousse | Libya |
|---|---|---|---|---|---|---|---|---|
| rs6280 | C | 42 (0.477) | 40 (0.444) | 35 (0.438) | 35 (0.398) | 58 (0.475) | 34 (0.425) | 60 (0.435) |
| rs6280 | T | 46 (0.523) | 50 (0.556) | 45 (0.562) | 53 (0.602) | 64 (0.525) | 46 (0.575) | 78 (0.565) |
| rs3732783 | C | 6 (0.067) | 2 (0.023) | 5 (0.058) | 8 (0.093) | 8 (0.065) | 11 (0.115) | 9 (0.064) |
| rs3732783 | T | 84 (0.933) | 84 (0.977) | 81 (0.942) | 78 (0.907) | 116 (0.935) | 85 (0.885) | 131 (0.936) |
| rs3773678 | A | 26 (0.361) | 35 (0.398) | 28 (0.326) | 29 (0.322) | 43 (0.358) | 27 (0.281) | 60 (0.435) |
| rs3773678 | G | 46 (0.639) | 53 (0.602) | 58 (0.674) | 61 (0.678) | 77 (0.642) | 69 (0.719) | 78 (0.565) |
| rs10108270 | A | 30 (0.333) | 49 (0.533) | 35 (0.398) | 39 (0.443) | 61 (0.492) | 40 (0.426) | 67 (0.486) |
| rs10108270 | C | 60 (0.667) | 43 (0.467) | 53 (0.602) | 49 (0.557) | 63 (0.508) | 54 (0.574) | 71 (0.514) |
| rs385396 | A | 87 (0.967) | 85 (0.924) | 79 (0.94) | 77 (0.917) | 111 (0.895) | 85 (0.885) | 134 (0.971) |
| rs385396 | C | 3 (0.033) | 7 (0.076) | 5 (0.06) | 7 (0.083) | 13 (0.105) | 11 (0.115) | 4 (0.029) |
| rs383632 | C | 88 (0.978) | 84 (0.933) | 79 (0.94) | 85 (0.904) | 110 (0.902) | 85 (0.885) | 128 (0.97) |
| rs383632 | T | 2 (0.022) | 6 (0.067) | 5 (0.06) | 9 (0.096) | 12 (0.098) | 11 (0.115) | 4 (0.03) |
| rs1462906 | T | 20 (0.222) | 16 (0.178) | 18 (0.209) | 9 (0.122) | 32 (0.262) | 14 (0.149) | 26 (0.191) |
| rs1462906 | C | 70 (0.778) | 74 (0.822) | 68 (0.791) | 65 (0.878) | 90 (0.738) | 80 (0.851) | 110 (0.809) |
TABLE 4
Heterozygosities and average heterozygosity of seven SNPs in the DRD3, CSMD1 and NRG1 genes in North African populations

| Population | rs6280 | rs3732783 | rs3773678 | rs10108270 | rs385396 | rs383632 | rs1462906 | Average |
|---|---|---|---|---|---|---|---|---|
| Kairouan | 0.499 | 0.125 | 0.461 | 0.444 | 0.064 | 0.043 | 0.345 | 0.283 |
| Kerkennah | 0.494 | 0.045 | 0.479 | 0.498 | 0.140 | 0.125 | 0.293 | 0.296 |
| Kesra | 0.493 | 0.109 | 0.439 | 0.479 | 0.113 | 0.113 | 0.331 | 0.297 |
| Mehdia | 0.479 | 0.169 | 0.437 | 0.494 | 0.152 | 0.174 | 0.214 | 0.303 |
| Smar | 0.499 | 0.122 | 0.460 | 0.500 | 0.188 | 0.177 | 0.387 | 0.333 |
| Sousse | 0.489 | 0.204 | 0.404 | 0.489 | 0.204 | 0.204 | 0.254 | 0.321 |
| Libya | 0.492 | 0.120 | 0.492 | 0.500 | 0.056 | 0.058 | 0.309 | 0.289 |
| Average | 0.492 | 0.127 | 0.453 | 0.486 | 0.131 | 0.127 | 0.304 | 0.303 |
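The heterozygosities in Table 4 appear consistent with the expected heterozygosity 2pq computed from the allele frequencies in Table 3. The text does not state the formula, so this reading is an assumption. The short Python sketch below applies it to the Kairouan frequencies copied from Table 3; the names `expected_heterozygosity` and `KAIROUAN_FREQS` are illustrative only.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2pq at a biallelic locus with allele frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 2.0 * p * (1.0 - p)


# Frequency of the first listed allele of each SNP in Kairouan (Table 3).
KAIROUAN_FREQS = {
    "rs6280": 0.477,
    "rs3732783": 0.067,
    "rs3773678": 0.361,
    "rs10108270": 0.333,
    "rs385396": 0.967,
    "rs383632": 0.978,
    "rs1462906": 0.222,
}

per_locus = {snp: expected_heterozygosity(p) for snp, p in KAIROUAN_FREQS.items()}
average = sum(per_locus.values()) / len(per_locus)

for snp, h in per_locus.items():
    print(f"{snp}: {h:.3f}")
print(f"average: {average:.3f}")
```

Under this assumption, the seven per-locus values and their mean round to the Kairouan row of Table 4 (0.283 on average).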
Gene diversity analysis
Total genomic diversity through all loci among populations (HT), the genetic diversity between individuals within populations (HS) and the diversity between populations (FST) are presented in Table 5. The findings suggest that the genomic diversity of all the loci was relatively high (mean HT = 0.3063), which could be owing to the diversity between individuals inside the same population (mean HS = 0.3031). The low FST values (0.0027–0.0165) indicate that a high proportion (98.35%–99.73%) of the total genetic variability is the result of within‐population variation, while 0.27%–1.65% of this variability is due to the differences among the seven North African populations.

The frequencies of the ancestral alleles of the SNPs analysed in the North African population and the global populations are shown in Figure 2. The frequencies of the ancestral alleles rs383632 (C) and rs385396 (A) of the NRG1 gene in the North African populations are similar to those of the African, Admixed American, European and South Asian populations but are higher than those of the East Asian population. The frequencies of the rs3773678 (A), rs6280 (C), rs10108270 (A) and rs1462906 (T) ancestral alleles in the North African population are close to those of the Admixed American, East Asian, European and South Asian populations but are lower than those of the African population. The rs3732783 (C) ancestral allele frequency in the North African population is similar to those of other world populations.
TABLE 5
Genetic diversity analysis for each locus and all loci in the North African population

| Locus | Heterozygosity within populations (HS) | Total heterozygosity (HT) | Diversity between populations (FST) |
|---|---|---|---|
| rs6280 | 0.4920 | 0.4933 | 0.0027 |
| rs3732783 | 0.1275 | 0.1290 | 0.0110 |
| rs3773678 | 0.4531 | 0.4576 | 0.0098 |
| rs10108270 | 0.4862 | 0.4938 | 0.0155 |
| rs385396 | 0.1310 | 0.1329 | 0.0142 |
| rs383632 | 0.1276 | 0.1297 | 0.0165 |
| rs1462906 | 0.3046 | 0.3083 | 0.0121 |
| All loci | 0.3031 | 0.3063 | 0.0116 |
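To make the relation between the three quantities in Table 5 concrete, the sketch below uses the classical ratio FST = (HT − HS) / HT. This is only an approximation of what ARLEQUIN computes: the published FST values come from that software's own estimator and do not match this ratio to the last digit for every locus. The function name is hypothetical.

```python
def fst_from_heterozygosity(h_t, h_s):
    """Proportion of total diversity due to between-population differences.

    h_t: total heterozygosity (HT); h_s: mean within-population heterozygosity (HS).
    """
    if h_t <= 0.0:
        raise ValueError("total heterozygosity must be positive")
    return (h_t - h_s) / h_t


# HT and HS for rs10108270, copied from Table 5.
fst = fst_from_heterozygosity(0.4938, 0.4862)
print(f"FST (ratio approximation) = {fst:.4f}")
print(f"within-population share   = {100 * (1 - fst):.2f}%")
```

For rs10108270 the ratio is close to the tabulated 0.0155, which illustrates why low FST values correspond to most of the variability lying within populations.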
FIGURE 2
Frequency of the ancestral allele of SNPs in the North African and the world populations. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; NA, North African; SAS, South Asian. Abbreviations for other populations are shown in Table 1
Principal component analysis (PCA)
The PCA plot depicted in Figure 3 shows the genetic relationship between North African and world populations. PC1 (79.65%) and PC2 (16.46%) clearly separate the North African population from the East Asian and African populations, respectively. The North African population clustered with the European, South Asian and Admixed American populations in the lower left region of the PCA plot.
FIGURE 3
Principal component analysis (PCA) of North African and world populations based on a dataset
Analysis of haplotype frequencies
Haplotypic analysis of the rs10108270, rs383632, rs385396 and rs1462906 polymorphisms of chromosome 8 in all the populations studied showed the presence of 10 haplotypes among the 16 possible haplotypes (Figure 4). The most common haplotype, CCAC, represents 39.5% of all North African chromosomes. The ACAT haplotype, with the ancestral allele at all four SNPs, is more frequent in the Kairouan population (11.7%) than in other North African populations (1.1%–6.7%). The CTCT haplotype with the ancestral allele (T) of rs1462906 is only detected in the Sousse population. Since the rs10108270 polymorphism of the CSMD1 gene is very distant from the SNPs of the NRG1 gene on chromosome 8 (~27 megabases), we estimated the frequencies of the haplotypes consisting only of the SNPs rs383632, rs385396 and rs1462906 of the NRG1 gene (Table S2). The results obtained are very close to those determined with the combination of the four SNPs of chromosome 8. In fact, the TCT haplotype is only identified in the Sousse population.
FIGURE 4
Haplotypic distribution of the chromosome 8 single nucleotide polymorphisms (rs10108270, rs383632, rs385396, rs1462906) in the North African and world populations. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; NA, North African; SAS, South Asian. Abbreviations for other populations are shown in Table 1. The ancestral haplotype is underlined
Haplotypic analysis of the rs3773678, rs3732783 and rs6280 SNPs of the DRD3 gene on chromosome 3 in all the populations studied showed the presence of seven haplotypes among the eight possible haplotypes (Figure 5). The frequencies of the haplotypes composed of these three SNPs in the North African population are presented in Table S3. The ACC haplotype, composed of the ancestral alleles of the three SNPs of chromosome 3, is more frequent in the Kairouan population (6.4%) than in other North African populations (0.8%–2.8%). The GCT haplotype with the ancestral allele (C) of rs3732783 is specific to the Mehdia population of the North African region. No other population has this haplotype.
FIGURE 5
Haplotypic distribution of the chromosome 3 single nucleotide polymorphisms (rs3773678, rs3732783, rs6280) in the North African and world populations. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; NA, North African; SAS, South Asian. Abbreviations for other populations are shown in Table 1. The ancestral haplotype is underlined
The comparison of the haplotypic profile of the North African population with those of the 26 populations of the 1000 Genomes project was carried out with the SNPs of chromosome 8 (Figure 4) and those of chromosome 3 (Figure 5). The frequency of the ACAT haplotype with ancestral alleles is very high in the African population (65.89%) but low in the North African (5.38%), European (0.7%), Admixed American (1.59%), East Asian (0.2%) and South Asian (1.64%) populations (Figure 4). The frequency of the ACC haplotype, composed of ancestral alleles, in the North African population (1.75%) is close to that of the Admixed American (0.72%), East Asian (0.5%) and European (0%) populations but differs from that of the African (7.19%) and South Asian (5.32%) populations. The GTT haplotype, composed of the derived alleles of the three SNPs of chromosome 3, is more frequent in the North African (49.46%), South Asian (52.15%), European (64.41%), East Asian (61.81%) and Admixed American (54.03%) populations than in the African population (12.18%) (Figure 5).

Figure 6 shows the frequencies of haplotypes carrying the T‐derived allele of SNP rs6280. The frequency of these haplotypes was 18.1% in the African population, while it varied between 55.3% in the North African population and 69.3% in the East Asian population. Analysis of microhaplotypes (haplotypes of less than 300 nucleotides; Kidd et al., 2014) made up of exonic polymorphisms of the DRD3 gene in the North African population and global populations revealed that the rs3732783C‐rs6280T microhaplotype is present only in the North African population. It represents 50% of the microhaplotypes determined in this population (Figure 7). The rs3732783T‐rs6280T microhaplotype consisting of derived alleles has a low frequency (5.2%) in the North African population, while its frequency varies between 18% and 69% in the other populations. In the North African population, the frequency of the microhaplotype presenting the ancestral alleles rs3732783C and rs6280C (14.3%) is close to that of the South Asian (14.7%) and African (13.7%) populations but is higher than that of the European (8%), Admixed American (9.8%) and East Asian (1.5%) populations (Figure 7).
FIGURE 6
Frequency of haplotypes with or without the rs6280T‐derived allele in the North African and world populations. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; NA, North African; SAS, South Asian
FIGURE 7
Haplotypic distribution of the chromosome 3 single nucleotide polymorphisms (rs3732783, rs6280) in the North African and world populations. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; NA, North African; SAS, South Asian
Linkage disequilibrium
Linkage disequilibrium (LD) between the rs10108270, rs383632, rs385396 and rs1462906 SNPs of chromosome 8 and between the rs3773678, rs3732783 and rs6280 SNPs of chromosome 3 was measured by the values of D′, r² and LOD (logarithm of odds) for the North African populations and the world populations. The values of r² are presented in Figures 8 and 9.
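For readers unfamiliar with these measures, the following Python sketch computes D, D′ and r² for a pair of biallelic SNPs from one haplotype frequency and the two allele frequencies. It uses the standard textbook definitions and is not the Haploview code; the LOD score is not computed. The function name `pairwise_ld` and the input frequencies are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies and the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical frequencies for two rare alleles that almost always occur together.
d, d_prime, r2 = pairwise_ld(p_ab=0.05, p_a=0.06, p_b=0.05)
print(f"D = {d:.4f}, D' = {d_prime:.3f}, r2 = {r2:.3f}")
```

The example illustrates why D′ and r² are reported side by side: D′ can reach 1 when one of the four haplotypes is absent, while r² stays below 1 unless the two allele frequencies are equal.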
FIGURE 8
Linkage disequilibrium patterns between rs10108270, rs383632, rs385396 and rs1462906 SNPs of chromosome 8 (a) and rs3773678, rs3732783 and rs6280 SNPs of chromosome 3 (b) in North African populations. The r² values are presented in each square. Red squares indicate a statistically significant LD between the pair of SNPs. Shaded red squares indicate that D′ < 1 and LOD ≥ 2, bright red squares without a number indicate that D′ = 1 and LOD ≥ 2, white squares indicate that D′ < 1 and LOD < 2 and blue squares indicate that D′ = 1 and LOD < 2
FIGURE 9
Linkage disequilibrium patterns between rs10108270, rs383632, rs385396 and rs1462906 SNPs of chromosome 8 (a) and rs3773678, rs3732783 and rs6280 SNPs of chromosome 3 (b) in the North African and world populations. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; NA, North African; SAS, South Asian. The r² values are presented in each square. Red squares indicate a statistically significant LD between the pair of SNPs. Shaded red squares indicate that D′ < 1 and LOD ≥ 2, bright red squares without a number indicate that D′ = 1 and LOD ≥ 2, white squares indicate that D′ < 1 and LOD < 2 and blue squares indicate that D′ = 1 and LOD < 2
Significant LD between the two SNPs rs383632 and rs385396 of the NRG1 gene on chromosome 8 was observed for all North African populations. The LD between rs3773678 and rs6280 of the DRD3 gene on chromosome 3 was significant in all North African populations except those of Libya and Kesra. However, the exonic SNPs rs6280 and rs3732783 of the DRD3 gene were not in linkage disequilibrium in all North African populations (Figure 8).

The pairwise LD among the SNPs of chromosome 8 and among those of chromosome 3 for the North African population and the other global populations is presented in Figure 9. LD analysis of chromosome 8 SNPs identified a haplotypic block spanning 2 kb, consisting of rs383632 and rs385396, in the North African population and in all global populations except the African population. For chromosome 3 SNPs, we observed significant linkage disequilibrium between the polymorphisms rs3732783 and rs6280 of the DRD3 gene in the North African population and global populations.
DISCUSSION
In this study, the rs3773678, rs3732783 and rs6280 SNPs of the DRD3 gene, the rs10108270 SNP of the CSMD1 gene and the rs383632, rs385396 and rs1462906 SNPs of the NRG1 gene were explored in seven population groups of North Africa. A regional analysis assessed the extent of genetic variation and affinity among those groups, and a worldwide analysis aimed to position the North African population genetically relative to the global populations.
Regional analysis based on comparison of allele frequencies, PCA and LD analysis revealed a general similarity among the populations of the North African region. The low FST values for each SNP (mean FST 0.0116) showed that the North African populations studied differ little from one another at these neurological polymorphisms. This result agrees with that of Gardner et al. (2007), who also reported low FST values for the SNPs rs383632, rs385396 and rs1462906 of the NRG1 gene in the populations of the Middle East/North Africa region. In contrast, Kidd et al. (2004) reported high FST values for rs1462906 (0.457) and rs385396 (0.251) in global populations. These different results could be due to the variability of the allele frequencies of the NRG1 gene across geographic regions. However, the heterozygosity evaluation of the seven SNPs studied suggested that the Kairouan population has the lowest average heterozygosity (0.283) among the North African populations. Therefore, within the North African population, the Kairouan population group has a low rate of genetic variability. In addition, the haplotypic analysis of the North African population provided relevant information. Indeed, haplotypes composed of the ancestral alleles, ACC on chromosome 3 and ACAT on chromosome 8, were more frequent in the Kairouan population than in other North African populations. We also showed that the Mahdia and Sousse populations were distinguished from other North African populations by the presence of the GCT and CTCT haplotypes, respectively. The presence of these specific haplotypes could be the result of recombination and mutation or of new gene flow, which play a major role in genetic diversity and represent an important evolutionary force (Scally, 2016). Our result is comparable to those obtained by the analysis of other genetic markers. Indeed, by analysing SNPs of the COMT gene, Boussetta et al. (2019) suggested that the Sousse population is distinguished from other North African populations. In addition, by analysing seven Alu insertion polymorphisms, Frigi et al. (2014) highlighted a differentiation between Mahdia and the majority of North African populations. The genetic diversity observed in the populations of Sousse and Mahdia could be due to their close geographical location and their common history. In fact, the city of Sousse is located 28 kilometres from that of Mahdia. The two cities occupy a strategic geographical location in the center‐east of Tunisia on the Mediterranean Sea.
These two cities have seen the passage of several civilizations, such as the Phoenicians, Romans, Vandals, Byzantines, Arabs, Spanish, Ottomans, Andalusians and French, who contributed to the genetic diversity of North African populations (Favreau, 1995; Norwich, 1992; Pétridès, 1910).
The comparison of the North African population with the world populations revealed a differentiation of the sub‐Saharan African populations from the North African, European, South Asian, East Asian and Admixed American populations. Indeed, the frequency of the ancestral allele of certain SNPs, such as rs6280, rs3773678, rs1462906 and rs10108270, in the sub‐Saharan African populations was different from that of other populations. Our result confirmed that of Gardner et al. (2006), who demonstrated that the allele frequency of rs1462906 in the sub‐Saharan African populations was different from that of the world populations. PCA and haplotypic profile analysis also showed genetic differentiation of the sub‐Saharan African populations from other global populations. This differentiation could be explained by the existence of the Sahara, which constitutes a geographical barrier to migration and gene flow (Harich et al., 2010). Indeed, the drying up of the Sahara began about 10,000 years ago during the Holocene climatic optimum and has produced, for about the last 5,000 years, the current climate and a rupture between the north and the south of Africa (Brooks et al., 2005).
Despite the geographic, socio‐cultural and linguistic distances that separate the North African and sub‐Saharan African populations, the existence of an ancestral genetic component shared by the two groups of populations has been shown. Indeed, we have demonstrated that the ancestral haplotypes, ACC on chromosome 3 and ACAT on chromosome 8, were present in both the North African and sub‐Saharan African populations.
We have already demonstrated in a previous study that the genes of the North African populations were subjected to a sub‐Saharan influence (Mestiri et al., 2021). Our results were consistent with the idea of an ‘out of Africa’ dispersal of Homo sapiens presented by Henn et al. (2012). More recently, D'Atanasio et al. (2018) showed that the African Humid Period favoured the expansion of Neolithic civilization and contact between populations, which contributed to the distribution of Y lineages in North and sub‐Saharan Africa. All these events could explain the presence of common haplotypes between the North African and African populations.
On the other hand, PCA and the haplotypic analysis of the DRD3, CSMD1 and NRG1 polymorphisms in North African and worldwide populations showed that the genetic structure of populations in North Africa is closer to that of Europeans, Admixed Americans, South Asians and East Asians. This result indicated the existence of a considerable Eurasian genetic component in the North African populations, since it has been shown that recent European contact influenced the Admixed American population (Gravel et al., 2013). Our findings were in agreement with other observations made using different genetic markers. In a recent study using SNPs from the DRD2/ANKK1 locus, we demonstrated by PCA the genetic affinity of the North African population to the South Asian and Admixed American populations (Mestiri et al., 2021). Rodríguez‐Varela et al. (2017) performed a PCA on Guanche individuals (originating from the Canary Islands) and populations from North Africa, Europe and the Middle East using genome‐wide autosomal SNPs. They revealed a strong genetic affinity of the Guanches to modern Northwest African populations such as Tunisians and Algerians. Analyses of the HLA genes are also consistent with our result. Indeed, Hajjej et al. (2017) showed that the Tunisian population is genetically close to the Iberian and Western Mediterranean populations but different from the sub‐Saharan population. Other studies based on the analysis of mitochondrial DNA have shown that the haplogroups detected in the North African population are mostly of Eurasian origin (Cherni et al., 2005; Cherni et al., 2009; Fadhlaoui‐Zid et al., 2011; Frigi et al., 2017; Harich et al., 2010). Indeed, the U5, R0a, V, U3 and J1b haplogroups of Middle Eastern and European origin as well as the M1b and U6 haplogroups of Southwest Asian origin are specific to the North African populations (González‐Andrade et al., 2007; Olivieri et al., 2006; Pennarun et al., 2012). Arauna et al. (2019) recently studied the genetic history of populations in the coastal area of North Africa. They demonstrated gene flow from the Mediterranean coasts of North Africa to Europe (Tuscany and the Iberian Peninsula), probably in the fourth century during the movements of peoples following the fall of the Roman Empire, during the expansion of the Arabs in the seventh century and later during that of the Christian kingdoms.
We then focused on the analysis of the exonic polymorphisms of the DRD3 gene, the synonymous SNP rs3732783 and the non‐synonymous SNP rs6280, in the North African population and other global populations. Comparison of the frequencies of haplotypes carrying the derived rs6280T allele with those of haplotypes carrying the ancestral rs6280C allele showed that all populations except the sub‐Saharan African populations had an increased frequency of haplotypes carrying the derived rs6280T allele. This result could be explained by recent positive selection that affected the DRD3 gene only in the North African, European, South Asian, East Asian and Admixed American populations. It is very likely that the rs6280 SNP was the target of this positive selection since it has an effect on the function of the gene.
Indeed, it has been shown that the ancestral allele rs6280C has a greater affinity for dopamine and increased signalling responses (Jeanneteau et al., 2006). The most striking result of our analysis was the specificity to the North African population of the microhaplotype carrying the ancestral C allele of rs3732783 and the derived T allele of rs6280. This microhaplotype represented 50% of all haplotypes in the North African populations. However, the frequency of the microhaplotype carrying the ancestral alleles rs3732783C and rs6280C in the North African populations was very low compared to those of other populations worldwide. Our study showed that the rs3732783 and rs6280 polymorphisms were not in LD in all North African populations. The generation of new haplotypes specific to the North African region could be due to different evolutionary forces acting over many human generations: at the genomic level through mutation and recombination, at the demographic level through migration and at the selective level following climatic or nutritional change. All these events have allowed the North African population to become genetically distinct from other populations.
An important limitation of this study was the limited number of SNPs analysed. Indeed, to group individuals according to their different origins, it is necessary to consider a very large number of markers so that the panel has sufficient discriminating power. In addition, the analysis of a large number of markers avoids bias in the estimation of inter‐ and intrapopulation variation (Willing et al., 2012). Moreover, the genetic diversity and structure of the North African population could be studied using a new approach that analyses transcriptomic markers specific to the population.
In conclusion, our study suggested that the North African genome lies in an intermediate position between Europe and Asia. The antiquity and richness of North Africa's colonisation history allowed the emergence of a particular genetic structure, which made North African populations genetically distinct from other populations.
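As a worked illustration of the heterozygosity and FST statistics interpreted in this discussion, the following Python sketch computes both for a single biallelic SNP. It assumes Wright's definition FST = (HT − HS)/HT with equally weighted populations, which is not necessarily the estimator used to obtain the values reported here; the function names and the allele frequencies are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) of a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(freqs):
    """Wright's FST = (HT - HS) / HT from per-population allele frequencies.

    Populations are weighted equally (an assumption of this sketch).
    """
    if not freqs:
        raise ValueError("at least one population is required")

    # HS: mean expected heterozygosity within populations
    h_s = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)

    # HT: expected heterozygosity of the pooled population
    p_bar = sum(freqs) / len(freqs)
    h_t = expected_heterozygosity(p_bar)

    # A monomorphic SNP carries no information on differentiation
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


# Hypothetical allele frequencies in two populations (not data from this study)
print(round(fst_biallelic([0.2, 0.3]), 4))
```

With these hypothetical frequencies FST is about 0.013, meaning that roughly 1.3% of the variability at this SNP would be attributed to differences between the two populations.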
ETHICAL COMPLIANCE
This work was approved by the Ethics Committee for Research in Life Sciences and Health of the ISBM (CER‐SVS/ISBM).
CONFLICT OF INTEREST
The authors have declared no conflicts of interest.
AUTHOR CONTRIBUTIONS
Souhir Mestiri: Data interpretation; writing‐original draft. Sami Boussetta and Sarra El Kamel: Methodology; data acquisition and analysis. Andrew J. Pakstis, Kenneth K. Kidd and Amel Ben Ammar El Gaaied: Writing‐review and editing. Lotfi Cherni: Project administration; supervision.
Supporting information: Table S1, Table S2, Table S3.