
Genetic Diversity and Balancing Selection within the Human Phenylalanine Hydroxylase (PAH) Gene Region in Iranian Population.

A Haghighatnia, S Vallian, J Mowla, Z Fazeli.

Abstract

BACKGROUND: The genetic diversity of three polymorphic markers in the phenylalanine hydroxylase (PAH) gene region, PvuII(a), PAHSTR and MspI, was investigated.
METHODS: Unrelated individuals (n=139) from the Iranian population were genotyped using primers specific to the PAH gene markers PvuII(a), MspI and PAHSTR. The amplified products for PvuII(a) and MspI were digested using the appropriate restriction enzymes and separated on 1.5% agarose gels. The PAHSTR alleles were identified using polyacrylamide gel electrophoresis followed by silver staining. The exact size of the STR alleles was determined by sequencing. The allele frequency and population status of the alleles were estimated using PHASE, FBAT and GENEPOP software.
RESULTS: The estimated degree of heterozygosity for PAHSTR, MspI and PvuII (a) was 66%, 56% and 58%, respectively. The haplotype estimation analysis of the markers resulted in nine informative haplotypes with frequencies ≥5%. Moreover, the results obtained from Ewens-Watterson test for neutrality suggested that the markers were under balancing selection in the Iranian population.
CONCLUSION: These findings suggested the presence of genetic diversity at these three markers in the PAH gene region. Therefore, the markers could be considered as functional markers for linkage analysis of the PAH gene mutations in the Iranian families with the PKU disease.

Keywords:  Genetic diversity; Iran; Phenylalanine hydroxylase (PAH); Phenylketonuria (PKU)

Year:  2012        PMID: 23113183      PMCID: PMC3468980     

Source DB:  PubMed          Journal:  Iran J Public Health        ISSN: 2251-6085            Impact factor:   1.429


Introduction

Deficiency of the human phenylalanine hydroxylase (PAH) enzyme due to mutations in the coding region of the PAH gene is associated with phenylketonuria (PKU) (1–4). The PAH gene is 90 kbp in length and contains 13 exons (5–7). More than 500 different mutations associated with PKU have been identified in this gene (3, 7). Therefore, direct mutation analysis, especially for families with an affected member (child), is time consuming and very expensive. Alternatively, indirect analysis of mutations using linked markers has proven to be useful. However, linkage analysis requires the presence of informative haplotypes in the population (8). There are two multiallelic markers in the PAH gene region: a variable number of tandem repeats (VNTR) at the 3′ end of the gene and an intragenic short tandem repeat (STR) in intron 3. Moreover, eight biallelic markers are present along the gene region, including BglII, PvuII(a), PvuII(b), EcoRI, XmnI, MspI, HindIII and EcoRV (5, 7, 9, 10). Although each marker can be analyzed independently, it is more informative to analyze them in groups. When multiple markers in a chromosomal region are used to assess association with a disease, determining haplotypes is more efficient than analyzing the individual markers separately (11). Two categories of data can be used to determine haplotypes: i) population-based data and ii) pedigree or family-based data (12). Several algorithms have been developed to infer haplotypes and estimate haplotype frequencies from genotype data obtained from unrelated individuals (phase-unknown or population-based data) (11). These algorithms include parsimony, expectation-maximization, Bayesian, and perfect and imperfect phylogeny methods (13–18). However, a proportion of the inferred haplotypes might be incorrect, which can be considered the main disadvantage of all these algorithms (17–20).
In family-based methods, families spanning several generations are usually genotyped across several loci, and the linkage phase can be determined by recording the alleles passed from one generation to the next (12). It has been reported that switch errors, in which a segment of the maternal haplotype is incorrectly joined to the paternal one, occur extremely rarely in trio samples (parents and their offspring). Switch errors were found to be more frequent in unrelated individuals because less information is available. However, even for unrelated individuals, haplotype frequencies can be estimated with relatively high accuracy (21). According to the theory of neutral molecular evolution, most polymorphisms in a population are selectively neutral. A number of neutrality tests have been devised that allow general inferences about the causes of molecular evolution (22, 23). Using these tests, the impact of selection has been studied at a number of genes, such as the ABO blood group, the major histocompatibility antigens (HLA) and lactase (24–26). Although most polymorphisms at the PAH gene are located in non-coding regions, it was necessary to test the neutrality of these markers. Such studies could increase our insight into the evolutionary history of human populations. In the present study, the genetic structure of three other markers of the PAH gene, PvuII(a), STR and MspI, was investigated.
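The trio-based phasing idea described in this section can be illustrated with a minimal Python sketch. It is not the PedPhase algorithm; it only assumes Mendelian transmission at each locus independently, and the function names and the example trio are hypothetical.

```python
from itertools import product


def transmitted_alleles(child, mother, father):
    """Return (maternal_allele, paternal_allele) for one locus.

    Each genotype is a pair of alleles. The result is None when the
    parental origin cannot be resolved (for example when all three
    members are heterozygous for the same alleles) or when the trio
    is not Mendelian-consistent.
    """
    options = {
        (m, f)
        for m, f in product(mother, father)
        if sorted((m, f)) == sorted(child)
    }
    return next(iter(options)) if len(options) == 1 else None


def phase_child(child, mother, father):
    """Phase a child's multi-locus genotype from its parents.

    Conservative sketch: returns None as soon as one locus is unresolved.
    Otherwise returns (maternal_haplotype, paternal_haplotype).
    """
    maternal, paternal = [], []
    for c, m, f in zip(child, mother, father):
        alleles = transmitted_alleles(c, m, f)
        if alleles is None:
            return None
        maternal.append(alleles[0])
        paternal.append(alleles[1])
    return tuple(maternal), tuple(paternal)


# Hypothetical trio; loci are ordered PvuII(a), PAHSTR, MspI.
child = [(1, 0), (4, 7), (1, 0)]
mother = [(1, 1), (4, 6), (1, 0)]
father = [(0, 0), (7, 8), (0, 0)]
print(phase_child(child, mother, father))
```

In this hypothetical trio every locus is resolved, so the child carries a maternal 1-4-1 and a paternal 0-7-0 haplotype. A locus at which the child and both parents are heterozygous for the same two alleles would return None, which mirrors the situation in which trio data alone are not sufficient for phasing.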

Material and Methods

DNA sample

Overall, 139 individuals from the Isfahan population of Iran were included in the study: 100 unrelated healthy persons and 13 family trios (two parents and one child each).

Genotyping

Genomic DNA was first extracted from peripheral blood using a standard salting-out procedure (27). The DNA samples were amplified for two biallelic restriction fragment length polymorphism (RFLP) markers, PvuII(a) and MspI, and one multiallelic short tandem repeat marker (PAHSTR). The PvuII(a) site is located at the 5′ end of intron 2 (GenBank AF003966) (28). The MspI site is in intron 7, downstream to exon 8 (GenBank AF003967) (28). PCR was performed in a total volume of 25 μL containing 50 ng DNA, 500 mM KCl, 100 mM Tris–HCl (pH 8.4), 50 mM MgCl2, 200 μM dNTP, 5 U Taq DNA polymerase and 10 pmol of each forward and reverse primer. Initial denaturation was performed at 94 °C for 4 minutes, followed by 30 cycles of 1 minute denaturation at 95 °C, 1 minute annealing at a marker-dependent temperature (57 °C for PAHSTR and 58 °C for MspI and PvuII(a)) and 1 minute extension at 72 °C, followed by a final extension for 10 minutes at 72 °C. The amplified DNA for the MspI and PvuII(a) loci was subjected to enzymatic digestion. The digestion products were separated by electrophoresis on 1.5% agarose gels (29, 30). The exact sizes of the PAHSTR alleles were determined by separating the alleles on PAGE, extracting them from the gel using a melting and freezing procedure as described previously (31) and cloning them into pTZ57R/T using the TA clone™ PCR cloning Kit (Fermentas, Germany). The cloned PAHSTR alleles were sequenced using an ABI 737 sequencer (Perkin Elmer/ABI).

Statistical analysis

Allele frequencies and the observed and expected heterozygosity were determined using the GENEPOP software (32). Haplotype frequencies were estimated using PHASE (33) and FBAT (34): PHASE was applied to the genotype data of the 100 unrelated individuals, and FBAT to the genotype data obtained from the families. The genotyping data obtained from the family trios were also used to infer the haplotype phase of each individual with the PedPhase program (35, 36). To provide a better description of the haplotype estimation results, linkage disequilibrium (LD) was also estimated for the 100 unrelated individuals using the 2LD computer program (37). The Ewens–Watterson homozygosity test of neutrality was performed using the PyPop program (38) and also using the Popgene32 software (available at http://www.ualberta.ca/~fyeh/download.htm). The observed F (sum of squared allele frequencies) and the lower and upper limits of its 95% confidence region were calculated using these programs.

Results

Three markers in the phenylalanine hydroxylase (PAH) gene region, PAHSTR, PvuII(a) and MspI, were genotyped. Genotyping of the PAHSTR marker revealed nine different alleles of 260, 252, 248, 244, 240, 236, 232, 228 and 224 bp in size. The exact size of the alleles was confirmed by sequencing (Fig. 1, lane M).
Fig. 1:

Genotyping of the PAHSTR marker located in the PAH gene region. The PAHSTR alleles were separated by 12% polyacrylamide gel electrophoresis followed by silver staining. The indicated alleles were as follows: Lane 1, allele 7; Lane 2, alleles 8 and 9; Lane 3, alleles 7 and 4; Lane 4, allele 6; Lane 5, allele 6; Lane 6, alleles 8 and 7; and Lane 7, allele 6. M represents a DNA ladder composed of four sequenced alleles.

The PvuII (a) and MspI biallelic markers were genotyped using PCR followed by restriction enzyme digestion (Fig. 2).
Fig. 2:

Genotyping of the PvuII(a) and MspI markers. A) For the PvuII(a) marker, PCR using specific primers resulted in a 375 bp fragment. In the presence of the restriction site, the amplified product was digested by PvuII into 225 and 150 bp fragments. B) For the MspI marker, PCR using specific primers resulted in a 425 bp fragment. In the presence of the restriction site, the amplified product was digested by MspI into 300 and 125 bp fragments. Symbols (+) and (−) represent the presence and absence of the restriction site, respectively. Lane M represents a 100 bp DNA ladder.
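The band patterns described in this legend translate into genotype calls in a simple way, sketched below in Python. The fragment sizes are those given in the legend; the function name and the size tolerance are hypothetical choices made only for illustration.

```python
# Fragment sizes (bp) taken from the Fig. 2 legend.
MARKERS = {
    "PvuII(a)": {"uncut": 375, "cut": (225, 150)},
    "MspI": {"uncut": 425, "cut": (300, 125)},
}


def call_rflp_genotype(marker, bands, tolerance=10):
    """Call '+/+', '+/-' or '-/-' from the band sizes seen on the gel.

    `tolerance` (bp) is an assumed allowance for sizing error on agarose.
    """
    spec = MARKERS[marker]

    def seen(size):
        return any(abs(band - size) <= tolerance for band in bands)

    has_uncut = seen(spec["uncut"])
    has_cut = all(seen(size) for size in spec["cut"])

    if has_cut and has_uncut:
        return "+/-"  # one chromosome digested, one not
    if has_cut:
        return "+/+"  # both chromosomes carry the restriction site
    if has_uncut:
        return "-/-"  # neither chromosome carries the restriction site
    raise ValueError(f"Unexpected band pattern for {marker}: {bands}")


print(call_rflp_genotype("PvuII(a)", [375]))
print(call_rflp_genotype("MspI", [300, 125]))
print(call_rflp_genotype("MspI", [425, 300, 125]))
```

A heterozygote shows all three bands because only the chromosome carrying the site is digested.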

The allelic frequency and the expected and observed heterozygosity of the markers were estimated by GENEPOP software as depicted in Tables 1, 2 and 3.
Table 1:

The frequency distribution of the MspI and PvuII(a) markers in the Iranian population

Site        Frequency of allele (+)   Frequency of allele (−)
MspI        0.4800                    0.5200
PvuII(a)    0.6300                    0.3700
Table 2:

The frequency distribution of the PAHSTR marker in the Iranian population

PAHSTR allele   Frequency   Size (bp)
1               0.0300      260
2               0.1050      252
3               0.1850      248
4               0.2200      244
5               0.2150      240
6               0.1050      236
7               0.1000      232
8               0.0350      228
9               0.0050      224
Table 3:

Expected and observed heterozygosity of three markers, PAHSTR, MspI and PvuII(a) at the PAH gene in the Iranian population

Heterozygosity   PAHSTR   MspI     PvuII(a)
Observed         66%      56%      58%
Expected         84.1%    50.17%   46.8%
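The link between allele frequencies, the homozygosity statistic F and the expected heterozygosity can be sketched in Python using the PAHSTR frequencies from Table 2. The small-sample correction with 200 gene copies (100 unrelated individuals) is an assumption; the exact estimator and sample used by GENEPOP are not stated in the text, and the function names are illustrative.

```python
def homozygosity(freqs):
    """F: the sum of squared allele frequencies."""
    return sum(p * p for p in freqs)


def expected_heterozygosity(freqs, n_gene_copies=None):
    """Expected heterozygosity, 1 - F.

    If `n_gene_copies` is given, the usual n / (n - 1) small-sample
    correction is applied (an assumption about the estimator used).
    """
    h = 1.0 - homozygosity(freqs)
    if n_gene_copies is not None:
        h *= n_gene_copies / (n_gene_copies - 1)
    return h


# PAHSTR allele frequencies from Table 2 (alleles 1 to 9).
pahstr = [0.0300, 0.1050, 0.1850, 0.2200, 0.2150, 0.1050, 0.1000, 0.0350, 0.0050]

f_obs = homozygosity(pahstr)
h_exp = expected_heterozygosity(pahstr, n_gene_copies=200)
print(f"F = {f_obs:.4f}, expected heterozygosity = {h_exp:.1%}")
```

Under these assumptions the sketch gives F of about 0.163 and an expected heterozygosity of about 84.1%, in line with the PAHSTR values reported in Tables 3 and 6.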
The genotyping data for PAHSTR showed that the 244 bp and 224 bp alleles had the highest and lowest frequencies, respectively. The observed heterozygosity for the PAHSTR marker was lower than the expected heterozygosity (Table 3). The frequencies of the MspI and PvuII(a) restriction sites were 0.4800 and 0.6300, respectively, in the Iranian population. The haplotype frequencies for the three markers were then estimated. As shown in Table 4, among the 36 possible haplotypes estimated by the PHASE software, nine were considered informative haplotypes, with frequencies ≥5% in the Iranian population. Haplotype estimation in the families was performed with the FBAT software, which resulted in 7 informative haplotypes with frequencies ≥5% among 23 possible haplotypes. The presence and absence of the restriction sites in PvuII(a) and MspI were indicated by the numbers 1 and 0, respectively. For the PAHSTR marker, the numbers 1 to 9 were used to indicate the different allele sizes (Table 5).
Table 4:

The estimation of PvuII(a)-PAHSTR-MspI haplotypes frequency at the PAH gene by two programs, FBAT and PHASE in the Iranian population

Haplotype*   FBAT       PHASE
111          0.009615   0.004907
110          0.000000   0.004039
121          0.000000   0.026179
120          0.009615   0.057420
131          0.048077   0.055828
130          0.038462   0.059984
141          0.201923   0.053070
140          0.105769   0.089032
151          0.019231   0.070292
150          0.000000   0.088447
161          0.009615   0.027882
160          0.086538   0.036665
171          0.009615   0.012482
170          0.000000   0.032121
181          0.019231   0.013548
180          0.000000   0.006676
191          0.000000   0.000301
190          0.000000   0.001126
011          0.009615   0.020803
010          0.000000   0.000251
021          0.057692   0.014405
020          0.009615   0.006995
031          0.067308   0.058781
030          0.038462   0.010408
041          0.086538   0.072781
040          0.048077   0.025116
051          0.000000   0.040881
050          0.019231   0.005380
061          0.009615   0.028945
060          0.067308   0.028945
071          0.009615   0.005851
070          0.019231   0.049546
081          0.000000   0.000350
080          0.000000   0.009426
091          0.000000   0.000150
090          0.000000   0.003422

*The numbers in each haplotype represent the alleles of the PvuII(a), STR and MspI markers, respectively.
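The three-digit haplotype codes and the ≥5% criterion for informative haplotypes can be made concrete with a short Python sketch. The frequencies are copied from Table 4 and the allele sizes from Table 2; the helper names are hypothetical.

```python
# Rows of Table 4: haplotype code, FBAT frequency, PHASE frequency.
TABLE4 = """
111 0.009615 0.004907
110 0.000000 0.004039
121 0.000000 0.026179
120 0.009615 0.057420
131 0.048077 0.055828
130 0.038462 0.059984
141 0.201923 0.053070
140 0.105769 0.089032
151 0.019231 0.070292
150 0.000000 0.088447
161 0.009615 0.027882
160 0.086538 0.036665
171 0.009615 0.012482
170 0.000000 0.032121
181 0.019231 0.013548
180 0.000000 0.006676
191 0.000000 0.000301
190 0.000000 0.001126
011 0.009615 0.020803
010 0.000000 0.000251
021 0.057692 0.014405
020 0.009615 0.006995
031 0.067308 0.058781
030 0.038462 0.010408
041 0.086538 0.072781
040 0.048077 0.025116
051 0.000000 0.040881
050 0.019231 0.005380
061 0.009615 0.028945
060 0.067308 0.028945
071 0.009615 0.005851
070 0.019231 0.049546
081 0.000000 0.000350
080 0.000000 0.009426
091 0.000000 0.000150
090 0.000000 0.003422
"""

rows = [line.split() for line in TABLE4.strip().splitlines()]
fbat = {code: float(f) for code, f, _ in rows}
phase = {code: float(p) for code, _, p in rows}

# PAHSTR allele number -> size in bp (Table 2).
STR_SIZES = {1: 260, 2: 252, 3: 248, 4: 244, 5: 240, 6: 236, 7: 232, 8: 228, 9: 224}


def informative(freqs, threshold=0.05):
    """Haplotype codes whose estimated frequency is at least `threshold`."""
    return [code for code, f in freqs.items() if f >= threshold]


def decode(code):
    """Expand a code such as '141' into the alleles of the three markers."""
    return {
        "PvuII(a)": "+" if code[0] == "1" else "-",
        "PAHSTR_bp": STR_SIZES[int(code[1])],
        "MspI": "+" if code[2] == "1" else "-",
    }


print("PHASE:", informative(phase))
print("FBAT: ", informative(fbat))
print("Both: ", sorted(set(informative(phase)) & set(informative(fbat))))
print(decode("141"))
```

Applied to Table 4, this filter selects nine haplotypes for PHASE and seven for FBAT, four of which (031, 041, 140 and 141) are shared.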

Table 5:

D′ and χ2 values for three possible pairing of markers in the PAH locus

Pairing of markers   D′         χ2      df
PvuII(a)-PAHSTR      0.217190   16.89   8
PvuII(a)-MspI        0.179677   3.93    1
PAHSTR-MspI          0.318775   27.75   8
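For readers unfamiliar with D′, the following Python sketch shows the standard Lewontin normalization for two biallelic markers. It is not the 2LD implementation, which also handles multiallelic markers such as PAHSTR. The allele frequencies are taken from Table 1, while the joint haplotype frequency `p_ab` is a hypothetical value chosen only to illustrate the calculation.

```python
def d_prime(p_ab, p_a, p_b):
    """Lewontin's D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return 0.0 if d_max == 0 else d / d_max


# Allele (+) frequencies from Table 1; p_ab is hypothetical.
p_pvu, p_msp = 0.63, 0.48
p_ab = 0.35
print(f"D' = {d_prime(p_ab, p_pvu, p_msp):.3f}")
```

D′ is 0 when the haplotype frequency equals the product of the allele frequencies and reaches an absolute value of 1 when at least one of the four possible haplotypes is absent.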
The haplotype frequencies of the PvuII(a), MspI and PAHSTR markers were estimated using two programs, FBAT for the families and PHASE for the unrelated individuals. Haplotype estimation showed the presence of nine informative haplotypes with PHASE [120, 131, 130, 141, 140, 151, 150, 031, 041] and seven with FBAT [141, 140, 160, 021, 031, 041, 060]. Four haplotypes, 141, 140, 031 and 041, were estimated as informative by both programs. The PHASE results showed that five further haplotypes, 120, 131, 130, 151 and 150, could be considered informative in the Iranian population. In the next step, the haplotype phase in the 13 family trios was determined using the PedPhase program. As presented in Table 4, the haplotype phase of all three members (two parents and their offspring) could be determined in only five of the thirteen families. The parental origin of each haplotype (maternal or paternal) was also distinguished in these families. In the remaining eight pedigrees, the genotyping data were not sufficient for haplotype phasing. Estimation of haplotype frequencies using the PHASE or FBAT programs among the phase-known haplotypes indicated the presence of seven haplotypes, 021, 151, 160, 031, 141, 140 and 041, with frequencies >5% (Table 4). The pattern of LD for each pair of the three markers was assessed by calculating the average D′ and χ2. As indicated in Table 5, the estimated D′ values for all three possible pairings of markers were smaller than 0.5. The calculated χ2 values were also smaller than the χ2 values obtained from the chi-square table (P<0.05). Analysis of the neutrality of the three markers showed that the observed F value (sum of squared allele frequencies) lay outside the lower and upper limits of the 95% confidence region of the expected F value for two markers of the PAH gene, PAHSTR and MspI (Table 6).
Table 6:

The Ewens-Watterson test for neutrality at three markers of the PAH gene in the Iranian population

Marker     k   Observed F   Expected F   Fnd       L95      U95
PvuII(a)   2   0.5392       0.8314       −1.7443   0.5018   0.9901
PAHSTR     9   0.1631       0.3436       −1.4452   0.1865   0.6798
MspI       2   0.5004       0.8314       −1.9756   0.5032   0.9901

k: Number of alleles; Observed F: Observed sum of the squared allele frequencies; Fnd: Normalized deviate of F; L95, U95: Lower and upper limits of the 95% confidence interval
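How Table 6 is read can be sketched in a few lines of Python: a marker departs from neutral expectation when its observed F falls outside the interval [L95, U95], and an observed F below the expected F corresponds to a negative Fnd. The values are copied from Table 6; the variable and function names are illustrative, and the sketch does not recompute the expected F or its limits.

```python
# Values copied from Table 6.
TABLE6 = {
    "PvuII(a)": {"k": 2, "F_obs": 0.5392, "F_exp": 0.8314, "Fnd": -1.7443, "L95": 0.5018, "U95": 0.9901},
    "PAHSTR": {"k": 9, "F_obs": 0.1631, "F_exp": 0.3436, "Fnd": -1.4452, "L95": 0.1865, "U95": 0.6798},
    "MspI": {"k": 2, "F_obs": 0.5004, "F_exp": 0.8314, "Fnd": -1.9756, "L95": 0.5032, "U95": 0.9901},
}


def outside_interval(row):
    """True when the observed F lies outside the 95% limits."""
    return not (row["L95"] <= row["F_obs"] <= row["U95"])


for marker, row in TABLE6.items():
    direction = "below" if row["F_obs"] < row["F_exp"] else "above"
    status = "outside" if outside_interval(row) else "inside"
    print(f"{marker}: observed F {direction} expected F, {status} the 95% limits")
```

With these values, PAHSTR and MspI fall below their lower limits, whereas PvuII(a) remains inside the interval.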

Discussion

Genetic diversity plays an important role in the application of genetic markers to linkage analysis for carrier diagnosis of genetic diseases. In the present study, the genetic diversity of three markers located in the PAH gene region, PvuII(a), MspI and PAHSTR, was examined in the Iranian population. Genotyping showed nine different alleles for the PAHSTR marker in this population. Comparison with other populations indicated that the frequencies of the PAHSTR alleles in the Iranian population were very similar to those of other populations (Table 2) (39). However, minor differences were observed: the alleles of 216 and 256 bp were absent in the Iranian population. A similar situation was observed in other Asian populations studied previously (39). As in populations from South America, North America, Oceania and some parts of East Asia, the 244 bp allele was the most frequent in the Iranian population (39). The 224 bp allele was absent in some populations, such as those of South and North America, Europe and Africa. Moreover, in some other populations, e.g. in some Asian regions, the 224 bp allele had the lowest frequency. Among the biallelic markers examined, PvuII(a) showed a higher degree of heterozygosity (Table 1); therefore, this marker could be suggested as an informative and applicable marker for carrier detection. The observed heterozygosity of the PAHSTR marker was lower than the expected one, which may indicate that this marker is not optimal for diagnostic purposes. However, combining the PAHSTR and PAHVNTR polymorphic systems could significantly increase the informativeness of the STR/VNTR haplotype in linkage analysis in PKU families. Moreover, the data showed that the observed heterozygosity of MspI and PvuII(a) was higher than their expected heterozygosity.
The observed heterozygosity of the PvuII(a) marker was higher than that of MspI, suggesting that this marker could be more applicable than MspI in carrier detection and prenatal diagnosis of PKU in the Iranian population. Estimation of haplotype frequencies with the FBAT program indicated frequencies of <5% for most haplotypes. Because this estimation was based only on genotype data from 13 family trios, the low frequencies of these haplotypes likely resulted from the insufficient number of individuals used with the FBAT program. The phase of the PvuII(a)-PAHSTR-MspI haplotype was also determined in the 13 family trios using the PedPhase program. Inference of the PvuII(a)-PAHSTR-MspI haplotype phase indicated that haplotype diversity was high in the Iranian population. Determination of haplotype phase at the PAH gene requires genotype data from other family members. Similar results were reported previously for BglII-EcoRI-PAHVNTR haplotypes (40). As shown in Table 3, the observed heterozygosity of the markers studied was >50%. Therefore, many individuals in the Iranian population are expected to be heterozygous for more than one marker and thus to require haplotype phasing at the PAH gene. Furthermore, the estimates of D′ and chi-square (χ2) for these markers showed that they were not in linkage disequilibrium. The LD results indicated that these markers at the PAH gene were independent of each other. In the Ewens-Watterson test for neutrality, the observed F values of PAHSTR and MspI lay outside the limits of the 95% confidence region. Therefore, these two markers of the PAH gene were not neutral and may be linked with some selected traits or genes. These two markers might thus be under genetic hitchhiking, which is a potent force in changing allelic frequency and heterozygosity (41, 42).
The observed and expected F values were estimated for the PvuII(a), MspI and PAHSTR markers using the PyPop program. Comparison of these data revealed that the observed F was lower than the expected F for all three markers at the PAH gene in the Iranian population. These results, together with the negative values of the normalized deviate of F (Fnd), indicated that these markers were under balancing selection in the Iranian population. In conclusion, the results of the haplotype frequency estimation suggested that the combination of the PvuII(a)/PAHSTR/MspI markers could be used as an informative tool for the diagnosis of PAH gene mutations in the Iranian population. The results of this study could increase our insight into the diversity and evolutionary history of these markers at the PAH gene in the Iranian population.

Ethical considerations

Ethical issues (Including plagiarism, Informed Consent, misconduct, data fabrication and/or falsification, double publication and/or submission, redundancy, etc) have been completely observed by the authors.
  32 in total

1.  ALFRED: an allele frequency database for diverse populations and DNA polymorphisms.

Authors:  K H Cheung; M V Osier; J R Kidd; A J Pakstis; P L Miller; K K Kidd
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  A new statistical method for haplotype reconstruction from population data.

Authors:  M Stephens; N J Smith; P Donnelly
Journal:  Am J Hum Genet       Date:  2001-03-09       Impact factor: 11.025

3.  Comparisons of two methods for haplotype reconstruction and haplotype frequency estimation from population data.

Authors:  S Zhang; A J Pakstis; K K Kidd; H Zhao
Journal:  Am J Hum Genet       Date:  2001-10       Impact factor: 11.025

4.  2LD, GENECOUNTING and HAP: Computer programs for linkage disequilibrium analysis.

Authors:  Jing Hua Zhao
Journal:  Bioinformatics       Date:  2004-02-10       Impact factor: 6.937

Review 5.  Natural selection and the evolutionary history of major histocompatibility complex loci.

Authors:  A L Hughes; M Yeager
Journal:  Front Biosci       Date:  1998-05-26

6.  A simple salting out procedure for extracting DNA from human nucleated cells.

Authors:  S A Miller; D D Dykes; H F Polesky
Journal:  Nucleic Acids Res       Date:  1988-02-11       Impact factor: 16.971

7.  The hitch-hiking effect of a favourable gene.

Authors:  J M Smith; J Haigh
Journal:  Genet Res       Date:  1974-02       Impact factor: 1.588

8.  Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population.

Authors:  L Excoffier; M Slatkin
Journal:  Mol Biol Evol       Date:  1995-09       Impact factor: 16.240

9.  PCR detection of the MspI (Aa) RFLP at the human phenylalanine hydroxylase (PAH) locus.

Authors:  N Wedemeyer; B Dworniczak; J Horst
Journal:  Nucleic Acids Res       Date:  1991-04-25       Impact factor: 16.971

10.  Associations between mutations and a VNTR in the human phenylalanine hydroxylase gene.

Authors:  A A Goltsov; R C Eisensmith; D S Konecki; U Lichter-Konecki; S L Woo
Journal:  Am J Hum Genet       Date:  1992-09       Impact factor: 11.025

  1 in total

1.  Molecular Genetic Analysis of the Variable Number of Tandem-Repeat Alleles at the Phenylalanine Hydroxylase Gene in Iranian Azeri Turkish Population.

Authors:  Morteza Bagheri; Isa Abdi Rad; Nima Hosseini Jazani; Rasoul Zarrin; Ahad Ghazavi
Journal:  Iran Biomed J       Date:  2015-05-30