Sevtap Savas, Sukru Tuzmen, Hilmi Ozcelik.
Abstract
Single nucleotide polymorphisms (SNPs) constitute the most common type of genetic variation in humans. SNPs introducing premature termination codons (PTCs), herein called X-SNPs, can alter the stability and function of transcripts and proteins and thus are considered to be biologically important. Initial studies suggested a strong selection against such variations/mutations. In this study, we undertook a genome-wide systematic screening to identify human X-SNPs using the dbSNP database. Our results demonstrated the presence of 28 X-SNPs from 28 genes with known minor allele frequencies. Eight X-SNPs (28.6 per cent) were predicted to cause transcript degradation by nonsense-mediated mRNA decay. Seventeen X-SNPs (60.7 per cent) resulted in moderate to severe truncation at the C-terminus of the proteins (deletion of >50 per cent of the amino acids). The majority of the X-SNPs (78.6 per cent) represent commonly occurring SNPs, by contrast with the rarely occurring disease-causing PTC mutations. Interestingly, X-SNPs displayed a non-uniform distribution across human populations: eight X-SNPs were reported to be prevalent across three different human populations, whereas six X-SNPs were found exclusively in one or two population(s). In conclusion, we have systematically investigated human SNPs introducing PTCs with respect to their possible biological consequences, distributions across different human populations and evolutionary aspects. We believe that the SNPs reported here are likely to affect gene/protein function, although their biological and evolutionary roles need to be further investigated.
Year: 2006 PMID: 16595072 PMCID: PMC3500177 DOI: 10.1186/1479-7364-2-5-274
Source DB: PubMed Journal: Hum Genomics ISSN: 1473-9542 Impact factor: 4.639
Validated X-SNPs in the human genome.
| Gene | Gene function (a) | Accession # (b) | Location | SNP ID (c) | Frequency (d) | Homozygosity (e) | X-SNP | Protein length (truncation) (f) | NMD (g) | CpG (h) |
|---|---|---|---|---|---|---|---|---|---|---|
| AGT | Cell signalling; hypertension | | 1q42-q43 | rs5039 | CEPH-MULTI-NATIONAL 184 chr. | n/a | Q53X | 485 | + | + |
| APOC4 | Lipid metabolism | | 19q13.2 | rs5164 | CEPH-MULTI-NATIONAL 184 chr. | - | W47X | 127 | + | - |
| CDH15 | Cell adhesion; morphogenetic processes | | 16q24.3 | rs2270416 | JBIC-allele-EAST ASIA 1500 chr. G = 0.826 | n/a | Y788X | 814 | - | + |
| CLCA3 | Transport | | 1p31-p22 | rs2292830 | JBIC-allele-EAST ASIA 1462 chr. G = 0.569 | n/a | Y84X | 262 | - | - |
| CYP2C19 | Transport; drug metabolism and synthesis of lipids | | 10q24. | rs4986893 | PAC1-EAST ASIA 46 chr. G = 0.913 | n/a | W212X | 446 | - | - |
| DSCR8 | Unknown | | 21q22.2 | rs2836172 | NCBI\|NIHPDR-NORTH AMERICA 20 chr. A = 0.900 T = 0.100 | + | K79X | 91 | - | - |
| EPHX1 | Aromatic compound catabolism; xenobiotic metabolism | | 1q42.1 | rs4986931 | PAC1-EAST ASIA 46 chr. A = 0.978 G = 0.022; P1-MULTI-NATIONAL 202 chr. A = 0.990 G = 0.010; CAUC1-MULTI-NATIONAL 62 chr. | - | W97X | 455 (78.7%) | + | - |
| FUT2 | Carbohydrate metabolism; protein glycosylation | | 19q13.3 | rs1800030 | PAC1-EAST ASIA 48 chr. G = 0.979 A = 0.021; P1-MULTI-NATIONAL 202 chr. G = 0.995 | - | W297X | 346 | - | - |
| HPS4 | Organelle biogenesis; protein stabilisation/targeting | | 22cen-q12.3 | rs3747129 | JBIC-allele-EAST ASIA 1492 chr. G = 0.798 A = 0.202; AFD_EUR_PANEL-NORTH AMERICA 48 chr. G = 0.812 A = 0.188; AFD_AFR_PANEL-NORTH AMERICA 46 chr. G = 0.978 A = 0.022 | + | R246X | 528 | - | + |
| IL17RB | Immuno-regulatory activity; regulation of cell growth | | 3p21.1 | rs1043261 | JBIC-allele-EAST ASIA 1476 chr. C = 0.902 T = 0.098; HapMap-CEU-EUROPE 120 chr. C = 0.908 T = 0.092; AFD_EUR_PANEL-NORTH AMERICA 48 chr. C = 0.938 T = 0.062 | + | Q484X | 502 | - | + |
| KRTAP1-1 | Cytoskeleton; intermediate filaments | | 17q12-q21 | rs3213755 | JBIC-allele-EAST ASIA 708 chr. C = 0.617 T = 0.383; HapMap-CEU-EUROPE 120 chr. G = 0.800 A = 0.200 | + | Q51X | 177 | - | - |
| LCE5A | Unknown | | 1q21.3 | rs2282298 | JBIC-allele-EAST ASIA 1504 chr. G = 0.979 A = 0.021; AFD_EUR_PANEL-NORTH | - | R79X | 118 | - | + |
| LIG4 | DNA repair; cell cycle | | 13q33-q34 | rs2232636 | PAC1-EAST ASIA 46 chr. G = 1.000 A = 0.000; P1-MULTI-NATIONAL 202 chr. G = 0.995 A = 0.005 | - | W46X | 911 | - | - |
| LPL | Lipoprotein metabolism | | 8p22 | rs328 | WIAF-CSNP-MITOGPOP5-MULTI-NATIONAL 112 chr. C = 0.982 G = 0.018; JBIC-allele-EAST ASIA 1458 chr. C = 0.860 | + | S474X | 475 | - | - |
| MAGEE2 | Unknown | | Xq13.3 | rs1343879 | TSC_42_C-NORTH AMERICA 84 chr. C = 0.950 A = 0.050; C_42_A-EAST ASIA 84 chr. A = 0.650 | - | E120X | 523 | - | - |
| MS4A12 | Signal transduction | | 11q12 | rs2298553 | JBIC-allele-EAST ASIA 726 chr. C = 0.585 T = 0.415; AFD_EUR_PANEL-NORTH | + | Q71X | 267 | + | - |
| OAS2 | Immune response | | 12q24.2 | rs15895 | POOLED_CEPH-MULTI-NATIONAL 188 chr. A = 0.668 G = 0.332; CEPH-MULTI-NATIONAL 184 chr. C = 0.670 T = 0.330 | + | W720X | 727 | - | - |
| OVCH2 | Proteolysis | | 11p15.4 | rs4509745 | HapMap-CEU-EUROPE 120 chr. T = 0.658 | + | W556X | 564 | - | - |
| POLE2 | DNA repair | | 14q21-q22 | rs3218790 | NIHPDR-NORTH AMERICA 170 chr. | - | K443X | 527 | + | - |
| SERPINB11 | Serine-type endopeptidase inhibitor activity | | 18 | rs4940595 | AfAm 12 chr. C = 0.667 A = 0.333 | + | E90X | 392 | + | - |
| SMUG1 | DNA repair | | 12q13.11-q13.3 | rs2233919 | NIHPDR-NORTH AMERICA 574 chr. C = 0.986 T = 0.014; PDR90 166 chr. C = 0.988 T = 0.012 | - | Q3X | 270 | + | - |
| SPTBN5 | Actin cytoskeleton organisation and biogenesis | | 15q21 | rs2271286 | JBIC-allele-EAST ASIA 1482 chr. G = 0.951 A = 0.049 | - | Q72X | 3674 | - | - |
| TAP2 | Immune response; protein transport and assembly | | 6p21.3 | rs241448 | CEPH-MULTI-NATIONAL 184 chr. T = 0.700 C = 0.300; WIAF-CSNP-MITOGPOP5-MULTI-NATIONAL 48 chr. T = 0.812 C = 0.188 | + | Q687X | 703 | - | - |
| TAAR9 | Signal transduction | | 6q23.2 | rs2842899 | HapMap-CEU-EUROPE 120 chr. T = 0.708 A = 0.292; HapMap-YRI-WEST AFRICA 120 chr. T = 0.883 A = 0.117 | + | Q61X | 348 | - | - |
| TLR5 | Immune response | | 1q41-q42 | rs5744168 | D-0-NORTH AMERICA 48 chr. C = 0.938 T = 0.062; E-0-NORTH AMERICA 40 chr. C = 0.925 | - | R392X | 858 | - | + |
| TRPM1 | Cation transport | | 15q13-q14 | rs3784589 | JBIC-allele-EAST ASIA 1502 chr. C = 0.965 A = 0.035; HapMap-CEU-EUROPE 120 chr. C = 0.942 A = 0.058 | + | E1305X | 1533 | - | - |
| UNC93A | Unknown | | 6q27 | rs2235197 | JBIC-allele-EAST ASIA 1484 chr. G = 0.852 A = 0.148 | n/a | W151X | 456 | - | - |
| ZNF34 | Gene expression | | 8q24.3 | rs2294120 | JBIC-allele-EAST ASIA 1494 chr. C = 0.729 T = 0.271 | n/a | Q56X | 549 | + | + |
Abbreviation: SNP = single nucleotide polymorphism.
(a) Gene functions were retrieved from the Entrez Gene database of NCBI [30].
(b) The accession numbers of the sequences on which the SNP-flanking sequences were located.
(c) SNP IDs correspond to the dbSNP database SNP identifiers.
(d) The frequency information is as posted in dbSNP build 124.
(e) Indicates whether a homozygous sample was reported in a sample set for the corresponding X-SNP; collected from the 'summary of genotypes' section of the dbSNP database. 'n/a': no information was available; '+': a homozygous genotype was reported; '-': no homozygous genotype was reported.
(f) Length of the wild-type protein product. The percentage of the protein truncated at the C-terminus by the X-SNP is given in parentheses.
(g) SNPs that may lead to nonsense-mediated mRNA decay (NMD) are annotated by '+'.
(h) SNPs that occur at CpG dinucleotides, and thus may be mutation hot spots, are annotated by '+'.
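The truncation percentage in the protein length column can be illustrated with a short sketch. It assumes the percentage is (protein length - stop position) / protein length; the text does not state the formula, but this choice reproduces the 78.7% listed for EPHX1 (W97X, 455 amino acids). The function name `truncation_percent` is hypothetical.

```python
import re


def truncation_percent(x_snp, protein_length):
    """Percentage of the wild-type protein lost C-terminal to a premature stop.

    x_snp is a label such as "W97X": wild-type residue, codon position, X = stop.
    Assumption: truncation = (length - position) / length. This matches the
    EPHX1 value in the table, but the formula is not stated in the source.
    """
    match = re.fullmatch(r"[A-Z](\d+)X", x_snp)
    if match is None:
        raise ValueError(f"not an X-SNP label: {x_snp!r}")
    position = int(match.group(1))
    if not 1 <= position <= protein_length:
        raise ValueError("stop position lies outside the protein")
    return 100.0 * (protein_length - position) / protein_length


# EPHX1 row of the table: W97X in a 455-residue protein.
print(round(truncation_percent("W97X", 455), 1))  # 78.7
```

Applied to the other rows, the same calculation gives a severe truncation for an early stop such as SMUG1 Q3X and a negligible one for a stop at the penultimate codon such as LPL S474X.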
Figure 1. Percentage truncation induced by the X-SNPs in the corresponding proteins. The X-SNPs that can cause mRNA degradation via NMD are indicated by * in the left margin of the histogram.
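The text here does not say how NMD was predicted. One widely cited heuristic is that a PTC lying more than roughly 50 to 55 nucleotides upstream of the last exon-exon junction of the spliced transcript can elicit NMD. The sketch below applies that heuristic only as an illustration; the function name, its parameters, the 50-nucleotide default and the example coordinates are all assumptions, not values taken from the study.

```python
def may_trigger_nmd(ptc_pos, last_junction_pos, min_distance_nt=50):
    """Return True if a premature stop codon is predicted to elicit NMD.

    Both positions are nucleotide coordinates on the spliced mRNA.
    last_junction_pos is None for a transcript without introns.
    Assumption: the 50-nt cut-off is a common heuristic, not a value
    given in the source text.
    """
    if last_junction_pos is None:
        # No exon-exon junction downstream of the stop codon.
        return False
    return (last_junction_pos - ptc_pos) > min_distance_nt


# Hypothetical coordinates, for illustration only.
print(may_trigger_nmd(ptc_pos=290, last_junction_pos=1200))   # True
print(may_trigger_nmd(ptc_pos=1180, last_junction_pos=1200))  # False
print(may_trigger_nmd(ptc_pos=290, last_junction_pos=None))   # False
```

A rule of this kind needs exon structure for each transcript, which is not part of the table, so it cannot be checked against the NMD column from the data shown here.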
Figure 2. How can we explain the allele frequencies of the X-SNPs? This figure presents a summary of possible biological consequences of X-SNPs. For simplicity, both deleterious and slightly deleterious variations are annotated as deleterious.
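Allele frequencies in the table are given as dbSNP-style strings such as `G = 0.798 A = 0.202`. A minimal sketch of how a minor allele frequency could be read from one such single-population entry follows. The function names are hypothetical, and the handling of entries that list only one allele (taking 1 minus the reported frequency) assumes a biallelic SNP, which the source does not state.

```python
import re


def allele_frequencies(entry):
    """Parse 'G = 0.798 A = 0.202' style text for ONE population into a dict."""
    pairs = re.findall(r"\b([ACGT]) = ([01]\.\d+)", entry)
    return {allele: float(freq) for allele, freq in pairs}


def minor_allele_frequency(entry):
    """Return the minor allele frequency, or None if no frequency is listed."""
    freqs = allele_frequencies(entry)
    if not freqs:
        return None
    if len(freqs) == 1:
        # Assumption: biallelic SNP, so the unlisted allele carries the remainder.
        (reported,) = freqs.values()
        return round(min(reported, 1.0 - reported), 3)
    return min(freqs.values())


print(minor_allele_frequency("JBIC-allele-EAST ASIA 1492 chr. G = 0.798 A = 0.202"))  # 0.202
print(minor_allele_frequency("JBIC-allele-EAST ASIA 1500 chr. G = 0.826"))            # 0.174
print(minor_allele_frequency("CEPH-MULTI-NATIONAL 184 chr."))                         # None
```

Entries that combine several populations in one cell would need to be split per population before parsing, because the same allele letter recurs with a different frequency in each.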