Kazunori D Yamada, Hafumi Nishi, Junichi Nakata, Kengo Kinoshita.
Abstract
Functional sites on proteins play an important role in various molecular interactions and reactions between proteins and other molecules. Thus, mutations in functional sites can severely affect the overall phenotype. Progress in genome sequencing projects has yielded a wealth of information on single nucleotide variants (SNVs), especially those with less than 1% minor allele frequency (rare variants). To understand the functional influence of genetic variants at the protein level, we investigated the relationship between SNVs and protein functional sites in terms of minor allele frequency and the structural position of variants. We observed that SNVs were less abundant at ligand binding sites, which is consistent with a previous study on SNVs and protein interaction sites. Additionally, we found that non-rare variants tended to be located slightly apart from enzyme active sites. Examination of non-rare variants revealed that most of the mutations resulted in moderate changes in the physico-chemical properties of amino acids, suggesting the existence of functional constraints. In conclusion, this study shows that mapping genetic variants onto protein structures could be a powerful approach to evaluating the functional impact of rare genetic variations.
Keywords: 3D structure; non-synonymous mutation; protein-ligand interaction; rare variant
Year: 2016 PMID: 27924270 PMCID: PMC5042176 DOI: 10.2142/biophysico.13.0_157
Source DB: PubMed Journal: Biophys Physicobiol ISSN: 2189-4779
Statistics of variants and non-variants at ligand binding sites and other different locations of proteins
| | Binding site | Non-binding surface | Interior | Total |
|---|---|---|---|---|
| All variants | 230 | 9,070 | 7,090 | 16,390 |
| Rare | 220 | 8,709 | 6,884 | 15,813 |
| Intermediate | 4 | 187 | 106 | 297 |
| Common | 6 | 174 | 100 | 280 |
| Non-variants | 3,995 | 130,181 | 141,347 | 275,523 |
| Total | 4,225 | 139,251 | 148,437 | 291,913 |
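The claim that SNVs are less abundant at ligand binding sites can be checked roughly against the counts in the table above. The following is a minimal sketch, not the authors' analysis: it only computes the share of positions carrying a variant at each location, using the "All variants" and "Total" rows. The names `counts`, `variant_fraction`, and `fractions` are illustrative, and no statistical test is implied.

```python
# Counts copied from the table above: (all variants, total positions) per location.
counts = {
    "binding site": (230, 4225),
    "non-binding surface": (9070, 139251),
    "interior": (7090, 148437),
}


def variant_fraction(variants: int, total: int) -> float:
    """Return the share of positions at a location that carry a variant."""
    return variants / total


# Fraction of variant positions for each structural location.
fractions = {loc: variant_fraction(v, t) for loc, (v, t) in counts.items()}

for loc, frac in fractions.items():
    print(f"{loc}: {frac:.2%}")
```

In this simple comparison the binding-site fraction is lower than the non-binding-surface fraction, which is the direction the abstract describes. Whether the difference is significant would need a proper test on the underlying data.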
Top 10 ligands for all binding sites and the sites where rare variants were mapped. All ligands except peptides and DNA/RNA are represented by the PDB three-letter codes
| Rank (all binding sites) | Ligand type (all binding sites) | Count (all binding sites) | Rank (rare-variant mapped sites) | Ligand type (rare-variant mapped sites) | Count (rare-variant mapped sites) |
|---|---|---|---|---|---|
| 1 | Peptide | 90 | 1 | DNA/RNA | 42 |
| 1 | CA | 90 | 2 | Peptide | 22 |
| 3 | ZN | 85 | 3 | CA | 12 |
| 4 | DNA/RNA | 52 | 3 | NAP | 12 |
| 5 | MG | 41 | 5 | ADP | 8 |
| 6 | ADP | 26 | 6 | FAD | 7 |
| 7 | SAH | 18 | 7 | SAM | 6 |
| 8 | NAP | 17 | 8 | NDP | 5 |
| 9 | FAD | 15 | 8 | NAD | 5 |
| 10 | ANP | 13 | 10 | ZN | 4 |
| 10 | NAD | 13 | 10 | HEM | 4 |
Intermediate and common variants at ligand binding sites. MAF (EA): minor allele frequency in the European American population, MAF (AA): minor allele frequency among African Americans, SAS: sulfasalazine, PNP: 4-nitrophenyl hydrogen methylphosphonate, CA: calcium ion, FAD: flavin adenine dinucleotide
| Mutation | Protein | Ligand | MAF (EA) | MAF (AA) | MAF (All) | RS number |
|---|---|---|---|---|---|---|
| I105V | GSTP1 | SAS | 33.27 | 41.96 | 36.08 | rs1695 |
| D101N | HLA-A | peptide | 31.37 | 26.85 | 29.84 | rs1136688 |
| L273M | ALPPL2 | PNP | 25.31 | 31.24 | 27.34 | rs17416141 |
| K186E | KLK1 | CA | 26.86 | 19.63 | 24.41 | rs5517 |
| E750D | LPHN1 | peptide | 1.116 | 18.23 | 6.912 | rs41276898 |
| T117S | CYB5R3 | FAD | 0.06980 | 27.49 | 9.357 | rs1800457 |
| T97I | HLA-A | peptide | 3.716 | 7.461 | 4.985 | rs1136688 |
| D197N | DOK7 | peptide | 0.05810 | 6.446 | 2.222 | rs16844422 |
| A114T | PABPC3 | peptide | 1.535 | 0.4539 | 1.169 | rs117014540 |
| Y89H | HLA-DRB1 | peptide | 1.086 | 0.8700 | 1.013 | rs17882583 |
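To make the frequency classes concrete, the sketch below bins the variants in the table above by their MAF (All) value. It rests on two assumptions not stated in this excerpt: the MAF values are percentages, and the boundary between intermediate and common variants is 5%. Only the 1% rare threshold comes from the abstract. The names `RARE_MAX`, `COMMON_MIN`, `classify`, and `tally` are illustrative.

```python
from collections import Counter

RARE_MAX = 1.0    # percent; rare threshold stated in the abstract
COMMON_MIN = 5.0  # percent; ASSUMED intermediate/common boundary, not given in this excerpt


def classify(maf_percent: float) -> str:
    """Bin a minor allele frequency (in percent) into a frequency class."""
    if maf_percent < RARE_MAX:
        return "rare"
    if maf_percent < COMMON_MIN:
        return "intermediate"
    return "common"


# MAF (All) values copied from the table above, keyed by mutation.
maf_all = {
    "I105V": 36.08, "D101N": 29.84, "L273M": 27.34, "K186E": 24.41,
    "E750D": 6.912, "T117S": 9.357, "T97I": 4.985, "D197N": 2.222,
    "A114T": 1.169, "Y89H": 1.013,
}

# Count how many variants fall into each class.
tally = Counter(classify(maf) for maf in maf_all.values())
print(dict(tally))
```

Under the assumed 5% boundary, this yields six common and four intermediate variants, which agrees with the counts of 6 and 4 in the binding-site column of the first table.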
Figure 1. Examples of non-rare variants at ligand binding sites. A: alkaline phosphatase (colored in pink) with 4-nitrophenyl hydrogen methylphosphonate (cyan) (PDBID: 1zed), B: cytochrome b5 reductase 3 (gray) with flavin adenine dinucleotide (cyan) (PDBID: 3w5h), C: kallikrein 1 (blue) with a calcium ion (yellow) (PDBID: 1spj). Variant sites are shown as orange stick models, and other ligand binding residues are shown as light green sticks. Note that not all ligand binding residues are presented in the figure.
Statistics of variants and non-variants at enzyme active sites. Numbers corresponding to the non-redundant dataset are shown in parentheses
| Rare variants | Intermediate variants | Common variants | All variants (total) | Non-variants | Total |
|---|---|---|---|---|---|
| 48 (39) | 1 (0) | 0 (0) | 49 (39) | 953 (728) | 1,002 (767) |
Figure 2. The spatial distances of variants and all residues from active sites in the non-redundant set. Black: rare variants, red: non-rare (intermediate and common) variants, blue: protein interior residues, green: protein surface residues. Note that the total numbers of rare and non-rare variants are 5,817 and 191, respectively.
Figure 3. Example of a non-rare variant in the active site of chymase (PDBID: 4afq). The active-site residues of chymase (Ser62, His66, and Asp89) are shown as sticks. The His66 variant site is colored in orange; the non-variant sites Ser62 and Asp89 are shown in light green.