Risa Anzai, Yoshiki Asami, Waka Inoue, Hina Ueno, Koya Yamada, Tetsuji Okada.
Abstract
Systematic analysis of the statistical and dynamical properties of proteins is critical to understanding cellular events. Extraction of biologically relevant information from a set of high-resolution structures is important because it can provide mechanistic details behind the functional properties of protein families, enabling rational comparison between families. Most current structural comparisons are pairwise, which hampers global analysis of the growing contents of the Protein Data Bank. Additionally, pairing of protein structures introduces uncertainty with respect to reproducibility because it is frequently accompanied by additional settings for superimposition. This study introduces intramolecular distance scoring for the global analysis of proteins for which at least several high-resolution structures are available. As a pilot study, we tested 300 human proteins and show that the method can be used comprehensively to survey advances in each protein and protein family at the atomic level. This method, together with the interpretation of the model calculations, provides new criteria for understanding specific structural variation in a protein, enabling global comparison of the variability in proteins from different species.
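The abstract does not define the score itself, so the following is only a minimal sketch of what an intramolecular distance score could look like. It assumes, purely for illustration, that each Cα pair is scored as the mean of its distance across structures divided by the standard deviation of that distance; the function names and the toy coordinates are hypothetical. Because only distances within each structure are used, no superimposition step (and none of its settings) is needed.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def pair_distances(coords):
    """Distances between all residue pairs within ONE structure.

    coords: list of (x, y, z) tuples, one per residue (e.g. C-alpha atoms).
    """
    return {
        (i, j): math.dist(coords[i], coords[j])
        for i, j in combinations(range(len(coords)), 2)
    }


def distance_scores(structures):
    """Per-pair (mean, stdev, score) across several structures of one protein.

    ASSUMPTION (not stated in the abstract): score = mean / stdev.
    All structures must list the same residues in the same order.
    """
    if len(structures) < 2:
        raise ValueError("need at least two structures to compute a stdev")
    n = len(structures[0])
    if any(len(s) != n for s in structures):
        raise ValueError("all structures must cover the same residues")

    per_structure = [pair_distances(s) for s in structures]
    result = {}
    for pair in per_structure[0]:
        d = [ps[pair] for ps in per_structure]
        avg, sd = mean(d), stdev(d)
        # A pair whose distance never changes has no measurable variation.
        score = avg / sd if sd > 0 else math.inf
        result[pair] = (avg, sd, score)
    return result


# Toy example: three "structures" of a three-residue chain on a line.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
]
scores = distance_scores(toy)
```

Under this assumed definition, a rigid pair (small stdev) receives a high score and a variable pair a low one; translating or rotating any input structure leaves every score unchanged.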
Keywords: Biochemistry; Bioinformatics; Biophysics; Molecular biology; Structural biology; Systems biology
Year: 2018 PMID: 29560428 PMCID: PMC5857612 DOI: 10.1016/j.heliyon.2018.e00510
Source DB: PubMed Journal: Heliyon ISSN: 2405-8440
Summary of the 30 proteins described in the main text, out of the 300 human proteins analyzed by distance scoring analysis (DSA). Each full-length value, used to calculate Length (%), includes the signal and activation peptide parts, if they are present.
| Gene | Protein | UniProt | Entries | Entries (%) | Length | Length (%) | Resolution (Å) | Score |
|---|---|---|---|---|---|---|---|---|
| AKR1B1 | Aldose reductase | P15121 | 128 | 95.5 | 307 | 97.5 | 1.42 | 144 |
| AKR1B10 | Aldo-keto reductase family 1 member B10 | O60218 | 19 | 100 | 316 | 100 | 1.83 | 197 |
| AMY2A | Pancreatic alpha-amylase | P04746 | 46 | 100 | 495 | 96.9 | 1.9 | 171 |
| B2M | Beta-2-microglobulin | P61769 | 536 | 85.6 | 99 | 83.2 | 2.16 | 49.6 |
| B4GALT1 | Beta-1,4-galactosyltransferase 1 | P15291 | 14 | 82.4 | 272 | 68.3 | 2.14 | 310 |
| BCKDHA | 2-oxoisovalerate dehydrogenase subunit alpha | P12694 | 22 | 91.7 | 280 | 62.9 | 1.85 | 269 |
| BCKDHB | 2-oxoisovalerate dehydrogenase subunit beta | P21953 | 24 | 100 | 326 | 83.2 | 1.85 | 391 |
| CA1 | Carbonic anhydrase 1 | P00915 | 24 | 100 | 256 | 98.5 | 1.96 | 133 |
| CA2 | Carbonic anhydrase 2 | P00918 | 638 | 98.8 | 255 | 98.1 | 1.68 | 149 |
| CA13 | Carbonic anhydrase 13 | Q8N1Q1 | 13 | 100 | 257 | 98.1 | 1.72 | 159 |
| CALM1 | Calmodulin | P0DP25 | 46 | 58.2 | 140 | 94.6 | 2.3 | 16.4 |
| CTSB | Cathepsin B | P07858 | 11 | 100 | 203 | 59.9 | 2.39 | 84.3 |
| CTSK | Cathepsin K | P43235 | 50 | 96.2 | 209 | 97.2 | 2.09 | 114 |
| CTSS | Cathepsin S | P25774 | 31 | 100 | 217 | 65.6 | 1.88 | 159 |
| CYP2A6 | Cytochrome P450 2A6 | P11509 | 11 | 100 | 463 | 93.7 | 2.09 | 216 |
| DDB1 | DNA damage-binding protein 1 | Q16531 | 16 | 45.7 | 771 | 67.7 | 3.01 | 50.7 |
| FNTA | Protein farnesyltransferase/geranylgeranyltransferase type-1 subunit alpha | P49354 | 14 | 100 | 313 | 82.8 | 2.03 | 349 |
| FNTB | Protein farnesyltransferase subunit beta | P49356 | 13 | 92.9 | 407 | 93.1 | 1.97 | 436 |
| GLTP | Glycolipid transfer protein | Q9NZD2 | 14 | 63.6 | 200 | 95.7 | 1.96 | 93.9 |
| HIST1H2AB | Histone H2A type 1-B/E | P04908 | 48 | 94.1 | 103 | 79.2 | 2.78 | 147 |
| NAGA | Alpha-N-acetylgalactosaminidase | P17050 | 7 | 100 | 387 | 94.2 | 1.83 | 428 |
| PRKACA | cAMP-dependent protein kinase catalytic subunit alpha | P17612 | 28 | 84.8 | 302 | 86.3 | 1.98 | 102 |
| RANGAP1 | Ran GTPase-activating protein 1 | P46060 | 13 | 100 | 156 | 26.6 | 2.33 | 44 |
| RBP2 | Retinol-binding protein 2 | P50120 | 26 | 100 | 133 | 99.3 | 1.55 | 48.4 |
| SOD1 | Superoxide dismutase [Cu-Zn] | P00441 | 59 | 69.4 | 151 | 98.1 | 1.88 | 89.4 |
| TOP1 | DNA topoisomerase 1 | P11387 | 14 | 93.3 | 412 | 53.9 | 2.75 | 99 |
| ADORA2A | Adenosine receptor A2a | P29274 | 30 | 100 | 200 | 48.5 | 2.64 | 58.8 |
| ADRB2 | Beta-2 adrenergic receptor | P07550 | 17 | 85 | 200 | 48 | 3.14 | 65.1 |
| CNR1 | Cannabinoid receptor 1 | P21554 | 4 | 100 | 200 | 42.4 | 2.79 | 58.9 |
| HTR2B | 5-hydroxytryptamine receptor 2B | P41595 | 4 | 100 | 200 | 41.6 | 2.85 | 108 |
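As a small worked example of the Length (%) column, the sketch below divides the analyzed length by the full-length value and rounds to one decimal place. The full lengths used here (119 residues for B2M, 260 for CA2) are the UniProt precursor lengths, which are assumptions brought in for illustration and are not listed in the table; the function name is hypothetical.

```python
def length_percent(analyzed_length, full_length):
    """Percentage of the full-length sequence covered by the analyzed part.

    full_length is taken to include any signal/activation peptide,
    as the table caption describes.
    """
    return round(100.0 * analyzed_length / full_length, 1)


# Analyzed lengths come from the table; full lengths are ASSUMED (UniProt).
b2m = length_percent(99, 119)   # B2M, P61769
ca2 = length_percent(255, 260)  # CA2, P00918
```

With these assumed full lengths, the results reproduce the tabulated values of 83.2 and 98.1.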
Fig. 1. The main plot obtained from distance scoring analysis (DSA). (a) ctsk; (b) prkaca; (c) ddb1; and (d) amy2a.
Fig. 2. Comparison between model and real main plots of hist1h2ab. (a) Model plot with three random stdev sets; (b) model plot with 10 random stdev sets; (c) real main plot.
Fig. 3. The summary plot obtained from DSA of 300 human proteins. (a) Proteins analyzed using 4–14 structures; (b) proteins analyzed using 15–26 structures; (c) proteins analyzed using 27–638 structures. For each panel, proteins with lowest/highest resolutions and scores are marked by arrows, with the gene name followed by the number of used structures in parentheses.
Fig. 4. Progress plot obtained from DSA. (a) b4galt1; (b) cyp2a6; (c) top1; and (d) rangap1. For cyp2a6 and top1, exponential fittings are overlaid with dotted curves.
Fig. 5. The main plot of all-alpha proteins focusing on short-distance range. (a) gltp; (b) il4; (c) adrb2. Inset: whole view of the main plot. For adrb2, intrahelical and interhelical Cα pairs are colored in red and blue dots, respectively.
Fig. 6. The main plot highlighting systematic appearance of high-scoring pairs. (a) akr1b1; (b) sod1; (c) lyz. All the residues contributing to the red dots in the main plot are graphically shown in the inset as red part of the ribbon. The PDB ID of the inset is (a) 4YU1; (b) 5U9M (chain A); (c) 208L.