Anna R Panchenko, Thomas Madej.
Abstract
BACKGROUND: Protein evolution and protein classification are usually inferred by comparing protein cores in their conserved aligned parts. Structurally aligned protein regions are separated by less conserved loop regions, where sequence and structure locally deviate from each other and do not superimpose well.
Year: 2005 PMID: 15691378 PMCID: PMC549550 DOI: 10.1186/1471-2148-5-10
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
List of the names of 59 test protein families together with their CDD accession names, lengths, number of protein pairs, Pearson correlation coefficients between LHM (AHM) and normalized Blast bitscore. The families are ordered with respect to decreasing quality of LHM correlation. The supplementary table is available at [27].
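| Family | CDD accession | Length | Number of protein pairs | Pearson r (AHM) | Pearson r (LHM) |
|---|---|---|---|---|---|
| Xylose_isom | pfam00259 | 381 | 28 | -0.99 | -0.98 |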
| MHC_I | pfam00129 | 175 | 28 | -0.95 | -0.96 |
| PTPc | smart00194 | 248 | 25 | -0.92 | -0.96 |
| IPT | smart00429 | 97 | 21 | -0.90 | -0.94 |
| ZnMc_1 | smart00235 | 137 | 34 | -0.83 | -0.94 |
| RNAse_Pc | cd00163 | 99 | 25 | -0.82 | -0.94 |
| gpdh_C | pfam02800 | 153 | 39 | -0.72 | -0.93 |
| Aamy_C | smart00632 | 81 | 31 | -0.94 | -0.90 |
| peroxidase | pfam00141 | 240 | 48 | -0.90 | -0.90 |
| copper-bind | pfam00127 | 81 | 87 | -0.84 | -0.89 |
| CBM_20 | pfam00686 | 94 | 15 | -0.91 | -0.89 |
| RnaseA | pfam00074 | 98 | 44 | -0.48 | -0.87 |
| IGv | cd00099 | 105 | 133 | -0.78 | -0.86 |
| ADH_zinc_N | pfam00107 | 337 | 64 | -0.93 | -0.86 |
| ldh_C | pfam02866 | 143 | 29 | -0.93 | -0.86 |
| RIP | pfam00161 | 232 | 28 | -0.87 | -0.85 |
| Peptidase_C1 | pfam00112 | 200 | 55 | -0.82 | -0.85 |
| ZnMc_2 | cd00203 | 134 | 23 | -0.87 | -0.85 |
| PROF | cd00148 | 120 | 15 | -0.90 | -0.85 |
| plant_peroxidase | cd00314 | 236 | 76 | -0.90 | -0.83 |
| alpha-amylase_C | pfam02806 | 78 | 39 | -0.93 | -0.82 |
| sodcu | pfam00080 | 139 | 15 | -0.98 | -0.81 |
| fer2_1 | cd00207 | 78 | 38 | -0.86 | -0.80 |
| Pept_C1 | smart00645 | 202 | 90 | -0.86 | -0.79 |
| ferritin | pfam00210 | 152 | 19 | -0.94 | -0.79 |
| ldh | pfam00056 | 135 | 44 | -0.82 | -0.78 |
| SH2 | pfam00017 | 86 | 21 | -0.48 | -0.78 |
| flavodoxin | pfam00258 | 143 | 26 | -0.88 | -0.78 |
| EFh | cd00051 | 57 | 59 | -0.75 | -0.77 |
| rhv_1 | cd00205 | 195 | 71 | -0.86 | -0.76 |
| LYZ1_1 | smart00263 | 116 | 67 | -0.66 | -0.75 |
| aldo_ket_red | pfam00248 | 277 | 28 | -0.93 | -0.73 |
| COesterase | pfam00135 | 485 | 28 | -0.80 | -0.72 |
| TIG | pfam01833 | 89 | 39 | -0.90 | -0.72 |
| fer2_2 | pfam00111 | 69 | 73 | -0.77 | -0.70 |
| beta-lactamase | pfam00144 | 264 | 45 | -0.90 | -0.70 |
| rhv_2 | pfam00073 | 216 | 95 | -0.86 | -0.70 |
| GLECT | cd00070 | 124 | 28 | -0.80 | -0.67 |
| globin | pfam00042 | 133 | 96 | -0.74 | -0.66 |
| GST_C | pfam00043 | 107 | 77 | -0.77 | -0.63 |
| LYZ1_2 | cd00119 | 109 | 24 | -0.43 | -0.61 |
| PA2c | smart00085 | 102 | 210 | -0.29 | -0.57 |
| lipocalin | pfam00061 | 131 | 55 | -0.62 | -0.56 |
| phoslip | pfam00068 | 102 | 102 | -0.21 | -0.54 |
| proteasome | pfam00227 | 189 | 56 | -0.80 | -0.51 |
| UBCc | smart00212 | 141 | 45 | -0.79 | -0.50 |
| Sm | smart00651 | 63 | 30 | -0.54 | -0.49 |
| Tryp_SPc | smart00020 | 208 | 561 | -0.55 | -0.46 |
| CLECT_1 | smart00034 | 90 | 35 | -0.59 | -0.44 |
| crystall | pfam00030 | 81 | 10 | -0.76 | -0.41 |
| CLECT_2 | cd00037 | 93 | 263 | -0.45 | -0.36 |
| RHO | smart00174 | 173 | 10 | -0.52 | -0.36 |
| IGc1 | cd00098 | 88 | 85 | -0.65 | -0.32 |
| Tryp_SPc | cd00190 | 211 | 378 | -0.55 | -0.31 |
| MHC_II_beta | pfam00969 | 86 | 32 | -0.52 | -0.26 |
| ADK | pfam00406 | 174 | 28 | -0.37 | -0.19 |
| Rho | cd00157 | 172 | 66 | -0.20 | -0.16 |
| Phycobilisome | pfam00502 | 148 | 15 | -0.85 | -0.10 |
| ADF | smart00102 | 116 | 10 | -0.85 | 0.34 |
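Each correlation in the table above relates a structural distance to the normalized Blast bitscore over the protein pairs of one family. As a minimal sketch of that calculation, the following computes a Pearson correlation coefficient in plain Python. The function name `pearson_r` and the five data points are illustrative assumptions, not values or code from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for five protein pairs of one family (NOT from the paper).
bitscores = [0.9, 0.7, 0.5, 0.3, 0.1]  # normalized Blast bitscore
lhm = [0.8, 1.9, 2.7, 4.1, 5.2]        # loop Hausdorff measure, in Angstroms

r = pearson_r(bitscores, lhm)
print(f"Pearson r = {r:.2f}")
```

A strongly negative value, as in most rows of the table, means that the structural distance grows as sequence similarity falls.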
The table shows the median Pearson correlation coefficient, the fraction of families with a statistically significant correlation (P-value less than 0.01), and the fraction of families with the ratio r2 higher than 0.9 for each measure of structural similarity used in the study.
| Median Pearson correlation coefficient | Families with significant correlation, P < 0.01 (%) | Families with ratio r2 > 0.9 (%) |
|---|---|---|
| -0.81 | 90 | 71 |
| -0.82 | 90 | 80 |
| -0.76 | 88 | 71 |
Figure 1. Hausdorff measure (in Angstroms) for loop (LHM) and aligned (AHM) regions plotted versus the normalized Blast bitscore for three families: pancreatic ribonucleases (RnaseA), Ig-like plexins/transcription factors (IPT), and trypsin-like serine proteases (Tryp_SPc). The solid line shows the linear regression fit to the data.
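For readers unfamiliar with the Hausdorff measure, the sketch below computes the standard symmetric Hausdorff distance between two point sets. This excerpt does not give the exact definition used for LHM and AHM, so the code should be read as the textbook form under that assumption. The function names and the coordinates are hypothetical.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical C-alpha coordinates (Angstroms) of one loop in two
# already-superimposed structures. These are NOT data from the paper.
loop_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
loop_b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 4.0, 0.0)]

d = hausdorff(loop_a, loop_b)
print(f"Hausdorff distance = {d:.1f} A")
```

Unlike RMSD, this distance needs no residue-by-residue correspondence between the two sets, which is why it can be applied to loops that differ in length.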
Figure 2. Complete linkage cluster trees produced using the fraction of non-identical residues (p-distance), RMSD (Å), and LHM (Å) between proteins from the pancreatic ribonuclease family (RnaseA). The five major groups of the RnaseA family according to Rosenberg et al [23] are: eosinophil ribonucleases (ER), pancreatic ribonucleases (PR), angiogenins (ANG), Rana ribonucleases (RR) and ribonuclease 4 (R4). The maximum parsimony tree described by Rosenberg et al [23] is given in the Phylip format: (RR, ((ANG, R4), (PR, ER))).
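Complete linkage clustering repeatedly merges the two clusters whose farthest members are closest to each other. The following is a minimal sketch that builds such a tree from a pairwise distance table and prints it in the same parenthesized notation as the tree above, without branch lengths. The function name, the leaf labels A to D and the distances are hypothetical and are not taken from the study.

```python
def complete_linkage(labels, dist):
    """Agglomerative clustering with complete linkage.

    labels: list of leaf names.
    dist: dict mapping frozenset({a, b}) to the distance between a and b.
    Returns the tree as a nested-parentheses string.
    """
    # Each cluster is (tree_string, tuple_of_member_leaves).
    clusters = [(name, (name,)) for name in labels]

    def linkage(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(dist[frozenset((a, b))] for a in c1[1] for b in c2[1])

    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest linkage distance.
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        a, b = clusters[i], clusters[j]
        merged = ("(" + a[0] + ", " + b[0] + ")", a[1] + b[1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Hypothetical pairwise distances between four proteins (NOT from the paper).
dist = {
    frozenset(("A", "B")): 1.0,
    frozenset(("C", "D")): 1.5,
    frozenset(("A", "C")): 4.0,
    frozenset(("A", "D")): 5.0,
    frozenset(("B", "C")): 4.5,
    frozenset(("B", "D")): 6.0,
}

tree = complete_linkage(["A", "B", "C", "D"], dist)
print(tree)
```

The same routine works for any of the three distances named in the caption (p-distance, RMSD or LHM), since it only needs a symmetric table of pairwise values.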
Figure 3. Complete linkage cluster trees produced using the fraction of non-identical residues (p-distance), RMSD (Å), and LHM (Å) between proteins from the SH2 family (SH2). The classifications of SH2 domains according to [25, 26] are given in parentheses: syk (1B, B), shptp2 (4, C), vsrc (1A, A), hck (1A, A), csk (1B, B), P85a (3, D) and shc (4,).