Luonan Chen, Ling-Yun Wu, Yong Wang, Shihua Zhang, Xiang-Sun Zhang.
Abstract
BACKGROUND: Protein structure comparison is one of the most important problems in computational biology and plays a key role in protein structure prediction, fold family classification, motif finding, phylogenetic tree reconstruction and protein docking.
Year: 2006 PMID: 16948858 PMCID: PMC1574323 DOI: 10.1186/1472-6807-6-18
Source DB: PubMed Journal: BMC Struct Biol ISSN: 1472-6807
Figure 1. An example of two protein chains and their assignment matrix S, with chain sizes 5 and 7.
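To make the idea of an assignment matrix concrete, here is a minimal sketch in Python. It assumes, since the caption does not spell it out, that S is a 0/1 matrix with one row per residue of the first chain and one column per residue of the second, and that each residue is matched at most once. The function names and the example matching are hypothetical.

```python
def assignment_matrix(n1, n2, pairs):
    """Build an n1 x n2 0/1 matrix with S[i][j] = 1 for each matched pair (i, j)."""
    S = [[0] * n2 for _ in range(n1)]
    for i, j in pairs:
        S[i][j] = 1
    return S


def is_valid_assignment(S):
    """Assumed constraint: every row sum and every column sum is at most 1."""
    rows_ok = all(sum(row) <= 1 for row in S)
    cols_ok = all(sum(col) <= 1 for col in zip(*S))
    return rows_ok and cols_ok


# Hypothetical matching between a 5-residue chain and a 7-residue chain.
pairs = [(0, 1), (1, 2), (2, 3), (4, 6)]
S = assignment_matrix(5, 7, pairs)
for row in S:
    print(row)
print(is_valid_assignment(S))
```

Under these assumptions, the number of aligned atoms is simply the number of ones in S, and unmatched residues correspond to all-zero rows or columns.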
Figure 2. An example of the alignment of two protein chains, 1DHF and 8DFR. The iteration number of the algorithm is denoted by t. (a) The two proteins 1DHF and 8DFR in their original coordinates, shown in the Cα representation of the backbone. (b) and (c) Relative positions of the two proteins during the convergence process. (d) The optimal alignment found by our algorithm. The final alignment number m is 182 with rms = 0.7 at λ = 6.0 and = 0.0.
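The two quantities reported in the caption, m and rms, can be illustrated with a short Python sketch. It assumes that rms is the usual root-mean-square distance between matched Cα atoms, measured after the two chains have already been superimposed. It does not perform the superposition or the optimization itself, and the coordinates below are toy values, not taken from 1DHF or 8DFR.

```python
import math


def rms_distance(coords_a, coords_b, pairs):
    """Root-mean-square distance over matched atom pairs (i in A, j in B).

    Assumes both coordinate sets are already in a common frame.
    """
    if not pairs:
        raise ValueError("at least one matched pair is required")
    total = 0.0
    for i, j in pairs:
        total += sum((p - q) ** 2 for p, q in zip(coords_a[i], coords_b[j]))
    return math.sqrt(total / len(pairs))


# Toy coordinates: the third atom of each chain is left unmatched.
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (5.0, 5.0, 5.0)]
pairs = [(0, 0), (1, 1)]

m = len(pairs)
rms = rms_distance(chain_a, chain_b, pairs)
print(m, rms)
```

The sketch shows why the two numbers are reported together: dropping poorly fitting pairs lowers rms but also lowers m, so an alignment is better only if it improves one without sacrificing the other.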
Comparison of structure alignment algorithms by rms distance and number of aligned atoms m.
| Category | Protein pair | SAMO rms | SAMO m | SAMO λ | Dali rms | Dali m | CE rms | CE m | Lund rms | Lund m |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Reductases | 1DHF | 0.7 | 182 | 6.0 | 0.7 | 182 | 0.7 | 182 | 0.7 | 182 |
| | 1DHF | 1.8 | 156 | 6.0 | 2.0 | 154 | 2.0 | 154 | 2.0 | 156 |
| | 1DHF | 1.6 | 159 | 6.0 | 1.7 | 158 | 1.7 | 158 | 1.7 | 159 |
| | 8DFR - 4DFR | 1.9 | 157 | 6.0 | 2.0 | 155 | 2.0 | 155 | 2.0 | 157 |
| | 8DFR - 3DFR | 1.6 | 159 | 6.0 | 1.8 | 159 | 1.8 | 158 | 1.8 | 160 |
| | 4DFR | 1.5 | 156 | 6.0 | 1.5 | 154 | 1.5 | 155 | 1.5 | 155 |
| Globins | 2HHB | 1.4 | 139 | 6.0 | 1.4 | 138 | 1.5 | 139 | 1.4 | 139 |
| | 2HHB | 1.5 | 141 | 6.0 | 1.5 | 139 | 1.6 | 141 | 1.5 | 141 |
| | 2HHB | 1.6 | 140 | 6.0 | 1.7 | 138 | 1.7 | 136 | 1.6 | 138 |
| | 2HHB | 2.2 | 131 | 6.0 | 2.3 | 129 | 2.6 | 128 | 2.3 | 130 |
| | 1MBD - 2HBG | 2.0 | 141 | 8.0 | 2.2 | 140 | 2.1 | 140 | 2.0 | 140 |
| | 2HHB | 1.6 | 145 | 6.0 | 1.6 | 145 | 1.6 | 144 | 1.6 | 145 |
| | 2HHB | 1.7 | 137 | 6.0 | 2.0 | 135 | 1.9 | 134 | 1.7 | 136 |
| | 2HHB | 2.2 | 140 | 6.0 | 2.3 | 138 | 2.4 | 139 | 2.4 | 140 |
| | 2LHB - 1MBD | 1.4 | 137 | 6.0 | 1.4 | 135 | 1.6 | 137 | 1.5 | 137 |
| | 2LHB - 2HBG | 1.9 | 133 | 6.0 | 2.0 | 128 | 2.1 | 130 | 2.0 | 132 |
| | 1MBD - 1MBA | 1.9 | 143 | 6.0 | 1.9 | 142 | 1.8 | 141 | 1.9 | 143 |
| | 1MBA - 1ECD | 1.9 | 136 | 8.0 | 1.9 | 133 | 2.0 | 134 | 2.0 | 136 |
| | 2HBG - 1ECD | 2.4 | 129 | 6.5 | 2.6 | 129 | 2.6 | 125 | 2.5 | 129 |
| Ten 'difficult' structures | 1FXI | 2.5 | 70 | 6.0 | 2.6 | 60 | 3.8 | 64 | 2.6 | 63 |
| | 1TEN - 3HHR | 1.7 | 87 | 6.0 | 1.9 | 86 | 1.9 | 87 | 1.8 | 87 |
| | 3HLA | 2.9 | 87 | 6.0 | 3.0 | 75 | 3.4 | 84 | 3.3 | 83 |
| | 2AZA | 2.5 | 82 | 4.5 | 2.5 | 81 | 2.9 | 84 | 2.4 | 83 |
| | 1CEW | 2.3 | 83 | 6.5 | 2.3 | 81 | 2.3 | 81 | 2.2 | 82 |
| | 1CID - 2RHE | 2.3 | 98 | 6.5 | 3.2 | 97 | 2.9 | 97 | 2.5 | 97 |
| | 1CRL - 1EDE | 3.1 | 281 | 6.0 | 3.5 | 211 | 3.8 | 219 | 5.0 | 126 |
| | 2SIM - 1NSB | 2.9 | 322 | 6.0 | 3.3 | 291 | 3.0 | 275 | 2.9 | 292 |
| | 1BGE | 3.3 | 110 | 7.5 | 3.3 | 94 | 3.9 | 107 | 3.3 | 104 |
| | 1TIE - 4FGF | 2.4 | 115 | 6.0 | 3.1 | 114 | 2.9 | 116 | 2.7 | 115 |
| Different folds | 1NSB | 3.1 | 156 | 6.0 | - | - | 6.4 | 88 | - | - |
| | 1NSB | 3.0 | 118 | 6.0 | - | - | 5.8 | 72 | - | - |
| | 1FXI | 2.9 | 56 | 6.0 | - | - | 7.2 | 56 | - | - |
| | 1FXI | 2.9 | 70 | 6.0 | - | - | 5.8 | 48 | - | - |
| Different Classes | 1BGE | 2.8 | 82 | 6.0 | - | - | 7.4 | 40 | - | - |
| | 1BGE | 3.2 | 103 | 6.0 | - | - | 6.2 | 48 | - | - |
| | 2GMF | 3.0 | 68 | 6.0 | - | - | 4.8 | 40 | - | - |
| Circular permutation | 1LED - 1NLS | 1.1 | 213 | 6.0 | 1.9 | 119 | 1.1 | 112 | - | - |
| | 2PIA - 1AXJ | 3.3 | 118 | 6.0 | 3.49 | 36 | 3.3 | 62 | - | - |
Comparison of our algorithm with the method of [24] for circularly permuted proteins.
| Category | Protein 1 (ID/Size) | Protein 2 (ID/Size) | SAMO λ | SAMO rms | SAMO m | [24] rms | [24] m |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Naturally occurring | 1RIN/180 | 2CNA/237 | 6.0 | 1.581 | 174 | 0.877 | 45 |
| | 1RSY/121 | 1QAS/123 | 6.0 | 1.741 | 118 | 1.107 | 44 |
| | 1NKL/78 | 1QDM/74 | 6.0 | 2.852 | 72 | 1.823 | 48 |
| | 1ONR/316 | 1FBA/360 | 6.0 | 3.016 | 244 | 2.444 | 77 |
| | 1AQI/382 | 1BOO/282 | 6.0 | 3.329 | 200 | 3.571 | 66 |
| Human made | 1AVD/123 | 1SWG/112 | 6.0 | 2.499 | 98 | 0.815 | 66 |
| | 1GBG/214 | 1AJK/212 | 6.0 | 2.879 | 182 | 0.347 | 110 |
Figure 3. Comparison of the naturally occurring protein pair 1RIN/180 and 2CNA/237 by SAMO. Subfigures (a) and (b) show the backbones of proteins 1RIN and 2CNA, respectively. Subfigure (c) shows the alignment after optimal superposition, with the chains in different colors: the red chain is 2CNA and the blue chain is 1RIN. The two termini of each structure are labeled with their residue names. Notice that the termini of the blue chain (1RIN) are aligned to the middle of the red chain (2CNA) and vice versa.
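The observation that the termini of one chain align to the middle of the other can be checked mechanically on any list of matched residue pairs. The sketch below is an illustrative test, not part of SAMO. It sorts the pairs by position in the first chain and counts how often the position in the second chain jumps backwards. No jumps means a sequential alignment. Exactly one jump is consistent with a circular permutation, although it does not prove one. The function name and toy pairs are hypothetical.

```python
def count_order_breaks(pairs):
    """Count backward jumps in chain-B positions when pairs are sorted by chain A."""
    ordered = sorted(pairs)
    js = [j for _, j in ordered]
    return sum(1 for a, b in zip(js, js[1:]) if b < a)


# Toy sequential alignment: chain-B positions only increase.
sequential = [(0, 0), (1, 1), (2, 3), (3, 4)]

# Toy circular permutation: the start of chain A maps to the middle of chain B,
# and the end of chain A wraps around to the start of chain B.
permuted = [(0, 3), (1, 4), (2, 5), (3, 0), (4, 1), (5, 2)]

print(count_order_breaks(sequential))  # 0
print(count_order_breaks(permuted))    # 1
```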
Figure 4. Comparison of the protein pair 1TPO/223 (β-trypsin) and 2ACT/218 (actinidin) by SAMO. Subfigure (a) shows the alignment after optimal superposition, with the chains in different colors: the red chain is 1TPO and the blue chain is 2ACT. There are 122 matched amino-acid pairs with RMSD = 2.02. The active site region of 1TPO is highlighted in yellow and the active site region of 2ACT is drawn in green. Subfigure (b) shows the detailed match of the active sites between 1TPO (red) and 2ACT (blue). As indicated in the figure, the superposition of Cα atoms found by SAMO brings the catalytic triads close together. In the matched active sites, the amino acids come from different fragments of each protein chain. Some segments contain contiguous residues. The actual matching of these segments is, for 2ACT: 24–25, 151–152, 132–137, 158–167, 177–183, 191–198; for 1TPO: 194–195, 85–86, 31–36, 39–48, 50–56, 101–108. The actual matching of the residues forming the active sites is listed in Subfigure (c).
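The segment lists in the caption can be expanded into residue-level pairs with a few lines of Python. This sketch assumes that the segments correspond in the order listed (2ACT 24–25 with 1TPO 194–195, and so on) and that residues within a segment pair up position by position without gaps. The caption suggests this, because each pair of segments has equal length, but it does not state it explicitly.

```python
# Segment boundaries (inclusive) as listed in the caption.
SEGMENTS_2ACT = [(24, 25), (151, 152), (132, 137), (158, 167), (177, 183), (191, 198)]
SEGMENTS_1TPO = [(194, 195), (85, 86), (31, 36), (39, 48), (50, 56), (101, 108)]


def expand(seg_a, seg_b):
    """Pair residues of two equal-length segments position by position (assumed)."""
    (a0, a1), (b0, b1) = seg_a, seg_b
    if a1 - a0 != b1 - b0:
        raise ValueError("segments differ in length")
    return [(a0 + k, b0 + k) for k in range(a1 - a0 + 1)]


residue_pairs = []
for seg_2act, seg_1tpo in zip(SEGMENTS_2ACT, SEGMENTS_1TPO):
    residue_pairs.extend(expand(seg_2act, seg_1tpo))

print(len(residue_pairs))  # 35 residue pairs across the six segments
print(residue_pairs[:2])   # [(24, 194), (25, 195)]
```

Sorting these pairs by 2ACT residue number gives 1TPO positions that are not monotonic (for example, 195 is followed by 31), which illustrates the caption's point that the matched residues come from different fragments of each chain.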
Comparison of our algorithm with the method of [14].
| Protein 1 (ID/Size) | Protein 2 (ID/Size) | SAMO score | SAMO rms | SAMO m | [14] score | [14] rms | [14] m |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1CHO/238 | 1CHO/238 | 1.00 | 0.00 | 238 | 1.00 | 0.00 | 238 |
| 1CHO/238 | 2CHA/236 | 0.99 | 0.55 | 236 | 0.99 | 0.55 | 236 |
| 1CHO/238 | 2PTCE/223 | 0.85 | 0.85 | 212 | 0.84 | 0.85 | 211 |
| 1CHO/238 | 1TPO/223 | 0.82 | 0.80 | 208 | 0.81 | 0.81 | 207 |
| 1CHO/238 | 1TGSE/225 | 0.78 | 0.88 | 203 | 0.78 | 0.89 | 203 |
| 1CHO/238 | 2PRK/279 | 0.25 | 1.77 | 104 | 0.25 | 1.67 | 103 |
| 1CHO/238 | 3ACT/218 | 0.26 | 1.69 | 93 | 0.25 | 1.69 | 86 |
| 1CHO/238 | 1SBT/275 | 0.25 | 1.87 | 111 | 0.24 | 1.76 | 96 |
| 1CHO/238 | 2SECE/274 | 0.27 | 1.90 | 117 | 0.23 | 1.78 | 95 |
| 1CHO/238 | 1FX1/147 | 0.23 | 1.94 | 80 | 0.23 | 1.69 | 71 |
| 1CHO/238 | 1CSE/274 | 0.26 | 1.89 | 117 | 0.23 | 1.72 | 94 |
| 1CHO/238 | 1TECE/279 | 0.27 | 1.94 | 105 | 0.22 | 1.79 | 94 |
| 1CHO/238 | 9PAP/212 | 0.29 | 1.99 | 102 | 0.22 | 1.74 | 81 |