Shashi Bhushan Pandit, Jeffrey Skolnick.
Abstract
BACKGROUND: Protein tertiary structure comparisons are employed in various fields of contemporary structural biology. Most structure comparison methods generate an initial seed alignment, which is then extended and/or refined to give the best structural superposition between a pair of protein structures, as assessed by a structure comparison metric. One such metric, the TM-score, was recently introduced to provide a single structure quality measure that combines the coordinate root mean square deviation between a pair of structures with the alignment coverage. The TM-align structure alignment algorithm, which is built on the TM-score, was often found to have better accuracy and coverage than the most commonly used structural alignment programs; however, there were a number of situations in which this was not true.
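As an illustration of how the TM-score combines deviation and coverage, the following is a minimal sketch that evaluates it for one fixed superposition, using the commonly cited definition TM = (1/L_N) Σ 1/(1 + (d_i/d0)^2) with d0 = 1.24 (L_N − 15)^(1/3) − 1.8 Å. The abstract does not state the formula; the function names, the choice of normalization length and the 0.5 Å floor on d0 are assumptions of this sketch, and the search over superpositions that an alignment program performs is omitted.

```python
def tm_d0(l_norm: int, floor: float = 0.5) -> float:
    """Length-dependent distance scale d0 in Angstroms.

    The floor applied for short chains is an assumption of this sketch.
    """
    if l_norm <= 15:
        # The cube-root term is not usable here, so fall back to the floor.
        return floor
    return max(floor, 1.24 * (l_norm - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, l_norm: int) -> float:
    """TM-score for one fixed superposition.

    distances: distances (in Angstroms) between aligned residue pairs
               after superposition.
    l_norm:    length used for normalization (assumed here to be the
               length of the reference protein).
    """
    if l_norm <= 0:
        raise ValueError("l_norm must be positive")
    d0 = tm_d0(l_norm)
    # Each aligned pair contributes between 0 and 1; unaligned residues
    # contribute nothing, so low coverage lowers the score.
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / l_norm
```

Because the sum runs only over aligned pairs while the normalization uses the full length, an alignment that covers half of a 100-residue protein with every pair exactly d0 apart scores 0.25, whereas a perfect full-length alignment scores 1.0.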
Year: 2008 PMID: 19077267 PMCID: PMC2628391 DOI: 10.1186/1471-2105-9-531
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Structural alignment by different algorithms for two datasets
| Dataset | Method | cRMSD (Å) | cR(core) (Å) | TM | PSI (%) | rPSI (%) | L | cov |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | CE | 5.7 (1.1) | 2.8 (0.2) | 0.185 (0.070) | 18.9 (10.3) | 8.4 (8.3) | 69.9 | 37.2 |
| 1 | TM-align | 5.1 (1.0) | 2.5 (0.3) | 0.255 (0.087) | 29.5 (11.1) | 17.7 (11.4) | 87.8 | 42.3 |
| 1 | Fr-TM-align | 5.0 (1.0) | 2.5 (0.3) | 0.279 (0.091) | 33.1 (12.0) | 20.0 (12.3) | 93.8 | 45.3 |
| 2 | CE | 5.7 (1.1) | 2.8 (0.2) | 0.190 (0.073) | 21.5 (11.7) | 10.5 (10.1) | 65.8 | 39.7 |
| 2 | TM-align | 4.8 (1.1) | 2.4 (0.4) | 0.256 (0.093) | 32.3 (11.5) | 21.0 (12.3) | 77.1 | 42.1 |
| 2 | Fr-TM-align | 4.7 (1.0) | 2.4 (0.4) | 0.280 (0.097) | 35.9 (12.2) | 23.1 (13.1) | 82.6 | 45.2 |
a Results are averaged over all structure pairs. cRMSD, cR(core), L, cov, TM, PSI and rPSI denote, respectively, the coordinate RMSD (in Å), the cRMSD (in Å) over the aligned pairs that contribute to PSI, the number of aligned residues, the coverage of aligned residues, the TM-score, the percentage structural similarity and the relevant percentage structural similarity (see Methods). Numbers in parentheses are standard deviations.
Comparison of best structural alignment (alignment with highest CE z-score) from CE
| Method | cRMSD_Z (Å) | cR(core)_Z (Å) | TM_Z | PSI_Z (%) | rPSI_Z (%) | L_Z | cov_Z |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CE | 4.0 (1.4) | 2.4 (0.4) | 0.403 (0.203) | 47.5 (23.5) | 36.9 (24.5) | 118.5 | 57.8 |
| TM-align | 4.0 (1.2) | 2.2 (0.4) | 0.446 (0.194) | 52.8 (21.2) | 43.9 (23.9) | 129.7 | 60.9 |
| Fr-TM-align | 3.9 (1.1) | 2.2 (0.4) | 0.459 (0.188) | 54.7 (19.9) | 45.7 (22.9) | 132.6 | 62.3 |
| CE | 4.1 (1.4) | 2.5 (0.4) | 0.366 (0.170) | 44.5 (22.4) | 34.6 (23.1) | 101.7 | 55.9 |
| TM-align | 4.1 (1.3) | 2.3 (0.4) | 0.408 (0.164) | 48.5 (21.0) | 39.5 (22.7) | 112.4 | 58.8 |
| Fr-TM-align | 4.0 (1.3) | 2.2 (0.4) | 0.426 (0.161) | 50.9 (20.4) | 42.0 (22.5) | 117.1 | 60.6 |
a For each protein, we selected the protein pair with the highest z-score from the CE program, and used this same set of pairs for comparison with the other programs. cRMSD_Z, cR(core)_Z, L_Z, cov_Z, TM_Z, PSI_Z and rPSI_Z denote, respectively, the coordinate RMSD (in Å), the cRMSD (in Å) over the aligned pairs that contribute to PSI, the number of aligned residues, the coverage of aligned residues, the TM-score, the percentage structural similarity and the relevant percentage structural similarity (see Methods). Numbers in parentheses are standard deviations.
Comparison of best structural alignment (alignment with maximum TM-score) from Fr-TM-align
| Method | cRMSD_M (Å) | cR(core)_M (Å) | TM_M | PSI_M (%) | rPSI_M (%) | L_M | cov_M |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CE | 4.5 (1.5) | 2.5 (0.4) | 0.392 (0.207) | 39.1 (26.6) | 29.3 (26.6) | 120.9 | 59.2 |
| TM-align | 4.6 (1.7) | 2.3 (0.4) | 0.488 (0.163) | 46.5 (24.5) | 37.1 (27.2) | 162.5 | 70.4 |
| Fr-TM-align | 4.5 (1.6) | 2.3 (0.4) | 0.522 (0.144) | 50.0 (23.4) | 40.4 (26.5) | 170.6 | 74.5 |
| CE | 4.8 (1.5) | 2.6 (0.4) | 0.354 (0.175) | 35.8 (24.2) | 25.7 (24.1) | 108.5 | 58.0 |
| TM-align | 4.7 (1.5) | 2.3 (0.4) | 0.451 (0.142) | 43.9 (21.6) | 34.1 (23.3) | 139.3 | 67.5 |
| Fr-TM-align | 4.5 (1.5) | 2.3 (0.4) | 0.497 (0.127) | 49.2 (22.0) | 39.1 (24.4) | 149.1 | 73.2 |
a For each protein, we selected the protein pair with the highest TM-score from the Fr-TM-align program, and used this same set of pairs for comparison with the other programs. cRMSD_M, cR(core)_M, L_M, cov_M, TM_M, PSI_M and rPSI_M denote, respectively, the coordinate RMSD (in Å), the cRMSD (in Å) over the aligned pairs that contribute to PSI, the number of aligned residues, the coverage of aligned residues, the TM-score, the percentage structural similarity and the relevant percentage structural similarity (see Methods). Numbers in parentheses are standard deviations.
Figure 1. A) Difference in TM-score between Fr-TM-align and TM-align plotted against the TM-score from TM-align for dataset 1. B) The same as A, plotted for dataset 2. dTM is defined as TM-score (Fr-TM-align) – TM-score (TM-align).
Figure 2. A) Scatter plot of the number of aligned residues from Fr-TM-align versus the number of aligned residues from TM-align for dataset 1. B) The same as A, plotted for dataset 2.
Figure 3. A) Histogram of the fraction of protein pairs in various PSI (in %) bins for dataset 1. B) The same as A, plotted for dataset 2.
Figure 4. Histogram of the fraction of protein pairs whose TM-score from Fr-TM-align is improved, decreased or unchanged relative to the TM-score reported by TM-align. dTM is defined as TM-score (Fr-TM-align) – TM-score (TM-align).
Figure 5. Histogram of the fraction of protein pairs whose TM-score from Fr-TM-align is improved, decreased or unchanged, as a function of the length of the smaller of the two proteins being aligned. dTM is defined as TM-score (Fr-TM-align) – TM-score (TM-align).
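The dTM quantity used in these captions can be tabulated as in the minimal sketch below, which computes dTM for each protein pair and reports the fraction of pairs that are improved, decreased or unchanged. The function names and the tolerance `tol` for calling a pair "unchanged" are assumptions of this sketch (the captions do not state a threshold), and the sample values are purely illustrative, not data from the paper.

```python
from collections import Counter


def dtm(tm_fr_tm_align: float, tm_tm_align: float) -> float:
    """dTM = TM-score (Fr-TM-align) - TM-score (TM-align)."""
    return tm_fr_tm_align - tm_tm_align


def dtm_fractions(pairs, tol: float = 1e-3):
    """Fraction of pairs with improved, decreased or unchanged TM-score.

    pairs: sequence of (TM-score from Fr-TM-align, TM-score from TM-align).
    tol:   hypothetical threshold below which |dTM| counts as unchanged.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    counts = Counter()
    for tm_fr, tm_ref in pairs:
        delta = dtm(tm_fr, tm_ref)
        if delta > tol:
            counts["improved"] += 1
        elif delta < -tol:
            counts["decreased"] += 1
        else:
            counts["unchanged"] += 1
    total = len(pairs)
    return {key: counts[key] / total
            for key in ("improved", "decreased", "unchanged")}


# Illustrative values only.
sample = [(0.30, 0.25), (0.40, 0.40), (0.20, 0.22), (0.55, 0.50)]
fractions = dtm_fractions(sample)
```

Grouping the pairs by the length of the smaller protein before calling `dtm_fractions` would give the per-length breakdown described for Figure 5.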
Figure 6. Two examples of structural alignments from Fr-TM-align and TM-align. A) The structural alignment between 2GZQ_A (186 residues) and 1A1M_B (99 residues). B) The structural alignment between 1AOL (228 residues) and 1AKP (114 residues). The first row shows ribbon diagrams of the native structures. The beta-sheets in the native structures are colored cyan to highlight the structurally similar region, while the remainder of each structure is transparent gray. The second row shows the structural alignments given by TM-align and Fr-TM-align. L denotes the number of aligned residues. The longer of the two proteins is shown in light green and the shorter in light yellow. Aligned regions are drawn with a thick backbone and unaligned regions with a thin backbone.