Amarendran R Subramanian, Jan Weyer-Menkhoff, Michael Kaufmann, Burkhard Morgenstern.
Abstract
BACKGROUND: We present a complete re-implementation of the segment-based approach to multiple protein alignment that contains a number of improvements compared to the previous version 2.2 of DIALIGN. This previous version is superior to Needleman-Wunsch-based multi-alignment programs on locally related sequence sets. However, it is often outperformed by these methods on data sets with global but weak similarity at the primary-sequence level.
Year: 2005 PMID: 15784139 PMCID: PMC1087830 DOI: 10.1186/1471-2105-6-66
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Performance of seven protein multi-alignment programs on the IRMBASE 1.0 database of benchmark alignments. Percentage values are sum-of-pairs scores, i.e. the percentage of correctly aligned residue pairs of ROSE motifs contained in the IRMBASE sequence families.
| Method | ref1 | ref2 | ref3 | Total |
| DIALIGN-T | 94.07% | 92.69% | 92.68% | 93.14% |
| DIALIGN 2.2 | 92.26% | 92.72% | 91.87% | 92.28% |
| T-COFFEE 1.37 | 91.18% | 85.61% | 87.81% | 88.20% |
| PROBCONS 1.09 | 66.74% | 68.30% | 77.92% | 70.98% |
| POA V2 | 90.26% | 43.61% | 36.85% | 56.91% |
| MUSCLE 3.5 | 36.16% | 37.84% | 52.30% | 42.10% |
| CLUSTAL W 1.83 | 8.02% | 12.69% | 20.16% | 13.62% |
Performance of seven protein multi-alignment programs on IRMBASE using column scores as quality criterion. Percentage values denote the percentage of correctly aligned columns of the ROSE motifs in IRMBASE.
| Method | ref1 | ref2 | ref3 | Total |
| DIALIGN-T | 82.28% | 78.36% | 79.71% | 80.12% |
| DIALIGN 2.2 | 79.46% | 77.82% | 78.24% | 78.51% |
| T-COFFEE 1.37 | 75.35% | 66.60% | 69.21% | 70.19% |
| PROBCONS 1.09 | 33.13% | 37.95% | 51.26% | 40.78% |
| POA V2 | 73.00% | 12.46% | 07.45% | 30.97% |
| MUSCLE 3.5 | 09.41% | 10.89% | 22.37% | 14.22% |
| CLUSTAL W 1.83 | 00.00% | 00.83% | 05.14% | 01.92% |
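The two quality criteria used in these tables can be illustrated with a minimal Python sketch. It assumes that an alignment is given as a list of equally long gapped strings with `-` as the gap character, and it scores all reference columns, whereas the IRMBASE evaluation is restricted to the ROSE motif columns. The function names and the toy sequences are hypothetical and are not taken from the paper or from any benchmark tool.

```python
from itertools import combinations

def columns(alignment):
    """Turn gapped rows into columns of residue indices (None marks a gap)."""
    counters = [0] * len(alignment)
    cols = []
    for chars in zip(*alignment):
        col = []
        for s, ch in enumerate(chars):
            if ch == "-":
                col.append(None)
            else:
                col.append(counters[s])
                counters[s] += 1
        cols.append(tuple(col))
    return cols

def aligned_pairs(cols):
    """Set of residue pairs ((seq, pos), (seq, pos)) that share a column."""
    pairs = set()
    for col in cols:
        residues = [(s, i) for s, i in enumerate(col) if i is not None]
        pairs.update(combinations(residues, 2))
    return pairs

def sum_of_pairs_score(predicted, reference):
    """Percentage of reference residue pairs that are also aligned in `predicted`."""
    ref_pairs = aligned_pairs(columns(reference))
    pred_pairs = aligned_pairs(columns(predicted))
    return 100.0 * len(ref_pairs & pred_pairs) / len(ref_pairs)

def column_score(predicted, reference):
    """Percentage of reference columns reproduced exactly in `predicted`.

    Assumption: only reference columns with at least two residues are counted.
    """
    ref_cols = [c for c in columns(reference)
                if sum(i is not None for i in c) >= 2]
    pred_cols = set(columns(predicted))
    return 100.0 * sum(c in pred_cols for c in ref_cols) / len(ref_cols)

# Hypothetical toy alignments of three short sequences.
reference = ["AC-GT", "ACTGT", "AC-GT"]
predicted = ["ACG-T", "ACTGT", "AC-GT"]

sp = sum_of_pairs_score(predicted, reference)
cs = column_score(predicted, reference)
print(f"sum-of-pairs: {sp:.2f}%  column score: {cs:.2f}%")
```

In this toy case a single misplaced residue breaks two of the twelve reference pairs but invalidates one of the four scored columns, which is why column scores are generally the stricter of the two measures.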
Performance of seven protein multi-alignment programs on the BAliBASE benchmark database using sum-of-pairs scores as evaluation criterion.
| Method | ref1 | ref2 | ref3 | ref4 | ref5 | Total |
| DIALIGN-T | 82.76% | 91.28% | 75.34% | 86.43% | 93.30% | 84.69% |
| DIALIGN 2.2 | 81.40% | 89.56% | 68.93% | 91.24% | 94.14% | 83.59% |
| T-COFFEE 1.37 | 84.67% | 93.24% | 80.32% | 75.80% | 96.20% | 85.95% |
| PROBCONS 1.09 | 90.37% | 94.61% | 84.34% | 89.20% | 98.07% | 91.11% |
| POA V2 | 74.66% | 88.32% | 63.14% | 82.62% | 76.71% | 76.76% |
| MUSCLE 3.5 | 88.25% | 93.59% | 82.36% | 85.62% | 97.80% | 89.21% |
| CLUSTAL W 1.83 | 86.43% | 93.22% | 75.79% | 81.09% | 86.10% | 86.15% |
Performance of seven protein multi-alignment programs on BAliBASE using column scores.
| Method | ref1 | ref2 | ref3 | ref4 | ref5 | Total |
| DIALIGN-T | 73.22% | 43.43% | 44.69% | 66.13% | 77.05% | 65.65% |
| DIALIGN 2.2 | 71.49% | 37.42% | 35.03% | 81.88% | 84.47% | 64.82% |
| T-COFFEE 1.37 | 75.32% | 53.44% | 52.20% | 45.09% | 86.96% | 68.20% |
| PROBCONS 1.09 | 83.21% | 59.76% | 61.34% | 71.09% | 91.86% | 77.23% |
| POA V2 | 63.21% | 39.02% | 25.57% | 57.22% | 47.18% | 54.18% |
| MUSCLE 3.5 | 80.79% | 56.37% | 56.74% | 62.65% | 91.57% | 74.13% |
| CLUSTAL W 1.83 | 78.39% | 56.24% | 48.87% | 50.44% | 63.89% | 68.48% |
Percentage of sequence families where DIALIGN-T is outperformed on IRMBASE 1.0 by alternative methods according to the sum-of-pairs score. The symbols +, - and 0 denote statistically significant superiority, statistically significant inferiority, and non-significant superiority or inferiority of DIALIGN-T, respectively. Significance was calculated with the Wilcoxon matched-pairs signed-rank test (p ≤ 0.05).
| Method | ref1 | ref2 | ref3 | Total |
| DIALIGN 2.2 | 20.00%+ | 23.33%0 | 23.33%+ | 22.22%+ |
| T-COFFEE 1.37 | 40.00%0 | 31.67%+ | 41.67%+ | 37.78%+ |
| PROBCONS 1.09 | 20.00%+ | 15.00%+ | 21.67%+ | 18.89%+ |
| POA V2 | 16.67%+ | 0.00%+ | 0.00%+ | 5.55%+ |
| MUSCLE 3.5 | 5.00%+ | 5.00%+ | 0.00%+ | 3.33%+ |
| CLUSTAL W 1.83 | 0.00%+ | 0.00%0 | 0.00%0 | 0.00%+ |
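The kind of comparison summarized in this table can be sketched in Python as follows. This is a simplified illustration and not the authors' evaluation code: it uses the normal approximation of the Wilcoxon matched-pairs signed-rank test without tie or continuity correction, it assumes that the direction of a significant difference is read off the larger rank sum, and the per-family scores are invented for the example.

```python
import math

def fraction_outperformed(ours, theirs):
    """Percentage of families where the alternative method scores strictly higher."""
    return 100.0 * sum(t > o for o, t in zip(ours, theirs)) / len(ours)

def wilcoxon_signed_rank(ours, theirs):
    """Two-sided Wilcoxon matched-pairs signed-rank test (normal approximation).

    Returns (w_plus, w_minus, p_value). Zero differences are dropped and tied
    absolute differences share their average rank.
    """
    diffs = [o - t for o, t in zip(ours, theirs) if o != t]
    n = len(diffs)
    if n == 0:
        return 0.0, 0.0, 1.0
    order = sorted(range(n), key=lambda k: abs(diffs[k]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w_plus, w_minus, p_value

def significance_symbol(ours, theirs, alpha=0.05):
    """Map the test result to the +, - and 0 notation of the table."""
    w_plus, w_minus, p_value = wilcoxon_signed_rank(ours, theirs)
    if p_value > alpha:
        return "0"
    return "+" if w_plus > w_minus else "-"

# Hypothetical per-family scores (not taken from the paper).
ours = [90, 85, 70, 95, 88, 60, 75, 92, 81, 77]
theirs = [80, 86, 65, 90, 70, 55, 76, 85, 70, 60]

pct = fraction_outperformed(ours, theirs)
w_plus, w_minus, p_value = wilcoxon_signed_rank(ours, theirs)
symbol = significance_symbol(ours, theirs)
print(f"{pct:.2f}%{symbol}")
```

For small numbers of families the normal approximation is rough; an exact test or a statistics library would be preferable for a real evaluation.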
Percentage of sequence families where DIALIGN-T is outperformed on IRMBASE 1.0 by other methods according to the column score. Notation is as in Table 5.
| Method | ref1 | ref2 | ref3 | Total |
| DIALIGN 2.2 | 11.67%+ | 21.67%0 | 23.33%+ | 18.89%+ |
| T-COFFEE 1.37 | 36.67%0 | 30.00%+ | 26.67%+ | 31.11%+ |
| PROBCONS 1.09 | 18.33%+ | 01.67%+ | 16.67%+ | 16.67%+ |
| POA V2 | 15.00%+ | 00.00%+ | 00.00%+ | 05.00%+ |
| MUSCLE 3.5 | 05.00%+ | 05.00%+ | 00.00%+ | 03.33%+ |
| CLUSTAL W 1.83 | 00.00%+ | 00.00%+ | 00.00%+ | 00.00%+ |
Percentage of sequence families where DIALIGN-T is outperformed on BAliBASE 2.1 by other methods according to the sum-of-pairs score. Notation is as in Table 5.
| Method | ref1 | ref2 | ref3 | ref4 | ref5 | Total |
| DIALIGN 2.2 | 28.05%+ | 21.74%+ | 16.67%+ | 16.67%0 | 41.67%0 | 26.24%+ |
| T-COFFEE 1.37 | 58.54%- | 86.96%- | 75.00%- | 25.00%0 | 50.00%0 | 60.99%- |
| PROBCONS 1.09 | 71.95%- | 82.61%- | 100.00%- | 33.33%0 | 75.00%- | 80.14%- |
| POA V2 | 20.73%+ | 34.78%+ | 16.67%+ | 33.33%0 | 0.00%+ | 21.99%+ |
| MUSCLE 3.5 | 71.95%- | 73.91%- | 83.33%- | 25.00%0 | 75.00%- | 69.50%- |
| CLUSTAL W 1.83 | 53.66%- | 56.52%0 | 58.33%0 | 16.67%0 | 8.33%+ | 47.52%0 |
Percentage of sequence families where DIALIGN-T is outperformed on BAliBASE 2.1 by other methods according to the column score. Notation as in Table 5.
| Method | ref1 | ref2 | ref3 | ref4 | ref5 | Total |
| DIALIGN 2.2 | 26.83%+ | 13.04%+ | 16.67%+ | 16.67%0 | 50.00%0 | 24.82%+ |
| T-COFFEE 1.37 | 56.10%- | 73.91%- | 66.67%0 | 25.00%0 | 50.00%0 | 56.74%- |
| PROBCONS 1.09 | 80.49%- | 82.61%- | 75.00%- | 25.00%0 | 66.67%- | 74.47%- |
| POA V2 | 20.73%+ | 26.09%0 | 08.33%+ | 16.67%0 | 00.00%+ | 18.44%+ |
| MUSCLE 3.5 | 73.17%- | 73.91%- | 83.33%- | 16.67%0 | 66.67%- | 68.79%- |
| CLUSTAL W 1.83 | 52.44%- | 69.57%- | 50.00%0 | 16.67%0 | 08.33%+ | 48.23%0 |
Average running time (in seconds) per multiple alignment for the 180 sequence families of IRMBASE and for the 141 sequence families of BAliBASE 2.1. Program runs were performed on a Linux workstation (RedHat 8.0) with a 3.2 GHz Pentium 4 processor and 2 GB RAM.
| Method | Average runtime on IRMBASE 1.0 | Average runtime on BAliBASE 2.1 |
| DIALIGN-T | 2.36 | 1.38 |
| DIALIGN 2.2 | 3.33 | 1.30 |
| T-COFFEE 1.37 | 27.54 | 7.64 |
| PROBCONS 1.09 | 12.37 | 2.66 |
| POA V2 | 1.44 | 0.58 |
| MUSCLE 3.5 | 9.37 | 0.60 |
| CLUSTAL W 1.83 | 1.41 | 0.47 |