Gabriele Ausiello, Pier Federico Gherardini, Paolo Marcatili, Anna Tramontano, Allegra Via, Manuela Helmer-Citterich.
Abstract
BACKGROUND: The occurrence of very similar structural motifs formed by different parts of non-homologous proteins is often indicative of a common function. Indeed, relatively small local structures can mediate binding to a common partner, be it a protein, a nucleic acid, a cofactor or a substrate. It is relatively easy to identify short amino acid or nucleotide sequence motifs in a given set of proteins or genes, and many methods exist for this purpose. Identifying common local substructures is much more challenging, especially when they are formed by residues that are not consecutive in the sequence.
Year: 2008 PMID: 18387204 PMCID: PMC2323665 DOI: 10.1186/1471-2105-9-S2-S2
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Details of structural matches. Output page of a FunClust search in four non-homologous serine protease protein chains (1a0jA, 1sca, 1tyfA and 1e5tA). The three clusters identified are shown in tabular form, each with its associated score. The right section of each table reports, in each row, the residues belonging to one structure, with structurally aligned residues written in the same column. The left section shows the r.m.s.d. value for each match identified between the structures corresponding to the row and column of the table (recall that two structures belonging to the same cluster do not necessarily match each other). In the example shown, the first of the three clusters is composed of the four catalytic triads, which are therefore correctly identified. The second cluster identifies three non-catalytic residues in structure 1e5t, while the third one (with the lowest score) involves only three of the four structures. A user-activated popup window shows a graphical view (created using the Jmol applet) of the first cluster. The four structures have been superposed on the residues belonging to the structural motif. Each structure has a different colour, and only the residues involved in the cluster are shown. Commands to display the whole structure and the labels for each protein in the cluster are located in the right portion of the window.
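The r.m.s.d. value reported for each match can be illustrated with a minimal sketch. It assumes the two residue sets have already been superposed and that their atoms are paired one to one. The function name `rmsd` and the coordinates are hypothetical, chosen only for illustration, and nothing here is meant to reproduce the superposition procedure used by the server.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired, pre-superposed point sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared distances between paired points.
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates (in angstroms) for three paired atoms.
# Every atom in set_b is displaced by exactly 1 along z relative to set_a.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]

print(rmsd(set_a, set_b))
```

A lower value means the paired atoms lie closer together after superposition.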
Test cases
| Function | Source | Score | Proximity | Rmsd | PDB | Chain | CATH | Matched Residues |
|---|---|---|---|---|---|---|---|---|
| Serine endopeptidases EC 3.4.21 | CSA | 18 | H | H | 1a0j | A | 2.40.10.10 | H57 D102 S195 |
| | | | | | 1sca | | 3.40.50.200 | H64 D32 S221 |
| | | | | | 1tyf | A | 3.90.226.10 | H122 D171 S97 |
| | | | | | 1e5t | A | 3.40.50.1820 | H640 D639 S146 |
| WW domain | PROSITE | 9 | H | L | 1eg3 | A | NA | W61 N75 T78 |
| | | | | | 1o6w | A | 2.20.70.10 | W4 N18 T21 |
| | | | | | 1zcn | A | NA | W11 N26 T29 |
| 4Fe-4S ferredoxin | PROSITE | 42 | L | L | 1a6l | | 3.30.70.20 | C16 C45 C49 C20 P50 P21 C42 |
| | | | | | 1jb0 | C | 1.20.1130.10 | C53 C16 C20 C57 P21 P58 C13 |
| | | | | | 1kf6 | B | 3.10.20.30 | C210 C154 C158 C214 P159 P215 C151 |
| EF HAND | ELM | 9 | M | H | 1bmo | A | 1.20.238.10 | D257 D259 N260 |
| | | | | | 1daq | A | 3.30.60.30 | D40 D44 N42 |
| | | | | | 1aj5 | A | 1.10.1330.10 | D227 D229 N230 |
| LIM domain | PROSITE | 12 | H | L | 1a7i | | 2.10.110.10 | C10 C13 H31 C34 |
| | | | | | 1wig | A | NA | C34 C37 H56 C59 |
| | | | | | 2cuq | A | NA | C18 C21 H38 C41 |
| Zn binding | PDBFUN | 45 | H | M | 1a5t | | 3.40.50.300 | C62 C65 C50 |
| | | | | | 1a73 | A | 3.90.75.10 | C125 C132 C138 |
| | | | | | 1adn | | 3.40.10.10 | C72 C69 C38 |
| | | | | | 1adt | | NA | C450 C467 C398 |
| | | | | | 1ajy | A | NA | C50 C60 C34 |
| | | | | | 1b55 | A | 2.30.29.30 | C155 C154 C165 |
This table reports six functional motifs that the server identified as the largest cluster of conserved residues. For each motif, the score is given together with the settings used for the maximum side-chain proximity and r.m.s.d. parameters (H is high, M is medium and L is low). A detailed description of the parameters can be found in the online help.
For each motif, the submitted PDB structures are listed along with their CATH codes. Finally, for each structure, the amino acids included in the common cluster are indicated.
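As a rough illustration of how the matched residues line up across structures, the sketch below transcribes three motifs from the table and checks that every structure contributes the same residue type at each position. The dictionary layout and the helper names (`parse_residue`, `motif_signature`) are assumptions made for this example and are not part of the server's output format.

```python
import re

# Matched residues transcribed from the table (three of the six motifs).
TEST_CASES = {
    "Serine endopeptidases": {
        "1a0j": "H57 D102 S195",
        "1sca": "H64 D32 S221",
        "1tyf": "H122 D171 S97",
        "1e5t": "H640 D639 S146",
    },
    "WW domain": {
        "1eg3": "W61 N75 T78",
        "1o6w": "W4 N18 T21",
        "1zcn": "W11 N26 T29",
    },
    "4Fe-4S ferredoxin": {
        "1a6l": "C16 C45 C49 C20 P50 P21 C42",
        "1jb0": "C53 C16 C20 C57 P21 P58 C13",
        "1kf6": "C210 C154 C158 C214 P159 P215 C151",
    },
}


def parse_residue(label):
    """Split a label such as 'H57' into its one-letter type and sequence number."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognised residue label: {label!r}")
    return match.group(1), int(match.group(2))


def motif_signature(matches):
    """Return the residue types shared at every position, e.g. 'HDS'.

    Raises ValueError if the structures disagree in length or in the
    residue type found at any position.
    """
    rows = [
        [parse_residue(token)[0] for token in residues.split()]
        for residues in matches.values()
    ]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("structures contribute different numbers of residues")
    signature = []
    for column in zip(*rows):
        if len(set(column)) != 1:
            raise ValueError(f"residue types differ at one position: {column}")
        signature.append(column[0])
    return "".join(signature)


for name, matches in TEST_CASES.items():
    print(name, motif_signature(matches))
```

In the serine endopeptidase case the shared positions are His, Asp and Ser, even though the sequence numbers and the order of the residues along the chain differ between structures.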