Muhammad Sulaman Nawaz, Qurrat Ul Ain, Umair Seemab, Sajid Rashid.
Abstract
Linking similar proteins structurally is a challenging task that may help in finding novel members of a protein family. In this respect, identifying conserved sequences can facilitate understanding and classifying the exact role of proteins. However, the exact role of these conserved elements cannot be elucidated without structural and physicochemical information. In this work, we present MotViz, a novel desktop application designed for searching and analyzing conserved sequence segments within protein structures. With MotViz, the user can extract a complete list of sequence motifs from loaded 3D structures, annotate the motifs structurally and analyze their physicochemical properties. The conservation value calculated for an individual motif can be visualized graphically. To check the efficiency, predicted motifs from the data sets of 9 protein families were analyzed, and the MotViz algorithm was more efficient than other online motif prediction tools. Furthermore, a database was integrated for storing, retrieving and performing detailed functional annotation studies. In summary, MotViz effectively predicts motifs with high sensitivity and simultaneously visualizes them in 3D structures. Moreover, MotViz is user-friendly, with optimized graphical parameters and better processing speed due to the inclusion of a database at the back end. MotViz is available at http://www.fi-pk.com/motviz.html.
Year: 2012 PMID: 22449399 PMCID: PMC5054496 DOI: 10.1016/S1672-0229(11)60031-4
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Figure 1. Schematic overview of the motif searching and visualization algorithm (MSVA). A. The AA sequence is retrieved from the PDB, visualized and stored in the local database. The user may examine the respective sequence properties. B. Calculation and display of the AA physicochemical properties of the selected sequence region or motif(s). The corresponding structure is rendered in a distinct color. C. Searching the database and displaying the list of motifs present in the loaded PDB file. D. If motifs are not found in the MotViz database, UniProt PSI-BLAST is launched and the results are stored in the MotViz database for MSA. E. MSA is performed using ClustalW, and conserved regions are identified and stored in the MotViz database to carry out step B and step C. The circle at the lower right corner represents inspection to approve the quality of the product.
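Step E, identifying conserved regions in an MSA, can be illustrated with a minimal sketch. The source does not define how MotViz scores conservation, so the per-column score used here (the fraction of sequences sharing the most common residue) is an assumed stand-in, and the function names, `threshold` and `min_len` are hypothetical.

```python
from collections import Counter


def column_conservation(alignment):
    """Per-column score: share of sequences carrying the most common residue.

    Assumption: this is a stand-in score, not the Cv used by MotViz.
    Gaps ('-') are ignored when counting but still lower the score,
    because the count is divided by the total number of sequences.
    """
    n = len(alignment)
    scores = []
    for column in zip(*alignment):  # assumes equal-length aligned rows
        residues = [c for c in column if c != "-"]
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top / n)
    return scores


def conserved_regions(alignment, threshold=0.65, min_len=3):
    """Return (start, end, segment, mean_score) for runs of conserved columns.

    start/end are 0-based, end-exclusive column indices; the segment is
    taken from the first sequence of the alignment.
    """
    scores = column_conservation(alignment)
    regions, start = [], None
    # A trailing 0.0 sentinel closes a run that reaches the last column.
    for i, score in enumerate(scores + [0.0]):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            if i - start >= min_len:
                run = scores[start:i]
                regions.append((start, i, alignment[0][start:i], sum(run) / len(run)))
            start = None
    return regions
```

Given a toy alignment such as `["ACDKVVIKGT", "TCEKVVIKAS", "GCDKVVIKPW"]` and `threshold=0.9`, the only run of at least three fully conserved columns is `KVVIK`.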
Figure 2. MotViz working interface, showing each feature track individually. A. Loaded 3ML6 structure (complex between Dishevelled2 and Clathrin adaptor AP-2). B. Sequence retrieved from the PDB structure. B1. Amino acids selected in the sequence are visualized in the structure. B2. Relative amino acids highlighted in the structure only. B3. Individual amino acid physicochemical property charts (pie and bar), with buttons and a slider panel to select the region(s) of choice. Detailed view of the amino acid physicochemical property charts: green represents polar amino acids (G, S, T, Y, C, Q and N); orange represents hydrophobic amino acids (A, V, L, I, P, W, F and M); red indicates acidic amino acids (D and E); blue is for basic residues (K, R and H); other amino acids are shown in tan. C1. List of predicted motifs in a multi-color panel. The pie chart indicates the AA physicochemical properties of the predicted motifs alongside their structure representation. C2. Predicted motifs in the structure. C3. Motifs selected in the motif panel as well as in the structure. C4. Bar chart of the conservation rate of the predicted motifs across the whole protein. The conservation value (Cv) and location of each predicted motif are also indicated.
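The color scheme in the caption amounts to a grouping of residues by physicochemical class. A minimal sketch of how the composition shown in the pie charts could be computed for a selected region or motif follows; the group memberships are taken from the caption, while the function names are hypothetical and not part of MotViz.

```python
from collections import Counter

# Residue groups exactly as listed in the Figure 2 caption.
GROUPS = {
    "polar": set("GSTYCQN"),
    "hydrophobic": set("AVLIPWFM"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}


def classify(residue):
    """Return the group name for a one-letter residue code, or 'other'."""
    code = residue.upper()
    for name, members in GROUPS.items():
        if code in members:
            return name
    return "other"  # anything outside the four listed groups


def group_composition(sequence):
    """Fraction of residues in each group that occurs in the sequence."""
    if not sequence:
        return {}
    counts = Counter(classify(r) for r in sequence)
    return {name: count / len(sequence) for name, count in counts.items()}
```

For the motif `KVVIK`, this gives two basic residues and three hydrophobic ones, that is, fractions of 0.4 and 0.6.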
Comparative performance of motif searching tools
| Database/Method | Time (s) | Sensitivity | Precision | Specificity | Accuracy | No. of predicted motifs |
|---|---|---|---|---|---|---|
| Prosite | 230.12 | 84.43 | 94.31 | 88.92 | 92.65 | 37 |
| MEME | 348.48 | 92.11 | 93.52 | 95.362 | 92.02 | 139 |
| PATTERN | 165.63 | 80.78 | 96.24 | 98.43 | 95.13 | 13 |
| BLOCKS | 1330.73 | 90.59 | 90.10 | 93.62 | 88.73 | 99 |
| Pfam | 219.12 | 88.34 | 89.37 | 95.10 | 87.15 | 78 |
| ProDom | 198.34 | 81.23 | 94.83 | 98.32 | 92.45 | 15 |
| PRINTS | 165.63 | 82.93 | 94.67 | 98.10 | 91.89 | 27 |
| MotViz | 0.0699 | 96.18 | 95.82 | 97.43 | 93.45 | 233 |
Note: Comparative performance of MotViz against the online databases Prosite, MEME, PATTERN, BLOCKS, Pfam, ProDom and PRINTS (refs. 11, 12, 19, 20, 21, 22, 23). Precision was calculated as TP/(TP+FP), where TP is the number of true positives and FP is the number of false positives.
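The four quality columns of the table can be related to confusion-matrix counts as in the sketch below. Only the precision formula is stated in the note; the other three use the standard textbook definitions, which is an assumption about how the table was computed. The function name and the example counts are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, precision, specificity and accuracy in percent.

    Precision follows the note, TP/(TP+FP). The remaining formulas are
    the standard definitions and are assumed, not taken from the source.
    """
    def percent(numerator, denominator):
        # Guard against empty classes instead of raising ZeroDivisionError.
        return 100.0 * numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": percent(tp, tp + fn),
        "precision": percent(tp, tp + fp),
        "specificity": percent(tn, tn + fp),
        "accuracy": percent(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=8, fp=2, tn=6, fn=4)
```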
List of highly conserved motifs predicted by MotViz only
| PDB ID | Motif sequence | Cv | Location |
|---|---|---|---|
| | VFGDLS | 0.733 | 434-440 |
| | RNELFLDV | 0.9875 | 456-464 |
| | KVVIK | 0.85 | 558-563 |
| | GMKESQISAEIE | 0.99104 | 964-976 |
| | AGNAARDNK | 0.883 | 232-241 |
| | IIPRHLQLA | 0.9013 | 244-253 |
| | LGKVTIAQG | 0.677 | 263-272 |
| | KIEELRQH | 0.9893 | 200-208 |
| | GIRK | 0.925 | 554-558 |
| | PSQD | 0.812 | 203-207 |
| | AGQLRTDIN | 0.772 | 315-324 |
| | ISS | 0.85 | 359-362 |
| | LSALLN | 0.733 | 426-432 |
| | GSVS | 0.675 | 435-439 |
| | NITQAIEQ | 0.656 | 526-534 |
| | KHRGFAF | 0.871 | 127-134 |
| | RTIRV | 0.679 | 157-162 |
| | TEALRFPPVM | 0.655 | 44-54 |
| | YLKSFPNL | 0.9832 | 63-71 |
| | CFRREPSKHL | 0.825 | 139-149 |
| | VDYASDPFF | 0.95 | 192-201 |
| | LKFELLIPL | 0.838 | 216-225 |
Note: Highly-conserved motifs predicted from 9 protein families by MotViz were verified by Cv.
Figure 3. Number of motifs predicted by MotViz for randomly selected protein families. The X-axis indicates the protein families examined, the Y-axis represents the number of motifs predicted using the different tools, and the Z-axis represents the motif databases. MotViz outperforms the other tools examined, although the performance of MEME is very close to that of MotViz.