Rezaul Karim, Mohd Momin Al Aziz, Swakkhar Shatabda, M Sohel Rahman, Md Abul Kashem Mia, Farhana Zaman, Salman Rakin.
Abstract
The number of entries in structural databases of proteins is growing daily. Methods for retrieving protein tertiary structures from such large databases have become key to the comparative analysis of structures, which plays an important role in understanding proteins and their functions. In this paper, we present fast and accurate methods for retrieving, from a large database, proteins whose tertiary structures are similar to that of a query protein. Our proposed methods borrow ideas from the field of computer vision. The speed and accuracy of our methods come from two newly introduced features, the co-occurrence matrix of the oriented gradient and the pyramid histogram of oriented gradient, and from the use of Euclidean distance as the distance measure. Experimental results indicate that our approach is superior in both running time and accuracy. Our method is available for use at this website: http://research.buet.ac.bd:8080/Comograd/.
Year: 2015 PMID: 26293226 PMCID: PMC4543952 DOI: 10.1038/srep13275
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
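As a rough illustration of the pipeline the abstract names, the following Python sketch builds an α carbon distance matrix, quantizes the gradient orientation at each cell, counts co-occurrences of neighboring orientation bins, and ranks database entries by Euclidean distance. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the 16 orientation bins, the horizontal-neighbor offset, and the normalization are illustrative choices that the abstract does not specify.

```python
import numpy as np

def distance_matrix(ca_coords):
    """Pairwise Euclidean distances between alpha-carbon coordinates (N x 3)."""
    ca = np.asarray(ca_coords, dtype=float)
    diff = ca[:, None, :] - ca[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def quantized_orientations(image, n_bins=16):
    """Quantize the gradient orientation of every cell into n_bins bins.

    n_bins=16 is an assumption made for this sketch.
    """
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    angle = np.arctan2(gy, gx)  # values in [-pi, pi]
    bins = np.floor((angle + np.pi) / (2 * np.pi) * n_bins).astype(int)
    return np.clip(bins, 0, n_bins - 1)  # angle == pi goes to the last bin

def cooccurrence_feature(image, n_bins=16):
    """Flattened, normalized co-occurrence counts of horizontally adjacent bins."""
    q = quantized_orientations(image, n_bins)
    left, right = q[:, :-1].ravel(), q[:, 1:].ravel()
    counts = np.zeros((n_bins, n_bins))
    np.add.at(counts, (left, right), 1)  # accumulate repeated index pairs
    return (counts / counts.sum()).ravel()

def rank_by_euclidean(query_vec, database_vecs):
    """Indices of database entries, nearest first, under Euclidean distance."""
    dists = np.linalg.norm(np.asarray(database_vecs) - query_vec, axis=1)
    return np.argsort(dists)

if __name__ == "__main__":
    # Hypothetical random coordinates stand in for real protein domains.
    rng = np.random.default_rng(0)
    domains = [rng.normal(size=(30, 3)) * 10 for _ in range(3)]
    features = [cooccurrence_feature(distance_matrix(d)) for d in domains]
    print(rank_by_euclidean(features[1], features))
```

Because each structure is reduced to a fixed-length vector, a query only needs one vector-to-vector distance per database entry, which is consistent with the short query times the abstract claims.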
Figure 1. Representation of β sheets of domain d1n4ja36 in α carbon distance matrix gray-scale image.
Figure 2. Representation of α helices of domain d1irqa37 in α carbon distance matrix gray-scale image.
Figure 3. Level 1 quad tree of α carbon distance matrix image.
Figure 4. Percentage of matches of Class, Fold, Superfamily and Family for up to top 50 retrieval results.
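The gray-scale image and level 1 quad tree named in the captions above can be sketched roughly as follows. This is only an illustration in the spirit of a pyramid histogram of oriented gradients: the linear scaling to 0..255, the 8 unsigned orientation bins, the magnitude weighting, and all function names are assumptions made for this sketch rather than details taken from the paper.

```python
import numpy as np

def to_grayscale(dist, max_dist=None):
    """Scale a distance matrix linearly to 8-bit gray levels (0..255)."""
    dist = np.asarray(dist, dtype=float)
    top = dist.max() if max_dist is None else max_dist
    if top == 0:
        return np.zeros(dist.shape, dtype=np.uint8)
    return np.round(255 * np.clip(dist / top, 0, 1)).astype(np.uint8)

def orientation_histogram(image, n_bins=8):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), np.pi)  # fold into [0, pi)
    hist, _ = np.histogram(angle, bins=n_bins, range=(0, np.pi),
                           weights=magnitude)
    return hist

def quadrants(image):
    """Level 1 quad tree: split the image into four sub-images."""
    h, w = image.shape
    mh, mw = h // 2, w // 2
    return [image[:mh, :mw], image[:mh, mw:], image[mh:, :mw], image[mh:, mw:]]

def pyramid_histogram_level1(image, n_bins=8):
    """Concatenate the whole-image histogram with the four quadrant histograms.

    The image must be at least 4 x 4 so each quadrant has a gradient.
    """
    parts = [orientation_histogram(image, n_bins)]
    parts += [orientation_histogram(q, n_bins) for q in quadrants(image)]
    vec = np.concatenate(parts)
    total = vec.sum()
    return vec / total if total > 0 else vec

if __name__ == "__main__":
    # Hypothetical coordinates; a real input would be alpha-carbon positions.
    rng = np.random.default_rng(1)
    pts = rng.normal(size=(12, 3))
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    image = to_grayscale(dist)
    print(pyramid_histogram_level1(image).shape)
```

With one whole-image histogram and four quadrant histograms, the resulting vector has 5 × `n_bins` entries; deeper quad tree levels would extend it in the same way.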
Comparison of query time.
| Method | Loading Time | Query Time | Time Per Query |
|---|---|---|---|
| MASASW | 28 min 11 s | 42 min 18 s | 25.38 s |
| CoMOGrad | 18 min 31 s | 1 min 23 s | 0.83 s |
| CoMOGrad + PHOG | 27 min 24 s | 5 min 32 s | 3.32 s |
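The relationship between the Query Time and Time Per Query columns can be checked with a short calculation. Dividing each total by the per-query figure gives the same value for all three rows, which suggests the timings were taken over 100 queries; that count is an inference from the table, not a number stated here. The dictionary names below are illustrative.

```python
# Total query time in seconds, converted from the table's "min s" values.
query_seconds = {
    "MASASW": 42 * 60 + 18,
    "CoMOGrad": 1 * 60 + 23,
    "CoMOGrad + PHOG": 5 * 60 + 32,
}

# Time per query in seconds, copied from the table.
per_query_seconds = {
    "MASASW": 25.38,
    "CoMOGrad": 0.83,
    "CoMOGrad + PHOG": 3.32,
}

# Number of queries implied by each row (total / per-query).
implied_queries = {
    method: query_seconds[method] / per_query_seconds[method]
    for method in query_seconds
}

# Per-query speedup of each method relative to MASASW.
speedup_vs_masasw = {
    method: per_query_seconds["MASASW"] / seconds
    for method, seconds in per_query_seconds.items()
}

if __name__ == "__main__":
    for method in query_seconds:
        print(method,
              round(implied_queries[method]),
              round(speedup_vs_masasw[method], 2))
```

On these figures, CoMOGrad answers a query roughly 30 times faster than MASASW, and CoMOGrad + PHOG about 7.6 times faster.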
Figure 5. MCC values for binary classification.
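For reference, the Matthews correlation coefficient (MCC) of a binary classifier is computed from the four confusion-matrix counts as (TP·TN − FP·FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below implements that standard definition; the counts in the usage example are hypothetical and are not taken from the paper, and returning 0 when the denominator vanishes is a common convention assumed here.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns a value in [-1, 1]; 0.0 is returned when any marginal sum
    is zero (a convention assumed for this sketch).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0

if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(mcc(tp=6, tn=3, fp=1, fn=2))
```

A value of 1 means every prediction is correct, −1 means every prediction is wrong, and values near 0 indicate performance no better than chance.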