Kui Lin¹, Lei Zhu, Da-Yong Zhang
¹ MOE Key Laboratory for Biodiversity Science and Ecological Engineering and College of Life Sciences, Beijing Normal University, Beijing 100875, China. linkui@bnu.edu.cn
Abstract
MOTIVATION: Ideally, only proteins that exhibit highly similar domain architectures should be compared with one another as homologues or be classified into a single family. By combining three different indices, the Jaccard index, the Goodman-Kruskal gamma function and the domain duplicate index, into a single similarity measure, we propose a method for comparing proteins based on their domain architectures.

RESULTS: Evaluation of the method using the eukaryotic orthologous groups of proteins (KOGs) database indicated that it allows the automatic and efficient comparison of multiple-domain proteins, which are usually refractory to classic approaches based on sequence similarity measures. As a case study, the PDZ and LRR_1 domains are used to demonstrate how proteins containing promiscuous domains can be clearly compared using our method. For the convenience of users, a web server was set up where three different query interfaces were implemented to compare different domain architectures or proteins with domain(s), and to identify the relationships among domain architectures within a given KOG from the Clusters of Orthologous Groups of Proteins database.

CONCLUSION: The approach we propose is suitable for estimating the similarity of domain architectures of proteins, especially those of multidomain proteins.

AVAILABILITY: http://cmb.bnu.edu.cn/pdart/.
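To make the idea of a combined measure concrete, the following is a minimal Python sketch, not the authors' implementation. The Jaccard index and the Goodman-Kruskal gamma follow their standard definitions. Everything else is an assumption, because the abstract does not give the details: the function `duplicate_similarity` is a hypothetical stand-in for the domain duplicate index, the equal weights in `architecture_similarity` are arbitrary, repeated domains are placed at their first occurrence when computing gamma, and gamma is set to 0 when fewer than two domains are shared. The architecture containing `SH3` is an illustrative input only.

```python
from collections import Counter
from itertools import combinations


def jaccard_index(arch_a, arch_b):
    """Number of shared distinct domains divided by all distinct domains."""
    set_a, set_b = set(arch_a), set(arch_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def goodman_kruskal_gamma(arch_a, arch_b):
    """(concordant - discordant) / (concordant + discordant) over shared-domain pairs.

    Assumption: a repeated domain is positioned at its first occurrence.
    Assumption: returns 0.0 when there is no pair of shared domains to compare.
    """
    shared = sorted(set(arch_a) & set(arch_b))
    concordant = discordant = 0
    for x, y in combinations(shared, 2):
        order_a = arch_a.index(x) < arch_a.index(y)
        order_b = arch_b.index(x) < arch_b.index(y)
        if order_a == order_b:
            concordant += 1
        else:
            discordant += 1
    total = concordant + discordant
    return (concordant - discordant) / total if total else 0.0


def duplicate_similarity(arch_a, arch_b):
    """Hypothetical stand-in for the domain duplicate index (not the paper's formula).

    For each shared domain, take the ratio of the smaller to the larger copy
    number, then average the ratios over all shared domains.
    """
    count_a, count_b = Counter(arch_a), Counter(arch_b)
    shared = set(count_a) & set(count_b)
    if not shared:
        return 0.0
    ratios = [min(count_a[d], count_b[d]) / max(count_a[d], count_b[d]) for d in shared]
    return sum(ratios) / len(ratios)


def architecture_similarity(arch_a, arch_b, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three indices; equal weights are an assumption."""
    w_jaccard, w_gamma, w_dup = weights
    return (
        w_jaccard * jaccard_index(arch_a, arch_b)
        + w_gamma * goodman_kruskal_gamma(arch_a, arch_b)
        + w_dup * duplicate_similarity(arch_a, arch_b)
    )


# Illustrative architectures, written as N-to-C ordered lists of domain names.
protein_a = ["PDZ", "LRR_1", "SH3"]
protein_b = ["PDZ", "PDZ", "LRR_1"]
score = architecture_similarity(protein_a, protein_b)
```

Representing each architecture as an ordered list keeps the three aspects separable: the set of names drives the Jaccard index, the relative order drives gamma, and the copy counts drive the duplicate term.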