A fast and automated solution for accurately resolving protein domain architectures.

Corin Yeats, Oliver C Redfern, Christine Orengo.

Abstract

MOTIVATION: Accurate prediction of the domain content and arrangement in multi-domain proteins (which make up >65% of the large-scale protein databases) provides a valuable tool for function prediction, comparative genomics and studies of molecular evolution. However, scanning a multi-domain protein against a database of domain sequence profiles can often produce conflicting and overlapping matches. We have developed a novel method that employs heaviest weighted clique-finding (HCF), which we show significantly outperforms standard published approaches based on successively assigning the best non-overlapping match (Best Match Cascade, BMC).
RESULTS: We created a benchmark data set of structural domain assignments in the CATH database and a corresponding set of Hidden Markov Model-based domain predictions. Using these, we demonstrate that by considering all possible combinations of matches using the HCF approach, we achieve much higher prediction accuracy than the standard BMC method. We also show that it is essential to allow overlapping domain matches to a query in order to identify correct domain assignments. Furthermore, we introduce a straightforward and effective protocol for resolving any overlapping assignments and producing a single set of non-overlapping predicted domains.
AVAILABILITY AND IMPLEMENTATION: The new approach will be used to determine MDAs for UniProt and Ensembl, and made available via the Gene3D website: http://gene3d.biochem.ucl.ac.uk/Gene3D/. The software has been implemented in C++ and compiled for Linux; source code and binaries can be found at: ftp://ftp.biochem.ucl.ac.uk/pub/gene3d_data/DomainFinder3/
CONTACT: yeats@biochem.ucl.ac.uk
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
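
ILLUSTRATION: The difference between the two strategies named in the abstract can be sketched on a toy example. The Python below is not the authors' C++ implementation; it is a minimal sketch under stated assumptions: each match is a (domain, start, end, score) record, scores are non-negative and higher is better, the weight of a set of matches is the sum of its scores, and two matches are compatible when they share at most `max_overlap` residues. The names `Match`, `best_match_cascade`, `heaviest_clique` and `max_overlap`, and the toy data, are hypothetical.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    domain: str   # hypothetical domain family label
    start: int    # first matched residue (1-based, inclusive)
    end: int      # last matched residue (inclusive)
    score: float  # assumed non-negative; higher is better


def overlap(a: Match, b: Match) -> int:
    """Number of residues shared by two matches."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def compatible(a: Match, b: Match, max_overlap: int = 0) -> bool:
    """Assumed rule: matches may coexist if they share few enough residues."""
    return overlap(a, b) <= max_overlap


def best_match_cascade(matches: List[Match], max_overlap: int = 0) -> List[Match]:
    """Greedy BMC: repeatedly accept the best-scoring compatible match."""
    chosen: List[Match] = []
    for m in sorted(matches, key=lambda x: x.score, reverse=True):
        if all(compatible(m, c, max_overlap) for c in chosen):
            chosen.append(m)
    return sorted(chosen, key=lambda x: x.start)


def heaviest_clique(matches: List[Match], max_overlap: int = 0) -> List[Match]:
    """Exhaustive search for the heaviest set of pairwise compatible matches.

    Such a set is a clique in the graph whose edges join compatible matches.
    The search is exponential in the worst case and only suits small inputs.
    """
    best: List[Match] = []
    best_weight = 0.0

    def extend(clique: List[Match], weight: float, candidates: List[Match]) -> None:
        nonlocal best, best_weight
        if weight > best_weight:
            best, best_weight = list(clique), weight
        # Bound: stop if even taking every candidate cannot beat the best.
        if weight + sum(c.score for c in candidates) <= best_weight:
            return
        for i, m in enumerate(candidates):
            # Keep only later candidates that are compatible with m.
            rest = [c for c in candidates[i + 1:] if compatible(m, c, max_overlap)]
            extend(clique + [m], weight + m.score, rest)

    extend([], 0.0, list(matches))
    return sorted(best, key=lambda x: x.start)


# Toy query: one long, high-scoring match competes with two shorter ones.
toy = [
    Match("A", 1, 200, 10.0),
    Match("B", 1, 100, 7.0),
    Match("C", 101, 200, 7.0),
]
print([m.domain for m in best_match_cascade(toy)])
print([m.domain for m in heaviest_clique(toy)])
```

On this toy input the greedy cascade keeps only A (total score 10), because accepting the single best match first excludes both B and C, while the exhaustive search returns B and C (total score 14). Raising `max_overlap` above zero is one simple way to tolerate slightly overlapping matches during selection; how the published method weights matches and resolves the remaining overlaps is not specified in the abstract and is not reproduced here.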

Year:  2010        PMID: 20118117     DOI: 10.1093/bioinformatics/btq034

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


