Detection of new protein domains using co-occurrence: application to Plasmodium falciparum.

Nicolas Terrapon, Olivier Gascuel, Eric Maréchal, Laurent Bréhélin.

Abstract

MOTIVATION: Hidden Markov models (HMMs) have proved to be a powerful tool for protein domain identification in newly sequenced organisms. However, numerous domains may be missed in highly divergent proteins. This is the case for the proteins of Plasmodium falciparum, the main causal agent of human malaria.
RESULTS: We propose a method to improve the sensitivity of HMM domain detection by exploiting the tendency of the domains to appear preferentially with a few other favorite domains in a protein. When sequence information alone is not sufficient to warrant the presence of a particular domain, our method enables its detection on the basis of the presence of other Pfam or InterPro domains. Moreover, a shuffling procedure allows us to estimate the false discovery rate associated with the results. Applied to P. falciparum, our method identifies 585 new Pfam domains (versus the 3683 already known domains in the Pfam database) with an estimated error rate <20%. These new domains provide 387 new Gene Ontology (GO) annotations to the P. falciparum proteome. Analogous and congruent results are obtained when applying the method to related Plasmodium species (P. vivax and P. yoelii).
AVAILABILITY: Supplementary Material and a database of the new domains and GO predictions achieved on Plasmodium proteins are available at http://www.lirmm.fr/~terrapon/codd/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
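The idea described under RESULTS (accepting a weak domain hit when a domain it tends to co-occur with is already present, then using shuffling to estimate the false discovery rate) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how favorite domain pairs are selected or how the shuffling is carried out, so the function names, the toy data, and the choice to shuffle candidate domains across proteins are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def certify_candidates(confident, candidates, cooccurring_pairs):
    """Keep a weak candidate domain when it forms a known co-occurring
    pair with a confidently detected domain in the same protein."""
    certified = set()
    for protein, cand_domains in candidates.items():
        context = confident.get(protein, set())
        for dom in cand_domains:
            if any(frozenset((dom, c)) in cooccurring_pairs
                   for c in context if c != dom):
                certified.add((protein, dom))
    return certified


def shuffle_candidates(candidates, rng):
    """Hypothetical null model: reassign candidate domains to random
    proteins while keeping the number of candidate slots per protein."""
    proteins, domains = [], []
    for protein in sorted(candidates):
        for dom in sorted(candidates[protein]):
            proteins.append(protein)
            domains.append(dom)
    rng.shuffle(domains)
    shuffled = defaultdict(set)
    for protein, dom in zip(proteins, domains):
        shuffled[protein].add(dom)
    return shuffled


def estimate_fdr(confident, candidates, cooccurring_pairs,
                 n_shuffles=200, seed=0):
    """Estimate the FDR as (mean certifications on shuffled data)
    divided by (certifications on the real data), capped at 1."""
    rng = random.Random(seed)
    observed = len(certify_candidates(confident, candidates,
                                      cooccurring_pairs))
    if observed == 0:
        return 0.0
    total = 0
    for _ in range(n_shuffles):
        shuffled = shuffle_candidates(candidates, rng)
        total += len(certify_candidates(confident, shuffled,
                                        cooccurring_pairs))
    return min(1.0, (total / n_shuffles) / observed)


# Toy data (entirely made up): domains found with confidence per protein,
# weak candidate hits per protein, and domain pairs assumed to co-occur.
confident = {"p1": {"A"}, "p2": {"C"}, "p3": {"E"}}
candidates = {"p1": {"B"}, "p2": {"D"}, "p3": {"X"}}
pairs = {frozenset(("A", "B")), frozenset(("C", "D"))}

certified = certify_candidates(confident, candidates, pairs)
fdr = estimate_fdr(confident, candidates, pairs, n_shuffles=50, seed=1)
print(sorted(certified), fdr)
```

In this toy setting, the candidates B and D are certified because each has a partner domain in the same protein, while X is not. The shuffled runs indicate how often such support would arise by chance, which is the role the abstract assigns to its shuffling procedure.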

Year: 2009    PMID: 19786484    DOI: 10.1093/bioinformatics/btp560

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


  14 in total

1.  Fitting hidden Markov models of protein domains to a target species: application to Plasmodium falciparum.

Authors:  Nicolas Terrapon; Olivier Gascuel; Eric Maréchal; Laurent Bréhélin
Journal:  BMC Bioinformatics       Date:  2012-05-01       Impact factor: 3.169

2.  Using context to improve protein domain identification.

Authors:  Alejandro Ochoa; Manuel Llinás; Mona Singh
Journal:  BMC Bioinformatics       Date:  2011-03-31       Impact factor: 3.169

3.  Identification of Plasmodium vivax proteins with potential role in invasion using sequence redundancy reduction and profile hidden Markov models.

Authors:  Daniel Restrepo-Montoya; David Becerra; Juan G Carvajal-Patiño; Alvaro Mongui; Luis F Niño; Manuel E Patarroyo; Manuel A Patarroyo
Journal:  PLoS One       Date:  2011-10-03       Impact factor: 3.240

4.  Domain similarity based orthology detection.

Authors:  Tristan Bitard-Feildel; Carsten Kemena; Jenny M Greenwood; Erich Bornberg-Bauer
Journal:  BMC Bioinformatics       Date:  2015-05-13       Impact factor: 3.169

5.  Powerful sequence similarity search methods and in-depth manual analyses can identify remote homologs in many apparently "orphan" viral proteins.

Authors:  Durga B Kuchibhatla; Westley A Sherman; Betty Y W Chung; Shelley Cook; Georg Schneider; Birgit Eisenhaber; David G Karlin
Journal:  J Virol       Date:  2013-10-23       Impact factor: 5.103

6.  Identification of divergent protein domains by combining HMM-HMM comparisons and co-occurrence detection.

Authors:  Amel Ghouila; Isabelle Florent; Fatma Zahra Guerfali; Nicolas Terrapon; Dhafer Laouini; Sadok Ben Yahia; Olivier Gascuel; Laurent Bréhélin
Journal:  PLoS One       Date:  2014-06-05       Impact factor: 3.240

7.  Inventory of fatty acid desaturases in the pennate diatom Phaeodactylum tricornutum.

Authors:  Lina-Juana Dolch; Eric Maréchal
Journal:  Mar Drugs       Date:  2015-03-16       Impact factor: 5.118

8.  Improvement in Protein Domain Identification Is Reached by Breaking Consensus, with the Agreement of Many Profiles and Domain Co-occurrence.

Authors:  Juliana Bernardes; Gerson Zaverucha; Catherine Vaquero; Alessandra Carbone
Journal:  PLoS Comput Biol       Date:  2016-07-29       Impact factor: 4.475

9.  Beyond the E-Value: Stratified Statistics for Protein Domain Prediction.

Authors:  Alejandro Ochoa; John D Storey; Manuel Llinás; Mona Singh
Journal:  PLoS Comput Biol       Date:  2015-11-17       Impact factor: 4.475

10.  A multi-objective optimization approach accurately resolves protein domain architectures.

Authors:  J S Bernardes; F R J Vieira; G Zaverucha; A Carbone
Journal:  Bioinformatics       Date:  2015-10-12       Impact factor: 6.937
