
EuPathDomains: the divergent domain database for eukaryotic pathogens.

Amel Ghouila, Nicolas Terrapon, Olivier Gascuel, Fatma Z Guerfali, Dhafer Laouini, Eric Maréchal, Laurent Bréhélin.

Abstract

Eukaryotic pathogens (e.g. Plasmodium, Leishmania, trypanosomes) are a major source of morbidity and mortality worldwide. In Africa, one of the most affected continents, they cause millions of deaths and constitute an immense economic burden. While the genome sequences of several of these organisms are now available, the biological functions of more than half of their proteins are still unknown. This is a serious obstacle to identifying the expected new therapeutic targets. In this context, the identification of protein domains is a key step towards improving the functional annotation of proteins. However, several domains are missed in eukaryotic pathogens because of the large phylogenetic distance between these organisms and the classical eukaryotic models. We recently proposed a method, co-occurrence domain detection (CODD), that improves the sensitivity of Pfam domain detection by exploiting the tendency of domains to appear preferentially with a few other favorite domains in a protein. In this paper, we present EuPathDomains (http://www.atgc-montpellier.fr/EuPathDomains/), an extended database of protein domains belonging to ten major eukaryotic human pathogens. EuPathDomains gathers known domains and new domains detected by CODD, along with the associated confidence measurements and the GO annotations that can be deduced from the new domains. This database significantly extends the Pfam domain coverage of all selected genomes by proposing new occurrences of domains as well as new domain families that have never been reported before. For example, with a false discovery rate lower than 20%, EuPathDomains increases the number of detected domains by 13% in the Toxoplasma gondii genome and by up to 28% in Cryptosporidium parvum, and the total number of domain families by 10% in Plasmodium falciparum and by up to 16% in the C. parvum genome.
The database can be queried by protein names, domain identifiers, Pfam or InterPro identifiers, or organisms, and should become a valuable resource for deciphering the protein functions of eukaryotic pathogens.
Copyright © 2010 Elsevier B.V. All rights reserved.
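
The co-occurrence idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' CODD implementation: the function names (`favored_pairs`, `certify`), the toy domain labels and the simple count threshold are all assumptions made for illustration, and the threshold stands in for whatever statistical criterion the published method applies. The sketch learns which domain pairs are seen together in reference proteins, then keeps a weak candidate domain only when it forms such a pair with a domain already trusted in the same protein.

```python
from itertools import combinations


def favored_pairs(architectures, min_count=2):
    """Return unordered domain pairs seen together in >= min_count proteins.

    `architectures` is a list of per-protein domain lists (hypothetical input).
    The count threshold is an illustrative stand-in for a statistical test.
    """
    counts = {}
    for domains in architectures:
        # Each distinct pair is counted once per protein.
        for a, b in combinations(sorted(set(domains)), 2):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return {pair for pair, n in counts.items() if n >= min_count}


def certify(candidates, trusted, pairs):
    """Keep weak candidates that form a favored pair with a trusted domain."""
    kept = []
    for cand in candidates:
        if any(tuple(sorted((cand, t))) in pairs for t in trusted):
            kept.append(cand)
    return kept


# Toy reference architectures; the labels A-D are placeholders, not Pfam IDs.
reference = [["A", "B"], ["A", "B", "C"], ["C", "D"], ["A", "B"]]
pairs = favored_pairs(reference)

# A protein with one trusted domain "A" and two below-threshold candidates.
certified = certify(candidates=["B", "D"], trusted=["A"], pairs=pairs)
```

In this toy setting only candidate "B" is retained, because "A" and "B" co-occur in three reference proteins while "A" and "D" never do. A real pipeline would also need to estimate a false discovery rate for the certified domains, which this sketch does not attempt.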

Year: 2010    PMID: 20920608    DOI: 10.1016/j.meegid.2010.09.008

Source DB: PubMed    Journal: Infect Genet Evol    ISSN: 1567-1348    Impact factor: 3.342


  4 in total

1.  Fitting hidden Markov models of protein domains to a target species: application to Plasmodium falciparum.

Authors:  Nicolas Terrapon; Olivier Gascuel; Eric Maréchal; Laurent Bréhélin
Journal:  BMC Bioinformatics       Date:  2012-05-01       Impact factor: 3.169

2.  Identification of Plasmodium vivax proteins with potential role in invasion using sequence redundancy reduction and profile hidden Markov models.

Authors:  Daniel Restrepo-Montoya; David Becerra; Juan G Carvajal-Patiño; Alvaro Mongui; Luis F Niño; Manuel E Patarroyo; Manuel A Patarroyo
Journal:  PLoS One       Date:  2011-10-03       Impact factor: 3.240

3.  Identification of divergent protein domains by combining HMM-HMM comparisons and co-occurrence detection.

Authors:  Amel Ghouila; Isabelle Florent; Fatma Zahra Guerfali; Nicolas Terrapon; Dhafer Laouini; Sadok Ben Yahia; Olivier Gascuel; Laurent Bréhélin
Journal:  PLoS One       Date:  2014-06-05       Impact factor: 3.240

4.  Plasmobase: a comparative database of predicted domain architectures for Plasmodium genomes.

Authors:  Juliana Bernardes; Catherine Vaquero; Alessandra Carbone
Journal:  Malar J       Date:  2017-06-07       Impact factor: 2.979
