
The Candida Genome Database (CGD), a community resource for Candida albicans gene and protein information.

Martha B Arnaud, Maria C Costanzo, Marek S Skrzypek, Gail Binkley, Christopher Lane, Stuart R Miyasato, Gavin Sherlock.

Abstract

The Candida Genome Database (CGD) is a new database that contains genomic information about the opportunistic fungal pathogen Candida albicans. CGD is a public resource for the research community that is interested in the molecular biology of this fungus. CGD curators are in the process of combing the scientific literature to collect all C.albicans gene names and aliases; to assign gene ontology terms that describe the molecular function, biological process, and subcellular localization of each gene product; to annotate mutant phenotypes; and to summarize the function and biological context of each gene product in free-text description lines. CGD also provides community resources, including a reservation system for gene names and a colleague registry through which Candida researchers can share contact information and research interests. CGD is publicly funded (by NIH grant R01 DE15873-01 from the NIDCR) and is freely available at http://www.candidagenome.org/.

Year:  2005        PMID: 15608216      PMCID: PMC539957          DOI: 10.1093/nar/gki003

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Candida albicans is the best studied of the human fungal pathogens, and it serves as a model organism for the study of other pathogenic fungi. In recent years, the frequency of fungal infections has grown steadily, and although these infections are generally less frequent than bacterial infections, at least two aspects make them increasingly important. First, opportunistic infections in immunocompromised patients represent an increasingly common cause of mortality and morbidity (1,2). Second, many of the currently used antifungal compounds (3,4) are often of limited use because of their toxicity and side effects (5). Furthermore, within the last decade there has been an emergence of antifungal drug resistance, which was a rarity in the past (6–10). By serving as a resource for scientists who study fungal biology and pathogenesis, the Candida Genome Database (CGD) aims to facilitate progress toward a more complete understanding of, and more effective treatment for, fungal diseases.

Before CGD was created, three web sites contained information about the C.albicans genome sequence and about C.albicans gene products. The Stanford Genome Technology Center undertook the process of sequencing and the difficult challenge of assembling the sequence of this diploid organism (11), and their web site provides options for searching and downloading the genome sequence. CandidaDB, at the Pasteur Institute, was the first freely available C.albicans database; it contains sequence-based annotation for assemblies 6 and 19 of the genome sequence (http://genolist.pasteur.fr/CandidaDB/). The third resource was developed by the Candida Annotation Working Group, colleagues who came together on a volunteer basis to analyze the C.albicans sequence produced by the Stanford Genome Technology Center. The results of the Annotation Working Group's efforts include a high quality set of gene annotations and gene ontology (GO) terms assigned by sequence-based prediction. The Annotation Working Group's annotation and sequence analysis tools are accessible on a web site hosted at the Biotechnology Research Institute of the National Research Council in Canada (http://candida.bri.nrc.ca/candida/index.cfm).

The Candida research community expressed a need for a database with additional features: comprehensive literature curation, to complement the high quality sequence-based annotation already available; a more extensive set of sequence retrieval and analysis tools, similar to those provided at the Saccharomyces Genome Database (SGD) (12); and centralized community information, such as a colleague directory and a gene name registry. CGD was proposed to meet these needs.

CGD is based on the framework of SGD, using the same software, user interfaces, and underlying schema. The format and tools will therefore be familiar to CGD users who are already users of SGD. CGD started with the Candida Annotation Working Group's informative data set, and the CGD curators are now adding published material from the literature.

LOCUS PAGE

Similar to SGD, CGD contains gene information organized around locus pages. An example locus page is shown in Figure 1.
Figure 1

CGD locus page. The locus page presents the basic information about a gene and its product, including names and aliases, a concise description, GO term assignments and mutant phenotypes. The locus page also provides links to additional resources.

The locus page displays basic information about a gene and its product. The gene name is displayed prominently at the top of the page along with all aliases, including names assigned during sequencing and sequence assembly. Also found near the top of the page is the description, which is a concise statement of the most important information known about the gene and the gene product, especially its function, biological context, and physical characteristics. Each gene product is assigned GO terms (13) that describe its molecular functions, its location within the cell, and the biological processes in which it participates. The GO annotation section of each locus page contains a link to the GO annotation page, which shows all GO terms along with the references that were used to make each assignment and the type of evidence that supports it. An example GO annotation page is shown in Figure 2. Each GO term name, both on the locus page and on the GO annotation page, links to a graphical view that allows users to see parent and child relationships for each term, to navigate within the ontologies, and to view summary information about all of the Candida genes assigned to any given GO term.
Figure 2

CGD gene ontology (GO) annotation page. The GO annotation page displays each of the GO term assignments along with the references from which these assignments were made, and the types of evidence that support assignment of each GO term.

Initially, CGD GO curation has focused on one or a few references that describe each gene product. With time, CGD will collect GO terms comprehensively, such that the database will list all of the papers that support assignment of each term, rather than listing only a more limited set of representative papers. The rationale for assigning GO terms from each paper is that the number of independent pieces of evidence for assignment to a particular GO term can be a measure of confidence in that assignment.

The locus page also contains a mutant phenotype section. This section lists the type of mutation (e.g. homozygous null, heterozygous null, or overexpression) and any corresponding phenotype. At this time, phenotypes are collected from the literature as free-text descriptions. Each phenotype that is displayed on the locus page is hyperlinked to a list of all C.albicans genes that share that mutant phenotype. The locus page also presents a link to a page that lists the references in which specific phenotypes are described. This page also contains phenotype details, including additional information about the specific conditions under which some phenotypes have been observed.
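
The GO confidence rationale described in this section (more independent references behind a term means more confidence in the assignment) can be illustrated with a minimal Python sketch. The record layout, the `GoAnnotation` and `support_counts` names, and all gene, term and reference values are hypothetical placeholders chosen for illustration; they are not CGD's actual schema or data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GoAnnotation:
    """One GO term assignment (hypothetical layout, not the CGD schema)."""
    gene: str        # gene name
    go_term: str     # GO term name
    aspect: str      # "function", "process" or "component"
    reference: str   # identifier of the supporting paper
    evidence: str    # type of evidence supporting the assignment


def support_counts(annotations):
    """Count the distinct references behind each (gene, GO term) pair."""
    refs = defaultdict(set)
    for a in annotations:
        refs[(a.gene, a.go_term)].add(a.reference)
    return {key: len(value) for key, value in refs.items()}


# Placeholder data: two papers support "term A", one supports "term B".
annotations = [
    GoAnnotation("GENE1", "term A", "process", "ref-1", "IMP"),
    GoAnnotation("GENE1", "term A", "process", "ref-2", "IDA"),
    GoAnnotation("GENE1", "term A", "process", "ref-2", "IMP"),
    GoAnnotation("GENE1", "term B", "component", "ref-3", "IDA"),
]
counts = support_counts(annotations)
```

Counting distinct references rather than rows is a deliberate choice in this sketch: a single paper that supplies two types of evidence for the same term is still treated as one independent source.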

LITERATURE INTERFACE

CGD contains a wealth of information about the C.albicans scientific literature. This information is available in several formats within the literature guide. The literature guide, which is accessed from the menu on the right-hand side of each locus page, provides a list of papers that characterize a particular gene. These lists were generated by using an automated search of the PubMed database at NCBI, and have been manually screened to eliminate spurious references. Next to each reference there is a list of all the genes described in the paper, and each gene name is hyperlinked to its corresponding locus page.

As each reference is curated, curators note whether the paper pertains to any of a set of 45 ‘literature topics’. These topics are based on the set that is used by SGD, but have been expanded to include additional topics of special interest to the Candida research community. The topics include filamentous growth, phenotypic switching, adherence and biofilms, as well as more generalized topics such as function/process, protein physical properties, protein–protein interactions, protein–nucleic acid interactions, post-translational modifications, transcriptional regulation and translational regulation. The complete set of CGD literature guide topics is listed in Table 1. Within the literature guide interface, the reference list may be sorted according to the topic or by curation status (curated or not yet curated). Alternatively, users may choose to focus on individual papers. The curated paper view displays the reference information and the abstract, along with a summary of literature guide topics that are assigned to every gene characterized in the paper.
Table 1.

CGD literature topic curation

The literature topics are displayed above, along with the number of times each topic had been assigned, as of August 5, 2004. As each paper is curated, literature topics are assigned to all of the genes described in the paper. Each gene has a set of topic assignments from every curated paper that describes the gene. This information is shown in the literature guide interface, which is accessed from the menu on the right-hand side of each locus page.
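
The two literature guide views described above, grouping references by topic and ordering them by curation status, can be sketched in a few lines of Python. The `LiteratureEntry` record and both helper functions are assumptions made for illustration and do not reflect CGD's software; the reference and gene identifiers are placeholders, while the topic names are taken from the list given in the text.

```python
from dataclasses import dataclass, field


@dataclass
class LiteratureEntry:
    """A reference with its genes and topic assignments (hypothetical layout)."""
    reference: str
    genes: list
    topics: set = field(default_factory=set)
    curated: bool = False


def group_by_topic(entries):
    """Map each literature topic to the references assigned to it."""
    grouped = {}
    for entry in entries:
        for topic in sorted(entry.topics):
            grouped.setdefault(topic, []).append(entry.reference)
    return grouped


def sort_by_curation_status(entries):
    """Curated references first, then not-yet-curated ones, each by identifier."""
    return sorted(entries, key=lambda e: (not e.curated, e.reference))


entries = [
    LiteratureEntry("ref-2", ["GENE1"], {"biofilms", "adherence"}, curated=True),
    LiteratureEntry("ref-1", ["GENE1", "GENE2"], {"filamentous growth"}),
    LiteratureEntry("ref-3", ["GENE2"], {"adherence"}, curated=True),
]
topic_index = group_by_topic(entries)
ordered = [e.reference for e in sort_by_curation_status(entries)]
```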

COMMUNITY RESOURCES

CGD seeks to facilitate interaction among the members of the C.albicans research community. Thus, CGD has implemented a colleague registry, by which researchers may share contact information and find others who share research interests or who are experts in a particular topic. CGD also serves as the keeper of gene name reservations prior to publication. The community conferred this privilege upon CGD at the ASM meeting on Candida and Candidiasis in March 2004. Having a reservation system for gene names benefits the entire community because it helps to reduce conflicts in gene names and prevents the introduction of confusing synonyms into the literature. CGD does not itself assign gene names, but rather collects and maintains a list of current reservations and attempts to mediate resolution of any disputes that may arise. CGD follows gene name guidelines that are based on those used by the Saccharomyces cerevisiae research community. Detailed information about choosing and reserving a gene name is found on the CGD web site under nomenclature guide. CGD also hosts a web page with Candida community news and a list of meetings, courses, and related web sites of interest.
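
As a rough illustration of how a reservation list can surface naming conflicts without itself assigning names, consider the following Python sketch. The `ReservationRegistry` class, the case-insensitive comparison of names, and the example names and labs are all assumptions introduced here; the text does not describe how CGD implements its registry.

```python
class ReservationRegistry:
    """Records gene name reservations and reports names claimed more than once."""

    def __init__(self):
        # Upper-cased gene name -> list of labs that reserved it, in order.
        # Case-insensitive matching is an assumption of this sketch.
        self._reserved = {}

    def reserve(self, name, lab):
        """Record a reservation; return True if the name has a single holder."""
        holders = self._reserved.setdefault(name.upper(), [])
        if lab not in holders:
            holders.append(lab)
        return len(holders) == 1

    def conflicts(self):
        """Return names reserved by more than one lab, for manual mediation."""
        return {n: h for n, h in self._reserved.items() if len(h) > 1}


registry = ReservationRegistry()
first = registry.reserve("ABC1", "lab A")
second = registry.reserve("abc1", "lab B")   # same name, different lab
third = registry.reserve("XYZ2", "lab A")
disputed = registry.conflicts()
```

The registry only records and reports; it never rejects or reassigns a name, which mirrors the mediating role described above.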

CURRENT PROGRESS AND FUTURE DIRECTIONS

The CGD project began in April 2004, and is progressing rapidly. However, there is still much to be done, and there are plans to add additional information and new features. CGD literature curation is now in progress. As of August 2004, CGD contained more than 900 gene product descriptions, ∼1500 mutant phenotype descriptions and 1500 GO term assignments.

The initial release of CGD contained locus pages for genes that have been characterized in the literature. The C.albicans genome contains ∼6400 homologous pairs of genes (11); the majority of these genes have not yet been characterized. CGD will contain the entire SC5314 gene complement, with locus pages and sequences for all the genes that were identified in the genome-sequencing project, although not all of this information was included in CGD upon the initial database release. CGD will contain the reference sequence of the strain SC5314 (11).

The C.albicans genomic sequence data is scheduled to be added to CGD in the autumn of 2004. Once this information has been incorporated, CGD will provide access to sequence analysis and visualization tools that are similar to those available at SGD, including tools for viewing multiple versions of sequences that have been updated since the original sequence was published. Each locus page currently provides a hyperlink to the C.albicans BLAST tool at the Biotechnology Research Institute of the National Research Council in Canada. In addition, CGD will provide links between CGD and SGD locus pages, which will provide instant access to information about the S.cerevisiae orthologs of C.albicans proteins. All CGD data will be available for free download at an ftp site that will be linked from our home page.

The current curation efforts are focused on the body of scientific literature that deals with specific C.albicans genes by name. However, an additional set of literature exists that concerns more generalized C.albicans biology, e.g. drug sensitivity studies or morphological descriptions that do not examine the role of any specific gene product. CGD plans to include these papers in the database and to make literature guide topic assignments. The current set of literature topics may need to be expanded to capture information from this set of papers more effectively. The CGD group seeks input from the research community as to what types of information would be most useful for CGD to collect from such papers.

Within the next year, CGD plans to begin curation of metabolic pathway information. CGD will use the Pathway Tools software (14) to make pathway predictions, and will supplement and validate these predictions by curating pathway information from the published literature.

SUMMARY AND AVAILABILITY

In summary, the Candida Genome Database is a resource modeled after the Saccharomyces Genome Database. CGD contains information about C.albicans genes and gene products. CGD is freely available on the web at www.candidagenome.org. CGD also facilitates community interaction by providing a colleague registry and a gene name registry. CGD is being actively developed, and the CGD project staff would like to solicit advice from Candida researchers about ways in which CGD may best serve the C.albicans research community. Users are encouraged to contact CGD at candida-curator@genome.stanford.edu with comments or suggestions.

REFERENCES (12 in total; 10 shown)

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  The Pathway Tools software.

Authors:  Peter D Karp; Suzanne Paley; Pedro Romero
Journal:  Bioinformatics       Date:  2002       Impact factor: 6.937

3.  Failure of fluconazole in systemic candidiasis.

Authors:  Y Siegman-Igra; M Y Rabaw
Journal:  Eur J Clin Microbiol Infect Dis       Date:  1992-02       Impact factor: 3.267

4.  Fluconazole resistant candida in AIDS.

Authors:  D Smith; F Boag; J Midgley; B Gazzard
Journal:  J Infect       Date:  1991-11       Impact factor: 6.072

5.  Antifungal agents: chemotherapeutic targets and immunologic strategies. (Review)

Authors:  N H Georgopapadakou; T J Walsh
Journal:  Antimicrob Agents Chemother       Date:  1996-02       Impact factor: 5.191

6.  Can we prevent azole resistance in fungi? (Review)

Authors:  D W Denning
Journal:  Lancet       Date:  1995-08-19       Impact factor: 79.321

7.  Saccharomyces Genome Database (SGD) provides tools to identify and analyze sequences from Saccharomyces cerevisiae and related sequences from other organisms.

Authors:  Karen R Christie; Shuai Weng; Rama Balakrishnan; Maria C Costanzo; Kara Dolinski; Selina S Dwight; Stacia R Engel; Becket Feierbach; Dianna G Fisk; Jodi E Hirschman; Eurie L Hong; Laurie Issel-Tarver; Robert Nash; Anand Sethuraman; Barry Starr; Chandra L Theesfeld; Rey Andrada; Gail Binkley; Qing Dong; Christopher Lane; Mark Schroeder; David Botstein; J Michael Cherry
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

8.  Opportunistic candidiasis: an epidemic of the 1980s.

Authors:  S P Fisher-Hoch; L Hutwagner
Journal:  Clin Infect Dis       Date:  1995-10       Impact factor: 9.079

9.  Emergence of azole drug resistance in Candida species from HIV-infected patients receiving prolonged fluconazole therapy for oral candidosis.

Authors:  E M Johnson; D W Warnock; J Luker; S R Porter; C Scully
Journal:  J Antimicrob Chemother       Date:  1995-01       Impact factor: 5.790

10.  The diploid genome sequence of Candida albicans.

Authors:  Ted Jones; Nancy A Federspiel; Hiroji Chibana; Jan Dungan; Sue Kalman; B B Magee; George Newport; Yvonne R Thorstenson; Nina Agabian; P T Magee; Ronald W Davis; Stewart Scherer
Journal:  Proc Natl Acad Sci U S A       Date:  2004-05-03       Impact factor: 11.205

