
ToxoDB: an integrated Toxoplasma gondii database resource.

Bindu Gajria, Amit Bahl, John Brestelli, Jennifer Dommer, Steve Fischer, Xin Gao, Mark Heiges, John Iodice, Jessica C Kissinger, Aaron J Mackey, Deborah F Pinney, David S Roos, Christian J Stoeckert, Haiming Wang, Brian P Brunk.

Abstract

ToxoDB (http://ToxoDB.org) is a genome and functional genomic database for the protozoan parasite Toxoplasma gondii. It incorporates the sequence and annotation of the T. gondii ME49 strain, as well as genome sequences for the GT1, VEG and RH (Chr Ia, Chr Ib) strains. Sequence information is integrated with various other genomic-scale data, including community annotation, ESTs, gene expression and proteomics data. ToxoDB has matured significantly since its initial release. Here we outline the numerous updates with respect to the data and increased functionality available on the website.


Year: 2007; PMID: 18003657; PMCID: PMC2238934; DOI: 10.1093/nar/gkm981

Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 16.971


INTRODUCTION

Toxoplasma gondii is an intracellular apicomplexan parasite capable of infecting humans. Infection is typically asymptomatic in healthy individuals, but may lead to congenital birth defects and encephalitis in immuno-suppressed individuals (1,2). ToxoDB, initially released in May 2001, has been substantially updated in both content and functionality since last described in January 2003 (3). ToxoDB provides access to the genome sequence and annotation of the T. gondii ME49 strain. It also incorporates the genomic sequence of multiple other strains. The parasite genome is ∼63 Mb in size and consists of 14 chromosomes (4). The initial ToxoDB release was not supported by a relational database and thus the site had restricted functionality and little capability to integrate diverse data types such as gene expression data and single nucleotide polymorphism data (SNPs) with genomic sequence. Since initial publication, ToxoDB has been completely rebuilt using a common architecture similar to another apicomplexan database project, PlasmoDB (5). Both sites, along with CryptoDB, are component sites of ApiDB, the Apicomplexan Bioinformatics Resource Center (6). Many of the new methods of data loading, querying and presentation that are mentioned here have been applied to all of the ApiDB sites to provide a common research platform and facilitate data access among this group of related organisms. ApiDB (http://apidb.org/) serves as an ‘umbrella’ site for cross-species comparisons. Researchers can mine for Toxoplasma genes at ApiDB directly or via their orthologous relationship(s) to genes in other apicomplexan species.

CONTENT OF THE CURRENT RELEASE

Data

ToxoDB provides access to the genome sequence and annotation of T. gondii (ME49 strain) and the genomic sequence of the GT1, VEG and RH (Chr Ia and Chr Ib) strains. Annotation is also available for the apicoplast genome. The current database version (Release 4.2) also contains manual annotation (solicited in the initial genome annotation and entered by users as user comments), ESTs, TIGR Gene Indices clustered ESTs, SAGE tags, SNPs, cosmid and BAC ends, microarray and proteomics studies, all of which have been mapped to the genome (7,8). The database contains the results of automated analyses including gene predictions (using various algorithms), open reading frames (ORFs) greater than 50 aa and protein feature predictions [signal peptides, transmembrane domains, hydrophobicity plots, AA content and InterPro domains (9)], Gene Ontology function predictions, and BLAST similarities to the NCBI non-redundant protein database (Table 1).
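
To illustrate the ORF analysis mentioned above, the following minimal Python sketch scans the three forward reading frames of a DNA string for stop-free stretches longer than 50 codons. The source does not say how ToxoDB defines an ORF (start-to-stop or stop-to-stop) or whether both strands are scanned, so the stop-to-stop, forward-strand definition used here is an assumption, and `find_orfs` is a hypothetical name rather than part of any ToxoDB tool.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=51):
    """Return (start, end, frame) for stop-free stretches of at least
    min_codons codons on the forward strand (0-based, end exclusive).

    min_codons=51 corresponds to "greater than 50 aa".
    Assumption: an ORF is any stretch between stop codons (no ATG required).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        orf_start = frame
        # Walk the frame one complete codon at a time.
        for pos in range(frame, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - orf_start) // 3 >= min_codons:
                    orfs.append((orf_start, pos, frame))
                orf_start = pos + 3
        # Handle a stop-free stretch that runs to the end of the sequence.
        last = frame + ((len(seq) - frame) // 3) * 3
        if (last - orf_start) // 3 >= min_codons:
            orfs.append((orf_start, last, frame))
    return orfs
```

Scanning the reverse strand would only require running the same function on the reverse complement of the sequence.
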

Table 1. Data and analyses that have been integrated into ToxoDB and the number of genes that are impacted

Data type             Data source                                             Number of genes
Genes                 TIGR                                                    8032
Community annotation  Various contributors                                    1610
Orthologs             Generated from OrthoMCL                                 4616
GO terms              TIGR; InterPro                                          3136
EC numbers            TIGR                                                    800
SNPs                  John Boothroyd Laboratory; David Roos Laboratory        7322
Microarray            David Roos Laboratory                                   7664
ESTs                  dbEST, TIGR Gene Indices                                6080
SAGE tags             TgSAGEDB (14)                                           6284
Proteomics            Jonathan Wastling Laboratory; John Murray Laboratory    2435
Epitopes              IEDB                                                    10
Metabolic pathways    KEGG Pathway                                            614

In addition, we have used the OrthoMCL algorithm to group genes from T. gondii with orthologous genes from 86 other eukaryotic and prokaryotic genomes (10). A mapping of immune epitopes identified in Toxoplasma provided by the Immune Epitope Database and Analysis Resource (IEDB) (11) has been integrated. Affymetrix probes mapped to the genome are visible in GBrowse, as are SNPs generated from nucmer alignments of sequences from the GT1, VEG and RH (Chr Ia and Ib) strains against the reference ME49 sequence. Two expression experiments utilizing a Toxoplasma Affymetrix array have also been deposited in ToxoDB. Users gain access to these new data types in record pages and by queries using the powerful query interface (see Data-Mining section).
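
Once a strain sequence has been aligned to the reference, identifying SNPs comes down to comparing alignment columns. The sketch below is a simplified illustration of that step only. It assumes the alignment is already available as two equal-length gapped strings, it does not reproduce nucmer or its output format, and `call_snps` is a hypothetical name.

```python
def call_snps(ref_aln, alt_aln):
    """List (ref_position, ref_base, alt_base) for mismatched columns of a
    gapped pairwise alignment.

    Positions are 1-based on the ungapped reference sequence.
    Assumption: '-' marks a gap in either aligned string.
    """
    if len(ref_aln) != len(alt_aln):
        raise ValueError("aligned sequences must have equal length")
    snps = []
    ref_pos = 0
    for r, a in zip(ref_aln.upper(), alt_aln.upper()):
        if r != "-":
            ref_pos += 1  # advance only on columns that consume a reference base
        # Skip indel columns and ambiguous bases; report substitutions only.
        if r in "ACGT" and a in "ACGT" and r != a:
            snps.append((ref_pos, r, a))
    return snps
```

Insertions and deletions are deliberately ignored here, since the text describes SNPs only.
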

Database architecture

As part of the complete restructuring of the ToxoDB resource, the practice of using flat files as a means of data storage was abandoned in early 2006. We now use GUS 3.5, and load data into an underlying Oracle database in a systematic fashion. GUS (Genomics Unified Schema) is an open source project (www.gusdb.org) with a rich relational schema covering sequence annotation, expression data and proteomics, using controlled vocabularies and ontologies (12). ToxoDB also employs the GUS WDK (Web Development Kit, www.gusdb.org/wdk) to access the database from the web, which has substantially improved the way the website operates. This transformation has considerably increased the functionality available to database users and conforms to the model used by all ApiDB projects, making it possible for us to generate future database releases in short cycles.
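
The advantage of a relational store over flat files is that different data types can be joined on shared keys or genomic coordinates. The following sketch illustrates the idea with Python's built-in `sqlite3` module standing in for Oracle. The two tables, their columns and all rows are invented placeholders for illustration and do not reflect the GUS schema or any ToxoDB data.

```python
import sqlite3

# In-memory database; table and column names are hypothetical, not GUS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gene (gene_id TEXT PRIMARY KEY, chromosome TEXT,
                       start_pos INTEGER, end_pos INTEGER);
    CREATE TABLE snp  (snp_id TEXT PRIMARY KEY, chromosome TEXT,
                       position INTEGER, strain TEXT);
""")
# Placeholder rows, invented purely for this example.
conn.executemany("INSERT INTO gene VALUES (?, ?, ?, ?)", [
    ("geneA", "Ia", 100, 500),
    ("geneB", "Ia", 900, 1500),
])
conn.executemany("INSERT INTO snp VALUES (?, ?, ?, ?)", [
    ("snp1", "Ia", 150, "GT1"),
    ("snp2", "Ia", 450, "VEG"),
    ("snp3", "Ia", 1000, "RH"),
    ("snp4", "Ib", 200, "GT1"),
])

def snp_counts_per_gene(connection):
    """Count SNPs falling inside each gene by joining on genomic location."""
    rows = connection.execute("""
        SELECT g.gene_id, COUNT(s.snp_id)
        FROM gene g
        LEFT JOIN snp s
          ON s.chromosome = g.chromosome
         AND s.position BETWEEN g.start_pos AND g.end_pos
        GROUP BY g.gene_id
        ORDER BY g.gene_id
    """).fetchall()
    return dict(rows)
```

With flat files, the same question would require custom parsing of each file type; here it is a single join, and further data types can be added as additional tables.
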

DATA-MINING TOOLS

ToxoDB currently provides 40 different queries of the data and several ancillary tools for analyzing, retrieving or viewing the data such as BLAST, Pathway Tools and an installation of the GMOD project Genome Browser (13). The ToxoDB ‘Query & Tools’ page has been restructured to make all queries available at a glance. Most of the individual queries have been reorganized into categories such as ‘Position’, ‘Expression’ and ‘Function’ to make them more intuitive to the average researcher. Enhanced functionality for the queries has also been added. For example, the ToxoDB keyword search has been significantly improved, offering the user control over which fields in the database are searched, including the official annotation, synonyms, user-supplied comments, domain names, BLAST similarities, etc. Many queries, such as ‘Find SNPs based on Gene ID’, now allow a gene ID list as input [either typed (or copied) by hand or uploaded from a file] facilitating analyses on large groups of genes. The results from all queries can be sorted based on various criteria (columns in the returned data set) and users can also add additional criteria for display (e.g. add columns to display protein features, GO annotation, expression characteristics for gene results, etc.) and sort on them as well. Once the appropriate selection of data types to display has been achieved, users can integrate these search results with other search results using the ‘Query History’ page, or the data can be downloaded in multiple formats for further analysis by the researcher (Figure 1).
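
The gene ID list input and column-based sorting described above can be sketched as follows. This is an illustrative approximation and not ToxoDB's implementation: the accepted separators (whitespace, commas, semicolons), the treatment of missing values and the function names are all assumptions.

```python
import re

def parse_gene_ids(text):
    """Split typed or uploaded text into unique gene IDs, keeping first-seen order.

    Assumption: IDs are separated by whitespace, commas or semicolons.
    """
    ids = [tok for tok in re.split(r"[\s,;]+", text.strip()) if tok]
    return list(dict.fromkeys(ids))  # drops duplicates, preserves order

def sort_results(rows, column, descending=False):
    """Sort result rows (dicts) on one displayed column.

    Rows that lack a value for the column are placed last.
    """
    present = [r for r in rows if r.get(column) is not None]
    missing = [r for r in rows if r.get(column) is None]
    return sorted(present, key=lambda r: r[column], reverse=descending) + missing
```

For an uploaded file, the same parser can be applied to the file's contents, for example `parse_gene_ids(open(path).read())`, where `path` is whatever file the user supplies.
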
Figure 1.

Screenshots showing the flow of a query in ToxoDB. From the Query & Tools page, users can go to particular queries for expression evidence (EST or Mass Spec Evidence), to the Results page where they can sort, manage (add or delete) columns of data and open gene pages. The Query History page permits users to manipulate previous queries including combining them and/or downloading the resulting data. Individual genes are listed on the Gene Results page and each gene has its own gene page, illustrated here by the gene encoding elongation factor 1-alpha. The gene page summarizes all information that is available for a gene including gene model predictions, SNPs, BLAST similarities, protein domains, ESTs, proteomic evidence of expression and microarray expression analyses.

ToxoDB uses the GBrowse genome browser (www.gmod.org) (13) to display gene models, EST alignments, SNPs, SAGE tags, etc. GBrowse enables visualization of the parasite genome and gene models, custom restriction-site identification and open reading frame identification, and facilitates download of data in various formats. Different data sets or analyses are displayed as individual tracks within the genome browser. There are approximately 50 GBrowse tracks available in the current version of ToxoDB. All genome sequences [ME49, GT1, VEG and RH (Chr Ia, Ib)] are also available in BLAST-searchable databases and for download in FASTA, GenBank and EMBL formats.

ToxoDB users may now register and log in to the site. Doing so enables a researcher to add comments to genes and genomic sequences. It also lets users save query results permanently. Queries in the Query History page can be organized (renamed or deleted) as well as combined with other results (Figure 1). This feature allows users to refine their results so that precise sets of genes can be discovered. The results may be downloaded using ToxoDB's improved reporting facility. It supports summary reports (Excel-compatible tab-delimited text), GFF, FASTA and a detailed report that includes almost all available data for each gene in the user's result table. Use of this facility, as well as many others on the site, is now described in short video tutorials that are accessible from the database home page.
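
Combining saved query results amounts to set operations on gene IDs, and a summary report is a tab-delimited table of the selected columns. The sketch below shows both steps under stated assumptions: the source says only that results can be combined, so the three operations offered here (intersect, union, subtract) are an assumption, and the function names and report layout are hypothetical.

```python
import csv
import io

def combine(result_a, result_b, operation):
    """Combine two collections of gene IDs, as a query history might.

    Assumption: 'intersect', 'union' and 'subtract' are the available operations.
    """
    a, b = set(result_a), set(result_b)
    if operation == "intersect":
        return a & b
    if operation == "union":
        return a | b
    if operation == "subtract":
        return a - b
    raise ValueError(f"unknown operation: {operation}")

def to_tab_delimited(gene_ids, annotations, columns):
    """Render a summary report as tab-delimited text, one gene per row.

    annotations maps gene ID -> {column name: value}; missing values are left blank.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["gene_id", *columns])
    for gene_id in sorted(gene_ids):
        record = annotations.get(gene_id, {})
        writer.writerow([gene_id, *(record.get(c, "") for c in columns)])
    return buffer.getvalue()
```

For example, intersecting a result set of genes with EST evidence and one with proteomic evidence yields only the genes supported by both, which can then be passed to `to_tab_delimited` for download.
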

FUTURE DIRECTIONS

The last two years were spent on major infrastructure and design elements for ToxoDB. Our future growth will be in the area of increased data acquisition and integration with existing and future data sets. Specifically, we are planning to load and integrate many expression data sets (RNA expression and protein expression) that are just becoming available. We also expect to load and integrate other array-based data sets such as ChIP on Chip and array CGH. As new data are added, we will be adding additional queries and tools to view these data. An area of significant future development will be improving the ability of users to compare the various different sequenced parasite strains visually and download sequence alignments between them.
  12 in total

1.  The generic genome browser: a building block for a model organism system database.

Authors:  Lincoln D Stein; Christopher Mungall; ShengQiang Shu; Michael Caudy; Marco Mangone; Allen Day; Elizabeth Nickerson; Jason E Stajich; Todd W Harris; Adrian Arva; Suzanna Lewis
Journal:  Genome Res       Date:  2002-10       Impact factor: 9.043

2.  Proteomic analysis of rhoptry organelles reveals many novel constituents for host-parasite interactions in Toxoplasma gondii.

Authors:  Peter J Bradley; Chris Ward; Stephen J Cheng; David L Alexander; Susan Coller; Graham H Coombs; Joe Dan Dunn; David J Ferguson; Sanya J Sanderson; Jonathan M Wastling; John C Boothroyd
Journal:  J Biol Chem       Date:  2005-07-07       Impact factor: 5.157

Review 3.  Toxoplasmic encephalitis in AIDS.

Authors:  B J Luft; J S Remington
Journal:  Clin Infect Dis       Date:  1992-08       Impact factor: 9.079

4.  ToxoDB: accessing the Toxoplasma gondii genome.

Authors:  Jessica C Kissinger; Bindu Gajria; Li Li; Ian T Paulsen; David S Roos
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

5.  PlasmoDB: the Plasmodium genome resource. A database integrating experimental and computational data.

Authors:  Amit Bahl; Brian Brunk; Jonathan Crabtree; Martin J Fraunholz; Bindu Gajria; Gregory R Grant; Hagai Ginsburg; Dinesh Gupta; Jessica C Kissinger; Philip Labo; Li Li; Matthew D Mailman; Arthur J Milgram; David S Pearson; David S Roos; Jonathan Schug; Christian J Stoeckert; Patricia Whetzel
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

6.  OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups.

Authors:  Feng Chen; Aaron J Mackey; Christian J Stoeckert; David S Roos
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

7.  The transcriptome of Toxoplasma gondii.

Authors:  Jay R Radke; Michael S Behnke; Aaron J Mackey; Josh B Radke; David S Roos; Michael W White
Journal:  BMC Biol       Date:  2005-12-02       Impact factor: 7.431

8.  New developments in the InterPro database.

Authors:  Nicola J Mulder; Rolf Apweiler; Teresa K Attwood; Amos Bairoch; Alex Bateman; David Binns; Peer Bork; Virginie Buillard; Lorenzo Cerutti; Richard Copley; Emmanuel Courcelle; Ujjwal Das; Louise Daugherty; Mark Dibley; Robert Finn; Wolfgang Fleischmann; Julian Gough; Daniel Haft; Nicolas Hulo; Sarah Hunter; Daniel Kahn; Alexander Kanapin; Anish Kejariwal; Alberto Labarga; Petra S Langendijk-Genevaux; David Lonsdale; Rodrigo Lopez; Ivica Letunic; Martin Madera; John Maslen; Craig McAnulla; Jennifer McDowall; Jaina Mistry; Alex Mitchell; Anastasia N Nikolskaya; Sandra Orchard; Christine Orengo; Robert Petryszak; Jeremy D Selengut; Christian J A Sigrist; Paul D Thomas; Franck Valentin; Derek Wilson; Cathy H Wu; Corin Yeats
Journal:  Nucleic Acids Res       Date:  2007-01       Impact factor: 16.971

9.  The immune epitope database and analysis resource: from vision to blueprint.

Authors:  Bjoern Peters; John Sidney; Phil Bourne; Huynh-Hoa Bui; Soeren Buus; Grace Doh; Ward Fleri; Mitch Kronenberg; Ralph Kubo; Ole Lund; David Nemazee; Julia V Ponomarenko; Muthu Sathiamurthy; Stephen Schoenberger; Scott Stewart; Pamela Surko; Scott Way; Steve Wilson; Alessandro Sette
Journal:  PLoS Biol       Date:  2005-03       Impact factor: 8.029

10.  Cytoskeletal components of an invasion machine--the apical complex of Toxoplasma gondii.

Authors:  Ke Hu; Jeff Johnson; Laurence Florens; Martin Fraunholz; Sapna Suravajjala; Camille DiLullo; John Yates; David S Roos; John M Murray
Journal:  PLoS Pathog       Date:  2006-02-24       Impact factor: 6.823

