
PlasmoDB: a functional genomic database for malaria parasites.

Cristina Aurrecoechea, John Brestelli, Brian P Brunk, Jennifer Dommer, Steve Fischer, Bindu Gajria, Xin Gao, Alan Gingle, Greg Grant, Omar S Harb, Mark Heiges, Frank Innamorato, John Iodice, Jessica C Kissinger, Eileen Kraemer, Wei Li, John A Miller, Vishal Nayak, Cary Pennington, Deborah F Pinney, David S Roos, Chris Ross, Christian J Stoeckert, Charles Treatman, Haiming Wang.

Abstract

PlasmoDB (http://PlasmoDB.org) is a functional genomic database for Plasmodium spp. that provides a resource for data analysis and visualization on a gene-by-gene or genome-wide scale. PlasmoDB belongs to a family of genomic resources that are housed under the EuPathDB (http://EuPathDB.org) Bioinformatics Resource Center (BRC) umbrella. The latest release, PlasmoDB 5.5, contains numerous new data types from several broad categories: annotated genomes, evidence of transcription, proteomics evidence, protein function evidence, population biology and evolution. Data in PlasmoDB can be queried by selecting the data of interest from a query grid or drop-down menus. Results from different queries can then be combined with each other on the query history page. Search results can be downloaded with associated functional data, and registered users can store their query history for future retrieval or analysis.

Year: 2008; PMID: 18957442; PMCID: PMC2686598; DOI: 10.1093/nar/gkn814

Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 16.971


INTRODUCTION

Plasmodium spp. are obligate intracellular protozoan parasites of humans and animals, and are the causative agents of malaria. Transmission of these parasites to humans occurs via the Anopheles mosquito vector, and the geographic distribution of endemic regions puts almost half of the world's population at risk of contracting malaria. The disease is a major source of morbidity and mortality worldwide, causing 300–500 million clinical cases and 1–2 million deaths annually (1,2). While several species of Plasmodium cause disease in humans (including P. vivax, P. malariae, P. ovale and P. knowlesi), P. falciparum is by far the deadliest (1,3). The life cycle of the Plasmodium parasite takes it through multiple cell types (in the vertebrate host and arthropod vector), during which the parasite undergoes multiple developmental changes (both sexual and asexual). The different life-cycle stages are marked by specific genomic, transcriptomic, proteomic and metabolomic states. Understanding how these changes are triggered and orchestrated requires mechanisms to view and interrogate genomic and functional genomic data in a powerful and intuitive manner. Over the past 10 years, PlasmoDB has evolved into a venue that integrates such data and allows users to perform complex queries tailored to their specific needs and interests.

UPDATED DATA CONTENT

The data available in PlasmoDB have expanded to include genomic and functional data from eight Plasmodium species and are summarized in Table 1 (4). The current release (PlasmoDB 5.5) contains fully sequenced and annotated genomes of P. falciparum, P. vivax, P. yoelii, P. berghei, P. chabaudi and P. knowlesi. Importantly, PlasmoDB 5.5 contains results of annotation efforts from multiple sources, including the ongoing systematic effort to update the P. falciparum genome annotation, which began at a workshop in late 2007 co-organized by the Wellcome Trust Sanger Institute (WTSI) and EuPathDB (formerly ApiDB) teams. Reannotation data have been released in incremental steps (snapshots) in order to provide timely information to users of PlasmoDB and to solicit user comments regarding the reannotations.

Table 1. Types of data available in PlasmoDB and example queries

| Type of data | Species for which these data are available | Example query |
|---|---|---|
| Genomic data | | |
| Full sequence and annotation | P. falciparum, P. vivax, P. yoelii, P. berghei, P. chabaudi, P. knowlesi | Search annotations for specific keyword (see Figure 1C). |
| Sequence only | P. reichenowi, P. gallinaceum | Find sequence similarity using BLAST. |
| Transcript expression data | | |
| Microarray | P. falciparum, P. berghei, P. yoelii | Identify genes expressed at specific life-cycle stages. |
| EST | P. falciparum, P. vivax, P. berghei, P. yoelii | Confirm gene models and alternative gene models. |
| SAGE | P. falciparum | Identify genes with transcript evidence. |
| Protein expression data | P. falciparum, P. berghei, P. yoelii | Identify genes with protein expression evidence at specific life-cycle stages. |
| Population biology | | |
| SNP; microsatellite; isolate data | P. falciparum | Find highly polymorphic genes or distinguish isolates based on their SNP profile. |
| Protein interaction | | |
| Yeast two-hybrid; interactome map | P. falciparum | Identify possible interaction partners of a gene of interest. |
| Putative function | | |
| GO annotation | P. falciparum, P. vivax, P. yoelii, P. berghei, P. chabaudi, P. knowlesi | Identify genes that have GO annotations. |
| EC numbers | P. falciparum, P. yoelii, P. knowlesi | Identify genes with enzymatic annotations. |
| Metabolic pathways | P. falciparum | Identify parasite-specific or missing metabolic pathways. |
| Evolutionary | | |
| Orthology based | P. falciparum, P. vivax, P. yoelii, P. berghei, P. chabaudi, P. knowlesi | Identify genes specific to apicomplexa. |
| Homology based | P. falciparum and P. yoelii | Identify homologs of a gene or list of genes of interest. |
| Protein features | | |
| Protein motifs; InterPro/Pfam domains; molecular weight; isoelectric point; protein structure; immune epitopes | P. falciparum, P. vivax, P. yoelii, P. berghei, P. chabaudi, P. knowlesi | Identify genes with specific protein attributes. |
| Protein localization | | |
| Signal peptide; transmembrane domains; targeting to the RBC | P. falciparum, P. vivax, P. yoelii, P. berghei, P. chabaudi, P. knowlesi | Identify genes targeted to the host cell. |
| Apicoplast targeting | P. falciparum | Identify genes targeted to the apicoplast. |

Transcript expression data [microarray, expressed sequence tags (ESTs) and serial analysis of gene expression (SAGE)] available through PlasmoDB have expanded dramatically over the past few releases to include microarray data from multiple life-cycle stages, gene knock-out mutants of P. falciparum and P. berghei (5–12) and multiple stages of P. yoelii (mosquito, erythrocytic and liver stages) (13). Also included are EST data from over 130 libraries (P. falciparum, P. vivax, P. berghei and P. yoelii) (14,15) [dbEST (http://www.ncbi.nlm.nih.gov/dbEST/)] and SAGE data (P. falciparum only) (16–18). Protein expression evidence includes data from various life-cycle stages (P. falciparum, P. berghei and P. yoelii) (11,13,19–21; Leiden Malaria Group, unpublished data). Population biology evidence (P. falciparum only) includes mapping of microsatellite data (22) onto the genome (available as a genome browser track), single nucleotide polymorphism (SNP) data from resequencing efforts of more than 20 P. falciparum strains (P. reichenowi is included as an out-group for comparison purposes) and data from nearly 100 P. falciparum isolates (23–25). OrthoMCL analyses provide ortholog determinations between the different species, facilitating discovery of shared genes between lineages (26). Protein function assignments are aided by a number of additional functional data types available through PlasmoDB 5.5, including evidence of protein–protein interaction (yeast two-hybrid and predicted interactome) (27,28), Gene Ontology (GO) (29) and InterPro domain (30) annotations for P. falciparum, P. vivax, P. berghei, P. yoelii, P. knowlesi and P. chabaudi, Enzyme Commission (EC) number (29) annotation for P. falciparum, P. yoelii and P. knowlesi (31) and metabolic pathway assignments for P. falciparum (31). In addition, subcellular localization of proteins is available through signal peptide (32) and transmembrane domain predictions (33) for P. falciparum, P. vivax, P. berghei, P. yoelii, P. knowlesi and P. chabaudi, and parasite-specific predictions (P. falciparum only) for apicoplast localization (34) and export to the host cell (35–37).

HOW TO USE PLASMODB

A visitor to PlasmoDB can use the database in two general ways: (i) to retrieve all available information associated with a particular gene of interest using a search for an exact gene ID, gene name or gene product name; (ii) to ask single questions (Table 1) and/or conduct a series of searches and then refine the results by combining them or subtracting them from one another. Starting with the PlasmoDB home page (Figure 1A), a user can perform a quick search by entering an identifier or text term, or select a specific query from a number of drop-down menus (data not shown). Alternatively, queries may be accessed by visiting the ‘Queries and Tools’ section of PlasmoDB (Figure 1A), which includes a grid displaying all available queries/searches. By using the queries and tools, a user can interrogate data in PlasmoDB; the third column of Table 1 lists example data-specific questions that are available.
Figure 1.

Screenshots from PlasmoDB 5.5 and query workflow. (A) The top of the screenshot shows the PlasmoDB logo. On the left side are links to various sections of PlasmoDB and a point for logging in or registering as a user (not required for using the site but useful for storing search histories). The query grid is in the center and provides an access point to all searchable data in PlasmoDB. (B) A scheme of a workflow that a user may follow when building a set of queries. Beginning at the left, queries can be performed starting from the query grid and the results can be joined using operations available through the query history page. (C) Screenshots of a ‘key word’ search page, an example gene query history and a gene results page. Note the add column feature in the results page, which allows the addition of columns with additional data, and the ability to sort results.

When conducting queries with the purpose of combining results, it may be useful to visualize the searches in a workflow environment where nodes are connected using different criteria (‘and’, ‘or’, ‘not’) (Figure 1B). In PlasmoDB this is accomplished by performing a number of queries and subsequently combining the results in the ‘query history’ section (Figure 1C, middle screenshot). For example, one may be interested in identifying a short list of possible vaccine candidates. One possible way of accomplishing this is to start by identifying all proteins predicted to be exported to the host cell in P. falciparum. There are three exported protein datasets in PlasmoDB, and a union (‘or’ function) of all three results retrieves 405 genes (Figure 1B, Steps 1 and 2). To restrict this list further, intersecting (‘and’ function) these results with genes that have no orthologs in mammals reduces the results to 321 genes (Figure 1B, Step 3). Next, a user may further prune this list by intersecting the results with other queries, such as genes that are nonpolymorphic between a chloroquine-sensitive strain (3D7) and a chloroquine-resistant strain (Dd2). This cuts the number of candidates to 32 genes (Figure 1B, Step 4 and Figure 1C, right screenshot). Alternatively, one may be interested in the genes that have protein expression evidence in a particular stage of the parasite's life cycle (intersecting with genes that have proteomic evidence in gametocytes yields 27 genes). Finally, examination of the list reveals several genes encoding rifins (a family of clonally variant proteins expressed on the surface of infected red blood cells) (38). A user who wishes to investigate genes other than rifins can exclude (‘not’ operation) the results of a keyword query using the term ‘rifin’ (Figure 1B, Step 5 and Figure 1C, leftmost panel). A user may examine the specific gene pages for more gene-specific details, download results with their associated data or log in (if they have not done so already) to ensure that their search strategy is saved for future examination.
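
The ‘or’, ‘and’ and ‘not’ operations in the query history behave like set union, intersection and difference on lists of gene IDs. The following Python sketch mirrors the five-step vaccine-candidate workflow on toy data. It is an illustration only, not PlasmoDB's implementation or API: the variable names, the helper functions and the gene IDs (gene01, gene02, ...) are hypothetical placeholders, and the result sizes do not correspond to the counts reported above.

```python
# Toy illustration of combining query results with set operations.
# All gene IDs and result sets below are hypothetical placeholders,
# not real PlasmoDB identifiers or query results.

def union(*gene_sets):
    """'or': genes present in at least one result set."""
    return set().union(*gene_sets)

def intersect(a, b):
    """'and': genes present in both result sets."""
    return a & b

def subtract(a, b):
    """'not': genes in the first result set but not in the second."""
    return a - b

# Step 1: three (hypothetical) exported-protein query results.
exported_a = {"gene01", "gene02", "gene03"}
exported_b = {"gene03", "gene04"}
exported_c = {"gene05"}

# Other (hypothetical) query results used to refine the list.
no_mammal_orthologs = {"gene01", "gene02", "gene03", "gene04", "gene09"}
nonpolymorphic_3d7_dd2 = {"gene02", "gene03", "gene04"}
keyword_rifin = {"gene04"}

step2 = union(exported_a, exported_b, exported_c)   # 'or' of the exported sets
step3 = intersect(step2, no_mammal_orthologs)       # 'and' no mammalian orthologs
step4 = intersect(step3, nonpolymorphic_3d7_dd2)    # 'and' nonpolymorphic genes
step5 = subtract(step4, keyword_rifin)              # 'not' keyword 'rifin'

print(sorted(step5))  # ['gene02', 'gene03']
```

Because each step takes a result set and returns a new one, intermediate results such as step3 remain available for alternative branches, for example intersecting with a proteomics query instead of the polymorphism query.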

FUTURE DIRECTIONS

It is expected that PlasmoDB will continue its data content and tool expansion as user needs require. We anticipate the incorporation of multiple new data sets including microarray, proteomic and specific parasite isolate data. Additionally, over the next few years we look forward to incorporating sequence data from a dramatically expanded Plasmodium spp. sequencing effort (http://www.genome.gov/26525388). In the coming year, we will also release a new user interface that will include a workflow-based search strategy page, similar to what is shown in Figure 1B, which we anticipate will provide a more biologically intuitive and dynamic experience for scientists accessing PlasmoDB and other EuPathDB sites.

FUNDING

Federal funds from the National Institute of Allergy and Infectious Diseases; National Institutes of Health; Department of Health and Human Services, under Contract No. HHSN266200400037C. Funding to pay the Open Access publication charges for this article was provided by this contract. Conflict of interest statement. None declared.
  38 in total

1.  Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes.

Authors:  A Krogh; B Larsson; G von Heijne; E L Sonnhammer
Journal:  J Mol Biol       Date:  2001-01-19       Impact factor: 5.469

2.  Improved prediction of signal peptides: SignalP 3.0.

Authors:  Jannick Dyrløv Bendtsen; Henrik Nielsen; Gunnar von Heijne; Søren Brunak
Journal:  J Mol Biol       Date:  2004-07-16       Impact factor: 5.469

3.  Serial analysis of gene expression in Plasmodium falciparum reveals the global expression profile of erythrocytic stages and the presence of anti-sense transcripts in the malarial parasite.

Authors:  S Patankar; A Munasinghe; A Shoaibi; L M Cummings; D F Wirth
Journal:  Mol Biol Cell       Date:  2001-10       Impact factor: 4.138

4.  Widespread distribution of antisense transcripts in the Plasmodium falciparum genome.

Authors:  Anusha Munasinghe Gunasekera; Swati Patankar; Jonathan Schug; Geoffrey Eisen; Jessica Kissinger; David Roos; Dyann F Wirth
Journal:  Mol Biochem Parasitol       Date:  2004-07       Impact factor: 1.759

5.  A large focus of naturally acquired Plasmodium knowlesi infections in human beings.

Authors:  Balbir Singh; Lee Kim Sung; Asmad Matusop; Anand Radhakrishnan; Sunita S G Shamsul; Janet Cox-Singh; Alan Thomas; David J Conway
Journal:  Lancet       Date:  2004-03-27       Impact factor: 79.321

6.  PlasmoDB: the Plasmodium genome resource. A database integrating experimental and computational data.

Authors:  Amit Bahl; Brian Brunk; Jonathan Crabtree; Martin J Fraunholz; Bindu Gajria; Gregory R Grant; Hagai Ginsburg; Dinesh Gupta; Jessica C Kissinger; Philip Labo; Li Li; Matthew D Mailman; Arthur J Milgram; David S Pearson; David S Roos; Jonathan Schug; Christian J Stoeckert; Patricia Whetzel
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

7.  The transcriptome of the intraerythrocytic developmental cycle of Plasmodium falciparum.

Authors:  Zbynek Bozdech; Manuel Llinás; Brian Lee Pulliam; Edith D Wong; Jingchun Zhu; Joseph L DeRisi
Journal:  PLoS Biol       Date:  2003-08-18       Impact factor: 8.029

8.  Dissecting apicoplast targeting in the malaria parasite Plasmodium falciparum.

Authors:  Bernardo J Foth; Stuart A Ralph; Christopher J Tonkin; Nicole S Struck; Martin Fraunholz; David S Roos; Alan F Cowman; Geoffrey I McFadden
Journal:  Science       Date:  2003-01-31       Impact factor: 47.728

9.  Drug-induced alterations in gene expression of the asexual blood forms of Plasmodium falciparum.

Authors:  Anusha Munasinghe Gunasekera; Swati Patankar; Jonathan Schug; Geoffrey Eisen; Dyann F Wirth
Journal:  Mol Microbiol       Date:  2003-11       Impact factor: 3.501

10.  Discovery of gene function by expression profiling of the malaria parasite life cycle.

Authors:  Karine G Le Roch; Yingyao Zhou; Peter L Blair; Muni Grainger; J Kathleen Moch; J David Haynes; Patricia De La Vega; Anthony A Holder; Serge Batalov; Daniel J Carucci; Elizabeth A Winzeler
Journal:  Science       Date:  2003-07-31       Impact factor: 47.728

