Jon Duvick, Ann Fu, Usha Muppirala, Mukul Sabharwal, Matthew D Wilkerson, Carolyn J Lawrence, Carol Lushbough, Volker Brendel.
Abstract
PlantGDB (http://www.plantgdb.org/) is a genomics database encompassing sequence data for green plants (Viridiplantae). PlantGDB provides annotated transcript assemblies for >100 plant species, with transcripts mapped to their cognate genomic context where available, integrated with a variety of sequence analysis tools and web services. For 14 plant species with emerging or complete genome sequence, PlantGDB's genome browsers (xGDB) serve as a graphical interface for viewing, evaluating and annotating transcript and protein alignments to chromosome or bacterial artificial chromosome (BAC)-based genome assemblies. Annotation is facilitated by the integrated yrGATE module for community curation of gene models. Novel web services at PlantGDB include Tracembler, an iterative alignment tool that generates contigs from GenBank trace file data, and BioExtract Server, a web-based server for executing custom sequence analysis workflows. PlantGDB also hosts a plant genomics research outreach portal (PGROP) that facilitates access to a large number of resources for research and training.
Year: 2007 PMID: 18063570 PMCID: PMC2238959 DOI: 10.1093/nar/gkm1041
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Sequence resources and analytical tools available at PlantGDB
In this table, sequence resources are divided into four categories: Uploaded Sequence, Assembled Sequence, Genome Browsers, and Other Tools. For each resource in column 1, the species available, source/version, current sequence count, update frequency, tool/services, download options, and alignment to genome are shown in adjacent cells. Web links for both external and internal data/tool sources are indicated with superscript numbers and are listed at the end of the table under Web Resources. Sequence counts displayed here are as of 30 October 2007.
Figure 1. Database schema for PlantGDB, showing data sources, update frequency, computation and web services. PlantGDB is accessible at http://www.plantgdb.org, and genome browsers are accessible at http://www.plantgdb.org/XxGDB, where Xx stands for the first letters of the genus and species names (e.g. AtGDB = Arabidopsis thaliana genome database).
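The browser naming convention described in the Figure 1 legend can be illustrated with a minimal Python sketch. The function name `xgdb_url` is hypothetical and not part of PlantGDB. The sketch assumes that every browser follows the XxGDB pattern, and it does not check whether a browser actually exists for a given species.

```python
def xgdb_url(binomial: str, base: str = "http://www.plantgdb.org") -> str:
    """Build an xGDB browser URL from a 'Genus species' name.

    Hypothetical helper: it only applies the XxGDB naming pattern from the
    Figure 1 legend and does not query PlantGDB.
    """
    parts = binomial.split()
    if len(parts) < 2:
        raise ValueError("expected a binomial name such as 'Zea mays'")
    genus, species = parts[0], parts[1]
    # Xx = first letter of the genus (upper case) + first letter of the species (lower case)
    prefix = genus[0].upper() + species[0].lower()
    return f"{base}/{prefix}GDB"


if __name__ == "__main__":
    for name in ("Arabidopsis thaliana", "Zea mays"):
        print(name, "->", xgdb_url(name))
```

The two names used here correspond to AtGDB and ZmGDB, both of which are mentioned in the figure legends. Applying the pattern to any other species is a guess until it is checked against the PlantGDB site.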
Figure 2. Transcript assemblies (PUTs) at PlantGDB, grouped by taxonomic affiliation. Sequence totals displayed here are as of 30 October 2007. Parentheses indicate the number of species/subspecies per genus. Genera highlighted in yellow are associated with a genome browser at PlantGDB; an underscore indicates chromosome-based genome browsers. An asterisk designates genera for which PlantGDB provides preprocessed GeneSeqer indices for quick access to spliced alignments.
Figure 3. Screenshots from ZmGDB and yrGATE illustrate the use of online tools for gene discovery and community gene annotation. (A) A web-accessible table of Z. mays BACs (alternately shaded) displaying (left to right) the BAC GI and BAC clone name, followed by the ID, start/end coordinates and functional annotation of splice-aligned TIGR-predicted proteins from O. sativa, and finally the ZmGDB entry date. All fields are searchable and each row is linked via column 1 to a genome browser view of the BAC region. This table is currently updated daily at ZmGDB (http://www.plantgdb.org/ZmGDB/DisplayGeneAnn.php). (Similar tables are available for eight other BAC-based xGDB browsers.) Note that a region of BAC GI 156523432 is aligned to three paralogous rice predicted polypeptides, annotated as ‘autophagy-related protein 8 precursor’. Clicking on the BAC GI ‘156523432’ in table column 1 (circled) brings up a BAC/Clone Context View of the specified region (B), showing spliced alignments to the rice predicted polypeptides (black), along with other alignment data, in this case maize cDNAs (blue) and maize ESTs (red). Note the evidence for alternative splicing among the maize ESTs (circles), suggesting at least two alternate transcripts (labeled 1 and 2). The user has the option to explore and annotate this variation using yrGATE. (C) Launching the yrGATE annotation tool displays a scrolling list of evidence scores and supporting exons for all exon coordinates at a locus (alternative splice coordinates for 1 and 2 are circled). The user can build a complete gene model on screen by selecting each desired exon and then compare the resulting open reading frame to known proteins using BLAST (data not shown). (D) The chosen gene model is displayed graphically and will be published on the ZmGDB browser following curation by PlantGDB staff. Shown here are yrGATE models for the two putative splice variants, with translation start/stop positions indicated by triangles. (E) Predicted protein sequence for the two yrGATE gene models. This example illustrates how xGDB and yrGATE can be used to identify and publish gene model predictions quickly and easily, enhancing the community genome knowledge base for maize as well as facilitating hypothesis-driven research.