Michiel Van Bel1,2, Francesca Silvestri1,2, Eric M Weitz3, Lukasz Kreft4, Alexander Botzki5, Frederik Coppens1,2, Klaas Vandepoele1,2,6.
Abstract
PLAZA is a platform for comparative, evolutionary, and functional plant genomics. It makes a broad set of genomes, data types and analysis tools available to researchers through a user-friendly website, an API, and bulk downloads. In this latest release of the PLAZA platform, we are integrating a record number of 134 high-quality plant genomes, split up over two instances: PLAZA Dicots 5.0 and PLAZA Monocots 5.0. This number of genomes corresponds with a massive expansion in the number of available species when compared to PLAZA 4.0, which offered access to 71 species, an 89% overall increase. The PLAZA 5.0 release contains information for 5 882 730 genes, and offers pre-computed gene families and phylogenetic trees for 5 274 684 protein-coding genes. This latest release also comes with a set of new and updated features: a new BED import functionality for the workbench, improved interactive visualizations for functional enrichments and genome-wide mapping of gene sets, and a fully redesigned and extended API. Taken together, this new version offers extended support for plant biologists working on different families within the green plant lineage and provides an efficient and versatile toolbox for plant genomics. All PLAZA releases are accessible from the portal website: https://bioinformatics.psb.ugent.be/plaza/.
Year: 2022 PMID: 34747486 PMCID: PMC8728282 DOI: 10.1093/nar/gkab1024
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Species overview PLAZA 5.0. Overview of the species (yellow bars), taxonomic families (blue bars), and taxonomic orders (leaf nodes in the tree) in PLAZA 5.0 offset against the same content types in PLAZA 4.0. Bars that are solely to the left of the center indicate that there is no increase in PLAZA 5.0, while bars that are solely to the right of the center indicate that this order was not previously present in PLAZA 4.0. Bars with content to both sides of the center indicate that the order has been extended by the addition of new species and/or families in PLAZA 5.0.
Overview of the functional annotation data content in PLAZA 5.0. The MapMan annotations with MapMan term ‘35’ (‘Not assigned’), or its hierarchical descendants, were excluded. The GO annotations with root-level GO terms (‘biological_process’, ‘cellular_component’, ‘molecular_function’) were excluded. Primary GO sources are: the genome project data provider, the GOA project, and data from GeneOntology. Empirical GO evidence types consist of experimental GO evidence types (EXP, IDA, IPI, IMP, IGI, IEP, HTP, HDA, HMP, HGI, HEP), and traceable/curated author statements (IC, TAS)
| | | PLAZA Dicots 5.0 | PLAZA Monocots 5.0 |
|---|---|---|---|
| Genes | All gene types | 4 234 318 | 2 254 715 |
| | Protein coding genes | 3 667 693 | 2 165 730 |
| Genes with InterPro annotations | | 2 808 722 (76.6%) | 1 636 297 (75.5%) |
| Genes with MapMan annotations | | 1 578 937 (43%) | 888 280 (41%) |
| Genes with GO annotations | All data sources | 2 563 423 (69.9%) | 1 515 180 (70%) |
| | Primary sources only | 2 114 497 (57.6%) | 1 242 612 (57.4%) |
| | PLAZA GO projection | 1 573 218 (42.9%) | 918 149 (42.4%) |
| | Empirical evidences only | 76 230 (2%) | 65 860 (3%) |
Figure 2. BED import and improved functional enrichment visualization. (A) After selecting the target species and gene types to be included (Zea mays B73, NAM v5.0), and uploading an example BED file (maize GLK2 ChIP-Seq (26)), each region is associated with the closest gene. Next, an interactive peak-to-gene distance distribution for all peaks and their associated closest genes is shown. A distance cutoff can be applied to retain all genes, or only regions overlapping with the gene body, or within a predefined distance to the closest gene. Applying a 2 kb cutoff retained 3758 genes in this workbench experiment. (B) Overview of a GO enrichment analysis for maize GLK2 ChIP-Seq target genes. Through the Graph visualization, the Filtered view discards parental GO terms and reports the most specific GO terms showing significant enrichment. While node size is proportional to the P-value of the enriched term, the node color is determined by the enrichment fold of the ontology term, with the color scale varying between green and purple (high and low values, depicting over-representation and depletion, respectively). The node outer band is determined by the percentage of genes that are annotated with the enriched ontology term (e.g. the selected node GO:0015979 reports that 2.50% of the genes in this workbench experiment are annotated with the enriched GO Biological Process term photosynthesis).
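The peak-to-gene association described in panel (A) can be illustrated with a minimal Python sketch. It is not PLAZA's implementation, which is not described here. All names (`Interval`, `parse_bed`, `gap`, `assign_peaks`, `retained_genes`) and the toy coordinates are hypothetical. The sketch assumes that both regions and genes use 0-based half-open coordinates, that distance is measured to the gene body regardless of strand, and that ties go to the first gene listed.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)
    name: str


def parse_bed(text: str) -> List[Interval]:
    """Parse minimal BED text: chrom, start, end and an optional name column."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines, comments and header lines
        fields = line.split()
        name = fields[3] if len(fields) > 3 else f"{fields[0]}:{fields[1]}-{fields[2]}"
        regions.append(Interval(fields[0], int(fields[1]), int(fields[2]), name))
    return regions


def gap(a: Interval, b: Interval) -> int:
    """Distance in bp between two intervals; 0 if they overlap or touch."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def assign_peaks(peaks: List[Interval],
                 genes: List[Interval]) -> List[Tuple[str, str, int]]:
    """Associate each peak with its closest gene on the same chromosome."""
    pairs = []
    for peak in peaks:
        candidates = [g for g in genes if g.chrom == peak.chrom]
        if not candidates:
            continue  # no gene on this chromosome, peak stays unassigned
        closest = min(candidates, key=lambda g: gap(peak, g))
        pairs.append((peak.name, closest.name, gap(peak, closest)))
    return pairs


def retained_genes(pairs: List[Tuple[str, str, int]],
                   max_distance: Optional[int] = None) -> Set[str]:
    """Apply a distance cutoff: None keeps all genes, 0 keeps gene-body overlaps."""
    return {gene for _, gene, dist in pairs
            if max_distance is None or dist <= max_distance}


# Toy data (hypothetical coordinates, for illustration only).
genes = [
    Interval("chr1", 1000, 2000, "geneA"),
    Interval("chr1", 10000, 12000, "geneB"),
    Interval("chr2", 500, 900, "geneC"),
]
peaks = parse_bed("chr1\t1500\t1600\nchr1\t2500\t2600\n"
                  "chr1\t9000\t9100\nchr2\t5000\t5100\n")
pairs = assign_peaks(peaks, genes)
genes_within_2kb = retained_genes(pairs, max_distance=2000)
```

The list of distances in `pairs` corresponds to the data behind a peak-to-gene distance distribution, and `retained_genes` mirrors the three cutoff modes named in the caption. The linear scan per peak is chosen for readability; a real data set of this size would more plausibly use sorted coordinates or an interval index.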