Jaime Huerta-Cepas, Salvador Capella-Gutiérrez, Leszek P Pryszcz, Marina Marcet-Houben, Toni Gabaldón.
Abstract
Phylogenetic trees representing the evolutionary relationships of homologous genes are the entry point for many evolutionary analyses. For instance, the use of a phylogenetic tree can aid in the inference of orthology and paralogy relationships, and in the detection of relevant evolutionary events such as gene family expansions and contractions, horizontal gene transfer, recombination or incomplete lineage sorting. Similarly, given the plurality of evolutionary histories among genes encoded in a given genome, there is a need for the combined analysis of genome-wide collections of phylogenetic trees (phylomes). Here, we introduce a new release of PhylomeDB (http://phylomedb.org), a public repository of phylomes. Currently, PhylomeDB hosts 120 public phylomes, comprising >1.5 million maximum likelihood trees and multiple sequence alignments. In the current release, phylogenetic trees are annotated with taxonomic, protein-domain arrangement, functional and evolutionary information. PhylomeDB is also a major source for phylogeny-based predictions of orthology and paralogy, covering >10 million proteins across 1059 sequenced species. Here we describe newly implemented PhylomeDB features, and discuss a benchmark of the orthology predictions provided by the database, the impact of proteome updates and the use of the phylome approach in the analysis of newly sequenced genomes and transcriptomes.
Year: 2013 PMID: 24275491 PMCID: PMC3964985 DOI: 10.1093/nar/gkt1177
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
List of query terms supported by the PhylomeDB web API
| URL query term | Value |
|---|---|
| Seqid | Any sequence identifier (e.g. UniProt ID, Ensembl ID). Required. |
| Phyid | A phylome ID (e.g. 102), a comma-separated list of phylome IDs or a collection ID (e.g. PhyC1). By default, a tree from the most recent phylome is selected. |
| Method | The preferred evolutionary model for the target tree. Default: best model. |
| Snode | A comma-separated list of target nodes, each defined as node_feature\|search_pattern\|fgcolor\|bgcolor, where node_feature is one of: name (leaf names as shown in the tips of the tree, e.g. TP53); phylomedb_name (PhylomeDB ID format, e.g. Phy00086SJ); gene_name (original ID used in the source proteome, e.g. ORF_1); swissprot_name (a Swiss-Prot ID, e.g. P04637); trembl_name (a TrEMBL ID, e.g. K7PPA8); ensembl_name (any Ensembl protein, transcript or gene ID, e.g. ENSP00000269305); genolevures_name (an ID from the Ascomycete-based Genolevures database); taxid (an NCBI taxon ID, e.g. 9606); species (UniProt species code, e.g. HUMAN); spname (scientific name, e.g. sapiens); relative_age (any of the tracked NCBI taxon names, e.g. Primates). Example: |
| Tree_features | A comma-separated list of tree features to be shown. The following features are currently supported: best_name, name, gene_name, swissprot_name, trembl_name, ensembl_name, genolevures_name, taxid, spname, lineage, motifs and support. |
| Example: |
Only seqid is required to perform a query.
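The query terms in the table can be combined into a single URL query string. The following is a minimal Python sketch of how such a query might be assembled, not the database's documented client. Several details are assumptions: the endpoint path in `BASE_URL` is a placeholder (the source gives only http://phylomedb.org), the helper names `snode` and `build_query` are hypothetical, the parameter names are written in lower case to match the note above, and the color values are illustrative because the source does not specify a color format.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the source does not state the API path.
BASE_URL = "http://phylomedb.org/API_PATH_PLACEHOLDER"


def snode(node_feature, search_pattern, fgcolor, bgcolor):
    """Build one snode entry: node_feature|search_pattern|fgcolor|bgcolor."""
    return "|".join([node_feature, search_pattern, fgcolor, bgcolor])


def build_query(seqid, phyid=None, method=None, snodes=(), tree_features=()):
    """Assemble a query URL; only seqid is required."""
    if not seqid:
        raise ValueError("seqid is required")
    params = {"seqid": seqid}
    if phyid is not None:
        params["phyid"] = str(phyid)
    if method is not None:
        params["method"] = method
    if snodes:
        # Multiple target nodes are joined with commas.
        params["snode"] = ",".join(snodes)
    if tree_features:
        params["tree_features"] = ",".join(tree_features)
    # urlencode percent-encodes reserved characters such as "|" and ",".
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_query(
        "P04637",
        phyid=102,
        snodes=[snode("species", "HUMAN", "red", "white")],  # colors are assumed values
        tree_features=["best_name", "taxid", "support"],
    )
    print(url)
```

Omitting `phyid` leaves the choice of tree to the server, which according to the table defaults to the most recent phylome.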
Figure 1. Example of the integrated tree visualization interface showing the gene family phylogeny of TP53. (a) The tree search panel allows switching among all available trees containing the target sequence, even if it was not used as a seed (i.e. collateral tree). (b) The tree editing menu allows searching for nodes matching custom criteria, selecting which tree features are shown in the image and downloading the image or other data. (c) Poorly supported nodes are highlighted with a transparent bubble, and speciation and duplication events are indicated using red and blue colors, respectively. (d) A taxonomy panel indicating the assignment of different partitions to major taxonomic levels. The taxonomic level associated with each color is shown on mouse-over. (e) Domain and sequence panel. PFAM motifs are represented by different shapes and can be clicked for extended information. Inter-domain coding regions are shown using the standard amino acid color codes. Gap regions are illustrated as a flat line. (f) Available tree features. One or more attributes can be selected to modify the default appearance of the tree image. (g) The tree legend indicating the color codes of the different tree nodes. (h) The search panel allows searching for nodes that match custom criteria on a number of node attributes. In the example shown, a node containing the P53_C domain has been highlighted through the use of this panel. (i) The contextual node menu, including extended information about a node and links to external data sources.