Graham L Cromar, Anthony Zhao, Xuejian Xiong, Lakshmipuram S Swapna, Noeleen Loughran, Hongyan Song, John Parkinson.
Abstract
PhyloPro is a database and accompanying web-based application for the construction and exploration of phylogenetic profiles across the Eukarya. In this update article, we present six major new developments in PhyloPro: (i) integration of Pfam-A domain predictions for all proteins; (ii) new summary heatmaps and detailed level views of domain conservation; (iii) an interactive, network-based visualization tool for exploration of domain architectures and their conservation; (iv) ability to browse based on protein functional categories (GOSlim); (v) improvements to the web interface to enhance drill down capability from the heatmap view; and (vi) improved coverage including 164 eukaryotes and 12 reference species. In addition, we provide improved support for downloading data and images in a variety of formats. Among the existing tools available for phylogenetic profiles, PhyloPro provides several innovative domain-based features including a novel domain adjacency visualization tool. These are designed to allow the user to identify and compare proteins with similar domain architectures across species and thus develop hypotheses about the evolution of lineage-specific trajectories.
Database URL: http://www.compsysbio.org/phylopro/
Year: 2016 PMID: 26980519 PMCID: PMC4792532 DOI: 10.1093/database/baw013
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
List of reference species
| No. | Common name | Scientific name | Source (Date) |
|---|---|---|---|
| 1 | Thale cress | | PlantGDB: v.173 (26/08/09) |
| 2 | Baker's yeast | | SGD: (12/12/07) |
| 3 | Roundworm | | WormBase: WS205 (30/07/09) |
| 4 | Fruit fly | | FlyBase: v.1.3 (25/06/09) |
| 5 | House mouse | | ENSEMBL: (23/11/07) |
| 6 | Human | | ENSEMBL: (23/11/07) |
| 7 | Malarial parasite | | PlasmoDB: v.5.4 (24/09/07) |
| 8 | Toxoplasma parasite | | ToxoDB: v.4.3 (01/11/07) |
| 9 | Zebrafish | | ENSEMBL: (23/11/07) |
| 10 | Fission yeast | | SANGER: (11/05/06) |
| 11 | Leishmania parasite | | EMBL: (24/12/07) |
| 12 | Blood fluke | | ENSEMBL: (31/07/14) |
Figure 1. Protein and domain conservation views. (A) Conservation of proteins corresponding to the GOSlim category, ‘Anatomical structure formation involved in morphogenesis’. Colored tiles indicate the presence (color) or absence (black) of an ortholog of the reference organism (in this case human) in a given target species. Species are indicated across the top, grouped by phylogeny with plants on the left. Proteins are indicated in rows on the left, clustered so that proteins with similar patterns of conservation are grouped together. The sequence of a selected human reference protein (SLIT2) and its mouse ortholog are also shown (inset). (B) Domain architecture conservation corresponding to the same group of proteins as in (A) above. Tile colors reflect the comparison between the reference and target domain architectures. The corresponding architectures for SLIT2 are shown (inset). Note that gene order is determined by clustering and is independent between views.
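The presence/absence view described in this caption can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the protein and species labels are hypothetical placeholders, the ortholog sets are invented, and the greedy Hamming-distance ordering merely stands in for whatever clustering PhyloPro actually applies.

```python
from typing import Dict, List, Set


def build_profiles(orthologs: Dict[str, Set[str]],
                   species: List[str]) -> Dict[str, List[int]]:
    """One 0/1 row per reference protein: 1 if an ortholog exists in that species."""
    return {protein: [1 if s in found else 0 for s in species]
            for protein, found in orthologs.items()}


def hamming(a: List[int], b: List[int]) -> int:
    """Number of species in which two profiles disagree."""
    return sum(x != y for x, y in zip(a, b))


def order_by_similarity(profiles: Dict[str, List[int]]) -> List[str]:
    """Greedy nearest-neighbour ordering so that similar rows end up adjacent.

    This is a stand-in for hierarchical clustering, not PhyloPro's method.
    Ties are broken alphabetically to keep the result deterministic.
    """
    remaining = sorted(profiles)
    if not remaining:
        return []
    ordered = [remaining.pop(0)]
    while remaining:
        last = profiles[ordered[-1]]
        nxt = min(remaining, key=lambda p: (hamming(last, profiles[p]), p))
        remaining.remove(nxt)
        ordered.append(nxt)
    return ordered


# Hypothetical toy data (not taken from PhyloPro).
species = ["sp_plant", "sp_yeast", "sp_fly", "sp_mouse"]
orthologs = {
    "P1": {"sp_plant", "sp_yeast", "sp_fly", "sp_mouse"},
    "P2": {"sp_fly", "sp_mouse"},
    "P3": {"sp_plant", "sp_yeast", "sp_fly", "sp_mouse"},
    "P4": {"sp_fly", "sp_mouse"},
}

profiles = build_profiles(orthologs, species)
row_order = order_by_similarity(profiles)
for protein in row_order:
    print(protein, profiles[protein])
```

Rows with identical profiles (here `P1`/`P3` and `P2`/`P4`) end up next to each other, which is the effect the clustered heatmap relies on.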
Figure 2. Domain adjacency network exploration. (A) A domain adjacency graph for a subset of proteins corresponding to the GOSlim category, ‘Anatomical structure formation involved in morphogenesis’. Domains are shown as nodes. Edges indicate the adjacency of domain pairs (N- to C-terminal direction) within one or more architectures corresponding to the searched proteins listed in the side panel. For the example protein (SLIT2), the highlighted nodes indicate the domain architecture pertaining to this protein (enlargement and arrows added for emphasis). The side panel lists the orthologs of the searched proteins from which the graph has been constructed. (B) The area of interest has been expanded from the Laminin_G_1 node to include an additional Laminin_II domain, indicating that this pair appears in one or more additional proteins not in the original search. (C) Expansion continues with Laminin_II now added to the network as a permanent addition; further expansion from this domain identifies Laminin_I as a new neighbor. Selecting numbered nodes presents a green ‘Protein Search’ button, which initiates a search for additional proteins with this architecture that are not in the original list of searched proteins. (D) The search in (C) has returned one additional protein (SLIT1), which was not in the original list of searched proteins. Exploration from LRRCT reveals LRR_4 as an adjacent neighbor. Note that multiple adjacent domains are often returned from the search, allowing one to build up a rich network in the direction of interest. Also, by selecting the ortholog in another species, differences in architectures between species may be explored and expansions may be scoped to a particular species.
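The graph construction, expansion, and protein search steps in this caption can be sketched in Python as follows. This is a hedged illustration, not PhyloPro's implementation: the function names, the protein identifiers, and the domain labels `D1` to `D4` are hypothetical, and each architecture is assumed to be a simple list of domains in N- to C-terminal order.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Architecture = List[str]  # domains listed in N- to C-terminal order


def adjacency_edges(
    architectures: Dict[str, Architecture]
) -> Dict[Tuple[str, str], Set[str]]:
    """Map each directed pair of adjacent domains to the proteins containing it."""
    edges: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for protein, domains in architectures.items():
        for left, right in zip(domains, domains[1:]):
            edges[(left, right)].add(protein)
    return dict(edges)


def expand(node: str,
           searched: Dict[str, Architecture],
           database: Dict[str, Architecture]) -> Set[str]:
    """Domains adjacent to `node` in the database via edges absent from the searched graph."""
    known = adjacency_edges(searched)
    neighbours: Set[str] = set()
    for left, right in adjacency_edges(database):
        if node in (left, right) and (left, right) not in known:
            neighbours.add(right if left == node else left)
    return neighbours


def proteins_with_path(path: Architecture,
                       database: Dict[str, Architecture]) -> List[str]:
    """Proteins whose architecture contains `path` as a contiguous run of domains."""
    n = len(path)
    return sorted(
        protein for protein, domains in database.items()
        if any(domains[i:i + n] == path for i in range(len(domains) - n + 1))
    )


# Hypothetical toy data (not real Pfam architectures).
searched = {"query1": ["D1", "D2", "D3"]}
database = {
    "query1": ["D1", "D2", "D3"],
    "other1": ["D3", "D4"],
    "other2": ["D2", "D3", "D4"],
}

new_neighbours = expand("D3", searched, database)
hits = [p for p in proteins_with_path(["D3", "D4"], database) if p not in searched]
print(new_neighbours, hits)
```

Keeping the set of supporting proteins on each edge makes it straightforward to restrict an expansion to one species' orthologs, as the caption suggests, by filtering the `database` argument before calling `expand`.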