David M Curran, John S Gilleard, James D Wasmuth.
Abstract
After transitioning to a new environment, species often exhibit rapid phenotypic innovation. One of the fastest mechanisms for this is duplication followed by specialization of existing genes. When this happens to a member of a gene family, it tends to leave a detectable phylogenetic signature of lineage-specific expansions and contractions. These can be identified by analyzing the gene family across several species and identifying patterns of gene duplication and loss that do not correlate with the known relationships between those species. This signature, termed phylogenetic instability, has been previously linked to adaptations that change the way an organism samples and responds to its environment; conversely, low phylogenetic instability has been previously linked to proteins with endogenous functions. With the increase in genome-level data, there is a need to identify and quantify phylogenetic instability. Here, we present Minimizing Instability in Phylogenetics (MIPhy), a tool that solves this problem by quantifying the incongruence of a gene's evolutionary history. The motivation behind MIPhy was to produce a tool to aid in interpreting phylogenetic trees. It can predict which members of a gene family are under adaptive evolution, working only from a gene tree and the relationship between the species under consideration. While it does not conduct any estimation of positive selection, which is the typical indication of adaptive evolution, the results tend to agree. We demonstrate the usefulness of MIPhy by accurately predicting which members of the mammalian cytochrome P450 gene superfamily metabolize xenobiotics and which metabolize endogenous compounds. Our predictions correlate very well with known substrate specificities of the human enzymes. We also analyze the Caenorhabditis collagen gene family and use MIPhy to predict genes that produce an observable phenotype when knocked down in C. elegans, and show that our predictions correlate well with existing knowledge. The software can be downloaded and installed from https://github.com/dave-the-scientist/miphy and is also available as an online web tool at http://www.miphy.wasmuthlab.org.
Keywords: Gene family evolution; Phylogenetic clustering; Phylogenetic instability
Year: 2018; PMID: 29868279; PMCID: PMC5983006; DOI: 10.7717/peerj.4873
Source DB: PubMed; Journal: PeerJ; ISSN: 2167-8359; Impact factor: 2.984
Figure 1. MIPhy results interface.
MIPhy visualization page for the 628 vertebrate Cyps from Thomas (2007). The MIGs are listed in the table on the left and are also indicated by the light orange shapes on the interior of the tree. The instability of each cluster is visualized by the bar charts around the outside of the tree. The colors of the band just inside the circle match the colors of the tree nodes and represent the originating species of each sequence.
Figure 2. The phylogenetic instability of the 59 human Cyp proteins.
The vertical dashed line separates the stable from the unstable sequences. “Substrate” indicates those proteins with primarily endogenous roles (filled squares), primarily xenobiotic roles (empty circles), both xenobiotic and endogenous roles (empty squares), and pseudogenes (P). “Selection” indicates which of the 18 sequences tested showed evidence of positive selection (+), or no positive selection (−). In the “Clusters” row, the solid lines indicate those genes that are located in tandem arrays in the human genome or are syntenic with a tandem array in the mouse genome (S). All substrate, positive selection, and clustering data were taken from Thomas (2007).
Figure 3. The phylogenetic instability of the 151 C. elegans collagen MIGs.
The MIGs containing genes with an observable knock-down phenotype are indicated by triangles.
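Illustrative sketch (not part of the published record). The abstract describes phylogenetic instability as duplication and loss patterns in a gene tree that disagree with the known species relationships. The Python below shows one standard way to detect such disagreement: mapping each gene-tree node to its lowest common ancestor in the species tree and counting inferred duplications. It is not MIPhy's scoring method, which the abstract does not specify. The function names, the nested-tuple tree encoding, and the toy gene and species names are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple, Union

# Assumed encoding: a leaf is a string, an internal node is a 2-tuple.
Tree = Union[str, tuple]


def species_clades(tree: Tree, out: List[FrozenSet[str]]) -> FrozenSet[str]:
    """Collect the set of species under every node of the species tree."""
    if isinstance(tree, str):
        clade = frozenset([tree])
    else:
        clade = species_clades(tree[0], out) | species_clades(tree[1], out)
    out.append(clade)
    return clade


def lca(clades: List[FrozenSet[str]], species: FrozenSet[str]) -> FrozenSet[str]:
    """Smallest species-tree clade that contains all of the given species."""
    return min((c for c in clades if species <= c), key=len)


def count_duplications(
    gene_tree: Tree,
    gene_to_species: Dict[str, str],
    clades: List[FrozenSet[str]],
) -> Tuple[FrozenSet[str], int]:
    """Return (species clade this node maps to, inferred duplications below it)."""
    if isinstance(gene_tree, str):
        return frozenset([gene_to_species[gene_tree]]), 0
    left, dup_left = count_duplications(gene_tree[0], gene_to_species, clades)
    right, dup_right = count_duplications(gene_tree[1], gene_to_species, clades)
    here = lca(clades, left | right)
    # A node that maps to the same clade as one of its children cannot be
    # explained by speciation alone, so it is counted as a duplication.
    is_dup = 1 if here in (left, right) else 0
    return here, dup_left + dup_right + is_dup


# Toy data (hypothetical): three species, four gene copies.
species_tree = (("human", "mouse"), "chicken")
clades: List[FrozenSet[str]] = []
species_clades(species_tree, clades)
gene_to_species = {"h1": "human", "h2": "human", "m1": "mouse", "c1": "chicken"}

congruent = (("h1", "m1"), "c1")           # mirrors the species tree
expanded = ((("h1", "h2"), "m1"), "c1")    # human-specific expansion

_, dups_congruent = count_duplications(congruent, gene_to_species, clades)
_, dups_expanded = count_duplications(expanded, gene_to_species, clades)
print(dups_congruent, dups_expanded)
```

In this toy case the gene tree that mirrors the species tree needs no duplications, while the tree with two human copies needs one. A clade whose history requires many such events relative to its size is what the abstract would call unstable.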