Camilo Andres Perez-Romero, Bram Weytjens, Dries Decap, Toon Swings, Jan Michiels, Dries De Maeyer, Kathleen Marchal.
Abstract
IAMBEE is a web server for the Identification of Adaptive Mutations in Bacterial Evolution Experiments. Input data consist of genotype information obtained from independently evolved clonal populations or strains that show the same adapted behavior (phenotype). To distinguish adaptive from passenger mutations, IAMBEE searches for neighborhoods in an organism-specific interaction network that are recurrently mutated in the adapted populations. This search for recurrently mutated network neighborhoods, which serve as proxies for pathways, is driven by additional information on the functional impact of the observed genetic changes and their dynamics during adaptive evolution. In addition, the search explicitly accounts for the differences in mutation rate between the independently evolved populations. With this approach, IAMBEE exploits parallel evolution to identify adaptive pathways. The web server is freely available at http://bioinformatics.intec.ugent.be/iambee/ with no login requirement.
Year: 2019 PMID: 31127271 PMCID: PMC6602435 DOI: 10.1093/nar/gkz451
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of IAMBEE, a web service for the identification of adaptive pathways from the sequence data of parallel evolved clonal populations. The input consists of a genome-wide interaction network of the organism of interest and sequence data obtained from parallel evolved populations (each parallel population is indicated with a different color). Variant calling detects the variants of each population (referred to as the mutant). Extra information on the ‘functional impact’ of each variant (a larger functional impact is indicated with a darker coloring) and on the frequency increase of the variants during the sweep is optional. The frequency increase and the mutation rate of the different populations can also be estimated by IAMBEE from the VCF files. All genes with at least one mutation in any of the independently evolved populations are mapped onto a topology-weighted interaction network. The functional impact and/or the frequency increase and/or the mutation rate of the population carrying the variant are used to assign each gene (network node) a relevance score, which reflects the potential relevance of the node for the acquired phenotype. The degree of shading of the nodes is indicative of their relevance score. In the pathfinding step, the N-best paths are enumerated that originate from an aberrant gene in one population and end in any gene mutated in another population (indicated by the gene pairs). The probability of a path depends on the topology-based weights of the edges that define the path, combined with a weighting of the path that is based on the ‘relevance scores’ of the start and stop genes of the path. The subsequent optimization step operates on the collection of edges/nodes composing the N-best paths selected during the pathfinding step.
In this collection of preselected nodes/edges, the optimization algorithm searches for highly probable paths that connect as many mutations occurring in different populations as possible using the smallest number of edges (referred to as the highest scoring subnetwork). This results in recurrently mutated neighborhoods that are a proxy for adaptive pathways (indicated by the shaded area).
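The pathfinding step described in the Figure 1 caption can be illustrated with a small sketch. This is not IAMBEE's implementation: it assumes that edge weights are probabilities in (0, 1], that a path's probability is the product of its edge weights, and that the path score multiplies this probability by the relevance scores of the start and stop genes. For brevity it keeps only the single most probable path per gene pair, whereas the caption describes an N-best enumeration. The names `most_probable_path` and `rank_paths` and the toy network are hypothetical.

```python
import heapq
import math
from itertools import combinations


def most_probable_path(graph, source, target):
    """Dijkstra on -log(edge probability); edge probabilities must lie in (0, 1].

    Returns (path probability, list of nodes), or (0.0, []) if unreachable.
    """
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        if node == target:
            path = [node]
            while node in prev:
                node = prev[node]
                path.append(node)
            return math.exp(-cost), path[::-1]
        for neighbor, prob in graph.get(node, {}).items():
            new_cost = cost - math.log(prob)
            if new_cost < best.get(neighbor, math.inf):
                best[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    return 0.0, []


def rank_paths(graph, mutated, relevance, n_best):
    """Score paths between genes mutated in different populations.

    mutated:   population name -> set of mutated genes
    relevance: gene -> relevance score (assumed to be given)
    """
    scored = []
    for pop_a, pop_b in combinations(sorted(mutated), 2):
        for start in sorted(mutated[pop_a]):
            for end in sorted(mutated[pop_b]):
                if start == end:
                    continue
                prob, path = most_probable_path(graph, start, end)
                if path:
                    score = prob * relevance[start] * relevance[end]
                    scored.append((score, path))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:n_best]


# Toy undirected network (hypothetical genes A-D), stored as symmetric adjacency.
graph = {
    "A": {"B": 0.9, "C": 0.5},
    "B": {"A": 0.9, "C": 0.8},
    "C": {"A": 0.5, "B": 0.8, "D": 0.9},
    "D": {"C": 0.9},
}
mutated = {"pop1": {"A"}, "pop2": {"C"}, "pop3": {"D"}}
relevance = {"A": 1.0, "C": 0.5, "D": 0.8}
top_paths = rank_paths(graph, mutated, relevance, n_best=2)
```

In this toy network the indirect route A-B-C (0.9 × 0.8) is more probable than the direct edge A-C (0.5), so it is the route chosen between A and C. The subsequent optimization step would then operate only on the nodes and edges contained in `top_paths`.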
Figure 2. Adaptive pathways involved in ethanol tolerance. The colored segments surrounding each node indicate the populations in which the node (gene) was mutated. In total, 16 parallel populations were analyzed, each indicated with a different color. A gene that was affected in multiple populations carries multiple colored segments. Genes involved in DNA repair, osmotic stress and amino acid biosynthesis are indicated in orange boxes. The edges in the network visualization are colored according to the interaction type they represent; the function of each interaction type is explained in the legend. The edge width depicts the relevance of the edge to the phenotype (as determined by the sweep on the edge cost parameter). This weight is assigned to each edge based on the maximum edge cost for which the edge is still included in an optimal subnetwork. More reliable edges will have a larger width.
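The edge weight described in the Figure 2 caption can be illustrated as follows. This is a minimal sketch, assuming that the sweep has already produced one optimal subnetwork (a set of edges) per tested edge cost. The function name `edge_weights_from_sweep` and the input layout are hypothetical and are not part of IAMBEE.

```python
def edge_weights_from_sweep(sweep):
    """Assign each edge the maximum edge cost at which it is still selected.

    sweep: edge cost -> set of edges in the optimal subnetwork found at that
           cost (assumed to be computed beforehand by the optimization step).
    """
    weights = {}
    for cost, edges in sweep.items():
        for edge in edges:
            weights[edge] = max(cost, weights.get(edge, cost))
    return weights


# Hypothetical sweep over three edge costs.
sweep = {
    0.1: {("geneA", "geneB"), ("geneB", "geneC")},
    0.2: {("geneA", "geneB")},
    0.3: {("geneA", "geneB")},
}
weights = edge_weights_from_sweep(sweep)
```

Here the edge geneA-geneB survives up to the highest tested cost and receives weight 0.3, while geneB-geneC drops out after cost 0.1.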