Alan McNally, Yaara Oren, Darren Kelly, Ben Pascoe, Steven Dunn, Tristan Sreecharan, Minna Vehkala, Niko Välimäki, Michael B Prentice, Amgad Ashour, Oren Avram, Tal Pupko, Ulrich Dobrindt, Ivan Literak, Sebastian Guenther, Katharina Schaufler, Lothar H Wieler, Zong Zhiyong, Samuel K Sheppard, James O McInerney, Jukka Corander.
Abstract
The use of whole-genome phylogenetic analysis has revolutionized our understanding of the evolution and spread of many important bacterial pathogens due to the high-resolution view it provides. However, the majority of such analyses do not consider the potential role of accessory genes when inferring evolutionary trajectories. Moreover, the recently discovered importance of the switching of gene regulatory elements suggests that an exhaustive analysis, combining information from core and accessory genes with regulatory elements, could provide unparalleled detail of the evolution of a bacterial population. Here we demonstrate this principle by applying it to a worldwide multi-host sample of the important pathogenic E. coli lineage ST131. Our approach reveals the existence of multiple circulating subtypes of the major drug-resistant clade of ST131 and provides the first population-level evidence of core genome substitutions in gene regulatory regions associated with the acquisition and maintenance of different accessory genome elements.
Year: 2016 PMID: 27618184 PMCID: PMC5019451 DOI: 10.1371/journal.pgen.1006280
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Fig 1. Maximum likelihood phylogeny of 228 E. coli ST131 isolates.
Strains isolated from dogs and cats (domesticated animals), wild birds (avian), and cattle (livestock) are indicated by colour coding at the tips of the tree; all strains without colour coding are human isolates. Clades A, B and C are indicated by colour coding of the branches. The large black circles indicate statistically significant inferences of host jumps or ecological adaptations within the phylogeny, as detected by AdaptML. The grey circles indicate phylogenetic inferences with > 99% bootstrap support. The names of the taxa match those in S1 Table.
Fig 2. A) Graphical representation of the clustering of isolates by accessory gene content, based on a pairwise comparison of all 228 genomes. The colour scheme is a heatmap of the level of identity of accessory genes between strains, based on the BLAST score output from LS-BSR, with red representing 100% and dark blue representing 0%. Numbers on the X and Y axes indicate the accessory genome cluster labels. The ESBL gene type of each strain is indicated by the colour-coded bar on the Y axis. B) A maximum likelihood phylogenetic tree of the accessory genome of all isolates, based on a binary gene presence/absence alignment file. Colour coding refers to the accessory genome clusters identified in panel A.
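As an illustration of the binary gene presence/absence alignment mentioned for panel B, the following minimal Python sketch converts a per-strain table of BLAST score ratios into a 0/1 alignment restricted to accessory genes. It is not the authors' pipeline: the strain names, the score values, the 0.8 presence threshold, and the helper names `binarize`, `accessory_columns` and `to_fasta` are all assumptions made for this example.

```python
# Hypothetical score table: one row per strain, one column per gene.
# Values mimic BLAST score ratios in the range 0..1 (invented data).
scores = {
    "strain_A": [1.00, 0.95, 0.10, 0.00],
    "strain_B": [0.99, 0.20, 0.85, 0.00],
    "strain_C": [1.00, 0.90, 0.92, 0.05],
}


def binarize(score_table, threshold=0.8):
    """Call a gene present (1) when its score reaches the threshold.

    The 0.8 cut-off is an assumption for illustration, not a value
    taken from the source.
    """
    return {
        strain: [1 if s >= threshold else 0 for s in row]
        for strain, row in score_table.items()
    }


def accessory_columns(binary_table):
    """Return indices of genes present in some, but not all, strains."""
    rows = list(binary_table.values())
    n_genes = len(rows[0])
    return [
        g for g in range(n_genes)
        if 0 < sum(row[g] for row in rows) < len(rows)
    ]


def to_fasta(binary_table, columns):
    """Write the selected columns as a FASTA-style 0/1 alignment."""
    lines = []
    for strain, row in binary_table.items():
        lines.append(f">{strain}")
        lines.append("".join(str(row[g]) for g in columns))
    return "\n".join(lines)


binary = binarize(scores)
columns = accessory_columns(binary)
alignment = to_fasta(binary, columns)
print(alignment)
```

In this toy table the first gene is present in every strain (core) and the last in none, so only the two middle columns remain in the alignment. An alignment of this shape is the kind of input that tree-building software with a binary character model can take.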
Fig 3. Maximum likelihood phylogeny of the ST131 core genome, with the accessory genome profile overlaid.
Clades A, B and C are colour coded by branch (blue, cyan, and magenta respectively). The accessory genome is presented as a heatmap (from red = high identity to blue = low identity) of pairwise Spearman correlations of the accessory gene content between each pair of strains, such that warmer colours indicate subsets of isolates whose gene content is substantially more similar to one another than the average similarity between randomly chosen isolates. The colour coding to the right indicates the accessory genome cluster of each strain as determined by Kpax2.
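The pairwise Spearman correlations underlying such a heatmap can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the strain names and presence/absence profiles are invented, the function names are hypothetical, and the sketch makes no claim about the software the authors used to compute or draw the figure.

```python
from itertools import combinations
from math import sqrt

# Hypothetical accessory gene presence (1) / absence (0) profiles.
profiles = {
    "strain_A": [1, 1, 0, 0, 1, 0],
    "strain_B": [1, 1, 0, 0, 1, 0],
    "strain_C": [0, 0, 1, 1, 0, 1],
    "strain_D": [1, 0, 0, 1, 1, 0],
}


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN when either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def pairwise_spearman(profile_table):
    """Symmetric strain-by-strain correlation matrix as nested dicts."""
    names = list(profile_table)
    # The diagonal is set to 1.0 by convention.
    matrix = {a: {b: 1.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        rho = spearman(profile_table[a], profile_table[b])
        matrix[a][b] = matrix[b][a] = rho
    return matrix


matrix = pairwise_spearman(profiles)
for name, row in matrix.items():
    print(name, [round(v, 2) for v in row.values()])
```

Strains with identical profiles correlate at 1, while strains with complementary profiles correlate at -1; a heatmap simply maps these values onto a colour scale. The same calculation would apply to any per-strain profile encoded as a numeric vector.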
Fig 4. Maximum likelihood phylogeny of the ST131 core genome, with gene regulatory region allele profiles overlaid.
Clades A, B and C are colour coded by branch (blue, cyan, and magenta respectively). The gene regulatory region allele profiles are presented as a heatmap (from red = high identity to blue = low identity) of pairwise Spearman correlations of the regulatory region alleles between each pair of strains, such that warmer colours indicate subsets of isolates whose regulatory region alleles are substantially more similar to one another than the average similarity between randomly chosen isolates. The colour coding to the right indicates the accessory genome cluster of each strain as determined by Kpax2.