Eliot C Bush1, Anne E Clark2,3, Carissa A DeRanek2, Alexander Eng2,3, Juliet Forman2, Kevin Heath2,4, Alexander B Lee2,5, Daniel M Stoebel2, Zunyan Wang2, Matthew Wilber2, Helen Wu2.
Abstract
BACKGROUND: Genomic islands play an important role in microbial genome evolution, providing a mechanism for strains to adapt to new ecological conditions. A variety of computational methods, both genome-composition based and comparative, have been developed to identify them. Some of these methods are explicitly designed to work in single strains, while others make use of multiple strains. In general, existing methods do not identify islands in the context of the phylogeny in which they evolved. Even multiple strain approaches are best suited to identifying genomic islands that are present in one strain but absent in others. They do not automatically recognize islands which are shared between some strains in the clade or determine the branch on which these islands inserted within the phylogenetic tree.
Keywords: Gene family; Genomic island; Horizontal transfer; Synteny
Year: 2018 PMID: 29402213 PMCID: PMC5799925 DOI: 10.1186/s12859-018-2038-0
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Example species tree. An input tree consisting of a focal clade and several outgroups.
Fig. 2 Phylogenetic tree used in genome simulations. We ran xenoGI on simulated genomes that were generated on the tree shown. On each branch we show the true positive rate (red) and the positive predictive value (blue) for xenoGI on that branch.
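The two per-branch statistics in Fig. 2 follow the standard definitions: true positive rate is TP / (TP + FN) and positive predictive value is TP / (TP + FP). The sketch below shows that arithmetic for a single branch. It is only an illustration: the function name `branch_metrics` and the use of sets of island identifiers are assumptions made here, and the text above does not say how xenoGI's predictions were matched to the simulated insertions.

```python
import math
from typing import Set, Tuple


def branch_metrics(true_events: Set[str], predicted_events: Set[str]) -> Tuple[float, float]:
    """Return (true positive rate, positive predictive value) for one branch.

    Both arguments are sets of hypothetical island identifiers: the islands
    that really inserted on the branch, and the islands a method assigned to it.
    """
    tp = len(true_events & predicted_events)   # correctly assigned islands
    fn = len(true_events - predicted_events)   # real insertions that were missed
    fp = len(predicted_events - true_events)   # predictions with no real insertion

    # NaN marks a branch where the statistic is undefined (empty denominator).
    tpr = tp / (tp + fn) if (tp + fn) else math.nan
    ppv = tp / (tp + fp) if (tp + fp) else math.nan
    return tpr, ppv


# Made-up example: four real insertions, three predictions, two of them correct.
tpr, ppv = branch_metrics({"a", "b", "c", "d"}, {"a", "b", "x"})
print(tpr)  # 0.5
```

A branch with no true insertions has an undefined true positive rate, which is why the sketch returns NaN rather than zero in that case.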
Summary of xenoGI results on validation cases in five strains
| Strain | Num. islands | Total bases | Base coverage | In single island |
|---|---|---|---|---|
| Burkholderia cenocepacia J2315 | 13 | 530,772 | 0.977 | 0.917 |
| Corynebacterium diphtheriae NCTC 13129 | 13 | 249,918 | 0.983 | 0.769 |
| Cronobacter sakazakii ATCC BAA-894 | 14 | 305,124 | 0.879 | 0.643 |
| Streptococcus equi 4047 | 7 | 243,337 | 0.951 | 0.857 |
| Vibrio cholerae O1 biovar eltor str. N16961 | 6 | 295,096 | 0.970 | 0.833 |
Each row corresponds to a strain. "Num. islands" is the number of validation islands, and "Total bases" is the total number of nucleotides in those islands. "Base coverage" is the proportion of all bases in the validation islands that xenoGI correctly recognized as an island. "In single island" is the proportion of validation islands that xenoGI captured as a single island.
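The two proportions defined in the table note can be sketched on interval data. The example below is a minimal illustration, not xenoGI's own code. It assumes islands are given as half-open base-coordinate intervals on one sequence, that validation islands do not overlap each other, and that a validation island counts as "captured as a single island" when exactly one predicted island overlaps it. All function names and the example coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in base coordinates


def overlap(a: Interval, b: Interval) -> int:
    """Number of bases shared by two intervals (0 if they are disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def base_coverage(validation: List[Interval], predicted: List[Interval]) -> float:
    """Proportion of validation-island bases that fall inside any predicted island."""
    total = sum(end - start for start, end in validation)
    covered = sum(overlap(v, p) for v in validation for p in merge(predicted))
    return covered / total if total else 0.0


def in_single_island(validation: List[Interval], predicted: List[Interval]) -> float:
    """Proportion of validation islands overlapped by exactly one predicted island
    (the assumed reading of "captured as a single island")."""
    if not validation:
        return 0.0
    hits = sum(
        1 for v in validation
        if sum(1 for p in predicted if overlap(v, p) > 0) == 1
    )
    return hits / len(validation)


# Made-up coordinates: the second validation island is split across two predictions.
validation = [(0, 100), (200, 300)]
predicted = [(0, 90), (200, 250), (250, 300)]
print(base_coverage(validation, predicted))     # 0.95
print(in_single_island(validation, predicted))  # 0.5
```

The example shows why the two columns can diverge: a validation island that is fully covered still lowers the "In single island" value if the coverage comes from two separate predicted islands.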
Fig. 3 Examples from enteric bacteria. (a) Phylogenetic tree of 11 enteric species. Symbols indicate the branches of insertion of the GIs in (b)–(d). The images in (b)–(d) were made by outputting xenoGI islands and then displaying them in the IGB genome browser. Note that the scale is not exactly the same across the three panels. In the figures, different islands are given different colors. All islands with a most recent common ancestor (mrca) at or before the point where C. rodentium diverges are colored black. (b) Salmonella pathogenicity island 2 shown in three Salmonella species. (c) The acid fitness island as reconstructed by xenoGI in two E. coli species and E. albertii. (d) The island around gadB in our four Escherichia species.