Abstract
Genomic islands (GIs) are genomic fragments acquired from nongenealogical organisms through horizontal gene transfer (HGT). Current research on donor-recipient relationships in HGT is limited to the gene level. As more GIs are identified and verified, donor-recipient relationships can be modeled better with GIs than with individual genes. In this paper, we report the development of a computational framework for detecting the origins of GIs. The framework has three steps: identify GIs in a query genome; use BLAST to search for candidate genomes that contain regions similar to those GIs; and filter out any candidate genome whose similar region is itself alien, that is, detected as a GI by GI detection tools. We applied the framework to find the GI origins for Mycobacterium tuberculosis H37Rv, Herminiimonas arsenicoxydans, and three Thermoanaerobacter species. The predicted results were used to establish donor-recipient networks, which were visualized with Gephi. The donor genomes detected by our approach were mainly consistent with previous studies. The framework was implemented in Perl and run on the Windows operating system.
Year: 2014 PMID: 27433520 PMCID: PMC4897231 DOI: 10.1155/2014/732857
Source DB: PubMed Journal: Int Sch Res Notices ISSN: 2356-7872
Figure 1. A schematic view of our computational framework for predicting the origins of GIs.
Figure 2. A schematic illustration of overlapping results. The circles represent genomes: the top middle one denotes the query genome and the others denote initial candidate donor genomes. The GI set in the query genome, G, is {g1, g2, g3}; the set of initial candidate donor genomes, S, is {s(1,1), s(1,2), s(1,3), s(2,1), s(2,2)}; and the set of GIs detected in the initial candidate donor genomes, G′, is {g1′(1,1), g2′(1,1), g1′(1,2), g2′(1,2), g1′(1,3), g2′(1,3), g3′(1,3), g1′(2,1), g2′(2,1), g1′(2,2)}. Only candidate donor genomes whose similar subsequences are not GIs are predicted to be donors; the other candidates are predicted to be recipients. For instance, the initial candidate donor s(1,1) is labeled as a recipient because g1 is similar to g2′(1,1), a member of the GI set G′ of s(1,1). The initial candidate donor s(1,2) is labeled as a donor because the subsequence similar to g1 (not labeled, but highlighted and connected to g1 with a dashed line) does not overlap any member of the GI set G′ of s(1,2). In this example, s(1,2) and s(2,1) are predicted to be the final donor genomes for the query genome.
Figure 3. The network of predicted donors of Mycobacterium tuberculosis H37Rv. The undirected network shows all candidate donor genomes of Mycobacterium tuberculosis H37Rv. The weight of each edge denotes the number of donated GIs. Each color represents one genus.
Figure 4. The predicted donors of Mycobacterium tuberculosis H37Rv. The candidate donors are grouped by genus, and the length of each bar represents the number of donated GIs.
Figure 5. The network of predicted donors of Herminiimonas arsenicoxydans. The undirected network shows all candidate donor genomes of Herminiimonas arsenicoxydans. The weight of each edge denotes the number of donated GIs. Each color represents one genus.
Figure 6. The predicted donors of Herminiimonas arsenicoxydans. The candidate donors are grouped by genus, and the length of each bar represents the number of donated GIs.
Figure 7. A directed network of donor-recipient relationships based on T. pseudethanolicus strain ATCC33223, T. tengcongensis MB4, and Thermoanaerobacter sp. X514. The directed network shows the direction of GI transfer. The weight of each edge denotes the number of occurrences of HGT.
Figure 8. The predicted donors of Thermoanaerobacter. The chart covers three recipients, each shown with bars of a different color. Each bar represents the number of GIs donated by one specific donor to the corresponding recipient.
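
The overlap rule illustrated in Figure 2, and the edge weights used in the donor networks, can be sketched in a few lines. This is a minimal illustration in Python, not the authors' Perl implementation. The names `Region`, `Hit`, `label_hits`, and `donor_edge_weights` are hypothetical, and so are all coordinates in the example. The sketch also assumes at most one similar region per pair of query GI and candidate genome, and it treats two regions as overlapping if they share at least one position; the paper may use a different threshold.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Tuple


class Region(NamedTuple):
    """A closed interval [start, end] on a genome (hypothetical coordinates)."""
    start: int
    end: int


class Hit(NamedTuple):
    """A region in a candidate genome that is similar to a GI of the query genome."""
    query_gi: str    # e.g. "g1"
    candidate: str   # e.g. "s(1,1)"
    region: Region   # location of the similar sequence in the candidate genome


def overlaps(a: Region, b: Region) -> bool:
    """True if the two closed intervals share at least one position."""
    return a.start <= b.end and b.start <= a.end


def label_hits(
    hits: List[Hit], candidate_gis: Dict[str, List[Region]]
) -> Dict[Tuple[str, str], str]:
    """Label each (query GI, candidate) pair as 'donor' or 'recipient'.

    A candidate is a recipient for a GI if its similar region overlaps one of
    the candidate's own detected GIs (the region is alien there as well).
    Assumes one hit per (query GI, candidate) pair.
    """
    labels = {}
    for hit in hits:
        own_gis = candidate_gis.get(hit.candidate, [])
        alien = any(overlaps(hit.region, gi) for gi in own_gis)
        labels[(hit.query_gi, hit.candidate)] = "recipient" if alien else "donor"
    return labels


def donor_edge_weights(labels: Dict[Tuple[str, str], str]) -> Counter:
    """Count donated GIs per donor genome (the edge weight in the networks)."""
    return Counter(cand for (_, cand), lab in labels.items() if lab == "donor")


# Made-up example loosely following Figure 2.
hits = [
    Hit("g1", "s(1,1)", Region(100, 500)),
    Hit("g1", "s(1,2)", Region(2000, 2600)),
    Hit("g2", "s(2,1)", Region(50, 300)),
]
candidate_gis = {
    "s(1,1)": [Region(10, 80), Region(400, 900)],
    "s(1,2)": [Region(100, 400), Region(5000, 5600)],
    "s(2,1)": [Region(1000, 1500), Region(3000, 3300)],
}

labels = label_hits(hits, candidate_gis)
weights = donor_edge_weights(labels)
```

With these made-up inputs, s(1,1) is labeled a recipient for g1 because its similar region overlaps one of its own GIs, while s(1,2) and s(2,1) are labeled donors, matching the outcome described for Figure 2.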