GIST: Genomic island suite of tools for predicting genomic islands in genomic sequences.

Mohammad Shabbir Hasan, Qi Liu, Han Wang, John Fazekas, Bernard Chen, Dongsheng Che.   

Abstract

Genomic Islands (GIs) are genomic regions that originated in other organisms and were acquired through a process known as Horizontal Gene Transfer (HGT). Detection of GIs plays a significant role in biomedical research since such alien genomic regions usually contain important features, such as pathogenic genes. We have developed a user-friendly graphical user interface, Genomic Island Suite of Tools (GIST), which is a platform for scientific users to predict GIs. This software package includes five commonly used tools: AlienHunter, IslandPath, Colombo SIGI-HMM, INDeGenIUS and PAI-IDA. It also includes an optimization program, EGID, that combines the results of the existing tools for more accurate prediction. The tools in GIST can be used either separately or sequentially. GIST also includes a download feature that collects the input genomes automatically from the FTP server of the National Center for Biotechnology Information (NCBI). GIST was implemented in Java, and was compiled and executed on Linux/Unix operating systems. AVAILABILITY: GIST is available for free at http://www5.esu.edu/cpsc/bioinfo/software/GIST.

Keywords:  Genomic islands; Prokaryotic genomes; Sequence analysis

Year:  2012        PMID: 22419842      PMCID: PMC3302003          DOI: 10.6026/97320630008203

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

Some prokaryotic genomes contain genomic sequences whose patterns differ from those of the rest of the host genome. Such differences may include GC content bias [1], codon usage bias [2, 3], k-mer nucleotide frequency bias [4], and the presence of mobile genes such as integrase genes and transposase genes [5]. In some cases, such regions are also bordered by transfer RNAs (tRNAs) [6]. Atypical regions with these characteristics are known as Genomic Islands (GIs). Identifying genomic islands has become increasingly important because the scientific community can benefit significantly from such findings. Biomedical researchers and microbiologists can use the results to explain the pathogenicity of organisms, or to discover industrially important metabolic components in GIs. Based on such findings, pharmacists can design corresponding vaccines and antibiotics, which pharmaceutical companies can eventually produce at a large scale. Because it is generally believed that each genome has a unique sequence signature, several computational tools based on sequence signature have been developed. Such sequence composition based tools include AlienHunter [7], COLOMBO SIGI-HMM [8], GIDetector [9], IslandPath [10], INDeGenIUS [11], and PAI-IDA [12]. Recent studies have shown that none of these tools can predict GIs accurately in all genomes [13]. Hence it is necessary to develop a computational framework that produces better prediction results by combining the results of existing programs [14]. We have recently developed a tool, EGID, which has been shown to optimize the results of individual tools and to produce better prediction results for all genomes [15].

We realize that the majority of users of these tools are biologists. Unfortunately, these programs are command line based, and different programs usually require different inputs to predict GIs, which makes it difficult for such users to apply these tools to genomic island analyses. To this end, we have developed a user-friendly graphical user interface, GIST, which contains a suite of tools for GI prediction. GIST provides a feature that allows users to automatically download the files required to run the tools from the FTP server of the National Center for Biotechnology Information (NCBI) (ftp://ftp.ncbi.nih.gov/genomes/Bacteria). Depending on the user's interest, GIST allows the user to select any combination of the tools, invokes and runs the selected programs in the back end, and generates and organizes the prediction results. We believe that GIST should benefit the scientific community by making it easier to study genome evolution and gene transfer mechanisms.
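
To make the idea of a compositional signal concrete, the following is a minimal Python sketch of GC content bias detection with non-overlapping or sliding windows. It is not the method of any tool bundled in GIST; the function names, the window parameters, and the z-score cutoff are all illustrative assumptions.

```python
from statistics import mean, pstdev


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_windows(genome: str, window: int, step: int, z_cutoff: float = 2.0):
    """Return (start, end, gc) for windows whose GC content deviates strongly.

    Coordinates are 1-based and inclusive. The z-score cutoff is an
    illustrative choice, not a value taken from any of the tools in GIST.
    """
    starts = range(0, len(genome) - window + 1, step)
    values = [gc_content(genome[s:s + window]) for s in starts]
    if not values:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # Perfectly uniform composition: nothing stands out.
        return []
    return [
        (s + 1, s + window, gc)
        for s, gc in zip(starts, values)
        if abs(gc - mu) / sigma >= z_cutoff
    ]
```

On a toy sequence with a 100 bp AT-only insert inside a balanced background, only the window covering the insert is reported. Real tools combine several such signals (codon usage, k-mer frequencies, mobility genes) instead of relying on GC content alone.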

Software Input and Output

GIST includes five individual GI prediction programs, as well as the optimization tool EGID, which uses the prediction results of any combination of individual programs as inputs and produces consensus predicted GIs. The GUI layout of GIST is shown in Figure 1. GIST requires five different types of files for each genome. These file types are FNA, FAA, FFN, GBK and PTT, from which the required information such as k-mers, G+C content, codon usage, and dinucleotide frequency can be extracted. For a given genome, all of these files need to be saved in the same directory, which is used as the input for that genome. If users are only interested in a particular program, they can select the program from the ‘Programs’ panel and click the ‘Start Prediction’ button. It is important to note that if EGID is selected, it executes the other tools along with itself and thereby produces the optimized prediction results.
Figure 1

Main window of the GIST tool.
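
Before starting a prediction, it can help to confirm that a genome directory really holds all five file types listed above. The sketch below is a hypothetical helper, not part of GIST, and it assumes the file types are distinguished by their file extensions.

```python
from pathlib import Path

# The five input file types named in the text, assumed to be file extensions.
REQUIRED_EXTENSIONS = ("fna", "faa", "ffn", "gbk", "ptt")


def missing_input_types(genome_dir):
    """Return the required extensions with no matching file in genome_dir."""
    present = {
        p.suffix.lower().lstrip(".")
        for p in Path(genome_dir).iterdir()
        if p.is_file()
    }
    return [ext for ext in REQUIRED_EXTENSIONS if ext not in present]
```

An empty list means the directory is complete; otherwise the returned extensions name the files that still need to be downloaded or copied in.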

Users can specify the output folder location; otherwise the output files are saved into the default output directory. The output file for each tool is a text file containing the start and stop positions of the genomic island regions for the input genome. For the detailed usage of GIST for GI prediction, please refer to the user guide on our website (http://www5.esu.edu/cpsc/bioinfo/software/GIST).
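
Because every tool reports start and stop positions, the per-tool results can be combined by interval arithmetic. The sketch below shows one simple way to do this, a vote over overlapping intervals. It is not EGID's actual algorithm, which is not described here; the function name, the `min_votes` parameter, and the in-memory input format are assumptions made for illustration.

```python
from collections import Counter


def consensus_islands(predictions, min_votes):
    """Merge per-tool interval lists into regions backed by >= min_votes tools.

    predictions maps a tool name to a list of (start, stop) pairs, 1-based
    and inclusive. Each tool's own intervals are assumed not to overlap.
    """
    # Sweep line: +1 where an interval opens, -1 just past where it closes.
    events = Counter()
    for intervals in predictions.values():
        for start, stop in intervals:
            events[start] += 1
            events[stop + 1] -= 1

    regions, depth, region_start = [], 0, None
    for pos in sorted(events):
        depth += events[pos]
        if depth >= min_votes and region_start is None:
            region_start = pos
        elif depth < min_votes and region_start is not None:
            regions.append((region_start, pos - 1))
            region_start = None
    return regions
```

With `min_votes=1` the function returns the union of all predictions; raising the threshold keeps only regions that several tools agree on, trading sensitivity for specificity.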

Automatic Genome Download Feature

One of the most important features of GIST is its ability to connect to the FTP server of NCBI and download the required genomic files automatically, as shown in Figure 2. The ‘FTP Directory’ panel contains a tree representation of the organisms available on the FTP server of NCBI. Users can select any genome that belongs to an organism by exploring the tree node of that organism.
Figure 2

Graphical User Interface to download genomes

To add a genome to the download list, the user can double-click the genome name, or select the genome and use the ‘Add’ button in the ‘Add/Remove’ panel. To remove a genome from the download list, the user can use the ‘Remove’ button. When the ‘Start Download’ button is pressed, the necessary files of all genomes in the download list are downloaded automatically, and the progress bar shows the download status. Downloaded files are saved into a separate directory for each genome. Users can specify the directory in which to save the downloaded files. If no location is specified, the program saves the downloaded files in the ‘Download’ directory (GIST_1.0/Download).
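
The download step described above can be approximated outside the GUI. The following Python sketch is not GIST's Java implementation. It assumes that the server keeps one folder per organism under /genomes/Bacteria and names each file by accession plus extension; that layout, the function names, and the example folder and accession are assumptions and may not match the current NCBI server.

```python
import posixpath
from ftplib import FTP
from pathlib import Path

FTP_HOST = "ftp.ncbi.nih.gov"
FTP_ROOT = "/genomes/Bacteria"
EXTENSIONS = ("fna", "faa", "ffn", "gbk", "ptt")


def plan_downloads(organism_dir, accession, dest_root="GIST_1.0/Download"):
    """Return (remote_path, local_path) pairs for one genome's five files.

    The remote layout (organism folder, accession-named files) is assumed.
    """
    local_dir = Path(dest_root) / organism_dir
    return [
        (
            posixpath.join(FTP_ROOT, organism_dir, f"{accession}.{ext}"),
            local_dir / f"{accession}.{ext}",
        )
        for ext in EXTENSIONS
    ]


def download(plan, host=FTP_HOST):
    """Fetch every planned file over anonymous FTP (needs network access)."""
    with FTP(host) as ftp:
        ftp.login()  # anonymous login
        for remote_path, local_path in plan:
            local_path.parent.mkdir(parents=True, exist_ok=True)
            with open(local_path, "wb") as handle:
                ftp.retrbinary(f"RETR {remote_path}", handle.write)
```

Separating the planning step from the transfer keeps the path logic testable without a network connection; for example, `plan_downloads("Some_organism_dir", "NC_000000")` yields five path pairs that can be inspected before `download` is called.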

Caveat and Future Development

The current version of GIST produces prediction results as text files. In the next version, we will integrate visualization features, such as a circular genome representation, so that users can easily compare the results.

References (10 of 15 shown)

1.  IslandPath: aiding detection of genomic islands in prokaryotes.

Authors:  William Hsiao; Ivan Wan; Steven J Jones; Fiona S L Brinkman
Journal:  Bioinformatics       Date:  2003-02-12       Impact factor: 6.937

2.  Detecting pathogenicity islands and anomalous gene clusters by iterative discriminant analysis.

Authors:  Qiang Tu; Dafu Ding
Journal:  FEMS Microbiol Lett       Date:  2003-04-25       Impact factor: 2.742

3.  Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models.

Authors:  Stephan Waack; Oliver Keller; Roman Asper; Thomas Brodag; Carsten Damm; Wolfgang Florian Fricke; Katharina Surovcik; Peter Meinicke; Rainer Merkl
Journal:  BMC Bioinformatics       Date:  2006-03-16       Impact factor: 3.169

Review 4.  Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution.

Authors:  J Hacker; G Blum-Oehler; I Mühldorfer; H Tschäpe
Journal:  Mol Microbiol       Date:  1997-03       Impact factor: 3.501

Review 5.  Pathogenicity islands and the evolution of microbes.

Authors:  J Hacker; J B Kaper
Journal:  Annu Rev Microbiol       Date:  2000       Impact factor: 15.500

6.  Classification of genomic islands using decision trees and their ensemble algorithms.

Authors:  Dongsheng Che; Cory Hockenbury; Robert Marmelstein; Khaled Rasheed
Journal:  BMC Genomics       Date:  2010-11-02       Impact factor: 3.969

Review 7.  Pathogenicity islands in bacterial pathogenesis.

Authors:  Herbert Schmidt; Michael Hensel
Journal:  Clin Microbiol Rev       Date:  2004-01       Impact factor: 26.132

8.  Codon usages in different gene classes of the Escherichia coli genome.

Authors:  S Karlin; J Mrázek; A M Campbell
Journal:  Mol Microbiol       Date:  1998-09       Impact factor: 3.501

9.  EGID: an ensemble algorithm for improved genomic island detection in genomic sequences.

Authors:  Dongsheng Che; Mohammad Shabbir Hasan; Han Wang; John Fazekas; Jinling Huang; Qi Liu
Journal:  Bioinformation       Date:  2011-11-20

10.  Evaluation of genomic island predictors using a comparative genomics approach.

Authors:  Morgan G I Langille; William W L Hsiao; Fiona S L Brinkman
Journal:  BMC Bioinformatics       Date:  2008-08-05       Impact factor: 3.169
