IGIPT - Integrated genomic island prediction tool.

Ruchi Jain, Sandeep Ramineni, Nita Parekh.

Abstract

IGIPT is a web-based integrated platform for the identification of genomic islands (GIs). It incorporates thirteen parametric measures based on anomalous nucleotide composition on a single platform, thus improving the predictive power for horizontally acquired regions, since it is known that no single measure can predict a horizontally transferred region with certainty. The tool filters putative GIs based on standard deviation from the genomic average and also provides raw output in MS Excel format for further analysis. To facilitate the identification of various structural features in the vicinity of GIs, viz., tRNA integration sites, repeats, etc., the tool provides an option to extract the predicted regions and their flanking regions. AVAILABILITY: The database is available for free at http://bioinf.iiit.ac.in/IGIPT/

Keywords:  genomic islands; horizontal gene transfer

Year:  2011        PMID: 22355227      PMCID: PMC3280501          DOI: 10.6026/007/97320630007307

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background:

A horizontal transfer event is defined as the movement of genetic material between phylogenetically unrelated organisms by mechanisms other than vertical descent. The transferred regions, called Genomic Islands (GIs), are typically 10-200 Kb in size and contain clusters of genes. Any biological advantage that the transferred DNA provides to the recipient organism creates selective pressure for its retention in the host genome, and several pathways of horizontal transfer have been established that influence traits such as antibiotic resistance, symbiosis and fitness, virulence and adaptation [1]. For example, horizontal gene transfer has been demonstrated in many pathogenic strains of bacteria and shown to be responsible for their virulence. The identification of genomic islands also forms the first step in the annotation of newly sequenced genomes. Various bioinformatics approaches have been proposed for their identification [2]. In the genomic era, with the availability of a large number of bacterial genomes, the preferred methods are based on nucleotide base composition and comparative genomics. In IGIPT, we have implemented thirteen measures that capture anomalies in nucleotide composition, providing both genome-based and gene-based search on a single platform.

Methodology:

In any genome, vertically transmitted genes experience a particular set of directional mutation pressures mediated by specific features of the replication machinery of the cell, such as the balance of dNTP pools, mutational biases of the DNA polymerases, the efficiency of mismatch repair systems and so on [3]. As a result, each genome exhibits its own unique signatures, viz., distinct variations in GC content, dinucleotide relative abundance, and usage of k-mer words, codons and amino acids. Methods based on these signatures, called parametric methods, are the most widely used approaches because putative transferred genes can be identified without relying on comparisons with other organisms, thus providing an independent means of assessing the impact of gene transfer across lineages. The parametric measures implemented in IGIPT are broadly classified as genome-based or gene-based, depending on the analysis (shown as the left and right panels in Figure 1). These measures are computed in a sliding window, and regions that deviate from the genomic average by more than a user-defined number of standard deviations (default 1.5σ) are identified as probable GIs.
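The sliding-window filter can be illustrated with a minimal sketch that uses GC content as the measure. This is not IGIPT's code: the function names, the toy window size, and the choice to take the mean over all windows as the "genomic average" are assumptions made for illustration.

```python
from statistics import mean, pstdev

def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def deviant_windows(genome, window=10, step=10, n_sigma=1.5):
    """Return (start, end, value) for every window whose GC fraction differs
    from the mean over all windows by more than n_sigma standard deviations.
    Assumption: the mean over windows stands in for the genomic average."""
    starts = range(0, len(genome) - window + 1, step)
    values = [gc_fraction(genome[s:s + window]) for s in starts]
    if not values:
        return []
    mu, sigma = mean(values), pstdev(values)
    return [(s, s + window, v) for s, v in zip(starts, values)
            if sigma > 0 and abs(v - mu) > n_sigma * sigma]

# Toy sequence: AT-rich background with a single GC-rich insert at 50..60.
toy = "ATATATATAT" * 5 + "GCGCGCGCGC" + "ATATATATAT" * 5
hits = deviant_windows(toy)
print(hits)  # [(50, 60, 1.0)]
```

Real genomes would use far larger windows (kilobases rather than ten bases), and any of the other measures could replace `gc_fraction` without changing the filtering step.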
Figure 1: Snapshot of the IGIPT web server.

Measures at Genome Level:

The major advantage of these measures is that they do not require pre-existing annotation or comparison of homologous sequences and can, therefore, be applied directly to newly sequenced genomes. The input to these measures is the complete genome or contig in FASTA format.

GC content:

This measure computes the frequency of G and C nucleotides, called the GC content [4].

Genomic signature:

The set of dinucleotide relative abundance values constitutes a “genomic signature” of an organism. Please see supplementary material.
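As a hedged illustration, the sketch below uses the common definition of dinucleotide relative abundance, ρ_xy = f_xy / (f_x · f_y), and averages the absolute differences over the 16 dinucleotides to compare two sequences. The exact formulation used by IGIPT is given in its supplementary material and may differ (for example, by pooling both strands). The function names are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def dinucleotide_abundance(seq):
    """rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides, single strand only."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("need at least two bases")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono, n_di = len(seq), len(seq) - 1
    rho = {}
    for x, y in product(BASES, repeat=2):
        f_x, f_y = mono[x] / n_mono, mono[y] / n_mono
        f_xy = di[x + y] / n_di
        # A dinucleotide with an absent base has no defined ratio; report 0.0.
        rho[x + y] = f_xy / (f_x * f_y) if f_x and f_y else 0.0
    return rho

def signature_distance(seq_a, seq_b):
    """Mean absolute difference between two dinucleotide signatures."""
    rho_a = dinucleotide_abundance(seq_a)
    rho_b = dinucleotide_abundance(seq_b)
    return sum(abs(rho_a[k] - rho_b[k]) for k in rho_a) / 16
```

In a genome scan, `signature_distance(window, genome)` would be evaluated for each window and the resulting values passed through the same standard-deviation filter as the other measures.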

k-mer Distributions:

It has been proposed by Karlin that most horizontally acquired genomic regions have distinct word (k-mer) compositions [5]. Please see supplementary material.
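A minimal sketch of a k-mer comparison follows. The source leaves the details to its supplementary material, so the distance used here (sum of absolute frequency differences) and the function names are assumptions for illustration only.

```python
from collections import Counter

def kmer_frequencies(seq, k):
    """Relative frequency of each overlapping k-mer in seq."""
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence shorter than k")
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: c / total for kmer, c in counts.items()}

def kmer_distance(window, genome, k=3):
    """Sum of absolute differences between two k-mer frequency profiles.
    Ranges from 0 (identical profiles) to 2 (no k-mer in common)."""
    fw = kmer_frequencies(window, k)
    fg = kmer_frequencies(genome, k)
    return sum(abs(fw.get(m, 0.0) - fg.get(m, 0.0)) for m in set(fw) | set(fg))
```

Larger values of k give more specific profiles but need longer windows to be estimated reliably.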

Measures at the Gene Level:

This module identifies horizontally acquired genes in a fully annotated gene set of the organism (in multi-FASTA format). In the absence of this information, IGIPT supports comparison of two gene sets: a representative gene set of the organism and a second set whose horizontal acquisition needs to be confirmed (e.g., genes in GIs predicted by the genome-based measures). This feature also allows comparison of predicted gene(s) with highly expressed genes of the organism, e.g., ribosomal genes, chaperone genes, etc., to reduce false predictions.

Codon usage Bias:

The unequal usage of synonymous codons has been extensively studied and virtually every codon has been shown to be preferentially used in some organisms and rarely used in others. Please see supplementary material.
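One widely used way to quantify unequal synonymous codon usage is relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is illustrative only. RSCU is an assumption here, not necessarily the measure IGIPT implements, and the function names are hypothetical. The standard genetic code is assumed.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

def codon_counts(genes):
    """Count in-frame codons over a list of coding sequences."""
    counts = Counter()
    for gene in genes:
        gene = gene.upper()
        for i in range(0, len(gene) - 2, 3):
            codon = gene[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1
    return counts

def rscu(genes):
    """RSCU per codon: count * (family size) / (total count of the family)."""
    counts = codon_counts(genes)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        for c in family:
            values[c] = counts[c] * len(family) / total if total else 0.0
    return values

print(rscu(["GCTGCTGCTGCC"])["GCT"])  # 3.0
```

Comparing the RSCU profile of a candidate gene set with that of a representative or highly expressed gene set would then highlight genes whose codon preferences look foreign.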

Amino Acid Bias:

This bias refers to the deviation in the frequency of usage of individual amino acids from the average usage of all 20 amino acids. Please see supplementary material.

GC Content at Codon Positions:

This involves comparing the frequency of G or C at the three codon positions, GC1, GC2 and GC3, for a given gene set with the core gene set (or genomic average or highly expressed genes) of the organism [8].

IGIPT provides an option to download the predicted horizontally transferred regions/genes and their flanking regions (lower panel in Figure 1) to facilitate analysis of conserved structural features in the vicinity of probable GIs, e.g., genes coding for integrases or transposases required for chromosomal integration and excision are flanked by direct repeats and are inserted in the vicinity of tRNA and tmRNA genes [9]. This feature is also useful for further analysis such as comparative genomics or phylogenetic analysis of putative GIs. IGIPT outputs the windows/genes that pass the standard-deviation filter and also provides an option to download the unfiltered output in MS Excel format.
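Returning to the codon-position measure described at the start of this section, GC1, GC2 and GC3 can be computed as in the sketch below. The function names are hypothetical, and the simple difference used to compare a test gene set with a core gene set is an assumption rather than IGIPT's documented procedure.

```python
def gc_by_codon_position(genes):
    """Return (GC1, GC2, GC3): the fraction of G or C at each codon position,
    pooled over all complete in-frame codons of the given coding sequences."""
    gc = [0, 0, 0]
    n_codons = 0
    for gene in genes:
        gene = gene.upper()
        usable = len(gene) - len(gene) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            n_codons += 1
            for pos in range(3):
                if gene[i + pos] in "GC":
                    gc[pos] += 1
    if n_codons == 0:
        raise ValueError("no complete codons")
    return tuple(count / n_codons for count in gc)

def gc_position_shift(test_genes, core_genes):
    """Per-position difference (test minus core) in GC1, GC2 and GC3."""
    test = gc_by_codon_position(test_genes)
    core = gc_by_codon_position(core_genes)
    return tuple(t - c for t, c in zip(test, core))

print(gc_by_codon_position(["GATGATGAC"]))  # GC1 = 1.0, GC2 = 0.0, GC3 = 1/3
```

GC3 is usually the most informative of the three because changes at the third codon position are often synonymous.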

Conclusion:

Evolution of species by horizontal gene transfer is very common not only in prokaryotes but also in eukaryotes. It gives the organism unique functionality for adapting to different environmental conditions, and the identification of horizontally transferred regions is particularly useful in pathogens for identifying virulence genes. Since no single measure truly identifies a horizontally acquired region, IGIPT integrates numerous parametric measures on a single platform and allows users to analyze the predicted horizontally transferred regions/genes by thirteen different measures simultaneously, thus greatly increasing the confidence of prediction. A drawback of these parametric methods is that regions acquired from donors with a compositional bias similar to that of the host genome will not be identified.
References:

1. J Lawrence. Selfish operons: the evolutionary impact of gene clustering in prokaryotes and eukaryotes. Curr Opin Genet Dev. 1999-12. (Review)

2. S Karlin. Detecting anomalous gene clusters and pathogenicity islands in diverse bacterial genomes. Trends Microbiol. 2001-07. (Review)

3. E V Koonin; K S Makarova; L Aravind. Horizontal gene transfer in prokaryotes: quantification and classification. Annu Rev Microbiol. 2001. (Review)

4. Ulrich Dobrindt; Bianca Hochhut; Ute Hentschel; Jörg Hacker. Genomic islands in pathogenic and environmental microorganisms. Nat Rev Microbiol. 2004-05. (Review)

5. Morgan G I Langille; William W L Hsiao; Fiona S L Brinkman. Detecting genomic islands using bioinformatics approaches. Nat Rev Microbiol. 2010-05. (Review)

6. S Karlin; J Mrázek. Compositional differences within and between eukaryotic genomes. Proc Natl Acad Sci U S A. 1997-09-16.

7. Sung Ho Yoon; Cheol-Goo Hur; Ho-Young Kang; Yeoun Hee Kim; Tae Kwang Oh; Jihyun F Kim. A computational approach for identifying pathogenicity islands in prokaryotic genomes. BMC Bioinformatics. 2005-07-21.

8. Feng Gao; Chun-Ting Zhang. GC-Profile: a web-based tool for visualizing and analyzing the variation of GC content in genomic sequences. Nucleic Acids Res. 2006-07-01.

9. Gordana M Pavlović-Lazetić; Nenad S Mitić; Milos V Beljanski. n-Gram characterization of genomic islands in bacterial genomes. Comput Methods Programs Biomed. 2008-12-19.

