Frederick Johannes Clasen1,2, Rian Ewald Pierneef3, Bernard Slippers4, Oleg Reva3.
Abstract
BACKGROUND: Genomic islands (GIs) are inserts of foreign DNA that have potentially arisen through horizontal gene transfer (HGT). There is evidence that GIs can contribute significantly to the evolution of prokaryotes. The acquisition of GIs through HGT in eukaryotes has, however, been largely unexplored. In this study, the previously developed GI prediction tool, SeqWord Gene Island Sniffer (SWGIS), is modified to predict GIs in eukaryotic chromosomes. Artificial simulations are used to estimate the false positive and false negative rates of GI prediction by inserting GIs into different test chromosomes and running the SWGIS v2.0 algorithm on them. Using SWGIS v2.0, GIs are then identified in 36 fungal, 22 protozoan and 8 invertebrate genomes.
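The simulation described above can be illustrated with a minimal sketch. It assumes that the coordinates of the artificially inserted GIs are known and that the predictor returns a list of intervals. The overlap criterion and the two rate definitions (missed inserts over all inserts, spurious predictions over all predictions) are assumptions made for this illustration; the abstract does not state the exact definitions used with SWGIS v2.0, and the names `overlaps` and `error_rates` are hypothetical.

```python
from typing import List, Tuple

# (start, end) in 0-based, half-open chromosome coordinates
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def error_rates(inserted: List[Interval],
                predicted: List[Interval]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    Assumed definitions (not taken from the paper):
      false negative rate = inserted GIs with no overlapping prediction
                            divided by the number of inserted GIs
      false positive rate = predictions overlapping no inserted GI
                            divided by the number of predictions
    """
    missed = sum(
        1 for ins in inserted
        if not any(overlaps(ins, p) for p in predicted)
    )
    spurious = sum(
        1 for p in predicted
        if not any(overlaps(p, ins) for ins in inserted)
    )
    fn_rate = missed / len(inserted) if inserted else 0.0
    fp_rate = spurious / len(predicted) if predicted else 0.0
    return fp_rate, fn_rate


if __name__ == "__main__":
    # Toy data: four inserted fragments, three predictions.
    inserted = [(100, 200), (500, 600), (900, 1000), (1500, 1600)]
    predicted = [(150, 250), (520, 580), (2000, 2100)]
    fp, fn = error_rates(inserted, predicted)
    print(f"false positive rate: {fp:.2%}, false negative rate: {fn:.2%}")
```

In a full simulation the `inserted` list would come from the step that moves random fragments between chromosomes, and `predicted` from the GI predictor run on the modified chromosome.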
Keywords: Comparative genomics; Eukaryotes; Genomic island; Horizontal gene transfer; SWGIS v2.0; Software tools
Year: 2018 PMID: 29724163 PMCID: PMC5934851 DOI: 10.1186/s12864-018-4724-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Graphical output of the SWGIS v2.0 algorithm. SWGIS v2.0 displays GIs (pink blocks) on large linear chromosomes. Several parameters are also shown on the graph: GC-content (black curve); ratio of generalized to local relative variances calculated for tetranucleotide usage patterns normalized by the GC-content (blue curve, n1_4mer:GRV/n1_4mer:RV); distances between the non-normalized local tetranucleotide usage pattern and the global pattern calculated for the complete chromosome (red curve, n0_4mer:D); asymmetry between non-normalized tetranucleotide usage patterns calculated for the direct and complement DNA strands (green curve, n0_4mer:PS). The use of these parameters for GI identification and their standard abbreviations are explained in more detail by Ganesan et al. (2008) and on the EuGI website (http://eugi.bi.up.ac.za) [25]
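To make two of the plotted quantities more concrete, the following sketch computes GC-content and a local-versus-global tetranucleotide distance in sliding windows. It is a deliberately simplified stand-in: the distance here is a plain sum of absolute frequency differences, which is an assumption of this example and not the n0_4mer:D formula used by SeqWord (see Ganesan et al. for that). The function names, window size and step are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import List, Tuple

# All 256 tetranucleotides in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


def tetra_freqs(seq: str) -> List[float]:
    """Relative frequencies of overlapping tetranucleotides."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)  # ignores words with N etc.
    return [counts[k] / total if total else 0.0 for k in KMERS]


def pattern_distance(local: List[float], global_: List[float]) -> float:
    """Simplified distance: sum of absolute frequency differences."""
    return sum(abs(a - b) for a, b in zip(local, global_))


def scan(seq: str, window: int, step: int) -> List[Tuple[int, float, float]]:
    """Return (window_start, gc_content, distance_to_global) per window."""
    global_freqs = tetra_freqs(seq)
    rows = []
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        rows.append(
            (start, gc_content(w),
             pattern_distance(tetra_freqs(w), global_freqs))
        )
    return rows


if __name__ == "__main__":
    # Toy chromosome: AT-rich host with a GC-rich insert at 400..500.
    toy = "AT" * 200 + "GC" * 50 + "AT" * 200
    for start, gc, dist in scan(toy, window=100, step=50):
        print(start, round(gc, 2), round(dist, 3))
```

On the toy sequence, the window that coincides with the compositionally atypical insert shows the largest distance from the global pattern, which is the kind of signal the red curve is described as capturing.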
False positive and false negative values of SWGIS v2.0 calculated for different naïve chromosomes
| | False positive a | False negative |
|---|---|---|
| | 16% | 5.67% |
| | 0% | 7.73% |
| | 0% | 30.93% |
| | 0% | 30.41% |
| | 1% | 16.49% |
| | 2% | 21.13% |
| | 11% | 24.74% |
| | 1% | 29.90% |
a Random chromosome fragments were moved from the other chromosome of the same organism (e.g. for NW_139454 random fragments were moved from NW_139474 to NW_139454 and vice versa for NW_139474)
False positive and false negative values of SWGIS v2.0 calculated for different non-naïve chromosomes
| | False positive a | False negative |
|---|---|---|
| Fungi | | |
| | 18% | 30.93% |
| | 16% | 34.54% |
| | 2% | 11.34% |
| | 7% | 42.78% |
| Protozoa | | |
| | 25% | 7.22% |
| | 22% | 5.67% |
| Invertebrates | | |
| | 49% | 13.40% |
| | 6% | 9.28% |
a Random chromosome fragments were moved from the chromosome following the one listed in the table (e.g. for A. fumigatus random fragments were moved from NC_007195 to NC_007194, and similarly for the other organisms listed in column one)
Distribution of GIs identified by SWGIS v2.0 versus Mallet et al.
| Chromosome # | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total |
|---|---|---|---|---|---|---|---|---|---|
| #GIs – Mallet et al. | 30 | 28 | 28 | 22 | 22 | 31 | 14 | 14 | 189 |
| #GIs – SWGIS v2.0 | 19 | 24 | 26 | 17 | 16 | 20 | 11 | 8 | 141 |
Fig. 2 Comparison of GI sizes identified by Mallet et al. and SWGIS v2.0
Fig. 3 BLASTN alignment of coding GIs identified in D. ananassae and the Wolbachia endosymbiont of D. ananassae. Circoletto [39] was used to visualize sequence similarity between coding GIs of D. ananassae and its Wolbachia endosymbiont. Scaffolds of Wolbachia are represented by their accession numbers from the original GenBank file downloaded from NCBI. Predicted GIs from D. ananassae are indicated by the scaffold accession number (e.g. NW.001939327) followed by a number that indicates the index of the GI predicted in that scaffold (e.g. NW.001939327.1). Line colours indicate e-values between queries and subjects, with red the smallest, orange the second smallest, green the third and blue the highest e-values
GIs predicted in different eukaryotic lineages using SWGIS v2.0
| | # species | # chromosomes | # GIs | # coding GIs |
|---|---|---|---|---|
| Fungi | 36 | 614 | 3080 | 2299 |
| Protozoa | 22 | 392 | 2911 | 2506 |
| Invertebrates | 8 | 56 | 4559 | 494 |
| Total | 66 | 1062 | 10,550 | 5299 |