Choumouss Kamoun, Thibaut Payen, Aurélie Hua-Van, Jonathan Filée.
Abstract
BACKGROUND: Insertion Sequences (ISs) and their non-autonomous derivatives (MITEs) are important components of prokaryotic genomes that induce duplications, deletions, rearrangements and lateral gene transfers. Although ISs and MITEs are relatively simple genetic elements, their detection remains difficult because of their remarkable sequence diversity. With the advent of high-throughput genome and metagenome sequencing technologies, the development of fast, reliable and sensitive methods for IS and MITE detection has become an important challenge. So far, almost all studies dealing with prokaryotic transposons have used classical BLAST-based detection methods against reference libraries. Here we introduce alternative detection methods that either take advantage of the structural properties of the elements (de novo methods) or use an additional library-based approach relying on profile HMM searches.
Year: 2013 PMID: 24118975 PMCID: PMC3852290 DOI: 10.1186/1471-2164-14-700
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Simplified workflow of the methods. The software run at each step is indicated in blue; rectangles represent the output files produced at each step.
Figure 2. Simplified workflow of the profile HMM search. The software run at each step is indicated in blue; rectangles represent the output files produced at each step.
Respective performance of the different methods against a reference dataset of 30 archaeal genomes
| Number of different ISs | 190 | 187 | 125 | 67 |
| Number of different ISs > 2 copies | 115 | 120 (+4.3%) | 125 (+8.7%) | 67 (-47.8%) |
| Number of different MITEs | 26 | 26 (+0%) | 39 (+50%) | 39 (+50%) |
| False positives | 0 | 11 | 99 | 0 |
Performance of a BLAST-based search compared to a profile HMM search for diverse simulated and real metagenomes
| | 149 0 | 133 0 |
| | 149 0 | 129 0 |
| | 149 0 | 29 0 |
| | 20 NA | 13 NA |
| PBS Metagenome, ~1 kb | 189 281 | 264 7 |
| JCVI Metagenome, ~900 bp | 44 114 | 87 0 |
The Sulfolobus datasets correspond to in silico fragmented genomes, and the results indicate the number of individual transposases identified. For the real marine metagenomes, the results indicate the number of apparent true transposases identified (false or ambiguous positives were removed after careful visual inspection of the alignments), followed by the number of false positives. NA: not applicable (see the text for further details).
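To make the two-number cells of the metagenome rows easier to compare, the counts can be converted into a precision value, i.e. the fraction of reported hits that are apparent true transposases. The sketch below is only an illustration: precision is not a figure reported in the table, the column labels are not legible in this copy (hence the neutral, hypothetical names `column_1` and `column_2`), and the helper `precision` is an assumed name.

```python
# Counts transcribed from the two metagenome rows of the table above, as
# (apparent true transposases, false positives) for each result column.
# The column labels are not legible here, so neutral names are used.
counts = {
    "PBS Metagenome, ~1 kb": {"column_1": (189, 281), "column_2": (264, 7)},
    "JCVI Metagenome, ~900 bp": {"column_1": (44, 114), "column_2": (87, 0)},
}


def precision(true_hits: int, false_hits: int) -> float:
    """Fraction of reported hits that are apparent true transposases."""
    total = true_hits + false_hits
    if total == 0:
        raise ValueError("no hits reported")
    return true_hits / total


# Precision per dataset and per column, rounded to three decimals.
precision_table = {
    dataset: {col: round(precision(*pair), 3) for col, pair in cols.items()}
    for dataset, cols in counts.items()
}

for dataset, cols in precision_table.items():
    for col, value in cols.items():
        print(f"{dataset:26s} {col}: precision = {value:.3f}")
```

The Sulfolobus rows are left out because they report no false positives (or NA), so their precision is either trivially 1 or undefined.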
Main characteristics of the new ISs identified using these methods
| | ISCNY (?) | 9 3 | 10e-12 IS | 0 | 0 | |
| | ? | 8 2 | - | 0 | 0 | |
| | IS200/605 | 4 0 | 10e-20 IS | 0 | 0 | |
| | ? | 4 0 | - | 6 | 27 | |
| | IS200/605 | 4 0 | 6e-36 IS | 0 | 0 | |
| | ? | 6 1 | - | 9 | 17 | |
| | IS3 | 4 1 | 5e-29 IS | 0 | 16 | |
| | IS4 (?) | 3 0 | 5e-9 IS | 0 | 34 | |
| | IS200/605 | 5 0 | 5e-18 IS | 0 | 0 | |
| | ? | 13 6 | - | 0 | 24 | |
| | ? | 2 1 | - | 0 | 16 | |
| | ? | 2 0 | - | 0 | 21 | |
| | ? | 5 0 | - | 0 | 19 | |
| | IS200/605 | 3 0 | 8e-35 IS | 0 | 10 | |
| | ? | 2 1 | - | 0 | 14 | |
The host genomes and the names according to the ISFinder nomenclature are indicated, together with the IS family when it is defined, the copy number (C: complete, P: partial), the similarity with ISFinder (04/2011 update) and the sizes of the IRs and DRs.
Main characteristics of the new MITEs identified using these methods
| IS | ? | 136 | 6 | 15 | 4 | |
| IS | ? | 274 | 5 | 14 | 0 | |
| IS | IS | 368 | 17 | 0 | 0 | |
| IS | IS | 237 | 34 | 17 | 5 | |
| IS | IS | 570 | 13 | 0 | 0 | |
| IS | IS | 165 | 7 | 25 | 8 | |
| IS | IS | 185 | 5 | 16 | 0 | |
| IS | IS | 147 | 5 | 15 | 0 | |
| IS | IS | 247 | 23 | 20 | 6 | |
| IS | IS | 418 | 5 | 0 | 0 | |
| IS | IS | 89 | 5 | 16 | 0 | |
| IS | IS | 249 | 15 | 18 | 0 | |
| IS | IS | 160 | 14 | 26 | 4 | |
| IS | IS | 140 | 24 | 16 | 3 | |
| IS | IS | 179 | 6 | 22 | 7 | |
| IS | IS | 252 | 12 | 17 | 0 | |
| IS | IS | 169 | 7 | 25 | 0 | |
| IS | IS | 341 | 2 | 0 | 0 |
The host genomes, the potential autonomous IS partner and its family, when defined, are indicated. The length (in bp), the copy number and the sizes of the DRs and IRs are also given.
Figure 3. Sequence alignment of the 3′ and 5′ ends of a representative set of MITEs identified in this study with their putative autonomous IS partners. Names of the MITEs and their autonomous ISs are given according to the ISFinder nomenclature; gray shading indicates conserved residues and the spaces delimit the IRs of each element.
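The IR sizes listed in the tables and delimited in Figure 3 refer to terminal inverted repeats, where one end of an element matches the reverse complement of the other end. As a rough illustration of how such a length can be measured on a single sequence, here is a minimal sketch. It is not the pipeline used in the study: the function names (`reverse_complement`, `terminal_ir_length`), the parameters `max_len` and `max_mismatches`, and the toy sequence are all assumptions made for this example.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def terminal_ir_length(element: str, max_len: int = 50,
                       max_mismatches: int = 0) -> int:
    """Length of the terminal inverted repeat of `element`.

    The 5' end is compared base by base with the reverse complement of the
    3' end, extending inward until more than `max_mismatches` mismatches
    have been seen. The returned length always ends on a matching position.
    """
    seq = element.upper()
    rc = reverse_complement(seq)
    # The two arms cannot overlap, so never look past the midpoint.
    limit = min(max_len, len(seq) // 2)
    best = 0
    mismatches = 0
    for i in range(limit):
        if seq[i] == rc[i]:
            best = i + 1
        else:
            mismatches += 1
            if mismatches > max_mismatches:
                break
    return best


# Toy element (hypothetical): an 8 bp arm, a poly-T spacer, and the
# reverse complement of the arm at the other end.
arm = "GGCATTAC"
toy_element = arm + "T" * 10 + reverse_complement(arm)
print(terminal_ir_length(toy_element))  # 8
```

Real elements usually carry imperfect IRs, so a non-zero `max_mismatches` or a proper local alignment of the two ends would be more realistic than the exact comparison shown here.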