Gonzalo Riadi, Cristobal Medina-Moenne, David S Holmes.
Abstract
Transposases (Tnps) are enzymes that participate in the movement of insertion sequences (ISs) within and between genomes. Genes that encode Tnps are amongst the most abundant and widely distributed genes in nature. However, they are difficult to predict bioinformatically, and given the increasing availability of prokaryotic genomes and metagenomes, there is a pressing need for rapid, high-quality automatic annotation of ISs. This need prompted us to develop a web service, termed TnpPred, for Tnp discovery. As determined by ROC analysis, it provides better sensitivity and specificity for Tnp predictions than currently available programs. TnpPred should be useful for improving genome annotation. The TnpPred web service is freely available for noncommercial use.
Year: 2012 PMID: 23251097 PMCID: PMC3506888 DOI: 10.1155/2012/678761
Source DB: PubMed Journal: Comp Funct Genomics ISSN: 1531-6912
A comparison of the selectivities, sensitivities, and cutoff e-values derived from TnpPred versus the corresponding Pfam HMM profiles for 19 IS families.
| No. | IS family | Pfam HMM | Pfam selectivity | Pfam sensitivity | Pfam cutoff¹ | TnpPred HMM | TnpPred selectivity | TnpPred sensitivity | TnpPred cutoff¹ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | IS1 | Transposase_27 | 96.4% | 83.2% | 1.2 | Combined* | 100.0% | 95.4% | 2.0 |
| | | — | — | — | — | ORF1* | 99.9% | 100.0% | 32 |
| | | — | — | — | — | ORF2* | 99.4% | 100.0% | 3.9 |
| 2 | IS110 | Transposase_9 | 95.3% | 94.2% | 7.0 | ORF1* | 100.0% | 100.0% | 1.3 |
| | | Transposase_20 | 100.0% | 99.1% | 1.2 | | | | |
| 3 | IS1380 | — | — | — | — | ORF1* | 100.0% | 100.0% | 2.3 |
| 4 | IS200/IS605 | Transposase_17 | 100.0% | 100.0% | 3.9 | ORF1* | 100.0% | 100.0% | 4.2 |
| 5 | IS21 | — | — | — | — | Combined* | 93.9% | 96.0% | |
| | | — | — | — | — | ORF1* | 100.0% | 93.7% | 84 |
| | | IstB_N | 72.8% | 79.3% | 6.7 | ORF2* | 100.0% | 100.0% | 2.0 |
| | | IstB | 76.6% | 79.5% | 2.6 | | 100.0% | 100.0% | 2.0 |
| 6 | IS256 | Transposase_mut* | 100.0% | 100.0% | 8.7 | ORF1 | 99.4% | 98.8% | 3.1 |
| 7 | IS3_IS150 | — | — | — | — | Combined* | 99.5% | 81.8% | 2.7 |
| | | — | — | — | — | ORF1* | 90.3% | 69.7% | 5.4 |
| | | — | — | — | — | ORF2 | 100.0% | 100.0% | 2.4 |
| 7 | IS3_IS2 | Transposase_8 | 80.1% | 78.8% | 2.5 | Combined* | 100.0% | 100.0% | 5.7 |
| | | — | — | — | — | ORF1* | 100.0% | 90.0% | 1.2 |
| | | — | — | — | — | ORF2* | 100.0% | 100.0% | 1.3 |
| 7 | IS3_IS3 | — | — | — | — | Combined* | 98.6% | 89.6% | 3.3 |
| | | — | — | — | — | ORF1* | 93.6% | 76.5% | 2.5 |
| | | — | — | — | — | ORF2* | 100.0% | 100.0% | 7.9 |
| 7 | IS3_IS407 | — | — | — | — | Combined* | 99.9% | 100.0% | 3.9 |
| | | — | — | — | — | ORF1* | 99.7% | 95.8% | 3.0 |
| | | — | — | — | — | ORF2* | 100.0% | 100.0% | 3.5 |
| 7 | IS3_IS51 | — | — | — | — | Combined* | 100.0% | 91.4% | 5.4 |
| | | — | — | — | — | ORF1* | 87.4% | 74.3% | 5.5 |
| | | — | — | — | — | ORF2 | 100.0% | 100.0% | 8.1 |
| 8 | IS30 | — | — | — | — | ORF1* | 100.0% | 100.0% | 1.7 |
| 9 | IS4 | Transposase_11 | 99.0% | 96.0% | 9.5 | ORF1* | 100.0% | 96.1% | 1.3 |
| | | Transposase_Tn5 | 51.8% | 58.9% | 1.1 | | | | |
| 10 | IS481 | Mu-transpos_C | 67.7% | 54.0% | 6.7 | ORF1* | 99.9% | 100.0% | 4.0 |
| 11 | IS5_IS1031 | — | — | — | — | ORF1* | 100.0% | 100.0% | 1.2 |
| 11 | IS5_IS427 | — | — | — | — | Combined* | 99.7% | 97.7% | 6.7 |
| 11 | IS5_IS5 | Transposase_33 | 54.6% | 60.4% | 1.1 | ORF1* | 100.0% | 100.0% | 4.2 |
| 11 | IS5_IS903 | — | — | — | — | ORF1* | 100.0% | 100.0% | 7.3 |
| 11 | IS5_ISH1 | — | — | — | — | ORF1* | 100.0% | 100.0% | 1.5 |
| 11 | IS5_ISL2 | — | — | — | — | ORF1* | 100.0% | 100.0% | 2.6 |
| 12 | IS6 | — | — | — | — | ORF1* | 100.0% | 100.0% | 4.1 |
| 13 | IS630 | Transposase_14 | 52.8% | 68.2% | 9.8 | ORF1* | 98.4% | 97.7% | 2.7 |
| 14 | IS66 | Transposase_34 | 89.9% | 73.1% | 2.4 | Combined | 85.6% | 79.0% | 1.5 |
| | | — | — | — | — | ORF1* | 97.8% | 82.6% | 1.4 |
| | | — | — | — | — | ORF2* | 94.0% | 88.4% | 3.4 |
| | | — | — | — | — | ORF3* | 100.0% | 88.8% | 1.7 |
| 15 | IS91 | Transposase_32 | 100.0% | 100.0% | 7.9 | ORF1* | 100.0% | 100.0% | 4.1 |
| 16 | IS982 | — | — | — | — | ORF1* | 99.3% | 99.2% | 3.5 |
| 17 | ISAs1 | — | — | — | — | ORF1* | 100.0% | 100.0% | 4.1 |
| 18 | ISL3 | Transposase_12 | 100.0% | 99.0% | 4.7 | ORF1* | 100.0% | 99.0% | 5.0 |
| 19 | Tn3 | Transposase_7* | 100.0% | 63.3% | 1.7 | ORF1* | 94.8% | 68.3% | 0.0 |

¹Cutoff e-values are derived from ROC charts for each model (Supplementary File 1); * indicates the HMM that was selected for incorporation into the TnpPred web service.
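
The cutoffs above are read from ROC charts. As a rough illustration of how such a cutoff could be chosen, the sketch below computes sensitivity and specificity at every candidate e-value and keeps the one that maximizes Youden's J. It is not the authors' procedure: the use of Youden's J, the reading of "selectivity" as specificity, the function names, and the toy hits are all assumptions made for the example.

```python
from typing import List, Tuple

# (e_value, is_true_transposase) pairs; a hit counts as a positive
# prediction when its e-value is at or below the cutoff.
Hit = Tuple[float, bool]


def roc_points(hits: List[Hit]) -> List[Tuple[float, float, float]]:
    """Return (cutoff, sensitivity, specificity) for each observed e-value."""
    positives = sum(1 for _, label in hits if label)
    negatives = len(hits) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    points = []
    for cutoff in sorted({e for e, _ in hits}):
        tp = sum(1 for e, label in hits if label and e <= cutoff)
        fp = sum(1 for e, label in hits if not label and e <= cutoff)
        sensitivity = tp / positives
        specificity = (negatives - fp) / negatives
        points.append((cutoff, sensitivity, specificity))
    return points


def best_cutoff(hits: List[Hit]) -> Tuple[float, float, float]:
    """Pick the ROC point maximizing Youden's J (assumed criterion)."""
    return max(roc_points(hits), key=lambda p: p[1] + p[2] - 1.0)


# Toy data, invented for illustration only.
toy_hits = [
    (1e-30, True), (1e-12, True), (1e-5, True),
    (1e-3, False), (0.5, False), (2.0, True),
]
cutoff, sensitivity, specificity = best_cutoff(toy_hits)
print(cutoff, sensitivity, specificity)
```

On the toy data the selected cutoff is 1e-05, with sensitivity 0.75 and specificity 1.0. A stricter or looser criterion would move the cutoff along the same curve.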
Summary of Tnp predictions by TnpPred compared to ISsaga.
| Organism | | |
|---|---|---|
| Kingdom | Bacteria | Bacteria |
| Class | Acaryochloris | Gammaproteobacteria |
| Date | May 27, 10 | July 9, 10 |
| Accession number | NC_009925 | NC_010943 |
| % G + C | 47.3% | 66.3% |
| Length (Mbp) | 6.5 | 4.9 |
| Confirmed total | 244 | 46 |
| Class A* | 214 | 42 |
| Class B* | 30 | 4 |
| Class C* | 22 | 1 |
| Not found by TnpPred | 27 | 6 |
| Number of IS families TnpPred | 17 | 10 |
| Total: not found + TnpPred | 293 | 53 |
| Number of IS families ISsaga | 17 | 9 |
| Total TnpPred | 266 | 47 |
| Total ISsaga | 272 | 39 |
*See Figure 1 for the definition of classes.
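
The totals in this table appear to be related by simple sums: Class A plus Class B gives the confirmed total, adding Class C gives the TnpPred total, and adding the predictions not found by TnpPred gives the final total. These relationships are inferred from the numbers, not stated in the source, and the dictionary keys below are hypothetical labels. A short check, as a sketch:

```python
# Counts copied from the table, keyed by accession number.
TABLE = {
    "NC_009925": {
        "confirmed": 244, "class_a": 214, "class_b": 30, "class_c": 22,
        "not_found": 27, "total_tnppred": 266, "total_with_not_found": 293,
    },
    "NC_010943": {
        "confirmed": 46, "class_a": 42, "class_b": 4, "class_c": 1,
        "not_found": 6, "total_tnppred": 47, "total_with_not_found": 53,
    },
}


def is_consistent(row: dict) -> bool:
    """Check the three inferred sums for one genome."""
    return (
        row["class_a"] + row["class_b"] == row["confirmed"]
        and row["confirmed"] + row["class_c"] == row["total_tnppred"]
        and row["total_tnppred"] + row["not_found"] == row["total_with_not_found"]
    )


for accession, row in TABLE.items():
    print(accession, is_consistent(row))
```

Both genomes pass the check, which suggests that Class C predictions are counted in addition to the confirmed total.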
Figure 1: Classes of improvement of gene annotation using TnpPred. (a) Additional information, such as family classification, is provided for a previously annotated transposase; (b) a transposase is predicted for a gene previously annotated as hypothetical; (c) a transposase is predicted where no prior annotation existed.
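
The three classes can be expressed as a small decision rule on the annotation a gene carried before TnpPred was run. The sketch below is only an illustration: the keyword matching on "transposase" and "hypothetical", the enum, and the function name are assumptions, not part of TnpPred.

```python
from enum import Enum
from typing import Optional


class ImprovementClass(Enum):
    A = "previously annotated transposase, family information added"
    B = "previously annotated as hypothetical"
    C = "no prior annotation"


def classify_improvement(prior_annotation: Optional[str]) -> Optional[ImprovementClass]:
    """Map the prior annotation of a predicted transposase to class A, B or C.

    Returns None when the prior annotation fits none of the three classes.
    """
    if prior_annotation is None or not prior_annotation.strip():
        return ImprovementClass.C
    text = prior_annotation.lower()
    if "transposase" in text:  # assumed keyword test
        return ImprovementClass.A
    if "hypothetical" in text:  # assumed keyword test
        return ImprovementClass.B
    return None


print(classify_improvement("putative transposase"))
print(classify_improvement("hypothetical protein"))
print(classify_improvement(None))
```

Real annotation strings vary widely, so a production rule would need a curated vocabulary in place of two keywords.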