Kira S Makarova, Yuri I Wolf, Eugene V Koonin.
Abstract
BACKGROUND: The prokaryotic toxin-antitoxin systems (TAS, also referred to as TA loci) are widespread, mobile two-gene modules that can be viewed as selfish genetic elements because they evolved mechanisms to become addictive for the replicons and cells in which they reside, but they also possess "normal" cellular functions in various forms of stress response and in the management of prokaryotic populations. Several distinct TAS of type 1, in which the toxin is a protein and the antitoxin is an antisense RNA, and numerous unrelated TAS of type 2, in which both the toxin and the antitoxin are proteins, have been experimentally characterized, and it is suspected that many more remain to be identified.
Year: 2009 PMID: 19493340 PMCID: PMC2701414 DOI: 10.1186/1745-6150-4-19
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
Figure 1. Two computational strategies for the identification of TAS.
Previously characterized and new candidate TAS detected with the first approach
| COG1848 | PIN | 1.1 | COG8614 | RHH | 1.0 |
| 1.2 | COG0640 | ArsR family HTH | 0.8 | ||
| 0.9 | 1.0 | ||||
| COG4679 | RelE | 0.8 | COG5606 | HTH | 0.6 |
| COG2026 | RelE | 0.7 | 0.7 | ||
| COG3668 | RelE | 0.9 | COG2161 | StbD/axe | 0.5 |
| 0.4 | 0.9 | ||||
| COG3657 | RelE | 0.7 | COG3636 | HTH | 0.6 |
| COG1848 | PIN | 0.4 | COG2002 | AbrB/MazE/PemI | 0.9 |
| COG3668 | RelE | 0.9 | COG3609 | RHH | 0.4 |
| 0.6 | |||||
| COG9434 | MazF | 0.6 | COG5302 | CcdA | 0.7 |
| COG1598 | hicB | 0.7 | COG1724 | hicA | 0.6 |
| COG3549 | RelE | 0.7 | COG3093 | HTH | 0.6 |
| COG2026 | RelE | 0.4 | COG2161 | StbD_axe | 0.9 |
| COG3668 | RelE | 0.7 | COG9004 | RHH | 0.5 |
| COG6187 | RelE | 0.6 | COG2944 | HTH | 0.6 |
| COG1487 | PIN | 0.7 | COG4710 | RHH | 0.4 |
| COG1487 | PIN | 0.6 | COG4456 | AbrB/MazE/PemI | 0.4 |
| COG3742 | PIN | 0.5 | COG4423 | RHH | 0.4 |
| COG1487 | PIN | n/a | COG5450 | RHH | 0.5 |
COG numbers below 5600 correspond to the COGs that are available on the NCBI site:
COGs that were predicted to include novel toxins and antitoxins in this work are shown by bold type.
CV – coefficient of variation.
Previously characterized and potential new TAS components detected with the second approach
| [ | PIN | Nuclease | 312 | |
| MazF | RNA interferase | 97 | ||
| Fic/Doc | AMPylation enzyme | 48 | ||
| RelE | RNA interferase | 27 | ||
| Table 1 | Aha1 domain | 280 | ||
| COG0394* | Arsenate reductase arsC; Part of a larger conserved gene association | 98 | ||
| COG2217* | Cation transport ATPase; Part of a larger conserved gene association | 83 | ||
| COG0798* | Arsenite efflux pump ACR3; Part of a larger conserved gene association | 74 | ||
| COG2391* | YeeE/YedE family, DUF395; Part of a larger conserved gene association | 64 | ||
| COG1055* | Arsenical pump membrane protein; Part of a larger conserved gene association | 53 | ||
| [ | RelE | RNA interferase | 376 | |
| PIN | Nuclease | 335 | ||
| Acetyltransferase | 139 | |||
| MazF | RNA interferase | 62 | ||
| COG3505* | TraG/TraD/VirD4 family; Part of a larger conserved gene association | 58 | ||
| DUF497 | 55 | |||
| ParA* | ParA, plasmid partitioning ATPase; Part of a larger conserved gene association | 54 | ||
| HicA | RNA interferase | 26 | ||
| RHH | DNA-binding domain | 23 | ||
| COG0716* | Flavodoxin; Part of a larger conserved gene association | 22 | ||
| COG4962* | Type II/IV secretion system protein; Part of a larger conserved gene association | 21 | ||
| [ | AbrB | DNA-binding domain | 55 | |
| xre | DNA-binding domain | 22 | ||
| Unknown | 11 | |||
| [ | AbrB | DNA-binding domain | 107 | |
| RHH | DNA-binding domain | 81 | ||
| MazF/ccd | RNA interferase | 43 | ||
| Unknown | 29 | |||
| [ | xre | Transcriptional regulator | 730 | |
| RHH | DNA-binding domain | 510 | ||
| PHD | DNA-binding domain | 337 | ||
| Zn peptidase (fused to HTH) | 15 | |||
| Predicted DNA-binding domain; RHH fold | 10 | |||
| [ | RHH | DNA-binding domain | 366 | |
| AbrB | DNA-binding domain | 348 | ||
| PHD | DNA-binding domain | 285 | ||
| Protein of unknown function DUF433 | 97 | |||
| Uncharacterized protein family (UPF0175) | 46 | |||
| Zn peptidase (fused to HTH) | 42 | |||
| MazF/ccd | COG5302 | 41 | ||
| COG1211* | 4-diphosphocytidyl-2-methyl-D-erythritol synthase; Part of a larger conserved gene association | 39 | ||
| Transcriptional regulator | 33 | |||
| COG1066* | Sms; Part of a larger conserved gene association | 26 | ||
| COG1092* | SAM-dependent methyltransferase; Part of a larger conserved gene association | 26 | ||
| Predicted DNA-binding protein; AbrB superfamily | 24 | |||
| pfam00155* | Aminotransferase; Part of a larger conserved gene association | 24 | ||
| COG5257* | Translation initiation factor 2; Part of a larger conserved gene association | 20 | ||
| Predicted DNA-binding domain; RHH fold | 20 | |||
| [ | PIN | Nuclease | 276 | |
| PemK/MazFI | 15 | |||
| Table 1 | HEPN | Unknown | 445 | |
| Table 1 | MNT | Predicted nucleotidyltransferase | 482 | |
| [ | RelE | RNA interferase | 614 | |
| HipA | EF-Tu kinase | 244 | ||
| Zincin protease | 194 | |||
| xre | A variety of proteins containing an xre-like HTH, many fused with various domains; not a distinct set | 97 | ||
| PIN | Nuclease | 67 | ||
| Unknown | 64 | |||
| COG0800* | 2-keto-3-deoxy-6-phosphogluconate aldolase; Part of a larger conserved gene association | 46 | ||
| COG3842* | PotA, an ABC-type transporter; Part of a larger conserved gene association | 40 | ||
| PA2784-like* | A membrane protein, likely an exporter | 36 | ||
| YoaS-like* | A membrane protein, likely a permease | 32 | ||
| BRO family; KilA – lethal to host cells | 30 | |||
| Acetyltransferase | 30 | |||
| COG3063* | Tfp pilus assembly protein PilF; Part of a larger conserved gene association | 27 | ||
| PA4076-like* | A membrane protein, likely an exporter | 27 | ||
| COG4974* | Site-specific recombinase XerD; Apparent phage components with another function | 25 | ||
| COG0483* | Archaeal fructose-1,6-bisphosphatase; Part of a larger conserved gene association | 21 | ||
| Predicted transcriptional regulator | 118 | |||
| [ | xre | Transcriptional regulator | 333 | |
| COG3550 | HipA C-terminal | 51 | ||
| [ | xre | Transcriptional regulator | 145 | |
| PIN | Nuclease | 60 | ||
| RelE | RNA interferase | 22 | ||
| [ | PIN | Nuclease | 25 | |
| Table 1 | Transcriptional regulator | 259 | ||
| [ | COG4636 | Predicted endonuclease | 114 | |
| [ | Predicted RNA interferase | 43 | ||
New TAS discussed in this work are shown in bold type; other associations marked by asterisk in column 3 were disregarded for reasons indicated in column 4.
Predicted new TAS
| MNT – minimal nucleotidyltransferase, possible toxin; HEPN – possible substrate binding domain; Structure solved (MNT: 1no5 and HEPN: 1o3u and 1jog). Molecular mechanism unknown. | ||
| PIN | Structure of AT is solved (PDB: | |
| PIN | Structure of AT is solved (PDB: | |
| PIN | AT – RHH (RHH); Specific for archaea | |
| PIN | AT: truncated MerR | |
| AT – predicted HTH domain; T – predicted RelE superfamily protein | ||
| RelE | Specific for methanogens; AT – predicted RHH superfamily protein | |
| MazF | No prediction for AT | |
| Fic/Doc | AT – predicted DNA-binding protein; Specific for enteroproteobacteria | |
| PHD | T – predicted MazF superfamily protein; Molecular mechanism is likely the same as for MazF toxin | |
| AT: predicted RHH family protein; Molecular mechanism unknown. | ||
| Xre/cro HTH | T – no prediction; molecular mechanism unknown. | |
| Xre/cro HTH | T – predicted Zn-dependent protease. Often fused to AT domain. Frequent association with RelE and PIN toxins | |
| AT – xre family HTH; T – RES domain; Molecular mechanism unknown. | ||
| Xre/cro HTH | T: motility quorum-sensing regulator mqsR [ | |
| Xre/cro HTH | The closest characterized GNAT family acetyltransferase is involved in antibiotic resistance [ | |
| RHH | T – GNAT family acetyltransferase | |
| Xre/cro HTH | ||
| T – Cyclase/dehydratase family protein. (PDB: | ||
New toxins and antitoxins predicted in this work are shown in bold type
Figure 2. Predicted new families of toxins. A. Multiple alignment of the COG4679 family (RelE interferase superfamily). B. Multiple alignment of the SMa0917 family (PemK/MazF interferase superfamily). C. Multiple alignment of the COG2929 family, a representative of the RelE interferase superfamily. The sequences are denoted by Gene Identification (GI) numbers from the GenBank database and abbreviated species names. Species name abbreviations (generally consisting of the first 3 letters of the genus name and the first 4 letters of the species name) for all alignments are given in Additional file 13. The positions of the first and the last residues of the aligned region in the corresponding protein are indicated for each sequence. The numbers within the alignment represent poorly conserved inserts that are not shown. The coloring is based on the consensus shown underneath the alignment; h indicates hydrophobic residues (ACFILMVWY), p indicates polar residues (STEDKRNQH), s indicates small residues (AGSVC) and a indicates aromatic residues (WYFH). The secondary structure elements are shown according to structural data if a structure is available, or otherwise as predicted using the PSIPRED program [106]; E indicates β-strand and H indicates α-helix.
Figure 3. Distribution of TAS across bacterial and archaeal taxa. Black: TAS absent in the taxon while random expectation is significantly non-zero. Dark gray: TAS absent in the taxon with random expectation not significantly different from zero. Blue: TAS is significantly underrepresented in a taxon with more than twofold difference from random expectation. Cyan: TAS is significantly underrepresented in a taxon with less than twofold difference from random expectation. Light gray: abundance of a TAS in a taxon does not significantly differ from random expectation. Orange: TAS is significantly overrepresented in a taxon with less than twofold difference from random expectation. Red: TAS is significantly overrepresented in a taxon with more than twofold difference from random expectation. The random expectation estimate is based on the total number of TAS of the given type and the total number of protein-coding genes in the given taxon. The statistical significance was estimated using the χ² test (critical χ² value of 3.84 for 1 degree of freedom and p-value of 0.05).
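The classification described in the Figure 3 caption can be illustrated with a minimal sketch. It is not the authors' code: the two-cell form of the χ² statistic (genes inside the taxon versus all other genes), the treatment of zero counts, and every function and parameter name are assumptions made for illustration. Only the critical value of 3.84 and the color categories come from the caption.

```python
# Illustrative sketch only; the exact statistic used in the paper is not given in the caption.
CHI2_CRITICAL = 3.84  # 1 degree of freedom, p = 0.05 (value quoted in the caption)


def chi2_vs_random(observed, total_tas, taxon_genes, all_genes):
    """Return (expected, chi2) for one TAS type in one taxon.

    expected = total TAS of this type * fraction of all protein-coding
    genes that belong to the taxon (the "random expectation").
    """
    if not (0 < taxon_genes < all_genes) or total_tas <= 0:
        raise ValueError("need 0 < taxon_genes < all_genes and total_tas > 0")
    expected = total_tas * taxon_genes / all_genes
    rest_observed = total_tas - observed   # TAS of this type outside the taxon
    rest_expected = total_tas - expected
    chi2 = ((observed - expected) ** 2 / expected
            + (rest_observed - rest_expected) ** 2 / rest_expected)
    return expected, chi2


def classify(observed, total_tas, taxon_genes, all_genes):
    """Map one (TAS type, taxon) cell to the color categories of the caption."""
    expected, chi2 = chi2_vs_random(observed, total_tas, taxon_genes, all_genes)
    significant = chi2 > CHI2_CRITICAL
    if observed == 0:
        # Assumption: "expectation significantly non-zero" means the test rejects.
        return "black" if significant else "dark gray"
    if not significant:
        return "light gray"
    if observed > expected:
        return "red" if observed > 2 * expected else "orange"
    return "blue" if expected > 2 * observed else "cyan"


if __name__ == "__main__":
    # Hypothetical counts: 100 TAS of one type, taxon holds 10% of all genes.
    for obs in (0, 4, 12, 16, 30):
        print(obs, classify(obs, total_tas=100, taxon_genes=1000, all_genes=10000))
```

The fold-difference check is applied only after the significance test, mirroring the order in which the caption defines the categories.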
Figure 4. Predicted new families of antitoxins. A. Multiple alignment of COG2442 family domain, a predicted DNA-binding antitoxin protein of winged HTH motif superfamily. B. Multiple alignment of COG2886 family of predicted antitoxins containing the HTH domain. C. Multiple alignment of COG2880 family, an AbrB superfamily representative. Designations are the same as in Figure 2.
Figure 5. Multiple alignments of distinct predicted antitoxin families containing ribbon-helix-helix (RHH) domains. A. RHH domain of COG1753 family. B. RHH domain in MJ1172 family. C. RHH domain in COG5304/COG3514. D. RHH domain fused to HEPN domain (paREP 1 subfamily). Designations are as in Figure 2.
Figure 6. Multiple alignment of the conserved cores of two distinct families of HEPN domains. The distinct subfamilies of HEPN domains are indicated by brackets on the right. Designations are as in Figure 2.
Figure 7. Relative abundance of HEPN_T, HEPN_M and MNT domains in thermophiles and mesophiles. A. The total number of HEPN-MNT pairs in hyperthermophiles and thermophiles ("Thermo"), mesophiles and psychrophiles ("Meso") and all ("Both") genomes. B. The number of HEPN_T, HEPN_M and MNT genes in selected genomes. Font color indicates the temperature preference: red – hyperthermophiles; gold – thermophiles; green – mesophiles; blue – psychrophiles. Asterisks indicate Archaea.
Figure 8. The relationship between the number of detected TA pairs and genome size.
Association of TAS with ecological features of prokaryotes
| Group 1/Group 2 | Group 1 median | Group 2 median | T-test |
| Archaea/Bacteria | 0.39 | 0.00 | |
| (hyper)thermophiles/meso- & psychrophiles | 0.34 | 0.05 | |
| Terrestrial & multi-environmental/other | -0.01 | 0.05 | |
| (hyper)thermophiles/meso- & psychrophiles | 0.16 | -0.01 | 0.0592 |
| Terrestrial & multi-environmental/other | -0.05 | 0.00 | |
| Archaea/Bacteria | 0.22 | -0.01 | |
| Terrestrial & multi-environmental/other | -0.05 | 0.00 | |
| Archaea/Bacteria | 0.30 | -0.01 | |
| (hyper)thermophiles/meso- & psychrophiles | 0.25 | -0.01 | |
Bold type highlights p-values significant at the 0.05 level.
Figure 9. Graph of relationships between different families of toxins and antitoxins. Known (black) and predicted (magenta) toxins (red circles) and antitoxins (blue circles) and their operon organizations. Lines connect genes with 5 or more two-component operons found; thickness of a line is proportional to the frequency of the respective operon.
TAS "islands" in prokaryotic genomes.
| Genome | TAS pair | No. of pairs | Distance threshold | No. observed | No. expected | χ² |
| Synechococcus sp. WH 8102 | all | 11 | 3 | 6 | 0.3 | 82.79 | |
| Alteromonas macleodii 'Deep ecotype' | all | 19 | 3 | 6 | 0.5 | 48.27 | |
| Leptospira biflexa serovar Patoc strain 'Patoc 1 (Ames)' | all | 22 | 3 | 7 | 0.7 | 47.09 | |
| Mycobacterium tuberculosis F11 | all | 46 | 2 | 12 | 2.1 | 43.43 | |
| Nitrosomonas europaea ATCC 19718 | all | 48 | 3 | 18 | 4.5 | 40.8 | |
| Mycobacterium tuberculosis H37Ra | all | 57 | 3 | 16 | 3.9 | 36.41 | |
| Mycobacterium tuberculosis CDC1551 | all | 52 | 3 | 14 | 3.2 | 35.89 | |
| Chlorobium phaeobacteroides DSM 266 | all | 28 | 2 | 8 | 1.2 | 35.71 | |
| Marinobacter aquaeolei VT8 | all | 15 | 7 | 5 | 0.5 | 31.72 | |
| Mycobacterium bovis AF2122/97 | all | 53 | 3 | 14 | 3.5 | 30.44 | |
| Nitrosococcus oceani ATCC 19707 | all | 32 | 3 | 9 | 1.7 | 28.88 | |
| Mycobacterium bovis BCG str. Pasteur 1173P2 | all | 58 | 3 | 15 | 4.2 | 27.66 | |
| Mycobacterium tuberculosis H37Rv | all | 53 | 4 | 14 | 4.1 | 23.22 | |
| Chlorobium limicola DSM 245 | all | 26 | 5 | 10 | 2.4 | 23.02 | |
| Leptospira biflexa serovar Patoc strain 'Patoc 1 (Paris)' | all | 20 | 8 | 7 | 1.4 | 20.53 | |
| Chlorobium phaeobacteroides BS1 | all | 42 | 3 | 12 | 3.5 | 20.14 | |
| Salinibacter ruber DSM 13855 | all | 13 | 8 | 5 | 0.8 | 19.39 | |
| Haloquadratum walsbyi DSM 16790 | all | 10 | 5 | 3 | 0.3 | 19.31 | |
| Archaeoglobus fulgidus DSM 4304 | all | 30 | 3 | 8 | 1.8 | 18.79 | |
| Bartonella tribocorum CIP 105476 | all | 22 | 14 | 12 | 3.9 | 18.33 | |
| Pelodictyon phaeoclathratiforme BU-1 | all | 65 | 5 | 23 | 10.3 | 17.24 | |
| Thermofilum pendens Hrk 5 | HEPN-MNT | 14 | 3 | 4 | 0.7 | 10.92 | |
| Archaeoglobus fulgidus DSM 4304 | AbrB-PIN | 12 | 3 | 4 | 1.3 | 4.18 |
Figure 10. TAS in selected genomes. Red dots show the approximate position of TAS genes on the circular chromosomes.
Figure 11. Fractions of solo and two-gene operon occurrences for each family of toxins and antitoxins. Red, fraction of solo genes; blue, fraction of genes in (predicted) operons.
Over-representation of TAS on plasmids and chromosomes
| TAS pair | No. on plasmids | No. on chromosomes | Over-represented on |
| MazF-PHD | 12 | 7 | Plasmid |
| COG5654-Xre | 55 | 195 | Plasmid |
| MerR-PIN | 9 | 34 | Plasmid |
| GNAT-RHH | 28 | 154 | Plasmid |
| RelE-RHH | 92 | 511 | Plasmid |
| ArsR-COG3832 | 13 | 310 | Chromosome |
| DUF397-Xre | 3 | 129 | Chromosome |
| HEPN-MNT | 11 | 572 | Chromosome |
| GNAT-Xre | 0 | 67 | Chromosome |
Enrichment was estimated relative to the random expectation given the amounts of chromosomal and plasmid sequence analyzed. The distributions of the remaining TAS were statistically indistinguishable from the random expectation.
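The comparison against random expectation for the plasmid/chromosome table can be sketched along the following lines. This is an illustration under stated assumptions, not the procedure used in the paper: the two numeric columns are read as counts on plasmids and on chromosomes, `plasmid_fraction` (the share of analyzed sequence that is plasmid) is a hypothetical input that the record does not report, and the 3.84 cutoff is borrowed from the Figure 3 caption.

```python
# Illustrative sketch; plasmid_fraction below is a made-up value, not a figure from the paper.
CHI2_CRITICAL = 3.84  # assumed cutoff: 1 degree of freedom, p = 0.05


def replicon_enrichment(plasmid_count, chromosome_count, plasmid_fraction):
    """Return (label, chi2), where label is "Plasmid", "Chromosome" or "none"."""
    if not 0.0 < plasmid_fraction < 1.0:
        raise ValueError("plasmid_fraction must lie strictly between 0 and 1")
    total = plasmid_count + chromosome_count
    if total == 0:
        raise ValueError("no occurrences to test")
    # Random expectation: occurrences are spread in proportion to sequence amount.
    expected_plasmid = total * plasmid_fraction
    expected_chromosome = total - expected_plasmid
    chi2 = ((plasmid_count - expected_plasmid) ** 2 / expected_plasmid
            + (chromosome_count - expected_chromosome) ** 2 / expected_chromosome)
    if chi2 <= CHI2_CRITICAL:
        return "none", chi2
    label = "Plasmid" if plasmid_count > expected_plasmid else "Chromosome"
    return label, chi2


if __name__ == "__main__":
    # Counts taken from two table rows; the 5% plasmid share is hypothetical.
    print(replicon_enrichment(12, 7, plasmid_fraction=0.05))    # MazF-PHD row
    print(replicon_enrichment(11, 572, plasmid_fraction=0.05))  # HEPN-MNT row
```

Because the true plasmid share is not stated here, the labels this sketch produces for borderline rows need not match the table.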