Hatice Akarsu1, Patricia Bordes2, Moise Mansour2, Donna-Joe Bigot2, Pierre Genevaux2, Laurent Falquet1.
Abstract
Bacterial Toxin-Antitoxin systems (TAS) are involved in key biological functions including plasmid maintenance, defense against phages, persistence and virulence. They are found in nearly all phyla and classified into 6 different types based on the mode of inactivation of the toxin, with the type II TAS being the best characterized so far. We have herein developed a new in silico discovery pipeline named TASmania, which mines the >41K assemblies of the EnsemblBacteria database for known and uncharacterized protein components of type I to IV TAS loci. Our pipeline annotates the proteins based on a list of curated HMMs, which leads to >2×10⁶ loci candidates, including orphan toxins and antitoxins, and organises the candidates in pseudo-operon structures in order to identify new TAS candidates based on a guilt-by-association strategy. In addition, we classify the two-component TAS with an unsupervised method on top of the pseudo-operon (pop) gene structures, leading to 1567 "popTA" models offering a more robust classification of the TA families. These results give valuable clues in understanding the toxin/antitoxin modular structures and the TAS phylum specificities. Preliminary in vivo work confirmed six putative new hits in Mycobacterium tuberculosis as promising candidates. The TASmania database is available on the following server: https://shiny.bioinformatics.unibe.ch/apps/tasmania/.
Year: 2019 PMID: 31022176 PMCID: PMC6504116 DOI: 10.1371/journal.pcbi.1006946
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Overview of the pipeline to build the TASmania database.
The different steps include: downloading EnsemblBacteria, updating the InterPro annotation, selecting the proteins matching an arbitrary list of reference TAS IPR, building the corresponding HMM profiles and scanning the proteomes. In parallel, we structure target genomes into pseudo-operons and include phylum information. Finally, we add extra value to TASmania by clustering the HMM profiles into larger families for TA combinations analysis.
Fig 2. Length distribution of unique proteins among TASmania putative hits.
(A) Antitoxin length distribution (in amino acids). (B) Toxin length distribution (in amino acids). Blue and red vertical lines correspond to the default thresholds used by TAfinder.
Fig 3. Distribution of pseudo-operon types.
(A) All hits from TASmania (only the 20 most frequent pseudo-operon structures are shown). (B) Canonical hits only (two-gene T->A or A->T modules), highlighting the higher abundance of the A->T module type versus the T->A type.
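The pseudo-operon structure strings used in Fig 3 (for example “AT”, “xT” or “xTx”) can be illustrated with a minimal sketch. It assumes, as a simplification that is not stated in the caption, that a pseudo-operon is a run of consecutive same-strand genes separated by at most a fixed intergenic gap, and that the structure is read in transcription order. The `Gene` class, the `MAX_GAP` value and all function names are hypothetical and are not taken from TASmania itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gene:
    gene_id: str
    start: int
    end: int
    strand: str            # "+" or "-"
    tas: Optional[str]     # "T", "A", or None when the gene has no TAS hit


MAX_GAP = 100  # hypothetical intergenic distance threshold, in bp


def pseudo_operons(genes: List[Gene], max_gap: int = MAX_GAP) -> List[List[Gene]]:
    """Group consecutive same-strand genes separated by at most max_gap bp."""
    groups: List[List[Gene]] = []
    for gene in sorted(genes, key=lambda g: g.start):
        last = groups[-1][-1] if groups else None
        if last and last.strand == gene.strand and gene.start - last.end <= max_gap:
            groups[-1].append(gene)
        else:
            groups.append([gene])
    return groups


def structure(operon: List[Gene]) -> str:
    """Build a string such as 'AT' or 'xT'; genes without a hit become 'x'."""
    ordered = operon if operon[0].strand == "+" else list(reversed(operon))
    return "".join(g.tas or "x" for g in ordered)


def tas_structures(genes: List[Gene], max_gap: int = MAX_GAP) -> List[str]:
    """Keep only pseudo-operons that contain at least one T or A hit."""
    return [structure(op) for op in pseudo_operons(genes, max_gap)
            if any(g.tas for g in op)]


def is_canonical(struct: str) -> bool:
    """Two-gene A->T or T->A modules, as in panel (B)."""
    return struct in ("AT", "TA")
```

In this sketch an “x” gene is only reported when it shares a pseudo-operon with a toxin or antitoxin hit, which mirrors the guilt-by-association idea; pseudo-operons with no hit at all are dropped.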
Fig 4. Comparison of TASmania and TAfinder hits.
Using M. tuberculosis as a proof of principle, a list of manually curated, new and promising TASmania-specific hits is shown in Table 1, compared to the results obtained by TAfinder on the same genomes. (A) Mycobacterium tuberculosis H37Rv. (B) Mycobacterium smegmatis HMC2 155. (C) Caulobacter crescentus CB15. (D) Staphylococcus aureus NCTC8325. These TASmania-specific TA hits correspond mostly to: i) type I or type IV systems; ii) orphan loci; iii) guilt-by-association “x” loci; iv) unusual combinations (“TT”, “AA”). This confirms that our strategy of not filtering out any unusual TAS operon structures or protein lengths allows us to be more discovery-orientated. Including the guilt-by-association “x” cognates is also useful when looking for uncharacterized TAS families.
TASmania hits missed by TAfinder.
Some putative M. tuberculosis TAS are shown. For a complete automated list of hits missed by TAfinder, see S3 Table. Experimentally tested toxins are flagged with an “a” or “b” superscript. The qualifier “interpro_only” describes proteins that are not found by our HMMs but have an InterPro hit from the primary IPR list.
| ensembl gene id | gene description | protein length | nearest Pfam identifier | nearest Pfam description | TAS info | operon id | operon structure |
|---|---|---|---|---|---|---|---|
| | Hypothetical protein | 197 | interpro_only | interpro_only | T | 43 | TT |
| | Conserved protein | 68 | SymE_toxin | Toxin SymE, type I toxin-antitoxin system | T | 43 | TT |
| | Conserved hypothetical protein | 242 | interpro_only | interpro_only | T | 119 | xTx |
| | Hypothetical methlytransferase (methylase) | 263 | guilt_by_association | guilt_by_association | x | 119 | xTx |
| | Possible conserved membrane protein with PIN domain | 226 | PIN | PIN domain | T | 132 | xT |
| | Hypothetical protein | 326 | guilt_by_association | guilt_by_association | x | 132 | xT |
| | Hypothetical protein | 169 | PhdYeFM_antitox | Antitoxin Phd_YefM, type II toxin-antitoxin system | A | 150 | xA |
| | Conserved hypothetical protein | 397 | guilt_by_association | guilt_by_association | x | 150 | xA |
| | Possible toxin VapC25. Contains PIN domain. | 142 | PIN | PIN domain | T | 157 | T |
| | Conserved hypothetical protein | 197 | Zeta_toxin | Zeta toxin | T | 207 | xAATx |
| | Hypothetical protein | 129 | ParD_like | ParD-like antitoxin of type II bacterial toxin-antitoxin system | A | 207 | xAATx |
| | Possible toxin MazF1 | 93 | PemK_toxin | PemK-like, MazF-like toxin of type II toxin-antitoxin system | T | 248 | xxT |
| | Possible antitoxin MazE1 | 57 | guilt_by_association | guilt_by_association | x | 248 | xxT |
| | Conserved protein | 88 | interpro_only | interpro_only | T | 304 | Tx |
| | Probable ribonucleoside-diphosphate reductase (large subunit) NrdZ (ribonucleotide reductase | 692 | guilt_by_association | guilt_by_association | x | 304 | Tx |
| | Unknown protein | 83 | VapB_antitoxin | Bacterial antitoxin of type II TA system, VapB | A | 339 | A |
| | Conserved hypothetical protein | 207 | AbiEi_4 | Transcriptional regulator, AbiEi antitoxin | A | 551 | AT |
| | Hypothetical protein | 293 | AbiEii | Nucleotidyl transferase AbiEii toxin, Type IV TA system | T | 551 | AT |
| | Hypothetical protein | 191 | HicA_toxin | HicA toxin of bacterial toxin-antitoxin | T | 1059 | TA |
| | Transcriptional regulatory protein | 346 | HTH_3 | Helix-turn-helix | A | 1059 | TA |
| | Conserved protein | 396 | guilt_by_association | guilt_by_association | x | 1128 | Axxx |
| | Conserved protein | 143 | MraZ | MraZ protein, putative antitoxin-like | A | 1128 | Axxx |
| | Conserved protein | 189 | PemK_toxin | PemK-like, MazF-like toxin of type II toxin-antitoxin system | T | 1271 | T |
| | Conserved hypothetical protein | 153 | interpro_only | interpro_only | T | 1324 | xT |
| | Conserved hypothetical protein | 415 | guilt_by_association | guilt_by_association | x | 1324 | xT |
| | Conserved hypothetical protein | 256 | interpro_only | interpro_only | T | 1914 | xxxT |
| | Probable dipeptide-transport ATP-binding protein ABC transporter DppD | 548 | guilt_by_association | guilt_by_association | x | 1914 | xxxT |
a: toxin genes that have been tested experimentally
b: toxin genes that have been tested experimentally, validated as toxic, and rescued by the cognate antitoxin
Fig 5. Expression of putative toxins in M. smegmatis.
M. smegmatis strain MC2155 was freshly transformed with pLAM12-based constructs expressing the putative toxin-encoding genes of M. tuberculosis identified in this work, namely Rv0078A, Rv0207c, Rv0229c, Rv0269c, Rv0366c, Rv0569, Rv2016, Rv2165c, Rv2514c, Rv3641c and Rv3662c. Transformants were plated on LB agar supplemented with kanamycin, without (-) or with (+) 0.2% acetamide inducer. Plates were incubated for 3 days at 37°C.
Fig 6. Six putative TA of M. tuberculosis validated by rescue test in M. smegmatis.
M. smegmatis strain MC2155 was freshly transformed with pLAM12-based constructs expressing the putative toxic genes of M. tuberculosis (Rv0078A, Rv0207c, Rv0269c, Rv0366c, Rv2016 and Rv2514c) either alone or as an operon together with their respective putative antitoxin genes, namely Rv0078B/Rv0078A, Rv0208c/Rv0207c, Rv0269c/Rv0268c, Rv0367c/Rv0366c, Rv2016/Rv2017 and Rv2515c/Rv2514c. Transformants were plated on LB agar supplemented with kanamycin and acetamide inducer (0.2%), except for Rv0366c and Rv0367c/Rv0366c, which show suppression by the putative antitoxin only in the absence of acetamide. Plates were incubated for three days at 37°C.
Fig 7. Examples of co-occurrence of toxin and antitoxin clusters within two-gene pseudo-operons (popTAs).
The color key corresponds to percentages (%), given in each cell. (A) Antitoxin clusters in A->T orientation, and their relation to toxin clusters. For instance, the modular A74 antitoxin cluster has three main cognates, the T4, T65 and T78 toxin clusters: A74.T4 (31.06% of A74 popTAs) (nearest Pfam PhdYeFM_antitox.YafQ_toxin), A74.T65 (44.41% of A74 popTAs) (nearest Pfam PhdYeFM_antitox.PIN) and A74.T78 (13.59% of A74 popTAs) (nearest Pfam PhdYeFM_antitox.ParE_toxin). (B) Antitoxin clusters in T->A orientation, and their relation to toxin clusters. For instance, the main toxin cognate of the bi-directional A12 antitoxin cluster is T102, as in T102.A12 (29.15% of A12 popTAs) (nearest Pfam HicB_lk_antitox.HicA_toxin). A restrictive antitoxin cluster is also highlighted, with A124 co-occurring mainly with T34, as in T34.A124 (99.88% of A124 popTAs) (nearest Pfam BrnT_toxin.BrnA_antitoxin). (C) Toxin clusters in A->T orientation, and their relation to antitoxin clusters. The restrictive T60 toxin cluster and its association with A46 in A46.T60 (98.79% of T60 popTAs) (nearest Pfam CcdA.CcdB) is given as an example. (D) Toxin clusters in T->A orientation, and their relation to antitoxin clusters. T152 is also a fairly restrictive toxin cluster whose main antitoxin cognate is A23, as in T152.A23 (nearest Pfam HigB-like_toxin.HTH_3). The complete co-occurrence is shown in S3 Fig and described in S5 Table.
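The per-cluster percentages quoted in this caption (for example, the share of A74 popTAs paired with each toxin cluster) are row-normalised co-occurrence counts. A minimal sketch of that calculation follows; the function name and the toy pairs in the usage note are hypothetical and do not reproduce TASmania data.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def cooccurrence_percent(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """For each row cluster, return the percentage of its popTAs
    that involve each partner cluster (each row sums to 100)."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for row_cluster, partner_cluster in pairs:
        counts[row_cluster][partner_cluster] += 1

    percents: Dict[str, Dict[str, float]] = {}
    for row_cluster, partners in counts.items():
        total = sum(partners.values())
        percents[row_cluster] = {
            partner: 100.0 * n / total for partner, n in partners.items()
        }
    return percents
```

Passing (antitoxin, toxin) pairs gives the antitoxin-centred view of panels (A) and (B); swapping the tuple order gives the toxin-centred view of panels (C) and (D). With the invented input `[("A1", "T1")] * 3 + [("A1", "T2")]`, cluster A1 is reported as 75% T1 and 25% T2.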
Fig 8. popTAs across phyla.
The most abundant popTAs in relative numbers are specific to each phylum.
popTA across phyla.
List of the most enriched popTA in each phylum, with their corresponding nearest Pfam annotation.
| popTA | Nearest pfam | phylum |
|---|---|---|
| A100.T10 | PhdYeFM_antitox.PIN | |
| A56.T125 | PhdYeFM_antitox.PIN | |
| A26.T83 | VapB_antitoxin.PIN | |
| A10.T93 | AbiEi_4.AbiEii | |
| A128.T19 | RHH_1.PIN | |
| A27.T123 | PhdYeFM_antitox.ParE_toxin | |
| A100.T124 | PhdYeFM_antitox.PIN | |
| A128.T132 | RHH_1.PIN | |
| A26.T115 | VapB_antitoxin.PIN | |
| A62.T122 | RHH_1.PemK_toxin | |
| A80.T75 | RHH_1.PIN | |
| A8.T101 | MazE_antitoxin.PIN | |
| A128.T20 | RHH_1.PIN | |
| A46.T129 | CcdA.PIN | |
| A128.T74 | RHH_1.PIN | |
| T85.A32 | Gp49.HTH_3 | |
| T48.A32 | Gp49.HTH_3 | |
| A32.T14 | HTH_3.HipA_C | |
| T152.A23 | HigB-like_toxin.HTH_3 | |
| A104.T68 | ParD_antitoxin.ParE_toxin | |
| T67.A40 | HigB-like_toxin.HTH_3 | |
| A60.T39 | AbiEi_4.AbiEii | |
| T67.A23 | HigB-like_toxin.HTH_3 | |
| T102.A113 | HicA_toxin.HicB_lk_antitox | |
| T45.A78 | HigB_toxin.HTH_3 | |
| T144.A69 | HicA_toxin.HicB | |
| A128.T89 | RHH_1.PIN | |
| T48.A95 | Gp49.HicB_lk_antitox | |
| A10.T39 | AbiEi_4.AbiEii | |
| A45.T6 | PhdYeFM_antitox.YoeB_toxin | |
| A90.T92 | PhdYeFM_antitox.YoeB_toxin | |
| T82.A95 | HicA_toxin.HicB_lk_antitox | |
| T102.A12 | HicA_toxin.HicB_lk_antitox | |
| A32.T13 | HTH_3.Zeta_toxin | |
| A58.T59 | RelB.ParE_toxin | |
| A24.T9 | PhdYeFM_antitox.ParE_toxin | |
| A104.T37 | ParD_antitoxin.ParE_toxin | |
| T34.T15 | BrnT_toxin.BrnA_antitoxin | |
| A44.T10 | MazE_antitoxin.PIN | |
| A18.T10 | MazE_antitoxin.PIN | |
| A30.T49 | MazE_antitoxin.PemK_toxin | |
| A94.T33 | PhdYeFM_antitox.Fic | |
| T25.A77 | HigB_toxin.HTH_3 | |
| T38.A32 | RelE.HTH_3 | |
| A48.T62 | RelB.YafQ_toxin | |
| A53.T65 | RHH_3.PIN | |
| A32.T2 | HTH_3.HipA_C | |
| A33.T24 | PrlF_antitoxin.Toxin_YhaV |
Fig 9. A9 cluster as an example of cluster modularity.
(A) Multiple sequence alignment of A9 antitoxin cluster (nearest Pfam PhdYeFM_antitox) proteins that are associated with T6 (nearest Pfam YoeB_toxin) or T9 (nearest Pfam ParE_toxin) toxin clusters. (B) HMM profile from antitoxin A9 cluster proteins, in A9.T6 and A9.T9 popTA. (C) Multiple sequence alignment of T6 or T9 toxin cluster proteins associated with the A9 antitoxin cluster. (D) HMM profile from T6 and T9 toxin cluster proteins, in A9.T6 and A9.T9 popTA. Note: for clarity, only a subset of sequences is drawn. The magenta bars and stars highlight the conserved residues and regions.
Granularity of the traditional TAS annotations.
Example of M. tuberculosis H37Rv with some so-called VapB.VapC TA pairs. We propose a more objective nomenclature of the TAS based on the HMM profile clusters. Note that all VapCs shown here have a PIN Pfam annotation; however, their TASMANIA.Tn (Tn) assignments are split into multiple sub-clusters, emphasizing the diversity of the PIN domains. In contrast, their associated so-called VapB-like antitoxins have very diverse Pfam annotations, but consistent TASMANIA.An (An) clusters.
| ensembl gene id | gene description | hmm cluster Pfam annotation | popTA |
|---|---|---|---|
| | Antitoxin VapB15 | VapB_antitoxin | A26.T115 |
| | Toxin VapC15 | PIN | A26.T115 |
| | Possible antitoxin VapB17 | VapB_antitoxin | A26.T83 |
| | Possible toxin VapC17 | PIN | A26.T83 |
| | Possible antitoxin VapB4 | PhdYeFM_antitox | A100.T10 |
| | Possible toxin VapC4 | PIN | A100.T10 |
| | Possible antitoxin VapB5 | PhdYeFM_antitox | A100.T10 |
| | Possible toxin VapC5 | PIN | A100.T10 |
| | Conserved protein | PhdYeFM_antitox | A100.T125 |
| | Hypothetical alanine rich protein | PIN | A100.T125 |
| | Possible antitoxin VapB46 | PhdYeFM_antitox | A100.T124 |
| | Possible toxin VapC46. Contains PIN domain. | PIN | A100.T124 |
| | Possible antitoxin VapB26 | RHH_1 | A128.T132 |
| | Possible toxin VapC26. Contains PIN domain. | PIN | A128.T132 |
| | Possible antitoxin VapB37 | RHH_1 | A128.T19 |
| | Possible toxin VapC37. Contains PIN domain. | PIN | A128.T19 |
| | Possible antitoxin VapB41 | RHH_1 | A128.T74 |
| | Possible toxin VapC41. Contains PIN domain. | PIN | A128.T74 |
| | Possible antitoxin VapB44 | RHH_1 | A128.T20 |
| | Possible toxin VapC44. Contains PIN domain. | PIN | A128.T20 |
| | Possible antitoxin VapB29 | RHH_1 | A80.T75 |
| | Possible toxin VapC29. Contains PIN domain. | PIN | A80.T75 |
| | Possible antitoxin VapB3 | CcdA | A46.T129 |
| | Possible toxin VapC3 | PIN | A46.T129 |
| | Possible antitoxin VapB27 | MazE_antitoxin | A8.T101 |
| | Possible toxin VapC27. Contains PIN domain. | PIN | A8.T101 |
| | Possible antitoxin VapB40 | SpoVT_C | A91.T88 |
| | Possible toxin VapC40. Contains PIN domain. | PIN | A91.T88 |
| | Possible antitoxin MazE2 | RHH_1 | A62.T122 |
| | Toxin MazF2 | PemK_toxin | A62.T122 |
| | Antitoxin RelF | PhdYeFM_antitox | A27.T123 |
| | Toxin RelG | ParE_toxin | A27.T123 |
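The granularity point can be checked directly from the table: grouping the TASMANIA cluster identifiers by Pfam annotation shows how a single Pfam label (PIN) spreads over several toxin clusters. The sketch below uses a subset of rows copied from the table above; the helper names `split_popta` and `clusters_by_pfam` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# (antitoxin Pfam, toxin Pfam, popTA) for a subset of pairs in the table above
ROWS: List[Tuple[str, str, str]] = [
    ("VapB_antitoxin", "PIN", "A26.T115"),
    ("VapB_antitoxin", "PIN", "A26.T83"),
    ("PhdYeFM_antitox", "PIN", "A100.T10"),
    ("PhdYeFM_antitox", "PIN", "A100.T124"),
    ("RHH_1", "PIN", "A128.T132"),
    ("RHH_1", "PIN", "A128.T19"),
    ("CcdA", "PIN", "A46.T129"),
    ("MazE_antitoxin", "PIN", "A8.T101"),
]


def split_popta(name: str) -> Tuple[str, str]:
    """Return (antitoxin cluster, toxin cluster) whatever the gene order,
    e.g. 'A26.T115' and 'T102.A12' are both handled."""
    first, second = name.split(".")
    return (first, second) if first.startswith("A") else (second, first)


def clusters_by_pfam(rows: List[Tuple[str, str, str]]):
    """Map each Pfam annotation to the set of TASMANIA clusters it covers."""
    antitoxin: Dict[str, Set[str]] = defaultdict(set)
    toxin: Dict[str, Set[str]] = defaultdict(set)
    for a_pfam, t_pfam, popta in rows:
        a_cluster, t_cluster = split_popta(popta)
        antitoxin[a_pfam].add(a_cluster)
        toxin[t_pfam].add(t_cluster)
    return antitoxin, toxin
```

For these rows, the PIN annotation maps to eight distinct toxin clusters, whereas each antitoxin Pfam annotation maps to a single antitoxin cluster.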
Putative new antitoxin families inferred from the “guilt-by-association” loci.
The guilt-by-association loci next to toxin hits (as in “xT”, “Tx”) are collected and analysed for putative new antitoxin families.
| putative new antitoxin group | nearest Pfam equivalent | popTx | popTx Pfam annotation | TAS orientation | strains (examples) | ensembl gene id pairs (examples) | taxa where found |
|---|---|---|---|---|---|---|---|
| A*371 | VraX | A*371.T143 | VraX.PemK_toxin | AT | staphylococcus_aureus_subsp_aureus_vrs1 (Firmicutes) | | |
| A*77 | Glyoxalase | T32.A*77 | YafQ_toxin.Glyoxalase | TA | actinomyces_sp_s6_spd3 (Actinobacteria) | | |
| A*190 | Colicin_Pyocin | A*190.T4 | Colicin_Pyocin.YafQ_toxin | AT | helicobacter_pylori_51 (Proteobacteria) | | |
| A*237 | Antirestrict | A*237.T3 | Antirestrict.CbtA_toxin | AT | yersinia_pekkanenii (Proteobacteria) | | |
| A*2 | Response_reg | T5.A*2 | Cpta_toxin.Response_reg | TA | microbacterium_sp_leaf351 (Actinobacteria) | | |
| | | | | | enterococcus_gilvus_atcc_baa_350_gca_000407545 (Firmicutes) | | |
| | | | | | bradyrhizobium_sp_btai1 (Proteobacteria) | | |
| | | | | | ardenticatena_maritima_gca_001306175 (Others) | | |
| A*72 | AP_endonuc_2 | A*72.T47 | AP_endonuc_2.ParE_toxin | AT | xanthomonas_euvesicatoria (Proteobacteria) | | |
| | | A*72.T56 | | | pseudorhodoferax_sp_leaf267 (Proteobacteria) | | |
| | | A*72.T63 | | | pseudomonas_amygdali_pv_aesculi (Proteobacteria) | | |
| | | A*72.T91 | | | sinorhizobium_meliloti_2011 (Proteobacteria) | | |
| | | A*72.T136 | | | acetobacter_tropicalis (Proteobacteria) | | |
| A*8 | HTH_3 | T25.A*8 | HigB_toxin.HTH_3 | TA | klebsiella_pneumoniae_subsp_pneumoniae_kpnih27 (Proteobacteria) | | |
| | | T45.A*8 | | | pedobacter_glucosidilyticus (Bacteroidetes) | | |
| A*1 | HTH_3 | A*1.T2 | HTH_3.HipA_C | AT | clavibacter_michiganensis_subsp_sepedonicus (Actinobacteria) | | |
| | | A*1.T14 | | | porphyromonas_cangingivalis (Bacteroidetes) | | |
| | | | | | roseburia_faecis (Firmicutes) | | |
| | | | | | pseudomonas_batumici (Proteobacteria) | | |
| | | A*1.T16 | | | corynebacterium_sp_nml_130206 (Actinobacteria) | | |
| A*1 | HTH_3 | T14.A*1 | HipA_C.HTH_3 | TA | sulfurospirillum_multivorans_dsm_12446 (Proteobacteria) | | |
| | | T17.A*1 | | | acinetobacter_baumannii_naval_57 (Proteobacteria) | | |
| A*1, A*8 | HTH_3 | T21.A*1 | ParE_toxin.HTH_3 | TA | haemophilus_influenzae (Proteobacteria) | | |
| | | T78.A*1 | | | agrobacterium_arsenijevicii (Proteobacteria) | | |
| | | T123.A*1 | | | ralstonia_solanacearum_fqy_4 (Proteobacteria) | | |
| | | T9.A*8 | | | pseudomonas_brassicacearum_subsp_brassicacearum_nfm421 (Proteobacteria) | | |
| A*1 | HTH_3 | T18.A*1 | RelE.HTH_3 | TA | streptomyces_purpurogeneiscleroticus (Actinobacteria) | | |
| | | | | | anaerovibrio_lipolyticus (Firmicutes) | | |
| | | | | | citrobacter_freundii (Proteobacteria) | | |
| | | T38.A*1 | | | mycobacterium_tuberculosis_gca_001376955 (Actinobacteria) | | |
| | | | | | escherichia_vulneris_nbrc_102420 (Proteobacteria) | | |
| A*27 | RHH_1 | A*27.T61 | RHH_1.ParE_toxin | AT | propionibacterium_acnes_hl043pa2 (Actinobacteria) | | |
| | | | | | listeria_booriae (Firmicutes) | | |
| | | | | | salmonella_enterica_subsp_enterica_gca_001431385 (Proteobacteria) | | |
| A*6 | Phage_integrase | T58.A*6 | CcdB.Phage_integrase | TA | escherichia_coli_10_0833 (Proteobacteria) | | |
| A*6 | Phage_integrase | A*6.T57 | Phage_integrase.PemK_toxin | AT | geobacillus_thermoglucosidasius_c56_ys93 (Firmicutes) | | |
| | | A*6.T13 | Phage_integrase.Zeta_toxin | | acidovorax_sp_12322_1 (Proteobacteria) | | |
| | | | | | arcobacter_butzleri_l348 (Firmicutes) | | |