Heini Ruhanen, Daniel Hurley, Ambarnil Ghosh, Kevin T O'Brien, Catrióna R Johnston, Denis C Shields.
Abstract
Short linear motifs (SLiMs) are functional stretches of protein sequence that are crucial for numerous biological processes. These motifs typically comprise fewer than 10 amino acids and mediate or modulate protein-protein interactions. While well characterized in eukaryotic intracellular signaling, their role in prokaryotic signaling is less well understood. We surveyed the distribution of known motifs in prokaryotic extracellular and virulence proteins across a range of bacterial species and conducted searches for novel motifs in virulence proteins. Many known motifs in virulence effector proteins mimic eukaryotic motifs and enable the pathogen to control the intracellular processes of its host. Novel motifs were detected by finding those that had evolved independently in three or more unrelated virulence proteins. The search returned several significantly over-represented linear motifs, of which some were known motifs and others are novel candidates with potential roles in bacterial pathogenesis. A putative C-terminal G[AG].$ motif found in type IV secretion system proteins was among the most significant detected. A KK$ motif, previously identified in a plasminogen-binding protein, was shown to be enriched across a number of adhesion proteins and lipoproteins. While there is some potential to develop peptide drugs against bacterial infection based on bacterial peptides that mimic host components, such drugs could have unwanted effects on host signaling. Thus, novel SLiMs in virulence factors that do not mimic host components but are crucial for bacterial pathogenesis, such as those in type IV secretion system proteins, may be more useful to develop as leads for anti-microbial peptides or drugs.
Keywords: antibacterial; bioinformatics; motif mimicry; pathogen; short linear motifs (SLiMs); virulence factor
Year: 2014 PMID: 24478765 PMCID: PMC3896991 DOI: 10.3389/fmicb.2014.00004
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Examples of known instances of Short Linear Motifs in bacterial virulence factors.
| CagL, Mce | RGD | An integrin binding cell adhesion motif | Conradi et al., |
| CagL | Participates in integrin binding | Conradi et al., | |
| YadA | Repeated collagen binding motif | Tahir et al., | |
| InlJ, InlA, nanA, CspA | Cell wall anchor motif | Harris et al., | |
| Eno | Plasminogen binding motif | Bergmann et al., | |
| CBPA-G, LytA-C | Repeated choline (cell wall) binding motif | Garau et al., | |
| SpsA | Host secretory immunoglobulin A (SIgA) and secretory component (SC) binding motif | Hammerschmidt et al., | |
| SipA, SopA | DEVD | Caspase 3 cleavage site motif | Srikanth et al., |
| AvrPto, HopF2, AvrB, AvrRpm1 | MG..C | Shan et al., | |
| WtsE, AvrE1, IpgB2, IpgB1, Map, EspM, EspT, SifA, SifB | Host Rho GTPase activation/modulation motif | Alto et al., | |
| WtsE, | [LR][KQVS][KQLR][EST][GQR][FLKS][EGPK] [MLVAS][KNAL][SGIE] | Putative endoplasmic reticulum membrane retention/retrieval motif | Ham et al., |
| SifA, AnkB | CLCCFL | CAAX box, putative prenylation motif (addition of farnesyl or geranylgeranyl group) | Boucrot et al., |
| YopE, SptP, ExoS | G.LR… T(YopE | Arginine finger motif, essential for Rho GAP function of virulence factors | Black and Bliska, |
| PopB, PopP2, AvrBs3 | [ | Nuclear localization signal (NLS) motifs | Szurek et al., |
| CagA, Tarp, AnkA, LspA | E[PNS][IV]Y[AEG] | Membrane targeting/phosphorylation motif | Higashi et al., |
| SspH2, SseI | ….GSGC….., G(C)M[GS][CL][KP]C, | Hicks et al., | |
| ExoS, SopE | [FIV]..[FIV].[FIV]..[NC].[FIV] | Membrane localization motif (targets ExoS to the Golgi-endoplasmic reticulum) | Zhang and Barbieri, |
| SopD2, SifA, SseJ, SspH2 | Translocation/late endocytic compartments targeting motif | Brown et al., | |
| ExoU | Plasma membrane localization/ubiquitinylation motif | Rabin and Hauser, | |
| SopE, BopE | Catalytic loop motif essential for guanine nucleotide exchange | Schlumberger et al., | |
| AvrPphB | Autoproteolytic cleavage motif | Dowen et al., | |
| SopA, IpaH, SspH1 | L….TC, C.D | E3 ubiquitin ligase motif | Zhang et al., |
| VirF | LP… … ….L | F-box domain motif, mediates protein–protein interactions | Tzfira et al., |
| VopL, VopF | [ | WH2-domain motif | Liverman et al., |
| SpvC, OspF, VirA | [KR]{0,2}[KR].{0,2}[KR].{2,4}[ILVM].[ILVF] | D motif, Docking motif required for specific binding to MAPKs | Zhu et al., |
| IpaA | L..AA..VA..V..LI..A. | Vinculin binding domain motif | Hamiaux et al., |
| ExoS, ExoT | FAS (14-3-3 protein) binding motif, mediates activation of the ADPRT domain | Sun et al., | |
| EspF | [RKY]..P..P, P..P.[KR],…[PV]..P, KP..[QK]… | SH3 binding motif | Alto et al., |
| Map, NleH1, EspI (NleA) | …[ST].[ACVILF]$, … [VLIFY].[ACVILF]$,…[DE].[ACVILF]$ (EspI | C-terminal PDZ1 binding motif | Lee et al., |
| Listeriolysin O (LLO) | PPASP | PEST-motif, involved in phagosomal escape of bacteria in infected cells | Lety et al., |
| VirD4, VirB11, VirB4, SecA | G….GK[TS] | Walker A motif, nucleotide-binding motif | Sato et al., |
| SecA | [RK]….G….L[VILFWYMC]{4,4}D | Walker B, nucleotide binding motif | Sato et al., |
| MsbA, PiaA, PiuA | LSGGQ (PiaA | ABC-motif, ATP binding cassette transporter motif | Garmory and Titball, |
| EsxA, EsxB, esat6 | W.G motif helps to create a shallow cleft structure and may represent a peptide recognition feature by which cargo proteins are acquired for transport | Burts et al., | |
Proven role in virulence. Bold, a non-eukaryotic motif.
“^” start of the protein or, if in the middle of the motif sequence, states which amino acids are excluded at that position; “$” end of the protein; “.” any amino acid; {} defines the length of a range in the motif sequence; [] defines which amino acids can occur at a given motif position; () marks positions of specific interest, e.g., covalent modification, or is used to group parts of the expression. Motif table modified from Dean.
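The notation in this legend is a subset of ordinary regular-expression syntax, so motifs from the table can be searched with a standard regex engine. The following is a minimal sketch using Python's `re` module; the helper name `find_motif` and the toy sequence are hypothetical and were invented for illustration, not taken from the study.

```python
import re

# Motif patterns copied from the tables in this record. The notation is
# already valid Python regular-expression syntax, so no translation is needed.
MOTIFS = ["RGD", "KK$", "G[AG].$", "E[PNS][IV]Y[AEG]"]


def find_motif(pattern, sequence):
    """Return (start, matched_text) for each non-overlapping match (0-based)."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, sequence)]


# Hypothetical toy sequence, invented for illustration only.
toy = "MSTRGDLLEPIYAQWKK"

for pattern in MOTIFS:
    print(pattern, find_motif(pattern, toy))
```

Because `re.finditer` reports only non-overlapping matches, a motif that can overlap itself would need a lookahead-based pattern to list every instance.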
Functional search terms used to retrieve and download protein sequences from the virulence factor database MvirDB browser tool.
| Adherence | 749 | 181 |
| Capsule | 332 | 57 |
| Chemotaxis | 192 | 18 |
| Effector | 111 | 27 |
| Endotoxin | 66 | 27 |
| Enzyme | 647 | 121 |
| Exotoxin | 92 | 12 |
| Lipoprotein | 463 | 70 |
| Motility | 86 | 23 |
| Siderophore | 150 | 43 |
| Type III secretion system | 571 | 75 |
| Type IV secretion system | 181 | 38 |
Figure 2. Heat map visualization of the distribution of novel SLiMFinder-identified motifs amongst effector proteins from a selection of 60 organisms represented in the MvirDB. Columns: The bacterial species name with the total number of UPCs indicated in brackets at the start of the description. Purple, High GC Gram+ bacteria; Blue, Firmicutes; Yellow, α-proteobacteria; Light Brown, β-proteobacteria; Dark Brown, ε-proteobacteria; Green, γ-proteobacteria (non-enterobacteria); Red, γ-proteobacteria (enterobacteria); Black, others (CFB). Rows: The motif regular expression with the total number of incidences in UPCs across all 60 organisms indicated in brackets at the start of the description. Color scale: The logarithm of the normalized N_UPC returned from the SLiMSearch results.
Figure 3. Heat map visualization of the distribution of known virulence motifs amongst effector proteins from a selection of 60 organisms represented in the MvirDB. Columns: The bacterial species name with the total number of UPCs indicated in brackets at the start of the description. Purple, High GC Gram+ bacteria; Blue, Firmicutes; Yellow, α-proteobacteria; Light Brown, β-proteobacteria; Dark Brown, ε-proteobacteria; Green, γ-proteobacteria (non-enterobacteria); Red, γ-proteobacteria (enterobacteria); Black, others (CFB). Rows: The motif regular expression with the total number of incidences in UPCs across all 60 organisms indicated in brackets at the start of the description. Color scale: The logarithm of the normalized N_UPC returned from the SLiMSearch results.
Figure 4. Phylogenetic tree of the 60 organisms used to assess the distribution of prokaryotic protein motifs in the heat map figures. Purple, High GC Gram+ bacteria; Blue, Firmicutes; Yellow, α-proteobacteria; Light Brown, β-proteobacteria; Dark Brown, ε-proteobacteria; Green, γ-proteobacteria (non-enterobacteria); Red, γ-proteobacteria (enterobacteria); Black, others (CFB).
Significant motifs returned by SLiMFinder in each dataset (where probability <0.05).
| Adherence | 749 | Signal peptide motif | 3 | 151 | 47 | 0.00E+00 | Juncker et al., | |
| Signal peptide motif | 3 | 102 | 35 | 1.15E–10 | Juncker et al., | |||
| Signal peptide motif | 3 | 91 | 34 | 1.41E–08 | Juncker et al., | |||
| Signal peptide motif | 2.77 | 58 | 21 | 5.80E–07 | Juncker et al., | |||
| Signal peptide motif | 3 | 24 | 12 | 0.002 | Juncker et al., | |||
| KK$ | C-terminal KK | 3 | 23 | 11 | 0.034 | Bergmann et al., | ||
| Capsule | 332 | Signal peptide motif | 3 | 55 | 18 | 6.96E–06 | Juncker et al., | |
| Signal peptide motif | 4 | 16 | 7 | 0.002 | Juncker et al., | |||
| Signal peptide motif | 3 | 59 | 18 | 0.003 | Juncker et al., | |||
| Signal peptide motif | 2.63 | 39 | 13 | 0.003 | Juncker et al., | |||
| Signal peptide motif | 3 | 52 | 17 | 0.006 | Juncker et al., | |||
| Chemotaxis | 192 | – | – | – | – | – | – | |
| Effector | 111 | – | – | – | – | – | – | |
| Endotoxin | 66 | – | – | – | – | – | – | |
| Enzyme | 647 | Signal peptide motif | 3 | 27 | 14 | 9.52E–05 | Juncker et al., | |
| Signal peptide motif | 3 | 30 | 13 | 0.042 | Juncker et al., | |||
| Exotoxin | 92 | – | – | – | – | – | – | |
| Lipoprotein | 463 | L.[AG]C[AGS] | Lipobox | 3.4 | 78 | 30 | 0.00E+00 | Braun and Rehn, |
| [FLV].L.[AG]C | Lipobox | 3.4 | 136 | 24 | 0.00E+00 | Braun and Rehn, | ||
| [ILV].[AGS]C | Lipobox | 2.27 | 370 | 53 | 0.00E+00 | Braun and Rehn, | ||
| [AGS]C[AGS] | Lipobox | 2.27 | 285 | 50 | 7.22E–15 | Braun and Rehn, | ||
| Signal peptide motif | 3 | 109 | 27 | 1.27E–09 | Juncker et al., | |||
| L.{1,2}GC.{0,1}A | Lipobox | 4 | 41 | 15 | 1.65E–09 | Braun and Rehn, | ||
| A.{0,2}L..C.{0,2}S | Lipobox | 4 | 68 | 19 | 3.02E–09 | Braun and Rehn, | ||
| Signal peptide motif | 2.63 | 66 | 19 | 4.36E–09 | Juncker et al., | |||
| Signal peptide motif | 2.63 | 65 | 17 | 1.07E–07 | Juncker et al., | |||
| [ILV]..C.[AGS] | Lipobox | 2.27 | 217 | 36 | 4.11E–06 | Braun and Rehn, | ||
| KK$ | C-terminal KK | 3 | 22 | 9 | 0.016 | Bergmann et al., | ||
| Motility | 86 | – | – | – | – | – | – | |
| Siderophore | 150 | Signal peptide motif | 2.77 | 12 | 7 | 0.024 | Juncker et al., | |
| Type III secretion | 571 | – | – | – | – | – | – | |
| Type IV secretion | 181 | Signal peptide motif | 2.77 | 29 | 9 | 0.003 | Juncker et al., | |
| Signal peptide motif | 2.63 | 27 | 10 | 0.025 | Juncker et al., | |||
| Adherence | 749 | LP.G.Y | 4 | 37 | 12 | 0.012 | ||
| Capsule | 332 | G.S..M.L | 4 | 15 | 7 | 0.029 | ||
| Chemotaxis | 192 | E..Q.I[AG].I | 4.77 | 22 | 5 | 0.004 | ||
| E..Q.[IV]..I | 3.77 | 24 | 7 | 0.02 | ||||
| Effector | 111 | 3 | 21 | 6 | 0.012 | |||
| [LV].PY | 2.77 | 46 | 11 | 0.042 | ||||
| 2.77 | 17 | 6 | 0.049 | |||||
| Endotoxin | 66 | – | – | – | – | – | ||
| Enzyme | 647 | A.I.P.VL | 5 | 14 | 7 | 0.019 | ||
| VSIL.S | 5 | 11 | 7 | 0.049 | ||||
| Exotoxin | 92 | – | – | – | – | – | ||
| Lipoprotein | 463 | ML..C | 3 | 14 | 7 | 0.017 | ||
| Motility | 86 | – | – | – | – | – | ||
| Siderophore | 150 | I.K..G | 3 | 28 | 17 | 0.044 | ||
| GYP..TP | 5 | 5 | 4 | |||||
| Type III secretion | 571 | – | – | – | – | – | ||
| Type IV secretion | 181 | G[AG].$ | 2.77 | 19 | 9 | 2.64E–04 | ||
Where very similar motifs are returned for a protein group, only a representative motif is shown.
Information content (Edwards et al., 2007).
“^” start of the protein; “$” end of the protein; “.” any amino acid; {} defines the range of a repeat in the motif sequence; [] defines which amino acids can occur at a given motif position.
Italic font is used when Probability (Sig-value) is higher than the 0.05 confidence level.
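The abstract's selection criterion, a motif occurring in three or more unrelated proteins, can be illustrated by counting matching proteins and the distinct clusters they fall into. This is only a sketch under stated assumptions: the protein names, sequences, and cluster assignments below are hypothetical, and the code does not reproduce SLiMFinder's clustering or its significance statistics.

```python
import re


def motif_support(pattern, proteins, cluster_of):
    """Count proteins matching `pattern` and the distinct clusters they span."""
    hits = [name for name, seq in proteins.items() if re.search(pattern, seq)]
    clusters = {cluster_of[name] for name in hits}
    return len(hits), len(clusters)


# Hypothetical toy data: names, sequences and cluster IDs are invented.
proteins = {
    "protA": "MKTLLGAK",
    "protB": "MSSVVGGN",
    "protC": "MPQRRGAE",
    "protD": "MKTLIGAK",  # treated here as a close relative of protA
    "protE": "MDDEEKLW",
}
cluster_of = {"protA": 1, "protB": 2, "protC": 3, "protD": 1, "protE": 4}

n_prot, n_upc = motif_support("G[AG].$", proteins, cluster_of)
supported = n_upc >= 3  # "three or more unrelated" proteins
print(n_prot, n_upc, supported)
```

Counting clusters rather than raw proteins keeps closely related sequences, such as `protA` and `protD` in the toy data, from inflating the apparent support for a motif.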
Figure 1. MEME suite motif logos of the novel and known motifs returned in the SLiMFinder analysis. Each position in the motif is represented as a stack of letters. The total height of the stack is the “information content” of that position in the motif in bits. The height of the individual letters in a stack is the probability of the letter at that position multiplied by the total information content of the stack. Black box: the most significant novel motif G[AG].$. Yellow box: KK$ motifs found in the Adherence and Lipoprotein datasets. Red box: known bacterial motifs ^.K.{0,2}K and [ILV].[AGS]C.
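The stack and letter heights described in this caption can be sketched as follows. The sketch assumes a 20-letter amino acid alphabet with a uniform background and applies no small-sample correction, so the values may differ from those a logo tool would draw. The input is the set of C-terminal tripeptides listed for G[AG].$ in the protein list in this record; the function name `column_heights` is hypothetical.

```python
import math
from collections import Counter

# C-terminal tripeptides listed for the G[AG].$ motif in this record.
TRIPEPTIDES = ["GGN", "GAK", "GAK", "GAE", "GGK",
               "GAI", "GAS", "GGN", "GAI", "GGN"]


def column_heights(aligned, alphabet_size=20):
    """Per column, return (information content in bits, {letter: height})."""
    result = []
    for column in zip(*aligned):
        counts = Counter(column)
        total = len(column)
        probs = {aa: n / total for aa, n in counts.items()}
        # Shannon entropy of the observed letter distribution.
        entropy = -sum(p * math.log2(p) for p in probs.values())
        # Stack height: maximum possible entropy minus observed entropy.
        info = math.log2(alphabet_size) - entropy
        # Letter height: letter probability times the stack height.
        result.append((info, {aa: p * info for aa, p in probs.items()}))
    return result


for position, (info, heights) in enumerate(column_heights(TRIPEPTIDES), 1):
    print(position, round(info, 3), heights)
```

In this input the first position is always G, so its stack reaches the maximum of log2(20) bits, while the variable third position yields a much shorter stack.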
List of proteins containing G[AG].$ and KK$ motifs.
| G[AG].$ | Type IV secretion | 2.64E–04 | GGN | 9 | virB11 protein homolog|9992|YP_034060|49476019|8040| [ |
| GAK | virB4|10558|NP_863348|32469876|8343|VirB4 [ | ||||
| GAK | virB8|10560|NP_863298|32469826|8344|VirB8 [ | ||||
| GAE | cag pathogenicity island protein (cag11)|10866|NP_207327|15645157|8497| [ | ||||
| GGK | trwF protein|9938|YP_034270|49476229|8013|[ | ||||
| GAI | trwH2 hypothetical protein BH15720|9944|YP_034268|49476227|8016 [ | ||||
| GAS | cag pathogenicity island protein (cag25)|10894|NP_207342|15645172|8511| [ | ||||
| GGN | Putative type IV secretion system protein|41299|NP_790379|NP_790379.1|18355| [ | ||||
| GAI | trwH1 hypothetical protein BH15690|9942|YP_034265|49476224|8015| [ | ||||
| GGN | cag pathogenicity island protein (cag7)|10904|NP_207323|15645153|8516| [ | ||||
| KK$ | Adherence | 0.034 | KK | 11 | hmw2C putative accessory processing protein [ |
| KK | kpsC polysaccharide modification protein [ | ||||
| KK | ica operon transcriptional regulator [ | ||||
| KK | pavA adherence and virulence protein A [ | ||||
| KK | Type 4 fimbrial biogenesis protein PilO [ | ||||
| KK | Type 4 fimbrial biogenesis protein PilN [ | ||||
| KK | Putative collagen binding protein [ | ||||
| KK | oapA opacity associated protein [ | ||||
| KK | neuC1 putative N-acetylglucosamine-6-phosphate 2-epimerase/N-acetylglucosamine-6-phosphatase [ | ||||
| KK | waaE D,D-heptose 1-phosphate adenosyltransferase/7-phosphate kinase [ | ||||
| KK | hmw1C putative accessory processing protein [ | ||||
| KK | cytotoxin [ | ||||
| KK$ | Lipoprotein | 0.016 | KK | 9 | Multidrug resistance outer membrane efflux protein mdtP; Flags: Precursor|58083|Q8CVH8|24068| |
| KK | ylpB/yscJ needle complex inner membrane lipoprotein [ | ||||
| KK | Yop proteins translocation lipoprotein J OS = | ||||
| KK | Lipoprotein [ | ||||
| KK | LPP20 lipoprotein OS = | ||||
| KK | Outer membrane factor of efflux pump [ | ||||
| KK | Lipoprotein, putative [ | ||||
| KK | Iron transport lipoprotein SirF [ | ||||
| KK | Export protein prsA cytoplasmic membrane protein, protein folding|3480|GBAA2336 prsA-3 |GBAA2336|2870| | ||||
| KK | LPP20 lipoprotein OS = | ||||
| KK | Yop proteins translocation lipoprotein J OS = | ||||
| KK | YaeC family lipoprotein [ | ||||
| KK | Major outer membrane lipoprotein OS = | ||||
| KK | Putative lipoprotein [ |